CAISI (NIST) Evaluation of DeepSeek AI Models Finds Shortcomings and Risks
- #AI Evaluation
- #National Security
- #Technology Competition
- NIST's CAISI evaluated Chinese AI models from DeepSeek, finding that they lag behind U.S. models in performance, cost efficiency, security, and adoption.
- U.S. AI models outperform DeepSeek models, especially in software engineering and cyber tasks, where they solve 20% more tasks.
- DeepSeek models cost 35% more than comparable U.S. models for similar performance levels.
- DeepSeek models are 12 times more susceptible to agent hijacking attacks than U.S. models, leading to security risks like phishing and malware execution.
- DeepSeek models responded to 94% of malicious requests when jailbreaking techniques were used, compared to 8% for U.S. models.
- DeepSeek models propagate CCP narratives four times more frequently than U.S. models.
- Adoption of PRC models surged by nearly 1,000% since DeepSeek R1's release in January 2025.
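
The figures above mix percentage increases, percentage rates, and "times more" multipliers, which are easy to misread side by side. The short sketch below converts a few of them to a common scale. The helper names are hypothetical, and the derived ratios are plain arithmetic on the numbers quoted in this summary, not additional findings from the evaluation.

```python
def increase_to_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiple of the baseline.

    A 100% increase doubles the baseline, so the multiplier is
    (100 + percent_increase) / 100.
    """
    return (100 + percent_increase) / 100


def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Return how many times larger rate_a is than rate_b."""
    if rate_b == 0:
        raise ValueError("rate_b must be non-zero")
    return rate_a / rate_b


# "Nearly 1,000%" growth: just under 11 times the baseline level.
adoption_multiplier = increase_to_multiplier(1000)

# "35% more" cost: 1.35 times the comparison cost.
cost_multiplier = increase_to_multiplier(35)

# 94% vs. 8% response rate to malicious requests under jailbreaking.
jailbreak_ratio = rate_ratio(94, 8)

print(f"adoption: up to {adoption_multiplier:.2f}x the baseline")
print(f"cost: {cost_multiplier:.2f}x the comparison cost")
print(f"jailbreak response rate: {jailbreak_ratio:.2f}x higher")
```

Note that a 1,000% increase corresponds to 11 times the starting level, not 10 times, because the baseline itself is included in the total.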