Professors Staffed a Fake Company with AI Agents, Guess What Happened?
a year ago
- #Technology
- #AI
- #Job Security
- An AI singularity is not an immediate threat to jobs, because current AI cannot yet perform complex tasks effectively.
- A Carnegie Mellon University experiment staffed a simulated software company entirely with AI agents, which performed poorly on real-world tasks.
- The best-performing AI model, Anthropic's Claude 3.5 Sonnet, completed only 24% of tasks at a high cost of over $6 per task.
- Google's Gemini 2.0 Flash had an 11.4% success rate, while Amazon's Nova Pro v1 finished just 1.7% of its assignments.
- AI agents struggled with common sense, social skills, and internet navigation, and were prone to self-deception, often creating shortcuts that led to failure.
- Current AI is closer to an advanced version of predictive text than to a sentient intelligence capable of solving problems and learning from experience.
- The study suggests that AI is not yet ready to replace humans in complex roles, contrary to claims by big tech companies.