The Normalization of Deviance in AI
7 hours ago
- #AI Safety
- #Normalization of Deviance
- #LLM Security
- The AI industry is normalizing deviance by relying on unreliable LLM outputs, echoing the cultural failures that led to the Challenger disaster.
- Large language models (LLMs) are inherently untrustworthy, so security controls must be enforced downstream of the model; system designers often skip this step and implicitly accept the risk.
- Companies treat probabilistic, non-deterministic LLM outputs as reliable, building systems in which untrusted outputs trigger consequential actions without adequate oversight.
- Dangers fall into two groups: safety incidents from over-trusting outputs in benign, non-adversarial settings (e.g., hallucinations), and exploitation via adversarial inputs such as prompt injection or backdoors.
- Cultural drift occurs through temporary shortcuts under competitive pressure, making deviations the norm and eroding security guardrails over time.
- Industry examples from Microsoft, OpenAI, Anthropic, and Google show vendors acknowledging risks but still pushing agentic AI, normalizing unsafe practices.
- To mitigate risks, AI should remain human-led in high-stakes contexts, with proper threat modeling, security controls, and oversight, rather than trusting models to 'do the right thing'.
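
The idea of enforcing controls downstream of the model can be illustrated with a minimal sketch. It assumes a hypothetical agent whose output has already been parsed into a named action; the action names, the `ProposedAction` type, and the `gate` function are all invented here for illustration and do not come from any vendor's API. The point is only that the decision to act is made by deterministic code and, for high-stakes actions, by a human, not by the model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action names, chosen for illustration only.
LOW_RISK_ACTIONS = {"search_docs", "summarize"}
HIGH_RISK_ACTIONS = {"send_email", "delete_file"}


@dataclass(frozen=True)
class ProposedAction:
    """An action parsed from model output; always treated as untrusted."""
    name: str
    argument: str


def gate(action: ProposedAction,
         approve: Callable[[ProposedAction], bool]) -> str:
    """Decide what happens to a model-proposed action.

    Returns "execute", "denied", or "rejected".
    """
    if action.name in LOW_RISK_ACTIONS:
        # Low-stakes and explicitly allowlisted: may run unattended.
        return "execute"
    if action.name in HIGH_RISK_ACTIONS:
        # High-stakes: a human makes the final call.
        return "execute" if approve(action) else "denied"
    # Anything not explicitly allowlisted is refused (default deny).
    return "rejected"
```

In this sketch, `approve` stands in for whatever human review step a real deployment would use. An action the model invents that is not on either list is rejected regardless of how the reviewer would answer, so a prompt injection cannot widen the set of permitted actions.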