Do LLMs Break the Sapir-Whorf Hypothesis?
- #LLMs
- #computational linguistics
- #Sapir-Whorf
- The study examines whether LLMs support or contradict the Sapir-Whorf hypothesis by analyzing internal representations across languages.
- Experiments across multiple languages and topics show that, in transformer models, the middle "reasoning" layers separate thought from language: representations there are organized by meaning rather than by language identity.
- Four architecturally different models (Qwen3.5-27B, MiniMax M2.5, GLM-4.7, GPT-OSS-120B) consistently exhibit a three-phase structure: encoding, language-agnostic reasoning, and decoding.
- PCA visualizations confirm that early layers cluster by language, middle layers by topic (meaning), and late layers return to language-specific representations.
- Extended tests with code (Python) and math (LaTeX) show the same pattern, indicating a modality-agnostic universal semantic space beyond natural language.
- The results challenge strong Sapir-Whorf determinism, suggesting LLMs form an 'anti-Whorfian bottleneck': language serves as input/output encoding, not as an integral part of reasoning.
- The findings partially align with Chomsky's universal structure idea but locate it in emergent semantic geometry rather than innate syntax.
- Implications include potential for cross-lingual steering, scaling studies, and testing with culturally specific concepts to explore universality limits.
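The layer-wise analysis behind these findings can be sketched in a few lines. The snippet below is a minimal illustration of the method, not the study's actual code: it assumes per-layer hidden states have already been extracted and mean-pooled (here they are simulated with synthetic vectors), then projects each layer with PCA and scores whether points cluster more tightly by language or by topic.

```python
# Minimal sketch of layer-wise PCA probing: do a layer's representations
# cluster by language or by topic (meaning)? Hidden states are simulated
# here; in the real setup they would come from a transformer's layers.
import numpy as np

rng = np.random.default_rng(0)

def pca_2d(X):
    """Project rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

def cluster_score(points, labels):
    """Mean within-label spread relative to overall spread (lower = tighter clusters)."""
    within = np.mean([points[labels == l].std() for l in set(labels)])
    return within / points.std()

# Synthetic stand-in: 2 languages x 3 topics x 10 sentences each.
langs = np.repeat([0, 1], 30)
topics = np.tile(np.repeat([0, 1, 2], 10), 2)

def fake_layer(lang_weight, topic_weight, dim=64):
    """Hidden states dominated by either a language or a topic direction."""
    lang_dirs = rng.normal(size=(2, dim))
    topic_dirs = rng.normal(size=(3, dim))
    noise = rng.normal(scale=0.3, size=(60, dim))
    return lang_weight * lang_dirs[langs] + topic_weight * topic_dirs[topics] + noise

early = fake_layer(lang_weight=3.0, topic_weight=0.5)   # encoding phase
middle = fake_layer(lang_weight=0.5, topic_weight=3.0)  # language-agnostic phase

for name, layer in [("early", early), ("middle", middle)]:
    pts = pca_2d(layer)
    print(f"{name}: lang={cluster_score(pts, langs):.2f} "
          f"topic={cluster_score(pts, topics):.2f}")
```

On the simulated data, the early layer shows a lower (tighter) score for language and the middle layer a lower score for topic, mirroring the encoding → language-agnostic reasoning → decoding structure the study reports. Applying this to a real model would only require replacing `fake_layer` with pooled hidden states from each transformer layer.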