Do LLMs Break the Sapir-Whorf Hypothesis?
- #LLMs
- #computational linguistics
- #Sapir-Whorf
- The study examines whether LLMs support or contradict the Sapir-Whorf hypothesis by analyzing internal representations across languages.
- Experiments across multiple languages and topics show that, in transformer models, the middle "reasoning" layers separate thought from language: representations there are organized by meaning rather than by language identity.
- Four architecturally different models (Qwen3.5-27B, MiniMax M2.5, GLM-4.7, GPT-OSS-120B) consistently exhibit a three-phase structure: encoding, language-agnostic reasoning, and decoding.
- PCA visualizations confirm that early layers cluster by language, middle layers by topic (meaning), and late layers return to language-specific representations.
- Extended tests with code (Python) and math (LaTeX) show the same pattern, indicating a modality-agnostic universal semantic space beyond natural language.
- The results challenge strong Sapir-Whorf determinism, suggesting LLMs form an 'anti-Whorfian bottleneck': language serves as input/output encoding, not as an integral part of reasoning.
- The findings partially align with Chomsky's universal structure idea but locate it in emergent semantic geometry rather than innate syntax.
- Implications include potential for cross-lingual steering, scaling studies, and testing with culturally specific concepts to explore universality limits.
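The layer-wise analysis behind these findings can be sketched in a few lines. The snippet below is a minimal illustration of the method, not the study's actual code: it assumes per-layer hidden states have already been extracted and mean-pooled (here they are simulated with synthetic vectors), then projects each layer with PCA and scores whether points cluster more tightly by language or by topic.

```python
# Minimal sketch of layer-wise PCA probing: do a layer's representations
# cluster by language or by topic (meaning)? Hidden states are simulated
# here; in the real setup they would come from a transformer's layers.
import numpy as np

rng = np.random.default_rng(0)

def pca_2d(X):
    """Project rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

def cluster_score(points, labels):
    """Mean within-label spread relative to overall spread (lower = tighter clusters)."""
    within = np.mean([points[labels == l].std() for l in set(labels)])
    return within / points.std()

# Synthetic stand-in: 2 languages x 3 topics x 10 sentences each.
langs = np.repeat([0, 1], 30)
topics = np.tile(np.repeat([0, 1, 2], 10), 2)

def fake_layer(lang_weight, topic_weight, dim=64):
    """Hidden states dominated by either a language or a topic direction."""
    lang_dirs = rng.normal(size=(2, dim))
    topic_dirs = rng.normal(size=(3, dim))
    noise = rng.normal(scale=0.3, size=(60, dim))
    return lang_weight * lang_dirs[langs] + topic_weight * topic_dirs[topics] + noise

early = fake_layer(lang_weight=3.0, topic_weight=0.5)   # encoding phase
middle = fake_layer(lang_weight=0.5, topic_weight=3.0)  # language-agnostic phase

for name, layer in [("early", early), ("middle", middle)]:
    pts = pca_2d(layer)
    print(f"{name}: lang={cluster_score(pts, langs):.2f} "
          f"topic={cluster_score(pts, topics):.2f}")
```

On the simulated data, the early layer shows a lower (tighter) score for language and the middle layer a lower score for topic, mirroring the encoding → language-agnostic reasoning → decoding structure the study reports. Applying this to a real model would only require replacing `fake_layer` with pooled hidden states from each transformer layer.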