Hasty Briefs (beta)

LLMs Can Get "Brain Rot"

a day ago
  • #Data Quality
  • #LLM
  • #Cognitive Decline
  • Proposes the LLM Brain Rot Hypothesis: continual exposure to junk web text causes cognitive decline in LLMs.
  • Runs controlled experiments on real Twitter/X corpora, building junk and control datasets under two metrics: engagement degree (M1) and semantic quality (M2).
  • Finds significant declines in reasoning, long-context understanding, and safety, along with increased 'dark traits' (e.g., psychopathy, narcissism), in LLMs trained on junk data.
  • Observes a dose-response relationship: the higher the junk ratio in the training data, the greater the cognitive decay (e.g., ARC-Challenge drops from 74.9 to 57.2).
  • Error analysis reveals 'thought skipping' as a major failure mode in reasoning tasks.
  • Cognitive decline persists despite post-hoc fine-tuning, indicating lasting effects of junk data exposure.
  • Calls for re-examination of data collection and continual pre-training practices to prevent cumulative harms.
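
The dose-response setup described above can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names `mix_corpus` and `score_drop`, the toy documents, and the specific junk ratios are assumptions made for illustration. Only the ARC-Challenge scores (74.9 and 57.2) come from the summary. The sketch builds a training mixture with a chosen fraction of junk documents, then computes the absolute and relative size of the reported drop.

```python
import random


def mix_corpus(junk, control, junk_ratio, size, seed=0):
    """Sample `size` documents, a `junk_ratio` fraction of them from `junk`.

    Hypothetical helper: samples without replacement from each pool,
    then shuffles so junk and control documents are interleaved.
    """
    if not 0.0 <= junk_ratio <= 1.0:
        raise ValueError("junk_ratio must be in [0, 1]")
    n_junk = round(size * junk_ratio)
    n_control = size - n_junk
    if n_junk > len(junk) or n_control > len(control):
        raise ValueError("not enough documents to sample without replacement")
    rng = random.Random(seed)
    mixed = rng.sample(junk, n_junk) + rng.sample(control, n_control)
    rng.shuffle(mixed)
    return mixed


def score_drop(baseline, degraded):
    """Return the (absolute, relative) drop from `baseline` to `degraded`."""
    absolute = baseline - degraded
    return absolute, absolute / baseline


if __name__ == "__main__":
    # Toy stand-ins for the junk and control corpora (assumed data).
    junk = [f"junk-{i}" for i in range(100)]
    control = [f"control-{i}" for i in range(100)]

    # Illustrative junk ratios; the summary does not list the ones used.
    for ratio in (0.0, 0.2, 0.5, 0.8, 1.0):
        mixed = mix_corpus(junk, control, ratio, size=50)
        n_junk = sum(doc.startswith("junk") for doc in mixed)
        print(f"junk_ratio={ratio:.1f}: {n_junk}/50 junk documents")

    # ARC-Challenge scores quoted in the summary.
    absolute, relative = score_drop(74.9, 57.2)
    print(f"ARC-Challenge drop: {absolute:.1f} points ({relative:.1%} relative)")
```

Holding the total size fixed while varying only the junk fraction keeps the amount of training data constant across conditions, so a difference in benchmark scores can be read as an effect of data quality rather than data quantity.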