Hasty Briefs (beta)

Do LLMs pass the mirror test?

4 days ago
  • #Mirror Test
  • #LLM Behavior
  • #AI Self-Awareness
  • Critiques existing adaptations of the mirror test for LLMs as flawed because they simply translate a visual test into text.
  • Proposes a closer analogy: subtly modify an LLM's own textual output and see whether it notices the anomaly.
  • Describes an experiment with Gemma 4 31B-IT in which its text was corrupted by replacing 'g' with 'sg'.
  • Gemma spontaneously detected the corruption in its thinking trace, shifting from first-person to third-person language.
  • Gemma later adopted the corruption as part of its style, voluntarily generating 'sg' in subsequent outputs.
  • In tests with GLM 5.2, the model reproduced the corruption without explicitly noticing or commenting on it.
  • Highlights the debate between two explanations for such behaviors: deflationary mimicry and a structural self-model.
  • Acknowledges the informal nature of the experiment and suggests rigorous future research is needed.
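
The corruption step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the article's actual code: the function names `corrupt` and `corruption_rate` are hypothetical, and it assumes that every lowercase 'g' is replaced (the summary does not say whether uppercase letters or only some occurrences were affected). The second function is one rough way to check whether a later output has picked up the 'sg' pattern.

```python
def corrupt(text: str, target: str = "g", replacement: str = "sg") -> str:
    """Replace every occurrence of `target` with `replacement`.

    Assumption: case-sensitive, all occurrences. The original experiment
    may have used a different rule.
    """
    return text.replace(target, replacement)


def corruption_rate(text: str, target: str = "g", replacement: str = "sg") -> float:
    """Rough share of `target` characters that appear as `replacement`.

    Caveat: ordinary English words such as "disgust" already contain "sg",
    so a small non-zero rate does not by itself show adoption.
    """
    total = text.count(target)
    if total == 0:
        return 0.0
    return text.count(replacement) / total


if __name__ == "__main__":
    original = "good morning"
    corrupted = corrupt(original)
    print(corrupted)
    print(corruption_rate(original), corruption_rate(corrupted))
```

In a setup like the one summarized here, `corrupt` would be applied to the model's earlier turns before they are fed back as context, and `corruption_rate` would be computed on the model's next output. A rate near 1.0 would suggest the model is reproducing the pattern, though it says nothing about whether the model noticed it.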