Hasty Briefs (beta)


LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?

8 hours ago
  • #LLM
  • #Transformer
  • #Neuroanatomy
  • The RYS ("Repeat Your Self") method, discovered in mid-2024, duplicates middle layers of an LLM at inference time, improving performance with no weight changes and no training.
  • Experiments on Qwen3.5-27B confirm that relayering still helps modern models; duplicating middle layers yields the largest gains.
  • The model's internal structure has three phases: encoding (layers 0–5), reasoning (layers ~10–50), and decoding (layers ~55–64).
  • Cross-language experiments show that the model's internal representation prioritizes content over language, indicating a universal 'thinking space.'
  • Heatmaps reveal that duplicating middle layers (e.g., layers 24–35) boosts both math and EQ performance, with diminishing returns for larger blocks.
  • Single-layer repeats can improve math performance but have minimal impact on EQ, suggesting that multi-layer circuits are more effective.
  • Beam search and surrogate models were used to explore the combinatorial space of repeat configurations, but contiguous mid-stack blocks remained the most efficient choice.
  • The Pareto frontier analysis identified optimal configurations, with layer 33 alone providing significant EQ gains at minimal overhead.
  • RYS variants of Qwen3.5-27B were released on HuggingFace, offering different trade-offs between performance and compute cost.
  • The findings suggest that Transformer reasoning is organized into discrete functional circuits, a general property across models.
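The core relayering operation described above can be sketched in a few lines. This is a toy illustration, not the article's code: the "model" is just a list of callables standing in for Transformer decoder layers, and the repeated block reuses the same layer objects, so no weights change and nothing is trained.

```python
def relayer(layers, start, end):
    """Return a new layer stack with the block layers[start:end] duplicated
    in place (RYS-style). The copies are the *same* objects, so parameters
    are shared and unchanged; only the forward pass gets deeper."""
    return layers[:end] + layers[start:end] + layers[end:]

def forward(layers, x):
    """Run the state through the stack, one layer at a time."""
    for layer in layers:
        x = layer(x)
    return x

# Toy 8-layer "model": layer i adds i to the running state.
layers = [lambda x, i=i: x + i for i in range(8)]

deep = relayer(layers, 3, 5)   # duplicate the mid-stack block [3, 5)
print(len(layers), len(deep))  # 8 10
print(forward(layers, 0))      # 0+1+...+7 = 28
print(forward(deep, 0))        # 28 + (3+4) = 35
```

On a real checkpoint the same idea amounts to rebuilding the model's decoder-layer list with the chosen block appearing twice; the extra depth is paid for purely in inference compute.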
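The Pareto-frontier analysis mentioned above can also be sketched: given candidate repeat configurations scored by benchmark gain versus added layers, keep only the configurations no other configuration dominates. The names and numbers below are hypothetical placeholders, not results from the article.

```python
def pareto_frontier(configs):
    """Keep configs that are not dominated: a config is dropped if some
    other config achieves >= gain at <= overhead (and differs from it)."""
    return {
        name: (gain, cost)
        for name, (gain, cost) in configs.items()
        if not any(
            g >= gain and c <= cost and (g, c) != (gain, cost)
            for g, c in configs.values()
        )
    }

# Illustrative (gain, extra_layers) pairs for a few repeat configs.
configs = {
    "baseline":      (0.0, 0),
    "repeat L0-5":   (0.5, 6),   # dominated: less gain, more cost than L33
    "repeat L33":    (1.8, 1),
    "repeat L24-35": (3.1, 12),
}
print(sorted(pareto_frontier(configs)))
```

Under this filter the early-layer block drops out, matching the article's observation that a single mid-stack layer like 33 can deliver meaningful gains at minimal overhead while early-layer repeats do not pay their way.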