LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?
6 hours ago
- #LLM
- #Transformer
- #Neuroanatomy
- The RYS (Repeat Your Self) method, discovered in mid-2024, duplicates an LLM's middle layers at inference time, improving performance without any weight changes or training.
- Experiments on Qwen3.5-27B confirm that relayering helps modern models, with middle layers showing the most improvement when duplicated.
- The model's internal structure has three phases: encoding (layers 0–5), reasoning (layers ~10–50), and decoding (layers ~55–64).
- Cross-language experiments show that the model's internal representations track content rather than surface language, suggesting a universal 'thinking space.'
- Heatmaps reveal that duplicating middle layers (e.g., layers 24–35) boosts both math and EQ performance, with diminishing returns for larger blocks.
- Single-layer repeats can improve math performance but have minimal impact on EQ, suggesting that multi-layer circuits are more effective.
- Beam search and surrogate models were used to explore combinatorial configurations, but contiguous mid-stack blocks remain the most efficient.
- The Pareto frontier analysis identified optimal configurations; repeating layer 33 alone, for instance, provides significant EQ gains at minimal overhead.
- RYS variants of Qwen3.5-27B were released on HuggingFace, offering different trade-offs between performance and compute cost.
- The findings suggest that Transformer reasoning is organized into discrete functional circuits, and that this organization may be a general property across models.
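
The relayering operation itself can be sketched as a pure index transformation: because the duplicated block reuses the same weights, only the execution order of the layers changes. A minimal sketch (the function name and signature are my own, not code from the post):

```python
def rys_layer_order(num_layers, start, end, repeats=2):
    """Execution order of layer indices after running the contiguous
    block [start, end] `repeats` times in total. Weights are shared,
    so this changes compute, not parameter count."""
    block = list(range(start, end + 1))
    return (
        list(range(start))                   # early layers, run once
        + block * repeats                    # repeated mid-stack block
        + list(range(end + 1, num_layers))   # remaining layers, run once
    )

# Repeating layers 24-35 of a 64-layer stack (two passes in total):
order = rys_layer_order(64, 24, 35)  # 76 layer passes, 64 unique layers
```

The same index list could then drive an actual forward pass by iterating over a model's layer modules in this order.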
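
The beam search over repeat configurations can be sketched as follows. The toy surrogate below is a stand-in of my own for the post's surrogate model (it simply rewards overlap with a hypothetical mid-stack "reasoning circuit" at layers 24–35, minus a per-layer compute penalty); all names and numbers here are illustrative assumptions, not the post's code:

```python
def beam_search_blocks(num_layers, surrogate, beam_width=4, max_block=12):
    """Grow contiguous repeat blocks one layer at a time, keeping the
    top `beam_width` candidates by surrogate score at each step."""
    # Seed the beam with every single-layer block.
    beam = sorted(
        ((surrogate(s, s), s, s) for s in range(num_layers)), reverse=True
    )[:beam_width]
    best = beam[0]
    for _ in range(max_block - 1):
        candidates = set()
        for _, s, e in beam:
            for ns, ne in ((s - 1, e), (s, e + 1)):  # extend left or right
                if 0 <= ns and ne < num_layers:
                    candidates.add((surrogate(ns, ne), ns, ne))
        if not candidates:
            break
        beam = sorted(candidates, reverse=True)[:beam_width]
        best = max(best, beam[0])
    return best

def toy_surrogate(start, end):
    """Toy scoring function: reward overlap with a hypothetical
    'reasoning circuit' at layers 24-35, penalize extra compute."""
    overlap = max(0, min(end, 35) - max(start, 24) + 1)
    size = end - start + 1
    return overlap - 0.4 * size

best_score, best_start, best_end = beam_search_blocks(64, toy_surrogate)
# Under this surrogate, the search converges on the block 24-35.
```

This mirrors the post's finding in miniature: when the payoff comes from a contiguous mid-stack circuit, the beam settles on exactly that block rather than scattered single layers.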
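
The Pareto-frontier analysis reduces to a standard dominance check over (compute overhead, score) pairs. A minimal sketch with made-up illustrative numbers (not measurements from the post):

```python
def pareto_frontier(configs):
    """Keep configs not dominated by any other config with
    lower-or-equal overhead AND higher-or-equal score (one strict)."""
    frontier = []
    for name, cost, score in configs:
        dominated = any(
            c2 <= cost and s2 >= score and (c2 < cost or s2 > score)
            for _, c2, s2 in configs
        )
        if not dominated:
            frontier.append((name, cost, score))
    return frontier

# (name, extra layer passes, score) -- illustrative numbers only.
configs = [
    ("baseline",      0, 0.50),
    ("repeat L33",    1, 0.58),
    ("repeat 24-35", 12, 0.62),
    ("repeat 20-40", 21, 0.61),  # dominated by 24-35: costlier and worse
]
frontier = pareto_frontier(configs)
# Frontier: baseline, repeat L33, repeat 24-35.
```

A single-layer repeat like "L33" sits on the frontier precisely because it buys a real score gain for almost no overhead, matching the trade-off the post describes.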