DeepSeek-V4-Flash means LLM steering is interesting again

2 hours ago
  • #Model Interpretability
  • #LLM Steering
  • #DeepSeek-V4-Flash
  • DeepSeek-V4-Flash makes LLM steering practical for local models, enabling engineers to experiment with guiding outputs via activation manipulation.
  • Steering involves extracting concepts like 'respond tersely' from model activations and boosting them during inference, using methods from simple vector subtraction to advanced techniques like sparse autoencoders.
  • DwarfStar 4 incorporates steering, and its recent release may spur community efforts to extract and share boostable features from open models.
  • Steering gets little attention for three reasons: big labs prefer to train desired behavior into models directly, API users lack access to the weights and activations that steering requires, and prompting often achieves similar results with less effort.
  • Potential applications include steering for unpromptable traits like intelligence or compressing extensive knowledge into vectors, though these face challenges comparable to full model training.
  • The future of steering in open-source is uncertain, with practicality to be determined in coming months, but it remains a fascinating area for exploration.
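
The "simple vector subtraction" approach mentioned above can be sketched in a few lines. This is a minimal illustration, not the method used by DeepSeek-V4-Flash or DwarfStar 4: the function names `steering_vector` and `apply_steering`, the `strength` parameter, and the synthetic activations are all assumptions made for the example. In a real setup the two activation sets would be captured from one layer of a local model while it processes prompts with and without the target concept (for example, 'respond tersely').

```python
import numpy as np


def steering_vector(with_concept: np.ndarray, without_concept: np.ndarray) -> np.ndarray:
    """Difference of mean activations, normalised to unit length.

    Both inputs have shape (n_samples, d_model).
    """
    diff = with_concept.mean(axis=0) - without_concept.mean(axis=0)
    return diff / np.linalg.norm(diff)


def apply_steering(hidden: np.ndarray, vector: np.ndarray, strength: float) -> np.ndarray:
    """Add the scaled steering vector to every token position's hidden state."""
    return hidden + strength * vector


# Synthetic stand-ins for real activations (assumption: no model is loaded here).
rng = np.random.default_rng(0)
d_model = 16

# Pretend the concept lives along a single hidden dimension.
concept_direction = np.zeros(d_model)
concept_direction[3] = 1.0

baseline = rng.normal(size=(200, d_model))
terse = rng.normal(size=(200, d_model)) + 4.0 * concept_direction

# Extract the concept, then boost it on a fresh batch of hidden states.
v = steering_vector(terse, baseline)
hidden = rng.normal(size=(5, d_model))
steered = apply_steering(hidden, v, strength=8.0)
```

Because `v` has unit length, each row of `steered` moves exactly `strength` units along the extracted direction. In practice the strength has to be tuned by hand: too small and the output does not change, too large and the model's text tends to degrade.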