Don't let the LLM speak, just probe it
6 hours ago
- #Hidden State Extraction
- #LLM Classification
- #Zero-shot Learning
- An LLM can settle on a classification decision before it generates any token, so the decision can be read directly from its hidden states instead of from generated text.
- Use a small MLP or linear probe on the hidden state at the last prompt token to create a fast, zero-shot classifier.
- Optionally train a LoRA adapter to write verdicts: this reshapes the hidden-state geometry so the decision is easier to extract, still without generating text at inference.
- Cache the KV state for the content once and reuse it to score multiple criteria efficiently, though this may limit interaction between content and criteria in complex cases.
- The technique powers applications such as Predicate in safety stacks, offering low-cost, high-speed structural question answering.
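
The linear-probe idea from the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the post's implementation: the hidden states here are synthetic vectors produced by a hypothetical `fake_hidden_state` helper, and the probe is plain logistic regression trained by gradient descent. In a real setup the vectors would come from the model itself, for example the final-layer state at the last prompt token when a Hugging Face Transformers model is called with `output_hidden_states=True`.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_direction(dim, rng):
    """Hypothetical unit vector along which the two classes separate."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def fake_hidden_state(label, direction, rng, shift=2.5):
    """ASSUMPTION: synthetic stand-in for the last-prompt-token hidden state."""
    sign = 1.0 if label == 1 else -1.0
    return [rng.gauss(0.0, 1.0) + sign * shift * d for d in direction]


def make_dataset(n, direction, rng):
    """Balanced labels (0/1) with one synthetic state per label."""
    labels = [i % 2 for i in range(n)]
    states = [fake_hidden_state(y, direction, rng) for y in labels]
    return states, labels


def train_linear_probe(states, labels, lr=0.1, epochs=200):
    """Fit logistic-regression weights (w, b) with full-batch gradient descent."""
    dim, n = len(states[0]), len(states)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for h, y in zip(states, labels):
            err = sigmoid(dot(w, h) + b) - y  # gradient of log loss w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * h[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_score(w, b, state):
    """Probability of the positive class; no tokens are generated."""
    return sigmoid(dot(w, state) + b)


rng = random.Random(0)
direction = make_direction(16, rng)
train_x, train_y = make_dataset(200, direction, rng)
test_x, test_y = make_dataset(100, direction, rng)

w, b = train_linear_probe(train_x, train_y)
accuracy = sum(
    (probe_score(w, b, h) > 0.5) == bool(y) for h, y in zip(test_x, test_y)
) / len(test_y)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

Scoring a new input is then a single dot product on top of one forward pass over the prompt, which is where the speed advantage over generating a verdict comes from. Swapping the logistic regression for a small MLP changes only `train_linear_probe` and `probe_score`.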