Analog Foundation Models
- #LLMs
- #Machine Learning
- #Analog Computing
- Analog in-memory computing (AIMC) performs matrix-vector multiplications directly inside memory arrays, offering large gains in speed and power efficiency for neural network inference.
- AIMC also introduces fundamental challenges such as noisy computations and strict input/output quantization constraints (see the simulation sketch after this list).
- Existing LLMs struggle to achieve 4-bit-level performance on AIMC hardware.
- A new hardware-aware training method robustly adapts LLMs for execution on noisy, low-precision analog hardware (a generic version is sketched after this list).
- State-of-the-art models like Phi-3-mini-4k-instruct and Llama-3.2-1B-Instruct retain performance comparable to 4-bit weight, 8-bit activation (W4A8) baselines despite the analog noise and quantization constraints.
- As a byproduct of the training methodology, the models can also be quantized for inference on low-precision digital hardware (see the round-to-nearest sketch below).
- The models also benefit from test-time compute scaling, scaling better than models trained with 4-bit weights and static 8-bit input quantization.
- The work bridges the gap between high-capacity LLMs and efficient analog hardware.
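To make the noise and quantization constraints concrete, here is a minimal simulation sketch of an AIMC matrix-vector multiply. The additive Gaussian weight noise, the `noise_std` value, and the 8-bit input/output widths are illustrative assumptions, not the paper's hardware model.

```python
import torch

def fake_quant(t: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric per-tensor round-to-nearest fake quantization."""
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max().clamp(min=1e-8) / qmax
    return torch.round(t / scale).clamp(-qmax - 1, qmax) * scale

def simulate_aimc_matvec(w, x, in_bits=8, out_bits=8, noise_std=0.02):
    """Toy AIMC tile: quantized inputs/outputs, noisy analog weights.

    noise_std scales Gaussian noise relative to the largest weight;
    real devices exhibit more structured, device-specific noise.
    """
    x_q = fake_quant(x, in_bits)                    # DAC input quantization
    w_noisy = w + noise_std * w.abs().max() * torch.randn_like(w)
    y = x_q @ w_noisy.T                             # analog MVM
    return fake_quant(y, out_bits)                  # ADC output quantization

# Quick check: error of the noisy, quantized MVM vs. the exact result.
w, x = torch.randn(64, 128), torch.randn(4, 128)
err = (simulate_aimc_matvec(w, x) - x @ w.T).abs().mean()
print(f"mean absolute error: {err:.4f}")
```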
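The adaptation method is, at its core, hardware-aware training: inject fresh weight noise on every forward pass so the model learns weights robust to perturbation, and keep weight magnitudes bounded by clipping. The sketch below is a generic version of that idea; the noise scale and the fixed clipping bound are illustrative choices, not the authors' exact pipeline.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Linear):
    """Linear layer injecting fresh Gaussian weight noise on each training
    forward pass, mimicking analog device noise (illustrative noise model)."""

    def __init__(self, in_features, out_features, noise_std=0.02, **kw):
        super().__init__(in_features, out_features, **kw)
        self.noise_std = noise_std

    def forward(self, x):
        if self.training:
            noise = self.noise_std * self.weight.abs().max() * torch.randn_like(self.weight)
            return F.linear(x, self.weight + noise, self.bias)
        return F.linear(x, self.weight, self.bias)

model = nn.Sequential(NoisyLinear(128, 256), nn.ReLU(), NoisyLinear(256, 10))
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

x, target = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = F.cross_entropy(model(x), target)
loss.backward()
opt.step()

# Clip weights after each step so they fit the device's limited conductance
# range (a simple fixed bound here; the clipping rule is an assumption).
with torch.no_grad():
    for m in model.modules():
        if isinstance(m, NoisyLinear):
            m.weight.clamp_(-0.5, 0.5)
```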
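Because training already forces robustness to noise and quantization, the resulting weights also survive simple digital quantization. Below is a generic round-to-nearest W4A8 fake-quantization sketch; the per-output-channel weight scales and per-tensor activation scale are my assumptions, not necessarily the paper's exact setting.

```python
import torch

def rtn_w4a8(w: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Round-to-nearest fake quantization: 4-bit weights (per output
    channel), 8-bit activations (per tensor). Returns the dequantized
    matmul result for simulation purposes."""
    w_scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 7  # int4: [-8, 7]
    w_q = torch.round(w / w_scale).clamp(-8, 7) * w_scale
    x_scale = x.abs().max().clamp(min=1e-8) / 127                    # int8: [-128, 127]
    x_q = torch.round(x / x_scale).clamp(-128, 127) * x_scale
    return x_q @ w_q.T
```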