Lexar Wants to Offload Local AI Models to SSD Amid the RAMpocalypse

5 hours ago
  • #SSD technology
  • #AI hardware
  • #memory optimization
  • Lexar is developing SSD technology to offload local AI models from DRAM to cheaper NAND Flash to reduce memory costs.
  • The Lexar AI Storage Core SSD can cut DRAM requirements by at least 40%, allowing larger LLMs to run on PCs with less RAM.
  • In tests, the Qwen 3.5 122B model ran with only 32 GB of DRAM instead of 128 GB, and generated tokens faster than with traditional methods.
  • The technology enables running models with larger context windows, like 256K tokens, where traditional approaches fail, though latency increases with model size.
  • Lexar showcased a hot-swappable M.2 SSD design for mini PCs, built around its custom DRAM-less SPU controller and supporting PCIe Gen 4 and Gen 5 for a direct connection to the processor.
  • Challenges include slower time-to-first-token and potential wear on NAND Flash from frequent model updates, with debate over long-term cost savings versus DRAM.
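
As a rough check on the figures above, the Qwen 3.5 122B test (32 GB of DRAM instead of 128 GB) can be compared against the "at least 40%" claim. The sketch below is only arithmetic on the numbers in this summary. The function name `dram_reduction` is hypothetical and not part of any Lexar tooling.

```python
def dram_reduction(baseline_gb: float, offloaded_gb: float) -> float:
    """Return the fraction of DRAM saved relative to the baseline.

    Hypothetical helper for illustration only; it does not model
    how the Lexar SSD actually places model data.
    """
    if baseline_gb <= 0:
        raise ValueError("baseline_gb must be positive")
    return (baseline_gb - offloaded_gb) / baseline_gb


# Figures taken from the Qwen 3.5 122B test in the summary above.
saving = dram_reduction(baseline_gb=128, offloaded_gb=32)
print(f"DRAM saved: {saving:.0%}")

# The claimed floor for the technology is a 40% reduction.
print("Meets the 40% claim:", saving >= 0.40)
```

For this one test the reduction works out to 75%, comfortably above the stated 40% floor. Other models and context sizes may land elsewhere in that range.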