Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB
- #AI
- #Z80
- #retrocomputing
- Z80-μLM is a conversational AI that runs on Z80 processors with 64KB of RAM, built using quantization-aware training (QAT).
- The project explores minimalist AI with personality, resulting in a 40KB .com binary running on a 4MHz processor first released in 1976.
- Features include trigram hash encoding for typo-tolerant input, 2-bit weight quantization, and 16-bit integer inference.
- Includes a chatbot and a 20 Questions game, demonstrating terse, personality-driven responses.
- Training tools are provided for generating data with LLMs (Ollama/Claude API) and balancing class distributions.
- Technical highlights: no floating point, autoregressive generation, and tight Z80-native arithmetic loops.
- Despite limitations (e.g., no deep context tracking), it offers a unique, constrained interaction mode.
- License options: MIT or Apache-2.0.
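
The trigram hash encoding mentioned above can be sketched as follows. This is a minimal illustration of the general technique, not the project's actual code: each overlapping 3-character window of the input is hashed into a small bucket table, and the bucket size, hash function, and feature representation here are assumptions. A single typo corrupts at most three trigrams, so most of the feature vector survives, which is what gives typo tolerance.

```c
#include <stdint.h>
#include <string.h>

#define N_BUCKETS 256  /* hypothetical table size, chosen to fit tight RAM */

/* Hash every overlapping 3-character window of `text` into a
   fixed-size bucket table, producing a set-of-trigrams feature vector. */
static void trigram_encode(const char *text, uint8_t buckets[N_BUCKETS])
{
    size_t len = strlen(text);
    memset(buckets, 0, N_BUCKETS);
    for (size_t i = 0; i + 2 < len; i++) {
        /* simple multiplicative hash; the project's real hash may differ */
        uint16_t h = (uint16_t)((uint8_t)text[i] * 31u
                              + (uint8_t)text[i + 1] * 7u
                              + (uint8_t)text[i + 2]);
        buckets[h % N_BUCKETS] = 1;
    }
}
```

Because the encoding is a fixed-size bag of hashed trigrams, inputs of any length map to the same small vector, which suits a classifier with 2-bit weights.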
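
The "2-bit weight quantization, 16-bit integer inference, no floating point" combination can be illustrated with an integer-only dot product. This is a hedged sketch: the four-level codebook {-2, -1, +1, +2} and the packing order (four weights per byte, low bits first) are assumptions, not the project's documented format.

```c
#include <stdint.h>

/* Hypothetical 2-bit codebook; the project's actual levels may differ. */
static const int8_t LEVELS[4] = { -2, -1, 1, 2 };

/* Integer-only dot product over 2-bit packed weights and int8
   activations, accumulated in 16 bits -- the natural register width
   for add/adc chains on a Z80. No floating point anywhere. */
static int16_t dot_q2(const uint8_t *packed, const int8_t *act, int n)
{
    int16_t acc = 0;
    for (int i = 0; i < n; i++) {
        /* extract the i-th 2-bit code: 4 codes per byte, low bits first */
        uint8_t code = (packed[i >> 2] >> ((i & 3) * 2)) & 3;
        acc += (int16_t)LEVELS[code] * act[i];
    }
    return acc;
}
```

Packing four weights per byte is what makes a model fit in tens of kilobytes: a layer of N weights costs N/4 bytes, and decode is two shifts and a mask per weight.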