Hasty Briefs (beta)

Building a Korean ambiguity solver fast enough to skip the GPU: 7,300 words/sec

3 days ago
  • #Korean NLP
  • #CPU Optimization
  • #Ambiguity Resolution
  • A Korean ambiguity solver was developed for Kimchi Reader, a tool for Korean language learners, to resolve lemma ambiguity efficiently without a GPU.
  • The solution is a 14M-parameter KoELECTRA model quantized to int8, running server-side on a CPU at about 7,300 disambiguations per second.
  • It took four attempts: fine-tuning Gemma 3 1B (slow and inaccurate), trying embeddings, training a custom model, and finally succeeding with KoELECTRA.
  • Two constraints shaped the design: speed, because entire books are processed ahead of time, and restricting the model to choosing among candidates produced by a rule-based lemmatizer, so it cannot hallucinate a lemma.
  • The final approach treats disambiguation as closed-set selection over pre-generated candidates, with CPU performance coming from quantization, custom Rust inference code, and SIMD.
  • Throughput improved significantly, with production handling ~18,500 words/second on 16 cores, and the higher accuracy made derived statistics such as word-frequency rankings more reliable.
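
The closed-set selection described above can be illustrated with a minimal Python sketch. It is not the author's code (the production system is described as custom Rust inference over an int8 KoELECTRA model). The names `pick_lemma`, `toy_score`, and `TOY_HINTS` are hypothetical, and the hand-written scorer only stands in for the model. What the sketch does show is the guarantee itself: the result is always one of the lemmatizer's candidates, so nothing can be hallucinated.

```python
from typing import Callable, Sequence

def pick_lemma(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the best-scoring candidate; the result is always a member of `candidates`."""
    if not candidates:
        raise ValueError("the rule-based lemmatizer must supply at least one candidate")
    if len(candidates) == 1:
        return candidates[0]  # unambiguous word: no scoring needed
    return max(candidates, key=score)

# Hypothetical stand-in for the model: counts context words that hint at a lemma.
# A real system would score each candidate with the quantized network instead.
TOY_HINTS = {
    "날다": {"하늘을", "새"},  # "to fly": sky, bird
    "나": {"학생이다"},        # "I": am a student
}

def toy_score(context_tokens: Sequence[str]) -> Callable[[str], float]:
    """Build a scorer bound to one sentence's tokens."""
    context = set(context_tokens)

    def score(candidate: str) -> float:
        return float(len(TOY_HINTS.get(candidate, set()) & context))

    return score

# The surface form "나는" can come from 나 ("I"), 날다 ("to fly"), or 나다 ("to arise").
candidates = ["나", "날다", "나다"]
flying = pick_lemma(candidates, toy_score("하늘을 나는 새".split()))
student = pick_lemma(candidates, toy_score("나는 학생이다".split()))
print(flying, student)  # 날다 나
```

Because the model only ranks a fixed list, a wrong answer is still a valid lemma for that surface form, and words with a single candidate skip the model entirely.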