Hasty Briefsbeta

My trick for getting consistent classification from LLMs

9 days ago
  • #classification
  • #LLM
  • #vectorization
  • LLMs can be inconsistent in generating class labels, but semantically consistent.
  • A technique using vectorization and clustering can achieve deterministic labeling from stochastic LLMs.
  • Vectorization leads to label convergence, reducing the number of unique labels significantly.
  • Initial costs and latency are higher with vectorization, but they decrease over time as cache hits increase.
  • By the 10,000th tweet, vectorization is 10x cheaper and more efficient than pure LLM classification.
  • A Golang implementation is provided, including embedding generation, cache checks, and clustering logic.
  • Benchmarking shows the effectiveness of the technique in terms of cost, latency, and scalability.