Hasty Briefsbeta

Bilingual

Count-Min Sketches in JS – Frequencies, but without the data

6 months ago
  • #javascript
  • #data-structures
  • #probability
  • Daniel introduced Count-Min Sketches in Instant, a sync engine.
  • Count-Min Sketches are small, fast data structures for frequency estimates.
  • PG Wodehouse's novels were analyzed using a Count-Min Sketch.
  • The sketch can estimate word counts with 0.05% error rate and 99% confidence.
  • Count-Min Sketches have applications in password safety, link popularity, and database optimization.
  • The post explains how to build a Count-Min Sketch from scratch in JavaScript.
  • The sketch uses multiple hash functions and buckets to minimize errors.
  • Formulas relate error rate and confidence to the number of rows and columns in the sketch.
  • The full code example is available on GitHub.
  • Bonus sections cover probabilities and serialization of the sketch.