Count-Min Sketches in JS – Frequencies, but without the data
6 months ago
- #javascript
- #data-structures
- #probability
- Daniel introduced Count-Min Sketches in Instant, a sync engine.
- Count-Min Sketches are small, fast data structures for frequency estimates.
- PG Wodehouse's novels were analyzed using a Count-Min Sketch.
- The sketch can estimate word counts with 0.05% error rate and 99% confidence.
- Count-Min Sketches have applications in password safety, link popularity, and database optimization.
- The post explains how to build a Count-Min Sketch from scratch in JavaScript.
- The sketch uses multiple hash functions and buckets to minimize errors.
- Formulas relate error rate and confidence to the number of rows and columns in the sketch.
- The full code example is available on GitHub.
- Bonus sections cover probabilities and serialization of the sketch.