Hasty Briefs (beta)

A high-performance document search engine built in Rust with WebAssembly support

3 days ago
  • #Rust
  • #SearchEngine
  • #WebAssembly
  • Combines full-text search using FST (Finite State Transducers) with FSST compression for efficient storage and fast fuzzy matching.
  • Interactive demo available, showcasing search through 50,000 news articles from the AG News dataset.
  • Reported performance metrics: an index size of 11.48 MB (WASM), 5.20 MB after Brotli compression, and a search speed of ~1-3 ms per query.
  • Features include fast fuzzy search, compact storage with FSST compression, RAKE keyword extraction, and WebAssembly readiness.
  • Available as a standalone CLI tool for building .wasm files from document collections without requiring Rust tooling.
  • Installation instructions provided for macOS/Linux and Windows.
  • Supports multiple platforms including macOS (Intel/Apple Silicon), Linux (x64/ARM64), and Windows (x64/ARM64).
  • Building from source requires Rust, wasm-pack, and Node.js.
  • Document preparation involves creating a JSON file with document details.
  • Indexing phase includes keyword extraction, relevance scoring, FST mapping, and FSST compression.
  • Embedding phase involves parsing the WASM module, expanding its memory, and adding the index as a new data segment.
  • Search phase includes fuzzy matching, score accumulation, and decompression of document strings.
  • Leverages libraries like fst, fsst-rs, rake, serde/postcard, wasm-bindgen, and wasm-encoder/wasmparser.
  • Advertises sub-millisecond search times (the metrics above report ~1-3 ms per query), a 60-80% compression ratio, and instant startup with lazy index loading.
  • Inspired by technologies like Algolia, Typesense, Lunr.js, Stork Search, and tinysearch.
  • Key concepts include Finite State Transducers, RAKE Algorithm, and FSST Compression.
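
The indexing and search phases summarized above (keyword extraction, relevance scoring, fuzzy matching, score accumulation) can be sketched with the Rust standard library alone. This is a minimal illustration under stated assumptions, not the project's implementation: a `BTreeMap` stands in for the FST, term frequency stands in for RAKE scoring, a linear scan with Levenshtein distance stands in for an automaton-driven FST lookup, and FSST compression is omitted. The names `build_index`, `search`, and `edit_distance` are hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

/// Levenshtein edit distance between two strings, computed over chars.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Indexing phase (assumed shape): map each keyword to (doc id, score) postings.
/// Term frequency is a stand-in for RAKE relevance scores.
fn build_index(docs: &[&str]) -> BTreeMap<String, Vec<(usize, f32)>> {
    let mut index: BTreeMap<String, Vec<(usize, f32)>> = BTreeMap::new();
    for (doc_id, text) in docs.iter().enumerate() {
        let mut counts: HashMap<String, f32> = HashMap::new();
        for word in text.split(|c: char| !c.is_alphanumeric()) {
            // Skip empty and very short tokens as a crude stop-word filter.
            if word.len() > 2 {
                *counts.entry(word.to_lowercase()).or_insert(0.0) += 1.0;
            }
        }
        for (term, score) in counts {
            index.entry(term).or_default().push((doc_id, score));
        }
    }
    index
}

/// Search phase: fuzzy-match each query word against every indexed term,
/// then accumulate per-document scores, down-weighting inexact matches.
fn search(
    index: &BTreeMap<String, Vec<(usize, f32)>>,
    query: &str,
    max_edits: usize,
) -> Vec<(usize, f32)> {
    let mut totals: HashMap<usize, f32> = HashMap::new();
    for q in query.split_whitespace().map(str::to_lowercase) {
        // Linear scan; an FST would avoid visiting terms that cannot match.
        for (term, postings) in index {
            let d = edit_distance(&q, term);
            if d <= max_edits {
                let weight = 1.0 / (1.0 + d as f32);
                for &(doc_id, score) in postings {
                    *totals.entry(doc_id).or_insert(0.0) += score * weight;
                }
            }
        }
    }
    let mut ranked: Vec<(usize, f32)> = totals.into_iter().collect();
    // Highest score first; ties broken by ascending document id.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}
```

Calling `build_index` on a slice of document strings and then `search(&index, "serach", 2)` returns every document containing a term within two edits of the misspelled query, ranked by accumulated score. The linear scan is the main simplification here: it costs time proportional to the vocabulary size on every query, which is the cost an FST-based lookup is designed to avoid.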