Ohm's PEG-to-Wasm Compiler

3 days ago
  • #WebAssembly
  • #parsing
  • #performance
  • Ohm is a parsing toolkit for JavaScript and TypeScript, useful for custom file formats or building language tools.
  • Version 18 is a complete rewrite that compiles grammars into WebAssembly, parsing more than 50x faster while using 10% of the memory of previous versions.
  • Previous versions interpreted the grammar as an AST: each parsing expression was a node in a PExpr tree, evaluated by calling methods such as 'eval'.
  • The new engine compiles grammars to WebAssembly, which avoids interpretation overhead and allows the code for expressions to be inlined.
  • CST nodes are managed with a bump allocator in Wasm linear memory, using region-based management to reduce overhead.
  • Terminal nodes are optimized using tagged 32-bit values to avoid per-character allocations.
  • Bindings are stored in fixed-size chunks, which improves performance by eliminating array resizes and making backtracking cheap.
  • Memoization uses a block-sparse table for efficient storage of parsing results, with entries packed into i32 values.
  • Parameterized rules are handled via static specialization, generating unique rule bodies for each parameter combination.
  • Optimized space skipping avoids creating CST nodes for whitespace until needed, improving performance in many grammars.
  • Additional optimizations include single-use rule inlining and preallocated nodes for fixed-structure elements.
  • The release is available as a beta via npm, with acknowledgments to funders and contributors like Alex Warth.
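
To make two of the points above more concrete (bump allocation of CST nodes in linear memory, and tagged 32-bit values for terminals), here is a minimal sketch in TypeScript. It is not Ohm's actual code or memory layout: the names `BumpArena`, `makeTerminal`, `isTerminal`, and `terminalLength`, the choice of the low bit as the tag, the decision to store the match length in the tagged value, and the `[ruleId, childCount, ...children]` node layout are all assumptions made for illustration. A `Uint32Array` stands in for Wasm linear memory.

```typescript
// Hypothetical sketch, NOT Ohm's real layout or API.
// A 32-bit value is either:
//   - a terminal: low bit 1, upper 31 bits hold the matched length (assumed encoding)
//   - a pointer:  low bit 0, a 4-byte-aligned byte offset into the arena
const TERMINAL_TAG = 1;

function makeTerminal(matchLength: number): number {
  // No allocation: the whole terminal fits in one tagged 32-bit value.
  return ((matchLength << 1) | TERMINAL_TAG) >>> 0;
}

function isTerminal(value: number): boolean {
  return (value & TERMINAL_TAG) === 1;
}

function terminalLength(value: number): number {
  return value >>> 1;
}

class BumpArena {
  // Stand-in for Wasm linear memory.
  private readonly words: Uint32Array;
  // Byte offset of the next free slot; offset 0 is reserved as "null" in this sketch.
  private top = 4;

  constructor(byteSize: number) {
    this.words = new Uint32Array(byteSize >>> 2);
  }

  // Allocates a nonterminal node laid out as [ruleId, childCount, ...children]
  // and returns its byte offset. Allocation is just a bounds check and a pointer bump.
  allocNode(ruleId: number, children: number[]): number {
    const byteLength = (2 + children.length) * 4;
    if (this.top + byteLength > this.words.byteLength) {
      throw new Error("arena exhausted");
    }
    const ptr = this.top;
    const base = ptr >>> 2;
    this.words[base] = ruleId;
    this.words[base + 1] = children.length;
    this.words.set(children, base + 2);
    this.top += byteLength;
    return ptr;
  }

  childAt(nodePtr: number, index: number): number {
    return this.words[(nodePtr >>> 2) + 2 + index];
  }

  // Region-style management: remember a position, then free everything
  // allocated after it in one step (for example, when a parse attempt fails).
  mark(): number {
    return this.top;
  }

  reset(mark: number): void {
    this.top = mark;
  }
}
```

In this sketch, discarding the nodes of a failed alternative costs a single assignment in `reset`, and a terminal never touches the arena at all, which is the general idea behind both optimizations as summarized above.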