Hasty Briefs (beta)

Adding lookbehinds to rust-lang/regex

10 months ago
  • #rust
  • #regex
  • #lookbehind
  • Implementation of unbounded captureless lookbehinds in Rust's regex engine.
  • Lookbehinds allow regexes to make assertions about preceding text without including it in the match.
  • Negative lookbehinds are also supported; they assert that a given pattern does not precede the current position.
  • The regex engine is split into the 'regex-syntax' crate for parsing and the 'regex-automata' crate for matching.
  • The PikeVM engine was modified to support lookbehinds with new NFA states: 'WriteLookAround' and 'CheckLookAround'.
  • Performance optimizations were implemented to avoid unnecessary scanning to the end of the haystack.
  • Bounded lookbehind optimization improved performance by up to 150x in benchmarks.
  • A backtracking engine with memoization was also extended to support lookbehinds.
  • Benchmarks showed the implementation is 2-5x slower than Python's 're' module, but it keeps matching time linear.
  • The work lays the foundation for future extensions like lookaheads and benefits the Rust ecosystem.
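
To make the lookbehind semantics and the write-then-check idea above more concrete, here is a minimal sketch in plain Rust using only the standard library. It is an illustration under stated assumptions, not the code from the post. It handles only literal strings, whereas the real engine works on NFA states. The function names `write_lookbehind_table` and `find_with_lookbehind` are hypothetical. The mapping onto 'WriteLookAround' and 'CheckLookAround' is a guess based on the state names alone. The sketch records, for every offset, whether the lookbehind text ends there, then consults that record before accepting a match.

```rust
// Illustration only: literal strings stand in for sub-patterns, and the
// function names are hypothetical. This is not the regex-automata code.

/// "Write" step (assumed analogue of WriteLookAround): for every byte
/// offset `i` in `haystack`, record whether `behind` ends exactly at `i`.
/// The table has `haystack.len() + 1` entries, one per possible offset.
fn write_lookbehind_table(haystack: &str, behind: &str) -> Vec<bool> {
    let h = haystack.as_bytes();
    let b = behind.as_bytes();
    let mut table = vec![false; h.len() + 1];
    // Offsets smaller than `behind.len()` can never satisfy the assertion.
    for end in b.len()..=h.len() {
        table[end] = &h[end - b.len()..end] == b;
    }
    table
}

/// "Check" step (assumed analogue of CheckLookAround): return the start
/// offsets of `main` in `haystack`, keeping only those where the recorded
/// lookbehind result agrees with the assertion.
///
/// With `negate == false` this behaves like `(?<=behind)main`;
/// with `negate == true` it behaves like `(?<!behind)main`.
/// The preceding text is tested but never becomes part of the match.
fn find_with_lookbehind(
    haystack: &str,
    behind: &str,
    main: &str,
    negate: bool,
) -> Vec<usize> {
    let table = write_lookbehind_table(haystack, behind);
    haystack
        .match_indices(main)
        .map(|(start, _)| start)
        // Positive lookbehind keeps offsets where the table is true,
        // negative lookbehind keeps offsets where it is false.
        .filter(|&start| table[start] != negate)
        .collect()
}
```

For the haystack `"foobar bar xbar foobar"`, calling `find_with_lookbehind(h, "foo", "bar", false)` yields the offsets `[3, 19]` (the two `bar`s preceded by `foo`), and passing `true` yields `[7, 12]`. In both cases the reported offset is where `bar` starts, so the asserted text stays outside the match. Comparing literals at each offset is not linear in general. The post's engine reportedly keeps linear time by running the lookbehind as part of the automaton, which this sketch does not attempt to reproduce.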