Hasty Briefsbeta

It is surprising that Earley can efficiently parse C, ambiguities and all

a day ago
  • #computer science
  • #parsing
  • #Earley algorithm
  • Earley parsing algorithm overview and optimizations, including Leo's right-recursion fixes.
  • Parsing challenges in computer science, especially with grammars requiring localized ambiguity or arbitrary recursion.
  • Definitions: recognizer, scannerless parser, parse tree, AST, parse forest, and complexity notations (O(n), O(n^2), etc.).
  • Earley parsers handle arbitrary context-free grammars with O(n^3) worst-case complexity, optimized to O(n) for sane grammars.
  • Elizabeth Scott's contribution: generating parse forests in the same complexity as Earley recognition, addressing ambiguity.
  • Challenges with left-to-right disambiguation rules in right-to-left parse forests, especially in C's dangling else problem.
  • Proposed fixes: reversing grammar and input token stream, with trade-offs in implementation complexity and practicality.
  • Special considerations for parsing C, involving typedefs and the need for multiple recognition and parsing phases.
  • Comparison with other parsing algorithms: CYK, GLR, GLL, Parsing With Derivatives, and Parsing With Zippers.
  • Personal implementation experience confirming the effectiveness of the described approach for C parsing.