Reproachfully Presenting Resilient Recursive Descent Parsing

11 days ago
  • #rust
  • #parsing
  • #programming-languages
  • This post, part of a series on creating a programming language in Rust, focuses on parsing.
  • Parsing is the first pass of the compiler, converting source text into an in-memory representation (a tree).
  • The author expresses disdain for parsing tutorials, noting the abundance of such resources and the persistent challenges in syntax design.
  • Parsing serves two main roles: validating syntax and creating a structured representation (usually a tree) of the source code.
  • The post discusses the complexity of parsing, including choices about syntax (e.g., whitespace sensitivity, keywords) and parsing methods (e.g., parser generators, recursive descent).
  • The chosen parsing strategy is recursive descent, favored because it is simple and adapts well to error recovery and full-fidelity parsing.
  • The parser uses the `rowan` library to construct a concrete syntax tree (CST), which retains every character of the source text, whitespace included, to support error recovery and IDE tooling.
  • Lexing is introduced as a preliminary step to tokenize the input, using the `logos` library for efficient token recognition.
  • The language's syntax covers identifiers, integers, functions, function applications, and let expressions, each with its own parsing rule.
  • Error handling is emphasized, with strategies for resilience (continuing past errors) and full fidelity (representing all source text in the CST).
  • The post details the implementation of parsing functions for atoms, applications, let expressions, and the overall program structure.
  • A final `parse` function combines all components, returning both the CST and any parsing errors.
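
The points above on resilience, full fidelity, and a `parse` function that returns both a tree and a list of errors can be illustrated with a minimal sketch. It is not the post's code: it uses only the standard library rather than `logos` and `rowan`, it omits functions and parentheses to stay short, and the names `Kind`, `Tree`, `Parser`, `lex`, `text`, and `parse`, along with the concrete `let x = e; e` syntax, are assumptions made for the example.

```rust
// Sketch only: hand-rolled lexer and tree (not `logos`/`rowan`); the `let x = e; e` syntax is assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Ident, Int, Let, Equals, Semi, Space, Unknown }
#[derive(Debug, PartialEq)]
enum Tree { Token(Kind, String), Node(&'static str, Vec<Tree>) }

/// Concatenates all token text; full fidelity means this reproduces the input.
fn text(tree: &Tree) -> String {
    match tree {
        Tree::Token(_, s) => s.clone(),
        Tree::Node(_, kids) => kids.iter().map(text).collect(),
    }
}

fn lex(src: &str) -> Vec<(Kind, String)> {
    use Kind::*;
    let mut out: Vec<(Kind, String)> = Vec::new();
    for c in src.chars() {
        let kind = match c {
            '=' => Equals,
            ';' => Semi,
            _ if c.is_whitespace() => Space,
            _ if c.is_ascii_digit() => Int,
            _ if c.is_alphabetic() || c == '_' => Ident,
            _ => Unknown,
        };
        // Extend the previous token for runs of whitespace, digits, or identifier chars.
        let grow = matches!((out.last().map(|t| t.0), kind),
            (Some(Space), Space) | (Some(Int), Int) | (Some(Ident), Ident | Int));
        if grow { out.last_mut().unwrap().1.push(c); } else { out.push((kind, c.to_string())); }
    }
    for tok in &mut out { if tok.1 == "let" { tok.0 = Let; } }
    out
}

struct Parser { toks: Vec<(Kind, String)>, pos: usize, errors: Vec<String> }
impl Parser {
    fn peek(&self) -> Option<Kind> { self.toks.get(self.pos).map(|t| t.0) }

    /// Moves the current token, plus any whitespace after it, into `out`.
    fn bump(&mut self, out: &mut Vec<Tree>) {
        let (kind, text) = self.toks[self.pos].clone();
        out.push(Tree::Token(kind, text));
        self.pos += 1;
        if self.peek() == Some(Kind::Space) { self.bump(out); }
    }

    /// Consumes `kind` if present; otherwise records an error and consumes nothing.
    fn expect(&mut self, kind: Kind, out: &mut Vec<Tree>) {
        if self.peek() == Some(kind) { return self.bump(out); }
        self.errors.push(format!("expected {:?}, found {:?}", kind, self.peek()));
    }

    fn expr(&mut self) -> Tree {
        if self.peek() != Some(Kind::Let) { return self.app(); }
        let mut kids = Vec::new();
        self.bump(&mut kids); // the `let` keyword
        self.expect(Kind::Ident, &mut kids);
        self.expect(Kind::Equals, &mut kids);
        kids.push(self.expr());
        self.expect(Kind::Semi, &mut kids);
        kids.push(self.expr());
        Tree::Node("Let", kids)
    }

    /// Application: one atom followed by zero or more argument atoms.
    fn app(&mut self) -> Tree {
        let mut tree = self.atom();
        while matches!(self.peek(), Some(Kind::Ident | Kind::Int)) {
            tree = Tree::Node("App", vec![tree, self.atom()]);
        }
        tree
    }

    fn atom(&mut self) -> Tree {
        let mut kids = Vec::new();
        let name = match self.peek() {
            Some(Kind::Ident) => "Var",
            Some(Kind::Int) => "Int",
            found => {
                self.errors.push(format!("expected expression, found {:?}", found));
                "Error"
            }
        };
        // Recovery: leave `=`, `;`, and end of input for the enclosing rule; consume anything else.
        if !matches!(self.peek(), None | Some(Kind::Equals | Kind::Semi)) { self.bump(&mut kids); }
        Tree::Node(name, kids)
    }
}

/// Parses `src`, always returning a tree alongside whatever errors were found.
fn parse(src: &str) -> (Tree, Vec<String>) {
    let mut p = Parser { toks: lex(src), pos: 0, errors: Vec::new() };
    let mut kids = Vec::new();
    if p.peek() == Some(Kind::Space) { p.bump(&mut kids); } // leading whitespace
    kids.push(p.expr());
    if p.peek().is_some() { p.errors.push("unexpected trailing input".to_string()); }
    while p.peek().is_some() { p.bump(&mut kids); }
    (Tree::Node("Program", kids), p.errors)
}
```

In this sketch, `parse("let x = ; y")` records a single error for the missing expression and still returns a complete tree, and `text` applied to that tree gives back the input string unchanged. The recovery rule in `atom` is the key design choice: tokens that an enclosing rule is waiting for are left in place, while anything else is wrapped in an `Error` node so the parser always makes progress.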