Reproachfully Presenting Resilient Recursive Descent Parsing
11 days ago
- #rust
- #parsing
- #programming-languages
- This post, part of a series on creating a programming language in Rust, focuses on parsing.
- Parsing is the first pass of the compiler, converting source text into an in-memory representation (a tree).
- The author expresses disdain for parsing tutorials, noting the abundance of such resources and the persistent challenges in syntax design.
- Parsing serves two main roles: validating syntax and creating a structured representation (usually a tree) of the source code.
- The post discusses the complexity of parsing, including choices about syntax (e.g., whitespace sensitivity, keywords) and parsing methods (e.g., parser generators, recursive descent).
- The chosen parsing strategy is recursive descent, favored for its simplicity and adaptability to error handling and full-fidelity parsing.
- The parser uses the `rowan` library to construct a concrete syntax tree (CST), which retains all of the source text so that it can support error recovery and IDE features.
- Lexing is introduced as a preliminary step that splits the input into tokens, using the `logos` library for efficient token recognition.
- The syntax includes identifiers, integers, functions, applications, and let expressions, with specific rules for each.
- Error handling is emphasized, with strategies for resilience (continuing past errors) and full fidelity (representing all source text in the CST).
- The post details the implementation of parsing functions for atoms, applications, let expressions, and the overall program structure.
- A final `parse` function combines all components, returning both the CST and any parsing errors.
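
The points above can be sketched as a small, self-contained parser. This is a minimal illustration and not the post's implementation. It uses only the standard library in place of `logos` and `rowan`. It assumes a toy grammar (`let x = e; body`, application by juxtaposition, identifiers and integers as atoms) that may differ from the post's concrete syntax, and it omits functions. All names here (`Kind`, `Tree`, `Parser`, `run`, `bump`, `expect`) are hypothetical.

```rust
// Toy stand-in: std only, no `logos` or `rowan`; the grammar is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind { Ident, Int, Let, Eq, Semi, Space, Unknown }

/// Byte length of the leading run of `s` whose chars satisfy `pred`.
fn run(s: &str, pred: fn(char) -> bool) -> usize {
    s.find(|c: char| !pred(c)).unwrap_or(s.len())
}

/// Lexer: every byte of `src` lands in exactly one token.
fn lex(src: &str) -> Vec<(Kind, &str)> {
    let (mut out, mut rest) = (Vec::new(), src);
    while let Some(c) = rest.chars().next() {
        let (kind, len) = match c {
            '=' => (Kind::Eq, 1),
            ';' => (Kind::Semi, 1),
            c if c.is_whitespace() => (Kind::Space, run(rest, char::is_whitespace)),
            c if c.is_ascii_digit() => (Kind::Int, run(rest, |c: char| c.is_ascii_digit())),
            c if c.is_alphabetic() => (Kind::Ident, run(rest, char::is_alphanumeric)),
            c => (Kind::Unknown, c.len_utf8()),
        };
        let (text, tail) = rest.split_at(len);
        out.push((if text == "let" { Kind::Let } else { kind }, text));
        rest = tail;
    }
    out
}

/// Full-fidelity tree: leaves hold the exact source text, whitespace included.
enum Tree { Token(Kind, String), Node(&'static str, Vec<Tree>) }

impl Tree {
    /// Concatenate every token's text; this reproduces the input exactly.
    fn text(&self) -> String {
        match self {
            Tree::Token(_, t) => t.clone(),
            Tree::Node(_, kids) => kids.iter().map(Tree::text).collect(),
        }
    }
}

struct Parser<'a> { toks: Vec<(Kind, &'a str)>, pos: usize, errors: Vec<String> }

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<Kind> { self.toks.get(self.pos).map(|t| t.0) }

    /// Move the current token, plus any whitespace after it, into `out`.
    fn bump(&mut self, out: &mut Vec<Tree>) {
        let (kind, text) = self.toks[self.pos];
        out.push(Tree::Token(kind, text.to_string()));
        self.pos += 1;
        if self.peek() == Some(Kind::Space) { self.bump(out); }
    }

    /// Resilience: a missing token is recorded as an error, nothing is consumed.
    fn expect(&mut self, kind: Kind, out: &mut Vec<Tree>) {
        if self.peek() == Some(kind) { return self.bump(out); }
        self.errors.push(format!("expected {:?}", kind));
    }

    fn atom(&mut self) -> Option<Tree> {
        if !matches!(self.peek()?, Kind::Ident | Kind::Int) { return None; }
        let mut kids = vec![];
        self.bump(&mut kids);
        Some(Tree::Node("Atom", kids))
    }

    /// One or more atoms, folded left into nested application nodes.
    fn app(&mut self) -> Tree {
        let Some(mut lhs) = self.atom() else {
            self.errors.push("expected expression".to_string());
            return Tree::Node("Error", vec![]);
        };
        while let Some(arg) = self.atom() { lhs = Tree::Node("App", vec![lhs, arg]); }
        lhs
    }

    fn expr(&mut self) -> Tree {
        if self.peek() != Some(Kind::Let) { return self.app(); }
        let mut kids = vec![];
        self.bump(&mut kids); // the `let` keyword
        self.expect(Kind::Ident, &mut kids);
        self.expect(Kind::Eq, &mut kids);
        kids.push(self.expr());
        self.expect(Kind::Semi, &mut kids);
        kids.push(self.expr());
        Tree::Node("Let", kids)
    }
}

/// Parse `src`, always returning a tree together with the errors found.
fn parse(src: &str) -> (Tree, Vec<String>) {
    let mut p = Parser { toks: lex(src), pos: 0, errors: vec![] };
    let mut kids = vec![];
    if p.peek() == Some(Kind::Space) { p.bump(&mut kids); }
    kids.push(p.expr());
    if p.pos < p.toks.len() {
        p.errors.push("unexpected trailing input".to_string());
        let mut junk = vec![];
        while p.pos < p.toks.len() { p.bump(&mut junk); }
        kids.push(Tree::Node("Error", junk));
    }
    (Tree::Node("Program", kids), p.errors)
}
```

In this sketch, `parse("let x = ; f")` still yields a complete tree, reports one error for the missing expression, and `text()` on the tree gives back the original input unchanged. A `rowan`-based parser would build its CST through the library's builder rather than a plain enum like `Tree`.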