How Other Link Checkers Do Recursion

4 days ago

Recursion in link checkers is handled by architecture, not by a clever trick: crawlers are designed around cycles from the start, whereas lychee is built as a stream-based DAG.
Deduplication must happen at enqueue time, before any request is made; this is a key fix for race conditions that lychee previously missed.
Every tool solves termination detection with an explicit mechanism: a WaitGroup (muffet), joinable queues (LinkChecker), onIdle() promises (linkinator), or drain events (broken-link-checker).
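The frontier and the rate limiter must be separate components; using a single bounded channel for both causes deadlock, because workers block while sending newly discovered links into a full channel that only they can drain.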
Runtime differences affect how easy this is: Node.js's single-threaded event loop simplifies deduplication, Go's goroutines simplify concurrency, and Rust's ownership model adds friction.
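
The points above can be sketched together in a few dozen lines. This is a minimal illustration in Python, not the implementation of lychee or of any tool named here: `SITE`, `crawl`, `fetch_links`, `workers`, and `max_in_flight` are hypothetical names, and the "site" is an in-memory dictionary rather than real HTTP. It assumes a joinable queue as the frontier (in the spirit of the joinable-queue approach), a separate semaphore as the concurrency limiter, and a `seen` set that is checked when a URL is enqueued rather than when it is fetched.

```python
import queue
import threading

# Hypothetical in-memory site: page -> links on that page. Note the cycles.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/", "/b", "/c"],
    "/b": ["/a"],
    "/c": [],
}


def crawl(start, fetch_links, workers=4, max_in_flight=2):
    """Return every URL fetched, each exactly once."""
    frontier = queue.Queue()  # unbounded, so put() never blocks a worker
    limiter = threading.Semaphore(max_in_flight)  # separate from the frontier
    seen = {start}  # every URL ever enqueued, not just those already fetched
    fetched = []
    lock = threading.Lock()  # guards both `seen` and `fetched`

    def enqueue(url):
        # Dedup at enqueue time: check-and-mark is atomic under the lock.
        with lock:
            if url in seen:
                return
            seen.add(url)
        frontier.put(url)

    def worker():
        while True:
            url = frontier.get()
            if url is None:  # shutdown sentinel
                frontier.task_done()
                return
            try:
                with limiter:  # caps concurrent fetches only
                    links = fetch_links(url)
                with lock:
                    fetched.append(url)
                # Children are enqueued before task_done(), so the pending
                # count cannot reach zero while work is still being discovered.
                for link in links:
                    enqueue(link)
            except Exception:
                pass  # sketch only: a failed fetch contributes no new links
            finally:
                frontier.task_done()

    frontier.put(start)
    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    frontier.join()  # termination: returns once every queued URL is done
    for _ in threads:
        frontier.put(None)
    for t in threads:
        t.join()
    return fetched


if __name__ == "__main__":
    print(sorted(crawl("/", lambda url: SITE[url])))
```

Because the frontier is unbounded and the semaphore only wraps the fetch, a worker never blocks while handing discovered links back to the queue, which is the situation that can deadlock when one bounded channel plays both roles. The trade-off in this sketch is that the frontier's memory use is limited only by the number of distinct URLs.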
