Hasty Briefs (beta)

My Favorite Bugs: Invalid Surrogate Pairs

4 hours ago
  • #Unicode
  • #JavaScript
  • #Debugging
  • The author's favorite bug involved invalid surrogate pairs causing silent sync failures in a collaborative editor.
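  • The bug was triggered by specific edits, such as inserting one emoji directly next to another, which could split a surrogate pair (the two UTF-16 code units that together encode one character outside the Basic Multilingual Plane).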
  • Debugging traced the cause to JavaScript's string methods, which operate on UTF-16 code units rather than on code points or grapheme clusters.
  • The faulty code was lib0's splice method, which used .slice() and could cut a string between the two halves of a pair, leaving orphaned surrogates that led to URI errors.
  • The temporary fix added an error listener and turned emoji into atomic nodes; lib0 itself was patched later.
  • The modern solution is to use Intl.Segmenter for grapheme-aware string manipulation, which avoids this class of bug.
  • This bug highlights the pitfalls of UTF-16 in JavaScript and how Unicode complexities can break applications.
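
The failure mode summarized above can be reproduced in a few lines. The sketch below is illustrative only: naiveSplice, hasLoneSurrogate, and graphemeSplice are hypothetical helper names, not lib0's actual code, and the sample string is an assumption chosen to put the cut in the middle of an emoji. It shows a code-unit splice orphaning a surrogate, encodeURIComponent rejecting the result with a URIError, and an Intl.Segmenter-based splice that indexes by grapheme instead.

```javascript
// Hypothetical helpers for illustration; this is not lib0's implementation.

// Naive splice: `index` and `remove` count UTF-16 code units,
// so the cut can land between the two halves of a surrogate pair.
function naiveSplice(str, index, remove, insert = "") {
  return str.slice(0, index) + insert + str.slice(index + remove);
}

// Returns true if the string contains a surrogate without its partner.
function hasLoneSurrogate(str) {
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: the next unit must be a low surrogate.
      const next = str.charCodeAt(i + 1); // NaN at end of string
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low half of a valid pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

// Grapheme-aware splice: `index` and `remove` count user-perceived characters.
function graphemeSplice(str, index, remove, insert = "") {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const graphemes = Array.from(segmenter.segment(str), (s) => s.segment);
  graphemes.splice(index, remove, insert);
  return graphemes.join("");
}

const text = "a😀b"; // 😀 is U+1F600, stored as two code units: \uD83D \uDE00
console.log(text.length); // 4

// Code-unit index 2 sits in the middle of 😀, so the pair is split.
const broken = naiveSplice(text, 2, 0, "😎");
console.log(hasLoneSurrogate(broken)); // true

// Encoding a string with a lone surrogate throws.
let uriError = null;
try {
  encodeURIComponent(broken);
} catch (err) {
  uriError = err;
}
console.log(uriError instanceof URIError); // true

// Grapheme index 1 is the boundary between "a" and 😀, so nothing is split.
const fixed = graphemeSplice(text, 1, 0, "😎");
console.log(hasLoneSurrogate(fixed)); // false
```

Building a segmenter and an array on every call is costly for long documents, so a real editor would likely cache the segmenter or only adjust indexes near the edit point. Newer runtimes also offer String.prototype.isWellFormed(), which could replace the hand-written hasLoneSurrogate check.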