Noroboto: Lying Fonts and Mitigation in Rust

8 hours ago
  • #LegalTech AI
  • #Unicode Security
  • #PDF Rendering
  • The most exciting phrase in science, heralding new discoveries, is 'That's funny...'
  • Switching from PDFium to hayro for PDF rendering in Rust surfaced a bug involving non-Unicode values for the double-t 'tt' sequence; the same bug also affected PDFium.
  • This discovery raised concerns that adversaries could exploit the complexity and imperfections of specifications in legal tech stacks (AI-native law firms).
  • Noroboto.ttf is a malicious font that obfuscates Unicode mappings in embedded fonts, aiming to deceive AI agents in legal pipelines by using Private Use Areas (PUA).
  • Full obfuscation was partially defeated by advanced LLMs (e.g., ChatGPT 5.5), but partial obfuscation and Unicode replacement attacks proved more effective by exploiting agent laziness.
  • Partial obfuscation hides only the adversarial terms (e.g., 'successors and assigns' in an NDA), while replacement keeps the human-visible text intact but swaps the underlying Unicode values (e.g., glyphs that read 'Maryland' carry the Unicode values for 'Delaware').
  • A proof-of-concept mitigation in Tritium uses Rust to verify font accuracy by comparing expected ASCII strings with OCR results, calculating a Levenshtein distance-based accuracy score.
  • The approach builds a font atlas, renders the glyphs, and runs OCR on the result to detect deceptive fonts. Tests confirmed perfect accuracy scores for legitimate fonts and imperfect scores for malicious ones.
  • Ethical and legal considerations of such attacks are noted, with prior art referenced, and the ease of generating these attacks with off-the-shelf models is highlighted.
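
The accuracy check described in the mitigation bullets can be sketched roughly as follows. This is a minimal illustration, not Tritium's actual code: the function names (`levenshtein`, `accuracy`, `is_private_use`) are hypothetical, the rendering and OCR steps are left out (`recognized` stands in for whatever the OCR engine returns), and normalizing the distance by the longer string's length is an assumption about how the score might be computed.

```rust
/// Levenshtein edit distance between two strings, counted in Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the prefix of `a` processed so far
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical accuracy score in [0.0, 1.0]: 1.0 means the OCR result matches
/// the string the font claims to encode. Assumption: normalize by the longer length.
fn accuracy(expected: &str, recognized: &str) -> f64 {
    let longest = expected.chars().count().max(recognized.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(expected, recognized) as f64 / longest as f64
}

/// True if `c` lies in one of the three Unicode Private Use Areas
/// (BMP, plane 15, plane 16).
fn is_private_use(c: char) -> bool {
    matches!(
        c as u32,
        0xE000..=0xF8FF | 0xF0000..=0xFFFFD | 0x100000..=0x10FFFD
    )
}
```

Under these assumptions, a legitimate font yields `accuracy(expected, recognized) == 1.0`, while a font whose glyphs read 'Maryland' but whose mappings say 'Delaware' scores below 1.0. The `is_private_use` helper is a cheap complementary check: extracted text that contains PUA code points is a hint that the embedded font's mappings deserve a closer look.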