Show HN: Unicode Steganography

13 hours ago
  • #Steganography
  • #AI Safety
  • #LLM Security
  • Unicode steganography uses invisible characters and visual lookalikes to hide secret messages in text, posing risks for LLM safety.
  • A capable model could embed covert signals in outputs that are invisible to humans but recoverable by other models or processes.
  • The two techniques, zero-width characters and homoglyph substitutions, differ in detectability, capacity, and robustness.
  • LLM steganography is a concrete example of AI deception: both techniques fool human readers, though automated scanners built to look for them can catch them.
  • A key AI alignment challenge is whether a model could create new encodings that evade both human review and unknown scanners.
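
To make the zero-width technique and the scanner idea from the points above concrete, here is a minimal Python sketch. It is not the code from the original post. The bit mapping (U+200B for 0, U+200C for 1), the function names `hide`, `reveal`, and `scan`, and the "any non-ASCII letter is suspicious" heuristic are all assumptions chosen for illustration; that heuristic would also flag ordinary non-English text, so a real homoglyph detector would need a proper confusables table.

```python
import unicodedata

# Assumed mapping for illustration: U+200B encodes bit 0, U+200C encodes bit 1.
ZERO, ONE = "\u200b", "\u200c"


def hide(cover: str, secret: str) -> str:
    """Append the secret's UTF-8 bits to the cover text as zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    return cover + "".join(ONE if b == "1" else ZERO for b in bits)


def reveal(text: str) -> str:
    """Recover the hidden message from the zero-width characters in the text."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore any trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def scan(text: str) -> list:
    """Flag invisible format characters and (crudely) possible homoglyphs."""
    findings = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            # Zero-width characters fall in the Unicode "Cf" (format) category.
            findings.append((i, f"invisible U+{ord(ch):04X}"))
        elif ch.isalpha() and not ch.isascii():
            # Hypothetical heuristic: treat any non-ASCII letter as a lookalike.
            findings.append((i, f"non-ASCII letter U+{ord(ch):04X}"))
    return findings
```

For example, `reveal(hide("Hello", "hi"))` returns `"hi"`, and the carrier text renders like plain `Hello` in most fonts, while `scan` reports one finding per hidden bit. The sketch also shows the capacity trade-off mentioned above: each hidden byte costs eight invisible characters, and any pipeline that strips or normalizes format characters destroys the message.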