Show HN: Unicode Steganography
13 hours ago
- #Steganography
- #AI Safety
- #LLM Security
- Unicode steganography uses invisible characters and visual lookalikes to hide secret messages in text, posing risks for LLM safety.
- A capable model could embed covert signals in outputs that are invisible to humans but recoverable by other models or processes.
- The two techniques, zero-width characters and homoglyph substitutions, differ in detectability, capacity, and robustness.
- LLM steganography is a concrete example of AI deception: both techniques fool human readers, but each can be caught by an automated scanner built to look for it.
- A key AI alignment challenge is whether a model could invent new encodings that evade both human review and scanners it does not know about.
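
To make the two techniques concrete, here is a minimal Python sketch, not the post's actual implementation. It assumes a hypothetical scheme in which U+200B and U+200C stand for bits 0 and 1 and the hidden payload is appended to the cover text. The `hide`, `reveal`, and `scan` names are illustrative, and the scanner's "non-Latin letter" rule is a deliberately crude heuristic that only makes sense for text expected to be plain Latin script.

```python
import unicodedata

# Assumption: these two zero-width code points encode bits 0 and 1.
ZW0 = "\u200b"  # ZERO WIDTH SPACE
ZW1 = "\u200c"  # ZERO WIDTH NON-JOINER


def hide(cover: str, secret: str) -> str:
    """Append the secret's UTF-8 bits to the cover text as zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    return cover + "".join(ZW1 if bit == "1" else ZW0 for bit in bits)


def reveal(text: str) -> str:
    """Collect the zero-width characters and decode them back into a string."""
    bits = "".join("1" if ch == ZW1 else "0" for ch in text if ch in (ZW0, ZW1))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def scan(text: str) -> list[tuple[int, str]]:
    """Flag invisible format characters and letters outside the Latin script.

    Heuristic only: legitimate non-Latin text will also be flagged.
    """
    findings = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        name = unicodedata.name(ch, "UNNAMED")
        if category == "Cf":
            # Format characters such as zero-width spaces and joiners.
            findings.append((index, name))
        elif category.startswith("L") and not name.startswith("LATIN"):
            # Possible homoglyph, e.g. Cyrillic "а" standing in for Latin "a".
            findings.append((index, name))
    return findings
```

For example, `reveal(hide("hello", "hi"))` returns `"hi"` even though the carrier still renders as `hello`, and `scan("p\u0430y")` flags the Cyrillic letter at index 1. The sketch also hints at the trade-offs in the summary: zero-width payloads carry 1 bit per inserted character but disappear if the text is normalized or retyped, while a single scanner pass finds both techniques only because it already knows what to look for.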