Show HN: Pseudonymizing sensitive data for LLMs without losing context

8 hours ago
  • #Data Privacy
  • #Incident Response
  • #LLM Security
  • Built a Data Loss Prevention proxy to pseudonymize sensitive data for LLMs while retaining triage context.
  • The initial regex-based approach caused the LLM to hallucinate; later versions improved on it with NER, structured pseudonyms, and context-aware replacements.
  • V3 preserves metadata such as the ASN for IP addresses and classifies domains, so the model can keep reasoning about the data without seeing the real values.
  • Combats false positives with layered detection, skiplists, and allowlists to avoid redacting technical terms.
  • Handles streaming with a tail buffer to ensure pseudonyms are properly restored across chunk boundaries.
  • Open-sourced as token-proxy on GitHub, provider-agnostic and configurable for various environments.
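
The tail-buffer idea for streaming can be illustrated with a minimal sketch. This is not token-proxy's actual code: the `<KIND_N>` pseudonym format, the `StreamRestorer` class, and the `MAX_TOKEN_LEN` bound are all assumptions made for illustration. The sketch holds back any trailing text that might be the start of a pseudonym until the next chunk arrives, so a token split across two chunks is still restored.

```python
import re

# Hypothetical pseudonym format: <KIND_N>, e.g. <IP_1>. The real tool may differ.
TOKEN = re.compile(r"<[A-Z]+_\d+>")
MAX_TOKEN_LEN = 32  # assumed upper bound on the length of one pseudonym


class StreamRestorer:
    """Restores pseudonyms in streamed text, buffering a possibly split tail."""

    def __init__(self, mapping):
        self.mapping = mapping  # pseudonym -> original value
        self.tail = ""          # held-back text that may be a partial pseudonym

    def _restore(self, text):
        # Unknown tokens are passed through unchanged.
        return TOKEN.sub(lambda m: self.mapping.get(m.group(0), m.group(0)), text)

    def feed(self, chunk):
        data = self.tail + chunk
        cut = data.rfind("<")
        # Hold back from the last unclosed "<" if it could still become a token.
        if cut != -1 and ">" not in data[cut:] and len(data) - cut < MAX_TOKEN_LEN:
            self.tail = data[cut:]
            data = data[:cut]
        else:
            self.tail = ""
        return self._restore(data)

    def flush(self):
        # End of stream: emit whatever is still buffered.
        out, self.tail = self._restore(self.tail), ""
        return out


# The pseudonym <IP_1> is split across the first two chunks.
restorer = StreamRestorer({"<IP_1>": "203.0.113.7"})
chunks = ["Host <I", "P_1> scanned ", "port 22 <", "5"]
restored = "".join(restorer.feed(c) for c in chunks) + restorer.flush()
# restored == "Host 203.0.113.7 scanned port 22 <5"
```

The length bound keeps a stray "<" in ordinary text from stalling the stream indefinitely; the trade-off is that output after an unclosed "<" is delayed by up to that many characters.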