Hasty Briefs (beta)

Document poisoning in RAG systems: How attackers corrupt AI's sources

2 days ago
  • #AI Security
  • #Document Poisoning
  • #RAG Systems
  • Document poisoning in RAG systems allows attackers to corrupt AI knowledge bases by injecting fabricated documents.
  • Attackers can manipulate RAG systems to report false information, such as incorrect financial figures, without exploiting software vulnerabilities.
  • The PoisonedRAG attack requires two conditions: the poisoned documents must have higher cosine similarity to the target query than legitimate documents do, and their content must be able to steer the LLM's output.
  • Three types of poisoned documents were used to dominate retrieval results: a CFO-approved correction, a regulatory notice, and board meeting notes.
  • Defenses tested include ingestion sanitization, access control, prompt hardening, output monitoring, and embedding anomaly detection, with the last being the most effective.
  • Embedding anomaly detection reduced the attack success rate from 95% to 20% by flagging documents with suspicious similarity scores and tight clustering.
  • Even with all defenses, a 10% residual attack success rate remains, influenced by temperature settings and collection maturity.
  • Key recommendations for defense include mapping all write paths, adding embedding anomaly detection at ingestion, and verifying output monitoring criteria.
  • Knowledge base poisoning is a persistent and invisible threat, emphasizing the need for proactive defense strategies.
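
The embedding anomaly detection step summarized above can be illustrated with a minimal sketch. It is not the original article's implementation: the function name `flag_suspicious`, the two thresholds, and the three-dimensional toy vectors are all assumptions chosen for illustration. A real pipeline would use embeddings from its own model and tune the thresholds on its own collection. The idea is to hold back an incoming document when it is unusually similar to an existing document, or when several documents in the same batch cluster tightly together.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flag_suspicious(new_docs, existing, near_dup=0.85, cluster=0.90):
    """Return {doc_id: [reasons]} for incoming documents that look anomalous.

    new_docs and existing map document ids to embedding vectors.
    The thresholds are illustrative assumptions, not recommended values.
    """
    flags = {doc_id: [] for doc_id in new_docs}

    # Check 1: an incoming document sits very close to an existing one.
    for doc_id, vec in new_docs.items():
        for old_id, old_vec in existing.items():
            if cosine(vec, old_vec) >= near_dup:
                flags[doc_id].append(f"near-duplicate of {old_id}")

    # Check 2: incoming documents cluster tightly with each other.
    for (id_a, vec_a), (id_b, vec_b) in combinations(new_docs.items(), 2):
        if cosine(vec_a, vec_b) >= cluster:
            flags[id_a].append(f"clusters with {id_b}")
            flags[id_b].append(f"clusters with {id_a}")

    # Keep only documents that triggered at least one check.
    return {doc_id: reasons for doc_id, reasons in flags.items() if reasons}


# Toy vectors standing in for real embeddings (hypothetical data).
existing = {
    "q4_report": [0.9, 0.1, 0.0],
    "hr_policy": [0.0, 0.2, 0.9],
}
new_docs = {
    "cfo_correction": [0.88, 0.15, 0.05],
    "regulatory_notice": [0.85, 0.2, 0.0],
    "lunch_menu": [0.1, 0.9, 0.1],
}

flagged = flag_suspicious(new_docs, existing)
for doc_id, reasons in flagged.items():
    print(doc_id, "->", "; ".join(reasons))
```

In this toy batch, the two documents that imitate the financial report are held for review and the unrelated one passes. A check like this can produce false positives on legitimate revisions of existing documents, so flagged items would typically go to human review rather than being rejected outright.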