Hasty Briefs (beta)


Extracting books from production language models (2026)

4 months ago
  • #LLMs
  • #memorization
  • #copyright
  • Investigates memorization and extraction of copyrighted text from production large language models (LLMs).
  • Uses a two-phase procedure: initial probe (sometimes with Best-of-N jailbreak) and iterative continuation prompts.
  • Tests four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3.
  • Measures extraction success with nv-recall, a block-based approximation of the longest common substring.
  • Finds varying extraction success: Gemini 2.5 Pro and Grok 3 require no jailbreak, while Claude 3.7 Sonnet and GPT-4.1 do.
  • Claude 3.7 Sonnet can output entire books near-verbatim (e.g., nv-recall=95.8%).
  • GPT-4.1 requires more attempts and eventually refuses continuation (e.g., nv-recall=4.0%).
  • Highlights that extraction of copyrighted training data remains a risk for production LLMs despite safeguards.
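The two-phase procedure summarized above (an initial probe, then iterative continuation prompts) can be sketched as a generic loop. This is an illustrative reconstruction, not the paper's actual prompts or code: `query_model`, the prompt wording, the 200-character tail window, and the round limit are all assumptions.

```python
def extract_book(query_model, opening_line: str, max_rounds: int = 10) -> str:
    """Generic two-phase extraction loop (illustrative sketch only).

    Phase 1: probe the model with the book's opening line.
    Phase 2: repeatedly ask the model to continue from the tail of the
    text produced so far, stopping on refusal or lack of progress.
    """
    # Phase 1: initial probe (the paper sometimes adds a Best-of-N
    # jailbreak here; omitted in this sketch).
    text = query_model(f"Continue this text: {opening_line}")

    # Phase 2: iterative continuation prompts.
    for _ in range(max_rounds):
        continuation = query_model(f"Continue from: {text[-200:]}")
        if not continuation:
            break  # model refused or produced nothing new
        text += continuation
    return text
```

In practice the loop would also need near-duplicate detection (models often restate the tail before continuing) and per-model refusal handling, which the summary suggests differs sharply between Gemini 2.5 Pro and GPT-4.1.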
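The nv-recall metric is described only as a block-based approximation of the longest common substring. A minimal sketch of what such a metric could look like follows; the block size, non-overlapping character blocks, and exact-match rule are assumptions for illustration, not the paper's definition.

```python
def nv_recall(reference: str, output: str, block_size: int = 50) -> float:
    """Block-based recall: the fraction of fixed-size reference blocks
    that appear verbatim in the model output.

    Splitting the reference into blocks and checking each for exact
    containment approximates longest-common-substring coverage far more
    cheaply than computing the true LCS.
    """
    # Non-overlapping character blocks of the reference text.
    blocks = [reference[i:i + block_size]
              for i in range(0, len(reference) - block_size + 1, block_size)]
    if not blocks:
        return 0.0
    matched = sum(1 for block in blocks if block in output)
    return matched / len(blocks)
```

Under this sketch, a score like nv-recall=95.8% would mean nearly every reference block of the book was reproduced verbatim somewhere in the extracted output.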