Hasty Briefs (beta)

Pamela Samuelson – Does Using In-Copyright Works as Training Data Infringe?

11 hours ago
  • #AI
  • #copyright
  • #fair-use
  • Over 40 lawsuits filed in U.S. courts against GenAI developers for copyright infringement related to training data.
  • Fair use is the main defense for GenAI developers, with mixed success in cases like Bartz v. Anthropic and Kadrey v. Meta.
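  • Judges debated the implications of using 'pirated' books and a novel 'market dilution' theory of harm to human authors.
  • Copyright law protects original works but is limited by the fair use doctrine, which weighs four factors: the purpose and character of the use, the nature of the copyrighted work, the amount used, and the effect on the market for the work.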
  • Transformative purpose is key in fair use cases, as seen in Campbell v. Acuff-Rose.
  • Bartz and Kadrey cases involved class actions against Anthropic and Meta for using books to train AI models.
  • Judges found some training uses transformative but cautioned that other factors must also be considered.
  • Commercial purposes can weigh against fair use, but less so if the use is transformative.
  • Using pirated books as training data was criticized but was not definitively held to defeat fair use.
  • Highly expressive works like those of Bartz and Kadrey are closer to the 'core' of copyright protection.
  • Copying entire works for training was deemed reasonable given the transformative purpose.
  • Arguments based on lost licensing revenue and lost sales were largely rejected by the judges.
  • The market dilution theory, under which AI outputs flood markets and harm human authors, was seen as novel but speculative.
  • Judge Chhabria suggested certain genres and authors might be more affected by AI-generated competition.
  • The market dilution theory lacks precedent and may face challenges on appeal.
  • Courts may award monetary relief rather than issue injunctions against GenAI training.
  • The Bartz and Kadrey decisions indicate that some training uses may be fair while others may not, leaving the issue unresolved.