Pamela Samuelson – Does Using In-Copyright Works as Training Data Infringe?

7 months ago

#AI
#copyright
#fair-use

Over 40 lawsuits have been filed in U.S. courts against GenAI developers for copyright infringement related to training data.
Fair use is the main defense for GenAI developers, with mixed success in cases like Bartz v. Anthropic and Kadrey v. Meta.
Judges debated the implications of using 'pirated' books and a novel 'market dilution' theory affecting human authors.
Copyright law protects original works but is limited by the fair use doctrine, which considers four factors.
Transformative purpose is key in fair use cases, as seen in Campbell v. Acuff-Rose.
The Bartz and Kadrey cases were class actions against Anthropic and Meta, respectively, over their use of books to train AI models.
Judges found some training uses transformative but cautioned that other factors must also be considered.
Commercial purposes can weigh against fair use, but less so if the use is transformative.
Using pirated books as training data was criticized, but the judges did not definitively hold that it defeats fair use.
Highly expressive works like those of Bartz and Kadrey are closer to the 'core' of copyright protection.
Copying entire works for training was deemed reasonable given the transformative purpose.
Arguments based on lost licensing revenue and lost sales were largely rejected by the judges.
Market dilution theory, suggesting AI outputs flood markets and harm human authors, was seen as novel but speculative.
Judge Chhabria suggested certain genres and authors might be more affected by AI-generated competition.
The market dilution theory lacks precedent and may face challenges on appeal.
Courts may award monetary relief rather than issue injunctions against GenAI training.
The Bartz and Kadrey decisions indicate that some training uses may be fair while others may not, leaving the issue unresolved.
