How Meta AI Staff Deemed More Than 7M Books to Have No "Economic Value"
a year ago
- #AI Ethics
- #Copyright Law
- #Meta Lawsuit
- Meta AI initially stated that using pirated books for AI training is a copyright violation, but its later responses varied, an inconsistency attributed to 'hallucinations' in generative AI.
- Meta is facing a lawsuit (Kadrey et al. v. Meta Platforms) for allegedly using over 7 million pirated books to train its AI model, Llama, without consent or payment.
- Plaintiffs, including prominent authors like Junot Díaz and Sarah Silverman, argue Meta's actions infringe on copyright, while Meta defends its use as 'highly transformative' fair use.
- The case is part of a broader legal battle involving over 16 copyright lawsuits against AI companies for using copyrighted material without permission.
- Internal Meta communications reveal debates over using pirated books, with some employees expressing ethical concerns while others adopted a 'don’t-ask-don’t-tell' approach.
- Meta argues that any individual book has a negligible impact on AI performance, likening a single work's contribution to noise in the data, and that licensing millions of works is impractical.
- Authors and publishers, including the Authors Guild, advocate for consent and compensation for AI training, fearing AI-generated content could replace human creativity.
- OpenAI and Google have also faced scrutiny for using pirated content, though OpenAI claims its current models do not rely on LibGen (Library Genesis, a shadow library of pirated books).
- The case raises questions about the commodification of literature and the ethical implications of AI training on copyrighted works without compensation.