Authors Sue OpenAI and Meta Over Alleged Copyright Infringement
Sarah Silverman and Co-Authors Take Legal Action Against Meta and OpenAI
Sarah Silverman, along with authors Richard Kadfrey and Christopher Golden, have initiated legal proceedings against Meta and OpenAI, accusing both tech giants of copyright infringement. The authors allege that their copyrighted books were used without their consent to train the large language models (LLMs) powering OpenAI's ChatGPT and Meta's LLaMa.
The lawsuits are distinct, with each targeting one of the companies. The crux of their argument is that their works were incorporated into the training datasets for these LLMs, which they claim amounts to unauthorized use of their material.
Understanding Large Language Models
An LLM is an advanced AI algorithm that learns from vast quantities of text data, including books and internet content. This training enables the model to understand language patterns, grammar, and context, ultimately allowing it to generate text that mimics human writing and engage in conversational interactions with users.
The lawsuits contend that these models essentially "remix" the copyrighted material of thousands of authors without their permission, compensation, or acknowledgment.
The Broader Context of AI and Copyright
The issue of copyright infringement has become a significant concern since the advent of ChatGPT, which sparked a surge in generative AI technologies. This has raised questions about the impact of AI on creativity and the copyright process.
The lawsuits assert that the LLMs were trained using materials obtained illegally, such as those found on "shadow library" websites. The OpenAI lawsuit specifically mentions the "OpenAI Books2 dataset," which is believed to include around 294,000 titles sourced from notorious sites like Library Genesis, Z-Library, Sci-Hub, and Bibliotik, accessible via torrent systems.
Similarly, the Meta lawsuit points to two sources for their training data: Project Gutenberg, an archive of books no longer under copyright, and the "Books3 section of ThePile" dataset on Hugging Face, which appears to encompass the entire Bibliotik collection.
Legal Representation and Related Cases
Sarah Silverman and her fellow plaintiffs are represented by attorneys Joseph Savery and Matthew Butterick. These same lawyers are also handling a separate lawsuit filed in June against OpenAI by authors Mona Awad and Paul Tremblay, also over alleged copyright infringement.

The ongoing legal battles highlight the tension between AI development and the rights of content creators, a topic that continues to evolve as AI technology advances.
Related article
OpenAI Retires o3 and GPT-4.5 Large Models
As a frontrunner in artificial intelligence, OpenAI's every technical move creates significant industry ripples. Recently, the company dropped a major announcement: it will retire two classic models—o3 and GPT-4.5—from its ChatGPT platform. The GPT-4
AIGCPanel 2.0.0 Major Update: Workflow Engine Opens New Era of Automated Digital Human Creation
AIGCPanel, a powerful tool for local digital human creation, has just launched version 2.0.0—billed as "the most significant update yet." This core overhaul addresses the fragmentation of AI creation tools by linking digital human synthesis, voice cl
BuzzFeed launches AI junk app subsidiary
Amid a significant business crisis, the former digital media giant BuzzFeed is launching an ambitious self-rescue experiment powered by artificial intelligence. At the recent SXSW conference, co-founder and CEO Jonah Peretti announced the creation of
Related Special Topic Recommendations
Comments (20)
0/500
Sarah Silverman mit einem Rechtsstreit gegen Tech-Giganten ist eine mutige, aber vielleicht aussichtslose Sache. KI-Trainingsdaten und Urheberrecht — das wird noch jahrelang juristische Diskussionen füttern. Irgendwie wirkt es wie David gegen Goliath, aber dieses Mal haben die Goliaths Zugriff auf nahezu alle digitalisierten Bücher der Welt. Vielleicht braucht es solche Klagen, um überhaupt einen gesetzlichen Rahmen zu schaffen. 🤔
이 뉴스 보니까 생각이 좀 복잡해지네요. AI가 학습하는 과정에서 저작권 문제가 계속 불거지는군요. 실버먼 작가분들 소송이 어떻게 될지 궁금하긴 한데, 결국 법원 판단에 달린 문제 같아요. AI 발전을 위해선 데이터 접근이 필요하지만 창작자의 권리도 분명히 보호돼야 하고... 🤔 앞으로 이런 논쟁이 더 많아질 텐데, 관련 법이 빨리 정립됐으면 좋겠어요.
GPTの学習データってやっぱり著作権が問題になるよねー。この訴訟、どう転ぶか気になります。AI進化には大量のデータが必須だけど、クリエイターの権利もきちんと守れる仕組みが必要だな。勝敗が今後の生成AIビジネスに大きな影響を与えそう。
This lawsuit is wild! 😮 I mean, authors like Sarah Silverman going after Meta and OpenAI for copyright? That’s a bold move. Makes me wonder if AI’s just gobbling up books like a kid with candy. Hope this sparks a bigger chat about ethics in tech!
This lawsuit is wild! 😮 Silverman and others taking on Meta and OpenAI for copyright issues? I’m curious how this plays out—could set a big precedent for AI and creativity!
Sarah Silverman and Co-Authors Take Legal Action Against Meta and OpenAI
Sarah Silverman, along with authors Richard Kadfrey and Christopher Golden, have initiated legal proceedings against Meta and OpenAI, accusing both tech giants of copyright infringement. The authors allege that their copyrighted books were used without their consent to train the large language models (LLMs) powering OpenAI's ChatGPT and Meta's LLaMa.
The lawsuits are distinct, with each targeting one of the companies. The crux of their argument is that their works were incorporated into the training datasets for these LLMs, which they claim amounts to unauthorized use of their material.
Understanding Large Language Models
An LLM is an advanced AI algorithm that learns from vast quantities of text data, including books and internet content. This training enables the model to understand language patterns, grammar, and context, ultimately allowing it to generate text that mimics human writing and engage in conversational interactions with users.
The lawsuits contend that these models essentially "remix" the copyrighted material of thousands of authors without their permission, compensation, or acknowledgment.
The Broader Context of AI and Copyright
The issue of copyright infringement has become a significant concern since the advent of ChatGPT, which sparked a surge in generative AI technologies. This has raised questions about the impact of AI on creativity and the copyright process.
The lawsuits assert that the LLMs were trained using materials obtained illegally, such as those found on "shadow library" websites. The OpenAI lawsuit specifically mentions the "OpenAI Books2 dataset," which is believed to include around 294,000 titles sourced from notorious sites like Library Genesis, Z-Library, Sci-Hub, and Bibliotik, accessible via torrent systems.
Similarly, the Meta lawsuit points to two sources for their training data: Project Gutenberg, an archive of books no longer under copyright, and the "Books3 section of ThePile" dataset on Hugging Face, which appears to encompass the entire Bibliotik collection.
Legal Representation and Related Cases
Sarah Silverman and her fellow plaintiffs are represented by attorneys Joseph Savery and Matthew Butterick. These same lawyers are also handling a separate lawsuit filed in June against OpenAI by authors Mona Awad and Paul Tremblay, also over alleged copyright infringement.
The ongoing legal battles highlight the tension between AI development and the rights of content creators, a topic that continues to evolve as AI technology advances.
OpenAI Retires o3 and GPT-4.5 Large Models
As a frontrunner in artificial intelligence, OpenAI's every technical move creates significant industry ripples. Recently, the company dropped a major announcement: it will retire two classic models—o3 and GPT-4.5—from its ChatGPT platform. The GPT-4
AIGCPanel 2.0.0 Major Update: Workflow Engine Opens New Era of Automated Digital Human Creation
AIGCPanel, a powerful tool for local digital human creation, has just launched version 2.0.0—billed as "the most significant update yet." This core overhaul addresses the fragmentation of AI creation tools by linking digital human synthesis, voice cl
BuzzFeed launches AI junk app subsidiary
Amid a significant business crisis, the former digital media giant BuzzFeed is launching an ambitious self-rescue experiment powered by artificial intelligence. At the recent SXSW conference, co-founder and CEO Jonah Peretti announced the creation of
Sarah Silverman mit einem Rechtsstreit gegen Tech-Giganten ist eine mutige, aber vielleicht aussichtslose Sache. KI-Trainingsdaten und Urheberrecht — das wird noch jahrelang juristische Diskussionen füttern. Irgendwie wirkt es wie David gegen Goliath, aber dieses Mal haben die Goliaths Zugriff auf nahezu alle digitalisierten Bücher der Welt. Vielleicht braucht es solche Klagen, um überhaupt einen gesetzlichen Rahmen zu schaffen. 🤔
이 뉴스 보니까 생각이 좀 복잡해지네요. AI가 학습하는 과정에서 저작권 문제가 계속 불거지는군요. 실버먼 작가분들 소송이 어떻게 될지 궁금하긴 한데, 결국 법원 판단에 달린 문제 같아요. AI 발전을 위해선 데이터 접근이 필요하지만 창작자의 권리도 분명히 보호돼야 하고... 🤔 앞으로 이런 논쟁이 더 많아질 텐데, 관련 법이 빨리 정립됐으면 좋겠어요.
GPTの学習データってやっぱり著作権が問題になるよねー。この訴訟、どう転ぶか気になります。AI進化には大量のデータが必須だけど、クリエイターの権利もきちんと守れる仕組みが必要だな。勝敗が今後の生成AIビジネスに大きな影響を与えそう。
This lawsuit is wild! 😮 I mean, authors like Sarah Silverman going after Meta and OpenAI for copyright? That’s a bold move. Makes me wonder if AI’s just gobbling up books like a kid with candy. Hope this sparks a bigger chat about ethics in tech!
This lawsuit is wild! 😮 Silverman and others taking on Meta and OpenAI for copyright issues? I’m curious how this plays out—could set a big precedent for AI and creativity!





Home






