Reddit Files Lawsuit Against Anthropic Over User Data Scraping for AI Training
Reddit is taking legal action against Anthropic, alleging the AI company harvested user content from its platform without permission to train its Claude AI models. The complaint, filed in a California state court, states that Anthropic made over 100,000 unauthorised requests to Reddit’s servers, even after publicly claiming to have stopped.
The lawsuit alleges that Anthropic disregarded both technical safeguards and Reddit’s terms of service. According to Reddit, Anthropic bypassed protections such as the robots.txt file, which tells automated crawlers which parts of a site they may not access. The platform also claims Anthropic violated user privacy by collecting and using personal posts, including deleted material, for commercial gain.
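For readers unfamiliar with robots.txt, the short Python sketch below shows how a well-behaved crawler might consult such a file before requesting a page. It uses the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` and `OtherBot` agent names, the `example.com` URLs, and the `is_allowed` helper are all hypothetical and invented for illustration; none of this reflects Reddit's actual file or any crawler named in the complaint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only; not Reddit's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ExampleBot is disallowed everywhere by the first rule group.
    print(is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", "https://example.com/r/python/"))
    # Other crawlers fall back to the wildcard group, which only blocks /private/.
    print(is_allowed(EXAMPLE_ROBOTS_TXT, "OtherBot", "https://example.com/r/python/"))
```

The check is voluntary: robots.txt states a site's wishes, but nothing in the file itself stops a crawler that skips this step. That is why the dispute centres on whether the stated rules were honoured.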
Reddit notes that it provides structured data access through official licensing partnerships with companies like OpenAI and Google. These agreements include clauses governing content use, privacy protection, and data removal. Reddit asserts that Anthropic declined to enter into a formal licensing agreement, choosing instead to scrape content directly to avoid fees and bypass user protections.
The filing cites a 2021 research paper co-authored by Anthropic CEO Dario Amodei, which identified Reddit as a valuable source of training data for language models. Reddit also presented instances where Claude appeared to reproduce Reddit posts nearly verbatim—including content users had deleted. This, according to Reddit, demonstrates Anthropic’s failure to implement safeguards that respect user privacy or content removal requests.
Reddit is seeking monetary damages and a court order prohibiting Anthropic from using Reddit content in future iterations of its models.
Anthropic has responded, stating it disputes the allegations and intends to defend itself. However, this is not the first legal challenge the company has faced regarding its training data collection practices.
In August 2024, a group of authors filed a class-action lawsuit against Anthropic, claiming it used their copyrighted works without consent. They alleged the company trained its models on books and other writings without permission, and they sought compensation for that use.
A similar lawsuit from October 2023 involved Universal Music Group and other publishers, who sued Anthropic for allegedly reproducing copyrighted song lyrics through its Claude chatbot. The music companies argued this infringed their intellectual property rights and sought an injunction against further lyric use.
Unlike those cases, Reddit’s suit does not revolve around copyright. Instead, it focuses on breach of contract and unfair competition. Reddit contends that data from its site is not merely public—it is subject to terms that Anthropic knowingly ignored. This distinction could set an important precedent for platforms that host user-generated content but seek to regulate its use in commercial AI systems.
Reddit further accuses Anthropic of misleading the public. The lawsuit points to Anthropic’s public statements claiming it follows scraping rules and respects user privacy—claims Reddit says are contradicted by the company’s actual conduct.
“Despite what its marketing material says, Anthropic does not care about Reddit’s rules or users,” the complaint states. “It believes it is entitled to take whatever content it wants and use it however it desires, with impunity.”
After the lawsuit was announced, Reddit’s stock rose by nearly 6.7%, suggesting investor support for the legal challenge. The outcome could establish a legal benchmark for balancing open internet content with the rights of users and content owners.
As more AI companies depend on vast amounts of online data, the legal and ethical implications of web scraping are becoming increasingly urgent. Reddit’s case adds to a growing number of lawsuits that could shape the next phase of AI development.