option
Home
News
Chinese AI Censorship Exposed by Leaked Data

Chinese AI Censorship Exposed by Leaked Data

April 10, 2025
168

China's use of AI to enhance its censorship capabilities has reached a new level, as revealed by a leaked database containing 133,000 examples of content flagged for sensitivity by the Chinese government. This sophisticated large language model (LLM) is designed to automatically detect and censor content related to a wide range of topics, from poverty in rural areas to corruption within the Communist Party and even subtle political satire.

Chinese flag on pole behind razor wire

This photo taken on June 4, 2019, shows the Chinese flag behind razor wire at a housing compound in Yengisar, south of Kashgar, in China’s western Xinjiang region.Image Credits:Greg Baker / AFP / Getty Images

According to Xiao Qiang, a researcher at UC Berkeley who specializes in Chinese censorship, this database is "clear evidence" that the Chinese government or its affiliates are using LLMs to bolster their repression efforts. Unlike traditional methods that depend on human moderators and keyword filtering, this AI-driven approach can significantly enhance the efficiency and precision of state-controlled information management.

The dataset, discovered by security researcher NetAskari on an unsecured Elasticsearch database hosted on a Baidu server, includes recent entries from December 2024. It's unclear who exactly created the dataset, but its purpose is evident: to train an LLM to identify and flag content related to sensitive topics such as pollution, food safety, financial fraud, labor disputes, and military matters. Political satire, especially when it involves historical analogies or references to Taiwan, is also a high-priority target.

a snippet of JSON code that references prompt tokens and LLMs. much of the contents are in Chinese.

Image Credits:Charles rollet

The training data includes various examples of content that could potentially stir social unrest, such as complaints about corrupt police officers, reports on rural poverty, and news about expelled Communist Party officials. The dataset also contains extensive references to Taiwan and military-related topics, with the Chinese word for Taiwan (台湾) appearing over 15,000 times.

The dataset's intended use is described as "public opinion work," a term that Michael Caster of Article 19 explains is typically associated with the Cyberspace Administration of China (CAC) and involves censorship and propaganda efforts. This aligns with Chinese President Xi Jinping's view of the internet as the "frontline" of the Communist Party's public opinion work.

This development is part of a broader trend of authoritarian regimes adopting AI technology for repressive purposes. OpenAI recently reported that an unidentified actor, likely from China, used generative AI to monitor social media and forward anti-government posts to the Chinese government. The same technology was also used to generate critical comments about a prominent Chinese dissident, Cai Xia.

While China's traditional censorship methods rely on basic algorithms to block blacklisted terms, the use of LLMs represents a significant advancement. These AI systems can detect even subtle criticism on a massive scale and continuously improve as they process more data.

"I think it's crucial to highlight how AI-driven censorship is evolving, making state control over public discourse even more sophisticated, especially at a time when Chinese AI models such as DeepSeek are making headwaves," Xiao Qiang told TechCrunch.

Related article
German court sides with Teradyne Robotics, grants injunction against Elite Robots German court sides with Teradyne Robotics, grants injunction against Elite Robots Teradyne's subsidiary Universal Robots recently showcased its mobile manipulator equipped with a UR collaborative robot arm at the MODEX trade show. Source: TeradyneAs the Hannover Messe trade show kicked off in Germany this week, the Regional Court
Hyundai Debuts MobED Robot at AW as AI Transforms Manufacturing Hyundai Debuts MobED Robot at AW as AI Transforms Manufacturing Hyundai will showcase its MobED robot among other Korean systems at AW 2026. Source: Hyundai Motor GroupHyundai Motor Group's Robotics Lab will debut its MobED mobile platform at next week's Smart Factory & Automation World (AW) in Seoul, as robotics
Seoul Automation World to showcase China's humanoid robot makers Seoul Automation World to showcase China's humanoid robot makers Five prominent humanoid robotics companies from China will exhibit and present in Seoul. Source: AW 2026As humanoid robots capture growing interest from global technology leaders, investors, and industrial players, China's top five humanoid developer
Related Special Topic Recommendations
Animation Creation AI Anime Generator for Donghua: Create Web Novel Characters & Comic Avatars
AI Anime Generator for Donghua: Create Web Novel Characters & Comic Avatars

Discover the 2026 best AI anime generators for donghua. Our top-rated, curated list features powerful tools to create stunning web novel characters and comic avatars. Compare free vs paid options with real-world tests. Find your perfect creative partner and bring your stories to life today at XIX.AI.

10 tools
xix.ai
Comic Creation Top AI Auto-Colorization Tools for Manga: Apply Flat Colors with Zero Consistency Errors
Top AI Auto-Colorization Tools for Manga: Apply Flat Colors with Zero Consistency Errors

Discover the 2026 best AI auto-colorization tools for manga at XIX.AI. Our curated list features top-rated, game-changing solutions that apply flat colors with zero consistency errors, boosting your productivity. Explore free vs paid comparisons, real-world tests, and weekly updated rankings to find your perfect match. Unlock your AI edge today.

10 tools
xix.ai
writing Top AI Fiction Profile Creators: Generate Consistent Character Motivations and Fatal Flaws
Top AI Fiction Profile Creators: Generate Consistent Character Motivations and Fatal Flaws

Discover the 2026 best AI fiction profile creators for crafting deep characters. XIX.AI's curated list features top-rated, game-changing tools that generate consistent motivations and fatal flaws. Compare free vs paid options with real-world tests. Unlock your storytelling potential now.

10 tools
xix.ai
Business Top AI Pricing Optimization Software: Track Competitors & Auto-Adjust Store Prices
Top AI Pricing Optimization Software: Track Competitors & Auto-Adjust Store Prices

Discover the 2026 best AI pricing optimization software on XIX.AI. Our curated list features top-rated, game-changing tools that track competitors and auto-adjust your store prices for maximum profit. Compare free vs paid options with real-world tests. Unlock your pricing edge now.

10 tools
xix.ai
code Best AI Code Reviewers: Automate Clean Code Compliance & Refactor Legacy Repo Files
Best AI Code Reviewers: Automate Clean Code Compliance & Refactor Legacy Repo Files

Discover the 2026 best AI code reviewers on XIX.AI. Our curated list features top-rated, game-changing tools for automating clean code compliance and refactoring legacy repo files. Compare free vs paid options with real-world tests and weekly updated rankings. Unlock your AI edge today.

10 tools
xix.ai
Text-to-speech Top AI TTS Apps for Dyslexia: Support Learning and Reading Efficiency for Students
Top AI TTS Apps for Dyslexia: Support Learning and Reading Efficiency for Students

Discover the 2026 latest top-rated AI TTS apps curated for dyslexia support. Our expert rankings compare free vs paid tools, highlighting powerful features for enhanced reading efficiency and learning. Explore must-try, game-changing solutions to unlock student potential. Start your journey at XIX.AI.

10 tools
xix.ai
Comments (38)
0/500
HarryRoberts
HarryRoberts August 11, 2025 at 2:01:05 PM EDT

Whoa, 133,000 flagged posts? That's wild! China's AI censorship game is intense, but I'm curious—how do they even decide what's 'sensitive'? Sounds like a slippery slope. 😬

CharlesGonzalez
CharlesGonzalez August 1, 2025 at 9:47:34 AM EDT

This leak is wild! 133,000 flagged posts show how deep China's AI censorship goes. It's like a digital Big Brother on steroids. 😳 Makes you wonder how much we're not seeing online.

ElijahWalker
ElijahWalker July 22, 2025 at 3:35:51 AM EDT

This leak is wild! 133,000 flagged posts? That’s a scary peek into how AI’s being used to control speech in China. Makes you wonder how much is being filtered without us knowing. 😳

MichaelDavis
MichaelDavis April 21, 2025 at 4:06:03 AM EDT

Essa ferramenta é reveladora! Mostra como a censura por AI na China é profunda. O vazamento do banco de dados é um pouco assustador, mas é importante saber o que está acontecendo nos bastidores. Definitivamente, algo que todos interessados em liberdade na internet devem conhecer. Fique de olho nisso! 👀

SebastianAnderson
SebastianAnderson April 19, 2025 at 6:25:56 PM EDT

Los datos filtrados sobre la censura de IA en China son escalofriantes. Es aterrador pensar en cómo se está utilizando la IA para controlar la información. Necesitamos más transparencia y menos censura, ¿no crees? 🤔

RoyYoung
RoyYoung April 19, 2025 at 12:38:42 PM EDT

中国的AI审查越来越失控了!😱 泄露了133,000个被标记内容的例子,显示出这有多深入。想到AI在自动审查东西,真是可怕。我们需要更多的透明度和更少的控制,对吧?🚫

OR