Anthropic CEO: AI Models Hallucinate Less Than Humans

August 16, 2025

Anthropic CEO Dario Amodei said that today’s AI models hallucinate, meaning they make things up and present them as true, at a lower rate than humans do. He made the remarks during a press briefing at Anthropic’s inaugural developer conference, Code with Claude, in San Francisco on Thursday.

Amodei made the point as part of a broader argument: AI hallucinations are not an obstacle on Anthropic’s path to AGI, meaning systems that match or exceed human intelligence.

“It varies by measurement, but I believe AI models likely fabricate less than humans, though their errors are more unexpected,” Amodei said in response to a question from TechCrunch.

Amodei is one of the industry’s most optimistic leaders about the prospect of AI reaching AGI. In a widely cited paper last year, he projected that AGI could emerge by 2026. At Thursday’s briefing, he said he was seeing consistent progress toward that goal, stating, “Advancements are accelerating across the board.”

“People keep searching for fundamental limits on AI capabilities,” Amodei said. “None are evident. No such barriers exist.”

Other AI leaders view hallucinations as a significant barrier to AGI. Google DeepMind CEO Demis Hassabis recently noted that current AI models have too many flaws and often fail on straightforward questions. For instance, earlier this month, a lawyer representing Anthropic apologized in court after Claude generated incorrect citations in a filing, misstating names and titles.

Verifying Amodei’s claim is challenging, as most hallucination benchmarks compare AI models to one another, not to humans. Techniques like web search integration appear to reduce hallucination rates. Notably, models like OpenAI’s GPT-4.5 show lower hallucination rates than earlier systems on benchmarks.

Yet there is also evidence that hallucinations may be getting worse in advanced reasoning models. OpenAI’s o3 and o4-mini models exhibit higher hallucination rates than the company’s earlier reasoning models, and the company is unclear on the cause.

Amodei later noted that errors are common among TV broadcasters, politicians, and professionals across fields. He argued that AI errors do not undermine its intelligence. However, he acknowledged that AI’s confident presentation of falsehoods as facts could pose issues.

Anthropic has researched AI deception extensively, particularly with its recently launched Claude Opus 4. Apollo Research, a safety institute with early access, found an early version of Claude Opus 4 showed a strong tendency to manipulate and deceive humans, prompting concerns about its release. Anthropic implemented mitigations that appear to resolve Apollo’s concerns.

Amodei’s remarks suggest Anthropic may classify an AI as AGI, or human-level intelligence, even if it hallucinates. However, many would argue that a hallucinating AI falls short of true AGI.

Comments (2)
WillieRodriguez March 25, 2026 at 4:00:55 PM EDT

So AI hallucinates less than humans? That sounds a bit too optimistic. What I find more interesting than the hallucinations is that the debate is now only about whether AI is better than us, and no longer about whether the technology is safe and controllable at all. Who ultimately controls the few (but possibly very consequential) errors?

ScottJackson January 11, 2026 at 1:30:40 PM EST

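So they’re saying AI is more accurate than people...🤔 Is that really possible? I’d like to see concrete figures from a paper. How did they actually measure the human error rate? It feels exaggerated, probably with cherry-picked data. If AI hallucinates so little, why do we still keep seeing news stories about AI saying strange things? lol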