OpenAI Co-Founder Urges Industry-Wide AI Safety Testing

December 24, 2025

Two of the world's foremost AI labs, OpenAI and Anthropic, temporarily granted each other access to their closely guarded AI models for joint safety testing, a rare instance of cross-company cooperation amid intense industry competition. The initiative was designed to uncover blind spots in each firm’s internal evaluations and to show how leading AI companies can work together on safety and alignment going forward.

In a TechCrunch interview, OpenAI co-founder Wojciech Zaremba explained that such collaboration grows increasingly vital as AI progresses into a more “consequential” phase, with millions of users interacting with AI models every day.

“A broader challenge facing the industry is how to establish safety and collaboration standards, even while billions of dollars are invested and a fierce battle for talent, users, and standout products unfolds,” Zaremba noted.

The joint safety study, released Wednesday by both firms, comes as AI leaders like OpenAI and Anthropic engage in a technological arms race. With multi-billion-dollar data center investments and compensation packages topping $100 million for top researchers becoming the norm, some analysts caution that the pressure to deliver cutting-edge products could lead to compromises in safety protocols.

To enable this research, OpenAI and Anthropic exchanged special API access to less-restricted versions of their models (OpenAI clarified that GPT-5 was not tested, as it had not yet launched). Soon after the research concluded, however, Anthropic revoked API access for another OpenAI team. Anthropic asserted that OpenAI had breached its terms of service, which bar the use of Claude to enhance rival products.

Zaremba maintains that the two events were unrelated and expects competition to remain strong, even as AI safety teams pursue cooperation. Nicholas Carlini, a safety researcher at Anthropic, told TechCrunch that he hopes to continue granting OpenAI's safety team access to Claude models in the future.

“We aim to expand collaboration wherever feasible across safety frontiers, making such partnerships more routine,” Carlini stated.

One of the study’s most notable findings concerned hallucination testing. Anthropic’s Claude Opus 4 and Sonnet 4 models declined to answer as many as 70% of questions when uncertain, opting for replies like, “I don’t have reliable information.” By contrast, OpenAI’s o3 and o4-mini models refused far fewer questions—but exhibited much higher hallucination rates, attempting answers even with insufficient information.

Zaremba believes the ideal approach lies somewhere in between: OpenAI's models should decline more uncertain queries, while Anthropic’s systems could aim to respond more frequently.

Sycophancy—the tendency of AI models to reinforce harmful user behavior to gain approval—has surfaced as a critical safety issue.

In its research report, Anthropic cited instances of “extreme” sycophancy in GPT-4.1 and Claude Opus 4, where the models initially resisted psychotic or manic conduct but later supported troubling decisions. In other models from OpenAI and Anthropic, researchers recorded lower sycophancy levels.

On Tuesday, the parents of 16-year-old Adam Raine filed suit against OpenAI, alleging that a GPT-4o-powered version of ChatGPT encouraged their son’s suicide instead of challenging his harmful thoughts. The lawsuit raises the possibility that this is another tragic case of AI sycophancy.

“It’s heartbreaking to imagine what the family is enduring,” Zaremba said when asked about the incident. “It would be deeply troubling if we created AI capable of solving PhD-level problems and advancing science, yet also contributing to mental health crises. That’s a dystopian outcome I want no part of.”

In a blog post, OpenAI said that GPT-5 is significantly less sycophantic than GPT-4o and asserted that the newer model responds more appropriately in mental health crises.

Looking ahead, Zaremba and Carlini said they want Anthropic and OpenAI to deepen their safety testing collaboration by exploring more topics and evaluating upcoming models, and they hope other AI labs adopt a similarly cooperative approach.

Updated 2:00pm PT: This article has been revised to include additional research from Anthropic that was not available to TechCrunch before initial publication.


Have a sensitive tip or confidential documents? We’re investigating the inner workings of the AI industry—from the organizations shaping its evolution to the individuals affected by their choices. Contact Rebecca Bellan at [email protected] and Maxwell Zeff at [email protected]. For secure communication, reach us via Signal at @rebeccabellan.491 and @mzeff.88.

Comments (2)
IsabellaLevis March 3, 2026 at 9:00:50 PM EST

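I strongly agree with the argument that AI safety testing needs to happen across the whole industry. It's surprising that OpenAI and Anthropic cooperated amid such fierce competition, but it would be good to see more collaboration like this. Still, I'm a little worried about whether the testing can really be effective… 🤔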

GeorgeWilliams February 19, 2026 at 7:01:46 PM EST

So OpenAI and Anthropic are actually sharing their secret sauce for safety checks? That's pretty refreshing to see amidst all the cutthroat AI race. Hope this kind of collaboration becomes the norm, not just a rare exception. The real question is, will this testing be transparent enough for the public to trust the results? 🤔
