Research Chiefs Call on Tech Sector to Track AI Reasoning Processes

AI researchers from OpenAI, Google DeepMind, Anthropic, and a broad coalition of companies and nonprofit organizations are advocating for deeper exploration into monitoring the so-called thought processes of AI reasoning models, according to a position paper published on Tuesday.
A defining characteristic of AI reasoning models, such as OpenAI’s o3 and DeepSeek’s R1, is their use of chains-of-thought, or CoTs—an externalized process where AI models systematically work through problems, much like humans using scratch paper to solve a complex math equation. Reasoning models are fundamental to powering AI agents, and the paper's authors contend that monitoring CoTs could become a vital method for keeping increasingly capable and widespread AI agents under control.
"CoT monitoring offers a valuable enhancement to safety protocols for cutting-edge AI, providing a unique window into how AI agents reach their decisions," the researchers stated in the position paper. "However, there is no certainty that this level of visibility will continue. We urge the research community and frontier AI developers to maximize the benefits of CoT monitorability and investigate ways to preserve it."
The position paper urges leading AI developers to investigate what makes CoTs "monitorable"—specifically, which factors enhance or diminish transparency into how AI models truly generate their answers. The authors note that while CoT monitoring is a promising approach for understanding AI reasoning models, it remains fragile, and they caution against any changes that might reduce its transparency or reliability.
Additionally, the authors call on AI developers to consistently track CoT monitorability and explore how this method could eventually be implemented as a safety measure.
Prominent signatories of the paper include OpenAI's chief research officer Mark Chen, Safe Superintelligence CEO Ilya Sutskever, Nobel laureate Geoffrey Hinton, Google DeepMind co-founder Shane Legg, xAI safety adviser Dan Hendrycks, and Thinking Machines co-founder John Schulman. Lead authors include representatives from the UK AI Security Institute and Apollo Research, with additional signatories from METR, Amazon, Meta, and UC Berkeley.
This paper represents a unified effort by many of the AI industry's top leaders to accelerate research in AI safety. It comes at a time of intense competition among tech companies—competition that has prompted Meta to recruit top researchers from OpenAI, Google DeepMind, and Anthropic with multimillion-dollar offers. Among the most sought-after researchers are those specializing in AI agents and reasoning models.
"We're at a pivotal moment where we have this new chain-of-thought capability. It appears highly useful, but it could disappear in a few years if it doesn't receive focused attention," said Bowen Baker, an OpenAI researcher involved in the paper, in an interview with TechCrunch. "Releasing a position paper like this is, to me, a way to drive more research and attention to this topic before it's too late."
OpenAI released a preview of its first AI reasoning model, o1, in September 2024. In the months that followed, the tech industry rapidly introduced competing models with similar capabilities, and some models from Google DeepMind, xAI, and Anthropic have posted even stronger benchmark results.
Nevertheless, there is still limited understanding of how AI reasoning models operate. While AI labs have made significant strides in improving AI performance over the past year, this has not necessarily led to a clearer understanding of their decision-making processes.
Anthropic has been a pioneer in interpretability, the field that studies how AI models function. Earlier this year, CEO Dario Amodei pledged to open up the "black box" of AI models by 2027 and to increase investment in interpretability. He also encouraged OpenAI and Google DeepMind to research the area further.
Early research from Anthropic suggests that CoTs may not be entirely reliable indicators of how these models generate answers. At the same time, OpenAI researchers have indicated that CoT monitoring could eventually serve as a dependable method for tracking alignment and safety in AI models.
Position papers like this one aim to raise awareness and attract more attention to emerging research areas, such as CoT monitoring. Companies like OpenAI, Google DeepMind, and Anthropic are already conducting research in this space, but this publication may help stimulate additional funding and investigation.












