OpenAI’s latest AI models have a new safeguard to prevent biorisks
OpenAI's New Safety Measures for AI Models o3 and o4-mini
OpenAI has introduced a new monitoring system for its advanced AI models, o3 and o4-mini, specifically designed to detect and prevent responses to prompts related to biological and chemical threats. This "safety-focused reasoning monitor" is a response to the enhanced capabilities of these models, which, according to OpenAI, represent a significant step up from their predecessors and could be misused by malicious actors.
The company's internal benchmarks indicate that o3, in particular, is more proficient at answering questions about creating certain biological threats. To address this and other potential risks, OpenAI built the new system to run alongside o3 and o4-mini; it is trained to identify prompts related to biological and chemical risks and to reject requests that could lead to harmful advice.
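Conceptually, this kind of setup amounts to a screening step that runs on each prompt before the main model's answer is returned. The sketch below is only a minimal illustration of that pattern: the keyword screen, refusal message, and function names are placeholder assumptions, not OpenAI's actual monitor, which is a trained reasoning model rather than a simple filter.

```python
# Minimal sketch of a monitor-style gating layer. This is NOT OpenAI's
# implementation: the keyword screen is a trivial placeholder standing in
# for a trained safety classifier, and all names here are illustrative.

REFUSAL = "I can't help with requests related to biological or chemical threats."

# Placeholder terms standing in for a learned risk classifier (assumption).
FLAGGED_TERMS = ("pathogen synthesis", "nerve agent", "weaponize")

def looks_risky(prompt: str) -> bool:
    """Stand-in for the safety monitor's judgment on a single prompt."""
    lowered = prompt.lower()
    return any(term in lowered for term in FLAGGED_TERMS)

def answer(prompt: str, generate) -> str:
    """Refuse flagged prompts; otherwise defer to the underlying model."""
    if looks_risky(prompt):
        return REFUSAL
    return generate(prompt)  # `generate` is any callable returning model text

# Example with a dummy model standing in for o3/o4-mini:
print(answer("Explain how vaccines work.", lambda p: "Vaccines train the immune system..."))
```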
Testing and Results
To gauge the effectiveness of this safety monitor, OpenAI conducted extensive testing. Red teamers spent approximately 1,000 hours identifying "unsafe" biorisk-related conversations generated by o3 and o4-mini. In a simulation of the monitor's "blocking logic," the models successfully declined to respond to risky prompts 98.7% of the time.
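For reference, that headline number is simply the fraction of simulated risky conversations that were declined. With hypothetical counts chosen only to illustrate the arithmetic (these are not OpenAI's red-team data), the calculation looks like this:

```python
# Toy refusal-rate calculation; the counts are invented for illustration.
declined, answered = 987, 13   # hypothetical outcomes across simulated prompts
block_rate = declined / (declined + answered)
print(f"Refusal rate: {block_rate:.1%}")  # -> Refusal rate: 98.7%
```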
However, OpenAI acknowledges that this test did not account for users who try different prompts after being blocked, which is why the company plans to keep human monitoring as part of its safety strategy.
Risk Assessment and Ongoing Monitoring
Despite their advanced capabilities, o3 and o4-mini do not exceed OpenAI's "high risk" threshold for biorisks. Even so, early versions of these models were more adept at answering questions about developing biological weapons than o1 and GPT-4. OpenAI says it is actively tracking how its models might make it easier to develop chemical and biological threats, as outlined in its updated Preparedness Framework.

Chart from o3 and o4-mini’s system card (Screenshot: OpenAI)
OpenAI is increasingly turning to automated systems to manage the risks posed by its models. For instance, a similar reasoning monitor is used to prevent GPT-4o's image generator from producing child sexual abuse material (CSAM).
Concerns and Criticisms
Despite these efforts, some researchers argue that OpenAI is not prioritizing safety as much as it should. Metr, one of OpenAI's red-teaming partners, said it had relatively little time to test o3 for deceptive behavior. OpenAI also chose not to release a safety report for its recently launched GPT-4.1 model, raising further questions about the company's commitment to transparency and safety.
Related articles
OpenAI Enhances AI Model Behind Its Operator Agent
OpenAI’s o3 AI model scores lower on a benchmark than the company initially implied
DeepSeek AI Challenges ChatGPT and Shapes the Future of AI
Comments (5)
JamesWilliams
April 24, 2025 at 12:00:00 AM GMT
OpenAI's new safety feature is a game-changer! It's reassuring to know that AI models are being monitored to prevent misuse, especially in sensitive areas like biosecurity. But sometimes it feels a bit too cautious, blocking harmless queries. Still, better safe than sorry, right? Keep up the good work, OpenAI! 😊
StephenGreen
April 24, 2025 at 12:00:00 AM GMT
OpenAI's new safety feature is wonderful! It's reassuring that there's a monitoring system to prevent biological risks. It does bother me a little that even harmless questions sometimes get blocked, though. But safety comes first. Keep up the good work, OpenAI! 😊
LarryMartin
April 19, 2025 at 12:00:00 AM GMT
OpenAI's new safety feature is really impressive! It's reassuring to know there's a monitoring system to prevent biological risks. It's a bit of a shame that even harmless questions sometimes get blocked, though. Still, safety comes first. Keep doing good work, OpenAI! 😊
CharlesMartinez
April 21, 2025 at 12:00:00 AM GMT
OpenAI's new safety feature is amazing! It's comforting to know that AI models are being monitored to prevent misuse, especially in sensitive areas like biosecurity. But it sometimes seems a bit overly cautious, blocking harmless queries. Still, better safe than sorry, right? Keep up the good work, OpenAI! 😊
CharlesJohnson
April 21, 2025 at 12:00:00 AM GMT
OpenAI's new safety feature is a game-changer! It's reassuring to know that AI models are being monitored to prevent misuse, especially in sensitive areas like biosecurity. But it sometimes seems a bit too cautious, blocking harmless queries. Still, better safe than sorry, right? Keep up the good work, OpenAI! 😊