OpenAI Unveils GPT-5.4 Pro and Thinking Editions

On Thursday, OpenAI introduced GPT-5.4, a new foundation model described as "our most capable and efficient frontier model for professional work." Alongside the standard version, GPT-5.4 is offered as a reasoning-focused variant (GPT-5.4 Thinking) and a performance-optimized edition (GPT-5.4 Pro).
The API version of the model will support context windows of up to 1 million tokens, marking the largest context capacity OpenAI has ever offered.
OpenAI also highlighted enhanced token efficiency, noting that GPT-5.4 can solve identical problems using significantly fewer tokens than its predecessor.
The new model delivers substantially improved benchmark results, achieving record scores on the computer-use benchmarks OSWorld-Verified and WebArena Verified. It also set a new record with an 83% score on OpenAI's GDPval test for knowledge-work tasks.
According to a statement from Mercor CEO Brendan Foody, GPT-5.4 leads on Mercor's APEX-Agents benchmark, which evaluates professional skills in law and finance.
"[GPT-5.4] excels at producing long-horizon deliverables like slide decks, financial models, and legal analysis," Foody stated, "delivering top-tier performance while operating faster and at a lower cost than competing frontier models."
GPT-5.4 continues OpenAI's work to reduce hallucinations and factual inaccuracies. The company reports the new model is 33% less likely to make errors in individual claims compared to GPT-5.2, with overall responses 18% less likely to contain mistakes.
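Both figures are relative reductions, not percentage-point drops. As a rough illustration only, the sketch below applies them to a made-up baseline error rate; OpenAI's announcement as reported here gives no baseline, so the 10% and 20% starting values are assumptions chosen purely to show the arithmetic.

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (e.g. 0.33 for "33% less likely")
    to a baseline rate expressed as a fraction between 0 and 1."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# Hypothetical baselines, NOT figures from OpenAI.
ASSUMED_CLAIM_ERROR_RATE = 0.10     # assumed share of individual claims with an error
ASSUMED_RESPONSE_ERROR_RATE = 0.20  # assumed share of responses with any error

new_claim_rate = reduced_rate(ASSUMED_CLAIM_ERROR_RATE, 0.33)
new_response_rate = reduced_rate(ASSUMED_RESPONSE_ERROR_RATE, 0.18)

print(f"claims:    {ASSUMED_CLAIM_ERROR_RATE:.1%} -> {new_claim_rate:.1%}")
print(f"responses: {ASSUMED_RESPONSE_ERROR_RATE:.1%} -> {new_response_rate:.1%}")
```

Under these assumed baselines, a 33% relative reduction leaves roughly two thirds of the original claim-level errors in place, which is why the baseline matters when reading such figures.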
As part of the launch, OpenAI redesigned how the GPT-5.4 API handles tool calling, introducing a new system named Tool Search. Previously, system prompts had to define all available tools upfront—a process that consumed considerable tokens as tool libraries expanded. The new system lets models retrieve tool definitions on demand, making requests faster and more cost-effective in environments with many tools.
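The difference between the two approaches can be sketched in a few lines. The example below is not OpenAI's API: the catalog, the tool names, and the `search_tools` helper are all hypothetical, and a simple substring match stands in for whatever retrieval Tool Search actually uses. It only illustrates why sending matched definitions on demand can shrink a request compared with sending every definition upfront.

```python
import json

# Hypothetical tool catalog; names and schemas are illustrative only.
TOOL_CATALOG = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    },
    "convert_currency": {
        "description": "Convert an amount between two currencies.",
        "parameters": {"amount": "number", "source": "string", "target": "string"},
    },
    "search_contracts": {
        "description": "Search legal contracts by keyword.",
        "parameters": {"query": "string"},
    },
}


def upfront_payload(catalog: dict) -> str:
    """Older pattern: serialize every tool definition into the request."""
    return json.dumps(catalog)


def search_tools(catalog: dict, query: str) -> dict:
    """On-demand pattern: return only the definitions whose name or
    description contains the query (case-insensitive substring match)."""
    needle = query.lower()
    return {
        name: spec
        for name, spec in catalog.items()
        if needle in name.lower() or needle in spec["description"].lower()
    }


def on_demand_payload(catalog: dict, query: str) -> str:
    """Serialize only the definitions matched for this request."""
    return json.dumps(search_tools(catalog, query))


if __name__ == "__main__":
    full = upfront_payload(TOOL_CATALOG)
    lazy = on_demand_payload(TOOL_CATALOG, "weather")
    # Character counts are only a rough proxy for token counts.
    print(f"upfront: {len(full)} chars, on demand: {len(lazy)} chars")
```

With three tools the saving is small; the gap widens as the catalog grows, which matches the article's point about environments with many tools.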
OpenAI has also added a new safety evaluation to assess its models' chain-of-thought—the running commentary that reveals the model's reasoning during multi-step tasks. AI safety researchers have long expressed concern that reasoning models might misrepresent their chain-of-thought, and testing confirms this can occur under certain conditions.
OpenAI's new evaluation indicates that deception is less probable in the Thinking version of GPT-5.4, "suggesting the model lacks the capability to conceal its reasoning and that CoT monitoring remains an effective safety tool."












