OpenAI Launches GPT-4.5 'Orion': Its Biggest AI Model to Date
Updated 2:40 pm PT: Just hours after the launch of GPT-4.5, OpenAI quietly edited the AI model's white paper, removing a line stating that "GPT-4.5 is not a frontier AI model." You can still access the original white paper here. Below is the original article.
On Thursday, OpenAI pulled back the curtain on GPT-4.5, the much-anticipated AI model that goes by the code name Orion. This latest behemoth from OpenAI has been trained with an unprecedented amount of computing power and data, setting it apart from its predecessors.
Despite its impressive scale, OpenAI's white paper initially stated that they didn't consider GPT-4.5 to be a frontier model. However, that statement has since been removed, leaving us to wonder about the model's true potential.
Starting Thursday, subscribers to ChatGPT Pro, OpenAI's premium $200-a-month service, will get a first taste of GPT-4.5 as part of a research preview. Developers on OpenAI's paid API tiers can start using GPT-4.5 today, while those with ChatGPT Plus and ChatGPT Team subscriptions should expect access sometime next week, according to an OpenAI spokesperson.
The tech world has been buzzing about Orion, viewing it as a test of whether traditional AI training methods still hold water. GPT-4.5 follows the same playbook as its predecessors, relying on a massive increase in computing power and data during an unsupervised learning phase called pre-training.
In the past, scaling up has led to significant performance leaps across various domains like math, writing, and coding. OpenAI claims that GPT-4.5's size has endowed it with "a deeper world knowledge" and "higher emotional intelligence." Yet, there are hints that the returns from scaling up might be diminishing. On several AI benchmarks, GPT-4.5 lags behind newer reasoning models from companies like DeepSeek, Anthropic, and even OpenAI itself.
Moreover, running GPT-4.5 comes with a hefty price tag. OpenAI admits it's so expensive that they're considering whether to keep it available through their API in the long run. Developers will pay $75 for every million input tokens and $150 for every million output tokens, a stark contrast to the more affordable GPT-4o, which costs just $2.50 per million input tokens and $10 per million output tokens.
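To put those rates in perspective, here is a back-of-the-envelope cost comparison using the per-million-token prices quoted above. The `request_cost` helper and the example token counts are illustrative, not part of any official SDK:

```python
# Per-million-token API prices (USD) quoted in the article.
PRICING = {
    "gpt-4.5": {"input": 75.00, "output": 150.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request for the given model."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a request with 10k input tokens and 2k output tokens.
gpt45 = request_cost("gpt-4.5", 10_000, 2_000)  # 0.75 + 0.30 = 1.05
gpt4o = request_cost("gpt-4o", 10_000, 2_000)   # 0.025 + 0.02 = 0.045
print(f"GPT-4.5: ${gpt45:.3f} vs GPT-4o: ${gpt4o:.3f} ({gpt45 / gpt4o:.0f}x)")
```

At these rates, the same request costs roughly 25 to 30 times more on GPT-4.5 than on GPT-4o, which helps explain why OpenAI is hedging on the model's long-term API availability.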
"We're sharing GPT‐4.5 as a research preview to better understand its strengths and limitations," OpenAI shared in a blog post. "We're still exploring its full potential and are excited to see how people will use it in unexpected ways."
Mixed performance
OpenAI is clear that GPT-4.5 isn't meant to replace GPT-4o, their workhorse model that drives most of their API and ChatGPT. While GPT-4.5 can handle file and image uploads and use ChatGPT's canvas tool, it currently doesn't support features like ChatGPT's realistic two-way voice mode.
On the bright side, GPT-4.5 outperforms GPT-4o and many other models on OpenAI's SimpleQA benchmark, which tests AI models on straightforward, factual questions. OpenAI also claims that GPT-4.5 hallucinates less frequently than most models, which should theoretically make it less likely to fabricate information.
Interestingly, OpenAI didn't include one of its top-performing reasoning models, deep research, in the SimpleQA results. An OpenAI spokesperson told TechCrunch that they haven't publicly reported deep research's performance on this benchmark and don't consider it a relevant comparison. However, Perplexity's Deep Research model, which performs similarly to OpenAI's deep research on other benchmarks, actually outscores GPT-4.5 on this test of factual accuracy.

SimpleQA benchmarks. Image Credits: OpenAI

On a subset of coding problems from the SWE-Bench Verified benchmark, GPT-4.5 performs similarly to GPT-4o and o3-mini but falls short of OpenAI's deep research and Anthropic's Claude 3.7 Sonnet. On another coding test, OpenAI's SWE-Lancer benchmark, which measures an AI model's ability to develop full software features, GPT-4.5 outperforms both GPT-4o and o3-mini but doesn't surpass deep research.

OpenAI's SWE-Bench Verified benchmark. Image Credits: OpenAI

OpenAI's SWE-Lancer Diamond benchmark. Image Credits: OpenAI

While GPT-4.5 doesn't quite match the performance of leading AI reasoning models like o3-mini, DeepSeek's R1, and Claude 3.7 Sonnet on challenging academic benchmarks like AIME and GPQA, it does hold its own against leading non-reasoning models on the same tests. That suggests GPT-4.5 is among the stronger non-reasoning options for math- and science-related tasks.
OpenAI also boasts that GPT-4.5 is qualitatively superior to other models in areas that benchmarks don't capture well, such as understanding human intent. They claim that GPT-4.5 responds in a warmer, more natural tone and performs well on creative tasks like writing and design.
In an informal test, OpenAI asked GPT-4.5 and two other models, GPT-4o and o3-mini, to create a unicorn in SVG format. Only GPT-4.5 managed to produce something resembling a unicorn.

Left: GPT-4.5; middle: GPT-4o; right: o3-mini. Image Credits: OpenAI

In another test, OpenAI prompted GPT-4.5 and the other models to respond to the prompt, "I'm going through a tough time after failing a test." While GPT-4o and o3-mini provided helpful information, GPT-4.5's response was the most socially appropriate.
"We look forward to gaining a more complete picture of GPT-4.5's capabilities through this release," OpenAI wrote in their blog post, "because we recognize that academic benchmarks don't always reflect real-world usefulness."

GPT-4.5's emotional intelligence in action. Image Credits: OpenAI

Scaling laws challenged
OpenAI claims that GPT-4.5 is "at the frontier of what is possible in unsupervised learning." Yet, its limitations seem to support the growing suspicion among experts that the so-called scaling laws of pre-training might be reaching their limits.
Ilya Sutskever, OpenAI co-founder and former chief scientist, stated in December that "we've achieved peak data" and that "pre-training as we know it will unquestionably end." His comments echoed the concerns shared by AI investors, founders, and researchers with TechCrunch in November.
In response to these challenges, the industry—including OpenAI—has turned to reasoning models, which take longer to perform tasks but offer more consistent results. By allowing reasoning models more time and computing power to "think" through problems, AI labs believe they can significantly enhance model capabilities.
OpenAI plans to eventually merge its GPT series with its "o" reasoning series, starting with GPT-5 later this year. Given its high training costs, delays, and reportedly unmet internal expectations, GPT-4.5 might not claim the AI benchmark crown on its own. But OpenAI likely sees it as a crucial step toward something far more powerful.
Related articles

OpenAI upgrades the AI model behind its Operator agent

OpenAI's o3 AI model scores lower on a benchmark than the company initially implied

DeepSeek AI challenges ChatGPT and shapes the future of AI
Comments (50)
GregoryBaker
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is impressive, but the quiet edit to the white paper was shady. It's like they're trying to hide something. Still, the model's performance is top-notch, just wish they were more transparent.
NicholasSanchez
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is impressive, but the quiet edit to the white paper is suspicious. It seems like they're trying to hide something. Still, the model's performance is top-notch. I'd just like a bit more transparency.
JasonJohnson
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is impressive, but the quiet revision to the white paper is fishy. It feels like they're hiding something. Even so, the model's performance is excellent. I wish they were more transparent.
JasonAnderson
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is impressive, but the silent edit to the white paper was suspicious. It seems like they're trying to hide something. Still, the model's performance is first-rate; I just wish they were more transparent.
AvaHill
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is impressive, but the silent edit to the white paper was suspicious. It looks like they're trying to hide something. Even so, the model's performance is top-tier; I only wish they were more transparent.
KennethMartin
April 10, 2025 at 12:00:00 AM GMT
GPT-4.5 'Orion' is massive, but the quiet edit to the white paper was shady. Why remove the 'not a frontier AI model' line? It's still a beast of a model, but the sneakiness is a bit off-putting. Transparency, please!