GPT-5 Launch Faces Challenges as OpenAI Navigates Rollout Hurdles

Updated Friday, August 8, 2025, 5:21 PM ET: Shortly after this article was published, OpenAI co-founder and CEO Sam Altman confirmed the company would restore GPT-4o and other legacy model access for select users, acknowledging that the GPT-5 rollout was "more bumpy than we had hoped."
To put it mildly, the highly anticipated launch of OpenAI’s new model, GPT‑5, has gotten off to a rocky start.
Even overlooking chart errors and voice demo glitches from yesterday’s live-streamed unveiling (which introduced four distinct models as well as a “Thinking” mode available for three of them), multiple user reports since release show GPT‑5 struggling with relatively simple problems that earlier OpenAI models—and rival systems from competing AI labs—solve correctly.
For instance, data scientist Colin Fraser shared screenshots of GPT‑5 incorrectly handling a math proof—specifically, whether 8.888 repeating equals 9 (it does not).
Wow, I was just playing around before but it actually is stupid pic.twitter.com/ao51nOH0Ui
— Colin Fraser (@colin_fraser) August 8, 2025
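For readers who want to check the arithmetic behind that claim, here is a short worked derivation. It uses standard algebra and is not taken from Fraser's post; it simply shows why 8.888 repeating falls short of 9:

```latex
\begin{align*}
x &= 8.\overline{8} \\
10x &= 88.\overline{8} \\
10x - x &= 88.\overline{8} - 8.\overline{8} = 80 \\
x &= \tfrac{80}{9} \approx 8.889 \\
9 - x &= \tfrac{81}{9} - \tfrac{80}{9} = \tfrac{1}{9}
\end{align*}
```

The gap of 1/9 never closes. Contrast this with 8.999 repeating, which does equal 9.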
It also failed on a simple arithmetic equation, 5.9 = x + 5.11, a problem many elementary students could solve.
This is concerning. https://t.co/PUbeCSgtRV
— Benjamin De Kraker (@BenjaminDEKR) August 8, 2025
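The equation takes a single subtraction to solve. As a minimal sketch, the expected answer can be checked as follows. The variable names are illustrative, and Python's decimal module is used only to avoid binary floating-point rounding:

```python
from decimal import Decimal

# Solve 5.9 = x + 5.11 by subtracting 5.11 from both sides.
lhs = Decimal("5.9")
addend = Decimal("5.11")
x = lhs - addend

# Substitute back in to confirm the solution satisfies the equation.
assert x + addend == lhs

print(x)  # 0.79
```

The correct value is x = 0.79.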
Using GPT‑5 to evaluate OpenAI's own flawed presentation charts didn’t produce useful or accurate responses either.
Q. Prove using an LLM-as-a-judge still doesn't work
A. pic.twitter.com/KnCK5Xs9ja
— Kangwook Lee (@Kangwook_Lee) August 7, 2025
Additionally, it stumbled on this trickier word problem, which, admittedly, even I found challenging at first, though Elon Musk’s Grok 4 AI answered it correctly. (For a hint, remember that flagstones cannot be split; all 80 must remain intact.)
Careful not to cut yourself on the jagged frontier pic.twitter.com/buJGgJ6baI
— Greg Burnham (@GregHBurnham) August 8, 2025
In my tests, the older GPT‑4o model handled at least one of these math problems more reliably. Unfortunately, OpenAI is gradually phasing out those earlier models—including the previous default GPT‑4o and the advanced reasoning model o3—for ChatGPT users, though they will remain accessible via the API for developers in the near term.
Coding performance falls short of benchmarks
Despite OpenAI’s internal benchmarks and certain third-party tests showing GPT‑5 as the top-performing model for coding, real-world usage suggests Anthropic’s recently upgraded Claude Opus 4.1 often handles “one‑shot” tasks more effectively—delivering the user’s intended application or software build as requested. See this example from developer Justin Sun, posted on X:
Opus 4.1's one-shot attempt at "create a 3d capybara petting zoo" – 8 minutes total
This was honestly pretty insane, not only are the capybaras way cuter and moving, there are individual pet affinity levels, a day/night switcher, feeding, and even a screenshot feature pic.twitter.com/FiKTO3FKK4
— justin (@justinsunyt) August 7, 2025
Moreover, a report from security firm SPLX revealed that OpenAI’s internal safety measures had significant gaps in areas like business alignment and susceptibility to prompt injection and obfuscated logic attacks.
Although anecdotal, early feedback from AI power users suggests a lukewarm reception overall.
AI influencer and former Googler Bilawal Sidhu ran a poll on X asking followers for a “vibe check.” With 172 votes so far, the prevailing response has been “Kinda mid.”
Alright, GPT-5 vibe check
— Bilawal Sidhu (@bilawalsidhu) August 7, 2025
As the pseudonymous AI Leaks and News account noted, “The overwhelming consensus on GPT-5 from both X and the Reddit AMA are overwhelmingly negative.”
The overwhelming consensus on GPT-5 from both X and the Reddit AMA are overwhelmingly negative
Most users are disgruntled about the broken model picker and non-pro users not having access to legacy models
What are your initial thoughts on GPT-5?
— AI Leaks and News (@AILeaksAndNews) August 8, 2025
Tibor Blaho, lead engineer at AIPRM and a well-known AI commentator on X, compiled a thorough summary of the ChatGPT‑5 rollout issues. He pointed out that one of the flagship features—an automatic “router” that selects either thinking or non‑thinking mode based on query complexity—has become a primary complaint, since the model often defaults to non‑thinking mode for many users.
A bit sad how the GPT-5 launch is going so far, especially after the long wait and high expectations
– The automatic switching between models (the router) seems partly broken/unreliable
– It's unclear exactly which model you're actually interacting with (standard or mini,…
— Tibor Blaho (@btibor91) August 8, 2025
Competitors poised to capitalize
As a result, sentiment around ChatGPT‑5 is far from uniformly positive—posing a serious challenge for OpenAI as competition intensifies from U.S. giants like Google and Anthropic, and from a growing roster of free, open‑source, and capable Chinese large language models offering capabilities many U.S. models lack.
Consider the Alibaba Qwen research team, which today upgraded their high‑performance Qwen 3 model to support 1 million tokens of context. This allows users to exchange nearly four times more information per interaction than GPT‑5 currently offers.
With OpenAI’s other major release this week, a new open-source gpt-oss model series, also receiving mixed early reviews, the outlook is uncertain for the dedicated AI company with the largest user base (ChatGPT now counts 700 million weekly active users).
This sentiment is echoed on the prediction market Polymarket, where users overwhelmingly bet that Google will have the leading AI model by the end of August 2025.
Other power users, such as Otherside AI co-founder and CEO Matt Shumer, who had early GPT-5 access and published a positive review, suggested that opinions may shift as more people optimize their workflows for the new model:
A lot of folks who are having a bad experience are using GPT-5 in agent harnesses that aren't yet optimized for it.
For every new model release, there's a time lag between release + when companies that integrate the model have it truly working well.
Agent companies rush to…
— Matt Shumer (@mattshumer_) August 8, 2025
While it’s still early for GPT‑5—and opinions could shift significantly as more people test it across various tasks—initial signs suggest this is not the “home run” that prior launches like GPT‑4, GPT‑4o, or o3 represented. That’s a troubling signal for a company that recently secured another funding round yet remains unprofitable due to steep R&D expenses.












