OpenAI's GPT-5 rivals human performance across diverse professions
On Thursday, OpenAI introduced GDPval, a new benchmark that measures how its AI models perform against human professionals across a range of industries and occupations. The benchmark is an early step toward gauging whether OpenAI's systems can outperform humans at economically valuable work, a core objective in the company's pursuit of artificial general intelligence (AGI).
According to OpenAI, both GPT-5 and Anthropic's Claude Opus 4.1 demonstrate output quality nearing that of industry specialists.
These findings do not mean AI is about to replace human workers, but they give OpenAI a way to track progress toward that threshold. OpenAI acknowledges that GDPval currently covers only a fraction of the tasks professionals actually perform, a caveat that sits at odds with some CEOs' predictions of widespread AI disruption within a few years.
GDPval evaluates performance across nine major sectors of the U.S. economy, including healthcare, finance, manufacturing, and government, and tests 44 occupations within them, ranging from software engineering to journalism.
For GDPval-v0, experienced professionals compared AI-generated reports with reports written by their human peers and chose the better one. In one sample task, investment bankers prepared a competitor landscape for the last-mile delivery industry, which was then judged against AI-generated versions. OpenAI then calculated each model's "win rate" against the human outputs across all 44 occupations.
GPT-5-high, an enhanced version of GPT-5, matched or exceeded expert output 40.6% of the time, while Claude Opus 4.1 did so 49% of the time. OpenAI suggests Claude's higher score may reflect its more polished visual presentation rather than a substantive advantage.
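As a rough illustration of the metric, the sketch below counts a comparison in the model's favor when graders judged its output better than or equal to the expert's. The verdict labels, the `win_rate` function name, and the sample data are hypothetical, and the sketch simply pools all comparisons; the article does not say how OpenAI aggregates results across occupations.

```python
from collections import Counter


def win_rate(judgments):
    """Share of comparisons in which the model matched or beat the expert.

    `judgments` is a list of grader verdicts: "win", "tie", or "loss"
    (hypothetical labels chosen for this illustration).
    """
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments to score")
    # Ties count in the model's favor, matching "matched or exceeded".
    return (counts["win"] + counts["tie"]) / total


# Made-up verdicts for ten comparisons; not real GDPval data.
sample = ["win", "tie", "loss", "loss", "win",
          "loss", "loss", "tie", "loss", "loss"]
rate = win_rate(sample)
print(f"{rate:.1%}")  # prints 40.0%
```

Counting ties as wins is what makes the reported figures a "matched or exceeded" rate rather than a strict preference rate.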
OpenAI acknowledges that GDPval-v0 is narrow in scope: it currently tests only the generation of research reports, which is a small part of most professionals' work. The company plans future versions that assess broader workplace interactions.
OpenAI's chief economist, Dr. Aaron Chatterji, told TechCrunch that the results indicate professionals can increasingly delegate routine tasks to AI, freeing their time for higher-value work.
Tejal Patwardhan, who leads evaluations at OpenAI, points to rapid progress: GPT-4o, released roughly fifteen months earlier, scored just 13.7%, and GPT-5 nearly triples that figure. She expects the trajectory to continue.
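The "nearly triples" claim can be checked with simple arithmetic. The snippet below assumes the GPT-5 figure in this comparison is the 40.6% reported for GPT-5-high; the variable names are illustrative.

```python
# Scores as reported in the article, in percent.
gpt4o_score = 13.7
gpt5_high_score = 40.6  # assumption: the GPT-5-high result is the one being compared

ratio = gpt5_high_score / gpt4o_score
print(f"{ratio:.2f}x")  # prints 2.96x
```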
Academic benchmarks such as AIME 2025 and GPQA Diamond still dominate AI assessment, but many models are approaching saturation on them. GDPval reflects a growing emphasis on practical, industry-relevant evaluation, though OpenAI will need more comprehensive testing to demonstrate conclusively that its models perform at a human level across professional domains.