Google Says Gemini 2.5 Pro Outperforms DeepSeek R1 and Grok 3 Beta in Coding Benchmarks

October 24, 2025

RogerNelson

Google has unveiled a refreshed preview of its flagship Gemini 2.5 Pro model, initially introduced in March and enhanced in May. This iteration, described as the company's "most intelligent" AI to date, is currently in preview with plans for general availability within weeks.
Businesses can now experiment with developing new applications or upgrading existing implementations using the updated "I/O edition" of Gemini 2.5 Pro. According to Google's official announcement, this version delivers more imaginative responses and demonstrates superior capabilities in programming and logical reasoning compared to previous iterations.

Our latest Gemini 2.5 Pro update is now in preview.
It's better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads @lmarena_ai with a 24pt Elo score jump since the previous version.
We also… pic.twitter.com/SVjdQ2k1tJ
— Sundar Pichai (@sundarpichai) June 5, 2025

At its I/O developer conference in May, Google announced an updated Gemini 2.5 Pro that it said improved on the earlier version, which had been released quietly. Demis Hassabis, CEO of Google DeepMind, highlighted the I/O edition as the company's most advanced coding model to date.

This newest preview, designated Gemini 2.5 Pro Preview 06-05 Thinking, advances beyond the I/O edition's capabilities. The forthcoming public release promises enterprise-grade performance and scalability.

The original I/O edition (gemini-2.5-pro-preview-05-06) became accessible to developers and corporations in May via Google AI Studio and Vertex AI. The enhanced Gemini 2.5 Pro Preview 06-05 Thinking is available through these same channels.

Performance metrics

This upgraded Gemini 2.5 Pro demonstrates measurable improvements over its predecessor.

Google reported a 24-point Elo gain on LMArena and a 35-point Elo gain on WebDevArena, where it now tops the leaderboard. According to Google's published comparisons, the model also outperformed competitors including OpenAI's o3, o3-mini, and o4-mini, Anthropic's Claude Opus 4, xAI's Grok 3 Beta, and DeepSeek R1.
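To put those point gains in perspective, here is a minimal sketch that converts an Elo gap into an expected head-to-head win rate. It assumes the classic Elo formula with a 400-point scale; the leaderboards' exact rating method may differ, and the function name `expected_win_rate` is purely illustrative.

```python
def expected_win_rate(elo_gap: float) -> float:
    """Expected score of the higher-rated model under the classic Elo formula.

    Assumption: a 400-point logistic scale, as in standard Elo.
    A gap of 0 gives 0.5 (an even match).
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


if __name__ == "__main__":
    # Gaps quoted in the article: new version vs. the previous version.
    for name, gap in (("LMArena", 24), ("WebDevArena", 35)):
        rate = expected_win_rate(gap)
        print(f"{name}: +{gap} Elo -> {rate:.3f} expected win rate")
```

Under that assumption, a 24-point gap corresponds to the newer version being preferred in roughly 53% of pairwise comparisons against the older one, and a 35-point gap to roughly 55%. These are modest but measurable shifts, not a landslide.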

"We've also addressed feedback from our previous 2.5 Pro releases, improving its style and structure — it can be more creative with better-formatted responses," Google stated in its announcement.

What enterprises can expect

While Google's rapid succession of Gemini 2.5 Pro updates may be confusing, the company positions them as direct responses to user feedback. The new version is priced at $1.25 per million input tokens (without caching) and $10 per million output tokens.
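As a rough illustration of what those rates mean per request, the sketch below estimates cost from token counts. It uses only the two prices quoted above and ignores caching discounts and any other pricing tiers not mentioned here; the function name and the sample token counts are hypothetical.

```python
# Rates quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 1.25   # input tokens, without caching
OUTPUT_USD_PER_MILLION = 10.00  # output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Assumption: flat per-token pricing at the two rates above.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    cost = estimate_cost_usd(200_000, 50_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For that hypothetical workload, the input side costs $0.25 and the output side $0.50, so output tokens dominate the bill even though there are four times fewer of them.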

When Gemini 2.5 Pro debuted in March, industry observers described it as a highly capable model that was still underused. Google has since embedded the technology across numerous applications and features, including "Deep Think," which evaluates multiple hypotheses before generating a response.

Gemini 2.5 Pro's release and subsequent upgrades have reinforced Google's position in the competitive large language model landscape, reclaiming attention from rival reasoning models by DeepSeek and OpenAI.

Within hours of the announcement, developers began testing the updated Gemini 2.5 Pro. Early impressions appear to support Google's claim that the model is faster, though its other improvements have yet to be evaluated thoroughly.

First hour with "Gemini 2.5 Pro Preview 06-05"
Positives:
– It's faster
– It produces more output
– It has a better macro play (multi file edits, better overview)
– Output structure is better (readable)
– It's more concise and LESS APOLOGETIC!!
Before: "You are absolutely…
— Patrick Bade (@nishffx) June 5, 2025

you guys cooked, really enjoying the app builder.
made a game and tested it out, it was using imagen to build assets on the fly ? and it's up, hosted, easy to share. Really the best no-experience no-code builder yet.
keep building out the vibe app marketplace, this could…
— bone (@boneGPT) June 5, 2025

Gemini 2.5 Pro Preview is pretty good.. used it yesterday for deep research and the results are better than some of the big names..
— Janak (@janaks09) June 5, 2025

Comments (1)

JohnYoung

May 16, 2026 at 8:00:11 PM EDT

Interesting to see Google claiming coding benchmark wins, but I'm curious about real-world dev experience. Does it handle messy legacy codebases as well as it does clean competition problems? The 'most intelligent' tag feels a bit marketing-heavy until we see more hands-on results. 🤔