Gemini Unveils Enhanced Model, Extended Context, AI Agents

April 10, 2025

Back in December, we rolled out our first natively multimodal model, Gemini 1.0, available in three sizes: Ultra, Pro, and Nano. Fast forward a few months, and we introduced 1.5 Pro, boasting enhanced performance and a groundbreaking long context window of 1 million tokens.

Developers and enterprise customers have been leveraging 1.5 Pro in some pretty amazing ways, appreciating its long context window, robust multimodal reasoning, and overall stellar performance.

Feedback from users highlighted the need for models with lower latency and cost, which spurred us to keep pushing the envelope. That's why we're excited to introduce Gemini 1.5 Flash today. This model is lighter than 1.5 Pro, designed to be fast and efficient, and perfect for scaling up.

Both 1.5 Pro and 1.5 Flash are now in public preview, with a 1 million token context window, accessible through Google AI Studio and Vertex AI. And for those who need even more, 1.5 Pro now offers a 2 million token context window, available via waitlist to developers using the API and Google Cloud customers.

We're not stopping there. We're also rolling out updates across the entire Gemini family, unveiling our next generation of open models, Gemma 2, and making strides in the future of AI assistants with Project Astra.

Context lengths of leading foundation models compared with Gemini 1.5’s 2 million token capability

Updates across the Gemini family of models

The new 1.5 Flash, optimized for speed and efficiency

Introducing 1.5 Flash, the latest and fastest member of the Gemini family, served through our API. It's tailored for high-volume, high-frequency tasks, offering cost-effective scalability while maintaining our long context window breakthrough.

Although lighter than 1.5 Pro, 1.5 Flash is no slouch. It excels in multimodal reasoning across vast data sets, delivering impressive quality relative to its size.

The new Gemini 1.5 Flash model is optimized for speed and efficiency, is highly capable of multimodal reasoning and features our breakthrough long context window.

1.5 Flash shines in tasks like summarization, chat applications, and captioning images and videos. It's also adept at extracting data from long documents and tables. This versatility stems from being trained by 1.5 Pro through "distillation," where the core knowledge and skills of a larger model are passed down to a more efficient, smaller model.
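
To make the idea of distillation concrete, here is a minimal sketch of a generic soft-target distillation loss, in which a smaller "student" model is trained to match the output distribution of a larger "teacher". This is an illustration only: the article does not say how 1.5 Flash was actually distilled, and the function names, the temperature value, and the example logits are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper for illustration; not the method used for any Gemini model.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimizing it pushes the smaller model toward the larger model's behavior.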

For more details on 1.5 Flash, check out our updated Gemini 1.5 technical report and the Gemini technology page, and learn about the model's availability and pricing.

Significantly improving 1.5 Pro

Over the past few months, we've made significant strides in enhancing 1.5 Pro, our top performer across a wide range of tasks.

We've expanded its context window to 2 million tokens and improved its capabilities in code generation, logical reasoning, planning, multi-turn conversations, and understanding audio and images. These enhancements come from advances in data and algorithms, and they show up as marked improvements on both public and internal benchmarks.

1.5 Pro now handles increasingly complex and nuanced instructions, including those that define product-level behaviors like role, format, and style. We've refined control over the model's responses for specific use cases, such as customizing chat agent personas or automating workflows with multiple function calls. Users can now steer the model's behavior with system instructions.

We've also added audio understanding to the Gemini API and Google AI Studio, allowing 1.5 Pro to process both images and audio from videos uploaded to Google AI Studio. We're integrating 1.5 Pro into Google products like Gemini Advanced and Workspace apps.

For more on 1.5 Pro, dive into our updated Gemini 1.5 technical report and the Gemini technology page.

Gemini Nano understands multimodal inputs

Gemini Nano is stepping up its game, moving beyond text-only inputs to include images. Starting with Pixel, apps using Gemini Nano with Multimodality will be able to interpret the world in a more human-like way, through text, visuals, sound, and spoken language.

Learn more about Gemini 1.0 Nano on Android.

Next generation of open models

Today, we're also updating Gemma, our family of open models, which are built on the same research and tech as the Gemini models.

We're launching Gemma 2, our next-gen open models for responsible AI innovation. Gemma 2 features a new architecture for superior performance and efficiency, and will come in new sizes.

The Gemma family is growing with PaliGemma, our first vision-language model inspired by PaLI-3. We've also upgraded our Responsible Generative AI Toolkit with LLM Comparator to assess model response quality.

For more details, head over to the Developer blog.

Progress developing universal AI agents

At Google DeepMind, our mission is to build AI responsibly to benefit humanity. We've always aimed to create universal AI agents that can assist in everyday life. That's why we're sharing our progress on the future of AI assistants with Project Astra (advanced seeing and talking responsive agent).

For an AI agent to be truly helpful, it needs to understand and respond to the world as a person would, taking in and remembering what it sees and hears so that it can grasp context and act on it. It should also be proactive, teachable, and personal, so that users can talk to it naturally and without lag.

While we've made great strides in processing multimodal information, achieving conversational response times is a tough engineering challenge. Over the years, we've been refining how our models perceive, reason, and converse to make interactions feel more natural.

Building on Gemini, we've developed prototype agents that process information faster by continuously encoding video frames, merging video and speech inputs into a timeline of events, and caching this data for quick recall.
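
The pipeline described above (encode inputs as they arrive, merge video and speech into one timeline, cache it for recall) can be sketched in a few lines. This is a toy illustration, not Project Astra's implementation: `Event`, `merge_streams`, and `TimelineCache` are hypothetical names, plain strings stand in for encoded frames and speech, and the bounded cache size is an assumption.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the session started
    modality: str     # "video" or "speech"
    payload: str      # stand-in for an encoded frame or a transcript chunk


def merge_streams(video_events, speech_events):
    """Merge two streams, each already sorted by time, into one ordered timeline."""
    return heapq.merge(video_events, speech_events, key=lambda e: e.timestamp)


class TimelineCache:
    """Keeps the most recent events in memory so they can be recalled quickly."""

    def __init__(self, max_events=1000):
        # A bounded deque drops the oldest event once the cache is full.
        self._events = deque(maxlen=max_events)

    def extend(self, events):
        for event in events:
            self._events.append(event)

    def recall(self, start, end):
        """Return cached events whose timestamps fall within [start, end]."""
        return [e for e in self._events if start <= e.timestamp <= end]
```

In use, the merged output of `merge_streams` is fed to `TimelineCache.extend`, and `recall` then answers questions such as "what was seen and said in the last few seconds" without reprocessing the raw input. Bounding the cache trades long-term memory for predictable lookup cost, which is one plausible way to keep response times conversational.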

By using our top-tier speech models, we've also improved how these agents sound, giving them a broader range of intonations. They can better understand the context they're in and respond swiftly in conversation.

With this technology, it's easy to imagine a future where everyone has an expert AI assistant at their side, accessible through a phone or glasses. Some of these capabilities will be coming to Google products like the Gemini app and web experience later this year.

Continued exploration

We've come a long way with our Gemini family of models, and we're committed to pushing the boundaries even further. Through relentless innovation, we're exploring new frontiers while unlocking exciting new use cases for Gemini.

To learn more about Gemini and its capabilities, check out our resources.


Comments (26)
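GregoryWilson April 27, 2026 at 4:00:25 PM EDT

Gemini's progress is impressive! The long context window looks set to revolutionize practical AI agent development. But with competition heating up, I'm a little worried about whether ethical guidelines are keeping pace. 🤔 Personally, I'd be glad to see a lightweight version that smaller projects can use come out soon.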

LucasWalker April 18, 2025 at 5:37:58 PM EDT

I can't believe Gemini's new model has a 1 million token context! 🤯 It's like having a super-smart AI that can handle anything. The AI agents are a game changer too. I can't wait to see what they release next! 🚀

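FrankSmith April 15, 2025 at 8:37:56 PM EDT

Gemini's new model is really cool! The 1 million token context window is truly amazing. It's like having a smart friend who remembers every conversation! I wish it were a bit faster, but hey, you can't have everything, right? 🤓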

JamesMiller April 15, 2025 at 1:53:33 PM EDT

Gemini's new model is pretty cool! The 1 million token context window is crazy; it's like having a super-smart friend who remembers everything you've ever said! I just wish it were a little faster, but hey, you can't have everything, right? 🤓

MarkRoberts April 14, 2025 at 9:25:31 PM EDT

The new Gemini model is impressive, especially the long context window. It's great for developers, but it can be a bit overwhelming for beginners. The AI agents are great, but I wish there were more documentation on how to use them effectively.

BillyGarcia April 14, 2025 at 3:20:08 PM EDT

Gemini's new model with a one million token context is insane! 🤯 It's like having a super-intelligent AI that can handle anything. The AI agents are also a game changer. I can't wait to see what they launch next! 🚀
