Google Unveils New Chip to Slash Major Hidden AI Cost

At the Google Cloud Next 25 event, Google unveiled the latest iteration of its Tensor Processing Unit (TPU), named Ironwood. This new chip marks a significant shift in focus for Google, emphasizing its use for inference rather than training. Traditionally, TPUs have been used for training neural networks, a process dominated by AI specialists and data scientists. However, with Ironwood, Google is now targeting the real-time prediction needs of millions, if not billions, of users.
Ironwood TPU
The launch of the Ironwood TPU comes at a pivotal time in the AI industry, as the focus shifts from experimental projects to practical business applications of AI models. The emergence of advanced AI models like Google's Gemini, which enhance reasoning capabilities, has spiked the demand for computing power during inference. This shift is driving up costs, as Google noted in its description of Ironwood: "reasoning and multi-step inference is shifting the incremental demand for compute -- and therefore cost -- from training to inference time (test-time scaling)." Ironwood represents Google's commitment to optimizing performance and efficiency, particularly in the increasingly costly domain of inference.
An Inference Chip
Google's journey with TPUs spans over a decade, with six generations preceding Ironwood. While training chips are produced in lower volumes, inference chips cater to a broader audience needing daily predictions from trained models, making inference a high-volume market. Previously, Google's sixth-generation TPU, Trillium, was positioned as capable of both training and inference. Ironwood's primary focus on inference marks a notable departure from that dual-purpose approach.
Necessary Investment
This shift in focus could signal a change in Google's reliance on external chipmakers like Intel, AMD, and Nvidia. Historically, these vendors have dominated Google's cloud computing operations, accounting for 99% of the processors used, according to KeyBanc Capital Markets. By investing in its own TPUs, Google might be aiming to reduce its dependency on these suppliers and potentially save on the escalating costs of AI infrastructure. Stock analysts, such as Gil Luria from DA Davidson, have estimated that if Google sold TPUs directly to Nvidia's customers, it could have generated up to $24 billion in revenue last year.
Ironwood vs. Trillium
Google showcased Ironwood's technical advances over Trillium at the event. Ironwood delivers twice the performance per watt, achieving 29.3 trillion floating-point operations per second per watt. It also features 192GB of high-bandwidth memory (HBM), six times Trillium's capacity, and a memory bandwidth of 7.2 terabytes per second, 4.5 times higher. These enhancements are designed to speed data movement and reduce latency on the chip during tensor manipulations; as Google stated, "Ironwood is designed to minimize data movement and latency on chip while carrying out massive tensor manipulations."
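Taken at face value, the published multipliers pin down Trillium's corresponding figures by simple division. A quick sketch using only the numbers quoted above; the derived Trillium values are back-calculated for illustration, not official specs:

```python
# Ironwood figures as quoted in Google's announcement.
ironwood = {
    "hbm_gb": 192.0,      # high-bandwidth memory capacity (GB)
    "mem_bw_tbps": 7.2,   # memory bandwidth (terabytes/s)
}

# Multipliers Google cited relative to Trillium.
factor = {"hbm_gb": 6.0, "mem_bw_tbps": 4.5}

# Back-calculated Trillium figures (derived, not official specs).
trillium = {k: v / factor[k] for k, v in ironwood.items()}

print(trillium)  # {'hbm_gb': 32.0, 'mem_bw_tbps': 1.6}
```

This implies Trillium offered roughly 32GB of HBM and 1.6 TB/s of bandwidth, which is consistent with the generational jump Google is claiming.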
Scaling AI Infrastructure
The advancements in memory and bandwidth are central to Google's strategy for scaling its AI infrastructure. Scaling involves efficiently utilizing grouped chips to solve problems in parallel, improving both performance and utilization. This matters economically: higher utilization means less waste of costly resources. Google has previously highlighted Trillium's ability to scale to hundreds of thousands of chips, and similarly emphasized Ironwood's capability to compose "hundreds of thousands of Ironwood chips to rapidly advance the frontiers of GenAI computation."
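The utilization economics can be made concrete with a toy cost model. All figures below are hypothetical, chosen only to illustrate why utilization matters, not Google's actual pricing or pod throughput:

```python
def cost_per_exaflop(hourly_cost_usd, peak_pflops, utilization):
    """Dollar cost to deliver 10**18 useful FLOPs at a given utilization.

    hourly_cost_usd: what an hour of the hardware costs (hypothetical)
    peak_pflops:     peak throughput in petaFLOP/s (hypothetical)
    utilization:     fraction of peak actually sustained (0..1)
    """
    useful_flops_per_hour = peak_pflops * 1e15 * utilization * 3600
    return hourly_cost_usd / (useful_flops_per_hour / 1e18)

# Same hardware, same hourly price: doubling sustained utilization
# halves the cost of each useful FLOP delivered.
low  = cost_per_exaflop(hourly_cost_usd=400.0, peak_pflops=10.0, utilization=0.3)
high = cost_per_exaflop(hourly_cost_usd=400.0, peak_pflops=10.0, utilization=0.6)
print(round(low / high, 2))  # 2.0
```

The hardware and its price are fixed; only the fraction of peak compute actually sustained changes, which is exactly the lever that efficient chip grouping pulls.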
Alongside the hardware announcement, Google also introduced Pathways on Cloud, a software solution that distributes AI computing tasks across different machines. Previously used internally, this software is now available to the public, further enhancing Google's AI infrastructure capabilities.
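Pathways itself is accessed through Google's own tooling, but the underlying idea of fanning one workload out across many machines can be sketched generically. The snippet below is a plain-Python stand-in: `fake_inference`, the worker names, and the round-robin policy are all placeholders, not the Pathways API:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder for a per-machine inference call; in a real system this
# would dispatch the request to a remote accelerator host.
def fake_inference(worker, prompt):
    return f"{worker}: processed {prompt!r}"

workers = ["host-0", "host-1", "host-2"]       # hypothetical machine pool
prompts = [f"query-{i}" for i in range(6)]     # incoming requests

# Round-robin the prompts over machines and run them in parallel.
with ThreadPoolExecutor(max_workers=len(workers)) as pool:
    futures = [
        pool.submit(fake_inference, workers[i % len(workers)], p)
        for i, p in enumerate(prompts)
    ]
    results = [f.result() for f in futures]

print(len(results))  # 6
```

Systems like Pathways add much more on top (sharded models, fault tolerance, accelerator-aware scheduling), but the scheduling shape is the same: many requests, many machines, one coordinator.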
Comments (17)
EllaJohnson
August 15, 2025 at 5:00:59 PM EDT
Whoa, Google's Ironwood TPU sounds like a game-changer for AI inference! Cutting costs like that could really shake up the cloud market. Anyone else curious how this stacks up against Nvidia’s gear? 🤔
RalphSanchez
August 14, 2025 at 7:01:00 PM EDT
Google's new Ironwood chip sounds like a game-changer for AI inference! 🚀 Excited to see how it cuts costs and boosts efficiency.
GaryGonzalez
April 24, 2025 at 3:26:40 AM EDT
Google's new Ironwood TPU is a real game-changer for AI costs! The focus on inference right now is cool, but I'm curious about the training side too. Still, if it can cut those hidden costs, I'm all for it. I hope they keep improving the training side as well! 🤞
WalterWalker
April 24, 2025 at 12:26:10 AM EDT
Google's new TPU, Ironwood, will revolutionize inference tasks! The focus on efficiency is great. It's a bit of a shame it isn't compatible with older models, though. Looking forward to future AI developments! 🤖
ChristopherAllen
April 23, 2025 at 9:03:04 PM EDT
Google's new TPU, Ironwood, is incredible for inference tasks. I love that they're focusing on efficiency! Though it bugs me a little that it isn't compatible with earlier models. Hoping to see more advances soon! 🚀
TerryScott
April 23, 2025 at 4:52:06 PM EDT
Google's new TPU, Ironwood, is truly a big step forward for inference tasks! I like how they're focusing on efficiency. However, the lack of compatibility with older models is a bit disappointing. Looking forward to future AI developments! 🤓