URL: https://thenewstack.io/google-ai-infrastructure-pm-on-new-tpus-liquid-cooling-and-more/

Google AI Infrastructure PM on New TPUs, Liquid Cooling and More - The New Stack



Google AI Infrastructure PM on New TPUs, Liquid Cooling and More

Google's Chelsie Czop talks about the company's Ironwood TPU, building hardware for models that are changing at an ever-increasing speed, and more.
May 13th, 2025 8:00am by Frederic Lardinois

At its Cloud Next 25 conference earlier this year, Google launched Ironwood, its latest custom Tensor Processing Unit (TPU) AI accelerator, which easily outperforms any of its previous-generation chips. To talk about Ironwood, as well as how Google thinks about using GPUs versus TPUs, building hardware for models that are changing at an ever-increasing speed, getting data centers ready for next-gen chips and more, I sat down with Chelsie Czop, a senior product manager for Google’s AI Infrastructure.

Ironwood pods, with 9,216 chips per pod, provide a total compute power of 42.5 exaflops, Google says. Ironwood also offers a 2x improvement in performance per watt compared to the previous generation of TPUs.
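
As a rough sanity check on those figures, the pod total can be divided by the chip count to estimate per-chip compute. The sketch below assumes the 42.5 exaflops figure is an aggregate spread evenly across all 9,216 chips; the constant and function names are illustrative and do not come from Google.

```python
# Back-of-the-envelope arithmetic from the pod figures quoted in the article.
# Assumption: the pod total is a simple sum of identical per-chip contributions.

CHIPS_PER_POD = 9_216
POD_EXAFLOPS = 42.5


def per_chip_petaflops(pod_exaflops: float, chips: int) -> float:
    """Return the average per-chip compute in petaflops (1 exaflop = 1,000 petaflops)."""
    return pod_exaflops * 1_000 / chips


chip_pflops = per_chip_petaflops(POD_EXAFLOPS, CHIPS_PER_POD)
print(f"{chip_pflops:.2f} petaflops per chip")
```

Under that even-split assumption, the result works out to roughly 4.6 petaflops per chip.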

As Czop noted, building these chips always involves tradeoffs.

“To be able to design these systems, too, it’s interesting because you go back to the constraints that you have: it’s power, it’s thermal — being able to cool it [because the] more power you bring in, the hotter it gets — and then being able to interconnect all these chips together,” she explained. “So it comes incrementally, and then you look at it, and you look back through the generations, and you realize how far you’ve been able to come and how much that leap has been from the beginning.”

As for thermal improvements, Google started using liquid cooling quite a few years ago, driven largely by the need to keep its early TPUs cool. The Ironwood TPUs use Google’s fourth generation of liquid cooling systems, Czop said, though she also noted that not every TPU generation used liquid cooling.

“Just watching the evolution as to how Google’s been able to evolve the liquid cooling every single generation, it’s different when you and I talk about it, but then, you go into the data center and you see the little changes,” she said. “We run the liquid cooling pipes on the outside and the front of the systems when you’re walking down the row. And one of the reasons we do that is so that you can visibly see if there’s a leak, and from one generation to another, there’s, like, a spigot that’s pointed up and one that’s pointed down. I’m sure there were some lessons learned with that.”

With these TPUs now being so powerful, one question Czop gets a lot from customers is whether to use TPUs or (mostly NVIDIA) GPUs for their workloads. She noted that it always depends on the customer’s workload, use case and what their teams are already using. At times, for example, teams may need an NVIDIA framework that isn’t available for TPUs to speed up their work. But for a lot of businesses, it’s also not an either/or discussion.

“We’ve had customers that go from CPUs directly to TPUs. I was speaking with Moloco in a session earlier, and they had a 10x improvement just porting their training applications from CPUs over to TPUs. They have very embedding-heavy models, so they didn’t even optimize for how they could use the sparse cores that are in TPUs to be able to do that. But at the same time, they still use GPUs as well,” Czop said.

Yet while the hardware keeps improving on an annual cadence, models, and model architectures, keep changing significantly faster. Czop noted that the team’s relationship with DeepMind helps it look ahead.

“It’s kind of funny to me when we’re writing our announcement blogs, because we’re like, we’re designing this hardware for the next generation, and we’re not even necessarily sure what those new model architectures are going to be,” she said. “And so especially now, we’re focused on think time compute, bringing training into the inference and thinking as you’re doing the inferencing. And that is right now on the bleeding edge. But that could change next week.”

Before joining The New Stack as its senior editor for AI, Frederic was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the earliest days of Kubernetes to the advent of quantum computing....
TNS owner Insight Partners is an investor in: Unit.