Google splits new AI chips for training, inference

Google doubles down on AI hardware, unveiling two specialised TPU 8 chips for training and inference as it takes sharper aim at Nvidia’s data centre lead.

Google is rolling out a new generation of tensor processing units, creating separate processors optimised for AI model training and inference workloads. The eighth-generation TPUs, named TPU 8t and TPU 8i, are slated to become available later in 2026 through Google’s infrastructure.

TPU 8t targets massive, compute-heavy training jobs by emphasising higher compute throughput and expanded scale-up bandwidth. TPU 8i instead prioritises memory bandwidth so it can handle latency-sensitive inference tasks.

Google positions both chips as flexible enough to run a range of workloads but argues that separating training and inference silicon unlocks efficiency gains. TPU 8t is aimed at powering the next wave of large model training and AI agent development on Google’s custom-built supercomputers.

TPU 8i is geared towards the production side, serving huge volumes of inference requests from enterprise applications and consumer services. Google frames the shift towards workload-specific chips as a direct response to the rapid rise of AI agents and their diverse compute demands.

Google used its annual Las Vegas conference to showcase tools for building AI agents that automate business tasks. The company highlighted a “full stack” that spans AI infrastructure, Gemini models, data management, enterprise security foundations, developer tooling and agent-focused applications.

Google argues that tight integration between TPU hardware, its model offerings and cloud platform gives it a differentiated position against Nvidia-centric ecosystems. The TPU 8 family is meant to be the silicon backbone for that broader agent-first strategy.
