Once a niche solution for maximizing silicon yields, chiplets have become the industry’s go-to strategy for delivering more cores at lower costs. AMD popularized the approach with Ryzen and EPYC, Intel reluctantly followed, and now even NVIDIA and Qualcomm are getting in on the act. But while chiplets bring undeniable benefits, including better binning, lower wafer costs, and more flexible design scaling, they also introduce compromises that manufacturers would rather you not dwell on.

Going from monolithic designs to chiplets

Cheaper to manufacture, but comes with higher latencies

A single monolithic die with low latency, high bandwidth, and no pesky interconnect overhead remains something of a dream in modern processor designs. Moving to chiplets adds complexity, because every additional die means another set of interconnects. While vendors love to tout their ultra-fast die-to-die links, the reality is that latencies go up, bandwidth constraints emerge, and software needs to play along. AMD’s Infinity Fabric, for example, has matured significantly since Zen 2, but it still introduces penalties compared to a traditional monolithic design.

Here's what latency means in practical terms. For a processor, latency is how long it takes for information to travel from A to B. The longer the trip, the longer the delay before the information reaches its destination. In a monolithic design, all the key components of the chip sit on a single piece of silicon, so the information has less distance to travel and, in theory, arrives sooner.

In a chiplet-based processor, the interconnect, such as AMD’s Infinity Fabric, is like a motorway or highway that provides a direct route between A and B. The problem is that a highway is only needed because A and B are now further apart: the data has to leave one die, cross the link, and enter another, so latency is inherently higher than it would be if both ends sat on the same die.

So, why is lower latency important in this case? Higher latencies can mean slower response times in applications, lower and less consistent frame rates in games, and reduced efficiency in workloads that depend on fast, predictable data access. Put simply, time spent waiting on data is time not spent computing, so reducing latency matters for both raw compute performance and overall efficiency.

The impact of these interconnect latencies is felt most in workloads with plenty of cross-chiplet communication, such as low-latency financial computations or high-frequency trading applications where nanoseconds matter. Intel’s Foveros stacking aims to minimize these trade-offs with direct die-to-die TSV (through-silicon via) connections. Still, it has its own engineering challenges, such as increased heat density and manufacturing complexity that lowers yield.
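To see why the share of cross-chiplet traffic matters so much, here is a minimal back-of-the-envelope sketch. It assumes a very simple model in which every access costs a fixed local latency and a fraction of accesses also pay a fixed penalty for crossing the die-to-die link. The function name and all the numbers are hypothetical and chosen purely for illustration; they are not measurements of any real processor.

```python
def average_access_latency_ns(local_ns, hop_penalty_ns, remote_fraction):
    """Expected latency when some accesses must cross the die-to-die link.

    local_ns        -- latency of an access that stays on the same die
    hop_penalty_ns  -- extra latency added by crossing the interconnect
    remote_fraction -- share of accesses (0.0 to 1.0) that cross it
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    # Every access pays the local cost; only remote ones pay the hop penalty.
    return local_ns + remote_fraction * hop_penalty_ns


if __name__ == "__main__":
    # Hypothetical values for illustration only, not vendor figures.
    LOCAL_NS = 10.0
    HOP_PENALTY_NS = 60.0
    for share in (0.0, 0.1, 0.5):
        avg = average_access_latency_ns(LOCAL_NS, HOP_PENALTY_NS, share)
        print(f"{share:.0%} cross-chiplet accesses: {avg:.1f} ns on average")
```

Even in this toy model, the average climbs quickly once a meaningful share of accesses leaves the local die, which is why workloads that constantly exchange data between chiplets feel the penalty first.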

Then there’s power efficiency. A monolithic design enjoys direct core-to-core communication, whereas chiplet architectures depend on interposers, bridges, or advanced packaging to keep the pieces communicating. This introduces overheads, both in power and die area. These inefficiencies are hard to justify for power-limited environments such as laptops, where every milliwatt is precious. It’s no accident that Intel holds on to monolithic dies for its highest-performance mobile parts while adopting chiplets for desktops and servers.

The extra power drawn by the interconnect links also means that efficiency scales worse at low utilization, because the links consume power even when most cores are idle. This makes chiplet designs less suitable for workloads that don’t saturate all the available cores. The effect is minor, but it shows up in use cases such as media playback, web browsing, or light productivity work, where power efficiency can be just as important as peak performance.

Gaming performance is also a casualty. While AMD's Ryzen 3D V-Cache processors, such as the latest Ryzen 9 9950X3D, have demonstrated how chiplets can be tuned for specific workloads, cross-chiplet communication overheads can still lead to variable frame times and hurt latency-sensitive applications. The Ryzen 7000 series improved on this, but the root issue remains. There’s a reason some games still favor Intel’s monolithic Raptor Lake architecture over AMD’s CCD-based Zen 4 and Zen 5 implementations, even though AMD has closed the performance gap in most workloads.

The problem isn’t just raw latency; it’s also cache locality and memory access patterns. When a game thread runs on one CCD but needs data held in the L3 cache of another, the request has to traverse the Infinity Fabric, which imposes a measurable delay. Game developers have had to code around this, but not all games benefit from such optimizations. That’s why, despite AMD’s advances, some games still favor Intel’s highly integrated monolithic implementations.
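One common way to sidestep the cross-CCD trip is to keep a latency-sensitive process on the cores of a single CCD. The sketch below shows the idea on Linux using Python's `os.sched_setaffinity`. The helper names `ccd_cores` and `pin_to_ccd` are hypothetical, and the code assumes that logical CPUs are numbered contiguously per CCD, which is not guaranteed (SMT siblings are often numbered separately), so check the real topology with a tool such as `lscpu` before relying on it.

```python
import os


def ccd_cores(ccd_index, cores_per_ccd):
    """Return the logical CPU IDs assumed to belong to one CCD.

    Assumption: CPUs are numbered contiguously per CCD. Real systems may
    differ, so verify the topology before using these IDs.
    """
    start = ccd_index * cores_per_ccd
    return set(range(start, start + cores_per_ccd))


def pin_to_ccd(ccd_index, cores_per_ccd, pid=0):
    """Restrict a process (0 = the current one) to a single CCD's cores.

    Returns the CPU set that was applied, or None if nothing was changed.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # The affinity calls are not available on this platform.
    allowed = os.sched_getaffinity(pid)
    # Only request CPUs the process is actually permitted to run on.
    target = ccd_cores(ccd_index, cores_per_ccd) & allowed
    if not target:
        return None
    os.sched_setaffinity(pid, target)
    return target
```

Calling `pin_to_ccd(0, 8)` would, under the numbering assumption above, confine the current process to the first eight logical CPUs. The trade-off is that the process can no longer use cores on the other CCD, so this only helps when the latency saving outweighs the lost parallelism.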

Chiplet designs are more complex to design

But they have lower manufacturing costs overall

Chiplet designs also create logistical challenges in manufacturing and verification. Each chiplet must be tested individually before it is assembled into the final package, which adds steps to the production process.

This makes debugging and quality control more challenging, as a defect in one chiplet could ruin an entire multi-die package. Thermals are also a concern when chiplets are spread out on a substrate rather than sitting on a single silicon die. Heat dissipation must be carefully controlled to prevent hotspots, and the need for additional power delivery and signal routing adds further constraints to motherboard and cooling designs.
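The yield argument, and the reason each chiplet is tested before packaging, can be illustrated with the simple Poisson yield model, in which the chance that a die is defect-free falls exponentially with its area. The following is a minimal sketch under that model only. The die areas, the defect density, and the function names are hypothetical values picked for illustration, and real fabs use more refined models.

```python
import math


def poisson_yield(die_area_mm2, defects_per_mm2):
    """Fraction of dies expected to be defect-free under a Poisson model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


def untested_package_yield(chiplet_area_mm2, defects_per_mm2, chiplets):
    """Package yield if chiplets were assembled without individual testing.

    Every chiplet in the package must be good, so the yields multiply.
    """
    return poisson_yield(chiplet_area_mm2, defects_per_mm2) ** chiplets


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    DEFECTS_PER_MM2 = 0.001
    MONOLITHIC_AREA = 600.0  # one large die
    CHIPLET_AREA = 75.0      # eight of these cover the same silicon area
    CHIPLETS = 8

    mono = poisson_yield(MONOLITHIC_AREA, DEFECTS_PER_MM2)
    small = poisson_yield(CHIPLET_AREA, DEFECTS_PER_MM2)
    blind = untested_package_yield(CHIPLET_AREA, DEFECTS_PER_MM2, CHIPLETS)

    print(f"Monolithic die yield:            {mono:.1%}")
    print(f"Single chiplet yield:            {small:.1%}")
    print(f"Package yield without pre-test:  {blind:.1%}")
```

In this model, a package built from untested chiplets yields no better than the monolithic die of the same total area. The saving comes from discarding only the small defective dies and packaging the known-good ones, which is exactly why the extra per-chiplet test step is worth its cost.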

None of this is to suggest that chiplets are a bad thing. They’re necessary in a world where Moore’s Law is running on fumes and the economics of wafer production demand the highest possible yield. The industry has moved on from packing more transistors onto a single die to breaking up designs and stitching them back together with high-speed interconnects. But the next time a chip firm says that it has ‘solved’ the problem of interconnect latency, just recall that the laws of physics don’t do PR. The industry has traded some performance for lower production costs, and that bargain isn’t going away anytime soon.

Chiplet designs are improving, but they aren’t perfect

Looking towards the future

Chiplet-based designs are now the industry standard, but they bring drawbacks. While manufacturers continue refining the interconnects, the roads that connect chiplets, to reduce latency and the associated inefficiencies, the fundamental trade-offs of fragmentation, power overhead, and software complexity won’t vanish overnight. The future of chiplet-based processors hinges on solving these challenges. Still, as long as cost savings drive the design choices, a chiplet design that feels as seamless as a monolithic die is unlikely to arrive as soon as you may think.

Think of the interconnect as a pipe, and it’s these pipes that need optimizing. Packing in more cores is good for performance, but latency, power efficiency, and cost all have to be weighed as well. Interconnects are improving, but the pathways need further refinement before the underlying problems are solved. It’s up to the engineering teams to iron out these inefficiencies and push chiplet-based designs toward monolithic-level efficiency, so that the benefits manufacturers already enjoy, such as lower costs and higher yields, are passed on to consumers as better performance.