
URL: https://thenewstack.io/nvidia-gpu-dominance-at-a-crossroads/

⇱ Nvidia GPU Dominance at a Crossroads - The New Stack


2023-12-12 03:00:50
Hardware

Nvidia GPU Dominance at a Crossroads

Thanks to the surprise success of ChatGPT, Nvidia enjoys a lead in the market for GPUs and AI accelerators, but rivals such as AMD, Google, Meta and others are rapidly catching up with their own technologies.
Dec 12th, 2023 3:00am by Agam Shah
Nvidia CEO Jensen Huang at AWS re:Invent last month. AWS will be the first cloud service to offer the next-gen Nvidia GH200 to its customers.

When ChatGPT was released a year ago, every chip maker, except for Nvidia, was caught sleeping. Nvidia’s GPUs powered the initial surge of users for the AI chatbot.

The AI rush resulted in overwhelming demand for Nvidia’s technology. The long line of buyers included Elon Musk, who had to wait for the GPUs to improve Tesla’s AI backend for its self-driving cars. Even large cloud providers had to wait as long as six months to receive their orders of GPUs.

But a year on, chip makers have woken up with their own AI chips, which are being marketed as alternatives to Nvidia’s latest H100 GPUs.

The AI chips, such as AMD’s MI300X and Google’s TPU v5p, which were introduced this month, will not come with long waits. AMD’s chip will be cheaper, but the AI software support for the GPU is still a work in progress.

Nvidia knows the competition is coming and has taken steps to maintain its dominance. The first, and perhaps most significant, is the acceleration of the GPU product release cycle.

Nvidia’s Strategy for Dominance

Nvidia will now release a new GPU every year, a step up from its previous two-year release cycle. The company’s H100 GPUs are still in short supply, but last month it announced the H200 GPU, which delivers the same performance but has more memory capacity. Larger memory can store more data as AI jobs get longer and queries get more complex.


Nvidia H200 (Nvidia)

Nvidia’s former enemies, bitcoin miners, are now turning into allies. Cryptocurrency hunters are shying away from mining and turning data centers into AI computing centers. The miners will provide computing capacity on Nvidia H100 GPUs at significantly cheaper prices than conventional cloud providers.

Crusoe Energy is borrowing $200 million to acquire 20,000 H100 GPUs. The GPUs will become available to customers in the first quarter of next year. Crusoe used the GPUs as collateral to secure financing.

Nonprofit AI cloud provider Voltage Park, founded by blockchain billionaire Jed McCaleb, acquired 24,000 Nvidia H100 GPUs, which were ordered in April 2023. Voltage Park aims to be like eBay, with the highest bidder receiving AI computing time on the H100.

Mining company Bit Digital has acquired a fleet of H100 GPUs that will be deployed in a data center, the company said in a filing with the Securities and Exchange Commission. The company has also signed a multimillion-dollar contract for three years with a customer committed to using the GPUs.

Nvidia Topping Export Restrictions

The China market is important to Nvidia, and the U.S. government has put the company under the microscope.

The U.S. government has twice imposed restrictions on sales of some of Nvidia’s top GPUs to China, but Nvidia creatively changed the chips’ specifications so that sales to China could continue.

One such chip, the H800 GPU, which is a China market variant of the H100, has been withdrawn. The U.S. government announced restrictions in October that banned the sale of the H800, and server makers stopped sales shortly after. Lenovo pulled the GPU from its China server products on Oct. 31.

Nvidia has announced H20, a variant of its fastest GPU, for the China market but has delayed its release.

However, the chip maker is also facing competition from local AI chip makers. Chinese firms Huawei and Biren Technology have developed GPUs.

AMD’s GPU Is Faster than Nvidia’s Fastest

AMD was sleeping when ChatGPT was released. Its GPU wasn’t ready for AI, its software stack was broken, and AI wasn’t featured on its long-term roadmap.

But the company knows how to catch up quickly.

AMD this week launched its new MI300X, which CEO Lisa Su claimed was the world’s fastest AI accelerator. The GPU has more memory capacity than, and comparable throughput to, an Nvidia GPU.


The AMD MI300X Accelerator.

AMD has claimed better raw AI and computing performance. But those metrics are highly reliant on the foundational model, algorithms, software tools, compilers, and other variables. Nvidia has a stronger software stack with CUDA and has better customization of AI applications to its chips.

Beyond performance, most AI chip makers other than Nvidia are still trying to prove the viability of their silicon. But AMD is eroding Nvidia’s dominance by scoring major customers, including Microsoft, Meta, and Oracle, which are putting the MI300X in their data centers.

Microsoft’s infrastructure is heavily reliant on Nvidia hardware, but GPT-4 is now loaded on the MI300X. Microsoft also announced a preview of an MI300X virtual machine, and Oracle is testing the MI300X in its cloud services.

Meta plugged the MI300X into an OCP-compliant server in what was one of the fastest server deployments in Meta’s history, said Ajit Mathews, senior director of engineering at Meta, in an on-stage appearance.

AMD CEO Lisa Su said there was space for many AI chip makers. “We’re now expecting that the data center accelerator TAM will grow more than 70% annually over the next four years to over $400 billion in 2027,” Su said.
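To put Su’s projection in perspective, the implied starting market size can be backed out with simple compound-growth arithmetic. The sketch below is illustrative only: it assumes growth of exactly 70% a year, a target of exactly $400 billion, and four compounding periods from 2023 to 2027, whereas Su said "more than" 70% and "over" $400 billion. The function and variable names are made up for this example.

```python
def implied_base(target, annual_growth, years):
    """Back out the starting value that compounds to `target`.

    Assumes a constant growth rate applied once per year.
    """
    return target / (1 + annual_growth) ** years


# Assumptions for illustration, not AMD's own figures:
target_2027 = 400.0  # billions of U.S. dollars
growth = 0.70        # 70% annual growth
years = 4            # 2023 -> 2027

base_2023 = implied_base(target_2027, growth, years)
print(f"Implied 2023 market: ${base_2023:.1f}B")

# Year-by-year path under the same assumptions.
for i in range(years + 1):
    print(2023 + i, f"${base_2023 * (1 + growth) ** i:.1f}B")
```

Under those assumptions, the projection implies a data center accelerator market of a little under $48 billion in 2023, growing more than eightfold in four years.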

Google’s TPU v5p 

Google’s AI chips, called TPUs, have been around for a decade but were not very accessible and mostly used internally. Google last week released the TPU v5p, which is its first AI chip for training with mass availability.

The chip’s release coincided with the launch of Gemini, which is Google’s next-generation large language model. Google also announced a new type of supercomputer called Hypercomputer, which connects the conventional cloud-based consumption model with a supercomputing infrastructure.

The TPU v5p chips are limited to running AI workloads, while GPUs are designed to run general-purpose workloads. The TPU v5p chips are only available through Google Cloud but are easily accessible. A new feature called Dynamic Workload Scheduler offers flexible on-demand or scheduled availability.


Source: Google

The current version of Gemini was trained on the TPU v4 and TPU v5e chips. The TPU v5e is designed more for inferencing, while the TPU v5p is a beefier variant that can handle training.

The Hypercomputer server system bunches together 8,960 Cloud TPU v5p chips in a pod. The pods are interconnected via optical circuit switches (OCS), which provide a throughput of 4,800 Gbps. The Hypercomputer supports GPUs, but the optical interconnect, which is faster than copper wires, is reserved for the TPUs.

The TPU v5p is a result of hardware-software co-design, Mark Lohmeyer, Google vice president and general manager of compute and machine learning infrastructure, told The New Stack.

“Google is uniquely able to do that because of our depth and research, own in-house models, our ecosystem of partner models, our experience scaling those and applications that served multiple billions of consumers on top of this infrastructure,” Lohmeyer said.

Many AI companies are running on Google infrastructure and will use TPU v5p chips, Lohmeyer said.

More Competition on Tap

Intel’s best AI bet for now remains its CPUs, which are being designed for inference. The Xeon server chips include extensions such as AMX that speed up inferencing on models such as Llama 2.
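For readers who want to know whether a given Linux server exposes AMX, the kernel lists CPU features on the `flags` line of `/proc/cpuinfo`, and on x86 the AMX-related entries are `amx_tile`, `amx_bf16` and `amx_int8`. The following is a minimal sketch, not an Intel tool: the helper name `amx_flags` and the sample text are made up for illustration, and the check only shows what the kernel advertises, not whether an inference framework actually uses AMX.

```python
from pathlib import Path

# AMX-related feature flags as reported by the Linux kernel on x86.
AMX_FLAGS = {"amx_tile", "amx_bf16", "amx_int8"}


def amx_flags(cpuinfo_text):
    """Return the AMX flags found in /proc/cpuinfo-style text.

    Only the first "flags" line is inspected; it returns an empty
    set when no such line exists (for example on non-x86 systems).
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return AMX_FLAGS & set(value.split())
    return set()


# Hypothetical sample text, shortened for illustration.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx512f amx_bf16 amx_tile amx_int8\n"
print(sorted(amx_flags(sample)))

# On a real Linux machine, read the live file if it is present.
cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    print(sorted(amx_flags(cpuinfo.read_text())))
```

An empty result means the kernel does not report AMX on that machine, which is what older Xeon generations and non-Intel hosts would be expected to show.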

Intel’s got multiple AI accelerators, but none have caught fire. The general-purpose Ponte Vecchio GPU, which is in Aurora, the world’s second-fastest supercomputer, has found limited adoption. Another AI chip called Gaudi2 shows more promise: it matches the H100 on AI performance in some cases and is being used in an AI supercomputer being built for Stability AI.

Intel canceled the successor to Ponte Vecchio and revised its roadmap to release its next-generation GPU in 2025. The AI megachip, called Falcon Shores, merges the Gaudi accelerators with Intel’s GPU accelerators.

Agam Shah has covered enterprise IT for more than a decade. Outside of machine learning, hardware and chips, he's also interested in martial arts and Russia.
Amazon Web Services, Microsoft and Oracle are sponsors of The New Stack.
TNS owner Insight Partners is an investor in: Bit.