URL: https://thenewstack.io/nvidia-open-sources-kai-scheduler-to-help-ai-teams-optimize-gpu-utilization/

2025-04-01 01:00:58
NVIDIA Open Sources KAI Scheduler To Help AI Teams Optimize GPU Utilization

NVIDIA today open sources Run:ai's KAI Scheduler, a project that helps AI teams optimize GPU resource allocations in Kubernetes clusters.
Apr 1st, 2025 1:00am by Frederic Lardinois

At KubeCon Europe, NVIDIA announced today that it is open sourcing KAI Scheduler, a GPU-centric Kubernetes scheduler that was originally developed by Run:ai, which NVIDIA acquired last year. Available under the Apache 2.0 license, KAI Scheduler helps its users optimize GPU resource allocations for AI and machine learning workloads in GPU clusters.

NVIDIA argues that traditional resource schedulers are ill-suited for managing AI workloads because GPU demand can fluctuate sharply, ranging from bursty inference workloads to sustained model training runs that extend over days.

KAI Scheduler promises to give AI teams a better tool for managing these workloads by, among other things, dynamically adjusting quotas and limits in real time. It also offers a variety of scheduling strategies (gang scheduling, hierarchical queuing, bin-packing, spreading and GPU sharing) to avoid long wait times for access to GPUs.
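
To make three of these strategies concrete, here is a minimal Python sketch of bin-packing, spreading and gang scheduling on a toy cluster. It is an illustration only, not KAI Scheduler's code or API: the names `pick_node` and `gang_schedule`, the node names and the "free GPUs per node" model are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def pick_node(free_gpus: Dict[str, int], request: int, strategy: str) -> Optional[str]:
    """Pick a node for one pod that needs `request` whole GPUs (toy model).

    "binpack" prefers the fitting node with the FEWEST free GPUs, which
    keeps other nodes empty for large jobs. "spread" prefers the node with
    the MOST free GPUs, which distributes load across the cluster.
    """
    # Sort by name so ties are broken deterministically.
    candidates = sorted(n for n, free in free_gpus.items() if free >= request)
    if not candidates:
        return None
    if strategy == "binpack":
        return min(candidates, key=lambda n: free_gpus[n])
    if strategy == "spread":
        return max(candidates, key=lambda n: free_gpus[n])
    raise ValueError(f"unknown strategy: {strategy}")


def gang_schedule(
    free_gpus: Dict[str, int], requests: List[int], strategy: str = "binpack"
) -> Optional[List[str]]:
    """Place every pod of a job, or none of them (all-or-nothing)."""
    trial = dict(free_gpus)  # work on a copy so a failure changes nothing
    placement = []
    for request in requests:
        node = pick_node(trial, request, strategy)
        if node is None:
            return None  # one pod does not fit, so the whole job waits
        trial[node] -= request
        placement.append(node)
    free_gpus.update(trial)  # commit only once every pod has a node
    return placement


if __name__ == "__main__":
    cluster = {"node-a": 8, "node-b": 4, "node-c": 2}  # hypothetical nodes
    print(pick_node(cluster, 2, "binpack"))
    print(pick_node(cluster, 2, "spread"))
    print(gang_schedule(cluster, [4, 4, 4]))
```

The all-or-nothing commit is the point of gang scheduling: a distributed training job whose workers are only partly placed would hold GPUs while making no progress.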

GPU sharing looks like it will be an especially useful feature, since it allows multiple pods to use the same GPU. It’s worth noting that NVIDIA already offers a tool called GPU Operator, a Kubernetes framework for provisioning GPUs, which also includes a GPU time-slicing feature.

GPU Operator, however, is very much focused on working with NVIDIA hardware and large clusters (including NVIDIA’s own DGX racks), while KAI Scheduler is more vendor-agnostic and also supports AI workloads on CPUs.

KAI Scheduler’s approach to GPU sharing focuses on the individual GPUs and the memory available to them. What developers can reserve is a share of a GPU’s memory. There is no memory isolation between the workloads sharing a GPU, though.
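
The following Python sketch models what reserving a share of GPU memory without isolation could look like as pure bookkeeping. It is a toy model under stated assumptions, not KAI Scheduler's implementation: the `SharedGpu` class, its methods and the 40 GiB device size are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SharedGpu:
    """Toy bookkeeping for one GPU shared by several pods (hypothetical)."""

    total_mib: int
    reservations: Dict[str, int] = field(default_factory=dict)

    def reserved_mib(self) -> int:
        return sum(self.reservations.values())

    def reserve(self, pod: str, fraction: float) -> bool:
        """Reserve `fraction` of the GPU's memory for `pod` if it still fits."""
        if not 0 < fraction <= 1:
            raise ValueError("fraction must be in (0, 1]")
        wanted = int(self.total_mib * fraction)
        if self.reserved_mib() + wanted > self.total_mib:
            return False  # the scheduler would look for another GPU
        # Accounting only: nothing here limits what the pod really allocates,
        # which is what "no memory isolation" means in practice.
        self.reservations[pod] = wanted
        return True


if __name__ == "__main__":
    gpu = SharedGpu(total_mib=40960)  # assumed 40 GiB device
    print(gpu.reserve("notebook", 0.5))
    print(gpu.reserve("inference", 0.25))
    print(gpu.reserve("training", 0.5))
```

Because the reservation is only an accounting entry, a pod that allocates more memory than it reserved can still affect its neighbors on the same GPU, so workloads sharing a device need to respect their own limits.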

By default, KAI Scheduler integrates with popular AI tools and cloud native frameworks like Kubeflow’s Training Operator, Ray and Argo.

The code and documentation for KAI Scheduler are now available on GitHub. Quite a few other parts of Run:ai are already open source, too, including the somewhat related Genv GPU environment and cluster management tools.

Before joining The New Stack as its senior editor for AI, Frederic was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the earliest days of Kubernetes to the advent of quantum computing....
TNS owner Insight Partners is an investor in: run.ai.