
URL: https://thenewstack.io/nvidia-intros-large-language-model-customization-services/


Nvidia Intros Large Language Model Customization, Services

At its GTC developer event, Nvidia introduces new cloud services for custom training of LLMs and for biomedical research on LLM protein models.
Sep 21st, 2022 8:28am by Andrew Brust

At its annual GPU Technology Conference (GTC) developer event today, Nvidia is announcing two new cloud services based on large language model (LLM) technology. One service lets users customize pre-trained LLMs for their own specific use cases; the other caters to biomedical research using trained protein models. Both services are built on Nvidia’s NeMo Megatron framework, which is now entering public beta. Nvidia announced major improvements to NeMo Megatron less than two months ago.

Must read: Nvidia Shaves up to 30% off Large Language Model Training Times

See Prompt

The New Stack was briefed by Paresh Kharya, Nvidia’s Senior Director of Product Management and Marketing. Kharya explained that LLMs are based on the transformer architecture, invented by Google. That architecture is based on the premise that “AI can understand which parts of a sentence or which parts of an image, or even very disparate data points, are relevant to each other.” Kharya also said transformers can even train on unlabeled data sets, which expands the volume of data on which they can be trained.

It turns out that even fully trained LLMs can be used for a range of use cases (including those beyond language learning), as long as their massive foundation training is augmented with some additional, specialized training on a customer’s own data. Using a new technique called “prompt learning,” an LLM can be exposed to a small volume of example data (as few as a few hundred samples) and then be used for the customer’s scenario. The training generates a “prompt token,” effectively a companion model that provides context, which is then combined with the foundation model to deliver higher accuracy for that customer-specific use case.
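
To make the idea concrete, here is a toy sketch of prompt learning in plain Python. It is not Nvidia's implementation: the tiny "foundation model," the names `FROZEN_W`, `train_prompt` and `DATA`, and the four-example data set are all assumptions made up for illustration. What it shows is the division of labor described above: the foundation model's weights stay fixed, and only a small prompt vector, prepended to each input, is trained on the customer's examples.

```python
import math

# Frozen "foundation model" weights: fixed here and never updated during tuning.
FROZEN_W = (2.0, -1.0)

def frozen_model(vectors):
    """Mean-pool the embedding vectors, then apply the fixed logistic layer."""
    n = len(vectors)
    pooled = [sum(v[d] for v in vectors) / n for d in range(2)]
    score = sum(w * p for w, p in zip(FROZEN_W, pooled))
    return 1.0 / (1.0 + math.exp(-score))

def predict(prompt, example):
    """Prepend the learned prompt vector to the input, then run the frozen model."""
    return frozen_model([prompt] + example)

def train_prompt(data, steps=500, lr=0.5):
    """Gradient descent on the prompt vector only; FROZEN_W is never touched."""
    prompt = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for example, label in data:
            err = predict(prompt, example) - label  # d(cross-entropy)/d(score)
            n = len(example) + 1                    # prompt + input vectors
            for d in range(2):
                grad[d] += err * FROZEN_W[d] / n    # d(score)/d(prompt[d])
        for d in range(2):
            prompt[d] -= lr * grad[d] / len(data)
    return prompt

def accuracy(prompt, data):
    """Fraction of examples where the thresholded prediction matches the label."""
    hits = sum((predict(prompt, ex) > 0.5) == bool(label) for ex, label in data)
    return hits / len(data)

# Hypothetical customer data: one embedding vector per example, binary label.
DATA = [
    ([[1.0, 0.0]], 0),
    ([[2.0, 0.0]], 0),
    ([[4.0, 0.0]], 1),
    ([[5.0, 0.0]], 1),
]

if __name__ == "__main__":
    untuned = [0.0, 0.0]
    tuned = train_prompt(DATA)
    print("accuracy without a learned prompt:", accuracy(untuned, DATA))
    print("accuracy with a learned prompt:   ", accuracy(tuned, DATA))
```

In this toy, the learned prompt can only shift the frozen model's decision threshold, which is all this particular task needs. Real prompt learning trains many more prompt values against a transformer, but the principle of leaving the foundation weights untouched is the same.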

Nvidia’s new NeMo LLM Service will allow exactly that. Users submit their data to the model, then use the prompt token-customized LLM for their own applications. Nvidia says the prompt training times range from minutes to hours, a trivial duration compared to the weeks-to-months training times required for the LLMs themselves. Beyond prompt learning, the cloud service will also allow its LLMs to be used for inference directly.

The other service, called BioNeMo, is geared to “digital biology” and aims to accelerate drug discovery for pharma and biotech companies. It supports protein, DNA and biochemical data, providing ready access to four open source protein models, namely ESM-1 (created by Facebook parent company Meta, and retrained by Nvidia), OpenFold, MegaMolBART and ProtT5 (developed in a collaboration led by the Technical University of Munich’s RostLab and including Nvidia).

Early Access and Developer Playground

Users of these cloud services and APIs gain access to massive LLMs, including Megatron 530B (so named because it has 530 billion parameters), without needing to possess the model or any GPU hardware, be it on-premises or in the cloud. Instead, it’s all managed by Nvidia. Developers need only make the right API calls.
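
A rough back-of-the-envelope calculation suggests why not needing your own hardware matters at this scale. The sketch below estimates the memory required just to hold a 530-billion-parameter model's weights. The 2 bytes per parameter (16-bit weights) and the 80 GB of memory per GPU are assumptions chosen for illustration, not figures from Nvidia's announcement, and the function names are hypothetical.

```python
import math

def weights_memory_gb(num_params, bytes_per_param=2):
    """Memory needed to hold the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

def min_gpus_for_weights(num_params, gpu_memory_gb=80, bytes_per_param=2):
    """Smallest GPU count whose combined memory fits the weights alone."""
    needed_gb = weights_memory_gb(num_params, bytes_per_param)
    return math.ceil(needed_gb / gpu_memory_gb)

if __name__ == "__main__":
    params = 530_000_000_000  # 530 billion parameters
    print("weights (GB):", weights_memory_gb(params))
    print("GPUs needed for weights alone:", min_gpus_for_weights(params))
```

Under those assumptions the weights alone come to 1,060 GB, or at least 14 such GPUs, before counting activations or any memory needed for training.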

The two services will go into early access next month. And during the early access period, their use will be free. Developers who are interested can apply to Nvidia to be part of the early access program (although the company provided no link for, or details around, the application process). Nvidia will even provide a “playground” for no-code experimentation and interaction with the models.

Chief LLM Enthusiast

Nvidia Founder and CEO Jensen Huang is pretty jazzed about these new services. “Large language models hold the potential to transform every industry,” Huang said. “The ability to tune foundation models puts the power of LLMs within reach of millions of developers who can now create language services and power scientific discoveries without needing to build a massive model from scratch.”

Enabling the massive models as a service (MMaaS?) is a smart and logical move for the leader in GPUs, on which those models can best be trained. In fact, Nvidia is also announcing at GTC that its new H100 GPUs, which have transformer engines built in and will significantly accelerate LLM training, are now in full production.

All of this, especially early access to the NeMo cloud services, provides a cool opportunity for developers to work with the LLMs, with very little barrier to entry.
