URL: https://thenewstack.io/mosaicml-launches-30b-model-takes-on-llama-falcon-and-gpt/

MosaicML Launches 30B Model — Takes on LLaMA, Falcon and GPT - The New Stack


2023-06-22 07:00:11

MosaicML Launches 30B Model — Takes on LLaMA, Falcon and GPT

MosaicML has launched MPT-30B, which founder Naveen Rao claims outperforms both LLaMA and Falcon in certain use cases for enterprise devs.
Jun 22nd, 2023 7:00am by Richard MacManus

MosaicML is launching its second open source large language model (LLM), called MPT-30B, which follows on from the smaller MPT-7B model it debuted in May.

To discuss the new model and what it means for developers, I spoke to MosaicML co-founder and CEO, Naveen Rao. His previous startup was Nervana, a deep learning company that was acquired by Intel in 2016 — so he’s no johnny-come-lately in the AI industry.

As the name suggests, MPT-30B is a 30-billion parameter model. The company claims that it surpasses OpenAI’s GPT-3 in quality, despite having about 1/6th the number of parameters (GPT-3 has 175 billion). “This means MPT-30B is easier to run on local hardware and much cheaper to deploy for inference,” the company says.

MosaicML vs. LLaMA and Falcon

MPT-30B was trained on longer sequences (up to 8,000 tokens) than other models, including GPT-3, LLaMA and Falcon (2,000 tokens each). According to MosaicML, “It is designed to handle even longer sequences in practice, making it a perfect fit for data-heavy enterprise applications.”

In practice, this means that users can enter longer prompts. Indeed, MosaicML’s previous 7B parameter model comes with a fine-tuned option, called MPT-7B-StoryWriter-65k+, that has a massive “context length” of 65,000 tokens.

“Longer context [lengths] means more flexible usages,” said Rao. “We’re going to have fine-tuned versions that are especially good for writing prose — for writing longer outputs.”
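To make the context-length comparison concrete, here is a minimal sketch that checks which of the context windows quoted above a given prompt would fit into. The token estimate (about 1.3 tokens per word) and the 500 tokens reserved for the model's reply are assumptions for illustration only; a real tokenizer would give different counts, and the function names are hypothetical.

```python
# Context lengths in tokens, as quoted in the article.
CONTEXT_LENGTHS = {
    "MPT-30B": 8_000,
    "GPT-3 / LLaMA / Falcon": 2_000,
    "MPT-7B-StoryWriter-65k+": 65_000,
}


def estimate_tokens(text, tokens_per_word=1.3):
    """Rough token estimate. The 1.3 tokens-per-word ratio is an assumption,
    not a property of any particular tokenizer."""
    return int(len(text.split()) * tokens_per_word)


def models_that_fit(prompt, reserved_for_output=500):
    """Return the models whose context window can hold the prompt plus
    a budget of tokens reserved for the generated output."""
    needed = estimate_tokens(prompt) + reserved_for_output
    return [name for name, limit in CONTEXT_LENGTHS.items() if needed <= limit]


# A prompt of roughly 3,000 words is too long for a 2,000-token window.
long_prompt = "word " * 3000
print(estimate_tokens(long_prompt))
print(models_that_fit(long_prompt))
```

Under these assumptions, a prompt of a few thousand words already exceeds a 2,000-token window while fitting comfortably in the longer ones.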

The MosaicML platform; via its company blog

Another difference Rao wanted to highlight was MPT-30B’s attention mechanism. When Google published its now famous 2017 paper about transformer technology, “Attention Is All You Need,” it noted that “multi-headed self-attention” was the mechanism that provided its breakthrough for AI (an insight that OpenAI then borrowed to build GPT).

“Attention is the intrinsic part to transformer models,” explained Rao. “That’s actually what allows them to see connections across a sentence, or a paragraph, or a whole corpus of text.”
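For readers who want to see what "seeing connections across a sentence" looks like in code, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is a simplification: real transformer layers first apply learned projections to produce queries, keys and values, and run several heads in parallel. Using the raw token vectors for all three roles is an assumption made here for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Single-head scaled dot-product attention.
    Each argument is a list of equal-length vectors (lists of floats)."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Three toy "token" vectors; assumption: Q, K and V are all the raw vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = self_attention(tokens, tokens, tokens)
for row in result:
    print([round(x, 3) for x in row])
```

Every output row is a weighted blend of all input rows, which is how each position can draw on information from anywhere else in the sequence.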

Rao told me that MosaicML utilizes a technique called “FlashAttention,” which was the subject of a 2022 academic paper.

“It enables you to have faster inference and training — both Falcon and LLaMA do not have this,” he said. “So ours are actually more efficient from a computing perspective.”
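The FlashAttention paper describes computing attention in tiles while keeping a running softmax, so the full matrix of attention scores never has to be stored at once. The sketch below illustrates only that numerical idea for a single query in plain Python; it is not the GPU kernel and makes no claim about how MosaicML implements it. The function names and block size are hypothetical.

```python
import math


def attention_reference(q, keys, values):
    """Plain attention for one query: computes every score before normalizing."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def attention_tiled(q, keys, values, block_size=2):
    """Same result, but keys and values are visited block by block with a
    running maximum, a running normalizer and a running weighted sum."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_block]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated under the previous maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * vj for a, vj in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [0.5, -1.0, 2.0]
keys = [[1.0, 0.0, 0.5], [0.2, 0.3, -0.1], [2.0, -1.0, 1.0],
        [0.0, 0.0, 0.0], [-0.5, 1.5, 0.7]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
print(attention_reference(q, keys, values))
print(attention_tiled(q, keys, values))
```

Both functions produce the same output up to floating-point rounding. The tiled version only ever holds one block of scores at a time, which is where the memory savings in the real technique come from.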

Rao added that the new model is more appropriate for enterprise use, because it is “right-sized” to “fit into the constraints of real hardware.” He noted that deep-learning GPUs typically have 40 to 80 gigabytes of memory. According to Rao, the open source Falcon LLM struggles with this constraint.

“Oddly enough, the Falcon model that they released is a 40 billion parameter model. This doesn’t fit very easily into an 80-gig GPU, because it’s butting right up against the edge.”

He added that its own 30 billion parameter model is smaller in order to better optimize for GPUs. “It doesn’t really hurt us on performance and it will allow you to very easily fit into an 80-gig GPU,” he said.
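A back-of-the-envelope calculation shows why a 40 billion parameter model sits "right up against the edge" of an 80-gigabyte GPU. The sketch below assumes weights are stored in 16-bit precision (2 bytes per parameter), which the article does not state, and it ignores the extra memory needed for activations and caches during inference.

```python
BYTES_PER_PARAM_FP16 = 2  # assumption: 16-bit weights, not stated in the article
GPU_MEMORY_GB = 80        # the larger GPU size Rao mentions


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Memory needed to hold the model weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as given in the article.
models = [("MPT-30B", 30e9), ("Falcon (40B)", 40e9), ("GPT-3", 175e9)]

for name, params in models:
    need = weight_memory_gb(params)
    headroom = GPU_MEMORY_GB - need
    print(f"{name}: {need:.0f} GB of weights, {headroom:.0f} GB headroom on one 80 GB GPU")
```

Under that assumption, a 30 billion parameter model needs about 60 GB for its weights and leaves roughly 20 GB free, while a 40 billion parameter model needs about 80 GB and leaves no headroom.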

Rao claims that MosaicML’s new 30B parameter model also compares favorably to both LLaMA and Falcon in performance.

“We’re actually training on less compute, because of our efficiency methods, than LLaMA and Falcon. So it’s actually much cheaper to train. But we’re basically on parity. It depends on the evaluation metric — like, for coding, this model actually does considerably better than those two. On other things, it’s a little bit worse.”

Of course, the people behind LLaMA and Falcon might contest that. But it’s difficult to independently verify the claims of MosaicML because none of the three open source LLM projects Rao talks about (MosaicML, LLaMA or Falcon) have yet been tested using Stanford’s HELM measure.

MosaicML vs. OpenAI

So how does MosaicML’s model compare to OpenAI’s GPT-4? Rao acknowledged that GPT-4 is superior in terms of its capabilities, across most aspects. However, he reiterated that MosaicML’s model offers a longer context length, which allows for unique use cases — such as generating an epilogue to F. Scott Fitzgerald’s famous novel, ‘The Great Gatsby.’ (Aside: as a former English Literature major, this is the last thing I want from LLMs!).

The main challenge with large models like GPT-4, said Rao, is the high cost of running them, making it impractical for most enterprises. MosaicML also focuses on serving companies with specific data — including sensitive data — to fine-tune models for their specific industries.

In terms of use cases, Rao explained that industries like healthcare and banking can benefit from MosaicML’s ability to interpret and summarize large amounts of data. In the medical field, for instance, the model can interpret lab results and provide insights into a patient’s history by analyzing various inputs.

Rao emphasized the importance of open source models in these scenarios, as the nature of health (or indeed financial) data requires secure handling behind a firewall, rather than sending it over an API to the likes of OpenAI.

How Developers Can Use MosaicML

I asked how developers can start using MosaicML’s platform. Rao replied that MosaicML offers various options, depending on the developer’s needs and expertise. For simple integration, it provides an API similar to those of other companies (like OpenAI), which allows developers to easily incorporate MosaicML’s models into their frontend applications. He claims that MosaicML’s models are more cost-effective than similarly sized models from other providers.

Developers also have the option of customizing a MosaicML model by fine-tuning it with their own data. They can download the model, make modifications, and create their own API with the customized version.

For more advanced developers with ample data, Rao said that MosaicML’s tools can be used to pre-train custom models from scratch, and serve them using MosaicML’s platform.

I then asked about the compatibility of MosaicML with popular third-party tools, like LangChain.

“All the tools that you get with LangChain work with our APIs,” he replied. “And what’s really cool about it is, you can use those tools on top of a custom model that you build with us. So we basically give the developer incredible power in terms of customization, even owning the whole model. All your data that went into that model, the weights, everything, are owned by you, so full customization is possible. That’s what we enable. With these API providers [like OpenAI], you get what you get; there is zero customization.”

Team Open Source

Despite talking a little smack about LLaMA and Falcon during our interview, ultimately Rao thinks they’re all on the same team — and that it’s proprietary platforms like OpenAI that are the true competition.

“This puts the power back in the hands of the enterprise developer,” he said, about open source LLMs. “Having all of that in one centralized place, where you get what you get, is a big negative outcome.”

He also insisted that the open source LLMs are “closing the gap to these closed source models.” Maybe not completely yet, he acknowledged, but he thinks open LLMs have “crossed the threshold where these models are actually extremely useful.”

Richard MacManus is a Senior Editor at The New Stack and writes about web and application development trends. Previously he founded ReadWriteWeb in 2003 and built it into one of the world’s most influential technology news sites. From the early...
TNS owner Insight Partners is an investor in: OpenAI.