
URL: https://thenewstack.io/simplify-ai-development-with-machine-learning-containers/

Simplify AI Development with Machine Learning Containers - The New Stack



Simplify AI Development with Machine Learning Containers

The creator of Docker Compose is behind a technology that wraps AI models into containers. There's also a cloud platform to share the models.
Jan 28th, 2025 7:46am by Richard MacManus

Replicate has a very simple premise: run and share machine learning models in the cloud, using container technology. Sound familiar? That may be because Replicate’s founder and CEO, Ben Firshman, previously created Fig, which was acquired by Docker and then became Docker Compose.

Replicate is the end result of Firshman and his business partner Andreas Jansson wanting to create a similar technology for machine learning.

Bringing Containers to AI

The first thing Firshman and Jansson created was a product called Cog, which Firshman has described as “Docker for machine learning.” Cog, according to Firshman, “makes it easy to package a machine learning model inside a container so that you can share it and deploy it to production.” The tool was open sourced, but like many developers who open source their creations, Firshman and Jansson then created a cloud platform, called Replicate, to commercialize the technology.

According to its documentation, Replicate “lets you run AI models with a cloud API, without having to understand machine learning or manage your own infrastructure.”
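To make "run AI models with a cloud API" concrete, here is a minimal Python sketch of what such a call might look like. It assumes Replicate's HTTP API accepts a POST to `/v1/predictions` with a JSON body holding a model `version` and an `input` object, authenticated with a bearer token; check the current Replicate documentation before relying on that shape. The helper name `build_prediction_request` and the placeholder token, version ID and prompt are illustrative only. The sketch builds the request without sending it, so it runs without an account.

```python
import json
import urllib.request

# Assumption: this endpoint and body shape follow Replicate's HTTP API;
# verify against the current documentation.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token: str, version: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) a request that would start a prediction."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values: substitute a real API token and model version ID.
    req = build_prediction_request(
        token="YOUR_API_TOKEN",
        version="MODEL_VERSION_ID",
        model_input={"prompt": "an astronaut riding a horse"},
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually run the model, send the request:
    # urllib.request.urlopen(req)
```

The point of the sketch is how little the caller needs to know: a model identifier and some inputs, with no GPU provisioning or framework code on the client side.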

Cog essentially abstracts the technical aspects of deploying machine learning applications. But similar to Modal, the serverless platform for AI apps that I profiled last week, Replicate mostly rents the compute needed to run these models.

“We don’t own our own GPUs,” Firshman explained on the Latent Space podcast last year. “We’ve got a few that we play around with, but not for production workloads. And we are primarily built on public clouds, so primarily GCP and CoreWeave and, like, some smatterings elsewhere.”

Helping Devs Tinker With LLMs

One of the keys to Replicate is that it allows developers to customize, fine-tune and tinker with open source LLMs. As Firshman noted in the podcast, “the whole point of open source is that you can tinker on it and you can customize it and you can fine-tune it and you can smush it together with another model.”

Replicate really came into its own when Meta released its open source Llama models, because of course that allows developers to tinker much more than with APIs from the likes of OpenAI and Google. “The beautiful thing about Llama 2 as a base model is that… you can fine-tune it for like 50 bucks,” Firshman said. “And that’s what’s so beautiful about the open source ecosystem.”

Just this week, the open source LLM ecosystem got another shot in the arm with the sudden emergence of DeepSeek, a Chinese company that claims to have built a reasoning LLM (called R1) that is the equal of OpenAI’s most powerful model, the o1. This claim is still being prodded by developers and the media, but regardless, it is undoubtedly good news for developers: the more open source models there are to tinker with, the better!

From a developer perspective, platforms like Replicate and Modal are a boon because they make using machine learning technology in applications viable. In the Latent Space podcast, Firshman likened it to when developers were introduced to web development platforms in the 1990s.

“You don’t need to be digging down into [the] PyTorch level, if you don’t want to — in the same way as a software engineer in the ’90s [didn’t] need to be understanding how network stacks work to be able to build a website. But you need to understand the shape of this thing.”

He urged developers to learn about modern AI, because it is becoming more and more important in the application development landscape.

“Just start playing around with it, get a feel of how language models work, get a feel of how these diffusion models work, get a feel of what fine-tuning is and how it works — because some of your job might be building datasets, you know. Get a feeling of how prompting works, because some of your job might be writing a prompt. And those are just all really important skills to sort of figure out.”

A Plethora of AI Models

In an interview with one of his investors, A16Z, last November, Firshman said there are around 20,000 models in the Replicate ecosystem. These include models that generate or enhance images and videos, transcription models, models for chat or music, models that help you “make 3D stuff,” and more. By far the most popular model, with 726.9 million “runs” as of writing, is SDXL-Lightning by TikTok owner ByteDance, described as “a fast text-to-image model that makes high-quality images in 4 steps.”

Firshman also said, in an interview with Assembly AI, that “we’re primarily building for startups.”

“And when we say startups, it’s both actual startups, but also small teams inside large companies that are kind of behaving like startups,” he clarified. “These teams are having a lot of success building, like, either whole products that are native to AI, or like particular point solutions to certain things inside the product.”

He added that some of Replicate’s customers are using it to fine-tune language models, but that this isn’t the primary use case.

Democratizing AI: Why Open Source Models Matter

Replicate and Modal are both good examples of a new type of platform providing value to developers: one that makes it easier to integrate AI into applications by abstracting away much of the complexity of machine learning technologies.

You could also argue that Replicate is helping democratize access to open source LLMs by enabling developers to customize and/or fine-tune models such as Meta’s Llama 3 (the latest Llama version), or even experiment with cutting-edge models like DeepSeek’s R1.

While it’s still early days in this evolution — given that OpenAI and the other industry heavyweights keep their most powerful models proprietary — platforms like Replicate show how open source could yet flourish in the AI engineering ecosystem. The more models that open up and become accessible to tinkerers, the better it will be for devs.

Richard MacManus is a Senior Editor at The New Stack and writes about web and application development trends. Previously he founded ReadWriteWeb in 2003 and built it into one of the world’s most influential technology news sites. From the early...
Google is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Docker, OpenAI.