URL: https://replicate.com/

Replicate - Run AI with an API


Run AI with an API.

Run and fine-tune models. Deploy custom models. All with one line of code.

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});
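const model = "owner/model-name"; // placeholder: the model identifier is missing from this copy
const input = {
  prompt: "your prompt here", // placeholder: the prompt text is missing from this copy
};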
const [output] = await replicate.run(model, { input });
console.log(output);

With Replicate you can

Thousands of models contributed by our community

All the latest models are on Replicate. They're not just demos: they all actually work and have production-ready APIs.

AI shouldn't be locked up inside academic papers and demos. Make it real by pushing it to Replicate.

    krea / krea-2-medium

    Foundation image model from Krea, tuned for expressive illustration, anime, and painterly styles. Fast and consistent across artistic directions.

    9.8K runs

    Official

    alibaba / happyhorse-1.0

    Alibaba's Happy Horse 1.0 generates videos from text prompts or animates a single image into video. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.

    25.5K runs

    Official

    openai / gpt-image-2

    OpenAI's state-of-the-art image generation model. Create and edit images from text with strong instruction following, sharp text rendering, and detailed editing.

    10.1M runs

    Official

    anthropic / claude-opus-4.7

    Anthropic's most capable model with a step-change improvement in agentic coding, better vision, and stronger multi-step reasoning

    119.4K runs

    Official

    google / gemini-3.1-flash-tts

    Google's fast, expressive text-to-speech model with 30 voices and 70+ language support

    214.5K runs

    Official

    minimax / music-2.6

    Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics

    15K runs

    Official

    bytedance / seedance-2.0

    ByteDance's multimodal video generation model with native audio, multimodal reference inputs, and intelligent duration control.

    898.9K runs

    Official

    prunaai / p-video-avatar

    p-video-avatar is the fastest and cheapest avatar/lipsync video model on the market.

    79.5K runs

    Official

    bytedance / seedream-5-lite

    Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge

    2.7M runs

    Official

    xai / grok-imagine-video

    Generate videos using xAI's Grok Imagine Video model

    1.2M runs

    Official

    black-forest-labs / flux-2-max

    The highest fidelity image model from Black Forest Labs

    3.1M runs

    Official

    google / nano-banana-2

    Google's fast image generation model with conversational editing, multi-image fusion, and character consistency

    12.9M runs

    Official

How it works

You can get started with any model with just one line of code. But as you do more complex things, you can fine-tune models or deploy your own custom code.

Run models

Our community has already published thousands of models that are ready to use in production. You can run these with one line of code.

import replicate

output = replicate.run(
    "black-forest-labs/flux-dev",
    input={
        "aspect_ratio": "1:1",
        "num_outputs": 1,
        "output_format": "jpg",
        "output_quality": 80,
        "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic",
    },
)

print(output)
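
The call above can be wrapped in a small helper so that the input dictionary is built and checked in one place. This is a minimal sketch, not part of Replicate's client: `build_flux_input` and `generate` are hypothetical names, the 0 to 100 range for `output_quality` is an assumption, and the runner is passed in as an argument (for example `replicate.run`) so the helper can be tried without network access.

```python
from typing import Any, Callable, Dict

MODEL = "black-forest-labs/flux-dev"  # model identifier from the example above


def build_flux_input(prompt: str, aspect_ratio: str = "1:1",
                     output_format: str = "jpg",
                     output_quality: int = 80) -> Dict[str, Any]:
    """Build the input dictionary from the example above (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Assumption: quality is a percentage; the valid range is not stated in the text.
    if not 0 <= output_quality <= 100:
        raise ValueError("output_quality must be between 0 and 100")
    return {
        "aspect_ratio": aspect_ratio,
        "num_outputs": 1,
        "output_format": output_format,
        "output_quality": output_quality,
        "prompt": prompt,
    }


def generate(prompt: str, run: Callable[..., Any]) -> Any:
    """Call `run` with the model name and the built input, as in the example above."""
    return run(MODEL, input=build_flux_input(prompt))
```

With the client installed and a token configured, this could be called as `generate("An astronaut riding a rainbow unicorn", run=replicate.run)`.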

Fine-tune models with your own data

You can improve models with your own data to create new models that are better suited to specific tasks.

Image models like SDXL can generate images of a particular person, object, or style.

Train a model:

import replicate

training = replicate.trainings.create(
    destination="mattrothenberg/drone-art",
    version="ostris/flux-dev-lora-trainer:e440909d3512c31646ee2e0c7d6f6f4923224863a6a10c494606e79fb5844497",
    input={
        "steps": 1000,
        "input_images": "https://example.com/images.zip",
        "trigger_word": "TOK",
    },
)
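
Training takes a while, so a caller usually has to wait for it to finish before running the new model. The sketch below shows one way to poll for completion. It is an illustration under assumptions, not Replicate's API: `wait_for_training` is a hypothetical helper, the status names in `TERMINAL_STATUSES` are assumed, and the status lookup is passed in as a function so the loop can be tested without a real training.

```python
import time
from typing import Callable

# Assumption: these status strings mark a finished training.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_training(get_status: Callable[[], str],
                      poll_seconds: float = 10.0,
                      timeout_seconds: float = 3600.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll `get_status` until it returns a terminal status or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"training still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

In practice `get_status` would re-fetch the training created above and return its current status; the exact client call for that is not shown in this text.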

This will result in a new model:

mattrothenberg/drone-art

Fantastical images of drones on land and in the sky

0 runs

Then, you can run it with one line of code:

output = replicate.run(
    "mattrothenberg/drone-art:abcde1234...",
    input={"prompt": "a photo of TOK forming a rainbow in the sky"},
)
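
The examples on this page refer to models in two forms: `owner/name` (as in `black-forest-labs/flux-dev`) and `owner/name:version` (as in the fine-tuned model above). The following sketch splits such a string into its parts. `ModelRef` and `parse_model_ref` are hypothetical names introduced for illustration, and the accepted format is inferred only from the examples shown here.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    owner: str
    name: str
    version: Optional[str]  # None when no ":version" suffix is given


def parse_model_ref(ref: str) -> ModelRef:
    """Split "owner/name" or "owner/name:version" into its parts."""
    path, colon, version = ref.partition(":")
    owner, slash, name = path.partition("/")
    # Require exactly one "/" with non-empty text on both sides.
    if not slash or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/name[:version]', got {ref!r}")
    if colon and not version:
        raise ValueError(f"empty version in {ref!r}")
    return ModelRef(owner, name, version if colon else None)
```

A check like this can catch a malformed identifier before any request is sent.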

Deploy custom models

You aren't limited to the models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.

Cog takes care of generating an API server and deploying it on a big cluster in the cloud. We scale up and down to handle demand, and you only pay for the compute that you use.

First, define the environment your model runs in with cog.yaml:

build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.10"
  python_packages:
    - "torch==1.13.1"
predict: "predict.py:Predictor"

Next, define how predictions are run on your model with predict.py:

from cog import BasePredictor, Input, Path
import torch

# preprocess() and postprocess() are user-defined helpers that are not shown here.

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(self,
        image: Path = Input(description="Grayscale input image")
    ) -> Path:
        """Run a single prediction on the model"""
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)

Scale on Replicate

Thousands of businesses are building their AI products on Replicate. Your team can deploy an AI feature in a day and scale to millions of users, without having to be machine learning experts.

Learn more about our enterprise plans

Automatic scale

If you get a ton of traffic, Replicate scales up automatically to handle the demand. If you don't get any traffic, we scale down to zero and don't charge you a thing.

  • CPU: $0.000100/sec
  • Nvidia T4 GPU: $0.000225/sec
  • Nvidia L40S GPU: $0.000975/sec
  • 2x Nvidia L40S GPU: $0.001950/sec
  • Nvidia A100 (80GB) GPU: $0.001400/sec
  • 8x Nvidia A100 (80GB) GPU: $0.011200/sec
  • Learn more about pricing

Pay for what you use

Replicate only bills you for how long your code is running. You don't pay for expensive GPUs when you're not using them.
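
Per-second billing makes a rough cost estimate a single multiplication. The sketch below uses the prices listed above and assumes the charge is simply run time multiplied by the per-second rate; `estimate_cost` is a hypothetical helper, and any minimums, setup time, or other billing rules are not modeled.

```python
from decimal import Decimal

# Per-second prices in USD, copied from the list above.
PRICE_PER_SECOND = {
    "CPU": Decimal("0.000100"),
    "Nvidia T4 GPU": Decimal("0.000225"),
    "Nvidia L40S GPU": Decimal("0.000975"),
    "2x Nvidia L40S GPU": Decimal("0.001950"),
    "Nvidia A100 (80GB) GPU": Decimal("0.001400"),
    "8x Nvidia A100 (80GB) GPU": Decimal("0.011200"),
}


def estimate_cost(hardware: str, seconds: float) -> Decimal:
    """Estimate the cost in USD of running for `seconds` on `hardware`.

    Assumption: cost = seconds * per-second price, with nothing else added.
    """
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    # Convert via str() so float inputs such as 2.5 do not carry binary noise.
    return PRICE_PER_SECOND[hardware] * Decimal(str(seconds))
```

For example, 1,000 predictions of 4 seconds each on an Nvidia L40S GPU come to 4,000 seconds, which is 4000 × $0.000975 = $3.90 under this assumption.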

Forget about infrastructure

Deploying machine learning models at scale is hard. If you've tried, you know. API servers, weird dependencies, enormous model weights, CUDA, GPUs, batching.

Prediction throughput (requests per second)

Logging & monitoring

Metrics let you keep an eye on how your models are performing, and logs let you zoom in on particular predictions to debug how your model is behaving.