URL: https://www.together.ai/fine-tuning

Fine-Tuning | Together AI


Fine-Tuning

Fine-tune open-source models for real production use

Improve accuracy, reduce hallucinations, and control behavior, all without managing training infrastructure.

Why fine-tune models with Together AI?

Build models that are faster, more accurate, and fully yours

Reliable infrastructure at any scale

Multi-node orchestration that eliminates job failures. Fine-tune 100B+ models (DeepSeek-V3, Qwen3-235B) that break other platforms, with the reliability to experiment rapidly.

Research-driven performance gains

ML systems research built into every job. Train with 2-4x longer contexts at no extra cost, advanced DPO variants from SOTA recipes, and continuous optimizations that make your runs faster over time.

Universal model compatibility

Fine-tune any open-source model from Hugging Face Hub. No vendor lock-in and no format conversions, just seamless integration with your existing workflows.

Fine-tune leading models

Explore top-performing models across text, image, video, code, and voice.

  • Chat: DeepSeek V4 Pro
  • Chat: Qwen3.7-Max
  • Chat: NVIDIA Nemotron 3 Ultra
  • Chat: MiniMax M3
  • Chat: Kimi K2.7 Code
  • Chat: Qwen3.7-Plus
  • Chat: GLM-5.2
  • Chat: gpt-oss-120B
  • Chat: LFM2 24B A2B
  • Chat: Qwen3.5-397B-A17B
  • Chat: MiniMax M2.5
  • Chat: Qwen3-Coder-Next
  • Chat: Kimi K2.5
  • Image: Wan 2.6 Image
  • Image: GPT Image 1.5
  • Chat: Qwen3.5 9B
  • Chat: GLM-5.1
  • Chat: Gemma 4 31B

Have your own model?

Deploy custom containers on Together’s managed GPU infrastructure with automatic scaling, job queues, and built-in observability.

πŸ‘ Image

Fine-tuning options

Choose how fine-tuned models are trained and hosted based on dataset size, cost, and control.

  • LoRA fine-tuning

    Lightweight fine-tuning for fast iteration and lower cost.

    Best for
    Small to medium datasets
    Fast training & deployment
    Easy to update or roll back
  • Full fine-tuning

    Train the entire model for maximum control and quality.

    Best for
    Large or complex datasets
    Deeper behavior changes
    Dedicated infrastructure
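To make the "lightweight" versus "entire model" distinction concrete, here is a rough sketch of the trainable-parameter count for one weight matrix under each option. It assumes the standard LoRA formulation (a rank-r update B @ A added to a frozen weight); the hidden size and rank below are hypothetical values chosen for illustration, not settings taken from this page.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a low-rank update W + B @ A,
    where A is (rank x d_in) and B is (d_out x rank); W stays frozen."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 4096  # hypothetical hidden size
    rank = 16            # hypothetical LoRA rank
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full:,} trainable weights")
    print(f"LoRA (rank {rank}): {lora:,} trainable weights")
    print(f"LoRA / full: {lora / full:.2%}")
```

The small adapter is what makes LoRA runs quick to train, swap, and roll back, while full fine-tuning updates every weight and so needs more memory and compute.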

Everything you need to fine-tune at scale

Fine-tune any open-source model on your data. Deploy securely onto scalable infrastructure.

  • πŸ‘ Image
    {
     "model": "zai-org/GLM-5",
     "messages": [
     {
     "role": "user",
     "content": "What is the best GPU provider?"
     }
     ],
     "tools": [
     {
     "type": "function",
     "function": {
     "name": "web_search",
     "description": "Search the web for real-time information",
     "parameters": {
     "type": "object",
     "properties": {
     "query": {
     "type": "string",
     "description": "The search query"
     }
     },
     "required": ["query"]
     }
     }
     }
     ]
    }
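As a hedged illustration, the request body above could be built and sent from Python roughly as follows. The endpoint URL and the TOGETHER_API_KEY environment variable name are assumptions not stated on this page, and the helper names build_payload and send are hypothetical; check the provider's API reference before relying on any of them.

```python
import json
import os
import urllib.request

# Assumed chat-completions endpoint; not stated on this page.
API_URL = "https://api.together.xyz/v1/chat/completions"


def build_payload(model: str, prompt: str) -> dict:
    """Build the request body shown above: one user message plus one tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search",
                    "description": "Search the web for real-time information",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The search query",
                            }
                        },
                        "required": ["query"],
                    },
                },
            }
        ],
    }


def send(payload: dict, api_key: str) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = build_payload("zai-org/GLM-5", "What is the best GPU provider?")
    api_key = os.environ.get("TOGETHER_API_KEY")  # assumed variable name
    if api_key:
        print(json.dumps(send(payload, api_key), indent=2))
    else:
        # No key available: just show the body that would be sent.
        print(json.dumps(payload, indent=2))
```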

Powered by leading research

Our fine-tuning infrastructure is built on research and optimized for scale, efficiency, and production performance.

  • Chart: Throughput (TPS) for UPipe, FPDT, and ALST

    UPipe vs other SOTA Approaches

    82.5% less memory

    Long-context training hits a memory wall at the attention layer. UPipe processes attention heads in smaller chunks, cutting peak activation memory by up to 82.5% and enabling 5M token context lengths on a single 8×H100 node.

  • Chart: Context parallelism approaches on long-context training, Together AI (DCT) vs. Baseline (LD)

    FFT Optimizer results

    25% less memory

    Fine-tuning large models is memory-hungry. Our FFT-based optimizer replaces expensive SVD projections with fast Fourier transforms, reducing optimizer memory by up to 25% with no loss in training quality.

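The idea of processing attention heads in smaller chunks can be illustrated with a toy example. This is not UPipe's implementation, only a minimal pure-Python sketch of the general principle it builds on: attention heads are independent, so computing them a few at a time gives the same output while the intermediate score rows only need to exist for the heads in the current chunk. All function names here are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_one_head(q, k, v):
    """Scaled dot-product attention for one head.
    q, k, v are lists of T vectors; returns T output vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def attention_all_heads(q, k, v):
    """Process every head in one go (q, k, v are indexed [head][token][dim])."""
    return [attend_one_head(qh, kh, vh) for qh, kh, vh in zip(q, k, v)]


def attention_chunked(q, k, v, heads_per_chunk):
    """Process heads a few at a time; the result is identical because
    heads do not interact inside the attention computation."""
    out = []
    for start in range(0, len(q), heads_per_chunk):
        stop = start + heads_per_chunk
        out.extend(attention_all_heads(q[start:stop], k[start:stop], v[start:stop]))
    return out


def random_heads(n_heads, n_tokens, dim, rng):
    """Toy random inputs shaped [head][token][dim]."""
    return [[[rng.uniform(-1, 1) for _ in range(dim)]
             for _ in range(n_tokens)] for _ in range(n_heads)]


if __name__ == "__main__":
    rng = random.Random(0)
    q, k, v = (random_heads(4, 6, 8, rng) for _ in range(3))
    same = attention_all_heads(q, k, v) == attention_chunked(q, k, v, 2)
    print("chunked output matches:", same)
```

In a real training system the saving comes from how large the per-chunk buffers are on the GPU; the toy only demonstrates that chunking over heads does not change the result.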

Advanced model shaping capabilities

For teams pushing models beyond standard fine-tuning

  • Speculative decoding

    Accelerate inference with custom speculative decoding, training lightweight draft models to predict multiple tokens.

  • Quantization

    Apply FP8 and NVFP4 quantization to push the limits of model efficiency, maximizing hardware utilization with minimal quality loss.

  • Reinforcement learning

    Leverage PyTorch-based reinforcement learning to shape model policies for reasoning, tool use, and long-horizon agentic behavior.
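For readers unfamiliar with speculative decoding, the draft-and-verify loop can be sketched with toy stand-ins for the models. This is a simplified greedy variant under stated assumptions, not Together AI's implementation: toy_target and toy_draft are hypothetical deterministic functions standing in for a large model and a lightweight draft model, and verification is done token by token rather than in one batched forward pass.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes up to k tokens,
    the target checks them, and the first mismatch is replaced by the
    target's own token. The output equals greedy_decode(target, ...)."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1. Draft model proposes a short run of tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2. Target verifies each position. A real system would score all
        #    proposed positions in a single forward pass.
        for token in proposal:
            expected = target(seq)
            seq.append(expected)
            if expected != token:
                break  # discard the rest of the draft after a mismatch
    return seq


def toy_target(seq: List[int]) -> int:
    """Hypothetical 'large model': a deterministic next-token rule."""
    return (seq[-1] * 3 + len(seq)) % 11


def toy_draft(seq: List[int]) -> int:
    """Hypothetical 'draft model': agrees with the target most of the time."""
    if len(seq) % 5 == 0:
        return (toy_target(seq) + 1) % 11  # deliberate disagreement
    return toy_target(seq)


if __name__ == "__main__":
    prompt = [1]
    fast = speculative_decode(toy_target, toy_draft, prompt, n_new=20, k=4)
    slow = greedy_decode(toy_target, prompt, n_new=20)
    print("outputs match:", fast == slow)
```

The speedup in practice comes from the target model accepting several drafted tokens per forward pass; the better the draft model matches the target, the more tokens are accepted per step.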

Production-grade security and data privacy

We take security and compliance seriously, with strict data privacy controls to keep your information protected. Your data and models remain fully under your ownership, safeguarded by robust security measures.


SOC 2 Type II. HIPAA-aligned options available. Encryption in transit and at rest. Deploy storage in regions matching your data residency requirements: North America, Europe, or Asia/Middle East, based on your compliance needs.

Customers running inference in production

"Together AI does for fine-tuning and inference what Vercel does for LLM-based apps: it removes the infrastructure layer so we can focus on our product. We fine-tune and deploy customer-specific models through simple API calls. That lets our existing team move from weekly to daily iteration, cut costs by 2–3×, and improve accuracy from 77% to 87%."

"The technical challenge was running our multi-stage pipeline reliably at the conversation lengths our therapy models require," explains Daniel Cahn. "Together's platform eliminated the context length constraints and job failures we hit elsewhere, letting us experiment rapidly."

Daniel Cahn

Co-founder & CEO, Slingshot AI

"After thoroughly evaluating multiple LLM infrastructure providers, we’re thrilled to be partnering with Together AI for fine-tuning. The new ability to resume from a checkpoint combined with LoRA serving has enabled our customers to deeply tune our foundation model, ShieldLlama, for their enterprise’s precise risk posture. The level of accuracy would never be possible with vanilla open source or prompt engineering."

Alex Chung

Founder, Protege AI