URL: https://crazyrouter.com/en/blog/ai-structured-output-json-mode-guide-2026

AI Structured Output Guide 2026: JSON Mode Across OpenAI, Claude, and Gemini

One of the most common developer pain points with LLMs is getting consistent, parseable structured output. JSON mode, structured outputs, and schema enforcement have evolved significantly. This guide covers how each approach works across OpenAI, Claude, and Gemini in 2026, and which production patterns keep the output reliable.

Why Structured Output Matters

Without reliable JSON output, every LLM integration needs brittle regex parsing, retry logic, and constant prompt tweaking. With proper structured output:

  • Parse responses directly without text cleaning
  • Integrate LLM outputs into databases and APIs reliably
  • Build deterministic workflows on top of non-deterministic models
  • Reduce malformed data (schema enforcement constrains the shape of the output, not its factual accuracy)

The Three Approaches to Structured Output

| Approach | Reliability | Flexibility | Support |
|---|---|---|---|
| 1. JSON Mode (hint only) | ⭐⭐⭐ | High | OpenAI, Gemini, most models |
| 2. Structured Outputs (schema-enforced) | ⭐⭐⭐⭐⭐ | Medium | OpenAI GPT-5, Gemini 3 |
| 3. Prompt Engineering (no enforcement) | ⭐⭐ | Highest | All models |

Approach 1: JSON Mode

JSON mode tells the model to output valid JSON, but doesn't enforce a specific schema. It's widely supported and reliable for well-defined prompts.
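
Because JSON mode only promises syntactically valid JSON, a cheap shape check after parsing catches missing or mistyped fields. The following is a minimal sketch using only the standard library; `check_shape` and `REQUIRED` are hypothetical names introduced here for illustration and are not part of any SDK.

```python
import json


def check_shape(raw: str, required: dict) -> list:
    """Return a list of problems; an empty list means the reply has the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for key, expected_type in required.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            problems.append(f"wrong type for {key}: expected {expected_type.__name__}")
    return problems


# Hypothetical field list for a contact-extraction prompt
REQUIRED = {"name": str, "email": str, "phone": str}

print(check_shape('{"name": "John Smith", "email": "john@acme.com"}', REQUIRED))
# ['missing key: phone']
```

A non-empty result is a reasonable trigger for a retry or a fallback to a schema-enforced call.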

OpenAI JSON Mode

python
import json

from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1",
)

response = client.chat.completions.create(
    model="gpt-5-mini",
    response_format={"type": "json_object"},  # Enable JSON mode
    messages=[
        {
            "role": "system",
            "content": "You are a data extractor. Always respond with valid JSON.",
        },
        {
            "role": "user",
            "content": """Extract the following from this text and return as JSON:

Text: "John Smith, senior engineer at Acme Corp, can be reached at john@acme.com or +1-555-0123."

Return: {"name": "...", "title": "...", "company": "...", "email": "...", "phone": "..."}""",
        },
    ],
)

# JSON mode yields parseable JSON, but does not guarantee a particular set of keys
data = json.loads(response.choices[0].message.content)
print(data)
# {"name": "John Smith", "title": "senior engineer", "company": "Acme Corp", ...}

Gemini JSON Mode

python
# Reuses `client` and `json` from the OpenAI example above
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "user",
            "content": "List 3 Python web frameworks as JSON: [{name, stars, use_case}]",
        }
    ],
)

data = json.loads(response.choices[0].message.content)

Claude JSON Mode (via prompt)

Claude doesn't have a native json_object response format in the API, but responds reliably with prompt engineering:

python
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {
            "role": "user",
            "content": """Analyze the sentiment of this review and return ONLY valid JSON:

Review: "The product arrived quickly but the build quality is disappointing."

Required JSON format:
{
 "sentiment": "positive|negative|mixed",
 "score": 0.0-1.0,
 "aspects": [{"aspect": "...", "sentiment": "..."}],
 "summary": "..."
}""",
        }
    ],
)

# Claude respects JSON-only instructions reliably
content = response.choices[0].message.content.strip()
# May need to strip markdown code blocks:
if content.startswith("```"):
    content = content.split("```")[1]
    if content.startswith("json"):
        content = content[4:]

data = json.loads(content.strip())
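
The fence-stripping above assumes the reply starts with a code fence. A slightly more tolerant approach, sketched here under the assumption that the reply contains exactly one JSON value somewhere in the text, scans for the first position where the standard library decoder succeeds. `extract_json` is a hypothetical helper name, not an SDK function.

```python
import json


def extract_json(text: str):
    """Return the first JSON object or array found in text, ignoring prose and fences."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch in "{[":
            try:
                # raw_decode parses one JSON value starting at `start` and
                # ignores whatever follows it
                value, _end = decoder.raw_decode(text, start)
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON object or array found in model reply")


fence = "`" * 3  # built this way only to keep this example easy to paste
reply = f'Here you go:\n{fence}json\n{{"sentiment": "mixed", "score": 0.4}}\n{fence}\nLet me know!'
print(extract_json(reply))
```

One limitation to keep in mind: if the surrounding prose itself contains something that parses as JSON (for example `[1]`), that earlier match is returned instead.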

Approach 2: Structured Outputs (Schema-Enforced)

OpenAI's Structured Outputs (introduced with gpt-4o-2024-08-06 and supported by later models, including the GPT-5 series) constrain generation so that the response conforms to a JSON Schema you supply. This is the gold standard for reliability.

OpenAI Structured Outputs with Pydantic

python
from typing import List, Optional

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1",
)

# Placeholder input; replace with the resume text you want to analyze
resume_text = "..."

# Define your expected schema
class JobCandidate(BaseModel):
    name: str
    years_experience: int
    skills: List[str]
    education: str
    salary_expectation: Optional[int] = None
    available: bool

class ResumeAnalysis(BaseModel):
    candidates: List[JobCandidate]
    top_pick: str
    reasoning: str

# Parse resumes against the schema
response = client.beta.chat.completions.parse(
    model="gpt-5-2",  # Must be a model that supports Structured Outputs
    messages=[
        {
            "role": "system",
            "content": "Extract candidate information from resumes.",
        },
        {
            "role": "user",
            "content": f"Analyze these resumes and rank the candidates:\n{resume_text}",
        },
    ],
    response_format=ResumeAnalysis,
)

# Fully typed, validated output (`parsed` is None if the model refused)
analysis = response.choices[0].message.parsed
print(f"Top pick: {analysis.top_pick}")
for candidate in analysis.candidates:
    print(f"- {candidate.name}: {candidate.years_experience} years, {candidate.skills}")

Structured Outputs with Raw JSON Schema

python
response = client.chat.completions.create(
    model="gpt-5-2",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "food", "other"],
                    },
                    "price": {"type": "number"},
                    "features": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                    "in_stock": {"type": "boolean"},
                },
                # Strict mode expects every property to be listed as required
                "required": ["product_name", "category", "price", "features", "in_stock"],
                "additionalProperties": False,
            },
        },
    },
    messages=[
        {"role": "user", "content": "Analyze this product: iPhone 16 Pro, $999, available now, 48MP camera, titanium design"}
    ],
)
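
To make concrete what the schema keywords above constrain, here is a small local checker for a subset of JSON Schema (`type`, `enum`, `required`, `properties`, `items`, `additionalProperties: false`). It is an illustrative sketch, not how any provider enforces schemas, and `validate` is a hypothetical helper; for full coverage a dedicated validator such as the `jsonschema` package is the better choice.

```python
def validate(value, schema, path="$"):
    """Check value against a small subset of JSON Schema; return a list of error strings."""
    type_checks = {
        "object": lambda v: isinstance(v, dict),
        "array": lambda v: isinstance(v, list),
        "string": lambda v: isinstance(v, str),
        "boolean": lambda v: isinstance(v, bool),
        # bool is a subclass of int in Python, so exclude it explicitly
        "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    if expected in type_checks and not type_checks[expected](value):
        return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in enum")
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        for key, item in value.items():
            if key in props:
                errors.extend(validate(item, props[key], f"{path}.{key}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{key}: unexpected property")
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Trimmed-down version of the product schema, for illustration only
schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "food", "other"]},
        "price": {"type": "number"},
    },
    "required": ["product_name", "category", "price"],
    "additionalProperties": False,
}
reply = {"product_name": "iPhone 16 Pro", "category": "phones", "price": 999, "color": "black"}
for error in validate(reply, schema):
    print(error)
```

Running the same check on replies from prompt-only providers gives them a comparable safety net, at the cost of a retry when the check fails.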

Provider Comparison for Reliability

Let's be practical about which providers are most reliable for structured output:

| Provider | JSON Mode | Schema Enforcement | Reliability | Notes |
|---|---|---|---|---|
| OpenAI GPT-5.2 | | ✅ (Structured Outputs) | ⭐⭐⭐⭐⭐ | Best-in-class |
| OpenAI GPT-5 Mini | | ✅ (Structured Outputs) | ⭐⭐⭐⭐⭐ | Fast + reliable |
| Gemini 2.5 Flash | | ✅ (responseSchema) | ⭐⭐⭐⭐ | Good for Google formats |
| Claude Sonnet 4.5 | Prompt-only | ❌ native | ⭐⭐⭐⭐ | Reliable with prompting |
| Claude Opus 4.6 | Prompt-only | ❌ native | ⭐⭐⭐⭐ | Best with complex schemas |
| DeepSeek V3.2 | | Limited | ⭐⭐⭐ | Good for simple schemas |
| Grok 4.1 Fast | | Limited | ⭐⭐⭐ | Improving with updates |

Gemini Structured Output with Response Schema

python
# Placeholder input; replace with the article you want to analyze
article_text = "..."

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "news_article",
            "schema": {
                "type": "object",
                "properties": {
                    "headline": {"type": "string"},
                    "topics": {"type": "array", "items": {"type": "string"}},
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "key_entities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "type": {"type": "string"},
                            },
                        },
                    },
                },
            },
        },
    },
    messages=[
        {"role": "user", "content": f"Analyze this news article:\n{article_text}"}
    ],
)

Production Patterns for Reliable JSON Output

Pattern 1: The Validator-Retry Loop

python
import json

from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1",
)

class ExtractedData(BaseModel):
    title: str
    author: str
    date: str
    summary: str

def extract_with_retry(text: str, max_retries: int = 3) -> ExtractedData:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="claude-sonnet-4-5",
                messages=[
                    {
                        "role": "user",
                        "content": f"""Extract information and return ONLY valid JSON matching exactly:
{{
 "title": "article title",
 "author": "author name",
 "date": "YYYY-MM-DD format",
 "summary": "one sentence summary"
}}

Text: {text}

Return only the JSON object, no other text.""",
                    }
                ],
            )

            content = response.choices[0].message.content.strip()
            # Clean up potential markdown wrapping
            if "```" in content:
                content = content.split("```")[1]
                if content.startswith("json"):
                    content = content[4:]

            data = json.loads(content.strip())
            return ExtractedData(**data)

        except (json.JSONDecodeError, ValidationError) as e:
            # Re-raise on the final attempt so the caller sees the real error
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed: {e}. Retrying...")

    # Only reached when max_retries is 0
    raise ValueError("Failed to extract valid JSON after retries")
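
The loop above resends the same prompt on every attempt. One common variation, sketched here as an assumption rather than something the pattern requires, is to feed the validation error back to the model so the next attempt can correct it. The model call is injected as a plain callable (`call_model`, a hypothetical parameter) so the logic can be exercised without network access.

```python
import json
from typing import Callable


def extract_with_feedback(
    call_model: Callable[[str], str],
    prompt: str,
    required_keys: set,
    max_retries: int = 3,
) -> dict:
    """Call the model until its reply parses as an object containing required_keys."""
    current_prompt = prompt
    last_error = "no attempts made"
    for _attempt in range(max_retries):
        reply = call_model(current_prompt)
        try:
            data = json.loads(reply)
            if not isinstance(data, dict):
                raise ValueError("top-level JSON value must be an object")
            missing = required_keys - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = str(exc)
            # Tell the model what was wrong with its previous reply
            current_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {last_error}\n"
                "Return only the corrected JSON object."
            )
    raise ValueError(f"no valid JSON after {max_retries} attempts: {last_error}")


# Stand-in for a real API call: fails twice, then succeeds
canned = iter(["not json", '{"title": "x"}', '{"title": "x", "author": "y"}'])
seen_prompts = []

def fake_model(prompt: str) -> str:
    seen_prompts.append(prompt)
    return next(canned)

result = extract_with_feedback(fake_model, "Extract title and author.", {"title", "author"})
print(result)
```

In real use, `call_model` would wrap a `client.chat.completions.create` call and return the message content.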

Pattern 2: Multi-Provider Fallback for Critical Data

python
import json

from openai import AsyncOpenAI

# The async client is needed because the calls below are awaited
client = AsyncOpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1",
)

async def extract_structured_data(text: str, schema: dict) -> dict:
    """Try multiple providers for critical structured extraction."""

    providers = [
        ("gpt-5-2", "openai_structured"),  # Best reliability
        ("gemini-2.5-flash", "gemini"),  # Good fallback
        ("claude-sonnet-4-5", "prompt"),  # Reliable with prompting
    ]

    for model, method in providers:
        try:
            if method == "openai_structured":
                response = await client.chat.completions.create(
                    model=model,
                    response_format={
                        "type": "json_schema",
                        "json_schema": {"name": "extraction", "schema": schema, "strict": True},
                    },
                    messages=[{"role": "user", "content": f"Extract data from:\n{text}"}],
                )
            else:
                # Both fallback providers take the prompt-only path here
                response = await client.chat.completions.create(
                    model=model,
                    messages=[{
                        "role": "user",
                        "content": f"Extract data and return JSON matching schema {json.dumps(schema)}:\n{text}",
                    }],
                )

            content = response.choices[0].message.content
            return json.loads(content)

        except Exception as e:
            # A parse failure also lands here and moves on to the next provider
            print(f"Provider {model} failed: {e}")
            continue

    raise RuntimeError("All providers failed for structured extraction")

Pattern 3: Node.js with Zod Validation

javascript
import OpenAI from 'openai';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const client = new OpenAI({
  apiKey: process.env.CRAZYROUTER_API_KEY,
  baseURL: 'https://crazyrouter.com/v1',
});

// Define schema with Zod
const ProductSchema = z.object({
  name: z.string(),
  price: z.number().positive(),
  category: z.enum(['electronics', 'clothing', 'food', 'other']),
  inStock: z.boolean(),
  features: z.array(z.string()).min(1),
});

async function extractProduct(description) {
  // Passing a name places the schema under `definitions.<name>`
  const jsonSchema = zodToJsonSchema(ProductSchema, 'product');

  const response = await client.chat.completions.create({
    model: 'gpt-5-mini',
    response_format: {
      type: 'json_schema',
      json_schema: {
        name: 'product',
        strict: true,
        schema: jsonSchema.definitions.product,
      },
    },
    messages: [
      { role: 'user', content: `Extract product data from: "${description}"` }
    ],
  });

  const rawData = JSON.parse(response.choices[0].message.content);

  // Validate with Zod
  return ProductSchema.parse(rawData);
}

// Usage (top-level await requires an ES module)
const product = await extractProduct(
  'The Sony WH-1000XM6 headphones cost $299, feature noise cancellation and 30hr battery, in stock.'
);
console.log(product);

Best Practices for Structured Output Prompts

1. Always show an example in the prompt

python
# Bad: Vague instruction
"Return the data as JSON"

# Good: Show exact expected format
"""Return exactly this JSON structure (no other text):
{
 "status": "success|error|pending",
 "message": "human readable description",
 "data": {"key": "value"}
}"""
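
One way to keep the example in the prompt in sync with the code that consumes the reply is to generate it from a single dictionary. This is a minimal sketch; `build_json_prompt` and `EXAMPLE` are hypothetical names used only for illustration.

```python
import json


def build_json_prompt(task: str, example: dict) -> str:
    """Embed a concrete example object in the prompt so the expected shape is explicit."""
    return (
        f"{task}\n\n"
        "Return exactly this JSON structure (no other text):\n"
        f"{json.dumps(example, indent=2)}"
    )


# The same dictionary can later drive a key check on the model's reply
EXAMPLE = {
    "status": "success|error|pending",
    "message": "human readable description",
    "data": {"key": "value"},
}

prompt = build_json_prompt("Report the result of the import job.", EXAMPLE)
print(prompt)
```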

2. For Claude: Request JSON inside the assistant turn

python
messages = [
    {"role": "user", "content": "Classify this email: " + email_text},
    {"role": "assistant", "content": "{"},  # Prime the response
]
# Claude will continue the JSON you started. The reply typically omits the
# primed "{", so prepend it before parsing.
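
When the response is primed this way, the model's reply is usually only the continuation, so the primed text has to be joined back on before parsing. A minimal sketch of that step follows; `parse_prefilled` is a hypothetical helper, and the echo check is an assumption about gateway behavior rather than documented fact.

```python
import json

PREFILL = "{"  # the text placed in the assistant turn


def parse_prefilled(continuation: str, prefill: str = PREFILL) -> dict:
    """Rebuild the full JSON text from the prefill plus the model's continuation."""
    text = continuation.lstrip()
    # Assumption: some gateways may echo the prefill back; avoid doubling it
    if not text.startswith(prefill):
        text = prefill + text
    return json.loads(text)


# Continuation only, as returned after priming with "{"
print(parse_prefilled('"category": "billing", "urgent": true}'))
# Full object, as a gateway that echoes the prefill might return it
print(parse_prefilled('{"category": "billing", "urgent": true}'))
```

The echo check is only unambiguous for a prefill such as `{`, where valid JSON cannot continue with the same character; longer prefills would need a stricter comparison.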

3. Keep schemas simple for less capable models

Complex nested schemas work well with GPT-5 series and Claude Opus. For faster/cheaper models, flatten the schema:

python
# For faster models (Haiku, Flash Lite)
simple_schema = {
    "sentiment": "positive|negative|neutral",
    "confidence": 0.95,
    "reason": "brief explanation",
}

# For powerful models (Opus, GPT-5.2)
complex_schema = {
    "sentiment": {
        "overall": "positive|negative|neutral",
        "aspects": [{"name": "...", "sentiment": "...", "keywords": []}],
        "confidence": 0.95,
    },
    "entities": [{"name": "...", "type": "PERSON|ORG|PRODUCT"}],
    "topics": [],
    "actionItems": [],
}

Frequently Asked Questions

Q: Which provider is most reliable for JSON output? A: OpenAI's Structured Outputs (GPT-5 series) offer the highest reliability with schema enforcement. Claude Opus and Sonnet are highly reliable with prompt engineering. All are accessible via Crazyrouter.

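Q: Does Claude support structured outputs natively? A: Claude has no `json_object`-style JSON mode, and the Claude examples in this guide rely on prompting and the "prime the response" technique, which are highly reliable with well-crafted prompts. Anthropic's own API has offered schema-constrained structured outputs for some Claude models since late 2025 (initially as a beta), so check the current Anthropic documentation and whether your gateway passes a `json_schema` response format through to Claude.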

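Q: What's the difference between JSON mode and structured outputs? A: JSON mode hints the model to return valid JSON but doesn't enforce a schema. Structured outputs constrain generation to match your exact schema, so a completed response conforms to it. The remaining failure cases are refusals and responses cut off by a token limit, which still need handling.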

Q: Can I use structured outputs with streaming? A: Yes, with OpenAI's API. You stream chunks and assemble the JSON at the end. Partial JSON parsing is also possible for progressive UI updates.

Q: What models support structured outputs via Crazyrouter? A: All OpenAI GPT-5 series models support full structured outputs. Gemini 2.5+ supports responseSchema. Claude uses prompt-based JSON. All available at crazyrouter.com.
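
For the streaming answer above, progressive updates need a parser that tolerates an unfinished document. A best-effort sketch: track open strings, objects, and arrays, append the missing closers, and try to parse. `parse_partial` is a hypothetical helper written for this guide, not part of any SDK, and the chunks are made-up sample data.

```python
import json


def parse_partial(buffer: str):
    """Best-effort parse of an incomplete JSON document; None if not yet parseable."""
    stack = []          # closers still owed for open objects and arrays
    in_string = False
    escaped = False
    for ch in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    # Close an open string first, then the open containers innermost-first
    candidate = buffer + ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None  # e.g. cut off mid-key or right after a comma


chunks = ['{"headline": "Rates', ' hold", "topics": ["econ', 'omy", "banks"]', "}"]
buffer = ""
snapshots = []
for chunk in chunks:
    buffer += chunk
    snapshots.append(parse_partial(buffer))

for snapshot in snapshots:
    print(snapshot)
```

Intermediate snapshots are provisional: a string or number may be truncated until later chunks arrive, so only the final parse should be persisted.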

Summary

In 2026, structured output reliability has dramatically improved:

  • Best reliability: OpenAI Structured Outputs (GPT-5 series) → guaranteed schema compliance
  • Good balance: Gemini 2.5 Flash with responseSchema
  • Reliable with prompting: Claude Sonnet/Opus with clear examples
  • For production: Use validator-retry patterns as a safety net

Access all these models through a single API at Crazyrouter — no need to manage multiple API keys for different providers.

Start building with structured AI outputs at Crazyrouter
