MiniMax-M3 API on EvoLink: Pricing, ID & 1M Context
MiniMax M3 has started to attract developer attention after public discussion described it as a new-generation LLM for coding agents, long-context workflows, multimodal reasoning, and cost-efficient production use.
/v1/chat/completions) and a native Anthropic Messages endpoint (/v1/messages) for Claude Codeβstyle CLIs.MiniMax M3 status at a glance
| Topic | Status as of June 1, 2026 | What it means for developers |
|---|---|---|
| Public release signal | Confirmed | M3 is live on EvoLink |
| EvoLink route availability | Live | Developers can access M3 through EvoLink today |
| Model ID | MiniMax-M3 | Use this exact ID for SDK calls and routing config |
| Pricing | See model page (input from ~$0.70/1M) | EvoLink pricing is published on the model page |
| Context length | ~1M tokens (>512K billed at 2Γ long-context tier) | Plan budget for the long-context tier on large requests |
| Multimodal support | Supported (image / video / PDF) | Send visual and document inputs for multimodal reasoning |
| Endpoints | OpenAI-compatible /v1/chat/completions + native Anthropic /v1/messages | Works with OpenAI SDK and Claude Codeβstyle CLIs |
| Hugging Face / open model status | Not listed in the official MiniMaxAI models checked | Do not assume weights or license terms |
Why developers are watching MiniMax M3
The interest around MiniMax M3 is easy to understand. The public signal frames M3 around several things production AI teams care about:
- Coding and agentic workloads where models need to plan, edit, call tools, and recover from mistakes.
- Long-context tasks such as full codebase analysis, large contracts, long documents, and multi-file reasoning.
- MiniMax Sparse Attention (MSA) as a reported architecture direction for handling very long context more efficiently.
- Native multimodal reasoning for computer-use agents and product interfaces.
- Lower-cost frontier-model routing now that EvoLink exposes published token pricing for M3 alongside fallback model routes.
These are exactly the kinds of workloads where a unified API gateway matters. A team evaluating M3 needs fallback options, cost controls, and a way to switch models without rewriting application code.
What still needs separate MiniMax technical documentation
The strongest public signal before launch was a social post attributed to Skyler Miao describing M3 as a new-generation LLM with MiniMax Sparse Attention for coding and agentic tasks. EvoLink now confirms the API facts that matter for its route: model ID, access, pricing, context tier, endpoint shape, and supported input modalities. Architecture claims and open-model assumptions still need separate MiniMax technical documentation.
| Reported claim | What needs official confirmation |
|---|---|
| MiniMax Sparse Attention architecture | MiniMax technical docs or release notes |
| SOTA coding and agentic performance | Official benchmarks plus independent production-style evaluation |
| Lower cost than Sonnet or other open models | Source-backed pricing comparison with exact model versions and dates |
| Open-model positioning | Official repository, model weights, and license terms |
This distinction matters. Developers can use the EvoLink route details below for integration planning, while treating MSA, benchmark, and open-weight claims as separate technical claims until MiniMax publishes source-backed documentation.
API availability, model ID, and pricing
For API users, the launch-critical pieces are now confirmed on EvoLink:
| Item | Current status | Why it matters |
|---|---|---|
| API availability | Live on EvoLink | Developers can call M3 today |
| Model ID | MiniMax-M3 | Required for SDK calls, routing config, and examples |
| Pricing | Input from ~$0.70/1M, output from ~$2.80/1M, cache reads from ~$0.14/1M for β€512K context | Required for budget planning and cost comparison |
| Endpoints | OpenAI-compatible /v1/chat/completions and Anthropic Messages /v1/messages | Covers existing OpenAI SDK code and Claude Code-style clients |
| Context and billing | ~1M context, with tokens above 512K billed at the 2x long-context tier | Required for production cost planning on full-repo and long-document requests |
What EvoLink users can do now
MiniMax-M3 is live on EvoLink. Here is how to start evaluating it:
- Use model ID
MiniMax-M3in SDK calls and routing config. - Keep your existing OpenAI-compatible integration for
/v1/chat/completions, or use/v1/messagesfor Anthropic-compatible coding CLI workflows. - Run a focused test set for coding-agent, full-repo, multimodal, and long-document prompts before routing broad production traffic.
- Keep fallback routes ready β MiniMax-M2.5 on EvoLink remains useful as a lower-cost MiniMax-family route for coding agents, repo Q&A, and long-context workflows.
What to measure before routing production traffic
MiniMax-M3 is ready to evaluate now. Before routing production traffic, measure:
- Model quality on your coding-agent and long-context tasks
- Cost by context size, especially requests above 512K tokens
- Cache-read savings for repeated system prompts and stable prefixes
- Multimodal input behavior for image, video, and PDF workflows
- Streaming, tool-use, and retry behavior in your existing agent stack
- Fallback routing behavior when a request exceeds your latency or cost target
FAQ
Yes β MiniMax-M3 went live on EvoLink on June 1, 2026.
Yes β MiniMax-M3 is accessible through EvoLink via the OpenAI-compatible
/v1/chat/completions endpoint and the native Anthropic Messages /v1/messages endpoint.Use
MiniMax-M3 in the request body.EvoLink pricing starts at about $0.70 per 1M input tokens, $2.80 per 1M output tokens, and $0.14 per 1M cache-read tokens for requests within the β€512K base tier. Tokens above 512K are billed at the 2x long-context tier. See MiniMax-M3 API for the latest pricing.
Yes β the EvoLink route supports roughly 1M context, with a 2x long-context billing tier above 512K tokens.
Yes β the EvoLink route supports image, video, and PDF input for multimodal reasoning.
Yes β MiniMax-M3 is live on EvoLink. Start from the MiniMax-M3 API page.
Related articles
- MiniMax-M3 API on EvoLink - view the live model page, pricing, and model ID
- MiniMax-M2.5 API on EvoLink - keep a lower-cost MiniMax-family fallback route
- Best LLM for Coding Agents: API Cost, Tool Use, and Reliability Compared - compare production coding-agent options
- Qwen Coder API for Coding Agents - evaluate another coding-focused model family
- AI API Timeout, Retry, and Fallback Strategy - plan resilience across provider routes
Sources
- MiniMax API Docs: Models
- MiniMax model docs
- MiniMax pricing overview
- MiniMax pay-as-you-go pricing
- MiniMax token plan pricing
- MiniMaxAI models on Hugging Face
- Social demand signal attributed to Skyler Miao on X - tracked as a demand signal only, not as confirmation of API availability, pricing, model ID, or production behavior
