VOOZH about

URL: https://evolink.ai/blog/minimax-m3-api-status

⇱ MiniMax-M3 API on EvoLink: Pricing & 1M Context


GLM-5.2 is now availableLearn more
Product Launch

MiniMax-M3 API on EvoLink: Pricing, ID & 1M Context

EvoLink Team
Product Team
June 1, 2026
7 min read

MiniMax M3 has started to attract developer attention after public discussion described it as a new-generation LLM for coding agents, long-context workflows, multimodal reasoning, and cost-efficient production use.

For teams building on EvoLink, the important question is practical: can you call MiniMax M3 through an API today, and should you plan production workloads around it yet?
The answer is now: yes. MiniMax-M3 went live on EvoLink on June 1, 2026. Developers can access M3 through both an OpenAI-compatible endpoint (/v1/chat/completions) and a native Anthropic Messages endpoint (/v1/messages) for Claude Code–style CLIs.
Now live β€” try it right away: MiniMax-M3 API

MiniMax M3 status at a glance

TopicStatus as of June 1, 2026What it means for developers
Public release signalConfirmedM3 is live on EvoLink
EvoLink route availabilityLiveDevelopers can access M3 through EvoLink today
Model IDMiniMax-M3Use this exact ID for SDK calls and routing config
PricingSee model page (input from ~$0.70/1M)EvoLink pricing is published on the model page
Context length~1M tokens (>512K billed at 2Γ— long-context tier)Plan budget for the long-context tier on large requests
Multimodal supportSupported (image / video / PDF)Send visual and document inputs for multimodal reasoning
EndpointsOpenAI-compatible /v1/chat/completions + native Anthropic /v1/messagesWorks with OpenAI SDK and Claude Code–style CLIs
Hugging Face / open model statusNot listed in the official MiniMaxAI models checkedDo not assume weights or license terms
For confirmed model ID, pricing, and context limits, see the MiniMax-M3 API page.

Why developers are watching MiniMax M3

The interest around MiniMax M3 is easy to understand. The public signal frames M3 around several things production AI teams care about:

  • Coding and agentic workloads where models need to plan, edit, call tools, and recover from mistakes.
  • Long-context tasks such as full codebase analysis, large contracts, long documents, and multi-file reasoning.
  • MiniMax Sparse Attention (MSA) as a reported architecture direction for handling very long context more efficiently.
  • Native multimodal reasoning for computer-use agents and product interfaces.
  • Lower-cost frontier-model routing now that EvoLink exposes published token pricing for M3 alongside fallback model routes.

These are exactly the kinds of workloads where a unified API gateway matters. A team evaluating M3 needs fallback options, cost controls, and a way to switch models without rewriting application code.

What still needs separate MiniMax technical documentation

The strongest public signal before launch was a social post attributed to Skyler Miao describing M3 as a new-generation LLM with MiniMax Sparse Attention for coding and agentic tasks. EvoLink now confirms the API facts that matter for its route: model ID, access, pricing, context tier, endpoint shape, and supported input modalities. Architecture claims and open-model assumptions still need separate MiniMax technical documentation.

Reported claimWhat needs official confirmation
MiniMax Sparse Attention architectureMiniMax technical docs or release notes
SOTA coding and agentic performanceOfficial benchmarks plus independent production-style evaluation
Lower cost than Sonnet or other open modelsSource-backed pricing comparison with exact model versions and dates
Open-model positioningOfficial repository, model weights, and license terms

This distinction matters. Developers can use the EvoLink route details below for integration planning, while treating MSA, benchmark, and open-weight claims as separate technical claims until MiniMax publishes source-backed documentation.

API availability, model ID, and pricing

For API users, the launch-critical pieces are now confirmed on EvoLink:

ItemCurrent statusWhy it matters
API availabilityLive on EvoLinkDevelopers can call M3 today
Model IDMiniMax-M3Required for SDK calls, routing config, and examples
PricingInput from ~$0.70/1M, output from ~$2.80/1M, cache reads from ~$0.14/1M for ≀512K contextRequired for budget planning and cost comparison
EndpointsOpenAI-compatible /v1/chat/completions and Anthropic Messages /v1/messagesCovers existing OpenAI SDK code and Claude Code-style clients
Context and billing~1M context, with tokens above 512K billed at the 2x long-context tierRequired for production cost planning on full-repo and long-document requests
MiniMax-M3 went live on EvoLink on June 1, 2026. See the MiniMax-M3 API page for the current route, pricing, and model details.

What EvoLink users can do now

MiniMax-M3 is live on EvoLink. Here is how to start evaluating it:

  • Use model ID MiniMax-M3 in SDK calls and routing config.
  • Keep your existing OpenAI-compatible integration for /v1/chat/completions, or use /v1/messages for Anthropic-compatible coding CLI workflows.
  • Run a focused test set for coding-agent, full-repo, multimodal, and long-document prompts before routing broad production traffic.
  • Keep fallback routes ready β€” MiniMax-M2.5 on EvoLink remains useful as a lower-cost MiniMax-family route for coding agents, repo Q&A, and long-context workflows.

What to measure before routing production traffic

MiniMax-M3 is ready to evaluate now. Before routing production traffic, measure:

  1. Model quality on your coding-agent and long-context tasks
  2. Cost by context size, especially requests above 512K tokens
  3. Cache-read savings for repeated system prompts and stable prefixes
  4. Multimodal input behavior for image, video, and PDF workflows
  5. Streaming, tool-use, and retry behavior in your existing agent stack
  6. Fallback routing behavior when a request exceeds your latency or cost target

FAQ

Is MiniMax M3 released?
Yes β€” MiniMax-M3 went live on EvoLink on June 1, 2026.
Is there a MiniMax M3 API?
Yes β€” MiniMax-M3 is accessible through EvoLink via the OpenAI-compatible /v1/chat/completions endpoint and the native Anthropic Messages /v1/messages endpoint.
What is the MiniMax M3 model ID?
Use MiniMax-M3 in the request body.
How much does MiniMax M3 cost?
EvoLink pricing starts at about $0.70 per 1M input tokens, $2.80 per 1M output tokens, and $0.14 per 1M cache-read tokens for requests within the ≀512K base tier. Tokens above 512K are billed at the 2x long-context tier. See MiniMax-M3 API for the latest pricing.
Does MiniMax M3 support 1M context?
Yes β€” the EvoLink route supports roughly 1M context, with a 2x long-context billing tier above 512K tokens.
Does MiniMax M3 support multimodal reasoning?
Yes β€” the EvoLink route supports image, video, and PDF input for multimodal reasoning.
Does EvoLink support MiniMax M3?
Yes β€” MiniMax-M3 is live on EvoLink. Start from the MiniMax-M3 API page.
What should I use alongside MiniMax-M3?
For lower-cost MiniMax-family fallback routes, keep MiniMax-M2.5 on EvoLink available. For coding-agent routing more broadly, compare confirmed models through EvoLink and keep fallback routes ready.

Related articles

Sources

Related Articles

Ready to Reduce Your AI Costs by 89%?

Start using EvoLink today and experience the power of intelligent API routing.