URL: https://docs.litellm.ai/docs/providers/mistral

Mistral AI API | liteLLM


https://docs.mistral.ai/api/

API Key

# env variable
os.environ['MISTRAL_API_KEY']

Sample Usage

from litellm import completion
import os

os.environ['MISTRAL_API_KEY'] = ""
response = completion(
    model="mistral/mistral-tiny",
    messages=[
        {"role": "user", "content": "hello from litellm"}
    ],
)
print(response)

Sample Usage - Streaming

from litellm import completion
import os

os.environ['MISTRAL_API_KEY'] = ""
response = completion(
    model="mistral/mistral-tiny",
    messages=[
        {"role": "user", "content": "hello from litellm"}
    ],
    stream=True,
)

# Each chunk carries a fragment of the streamed reply.
for chunk in response:
    print(chunk)

Usage with LiteLLM Proxy

1. Set Mistral Models on config.yaml

model_list:
  - model_name: mistral-small-latest
    litellm_params:
      model: mistral/mistral-small-latest
      api_key: "os.environ/MISTRAL_API_KEY" # ensure you have `MISTRAL_API_KEY` in your .env

2. Start Proxy

litellm --config config.yaml

3. Test it

Curl Request

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
    "model": "mistral-small-latest",
    "messages": [
        {
            "role": "user",
            "content": "what llm are you"
        }
    ]
}
'

OpenAI v1.0.0+

import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="mistral-small-latest",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)

Langchain

from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",  # set openai_api_base to the LiteLLM Proxy
    model="mistral-small-latest",
    temperature=0.1
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that I'm using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)
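
For environments without curl or the OpenAI client, the same test request can be assembled with the Python standard library. This is a minimal sketch that only builds the request and does not send it, so it runs without a proxy. `build_chat_request` is a hypothetical helper, and the `Authorization` header is an assumption: whether your proxy requires a key depends on how it is configured.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content, api_key="anything"):
    """Build (but do not send) a POST request for the proxy's chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the proxy accepts a bearer token here.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "http://0.0.0.0:4000", "mistral-small-latest", "what llm are you"
)
print(request.get_method(), request.full_url)
# To send it against a running proxy: urllib.request.urlopen(request)
```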

Supported Models

Info: All models listed at https://docs.mistral.ai/platform/endpoints are supported. We actively maintain the list of models, pricing, token windows, etc.

Model Name | Function Call | Reasoning Support
--- | --- | ---
Mistral Small | completion(model="mistral/mistral-small-latest", messages) | No
Mistral Medium | completion(model="mistral/mistral-medium-latest", messages) | No
Mistral Large 2 | completion(model="mistral/mistral-large-2407", messages) | No
Mistral Large Latest | completion(model="mistral/mistral-large-latest", messages) | No
Magistral Small | completion(model="mistral/magistral-small-2506", messages) | Yes
Magistral Medium | completion(model="mistral/magistral-medium-2506", messages) | Yes
Mistral 7B | completion(model="mistral/open-mistral-7b", messages) | No
Mixtral 8x7B | completion(model="mistral/open-mixtral-8x7b", messages) | No
Mixtral 8x22B | completion(model="mistral/open-mixtral-8x22b", messages) | No
Codestral | completion(model="mistral/codestral-latest", messages) | No
Mistral NeMo | completion(model="mistral/open-mistral-nemo", messages) | No
Mistral NeMo 2407 | completion(model="mistral/open-mistral-nemo-2407", messages) | No
Codestral Mamba | completion(model="mistral/open-codestral-mamba", messages) | No
Codestral Mamba | completion(model="mistral/codestral-mamba-latest", messages) | No

Function Calling

from litellm import completion
import os

# set env
os.environ["MISTRAL_API_KEY"] = "your-api-key"

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
    model="mistral/mistral-large-latest",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
# Add any assertions here to check the response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
    response.choices[0].message.tool_calls[0].function.arguments, str
)
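
The assertions above confirm that the tool name and arguments come back as strings; the arguments are a JSON-encoded string that still has to be decoded before a local function can be called. The sketch below shows that step. `run_tool_call`, `AVAILABLE_TOOLS`, and the canned `get_current_weather` implementation are hypothetical, and the fake tool call stands in for `response.choices[0].message.tool_calls[0]` so the example runs offline.

```python
import json
from types import SimpleNamespace


def get_current_weather(location, unit="fahrenheit"):
    # Hypothetical local implementation; returns canned data for the demo.
    return {"location": location, "temperature": "72", "unit": unit}


# Map tool names (as declared in `tools`) to local Python callables.
AVAILABLE_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_call(tool_call):
    """Decode the JSON argument string and call the matching function."""
    name = tool_call.function.name
    if name not in AVAILABLE_TOOLS:
        raise ValueError(f"unknown tool: {name}")
    kwargs = json.loads(tool_call.function.arguments)
    return AVAILABLE_TOOLS[name](**kwargs)


# Stand-in for response.choices[0].message.tool_calls[0].
fake_call = SimpleNamespace(
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Boston, MA", "unit": "celsius"}',
    )
)
result = run_tool_call(fake_call)
print(result)
```

Looking the name up in an explicit mapping, rather than calling whatever the model names, keeps the set of callable functions under your control.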

Reasoning

Mistral does not directly support reasoning. Instead, it recommends a specific system prompt for use with its magistral models. When you set the reasoning_effort parameter, LiteLLM prepends that system prompt to the request.

If an existing system message is provided, LiteLLM will send both as a list of system messages (you can verify this by enabling litellm._turn_on_debug()).
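
To make the resulting message order concrete, here is a small illustration of prepending a system message to an existing list. It is not LiteLLM's internal code: `with_reasoning_prompt` is a hypothetical helper, and `REASONING_PROMPT` is a placeholder rather than the actual prompt text. Enabling litellm._turn_on_debug() remains the way to see what is really sent.

```python
# Placeholder text; the real prompt LiteLLM sends is not reproduced here.
REASONING_PROMPT = "<reasoning system prompt placeholder>"


def with_reasoning_prompt(messages, reasoning_prompt=REASONING_PROMPT):
    """Return a new message list with the reasoning system message first."""
    return [{"role": "system", "content": reasoning_prompt}, *messages]


original = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "Explain how to solve quadratic equations."},
]
prepared = with_reasoning_prompt(original)
for message in prepared:
    print(message["role"], "->", message["content"])
```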

Supported Models

Model Name | Function Call
--- | ---
Magistral Small | completion(model="mistral/magistral-small-2506", messages)
Magistral Medium | completion(model="mistral/magistral-medium-2506", messages)

Using Reasoning Effort

The reasoning_effort parameter controls how much effort the model puts into reasoning when used with magistral models.

from litellm import completion
import os

os.environ['MISTRAL_API_KEY'] = "your-api-key"

response = completion(
    model="mistral/magistral-medium-2506",
    messages=[
        {"role": "user", "content": "What is 15 multiplied by 7?"}
    ],
    reasoning_effort="medium"  # Options: "low", "medium", "high"
)

print(response)

Example with System Message

If you already have a system message, LiteLLM will prepend the reasoning instructions:

response = completion(
    model="mistral/magistral-medium-2506",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "Explain how to solve quadratic equations."}
    ],
    reasoning_effort="high"
)

# The system message becomes:
# "When solving problems, think step-by-step in <think> tags before providing your final answer...
#
# You are a helpful math tutor."
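
Because the prompt asks the model to think inside <think> tags, the reply text may contain the reasoning and the final answer together. The sketch below separates the two with a regular expression. It assumes the reasoning arrives inline in the message content wrapped in `<think>...</think>`; depending on the LiteLLM version and model, the reasoning may instead be returned in a separate field, in which case this step is unnecessary. `split_reasoning` is a hypothetical helper.

```python
import re

# Non-greedy match so several <think> blocks are captured separately.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from text that may contain <think> blocks."""
    reasoning = "\n".join(part.strip() for part in THINK_PATTERN.findall(text))
    answer = THINK_PATTERN.sub("", text).strip()
    return reasoning, answer


sample = "<think>15 * 7 = 105</think>\nThe answer is 105."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```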

Usage with LiteLLM Proxy

You can also use reasoning capabilities through the LiteLLM proxy:

Curl Request

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "magistral-medium-2506",
    "messages": [
        {
            "role": "user",
            "content": "What is the square root of 144? Show your reasoning."
        }
    ],
    "reasoning_effort": "medium"
}'

OpenAI v1.0.0+

import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="magistral-medium-2506",
    messages=[
        {
            "role": "user",
            "content": "Calculate the area of a circle with radius 5. Show your work."
        }
    ],
    reasoning_effort="high"
)

print(response)

Important Notes

  • Model Compatibility: Reasoning parameters only work with magistral models
  • Backward Compatibility: Non-magistral models will ignore reasoning parameters and work normally
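
Since non-magistral models ignore the parameter, no guard is strictly needed. If you prefer request arguments that state the intent explicitly, a small helper can attach reasoning_effort only for magistral models. This is a sketch under one assumption: that a magistral model can be recognized by the substring "magistral" in its name. `completion_kwargs` is a hypothetical helper, not a LiteLLM function.

```python
def completion_kwargs(model, messages, reasoning_effort=None):
    """Build keyword arguments, adding reasoning_effort only where it applies."""
    kwargs = {"model": model, "messages": messages}
    # Assumption: magistral models are identified by "magistral" in the name.
    if reasoning_effort is not None and "magistral" in model:
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs


messages = [{"role": "user", "content": "What is 15 multiplied by 7?"}]
magistral_kwargs = completion_kwargs(
    "mistral/magistral-medium-2506", messages, reasoning_effort="medium"
)
small_kwargs = completion_kwargs(
    "mistral/mistral-small-latest", messages, reasoning_effort="medium"
)
print(magistral_kwargs)
print(small_kwargs)
```

The resulting dictionary can then be passed as `completion(**kwargs)`.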

Audio Transcription

Use Mistral's Voxtral models for audio transcription via litellm.transcription().

SDK Usage

from litellm import transcription
import os

os.environ["MISTRAL_API_KEY"] = ""

audio_file = open("path/to/audio.wav", "rb")

response = transcription(
    model="mistral/voxtral-mini-latest",
    file=audio_file,
)

print(response.text)
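
The sample above leaves the audio file open after the call. One way to make sure the handle is always closed is to wrap the call in a `with` block, sketched below. `transcribe_path` is a hypothetical helper that takes the transcription callable as an argument; in real use you would pass `litellm.transcription`. Here a fake callable and a temporary file are used so the example runs offline.

```python
import os
import tempfile


def transcribe_path(path, transcribe_fn, model="mistral/voxtral-mini-latest", **params):
    """Open `path` in binary mode, call `transcribe_fn`, and always close the file."""
    with open(path, "rb") as audio_file:
        return transcribe_fn(model=model, file=audio_file, **params)


def _fake_transcription(model, file, **params):
    # Stand-in for litellm.transcription so the demo runs without network access.
    return {"model": model, "bytes_read": len(file.read()), "params": params}


# Create a small placeholder file; it is not valid audio, just demo bytes.
with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
    tmp.write(b"\x00" * 16)

demo = transcribe_path(tmp.name, _fake_transcription, language="en")
os.remove(tmp.name)
print(demo)
```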

With Optional Parameters

response = transcription(
    model="mistral/voxtral-mini-latest",
    file=audio_file,
    language="en",
    temperature=0.0,
    response_format="json",
)

Mistral-Specific Parameters

Mistral supports additional parameters beyond the OpenAI-compatible ones:

Parameter | Type | Description
--- | --- | ---
diarize | bool | Enable speaker diarization

response = transcription(
    model="mistral/voxtral-mini-latest",
    file=audio_file,
    diarize=True,
)

Usage with LiteLLM Proxy

model_list:
  - model_name: voxtral
    litellm_params:
      model: mistral/voxtral-mini-latest
      api_key: os.environ/MISTRAL_API_KEY
    model_info:
      mode: audio_transcription

litellm --config /path/to/config.yaml

curl --location 'http://0.0.0.0:4000/v1/audio/transcriptions' \
--header 'Authorization: Bearer sk-1234' \
--form 'file=@"audio.wav"' \
--form 'model="voxtral"'

Sample Usage - Embedding

from litellm import embedding
import os

os.environ['MISTRAL_API_KEY'] = ""
response = embedding(
    model="mistral/mistral-embed",
    input=["good morning from litellm"],
)
print(response)
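
A common next step with embeddings is comparing two vectors. The sketch below computes cosine similarity in plain Python. It uses short hand-written vectors so it runs offline; with a real response, the vectors would come from the returned data, assuming the OpenAI-style shape where each item carries an "embedding" list of floats. `cosine_similarity` is a hypothetical helper, not part of LiteLLM.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
vec_a = [1.0, 0.0, 1.0]
vec_b = [1.0, 0.0, 1.0]
vec_c = [0.0, 1.0, 0.0]
print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
```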

Supported Models

All models listed at https://docs.mistral.ai/platform/endpoints are supported.

Model Name | Function Call
--- | ---
Mistral Embeddings | embedding(model="mistral/mistral-embed", input)