Overview
| Property | Details |
|---|---|
| Description | Lambda AI provides access to a wide range of open-source language models through their cloud GPU infrastructure, optimized for inference at scale. |
| Provider Route on LiteLLM | lambda_ai/ |
| Link to Provider Doc | Lambda AI API Documentation |
| Base URL | https://api.lambda.ai/v1 |
| Supported Operations | /chat/completions |
LiteLLM supports all Lambda AI models. Prefix the model name with lambda_ai/ when sending completion requests.
Available Models
Lambda AI offers a diverse selection of state-of-the-art open-source models:
Large Language Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/llama3.3-70b-instruct-fp8 | Llama 3.3 70B with FP8 quantization | 8,192 tokens |
| lambda_ai/llama3.1-405b-instruct-fp8 | Llama 3.1 405B with FP8 quantization | 8,192 tokens |
| lambda_ai/llama3.1-70b-instruct-fp8 | Llama 3.1 70B with FP8 quantization | 8,192 tokens |
| lambda_ai/llama3.1-8b-instruct | Llama 3.1 8B instruction-tuned | 8,192 tokens |
| lambda_ai/llama3.1-nemotron-70b-instruct-fp8 | Llama 3.1 Nemotron 70B | 8,192 tokens |
DeepSeek Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/deepseek-llama3.3-70b | DeepSeek Llama 3.3 70B | 8,192 tokens |
| lambda_ai/deepseek-r1-0528 | DeepSeek R1 0528 | 8,192 tokens |
| lambda_ai/deepseek-r1-671b | DeepSeek R1 671B | 8,192 tokens |
| lambda_ai/deepseek-v3-0324 | DeepSeek V3 0324 | 8,192 tokens |
Hermes Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/hermes3-405b | Hermes 3 405B | 8,192 tokens |
| lambda_ai/hermes3-70b | Hermes 3 70B | 8,192 tokens |
| lambda_ai/hermes3-8b | Hermes 3 8B | 8,192 tokens |
Coding Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/qwen25-coder-32b-instruct | Qwen 2.5 Coder 32B | 8,192 tokens |
| lambda_ai/qwen3-32b-fp8 | Qwen 3 32B with FP8 | 8,192 tokens |
Vision Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/llama3.2-11b-vision-instruct | Llama 3.2 11B with vision capabilities | 8,192 tokens |
Specialized Models
| Model | Description | Context Window |
|---|---|---|
| lambda_ai/llama-4-maverick-17b-128e-instruct-fp8 | Llama 4 Maverick with 128k context | 131,072 tokens |
| lambda_ai/llama-4-scout-17b-16e-instruct | Llama 4 Scout with 16k context | 16,384 tokens |
| lambda_ai/lfm-40b | LFM 40B model | 8,192 tokens |
| lambda_ai/lfm-7b | LFM 7B model | 8,192 tokens |
Required Variables
Environment Variables
os.environ["LAMBDA_API_KEY"] = ""  # your Lambda AI API key
Usage - LiteLLM Python SDK
Non-streaming
Lambda AI Non-streaming Completion
import os
from litellm import completion

os.environ["LAMBDA_API_KEY"] = ""  # your Lambda AI API key

messages = [{"content": "Hello, how are you?", "role": "user"}]

# Lambda AI call
response = completion(
    model="lambda_ai/llama3.1-8b-instruct",
    messages=messages,
)
print(response)
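Printing the whole response shows every field, but most callers only need the reply text and the token counts. The following is a minimal sketch, assuming the response follows the OpenAI-style layout (choices[0].message.content and usage) that LiteLLM returns. The helper name `summarize_response` and the stand-in object are hypothetical and exist only so the example runs without a network call.

```python
from types import SimpleNamespace


def summarize_response(response):
    """Return the reply text and token counts from an OpenAI-style response."""
    message = response.choices[0].message
    usage = getattr(response, "usage", None)  # usage may be absent
    return {
        "text": message.content,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
    }


# Stand-in with the same attribute layout as a real response (assumption),
# so the sketch can run offline.
fake = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="I'm doing well."))],
    usage=SimpleNamespace(prompt_tokens=6, completion_tokens=5),
)
print(summarize_response(fake))
```

With a real call, pass the object returned by completion(...) in place of `fake`.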
Streaming
Lambda AI Streaming Completion
import os
from litellm import completion

os.environ["LAMBDA_API_KEY"] = ""  # your Lambda AI API key

messages = [{"content": "Write a short story about AI", "role": "user"}]

# Lambda AI call with streaming
response = completion(
    model="lambda_ai/llama3.1-70b-instruct-fp8",
    messages=messages,
    stream=True,
)

# Each chunk carries an incremental piece of the reply
for chunk in response:
    print(chunk)
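The loop above prints raw chunk objects. To rebuild the full reply, the text deltas can be joined as they arrive. This is a sketch under the assumption that chunks follow the OpenAI streaming layout (choices[0].delta.content); `collect_stream` and `_fake_chunk` are hypothetical names, and the fake chunks stand in for a live stream.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of an OpenAI-style chunk stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # The last chunk usually has no content, only a finish reason.
        text = getattr(delta, "content", None)
        if text:
            parts.append(text)
    return "".join(parts)


def _fake_chunk(text):
    """Build a stand-in chunk with the assumed attribute layout."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [_fake_chunk("Once "), _fake_chunk("upon a time"), _fake_chunk(None)]
print(collect_stream(demo))
```

With a real stream, pass the iterator returned by completion(..., stream=True) directly to `collect_stream`.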
Vision/Multimodal Support
The Llama 3.2 Vision model supports image inputs:
Lambda AI Vision/Multimodal
import os
from litellm import completion

os.environ["LAMBDA_API_KEY"] = ""  # your Lambda AI API key

messages = [{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "What's in this image?"
        },
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/image.jpg"
            }
        }
    ]
}]

# Lambda AI vision model call
response = completion(
    model="lambda_ai/llama3.2-11b-vision-instruct",
    messages=messages,
)
print(response)
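The example above references a hosted image. For a local file, the OpenAI message format also allows a base64 data URL in the image_url field. Whether the Lambda AI endpoint accepts data URLs is an assumption here, so treat this as a sketch to verify against the provider documentation. The helper names `image_to_data_url` and `build_vision_message` are hypothetical.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_vision_message(prompt, path):
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_to_data_url(path)}},
        ],
    }
```

The returned dict can be placed in the messages list in the same position as the URL-based message above.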
Function Calling
Lambda AI models support function calling:
Lambda AI Function Calling
import os
from litellm import completion

os.environ["LAMBDA_API_KEY"] = ""  # your Lambda AI API key

# Define tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
}]

messages = [{"role": "user", "content": "What's the weather in Boston?"}]

# Lambda AI call with function calling
response = completion(
    model="lambda_ai/hermes3-70b",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
print(response)
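When the model decides to call a tool, the reply carries tool calls instead of plain text, and the caller is expected to run them and send the results back. The sketch below assumes the OpenAI-style layout (message.tool_calls, with function.name and a JSON-encoded function.arguments string). The local `get_weather` implementation, the `run_tool_calls` helper, and the stand-in message are all hypothetical.

```python
import json
from types import SimpleNamespace


def get_weather(location):
    """Hypothetical local implementation of the get_weather tool."""
    return {"location": location, "forecast": "sunny"}


def run_tool_calls(message, registry):
    """Run each requested tool and build the follow-up `tool` messages."""
    results = []
    for call in getattr(message, "tool_calls", None) or []:
        func = registry[call.function.name]
        # Arguments arrive as a JSON-encoded string, not a dict.
        kwargs = json.loads(call.function.arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(func(**kwargs)),
        })
    return results


# Stand-in for response.choices[0].message (assumed layout), for offline use.
fake_message = SimpleNamespace(tool_calls=[SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="get_weather",
        arguments='{"location": "Boston, MA"}',
    ),
)])
print(run_tool_calls(fake_message, {"get_weather": get_weather}))
```

In a real exchange, append the assistant message and these tool messages to the conversation, then call completion again so the model can produce its final answer.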
Usage - LiteLLM Proxy Server
config.yaml
model_list:
  - model_name: llama-8b
    litellm_params:
      model: lambda_ai/llama3.1-8b-instruct
      api_key: os.environ/LAMBDA_API_KEY
  - model_name: deepseek-70b
    litellm_params:
      model: lambda_ai/deepseek-llama3.3-70b
      api_key: os.environ/LAMBDA_API_KEY
  - model_name: hermes-405b
    litellm_params:
      model: lambda_ai/hermes3-405b
      api_key: os.environ/LAMBDA_API_KEY
  - model_name: qwen-coder
    litellm_params:
      model: lambda_ai/qwen25-coder-32b-instruct
      api_key: os.environ/LAMBDA_API_KEY
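Once the proxy is running with this config, clients address models by their model_name alias rather than the lambda_ai/ route. The sketch below builds such a request with the standard library only. It assumes the proxy was started with `litellm --config config.yaml` and listens on http://localhost:4000, and that "sk-1234" is a placeholder proxy key; adjust all three to match your deployment. The helper `build_proxy_request` is hypothetical.

```python
import json
import urllib.request


def build_proxy_request(model_name, prompt,
                        base_url="http://localhost:4000",  # assumed proxy address
                        api_key="sk-1234"):                # placeholder key
    """Build an OpenAI-style chat completion request for the proxy."""
    body = json.dumps({
        "model": model_name,  # alias from config.yaml, e.g. "llama-8b"
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_proxy_request("llama-8b", "Hello, how are you?")
print(request.full_url)

# To send it against a running proxy:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client pointed at the proxy base URL should work the same way, since the proxy exposes the /chat/completions operation listed in the overview.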
Custom API Base
If you need to use a custom API base URL:
Custom API Base
import os
from litellm import completion

# Using environment variable
os.environ["LAMBDA_API_BASE"] = "https://custom.lambda-api.com/v1"
os.environ["LAMBDA_API_KEY"] = ""  # your API key

# Or pass directly
response = completion(
    model="lambda_ai/llama3.1-8b-instruct",
    messages=[{"content": "Hello!", "role": "user"}],
    api_base="https://custom.lambda-api.com/v1",
    api_key="your-api-key",
)
Supported OpenAI Parameters
Lambda AI supports all standard OpenAI parameters since it's fully OpenAI-compatible:
- temperature
- max_tokens
- top_p
- frequency_penalty
- presence_penalty
- stop
- n
- stream
- tools
- tool_choice
- response_format
- seed
- user
- logit_bias
Example with parameters:
Lambda AI with Parameters
from litellm import completion

# Assumes LAMBDA_API_KEY is already set in the environment
response = completion(
    model="lambda_ai/hermes3-405b",
    messages=[{"content": "Explain quantum computing", "role": "user"}],
    temperature=0.7,
    max_tokens=500,
    top_p=0.9,
    frequency_penalty=0.2,
    presence_penalty=0.1,
)
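When parameters come from user input or a shared config, it can help to check them against the supported list before making the call. The sketch below copies the parameter names from this page into a set and separates recognized keys from the rest. It is not LiteLLM's own validation; `SUPPORTED_PARAMS` and `split_params` are hypothetical names, and `top_k` is used only as an example of a key outside the list.

```python
# Parameter names copied from the supported list on this page.
SUPPORTED_PARAMS = {
    "temperature", "max_tokens", "top_p", "frequency_penalty",
    "presence_penalty", "stop", "n", "stream", "tools", "tool_choice",
    "response_format", "seed", "user", "logit_bias",
}


def split_params(params):
    """Separate listed parameters from keys that are not on the list."""
    supported = {k: v for k, v in params.items() if k in SUPPORTED_PARAMS}
    unknown = sorted(set(params) - SUPPORTED_PARAMS)
    return supported, unknown


supported, unknown = split_params(
    {"temperature": 0.7, "max_tokens": 500, "top_k": 40}
)
print(supported)
print(unknown)
```

The `supported` dict can then be unpacked into the call, for example completion(model=..., messages=..., **supported), while `unknown` can be logged or rejected.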
