stepfun-ai/
$0.20 in / $1.15 out / $0.04 cached, per 1M tokens
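To make the per-token pricing concrete, here is a minimal sketch of a cost estimate for a single request. The function name `estimate_cost` is hypothetical, and the billing rule is an assumption: cached prompt tokens are charged at the cached rate in place of the input rate. Check the provider's billing documentation before relying on it.

```python
from decimal import Decimal

# USD prices per 1M tokens, copied from the listing above.
PRICE_PER_MILLION = {
    "input": Decimal("0.20"),
    "output": Decimal("1.15"),
    "cached": Decimal("0.04"),
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the portion of the prompt served from cache,
    billed at the cached rate instead of the input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)
```

Under these assumptions, a 100,000-token prompt with 80,000 cached tokens and a 2,000-token reply would be estimated with `estimate_cost(100_000, 2_000, cached_tokens=80_000)`.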
Step 3.7 Flash is an open-source multimodal reasoning model by StepFun with 198B total parameters (11B active) using Mixture of Experts. It accepts text and image inputs and features a 256K context window, selectable reasoning effort, tool calling, and agentic capabilities for coding and search workflows, scoring 80.9% on GPQA Diamond and 56.3% on SWE-bench Pro.
Step 3.7 Flash is an open-source frontier multimodal reasoning model by StepFun. Built on a sparse Mixture of Experts (MoE) architecture, it activates only ~11B of its 198B total parameters per token, and pairs its language backbone with a vision encoder for native image understanding, delivering state-of-the-art reasoning at a fraction of the cost of dense models.
- Image input: `image_url` content format
- Reasoning: `<think>` blocks, with selectable depth via `reasoning_effort` (low, medium, high). Reasoning is always on for this model.
- `response_format`

Step 3.7 Flash delivers strong results across search-heavy benchmarks. It scores 47.20% on HLE with Tools, up from 35.68% (text-only) for Step 3.5 Flash, and outperforms Flash models from DeepSeek V4 and Gemini 3.5. It reaches 75.82% on BrowseComp, approaching larger models such as Claude Opus 4.7 and GLM 5.1. On DeepSearchQA, it achieves a 92.82% F1 score, comparable to Kimi K2.6, a 1T / 32B-active model. On ResearchRubrics, it scores 71.68%, ahead of GPT 5.5 at 61.50% and close to Claude Opus 4.7 at 73.92%. These results show that Step 3.7 Flash combines Flash-level efficiency with strong deep-retrieval and research capabilities.
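The page mentions reasoning in `<think>` blocks. As a sketch only, the helper below separates such blocks from the final answer when both arrive in one string. Whether the endpoint ever returns raw tagged text is an assumption; the page's own example reads the reasoning from a separate `reasoning_content` field, in which case this helper is not needed. The function name `split_reasoning` is hypothetical.

```python
import re

# Assumed tag format: <think> ... </think>, as named on the model page.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) from text that may embed <think> blocks."""
    # Collect the contents of every <think> block, in order.
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(raw))
    # Whatever remains after removing the blocks is treated as the answer.
    answer = THINK_BLOCK.sub("", raw).strip()
    return reasoning, answer
```

Text without any tags passes through unchanged as the answer, with an empty reasoning string.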
| Specification | Value |
| --- | --- |
| Total Parameters | 198B |
| Active Parameters | ~11B per token |
| Context Window | 256K tokens |
| Modality | Text + Image |
| Reasoning | Always on; selectable effort (low / medium / high) |
| License | Apache 2.0 |
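Since reasoning effort accepts only three values, a small request builder can catch typos before a call is made. This is a sketch under stated assumptions: `build_chat_kwargs` is a hypothetical helper, its `"medium"` default is an arbitrary choice rather than the model's documented default, and passing `reasoning_effort` through `extra_body` follows the page's own example.

```python
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_chat_kwargs(prompt: str, effort: str = "medium") -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "stepfun-ai/Step-3.7-Flash",
        "messages": [{"role": "user", "content": prompt}],
        # Passed through extra_body, as in the page's example.
        "extra_body": {"reasoning_effort": effort},
    }
```

With an OpenAI-compatible client, the result could be unpacked as `client.chat.completions.create(**build_chat_kwargs("Plan a 3-day trip to Tokyo.", "high"))`.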
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="YOUR_DEEPINFRA_TOKEN",
)

# Chat with reasoning
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.reasoning_content)  # thinking
print(response.choices[0].message.content)  # answer

# Image understanding (multimodal)
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ]}],
)

# Control reasoning depth
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "Plan a 3-day trip to Tokyo."}],
    extra_body={"reasoning_effort": "high"},
)

# Tool calling
response = client.chat.completions.create(
    model="stepfun-ai/Step-3.7-Flash",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
```

Links

- https://huggingface.co/stepfun-ai/Step-3.7-Flash
- https://github.com/stepfun-ai/Step-3.7-Flash
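The tool-calling request in the example code only sends the tool schema; it does not show what to do when the model asks for a tool. The sketch below handles that step, assuming the assistant message follows the OpenAI SDK shape (`tool_calls`, each with `id`, `function.name`, and a JSON-string `function.arguments`). `TOOL_REGISTRY`, `run_tool_calls`, and the placeholder `get_weather` body are hypothetical.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Weather lookup for {city} is not implemented in this sketch."


# Maps tool names declared in the request to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message) -> list[dict]:
    """Run each tool call on an assistant message and build tool-role replies."""
    replies = []
    for call in message.tool_calls or []:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            result = f"Unknown tool: {call.function.name}"
        else:
            # Arguments arrive as a JSON string, so decode before calling.
            args = json.loads(call.function.arguments)
            result = func(**args)
        replies.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return replies
```

The returned messages would be appended to the conversation, after the assistant message that requested the tools, and sent back in a follow-up `chat.completions.create` call so the model can produce its final answer.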
© 2026 DeepInfra. All rights reserved.