StepFunโs flagship multimodal reasoning model. Powered by a 198B-parameter / 11B-activation sparse MoE architecture, with native support for image and video understanding.Documentation Index
Fetch the complete documentation index at: https://platform.stepfun.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Key facts
Model type
Sparse MoE architecture
198B total params / 11B activated params
198B total params / 11B activated params
Context length
256K tokens
Best for
High-throughput reasoning + native multimodal
Optimized for agent and coding workloads
Optimized for agent and coding workloads
Core capabilities
๐๏ธ Native multimodal
Native support for image and video understanding. Drop a file straight into the chat โ no separate vision model required inside your Agent framework.
๐ High-throughput reasoning
Sparse MoE architecture delivers high throughput and low latency, ideal for real-time agent workflows and high-volume calls.
๐ ๏ธ Tool calling
Reliable
tools / tool_choice orchestration, supports multi-step task decomposition and plan execution.๐ง Complex reasoning
Handles logical reasoning, math, software engineering, and deep research โ a dependable foundation for long-chain agent reasoning.
Reasoning effort
step-3.7-flash supports three reasoning effort levels โ pick one to match task complexity:
| Effort | Best for |
|---|---|
low | Simple Q&A, summarization, rewriting, information extraction |
medium | Default. General reasoning and multi-step tasks |
high | Complex reasoning, math, planning, code analysis |
The Chat Completions API uses
reasoning_effort to control the effort level; the Messages API uses output_config.effort. See the Quickstart for full call examples.Get started
Multimodal quickstart
Get started with images, video, local files, and reasoning-effort control.
Cookbook
Task templates for whiteboard-to-plan, chart-to-data, receipt-to-table, and more.
Mobile Agent
Connect to a real Android device via GELab-Zero and let the model plan operations from screenshots.
Chat Completion
POST /v1/chat/completionsOpenAI-compatible, with streaming and tool calling.
Pricing
| Item | Price (per million tokens) |
|---|---|
| Input (cache hit) | $0.04 |
| Input (cache miss) | $0.20 |
| Output | $1.15 |
Framework support
step-3.7-flash plugs reliably into popular Coding and Agent tools, well-suited to code generation, file editing, and complex task coordination in the terminal, IDE, or Agent workflows.
View Step Plan integration guide โ
Related reading
Reasoning model guide
Recommended usage of reasoning models for complex tasks, tool calling, and long contexts.
Image understanding best practices
A deeper look at image understanding API params, the detail setting, and best practices.
Video understanding best practices
A deeper look at video understanding API params, file limits, and common pitfalls.