Documentation Index
Fetch the complete documentation index at: https://platform.stepfun.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
step-3.5-flash is StepFunβs flagship language reasoning model. It delivers top-tier reasoning quality and fast, reliable execution β decomposing and planning complex tasks, and reliably orchestrating tool calls. Suitable for logical reasoning, math, software engineering, deep research, and other complex workloads. Built on a 196B-parameter / 11B-activation sparse MoE architecture.
Key facts
Model type
Sparse MoE architecture
196B total params / 11B activated params
196B total params / 11B activated params
Context length
256K tokens
Best for
High-throughput reasoning + tool calling
Optimized for agent and coding workloads
Optimized for agent and coding workloads
Core capabilities
π High-throughput reasoning
Sparse MoE architecture delivers high throughput and low latency, ideal for real-time agent workflows and high-volume calls.
π οΈ Tool calling
Reliable
tools / tool_choice orchestration, supports multi-step task decomposition and plan execution.π§ Complex reasoning
Handles logical reasoning, math, software engineering, and deep research β a dependable foundation for long-chain agent reasoning.
Model variants
step-3.5-flash
Base version: general-purpose reasoning and tool calling β well-suited to most Agent and complex-task scenarios.
step-3.5-flash-2603
Agent-optimized version: tuned from
step-3.5-flash for high-frequency agent scenarios. Better token efficiency and faster inference, with an optional low-reasoning mode that significantly reduces token consumption. Improved compatibility with coding workflows and agent frameworks. Supports the reasoning_effort field (low / high).API endpoint
Chat Completion
POST /v1/chat/completionsOpenAI-compatible, with streaming and tool calling.
Pricing
| Item | Price (per million tokens) |
|---|---|
| Input (cache miss) | $0.10 |
| Input (cache hit) | $0.02 |
| Output | $0.30 |
Quickstart
- curl
- Python (OpenAI SDK)
Related reading
Reasoning model guide
Recommended usage of reasoning models for complex tasks, tool calling, and long contexts.
Step 3.7 Flash β multimodal flagship
Building on Step 3.5 Flash with native image and video understanding.