step-3.5-flash is StepFun’s flagship language reasoning model. It delivers top-tier reasoning quality and fast, reliable execution: it decomposes and plans complex tasks and reliably orchestrates tool calls. It is suitable for logical reasoning, math, software engineering, deep research, and other complex workloads. It is built on a sparse MoE architecture with 196B total parameters, of which 11B are activated.

Key facts

Model type

Sparse MoE architecture
196B total params / 11B activated params

Context length

256K tokens

Best for

High-throughput reasoning + tool calling
Optimized for agent and coding workloads

Core capabilities

🚀 High-throughput reasoning

Sparse MoE architecture delivers high throughput and low latency, ideal for real-time agent workflows and high-volume calls.

🛠️ Tool calling

Reliable orchestration through the tools and tool_choice request fields, with support for multi-step task decomposition and plan execution.
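
A request that uses these two fields might look like the sketch below. It assumes the OpenAI-style function-tool schema, which this page implies by calling the endpoint OpenAI-compatible; the `get_weather` tool and its parameters are hypothetical, and the snippet only builds the JSON body without sending it.

```python
import json

# Hypothetical tool definition in the OpenAI-style function schema.
# The exact shape StepFun accepts is an assumption; check the API reference.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(user_message, tools, tool_choice="auto"):
    """Build a chat-completions body that offers tools to the model."""
    return {
        "model": "step-3.5-flash",
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "tool_choice": tool_choice,  # "auto" lets the model decide whether to call a tool
    }


body = build_tool_request("What is the weather in Shanghai?", [WEATHER_TOOL])
print(json.dumps(body, indent=2))
```

In an agent loop, the caller would typically run whichever tool the model requests and send the result back in a follow-up message; that round trip is omitted here.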

🧠 Complex reasoning

Handles logical reasoning, math, software engineering, and deep research, making it a dependable foundation for long-chain agent reasoning.

Model variants

step-3.5-flash

Base version: general-purpose reasoning and tool calling, well suited to most agent and complex-task scenarios.

step-3.5-flash-2603

Agent-optimized version: tuned from step-3.5-flash for high-frequency agent scenarios. Better token efficiency and faster inference, with an optional low-reasoning mode that significantly reduces token consumption. Improved compatibility with coding workflows and agent frameworks. Supports the reasoning_effort field (low / high).
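
As a rough illustration of the low-reasoning mode, the sketch below builds a request body for step-3.5-flash-2603 with `reasoning_effort` set. Placing the field at the top level of the body is an assumption based on the OpenAI-compatible format, and `build_effort_request` is a hypothetical helper name.

```python
ALLOWED_EFFORTS = {"low", "high"}  # the two values this page lists


def build_effort_request(prompt, reasoning_effort="low"):
    """Build a request body that selects a reasoning effort level."""
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {sorted(ALLOWED_EFFORTS)}")
    return {
        "model": "step-3.5-flash-2603",
        "messages": [{"role": "user", "content": prompt}],
        # Assumed to be a top-level field, as in other OpenAI-compatible APIs.
        "reasoning_effort": reasoning_effort,
    }


low_effort_body = build_effort_request("Rename this variable across the file.")
print(low_effort_body["reasoning_effort"])
```

Validating the value on the client side keeps a typo from silently reaching the API, where its handling is not documented on this page.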

API endpoint

Chat Completion

POST /v1/chat/completions
OpenAI-compatible, with streaming and tool calling.
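
For streaming, a client usually reads the response line by line and stitches the text fragments together. The sketch below assumes the OpenAI-style server-sent event format (`data: {...}` lines with `choices[].delta.content`, ending in `data: [DONE]`); this page does not spell out the wire format, so treat that as an assumption. It runs on hard-coded sample lines instead of a live connection.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events, not real API output.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample_lines)))  # prints Hello
```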

Pricing

Item                  Price (per million tokens)
Input (cache miss)    $0.10
Input (cache hit)     $0.02
Output                $0.30
See full pricing details →
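
To make the prices concrete, the sketch below estimates the cost of a single request from its token counts. It assumes that cache-hit tokens are a subset of the input tokens and are billed at the cache-hit rate in place of the cache-miss rate; the full pricing page is the authority on how billing actually works.

```python
# Prices in USD per million tokens, copied from the pricing figures above.
PRICE_INPUT_CACHE_MISS = 0.10
PRICE_INPUT_CACHE_HIT = 0.02
PRICE_OUTPUT = 0.30


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Return an estimated cost in USD for one request."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * PRICE_INPUT_CACHE_MISS
        + cached_input_tokens * PRICE_INPUT_CACHE_HIT
        + output_tokens * PRICE_OUTPUT
    )
    return total / 1_000_000


# 1M input tokens (400K served from cache) and 200K output tokens:
# 0.6 * $0.10 + 0.4 * $0.02 + 0.2 * $0.30 = $0.128
cost = estimate_cost(1_000_000, 200_000, cached_input_tokens=400_000)
print(f"${cost:.4f}")  # prints $0.1280
```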

Quickstart

curl https://api.stepfun.ai/v1/chat/completions \
  -H "Authorization: Bearer $STEP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "step-3.5-flash",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself."}
    ]
  }'
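
The same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl command above; reading the reply from `choices[0].message.content` assumes the OpenAI-style response shape, and `build_chat_request` is a hypothetical helper name. The request is only sent when `STEP_API_KEY` is set.

```python
import json
import os
import urllib.request

API_URL = "https://api.stepfun.ai/v1/chat/completions"


def build_chat_request(api_key, prompt, model="step-3.5-flash"):
    """Build the POST request that the curl quickstart sends."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("STEP_API_KEY")
    if api_key:
        request = build_chat_request(api_key, "Hello, please introduce yourself.")
        with urllib.request.urlopen(request, timeout=60) as response:
            reply = json.load(response)
        # Assumes the OpenAI-style response shape.
        print(reply["choices"][0]["message"]["content"])
    else:
        print("Set STEP_API_KEY to send the request.")
```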

Reasoning model guide

Recommended usage of reasoning models for complex tasks, tool calling, and long contexts.

Step 3.7 Flash: multimodal flagship

Builds on Step 3.5 Flash with native image and video understanding.