Model overview

Our reasoning models are built for deep analytical tasks, excelling at logical reasoning, math, and coding. The current lineup includes step-3.5-flash-2603 and step-3.5-flash, covering both optimized agent workflows and general high-complexity reasoning.

Models

step-3.5-flash-2603

step-3.5-flash-2603 is an optimized version of Step 3.5 Flash for high-frequency agent workflows and coding tasks. It improves token efficiency and reasoning speed while preserving strong tool-use and long-context performance. It also introduces a new Low Think Mode to help reduce token consumption in cost-sensitive reasoning scenarios.
  • Optimized for Agentic Workloads: Tuned from Step 3.5 Flash for high-frequency agent and automation scenarios.
  • Faster, Leaner Reasoning: Improved token efficiency and reasoning speed for iterative tasks.
  • Low Think Mode: Reduces reasoning token usage when you want a lighter-weight response path.
  • Better Coding Compatibility: Improved fit for coding workflows and agent frameworks.
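One way to act on the split between the two models in this lineup is a small routing helper. The sketch below is only an illustration: the model IDs come from this page, but the `Workload` fields and the `pick_model` rule are hypothetical names and a simplified heuristic, not an official selection policy.

```python
from dataclasses import dataclass

# Model IDs as listed on this page.
AGENT_MODEL = "step-3.5-flash-2603"
GENERAL_MODEL = "step-3.5-flash"


@dataclass
class Workload:
    """Hypothetical description of a job; these fields are assumptions."""
    high_frequency_agent: bool = False  # many short, iterative tool-calling turns
    coding_agent: bool = False          # runs inside a coding workflow or agent framework
    cost_sensitive: bool = False        # reasoning token usage should stay low


def pick_model(workload: Workload) -> str:
    """Return a model ID using a simplified, assumed rule of thumb."""
    if (
        workload.high_frequency_agent
        or workload.coding_agent
        or workload.cost_sensitive
    ):
        return AGENT_MODEL
    # Default to the general-purpose reasoning model.
    return GENERAL_MODEL


print(pick_model(Workload(high_frequency_agent=True)))
print(pick_model(Workload()))
```

In practice the boundary is likely fuzzier than three booleans, so treat the rule as a starting point to adjust against your own latency and cost measurements.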

step-3.5-flash

step-3.5-flash is our flagship general-purpose reasoning model, engineered for high-complexity tasks requiring deep logic and rapid execution. It excels at decomposing multi-step problems, executing tool calls, and maintaining coherence across massive datasets. It is the primary choice for complex workloads such as long-context agents, advanced software engineering, and comprehensive research automation.
  • Mixture of Experts (MoE) Architecture: Has 196B total parameters but activates only around 11B per token. This pairs the reasoning depth of ultra-large models with latency closer to that of lightweight models.
  • 256K Long Context: Maintains logical consistency when processing massive datasets or long documents.
  • Native Agent Capabilities: Orchestrates precise tool calling and multi-step reasoning, which makes it ideal for agents and automation.
  • Extreme Efficiency: Optimized for high throughput and cost-effective deployment without compromising reasoning quality.
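To make the MoE figures above concrete, the following sketch works out what share of the parameters is active for each token. It uses only the two numbers quoted on this page (196B total, around 11B active). The function name `active_fraction` is a hypothetical helper, and the result is approximate because the active count is itself given as "around 11B".

```python
# Figures quoted on this page; the active count is approximate.
TOTAL_PARAMS = 196e9
ACTIVE_PARAMS = 11e9


def active_fraction(active: float, total: float) -> float:
    """Share of parameters used per token (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per token: {frac:.1%}")  # prints "Active per token: 5.6%"
```

As a general property of MoE models, per-token compute tracks the active parameters rather than the total, which is where the latency advantage comes from. The full set of parameters typically still has to be held in memory for serving.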

Quickstart

Reasoning model best practices

See recommended prompting and usage patterns for complex reasoning workloads.