
Reasoning Model Development Guide

Step 3.5 Flash

step-3.5-flash is our flagship reasoning model, designed for high-complexity tasks requiring deep logic and fast execution. It features:

  • Mixture-of-Experts (MoE) Architecture: Combines a 196B-parameter knowledge base with sparse activation (around 11B parameters active per token) to deliver the logical depth of ultra-large models while keeping inference fast.
  • 256K Long Context: Maintains logical consistency when processing massive datasets or long documents, making it ideal for multi-stage reasoning and research workflows.
  • Native Agent Capabilities: Excels at tool call orchestration, multi-step problem decomposition, and long-context agent development, making it the preferred foundation for engineering and automation workloads.
  • Extreme Efficiency: Optimized for production throughput and cost-effective deployment without compromising on cutting-edge reasoning performance.
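
The efficiency claim follows directly from the parameter counts above: only a small fraction of the model participates in each forward pass. A quick back-of-the-envelope check (the 196B and 11B figures come from this guide; the code is illustrative arithmetic only):

```python
# Parameter counts taken from the feature list above (illustrative only).
TOTAL_PARAMS = 196e9          # total parameters in the MoE model
ACTIVE_PARAMS_PER_TOKEN = 11e9  # parameters activated per token

# Fraction of the model that runs on each token.
active_fraction = ACTIVE_PARAMS_PER_TOKEN / TOTAL_PARAMS
print(f"Active parameters per token: {active_fraction:.1%}")  # → 5.6%
```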

Chat Completion Example

The following code demonstrates how to use the step-3.5-flash model for logical reasoning.

import time

from openai import OpenAI

# Set your API Key and Base URL
BASE_URL = "https://api.stepfun.com/v1"
STEP_API_KEY = "YOUR_STEPFUN_API_KEY"

# Select Model
COMPLETION_MODEL = "step-3.5-flash"

# User Prompt
user_prompt = "How many 'r's are in the word strawberry?"

client = OpenAI(api_key=STEP_API_KEY, base_url=BASE_URL)

time_start = time.time()

try:
    response = client.chat.completions.create(
        model=COMPLETION_MODEL,
        messages=[
            {"role": "user", "content": user_prompt}
        ],
        stream=True
    )
except Exception as e:
    print("Exception occurred when requesting API:", e)
    exit(1)

print("Reasoning Process:")
try:
    for chunk in response:
        # Check for reasoning content
        if hasattr(chunk.choices[0].delta, 'reasoning') and chunk.choices[0].delta.reasoning:
            print(chunk.choices[0].delta.reasoning, end='', flush=True)
        # Check for standard content
        elif chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end='', flush=True)
except Exception as e:
    print("\nError occurred while processing streaming results:", e)

time_end = time.time()
print(f"\n\nTotal generation time: {time_end - time_start:.2f} seconds")

For input parameter details, refer to the Chat Completion Documentation.

Obtaining Reasoning Content

When StepFun’s reasoning models handle complex problems, they include a reasoning field in the output that exposes the model’s thinking process. Developers can check whether this field is present to retrieve the model’s reasoning.

if chunk.choices[0].delta.reasoning:
    reasoning = chunk.choices[0].delta.reasoning
    print("Model thinking process:", reasoning)

For non-streaming scenarios, you can directly extract the reasoning field to get the model’s thinking process.

msg = completion.choices[0].message.content
reasoning = completion.choices[0].message.reasoning
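
As a minimal sketch of the non-streaming flow, the extraction above can be wrapped in a small helper that also tolerates responses without a reasoning field. The `extract_reasoning` helper is hypothetical (not part of the SDK), and the stand-in message object only exists so the helper can be exercised offline:

```python
from types import SimpleNamespace

def extract_reasoning(message):
    """Return (content, reasoning) from a chat message object.

    `reasoning` is None when the model did not emit a thinking trace.
    """
    return message.content, getattr(message, "reasoning", None)

# With a real client, the message would come from a non-streaming call:
#
#   completion = client.chat.completions.create(
#       model="step-3.5-flash",
#       messages=[{"role": "user", "content": user_prompt}],
#       stream=False,
#   )
#   msg, reasoning = extract_reasoning(completion.choices[0].message)

# Stand-in message object so the helper runs without an API key:
fake_message = SimpleNamespace(
    content="There are 3 'r's in strawberry.",
    reasoning="Spell it out: s-t-r-a-w-b-e-r-r-y ...",
)
msg, reasoning = extract_reasoning(fake_message)
print("Answer:", msg)
print("Thinking:", reasoning)
```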

Notes

  • JSON Mode Limitation: The current version does not yet support JSON mode.
  • Error Handling and Logging: A Trace ID is added to model outputs. Please include this ID when reporting any issues with reasoning behavior.
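
When filing a report, the Trace ID can typically be read from the HTTP response headers. The header name shown below (X-Trace-Id) is an assumption, not confirmed by this guide, so inspect your actual response headers; the lookup is case-insensitive per HTTP convention:

```python
# Hypothetical sketch: the header name "X-Trace-Id" is an assumption,
# not confirmed by this guide; check your actual response headers.
def find_trace_id(headers):
    """Case-insensitive lookup of a trace id in an HTTP header mapping."""
    for name, value in headers.items():
        if name.lower() == "x-trace-id":
            return value
    return None

# With the OpenAI SDK, raw headers are available via the
# `with_raw_response` variant of the call, e.g.:
#
#   raw = client.chat.completions.with_raw_response.create(...)
#   trace_id = find_trace_id(raw.headers)
#   completion = raw.parse()

print(find_trace_id({"Content-Type": "application/json", "X-Trace-ID": "abc123"}))  # → abc123
```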