Documentation Index

Fetch the complete documentation index at: https://platform.stepfun.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.

Model overview

Our text models are built on generative AI, with broad domain expertise to understand and generate natural language. They can reason, write creatively, and respond with emotional nuance to boost productivity across many fields. We currently offer the step-1 and step-2 series of production text models.

Models

step-2-mini

Recommended. An ultra-fast model built on our next-generation MFA attention architecture. It delivers step-1–level quality at much lower cost while offering higher throughput and lower latency. Excels at general tasks and coding with a 32k context length.

step-2

Our new MoE architecture at trillion-parameter scale. Its performance, conversational feel, and planning ability are on par with leading global models, covering a wide range of Chinese and English use cases and reflecting our latest scaling results.

step-2-16k-exp

An experimental build of step-2 with the latest features, updated continuously. Not recommended for production.

step-1

The classic hundred-billion–parameter series that handles complex language tasks. It improves productivity for writing, bilingual communication, Q&A, and logical reasoning, and it has strong math and coding skills for scientific computing and software development.

Context length

Context length is how much input a model can “look back” on when generating a response. A longer context lets the model use more history, improving coherence and accuracy. The limit applies to input and output combined: total tokens (not characters) cannot exceed the model’s context window.

The step-1 family balances price and context length, offering both cost-effective and ultra-long options. Larger context windows cost more, so choose a smaller window when you don’t need long inputs to reduce spend.

Model names encode the context length and capability tier. For example, step-1-8k is the step-1 series with an 8k window, and step-2-16k is the production step-2 with a 16k window. Here k means 1,000 tokens, so for an 8k model, input plus maximum output cannot exceed 8,000 tokens.

High-speed models
  • step-2-mini (context window 32k)
Trillion-parameter models
  • step-2-16k
  • step-2-16k-exp
Hundred-billion cost-effective series
  • step-1-8k
  • step-1-32k
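Because input and maximum output share one token budget, it can help to validate a request against the model's window before sending it. A minimal sketch, using the window sizes listed above; real token counts come from the model's tokenizer or the usage statistics in API responses, so the counts here are placeholders:

```python
# Sketch: check that input tokens plus requested output fit a model's
# context window. Window sizes are taken from the model list above.
CONTEXT_WINDOWS = {
    "step-1-8k": 8_000,
    "step-1-32k": 32_000,
    "step-2-16k": 16_000,
    "step-2-mini": 32_000,
}

def fits_context(model: str, input_tokens: int, max_output_tokens: int) -> bool:
    """True if input plus requested output stays within the model's window."""
    return input_tokens + max_output_tokens <= CONTEXT_WINDOWS[model]

# An 8k model: 6,000 input + 2,000 output is exactly at the limit.
ok = fits_context("step-1-8k", 6_000, 2_000)
too_big = fits_context("step-1-8k", 7_000, 2_000)
```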

Quickstart

Migrate from OpenAI

Switch existing OpenAI-compatible integrations to Stepfun with minimal code changes.
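For an OpenAI-compatible integration, typically only the base URL and API key change. A minimal stdlib sketch of building (not sending) a chat completions request; the base URL shown is an assumption for illustration, so confirm the actual endpoint in the platform docs:

```python
import json
import os
import urllib.request

# Assumed endpoint for illustration; check the platform docs for the
# real base URL. Versus an OpenAI integration, only BASE_URL and the
# API key need to change.
BASE_URL = "https://api.stepfun.com/v1"
API_KEY = os.environ.get("STEPFUN_API_KEY", "sk-...")

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_chat_request("step-2-mini", [{"role": "user", "content": "Hello"}])
# Sending it (requires a valid API key): urllib.request.urlopen(req)
```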

Multi-turn conversations

Store message history and pass context back to the model for continuous dialogue.
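The pattern above can be sketched as a running message list that is resent in full on every turn. The assistant replies here are hard-coded stand-ins; a real loop would read them from the API response:

```python
# Sketch: keep the full message history and resend it each turn so the
# model sees prior context. Replies are hard-coded for illustration.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_turn(history: list, user_text: str, assistant_text: str) -> list:
    """Append one user/assistant exchange to the running history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

add_turn(history, "What is the capital of France?", "Paris.")
add_turn(history, "And its population?", "About 2.1 million in the city proper.")
# Because the whole history is sent each time, the model can resolve
# "its" in the second question to Paris.
```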

JSON Mode

Return machine-parseable JSON so model output can plug into application logic.
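A minimal sketch of the request shape and how a JSON-mode reply plugs into application logic. The `response_format` field follows the OpenAI-compatible convention, which is an assumption here; confirm per-model support in the docs. The sample reply stands in for a real API response:

```python
import json

# Sketch: ask for machine-parseable JSON via response_format (assumed
# OpenAI-compatible convention), then parse the reply directly.
request_body = {
    "model": "step-2-16k",
    "messages": [
        {"role": "system",
         "content": "Reply in JSON with keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
    "response_format": {"type": "json_object"},
}

# Stand-in for the model's reply; JSON mode guarantees it parses.
sample_reply = '{"city": "Paris", "country": "France"}'
data = json.loads(sample_reply)
```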

Streaming responses

Stream tokens to the UI as they are generated for a faster perceived response.
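Streamed completions typically arrive as server-sent events, each carrying a small content delta. A sketch of assembling those deltas into the full text, assuming the OpenAI-compatible SSE format; the sample lines below stand in for what a real client would read from the HTTP response:

```python
import json

# Sketch: concatenate content deltas from SSE "data:" lines (assumed
# OpenAI-compatible stream format). Sample lines stand in for a live
# streaming response.
sample_sse_lines = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]

def collect_stream(lines) -> str:
    """Concatenate the content deltas from SSE data lines."""
    text = []
    for line in lines:
        payload = line.removeprefix("data: ").strip()
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

full_text = collect_stream(sample_sse_lines)
```

In a UI, each delta would be rendered as it arrives rather than collected first.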

Tool Call

Let the model invoke tools and external systems to complete real tasks.
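A sketch of the round trip: declare a tool, let the model request a call, run the matching local function, and return the result as a `tool` message. The schema follows the OpenAI-compatible function-calling convention, which is an assumption here; `get_weather` and the tool_call below are hypothetical stand-ins for real data:

```python
import json

# Sketch: tool declaration in the (assumed) OpenAI-compatible schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather API."""
    return f"Sunny in {city}"

# Stand-in for a tool_call the model would return:
tool_call = {"id": "call_1",
             "function": {"name": "get_weather",
                          "arguments": '{"city": "Shanghai"}'}}

# Arguments arrive as a JSON string; parse, dispatch, and package the
# result as a "tool" message so the model can finish its answer.
args = json.loads(tool_call["function"]["arguments"])
result = get_weather(**args)
tool_message = {"role": "tool",
                "tool_call_id": tool_call["id"],
                "content": result}
```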

Prompt cache

Reuse repeated context to reduce cost and improve latency in repeated requests.
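Prompt caches generally match on an identical leading prefix, so the practical pattern is to keep the static part of the prompt byte-for-byte the same across requests and put per-request content after it. A minimal sketch under that assumption (how cache matching actually works is described in the platform docs):

```python
# Sketch: keep the long, static instructions identical across requests
# so the shared prefix is cacheable; only the trailing user message
# varies. The policy text is a placeholder.
STATIC_SYSTEM = ("You are a support agent. Follow the policy below.\n"
                 "POLICY: ...")

def build_messages(user_question: str) -> list:
    return [
        {"role": "system", "content": STATIC_SYSTEM},   # identical prefix
        {"role": "user", "content": user_question},     # varies per request
    ]

a = build_messages("How do I reset my password?")
b = build_messages("How do I close my account?")
```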