/step_plan/v1/... path prefix, and the domain name is fixed as https://api.stepfun.ai.
Prerequisites
Audio synthesis models
Supported Models
| Model | Description |
|---|---|
| stepaudio-2.5-tts | Next-generation Contextual TTS based on context understanding, supporting dual-level control of global and in-text contexts. It generates human-like expressions with natural breathing rhythm, proper emphasis, and emotional arcs. |
Endpoint Paths
| Capability | Request Method | Step Plan Path |
|---|---|---|
| Non-streaming Audio Synthesis | POST | https://api.stepfun.ai/step_plan/v1/audio/speech |
| Streaming Audio Synthesis | WebSocket | wss://api.stepfun.ai/step_plan/v1/realtime/audio |
| Voice Preview | POST | https://api.stepfun.ai/step_plan/v1/audio/voices/preview |
| Voice Cloning | POST | https://api.stepfun.ai/step_plan/v1/audio/voices |
The endpoint parameters are identical to those of the open platform. For details, refer to the API documentation for each endpoint: Audio Synthesis, Streaming Audio Synthesis, Voice Cloning Preview, Voice Cloning.
Billing Instructions
The billing logic is consistent with the open platform: the amount billed on the open platform is converted into consumption of the Step Plan total quota. For unit prices, refer to Pricing and Rate Limits.
Examples
- curl
- Python (OpenAI SDK)
- Python (WebSocket Streaming)
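As a rough illustration of the non-streaming endpoint listed above, the following sketch builds (but does not send) a POST request to `/step_plan/v1/audio/speech` using only the Python standard library. The body fields `model`, `input`, and `voice` and the bearer-token header are assumptions modeled on the OpenAI-style speech API that the SDK tab suggests; the voice name `example-voice` and the helper `build_speech_request` are hypothetical. Check the Audio Synthesis API documentation for the actual parameters.

```python
import json
import urllib.request

# Fixed domain and Step Plan path prefix, as stated in this document.
STEP_PLAN_BASE = "https://api.stepfun.ai/step_plan/v1"


def build_speech_request(api_key, text, voice, model="stepaudio-2.5-tts"):
    """Build (without sending) a POST request for non-streaming synthesis."""
    # ASSUMPTION: OpenAI-style body fields; confirm against the API docs.
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{STEP_PLAN_BASE}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # ASSUMPTION: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "example-voice" is a placeholder, not a real voice ID.
req = build_speech_request("YOUR_API_KEY", "Hello from Step Plan.", "example-voice")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. If the assumptions hold, the response body would contain the synthesized audio bytes.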
Speech recognition models
Supported Models
| Model | Description |
|---|---|
| stepaudio-2.5-asr | New-generation streaming ASR model, 4B MTP architecture, targeting near-realtime transcription with low latency and high recognition accuracy |
Endpoint Paths
| Capability | Request Method | Step Plan Path |
|---|---|---|
| Speech Recognition (Streaming Output) | POST | https://api.stepfun.ai/step_plan/v1/audio/asr/sse |
The endpoint parameters are identical to those of the open platform. See Speech Recognition (Streaming Output) for details.
Capability Limitations
Under Step Plan, stepaudio-2.5-asr supports only HTTP + SSE calls, matching the open platform's capability boundary. Other real-time transports, such as WebSocket, are not supported.
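Because recognition results arrive as Server-Sent Events, a client needs to split the HTTP response into events. The sketch below is a minimal, generic SSE reader that collects `data:` lines and emits one payload per blank-line-terminated event. The helper name `iter_sse_data` and the JSON payload shape in the sample (`{"text": ...}`) are hypothetical; the real event format is defined in the Speech Recognition (Streaming Output) documentation.

```python
import json


def iter_sse_data(lines):
    """Yield the joined data payload of each SSE event from text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        # Comment lines (":...") and other fields (event:, id:, retry:)
        # are ignored in this sketch.
    if buffer:
        # Convenience: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)


# Hypothetical sample stream; the payload shape is an assumption.
sample = [
    'data: {"text": "hello"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": "hello world"}\n',
    "\n",
]
events = [json.loads(data) for data in iter_sse_data(sample)]
print(events)
```

In practice, `lines` would be the decoded lines of the streaming response from `https://api.stepfun.ai/step_plan/v1/audio/asr/sse`.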
Billing Instructions
The billing logic is consistent with the open platform: the amount billed on the open platform is converted into consumption of the Step Plan total quota. For unit prices, refer to Pricing and Rate Limits.
Examples
- curl