FAQ

Frequently Asked Questions

Audio models

What model should I use for text-to-speech?

Use step-tts-2. See Audio Models for an overview.

Where can I find the TTS API parameters?

See Generate audio for the full request schema and examples.

What audio formats are supported?

The supported output formats are wav, mp3, flac, opus, and pcm. The default is mp3.
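As a rough illustration, a client could check the requested format locally before sending a request. The sketch below is only a convenience helper, not part of the platform's API: the name `resolve_format` is hypothetical, and the way the format is passed in an actual request is described in Generate audio, not here.

```python
# Formats and default as listed in this FAQ.
SUPPORTED_FORMATS = ("wav", "mp3", "flac", "opus", "pcm")
DEFAULT_FORMAT = "mp3"


def resolve_format(requested=None):
    """Return a supported format name, falling back to the default.

    Hypothetical client-side helper; it does not call any API.
    """
    if requested is None:
        return DEFAULT_FORMAT
    fmt = requested.strip().lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format {requested!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return fmt
```

Validating locally like this only saves a round trip on an obviously invalid value; the service remains the authority on what it accepts.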

Is there a limit on input length?

Yes. The maximum input length is 1,000 characters per request.
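Longer text therefore has to be split across several requests. The following is a minimal sketch of one way to do that on the client side, assuming that Python's `len()` matches how the service counts characters (this FAQ does not say how characters are counted). The function name `split_for_tts` is hypothetical and no API call is made.

```python
MAX_CHARS = 1000  # per-request limit stated in this FAQ


def split_for_tts(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Breaks at the last space that fits; falls back to a hard cut
    when a run of text contains no space within the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Last space at an index <= limit, so the chunk before it fits.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: cut exactly at the limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as its own request and the resulting audio concatenated in order. Splitting at sentence boundaries instead of plain spaces would likely give more natural prosody, at the cost of a slightly more involved splitter.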

Last updated on March 17, 2026

2026 © StepFun Open Platform