Endpoint
POST https://api.stepfun.ai/v1/audio/voices/preview
For Step Plan, use
POST https://api.stepfun.ai/step_plan/v1/audio/voices/preview
Request parameters
- model (string, required)
  Model to use for cloning. Options: step-tts-2, step-tts-mini, stepaudio-2.5-tts.
- file_id (string, required)
  Reference audio file ID. Obtain via file upload; set purpose to storage.
- text (string, optional)
  Transcript of the reference audio. If omitted, automatic speech recognition is used. For best results, we recommend providing the transcript.
- sample_text (string, required)
  Text to synthesize for the preview. Recommended length: under 50 characters.
- response_format (string, optional)
  Audio format for the response. Options: wav, mp3, flac, opus, pcm. Default: mp3.
- speed (float, optional)
  Speaking rate. Range: 0.5–2.0. Default: 1.0.
- volume (float, optional)
  Volume level. Range: 0.1–2.0. Default: 1.0.
- voice_label (object, optional)
  Voice style tags for emotion and style control. Only one of language, emotion, or style may be set at a time.
  - language (string, optional)
    Language option: Cantonese, Sichuan dialect, Japanese.
  - emotion (string, optional)
    Emotion tag. See voice tags for supported options per model.
  - style (string, optional)
    Speaking or delivery style. See voice tags for supported options per model.
- instruction (string, optional)
  Global natural language guidance. Only effective with stepaudio-2.5-tts; other models will return an error if this parameter is provided. Sets the overall emotional tone and character for the audio. Max length: 200 characters.
- sample_rate (integer, optional)
  Sample rate in Hz. Options: 8000, 16000, 22050, 24000, 48000. Default: 24000. Higher values produce better quality but larger files.
- pronunciation_map (object array, optional)
  Custom pronunciation rules for specific characters or symbols.
  - tone (string, required)
    Pronunciation mapping separated by /. Example: ["word/wɜːrd"].
- markdown_filter (bool, optional)
  Whether to enable Markdown filtering for the input text.
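The limits listed above can be checked on the client before a request is sent. The following is a minimal Python sketch, not an official client: it assumes the endpoint accepts a JSON body and Bearer-token authorization (neither is stated in this section), and build_preview_payload and send_preview are hypothetical helper names.

```python
import json
import urllib.request

PREVIEW_URL = "https://api.stepfun.ai/v1/audio/voices/preview"
MODELS = {"step-tts-2", "step-tts-mini", "stepaudio-2.5-tts"}
FORMATS = {"wav", "mp3", "flac", "opus", "pcm"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 48000}


def build_preview_payload(model, file_id, sample_text, **optional):
    """Check parameters against the documented limits and return the request body."""
    if model not in MODELS:
        raise ValueError(f"unsupported model: {model!r}")

    speed = optional.get("speed")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    volume = optional.get("volume")
    if volume is not None and not 0.1 <= volume <= 2.0:
        raise ValueError("volume must be between 0.1 and 2.0")

    fmt = optional.get("response_format")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unsupported response_format: {fmt!r}")

    rate = optional.get("sample_rate")
    if rate is not None and rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {rate!r}")

    # Only one of language, emotion, or style may be set at a time.
    voice_label = optional.get("voice_label")
    if voice_label is not None and len(voice_label) > 1:
        raise ValueError("set only one of language, emotion, or style")

    # instruction is accepted only by stepaudio-2.5-tts, up to 200 characters.
    instruction = optional.get("instruction")
    if instruction is not None:
        if model != "stepaudio-2.5-tts":
            raise ValueError("instruction requires stepaudio-2.5-tts")
        if len(instruction) > 200:
            raise ValueError("instruction is limited to 200 characters")

    payload = {"model": model, "file_id": file_id, "sample_text": sample_text}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def send_preview(payload, api_key):
    """POST the payload. The JSON body and Bearer header are assumptions."""
    request = urllib.request.Request(
        PREVIEW_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Validating locally is optional; the server remains the authority on which values are accepted, and the sets above simply mirror the options listed in this section.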
Response
- sample_text (string)
  The text used for the preview audio.
- sample_audio (string)
  Preview audio in base64 format (WAV). Convert to a file for playback.
- request_id (string)
  Unique identifier for this request.
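Converting sample_audio to a playable file amounts to base64-decoding the string and writing the bytes to disk. A minimal Python sketch, assuming the response has already been parsed into a dict; save_preview_audio is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_preview_audio(response, path):
    """Decode the base64 sample_audio field and write it to path.

    Returns the number of bytes written.
    """
    audio_bytes = base64.b64decode(response["sample_audio"])
    Path(path).write_bytes(audio_bytes)
    return len(audio_bytes)
```

Pick a file extension that matches the audio format actually returned, so that players recognize the file.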
Example
- curl