StepFun provides developers with voice interaction models that support audio generation and voice cloning. By integrating these models, applications can extend beyond standard large language model understanding and enable voice interaction.
Quick Start
Quickly Generate an Audio Clip
Copy the following code to quickly generate an audio file.
Voice Recommendations by Scenario
StepFun offers dozens of recommended voices across seven major scenarios. You can preview different voices here and use them via the API. We strongly recommend using voice cloning to create custom voices. The step-tts-2 model delivers industry-leading cloning performance, and cloned voices support all emotion and style controls at zero additional cost.
1. Marketing
Marketing scenarios require voices with charisma, persuasiveness, and warmth that can effectively convey product value and inspire purchase intent. Step-TTS delivers full emotional expression, building trust and professionalism to make marketing content more compelling.
2. Customer Service
Customer service scenarios require voices that are warm, patient, and professional, capable of calming users and providing clear solutions. We offer two types of customer service voices. The step-tts-2 voices stand out with rich audio quality, full emotion, and a lifelike human feel, which makes the first four recommendations especially well suited to phone scenarios.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 | Straightforward Male | shuangkuainansheng | Sample 1 · Sample 2 · Sample 3 |
| stepaudio-2.5-tts / step-tts-2 | Capable Female | ganliannvsheng | Sample 1 · Sample 2 · Sample 3 |
| stepaudio-2.5-tts / step-tts-2 | Warm Female | qinhenvsheng | Sample 1 · Sample 2 · Sample 3 |
| stepaudio-2.5-tts / step-tts-2 | Energetic Female | huolinvsheng | Sample 1 · Sample 2 · Sample 3 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Elegant Gentle | elegantgentle-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Lively Breezy | livelybreezy-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Gentle Male | wenrounansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Classic Female | jingdiannvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Mature Gentle | wenroushunv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Sweet Female | tianmeinvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Pure Girl | qingchunshaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Spirited Male | yuanqinansheng | Sample 1 · Sample 2 |
3. Audiobook
Audiobooks require voices that are expressive and emotionally engaging, capable of vividly bringing different characters and story atmospheres to life. Our TTS stands out with its delicate emotional expression and versatile vocal styles, enabling listeners to fully immerse themselves in the world of the story.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 | Lively Girl | lively-girl | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Scholarly Gentleman | ruyananshi | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Gentle Female | wenrounvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Tender Gentleman | wenrougongzi | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Magnetic Male | cixingnansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Spirited Girl | yuanqishaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Upright Youth | zhengpaiqingnian | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Spirited Male | yuanqinansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Broadcast Male | boyinnansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Deep Male | shenchennanyin | Sample 1 · Sample 2 |
4. Emotional Companionship
Emotional companionship requires voices that are warm, gentle, and empathetic, capable of providing users with comfort and psychological support. Our TTS features delicate, soothing voice timbres with strong emotional expressiveness, helping you create a safe and comforting interaction environment for users.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 | Soft-spoken Gentleman | soft-spoken-gentleman | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Elegant Gentle | elegantgentle-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Lively Breezy | livelybreezy-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Gentle Male | wenrounansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Tender Gentleman | wenrougongzi | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Classic Female | jingdiannvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Friendly Female | qinqienvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Sweet Female | tianmeinvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Magnetic Male | cixingnansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Spirited Girl | yuanqishaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Girl Next Door | linjiajiejie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Scholarly Gentleman | ruyananshi | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Deep Male | shenchennanyin | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Gentle Female | wenrounvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Cute Soft Female | ruanmengnvsheng | Sample 1 · Sample 2 |
5. Voice Assistant
Voice assistant scenarios require voices that are clear, natural, and efficient, capable of accurately understanding and responding to user commands. Our TTS features natural prosody and full emotional expression, making your voice assistant both professional and approachable.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Elegant Gentle | elegantgentle-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Lively Breezy | livelybreezy-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Pure Girl | qingchunshaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Spirited Girl | yuanqishaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Girl Next Door | linjiajiejie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Scholarly Gentleman | ruyananshi | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Clever Girl | jilingshaonv | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Cute Soft Female | ruanmengnvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Kid Sister | linjiameimei | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Intellectual Lady | zhixingjiejie | Sample 1 · Sample 2 |
6. Video Dubbing
Video dubbing requires voices that are expressive, rhythmic, and visually evocative, capable of blending seamlessly with visual content. Our TTS excels in precise emotional delivery and fine-grained speech rhythm control, enhancing the impact and overall appeal of your videos.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 | Vibrant Youth | vibrant-youth | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 | Magnetic-voiced Male | magnetic-voiced-male | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Girl Next Door | linjiajiejie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Kid Sister | linjiameimei | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | College Student | qingniandaxuesheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Cute Soft Female | ruanmengnvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Elegant Female | youyanvsheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Cool Beauty | lengyanyujie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Intellectual Lady | zhixingjiejie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Bold Sister | shuangkuaijiejie | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Quiet Scholar | wenjingxuejie | Sample 1 · Sample 2 |
7. Education & Training
Education and training scenarios require voices that are clear, accurate, and inspiring, capable of effectively conveying knowledge and sparking learning interest. Our TTS excels at capturing the vocal characteristics of instructors across different emotional states.
| Supported Models | Voice Name | Voice ID | Audio Samples |
|---|---|---|---|
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Elegant Gentle | elegantgentle-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Gentle Male | wenrounansheng | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Lively Breezy | livelybreezy-female | Sample 1 · Sample 2 |
| stepaudio-2.5-tts / step-tts-2 / step-tts-mini | Mature Gentle | wenroushunv | Sample 1 · Sample 2 |
System Voice ID List
| Voice Name | Voice ID | Supported Models | Recommended Use Cases |
|---|---|---|---|
| Vibrant Youth | vibrant-youth | stepaudio-2.5-tts, step-tts-2 | Audiobook, video dubbing |
| Lively Girl | lively-girl | stepaudio-2.5-tts, step-tts-2 | Audiobook, video dubbing |
| Soft-spoken Gentleman | soft-spoken-gentleman | stepaudio-2.5-tts, step-tts-2 | Emotional companionship, audiobook |
| Magnetic-voiced Male | magnetic-voiced-male | stepaudio-2.5-tts, step-tts-2 | Audiobook, video dubbing |
| Confident Male | zixinnansheng | stepaudio-2.5-tts, step-tts-2 | Audiobook, emotional companionship, education, marketing |
| Elegant Gentle | elegantgentle-female | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Customer service, voice-over, education, emotional companionship |
| Lively Breezy | livelybreezy-female | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Emotional companionship, customer service, education, marketing |
| Gentle Male | wenrounansheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over, emotional companionship, customer service, education |
| Tender Gentleman | wenrougongzi | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Emotional companionship, audiobook |
| Spirited Male | yuanqinansheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, voice-over, customer service |
| Classic Female | jingdiannvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Customer service, emotional companionship |
| Mature Gentle | wenroushunv | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Customer service, voice-over, education |
| Sweet Female | tianmeinvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Emotional companionship, customer service |
| Pure Girl | qingchunshaonv | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Customer service, voice assistant |
| Magnetic Male | cixingnansheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, emotional companionship |
| Spirited Girl | yuanqishaonv | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, emotional companionship, voice assistant |
| Girl Next Door | linjiajiejie | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over, emotional companionship, voice assistant, video dubbing |
| Upright Youth | zhengpaiqingnian | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Marketing, audiobook |
| College Student | qingniandaxuesheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over |
| Broadcast Male | boyinnansheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, voice-over |
| Scholarly Gentleman | ruyananshi | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, emotional companionship, voice-over, voice assistant |
| Deep Male | shenchennanyin | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Emotional companionship, audiobook |
| Friendly Female | qinqienvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over |
| Gentle Female | wenrounvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Audiobook, emotional companionship |
| Clever Girl | jilingshaonv | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice assistant, voice-over |
| Cute Soft Female | ruanmengnvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Emotional companionship, voice assistant, video dubbing |
| Elegant Female | youyanvsheng | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Video dubbing |
| Cool Beauty | lengyanyujie | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Video dubbing |
| Bold Sister | shuangkuaijiejie | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over |
| Quiet Scholar | wenjingxuejie | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Voice-over |
| Kid Sister | linjiameimei | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Video dubbing, voice-over, voice assistant |
| Intellectual Lady | zhixingjiejie | stepaudio-2.5-tts, step-tts-2, step-tts-mini | Video dubbing, voice-over, voice assistant |
| Straightforward Male | shuangkuainansheng | stepaudio-2.5-tts, step-tts-2 | Customer service, voice assistant |
| Capable Female | ganliannvsheng | stepaudio-2.5-tts, step-tts-2 | Customer service, voice assistant |
| Warm Female | qinhenvsheng | stepaudio-2.5-tts, step-tts-2 | Customer service, voice assistant |
| Energetic Female | huolinvsheng | stepaudio-2.5-tts, step-tts-2 | Customer service, voice assistant |
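Because some voices are listed only for stepaudio-2.5-tts and step-tts-2, it can help to check a voice ID against the target model before sending a request. The following is a minimal sketch that copies a handful of rows from the table above into a dictionary. The names VOICE_MODELS, supports_model, and voices_for are illustrative and are not part of the StepFun API.

```python
# A few rows copied from the System Voice ID List above (not the full table).
ALL_MODELS = frozenset({"stepaudio-2.5-tts", "step-tts-2", "step-tts-mini"})
NO_MINI = frozenset({"stepaudio-2.5-tts", "step-tts-2"})

VOICE_MODELS = {
    "vibrant-youth": NO_MINI,
    "lively-girl": NO_MINI,
    "shuangkuainansheng": NO_MINI,
    "elegantgentle-female": ALL_MODELS,
    "wenrounansheng": ALL_MODELS,
    "linjiajiejie": ALL_MODELS,
}


def supports_model(voice_id: str, model: str) -> bool:
    """Return True if the excerpted table lists `model` for `voice_id`."""
    if voice_id not in VOICE_MODELS:
        raise KeyError(f"voice ID not in the local excerpt: {voice_id}")
    return model in VOICE_MODELS[voice_id]


def voices_for(model: str) -> list:
    """List the excerpted voice IDs usable with `model`, sorted by ID."""
    return sorted(v for v, models in VOICE_MODELS.items() if model in models)
```

For example, supports_model("lively-girl", "step-tts-mini") is False under this excerpt, since the table lists that voice only for the two larger models.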
Voice Tags List
Voice tags support three categories: speaking style, emotion, and language. Emotion tags must be set in the voice_label.emotion field, while speaking-style tags must be set in the voice_label.style field.
| No. | Tag Name | Tag Type | step-tts-2 | step-tts-mini |
|---|---|---|---|---|
| 1 | Happy | Emotion | ✓ | ✓ |
| 2 | Very Happy | Emotion | ✓ | ✓ |
| 3 | Sad | Emotion | ✓ | ✓ |
| 4 | Angry | Emotion | ✓ | ✓ |
| 5 | Very Angry | Emotion | ✓ | ✓ |
| 6 | Coquettish | Emotion | ✓ | ✓ |
| 7 | Slow | Speaking Style | ✓ | ✓ |
| 8 | Very Slow | Speaking Style | ✓ | ✓ |
| 9 | Fast | Speaking Style | ✓ | ✓ |
| 10 | Very Fast | Speaking Style | ✓ | ✓ |
| 11 | Fearful | Emotion | ✓ | ✓ |
| 12 | Surprised | Emotion | ✓ | ✓ |
| 13 | Excited | Emotion | ✓ | ✓ |
| 14 | Admiring | Emotion | ✓ | ✓ |
| 15 | Confused | Emotion | ✓ | ✓ |
| 16 | Cold | Delivery Style | ✓ | ✓ |
| 17 | Embarrassed | Delivery Style | ✓ | ✓ |
| 18 | Frustrated | Delivery Style | ✓ | ✓ |
| 19 | Proud | Delivery Style | ✓ | |
| 20 | Tender | Delivery Style | ✓ | |
| 21 | Sweet | Delivery Style | ✓ | |
| 22 | Outgoing | Delivery Style | ✓ | |
| 23 | Serious | Delivery Style | ✓ | |
| 24 | Arrogant | Delivery Style | ✓ | |
| 25 | Elderly | Delivery Style | ✓ | |
| 26 | Shouting | Delivery Style | ✓ | |
| 27 | Sarcastic | Delivery Style | ✓ | |
| 28 | Stuttering | Delivery Style | ✓ | |
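The routing rule stated before the table (emotion tags go in voice_label.emotion, speaking-style tags go in voice_label.style) can be sketched as a small helper. This is an illustration only. The tag sets are copied from the Emotion and Speaking Style rows of the table, the English tag names stand in as placeholder values because the exact strings the API accepts are not stated here, and build_voice_label is a hypothetical name. Rows typed Delivery Style are left out because the text does not say which field they belong in.

```python
# Tag names copied from the table; the literal strings the API expects
# are an assumption and may differ.
EMOTION_TAGS = {
    "Happy", "Very Happy", "Sad", "Angry", "Very Angry", "Coquettish",
    "Fearful", "Surprised", "Excited", "Admiring", "Confused",
}
STYLE_TAGS = {"Slow", "Very Slow", "Fast", "Very Fast"}


def build_voice_label(emotion=None, style=None) -> dict:
    """Build a voice_label object, rejecting tags placed in the wrong field."""
    label = {}
    if emotion is not None:
        if emotion not in EMOTION_TAGS:
            raise ValueError(f"not an emotion tag: {emotion!r}")
        label["emotion"] = emotion
    if style is not None:
        if style not in STYLE_TAGS:
            raise ValueError(f"not a speaking-style tag: {style!r}")
        label["style"] = style
    return label
```

Passing a speaking-style tag such as "Slow" as the emotion raises ValueError, which catches the most likely mix-up before a request is made.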
Output Format
StepFun TTS models support audio output in wav, mp3, flac, opus, and pcm formats. The default format is mp3. You can choose the format that best suits your use case.
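A minimal sketch of selecting an output format on the client side. It assumes the request carries the format in a field named response_format, which is borrowed from common speech APIs and is not confirmed on this page, so check the API reference for the actual field name. pick_format and output_filename are hypothetical helpers.

```python
SUPPORTED_FORMATS = ("wav", "mp3", "flac", "opus", "pcm")
DEFAULT_FORMAT = "mp3"  # documented default


def pick_format(requested=None) -> str:
    """Return a supported format name, falling back to the documented default."""
    if requested is None:
        return DEFAULT_FORMAT
    fmt = requested.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"unsupported format {requested!r}; choose one of {SUPPORTED_FORMATS}"
        )
    return fmt


def output_filename(stem: str, requested=None) -> str:
    """Name the output file after the chosen format, e.g. 'clip.mp3'."""
    return f"{stem}.{pick_format(requested)}"


# Assumed field name; only the format values come from this page.
request_fragment = {"response_format": pick_format("WAV")}
```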
Output Languages
StepFun TTS models support generating audio in Chinese, English, mixed Chinese-English, and Japanese.
FAQ
Do I own the audio I generate?
Yes. You own the audio you create. However, we recommend informing your end users that the audio was generated by AI so they are aware of its nature.
How do I adjust the volume of the generated audio?
You can set the volume parameter when calling the generation API. Valid values range from 0.1 to 2.0, representing 10% volume to 200% volume.
How do I adjust the speaking rate of the generated audio?
You can set the speed parameter when calling the generation API. Valid values range from 0.5 to 2.0, representing half-speed to double-speed.
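The documented ranges can be checked on the client before a request is sent. In this minimal sketch, volume and speed are the parameter names given above, while audio_options is a hypothetical helper. The defaults of 1.0 are an assumption (100% volume, normal speed), and how the options are attached to the request is left open.

```python
VOLUME_RANGE = (0.1, 2.0)  # 10% to 200% volume
SPEED_RANGE = (0.5, 2.0)   # half speed to double speed


def _check(name: str, value: float, bounds: tuple) -> float:
    """Raise ValueError if `value` falls outside the inclusive bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return float(value)


def audio_options(volume: float = 1.0, speed: float = 1.0) -> dict:
    """Validate volume and speed against the documented ranges.

    The 1.0 defaults are an assumption, not taken from this page.
    """
    return {
        "volume": _check("volume", volume, VOLUME_RANGE),
        "speed": _check("speed", speed, SPEED_RANGE),
    }
```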