Pricing and Rate Limits
Pricing Details
Pricing for Reasoning Models
| Model | Billing Unit | Input (Cache Miss) | Input (Cache Hit) | Output Price |
|---|---|---|---|---|
| step-3.5-flash | 1M tokens | $0.10 | $0.02 | $0.30 |
Note: step-3.5-flash is currently available at no cost on OpenRouter. Try it here: https://openrouter.ai/stepfun/step-3.5-flash:free
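As a rough illustration of how the per-token prices in the table combine, the sketch below estimates a charge from token counts. The function name `estimate_cost` and the idea of splitting input tokens into cache-miss and cache-hit counts up front are assumptions made for this example; the actual split is whatever the API reports for your request.

```python
# Prices in USD per 1M tokens for step-3.5-flash, copied from the table above.
PRICE_PER_MILLION = {
    "input_cache_miss": 0.10,
    "input_cache_hit": 0.02,
    "output": 0.30,
}


def estimate_cost(cache_miss_tokens: int, cache_hit_tokens: int, output_tokens: int) -> float:
    """Estimate the charge in USD for the given token counts (hypothetical helper)."""
    total = (
        cache_miss_tokens * PRICE_PER_MILLION["input_cache_miss"]
        + cache_hit_tokens * PRICE_PER_MILLION["input_cache_hit"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    # Prices are quoted per 1M tokens, so scale down at the end.
    return total / 1_000_000


if __name__ == "__main__":
    # 800k uncached input tokens, 200k cached input tokens, 100k output tokens.
    print(f"${estimate_cost(800_000, 200_000, 100_000):.4f}")
```

With these example counts the three components are $0.08, $0.004 and $0.03 respectively.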
Pricing for Speech Models
| Model | Model Type | Unit Price |
|---|---|---|
| step-tts-2 | Next-generation text-to-speech model | $50/M characters |
| step-tts-2 | Voice cloning model | Free (limited time) |
For billing purposes, one Chinese character counts as one character, while two English letters or two punctuation marks count as one character.
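The counting rule above can be sketched as follows. Several details are not stated in the pricing text and are assumptions here: whitespace is not billed, digits and any other symbols count as one full character each, and an odd half-character is rounded up. The helper names are hypothetical, so treat the result as an estimate rather than the platform's exact meter.

```python
import math
import unicodedata

# USD per 1M characters for step-tts-2 text-to-speech, from the table above.
PRICE_PER_MILLION_CHARS = 50.0


def billable_characters(text: str) -> int:
    """Estimate billable characters under the stated counting rule."""
    half_units = 0  # count in half-characters to avoid floating-point drift
    for ch in text:
        if ch.isspace():
            continue  # assumption: whitespace is not billed
        if ch.isascii() and ch.isalpha():
            half_units += 1  # two English letters count as one character
        elif unicodedata.category(ch).startswith("P"):
            half_units += 1  # two punctuation marks count as one character
        else:
            # Chinese characters count as one character each.
            # Assumption: digits and other symbols are also billed in full.
            half_units += 2
    return math.ceil(half_units / 2)  # assumption: round a trailing half up


def estimate_tts_cost(text: str) -> float:
    """Estimate the text-to-speech charge in USD for one piece of text."""
    return billable_characters(text) * PRICE_PER_MILLION_CHARS / 1_000_000


if __name__ == "__main__":
    sample = "你好，world!"
    print(billable_characters(sample), estimate_tts_cost(sample))
```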
Tiered Rate Limits
Top-Up and Rate Limits Table
To allocate resources fairly and prevent abuse, we apply rate limits based on your account's cumulative top-up amount, as shown below:
| User Tier | Cumulative Top-Up Amount | Concurrency | RPM | TPM |
|---|---|---|---|---|
| V0 | $0 | 5 | 10 | 5,000,000 |
| V1 | $15 | 100 | 1,000 | 20,000,000 |
| V2 | $70 | 200 | 5,000 | 30,000,000 |
| V3 | $300 | 400 | 10,000 | 40,000,000 |
| V4 | $700 | 1,000 | 20,000 | 50,000,000 |
| V5 | $1,500 | 10,000 | 200,000 | 100,000,000 |
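The table can be read as a lookup from cumulative top-up to limits. The sketch below encodes it that way, assuming each amount is an inclusive lower bound (for example, reaching exactly $15 places an account in V1); the table itself does not say how boundaries are handled. `Tier` and `tier_for` are hypothetical names used only for this example.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str
    min_top_up_usd: float
    concurrency: int
    rpm: int
    tpm: int


# Rows copied from the table above, ordered by ascending top-up threshold.
TIERS = [
    Tier("V0", 0, 5, 10, 5_000_000),
    Tier("V1", 15, 100, 1_000, 20_000_000),
    Tier("V2", 70, 200, 5_000, 30_000_000),
    Tier("V3", 300, 400, 10_000, 40_000_000),
    Tier("V4", 700, 1_000, 20_000, 50_000_000),
    Tier("V5", 1_500, 10_000, 200_000, 100_000_000),
]


def tier_for(cumulative_top_up_usd: float) -> Tier:
    """Return the highest tier whose threshold has been reached."""
    if cumulative_top_up_usd < 0:
        raise ValueError("cumulative top-up cannot be negative")
    current = TIERS[0]
    for tier in TIERS:
        # Assumption: thresholds are inclusive lower bounds.
        if cumulative_top_up_usd >= tier.min_top_up_usd:
            current = tier
    return current


if __name__ == "__main__":
    tier = tier_for(500)
    print(tier.name, tier.concurrency, tier.rpm, tier.tpm)
```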
Definitions
- Concurrency: the maximum number of requests that can be in progress at the same time
- RPM: requests per minute, the maximum number of requests you can send per minute
- TPM: tokens per minute, the maximum number of tokens you can process per minute
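To stay under RPM and TPM on the client side, one option is to track recent requests locally before sending. The sketch below uses a sliding 60-second window; this is an assumption, since the text does not describe how the server measures a minute, and the class `MinuteWindowLimiter` is hypothetical rather than part of any SDK.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Client-side guard for RPM and TPM over a sliding 60-second window."""

    def __init__(self, rpm: int, tpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events = deque()  # (timestamp, tokens) for each admitted request
        self.tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits within both limits."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.rpm:
            return False  # would exceed requests per minute
        if self.tokens_in_window + tokens > self.tpm:
            return False  # would exceed tokens per minute
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True


if __name__ == "__main__":
    limiter = MinuteWindowLimiter(rpm=10, tpm=5_000_000)  # V0 limits from the table
    print(limiter.try_acquire(tokens=2_000))
```

When `try_acquire` returns `False`, the caller can wait and retry. Concurrency is a separate limit and could be capped with, for example, a `threading.Semaphore` sized to the tier's concurrency value.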
Notes
- Our default rate limits are intended to allocate resources fairly. If you need higher or more stable limits, please contact our staff in advance; we will respond within two business days. Contact email: platform@stepfun.com
- We will do our best to ensure normal usage, but when resources reach capacity, we may take temporary throttling measures and adjust rate limits.