StepFun’s latest lightweight editing model. A single model supports both text-to-image generation and image editing. Within the under-6B parameter range, it sets the performance benchmark for its tier and competes with open-source models in the 12B-20B range. Each editing task takes only 1-2 seconds, which makes real-time interactive image editing practical.
Showcase
See official sample prompts and generated results.
API quickstart
View minimal runnable curl examples.
Key information
| Item | Value | Notes |
|---|---|---|
| Parameters | Under 6B | Lightweight generation and editing |
| Prompt length | 512 characters | |
| Input image limit | 4096x4096 | Image editing scenarios |
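The limits under Key information can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the official API. The function name `check_edit_input` is hypothetical. It assumes that characters are counted as Unicode code points and that the 4096x4096 limit applies to each side of the input image independently; the source states neither.

```python
MAX_PROMPT_CHARS = 512   # "Prompt length" under Key information
MAX_IMAGE_SIDE = 4096    # "Input image limit" of 4096x4096 (image editing only)


def check_edit_input(prompt: str, width: int, height: int) -> list:
    """Return a list of problems; an empty list means no limit was exceeded."""
    problems = []
    # Assumption: the 512-character limit counts Unicode code points (Python len()).
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt has {len(prompt)} characters, limit is {MAX_PROMPT_CHARS}"
        )
    # Assumption: the limit applies to width and height separately.
    if width > MAX_IMAGE_SIDE or height > MAX_IMAGE_SIDE:
        problems.append(
            f"image is {width}x{height}, limit is {MAX_IMAGE_SIDE}x{MAX_IMAGE_SIDE}"
        )
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.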
Core capabilities
🏆 Benchmark performance at lightweight scale
Focused on maximizing performance within the under-6B parameter range. It shows exceptional editing capability and is currently the strongest image-editing model at this parameter level.
🚀 High intelligence density and cross-tier superiority
The architecture is optimized for parameter efficiency. Despite its smaller footprint, it surpasses open-source models in the 12B-20B range. In general editing and reference editing, it matches top-tier closed-source domestic models.
⚡ Second-level response, real-time interaction
Deep architectural optimization yields a large gain in inference speed: each editing task takes 1-2 seconds. This low latency removes a long-standing bottleneck for large models in real-time interactive image editing.
API endpoints
Text-to-image
POST /v1/images/generations
Generate an image from a prompt.
Image editing
POST /v1/images/edits
Modify an image based on an input image and a prompt.
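The two endpoint paths can be joined to a base URL as in the Python sketch below. The source does not state the API host, so the caller supplies the base URL. The task labels `generate` and `edit` and the function name `endpoint_url` are hypothetical.

```python
from urllib.parse import urljoin

# Paths come from the endpoint list; the dictionary keys are hypothetical labels.
ENDPOINT_PATHS = {
    "generate": "/v1/images/generations",  # text-to-image
    "edit": "/v1/images/edits",            # image editing
}


def endpoint_url(base_url: str, task: str) -> str:
    """Join a caller-supplied base URL with the path for the given task."""
    if task not in ENDPOINT_PATHS:
        raise ValueError(f"unknown task: {task!r}")
    # The paths are absolute, so urljoin replaces any path already on base_url.
    return urljoin(base_url, ENDPOINT_PATHS[task])
```

For example, `endpoint_url("https://api.example.com", "edit")` returns `https://api.example.com/v1/images/edits`, where `api.example.com` is a placeholder host.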
Pricing
| Item | Price |
|---|---|
| Text-to-image / Image editing | $0.003 / image |
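The flat per-image price makes batch costs easy to estimate. The Python sketch below assumes that every generated or edited image is billed at the listed $0.003 and that nothing else is charged; the source does not spell out the billing rules. The function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Price per image from the pricing table; Decimal avoids float rounding drift.
PRICE_PER_IMAGE_USD = Decimal("0.003")


def estimate_cost_usd(image_count: int) -> Decimal:
    """Estimate the cost in USD for a number of generated or edited images."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return PRICE_PER_IMAGE_USD * image_count
```

Under these assumptions, 1,000 images cost $3.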
Quickstart
- Text-to-image (curl)
- Image editing (curl)
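The Quickstart notes on size format and on `cfg_scale` with `negative_prompt` can be mirrored on the client, as in the Python sketch below. It is an illustration under assumptions, not the official request schema. The field name `size`, the dictionary layout, the function names, and the default values are hypothetical, apart from `text_mode` being off by default. `cfg_scale`, `negative_prompt`, and `text_mode` are named in the notes.

```python
# Sizes listed in the Quickstart notes, written as height x width.
SIZES = ("1024x1024", "768x1360", "896x1184", "1360x768", "1184x896")


def parse_size(size: str) -> tuple:
    """Split a listed size string into (height, width), in the documented order."""
    if size not in SIZES:
        raise ValueError(f"size {size!r} is not one of the listed sizes")
    height, width = (int(part) for part in size.split("x"))
    return height, width


def build_params(prompt, size="1024x1024", cfg_scale=1.0,
                 negative_prompt=None, text_mode=False):
    """Assemble a parameter dict; the field layout is an assumption."""
    parse_size(size)  # raises ValueError for an unlisted size
    params = {
        "prompt": prompt,
        "size": size,            # assumed field name
        "cfg_scale": cfg_scale,
        "text_mode": text_mode,  # off by default, per the notes
    }
    # Per the notes, negative_prompt is not passed to the underlying model
    # when cfg_scale is 1.0, so this sketch omits it in that case.
    if negative_prompt and cfg_scale != 1.0:
        params["negative_prompt"] = negative_prompt
    return params
```

Omitting `negative_prompt` on the client when `cfg_scale` is 1.0 makes it visible that the value would have no effect.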
Image sizes: 1024x1024, 768x1360, 896x1184, 1360x768, 1184x896 (format is height x width).
text_mode is an optimization strategy for text-rendering scenarios; it is off by default and can be enabled as needed.
When cfg_scale = 1.0, negative_prompt is not passed to the underlying model.
Showcase
Text-to-image
Wide-angle landscape photography

Cinematic portrait

Minimalist still life photography

Classical oil-painting still life

Image editing
Pose change / dialogue bubble
Prompt: Make the cat lie on its back showing its belly. Add a dialogue bubble next to it that says “I was wrong”.
| Input | Result |
|---|---|
![]() | ![]() |
Outfit replacement
Prompt: Change the man into a suit and shirt, and the woman into a beautiful Western wedding dress with a veil. Both face the camera directly.
| Input | Result |
|---|---|
![]() | ![]() |



