Image Generation & Editing Models - StepFun Documentation

Model overview

Our image models generate high-quality, diverse images from text prompts and, for editing, from reference images. They are suited to art, design, game development, and similar work. We currently provide two model series, step-1x and step-2x.

Models

step-2x-large

Our new-generation image model focused on text-to-image generation. Produces more realistic textures and stronger Chinese/English text rendering.

step-1x-edit

Specialized for image editing. Takes reference images plus text instructions, interprets the intended change, and produces edits that match the request.

step-1x-medium

A strong text-to-image generator with native Chinese support for better semantic understanding of Chinese prompts. Generates high-resolution, high-quality images with style-transfer capability.

Key terms

Image resolution: Pixel width/height of the output. Higher resolution gives more detail but increases generation time and compute.
Image style: Visual characteristics such as realistic, abstract, cartoon, etc.
Prompt/description: The text or reference image describing what to generate. More precise descriptions yield outputs closer to expectations.
Model parameters: The number of learned weights in a model. Larger models generally capture more detail and produce higher-quality results. The step-1x line offers a 2B-parameter model.
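
To make the resolution trade-off concrete, the sketch below compares pixel counts for the square sizes listed under Usage limits. It is only arithmetic on the output dimensions: this page does not state how generation time scales with pixel count, and the names `pixel_count` and `relative` are illustrative.

```python
# Square output sizes listed on this page (width, height in pixels).
SQUARE_SIZES = [(256, 256), (512, 512), (768, 768), (1024, 1024)]


def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in an image of the given dimensions."""
    return width * height


# Pixel count of each size relative to the smallest size.
base = pixel_count(*SQUARE_SIZES[0])
relative = {f"{w}x{h}": pixel_count(w, h) / base for w, h in SQUARE_SIZES}

for size, ratio in relative.items():
    print(f"{size}: {ratio:g}x the pixels of 256x256")
```

Doubling the side length quadruples the number of pixels the model has to produce, which is why the largest sizes are the most expensive to generate.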

Usage limits

Supported input: Natural-language descriptions of desired content and style.
Images per request: step-1x models generate one image per request.
Resolution limits: Square: 256x256, 512x512, 768x768, 1024x1024; rectangular: 1280x800 (16:10 landscape), 800x1280 (10:16 portrait).
Generation time: Varies with prompt complexity and model throughput.
Quality: Results depend on prompt specificity and training data; multiple attempts may be needed for the best output.
Copyright and usage: You own generated images but must not use them for illegal purposes or rights violations. Models are evolving, so evaluate and adjust for your scenario.
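
The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not the official client: the size list and the one-image limit for step-1x models come from this page, while the helper name `build_image_request` and the payload field names (`model`, `prompt`, `size`, `n`) are assumptions for illustration and should be checked against the API reference.

```python
# Output sizes listed on this page, written as "WIDTHxHEIGHT".
ALLOWED_SIZES = {
    "256x256", "512x512", "768x768", "1024x1024",
    "1280x800", "800x1280",
}

# This page states that step-1x models return one image per request.
MAX_IMAGES_STEP_1X = 1


def build_image_request(model: str, prompt: str,
                        size: str = "1024x1024", n: int = 1) -> dict:
    """Validate the arguments and return a request payload.

    The payload keys are hypothetical; only the limits are from this page.
    """
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty description")
    if size not in ALLOWED_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; choose one of {sorted(ALLOWED_SIZES)}"
        )
    if n < 1:
        raise ValueError("n must be at least 1")
    if model.startswith("step-1x") and n > MAX_IMAGES_STEP_1X:
        raise ValueError("step-1x models allow one image per request")
    return {"model": model, "prompt": prompt, "size": size, "n": n}


if __name__ == "__main__":
    print(build_image_request("step-1x-medium",
                              "a red fox in fresh snow, watercolor style",
                              size="1280x800"))
```

Rejecting an unsupported size or image count locally avoids a round trip that would fail anyway, and the error message can list the valid choices.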
