Have a different question and can’t find the answer you’re looking for? Reach out to our support team by sending us an email and we’ll get back to you as soon as we can.
Z-Image is a 6-billion-parameter text-to-image model that generates photorealistic images in sub-second time. Developed by Tongyi-MAI, it needs only 8 steps (forward passes) to generate a high-quality image.
The model uses Decoupled-DMD (Distribution Matching Distillation) to compress a larger model, which allows rapid image generation while maintaining quality.
Z-Image excels at rapid image iteration, bilingual text rendering (English and Chinese), photorealistic results with natural lighting, and cost-effective batch generation due to its speed.
The model handles complex text rendering well. For best results, state explicitly what text should appear and where. For example: 'a coffee shop sign that says Morning Brew in elegant gold lettering.'
The model works best at 1024x1024 resolution with 9 inference steps (8 forward passes). Set guidance scale to 0.0 for optimal performance.
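As a rough illustration, the recommended settings above could be collected in one place before a generation call. This is a minimal sketch: the class name `ZImageSettings`, the method `as_request`, and the dictionary key names are assumptions made for the example, not a documented Z-Image API. Only the values (1024x1024, 9 inference steps, guidance scale 0.0) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZImageSettings:
    """Hypothetical container for the settings recommended in this FAQ."""

    width: int = 1024
    height: int = 1024
    num_inference_steps: int = 9  # stated above to correspond to 8 forward passes
    guidance_scale: float = 0.0   # recommended value from the text

    def as_request(self, prompt: str) -> dict:
        """Build a plain dict of parameters; the key names are assumptions."""
        if not prompt.strip():
            raise ValueError("prompt must not be empty")
        return {
            "prompt": prompt,
            "width": self.width,
            "height": self.height,
            "num_inference_steps": self.num_inference_steps,
            "guidance_scale": self.guidance_scale,
        }


settings = ZImageSettings()
request = settings.as_request(
    "a coffee shop sign that says Morning Brew in elegant gold lettering"
)
print(request)
```

Keeping the defaults in a frozen dataclass means a script can override a single value, such as the resolution, without accidentally changing the guidance scale or step count.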
Use Z-Image when you need quick iterations for prototyping, accurate text in images (signs, posters, labels), photorealistic photographs and portraits, or cost-effective generation at scale.
Be specific and detailed: describe subject, action, style, and lighting conditions. Include style keywords like 'photorealistic,' 'cinematic,' or 'golden hour.' For text, clearly state what it should say and where.
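The prompting advice above can be sketched as a small helper that assembles a prompt from its parts. The function `build_prompt`, its parameters, and the ordering of the parts are hypothetical choices for illustration. Z-Image accepts free-form text, so nothing here is required by the model.

```python
def build_prompt(subject, action="", style_keywords=(), lighting="",
                 text="", text_placement=""):
    """Join subject, action, text instructions, style, and lighting into one prompt.

    Hypothetical helper: the comma-separated layout is an assumption, not a
    format the model requires.
    """
    parts = [subject.strip()]
    if action:
        parts.append(action.strip())
    if text:
        # State exactly what the text should say and, if known, where it goes.
        clause = f'with text that says "{text}"'
        if text_placement:
            clause += f" {text_placement.strip()}"
        parts.append(clause)
    parts.extend(keyword.strip() for keyword in style_keywords)
    if lighting:
        parts.append(lighting.strip())
    return ", ".join(part for part in parts if part)


prompt = build_prompt(
    "a barista",
    action="pouring latte art",
    style_keywords=("photorealistic", "cinematic"),
    lighting="golden hour",
    text="Morning Brew",
    text_placement="on the sign behind the counter",
)
print(prompt)
```

Separating the fields makes it easy to vary one element at a time, for example swapping the lighting while holding the subject and text fixed, which suits the quick iteration the model is aimed at.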
Yes! New users receive free credits to explore Z-Image. Purchase additional credits or subscribe for unlimited access. Unused credits never expire, and failed generations are automatically refunded.