Create professional videos with synchronized audio instantly
Result time: 2-3 min
Have a different question and can’t find the answer you’re looking for? Reach out to our support team by sending us an email and we’ll get back to you as soon as we can.
WAN 2.5 is Alibaba Cloud's state-of-the-art text-to-video and image-to-video generation model, available on the DashScope platform. It produces high-quality videos in 480p, 720p, or 1080p, complete with synchronized audio, from simple text or image prompts.
WAN 2.5 takes your text prompt or uploaded image and generates a complete video with synchronized audio, including lip-sync. Provide a clear prompt or reference image to get a ready-to-publish video.
You can generate videos from text prompts, images, or both. WAN 2.5 supports outputs in 480p, 720p, and 1080p resolutions, with durations up to 10 seconds.
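As a rough illustration of the limits described above, the following Python sketch checks a set of generation settings before they are submitted. The `VideoRequest` class, its field names, and the default values are hypothetical and are not part of the DashScope API; only the resolution list and the 10-second cap come from the text.

```python
from dataclasses import dataclass

# Limits quoted in the text; all names here are illustrative, not an official API.
SUPPORTED_RESOLUTIONS = ("480p", "720p", "1080p")
MAX_DURATION_SECONDS = 10


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical container for generation settings (defaults are assumptions)."""
    prompt: str = ""
    image_path: str = ""
    resolution: str = "720p"
    duration_seconds: int = 5


def validate_request(req: VideoRequest) -> list[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    # At least one of the two inputs (text prompt or image) must be present.
    if not req.prompt.strip() and not req.image_path:
        problems.append("provide a text prompt, an image, or both")
    if req.resolution not in SUPPORTED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    if not 0 < req.duration_seconds <= MAX_DURATION_SECONDS:
        problems.append(
            f"duration must be greater than 0 and at most {MAX_DURATION_SECONDS} seconds"
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets a form show every issue to the user at once.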
Inputs: JPG, JPEG, PNG, WEBP (max 10 MB, 360–2000 px shortest edge). Outputs: MP4 or WebM video with synchronized audio, supporting multiple aspect ratios and up to 10 seconds duration.
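The input limits above can be checked before upload with a small helper like the one sketched here. The function name `check_image` is hypothetical, the code assumes 1 MB means 1024 × 1024 bytes, and it reads "360–2000 px shortest edge" as requiring the shorter side to fall within that range; the caller is expected to supply the image dimensions.

```python
from pathlib import Path

# Limits taken from the text; the interpretation of each is an assumption.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
MIN_SHORT_EDGE_PX = 360
MAX_SHORT_EDGE_PX = 2000


def check_image(filename: str, size_bytes: int, width: int, height: int) -> list[str]:
    """Return a list of problems with an input image; empty means it passes."""
    problems = []
    # Compare the extension case-insensitively (e.g. ".PNG" is accepted).
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 10 MB")
    # The shortest edge is the smaller of the two dimensions.
    short_edge = min(width, height)
    if not MIN_SHORT_EDGE_PX <= short_edge <= MAX_SHORT_EDGE_PX:
        problems.append("shortest edge outside 360-2000 px")
    return problems
```

For example, `check_image("photo.PNG", 2_000_000, 1280, 720)` returns an empty list, while a GIF file would be reported as an unsupported file type.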
Compared with Veo 3, WAN 2.5 is more affordable and faster, supports longer videos (up to 10 seconds versus 8 seconds), offers more aspect ratios where Veo 3 is limited to one, and reliably generates videos with synchronized audio in Chinese and other languages. Veo 3 may not support multilingual prompts.
Yes, WAN 2.5 lets you supply voice, sound effects, or background music as input to drive video generation, enabling precise audio-visual synchronization. Veo 3 does not support this; it offers only silent or system-generated sound.