Open Source on the Horizon: AI Video Creation for the Masses
Exciting news! AliCloud has officially announced that its much-anticipated next-generation AI video generation model, WanX 2.1, will soon be open sourced! 🎉 Billed as "redefining video generation", the model has attracted wide attention in the industry since its release. The decision to open source WanX 2.1 will undoubtedly give fresh momentum to the field of AI video creation.
WanX 2.1 is the latest addition to AliCloud's "Wanx" family of large multimodal models. "Tongyi Wanxiang" debuted in July 2023, and WanX 2.1 represents the latest technological advance in the series. It generates high-quality images and videos from text prompts, and is also billed as the world's first model to support text effects in both Chinese and English.
Superior Performance: The Leader of the VBench Charts
WanX 2.1 excels at generating realistic videos. It performs strongly on complex motion, pixel quality, and adherence to the rules of physics. Especially noteworthy is how accurately it understands and follows user instructions, which put it at the top of the authoritative VBench leaderboard for video generation models with a total score of 84.7%, leading on key dimensions such as dynamic degree, spatial relationships, and multi-object interaction.
As of this writing, the top spot has changed to MiracleVision V5.
What makes WanX 2.1 so outstanding? The answer lies in the continuous technical innovation and breakthroughs made by AliCloud's research team.
Technological Innovation: Creating a More Realistic Video World
In pursuit of the ultimate in visual generation quality, the WanX 2.1 R&D team has explored and innovated in a number of key technology areas:
- Self-developed VAE and DiT frameworks: WanX 2.1 uses Aliyun's own VAE (Variational Autoencoder) and DiT (Diffusion Transformer) frameworks, which significantly enhance the model's ability to understand temporal and spatial relationships in video. This enables WanX 2.1 to generate more realistic and natural-looking video content when dealing with scenes containing complex motion and physics rules.
- Omni-temporal attention mechanism: By introducing an Omni-temporal Attention mechanism, WanX 2.1 can more accurately capture and simulate the complex, changing dynamics of the real world, making the generated videos more vivid.
- Extra-long context training: To improve the model's ability to understand and execute text prompts, WanX 2.1 also adopts an ultra-long context training method, which accelerates training and ties text prompts more closely to the generated video content, making video creation faster and more intuitive.
- First bilingual text effects in English and Chinese: WanX 2.1 is the first video generation model in the industry to support bilingual text effects, which has greatly expanded its application scenarios to better meet the diverse creative needs of the advertising design and short video production industries.
Text prompt: "Panoramic shot of a female figure skater performing on an ice rink. She is wearing a purple skating outfit and white skates and is performing a spinning maneuver. Her arms are spread wide and her body is tilted back, showing her skill and grace."
Thanks to these innovations, WanX 2.1 is able to handle large body movements and complex rotational scenes with ease. Even in challenging scenarios such as figure skating, swimming, and diving, which demand precise motion trajectories and body coordination, WanX 2.1 is still able to excel, setting a new quality benchmark for video generation.
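To make the attention idea from the list above a little more concrete, here is a minimal sketch of self-attention applied jointly across every frame and every spatial position of a toy video latent. This is not WanX 2.1's actual implementation, which the announcement does not describe. The function names (`flatten_video`, `full_attention`), the toy latent shape, and the use of identity projections (queries, keys, and values all equal to the input tokens) are assumptions made purely for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def flatten_video(latent):
    """Flatten a nested [T][H][W][D] latent into a list of T*H*W token vectors."""
    return [vec for frame in latent for row in frame for vec in row]

def full_attention(tokens):
    """Single-head self-attention over all tokens at once.

    Hypothetical simplification: Q = K = V = tokens (no learned projections).
    Because tokens from every frame sit in one sequence, each token can
    attend to every spatial position in every frame.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        # Scaled dot-product score between this token and every other token
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all token vectors
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

# Toy latent (assumed shape): 2 frames, 2x2 spatial grid, 3 channels
latent = [[[[float(t + h + w + c) / 4 for c in range(3)]
            for w in range(2)]
           for h in range(2)]
          for t in range(2)]

tokens = flatten_video(latent)      # 2 * 2 * 2 = 8 tokens
attended = full_attention(tokens)   # same shape as tokens
```

The cost of this joint formulation grows quadratically with the number of tokens (frames times height times width), which is why production video models typically rely on compressed latents; the sketch ignores that concern entirely.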
Open Source Sharing: Empowering a Broader Creative Ecology
Currently, WanX 2.1 is available to try for free on the official Tongyi Wanxiang website. Individual developers and enterprise users can also be among the first to experience the power of WanX 2.1 through the Aliyun Model Studio platform to unleash their creativity and efficiently generate high-quality video content.
The upcoming open-source release means that WanX 2.1 will no longer be limited to a specific platform, but will become part of the broader AI technology ecosystem. Aliyun's move will undoubtedly help popularize and advance AI video generation technology, so that more developers and creatives can stand on the shoulders of giants and jointly explore the possibilities of AI video creation, bringing AI technology and the creative industry closer together. Let's look forward to the day when WanX 2.1 is open-sourced and witness the arrival of a new era of AI video creation!