SiliconCloud Goes Live with Accelerated Video Model Mochi-1-Preview

AI News | Published 1 year ago | AI Sharing Circle


GenmoAI recently open-sourced mochi 1 preview, a 10B-parameter video generation model with high-fidelity motion and strong prompt adherence. It currently supports video generation at 480p resolution. Today, SiliconCloud launched an inference-accelerated version of mochi-1-preview (priced at ¥2.8 per video). This removes the deployment barrier for developers, who can simply call the API from their applications and deliver a more efficient user experience. The platform also lets developers freely compare and try dozens of large models to choose the best fit for their generative AI applications.

Online Experience
https://cloud.siliconflow.cn/playground/text-to-video/17885302647

API documentation
https://docs.siliconflow.cn/capabilities/video
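As a rough illustration of "calling the API", the sketch below builds a text-to-video submission request using only the Python standard library. The endpoint path (`/video/submit`), the model ID (`genmo/mochi-1-preview`), and the JSON field names are assumptions made for illustration and are not taken from this article; check the API documentation linked above for the actual contract before relying on any of them.

```python
import json
import urllib.request

# ASSUMPTIONS (not confirmed by the article): base URL, endpoint path,
# model ID, and request field names. Verify them against the API docs.
API_BASE = "https://api.siliconflow.cn/v1"
SUBMIT_PATH = "/video/submit"
MODEL_ID = "genmo/mochi-1-preview"


def build_submit_request(prompt, api_key, model=MODEL_ID):
    """Build (but do not send) a POST request that submits a video job."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + SUBMIT_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(prompt, api_key):
    """Send the request and return the decoded JSON response."""
    req = build_submit_request(prompt, api_key)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `submit("A tomato talking with a face", api_key)` would send the job. Video generation is typically asynchronous, so the response would most likely carry a job identifier to poll rather than the finished video.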

Prompt: A tomato talking with a face

Prompt: A woman with light skin, wearing a blue jacket and a black hat with a veil, looks down and to her right, then back up as she speaks; she has brown hair styled in an updo, light brown eyebrows, and is wearing a white collared shirt under her jacket; the camera remains stationary on her face as she speaks; the background is out of focus, but shows trees and people in period clothing; the scene is captured in real-life footage.

Prompt: A clear, turquoise river flows through a rocky canyon, cascading over a small waterfall and forming a pool of water at the bottom. The river is the main focus of the scene, with its clear water reflecting the surrounding trees and rocks. The trees are mostly pine trees, with their green needles contrasting with the brown and gray rocks. The overall tone of the scene is one of peace and tranquility.

Get a feel for what mochi-1-preview on SiliconCloud looks like after inference acceleration.

Model features and performance

Built on the Asymmetric Diffusion Transformer (AsymmDiT) architecture, mochi 1 is simple and easy to modify, and it is highly competitive with leading closed-source models. Prompt adherence and motion quality are two of the most critical capabilities in a video generation model.

Prompt adherence: Very close alignment with text prompts ensures that the generated video accurately reflects the given instructions. This gives the user detailed control over characters, settings, and actions.

Motion quality: mochi 1 generates up to 5.4 seconds of video at a smooth 30 frames per second, with a high degree of temporal coherence and realistic movement patterns. It simulates physical phenomena such as fluid dynamics and hair motion, and exhibits consistent, smooth human movement.
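The figures quoted in this article (5.4 seconds, 30 frames per second, ¥2.8 per video) allow a quick back-of-the-envelope estimate of clip length in frames and of batch cost. The sketch below is plain arithmetic on those numbers; the function names are illustrative, and the model's exact frame count may differ slightly from the product of duration and frame rate.

```python
from decimal import Decimal

# Figures quoted in the article.
MAX_DURATION_S = Decimal("5.4")    # maximum clip length in seconds
FPS = 30                           # frames per second
PRICE_PER_VIDEO = Decimal("2.8")   # price in CNY per generated video


def approx_frame_count(duration_s=MAX_DURATION_S, fps=FPS):
    """Approximate number of frames in a clip: duration times frame rate."""
    return int(duration_s * fps)


def batch_cost(num_videos, price=PRICE_PER_VIDEO):
    """Total cost in CNY for generating num_videos clips."""
    if num_videos < 0:
        raise ValueError("num_videos must be non-negative")
    return price * num_videos


print(approx_frame_count())  # frames in a maximum-length clip
print(batch_cost(10))        # cost of ten videos in CNY
```

Using `Decimal` rather than floats keeps the currency arithmetic exact.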

Token Factory SiliconCloud

Qwen 2.5 (7B) and 20+ other models for free!

As a one-stop cloud service platform for large models, SiliconCloud is committed to providing developers with model APIs that are fast, affordable, comprehensive in coverage, and smooth to use. The platform hosts dozens of open source large language models, image and video generation models, speech models, code and math models, and embedding and reranking models: Instruct, HunyuanVideo, Marco-o1, fish-speech-1.5, QwQ-32B-Preview, Qwen2.5-Coder-32B-Instruct, Qwen2-VL, InternVL2, Qwen2.5-7B/14B/32B/72B, FLUX.1, InternLM2.5-20B-Chat, BCE, BGE, SenseVoice-Small, GLM-4-9B-Chat, and others.

Among them, the APIs for Qwen2.5 (7B), Llama3.1 (8B), and several other large models are free to use, so developers and product managers do not need to worry about compute costs during research and development or large-scale rollout, and can achieve "Token Freedom".