Hunyuan World Model 1.1 - Tencent Hunyuan Releases Open-Source Large Model for 3D Reconstruction
What is Hunyuan World Model 1.1
Hunyuan World Model 1.1 (WorldMirror) is an open-source 3D reconstruction model released by Tencent's Hunyuan team and an upgraded version of the WorldMirror series. It supports multi-view images, video, and multi-modal prior inputs such as camera poses, camera intrinsics, and depth maps, breaking through the limitation of traditional 3D reconstruction that relies on images alone, and it flexibly adapts to different input combinations through a dynamic prior injection mechanism. For the first time, it achieves end-to-end unified multi-task output, simultaneously generating multiple 3D geometric predictions such as point clouds, multi-view depth maps, camera parameters, surface normals, and 3D Gaussian points; end-to-end collaborative training across these tasks improves reconstruction quality and geometric consistency.
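To make the input/output contract concrete, here is a minimal, hypothetical Python sketch of such a workflow. The class name WorldMirrorModel, the reconstruct method, and all argument names are illustrative assumptions made for this article, not the actual HunyuanWorld-Mirror API (see the GitHub repository for the real interface).

```python
# Hypothetical usage sketch only: class and method names below are illustrative
# assumptions, not the actual HunyuanWorld-Mirror API.
import numpy as np


class WorldMirrorModel:
    """Stand-in model that shows the expected shape of the workflow."""

    def reconstruct(self, images, camera_poses=None, intrinsics=None, depth_maps=None):
        # A feed-forward model of this kind returns all 3D attributes in one pass.
        n_views = len(images)
        return {
            "point_cloud": np.zeros((100_000, 3)),        # fused 3D points
            "depth_maps": np.zeros((n_views, 512, 512)),  # one depth map per view
            "camera_params": np.zeros((n_views, 3, 4)),   # estimated extrinsics
            "normals": np.zeros((n_views, 512, 512, 3)),  # per-view surface normals
            "gaussians": np.zeros((100_000, 14)),         # 3D Gaussian splat attributes
        }


# Any subset of priors may be supplied; the model adapts to whatever is available.
images = [np.zeros((512, 512, 3)) for _ in range(8)]  # 8-32 multi-view inputs
outputs = WorldMirrorModel().reconstruct(images)      # no priors: image-only input
print(outputs["point_cloud"].shape)
```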

Key Features of Hunyuan World Model 1.1
- Multi-modal input support: Accepts multiple input types such as multi-view images and video, and flexibly handles different kinds of input data.
- Unified multi-task output: Simultaneously outputs point clouds, depth maps, camera parameters, surface normals, and 3D Gaussian points, covering diverse 3D geometry prediction needs.
- Single-card deployment with second-level inference: A pure feed-forward architecture allows deployment on a single GPU and processes 8-32 view inputs in about 1 second locally.
- Flexible prior adaptability: Through a dynamic prior injection mechanism, the model adapts to any combination of priors and can still perform 3D reconstruction even with no prior input at all (a conceptual sketch follows this list).
- Strong generalization: A curriculum learning strategy pushes the model to generalize beyond a single image distribution, so it handles diverse input data better.
- High-precision 3D reconstruction: Excellent performance in 3D point cloud reconstruction and end-to-end 3DGS reconstruction, with strong geometric accuracy and detail reproduction, supporting high-quality 3D content creation.
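The dynamic prior injection mentioned above can be pictured as encoding whatever priors are available into tokens and fusing them with the image tokens. The sketch below is a conceptual assumption of how such a mechanism could look in PyTorch; the module name PriorInjection and the specific encoder choices are illustrative, not the published implementation.

```python
# Conceptual sketch of a "dynamic prior injection" mechanism (assumed design,
# not the released implementation): each available prior is encoded into tokens
# and fused with the image tokens; missing priors are simply skipped.
import torch
import torch.nn as nn


class PriorInjection(nn.Module):
    def __init__(self, dim: int = 768):
        super().__init__()
        # Separate encoders for global priors (camera pose/intrinsics) and
        # local priors (per-pixel depth), mirroring a hierarchical encoding idea.
        self.pose_encoder = nn.Linear(12, dim)       # flattened 3x4 extrinsics
        self.intrinsics_encoder = nn.Linear(4, dim)  # fx, fy, cx, cy
        self.depth_encoder = nn.Conv2d(1, dim, kernel_size=16, stride=16)

    def forward(self, image_tokens, pose=None, intrinsics=None, depth=None):
        tokens = [image_tokens]
        if pose is not None:
            tokens.append(self.pose_encoder(pose.flatten(1)).unsqueeze(1))
        if intrinsics is not None:
            tokens.append(self.intrinsics_encoder(intrinsics).unsqueeze(1))
        if depth is not None:
            d = self.depth_encoder(depth)                # (B, dim, H/16, W/16)
            tokens.append(d.flatten(2).transpose(1, 2))  # patch tokens
        return torch.cat(tokens, dim=1)                  # any prior combination works


x = torch.randn(1, 196, 768)            # image tokens
pose = torch.randn(1, 3, 4)
fused = PriorInjection()(x, pose=pose)  # depth and intrinsics omitted here
print(fused.shape)                      # torch.Size([1, 197, 768])
```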
Core Advantages of Hunyuan World Model 1.1
- Flexible handling of multi-modal inputs: Supports injection of multi-modal prior information such as camera poses, camera intrinsics, and depth maps; a hierarchical encoding strategy fuses global and local geometric constraints, so the model adapts to arbitrary prior combinations and improves reconstruction quality and robustness.
- Generalized 3D visual prediction: Achieves, for the first time, unified multi-task output of point clouds, depth maps, camera parameters, surface normals, and 3D Gaussian points; end-to-end collaborative training optimizes geometric accuracy and detail reproduction, supporting high-quality mesh reconstruction and real-time novel view rendering.
- Efficient single-card deployment with second-level inference: The pure feed-forward architecture outputs all 3D attributes in a single forward pass and processes 8-32 view inputs in about 1 second, significantly faster than traditional iterative optimization methods, lowering the hardware barrier and making 3D reconstruction available to everyone.
- Cross-scene generalization: Training is optimized with curriculum learning strategies (task ordering, data scheduling, and resolution progression), improving adaptability to diverse inputs such as real photos and AI-generated videos and producing well-structured, detail-rich scenes.
- Open source and ease of use: Fully open source, with local deployment documentation and a Hugging Face online demo; users can upload multi-view images or video and preview 3D reconstruction results in real time, lowering the barrier to adopting the technology (see the download snippet after this list).
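For local use, the open-source checkpoint can be fetched from the Hugging Face model repository listed in the links section below. This snippet uses the standard huggingface_hub API; the actual inference and deployment commands are documented in the GitHub repository and are not reproduced here.

```python
# Minimal sketch for fetching the open-source weights locally.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="tencent/HunyuanWorld-Mirror",  # model repo from the links below
    local_dir="./HunyuanWorld-Mirror",      # where to store the checkpoint files
)
print(f"Model files downloaded to: {local_path}")
```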
Official Websites of Hunyuan World Model 1.1
- Project website: https://3d-models.hunyuan.tencent.com/world/
- GitHub repository: https://github.com/Tencent-Hunyuan/HunyuanWorld-Mirror
- Hugging Face model library: https://huggingface.co/tencent/HunyuanWorld-Mirror
- Hugging Face online demo: https://huggingface.co/spaces/tencent/HunyuanWorld-Mirror
- Technical report: https://3d-models.hunyuan.tencent.com/world/worldMirror1_0/HYWorld_Mirror_Tech_Report.pdf
Who Hunyuan World Model 1.1 Is For
- 3D content creators: Quickly generate high-quality 3D scenes for game development, VR experiences, and film and TV production, helping creators build virtual worlds efficiently.
- Educators and students: Create immersive 3D teaching environments, such as virtual labs and historical scene recreations, to enhance the learning experience and outcomes.
- Industrial designers and engineers: Assist product design, virtual assembly, and physical simulation to accelerate the industrial design process and improve design efficiency and quality.
- Cultural heritage conservationists: Perform high-precision 3D reconstruction of ancient buildings and cultural relics to support the digital preservation and study of cultural heritage.
- Real estate developers and architects: Generate 3D models and virtual tours of buildings for architectural design presentations, virtual show rooms, and more, enhancing the user experience.
- Advertising and marketing professionals: Create engaging 3D ad content such as product demonstrations and virtual showrooms to improve the interactivity and appeal of advertisements.