AI Sharing Circle

AI is changing the world!

Depth Anything 3 - ByteDance Seed's Open Source 3D Visual Reconstruction Model

Depth Anything 3 (DA3) is a 3D visual reconstruction model developed and open-sourced by the ByteDance Seed team. It uses a single Transformer architecture to reconstruct spatial geometry from arbitrary viewpoints, and needs to predict only a depth map and a ray map to recover the 3D scene, compared to...

Latest AI Resources

8mos ago


DeepSeek-Math-V2 - DeepSeek's open source mathematical reasoning model

DeepSeek-Math-V2 is an open source mathematical reasoning model from DeepSeek, an AI company under High-Flyer. The latest version builds on DeepSeek-V3.2-Exp-Base, with performance surpassing that of Gemini DeepThink to reach the international number...

8mos ago

Z-Image - Alibaba Tongyi Lab's open source image generation model

Z-Image is an open source image generation model from Alibaba's Tongyi Lab that generates images quickly and efficiently. Using a single-stream diffusion Transformer architecture (S3-DiT), it integrates text, visual semantics and image VAE tokens into a unified input stream...

8mos ago

ROCK - Alibaba's open source sandbox for agent training environments

ROCK (Reinforcement Open Construction Kit) is Alibaba's open source sandbox for agent training environments, which addresses the problem that agents cannot be trained at scale in real environments. ROCK provides a highly stable sandbox management service...

8mos ago

ViMax - Open Source Multi-Agent Video Generation Framework from the University of Hong Kong

ViMax is an open source multi-agent video generation framework from the Data Science Laboratory of the University of Hong Kong that automates the whole process from creative input to video output. It integrates script generation, scene design, shot planning, video rendering and other functions, letting users generate coherent film- and television-grade video from natural language descriptions...

8mos ago

FLUX.2 - Black Forest Labs' Open Source Image Generation and Editing Model

FLUX.2 is an open source image generation and editing model released by Black Forest Labs. It supports text-to-image generation, multi-image referencing, and image editing, with richer details, clear textures, and stable lighting. There are four versions: FLUX.2 [pro] (comparable to the top closed source...

8mos ago

Fara-7B - Microsoft's open source computer-use agent model

Fara-7B is a 7-billion-parameter computer-use agent (CUA) model open-sourced by Microsoft and based on the Qwen2.5-VL-7B architecture. It visually parses web page screenshots and performs clicks, typing, and other on-screen actions, without relying on additional accessibility trees or multiple large models...

8mos ago

HunyuanOCR - Tencent's open source expert model for optical character recognition

HunyuanOCR is a high-performance optical character recognition model open-sourced by the Tencent Hunyuan team, with only 1 billion parameters. Built on the Hunyuan multimodal architecture, it adopts an end-to-end design and can efficiently handle text detection, recognition and document parsing tasks. The model scored 94.1 points in the complex document test, surpassing...

8mos ago

Supertonic - An open source, high-performance AI text-to-speech system that runs fast offline

Supertonic is an open source, high-performance text-to-speech (TTS) system focused on rapid speech generation on local devices. Built on ONNX Runtime, it can run on devices such as phones, computers and even Raspberry Pi, supports 23 languages and voice cloning, and requires no network...

8mos ago

MiMo-Embodied - Xiaomi's Open Source Cross-Domain Embodied Intelligence Foundation Model

MiMo-Embodied, open-sourced by Xiaomi Group, is the world's first cross-embodiment foundation model to successfully integrate embodied AI and autonomous driving. It addresses the problem of knowledge transfer between embodied AI and autonomous driving, and models tasks from both fields in a unified way.

8mos ago
