AI Sharing Circle

Z-Image - Open-source image generation model from Alibaba's Tongyi Lab

Z-Image is an open-source image generation model from Alibaba's Tongyi Lab that offers efficient, fast, and powerful image generation. It uses a single-stream diffusion Transformer architecture (S3-DiT) that integrates text, visual semantic, and image VAE tokens into a unified input stream...
4mos ago
ROCK - Alibaba's open-source sandbox for agent training environments

ROCK (Reinforcement Open Construction Kit) is Alibaba's open-source sandbox for agent training environments. It addresses the problem that agents cannot be trained at scale in real environments. ROCK provides a highly stable sandbox management service...
4mos ago
ViMax - Open-source multi-agent video generation framework from the University of Hong Kong

ViMax is an open-source multi-agent video generation framework from the Data Science Lab at the University of Hong Kong that automates the whole process from creative input to video output. It integrates script generation, scene design, shot planning, video rendering, and other functions, letting users generate coherent, film-and-television-grade video from natural-language descriptions...
4mos ago
FLUX.2 - Open-source image generation and editing model from Black Forest Labs

FLUX.2 is an open-source image generation and editing model released by Black Forest Labs. It supports text-to-image generation, multi-image referencing, and image editing, with richer details, clearer textures, and stable lighting. There are four versions: FLUX.2 [pro] (comparable to the top closed-source...
4mos ago
Fara-7B - Microsoft's open-source computer-use agent model

Fara-7B is a 7-billion-parameter computer-use agent (CUA) model open-sourced by Microsoft and based on Qwen2.5-VL-7B. By visually parsing web page screenshots and performing clicks, typing, and other on-screen actions, without relying on additional accessibility trees or multiple large models...
4mos ago
HunyuanOCR - Tencent Hunyuan's open-source expert model for optical character recognition

HunyuanOCR is a high-performance optical character recognition model open-sourced by the Tencent Hunyuan team, with only 1 billion parameters. Built on Hunyuan's multimodal architecture, it adopts an end-to-end design and can efficiently handle text detection, recognition, and document parsing tasks. The model scored 94.1 points in the complex document test, surpassing...
4mos ago
Supertonic - Open-source, high-performance AI text-to-speech system that runs offline at high speed

Supertonic is an open-source, high-performance text-to-speech (TTS) system focused on fast speech generation on local devices. Built on ONNX Runtime, it can run on devices such as mobile phones, computers, and even Raspberry Pi, supports 23 languages and voice cloning, and requires no network...
4mos ago
MiMo-Embodied - Xiaomi's open-source cross-domain embodied AI foundation model

MiMo-Embodied, open-sourced by Xiaomi Group, is the world's first cross-embodied foundation model to successfully integrate embodied AI and autonomous driving. It addresses the knowledge transfer problem between embodied AI and autonomous driving, and enables unified modeling of tasks in the two fields.
4mos ago
MOSS-Speech - Fudan University's open-source speech-to-speech large model

MOSS-Speech is an open-source speech-to-speech large model from Prof. Qiu Xipeng's team at Fudan University. It departs from traditional speech processing by understanding and generating speech directly, without text guidance, and can capture non-text elements such as intonation and emotion, making...
4mos ago
Parallax - The world's first fully autonomous AI operating system, open-sourced by Gradient

Parallax is the world's first "fully autonomous AI operating system", open-sourced by Gradient, a distributed AI lab. It supports cross-platform deployment of large models on Mac, Windows, and other heterogeneous devices, giving users full control over the model, data, and AI memory. The system has built-in network-aware ...
4mos ago