FG-CLIP 2 - 360's Open Source Image-Text Cross-Modal Vision-Language Model
FG-CLIP 2 is an image-text cross-modal vision-language model (VLM) launched by the 360 Artificial Intelligence Research Institute. It is reported to outperform comparable models from Google and Meta across 29 authoritative benchmarks, making it, by those results, the strongest VLM of its kind at present. It is able to accurately recognize the gross...
BettaFish - Open Source Multi-Agent Public Opinion Analysis System
BettaFish is an open source multi-agent system for public opinion analysis. Built on a multi-agent architecture, its Query, Media, Insight, Report, and other agents work together to close the loop from retrieval through extraction to reporting. The system supports AI-driven full ...
Ouro - A new looped language model open-sourced by the ByteDance Seed team
Ouro is a new family of Looped Language Models (LoopLM) developed by the ByteDance Seed team. Its core innovation is building reasoning capability directly into the pre-training phase through a parameter-sharing recurrent computation structure. The model uses 24 layers as the base block through...
ChronoEdit - AI image editing framework jointly open-sourced by NVIDIA and the University of Toronto
ChronoEdit, an open-source AI image editing framework developed by NVIDIA in conjunction with the University of Toronto, redefines the image editing task as a video generation task to ensure that the editing results are temporally and physically consistent. By distilling a pre-trained video generation model with 14B parameters from a...
LongCat-Flash-Omni - Meituan's Open Source Omni-Modal Large Language Model
LongCat-Flash-Omni is an open source omni-modal large language model released by Meituan's LongCat team. With 560 billion total parameters (27 billion activated), it delivers millisecond-level real-time audio and video interaction despite its large parameter count.
Petri - Anthropic's open source AI safety auditing framework
Petri is an open source AI safety auditing framework developed by Anthropic that systematically assesses the safety and behavioral alignment of AI models. By simulating realistic scenarios in which an automated auditor engages in multi-turn conversations with a target model, followed by a judge agent that acts on the model's...
Kimi Linear - A New Hybrid Linear Attention Architecture Open-Sourced by Moonshot AI
Kimi Linear is a new hybrid linear attention architecture open-sourced by Moonshot AI, with Kimi Delta Attention (KDA) at its core. KDA refines conventional attention with a finer-grained gating mechanism, which significantly improves hardware efficiency and memory control ...
FIBO - The world's first open-source text-to-image model with native JSON support
FIBO, developed by Bria AI, is the world's first open source text-to-image model with native JSON support. Built on the DiT (Diffusion Transformer) architecture with 8B parameters, it adopts the Flow Matching training method...
SoulX-Podcast - Soul AI Lab's Open Source Conversational Speech Synthesis Model
SoulX-Podcast is an open source multi-speaker conversational speech synthesis model from Soul AI Lab, designed for generating high-quality podcast content. It can generate multi-turn dialogue that simulates the natural flow of conversation in real podcasting scenarios, and supports Mandarin, English, and multiple Chinese...
GigaBrain-0 - Open source embodied foundation model driven by world-model-generated data
GigaBrain-0 is the first end-to-end Vision-Language-Action (VLA) embodied foundation model in China to use world-model-generated data to achieve generalization on real robots. It is jointly released as open source by GigaVision and the Hubei Humanoid Robot Innovation Center. It adopts a hybrid Transformer architecture, integrating ...









