Hunyuan-MT-7B - Tencent Hunyuan's Open Source Lightweight Translation Model
Hunyuan-MT-7B is a lightweight translation model introduced by Tencent's Hunyuan team. With 7 billion parameters, it supports mutual translation among 33 languages and 5 Chinese minority languages/dialects, including Cantonese, Uyghur, and Tibetan. In the Association for Computational Linguistics (ACL) WMT2025 competition...
Step-Audio 2 mini - StepFun's Open Source Large Speech Model
Step-Audio 2 mini is an open source end-to-end large speech model from StepFun. It departs from the traditional speech model structure and adopts a true end-to-end multimodal architecture that converts raw audio input directly into a speech response with lower latency, and it can understand paralinguistic information and non-speech signals.
MobileCLIP2 - Apple's Open Source Efficient On-Device Multimodal Model
MobileCLIP2 is an upgraded version of MobileCLIP, an efficient on-device multimodal model introduced by Apple researchers. It improves multimodal reinforced training by training a better-performing ensemble of CLIP teacher models on DFN datasets and an improved image-text...
InternVL3.5 - Shanghai AI Lab's Open Source Multimodal Large Model
InternVL3.5 (Shusheng-Wanxiang 3.5) is an open source multimodal large model from the Shanghai Artificial Intelligence Laboratory. The model is upgraded across general ability, reasoning ability, and deployment efficiency, and comes in nine sizes from 1 billion to 241 billion parameters to cover scenarios with different resource demands, including dense...
FastVLM - Vision Language Model from Apple
FastVLM (Fast Vision Language Model) is an efficient vision language model introduced by Apple Inc. Built around the FastViTHD hybrid vision encoder, it combines convolutional and Transformer architectures to significantly reduce visual...
MiniCPM-V 4.5 - ModelBest's Open Source 8B-Parameter Multimodal Model
MiniCPM-V 4.5 is an open source 8B-parameter multimodal model from ModelBest, built on Qwen3-8B and SigLIP2-400M, that processes images and videos efficiently. It performs well in visual token consumption, processing ...
Aivilization - A Multi-Agent Social Simulation Platform Launched by HKUST
Aivilization is an AI multi-agent social simulation platform developed by the Hong Kong University of Science and Technology, billed as the world's first of its kind. It builds a visual digital sandbox where users can create and guide thousands of AI agents and observe how a society of human-AI coexistence might evolve. The platform supports...
Grok 2.5 - Musk's xAI open source AI model
Grok 2.5 is an open source AI model from Elon Musk's xAI. With 269 billion parameters, it is based on a Mixture of Experts (MoE) architecture for strong performance and reasoning. The model has been tested on graduate-level scientific knowledge (GPQA), general knowledge (MMLU, MM...
Draw A Fish - free online AI fish drawing site with shared virtual fish tanks
Draw A Fish is a simple and fun online AI fish drawing site where users can draw fish and place them in a globally shared virtual fish tank. Draw A Fish requires no registration and is easy to use, taking only seconds to create and share.
ToonComposer - Tencent open source generative AI animation tool
ToonComposer is a generative AI animation tool jointly launched by The Chinese University of Hong Kong, Tencent PCG ARC Lab and Peking University. Through generative post-keyframing, it merges in-between frame generation and colorization into a single automated process, requiring only a sketch and a...