Fish Audio - AI Speech Synthesis and Voice Cloning Tool
Fish Audio is a generative AI speech synthesis tool that supports text-to-speech (TTS) and voice cloning. Users enter text and the tool converts it into natural, fluent speech. The platform offers multiple languages and voice styles to meet different scenarios and user...
SignGemma - Sign Language Translation Model from Google DeepMind
SignGemma is a sign language translation AI model introduced by Google DeepMind, billed as the most capable of its kind, that translates American Sign Language (ASL) into English text. The model is trained on multimodal data, combining visual and textual data to capture signing in real time and quickly translate it into text...
FLUX.1 Kontext - Image Generation and Editing Model from Black Forest Labs
FLUX.1 Kontext is an image generation and editing model from Black Forest Labs that provides context-aware image processing. The model understands both text and image prompts and performs tasks such as object modification, style transfer, and background replacement, while maintaining the corner...
WebAgent - Open Source Autonomous Search AI Agent from Alibaba Tongyi
WebAgent is an open source autonomous search AI agent from Alibaba's Tongyi Lab, with end-to-end autonomous information retrieval and multi-step reasoning capabilities. WebAgent can actively perceive, decide, and act in a web environment much as a human would, and is widely used in academic research, business decision...
Lingma IDE - AI Native Development Environment Launched by Tongyi Lingma
Lingma IDE is the AI native integrated development environment (IDE) launched by Tongyi Lingma. It is deeply adapted to the Qwen3 large model and has a programming agent mode that can autonomously complete tasks such as project awareness, code retrieval, and terminal operations. It supports MCP tools and integrates ModelScope MCP Plaza's 3...
BAGEL - Open source multimodal base model launched by ByteDance
BAGEL is a multimodal base model open-sourced by ByteDance with 14 billion parameters, of which 7 billion are active. The model is built on a Mixture-of-Transformer-Experts (MoT) architecture and uses two independent encoders to capture the pixel-level and semantic-level features of an image, respectively, to support efficient processing of images, text, video...
DeepSeek-R1 - AI reasoning model from DeepSeek, with performance on par with the official OpenAI o1 release
DeepSeek-R1 is a high-performance AI reasoning model launched by Hangzhou-based DeepSeek, benchmarked against the official release of OpenAI's o1. The model is post-trained with large-scale reinforcement learning and requires only a very small amount of labeled data to reason in math, code and natural language...
Phantom Boat AI - One-stop AI short film creation platform, batch generation of various types of video content
Phantom Boat AI is a one-stop AI short film creation platform that supports efficient batch generation of various types of video content, including commercials, promos, animations and more. The platform is built on Midjourney, Runway and other leading AI models, and provides creators with a wide range of services from scriptwriting to...
Circuit Tracer - Anthropic open source tool for visualizing the inner workings of models
Circuit Tracer is an open source tool from Anthropic for studying the internal workings of large language models. It generates attribution graphs to reveal the internal steps that a model takes when producing a particular output...
Google AI Edge Gallery - Google launches AI app that supports running AI models on your phone
Google AI Edge Gallery is an experimental AI app from Google that lets users try out machine learning (ML) and generative AI (GenAI) models running locally on their own devices. The app is supported on Android devices.