AI Sharing Circle

Daily updates on the latest AI products, projects, frameworks, and paper explainers.
AudioGen-Omni - Multimodal Audio Generation Model from Kuaishou

AudioGen-Omni is a multimodal audio generation model from Kuaishou that generates high-quality audio, speech, and songs from inputs such as video and text. AudioGen-Omni is based on advanced techniques such as a multimodal diffusion Transformer and phase-aligned...
8mos ago
RedOne - Xiaohongshu's Latest Large Language Model for Social Networking

RedOne is a large language model customized for social networks, introduced by Xiaohongshu. The model is trained with a three-stage strategy that incorporates social and cultural knowledge, strengthens multitask capabilities, and aligns with human preferences. RedOne significantly outperforms its base model on social tasks, in harmful content detection and browsing...
8mos ago
FastDeploy - Baidu's High-Performance Inference and Deployment Tool for Large Models

FastDeploy is a high-performance inference and deployment tool from Baidu, designed for large language models (LLMs) and vision-language models (VLMs). FastDeploy is built on the PaddlePaddle framework and supports a variety of hardware platforms...
8mos ago
InteriorGS - 3D Gaussian Semantic Dataset from Manycore Tech (群核科技)

InteriorGS is a high-quality 3D Gaussian semantic dataset introduced by Manycore Tech (群核科技). The dataset contains 1,000 3D scenes covering more than 80 types of indoor environments, such as homes, convenience stores, wedding halls, and museums. It has more than 554,000 object instances in 755 categories...
8mos ago
DragonV2.1 - Zero-Shot Speech Synthesis Model from Microsoft

DragonV2.1 is an advanced zero-shot text-to-speech (TTS) model from Microsoft. Based on the Transformer architecture, the model supports multiple languages and zero-shot voice cloning, and generates natural, expressive speech from a voice prompt of only 5 to 90 seconds.
8mos ago
ScreenCoder - Open-Source Tool for Generating Front-End Code from UI Screenshots

ScreenCoder is an open-source intelligent tool that quickly converts UI design screenshots into high-quality HTML/CSS code. The tool is based on a modular multi-agent architecture that combines visual understanding, layout planning, and code synthesis to support the generation of high-precision and semantic front-end ...
8mos ago
Kimi K2 High-Speed Edition - High-Speed Version of the Language Model from Moonshot AI's Kimi

Kimi K2 High-Speed Edition (kimi-k2-turbo-preview) is a high-performance language model introduced by Kimi, the assistant from Moonshot AI. The model is optimized on the basis of Kimi K2, with a greatly increased output speed, and can generate 40 tokens per second...
8mos ago
dots.ocr - Open-Source Multilingual Document Parsing Model from Xiaohongshu hi lab

dots.ocr is a multilingual document parsing model open-sourced by Xiaohongshu hi lab. It is based on a 1.7-billion-parameter vision-language model (VLM) and can efficiently perform document layout detection and content recognition while maintaining a good reading order.
8mos ago
HYPIR - A New Large Model for Image Restoration from a Chinese Academy of Sciences Team

HYPIR is a large model for image restoration introduced by Dong Chao's team at the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. The model combines the score prior of diffusion models with generative adversarial networks to achieve efficient, high-quality image restoration. HYPIR can quickly restore old photos and improve resolution while keeping text clear...
8mos ago
FLUX.1 Krea [dev] - Text-to-Image Model Jointly Released by Black Forest Labs and Krea AI

FLUX.1 Krea [dev] is a text-to-image model from Black Forest Labs and Krea AI. The model generates high-quality, photorealistic images from input text descriptions, with a unique aesthetic style that avoids traditional A...
8mos ago