Latest AI Resources

Total 2972 articles
RedOne - a newly released large language model for social networking from Xiaohongshu (Little Red Book)

RedOne is a large language model customized for social networking services, introduced by Xiaohongshu (Little Red Book). It is trained with a three-stage strategy that incorporates social and cultural knowledge, strengthens multitask capabilities, and aligns the model with human preferences. RedOne significantly outperforms the base model on social tasks, in harmful content detection and browsing...
7mos ago
TRAE SOLO - an AI automated development assistant from ByteDance's TRAE

TRAE SOLO is an AI automated development assistant introduced by TRAE, the AI programming assistant launched by ByteDance, to simplify the software development process. TRAE SOLO understands the user's needs, accepts requirements as text descriptions, voice commands, or file uploads, and automatically plans...
8mos ago
LiveTalking: an open source real-time interactive digital human live streaming system with synchronized audio and video dialogue

LiveTalking is an open source real-time interactive digital human system that aims to provide a high-quality digital human live streaming solution. The project is released under the Apache 2.0 license and integrates a number of cutting-edge technologies, including ER-NeRF rendering, real-time audio and video stream processing ...
1yr ago
Yume1.5 - An Interactive World Generation Model Open-Sourced by Shanghai AI Lab and Fudan University

Yume 1.5 is an open source interactive world generation model jointly developed by Shanghai Artificial Intelligence Laboratory, Fudan University, and Shanghai Innovation Research Institute. It is capable of real-time interactive rendering (12 FPS on a single GPU). It adopts joint spatio-temporal channel modeling (TSCM); even if the context length increases...
2mos ago
AutoMV - a free, open source music video generation system from M-A-P with BUPT, Nanjing University, and others

AutoMV is an open source music video generation system developed by the M-A-P team in collaboration with several universities. It automatically generates coherent music videos from complete songs without training. It adopts a multi-agent collaboration design, including music analysis, scriptwriting, directing, and quality control modules, and can accurately analyze the lyrics, beats, and...
2mos ago
PersonaLive - an open source real-time AI portrait animation framework for live streaming from the University of Macau and others

PersonaLive is an open source real-time AI face-swapping live streaming framework, jointly developed by the University of Macau, dzine.ai, and the GVC Lab at the University of the Greater Bay Area. It drives digital humans with low latency and a high frame rate on ordinary consumer-grade graphics cards (12 GB of video memory), and supports real-time through the camera...
2mos ago
MAI-UI - an open source general-purpose GUI agent foundation model from Alibaba's Tongyi Lab

MAI-UI is an open source general-purpose GUI agent foundation model from Alibaba's Tongyi Lab, with four major capabilities: cross-application operation, fuzzy semantic understanding, proactive user interaction, and multi-step process coordination. It adopts a device-cloud collaboration architecture: a lightweight model resides on the device to handle daily tasks, and complex tasks can call the cloud big...
2mos ago
InstanceAssemble - a layout-controlled image generation technology open-sourced by Xiaohongshu (Little Red Book) and Fudan University

InstanceAssemble is a layout-controlled generation technology jointly open-sourced by Xiaohongshu (Little Red Book) and Fudan University. Through its "Instance Assemble Attention" mechanism, it generates images accurately for layouts ranging from simple to complex and from sparse to dense. Adopting a two-stage cascade architecture, it first generates the image background, and then one by one ...
2mos ago
MedASR - Google's open source medical speech recognition model

MedASR is a 105-million-parameter medical speech recognition model open-sourced by Google. It is fine-tuned on a 5,000-hour de-identified clinical corpus, optimized for drug, dosage, and anatomical terminology, and includes a built-in 6-gram medical language model, with a word error rate of only 4.6 on the private radiology dataset RAD-DICT...
3mos ago