Latest AI Resources

Total: 3094 articles
LTX Studio: AI filmmaking platform with storyboard management tools that keeps faces consistent across multiple characters

General Introduction: LTX Studio is an AI-driven video creation platform designed for creators, marketers, filmmakers and studios. It covers the full workflow, from story conceptualization and storyboard generation to adding motion effects and post-editing, helping users transform creative concepts into...
1 yr ago
070.3K
ModelBest (面壁智能): World-leading lightweight, high-performance on-device large models

General Introduction: ModelBest is a company specializing in lightweight, high-performance large models, dedicated to bringing advanced AI to mainstream consumer electronics and other everyday end devices. Its MiniCPM series of on-device models are characterized by highly efficient use of compute and memory...
2 yrs ago
070.2K
LazyLLM: SenseTime's open source low-code development tool for building multi-agent applications

Comprehensive Introduction: LazyLLM is an open source tool developed by the LazyAGI team that focuses on simplifying the development of multi-agent large model applications. It helps developers quickly build complex AI applications through one-click deployment and a lightweight gateway mechanism, saving tedious engineering configuration...
1 yr ago
070.2K
LunaAI face swap: an open source Miaoya Camera-style app; deploy a complete front-end and back-end enterprise-grade AI face swap mini program (compute service is paid; supports secondary development)

Comprehensive Introduction: The LunaAI face swap mini program is a face swap application built with uniapp and the Vue framework. It uses PHP, MySQL, Nginx and Redis to let users swap faces through the mini program. Users can use this small...
1 yr ago
070.2K
CogAgent: Zhipu's open source visual language model for automating graphical interface operations

Comprehensive Introduction: CogAgent is an open source visual language model developed by the Tsinghua University Data Mining Research Group (THUDM) that aims to automate operations on cross-platform graphical user interfaces (GUIs). The model is based on CogVLM (GLM-4V-9B) and supports both Chinese and English...
1 yr ago
070.2K
Ultravox: a multimodal audio large model for real-time end-to-end voice dialog, an open source implementation of GPT-4o-style voice interaction

Comprehensive Introduction: Ultravox is a multimodal Large Language Model (LLM) designed for real-time speech processing. Unlike traditional speech recognition systems, Ultravox eliminates the need for a separate Automatic Speech Recognition (ASR) stage, and is able to directly convert audio into high-dimensional space in...
1 yr ago
070.1K
Luka (录咖): One-stop audio/video processing platform | Video generation | AI subtitles | Audio extraction | Speech to text

Comprehensive Introduction: Luka (录咖) is a one-stop audio/video processing platform that provides AI video dialog, AI subtitles and AI speech-to-text services. Features include screen recording, video editing, and conversion to GIF or audio, with support for cloud storage and sharing. The interface is intuitive and easy to use, and it also supports multi-screen recording and multi-language smart...
1 yr ago
070.1K
Metaso AI Search: ad-free, efficient academic search with a research mode for digging deeper into knowledge

General Introduction: Metaso AI Search comes from a technology company dedicated to improving productivity through artificial intelligence. The site provides ad-free and efficient academic search, aiming to give users accurate and fast results. Metaso AI Search has a self-developed large language model, MetaLLM, which can...
1 yr ago
070K
Volcano Ark: large model training and cloud computing service, with 150 yuan of compute credit on sign-up

Comprehensive Introduction: Volcano Ark is a cloud computing platform launched by Volcano Engine that focuses on large model services, aiming to provide enterprises with a complete solution from model selection and training through to application. Drawing on ByteDance's deep experience in AI, Volcano Ark integrates the large model resources of several top AI companies...
1 yr ago
069.9K
Tongyi Qianwen: Alibaba's large multimodal model with text answering, image understanding, and video parsing capabilities

Comprehensive Introduction: Tongyi Qianwen is a large AI model developed by Alibaba Cloud that aims to provide a human-like interaction experience through deep learning and natural language processing. It can quickly generate creative copy to add fun to daily life, and serve as a learning assistant that helps users pick up all kinds of knowledge. With cutting-edge technology and evolving...
1 yr ago
069.8K
XiYan GBI (XiYan-SQL): Text-to-SQL intelligent data analysis that makes ChatBI easy

Comprehensive Introduction: XiYan GBI is an intelligent data analysis product based on large models, launched by Alibaba Cloud Bailian. It uses natural language processing to let users query and analyze data in natural language without having to master complex SQL syntax. XiYan GBI supports multiple data sources, including...
1 yr ago
069.8K
AnimeGamer: an open source tool for generating anime videos and character interactions from language instructions

AnimeGamer is an open source tool launched by Tencent ARC Lab. Users can generate anime videos from simple language instructions, such as "Sousuke drives around in a purple car", and can have characters from different anime interact with each other, such as Kiki from Kiki's Delivery Service, and Castle in the Sky...
1 yr ago
069.6K
VideoLingo: open source tool for video transcription with word-level timestamped subtitles, subtitle translation, and localized dubbing

General Description: VideoLingo is a one-stop video translation and localized dubbing tool designed to generate Netflix-grade, high-quality subtitles. It avoids stiff machine translation and multi-line subtitles, and adds high-quality voiceovers so that knowledge can be shared worldwide across language barriers. By...
2 yrs ago
069.5K
Fun-ASR - A new-generation speech recognition model jointly launched by DingTalk and Tongyi

Fun-ASR is a large speech recognition model jointly launched by DingTalk and Tongyi Lab. Trained on massive amounts of audio data, it accurately recognizes terminology from many industries, such as the Internet, technology, and home decoration, significantly improving recognition accuracy. The model draws on DingTalk enterprise information for inference optimization to reduce hallucinations...
8 mos ago
069.4K
Amurex: open source AI meeting assistant that automatically records meetings and generates summaries

General Introduction: Amurex is an open source AI meeting assistant developed by The Personal AI Company that aims to improve meeting efficiency through intelligent features. Amurex can provide real-time suggestions, generate intelligent summaries, record meeting content, and automatically send follow...
1 yr ago
069.4K
VideoRAG: a RAG framework for understanding ultra-long videos, with support for multimodal retrieval and knowledge graph construction

Comprehensive Introduction: VideoRAG is a retrieval-augmented generation (RAG) framework designed for processing and understanding extremely long-context videos. The tool combines a graph-driven textual knowledge base with hierarchical multimodal context encoding to efficiently process on a single NVIDIA RTX 3090 GPU...
1 yr ago
069.2K
xyks: Xiaoyuan Kousuan (小猿口算) reverse engineering notes, covering reverse engineering and decryption algorithms

Comprehensive Introduction: Xiaoyuan Kousuan Reverse Notes is an open source project that aims to document and share the process and methods of reverse engineering the Xiaoyuan Kousuan application. The project contains usage instructions for a variety of reverse engineering tools and techniques, such as Frida and dexdump, to help users understand and crack the Xiaoyuan Kousuan add...
2 yrs ago
068.9K
YuE: a foundation model that turns lyrics into complete songs, supporting a wide range of musical styles

General Introduction: YuE is an open source foundation model for full-song generation that focuses on transforming lyrics into complete songs. Unlike other models that can only generate short snippets of non-vocal music, YuE can generate full songs several minutes long, with lead and backing vocals. The model addresses music generation in...
1 yr ago
068.8K
Paper2Code: Automatically converting machine learning papers into runnable code

General Introduction: Paper2Code is an open source project that aims to address the lack of code implementations for machine learning papers. It automatically transforms scientific papers into runnable code repositories through PaperCoder, a multi-agent large language model (LLM) system. The system uses planning ...
12 mos ago
068.8K
Artflow: Create character-consistent animated stories and virtual digital human presenter videos

General Description: Artflow is an online platform where users can upload photos, train their own AI characters, and create character-consistent videos and animated stories. The first training run is free, and users can customize a character's identity to create unique images and videos for a variety of scenarios. Monthly ...
2 yrs ago
068.8K
SegAnyMo: open source tool that automatically segments arbitrary moving objects in video

General Introduction: SegAnyMo is an open source project developed by a team of researchers from UC Berkeley and Peking University, including Nan Huang. The tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or...
1 yr ago
068.7K
Wenxin Agent Platform: agent applications built on complete distribution channels and a closed commercial loop

Introduction: Wenxin Agent Platform (AgentBuilder) is an agent platform launched by Baidu and built on the Wenxin large model. It lets developers choose among different development approaches according to their industry and application scenario, and build product capabilities for the large model era. Developers can ...
1 yr ago
068.6K