Latest AI Resources

Total: 3104 articles
Tongyi Qianwen: Alibaba's multimodal large model with text answering, image understanding, and video parsing capabilities

General Introduction: Tongyi Qianwen is an intelligent large model developed by Alibaba Cloud that aims to provide a human-like interaction experience through deep learning and natural language processing. It can quickly generate creative copy to add fun to everyday life and serve as a learning assistant that helps users pick up all kinds of knowledge. With cutting-edge technology and evolving...
1 year ago
074.2K
AI2SRT: Create narrated short videos or video summaries from long videos in one click using Gemini models

General Introduction: AI2SRT is an open source project that uses the Gemini AI large model to generate narrated short videos and video summaries from long videos in one click, and also supports transcribing audio and video into subtitles. The project aims to simplify video content creation and provide efficient subtitle generation and translation. Users can pass...
1 year ago
074.2K
Volcano Ark: Large model training and cloud computing service, with ¥150 worth of compute credits on sign-up

General Introduction: Volcano Ark is a cloud computing platform launched by Volcano Engine that focuses on large model services, aiming to provide enterprises with a complete solution from model selection and training through to application. Drawing on ByteDance's deep experience in AI, Volcano Ark integrates the large model resources of several top AI companies...
1 year ago
074.1K
MedRAX: An agent for chest X-ray analysis using multimodal large models

General Introduction: MedRAX is a state-of-the-art AI agent designed for chest X-ray (CXR) analysis. It integrates state-of-the-art CXR analysis tools and multimodal large language models to dynamically process complex medical queries without additional training. MedRAX, through its modular design...
1 year ago
074.1K
LTX Studio: AI filmmaking platform with storyboard management tools that keeps faces consistent across multiple characters

General Introduction: LTX Studio is an innovative AI-driven video creation platform designed for creators, marketers, filmmakers and studios. It covers the full process from story conceptualization, storyboard generation, and motion effects to post-editing, helping users transform creative concepts into...
1 year ago
074K
Zuotang (PicWish): Online image processing tool with one-click background removal, watermark removal, photo restoration, and portrait editing

General Introduction: Zuotang (PicWish) is an AI image processing platform that provides a wide range of online photo editing tools and works on all platforms. Users can easily perform one-click background removal, watermark removal, blurry photo sharpening, lossless upscaling, image cropping, image compression and black and white photo...
2 years ago
073.9K
Ultravox: A multimodal audio large model for real-time end-to-end voice dialog, an open source implementation of GPT-4o voice interaction

General Introduction: Ultravox is an innovative multimodal Large Language Model (LLM) designed for real-time speech processing. Unlike traditional speech recognition systems, Ultravox eliminates the need for a separate Automatic Speech Recognition (ASR) stage, and is able to convert audio directly into high-dimensional space in...
2 years ago
073.8K
Amurex: Open source AI meeting notes assistant that automatically records meeting content and generates summaries

General Introduction: Amurex is an open source AI meeting assistant developed by The Personal AI Company that aims to improve meeting efficiency through intelligent features. Amurex can provide real-time suggestions, generate intelligent summaries, record meeting content, and automatically send follow...
1 year ago
073.7K
SegAnyMo: Open source tool that automatically segments arbitrary moving objects in video

General Introduction: SegAnyMo is an open source project developed by a team of researchers at UC Berkeley and Peking University, including Nan Huang. The tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or...
1 year ago
073.7K
XiYan GBI (XiYan-SQL): Text-to-SQL intelligent data analysis that makes ChatBI easy

General Introduction: XiYan GBI is an intelligent data analysis product based on large models, launched by Alibaba Cloud Bailian. The product uses advanced natural language processing to help users query and analyze data in natural language without having to master complex SQL syntax. XiYan GBI supports multiple data sources, including...
1 year ago
073.7K
MMAudio: Generates synchronized sound effects and soundtracks for video footage, a video-to-audio tool built on joint multimodal training

General Introduction: MMAudio is an open source project that aims to generate high-quality synchronized audio through joint multimodal training. Developed by Ho Kei Cheng et al. at the Chinese University of Hong Kong, the project's main function is to generate synchronized audio based on video and/or text input. MM...
2 years ago
073.7K
Alimama Creative Center: Intelligent marketing creative support platform for the Taobao ecosystem

General Introduction: Alimama Creative Center is Alibaba's intelligent marketing creative support platform, designed to provide merchants on Taobao, Tmall, and other e-commerce platforms with a full range of creative support, from graphics to videos to landing pages. By combining AI copywriting capabilities with a massive template library, Creative Center dramatically improves the design efficiency...
2 years ago
073.7K
QuillBot: A writing tool that intelligently assists in rewriting and proofreading text

General Introduction: QuillBot is an AI-based online writing assistance platform designed to help users quickly rewrite, proofread and optimize text. It uses natural language processing to provide text rewriting, grammar checking, text summarization and translation, and is suitable for students, working professionals and internal...
1 year ago
073.5K
Dzine: Controllable AI image generation and canvas design tool offering hundreds of image styles

General Introduction: Dzine (formerly Stylar) is an all-in-one AI design platform that offers an integrated workflow from image generation to editing, with unrivaled image composition and style control. Its predefined styles make it easy for users of all skill levels to customize designs without complex...
2 years ago
073.3K
DeepRant: An open source client for real-time translation of in-game chat

General Introduction: DeepRant is an open source translation tool for gamers, designed to overcome language barriers on international servers. It translates in-game text instantly via shortcut keys, supports translation between multiple languages, and allows players to quickly understand and reply to chat messages without exiting the game...
1 year ago
073.2K
YuE: A foundation model that transforms lyrics into complete songs, supporting a wide range of musical styles

General Introduction: YuE is an open source foundation model for full song generation that focuses on transforming lyrics into full songs. Unlike other models that can only generate short snippets of non-vocal music, YuE is capable of generating full songs with lead and backing vocals up to several minutes in length. The model addresses music generation in...
1 year ago
073.2K
xyks: Xiaoyuan Kousuan reverse engineering notes, covering reverse engineering and algorithm decryption

General Introduction: Xiaoyuan Kousuan Reverse Notes is an open source project that aims to document and share the process and methods of reverse engineering the Xiaoyuan Kousuan application. The project contains usage instructions for a variety of reverse engineering tools and techniques, such as Frida and dexdump, to help users understand and crack the Xiaoyuan Kousuan add...
2 years ago
073.2K
GeekAI: Self-hosted, commercial-ready multi-functional AI assistant with multi-model API access and a full operations backend

General Introduction: GeekAI is a complete open source AI assistant solution built on large language model APIs. The project comes with an operations management backend, works out of the box, and integrates ChatGPT, Azure, ChatGLM, iFlytek Spark, Wenxin Yiyan and many other p...
2 years ago
073.1K
Infinity: Bitwise autoregressive modeling for unrestricted high-resolution image generation

General Introduction: Infinity is a groundbreaking high-resolution image generation framework developed by the FoundationVision team. The project breaks through the limitations of traditional image generation models with an innovative bit-level visual autoregressive modeling approach. The core features of Infinity...
1 year ago
073K
Tough Tongue AI: Practice interview and workplace communication skills by talking to an AI

General Introduction: Tough Tongue AI is an artificial intelligence platform designed for practicing tough conversations. Users can simulate a variety of complex conversational situations, such as job interviews, salary negotiations and sales presentations, by selecting preset scenarios or creating custom ones. The platform provides video and...
1 year ago
073K
CogVLM2: Open source multimodal model with support for video understanding and multi-turn dialog

General Introduction: CogVLM2 is an open source multimodal model developed by the Tsinghua University Data Mining Research Group (THUDM), based on the Llama3-8B architecture, and designed to provide performance comparable to or even better than GPT-4V. The model supports image understanding, multi-turn dialog, and visual ...
1 year ago
073K
Faxingbao: AI legal advisor, artificial intelligence legal consultation, Baidu AI legal platform

General Introduction: Faxingbao is an intelligent legal service platform launched by Baidu that combines advanced artificial intelligence with a professional legal knowledge base. The platform is dedicated to providing users with convenient, professional legal services, including intelligent legal Q&A, case analysis, contract review and other functions. Through deep learning...
1 year ago
072.9K
LunaAI Face Swap: An open source take on Miaoya Camera, a complete enterprise-grade AI face swap mini program with front end and back end (compute service is paid, secondary development supported)

General Introduction: LunaAI Face Swap is a face swap mini program developed with uniapp and the Vue framework. The application uses PHP, MySQL, Nginx and Redis to let users perform face swaps through the mini program. Users can use this small...
2 years ago
072.8K
Easegen: Open source digital human course production platform that turns PPT slides into cloned digital human lecture videos in one click

General Introduction: Easegen is an open source digital human course creation platform that aims to improve the efficiency of producing and managing teaching content through AI. The platform provides a one-stop solution from course production and video management to intelligent questioning, allowing users to create video courses presented by digital humans...
2 years ago
072.8K
Feishu Knowledge Q&A: Using Feishu documents as an AI knowledge base

General Introduction: Feishu Knowledge Q&A is an AI-driven knowledge management and Q&A tool launched by Feishu that deeply integrates the DeepSeek R1 large model. It supports real-time web search and multi-format file parsing (including documents, images, etc.), and can connect seamlessly to the enterprise knowledge base to help use...
1 year ago
072.8K
Ant Design X: A toolkit for rapidly building AI chat interfaces, with support for model integration and data flow management

General Introduction: Ant Design X is a toolkit open-sourced by Ant Group, designed to help developers quickly build AI-driven chat interfaces. It provides a rich set of components and templates, supports integration with models compatible with OpenAI standards, and is suitable for a variety of applications such as intelligent customer service, AI assistants, and other...
2 years ago
072.8K
WeaveFox: An intelligent front-end development platform that generates source code directly from design drawings

General Introduction: WeaveFox is an AI-powered front-end R&D platform launched by Ant Group, aiming to improve the efficiency and quality of front-end development. The platform is based on Ant's self-developed Bailing multimodal large model, which can generate front-end source code directly from design drawings, and supports multiple clients and technology stacks...
2 years ago
072.6K
XAudioPro: Professional Online Audio Editing Tool | Audiobook Maker | Text to Speech | Accompaniment Separation

General Introduction: XAudioPro is an advanced online tool for real-time audio editing and transcoding that is both professional and portable. It supports professional audio editing functions such as cutting, cropping, copying, deleting, restoring, and amplitude gain control. It also provides denoising services such as spectral subtraction noise reduction, low-pass...
2 years ago
072.6K