Tag: AI open source projects (page 31)

AI2SRT: Create short narrated videos or video summaries for long videos with one click using Gemini models

General Introduction: AI2SRT is an open source project that uses the Gemini large model to generate short narrated videos and video summaries from long videos with one click, and it also transcribes audio and video into subtitles. The project aims to simplify video content creation and to provide efficient subtitle generation and translation. Users can simply operate...

CogAgent: Zhipu's open source intelligent visual language model for automating graphical interface operations

General Introduction: CogAgent is an open source visual language model developed by the Tsinghua University Data Mining Research Group (THUDM) that aims to automate cross-platform graphical user interface (GUI) operations. The model is based on CogVLM (GLM-4V-9B), supports bilingual interaction in Chinese and English, and is able to automate GUI operations through screenshots and natural...

DisPose: Generating videos with precise control of human pose, for creating dance videos

General Introduction: DisPose is an open source artificial intelligence project focused on controllable character image animation. Developed by a team of researchers and open-sourced on GitHub, the project uses deep learning to achieve precise control of character animation by decomposing skeletal pose information. The core of DisPose...

Smolagents: An open source project for rapid, lightweight development of AI agents

General Introduction: Smolagents is a lightweight agent library developed by Hugging Face that focuses on simplifying the development of AI agent systems. The project is known for its clean design philosophy: its core is only about 1,000 lines of code, yet it provides powerful feature integration capabilities. Its most notable feature is its support for code execution...

Vision Parse: Intelligent Conversion of PDF Documents to Markdown Format Using Visual Language Models

General Introduction: Vision Parse is a document processing tool that uses state-of-the-art vision language models to convert PDF documents into high-quality Markdown content. The tool supports a wide range of leading vision language models, including o...

InvSR: Open source image super-resolution project for improving image resolution and quality

General Introduction: InvSR is an open source image super-resolution project based on diffusion inversion that converts low-resolution images into high-quality, high-resolution images. The project exploits the rich image priors embedded in large pre-trained diffusion models and, through a flexible sampling mechanism, supports 1 to...

Infinity: Bitwise autoregressive modeling for unrestricted high-resolution image generation

General Introduction: Infinity is a high-resolution image generation framework developed by the FoundationVision team. The project moves past the limitations of traditional image generation models through a bit-level visual autoregressive modeling approach. The core feature of Infinity is the use of an infinite-vocabulary tokenizer and...

GPTme: An intelligent programming assistant that runs in the command-line terminal, a local alternative to ChatGPT Code Interpreter

General Introduction: GPTme is a terminal-based AI assistant designed to make developers more productive. It brings AI capabilities into the terminal environment, supporting code execution, file editing, web browsing, and visual recognition. As a local alternative to ChatGPT Code Interpreter...

KAG: A Professional Knowledge Base Q&A Framework for Hybrid Knowledge Graph and Vector Retrieval

General Introduction: KAG (Knowledge Augmented Generation) is a logical-form-guided reasoning and retrieval framework based on the OpenSPG engine and large language models (LLMs). The framework is designed for building logical reasoning and factual question-answering solutions over specialized domain knowledge bases, which can effectively overcome the traditional RAG...

VideoSeal: Advanced open source tool for embedding and extracting invisible video watermarks to protect video copyright

General Introduction: VideoSeal is an open source video watermarking tool developed by Facebook Research, designed to provide efficient video watermark embedding and extraction. The tool supports the latest open source models and contains pre-trained models, training code, inference code and evaluation tools, all released under the MIT license. Vid...

OASIS: Multi-Agent Simulation of Social Media Interactions of Millions of Users to Study Complex Social Phenomena

General Introduction: OASIS (Open Agent Social Interaction Simulations) is an open source social media simulator capable of simulating the behavior of up to one million users. The platform combines large language models with rule-based agents and is designed to realistically reproduce the behavior of social media platforms such as Twitter...

Refly: An AI writing platform built on workflow orchestration on a free-form canvas for automated article generation

General Introduction: Refly is an AI-native authoring engine built on a free-form canvas, designed to help users turn ideas into high-quality content through multi-threaded conversations, knowledge base integration, contextual memory, and intelligent search. The platform covers over 20 professional scenario templates, including academic research and technical...
