
Tag: AI open source projects (page 26)

JoyGen: Audio-Driven, 3D Depth-Aware Talking-Face Video Editing Tool

General Introduction: JoyGen is a two-stage framework for generating talking-face videos, focused on audio-driven facial expression generation. Developed by a team from Jingdong Technology, the project uses 3D reconstruction and audio feature extraction to capture the speaker's identity features and expression coefficients...

VSR: AI-Based Software for Lossless Removal of Video Watermarks and Hard Subtitles (video watermark removal client, 7G+)

General Introduction: Video-subtitle-remover (VSR for short) is AI-based video processing software that specializes in removing hard subtitles and text watermarks from videos. The tool uses several AI models (STTN, LAMA, PROPAINTER) to intelligently recognize...

Riona-AI-Agent: A Social Media Automation Agent That Automatically Searches, Likes, and Comments

General Introduction: Riona-AI-Agent is an AI-powered automation tool designed to manage and optimize activity on major social media platforms. It uses AI models to provide content generation and account management for platforms such as Instagram, Twitter and GitHub. The system...

"Always-On" Deepseek AI Assistant: Building an Intelligent Voice Interaction System Based on Deepseek-V3

General Introduction: Always-On AI Assistant is a project that builds an always-on AI assistant system by integrating Deepseek-V3, RealtimeSTT and Typer. The project is optimized for engineering development scenarios, providing a complete...

Browser Use Web UI: An Open Source Framework for Running AI Agents That Browse and Operate Web Pages Automatically

General Introduction: Browser Use Web UI is an open source project that gives AI agents a graphical interface for browser interaction. It is built on top of the browser-use core framework and uses Gradio to provide a user-friendly web interface, making it easy for AI agents to ...

NVIDIA and LangChain Release an Advanced Guide to Analyzing and Writing Structured Reports, Enabling AI-Driven Technical Report Generation

General Introduction: This is a structured report generation blueprint co-developed by LangChain and NVIDIA and presented as a Jupyter notebook tutorial on GitHub. The project uses the Llama-3.3-70b model to automate the generation of professional technical reports. The core features of the project ...

Lecca: A No-Code Platform for Building AI Agents and AI Workflows

General Introduction: Lecca is an AI platform that lets users configure and deploy Large Language Models (LLMs) with multiple tools and workflows. Users can build, customize and automate their AI agents. Lecca offers a wide selection of AI providers and models, supports tool integration and workflow...

Ollama OCR: Extracting Text from Images Using Vision Models in Ollama

General Introduction: Ollama OCR is an Optical Character Recognition (OCR) toolkit that uses the vision language models available through the Ollama platform to extract text from images. The project is available both as a Python package and as a user-friendly Streamlit web application. It supports multiple ...

FitDiT: A High-Fidelity AI Virtual Try-On Tool That Improves the Realism of Garment Details

General Introduction: FitDiT is a high-fidelity virtual try-on system based on Diffusion Transformers (DiT). Developed by Tencent AI Lab, the project aims to address the limitations of traditional virtual try-on systems in displaying garment details. FitDiT proposes a new algorithmic architecture that can...

Thin-Plate-Spline-Motion-Model: Generating Video from a Static Portrait Image Using the Motion in a Reference Video

General Introduction: Thin-Plate-Spline-Motion-Model is an image animation project presented at CVPR 2022. It is based on thin-plate spline transformations and produces high-quality animations from still images, guided by driving videos. The project uses an end-to-end unsupervised learning framework ...

DUIX: Real-Time Interactive Digital Humans with One-Click Multi-Platform Deployment

General Introduction: DUIX (Dialogue User Interface System) is an AI-driven digital human interaction platform created by Silicon Intelligence. With its open source digital human interaction features, developers can easily integrate large models, automatic speech recognition (ASR) and text-to-speech (TTS) to achieve the same level of interaction with digital...

Fay Digital Human Framework: Integrating Language Models and 3D Digital Characters for Multiple Application Scenarios

General Introduction: Fay is an open source 3D virtual digital human framework that integrates language models and digital characters for a variety of application scenarios, such as virtual shopping guides, virtual hosts, assistants, waiters, teachers, and voice- or text-based mobile assistants. The Fay framework supports fully offline use, providing milliseconds back...
