477 Articles
Tags: AI open source project, Page 35
Comprehensive Introduction MaskGCT (Masked Generative Codec Transformer) is a fully non-autoregressive text-to-speech (TTS) model jointly introduced by Quwan Technology and The Chinese University of Hong Kong. The model does not require explicit alignment information between text and speech and adopts a two-stage generation approach, which first passes ...
Comprehensive Introduction Quanta Quest is billed as the world's first product built around "on-device large model + consumer-side data localization" as its core direction. It helps users store all their data from Gmail, Notion, Dropbox, etc. locally and processes it through a vector database to ensure data security and privacy...
General Description Local File Organizer is an AI-powered local file management tool designed to help users organize and categorize files on their computers. The tool runs AI models such as Llama 3.2 3B and LLaVA v1.6 via the Nexa SDK to enable intelligent scanning of files, re...
General Introduction This recipe is inspired by the podcast generation feature of NotebookLM and the recent open source Open NotebookLM implementation. It is a detailed step-by-step guide to building a PDF-to-podcast pipeline. Given any PDF, we will generate a segment where the host and guest discuss and explain ...
General Introduction Agent.exe is an open source Electron application that utilizes Anthropic's Claude 3.5 Sonnet API to allow users to control their local computer directly through AI. Developed by Kyle Corbitt, the project aims to provide a lightweight solution that allows users to physically...
Comprehensive Introduction MindSearch is an open source AI search engine framework launched by Shanghai Artificial Intelligence Laboratory, which aims to simulate the human thought process for complex information gathering and integration. The tool combines large language models (LLMs) and search engines with a multi-agent framework to achieve the...
Comprehensive Introduction CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities, from inference and training to deployment. Developed by the FunAudioLLM team, it aims to achieve high-quality speech synthesis through autoregressive transformers and ODE-based diffusion models. CosyVoice not only supports...
General Introduction Fabric is an open source AI framework developed by Daniel Miessler to simplify and automate everyday computer tasks and make artificial intelligence easier to use. Through its modular design and preset prompts (Patterns), it helps users efficiently handle a variety of tasks such as content summarization and data extraction...
General Introduction NocoDB is an open source Airtable alternative designed to provide a powerful and easy-to-use online database management tool. With NocoDB, users can easily create, read, update and delete data from databases without writing code. The platform supports a wide range of database types,...
General Introduction TANGO (Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation) is an open source co-speech gesture video generation framework jointly developed by the University of Tokyo and CyberAgent AI Lab. The ...