On January 7, a 6.8-magnitude earthquake struck Tingri County, Tibet. Many people have been following the progress of the rescue effort and praying for the safety of the affected areas. Meanwhile, amid this goodwill and concern, a picture of a "little boy buried under the rubble" spread quickly across the Internet. The picture was captioned "Rikaze Earthquake" and moved countless people to tears, but also...
We have released vdr-2b-multi-v1, the best multilingual embedding model for visual document retrieval. We also released its English-only version, vdr-2b-v1, and open-sourced the new vdr-multilingual-train dataset. This dataset contains 500,000 high-quality samples...
Enable Builder smart programming mode for unlimited use of DeepSeek-R1 and DeepSeek-V3 and a smoother experience than the overseas version. Just enter commands in Chinese, and even a novice programmer can build their own apps with no barrier to entry.
✨ Little Red Book gold-digging secrets revealed! 🔥 Leek programs can be fun? Free how-to tutorials are included! Hello everyone, I recently noticed that many of you are very interested in projects related to Little Red Book; it seems everyone wants to strike gold in this blue ocean! 🚀 Don't worry, today I bring you dry...
Agent AI: Surveying the Horizons of Multimodal Interaction (originally published at https://ar5iv.labs.arxiv.org/html/2401.03568). Abstract: Multimodal AI systems are likely to be ubiquitous in our daily lives. Making these systems more interactive a...
General Introduction: Cursor Auto-Free is an open source project developed by GitHub user chengazhen to automate signups and get free access to the Cursor IDE. Cursor is a code editor with integrated AI functionality. With this tool, users can automatically sign up and get a free trial period...
Coze (Button) Automation Work Hands-on Tutorial. Introduction: In the modern work environment, automation technology is becoming an important way for organizations to improve productivity, thanks to its efficiency, precision and scalability. **Coze (Button)** as a lightweight and highly flexible automation tool for various industries from...
General Introduction: BrownChat is a real-time audio chat application based on Large Language Model (LLM) technology. Developed by GitHub user sugarforever, the project aims to enhance the user's communication experience through advanced natural language processing technology. BrownChat provides an open source platform where users...
Comprehensive Introduction: Xunfei Instrument is an AI-based document writing platform launched by Xunfei. Built on the Xunfei Starfire large model, it is designed to provide efficient and convenient writing solutions for people who write formal documents. The platform covers the whole process, including material preparation, manuscript writing, and reviewing and checking, aiming to improve the user...
Comprehensive Introduction: Lecca is a powerful AI platform that allows users to configure and deploy Large Language Models (LLMs) with multiple tools and workflows. Users can easily build, customize and automate their AI agents. Lecca offers a wide selection of AI providers and models, supports tool integration and workflow...
General Description: Automa is a powerful browser extension that simplifies repetitive browser tasks by automating them. Whether the job is auto-filling forms, taking screenshots, scraping data, or executing complex workflows, Automa can handle it with ease. Users can connect different modules to create...
Comprehensive Introduction: Ollama OCR is a powerful Optical Character Recognition (OCR) toolkit that uses state-of-the-art vision language models provided through the Ollama platform to extract text from images. The project is available both as a Python package and as a user-friendly Streamlit web application. It supports multiple ...
Comprehensive Introduction: FitDiT is a high-fidelity virtual try-on system based on Diffusion Transformers (DiT). Developed by Tencent AI Lab, the project aims to address the limitations of traditional virtual try-on systems in displaying garment details. FitDiT innovatively proposes a new algorithmic architecture that can...
Comprehensive Introduction: Avatarify Python is an open source AI tool for video conferencing, based on First Order Motion Model technology, that maps a user's facial expressions and movements onto any avatar in real time. It works with Zoom, Skype, Teams and other video conferencing software, allowing the use...
General Introduction: FaceSwap is an open source deep learning face swapping tool that recognizes and swaps faces in pictures and videos. The project is community-driven, written in Python, and supports multiple operating systems such as Windows, Linux, and macOS. FaceSwap utilizes deep learning techniques,...
With the rapid development of AI, digital humans have matured and can now be generated quickly at low cost. They have drawn attention because of their wide range of commercial applications. Whether in virtual reality (VR), augmented reality (AR), film and television production, game development, or brand promotion, digital humans are...
General Introduction: Thin-Plate-Spline-Motion-Model is a groundbreaking image animation project presented at CVPR 2022. The project is based on thin-plate spline transformations and can animate still images with high quality, guided by driving videos. The project uses an end-to-end unsupervised learning framework ...
General Introduction: DUIX (Dialogue User Interface System) is an AI-driven digital human interaction platform created by Silicon Intelligence. With its open source digital human interaction features, developers can easily integrate large models, automatic speech recognition (ASR) and text-to-speech (TTS) to achieve the same level of interaction with digital...
Comprehensive Introduction: Fay is an open source 3D virtual digital human framework that integrates language models and digital characters for a variety of application scenarios, such as virtual shopping guides, virtual anchors, assistants, waiters, teachers, and voice- or text-based mobile assistants. The Fay framework supports full offline use, providing milliseconds back...
General Introduction: MOFA-Video is an advanced image animation tool that uses generative motion field adaptation techniques to convert static images into dynamic videos. It was developed in collaboration with the University of Tokyo and Tencent AI Lab and will be presented at the European Conference on Computer Vision (ECCV) 2024. MOFA-Vi...