Google Vids: Creating Marketing Explainer Videos from Prompts and Documents
General Introduction Google Vids is an AI-powered video creation tool in the Google Workspace suite designed to help users easily create and share work-related video content. With simple prompts and clip integration, users can generate stories...
TableGPT2: A Multimodal Model for Tabular Data Integration
Comprehensive Introduction TableGPT2 is a multimodal model developed by a team from Zhejiang University, focusing on the integration and processing of tabular data. The model is pre-trained and fine-tuned to perform well on table-related tasks while maintaining strong general-purpose language and coding capabilities. TableGP...
Context: seamless integration of data sources, with multi-role agents automating work across different scenarios
General Introduction Context Autopilot is an intelligent AI productivity tool from Context designed to improve team productivity through deep integration and office automation. The tool utilizes the world's first context engine, Context-1...
Coming soon: Kling 1.5 "Custom Model" trains characters from users' own video footage
Disrupting Traditional Video Generation Kling AI's "Custom Model" feature allows users to train their characters by uploading 10 to 30 videos (each at least 10 seconds long). The process is very different from traditional image training models, as Kling AI utilizes video footage for character...
EyeLevel (GroundX): a multimodal enterprise document data processing platform that eliminates LLM hallucinations at the RAG source
Comprehensive Introduction EyeLevel is a company focused on preventing LLM hallucinations by converting complex enterprise content into data suitable for Large Language Model (LLM) processing. Through its data transformation engine and multimodal processing technology, EyeLevel is able to transform complex tables, charts...
WebSpy: analyze website SEO metrics, test website requests, and optimize website performance
General Introduction WebSpy is a powerful website analysis and testing tool designed for developers and testers. It allows users to monitor and edit HTTP requests and responses of a website, supporting multiple request types (such as GET, POST, PUT, PATCH, DE...
fal: generative large-model API for developers of rich media applications
General Introduction fal is an online AI inference platform that helps users build real-time AI applications with high-quality generative media models, including images, video and audio. No cold start required, pay-as-you-go. fal offers a wide range of pre-trained generative models such as Stable Dif...
VideoChat: real-time voice-interactive digital human with customizable appearance and voice cloning, supporting both end-to-end and cascaded voice pipelines
Comprehensive Introduction VideoChat is a real-time voice-interaction digital human project based on open source technology, supporting both end-to-end voice pipelines (GLM-4-Voice - THG) and cascaded pipelines (ASR-LLM-TTS-THG). The project allows users to customize the digital ...
Ichigo (llama3-s): local real-time voice AI assistant, open source version of Siri
General Introduction Ichigo is an open source real-time speech AI project that aims to extend text-based language models with native "listening" capabilities. The project uses early fusion techniques inspired by Meta's Chameleon paper. Ichigo's goal is to become...
SFT-data-builder: generate AI training data with free large-model APIs, at zero cost
Comprehensive Introduction SFT-data-builder is an open source project designed to generate high-quality SFT training data by combining users' private domain data with a free large-model API. The tool supports multiple AI model formats and provides one-click generation, batch generation, flexible editing and local...









