AI Personal Learning
and practical guidance
AI tools

Genspark: deep search and research report writing with Genspark agents

General Introduction: Genspark is an AI-based search tool. It was founded in 2023 by a former Baidu executive and is based in Palo Alto, California. Unlike traditional search engines, Genspark uses multiple AI agents to generate customized search result pages in real time, called "Sparkpage...

Converting video and speech to SRT subtitles using the Gemini 2.5 Pro model

I previously tried converting speech to multi-speaker subtitles with Gemini 2.0 for free, and the results were quite good. I tried it again with Gemini 2.5 Pro. First, I found a sample of standard SRT subtitles to use as a reference benchmark (the speech-to-text conversion was done in advance, using a mainstream model on the market): 00...

uniOCR: cross-platform open source text recognition tool

General Introduction: uniOCR is an open source text recognition tool developed by the mediar-ai team. It is written in Rust and supports macOS, Windows and Linux. Users can use it to extract text from images; it is simple to operate and free. uniOCR's core feature is cross-platform support...

YOLOE: an open source tool for real-time video detection and segmentation of objects

YOLOE is an open source project developed by the Multimedia Intelligence Group (THU-MIG) at Tsinghua University's School of Software, with the full name "You Only Look Once Eye". It is built on the PyTorch framework and extends the YOLO series to detect and segment any object in real time. The project is hosted on GitHub, ...

VideoMind: open source project for locating video content by timestamp and answering questions about it

General Introduction: VideoMind is an open source multimodal AI tool focused on inference, Q&A and summary generation for long videos. It was developed by Ye Liu of the Hong Kong Polytechnic University and a team from Show Lab at the National University of Singapore. The tool mimics the way humans understand video by splitting tasks into planning,...

SegAnyMo: open source tool to automatically segment arbitrary moving objects from video

General Introduction: SegAnyMo is an open source project developed by a team of researchers at UC Berkeley and Peking University, including Nan Huang. The tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or vehicles. It combines TAP...
