YOLOE is an open source project developed by the Multimedia Intelligence Group (THU-MIG) at the Tsinghua University School of Software; its full name is "You Only Look Once Eye". Built on the PyTorch framework as an extension of the YOLO series, it can detect and segment arbitrary objects in real time. The project is hosted on GitHub, ...
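As a rough illustration of the text-prompted workflow YOLOE targets, the sketch below assumes an Ultralytics-style Python API and an example checkpoint name; the exact class names and weights should be checked against the THU-MIG/yoloe repository rather than taken from this snippet.

```python
# Minimal sketch of text-prompted detection/segmentation with a YOLOE checkpoint.
# Assumes an Ultralytics-style API and the "yoloe-11s-seg.pt" weights name (assumption);
# consult the THU-MIG/yoloe repository for the exact interface and checkpoints.
from ultralytics import YOLOE

model = YOLOE("yoloe-11s-seg.pt")          # load a small segmentation checkpoint (assumed name)

# Open-vocabulary prompting: restrict the model to the classes you care about.
names = ["person", "bicycle", "traffic light"]
model.set_classes(names, model.get_text_pe(names))

results = model.predict("street.jpg")      # detection + segmentation in one pass
results[0].show()                          # visualize boxes and masks
```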
General Introduction SegAnyMo is an open source project developed by a team of researchers at UC Berkeley and Peking University, including members such as Nan Huang. This tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or vehicles. It combines TAP...
Comprehensive Introduction RF-DETR is an open source object detection model developed by the Roboflow team. It is based on the Transformer architecture, and its core feature is real-time efficiency. It is the first real-time model to exceed 60 AP on the Microsoft COCO dataset, and it also performs strongly on the RF100-VL benchmark,...
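Roboflow distributes RF-DETR as a pip-installable package; the sketch below assumes the `rfdetr` package exposes an `RFDETRBase` class with a `predict()` method, so the names and the `threshold` parameter should be verified against Roboflow's documentation.

```python
# Minimal RF-DETR inference sketch, assuming `pip install rfdetr` provides RFDETRBase
# with a predict() method that returns detections; verify names against Roboflow's docs.
from PIL import Image
from rfdetr import RFDETRBase

model = RFDETRBase()                               # pretrained COCO weights (assumed default)
image = Image.open("example.jpg")                  # hypothetical input image

detections = model.predict(image, threshold=0.5)   # confidence threshold (assumed parameter)
print(detections)                                  # boxes, class ids, and scores
```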
General Introduction HumanOmni is an open source large multimodal model developed by the HumanMLLM team and hosted on GitHub. It focuses on analyzing human-centered video and can process both visuals and audio to help understand emotion, movement, and conversational content. The project used 2.4 million human-centered video clips and...
General Introduction Vision Agent is an open-source project developed by LandingAI (Andrew Ng's team) and hosted on GitHub, designed to help users quickly generate code that solves computer vision tasks. It uses an advanced agent framework and a multimodal model to generate efficient code from simple prompts...
General Introduction Make Sense is a free online image annotation tool designed to help users quickly prepare datasets for computer vision projects. It requires no installation: it runs directly in the browser, works across operating systems, and is well suited to small deep learning projects. Users can use it to...
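A common workflow is to draw boxes in Make Sense and export them in YOLO format (one .txt file per image with normalized coordinates). The sketch below, with hypothetical file names, shows how such an export can be read back into pixel coordinates for training or inspection.

```python
# Read a YOLO-format label file, as exported by Make Sense and many other tools.
# Each line is: <class_id> <x_center> <y_center> <width> <height>, normalized to [0, 1].
# File names here are hypothetical examples.

def load_yolo_labels(label_path: str, img_w: int, img_h: int):
    """Return a list of (class_id, x_min, y_min, x_max, y_max) in pixel coordinates."""
    boxes = []
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, w, h = line.split()
            xc, yc = float(xc) * img_w, float(yc) * img_h
            w, h = float(w) * img_w, float(h) * img_h
            boxes.append((int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2))
    return boxes

print(load_yolo_labels("labels/cat_001.txt", img_w=1280, img_h=720))
```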
Comprehensive Introduction YOLOv12 is an open source project developed by GitHub user sunsmarterjie, focused on real-time object detection. The project builds on the YOLO (You Only Look Once) family of frameworks and introduces an attention mechanism to improve on traditional convolutional neural networks (CNNs), not only in the detection of ...
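Because the repository builds on the Ultralytics codebase, inference likely follows the familiar YOLO API; the snippet below is a sketch under that assumption, and the checkpoint name "yolov12n.pt" is an assumed example rather than a confirmed file.

```python
# Sketch of YOLOv12 inference, assuming the sunsmarterjie/yolov12 repo keeps the
# Ultralytics-style API; the checkpoint name "yolov12n.pt" is an assumed example.
from ultralytics import YOLO

model = YOLO("yolov12n.pt")                # nano variant of the attention-centric detector
results = model("bus.jpg")                 # single-image inference

for box in results[0].boxes:               # iterate over detected objects
    cls_id = int(box.cls)
    conf = float(box.conf)
    print(model.names[cls_id], f"{conf:.2f}", box.xyxy.tolist())
```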
Comprehensive Introduction VLM-R1 is an open source vision-language model project developed by Om AI Lab and hosted on GitHub. The project builds on DeepSeek's R1 approach combined with the Qwen2.5-VL model, using reinforcement learning (R1) and supervised fine-tuning (SFT) to significantly improve the model's visual...
Comprehensive Introduction HealthGPT is a state-of-the-art medical large vision-language model designed to enable unified medical visual understanding and generation through heterogeneous knowledge adaptation. The goal of the project is to integrate medical visual understanding and generation capabilities into a unified autoregressive framework, significantly enhancing medical image processing...
Comprehensive Introduction MedRAX is a state-of-the-art AI agent designed specifically for chest X-ray (CXR) analysis. It integrates advanced CXR analysis tools with a multimodal large language model to dynamically handle complex medical queries without additional training. MedRAX, through its modular design and strong technological base,...
Comprehensive Introduction Agentic Object Detection is an advanced object detection tool from Landing AI. It greatly simplifies the traditional detection workflow by using text prompts to drive detection, with no data labeling or model training required. Users simply upload an image and enter a detection prompt, and the AI ...
General Introduction CogVLM2 is an open source multimodal model developed by the Tsinghua University Data Mining Research Group (THUDM), based on the Llama3-8B architecture and designed to deliver performance comparable to, or even better than, GPT-4V. The model supports image understanding, multi-turn dialogue, and video understanding, and can handle content up to 8K long...
Comprehensive Introduction Gaze-LLE is a gaze target prediction tool based on large-scale learned encoders. Developed by Fiona Ryan, Ajay Bati, Sangmin Lee, Daniel Bolya, Judy Hoffman, and James M. Rehg, it is designed to use pretrained visual foundation models (e.g., DINOv2) to realize ...
Comprehensive Introduction Video Analyzer is a comprehensive video analysis tool that combines computer vision, audio transcription, and natural language processing techniques to generate detailed video content descriptions. The tool does this by extracting key frames from the video, transcribing audio content, and generating natural language...
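The pipeline described above (sample key frames, transcribe the audio, then describe the result) can be illustrated with a generic sketch. This is not the project's own code: opencv-python and openai-whisper are used here as stand-ins, the sampling interval is arbitrary, and the file names are examples.

```python
# Generic illustration of a frame-sampling + transcription pipeline, not the project's code.
import cv2
import whisper

def sample_frames(video_path: str, every_n: int = 90):
    """Grab one frame every `every_n` frames as crude key-frame candidates."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

frames = sample_frames("clip.mp4")                                   # hypothetical input video
transcript = whisper.load_model("base").transcribe("clip.mp4")["text"]

# A real implementation would now pass the frames and transcript to a vision/language
# model to produce the final natural-language description of the video.
print(f"{len(frames)} sampled frames, transcript preview: {transcript[:80]}")
```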
General Introduction Twelve Labs is a multimodal AI company focused on video understanding, dedicated to helping users understand and process large amounts of video content through advanced AI technologies. Its core technologies include video search, generation, and embeddings, which can extract key features from video such as actions, objects, on-screen text,...