OCR

Total: 27 articles

VOP: OCR Tool for Extracting Complex Diagrams and Math Formulas

Comprehensive Introduction Versatile OCR Program is an open source Optical Character Recognition (OCR) tool designed specifically for working with complex academic and educational documents. It can extract text, tables, mathematical formulas, charts and diagrams from PDFs, images and other documents and generate...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

053.1K

An open source service that automatically parses PDF content and extracts text and tables

Comprehensive Introduction It can automatically analyze the layout of PDF documents, identify text, titles, images, tables, formulas and other elements in the page, and determine their correct order. The tool supports OCR functionality and can convert scanned PDF to searchable text. It runs on Docker and provides two models...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

060.2K

RolmOCR: Document OCR Model for Recognizing Handwritten and Slanted Characters

Comprehensive Introduction RolmOCR is an open source Optical Character Recognition (OCR) tool developed by the Reducto AI team, based on the Qwen2.5-VL-7B visual language model. It can extract text from images and PDF files faster than similar tools...

Latest AI Resources # AI Java Open Source Project # OCR

1 yr ago

065.3K

uniOCR: cross-platform open source text recognition tool

General Introduction uniOCR is an open source text recognition tool developed by mediar-ai team. It is based on the Rust language and supports macOS, Windows and Linux systems. Users can use it to extract text from pictures...

Latest AI Resources # AI Java Open Source Project # OCR

1 yr ago

081.8K

PDF Craft: open source tool for converting scanned PDF documents to Markdown

General Introduction PDF Craft is an open source tool designed for converting scanned PDFs of books to Markdown format. It was developed by oomol-lab and is hosted on GitHub for users who like to organize their eBooks. The tool works through this ...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

082.8K

SmolDocling: a compact visual language model for efficient document processing

Comprehensive Introduction SmolDocling is a Visual Language Model (VLM) developed by the ds4sd team in collaboration with IBM, built on SmolVLM-256M and hosted on the Hugging Face platform. It is small in size, only ...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

053.2K

Mistral OCR: 94.89% Overall Accuracy, 1000 Pages/30 Seconds, Only $1

In the long history of human civilization, every leap in the way information is acquired and parsed has profoundly driven social progress. From the ancient hieroglyphics, to the portable papyrus, to the later emergence of the printing press and today's wave of digitization, each technological innovation has greatly expanded the paradigm of human knowledge dissemination...

Latest AI Resources # AI Open Services # OCR # Document Extraction and Cleaning

1 yr ago

061.1K

Ollama OCR: Extracting Text from Images Using Visual Models in Ollama

Comprehensive Introduction Ollama OCR is a powerful Optical Character Recognition (OCR) toolkit that utilizes the state-of-the-art visual language model provided by the Ollama platform to extract text from images. The project is available both as a Python package and as a user-friendly Strea...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

0106.5K

STranslate: Lightweight Translation Tool with Multiple Translation Interfaces and OCR Capabilities

General Introduction STranslate is a ready-to-use translation and OCR tool developed with WPF. The tool is designed to provide efficient and convenient translation and Optical Character Recognition (OCR) functionality for a wide range of languages and text types. STranslate is open...

Latest AI Resources # AI Translation # OCR

1 yr ago

062.8K

VisionParser: OCR tool for high-precision processing of receipts and invoices, API available

General Description VisionParser is an OCR (Optical Character Recognition) tool designed for processing receipts and invoices. With advanced generative AI technology, VisionParser is able to quickly and accurately convert all kinds of receipts and invoices into structured data for...

Latest AI Resources # OCR

1 yr ago

058.7K

Chunkr: An All-in-One Service for Document Ingestion and Intelligent Chunking Based on Text Paragraph Hierarchy Using Visual Models

General Introduction Chunkr is a self-hosted API specialized in converting PDF, PPTX, DOCX and Excel files into data suitable for RAG (Retrieval-Augmented Generation) and LLM (Large Language Model) applications. The project was developed by Lumina...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

055.6K

Llama OCR: an OCR library that converts images to Markdown in three lines of code using the free Llama 3.2 Vision interface

General Introduction Llama OCR is an OCR (Optical Character Recognition) library based on Llama 3.2 Vision that converts documents to Markdown format. The library was developed by Nutlope and uses Together...

Latest AI Resources # AI Java Open Source Project # OCR # Free Large Model API

1 yr ago

063K

Docling: parses documents in many formats and exports them as Markdown and JSON, with OCR support for PDFs

Comprehensive Introduction Docling is a powerful document parsing and exporting tool that supports a wide range of document formats, including PDF, DOCX, PPTX, XLSX, Image, HTML, AsciiDoc and Markdown. It can parse and export these documents...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

0109.9K

ViTLP: Visually Guided Generative Text-Layout Pre-training Model for Extracting Structured Data from Typographically Complex PDF Documents

Comprehensive Introduction ViTLP (Visually Guided Generative Text-Layout Pre-training for Document Intelligence) is an open source project designed to pass...

Latest AI Resources # OCR # Document Extraction and Cleaning

1 yr ago

054.8K

ScreenPipe: records screen and activity information around the clock, converts it into a local knowledge base, and lets you discuss, summarize, and review that knowledge through an AI assistant

General Description ScreenPipe is an AI assistant tool developed by mediar-ai that specializes in recording screen content, capturing screenshots and audio 24/7. It combines rewind.ai and cursor.com's...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Notes # OCR

1 yr ago

067.6K

Text Extraction API (text-extract-api): a PDF extraction tool with vision-based text extraction and anonymization

Comprehensive Introduction The Text Extraction API (text-extract-api) is a powerful tool designed to extract and parse content from a variety of document formats (e.g. PDF, Word, PPTX, etc.). The API utilizes state-of-the-art Optical Character Recognition (OCR) technology and Ol...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

1 yr ago

058.1K

Image to Excel Free Tool: Efficiently Recognize Complex Format Tables in Images and Convert to Excel File

General Description Image to Excel Free Tool is an efficient online tool that quickly and accurately recognizes tabular data in images and converts it to Excel files. The tool supports a wide range of image formats, such as JPG and PNG, and can be used on web pages, iOS apps and Android apps...

Latest AI Resources # OCR

1 yr ago

079.1K

Datalab: dedicated OCR AI models, PDF to Markdown (open source/API)

Comprehensive Introduction Datalab offers a range of advanced AI models focused on OCR, layout analysis, PDF to Markdown, and more. These models are not only high performing, but also easy to use and open source. The Marker models on the platform can quickly and accurately...

Latest AI Resources # AI Open Services # AI Java Open Source Project # OCR

1 yr ago

066.8K

eSearch: Multi-functional cross-platform OCR tool, integrated search | translation | search map | screen recording and other functions

General Introduction eSearch is an open source cross-platform screenshot tool developed by xushengfeng that supports Windows, macOS and Linux systems. It integrates a variety of features, including screenshot, OCR recognition, search, translation, mapping...

Latest AI Resources # OCR

2 yrs ago

059.3K

Surya: professional multilingual document OCR tool, open source with local deployment

Comprehensive Introduction Surya is an open source multilingual document OCR toolkit that supports text recognition in over 90 languages. It is capable of not only line-by-line text detection, but also layout analysis, reading order detection, and table recognition. Surya's performance rivals that of cloud services for all types of...

Latest AI Resources # AI Java Open Source Project # OCR

2 yrs ago

0121K

MinerU: extracts PDF documents and converts them to multimodal Markdown, with OCR support for scanned e-books

Comprehensive Introduction MinerU is an open source data extraction tool developed by the OpenDataLab team at the Shanghai Artificial Intelligence Laboratory, focusing on efficiently extracting content from complex PDF documents, web pages, and eBooks. It can take multimodal PDFs containing images, formulas, tables and other elements...

Latest AI Resources # AI Java Open Source Project # OCR # Document Extraction and Cleaning

2 yrs ago

0141.6K

PixPin: long and dynamic screenshots, built-in native text recognition (OCR)

General Description PixPin is a powerful screenshot and posting tool designed to enhance users' productivity. Whether for daily office or professional needs, PixPin provides convenient screenshot, paste, long screenshot, text recognition (OCR) and dynamic screenshot functions. Its simple interface and...

Latest AI Resources # OCR

2 yrs ago

0112.5K

GOT-OCR2.0: end-to-end multimodal OCR model based on QWen2 0.5B

Comprehensive Introduction GOT-OCR2.0 is an open source Optical Character Recognition (OCR) model co-proposed by StepFun, which aims to drive OCR technology towards OCR-2.0 through a unified end-to-end model. The model supports a wide range of OCR tasks, including normal text recognition, gr...

Latest AI Resources # AI Java Open Source Project # OCR

2 yrs ago

066.2K

PaddleOCR: A multilingual OCR toolkit based on PaddlePaddle, supporting recognition of more than 80 languages

Comprehensive Introduction PaddleOCR is a multilingual OCR toolkit based on PaddlePaddle, designed to provide a practical and ultra-lightweight OCR system. It supports the recognition of more than 80 languages and provides data annotation and synthesis tools to support the service...

Latest AI Resources # AI Java Open Source Project # OCR

1 yr ago

088.3K

Pix2Text: open source free image text recognition tool

Pix2Text General Introduction Pix2Text (P2T) is an open source free tool designed to replace Mathpix, providing image text and mathematical formula recognition. Users can use the tool free of charge via the web version to recognize up to 10,000 per day...

Latest AI Resources # OCR

2 yrs ago

071.9K

Umi-OCR: open source offline OCR software, batch image recognition and PDF recognition

Umi-OCR Comprehensive Introduction Umi-OCR is an open source, free offline OCR software that supports screenshots, batch image import, PDF document recognition, exclusion of watermarks, headers and footers, and scanning and generating QR codes. The software has a built-in multi-language library for Windows and Li...

Latest AI Resources # OCR

2 yrs ago

0103.7K

TTime: Picture Your Text Recognition and Text Translation Software

TTime General Introduction TTime, a project published by InkTimeRecord on GitHub, is a simple and efficient translation tool. It mainly provides input, screenshot, text-selection and hoverball translation functions, supports multiple translation sources and text recognition services...

Latest AI Resources # AI Translation # OCR

2 yrs ago

054.9K
