General Introduction
PaddleOCR is a multilingual OCR toolkit built on PaddlePaddle, designed to provide a practical, ultra-lightweight OCR system. It recognizes more than 80 languages, provides data annotation and synthesis tools, and supports training and deployment on servers, mobile devices, embedded devices, and IoT devices. PaddleOCR integrates text image correction, layout area detection, regular text detection, stamp text detection, text recognition, table recognition, and other features, which significantly reduces development costs. It also supports high-performance inference, service-oriented deployment, and on-device deployment.
Function List
- Multilingual recognition: Supports text recognition in over 80 languages.
- Data annotation and synthesis tools: Provides convenient data labeling and synthesis tools to help generate training data quickly.
- Text Image Correction: Integrated text image correction function to improve recognition accuracy.
- Layout area detection: Supports high-precision layout area detection for parsing complex documents.
- Table recognition: Provides table recognition that extracts table data accurately.
- Stamp Text Detection: Supports the detection and recognition of stamped text.
- High-performance inference: Supports high-performance inference for real-time applications.
- Multiple deployment options: Supports deployment on servers, mobile devices, embedded devices, and IoT devices.
- Low-code development: Provides low-code, full-process development tools that lower the barrier to entry and improve development efficiency.
Usage Guide
Installation process
- Environment preparation:
- Ensure that Python 3.6 or later is installed.
- Install the PaddlePaddle framework with the following command:
pip install paddlepaddle
- Install PaddleOCR:
pip install paddleocr
- Download the model:
- Download the pre-trained model from the official repository. Refer to the official documentation for the specific download links and commands.
- Run the example:
- Use the following command to run the OCR example:
paddleocr --image_dir ./doc/imgs/11.jpg --det_model_dir ./inference/ch_ppocr_mobile_v2.0_det_infer --rec_model_dir ./inference/ch_ppocr_mobile_v2.0_rec_infer --cls_model_dir ./inference/ch_ppocr_mobile_v2.0_cls_infer
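Before running the example, the installation steps above can be sanity-checked from Python. The following is a minimal sketch, not part of PaddleOCR: `check_environment` is a hypothetical helper, and it assumes the `paddlepaddle` package is importable under the module name `paddle`. It only looks the modules up without importing them.

```python
import sys
from importlib.util import find_spec


def check_environment(min_python=(3, 6)):
    """Report whether the prerequisites from the installation steps look present."""
    return {
        # Python version requirement stated in the installation steps.
        "python_ok": sys.version_info >= min_python,
        # Assumption: the paddlepaddle package installs the module "paddle".
        "paddle_installed": find_spec("paddle") is not None,
        "paddleocr_installed": find_spec("paddleocr") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If any entry prints `no`, rerun the corresponding `pip install` command before trying the OCR example.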
Functional operation flow
- Text recognition:
- Prepare the image file to be recognized.
- Use the paddleocr command-line tool or the Python API for recognition.
- Sample code:
from paddleocr import PaddleOCR, draw_ocr
import matplotlib.pyplot as plt
import cv2

ocr = PaddleOCR(use_angle_cls=True, lang='ch')
img_path = 'path/to/your/image.jpg'
result = ocr.ocr(img_path, cls=True)
# Recent PaddleOCR 2.x releases return one list of lines per input image;
# older releases return the lines directly, in which case use `result` as-is.
lines = result[0]
for line in lines:
    print(line)

# Visualize the results
image = cv2.imread(img_path)
boxes = [line[0] for line in lines]
txts = [line[1][0] for line in lines]
scores = [line[1][1] for line in lines]
im_show = draw_ocr(image, boxes, txts, scores, font_path='path/to/your/font.ttf')
im_show = cv2.cvtColor(im_show, cv2.COLOR_BGR2RGB)
plt.imshow(im_show)
plt.show()
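Once the lines are recognized, a common next step is to discard low-confidence text. The sketch below assumes each recognized line has the form `[box, (text, score)]`, which is how the sample code indexes it. `filter_lines` and the `min_score` threshold are hypothetical names chosen for illustration, and the sample data is hand-written rather than real OCR output.

```python
def filter_lines(lines, min_score=0.5):
    """Keep (text, score) pairs whose confidence is at least min_score."""
    kept = []
    for _box, (text, score) in lines:
        if score >= min_score:
            kept.append((text, float(score)))
    return kept


# Hand-written sample in the assumed [box, (text, score)] layout.
sample = [
    [[[10, 10], [90, 10], [90, 30], [10, 30]], ("Invoice", 0.98)],
    [[[10, 40], [90, 40], [90, 60], [10, 60]], ("T0tal", 0.41)],
]

print(filter_lines(sample))  # [('Invoice', 0.98)]
```

The threshold is application-specific; 0.5 here is only a placeholder.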
- Table recognition:
- Prepare the image file containing the table.
- Use the paddleocr command-line tool or the Python API for table recognition.
- Sample code:
from paddleocr import PPStructure, draw_structure_result
import matplotlib.pyplot as plt
import cv2

table_engine = PPStructure(show_log=True)
img_path = 'path/to/your/table_image.jpg'
# Load the image once and reuse it for both recognition and visualization
image = cv2.imread(img_path)
result = table_engine(image)
for line in result:
    print(line)

# Visualize the results
im_show = draw_structure_result(image, result, font_path='path/to/your/font.ttf')
im_show = cv2.cvtColor(im_show, cv2.COLOR_BGR2RGB)
plt.imshow(im_show)
plt.show()
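To work with the extracted table as data, the recognized structure usually has to be turned into rows and cells. The sketch below rests on an assumption about PaddleOCR 2.x: that each table region in the result is a dict with `type == 'table'` and the table markup stored under `res['html']`. The helpers `tables_from_result` and `table_html_to_rows` are hypothetical and use only the Python standard library.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cell text of an HTML table as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def table_html_to_rows(html):
    """Parse table markup into a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


def tables_from_result(result):
    """Assumed result layout: table regions keep their markup in res['html']."""
    return [r["res"]["html"] for r in result if r.get("type") == "table"]


# Hand-written stand-in for a recognition result, not real model output.
fake_result = [
    {"type": "text", "res": []},
    {"type": "table", "res": {"html": "<table><tr><td>Item</td><td>Qty</td></tr>"
                                      "<tr><td>Pen</td><td> 2 </td></tr></table>"}},
]

for html in tables_from_result(fake_result):
    print(table_html_to_rows(html))  # [['Item', 'Qty'], ['Pen', '2']]
```

The row lists can then be written out with the `csv` module or loaded into a spreadsheet library.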
- Layout area detection:
- Prepare image files containing complex layouts.
- Use the paddleocr command-line tool or the Python API for layout area detection.
- Sample code:
from paddleocr import PPStructure
import matplotlib.pyplot as plt
import cv2

# Assumes PaddleOCR 2.x: with table and OCR disabled, only layout detection runs
layout_engine = PPStructure(table=False, ocr=False, show_log=True)
img_path = 'path/to/your/layout_image.jpg'
image = cv2.imread(img_path)
result = layout_engine(image)
for region in result:
    print(region['type'], region['bbox'])

# Visualize the results
for region in result:
    x1, y1, x2, y2 = [int(v) for v in region['bbox']]
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 0, 255), 2)
im_show = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(im_show)
plt.show()
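When parsing a complex document, the detected regions are often easier to use after they are put in reading order and grouped by kind. The following is a minimal sketch under the assumption that a layout detector returns each region as a dict with a `type` label and a `bbox` of the form `[x1, y1, x2, y2]`. The helper names are hypothetical, the sample regions are hand-written, and the top-to-bottom, left-to-right ordering is a simple heuristic that does not handle multi-column pages.

```python
def reading_order(regions):
    """Sort regions top-to-bottom, then left-to-right, by their top-left corner."""
    return sorted(regions, key=lambda r: (r["bbox"][1], r["bbox"][0]))


def group_by_type(regions):
    """Map each region type to its bounding boxes, kept in reading order."""
    groups = {}
    for region in reading_order(regions):
        groups.setdefault(region["type"], []).append(region["bbox"])
    return groups


# Hand-written regions in the assumed layout, not real detector output.
regions = [
    {"type": "text", "bbox": [40, 120, 500, 300]},
    {"type": "title", "bbox": [40, 30, 500, 80]},
    {"type": "table", "bbox": [40, 320, 500, 600]},
    {"type": "text", "bbox": [40, 620, 500, 700]},
]

print([r["type"] for r in reading_order(regions)])
# ['title', 'text', 'table', 'text']
```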