
Surya: professional multilingual document OCR tool, open source native deployment

General Introduction

Surya is an open source multilingual document OCR toolkit that supports text recognition in more than 90 languages. Beyond line-by-line text detection, it also performs layout analysis, reading order detection, and table recognition. Its performance rivals that of cloud services across a wide range of document types, including PDFs, images, Word documents, and PPTs. The toolkit is designed to provide a comprehensive document parsing solution.

Hosted API: https://www.datalab.to/ (for PDFs, images, Word documents, and PowerPoint files)


Feature List

  • OCR: text recognition in more than 90 languages
  • Line-by-line text detection: automatically identifies the location of each line of text in a document
  • Layout analysis: detects tables, images, headings, and other elements in a document
  • Reading order detection: identifies the reading order in a document
  • Table recognition: detects rows and columns in a table

Usage Guide

Installation process

  1. Make sure Python 3.9+ and PyTorch are installed.
  2. If you are not using a Mac or GPU machine, you may need to install the CPU version of torch first.
  3. Use the following command to install Surya:
    pip install surya-ocr
    
  4. The first time you run Surya, the model weights are automatically downloaded.
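
The prerequisites in step 1 can be checked before running pip. The following is a minimal sketch, not part of Surya: the helper name check_prerequisites is hypothetical, and it only verifies the Python version and whether a torch package can be found, without importing it.

```python
import importlib.util
import sys

# Minimum Python version taken from the installation steps above.
MIN_PYTHON = (3, 9)


def check_prerequisites(min_python=MIN_PYTHON):
    """Return a list of problems; an empty list means the basics are in place.

    Hypothetical helper for illustration only, not a Surya function.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python {required}+ is required")
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch (torch) is not installed")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("Missing prerequisite:", issue)
    else:
        print("Prerequisites look fine; run: pip install surya-ocr")
```

If the script reports that PyTorch is missing on a machine without a GPU, installing the CPU build of torch first, as step 2 suggests, is the likely fix.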

Usage Process

  1. Review the settings in surya/settings.py. Any setting can be overridden with an environment variable.
  2. Surya automatically detects the torch device, but you can override it manually. Example:
    TORCH_DEVICE=cuda
    
  3. Use the following command to run the OCR application:
    python run_ocr_app.py
    
  4. When processing a document, choose the functional module you need, such as text detection or layout analysis.
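
The override described in step 2 follows a common precedence rule: an explicit environment variable wins, otherwise the device is detected automatically. The sketch below illustrates that rule only. It is not Surya's actual code; the function resolve_torch_device and its cuda_available and mps_available flags are hypothetical stand-ins for whatever detection the library performs internally.

```python
import os


def resolve_torch_device(env=None, cuda_available=False, mps_available=False):
    """Pick a device name, letting TORCH_DEVICE override auto-detection.

    Hypothetical illustration of the precedence rule, not a Surya function.
    """
    env = os.environ if env is None else env
    override = env.get("TORCH_DEVICE", "").strip()
    if override:
        # An explicit setting always takes priority over detection.
        return override
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


if __name__ == "__main__":
    # With no override and no accelerator flags, the fallback is the CPU.
    print(resolve_torch_device(env={}))
    # With an override, the detected hardware is ignored.
    print(resolve_torch_device(env={"TORCH_DEVICE": "cuda"}, mps_available=True))
```

In practice the variable is set in the shell before launching the application, for example by prefixing the run command from step 3 with TORCH_DEVICE=cuda.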

Functional operation flow

  1. OCR:
    • Load documents (PDFs, images, etc.).
    • Select language (more than 90 languages are supported).
    • Run OCR recognition to extract the text content.
  2. Line-by-line text detection:
    • Load the document.
    • Run line-by-line text detection to get the position of each line of text.
    • Export the detection results.
  3. Layout analysis:
    • Load the document.
    • Run layout analysis to detect elements such as tables, images, and headings.
    • Export the analysis results.
  4. Reading order detection:
    • Load the document.
    • Run reading order detection to identify the reading order in the document.
    • Export the detection results.
  5. Table recognition:
    • Load the document.
    • Run Table Recognition to detect rows and columns in a table.
    • Export the recognition results.
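
Each flow above ends with an export step. The sketch below shows one way such results could be written to JSON. The DetectedLine structure (page index, bounding box, optional text) and the export_results helper are assumptions made for illustration; Surya's real output objects may be shaped differently, so check the official documentation before relying on any field name.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class DetectedLine:
    """Hypothetical container for one detected line; not a Surya class."""
    page: int
    bbox: tuple  # assumed order: (x1, y1, x2, y2) in pixels
    text: str = ""


def export_results(lines, path):
    """Write the detected lines to a UTF-8 JSON file and return the count."""
    payload = [asdict(line) for line in lines]
    # ensure_ascii=False keeps non-Latin scripts readable in the output file.
    Path(path).write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return len(payload)


if __name__ == "__main__":
    sample = [
        DetectedLine(page=0, bbox=(10.0, 20.0, 200.0, 40.0), text="Hello"),
        DetectedLine(page=0, bbox=(10.0, 50.0, 180.0, 70.0), text="世界"),
    ]
    count = export_results(sample, "results.json")
    print(f"Exported {count} lines")
```

JSON is used here because it preserves the nesting of pages, boxes, and text; a flat format such as CSV would also work for line-level output.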

Surya provides a rich set of document parsing functions, and users can pick the modules that fit their needs. For detailed procedures and setup instructions, refer to the official documentation and sample code.

May not be reproduced without permission: Chief AI Sharing Circle » Surya: professional multilingual document OCR tool, open source native deployment
