
Knowledge Table: an open source tool for efficient extraction and exploration of structured data

General Introduction

Knowledge Table is an open source project designed to simplify extracting and exploring structured data from unstructured documents. Users create structured knowledge representations, such as tables and graphs, through a natural language query interface. The tool supports custom extraction rules and formatting options, and keeps data traceable by displaying data sources in the user interface. Knowledge Table gives business users a familiar spreadsheet interface and gives developers a flexible, highly configurable backend for a variety of data processing needs.


Feature List

  • Natural language extraction: Extract structured data from unstructured documents using natural language queries.
  • Custom extraction rules: Define extraction rules to ensure data quality.
  • Format control: Control the output format of the extracted data.
  • Document filtering: Filter documents based on metadata or extracted data.
  • CSV or triples export: Download the extracted data as CSV or as graph triples.
  • Chained extraction: Reference previous columns in questions.
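
Two of the features above, chained extraction and triples export, are easier to picture with a small example. The following is only a minimal sketch of the ideas, not Knowledge Table's own code: the `{Column}` placeholder syntax, the function names, and the sample rows are all hypothetical and chosen for illustration.

```python
import csv
import io


def resolve_question(template: str, row: dict[str, str]) -> str:
    """Fill {Column} placeholders with values already extracted for this row.

    The {Column} syntax is an assumption made for this sketch.
    """
    return template.format(**row)


def rows_to_triples(rows: list[dict[str, str]], key_column: str) -> list[tuple[str, str, str]]:
    """Turn table rows into (subject, predicate, object) triples.

    The key column supplies the subject, every other column name acts as the
    predicate, and empty cells are skipped.
    """
    triples = []
    for row in rows:
        subject = row[key_column]
        for column, value in row.items():
            if column != key_column and value:
                triples.append((subject, column, value))
    return triples


def rows_to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the same rows as CSV text, with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample table: one row per document, one column per question.
rows = [
    {"Document": "contract_a.pdf", "Supplier": "Acme Corp", "Renewal date": "2025-01-31"},
    {"Document": "contract_b.pdf", "Supplier": "Globex", "Renewal date": ""},
]

# Chained extraction: a later question reuses the value of an earlier column.
question = resolve_question("What is the notice period agreed with {Supplier}?", rows[0])

# Triples export: each filled cell becomes one edge of a graph.
triples = rows_to_triples(rows, key_column="Document")
```

In this sketch a triple is simply one filled cell read as "document, column, value", which is why the same table can be exported either as CSV or as a graph.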

 

Usage Help

Installation and operation

  1. Running with Docker:
    • Ensure that Docker and Docker Compose are installed.
    • Launch the application with the command docker-compose up -d --build
    • Access the front end at http://localhost:3000 and the back end at http://localhost:8000.
  2. Running locally:
    • Clone the code base: git clone https://github.com/yourusername/knowledge-table.git
    • Go to the backend directory, create a virtual environment, and install the dependencies:
      cd knowledge-table/backend/
      python3 -m venv venv
      source venv/bin/activate  # On Windows, use venv\Scripts\activate
      pip install -r requirements.txt
      
    • Start the back-end service:
      cd src/
      python -m uvicorn knowledge_table_api.main:app
      
  3. Front-end setup:
    • Go to the front-end directory, install the dependencies, and start the front end:
      cd ../frontend/
      curl https://bun.sh/install | bash  # Install Bun
      bun install
      bun start
      
    • The front end is then available at http://localhost:5173.

Usage Process

  • Upload a document: Upload unstructured documents to Knowledge Table; the system splits them into chunks and stores them in a vector database.
  • Set questions and rules: Define the type of data to extract and the corresponding questions; the system processes the documents based on this information.
  • View results: Once processing is complete, view the structured output and make adjustments as needed.
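
The chunking step in the upload stage can be sketched as follows. The article does not say how Knowledge Table actually splits documents, so this example assumes a simple character-based sliding window with overlap; the function name and the default sizes are hypothetical, and embedding the chunks and writing them to the vector database are left out.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    The overlap keeps a sentence that straddles a boundary visible in both
    neighbouring chunks. Sizes are illustrative, not the tool's defaults.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example so the windows are easy to follow by eye.
example_chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
```

Each chunk would then be embedded and stored so that a question can be answered from the most relevant passages rather than from the whole document.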

Notes

  • Follow the relevant laws and regulations, and avoid infringing on the rights of others.
  • Validate extracted data regularly to ensure that it is accurate and up to date.
May not be reproduced without permission: Chief AI Sharing Circle » Knowledge Table: an open source tool for efficient extraction and exploration of structured data