Analytics GBI (XiYan-SQL): Text-to-SQL Intelligent Data Analysis That Makes ChatBI Easy

General Introduction

Analytics GBI is an intelligent data analysis product built on large language models and launched by Alibaba Cloud Bailian. It uses natural language processing to let users query and analyze data in natural language without having to master complex SQL syntax. Analytics GBI supports several data sources, including databases that speak the MySQL or PostgreSQL protocol and Excel files, and it can be deployed in public cloud or hybrid mode. Its multi-agent framework dynamically schedules multiple agents according to task complexity, and the product pairs data analysis with automatic chart visualization to support decision-making and data insight.

Recommended open-source project: DB-GPT, an AI-native data application development framework that integrates multi-model management and intelligent data processing

XiYan-SQL: A Multi-Generator Integration Framework for Text to SQL


To address the performance challenges of large language models in natural-language-to-SQL tasks, we propose XiYan-SQL, a framework that employs a multi-generator ensemble strategy to improve candidate generation. We introduce M-Schema, a semi-structured schema representation that improves the model's understanding of database structure. To raise the quality and diversity of the generated candidate SQL queries, XiYan-SQL combines the potential of in-context learning (ICL) with the precise control of supervised fine-tuning. On the one hand, we propose a series of training strategies for fine-tuning models to generate high-quality candidates with diverse preferences. On the other hand, we implement an example selection method based on named entity recognition to avoid overemphasizing entities in the ICL approach. A refiner then optimizes each candidate by correcting logical or syntactic errors. To identify the best candidate, we fine-tune a selection model to distinguish subtle differences between candidate SQL queries. Experimental results on datasets covering multiple dialects show that XiYan-SQL is robust across different scenarios. Overall, XiYan-SQL achieves a competitive execution accuracy of 89.65% on the Spider test set, 69.86% on SQL-Eval, 41.20% on NL2GQL, and 72.23% on the Bird development benchmark. The framework improves the quality and diversity of SQL queries and outperforms previous approaches.

Source: https://github.com/XGenerationLab/XiYan-SQL
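The generate, refine, and select flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the XiYan-SQL implementation: the generators are hard-coded stand-ins for fine-tuned and ICL models, `refine` is a trivial string cleanup rather than a model-based refiner, and `select` swaps the paper's fine-tuned selection model for a simple majority vote over execution results. The `sales` table and all function names are hypothetical.

```python
import sqlite3
from collections import Counter
from typing import Callable, List, Optional, Tuple

# A generator maps a question to one candidate SQL string.
# Here each one is a hard-coded stand-in for an LLM call (assumption).
Generator = Callable[[str], str]


def execute(conn: sqlite3.Connection, sql: str) -> Optional[Tuple]:
    """Run a candidate; return its rows as an order-insensitive tuple, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def refine(sql: str) -> str:
    """Toy refiner: strips whitespace and a trailing semicolon only."""
    return sql.strip().rstrip(";")


def select(conn: sqlite3.Connection, candidates: List[str]) -> Optional[str]:
    """Majority vote on execution results (a substitute for a trained selector)."""
    valid = [(sql, res) for sql in candidates
             if (res := execute(conn, sql)) is not None]
    if not valid:
        return None
    winner, _ = Counter(res for _, res in valid).most_common(1)[0]
    return next(sql for sql, res in valid if res == winner)


def answer(conn: sqlite3.Connection, question: str,
           generators: List[Generator]) -> Optional[str]:
    candidates = [refine(g(question)) for g in generators]
    return select(conn, candidates)


# Hypothetical demo data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL, year INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, 10.0, 2023), (2, 5.0, 2023), (3, 7.0, 2022)])

generators = [
    lambda q: "SELECT SUM(amount) FROM sales WHERE year = 2023;",
    lambda q: "SELECT SUM(amount) FROM sales WHERE year IN (2023)",
    lambda q: "SELECT SUM(amount) FROM sale WHERE year = 2023",  # wrong table name
    lambda q: "SELECT SUM(amount) FROM sales",                   # missing filter
]

best = answer(conn, "What were total sales in 2023?", generators)
print(best)
```

Two of the four candidates agree on the result, one fails to execute, and one returns a different total, so the vote picks the first agreeing candidate. A learned selector, as the abstract describes, is meant to handle cases where the majority is wrong or candidates differ only subtly.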


Function List

  • Natural language dialog: Query and analyze data in natural language, without having to master SQL syntax.
  • Multiple data source support: Connects to MySQL and PostgreSQL protocol databases and to Excel files.
  • Intelligent task scheduling: The multi-agent framework dynamically schedules task execution based on task complexity.
  • Intelligent chart visualization: Generates charts suited to the characteristics of the data to visualize analysis results.
  • Business logic explanation: Lets you describe business logic so that the large model understands your business scenario.
  • Data table management: Manage data table information to improve query accuracy.
  • Self-managed case library: Improve model results by adding and maintaining cases that guide the model.
  • Secure deployment: Supports public cloud and hybrid deployment modes to protect data.

 

Usage Help

Installation and Configuration

  1. Register and log in: Go to the Analytics GBI official website, then register and log in with your Alibaba Cloud account.
  2. Create a project: After logging in, open the console, click "Create Project", fill in the project name and description, and select the data source type.
  3. Connect a data source: Fill in the connection information for the selected data source type (database URL, user name, password, and so on) to complete the connection.
  4. Configure agents: In the project settings, configure the agents' task scheduling and execution strategy, and select an appropriate agent model.
  5. Deploy and test: When the configuration is complete, click "Deploy" and the system deploys automatically. After deployment finishes, run a test from the console to confirm that the configuration is correct.

Usage Guidelines

  1. Natural language query: Enter a request in the console input box, such as "Query 2023 sales data". The system generates the SQL automatically and returns the query results.
  2. Intelligent chart generation: On the query result page, click "Generate Chart". The system generates a chart that fits the characteristics of the data, and you can switch between chart types.
  3. Multi-turn dialog: You can add to, modify, or follow up on a query over several turns, and the system responds according to the conversation context.
  4. Business logic explanation: During a query, you can add business logic explanations to help the system understand the query intent more accurately.
  5. Case library management: You can add, modify, and manage cases in the case library. The cases guide the model and improve its accuracy.
  6. Data table management: In the "Data Table Management" module of the console, you can view and manage data table information, including table structure and column information, to help the system understand questions more accurately.
  7. Security settings: In the project settings, you can configure data security policies, including VPC access and data encryption, to protect data in transit and at rest.
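To make step 1 concrete, the sketch below shows the kind of SQL that a request such as "Query 2023 sales data" might translate to. Everything here is an assumption for illustration: the `sales` table, its columns, and the SQL itself are hypothetical, SQLite stands in for a MySQL or PostgreSQL source, and the SQL that Analytics GBI actually generates depends on your schema, business logic explanations, and cases.

```python
import sqlite3

# Hypothetical table and rows; not taken from the product.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2022-12-30", 80.0),
        (2, "2023-01-15", 120.0),
        (3, "2023-02-03", 60.0),
        (4, "2023-02-20", 40.0),
    ],
)

# One plausible translation of "Query 2023 sales data":
# monthly totals for orders dated within 2023.
generated_sql = """
SELECT strftime('%Y-%m', order_date) AS month, SUM(amount) AS total
FROM sales
WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'
GROUP BY month
ORDER BY month
"""

rows = conn.execute(generated_sql).fetchall()
for month, total in rows:
    print(month, total)
```

The date filter works as a plain string comparison only because the dates are stored as YYYY-MM-DD text, which sorts in chronological order. A request this short is also ambiguous (monthly totals, raw rows, or a single sum), which is the kind of gap that business logic explanations and follow-up turns are meant to close.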

Common Problems

  • Database connection error: Check that the URL format is correct and that the URL is reachable from the public network. Also check the database's IP access restrictions and make sure the public IP of Analytics GBI is on the whitelist.
  • Poor query results: Split a complex question into several simple ones, add table descriptions and schema information, add business logic explanations, and add cases to the case library.
  • Date formatting error: Use the YYYY-MM-DD format and state the date format in the column description.
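The first and last checks above can be run locally before you touch the console. The following is a minimal sketch using only the Python standard library. It assumes a `scheme://user:password@host:port/database` style URL, which may not match the exact format the console expects, and the list of input date formats is an assumption as well. The function names are hypothetical.

```python
import ipaddress
from datetime import datetime
from typing import List
from urllib.parse import urlparse

# Assumed input formats that may appear in source data.
KNOWN_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y", "%Y%m%d")


def check_db_url(url: str) -> List[str]:
    """Return a list of likely problems with a database URL (empty if none found)."""
    problems = []
    parts = urlparse(url)
    if parts.scheme not in ("mysql", "postgresql"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        problems.append("missing host")
        return problems
    try:
        parts.port  # accessing .port raises ValueError if it is malformed
    except ValueError:
        problems.append("port is not a valid number")
    if host == "localhost":
        problems.append(f"host is not publicly reachable: {host}")
    else:
        try:
            if not ipaddress.ip_address(host).is_global:
                problems.append(f"host is not publicly reachable: {host}")
        except ValueError:
            pass  # a DNS name; resolving it is out of scope for this sketch
    return problems


def normalize_date(value: str) -> str:
    """Convert a date in one of KNOWN_FORMATS to YYYY-MM-DD."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


print(check_db_url("mysql://user:secret@10.0.0.5:3306/shop"))
print(normalize_date("2023/01/05"))
```

A private address such as 10.0.0.5 is flagged because a cloud service cannot reach it over the public network. The check cannot tell whether a firewall or whitelist blocks a public address, so that part still has to be verified on the database side.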