Experiment: Converting WordPress Site-Wide Content into an AI Assistant with a "Content Query Function"

Background

The Chief AI Sharing Circle has compiled a large number of "practical commands" and "AI tools". On the website, however, searching by keyword matching often fails to find the exact resource needed. The site is full of excellent video generation tools, and it is intolerable that they cannot be found.


 

Since I lack the ability to develop the website myself, I have to rely on external tools for search.

Using a search engine's "site search" solves the problem, but it is somewhat cumbersome and the site's content is not fully indexed.

For example, type this directly into the search engine: site:www.aisharenet.com SEO

 

I also lack the ability to build semantic search over the website content and give it a good interface, so the problem comes down to:

How to convert website content into an easily retrievable knowledge base.

 

 

Content analysis

For both AI tools and practical commands, the title area essentially describes the characteristics of the content clearly. The body area is more detailed but contains a lot of distracting text, which hurts retrieval quality. The body also contains images, which I would like to offer readers as a preview.

 

[Screenshot: example of AI tool content]

[Screenshot: example of practical command content]

 

 

Thinking about search strategies

 

1. Merge the title and content into a single passage for semantic retrieval

Pros: complete content

Cons: Too much content leads to imprecise searches

 

2. Retrieve on the title only, then use the matched title to pull in the corresponding content

Pros: accurate search

Cons: Reduced effective search scope

 

3. Feed the title and content to a large model to split them into QA pairs

Pros: greatly improves effective search coverage

Cons: higher processing cost and time; important content and structure of the original text are lost

PS: Even without any development experience, you can deploy the Dify project to generate QA pairs in batches; this is not demonstrated here.

 

4. Knowledge graph

Not suitable for this content, so I skip it.

I plan to rely on free, open agent-building platforms, and these do not support knowledge graphs anyway.

 

I choose strategy 2: it is simple and efficient. Although the effective retrieval range is reduced, it can be optimized incrementally through continuous iteration.

The body text doesn't really need to take part in retrieval at all; it is enough to retrieve titles semantically. This reduces the errors the large model produces when handling long contexts, and returning the URL lets readers view the complete content.

 

 

Search tool platform

 

Which third-party platform should implement the semantic search?

There are many free platforms on the market that support knowledge bases, such as Yuanqi, Zhipu, Coze, and Wenxin. Here I choose a platform that supports importing QA pairs for retrieval.

QA-pair retrieval: the user's query is matched against question A, the corresponding answer B is returned to the large model, and the model uses B as reference content to answer the user's question.
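
The QA-pair retrieval described above can be sketched roughly as follows. This is only an illustration, not how any of the platforms implement it: plain string similarity from Python's standard library stands in for real semantic matching, and the function names, threshold, and example.com URLs are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for semantic similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def retrieve(query, qa_pairs, threshold=0.5):
    """Return answer B of the best-matching question A, or None if nothing is close."""
    best = max(qa_pairs, key=lambda qa: similarity(query, qa[0]), default=None)
    if best is None or similarity(query, best[0]) < threshold:
        return None
    return best[1]


# Hypothetical knowledge base: question A is a title, answer B is its URL.
qa_pairs = [
    ("AI video generation tool", "https://example.com/video-tool/"),
    ("SEO prompt for article writing", "https://example.com/seo-prompt/"),
]

print(retrieve("video generation tool", qa_pairs))
```

The returned answer B would then be handed to the large model as reference content, rather than shown to the user directly.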

Which platform is better, or which has better semantic understanding, is not considered here; I assume their basic performance is adequate.

 

Where do users use it?

The channel I mainly promote is the WeChat public account, so users should be able to search from within the public account.

 

Zhipu is good, but I chose Wenxin Intelligent Body, which gives clearer operating instructions when handling QA rules. Wenxin Intelligent Body agents can also be published to Baidu to acquire users. Recommended reading: Killer traffic portal: using AI intelligent body to get external traffic for websites and public numbers in the long run

 

 

Operation Tutorial

 

1. Export XML files from WordPress

 

2. Convert XML to MD format

 

2.1 Download the blog2md project and unzip it to the directory D:\222\blog2md

 

2.2 Right-click in the root of the blog2md directory and open a shell terminal.

 

2.3 You will most likely need to install a dependency. Enter the following commands.

Install the dependency:
npm install xml2js

Verify the installation:
npm list xml2js

 

2.4 Name the exported XML file 111.xml, place it in the D:\222\blog2md directory, and execute the following command

node index.js w 111.xml out

 

2.5 The directory D:\222\blog2md\out has now been generated. Open it to verify that the generated content is correct.

 

 

3. Convert MD to Excel format

The MD files have a consistent structure, so they are easy to extract from. Here I had ChatGPT write a regular expression and ran it in Python.

I want to extract: the filename (the filename gives the URL, e.g. https://www.aisharenet.com/anse/), the title, and the content area (everything below the --- line)
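
A minimal sketch of this kind of extraction is shown below. It is not the author's script. It assumes each MD file starts with a front-matter block between two --- lines that contains a title: field, with the post body below the closing ---. The names parse_md and collect_rows are hypothetical.

```python
import re
from pathlib import Path

# Assumed layout: front matter between two '---' lines, then the post body.
FRONT_MATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*\n?(.*)\Z", re.DOTALL)
TITLE_LINE = re.compile(r"^title:\s*(.*)$", re.MULTILINE)


def parse_md(text):
    """Return (title, body) for one MD file, or None if the layout does not match."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return None
    header, body = match.group(1), match.group(2)
    title_match = TITLE_LINE.search(header)
    # Drop surrounding quotes that the converter may put around the title.
    title = title_match.group(1).strip().strip("'\"") if title_match else ""
    return title, body.strip()


def collect_rows(folder):
    """Build (filename, title, body) rows for every .md file in the folder."""
    rows = []
    for path in sorted(Path(folder).glob("*.md")):
        parsed = parse_md(path.read_text(encoding="utf-8"))
        if parsed is not None:
            title, body = parsed
            rows.append((path.stem, title, body))
    return rows
```

Rows collected this way could then be written to a spreadsheet with a library of your choice; that step is left out here to keep the sketch dependency-free.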

 

3.1 After the Python script runs, the file output.xlsx is generated in the current directory.

 

Script content:

Save the script under any name, for example 111.py, in any directory; here I put it in D:\222\blog2md.

Execute it from the command line (the default command line cannot execute 111.py directly; you must add the .\ prefix):

.\111.py

 

The script code is as follows; please save it as 111.py (generated by ChatGPT)

Directory to read md files from: folder_path = "D:\\222\\blog2md\\out"

Excel file generated in the current directory: output_file = "output.xlsx"


3.2 Tidy output.xlsx into the knowledge base file to be uploaded

Here I keep only the title and build the full URL from the filename.
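
The URL step can be sketched as below, assuming the rows are (filename, title, content) tuples and that a full URL is the site root plus the filename plus a trailing slash, as in the example https://www.aisharenet.com/anse/ from step 3. The function names and the sample title are hypothetical.

```python
BASE_URL = "https://www.aisharenet.com"  # site root, taken from the example URL in step 3


def build_url(slug, base=BASE_URL):
    """Join the site root and a filename into a full URL with a trailing slash."""
    return f"{base.rstrip('/')}/{slug.strip('/')}/"


def to_knowledge_rows(rows):
    """Keep only (title, url) from (filename, title, content) rows."""
    return [(title, build_url(slug)) for slug, title, _content in rows]


print(build_url("anse"))
print(to_knowledge_rows([("anse", "Sample title", "long body text")]))
```

Dropping the content column here matches the chosen strategy: only titles take part in retrieval, and the URL is what gets returned to the reader.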

 

4. Upload the knowledge base to Wenxin Intelligent Body

 

4.1 Open Wenxin Intelligent Body and upload the knowledge base

 

4.2 Upload the Excel file

 

4.3 Customize the search columns (this is why I use Wenxin Intelligent Body; other tools lack this option)

 

For more tips on organizing your knowledge base, read: Wenxin Intelligent Body Tutorial: (4) Processing Documents and Synchronizing to the Knowledge Base

 

5. Create the agent and publish it for use

 

5.1 Create the agent

I configure it simply here without getting bogged down in specifics. Start creating the agent...

You can also try creating the agent in low-code mode and adding judgment logic across multiple knowledge bases, since the site has many channels. I will not demonstrate that here; readers interested in low-code can read: Wenxin Intelligent Body Tutorial: (V) Organize Intelligent Body Workflow

 

5.2 Configure the agent

Turn off the functions unrelated to the knowledge base to avoid anomalies. I leave the other settings at their defaults without fine-tuning.

 

Briefly test the hit rate of knowledge base recall; otherwise it is easy to match irrelevant content.
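
One way to make such a check repeatable is to keep a small list of test queries with the result you expect for each, then compute the fraction that come back correctly. The sketch below assumes a retrieve callable that returns one result per query. It is a hypothetical stand-in, since the platform's own retrieval interface is not described here, and the sample data uses made-up example.com URLs.

```python
def hit_rate(test_cases, retrieve):
    """Fraction of (query, expected) pairs for which retrieve(query) == expected."""
    if not test_cases:
        return 0.0
    hits = sum(1 for query, expected in test_cases if retrieve(query) == expected)
    return hits / len(test_cases)


# Hypothetical stand-in for knowledge base recall: an exact-match lookup table.
fake_index = {
    "video tool": "https://example.com/video-tool/",
    "seo prompt": "https://example.com/seo-prompt/",
}

test_cases = [
    ("video tool", "https://example.com/video-tool/"),
    ("seo prompt", "https://example.com/seo-prompt/"),
    ("image tool", "https://example.com/image-tool/"),  # not in the index, counts as a miss
]

print(hit_rate(test_cases, fake_index.get))
```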

 

5.3 Debugging and Previewing the Output

 

5.4 Publish the agent

 

Final result

In the end, you get an agent that can quickly look up AI tools from within the WeChat public account, all for free! In addition, through the Wenxin Intelligent Body distribution channel (Wenxin Intelligent Body Platform: Intelligent Body Applications Built on Complete Distribution Channels and Commercial Closures), this tool will be published to the Baidu home page for users to access.

May not be reproduced without permission: Chief AI Sharing Circle » Experiment: convert WordPress site-wide content into "content query function" of the AI assistant
