Many readers use knowledge graphs to improve recall or as long-term memory for large models, but are unfamiliar with how knowledge graphs are actually constructed in the large-model era, so here is a brief primer; a production implementation will differ in many more details.
At a high level the process is simple: the large model is responsible for three main pieces of it: relationship extraction, query construction, and question answering.
In practice there are many more details to handle, as the following example shows (it optimizes the intermediate query process).
Knowledge graph prompt construction
# Step 1: Question Analysis Prompt
QUESTION_ANALYSIS_PROMPT = """
You are an expert query analyzer. Your task is to analyze the given question and extract key search terms.

Input Question: {query}

Follow these steps:
1. Identify main entities (nouns, proper nouns)
2. Extract important attributes (adjectives, descriptors)
3. Identify relationship indicators (verbs, prepositions)
4. Note any temporal or conditional terms

Format your output as:
{
  "main_entities": ["entity1", "entity2", ...],
  "attributes": ["attr1", "attr2", ...],
  "relationships": ["rel1", "rel2", ...],
  "conditions": ["cond1", "cond2", ...]
}

Ensure each term is:
- Specific and relevant
- In its base/root form
- Without duplicates

Example:
Q: "Who created the Python programming language?"
{
  "main_entities": ["Python", "programming language"],
  "attributes": ["created"],
  "relationships": ["create", "develop"],
  "conditions": []
}
"""

# Step 2: Synonym Expansion Prompt
SYNONYM_EXPANSION_PROMPT = """
You are an expert synonym expansion system. Your task is to expand each term with relevant alternatives.

Input Terms: {terms}

For each term, provide:
1. Exact synonyms
2. Related terms
3. Common variations
4. Abbreviations/acronyms
5. Full forms
6. Common misspellings

Rules:
- Include industry-standard terminology
- Consider different naming conventions
- Include both formal and informal terms
- Maintain semantic equivalence

Format your output as:
{
  "term": {
    "synonyms": ["syn1", "syn2", ...],
    "variations": ["var1", "var2", ...],
    "abbreviations": ["abbr1", "abbr2", ...],
    "related_terms": ["rel1", "rel2", ...]
  }
}

Example:
Input: "Python"
{
  "Python": {
    "related_terms": ["CPython", "Jython", "IronPython"]
  }
}
"""

# Step 3: Query Construction Prompt
QUERY_CONSTRUCTION_PROMPT = """
You are an expert in constructing graph database queries. Your task is to create an optimized search pattern.

Input:
- Primary Terms: {primary_terms}
- Expanded Terms: {expanded_terms}
- Relationships: {relationships}

Generate a query pattern that:
1. Prioritizes exact matches
2. Includes synonym matches
3. Considers relationship patterns
4. Handles variations in terminology

Rules:
- Start with the most specific terms
- Include all relevant relationships
- Consider bidirectional relationships
- Limit path length appropriately

Format your output as:
{
  "exact_match_patterns": ["pattern1", "pattern2", ...],
  "fuzzy_match_patterns": ["pattern1", "pattern2", ...],
  "relationship_patterns": ["pattern1", "pattern2", ...],
  "priority_order": ["high", "medium", "low"]
}

Example:
{
  "fuzzy_match_patterns": ["MATCH (n) WHERE n.name =~ '(?i).*python.*'"],
  "relationship_patterns": ["MATCH (creator)-[:CREATED]->(lang)", "MATCH (lang)-[:TYPE_OF]->(prog_lang)"],
  "priority_order": ["exact_name_match", "fuzzy_name_match", "relationship_match"]
}
"""

# Step 4: Result Ranking Prompt
RESULT_RANKING_PROMPT = """
You are an expert in ranking and ordering search results. Your task is to score and rank the retrieved matches.

Input Results: {query_results}
Original Query: {original_query}

Ranking Criteria:
1. Relevance to the original query
2. Match quality (exact vs. partial)
3. Relationship distance
4. Information completeness
5. Source reliability

Score each result on:
- Relevance (0-10)
- Confidence (0-10)
- Completeness (0-10)
- Path length penalty (-1 per hop)

Format your output as:
{
  "ranked_results": [
    {
      "result": "result_content",
      "relevance_score": score,
      "confidence_score": score,
      "completeness_score": score,
      "final_score": score,
      "reasoning": "explanation"
    }
  ],
  "summary": {
    "total_results": number,
    "high_confidence_count": number,
    "average_score": number
  }
}
"""
From the prompts above you can see that the intermediate process of building a knowledge graph involves many details. Since the prompts alone may not make this obvious, a worked input/output example follows for reference.
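Chained together, the four prompts above form one retrieval pipeline. The sketch below shows only the control flow; `call_llm`, `run_graph_query`, and the abbreviated templates are placeholders I am assuming, and it further assumes each step returns valid JSON.

```python
import json

# Abbreviated stand-ins for the four prompt templates above; only the
# placeholders matter for the control flow sketched here.
STEP_PROMPTS = {
    "analyze": "Analyze the question and return JSON.\nInput Question: {query}",
    "expand": "Expand each term and return JSON.\nInput Terms: {terms}",
    "build": "Build graph query patterns and return JSON.\nPrimary Terms: {terms}",
    "rank": "Rank results and return JSON.\nInput Results: {results}\nOriginal Query: {query}",
}

def kg_retrieve(query, call_llm, run_graph_query):
    """Chain the four steps: analyze -> expand -> build -> execute -> rank."""
    analysis = json.loads(call_llm(STEP_PROMPTS["analyze"].format(query=query)))
    expansions = json.loads(call_llm(
        STEP_PROMPTS["expand"].format(terms=json.dumps(analysis["main_entities"]))))
    patterns = json.loads(call_llm(
        STEP_PROMPTS["build"].format(terms=json.dumps(sorted(expansions)))))
    # Try the generated patterns in priority order, stopping at the first hits.
    results = []
    for pattern in (patterns.get("exact_match_patterns", [])
                    + patterns.get("fuzzy_match_patterns", [])):
        results = run_graph_query(pattern)
        if results:
            break
    return json.loads(call_llm(
        STEP_PROMPTS["rank"].format(results=json.dumps(results), query=query)))
```

In practice each `json.loads` call would also need retry/repair logic, since models do not always emit valid JSON.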
Knowledge graph content input and output
Step 1: Question Analysis
Input:
{
"query": "What deep learning frameworks has Google developed?"
}
Output:
{
"main_entities": ["Google", "deep learning framework"],
"attributes": ["developed"],
"relationships": ["develop", "create", "maintain"],
"conditions": []
}
Step 2: Synonym Expansion
Input:
{
"terms": ["Google", "deep learning framework"]
}
Output:
{
"Google": {
"synonyms": ["Google LLC", "Google Inc.", "Google Corporation"],
"variations": ["google", "GOOGLE"],
"abbreviations": ["GOOG"],
"related_terms": ["Alphabet Inc.", "Google AI", "Google Research"]
},
"deep learning framework": {
"synonyms": ["Deep Learning Library", "Neural Network Framework"],
"variations": ["DL framework", "Deep Learning Platform"],
"abbreviations": ["DL", "DNN"],
"related_terms": ["Machine Learning Framework", "AI Framework", "Neural Network Library"]
}
}
Step 3: Query Construction
Input:
{
"primary_terms": ["Google", "deep learning framework"],
"expanded_terms": [
"Google", "Google LLC", "Google Inc.", "Google AI",
"deep learning framework", "Deep Learning Library", "Neural Network Framework"
],
"relationships": ["develop", "create", "maintain"]
}
Output:
{
"exact_match_patterns": [
"MATCH (org:Organization {name: 'Google'})",
"MATCH (framework:Framework {type: 'deep_learning'})"
],
"fuzzy_match_patterns": [
"MATCH (org) WHERE org.name =~ '(?i).*google.*'",
"MATCH (framework) WHERE framework.type =~ '(?i).*(deep learning|neural network).*'"
],
"relationship_patterns": [
"MATCH (org)-[:DEVELOPED]->(framework)",
"MATCH (org)-[:CREATED]->(framework)",
"MATCH (org)-[:MAINTAINS]->(framework)",
"MATCH (framework)-[:DEVELOPED_BY]->(org)"
],
"priority_order": [
"exact_organization_match",
"exact_framework_match",
"relationship_match",
"fuzzy_framework_match"
]
}
Step 4: Result Ranking
Input:
{
"query_results": [
{
"org": "Google",
"framework": "TensorFlow",
"relationship": "DEVELOPED",
"year": "2015"
},
{
"org": "Google",
"framework": "JAX",
"relationship": "DEVELOPED",
"year": "2018"
}
],
"original_query": "What deep learning frameworks has Google developed?"
}
Output:
{
"ranked_results": [
{
"result": "Google developed TensorFlow, a popular deep learning framework",
"relevance_score": 10.0,
"confidence_score": 9.5,
"completeness_score": 9.0,
"final_score": 9.5,
"reasoning": "Straightforward answer with high confidence - TensorFlow is Google's primary deep learning framework"
},
{
"result": "Google developed JAX, a deep learning framework",
"relevance_score": 9.0,
"confidence_score": 8.5,
"completeness_score": 8.0,
"final_score": 8.5,
"reasoning": "Relevant but less well-known than TensorFlow"
}
],
"summary": {
"total_results": 2,
"high_confidence_count": 2,
"average_score": 9.0
}
}
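For reference, the scores in this example are internally consistent: each final_score is simply the mean of the three component scores, with no hop penalty applied. That part of the ranking can be computed deterministically instead of left to the model; the sketch below assumes equal weighting and clamping at zero, neither of which the prompt specifies.

```python
def final_score(relevance, confidence, completeness, hops=0):
    """Mean of the three 0-10 scores minus 1 point per relationship hop,
    clamped at zero (the weighting and clamping are assumptions)."""
    base = (relevance + confidence + completeness) / 3
    return max(0.0, base - hops)

# Reproduces the two scores above:
# (10 + 9.5 + 9.0) / 3 = 9.5 and (9.0 + 8.5 + 8.0) / 3 = 8.5
```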
Final Combined Query (Cypher):
// 1. First match the relationship between the organization and the framework
MATCH (org:Organization)-[r:DEVELOPED|CREATED|MAINTAINS]->(framework:Framework)
WHERE org.name =~ '(?i).*google.*'
AND framework.type =~ '(?i).*(deep learning|neural network).*'
// 2. Save the intermediate results
WITH org, framework, r
// 3. Find the shortest path
MATCH p = shortestPath((org)-[*1..2]-(framework))
// 4. Return the paths and sort them by popularity
RETURN p
ORDER BY framework.popularity DESC
LIMIT 10
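Rather than writing the combined query by hand, it can be assembled from the step-3 pattern output. A minimal sketch; the function, its parameters, and the injection caveat are mine, not part of the original flow.

```python
def build_combined_query(org_regex, type_regex, rel_types, max_hops=2, limit=10):
    """Assemble a combined Cypher query like the one above from fragments.
    Note: production code should pass values as query parameters rather
    than interpolating strings, to avoid Cypher injection."""
    rels = "|".join(rel_types)
    return "\n".join([
        f"MATCH (org:Organization)-[r:{rels}]->(framework:Framework)",
        f"WHERE org.name =~ '{org_regex}'",
        f"  AND framework.type =~ '{type_regex}'",
        "WITH org, framework, r",
        f"MATCH p = shortestPath((org)-[*1..{max_hops}]-(framework))",
        "RETURN p",
        "ORDER BY framework.popularity DESC",
        f"LIMIT {limit}",
    ])

cypher = build_combined_query(
    "(?i).*google.*",
    "(?i).*(deep learning|neural network).*",
    ["DEVELOPED", "CREATED", "MAINTAINS"],
)
```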
Final Answer: Google has developed several deep learning frameworks. The most famous are TensorFlow (2015) and JAX (2018). TensorFlow is their main framework and the most widely used.
The implementation process above is still not complete; it is worth dissecting a full project for analysis, and LlamaIndex is a good choice. Below is a collection of example prompts to give you a feel for the space.
Examples of Knowledge Graph Prompts
Entity Relationship Extraction
SYSTEM: You are an expert in extracting knowledge triples. Your task is to identify entities and their relationships from a given text. For each text, generate knowledge triples in the following format: (subject, relation, object)

Rules:
1. The subject and object must be concrete entities
2. The relation should be a clear and concise predicate
3. Each triple should represent a single, fact-based statement
4. Avoid generalized or ambiguous relationships
5. Maintain consistency in entity naming

Input text: {text_chunk}

Extract the knowledge triples from the above text. Format each triple as: (entity1) -> [relationship] -> (entity2)
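The `(entity1) -> [relationship] -> (entity2)` lines this prompt asks for still have to be parsed before they can be loaded into a graph. A minimal parsing sketch (the regex and function name are my own, not from any library):

```python
import re

# Matches lines like "(Google) -> [DEVELOPED] -> (TensorFlow)"
TRIPLE_PATTERN = re.compile(
    r"\(([^()]+)\)\s*->\s*\[([^\[\]]+)\]\s*->\s*\(([^()]+)\)")

def parse_triples(llm_output):
    """Extract (subject, relation, object) tuples from the model's output."""
    return [tuple(part.strip() for part in match.groups())
            for match in TRIPLE_PATTERN.finditer(llm_output)]
```

Malformed lines simply fail to match and are skipped, which is usually the right behavior for noisy model output.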
Note: Entity-relationship extraction needs to be tailored to the target industry and its schema, which makes the design complex, so only a basic template is provided here.
Question Analysis Prompts
You are a professional query analysis expert. Your task is to analyze given questions and extract key components for knowledge graph search.

Input question: {query}

Please follow the steps below carefully:
1. Extract key entities:
   - Identify all key entities (nouns, proper nouns, technical terms)
   - List any specific attributes mentioned
   - Record any values or dates
2. Identify relationships:
   - Identify verbs that show relationships between entities
   - Identify prepositions that show connections
   - Note any implied relationships
3. Detect the query type:
   - Determine if it is one of the following types:
     * Factual queries
     * Relational queries
     * Comparison queries
     * Attribute queries
     * Timeline queries
4. Extract constraints:
   - Time constraints
   - Location constraints
   - Condition constraints
   - Quantity constraints

Please output the results of your analysis in the following format:
{
  "entities": ["entity1", "entity2", ...],
  "attributes": ["attribute1", "attribute2", ...],
  "relationships": ["relationship1", "relationship2", ...],
  "query_type": "type_of_query",
  "constraints": {
    "time": [],
    "location": [],
    "condition": [],
    "quantity": []
  }
}

Remember:
- Identify entities precisely
- Include all possible terminology variants
- Maintain technical accuracy
- Retain domain-specific terminology
You are a professional query analysis expert. Your task is to analyze given questions and extract key search terms.

Input question: {query}

Follow the steps below:
1. Identify key entities (nouns, proper nouns)
2. Extract important attributes (adjectives, descriptors)
3. Identify relationship indicators (verbs, prepositions)
4. Note any time- or condition-related terms

Format the output as:
{
  "main_entities": ["entity1", "entity2", ...],
  "attributes": ["attr1", "attr2", ...],
  "relationships": ["rel1", "rel2", ...],
  "conditions": ["cond1", "cond2", ...]
}

Make sure each term:
- Is specific and relevant
- Is in its base/root form
- Is not repeated

Example:
Q: "Who created the Python programming language?"
{
  "main_entities": ["Python", "programming_language"],
  "conditions": []
}
Analyze the following question and extract key terms to use in your search: Question: {query} Extract: 1. key entities 2. key attributes 3. relationship indicators Format your response into a list of key terms.
Synonym Expansion Prompts
You are a specialist in synonym expansion. Your task is to generate a comprehensive list of synonyms for the terms provided, while maintaining technical accuracy and domain contextual integrity.

Input terms: {terms}

Please provide expansions for each term in the following categories:
1. Precise synonyms:
   - Terms that are directly equivalent
   - Variants with identical meanings
2. Related terms:
   - Broader terms
   - More specific terms
   - Related concepts
3. Abbreviations and alternative forms:
   - Common abbreviations
   - Full forms
   - Alternative spellings
   - Common misspellings
4. Domain-specific variants:
   - Technical terms
   - Industry-specific terms
   - Common usage in different contexts
5. Compound terms:
   - Related compound terms
   - Phrase variants
   - Common combinations

Rules:
1. Maintain semantic equivalence
2. Maintain technical accuracy
3. Consider domain context
4. Include common variants
5. Add relevant technical terms

Organize your response in the following format:
{
  "term": {
    "exact_synonyms": [],
    "related_terms": [],
    "abbreviations": [],
    "domain_terms": [],
    "compound_terms": []
  }
}

Example, for the term "machine learning":
{
  "machine learning": {
    "exact_synonyms": ["ML", "machine learning"],
    "related_terms": ["artificial intelligence", "deep learning"],
    "abbreviations": ["ML", "M.L."],
    "domain_terms": ["statistical_learning", "computational_learning"],
    "compound_terms": ["Machine Learning Algorithms", "ML Models"]
  }
}
You are a professional synonym expansion system. Your task is to expand each term with relevant alternatives.

Input terms: {terms}

For each term, provide:
1. Exact synonyms
2. Related terms
3. Common variants
4. Abbreviations/acronyms
5. Complete forms
6. Common misspellings

Rules:
- Include industry-standard terminology
- Consider different naming conventions
- Include formal and informal terms
- Maintain semantic equivalence

Format the output as:
{
  "term": {
    "synonyms": ["syn1", "syn2", ...],
    "variations": ["var1", "var2", ...],
    "abbreviations": ["abbr1", "abbr2", ...],
    "related_terms": ["rel1", "rel2", ...]
  }
}

Example:
Input: "Python"
{
  "Python": {
    "synonyms": ["Python programming language", "Python lang"],
    "variations": ["python", "Python3", "Python2"],
    "related_terms": ["CPython", "Jython", "IronPython"]
  }
}
You are a professional synonym expansion system. Find synonyms for each word in the list or related words commonly used to reference the same word: Here are some examples: - A synonym for Palantir may be Palantir technologies or Palantir technologies inc. - A synonym for Austin might be Austin texas - A synonym for Taylor swift might be Taylor - A synonym for Winter park might be Winter park resort Format: {format_instructions} Text: {keywords}
For each key term, common alternative expressions are provided: Term: {key_terms} Include: - Common abbreviations - Full name - Similar concepts - Related terms
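However the expansion is prompted, its output typically gets flattened into one deduplicated term list before query construction. A sketch assuming the `{term: {category: [alternatives]}}` output shape used above; the case-insensitive deduplication is my own choice, not specified by the prompts.

```python
def flatten_expansions(expansions):
    """Flatten a {term: {category: [alternatives]}} expansion dict into an
    ordered, case-insensitively deduplicated list, each original term
    followed by its alternatives."""
    seen, flat = set(), []
    for term, groups in expansions.items():
        for candidate in [term, *(alt for alts in groups.values() for alt in alts)]:
            key = candidate.casefold()
            if key not in seen:
                seen.add(key)
                flat.append(candidate)
    return flat
```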
Query Construction Prompts
You are a knowledge graph query construction expert. Your task is to create a structured query schema using the analyzed terms and their expansions.

Input:
- Original query: {original_query}
- Analyzed components: {analyzed_components}
- Expanded terms: {expanded_terms}

Action steps:
1. Construct the main search pattern, considering:
   - Entity patterns
   - Relationship patterns
   - Attribute constraints
   - Path patterns
2. Define search priorities, categorizing search elements as:
   - Terms that must match
   - Terms that should match
   - Terms that are nice to match
3. Specify relationship depth, determining:
   - Direct relationships (1-hop)
   - Indirect relationships (2-hop)
   - Complex paths (multi-hop)
4. Set constraints, including:
   - Time filters
   - Type constraints
   - Attribute conditions
   - Value ranges

Output format:
{
  "search_patterns": {
    "primary_entities": [],
    "secondary_entities": [],
    "relationships": [],
    "attributes": []
  },
  "priorities": {
    "must_match": [],
    "should_match": [],
    "nice_to_match": []
  },
  "depth_config": {
    "direct_relations": [],
    "indirect_relations": [],
    "complex_paths": []
  },
  "constraints": {
    "time_filters": [],
    "type_constraints": [],
    "property_conditions": [],
    "value_ranges": []
  }
}

Example, for "Who contributed to TensorFlow in 2020?":
{
  "search_patterns": {
    "primary_entities": ["TensorFlow", "contributor"],
    "relationships": ["contributed_to", "authored"],
    "attributes": ["date", "contribution_type"]
  },
  "priorities": {
    "must_match": ["TensorFlow", "2020"],
    "should_match": ["contributor", "contribution"],
    "nice_to_match": ["commit_message", "pull_request_title"]
  },
  ...
}
You are an expert in building graph database queries. Your task is to create an optimized search schema.

Input:
- Primary terms: {primary_terms}
- Expanded terms: {expanded_terms}
- Relationships: {relationships}

Generate a query pattern that:
1. Prioritizes exact matches
2. Includes synonym matching
3. Considers relationship patterns
4. Handles term variants

Rules:
- Start with the most specific terms
- Include all relevant relationships
- Consider bidirectional relationships
- Limit path lengths appropriately

Format the output as:
{
  "exact_match_patterns": ["pattern1", "pattern2", ...],
  "fuzzy_match_patterns": ["pattern1", "pattern2", ...],
  "relationship_patterns": ["pattern1", "pattern2", ...],
  "priority_order": ["high", "medium", "low"]
}

Example:
{
  "exact_match_patterns": ["MATCH (n:Entity {name: 'Python'})", "MATCH (n:Language {type: 'programming'})"],
  "fuzzy_match_patterns": ["MATCH (n) WHERE n.name =~ '(?i).*python.*'"],
  "relationship_patterns": ["MATCH (creator)-[:CREATED]->(lang)", "MATCH (lang)-[:TYPE_OF]->(prog_lang)"],
  "priority_order": ["exact_name_match", "fuzzy_name_match", "relationship_match"]
}
Create a search pattern using expanded terms: {expanded_terms} Generate: 1. primary search term 2. secondary terms 3. relationship schema
Given the following question: {query} Extract key concepts and construct a search pattern to help find relevant information in the knowledge graph. Key Concepts: - Identify key entities - Identify relationships of interest - Consider similar terms/synonyms The search pattern should include: 1. primary entities to be found 2. the relevant relationships 3. any constraints or conditions
Result Ranking Prompts
You are an expert in ranking and ordering search results. Your task is to score and rank the retrieved results.

Input results: {query_results}
Original query: {original_query}

Ranking criteria:
1. Relevance to the original query
2. Match quality (exact vs. partial)
3. Relationship distance
4. Information completeness
5. Source reliability

Score each result on:
- Relevance (0-10)
- Confidence (0-10)
- Completeness (0-10)
- Path length penalty (-1 per hop)

Format the output as:
{
  "ranked_results": [
    {
      "result": "result_content",
      "relevance_score": score,
      "confidence_score": score,
      "completeness_score": score,
      "final_score": score,
      "reasoning": "explanation"
    }
  ],
  "summary": {
    "total_results": number,
    "high_confidence_count": number,
    "average_score": number
  }
}
Result Processing and Question Answering Prompts
You are a query results processor. Your task is to process and format knowledge graph query results into coherent responses.

Input:
- Original question: {original_question}
- Query results: {query_results}
- Context information: {context}

Processing steps:
1. Analyze results, evaluating:
   - Completeness of results
   - Relevance of results
   - Quality of results
   - Coverage of all aspects of the question
2. Combine information, merging:
   - Directly matched results
   - Indirect relationships
   - Supporting information
   - Contextual details
3. Format the response, structuring it to include:
   - Key findings
   - Supporting details
   - Relevant context
   - Confidence level
4. Identify information gaps, recording:
   - Missing information
   - Aspects of uncertainty
   - Potential next steps
   - Possible alternative explanations

Output format:
{
  "answer": {
    "main_response": "",
    "supporting_facts": [],
    "confidence_level": "",
    "information_gaps": []
  },
  "metadata": {
    "sources_used": [],
    "result_quality": "",
    "processing_notes": []
  },
  "follow_up": {
    "suggested_questions": [],
    "clarification_needed": [],
    "additional_context": []
  }
}

Guidelines:
- Be precise and accurate
- Maintain technical correctness
- Indicate the level of confidence
- Document any uncertainties
- Suggest follow-up questions if needed
You are an expert in synthesizing graph database query results into natural language answers.

Input:
1. Original question: {original_question}
2. Ranked results: {ranked_results}
3. Query metadata: {query_metadata}

Task: Generate a comprehensive answer that:
1. Directly answers the original question
2. Incorporates high-confidence information from the ranked results
3. Maintains factual accuracy and correctly attributes sources

Guidelines:
- Start with the most relevant information
- Include supporting details when confidence is high
- Acknowledge any uncertainties or missing information
- Maintain a clear and concise style
- Use correct technical terminology

Format your response in the following form:
{
  "main_answer": "The main answer to the core question",
  "supporting_details": [
    "Additional relevant fact 1",
    "Additional supporting fact 2"
  ],
  "metadata": {
    "confidence_score": float,
    "source_count": integer,
    "information_completeness": float
  },
  "query_coverage": "An explanation of how the available information answered the original question"
}

Sample output:
{
  "main_answer": "Google developed TensorFlow as its main deep learning framework and released it in 2015.",
  "supporting_details": [
    "Google developed another deep learning framework, JAX, in 2018.",
    "TensorFlow has become one of the most widely used deep learning frameworks."
  ],
  "metadata": {
    "confidence_score": 9.5,
    "source_count": 2,
    "information_completeness": 0.95
  },
  "query_coverage": "The query results provide comprehensive information about the development of Google's major deep learning frameworks"
}
Based on the retrieved information: {context}

Answer the original question: {query}

Provide a clear and concise answer that:
1. Responds directly to the question
2. Uses only the information from the retrieved content
3. Indicates if any information is missing or uncertain