The GraphRAG project aims to leverage implicit relationships in unstructured text to extend the range of questions that AI systems can answer on private datasets.
A key advantage of GraphRAG over traditional vector RAG (or "semantic search") is its ability to answer global queries that address the entire dataset, such as "What are the major themes in the data?" or "What are the most important implications for X?". In contrast, vector RAG excels at local queries, where the answer resembles the query and can be found within specific text regions, as is typically the case for "who", "what", "when", and "where" questions.
Recent articles introduced two new query mechanisms that leverage the rich, summary-based data index created by GraphRAG to improve local search performance and global search cost, respectively.
In this article, we introduce a new graph-enhanced RAG approach, LazyGraphRAG, which requires no pre-summarization of source data and thus avoids the upfront indexing costs that may be prohibitive for some users and use cases.
A key advantage of LazyGraphRAG is its scalability in terms of both cost and quality. Across a range of competing approaches (standard vector RAG, RAPTOR, and the GraphRAG local, global, and DRIFT search mechanisms), LazyGraphRAG performs strongly across the cost-quality spectrum:
- LazyGraphRAG data indexing costs are identical to those of vector RAG and 0.1% of the cost of full GraphRAG.
- At a query cost comparable to vector RAG, LazyGraphRAG outperforms all competing methods on local queries, including long-context vector RAG, GraphRAG DRIFT search (a recently introduced RAG method shown to outperform vector RAG), and GraphRAG local search.
- The same LazyGraphRAG configuration also shows answer quality comparable to GraphRAG global search on global queries, at a query cost that is more than 700 times lower.
- At 4% of the query cost of GraphRAG global search, LazyGraphRAG significantly outperforms all competing methods on both local and global query types, including GraphRAG global search at the C2 level (the third level of the community hierarchy, recommended for most applications).
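To make the ratios above concrete, here is a small worked arithmetic sketch. The baseline costs are hypothetical placeholders in arbitrary units, not figures from the article; only the ratios (0.1%, 700 times, 4%) come from the text, and using one baseline for both query ratios is a simplification.

```python
def scaled_cost(baseline: float, ratio: float) -> float:
    """Scale a baseline cost by a ratio (e.g. 0.001 for 0.1%)."""
    return baseline * ratio


# Hypothetical baselines in arbitrary cost units (assumptions, not measured values).
FULL_GRAPHRAG_INDEX_COST = 1000.0
GRAPHRAG_GLOBAL_QUERY_COST = 70.0

# Indexing: 0.1% of the cost of full GraphRAG.
lazy_index_cost = scaled_cost(FULL_GRAPHRAG_INDEX_COST, 0.001)

# Low-budget querying: "more than 700 times" cheaper, so this is an upper bound.
lazy_query_cost_low = scaled_cost(GRAPHRAG_GLOBAL_QUERY_COST, 1 / 700)

# Higher-budget querying: 4% of the GraphRAG global search query cost.
lazy_query_cost_4pct = scaled_cost(GRAPHRAG_GLOBAL_QUERY_COST, 0.04)

print(lazy_index_cost, lazy_query_cost_low, lazy_query_cost_4pct)
```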
LazyGraphRAG will soon join the open source GraphRAG library to provide a unified query interface for local and global queries on lightweight data indexes at a cost comparable to standard vector RAGs.
LazyGraphRAG: Combining the Benefits of Vector RAG and GraphRAG
LazyGraphRAG aims to combine the advantages of Vector RAG and GraphRAG while overcoming their respective limitations:
- Vector RAG is a form of best-first search that uses similarity to the query to select the best-matching chunks of source text. However, it has no sense of the breadth of the dataset to consider for global queries.
- GraphRAG global search is a form of breadth-first search that uses the community structure of source text entities to ensure that queries are answered with the full breadth of the dataset in mind. However, it has no sense of the best communities to consider for local queries.
LazyGraphRAG combines best-first and breadth-first search dynamics in an iterative deepening manner (Table 1). Compared with the global search mechanism of full GraphRAG, this approach is "lazy" in that it defers LLM use, which significantly improves the efficiency of answer generation. Overall performance can be scaled through a single main parameter, the relevance test budget, which controls the cost-quality trade-off in a consistent manner.
Step | GraphRAG | LazyGraphRAG |
---|---|---|
Build index | a) Uses an LLM to extract and describe entities and their relationships, b) uses an LLM to summarize all observations of each entity and relationship, and c) uses graph statistics to optimize the entity graph and extract hierarchical community structure | a) Uses NLP noun phrase extraction to extract concepts and their co-occurrences, b) uses graph statistics to optimize the concept graph and extract hierarchical community structure |
Summarize index | Uses an LLM to summarize the entities and relationships in each community | None - the "lazy" approach defers all LLM use until query time |
Refine query | None - the original query is used throughout | Uses an LLM to a) identify relevant subqueries and recombine them into a single expanded query, and b) refine subqueries with matching concepts from the concept graph |
Match query | None - all queries are answered using all community summaries (breadth-first) | For each of q subqueries [3-5]: - Uses text chunk embeddings and chunk-community relationships to first rank text chunks by similarity to the query, then rank communities by the rank of their top-k text chunks (best-first) - Uses an LLM-based sentence-level relevance assessor to rate the relevance of the top-k untested text chunks from communities in rank order (breadth-first) - Recurses into relevant sub-communities after z successive communities yield zero relevant text chunks (iterative deepening) - Terminates when no relevant communities remain or the relevance test budget / q is reached |
Map answers | Uses an LLM to answer the original query over randomized batches of community summaries in parallel | For each of q subqueries [3-5]: - Builds a subgraph of concepts from the relevant text chunks - Uses the community assignments of concepts to group related chunks together - Uses an LLM to extract subquery-relevant claims from groups of related chunks as a way of focusing on relevant content only - Ranks and filters extracted claims to fit a predefined context window size |
Reduce answers | Uses an LLM to answer the original query using the mapped answers | Uses an LLM to answer the expanded query using the extracted map claims |
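The query-matching step in the table can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GraphRAG library's implementation: the `Community` class, `community_score`, `match_query`, and the `is_relevant` callback (standing in for the LLM-based relevance assessor) are all hypothetical names, similarities are assumed to be precomputed, and community rank is approximated by the mean similarity of the top-k chunks.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Community:
    name: str
    chunks: list[tuple[str, float]]  # (chunk id, precomputed similarity to the query)
    children: list["Community"] = field(default_factory=list)


def community_score(community: Community, k: int) -> float:
    """Assumed ranking rule: mean similarity of the community's top-k chunks."""
    top = sorted((sim for _, sim in community.chunks), reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def match_query(roots: list[Community], is_relevant: Callable[[str], bool],
                budget: int, k: int = 2, z: int = 2) -> tuple[list[str], int]:
    """Return (relevant chunk ids, number of relevance tests used)."""
    relevant: list[str] = []
    tested: set[str] = set()
    tests_used = 0
    level = list(roots)
    while level and tests_used < budget:
        # Best-first: rank the communities at this level by their top-k chunks.
        ranked = sorted(level, key=lambda c: community_score(c, k), reverse=True)
        relevant_communities: list[Community] = []
        misses = 0
        for community in ranked:  # breadth-first sweep in rank order
            if misses >= z or tests_used >= budget:
                break
            untested = [c for c in community.chunks if c[0] not in tested]
            top_chunks = sorted(untested, key=lambda c: c[1], reverse=True)[:k]
            hits = 0
            for chunk_id, _ in top_chunks:
                if tests_used >= budget:
                    break
                tested.add(chunk_id)
                tests_used += 1  # one relevance test (an LLM call in practice)
                if is_relevant(chunk_id):
                    relevant.append(chunk_id)
                    hits += 1
            if hits:
                relevant_communities.append(community)
                misses = 0
            else:
                misses += 1
        # Iterative deepening: recurse into sub-communities of relevant communities.
        level = [child for c in relevant_communities for child in c.children]
    return relevant, tests_used


a1 = Community("A1", [("a3", 0.7)])
a = Community("A", [("a1", 0.9), ("a2", 0.8), ("a3", 0.7)], children=[a1])
b = Community("B", [("b1", 0.2)])
found, used = match_query([a, b], lambda cid: cid in {"a1", "a3"}, budget=10)
print(found, used)
```

With q subqueries, each call would receive a share of the overall relevance test budget (budget / q), so the total number of LLM relevance tests stays bounded by the single budget parameter.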
LazyGraphRAG: How it works in detail
Simply put, the core idea of LazyGraphRAG is to "build on demand, check as you go".
- Index construction and summarization (on demand): Instead of pre-built full indexes, LazyGraphRAG utilizes Natural Language Processing (NLP) techniques to dynamically extract key concepts and the relationships between them as queries are received. This information is then organized into a graph structure, and this graph is continuously optimized and refined as the query progresses.
- Query Optimization and Content Retrieval: LazyGraphRAG utilizes techniques such as text chunk embeddings to optimize queries in real-time and pinpoint the most relevant pieces of data by analyzing associations between concepts.
- Answer generation and integration: The system constructs smaller subgraphs from relevant text fragments and integrates these into answers that are coherent, accurate and match the query.
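The concept-extraction idea can be illustrated with a toy co-occurrence graph. This is only a sketch: `extract_concepts` is a crude stand-in (lowercased non-stopword tokens) for real NLP noun phrase extraction, and the function names and stopword list are assumptions made for this example.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not a real NLP resource).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on", "with"}


def extract_concepts(chunk: str) -> set[str]:
    """Crude stand-in for noun phrase extraction: keep non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", chunk.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}


def cooccurrence_graph(chunks: list[str]) -> Counter:
    """Count how often each pair of concepts appears in the same text chunk."""
    edges: Counter = Counter()
    for chunk in chunks:
        concepts = sorted(extract_concepts(chunk))
        for left, right in combinations(concepts, 2):
            edges[(left, right)] += 1  # edge key is an alphabetically ordered pair
    return edges


chunks = [
    "Insurers form networks of hospitals",
    "Hospitals and insurers negotiate rates",
]
graph = cooccurrence_graph(chunks)
print(graph.most_common(3))
```

No LLM is involved at this stage, which is what keeps indexing cost at the level of vector RAG; community detection over the resulting graph would be a separate step not shown here.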
LazyGraphRAG Performance: The Data Speaks for Itself
We compared LazyGraphRAG at different relevance test budget levels against a range of competing methods, using the following setup:
- Dataset: 5,590 Associated Press news articles (used with permission)
- Queries: 100 synthetic queries (50 local and 50 global), generated using a new method (to be described in a future article)
- Metrics: comprehensiveness, diversity, empowerment (as described in prior work, using an LLM to compare pairs of answers head-to-head on each metric)
- Conditions: LazyGraphRAG at three relevance test budget settings, plus eight competing conditions from GraphRAG and the literature (Table 2).
Condition | Description |
---|---|
Z100_Lite | LazyGraphRAG with a relevance test budget of 100, using a low-cost LLM at all steps |
Z500 | LazyGraphRAG with a relevance test budget of 500, using a low-cost LLM for relevance tests and a more advanced (higher-cost) LLM for query refinement and map/reduce answer generation |
Z1500 | LazyGraphRAG with a relevance test budget of 1,500, using a low-cost LLM for relevance tests and a more advanced (higher-cost) LLM for query refinement and map/reduce answer generation |
C1 | GraphRAG global search at community level 1 |
C2 | GraphRAG global search at community level 2 |
C3_Dynamic | GraphRAG global search at community level 3, using dynamic community selection (a cost-saving solution described in a previous post) |
LS | GraphRAG Local Search |
DRIFT | GraphRAG DRIFT Search |
SS_8K | Semantic search, vector RAG to retrieve 8k input tokens |
SS_64K | Semantic search, vector RAG retrieves 64k input tokens (for checking the effect of long context windows on vector RAG) |
RAPTOR | A RAG method based on hierarchical clustering with some similar features to LazyGraphRAG |
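The three LazyGraphRAG conditions differ only in budget and model tier, so they can be written down as plain configuration. The sketch below is illustrative: the `LazyCondition` class, its field names, and the "low-cost"/"advanced" labels are assumptions for this example and do not correspond to actual GraphRAG settings or model names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyCondition:
    name: str
    relevance_test_budget: int
    relevance_model: str  # tier label only, not a real model identifier
    answer_model: str     # used for query refinement and map/reduce answers


CONDITIONS = [
    LazyCondition("Z100_Lite", 100, "low-cost", "low-cost"),
    LazyCondition("Z500", 500, "low-cost", "advanced"),
    LazyCondition("Z1500", 1500, "low-cost", "advanced"),
]


def per_subquery_budget(condition: LazyCondition, num_subqueries: int) -> int:
    """Split the relevance test budget evenly across subqueries (budget / q)."""
    if num_subqueries <= 0:
        raise ValueError("num_subqueries must be positive")
    return condition.relevance_test_budget // num_subqueries


for condition in CONDITIONS:
    print(condition.name, per_subquery_budget(condition, 5))
```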
Figure 1 shows the win rates of LazyGraphRAG against each of the eight competing conditions. At the lowest budget level of 100 relevance tests using a low-cost LLM (same cost as SS_8K), LazyGraphRAG significantly outperforms all conditions on local and global queries, except for the GraphRAG global search conditions on global queries. At an increased budget of 500 relevance tests using a more advanced LLM (4% of the query cost of C2), LazyGraphRAG significantly outperforms all conditions on both local and global queries. LazyGraphRAG's win rates continue to increase as the relevance test budget rises to 1,500, demonstrating its scalability in terms of cost and quality.
Figure 1. Clustered bar chart showing the win rates of LazyGraphRAG against the competing conditions for each combination of query type (local, global) and metric (comprehensiveness, diversity, empowerment), at relevance test budgets of 100, 500, and 1,500.
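A win rate of this kind can be computed from pairwise judgments along the following lines. The article does not state how ties are handled, so counting a tie as half a win is an assumption of this sketch, as are the function name and the judgment labels.

```python
def win_rate(judgments: list[str], method: str) -> float:
    """Fraction of pairwise comparisons won by `method`.

    Each judgment is the name of the winning method, or "tie".
    Ties count as half a win (an assumption, not taken from the article).
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    wins = sum(1.0 for judgment in judgments if judgment == method)
    ties = sum(0.5 for judgment in judgments if judgment == "tie")
    return (wins + ties) / len(judgments)


# Hypothetical LLM judgments for one (query type, metric, competitor) combination.
judgments = ["lazy", "lazy", "other", "tie"]
print(win_rate(judgments, "lazy"))
```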
Table 3 shows an example comparison.
QUERY: Describe what factors U.S. individuals should consider when choosing a health insurance plan during the open enrollment period for health insurance plans in 2024.
Vector RAG (64K contexts) | LazyGraphRAG |
---|---|
# Factors to Consider When Selecting a Health Insurance Plan During Open Enrollment for 2024
Selecting a health insurance plan during the open enrollment period can be a complex process, but understanding the key factors to consider can help individuals make informed decisions. Here are some important aspects to keep in mind.
## Coverage and Networks
One of the primary considerations is the coverage network of the plan. Insurers form coverage networks of hospitals and doctors, and it is crucial to ensure that your preferred healthcare providers are included in the network of the plan you are considering. Many marketplace plans, such as Health Maintenance Organizations (HMOs) and Exclusive Provider Organizations (EPOs), do not cover claims for most non-emergency care outside their networks [Data: Sources (47419, 47421, 47420, 47410, 47408)].
## Costs: Premiums, Deductibles, and Out-of-Pocket Maximums
Understanding the costs associated with a health insurance plan is essential. This includes the monthly premiums, deductibles, copays, and coinsurance. Plans are typically categorized into different levels such as platinum, gold, silver, and bronze, with bronze plans generally having lower premiums but higher out-of-pocket costs [Data: Sources (47411, 47412, 47416, 47414, 47418)]. Additionally, it is important to consider the out-of-pocket maximum, which is the most you will have to pay in a year for covered services before the insurance starts covering all costs [Data: Sources (47416, 47415)].
## Tax Credits and Cost-Sharing Reductions
Individuals with lower incomes may qualify for income-based tax credits to help reduce the cost of premiums. It is important to accurately estimate your income for the coming year when applying for these credits to avoid having to repay any excess credits received [Data: Sources (47412, 47413)]. Additionally, those with incomes below 250% of the federal poverty level may qualify for cost-sharing reductions if they select a silver plan, which can lower deductibles and out-of-pocket maximums [Data: Sources (47417, 47418)].
## Special Considerations for Medicaid and Medicare
For those transitioning from Medicaid, it is important to understand that marketplace plans may come with higher premiums and copays compared to Medicaid. Additionally, many people who have been removed from Medicaid may be eligible to enroll in marketplace plans through an extended enrollment period [Data: Sources (6284, 6285, 6286, 6287)]. For seniors, the Medicare open enrollment period allows for switching between traditional Medicare and Medicare Advantage plans, with considerations for prescription drug coverage and additional benefits like dental and vision [Data: Sources (72060, 72061, 72062, 72063, 72064)].
## Assistance and Resources
Navigating the complexities of health insurance can be challenging, and seeking assistance from insurance agents or government-funded navigators can be beneficial. These professionals can help individuals understand their options and make informed decisions without recommending specific plans [Data: Sources (47424, 47425, 47426)].
## Conclusion
Selecting the right health insurance plan requires careful consideration of various factors including coverage networks, costs, eligibility for tax credits and cost-sharing reductions, and available assistance. By thoroughly evaluating these aspects, individuals can choose a plan that best meets their healthcare needs and financial situation for the upcoming year. |
### Types of Health Insurance Plans
When selecting a health insurance plan during the open enrollment period for 2024, individuals in the United States have several options to consider.
1. **Marketplace Plans**: These are available under the Affordable Care Act (ACA) and are categorized into platinum, gold, silver, and bronze levels. Bronze plans generally have lower premiums but higher out-of-pocket costs, while platinum plans have higher premiums but lower out-of-pocket costs [Data: Sources (47411, 47420, 47421)].
2. **Low-Cost Marketplace Plans**: These often come with high deductibles, requiring patients to pay thousands of dollars before most coverage kicks in. They also have annual out-of-pocket maximums that can exceed $9,000 for individuals and $18,000 for families [Data: Sources (47415, 47416, 47414)].
3. **Exclusive Provider Organizations (EPOs) and Health Maintenance Organizations (HMOs)**: EPOs and HMOs generally restrict patients to a network of doctors and require a primary care doctor to direct care. They tend to be cheaper but lack out-of-network flexibility [Data: Sources (47420, 47421, 43218, 43217)].
4. **Preferred Provider Organizations (PPOs)**: These plans allow for out-of-network services but at a higher cost. They offer more flexibility compared to HMOs and EPOs [Data: Sources (43217)].
5. **High-Deductible Health Plans (HDHPs)**: Defined as plans with a deductible of at least $1,600 for individual coverage or $3,200 for family coverage, with out-of-pocket maximums of no more than $8,050 or $16,100, respectively. HDHPs usually have lower premiums, and sometimes companies contribute to a health savings account (HSA) to help cover the deductible [Data: Sources (43227, 43226)].
6. **Medicare Advantage**: These are privately run versions of the federal government's Medicare program, mostly for people aged 65 and over. They often include prescription drug coverage and may offer additional benefits like dental or vision coverage not provided by traditional Medicare [Data: Sources (72063, 72061, 72060, 72062)].
7. **Short-Term Health Insurance Plans**: These plans are limited to three months and can only be renewed for a maximum of four months under new rules. They are intended for temporary coverage but often lack comprehensive benefits [Data: Sources (97999, 97995, 97996, 97997)].
### Cost Factors: Premiums, Deductibles, Co-pays, and Out-of-Pocket Maximums
The overall cost of health insurance plans in 2024 is influenced by several factors.
- **Premiums**: This is the set monthly cost you pay for your health insurance plan. Premiums have been rising, with a notable increase of 7% for both family and single plans in 2023, partly due to inflation [Data: Sources (83383, 83382, 83384, 83385, 83381, +more)].
- **Deductibles**: The amount you pay out-of-pocket for health care services before your insurance starts to pay. For HDHPs, the deductible is at least $1,600 for individual coverage or $3,200 for family coverage [Data: Sources (43226, 43225)].
- **Co-pays and Co-insurance**: These are the costs you pay each time you receive a medical service. Co-pays are fixed amounts, while co-insurance is a percentage of the service cost.
- **Out-of-Pocket Maximums**: This is the maximum amount you will pay for covered services in a year. For example, HDHPs have out-of-pocket maximums of no more than $8,050 for individual coverage or $16,100 for family coverage [Data: Sources (43227, 43226)].
### Provider Networks: In-Network vs. Out-of-Network
The network of healthcare providers is a crucial factor in selecting a health insurance plan.
- **In-Network Providers**: These are doctors and hospitals that have agreements with your insurance plan to provide services at lower rates. Ensuring your preferred doctors and specialists are in-network can save you significant costs [Data: Sources (43216, 47419)].
- **Out-of-Network Providers**: Services from these providers are usually more expensive and may not be covered at all, except in emergencies. PPO plans offer some out-of-network coverage but at a higher cost, while HMOs and EPOs generally do not cover non-emergency out-of-network care [Data: Sources (43217, 47421)].
### Specific Medical Needs and Services
When selecting a health insurance plan, individuals should consider their specific medical needs.
- **Prescription Drugs**: Ensure that your medications are covered by the plan's formulary, as drug coverage can change annually [Data: Sources (43220, 43218, 43219)].
- **Mental Health Services**: Coverage for mental health treatments is essential, especially with new rules pushing insurers to increase their coverage of these services [Data: Sources (97031, 97028, 97027, 97030, 97033, +more)].
- **Chronic Conditions**: Plans should cover ongoing treatments and medications for chronic conditions. Medicare Supplement Insurance (Medigap) can help cover gaps in Medicare for chronic disease management [Data: Sources (93367, 93368)].
- **Preventive Care**: Coverage for preventive services like cancer screenings and HIV prevention is mandated under the ACA, though its future is uncertain due to ongoing legal battles [Data: Sources (71106, 71109, 71098, 71099, 71100, +more)].
### Key Dates and Steps for Open Enrollment
The open enrollment period for 2024 health insurance plans involves several key dates and steps.
- **Marketplace Plans**: Open enrollment starts on November 1, 2023, and runs through mid-December in most states, ending on January 16, 2024 [Data: Sources (47419, 47411, 47416, 47421, 47409, +more)].
- **Medicare**: Open enrollment for Medicare runs from October 15, 2023, to December 7, 2023. During this period, individuals can choose between traditional Medicare, Medicare Advantage plans, and prescription drug plans [Data: Sources (72061, 72063, 72060, 72062)].
- **Special Enrollment Periods**: Individuals who lose coverage due to life events like job loss or moving may qualify for special enrollment periods. For example, those removed from Medicaid may enroll in marketplace plans through July 2024 [Data: Sources (6288, 6289)].
By considering these factors, individuals can make informed decisions about their health insurance coverage for 2024, ensuring they select plans that best meet their medical needs and financial situations. |
LazyGraphRAG demonstrates that a single, flexible query mechanism has the potential to significantly outperform a variety of specialized query mechanisms across the local-global query spectrum, without the upfront costs of LLM data summarization. Its fast and virtually free indexing makes LazyGraphRAG well suited to one-off queries, exploratory analysis, and streaming data use cases, while its ability to smoothly improve answer quality as the relevance test budget increases makes it a valuable tool for benchmarking RAG methods in general (e.g., "RAG method X beats LazyGraphRAG with budget Y for task Z").
Does this mean that all graph-enhanced RAG should be lazy? We believe not, for three reasons:
- A GraphRAG data index of entity, relationship, and community summaries has uses beyond question answering (e.g., reading and sharing as reports).
- A GraphRAG data index of entity, relationship, and community summaries, combined with a LazyGraphRAG-like search mechanism, may yield better results than LazyGraphRAG alone.
- A new kind of GraphRAG data index designed to support a LazyGraphRAG-like search mechanism (e.g., through preemptive claim and topic extraction) may yield the best possible results.
We will be exploring these directions in the coming period, and all progress (including LazyGraphRAG itself) will be released via the GraphRAG GitHub repository.