LazyGraphRAG: Substantially Improving GraphRAG's Quality and Cost

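The GraphRAG project aims to expand the range of questions that AI systems can answer over private datasets by exploiting the implicit relationships in unstructured text.

Compared with conventional vector RAG (or "semantic search"), a key advantage of GraphRAG is its ability to answer global queries that address the entire dataset, such as "What are the main themes in the data?" or "What are the most important implications for X?" Vector RAG, by contrast, excels at local queries, where the answer resembles the query and can be found within specific text regions. These are typically "who", "what", "when", and "where" questions.

Recent posts introduced two new query mechanisms that exploit the rich, summary-based data index created by GraphRAG to improve local search performance and global search cost, respectively.

This post introduces a radically different approach to graph-enabled RAG: LazyGraphRAG. It requires no prior summarization of the source data, which avoids up-front indexing costs that may be prohibitive for some users and use cases.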

 

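A key advantage of LazyGraphRAG is its scalability in both cost and quality. Across a range of competing methods (standard vector RAG, RAPTOR, and the GraphRAG local, global, and DRIFT search mechanisms), LazyGraphRAG performs strongly across the cost-quality spectrum:

  • LazyGraphRAG data indexing costs are identical to those of vector RAG and only 0.1% of the cost of full GraphRAG.
  • At query costs comparable to vector RAG, LazyGraphRAG outperforms all competing methods on local queries, including long-context vector RAG, GraphRAG DRIFT search (a recently introduced RAG approach that outperforms vector RAG), and GraphRAG local search.
  • The same LazyGraphRAG configuration also shows answer quality comparable to GraphRAG global search on global queries, at a query cost more than 700 times lower.
  • At 4% of the query cost of GraphRAG global search, LazyGraphRAG significantly outperforms all competing methods on both local and global query types, including GraphRAG global search at level C2 (the third level of the community hierarchy, recommended for most applications).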
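
To make these ratios concrete, the short calculation below applies them to a purely hypothetical baseline. The dollar amounts are assumptions chosen for illustration and do not come from the article; only the ratios (0.1%, 700 times, 4%) are taken from the list above. It also assumes, for simplicity, that the 700-times and 4% figures refer to the same global search baseline.

```python
# Hypothetical baseline costs in USD (assumed for illustration only).
full_graphrag_index_cost = 1000.00   # indexing a corpus with full GraphRAG
global_search_query_cost = 7.00      # one GraphRAG global search query

# Ratios reported in the article.
index_ratio = 0.001          # LazyGraphRAG indexing: 0.1% of full GraphRAG
query_reduction = 700        # lowest-budget query cost: over 700 times lower
higher_budget_ratio = 0.04   # higher-budget query cost: 4% of global search

lazy_index_cost = full_graphrag_index_cost * index_ratio
lazy_query_cost_low = global_search_query_cost / query_reduction
lazy_query_cost_high = global_search_query_cost * higher_budget_ratio

print(f"indexing:               ${lazy_index_cost:.2f}")
print(f"query (lowest budget):  under ${lazy_query_cost_low:.2f}")
print(f"query (4% budget):      ${lazy_query_cost_high:.2f}")
```

Under these assumed baselines, the higher-budget configuration costs 28 times more per query than the lowest-budget one, yet still a small fraction of a global search query.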

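LazyGraphRAG is coming soon to the open-source GraphRAG library, where it will provide a unified query interface for local and global queries over a lightweight data index whose cost is comparable to standard vector RAG.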

 

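LazyGraphRAG: Blending the Advantages of Vector RAG and GraphRAG

LazyGraphRAG aims to blend the advantages of vector RAG and GraphRAG while overcoming their respective limitations:

  • Vector RAG is a form of best-first search that uses similarity to the query to select the best-matching source text chunks. However, it has no sense of the breadth of the dataset to consider for global queries.
  • GraphRAG global search is a form of breadth-first search that uses the community structure of source text entities to ensure that queries are answered with the full breadth of the dataset in mind. However, it has no sense of the best communities to consider for local queries.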
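
The contrast between the two search styles can be sketched with toy data. Everything here is an assumption made for illustration: the chunk ids, community labels, and similarity scores are invented, and neither function is part of the GraphRAG library. Real GraphRAG global search operates on community summaries; the sketch substitutes one representative chunk per community to keep the example small.

```python
from collections import defaultdict

# Toy data: (chunk_id, community, similarity_to_query). All values are made up.
chunks = [
    ("c1", "A", 0.91), ("c2", "A", 0.88), ("c3", "A", 0.85),
    ("c4", "B", 0.40), ("c5", "C", 0.35), ("c6", "C", 0.10),
]

def best_first(chunks, k):
    """Vector-RAG style: take the k chunks most similar to the query."""
    ranked = sorted(chunks, key=lambda c: c[2], reverse=True)
    return [chunk_id for chunk_id, _, _ in ranked[:k]]

def breadth_first(chunks):
    """Global-search style: cover every community, here via its best chunk."""
    by_community = defaultdict(list)
    for chunk_id, community, score in chunks:
        by_community[community].append((score, chunk_id))
    return {name: max(members)[1] for name, members in by_community.items()}

print(best_first(chunks, k=3))   # every pick comes from community A
print(breadth_first(chunks))     # one representative per community
```

Best-first selection never leaves community A, which is the breadth problem described in the first bullet. Breadth-first selection spends effort on communities B and C even when the query only concerns A, which is the second bullet's problem.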

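LazyGraphRAG combines best-first and breadth-first search dynamics in an iterative-deepening manner (Table 1). Compared with the global search mechanism of full GraphRAG, this approach is "lazy" in that it defers LLM use and substantially increases the efficiency of answer generation. Overall performance scales with a single main parameter, the relevance test budget, which controls the cost-quality trade-off in a consistent way.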

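Table 1. GraphRAG and LazyGraphRAG compared stage by stage.

| Stage | GraphRAG | LazyGraphRAG |
|---|---|---|
| Build index | a) Uses an LLM to extract and describe entities and their relationships; b) uses an LLM to summarize all observations of each entity and relationship; c) uses graph statistics to optimize the entity graph and extract hierarchical community structure | a) Uses NLP noun phrase extraction to extract concepts and their co-occurrences; b) uses graph statistics to optimize the concept graph and extract hierarchical community structure |
| Summarize index | Uses an LLM to summarize the entities and relationships in each community | None: the "lazy" approach defers all LLM use until query time |
| Refine query | None: the original query is used throughout | Uses an LLM to a) identify relevant subqueries and recombine them into a single expanded query; b) refine the subqueries with matching concepts from the concept graph |
| Match query | None: all queries are answered using all community summaries (breadth-first) | For each of q subqueries [3-5]: (1) uses text chunk embeddings and chunk-community relationships to rank text chunks by similarity to the query, then rank communities by the rank of their top-k text chunks (best-first); (2) uses an LLM-based sentence-level relevance assessor to rate the relevance of the top-k untested text chunks from communities in rank order (breadth-first); (3) recurses into relevant sub-communities after z successive communities yield zero relevant text chunks (iterative deepening); (4) terminates when no relevant communities remain or the relevance test budget / q is reached |
| Map answers | Uses an LLM to answer the original query over randomized batches of community summaries in parallel | For each of q subqueries [3-5]: (1) builds a subgraph of concepts from the relevant text chunks; (2) uses the community assignments of concepts to group related chunks together; (3) uses an LLM to extract subquery-relevant claims from groups of related chunks, as a way of focusing on relevant content only; (4) ranks and filters the extracted claims to fit a predefined context window size |
| Reduce answers | Uses an LLM to answer the original query using the mapped answers | Uses an LLM to answer the expanded query using the extracted map claims |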
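
The "match query" stage for a single subquery can be sketched as follows. This is one possible reading of the four steps, not the library's implementation. The `Community` class, the function name, and the defaults for `k` and `z` are hypothetical; a dictionary of similarity scores stands in for embedding comparisons, and a plain predicate stands in for the LLM relevance assessor.

```python
from dataclasses import dataclass, field

@dataclass
class Community:
    name: str
    chunk_ids: list                                # every chunk inside this community
    children: list = field(default_factory=list)   # sub-communities

def match_subquery(communities, similarity, is_relevant, budget, k=2, z=1):
    """Return (relevant chunk ids, relevance tests used) for one subquery.

    similarity:  chunk id -> similarity to the subquery (stand-in for embeddings)
    is_relevant: chunk id -> bool (stand-in for the LLM relevance assessor)
    budget:      this subquery's share of the relevance test budget (budget / q)
    """
    tested, relevant = set(), []
    frontier = list(communities)

    def top_untested(community):
        untested = [c for c in community.chunk_ids if c not in tested]
        return sorted(untested, key=lambda c: similarity[c], reverse=True)[:k]

    def community_score(community):
        top = top_untested(community)
        return sum(similarity[c] for c in top) / len(top) if top else -1.0

    while frontier and len(tested) < budget:
        # Best-first: rank communities by the similarity of their top-k chunks.
        frontier.sort(key=community_score, reverse=True)
        next_frontier, misses = [], 0
        # Breadth-first: test top-k untested chunks of each community in rank order.
        for community in frontier:
            if misses >= z or len(tested) >= budget:
                break
            hits = 0
            for chunk in top_untested(community):
                if len(tested) >= budget:
                    break
                tested.add(chunk)
                if is_relevant(chunk):
                    relevant.append(chunk)
                    hits += 1
            if hits:
                misses = 0
                next_frontier.extend(community.children)
            else:
                misses += 1
        # Iterative deepening: continue in sub-communities of relevant communities.
        frontier = next_frontier
    return relevant, len(tested)

# Toy hierarchy: community A splits into A1 and A2; B and C are leaves.
tree = [
    Community("A", [1, 2, 3, 4], [Community("A1", [1, 2]), Community("A2", [3, 4])]),
    Community("B", [5, 6]),
    Community("C", [7, 8]),
]
similarity = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6, 5: 0.5, 6: 0.4, 7: 0.3, 8: 0.2}
truly_relevant = {1, 3, 4}

total_budget, q = 30, 3   # hypothetical: 30 relevance tests shared by 3 subqueries
print(match_subquery(tree, similarity, lambda c: c in truly_relevant, total_budget // q))
```

In this toy run, community C is never tested: after B yields no relevant chunks the search stops scanning the top level and deepens into A's sub-communities, finding all three relevant chunks with 6 of the 10 available tests. Lowering the budget simply cuts the search off earlier, which is how a single parameter can trade cost against quality.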

 

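LazyGraphRAG: How It Works

Put simply, the core idea of LazyGraphRAG is "build on demand, query as you go."

  1. Index building and summarization (on demand): Rather than using an LLM to build and summarize a full index up front, LazyGraphRAG uses lightweight natural language processing (NLP) to extract key concepts and their co-occurrences, and organizes them into a concept graph with a hierarchical community structure. All LLM use is deferred until a query arrives.
  2. Query refinement and content retrieval: At query time, LazyGraphRAG uses an LLM to refine the query into subqueries, then uses text chunk embeddings and the relationships between chunks and communities to locate the most relevant pieces of data.
  3. Answer generation and integration: The system builds smaller concept subgraphs from the relevant text chunks, extracts the claims relevant to the query, and combines them into a coherent, accurate answer.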
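
Step 1 can be sketched without any LLM calls. The code below is a minimal illustration under loud assumptions: a stopword filter over lowercase tokens is a crude stand-in for real NLP noun phrase extraction, the function names are hypothetical, and the community detection that would follow is omitted.

```python
import re
from collections import Counter
from itertools import combinations

# A tiny stopword list, assumed for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "are", "on", "with"}

def extract_concepts(chunk):
    """Crude stand-in for noun phrase extraction: lowercase non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", chunk.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}

def build_concept_graph(chunks):
    """Edge weights count how many chunks contain both concepts of a pair."""
    edges = Counter()
    for chunk in chunks:
        for pair in combinations(sorted(extract_concepts(chunk)), 2):
            edges[pair] += 1
    return edges

chunks = [
    "Insurers form coverage networks of hospitals and doctors.",
    "Bronze plans have lower premiums.",
    "Premiums and coverage networks differ across plans.",
]
graph = build_concept_graph(chunks)
print(graph[("coverage", "networks")])   # the pair co-occurs in two chunks
print(graph[("plans", "premiums")])      # the pair co-occurs in two chunks
```

Because this pass only tokenizes and counts, its cost grows with the size of the text rather than with the number of LLM calls, which is consistent with the article's claim that indexing costs match those of vector RAG.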


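LazyGraphRAG Performance: Letting the Data Speak

LazyGraphRAG was compared at different relevance test budget levels against a range of competing methods, under the following setup:

  • Dataset: 5,590 AP news articles (used with license)
  • Queries: 100 synthetic queries (50 local and 50 global), generated using a new method to be described in a future post
  • Metrics: comprehensiveness, diversity, and empowerment (as described here, using an LLM to compare pairs of answers head-to-head on each metric)
  • Conditions: LazyGraphRAG at three relevance test budget settings, plus eight competing conditions from GraphRAG and the literature (Table 2).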
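
Table 2. Experimental conditions.

| Condition | Description |
|---|---|
| Z100_Lite | LazyGraphRAG with a relevance test budget of 100, using a low-cost LLM at all steps |
| Z500 | LazyGraphRAG with a relevance test budget of 500, using a low-cost LLM for relevance tests and a more advanced (higher-cost) LLM for query refinement and map/reduce answer generation |
| Z1500 | LazyGraphRAG with a relevance test budget of 1,500, using a low-cost LLM for relevance tests and a more advanced (higher-cost) LLM for query refinement and map/reduce answer generation |
| C1 | GraphRAG global search at community level 1 |
| C2 | GraphRAG global search at community level 2 |
| C3_Dynamic | GraphRAG global search at community level 3, using dynamic community selection (a cost-saving solution introduced in a previous post) |
| LS | GraphRAG local search |
| DRIFT | GraphRAG DRIFT search |
| SS_8K | Semantic search: vector RAG retrieving 8k input tokens |
| SS_64K | Semantic search: vector RAG retrieving 64k input tokens (included to examine the effect of a long context window on vector RAG) |
| RAPTOR | A RAG approach based on hierarchical clustering that shares some features with LazyGraphRAG |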
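
The three LazyGraphRAG conditions differ only in budget and in which model tier handles each step, so they can be written down as plain configuration. The class and field names below are hypothetical, and "low-cost" and "advanced" are tier labels taken from the table, not real model identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LazyCondition:
    name: str
    relevance_test_budget: int
    relevance_test_model: str   # tier label, not a real model name
    answer_model: str           # tier used for query refinement and map/reduce

CONDITIONS = [
    LazyCondition("Z100_Lite", 100, "low-cost", "low-cost"),
    LazyCondition("Z500", 500, "low-cost", "advanced"),
    LazyCondition("Z1500", 1500, "low-cost", "advanced"),
]

for c in CONDITIONS:
    print(f"{c.name}: budget={c.relevance_test_budget}, "
          f"relevance={c.relevance_test_model}, answers={c.answer_model}")
```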

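Figure 1 shows LazyGraphRAG win rates against each of the eight competing conditions. At the lowest budget level of 100 relevance tests using a low-cost LLM (the same cost as SS_8K), LazyGraphRAG significantly outperforms all conditions on local and global queries, except for the GraphRAG global search conditions on global queries. At an increased budget of 500 relevance tests using a more advanced LLM (4% of the query cost of C2), LazyGraphRAG significantly outperforms all conditions on both local and global queries. LazyGraphRAG's win rates continue to rise as the relevance test budget increases to 1,500, demonstrating its scalability in terms of cost versus quality.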

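Figure 1. Clustered bar charts showing LazyGraphRAG win rates against the competing conditions for each combination of relevance test budget (100, 500, 1,500), query type (local, global), and metric (comprehensiveness, diversity, empowerment).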
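
As a rough illustration of how a win rate like those in Figure 1 could be computed from head-to-head judgments, consider the sketch below. The judgment records are invented, and counting a tie as half a win is an assumption; the article does not say how ties are handled.

```python
from collections import defaultdict

# Each record: (query_type, metric, outcome for LazyGraphRAG vs. one competitor).
# Outcomes are invented for illustration.
judgments = [
    ("local", "comprehensiveness", "win"),
    ("local", "comprehensiveness", "win"),
    ("local", "comprehensiveness", "loss"),
    ("global", "diversity", "win"),
    ("global", "diversity", "tie"),
]

def win_rates(judgments):
    """Win rate per (query_type, metric); a tie counts as half a win (assumption)."""
    points = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    totals = defaultdict(lambda: [0.0, 0])
    for query_type, metric, outcome in judgments:
        totals[(query_type, metric)][0] += points[outcome]
        totals[(query_type, metric)][1] += 1
    return {key: score / n for key, (score, n) in totals.items()}

rates = win_rates(judgments)
for (query_type, metric), rate in rates.items():
    print(f"{query_type:6s} {metric:18s} {rate:.2f}")
```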

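Table 3 shows an example comparison.

Query: Describe what factors individuals in the United States should consider when selecting a health insurance plan during the open enrollment period for 2024 health insurance plans.

The two answers follow: vector RAG (64K context) first, then LazyGraphRAG.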
# Factors to Consider When Selecting a Health Insurance Plan During Open Enrollment for 2024

Selecting a health insurance plan during the open enrollment period can be a complex process, but understanding the key factors to consider can help individuals make informed decisions. Here are some important aspects to keep in mind:

## Coverage and Networks

One of the primary considerations is the coverage network of the plan. Insurers form coverage networks of hospitals and doctors, and it is crucial to ensure that your preferred healthcare providers are included in the network of the plan you are considering. Many marketplace plans, such as Health Maintenance Organizations (HMOs) and Exclusive Provider Organizations (EPOs), do not cover claims for most non-emergency care outside their networks [Data: Sources (47419, 47421, 47420, 47410, 47408)].

## Costs: Premiums, Deductibles, and Out-of-Pocket Maximums

Understanding the costs associated with a health insurance plan is essential. This includes the monthly premiums, deductibles, copays, and coinsurance. Plans are typically categorized into different levels such as platinum, gold, silver, and bronze, with bronze plans generally having lower premiums but higher out-of-pocket costs [Data: Sources (47411, 47412, 47416, 47414, 47418)]. Additionally, it is important to consider the out-of-pocket maximum, which is the most you will have to pay in a year for covered services before the insurance starts covering all costs [Data: Sources (47416, 47415)].

## Tax Credits and Cost-Sharing Reductions

Individuals with lower incomes may qualify for income-based tax credits to help reduce the cost of premiums. It is important to accurately estimate your income for the coming year when applying for these credits to avoid having to repay any excess credits received [Data: Sources (47412, 47413)]. Additionally, those with incomes below 250% of the federal poverty level may qualify for cost-sharing reductions if they select a silver plan, which can lower deductibles and out-of-pocket maximums [Data: Sources (47417, 47418)].

## Special Considerations for Medicaid and Medicare

For those transitioning from Medicaid, it is important to understand that marketplace plans may come with higher premiums and copays compared to Medicaid. Additionally, many people who have been removed from Medicaid may be eligible to enroll in marketplace plans through an extended enrollment period [Data: Sources (6284, 6285, 6286, 6287)]. For seniors, the Medicare open enrollment period allows for switching between traditional Medicare and Medicare Advantage plans, with considerations for prescription drug coverage and additional benefits like dental and vision [Data: Sources (72060, 72061, 72062, 72063, 72064)].

## Assistance and Resources

Navigating the complexities of health insurance can be challenging, and seeking assistance from insurance agents or government-funded navigators can be beneficial. These professionals can help individuals understand their options and make informed decisions without recommending specific plans [Data: Sources (47424, 47425, 47426)].

## Conclusion

Selecting the right health insurance plan requires careful consideration of various factors including coverage networks, costs, eligibility for tax credits and cost-sharing reductions, and available assistance. By thoroughly evaluating these aspects, individuals can choose a plan that best meets their healthcare needs and financial situation for the upcoming year.

LazyGraphRAG answer:

### Types of Health Insurance Plans

When selecting a health insurance plan during the open enrollment period for 2024, individuals in the United States have several options to consider:

1. **Marketplace Plans**: These are available under the Affordable Care Act (ACA) and are categorized into platinum, gold, silver, and bronze levels. Bronze plans generally have lower premiums but higher out-of-pocket costs, while platinum plans have higher premiums but lower out-of-pocket costs [Data: Sources (47411, 47420, 47421)].

2. **Low-Cost Marketplace Plans**: These often come with high deductibles, requiring patients to pay thousands of dollars before most coverage kicks in. They also have annual out-of-pocket maximums that can exceed $9,000 for individuals and $18,000 for families [Data: Sources (47415, 47416, 47414)].

3. **Exclusive Provider Organizations (EPOs) and Health Maintenance Organizations (HMOs)**: EPOs and HMOs generally restrict patients to a network of doctors and require a primary care doctor to direct care. They tend to be cheaper but lack out-of-network flexibility [Data: Sources (47420, 47421, 43218, 43217)].

4. **Preferred Provider Organizations (PPOs)**: These plans allow for out-of-network services but at a higher cost. They offer more flexibility compared to HMOs and EPOs [Data: Sources (43217)].

5. **High-Deductible Health Plans (HDHPs)**: Defined as plans with a deductible of at least $1,600 for individual coverage or $3,200 for family coverage, with out-of-pocket maximums of no more than $8,050 or $16,100, respectively. HDHPs usually have lower premiums, and sometimes companies contribute to a health savings account (HSA) to help cover the deductible [Data: Sources (43227, 43226)].

6. **Medicare Advantage**: These are privately run versions of the federal government’s Medicare program, mostly for people aged 65 and over. They often include prescription drug coverage and may offer additional benefits like dental or vision coverage not provided by traditional Medicare [Data: Sources (72063, 72061, 72060, 72062)].

7. **Short-Term Health Insurance Plans**: These plans are limited to three months and can only be renewed for a maximum of four months under new rules. They are intended for temporary coverage but often lack comprehensive benefits [Data: Sources (97999, 97995, 97996, 97997)].

### Cost Factors: Premiums, Deductibles, Co-pays, and Out-of-Pocket Maximums

The overall cost of health insurance plans in 2024 is influenced by several factors:

- **Premiums**: This is the set monthly cost you pay for your health insurance plan. Premiums have been rising, with a notable increase of 7% for both family and single plans in 2023, partly due to inflation [Data: Sources (83383, 83382, 83384, 83385, 83381, +more)].

- **Deductibles**: The amount you pay out-of-pocket for health care services before your insurance starts to pay. For HDHPs, the deductible is at least $1,600 for individual coverage or $3,200 for family coverage [Data: Sources (43226, 43225)].

- **Co-pays and Co-insurance**: These are the costs you pay each time you receive a medical service. Co-pays are fixed amounts, while co-insurance is a percentage of the service cost.

- **Out-of-Pocket Maximums**: This is the maximum amount you will pay for covered services in a year. For example, HDHPs have out-of-pocket maximums of no more than $8,050 for individual coverage or $16,100 for family coverage [Data: Sources (43227, 43226)].

### Provider Networks: In-Network vs. Out-of-Network

The network of healthcare providers is a crucial factor in selecting a health insurance plan:

- **In-Network Providers**: These are doctors and hospitals that have agreements with your insurance plan to provide services at lower rates. Ensuring your preferred doctors and specialists are in-network can save you significant costs [Data: Sources (43216, 47419)].

- **Out-of-Network Providers**: Services from these providers are usually more expensive and may not be covered at all, except in emergencies. PPO plans offer some out-of-network coverage but at a higher cost, while HMOs and EPOs generally do not cover non-emergency out-of-network care [Data: Sources (43217, 47421)].

### Specific Medical Needs and Services

When selecting a health insurance plan, individuals should consider their specific medical needs:

- **Prescription Drugs**: Ensure that your medications are covered by the plan’s formulary, as drug coverage can change annually [Data: Sources (43220, 43218, 43219)].

- **Mental Health Services**: Coverage for mental health treatments is essential, especially with new rules pushing insurers to increase their coverage of these services [Data: Sources (97031, 97028, 97027, 97030, 97033, +more)].

- **Chronic Conditions**: Plans should cover ongoing treatments and medications for chronic conditions. Medicare Supplement Insurance (Medigap) can help cover gaps in Medicare for chronic disease management [Data: Sources (93367, 93368)].

- **Preventive Care**: Coverage for preventive services like cancer screenings and HIV prevention is mandated under the ACA, though its future is uncertain due to ongoing legal battles [Data: Sources (71106, 71109, 71098, 71099, 71100, +more)].

### Key Dates and Steps for Open Enrollment

The open enrollment period for 2024 health insurance plans involves several key dates and steps:

- **Marketplace Plans**: Open enrollment starts on November 1, 2023, and runs through mid-December in most states, ending on January 16, 2024 [Data: Sources (47419, 47411, 47416, 47421, 47409, +more)].

- **Medicare**: Open enrollment for Medicare runs from October 15, 2023, to December 7, 2023. During this period, individuals can choose between traditional Medicare, Medicare Advantage plans, and prescription drug plans [Data: Sources (72061, 72063, 72060, 72062)].

- **Special Enrollment Periods**: Individuals who lose coverage due to life events like job loss or moving may qualify for special enrollment periods. For example, those removed from Medicaid may enroll in marketplace plans through July 2024 [Data: Sources (6288, 6289)].

By considering these factors, individuals can make informed decisions about their health insurance coverage for 2024, ensuring they select plans that best meet their medical needs and financial situations.

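LazyGraphRAG shows that a single, flexible query mechanism can substantially outperform a diverse range of specialized query mechanisms across the local-global query spectrum, without the up-front cost of LLM data summarization. Its very fast and almost free indexing makes LazyGraphRAG well suited to one-off queries, exploratory analysis, and streaming data use cases, while its ability to improve answer quality smoothly as the relevance test budget increases makes it a valuable tool for benchmarking RAG approaches (e.g., "RAG approach X beats LazyGraphRAG with budget Y on task Z").

Does this mean that all graph-enabled RAG should be lazy? Not so, for three reasons:

  1. A GraphRAG data index of entity, relationship, and community summaries has uses beyond question answering (e.g., reading and sharing as reports).
  2. A GraphRAG data index of entity, relationship, and community summaries, combined with a LazyGraphRAG-like search mechanism, may achieve better results than LazyGraphRAG alone.
  3. A new kind of GraphRAG data index designed to support a LazyGraphRAG-like search mechanism (e.g., through pre-emptive claim and topic extraction) may achieve the best results.

We will explore these directions in the coming period, with all advances (including LazyGraphRAG itself) released via the GraphRAG GitHub repository.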

Reproduction without permission is prohibited: 首席AI分享圈 » LazyGraphRAG: Substantially Improving GraphRAG's Quality and Cost
