LettuceDetect: an efficient tool for detecting hallucinations in RAG systems

General Introduction

LettuceDetect is a lightweight open-source tool developed by KRLabsOrg that specializes in detecting hallucinated content generated by Retrieval-Augmented Generation (RAG) systems. It compares the context, the question, and the answer, and identifies the parts of the answer that are not supported by the context, which helps developers improve RAG system accuracy. The tool is built on ModernBERT and supports contexts of up to 4096 tokens. It performs strongly on the RAGTruth dataset, where the large model reaches an F1 score of 79.22%, outperforming many existing solutions, while being more efficient than traditional encoder models and far less computationally expensive than large language models (LLMs). The project is released under the MIT license, and the code and models are free and open, which suits users who need to improve the reliability of AI-generated content.

Function List

Token-level detection: Analyzes the answer token by token and marks hallucinated parts precisely.
Span-level detection: Identifies complete hallucinated spans in the answer and outputs their positions and confidence.
Long-context processing: Supports contexts of up to 4096 tokens for complex tasks.
Efficient inference: Available in 150M- and 396M-parameter versions; processes 30-60 samples per second on a single GPU.
Open-source integration: Installs via pip, provides a clean Python API, and is easy to embed in a RAG system.
Multiple output formats: Supports token-level probabilities and span-level predictions for easy analysis.
Performance benchmarks: Detailed evaluation results on the RAGTruth dataset make comparisons easy.

Usage Guide

LettuceDetect is lightweight and efficient, and a simple installation is all it takes to get started. Below is a detailed installation and usage guide to help you learn its features from scratch.

Installation process

Preparing the Python Environment
Make sure Python 3.8 or higher and pip are installed. A virtual environment is recommended:
```
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows
```

Installing LettuceDetect
Install the latest version from PyPI:
```
pip install lettucedetect
```
The installation pulls in the core dependencies automatically. The ModernBERT-based model weights are downloaded the first time the detector is run.

Verify Installation
Check that the installation succeeded by running the following code in a Python interpreter:

from lettucedetect.models.inference import HallucinationDetector
print("LettuceDetect installed successfully!")
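If you would rather check for the package without importing it, the standard library's importlib can look it up. This is a minimal sketch; the helper name is_installed is made up for illustration and is not part of LettuceDetect.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located, without importing it."""
    return importlib.util.find_spec(package_name) is not None


# Reports True only when the package is visible to this interpreter.
print("lettucedetect installed:", is_installed("lettucedetect"))
```

A False result inside a virtual environment usually means pip installed the package into a different interpreter.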

Basic usage

LettuceDetect provides a clean Python API to detect hallucinations with just a few lines of code. Here is a basic example:

Sample code

from lettucedetect.models.inference import HallucinationDetector
# Initialize the detector
detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)
# Input data
contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."
# Run span-level detection
predictions = detector.predict(
    context=contexts,
    question=question,
    answer=answer,
    output_format="spans"
)
# Print the results
print("Detection results:", predictions)

Sample output:

Detection results: [{'start': 31, 'end': 71, 'confidence': 0.994, 'text': ' The population of France is 69 million.'}]

The result shows that the sentence "The population of France is 69 million." is flagged as a hallucination, because the context states a population of 67 million.
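In the sample output, start and end line up with character offsets into the answer string (answer[31:71] is exactly the flagged sentence). Assuming that holds in general, a small helper can mark the flagged spans inline for review. The function mark_spans and the bracket markers are illustrative choices, not part of the LettuceDetect API.

```python
def mark_spans(answer, spans, open_tag="[[", close_tag="]]"):
    """Wrap each predicted span of `answer` in visible markers.

    Assumes `start`/`end` are character offsets into `answer`
    and that spans do not overlap.
    """
    marked = answer
    # Insert from the last span to the first so earlier offsets stay valid.
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        start, end = span["start"], span["end"]
        marked = marked[:start] + open_tag + marked[start:end] + close_tag + marked[end:]
    return marked


answer = "The capital of France is Paris. The population of France is 69 million."
spans = [{"start": 31, "end": 71, "confidence": 0.994,
          "text": " The population of France is 69 million."}]

print(mark_spans(answer, spans))
# The capital of France is Paris.[[ The population of France is 69 million.]]
```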

Main workflow

1. Initialize the detector

Parameters:
- method: Only "transformer" is currently supported.
- model_path: Either KRLabsOrg/lettucedect-base-modernbert-en-v1 (150M) or KRLabsOrg/lettucedect-large-modernbert-en-v1 (396M).
Tip: The base version is lightweight and fast, while the large version is more accurate.

2. Prepare the input

Context: Pass a list of strings containing the background information, which should be in English.
Question: Pass the specific question, which should be relevant to the context.
Answer: Pass the response generated by the RAG system.
Note: Make sure the total length of the context does not exceed 4096 tokens.

3. Run detection

Method call: Use detector.predict().
Output formats:
- "spans": Returns the start and end positions, text, and confidence of each hallucinated span.
- "tokens": Returns the hallucination probability of each token.
Tip: Choose the output format that fits the task: span level for a quick overview, token level for deeper analysis.

4. Analyze the results

Span-level output: Examine the text and confidence of each hallucinated span. A confidence close to 1 indicates a high probability of hallucination.
Token-level output: Inspect the prob value token by token to pinpoint specific errors.
Follow-up: Tune the RAG system or record the issues based on the results.
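As a sketch of the span-level analysis step, the helper below keeps only spans whose confidence meets a cutoff and lists the most confident first. The 0.9 threshold, the function name, and the second (low-confidence) entry in the sample data are assumptions made for illustration.

```python
def high_confidence_spans(spans, threshold=0.9):
    """Return spans with confidence >= threshold, most confident first."""
    kept = [s for s in spans if s["confidence"] >= threshold]
    return sorted(kept, key=lambda s: s["confidence"], reverse=True)


spans = [
    {"start": 31, "end": 71, "confidence": 0.994,
     "text": " The population of France is 69 million."},
    # Hypothetical low-confidence entry, added only to show the filter working.
    {"start": 0, "end": 31, "confidence": 0.41,
     "text": "The capital of France is Paris."},
]

for span in high_confidence_spans(spans):
    print(f"{span['confidence']:.3f} {span['text'].strip()}")
```

The right cutoff depends on how costly a missed hallucination is compared with a false alarm, so it is worth tuning on your own data.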

Featured Functions

Token level detection

LettuceDetect supports token-by-token analysis for fine-grained hallucination detection:

predictions = detector.predict(
context=contexts,
question=question,
answer=answer,
output_format="tokens"
)
print(predictions)

Sample output:

[{'token': '69', 'pred': 1, 'prob': 0.95}, {'token': 'million', 'pred': 1, 'prob': 0.95}]

This shows that "69 million" is flagged as a hallucination. Token-level output suits scenarios that need to pinpoint exactly which words are wrong.
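Token-level results are often easier to read once neighboring flagged tokens are grouped into phrases. The sketch below does that for a list of dicts shaped like the sample output. It assumes the list may also contain tokens with pred equal to 0 (the first entry is made up to show this), and it joins tokens with spaces, which is a simplification for subword tokenizers.

```python
def flagged_phrases(token_preds):
    """Group runs of consecutive tokens with pred == 1 into phrases."""
    runs, current = [], []
    for item in token_preds:
        if item["pred"] == 1:
            current.append(item)
        elif current:
            # A non-flagged token closes the current run.
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return [
        {"text": " ".join(t["token"] for t in run),
         "max_prob": max(t["prob"] for t in run)}
        for run in runs
    ]


token_preds = [
    {"token": "is", "pred": 0, "prob": 0.02},  # hypothetical non-flagged token
    {"token": "69", "pred": 1, "prob": 0.95},
    {"token": "million", "pred": 1, "prob": 0.95},
]

print(flagged_phrases(token_preds))
# [{'text': '69 million', 'max_prob': 0.95}]
```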

Long Context Support

For long-text tasks, LettuceDetect can process inputs of up to 4096 tokens:

contexts = ["A long context repeated many times..." * 50]
predictions = detector.predict(context=contexts, question="...", answer="...")

Just make sure the input is within the limits.
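One way to check the limit before calling the detector is to count tokens up front. The sketch below is an assumption-laden illustration: the function name is made up, it counts context, question, and answer together, and its default counter splits on whitespace, which usually undercounts compared with a subword tokenizer. Pass a counter built on the model's own tokenizer for an exact figure.

```python
def within_token_limit(contexts, question, answer, count_tokens=None, limit=4096):
    """Return (fits, total) for the combined input.

    `count_tokens` maps a string to a token count. The default is a rough
    whitespace-based proxy, not the model's real tokenizer.
    """
    if count_tokens is None:
        count_tokens = lambda text: len(text.split())
    total = sum(count_tokens(text) for text in [*contexts, question, answer])
    return total <= limit, total


contexts = ["France is a country in Europe. The capital of France is Paris."]
fits, total = within_token_limit(contexts, "What is the capital of France?", "Paris.")
print(fits, total)
```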

Streamlit Demo

LettuceDetect provides an interactive demo:

Install Streamlit:
```
pip install streamlit
```
Run the demo (the path is relative to the project repository, so run this from a local clone):
```
streamlit run demo/streamlit_demo.py
```
Enter the context, question, and answer in your browser to view the detection results in real time.

Advanced Use

Training customized models

Download the RAGTruth dataset and put it in the data/ragtruth folder.

Preprocessing data:

python lettucedetect/preprocess/preprocess_ragtruth.py --input_dir data/ragtruth --output_dir data/ragtruth

Training models:

python scripts/train.py --data_path data/ragtruth/ragtruth_data.json --model_name answerdotai/ModernBERT-base --output_dir outputs/hallucination_detector --batch_size 4 --epochs 6 --learning_rate 1e-5

Performance optimization

GPU acceleration: Install the CUDA build of PyTorch to improve inference speed.
Batch processing: Put multiple samples into the contexts list to check them in a single pass.
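When each sample has its own question and answer, a simple alternative is to call detector.predict once per sample, using the same keyword arguments as the basic example. The sketch below is not a batching API of the library: detect_many is a made-up helper, and EchoDetector is a hypothetical stand-in that lets the code run without downloading a model.

```python
def detect_many(detector, samples, output_format="spans"):
    """Call detector.predict once per sample and return results in order."""
    results = []
    for sample in samples:
        results.append(detector.predict(
            context=sample["context"],
            question=sample["question"],
            answer=sample["answer"],
            output_format=output_format,
        ))
    return results


class EchoDetector:
    """Hypothetical stand-in: flags the whole answer with a fixed confidence."""

    def predict(self, context, question, answer, output_format="spans"):
        return [{"start": 0, "end": len(answer), "confidence": 0.5, "text": answer}]


samples = [
    {"context": ["The capital of France is Paris."],
     "question": "What is the capital of France?",
     "answer": "Paris."},
    {"context": ["The population of France is 67 million."],
     "question": "What is the population of France?",
     "answer": "69 million."},
]

for sample, preds in zip(samples, detect_many(EchoDetector(), samples)):
    print(sample["answer"], "->", len(preds), "span(s)")
```

In real use, pass the HallucinationDetector instance from the basic example in place of EchoDetector.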

Caveats

Input must be in English; other languages are not supported at this time.
Make sure you have network access so that the model can be downloaded on the first run.

With these steps, users can use LettuceDetect to detect hallucinations in RAG systems and improve the reliability of generated content.