General Introduction
LettuceDetect is a lightweight open-source tool developed by KRLabsOrg that detects hallucinated content in the output of Retrieval Augmented Generation (RAG) systems. It compares the context, the question, and the answer, and identifies the parts of the answer that are not supported by the context, helping developers improve the accuracy of their RAG systems. The tool is built on ModernBERT and supports contexts of up to 4096 tokens. On the RAGTruth dataset, the large model reaches an F1 score of 79.22%, outperforming many existing solutions while being more efficient than traditional encoder models and far less computationally expensive than Large Language Models (LLMs). The project is released under the MIT license, and the code and models are free and open to anyone who needs to improve the reliability of AI-generated content.
Function List
- Token-level detection: Analyzes responses token by token, marking hallucinated parts precisely.
- Span-level detection: Identifies complete hallucinated spans in a response and outputs their positions and confidence scores.
- Long-context processing: Supports contexts of up to 4096 tokens for complex tasks.
- Efficient inference: The model is available in 150M and 396M parameter versions and processes 30-60 samples per second on a single GPU.
- Open-source integration: Installs via pip, provides a clean Python API, and is easy to embed into a RAG system.
- Multiple output formats: Supports token-level probabilities and span-level predictions for easy analysis.
- Performance benchmarks: Detailed evaluation results on the RAGTruth dataset are available to facilitate comparisons.
Using Help
LettuceDetect is a lightweight, efficient tool that takes only a simple installation to get started. Below is a detailed installation and usage guide that walks through its features from scratch.
Installation process
- Preparing the Python Environment
Make sure you have Python 3.8 or higher and pip installed. A virtual environment is recommended:
python -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
- Installing LettuceDetect
Install the latest version from PyPI:
pip install lettucedetect
The installation automatically pulls in the core dependencies. The ModernBERT model weights themselves are downloaded the first time a detector is created.
- Verify Installation
Check for success by running the following code in a Python terminal:
from lettucedetect.models.inference import HallucinationDetector
print("LettuceDetect installed successfully!")
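A slightly more defensive check, sketched below, looks the package up without importing it, so a missing install produces a clear message instead of a traceback. The helper name is_installed is illustrative and is not part of LettuceDetect.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_installed("lettucedetect"):
        print("LettuceDetect installed successfully!")
    else:
        print("LettuceDetect not found; try: pip install lettucedetect")
```

This only confirms that the package is importable; it does not load a model.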
Basic usage
LettuceDetect provides a clean Python API to detect hallucinations with just a few lines of code. Here is a basic example:
Sample code:
from lettucedetect.models.inference import HallucinationDetector
# Initialize the detector
detector = HallucinationDetector(
method="transformer",
model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1"
)
# Input data
contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."
# Run span-level detection
predictions = detector.predict(
context=contexts,
question=question,
answer=answer,
output_format="spans"
)
# Output results
print("Results:", predictions)
Sample output:
Results: [{'start': 31, 'end': 71, 'confidence': 0.994, 'text': ' The population of France is 69 million.'}]
The results show that "The population of France is 69 million." is flagged as a hallucination, because the context states a population of 67 million.
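The start and end values are character offsets into the answer string, so a span can be located by slicing. The sketch below marks the flagged spans inline; highlight_spans is a hypothetical helper written for this guide, and it assumes the spans do not overlap.

```python
def highlight_spans(answer, spans, open_mark="[[", close_mark="]]"):
    """Wrap each predicted span of `answer` in markers, using start/end offsets."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s["start"]):
        pieces.append(answer[cursor:span["start"]])  # untouched text before the span
        pieces.append(open_mark + answer[span["start"]:span["end"]] + close_mark)
        cursor = span["end"]
    pieces.append(answer[cursor:])  # remaining text after the last span
    return "".join(pieces)


answer = "The capital of France is Paris. The population of France is 69 million."
# Span-level prediction copied from the sample output above.
spans = [{"start": 31, "end": 71, "confidence": 0.994,
          "text": " The population of France is 69 million."}]

marked = highlight_spans(answer, spans)
print(marked)
```

Slicing answer[31:71] returns exactly the text field of the prediction, which is a quick way to confirm that offsets and answer string belong together.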
Main function operation flow
1. Initialize Detector
- Parameter description:
method: Only "transformer" is currently supported.
model_path: Either KRLabsOrg/lettucedect-base-modernbert-en-v1 (150M) or KRLabsOrg/lettucedect-large-modernbert-en-v1 (396M).
- Choosing a model: The base version is lightweight and fast, while the large version is more accurate.
2. Prepare the input
- Context: Pass a list of strings containing background information, which should be in English.
- Question: Enter specific questions that need to be relevant to the context.
- Answer: Enter the responses generated by the RAG system.
- Note: Ensure that the total length of the context does not exceed 4096 tokens.
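A rough pre-check of the input size can catch oversized inputs before they reach the model. The sketch below is only an approximation: it counts whitespace-separated words by default, which usually undercounts compared with the real ModernBERT tokenizer, and it assumes the question and answer also count toward the 4096-token budget. Both helper names are hypothetical; pass a real tokenizer-based counter as count_tokens for accurate numbers.

```python
def whitespace_count(text):
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def within_budget(contexts, question, answer, limit=4096,
                  count_tokens=whitespace_count):
    """Return (fits, total) for the combined size of all inputs.

    `count_tokens` is any callable mapping a string to a token count.
    """
    total = sum(count_tokens(text) for text in [*contexts, question, answer])
    return total <= limit, total


contexts = ["France is a country in Europe. The capital of France is Paris."]
fits, total = within_budget(contexts, "What is the capital of France?",
                            "The capital of France is Paris.")
print(fits, total)
```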
3. Run detection
- Method call: Use detector.predict().
- Output format:
"spans": Returns the start and end positions, text, and confidence of each hallucinated span.
"tokens": Returns the hallucination probability of each token.
- Tip: Choose the output format that fits the task: span level for a quick overview, token level for deeper analysis.
4. Analyze the results
- Span-level output: Examine the text and confidence of each hallucinated span. A confidence close to 1 indicates a high probability of hallucination.
- Token-level output: Inspect the prob value token by token to pinpoint where the errors are.
- Follow-up: Tune the RAG system or log the issues based on the results.
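As one way to act on span-level results, the sketch below keeps only high-confidence spans and reports what share of the answer they cover. The 0.9 threshold and both function names are illustrative choices, not LettuceDetect defaults, and the coverage figure assumes the spans do not overlap.

```python
def confident_spans(spans, threshold=0.9):
    """Keep only spans whose confidence is at or above `threshold`."""
    return [s for s in spans if s["confidence"] >= threshold]


def hallucinated_fraction(answer, spans):
    """Share of the answer's characters covered by the given spans."""
    if not answer:
        return 0.0
    covered = sum(s["end"] - s["start"] for s in spans)
    return covered / len(answer)


answer = "The capital of France is Paris. The population of France is 69 million."
spans = [{"start": 31, "end": 71, "confidence": 0.994,
          "text": " The population of France is 69 million."}]

kept = confident_spans(spans)
print(len(kept), round(hallucinated_fraction(answer, kept), 3))
```

A high fraction suggests the answer as a whole should be regenerated, while a low one may only need a targeted correction.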
Featured Functions
Token level detection
LettuceDetect supports token-by-token analysis to provide fine-grained hallucination detection:
predictions = detector.predict(
context=contexts,
question=question,
answer=answer,
output_format="tokens"
)
print("Results:", predictions)
Sample output:
Results: [{'token': '69', 'pred': 1, 'prob': 0.95}, {'token': 'million', 'pred': 1, 'prob': 0.95}]
This shows that "69 million" is flagged as a hallucination. Token-level output suits scenarios that require precise tuning.
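Token-level results are often easier to read once adjacent flagged tokens are grouped. The sketch below collects consecutive tokens whose prob is at or above a threshold; flagged_runs and the 0.5 cutoff are assumptions made for illustration, and the two unflagged tokens in the sample data are invented to show the grouping.

```python
def flagged_runs(token_preds, threshold=0.5):
    """Group consecutive tokens whose hallucination probability >= threshold."""
    runs, current = [], []
    for item in token_preds:
        if item["prob"] >= threshold:
            current.append(item["token"])
        elif current:
            runs.append(current)  # an unflagged token closes the current run
            current = []
    if current:
        runs.append(current)
    return runs


# The '69' and 'million' entries follow the sample output above;
# the 'is' and '.' entries are made-up values for illustration only.
token_preds = [
    {"token": "is", "pred": 0, "prob": 0.02},
    {"token": "69", "pred": 1, "prob": 0.95},
    {"token": "million", "pred": 1, "prob": 0.95},
    {"token": ".", "pred": 0, "prob": 0.10},
]
runs = flagged_runs(token_preds)
print(runs)
```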
Long Context Support
For long text tasks, LettuceDetect can process 4096 tokens:
contexts = ["A long context repeated many times..." * 50]
predictions = detector.predict(context=contexts, question="...", answer="...")
Just make sure the input is within the limits.
Streamlit Demo
LettuceDetect provides interactive demos:
- Install Streamlit:
pip install streamlit
- Run the demo:
streamlit run demo/streamlit_demo.py
- Enter context, questions and answers in your browser to view test results in real time.
Advanced Use
Training customized models
- Download the RAGTruth dataset and place it in the data/ragtruth folder.
- Preprocess the data:
python lettucedetect/preprocess/preprocess_ragtruth.py --input_dir data/ragtruth --output_dir data/ragtruth
- Train the model:
python scripts/train.py --data_path data/ragtruth/ragtruth_data.json --model_name answerdotai/ModernBERT-base --output_dir outputs/hallucination_detector --batch_size 4 --epochs 6 --learning_rate 1e-5
Performance optimization
- GPU acceleration: Install the CUDA build of PyTorch to improve inference speed.
- Batch processing: Put multiple samples into the contexts list to check them in a single pass.
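When each sample has its own question and answer, one simple option is to loop over them and call predict() once per sample with the same keyword arguments used in the basic example. The sketch below shows that pattern; run_batch and StubDetector are hypothetical names, and the stub is a toy stand-in (it flags any answer not found verbatim in the context) so the loop can be exercised without downloading a model. It does not reflect how LettuceDetect makes its predictions.

```python
def run_batch(detector, samples, output_format="spans"):
    """Call detector.predict() once per sample and collect the results in order."""
    results = []
    for sample in samples:
        results.append(detector.predict(
            context=sample["context"],
            question=sample["question"],
            answer=sample["answer"],
            output_format=output_format,
        ))
    return results


class StubDetector:
    """Toy stand-in for a real detector, used only to exercise run_batch."""

    def predict(self, context, question, answer, output_format="spans"):
        if any(answer in passage for passage in context):
            return []
        return [{"start": 0, "end": len(answer), "confidence": 1.0, "text": answer}]


samples = [
    {"context": ["The capital of France is Paris."],
     "question": "What is the capital of France?",
     "answer": "The capital of France is Paris."},
    {"context": ["The population of France is 67 million."],
     "question": "What is the population of France?",
     "answer": "The population of France is 69 million."},
]
results = run_batch(StubDetector(), samples)
print(results)
```

With a real detector, replace StubDetector() with the HallucinationDetector instance created earlier; the loop itself stays the same.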
Caveats
- Input must be in English; other languages are not currently supported.
- Make sure you have network access so that the model can be downloaded on the first run.
With these steps, users can use LettuceDetect to detect hallucinations in RAG systems and improve the reliability of generated content.