DeepSeek R1 in RAG: Practical Experience Summary

DeepSeek R1 demonstrated strong reasoning capabilities in its first release. In this blog post, we share what we learned from using DeepSeek R1 to build a Retrieval-Augmented Generation (RAG) system specialized in legal documents.


We chose legal documents because legal professionals often face a daunting task: searching through a vast body of cases, statutes, and law reviews. Even the most careful research can be hampered by the difficulty of finding the right document, let alone accurately summarizing its content. This is exactly the kind of area where RAG can make a big difference (see also Harvey AI).

We built a RAG system on a large legal document dataset, using DeepSeek R1 for generation and Qwen2 as the embedding model. We open-sourced the entire RAG build process (powered by vLLM and SkyPilot) and share some valuable lessons learned: what works and what doesn't.

Want to try it yourself 🔥? Try our online demo: click here.

We have open-sourced this RAG build process: click here to view it. Feel free to explore how we combine vLLM and SkyPilot for fast, stable, and scalable performance.

Why build a RAG on legal documents?

Retrieval-Augmented Generation (RAG) is crucial in domains such as law, where accuracy and traceability are critical. By combining a specialized retriever with a powerful Large Language Model (LLM) generator, we get the best of both:

  1. Retrieval of highly relevant documents
  2. Generation of text with strong reasoning
  3. Reduction of hallucinations through direct citations

To build this RAG, we used:

  • A subset of the pile-of-law/pile-of-law dataset (focused on legal advice);
  • ChromaDB as the vector store for storing and querying embeddings;
  • Qwen2 as the embedding model for retrieval;
  • DeepSeek R1 as the generation model for the final answer.

The entire system runs on vLLM and can be deployed seamlessly to any cloud with SkyPilot for good GPU availability and cost effectiveness. The full open-source RAG build process is available here.

Lessons learned: dos and don'ts

1. Do not use DeepSeek R1 for retrieval

Although DeepSeek R1 has excellent reasoning ability, it is unsuitable for generating embeddings, at least for now.

We found that the embeddings generated by DeepSeek R1 are much worse than those of the specialized embedding model Alibaba-NLP/gte-Qwen2-7B-instruct (currently the top-ranked embedding model on the MTEB leaderboard).

We used each of the two models to generate embeddings for the dataset and built two vector databases. We then issued the same query against both and retrieved the top 5 most similar embeddings from each database.
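The comparison step can be illustrated with a minimal, dependency-free sketch of top-k retrieval by cosine similarity. This is not the code used in the experiment (which stores embeddings in ChromaDB); the function names and the tiny three-dimensional vectors are hypothetical and exist only to show the ranking logic.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=5):
    """Return the k (doc_id, score) pairs most similar to query_vec.

    `index` is a dict mapping doc_id -> embedding vector.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" (made-up values, for illustration only).
toy_index = {
    "small_claims_debt": [0.9, 0.1, 0.0],
    "civil_summons": [0.8, 0.2, 0.1],
    "marijuana_charge": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
results = top_k(query, toy_index, k=2)
```

Running the same `top_k` call against two indexes built from different embedding models, with the same query, gives the kind of side-by-side comparison described above.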

Query: I received a small claims court summons regarding a debt dispute, how should I prepare?

| Results with Qwen2 embeddings | Results with DeepSeek R1 embeddings |
|---|---|
| [CA] One question on small claims | Can I sue a company that hires telemarketers and keeps calling me in defiance of the law? |
| [CO] Just received a court subpoena for an insurance paid hospital bill ... I have proof but do I still need to hire an attorney to resolve this? | What do we do when an ex-girlfriend's daughter is arrested for shoplifting? |
| I'm being sued for defaulting on a $3,500 loan. | Late at night after work, you walk into a nearly empty parking lot and see two men prying on your car door. You tell them to stop, but they ignore you. You are in possession of a concealed handgun, what can you do? |
| Roommate's cat ate some nutty trail mix snacks and had to go to the vet with a bill over $1,500. She's threatening to take me to small claims court to get me to pay for it, but there's no evidence that the snacks are mine. Should I be worried? | Missouri possession of marijuana charge with no physical evidence. |
| I recently received a civil summons regarding a debt I thought I had paid. | Is it legal to "set a dog on someone"? |

Why the difference?

In the table above, DeepSeek R1's retrieval results are significantly worse. Why is this?

We believe the fundamental problem lies in the way DeepSeek R1 is trained. DeepSeek R1 is designed primarily as a reasoning engine, emphasizing sequential thinking and logical connections. As a result, it is not trained to map documents into a semantic similarity space.

In contrast, the Qwen2 variant (gte-Qwen2-7B-instruct) is trained specifically for semantic similarity tasks. It creates a high-dimensional space in which conceptually similar documents cluster tightly together, regardless of the specific wording.

This difference in training approach means that Qwen2 performs well in capturing query intent, whereas DeepSeek R1 sometimes follows inference paths that lead to retrieved results that are thematically relevant but practically irrelevant.

DeepSeek R1 should not be used as a retrieval embedding model for RAGs unless it is fine-tuned for embedding.

2. Generation with R1: impressive reasoning power

Although R1 struggles with embeddings, we found its generation ability remarkable. By leveraging R1's chain-of-thought approach, we observed:

  • Stronger coherence: the model synthesizes information from multiple documents and clearly references the relevant passages.
  • Reduced hallucinations: R1 reasons internally ("talks to itself") and validates each conclusion against the retrieved data.

Let's look at a few examples:

Example 1: Termination of Lease

Question: I want to get out of the lease, but the landlord won't let me.

DeepSeek R1's thought process:

Okay, I want to terminate my lease, but the landlord won't allow it. I'm not sure what options I have. I'm trying to sort things out based on the documentation provided.

First of all, I remember reading in Document 1 about the penalty for early termination of a tenancy. One person asked if it was possible to terminate a tenancy 15 months into an 18-month lease, and the answer is yes, but only if the landlord agrees or a new tenant is found. Therefore, if I wanted to terminate the lease early, the landlord might hold me liable unless they agreed or could re-let quickly. This seems like a possible route, but I'm not sure my landlord would be willing to negotiate.

Next, in Document 2, someone wants to terminate the lease in order to get married. The answer suggests that the lease may allow for a fee to be paid to break the lease, or it may need to wait until the end of the lease term. So, I can check if my lease has a clause that allows for termination, such as a sublet or assignment clause. If not, I may need to pay a penalty or wait until the end of the lease.

Document 3 discusses the termination of a lease due to a service dog. Options include paying a penalty, subletting, finding another service dog, or waiting until the lease expires. This doesn't quite work for me, but illustrates that unless there is a specific reason, terminating a lease usually has some consequences.

In Document 4, the landlord had increased the rent and the user queried the short notice period. The answer mentions that in theory there should be a 30 day notice period, but that the landlord may offer a compromise such as agreeing to pay the original rent until the next tenancy. If my landlord refuses to let me out of my lease, they may offer an arrangement at a higher rent, but I don't know if that is within my financial means.

Document 5 mentions that the landlord refused to renew the lease because of the cat. The answer indicated that the landlord was not legally obligated to renew the lease and therefore had the option of not renewing the lease. This got me thinking that my landlord may have similar discretion, but I'm not sure if that was the reason for their refusal.

Document 6 discusses the situation where a lease is renewed and new terms are attached. The answer states that the landlord must honor the existing lease until it expires. Therefore, if the landlord refuses to let me out of the lease, it may mean that they have to wait until the end of the lease unless there is a special clause in the lease that allows them not to renew the lease.

Documents 7 and 8 discuss the difficulties faced by leasing offices in renewing leases. The documents mention that the landlord may not want to renew the lease because of bad reviews or other reasons, but legally, the landlord can choose not to renew the lease. This suggests that the landlord has some power, but they may not be legally obligated to let me out of the lease early unless there is a specific reason.

Document 9 discusses the requirement to provide a 30-day notice before moving out. Even if I move in without signing any agreement, the landlord may still require notice. Therefore, if I move out without notice, I may face fines or have to pay rent for the remainder of the lease term.

Finally, Document 10 deals with a situation where the tenant is responsible for the entire rent when the shared roommate moves out. The answer suggests sharing the rent by subletting. If I were to move out, I might be able to try to sublet my portion of the lease, but it would depend on the terms of the lease.

In summary, I need to check if my lease has a clause about early termination. If not, I may have to pay a penalty unless the landlord agrees. If I have a valid reason, such as moving, I can try to negotiate with the landlord. If subletting is possible, that may be a solution, but I need to check whether the lease allows it. I should also consider whether I can wait until the natural end of the lease, if I can't find another way.

DeepSeek R1's reply:

Based on an analysis of the documentation provided, the following is a structured approach to terminating a lease when the landlord refuses to allow early termination:

1. Review of lease agreements
  • Early termination clauses: Check your lease agreement for clauses regarding early termination. If such a clause exists, you may be legally required to pay a penalty or give the landlord advance notice to terminate the lease [1].
  • Subletting or assigning: Determine if your lease agreement allows you to sublet or assign the lease to someone else. This may help you avoid fines or lawsuits [2].
2. Understanding landlord discretion
  • Renewal at the landlord's discretion: Your landlord may have the right to refuse to renew your lease, especially if your lease does not contain a clause that specifically allows for early termination. In this case, you may be required to honor the lease until the end of the term [6][8].
3. Consideration of financial consequences
  • Penalties: If you terminate your lease early without your landlord's consent, be prepared to potentially face financial penalties. This could include additional rent payments or a damage deposit [1][9].
4. Exploring negotiations
  • Negotiate terms: If you have a valid reason for early termination, such as relocation, you can negotiate with your landlord for reduced penalties or permission to terminate the lease early [2].
5. Subletting as a solution
  • Sublet your space: If possible, consider subletting a portion of your lease to help cover your financial obligations [10].
6. Planning for the future
  • Wait until the end of the lease: If early termination is not feasible, plan to stay until the lease expires. This avoids fines and legal problems [6][8].

Conclusion:

Always review the terms of your lease carefully before taking any action. If early termination is not allowed, consider subletting or waiting for the lease to end. If you can negotiate with your landlord, this may be the best course of action. Always consult a legal professional for advice on your specific situation.

Example 2: Small Claims Court

Question: I received a small claims court summons regarding a disputed debt. How should I prepare?

DeepSeek R1's thought process:

Okay, I just received a small claims court summons for a disputed debt. I'm not quite sure where to start, but I need to figure out how to prepare for this. From the search results, there are a couple people who have gone through similar situations, so maybe I can learn something from their experience.

First, I should probably understand the nature of the debt. In Document 5, someone receives a summons for a debt they thought they had already paid. They don't understand why they received another summons. So, I need to verify that this debt is valid. Maybe I should check my records or ask the creditor for proof of debt. According to Document 10, it is a good idea to deny the complaint and ask for proof. This way, I can question the credibility of the debt.

Next, I should prepare my case. In Document 1, someone dealt with a similar issue and the advice was to make sure the case could be presented in 10 minutes or less. This means that I need to be concise and have all my evidence organized. I should gather any documents that prove I don't owe the debt, such as payment receipts, letters from creditors indicating settlement, or insurance documents like in Document 2.

I also need to consider the witness issue. In Document 4, a roommate threatens to take this person to court over a vet bill, and the advice is to ignore the threat unless sued. But if I am sued, I should be prepared to have witnesses to back me up. Perhaps a friend who was with me when I tried to pay, or a former tenant who can testify to the condition of the house, as shown in Document 7.

It is important to understand the court process. As seen in Document 7, after winning a small claims case, the landlord filed an appeal, which meant that another court appearance was required. The person had to go through the same process again. So, I should know that even if I win, there may still be an appeal process. I should be prepared to present my case again if needed.

I also need to consider the summons and complaint. In Document 5, someone has to respond to the complaint. If I don't, a default judgment may be entered. Therefore, I must respond on time and clearly state my defense.

The filing of a statement of defense is the next step. Document 10 mentions that even if there is an offer, I should always refuse and ask for proof. This will give me more leverage in subsequent negotiations. I will list all my points and evidence in my statement of defense.

It is also important to consider my financial situation. In Document 6, there is a concern about being incarcerated for not being able to pay, but it has been clarified that debtor's prison does not exist. However, I should still communicate with the court about my financial situation. If I cannot pay, I need to show that I have made reasonable efforts and offer an installment plan, as shown in Document 5.

If I am unsure about a legal issue, perhaps I should consult an attorney. In Document 3, the person is moving and has a court date. Consulting a debt settlement attorney may be helpful, especially if I have other debts or complications.

Finally, I should be prepared for the court hearing. It is vital to show up on time, dress appropriately, and have all documents ready. In Document 7, this person won the case by being well-prepared with evidence, so I should follow suit.

In a nutshell, I need to: validate the debt, prepare evidence, understand court procedures, properly respond to the summons, possibly consult with an attorney, and be organized and ready for trial. Citing relevant documents will help me support my decisions.

DeepSeek R1's reply:

Based on the analysis of the legal documents provided, the following is a structured plan for preparing for a small claims court summons regarding a disputed debt:

Verification of debts
  • Check the creditor's proof: Start by reviewing any documents provided by the creditor to confirm the validity and details of the debt. Use this information to cross-check with your own records or any prior communications.
  • Request documents: Contact the creditor to request official proof of the debt, such as a court judgment document or settlement agreement. This step is essential to challenge the credibility of the debt, as recommended in Document 10.
Preparing your case
  • Gather evidence: Gather all relevant evidence, such as payment receipts, letters from creditors indicating settlement, or insurance documents, as mentioned in Documents 2 and 8.
  • Organize and present evidence: Ensure that all evidence is organized and ready to be presented in court, following the recommendations in Document 1. If required, consider including witness statements, such as friends who were present during payment attempts or former tenants of the property in question.
Understanding court procedures
  • Be prepared for multiple hearings: If you win your case, be aware that there may be an appeal process, as described in Document 7. Be prepared to present your case again if necessary.
  • Familiarize yourself with court procedures: Learn the basics of small claims court, including the structure of the hearing and the role of the judge.
Responding to the summons and complaint
  • File an answer: Make sure you respond to the summons in a timely manner to avoid a default judgment. State your defenses clearly in your answer and request proof of debt, as recommended in Document 10.
  • Deny the complaint: Always deny the complaint in order to take advantage of negotiation opportunities, as mentioned in Document 10.
Consideration of the financial situation
  • Communicate your financial situation: If you are unable to pay your debts, demonstrate your financial situation and provide an installment plan, as outlined in Document 5. Demonstrate that you have made reasonable efforts to resolve the debt.
Consult an attorney if necessary
  • Seek legal advice: If the situation is complex or you are unsure of your legal position, consult a debt settlement attorney, especially if other debts or complications are involved, as suggested in Document 3.
Preparation for court hearings
  • Be on time and prepared: Arrive early, dress appropriately, and bring all necessary documents. Refer to the example in Document 7 for a successful outcome of a well-prepared case.
Consider additional steps
  • Notify roommates or witnesses: If the debt involves shared property, consider notifying any witnesses or roommates who can provide support, as mentioned in Document 4.

By following these steps, you can confidently respond to a small claims court hearing, ensure that your case is effectively presented, and protect your rights.

From these examples, we observe that DeepSeek R1 has excellent reasoning capabilities. Its thought process clearly demonstrates how to draw conclusions from source legal documents:

  • R1 first builds a coherent model of the legal issue, as its detailed reasoning shows: "First, I remember something about early termination penalties ... Document 1 mentions ..." This "reason first" approach lets the model systematically connect concepts from multiple sources before it writes the answer.
  • When dealing with complex scenarios such as lease termination or small claims court, we did not observe R1 hallucinating. Instead, it explicitly worked through its understanding of each document ("In summary ...").
  • Finally, the generator explains its reasoning using precise citations that link each conclusion to its source. This creates a clear chain from question to reasoning to answer, which supports both rigor and accessibility.

We tried a variety of legal queries and found that the model was not only able to extract information from source documents, but also to learn and reason about their content.

Conclusion: For question answering and summarization tasks, R1 is a "gold mine" of step-by-step legal reasoning. Use it in the generation phase and you are unlikely to be disappointed.

3. Engineering remains important

Prompt design

Advanced reasoning ability does not eliminate the need for well-designed prompts. We found that explicit guidelines in the prompt are critical for:

  • Encouraging document citations throughout the generated answer.
  • Preventing hallucinations with a "cite a source or say you don't know" approach.
  • Structuring the final answer so that it is more readable.

During our experiments, we built the following prompt:

You are an AI assistant that helps analyze legal documents and related content. Please follow these guidelines when responding:
- In the search results provided, each document is formatted as [Document X begin]... [Document X end], where X is the document number.
- Use a citation format of the form [citation:X], where X is the document number, and place the citation after the relevant information.
- Include citations throughout the response, not just at the end.
- If the information is from more than one document, use multiple citations, e.g. [citation:1][citation:2].
- Not all search results are relevant - please evaluate and use only relevant information.
- For longer responses, please structure the content in paragraphs or sections to improve readability.
- If you can't find the answer in the documents provided, please say so directly rather than making up information.
- Some of the documents may be informal discussions or Reddit posts - please adapt your interpretation accordingly.
- Include as many citations as possible in the answer.
First, explain your thought process inside <think> tags.
Then, provide the final answer after the thought process.
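To make the conventions in this prompt concrete, here is a minimal sketch of the surrounding plumbing: wrapping retrieved texts in the [Document X begin]...[Document X end] format, separating the thought process from the final answer, and checking that every [citation:X] points at a document that was actually supplied. The function names are hypothetical, and the `<think>...</think>` delimiter is an assumption about how the reasoning is wrapped in the raw model output; this is not taken from the open-sourced pipeline.

```python
import re

def format_documents(docs):
    """Wrap each retrieved text as [Document X begin]...[Document X end], numbering from 1."""
    return "\n".join(
        f"[Document {i} begin]{text}[Document {i} end]"
        for i, text in enumerate(docs, start=1)
    )

def split_thinking(response):
    """Split a response into (thought_process, final_answer).

    Assumes the reasoning is wrapped in <think>...</think>; if no such
    block is found, the whole response is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        return "", response.strip()
    return match.group(1).strip(), response[match.end():].strip()

def extract_citations(answer, num_docs):
    """Return (valid, invalid) sorted lists of document numbers cited as [citation:X]."""
    cited = {int(n) for n in re.findall(r"\[citation:(\d+)\]", answer)}
    valid = sorted(n for n in cited if 1 <= n <= num_docs)
    invalid = sorted(n for n in cited if not 1 <= n <= num_docs)
    return valid, invalid

# Made-up documents and response, for illustration only.
docs = ["Tenant asks about breaking a lease early.", "Landlord refuses to renew."]
context = format_documents(docs)
response = "<think>Document 1 is relevant.</think>Check your lease [citation:1][citation:7]."
thought, answer = split_thinking(response)
valid, invalid = extract_citations(answer, num_docs=len(docs))
```

A non-empty `invalid` list (here the citation to a seventh document that was never supplied) is a cheap signal that an answer should be regenerated or flagged for review.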

Document chunking

In addition, we found that effective document chunking is essential for accurate document retrieval. Chunking helps each embedding represent a specific topic more accurately and reduces the number of tokens processed per embedding.

We use sentence-based segmentation (via NLTK) to chunk documents. Each chunk also begins and ends with content that overlaps with its neighboring chunks. This helps the model interpret partial references without losing the overall context. Below is the code used for document chunking:

import nltk  # sent_tokenize needs the NLTK "punkt" tokenizer data to be downloaded

def chunk_document(document, chunk_size=2048, overlap=512):
    """Split the document into overlapping chunks using sentence-based segmentation."""
    text = document['text']
    chunks = []

    def make_chunk(chunk_sentences):
        # Build one chunk record from a list of sentences
        return {
            'id': document['id'],
            'name': document['name'],
            'content': ' '.join(chunk_sentences),  # the chunk text, not the whole document
            # Length of the chunk text that precedes its last 2 entries (kept from the original listing)
            'chunk_start': len(' '.join(chunk_sentences[:-2])) if overlap > 0 else 0,
            # Other metadata fields...
        }

    # Split by sentence first
    sentences = nltk.sent_tokenize(text)
    current_chunk = []
    current_length = 0
    for sentence in sentences:
        sentence_len = len(sentence)
        # If adding this sentence would exceed the chunk size, save the current chunk
        if current_length + sentence_len > chunk_size and current_chunk:
            chunks.append(make_chunk(current_chunk))
            # Keep the last 2 entries as overlap; `overlap` only switches this on or off
            overlap_text = ' '.join(current_chunk[-2:])
            current_chunk = [overlap_text] if overlap > 0 else []
            current_length = len(overlap_text) if overlap > 0 else 0
        current_chunk.append(sentence)
        current_length += sentence_len + 1  # +1 for space
    # Save the remaining sentences as the final chunk (step missing from the original listing)
    if current_chunk:
        chunks.append(make_chunk(current_chunk))
    return chunks

Summary of key points:

  1. Use NLTK for sentence-based splitting rather than character-based splitting.
  2. Preserve document context by keeping overlapping sentences between chunks.

4. vLLM is efficient and fast

Due to the sheer volume of legal document data, generating embeddings for the RAG can take a long time.

Initially, we used HuggingFace's sentence_transformer. We first ran it on an Nvidia L4 GPU and hit the usual "CUDA out of memory" error. We then tried an Nvidia A100 and found that sentence_transformer requires 57GB of GPU memory to load the complete Alibaba-NLP/gte-Qwen2-7B-instruct model.

We switched to vLLM, a high-throughput, memory-efficient LLM inference and serving engine.

With vLLM, we can run the model on a standard Nvidia L4 GPU (~24GB of GPU memory). The L4 is also much less expensive than the A100: on GCP, an Nvidia L4 costs about $0.7/hour, while an Nvidia A100 costs at least $2.9/hour.

Comparing vLLM and sentence_transformer on an Nvidia A100 (80GB of GPU memory), we found that vLLM generates Qwen2 embeddings up to 5.5 times faster.

When processing 10,000 legal documents (15,000 chunks in total):

  • Standard sentence_transformer processing time: approx. 5.5 hours
  • vLLM processing time: approx. 1 hour

The following SkyPilot YAML configuration runs Alibaba-NLP/gte-Qwen2-7B-instruct on vLLM to generate embeddings:

name: compute-legal-embeddings

resources:
  # vLLM requires ~24GB DRAM GPU, SentenceTransformer requires 57GB
  accelerators: {L4:1, A100:1}

envs:
  START_IDX: ${START_IDX}
  END_IDX: ${END_IDX}
  MODEL_NAME: "Alibaba-NLP/gte-Qwen2-7B-instruct"

file_mounts:
  /output:
    name: my-bucket-for-embedding-output
    mode: MOUNT

setup: |
  pip install torch==2.5.1 vllm==0.6.6.post
  ...

run: |
  # Start the vLLM server in the background in embedding mode
  python -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --model $MODEL_NAME \
    --max-model-len 3072 \
    --task embed &

  # Compute embeddings for the assigned slice of the dataset
  python scripts/compute_embeddings.py \
    --start-idx $START_IDX \
    --end-idx $END_IDX \
    --chunk-size 2048 \
    --chunk-overlap 512 \
    --vllm-endpoint http://localhost:8000
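The contents of scripts/compute_embeddings.py are not shown here, but the client side can be sketched as follows. This is a rough illustration, assuming the vLLM server exposes its OpenAI-compatible /v1/embeddings route at the endpoint passed on the command line; the helper names are hypothetical, and only the request building and response parsing are exercised offline.

```python
import json
import urllib.request

def build_embedding_request(texts, model):
    """Build the JSON body for an OpenAI-style /v1/embeddings request."""
    return {"model": model, "input": list(texts)}

def parse_embedding_response(body):
    """Return embedding vectors ordered by the `index` field of each item."""
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

def embed(texts, model, endpoint="http://localhost:8000"):
    """POST texts to the server and return one vector per text (needs a running server)."""
    payload = json.dumps(build_embedding_request(texts, model)).encode("utf-8")
    request = urllib.request.Request(
        f"{endpoint}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_embedding_response(json.load(response))

# Offline check of the parsing logic with a hand-written response (made-up values).
fake_response = {"data": [
    {"index": 1, "embedding": [0.3, 0.4]},
    {"index": 0, "embedding": [0.1, 0.2]},
]}
vectors = parse_embedding_response(fake_response)
```

With the server from the YAML above running, a call such as `embed(chunk_texts, "Alibaba-NLP/gte-Qwen2-7B-instruct")` would return one vector per chunk, which can then be written to the vector store.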

5. Don't let embedding take 100+ hours - use SkyPilot for parallel processing!

Generating the legal document embeddings on a single L4 GPU takes about 100 hours. By spreading the work across Spot instances, we can dramatically reduce both time and cost:

| Metric | Original | 100 Spot instances |
|---|---|---|
| Time | 100 hours | 1 hour (100x speedup) |
| Cost | $70.1 | $28.2 (2.4x savings) |
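One way to fan the work out is to give each job its own slice of the dataset through the START_IDX and END_IDX variables used in the YAML configuration. The sketch below shows how such slices could be computed; the function name is hypothetical, and treating END_IDX as exclusive is an assumption rather than something the configuration states.

```python
def partition_ranges(total, num_jobs):
    """Split [0, total) into num_jobs contiguous (start, end) ranges of near-equal size.

    `end` is exclusive. The first `total % num_jobs` ranges get one extra item.
    """
    if num_jobs <= 0:
        raise ValueError("num_jobs must be positive")
    base, extra = divmod(total, num_jobs)
    ranges = []
    start = 0
    for job in range(num_jobs):
        size = base + (1 if job < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

# Small illustrative example: 10 documents across 3 jobs.
ranges = partition_ranges(total=10, num_jobs=3)
```

Each `(start, end)` pair would be passed to one job as its START_IDX and END_IDX, so that the slices cover the whole dataset without overlapping.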

Results: We no longer have to wait days to generate embeddings. Our jobs scale dynamically, which lets us move quickly from prototype to production, even with large datasets.

Conclusion

Building a RAG system with DeepSeek R1 taught us a few key lessons:

  1. Use a specialized embedding model such as Qwen2 for robust retrieval.
  2. Use R1's reasoning capabilities in the generation phase to handle complex legal queries.
  3. Prompt engineering remains key to controlling citations and structuring content.
  4. Use vLLM to speed up inference dramatically.
  5. Scale AI computing in the cloud and on your own infrastructure to reduce compute time and cost.

Want to know more?

Attention: The RAG system is a powerful tool for legal research and analysis, but is not a substitute for a qualified legal professional. Its output should not be regarded as legal advice.

May not be reproduced without permission:Chief AI Sharing Circle " DeepSeek R1 in RAG: Practical Experience Summary
