
DeepSeek-R1 Official Prompts and Parameter Configurations: Deploying the Open-Source 671B Model with DeepSeek's Official Performance

DeepSeek-R1 models are highly regarded for their superior reasoning capabilities. To help users obtain performance consistent with the official DeepSeek platform, the team has released a detailed deployment guide. In this article we read that guide in depth, focusing on the official prompt templates for the search and file-upload scenarios, as well as the measures that prevent the model from skipping its thinking step. Mastering and strictly following these official configurations is the key to reproducing DeepSeek-R1's official-grade performance. Whether you are a developer looking to deploy DeepSeek-R1 locally or a researcher digging deeper into model behavior, this article provides a practical reference to help you replicate the official DeepSeek-R1 experience.



The release of DeepSeek-R1 has attracted a lot of attention in the AI technology community, with many developers actively trying to deploy and apply this powerful reasoning model. To help users get an excellent experience, the DeepSeek team has released an official deployment guide. In this article, we will read the guide in depth, extract the core points, and analyze the model's features in detail, aiming to help readers fully understand the best practices for DeepSeek-R1 and master the key techniques of model performance optimization.

 

1. Technical analysis of the DeepSeek-R1 model

DeepSeek has introduced its first generation of reasoning models, consisting of DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero is a technical innovation trained exclusively via large-scale reinforcement learning (RL), breaking with the traditional paradigm of requiring supervised fine-tuning (SFT) as a preliminary step. This approach gives DeepSeek-R1-Zero strong reasoning capabilities, allowing it to excel at reasoning tasks and to naturally exhibit a number of compelling reasoning behaviors.

However, DeepSeek-R1-Zero is not perfect: in some cases it suffers from repetitive output, poor readability, and language mixing. To overcome these limitations and further improve reasoning performance, the DeepSeek team introduced DeepSeek-R1. Its main improvement over DeepSeek-R1-Zero is the incorporation of "cold-start data" prior to reinforcement learning. This approach markedly improves the model's performance on math, code, and complex reasoning tasks, making it comparable to models such as OpenAI-o1.

To give back to the research community, DeepSeek has generously open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models based on the Llama and Qwen architectures distilled from DeepSeek-R1. Notably, DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini on several benchmarks, setting a new performance standard for small dense models.

Special tip: before deploying and running the DeepSeek-R1 family of models locally, it is strongly recommended that users carefully read "2. Core Configuration Points" to ensure optimal use of the model and to replicate the official platform experience as closely as possible.

2. Core configuration elements: reproducing officially consistent results

The official DeepSeek team has provided the following core recommendations for deploying and using DeepSeek-R1, which are based on best practices for the official model parameter configuration. Strict adherence to these configurations is the key to reproducing, in a local environment, performance consistent with the official demo platform. Among them, the official search & file-upload prompt templates and the guidelines for preventing the model from skipping its thinking step are especially critical, and directly determine whether a locally deployed DeepSeek-R1 will match the official standard:

2.1 No system prompt:

The DeepSeek-R1 model is designed to work without system prompts. For consistency with the official platform and to obtain the desired model behavior, it is important to disable system prompts and include all instructions directly in the user prompt. A clear and concise question will help the model to accurately understand the user's intent, consistent with the official platform's handling of prompts.

2.2 Set the temperature parameter to 0.6 (Temperature: 0.6):

The Temperature parameter directly affects the randomness and creativity of the model output. The official recommendation is to set this parameter to 0.6, which is one of the key parameters for ensuring that the output style of locally deployed models is consistent with the official platform, striking the ideal balance between creativity and consistency in the output. Lower values will result in a more conservative and deterministic model output, while higher values will encourage the model to produce more varied and novel answers, but deviations from the official temperature setting may result in differences in response style between the local model and the official platform.
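The two settings above can be sketched as a request-payload builder. This is a minimal illustration, not an official client: the model identifier and payload shape are assumptions modeled on common OpenAI-compatible chat APIs, and only the two officially documented points matter here, namely no system message and temperature 0.6.

```python
# Sketch of a chat request that follows the official guidance:
# no system prompt (all instructions go in the user turn) and temperature 0.6.
# The model name "deepseek-r1" is a placeholder, not a confirmed identifier.

def build_request(user_instructions: str) -> dict:
    """Build a chat request payload consistent with the official DeepSeek-R1 guidance."""
    return {
        "model": "deepseek-r1",  # placeholder model identifier
        "temperature": 0.6,      # officially recommended value
        "messages": [
            # Deliberately no {"role": "system", ...} entry: every instruction
            # is placed directly in the user message, as the guide recommends.
            {"role": "user", "content": user_instructions},
        ],
    }

payload = build_request("Explain quicksort step by step.")
```

Whatever client library you use, the essential checks are the same: the `messages` list contains no system role, and `temperature` is pinned at 0.6.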

2.3 Guidelines to mitigate model bypass thinking:

To ensure that the DeepSeek-R1 model engages in sufficient reasoning when handling complex queries, it is strongly recommended to force the model to begin its output with <think>\n at the start of every response. This is not only an effective way to prevent the model from skipping the thinking step, but also a core configuration for reproducing the same depth of reasoning as the official platform; ignoring or misapplying it can cause local models to fall short of the official platform on complex reasoning tasks. The <think>\n prefix effectively guides the model into "think mode" and prevents it from producing results without adequate reasoning, i.e., it avoids the model "skipping the think step" by directly emitting an empty <think>\n\n</think> block.
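One simple way to apply this mitigation in a local pipeline is a small guard that prepends the thinking prefix whenever a completion does not already start with it. This is a sketch under the assumption that your serving stack lets you prefix or post-process completions; the function name is ours, not an official API.

```python
# Minimal sketch of the "force the <think> prefix" mitigation: guarantee that
# every model output opens with "<think>\n" so the reasoning block is never skipped.

THINK_PREFIX = "<think>\n"

def ensure_think_prefix(completion: str) -> str:
    """Prepend the <think> reasoning prefix if the completion lacks it."""
    if completion.startswith(THINK_PREFIX):
        return completion
    return THINK_PREFIX + completion
```

With inference servers that support assistant-prefix completion, the cleaner variant is to seed the assistant turn with "<think>\n" before generation rather than patching the output afterwards.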

2.4 Optimization for math problems

For math problems, in order to obtain accurate answers consistent with the official platform in a locally deployed environment, it is recommended to explicitly ask the model to "reason step-by-step" in the prompt, and to specify the format of the final answer in the prompt, e.g., "Please reason step-by-step and put the final answer in \boxed{} ". Clear instructions and formatting requirements help the model to better understand the problem type and adopt appropriate solution strategies, ensuring that the local model's ability to answer math problems is aligned with the official platform.
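The instruction above can be wired into a small prompt helper, together with a regex to pull the final answer back out of the \boxed{} marker. The helper names are ours, and the extraction regex only handles simple (non-nested) boxes; it is an illustration, not an official utility.

```python
import re

# The officially recommended math instruction, appended to every math prompt.
MATH_SUFFIX = "\nPlease reason step-by-step and put the final answer in \\boxed{}."

def build_math_prompt(problem: str) -> str:
    """Append the step-by-step / \\boxed{} instruction to a math problem."""
    return problem + MATH_SUFFIX

def extract_boxed(answer: str):
    """Return the content of the last \\boxed{...} in a model answer (simple cases only)."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", answer)
    return matches[-1] if matches else None
```

Keeping the instruction in a single constant makes it easy to verify that every math query sent to the local model carries exactly the official wording.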

2.5 Performance evaluation

In order to objectively compare the performance difference between the locally deployed DeepSeek-R1 and the official platform, it is recommended to conduct multiple tests and calculate the average of multiple test results to obtain more reliable performance evaluation data. While the results of a single test may be subject to chance, averaging the results of multiple tests can more accurately reflect the true level of the model and provide a scientific basis for users to evaluate whether the local deployment has successfully reproduced the official performance.
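The averaging procedure described above is straightforward to implement; a sketch follows. The function name and report shape are our own choices, assuming each benchmark run yields a single numeric score.

```python
from statistics import mean, stdev

def summarize_runs(scores):
    """Average repeated benchmark runs to smooth out single-run variance."""
    return {
        "mean": mean(scores),                              # headline comparison number
        "stdev": stdev(scores) if len(scores) > 1 else 0.0, # spread across runs
        "runs": len(scores),
    }

# e.g., four accuracy scores from four independent test passes
report = summarize_runs([0.82, 0.85, 0.79, 0.84])
```

Comparing the local mean (and its spread) against the official platform's mean over the same number of runs gives a far more reliable verdict than any single run.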

2.6 Official prompts for search & file upload

The official DeepSeek deployment uses the same DeepSeek-R1 model as the open-source release. To ensure that a locally deployed DeepSeek-R1 delivers the same user experience as the official service, and to maximize its performance in specific scenarios, DeepSeek has provided specially designed and tuned prompt templates for the two most common scenarios: file upload and web search. Adopting these official templates exactly as provided is the strongest guarantee that a locally deployed DeepSeek-R1 reproduces the official platform's behavior; any modification or adjustment to the templates may cause the local model to deviate from official performance on these tasks.

1. File upload scenario prompt template:

When uploading a file and asking the model questions based on its content, users must construct the prompt strictly according to the following official template. The {file_name}, {file_content}, and {question} placeholders stand for the name of the uploaded file, its content, and the user's question, respectively:

file_template = \
"""[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""
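Filling the template is a plain `str.format` call. The file name, content, and question below are made-up example values:

```python
# The official file-upload template, verbatim from the deployment guide.
file_template = """[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""

# Example values (hypothetical), substituted into the three placeholders.
prompt = file_template.format(
    file_name="report.txt",
    file_content="Q3 revenue grew 12% year over year.",
    question="Summarize the key figure in this file.",
)
```

The resulting string is sent as the entire user message, with no system prompt, per the configuration points above.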

2. Web search scenario prompt template (Web search):

When a user's question needs to be answered using web search results, be sure to use the following official web-search prompt template. The template contains three key parameters: {search_results} (search results), {cur_date} (current date), and {question} (the user's question).

DeepSeek provides optimized templates for Chinese and English queries:

  • Chinese search template (search_answer_zh_template):
search_answer_zh_template = \
'''# The following are search results based on the message sent by the user:
{search_results}
In the search results I give you, each result is in [webpage X begin]...[webpage X end] format, where X represents the numerical index of each article. Please cite the context at the end of the relevant sentence where appropriate, using the citation format [citation:X] in the corresponding part of your answer. If a sentence derives from more than one context, list all the relevant citation numbers, e.g. [citation:3][citation:5]; remember not to concentrate the citations at the end, but to place them in the corresponding parts of the answer.
When answering, please note the following:
- Today is {cur_date}.
- Not all the content in the search results is closely related to the user's question; you need to screen and filter the search results in light of the question.
- For enumerated questions (e.g., listing all flight information), try to keep the answer to 10 bullet points or fewer, and tell the user that they can check the search sources for complete information. Give priority to items that are complete and most relevant; unless necessary, do not proactively point out what the search results did not provide.
- For creative writing questions (e.g., writing a paper), be sure to cite the corresponding reference numbers, e.g. [citation:3][citation:5], within the body paragraphs, not just at the end of the piece. You need to interpret and summarize the user's question, choose an appropriate format, make full use of the search results, and extract important information to generate an answer that is thoughtful, creative, and professional. Your answer should be as long as possible; for each point, infer the user's intent, cover as many angles as possible, and be informative and thorough.
- If the answer is long, try to structure it and summarize it in paragraphs. If you need to divide your answer into points, try to limit it to 5 points or less and merge related content.
- For objective Q&A, if the answer to the question is very short, you can add one or two sentences of relevant information as appropriate to enrich the content.
- You need to choose an appropriate and aesthetically pleasing answer format according to the user requirements and the content of the answer to ensure readability.
- Your answer should synthesize multiple relevant web pages and should not refer to a single web page repeatedly.
- Unless requested by the user, the language of your answer needs to be consistent with the language of the user's question.
# The user message is:
{question}'''
  • English query template (search_answer_en_template):
search_answer_en_template = \
'''# The following contents are the search results related to the user's message:
{search_results}
In the search results I provide to you, each result is formatted as [webpage X begin]...[webpage X end], where X represents the numerical index of each article. Please cite the context at the end of the relevant sentence when appropriate. Use the citation format [citation:X] in the corresponding part of your answer. If a sentence is derived from multiple contexts, list all relevant citation numbers, such as [citation:3][citation:5]. Be sure not to cluster all citations at the end; instead, include them in the corresponding parts of the answer.
When responding, please keep the following points in mind:
- Today is {cur_date}.
- Not all content in the search results is closely related to the user's question. You need to evaluate and filter the search results based on the question.
- For listing-type questions (e.g., listing all flight information), try to limit the answer to 10 key points and inform the user that they can refer to the search sources for complete information. Prioritize providing the most complete and relevant items in the list. Avoid mentioning content not provided in the search results unless necessary.
- For creative tasks (e.g., writing an essay), ensure that references are cited within the body of the text, such as [citation:3][citation:5], rather than only at the end of the text. You need to interpret and summarize the user's requirements, choose an appropriate format, fully utilize the search results, extract key information, and generate an answer that is insightful, creative, and professional. Extend the length of your response as much as possible, addressing each point in detail and from multiple perspectives, ensuring the content is rich and thorough.
- If the response is lengthy, structure it well and summarize it in paragraphs. If a point-by-point format is needed, try to limit it to 5 points and merge related content.
- For objective Q&A, if the answer is very brief, you may add one or two related sentences to enrich the content.
- Choose an appropriate and visually appealing format for your response based on the user's requirements and the content of the answer, ensuring strong readability.
- Your answer should synthesize information from multiple relevant webpages and avoid repeatedly citing the same webpage.
- Unless the user requests otherwise, your response should be in the same language as the user's question.
# The user's message is:
{question}'''
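Assembling a search prompt is again a `str.format` call, plus formatting each retrieved page with the [webpage X begin]...[webpage X end] markers the template expects. The sketch below uses a shortened stand-in string for the template and made-up search snippets; a real deployment would paste the complete official template shown above.

```python
from datetime import date

def format_search_results(pages):
    """Wrap each page's text in the [webpage X begin]...[webpage X end] markers."""
    return "\n".join(
        f"[webpage {i} begin]\n{text}\n[webpage {i} end]"
        for i, text in enumerate(pages, start=1)
    )

# Shortened stand-in for the full search_answer_en_template above (illustration only).
search_answer_en_template = (
    "{search_results}\nToday is {cur_date}.\n# The user's message is:\n{question}"
)

prompt = search_answer_en_template.format(
    search_results=format_search_results(
        ["DeepSeek-R1 is an open-source reasoning model...", "Benchmark results show..."]
    ),
    cur_date=date(2025, 2, 1).isoformat(),
    question="What is DeepSeek-R1?",
)
```

Numbering the pages consistently matters, since the template instructs the model to cite sources as [citation:X] against exactly these indices.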

Additional guidelines to guarantee official consistency:

In addition to strictly following the official prompt templates and the <think>\n directive described above, the following guidelines will help users approximate the official platform's performance as closely as possible in a local deployment, ensuring that the local DeepSeek-R1 runs "as intended":

  • Math problems: Consistent with the previous section, for math problems, it is again important to explicitly ask for "step-by-step reasoning" from the model in the prompt, and to mark the final answer using the official format, e.g. "Please reason step-by-step and put the final answer in \boxed{}". Be sure to follow all official details on the handling of math problems to ensure that the local model is fully consistent with the official platform in terms of mathematical power.
  • Performance evaluation: In order to accurately evaluate whether the locally deployed DeepSeek-R1 successfully reproduces the performance of the official platform, it is recommended to conduct multiple tests and calculate the average of the results. The statistical average can effectively reduce the chance and error of single test results, thus providing a more scientific and reliable basis for users to judge whether the local deployment is successful or not, as well as to carry out fine tuning. The rigor of performance evaluation is directly related to the effectiveness of the local deployment program.

 

Summary

Strictly following all of DeepSeek's official configuration guidelines, in particular the official prompt templates and the <think>\n thinking directive, is the fundamental guarantee for reproducing the official DeepSeek-R1 platform's excellent performance in a local environment, and the only way to get the "original" DeepSeek-R1 experience. By understanding DeepSeek-R1's model architecture, training methodology, and behavior, and by applying the official recommendations to every aspect of your local deployment, you can maximize the consistency between your local model and the official platform. Start putting these official guidelines into practice and replicate the official DeepSeek-R1 experience in your own environment!

May not be reproduced without permission: Chief AI Sharing Circle, "DeepSeek-R1 Official Prompts and Parameter Configurations: Deploying the Open-Source 671B Model with DeepSeek's Official Performance"
