
DeepSeek-R1 Official Prompts and Parameter Configurations: Deploying the Open-Source 671B Model with DeepSeek's Official Performance

DeepSeek-R1 models are highly regarded for their superior reasoning capabilities. To help users obtain performance consistent with the official DeepSeek platform, the team has released a detailed deployment guide. In this article we read that guide in depth, focusing on the official prompt templates for the search and file-upload scenarios, as well as the measures that prevent the model from skipping its thinking step. Mastering and strictly following these official configurations is the key to reproducing DeepSeek-R1's official-grade performance. Whether you are a developer looking to deploy DeepSeek-R1 locally or a researcher digging deeper into model behavior, this article provides a practical reference to help you replicate the official DeepSeek-R1 experience.



The release of DeepSeek-R1 has attracted a lot of attention in the AI technology community, with many developers actively trying to deploy and apply this powerful reasoning model. To help users get an excellent experience, the DeepSeek team has released an official deployment guide. In this article, we will read the guide in depth, extract the core points, and analyze the model's features in detail, aiming to help readers fully understand the best practices for DeepSeek-R1 and master the key techniques of model performance optimization.

 

1. Technical analysis of the DeepSeek-R1 model

DeepSeek has introduced its first generation of reasoning models, consisting of DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero is a technical innovation trained exclusively via large-scale reinforcement learning (RL), breaking with the traditional paradigm of requiring supervised fine-tuning (SFT) as a preliminary step. This approach gives DeepSeek-R1-Zero strong reasoning capabilities, allowing it to excel at reasoning tasks and to naturally exhibit a number of compelling reasoning behaviors.

However, DeepSeek-R1-Zero is not perfect: in some cases it suffers from repetitive output, poor readability, and language mixing. To overcome these limitations and further improve reasoning performance, the DeepSeek team introduced DeepSeek-R1. Its main improvement over DeepSeek-R1-Zero is the incorporation of "cold-start data" prior to reinforcement learning. This approach markedly improves the model's performance on math, code, and complex reasoning tasks, making it comparable to models such as OpenAI-o1.

To give back to the research community, DeepSeek has generously open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models based on the Llama and Qwen architectures distilled from DeepSeek-R1. Notably, DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini on several benchmarks, setting a new performance standard for small dense models.

Special tip: before deploying and running the DeepSeek-R1 family of models locally, it is strongly recommended that users carefully read "2. Core Configuration Points" to ensure optimal use of the model and to replicate the official platform experience as closely as possible.

2. Core configuration elements: reproducing officially consistent results

The official DeepSeek team has provided the following core recommendations for deploying and using DeepSeek-R1, which are based on best practices for the official model parameter configuration. Strict adherence to these configurations is the key to reproducing, in a local environment, performance consistent with the official demo platform. Among them, the official search & file-upload prompt templates and the guidelines for preventing the model from skipping its thinking step are especially critical, and directly determine whether a locally deployed DeepSeek-R1 will match the official standard:

2.1 No system prompt:

The DeepSeek-R1 model is designed to work without system prompts. For consistency with the official platform and to obtain the desired model behavior, it is important to disable system prompts and include all instructions directly in the user prompt. A clear and concise question will help the model to accurately understand the user's intent, consistent with the official platform's handling of prompts.

2.2 Set the temperature parameter to 0.6 (Temperature: 0.6):

The Temperature parameter directly affects the randomness and creativity of the model output. The official recommendation is to set this parameter to 0.6, which is one of the key parameters for ensuring that the output style of locally deployed models is consistent with the official platform, striking the ideal balance between creativity and consistency in the output. Lower values will result in a more conservative and deterministic model output, while higher values will encourage the model to produce more varied and novel answers, but deviations from the official temperature setting may result in differences in response style between the local model and the official platform.
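The two settings above can be sketched as a request-payload builder. This is a minimal illustration, not an official client: the model identifier and payload shape are assumptions modeled on common OpenAI-compatible chat APIs, and only the two officially documented points matter here, namely no system message and temperature 0.6.

```python
# Sketch of a chat request that follows the official guidance:
# no system prompt (all instructions go in the user turn) and temperature 0.6.
# The model name "deepseek-r1" is a placeholder, not a confirmed identifier.

def build_request(user_instructions: str) -> dict:
    """Build a chat request payload consistent with the official DeepSeek-R1 guidance."""
    return {
        "model": "deepseek-r1",  # placeholder model identifier
        "temperature": 0.6,      # officially recommended value
        "messages": [
            # Deliberately no {"role": "system", ...} entry: every instruction
            # is placed directly in the user message, as the guide recommends.
            {"role": "user", "content": user_instructions},
        ],
    }

payload = build_request("Explain quicksort step by step.")
```

Whatever client library you use, the essential checks are the same: the `messages` list contains no system role, and `temperature` is pinned at 0.6.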

2.3 Guidelines to mitigate model bypass thinking:

To ensure that the DeepSeek-R1 model engages in sufficient reasoning when handling complex queries, it is strongly recommended to force the model to begin its output with <think>\n at the start of every response. This is not only an effective way to prevent the model from skipping the thinking step, but also a core configuration for reproducing the same depth of reasoning as the official platform; ignoring or misapplying it can cause local models to fall short of the official platform on complex reasoning tasks. The <think>\n prefix effectively guides the model into "think mode" and prevents it from producing results without adequate reasoning, i.e., it avoids the model "skipping the think step" by directly emitting an empty <think>\n\n</think> block.
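One simple way to apply this mitigation in a local pipeline is a small guard that prepends the thinking prefix whenever a completion does not already start with it. This is a sketch under the assumption that your serving stack lets you prefix or post-process completions; the function name is ours, not an official API.

```python
# Minimal sketch of the "force the <think> prefix" mitigation: guarantee that
# every model output opens with "<think>\n" so the reasoning block is never skipped.

THINK_PREFIX = "<think>\n"

def ensure_think_prefix(completion: str) -> str:
    """Prepend the <think> reasoning prefix if the completion lacks it."""
    if completion.startswith(THINK_PREFIX):
        return completion
    return THINK_PREFIX + completion
```

With inference servers that support assistant-prefix completion, the cleaner variant is to seed the assistant turn with "<think>\n" before generation rather than patching the output afterwards.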

2.4 Optimization for math problems

For math problems, in order to obtain accurate answers consistent with the official platform in a locally deployed environment, it is recommended to explicitly ask the model to "reason step-by-step" in the prompt, and to specify the format of the final answer in the prompt, e.g., "Please reason step-by-step and put the final answer in \boxed{} ". Clear instructions and formatting requirements help the model to better understand the problem type and adopt appropriate solution strategies, ensuring that the local model's ability to answer math problems is aligned with the official platform.
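The instruction above can be wired into a small prompt helper, together with a regex to pull the final answer back out of the \boxed{} marker. The helper names are ours, and the extraction regex only handles simple (non-nested) boxes; it is an illustration, not an official utility.

```python
import re

# The officially recommended math instruction, appended to every math prompt.
MATH_SUFFIX = "\nPlease reason step-by-step and put the final answer in \\boxed{}."

def build_math_prompt(problem: str) -> str:
    """Append the step-by-step / \\boxed{} instruction to a math problem."""
    return problem + MATH_SUFFIX

def extract_boxed(answer: str):
    """Return the content of the last \\boxed{...} in a model answer (simple cases only)."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", answer)
    return matches[-1] if matches else None
```

Keeping the instruction in a single constant makes it easy to verify that every math query sent to the local model carries exactly the official wording.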

2.5 Performance evaluation

In order to objectively compare the performance difference between the locally deployed DeepSeek-R1 and the official platform, it is recommended to conduct multiple tests and calculate the average of multiple test results to obtain more reliable performance evaluation data. While the results of a single test may be subject to chance, averaging the results of multiple tests can more accurately reflect the true level of the model and provide a scientific basis for users to evaluate whether the local deployment has successfully reproduced the official performance.
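The averaging procedure described above is straightforward to implement; a sketch follows. The function name and report shape are our own choices, assuming each benchmark run yields a single numeric score.

```python
from statistics import mean, stdev

def summarize_runs(scores):
    """Average repeated benchmark runs to smooth out single-run variance."""
    return {
        "mean": mean(scores),                              # headline comparison number
        "stdev": stdev(scores) if len(scores) > 1 else 0.0, # spread across runs
        "runs": len(scores),
    }

# e.g., four accuracy scores from four independent test passes
report = summarize_runs([0.82, 0.85, 0.79, 0.84])
```

Comparing the local mean (and its spread) against the official platform's mean over the same number of runs gives a far more reliable verdict than any single run.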

2.6 Official prompts for search & file upload

The official DeepSeek deployment uses the same DeepSeek-R1 model as the open-source release. To ensure that a locally deployed DeepSeek-R1 delivers the same user experience as the official service, and to maximize its performance in specific scenarios, DeepSeek has provided specially designed and tuned prompt templates for the two most common scenarios: file upload and web search. Adopting these official templates exactly as provided is the strongest guarantee that a locally deployed DeepSeek-R1 reproduces the official platform's behavior; any modification or adjustment to the templates may cause the local model to deviate from official performance on these tasks.

1. File upload scenario prompt template:

When uploading a file and asking the model questions based on its content, users must construct the prompt strictly according to the following official template. The {file_name}, {file_content}, and {question} placeholders stand for the name of the uploaded file, its content, and the user's question, respectively:

file_template = \
"""[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""
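Filling the template is a plain `str.format` call. The file name, content, and question below are made-up example values:

```python
# The official file-upload template, verbatim from the deployment guide.
file_template = """[file name]: {file_name}
[file content begin]
{file_content}
[file content end]
{question}"""

# Example values (hypothetical), substituted into the three placeholders.
prompt = file_template.format(
    file_name="report.txt",
    file_content="Q3 revenue grew 12% year over year.",
    question="Summarize the key figure in this file.",
)
```

The resulting string is sent as the entire user message, with no system prompt, per the configuration points above.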

2. Web search scenario prompt template (Web search):

When a user's question needs to be answered using web search results, be sure to use the following official web-search prompt template. The template contains three key parameters: {search_results} (search results), {cur_date} (current date), and {question} (the user's question).

DeepSeek provides optimized templates for Chinese and English queries:

  • Chinese search template (search_answer_zh_template):
search_answer_zh_template = \
'''# The following are search results based on the message sent by the user:
{search_results}
In the search results I give you, each result is in [webpage X begin]...[webpage X end] format, where X represents the numerical index of each article. Please cite the context at the end of the relevant sentence where appropriate, using the citation format [citation:X] in the corresponding part of your answer. If a sentence derives from more than one context, list all the relevant citation numbers, e.g. [citation:3][citation:5]; remember not to concentrate the citations at the end, but to place them in the corresponding parts of the answer.
When answering, please note the following:
- Today is {cur_date}.
- Not all the content in the search results is closely related to the user's question; you need to screen and filter the search results in light of the question.
- For enumerated questions (e.g., listing all flight information), try to keep the answer to 10 bullet points or fewer, and tell the user that they can check the search sources for complete information. Give priority to items that are complete and most relevant; unless necessary, do not proactively point out what the search results did not provide.
- For creative writing questions (e.g., writing a paper), be sure to cite the corresponding reference numbers, e.g. [citation:3][citation:5], within the body paragraphs, not just at the end of the piece. You need to interpret and summarize the user's question, choose an appropriate format, make full use of the search results, and extract important information to generate an answer that is thoughtful, creative, and professional. Your answer should be as long as possible; for each point, infer the user's intent, cover as many angles as possible, and be informative and thorough.
- If the answer is long, try to structure it and summarize it in paragraphs. If you need to divide your answer into points, try to limit it to 5 points or less and merge related content.
- For objective Q&A, if the answer to the question is very short, you can add one or two sentences of relevant information as appropriate to enrich the content.
- You need to choose an appropriate and aesthetically pleasing answer format according to the user requirements and the content of the answer to ensure readability.
- Your answer should synthesize multiple relevant web pages and should not refer to a single web page repeatedly.
- Unless requested by the user, the language of your answer needs to be consistent with the language of the user's question.
# The user message is:
{question}'''
  • English query template (search_answer_en_template):
search_answer_en_template = \
'''# The following contents are the search results related to the user's message:
{search_results}
In the search results I provide to you, each result is formatted as [webpage X begin]...[webpage X end], where X represents the numerical index of each article. Please cite the context at the end of the relevant sentence when appropriate. Use the citation format [citation:X] in the corresponding part of your answer. If a sentence is derived from multiple contexts, list all relevant citation numbers, such as [citation:3][citation:5]. Be sure not to cluster all citations at the end; instead, include them in the corresponding parts of the answer.
When responding, please keep the following points in mind:
- Today is {cur_date}.
- Not all content in the search results is closely related to the user's question. You need to evaluate and filter the search results based on the question.
- For listing-type questions (e.g., listing all flight information), try to limit the answer to 10 key points and inform the user that they can refer to the search sources for complete information. Prioritize providing the most complete and relevant items in the list. Avoid mentioning content not provided in the search results unless necessary.
- For creative tasks (e.g., writing an essay), ensure that references are cited within the body of the text, such as [citation:3][citation:5], rather than only at the end of the text. You need to interpret and summarize the user's requirements, choose an appropriate format, fully utilize the search results, extract key information, and generate an answer that is insightful, creative, and professional. Extend the length of your response as much as possible, addressing each point in detail and from multiple perspectives, ensuring the content is rich and thorough.
- If the response is lengthy, structure it well and summarize it in paragraphs. If a point-by-point format is needed, try to limit it to 5 points and merge related content.
- For objective Q&A, if the answer is very brief, you may add one or two related sentences to enrich the content.
- Choose an appropriate and visually appealing format for your response based on the user's requirements and the content of the answer, ensuring strong readability.
- Your answer should synthesize information from multiple relevant webpages and avoid repeatedly citing the same webpage.
- Unless the user requests otherwise, your response should be in the same language as the user's question.
# The user's message is:
{question}'''
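Assembling a search prompt is again a `str.format` call, plus formatting each retrieved page with the [webpage X begin]...[webpage X end] markers the template expects. The sketch below uses a shortened stand-in string for the template and made-up search snippets; a real deployment would paste the complete official template shown above.

```python
from datetime import date

def format_search_results(pages):
    """Wrap each page's text in the [webpage X begin]...[webpage X end] markers."""
    return "\n".join(
        f"[webpage {i} begin]\n{text}\n[webpage {i} end]"
        for i, text in enumerate(pages, start=1)
    )

# Shortened stand-in for the full search_answer_en_template above (illustration only).
search_answer_en_template = (
    "{search_results}\nToday is {cur_date}.\n# The user's message is:\n{question}"
)

prompt = search_answer_en_template.format(
    search_results=format_search_results(
        ["DeepSeek-R1 is an open-source reasoning model...", "Benchmark results show..."]
    ),
    cur_date=date(2025, 2, 1).isoformat(),
    question="What is DeepSeek-R1?",
)
```

Numbering the pages consistently matters, since the template instructs the model to cite sources as [citation:X] against exactly these indices.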

Additional guidelines to guarantee official consistency:

In addition to strictly following the official prompt templates and the <think>\n directive described above, the following guidelines will help users approximate the official platform's performance as closely as possible in a local deployment, ensuring that the local DeepSeek-R1 runs "as intended":

  • Math problems: Consistent with the previous section, for math problems, it is again important to explicitly ask for "step-by-step reasoning" from the model in the prompt, and to mark the final answer using the official format, e.g. "Please reason step-by-step and put the final answer in \boxed{}". Be sure to follow all official details on the handling of math problems to ensure that the local model is fully consistent with the official platform in terms of mathematical power.
  • Performance evaluation: In order to accurately evaluate whether the locally deployed DeepSeek-R1 successfully reproduces the performance of the official platform, it is recommended to conduct multiple tests and calculate the average of the results. The statistical average can effectively reduce the chance and error of single test results, thus providing a more scientific and reliable basis for users to judge whether the local deployment is successful or not, as well as to carry out fine tuning. The rigor of performance evaluation is directly related to the effectiveness of the local deployment program.

 

Summary

Strictly following all of DeepSeek's official configuration guidelines, in particular the official prompt templates and the <think>\n thinking directive, is the fundamental guarantee for reproducing the official DeepSeek-R1 platform's excellent performance in a local environment, and the only way to get the "original" DeepSeek-R1 experience. By understanding DeepSeek-R1's model architecture, training methodology, and behavior, and by applying the official recommendations to every aspect of your local deployment, you can maximize the consistency between your local model and the official platform. Start putting these official guidelines into practice and replicate the official DeepSeek-R1 experience in your own environment!

May not be reproduced without permission: Chief AI Sharing Circle, "DeepSeek-R1 Official Prompts and Parameter Configurations: Deploying the Open-Source 671B Model with DeepSeek's Official Performance"
