A tutorial to help ordinary people understand and use DeepSeek-R1 correctly

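Compared with other large models, DeepSeek-R1 is nothing special. What impresses you is being able to see its thinking process, or its excellent Chinese writing. If you have used ChatGPT and found it dull, the excitement DeepSeek-R1 gives you may well be an illusion. And if your days are filled with looking after children or delivering takeout, there is no need to follow DeepSeek at all; you would gain nothing except lost time.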

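Background

Some important background on DeepSeek-R1. If you don't care for gossip, skip this part. DeepSeek-R1 comes from DeepSeek, a company backed by the well-known quantitative investment firm High-Flyer (幻方量化). The company's full name is 杭州深度求索人工智能基础技术研究有限公司 (Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.), and its founder is Liang Wenfeng (梁文锋).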

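DeepSeek had already released a reasoning model, "DeepSeek-R1-Lite", on November 20, 2024; it could be used by turning on "DeepThink" (深度思考) in the user interface. DeepSeek-R1-Lite was trained on a smaller base model and drew little attention, whereas DeepSeek-R1 was trained on the larger base model "DeepSeek-V3-Base" and is much more capable overall. So DeepSeek-R1-Lite was effectively a preview of DeepSeek-R1, two months early. Why the rush to put it in front of users...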

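On January 20, 2025, Liang Wenfeng attended and spoke at an important "meeting", and the DeepSeek-R1 technical report was released the same day. As with the sudden popularity of Ne Zha, who knows what story lies behind the coincidence...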

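DeepSeek-R1 first took off in technical circles, because DeepSeek published a technical report telling everyone how to reproduce "o1" at low cost (a GPU cost of 5.576 million US dollars; opinions differ on this figure, and it is the training cost reported for the DeepSeek-V3 base model rather than for R1 itself). Its serving concurrency is also high, so a very large "reasoning model" can run on fewer hardware resources. Put simply, it drove down the price of large models so that everyone can afford a better one.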

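The second wave came when news coverage of all kinds bombarded internet users, much like the Ne Zha craze...

The third wave came from self-media accounts riding the hype to "make money", which is why every group chat filled up with things like: 360's dedicated-line DeepSeek-R1, how to deploy DeepSeek-R1 locally, Tsinghua University teaches you to use DeepSeek-R1.

We are now in the final stage: everyone uses DeepSeek-R1. State-owned and central enterprises, and even district, county and subdistrict offices, have probably received notices about it.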

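Where to use DeepSeek-R1

I have asked many people, and some could not even find the DeepSeek-R1 website. Others find the official site slow and look for other services that offer the DeepSeek-R1 model, but what you know about those may well be wrong...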

Official channels

The official website is https://chat.deepseek.com/ . There is no PC desktop client. For the mobile app, search for "DeepSeek" in any major app store.

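Other channels

Many AI tools now integrate DeepSeek-R1, but their output quality varies. To spare you the choice, I recommend only one tool that comes close to the original:

Tencent Yuanbao (腾讯元宝): https://yuanbao.tencent.com/ . An app is also available; download it from the official site or search for "元宝" in any major app store.

When using the web version, remember to tick the following options: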

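Installing DeepSeek-R1 locally on a computer or phone

First, does your computer's GPU meet the minimum requirements for DeepSeek-R1? If you have no idea what a GPU is, don't consider a local install.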

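Installing DeepSeek-R1 on a computer

I recommend installing with Ollama. Its website is https://ollama.com/ , and the installable models are listed here: https://ollama.com/search?q=deepseek-r1 . If you need a detailed tutorial to get through the install, local installation is not recommended for you.

For example, a 3060 graphics card can just about run the 14B model (the official distilled version). You can copy the following command to install it: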
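The command itself is not preserved in this copy of the article. As a rough sketch, assuming the 14B distilled model is published under the tag `deepseek-r1:14b` on the Ollama search page linked above, the Python helper below only assembles the `ollama run` command line as text. The function name `ollama_run_command` is hypothetical, and the tag should be checked against that page before use.

```python
import shlex


def ollama_run_command(model: str = "deepseek-r1", size: str = "14b") -> str:
    """Return the shell command that downloads (if needed) and starts a model.

    The "model:size" tag format is an assumption; verify it on the Ollama page.
    """
    tag = f"{model}:{size}"
    # shlex.join quotes each argument safely for a POSIX shell.
    return shlex.join(["ollama", "run", tag])


# Print the command so it can be pasted into a terminal.
print(ollama_run_command())
```

Pasting the printed line into a terminal where Ollama is installed should download the model on first use and then open an interactive chat.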

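If your computer is "high-spec" and you still want a local deployment, the following one-click local installer is recommended

Local installation takes some technical background. Here is a one-click local installer that bundles DeepSeek-R1 with a chat interface: "Pitfall guide: DeepSeek R1 installers resold for money on Taobao? Learn local deployment for free (one-click installer included)"

If your computer is "low-spec" and you still want your own deployment, the following cloud deployment option is recommended

"Private deployment of DeepSeek-R1 32B without a local GPU"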

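Installing DeepSeek-R1 on a phone

Key point! When you say "install DeepSeek-R1 on a phone", do you want to download the official app, or do you want the DeepSeek-R1 model to run locally on the phone? If you only want to use it on your phone and don't insist on running it locally, just search for "DeepSeek" or "腾讯元宝" in the app store and download; both work online. What follows covers only running the DeepSeek-R1 model locally on a phone.

The drawback of installing DeepSeek-R1 locally on a phone: the installed model has limited ability. It can only write simple copy and organize or summarize material.

If you decide to install: "Notes on installing the DeepSeek-R1 model locally on a phone, for high-end iOS and Android devices"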

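What DeepSeek-R1 is good for

DeepSeek-R1 is excellent and can do many things. Let's start by elimination and look at what DeepSeek-R1 is not suited for. I generated about 5,000 questions, put them to DeepSeek-R1 (the full version), and came away with some experience that I share here for reference: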

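1. Flawed questions produce hallucinated answers: R1 hallucinates more than ChatGPT. For ordinary people, asking a question correctly is hard in itself, so the answers you get are often riddled with hallucinations.

2. It is not suited to research tasks about timelines. The trouble lies in three places: (1) the knowledge a large model is trained on lags behind; (2) even in web-search mode, "DeepThink" retrieves web information in a single pass, the amount retrieved is limited, and so it cannot fully collect the information a timeline question needs; (3) the thinking is disturbed by too much context, see item 3.

3. DeepThink is easily disturbed by context. For the same question, turning on web search brings in a large amount of web content, the thinking process becomes confused, and the result gets worse. This is a serious problem.

4. How a clear instruction gets disturbed by "DeepThink": the thinking neglects the main instruction and pays more attention to the rest of the context, so it gradually drifts, and the wait for it to finish is too long.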

Let's see concretely how "thinking" wrecks a very simple task instruction with very little context. DeepSeek is on the left, ChatGPT on the right.

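5. If you only want to search for information, Google or Baidu may give better results, and if you want to get more out of the search results, a search engine is more efficient, especially when there is a lot of information. R1 will not analyze a large number of web pages for you: the amount of information it searches is limited, the amount it can remember is limited, and its ability to judge whether search results are right or wrong is more limited still. It feeds a pile of search results into its reasoning all at once and then hands you an answer.

In more complex scenarios, such as writing a paper and collecting and organizing reference material, you need several rounds of information gathering and several rounds of reasoning (the same logic applies when a person does it). R1 does only one round of gathering and reasoning, so it cannot solve a complex, systematic problem in one go. Once you know this, you can try organizing the material by hand, and only after it is consolidated hand it to R1 for analysis.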
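The "collect by hand first, then hand over" workflow can be sketched in Python. This is only an illustration under assumptions: the function name `build_analysis_prompt`, the note format, and the wording of the instruction line are all hypothetical, and the resulting text is simply pasted into the chat box.

```python
def build_analysis_prompt(question: str, notes: list[str]) -> str:
    """Merge hand-collected notes and one question into a single prompt."""
    # Drop empty notes and number the rest so the answer can refer to them.
    cleaned = [n.strip() for n in notes if n.strip()]
    numbered = [f"[{i}] {note}" for i, note in enumerate(cleaned, start=1)]
    return "\n".join([
        "Answer using only the material below.",
        *numbered,
        f"Question: {question}",
    ])


notes = [
    "2024-11-20: DeepSeek-R1-Lite released",
    "   ",  # an empty note, removed before numbering
    "2025-01-20: DeepSeek-R1 technical report released",
]
print(build_analysis_prompt("Summarize the release timeline.", notes))
```

Keeping the notes short and already filtered is the point: the model gets one compact context instead of a pile of raw web pages.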

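Using DeepSeek-R1 correctly

Note: even without DeepSeek-R1, if you add "Let's think step by step" in front of your question, other models will still show you a detailed thinking process. The reasoning may be less detailed, and the final answer may not be as good as DeepSeek-R1's, though.

In fact, the techniques for using DeepSeek-R1 differ little from those for other models. There are only a few details to watch.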

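1. If your question is complex, or you are unhappy with the answer after turning on search, try turning off "DeepThink". With it off you are using the V3 model, which is still very good.

2. Use simple instructions; "DeepThink" will do the thinking for you

Correct: Translate this for me

Wrong: Translate this into Chinese, use wording that fits the habits of Chinese readers, keep important nouns in the original language, and pay attention to layout.

More wrong examples:
1. I have a very important market research report here. It is long and dense with information. I hope you will read it carefully, think deeply, and then analyze it: what are the most important market trends in this report? Ideally list the three most important trends and explain why you consider these three the most important.
2. Here are some examples of disease diagnosis: [example 1], [example 2]. Now, based on the following medical record, diagnose the disease the patient may have. [paste medical record]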

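3. Use complex instructions to activate "DeepThink" (not recommended unless you have some experience; complex instructions and overly long context both confuse the R1 model)

Correct: Translate this into Chinese, use wording that fits the habits of Chinese readers, keep important nouns in the original language, and pay attention to layout.

Wrong: Translate this for me

Note: treat the conflict between points 2 and 3 dialectically. Start with a simple instruction, and add conditions step by step only when the answer fails to meet a specific requirement.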
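The "start simple, then add conditions" advice can be sketched as a small Python generator. The function name `escalate` and the example constraints are hypothetical; the idea is only that each retry adds one more condition to the same base instruction.

```python
from typing import Iterator


def escalate(base: str, constraints: list[str]) -> Iterator[str]:
    """Yield the bare instruction first, then one more constraint per attempt."""
    yield base
    for i in range(1, len(constraints) + 1):
        # Each later attempt keeps all earlier constraints and adds the next.
        yield base + " " + " ".join(constraints[:i])


attempts = list(escalate(
    "Translate this into Chinese.",
    ["Keep important nouns in the original language.", "Pay attention to layout."],
))
for attempt in attempts:
    print(attempt)
```

In practice you would stop at the first attempt whose answer is good enough, rather than sending all of them.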

Test it yourself with the following text

# How does better chunking lead to high-quality responses?
If you’re reading this, I can assume you know what chunking and RAG are. Nonetheless, here is what it is, in short.
LLMs are trained on massive public datasets. Yet, they aren’t updated afterward. Therefore, LLMs don’t know anything after the pretraining cutoff date. Also, your use of LLM can be about your organization’s private data, which the LLM had no way of knowing.
Therefore, a beautiful solution called RAG has emerged. RAG asks the LLM to answer questions based on the context provided in the prompt itself. We even ask it not to answer even if the LLM knows the answer, but the provided context is insufficient.
How do we get the context? You can query your database and the Internet, skim several pages of a PDF report, or do anything else.
But there are two problems in RAGs.
* LLM’s context windows sizes are limited (Not anymore — I’ll get to this soon!)
* A large context window has a high signal-to-noise ratio.
First, early LLMs had limited window sizes. GPT 2, for instance, had only a 1024 token context window. GPT 3 came up with a 2048 token window. These are merely the size of a typical blog post.
Due to these limitations, the LLM prompt cannot include an organization’s entire knowledge base. Engineers were forced to reduce the size of their input to the LLM to get a good response.
However, various models with a context window of 128k tokens showed up. This is usually the size of an annual report for many listed companies. It is good enough to upload a document to a chatbot and ask questions.
But, it didn’t always perform as expected. That’s because of the noise in the context. A large document easily contains many unrelated information and the necessary pieces. This unrelated information drives the LLM to lose its objective or hallucinate.
This is why we chunk the documents. Instead of sending a large document to the LLM, we break it into smaller pieces and only send the most relevant pieces.
However, this is easier said than done.
There are a million possible ways to break a document into chunks. For instance, you may break the document paragraph by paragraph, and I may do it sentence by sentence. Both are valid methods, but one may work better than the other in specific circumstances.
However, we won’t discuss sentence and paragraph breaks, as they are trivial and have little use in chunking. Instead, we will discuss slightly more complex ones that break documents for RAGs.
In the rest of the post, I’ll discuss a few chunking strategies I’ve learned and applied.

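4. Prompt frameworks still work

Build a good prompting habit. You only need to supply four things: [role] [the action you want the model to perform] [task goal] [task background] (the background is optional)

For example: Act as an expert in official document writing and help me write a report for my on-site speech at the "Annual Outstanding Employee Conference". The speech should last about 5 minutes and sound sincere and modest. My company is "中国石油", my superior is "李富贵", my job is oil exploration, and I won the award for placing second in the employee vote.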
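The four-part framework above maps naturally onto a small helper. The Python sketch below is a hypothetical illustration (the name `build_prompt` and the "Act as ..." phrasing are assumptions, not a fixed format); it only shows that the background slot can be left out.

```python
from typing import Optional


def build_prompt(role: str, action: str, goal: str,
                 background: Optional[str] = None) -> str:
    """Assemble [role][action][goal][background]; background may be omitted."""
    parts = [f"Act as {role}.", action, goal]
    if background:
        # The task background is optional, so it is added only when given.
        parts.append(background)
    return " ".join(parts)


print(build_prompt(
    role="an expert in official document writing",
    action="Help me write a report for my speech at the annual outstanding employee conference.",
    goal="The speech should last about 5 minutes and sound sincere and modest.",
    background="My job is oil exploration and I placed second in the employee vote.",
))
```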

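5. Learn to let the model help you ask; good questions get good answers

Look back at the prompt example in point 4. Did you spot the problem?

The description is not detailed enough, so the report it writes cannot be used as is. For most people, the hard part of using a large model is not knowing how to ask, or not wanting to make the effort to fill in the question.

The fix is simple: before you try to construct a perfect question, first learn to consult the model and let it help you refine the question.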
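One way to "let the model help you ask" is to wrap a rough question in a request for clarifying questions. The Python sketch below is a hypothetical template (the function name `ask_model_to_refine` and the exact wording are assumptions); the output is meant to be pasted into the chat as the first message.

```python
def ask_model_to_refine(draft_question: str, max_questions: int = 5) -> str:
    """Wrap a rough question in a request for clarifying questions."""
    return (
        "I want to ask the question below, but it may be missing details.\n"
        f"Before answering, list up to {max_questions} things you need to know from me, "
        "then wait for my reply.\n"
        f"My question: {draft_question}"
    )


print(ask_model_to_refine("Help me write a speech for the annual outstanding employee conference."))
```

After the model lists what it needs, you answer those points and then ask for the final text in the same conversation.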

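6. Give your question a direction, or give R1's thinking a direction. This is a very old method and applies to more than R1

Normally the following methods are not needed with reasoning models like R1, but judge case by case: if your question is strongly directed, you can add some short context with simple logic to the question.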

Prompt_ID	Type	Trigger Sentence	Chinese translation
101	CoT	Let's think step by step.	我们一步一步地思考。
201	PS	Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan to solve the problem step by step.	首先，让我们理解问题并制定解决问题的计划。然后，让我们按计划一步一步地解决问题。
301	PS+	Let's first understand the problem, extract relevant variables and their corresponding numerals, and devise a plan. Then, let's carry out the plan, calculate intermediate variables (pay attention to correct numeral calculation and commonsense), solve the problem step by step, and show the answer.	首先，让我们理解问题，提取相关的变量和它们对应的数值，然后制定一个计划。接下来，执行计划，计算中间变量（注意正确的数字计算和常识），一步一步地解决问题，并显示答案。
302	PS+	Let's first understand the problem, extract relevant variables and their corresponding numerals, and devise a complete plan. Then, let's carry out the plan, calculate intermediate variables (pay attention to correct numerical calculation and commonsense), solve the problem step by step, and show the answer.	首先，让我们理解问题，提取相关变量及其对应的数值，并制定一个完整的计划。然后，执行计划，计算中间变量（注意正确的数值计算和常识），一步一步解决问题，并显示答案。
303	PS+	Let's devise a plan and solve the problem step by step.	让我们制定一个计划并一步一步地解决问题。
304	PS+	Let's first understand the problem and devise a complete plan. Then, let's carry out the plan and reason problem step by step. Every step answer the subquestion, "does the person flip and what is the coin's current state?". According to the coin's last state, give the final answer (pay attention to every flip and the coin’s turning state).	首先，让我们理解问题并制定一个完整的计划。然后，执行计划并逐步解决问题。每一步回答子问题，“人是否翻转，硬币当前状态是什么？”根据硬币的最后状态，给出最终答案（注意每次翻转和硬币的翻转状态）。
305	PS+	Let's first understand the problem, extract relevant variables and their corresponding numerals, and make a complete plan. Then, let's carry out the plan, calculate intermediate variables (pay attention to correct numerical calculation and commonsense), solve the problem step by step, and show the answer.	首先，让我们理解问题，提取相关变量及其对应的数值，并制定一个完整的计划。然后，执行计划，计算中间变量（注意正确的数值计算和常识），一步一步解决问题，并显示答案。
306	PS+	Let's first prepare relevant information and make a plan. Then, let's answer the question step by step (pay attention to commonsense and logical coherence).	首先，让我们准备相关信息并制定一个计划。然后，一步一步地回答问题（注意常识和逻辑连贯性）。
307	PS+	Let's first understand the problem, extract relevant variables and their corresponding numerals, and make and devise a complete plan. Then, let's carry out the plan, calculate intermediate variables (pay attention to correct numerical calculation and commonsense), solve the problem step by step, and show the answer.	首先，让我们理解问题，提取相关变量及其对应的数值，并制定一个完整的计划。然后，执行计划，计算中间变量（注意正确的数值计算和常识），一步一步解决问题，并显示答案。
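As a small illustration of how the trigger sentences in the table might be used, the Python sketch below stores three of them (copied verbatim from rows 101, 201 and 303) and attaches one to a question. The helper name `with_trigger` is hypothetical, and placing the trigger after the question is an assumption; you could just as well put it in front.

```python
# Trigger sentences copied from the table above, keyed by Prompt_ID.
TRIGGERS = {
    101: "Let's think step by step.",
    201: ("Let's first understand the problem and devise a plan to solve the problem. "
          "Then, let's carry out the plan to solve the problem step by step."),
    303: "Let's devise a plan and solve the problem step by step.",
}


def with_trigger(question: str, prompt_id: int) -> str:
    """Append the trigger sentence for prompt_id after the question."""
    if prompt_id not in TRIGGERS:
        raise KeyError(f"unknown Prompt_ID: {prompt_id}")
    return f"{question.strip()}\n{TRIGGERS[prompt_id]}"


print(with_trigger(
    "A coin is heads up. Two people each flip it once. Is it still heads up?",
    101,
))
```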