A Side-by-Side Review of Mainstream AI Deep Search Tools in the Market: DeepSeek R1 Outperforms

AI News | Published 9 months ago by AI Sharing Circle

Disclaimer: This review is unofficial and subjective; the results are for reference only.

Summary of contents

Summary: The official DeepSeek R1 + web search tool stands out as the top choice among the many AI deep search tools for its simplicity and ease of use.

If users expect detailed answers, traditional search engines such as Google are still a solid, market-tested choice with a superior search experience.
Flowith's Oracle mode performed surprisingly well and works in a way similar to ChatGPT O1. However, thanks to its search engine optimization, Flowith draws on both domestic and international information sources.
ChatGPT's Deep Search performed only moderately in this review, which does not match the high praise it has received from many well-known figures overseas. This may be because its handling of Chinese content still needs improvement. Given its high cost per search, it was tested only once in this review.

Subjective scoring results

No.	Tool name	Accuracy	Breadth	Depth	Length	Interaction	Export	Total score (out of 60)
1	Gemini 2.0 Flash Thinking	8	8	9	13	8	8	54
2	Beanbag	8	7	6	10	9	10	50
3	Flowith Oracle Mode	8	9	8	12	7	6	50
4	ChatGPT O1 Deep Search	8	8	9	9	8	7	49
5	DeepSeek Official Version	9	8	9	7	8	7	48
6	Genspark	6	7	6	11	7	8	45
7	Perplexity	7	6	6	8	7	9	43
8	Nano Search	7	7	7	6	6	6	39
9	HeartStream AI Assistant	7	8	6	3	7	7	38
10	Kimi 1.5 Long	7	6	7	4	7	6	37
11	Secret Tower Search	6	7	5	2	7	8	35
12	Tencent Yuanbao ima.copilot	4	6	3	5	5	8	33
13	Storm	2	3	2	1	2	2	12
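
As a quick sanity check, the totals can be recomputed from the six per-criterion scores. The sketch below simply transcribes the table; the names `SCORES`, `recompute_totals`, and `mismatches` are illustrative and not part of the original review. Twelve of the thirteen rows add up to the listed total. The ima.copilot row sums to 31 rather than the listed 33, which does not change the ranking.

```python
# Rows transcribed from the scoring table:
# (accuracy, breadth, depth, length, interaction, export, listed_total)
SCORES = {
    "Gemini 2.0 Flash Thinking": (8, 8, 9, 13, 8, 8, 54),
    "Beanbag": (8, 7, 6, 10, 9, 10, 50),
    "Flowith Oracle Mode": (8, 9, 8, 12, 7, 6, 50),
    "ChatGPT O1 Deep Search": (8, 8, 9, 9, 8, 7, 49),
    "DeepSeek Official Version": (9, 8, 9, 7, 8, 7, 48),
    "Genspark": (6, 7, 6, 11, 7, 8, 45),
    "Perplexity": (7, 6, 6, 8, 7, 9, 43),
    "Nano Search": (7, 7, 7, 6, 6, 6, 39),
    "HeartStream AI Assistant": (7, 8, 6, 3, 7, 7, 38),
    "Kimi 1.5 Long": (7, 6, 7, 4, 7, 6, 37),
    "Secret Tower Search": (6, 7, 5, 2, 7, 8, 35),
    "ima.copilot": (4, 6, 3, 5, 5, 8, 33),
    "Storm": (2, 3, 2, 1, 2, 2, 12),
}


def recompute_totals(scores):
    """Return {tool: (computed_total, listed_total)} for every row."""
    return {name: (sum(row[:6]), row[6]) for name, row in scores.items()}


def mismatches(scores):
    """Return only the rows whose six scores do not add up to the listed total."""
    return {
        name: pair
        for name, pair in recompute_totals(scores).items()
        if pair[0] != pair[1]
    }
```

Calling `mismatches(SCORES)` returns a dictionary containing only the rows whose listed total differs from the sum of the six scores.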

This review is somewhat subjective. Nonetheless, the reviewers have set up the following judgment criteria in an effort to evaluate the performance of each AI deep search tool from multiple dimensions.

Evaluation Criteria

[Accuracy]: Twelve tools took part in the evaluation (10 at the start). This criterion checks whether a tool can accurately identify and list 10 AI tools. A tool that fails to recognize the keyword "AI deep search" receives the lowest score. If no information sources are provided, 5 points are deducted by default.
[Breadth]: Assess whether the tool is able to fully cover the content requested by the reviewer, including key information such as product descriptions, technology paths, etc.
[Depth]: Depth is scored by the reviewer based on personal understanding, so this score is somewhat subjective and may reflect the reviewer's own cognitive bias.
[Length]: The main measure is the number of words in the text generated by the tool.
[Interaction]: Evaluate the interactive experience of the tool, e.g., whether it supports follow-up questions, price information, thresholds for use, etc.
[Export]: Evaluate the data export capabilities of the tool. Tools that only support exporting links or images are considered to have insufficient export capabilities. Ideally, a tool should at least support full-text copying and PDF export.
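
The review does not state how a word count is turned into a Length score. The published numbers are, however, consistent with a simple rank by word count (shortest report scores 1, longest scores 13). The sketch below illustrates that reading; it is an inference from the table and the per-platform word counts reported in this review, and the names `WORD_COUNTS` and `length_scores` are illustrative.

```python
# Total word counts as reported in the per-platform results of this review.
WORD_COUNTS = {
    "Gemini 2.0 Flash Thinking": 8690,
    "Flowith Oracle Mode": 5369,
    "Genspark": 3406,
    "Beanbag": 2918,
    "ChatGPT O1 Deep Search": 2865,
    "Perplexity": 1931,
    "DeepSeek Official Version": 1625,
    "Nano Search": 1606,
    "ima.copilot": 1417,
    "Kimi 1.5 Long": 1400,
    "HeartStream AI Assistant": 1399,
    "Secret Tower Search": 1259,
    "Storm": 733,
}


def length_scores(word_counts):
    """Rank tools by word count: the shortest gets 1, the longest gets len(word_counts)."""
    ordered = sorted(word_counts, key=word_counts.get)
    return {name: rank for rank, name in enumerate(ordered, start=1)}
```

Under this assumption, `length_scores(WORD_COUNTS)` reproduces the Length column of the scoring table for all thirteen tools.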

Prompt evolution

Initial Prompt

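AI search has been very popular lately. I would like you to focus on Deep Search,
its open-source versions, and the various AI search tools, and put together a detailed report
that covers at least product names, principles, and technical paths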

Optimized Prompt

To obtain a more structured and comprehensive report, the reviewer used the Claude-based prompt optimization feature provided by Flowith to refine the initial prompt into the following:

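<research_topic>
AI search technology (with a focus on Deep Search and its open-source versions)
</research_topic>
<report_structure>
You will create an in-depth analysis report on a specific AI technology topic. Organize the content according to the following structure:
Overview (2-3 paragraphs)
General introduction to the technology/topic
Key findings and their significance
Current state and future impact
Background analysis
Technology development background and current state of the industry
Strategic value of the technology
Scope of this report
Technical analysis
Core technical principles
Key technical components
Implementation paths and methodology
Architecture details (where available)
Market analysis
Major products and implementations
Technical approaches of key vendors
Comparison of solutions
Open-source alternatives
Future outlook
Potential directions of development
Existing challenges and limitations
Future research priorities
<format_requirements>
Use Markdown headings (# for main headings, ## for subheadings)
Maintain a professional written tone
Support technical claims with specific explanations
Compare and contrast the different approaches
Note where public information is missing
Clearly flag uncertain technical details
Focus on factual information and avoid speculation
Use standard technical terminology while remaining readable
<notes>
Ensure depth and completeness of analysis
Maintain an objective and neutral stance
Provide verifiable technical details
Clearly distinguish fact from speculation
Include real-world cases and products
State the current limits of knowledge
Place the complete report inside <report> tags, paying particular attention to the following:
Analyze the Deep Search technical architecture in depth
List at least 5 comparable AI search products
Indicate the license type of each open-source project
Technical paths must cover key techniques such as retrieval-augmented generation (RAG)
Include an analysis of supporting infrastructure such as vector databases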

When using ChatGPT's Deep Search, the tool asked the reviewer to answer a few questions to clarify the direction of the search, and the reviewer added further refinements to the prompts. However, since the supplemented prompts were long and contained links, they will not be repeated here.

Review results by platform

1. Beanbag

Total word count: 2918 words

Beanbag excels in engineering, and its overall performance is near-perfect, except for its content on Nano Search.

The exported document includes a table of contents, and the overall experience is smooth and comfortable. The product feels highly polished, in keeping with the breadth of its app product line.

Beanbag's weakness is that it does not yet have a large model of its own with a significant edge in intelligence. As a result, its content is presented in a novel form but lacks depth.

2. Nano Search

Total word count: 1606 words

360's Nano Search is a feature-integrated product. At first glance, its functional modules are fairly complete, and it comes with DeepSeek R1 built in. It gives a good account of OpenAI at the beginning of the article, but its introduction of AI deep search products is not comprehensive enough and the report is short. However, Nano Search highlights the features of each product well, and its ability to summarize is fair. In addition, Nano Search surfaced some search products that the reviewer had not known about, which may be enlightening even though they may not strictly be AI products.

However, Nano Search does not support follow-up questions, and its sharing function only supports links and images (not full text), which shows a clear commercial bent.

3. ima.copilot (Tencent Yuanbao)

Total word count: 1417 words

Tencent had earlier launched this tool, which combines search with a knowledge base. At that time, it was equipped with the Hunyuan model, whose intelligence was average, but its information sources were of high quality, drawn mainly from the WeChat Official Accounts platform. Now, with the addition of DeepSeek R1 deep search, its content quality has improved significantly.

The main advantage of ima.copilot is that users can conveniently add Official Account articles found in search to their personal knowledge base and run Q&A against that knowledge base, which is a practical feature. The Official Accounts platform is ima.copilot's unique resource advantage. With other similar products, users often need to click through to the Official Account link manually and then save the content.

However, compared with information on the open web, Official Account information lags somewhat in timeliness. In addition, because of the platform's strict review mechanism, the circulation of some emerging topics, and of external links in particular, is restricted, which sometimes biases the search results. When searching for information outside the Official Accounts platform, ima.copilot performs relatively poorly.

As a result, ima.copilot performed slightly below expectations in this review, and its search results were only weakly related to the review topic. In particular, under the theme of "AI Deep Search", much of the information it provided was still at the level of traditional search architecture.

ima.copilot remains a valuable tool for specific domains. However, it may need a more aggressive and differentiated development strategy when targeting the broader public domain.

In addition, ima.copilot only supports export by copy and paste.

4. HeartStream AI Assistant

Total word count: 1399 words

HeartStream AI Assistant is said to have originated at Alibaba. The product is fairly feature-rich.

For example, HeartStream AI Assistant provides a mind map at the beginning of each report and can generate NotebookLM-like podcasts in the form of a conversation between a male and a female voice, which is ideal for producing AI podcast content.

The number of AI products listed in the search results is small, but the product names are highly accurate. The comparisons in the table are not entirely accurate, but they hold up well against the other tools in this review.

Although the word count is low, the content generated by HeartStream AI Assistant is more varied and includes tables, pictures, and other elements, which makes it look richer. However, some of the illustrations are only weakly related to the topic, and the topic itself is not clearly brought out.

The thought process of HeartStream AI Assistant is well presented, and the sources of information are well labeled.

Its main problem is that sharing and exporting are not convenient enough, and the formatting breaks when content containing images is copied.

5. ChatGPT Deep Search

Total word count: 2865 words

As the official Deep Search from OpenAI, ChatGPT Deep Search performs slightly below expectations in this review, with relatively little output, which is not in line with its $200 monthly membership fee.

After discussion with a friend who assisted with the review, two possible reasons emerged:

Imposing too many conditions on a large reasoning model may constrain its performance, and the prompt may not have been sufficiently optimized.
The GPT model has no inherent advantage in processing Chinese-language information, so it may be worth trying to search in English and answer in Chinese.

Nevertheless, ChatGPT Deep Search has its merits:

During the questioning session, ChatGPT Deep Search first asks the user a number of questions in return, guiding the user to clarify the direction of the search. This helps avoid wasted resources or a search that heads in the wrong direction. For example, the reviewer's initial prompt was fairly brief, and after ChatGPT Deep Search's clarifying questions, the reviewer refined it. These two parts were combined and provided as the new standard prompt for all participating AI deep search tools. The quality of ChatGPT Deep Search's clarifying questions impressed the reviewers, and this step may serve as a standard process reference for future AI search projects.

The output of ChatGPT Deep Search reads more like a complete article with coherent logic. Its ability to generate long text and its strong reasoning ability constitute a high technical barrier. At present, many search tools have integrated DeepSeek R1 to strengthen deep thinking, but because of DeepSeek R1's limited context window (32K), their content generation is in practice more like filling in content from an outline. There is nothing wrong with this approach, but the user experience would certainly be better if these tools could generate long, coherent articles as ChatGPT Deep Search does.
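
To make the "filling in content from an outline" pattern concrete, the following is a minimal sketch under stated assumptions. It is not how any of the reviewed tools is known to work. The 32K window comes from the review; the 8,000-token output reserve, the four-characters-per-token estimate, and all function names are hypothetical choices for illustration.

```python
def estimate_tokens(text):
    """Crude size estimate (assumption): roughly one token per 4 characters."""
    return max(1, len(text) // 4)


def pack_snippets(snippets, budget_tokens):
    """Keep snippets in order until the next one would exceed the budget."""
    kept, used = [], 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > budget_tokens:
            break
        kept.append(snippet)
        used += cost
    return kept


def plan_sections(outline, snippets_by_section,
                  window_tokens=32_000, reserve_for_output=8_000):
    """Plan one generation call per outline section, each with its own packed context."""
    budget = window_tokens - reserve_for_output
    return {
        section: pack_snippets(snippets_by_section.get(section, []), budget)
        for section in outline
    }
```

Because every section is generated from its own small context, each call fits in the window, but no single call sees the whole report. That is one plausible reason such output reads like a filled-in outline rather than one continuous article.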

6. DeepSeek official version

Total word count: 1625 words

The DeepSeek Deep Thinking + web search combination performed well, especially in matching resources, and was able to find more niche and emerging software. However, because of its context length limit, the official version of DeepSeek could not present all the products requested in the review, although its presentation of product features was on point and basically met the reviewer's expectations.

With the official DeepSeek service becoming increasingly stable, the reviewers believe that DeepSeek-R1 + web search is now an ideal way for the average user to get relatively high-quality answers with a low barrier to entry.

However, hallucination remains a problem in the official version of DeepSeek. If DeepSeek strengthens source labeling and expands the context window, the user experience should improve further. Response speed also needs continued optimization.

7. Flowith.ai's Oracle mode

Total word count: 5369 words

Flowith.ai is a whiteboard-style knowledge base service. Its early publicity focused on Oracle mode, in which an agent breaks the user's question down into a number of sub-questions and steps. The user can modify and confirm the breakdown, and the agent then searches for and organizes the results.
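
The flow described above (decompose, let the user confirm, then search each step) can be sketched as follows. This is only an illustration of the pattern, not Flowith's implementation. Every name is hypothetical, and the three `toy_*` functions are stand-ins for a model call, a user interaction, and a search backend.

```python
from typing import Callable, Dict, List


def oracle_style_run(
    question: str,
    decompose: Callable[[str], List[str]],
    confirm: Callable[[List[str]], List[str]],
    search: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Break a question into steps, let the user edit them, then search each one."""
    proposed = decompose(question)
    approved = confirm(proposed)  # the user may modify or drop steps here
    return {step: search(step) for step in approved}


def toy_decompose(question: str) -> List[str]:
    """Stand-in for a model that proposes sub-questions."""
    return [f"{question}: products", f"{question}: principles",
            f"{question}: technical paths"]


def toy_confirm(steps: List[str]) -> List[str]:
    """Stand-in for the user, who here keeps only the first two steps."""
    return steps[:2]


def toy_search(step: str) -> List[str]:
    """Stand-in for a search backend."""
    return [f"result for {step}"]
```

For example, `oracle_style_run("AI deep search", toy_decompose, toy_confirm, toy_search)` returns one list of results per approved step, keyed by the step text.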

The results of the review show that Flowith performs a more extensive search in the second step. It is not clear which model Flowith uses at this stage, but it is presumably Gemini, which has strong context-handling capabilities. Flowith was the only tool in the evaluation that could completely list and introduce the 10 AI tools requested by the reviewers, which deserves recognition. In addition, Flowith's clarifying-question mechanism in the initial phase resembles the way OpenAI Deep Search interacts.

However, Flowith does not allow much manual adjustment or control during the search process. In fact, none of the participating tools gave users much control over the search process, but Flowith's visualization of the search creates the "illusion" that the user is deeply involved.

In addition, Flowith's coverage of OpenAI Deep Search is not accurate enough: the results read more like single-keyword searches and lack relevance to OpenAI. This is unfortunate, and it reflects the importance of OpenAI's own O3 long-text + reasoning model.

We hope that Flowith will be able to access the APIs of Claude 4.0, O3, or the subsequent DeepSeek R2 in the future, and continue to refine its engineering to open up more possibilities for users.

8. Genspark

Total word count: 3406 words

Genspark had previously attracted attention for its AI Agent + search model and for presenting search results as illustrated notes similar to Xiaohongshu (Little Red Book). At that time, however, limited model capabilities meant that its output was of poor quality and poorly up to date. Nearly a year later, Genspark has launched its own Deep Search function.

Revisiting Genspark, it is clear that its capabilities have improved significantly. Genspark's products have always been characterized by their maturity and ease of use: for example, it thinks for longer, retrieves a larger amount of information, and supports email notification when a report is complete. Genspark's introduction of the O3 version of Deep Search is fairly on point. On the whole, however, Genspark is still in an exploratory stage. Its content contains a good deal of redundant information while the requested product introductions are missing, which may be related to a lack of Chinese information resources.

It is worth noting that Genspark is the only tool in this review that provides video links and cover previews. However, its YouTube video links cannot be played directly, and users still need to open them via an external link.

Genspark does not support direct file export or copying, only sharing of results as links to Genspark website pages.

9. Kimi

Total word count: 1400 words

There is an interesting phenomenon with Kimi. Because the reviewer chose a different route, Kimi kept displaying its results in English, and the reviewer then had to explicitly ask it to answer in Chinese.

The quality of Kimi's report was fair, with Kimi accurately identifying 5 out of 10 AI tools, and the products were neatly listed. The presentation on Deep Search was also good. However, Kimi omitted many of the products mentioned by the reviewer (even though the reviewer provided links to them).

In addition, Kimi does not support direct export to documents.

Early on, reviewers were impressed with Kimi's long text generation capabilities. Although Kimi's level of intelligence was low at the time, its ability to generate very long texts was still attractive. Today, Kimi's intelligence has been significantly improved and expanded to include multimodal functionality. We are looking forward to further breakthroughs in Kimi's intelligence.

10. Storm

Total word count: 733 words

The Storm architecture originated at Stanford University and has been available for some time. Recently Storm seems to have undergone some optimizations, but its capabilities are significantly behind the times. First, Storm's output has too few words, and second, the descriptions of the components are rather generalized and lack detail.

Perhaps due to its free public interface and limited usage, Storm's development strategy is not as aggressive as the other participating tools.

Overall, Storm's performance was disappointing.

It is worth noting that the user is required to first enter a subject of up to 20 words and then describe the purpose.

11. Secret Tower Search

Total word count: 1259 words

If links are included, the word count of the Secret Tower Search report is close to 10,000 words, but counting them would not be fair.

Secret Tower Search performed moderately well, especially in the number of pages browsed. Secret Tower AI Search was the first to support browsing a large number of web pages, and it browsed 374 web pages in this review.

Secret Tower Search identifies some niche products, but the number of products is still low.

Slightly amusingly, a QR code for a WeChat group appears at the front of the article.

In general, however, the depth of the Secret Tower Search article is still insufficient, and reading a large number of web pages did not deliver the expected results, which is a bit embarrassing.

12. Gemini

Total word count: 8690 words

Google is a major player in search (without mentioning Baidu, of course).

Overall, Google Gemini's responses were of high quality, but in terms of recognizing 10 AI tools, Gemini found only 6. Although above average, Gemini could have done better.

Google's new models are powerful, for example:

A multimodal model that supports context windows on the order of a million tokens and outputs far more content than any other model (except ChatGPT O1 and O3).
Support for YouTube and other Google ecosystem connected searches.
Fast response time.

But Gemini also made two glaring mistakes in this review:

It is sometimes poor at outputting formatted content. For example, it outputs plain text inside a code block, as shown in the screenshot, which results in confusing formatting.
External links and YouTube referral links are not displayed.

One interesting detail is that users can click the "three dots" button to have the AI recheck the answer. In practice, however, this feature is not very effective.

13. Perplexity

Total word count: 1931 words

Perplexity's exported content has the most comfortable formatting, with links embedded in the text rather than displayed separately. This is probably due to Perplexity's excellent Markdown optimization.

Perplexity performs reasonably well for widely known products, but for niche products, Perplexity has little coverage and largely ignores domestic sources.

Summary

The advent of DeepSeek R1 has enabled vendors to quickly build AI deep search services that work well on the surface. The platforms provide the search function and DeepSeek provides the deep thinking capability. However, a lot of engineering work is still required to combine the two effectively. Vendors that do not want to put much effort into engineering need to rely on strong model capabilities to drive the search service.
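
The basic division of labor described above can be sketched in a few lines. This is a minimal illustration that assumes a search function returning (URL, snippet) pairs and a reasoning function that maps a prompt to text. Both are passed in as parameters, and all names here are hypothetical rather than taken from any reviewed product.

```python
from typing import Callable, List, Tuple

Source = Tuple[str, str]  # (url, snippet)


def build_prompt(question: str, sources: List[Source]) -> str:
    """Number each source so the model can cite it as [n] in its answer."""
    lines = ["Answer using only the numbered sources below and cite them as [n].", ""]
    for i, (url, snippet) in enumerate(sources, start=1):
        lines.append(f"[{i}] {url}: {snippet}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


def deep_search(
    question: str,
    search: Callable[[str], List[Source]],
    reason: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """The platform supplies `search`; the reasoning model supplies `reason`."""
    sources = search(question)
    answer = reason(build_prompt(question, sources))
    return answer, [url for url, _ in sources]
```

The glue itself is short. The engineering effort the review refers to lies in what this sketch leaves out, such as choosing and ranking sources, fitting them into the model's context window, and checking that the cited sources actually support the answer.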

DeepSeek does not guarantee the absolute accuracy of the content, but can make it "look" more credible.

As of February 16, 2025, and for months to come, quickly accessing and organizing information on the web will not be easy, and will require a significant and sustained investment of resources and technology.

Looking ahead, if DeepSeek R2 can offer a million-token context window, support multimodal inputs, and further improve its response speed, its competitiveness in the market will be immeasurable.

Article copyright belongs to AI Sharing Circle. Please do not reproduce without permission.
