
Agent-driven search reasoning engine reaches 88.3% accuracy on SimpleQA

Making search engines more intelligent has long been a focus of artificial intelligence research. Recently, a team of researchers including Salaheddin Alzubi, Creston Brooks, Purva Chiniya, Edoardo Contente, Chiara von Gerlach, Lucas Irwin, Yihan Jiang, Arda Kaz, Windsor Nguyen, Sewoong Oh, Himanshu Tyagi, and Pramod Viswanath released Open Deep Search (ODS), an open source search engine framework that aims to bridge the gap between closed-source AI search engines and open-source solutions.

 

Innovation Core: Open Search Tool and Open Reasoning Agent

The innovation of ODS is that it pairs the latest open source Large Language Models (LLMs) with reasoning agents that can answer user queries using web search tools. The framework consists of two main components: the Open Search Tool and the Open Reasoning Agent.

Open Search Tool

Open Search Tool is an advanced web search tool that, according to the authors, outperforms existing closed-source search engines. The tool rewrites user queries where necessary, extracts relevant context from the search results, splits that context into chunks, and reranks the chunks so that all relevant results are included. In addition, the Open Search Tool has custom handling for major sites such as Wikipedia, ArXiv and PubMed, further improving the accuracy and comprehensiveness of search results.
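The chunk-and-rerank step described above can be illustrated with a minimal Python sketch. This is not the ODS implementation: the function names (`chunk_text`, `overlap_score`, `rerank`) are hypothetical, and the relevance score is a simple token-overlap heuristic standing in for whatever ranking model the real tool uses.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split a page into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score (an assumption): fraction of query tokens found in the chunk."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    chunk_counts = Counter(tokenize(chunk))
    hits = sum(1 for token in query_tokens if chunk_counts[token] > 0)
    return hits / len(query_tokens)


def rerank(query: str, pages: list[str], top_k: int = 3, max_words: int = 40) -> list[str]:
    """Chunk every page, then return the top_k chunks ordered by descending score."""
    chunks = [chunk for page in pages for chunk in chunk_text(page, max_words)]
    ranked = sorted(chunks, key=lambda chunk: overlap_score(query, chunk), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    example_pages = [
        "The Eiffel Tower is in Paris. It was completed in 1889.",
        "Bananas are rich in potassium.",
    ]
    for passage in rerank("When was the Eiffel Tower completed", example_pages, top_k=1):
        print(passage)
```

A production system would more plausibly score chunks with an embedding or cross-encoder model; the overlap heuristic is used here only to keep the sketch self-contained.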


Figure 1: Users can plug in any base LLM of their choice and take advantage of the open source Open Deep Search (ODS) framework. ODS consists of two components: the Open Search Tool and the Open Reasoning Agent. The query is first fed into the Open Reasoning Agent, which coordinates a set of available tools to interpret and answer the query. The most important tool is the Open Search Tool, which provides high-quality context from multiple retrieval sources on the Web. The experiments use Llama3.1-70B and DeepSeek-R1 as base models.

Open Reasoning Agent

The Open Reasoning Agent is the other key component of ODS, responsible for interpreting user tasks and completing queries by invoking various tools. Two versions of this agent are provided: a ReAct-based version (ODS-v1) and a CodeAct-based version (ODS-v2).

  • ODS-v1: Uses the ReAct framework together with Chain-of-Thought (CoT) reasoning. CoT improves reasoning by encouraging the model to think before answering a question, while ReAct further improves task completion and decision making by interleaving reasoning steps with action execution. ODS-v1 also integrates the Wolfram Alpha API for handling complex mathematical computations.

    Figure 2: Schematic of the ReAct prompt structure used in ODS-v1.

    The ReAct framework enables tool integration through a standardized interface:

    Thought: [reasoning trace]
    Action: Tool[arguments]
    Observation: [result]
    

    In ODS-v1, the ReAct agent uses a prompt that offers three action options: "continue thinking" for decomposing complex problems, "search internet" for finding factual information using OpenPerplex, and "calculate" for connecting to the Wolfram Alpha API to handle numerical calculations that base models often struggle with.

  • ODS-v2: Employs the CodeAct framework, which uses code generation and execution to enhance reasoning. CodeAct significantly improves performance by generating executable Python code for tool calls. ODS-v2 can handle more complex tasks and supports multiple tools and agents working together.

    Figure 3: A CodeAct agent answering a multi-hop question in ODS-v2.
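The Thought / Action / Observation interface used by ODS-v1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ODS code: the tool names `search` and `calculate`, the stub tools, and the helpers `parse_action` and `react_step` are all hypothetical, and no real search or Wolfram Alpha call is made.

```python
import re
from typing import Callable, Optional

# Matches "Action: ToolName[arguments]". The non-greedy match stops at the first
# closing bracket, so arguments containing "]" are not supported in this sketch.
ACTION_PATTERN = re.compile(r"Action:\s*(\w+)\[(.*?)\]", re.DOTALL)


def parse_action(model_output: str) -> Optional[tuple[str, str]]:
    """Extract (tool_name, argument) from a model response, or None if absent."""
    match = ACTION_PATTERN.search(model_output)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def react_step(model_output: str, tools: dict[str, Callable[[str], str]]) -> str:
    """Run the requested tool and format its result as an Observation line."""
    parsed = parse_action(model_output)
    if parsed is None:
        return "Observation: [no action found]"
    name, argument = parsed
    tool = tools.get(name)
    if tool is None:
        return f"Observation: [unknown tool: {name}]"
    return f"Observation: [{tool(argument)}]"


def stub_search(query: str) -> str:
    """Hypothetical stand-in for a web search tool; returns a canned string."""
    return f"stub results for '{query}'"


def stub_calculate(expression: str) -> str:
    """Hypothetical stand-in for a calculator; only sums integers joined by '+'."""
    return str(sum(int(term) for term in expression.split("+")))


if __name__ == "__main__":
    tools = {"search": stub_search, "calculate": stub_calculate}
    response = "Thought: I need to add two numbers.\nAction: calculate[40 + 2]"
    print(react_step(response, tools))
```

In a full agent loop, the Observation line would be appended to the prompt and the model queried again until it produces a final answer; that loop is omitted here because it depends on the chosen LLM client.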

 

Performance: Beyond closed source solutions

ODS demonstrated excellent performance on two popular evaluation benchmarks, SimpleQA and FRAMES.

  • SimpleQA: ODS-v1 and ODS-v2 achieve accuracies of 87.7% and 88.3%, respectively, outperforming Perplexity's default search AI (82.4%) and Perplexity Sonar Reasoning Pro (85.8%). ODS-v2 outperforms OpenAI's GPT-4o Search Preview on FRAMES and nearly matches it on SimpleQA.

    Figure 4: ODS-v1 identifies the correct answer by cross-checking multiple sources using high-quality context retrieved by the Open Search Tool. Perplexity Sonar Reasoning Pro fails to retrieve relevant search information.

    Figure 5: ODS+DeepSeek-R1 correctly identifies July 21, 2022, rather than July 20, as the date on which Kaitlin Armstrong was arraigned and pleaded not guilty to the charge of murdering Moriah Wilson. The ODS agent cross-checked the two conflicting dates and correctly selected July 21. By contrast, Perplexity Pro was confused and gave the wrong answer of July 20, 2022.

  • FRAMES: ODS-v1+DeepSeek-R1 achieves an accuracy of 56.7% with a single web search, while ODS-v2+DeepSeek-R1 raises the accuracy to 75.3% with multiple searches, significantly outperforming the best available baseline.

    Figure 6: ODS-v1+Llama3.1-70B accurately calculates the age difference using the Wolfram Calculator tool, arriving at the correct answer, 90. In contrast, Perplexity follows the wrong line of reasoning and reports an age of 79.
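To make the reported gaps concrete, the short Python sketch below tabulates the accuracy figures quoted in this section and computes differences in percentage points. The scores come from the text above; the dictionary layout and the helper `gap_in_points` are illustrative assumptions.

```python
# Accuracy figures (in percent) as quoted in the text of this section.
SIMPLEQA = {
    "ODS-v1": 87.7,
    "ODS-v2": 88.3,
    "Perplexity": 82.4,
    "Perplexity Sonar Reasoning Pro": 85.8,
}
FRAMES = {
    "ODS-v1+DeepSeek-R1": 56.7,
    "ODS-v2+DeepSeek-R1": 75.3,
}


def gap_in_points(scores: dict[str, float], system: str, baseline: str) -> float:
    """Return system accuracy minus baseline accuracy, in percentage points."""
    return round(scores[system] - scores[baseline], 1)


if __name__ == "__main__":
    print(gap_in_points(SIMPLEQA, "ODS-v2", "Perplexity"))
    print(gap_in_points(SIMPLEQA, "ODS-v2", "Perplexity Sonar Reasoning Pro"))
    print(gap_in_points(FRAMES, "ODS-v2+DeepSeek-R1", "ODS-v1+DeepSeek-R1"))
```

On these figures, ODS-v2 leads Perplexity's default search by 5.9 points and Sonar Reasoning Pro by 2.5 points on SimpleQA, and moving from ODS-v1 to ODS-v2 adds 18.6 points on FRAMES.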

 

Open source: driving community innovation

The release of ODS not only demonstrates its strength in the search AI space, but also gives the open source community a powerful tool. The open source implementation of ODS is publicly available: researchers and developers can obtain the code at https://github.com/sentient-agi/OpenDeepSearch and build on it to innovate and optimize.

 

Future Outlook: Open Source Leads New Direction for Search AI

The emergence of ODS marks an important milestone for open source search engines. By combining advanced reasoning capabilities with high-quality web search tools, ODS matches or outperforms several closed-source solutions on SimpleQA and FRAMES and lays the groundwork for future innovation and development. As the open source community continues to grow and the technology continues to advance, ODS is expected to help lead the search AI space into a new era.

 

Summary

The launch of Open Deep Search is an important breakthrough in the history of search engine development. It not only demonstrates the great potential of open source solutions in the field of AI, but also provides a powerful and flexible tool for users and researchers. With more and more developers joining this open source project, ODS is expected to drive the further development of search AI technology and provide users with a smarter and more accurate search experience.

May not be reproduced without permission: Chief AI Sharing Circle » Agent-driven search reasoning engine reaches 88.3% accuracy on SimpleQA