OpenAI Launches Deep Research, an Agent for In-Depth Research Powered by the o3 Model

Are you tired of wading through piles of information and still struggling to find the answers you need? Do you wish you had an intelligent assistant that could research a topic in depth like a professional analyst? OpenAI has announced a brand new ChatGPT feature: Deep Research. It promises to change the way you access information, letting you finish in minutes complex research tasks that would otherwise take hours. This article walks through Deep Research's capabilities, application scenarios, working principles, and future direction, and looks at how AI is empowering knowledge work and opening a new era of intelligent research.

Deep Research is an agent that uses reasoning to synthesize large amounts of online information and carry out multi-step research tasks for you, similar to comparable features from Perplexity and Gemini. OpenAI's newly released Deep Research uses its o3 model to gather and analyze large volumes of information through web search, and ultimately generates a detailed, professional report with citations. The feature has attracted widespread attention since launch and has been hailed as a "super powerful" AI assistant, heralding a new era of AI-assisted research.

It's available to Pro users today, and will be available to Plus and Team users next.

In one sentence: "Deep Research" autonomously conducts multi-step web investigations, completing in 5-30 minutes complex research tasks that would normally take a human researcher hours, and presents the results to the user as a high-quality report.

Reference reading: Google launches Deep Research; Open Deep Research: generating AI research based on web search content; STORM: searching web data by topic to generate long-form papers and reports with citations

How good is "deep research"? Let's find out:

Powerful and incredibly efficient: Say goodbye to staying up all night searching for information! "Deep Research" takes research to a whole new level of efficiency by completing complex studies that traditionally take hours in 5-30 minutes, with the ability to dig as deep as needed to provide expert-level analysis.
The results are reliable and well documented: No more worrying about unknown sources! All conclusions are accompanied by detailed citations, down to the relevant paragraph of the original web page or PDF, making it easy for users to trace and verify the accuracy of the information, and making your research more convincing.
Wide range of applications, flexible and easy to use: Whether you need competitive analysis, market research, product comparisons, or academic research, "Deep Research" can be your right hand. Simply select "Deep Research" in the ChatGPT interface and enter a query to start your research. It supports uploading files (e.g. PDFs) to provide more specific context, and lets you follow research progress and cited sources in real time in the sidebar.
Technologically advanced, with excellent performance: Trained with end-to-end reinforcement learning, Deep Research can perform multi-step browsing and reasoning tasks. It supports reading website content, processing data, generating charts, and quoting source text to support its arguments. On a difficult benchmark called Humanity's Last Exam (HLE), Deep Research scored 26.6%, far exceeding the earlier o3-mini (13%) and o1 (9%), demonstrating its strong information retrieval and integration capabilities and near-human research behavior.
Gradual rollout and a promising future: It is currently available to Pro users (100 queries per month) and will be extended to Plus users (10 queries per month) within a month, with Team and Enterprise editions to follow. Mobile and desktop apps will be supported in the future, and there are plans to connect to more data sources, both subscription-based and internal, for more powerful and personalized output.

These details are equally noteworthy:

The more tool calls, the higher the accuracy: The official chart shows that as the maximum number of tool calls (Max Tool Calls) increases, Deep Research's pass rate also increases, indicating a positive correlation between its performance and how much it is allowed to use its tools.
Hallucinations still need work: Despite Deep Research's impressive performance, it can still hallucinate and reason incorrectly, which is a key focus of OpenAI's subsequent optimization.
Combined with Operator, the potential is enormous: OpenAI plans to combine Deep Research's online investigation with Operator's ability to take real-world actions, enabling more powerful agent capabilities. There is a lot to look forward to.

Full official Deep Research announcement

Today, we're launching Deep Research in ChatGPT, a new agent feature that allows multi-step research on the Internet for complex tasks. It can accomplish in tens of minutes what it would take a human hours to do.

Deep Research is OpenAI's next agent that can work for you independently: you give it a prompt, and ChatGPT will find, analyze, and synthesize hundreds of online sources to create a comprehensive report at the level of a research analyst. It is powered by a version of the upcoming OpenAI o3 model that is optimized for web browsing and data analysis, and it uses reasoning to search, interpret, and analyze the vast amounts of text, images, and PDFs available on the Internet, adjusting course as needed based on the information it encounters.

The ability to synthesize knowledge is a prerequisite for the creation of new knowledge. For this reason, Deep Research marks an important step toward our broader goal of developing AGI, which we have long envisioned generating new scientific research.

Why we built Deep Research

Deep Research is built for people who do knowledge-intensive work in fields such as finance, science, policy, and engineering and need thorough, accurate, and reliable research. It is also useful for savvy shoppers who want hyper-personalized advice on purchases that usually require careful research, such as cars, appliances, and furniture. Each output is fully documented with clear citations and summaries of its thoughts, making it easy to reference and verify information. It's especially effective at finding niche, non-intuitive information that requires navigating a large number of sites. Deep Research frees up valuable time by letting you offload and speed up complex, time-consuming web research with a single query.

Deep Research independently discovers, reasons about, and integrates insights from across the web. To accomplish this, it was trained on real-world tasks requiring browser and Python tool use, using the same reinforcement learning methods behind OpenAI o1 (our first reasoning model). While o1 demonstrates impressive capabilities in coding, math, and other technical areas, many real-world challenges require extensive context and information gathering from diverse online sources. Deep Research builds on these reasoning capabilities to bridge that gap, enabling it to tackle the wide range of problems people face at work and in their daily lives.

How to use Deep Research

In ChatGPT, select "Deep Research" in the message editor and enter your query. Tell ChatGPT what you need - whether it's a competitive analysis of streaming platforms or a personalized report on the best commuter bikes. You can attach files or spreadsheets to add context to your question. Once it's up and running, a sidebar appears with a summary of the steps taken and sources used.

Deep Research may take 5 to 30 minutes to complete its work, since it needs time to dig deep into the web. In the meantime, you can step away or work on other tasks - you'll be notified when the research is complete. The final output arrives as a report in the chat - over the next few weeks, we'll also be adding embedded images, data visualizations, and other analytic outputs to these reports for additional clarity and context.

Compared with Deep Research, GPT-4o is well suited to real-time, multimodal conversations. For multifaceted, domain-specific queries where depth and detail are critical, Deep Research's ability to explore extensively and cite each claim is the difference between a quick summary and a well-documented, verified answer that can be used as a work product.

In the official example, Deep Research responds to the prompt in great detail, providing side-by-side country-level data for the top 10 developed and top 10 developing countries for easy reference and comparison. It then uses this information to offer detailed, well-informed, and practical market entry recommendations. See official example: https://openai.com/index/introducing-deep-research/

Working Principle

Deep Research was trained using end-to-end reinforcement learning on hard browsing and reasoning tasks across a range of domains. Through this training, it learned to plan and execute multi-step trajectories to find the data it needs, backtracking and reacting to real-time information where necessary. The model can also browse user-uploaded files, plot and iterate on graphs using the Python tool, embed both generated graphs and images from websites in its responses, and cite specific sentences or passages from its sources. As a result of this training, it reaches new highs on a number of public evaluations focused on real-world problems.
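
The announcement does not describe the model's internals, so the following is only a toy sketch of the behavior described above: a search loop that follows leads, backtracks from dead ends, keeps its sources for citation, and stops when a tool-call budget runs out. The `TOY_WEB` dictionary, the `research` function, and the `Report` class are all hypothetical names invented for illustration; nothing here reflects OpenAI's actual implementation or its reinforcement learning training.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the web: each query maps to a text snippet plus
# the follow-up queries that snippet suggests. A real agent would call a
# browser tool instead of reading from this dictionary.
TOY_WEB = {
    "commuter bikes": ("Overview of commuter bikes.", ["bike prices", "dead link"]),
    "bike prices": ("Typical price ranges for commuter bikes.", ["bike maintenance"]),
    "bike maintenance": ("Yearly maintenance costs.", []),
}


@dataclass
class Report:
    findings: list = field(default_factory=list)   # (source query, text) pairs
    dead_ends: list = field(default_factory=list)  # queries that returned nothing
    tool_calls: int = 0                            # simulated browse calls used


def research(start_query, max_tool_calls):
    """Depth-first exploration with a tool-call budget and backtracking."""
    report = Report()
    stack = [start_query]  # pending queries; the top of the stack is the next step
    seen = set()
    while stack and report.tool_calls < max_tool_calls:
        query = stack.pop()
        if query in seen:
            continue
        seen.add(query)
        report.tool_calls += 1  # one simulated "browse" call
        result = TOY_WEB.get(query)
        if result is None:
            # Dead end: record it and backtrack to whatever is left on the stack.
            report.dead_ends.append(query)
            continue
        text, follow_ups = result
        report.findings.append((query, text))  # keep the source for citation
        stack.extend(reversed(follow_ups))     # react to what was just read
    return report


if __name__ == "__main__":
    for budget in (2, 10):
        r = research("commuter bikes", budget)
        print(budget, len(r.findings), r.dead_ends)
```

In this toy setup, a budget of 2 calls yields two findings, while a budget of 10 lets the loop reach all three pages and record the one dead end, which loosely mirrors the idea that allowing more tool calls lets the agent gather more evidence.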

Humanity's Last Exam

On Humanity's Last Exam, a recently released evaluation that tests AI on expert-level questions across a wide range of subjects, the model powering Deep Research set a new high with 26.6% accuracy. The test contains more than 3,000 multiple-choice and short-answer questions covering more than 100 subjects, ranging from linguistics to rocket science and from classics to ecology. The biggest improvements over OpenAI o1 were in chemistry, humanities and social sciences, and mathematics. The model powering Deep Research demonstrated a human-like approach by effectively seeking out specialized information when necessary.

Model	Accuracy (%)
GPT-4o	3.3
Grok-2	3.8
Claude 3.5 Sonnet	4.3
Gemini Thinking	6.2
OpenAI o1	9.1
DeepSeek-R1*	9.4
OpenAI o3-mini (medium)*	10.5
OpenAI o3-mini (high)*	13.0
OpenAI deep research**	26.6

* The model is not multimodal and is evaluated on a text-only subset.
** Using the browse + python tools.
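
To put the table in perspective, the short sketch below computes how far ahead the Deep Research score is of each other entry, both in percentage points and as a ratio. The numbers are copied from the table; the function name `gap_to_deep_research` is made up for this example. Keep in mind that the comparison is not like-for-like: per the footnotes, the Deep Research run used browsing and Python tools, and some models were evaluated on a text-only subset.

```python
# Accuracy (%) on Humanity's Last Exam, copied from the table above.
hle_accuracy = {
    "GPT-4o": 3.3,
    "Grok-2": 3.8,
    "Claude 3.5 Sonnet": 4.3,
    "Gemini Thinking": 6.2,
    "OpenAI o1": 9.1,
    "DeepSeek-R1": 9.4,
    "OpenAI o3-mini (medium)": 10.5,
    "OpenAI o3-mini (high)": 13.0,
    "OpenAI deep research": 26.6,
}


def gap_to_deep_research(scores, baseline, target="OpenAI deep research"):
    """Return (percentage-point gap, ratio) of the target score over a baseline."""
    diff = scores[target] - scores[baseline]
    ratio = scores[target] / scores[baseline]
    return round(diff, 1), round(ratio, 2)


if __name__ == "__main__":
    for name in hle_accuracy:
        if name != "OpenAI deep research":
            diff, ratio = gap_to_deep_research(hle_accuracy, name)
            print(f"{name}: +{diff} points, {ratio}x")
```

For example, against OpenAI o1 the gap works out to 17.5 percentage points, or roughly 2.9 times the accuracy.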

GAIA

On GAIA, a public benchmark that evaluates AI on real-world questions, the model powering Deep Research reaches a new state of the art (SOTA), topping the external leaderboard. The benchmark contains questions at three difficulty levels, and completing them successfully requires abilities including reasoning, multimodal fluency, web browsing, and tool-use proficiency.

Examples of GAIA tasks

See official example: https://openai.com/index/introducing-deep-research/

Expert-level tasks

In an internal evaluation of expert-level tasks across a range of fields, Deep Research was rated by domain experts as having automated multiple hours of difficult, manual investigation.

Pass rate and maximum number of tool calls

The more the model browses and thinks about what it is browsing, the better it performs, which is why giving it time to think is important.

Example of an expert-level task

See official example: https://openai.com/index/introducing-deep-research/

The estimated economic value of a task correlates more with the pass rate than with the number of hours spent by humans - what the model considers difficult is different from what humans consider time-consuming.

Limitations

Deep Research unlocks significant new capabilities, but it is still early and has limitations. According to internal evaluations, it can sometimes hallucinate facts or make incorrect inferences in its responses, though at a notably lower rate than existing ChatGPT models. It may struggle to distinguish authoritative information from rumors, and it currently shows weakness in confidence calibration, often failing to convey uncertainty accurately. At launch, there may be minor formatting errors in reports and citations, and tasks may take longer to start. We expect all of these issues to improve quickly with more usage and time.

Access

Deep Research in ChatGPT is currently very compute intensive. The longer it takes to research a query, the more inference compute is required. We are starting today with a version optimized for Pro users, with up to 100 queries per month. Plus and Team users will get access next, followed by Enterprise. We are still working on bringing access to users in the UK, Switzerland and the EEA.

All paid subscribers will soon get higher rate limits, once we release a faster, more cost-effective version of Deep Research powered by a smaller model that still delivers high-quality results.

Over the next few weeks and months, we will be working on the technical infrastructure, closely monitoring the current release, and conducting even more rigorous testing. This is consistent with our principle of iterative deployment. If all safety checks continue to meet our release standards, we expect to release Deep Research to Plus users in about a month.

What's next

Deep Research is available today in ChatGPT on the web and will be rolled out to the mobile and desktop apps within a month. Currently, Deep Research can access the open web and any uploaded files. In the future, you'll be able to connect it to more specialized data sources - extending its access to subscription-based or internal resources - to make its output more robust and personalized.

Looking further ahead, we envision agentic experiences coming together in ChatGPT for asynchronous, real-world research and execution. The combination of Deep Research, which can perform asynchronous online investigation, and Operator, which can take real-world actions, will enable ChatGPT to carry out increasingly complex tasks for you.