OpenAI's Isa Fulford and Josh Tobin recently took an in-depth look at the company's newest AI Agent product, Deep Research, on the Training Data podcast, noting that Deep Research represents a major step forward in AI research capabilities, using end-to-end model training instead of a traditional fixed pipeline.
The two product leads explained in detail how high-quality training data and the powerful reasoning capabilities of o3, OpenAI's state-of-the-art reasoning model, contribute to Deep Research's flexible research strategy. They also shared Sam Altman's vision for Deep Research, which he expects to take on a meaningful percentage of knowledge work. To build transparency and user trust, Deep Research was designed with key features such as source citations and an up-front requirements clarification step. By compressing work that used to take hours into minutes, Deep Research is opening up new possibilities across a wide range of business and personal use cases.
Related reading: The Future is Here: An In-Depth Look at the "Model as Product" Era
Original address: https://www.sequoiacap.com/podcast/training-data-deep-research/
Summary of contents
In this podcast, OpenAI's Isa Fulford and Josh Tobin detail Deep Research, an AI Agent that completes in-depth online research in 5 to 30 minutes by searching many websites and generating comprehensive reports with detailed citations. The episode digs into how OpenAI builds effective AI Agents and previews what the future holds for Deep Research in both business and personal applications.
- End-to-end training outperforms manual orchestration: Instead of the common approach of building a fixed operation graph with language-model nodes, Deep Research is trained end-to-end directly on complex browsing tasks. This lets the model develop flexible information-gathering and synthesis strategies that would be difficult to achieve with hand-written scripts.
- Data quality is a core strength: High-quality training data was critical to Deep Research's success. The OpenAI team combined the powerful reasoning capabilities of o3 (OpenAI's state-of-the-art reasoning model) with fine-tuning on carefully curated examples of complex browsing tasks, a combination that produced highly creative results.
- Agents excel at well-defined but flexible tasks: Deep Research demonstrates that an AI Agent can be trained to handle workflows that rigid rules cannot capture. The model's ability to adapt its research strategy based on preliminary findings makes it ideal for tasks such as market research, scientific literature reviews, and consumer studies, all of which benefit from comprehensive, exploratory information gathering.
- Transparency and control build trust: Deep Research builds user trust through clear citations, up-front clarification of requirements, and a visible summary of its chain-of-thought reasoning. This transparency, combined with the model's ability to integrate information from many sources, lets users verify its conclusions while benefiting from research far more thorough than they could practically do on their own.
- Time compression creates new possibilities: Deep Research reduces research tasks that used to take hours to just minutes. This is not merely a time saver but a fundamental shift in how knowledge workers operate: users can now run in-depth research for decisions they previously had no time to investigate, such as analyzing potential investments or planning special events.
Podcast transcript
Josh Tobin. One lesson I see people learn in this field time and time again is that we think we can write programs that do things smarter than the models can. But in reality, as the field advances, the models usually find better solutions than humans do.
And perhaps the overarching lesson of machine learning is that you get what you optimize for. So if you can build a system that lets you optimize directly for the results you want, the results will be much better than if you try to stitch together models that weren't optimized end-to-end for the task you're trying to perform. So my long-term guidance is that fine-tuning models with reinforcement learning is probably a key part of building the most powerful Agents.
Sonya Huang. We are pleased to welcome Isa Fulford and Josh Tobin, the product leads for OpenAI Deep Research. Deep Research was released three weeks ago and has quickly become a hit, used by many tech luminaries, such as the Collison brothers, for everything from industry analysis to medical research, and even birthday party planning.
Deep Research is trained with end-to-end reinforcement learning on complex browsing and reasoning tasks, and is the latest addition to OpenAI's Agent product family, the second product after Operator. We spoke with Isa and Josh about Deep Research on a wide range of topics, from its use cases to its underlying technology to what to expect from OpenAI's future Agent products.
Isa and Josh, welcome to the show.
Lauren Reeder. Thank you for coming. Thank you very much for joining us.
Josh Tobin. I'm glad to be here.
Isa Fulford. Thanks for the invite.
What is Deep Research?
Lauren Reeder. So, maybe let's start with what is Deep Research? Tell us a little bit about its origins and what this product does.
Isa Fulford. Deep Research is an Agent that searches a large number of online sources and generates very comprehensive reports. It can accomplish tasks that would take a human hours. It's built into ChatGPT and takes only 5 to 30 minutes to answer your question, so it can do much more in-depth research and give you more detailed, better-sourced answers than a regular ChatGPT response.
It's one of the first Agents we've released. We released Operator before it, so Deep Research is our second Agent, and we'll be releasing more in the future.
Sonya Huang. What is the origin story behind Deep Research? When did you guys decide to do this? Where did the inspiration come from? How many people were involved in the development? What was the process of bringing it to fruition?
Josh Tobin. Good question. This was before I joined OpenAI.
Isa Fulford. Oh, yeah. [laughter] I think about a year or so ago we saw a lot of success internally using this new reasoning paradigm and training models to think before they respond. At the time we were focused primarily on math and science, but the other thing this new reasoning capability unlocks is the ability to perform longer-horizon tasks that require Agent capabilities.
We think a lot of people need to perform tasks that require a lot of online research or a lot of outside context, which involves a lot of reasoning and discriminating between sources of information, and you have to be quite creative to accomplish those kinds of things. I think we finally had models, or ways to train models, that let us tackle some of these tasks. So we decided to try training models to perform browsing tasks, using the same methodology we used to train the reasoning models, but applied to more real-world tasks.
Sonya Huang. Is this your idea? Josh, how did you get involved?
Isa Fulford. Yeah, initially it was me and Yash Patil, a colleague at OpenAI who's now working on a similar project that will also be released at some point, which we're very excited about. We built an initial demo version. And then Thomas Dimson, who is a very good engineer and will dive into anything, did a lot of the work. So it's been a very interesting process.
Josh Tobin. Yes, I joined a bit later. I rejoined OpenAI about six months ago from my own startup. I had worked at OpenAI earlier in my career, and when I rejoined, I had been looking at various projects and was very interested in some of our Agent projects, including this one, and then I got involved.
Lauren Reeder. Great. Please elaborate on what user groups you are building Deep Research for.
Josh Tobin. Yes, it is actually designed for anyone who does knowledge work in their daily work or life. We see a lot of users using it for their work, for example, conducting research at work to learn about markets, companies, real estate...
Isa Fulford. A great deal of scientific research, medical research. I think we've seen a lot of medical examples as well.
Josh Tobin. Yes. One of the things we're really excited about is that this pattern - "I just have to spend a lot of time on something, doing a lot of web searches and sorting through a lot of information" - isn't limited to work; it's useful for shopping and travel as well.
Isa Fulford. So we're excited about rolling it out to Plus users, so more people will be able to try Deep Research and maybe we'll see some new use cases.
Lauren Reeder. Great. This is definitely one of the products I've used the most in the last few weeks. It's excellent.
Isa Fulford. I'm so happy to hear you say that.
Josh Tobin. Do you use it for work?
Lauren Reeder. Work, of course. There's also entertainment.
Sonya Huang. What do you use it for?
Lauren Reeder. Oh, for me? Geez. I was considering buying a new car and wondered when the next generation of that car would be released. There were a lot of speculative blog posts on the internet about various hints from the manufacturer, so I asked Deep Research to analyze all the rumors about the car along with everything the automaker had actually announced before. It put together an excellent report telling me it was probably a few months away but should be released this year.
Josh Tobin. Yup. One of the really cool things about it is that not only is it good at broadly collecting all the information on a topic, it's also very good at finding very obscure, weird web information. For example, if you want to know something very specific that might not show up on the first page of search results, it's very good at that sort of thing as well. That's cool.
Surprising use cases
Lauren Reeder. What are some of the surprising use cases you've seen?
Josh Tobin. Oh.
Isa Fulford. I think the most surprising thing to me is the number of people who use it to write code.
Josh Tobin. Yes.
Isa Fulford. It's not really a use case that I've considered, but I've seen a lot of people say on Twitter and various channels where we can get feedback that they use it to write code and search for code, and they also use it to find out the latest documentation on a particular package and to help them with scripting or whatever.
Josh Tobin. Yes, I'm a little embarrassed that we didn't think of this as a use case.
Isa Fulford. [ Giggles ] Yeah.
Josh Tobin. It may seem obvious to ChatGPT users, but it's really impressive how well it does this.
Sonya Huang. How do you think the balance between commercial and personal use will evolve over time? For example, you mentioned the upcoming Plus version. In a year or two, do you think this will be primarily a business tool or primarily a consumer tool?
Isa Fulford. I hope it's both. I think it's a very versatile ability, and I think it's something that we all do in our work and personal lives. So I hope it's both.
Josh Tobin. Yes, I'm looking forward to both. I think the magic of it is that it really saves people a lot of time. If there's something that might take you hours - in some cases, we've heard, even days - you can just feed it into Deep Research and get 90% of the result that would have taken you a long time to produce on your own. So, yes, I tend to think there are more of these kinds of tasks in the business world than in the personal world. But I'm sure it's going to be a part of people's lives regardless of the domain.
Lauren Reeder. It has really become my main way of using ChatGPT. I always choose Deep Research over the regular mode.
Isa Fulford. Really?
Lauren Reeder. [Laughter]
Josh Tobin. Yeah, right. You're so patient.
Lauren Reeder. Apparently so.
Lauren Reeder. So, what consumer use cases are you seeing? What excites you guys?
Isa Fulford. I think a lot of it has to do with shopping and travel recommendations. I personally use the model a lot; I've been using it for months for these things. We happened to be in Japan when Deep Research was released, and it was very useful for finding restaurants that met specific requirements, including places I might not have found otherwise.
Josh Tobin. Yes. I find it useful when you need to buy something expensive, or you're planning a special trip, or you want to spend a lot of time thinking about it. For me, I could spend hours trying to read all the information on the Internet about this product I'm interested in buying, like scrutinizing all the reviews and forums and stuff like that. And Deep Research can organize similar information very quickly. So it's really useful for that sort of thing.
Isa Fulford. The model is also very good at comprehensive, multi-part requests. So if you have a query that contains a lot of different parts or questions - say you want to know about a product, but you also want to compare it to all the other products, and you also want review information from Reddit, and so on - it will do all of that for you.
Josh Tobin. Yes. Another trick is to just ask for the results in a table. It usually does that, and it really helps to get a table, with lots of citations, listing all the categories of information you wanted researched.
Isa Fulford. Yes. There are still some features we expect to add to the product, but the underlying model is capable of embedding images, so it can find images of the product, and it can also create charts and embed them in its responses, though these aren't exposed in the product yet. Hopefully those features will be available in ChatGPT soon as well.
Sonya Huang. Geek consumer use cases. [Laughter]
Josh Tobin. Yeah, speaking of geeky consumer use cases, personalized education is also a very interesting one. For example, if you've been wanting to learn about a certain topic, if you need a refresher on biology, or if you want to understand some world event, it's very good at pulling together all the aspects you don't feel you understand and the angles you'd like it to look into, and then it puts together a nice report for you.
Isa Fulford. I have a friend who is thinking of starting a CPG company, and he has been using Deep Research a lot to look up similar products, to see whether specific names have been registered or domains are taken, to make market-size estimates, and so on. He would share the reports with me and I would read them, so it was really interesting to see.
Josh Tobin. Another interesting use case is that it's very good at finding single, obscure facts on the internet. For example, if there's some obscure TV show and you're trying to find a certain episode or something like that, it'll dig deep and find the one reference to it on the web.
Isa Fulford. Oh, yeah. My brother's friend's father had a very specific factual question about which Austrian general was in command when someone died in a certain battle - a very niche question. Apparently ChatGPT had answered it wrong before, and he was pretty sure the answer was wrong, so he went to the public library, found a transcript, and confirmed that ChatGPT was indeed wrong. Then Deep Research was able to give the right answer, so we sent it to him and he was thrilled. [Laughter]
Sonya Huang. What's your rough mental model for the tasks Deep Research is really good at today? When should I use the o-series models, and when should I use Deep Research?
Josh Tobin. Deep Research is really good when you can describe in detail what you want, and getting the best answer involves reading a lot of information on the internet. If your question is vague, it can help you clarify what you want, but it works best when you have a specific set of information to look for.
Isa Fulford. And I think it's very good at synthesis and at finding specific, hard-to-find information. It can generate some new insights from what it reads, but I don't think it's yet making new scientific discoveries. As for the o-series models: for me, if I'm asking for something coding-related, which usually doesn't require knowledge beyond what the model gained from pre-training, then I usually use o1 Pro, o1, or o3-mini-high.
End-to-end training
Lauren Reeder. Deep Research is an excellent example of a new product direction for OpenAI. I'm curious, to the extent you can share, how does it work?
Isa Fulford. The model that powers Deep Research is a fine-tuned version of o3, which is our state-of-the-art reasoning model. We specifically trained it on a collection of complex browsing tasks as well as other reasoning tasks, and it has access to a browsing tool and a Python tool. By training end-to-end on these tasks, it learns strategies for solving them, and the resulting model excels at online search and analysis.
Josh Tobin. And, the way to understand it intuitively is that you make this request, preferably a detailed request about what you want. The model will think hard about that, it will search for information, it will extract information and read it, it will understand how that information relates to that request, and then it will decide what to search for next to get closer to the final answer that you want. And it's trained to do a good job of summarizing all of this information into a neat report that contains references to the original information it found.
Isa Fulford. Yes, I think what's novel about Deep Research as an Agent capability is that because we train it end-to-end, there are a lot of things in a research process that you can't predict in advance. I don't think you could write a program or script around a language model that would be as flexible as what the model can learn through training, where it's reacting to real-time web information and, depending on what it sees, changing its strategy and so on. We actually see it doing very creative searches. If you read the chain-of-thought summary, you can sometimes see it being very clever about figuring out what to search for next or how to get around obstacles.
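To make that search-read-decide cycle concrete, here is a minimal, purely illustrative sketch of the kind of loop the end-to-end trained model effectively learns to run. None of these names are OpenAI APIs; they are hypothetical placeholders.

```python
# Illustrative sketch only: the loop an end-to-end trained research agent
# effectively learns. All functions and names below are hypothetical stubs.
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "search", "open", or "finish"
    query: str = ""
    url: str = ""


def policy_step(question: str, notes: list[str]) -> Action:
    """Placeholder for the trained model choosing its next move
    based on the question and everything it has read so far."""
    return Action(kind="finish")


def search_web(query: str) -> str:
    return f"[search results for: {query}]"        # placeholder


def fetch_page(url: str) -> str:
    return f"[contents of: {url}]"                 # placeholder


def write_report(question: str, notes: list[str]) -> str:
    return "Report with citations drawn from the gathered notes."  # placeholder


def deep_research(question: str, max_steps: int = 40) -> str:
    notes: list[str] = []
    for _ in range(max_steps):
        action = policy_step(question, notes)      # the model decides the next step
        if action.kind == "search":
            notes.append(search_web(action.query))
        elif action.kind == "open":
            notes.append(fetch_page(action.url))
        else:                                      # "finish": enough information gathered
            break
    return write_report(question, notes)
```

The key point from the conversation is that the decision logic inside `policy_step` is learned through training rather than scripted, which is what lets the strategy change mid-run in response to what the model finds.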
Sonya Huang. John Collison posted a question on Twitter that got a lot of attention: how much of Deep Research's magic comes from real-time access to web content, and how much comes from the chain of thought? Can you explain a bit?
Isa Fulford. I think it's absolutely a combination of the two. You can see that because there are other search products that aren't trained end-to-end, and so they're not as flexible in responding to the information they come across, and not as creative in solving particular problems, because they haven't been trained specifically for that purpose. And it's a fine-tuned version of o3, which is a very smart and powerful model, so a lot of the analytical power also comes from the underlying o3 training. So I think it's definitely a combination of both.
Josh Tobin. Before joining OpenAI, I worked at a startup where we were also trying to build Agents, and the way we built them was similar to how most people on the internet describe building Agents: basically, you build an operation graph where certain nodes are language models. So a language model may decide what to do next, but the overall logic of the steps is defined by a human. We found this is a powerful way to build prototypes quickly, but it breaks down in the real world, because it's hard to predict all the scenarios a model might face and to account for all the branches you might want to take.
On top of that, models usually aren't the best decision makers at the nodes of that graph, because they weren't trained to make those decisions; they were trained to do things that only look similar. So I think what's really powerful about this model is that it's trained directly, end-to-end, to solve the kinds of tasks users are actually using it for.
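For contrast, here is a rough sketch, under my own illustrative assumptions, of the hand-wired "operation graph" style Josh describes, where a human fixes the control flow and language models only fill in individual nodes. The `llm` function is a hypothetical placeholder, not a real API.

```python
def llm(prompt: str) -> str:
    """Hypothetical placeholder for a single language-model call."""
    return "..."


def scripted_research(question: str) -> str:
    # A human decided there is always exactly one query-planning step...
    queries = llm(f"Write three search queries for: {question}").splitlines()

    # ...that every query gets searched and summarized exactly once...
    findings = [llm(f"Search and summarize results for: {q}") for q in queries]

    # ...and that the report is always written in a single final pass.
    return llm("Write a report from these findings:\n" + "\n".join(findings))
```

Every branch here was chosen by whoever wrote the script; an end-to-end trained model instead decides on the fly how many searches to run and when to change course based on what it has already read.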
Lauren Reeder. So you don't have to set up a graph or make node-level architectural decisions on the back end?
Isa Fulford. This is entirely driven by the model itself.
Josh Tobin. Yes.
Sonya Huang. Can you elaborate on that? Because it seems like this was a very deliberate decision you made, and clearly it worked. There are a lot of companies building applications on your API that solve specific tasks for specific users through prompting. Do you think those applications would be better served by training end-to-end on their specific workflows?
Isa Fulford. I think if your workflow is very specific and predictable, the approach Josh described makes a lot of sense. But if you have to deal with a lot of edge cases or need a lot of flexibility, then an approach like Deep Research's may be the better option.
Josh Tobin. Yes, my advice to people is that the place for rigid rules is not inside the model. If you have a database or something like that that you don't want the model to touch, it's better to encode that as manually written logic. But I think one of the lessons I've seen people learn time and time again in this field is that we think we can write programs that do things smarter than the models can, but in reality, as the field advances, the models usually find better solutions than humans do.
And perhaps the overarching lesson of machine learning is that you get what you optimize for. So if you can build a system that lets you optimize directly for the results you want, the results will be much better than if you try to stitch together models that weren't optimized end-to-end for the task you're trying to perform. So my long-term guidance is that fine-tuning models with reinforcement learning is probably a key part of building the most powerful Agents.
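As a loose illustration of "you get what you optimize for," here is a toy sketch of outcome-level reinforcement fine-tuning: only the final report is scored, so the model is free to discover its own intermediate strategy. All names are hypothetical placeholders, not OpenAI's training code.

```python
class ResearchPolicy:
    """Stand-in for a model being fine-tuned with reinforcement learning."""

    def run(self, task: str) -> tuple[list[str], str]:
        """Placeholder rollout: browse, take notes, write a report."""
        trajectory: list[str] = []
        report = "draft report"
        return trajectory, report

    def update(self, trajectory: list[str], reward: float) -> None:
        """Placeholder policy-gradient-style update."""
        pass


def grade_report(task: str, report: str) -> float:
    """Placeholder reward: e.g. coverage, accuracy, citation quality."""
    return 0.0


def reinforce_step(policy: ResearchPolicy, task: str) -> None:
    trajectory, report = policy.run(task)   # full end-to-end rollout
    reward = grade_report(task, report)     # score only the final outcome
    policy.update(trajectory, reward)       # reinforce whatever produced it
```

The contrast with the scripted approach above is that nothing in this objective dictates how many searches to run or in what order; those choices emerge from whatever earns high reward on the final report.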
Sonya Huang. What were the biggest technical challenges in realizing Deep Research?
Josh Tobin. Well, maybe I can speak as an observer rather than someone who was involved from the beginning, but it seems that Isa and the rest of the team worked very hard, and one of the keys to their success was building a very high quality dataset. It's one of those age-old lessons in machine learning that people keep re-learning: the quality of the data you feed into the model is probably the biggest factor determining the quality of the model you get out the other end.
Isa Fulford. And then having someone like Edward Sun, another person on the project, who will optimize any dataset. That's the recipe for success.
Lauren Reeder. Find your Edward.
Josh Tobin. Great machine learning model trainer.
Lauren Reeder. How do you guys make sure it's right?
Isa Fulford. Yes, obviously a core part of this model and product is that we want users to be able to trust the output. Part of that is citations, so the user can see the source the model is drawing each piece of information from. During training we also try to make sure the citations are correct, but it's still possible for the model to make mistakes, to hallucinate, or to trust sources that might not be the most trustworthy. So that's definitely an area where we want to keep improving the model.
Deep Research and Operator
Sonya Huang. How should we think about Deep Research in relation to o3 and Operator and other different releases? For example, does Deep Research use Operator? Are they all built on top of each other? Or are they all a series of different applications of o3?
Josh Tobin. Currently these products are separate, but you can imagine where we're headed: at some point people will have access to a single, unified Agent that can not only do web searches or use a computer, or perform any other kind of operation you'd want a human assistant to perform, but can blend all of these capabilities together in a more natural way.
Sonya Huang. What other design decisions have you made that might not be obvious at first glance?
Isa Fulford. I think one of them is the clarification flow. If you've used Deep Research, the model asks you questions before it starts its research, whereas normally ChatGPT might ask follow-up questions at the end of a response, but it doesn't usually do this up front. That's deliberate, because you get the best response from the Deep Research model when the prompt is very clear and detailed. Users don't naturally provide all the information in their first prompt, so if you're going to wait 5 or 30 minutes, we want the response to be as detailed and satisfying as possible. So we added this extra step to make sure the user provides all the details we need.
And I've actually seen a lot of people on Twitter say they have a process where they'll talk to o1 or o1 Pro to help make their prompt more detailed, and once they're happy with it, they send it to Deep Research. That's interesting - people are finding their own workflows for using Deep Research.
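As a rough sketch of that two-stage workflow (clarify first, then run the long research job), under illustrative assumptions - these helper functions are hypothetical and not the ChatGPT implementation:

```python
def ask_clarifying_questions(request: str) -> list[str]:
    """Placeholder: a model turns a vague request into pointed questions."""
    return ["What is your budget?", "Which region matters?", "Any hard constraints?"]


def run_deep_research(detailed_prompt: str) -> str:
    """Placeholder for the 5-to-30-minute research run."""
    return "Comprehensive report with citations."


def research_with_clarification(request: str) -> str:
    questions = ask_clarifying_questions(request)
    answers = [input(q + " ") for q in questions]    # gather the details up front
    detailed_prompt = request + "\n" + "\n".join(
        f"{q} {a}" for q, a in zip(questions, answers)
    )
    return run_deep_research(detailed_prompt)        # only then start the long job
```

The design choice Isa describes is simply to pay the small cost of clarification before committing to a long run, since a richer prompt makes the expensive research step far more likely to land.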
Lauren Reeder. Three different Deep Research products have been released in the last few months. Can you briefly describe what makes yours different and what we should expect from it?
Sonya Huang. And they're all called Deep Research, right?
Josh Tobin. They're all called Deep Research. Yes, there's not much creativity in naming in this field. I think people should try all of these products themselves and get a feel for them. I think there's a difference in quality, and they all have pros and cons, but I think the differences will be obvious. It comes down to the way this model was built: the effort that went into building the dataset, and the engine we use for the o-series models, which lets us optimize the model and make it very smart and high quality.
Sonya Huang. Last year we had the o1 team on the podcast, and we joked about how OpenAI isn't very good at naming things. I'd say Deep Research is your most successfully named product. [Chuckles]
Josh Tobin. Deep Research, right? At least it describes what it does, I guess.
Future outlook
Lauren Reeder. I'd love to hear your vision for the future. You've just launched Deep Research. What do you think it will look like a year from now? And what complementary things do you want to build along the way?
Isa Fulford. We're excited to extend the data sources the model can access. The model we've trained is very good at navigating public information, but it should also be able to search private data. And then it's just a matter of further improving its capabilities: it could be better at browsing, better at analysis. In the short term those are the areas we want to improve.
Josh Tobin. Yes. And then think about how this fits into our broader Agent roadmap. I think the recipe here extends to a very wide range of use cases, in ways that will surprise people with how well it works. The idea is that you take a state-of-the-art reasoning model, you give it access to the same tools humans use to do their jobs or run their daily lives, and then you optimize it directly for the kinds of outcomes you want the Agent to deliver. There's really nothing stopping that recipe from scaling to increasingly complex tasks, so I feel like, yeah, AGI now feels like an operational problem. And I think there's a lot more to look forward to from this general formula.
Lauren Reeder. Sam Altman had a very compelling quote about how Deep Research will take on a single-digit percentage of all economically valuable tasks in the world. How should we understand that statement?
Josh Tobin. I think the fair way to understand it is that Deep Research can't do your entire job for you, but it can save you hours, and in some cases even days, of time. So the goal we may be relatively close to achieving is that Deep Research, and the Agents we build next and on top of it, will save you 1%, 5%, 10%, 25% of your hours, depending on the type of work you do.
Sonya Huang. I mean, I think you've already automated 80% of my job, so...
Lauren Reeder. [ Chuckles ] Definitely higher for me.
Josh Tobin. I think we just need to start writing the checks, then. Yes.
Sonya Huang. Which entire occupational categories do you think are more - "at risk" isn't the right word, but closest to the areas Deep Research is really good at? For example, I'm thinking of consulting. What specific categories do you think are closest?
Josh Tobin. Yes, I used to be a consultant. I don't think any jobs are at risk; I really don't think of this as a labor-replacement kind of thing. But for knowledge work where you have to spend a lot of time sifting through information and drawing conclusions, I think Deep Research is going to give people superpowers.
Isa Fulford. Yes, I find a lot of the medical research use cases very exciting - just the ability to find all the literature on a disease, or all the recent cases. I've seen a lot of physicians posting about Deep Research, or contacting us and saying, "We used it to help find a clinical trial for this patient," or something like that. So it saves time for people who are already very busy, and there may be things they simply didn't have time to do before that they can now.
Josh Tobin. Yes. And I think the impact of that may be more profound than it sounds on the surface, right? It's not just saving 5% of your time; it's that something that might have taken you 4 or 8 hours, you can now do with a ChatGPT subscription and 5 minutes of your time. So if you effectively had unlimited time, what kinds of things would you do? You could probably do many, many more of them.
For example, why not research every startup you could possibly invest in, rather than just the companies you have time to meet with? Things like that.
Sonya Huang. Or on the consumer side, one thing that comes to mind is a working mom who's too busy to plan a birthday party for her toddler - now that becomes feasible. So I agree with you, it's much more significant than saving 5% of your time.
Josh Tobin. Yes.
Lauren Reeder. These are things you couldn't do before.
Isa Fulford. That's right.
Sonya Huang. How will this change education and what we teach? Now that we're in the world of Agents and Deep Research, what do you teach kids?
Josh Tobin. Education has always been one of the primary uses of ChatGPT - and this is true for ChatGPT in general. Learning by talking to an AI system that adapts to what you tell it, or that in the future personalizes based on what it knows about you, feels like a more effective and engaging way to learn than reading a textbook.
Lightning Question Session
Lauren Reeder. We have a few lightning-round questions for you.
Josh Tobin. Okay.
Sonya Huang. Right. What are some of your favorite Deep Research use cases?
Josh Tobin. I'd say personalized education - just learning anything I want to learn.
Isa Fulford. I've mentioned this already, but I think the personal stories people share about finding information on diseases that they or their families have are all great.
Sonya Huang. Great. We've seen some application categories break out in the last year - coding is a clear example. What application categories do you think will break out this year?
Josh Tobin. I mean, obviously, Agents.
Isa Fulford. I'd say the same.
Sonya Huang. Okay, 2025 is the year of the Agent.
Josh Tobin. I think so.
Lauren Reeder. So what would you recommend people read to learn more about where Agents or AI are going? It could also be particular authors.
Sonya Huang. Training Data podcast. [Laughter]
Josh Tobin. I think keeping up with the latest developments in AI is tricky. The general recommendation I give people is to pick one or two sub-topics that really interest you and then curate a list of people you think are saying interesting things about them. Actually, maybe this is a good use case for Deep Research: use it to dig into the things you want to know more about.
Isa Fulford. This is a bit dated now, but I watched it a few years ago - I think it was called Foundations of RL or something like that, from Pieter Abbeel. It's a bit dated, but I think it's a great introduction to reinforcement learning.
Josh Tobin. Yeah, I'd definitely endorse anything by Pieter Abbeel. He was my graduate advisor.
Isa Fulford. Oh, yeah.
Sonya Huang. Okay. Reinforcement learning went through a boom and then seemed to fall back into a lull. Is that the right way to read the current state of reinforcement learning?
Josh Tobin. It's back. Yes.
Sonya Huang. It's back. Why? Why now?
Josh Tobin. Because everything else started working. If anyone has been following this space for a while, they might remember Yann LeCun's cake analogy?
Sonya Huang. Talk about it.
Josh Tobin. So, if you're making a cake, most of the cake is the body, then there's a little bit of frosting, and then a cherry on top. The analogy is that unsupervised learning is the body of the cake, supervised learning is the frosting, and reinforcement learning is the cherry on top.
When we were doing reinforcement learning research in 2015 and 2016, Yann LeCun's point - which in retrospect I think was probably correct - was that we were trying to add the cherry without having a cake. But now we have language models pre-trained on massive amounts of data, and their capabilities are remarkable. We know how to do supervised fine-tuning on these language models so that they're good at following instructions and generally doing what people want them to do.
So now that all of that works very well, it's time to fine-tune those models with reinforcement learning for any kind of use case where you can define a reward function.
Sonya Huang. Great. So from this lightning round: we got your favorite Deep Research use cases, Agents will be the breakout category of 2025, and reinforcement learning is back. I love it. Thank you so much for joining us - we really enjoyed the conversation. Congratulations on releasing a great product, and we can't wait to see what comes next.