The o1 is not a chat model and teaches you how to properly energize o1 capabilities

AI News7mos agorelease AI Sharing Circle

1.4K 00

How to get it right o1: don't write prompts; write briefs, focus on goals: describe what you want tonothingNot what you want.what wayGet it and be aware of the pros and cons of o1!

Since the release of the o1 in October and the announcement of the o1 pro/o3 in December, many people have struggled to make sense of their perceptions, both positive and negative. We took a strong positive stance at the low point of the o1 Pro sentiment and mapped out what it might take for OpenAI to launch a $2,000 per month proxy product (rumored to be coming in the next few weeks). Since then, o1 has been on all LMArena The charts have been steady at number one.

Since then, he's launched Dawn Analytics and continues to post unfiltered thoughts about o1 - initially as a loud skeptic and slowly becoming an everyday user. We love the various meanings of people who change their minds and think the same conversation is happening all over the world as people struggle to transition from chat mode to the new world of reasoning and hundreds of dollars a month for specialized AI products, now GA))). Here are our thoughts.

How did I go from hating o1 to using it every day to solve my most important problems?

I learned how to use it.

When the o1 pro was released, I didn't hesitate to subscribe.In order to justify the $200 per month price tag, it only needs to provide 1-2 engineers' hours per month

But at the end of the day of trying to get the model to work, I concluded thatIt's garbage.The

Every time I ask a question, I have to wait 5 minutes and am greeted with a lot of contradictory gibberish with unsolicited architecture diagrams + a list of pros and cons.

o1 Answer my question and contradict yourself many times.

Of course, people often go very wild about OpenAI after release (which is the second best strategy for going viral, after negative reviews.)

But this feels different - these perceptions come from people in difficult situations.

As I started talking to people who disagreed with me, the more I realized I was completely wrong:

I use o1 like a chat model - but o1 is not a chat model.

How to use o1 correctly

If o1 is not a chat model, what is it?

I think of it like a "report generator". If you give it enough context and tell it what you want to output, it usually solves the problem once and for all.

swyx's note: OpenAI did publish a proposal for prompting o1, but we think it's incomplete, and in a sense you can think of this article as the "missing manual" for practical experience with o1 and o1 pro in practice.

1. Don't write prompts; write briefs

Provide lots of context. Whatever you think I mean by "a lot" - multiply it by 10.

When you use a program like Claude When modeling chat like 3.5 Sonnet or 4o, you usually start with a simple question and some context. If the model needs more context, it will usually ask you for it (or it will be obvious from the output).

You iterate back and forth with the model, correcting it + extending the requirements until you get the desired output. It's almost like pottery.The chat model essentially extracts context from you through this back and forth. Over time, our problem became faster + lazier - as lazy as possible while still getting good output.

o1 will only accept laziness literally and will not try to extract context from you. Instead, you need toPush as much context as possible to o1The

Even if you're just asking a simple engineering question:

Explain all the ways you've tried that didn't work
Add a full dump of all database schemas
Explain what your company does and how big it is (and define company-specific terms)

In short, treat o1 as a new hire. Note that errors in *o1 include reasoning about how much it should reason. *Sometimes the variance fails to map accurately to task difficulty. For example, if the task is really, really easy, it usually goes down a rabbit hole of reasoning for no apparent reason.Note: The o1 API allows you to specify low/medium/high reasoning_effort, but the ChatGPT Not available to users.

Make it easier for o1 to get contextual hints

I suggest using your mac/phone on the Voice Memos appI just describe the whole problem space for 1-2 minutes and then paste the text. I just describe the entire problem space for 1-2 minutes and then paste that text.
- I actually have a note where I keep long segments of context to be reused.
- swyx: I use Sarav's Careless in LS Discord. Whisper
AI assistants that pop up inside the product can often make this extraction easier. For example, if you use Supabase, try asking Supabase Assistant to dump/describe all relevant tables/RPCs etc.

swyx: I'd change the beginning to "Spend 10x more time on prompts"

2. Focus on the goal: describe what you wantnothingNot what you want.what wayGet it.

Once you've populated the model with as much context as possible -Focus on explaining what you want the output to be.

For most models, we have gotten used to telling the model that we want it towhat wayAnswer us. For example, "You are a professional software engineer. Think slowly and carefully"

This is the opposite of what I find o1 successful. I don't direct it.what wayDo - only instruct itnothing. Then let o1 take over and plan and solve their own steps. This is the purpose of autonomous reasoning, and may actually be much faster than you can manually review and chat as an "artificial in the loop".

swyx's poor attempt at illustration

It requires that youReally know exactly what you want.(And you really should ask for a specific output in each prompt - it can only be reasoned about at the beginning!)

Sounds easier than it is! Do I want o1 to implement a specific architecture in production, create a minimal test application, or just explore options and list pros and cons? These are completely different requirements.

o1 usually explains concepts by default using report-style syntax - fully numbered headings and subheadings. If you want to skip the explanation and output the full document - you just need to state it explicitly.

Professional tips from swyx: Establishing really good criteria for "good" and "bad" helps you toGive the model a way to evaluate its own output and self-improve/fix its own errorsThe

As an added benefit, this will eventually give you LLM as an evaluator tool that you can use for intensive fine-tuning during GA.

Since learning how to use o1, I've been blown away by its ability to generate the right answer the first time. It's actually better in almost every way (except cost/latency).

Here are some of the moments that particularly stand out:

3. Understanding the advantages and disadvantages of o1

o1 Advantages:

Perfect for generating whole/multiple files at once: So far, this is the most impressive capability of o1. I copy/paste a lot of code, and a lot of context about what I'm building, and it generates the entire file (or multiple files!) in a single pass completely ), usually without errors, and following existing patterns in my codebase.
Fewer hallucinations: In general, it seems to confuse things less. For example, o1 is really good at customizing query languages (like ClickHouse and New Relic), whereas Claude often confuses the syntax of Postgres.
**MEDICAL DIAGNOSIS:** My girlfriend is a dermatologist - so whenever any of my friends or any member of my extended family has any skin problems, they send her a pic! For fun, I've started asking o1 at the same time. it's usually very close to the right answer - about 3/5 of the time. It's more useful for medical professionals -It almost always provides an extremely accurate differential diagnosis.
**Explaining Concepts:** I found it to be very good at explaining very difficult engineering concepts with examples. It's almost like generating an entire article. When I'm dealing with difficult architectural decisions, I'll often have o1 generate multiple plans, each with pros/cons, and even compare those plans. I'll copy/paste the responses as PDFs and compare them - almost as if I'm considering proposals.
**Reward: assessment. **I've always been skeptical of using LLM as a jury for evaluation, because fundamentally, jury models typically experience the same failure modes as the model that initially generated the output. However, o1 shows great promise - it is usually able to determine if the generation is correct in very little context.

Disadvantages of o1 (for now):

**Writing in a specific voice/style:** No, I didn't use o1 for this post 🙂 .
I find it very bad at writing anything, especially in terms of a particular voice or style. It has a very academic/corporate reporting style that it wants to follow. I think there's just a lot of reasoning Token Lean the tone in that direction and it's hard to get rid of it.
Here's an example of me trying to get it to write this article - this is after much back and forth - it's just trying to produce a bland school report.

Build the entire application:o1 is very good at generating entire files at once. Still, despite some of the more ...... optimistic ...... demos you might see on Twitter - o1 won't build the entire SaaS for you, at least not after themagnanimousof iterations. But itpossible** Generate almost entire functions at once, especially front-end or simple back-end functionsThe

AI News

The article is copyrighted and should not be reproduced without permission.

Free use of the newest Gemini Experimental 1114 model to hit the charts!

AI News

9mos ago

01.4K

Windsurf Wave 2 重大更新：引入网页搜索和自动化记忆功能，并提供企业级混合部署版本

Windsurf Wave 2 Major Update: Introduces Web Search and Automated Memory Features with Enterprise Hybrid Deployment Edition

AI News

7mos ago

01.6K

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

AI News

5mos ago

01.1K

Those weird little devices at the Consumer Electronics Show (CES) 2025

AI News

7mos ago

01.2K

No comments

You must be logged in to leave a comment!

No comments...

The o1 is not a chat model and teaches you how to properly energize o1 capabilities

How to use o1 correctly

1. Don't write prompts; write briefs

2. Focus on the goal: describe what you wantnothingNot what you want.what wayGet it.

3. Understanding the advantages and disadvantages of o1

450 to train an 'o1-preview'?UC Berkeley open-sources 32B inference model Sky-T1, AI community abuzz

Chongqing University goes fully online with exclusive AI counselor, which has been used by more than 10,000 students

Related posts

Free use of the newest Gemini Experimental 1114 model to hit the charts!

Windsurf Wave 2 Major Update: Introduces Web Search and Automated Memory Features with Enterprise Hybrid Deployment Edition

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

Those weird little devices at the Consumer Electronics Show (CES) 2025

No comments

Latest Collections

Latest Articles

The o1 is not a chat model and teaches you how to properly energize o1 capabilities

How to use o1 correctly

1. Don't write prompts; write briefs

2. Focus on the goal: describe what you wantnothingNot what you want.what wayGet it.

3. Understanding the advantages and disadvantages of o1

450 to train an 'o1-preview'?UC Berkeley open-sources 32B inference model Sky-T1, AI community abuzz

Chongqing University goes fully online with exclusive AI counselor, which has been used by more than 10,000 students

Related posts

Free use of the newest Gemini Experimental 1114 model to hit the charts!

Windsurf Wave 2 Major Update: Introduces Web Search and Automated Memory Features with Enterprise Hybrid Deployment Edition

Project-level code generation results are in! o3/Claude 3.7 leads the way, R1 is in the top tier!

Those weird little devices at the Consumer Electronics Show (CES) 2025

No comments

Selected AI Tools

Latest Collections

Latest Articles