
OpenAI o1 (Reasoning Model) User Guide: Suggestions for Writing Prompts

OpenAI o1 series models are new large language models trained with reinforcement learning to perform complex reasoning. o1 models think before they answer and can produce a long internal chain of thought before responding to the user. They excel at scientific reasoning: they rank in the 89th percentile on competitive programming problems (Codeforces), place among the top 500 students in the U.S. in a qualifier for the USA Math Olympiad (AIME), and exceed human Ph.D.-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA).

 


Two reasoning models are available in the API:

1. o1-preview: an early preview of our o1 model, designed to reason about hard problems using broad general knowledge about the world.
2. o1-mini: a faster, cheaper version of o1 that is particularly good at coding, math, and science tasks that do not require extensive general knowledge.

 

The o1 models offer a significant advance in reasoning, but they are not intended to replace GPT-4o in all use cases.

For applications that need image inputs, function calling, or consistently fast response times, GPT-4o and GPT-4o mini remain the right choice. However, if you are building applications that demand deep reasoning and can accommodate longer response times, the o1 models may be an excellent fit. We look forward to seeing what you create with them!

 

🧪 The o1 models are currently in beta

The o1 models are currently in beta with limited features. Access is restricted to developers on usage tier 5 (check your usage tier), and low rate limits apply. We are working on adding more features, raising rate limits, and expanding access to more developers in the coming weeks!

 

Quick Start

Both o1-preview and o1-mini can be used through the chat completions endpoint.

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": "Write a Bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."
        }
    ]
)

print(response.choices[0].message.content)

Depending on the amount of reasoning required by the model to solve the problem, these requests can take anywhere from a few seconds to a few minutes.
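If you want to see where a given prompt falls in that range, you can time the call; here is a minimal sketch using Python's standard time module (the timing wrapper is illustrative, not part of the API):

import time

from openai import OpenAI

client = OpenAI()

start = time.monotonic()
response = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
)
elapsed = time.monotonic() - start

# Longer reasoning chains mean longer waits; budget client timeouts accordingly.
print(f"answered in {elapsed:.1f}s")
print(response.choices[0].message.content)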

 

Beta Restrictions

During the beta, many chat completion API parameters are not yet available. Most notably:

  • Modalities: text only; images are not supported.
  • Message types: only user and assistant messages are supported; system messages are not.
  • Streaming: not supported.
  • Tools: tools, function calling, and response format parameters are not supported.
  • Logprobs: not supported.
  • Other: `temperature`, `top_p`, and `n` are fixed at `1`, while `presence_penalty` and `frequency_penalty` are fixed at `0` (a defensive helper sketch follows this list).
  • Assistants and Batch: these models are not supported in the Assistants API or the Batch API.
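If existing call sites pass sampling or tool parameters, one way to reuse them with o1 during the beta is to strip the unsupported parameters first. The helper below is hypothetical (not part of the OpenAI SDK), a minimal sketch based on the restrictions listed above:

# Hypothetical helper: removes parameters the o1 beta does not accept,
# mirroring the restrictions listed above. Adjust as the beta evolves.
UNSUPPORTED_O1_PARAMS = {
    "temperature", "top_p", "n",
    "presence_penalty", "frequency_penalty",
    "stream", "tools", "tool_choice",
    "response_format", "logprobs", "top_logprobs",
}

def o1_safe_kwargs(kwargs):
    """Return a copy of kwargs without parameters unsupported by o1."""
    return {k: v for k, v in kwargs.items() if k not in UNSUPPORTED_O1_PARAMS}

# Usage: client.chat.completions.create(model="o1-mini", **o1_safe_kwargs(kwargs))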

As we move out of beta, support for some of these parameters will be added. Future models in the o1 series will include features such as multimodality and tool use.

 

 

How reasoning works

The o1 models introduce reasoning tokens. The models use these reasoning tokens to "think": they break down their understanding of the prompt and consider multiple approaches to generating a response. After generating reasoning tokens, the model produces its answer as visible completion tokens and discards the reasoning tokens from its context.

Below is an example of a multi-step conversation between a user and an assistant. The input and output tokens from each step are retained, while the reasoning tokens are discarded.

[Figure: a multi-turn conversation in which reasoning tokens are not retained in context between turns]
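In code, a multi-turn conversation is therefore assembled exactly as with other chat models: you pass back only the visible user and assistant messages, and reasoning tokens are handled server-side. A minimal sketch:

from openai import OpenAI

client = OpenAI()

# First turn: the model reasons internally, then returns visible text.
history = [{"role": "user", "content": "Outline an algorithm to transpose a matrix."}]
first = client.chat.completions.create(model="o1-preview", messages=history)

# Only the visible answer is carried forward; the reasoning tokens
# behind it were already discarded from context.
history.append({"role": "assistant", "content": first.choices[0].message.content})
history.append({"role": "user", "content": "Now implement it as a Bash script."})

second = client.chat.completions.create(model="o1-preview", messages=history)
print(second.choices[0].message.content)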

 

Although reasoning tokens are not visible via the API, they still occupy space in the model's context window and are billed as output tokens.

 

Manage Context Window

The o1-preview and o1-mini models offer a context window of 128,000 tokens. Each generation has an upper limit on the maximum number of output tokens, which includes both the invisible reasoning tokens and the visible completion tokens. The maximum output token limits are as follows (a back-of-the-envelope sketch follows the list):

  • o1-preview: up to 32,768 tokens
  • o1-mini: up to 65,536 tokens
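As an illustration of how these caps interact with the context window (the prompt size below is a made-up example; the constants come from the figures above):

# Illustrative arithmetic only.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT_O1_PREVIEW = 32_768

prompt_tokens = 4_000  # hypothetical prompt size
# Output (reasoning + visible) is bounded by the smaller of the model's
# output cap and the space remaining in the context window.
output_budget = min(MAX_OUTPUT_O1_PREVIEW, CONTEXT_WINDOW - prompt_tokens)
print(output_budget)  # 32768: here the output cap, not the window, binds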

 

When generating content, it is important to ensure there is enough room in the context window for reasoning tokens. Depending on the complexity of the problem, the model may generate anywhere from a few hundred to tens of thousands of reasoning tokens. The exact number of reasoning tokens used is shown in the `completion_tokens_details` field of the usage object in the chat completion response:

usage: {
  total_tokens: 1000,
  prompt_tokens: 400,
  completion_tokens: 600,
  completion_tokens_details: {
    reasoning_tokens: 500
  }
}
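In the Python SDK, the same fields are exposed as attributes on the response object; a minimal sketch, assuming a recent SDK version that reports reasoning token details:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)

usage = response.usage
print("total:", usage.total_tokens)
print("prompt:", usage.prompt_tokens)
print("completion:", usage.completion_tokens)
# Reasoning tokens are reported in the nested details object.
print("reasoning:", usage.completion_tokens_details.reasoning_tokens)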

 

Control costs

To manage costs with the o1 family of models, you can use the `max_completion_tokens` parameter to limit the total number of tokens the model generates (both reasoning and visible completion tokens).

In previous models, the `max_tokens` parameter controlled both the number of tokens generated and the number of tokens visible to the user, which were always equal. In the o1 family, however, the total number of generated tokens can exceed the number of visible tokens because of the internal reasoning tokens.

Because some applications may rely on `max_tokens` matching the number of tokens received from the API, the o1 series introduces `max_completion_tokens` to explicitly control the total number of tokens generated by the model, including both reasoning and visible completion tokens. This explicit opt-in ensures existing applications do not break when using the new models. For all earlier models, the `max_tokens` parameter keeps its original behavior.
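For example (a minimal sketch; the budget value is arbitrary):

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Plan a database schema for a quiz app."}],
    # Caps reasoning tokens plus visible completion tokens combined.
    max_completion_tokens=4_000,
)
print(response.choices[0].message.content)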

 

Allow room for reasoning

If the generated tokens reach the context window limit or the `max_completion_tokens` value you set, you will receive a chat completion response with `finish_reason` set to `length`. This may happen before any visible completion tokens are produced, meaning you could pay for input and reasoning tokens without receiving a visible response.

To prevent this, make sure there is enough space in the context window, or raise the `max_completion_tokens` value. OpenAI recommends reserving at least 25,000 tokens for reasoning and output when you start experimenting with these models. Once you know how many reasoning tokens your prompts need, you can adjust this buffer accordingly.
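One defensive pattern is to check `finish_reason` and retry with a larger budget when output was cut off. The retry policy below is an assumption for illustration, not an API feature:

from openai import OpenAI

def ask_with_budget(client, prompt, budget=25_000, retries=2):
    """Call o1-preview, doubling the token budget if generation is cut off."""
    for _ in range(retries + 1):
        response = client.chat.completions.create(
            model="o1-preview",
            messages=[{"role": "user", "content": prompt}],
            max_completion_tokens=budget,
        )
        choice = response.choices[0]
        if choice.finish_reason != "length":
            return choice.message.content
        budget *= 2  # the budget may have been spent entirely on reasoning
    return None  # still truncated after all retries

answer = ask_with_budget(OpenAI(), "Design a URL shortener end to end.")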

 

 

Prompting suggestions

 

These models perform best with clear, concise prompts. Some prompt engineering techniques (such as few-shot prompting or telling the model to "think step by step") may not improve performance and can sometimes be counterproductive. Here are some best practices:

  • Keep prompts simple and direct: these models excel at understanding and responding to short, clear instructions without needing much guidance.
  • Avoid chain-of-thought prompts: since these models reason internally, there is no need to tell them to "think step by step" or "explain your reasoning".
  • Use delimiters for clarity: delimiters such as triple quotes, XML tags, or section headings clearly mark the different parts of the input and help the model interpret each part correctly (see the sketch after this list).
  • Limit additional context in retrieval-augmented generation (RAG): when providing additional context or documents, include only the most relevant information, to avoid overcomplicating the model's response.
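For instance, XML-style tags can separate instructions from source material; the tag names below are arbitrary:

prompt = """
<instructions>
Answer the question using only the documents provided. Quote the single
most relevant passage verbatim before answering.
</instructions>

<question>
What do the o1 models use reasoning tokens for?
</question>

<documents>
(only the most relevant excerpts go here)
</documents>
"""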


 

Examples of prompts

Coding (refactoring)

The OpenAI o1 family of models can implement complex algorithms and generate code. The following prompt asks o1 to refactor a React component according to specific criteria.

from openai import OpenAI

client = OpenAI()

prompt = """
Instructions:
- Modify the React component below so that nonfiction books have red text.
- Return only the code in your response; do not include any additional formatting such as markdown code blocks.
- For formatting, use four-space indentation and do not let any line of code exceed 80 columns.

const books = [
    { title: 'Dune', category: 'fiction', id: 1 },
    // ...remaining book entries omitted in the source...
];

export default function BookList() {
    const listItems = books.map(book =>
        <li>
            {book.title}
        </li>
    );
    return (
        <ul>{listItems}</ul>
    );
}
"""

response = client.chat.completions.create(
    model="o1-mini",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Coding (planning)

The OpenAI o1 family of models also excels at creating multi-step plans. This example prompt asks o1 to create the file system structure for a complete solution and provide the Python code implementing the desired use case.

from openai import OpenAI

client = OpenAI()

prompt = """
I want to build a Python application that takes a user question and looks up the corresponding answer in a database. If a close match is found, it returns the matching answer. If no match is found, it asks the user for an answer and stores the question/answer pair in the database. Create a plan for the directory structure for this purpose, then return the complete contents of each file. Provide your reasoning only at the beginning and end, not in the middle of the code.
"""

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

     

STEM research

The OpenAI o1 family of models performs well in STEM research. Prompts that ask for support with basic research tasks typically show strong results.

from openai import OpenAI

client = OpenAI()

prompt = """
Which three compounds should we consider studying to advance research on new antibiotics? Why should we consider them?
"""

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": prompt
        }
    ]
)

print(response.choices[0].message.content)

     

     

Use case examples

Some examples of real-world use cases for o1 can be found in the cookbook.
