
Ollama API User's Guide

summary

Ollama provides a powerful REST API that lets developers interact easily with large language models. Through the Ollama API, users can send requests and receive model-generated responses, which can be applied to tasks such as natural language processing and text generation. This guide covers the basic operations of generating completions and chat (dialogue) completions in detail, along with common operations such as creating, copying, and deleting models.

 

Contents

Answer Completion \ Dialogue Completion \ Creating Models \ Copying Models \ Deleting Models \ Listing Running Models \ Listing Local Models \ Showing Model Information \ Pulling Models \ Pushing Models \ Generating Embeddings


 

I. Answer Completion

POST /api/generate

Generates a response to a given prompt using the specified model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and other data from the request.

parameters

  • model: (required) the model name
  • prompt: the prompt for which to generate a response
  • suffix: text to append after the model's response (useful for fill-in-the-middle completion; see the sketch after this list)
  • images: (optional) a list of base64-encoded images (for multimodal models such as llava)
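
For instance, suffix enables fill-in-the-middle completion with code models trained for it. A minimal sketch, assuming a codellama:code model has been pulled locally:

curl http://localhost:11434/api/generate -d '{
  "model": "codellama:code",
  "prompt": "def compute_gcd(a, b):",
  "suffix": "    return result",
  "options": { "temperature": 0 },
  "stream": false
}'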

Advanced Parameters (optional):

  • format: the format of the returned response. Currently the only accepted value is json
  • options: other model parameters, such as temperature, seed, etc.
  • system: the system message
  • template: the prompt template to use
  • context: the context parameter returned from a previous request to /generate; it can be sent again to keep a short conversational memory (see the context-reuse sketch at the end of this section)
  • stream: if set to false, the response will be returned as a single response object rather than a stream of objects
  • raw: if set to true, no formatting will be applied to the prompt. You may choose to use the raw parameter if you are specifying a fully templated prompt in your request to the API
  • keep_alive: controls how long the model stays loaded in memory after the request (default: 5m); see the sketch after this list
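
As an aside, keep_alive can also be used to unload a model explicitly. A minimal sketch, following the documented convention that 0 unloads the model immediately (and -1 keeps it loaded indefinitely):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "keep_alive": 0
}'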

Example request (streaming)

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the grass green?"
}'

Tip.

If you are using the curl command on Windows, first download curl for Windows and extract the archive. Locate the bin subdirectory inside the extracted folder and add its path to your system's environment variables.

Run the following command in a Command Prompt window (note: not PowerShell) to check whether curl was added successfully.

curl --help

If curl's help information is displayed, it was added successfully.


Tip.

When sending requests with curl from the Windows command line, note that the double quotes must be escaped. An example command is shown below.

curl http://localhost:11434/api/generate -d "{\"model\": \"llama3.1\", \"prompt\": \"Why the grass is green\"}"

If a model response is returned, the request was successful.


Example Response

A stream of JSON objects is returned:

{
  "model": "llama3.1",
  "created_at": "2024-08-08T02:54:08.184732629Z",
  "response": "plant",
  "done": false
}

The final response in the stream also includes additional data about the generation:

  • context: an encoding of the conversation used in this response; it can be sent in the next request to preserve conversational memory
  • total_duration: time spent generating the response (in nanoseconds)
  • load_duration: time taken to load the model (in nanoseconds)
  • prompt_eval_count: number of tokens in the prompt
  • prompt_eval_duration: time taken to evaluate the prompt (in nanoseconds)
  • eval_count: number of tokens in the response
  • eval_duration: time taken to generate the response (in nanoseconds)
  • response: empty if the response was streamed; if not streamed, this contains the full response

To calculate the response generation rate in tokens per second (token/s), compute eval_count / eval_duration * 10^9.

Final Response:

{
  "model": "llama3.1",
  "created_at": "2024-08-08T02:54:10.819603411Z",
  "response": "",
  "done": true,
  "done_reason": "stop",
  "context": [1, 2, 3],
  "total_duration": 8655401792,
  "load_duration": 5924129727,
  "prompt_eval_count": 17,
  "prompt_eval_duration": 29196000,
  "eval_count": 118,
  "eval_duration": 2656329000
}
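
For example, plugging in the values from the final response above: eval_count / eval_duration * 10^9 = 118 / 2656329000 * 10^9 ≈ 44.4 token/s.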

Advanced Usage

Non-streaming output

Set stream to false to receive all responses at once instead of a stream of objects.

Example Request

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the grass green?",
  "stream": false
}'

Example Response

{
  "model": "llama3.1",
  "created_at": "2024-08-08T07:13:34.418567351Z",
  "response": "Answer: leaves contain a lot of chlorophyll.",
  "done": true,
  "done_reason": "stop",
  "context": [1, 2, 3],
  "total_duration": 2902435095,
  "load_duration": 2605831520,
  "prompt_eval_count": 17,
  "prompt_eval_duration": 29322000,
  "eval_count": 13,
  "eval_duration": 266499000
}

JSON mode

Set format to json to make the output a JSON object. Note, however, that the prompt should also instruct the model to respond in JSON; otherwise the model may generate large amounts of whitespace.

Example Request

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the grass green? Output the answer in JSON format.",
  "format": "json",
  "stream": false
}'

Example Response

{
  "model": "llama3.1",
  "created_at": "2024-08-08T07:21:24.950883454Z",
  "response": "{\n  \"Cause of color\": \"Leaves contain chlorophyll, which is needed for photosynthesis.\",\n  \"Function\": \"Performs photosynthesis to absorb solar energy.\"\n}",
  "done": true,
  "done_reason": "stop",
  "context": [1, 2, 3],
  "total_duration": 3492279981,
  "prompt_eval_duration": 28804000,
  "eval_count": 40,
  "eval_duration": 851206000
}

The value of response is a string containing JSON similar to the following:

{
  "Cause of color": "Leaves contain chlorophyll, which is needed for photosynthesis.",
  "Function": "Performs photosynthesis to absorb solar energy."
}

Input with images

To submit an image to a multimodal model (such as llava or bakllava), provide a list of base64-encoded images in the images parameter:

Example Request

curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt": "Describe this image",
  "stream": false,
  "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+V..."]
}'

Example Response

{
  "model": "llava",
  "created_at": "2024-08-08T07:33:55.481713465Z",
  "response": " The image shows a cartoon of an animated character that resembles a cute pig with large eyes and a smiling face. It appears to be in motion, indicated by the lines extending from its arms and tail, giving it a dynamic feel as if it is waving or dancing. The style of the image is playful and simplistic, typical of line art or stickers. The character's design has been stylized with exaggerated features such as large ears and a smiling expression, which adds to its charm.",
  "done": true,
  "done_reason": "stop",
  "context": [1, 2, 3],
  "total_duration": 2960501550,
  "prompt_eval_duration": 758437000,
  "eval_count": 108,
  "eval_duration": 2148818000
}

Reproducible output

Set seed to a fixed value to get reproducible output:

Example Request

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the grass green?",
  "stream": false,
  "options": {
    "seed": 1001
  }
}'

Example Response

{
  "model": "llama3.1",
  "created_at": "2024-08-08T07:42:28.397780058Z",
  "response": "Answer: because leaves contain a lot of chloride ions.",
  "done": true,
  "done_reason": "stop",
  "context": [1, 2, 3],
  "total_duration": 404791556,
  "load_duration": 18317351,
  "prompt_eval_count": 17,
  "prompt_eval_duration": 22453000,
  "eval_count": 16,
  "eval_duration": 321267000
}

 

II. Dialogue Completion

POST /api/chat

Generates the next message in a chat using the specified model. This is also a streaming endpoint, so there will be a series of responses. Streaming can be disabled by setting stream to false. The final response object includes the requested statistics and additional data.

parameters

  • model: (required) the model name
  • messages: the messages of the chat; these can be used to keep a chat memory
  • tools: tools for the model to use, if supported; requires stream to be set to false (see the tool-calling sketch after the parameter lists below)

The message object has the following fields:

  • role: the role of the message: system, user, assistant, or tool
  • content: the content of the message
  • images (optional): a list of images to include in the message (for multimodal models such as llava)
  • tool_calls (optional): a list of tools the model wants to use

Advanced Parameters (optional):

  • format: the format of the returned response. Currently the only accepted value is json
  • options: other model parameters, such as temperature, seed, etc.
  • stream: if set to false, the response will be returned as a single response object rather than a stream of objects
  • keep_alive: controls how long the model stays loaded in memory after the request (default: 5m)
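
For tool calling, the tools parameter takes a list of function definitions in JSON Schema form. A minimal sketch, where get_current_weather is a hypothetical tool invented for illustration; a model that supports tools would reply with a message.tool_calls list instead of plain text:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "What is the weather today in Paris?" }
  ],
  "stream": false,
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "City name, e.g. Paris" }
        },
        "required": ["location"]
      }
    }
  }]
}'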

Example request (streaming)

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "Why is the grass green?"
    }
  ]
}'

Example Response

Returns a stream of JSON objects:

{
  "model": "llama3.1",
  "created_at": "2024-08-08T03:54:36.933701041Z",
  "message": {
    "role": "assistant",
    "content": "because"
  },
  "done": false
}

Final Response:

{
  "model": "llama3.1",
  "created_at": "2024-08-08T03:54:37.187621765Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 5730533217,
  "load_duration": 5370535786,
  "prompt_eval_count": 17,
  "prompt_eval_duration": 29621000,
  "eval_count": 13,
  "eval_duration": 273810000
}

Advanced Usage

Non-streaming output, JSON mode, multimodal input, and reproducible output are parameterized in the same way as in the Answer Completion API.

With history

Send chat messages together with the conversation history. Multi-turn conversations and chain-of-thought prompting can both be started this way.

Example Request

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "Why is the grass green?"
    },
    {
      "role": "assistant",
      "content": "Because grass contains chlorophyll."
    },
    {
      "role": "user",
      "content": "Why does chlorophyll make grass look green?"
    }
  ],
  "stream": false
}'

Example Response

{
  "model": "llama3.1",
  "created_at": "2024-08-08T07:53:28.849517802Z",
  "message": {
    "role": "assistant",
    "content": "That is a more complex question! \n\nChlorophyll is a pigment that absorbs light energy. In daylight, the chlorophyll in green grass leaves absorbs blue and red light but reflects yellow and green light, so the grass looks green to us. \n\nSimply put, chlorophyll makes grass look green because it reflects the green light our eyes can see while not reflecting the other colors."
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 5065572138,
  "load_duration": 2613559070,
  "prompt_eval_count": 48,
  "prompt_eval_duration": 37825000,
  "eval_count": 106,
  "eval_duration": 2266694000
}

III. Creating Models

POST /api/create

Creates a model from a Modelfile. It is recommended to set modelfile to the contents of the Modelfile itself, rather than just setting path; this is a requirement for remote model creation. Remote model creation must also explicitly create any file blobs (for fields such as FROM and ADAPTER) using Create a Blob, setting those fields to the paths indicated in the response.

parameters

  • name: name of the model to create
  • modelfile (optional): contents of the Modelfile
  • stream (optional): if false, the response will be returned as a single response object rather than a stream of objects
  • path (optional): path to the Modelfile

Example Request

curl http://localhost:11434/api/create -d '{
  "name": "mario",
  "modelfile": "FROM llama3\nSYSTEM You are mario from Super Mario Bros."
}'

Example Response

A stream of JSON objects is returned. Note that the final JSON object shows "status": "success", indicating that the model was created successfully.

{"status": "Reading model metadata"}
{"status": "creating system layer"}
{"status": "using already created layer sha256:22f7f8ef5f4c791c1b03d7eb414399294764d7cc82c7e94aa81a1feb80a983a2"}
{"status": "using already created layer sha256:8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b"}
{"status": "using already created layer sha256:7c23fb36d80141c4ab8cdbb61ee4790102ebd2bf7aeff414453177d4f2110e5d"}
{"status": "using already created layer sha256:2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988"}
{"status": "using already created layer sha256:2759286baa875dc22de5394b4a925701b1896a7e3f8e53275c36f75a877a82c9"}
{"status": "writing layer sha256:df30045fe90f0d750db82a058109cecd6d4de9c90a3d75b19c09e5f64580bb42"}
{"status": "writing layer sha256:f18a68eb09bf925bb1b669490407c1b1251c5db98dc4d3d81f3088498ea55690"}
{"status": "Writing manifest"}
{"status": "success"}

Check if the Blob exists

HEAD /api/blobs/:digest

Ensures that the file blob used for a FROM or ADAPTER field exists on the server. This checks your Ollama server, not ollama.ai.

Query parameters

  • digest: SHA256 digest of blob

Example Request

curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2

Example Response

Returns "200 OK" if the blob exists, or "404 Not Found" if it does not.

Creating a Blob

POST /api/blobs/:digest

Creates a blob from a file on the server. Returns the server file path.

Query parameters

  • digest: the expected SHA256 digest of the file

Example Request

curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2

Example Response

Returns 201 Created if the blob was successfully created, or 400 Bad Request if the digest used was not as expected.

 

IV. Copying Models

POST /api/copy

Copies a model, duplicating an existing model under a new name.

Example Request

curl http://localhost:11434/api/copy -d '{
  "source": "llama3.1",
  "destination": "llama3-backup"
}'

Example Response

Returns "200 OK" if successful, or "404 Not Found" if the source model does not exist.

 

V. Deleting Models

DELETE /api/delete

Delete the model and its data.

parameters

  • name: Name of the model to be deleted

Example Request


curl -X DELETE http://localhost:11434/api/delete -d '{
  "name": "llama3.1"
}'

Example Response

Returns "200 OK" if successful or "404 Not Found" if the model to be deleted does not exist.

 

VI. Listing Running Models

GET /api/ps

Lists the models currently loaded into memory.

Example Request

curl http://localhost:11434/api/ps

Example Response

{
  "models": [
    {
      "name": "llama3.1:latest",
      "model": "llama3.1:latest",
      "size": 6654289920,
      "digest": "75382d0899dfaaa6ce331cf680b72bd6812c7f05e5158c5f2f43c6383e21d734",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "llama",
        "families": ["llama"],
        "parameter_size": "8.0B",
        "quantization_level": "Q4_0"
      },
      "expires_at": "2024-08-08T14:06:52.883023476+08:00",
      "size_vram": 6654289920
    }
  ]
}

 

VII. Listing Local Models

GET /api/tags

Lists locally available models.

Example Request

curl http://localhost:11434/api/tags

Example Response

{
  "models": [
    {
      "name": "llama3.1:latest",
      "model": "llama3.1:latest",
      "modified_at": "2024-08-07T17:54:22.533937636+08:00",
      "size": 4661230977,
      "digest": "75382d0899dfaaa6ce331cf680b72bd6812c7f05e5158c5f2f43c6383e21d734",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "llama",
        "families": ["llama"],
        "parameter_size": "8.0B",
        "quantization_level": "Q4_0"
      }
    }
  ]
}

 

VIII. Showing Model Information

POST /api/show

Shows information about a model, including details, Modelfile, template, parameters, license, and system prompt.

parameters

  • name: name of the model to show
  • verbose (optional): if set to true, returns full data for the verbose response fields

Example Request

curl http://localhost:11434/api/show -d '{
  "name": "llama3.1"
}'

Example Response

{
  "license": "...",
  "modelfile": "...",
  "parameters": "...",
  "template": "...",
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": ["llama"],
    "parameter_size": "8.0B",
    "quantization_level": "Q4_0"
  },
  "model_info": {
    "general.architecture": "llama",
    "general.basename": "Meta-Llama-3.1",
    "general.file_type": 2,
    "general.languages": ["en", "de", "fr", "it", "pt", "hi", "es", "th"],
    "general.license": "llama3.1",
    "general.parameter_count": 8030261312,
    "general.quantization_version": 2,
    "general.size_label": "8B",
    "general.tags": ["facebook", "meta", "pytorch", "llama", "llama-3", "text-generation"],
    "general.type": "model",
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 131072,
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": null,
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": null,
    "tokenizer.ggml.tokens": null
  },
  "modified_at": "2024-08-07T17:54:22.533937636+08:00"
}

 

IX. Pulling Models

POST /api/pull

Downloads a model from the ollama library. An interrupted pull resumes the download from where it left off, and multiple calls share the same download progress.

parameters

  • name: name of the model to pull
  • insecure (optional): allow insecure connections to the library. Use this option only when pulling from your own library during development.
  • stream (optional): if false, the response will be returned as a single response object rather than a stream of objects

Example Request

curl http://localhost:11434/api/pull -d '{
  "name": "llama3.1"
}'

Example Response

If stream is not specified or is set to true, a stream of JSON objects is returned:

The first object is the manifest:

{
"status": "pulling manifest"
}

Then there is a series of download responses. Until a download is complete, the completed key may not be included. The number of files to download depends on the number of layers specified in the manifest.

{
  "status": "downloading digestname",
  "digest": "digestname",
  "total": 2142590208,
  "completed": 241970
}

The final response after all the files have been downloaded is:

{
"status": "verifying sha256 digest"
}
{
"status": "writing manifest"
}
{
"status": "removing any unused layers"
}
{
"status": "success"
}

If stream is set to false, the response is a single JSON object:

{
"status": "success"
}

 

X. Pushing Models

POST /api/push

Uploads a model to a model library. Registering for ollama.ai and adding a public key are required first.

parameters

  • name: the name of the model to push, in the form <namespace>/<model>:<tag>
  • insecure (optional): allow insecure connections to the library. Use this option only when pushing to your own library during development.
  • stream (optional): if false, the response will be returned as a single response object rather than a stream of objects

Example Request

curl http://localhost:11434/api/push -d '{
  "name": "mattw/pygmalion:latest"
}'

Example Response

If stream is not specified or is set to true, a stream of JSON objects is returned:

{ "status": "retrieving manifest" }

Then there is a series of upload responses:

{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}

Finally, when the upload is complete:

{"status": "pushing manifest"}
{"status": "success"}

If stream is set to false, the response is a single JSON object:

{ "status": "success" }

 

XI. Generating Embeddings

POST /api/embed

Generates embeddings from a model.

parameters

  • model: name of the model to generate embeddings from
  • input: the text, or list of texts, to generate embeddings for

Advanced Parameters:

  • truncate: truncates the end of each input to fit within the context length. If false and the context length is exceeded, an error is returned. The default is true
  • options: other model parameters, such as temperature, seed, etc.
  • keep_alive: Controls how long the model remains loaded in memory after a request (default:5m)

Example Request

curl http://localhost:11434/api/embed -d '{
  "model": "llama3.1",
  "input": "Why is the grass green?"
}'

Example Response

{
  "model": "llama3.1",
  "embeddings": [[
    -0.008059342, -0.013182715, 0.019781841, 0.012018124, -0.024847334,
    -0.0031902494, -0.02714767, 0.015282277, 0.060032737, ...
  ]],
  "total_duration": 3041671009,
  "prompt_eval_count": 7
}

Example request (multiple inputs)

curl http://localhost:11434/api/embed -d '{
  "model": "llama3.1",
  "input": ["Why is the grass green?", "Why is the sky blue?"]
}'

Example Response

{
  "model": "llama3.1",
  "embeddings": [[
    -0.008471201, -0.013031566, 0.019300476, 0.011618419, -0.025197424,
    -0.0024164673, -0.02669075, 0.015766116, 0.059984162, ...
  ], [
    -0.012765694, -0.012822924, 0.015915949, 0.006415892, -0.02327763, ...
    0.004859615, -0.017922137, 0.019488193, 0.05638235, ...
  ]],
  "total_duration": 195481419,
  "load_duration": 1318886,
  "prompt_eval_count": 14
}

 

error handling

The Ollama API returns appropriate error codes and messages when an error occurs. Common errors include:

  • 400 Bad Request: request format error.
  • 404 Not Found: The requested resource does not exist.
  • 500 Internal Server Error: internal server error.
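
Error responses come back as a JSON object with an error field. For example, requesting a model that has not been pulled returns something like the following (the exact message text may vary by version):

{
  "error": "model 'llama3.1' not found, try pulling it first"
}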