This article describes how to use the Ollama API. It is intended to help developers get up to speed quickly and take full advantage of Ollama's capabilities. You can use the library in a Node.js environment or import the module directly in your browser. By following this document, you can easily integrate Ollama into your projects.
Install Ollama
npm i ollama
Usage
import ollama from 'ollama'
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
Browser Use
To use this library without Node.js, import the browser module.
import ollama from 'ollama/browser'
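The browser export exposes the same methods as the Node.js module. As a sketch (assuming an Ollama server is reachable from the page at the default http://127.0.0.1:11434), a chat call in the browser looks the same as in Node.js:
import ollama from 'ollama/browser'
// Same chat API as in Node.js; requests are made with the browser's fetch.
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)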
Streaming Response
Response streaming can be enabled by setting stream: true, which changes the function call to return an AsyncGenerator where each part is an object in the stream.
import ollama from 'ollama'
const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of response) {
process.stdout.write(part.message.content)
}
Structured Output
With the Ollama JavaScript library, the schema is passed as a JSON object to the format parameter. You can pass a plain object, or (recommended) define the schema with Zod and serialize it with zodToJsonSchema().
import ollama from 'ollama';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
const Country = z.object({
name: z.string(),
capital: z.string(),
languages: z.array(z.string()),
});
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Tell me about Canada.' }],
format: zodToJsonSchema(Country),
});
const country = Country.parse(JSON.parse(response.message.content));
console.log(country);
Creating Models
import ollama from 'ollama'
const modelfile = `
FROM llama3.1
SYSTEM "You are Mario from Super Mario Bros."
`
await ollama.create({ model: 'example', modelfile: modelfile })
API
The Ollama JavaScript library API is designed around the Ollama REST API. If you want to learn more about the underlying implementation and the full set of API endpoints, refer to the Ollama API User's Guide.
Chat
ollama.chat(request)
- request <Object>: The request object containing chat parameters.
  - model <string>: The name of the model to use for the chat.
  - messages <Message[]>: An array of message objects representing the chat history.
    - role <string>: The role of the message sender ('user', 'system', or 'assistant').
    - content <string>: The content of the message.
    - images <Uint8Array[] | string[]>: (Optional) Images to include in the message, either as Uint8Arrays or base64-encoded strings.
  - format <string>: (Optional) The expected format of the response ('json').
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - tools <Tool[]>: (Optional) A list of tools the model may call.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <ChatResponse>
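As a sketch of the tools parameter, the example below defines a single illustrative weather tool (the tool name and its parameters are assumptions for this example, not part of the library) and inspects the calls the model requests, which are surfaced on the returned message's tool_calls field:
import ollama from 'ollama'
// Illustrative tool definition; the shape follows the Tool type, but the
// specific tool name and parameters here are placeholders for this sketch.
const tools = [{
  type: 'function',
  function: {
    name: 'get_current_weather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      required: ['city'],
      properties: { city: { type: 'string', description: 'The city name' } },
    },
  },
}]
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'What is the weather in Toronto?' }],
  tools,
})
// If the model decided to call a tool, the requested calls appear here.
console.log(response.message.tool_calls)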
Generate
ollama.generate(request)
- request <Object>: The request object containing generate parameters.
  - model <string>: The name of the model to use for the generation.
  - prompt <string>: The prompt to send to the model.
  - suffix <string>: (Optional) Text that comes after the inserted text.
  - system <string>: (Optional) Overrides the model's system prompt.
  - template <string>: (Optional) Overrides the model's template.
  - raw <boolean>: (Optional) Bypasses the prompt template and passes the prompt directly to the model.
  - images <Uint8Array[] | string[]>: (Optional) Images to include, either as Uint8Arrays or base64-encoded strings.
  - format <string>: (Optional) The expected format of the response ('json').
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <GenerateResponse>
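As a minimal sketch, a generate call takes a single prompt instead of a message history, and the generated text is returned on the response field of the result:
import ollama from 'ollama'
const response = await ollama.generate({
  model: 'llama3.1',
  prompt: 'Why is the sky blue?',
})
console.log(response.response)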
Pull Model
ollama.pull(request)
- request <Object>: The request object containing pull parameters.
  - model <string>: The name of the model to pull.
  - insecure <boolean>: (Optional) Pull from servers whose identity cannot be verified.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
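For example, a pull with stream: true yields progress objects that can be used to report download status (the status, completed, and total fields below follow ProgressResponse):
import ollama from 'ollama'
// Stream progress objects while the model downloads.
const stream = await ollama.pull({ model: 'llama3.1', stream: true })
for await (const progress of stream) {
  if (progress.completed && progress.total) {
    console.log(`${progress.status}: ${Math.round((100 * progress.completed) / progress.total)}%`)
  } else {
    console.log(progress.status)
  }
}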
Push Model
ollama.push(request)
- request <Object>: The request object containing push parameters.
  - model <string>: The name of the model to push.
  - insecure <boolean>: (Optional) Push to servers whose identity cannot be verified.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
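A sketch of pushing a model to a registry; the namespaced model name below is a placeholder, and pushing assumes you have access to that namespace on the target registry:
import ollama from 'ollama'
// 'my-namespace/example' is a placeholder; use a namespace you can push to.
await ollama.push({ model: 'my-namespace/example' })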
Create Model
ollama.create(request)
- request <Object>: The request object containing create parameters.
  - model <string>: The name of the model to create.
  - path <string>: (Optional) Path to the Modelfile to create the model from.
  - modelfile <string>: (Optional) The contents of the Modelfile to create the model from.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
Delete Model
ollama.delete(request)
- request <Object>: The request object containing delete parameters.
  - model <string>: The name of the model to delete.
- Returns: <StatusResponse>
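For example, removing the example model created earlier:
import ollama from 'ollama'
await ollama.delete({ model: 'example' })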
Copy Model
ollama.copy(request)
- request <Object>: The request object containing copy parameters.
  - source <string>: The name of the model to copy from.
  - destination <string>: The name of the model to copy to.
- Returns: <StatusResponse>
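As a sketch, copying a local model under a new name (the destination name is only an illustration):
import ollama from 'ollama'
await ollama.copy({ source: 'llama3.1', destination: 'llama3.1-backup' })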
List Local Models
ollama.list()
- Returns: <ListResponse>
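For example, listing the locally available models and printing their names (the models array and name field follow ListResponse):
import ollama from 'ollama'
const list = await ollama.list()
for (const model of list.models) {
  console.log(model.name)
}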
Show Model Information
ollama.show(request)
- request <Object>: The request object containing show parameters.
  - model <string>: The name of the model to show.
  - system <string>: (Optional) Overrides the model's system prompt in the returned value.
  - template <string>: (Optional) Overrides the model's template in the returned value.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <ShowResponse>
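A minimal sketch of showing model details; fields such as template and details are taken from ShowResponse:
import ollama from 'ollama'
const info = await ollama.show({ model: 'llama3.1' })
// ShowResponse includes, among other things, the model's template and details.
console.log(info.template)
console.log(info.details)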
Generate Embeddings
ollama.embed(request)
- request <Object>: The request object containing embed parameters.
  - model <string>: The name of the model used to generate the embeddings.
  - input <string> | <string[]>: The input used to generate the embeddings.
  - truncate <boolean>: (Optional) Truncate the input to fit the maximum context length supported by the model.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <EmbedResponse>
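As a sketch, embedding several strings at once; llama3.1 is used here only to stay consistent with the earlier examples (a dedicated embedding model is typical), and one vector is returned per input in the embeddings field:
import ollama from 'ollama'
const response = await ollama.embed({
  model: 'llama3.1',
  input: ['Why is the sky blue?', 'Why is the grass green?'],
})
// One embedding vector per input string.
console.log(response.embeddings.length)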
List Running Models
ollama.ps()
- Returns: <ListResponse>
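ps lists the models currently loaded in memory. For example:
import ollama from 'ollama'
const running = await ollama.ps()
for (const model of running.models) {
  console.log(model.name)
}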
Custom Clients
A custom client can be created with the following fields:
- host <string>: (Optional) The Ollama host address. Default: "http://127.0.0.1:11434".
- fetch <Object>: (Optional) The fetch library used to make requests to the Ollama host.
import { Ollama } from 'ollama'
const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
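If needed, a custom fetch implementation can also be supplied, for example to observe every request the client makes; the wrapper below is a minimal sketch that simply logs the request target before delegating to the global fetch:
import { Ollama } from 'ollama'
// Wrap the global fetch to log each request made to the Ollama host.
const ollama = new Ollama({
  host: 'http://127.0.0.1:11434',
  fetch: (input, init) => {
    console.log('Requesting:', input)
    return fetch(input, init)
  },
})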
Building
To build the project files, run:
npm run build
For more details, refer to the ollama-js documentation.