This article describes how to use the Ollama API. It is intended to help developers get up to speed quickly and take full advantage of Ollama's capabilities. You can use the library in a Node.js environment or import the module directly in your browser. By following this document, you can easily integrate Ollama into your projects.
Install Ollama
npm i ollama
Usage
import ollama from 'ollama'
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
Browser Use
To use this library without Node.js, import the browser module.
import ollama from 'ollama/browser'
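The browser export exposes the same methods as the Node.js module. As a sketch (assuming an Ollama server is reachable from the page at the default http://127.0.0.1:11434), a chat call in the browser looks the same as in Node.js:
import ollama from 'ollama/browser'
// Same chat API as in Node.js; requests are made with the browser's fetch.
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)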
Streaming Response
Response streaming can be enabled by setting stream: true, which changes the function call to return an AsyncGenerator where each part is an object in the stream.
import ollama from 'ollama'
const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of response) {
process.stdout.write(part.message.content)
}
Structured Output
With the Ollama JavaScript library, the schema is passed as a JSON object to the format parameter. You can pass a plain object, or (recommended) define the schema with Zod and serialize it with zodToJsonSchema().
import ollama from 'ollama';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
const Country = z.object({
name: z.string(),
capital: z.string(),
languages: z.array(z.string()),
});
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Tell me about Canada.' }],
format: zodToJsonSchema(Country),
});
const country = Country.parse(JSON.parse(response.message.content));
console.log(country);
Creating Models
import ollama from 'ollama'
const modelfile = `
FROM llama3.1
SYSTEM "You are Mario from Super Mario Bros."
`
await ollama.create({ model: 'example', modelfile: modelfile })
API
The Ollama JavaScript library API is designed around the Ollama REST API. If you want to learn more about the underlying implementation and the full set of API endpoints, refer to the Ollama API User's Guide.
Chat
ollama.chat(request)
- request <Object>: The request object containing chat parameters.
  - model <string>: The name of the model to use for the chat.
  - messages <Message[]>: An array of message objects representing the chat history.
    - role <string>: The role of the message sender ('user', 'system', or 'assistant').
    - content <string>: The content of the message.
    - images <Uint8Array[] | string[]>: (Optional) Images to include in the message, either as Uint8Arrays or base64-encoded strings.
  - format <string>: (Optional) The expected format of the response ('json').
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - tools <Tool[]>: (Optional) A list of tools the model may call.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <ChatResponse>
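As a sketch of the tools parameter, the example below defines a single illustrative weather tool (the tool name and its parameters are assumptions for this example, not part of the library) and inspects the calls the model requests, which are surfaced on the returned message's tool_calls field:
import ollama from 'ollama'
// Illustrative tool definition; the shape follows the Tool type, but the
// specific tool name and parameters here are placeholders for this sketch.
const tools = [{
  type: 'function',
  function: {
    name: 'get_current_weather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      required: ['city'],
      properties: { city: { type: 'string', description: 'The city name' } },
    },
  },
}]
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'What is the weather in Toronto?' }],
  tools,
})
// If the model decided to call a tool, the requested calls appear here.
console.log(response.message.tool_calls)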
Generate
ollama.generate(request)
- request <Object>: The request object containing generate parameters.
  - model <string>: The name of the model to use for the generation.
  - prompt <string>: The prompt to send to the model.
  - suffix <string>: (Optional) Text that comes after the inserted text.
  - system <string>: (Optional) Overrides the model's system prompt.
  - template <string>: (Optional) Overrides the model's template.
  - raw <boolean>: (Optional) Bypasses the prompt template and passes the prompt directly to the model.
  - images <Uint8Array[] | string[]>: (Optional) Images to include, either as Uint8Arrays or base64-encoded strings.
  - format <string>: (Optional) The expected format of the response ('json').
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <GenerateResponse>
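As a minimal sketch, a generate call takes a single prompt instead of a message history, and the generated text is returned on the response field of the result:
import ollama from 'ollama'
const response = await ollama.generate({
  model: 'llama3.1',
  prompt: 'Why is the sky blue?',
})
console.log(response.response)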
Pull Model
ollama.pull(request)
- request <Object>: The request object containing pull parameters.
  - model <string>: The name of the model to pull.
  - insecure <boolean>: (Optional) Pull from servers whose identity cannot be verified.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
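For example, a pull with stream: true yields progress objects that can be used to report download status (the status, completed, and total fields below follow ProgressResponse):
import ollama from 'ollama'
// Stream progress objects while the model downloads.
const stream = await ollama.pull({ model: 'llama3.1', stream: true })
for await (const progress of stream) {
  if (progress.completed && progress.total) {
    console.log(`${progress.status}: ${Math.round((100 * progress.completed) / progress.total)}%`)
  } else {
    console.log(progress.status)
  }
}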
Push Model
ollama.push(request)
- request <Object>: The request object containing push parameters.
  - model <string>: The name of the model to push.
  - insecure <boolean>: (Optional) Push to servers whose identity cannot be verified.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
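A sketch of pushing a model to a registry; the namespaced model name below is a placeholder, and pushing assumes you have access to that namespace on the target registry:
import ollama from 'ollama'
// 'my-namespace/example' is a placeholder; use a namespace you can push to.
await ollama.push({ model: 'my-namespace/example' })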
Create Model
ollama.create(request)
- request <Object>: The request object containing create parameters.
  - model <string>: The name of the model to create.
  - path <string>: (Optional) Path to the Modelfile to create the model from.
  - modelfile <string>: (Optional) The contents of the Modelfile to create the model from.
  - stream <boolean>: (Optional) If true, an AsyncGenerator is returned.
- Returns: <ProgressResponse>
Delete Model
ollama.delete(request)
- request <Object>: The request object containing delete parameters.
  - model <string>: The name of the model to delete.
- Returns: <StatusResponse>
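For example, removing the example model created earlier:
import ollama from 'ollama'
await ollama.delete({ model: 'example' })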
Copy Model
ollama.copy(request)
- request <Object>: The request object containing copy parameters.
  - source <string>: The name of the model to copy from.
  - destination <string>: The name of the model to copy to.
- Returns: <StatusResponse>
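As a sketch, copying a local model under a new name (the destination name is only an illustration):
import ollama from 'ollama'
await ollama.copy({ source: 'llama3.1', destination: 'llama3.1-backup' })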
List Local Models
ollama.list()
- Returns: <ListResponse>
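For example, listing the locally available models and printing their names (the models array and name field follow ListResponse):
import ollama from 'ollama'
const list = await ollama.list()
for (const model of list.models) {
  console.log(model.name)
}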
Show Model Information
ollama.show(request)
- request <Object>: The request object containing show parameters.
  - model <string>: The name of the model to show.
  - system <string>: (Optional) Overrides the model's system prompt in the returned value.
  - template <string>: (Optional) Overrides the model's template in the returned value.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <ShowResponse>
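A minimal sketch of showing model details; fields such as template and details are taken from ShowResponse:
import ollama from 'ollama'
const info = await ollama.show({ model: 'llama3.1' })
// ShowResponse includes, among other things, the model's template and details.
console.log(info.template)
console.log(info.details)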
Generate Embeddings
ollama.embed(request)
- request <Object>: The request object containing embed parameters.
  - model <string>: The name of the model used to generate the embeddings.
  - input <string> | <string[]>: The input used to generate the embeddings.
  - truncate <boolean>: (Optional) Truncate the input to fit the maximum context length supported by the model.
  - keep_alive <string | number>: (Optional) How long to keep the model loaded after the request.
  - options <Options>: (Optional) Options to configure the runtime.
- Returns: <EmbedResponse>
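As a sketch, embedding several strings at once; llama3.1 is used here only to stay consistent with the earlier examples (a dedicated embedding model is typical), and one vector is returned per input in the embeddings field:
import ollama from 'ollama'
const response = await ollama.embed({
  model: 'llama3.1',
  input: ['Why is the sky blue?', 'Why is the grass green?'],
})
// One embedding vector per input string.
console.log(response.embeddings.length)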
List Running Models
ollama.ps()
- Returns: <ListResponse>
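ps lists the models currently loaded in memory. For example:
import ollama from 'ollama'
const running = await ollama.ps()
for (const model of running.models) {
  console.log(model.name)
}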
Custom Clients
A custom client can be created with the following fields:
- host <string>: (Optional) The Ollama host address. Default: "http://127.0.0.1:11434".
- fetch <Object>: (Optional) The fetch library used to make requests to the Ollama host.
import { Ollama } from 'ollama'
const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })
const response = await ollama.chat({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
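If needed, a custom fetch implementation can also be supplied, for example to observe every request the client makes; the wrapper below is a minimal sketch that simply logs the request target before delegating to the global fetch:
import { Ollama } from 'ollama'
// Wrap the global fetch to log each request made to the Ollama host.
const ollama = new Ollama({
  host: 'http://127.0.0.1:11434',
  fetch: (input, init) => {
    console.log('Requesting:', input)
    return fetch(input, init)
  },
})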
Building
To build the project files, run:
npm run build
For more details, refer to the ollama-js documentation.