Windows
The following is an example of how to configure Ollama to run on the GPU on a Windows system.
By default, Ollama uses the CPU for inference. For faster inference, you can configure which GPU Ollama uses. This tutorial will guide you through setting environment variables on your Windows system to enable GPU acceleration.
Prerequisites
- The computer has an NVIDIA graphics card.
- NVIDIA graphics drivers are installed. Run `nvidia-smi` to check whether the driver is installed.
- The CUDA Toolkit is installed. Run `nvcc --version` to check whether CUDA is installed.
Tip: You can find many tutorials on installing NVIDIA drivers and the CUDA Toolkit, so this article will not repeat them. If your computer meets the above prerequisites, Ollama uses GPU acceleration by default. If you want to pin Ollama to a particular GPU, follow the steps below.
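As a quick sanity check of the prerequisites, you can confirm that both tools are on your PATH before proceeding. This is a minimal sketch for a POSIX shell such as Git Bash; in PowerShell, `Get-Command` serves the same purpose:

```shell
# Check whether the NVIDIA driver tools and the CUDA compiler are installed.
# `command -v` prints the program's path if it is on PATH and fails otherwise.
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "nvidia-smi: found"
else
  echo "nvidia-smi: not found"
fi

if command -v nvcc >/dev/null 2>&1; then
  echo "nvcc: found"
else
  echo "nvcc: not found"
fi
```

If either command reports "not found", install the missing component before continuing.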
Configuring Environment Variables
- Open the system environment variable settings:
  - Type "Environment Variables" in the Windows search bar and select "Edit the system environment variables".
  - In the "System Properties" window, click the "Advanced" tab, then click the "Environment Variables" button.
- Create the OLLAMA_GPU_LAYER variable:
  - In the "System variables" area, click the "New" button.
  - In the "New System Variable" dialog box, enter the following:
    - Variable name: `OLLAMA_GPU_LAYER`
    - Variable value: `cuda` (this tells Ollama to use CUDA for GPU acceleration)
  - Click "OK" to save the variable.
- (Optional) Specify the GPU to be used:
  - If your system has multiple GPUs and you want Ollama to use a specific one, you can set the `CUDA_VISIBLE_DEVICES` environment variable.
  - Find the UUID of the GPU. It is strongly recommended to use the UUID instead of the index number, as the number may change after driver updates or system reboots.
    - Open a Command Prompt or PowerShell.
    - Run the command: `nvidia-smi -L`
    - In the output, find the "UUID" value of the GPU you want to use, for example:
      GPU 0: ... (UUID: GPU-xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx)
  - Create the CUDA_VISIBLE_DEVICES variable:
    - In the "System variables" area, click the "New" button.
    - In the "New System Variable" dialog box, enter the following:
      - Variable name: `CUDA_VISIBLE_DEVICES`
      - Variable value: the UUID of the GPU you found, for example: `GPU-xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx`
    - Click "OK" to save the variable.
Important: For the environment variables to take effect, you must restart the terminal or application in which Ollama is running.
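As an illustration of the UUID step above, the token that `CUDA_VISIBLE_DEVICES` expects can be pulled out of an `nvidia-smi -L` line with `grep`. The device name and UUID in the sample below are made-up placeholders; run the real command on your own machine to get actual values:

```shell
# A sample line in the shape that `nvidia-smi -L` prints (name and UUID
# here are illustrative placeholders, not a real GPU).
sample='GPU 0: NVIDIA GeForce RTX (UUID: GPU-12345678-aaaa-bbbb-cccc-123456789abc)'

# Extract just the UUID token, which is what CUDA_VISIBLE_DEVICES expects.
echo "$sample" | grep -o 'GPU-[0-9a-f-]*'
# → GPU-12345678-aaaa-bbbb-cccc-123456789abc
```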
Verify that GPU acceleration is in effect:
- Open a command prompt.
- Run Ollama, for example:
  ollama run deepseek-r1:1.5b
- Open a new command prompt window and use the `ollama ps` command to view the models Ollama is running. If GPU acceleration is in effect, the PROCESSOR column in the output should report the GPU (for example, "100% GPU") rather than the CPU.
Linux
The following is an example of how to configure Ollama to run on the GPU on a Linux system.
- Create an ollama_gpu_selector.sh script file with the following contents:
#!/bin/bash

# Validate input: comma-separated GPU indices between 0 and 4
validate_input() {
    if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
        echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
        exit 1
    fi
}

# Update the service file with CUDA_VISIBLE_DEVICES values
update_service() {
    # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
    if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' /etc/systemd/system/ollama.service; then
        # Update the existing CUDA_VISIBLE_DEVICES values
        sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' /etc/systemd/system/ollama.service
    else
        # Add a new CUDA_VISIBLE_DEVICES environment variable
        sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' /etc/systemd/system/ollama.service
    fi

    # Reload and restart the systemd service
    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service
    echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
}

# Check if arguments are passed
if [ "$#" -eq 0 ]; then
    # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
    read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
    validate_input "$cuda_values"
    update_service "$cuda_values"
else
    # Use arguments as CUDA_VISIBLE_DEVICES values
    cuda_values="$1"
    validate_input "$cuda_values"
    update_service "$cuda_values"
fi
- Add execute permissions to the script file and run it:
chmod +x ollama_gpu_selector.sh
sudo ./ollama_gpu_selector.sh
After running the script, follow the prompt to enter the GPU numbers to specify the GPUs used by Ollama. You can use commas to separate multiple GPU numbers, for example: 0,1,2
- Verify the Ollama service configuration:
cat /etc/systemd/system/ollama.service
After running the command, inspect the Ollama service file and confirm that the CUDA_VISIBLE_DEVICES environment variable has been updated. If it has, the [Service] section will contain a line such as Environment="CUDA_VISIBLE_DEVICES=0,1,2".
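To see what the script's two sed branches actually do to the service file, here is a minimal sketch run against a throwaway copy. The file contents below are illustrative; do not experiment on the real /etc/systemd/system/ollama.service:

```shell
#!/bin/sh
# Work on a temporary stand-in for the systemd unit file, not the real one.
f=$(mktemp)
printf '[Service]\nExecStart=/usr/bin/ollama serve\n' > "$f"

# First run: no CUDA_VISIBLE_DEVICES line exists yet, so append one
# right after the [Service] header (the script's "add" branch).
sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES=0,1"' "$f"
grep '^Environment' "$f"   # → Environment="CUDA_VISIBLE_DEVICES=0,1"

# Later runs: the line exists, so rewrite it in place (the "update" branch).
sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES=2"/' "$f"
grep '^Environment' "$f"   # → Environment="CUDA_VISIBLE_DEVICES=2"

rm -f "$f"
```

This is why the script checks with `grep -q` first: appending unconditionally would leave duplicate Environment lines in the unit file after repeated runs.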