General Introduction
Bonsai is an open-source language model developed by deepgrove-ai. It has 500 million parameters and uses ternary weights. It is based on the Llama architecture with the Mistral tokenizer, and its linear layers are adapted to support ternary weights. The model was trained primarily on the DCLM-Pro and Fineweb-Edu datasets, totaling fewer than 5 billion tokens. Despite the small amount of training data, Bonsai performs well and is one of the first lightweight ternary models to reach competitive levels. Users can call it via the Hugging Face Transformers library. The project code is publicly available on GitHub for developers exploring efficient AI models.
Function List
- Lightweight and efficient: the ternary weight technique keeps the model small and fast, suitable for low-resource devices.
- Natural language generation: supports generating fluent text for tasks such as dialog and question answering.
- Open-source access: the full code is available on GitHub for users to download, modify, and optimize.
- Hugging Face compatible: integrates seamlessly with the Transformers library for easy loading and deployment.
- Strong performance: comparable to models of similar size despite a small amount of training data.
Usage Guide
Installation process
To use Bonsai, you need to set up the runtime environment first. Below are the detailed steps:
- Check the Python environment
Make sure Python 3.8 or above is installed on your computer. Type in the terminal:
python --version
If you don't have it, you can download it from the Python website.
- Install the Transformers library
Bonsai relies on Hugging Face's Transformers library. Run in a terminal:
pip install transformers
After installation, run pip show transformers to confirm the version.
- Download the Bonsai model
The model is hosted on Hugging Face. It is recommended to load it automatically via code (see below), but it can also be downloaded manually.
- Install optional dependencies
If fine-tuning or acceleration is required, install torch and datasets:
pip install torch datasets
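After these steps, a quick sanity check confirms the environment; a minimal sketch (the torch import assumes the optional install above was done):
import transformers
import torch  # only available if the optional install above was done
print(transformers.__version__)
print(torch.__version__)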
How to use
Bonsai is called from Python scripts. Here are the basic steps:
Loading the model and tokenizer
Run the following code in Python:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepgrove/Bonsai", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepgrove/Bonsai", trust_remote_code=True)
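Alternatively, both steps can be wrapped with the high-level pipeline API; a minimal sketch using standard Transformers usage (the prompt and token count are illustrative):
from transformers import pipeline
generator = pipeline("text-generation", model="deepgrove/Bonsai", trust_remote_code=True)
print(generator("The capital of France is", max_new_tokens=20)[0]["generated_text"])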
Generate Text
Enter text and generate results:
text = "中国的首都是哪里?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
The output might be "The capital of China is Beijing."
Adjusting generation parameters
The generation parameters can be modified, for example:
outputs = model.generate(**inputs, max_length=50, do_sample=True, temperature=0.7)
- max_length: sets the maximum output length (prompt plus generated tokens).
- temperature: controls the randomness of the output; smaller values give more stable results. It only takes effect when sampling is enabled (do_sample=True).
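Other standard generate() parameters can be combined in the same way; a minimal sketch with illustrative values (not tuned for Bonsai):
outputs = model.generate(
    **inputs,
    max_new_tokens=50,       # cap on newly generated tokens, independent of prompt length
    do_sample=True,          # enable sampling so temperature/top_p take effect
    temperature=0.7,
    top_p=0.9,               # nucleus sampling: keep the smallest set covering 90% probability
    repetition_penalty=1.1,  # mildly discourage repeated tokens
)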
Featured Functions
Efficient operation
Bonsai's ternary weights let it run well at 16-bit precision. A GPU, if available, can be used for acceleration:
import torch
print(torch.cuda.is_available())  # True means a GPU is available
A GPU boosts performance significantly, but the model also runs fine on a CPU.
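A minimal sketch of explicit device placement (standard PyTorch/Transformers usage, not specific to Bonsai):
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)  # move the weights onto the GPU if present
inputs = tokenizer("What is the capital of China?", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))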
Performance Evaluation
Bonsai performs well in several benchmarks. Here are the official figures:
| Model | ARC-c | ARC-e | HellaSwag | OBQA | PiQA | Winogrande | MMLU | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MobiLlama 0.5B | 26.62 | 46.68 | 51.66 | 30.00 | 71.65 | 54.50 | 28.61 | 44.25 |
| Qwen 2 0.5B | 28.84 | 50.29 | 49.12 | 33.00 | 69.26 | 56.99 | 31.78 | 45.61 |
| MobileLLM 600M | 29.01 | 56.65 | 55.35 | 34.00 | 71.65 | 59.75 | 31.40 | 48.13 |
| Qwen 2.5 0.5B | 32.25 | 58.29 | 52.18 | 35.40 | 69.91 | 56.12 | 33.40 | 48.22 |
| Bonsai | 33.36 | 57.95 | 48.04 | 34.00 | 70.24 | 54.85 | 30.28 | 46.96 |

These benchmarks, which include ARC, OBQA, MMLU, and others, show that Bonsai ranks among the leading lightweight models.
Fine-tuning the model
Bonsai ships without task-specific fine-tuning and is intended for general-purpose generation. If you need to optimize it for a specific use (e.g., question answering), you can fine-tune it yourself:
- Prepare the data: load it from a text file or with the datasets library (see the sketch after the code below).
- Configure training: set hyperparameters with TrainingArguments.
- Train the model:
from transformers import Trainer, TrainingArguments
training_args = TrainingArguments(
    output_dir="./bonsai_finetuned",   # where checkpoints are written
    num_train_epochs=3,
    per_device_train_batch_size=4
)
trainer = Trainer(model=model, args=training_args, train_dataset=your_dataset)
trainer.train()
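The your_dataset placeholder above must already be tokenized. A hedged sketch of building it from a plain-text file with the datasets library (the file name "train.txt" and the max_length value are illustrative):
from datasets import load_dataset
from transformers import DataCollatorForLanguageModeling
raw = load_dataset("text", data_files={"train": "train.txt"})  # one sample per line
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
your_dataset = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
# For causal LM training, pass a collator so labels are built from input_ids:
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
trainer = Trainer(model=model, args=training_args, train_dataset=your_dataset, data_collator=collator)
Without a collator that creates labels, the Trainer has no loss to optimize for causal language modeling.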
For more details, see the Hugging Face documentation.
Caveats
- Precision limits: currently only 16-bit precision is supported; the team is developing mixed-precision support (see the loading sketch after this list).
- No instruction tuning: the default model is not suitable for complex instruction-following tasks and needs fine-tuning first.
- Hardware requirements: an ordinary CPU can run it; a GPU is recommended but not required.
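Matching the precision limit above, the model can be loaded in half precision explicitly; a minimal sketch (torch_dtype is a standard from_pretrained argument; whether float16 or bfloat16 is the better fit for Bonsai is an assumption to verify against the official repo):
import torch
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
    "deepgrove/Bonsai",
    trust_remote_code=True,
    torch_dtype=torch.float16,  # assumption: half precision matching the 16-bit limit
)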
Application Scenarios
- Educational aids
Bonsai can answer basic knowledge questions such as "What is the capital of France?". Answers are generated quickly after input, making it suitable for learning.
- Edge device applications
The model is lightweight and suitable for deployment on mobile or embedded devices for local language processing.
- Model research
Researchers can use it to test the potential of ternary weight techniques and explore efficient AI model design.
FAQ
- What are Bonsai's core strengths?
It uses ternary weights to achieve a lightweight, efficient model with strong performance despite little training data, making it suitable for resource-constrained scenarios.
- Do I need a GPU?
No. A CPU is enough to run the model, but a GPU will speed it up.
- Can it be used directly for dialog?
The default model is not instruction-tuned; fine-tuning is recommended before using it for a specific task.