General Introduction
BrowserAI is an open source tool that lets users run AI models locally, directly in the browser. Developed by the Cloud-Code-AI team, it supports models such as Llama, DeepSeek, and Kokoro. Users can perform text generation, speech recognition, and text-to-speech in the browser without a server or complex setup. It uses WebGPU to accelerate computation, and all data is processed locally to protect privacy. BrowserAI is easy to use: developers can build AI apps with it, and ordinary users can try out AI features. The project is free and open on GitHub, so anyone can download the code to use or improve it.
[Screenshots: BrowserAI Text Dialog, BrowserAI Voice Dialog, BrowserAI Text-to-Speech]
Function List
- Run local AI models in your browser without server support.
- Text generation is supported, so users can enter text and get a natural language response.
- Provides speech recognition to convert audio to text.
- Supports text-to-speech, turning text into playable audio.
- Accelerated with WebGPU to run at near-native performance.
- Offline functionality is provided so that no internet connection is required after the first download.
- Open source, so developers can customize models and features.
Using Help
Installation process
BrowserAI does not require a traditional installation, but it does require preparation of the environment and code to run. Below are the specific steps:
- Check your browser
- Use a WebGPU-enabled browser, such as Chrome 113+ or Edge 113+. Type chrome://gpu in the address bar to check whether WebGPU is enabled.
- Make sure your computer hardware supports 16-bit floating point (some models require it). A regular CPU can run the models, but a GPU is faster.
- Download Code
- Visit https://github.com/Cloud-Code-AI/BrowserAI.
- Click the "Code" button and select "Download ZIP", or run git clone https://github.com/Cloud-Code-AI/BrowserAI.git to clone the repository.
- Unzip the file or go to the cloned folder.
- Install Node.js and dependencies
- Download and install Node.js from the official Node.js website. When you're done, run node -v to confirm the version.
- Open a terminal and go to the BrowserAI folder (e.g. cd BrowserAI).
- Run npm install to install the dependencies. This may take a few minutes.
- Start the project
- In the terminal, run npm run dev to start the local server.
- Open your browser and go to http://localhost:3000 (check the terminal output for the actual port number) to enter the BrowserAI interface.
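The browser check from the first step can also be done from code. The following is a minimal TypeScript sketch, not part of BrowserAI itself. It relies on the standard WebGPU entry points navigator.gpu and requestAdapter(), and on the 'shader-f16' adapter feature, which corresponds to 16-bit floating point support. The function name checkWebGPU and the small structural types are illustrative assumptions, declared locally so the snippet compiles without extra type packages.

```typescript
// Minimal structural types (assumed here so no @webgpu/types package is needed).
interface AdapterLike {
  features: { has(name: string): boolean };
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

interface WebGPUSupport {
  available: boolean; // a WebGPU adapter could be obtained
  shaderF16: boolean; // the adapter reports 16-bit float shader support
}

// Hypothetical helper: reports whether WebGPU and shader-f16 are usable.
async function checkWebGPU(gpu: GpuLike | undefined): Promise<WebGPUSupport> {
  if (!gpu) {
    // navigator.gpu is undefined in browsers without WebGPU.
    return { available: false, shaderF16: false };
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    // The API exists, but no suitable GPU adapter was found.
    return { available: false, shaderF16: false };
  }
  return { available: true, shaderF16: adapter.features.has('shader-f16') };
}

// In a browser page, one might call:
//   checkWebGPU((navigator as any).gpu).then((support) => console.log(support));
```

If available is false, the chrome://gpu page usually gives more detail about why WebGPU is disabled.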
How to use the main features
The core of BrowserAI is to run AI models in the browser, and the details of how to do this are described below.
Function 1: Text Generation
- Procedure
- After startup, the interface displays a model selection box, which offers llama-3.2-1b-instruct and other options by default.
- Click "Load Model" and wait for the model to load (a few seconds to a few minutes, depending on computer performance).
- Enter text in the input box, such as "What's the weather like today?", and click "Generate".
- The system generates a response, such as "It's a beautiful day to go out."
- Tips for use
- Small models (e.g. TinyLlama-1.1B) load quickly and suit low-end computers.
- Enter a specific request for a more accurate response, such as "Write a 50-word tech article."
- Application scenarios
- Write the first draft of an article, generate a dialog, or test model language skills.
Function 2: Speech Recognition
- Procedure
- Select a model that supports speech recognition, such as whisper-tiny-en.
- Click "Load Model" to load the model.
- Click "Start Recording" and speak into the microphone, e.g. "Hello, BrowserAI".
- Click "Stop Recording" and wait a few seconds. The interface displays the transcribed text, such as "Hello, BrowserAI".
- Tips for use
- Make sure the microphone is working properly and that there is little background noise for better results.
- Use the optional return_timestamps parameter to view the timestamp of each segment.
- Application scenarios
- Record meetings, transcribe voice notes, or develop voice input applications.
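The recording steps and the return_timestamps tip above can be sketched in code. This is a hedged TypeScript outline rather than the library's documented API: the method names loadModel, startRecording, stopRecording, and transcribeAudio follow the sample code in this article, while passing return_timestamps inside an options object, the SpeechAI interface, and the recordAndTranscribe helper are assumptions made for illustration.

```typescript
// Assumed shape of the object used for speech recognition. A real BrowserAI
// instance is expected to be passed in; the interface only lists what is used.
interface SpeechAI<TAudio> {
  loadModel(modelId: string): Promise<unknown>;
  startRecording(): Promise<unknown>;
  stopRecording(): Promise<TAudio>;
  // Assumption: return_timestamps is passed in a second options argument.
  transcribeAudio(audio: TAudio, options?: { return_timestamps?: boolean }): Promise<unknown>;
}

// Hypothetical helper: records for recordMs milliseconds, then transcribes.
async function recordAndTranscribe<TAudio>(
  ai: SpeechAI<TAudio>,
  recordMs: number,
): Promise<unknown> {
  await ai.loadModel('whisper-tiny-en');
  await ai.startRecording();
  // Keep the microphone open for the requested duration.
  await new Promise<void>((resolve) => setTimeout(resolve, recordMs));
  const audio = await ai.stopRecording();
  return ai.transcribeAudio(audio, { return_timestamps: true });
}
```

With a loaded page and microphone permission granted, calling recordAndTranscribe(ai, 5000) would record for about five seconds before transcribing.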
Function 3: Text-to-Speech
- Procedure
- Select the kokoro-tts model and click "Load Model".
- Enter text, such as "Welcome to the BrowserAI experience."
- Select the voice (e.g. af_bella) and speed (default 1.0), then click "Text to Speech".
- The audio is generated and played automatically; you can also download the file.
- Tips for use
- Short sentences sound more natural, and slower speeds (e.g., 0.8) sound clearer.
- Test different voice options to find the most suitable tone.
- Application scenarios
- Create voice prompts, generate podcast clips, or dub videos.
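The voice and speed settings described above might be applied from code roughly as follows. This is a sketch under stated assumptions: the method name textToSpeech and the { voice, speed } options shape are not shown in this article and are assumed here, as are the TtsAI interface and the synthesize helper. The model ID kokoro-tts, the voice af_bella, and the default speed 1.0 come from the steps above.

```typescript
interface TtsOptions {
  voice: string;
  speed: number;
}

// Assumed shape of the object used for speech synthesis (hypothetical).
interface TtsAI<TAudio> {
  loadModel(modelId: string): Promise<unknown>;
  textToSpeech(text: string, options: TtsOptions): Promise<TAudio>;
}

// Defaults taken from the article: voice af_bella, speed 1.0.
const DEFAULT_TTS: TtsOptions = { voice: 'af_bella', speed: 1.0 };

// Hypothetical helper: merges overrides with the defaults and synthesizes audio.
async function synthesize<TAudio>(
  ai: TtsAI<TAudio>,
  text: string,
  overrides: Partial<TtsOptions> = {},
): Promise<TAudio> {
  const options: TtsOptions = { ...DEFAULT_TTS, ...overrides };
  if (!(options.speed > 0)) {
    // Also rejects NaN, which fails every comparison.
    throw new RangeError('speed must be a positive number');
  }
  await ai.loadModel('kokoro-tts');
  return ai.textToSpeech(text, options);
}
```

For example, synthesize(ai, 'Welcome to the BrowserAI experience.', { speed: 0.8 }) would request the slower, clearer delivery mentioned in the tips.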
Function 4: Developer Customization
- Procedure
- Download the model file you want to use (e.g. from Hugging Face) and put it in the project directory (see README.md).
- Edit src/index.ts and add the model path.
- Run npm run dev to load the new model.
- Tips for use
- Ensure that the model is compatible with WebGPU and WebAssembly.
- If you are unfamiliar with the code, open an issue on GitHub for help.
- Application scenarios
- Test new models or develop custom AI applications.
Sample code
Text Generation
- Import BrowserAI into your project:
import { BrowserAI } from '@browserai/browserai';
// Create an instance and load a text generation model.
const ai = new BrowserAI();
await ai.loadModel('llama-3.2-1b-instruct');
// Generate a response and print it.
const response = await ai.generateText("Hello, how's the weather today?");
console.log(response);
Speech-to-Text
- Record and transcribe audio:
import { BrowserAI } from '@browserai/browserai';
// Create an instance and load a speech recognition model.
const ai = new BrowserAI();
await ai.loadModel('whisper-tiny-en');
// Record from the microphone, then stop to obtain the audio.
await ai.startRecording();
const audio = await ai.stopRecording();
// Transcribe the recording and print the text.
const text = await ai.transcribeAudio(audio);
console.log(text);
Caveats
- Performance: Large models (e.g. Llama-3.2-3b) require a high-end computer; a small model is recommended for low-end machines.
- Offline use: BrowserAI works even if you disconnect from the Internet after the first load, but you need to download the model in advance.
- Community support: If you run into problems, you can join the project's Discord and ask.
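The performance caveat suggests falling back to a smaller model when a larger one cannot be loaded. One way to express that is sketched below in TypeScript. It is an illustration, not BrowserAI's own behavior: it assumes that loadModel rejects when a model cannot be loaded, and the ModelLoader interface and loadFirstAvailable helper are hypothetical names.

```typescript
// Only the method this sketch needs; a BrowserAI instance is expected to fit.
interface ModelLoader {
  loadModel(modelId: string): Promise<unknown>;
}

// Hypothetical helper: tries each model ID in order and returns the first
// one that loads. Assumes loadModel rejects on failure.
async function loadFirstAvailable(
  ai: ModelLoader,
  candidates: string[],
): Promise<string> {
  const errors: string[] = [];
  for (const id of candidates) {
    try {
      await ai.loadModel(id);
      return id;
    } catch (err) {
      // Remember why this candidate failed, then try the next one.
      errors.push(`${id}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`No model could be loaded. ${errors.join('; ')}`);
}
```

A caller would list the larger model's ID first and a small model such as llama-3.2-1b-instruct last, so low-end machines still end up with a usable model.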
BrowserAI is simple and powerful. Just follow the steps to set up your environment and experience the convenience of local AI in your browser.