General Introduction
Yek is a fast, Rust-based tool that reads text files from a repository or directory, chunks them, and serializes them for use with large language models (LLMs). By default it uses .gitignore rules to skip unwanted files and Git history to infer which files are important. Yek can chunk content by approximate "token" count or by byte size, and it automatically detects whether its output is being piped. It can process multiple directories in a single command and is configured through a yek.toml file.
Feature List
- Uses .gitignore rules to skip unwanted files
- Uses Git history to infer important files
- Infers additional ignore patterns (e.g., binary files, large files)
- Chunks content by approximate "token" count or byte size
- Automatically detects whether the output is piped
- Processes multiple directories in a single command
- Configured via a yek.toml file
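The chunking feature listed above can be pictured with a small sketch. This is not Yek's actual implementation (Yek is written in Rust); the names `estimate_tokens` and `chunk_files`, the rule of roughly 4 characters per token, and the choice to keep an oversized file whole are all assumptions made for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming about 4 characters per token."""
    if not text:
        return 0
    return max(1, len(text) // 4)


def chunk_files(files, max_size, use_tokens=False):
    """Group (path, content) pairs into chunks whose total size fits max_size.

    Size is measured in UTF-8 bytes, or in estimated tokens when
    use_tokens is True. A single file larger than max_size is placed
    in a chunk of its own rather than being split (an assumption).
    """
    if use_tokens:
        measure = estimate_tokens
    else:
        measure = lambda s: len(s.encode("utf-8"))

    chunks, current, current_size = [], [], 0
    for path, content in files:
        size = measure(content)
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_size + size > max_size:
            chunks.append(current)
            current, current_size = [], 0
        current.append((path, content))
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Calling `chunk_files(files, 10 * 1024 * 1024)` would mirror a 10MB byte limit, while passing `use_tokens=True` switches the same loop to the approximate token measure.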
Usage Guide
Installation
Unix-like systems (macOS, Linux)
curl -fsSL https://bodo.run/yek.sh | bash
Windows (PowerShell)
irm https://bodo.run/yek.ps1 | iex
Build from source
git clone https://github.com/bodo-run/yek.git
cd yek
cargo build --release
Usage
Yek has sensible defaults: run yek in a directory to serialize the entire repository. By default, it serializes all files in the repository into 10MB chunks, writes them to a temporary directory, and prints the paths of the written files to the console.
Typical examples
- Process the current directory and write to a temporary directory:
yek
- Pipe the output to the clipboard (macOS):
yek src/ | pbcopy
- Limit the maximum size to 128K tokens and process only the src directory:
yek --max-size 128K --tokens src/
- Limit the maximum size to 100KB, process only the src directory, and write to a specific directory:
yek --max-size 100KB --output-dir /tmp/yek src/
- Process multiple directories:
yek src/ tests/
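The pbcopy example above depends on the tool noticing that its output is piped rather than shown in a terminal. A minimal sketch of that kind of check follows; it is not Yek's own code, and the function name `choose_output_mode` and the two mode labels are hypothetical.

```python
import sys


def choose_output_mode(stream=None) -> str:
    """Return "files" when the stream is a terminal, "stream" when it is piped.

    The mode names are illustrative labels, not Yek options.
    """
    stream = sys.stdout if stream is None else stream
    return "files" if stream.isatty() else "stream"


if __name__ == "__main__":
    # Report on stderr so the message stays visible even when stdout is piped.
    print(choose_output_mode(), file=sys.stderr)
```

Running the script directly and then again with `| cat` appended should report different modes, which is the distinction a tool can use to decide between writing chunk files and streaming content to the next command.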
CLI Reference
yek --help
Yek is a repository content chunking and serialization tool for LLM consumption.
Usage
yek [OPTIONS] [directories]...
Arguments
directories
: Directories to process [default: .]
Options
--max-size
: Maximum size of each chunk (e.g. '10MB', '128KB', '1GB') [default: 10MB]
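Size values such as '10MB', '128KB', and '128K' have to be converted to a number before chunking. The sketch below shows one way such strings could be parsed. The function name `parse_size`, the accepted suffixes, and the default multiplier of 1024 are assumptions; the source does not say whether Yek treats units as binary or decimal, so the base is left as a parameter.

```python
import re

# Power of the base implied by each suffix (assumed set of suffixes).
_UNIT_POWERS = {"": 0, "B": 0, "K": 1, "KB": 1, "M": 2, "MB": 2, "G": 3, "GB": 3}


def parse_size(text: str, base: int = 1024) -> int:
    """Convert a string like '10MB' or '128K' into a plain integer.

    base is the multiplier per unit step: 1024 for binary units,
    1000 for decimal units or token counts (both are assumptions).
    """
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper()
    if unit not in _UNIT_POWERS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(number) * base ** _UNIT_POWERS[unit]
```

Under these assumptions, `parse_size("10MB")` gives 10 * 1024 * 1024 bytes, and `parse_size("128K", base=1000)` gives 128000, which suits a token limit.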