Large language model (LLM) technology is evolving rapidly, and new tool libraries keep emerging. To help developers meet the challenges of LLM development, this article compiles a toolbox of more than 120 useful LLM libraries, grouped by function so that engineers can quickly find and apply them.
Quick navigation
To make it easier for readers to quickly locate the resources they need, here are quick links to the tool libraries in each category:
🚀 LLM Training and Fine-Tuning | 🧱 LLM Application Development | 🩸 LLM Retrieval-Augmented Generation (RAG) |
🟩 LLM Inference | 🚧 LLM Serving | 📤 LLM Data Extraction |
🌠 LLM Data Generation | 💎 LLM Agents | ⚖️ LLM Evaluation |
🔍 LLM Monitoring | 📅 LLM Prompt Engineering | 📝 LLM Structured Output |
🛑 LLM Security | 💠 LLM Embedding Models | ❇️ Other |
LLM Training and Fine-Tuning
Library | Description | Link |
---|---|---|
unsloth | Fine-tune LLMs faster with less memory. | Link |
PEFT | State-of-the-art library for parameter-efficient fine-tuning. | Link |
TRL | Train transformer language models with reinforcement learning. | Link |
Transformers | Provides thousands of pretrained models for tasks across modalities such as text, vision, and audio. | Link |
Axolotl | A tool designed to simplify post-training of various AI models. | Link |
LLMBox | A comprehensive LLM library, including a unified training pipeline and comprehensive model evaluation. | Link |
LitGPT | Quickly train and fine-tune LLMs. | Link |
Mergoo | A library for easily merging multiple LLM experts and efficiently training the merged LLM. | Link |
Llama-Factory | Simple and efficient LLM fine-tuning tool. | Link |
Ludwig | Low-code framework for building custom LLMs, neural networks, and other AI models. | Link |
Txtinstruct | A framework for training instruction-tuned models. | Link |
Lamini | An integrated LLM inference and tuning platform. | Link |
XTuring | Fast, efficient, and easy fine-tuning of open source LLMs such as Mistral, LLaMA, and GPT-J. | Link |
RL4LMs | A modular RL library for fine-tuning language models to human preferences. | Link |
DeepSpeed | A deep learning optimization library that makes distributed training and inference simple, efficient, and effective. | Link |
torchtune | A PyTorch-native library designed specifically for fine-tuning LLMs. | Link |
PyTorch Lightning | A library that provides a high-level interface for pretraining and fine-tuning LLMs. | Link |
LLM Application Development
Frameworks
Library | Description | Link |
---|---|---|
LangChain | A framework for developing applications powered by large language models (LLMs). | Link |
LlamaIndex | A data framework for LLM applications. | Link |
Haystack | An end-to-end LLM framework for building applications powered by LLMs, Transformer models, vector search, and more. | Link |
Prompt flow | A suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications. | Link |
Griptape | A modular Python framework for building AI-powered applications. | Link |
Weave | A toolkit for developing generative AI applications. | Link |
Llama Stack | Build Llama apps. | Link |
Multi-API Access
Library | Description | Link |
---|---|---|
LiteLLM | A library for calling 100+ LLM APIs in the OpenAI format. | Link |
AI Gateway | A fast AI gateway with integrated guardrails. Routes to 200+ LLMs and 50+ AI guardrails through one fast, friendly API. | Link |
Routers
Library | Description | Link |
---|---|---|
RouteLLM | A framework for serving and evaluating LLM routers that saves LLM costs without compromising quality. A drop-in replacement for the OpenAI client that routes simpler queries to cheaper models. | Link |
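
To make the idea of routing simpler queries to cheaper models concrete, here is a minimal sketch in plain Python. It does not use RouteLLM or reflect its API; the model names, keyword list, and word-count threshold are all hypothetical, and a real router would typically rely on a learned difficulty predictor rather than this crude heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str   # which model the query is sent to
    reason: str  # why the router picked it


# Hypothetical model names and keywords, chosen only for illustration.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"
HARD_HINTS = ("prove", "derive", "step by step", "refactor", "debug")


def route_query(query: str, max_easy_words: int = 30) -> Route:
    """Pick a model using a crude difficulty heuristic (keywords plus length)."""
    text = query.lower()
    if any(hint in text for hint in HARD_HINTS):
        return Route(STRONG_MODEL, "contains a hard-task keyword")
    if len(text.split()) > max_easy_words:
        return Route(STRONG_MODEL, "long query")
    return Route(CHEAP_MODEL, "short query without hard-task keywords")


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Prove that the square root of 2 is irrational."):
        print(q, "->", route_query(q))
```

The cost saving comes from the last branch: every query that lands there avoids a call to the more expensive model.
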
Memory
Library | Description | Link |
---|---|---|
mem0 | A memory layer for AI applications. | Link |
Memoripy | An AI memory layer with short- and long-term storage, semantic clustering, and optional memory decay for context-aware applications. | Link |
Interface
Library | Description | Link |
---|---|---|
Streamlit | A faster way to build and share data apps. Streamlit lets users turn Python scripts into interactive web apps in minutes. | Link |
Gradio | Build and share delightful machine learning apps, all in Python. | Link |
AI SDK UI | Build chat and generative user interfaces. | Link |
AI-Gradio | Create AI apps powered by a variety of AI providers. | Link |
Simpleaichat | Python package for easily interfacing with chat apps, with robust features and minimal code complexity. | Link |
Chainlit | Build production-ready conversational AI apps in minutes. | Link |
Low Code
Library | Description | Link |
---|---|---|
LangFlow | A low-code app builder for RAG and multi-agent AI applications. It is Python-based and agnostic to any model, API, or database. | Link |
Cache
Library | Description | Link |
---|---|---|
GPTCache | A library for creating semantic caches for LLM queries. The project claims a 10x reduction in LLM API costs 💰 and a 100x speedup. Fully integrated with LangChain and LlamaIndex. | Link |
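
A semantic cache returns a stored response when a new prompt is close enough in meaning to one seen before, so the LLM call can be skipped. The sketch below illustrates that lookup pattern only; it is not GPTCache's API. As a loudly labeled stand-in for a real embedding model, it compares bag-of-words vectors with cosine similarity, and the class name and threshold are hypothetical.

```python
import math
import re
from collections import Counter


def _vectorize(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """Hypothetical in-memory cache keyed by prompt similarity."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self._entries = []  # list of (vector, response) pairs

    def put(self, prompt: str, response: str) -> None:
        self._entries.append((_vectorize(prompt), response))

    def get(self, prompt: str):
        """Return the closest cached response, or None on a cache miss."""
        query = _vectorize(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = _cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


if __name__ == "__main__":
    cache = ToySemanticCache()
    cache.put("What is the capital of France?", "Paris")
    print(cache.get("what is the capital of france"))  # hit: same words
    print(cache.get("How do I bake bread?"))           # miss: no overlap
```

On a miss, the caller would query the LLM and then call `put` so that later, similar prompts are served from the cache.
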
LLM RAG
Library | Description | Link |
---|---|---|
FastGraph RAG | A streamlined, promptable Fast GraphRAG framework designed for interpretable, high-precision, agent-driven retrieval workflows. | Link |
Chonkie | A lightweight, extremely fast, and easy-to-use RAG chunking library. | Link |
RAGChecker | A fine-grained framework for diagnosing RAG. | Link |
RAG to Riches | Build, scale, and deploy advanced retrieval-augmented generation applications. | Link |
BeyondLLM | An all-in-one toolkit for experimentation, evaluation, and deployment of retrieval-augmented generation (RAG) systems. | Link |
SQLite-Vec | A vector search SQLite extension that runs anywhere. | Link |
fastRAG | A research framework for efficient and optimized retrieval-augmented generation pipelines, combining state-of-the-art LLMs and information retrieval techniques. | Link |
FlashRAG | A Python toolkit for efficient RAG research. | Link |
Llmware | A unified framework for building enterprise RAG pipelines using small, specialized models. | Link |
Rerankers | A lightweight, unified API for various reranking models. | Link |
Vectara | Build agentic RAG applications. | Link |
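
Chunking, the step that libraries such as Chonkie specialize in, splits documents into pieces small enough to embed and retrieve. The following is a minimal sketch of one common approach, fixed-size word windows with overlap, written in plain Python. It is not any listed library's API; the function name and default sizes are hypothetical.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word windows of chunk_size that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_words(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary at least partly visible in both neighboring chunks, at the cost of some duplicated text in the index.
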
LLM Inference
Library | Description | Link |
---|---|---|
LLM Compressor | A Transformers-compatible library for applying various compression algorithms to LLMs to optimize deployment. | Link |
LightLLM | A Python-based LLM inference and serving framework known for its lightweight design, easy scalability, and high-speed performance. | Link |
vLLM | A high-throughput and memory-efficient inference and serving engine for LLMs. | Link |
torchchat | Run PyTorch LLMs locally on servers, desktops, and mobile devices. | Link |
TensorRT-LLM | A library for optimizing large language model (LLM) inference. | Link |
WebLLM | A high-performance in-browser LLM inference engine. | Link |
LLM Serving
Library | Description | Link |
---|---|---|
Langcorn | Serve LangChain LLM applications and agents automatically with FastAPI. | Link |
LitServe | An extremely fast serving engine for any AI model of any size. It augments FastAPI with features such as batching, streaming, and GPU autoscaling. | Link |
LLM Data Extraction
Library | Description | Link |
---|---|---|
Crawl4AI | An open source, LLM-friendly web crawler and scraper. | Link |
ScrapeGraphAI | A web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). | Link |
Docling | Parses documents and exports them easily and quickly to the desired format. | Link |
Llama Parse | A GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). | Link |
PyMuPDF4LLM | Makes it easier to extract PDF content in the formats required by LLM and RAG environments. | Link |
Crawlee | A web scraping and browser automation library. | Link |
MegaParse | A parser for every type of document. | Link |
ExtractThinker | A document intelligence library for LLMs. | Link |
LLM Data Generation
Library | Description | Link |
---|---|---|
DataDreamer | A powerful open source Python library for prompting, synthetic data generation, and training workflows. | Link |
fabricator | A flexible open source framework for generating datasets with large language models. | Link |
Promptwright | A synthetic dataset generation library. | Link |
EasyInstruct | An easy-to-use instruction processing framework for large language models. | Link |
LLM Agents
Library | Description | Link |
---|---|---|
CrewAI | A framework for orchestrating role-playing, autonomous AI agents. | Link |
LangGraph | Build resilient language agents as graphs. | Link |
Agno | Build AI agents with memory, knowledge, tools, and reasoning capabilities. Chat with them using a beautiful agent UI. | Link |
AutoGen | An open source framework for building AI agent systems. | Link |
Smolagents | A library for building powerful agents in a few lines of code. | Link |
Pydantic AI | A Python agent framework for building production-grade applications with generative AI. | Link |
gradio-tools | A Python library for converting Gradio apps into tools that LLM-based agents can use to complete their tasks. | Link |
Composio | A production-ready toolset for AI agents. | Link |
Atomic Agents | Build AI agents atomically. | Link |
Memary | An open source memory layer for autonomous agents. | Link |
Browser Use | Make websites accessible to AI agents. | Link |
OpenWebAgent | An open toolkit for enabling web agents on large language models. | Link |
Lagent | A lightweight framework for building LLM-based agents. | Link |
LazyLLM | A low-code development tool for building multi-agent LLM applications. | Link |
Swarms | An enterprise-grade, production-ready multi-agent orchestration framework. | Link |
ChatArena | A library that provides multi-agent language game environments and facilitates research on autonomous LLM agents and their social interactions. | Link |
Swarm | An educational framework exploring ergonomic, lightweight multi-agent orchestration. | Link |
AgentStack | The fastest way to build powerful AI agents. | Link |
Archgw | An intelligent gateway for agents. | Link |
Flow | A lightweight task engine for building AI agents. | Link |
AgentOps | A Python SDK for AI agent monitoring. | Link |
Langroid | A multi-agent framework. | Link |
Agentarium | A framework for creating and managing simulations populated with AI-powered agents. | Link |
Upsonic | A reliable AI agent framework that supports MCP. | Link |
LLM Evaluation
Library | Description | Link |
---|---|---|
Ragas | A toolkit for evaluating and optimizing large language model (LLM) applications. | Link |
Giskard | Open source evaluation and testing tools for ML and LLM systems. | Link |
DeepEval | An LLM evaluation framework. | Link |
Lighteval | An all-in-one toolkit for evaluating LLMs. | Link |
TruLens | Evaluation and tracking tools for LLM experiments. | Link |
PromptBench | A unified evaluation framework for large language models. | Link |
LangTest | Deliver safe and effective language models. Over 60 test types for comparing LLM and NLP models on accuracy, bias, fairness, robustness, and more. | Link |
EvalPlus | A rigorous evaluation framework for LLM4Code. | Link |
FastChat | An open platform for training, serving, and evaluating chatbots based on large language models. | Link |
judges | A small library of LLM judges. | Link |
Evals | A framework for evaluating LLMs and LLM systems, and an open source registry of benchmarks. | Link |
AgentEvals | Evaluators and utilities for evaluating agent performance. | Link |
LLMBox | A comprehensive LLM library, including a unified training pipeline and comprehensive model evaluation. | Link |
Opik | An open source end-to-end LLM development platform that also includes LLM evaluation. | Link |
LLM Monitoring
Library | Description | Link |
---|---|---|
MLflow | An open source end-to-end MLOps/LLMOps platform for tracking, evaluating, and monitoring LLM applications. | Link |
Opik | An open source end-to-end LLM development platform that also includes LLM monitoring. | Link |
LangSmith | Provides tools for logging, monitoring, and improving LLM applications. | Link |
Weights & Biases (W&B) | Provides features for tracking LLM performance. | Link |
Helicone | An open source LLM observability platform for developers. One-line integration for monitoring, metrics, evaluation, agent tracing, prompt management, playground, and more. | Link |
Evidently | An open source ML and LLM observability framework. | Link |
Phoenix | An open source AI observability platform designed for experimentation, evaluation, and troubleshooting. | Link |
Observers | A lightweight library for AI observability. | Link |
LLM Prompt Engineering
Library | Description | Link |
---|---|---|
PCToolkit | A unified plug-and-play prompt compression toolkit for large language models. | Link |
Selective Context | Compresses the user's prompt and context so that an LLM (e.g. ChatGPT) can process 2x more content. | Link |
LLMLingua | A library for compressing prompts to accelerate LLM inference. | Link |
betterprompt | A test suite for LLM prompts before pushing them to production. | Link |
Promptify | Solve NLP problems with LLMs and easily generate prompts for different NLP tasks for popular generative models such as GPT and PaLM. | Link |
PromptSource | A toolkit for creating, sharing, and using natural language prompts. | Link |
DSPy | An open source framework for programming (not prompting) language models. | Link |
Py-priompt | A prompt design library. | Link |
Promptimizer | A prompt optimization library. | Link |
LLM Structured Output
Library | Description | Link |
---|---|---|
Instructor | A Python library for working with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API. | Link |
XGrammar | An open source library for efficient, flexible, and portable structured generation. | Link |
Outlines | Robust (structured) text generation. | Link |
Guidance | An efficient programming paradigm for steering language models. | Link |
LMQL | A language for constraint-guided and efficient LLM programming. | Link |
Jsonformer | A bulletproof way to generate structured JSON from language models. | Link |
LLM Security
Library | Description | Link |
---|---|---|
JailbreakEval | A collection of automated evaluators for assessing jailbreak attempts. | Link |
EasyJailbreak | An easy-to-use Python framework for generating adversarial jailbreak prompts. | Link |
Guardrails | Adds guardrails to large language models. | Link |
LLM Guard | A security toolkit for LLM interactions. | Link |
AuditNLG | An open source library that can help reduce the risks associated with using generative AI systems for language. | Link |
NeMo Guardrails | An open source toolkit for easily adding programmable guardrails to LLM-based conversational systems. | Link |
Garak | An LLM vulnerability scanner. | Link |
LLM Embedding Models
Library | Description | Link |
---|---|---|
Sentence-Transformers | State-of-the-art text embedding models. | Link |
Model2Vec | Fast, state-of-the-art static embedding models. | Link |
Text Embedding Inference | A high-speed inference solution for text embedding models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE, and E5. | Link |
Other
Library | Description | Link |
---|---|---|
Text Machina | A modular and extensible Python framework designed to help create high-quality, unbiased datasets for building robust models for machine-generated text (MGT) tasks such as detection, attribution, and boundary detection. | Link |
LLM Reasoners | A library for advanced large language model reasoning. | Link |
EasyEdit | An easy-to-use knowledge editing framework for large language models. | Link |
CodeTF | A one-stop Transformer library for state-of-the-art code LLMs. | Link |
spacy-llm | Integrates large language models (LLMs) into spaCy, with a modular system for rapid prototyping and prompting, and turns unstructured responses into robust outputs for a variety of NLP tasks. | Link |
pandas-ai | Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). | Link |
LLM Transparency Tool | An open source interactive toolkit for analyzing the inner workings of Transformer-based language models. | Link |
Vanna | Chat with your SQL database. Accurate text-to-SQL generation via LLMs using RAG. | Link |
mergekit | Tools for merging pretrained large language models. | Link |
MarkLLM | An open source toolkit for LLM watermarking. | Link |
LLMSanitize | An open source library for contamination detection in NLP datasets and large language models (LLMs). | Link |
Annotateai | Automatically annotate papers using LLMs. | Link |
LLM Reasoner | Make any LLM think like OpenAI o1 and DeepSeek R1. | Link |