General Introduction
Langfuse is an open-source LLM (Large Language Model) engineering platform. It helps developers trace, debug, and optimize LLM applications by providing tools for observing calls, managing prompts, running experiments, and evaluating results. Developed by the Langfuse team, the platform supports frameworks such as LangChain and the OpenAI SDK. It is MIT-licensed and has an active community. It can be self-hosted quickly, locally or in the cloud, and is well suited to teams collaborating on reliable AI applications. Langfuse offers a cloud service (with a free tier) as well as self-hosted options, is easy to deploy, and is proven in production environments.
It visualizes and observes the runtime behavior of Agents and RAG pipelines, similar to LangSmith.
Feature List
- Application observability: Trace each invocation of the LLM application, recording inputs, outputs, latency, and cost.
- Prompt management: Centralized storage of prompts with version control and collaborative editing.
- Dataset management: Create test datasets and run experiments to compare models or prompt variants.
- Evaluation tools: Support user feedback, manual annotation, and automated evaluation to check output quality.
- Debugging support: View detailed logs and user sessions to quickly pinpoint problems.
- Experiment playground: Test prompts and model configurations to accelerate development iterations.
- Multi-framework support: Compatible with LangChain, the OpenAI SDK, LiteLLM, and more.
- API integration: Provides a comprehensive API for building custom LLMOps workflows.
Usage Guide
Installation and Deployment
Cloud Service
- Register an account: Visit Langfuse Cloud and click "Sign Up".
- Create a project: After logging in, click "New Project" and enter a project name.
- Get the keys: Generate a PUBLIC_KEY and SECRET_KEY in the project settings.
- Start using: No platform installation is required; connect to the cloud service directly through the SDK, as in the sketch below.
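A minimal connection sketch, assuming the placeholder keys are replaced with the ones generated in your project settings; the Python SDK reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST from the environment when they are not passed explicitly.

```python
import os
from langfuse import Langfuse

# Placeholder credentials; use the keys generated in your project settings.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-xxx"
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-xxx"
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"

langfuse = Langfuse()          # picks up the environment variables above
print(langfuse.auth_check())   # True if the credentials are valid
```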
Local Deployment (Docker Compose)
- Prepare the environment: Ensure that Docker and Docker Compose are installed; they can be downloaded from the Docker website.
- Clone the code: Run the following in a terminal
git clone https://github.com/langfuse/langfuse.git
then enter the directory with cd langfuse
- Start the services: Run
docker compose up
and wait for startup to complete; the default address is http://localhost:3000
- Verify: Open http://localhost:3000 in a browser; if you see the login page, the deployment succeeded (a script-based check appears after this list).
- Configure keys: After registering in the UI, generate keys for use with the SDK.
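To confirm the instance is reachable without opening a browser, a small smoke-test sketch in Python can hit the health endpoint (assumed here to be /api/public/health on the default port; adjust if the compose configuration was changed):

```python
import urllib.request

# Hypothetical smoke test for a local deployment.
with urllib.request.urlopen("http://localhost:3000/api/public/health") as resp:
    print(resp.status, resp.read().decode())  # expect HTTP 200 when healthy
```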
Kubernetes Deployment (Recommended for Production)
- Prepare the cluster: Create a Kubernetes cluster using Minikube (for local testing) or a cloud service such as AWS.
- Add the Helm repo: Run
helm repo add langfuse https://langfuse.github.io/langfuse-k8s
and helm repo update
- Configure: Create a values.yaml and fill in the database and key information (refer to the official documentation).
- Deploy: Run
helm install langfuse langfuse/langfuse -f values.yaml
and wait for it to finish.
- Access: Expose the service through an Ingress and access it at the configured address.
Virtual Machine Deployment
- Run on a single virtual machine with
docker compose up
The steps are the same as for local deployment.
Main Features
Application Observability
- Install the SDK: For a Python project, run
pip install langfuse
For a JS/TS project, run npm install langfuse
- Initialize: Configure the keys and host in code:
from langfuse import Langfuse

langfuse = Langfuse(public_key="pk-lf-xxx", secret_key="sk-lf-xxx", host="http://localhost:3000")
- Record calls: Use decorators or manual tracing (a manual-tracing sketch follows this list):
import openai
from langfuse.decorators import observe

@observe()
def chat(input):
    return openai.chat.completions.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": input}])

chat("Hello")
- View: Check the call details on the "Traces" page of the UI.
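For the manual-tracing alternative mentioned above, a rough sketch with the low-level client could look like the following; the trace and generation names, inputs, and outputs are illustrative, not prescribed by Langfuse.

```python
# Manual tracing sketch: create a trace, attach a generation, then flush.
trace = langfuse.trace(name="chat-request", input={"question": "Hello"})
generation = trace.generation(
    name="llm-call",
    model="gpt-3.5-turbo",
    input=[{"role": "user", "content": "Hello"}],
)
# ... call the model here ...
generation.end(output="Hi! How can I help?")
trace.update(output="Hi! How can I help?")
langfuse.flush()  # ensure buffered events are sent before the process exits
```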
Prompt Management
- New prompt: On the "Prompts" page of the UI, click "New Prompt" and enter a name and content, for example:
System: You are a helpful assistant; answer questions directly. User: {{question}}
- Use the prompt: In code, call
langfuse.get_prompt("prompt-name")
(see the sketch after this list).
- Version management: A new version is saved automatically whenever the prompt is modified, and can be rolled back.
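Putting the pieces together, fetching the stored prompt and filling in its {{question}} variable might look like the sketch below; the prompt name and variable are the ones from the example above.

```python
# Fetch the prompt created in the UI and substitute its template variable.
prompt = langfuse.get_prompt("prompt-name")
compiled = prompt.compile(question="What does 1+1 equal?")
print(compiled)  # prompt text with {{question}} replaced
```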
Datasets and Experiments
- Create a dataset: On the "Datasets" page of the UI, click "Create Dataset" and name it "qa-test" (a programmatic sketch follows this list).
- Add data: Enter items manually or upload a CSV, for example:
Input: "What does 1+1 equal?" Expected: "2"
- Run an experiment: Test in code:
dataset = langfuse.get_dataset("qa-test")
for item in dataset.items:
    result = chat(item.input)
    trace = langfuse.trace(name="qa-test", input=item.input, output=result)
    item.link(trace, "test-1")
- Analyze: View the experiment results in the UI.
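Datasets can also be created and populated from code instead of the UI; a sketch using the SDK's dataset helpers, with the same example item as above:

```python
# Create the dataset and add one item programmatically.
langfuse.create_dataset(name="qa-test")
langfuse.create_dataset_item(
    dataset_name="qa-test",
    input="What does 1+1 equal?",
    expected_output="2",
)
```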
Playground
- Open: Click "Playground" in the UI and enter a prompt and model parameters.
- Test: Click Run to view the output, adjust the parameters, and save.
- Jump: Open a problematic result from "Traces" directly in the Playground and modify it there.
Feature Highlights
Debug Log
- On the "Traces" page, click on a call to see the inputs, outputs, and context.
- View user sessions in "Sessions" to analyze multiple rounds of conversations.
Evaluation Output
- Manual: Rate the output (0-1) on the "Scores" page.
- Automated: Add a score via the API (an in-code sketch follows):
langfuse.score(trace_id="xxx", name="accuracy", value=0.95)
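When the code is already instrumented with the @observe decorator, the current trace can also be scored from inside the function; the score name and the pass/fail rule below are illustrative assumptions, not a built-in evaluator.

```python
from langfuse.decorators import langfuse_context, observe

@observe()
def answer_question(question: str) -> str:
    answer = "2" if "1+1" in question else "I don't know."  # placeholder logic
    # Illustrative automated check: full marks whenever an answer was produced.
    langfuse_context.score_current_trace(
        name="non-empty-answer",
        value=1.0 if answer.strip() else 0.0,
    )
    return answer

answer_question("What does 1+1 equal?")
```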
API Usage
- Call the REST API (described by an OpenAPI specification) or use an SDK (Python/JS). The API authenticates with HTTP Basic auth, using the public key as username and the secret key as password; for example, to list traces:
curl -u "pk-lf-xxx:sk-lf-xxx" "http://localhost:3000/api/public/traces"
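The same call from Python rather than curl might look like this sketch; the keys are placeholders and the Basic-auth scheme (public key as username, secret key as password) is the one described above.

```python
import base64
import json
import urllib.request

PUBLIC_KEY = "pk-lf-xxx"   # placeholder
SECRET_KEY = "sk-lf-xxx"   # placeholder
HOST = "http://localhost:3000"

# Build the Basic auth header from the project keys.
token = base64.b64encode(f"{PUBLIC_KEY}:{SECRET_KEY}".encode()).decode()
req = urllib.request.Request(
    f"{HOST}/api/public/traces",
    headers={"Authorization": f"Basic {token}"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # list of traces for the project
```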
Application Scenarios
- RAG pipeline visualization
- Visually track the whole pipeline from keyword recall, vector recall, and recall fusion through reranking to the final answer.
- Developing Intelligent Customer Service
- The team uses Langfuse to track conversations, optimize the quality of answers, and improve the customer experience.
- Model Performance Comparison
- Developers create datasets to test the performance of multiple LLMs on a question-answering task.
- On-premise deployment
- The company self-hosts Langfuse to protect sensitive data and debug internal AI applications.
FAQ
- What languages and frameworks are supported?
- Supports Python and JS/TS, and is compatible with LangChain, OpenAI, LlamaIndex and others.
- What is the minimum configuration for self-hosting?
- For smaller projects, a 2-core CPU and 4 GB of RAM are enough; for larger ones, 8 cores and 16 GB are recommended.
- How do I disable telemetry?
- Set the environment variable TELEMETRY_ENABLED=false.