General Introduction
Airweave is an open source tool that makes any application searchable by synchronizing a user's application data, APIs, databases, and websites to graph and vector databases. It handles both structured and unstructured data, processing and storing it so that it can be retrieved through search. Core features include data synchronization, integration with multiple data sources, and support for multiple vector databases. Airweave is designed to be a simple, scalable, and transparent solution that makes it easy for users to search their data semantically.
Feature List
- Data Synchronization: Synchronizes a user's application data, APIs, databases, and website data to graph and vector databases.
- Multi-Source Integration: Supports more than 20 data source integrations, with more being added.
- Vector database support: Weaviate instances are used by default, but users can also configure their own vector databases.
- No-Code Configuration: Users can make applications searchable in a few clicks without writing code.
- Asynchronous Processing: Supports asynchronous processing for large-scale data synchronization.
- Open Source Core: The core functionality is open source; more advanced commercial features are planned for the future.
Usage Guide
Installation Process
- Clone the repository:
git clone https://github.com/airweave-ai/airweave.git
cd airweave
- Start the application:
chmod +x start.sh
./start.sh
After running the above command, Airweave will start locally.
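To confirm that the services have come up after running start.sh, a small readiness check can help. The following is a minimal sketch using only the Python standard library; it assumes the UI and API listen on the ports mentioned in this guide (8080 and 8001), and the function names are illustrative, not part of Airweave.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll host:port until it accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, wait_for_port("localhost", 8080) would wait for the UI and wait_for_port("localhost", 8001) for the API. An open port only shows that something is listening, not that the application has finished initializing.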
Usage Process
Front-end Usage
- Access the React UI: Open your browser and visit
http://localhost:8080
- Add a new data source: Navigate to the Sources page and add a new data source.
- Configure a synchronization schedule: Set up or view a synchronization schedule on the Schedules page.
- Monitor synchronization tasks: Monitor the status of synchronization tasks on the Jobs page.
API Usage
- Access the Swagger documentation: Open a browser and visit
http://localhost:8001/docs
to view the API documentation.
- Get all data sources:
GET /sources
- Connect to the data source:
POST /connections/{short_name}
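The two endpoints above can also be called from a script. The sketch below uses only the Python standard library and rests on several assumptions: the API is served from http://localhost:8001 (the host of the Swagger page), responses are JSON, and no authentication header is needed locally. The short name "example_source" and the payload fields are hypothetical; the Swagger documentation lists the real request schema.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API shares the host and port of the Swagger documentation.
BASE_URL = "http://localhost:8001"


def build_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def list_sources(base_url: str = BASE_URL, timeout: float = 10.0):
    """Call GET /sources and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_url("/sources", base_url), timeout=timeout) as resp:
        return json.load(resp)


def connect_request(short_name: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /connections/{short_name} request."""
    path = "/connections/" + urllib.parse.quote(short_name, safe="")
    return urllib.request.Request(
        build_url(path, base_url),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical short name and payload, shown only to illustrate the URL shape.
req = connect_request("example_source", {"name": "demo"})
print(req.get_method(), req.full_url)
```

Passing the built request to urllib.request.urlopen would send it; building it separately makes it easy to inspect the URL and body first.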
Advanced Configuration
- Configure Custom Vector Database: Users can configure their own vector database in the application UI or through the API.
- Asynchronous Client: Airweave provides an asynchronous client that supports non-blocking calls to the API.
import asyncio
from airweave import AsyncAirweaveSDK

client = AsyncAirweaveSDK(api_key="YOUR_API_KEY", base_url="https://yourhost.com/path/to/api")

async def main():
    # Each SDK call is a coroutine and must be awaited.
    await client.api_keys.create_api_key()

asyncio.run(main())
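The benefit of non-blocking calls shows up when several requests run at once. The sketch below illustrates the pattern with asyncio.gather; fake_api_call is a hypothetical stand-in that sleeps instead of contacting the API, so the example runs without a server. With the real asynchronous client, awaited SDK methods would take its place.

```python
import asyncio


async def fake_api_call(name: str, delay: float) -> str:
    """Stand-in for an awaited SDK method; sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_concurrently() -> list:
    # gather starts both coroutines together and returns results in argument
    # order, regardless of which one finishes first.
    return await asyncio.gather(
        fake_api_call("first", 0.02),
        fake_api_call("second", 0.01),
    )


results = asyncio.run(run_concurrently())
print(results)  # ['first done', 'second done']
```

The total wait is roughly that of the slowest call rather than the sum of all calls.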
Exception Handling
The ApiError exception is thrown when the API returns a non-successful status code (4xx or 5xx response).
from airweave.core.api_error import ApiError

try:
    client.api_keys.create_api_key()
except ApiError as e:
    # status_code holds the HTTP status returned by the API.
    print(e.status_code)
Automatic Retries
The SDK retries failed requests automatically; the default number of retries is 2. Use the max_retries request option to configure the retry behavior.
client.api_keys.create_api_key(..., request_options={"max_retries": 1})
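To make the meaning of a retry count concrete, the following sketch shows one common way a retry loop can be written. It is not the SDK's actual implementation: TransientError, call_with_retries, and the exponential backoff schedule are all assumptions chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a 5xx response."""


def call_with_retries(func, max_retries: int = 2, base_delay: float = 0.5, sleep=time.sleep):
    """Call func; on TransientError, retry up to max_retries times with backoff."""
    attempt = 0
    while True:
        try:
            return func()
        except TransientError:
            if attempt >= max_retries:
                # Retries exhausted: let the caller see the last error.
                raise
            # Assumed schedule: the delay doubles after each failed attempt.
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With max_retries=2, the function is called at most three times in total: the initial attempt plus two retries.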
Timeout Settings
The SDK default timeout is 60 seconds, and users can configure the timeout at the client or request level.
client = AirweaveSDK(..., timeout=20.0)
client.api_keys.create_api_key(..., request_options={"timeout_in_seconds": 1})
Customized Clients
Users can override the httpx client to support customization needs such as proxies and transports.
import httpx
from airweave import AirweaveSDK

client = AirweaveSDK(
    ...,
    httpx_client=httpx.Client(
        proxies="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)