Choose your deployment
Before you begin, decide how you want to run Airweave:
Cloud (recommended): Managed service with a free tier. Get your API key from the dashboard.
Self-hosted: Run locally with Docker. The API is available at http://localhost:8001.
This quickstart uses the cloud deployment. For self-hosted, change base_url to http://localhost:8001 in the examples below.
Install the SDK
Step 1: Initialize the client
Get your API key from the Airweave dashboard and initialize the client:
from airweave import AirweaveSDK

# Initialize the client
client = AirweaveSDK(
    api_key="YOUR_API_KEY",
    base_url="https://api.airweave.ai",  # Use "http://localhost:8001" for self-hosted
)
Store your API key in an environment variable: export AIRWEAVE_API_KEY="your-key-here"
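One way to follow this tip is to read the key at startup and fail early when it is missing. The sketch below uses only the standard library. The `AIRWEAVE_BASE_URL` variable and the `load_settings` helper are assumptions made for illustration and are not part of the SDK.

```python
import os

DEFAULT_BASE_URL = "https://api.airweave.ai"  # cloud endpoint used in this quickstart


def load_settings(env=os.environ):
    """Return (api_key, base_url) read from environment variables.

    AIRWEAVE_BASE_URL is a hypothetical variable name used only in this sketch.
    """
    api_key = env.get("AIRWEAVE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set AIRWEAVE_API_KEY before creating the client")
    # Fall back to the cloud endpoint and drop any trailing slash
    base_url = env.get("AIRWEAVE_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

The returned values can be passed as `api_key` and `base_url` when constructing the client. For a self-hosted deployment, the base URL would be set to http://localhost:8001.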
Step 2: Create a collection
A collection is a searchable container that groups multiple data sources together. Think of it as a unified search index.
# Create a collection
collection = client.collections.create(
    name="My First Collection"
)
print(f"Created collection: {collection.readable_id}")
print(f"Collection ID: {collection.id}")
# Output:
# Created collection: my-first-collection-x7k9m
# Collection ID: 550e8400-e29b-41d4-a716-446655440000
The readable_id is auto-generated from your collection name with a unique suffix. You’ll use this ID to reference the collection in search queries.
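To make the naming pattern concrete, here is a rough sketch of how a name could be turned into an identifier of this shape. It is an illustration only: the `make_readable_id` helper is hypothetical, and Airweave's actual generation logic is not described in this quickstart and may differ.

```python
import re
import secrets
import string

SUFFIX_ALPHABET = string.ascii_lowercase + string.digits


def make_readable_id(name, suffix_length=5):
    """Hypothetical helper: slugify a name and append a random suffix."""
    # Lowercase the name and collapse every run of other characters into one hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    suffix = "".join(secrets.choice(SUFFIX_ALPHABET) for _ in range(suffix_length))
    return f"{slug}-{suffix}"


# Prints something like my-first-collection-x7k9m (the suffix varies per call)
print(make_readable_id("My First Collection"))
```

In practice, always use the `readable_id` returned by the API instead of computing one yourself.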
Step 3: Add a source connection
Now connect a data source to your collection. Source connections handle authentication and automatically sync data from your apps and databases.
Pick the example that matches your source: Stripe (API key), GitHub (API key), or Slack (OAuth).
# Connect Stripe with API key
source_connection = client.source_connections.create(
    name="My Stripe Connection",
    short_name="stripe",
    readable_collection_id=collection.readable_id,
    authentication={
        "credentials": {
            "api_key": "sk_test_YOUR_STRIPE_API_KEY"
        }
    }
)
print(f"Created: {source_connection.name}")
print(f"Status: {source_connection.status}")
# Connect GitHub with personal access token
source_connection = client.source_connections.create(
    name="GitHub Docs Repo",
    short_name="github",
    readable_collection_id=collection.readable_id,
    config={
        "repo_name": "airweave-ai/airweave",
        "branch": "main"
    },
    authentication={
        "credentials": {
            "personal_access_token": "ghp_YOUR_TOKEN"
        }
    }
)
print(f"Created: {source_connection.name}")
print(f"Status: {source_connection.status}")
# Connect Slack with OAuth (browser flow)
source_connection = client.source_connections.create(
    name="Team Slack Workspace",
    short_name="slack",
    readable_collection_id=collection.readable_id,
    redirect_url="https://app.example.com/connections"
)
# For OAuth sources, you'll get an auth_url to complete authentication
print(f"Visit: {source_connection.auth.auth_url}")
Airweave will automatically start syncing data from your source. The initial sync may take a few minutes depending on data volume. You can monitor progress in the dashboard or via the API.
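If you script against the API instead of watching the dashboard, a polling loop is one way to wait for the first sync. The sketch below is deliberately generic: `fetch_status` is a callable you supply (for example, one that re-reads the source connection and returns its status), and the state names are placeholders, since this quickstart does not list the actual values.

```python
import time


def wait_for_sync(fetch_status, done_states=("completed",), failed_states=("failed",),
                  interval=5.0, timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it reports a done state.

    The state names are assumptions; adjust them to what your deployment returns.
    """
    deadline = clock() + timeout
    while True:
        status = str(fetch_status()).lower()
        if status in done_states:
            return status
        if status in failed_states:
            raise RuntimeError(f"Sync ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"Sync still {status!r} after {timeout} seconds")
        sleep(interval)  # wait before asking again
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real delays.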
Step 4: Search your collection
Once data is synced, you can search across all connected sources with natural language queries:
# Basic search
results = client.collections.search(
    readable_id=collection.readable_id,
    query="Find recent failed payments"
)
print(f"Found {len(results.results)} results\n")

# Display results
for i, result in enumerate(results.results[:3], 1):
    payload = result['payload']
    print(f"Result {i}:")
    print(f"  Source: {payload['source_name']}")
    print(f"  Content: {payload['md_content'][:100]}...")
    print(f"  Score: {result['score']:.3f}\n")
If you get an error about no results, the initial sync might still be running. Wait a minute and try again, or check the sync status in the dashboard.
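Because the result list may be empty while the first sync is running, and individual payloads may lack some fields, it can help to format results defensively. The sketch below assumes each result is a dict shaped like the ones printed above (a `payload` with `source_name` and `md_content`, plus a `score`). `summarize_result` and `summarize_results` are hypothetical helpers, not SDK functions.

```python
def summarize_result(result, width=100):
    """Return a one-line summary of a search result dict, tolerating missing fields."""
    payload = result.get("payload") or {}
    source = payload.get("source_name", "unknown source")
    content = (payload.get("md_content") or "").strip()
    # Add an ellipsis only when the content was actually truncated
    snippet = content[:width] + ("..." if len(content) > width else "")
    score = result.get("score")
    score_text = f"{score:.3f}" if isinstance(score, (int, float)) else "n/a"
    return f"[{source}] score={score_text} {snippet}".rstrip()


def summarize_results(results, limit=3):
    """Summarize up to `limit` results, or explain that nothing came back."""
    if not results:
        return ["No results yet; the initial sync may still be running."]
    return [summarize_result(r) for r in results[:limit]]
```

Calling `summarize_results(results.results)` would then produce printable lines whether or not the sync has finished.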
Step 5: Try advanced search
Airweave supports powerful search features including filters, AI reranking, and query expansion:
from airweave import SearchRequest, Filter, FieldCondition, MatchAny

# Advanced search with filters and AI features
results = client.collections.search_advanced(
    readable_id=collection.readable_id,
    search_request=SearchRequest(
        query="customer feedback about pricing",
        filter=Filter(
            must=[
                FieldCondition(
                    key="source_name",
                    match=MatchAny(any=["Stripe", "Zendesk", "Slack"])
                )
            ]
        ),
        recency_bias=0.5,       # Prefer newer content
        score_threshold=0.7,    # High-quality results only
        enable_reranking=True,  # AI reranking for better relevance
        limit=10
    )
)
print(f"Found {len(results.results)} high-quality results")
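As a mental model for how these parameters combine, the sketch below applies a MatchAny-style source filter, a score threshold, and a limit to plain dicts in memory. It only illustrates the semantics suggested by the comments above, and it assumes the threshold is inclusive. The real filtering, recency weighting, and reranking happen on the Airweave side, and `apply_search_options` is a hypothetical name.

```python
def apply_search_options(results, allowed_sources, score_threshold=0.7, limit=10):
    """Keep results from allowed sources that meet the threshold, best first."""
    allowed = set(allowed_sources)
    kept = [
        r for r in results
        if r["payload"]["source_name"] in allowed and r["score"] >= score_threshold
    ]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]


sample = [
    {"payload": {"source_name": "Stripe"}, "score": 0.92},
    {"payload": {"source_name": "Notion"}, "score": 0.95},  # dropped: source not allowed
    {"payload": {"source_name": "Slack"}, "score": 0.55},   # dropped: below threshold
    {"payload": {"source_name": "Zendesk"}, "score": 0.81},
]
top = apply_search_options(sample, ["Stripe", "Zendesk", "Slack"])
print([r["payload"]["source_name"] for r in top])  # ['Stripe', 'Zendesk']
```

Raising `score_threshold` trades recall for precision, so a query that returns nothing at 0.7 may be worth retrying with a lower value.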
What’s next?
Explore core concepts: Learn about sources, connectors, collections, entities, and syncs.
Add more connectors: Connect GitHub, Notion, Slack, databases, and 50+ other sources.
Master search features: Filters, reranking, query expansion, and answer generation.
Build AI agents: Integrate with Claude, OpenAI, and other AI frameworks.
Need help?
Join our Discord: Get help from the community and team.
Browse examples: Example notebooks and tutorials.