Overview
Retrieve entities that have been synced from your source connections. Entities represent the individual data items (documents, records, files, etc.) that are indexed and searchable within your collections.
This endpoint provides access to the raw entity metadata stored in the PostgreSQL database. For searching entity content, use the Search Collection endpoint instead.
This is a low-level endpoint intended mainly for debugging and monitoring. For most use cases, query your data through the Search Collection endpoint.
Query Parameters
Number of entities to skip for pagination (minimum: 0)
Maximum number of entities to return (1-1000)
Filter entities by source connection UUID
Filter entities by collection readable ID
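The offset and limit parameters above can be combined to walk through a result set larger than one page. The following is a minimal sketch, not an official client: the offset parameter name `skip` is an assumption (the parameter list above does not name it), and `fetch_page` is a hypothetical callable standing in for the real HTTP request so the example runs offline.

```python
from typing import Callable, Dict, Iterator, List

Page = List[dict]


def iter_entities(
    fetch_page: Callable[[Dict[str, int]], Page],
    page_size: int = 100,
    offset_param: str = "skip",  # ASSUMPTION: name of the offset parameter
) -> Iterator[dict]:
    """Yield entities page by page until a short page signals the end."""
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    offset = 0
    while True:
        page = fetch_page({offset_param: offset, "limit": page_size})
        yield from page
        if len(page) < page_size:
            break  # fewer rows than requested: nothing left to fetch
        offset += page_size


# Hypothetical in-memory stand-in for the endpoint, used only for this demo.
_FAKE_ROWS = [{"id": str(i)} for i in range(250)]


def fake_fetch(params: Dict[str, int]) -> Page:
    start = params["skip"]
    return _FAKE_ROWS[start:start + params["limit"]]


all_rows = list(iter_entities(fake_fetch, page_size=100))
```

In real use, `fetch_page` would issue the GET request with the given parameters and return the decoded JSON array.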
Response
Returns an array of entity metadata objects.
id: Unique UUID identifier for this entity record
entity_id: Source-specific entity identifier (may include a chunk suffix)
sync_job_id: UUID of the sync job that created or updated this entity
sync_id: UUID of the sync configuration
entity_definition_id: UUID of the entity definition (schema)
hash: Content hash for deduplication and change detection
organization_id: UUID of the organization that owns this entity
created_at: ISO 8601 timestamp when this entity was first created
modified_at: ISO 8601 timestamp when this entity was last modified
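For client code, it can help to load each metadata object into a typed record. The sketch below assumes the JSON keys shown in the Example Response (`id`, `entity_id`, `sync_job_id`, and so on); `EntityRecord`, `parse_timestamp`, and `parse_entity` are hypothetical helper names, not part of any SDK.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EntityRecord:
    id: str
    entity_id: str
    sync_job_id: str
    sync_id: str
    entity_definition_id: str
    content_hash: str  # taken from the "hash" key of the response object
    organization_id: str
    created_at: datetime
    modified_at: datetime


def parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_entity(raw: dict) -> EntityRecord:
    return EntityRecord(
        id=raw["id"],
        entity_id=raw["entity_id"],
        sync_job_id=raw["sync_job_id"],
        sync_id=raw["sync_id"],
        entity_definition_id=raw["entity_definition_id"],
        content_hash=raw["hash"],
        organization_id=raw["organization_id"],
        created_at=parse_timestamp(raw["created_at"]),
        modified_at=parse_timestamp(raw["modified_at"]),
    )


# Sample values copied from the first object of the Example Response.
sample = {
    "id": "e1f2a3b4-c5d6-7890-abcd-ef1234567890",
    "entity_id": "docs/getting-started.md__chunk_0",
    "sync_job_id": "770e8400-e29b-41d4-a716-446655440002",
    "sync_id": "660e8400-e29b-41d4-a716-446655440001",
    "entity_definition_id": "def12345-6789-abcd-ef01-234567890abc",
    "hash": "sha256:a1b2c3d4e5f6...",
    "organization_id": "org12345-6789-abcd-ef01-234567890abc",
    "created_at": "2024-03-15T12:05:22Z",
    "modified_at": "2024-03-15T12:05:22Z",
}
record = parse_entity(sample)
```

Parsing the timestamps into timezone-aware `datetime` values makes comparisons such as `record.modified_at > record.created_at` straightforward.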
Example Request
curl "https://api.airweave.ai/v1/entities?limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"
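A rough Python equivalent of the curl command, using only the standard library, might look like the following. `build_entities_request` is a hypothetical helper; it only constructs the request, so the sketch runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.airweave.ai/v1/entities"


def build_entities_request(api_key: str, **params) -> Request:
    """Build (but do not send) a GET request for the entities endpoint."""
    # Drop parameters that were left unset so they do not appear in the URL.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}?{query}" if query else BASE_URL
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


request = build_entities_request("YOUR_API_KEY", limit=10)
# To send it: pass `request` to urllib.request.urlopen() and decode the
# response body with json.load().
```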
Example Response
[
{
"id": "e1f2a3b4-c5d6-7890-abcd-ef1234567890",
"entity_id": "docs/getting-started.md__chunk_0",
"sync_job_id": "770e8400-e29b-41d4-a716-446655440002",
"sync_id": "660e8400-e29b-41d4-a716-446655440001",
"entity_definition_id": "def12345-6789-abcd-ef01-234567890abc",
"hash": "sha256:a1b2c3d4e5f6...",
"organization_id": "org12345-6789-abcd-ef01-234567890abc",
"created_at": "2024-03-15T12:05:22Z",
"modified_at": "2024-03-15T12:05:22Z"
},
{
"id": "f2a3b4c5-d6e7-8901-bcde-f23456789012",
"entity_id": "docs/api-reference.md__chunk_0",
"sync_job_id": "770e8400-e29b-41d4-a716-446655440002",
"sync_id": "660e8400-e29b-41d4-a716-446655440001",
"entity_definition_id": "def12345-6789-abcd-ef01-234567890abc",
"hash": "sha256:b2c3d4e5f6a7...",
"organization_id": "org12345-6789-abcd-ef01-234567890abc",
"created_at": "2024-03-15T12:05:23Z",
"modified_at": "2024-03-15T12:05:23Z"
}
]
Use Cases
Monitoring Sync Progress
Query entities by sync_job_id to see what was synced in a specific job:
params = {
"sync_job_id": "770e8400-e29b-41d4-a716-446655440002",
"limit": 1000
}
Debugging Data Issues
Inspect entity metadata to diagnose sync or search issues:
params = {
"source_connection_id": "550e8400-e29b-41d4-a716-446655440000",
"limit": 10
}
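When debugging, it is often useful to see how many chunks each source item produced, since `entity_id` may include a chunk suffix. The sketch below assumes the suffix has the form `__chunk_<n>`, which is inferred only from the Example Response and may not hold for every source; `split_entity_id` and `group_chunks` are hypothetical helpers.

```python
import re
from collections import defaultdict

# ASSUMPTION: chunk suffix format inferred from the example response.
_CHUNK_SUFFIX = re.compile(r"__chunk_(\d+)$")


def split_entity_id(entity_id: str):
    """Return (base_id, chunk_index); chunk_index is None without a suffix."""
    match = _CHUNK_SUFFIX.search(entity_id)
    if match is None:
        return entity_id, None
    return entity_id[:match.start()], int(match.group(1))


def group_chunks(entities):
    """Map each base entity ID to the chunk indices seen for it, in order."""
    grouped = defaultdict(list)
    for entity in entities:
        base, index = split_entity_id(entity["entity_id"])
        grouped[base].append(index)
    return dict(grouped)


chunks = group_chunks([
    {"entity_id": "docs/getting-started.md__chunk_0"},
    {"entity_id": "docs/getting-started.md__chunk_1"},
    {"entity_id": "docs/api-reference.md__chunk_0"},
])
```

A source item with missing or repeated chunk indices in this mapping is a reasonable place to start when a sync looks incomplete.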
Entity Count Analysis
Count entities per collection for usage tracking:
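from collections import Counter
import requests
url = "https://api.airweave.ai/v1/entities"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
entities = requests.get(url, headers=headers, params={"limit": 1000}).json()  # first page only (up to 1000 entities)
by_collection = Counter(e.get("collection_readable_id") for e in entities)  # entities lacking this field are counted under None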
This endpoint returns metadata only. To access entity content (text, embeddings, etc.), use the Search Collection endpoint or query the vector database directly.