Documents are the primary data source for your warehouse. The platform supports PDF files and other document formats.
Upload Documents
# Upload documents to a warehouse
file_paths = [
    "/path/to/document1.pdf",
    "/path/to/document2.pdf",
    "/path/to/document3.pdf"
]
# Upload documents (parsing happens automatically)
responses = warehouse.upload_documents(
    file_paths=file_paths,
    skip_parsing=False,  # Set to True to skip automatic parsing
    batch_size=10  # Number of files to upload per batch
)
print(f"Uploaded {len(responses)} batches of documents")
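Since `len(responses)` counts batches rather than files, it can help to estimate how many batches to expect for a given `batch_size`. The sketch below assumes that `upload_documents` splits `file_paths` into consecutive chunks of at most `batch_size` files; that is an assumption about the SDK's behavior, and `split_into_batches` is a hypothetical helper, not part of the SDK.

```python
from typing import List


def split_into_batches(file_paths: List[str], batch_size: int = 10) -> List[List[str]]:
    """Split file paths into consecutive chunks of at most batch_size items.

    Hypothetical helper: mirrors how the upload is assumed to batch files.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        file_paths[start:start + batch_size]
        for start in range(0, len(file_paths), batch_size)
    ]


# Made-up paths for illustration only
paths = [f"/path/to/document{i}.pdf" for i in range(1, 26)]
batches = split_into_batches(paths, batch_size=10)
print(f"{len(paths)} files -> {len(batches)} batches")
```

Under that assumption, 25 files with `batch_size=10` would produce three batches, the last one holding five files.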
List Documents
# List all documents in the warehouse
documents = warehouse.list_documents()
for doc in documents:
    print(f"Document: {doc['file_name']} (ID: {doc['id']})")
# List documents without file content (faster)
documents = warehouse.list_documents_without_file()
Delete Documents
# Delete specific documents by ID
document_ids = ["doc-id-1", "doc-id-2"]
result = warehouse.delete_documents(document_ids)
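Deletion works on IDs, while you usually know documents by file name. The following sketch maps file names to IDs using the `file_name` and `id` keys shown in the listing example. It assumes that `list_documents_without_file()` returns records with those same keys; `ids_by_file_name` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict
from typing import Dict, List


def ids_by_file_name(documents: List[dict]) -> Dict[str, List[str]]:
    """Map each file name to the IDs of all documents carrying that name."""
    index: Dict[str, List[str]] = defaultdict(list)
    for doc in documents:
        index[doc["file_name"]].append(doc["id"])
    return dict(index)


# Stand-in for the result of warehouse.list_documents_without_file();
# these records are invented for illustration.
documents = [
    {"id": "doc-id-1", "file_name": "document1.pdf"},
    {"id": "doc-id-2", "file_name": "document2.pdf"},
    {"id": "doc-id-3", "file_name": "document1.pdf"},
]

index = ids_by_file_name(documents)
document_ids = index.get("document1.pdf", [])
print(document_ids)
# With a real warehouse: result = warehouse.delete_documents(document_ids)
```

The helper returns a list per file name because the same file may have been uploaded more than once.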
Monitor Document Processing
Uploaded documents are processed asynchronously. You can monitor the processing status through the warehouse's tasks:
import time
# Wait a moment for the task to be created
time.sleep(5)
# List all tasks
tasks = warehouse.list_tasks()
latest_task = tasks[0] # Most recent task
# Wait for task completion
result = latest_task.wait_for_completion()
print(f"Task status: {result['status']}")