🌟 API Documentation Overview
The FastAPI-powered backend provides a comprehensive document-based question answering system using Retrieval-Augmented Generation (RAG). The API supports semantic search, document indexing, and chat completions across multiple data partitions with full OpenAI compatibility.
🔐 Authentication
Authentication is enforced only when it is enabled, which you do by setting an authorization token AUTH_TOKEN in your .env file. When it is enabled, every endpoint requires the token in the HTTP request header:
Authorization: Bearer YOUR_AUTH_TOKEN
For OpenAI-compatible endpoints, AUTH_TOKEN serves as the api_key parameter. When authentication is disabled, pass a placeholder such as 'sk-1234', because the OpenAI client requires an api_key value.
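As a minimal sketch of the header rule above, the helper below builds request headers for a plain HTTP client. The name `auth_headers` is hypothetical and not part of the API; it returns no Authorization header when no token is configured, which corresponds to the case where authentication is disabled.

```python
from typing import Dict, Optional


def auth_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers carrying the bearer token, if one is set.

    `auth_headers` is a hypothetical client-side helper, not part of the API.
    """
    if not token:
        # Authentication disabled: send no Authorization header.
        return {}
    return {"Authorization": f"Bearer {token}"}
```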
📡 API Serving Modes
This API can be served with Uvicorn (the default) or with Ray Serve for distributed deployments.
To enable Ray Serve, set the following environment variable:
ENABLE_RAY_SERVE=true
Additional optional environment variables for configuring Ray Serve:
RAY_SERVE_NUM_REPLICAS=1 # Number of deployment replicas
RAY_SERVE_HOST=0.0.0.0 # Host address for Ray Serve HTTP proxy
RAY_SERVE_PORT=8080 # Port for Ray Serve HTTP proxy
When using Ray Serve with a remote cluster, the HTTP server is started on the head node of the cluster.
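For illustration, a launcher or client script could inspect the same variables to work out which server it is talking to. This is a sketch under assumptions: `serving_config` is a hypothetical helper, its fallback values simply mirror the example values above, and treating only the string "true" (case-insensitive) as enabled is a guess about how the flag is parsed.

```python
import os
from typing import Dict, Mapping, Optional, Union


def serving_config(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[str, int]]:
    """Summarize the serving-related environment variables.

    Hypothetical helper: fallback values mirror the documented examples and
    are not confirmed defaults of the backend.
    """
    env = os.environ if env is None else env
    # Assumption: the flag is enabled only by the string "true".
    if env.get("ENABLE_RAY_SERVE", "false").strip().lower() != "true":
        return {"mode": "uvicorn"}
    return {
        "mode": "ray_serve",
        "num_replicas": int(env.get("RAY_SERVE_NUM_REPLICAS", "1")),
        "host": env.get("RAY_SERVE_HOST", "0.0.0.0"),
        "port": int(env.get("RAY_SERVE_PORT", "8080")),
    }
```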
🚀 API Endpoints
ℹ️ System Health
Verify server status and availability.
GET /health_check
📦 Document Indexing
Upload New File
POST /indexer/partition/{partition}/file/{file_id}
Upload a new file to a specific partition for indexing.
Parameters:
- partition (path): Target partition name
- file_id (path): Unique identifier for the file
Request Body (form-data):
- file (binary): File to upload
- metadata (JSON string): File metadata (e.g., {"owner": "user1"})
Responses:
- 201 Created: Returns task status URL
- 409 Conflict: File already exists in partition
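A minimal client-side sketch of the upload call described above. It assumes the third-party `requests` library is installed and that you supply the base URL of your deployment; `upload_url`, `upload_form`, and `upload_file` are hypothetical helper names. The response body format is not specified here, so the sketch returns the raw response for the caller to inspect.

```python
import json
from typing import Any, Dict
from urllib.parse import quote


def upload_url(base_url: str, partition: str, file_id: str) -> str:
    """Build the upload URL, percent-encoding both path segments."""
    return (
        f"{base_url.rstrip('/')}/indexer/partition/"
        f"{quote(partition, safe='')}/file/{quote(file_id, safe='')}"
    )


def upload_form(metadata: Dict[str, Any]) -> Dict[str, str]:
    """The metadata field is sent as a JSON string, not as nested form data."""
    return {"metadata": json.dumps(metadata)}


def upload_file(base_url, partition, file_id, path, metadata, token=None):
    """Send the file as multipart form-data (performs a network call)."""
    import requests  # third-party dependency, assumed to be installed

    headers = {"Authorization": f"Bearer {token}"} if token else {}
    with open(path, "rb") as fh:
        response = requests.post(
            upload_url(base_url, partition, file_id),
            headers=headers,
            files={"file": fh},
            data=upload_form(metadata),
        )
    # 201 means an indexing task was created; 409 means the file already exists.
    return response
```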
Replace Existing File
PUT /indexer/partition/{partition}/file/{file_id}
Replace an existing file in the partition. The current entry is deleted and a new indexing task is created.
Parameters: Same as POST endpoint
Request Body: Same as POST endpoint
Response: 202 Accepted with task status URL
Update File Metadata
PATCH /indexer/partition/{partition}/file/{file_id}
Update file metadata without reindexing the document.
Request Body (form-data):
metadata(JSON string): Updated metadata
Response: 200 OK on successful update
Delete File
DELETE /indexer/partition/{partition}/file/{file_id}
Remove a file from the specified partition.
Responses:
- 204 No Content: Successfully deleted
- 404 Not Found: File not found in partition
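The four file operations above share one URL and differ only in HTTP method and success status. The lookup table below collects them as a sketch for client code; the operation keys and the `is_success` helper are hypothetical names chosen for illustration, while the methods and status codes are the ones listed above.

```python
from typing import Dict, Tuple

# operation -> (HTTP method, status code documented for success)
FILE_OPERATIONS: Dict[str, Tuple[str, int]] = {
    "upload": ("POST", 201),
    "replace": ("PUT", 202),
    "update_metadata": ("PATCH", 200),
    "delete": ("DELETE", 204),
}


def is_success(operation: str, status_code: int) -> bool:
    """Check a response status against the documented success code."""
    _method, expected = FILE_OPERATIONS[operation]
    return status_code == expected
```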
Check Indexing Status
GET /indexer/task/{task_id}
Monitor the progress of an asynchronous indexing task.
Response: Task status information
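Because indexing is asynchronous, a client typically polls this endpoint until the task finishes. The loop below is a sketch under assumptions: the shape of the status payload is not documented here, so the caller supplies both a `fetch_status` callable (for example, one that issues GET /indexer/task/{task_id}) and an `is_finished` predicate. All names are hypothetical.

```python
import time
from typing import Any, Callable


def wait_for_task(
    fetch_status: Callable[[], Any],
    is_finished: Callable[[Any], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> Any:
    """Poll until `is_finished(status)` is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if is_finished(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("indexing task did not finish in time")
        time.sleep(interval)
```

Keeping the fetch and the completion check as parameters avoids hard-coding any field names of the status payload.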
See Logs of a Given Task
GET /indexer/task/{task_id}/logs
Get Error Details of a Failed Task
GET /indexer/task/{task_id}/error
🔍 Semantic Search
Search Across Multiple Partitions
GET /search/
Perform semantic search across specified partitions.
Query Parameters:
- partitions (optional): List of partition names (default: ["all"])
- text (required): Search query text
- top_k (optional): Number of results to return (default: 5)
Responses:
- 200 OK: JSON list of document links (HATEOAS format)
- 400 Bad Request: Invalid partitions parameter
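The query string for this endpoint can be assembled as follows. This is a sketch: `search_url` is a hypothetical helper, and encoding the `partitions` list as a repeated query parameter (partitions=a&partitions=b) is an assumption based on how FastAPI conventionally reads list-valued query parameters.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def search_url(
    base_url: str,
    text: str,
    partitions: Optional[Sequence[str]] = None,
    top_k: int = 5,
) -> str:
    """Build a GET /search/ URL for a multi-partition semantic search."""
    params = {
        # Documented default: search every partition.
        "partitions": list(partitions) if partitions else ["all"],
        "text": text,
        "top_k": top_k,
    }
    # doseq=True expands the list into repeated `partitions=` parameters.
    return f"{base_url.rstrip('/')}/search/?{urlencode(params, doseq=True)}"
```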
Search Within Single Partition
GET /search/partition/{partition}
Search within a specific partition only.
Query Parameters:
- text (required): Search query text
- top_k (optional): Number of results (default: 5)
Response: Same as multi-partition search
Search Within Specific File
GET /search/partition/{partition}/file/{file_id}
Search within a particular file in a partition.
Query Parameters: Same as partition search
Response: Same as other search endpoints
📄 Document Extraction
Get Extract Details
GET /extract/{extract_id}
Retrieve a specific document extract (chunk) by ID.
Response: JSON containing extract content and metadata
💬 OpenAI-Compatible Chat
These endpoints provide full OpenAI API compatibility for seamless integration with existing tools and workflows. For a detailed example using the OpenAI client, see the Example OpenAI Client Usage section.
- List Available Models
GET /v1/models
List all available RAG models (partitions).
Model Naming Convention:
- Pattern: openrag-{partition_name}, a model that chats specifically with the partition {partition_name}
- Special model: partition-all (queries the entire vector database)
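The naming convention can be captured in two small helpers. This is a sketch: `model_for_partition` and `partition_from_model` are hypothetical names, and only the `openrag-` prefix pattern stated above is assumed.

```python
from typing import Optional

MODEL_PREFIX = "openrag-"


def model_for_partition(partition: str) -> str:
    """Return the model name that targets a single partition."""
    return f"{MODEL_PREFIX}{partition}"


def partition_from_model(model: str) -> Optional[str]:
    """Return the partition encoded in a model name, or None if the name
    does not follow the prefix pattern."""
    if model.startswith(MODEL_PREFIX):
        return model[len(MODEL_PREFIX):]
    return None
```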
- Chat Completions
POST /v1/chat/completions
OpenAI-compatible chat completion using the RAG pipeline.
Example request:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_AUTH_TOKEN" \
  -d '{
    "model": "openrag-{partition_name}",
    "messages": [
      {
        "role": "user",
        "content": "Your question here"
      }
    ],
    "temperature": 0.7,
    "stream": false
  }'
- Text Completions
POST /v1/completions
OpenAI-compatible text completion endpoint.
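As a sketch of a raw request to this endpoint without the OpenAI client, the helper below prepares (but does not send) a POST using only the standard library. The `model` and `prompt` fields follow the OpenAI text completions request format; `completion_request` is a hypothetical helper name, and the base URL is whatever your deployment uses.

```python
import json
import urllib.request
from typing import Optional


def completion_request(
    base_url: str, model: str, prompt: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Prepare a POST /v1/completions request; nothing is sent here."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/completions",
        data=body,
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request.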
💡 Usage Examples
Bulk File Indexing
To index multiple files programmatically, use the data_indexer.py utility script in the 📁 utility folder, or use the indexer UI.
Example OpenAI Client Usage
from openai import OpenAI

api_base_url = "http://localhost:8080"  # FastAPI base URL of 'openrag'
base_url = f"{api_base_url}/v1"

# Your API authentication key: AUTH_TOKEN in your .env.
# If authentication is disabled, use a placeholder like 'sk-1234'.
auth_key = ...
client = OpenAI(api_key=auth_key, base_url=base_url)

your_partition = "my_partition"  # name of your partition
model = f"openrag-{your_partition}"
settings = {
    "model": model,
    "temperature": 0.3,
    "stream": False,
}

response = client.chat.completions.create(
    **settings,
    messages=[
        {"role": "user", "content": "What information do you have about...?"}
    ],
)
# Print the text of the first (non-streamed) completion choice.
print(response.choices[0].message.content)
⚠️ Error Handling
The API uses standard HTTP status codes:
- 200 OK: Successful request
- 201 Created: Resource created successfully
- 202 Accepted: Request accepted for processing
- 204 No Content: Successful deletion
- 400 Bad Request: Invalid request parameters
- 404 Not Found: Resource not found
- 409 Conflict: Resource already exists
Error responses include detailed JSON messages to help with debugging and integration.
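A client can turn these codes into readable messages with a small lookup. In this sketch, `describe_response` is a hypothetical helper, and reading a `detail` key from the JSON body is an assumption based on FastAPI's default error format; the actual error schema is not specified here.

```python
import json
from typing import Optional

# Status codes and meanings as listed above.
STATUS_MEANINGS = {
    200: "Successful request",
    201: "Resource created successfully",
    202: "Request accepted for processing",
    204: "Successful deletion",
    400: "Invalid request parameters",
    404: "Resource not found",
    409: "Resource already exists",
}


def describe_response(status_code: int, body: Optional[str] = None) -> str:
    """Combine the documented meaning with any message found in the body."""
    summary = STATUS_MEANINGS.get(status_code, "Unexpected status")
    detail = None
    if body:
        try:
            parsed = json.loads(body)
        except ValueError:
            parsed = None
        # Assumption: errors use FastAPI's default {"detail": ...} shape.
        if isinstance(parsed, dict):
            detail = parsed.get("detail")
    message = f"{status_code}: {summary}"
    return f"{message} ({detail})" if detail else message
```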