Environment Variable Configuration
Overview
OpenRAG provides a wide range of environment variables for customizing and configuring various aspects of the application. This page is a comprehensive reference for all available environment variables, listing their types, default values, and descriptions. As new variables are introduced, this page will be updated to reflect the growing configuration options.
Backend
Indexer Pipeline
Loaders
OpenRAG loads all files into a pivot Markdown format before chunking. The following environment variables can be set to customize this pipeline.
General variables
| Variable | Type | Default | Description |
|---|---|---|---|
IMAGE_CAPTIONING | bool | true | If true, an LLM is used to describe images and convert them into text using a specific prompt. Images in files are replaced by their descriptions |
IMAGE_CAPTIONING_URL | bool | true | If true, HTTP/HTTPS image URLs in markdown files are fetched and described by the VLM. |
SAVE_MARKDOWN | bool | false | If true, the pivot-format markdown produced during parsing is saved. Useful for debugging and verifying the correctness of the generated markdown. |
SAVE_UPLOADED_FILES | bool | false | When true, uploaded files are stored on disk. You must enable this option if you want Chainlit to show sources while chatting. |
PDFLoader | str | MarkerLoader | Specifies the PDF parsing engine to use. Available options: PyMuPDFLoader, PyMuPDF4LLMLoader, MarkerLoader and DotsOCRLoader. |
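As an illustration, a minimal `.env` fragment setting these loader options might look like the sketch below (the values shown are illustrative choices, not recommendations):

```shell
# Describe images with a VLM and replace them by their text descriptions
IMAGE_CAPTIONING=true
# Also fetch and caption http(s) image URLs found in markdown files
IMAGE_CAPTIONING_URL=true
# Keep the intermediate pivot markdown for debugging
SAVE_MARKDOWN=true
# Required if Chainlit should show sources while chatting
SAVE_UPLOADED_FILES=true
# Parse PDFs with the default Marker engine
PDFLoader=MarkerLoader
```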
PDF Loader
Marker Loader Configuration
The MarkerLoader is the default PDF parsing engine. It can be configured using the following environment variables:
| Variable | Type | Default | Description |
|---|---|---|---|
MARKER_POOL_SIZE | int | 1 | Number of workers (typically 1 worker per cluster node) |
MARKER_MAX_PROCESSES | int | 2 | Number of subprocesses, i.e., the number of concurrent PDFs per worker (increase depending on your available GPU resources) |
MARKER_MAX_TASKS_PER_CHILD | int | 20 | Number of tasks a child (PDF worker) processes before it is restarted to mitigate memory leaks |
MARKER_TIMEOUT | int | 3600 | Timeout in seconds for marker processes |
MARKER_PDFTEXT_WORKERS | int | 2 | Number of PDF text extractor workers inside marker. |
MARKER_CHUNK_SIZE | int | 10 | Split large PDFs into chunks of this many pages for parallel processing across workers. Use <= 0 to deactivate chunking. |
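For example, to process more PDFs concurrently on a node with spare GPU memory, one might raise the per-worker subprocess count. This is a sketch under the assumption of a single-node deployment; tune the values to your hardware:

```shell
MARKER_POOL_SIZE=1            # one worker per cluster node
MARKER_MAX_PROCESSES=4        # four concurrent PDFs per worker (needs GPU headroom)
MARKER_MAX_TASKS_PER_CHILD=20 # restart PDF workers periodically to limit memory leaks
MARKER_CHUNK_SIZE=10          # split large PDFs into 10-page chunks for parallelism
```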
OpenAI-Compatible OCR Loader Configuration
Modern OCR pipelines increasingly rely on VLM-based OCR models (such as DeepSeek OCR, DotsOCR, or LightOn OCR) that convert PDF pages into images and feed them into vision-language models with specialized prompts.
This loader integrates that workflow by exposing an OpenAI-compatible API that accepts PDF image pages and returns structured text produced by the OCR-VLM model in Markdown.
The parameters below configure how the OCR loader communicates with the model server, handles retries, manages concurrency, and controls model sampling behavior.
| Variable | Type | Default | Description |
|---|---|---|---|
OPENAI_LOADER_BASE_URL | string | http://openai:8000/v1 | Base URL of the OCR loader (OpenAI-compatible endpoint). |
OPENAI_LOADER_API_KEY | string | EMPTY | API key used to authenticate with the OCR service. |
OPENAI_LOADER_MODEL | string | dotsocr-model | OCR VLM model to use (e.g., DotsOCR, DeepSeek OCR, LightOn OCR). |
OPENAI_LOADER_TEMPERATURE | float | 0.2 | Sampling temperature. Lower values produce more deterministic OCR results. |
OPENAI_LOADER_TIMEOUT | int | 180 | Maximum request duration (in seconds) before timing out. |
OPENAI_LOADER_MAX_RETRIES | int | 2 | Number of retry attempts for failed OCR requests. |
OPENAI_LOADER_TOP_P | float | 0.9 | Nucleus sampling parameter that limits generation to the top-p probability mass. |
OPENAI_LOADER_CONCURRENCY_LIMIT | int | 20 | Maximum number of OCR requests processed concurrently. Useful for multi-page PDF workloads. |
Audio Loader
OpenRAG provides two deployment options for audio transcription, configurable via the AUDIOLOADER environment variable:
| Variable | Type | Default | Description |
|---|---|---|---|
AUDIOLOADER | str | LocalWhisperLoader | Specifies the audio loader implementation. Options: LocalWhisperLoader (bundled Whisper service) or OpenAIAudioLoader (external OpenAI API) |
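Switching between the two deployment options is a one-line change, as sketched below:

```shell
# Default: bundled Whisper service
AUDIOLOADER=LocalWhisperLoader

# Alternative: external OpenAI-compatible transcription API
# AUDIOLOADER=OpenAIAudioLoader
```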
Local Whisper Loader (LocalWhisperLoader)
The following options configure the local Whisper loader:
| Variable | Type | Default | Description |
|---|---|---|---|
WHISPER_MODEL | str | base | The multilingual Whisper model to use, depending on available resources. Other options: small, large, large-v3, etc. |
WHISPER_N_WORKERS | int | 3 | Number of whisper workers |
WHISPER_CONCURRENCY_PER_WORKER | int | 2 | Maximum number of audio transcription tasks processed concurrently by each Whisper worker. |
OpenAI-Compatible Audio Loader (OpenAIAudioLoader)
The OpenAIAudioLoader option allows you to use an OpenAI-compatible audio endpoint/service for transcription by providing the following variables: TRANSCRIBER_BASE_URL, TRANSCRIBER_API_KEY, and TRANSCRIBER_MODEL.
Audio is automatically segmented into chunks using silence detection; the chunks are then transcribed in parallel for optimal speed and accuracy.
Other variables related to the OpenAI-compatible endpoint:
| Variable | Type | Default | Description |
|---|---|---|---|
TRANSCRIBER_BASE_URL | str | http://transcriber:8000/v1 | Base URL for the transcriber API (OpenAI-compatible endpoint). |
TRANSCRIBER_API_KEY | str | EMPTY | Authentication key for transcriber service requests. |
TRANSCRIBER_MODEL | str | openai/whisper-large-v3-turbo | Whisper model identifier served by VLLM for speech-to-text conversion. Other options: openai/whisper-small, openai/whisper-large-v3-turbo, etc. |
TRANSCRIBER_MAX_CONCURRENT_CHUNKS | int | 20 | Maximum number of audio chunks processed simultaneously. Increasing this value improves throughput when sufficient GPU resources are available. |
TRANSCRIBER_TIMEOUT | int | 3600 | Maximum duration in seconds allowed for a single transcription request. |
USE_WHISPER_LANG_DETECTOR | bool | true | When enabled, uses a local Whisper-based language detector to identify the source audio language before transcription. |
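A sketch of pointing the audio loader at an external OpenAI-compatible transcription service; the URL and key below are placeholders:

```shell
AUDIOLOADER=OpenAIAudioLoader
TRANSCRIBER_BASE_URL=https://my-transcriber.example.com/v1  # placeholder endpoint
TRANSCRIBER_API_KEY=your-api-key                            # placeholder key
TRANSCRIBER_MODEL=openai/whisper-large-v3-turbo
TRANSCRIBER_MAX_CONCURRENT_CHUNKS=20  # raise with more GPU resources
```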
Chunking
| Variable | Type | Default | Description |
|---|---|---|---|
CHUNKER | str | recursive_splitter | Defines the chunking strategy: recursive_splitter. |
CONTEXTUAL_RETRIEVAL | bool | true | Enables contextual retrieval, which adds context to each chunk, a technique introduced by Anthropic to improve retrieval performance (Contextual Retrieval) |
CHUNK_SIZE | int | 512 | Maximum size (in characters) of each chunk. |
CHUNK_OVERLAP_RATE | float | 0.2 | Fraction of overlap between consecutive chunks. |
CONTEXTUALIZATION_TIMEOUT | int | 120 | Timeout in seconds for individual chunk contextualization LLM calls. Prevents long-running contextualization tasks from blocking the system. |
MAX_CONCURRENT_CONTEXTUALIZATION | int | 10 | Maximum number of concurrent chunk contextualization tasks. Limits parallel LLM requests to prevent CPU exhaustion during batch indexing. |
After files are converted to Markdown, only the text content is chunked. Image descriptions and Markdown tables are not chunked.
Chunker strategies:
recursive_splitter: Uses hierarchical text structure (sections, paragraphs, sentences). Based on RecursiveCharacterTextSplitter, it preserves natural boundaries whenever possible while ensuring chunks never exceed CHUNK_SIZE.
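To make the overlap arithmetic concrete: with the defaults, consecutive chunks share roughly CHUNK_SIZE × CHUNK_OVERLAP_RATE characters. A sketch of the default chunking configuration:

```shell
CHUNKER=recursive_splitter
CHUNK_SIZE=512            # max characters per chunk
CHUNK_OVERLAP_RATE=0.2    # 0.2 * 512 ≈ 102 characters shared between neighboring chunks
CONTEXTUAL_RETRIEVAL=true # adds one LLM contextualization call per chunk at indexing time
```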
Embedding
Our embedder is OpenAI-compatible and runs on a VLLM instance configured with the following variables:
| Variable | Type | Default | Description |
|---|---|---|---|
EMBEDDER_MODEL_NAME | str | jinaai/jina-embeddings-v3 | HuggingFace embedding model served by VLLM, e.g., Qwen/Qwen3-Embedding-0.6B or jinaai/jina-embeddings-v3 |
EMBEDDER_BASE_URL | str | http://vllm:8000/v1 | Base URL of the embedder (OpenAI-style). |
EMBEDDER_API_KEY | str | EMPTY | API key for authenticating embedder calls. |
MAX_MODEL_LEN | int | 8192 | Maximum context length (in tokens) supported by the embedding model. If the chunk exceeds this limit, the embedder will truncate it. |
If you prefer to use an external embedding service, simply comment out the embedder service in the docker-compose.yaml and provide the variables above in your environment.
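For instance, pointing OpenRAG at a hosted OpenAI-compatible embedding service instead of the bundled VLLM instance might look like this (the endpoint and key are placeholders):

```shell
EMBEDDER_MODEL_NAME=jinaai/jina-embeddings-v3
EMBEDDER_BASE_URL=https://embeddings.example.com/v1  # placeholder external endpoint
EMBEDDER_API_KEY=your-api-key                        # placeholder key
MAX_MODEL_LEN=8192                                   # match your model's context length
```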
Database Configuration
Our system uses two databases that work together:
Vector Database (VDB)
The vector database stores embeddings and is configured using the following environment variables:
| Variable | Type | Default | Description |
|---|---|---|---|
VDB_HOST | str | milvus | Hostname of the vector database service |
VDB_PORT | int | 19530 | Port on which the vector database listens |
VDB_CONNECTOR_NAME | str | milvus | Connector/driver to use for the vector DB. Currently only milvus is implemented |
VDB_COLLECTION_NAME | str | vdb_test | Name of the collection storing embeddings |
VDB_HYBRID_SEARCH | bool | true | Activates hybrid search (semantic similarity + keyword search) |
VDB_ENABLE_INSERTION | bool | true | Enable or disable vector database insertion. When disabled, documents are processed but not inserted into Milvus. Useful for testing. |
These variables can be overridden when using an external vector database service.
Relational Database (RDB)
The vector database implementation relies on an underlying PostgreSQL database that stores metadata about partitions and their owners (users). For more information about the data structure, see the data model.
The PostgreSQL database is configured using the following environment variables:
| Variable | Type | Default | Description |
|---|---|---|---|
POSTGRES_HOST | str | rdb | Hostname of the PostgreSQL database service |
POSTGRES_PORT | int | 5432 | Port on which the PostgreSQL database listens |
POSTGRES_USER | str | root | Username for database authentication |
POSTGRES_PASSWORD | str | root_password | Password for database authentication |
Chat Pipeline
LLM & VLM Configuration
The system uses two types of language models:
- LLM (Large Language Model): The primary model for text generation and chat interactions
- VLM (Vision Language Model): Used for describing images (see IMAGE_CAPTIONING) and, to reduce load on the primary LLM, also handles contextualization tasks (see CONTEXTUAL_RETRIEVAL)
These are external services that you must provide.
LLM Configuration
| Variable | Type | Description |
|---|---|---|
BASE_URL | str | Base URL of the LLM API endpoint |
MODEL | str | Model identifier for the LLM |
API_KEY | str | API key for authenticating with the LLM service |
LLM_SEMAPHORE | int | Maximum number of concurrent LLM requests (default: 10) |
MAX_LLM_CONTEXT_SIZE | int | Maximum context size, in tokens, for the LLM (default: 8192) |
VLM Configuration
| Variable | Type | Description |
|---|---|---|
VLM_BASE_URL | str | Base URL of the VLM API endpoint |
VLM_MODEL | str | Model identifier for the VLM |
VLM_API_KEY | str | API key for authenticating with the VLM service |
VLM_SEMAPHORE | int | Maximum number of concurrent VLM requests (default: 10) |
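Since both models are external services, a typical environment provides both endpoints. The sketch below uses placeholder URLs, model names, and keys:

```shell
# Primary chat LLM
BASE_URL=https://llm.example.com/v1   # placeholder endpoint
MODEL=my-chat-model                   # placeholder model identifier
API_KEY=your-llm-api-key

# Vision model for image captioning and chunk contextualization
VLM_BASE_URL=https://vlm.example.com/v1   # placeholder endpoint
VLM_MODEL=my-vision-model                 # placeholder model identifier
VLM_API_KEY=your-vlm-api-key
```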
Retriever Configuration
The retriever fetches relevant documents from the vector database based on query similarity. Retrieved documents are then optionally reranked to improve relevance.
| Variable | Type | Default | Description |
|---|---|---|---|
RETRIEVER_TYPE | str | single | Retrieval strategy to use. Options: single, multiQuery, hyde |
RETRIEVER_TOP_K | int | 50 | Number of documents to retrieve before reranking. |
SIMILARITY_THRESHOLD | float | 0.6 | Minimum similarity score (0.0-1.0) for document retrieval. Documents below this threshold are filtered out |
WITH_SURROUNDING_CHUNKS | bool | true | When enabled, retrieves adjacent chunks (preceding and following) for each matched document to provide additional context. |
Retrieval Strategies
| Strategy | Description |
|---|---|
| single | Standard semantic search using the original query. Fast and efficient for most queries |
| multiQuery | Generates multiple query variations to improve recall. Better coverage for ambiguous or complex questions |
| hyde | Hypothetical Document Embeddings - generates a hypothetical answer then searches for similar documents |
Reranker Configuration
The reranker enhances search quality by re-scoring and reordering retrieved documents according to their relevance to the user’s query. Two providers are supported: Infinity (default) and OpenAI-compatible endpoints.
| Variable | Type | Default | Description |
|---|---|---|---|
RERANKER_ENABLED | bool | true | Enable or disable the reranking mechanism |
RERANKER_PROVIDER | str | infinity | Reranker backend to use. Accepted values: infinity, openai |
RERANKER_MODEL | str | Alibaba-NLP/gte-multilingual-reranker-base | Model used for reranking documents. |
RERANKER_TOP_K | int | 10 | Number of top documents to return after reranking. Increase for better results if your LLM has a wider context window |
RERANKER_BASE_URL | str | http://reranker:7997 | Base URL of the reranker service |
RERANKER_API_KEY | str | EMPTY | API key for the reranker service. Required when using the openai provider |
RERANKER_SEMAPHORE | int | 5 | Maximum number of concurrent reranking requests. Adjust based on your server capacity |
Reranker Providers
| Provider | RERANKER_PROVIDER value | Description |
|---|---|---|
| Infinity | infinity | Uses the Infinity server via its native client. Default port: 7997 |
| OpenAI-compatible | openai | Uses any OpenAI-compatible reranker endpoint (e.g. vLLM, LiteLLM, TEI). Default port: 8000 |
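For example, switching from the default Infinity server to an OpenAI-compatible reranker might look like the sketch below (the endpoint is a placeholder):

```shell
RERANKER_ENABLED=true
RERANKER_PROVIDER=openai
RERANKER_BASE_URL=https://reranker.example.com:8000  # placeholder OpenAI-compatible endpoint
RERANKER_API_KEY=your-api-key                        # required for the openai provider
RERANKER_MODEL=Alibaba-NLP/gte-multilingual-reranker-base
```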
Prompts
The RAG pipeline ships with preconfigured prompts in ./prompts/example1. The following prompt templates are available in that folder.
| Template File | Purpose |
|---|---|
sys_prompt_tmpl.txt | System prompt that defines the assistant’s behavior and role |
spoken_style_answer_tmpl.txt | Template for converting responses to a more natural, conversational spoken style (oral / audio type of answer) |
query_contextualizer_tmpl.txt | Template for adding context to user queries |
chunk_contextualizer_tmpl.txt | Template for contextualizing document chunks during indexing |
image_captioning_tmpl.txt | Template for generating image descriptions using the VLM |
hyde.txt | Hypothetical Document Embeddings (HyDE) query expansion template |
multi_query_pmpt_tmpl.txt | Template for generating multiple query variations |
To customize prompts:
- Duplicate the example folder: Copy the example1 folder from ./prompts/
- Create your custom folder: Rename it to something meaningful, e.g., my_prompt
- Modify the prompts: Edit any prompt templates within your new folder
- Update configuration: Point to your custom prompts directory:

```shell
# Use custom prompts
export PROMPTS_DIR=../prompts/my_prompt
```

| Variable | Type | Default | Description |
|---|---|---|---|
PROMPTS_DIR | str | ../prompts/example1 | Path to the directory containing your prompt templates |
Logging
Our application uses Loguru with custom formatting. Log messages appear in two places:
- Terminal (stderr): Human-readable formatted output
- Log file (logs/app.json): JSON format for monitoring tools like Grafana. This file resides in the mounted folder ./logs
Log Message Format
Terminal output follows this format:
```
LEVEL | module:function:line - message [context_key=value]
```

Logging Levels & What They Mean
There are several logging levels available (TRACE, DEBUG, INFO, SUCCESS, WARNING, ERROR, CRITICAL). Only the levels intended for use in this project are documented here.
| Level | What You’ll See in Logs |
|---|---|
| WARNING | Potential issues that don’t stop execution: approaching rate limits, deprecated features used, retryable failures, configuration concerns. Review these periodically. |
| DEBUG | Detailed diagnostic information including variable states, intermediate processing steps, and function entry/exit points. Useful during development and troubleshooting. |
| INFO | Standard operational messages showing normal application behavior: server startup, request handling, major workflow stages. This is the typical production level. |
Configuration
Set the logging level via environment variable:
```shell
# Show only warnings and errors
LOG_LEVEL=WARNING

# Show detailed debug information (use in dev and pre-prod)
LOG_LEVEL=DEBUG

# Production default (informational messages)
LOG_LEVEL=INFO
```

Log File Features
- Rotation: Files rotate automatically at 10 MB
- Retention: Logs kept for 10 days
- Format: JSON for easy parsing and ingestion into monitoring systems
- Async: Queued writing (enqueue=True) prevents blocking operations
Ray is used for distributed task processing and parallel execution in the RAG pipeline. This configuration controls resource allocation, concurrency limits, and serving options.
General Ray Settings
| Variable | Type | Default | Description |
|---|---|---|---|
RAY_POOL_SIZE | int | 1 | Number of serializer actor instances (typically 1 actor per cluster node) |
RAY_MAX_TASKS_PER_WORKER | int | 8 | Maximum number of concurrent tasks (serialization tasks) per serializer actor instance |
RAY_DASHBOARD_PORT | int | 8265 | Ray Dashboard port used for monitoring. In production, comment out this line to avoid exposing the port, as it may introduce security vulnerabilities. |
| Variable | Type | Value | Description |
|---|---|---|---|
RAY_DEDUP_LOGS | number | 0 | Turns off Ray log deduplication that appears across multiple processes. Set to 0 to see all logs from each process. |
RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING | number | 1 | Enables logs at task level in the Ray dashboard for better debugging and monitoring. |
RAY_task_retry_delay_ms | number | 3000 | Delay (in milliseconds) before retrying a failed task. Controls the wait time between retry attempts. |
RAY_ENABLE_UV_RUN_RUNTIME_ENV | number | 0 | Controls UV runtime environment integration. Critical: Must be set to 0 when using the newest version of UV to avoid compatibility issues. |
RAY_memory_monitor_refresh_ms | number | 250 | Controls the frequency (in milliseconds) of memory usage checks and, if needed, task or actor termination. Setting this value to 0 disables task killing. |
Indexer Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
RAY_MAX_TASK_RETRIES | int | 2 | Number of retry attempts for failed tasks |
INDEXER_SERIALIZE_TIMEOUT | int | 36000 | Timeout in seconds for serialization operations (10 hours) |
Indexer Concurrency Groups
Controls the maximum number of concurrent operations for different indexer tasks:
| Variable | Type | Default | Description |
|---|---|---|---|
INDEXER_DEFAULT_CONCURRENCY | int | 1000 | Default concurrency limit for general operations |
INDEXER_UPDATE_CONCURRENCY | int | 100 | Maximum concurrent document update operations |
INDEXER_SERIALIZE_CONCURRENCY | int | 50 | Maximum concurrent serialization operations |
INDEXER_SEARCH_CONCURRENCY | int | 100 | Maximum concurrent search/retrieval operations |
INDEXER_DELETE_CONCURRENCY | int | 100 | Maximum concurrent document deletion operations |
INDEXER_CHUNK_CONCURRENCY | int | 1000 | Maximum concurrent document chunking operations |
INDEXER_INSERT_CONCURRENCY | int | 10 | Maximum concurrent document insertion operations |
Semaphore Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
RAY_SEMAPHORE_CONCURRENCY | int | 100000 | Global concurrency limit for Ray semaphore operations |
Ray Serve Configuration
Ray Serve enables deploying the FastAPI application as a scalable service. For a simple deployment, without the intent to scale, you can use the uvicorn deployment mode.
| Variable | Type | Default | Description |
|---|---|---|---|
ENABLE_RAY_SERVE | bool | false | Enable Ray Serve deployment mode |
RAY_SERVE_NUM_REPLICAS | int | 1 | Number of service replicas for load balancing |
RAY_SERVE_HOST | str | 0.0.0.0 | Host address for the Ray Serve deployment |
RAY_SERVE_PORT | int | 8080 | Port for the Ray Serve FastAPI endpoint |
CHAINLIT_PORT | int | 8090 | Port for the Chainlit UI when Ray Serve is enabled (ENABLE_RAY_SERVE). Otherwise, the Chainlit UI is simply a subroute (/chainlit) of the FastAPI base_url |
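A minimal sketch of enabling Ray Serve with two replicas (values are illustrative):

```shell
ENABLE_RAY_SERVE=true
RAY_SERVE_NUM_REPLICAS=2  # two replicas for load balancing
RAY_SERVE_HOST=0.0.0.0
RAY_SERVE_PORT=8080       # FastAPI endpoint served by Ray Serve
CHAINLIT_PORT=8090        # Chainlit gets its own port in this mode
```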
Web Search Configuration
Web search allows the LLM to augment RAG document context with live web results. It is disabled by default; set WEBSEARCH_API_TOKEN to enable it.
| Variable | Type | Default | Description |
|---|---|---|---|
WEBSEARCH_PROVIDER | str | staan | Web search provider to use. Currently supported: staan. |
WEBSEARCH_API_TOKEN | str | "" | API token for the web search provider. If empty, web search is disabled. |
WEBSEARCH_BASE_URL | str | (provider default) | Base URL of the web search provider API. |
WEBSEARCH_TOP_K | int | 5 | Number of web search results to return. |
WEBSEARCH_LANG | str | fr-FR | Language/market code for web search queries. |
WEBSEARCH_MAX_TOKENS | int | 2000 | Maximum token budget for all web sources combined in the LLM context. This budget is reserved from the global context window when web results are present. |
WEBSEARCH_FETCH_CONTENT | bool | true | When enabled, fetches actual page content from the top URLs instead of relying on short search snippets. |
WEBSEARCH_FETCH_MAX_RESULTS | int | 3 | Number of top URLs to fetch content from (the remaining results use their search snippet). |
WEBSEARCH_FETCH_TIMEOUT | float | 1.0 | Per-URL timeout in seconds for content fetching. URLs that don’t respond within this time fall back to their snippet. |
WEBSEARCH_FETCH_MAX_TOKENS | int | 500 | Maximum approximate tokens of content to extract per page. Content is truncated at word boundaries. |
WEBSEARCH_FETCH_VERIFY_SSL | bool | false | Whether to verify SSL certificates when fetching page content. |
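For example, enabling web search only requires a non-empty provider token; the other variables keep their defaults unless overridden (the token below is a placeholder):

```shell
WEBSEARCH_API_TOKEN=your-provider-token  # any non-empty value enables web search
WEBSEARCH_PROVIDER=staan
WEBSEARCH_TOP_K=5        # number of results returned
WEBSEARCH_LANG=fr-FR     # language/market code for queries
```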
Map & Reduce Configuration
The map & reduce mechanism processes documents by fetching chunks (map phase), filtering out irrelevant ones and summarizing relevant content (reduce phase) with respect to the user’s query. The algorithm works as follows:
- Initially fetches a batch of documents for processing
- Evaluates relevance and continues expanding the search if needed
- Stops expansion when the last MAP_REDUCE_EXPANSION_BATCH_SIZE chunks are all irrelevant
- Otherwise, continues fetching additional documents up to MAP_REDUCE_MAX_TOTAL_DOCUMENTS
When MAP_REDUCE_DEBUG is enabled, the mechanism logs detailed information to ./logs/map_reduce.md.
| Variable | Type | Default | Description |
|---|---|---|---|
MAP_REDUCE_INITIAL_BATCH_SIZE | int | 10 | Number of documents to process in the initial mapping phase |
MAP_REDUCE_EXPANSION_BATCH_SIZE | int | 5 | Number of additional documents to fetch when expanding the search (also used as the threshold for stopping) |
MAP_REDUCE_MAX_TOTAL_DOCUMENTS | int | 20 | Maximum total number of documents (chunks) to process across all iterations |
MAP_REDUCE_DEBUG | bool | true | Enable debug logging for map & reduce operations. Logs are written to ./logs/map_reduce.md |
FastAPI & Access Control
By default, our API (FastAPI) is deployed with uvicorn. You can opt to use Ray Serve for scalability (see the Ray Serve configuration).
The following environment variables configure the FastAPI server and control access permissions:
| Variable | Type | Default | Description |
|---|---|---|---|
APP_PORT | number | 8000 | Port number on which the FastAPI application listens for incoming requests. |
AUTH_TOKEN | string | EMPTY | An authentication token is required to access protected API endpoints. By default, this token corresponds to the API key of the created admin (see Admin Bootstrapping). If left empty, authentication is disabled. |
SUPER_ADMIN_MODE | boolean | false | Enables super admin privileges when set to true, granting unrestricted access to all operations and bypassing standard access controls. This is for debugging |
DEFAULT_FILE_QUOTA | int | -1 | Default per-user file quota. <0 disables quotas globally; >=0 sets the default limit when a user has no explicit quota. |
API_NUM_WORKERS | int | 1 | Number of uvicorn workers |
PREFERRED_URL_SCHEME | string | null | URL scheme (http or https) used when generating URLs in API responses (e.g., task_status_url). When running behind a reverse proxy that terminates SSL, set this to https to ensure generated URLs use the correct scheme. If unset, the scheme from the incoming request is used. |
Indexer-UI
| Variable | Type | Default | Description |
|---|---|---|---|
INCLUDE_CREDENTIALS | boolean | false | Whether the Indexer UI includes credentials in its API requests. Set to true if authentication is enabled |
INDEXERUI_PORT | number | 8060 | Port number on which the Indexer UI application runs (the documentation also mentions 3042 as another common default) |
INDEXERUI_URL | string | http://X.X.X.X:INDEXERUI_PORT | Base URL of the Indexer UI. Required to prevent CORS issues. Replace X.X.X.X with localhost (local) or your server IP, and INDEXERUI_PORT with the actual port. |
API_BASE_URL | string | http://X.X.X.X:APP_PORT | Base URL of your FastAPI backend, used by the frontend to communicate with the API. Replace X.X.X.X with localhost (local) or your server IP, and APP_PORT with your FastAPI port. |
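For a local deployment, the two URLs typically mirror the ports configured above (localhost is illustrative; use your server IP otherwise):

```shell
INDEXERUI_PORT=8060
INDEXERUI_URL=http://localhost:8060  # must match the UI's actual address to avoid CORS issues
API_BASE_URL=http://localhost:8000   # FastAPI backend (APP_PORT)
```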
Chainlit
See this for Chainlit authentication.
See this for Chainlit data persistence.