# Quick Start
OpenRAG is an open-source Retrieval-Augmented Generation (RAG) solution. This guide is a step-by-step walkthrough to help you get started with OpenRAG.
## Docker

### Prerequisites

- Docker and Docker Compose
- Your hardware should meet these specifications:
  - CPU deployment: minimum 13 GiB RAM for light PDF parsers (`PyMuPDF4LLMLoader`, `PyMuPDFLoader`), or 23 GiB RAM for heavier parsers like `MarkerLoader` (refer to this section for details)
  - GPU deployment: 16 GB GPU memory recommended (for systems with separate CPU and GPU memory)
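Before launching, you can sanity-check the host against these requirements. The sketch below is a hypothetical pre-flight check, not part of OpenRAG; the `/proc/meminfo` read is Linux-specific (on macOS, use `sysctl hw.memsize` instead).

```shell
# Hypothetical pre-flight check (not shipped with OpenRAG): verify Docker is
# installed and that total RAM meets the guidance above.

if command -v docker >/dev/null 2>&1; then
  echo "docker: found"
else
  echo "docker: MISSING - install Docker and Docker Compose first"
fi

# Total RAM in GiB (/proc/meminfo reports KiB; Linux only)
mem_gib=$(awk '/MemTotal/ {printf "%d", $2 / 1048576}' /proc/meminfo 2>/dev/null || echo 0)
echo "total RAM: ${mem_gib} GiB"

if [ "$mem_gib" -ge 23 ]; then
  echo "RAM: enough for heavy parsers like MarkerLoader"
elif [ "$mem_gib" -ge 13 ]; then
  echo "RAM: enough for light PDF parsers only"
else
  echo "RAM: below the 13 GiB minimum for CPU deployment"
fi
```

The thresholds (13 and 23 GiB) mirror the prerequisites list above; adjust them if the requirements change.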
### Installation and Configuration

#### 1. Clone the repository

```shell
git clone --recurse-submodules git@github.com:linagora/openrag.git
cd openrag/
git checkout main # or a given release
```

#### 2. Create a .env File
Create a `.env` file at the root of the project, mirroring the structure of `.env.example`, and fill in the blank environment variables to configure your environment.
```shell
cp .env.example .env
```

Here is a brief overview of key environment variables to configure:
```shell
# LLM
BASE_URL=
API_KEY=
MODEL=

# VLM (Visual Language Model); you can set it to the same as the LLM if your LLM supports images
VLM_BASE_URL=
VLM_API_KEY=
VLM_MODEL=

## FastAPI app (no need to change it)
# APP_PORT=8080 # this is the forwarded port
# API_NUM_WORKERS=1 # number of uvicorn workers for the FastAPI app

## To enable API HTTP authentication via HTTPBearer
# AUTH_TOKEN=sk-openrag-1234

# SAVE_UPLOADED_FILES=true # useful for Chainlit (chat interface) source viewing

# Set to true to mount the Chainlit chat UI on the FastAPI app (default: true)
## WITH_CHAINLIT_UI=true

# EMBEDDER
EMBEDDER_MODEL_NAME=jinaai/jina-embeddings-v3 # or another Hugging Face embedder compatible with vLLM
# EMBEDDER_BASE_URL=http://vllm:8000/v1
# EMBEDDER_API_KEY=EMPTY

# RETRIEVER
# RETRIEVER_TOP_K=20 # number of top documents to retrieve before reranking (lower (~10) is faster on CPU; on GPU you can try increasing it (~40))

# RERANKER
RERANKER_ENABLED=true # deactivate the reranker if your CPU is not powerful enough
RERANKER_MODEL=Alibaba-NLP/gte-multilingual-reranker-base # or jinaai/jina-reranker-v2-base-multilingual

# Prompts
PROMPTS_DIR=../prompts/example3_en # change to ../prompts/example3 for French prompts

# Ray
RAY_DEDUP_LOGS=0 # turns off Ray's deduplication of logs that appear across multiple processes
RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING=1 # enable task-level logs in the Ray dashboard
RAY_task_retry_delay_ms=3000
RAY_ENABLE_UV_RUN_RUNTIME_ENV=0 # critical with the newest version of uv

# Indexer UI
## 1. Replace X.X.X.X with localhost if launching locally, or with your server IP
## 2. Used by the frontend: replace APP_PORT (8080 by default) with the actual port number of your FastAPI backend
## 3. Replace INDEXERUI_PORT with its value in the INDEXERUI_URL variable
INCLUDE_CREDENTIALS=false # set to true if FastAPI authentication is enabled, i.e. AUTH_TOKEN is set
INDEXERUI_PORT=3042 # port to expose the Indexer UI (default is 3042)
INDEXERUI_URL='http://X.X.X.X:INDEXERUI_PORT'
API_BASE_URL='http://X.X.X.X:APP_PORT' # base URL of your FastAPI backend
```
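It is easy to launch the stack with a required variable still blank. The snippet below is a hypothetical sanity-check sketch (not part of OpenRAG); `check_env` and the demo file path are names invented for illustration.

```shell
# Hypothetical helper (not shipped with OpenRAG): report any variables in a
# .env-style file that are present but left empty.
check_env() {
  file="$1"; shift
  missing=0
  for var in "$@"; do
    val=$(grep -E "^${var}=" "$file" | head -n1 | cut -d= -f2-)
    if [ -z "$val" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return $missing
}

# Demo: a throwaway .env with MODEL left blank
printf 'BASE_URL=https://api.example.com/v1\nAPI_KEY=sk-test\nMODEL=\n' > /tmp/demo.env
check_env /tmp/demo.env BASE_URL API_KEY MODEL || echo "fill in the blanks before launching"
```

Run it against your real `.env` with the variables you rely on (e.g. `check_env .env BASE_URL API_KEY MODEL EMBEDDER_MODEL_NAME`) before starting the containers.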
#### 3. File Parser configuration

All supported file format parsers are pre-configured. For PDF processing, `MarkerLoader` serves as the default parser, offering comprehensive support for OCR-scanned documents, complex layouts, tables, and embedded images. `MarkerLoader` operates efficiently in both GPU and CPU environments.
### Deployment

If the Indexer UI (a web interface for intuitive document ingestion, indexing, and management) is not already configured in your `.env`, follow the dedicated guide:

➡ Deploy with Indexer UI
#### Simple and quick launch for testing

The OpenRAG repository contains a ready-to-use `docker-compose.yml` file in the `quick_start` folder. This setup is ideal for local testing and quick deployments.

```
quick_start/
├── extern/              # reranker and embedder utils
│   ├── vllm/            # CPU Dockerfiles for different architectures
│   │   └── Dockerfile.cpu   # for x86 CPUs
│   └── infinity.yaml    # reranker service
├── vdb/
│   └── milvus.yaml
├── docker-compose.yml
└── .env                 # the configured .env file
```
1. Navigate to the `quick_start` directory, or copy it elsewhere
2. Place your `.env` file in the `quick_start` folder
3. Run the appropriate command for your system:
**GPU deployment (recommended for optimal performance):**

```shell
docker compose up -d

# run the following command to stop the application
# docker compose down
```

**CPU deployment:**

```shell
docker compose --profile cpu up -d

# to stop the application
# docker compose --profile cpu down
```
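After `docker compose up -d` returns, the FastAPI backend may still take a moment to start accepting requests. Below is a hedged helper sketch (not part of OpenRAG; `wait_for_port` is a name invented here) that polls a port until it answers over HTTP:

```shell
# Hypothetical helper (not shipped with OpenRAG): poll an HTTP port until it
# answers or the retries run out. Any HTTP response (even 404) counts as "up".
wait_for_port() {
  # usage: wait_for_port <retries> <host> <port>
  retries=$1; host=$2; port=$3
  i=0
  while [ "$i" -lt "$retries" ]; do
    if curl -s -o /dev/null --max-time 2 "http://$host:$port/"; then
      echo "port $port is answering"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "port $port did not answer after $retries attempts"
  return 1
}

# Example: wait up to ~30s for the FastAPI app on APP_PORT (8080 by default)
# wait_for_port 30 localhost 8080
```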
#### Development Environment

For development builds, use the `--build` flag to rebuild images. Execute these commands from the root directory of the cloned repository:
```
openrag/
├── .github/
│   └── …
├── .hydra_config/
│   └── …
├── extern/
│   └── …
├── vdb/
│   └── …
├── …
├── docker-compose.yml
├── README.md
├── .env.example
├── pyproject.toml
├── uv.lock
└── .env    # the configured .env file
```
**GPU deployment:**

```shell
docker compose up -d --build

# run the following command to stop the application
# docker compose down
```

**CPU deployment:**

```shell
docker compose --profile cpu up -d --build

# to stop the application
# docker compose --profile cpu down
```

Once the app is up and running, you can access the provided services. See the next section.
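As a quick smoke test, you can probe the backend from the host. This assumes the default `APP_PORT=8080` from `.env`, and relies on the fact that FastAPI serves interactive API docs at `/docs` by default; adjust host and port to your deployment.

```shell
# Print the HTTP status of the FastAPI interactive docs page.
# Assumes APP_PORT=8080 (the default in .env).
# "000" means the server is not reachable (yet).
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/docs

# The Indexer UI, if enabled, listens on INDEXERUI_PORT (3042 by default).
```

A `200` confirms the API is serving; anything else suggests the containers are still starting or the port mapping differs from the defaults.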
## Ansible

Clone the OpenRAG repository:

```shell
git clone https://github.com/linagora/openrag.git
cd openrag
```

Run the provided deployment script and follow the instructions:

```shell
./ansible/deploy.sh
```