
Quick Start

OpenRAG is an open-source Retrieval-Augmented Generation (RAG) solution. This guide is a step-by-step walkthrough to help you get started with OpenRAG.

Prerequisites

  • Docker and Docker Compose
  • Hardware meeting these specifications:
    • CPU deployment: at least 13 GiB of RAM for light PDF parsers (PyMuPDF4LLMLoader, PyMuPDFLoader), or 23 GiB of RAM for heavier parsers such as MarkerLoader (refer to this section for details)
    • GPU deployment: 16 GB of GPU memory recommended (for systems with separate CPU and GPU memory)
Cloning the OpenRAG repository
git clone --recurse-submodules git@github.com:linagora/openrag.git
cd openrag/
git checkout main # or a given release

Create a .env file at the root of the project, mirroring the structure of .env.example, to configure your environment and fill in the blank environment variables.

Creating the .env file mirroring .env.example
cp .env.example .env

Here is a brief overview of key environment variables to configure:

.env
# LLM
BASE_URL=
API_KEY=
MODEL=
# VLM (Visual Language Model); you can set these to the same values as the LLM if your LLM supports images
VLM_BASE_URL=
VLM_API_KEY=
VLM_MODEL=
## FastAPI App (no need to change it)
# APP_PORT=8080 # this is the forwarded port
# API_NUM_WORKERS=1 # Number of uvicorn workers for the FastAPI app
## To enable API HTTP authentication via HTTPBearer
# AUTH_TOKEN=sk-openrag-1234
# SAVE_UPLOADED_FILES=true # useful for source viewing in the Chainlit chat interface
## Set to true to mount the Chainlit chat UI on the FastAPI app (default: true)
# WITH_CHAINLIT_UI=true
# EMBEDDER
EMBEDDER_MODEL_NAME=jinaai/jina-embeddings-v3 # or another Hugging Face embedder compatible with vLLM
# EMBEDDER_BASE_URL=http://vllm:8000/v1
# EMBEDDER_API_KEY=EMPTY
# RETRIEVER
# RETRIEVER_TOP_K=20 # number of top documents to retrieve before reranking (lower (~10) is faster on CPU; on GPU you can try a higher value (~40))
# RERANKER
RERANKER_ENABLED=true # disable the reranker if your CPU is not powerful enough
RERANKER_MODEL=Alibaba-NLP/gte-multilingual-reranker-base # or jinaai/jina-reranker-v2-base-multilingual
# Prompts
PROMPTS_DIR=../prompts/example3_en # change to ../prompts/example3 for French prompts
# Ray
RAY_DEDUP_LOGS=0 # turn off Ray's deduplication of logs that appear across multiple processes
RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING=1 # enable task-level logs in the Ray dashboard
RAY_task_retry_delay_ms=3000
RAY_ENABLE_UV_RUN_RUNTIME_ENV=0 # critical with the newest versions of uv
# Indexer UI
## 1. Replace X.X.X.X with localhost if launching locally, or with your server IP
## 2. Used by the frontend: replace APP_PORT (8080 by default) with the actual port number of your FastAPI backend
## 3. Replace INDEXERUI_PORT in INDEXERUI_URL with its value (e.g. http://localhost:3042)
INCLUDE_CREDENTIALS=false # set to true if FastAPI authentication is enabled, i.e. AUTH_TOKEN is set
INDEXERUI_PORT=3042 # port to expose the Indexer UI (default is 3042)
INDEXERUI_URL='http://X.X.X.X:INDEXERUI_PORT'
API_BASE_URL='http://X.X.X.X:APP_PORT' # base URL of your FastAPI backend
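Before starting the stack, it can help to confirm that no required variable was left blank. Here is a minimal sketch of such a check; the variable list and the example file are illustrative, not part of OpenRAG:

```shell
# Illustrative check: list variables that are still blank in an env file.
# We run it against a small example fragment instead of your real .env.
cat > /tmp/example.env <<'EOF'
BASE_URL=https://api.example.com/v1
API_KEY=
MODEL=my-model
EOF

missing=""
for var in BASE_URL API_KEY MODEL; do
  value=$(grep -E "^${var}=" /tmp/example.env | cut -d= -f2-)
  [ -z "$value" ] && missing="$missing $var"
done
echo "blank variables:$missing"   # prints: blank variables: API_KEY
```

Point the same loop at your real .env and extend the variable list to cover the settings you rely on.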

All supported file format parsers are pre-configured. For PDF processing, MarkerLoader serves as the default parser, offering comprehensive support for OCR-scanned documents, complex layouts, tables, and embedded images. MarkerLoader operates efficiently in both GPU and CPU environments.

The OpenRAG repository contains a ready-to-use docker-compose.yml file in the quick_start folder. This setup is ideal for local testing and quick deployments.

  • quick_start/
    • extern/ (reranker and embedder utils)
      • vllm/ (CPU Dockerfiles for different architectures)
        • Dockerfile.cpu (for x86 CPUs)
      • infinity.yaml (reranker service)
    • vdb/
      • milvus.yaml
    • docker-compose.yml
    • .env (your configured .env file)
  1. Navigate to the quick_start directory (or copy it elsewhere)
  2. Place your .env file in the quick_start folder
  3. Run the appropriate command for your system:

GPU deployment, recommended for optimal performance

docker compose up -d
# run the following command to stop the application
# docker compose down

For development builds, add the --build flag to rebuild the images. Execute these commands from the root of the cloned repository:

  • .github/
  • .hydra_config/
  • extern/
  • vdb/
  • docker-compose.yml
  • README.md
  • .env.example
  • pyproject.toml
  • uv.lock
  • .env (your configured .env file)

GPU deployment

docker compose up -d --build
# run the following command to stop the application
# docker compose down

Once the app is up and running, you can access the provided services. See the next section.
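To confirm the API is actually listening, here is a quick curl sketch. It assumes the default APP_PORT of 8080 and relies on FastAPI serving its interactive docs at /docs (FastAPI's default unless docs are disabled):

```shell
# Reachability check for the FastAPI app (adjust APP_URL if you changed APP_PORT)
APP_URL="http://localhost:8080"
if curl -fsS --max-time 5 "$APP_URL/docs" > /dev/null 2>&1; then
  echo "OpenRAG API is reachable at $APP_URL"
else
  echo "OpenRAG API is NOT reachable at $APP_URL"
fi
```

If you set AUTH_TOKEN, API routes require a Bearer token, but the /docs page itself should still load for this check.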

Deployment with Ansible

Clone the OpenRAG repository:

git clone https://github.com/linagora/openrag.git
cd openrag

Run the provided deployment script and follow the instructions:

./ansible/deploy.sh