Quickstart on macOS
Docker
Prerequisites
- Docker and Docker Compose
- Hardware that meets these specifications:
  - A minimum of 24 GB of unified memory (32 GB recommended). 16 GB may work, but results vary.
  - An Apple Silicon-based Mac
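The requirements above can be checked from a terminal. The following is a minimal sketch, not an official OpenRAG script: `classify_memory` is a hypothetical helper that maps a byte count onto the thresholds listed above, and the hardware queries only run on macOS, where `sysctl -n hw.memsize` reports installed memory in bytes.

```shell
#!/bin/sh
# Classify a memory size (in bytes) against the thresholds listed above.
# Prints "recommended" (>= 32 GB), "minimum" (>= 24 GB),
# "marginal" (>= 16 GB), or "insufficient".
classify_memory() {
    gib=$(( $1 / 1024 / 1024 / 1024 ))
    if [ "$gib" -ge 32 ]; then
        echo "recommended"
    elif [ "$gib" -ge 24 ]; then
        echo "minimum"
    elif [ "$gib" -ge 16 ]; then
        echo "marginal"
    else
        echo "insufficient"
    fi
}

# Only query the hardware on macOS; the hw.memsize key does not exist elsewhere.
if [ "$(uname -s)" = "Darwin" ]; then
    echo "CPU architecture: $(uname -m) (Apple Silicon reports arm64)"
    echo "Unified memory: $(classify_memory "$(sysctl -n hw.memsize)")"
    docker --version
    docker compose version
fi
```

A result of "marginal" corresponds to the 16 GB case, which may or may not work.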
Installation
We provide precompiled Docker images for OpenRAG and its dashboard companion, Indexer-UI.
You will need the following docker-compose.yaml and .env files to get started:
# docker-compose.yaml
x-openrag: &openrag_template
  image: linagoraai/openrag:macOS_poc
  volumes:
    - ./data:/app/data
    - ./.cache/huggingface:/app/model_weights # Model weights for RAG
    - ./ray_mount/.env:/ray_mount/.env # Shared environment variables
    - ./ray_mount/logs:/app/logs
  ports:
    - 8090:8080
    # Localhost only: Ray dashboard/Jobs API is unauthenticated (CVE-2023-48022). Disable when in cluster mode
    - 127.0.0.1:${RAY_DASHBOARD_PORT:-8265}:8265
  networks:
    default:
      aliases:
        - openrag
  env_file:
    - .env
  environment:
    - APP_PORT=8090
    - AUTH_TOKEN=${AUTH_TOKEN:?Set a strong AUTH_TOKEN in your .env}
    - RERANKER_ENABLED=false
    - MARKER_MAX_PROCESSES=1
    - INDEXERUI_COMPOSE_FILE=true # Does not serve any purpose but needs to be enabled until PR is merged
    - INDEXERUI_PORT=8067 # Here as well
    - INDEXERUI_URL=http://localhost:8067 # Here as well
    - RAY_DEDUP_LOGS=0
    - RAY_ENABLE_UV_RUN_RUNTIME_ENV=0s
    - RAY_memory_monitor_refresh_ms=0
  shm_size: 10.24gb

services:
  openrag:
    <<: *openrag_template
    deploy: {}
    depends_on:
      - milvus
      - ollama

  rdb:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?Set POSTGRES_PASSWORD in your .env}
      - POSTGRES_USER=root
    volumes:
      - ./db:/var/lib/postgresql/data

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ./volumes/ollama:/root/.ollama
      - ./ollama-entrypoint.sh:/entrypoint.sh
    restart: unless-stopped
    entrypoint: ["/usr/bin/bash", "/entrypoint.sh"]

  etcd:
    image: quay.io/coreos/etcd:v3.5.16
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ./volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: ${MINIO_ACCESS_KEY:?Set MINIO_ACCESS_KEY in your .env}
      MINIO_SECRET_KEY: ${MINIO_SECRET_KEY:?Set MINIO_SECRET_KEY in your .env}
    volumes:
      - ./volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  milvus:
    image: milvusdb/milvus:v2.5.4
    command: ["milvus", "run", "standalone"]
    security_opt:
      - seccomp:unconfined
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ./volumes/milvus:/var/lib/milvus
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530"
    depends_on:
      - "etcd"
      - "minio"

  indexer-ui:
    image: linagoraai/indexer-ui:latest
    ports:
      - "8067:3000"
    environment:
      - API_BASE_URL=http://localhost:8090
      - INCLUDE_CREDENTIALS=true
    restart: unless-stopped

# .env
# LLM - For conversation
BASE_URL=
API_KEY=
MODEL=

# VLM - For image interpretation
VLM_BASE_URL=
VLM_API_KEY=
VLM_MODEL=

# EMBEDDER - For text vectorization
EMBEDDER_BASE_URL=
EMBEDDER_MODEL_NAME=
EMBEDDER_API_KEY=

Configuration
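The docker-compose.yaml above marks four variables as required with the `${VAR:?message}` form (AUTH_TOKEN, POSTGRES_PASSWORD, MINIO_ACCESS_KEY, MINIO_SECRET_KEY), so Docker Compose aborts with that message if one of them is unset or empty. The following is a minimal sketch for checking the .env file before starting the stack. It assumes the file sits in the current directory, and `missing_env_keys` is a hypothetical helper, not part of OpenRAG.

```shell
#!/bin/sh
# Print every KEY that is absent from FILE or assigned an empty value.
# Usage: missing_env_keys FILE KEY...
missing_env_keys() {
    env_file=$1
    shift
    for key in "$@"; do
        # A key counts as set only if a line "KEY=<at least one character>" exists.
        if ! grep -Eq "^${key}=.+" "$env_file"; then
            echo "$key"
        fi
    done
}

# Keys that docker-compose.yaml marks as required with ${VAR:?...}.
REQUIRED_KEYS="AUTH_TOKEN POSTGRES_PASSWORD MINIO_ACCESS_KEY MINIO_SECRET_KEY"

if [ -f .env ]; then
    # Word splitting of the key list is intended here.
    missing=$(missing_env_keys .env $REQUIRED_KEYS)
    if [ -n "$missing" ]; then
        echo "Unset in .env:" $missing
    else
        echo "All required keys are set."
    fi
fi
```

The same helper can be called with the model variables (for example BASE_URL, MODEL, VLM_MODEL, EMBEDDER_MODEL_NAME) to confirm that all three models are configured.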
Apart from the secrets that docker-compose.yaml requires in .env (AUTH_TOKEN, POSTGRES_PASSWORD, MINIO_ACCESS_KEY, and MINIO_SECRET_KEY), the only necessary configuration change is to set the model settings in the .env file. Make sure all three models are set (they can be the same one if it supports vision, language, and embedding). If using Ollama, pull the desired models locally with the ollama CLI (Ollama must be running to pull models):

ollama pull qwen3:0.6b

# .env
BASE_URL=http://ollama:11434
API_KEY=EMPTY
MODEL=qwen3:0.6b

Optimizations
As stated earlier, Docker for macOS does not support GPU acceleration. To maximize performance, we therefore recommend running Ollama or llama.cpp outside Docker, or using models served from an external server. For simplicity, we still provide a dockerized setup here.
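As an illustration of the non-dockerized option, the sketch below repoints a .env file from the dockerized Ollama service to an Ollama server running natively on the Mac. It rests on assumptions: `use_native_ollama` is a hypothetical helper, `host.docker.internal` is the hostname Docker Desktop provides for reaching the host from a container, and OpenRAG is assumed to accept this URL in the same form as the `http://ollama:11434` value shown earlier.

```shell
#!/bin/sh
# Rewrite every reference to the dockerized Ollama service in FILE so that it
# targets an Ollama server running natively on the host instead.
use_native_ollama() {
    env_file=$1
    tmp_file=$(mktemp)
    # Write to a temporary file first, then copy back, so the original file
    # keeps its permissions and is untouched if sed fails.
    sed 's#http://ollama:11434#http://host.docker.internal:11434#g' "$env_file" > "$tmp_file" &&
        cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}

# Typical sequence on the host (outside Docker), shown as comments because
# the first command starts a long-running server:
#   ollama serve &            # native server, listens on port 11434 by default
#   ollama pull qwen3:0.6b    # same example model as in the configuration step
#   use_native_ollama .env
```

With this change the ollama service in docker-compose.yaml is no longer the one answering requests, so it could be left stopped.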