
Quickstart on macOS

To get started you will need:

  • Docker and Docker Compose
  • Hardware meeting these specifications:
    • An Apple Silicon-based Mac
    • A minimum of 24 GB of unified memory (32 GB recommended); 16 GB may work with varying degrees of success

We provide precompiled Docker images for OpenRAG and its dashboard companion, Indexer-UI.

You will need the following docker-compose.yaml and .env files to get started:

docker-compose.yaml

```yaml
x-openrag: &openrag_template
  image: linagoraai/openrag:macOS_poc
  volumes:
    - ./data:/app/data
    - ./.cache/huggingface:/app/model_weights # Model weights for RAG
    - ./ray_mount/.env:/ray_mount/.env # Shared environment variables
    - ./ray_mount/logs:/app/logs
  ports:
    - 8090:8080
    - 8265:8265 # Disable when in cluster mode
  networks:
    default:
      aliases:
        - openrag
  env_file:
    - .env
  environment:
    - APP_PORT=8090
    - AUTH_TOKEN=OpenRAG
    - RERANKER_ENABLED=false
    - MARKER_MAX_PROCESSES=1
    - INDEXERUI_COMPOSE_FILE=true # Does not serve any purpose but needs to be enabled until PR is merged
    - INDEXERUI_PORT=8067 # Here as well
    - INDEXERUI_URL=http://localhost:8067 # Here as well
    - RAY_DEDUP_LOGS=0
    - RAY_ENABLE_UV_RUN_RUNTIME_ENV=0
    - RAY_memory_monitor_refresh_ms=0
  shm_size: 10.24gb

services:
  openrag:
    <<: *openrag_template
    deploy: {}
    depends_on:
      - milvus
      - ollama

  rdb:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=root
      - POSTGRES_USER=root
    volumes:
      - ./db:/var/lib/postgresql/data

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ./volumes/ollama:/root/.ollama
      - ./ollama-entrypoint.sh:/entrypoint.sh
    restart: unless-stopped
    entrypoint: ["/usr/bin/bash", "/entrypoint.sh"]

  etcd:
    image: quay.io/coreos/etcd:v3.5.16
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ./volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    volumes:
      - ./volumes/minio:/minio_data
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  milvus:
    image: milvusdb/milvus:v2.5.4
    command: ["milvus", "run", "standalone"]
    security_opt:
      - seccomp:unconfined
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ./volumes/milvus:/var/lib/milvus
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "19530:19530"
    depends_on:
      - "etcd"
      - "minio"

  indexer-ui:
    image: linagoraai/indexer-ui:latest
    ports:
      - "8067:3000"
    environment:
      - API_BASE_URL=http://localhost:8090
      - INCLUDE_CREDENTIALS=true
    restart: unless-stopped
```
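After starting the stack with `docker compose up -d`, a quick way to confirm the services came up is to probe the host-mapped ports from the compose file. This is an illustrative sketch, not part of OpenRAG itself; the port list simply mirrors the mappings above.

```python
# Probe the host-side ports mapped in docker-compose.yaml.
# The service names and ports below mirror the compose file above;
# adjust them if you changed the mappings.
import socket

SERVICES = {
    "openrag": 8090,         # APP_PORT (mapped from container port 8080)
    "ray-dashboard": 8265,
    "ollama": 11434,
    "milvus": 19530,
    "indexer-ui": 8067,
}

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if port_open("127.0.0.1", port) else "down"
        print(f"{name:14s} :{port}  {status}")
```

A port reported "down" usually means the container is still starting (Milvus in particular has a 90 s health-check grace period) or failed; check `docker compose logs <service>` in that case.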

By default, the only necessary configuration change is to set the model settings in the .env file. Make sure all three models are set (they can be the same model if it supports vision, language, and embedding). If using ollama, pull the desired models locally with the ollama CLI (keep in mind that ollama needs to be running to pull models):

Pulling models with ollama

```shell
ollama pull qwen3:0.6b
```

.env

```
BASE_URL=http://ollama:11434
API_KEY=EMPTY
MODEL=qwen3:0.6b
```
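Since a missing model setting is the most common startup problem, it can help to sanity-check the .env before bringing the stack up. The sketch below is a hypothetical helper, not part of OpenRAG; it assumes the three variables named above (BASE_URL, API_KEY, MODEL) and a simple KEY=VALUE file format.

```python
# Sanity-check that .env defines the three model settings before startup.
# REQUIRED mirrors the variables shown in the .env example above.
from pathlib import Path

REQUIRED = ("BASE_URL", "API_KEY", "MODEL")

def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [k for k in REQUIRED if not env.get(k)]

if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        problems = missing_keys(env_path.read_text())
        print("OK" if not problems else f"missing: {problems}")
    else:
        print("no .env found in the current directory")
```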

As stated earlier, Docker for macOS does not support GPU acceleration. To maximize performance, we therefore recommend a non-dockerized installation of ollama or llama.cpp, or running models on an external server. For simplicity, we still provide a dockerized setup here.
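If you do run ollama natively on the host instead of in the container, the other containers can usually reach it through Docker Desktop's host.docker.internal alias. A sketch of the corresponding .env change, assuming ollama listens on its default port 11434:

```
# .env when ollama runs directly on the host (example values)
BASE_URL=http://host.docker.internal:11434
API_KEY=EMPTY
MODEL=qwen3:0.6b
```

In that setup you can also drop the `ollama` service (and the corresponding `depends_on` entry) from the compose file.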