# ⚡ Distributed Deployment in a Ray Cluster

This guide explains how to deploy OpenRAG across multiple machines using Ray for distributed indexing and processing.
## ✅ 1. Set Environment Variables

Ensure your `.env` file includes the standard app variables plus the Ray-specific ones listed below:
```bash
# Ray
# Resources for all files
RAY_NUM_GPUS=0.1
RAY_POOL_SIZE=1
RAY_MAX_TASKS_PER_WORKER=5

# PDF-specific resources when using marker
MARKER_MAX_TASKS_PER_CHILD=10
MARKER_MAX_PROCESSES=5 # Number of subprocesses <-> number of concurrent PDFs per worker
MARKER_MIN_PROCESSES=3 # Minimum number of subprocesses available before triggering a process pool reset
MARKER_POOL_SIZE=1 # Number of workers (typically 1 worker per cluster node)
MARKER_NUM_GPUS=0.6

SHARED_ENV=/ray_mount/.env
RAY_DASHBOARD_PORT=8265
RAY_ADDRESS=ray://X.X.X.X:10001
HEAD_NODE_IP=X.X.X.X
RAY_HEAD_ADDRESS=X.X.X.X:6379
# RAY_ENABLE_RECORD_ACTOR_TASK_LOGGING=1 # to enable logs at task level in the Ray dashboard
RAY_task_retry_delay_ms=3000

# Ray volumes
DATA_VOLUME=/ray_mount/data
MODEL_WEIGHTS_VOLUME=/ray_mount/model_weights
CONFIG_VOLUME=/ray_mount/.hydra_config
UV_LINK_MODE=copy
UV_CACHE_DIR=/tmp/uv-cache
```
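Note that the `*_NUM_GPUS` values appear to be Ray fractional resource requests, so they bound per-GPU concurrency: with `MARKER_NUM_GPUS=0.6`, at most one marker worker fits on a single-GPU node, while `RAY_NUM_GPUS=0.1` lets up to ten lighter tasks share one GPU.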
✅ Use host IPs instead of Docker service names:

```bash
EMBEDDER_BASE_URL=http://<HOST-IP>:8000/v1 # ✅ instead of http://vllm:8000/v1
VDB_HOST=<HOST-IP> # ✅ instead of VDB_HOST=milvus
```
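If you're unsure which address to use, one way to list a host's IPs (assuming Linux hosts) is shown below; pick the interface that the other nodes can reach:

```bash
hostname -I  # prints every IP address assigned to this host
```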
## 📁 2. Set Up Shared Storage

All nodes need to access shared configuration and data folders.
We recommend using GlusterFS for this.
➡ Follow the GlusterFS Setup Guide to configure:
- Shared access to:
  - `.env`
  - `.hydra_config/`
  - `data/` (uploaded files)
  - `model_weights/` (embedding model cache)
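A quick sanity check that the shared mount is visible on every node (the paths match the volumes configured in step 1):

```bash
# Run on each node; all entries should appear with identical contents
ls -la /ray_mount
# Expected: .env  .hydra_config/  data/  model_weights/
```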
## 🚀 3. Start the Ray Cluster

First, prepare your `cluster.yaml` file. Here's an example for a local provider:
```yaml
cluster_name: rag-cluster

provider:
  type: local
  head_ip: 10.0.0.1
  worker_ips: [10.0.0.2] # Static IPs of other nodes (does not auto-start workers)

docker:
  image: ghcr.io/linagora/openrag-ray
  pull_before_run: true
  container_name: ray_node
  run_options:
    - --gpus all
    - -v /ray_mount/model_weights:/app/model_weights
    - -v /ray_mount/data:/app/data
    - -v /ray_mount/.hydra_config:/app/.hydra_config
    - -v /ray_mount/logs:/app/logs
    - --env-file /ray_mount/.env

auth:
  ssh_user: ubuntu
  ssh_private_key: path/to/private/key # Replace with your actual SSH key path

head_start_ray_commands:
  - uv run ray stop
  - uv run ray start --head --dashboard-host 0.0.0.0 --dashboard-port ${RAY_DASHBOARD_PORT:-8265} --node-ip-address ${HEAD_NODE_IP} --autoscaling-config=~/ray_bootstrap_config.yaml

worker_start_ray_commands:
  - uv run ray stop
  - uv run ray start --address ${HEAD_NODE_IP:-10.0.0.1}:6379
```

🛠️ The base image (`ghcr.io/linagora/openrag-ray`) must be built from `Dockerfile.ray` and pushed to a container registry before use.
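For reference, building and publishing that image might look like this (a sketch; the tag mirrors the `image` field in `cluster.yaml`, so swap in your own registry path):

```bash
# Build the Ray node image from the repo root, then push it to your registry
docker build -f Dockerfile.ray -t ghcr.io/linagora/openrag-ray .
docker push ghcr.io/linagora/openrag-ray
```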
### ⬆️ Launch the cluster

```bash
uv run ray up -y cluster.yaml
```
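To check that the nodes actually joined, the standard Ray cluster-launcher commands can help (run from the directory containing `cluster.yaml`); the dashboard should also be reachable at `http://<HEAD_NODE_IP>:8265`:

```bash
uv run ray monitor cluster.yaml  # tail the autoscaler logs
uv run ray attach cluster.yaml   # open a shell on the head node
```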
## 🐳 4. Launch the OpenRAG App

Use the Docker Compose setup:

```bash
docker compose up -d
```

Once running, OpenRAG will auto-connect to the Ray cluster using `RAY_ADDRESS` from `.env`.
With this setup, your app is now fully distributed and ready to handle concurrent tasks across your Ray cluster.
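As an optional smoke test, you could submit a trivial job through the Ray Jobs API to confirm the head node and its workers are reachable (a sketch: `<HEAD_NODE_IP>` is a placeholder, and the plain `python` entrypoint assumes Python is on the PATH inside the cluster image):

```bash
# Should print the number of nodes currently registered with the cluster
uv run ray job submit --address http://<HEAD_NODE_IP>:8265 -- \
  python -c "import ray; ray.init(); print(len(ray.nodes()), 'nodes connected')"
```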
## 🛠️ Troubleshooting

### ❌ Permission Denied Errors

If you encounter errors like `Permission denied` when Ray or Docker tries to access shared folders (SQL database, model files, …), it's likely due to insufficient permissions on the host system.
👉 To resolve this, you can set full read/write/execute permissions on the shared directory:
```bash
sudo chmod -R 777 /ray_mount
```
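If you'd rather not make the tree world-writable, a narrower alternative is to hand ownership to the UID/GID the containers run as (the `1000:1000` below is an assumption; check your image's user):

```bash
# Assumed container UID/GID; verify with `docker exec <container> id`
sudo chown -R 1000:1000 /ray_mount
sudo chmod -R u+rwX,g+rwX /ray_mount
```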