Skip to content

Deploying OpenRAG on Kubernetes

This guide explains how to deploy the OpenRAG stack on a Kubernetes cluster using Helm.


  • A Kubernetes cluster with GPU nodes available (NVIDIA runtime) and nvidia-gpu-operator installed.
  • A StorageClass that supports ReadWriteMany (RWX) access mode.
    This is required because the Ray cluster workers and the OpenRAG app need to access the same shared volumes (e.g. for .venv, model weights, logs, data).
  • If using ingress, the ingress-nginx controller needs to be installed on the cluster.

  1. Create a values.yaml file:

    • Copy or create a new values.yaml at the root of your repo.
    • You can see the full example file inside the chart: ../charts/openrag-stack/values.yaml
    • Customize the values you need (e.g., image tags, resources, ingress host, storage class, environment variables, secrets).
  2. Set environment and secrets:

    • Edit the env.config and env.secrets sections in your values.yaml.
    • Secrets (API keys, tokens, Hugging Face credentials, etc.) will be mounted into the cluster as Kubernetes secrets.
  3. Install or upgrade the release from GHCR:

    Terminal window
    helm upgrade\
    --install openrag oci://ghcr.io/linagora/openrag-stack\
    -f ./values.yaml\
    --version 0.1.0
    • openrag is the Helm release name.
    • oci://ghcr.io/linagora/openrag-stack is the remote chart location.
    • -f ./values.yaml specifies your custom configuration.
    • --version 0.1.0 ensures you deploy a specific chart version.

  • If using a public IP instead of a hostname, you can leave ingress.host empty in your values.yaml.
    The ingress will then match all hosts.

  • If you later configure a hostname + TLS (via cert-manager), just update ingress.host and redeploy.

  • Ensure your GPU nodes have the correct NVIDIA drivers and nvidia RuntimeClass configured.