Deploying OpenRAG on Kubernetes
This guide explains how to deploy the OpenRAG stack on a Kubernetes cluster using Helm.
Prerequisites
- A Kubernetes cluster with GPU nodes available (NVIDIA runtime) and nvidia-gpu-operator installed.
- A StorageClass that supports the ReadWriteMany (`RWX`) access mode. This is required because the Ray cluster workers and the OpenRAG app need to access the same shared volumes (e.g. for `.venv`, model weights, logs, data).
- If using ingress, the ingress-nginx controller needs to be installed on the cluster.
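The prerequisites above can be spot-checked from a shell before installing. The sketch below assumes `kubectl` is already configured for the target cluster. The function names are hypothetical helpers, not part of OpenRAG, and `ingress-nginx` is only the controller's usual default namespace, so adjust it if your installation differs.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenRAG) for spot-checking the prerequisites.

# List nodes with the number of allocatable NVIDIA GPUs each one reports.
# Nodes without GPUs show <none> in the GPU column.
list_gpu_nodes() {
  kubectl get nodes \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# List StorageClasses and their provisioners. RWX support depends on the
# provisioner, so confirm it in the provisioner's own documentation.
list_storage_classes() {
  kubectl get storageclass
}

# List RuntimeClasses so you can confirm an NVIDIA one is present.
list_runtime_classes() {
  kubectl get runtimeclass
}

# Show ingress-nginx controller pods.
# ASSUMPTION: the controller runs in the default "ingress-nginx" namespace.
list_ingress_controller() {
  kubectl get pods -n ingress-nginx
}
```

Source the file and call each function in turn (for example `list_gpu_nodes`), then read the output by hand. None of these commands proves that RWX volumes actually work; they only show what the cluster advertises.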
- Create a `values.yaml` file:
  - Copy or create a new `values.yaml` at the root of your repo.
  - You can see the full example file inside the chart: ../charts/openrag-stack/values.yaml
  - Customize the values you need (e.g., image tags, resources, ingress host, storage class, environment variables, secrets).
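If you do not have the chart sources checked out, one possible way to get a starting `values.yaml` is to dump the defaults from the published chart. This is a sketch that assumes Helm 3.8 or later (for OCI registry support) and that the published chart's defaults match the example file referenced above. `fetch_default_values` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: write the chart's default values to a file for editing.
# Usage: fetch_default_values <chart-version> <output-file>
fetch_default_values() {
  helm show values oci://ghcr.io/linagora/openrag-stack --version "$1" > "$2"
}
```

For example, `fetch_default_values 0.1.0 ./values.yaml` writes the defaults for chart version 0.1.0 to `./values.yaml`, which you can then trim down to the keys you actually override.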
- Set environment and secrets:
  - Edit the `env.config` and `env.secrets` sections in your `values.yaml`.
  - Secrets (API keys, tokens, Hugging Face credentials, etc.) will be mounted into the cluster as Kubernetes secrets.
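As an illustration of the environment step, the sketch below writes the `env.config` and `env.secrets` sections to a separate override file so that credentials can stay out of version control. It assumes both sections are flat maps of `NAME: value` pairs, which the source does not confirm. `LOG_LEVEL`, `HF_TOKEN`, and the file name `values.env.yaml` are placeholders; check the chart's example `values.yaml` for the real keys.

```shell
#!/bin/sh
# ASSUMPTION: env.config and env.secrets are flat maps of NAME: value pairs.
# LOG_LEVEL and HF_TOKEN are placeholder names, not confirmed chart keys.
(
  umask 077   # create the file readable by the current user only
  cat > values.env.yaml <<EOF
env:
  config:
    LOG_LEVEL: "info"
  secrets:
    HF_TOKEN: "${HF_TOKEN:-changeme}"
EOF
)
```

Helm accepts several `-f` flags and gives priority to the right-most file, so the override can be passed after the main file, for example `-f ./values.yaml -f ./values.env.yaml`.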
- Install or upgrade the release from GHCR:

      helm upgrade \
        --install openrag oci://ghcr.io/linagora/openrag-stack \
        -f ./values.yaml \
        --version 0.1.0

  - `openrag` is the Helm release name.
  - `oci://ghcr.io/linagora/openrag-stack` is the remote chart location.
  - `-f ./values.yaml` specifies your custom configuration.
  - `--version 0.1.0` ensures you deploy a specific chart version.
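After the install or upgrade command returns, a few standard Helm and kubectl commands can confirm what was deployed. This is a minimal sketch that assumes the release was installed into the current namespace. `check_release` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: inspect the "openrag" release after helm upgrade --install.
# ASSUMPTION: the release lives in the current kubectl/helm namespace.
check_release() {
  helm status openrag        # release state and deployed chart version
  helm history openrag       # list of revisions, useful before a rollback
  kubectl get pods -o wide   # pod status and node placement
}
```

Pods that stay in `Pending` are worth describing with `kubectl describe pod <name>`, since unschedulable GPU requests and unbound RWX volume claims both surface there.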
- If using a public IP instead of a hostname, you can leave `ingress.host` empty in your `values.yaml`. The ingress will then match all hosts.
- If you later configure a hostname + TLS (via cert-manager), just update `ingress.host` and redeploy.
- Ensure your GPU nodes have the correct NVIDIA drivers and `nvidiaRuntimeClass` configured.
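The hostname change mentioned above can also be tried from the command line before editing `values.yaml`. The sketch below reuses the install command from this guide and overrides `ingress.host` with Helm's `--set` flag. `set_ingress_host` is a hypothetical helper name and `rag.example.com` is a placeholder hostname. Any TLS-related keys are chart-specific and are not shown here.

```shell
#!/bin/sh
# Hypothetical helper: redeploy the release with a different ingress host.
# Usage: set_ingress_host <hostname>
set_ingress_host() {
  helm upgrade \
    --install openrag oci://ghcr.io/linagora/openrag-stack \
    -f ./values.yaml \
    --version 0.1.0 \
    --set "ingress.host=$1"
}
```

For example, `set_ingress_host rag.example.com`. A value passed with `--set` takes precedence over the file for that one run only, so record the final hostname in `values.yaml` to keep later deployments consistent.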