Description
Hi,
We're implementing GPU support on ContainerOS-based Kubernetes nodes with two critical requirements:
- Security-driven path isolation: all driver files (shared libraries, binaries, configs) must reside under a single host path (e.g., /usr/local/nvidia/driver), avoiding writes to /usr/lib or /etc per ContainerOS hardening policies
- CDI path preservation: nvidia-ctk cdi generate must produce container paths identical to standard gpu-operator output for compatibility (see the sketch below)
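To make the second requirement concrete, this is roughly the shape of the spec we'd want nvidia-ctk cdi generate to emit. The driver version, device name, and paths below are illustrative examples, not actual output from our nodes:

```yaml
# Illustrative fragment of the desired CDI spec (versions/paths are placeholders).
cdiVersion: "0.5.0"
kind: nvidia.com/gpu
containerEdits:
  mounts:
    # Host side lives under the isolated driver root...
    - hostPath: /usr/local/nvidia/driver/usr/lib/x86_64-linux-gnu/libcuda.so.550.54.14
      # ...but the container still sees the standard library path.
      containerPath: /usr/lib/x86_64-linux-gnu/libcuda.so.550.54.14
      options: ["ro", "nosuid", "nodev", "bind"]
devices:
  - name: "0"
    containerEdits:
      deviceNodes:
        - path: /dev/nvidia0
```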
Current workaround (inefficient):
- Build the driver in a privileged container
- Copy the entire container rootfs (~1 GB of bloat) to the isolated host directory
- Load the kernel modules manually
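For reference, the workaround looks roughly like the sketch below. The image name, build script, and module paths are placeholders for our internal build, not anything shipped by NVIDIA:

```sh
# Rough sketch of the current workaround (names/paths are placeholders).
DRIVER_ROOT=/usr/local/nvidia/driver
DRIVER_IMAGE=registry.example.com/nvidia-driver-build:latest   # hypothetical internal image

# 1. Build the driver inside a privileged container.
docker run --privileged --name driver-build "${DRIVER_IMAGE}" /build-driver.sh

# 2. Export the *entire* rootfs into the isolated host directory (~1 GB, mostly OS files).
mkdir -p "${DRIVER_ROOT}"
docker export driver-build | tar -x -C "${DRIVER_ROOT}"

# 3. Load the kernel modules by hand from the copied tree.
insmod "${DRIVER_ROOT}/lib/modules/$(uname -r)/kernel/drivers/video/nvidia.ko"
insmod "${DRIVER_ROOT}/lib/modules/$(uname -r)/kernel/drivers/video/nvidia-uvm.ko"
```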
Key questions:
- Path mapping: Is there a supported configuration (e.g., env vars, config files) to:
  - specify a custom host driver path, while
  - maintaining the default container paths in CDI specs (like /usr/lib/x86_64-linux-gnu/libcuda.so)?
- Minimal file set: Are there tools/patterns to identify and copy only the driver-essential files (avoiding 1 GB+ of OS cruft) while keeping CDI generation intact? (See the sketch after this list for the direction we're considering.)
- Production-grade alternatives: Any recommendations for:
  - single-directory driver containment
  - CDI path consistency
  - a driver lifecycle decoupled from containers?
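On the minimal file set question, the direction we're exploring is sketched below: copy only the NVIDIA userspace libraries, binaries, firmware, and kernel modules out of the build container's rootfs, preserving relative paths, then regenerate the CDI spec and compare it against a stock gpu-operator node. The glob patterns are our own guesses at what counts as "driver-essential", not an authoritative list:

```sh
# Sketch only: copy just the NVIDIA driver payload, not the whole OS rootfs.
# The patterns below are assumptions about which files the driver actually needs.
SRC=/tmp/driver-rootfs            # exported build-container rootfs (placeholder)
DRIVER_ROOT=/usr/local/nvidia/driver

cd "${SRC}"
find . \( -name 'libcuda*' -o -name 'libnvidia*' -o -name 'libnvcuvid*' \
          -o -name 'nvidia-smi' -o -name 'nvidia-modprobe' -o -name 'nvidia-persistenced' \
          -o -path '*/firmware/nvidia/*' \
          -o -path '*/lib/modules/*/nvidia*.ko' \) -print0 |
  rsync -a --files-from=- --from0 . "${DRIVER_ROOT}/"

# Regenerate the CDI spec afterwards and diff it against a stock gpu-operator node
# to confirm the container paths are unchanged.
nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```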
Would appreciate insights on achieving secure path isolation without sacrificing CDI compatibility or disk efficiency!