Seeking ContainerOS-friendly Driver Deployment with CDI Path Consistency #1116

Open
@wqlparallel

Hi,

We're implementing GPU support on ContainerOS-based Kubernetes nodes with two critical requirements:

  1. Security-driven path isolation: All driver files (.so libraries, binaries, configs) must reside under a single host path (e.g., /usr/local/nvidia/driver), avoiding writes to /usr/lib or /etc, per ContainerOS hardening policies

  2. CDI path preservation: nvidia-ctk cdi generate must produce container paths identical to standard gpu-operator output for compatibility
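
To make requirement 2 concrete, here is a minimal sketch of a check that two CDI specs expose the same container paths. It assumes both specs have already been loaded into dictionaries (from YAML or JSON) and that they follow the CDI layout of a top-level containerEdits plus per-device containerEdits, each holding a mounts list of hostPath/containerPath entries. The function names and sample paths are hypothetical, chosen only for illustration.

```python
def container_paths(spec):
    """Collect every containerPath mounted by a CDI spec (spec-wide and per-device)."""
    edits = [spec.get("containerEdits") or {}]
    edits += [dev.get("containerEdits") or {} for dev in spec.get("devices") or []]
    return {m["containerPath"] for e in edits for m in e.get("mounts") or []}


def path_diff(reference, candidate):
    """Return (missing, extra): container paths the candidate lacks or adds."""
    ref, cand = container_paths(reference), container_paths(candidate)
    return ref - cand, cand - ref


# Hypothetical specs: same container path, different host locations.
standard = {"containerEdits": {"mounts": [
    {"hostPath": "/run/nvidia/driver/usr/lib/x86_64-linux-gnu/libcuda.so",
     "containerPath": "/usr/lib/x86_64-linux-gnu/libcuda.so"},
]}}
isolated = {"containerEdits": {"mounts": [
    {"hostPath": "/usr/local/nvidia/driver/usr/lib/x86_64-linux-gnu/libcuda.so",
     "containerPath": "/usr/lib/x86_64-linux-gnu/libcuda.so"},
]}}

missing, extra = path_diff(standard, isolated)
print("missing:", sorted(missing), "extra:", sorted(extra))
```

Both result sets being empty is what "identical container paths" means here; host paths are deliberately ignored by the comparison.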

Current workaround (inefficient):

  1. Build the driver in a privileged container

  2. Copy the entire container rootfs (about 1GB of bloat) to the isolated host directory

  3. Load the kernel modules manually
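
Step 2 is where the bloat comes from. One possible way to narrow it, sketched here under the assumption that a CDI spec generated against the full rootfs lists the needed user-space files as mount hostPath entries, is to turn those entries into a copy manifest. The function name and the /build/rootfs source path are hypothetical; files the spec does not mount (kernel modules, for example) and symlink chains would still need separate handling.

```python
import posixpath


def copy_manifest(spec, src_root, dst_root):
    """Map each mounted host file under src_root to its target under dst_root."""
    edits = [spec.get("containerEdits") or {}]
    edits += [dev.get("containerEdits") or {} for dev in spec.get("devices") or []]
    pairs = set()
    for e in edits:
        for m in e.get("mounts") or []:
            rel = posixpath.relpath(m["hostPath"], src_root)
            if rel == ".." or rel.startswith("../"):
                continue  # host path lies outside src_root; not part of the driver tree
            pairs.add((m["hostPath"], posixpath.join(dst_root, rel)))
    return sorted(pairs)


# Hypothetical spec generated against a driver rootfs at /build/rootfs.
spec = {"containerEdits": {"mounts": [
    {"hostPath": "/build/rootfs/usr/bin/nvidia-smi",
     "containerPath": "/usr/bin/nvidia-smi"},
]}}

for src, dst in copy_manifest(spec, "/build/rootfs", "/usr/local/nvidia/driver"):
    print(src, "->", dst)
```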

Key questions:

  1. Path mapping: Is there a supported configuration (e.g., env vars, config files) to:
  • Specify a custom host driver path while
  • Maintaining the default container paths in CDI specs (like /usr/lib/x86_64-linux-gnu/libcuda.so)?

  2. Minimal file set: Are there tools or patterns to identify and copy only driver-essential files (avoiding 1GB+ of OS cruft) while keeping CDI generation intact?

  3. Production-grade alternatives: Any recommendations for:

  • Single-directory driver containment
  • CDI path consistency
  • A driver lifecycle decoupled from containers?
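
For question 1, the desired end state can be illustrated with a minimal sketch that post-processes an already loaded CDI spec: every mount's hostPath is moved under the custom driver directory while containerPath is left untouched. This is only an illustration of the mapping being asked about, not a description of what nvidia-ctk does; the function name is hypothetical, and whether the tooling can emit such a spec directly is exactly the open question.

```python
import copy
import posixpath


def reroot_host_paths(spec, driver_root):
    """Return a copy of spec with mount hostPaths placed under driver_root."""
    out = copy.deepcopy(spec)  # leave the caller's spec unmodified
    edits = [out.get("containerEdits") or {}]
    edits += [dev.get("containerEdits") or {} for dev in out.get("devices") or []]
    for e in edits:
        for m in e.get("mounts") or []:
            # Only the host side moves; containerPath keeps its default value.
            m["hostPath"] = posixpath.join(driver_root, m["hostPath"].lstrip("/"))
    return out


spec = {"containerEdits": {"mounts": [
    {"hostPath": "/usr/lib/x86_64-linux-gnu/libcuda.so",
     "containerPath": "/usr/lib/x86_64-linux-gnu/libcuda.so"},
]}}

rerooted = reroot_host_paths(spec, "/usr/local/nvidia/driver")
mount = rerooted["containerEdits"]["mounts"][0]
print(mount["hostPath"], "->", mount["containerPath"])
```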

Would appreciate insights on achieving secure path isolation without sacrificing CDI compatibility or disk efficiency!
