Failure to Initialize Container: unsatisfied condition: cuda>=12.6 #6465


Closed
Jack-Khuu opened this issue Mar 25, 2025 · 3 comments

Comments

Jack-Khuu commented Mar 25, 2025

GPU-based workflows fail after bumping the CUDA requirement from 12.4 -> 12.6 in torchchat.
Would love some help updating the driver, or suggestions on how to update the test config.

Example Run: https://github.com/pytorch/torchchat/actions/runs/14053926197/job/39349546495

docker: Error response from daemon: failed to create task for container: failed to create shim task: 
OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running prestart hook #0: exit status 1, stdout: , 

stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.6, 
please update your driver to a newer version, or use an earlier cuda container: unknown.
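For reference, the "CUDA Version" shown in the nvidia-smi header is the newest CUDA runtime the installed driver supports, so printing it on the runner would confirm whether the driver tops out at 12.4. A minimal sketch (the `debug-gpu-driver` job name and running directly on the g5 runner label are assumptions for illustration, not part of the existing workflow):

```yaml
  debug-gpu-driver:
    # Hypothetical diagnostic job: report the runner's NVIDIA driver version
    # and the highest CUDA version that driver supports.
    runs-on: linux.g5.4xlarge.nvidia.gpu
    steps:
      - name: Report NVIDIA driver / supported CUDA
        run: |
          nvidia-smi                                                   # header prints "CUDA Version: 12.x"
          nvidia-smi --query-gpu=driver_version --format=csv,noheader  # bare driver version
```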

Test Runner Config: https://github.com/pytorch/torchchat/blob/fea361f6cce0b1cdd54cc211dde19266753b60fc/.github/workflows/more-tests.yml#L11-L19

  test-cuda:
    permissions:
      id-token: write
      contents: read
    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
    with:
      runner: linux.g5.4xlarge.nvidia.gpu
      gpu-arch-type: cuda
      gpu-arch-version: "12.6"
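As an interim workaround along the lines of the error's "use an earlier cuda container" suggestion, the job could be pinned back to the previously working toolkit version until the runner driver is updated. A sketch, assuming 12.4 images are still available to linux_job_v2:

```yaml
  test-cuda:
    permissions:
      id-token: write
      contents: read
    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
    with:
      runner: linux.g5.4xlarge.nvidia.gpu
      gpu-arch-type: cuda
      gpu-arch-version: "12.4"  # roll back to 12.4 until the runner driver supports CUDA 12.6
```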

Related past issue: #5191

@clee2000 (Contributor) commented:

cc @atalman? @seemethere I'm not sure who owns nova.

I think it needs to use a newer CUDA driver, which would require changes to linux_job_v2 to take an input for this, but I also see pytorch/pytorch jobs named cuda12.6 that show 12.4 when nvidia-smi is run. Does the 12.6 on pytorch/pytorch just mean that the binary was built with 12.6 but not necessarily run on 12.6?
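If linux_job_v2 were extended that way, the call site might look roughly like the sketch below. Note that `driver-version` is a hypothetical input invented here for illustration; linux_job_v2.yml does not currently document such a parameter, and the value is only an example of a driver series new enough for CUDA 12.6:

```yaml
  test-cuda:
    uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
    with:
      runner: linux.g5.4xlarge.nvidia.gpu
      gpu-arch-type: cuda
      gpu-arch-version: "12.6"
      # Hypothetical input: linux_job_v2.yml would have to be changed to accept this
      # and provision/select a driver new enough for CUDA 12.6 on the runner.
      driver-version: "560"
```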

@HonestDeng commented:

> cc @atalman? @seemethere I'm not sure who owns nova.
>
> I think it needs to use a newer CUDA driver, which would require changes to linux_job_v2 to take an input for this, but I also see pytorch/pytorch jobs named cuda12.6 that show 12.4 when nvidia-smi is run. Does the 12.6 on pytorch/pytorch just mean that the binary was built with 12.6 but not necessarily run on 12.6?

Hi. I'm the owner of "Update CI Jobs in anticipation for Cuda 12.4 deprecation" (pytorch/torchchat#1515).

I'm new to torchchat and a bit confused by what you said. Do you mean we should update linux_job_v2.yml to use a newer CUDA driver?

Thanks.

@Jack-Khuu (Author) commented:

Ah, thanks @clee2000

@HonestDeng I'll follow up with you on Discord 😃
