[WIP] Add RAPIDS Nightly to GPU CI #436

Draft: wants to merge 5 commits into base: main
Changes from 1 commit
tc
Signed-off-by: Praateek <[email protected]>
praateekmahajan committed Dec 17, 2024
commit 28c84bfefc2a5c86c87eded9ddde28700c86d1c4
8 changes: 6 additions & 2 deletions .github/workflows/gpuci.yml
@@ -68,6 +68,7 @@ jobs:
if [ "$(docker ps -aq -f name=${{ env.CONTAINER_NAME }})" ]; then
docker rm -f ${{ env.CONTAINER_NAME }}
fi

# This runs the container which was pushed by build-container, which we call "nemo-curator-container"
# `--gpus all` ensures that all of the GPUs from our self-hosted-azure runner are available in the container
# We use "github.run_id" to identify the PR with the commits we want to run the PyTests with
@@ -80,9 +81,12 @@ jobs:

# In the virtual environment (called "curator") we created in the container,
# list all of our packages. Useful for debugging
-- name: Verify installations
+# Expect `whoami` to be "azureuser"
+# Expect `nvidia-smi` to show our 2 A100 GPUs
+- name: Check GPUs + Verify installations
run: |
echo "Checking system user:"
whoami
docker exec ${{ env.CONTAINER_NAME }} whoami
echo "Checking GPU availability:"
docker exec ${{ env.CONTAINER_NAME }} nvidia-smi
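
Review note: the comment in this step says `nvidia-smi` is expected to show 2 A100 GPUs, but the step only prints the output. A minimal sketch of turning that expectation into a check could look like the following. The helper name `check_gpu_count` is hypothetical and is not part of this PR; it assumes the input has one line per GPU, as produced by `nvidia-smi --query-gpu=name --format=csv,noheader`.

```shell
# Hypothetical helper (not in this PR): fail when the number of GPUs
# listed on stdin differs from the expected count.
# Assumes one non-empty line per GPU on stdin.
check_gpu_count() {
  expected="$1"
  # grep -c . counts non-empty lines; it exits 1 on zero matches,
  # so "|| true" keeps the count of 0 without aborting under `set -e`.
  actual="$(grep -c . || true)"
  if [ "$actual" -ne "$expected" ]; then
    echo "Expected $expected GPUs, found $actual" >&2
    return 1
  fi
  echo "Found $actual GPUs"
}
```

In the workflow this might be used as `docker exec <container> nvidia-smi --query-gpu=name --format=csv,noheader | check_gpu_count 2`, where `<container>` stands for the value of `env.CONTAINER_NAME`.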
@@ -100,4 +104,4 @@ jobs:
# Thus, we use `docker rm` to permanently removed it from the system
- name: Cleanup
if: always()
-run: docker rm -f ${{ env.CONTAINER_NAME }} || true
+run: docker stop ${{ env.CONTAINER_NAME }} && docker rm ${{ env.CONTAINER_NAME }}
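
Review note: because the Cleanup step runs with `if: always()`, it can execute even when the container was never started. In that case `docker stop ... && docker rm ...` exits non-zero and fails the step, whereas `docker rm -f ... || true` does not. A minimal sketch of a cleanup that tolerates a missing container, reusing the `docker ps -aq -f name=...` check that already appears earlier in this workflow, might be as follows. The function name `cleanup_container` and the `DOCKER` override variable are hypothetical and exist only so the logic can be dry-run without Docker.

```shell
# Hypothetical helper (not in this PR): remove a container only if it exists.
# DOCKER is an illustrative override so the function can be exercised
# with a stub; it defaults to the real docker CLI.
cleanup_container() {
  name="$1"
  docker_bin="${DOCKER:-docker}"
  # `docker ps -aq -f name=...` prints matching container IDs, or nothing.
  if [ -n "$("$docker_bin" ps -aq -f "name=$name")" ]; then
    # `rm -f` stops a running container and removes it in one call.
    "$docker_bin" rm -f "$name"
  fi
}
```

Called as `cleanup_container "$CONTAINER_NAME"` (assuming the container name is exported to the shell), this returns success whether or not the container is present.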