Device op cuda #46
base: main
related to spack/spack#40725 Signed-off-by: Howard Pritchard <[email protected]>
Hello! The Git Commit Checker CI bot found a few problems with this PR:
ef8b526: ROCM: add missing FUNC_FUNC_FN macro
6fd216f: accelerator/rocm: regular memory behaves like unif...
955849b: Device op: pass device to lower-level op to avoid ...
3afec6b: Draft of ompi_op_select_device
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
…0_dlopen spack:fix for dlopen missing symbol problem
Signed-off-by: Joseph Schuchart <[email protected]>
If the target process is unable to execute an RDMA operation, it instructs the origin to change the communication protocol. When this happens, the origin must be informed so that it can cancel all pending RDMA operations and release the rdma_frag. Signed-off-by: George Bosilca <[email protected]>
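The origin-side handling described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `origin_req_t` type, its fields, and `origin_handle_protocol_change` are hypothetical names, and cancellation is modeled simply as resetting a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the origin's request state; the real
 * request and fragment types are not shown in this PR page. */
typedef struct {
    int   pending_rdma_ops; /* RDMA operations still outstanding */
    void *rdma_frag;        /* fragment descriptor held for RDMA */
    bool  use_rdma;         /* current protocol choice */
} origin_req_t;

/* Called at the origin when the target reports that it cannot
 * execute the RDMA operation and asks for a protocol change. */
static void origin_handle_protocol_change(origin_req_t *req)
{
    assert(req != NULL);
    req->pending_rdma_ops = 0;  /* model: drop all pending RDMA operations */
    free(req->rdma_frag);       /* release the fragment */
    req->rdma_frag = NULL;
    req->use_rdma = false;      /* continue with a non-RDMA protocol */
}
```

The point of the sketch is the ordering: pending operations are cancelled and the fragment is released before the origin continues under the new protocol, so nothing is left referencing RDMA state.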
…or allreduce recursive doubling Signed-off-by: Joseph Schuchart <[email protected]>
The accelerator component may report the availability of a single accelerator whose ID is not zero. Signed-off-by: Joseph Schuchart <[email protected]>
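One way to read this commit message: code that looks up devices should not assume IDs run from 0 to the device count minus one. The sketch below illustrates that idea under assumptions; `accel_info_t` and `device_is_available` are hypothetical names and do not come from the accelerator framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical query result: the number of devices plus their actual IDs. */
typedef struct {
    int        num_devices;
    const int *dev_ids;
} accel_info_t;

/* Check availability by searching the reported IDs rather than
 * testing dev_id < num_devices, which fails for a single device
 * whose ID is not zero. */
static bool device_is_available(const accel_info_t *info, int dev_id)
{
    assert(info != NULL);
    for (int i = 0; i < info->num_devices; ++i) {
        if (info->dev_ids[i] == dev_id) {
            return true;
        }
    }
    return false;
}
```

With a single reported device whose ID is 3, a range check against the count would wrongly accept ID 0 and reject ID 3; the lookup above handles both correctly.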
…_SUPPORT These macros are defined to either 1 or 0 Signed-off-by: Joseph Schuchart <[email protected]>
…evices Signed-off-by: Joseph Schuchart <[email protected]>
We know where source and target buffers are located, so pass the right transfer direction to the accelerator memcpy call. Signed-off-by: Joseph Schuchart <[email protected]>
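Choosing the transfer direction from known buffer locations can be sketched as below. This is an illustration under assumptions, not the PR's code: the `xfer_dir_t` values, the `HOST_DEV` convention (a device ID of -1 meaning host memory), and `pick_transfer_dir` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical direction tags; not taken from the PR. */
typedef enum {
    XFER_HTOH, /* host to host */
    XFER_HTOD, /* host to device */
    XFER_DTOH, /* device to host */
    XFER_DTOD  /* device to device */
} xfer_dir_t;

/* Hypothetical convention: -1 marks a buffer in host memory. */
#define HOST_DEV (-1)

static bool on_device(int dev_id)
{
    return dev_id != HOST_DEV;
}

/* Derive the direction from where the destination and source live,
 * instead of asking the runtime to detect it on every copy. */
static xfer_dir_t pick_transfer_dir(int dst_dev, int src_dev)
{
    assert(dst_dev >= HOST_DEV && src_dev >= HOST_DEV);
    if (on_device(src_dev)) {
        return on_device(dst_dev) ? XFER_DTOD : XFER_DTOH;
    }
    return on_device(dst_dev) ? XFER_HTOD : XFER_HTOH;
}
```

The result would then be passed to the memcpy call in place of an unspecified direction, which presumably spares the runtime from inspecting both pointers.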
11e6d2a to 7bb4b95
Trial PR for feedback from @bosilca. It probably needs some more cleanup, but feedback on the design is appreciated. Commits will be squashed later.
Currently this only implements offloading for allreduce algorithms. Offloading for rooted reduce is still missing.