Add pinned_memory and non_blocking transfer for default collate_fn #52948

srinathk10 · 2025-05-12T21:52:32Z

Why are these changes needed?

Add pinned_memory and non_blocking transfer for default collate_fn

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(


raulchen · 2025-05-13T19:43:01Z

python/ray/data/collate_fn.py

@@ -213,6 +213,13 @@ def __call__(self, batch: "pyarrow.Table") -> Dict[str, List["torch.Tensor"]]:
        # However, for CPU transfer, we need to combine the chunked arrays first
        # before converting to numpy format and then to Tensors.
        combine_chunks = self.device.type == "cpu"
+
+        # If the device is CPU, we don't need to pin the memory.
+        pin_memory = self.device.type != "cpu"


We should probably expose this argument to users. Pinning memory isn't always better (e.g., when there are many small batches).

srinathk10 · 2025-05-13

Yeah, I'm seeing overhead from pinning at batch size = 32.
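One way the suggestion in this thread could look is sketched below. It is an illustration only, not the code in this PR: `TransferOptions`, `resolve_transfer_options`, and `move_to_device` are hypothetical names, and tying `non_blocking` to pinning is an assumption of the sketch. The only PyTorch calls used are `torch.cuda.is_available()`, `Tensor.pin_memory()`, and `Tensor.to(device, non_blocking=...)`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class TransferOptions:
    """Hypothetical container for host-to-device transfer settings."""

    pin_memory: bool
    non_blocking: bool


def resolve_transfer_options(
    device_type: str, pin_memory: Optional[bool] = None
) -> TransferOptions:
    """Pick transfer settings, letting the caller override pinning.

    pin_memory=None keeps the default from the diff above (pin for any
    non-CPU device); True or False is an explicit user choice.
    """
    if device_type == "cpu":
        # No host-to-device copy happens, so pinning would only add overhead.
        return TransferOptions(pin_memory=False, non_blocking=False)
    use_pinned = True if pin_memory is None else pin_memory
    # Assumption of this sketch: request a non-blocking copy only when the
    # source is pinned, which is the case PyTorch documents as asynchronous.
    return TransferOptions(pin_memory=use_pinned, non_blocking=use_pinned)


def move_to_device(tensor: Any, device: Any, opts: TransferOptions) -> Any:
    """Move a CPU torch.Tensor to `device` using the resolved options."""
    import torch  # imported lazily so the option logic runs without PyTorch

    if opts.pin_memory and torch.cuda.is_available():
        # Copy into page-locked host memory before the device transfer.
        tensor = tensor.pin_memory()
    return tensor.to(device, non_blocking=opts.non_blocking)
```

With a setup like this, a caller who sees pinning overhead on small batches (such as the batch size of 32 mentioned above) could pass `pin_memory=False`, while leaving the argument unset keeps the default of pinning for non-CPU devices.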

srinathk10 and others added 14 commits May 9, 2025 18:59

Train Tests: Disable cgroup isolation on head node for benchmarking

c18a63c

Signed-off-by: Srinath Krishnamachari <[email protected]>

Merge branch 'master' into srinathk10-train-test-cgroups-compute-config

ab514a5

Train Tests: Fix custom collate fn; include warmup time

7707f41

Signed-off-by: Srinath Krishnamachari <[email protected]>

Merge branch 'master' into srinathk10-train-fix-collate-fn

2d31486

Train Tests: Fix custom collate fn; include warmup time

4a10695

Signed-off-by: Srinath Krishnamachari <[email protected]>

Merge branch 'master' into srinathk10-train-fix-collate-fn

6d71ec4

Lint

1acbf57

Signed-off-by: Srinath Krishnamachari <[email protected]>

Misc fixes

3cab00f

Signed-off-by: Srinath Krishnamachari <[email protected]>

Apply suggestions from code review

ef7302c

Co-authored-by: lanbochen-anyscale <[email protected]>
Signed-off-by: srinathk10 <[email protected]>

Merge branch 'master' into srinathk10-train-test-cgroups-compute-config

749a739

Merge branch 'master' into srinathk10-train-fix-collate-fn

ad27d53

Merge branch 'srinathk10-train-test-cgroups-compute-config' into srinathk10-train-fix-collate-fn

e4b628a

Misc fixes

73263ad

Signed-off-by: Srinath Krishnamachari <[email protected]>

Add pinned_memory and non_blocking transfer for default collate_fn

27f7242

Signed-off-by: Srinath Krishnamachari <[email protected]>

srinathk10 requested review from hongpeng-guo, justinvyu, matthewdeng, raulchen, woshiyyya and a team as code owners May 12, 2025 21:52

srinathk10 changed the base branch from master to srinathk10-train-fix-collate-fn May 12, 2025 22:04

srinathk10 added 2 commits May 12, 2025 15:05

Update factory.py

c2afb07

Signed-off-by: srinathk10 <[email protected]>

Merge branch 'srinathk10-train-fix-collate-fn' into srinathk10-pinned-memory

aacde34

Signed-off-by: srinathk10 <[email protected]>

srinathk10 changed the base branch from srinathk10-train-fix-collate-fn to master May 12, 2025 22:47

srinathk10 and others added 4 commits May 12, 2025 15:47

Merge branch 'master' into srinathk10-pinned-memory

06fb331

Reverts

0f4021d

Signed-off-by: Srinath Krishnamachari <[email protected]>

Merge branch 'master' into srinathk10-pinned-memory

5611e52

Merge branch 'master' into srinathk10-pinned-memory

58bb4f3

srinathk10 marked this pull request as draft May 13, 2025 00:34

srinathk10 added the "go" label (add ONLY when ready to merge, run all tests) May 13, 2025

Merge branch 'master' into srinathk10-pinned-memory

33eb5bb

Signed-off-by: srinathk10 <[email protected]>

raulchen reviewed May 13, 2025
