scx_lavd: Donate tasks at ops.select_cpu() and ops.enqueue(). #1879

Merged
merged 1 commit into from
May 14, 2025

Conversation

multics69
Contributor

When the system is underloaded, task stealing does not work well because the DSQ is almost always empty, so there is nothing to steal.

To address this problem, when the system is underloaded, perform task donation in the ops.select_cpu() and ops.enqueue() paths.

More specifically, try task donation when the sticky domain is a stealee domain (i.e., relatively overloaded) and there is no fully idle CPU on the sticky domain while picking an idle CPU at ops.select_cpu() and ops.enqueue(). In this case, traverse neighbor domains in distance order, find a fully idle CPU, and migrate the task to the domain that has the fully idle CPU.
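
The donation rule described above can be illustrated with a minimal userspace sketch in plain C. This is not the scx_lavd code: `struct dom`, `find_fully_idle_cpu()`, and `pick_cpu_with_donation()` are hypothetical names invented for illustration, the per-domain neighbor list is assumed to be pre-sorted by distance, and the caller is assumed to have already determined that the system is underloaded.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DOMS         8
#define MAX_CPUS_PER_DOM 8

/* Hypothetical, simplified model of a compute domain (not the scx_lavd struct). */
struct dom {
	bool is_stealee;                   /* relatively overloaded domain */
	int  nr_cpus;
	int  cpus[MAX_CPUS_PER_DOM];       /* CPU ids in this domain */
	bool fully_idle[MAX_CPUS_PER_DOM]; /* CPU and all its SMT siblings are idle */
	int  nr_neighbors;
	int  neighbors[MAX_DOMS];          /* other domain ids, nearest first (assumed) */
};

/* Return the first fully idle CPU id in the domain, or -1 if there is none. */
static int find_fully_idle_cpu(const struct dom *d)
{
	for (int i = 0; i < d->nr_cpus; i++)
		if (d->fully_idle[i])
			return d->cpus[i];
	return -1;
}

/*
 * Pick a fully idle CPU for a task whose sticky domain is `sticky`.
 * Prefer the sticky domain. If it has no fully idle CPU and it is a
 * stealee domain, walk its neighbors in distance order and donate the
 * task to the first domain that has a fully idle CPU.
 * Returns the CPU id, or -1 if none was found; *dom_out is the chosen domain.
 */
static int pick_cpu_with_donation(const struct dom *doms, int nr_doms,
				  int sticky, int *dom_out)
{
	assert(nr_doms > 0 && nr_doms <= MAX_DOMS);
	assert(sticky >= 0 && sticky < nr_doms);

	const struct dom *sd = &doms[sticky];
	int cpu = find_fully_idle_cpu(sd);

	*dom_out = sticky;
	if (cpu >= 0)
		return cpu;        /* no donation needed */
	if (!sd->is_stealee)
		return -1;         /* only stealee domains donate */

	for (int i = 0; i < sd->nr_neighbors; i++) {
		int nid = sd->neighbors[i];

		cpu = find_fully_idle_cpu(&doms[nid]);
		if (cpu >= 0) {
			*dom_out = nid; /* migrate the task to this domain */
			return cpu;
		}
	}
	return -1;
}
```

In this sketch, a return value of -1 simply means the caller falls back to whatever idle-CPU selection it would have done otherwise; how the real scheduler handles that case is not stated in this PR description.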

@multics69 multics69 requested review from arighi, htejun and hodgesds May 14, 2025 05:54
Signed-off-by: Changwoo Min <[email protected]>
@marioroy

==> Starting prepare()...
2c9155d7 scx_lavd: Donate tasks at ops.select_cpu() and ops.enqueue().

Running first with affinity, then without. I can see an improvement in htop.

scx-scheds-git 1.0.12.r63.ged4296e7-1

$ ./algorithm3.pl 1e12 --threads=20 --procbind
Primes found: 37607912018
Seconds: 23.446

# before 28.656
$ ./algorithm3.pl 1e12 --threads=20
Primes found: 37607912018
Seconds: 24.610

The more CPU-bound primesieve.pl:

$ ./primesieve.pl 2e12 --threads=20 --procbind
Primes found: 73301896139
Seconds: 18.038

# before 20.346
$ ./primesieve.pl 2e12 --threads=20
Primes found: 73301896139
Seconds: 19.172

@marioroy

marioroy commented May 14, 2025

I recently updated the mce-sandbox repo and increased the factor in primesieve.pl, so that algorithm3.pl is the chunky workload and primesieve.pl is the more CPU-bound one.

marioroy/mce-sandbox@aad158c

Adding Chrome WebGL Aquarium to the mix, let me compare LAVD and Rustland on algorithm3.pl and primesieve.pl. One is chunky, the other more CPU-bound.

algorithm3.pl

# cpu affinity to capture optimal performance
$ ./algorithm3.pl 1e12 --threads=20 --procbind
25.092  bore      fps: 52

# without CPU affinity
$ ./algorithm3.pl 1e12 --threads=20 
24.135  rustland  fps: 60
25.058  lavd      fps: 60

primesieve.pl

# cpu affinity to capture optimal performance
$ ./primesieve.pl 2e12 --threads=20 --procbind
18.878  bore      fps: 42~44

# without CPU affinity
$ ./primesieve.pl 2e12 --threads=20 
19.141  rustland  fps: 42~60
21.636  lavd      fps: 42~53

The PR does not work as well for primesieve.pl. In htop, it favors SMT siblings more.

My Google Chrome launcher applies CPU affinity to CPUs 8-15 (primary cores) and 40-47 (siblings). I removed the taskset command that pins Chrome to those CPUs.

EXECCMD=/opt/google/chrome/google-chrome
exec taskset -c 8-15,40-47 "$EXECCMD" ...

Now primesieve.pl performs similarly to Rustland.

$ ./primesieve.pl 2e12 --threads=20
Primes found: 73301896139
Seconds: 19.076

@multics69
Contributor Author

Thank you super much, @marioroy, for the testing! In the case of pinned Chrome and primesieve.pl, I will dig deeper into what happens when task pinning is involved. I will probably handle it in a separate PR as an improvement to this one.

@multics69 multics69 added this pull request to the merge queue May 14, 2025
Merged via the queue into sched-ext:main with commit 0bca464 May 14, 2025
16 checks passed
@multics69 multics69 deleted the lavd-lb-smt-opt branch May 14, 2025 14:58
4 participants