docker: clamp CPU shares to minimum of 2 #26081

Merged on Jun 19, 2025

Conversation

tgross (Member) commented Jun 19, 2025

In #25963 we added normalization of CPU shares for large hosts where the total compute is larger than the maximum CPU shares value. But if the result after normalization is less than 2, the conversion runc performs for the cgroups v2 `cpu.weight` hits an integer overflow. The shared executor used by the `exec`/`rawexec` drivers already prevents this by clamping to the safe minimum value. This PR does the same for the `docker` driver and adds test coverage of the clamp for the shared executor too.

Fixes: #26080
Fixes: https://hashicorp.atlassian.net/browse/NMD-858
Ref: #25963
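
To make the failure mode concrete, here is a minimal sketch of the normalize-then-clamp behavior described above. It is not the code from this PR: `normalizeCPUShares`, `clampCPUShares`, and `sharesToWeight` are hypothetical names, the proportional scaling is an assumption about what #25963 does, and the conversion formula is the shares-to-weight mapping as I recall it from runc's libcontainer at the time (other runc versions may differ). The bounds 2 and 262144 are the kernel's limits for cgroups v1 `cpu.shares`.

```go
package main

import "fmt"

const (
	// Kernel limits for cgroups v1 cpu.shares.
	minCPUShares uint64 = 2
	maxCPUShares uint64 = 262_144
)

// clampCPUShares is a hypothetical helper that forces shares into the
// range the kernel (and the shares-to-weight conversion) can handle.
func clampCPUShares(shares uint64) uint64 {
	if shares < minCPUShares {
		return minCPUShares
	}
	if shares > maxCPUShares {
		return maxCPUShares
	}
	return shares
}

// normalizeCPUShares assumes a simple proportional scaling when the host's
// total compute exceeds the maximum shares value, followed by the clamp.
func normalizeCPUShares(requested, totalCompute uint64) uint64 {
	if totalCompute > maxCPUShares {
		requested = requested * maxCPUShares / totalCompute
	}
	return clampCPUShares(requested)
}

// sharesToWeight mirrors the conversion to a cgroups v2 cpu.weight as
// assumed here. For shares == 1 the unsigned subtraction wraps around,
// producing a value far outside the valid cpu.weight range of 1..10000.
func sharesToWeight(shares uint64) uint64 {
	if shares == 0 {
		return 0
	}
	return 1 + ((shares-2)*9999)/262142
}

func main() {
	// A tiny request on a very large host normalizes to 0 before clamping.
	clamped := normalizeCPUShares(1, 1_000_000)
	fmt.Println("clamped shares:", clamped, "weight:", sharesToWeight(clamped))
	fmt.Println("weight without clamp for shares=1:", sharesToWeight(1))
}
```

The clamp sits after the scaling step so that every caller gets a safe value regardless of how small the request is relative to the host.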


@tgross force-pushed the NMD858-cgroup-integer-overflow branch from 238e17b to 9ede0a5 on June 19, 2025 15:40
@tgross added the backport/ent/1.8.x+ent, backport/ent/1.9.x+ent, and backport/1.10.x labels on Jun 19, 2025
@tgross marked this pull request as ready for review on June 19, 2025 16:00
@tgross requested review from a team as code owners on June 19, 2025 16:00
tgross (Member Author) commented Jun 19, 2025

Just for my peace of mind, I ran half an hour of fuzz testing and also tested a few big windows of 10k consecutive request values against every total compute value from `hw.MHz(1)` to `hw.MHz(5_000_000)`. All the resulting values fall into the acceptable range.
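
A windowed check like the one described could look roughly like the sketch below. This is an illustration only, not the test that was actually run: `normalize` is a hypothetical stand-in for the function under test (assuming proportional scaling plus a clamp), `checkWindow` is an invented helper, and plain `uint64` values are used in place of `hw.MHz`.

```go
package main

import "fmt"

const (
	minShares uint64 = 2
	maxShares uint64 = 262_144
)

// normalize is a hypothetical stand-in for the function under test.
func normalize(requested, total uint64) uint64 {
	if total > maxShares {
		requested = requested * maxShares / total
	}
	if requested < minShares {
		return minShares
	}
	if requested > maxShares {
		return maxShares
	}
	return requested
}

// checkWindow tries every request in [reqLo, reqHi] against every total in
// [totalLo, totalHi] (stepping by totalStep) and reports the first pair
// whose normalized value falls outside the acceptable range, if any.
func checkWindow(reqLo, reqHi, totalLo, totalHi, totalStep uint64) (req, total uint64, ok bool) {
	if totalStep == 0 {
		totalStep = 1 // avoid an infinite loop on a zero step
	}
	for total = totalLo; total <= totalHi; total += totalStep {
		for req = reqLo; req <= reqHi; req++ {
			got := normalize(req, total)
			if got < minShares || got > maxShares {
				return req, total, false
			}
		}
	}
	return 0, 0, true
}

func main() {
	// A coarse step keeps this quick; a step of 1 covers every total.
	req, total, ok := checkWindow(1, 100, 1, 5_000_000, 1000)
	fmt.Println("all in range:", ok, "first failure:", req, total)
}
```

Go's built-in fuzzing (`testing.F`) would be the natural fit for the randomized half of the check; the exhaustive loop here covers the deterministic windows.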

@tgross tgross merged commit c8dcd3c into main Jun 19, 2025
46 checks passed
@tgross tgross deleted the NMD858-cgroup-integer-overflow branch June 19, 2025 17:48
Labels
backport/ent/1.8.x+ent Changes are backported to 1.8.x+ent backport/ent/1.9.x+ent Changes are backported to 1.9.x+ent backport/1.10.x backport to 1.10.x release line theme/driver/docker type/bug
Development

Successfully merging this pull request may close these issues.

cgroupv2 cpu.weight integer overflow
3 participants