Do not move thread-locals before dropping #141685

Merged
merged 8 commits into from
May 31, 2025

Conversation

orlp
Contributor

@orlp orlp commented May 28, 2025

Fixes #140816. I also (potentially) improved the speed of get_or_init a bit by having an explicit hot/cold path.

We still move the value before dropping in the event of a recursive initialization (leading to double-initialization with one value being silently dropped). This is the old behavior, but changing this to panic instead would involve changing tests and also the other OS-specific thread_local/os.rs implementation, which is more than I'd like in this PR.
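One way to observe the behavior described in the linked issue is to record a thread-local's address while it is in use and again inside its `Drop` impl. The following is a minimal sketch, not the test added in this PR; `Tracked`, `SLOT`, `ADDR_AT_USE`, `ADDR_AT_DROP`, and `observe_addresses` are hypothetical names. It assumes a platform where thread-local destructors of a spawned thread have run by the time `join` returns. On a toolchain that includes this fix the two addresses are expected to match, but the sketch only reports them.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical names for illustration; none of these come from the PR.
static ADDR_AT_USE: AtomicUsize = AtomicUsize::new(0);
static ADDR_AT_DROP: AtomicUsize = AtomicUsize::new(0);

struct Tracked {
    _payload: [u8; 64],
}

impl Drop for Tracked {
    fn drop(&mut self) {
        // Record where the value lives at the moment it is dropped.
        let addr = self as *mut Tracked as usize;
        ADDR_AT_DROP.store(addr, Ordering::SeqCst);
    }
}

thread_local! {
    // Lazily initialized thread-local with a destructor.
    static SLOT: Tracked = Tracked { _payload: [0; 64] };
}

/// Returns (address seen during use, address seen during drop).
fn observe_addresses() -> (usize, usize) {
    thread::spawn(|| {
        SLOT.with(|t| {
            // Record where the value lives once it is initialized.
            ADDR_AT_USE.store(t as *const Tracked as usize, Ordering::SeqCst);
        });
    })
    .join()
    .unwrap();
    (
        ADDR_AT_USE.load(Ordering::SeqCst),
        ADDR_AT_DROP.load(Ordering::SeqCst),
    )
}

fn main() {
    let (at_use, at_drop) = observe_addresses();
    println!("address during use:  {at_use:#x}");
    println!("address during drop: {at_drop:#x}");
    println!("moved before drop:   {}", at_use != at_drop);
}
```

A type that hands out pointers to itself (for example, by registering its address elsewhere) would rely on those two addresses being equal.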

@rustbot
Collaborator

rustbot commented May 28, 2025

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-libs (Relevant to the library team, which will review and decide on the PR/issue.) labels May 28, 2025
@orlp orlp changed the title from "Do not move thread-locals before dropping and panic on recursive initialization" to "Do not move thread-locals before dropping" May 28, 2025
@Noratrieb
Member

can you add a test?

@orlp
Contributor Author

orlp commented May 29, 2025

@Noratrieb Done.

@Kobzol
Contributor

Kobzol commented May 29, 2025

@bors try @rust-timer queue

@rustbot rustbot added the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label May 29, 2025
bors added a commit that referenced this pull request May 29, 2025
Do not move thread-locals before dropping
@bors
Collaborator

bors commented May 29, 2025

⌛ Trying commit aff29df with merge 71827ba...

@bors
Collaborator

bors commented May 29, 2025

☀️ Try build successful - checks-actions
Build commit: 71827ba (71827ba6edcfe824077acb1cbfd0f3167800826d)

@rust-timer
Collaborator

Finished benchmarking commit (71827ba): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.7% | [-2.0%, -0.2%] | 25 |
| Improvements ✅ (secondary) | -1.1% | [-1.6%, -0.2%] | 7 |
| All ❌✅ (primary) | -0.7% | [-2.0%, -0.2%] | 25 |

Max RSS (memory usage)

Results (primary 4.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 7.2% | [4.7%, 9.7%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.5% | [-1.5%, -1.5%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.3% | [-1.5%, 9.7%] | 3 |

Cycles

Results (primary -1.4%, secondary -14.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.4% | [-1.4%, -1.4%] | 1 |
| Improvements ✅ (secondary) | -14.6% | [-16.3%, -13.5%] | 6 |
| All ❌✅ (primary) | -1.4% | [-1.4%, -1.4%] | 1 |

Binary size

Results (primary -0.2%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.5%, -0.1%] | 11 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 3 |
| All ❌✅ (primary) | -0.2% | [-0.5%, -0.1%] | 11 |

Bootstrap: 777.949s -> 778.459s (0.07%)
Artifact size: 368.50 MiB -> 368.44 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf (Status: Waiting on a perf run to be completed.) label May 29, 2025
@jhpratt
Member

jhpratt commented May 30, 2025

I don't have the necessary context here; reassigning. Though I will note the performance looks great!

r? libs

@rustbot rustbot assigned joboet and unassigned jhpratt May 30, 2025
Member

@joboet joboet left a comment

Well, the performance improvements speak for themselves...

When I wrote this code, I was hoping that storing the value in an enum would allow for niche-optimisations. It'd be interesting to see whether doing that helps in the case of variables without destructors – but that shouldn't block this PR.

r=me with the nits addressed.
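The niche optimisation mentioned above can be illustrated by comparing type sizes. This is a rough sketch only: `EnumState` and `SplitState` are hypothetical stand-ins and do not reproduce the actual types in `std`. The only layout fact relied on is the documented guarantee that `Option<NonZeroU32>` has the same size as `NonZeroU32`; the size of the three-variant enum is not guaranteed, so it is printed rather than asserted.

```rust
use std::mem::{size_of, MaybeUninit};
use std::num::NonZeroU32;

// Hypothetical: state and value folded into one enum, so the compiler
// may encode the dataless variants in unused bit patterns (niches) of T.
#[allow(dead_code)]
enum EnumState<T> {
    Uninitialized,
    Alive(T),
    Destroyed,
}

// Hypothetical: state tag stored next to possibly uninitialized storage.
#[allow(dead_code)]
struct SplitState<T> {
    state: u8,
    value: MaybeUninit<T>,
}

/// Returns (size of the enum layout, size of the split layout) for T.
fn sizes<T>() -> (usize, usize) {
    (size_of::<EnumState<T>>(), size_of::<SplitState<T>>())
}

fn main() {
    // Guaranteed: the `None` case is stored in the zero niche.
    println!("Option<NonZeroU32>: {} bytes", size_of::<Option<NonZeroU32>>());

    // Not guaranteed; depends on how the compiler lays out the enum.
    let (enum_size, split_size) = sizes::<NonZeroU32>();
    println!("EnumState<NonZeroU32>:  {enum_size} bytes");
    println!("SplitState<NonZeroU32>: {split_size} bytes");
}
```

Whether a smaller layout translates into a measurable difference would still need a perf run.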

// access to self.value and may replace it.
let mut old_value = unsafe { self.value.get().replace(MaybeUninit::new(v)) };
match self.state.replace(State::Alive) {
State::Uninitialized => D::register_dtor(self),
Member

Could you reintroduce the comments from the old version? I was confused for a second about why the destructor is only registered in this arm, the comment helped to avoid that.

Contributor Author

Done.

@@ -0,0 +1,38 @@
//@ run-pass
#![allow(stable_features)]
Member

What's this for?

Contributor Author

Nothing, it was left over after copying tls-init-on-init.rs as a template.

@orlp
Contributor Author

orlp commented May 30, 2025

> Well, the performance improvements speak for themselves...

I believe the most important part of that isn't the change in layout, it's separating get_or_init into a fast path (already initialized) with a single well-predicted branch followed by an immediate return and a #[cold] slow path where all the other stuff needed for initialization can be.
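The fast-path/slow-path split described here can be sketched on a toy lazy cell. This is an illustration under stated assumptions, not the code from this PR: `Lazy` and `init_slow` are hypothetical names, and the sketch builds on `std::cell::OnceCell` rather than the state-plus-`MaybeUninit` storage used in `std`. Unlike the behavior described in the PR description, a recursive initialization here keeps the first stored value.

```rust
use std::cell::OnceCell;

// Hypothetical wrapper used only to show the hot/cold split.
struct Lazy<T> {
    slot: OnceCell<T>,
}

impl<T> Lazy<T> {
    const fn new() -> Self {
        Lazy { slot: OnceCell::new() }
    }

    #[inline]
    fn get_or_init(&self, init: impl FnOnce() -> T) -> &T {
        // Hot path: a single branch followed by an immediate return.
        if let Some(value) = self.slot.get() {
            return value;
        }
        // Everything else lives out of line.
        self.init_slow(init)
    }

    #[cold]
    #[inline(never)]
    fn init_slow(&self, init: impl FnOnce() -> T) -> &T {
        let value = init();
        // If `init` already filled the slot recursively, `set` fails and
        // the freshly computed value is dropped.
        let _ = self.slot.set(value);
        self.slot.get().expect("slot was just initialized")
    }
}

fn main() {
    let lazy = Lazy::new();
    let first = *lazy.get_or_init(|| 42u32);
    // The second call takes the hot path and never runs its closure.
    let second = *lazy.get_or_init(|| unreachable!("already initialized"));
    println!("{first} {second}");
}
```

Marking the slow path `#[cold]` hints to the compiler that it is rarely taken, which tends to keep the inlined fast path small at each call site.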

@joboet
Member

joboet commented May 30, 2025

Thank you!
@bors r+

@bors
Collaborator

bors commented May 30, 2025

📌 Commit b374adc has been approved by joboet

It is now in the queue for this repository.

@bors bors added the S-waiting-on-bors (Status: Waiting on bors to run and complete tests. Bors will change the label on completion.) label and removed the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) label May 30, 2025
@bors
Collaborator

bors commented May 31, 2025

⌛ Testing commit b374adc with merge 852f15c...

@bors
Collaborator

bors commented May 31, 2025

☀️ Test successful - checks-actions
Approved by: joboet
Pushing 852f15c to master...

@bors bors added the merged-by-bors (This PR was explicitly merged by bors.) label May 31, 2025
@bors bors merged commit 852f15c into rust-lang:master May 31, 2025
10 checks passed
@rustbot rustbot added this to the 1.89.0 milestone May 31, 2025
What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 738c08b (parent) -> 852f15c (this PR)

Test differences

Stage 1

  • [ui] tests/ui/threads-sendsync/tls-dont-move-after-init.rs: [missing] -> pass (J1)

Stage 2

  • [ui] tests/ui/threads-sendsync/tls-dont-move-after-init.rs: [missing] -> pass (J0)

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 852f15c0f146fc292c9b20f2a8f44c1f671d7845 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-linux: 7727.0s -> 5682.4s (-26.5%)
  2. x86_64-apple-2: 5246.1s -> 6305.4s (20.2%)
  3. dist-apple-various: 7343.2s -> 6036.6s (-17.8%)
  4. x86_64-apple-1: 9750.8s -> 8183.3s (-16.1%)
  5. dist-x86_64-apple: 8047.2s -> 8932.7s (11.0%)
  6. aarch64-apple: 4738.1s -> 5153.1s (8.8%)
  7. x86_64-rust-for-linux: 2840.5s -> 2620.7s (-7.7%)
  8. dist-various-1: 4887.9s -> 4512.1s (-7.7%)
  9. dist-x86_64-freebsd: 5024.2s -> 5321.2s (5.9%)
  10. dist-aarch64-apple: 5387.3s -> 5681.6s (5.5%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (852f15c): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.5% | [0.5%, 0.5%] | 1 |
| Improvements ✅ (primary) | -0.7% | [-0.7%, -0.7%] | 1 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 2 |
| All ❌✅ (primary) | -0.7% | [-0.7%, -0.7%] | 1 |

Max RSS (memory usage)

Results (primary 0.7%, secondary 0.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.2% | [2.1%, 5.2%] | 3 |
| Regressions ❌ (secondary) | 0.8% | [0.4%, 2.2%] | 7 |
| Improvements ✅ (primary) | -7.0% | [-7.0%, -7.0%] | 1 |
| Improvements ✅ (secondary) | -0.8% | [-0.8%, -0.8%] | 1 |
| All ❌✅ (primary) | 0.7% | [-7.0%, 5.2%] | 4 |

Cycles

Results (primary -0.6%, secondary -1.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.8% | [0.4%, 1.6%] | 5 |
| Improvements ✅ (primary) | -0.6% | [-0.6%, -0.6%] | 1 |
| Improvements ✅ (secondary) | -2.5% | [-9.1%, -0.5%] | 8 |
| All ❌✅ (primary) | -0.6% | [-0.6%, -0.6%] | 1 |

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.1%] | 5 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.5%, -0.1%] | 11 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 3 |
| All ❌✅ (primary) | -0.1% | [-0.5%, 0.1%] | 16 |

Bootstrap: 776.294s -> 776.679s (0.05%)
Artifact size: 372.28 MiB -> 372.27 MiB (-0.01%)

Labels
  • merged-by-bors: This PR was explicitly merged by bors.
  • S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
  • T-libs: Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

Static thread_local! declarations are moved before Drop is called
9 participants