concurrency: update wait queues in AddDiscoveredLock #147065
Conversation
Force-pushed from 2aae2e1 to 566e0bf
Force-pushed from 566e0bf to 75d36be
Looks reasonable to me, but maybe @arulajmani should take a look too. There are so many branches in some of these lock table functions, and I'm never sure when we need to informActiveWaiters vs recomputeWaitQueues vs both.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)
pkg/kv/kvserver/concurrency/lock_table.go
line 3117 at r2 (raw file):
if tl.replicatedInfo.isEmpty() {
	tl.replicatedInfo.acquire(foundLock.Strength, foundLock.Txn.WriteTimestamp)
Can we end up in a similar situation in this branch? E.g. the lock was held unreplicated at first, then held as an intent by the same transaction at a higher timestamp, and then it was discovered by some other request.
pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered
line 170 at r1 (raw file):
# update at step 12.
#
# Note that in the test as written, the followin are how the reqests align
nit: typos on "following" and "requests"
Sorry, I didn't mean to block it.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @stevendanna)
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)
pkg/kv/kvserver/concurrency/lock_table.go
line 3117 at r2 (raw file):
Previously, miraradeva (Mira Radeva) wrote…
Can we end up in a similar situation in this branch? E.g. the lock was held unreplicated at first, then held as an intent by the same transaction at a higher timestamp, and then it was discovered by some other request.
I think that "the lock was held unreplicated at first" would still fall under the kl.isLockedBy branch, assuming the discovered lock is for the same transaction.
That said though, I do kinda wonder if we could re-arrange this function a bit. It seems to me that instead of calling tl.replicatedInfo.acquire(foundLock.Strength, foundLock.Txn.WriteTimestamp) conditionally here, we could remove it from the first branch, call it unconditionally here, and then always recompute the wait queues. If you take the else branch above, then I think isEmpty() is true by definition.
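To make the suggested shape concrete, here is a toy sketch of "acquire unconditionally, then always recompute" (hypothetical stand-in types and names; not the real lock table code or the actual AddDiscoveredLock signature):

```go
package main

import "fmt"

// Toy stand-ins for the lock-table state; the names mirror the discussion
// above, but none of this is the real AddDiscoveredLock code.
type replicatedInfo struct {
	held bool
	ts   int
}

func (r *replicatedInfo) isEmpty() bool  { return !r.held }
func (r *replicatedInfo) acquire(ts int) { r.held, r.ts = true, ts }

type keyLocks struct {
	replicated    replicatedInfo
	heldBySameTxn bool // stand-in for the kl.isLockedBy check
}

// addDiscovered sketches the proposed shape: handle the holder branches,
// then record the replicated lock and recompute wait queues unconditionally
// instead of only inside the isEmpty() branch.
func (kl *keyLocks) addDiscovered(foundTS int) {
	if kl.heldBySameTxn {
		// same-transaction branch (details elided)
	} else {
		// other branch (details elided); replicated info should be empty
		// here by definition, so the unconditional acquire below is safe.
	}
	kl.replicated.acquire(foundTS)
	kl.recomputeWaitQueues()
}

func (kl *keyLocks) recomputeWaitQueues() {
	fmt.Println("recomputing wait queues at holder ts", kl.replicated.ts)
}

func main() {
	kl := &keyLocks{heldBySameTxn: true, replicated: replicatedInfo{held: true, ts: 10}}
	kl.addDiscovered(12) // the discovered intent was rewritten at a higher timestamp
}
```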
@arulajmani If you have some time for review, this would clear a flaky test.
Sorry I missed this. I'll have a look today after our weekly meeting.
Nice sleuthing here!
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @miraradeva and @stevendanna)
pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered
line 159 at r2 (raw file):
# 6 | | FinishReq | | |
# 8 | | | | SeqReq |
# 9 | | | AddDiscovered(intent@ts10) | |
@stevendanna is another way to look at this issue that r1@ts10 doesn't conflict on latches with w2@ts12? Even though this doesn't cause the test to fail, because r1 has pulled the wrong intent into the lock table, r1 shouldn't really be waiting on the intent written by w2.
What if we had w2 acquire a write latch at Txn.MinTimestamp (in this case ts@10) instead? Conceptually, you could think of re-writing an intent at a higher timestamp as intent resolution. Should we be mimicking how intent resolution does latching, for similar reasons?
minTxnTS = t.IntentTxn.MinTimestamp
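For illustration, a toy version of the latch-conflict rule for non-locking reads (not the real spanlatch package; the timestamps mirror the r1/w2 example above) shows why latching w2 at MinTimestamp would serialize it with r1:

```go
package main

import "fmt"

// readConflictsWithWrite models the latch rule for non-locking reads: a read
// conflicts only with a write latch declared at or below its read timestamp.
// Toy integers stand in for HLC timestamps.
func readConflictsWithWrite(readTS, writeLatchTS int) bool {
	return writeLatchTS <= readTS
}

func main() {
	readTS := 10             // r1 reads at ts10
	writeTS, minTS := 12, 10 // w2's current WriteTimestamp and its MinTimestamp

	// Latching at WriteTimestamp: no conflict, so r1 and w2 overlap and r1 can
	// discover the stale intent@ts10 while w2 is rewriting it at ts12.
	fmt.Println("latch at WriteTimestamp conflicts:", readConflictsWithWrite(readTS, writeTS))

	// Latching at MinTimestamp (mirroring how intent resolution declares its
	// latch via IntentTxn.MinTimestamp): conflict, so r1 waits for w2 to finish.
	fmt.Println("latch at MinTimestamp conflicts:", readConflictsWithWrite(readTS, minTS))
}
```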
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)
pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered
line 159 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
@stevendanna is another way to look at this issue that r1@ts10 doesn't conflict on latches with w2@ts12? Even though this doesn't cause the test to fail, because r1 has pulled the wrong intent into the lock table, r1 shouldn't really be waiting on the intent written by w2.
What if we had w2 acquire a write latch at Txn.MinTimestamp (in this case ts@10) instead? Conceptually, you could think of re-writing an intent at a higher timestamp as intent resolution. Should we be mimicking how intent resolution does latching, for similar reasons?
minTxnTS = t.IntentTxn.MinTimestamp
Yes, I definitely think another way to look at this is from the latching angle. Latching at the MinTimestamp is an interesting idea.
It would mean a retrying transaction is more likely to be blocked by concurrent readers. But whether that is a bad thing isn't so clear to me. In this case it does seem like it would have been better. Perhaps we can use this vecindex workload as something of a benchmark to see how they compare.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)
pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered
line 159 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Yes, I definitely think another way to look at this is from the latching angle. Latching at the MinTimestamp is an interesting idea.
It would mean a retrying transaction is more likely to be blocked by concurrent readers. But whether that is a bad thing isn't so clear to me. In this case it does seem like it would have been better. Perhaps we can use this vecindex workload as something of a benchmark to see how they compare.
Apologies for the delay here. I think this fix is correct, but I am still interested in exploring changing the latching timestamp.
I'll try to get this done this week.
Retrying writers currently latch at their WriteTimestamp, allowing them to overlap with readers who might be reading intents from their previous Epoch. This is fine, but does lead to some interesting situations like the one described in cockroachdb#147065. In some cases, such readers will end up unnecessarily waiting on the old intent and will only be unblocked when a subsequent reader discovers the updated intent or when the writer completes its transaction.

Latching at the MinTimestamp would disallow the overlapping reader and writer. The reader would wait until the writer had updated its intent and then would proceed with its read.

Release note: None
I've opened an experiment for the change to the latch timestamp here: #148802. I personally think we should probably do both fixes.
Concurrent readers in the face of a concurrent, retrying writer can result in a situation where AddDiscoveredLock moves the timestamp of a held intent past the read timestamp of a waiting reader. The reader should be unblocked in this case but previously wasn't, resulting in a lock table verification assertion failure in the form of:
error: non locking reader ... does not conflict with lock holder
This is my best theory for what is happening in #146749 based on increased logging. It is difficult to be certain given the required ordering of events is hard to observe with locking.
See the comment in the test for a more complete timeline that can lead to the bug.
Fixes #146749
Release note: None
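As a rough illustration of the failure mode (toy types only, not the actual lock table data structures), the reader stays queued unless the wait queue is recomputed after the discovered intent moves the holder's timestamp past the reader's read timestamp:

```go
package main

import "fmt"

// Toy model of a lock with a holder timestamp and waiting non-locking readers.
type reader struct {
	name   string
	readTS int
}

type lockState struct {
	holderTS int
	waiters  []reader
}

// conflictsWith mirrors the rule the verification step asserts: a non-locking
// reader conflicts with the holder only if the holder's timestamp is at or
// below the reader's read timestamp.
func (l *lockState) conflictsWith(r reader) bool {
	return l.holderTS <= r.readTS
}

// recomputeWaitQueues drops waiters that no longer conflict with the holder;
// this is the step the fix adds when AddDiscoveredLock advances the recorded
// intent timestamp.
func (l *lockState) recomputeWaitQueues() {
	kept := l.waiters[:0]
	for _, w := range l.waiters {
		if l.conflictsWith(w) {
			kept = append(kept, w)
		} else {
			fmt.Printf("unblocking %s: holder ts %d > read ts %d\n", w.name, l.holderTS, w.readTS)
		}
	}
	l.waiters = kept
}

func main() {
	l := &lockState{holderTS: 10, waiters: []reader{{name: "r1", readTS: 10}}}

	// A later request discovers the same intent rewritten at ts12; without the
	// recompute, r1 would remain queued even though it no longer conflicts,
	// tripping the "non locking reader ... does not conflict with lock holder"
	// assertion.
	l.holderTS = 12
	l.recomputeWaitQueues()
}
```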