concurrency: update wait queues in AddDiscoveredLock #147065

Open
wants to merge 1 commit into master from ssd/add-discovered-recomputes-wait-queue

Conversation

stevendanna
Collaborator

@stevendanna stevendanna commented May 21, 2025

Concurrent readers racing with a retrying writer can end up in a situation where AddDiscoveredLock moves the timestamp of a held intent past the read timestamp of a waiting reader. The reader should be unblocked in this case but previously wasn't, resulting in a lock table verification assertion failure of the form:

error: non locking reader ... does not conflict with lock holder

This is my best theory for what is happening in #146749, based on increased logging. It is difficult to be certain, given that the required ordering of events is hard to observe with locking.

See the comment in the test for a more complete timeline that can lead to the bug.
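
To make the conflict rule concrete, here is a schematic sketch of the situation in Go. The timestamp type and readerConflicts helper below are simplified stand-ins for illustration only, not the actual lock table code:

    package main

    import "fmt"

    // timestamp is a simplified stand-in for an HLC timestamp.
    type timestamp struct{ wallNanos int64 }

    func (t timestamp) Less(o timestamp) bool { return t.wallNanos < o.wallNanos }

    // A non-locking reader conflicts with a held intent only if the intent's
    // timestamp is at or below the reader's read timestamp.
    func readerConflicts(readTS, intentTS timestamp) bool {
        return !readTS.Less(intentTS)
    }

    func main() {
        readTS := timestamp{wallNanos: 10}

        // The reader discovers an intent at ts 8 and queues behind it.
        fmt.Println(readerConflicts(readTS, timestamp{wallNanos: 8})) // true: must wait

        // AddDiscoveredLock later learns the same intent was rewritten at ts 12.
        // The reader no longer conflicts and must be removed from the wait queue;
        // leaving it there is what trips the verification assertion above.
        fmt.Println(readerConflicts(readTS, timestamp{wallNanos: 12})) // false: should proceed
    }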

Fixes #146749

Release note: None

@stevendanna stevendanna requested a review from a team as a code owner May 21, 2025 12:19
@stevendanna stevendanna requested a review from miraradeva May 21, 2025 12:19
@cockroach-teamcity
Member

This change is Reviewable

@stevendanna stevendanna force-pushed the ssd/add-discovered-recomputes-wait-queue branch from 2aae2e1 to 566e0bf on May 21, 2025 12:31
Concurrent readers racing with a retrying writer can end up in a
situation where AddDiscoveredLock moves the timestamp of a held intent
past the read timestamp of a waiting reader. The reader should be
unblocked in this case but previously wasn't, resulting in a lock
table verification assertion failure of the form:

   error: non locking reader ... does not conflict with lock holder

This is my best theory for what is happening in cockroachdb#146749,
based on increased logging. It is difficult to be certain, given that
the required ordering of events is hard to observe with locking.

See the comment in the test for a more complete timeline that can lead
to the bug.

Fixes cockroachdb#146749

Release note: None
@stevendanna stevendanna force-pushed the ssd/add-discovered-recomputes-wait-queue branch from 566e0bf to 75d36be on May 21, 2025 13:54
@stevendanna stevendanna removed the request for review from a team May 21, 2025 13:55
@stevendanna stevendanna assigned arulajmani and unassigned arulajmani May 21, 2025
@stevendanna stevendanna requested a review from arulajmani May 21, 2025 13:55
Contributor

@miraradeva miraradeva left a comment

Looks reasonable to me, but maybe @arulajmani should take a look too. There are so many branches in some of these lock table functions, and I'm never sure when we need to informActiveWaiters vs recomputeWaitQueues vs both.

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table.go line 3117 at r2 (raw file):

	if tl.replicatedInfo.isEmpty() {
		tl.replicatedInfo.acquire(foundLock.Strength, foundLock.Txn.WriteTimestamp)

Can we end up in a similar situation in this branch? E.g. the lock was held unreplicated at first, then held as an intent by the same transaction at a higher timestamp, and then it was discovered by some other request.


pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered line 170 at r1 (raw file):

# update at step 12.
#
# Note that in the test as written, the followin are how the reqests align

nit: typos on "following" and "requests"

Contributor

@miraradeva miraradeva left a comment

Sorry, I didn't mean to block it.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @stevendanna)

Collaborator Author

@stevendanna stevendanna left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)


pkg/kv/kvserver/concurrency/lock_table.go line 3117 at r2 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

Can we end up in a similar situation in this branch? E.g. the lock was held unreplicated at first, then held as an intent by the same transaction at a higher timestamp, and then it was discovered by some other request.

I think that "the lock was held unreplicated at first" would still fall under the kl.isLockedBy branch assuming the discovered lock is for the same transaction.

That said though, I do kinda wonder if we could re-arrange this function a bit. It seems to me that instead of calling tl.replicatedInfo.acquire(foundLock.Strength, foundLock.Txn.WriteTimestamp) conditionally here, we could remove it from the first branch, call it unconditionally here, and then always recompute the wait queues. If you take the else branch above, then I think isEmpty() is true by definition.
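
Roughly the shape I have in mind, as a standalone sketch with toy types rather than the real lock table structures (the fields and signatures here are made up purely for illustration):

    package main

    import "fmt"

    type ts struct{ wall int64 }

    // replicatedInfo is a toy stand-in for the replicated lock holder info.
    type replicatedInfo struct {
        held bool
        ts   ts
    }

    func (r *replicatedInfo) acquire(t ts) {
        r.held = true
        if r.ts.wall < t.wall {
            r.ts = t // ratchet the holder timestamp forward
        }
    }

    type keyLocks struct {
        replicated replicatedInfo
        holderTxn  int64
    }

    func (kl *keyLocks) isLockedBy(txnID int64) bool { return kl.holderTxn == txnID }

    func (kl *keyLocks) recomputeWaitQueues() {
        // Placeholder for releasing waiters that no longer conflict.
        fmt.Println("recomputing wait queues")
    }

    // addDiscoveredLock shows the proposed shape: update the replicated holder
    // info unconditionally and always recompute the wait queues, rather than
    // doing each only on some branches.
    func (kl *keyLocks) addDiscoveredLock(txnID int64, intentTS ts) {
        if !kl.isLockedBy(txnID) {
            kl.holderTxn = txnID // previously unheld, or held by another txn
        }
        kl.replicated.acquire(intentTS)
        kl.recomputeWaitQueues()
    }

    func main() {
        kl := &keyLocks{holderTxn: 1, replicated: replicatedInfo{held: true, ts: ts{wall: 10}}}
        kl.addDiscoveredLock(1, ts{wall: 12}) // same txn rediscovered at a higher timestamp
    }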

@stevendanna
Collaborator Author

@arulajmani If you have some time for review, this would clear a flaky test.

@arulajmani
Collaborator

Sorry I missed this. I'll have a look today after our weekly meeting.

Collaborator

@arulajmani arulajmani left a comment

Nice sleuthing here!

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @miraradeva and @stevendanna)


pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered line 159 at r2 (raw file):

# 6 |                          | FinishReq                |                            |                             |
# 8 |                          |                          |                            | SeqReq                      |
# 9 |                          |                          | AddDiscovered(intent@ts10) |                             |

@stevendanna is another way to look at this issue that r1@ts10 doesn't conflict on latches with w2@ts12? Even though this doesn't cause the test to fail, because r1 has pulled the wrong intent into the lock table, r1 shouldn't really be waiting on the intent written by w2.

What if we had w2 acquire a write latch at Txn.MinTimestamp (in this case ts@10) instead? Conceptually, you could think of re-writing an intent at a higher timestamp as intent resolution. Should we be mimicking how intent resolution does latching, for similar reasons?

minTxnTS = t.IntentTxn.MinTimestamp

Collaborator Author

@stevendanna stevendanna left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)


pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered line 159 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

@stevendanna is another way to look at this issue that r1@ts10 doesn't conflict on latches with w2@ts12? Even though this doesn't cause the test to fail, because r1 has pulled the wrong intent into the lock table, r1 shouldn't really be waiting on the intent written by w2.

What if we had w2 acquire a write latch at Txn.MinTimestamp (in this case ts@10) instead? Conceptually, you could think of re-writing an intent at a higher timestamp as intent resolution. Should we be mimicking how intent resolution does latching, for similar reasons?

minTxnTS = t.IntentTxn.MinTimestamp

Yes, I definitely think another way to look at this is from the latching angle. Latching at the MinTimestamp is an interesting idea.

It would mean a retrying transaction is more likely to be blocked by concurrent readers. But whether that is a bad thing isn't so clear to me. In this case it does seem like it would have been better. Perhaps we can use this vecindex workload as something of a benchmark to see how they compare.
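
For illustration, here's a toy sketch of the timestamp-aware latch conflict rule we're relying on; the predicate below is a simplified assumption, not the spanlatch manager itself:

    package main

    import "fmt"

    type ts struct{ wall int64 }

    // A non-locking read latched at readTS conflicts with a write latch only if
    // the write latch's timestamp is at or below readTS; a write above the
    // read's timestamp would be invisible to the read anyway.
    func readWriteLatchConflict(readTS, writeLatchTS ts) bool {
        return writeLatchTS.wall <= readTS.wall
    }

    func main() {
        r1 := ts{wall: 10}

        // w2 latches at its WriteTimestamp (ts12): no conflict, so r1 runs
        // concurrently and can discover w2's stale ts10 intent.
        fmt.Println(readWriteLatchConflict(r1, ts{wall: 12})) // false

        // w2 latches at its Txn.MinTimestamp (ts10) instead: the reader and the
        // retrying writer serialize, and r1 only sees the rewritten intent.
        fmt.Println(readWriteLatchConflict(r1, ts{wall: 10})) // true
    }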

Collaborator Author

@stevendanna stevendanna left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)


pkg/kv/kvserver/concurrency/testdata/lock_table/add_discovered line 159 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Yes, I definitely think another way to look at this is from the latching angle. Latching at the MinTimestamp is an interesting idea.

It would mean a retrying transaction is more likely to be blocked by concurrent readers. But whether that is a bad thing isn't so clear to me. In this case it does seem like it would have been better. Perhaps we can use this vecindex workload as something of a benchmark to see how they compare.

Apologies for the delay here. I think this fix is correct, but I am still interested in exploring changing the latching timestamp.

I'll try to get this done this week.

stevendanna added a commit to stevendanna/cockroach that referenced this pull request Jun 25, 2025
Retrying writers currently latch at their WriteTimestamp, allowing them
to overlap with readers who might be reading intents from their previous
Epoch. This is fine, but does lead to some interesting situations like
the one described in cockroachdb#147065. In some cases, such readers will end up
unnecessarily waiting on the old intent and will only be unblocked when
a subsequent reader discovers the updated intent or when the writer
completes its transaction.

Latching at the MinTimestamp would disallow the overlapping reader and
writer. The reader would wait until the writer had updated its intent
and then would proceed with its read.

Release note: None
@stevendanna
Collaborator Author

I've opened an experiment for the change to the latch timestamp here: #148802

I personally think we should probably do both fixes.

Successfully merging this pull request may close these issues.

sql/vecindex: TestVecIndexConcurrency failed [panic in lockTable]