Skip to content

concurrency: mark some unreplicated locks as ineligible for export #145532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

stevendanna
Copy link
Collaborator

Once a lock has been reported as missing via QueryIntent, it is
important that it stays missing. As described in #145458, if a lock
that has been previously reported as missing is somehow then found via
a subsequent QueryIntent request, this can result in a transaction
being erroneously committed by the transaction recovery process.

When unreplicated lock reliability is enabled, Lease and Merge use
ExportUnreplicatedLocks to move unreplicated locks from the lock table
into durable storage. However, these locks may include unreplicated
locks that "cover" a replicated lock that was previously reported as
missing, exactly what we must avoid.

Here, we solve that by extending QueryIntent's evaluation to mark any
locks it reports as missing as ineligible for export.

Fixes #145458

Release note: None

@stevendanna stevendanna requested a review from a team as a code owner April 30, 2025 09:48
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@stevendanna
Copy link
Collaborator Author

@miraradeva @arulajmani I think we are leaning away from this towards a more fundamental fix. But, I figured I'd put this up in a slightly cleaner form just so we could consider it since, I think something like this might still be required along some of the implementation paths we've discussed.

@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from 2ecf9dd to 272c1fb Compare April 30, 2025 09:56
@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from 272c1fb to 86b31cd Compare May 25, 2025 17:07
@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from 86b31cd to 4d3b62f Compare June 6, 2025 10:49
@stevendanna
Copy link
Collaborator Author

Fixed up the race condition in the test here.

Copy link
Contributor

@miraradeva miraradeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. But I wonder if it's possible for the marking-as-ineligible mechanism to race with the lock-export mechanism. E.g. QueryIntent finds a missing intent and is just about to mark the lock in the lock table, while a lease transfer kicks off a lock export. If it's not possible, we should mention it somewhere.

Reviewed 2 of 2 files at r1, 8 of 8 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


-- commits line 42 at r3:
I assume you'll squash this commit into one of the previous ones?


pkg/kv/kvserver/batcheval/cmd_query_intent.go line 162 at r3 (raw file):

	res := result.Result{}
	if !reply.FoundIntent && args.ErrorIfMissing {
		l := roachpb.MakeLockAcquisition(args.Txn, args.Key, lock.Replicated, args.Strength, args.IgnoredSeqNums)

It's a bit unintuitive why we're returning a replicated lock acquisition here, given that we'll use this response to mark unreplicated locks in the lock table. Wouldn't it make more sense for this acquisition to be lock.Intent?


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4791 at r2 (raw file):

				return err
			case 2:
				// My our second retry, txrecovery should have correctly aborted this

nit: "On our second retry ..."


pkg/kv/kvserver/concurrency/concurrency_control.go line 706 at r2 (raw file):

	// MarkIneligibleForExport marks any locks held by this transaction on the
	// same key at greater or equal strength as ineligible for exported from the

nit: "ineligible for export"


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4707 at r1 (raw file):

						getReq, ok := fArgs.Req.Requests[0].GetInner().(*kvpb.GetRequest)
						// Only fail replication on the first retry.
						if ok && getReq.KeyLockingDurability == lock.Replicated && retryNum.Load() == 1 {

Is there a reason we need to keep track of these retries separately? Wouldn't the txn epoch tell us that?


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4733 at r1 (raw file):

		tc.TransferRangeLeaseOrFatal(t, desc, tc.Target(idx))
	}
	// Make a db with transaction heartbeating disabled. This ensures that we

Is it important that heartbeating is disabled only for this db? If it was ok to disable it in general, we could just do this, right?

kvcoord.TxnHeartbeatInterval.Override(ctx, &s.ClusterSettings().SV, -1)

pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4786 at r1 (raw file):

				err = txn.Commit(ctx)
				require.Error(t, err)
				//require.ErrorContains(t, err, "IntentMissingError")

Seems like we should be able to uncomment this?

@stevendanna
Copy link
Collaborator Author

But I wonder if it's possible for the marking-as-ineligible mechanism to race with the lock-export mechanism. E.g. QueryIntent finds a missing intent and is just about to mark the lock in the lock table, while a lease transfer kicks off a lock export. If it's not possible, we should mention it somewhere.

Good question. What I think makes this safe is that both a lease transfer and subsume acquire non-MVCC write latches over the entire range's span, so when they run they really are running alone.

I wonder if we could even assert that this is the case somewhere?

@miraradeva
Copy link
Contributor

But I wonder if it's possible for the marking-as-ineligible mechanism to race with the lock-export mechanism. E.g. QueryIntent finds a missing intent and is just about to mark the lock in the lock table, while a lease transfer kicks off a lock export. If it's not possible, we should mention it somewhere.

Good question. What I think makes this safe is that both a lease transfer and subsume acquire non-MVCC write latches over the entire range's span, so when they run they really are running alone.

I wonder if we could even assert that this is the case somewhere?

Yeah, I see. And QueryIntent, which kicks off the marking-as-ineligible mechanism, declares non-MVCC keys on the intents it's querying. It seems safe.

Copy link
Collaborator Author

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)


-- commits line 42 at r3:

Previously, miraradeva (Mira Radeva) wrote…

I assume you'll squash this commit into one of the previous ones?

Yeah, I'll squash everything into a single commit after review.


pkg/kv/kvserver/batcheval/cmd_query_intent.go line 162 at r3 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

It's a bit unintuitive why we're returning a replicated lock acquisition here, given that we'll use this response to mark unreplicated locks in the lock table. Wouldn't it make more sense for this acquisition to be lock.Intent?

This represents the missing lock which only could have been replicated. The missing lock might not be an intent, it could be a shared or exclusive lock as well.


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4707 at r1 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

Is there a reason we need to keep track of these retries separately? Wouldn't the txn epoch tell us that?

Good call. Done. I think this was probably originally done because when the txn gets correctly aborted then on retry you are actually on a new txn, but this easily detected via the changed txnID.


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4733 at r1 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

Is it important that heartbeating is disabled only for this db? If it was ok to disable it in general, we could just do this, right?

kvcoord.TxnHeartbeatInterval.Override(ctx, &s.ClusterSettings().SV, -1)

I suppose without heartbeating gnerally, if this test ends up running for a long time somehow, some internal transactions could be aborted. Everything should handle that correctly (since it can presumably happen in production) but it could cause log noise that would complicated debugging.


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4786 at r1 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

Seems like we should be able to uncomment this?

Done.

Copy link
Contributor

@miraradeva miraradeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani and @stevendanna)


pkg/kv/kvserver/batcheval/cmd_query_intent.go line 162 at r3 (raw file):

Previously, stevendanna (Steven Danna) wrote…

This represents the missing lock which only could have been replicated. The missing lock might not be an intent, it could be a shared or exclusive lock as well.

Got it.


pkg/kv/kvclient/kvcoord/dist_sender_server_test.go line 4733 at r1 (raw file):

Previously, stevendanna (Steven Danna) wrote…

I suppose without heartbeating gnerally, if this test ends up running for a long time somehow, some internal transactions could be aborted. Everything should handle that correctly (since it can presumably happen in production) but it could cause log noise that would complicated debugging.

Makes sense.

@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from ab9c485 to 6deea66 Compare June 18, 2025 11:24
@stevendanna
Copy link
Collaborator Author

@arulajmani Not sure if you wanted a second look at this. But I'll probably merge this after a rebase tomorrow. If you do spot any follow-ups post merge, let me know.

@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from 6deea66 to 5510302 Compare June 19, 2025 08:19
stevendanna and others added 2 commits June 19, 2025 09:21
Once a lock has been reported as missing via QueryIntent, it is
important that it stays missing. As described in cockroachdb#145458, if a lock
that has been previously reported as missing is somehow then found via
a subsequent QueryIntent request, this can result in a transaction
being erroneously committed by the transaction recovery process.

When unreplicated lock reliability is enabled, Lease and Merge use
ExportUnreplicatedLocks to move unreplicated locks from the lock table
into durable storage. However, these locks may include unreplicated
locks that "cover" a replicated lock that was previously reported as
missing, exactly what we must avoid.

Here, we solve that by extending QueryIntent's evaluation to mark any
locks it reports as missing as ineligible for export.

Fixes cockroachdb#145458

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
@stevendanna stevendanna force-pushed the ssd/epoch-atomicity-violation-fix branch from 5510302 to e390c70 Compare June 19, 2025 08:21
@stevendanna
Copy link
Collaborator Author

bors r=miraradeva

craig bot pushed a commit that referenced this pull request Jun 20, 2025
145532: concurrency: mark some unreplicated locks as ineligible for export  r=miraradeva a=stevendanna

Once a lock has been reported as missing via QueryIntent, it is
important that it stays missing. As described in #145458, if a lock
that has been previously reported as missing is somehow then found via
a subsequent QueryIntent request, this can result in a transaction
being erroneously committed by the transaction recovery process.

When unreplicated lock reliability is enabled, Lease and Merge use
ExportUnreplicatedLocks to move unreplicated locks from the lock table
into durable storage. However, these locks may include unreplicated
locks that "cover" a replicated lock that was previously reported as
missing, exactly what we must avoid.

Here, we solve that by extending QueryIntent's evaluation to mark any
locks it reports as missing as ineligible for export.

Fixes #145458

Release note: None

Co-authored-by: Steven Danna <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jun 20, 2025

Build failed:

@stevendanna
Copy link
Collaborator Author

The extended CI failure is: #148582
The roachtest failure is: #148586

bors retry

craig bot pushed a commit that referenced this pull request Jun 20, 2025
145532: concurrency: mark some unreplicated locks as ineligible for export  r=miraradeva a=stevendanna

Once a lock has been reported as missing via QueryIntent, it is
important that it stays missing. As described in #145458, if a lock
that has been previously reported as missing is somehow then found via
a subsequent QueryIntent request, this can result in a transaction
being erroneously committed by the transaction recovery process.

When unreplicated lock reliability is enabled, Lease and Merge use
ExportUnreplicatedLocks to move unreplicated locks from the lock table
into durable storage. However, these locks may include unreplicated
locks that "cover" a replicated lock that was previously reported as
missing, exactly what we must avoid.

Here, we solve that by extending QueryIntent's evaluation to mark any
locks it reports as missing as ineligible for export.

Fixes #145458

Release note: None

Co-authored-by: Steven Danna <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jun 20, 2025

Build failed:

@stevendanna
Copy link
Collaborator Author

The universe doesn't want this fix. And in some sense, the universe is correct.

Internal build system failure:

ERROR: The Build Event Protocol upload failed: All 8 retry attempts failed. INTERNAL: INTERNAL: Invocation [tenant_name=default invocation_id=751d3de7-b36f-4ac8-a48e-b5bd813ea02d]: failed to write batch 1

bors retry

@craig
Copy link
Contributor

craig bot commented Jun 20, 2025

@craig craig bot merged commit 1c347c0 into cockroachdb:master Jun 20, 2025
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv: transactions may be unexpectedly committed when preserving locks across lease transfers or range merges
3 participants