Fix remaining race condition with CLOG truncation and LISTEN/NOTIFY
author Heikki Linnakangas <[email protected]>
Wed, 12 Nov 2025 18:59:44 +0000 (20:59 +0200)
committer Heikki Linnakangas <[email protected]>
Wed, 12 Nov 2025 18:59:44 +0000 (20:59 +0200)
commit 797e9ea6e54b06c3e6c79b468dab89fdbf6be179
tree f8639a5d677f054189c37f9060880e6318418cfc
parent 8eeb4a0f7c061ece7a8836e738ea8b7764617d3b

The previous commit fixed a bug where VACUUM would truncate the CLOG
that's still needed to check the commit status of XIDs in the async
notify queue, but as mentioned in the commit message, it wasn't a full
fix. If a backend is executing asyncQueueReadAllNotifications() and
has just made a local copy of an async SLRU page which contains old
XIDs, vacuum can concurrently truncate the CLOG covering those XIDs,
and the backend still gets an error when it calls
TransactionIdDidCommit() on those XIDs in the local copy. This commit
fixes that race condition.

To fix, hold the SLRU bank lock across the TransactionIdDidCommit()
calls in NOTIFY processing.
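
The ordering problem can be illustrated with a small single-threaded toy
model. This is only a sketch and not the code in async.c: every name in it
(ToyState, toy_try_truncate, toy_status_lookup, toy_process_racy,
toy_process_fixed) is hypothetical, the lock is a plain flag, XIDs are
compared without wraparound handling, and the model simply assumes that
truncation cannot proceed while the bank lock is held. The "concurrent"
VACUUM is simulated by calling the truncation at the point where it would
interleave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; none of these names are PostgreSQL APIs. */
typedef uint32_t ToyXid;

#define TOY_PAGE_MAX 8

typedef struct ToyState
{
    bool   bank_locked;  /* stand-in for the SLRU bank lock */
    ToyXid clog_oldest;  /* commit status exists only for xid >= this */
} ToyState;

/* Stand-in for VACUUM's truncation. Model assumption: it needs the lock. */
static bool
toy_try_truncate(ToyState *st, ToyXid new_oldest)
{
    if (st->bank_locked)
        return false;           /* a real system would wait here */
    st->clog_oldest = new_oldest;
    return true;
}

/* Stand-in for a commit status lookup; false means "status was truncated". */
static bool
toy_status_lookup(const ToyState *st, ToyXid xid)
{
    return xid >= st->clog_oldest;  /* no wraparound handling in this toy */
}

/* Racy order: copy under the lock, release, then look up the statuses. */
static int
toy_process_racy(ToyState *st, const ToyXid *page, size_t n, ToyXid vacuum_target)
{
    ToyXid local[TOY_PAGE_MAX];
    int    errors = 0;

    assert(n <= TOY_PAGE_MAX);
    st->bank_locked = true;
    for (size_t i = 0; i < n; i++)
        local[i] = page[i];
    st->bank_locked = false;

    /* Simulated concurrent VACUUM: the lock is free, so truncation succeeds. */
    (void) toy_try_truncate(st, vacuum_target);

    for (size_t i = 0; i < n; i++)
        if (!toy_status_lookup(st, local[i]))
            errors++;
    return errors;
}

/* Fixed order: keep the lock held across the status lookups. */
static int
toy_process_fixed(ToyState *st, const ToyXid *page, size_t n, ToyXid vacuum_target)
{
    int errors = 0;

    st->bank_locked = true;

    /* Simulated concurrent VACUUM: refused because the lock is held. */
    (void) toy_try_truncate(st, vacuum_target);

    for (size_t i = 0; i < n; i++)
        if (!toy_status_lookup(st, page[i]))
            errors++;
    st->bank_locked = false;
    return errors;
}
```

In this model, the racy order reports failed lookups for every XID older
than the truncation target, while the fixed order reports none because the
simulated truncation cannot run until the lock is released.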

Per Tom Lane's idea. Backpatch to all supported versions.

Reviewed-by: Joel Jacobson <[email protected]>
Reviewed-by: Arseniy Mukhin <[email protected]>
Discussion: https://www.postgresql.org/message-id/2759499.1761756503@sss.pgh.pa.us
Backpatch-through: 14
src/backend/commands/async.c