Fix the logical replication timeout during large transactions.
author Amit Kapila <[email protected]>
Wed, 11 May 2022 05:41:44 +0000 (11:11 +0530)
committer Amit Kapila <[email protected]>
Wed, 11 May 2022 05:41:44 +0000 (11:11 +0530)
commit f95d53eded55ecbf037f6416ced6af29a2c3caca
tree 300a90851aa4256f9c5cdf0e8f88e9d8efdf40ba
parent 8bbf8461a3a2a38ce5f2952a025385b6938a61f7

The problem is that, during logical replication, we can go a long time
without sending keepalive messages while processing a large transaction
for which we don't send any data. This can happen when the table modified
in the transaction is not published, or when all of its changes are
filtered out. We do try to send a keepalive if necessary at the end of
the transaction (via WalSndWriteData()), but by that time the subscriber
side may already have timed out and exited.

To fix this, we try to send a keepalive message, if required, after
processing a certain threshold of changes.
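
The idea of the fix can be illustrated with a minimal, standalone sketch.
It is not the code from this commit: the names update_progress(),
maybe_send_keepalive() and process_transaction(), and the threshold value
of 100, are assumptions chosen for illustration, and the keepalive step is
reduced to a counter. It only shows how counting every processed change,
including filtered ones, gives the sender a regular chance to send a
keepalive before the transaction ends.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold: changes processed between keepalive checks. */
#define CHANGES_THRESHOLD 100

/* Changes seen since the last keepalive check. */
static int changes_count = 0;

/* How many times the keepalive check ran (stand-in for real network I/O). */
static int keepalive_checks = 0;

/*
 * Stand-in for the sender's "send a keepalive if one is due" step.
 * A real sender would compare timestamps against its timeout here.
 */
static void
maybe_send_keepalive(void)
{
    keepalive_checks++;
}

/*
 * Called once for every decoded change, whether or not the change is
 * actually sent to the subscriber.
 */
static void
update_progress(void)
{
    if (++changes_count >= CHANGES_THRESHOLD)
    {
        maybe_send_keepalive();
        changes_count = 0;
    }
}

/*
 * Process a transaction of nchanges changes, none of which is published.
 * Returns the number of changes sent to the subscriber.
 */
static int
process_transaction(int nchanges)
{
    int sent = 0;

    assert(nchanges >= 0);

    for (int i = 0; i < nchanges; i++)
    {
        bool published = false;     /* every change is filtered out */

        if (published)
            sent++;

        /* Count the change even though nothing was sent for it. */
        update_progress();
    }

    return sent;
}
```

In this sketch, a transaction of 250 filtered changes sends no data but
still runs the keepalive check twice, instead of only once the whole
transaction has been processed.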

Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed-by: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
src/backend/replication/logical/logical.c
src/backend/replication/pgoutput/pgoutput.c
src/backend/replication/walsender.c
src/include/replication/logical.h