Commit 948d83d

Karolina Szczepankiewicz authored and bjornmu committed
Bug #34860923 : Timeout on cv in waiting_with_heartbeat cause dump thread to stop
Problem
-------
In case a binary log dump thread waits for new events with a heartbeat configured and a new event arrives, it is possible that a binary log dump thread will send an EOF packet to connected client (replica/mysqlbinlog/custom client...) before sending all of the events.

Analysis / Root-cause analysis
------------------------------
It happens in case binary log dump thread exits with a timeout on conditional variable just before position gets updated. Function 'wait_with_heartbeat' exits with a code 1, which is treated later on as the end of the execution.

Solution
--------
Ignore the code returned from the 'wait' function, since a timeout is not important information for the binary dump log thread. In case a timeout occurs, binary log dump thread should continue execution or abort in case thread was stopped. Return 0 from the wait_with_heartbeat or 1 in case of send/flush error.

Signed-off-by: Karolina Szczepankiewicz <[email protected]>
Change-Id: I027985aafc1234194f0798ba52b65cce36936f24
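
The root cause described in the commit message can be illustrated with a minimal, self-contained C++ sketch. It is not the MySQL code: `WaitResult`, `HasNewData`, `TimedWait`, `wait_propagating_timeout` and `wait_ignoring_timeout` are hypothetical names introduced here, and the timed wait is passed in as a callback so that the "timeout reported just as the position is updated" case can be reproduced deterministically.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the outcome of a timed wait on a condition
// variable; this is not a MySQL type.
enum class WaitResult { kSignaled, kTimedOut };

using HasNewData = std::function<bool()>;
using TimedWait = std::function<WaitResult()>;

// Pre-fix pattern: the timeout flag is propagated as the return code, so a
// timeout that coincides with a position update looks like a failure (1).
int wait_propagating_timeout(const HasNewData &has_new_data,
                             const TimedWait &timed_wait) {
  assert(has_new_data && timed_wait);
  int ret = 0;
  while (!has_new_data()) {
    ret = (timed_wait() == WaitResult::kTimedOut) ? 1 : 0;
    if (has_new_data()) return ret;  // may be 1 although data is ready
    // (a heartbeat would be sent here before waiting again)
  }
  return ret;
}

// Post-fix pattern: the timeout is ignored; only the re-checked condition
// decides whether waiting is over, and success is always reported as 0.
int wait_ignoring_timeout(const HasNewData &has_new_data,
                          const TimedWait &timed_wait) {
  assert(has_new_data && timed_wait);
  while (!has_new_data()) {
    timed_wait();  // result intentionally discarded
    if (has_new_data()) return 0;
    // (a heartbeat would be sent here before waiting again)
  }
  return 0;
}
```

If the `timed_wait` callback reports a timeout while also making `has_new_data` true, the first function returns 1 and the second returns 0. Under the assumptions above, that corresponds to the dump thread ending prematurely before the fix and continuing after it.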
1 parent a153bf5 commit 948d83d

3 files changed: +144 -4 lines changed

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+
+1. Setup a simple replication topology : source -> replica
+
+include/master-slave.inc
+Warnings:
+Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
+Note #### Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
+[connection master]
+
+2. Setup heartbeat period to 1 ms
+
+[connection slave]
+include/stop_slave.inc
+CHANGE REPLICATION SOURCE TO SOURCE_HEARTBEAT_PERIOD=0.001;
+include/start_slave.inc
+[connection master]
+CREATE TABLE test.t(a INT);
+
+3. Execute `mysqlslap` in a loop.
+Let the source send heartbeat messages between iterations.
+
+[connection server_1]
+
+4. Sync the replica
+
+[connection master]
+include/sync_slave_sql_with_master.inc
+
+5. Verify that Dump thread was not restarted between
+mysqlslap iterations. Dump thread should exit only
+if network is unstable, e.g. there was an error on 'send' or 'flush'
+
+include/assert_grep.inc [Binary dump log thread should be started twice]
+
+6. Cleanup
+
+[connection slave]
+include/stop_slave.inc
+CHANGE REPLICATION SOURCE TO SOURCE_HEARTBEAT_PERIOD=SAVED_HEARTBEAT_PERIOD;
+include/start_slave.inc
+[connection master]
+DROP TABLE test.t;
+include/rpl_end.inc
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+# ==== Purpose ====
+#
+# The purpose of this script is to test that the binary log thread
+# does not exit in case events are written after some "idle" time,
+# after which the source sends heartbeat messages to the replica
+#
+# ==== Requirements ====
+#
+# R1. Dump thread with heartbeat option enabled should disconnect
+# only if an error occurs (send/flush).
+#
+# ==== Implementation ====
+#
+# 1. Setup a simple replication topology : source -> replica
+# 2. Setup heartbeat period to 1 ms
+# 3. Execute `mysqlslap` in a loop.
+# Let the source send heartbeat messages between iterations.
+# 4. Sync the replica.
+# 5. Verify that dump thread was not restarted between mysqlslap
+# iterations. Dump thread should exit only if a network is
+# unstable, e.g. there was an error on "send" or "flush".
+# 6. Cleanup
+#
+# ==== References ====
+#
+# BUG#34860923 Timeout on cv in waiting with heartbeat cause dump thread to stop
+#
+
+--echo
+--echo 1. Setup a simple replication topology : source -> replica
+--echo
+--source include/master-slave.inc
+--source include/have_binlog_format_row.inc
+
+--echo
+--echo 2. Setup heartbeat period to 1 ms
+--echo
+--source include/rpl_connection_slave.inc
+--source include/stop_slave.inc
+--let $saved_heartbeat_period = `SELECT Heartbeat FROM mysql.slave_master_info`
+CHANGE REPLICATION SOURCE TO SOURCE_HEARTBEAT_PERIOD=0.001;
+--source include/start_slave.inc
+
+--source include/rpl_connection_master.inc
+CREATE TABLE test.t(a INT);
+
+--echo
+--echo 3. Execute `mysqlslap` in a loop.
+--echo Let the source send heartbeat messages between iterations.
+--echo
+
+--let $mysqlslap_total_iters = 50
+--let $i = 0
+
+--let $rpl_connection_name= server_1
+--source include/rpl_connection.inc
+
+while ($i < $mysqlslap_total_iters)
+{
+  --exec $MYSQL_SLAP --create-schema=test --delimiter=";" --iterations=5 --query="INSERT INTO test.t VALUES (1)" --concurrency=1 --silent 2>&1
+  --inc $i
+  --sleep 0.1
+}
+
+--echo
+--echo 4. Sync the replica
+--echo
+--source include/rpl_connection_master.inc
+--source include/sync_slave_sql_with_master.inc
+
+--echo
+--echo 5. Verify that Dump thread was not restarted between
+--echo mysqlslap iterations. Dump thread should exit only
+--echo if network is unstable, e.g. there was an error on 'send' or 'flush'
+--echo
+
+--let $assert_text = Binary dump log thread should be started twice
+--let $assert_file = $MYSQLTEST_VARDIR/log/mysqld.1.err
+--let $assert_select = Start binlog_dump to
+--let $assert_count = 2
+--source include/assert_grep.inc
+
+--echo
+--echo 6. Cleanup
+--echo
+
+
+--source include/rpl_connection_slave.inc
+--source include/stop_slave.inc
+--replace_result $saved_heartbeat_period SAVED_HEARTBEAT_PERIOD
+--eval CHANGE REPLICATION SOURCE TO SOURCE_HEARTBEAT_PERIOD=$saved_heartbeat_period
+--source include/start_slave.inc
+
+--source include/rpl_connection_master.inc
+DROP TABLE test.t;
+
+--source include/rpl_end.inc
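
The check in step 5 of the test amounts to counting how many times the source's error log records a dump thread start. A rough C++ sketch of that idea follows, assuming the check counts log lines that contain the pattern `Start binlog_dump to`. The real check is done by the mysqltest include file `include/assert_grep.inc`, which treats `$assert_select` as a pattern rather than a plain substring; `count_lines_containing` and `count_in_text` are hypothetical helpers introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts the lines of `log` that contain `needle` as a plain substring.
// Simplification: no regular-expression matching is performed.
std::size_t count_lines_containing(std::istream &log,
                                   const std::string &needle) {
  assert(!needle.empty());  // an empty needle would match every line
  std::size_t count = 0;
  std::string line;
  while (std::getline(log, line)) {
    if (line.find(needle) != std::string::npos) ++count;
  }
  return count;
}

// Convenience wrapper for text that is already in memory.
std::size_t count_in_text(const std::string &text, const std::string &needle) {
  std::istringstream stream(text);
  return count_lines_containing(stream, needle);
}
```

The test expects a count of exactly 2, presumably one start from the initial replication setup and one from the replica restart in step 2. Any dump thread restart caused by the bug during the `mysqlslap` loop would push the count above 2.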

sql/rpl_binlog_sender.cc

Lines changed: 4 additions & 4 deletions
@@ -803,13 +803,13 @@ inline int Binlog_sender::wait_with_heartbeat(my_off_t log_pos) {
 #ifndef NDEBUG
   ulong hb_info_counter = 0;
 #endif
-  int ret = 0;
 
   while (!stop_waiting_for_update(log_pos)) {
-    ret = mysql_bin_log.wait_for_update(m_heartbeat_period) > 0 ? 1 : 0;
+    // ignoring timeout on conditional variable
+    mysql_bin_log.wait_for_update(m_heartbeat_period);
 
     if (stop_waiting_for_update(log_pos)) {
-      return ret;
+      return 0;
     }
     mysql_bin_log.unlock_binlog_end_pos();
     Scope_guard lock([]() { mysql_bin_log.lock_binlog_end_pos(); });
@@ -825,7 +825,7 @@ inline int Binlog_sender::wait_with_heartbeat(my_off_t log_pos) {
     if (send_heartbeat_event(log_pos)) return 1;
   }
 
-  return ret;
+  return 0;
 }
 
 inline int Binlog_sender::wait_without_heartbeat(my_off_t log_pos) {
