From: Vladimir Davydov <vdavydov.dev@gmail.com> To: tarantool-patches@freelists.org Subject: [PATCH 1/5] recovery: stop writing to xstream on system error Date: Sat, 29 Dec 2018 00:21:47 +0300 [thread overview] Message-ID: <dc22e4cc81ce87fc0d852dafa96464f135bebe0a.1546030880.git.vdavydov.dev@gmail.com> (raw) In-Reply-To: <cover.1546030880.git.vdavydov.dev@gmail.com> In-Reply-To: <cover.1546030880.git.vdavydov.dev@gmail.com> In case force_recovery flag is set, recover_xlog() ignores any errors returned by xstream_write(), even SocketError or FiberIsCancelled. This may result in permanent replication breakdown as described in the next paragraph. Suppose there's a master and a replica and the master has force_recovery flag set. The replica gets stalled on WAL while applying a row fetched from the master. As a result, it stops sending ACKs. In the meantime, the master writes a lot of new rows to its WAL so that the relay thread sending changes to the replica fills up all the space available in the network buffer and blocks on the replication socket. Note, at this moment it may occur that a packet fragment has been written to the socket. The WAL delay on the replica takes long enough for replication to break on timeout: the relay reader fiber on the master doesn't receive an ACK from the replica in time and cancels the relay writer fiber. The relay writer fiber wakes up and returns to recover_xlog(), which happily continues to scan the xlog attempting to send more rows (force_recovery is set), failing, and complaining to the log. While the relay thread is still scanning the log, the replica finishes the long WAL write and reads more data from the socket, freeing up some space in the network buffer for the relay to write more rows. The relay thread, which happens to be still in recover_xlog(), writes a new row to the socket after the packet fragment it had written when it was cancelled, effectively corrupting the stream and breaking a replication with an unrecoverable error, e.g. xrow.c:99 E> ER_INVALID_MSGPACK: Invalid MsgPack - packet header Actually, it's pointless to continue scanning an xlog if xstream_write() returned any error different from ClientError - this means that the xlog is scanned by a relay thread (not local recovery) and the connection is broken, in which case there isn't much we can do but stop the relay and wait for the replica to reconnect. So let's fix this issue by ignoring force_recovery option for any error that doesn't have type ClientError. It's difficult to write a test for this case, since too many conditions have to be satisfied simultaneously for the issue to occur. Injecting errors doesn't really help here and would look artificial, because it'd rely too much on the implementation. So I'm committing this one without a test case. Part of #3910 --- src/box/recovery.cc | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/src/box/recovery.cc b/src/box/recovery.cc index 64d50989..c3cc7454 100644 --- a/src/box/recovery.cc +++ b/src/box/recovery.cc @@ -279,7 +279,17 @@ recover_xlog(struct recovery *r, struct xstream *stream, } else { say_error("can't apply row: "); diag_log(); - if (!r->wal_dir.force_recovery) + /* + * Stop recovery if a system error occurred, + * no matter if force_recovery is set or not, + * because in this case we could have written + * a packet fragment to the stream so that + * the next write would corrupt data at the + * receiving end. + */ + struct error *e = diag_last_error(diag_get()); + if (!r->wal_dir.force_recovery || + !type_assignable(&type_ClientError, e->type)) diag_raise(); } } -- 2.11.0
next prev parent reply other threads:[~2018-12-28 21:21 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-12-28 21:21 [PATCH 0/5] Fix a couple of replication breakdown issues Vladimir Davydov 2018-12-28 21:21 ` Vladimir Davydov [this message] 2018-12-29 9:09 ` [tarantool-patches] Re: [PATCH 1/5] recovery: stop writing to xstream on system error Konstantin Osipov 2018-12-29 9:50 ` Vladimir Davydov 2018-12-29 10:57 ` Vladimir Davydov 2018-12-29 12:08 ` Konstantin Osipov 2018-12-28 21:21 ` [PATCH 2/5] relay: do not try to scan xlog if exiting Vladimir Davydov 2018-12-29 9:14 ` [tarantool-patches] " Konstantin Osipov 2018-12-29 9:53 ` Vladimir Davydov 2018-12-28 21:21 ` [PATCH 3/5] relay: cleanup error handling Vladimir Davydov 2018-12-28 21:21 ` [PATCH 4/5] relay: close xlog cursor in relay thread Vladimir Davydov 2018-12-28 21:21 ` [PATCH 5/5] xlog: assure xlog is opened and closed in the same thread Vladimir Davydov 2018-12-29 11:40 ` [PATCH 0/5] Fix a couple of replication breakdown issues Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=dc22e4cc81ce87fc0d852dafa96464f135bebe0a.1546030880.git.vdavydov.dev@gmail.com \ --to=vdavydov.dev@gmail.com \ --cc=tarantool-patches@freelists.org \ --subject='Re: [PATCH 1/5] recovery: stop writing to xstream on system error' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox