From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [PATCH v3 00/11] Replica rejoin
Date: Sat, 14 Jul 2018 23:49:15 +0300
Message-Id:
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
List-ID:

After this patch set is applied, an instance will try to detect whether
it has fallen too far behind its peers in the cluster and so needs to be
rebootstrapped. If it has, it will skip local recovery and instead
proceed to bootstrap from a remote master.

Old files (xlog, snap) are not deleted during rebootstrap. They will be
removed by gc as usual.

https://github.com/tarantool/tarantool/issues/461
https://github.com/tarantool/tarantool/commits/dv/gh-461-replica-rejoin

Changes in v3:
 - Remove merged patches, add some new ones.
 - Rebase on top of the latest 1.10: this required patching gc to make
   it track vclocks instead of signatures so that it could report the
   vclock of the oldest xlog stored on the instance.
 - Follow-up on the recently committed patch for recovery subsystem:
   add some comments and remove double scanning of the WAL directory.
 - Introduce a new IPROTO command, IPROTO_REQUEST_STATUS, to be used
   instead of IPROTO_REQUEST_VOTE; send a map in reply to this command.
   Rationale: a map is more flexible and can be extended. In particular,
   we can use the very same message for inquiring the oldest vclock
   stored on the master to detect if a replica needs to be rejoined,
   instead of introducing a new IPROTO command, as we did in v2.
 - Do NOT rebootstrap a replica if it has some data that is absent on
   the master. Rationale: we don't want to lose ANY data by rejoining
   a replica; besides, if a replica's vclock is incomparable with the
   master's, xdir_scan may break.

v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-0011-Replica-rejoin

Changes in v2:
 - Implement rebootstrap support for vinyl engine.
 - Call recover_remaining_wals() explicitly after recovery_stop_local()
   as suggested by @kostja.
 - Add comment to memtx_engine_new() explaining why we need to init
   INSTANCE_UUID before proceeding to local recovery.

v1: https://www.freelists.org/post/tarantool-patches/RFC-PATCH-0012-Replica-rejoin

Vladimir Davydov (11):
  recovery: clean up WAL dir scan code
  xrow: factor out function for decoding vclock
  Introduce IPROTO_REQUEST_STATUS command
  Get rid of IPROTO_SERVER_IS_RO
  gc: keep track of vclocks instead of signatures
  Include oldest vclock available on the instance in IPROTO_STATUS
  replication: rebootstrap instance on startup if it fell behind
  vinyl: simplify vylog recovery from backup
  vinyl: pass flags to vy_recovery_new
  Update test-run
  vinyl: implement rebootstrap support

 src/box/applier.cc                       |   6 +-
 src/box/applier.h                        |   8 +-
 src/box/box.cc                           |  26 +++-
 src/box/box.h                            |   3 +
 src/box/gc.c                             |  89 ++++++-----
 src/box/gc.h                             |  23 +--
 src/box/iproto.cc                        |  16 +-
 src/box/iproto_constants.c               |   4 +-
 src/box/iproto_constants.h               |  15 +-
 src/box/lua/info.c                       |   4 +-
 src/box/recovery.cc                      |   2 +-
 src/box/recovery.h                       |   7 +-
 src/box/relay.cc                         |  21 +--
 src/box/replication.cc                   |  36 ++++-
 src/box/replication.h                    |   9 ++
 src/box/vinyl.c                          |   8 +-
 src/box/vy_log.c                         | 207 ++++++++++++++++++-------
 src/box/vy_log.h                         |  50 +++++-
 src/box/wal.c                            |   9 ++
 src/box/xrow.c                           | 179 +++++++++++++++------
 src/box/xrow.h                           | 106 +++++++------
 src/errinj.h                             |   1 +
 test-run                                 |   2 +-
 test/box/errinj.result                   |   6 +-
 test/replication/replica_rejoin.result   | 250 ++++++++++++++++++++++++++++++
 test/replication/replica_rejoin.test.lua |  91 +++++++++++
 test/vinyl/replica_rejoin.lua            |  13 ++
 test/vinyl/replica_rejoin.result         | 257 +++++++++++++++++++++++++++++++
 test/vinyl/replica_rejoin.test.lua       |  88 +++++++++++
 test/vinyl/suite.ini                     |   2 +-
 30 files changed, 1293 insertions(+), 245 deletions(-)
 create mode 100644 test/replication/replica_rejoin.result
 create mode 100644 test/replication/replica_rejoin.test.lua
 create mode 100644 test/vinyl/replica_rejoin.lua
 create mode 100644 test/vinyl/replica_rejoin.result
 create mode 100644 test/vinyl/replica_rejoin.test.lua

-- 
2.11.0
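
For illustration only, the rejoin rule described in the cover letter can
be sketched as a pair of component-wise vclock comparisons. This is a
minimal sketch under stated assumptions, not the code from the patch
set: the type vclock_sketch, the fixed size VCLOCK_SKETCH_SIZE and the
functions vclock_sketch_le() and needs_rebootstrap() are hypothetical
names invented for this example. It assumes the master reports both its
current vclock and the vclock of the oldest xlog it still stores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed number of vclock components, for the sketch only. */
enum { VCLOCK_SKETCH_SIZE = 4 };

/* Hypothetical simplified vclock: one LSN per replica id. */
struct vclock_sketch {
	int64_t lsn[VCLOCK_SKETCH_SIZE];
};

/* Return true if every component of a is <= the same component of b. */
static bool
vclock_sketch_le(const struct vclock_sketch *a,
		 const struct vclock_sketch *b)
{
	for (int i = 0; i < VCLOCK_SKETCH_SIZE; i++) {
		if (a->lsn[i] > b->lsn[i])
			return false;
	}
	return true;
}

/*
 * Decide whether a replica should skip local recovery and
 * bootstrap from the master again.
 */
static bool
needs_rebootstrap(const struct vclock_sketch *replica,
		  const struct vclock_sketch *master,
		  const struct vclock_sketch *master_oldest)
{
	/*
	 * The master still stores every xlog the replica needs,
	 * so the replica can catch up the usual way.
	 */
	if (vclock_sketch_le(master_oldest, replica))
		return false;
	/*
	 * The replica has rows that are absent on the master, or the
	 * two vclocks are incomparable: do not rejoin, so that no
	 * data is lost.
	 */
	if (!vclock_sketch_le(replica, master))
		return false;
	/* The replica is strictly behind and cannot catch up from xlogs. */
	return true;
}
```

With a master at {0, 100, 50, 0} whose oldest xlog starts at
{0, 80, 40, 0}, a replica at {0, 10, 5, 0} would be rebootstrapped,
while a replica at {0, 10, 5, 7} would not, because its fourth
component holds rows the master has never seen.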