Tarantool development patches archive
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [PATCH v3 00/11] Replica rejoin
Date: Sat, 14 Jul 2018 23:49:15 +0300
Message-ID: <cover.1531598427.git.vdavydov.dev@gmail.com>

After this patch set is applied, an instance will try to detect whether
it has fallen too far behind its peers in the cluster and therefore needs
to be rebootstrapped. If so, it will skip local recovery and instead
bootstrap from a remote master. Old files (xlog, snap) are not deleted
during rebootstrap; they are removed by gc as usual.

https://github.com/tarantool/tarantool/issues/461
https://github.com/tarantool/tarantool/commits/dv/gh-461-replica-rejoin

Changes in v3:
 - Remove merged patches, add some new ones.
 - Rebase on top of the latest 1.10: this required patching gc to make
   it track vclocks instead of signatures so that it could report the
   vclock of the oldest xlog stored on the instance.
 - Follow up on the recently committed patch for the recovery subsystem:
   add some comments and remove double scanning of the WAL directory.
 - Introduce a new IPROTO command, IPROTO_REQUEST_STATUS, to be used
   instead of IPROTO_REQUEST_VOTE; send a map in reply to this command.
   Rationale: a map is more flexible and can be extended. In particular,
   we can use the very same message to query the oldest vclock
   stored on the master and so detect whether a replica needs to be
   rejoined, instead of introducing a new IPROTO command, as we did in v2.
 - Do NOT rebootstrap a replica if it has some data that is absent on
   the master. Rationale: we don't want to lose ANY data by rejoining a
   replica; besides, if a replica's vclock is incomparable with the
   master's, xdir_scan may break.
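
The rejoin decision described in the v3 changes can be illustrated with
a small sketch. This is not the code from the patches (those are in
C/C++); it is a Python model under stated assumptions: the status reply
is modelled as a plain dict, the key names "vclock" and "oldest_vclock"
are hypothetical, and the exact condition used in the series may differ.
It shows only the two ideas above: a replica is a rejoin candidate when
the master no longer stores the xlogs it needs, and it is never
rebootstrapped when it holds data the master lacks or when the vclocks
are incomparable.

```python
from typing import Dict, Optional

Vclock = Dict[int, int]  # replica id -> last LSN seen from that replica


def vclock_compare(a: Vclock, b: Vclock) -> Optional[int]:
    """Return -1 if a < b, 0 if a == b, 1 if a > b, None if incomparable."""
    a_le_b = b_le_a = True
    for replica_id in set(a) | set(b):
        x, y = a.get(replica_id, 0), b.get(replica_id, 0)
        if x > y:
            a_le_b = False
        elif x < y:
            b_le_a = False
    if a_le_b and b_le_a:
        return 0
    if a_le_b:
        return -1
    if b_le_a:
        return 1
    return None


def needs_rebootstrap(replica: Vclock, status: dict) -> bool:
    """Decide whether a replica should skip local recovery and rejoin.

    `status` models the map sent in reply to the status request. Only the
    two (hypothetical) keys below are read; any other keys are ignored,
    which is what makes a map reply easy to extend.
    """
    master = status["vclock"]
    oldest = status["oldest_vclock"]
    # Safety check: never rejoin if the replica has rows the master lacks
    # (replica > master) or if the two vclocks are incomparable.
    if vclock_compare(replica, master) not in (-1, 0):
        return False
    # The master can still feed the replica if its oldest xlog starts at
    # or before the replica's position; otherwise the replica fell behind.
    return vclock_compare(oldest, replica) not in (-1, 0)


status = {"vclock": {1: 100}, "oldest_vclock": {1: 50}, "is_ro": False}
far_behind = needs_rebootstrap({1: 5}, status)
can_follow = needs_rebootstrap({1: 60}, status)
has_own_data = needs_rebootstrap({1: 5, 2: 3}, status)
```

With this sample reply, a replica at {1: 5} is a rejoin candidate, one
at {1: 60} can still follow the master from its xlogs, and one at
{1: 5, 2: 3} is left alone because it has rows from replica id 2 that
the master has never seen.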

v2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-0011-Replica-rejoin

Changes in v2:
 - Implement rebootstrap support for vinyl engine.
 - Call recover_remaining_wals() explicitly after recovery_stop_local()
   as suggested by @kostja.
 - Add comment to memtx_engine_new() explaining why we need to init
   INSTANCE_UUID before proceeding to local recovery.

v1: https://www.freelists.org/post/tarantool-patches/RFC-PATCH-0012-Replica-rejoin

Vladimir Davydov (11):
  recovery: clean up WAL dir scan code
  xrow: factor out function for decoding vclock
  Introduce IPROTO_REQUEST_STATUS command
  Get rid of IPROTO_SERVER_IS_RO
  gc: keep track of vclocks instead of signatures
  Include oldest vclock available on the instance in IPROTO_STATUS
  replication: rebootstrap instance on startup if it fell behind
  vinyl: simplify vylog recovery from backup
  vinyl: pass flags to vy_recovery_new
  Update test-run
  vinyl: implement rebootstrap support

 src/box/applier.cc                       |   6 +-
 src/box/applier.h                        |   8 +-
 src/box/box.cc                           |  26 +++-
 src/box/box.h                            |   3 +
 src/box/gc.c                             |  89 ++++++-----
 src/box/gc.h                             |  23 +--
 src/box/iproto.cc                        |  16 +-
 src/box/iproto_constants.c               |   4 +-
 src/box/iproto_constants.h               |  15 +-
 src/box/lua/info.c                       |   4 +-
 src/box/recovery.cc                      |   2 +-
 src/box/recovery.h                       |   7 +-
 src/box/relay.cc                         |  21 +--
 src/box/replication.cc                   |  36 ++++-
 src/box/replication.h                    |   9 ++
 src/box/vinyl.c                          |   8 +-
 src/box/vy_log.c                         | 207 ++++++++++++++++++-------
 src/box/vy_log.h                         |  50 +++++-
 src/box/wal.c                            |   9 ++
 src/box/xrow.c                           | 179 +++++++++++++++------
 src/box/xrow.h                           | 106 +++++++------
 src/errinj.h                             |   1 +
 test-run                                 |   2 +-
 test/box/errinj.result                   |   6 +-
 test/replication/replica_rejoin.result   | 250 ++++++++++++++++++++++++++++++
 test/replication/replica_rejoin.test.lua |  91 +++++++++++
 test/vinyl/replica_rejoin.lua            |  13 ++
 test/vinyl/replica_rejoin.result         | 257 +++++++++++++++++++++++++++++++
 test/vinyl/replica_rejoin.test.lua       |  88 +++++++++++
 test/vinyl/suite.ini                     |   2 +-
 30 files changed, 1293 insertions(+), 245 deletions(-)
 create mode 100644 test/replication/replica_rejoin.result
 create mode 100644 test/replication/replica_rejoin.test.lua
 create mode 100644 test/vinyl/replica_rejoin.lua
 create mode 100644 test/vinyl/replica_rejoin.result
 create mode 100644 test/vinyl/replica_rejoin.test.lua

-- 
2.11.0


Thread overview: 31+ messages
2018-07-14 20:49 Vladimir Davydov [this message]
2018-07-14 20:49 ` [PATCH v3 01/11] recovery: clean up WAL dir scan code Vladimir Davydov
2018-07-19  7:08   ` Konstantin Osipov
2018-07-14 20:49 ` [PATCH v3 02/11] xrow: factor out function for decoding vclock Vladimir Davydov
2018-07-19  7:08   ` Konstantin Osipov
2018-07-14 20:49 ` [PATCH v3 03/11] Introduce IPROTO_REQUEST_STATUS command Vladimir Davydov
2018-07-19  7:10   ` Konstantin Osipov
2018-07-19  8:17     ` Vladimir Davydov
2018-07-21 10:25   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 04/11] Get rid of IPROTO_SERVER_IS_RO Vladimir Davydov
2018-07-19  7:10   ` Konstantin Osipov
2018-07-21 12:07   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 05/11] gc: keep track of vclocks instead of signatures Vladimir Davydov
2018-07-19  7:11   ` Konstantin Osipov
2018-07-14 20:49 ` [PATCH v3 06/11] Include oldest vclock available on the instance in IPROTO_STATUS Vladimir Davydov
2018-07-19  7:12   ` Konstantin Osipov
2018-07-21 12:07   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 07/11] replication: rebootstrap instance on startup if it fell behind Vladimir Davydov
2018-07-19  7:19   ` Konstantin Osipov
2018-07-19 10:04     ` Vladimir Davydov
2018-07-23 20:19       ` Konstantin Osipov
2018-07-27 16:13         ` [PATCH] replication: print master uuid when (re)bootstrapping Vladimir Davydov
2018-07-31  8:34           ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 08/11] vinyl: simplify vylog recovery from backup Vladimir Davydov
2018-07-31  8:21   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 09/11] vinyl: pass flags to vy_recovery_new Vladimir Davydov
2018-07-21 11:12   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 10/11] Update test-run Vladimir Davydov
2018-07-21 11:13   ` Vladimir Davydov
2018-07-14 20:49 ` [PATCH v3 11/11] vinyl: implement rebootstrap support Vladimir Davydov
2018-07-31  8:23   ` Vladimir Davydov
