From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [RFC PATCH 07/12] box: retrieve end vclock before starting local recovery
Date: Wed, 6 Jun 2018 20:45:07 +0300
Message-Id: <2e093f7806eecaeab239e42e9a2decb80fa048ef.1528305232.git.vdavydov.dev@gmail.com>
In-Reply-To:
References:
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
List-ID:

In order to find out if the current instance has fallen too far behind
its peers in the cluster and so needs to be rebootstrapped, we need to
know its vclock before we start local recovery. To do that, let's scan
the most recent xlog. In the future, we can optimize that by either
storing the end vclock in the xlog eof marker or by making a new xlog
on server stop.

Needed for #461
---
 src/box/box.cc      | 20 +++++++++++++-------
 src/box/recovery.cc | 23 +++++++++++++++++++++++
 src/box/recovery.h  |  3 +++
 3 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 9105ed19..b072f788 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1858,6 +1858,14 @@ box_cfg_xc(void)
 	auto guard = make_scoped_guard([=]{ recovery_delete(recovery); });
 
 	/*
+	 * Initialize the replica set vclock from recovery.
+	 * The local WAL may contain rows from remote masters,
+	 * so we must reflect this in replicaset vclock to
+	 * not attempt to apply these rows twice.
+	 */
+	recovery_end_vclock(recovery, &replicaset.vclock);
+
+	/*
 	 * recovery->vclock is needed by Vinyl to filter
 	 * WAL rows that were dumped before restart.
 	 *
@@ -1907,6 +1915,11 @@ box_cfg_xc(void)
 			fiber_sleep(0.1);
 		}
 		recovery_stop_local(recovery);
+		/*
+		 * Advance replica set vclock to reflect records
+		 * applied in hot standby mode.
+		 */
+		vclock_copy(&replicaset.vclock, &recovery->vclock);
 		box_bind();
 	}
 	recovery_finalize(recovery);
@@ -1922,13 +1935,6 @@ box_cfg_xc(void)
 	/* Clear the pointer to journal before it goes out of scope */
 	journal_set(NULL);
 
-	/*
-	 * Initialize the replica set vclock from recovery.
-	 * The local WAL may contain rows from remote masters,
-	 * so we must reflect this in replicaset vclock to
-	 * not attempt to apply these rows twice.
-	 */
-	vclock_copy(&replicaset.vclock, &recovery->vclock);
 
 	/** Begin listening only when the local recovery is complete. */
 	box_listen();
diff --git a/src/box/recovery.cc b/src/box/recovery.cc
index 5ef1f979..8bf081d6 100644
--- a/src/box/recovery.cc
+++ b/src/box/recovery.cc
@@ -137,6 +137,29 @@ recovery_new(const char *wal_dirname, bool force_recovery,
 	return r;
 }
 
+void
+recovery_end_vclock(struct recovery *r, struct vclock *end_vclock)
+{
+	xdir_scan_xc(&r->wal_dir);
+
+	struct vclock *vclock = vclockset_last(&r->wal_dir.index);
+	if (vclock == NULL || vclock_compare(vclock, &r->vclock) < 0) {
+		/* No xlogs after last checkpoint. */
+		vclock_copy(end_vclock, &r->vclock);
+		return;
+	}
+
+	/* Scan the last xlog to find end vclock. */
+	vclock_copy(end_vclock, vclock);
+	struct xlog_cursor cursor;
+	if (xdir_open_cursor(&r->wal_dir, vclock_sum(vclock), &cursor) != 0)
+		return;
+	struct xrow_header row;
+	while (xlog_cursor_next(&cursor, &row, true) == 0)
+		vclock_follow(end_vclock, row.replica_id, row.lsn);
+	xlog_cursor_close(&cursor, false);
+}
+
 static inline void
 recovery_close_log(struct recovery *r)
 {
diff --git a/src/box/recovery.h b/src/box/recovery.h
index 6aba922b..1ae6f2c3 100644
--- a/src/box/recovery.h
+++ b/src/box/recovery.h
@@ -69,6 +69,9 @@ void
 recovery_delete(struct recovery *r);
 
 void
+recovery_end_vclock(struct recovery *r, struct vclock *end_vclock);
+
+void
 recovery_follow_local(struct recovery *r, struct xstream *stream,
 		      const char *name, ev_tstamp wal_dir_rescan_delay);
 
-- 
2.11.0
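
For illustration only, the idea behind recovery_end_vclock() can be
sketched without any Tarantool headers: start from the vclock recorded
at the beginning of the last xlog, then advance it with every row found
in that file. All toy_* names below are hypothetical stand-ins, not the
real vclock/xlog API, and the toy follow helper simply keeps the
maximum LSN per replica, which is an assumption made to keep the sketch
short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the vclock and xrow types. */
enum { TOY_VCLOCK_MAX = 32 };

struct toy_vclock {
	/* One LSN per replica id; 0 means "nothing seen yet". */
	int64_t lsn[TOY_VCLOCK_MAX];
};

struct toy_row {
	uint32_t replica_id;
	int64_t lsn;
};

/* Advance one component of the vclock (assumption: keep the maximum). */
static void
toy_vclock_follow(struct toy_vclock *vclock, uint32_t replica_id, int64_t lsn)
{
	assert(replica_id < TOY_VCLOCK_MAX);
	if (lsn > vclock->lsn[replica_id])
		vclock->lsn[replica_id] = lsn;
}

/*
 * Compute the end vclock of a log: copy the vclock the log starts
 * with, then follow every row stored in it. The rows array stands in
 * for what an xlog cursor would yield.
 */
static void
toy_end_vclock(const struct toy_vclock *start, const struct toy_row *rows,
	       size_t row_count, struct toy_vclock *end_vclock)
{
	*end_vclock = *start;
	for (size_t i = 0; i < row_count; i++)
		toy_vclock_follow(end_vclock, rows[i].replica_id, rows[i].lsn);
}
```

With a start vclock of {1: 10, 2: 5} and rows (1, 11), (2, 6), (1, 12),
the sketch yields {1: 12, 2: 6}. An empty row array leaves the start
vclock unchanged, which is also the value the patch falls back to when
the cursor cannot be opened.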