From: Serge Petrenko
Date: Tue, 7 Apr 2020 18:49:54 +0300
Subject: [Tarantool-patches] [PATCH v6 1/3] replication: omit 0-th vclock component in replication responses
List-Id: Tarantool development patches
To: v.shpilevoy@tarantool.org
Cc: tarantool-patches@dev.tarantool.org

If an anonymous replica is promoted to a normal one and later becomes a
replication master, its vclock contains a non-empty 0th component, which
tracks the local changes made on this replica while it was anonymous.
There is no need to pollute a joining instance's vclock with this
non-empty 0th component.

When an anonymous replica reports its status to a remote instance, it
should also hide its 0th vclock component. This is needed for backward
compatibility with old instances, which don't ignore the 0th vclock
component coming from a remote instance by default.

To do so, introduce a new function, vclock_size_ignore0(), which doesn't
count the 0th vclock component, and patch xrow_encode_vclock() to skip
the 0th vclock component if it's present.

Also make sure that new instances ignore the 0th vclock component coming
from an unpatched remote instance.

Follow-up #3186
Prerequisite #4114
---
 src/box/relay.cc               |  2 +-
 src/box/replication.cc         |  4 ++--
 src/box/vclock.h               |  6 ++++++
 src/box/xrow.c                 | 10 +++++++---
 test/replication/anon.result   |  5 +++++
 test/replication/anon.test.lua |  2 ++
 6 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/box/relay.cc b/src/box/relay.cc
index c634348a4..fec9f07d1 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -464,7 +464,7 @@ relay_schedule_pending_gc(struct relay *relay, const struct vclock *vclock)
 		 * the greater signatures is due to changes pulled
 		 * from other members of the cluster.
 		 */
-		if (vclock_compare(&curr->vclock, vclock) > 0)
+		if (vclock_compare_ignore0(&curr->vclock, vclock) > 0)
 			break;
 		stailq_shift(&relay->pending_gc);
 		free(gc_msg);
diff --git a/src/box/replication.cc b/src/box/replication.cc
index 1345f189b..c833041a3 100644
--- a/src/box/replication.cc
+++ b/src/box/replication.cc
@@ -777,8 +777,8 @@ replicaset_needs_rejoin(struct replica **master)
 			continue;
 
 		const struct ballot *ballot = &applier->ballot;
-		if (vclock_compare(&ballot->gc_vclock,
-				   &replicaset.vclock) <= 0) {
+		if (vclock_compare_ignore0(&ballot->gc_vclock,
+					   &replicaset.vclock) <= 0) {
 			/*
 			 * There's at least one master that still stores
 			 * WALs needed by this instance.
 			 * Proceed to local
diff --git a/src/box/vclock.h b/src/box/vclock.h
index 79e5a1bc0..5c0525b00 100644
--- a/src/box/vclock.h
+++ b/src/box/vclock.h
@@ -200,6 +200,12 @@ vclock_size(const struct vclock *vclock)
 	return bit_count_u32(vclock->map);
 }
 
+static inline uint32_t
+vclock_size_ignore0(const struct vclock *vclock)
+{
+	return bit_count_u32(vclock->map & ~1);
+}
+
 static inline int64_t
 vclock_calc_sum(const struct vclock *vclock)
 {
diff --git a/src/box/xrow.c b/src/box/xrow.c
index be026a43c..21a68220a 100644
--- a/src/box/xrow.c
+++ b/src/box/xrow.c
@@ -51,7 +51,7 @@ static_assert(IPROTO_DATA < 0x7f && IPROTO_METADATA < 0x7f &&
 static inline uint32_t
 mp_sizeof_vclock(const struct vclock *vclock)
 {
-	uint32_t size = vclock_size(vclock);
+	uint32_t size = vclock_size_ignore0(vclock);
 	return mp_sizeof_map(size) + size * (mp_sizeof_uint(UINT32_MAX) +
 					     mp_sizeof_uint(UINT64_MAX));
 }
@@ -59,10 +59,14 @@ mp_sizeof_vclock(const struct vclock *vclock)
 static inline char *
 mp_encode_vclock(char *data, const struct vclock *vclock)
 {
-	data = mp_encode_map(data, vclock_size(vclock));
+	data = mp_encode_map(data, vclock_size_ignore0(vclock));
 	struct vclock_iterator it;
 	vclock_iterator_init(&it, vclock);
-	vclock_foreach(&it, replica) {
+	struct vclock_c replica;
+	replica = vclock_iterator_next(&it);
+	if (replica.id == 0)
+		replica = vclock_iterator_next(&it);
+	for ( ; replica.id < VCLOCK_MAX; replica = vclock_iterator_next(&it)) {
 		data = mp_encode_uint(data, replica.id);
 		data = mp_encode_uint(data, replica.lsn);
 	}
diff --git a/test/replication/anon.result b/test/replication/anon.result
index 88061569f..cbbeeef09 100644
--- a/test/replication/anon.result
+++ b/test/replication/anon.result
@@ -187,6 +187,11 @@ a > 0
  | ---
  | - true
  | ...
+-- 0-th vclock component isn't propagated across the cluster.
+box.info.vclock[0]
+ | ---
+ | - null
+ | ...
 test_run:cmd('switch default')
  | ---
  | - true
diff --git a/test/replication/anon.test.lua b/test/replication/anon.test.lua
index 8a8d15c18..627dc5c8e 100644
--- a/test/replication/anon.test.lua
+++ b/test/replication/anon.test.lua
@@ -66,6 +66,8 @@ test_run:cmd('switch replica_anon2')
 a = box.info.vclock[1]
 -- The instance did fetch a snapshot.
 a > 0
+-- 0-th vclock component isn't propagated across the cluster.
+box.info.vclock[0]
 test_run:cmd('switch default')
 box.space.test:insert{2}
 test_run:cmd("switch replica_anon2")
-- 
2.21.1 (Apple Git-122.3)
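
The effect of masking out bit 0 and skipping the 0th component during
encoding can be sketched outside Tarantool as follows. This is a minimal,
standalone illustration and not the patched code: `struct toy_vclock`,
`toy_bit_count()`, `toy_size_ignore0()` and `toy_collect_ignore0()` are
hypothetical names, the 32-component limit is an assumption made for the
example, and plain arrays stand in for the MsgPack output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed component limit for this sketch only. */
enum { TOY_VCLOCK_MAX = 32 };

/* Hypothetical, simplified stand-in for a vclock. */
struct toy_vclock {
	/* Bit i is set when component i is present. */
	uint32_t map;
	/* LSN of each component, indexed by replica id. */
	int64_t lsn[TOY_VCLOCK_MAX];
};

/* Portable population count: clears the lowest set bit each iteration. */
static uint32_t
toy_bit_count(uint32_t x)
{
	uint32_t n = 0;
	for (; x != 0; x &= x - 1)
		n++;
	return n;
}

/* Number of present components, not counting component 0. */
static uint32_t
toy_size_ignore0(const struct toy_vclock *vc)
{
	assert(vc != NULL);
	return toy_bit_count(vc->map & ~(uint32_t)1);
}

/*
 * Copy the (id, lsn) pairs of all present components except
 * component 0 into the output arrays, in ascending id order.
 * Returns the number of pairs written, which always equals
 * toy_size_ignore0(vc), so a map header sized with that function
 * matches the number of pairs that follow it.
 */
static uint32_t
toy_collect_ignore0(const struct toy_vclock *vc, uint32_t *ids,
		    int64_t *lsns)
{
	assert(vc != NULL && ids != NULL && lsns != NULL);
	uint32_t n = 0;
	/* Start from id 1: component 0 is never emitted. */
	for (uint32_t id = 1; id < TOY_VCLOCK_MAX; id++) {
		if ((vc->map & ((uint32_t)1 << id)) == 0)
			continue;
		ids[n] = id;
		lsns[n] = vc->lsn[id];
		n++;
	}
	return n;
}
```

For a toy vclock with components 0, 1 and 2 present, toy_size_ignore0()
yields 2 and only the pairs for ids 1 and 2 are collected, so the local
0th component never reaches the receiving side.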