From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Vladimir Davydov Subject: [PATCH 13/13] replication: ref checkpoint needed to join replica Date: Thu, 4 Oct 2018 20:20:15 +0300 Message-Id: In-Reply-To: References: In-Reply-To: References: To: kostja@tarantool.org Cc: tarantool-patches@freelists.org List-ID: Before joining a new replica we register a gc_consumer to prevent garbage collection of files needed for join and following subscribe. Before commit 9c5d851d7830 ("replication: remove old snapshot files not needed by replicas") a consumer would pin both checkpoints and WALs so that would work as expected. However, the above mentioned commit introduced consumer types and marked a consumer registered on replica join as WAL-only so if the garbage collector was invoked during join, it could delete files corresponding to the relayed checkpoint resulting in replica join failure. Fix this issue by pinning the checkpoint used for joining a replica with gc_ref_checkpoint and unpinning once join is complete. The issue can only be reproduced if there are vinyl spaces, because deletion of an open snap file doesn't prevent the relay from reading it. The existing replication/gc test would catch the issue if it triggered compaction on the master so we simply tweak it accordingly instead of adding a new test case. Closes #3708 --- src/box/box.cc | 26 ++++++++++++++++---------- test/replication/gc.result | 19 +++++++++++++------ test/replication/gc.test.lua | 8 +++++--- 3 files changed, 34 insertions(+), 19 deletions(-) diff --git a/src/box/box.cc b/src/box/box.cc index 2679f6da..7e32b9fc 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -1465,14 +1465,14 @@ box_process_join(struct ev_io *io, struct xrow_header *header) struct vclock start_vclock; vclock_copy(&start_vclock, &checkpoint->vclock); - /* Register the replica with the garbage collector. */ - struct gc_consumer *gc = gc_consumer_register(&start_vclock, - "replica %s", tt_uuid_str(&instance_uuid)); - if (gc == NULL) - diag_raise(); - auto gc_guard = make_scoped_guard([=]{ - gc_consumer_unregister(gc); - }); + /* + * Make sure the checkpoint files won't be deleted while + * initial join is in progress. + */ + struct gc_checkpoint_ref gc; + gc_ref_checkpoint(checkpoint, &gc, "replica %s", + tt_uuid_str(&instance_uuid)); + auto gc_guard = make_scoped_guard([&]{ gc_unref_checkpoint(&gc); }); /* Respond to JOIN request with start_vclock. */ struct xrow_header row; @@ -1496,8 +1496,14 @@ box_process_join(struct ev_io *io, struct xrow_header *header) replica = replica_by_uuid(&instance_uuid); assert(replica != NULL); - replica->gc = gc; - gc_guard.is_active = false; + + /* Register the replica as a WAL consumer. */ + if (replica->gc != NULL) + gc_consumer_unregister(replica->gc); + replica->gc = gc_consumer_register(&start_vclock, "replica %s", + tt_uuid_str(&instance_uuid)); + if (replica->gc == NULL) + diag_raise(); /* Remember master's vclock after the last request */ struct vclock stop_vclock; diff --git a/test/replication/gc.result b/test/replication/gc.result index 83d0de29..5944ba51 100644 --- a/test/replication/gc.result +++ b/test/replication/gc.result @@ -16,6 +16,10 @@ fiber = require('fiber') test_run:cleanup_cluster() --- ... +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") +--- +- true +... -- Make each snapshot trigger garbage collection. default_checkpoint_count = box.cfg.checkpoint_count --- @@ -40,10 +44,17 @@ box.error.injection.set("ERRINJ_RELAY_REPORT_INTERVAL", 0.05) s = box.schema.space.create('test', {engine = engine}); --- ... -_ = s:create_index('pk') +_ = s:create_index('pk', {run_count_per_level = 1}) --- ... -for i = 1, 100 do s:auto_increment{} end +for i = 1, 50 do s:auto_increment{} end +--- +... +box.snapshot() +--- +- ok +... +for i = 1, 50 do s:auto_increment{} end --- ... box.snapshot() @@ -76,10 +87,6 @@ test_run:cmd("setopt delimiter ''"); --- ... -- Start the replica. -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") ---- -- true -... test_run:cmd("start server replica") --- - true diff --git a/test/replication/gc.test.lua b/test/replication/gc.test.lua index eed76850..84c980dd 100644 --- a/test/replication/gc.test.lua +++ b/test/replication/gc.test.lua @@ -5,6 +5,7 @@ replica_set = require('fast_replica') fiber = require('fiber') test_run:cleanup_cluster() +test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") -- Make each snapshot trigger garbage collection. default_checkpoint_count = box.cfg.checkpoint_count @@ -21,8 +22,10 @@ box.error.injection.set("ERRINJ_RELAY_REPORT_INTERVAL", 0.05) -- Create and populate the space we will replicate. s = box.schema.space.create('test', {engine = engine}); -_ = s:create_index('pk') -for i = 1, 100 do s:auto_increment{} end +_ = s:create_index('pk', {run_count_per_level = 1}) +for i = 1, 50 do s:auto_increment{} end +box.snapshot() +for i = 1, 50 do s:auto_increment{} end box.snapshot() for i = 1, 100 do s:auto_increment{} end @@ -43,7 +46,6 @@ end) test_run:cmd("setopt delimiter ''"); -- Start the replica. -test_run:cmd("create server replica with rpl_master=default, script='replication/replica.lua'") test_run:cmd("start server replica") -- Despite the fact that we invoked garbage collection that -- 2.11.0