From: Serge Petrenko
Date: Mon, 14 Sep 2020 12:15:54 +0300
Subject: Re: [Tarantool-patches] [PATCH v2 1/1] memtx: force async snapshot transactions
List-Id: Tarantool development patches
To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org, gorcunov@gmail.com

13.09.2020 22:39, Vladislav Shpilevoy wrote:
> Snapshot rows do not contain real LSNs. Instead, their LSNs are
> ordinal numbers: rows in the snap have LSNs from 1 to the number
> of rows. This is because LSNs are not stored with every tuple in
> the storages, and there is no way to store real LSNs in the
> snapshot.
>
> These artificial LSNs broke the synchronous replication limbo.
> After snap recovery was done, the limbo vclock was broken - it
> contained numbers not related to reality, and was affected by
> rows from local spaces.
>
> Also the recovery could get stuck, because ACKs in the limbo
> stopped working after the first row - the vclock was set to the
> final signature right away.
>
> This patch makes all snapshot-recovered rows async, because they
> are confirmed by definition. So now the limbo is not involved in
> the snapshot recovery.
>
> Closes #5298

Thanks for the patch! LGTM.
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5298-qsync-recovery-snap
> Issue: https://github.com/tarantool/tarantool/issues/5298
>
> @ChangeLog
> * Having synchronous rows in the snapshot could make the instance hang on recovery (gh-5298).
>
> Changes in v2:
> - I realized it is enough to set snapshot transactions asynchronous. So they
>   don't involve the limbo anymore, and can't pollute its vclock with the not
>   real LSNs.
>
>  src/box/memtx_engine.c                          |   5 +
>  .../gh-5298-qsync-recovery-snap.result          | 100 ++++++++++++++++++
>  .../gh-5298-qsync-recovery-snap.test.lua        |  48 +++++++++
>  3 files changed, 153 insertions(+)
>  create mode 100644 test/replication/gh-5298-qsync-recovery-snap.result
>  create mode 100644 test/replication/gh-5298-qsync-recovery-snap.test.lua
>
> diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
> index 9f079a6b5..9df67426c 100644
> --- a/src/box/memtx_engine.c
> +++ b/src/box/memtx_engine.c
> @@ -233,6 +233,11 @@ memtx_engine_recover_snapshot_row(struct memtx_engine *memtx,
>  		goto rollback_stmt;
>  	if (txn_commit_stmt(txn, &request) != 0)
>  		goto rollback;
> +	/*
> +	 * Snapshot rows are confirmed by definition. They don't need to go to
> +	 * the synchronous transactions limbo.
> +	 */
> +	txn_set_flag(txn, TXN_FORCE_ASYNC);
>  	rc = txn_commit(txn);
>  	/*
>  	 * Don't let gc pool grow too much. Yet to
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result
> new file mode 100644
> index 000000000..922831552
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.result
> @@ -0,0 +1,100 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +engine = test_run:get_cfg('engine')
> + | ---
> + | ...
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.sync:replace{i} end
> + | ---
> + | ...
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.loc:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.loc:replace{i} end
> + | ---
> + | ...
> +
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +
> +test_run:cmd("restart server default")
> + |
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> + | ---
> + | - [11]
> + | ...
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> + | ---
> + | ...
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> + | ---
> + | ...
> +
> +box.cfg{                                        \
> +    replication_synchro_timeout = 0.001,        \
> +    replication_synchro_quorum = 2,             \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{12}
> + | ---
> + | - error: Quorum collection for a synchronous transaction is timed out
> + | ...
> +
> +box.cfg{                                        \
> +    replication_synchro_timeout = 1000,         \
> +    replication_synchro_quorum = 1,             \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{13}
> + | ---
> + | - [13]
> + | ...
> +box.space.sync:get({11})
> + | ---
> + | - [11]
> + | ...
> +box.space.sync:get({12})
> + | ---
> + | ...
> +box.space.sync:get({13})
> + | ---
> + | - [13]
> + | ...
> +
> +box.cfg{                                                \
> +    replication_synchro_timeout = old_synchro_timeout,  \
> +    replication_synchro_quorum = old_synchro_quorum,    \
> +}
> + | ---
> + | ...
> +box.space.sync:drop()
> + | ---
> + | ...
> +box.space.loc:drop()
> + | ---
> + | ...
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> new file mode 100644
> index 000000000..187f60d75
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> @@ -0,0 +1,48 @@
> +test_run = require('test_run').new()
> +engine = test_run:get_cfg('engine')
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> +_ = box.space.sync:create_index('pk')
> +for i = 1, 10 do box.space.sync:replace{i} end
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> +_ = box.space.loc:create_index('pk')
> +for i = 1, 10 do box.space.loc:replace{i} end
> +
> +box.snapshot()
> +
> +test_run:cmd("restart server default")
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> +
> +box.cfg{                                        \
> +    replication_synchro_timeout = 0.001,        \
> +    replication_synchro_quorum = 2,             \
> +}
> +box.space.sync:replace{12}
> +
> +box.cfg{                                        \
> +    replication_synchro_timeout = 1000,         \
> +    replication_synchro_quorum = 1,             \
> +}
> +box.space.sync:replace{13}
> +box.space.sync:get({11})
> +box.space.sync:get({12})
> +box.space.sync:get({13})
> +
> +box.cfg{                                                \
> +    replication_synchro_timeout = old_synchro_timeout,  \
> +    replication_synchro_quorum = old_synchro_quorum,    \
> +}
> +box.space.sync:drop()
> +box.space.loc:drop()

-- 
Serge Petrenko