From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Subject: Re: [Tarantool-patches] [PATCH v2 1/1] memtx: force async snapshot transactions
Date: Mon, 14 Sep 2020 12:15:54 +0300 [thread overview]
Message-ID: <d4222631-31c7-f76f-002d-efd330466a03@tarantool.org> (raw)
In-Reply-To: <5a6c33ec7bfffd3d0dd201a1b38cc5205f0f0ea4.1600025872.git.v.shpilevoy@tarantool.org>
13.09.2020 22:39, Vladislav Shpilevoy пишет:
> Snapshot rows contain not real LSNs. Instead their LSNs are
> signatures, ordinal numbers. Rows in the snap have LSNs from 1 to
> the number of rows. This is because LSNs are not stored with every
> tuple in the storages, and there is no way to store real LSNs in
> the snapshot.
>
> These artificial LSNs broke the synchronous replication limbo.
> After snap recovery is done, limbo vclock was broken - it
> contained numbers not related to reality, and affected by rows
> from local spaces.
>
> Also the recovery could stuck because ACKs in the limbo stopped
> working after a first row - the vclock was set to the final
> signature right away.
>
> This patch makes all snapshot recovered rows async. Because they
> are confirmed by definition. So now the limbo is not involved into
> the snapshot recovery.
>
> Closes #5298
Thanks for the patch! LGTM.
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5298-qsync-recovery-snap
> Issue: https://github.com/tarantool/tarantool/issues/5298
>
> @ChangeLog
> * Having synchronous rows in the snapshot could make the instance hang on recovery (gh-5298).
>
> Changes in v2:
> - I realized it is enough to set snapshot transactions asynchronous. So they
> don't involve the limbo anymore, and can't pollute its vclock with the not
> real LSNs.
>
> src/box/memtx_engine.c | 5 +
> .../gh-5298-qsync-recovery-snap.result | 100 ++++++++++++++++++
> .../gh-5298-qsync-recovery-snap.test.lua | 48 +++++++++
> 3 files changed, 153 insertions(+)
> create mode 100644 test/replication/gh-5298-qsync-recovery-snap.result
> create mode 100644 test/replication/gh-5298-qsync-recovery-snap.test.lua
>
> diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
> index 9f079a6b5..9df67426c 100644
> --- a/src/box/memtx_engine.c
> +++ b/src/box/memtx_engine.c
> @@ -233,6 +233,11 @@ memtx_engine_recover_snapshot_row(struct memtx_engine *memtx,
> goto rollback_stmt;
> if (txn_commit_stmt(txn, &request) != 0)
> goto rollback;
> + /*
> + * Snapshot rows are confirmed by definition. They don't need to go to
> + * the synchronous transactions limbo.
> + */
> + txn_set_flag(txn, TXN_FORCE_ASYNC);
> rc = txn_commit(txn);
> /*
> * Don't let gc pool grow too much. Yet to
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result
> new file mode 100644
> index 000000000..922831552
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.result
> @@ -0,0 +1,100 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +engine = test_run:get_cfg('engine')
> + | ---
> + | ...
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.sync:replace{i} end
> + | ---
> + | ...
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.loc:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.loc:replace{i} end
> + | ---
> + | ...
> +
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +
> +test_run:cmd("restart server default")
> + |
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> + | ---
> + | - [11]
> + | ...
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> + | ---
> + | ...
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> + | ---
> + | ...
> +
> +box.cfg{ \
> + replication_synchro_timeout = 0.001, \
> + replication_synchro_quorum = 2, \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{12}
> + | ---
> + | - error: Quorum collection for a synchronous transaction is timed out
> + | ...
> +
> +box.cfg{ \
> + replication_synchro_timeout = 1000, \
> + replication_synchro_quorum = 1, \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{13}
> + | ---
> + | - [13]
> + | ...
> +box.space.sync:get({11})
> + | ---
> + | - [11]
> + | ...
> +box.space.sync:get({12})
> + | ---
> + | ...
> +box.space.sync:get({13})
> + | ---
> + | - [13]
> + | ...
> +
> +box.cfg{ \
> + replication_synchro_timeout = old_synchro_timeout, \
> + replication_synchro_quorum = old_synchro_quorum, \
> +}
> + | ---
> + | ...
> +box.space.sync:drop()
> + | ---
> + | ...
> +box.space.loc:drop()
> + | ---
> + | ...
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> new file mode 100644
> index 000000000..187f60d75
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> @@ -0,0 +1,48 @@
> +test_run = require('test_run').new()
> +engine = test_run:get_cfg('engine')
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> +_ = box.space.sync:create_index('pk')
> +for i = 1, 10 do box.space.sync:replace{i} end
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> +_ = box.space.loc:create_index('pk')
> +for i = 1, 10 do box.space.loc:replace{i} end
> +
> +box.snapshot()
> +
> +test_run:cmd("restart server default")
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> +
> +box.cfg{ \
> + replication_synchro_timeout = 0.001, \
> + replication_synchro_quorum = 2, \
> +}
> +box.space.sync:replace{12}
> +
> +box.cfg{ \
> + replication_synchro_timeout = 1000, \
> + replication_synchro_quorum = 1, \
> +}
> +box.space.sync:replace{13}
> +box.space.sync:get({11})
> +box.space.sync:get({12})
> +box.space.sync:get({13})
> +
> +box.cfg{ \
> + replication_synchro_timeout = old_synchro_timeout, \
> + replication_synchro_quorum = old_synchro_quorum, \
> +}
> +box.space.sync:drop()
> +box.space.loc:drop()
--
Serge Petrenko
next prev parent reply other threads:[~2020-09-14 9:15 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-13 19:39 Vladislav Shpilevoy
2020-09-14 8:12 ` Cyrill Gorcunov
2020-09-14 9:15 ` Serge Petrenko [this message]
2020-09-14 20:01 ` Vladislav Shpilevoy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d4222631-31c7-f76f-002d-efd330466a03@tarantool.org \
--to=sergepetrenko@tarantool.org \
--cc=gorcunov@gmail.com \
--cc=tarantool-patches@dev.tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v2 1/1] memtx: force async snapshot transactions' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox