Tarantool development patches archive
 help / color / mirror / Atom feed
From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Subject: Re: [Tarantool-patches] [PATCH v2 1/1] memtx: force async snapshot transactions
Date: Mon, 14 Sep 2020 12:15:54 +0300	[thread overview]
Message-ID: <d4222631-31c7-f76f-002d-efd330466a03@tarantool.org> (raw)
In-Reply-To: <5a6c33ec7bfffd3d0dd201a1b38cc5205f0f0ea4.1600025872.git.v.shpilevoy@tarantool.org>


13.09.2020 22:39, Vladislav Shpilevoy пишет:
> Snapshot rows contain not real LSNs. Instead their LSNs are
> signatures, ordinal numbers. Rows in the snap have LSNs from 1 to
> the number of rows. This is because LSNs are not stored with every
> tuple in the storages, and there is no way to store real LSNs in
> the snapshot.
>
> These artificial LSNs broke the synchronous replication limbo.
> After snap recovery is done, limbo vclock was broken - it
> contained numbers not related to reality, and affected by rows
> from local spaces.
>
> Also the recovery could stuck because ACKs in the limbo stopped
> working after a first row - the vclock was set to the final
> signature right away.
>
> This patch makes all snapshot recovered rows async. Because they
> are confirmed by definition. So now the limbo is not involved into
> the snapshot recovery.
>
> Closes #5298
Thanks for the patch! LGTM.
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5298-qsync-recovery-snap
> Issue: https://github.com/tarantool/tarantool/issues/5298
>
> @ChangeLog
> * Having synchronous rows in the snapshot could make the instance hang on recovery (gh-5298).
>
> Changes in v2:
> - I realized it is enough to set snapshot transactions asynchronous. So they
>    don't involve the limbo anymore, and can't pollute its vclock with the not
>    real LSNs.
>
>   src/box/memtx_engine.c                        |   5 +
>   .../gh-5298-qsync-recovery-snap.result        | 100 ++++++++++++++++++
>   .../gh-5298-qsync-recovery-snap.test.lua      |  48 +++++++++
>   3 files changed, 153 insertions(+)
>   create mode 100644 test/replication/gh-5298-qsync-recovery-snap.result
>   create mode 100644 test/replication/gh-5298-qsync-recovery-snap.test.lua
>
> diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
> index 9f079a6b5..9df67426c 100644
> --- a/src/box/memtx_engine.c
> +++ b/src/box/memtx_engine.c
> @@ -233,6 +233,11 @@ memtx_engine_recover_snapshot_row(struct memtx_engine *memtx,
>   		goto rollback_stmt;
>   	if (txn_commit_stmt(txn, &request) != 0)
>   		goto rollback;
> +	/*
> +	 * Snapshot rows are confirmed by definition. They don't need to go to
> +	 * the synchronous transactions limbo.
> +	 */
> +	txn_set_flag(txn, TXN_FORCE_ASYNC);
>   	rc = txn_commit(txn);
>   	/*
>   	 * Don't let gc pool grow too much. Yet to
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.result b/test/replication/gh-5298-qsync-recovery-snap.result
> new file mode 100644
> index 000000000..922831552
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.result
> @@ -0,0 +1,100 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +engine = test_run:get_cfg('engine')
> + | ---
> + | ...
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.sync:replace{i} end
> + | ---
> + | ...
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> + | ---
> + | ...
> +_ = box.space.loc:create_index('pk')
> + | ---
> + | ...
> +for i = 1, 10 do box.space.loc:replace{i} end
> + | ---
> + | ...
> +
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +
> +test_run:cmd("restart server default")
> + |
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> + | ---
> + | - [11]
> + | ...
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> + | ---
> + | ...
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> + | ---
> + | ...
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = 0.001,                                        \
> +    replication_synchro_quorum = 2,                                             \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{12}
> + | ---
> + | - error: Quorum collection for a synchronous transaction is timed out
> + | ...
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = 1000,                                         \
> +    replication_synchro_quorum = 1,                                             \
> +}
> + | ---
> + | ...
> +box.space.sync:replace{13}
> + | ---
> + | - [13]
> + | ...
> +box.space.sync:get({11})
> + | ---
> + | - [11]
> + | ...
> +box.space.sync:get({12})
> + | ---
> + | ...
> +box.space.sync:get({13})
> + | ---
> + | - [13]
> + | ...
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = old_synchro_timeout,                          \
> +    replication_synchro_quorum = old_synchro_quorum,                            \
> +}
> + | ---
> + | ...
> +box.space.sync:drop()
> + | ---
> + | ...
> +box.space.loc:drop()
> + | ---
> + | ...
> diff --git a/test/replication/gh-5298-qsync-recovery-snap.test.lua b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> new file mode 100644
> index 000000000..187f60d75
> --- /dev/null
> +++ b/test/replication/gh-5298-qsync-recovery-snap.test.lua
> @@ -0,0 +1,48 @@
> +test_run = require('test_run').new()
> +engine = test_run:get_cfg('engine')
> +--
> +-- gh-5298: rows from snapshot got their LSN = signature of the instance vclock.
> +-- All the same LSN. For example, signature at the moment of recovery start
> +-- was 100. Then all rows from the snap got their LSN = 100. That broke the
> +-- limbo, because it assumes LSNs grow. In order to skip duplicate ACKs.
> +--
> +_ = box.schema.space.create('sync', {is_sync = true, engine = engine})
> +_ = box.space.sync:create_index('pk')
> +for i = 1, 10 do box.space.sync:replace{i} end
> +
> +-- Local rows could affect this by increasing the signature.
> +_ = box.schema.space.create('loc', {is_local = true, engine = engine})
> +_ = box.space.loc:create_index('pk')
> +for i = 1, 10 do box.space.loc:replace{i} end
> +
> +box.snapshot()
> +
> +test_run:cmd("restart server default")
> +
> +-- Could hang if the limbo would incorrectly handle the snapshot end.
> +box.space.sync:replace{11}
> +
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = 0.001,                                        \
> +    replication_synchro_quorum = 2,                                             \
> +}
> +box.space.sync:replace{12}
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = 1000,                                         \
> +    replication_synchro_quorum = 1,                                             \
> +}
> +box.space.sync:replace{13}
> +box.space.sync:get({11})
> +box.space.sync:get({12})
> +box.space.sync:get({13})
> +
> +box.cfg{                                                                        \
> +    replication_synchro_timeout = old_synchro_timeout,                          \
> +    replication_synchro_quorum = old_synchro_quorum,                            \
> +}
> +box.space.sync:drop()
> +box.space.loc:drop()

-- 
Serge Petrenko

  parent reply	other threads:[~2020-09-14  9:15 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-13 19:39 Vladislav Shpilevoy
2020-09-14  8:12 ` Cyrill Gorcunov
2020-09-14  9:15 ` Serge Petrenko [this message]
2020-09-14 20:01 ` Vladislav Shpilevoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d4222631-31c7-f76f-002d-efd330466a03@tarantool.org \
    --to=sergepetrenko@tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2 1/1] memtx: force async snapshot transactions' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox