From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>,
tml <tarantool-patches@dev.tarantool.org>
Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator
Date: Tue, 15 Jun 2021 13:36:54 +0300 [thread overview]
Message-ID: <814a002a-6cde-1ec9-e201-c13637c58207@tarantool.org> (raw)
In-Reply-To: <20210611152259.205627-2-gorcunov@gmail.com>
11.06.2021 18:22, Cyrill Gorcunov пишет:
> Currently we use synchro packets filtration based on their contents,
> in particular by their xrow->replica_id value. Still there was a
> question if we can optimize this moment and rather filter out all
> packets coming from non-leader replica.
>
> Raft specification requires that only data from a current leader
> should be applied to local WAL but doesn't put a concrete claim on
> the data transport, ie how exactly rows are reaching replicas. This
> implies that data propagation may reach replicas indirectly via transit
> hops. Thus we drop applier->instance_id filtering and rely on
> xrow->replica_id matching instead.
>
> In the test (inspired by Serge Petrenko's test) we recreate the situation
> where replica3 obtains master's node data (which is a raft leader)
> indirectly via replica2 node.
>
> Closes #6035
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> src/box/applier.cc | 27 ++--
> src/lib/raft/raft.h | 7 -
> .../gh-5445-leader-inconsistency.result | 15 ++
> .../gh-5445-leader-inconsistency.test.lua | 5 +
> .../replication/gh-6035-applier-filter.result | 137 ++++++++++++++++++
> .../gh-6035-applier-filter.test.lua | 64 ++++++++
> test/replication/gh-6035-master.lua | 1 +
> test/replication/gh-6035-node.lua | 35 +++++
> test/replication/gh-6035-replica1.lua | 1 +
> test/replication/gh-6035-replica2.lua | 1 +
> test/replication/suite.cfg | 3 +
> 11 files changed, 277 insertions(+), 19 deletions(-)
> create mode 100644 test/replication/gh-6035-applier-filter.result
> create mode 100644 test/replication/gh-6035-applier-filter.test.lua
> create mode 120000 test/replication/gh-6035-master.lua
> create mode 100644 test/replication/gh-6035-node.lua
> create mode 120000 test/replication/gh-6035-replica1.lua
> create mode 120000 test/replication/gh-6035-replica2.lua
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 33181fdbf..d3430f582 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -1027,21 +1027,24 @@ nopify:;
> * Return 0 for success or -1 in case of an error.
> */
> static int
> -applier_apply_tx(struct applier *applier, struct stailq *rows)
> +applier_apply_tx(struct stailq *rows)
> {
> /*
> - * Rows received not directly from a leader are ignored. That is a
> - * protection against the case when an old leader keeps sending data
> - * around not knowing yet that it is not a leader anymore.
> + * Initially we've been filtering out data if it came from
> + * an applier which instance_id doesn't match raft->leader,
> + * but this prevents from obtaining valid leader's data when
> + * it comes from intermediate node. For example a series of
> + * replica hops
> *
> - * XXX: it may be that this can be fine to apply leader transactions by
> - * looking at their replica_id field if it is equal to leader id. That
> - * can be investigated as an 'optimization'. Even though may not give
> - * anything, because won't change total number of rows sent in the
> - * network anyway.
> + * master -> replica 1 -> replica 2
> + *
> + * where each replica carries master's initiated transaction
> + * in xrow->replica_id field and master's data get propagated
> + * indirectly.
> + *
> + * Finally we dropped such "sender" filtration and use transaction
> + * "initiator" filtration via xrow->replica_id only.
> */
> - if (!raft_is_source_allowed(box_raft(), applier->instance_id))
> - return 0;
> struct xrow_header *first_row = &stailq_first_entry(rows,
> struct applier_tx_row, next)->row;
> struct xrow_header *last_row;
> @@ -1312,7 +1315,7 @@ applier_subscribe(struct applier *applier)
> diag_raise();
> }
> applier_signal_ack(applier);
> - } else if (applier_apply_tx(applier, &rows) != 0) {
> + } else if (applier_apply_tx(&rows) != 0) {
> diag_raise();
> }
>
> diff --git a/src/lib/raft/raft.h b/src/lib/raft/raft.h
> index a8da564b0..fae30b03d 100644
> --- a/src/lib/raft/raft.h
> +++ b/src/lib/raft/raft.h
> @@ -236,13 +236,6 @@ raft_is_ro(const struct raft *raft)
> return raft->is_enabled && raft->state != RAFT_STATE_LEADER;
> }
>
> -/** See if the instance can accept rows from an instance with the given ID. */
> -static inline bool
> -raft_is_source_allowed(const struct raft *raft, uint32_t source_id)
> -{
> - return !raft->is_enabled || raft->leader == source_id;
> -}
> -
> /** Check if Raft is enabled. */
> static inline bool
> raft_is_enabled(const struct raft *raft)
> diff --git a/test/replication/gh-5445-leader-inconsistency.result b/test/replication/gh-5445-leader-inconsistency.result
> index 5c6169f50..38d0b097c 100644
> --- a/test/replication/gh-5445-leader-inconsistency.result
> +++ b/test/replication/gh-5445-leader-inconsistency.result
> @@ -175,9 +175,17 @@ test_run:cmd('stop server '..leader)
> | ---
> | - true
> | ...
> +-- And other node as well so it would notice new term on restart.
> is_possible_leader[leader_nr] = false
> | ---
> | ...
> +test_run:cmd('stop server '..other)
> + | ---
> + | - true
> + | ...
> +is_possible_leader[other_nr] = false
> + | ---
> + | ...
>
> -- Emulate a situation when next_leader wins the elections. It can't do that in
> -- this configuration, obviously, because it's behind the 'other' node, so set
> @@ -195,6 +203,13 @@ assert(get_leader(is_possible_leader) == next_leader_nr)
> | ---
> | - true
> | ...
> +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"')
> + | ---
> + | - true
> + | ...
> +is_possible_leader[other_nr] = true
> + | ---
> + | ...
> test_run:switch(other)
> | ---
> | - true
> diff --git a/test/replication/gh-5445-leader-inconsistency.test.lua b/test/replication/gh-5445-leader-inconsistency.test.lua
> index e7952f5fa..fad101881 100644
> --- a/test/replication/gh-5445-leader-inconsistency.test.lua
> +++ b/test/replication/gh-5445-leader-inconsistency.test.lua
> @@ -81,7 +81,10 @@ test_run:wait_cond(function() return box.space.test:get{2} ~= nil end)
> -- Old leader is gone.
> test_run:switch('default')
> test_run:cmd('stop server '..leader)
> +-- And other node as well so it would notice new term on restart.
> is_possible_leader[leader_nr] = false
> +test_run:cmd('stop server '..other)
As I said in the previous review, let's move the comment one line below.
And you shouldn't mention term here. The node is stopped so that it
doesn't replicate [2] to next_leader.
> +is_possible_leader[other_nr] = false
>
> -- Emulate a situation when next_leader wins the elections. It can't do that in
> -- this configuration, obviously, because it's behind the 'other' node, so set
> @@ -93,6 +96,8 @@ is_possible_leader[leader_nr] = false
> -- a situation when some rows from the old leader were not received).
> test_run:cmd('start server '..next_leader..' with args="1 0.4 candidate 1"')
> assert(get_leader(is_possible_leader) == next_leader_nr)
> +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"')
> +is_possible_leader[other_nr] = true
> test_run:switch(other)
> -- New leader didn't know about the unconfirmed rows but still rolled them back.
> test_run:wait_cond(function() return box.space.test:get{2} == nil end)
> diff --git a/test/replication/gh-6035-applier-filter.result b/test/replication/gh-6035-applier-filter.result
> new file mode 100644
> index 000000000..077cd46e5
> --- /dev/null
> +++ b/test/replication/gh-6035-applier-filter.result
> @@ -0,0 +1,137 @@
> +-- test-run result file version 2
> +--
> +-- gh-6035: verify synchronous rows filtration in applier,
> +-- we need to be sure that filtering synchronous rows is
> +-- done via transaction initiator not sender (iow via
> +-- xrow->replica_id).
> +--
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +
> +--
> +-- Prepare a scheme with transitional node
> +--
> +-- master <=> replica1 => replica2
> +--
> +-- such as transaction initiated on the master node would
> +-- be replicated to the replica2 via interim replica1 node.
> +--
> +
> +test_run:cmd('create server master with script="replication/gh-6035-master.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"')
> + | ---
> + | - true
> + | ...
> +
> +test_run:cmd('start server master')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica1')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica2')
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('replica2')
> + | ---
> + | - true
> + | ...
> +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}}
> + | ---
> + | ...
> +
> +--
> +-- Make the master to be RAFT leader.
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.cfg({ \
> + replication = { \
> + require('fio').cwd() .. "/master.sock", \
> + require('fio').cwd() .. "/replica1.sock", \
> + }, \
> + replication_synchro_quorum = 2, \
> + election_mode = 'manual', \
> +})
> + | ---
> + | ...
> +
> +box.ctl.promote()
> + | ---
> + | ...
> +_ = box.schema.space.create("sync", {is_sync = true})
> + | ---
> + | ...
> +_ = box.space.sync:create_index("pk")
> + | ---
> + | ...
> +box.space.sync:insert{1}
> + | ---
> + | - [1]
> + | ...
> +
> +--
> +-- The first hop is replica1.
> +test_run:switch('replica1')
> + | ---
> + | - true
> + | ...
> +box.space.sync:select{}
> + | ---
> + | - - [1]
> + | ...
> +
> +--
> +-- And the second hop is replica2 where
> +-- replica1 replicated the data to us.
> +test_run:switch('replica2')
> + | ---
> + | - true
> + | ...
> +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10)
> +box.space.sync:select{}
> + | ---
> + | - []
> + | ...
> +
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server master')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server master')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server replica1')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server replica1')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server replica2')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server replica2')
> + | ---
> + | - true
> + | ...
> diff --git a/test/replication/gh-6035-applier-filter.test.lua b/test/replication/gh-6035-applier-filter.test.lua
> new file mode 100644
> index 000000000..4e72abe5f
> --- /dev/null
> +++ b/test/replication/gh-6035-applier-filter.test.lua
> @@ -0,0 +1,64 @@
> +--
> +-- gh-6035: verify synchronous rows filtration in applier,
> +-- we need to be sure that filtering synchronous rows is
> +-- done via transaction initiator not sender (iow via
> +-- xrow->replica_id).
> +--
> +test_run = require('test_run').new()
> +
> +--
> +-- Prepare a scheme with transitional node
> +--
> +-- master <=> replica1 => replica2
> +--
> +-- such as transaction initiated on the master node would
> +-- be replicated to the replica2 via interim replica1 node.
> +--
> +
> +test_run:cmd('create server master with script="replication/gh-6035-master.lua"')
> +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"')
> +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"')
> +
> +test_run:cmd('start server master')
> +test_run:cmd('start server replica1')
> +test_run:cmd('start server replica2')
> +
> +test_run:switch('replica2')
> +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}}
> +
> +--
> +-- Make the master to be RAFT leader.
> +test_run:switch('master')
> +box.cfg({ \
> + replication = { \
> + require('fio').cwd() .. "/master.sock", \
> + require('fio').cwd() .. "/replica1.sock", \
> + }, \
> + replication_synchro_quorum = 2, \
> + election_mode = 'manual', \
> +})
> +
> +box.ctl.promote()
> +_ = box.schema.space.create("sync", {is_sync = true})
> +_ = box.space.sync:create_index("pk")
> +box.space.sync:insert{1}
> +
> +--
> +-- The first hop is replica1.
> +test_run:switch('replica1')
> +box.space.sync:select{}
> +
> +--
> +-- And the second hop is replica2 where
> +-- replica1 replicated the data to us.
> +test_run:switch('replica2')
> +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10)
> +box.space.sync:select{}
> +
> +test_run:switch('default')
> +test_run:cmd('stop server master')
> +test_run:cmd('delete server master')
> +test_run:cmd('stop server replica1')
> +test_run:cmd('delete server replica1')
> +test_run:cmd('stop server replica2')
> +test_run:cmd('delete server replica2')
> diff --git a/test/replication/gh-6035-master.lua b/test/replication/gh-6035-master.lua
> new file mode 120000
> index 000000000..f7ede7ef2
> --- /dev/null
> +++ b/test/replication/gh-6035-master.lua
> @@ -0,0 +1 @@
> +gh-6035-node.lua
> \ No newline at end of file
> diff --git a/test/replication/gh-6035-node.lua b/test/replication/gh-6035-node.lua
> new file mode 100644
> index 000000000..e3819471a
> --- /dev/null
> +++ b/test/replication/gh-6035-node.lua
> @@ -0,0 +1,35 @@
> +local SOCKET_DIR = require('fio').cwd()
> +local INSTANCE_ID = string.match(arg[0], "gh%-6035%-(.+)%.lua")
> +
> +local function unix_socket(name)
> + return SOCKET_DIR .. "/" .. name .. '.sock';
> +end
> +
> +require('console').listen(os.getenv('ADMIN'))
> +
> +if INSTANCE_ID == "master" then
> + box.cfg({
> + listen = unix_socket("master"),
> + })
> +elseif INSTANCE_ID == "replica1" then
> + box.cfg({
> + listen = unix_socket("replica1"),
> + replication = {
> + unix_socket("master"),
> + unix_socket("replica1")
> + },
> + election_mode = 'voter'
> + })
> +else
> + assert(INSTANCE_ID == "replica2")
> + box.cfg({
> + replication = {
> + unix_socket("master"),
> + },
> + election_mode = 'voter'
> + })
> +end
> +
> +box.once("bootstrap", function()
> + box.schema.user.grant('guest', 'super')
> +end)
> diff --git a/test/replication/gh-6035-replica1.lua b/test/replication/gh-6035-replica1.lua
> new file mode 120000
> index 000000000..f7ede7ef2
> --- /dev/null
> +++ b/test/replication/gh-6035-replica1.lua
> @@ -0,0 +1 @@
> +gh-6035-node.lua
> \ No newline at end of file
> diff --git a/test/replication/gh-6035-replica2.lua b/test/replication/gh-6035-replica2.lua
> new file mode 120000
> index 000000000..f7ede7ef2
> --- /dev/null
> +++ b/test/replication/gh-6035-replica2.lua
> @@ -0,0 +1 @@
> +gh-6035-node.lua
> \ No newline at end of file
> diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
> index 27eab20c2..55ec022ff 100644
> --- a/test/replication/suite.cfg
> +++ b/test/replication/suite.cfg
> @@ -47,6 +47,9 @@
> "gh-6032-promote-wal-write.test.lua": {},
> "gh-6057-qsync-confirm-async-no-wal.test.lua": {},
> "gh-6094-rs-uuid-mismatch.test.lua": {},
> + "gh-6035-applier-filter.test.lua": {
> + "memtx": {"engine": "memtx"}
> + },
> "*": {
> "memtx": {"engine": "memtx"},
> "vinyl": {"engine": "vinyl"}
--
Serge Petrenko
next prev parent reply other threads:[~2021-06-15 10:36 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-11 15:22 [Tarantool-patches] [PATCH v8 0/2] filter incoming packets Cyrill Gorcunov via Tarantool-patches
2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches
2021-06-15 10:36 ` Serge Petrenko via Tarantool-patches [this message]
2021-06-15 11:35 ` Serge Petrenko via Tarantool-patches
2021-06-15 11:55 ` Serge Petrenko via Tarantool-patches
2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 2/2] Vlad: applier filtration Cyrill Gorcunov via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=814a002a-6cde-1ec9-e201-c13637c58207@tarantool.org \
--to=tarantool-patches@dev.tarantool.org \
--cc=gorcunov@gmail.com \
--cc=sergepetrenko@tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox