* [Tarantool-patches] [PATCH v8 0/2] filter incoming packets @ 2021-06-11 15:22 Cyrill Gorcunov via Tarantool-patches
  2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches
  2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 2/2] Vlad: applier filtration Cyrill Gorcunov via Tarantool-patches
  0 siblings, 2 replies; 6+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-11 15:22 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy

Guys, here is a patch which implements the filtration we are interested in.
The patch is not for merging yet because there are some weird things going on:

 - to verify that we're hitting exactly the issue we're interested in, I apply
   Vlad's patch to make sure that replication doesn't pass where it should
 - the test sets up a somewhat weird configuration, but changing it even a bit
   doesn't trigger the problem we're investigating, and to be honest I'm not
   sure why
 - the current test (with Vlad's patch applied) returns

   box.space.sync:select{}
    | ---
    | - []
    | ...

   In turn, if we revert Vlad's patch, the data gets propagated to replica1 as
   it should, which proves that we need to filter by xrow->replica_id.

Still, I would like to get feedback on how to simplify the testcase.
branch gorcunov/gh-6035-applier-filter-7-notest

Cyrill Gorcunov (2):
  applier: filter incoming synchro packets via transaction initiator
  Vlad: applier filtration

 src/box/applier.cc                            |  29 ++--
 src/lib/raft/raft.h                           |   7 -
 .../gh-5445-leader-inconsistency.result       |  15 ++
 .../gh-5445-leader-inconsistency.test.lua     |   5 +
 .../replication/gh-6035-applier-filter.result | 137 ++++++++++++++++++
 .../gh-6035-applier-filter.test.lua           |  64 ++++++++
 test/replication/gh-6035-master.lua           |   1 +
 test/replication/gh-6035-node.lua             |  35 +++++
 test/replication/gh-6035-replica1.lua         |   1 +
 test/replication/gh-6035-replica2.lua         |   1 +
 test/replication/suite.cfg                    |   3 +
 11 files changed, 278 insertions(+), 20 deletions(-)
 create mode 100644 test/replication/gh-6035-applier-filter.result
 create mode 100644 test/replication/gh-6035-applier-filter.test.lua
 create mode 120000 test/replication/gh-6035-master.lua
 create mode 100644 test/replication/gh-6035-node.lua
 create mode 120000 test/replication/gh-6035-replica1.lua
 create mode 120000 test/replication/gh-6035-replica2.lua

base-commit: 9fb95cfd78403b33cacbe765c35036a0532b2c02
-- 
2.31.1

^ permalink raw reply	[flat|nested] 6+ messages in thread
* [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator
  2021-06-11 15:22 [Tarantool-patches] [PATCH v8 0/2] filter incoming packets Cyrill Gorcunov via Tarantool-patches
@ 2021-06-11 15:22 ` Cyrill Gorcunov via Tarantool-patches
  2021-06-15 10:36   ` Serge Petrenko via Tarantool-patches
  2021-06-15 11:35   ` Serge Petrenko via Tarantool-patches
  2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 2/2] Vlad: applier filtration Cyrill Gorcunov via Tarantool-patches
  1 sibling, 2 replies; 6+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-11 15:22 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy

Currently we filter synchro packets based on their contents, in
particular by their xrow->replica_id value. Still, there was a question
whether we could optimize this and instead filter out all packets
coming from a non-leader replica.

The Raft specification requires that only data from the current leader
be applied to the local WAL, but it doesn't put a concrete claim on the
data transport, i.e. on how exactly rows reach replicas. This implies
that data may reach replicas indirectly, via transit hops. Thus we drop
the applier->instance_id filtering and rely on xrow->replica_id
matching instead.

In the test (inspired by Serge Petrenko's test) we recreate the
situation where replica2 obtains the data of the master node (which is
the Raft leader) indirectly, via the replica1 node.
Closes #6035 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/applier.cc | 27 ++-- src/lib/raft/raft.h | 7 - .../gh-5445-leader-inconsistency.result | 15 ++ .../gh-5445-leader-inconsistency.test.lua | 5 + .../replication/gh-6035-applier-filter.result | 137 ++++++++++++++++++ .../gh-6035-applier-filter.test.lua | 64 ++++++++ test/replication/gh-6035-master.lua | 1 + test/replication/gh-6035-node.lua | 35 +++++ test/replication/gh-6035-replica1.lua | 1 + test/replication/gh-6035-replica2.lua | 1 + test/replication/suite.cfg | 3 + 11 files changed, 277 insertions(+), 19 deletions(-) create mode 100644 test/replication/gh-6035-applier-filter.result create mode 100644 test/replication/gh-6035-applier-filter.test.lua create mode 120000 test/replication/gh-6035-master.lua create mode 100644 test/replication/gh-6035-node.lua create mode 120000 test/replication/gh-6035-replica1.lua create mode 120000 test/replication/gh-6035-replica2.lua diff --git a/src/box/applier.cc b/src/box/applier.cc index 33181fdbf..d3430f582 100644 --- a/src/box/applier.cc +++ b/src/box/applier.cc @@ -1027,21 +1027,24 @@ nopify:; * Return 0 for success or -1 in case of an error. */ static int -applier_apply_tx(struct applier *applier, struct stailq *rows) +applier_apply_tx(struct stailq *rows) { /* - * Rows received not directly from a leader are ignored. That is a - * protection against the case when an old leader keeps sending data - * around not knowing yet that it is not a leader anymore. + * Initially we've been filtering out data if it came from + * an applier whose instance_id doesn't match raft->leader, + * but this prevents us from obtaining valid leader data when + * it comes from an intermediate node. For example, a series of + * replica hops * - * XXX: it may be that this can be fine to apply leader transactions by - * looking at their replica_id field if it is equal to leader id. That - * can be investigated as an 'optimization'. 
Even though may not give - anything, because won't change total number of rows sent in the - network anyway. + * master -> replica 1 -> replica 2 + * + * where each replica carries the master-initiated transaction + * in its xrow->replica_id field and the master's data gets + * propagated indirectly. + * + * Finally, we dropped such "sender" filtration and use transaction + * "initiator" filtration via xrow->replica_id only. */ - if (!raft_is_source_allowed(box_raft(), applier->instance_id)) - return 0; struct xrow_header *first_row = &stailq_first_entry(rows, struct applier_tx_row, next)->row; struct xrow_header *last_row; @@ -1312,7 +1315,7 @@ applier_subscribe(struct applier *applier) diag_raise(); } applier_signal_ack(applier); - } else if (applier_apply_tx(applier, &rows) != 0) { + } else if (applier_apply_tx(&rows) != 0) { diag_raise(); } diff --git a/src/lib/raft/raft.h b/src/lib/raft/raft.h index a8da564b0..fae30b03d 100644 --- a/src/lib/raft/raft.h +++ b/src/lib/raft/raft.h @@ -236,13 +236,6 @@ raft_is_ro(const struct raft *raft) return raft->is_enabled && raft->state != RAFT_STATE_LEADER; } -/** See if the instance can accept rows from an instance with the given ID. */ -static inline bool -raft_is_source_allowed(const struct raft *raft, uint32_t source_id) -{ - return !raft->is_enabled || raft->leader == source_id; -} - /** Check if Raft is enabled. */ static inline bool raft_is_enabled(const struct raft *raft) diff --git a/test/replication/gh-5445-leader-inconsistency.result b/test/replication/gh-5445-leader-inconsistency.result index 5c6169f50..38d0b097c 100644 --- a/test/replication/gh-5445-leader-inconsistency.result +++ b/test/replication/gh-5445-leader-inconsistency.result @@ -175,9 +175,17 @@ test_run:cmd('stop server '..leader) | --- | - true | ... +-- And other node as well so it would notice new term on restart. is_possible_leader[leader_nr] = false | --- | ... +test_run:cmd('stop server '..other) + | --- + | - true + | ... 
+is_possible_leader[other_nr] = false + | --- + | ... -- Emulate a situation when next_leader wins the elections. It can't do that in -- this configuration, obviously, because it's behind the 'other' node, so set @@ -195,6 +203,13 @@ assert(get_leader(is_possible_leader) == next_leader_nr) | --- | - true | ... +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"') + | --- + | - true + | ... +is_possible_leader[other_nr] = true + | --- + | ... test_run:switch(other) | --- | - true diff --git a/test/replication/gh-5445-leader-inconsistency.test.lua b/test/replication/gh-5445-leader-inconsistency.test.lua index e7952f5fa..fad101881 100644 --- a/test/replication/gh-5445-leader-inconsistency.test.lua +++ b/test/replication/gh-5445-leader-inconsistency.test.lua @@ -81,7 +81,10 @@ test_run:wait_cond(function() return box.space.test:get{2} ~= nil end) -- Old leader is gone. test_run:switch('default') test_run:cmd('stop server '..leader) +-- And other node as well so it would notice new term on restart. is_possible_leader[leader_nr] = false +test_run:cmd('stop server '..other) +is_possible_leader[other_nr] = false -- Emulate a situation when next_leader wins the elections. It can't do that in -- this configuration, obviously, because it's behind the 'other' node, so set @@ -93,6 +96,8 @@ is_possible_leader[leader_nr] = false -- a situation when some rows from the old leader were not received). test_run:cmd('start server '..next_leader..' with args="1 0.4 candidate 1"') assert(get_leader(is_possible_leader) == next_leader_nr) +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"') +is_possible_leader[other_nr] = true test_run:switch(other) -- New leader didn't know about the unconfirmed rows but still rolled them back. 
test_run:wait_cond(function() return box.space.test:get{2} == nil end) diff --git a/test/replication/gh-6035-applier-filter.result b/test/replication/gh-6035-applier-filter.result new file mode 100644 index 000000000..077cd46e5 --- /dev/null +++ b/test/replication/gh-6035-applier-filter.result @@ -0,0 +1,137 @@ +-- test-run result file version 2 +-- +-- gh-6035: verify synchronous rows filtration in applier, +-- we need to be sure that filtering synchronous rows is +-- done via transaction initiator not sender (iow via +-- xrow->replica_id). +-- +test_run = require('test_run').new() + | --- + | ... + +-- +-- Prepare a scheme with transitional node +-- +-- master <=> replica1 => replica2 +-- +-- such as transaction initiated on the master node would +-- be replicated to the replica2 via interim replica1 node. +-- + +test_run:cmd('create server master with script="replication/gh-6035-master.lua"') + | --- + | - true + | ... +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"') + | --- + | - true + | ... +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"') + | --- + | - true + | ... + +test_run:cmd('start server master') + | --- + | - true + | ... +test_run:cmd('start server replica1') + | --- + | - true + | ... +test_run:cmd('start server replica2') + | --- + | - true + | ... + +test_run:switch('replica2') + | --- + | - true + | ... +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}} + | --- + | ... + +-- +-- Make the master to be RAFT leader. +test_run:switch('master') + | --- + | - true + | ... +box.cfg({ \ + replication = { \ + require('fio').cwd() .. "/master.sock", \ + require('fio').cwd() .. "/replica1.sock", \ + }, \ + replication_synchro_quorum = 2, \ + election_mode = 'manual', \ +}) + | --- + | ... + +box.ctl.promote() + | --- + | ... +_ = box.schema.space.create("sync", {is_sync = true}) + | --- + | ... +_ = box.space.sync:create_index("pk") + | --- + | ... 
+box.space.sync:insert{1} + | --- + | - [1] + | ... + +-- +-- The first hop is replica1. +test_run:switch('replica1') + | --- + | - true + | ... +box.space.sync:select{} + | --- + | - - [1] + | ... + +-- +-- And the second hop is replica2 where +-- replica1 replicated the data to us. +test_run:switch('replica2') + | --- + | - true + | ... +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10) +box.space.sync:select{} + | --- + | - [] + | ... + +test_run:switch('default') + | --- + | - true + | ... +test_run:cmd('stop server master') + | --- + | - true + | ... +test_run:cmd('delete server master') + | --- + | - true + | ... +test_run:cmd('stop server replica1') + | --- + | - true + | ... +test_run:cmd('delete server replica1') + | --- + | - true + | ... +test_run:cmd('stop server replica2') + | --- + | - true + | ... +test_run:cmd('delete server replica2') + | --- + | - true + | ... diff --git a/test/replication/gh-6035-applier-filter.test.lua b/test/replication/gh-6035-applier-filter.test.lua new file mode 100644 index 000000000..4e72abe5f --- /dev/null +++ b/test/replication/gh-6035-applier-filter.test.lua @@ -0,0 +1,64 @@ +-- +-- gh-6035: verify synchronous rows filtration in applier, +-- we need to be sure that filtering synchronous rows is +-- done via transaction initiator not sender (iow via +-- xrow->replica_id). +-- +test_run = require('test_run').new() + +-- +-- Prepare a scheme with transitional node +-- +-- master <=> replica1 => replica2 +-- +-- such as transaction initiated on the master node would +-- be replicated to the replica2 via interim replica1 node. 
+-- + +test_run:cmd('create server master with script="replication/gh-6035-master.lua"') +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"') +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"') + +test_run:cmd('start server master') +test_run:cmd('start server replica1') +test_run:cmd('start server replica2') + +test_run:switch('replica2') +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}} + +-- +-- Make the master to be RAFT leader. +test_run:switch('master') +box.cfg({ \ + replication = { \ + require('fio').cwd() .. "/master.sock", \ + require('fio').cwd() .. "/replica1.sock", \ + }, \ + replication_synchro_quorum = 2, \ + election_mode = 'manual', \ +}) + +box.ctl.promote() +_ = box.schema.space.create("sync", {is_sync = true}) +_ = box.space.sync:create_index("pk") +box.space.sync:insert{1} + +-- +-- The first hop is replica1. +test_run:switch('replica1') +box.space.sync:select{} + +-- +-- And the second hop is replica2 where +-- replica1 replicated the data to us. 
+test_run:switch('replica2') +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10) +box.space.sync:select{} + +test_run:switch('default') +test_run:cmd('stop server master') +test_run:cmd('delete server master') +test_run:cmd('stop server replica1') +test_run:cmd('delete server replica1') +test_run:cmd('stop server replica2') +test_run:cmd('delete server replica2') diff --git a/test/replication/gh-6035-master.lua b/test/replication/gh-6035-master.lua new file mode 120000 index 000000000..f7ede7ef2 --- /dev/null +++ b/test/replication/gh-6035-master.lua @@ -0,0 +1 @@ +gh-6035-node.lua \ No newline at end of file diff --git a/test/replication/gh-6035-node.lua b/test/replication/gh-6035-node.lua new file mode 100644 index 000000000..e3819471a --- /dev/null +++ b/test/replication/gh-6035-node.lua @@ -0,0 +1,35 @@ +local SOCKET_DIR = require('fio').cwd() +local INSTANCE_ID = string.match(arg[0], "gh%-6035%-(.+)%.lua") + +local function unix_socket(name) + return SOCKET_DIR .. "/" .. name .. 
'.sock'; +end + +require('console').listen(os.getenv('ADMIN')) + +if INSTANCE_ID == "master" then + box.cfg({ + listen = unix_socket("master"), + }) +elseif INSTANCE_ID == "replica1" then + box.cfg({ + listen = unix_socket("replica1"), + replication = { + unix_socket("master"), + unix_socket("replica1") + }, + election_mode = 'voter' + }) +else + assert(INSTANCE_ID == "replica2") + box.cfg({ + replication = { + unix_socket("master"), + }, + election_mode = 'voter' + }) +end + +box.once("bootstrap", function() + box.schema.user.grant('guest', 'super') +end) diff --git a/test/replication/gh-6035-replica1.lua b/test/replication/gh-6035-replica1.lua new file mode 120000 index 000000000..f7ede7ef2 --- /dev/null +++ b/test/replication/gh-6035-replica1.lua @@ -0,0 +1 @@ +gh-6035-node.lua \ No newline at end of file diff --git a/test/replication/gh-6035-replica2.lua b/test/replication/gh-6035-replica2.lua new file mode 120000 index 000000000..f7ede7ef2 --- /dev/null +++ b/test/replication/gh-6035-replica2.lua @@ -0,0 +1 @@ +gh-6035-node.lua \ No newline at end of file diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg index 27eab20c2..55ec022ff 100644 --- a/test/replication/suite.cfg +++ b/test/replication/suite.cfg @@ -47,6 +47,9 @@ "gh-6032-promote-wal-write.test.lua": {}, "gh-6057-qsync-confirm-async-no-wal.test.lua": {}, "gh-6094-rs-uuid-mismatch.test.lua": {}, + "gh-6035-applier-filter.test.lua": { + "memtx": {"engine": "memtx"} + }, "*": { "memtx": {"engine": "memtx"}, "vinyl": {"engine": "vinyl"} -- 2.31.1 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator 2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches @ 2021-06-15 10:36 ` Serge Petrenko via Tarantool-patches 2021-06-15 11:35 ` Serge Petrenko via Tarantool-patches 1 sibling, 0 replies; 6+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-06-15 10:36 UTC (permalink / raw) To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy 11.06.2021 18:22, Cyrill Gorcunov пишет: > Currently we use synchro packets filtration based on their contents, > in particular by their xrow->replica_id value. Still there was a > question if we can optimize this moment and rather filter out all > packets coming from non-leader replica. > > Raft specification requires that only data from a current leader > should be applied to local WAL but doesn't put a concrete claim on > the data transport, ie how exactly rows are reaching replicas. This > implies that data propagation may reach replicas indirectly via transit > hops. Thus we drop applier->instance_id filtering and rely on > xrow->replica_id matching instead. > > In the test (inspired by Serge Petrenko's test) we recreate the situation > where replica3 obtains master's node data (which is a raft leader) > indirectly via replica2 node. 
> > Closes #6035 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/box/applier.cc | 27 ++-- > src/lib/raft/raft.h | 7 - > .../gh-5445-leader-inconsistency.result | 15 ++ > .../gh-5445-leader-inconsistency.test.lua | 5 + > .../replication/gh-6035-applier-filter.result | 137 ++++++++++++++++++ > .../gh-6035-applier-filter.test.lua | 64 ++++++++ > test/replication/gh-6035-master.lua | 1 + > test/replication/gh-6035-node.lua | 35 +++++ > test/replication/gh-6035-replica1.lua | 1 + > test/replication/gh-6035-replica2.lua | 1 + > test/replication/suite.cfg | 3 + > 11 files changed, 277 insertions(+), 19 deletions(-) > create mode 100644 test/replication/gh-6035-applier-filter.result > create mode 100644 test/replication/gh-6035-applier-filter.test.lua > create mode 120000 test/replication/gh-6035-master.lua > create mode 100644 test/replication/gh-6035-node.lua > create mode 120000 test/replication/gh-6035-replica1.lua > create mode 120000 test/replication/gh-6035-replica2.lua > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index 33181fdbf..d3430f582 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -1027,21 +1027,24 @@ nopify:; > * Return 0 for success or -1 in case of an error. > */ > static int > -applier_apply_tx(struct applier *applier, struct stailq *rows) > +applier_apply_tx(struct stailq *rows) > { > /* > - * Rows received not directly from a leader are ignored. That is a > - * protection against the case when an old leader keeps sending data > - * around not knowing yet that it is not a leader anymore. > + * Initially we've been filtering out data if it came from > + * an applier which instance_id doesn't match raft->leader, > + * but this prevents from obtaining valid leader's data when > + * it comes from intermediate node. For example a series of > + * replica hops > * > - * XXX: it may be that this can be fine to apply leader transactions by > - * looking at their replica_id field if it is equal to leader id. 
That > - * can be investigated as an 'optimization'. Even though may not give > - * anything, because won't change total number of rows sent in the > - * network anyway. > + * master -> replica 1 -> replica 2 > + * > + * where each replica carries master's initiated transaction > + * in xrow->replica_id field and master's data get propagated > + * indirectly. > + * > + * Finally we dropped such "sender" filtration and use transaction > + * "initiator" filtration via xrow->replica_id only. > */ > - if (!raft_is_source_allowed(box_raft(), applier->instance_id)) > - return 0; > struct xrow_header *first_row = &stailq_first_entry(rows, > struct applier_tx_row, next)->row; > struct xrow_header *last_row; > @@ -1312,7 +1315,7 @@ applier_subscribe(struct applier *applier) > diag_raise(); > } > applier_signal_ack(applier); > - } else if (applier_apply_tx(applier, &rows) != 0) { > + } else if (applier_apply_tx(&rows) != 0) { > diag_raise(); > } > > diff --git a/src/lib/raft/raft.h b/src/lib/raft/raft.h > index a8da564b0..fae30b03d 100644 > --- a/src/lib/raft/raft.h > +++ b/src/lib/raft/raft.h > @@ -236,13 +236,6 @@ raft_is_ro(const struct raft *raft) > return raft->is_enabled && raft->state != RAFT_STATE_LEADER; > } > > -/** See if the instance can accept rows from an instance with the given ID. */ > -static inline bool > -raft_is_source_allowed(const struct raft *raft, uint32_t source_id) > -{ > - return !raft->is_enabled || raft->leader == source_id; > -} > - > /** Check if Raft is enabled. */ > static inline bool > raft_is_enabled(const struct raft *raft) > diff --git a/test/replication/gh-5445-leader-inconsistency.result b/test/replication/gh-5445-leader-inconsistency.result > index 5c6169f50..38d0b097c 100644 > --- a/test/replication/gh-5445-leader-inconsistency.result > +++ b/test/replication/gh-5445-leader-inconsistency.result > @@ -175,9 +175,17 @@ test_run:cmd('stop server '..leader) > | --- > | - true > | ... 
> +-- And other node as well so it would notice new term on restart. > is_possible_leader[leader_nr] = false > | --- > | ... > +test_run:cmd('stop server '..other) > + | --- > + | - true > + | ... > +is_possible_leader[other_nr] = false > + | --- > + | ... > > -- Emulate a situation when next_leader wins the elections. It can't do that in > -- this configuration, obviously, because it's behind the 'other' node, so set > @@ -195,6 +203,13 @@ assert(get_leader(is_possible_leader) == next_leader_nr) > | --- > | - true > | ... > +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"') > + | --- > + | - true > + | ... > +is_possible_leader[other_nr] = true > + | --- > + | ... > test_run:switch(other) > | --- > | - true > diff --git a/test/replication/gh-5445-leader-inconsistency.test.lua b/test/replication/gh-5445-leader-inconsistency.test.lua > index e7952f5fa..fad101881 100644 > --- a/test/replication/gh-5445-leader-inconsistency.test.lua > +++ b/test/replication/gh-5445-leader-inconsistency.test.lua > @@ -81,7 +81,10 @@ test_run:wait_cond(function() return box.space.test:get{2} ~= nil end) > -- Old leader is gone. > test_run:switch('default') > test_run:cmd('stop server '..leader) > +-- And other node as well so it would notice new term on restart. > is_possible_leader[leader_nr] = false > +test_run:cmd('stop server '..other) As I said in the previous review, let's move the comment one line below. And you shouldn't mention term here. The node is stopped so that it doesn't replicate [2] to next_leader. > +is_possible_leader[other_nr] = false > > -- Emulate a situation when next_leader wins the elections. It can't do that in > -- this configuration, obviously, because it's behind the 'other' node, so set > @@ -93,6 +96,8 @@ is_possible_leader[leader_nr] = false > -- a situation when some rows from the old leader were not received). > test_run:cmd('start server '..next_leader..' 
with args="1 0.4 candidate 1"') > assert(get_leader(is_possible_leader) == next_leader_nr) > +test_run:cmd('start server '..other..' with args="1 0.4 voter 2"') > +is_possible_leader[other_nr] = true > test_run:switch(other) > -- New leader didn't know about the unconfirmed rows but still rolled them back. > test_run:wait_cond(function() return box.space.test:get{2} == nil end) > diff --git a/test/replication/gh-6035-applier-filter.result b/test/replication/gh-6035-applier-filter.result > new file mode 100644 > index 000000000..077cd46e5 > --- /dev/null > +++ b/test/replication/gh-6035-applier-filter.result > @@ -0,0 +1,137 @@ > +-- test-run result file version 2 > +-- > +-- gh-6035: verify synchronous rows filtration in applier, > +-- we need to be sure that filtering synchronous rows is > +-- done via transaction initiator not sender (iow via > +-- xrow->replica_id). > +-- > +test_run = require('test_run').new() > + | --- > + | ... > + > +-- > +-- Prepare a scheme with transitional node > +-- > +-- master <=> replica1 => replica2 > +-- > +-- such as transaction initiated on the master node would > +-- be replicated to the replica2 via interim replica1 node. > +-- > + > +test_run:cmd('create server master with script="replication/gh-6035-master.lua"') > + | --- > + | - true > + | ... > +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"') > + | --- > + | - true > + | ... > +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"') > + | --- > + | - true > + | ... > + > +test_run:cmd('start server master') > + | --- > + | - true > + | ... > +test_run:cmd('start server replica1') > + | --- > + | - true > + | ... > +test_run:cmd('start server replica2') > + | --- > + | - true > + | ... > + > +test_run:switch('replica2') > + | --- > + | - true > + | ... > +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}} > + | --- > + | ... > + > +-- > +-- Make the master to be RAFT leader. 
> +test_run:switch('master') > + | --- > + | - true > + | ... > +box.cfg({ \ > + replication = { \ > + require('fio').cwd() .. "/master.sock", \ > + require('fio').cwd() .. "/replica1.sock", \ > + }, \ > + replication_synchro_quorum = 2, \ > + election_mode = 'manual', \ > +}) > + | --- > + | ... > + > +box.ctl.promote() > + | --- > + | ... > +_ = box.schema.space.create("sync", {is_sync = true}) > + | --- > + | ... > +_ = box.space.sync:create_index("pk") > + | --- > + | ... > +box.space.sync:insert{1} > + | --- > + | - [1] > + | ... > + > +-- > +-- The first hop is replica1. > +test_run:switch('replica1') > + | --- > + | - true > + | ... > +box.space.sync:select{} > + | --- > + | - - [1] > + | ... > + > +-- > +-- And the second hop is replica2 where > +-- replica1 replicated the data to us. > +test_run:switch('replica2') > + | --- > + | - true > + | ... > +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10) > +box.space.sync:select{} > + | --- > + | - [] > + | ... > + > +test_run:switch('default') > + | --- > + | - true > + | ... > +test_run:cmd('stop server master') > + | --- > + | - true > + | ... > +test_run:cmd('delete server master') > + | --- > + | - true > + | ... > +test_run:cmd('stop server replica1') > + | --- > + | - true > + | ... > +test_run:cmd('delete server replica1') > + | --- > + | - true > + | ... > +test_run:cmd('stop server replica2') > + | --- > + | - true > + | ... > +test_run:cmd('delete server replica2') > + | --- > + | - true > + | ... > diff --git a/test/replication/gh-6035-applier-filter.test.lua b/test/replication/gh-6035-applier-filter.test.lua > new file mode 100644 > index 000000000..4e72abe5f > --- /dev/null > +++ b/test/replication/gh-6035-applier-filter.test.lua > @@ -0,0 +1,64 @@ > +-- > +-- gh-6035: verify synchronous rows filtration in applier, > +-- we need to be sure that filtering synchronous rows is > +-- done via transaction initiator not sender (iow via > +-- xrow->replica_id). 
> +-- > +test_run = require('test_run').new() > + > +-- > +-- Prepare a scheme with transitional node > +-- > +-- master <=> replica1 => replica2 > +-- > +-- such as transaction initiated on the master node would > +-- be replicated to the replica2 via interim replica1 node. > +-- > + > +test_run:cmd('create server master with script="replication/gh-6035-master.lua"') > +test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"') > +test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"') > + > +test_run:cmd('start server master') > +test_run:cmd('start server replica1') > +test_run:cmd('start server replica2') > + > +test_run:switch('replica2') > +box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}} > + > +-- > +-- Make the master to be RAFT leader. > +test_run:switch('master') > +box.cfg({ \ > + replication = { \ > + require('fio').cwd() .. "/master.sock", \ > + require('fio').cwd() .. "/replica1.sock", \ > + }, \ > + replication_synchro_quorum = 2, \ > + election_mode = 'manual', \ > +}) > + > +box.ctl.promote() > +_ = box.schema.space.create("sync", {is_sync = true}) > +_ = box.space.sync:create_index("pk") > +box.space.sync:insert{1} > + > +-- > +-- The first hop is replica1. > +test_run:switch('replica1') > +box.space.sync:select{} > + > +-- > +-- And the second hop is replica2 where > +-- replica1 replicated the data to us. 
> +test_run:switch('replica2') > +--test_run:wait_cond(function() return box.space.sync:get{1} ~= nil end, 10) > +box.space.sync:select{} > + > +test_run:switch('default') > +test_run:cmd('stop server master') > +test_run:cmd('delete server master') > +test_run:cmd('stop server replica1') > +test_run:cmd('delete server replica1') > +test_run:cmd('stop server replica2') > +test_run:cmd('delete server replica2') > diff --git a/test/replication/gh-6035-master.lua b/test/replication/gh-6035-master.lua > new file mode 120000 > index 000000000..f7ede7ef2 > --- /dev/null > +++ b/test/replication/gh-6035-master.lua > @@ -0,0 +1 @@ > +gh-6035-node.lua > \ No newline at end of file > diff --git a/test/replication/gh-6035-node.lua b/test/replication/gh-6035-node.lua > new file mode 100644 > index 000000000..e3819471a > --- /dev/null > +++ b/test/replication/gh-6035-node.lua > @@ -0,0 +1,35 @@ > +local SOCKET_DIR = require('fio').cwd() > +local INSTANCE_ID = string.match(arg[0], "gh%-6035%-(.+)%.lua") > + > +local function unix_socket(name) > + return SOCKET_DIR .. "/" .. name .. 
'.sock'; > +end > + > +require('console').listen(os.getenv('ADMIN')) > + > +if INSTANCE_ID == "master" then > + box.cfg({ > + listen = unix_socket("master"), > + }) > +elseif INSTANCE_ID == "replica1" then > + box.cfg({ > + listen = unix_socket("replica1"), > + replication = { > + unix_socket("master"), > + unix_socket("replica1") > + }, > + election_mode = 'voter' > + }) > +else > + assert(INSTANCE_ID == "replica2") > + box.cfg({ > + replication = { > + unix_socket("master"), > + }, > + election_mode = 'voter' > + }) > +end > + > +box.once("bootstrap", function() > + box.schema.user.grant('guest', 'super') > +end) > diff --git a/test/replication/gh-6035-replica1.lua b/test/replication/gh-6035-replica1.lua > new file mode 120000 > index 000000000..f7ede7ef2 > --- /dev/null > +++ b/test/replication/gh-6035-replica1.lua > @@ -0,0 +1 @@ > +gh-6035-node.lua > \ No newline at end of file > diff --git a/test/replication/gh-6035-replica2.lua b/test/replication/gh-6035-replica2.lua > new file mode 120000 > index 000000000..f7ede7ef2 > --- /dev/null > +++ b/test/replication/gh-6035-replica2.lua > @@ -0,0 +1 @@ > +gh-6035-node.lua > \ No newline at end of file > diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg > index 27eab20c2..55ec022ff 100644 > --- a/test/replication/suite.cfg > +++ b/test/replication/suite.cfg > @@ -47,6 +47,9 @@ > "gh-6032-promote-wal-write.test.lua": {}, > "gh-6057-qsync-confirm-async-no-wal.test.lua": {}, > "gh-6094-rs-uuid-mismatch.test.lua": {}, > + "gh-6035-applier-filter.test.lua": { > + "memtx": {"engine": "memtx"} > + }, > "*": { > "memtx": {"engine": "memtx"}, > "vinyl": {"engine": "vinyl"} -- Serge Petrenko ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator
  2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches
  2021-06-15 10:36   ` Serge Petrenko via Tarantool-patches
@ 2021-06-15 11:35   ` Serge Petrenko via Tarantool-patches
  2021-06-15 11:55     ` Serge Petrenko via Tarantool-patches
  1 sibling, 1 reply; 6+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-15 11:35 UTC (permalink / raw)
To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy

11.06.2021 18:22, Cyrill Gorcunov wrote:
> Currently we use synchro packets filtration based on their contents,
> in particular by their xrow->replica_id value. Still there was a
> question if we can optimize this moment and rather filter out all
> packets coming from non-leader replica.
>
> Raft specification requires that only data from a current leader
> should be applied to local WAL but doesn't put a concrete claim on
> the data transport, ie how exactly rows are reaching replicas. This
> implies that data propagation may reach replicas indirectly via transit
> hops. Thus we drop applier->instance_id filtering and rely on
> xrow->replica_id matching instead.
>
> In the test (inspired by Serge Petrenko's test) we recreate the situation
> where replica3 obtains master's node data (which is a raft leader)
> indirectly via replica2 node.
>
> Closes #6035
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---

I've found a way to make the test work as discussed: replication
1 <-> 2 -> 3 from the beginning, without replication reconfiguration.
Take a look at the diff below.

The diff doesn't simplify anything, though, and it makes the test run
longer: 2.5 s vs 0.8 s on my machine. So it's up to you whether you
want to apply it.

I'm starting to like your test version more, to be honest.
Here's the diff:

=================================================

diff --git a/test/replication/gh-6035-applier-filter.result b/test/replication/gh-6035-applier-filter.result
index 077cd46e5..4ddebcc6b 100644
--- a/test/replication/gh-6035-applier-filter.result
+++ b/test/replication/gh-6035-applier-filter.result
@@ -31,7 +31,7 @@ test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.l
  | - true
  | ...
 
-test_run:cmd('start server master')
+test_run:cmd('start server master with wait=False')
  | ---
  | - true
  | ...
@@ -44,25 +44,24 @@ test_run:cmd('start server replica2')
  | - true
  | ...
 
-test_run:switch('replica2')
+--
+-- Make the master to be RAFT leader.
+test_run:switch('replica1')
  | ---
  | - true
  | ...
-box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}}
+box.cfg{election_mode = 'voter'}
  | ---
  | ...
 
---
--- Make the master to be RAFT leader.
 test_run:switch('master')
  | ---
  | - true
  | ...
+test_run:wait_lsn('master', 'replica1')
+ | ---
+ | ...
 box.cfg({ \
-    replication = { \
-        require('fio').cwd() .. "/master.sock", \
-        require('fio').cwd() .. "/replica1.sock", \
-    }, \
     replication_synchro_quorum = 2, \
     election_mode = 'manual', \
 })
diff --git a/test/replication/gh-6035-applier-filter.test.lua b/test/replication/gh-6035-applier-filter.test.lua
index 4e72abe5f..4ec7a6e6a 100644
--- a/test/replication/gh-6035-applier-filter.test.lua
+++ b/test/replication/gh-6035-applier-filter.test.lua
@@ -19,21 +19,18 @@ test_run:cmd('create server master with script="replication/gh-6035-master.lua"'
 test_run:cmd('create server replica1 with script="replication/gh-6035-replica1.lua"')
 test_run:cmd('create server replica2 with script="replication/gh-6035-replica2.lua"')
 
-test_run:cmd('start server master')
+test_run:cmd('start server master with wait=False')
 test_run:cmd('start server replica1')
 test_run:cmd('start server replica2')
 
-test_run:switch('replica2')
-box.cfg{replication = {require('fio').cwd() .. "/replica1.sock"}}
-
 --
 -- Make the master to be RAFT leader.
+test_run:switch('replica1')
+box.cfg{election_mode = 'voter'}
+
 test_run:switch('master')
+test_run:wait_lsn('master', 'replica1')
 box.cfg({ \
-    replication = { \
-        require('fio').cwd() .. "/master.sock", \
-        require('fio').cwd() .. "/replica1.sock", \
-    }, \
     replication_synchro_quorum = 2, \
     election_mode = 'manual', \
 })
diff --git a/test/replication/gh-6035-node.lua b/test/replication/gh-6035-node.lua
index e3819471a..4ed41b3cf 100644
--- a/test/replication/gh-6035-node.lua
+++ b/test/replication/gh-6035-node.lua
@@ -10,6 +10,10 @@ require('console').listen(os.getenv('ADMIN'))
 if INSTANCE_ID == "master" then
     box.cfg({
         listen = unix_socket("master"),
+        replication = {
+            unix_socket("master"),
+            unix_socket("replica1")
+        },
     })
 elseif INSTANCE_ID == "replica1" then
     box.cfg({
@@ -18,13 +22,12 @@ elseif INSTANCE_ID == "replica1" then
             unix_socket("master"),
             unix_socket("replica1")
         },
-        election_mode = 'voter'
     })
 else
     assert(INSTANCE_ID == "replica2")
     box.cfg({
         replication = {
-            unix_socket("master"),
+            unix_socket("replica1"),
         },
         election_mode = 'voter'
     })

====================================================

-- 
Serge Petrenko
* Re: [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator
  2021-06-15 11:35 ` Serge Petrenko via Tarantool-patches
@ 2021-06-15 11:55   ` Serge Petrenko via Tarantool-patches
  0 siblings, 0 replies; 6+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-06-15 11:55 UTC (permalink / raw)
To: Cyrill Gorcunov, tml; +Cc: Vladislav Shpilevoy

15.06.2021 14:35, Serge Petrenko via Tarantool-patches wrote:
>
> 11.06.2021 18:22, Cyrill Gorcunov wrote:
>> Currently we use synchro packets filtration based on their contents,
>> in particular by their xrow->replica_id value. Still there was a
>> question if we can optimize this moment and rather filter out all
>> packets coming from non-leader replica.
>>
>> Raft specification requires that only data from a current leader
>> should be applied to local WAL but doesn't put a concrete claim on
>> the data transport, ie how exactly rows are reaching replicas. This
>> implies that data propagation may reach replicas indirectly via transit
>> hops. Thus we drop applier->instance_id filtering and rely on
>> xrow->replica_id matching instead.
>>
>> In the test (inspired by Serge Petrenko's test) we recreate the
>> situation where replica3 obtains master's node data (which is a raft
>> leader) indirectly via replica2 node.
>>
>> Closes #6035
>>
>> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
>> ---
>
> I've found a way to make the test work as discussed: replication
> 1 <-> 2 -> 3 from the beginning. Without replication reconfiguration.
> Take a look at the diff below.
>
> Although the diff doesn't simplify anything, and it makes the test
> run longer: 2.5 s vs 0.8 s on my machine. So it's up to you whether
> you want to apply it.
>
> I'm starting to like your test version more, to be honest.
>
> Here's the diff:

Pushed the diff on top of your branch as a separate commit.
> [...]

-- 
Serge Petrenko
* [Tarantool-patches] [PATCH v8 2/2] Vlad: applier filtration
  2021-06-11 15:22 [Tarantool-patches] [PATCH v8 0/2] filter incoming packets Cyrill Gorcunov via Tarantool-patches
  2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches
@ 2021-06-11 15:22 ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 0 replies; 6+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-06-11 15:22 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/applier.cc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index d3430f582..1d92a989d 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -973,7 +973,7 @@ apply_final_join_tx(struct stailq *rows)
  * The rows are replaced with NOPs to preserve the vclock consistency.
  */
 static void
-applier_synchro_filter_tx(struct stailq *rows)
+applier_synchro_filter_tx(uint32_t replica_id, struct stailq *rows)
 {
 	/*
 	 * XXX: in case raft is disabled, synchronous replication still works
@@ -989,7 +989,7 @@ applier_synchro_filter_tx(struct stailq *rows)
 	 * node, so cannot check for applier->instance_id here.
 	 */
 	row = &stailq_first_entry(rows, struct applier_tx_row, next)->row;
-	if (!txn_limbo_is_replica_outdated(&txn_limbo, row->replica_id))
+	if (!txn_limbo_is_replica_outdated(&txn_limbo, replica_id))
 		return;
 	if (stailq_last_entry(rows, struct applier_tx_row, next)->row.wait_sync)
@@ -1027,7 +1027,7 @@ nopify:;
  * Return 0 for success or -1 in case of an error.
  */
 static int
-applier_apply_tx(struct stailq *rows)
+applier_apply_tx(struct applier *applier, struct stailq *rows)
 {
 	/*
 	 * Initially we've been filtering out data if it came from
@@ -1083,7 +1083,7 @@ applier_apply_tx(struct stailq *rows)
 			}
 		}
 	}
-	applier_synchro_filter_tx(rows);
+	applier_synchro_filter_tx(applier->instance_id, rows);
 	if (unlikely(iproto_type_is_synchro_request(first_row->type))) {
 		/*
 		 * Synchro messages are not transactions, in terms
@@ -1315,7 +1315,7 @@ applier_subscribe(struct applier *applier)
 				diag_raise();
 		}
 		applier_signal_ack(applier);
-	} else if (applier_apply_tx(&rows) != 0) {
+	} else if (applier_apply_tx(applier, &rows) != 0) {
 		diag_raise();
 	}

-- 
2.31.1
end of thread, other threads:[~2021-06-15 11:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-11 15:22 [Tarantool-patches] [PATCH v8 0/2] filter incoming packets Cyrill Gorcunov via Tarantool-patches
2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 1/2] applier: filter incoming synchro packets via transaction initiator Cyrill Gorcunov via Tarantool-patches
2021-06-15 10:36   ` Serge Petrenko via Tarantool-patches
2021-06-15 11:35   ` Serge Petrenko via Tarantool-patches
2021-06-15 11:55     ` Serge Petrenko via Tarantool-patches
2021-06-11 15:22 ` [Tarantool-patches] [PATCH v8 2/2] Vlad: applier filtration Cyrill Gorcunov via Tarantool-patches