[Tarantool-patches] [PATCH v14 6/6] test: add replication/gh-6036-rollback-confirm
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Sun Sep 12 18:44:17 MSK 2021
Thanks for the patch!
See 3 comments below.
> diff --git a/test/replication/gh-6036-rollback-confirm.result b/test/replication/gh-6036-rollback-confirm.result
> new file mode 100644
> index 000000000..e85f6af37
> --- /dev/null
> +++ b/test/replication/gh-6036-rollback-confirm.result
<...>
> +-- Connect master to the replica and write a record. Since the quorum
> +-- value is bigger than number of nodes in a cluster it will be rolled
> +-- back later.
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.cfg({ \
> + replication = { \
> + "unix/:./master.sock", \
> + "unix/:./replica.sock", \
> + }, \
> +})
> + | ---
> + | ...
> +_ = box.schema.create_space('sync', {is_sync = true})
> + | ---
> + | ...
> +_ = box.space.sync:create_index('pk')
> + | ---
> + | ...
> +
> +--
> +-- Wait the record to appear on the master.
> +f = require('fiber').create(function() box.space.sync:replace{1} end)
> + | ---
> + | ...
> +test_run:wait_cond(function() return box.space.sync:get({1}) ~= nil end, 100)
1. Why do you need a custom wait_cond timeout?
> + | ---
> + | - true
> + | ...
> +box.space.sync:select{}
> + | ---
> + | - - [1]
> + | ...
> +
> +--
> +-- Wait the record from master get written and then
> +-- drop the replication.
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +test_run:wait_cond(function() return box.space.sync:get({1}) ~= nil end, 100)
> + | ---
> + | - true
> + | ...
> +box.space.sync:select{}
2. You don't need the waiting on the master if you wait for the same on
the replica. It couldn't get there before master itself.
> + | ---
> + | - - [1]
> + | ...
> +box.cfg{replication = {}}
> + | ---
> + | ...
> +
> +--
> +-- Then we jump back to the master and drop the replication,
> +-- thus unconfirmed record get rolled back.
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.cfg({ \
> + replication = {}, \
> + replication_synchro_timeout = 0.001, \
> + election_mode = 'manual', \
> +})
> + | ---
> + | ...
> +while f:status() ~= 'dead' do require('fiber').sleep(0.1) end
> + | ---
> + | ...
> +test_run:wait_cond(function() return box.space.sync:get({1}) == nil end, 100)
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Force the replica to become a RAFT leader and
> +-- commit this new record.
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +box.cfg({ \
> + replication_synchro_quorum = 1, \
> + election_mode = 'manual' \
> +})
> + | ---
> + | ...
> +box.ctl.promote()
> + | ---
> + | ...
> +box.space.sync:select{}
> + | ---
> + | - - [1]
> + | ...
> +
> +--
> +-- Connect master back to the replica, it should
> +-- be refused.
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.cfg({ \
> + replication = { \
> + "unix/:./replica.sock", \
> + }, \
> +})
> + | ---
> + | ...
> +box.space.sync:select{}
> + | ---
> + | - []
> + | ...
> +test_run:wait_cond(function() return \
> + test_run:grep_log('master', \
> + 'rejecting PROMOTE') ~= nil end, 100) \
> +test_run:wait_cond(function() return \
> + test_run:grep_log('master', \
> + 'ER_CLUSTER_SPLIT') ~= nil end, 100)
3. Why do you need these 2 conds with \? The only
reason for using \ between multiple statements is
to prevent yields. Why can't you have yields between
the two wait_cond() calls?
Also could you make one cond with 'grep_log() and grep_log()'?
More information about the Tarantool-patches
mailing list