[Tarantool-patches] [PATCH v4 2/3] test: add a test for wal_cleanup_delay option

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Mar 25 01:10:23 MSK 2021


Thanks for working on this!

See 4 comments below.

> diff --git a/test/replication/gh-5806-xlog-cleanup.result b/test/replication/gh-5806-xlog-cleanup.result
> new file mode 100644
> index 000000000..88b272e62
> --- /dev/null
> +++ b/test/replication/gh-5806-xlog-cleanup.result
> @@ -0,0 +1,356 @@
> +--
> +-- On replica we create an own space which allows us to
> +-- use more complex scenario and disables replica from
> +-- automatic rejoin (since replica can't do auto-rejoin if
> +-- there gonna be an own data loss). This allows us to
> +-- trigger XlogGapError in the log.
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +box.cfg{checkpoint_count = 1}
> + | ---
> + | ...
> +s = box.schema.space.create('testreplica')
> + | ---
> + | ...
> +_ = s:create_index('pk')
> + | ---
> + | ...
> +box.space.testreplica:insert({1})
> + | ---
> + | - [1]
> + | ...
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +
> +--
> +-- Stop the replica node and generate
> +-- xlogs on the master.
> +test_run:switch('default')

1. You could switch to the master right away. No
need to switch through the default node.
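
A rough, untested sketch of that ordering, assuming a `test_run` handle is
usable from the master's console in this test (the reordering itself is
hypothetical and not taken from the patch):

```lua
-- Assumption: test_run is available on 'master', so the replica can be
-- stopped from there without a detour through the 'default' node.
test_run:switch('master')
test_run:cmd('stop server replica')
-- Same statements as in the patch: write a record and take a snapshot.
box.space.test:insert({1})
box.snapshot()
```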

> + | ---
> + | - true
> + | ...
> +test_run:cmd('stop server replica')
> + | ---
> + | - true
> + | ...
> +
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.space.test:insert({1})
> + | ---
> + | - [1]
> + | ...
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +
> +--
> +-- We need to restart the master node since otherwise
> +-- the replica will be preventing us from removing old
> +-- xlog because it will be tracked by gc consumer which
> +-- kept in memory while master node is running.
> +--
> +-- Once restarted we write a new record into master's
> +-- space and run snapshot which removes old xlog required
> +-- by replica to subscribe leading to XlogGapError which
> +-- we need to test.
> +test_run:switch('default')

2. You can restart the master without switching to default.

> + | ---
> + | - true
> + | ...
> +test_run:cmd('restart server master with wait_load=True')
> + | ---
> + | - true
> + | ...
> +test_run:switch('master')
> + | ---
> + | - true
> + | ...
> +box.space.test:insert({2})
> + | ---
> + | - [2]
> + | ...
> +box.snapshot()
> + | ---
> + | - ok
> + | ...
> +assert(box.info.gc().is_paused == false)
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Start replica and wait for error.
> +test_run:switch('default')

3. You don't need to switch to default to start the replica. The
same comments about switching apply to the 'Case 2' test.

> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica with wait=False, wait_load=False')
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Wait error to appear.
> +test_run:wait_log('master', 'XlogGapError', 1024, 10) ~= nil

4. 10 seconds is a tiny timeout. A few seconds are not enough, as the
synchronous replication tests have shown. But why do you even need
non-default parameters here?
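
An untested sketch of the same wait relying on the defaults, assuming
`wait_log` accepts the byte-count and timeout arguments being omitted:

```lua
-- Assumption: with the last two arguments left out, test-run falls back
-- to its own defaults for the log window and the timeout.
-- Written in test-run console form, like the line in the patch.
test_run:wait_log('master', 'XlogGapError') ~= nil
```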

> + | ---
> + | - true
> + | ...
> +

