[Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row

Konstantin Osipov kostja.osipov at gmail.com
Mon May 25 19:34:09 MSK 2020


* Serge Petrenko <sergepetrenko at tarantool.org> [20/05/25 14:00]:

While I think it's an acceptable strategy for a bugfix, I still
think relaying has to be fixed to be transactional, the current
xstream api is a huge legacy we're stuck with since Tarantool 1.3!

It saddens me a great deal the relay patch may be shuffled into
the desk once the problem is not as urgent.

> Since we stopped sending local space operations in replication, the last
> tx row has to be global in order to preserve tx boundaries on replica.
> If the last row happens to be a local one, replica will never receive
> the tx end marker, yielding the following errors:
> `ER_UNSUPPORTED: replication does not support interleaving
> transactions`.
> 
> In order to fix the problem reorder the rows before writing them to WAL
> so that the last tx row is a global one.
> Do nothing if the transaction is fully local. It isn't replicated so the
> tx borders aren't an issue here.
> 
> Such reordering doesn't break anything, since only the relative positions
> between some local rows and one global row are changed. Local and global
> rows do not share a common lsn counter: local rows use a zero instance
> id. So each row gets the same lsn it would get without the reordering,
> the transaction looks exactly the same when replicated, and recovery
> also isn't an issue, since local and global space operations can be
> reordered freely.
> 
> Follow-up #4114
> Closes #4928
> ---
>  src/box/wal.c                                 |  19 ++-
>  test/replication/gh-4928-tx-boundaries.result | 138 ++++++++++++++++++
>  .../gh-4928-tx-boundaries.test.lua            |  61 ++++++++
>  test/replication/suite.cfg                    |   1 +
>  4 files changed, 218 insertions(+), 1 deletion(-)
>  create mode 100644 test/replication/gh-4928-tx-boundaries.result
>  create mode 100644 test/replication/gh-4928-tx-boundaries.test.lua
> 
> diff --git a/src/box/wal.c b/src/box/wal.c
> index ef4d84920..0921c7261 100644
> --- a/src/box/wal.c
> +++ b/src/box/wal.c
> @@ -958,6 +958,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>  	int64_t tsn = 0;
>  	struct xrow_header **start = row;
>  	struct xrow_header **first_glob_row = row;
> +	struct xrow_header **last_glob_row = end - 1;
>  	/** Assign LSN to all local rows. */
>  	for ( ; row < end; row++) {
>  		if ((*row)->replica_id == 0) {
> @@ -971,6 +972,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>  			 */
>  			if ((*row)->group_id != GROUP_LOCAL) {
>  				(*row)->replica_id = instance_id;
> +				last_glob_row = row;
>  			}
>  
>  			(*row)->lsn = vclock_inc(vclock_diff, (*row)->replica_id) +
> @@ -987,7 +989,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>  				first_glob_row = row;
>  			}
>  			(*row)->tsn = tsn == 0 ? (*start)->lsn : tsn;
> -			(*row)->is_commit = row == end - 1;
> +			(*row)->is_commit = false;
>  		} else {
>  			int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id);
>  			if (diff <= vclock_get(vclock_diff,
> @@ -1013,6 +1015,21 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>  	 */
>  	for (row = start; row < first_glob_row; row++)
>  		(*row)->tsn = tsn;
> +
> +	/*
> +	 * If a mixed transaction ends on a local row, float up
> +	 * the last global row, so that the upper tx boundary is
> +	 * delivered to the replica.
> +	 */
> +	if (last_glob_row < end - 1) {
> +		struct xrow_header *tmp = *last_glob_row;
> +		memmove(last_glob_row, last_glob_row + 1,
> +			(end - 1 - last_glob_row) *
> +			sizeof(struct xrow_header *));
> +		*(end - 1) = tmp;
> +	}
> +
> +	(*(end - 1))->is_commit = true;
>  }
>  
>  static void
> diff --git a/test/replication/gh-4928-tx-boundaries.result b/test/replication/gh-4928-tx-boundaries.result
> new file mode 100644
> index 000000000..969bd8438
> --- /dev/null
> +++ b/test/replication/gh-4928-tx-boundaries.result
> @@ -0,0 +1,138 @@
> +-- test-run result file version 2
> +-- gh-4928. Test that transactions mixing local and global
> +-- space operations are replicated correctly.
> +env = require('test_run')
> + | ---
> + | ...
> +test_run = env.new()
> + | ---
> + | ...
> +bit = require('bit')
> + | ---
> + | ...
> +
> +-- Init.
> +box.schema.user.grant('guest', 'replication')
> + | ---
> + | ...
> +_ = box.schema.space.create('glob')
> + | ---
> + | ...
> +_ = box.schema.space.create('loc', {is_local=true})
> + | ---
> + | ...
> +_ = box.space.glob:create_index('pk')
> + | ---
> + | ...
> +_ = box.space.loc:create_index('pk')
> + | ---
> + | ...
> +
> +function gen_mixed_tx(i)\
> +    box.begin()\
> +    if bit.band(i, 1) ~= 0 then\
> +        box.space.glob:insert{10 * i + 1}\
> +    else\
> +        box.space.loc:insert{10 * i + 1}\
> +    end\
> +    if bit.band(i, 2) ~= 0 then\
> +        box.space.glob:insert{10 * i + 2}\
> +    else\
> +        box.space.loc:insert{10 * i + 2}\
> +    end\
> +    if bit.band(i, 4) ~= 0 then\
> +        box.space.glob:insert{10 * i + 3}\
> +    else\
> +        box.space.loc:insert{10 * i + 3}\
> +    end\
> +    box.commit()\
> +end
> + | ---
> + | ...
> +
> +test_run:cmd("create server replica with rpl_master=default,\
> +             script='replication/replica.lua'")
> + | ---
> + | - true
> + | ...
> +test_run:cmd('start server replica')
> + | ---
> + | - true
> + | ...
> +test_run:wait_downstream(2, {status='follow'})
> + | ---
> + | - true
> + | ...
> +
> +for i = 0, 7 do gen_mixed_tx(i) end
> + | ---
> + | ...
> +
> +box.info.replication[2].status
> + | ---
> + | - null
> + | ...
> +
> +vclock = box.info.vclock
> + | ---
> + | ...
> +vclock[0] = nil
> + | ---
> + | ...
> +test_run:wait_vclock("replica", vclock)
> + | ---
> + | ...
> +
> +test_run:cmd('switch replica')
> + | ---
> + | - true
> + | ...
> +
> +box.info.status
> + | ---
> + | - running
> + | ...
> +box.info.replication[1].upstream.status
> + | ---
> + | - follow
> + | ...
> +
> +box.space.glob:select{}
> + | ---
> + | - - [11]
> + |   - [22]
> + |   - [31]
> + |   - [32]
> + |   - [43]
> + |   - [51]
> + |   - [53]
> + |   - [62]
> + |   - [63]
> + |   - [71]
> + |   - [72]
> + |   - [73]
> + | ...
> +
> +test_run:cmd('switch default')
> + | ---
> + | - true
> + | ...
> +
> +-- Cleanup.
> +test_run:cmd('stop server replica')
> + | ---
> + | - true
> + | ...
> +test_run:cmd('delete server replica')
> + | ---
> + | - true
> + | ...
> +box.schema.user.revoke('guest', 'replication')
> + | ---
> + | ...
> +box.space.loc:drop()
> + | ---
> + | ...
> +box.space.glob:drop()
> + | ---
> + | ...
> diff --git a/test/replication/gh-4928-tx-boundaries.test.lua b/test/replication/gh-4928-tx-boundaries.test.lua
> new file mode 100644
> index 000000000..92526fc51
> --- /dev/null
> +++ b/test/replication/gh-4928-tx-boundaries.test.lua
> @@ -0,0 +1,61 @@
> +-- gh-4928. Test that transactions mixing local and global
> +-- space operations are replicated correctly.
> +env = require('test_run')
> +test_run = env.new()
> +bit = require('bit')
> +
> +-- Init.
> +box.schema.user.grant('guest', 'replication')
> +_ = box.schema.space.create('glob')
> +_ = box.schema.space.create('loc', {is_local=true})
> +_ = box.space.glob:create_index('pk')
> +_ = box.space.loc:create_index('pk')
> +
> +function gen_mixed_tx(i)\
> +    box.begin()\
> +    if bit.band(i, 1) ~= 0 then\
> +        box.space.glob:insert{10 * i + 1}\
> +    else\
> +        box.space.loc:insert{10 * i + 1}\
> +    end\
> +    if bit.band(i, 2) ~= 0 then\
> +        box.space.glob:insert{10 * i + 2}\
> +    else\
> +        box.space.loc:insert{10 * i + 2}\
> +    end\
> +    if bit.band(i, 4) ~= 0 then\
> +        box.space.glob:insert{10 * i + 3}\
> +    else\
> +        box.space.loc:insert{10 * i + 3}\
> +    end\
> +    box.commit()\
> +end
> +
> +test_run:cmd("create server replica with rpl_master=default,\
> +             script='replication/replica.lua'")
> +test_run:cmd('start server replica')
> +test_run:wait_downstream(2, {status='follow'})
> +
> +for i = 0, 7 do gen_mixed_tx(i) end
> +
> +box.info.replication[2].status
> +
> +vclock = box.info.vclock
> +vclock[0] = nil
> +test_run:wait_vclock("replica", vclock)
> +
> +test_run:cmd('switch replica')
> +
> +box.info.status
> +box.info.replication[1].upstream.status
> +
> +box.space.glob:select{}
> +
> +test_run:cmd('switch default')
> +
> +-- Cleanup.
> +test_run:cmd('stop server replica')
> +test_run:cmd('delete server replica')
> +box.schema.user.revoke('guest', 'replication')
> +box.space.loc:drop()
> +box.space.glob:drop()
> diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
> index d2743b5ed..f357b07da 100644
> --- a/test/replication/suite.cfg
> +++ b/test/replication/suite.cfg
> @@ -18,6 +18,7 @@
>      "gh-4606-admin-creds.test.lua": {},
>      "gh-4739-vclock-assert.test.lua": {},
>      "gh-4730-applier-rollback.test.lua": {},
> +    "gh-4928-tx-boundaries.test.lua": {},
>      "*": {
>          "memtx": {"engine": "memtx"},
>          "vinyl": {"engine": "vinyl"}
> -- 
> 2.24.2 (Apple Git-127)
> 

-- 
Konstantin Osipov, Moscow, Russia


More information about the Tarantool-patches mailing list