From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-f195.google.com (mail-lj1-f195.google.com [209.85.208.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 51106469710 for ; Mon, 25 May 2020 19:34:12 +0300 (MSK) Received: by mail-lj1-f195.google.com with SMTP id o14so21427870ljp.4 for ; Mon, 25 May 2020 09:34:12 -0700 (PDT) Date: Mon, 25 May 2020 19:34:09 +0300 From: Konstantin Osipov Message-ID: <20200525163409.GC4272@atlas> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Serge Petrenko Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org * Serge Petrenko [20/05/25 14:00]: While I think it's an acceptable strategy for a bugfix, I still think relaying has to be fixed to be transactional, the current xstream api is a huge legacy we're stuck with since Tarantool 1.3! It saddens me a great deal the relay patch may be shuffled into the desk once the problem is not as urgent. > Since we stopped sending local space operations in replication, the last > tx row has to be global in order to preserve tx boundaries on replica. > If the last row happens to be a local one, replica will never receive > the tx end marker, yielding the following errors: > `ER_UNSUPPORTED: replication does not support interleaving > transactions`. > > In order to fix the problem reorder the rows before writing them to WAL > so that the last tx row is a global one. > Do nothing if the transaction is fully local. It isn't replicated so the > tx borders aren't an issue here. > > Such reordering doesn't break anything, since only the relative positions > between some local rows and one global row are changed. Local and global > rows do not share a common lsn counter: local rows use a zero instance > id. So each row gets the same lsn it would get without the reordering, > the transaction looks exactly the same when replicated, and recovery > also isn't an issue, since local and global space operations can be > reordered freely. > > Follow-up #4114 > Closes #4928 > --- > src/box/wal.c | 19 ++- > test/replication/gh-4928-tx-boundaries.result | 138 ++++++++++++++++++ > .../gh-4928-tx-boundaries.test.lua | 61 ++++++++ > test/replication/suite.cfg | 1 + > 4 files changed, 218 insertions(+), 1 deletion(-) > create mode 100644 test/replication/gh-4928-tx-boundaries.result > create mode 100644 test/replication/gh-4928-tx-boundaries.test.lua > > diff --git a/src/box/wal.c b/src/box/wal.c > index ef4d84920..0921c7261 100644 > --- a/src/box/wal.c > +++ b/src/box/wal.c > @@ -958,6 +958,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, > int64_t tsn = 0; > struct xrow_header **start = row; > struct xrow_header **first_glob_row = row; > + struct xrow_header **last_glob_row = end - 1; > /** Assign LSN to all local rows. */ > for ( ; row < end; row++) { > if ((*row)->replica_id == 0) { > @@ -971,6 +972,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, > */ > if ((*row)->group_id != GROUP_LOCAL) { > (*row)->replica_id = instance_id; > + last_glob_row = row; > } > > (*row)->lsn = vclock_inc(vclock_diff, (*row)->replica_id) + > @@ -987,7 +989,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, > first_glob_row = row; > } > (*row)->tsn = tsn == 0 ? (*start)->lsn : tsn; > - (*row)->is_commit = row == end - 1; > + (*row)->is_commit = false; > } else { > int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id); > if (diff <= vclock_get(vclock_diff, > @@ -1013,6 +1015,21 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base, > */ > for (row = start; row < first_glob_row; row++) > (*row)->tsn = tsn; > + > + /* > + * If a mixed transaction ends on a local row, float up > + * the last global row, so that the upper tx boundary is > + * delivered to the replica. > + */ > + if (last_glob_row < end - 1) { > + struct xrow_header *tmp = *last_glob_row; > + memmove(last_glob_row, last_glob_row + 1, > + (end - 1 - last_glob_row) * > + sizeof(struct xrow_header *)); > + *(end - 1) = tmp; > + } > + > + (*(end - 1))->is_commit = true; > } > > static void > diff --git a/test/replication/gh-4928-tx-boundaries.result b/test/replication/gh-4928-tx-boundaries.result > new file mode 100644 > index 000000000..969bd8438 > --- /dev/null > +++ b/test/replication/gh-4928-tx-boundaries.result > @@ -0,0 +1,138 @@ > +-- test-run result file version 2 > +-- gh-4928. Test that transactions mixing local and global > +-- space operations are replicated correctly. > +env = require('test_run') > + | --- > + | ... > +test_run = env.new() > + | --- > + | ... > +bit = require('bit') > + | --- > + | ... > + > +-- Init. > +box.schema.user.grant('guest', 'replication') > + | --- > + | ... > +_ = box.schema.space.create('glob') > + | --- > + | ... > +_ = box.schema.space.create('loc', {is_local=true}) > + | --- > + | ... > +_ = box.space.glob:create_index('pk') > + | --- > + | ... > +_ = box.space.loc:create_index('pk') > + | --- > + | ... > + > +function gen_mixed_tx(i)\ > + box.begin()\ > + if bit.band(i, 1) ~= 0 then\ > + box.space.glob:insert{10 * i + 1}\ > + else\ > + box.space.loc:insert{10 * i + 1}\ > + end\ > + if bit.band(i, 2) ~= 0 then\ > + box.space.glob:insert{10 * i + 2}\ > + else\ > + box.space.loc:insert{10 * i + 2}\ > + end\ > + if bit.band(i, 4) ~= 0 then\ > + box.space.glob:insert{10 * i + 3}\ > + else\ > + box.space.loc:insert{10 * i + 3}\ > + end\ > + box.commit()\ > +end > + | --- > + | ... > + > +test_run:cmd("create server replica with rpl_master=default,\ > + script='replication/replica.lua'") > + | --- > + | - true > + | ... > +test_run:cmd('start server replica') > + | --- > + | - true > + | ... > +test_run:wait_downstream(2, {status='follow'}) > + | --- > + | - true > + | ... > + > +for i = 0, 7 do gen_mixed_tx(i) end > + | --- > + | ... > + > +box.info.replication[2].status > + | --- > + | - null > + | ... > + > +vclock = box.info.vclock > + | --- > + | ... > +vclock[0] = nil > + | --- > + | ... > +test_run:wait_vclock("replica", vclock) > + | --- > + | ... > + > +test_run:cmd('switch replica') > + | --- > + | - true > + | ... > + > +box.info.status > + | --- > + | - running > + | ... > +box.info.replication[1].upstream.status > + | --- > + | - follow > + | ... > + > +box.space.glob:select{} > + | --- > + | - - [11] > + | - [22] > + | - [31] > + | - [32] > + | - [43] > + | - [51] > + | - [53] > + | - [62] > + | - [63] > + | - [71] > + | - [72] > + | - [73] > + | ... > + > +test_run:cmd('switch default') > + | --- > + | - true > + | ... > + > +-- Cleanup. > +test_run:cmd('stop server replica') > + | --- > + | - true > + | ... > +test_run:cmd('delete server replica') > + | --- > + | - true > + | ... > +box.schema.user.revoke('guest', 'replication') > + | --- > + | ... > +box.space.loc:drop() > + | --- > + | ... > +box.space.glob:drop() > + | --- > + | ... > diff --git a/test/replication/gh-4928-tx-boundaries.test.lua b/test/replication/gh-4928-tx-boundaries.test.lua > new file mode 100644 > index 000000000..92526fc51 > --- /dev/null > +++ b/test/replication/gh-4928-tx-boundaries.test.lua > @@ -0,0 +1,61 @@ > +-- gh-4928. Test that transactions mixing local and global > +-- space operations are replicated correctly. > +env = require('test_run') > +test_run = env.new() > +bit = require('bit') > + > +-- Init. > +box.schema.user.grant('guest', 'replication') > +_ = box.schema.space.create('glob') > +_ = box.schema.space.create('loc', {is_local=true}) > +_ = box.space.glob:create_index('pk') > +_ = box.space.loc:create_index('pk') > + > +function gen_mixed_tx(i)\ > + box.begin()\ > + if bit.band(i, 1) ~= 0 then\ > + box.space.glob:insert{10 * i + 1}\ > + else\ > + box.space.loc:insert{10 * i + 1}\ > + end\ > + if bit.band(i, 2) ~= 0 then\ > + box.space.glob:insert{10 * i + 2}\ > + else\ > + box.space.loc:insert{10 * i + 2}\ > + end\ > + if bit.band(i, 4) ~= 0 then\ > + box.space.glob:insert{10 * i + 3}\ > + else\ > + box.space.loc:insert{10 * i + 3}\ > + end\ > + box.commit()\ > +end > + > +test_run:cmd("create server replica with rpl_master=default,\ > + script='replication/replica.lua'") > +test_run:cmd('start server replica') > +test_run:wait_downstream(2, {status='follow'}) > + > +for i = 0, 7 do gen_mixed_tx(i) end > + > +box.info.replication[2].status > + > +vclock = box.info.vclock > +vclock[0] = nil > +test_run:wait_vclock("replica", vclock) > + > +test_run:cmd('switch replica') > + > +box.info.status > +box.info.replication[1].upstream.status > + > +box.space.glob:select{} > + > +test_run:cmd('switch default') > + > +-- Cleanup. > +test_run:cmd('stop server replica') > +test_run:cmd('delete server replica') > +box.schema.user.revoke('guest', 'replication') > +box.space.loc:drop() > +box.space.glob:drop() > diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg > index d2743b5ed..f357b07da 100644 > --- a/test/replication/suite.cfg > +++ b/test/replication/suite.cfg > @@ -18,6 +18,7 @@ > "gh-4606-admin-creds.test.lua": {}, > "gh-4739-vclock-assert.test.lua": {}, > "gh-4730-applier-rollback.test.lua": {}, > + "gh-4928-tx-boundaries.test.lua": {}, > "*": { > "memtx": {"engine": "memtx"}, > "vinyl": {"engine": "vinyl"} > -- > 2.24.2 (Apple Git-127) > -- Konstantin Osipov, Moscow, Russia