From: Serge Petrenko <sergepetrenko@tarantool.org>
To: Konstantin Osipov <kostja.osipov@gmail.com>
Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row
Date: Tue, 26 May 2020 12:41:38 +0300 [thread overview]
Message-ID: <d1e276fb-27aa-492e-c767-2ecd98d52821@tarantool.org> (raw)
In-Reply-To: <20200525163409.GC4272@atlas>
25.05.2020 19:34, Konstantin Osipov пишет:
> * Serge Petrenko <sergepetrenko@tarantool.org> [20/05/25 14:00]:
>
> While I think it's an acceptable strategy for a bugfix, I still
> think relaying has to be fixed to be transactional, the current
> xstream api is a huge legacy we're stuck with since Tarantool 1.3!
>
> It saddens me a great deal the relay patch may be shuffled into
> the desk once the problem is not as urgent.
Hi! Thanks for your answer.
I understand your concern with transactional relay rework. However, this
approach faces some non-trivial problems we discussed verbally. That's why
I chose a more simple solution.
I have a question regarding this patch. Do you think I should reorder
the tx rows
so that the last row is a global one? Maybe it'd be better to just set
is_commit
flag on the last global row? This breaks tx boundaries in local WAL, but
preserves
them as is in replication and in other instances' WALs. I don't think
we'll need tx
boundariesin local WAL anyway, but still breaking the invariant that the
last tx
row has an is_commit flagset to true is scary.
>
>> Since we stopped sending local space operations in replication, the last
>> tx row has to be global in order to preserve tx boundaries on replica.
>> If the last row happens to be a local one, replica will never receive
>> the tx end marker, yielding the following errors:
>> `ER_UNSUPPORTED: replication does not support interleaving
>> transactions`.
>>
>> In order to fix the problem reorder the rows before writing them to WAL
>> so that the last tx row is a global one.
>> Do nothing if the transaction is fully local. It isn't replicated so the
>> tx borders aren't an issue here.
>>
>> Such reordering doesn't break anything, since only the relative positions
>> between some local rows and one global row are changed. Local and global
>> rows do not share a common lsn counter: local rows use a zero instance
>> id. So each row gets the same lsn it would get without the reordering,
>> the transaction looks exactly the same when replicated, and recovery
>> also isn't an issue, since local and global space operations can be
>> reordered freely.
>>
>> Follow-up #4114
>> Closes #4928
>> ---
>> src/box/wal.c | 19 ++-
>> test/replication/gh-4928-tx-boundaries.result | 138 ++++++++++++++++++
>> .../gh-4928-tx-boundaries.test.lua | 61 ++++++++
>> test/replication/suite.cfg | 1 +
>> 4 files changed, 218 insertions(+), 1 deletion(-)
>> create mode 100644 test/replication/gh-4928-tx-boundaries.result
>> create mode 100644 test/replication/gh-4928-tx-boundaries.test.lua
>>
>> diff --git a/src/box/wal.c b/src/box/wal.c
>> index ef4d84920..0921c7261 100644
>> --- a/src/box/wal.c
>> +++ b/src/box/wal.c
>> @@ -958,6 +958,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>> int64_t tsn = 0;
>> struct xrow_header **start = row;
>> struct xrow_header **first_glob_row = row;
>> + struct xrow_header **last_glob_row = end - 1;
>> /** Assign LSN to all local rows. */
>> for ( ; row < end; row++) {
>> if ((*row)->replica_id == 0) {
>> @@ -971,6 +972,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>> */
>> if ((*row)->group_id != GROUP_LOCAL) {
>> (*row)->replica_id = instance_id;
>> + last_glob_row = row;
>> }
>>
>> (*row)->lsn = vclock_inc(vclock_diff, (*row)->replica_id) +
>> @@ -987,7 +989,7 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>> first_glob_row = row;
>> }
>> (*row)->tsn = tsn == 0 ? (*start)->lsn : tsn;
>> - (*row)->is_commit = row == end - 1;
>> + (*row)->is_commit = false;
>> } else {
>> int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id);
>> if (diff <= vclock_get(vclock_diff,
>> @@ -1013,6 +1015,21 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
>> */
>> for (row = start; row < first_glob_row; row++)
>> (*row)->tsn = tsn;
>> +
>> + /*
>> + * If a mixed transaction ends on a local row, float up
>> + * the last global row, so that the upper tx boundary is
>> + * delivered to the replica.
>> + */
>> + if (last_glob_row < end - 1) {
>> + struct xrow_header *tmp = *last_glob_row;
>> + memmove(last_glob_row, last_glob_row + 1,
>> + (end - 1 - last_glob_row) *
>> + sizeof(struct xrow_header *));
>> + *(end - 1) = tmp;
>> + }
>> +
>> + (*(end - 1))->is_commit = true;
>> }
>>
>> static void
>> diff --git a/test/replication/gh-4928-tx-boundaries.result b/test/replication/gh-4928-tx-boundaries.result
>> new file mode 100644
>> index 000000000..969bd8438
>> --- /dev/null
>> +++ b/test/replication/gh-4928-tx-boundaries.result
>> @@ -0,0 +1,138 @@
>> +-- test-run result file version 2
>> +-- gh-4928. Test that transactions mixing local and global
>> +-- space operations are replicated correctly.
>> +env = require('test_run')
>> + | ---
>> + | ...
>> +test_run = env.new()
>> + | ---
>> + | ...
>> +bit = require('bit')
>> + | ---
>> + | ...
>> +
>> +-- Init.
>> +box.schema.user.grant('guest', 'replication')
>> + | ---
>> + | ...
>> +_ = box.schema.space.create('glob')
>> + | ---
>> + | ...
>> +_ = box.schema.space.create('loc', {is_local=true})
>> + | ---
>> + | ...
>> +_ = box.space.glob:create_index('pk')
>> + | ---
>> + | ...
>> +_ = box.space.loc:create_index('pk')
>> + | ---
>> + | ...
>> +
>> +function gen_mixed_tx(i)\
>> + box.begin()\
>> + if bit.band(i, 1) ~= 0 then\
>> + box.space.glob:insert{10 * i + 1}\
>> + else\
>> + box.space.loc:insert{10 * i + 1}\
>> + end\
>> + if bit.band(i, 2) ~= 0 then\
>> + box.space.glob:insert{10 * i + 2}\
>> + else\
>> + box.space.loc:insert{10 * i + 2}\
>> + end\
>> + if bit.band(i, 4) ~= 0 then\
>> + box.space.glob:insert{10 * i + 3}\
>> + else\
>> + box.space.loc:insert{10 * i + 3}\
>> + end\
>> + box.commit()\
>> +end
>> + | ---
>> + | ...
>> +
>> +test_run:cmd("create server replica with rpl_master=default,\
>> + script='replication/replica.lua'")
>> + | ---
>> + | - true
>> + | ...
>> +test_run:cmd('start server replica')
>> + | ---
>> + | - true
>> + | ...
>> +test_run:wait_downstream(2, {status='follow'})
>> + | ---
>> + | - true
>> + | ...
>> +
>> +for i = 0, 7 do gen_mixed_tx(i) end
>> + | ---
>> + | ...
>> +
>> +box.info.replication[2].status
>> + | ---
>> + | - null
>> + | ...
>> +
>> +vclock = box.info.vclock
>> + | ---
>> + | ...
>> +vclock[0] = nil
>> + | ---
>> + | ...
>> +test_run:wait_vclock("replica", vclock)
>> + | ---
>> + | ...
>> +
>> +test_run:cmd('switch replica')
>> + | ---
>> + | - true
>> + | ...
>> +
>> +box.info.status
>> + | ---
>> + | - running
>> + | ...
>> +box.info.replication[1].upstream.status
>> + | ---
>> + | - follow
>> + | ...
>> +
>> +box.space.glob:select{}
>> + | ---
>> + | - - [11]
>> + | - [22]
>> + | - [31]
>> + | - [32]
>> + | - [43]
>> + | - [51]
>> + | - [53]
>> + | - [62]
>> + | - [63]
>> + | - [71]
>> + | - [72]
>> + | - [73]
>> + | ...
>> +
>> +test_run:cmd('switch default')
>> + | ---
>> + | - true
>> + | ...
>> +
>> +-- Cleanup.
>> +test_run:cmd('stop server replica')
>> + | ---
>> + | - true
>> + | ...
>> +test_run:cmd('delete server replica')
>> + | ---
>> + | - true
>> + | ...
>> +box.schema.user.revoke('guest', 'replication')
>> + | ---
>> + | ...
>> +box.space.loc:drop()
>> + | ---
>> + | ...
>> +box.space.glob:drop()
>> + | ---
>> + | ...
>> diff --git a/test/replication/gh-4928-tx-boundaries.test.lua b/test/replication/gh-4928-tx-boundaries.test.lua
>> new file mode 100644
>> index 000000000..92526fc51
>> --- /dev/null
>> +++ b/test/replication/gh-4928-tx-boundaries.test.lua
>> @@ -0,0 +1,61 @@
>> +-- gh-4928. Test that transactions mixing local and global
>> +-- space operations are replicated correctly.
>> +env = require('test_run')
>> +test_run = env.new()
>> +bit = require('bit')
>> +
>> +-- Init.
>> +box.schema.user.grant('guest', 'replication')
>> +_ = box.schema.space.create('glob')
>> +_ = box.schema.space.create('loc', {is_local=true})
>> +_ = box.space.glob:create_index('pk')
>> +_ = box.space.loc:create_index('pk')
>> +
>> +function gen_mixed_tx(i)\
>> + box.begin()\
>> + if bit.band(i, 1) ~= 0 then\
>> + box.space.glob:insert{10 * i + 1}\
>> + else\
>> + box.space.loc:insert{10 * i + 1}\
>> + end\
>> + if bit.band(i, 2) ~= 0 then\
>> + box.space.glob:insert{10 * i + 2}\
>> + else\
>> + box.space.loc:insert{10 * i + 2}\
>> + end\
>> + if bit.band(i, 4) ~= 0 then\
>> + box.space.glob:insert{10 * i + 3}\
>> + else\
>> + box.space.loc:insert{10 * i + 3}\
>> + end\
>> + box.commit()\
>> +end
>> +
>> +test_run:cmd("create server replica with rpl_master=default,\
>> + script='replication/replica.lua'")
>> +test_run:cmd('start server replica')
>> +test_run:wait_downstream(2, {status='follow'})
>> +
>> +for i = 0, 7 do gen_mixed_tx(i) end
>> +
>> +box.info.replication[2].status
>> +
>> +vclock = box.info.vclock
>> +vclock[0] = nil
>> +test_run:wait_vclock("replica", vclock)
>> +
>> +test_run:cmd('switch replica')
>> +
>> +box.info.status
>> +box.info.replication[1].upstream.status
>> +
>> +box.space.glob:select{}
>> +
>> +test_run:cmd('switch default')
>> +
>> +-- Cleanup.
>> +test_run:cmd('stop server replica')
>> +test_run:cmd('delete server replica')
>> +box.schema.user.revoke('guest', 'replication')
>> +box.space.loc:drop()
>> +box.space.glob:drop()
>> diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
>> index d2743b5ed..f357b07da 100644
>> --- a/test/replication/suite.cfg
>> +++ b/test/replication/suite.cfg
>> @@ -18,6 +18,7 @@
>> "gh-4606-admin-creds.test.lua": {},
>> "gh-4739-vclock-assert.test.lua": {},
>> "gh-4730-applier-rollback.test.lua": {},
>> + "gh-4928-tx-boundaries.test.lua": {},
>> "*": {
>> "memtx": {"engine": "memtx"},
>> "vinyl": {"engine": "vinyl"}
>> --
>> 2.24.2 (Apple Git-127)
>>
--
Serge Petrenko
next prev parent reply other threads:[~2020-05-26 9:41 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-25 10:58 [Tarantool-patches] [PATCH v2 0/2] fix replication tx boundaries after local space rework Serge Petrenko
2020-05-25 10:58 ` [Tarantool-patches] [PATCH v2 1/2] wal: fix tx boundaries Serge Petrenko
2020-05-28 22:53 ` Vladislav Shpilevoy
2020-05-29 11:09 ` Serge Petrenko
2020-05-25 10:58 ` [Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row Serge Petrenko
2020-05-25 15:13 ` Cyrill Gorcunov
2020-05-25 16:34 ` Konstantin Osipov
2020-05-25 18:35 ` Cyrill Gorcunov
2020-05-25 20:42 ` Konstantin Osipov
2020-05-26 9:41 ` Serge Petrenko [this message]
2020-05-26 11:41 ` Konstantin Osipov
2020-05-26 12:08 ` Serge Petrenko
2020-05-28 22:54 ` Vladislav Shpilevoy
2020-05-29 8:13 ` Konstantin Osipov
2020-05-29 11:42 ` Serge Petrenko
2020-05-29 11:51 ` Konstantin Osipov
2020-05-29 12:07 ` Cyrill Gorcunov
2020-05-29 12:07 ` Cyrill Gorcunov
2020-05-29 12:15 ` Serge Petrenko
2020-05-29 13:44 ` Konstantin Osipov
2020-05-29 15:55 ` Serge Petrenko
2020-06-01 13:40 ` Vladislav Shpilevoy
2020-06-01 16:02 ` Sergey Ostanevich
2020-06-01 17:06 ` Vladislav Shpilevoy
2020-05-28 22:53 ` [Tarantool-patches] [PATCH v2 0/2] fix replication tx boundaries after local space rework Vladislav Shpilevoy
2020-05-29 11:03 ` Serge Petrenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d1e276fb-27aa-492e-c767-2ecd98d52821@tarantool.org \
--to=sergepetrenko@tarantool.org \
--cc=kostja.osipov@gmail.com \
--cc=tarantool-patches@dev.tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox