Tarantool development patches archive
From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
	tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Subject: Re: [Tarantool-patches] [PATCH 1/1] applier: process synchro rows after WAL write
Date: Fri, 9 Apr 2021 11:26:23 +0300	[thread overview]
Message-ID: <1460ae71-9e27-f844-1a97-acf3942700fc@tarantool.org> (raw)
In-Reply-To: <c909211a-ef69-82d6-a4f7-ba1d2ee5c7f6@tarantool.org>



On 09.04.2021 01:56, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
>> Thanks for the patch!
>>
>> I'm a bit worried about two different synchro rows coming from two
>> appliers. Is everything ok in this case?
>> Or even normal rows coming from other appliers. Say some other replica
>> has already applied this synchro row and even written some rows on top
>> of it. Its applier won't block on the replica_id latch, and may fail to
>> apply some txs following this synchro row, because it is not yet written
>> to WAL and thus not applied (the limbo is still not empty or belongs to
>> another instance).
> Should not happen, if I understood you correctly. See my response in the
> other email + a test for test-run + a console test.

Yes, it's fine, thanks for the explanation & the test!

>
>>> diff --git a/src/box/applier.cc b/src/box/applier.cc
>>> index 971b2e64c..f7c82548c 100644
>>> --- a/src/box/applier.cc
>>> +++ b/src/box/applier.cc
>>> @@ -762,26 +762,21 @@ applier_txn_wal_write_cb(struct trigger *trigger, void *event)
>>>    }
>>>      struct synchro_entry {
>>> -    /** Encoded form of a synchro record. */
>>> -    struct synchro_body_bin    body_bin;
>>> -
>>> -    /** xrow to write, used by the journal engine. */
>>> -    struct xrow_header row;
>>> -
>>> +    /** Request to process when WAL write is done. */
>>> +    struct synchro_request *req;
>>> +    /** Fiber that created the entry, to wake up when the WAL write is done. */
>>> +    struct fiber *owner;
>>>        /**
>>> -     * The journal entry itself. Note since
>>> -     * it has unsized array it must be the
>>> -     * last entry in the structure.
>>> +     * The base journal entry. It has an unsized array and thus must be
>>> +     * the last member of the structure. This is worked around via a
>>> +     * union adding the needed tail as char[].
>>>         */
>>> -    struct journal_entry journal_entry;
>>> +    union {
>>> +        struct journal_entry base;
>>> +        char base_buf[sizeof(base) + sizeof(base.rows[0])];
>>> +    };
>>>    };
>> I don't understand this union stuff.
>> The journal_entry is the last entry in synchro_entry anyway.
>> Is this a hack for allowing to allocate synchro_entry on the stack?
> Yes. Otherwise I would need a more ugly hack:
>
> 	struct synchro_entry *e;
> 	alignas(alignof(*e)) char e_buf[sizeof(*e) + sizeof(e->base.rows[0])];
> 	e = (struct synchro_entry *)e_buf;
>
> Then I managed to create the union hack, which I find
> less ugly, and it seems to allow declaring the entry on the stack.
> The interesting thing, though, is that I still can't make that
> union anything but the last member.
>
> I also see now there is a problem in one of the CI jobs:
> https://github.com/tarantool/tarantool/runs/2291961909
>
> It complains about an out-of-bounds array access, which has nothing to
> do with the union, just a C++ issue I suppose. I fixed it in the new
> version, see the other email.

Ok, I see now. Thanks!
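
For the archive, the pattern boils down to roughly the following. This is
a minimal standalone sketch, not the real applier code: the type and field
names (row, entry_base, synchro_entry_sketch) are invented for illustration,
and it relies on the same GCC/Clang extension as the patch, namely a struct
with a flexible array member nested inside another struct:

#include <stdio.h>

struct row {
	int lsn;
};

/*
 * Stand-in for struct journal_entry: the flexible array member forces
 * it to be the last field of any enclosing structure.
 */
struct entry_base {
	int n_rows;
	struct row *rows[];
};

struct synchro_entry_sketch {
	int some_other_state;
	/*
	 * The union overlays the base entry with a char buffer sized for
	 * the base plus exactly one row pointer. That gives the whole
	 * structure a known size, so it can be declared on the stack,
	 * while the union still has to stay the last member.
	 */
	union {
		struct entry_base base;
		char base_buf[sizeof(struct entry_base) + sizeof(struct row *)];
	};
};

int
main(void)
{
	struct row r = { .lsn = 42 };
	/* Plain stack variable, no malloc and no cast through a char[]. */
	struct synchro_entry_sketch e;
	e.base.n_rows = 1;
	e.base.rows[0] = &r;
	printf("lsn = %d\n", e.base.rows[0]->lsn);
	return 0;
}

Compared to the alignas()/cast variant quoted above, the compiler does the
size and alignment bookkeeping for us.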

>
>>> diff --git a/test/replication/gh-5213-qsync-applier-order.test.lua b/test/replication/gh-5213-qsync-applier-order.test.lua
>>> new file mode 100644
>>> index 000000000..3a32626f2
>>> --- /dev/null
>>> +++ b/test/replication/gh-5213-qsync-applier-order.test.lua
>>> @@ -0,0 +1,129 @@
>>> +--
>>> +-- gh-5213: the applier used to process CONFIRM/ROLLBACK before writing them
>>> +-- to WAL. As a result the transactions could become visible on CONFIRM, then
>>> +-- somehow not get written to WAL, and on restart the data might not be
>>> +-- visible again. That means a rollback of confirmed data, which is not
>>> +-- acceptable (unlike a commit after a rollback, which is).
>>> +--
>>> +test_run = require('test_run').new()
>>> +fiber = require('fiber')
>>> +old_synchro_quorum = box.cfg.replication_synchro_quorum
>>> +old_synchro_timeout = box.cfg.replication_synchro_timeout
>>> +
>>> +box.schema.user.grant('guest', 'super')
>>> +
>>> +s = box.schema.space.create('test', {is_sync = true})
>>> +_ = s:create_index('pk')
>>> +
>>> +test_run:cmd('create server replica with rpl_master=default,\
>>> +              script="replication/gh-5213-replica.lua"')
>>> +test_run:cmd('start server replica')
>>> +
>>> +test_run:switch('replica')
>>> +assert(box.info.id == 2)
>>> +lsn = box.info.vclock[1]
>>> +
>>> +test_run:switch('default')
>>> +fiber = require('fiber')
>>> +box.cfg{                                                                        \
>>> +    replication_synchro_quorum = 3,                                             \
>>> +    replication_synchro_timeout = 1000,                                         \
>>> +}
>>> +f = fiber.new(function() s:replace{1} end)
>>> +
>>> +test_run:switch('replica')
>>> +-- Wait until the transaction is written to WAL.
>>> +test_run:wait_cond(function() return box.info.vclock[1] == lsn + 1 end)
>> This shouldn't go wrong, but I find
>> test_run:wait_lsn('replica', 'default') more robust.
>> And then you wouldn't need to save lsn above.
>>
>> Same for other usages of lsn = ... and wait_cond(box.info.vclock[1] == lsn + ...)
>>
>> Up to you though.
> I totally forgot about wait_lsn(), thanks for noticing.
> I applied it. See the full diff in the second email.

Thanks for the fixes, LGTM.

-- 
Serge Petrenko

