[Tarantool-patches] [PATCH] limbo: introduce request processing hooks
Serge Petrenko
sergepetrenko at tarantool.org
Mon Jul 12 11:01:29 MSK 2021
11.07.2021 17:00, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> On 11.07.2021 00:28, Cyrill Gorcunov wrote:
>> Guys, this is an early RFC since I would like to discuss the
>> design first before going further. Currently we don't interrupt
>> incoming synchro requests, which doesn't allow us to detect a cluster
>> split-brain situation. As we discussed verbally, there are
>> a number of signs to detect it, and we need to stop receiving data
>> from obsolete nodes.
>>
>> The main problem though is that such filtering of incoming packets
>> should happen at the moment where we can still take a step back and
>> inform the peer that data has been declined, but currently our
>> applier code processes synchro requests inside a WAL trigger, i.e. when
>> data is already applied or rolling back.
>>
>> Thus we need to separate "filter" and "apply" stages of processing.
>> What is more interesting is that we filter incoming requests via an
>> in-memory vclock and update it immediately. Thus the following situation
>> is possible: a promote request comes in, we remember it inside
>> promote_term_map, but then the write to WAL fails and we never revert
>> the promote_term_map variable, thus the other peer won't be able to
>> resend us this promote request because now we think that we've
>> already applied the promote.
> Well, I still don't understand what the issue is. We discussed it
> privately already. You simply should not apply anything until WAL
> write is done. And it is not happening now on the master. The
> terms vclock is updated only **after** WAL write.
>
> Why do you need all these new vclocks if you should not apply
> anything before WAL write in the first place?
If I understand correctly, the issue is that if we filter (and check for
the split brain) after the WAL write, we will end up with a conflicting
PROMOTE in our WAL. Cyrill is trying to avoid this, that's why
he's separating the filter stage. This way the error will reach
the remote peer before any WAL write, and the WAL write won't happen.
And if we filter before the WAL write, we need the second vclock, which
Cyrill has introduced.
We may leave conflicting PROMOTEs in WAL (first write them and only
then check for conflicts). In this case this whole patch isn't
needed. But I personally don't like such an approach.
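
To illustrate the filter-before-write ordering described above, here is a
minimal Python sketch. It is a toy model, not Tarantool code: the class
Limbo, the methods filter() and apply(), the two maps, and the conflict
rule (a term must be strictly greater than any term already seen for that
origin) are all assumptions made for illustration. The real split-brain
check and the patch's data structures may differ.

```python
class SplitBrainError(Exception):
    """Raised at the filter stage, before anything reaches the WAL."""


class WalWriteError(Exception):
    """Raised when the (simulated) WAL write fails."""


class Limbo:
    """Hypothetical two-stage PROMOTE processing: filter, then apply."""

    def __init__(self):
        # Terms that are durable: updated only after a successful WAL write.
        self.confirmed_terms = {}
        # Second map: terms accepted by the filter but not yet written.
        self.pending_terms = {}
        # Simulated WAL contents.
        self.wal = []

    def filter(self, origin_id, term):
        """Reject a conflicting PROMOTE while the peer can still be told."""
        seen = max(
            self.confirmed_terms.get(origin_id, 0),
            self.pending_terms.get(origin_id, 0),
        )
        if term <= seen:  # assumed conflict rule, see the note above
            raise SplitBrainError(f"term {term} <= seen term {seen}")
        self.pending_terms[origin_id] = term

    def apply(self, origin_id, term, wal_ok=True):
        """Write to the WAL; confirm the term only if the write succeeds."""
        try:
            if not wal_ok:
                raise WalWriteError("simulated WAL failure")
            self.wal.append(("PROMOTE", origin_id, term))
            self.confirmed_terms[origin_id] = term
        finally:
            # Drop the pending entry on success and on failure alike, so a
            # failed write does not block a later resend of the same term.
            if self.pending_terms.get(origin_id) == term:
                del self.pending_terms[origin_id]
```

In this model a failed WAL write leaves both maps without the term, so the
peer can resend the same PROMOTE, while a conflicting PROMOTE is refused by
filter() and never appears in the WAL.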
>
>> write to WAL fails and we never revert
>> the promote_term_map variable
> This simply should not be possible. The term map is updated only
> after WAL write is done. At least this is how it works now, doesn't
> it? Why did you change it? (In case you did, because I am not sure,
> I didn't review the code thoroughly).
--
Serge Petrenko