From: Vladislav Shpilevoy
Date: Thu, 17 Dec 2020 22:43:22 +0100
Message-ID: <5b466891-65b4-ad9c-253e-156869100959@tarantool.org>
In-Reply-To: <3600a9d1e06d6c1c8cda41dda88cd0dd088ca3e2.1607633488.git.sergepetrenko@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH 2/4] relay: rename is_raft_enabled message to relay_is_running
To: Serge Petrenko, cyrillos@gmail.com
Cc: tarantool-patches@dev.tarantool.org

Thanks for the patch!

I have an idea how to do all of this a bit more easily, probably. But it could take a few more commits.

AFAIU, here is what we need, looking from above. We need box_clear_synchro_queue() to wait until a quorum of replicas has received the changes of the instance owning the limbo.

The limbo already collects the LSNs on the node owning it, through txn_limbo_ack(). This function populates a vclock object with the versions of the owner's LSN as it is seen on each replica.

If we collected LSNs into that vclock on the replica nodes too, it would be quite simple to check whether a quorum is reached. Just do the same as txn_limbo_assign_local_lsn() does: scan the vclock and check if there is a quorum. If there is no quorum, the replica must wait for it. We can add a trigger to the limbo (on_ack()) or directly to tx_status_update(), and wait for the moment when the quorum is collected.
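The quorum scan described above could be sketched roughly like this. This is a minimal illustration, not the actual Tarantool code: struct ack_vclock and limbo_has_quorum() are simplified stand-ins for the real vclock type and the check done in txn_limbo_assign_local_lsn().

```c
/*
 * Sketch of the quorum check: count how many replicas have
 * acknowledged the target LSN of the limbo owner. Hypothetical
 * simplified types, not the real Tarantool API.
 */
#include <stdbool.h>
#include <stdint.h>

enum { VCLOCK_MAX = 32 };

/* Per-replica view of the limbo owner's LSN, filled by acks. */
struct ack_vclock {
	int64_t lsn[VCLOCK_MAX];
};

/* Return true when at least @quorum replicas confirmed @target_lsn. */
static bool
limbo_has_quorum(const struct ack_vclock *vclock, int64_t target_lsn,
		 int quorum)
{
	int ack_count = 0;
	for (int i = 0; i < VCLOCK_MAX; i++) {
		if (vclock->lsn[i] >= target_lsn)
			ack_count++;
	}
	return ack_count >= quorum;
}
```

If the scan says there is no quorum yet, the waiter subscribes to ack updates (on_ack() on the limbo, or tx_status_update()) and re-runs the same check from the trigger.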
Another option: copy the limbo's vclock onto the stack in box_clear_synchro_queue(), check the current number of confirmed replicas, then hang a trigger on tx_status_update() and wait. Even simpler than the first option.

The reason why I don't like (or don't get yet, maybe it is good?) the approach through the relays is that, firstly, I want is_raft_enabled to be deleted rather than extended. It is an ugly crutch, which probably should be removed by this: https://github.com/tarantool/tarantool/issues/5438.

Secondly, any change in relay not related to just reading and writing the network scares the shit out of me. It is a complex thing running in a separate thread, so better to avoid adding any more complex logic to it. It should be as dumb as possible, until all relays are moved to one thread (WAL) with network and disk IO served through worker threads. What I see in the last commit now looks extremely complex.

This is a discussion proposal. I may be wrong here and missing something.

On 10.12.2020 21:55, Serge Petrenko via Tarantool-patches wrote:
> Relay sends a raft enabler/disabler message on start and stop. This
> message would also be useful to perform some additional work related to relay
> initialization/deinitialization done in the tx thread.
>
> For example, the message will be used to notify the tx thread that there's no
> point in waiting until a relay replicates a specific row, since the relay
> is exiting. This is the case in #5435.

Why do we need such a notification? Why can't we let box_clear_synchro_queue() simply wait until the timeout? What if the timeout is 10 sec, and we gave up immediately because there was no point, but in 5 sec the relays would start, deliver everything, and the queue could be cleared?

P.S. I read the last commit and now I see that you don't wait for a timeout. But still, you wait infinitely. So does it matter whether a relay has exited or not anyway?
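P.P.S. The trigger-and-wait pattern from the second option above could be sketched roughly as below. Everything here is hypothetical and simplified for illustration: struct status_watch and on_status_update() are invented names, the real code would use Tarantool triggers and a fiber_cond instead of this synchronous simulation, and it would track acks per replica rather than a bare counter.

```c
/*
 * Sketch of option two: snapshot the confirmed-replica count from the
 * limbo vclock, then re-check the quorum on every status update.
 * Hypothetical simplified types, not the real Tarantool API.
 */
#include <stdbool.h>
#include <stdint.h>

struct status_watch {
	int64_t target_lsn;	/* LSN the queue owner waits for. */
	int quorum;		/* Required number of confirmations. */
	int confirmed;		/* Replicas known to have target_lsn,
				 * seeded from the stack-copied vclock. */
	bool done;		/* Set once the quorum is collected. */
};

/*
 * Would be hung on tx_status_update() as a trigger: a replica reports
 * its applied LSN and we re-check the quorum. Real code would signal
 * a fiber_cond here so the waiting fiber in box_clear_synchro_queue()
 * wakes up instead of polling.
 */
static void
on_status_update(struct status_watch *w, int64_t replica_lsn)
{
	if (w->done)
		return;
	if (replica_lsn >= w->target_lsn)
		w->confirmed++;
	if (w->confirmed >= w->quorum)
		w->done = true;
}
```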