From: Vladislav Shpilevoy
Date: Thu, 17 Dec 2020 22:43:22 +0100
Message-ID: <5b466891-65b4-ad9c-253e-156869100959@tarantool.org>
In-Reply-To: <3600a9d1e06d6c1c8cda41dda88cd0dd088ca3e2.1607633488.git.sergepetrenko@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH 2/4] relay: rename is_raft_enabled message to relay_is_running
To: Serge Petrenko, cyrillos@gmail.com
Cc: tarantool-patches@dev.tarantool.org

Thanks for the patch!

I have an idea how to do all of this a bit more easily, probably. But it could take a few more commits.

AFAIU, here is what we need, looking from above. We need box_clear_synchro_queue() to wait until a quorum of replicas has received the changes of the instance owning the limbo.

The limbo already collects the LSNs on the node owning it, through txn_limbo_ack(). This function populates a vclock object with the versions of the owner's LSN as it is seen on each replica.

If we collected LSNs into that vclock on the replica nodes too, it would be quite simple to check whether a quorum is reached. Just do the same as txn_limbo_assign_local_lsn() does: scan the vclock and check if there is a quorum. If there is no quorum, the replica must wait for it. We can add a trigger to the limbo (on_ack()) or directly to tx_status_update(), and wait for the moment when the quorum is collected.
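The quorum scan described above could be sketched roughly like this. This is a minimal illustration, not the actual Tarantool code: struct ack_vclock and limbo_has_quorum() are simplified stand-ins for the real vclock type and the check done in txn_limbo_assign_local_lsn().

```c
/*
 * Sketch of the quorum check: count how many replicas have
 * acknowledged the target LSN of the limbo owner. Hypothetical
 * simplified types, not the real Tarantool API.
 */
#include <stdbool.h>
#include <stdint.h>

enum { VCLOCK_MAX = 32 };

/* Per-replica view of the limbo owner's LSN, filled by acks. */
struct ack_vclock {
	int64_t lsn[VCLOCK_MAX];
};

/* Return true when at least @quorum replicas confirmed @target_lsn. */
static bool
limbo_has_quorum(const struct ack_vclock *vclock, int64_t target_lsn,
		 int quorum)
{
	int ack_count = 0;
	for (int i = 0; i < VCLOCK_MAX; i++) {
		if (vclock->lsn[i] >= target_lsn)
			ack_count++;
	}
	return ack_count >= quorum;
}
```

If the scan says there is no quorum yet, the waiter subscribes to ack updates (on_ack() on the limbo, or tx_status_update()) and re-runs the same check from the trigger.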
Another option: copy the limbo's vclock onto the stack in box_clear_synchro_queue(), check the current number of confirmed replicas, then hang a trigger on tx_status_update() and wait. Even simpler than the first option.

The reason why I don't like (or don't get yet, maybe it is good?) the approach through the relays is that, firstly, I want is_raft_enabled to be deleted rather than extended. It is an ugly crutch, which probably should be removed by this: https://github.com/tarantool/tarantool/issues/5438.

Secondly, any change in relay not related to just reading and writing the network scares the shit out of me. It is a complex thing running in a separate thread, so better to avoid adding any more complex logic to it. It should be as dumb as possible, until all relays are moved to one thread (WAL) with network and disk IO served through worker threads. What I see in the last commit now looks extremely complex.

This is a discussion proposal. I may be wrong here and missing something.

On 10.12.2020 21:55, Serge Petrenko via Tarantool-patches wrote:
> Relay sends a raft enabler/disabler message on start and stop. This
> message would also be useful to perform some additional work related to relay
> initialization/deinitialization done in the tx thread.
>
> For example, the message will be used to notify the tx thread that there's no
> point in waiting until a relay replicates a specific row, since the relay
> is exiting. This is the case in #5435.

Why do we need such a notification? Why can't we let box_clear_synchro_queue() simply wait until the timeout? What if the timeout is 10 sec, and we gave up immediately because there was no point, but in 5 sec the relays would start, deliver everything, and the queue could be cleared?

P.S. I read the last commit and now I see that you don't wait for a timeout. But still, you wait infinitely. So does it matter whether a relay has exited or not anyway?
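P.P.S. The trigger-and-wait pattern from the second option above could be sketched roughly as below. Everything here is hypothetical and simplified for illustration: struct status_watch and on_status_update() are invented names, the real code would use Tarantool triggers and a fiber_cond instead of this synchronous simulation, and it would track acks per replica rather than a bare counter.

```c
/*
 * Sketch of option two: snapshot the confirmed-replica count from the
 * limbo vclock, then re-check the quorum on every status update.
 * Hypothetical simplified types, not the real Tarantool API.
 */
#include <stdbool.h>
#include <stdint.h>

struct status_watch {
	int64_t target_lsn;	/* LSN the queue owner waits for. */
	int quorum;		/* Required number of confirmations. */
	int confirmed;		/* Replicas known to have target_lsn,
				 * seeded from the stack-copied vclock. */
	bool done;		/* Set once the quorum is collected. */
};

/*
 * Would be hung on tx_status_update() as a trigger: a replica reports
 * its applied LSN and we re-check the quorum. Real code would signal
 * a fiber_cond here so the waiting fiber in box_clear_synchro_queue()
 * wakes up instead of polling.
 */
static void
on_status_update(struct status_watch *w, int64_t replica_lsn)
{
	if (w->done)
		return;
	if (replica_lsn >= w->target_lsn)
		w->confirmed++;
	if (w->confirmed >= w->quorum)
		w->done = true;
}
```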