From: Vladislav Shpilevoy
To: Serge Petrenko, tarantool-patches@dev.tarantool.org
Date: Mon, 19 Oct 2020 22:26:04 +0200
Subject: Re: [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart
Message-ID: <508aa0bf-153c-ed0a-ea81-15720ef9c988@tarantool.org>
In-Reply-To: <3b913104-6366-4ee0-fda2-812a06cc36a8@tarantool.org>
References: <3b913104-6366-4ee0-fda2-812a06cc36a8@tarantool.org>

Hi! Thanks for the patch!

> 17.10.2020 20:17, Vladislav Shpilevoy wrote:
>> There were 2 issues with the relay restarting the recovery cursor when the
>> node is elected as a leader. They are fixed in the last 2 commits. The first
>> was about the local LSN not being set, the second about GC not being
>> propagated.
>>
>> The first patch is not directly related to the bugs above. It was just found
>> while working on this. In theory, without the first patch we can get
>> flakiness in the tests changed in this commit, but only if a replication
>> connection breaks without a reason.
>>
>> Additionally, the new test - gh-5433-election-restart-recovery - hangs on my
>> machine when I start tens of its instances. All workers, after executing it
>> several times, hang. But!!! not in something related to Raft - they hang in
>> the first box.snapshot(), where the election is not even enabled yet. From
>> some debug prints I see it hangs somewhere in engine_begin_checkpoint() and
>> consumes ~80% of the CPU. But it may be just a consequence of the corrupted
>> memory on Mac, due to libeio being broken. I don't know what to do with that
>> now.
>
> Hi! Thanks for the patchset!
>
> Patches 2 and 3 LGTM.
>
> Patch 1 looks OK, but I have one question.
> What happens when a user accidentally enables Raft during a cluster upgrade,
> when some of the instances support Raft and some don't?
> Looks like it'll lead to even more inconvenience.
>
> In my opinion it's fine if the leader just disappears without further notice.
> We have an election timeout set up for this anyway.

The election timeout won't work if Raft is disabled or is in 'voter' mode on
the other nodes. Moreover, even if we enable the timeout, those nodes will
still see the leader as alive, even when it is no longer a leader, because
with Raft disabled the node does not send Raft state but keeps sending
regular replication heartbeats.

Also, the box.info.election output will freeze. Even after a connection to
the old master is re-established, the other nodes will keep showing it as the
leader in box.info.election. I don't see how to fix that except by always
sending the Raft state, and that looks more confusing than the problems
caused by an accidental misconfiguration. It may affect monitoring that
depends on box.info.election, and routing, if someone makes it depend on
box.info.election.

Another option: we could add a new Raft mode, 'mute'. Such a node can't vote
and can't become a leader, but it sends Raft state and is read-only. 'off'
would then mean that you fully understand the consequences: the node won't
send Raft state at all, and the others may still think it is a leader.
This would help to protect against accidentally enabling Raft: you would set
it to 'off' and nothing would be sent. I will ask in the chats.
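
For the record, a minimal Lua sketch of the scenario above. It assumes the
box.cfg election options (election_mode, election_timeout) and the
box.info.election output as they exist in the current Tarantool API; the
'mute' value is only the proposal from this mail, not an existing option.

    -- Node that is allowed to become a leader.
    box.cfg{
        election_mode = 'candidate', -- participates in elections, may win
        election_timeout = 5,
        replication_timeout = 1,
    }

    -- Node where Raft was (accidentally) left disabled.
    box.cfg{
        election_mode = 'off', -- no Raft state is sent or applied; plain
                               -- replication heartbeats keep flowing, so
                               -- peers still consider the old leader alive
        replication_timeout = 1,
    }

    -- As described above, on such a node the election view goes stale:
    -- the old master keeps being reported as the leader, because only
    -- regular heartbeats arrive, not Raft state updates.
    print(require('yaml').encode(box.info.election))

    -- The proposed middle ground, purely hypothetical for now:
    -- box.cfg{election_mode = 'mute'} -- never votes, never leads,
    --                                 -- read-only, but relays Raft state.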