From mboxrd@z Thu Jan 1 00:00:00 1970
To: Konstantin Osipov, tarantool-patches@dev.tarantool.org,
 gorcunov@gmail.com, sergepetrenko@tarantool.org
References: <0c92a88ff1d392f8b03de59be8cb19a162bf78f8.1626392372.git.v.shpilevoy@tarantool.org>
 <20210716142959.GC146960@starling>
 <20210719091248.GA4257@starling>
Date: Tue, 20 Jul 2021 00:06:15 +0200
In-Reply-To: <20210719091248.GA4257@starling>
Content-Type: text/plain; charset=utf-8
Subject: Re: [Tarantool-patches] [PATCH 1/2] replication: introduce ballot.can_be_leader
List-Id: Tarantool development patches
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender:
 "Tarantool-patches"

On 19.07.2021 11:12, Konstantin Osipov wrote:
> * Vladislav Shpilevoy [21/07/18 20:03]:
>>>> The new field during bootstrap will help to avoid selecting a
>>>> 'voter' as a master. Voters can't write, they are unable to boot
>>>> themselves nor register others.
>>>>
>>>> @TarantoolBot document
>>>> Title: New field - IPROTO_BALLOT_CAN_BE_LEADER
>>>> It is sent as a part of `IPROTO_BALLOT (0x29)`. The field is a
>>>> boolean flag which is true if the sender has `election_mode` in
>>>> its config `'manual'` or `'candidate'`.
>>>>
>>>> During bootstrap the nodes able to become a leader are preferred
>>>> over the nodes configured as `'voter'`.
>>>
>>> Curious why did you add this feature in the first place, I mean
>>> "eligibility"? Each voter has to be able to become a leader,
>>> otherwise raft liveness guarantees are violated. Raft has
>>> learners, but learners neither vote nor can become leaders.
>>
>> Voters are nodes which an admin does not want to be a leader. For
>> instance, they are too far away physically. As voters, they might
>> help to elect a leader, for example, if there are just 3 nodes one
>> of which is a voter.
>>
>> Another application is when you specifically start 1 node as a
>> voter and 2 candidates. The voter might skip all the replication
>> data and work on a slow small machine.
>>
>> It can help to form a majority. We are planning to make this
>> feature even easier to use by adding dataless nodes just for
>> voting.
>>
>> As for Raft, it should not bring any problems. In Raft you can
>> say that all nodes are candidates, but some of them are so slow,
>> that they can never vote for themselves in time. Raft still works,
>> and you essentially have 'voters'.
>
> Imagine there are nodes A, B, C, D, E.
> A is a leader, E is a voter which can not become a leader.
>
> Imagine A's log index is 5, B = 4, C = 3, D = 2, E = 5.
>
> The majority's log index is 4, so entry 4 is committed. A dies, B
> is partitioned away. The cluster is stuck, because neither C nor B
> can get a quorum (3 votes).

But how is that different from real Raft? In normal Raft I can say
that E is simply too slow to take any action. It is just stuck or
dead. The cluster will be stuck then, yes. Not much you can do here.

You can think of a voter as an almost permanently broken node which
sometimes manages to vote but never manages to become a candidate in
time. I suppose Raft can withstand that behaviour.

> Worse yet, if E's (voter) commit index is low, not high, it can vote for a
> node which doesn't have a committed entry. In that case you can
> lose a committed entry.

Could you provide an example? Because I still do not see how it is
different from classic Raft, in which one node either is always too
late to become a candidate or is turned off when there are no better
candidates.
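
To make the A..E scenario above easier to reason about, here is a
rough Python sketch of it. It is only a toy model, not Tarantool code:
the `Node` class and the helper names are made up for illustration,
terms are ignored, and a node is assumed to grant its vote to any
candidate whose log index is not lower than its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical model of a cluster member, not a Tarantool structure.
    name: str
    log_index: int
    reachable: bool = True      # False if the node is dead or partitioned away
    can_be_leader: bool = True  # False models election_mode = 'voter'


def quorum(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def committed_index(nodes):
    """Highest log index stored on at least a quorum of nodes."""
    indexes = sorted((n.log_index for n in nodes), reverse=True)
    return indexes[quorum(len(nodes)) - 1]


def votes_for(candidate, nodes):
    """Reachable nodes whose log is not newer than the candidate's.

    The candidate itself is counted, since it votes for itself.
    """
    return sum(
        1 for n in nodes
        if n.reachable and n.log_index <= candidate.log_index
    )


def possible_leaders(nodes):
    """Names of reachable, eligible nodes that could collect a quorum."""
    need = quorum(len(nodes))
    return [
        n.name for n in nodes
        if n.reachable and n.can_be_leader and votes_for(n, nodes) >= need
    ]


# The scenario from the quoted mail: A is dead, B is partitioned away,
# E is a voter that cannot become a leader.
nodes = [
    Node("A", 5, reachable=False),
    Node("B", 4, reachable=False),
    Node("C", 3),
    Node("D", 2),
    Node("E", 5, can_be_leader=False),
]

print(committed_index(nodes))   # 4
print(possible_leaders(nodes))  # []
```

Under these assumptions C collects 2 votes and D collects 1, so nobody
reaches the quorum of 3. Setting `can_be_leader=True` for E in the same
model lets E collect 3 votes, which is the case where E is neither slow
nor turned off.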