From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy
To: Serge Petrenko, tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Date: Sat, 5 Jun 2021 01:48:59 +0200
Subject: Re: [Tarantool-patches] [PATCH 1/1] replication: prevent boot when rs uuid mismatches
In-Reply-To: <92afec1a-8781-52ca-14a2-72f3378e4f6e@tarantool.org>

Hi! Thanks for the review!

> First of all, 5613 is about 3rd replica bootstrapping a separate cluster,
> even when it sees that the 2 other nodes have already bootstrapped.
>
> This patch doesn't actually fix 5613. The 3rd node shows a different error now,
> but it still bootstraps its own cluster with a separate uuid.

I see now. I thought that in the issue description the first 2 nodes were
bootstrapped separately.

> I propose to change replicaset_round() somehow, so that it never chooses
> non-bootstrapped instances over bootstrapped ones. Even when bootstrapped
> instances are read-only.

I did it in the new version, see another email thread.

> Looks like you don't even have to change ballot for this purpose. There's
> already the 'is_loading' field. We just have to assign higher priority to
> `is_loading = false` rather than `read_only = false`.
>
> P.S. I've checked, and looks like is_loading is not that useful now.
> It's equal to instance's is_ro flag (not the one passed in ballot, but actual is_ro).
> Still, it's easier to change is_loading encoding than introduce a whole new field.

Yes, indeed, is_loading has little to do with actual loading. It is more
like "box.cfg() is finished and box.cfg{read_only=true} was set".

I made several changes to the ballot to make it work: renamed the field
is_ro, renamed and slightly changed the behaviour of is_loading, and added
a new field.

is_loading alone is not enough, because I still need to know who is really
read-only. Not just by read_only=false, but who is actually writable. There
can be orphans that have finished bootstrap/recovery but are not writable yet.
Some replication tests start failing if we only look at read_only=false and
finished bootstrap/recovery.

For instance, assume node1 is started and booted fine, and it is writable.
Then 2 other nodes are started: node2 and node3. They connect to node1
first and get its ballot and vclock. Then node2 registers on node1. Node3
now connects to node2 and gets its ballot. It sees that node2 has a higher
vclock than node1 (because node3 connected to node2 later). If it does not
take into account that node2 is read-only (because it is an orphan), it
tries to boot from node2 (because its vclock looks like > node1's) and
fails, because node2 can't write to _cluster yet.

I got that error on replication/bootstrap_leader.test.lua until I decided
to keep both properties: being booted and being read-only.
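
To make the scenario above concrete, here is a minimal sketch of the choice node3 faces. It is an illustration only, not the code from the patch: the names `Ballot`, `is_booted`, `is_writable` and `pick_bootstrap_leader` are hypothetical, the vclock is collapsed to a single integer, and the ranking order (booted first, then writable, then vclock) is an assumption based on the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ballot:
    """Hypothetical, simplified view of what a node advertises to peers."""
    name: str
    is_booted: bool    # assumption: bootstrap/recovery has finished
    is_writable: bool  # assumption: can write right now (not an orphan)
    vclock: int        # assumption: vector clock collapsed to one number


def pick_bootstrap_leader(ballots, check_writable=True):
    """Return the ballot to bootstrap from, or None if there are none.

    With check_writable=False the "actually writable" property is
    ignored, which models the failing variant described above.
    """
    def rank(b):
        if check_writable:
            # Booted beats non-booted, then writable beats read-only,
            # and only then the higher vclock wins.
            return (b.is_booted, b.is_writable, b.vclock)
        return (b.is_booted, b.vclock)

    return max(ballots, key=rank, default=None)


# What node3 sees: node1's ballot was fetched before node2 registered,
# so node2's vclock looks newer even though node2 is still an orphan.
seen_by_node3 = [
    Ballot("node1", is_booted=True, is_writable=True, vclock=1),
    Ballot("node2", is_booted=True, is_writable=False, vclock=2),
]

good = pick_bootstrap_leader(seen_by_node3)
bad = pick_bootstrap_leader(seen_by_node3, check_writable=False)
print(good.name, bad.name)
```

When writability is ignored, the higher vclock makes node2 win even though it cannot write to _cluster yet; with both properties taken into account, node1 is chosen.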