From mboxrd@z Thu Jan 1 00:00:00 1970
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: Vladislav Shpilevoy, Cyrill Gorcunov, tml
Subject: Re: [Tarantool-patches] [PATCH] limbo: introduce request processing hooks
Date: Mon, 12 Jul 2021 11:01:29 +0300
Message-ID: <187d1ae2-99cb-50d4-d5b4-18aa6c5f5546@tarantool.org>
References: <20210710222803.253251-1-gorcunov@gmail.com>
List-Id: Tarantool development patches

On 11.07.2021 17:00, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> On 11.07.2021 00:28, Cyrill Gorcunov wrote:
>> Guys, this is an early rfc since I would like to discuss the
>> design first before going further. Currently we don't interrupt
>> incoming syncro requests which doesn't allow us to detect cluster
>> split-brain situation, as we were discussing verbally there are
>> a number of signs to detect it and we need to stop receiving data
>> from obsolete nodes.
>>
>> The main problem though is that such filtering of incoming packets
>> should happen at the moment where we still can do a step back and
>> inform the peer that data has been declined, but currently our
>> applier code processes syncro requests inside WAL trigger, ie when
>> data is already applied or rolling back.
>>
>> Thus we need to separate "filter" and "apply" stages of processing.
>> What is more interesting is that we filter incomings via in memory
>> vclock and update them immediately. Thus the following situation
>> is possible -- a promote request comes in, we remember it inside
>> promote_term_map but then write to WAL fails and we never revert
>> the promote_term_map variable, thus other peer won't be able to
>> resend us this promote request because now we think that we've
>> already applied the promote.
> Well, I still don't understand what the issue is. We discussed it
> privately already. You simply should not apply anything until WAL
> write is done. And it is not happening now on the master. The
> terms vclock is updated only **after** WAL write.
>
> Why do you need all these new vclocks if you should not apply
> anything before WAL write in the first place?

If I understand correctly, the issue is that if we filter (and check
for split brain) after the WAL write, we will end up with a
conflicting PROMOTE in our WAL. Cyrill is trying to avoid this, which
is why he's separating the filter stage.
This way the error will reach the remote peer before any WAL write,
and the WAL write won't happen.

And if we filter before the WAL write, we need the second vclock,
which Cyrill has introduced.

Alternatively, we could leave conflicting PROMOTEs in the WAL (write
them first and only then check for conflicts). In that case this
whole patch isn't needed, but I personally don't like such an approach.

>
>> write to WAL fails and we never revert
>> the promote_term_map variable
> This simply should not be possible. The term map is updated only
> after WAL write is done. At least this is how it works now, doesn't
> it? Why did you change it? (In case you did, because I am not sure,
> I didn't review the code thoroughly).

--
Serge Petrenko
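
For illustration only, the filter-before-WAL idea discussed above can be
sketched as a toy model. This is not Tarantool code: `ToyLimbo`,
`Declined`, `pending_term`, `confirmed_term` and the `wal_write` callback
are all hypothetical names, a single integer stands in for the term
vclock, and the "term must strictly grow" check is only a placeholder
for whatever split-brain test the real filter would run. It also assumes
a single in-flight request.

```python
class Declined(Exception):
    """Filter-stage rejection, raised before anything reaches the WAL."""


class ToyLimbo:
    """Hypothetical model of separate 'filter' and 'apply' stages."""

    def __init__(self):
        self.confirmed_term = 0  # advanced only after a successful WAL write
        self.pending_term = 0    # advanced by the filter, reverted on failure
        self.wal = []            # stand-in for the write-ahead log

    def filter_promote(self, term):
        # Stage 1: runs before the WAL write, so the peer can still be
        # told that its request was declined.
        if term <= self.pending_term:
            raise Declined(f"stale PROMOTE: term {term} <= {self.pending_term}")
        self.pending_term = term

    def process_promote(self, term, wal_write):
        self.filter_promote(term)
        try:
            wal_write(self.wal, term)
        except OSError:
            # The write failed: forget the pending term so that the peer
            # can resend the same PROMOTE later (one in-flight request
            # is assumed here).
            self.pending_term = self.confirmed_term
            raise
        # Stage 2: the WAL write is done, only now confirm the term.
        self.confirmed_term = term


def good_write(wal, term):
    wal.append(("PROMOTE", term))


def failing_write(wal, term):
    raise OSError("simulated WAL failure")
```

In this model a PROMOTE that fails the filter never reaches `wal`, and a
PROMOTE whose WAL write fails can be resent, because only
`confirmed_term` survives the failure. Without `pending_term` the filter
would have nothing to compare against before the write completes, which
is the role of the second vclock in the patch under discussion.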