From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <77b533c1-0c2f-c11d-0aa6-4109674a7025@tarantool.org>
Date: Wed, 12 Jan 2022 17:01:00 +0300
To: Cyrill Gorcunov, Vladislav Shpilevoy
Cc: tml
References: <20211230202347.353494-1-gorcunov@gmail.com> <20211230202347.353494-3-gorcunov@gmail.com> <1641824923.419591282@f764.i.mail.ru>
Subject: Re: [Tarantool-patches] [PATCH v27 2/3] qsync: order access to the limbo terms
List-Id: Tarantool development patches
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko

On 11.01.2022 23:39, Cyrill Gorcunov wrote:
> On Mon, Jan 10, 2022 at 05:28:43PM +0300, Serge Petrenko wrote:
>> Hi! Thanks for the patch!
>>
>> box_issue_promote() and box_issue_demote() need fine-grained locking
>> anyway. Otherwise it’s possible that promote() is already issued, but
>> not yet written to WAL, and some outdated request is applied by the
>> applier at that exact moment.
> True. And in previous series Vlad has asked to not move in code which is
> not covered by tests. So I think this is a task for the next part. Currently
> we cover only the race between appliers.

Let's ask Vlad, then.
I feel like we should fix this now rather than wait for a full
fine-grained locking patch.

First of all, this is a known bug (and fine-grained locking was meant to
cover everything we don't know of, just in case).
Besides, simply locking issue_promote/issue_demote should be much easier
than implementing the fine-grained locking patch.

>
>>
>> You should take the lock before the WAL write, and release it only after
>> txn_limbo_apply.
>>
>> No need to guard every limbo function there is, but we have to guard
>> everything that writes PROMOTE/DEMOTE.
> ...
>> @@ -216,7 +225,7 @@ txn_limbo_last_entry(struct txn_limbo *limbo)
>>   * @a replica_id.
>>   */
>>  static inline uint64_t
>> -txn_limbo_replica_term(const struct txn_limbo *limbo, uint32_t replica_id)
>> +txn_limbo_replica_term(struct txn_limbo *limbo, uint32_t replica_id)
>>  {
>>
>>
>> You forgot to lock the latch here, I guess.
> I did it on purpose. As you remember we've faced many problems when we tried
> to implement fine-grained locking inside limbo code. So I dropped this idea
> eventually and I think we could start with explicit locks to cover the applier
> race and then walk via small steps trying to cover the rest.

OK, then please restore `const` in the function declaration.

>
>> +/**
>> + * Initiate execution of a synchronous replication request.
>> + */
>> +static inline void
>> +txn_limbo_begin(struct txn_limbo *limbo)
>> +{
>> +	limbo->promote_latch_cnt++;
>> +	latch_lock(&limbo->promote_latch);
>>
>>
>> I suppose you should decrease the latch_cnt right after acquiring the
>> lock.
>>
>> Otherwise you count the sole «limbo user» together with «limbo waiters».
> Yes, this will represent accumulated value. To be honest I never saw such
> approach in any other code (ie increment/lock/decrement) but I think this
> is fine for fibers, will do.

It just looks strange to me that `synchro.queue.waiters` would be non-zero
when someone is simply using the limbo. They are `waiters`, not `users` or
something else.

>
> Cyrill

-- 
Serge Petrenko
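
The increment/lock/decrement ordering discussed above can be sketched as
follows. This is only an illustration under stated assumptions, not the
code from the patch: `toy_latch`, `toy_limbo` and the `toy_*` functions
are hypothetical stand-ins, and the toy latch fails instead of yielding
the way a real fiber latch would. Only the field names `promote_latch`
and `promote_latch_cnt` are taken from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a fiber latch; NOT Tarantool's struct latch. */
struct toy_latch {
	bool locked;
};

/* Hypothetical limbo holding only the two fields from the quoted diff. */
struct toy_limbo {
	struct toy_latch promote_latch;
	/* Number of fibers currently waiting for the latch. */
	int promote_latch_cnt;
};

/* A fiber announces that it is about to wait for the latch. */
static void
toy_limbo_wait_start(struct toy_limbo *limbo)
{
	limbo->promote_latch_cnt++;
}

/*
 * Try to take the latch. A real latch_lock() would yield until the
 * latch is free; this toy version returns false instead. On success
 * the caller stops being a waiter, so the counter is dropped at once
 * and the latch owner is never counted among the waiters.
 */
static bool
toy_limbo_try_acquire(struct toy_limbo *limbo)
{
	if (limbo->promote_latch.locked)
		return false;
	limbo->promote_latch.locked = true;
	limbo->promote_latch_cnt--;
	return true;
}

/* Release the latch once the request has been fully processed. */
static void
toy_limbo_end(struct toy_limbo *limbo)
{
	assert(limbo->promote_latch.locked);
	limbo->promote_latch.locked = false;
}
```

With this ordering the counter is zero while a single fiber holds the
latch and becomes non-zero only when another fiber is actually blocked
behind it, which is the meaning the name `waiters` suggests.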