From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id D21236ECE3; Tue, 11 Jan 2022 23:39:45 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org D21236ECE3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1641933585; bh=FQrqDQQHVYulKvr6jRDlbtCPdYxGgpvRuIX6c+PdNmc=; h=Date:To:Cc:References:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=KhzI6PQ36XgYBcOP8MgenWhvrCx1uSu4HnIpizXq3ISdjY+pKxqW4Uh2WiYBxs5X4 mUS1JKTSM6dIR/kJCx+AAA4ERdgjf2+HsRfyH6ogPSV1FwKluH4S1bUD+33VsDl+VC 6pWregEd/jCu+Bi8focpzG+tVlbHg/P/k9+9LrwM= Received: from mail-lf1-f49.google.com (mail-lf1-f49.google.com [209.85.167.49]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 1C8056ECE3 for ; Tue, 11 Jan 2022 23:39:44 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 1C8056ECE3 Received: by mail-lf1-f49.google.com with SMTP id h2so792585lfv.9 for ; Tue, 11 Jan 2022 12:39:44 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=yfEAk6xVdmDAhVJVQKzH41d4cun539frm6XZMhN1lkU=; b=eq3+2RDglyekECbCtlcN+vHKCnMuQ0a3hwIjg+yeiqQ4lP3X7TH6F96EUBhbIrxTlz hh0iVYLsOSW4NQBkL/WEozaZKu72F+5+uNchPjQiHz1VQKuYGZluFaGxEq3ITIkZd+0+ b9G5erlAStyZBXCxL7bynbC1ijQI58aimsFFmtPrfi5sb9LeVTKsjRZX0yB5klXQBKRW vcTksYKediBG4inZihBWuK8MDzXytvKxKkjGnMCKYHzn9kj2gq/EE/ShSMZZE7d5jVV/ 4PPOwsnh+PubTY6jikLEdtuj4+Yrl8r5Q/6Im4MnqcmxsV5WmfjiUmMoNnuivecqFBGm eHeQ== X-Gm-Message-State: AOAM530WWfCKJGq/dEXZZjm0I8ZmJCAlCtxXHNHcnR5O9kkZadKGMVW7 581tBrTdpen/u1MI9senhZ1bNaytqEzvBg== X-Google-Smtp-Source: ABdhPJwg5Fdso5f8dD8YPFhyWyXxTHAZJQwsvBL5gLjmf6Z9DsmalAhbHSPCMa/fREbdD1b2CZrLfA== X-Received: by 2002:a2e:2e18:: with SMTP id u24mr3943079lju.492.1641933582912; Tue, 11 Jan 2022 12:39:42 -0800 (PST) Received: from grain.localdomain ([5.18.251.97]) by smtp.gmail.com with ESMTPSA id u10sm381669ljh.119.2022.01.11.12.39.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jan 2022 12:39:41 -0800 (PST) Received: by grain.localdomain (Postfix, from userid 1000) id 1DB245A0020; Tue, 11 Jan 2022 23:39:41 +0300 (MSK) Date: Tue, 11 Jan 2022 23:39:41 +0300 To: Serge Petrenko Cc: Vladislav Shpilevoy , tml Message-ID: References: <20211230202347.353494-1-gorcunov@gmail.com> <20211230202347.353494-3-gorcunov@gmail.com> <1641824923.419591282@f764.i.mail.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1641824923.419591282@f764.i.mail.ru> User-Agent: Mutt/2.0.7 (2021-05-04) Subject: Re: [Tarantool-patches] [PATCH v27 2/3] qsync: order access to the limbo terms X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Cyrill Gorcunov via Tarantool-patches Reply-To: Cyrill Gorcunov Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" On Mon, Jan 10, 2022 at 05:28:43PM +0300, Serge Petrenko wrote: > Hi! Thanks for the patch! > > box_issue_promote() and box_issue_demote() need fine-grained locking > anyway. > Otherwise it’s possible that promote() is already issued, but not yet > written to WAL, and some > outdated request is applied by applier at that exact moment. True. And in previous series Vlad has asked to not move in code which is not covered by tests. So I think this is a task for the next part. Currently we cover only the race between appliers. > > You should take the lock before the WAL write, and release it only after > txn_limbo_apply. > > No need to guard every limbo function there is, but we have to guard > everything that > writes PROMOTE/DEMOTE. ... > @@ -216,7 +225,7 @@ txn_limbo_last_entry(struct txn_limbo *limbo) > * @a replica_id. > */ > static inline uint64_t > -txn_limbo_replica_term(const struct txn_limbo *limbo, uint32_t > replica_id) > +txn_limbo_replica_term(struct txn_limbo *limbo, uint32_t replica_id) > { > > > You’ve forgot to lock the latch here, I guess. I did it on a purpose. As you remember we've faced many problems when tried to implement fine-grained locking inside limbo code. So I dropped this idea eventually and I think we could start with explicit locks to cover the applier race and then walk via small steps trying to cover the rest. > +/** > + * Initiate execution of a synchronous replication request. > + */ > +static inline void > +txn_limbo_begin(struct txn_limbo *limbo) > +{ > + limbo->promote_latch_cnt++; > + latch_lock(&limbo->promote_latch); > > > I suppose you should decrease the latch_cnt right after acquiring the > lock. > > Otherwise you count the sole «limbo user» together with «limbo waiters». Yes, this will represent accumulated value. To be honest I never saw such approach in any other code (ie increment/lock/decrement) but I think this is fine for fibres, will do. Cyrill