Date: Sun, 11 Jul 2021 21:22:37 +0300
From: Cyrill Gorcunov via Tarantool-patches
Reply-To: Cyrill Gorcunov
To: Vladislav Shpilevoy
Cc: tml
References: <20210710222803.253251-1-gorcunov@gmail.com>
Subject: Re: [Tarantool-patches] [PATCH] limbo: introduce request processing hooks

On Sun, Jul 11, 2021 at 04:00:35PM +0200, Vladislav Shpilevoy wrote:
> >
> > Thus we need to separate "filter" and "apply" stages of processing.
> > What is more interesting is that we filter incoming requests via the
> > in-memory vclock and update it immediately.
> > Thus the following situation is possible -- a promote request comes
> > in, we remember it inside promote_term_map, but then the write to the
> > WAL fails and we never revert the promote_term_map variable. Thus the
> > other peer won't be able to resend us this promote request, because
> > now we think that we've already applied the promote.
>
> Well, I still don't understand what the issue is. We discussed it
> privately already. You simply should not apply anything until WAL
> write is done. And it is not happening now on the master. The
> terms vclock is updated only **after** WAL write.

Currently we don't reject the data but silently ignore unexpected
requests after we've already written them. In the new approach we have
to decide whether an incoming packet can be applied and written to the
WAL, or whether we should filter it out and return an error to the
client. If we filter a request out, we must not modify our WAL content
at all.

> Why do you need all these new vclocks if you should not apply
> anything before WAL write in the first place?
>
> > write to WAL fails and we never revert
> > the promote_term_map variable
>
> This simply should not be possible. The term map is updated only
> after WAL write is done. At least this is how it works now, doesn't
> it? Why did you change it? (In case you did, because I am not sure,
> I didn't review the code thoroughly.)

Look, I think this may not be explicitly visible from the patch since
the diff is big enough, but here is the key moment. The current code is
called after the WAL has been updated:

void
txn_limbo_process(struct txn_limbo *limbo,
		  const struct synchro_request *req)
{
	uint64_t term = req->term;
	uint32_t origin = req->origin_id;

	if (txn_limbo_replica_term(limbo, origin) < term) {
		vclock_follow(&limbo->promote_term_map, origin, term);
		if (term > limbo->promote_greatest_term) {
			limbo->promote_greatest_term = term;
		} else if (req->type == IPROTO_PROMOTE) {
			/*
			 * PROMOTE for outdated term. Ignore.
			 */
-->			return;
		}
	}

and here we exit early without any error. In the new approach we start
reporting an error when this situation happens, because it is a split
brain. But we must not write this term into our WAL file, so we have to
validate the incoming packet earlier.

Now imagine the following: we validate the incoming packet and remember
its term in promote_term_map, then we start writing this packet into
our WAL and the write procedure fails. As a result promote_term_map
remains updated while the real data on disk no longer matches it. What
is worse, such a request may have updated @promote_greatest_term as
well. As a result our test for obsolete replicas will give a false
positive:

static inline bool
txn_limbo_is_replica_outdated(const struct txn_limbo *limbo,
			      uint32_t replica_id)
{
	return txn_limbo_replica_term(limbo, replica_id) <
	       limbo->promote_greatest_term;
}

because promote_greatest_term is already updated in memory but not on
the WAL level.

So I had to split our txn_limbo_process into two stages: "filter" and
"application". If the application stage fails, we restore the original
max term, so the replica will be able to resend us the promote request
and we will try to write it to the WAL again.
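
To make the intended flow a bit more concrete, here is a rough sketch of
the two-stage processing (illustration only -- txn_limbo_filter(),
txn_limbo_apply() and wal_write_synchro() are placeholder names, and the
vclock_reset()-based rollback is just one possible way to restore the
terms; the actual patch may do it differently):

/*
 * Sketch of the "filter" + "application" split; helper names are
 * placeholders, not the real patch API.
 */
int
txn_limbo_process(struct txn_limbo *limbo,
		  const struct synchro_request *req)
{
	/*
	 * Stage 1: filter. Validate the request against the current
	 * in-memory terms without modifying anything. A PROMOTE for an
	 * outdated term (split brain) is rejected here, so neither the
	 * in-memory state nor the WAL is touched.
	 */
	if (txn_limbo_filter(limbo, req) != 0)
		return -1;

	/* Remember the terms we are about to overwrite. */
	uint64_t prev_term =
		txn_limbo_replica_term(limbo, req->origin_id);
	uint64_t prev_greatest_term = limbo->promote_greatest_term;

	/*
	 * Stage 2: application. Update promote_term_map and
	 * promote_greatest_term, then write the request to the WAL.
	 */
	txn_limbo_apply(limbo, req);
	if (wal_write_synchro(req) != 0) {
		/*
		 * The WAL write failed: restore the previous terms so
		 * the peer can resend the same PROMOTE and we will try
		 * to write it again.
		 */
		vclock_reset(&limbo->promote_term_map, req->origin_id,
			     prev_term);
		limbo->promote_greatest_term = prev_greatest_term;
		return -1;
	}
	return 0;
}

The key property is that a request rejected by the filter stage never
reaches the WAL, and a failed WAL write leaves the in-memory terms as
they were, so the sender is not wrongly considered outdated afterwards.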