Date: Tue, 3 Aug 2021 16:25:59 +0300
From: Cyrill Gorcunov via Tarantool-patches
To: Vladislav Shpilevoy
Cc: tml
Subject: Re: [Tarantool-patches] [PATCH v9 4/5] limbo: filter incoming synchro requests

On Tue, Aug 03, 2021 at 01:50:49AM +0200, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> On 30.07.2021 13:35, Cyrill Gorcunov wrote:
> > When we receive synchro requests we can't just apply
> > them blindly because in worse case they may come from
> > split-brain configuration (where a cluster splitted into
>
> splitted -> split.
>
> > several subclusters and each one has own leader elected,
> > then subclisters are trying to merge back into original
>
> subclisters -> subclusters.

Thanks!

> > cluster). We need to do our best to detect such configs
> > and force these nodes to rejoin from the scratch for
> > data consistency sake.
> >
> > Thus when we're processing requests we pass them to the
> > packet filter first which validates their contents and
> > refuse to apply if they are not matched.
> >
> > Depending on request type each packet traverse an
> > appropriate chain(s)
> >
> > FILTER_IN
> >  - Common chain for any synchro packet. We verify
> >    that if replica_id is nil then it shall be
> >    PROMOTE request with lsn 0 to migrate limbo owner
>
> How can it be 0 for non PROMOTE/DEMOTE requests?
> Do we ever encode such rows at all? Why isn't this
> a part of FILTER_PROMOTE?

There could be network errors, for example, thus when we see a synchro
type of packet we need to verify its common attributes first, before
passing it to the next chain. These attributes do not depend on the
limbo state, I think.

In this particular case, if we see a packet with a nil replica_id then
it must be a promote/demote request. Otherwise I'd have to add these
tests to every non promote/demote packet. For example, imagine a packet
{rollback | lsn = 0}: it is obviously wrong because we don't have
lsn = 0 records at all. Thus either I put this test inside the
confirm/rollback chains, or I make a common helper which does a general
validation of incoming synchro packets. The filter-in chain has been
introduced for this sake.

> > FILTER_CONFIRM
> > FILTER_ROLLBACK
> >  - Both confirm and rollback requests shall not come
> >    with empty limbo since it measn the synchro queue
>
> measn -> means.

Thanks!

> >    is already processed and the peer didn't notice
> >    that
>
> Is it the only issue? What about ROLLBACK coming to
> an instance, which already made PROMOTE on the rolled back
> data? That is a part of the original problem in the ticket.

Then it is an error, as far as I understand. There is no more queued
data, and the promote request basically dropped any information we had
in memory related to the limbo state. The promote request implies that
the node where it was executed is a raft leader, and its data is the
only valid source for all other nodes in the cluster. Thus if we
receive a confirm/rollback request for rows which were already
committed (or rolled back) via a promote request, then the peer should
exit with an error, shouldn't it? Or am I missing something?

> > FILTER_PROMOTE
> >  - Promote request should come in with new terms only,
> >    otherwise it means the peer didn't notice election
> >
> >  - If limbo's confirmed_lsn is equal to promote LSN then
> >    it is a valid request to process
> >
> >  - If limbo's confirmed_lsn is bigger than requested then
> >    it is valid in one case only -- limbo migration so the
> >    queue shall be empty
>
> I don't understand. How is it valid? PROMOTE(lsn) rolls
> back everything > lsn. If the local confirmed_lsn > lsn, it
> means that data can't be rolled back now and the data becomes
> inconsistent.

IIRC this was a scenario where we're migrating the limbo owner. The
owner migration scenario is still unclear to me though, I need to
revisit this moment.
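Side note: to make the filter-in and confirm/rollback checks above
more concrete, here is a minimal standalone C sketch of how I think of
them. The types and names below are simplified stand-ins for
illustration only, not the actual patch code:

	#include <stdbool.h>
	#include <stdint.h>

	/* Simplified stand-ins for the synchro request types. */
	enum synchro_type {
		SYNCHRO_CONFIRM,
		SYNCHRO_ROLLBACK,
		SYNCHRO_PROMOTE,
		SYNCHRO_DEMOTE,
	};

	struct synchro_req {
		enum synchro_type type;
		uint32_t replica_id;	/* 0 plays the role of nil */
		int64_t lsn;
	};

	/*
	 * FILTER_IN: common checks which do not depend on the limbo
	 * state. A nil replica_id is only legal for PROMOTE/DEMOTE
	 * with lsn == 0 (limbo owner migration); e.g. a packet like
	 * {rollback | lsn = 0} gets rejected right here, once,
	 * instead of in every type-specific chain.
	 */
	static bool
	filter_in(const struct synchro_req *req)
	{
		if (req->replica_id != 0)
			return true;
		bool is_promote_demote = req->type == SYNCHRO_PROMOTE ||
					 req->type == SYNCHRO_DEMOTE;
		return is_promote_demote && req->lsn == 0;
	}

	/*
	 * FILTER_CONFIRM / FILTER_ROLLBACK: an empty limbo means the
	 * synchro queue was already processed and the peer did not
	 * notice that.
	 */
	static bool
	filter_confirm_rollback(bool limbo_is_empty)
	{
		return !limbo_is_empty;
	}

The point of the shared filter_in step is exactly to keep such
state-independent sanity checks out of the per-type chains.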
> >  - If limbo's confirmed_lsn is less than promote LSN then
> >    - If queue is empty then it means the transactions are
> >      already rolled back and request is invalid
> >    - If queue is not empty then its first entry might be
> >      greater than promote LSN and it means that old data
> >      either committed or rolled back already and request
> >      is invalid
>
> If the first entry's LSN in the limbo > promote LSN, it
> means it wasn't committed yet. The promote will roll it back
> and it is fine. This will make the data consistent.

Quoting you:

> Первая транзакция лимба имеет lsn > promote lsn. Это уже конец.
> Потому что старый мастер уже старые данные либо закатил, либо
> откатил, уже неважно, и это сплит бреин.

Translation:

> The first limbo transaction has lsn > promote lsn. This is the end,
> because the old master has already either committed or rolled back
> the data, it doesn't matter which, and this is a split-brain
> situation.

Maybe I got you wrong?

> The problem appears if there were some other sync txns
> rolled back or even committed with quorum=1 before this
> hanging txn. And I don't remember we figured a way to
> distinguish between these situations. Did we?

Seems like not yet. I need more time to think about whether we can
have some other scenarios here...
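P.S. For completeness, here is the same kind of sketch for the promote
chain as I currently read it (again with simplified, assumed fields,
not the real limbo structure):

	#include <stdbool.h>
	#include <stdint.h>

	/* Rough model of the limbo state the checks look at. */
	struct limbo_state {
		uint64_t term;		/* greatest PROMOTE term seen */
		int64_t confirmed_lsn;	/* last confirmed LSN */
		bool queue_is_empty;
		int64_t first_queued_lsn; /* valid if !queue_is_empty */
	};

	/*
	 * FILTER_PROMOTE as described above: reject stale terms,
	 * accept an exact confirmed_lsn match, allow
	 * confirmed_lsn > lsn only as owner migration (empty queue),
	 * and treat the remaining cases as a possible split brain.
	 */
	static bool
	filter_promote(const struct limbo_state *l, uint64_t req_term,
		       int64_t req_lsn)
	{
		/* The peer didn't notice an election. */
		if (req_term <= l->term)
			return false;
		/* Plain leader change over fully confirmed data. */
		if (l->confirmed_lsn == req_lsn)
			return true;
		/* Valid only as limbo migration: queue must be empty. */
		if (l->confirmed_lsn > req_lsn)
			return l->queue_is_empty;
		/*
		 * confirmed_lsn < req_lsn: an empty queue means the
		 * txns were already rolled back; a first queued entry
		 * above the promote LSN means the old data was
		 * already finalized elsewhere. Both are invalid.
		 */
		if (l->queue_is_empty || l->first_queued_lsn > req_lsn)
			return false;
		return true;
	}

This is just how I read the rules from the commit message; the open
questions above (owner migration, quorum=1 history) may still change
the last two branches.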