From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp60.i.mail.ru (smtp60.i.mail.ru [217.69.128.40])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id EB58A445320
	for ; Sun, 5 Jul 2020 18:13:31 +0300 (MSK)
References: <31755095-f5c6-cc73-f3d3-6bfe233b78c1@tarantool.org>
From: Vladislav Shpilevoy
Message-ID: <516cd472-e01e-78f2-baed-fa9def76ce46@tarantool.org>
Date: Sun, 5 Jul 2020 17:13:30 +0200
MIME-Version: 1.0
In-Reply-To: <31755095-f5c6-cc73-f3d3-6bfe233b78c1@tarantool.org>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [Tarantool-patches] [PATCH 1/5] [tosquash] replication: fix multiple rollbacks
List-Id: Tarantool development patches
To: Serge Petrenko , tarantool-patches@dev.tarantool.org

>>   src/box/txn_limbo.c                 | 25 +++++++++++++++++++++++++
>>   test/replication/qsync_basic.result |  2 +-
>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
>> index 0402664cb..2cb687f4d 100644
>> --- a/src/box/txn_limbo.c
>> +++ b/src/box/txn_limbo.c
>> @@ -150,6 +157,24 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry)
>>       bool timed_out = fiber_yield_timeout(txn_limbo_confirm_timeout(limbo));
>>       fiber_set_cancellable(cancellable);
>>       if (timed_out) {
>> +        assert(!txn_limbo_is_empty(limbo));
>> +        if (txn_limbo_first_entry(limbo) != entry) {
>> +            /*
>> +             * If this is not a first entry in the
>> +             * limbo, it is definitely not a first
>> +             * timed out entry. And since it managed
>> +             * to time out too, it means there is
>> +             * currently another fiber writing
>> +             * rollback. Wait when it will finish and
>> +             * wake us up.
>> +             */
>
> Why isn't it the first timed out? Is it because once previous entry was confirmed, it
> is removed from the queue immediately?
> Looks fragile.

No, this has nothing to do with confirms. If two entries were added to
the limbo and the second one timed out, then the first one has been in
the limbo even longer, so it must have timed out too. That means it is
enough to write a rollback for the first entry, and that rollback will
undo all the following entries as well.
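
To make the argument concrete, here is a minimal toy model of it. It is
only a sketch, not the actual txn_limbo code: all names (toy_limbo,
toy_entry, toy_limbo_append and so on) are hypothetical, and it assumes
that entries are queued in FIFO order and share a single timeout value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT Tarantool's txn_limbo API. */
enum { TOY_LIMBO_MAX = 8 };

struct toy_entry {
	/* Time at which the entry was added to the queue. */
	double enqueued_at;
	bool rolled_back;
};

struct toy_limbo {
	struct toy_entry entries[TOY_LIMBO_MAX];
	size_t count;
	/* Assumption: one timeout shared by every entry. */
	double timeout;
};

/* Append an entry. Timestamps must be non-decreasing (FIFO order). */
void
toy_limbo_append(struct toy_limbo *limbo, double now)
{
	assert(limbo->count < TOY_LIMBO_MAX);
	assert(limbo->count == 0 ||
	       limbo->entries[limbo->count - 1].enqueued_at <= now);
	limbo->entries[limbo->count].enqueued_at = now;
	limbo->entries[limbo->count].rolled_back = false;
	limbo->count++;
}

/* An entry is timed out once it has waited for at least the timeout. */
bool
toy_entry_timed_out(const struct toy_limbo *limbo, size_t idx, double now)
{
	assert(idx < limbo->count);
	return now - limbo->entries[idx].enqueued_at >= limbo->timeout;
}

/*
 * Roll back the first entry together with everything queued after it.
 * Returns the number of entries rolled back.
 */
size_t
toy_limbo_rollback_all(struct toy_limbo *limbo)
{
	size_t n = limbo->count;
	for (size_t i = 0; i < n; i++)
		limbo->entries[i].rolled_back = true;
	limbo->count = 0;
	return n;
}
```

Because timestamps never decrease along the queue, the first entry has
always waited at least as long as any later one. So whenever
toy_entry_timed_out() is true for a later entry, it is also true for
entry 0, and a single toy_limbo_rollback_all() covers both.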