From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, Cyrill Gorcunov <gorcunov@gmail.com>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 2/2] box: fix an assertion failure in box.ctl.promote()
Date: Thu, 27 May 2021 13:53:54 +0300
Message-ID: <008e0d5d-7b1a-0599-b76c-c3dc8a481d60@tarantool.org>
In-Reply-To: <3552b10e-370d-ddbd-11ed-8c3d5310e651@tarantool.org>

On 26.05.2021 21:46, Vladislav Shpilevoy wrote:
>>>> @@ -1618,14 +1618,29 @@ box_promote(void)
>>>>  				 txn_limbo.owner_id);
>>>>  			return -1;
>>>>  		}
>>>> +		if (txn_limbo_is_empty(&txn_limbo)) {
>>>> +			wait_lsn = txn_limbo.confirmed_lsn;
>>>> +			goto promote;
>>>> +		}
>>>>  	}
>>>> -	/*
>>>> -	 * promote() is a no-op on the limbo owner, so all the rows
>>>> -	 * in the limbo must've come through the applier meaning they already
>>>> -	 * have an lsn assigned, even if their WAL write hasn't finished yet.
>>>> -	 */
>>>> -	wait_lsn = txn_limbo_last_synchro_entry(&txn_limbo)->lsn;
>>>> +	struct txn_limbo_entry *last_entry;
>>>> +	last_entry = txn_limbo_last_synchro_entry(&txn_limbo);
>>>> +	/* Wait for the last entries WAL write. */
>>>> +	if (last_entry->lsn < 0) {
>>>> +		if (wal_sync(NULL) < 0)
>>>> +			return -1;
>>>> +		if (txn_limbo_is_empty(&txn_limbo)) {
>>>> +			wait_lsn = txn_limbo.confirmed_lsn;
>>>> +			goto promote;
>>>> +		}
>>>> +		if (last_entry != txn_limbo_last_synchro_entry(&txn_limbo)) {
>>> This is a bit dangerous. We cache a pointer and then go to fiber_yield,
>>> which switches context. At this moment the pointer becomes a dangling one
>>> and we simply can't be sure whether it _was_ reused. IOW, Serge, are we
>>> 100% sure that the same pointer with the same address but with new data
>>> won't appear here as the last entry in the limbo?
>> I agree this solution is not perfect.
>>
>> An alternative would be to do the following:
>> 1) Check that the limbo owner hasn't changed
>> 2) Check that the last entry has a positive lsn (i.e. it's not a new entry which
>>    wasn't yet written to WAL), and that this lsn is equal to the lsn of our entry.
>>
>> But what if our entry was confirmed and destroyed during wal_sync()? We can't compare
>> other entries' lsns with this one's.
> As decided in the chat, you can use txn->id. It is unique until
> restart and should help to detect if the last transaction has
> changed.

Yep, thanks for the suggestion!
Here's the diff:

=============================================================

diff --git a/src/box/box.cc b/src/box/box.cc
index 3d9cd0e57..3baae6afe 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1628,13 +1628,19 @@ box_promote(void)
 	last_entry = txn_limbo_last_synchro_entry(&txn_limbo);
 	/* Wait for the last entries WAL write. */
 	if (last_entry->lsn < 0) {
+		int64_t tid = last_entry->txn->id;
 		if (wal_sync(NULL) < 0)
 			return -1;
+		if (former_leader_id != txn_limbo.owner_id) {
+			diag_set(ClientError, ER_INTERFERING_PROMOTE,
+				 txn_limbo.owner_id);
+			return -1;
+		}
 		if (txn_limbo_is_empty(&txn_limbo)) {
 			wait_lsn = txn_limbo.confirmed_lsn;
 			goto promote;
 		}
-		if (last_entry != txn_limbo_last_synchro_entry(&txn_limbo)) {
+		if (tid != txn_limbo_last_synchro_entry(&txn_limbo)->txn->id) {
 			diag_set(ClientError, ER_QUORUM_WAIT, quorum,
 				 "new synchronous transactions appeared");
 			return -1;

=============================================================

-- 
Serge Petrenko
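
For illustration only, here is a minimal standalone sketch of why the id comparison is more robust than the pointer comparison across a yield. It does not use Tarantool code: struct toy_txn, struct toy_entry, toy_entry_new() and toy_check_after_yield() are hypothetical names, and the one-slot "allocator" is an assumption chosen to force address reuse deterministically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct txn and struct txn_limbo_entry. */
struct toy_txn {
	int64_t id;
};

struct toy_entry {
	struct toy_txn *txn;
	int64_t lsn;
};

/*
 * One-slot "allocator": every new entry lands at the same address,
 * mimicking a destroyed limbo entry whose memory gets reused.
 */
static struct toy_txn txn_slot;
static struct toy_entry entry_slot;
static int64_t next_txn_id = 1;

static struct toy_entry *
toy_entry_new(void)
{
	txn_slot.id = next_txn_id++;	/* Ids are never reused. */
	entry_slot.txn = &txn_slot;
	entry_slot.lsn = -1;		/* WAL write not finished yet. */
	return &entry_slot;
}

/*
 * Cache both the pointer and the id of the last entry, replace the
 * entry "during the yield", then report which check notices it.
 */
static void
toy_check_after_yield(bool *ptr_detects, bool *id_detects)
{
	assert(ptr_detects && id_detects);
	struct toy_entry *last = toy_entry_new();
	const struct toy_entry *cached_ptr = last;
	int64_t cached_tid = last->txn->id;

	/* Simulated yield: the old entry is destroyed, a new one appears. */
	last = toy_entry_new();

	*ptr_detects = (cached_ptr != last);
	*id_detects = (cached_tid != last->txn->id);
}
```

In this toy setup the new entry reuses the old address, so the pointer check misses the replacement while the id check catches it. A real allocator may or may not reuse the address, which is exactly why the pointer check cannot be relied upon.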