From: Serge Petrenko <sergepetrenko@tarantool.org>
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH v2 6/6] txn_limbo: ignore CONFIRM/ROLLBACK for a foreign master
Date: Wed, 23 Dec 2020 14:59:24 +0300
Message-ID: <b21832c41b36f61f98a59f39a4a26abf2d449654.1608724239.git.sergepetrenko@tarantool.org>
In-Reply-To: <cover.1608724238.git.sergepetrenko@tarantool.org>

We designed limbo so that it errors on receiving a CONFIRM or ROLLBACK
for another instance's data. Actually, this error is pointless, and even
harmful. Here's why:

Imagine you have 3 instances, 1, 2 and 3.
First 1 writes some synchronous transactions, but dies before writing
CONFIRM.

Now 2 has to write CONFIRM instead of 1 to take limbo ownership.
From now on 2 is the limbo owner and in case of high enough load it
constantly has some data in the limbo.

Once 1 restarts, it first recovers its xlogs, and fills its limbo with
its own unconfirmed transactions from the previous run. Now replication
between 1, 2 and 3 is started and the first thing 1 sees is that 2 and 3
ack its old transactions. So 1 writes CONFIRM for its own transactions
even before the same CONFIRM written by 2 reaches it.
Once the CONFIRM written by 1 is replicated to 2 and 3 they error and
stop replication, since their limbo contains entries from 2, not from 1.
Actually, there's no need to error, since it's just a really old CONFIRM
which has already been processed by both 2 and 3.

So, ignore CONFIRM/ROLLBACK when it references a wrong limbo owner.

The issue was discovered with test replication/election_qsync_stress.

Follow-up #5435
---
 src/box/applier.cc  |  3 +--
 src/box/box.cc      |  3 +--
 src/box/txn_limbo.c | 14 +++++++++-----
 src/box/txn_limbo.h |  2 +-
 4 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index fb2f5d130..553db76fc 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -861,8 +861,7 @@ apply_synchro_row(struct xrow_header *row)
 	if (xrow_decode_synchro(row, &req) != 0)
 		goto err;
 
-	if (txn_limbo_process(&txn_limbo, &req))
-		goto err;
+	txn_limbo_process(&txn_limbo, &req);
 
 	struct synchro_entry *entry;
 	entry = synchro_entry_new(row, &req);
diff --git a/src/box/box.cc b/src/box/box.cc
index 38bf4034e..fc4888955 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -383,8 +383,7 @@ apply_wal_row(struct xstream *stream, struct xrow_header *row)
 		struct synchro_request syn_req;
 		if (xrow_decode_synchro(row, &syn_req) != 0)
 			diag_raise();
-		if (txn_limbo_process(&txn_limbo, &syn_req) != 0)
-			diag_raise();
+		txn_limbo_process(&txn_limbo, &syn_req);
 		return;
 	}
 	if (iproto_type_is_raft_request(row->type)) {
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 9272f5227..9498c7a44 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -634,13 +634,17 @@ complete:
 	return 0;
 }
 
-int
+void
 txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 {
 	if (req->replica_id != limbo->owner_id) {
-		diag_set(ClientError, ER_SYNC_MASTER_MISMATCH,
-			 req->replica_id, limbo->owner_id);
-		return -1;
+		/*
+		 * Ignore CONFIRM/ROLLBACK messages for a foreign master.
+		 * These are most likely outdated messages for already confirmed
+		 * data from an old leader, who has just started and written
+		 * confirm right on synchronous transaction recovery.
+		 */
+		return;
 	}
 	switch (req->type) {
 	case IPROTO_CONFIRM:
@@ -652,7 +656,7 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req)
 	default:
 		unreachable();
 	}
-	return 0;
+	return;
 }
 
 void
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index a49356c14..c28b5666d 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -257,7 +257,7 @@ int
 txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry);
 
 /** Execute a synchronous replication request. */
-int
+void
 txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req);
 
 /**
-- 
2.24.3 (Apple Git-128)

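For readers who want to try the new behaviour outside the Tarantool tree, here is a minimal sketch of the owner check, not the real implementation. All names (toy_limbo, toy_request, toy_limbo_process, the lsn fields) are hypothetical stand-ins. Unlike the patched txn_limbo_process(), which returns void, the toy function returns a bool purely so that the "ignored" case is observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real request types. */
enum toy_req_type { TOY_CONFIRM, TOY_ROLLBACK };

struct toy_request {
	enum toy_req_type type;
	/* Instance whose transactions the request refers to. */
	uint32_t replica_id;
	int64_t lsn;
};

struct toy_limbo {
	/* Instance whose transactions currently sit in the limbo. */
	uint32_t owner_id;
	int64_t confirmed_lsn;
	int64_t rollback_lsn;
};

/*
 * Apply a CONFIRM/ROLLBACK to the limbo.
 * Returns true if the request was applied and false if it was
 * ignored because it references a foreign limbo owner.
 */
bool
toy_limbo_process(struct toy_limbo *limbo, const struct toy_request *req)
{
	assert(limbo != NULL && req != NULL);
	if (req->replica_id != limbo->owner_id) {
		/* Stale request from an old leader: silently skip it. */
		return false;
	}
	switch (req->type) {
	case TOY_CONFIRM:
		limbo->confirmed_lsn = req->lsn;
		break;
	case TOY_ROLLBACK:
		limbo->rollback_lsn = req->lsn;
		break;
	}
	return true;
}
```

In the scenario from the commit message, the limbo on instance 2 has owner_id == 2, so a late CONFIRM carrying replica_id == 1 leaves the limbo untouched and replication keeps going instead of stopping with an error.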
Thread overview: 18+ messages (as of 2020-12-23 11:59 UTC)

2020-12-23 11:59 [Tarantool-patches] [PATCH v2 0/6] make clear_synchro_queue commit everything Serge Petrenko
2020-12-23 11:59 ` [Tarantool-patches] [PATCH v2 1/6] box: add a single execution guard to clear_synchro_queue Serge Petrenko
2020-12-23 11:59 ` [Tarantool-patches] [PATCH v2 2/6] relay: introduce on_status_update trigger Serge Petrenko
2020-12-23 17:25   ` Vladislav Shpilevoy
2020-12-24 16:11   ` Serge Petrenko
2020-12-23 11:59 ` [Tarantool-patches] [PATCH v2 3/6] txn_limbo: introduce txn_limbo_last_synchro_entry method Serge Petrenko
2020-12-23 17:25   ` Vladislav Shpilevoy
2020-12-24 16:13   ` Serge Petrenko
2020-12-23 11:59 ` [Tarantool-patches] [PATCH v2 4/6] box: rework clear_synchro_queue to commit everything Serge Petrenko
2020-12-23 17:28   ` Vladislav Shpilevoy
2020-12-24 16:12   ` Serge Petrenko
2020-12-24 17:35   ` Vladislav Shpilevoy
2020-12-24 21:02   ` Serge Petrenko
2020-12-23 11:59 ` [Tarantool-patches] [PATCH v2 5/6] test: fix replication/election_qsync_stress test Serge Petrenko
2020-12-23 11:59 ` Serge Petrenko [this message]
2020-12-23 17:28   ` [Tarantool-patches] [PATCH v2 6/6] txn_limbo: ignore CONFIRM/ROLLBACK for a foreign master Vladislav Shpilevoy
2020-12-24 16:13   ` Serge Petrenko
2020-12-25 10:04 ` [Tarantool-patches] [PATCH v2 0/6] make clear_synchro_queue commit everything Kirill Yukhin