From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 5BF936FC87; Fri, 1 Oct 2021 15:14:55 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 5BF936FC87 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1633090495; bh=omySbA4pjs+EaFhotmsSnBVOcV1uo06xIcQpYeFLU0g=; h=To:Cc:References:Date:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=Ix/pc/wIJ3XJsvl7NP5/qE0NvgRweKY7OXo5vkYttDJ5/G2rv/nWi8FwC0ou38O5Q tW7rKBiwTEvqX0UW0QGszzBpe99DTA9cZqFCcAjnbw7qXvDcx6OzXbqjOcrG4vOg0d crqbzEt/j8KKkHvUdP89lW9egWgTdbn++l1F2+28= Received: from smtp43.i.mail.ru (smtp43.i.mail.ru [94.100.177.103]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 7CCE16FC87 for ; Fri, 1 Oct 2021 15:14:53 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 7CCE16FC87 Received: by smtp43.i.mail.ru with esmtpa (envelope-from ) id 1mWHRE-0000Bo-M5; Fri, 01 Oct 2021 15:14:53 +0300 To: Cyrill Gorcunov , tml Cc: Vladislav Shpilevoy References: <20210930094445.316694-1-gorcunov@gmail.com> <20210930094445.316694-3-gorcunov@gmail.com> Message-ID: <0c64d172-4fa8-29ec-7845-ff772738c09a@tarantool.org> Date: Fri, 1 Oct 2021 15:14:52 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: <20210930094445.316694-3-gorcunov@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD96A58C36AA2E996499DED20B83AFF2DE94FA0C0F669C426FD182A05F538085040B357F92BDE4AE165F7C94444B53EEE7C811D2EF9977389E14C8FDCE0BE3C746E X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7B865BA2AB0AE8E25EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637531CC3E3F637A59A8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D80BBB7FC04862685E0330EA2910450DD1117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCAA867293B0326636D2E47CDBA5A96583BD4B6F7A4D31EC0BC014FD901B82EE079FA2833FD35BB23D27C277FBC8AE2E8B974A882099E279BDA471835C12D1D977C4224003CC8364762BB6847A3DEAEFB0F43C7A68FF6260569E8FC8737B5C2249EC8D19AE6D49635B68655334FD4449CB9ECD01F8117BC8BEAAAE862A0553A39223F8577A6DFFEA7CC415C329B279CF9D43847C11F186F3C59DAA53EE0834AAEE X-C1DE0DAB: 0D63561A33F958A5C2D31DF1C401373D6E9B7572ADB962E198BA3FA910904CFFD59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA757AF5085B7B0228D6410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D342B1F2AD168155B061C3907C26081ABF07AAA2BA3C0AE47286F70B624E564C567A404DED4452F93F11D7E09C32AA3244CF057DEA700CDF72F5E50111EAB398C2705AB220A9D022EBCFACE5A9C96DEB163 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojJNmX3owDPmHcfk7rdebEfA== X-Mailru-Sender: 3B9A0136629DC9125D61937A2360A446B553427B69A2548A89386DE24A78B79510B2E7DC9FE48E24424AE0EB1F3D1D21E2978F233C3FAE6EE63DB1732555E4A8EE80603BA4A5B0BC112434F685709FCF0DA7A0AF5A3A8387 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH v19 2/3] qsync: order access to the limbo terms X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Serge Petrenko via Tarantool-patches Reply-To: Serge Petrenko Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" 30.09.2021 12:44, Cyrill Gorcunov пишет: > Limbo terms tracking is shared between appliers and when > one of appliers is waiting for write to complete inside > journal_write() routine, an other may need to access read > term value to figure out if promote request is valid to > apply. Due to cooperative multitasking access to the terms > is not consistent so we need to be sure that other fibers > read up to date terms (ie written to the WAL). > > For this sake we use a latching mechanism, when one fiber > takes a lock for updating other readers are waiting until > the operation is complete. > > For example here is a call graph of two appliers > > applier 1 > --------- > applier_apply_tx > (promote term = 3 > current max term = 2) > applier_synchro_filter_tx > apply_synchro_row > journal_write > (sleeping) > > at this moment another applier comes in with obsolete > data and term 2 > > applier 2 > --------- > applier_apply_tx > (term 2) > applier_synchro_filter_tx > txn_limbo_is_replica_outdated -> false > journal_write (sleep) > > applier 1 > --------- > journal wakes up > apply_synchro_row_cb > set max term to 3 > > So the applier 2 didn't notice that term 3 is already seen > and wrote obsolete data. With locking the applier 2 will > wait until applier 1 has finished its write. > > We introduce the following helpers: > > 1) txn_limbo_process_begin: which takes a lock > 2) txn_limbo_process_commit and txn_limbo_process_rollback > which simply release the lock but have different names > for better semantics > 3) txn_limbo_process is a general function which uses x_begin > and x_commit helper internally > 4) txn_limbo_process_core to do a real job over processing the > request, it implies that txn_limbo_process_begin been called > > Part-of #6036 > > Signed-off-by: Cyrill Gorcunov > --- > src/box/applier.cc | 12 +++++++--- > src/box/box.cc | 15 ++++++++----- > src/box/txn_limbo.c | 17 +++++++++++++-- > src/box/txn_limbo.h | 53 ++++++++++++++++++++++++++++++++++++++++----- > 4 files changed, 80 insertions(+), 17 deletions(-) > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index b981bd436..46c36e259 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -857,7 +857,7 @@ apply_synchro_row_cb(struct journal_entry *entry) > applier_rollback_by_wal_io(entry->res); > } else { > replica_txn_wal_write_cb(synchro_entry->rcb); > - txn_limbo_process(&txn_limbo, synchro_entry->req); > + txn_limbo_process_core(&txn_limbo, synchro_entry->req); > trigger_run(&replicaset.applier.on_wal_write, NULL); > } > fiber_wakeup(synchro_entry->owner); > @@ -873,6 +873,8 @@ apply_synchro_row(uint32_t replica_id, struct xrow_header *row) > if (xrow_decode_synchro(row, &req) != 0) > goto err; > > + txn_limbo_process_begin(&txn_limbo); > + > struct replica_cb_data rcb_data; > struct synchro_entry entry; > /* > @@ -910,12 +912,16 @@ apply_synchro_row(uint32_t replica_id, struct xrow_header *row) > * transactions side, including the async ones. > */ > if (journal_write(&entry.base) != 0) > - goto err; > + goto err_rollback; > if (entry.base.res < 0) { > diag_set_journal_res(entry.base.res); > - goto err; > + goto err_rollback; > } > + txn_limbo_process_commit(&txn_limbo); > return 0; > + > +err_rollback: > + txn_limbo_process_rollback(&txn_limbo); > err: > diag_log(); > return -1; > diff --git a/src/box/box.cc b/src/box/box.cc > index e082e1a3d..6a9be745a 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -1677,8 +1677,6 @@ box_issue_promote(uint32_t prev_leader_id, int64_t promote_lsn) > struct raft *raft = box_raft(); > assert(raft->volatile_term == raft->term); > assert(promote_lsn >= 0); > - txn_limbo_write_promote(&txn_limbo, promote_lsn, > - raft->term); > struct synchro_request req = { > .type = IPROTO_RAFT_PROMOTE, > .replica_id = prev_leader_id, > @@ -1686,8 +1684,11 @@ box_issue_promote(uint32_t prev_leader_id, int64_t promote_lsn) > .lsn = promote_lsn, > .term = raft->term, > }; > - txn_limbo_process(&txn_limbo, &req); > + txn_limbo_process_begin(&txn_limbo); > + txn_limbo_write_promote(&txn_limbo, req.lsn, req.term); > + txn_limbo_process_core(&txn_limbo, &req); > assert(txn_limbo_is_empty(&txn_limbo)); > + txn_limbo_process_commit(&txn_limbo); > } > > /** A guard to block multiple simultaneous promote()/demote() invocations. */ > @@ -1699,8 +1700,6 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn) > { > assert(box_raft()->volatile_term == box_raft()->term); > assert(promote_lsn >= 0); > - txn_limbo_write_demote(&txn_limbo, promote_lsn, > - box_raft()->term); > struct synchro_request req = { > .type = IPROTO_RAFT_DEMOTE, > .replica_id = prev_leader_id, > @@ -1708,8 +1707,12 @@ box_issue_demote(uint32_t prev_leader_id, int64_t promote_lsn) > .lsn = promote_lsn, > .term = box_raft()->term, > }; > - txn_limbo_process(&txn_limbo, &req); > + txn_limbo_process_begin(&txn_limbo); > + txn_limbo_write_demote(&txn_limbo, promote_lsn, > + box_raft()->term); > + txn_limbo_process_core(&txn_limbo, &req); > assert(txn_limbo_is_empty(&txn_limbo)); > + txn_limbo_process_commit(&txn_limbo); > } > > int > diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c > index 70447caaf..855c98c98 100644 > --- a/src/box/txn_limbo.c > +++ b/src/box/txn_limbo.c > @@ -47,6 +47,7 @@ txn_limbo_create(struct txn_limbo *limbo) > vclock_create(&limbo->vclock); > vclock_create(&limbo->promote_term_map); > limbo->promote_greatest_term = 0; > + latch_create(&limbo->promote_latch); > limbo->confirmed_lsn = 0; > limbo->rollback_count = 0; > limbo->is_in_rollback = false; > @@ -724,11 +725,14 @@ txn_limbo_wait_empty(struct txn_limbo *limbo, double timeout) > } > > void > -txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req) > +txn_limbo_process_core(struct txn_limbo *limbo, > + const struct synchro_request *req) > { > + assert(latch_is_locked(&limbo->promote_latch)); > + > uint64_t term = req->term; > uint32_t origin = req->origin_id; > - if (txn_limbo_replica_term(limbo, origin) < term) { > + if (vclock_get(&limbo->promote_term_map, origin) < (int64_t)term) { > vclock_follow(&limbo->promote_term_map, origin, term); > if (term > limbo->promote_greatest_term) > limbo->promote_greatest_term = term; > @@ -786,6 +790,15 @@ txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req) > return; > } > > +void > +txn_limbo_process(struct txn_limbo *limbo, > + const struct synchro_request *req) > +{ > + txn_limbo_process_begin(limbo); > + txn_limbo_process_core(limbo, req); > + txn_limbo_process_commit(limbo); > +} > + > void > txn_limbo_on_parameters_change(struct txn_limbo *limbo) > { > diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h > index 53e52f676..fdb214711 100644 > --- a/src/box/txn_limbo.h > +++ b/src/box/txn_limbo.h > @@ -31,6 +31,7 @@ > */ > #include "small/rlist.h" > #include "vclock/vclock.h" > +#include "latch.h" > > #include > > @@ -147,6 +148,10 @@ struct txn_limbo { > * limbo and raft are in sync and the terms are the same. > */ > uint64_t promote_greatest_term; > + /** > + * To order access to the promote data. > + */ > + struct latch promote_latch; > /** > * Maximal LSN gathered quorum and either already confirmed in WAL, or > * whose confirmation is in progress right now. Any attempt to confirm > @@ -216,9 +221,12 @@ txn_limbo_last_entry(struct txn_limbo *limbo) > * @a replica_id. > */ > static inline uint64_t > -txn_limbo_replica_term(const struct txn_limbo *limbo, uint32_t replica_id) > +txn_limbo_replica_term(struct txn_limbo *limbo, uint32_t replica_id) > { > - return vclock_get(&limbo->promote_term_map, replica_id); > + latch_lock(&limbo->promote_latch); > + uint64_t v = vclock_get(&limbo->promote_term_map, replica_id); > + latch_unlock(&limbo->promote_latch); > + return v; > } > > /** > @@ -226,11 +234,14 @@ txn_limbo_replica_term(const struct txn_limbo *limbo, uint32_t replica_id) > * data from it. The check is only valid when elections are enabled. > */ > static inline bool > -txn_limbo_is_replica_outdated(const struct txn_limbo *limbo, > +txn_limbo_is_replica_outdated(struct txn_limbo *limbo, > uint32_t replica_id) > { > - return txn_limbo_replica_term(limbo, replica_id) < > - limbo->promote_greatest_term; > + latch_lock(&limbo->promote_latch); > + uint64_t v = vclock_get(&limbo->promote_term_map, replica_id); > + bool res = v < limbo->promote_greatest_term; > + latch_unlock(&limbo->promote_latch); > + return res; > } > > /** > @@ -300,7 +311,37 @@ txn_limbo_ack(struct txn_limbo *limbo, uint32_t replica_id, int64_t lsn); > int > txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry); > > -/** Execute a synchronous replication request. */ > +/** > + * Initiate execution of a synchronous replication request. > + */ > +static inline void > +txn_limbo_process_begin(struct txn_limbo *limbo) > +{ > + latch_lock(&limbo->promote_latch); > +} > + > +/** Commit a synchronous replication request. */ > +static inline void > +txn_limbo_process_commit(struct txn_limbo *limbo) > +{ > + assert(latch_is_locked(&limbo->promote_latch)); > + latch_unlock(&limbo->promote_latch); > +} > + > +/** Rollback a synchronous replication request. */ > +static inline void > +txn_limbo_process_rollback(struct txn_limbo *limbo) > +{ > + assert(latch_is_locked(&limbo->promote_latch)); > + latch_unlock(&limbo->promote_latch); > +} > + > +/** Core of processing synchronous replication request. */ > +void > +txn_limbo_process_core(struct txn_limbo *limbo, > + const struct synchro_request *req); > + > +/** Process a synchronous replication request. */ > void > txn_limbo_process(struct txn_limbo *limbo, const struct synchro_request *req); Thanks for the patch! Mostly ok with one question: What about txn_limbo_write_confirm/txn_limbo_read_confirm pairs issued inside txn_limbo_ack() and txn_limbo_on_parameters_change() ? Shouldn't they take the latch as well? I mean, txn_limbo_ack() and txn_limbo_on_parameters_change() as a whole. > -- Serge Petrenko