To: Serge Petrenko, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Message-ID: <2e081c11-ed0c-506a-af8a-57ef6707f7a9@tarantool.org>
Date: Fri, 26 Mar 2021 21:47:17 +0100
Subject: Re: [Tarantool-patches] [PATCH v2 3/7] applier: extract plain tx application from applier_apply_tx()
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy

Thanks for the patch!

See 4 comments below.

On 24.03.2021 13:24, Serge Petrenko wrote:
> The new routine, called apply_plain_tx(), may be used not only by
> applier_apply_tx(), but also by final join, once we make it
> transactional, and recovery, once it's also turned transactional.
>
> Also, while we're at it. Remove excess fiber_gc() call from
> applier_subscribe loop. Let's better make sure fiber_gc() is called on
> any return from applier_apply_tx().
>
> Prerequisite #5874
> Part of #5566
> ---
>  src/box/applier.cc | 188 ++++++++++++++++++++++-----------------------
>  1 file changed, 93 insertions(+), 95 deletions(-)
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 65afa5e98..07e557a51 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -905,6 +905,90 @@ applier_handle_raft(struct applier *applier, struct xrow_header *row)
>  	return box_raft_process(&req, applier->instance_id);
>  }
>
> +static inline int
> +apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
> +{
> +	/**

1. Inside functions, we start a comment's first line with /*, not /**.

> +	 * Explicitly begin the transaction so that we can
> +	 * control fiber->gc life cycle and, in case of apply
> +	 * conflict safely access failed xrow object and allocate
> +	 * IPROTO_NOP on gc.
> +	 */
> +	struct txn *txn = txn_begin();
> +	struct applier_tx_row *item;
> +	if (txn == NULL)
> +		return -1;
> +
> +	stailq_foreach_entry(item, rows, next) {
> +		struct xrow_header *row = &item->row;
> +		int res = apply_row(row);
> +		if (res != 0 && skip_conflict) {
> +			struct error *e = diag_last_error(diag_get());
> +			/*
> +			 * In case of ER_TUPLE_FOUND error and enabled
> +			 * replication_skip_conflict configuration
> +			 * option, skip applying the foreign row and
> +			 * replace it with NOP in the local write ahead
> +			 * log.
> +			 */
> +			if (e->type == &type_ClientError &&
> +			    box_error_code(e) == ER_TUPLE_FOUND &&
> +			    replication_skip_conflict) {

2. That looks kind of confusing - you pass skip_conflict option but
also use replication_skip_conflict. You could calculate skip_conflict
based on replication_skip_conflict in your patch.

> +				diag_clear(diag_get());
> +				row->type = IPROTO_NOP;
> +				row->bodycnt = 0;
> +				res = apply_row(row);
> +			}
> +		}
> +		if (res != 0)
> +			goto fail;
> +	}
> +
> +	/*
> +	 * We are going to commit so it's a high time to check if
> +	 * the current transaction has non-local effects.
> +	 */
> +	if (txn_is_distributed(txn)) {
> +		/*
> +		 * A transaction mixes remote and local rows.
> +		 * Local rows must be replicated back, which
> +		 * doesn't make sense since the master likely has
> +		 * new changes which local rows may overwrite.
> +		 * Raise an error.
> +		 */
> +		diag_set(ClientError, ER_UNSUPPORTED, "Replication",
> +			 "distributed transactions");
> +		goto fail;
> +	}
> +
> +	if (use_triggers) {
> +		/* We are ready to submit txn to wal. */
> +		struct trigger *on_rollback, *on_wal_write;
> +		size_t size;
> +		on_rollback = region_alloc_object(&txn->region, typeof(*on_rollback),
> +						  &size);
> +		on_wal_write = region_alloc_object(&txn->region, typeof(*on_wal_write),
> +						   &size);
> +		if (on_rollback == NULL || on_wal_write == NULL) {
> +			diag_set(OutOfMemory, size, "region_alloc_object",
> +				 "on_rollback/on_wal_write");
> +			goto fail;
> +		}
> +
> +		trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
> +		txn_on_rollback(txn, on_rollback);
> +
> +		trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
> +		txn_on_wal_write(txn, on_wal_write);
> +	}
> +
> +	return txn_commit_try_async(txn);
> +fail:
> +	txn_rollback(txn);
> +	return -1;
> +}
> @@ -974,103 +1058,18 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
>  		assert(first_row == last_row);
>  		if (apply_synchro_row(first_row) != 0)
>  			diag_raise();

3. Hm. Isn't it a bug that we raise an error here, but don't unlock
the latch and don't call fiber_gc()? Looks like a separate bug. Could
you fix it please, and probably with a test? Can it be related to the
hang you fix in the previous commit?

> -		goto success;
> -	}
> -
> -	/**
> -	 * Explicitly begin the transaction so that we can
> -	 * control fiber->gc life cycle and, in case of apply
> -	 * conflict safely access failed xrow object and allocate
> -	 * IPROTO_NOP on gc.
> -	 */
> -	struct txn *txn;
> -	txn = txn_begin();
> -	struct applier_tx_row *item;
> -	if (txn == NULL) {
> -		latch_unlock(latch);
> -		return -1;
> -	}
> -	stailq_foreach_entry(item, rows, next) {
> -		struct xrow_header *row = &item->row;
> -		int res = apply_row(row);
> -		if (res != 0) {
> -			struct error *e = diag_last_error(diag_get());
> -			/*
> -			 * In case of ER_TUPLE_FOUND error and enabled
> -			 * replication_skip_conflict configuration
> -			 * option, skip applying the foreign row and
> -			 * replace it with NOP in the local write ahead
> -			 * log.
> -			 */
> -			if (e->type == &type_ClientError &&
> -			    box_error_code(e) == ER_TUPLE_FOUND &&
> -			    replication_skip_conflict) {
> -				diag_clear(diag_get());
> -				row->type = IPROTO_NOP;
> -				row->bodycnt = 0;
> -				res = apply_row(row);
> -			}
> -		}
> -		if (res != 0)
> -			goto rollback;
> -	}
> -	/*
> -	 * We are going to commit so it's a high time to check if
> -	 * the current transaction has non-local effects.
> -	 */
> -	if (txn_is_distributed(txn)) {
> -		/*
> -		 * A transaction mixes remote and local rows.
> -		 * Local rows must be replicated back, which
> -		 * doesn't make sense since the master likely has
> -		 * new changes which local rows may overwrite.
> -		 * Raise an error.
> -		 */
> -		diag_set(ClientError, ER_UNSUPPORTED,
> -			 "Replication", "distributed transactions");
> -		goto rollback;
> +		goto written;
>  	}
>
> -	/* We are ready to submit txn to wal. */
> -	struct trigger *on_rollback, *on_wal_write;
> -	size_t size;
> -	on_rollback = region_alloc_object(&txn->region, typeof(*on_rollback),
> -					  &size);
> -	on_wal_write = region_alloc_object(&txn->region, typeof(*on_wal_write),
> -					   &size);
> -	if (on_rollback == NULL || on_wal_write == NULL) {
> -		diag_set(OutOfMemory, size, "region_alloc_object",
> -			 "on_rollback/on_wal_write");
> -		goto rollback;
> +	if ((rc = apply_plain_tx(rows, true, true)) == 0) {
> +written:
> +		vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> +			      last_row->lsn);
>  	}
> -
> -	trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
> -	txn_on_rollback(txn, on_rollback);
> -
> -	trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
> -	txn_on_wal_write(txn, on_wal_write);
> -
> -	if (txn_commit_try_async(txn) < 0)
> -		goto fail;
> -
> -success:
> -	/*
> -	 * The transaction was sent to journal so promote vclock.
> -	 *
> -	 * Use the lsn of the last row to guard from 1.10
> -	 * instances, which send every single tx row as a separate
> -	 * transaction.
> -	 */
> -	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> -		      last_row->lsn);
> -	latch_unlock(latch);
> -	return 0;
> -rollback:
> -	txn_rollback(txn);
> -fail:
> +no_write:

4. You go to this label even when the write was done. Maybe rename it
to 'end' or 'finish'? Consider this diff:

====================
@@ -1027,7 +1027,7 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
 	latch_lock(latch);
 	if (vclock_get(&replicaset.applier.vclock,
 		       last_row->replica_id) >= last_row->lsn) {
-		goto no_write;
+		goto finish;
 	} else if (vclock_get(&replicaset.applier.vclock,
 			      first_row->replica_id) >= first_row->lsn) {
 		/*
@@ -1058,15 +1058,12 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
 		assert(first_row == last_row);
 		if (apply_synchro_row(first_row) != 0)
 			diag_raise();
-		goto written;
+	} else if ((rc = apply_plain_tx(rows, true, true)) != 0) {
+		goto finish;
 	}
-
-	if ((rc = apply_plain_tx(rows, true, true)) == 0) {
-written:
-		vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
-			      last_row->lsn);
-	}
-no_write:
+	vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
+		      last_row->lsn);
+finish:
 	latch_unlock(latch);
 	fiber_gc();
 	return rc;
====================

>  	latch_unlock(latch);
>  	fiber_gc();
> -	return -1;
> +	return rc;
>  }
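
To make comments 2-4 more concrete, here is a minimal standalone sketch of the shape I have in mind. It is not Tarantool code: every demo_* name is a hypothetical stand-in, the latch is modelled by a counter, and the synchro branch returns an error code instead of calling diag_raise(), which is my assumption about how the leak from comment 3 could be closed. The plain-tx helper takes the conflict policy only through its argument, and every path leaves through a single 'finish' label.

```c
#include <assert.h>
#include <stdbool.h>

/* Everything below is a hypothetical stand-in, not Tarantool code. */
struct demo_state {
	int lock_depth;	/* > 0 while the "latch" is held */
	int gc_calls;	/* how many times the cleanup ran */
	long vclock;	/* lsn of the last applied transaction */
};

enum demo_tx_kind { DEMO_TX_SYNCHRO, DEMO_TX_PLAIN };

/* Models the replication_skip_conflict configuration option. */
static bool demo_skip_conflict_option = false;

/* Stand-in for the synchro row path; `fail` forces an error. */
static int
demo_apply_synchro(bool fail)
{
	return fail ? -1 : 0;
}

/* The argument is the only conflict policy input; no global is read. */
static int
demo_apply_plain(bool conflict, bool skip_conflict)
{
	if (conflict && !skip_conflict)
		return -1;
	return 0;	/* applied, or replaced with a no-op */
}

/*
 * Single-exit variant: the early return, both failure paths and the
 * success path all reach `finish`, so the unlock and the cleanup
 * always run.
 */
static int
demo_apply_tx(struct demo_state *s, enum demo_tx_kind kind, long lsn,
	      bool bad)
{
	int rc = 0;
	s->lock_depth++;
	if (s->vclock >= lsn)
		goto finish;	/* already applied, nothing to write */
	if (kind == DEMO_TX_SYNCHRO)
		rc = demo_apply_synchro(bad);
	else
		rc = demo_apply_plain(bad, demo_skip_conflict_option);
	if (rc != 0)
		goto finish;
	s->vclock = lsn;	/* promote only after a successful apply */
finish:
	s->lock_depth--;
	s->gc_calls++;
	return rc;
}
```

With this shape a failed synchro row leaves lock_depth at zero and still bumps gc_calls, and the caller decides the conflict policy once, where it reads the option.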