From: Serge Petrenko via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 3/7] applier: extract plain tx application from applier_apply_tx()
Date: Sat, 27 Mar 2021 20:34:36 +0300
Message-ID: <0fee8e8f-45b1-10d5-56fb-1c860b8bc514@tarantool.org>
In-Reply-To: <2e081c11-ed0c-506a-af8a-57ef6707f7a9@tarantool.org>

26.03.2021 23:47, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 4 comments below.

Thanks for the review!

>
> On 24.03.2021 13:24, Serge Petrenko wrote:
>> The new routine, called apply_plain_tx(), may be used not only by
>> applier_apply_tx(), but also by final join, once we make it
>> transactional, and recovery, once it's also turned transactional.
>>
>> Also, while we're at it. Remove excess fiber_gc() call from
>> applier_subscribe loop. Let's better make sure fiber_gc() is called on
>> any return from applier_apply_tx().
>>
>> Prerequisite #5874
>> Part of #5566
>> ---
>>  src/box/applier.cc | 188 ++++++++++++++++++++++-----------------------
>>  1 file changed, 93 insertions(+), 95 deletions(-)
>>
>> diff --git a/src/box/applier.cc b/src/box/applier.cc
>> index 65afa5e98..07e557a51 100644
>> --- a/src/box/applier.cc
>> +++ b/src/box/applier.cc
>> @@ -905,6 +905,90 @@ applier_handle_raft(struct applier *applier, struct xrow_header *row)
>>          return box_raft_process(&req, applier->instance_id);
>>  }
>>
>> +static inline int
>> +apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
>> +{
>> +        /**
> 1. Inside of functions for comment first line we use /*, not /**.

Sure, fixed.

>
>> +         * Explicitly begin the transaction so that we can
>> +         * control fiber->gc life cycle and, in case of apply
>> +         * conflict safely access failed xrow object and allocate
>> +         * IPROTO_NOP on gc.
>> +         */
>> +        struct txn *txn = txn_begin();
>> +        struct applier_tx_row *item;
>> +        if (txn == NULL)
>> +                return -1;
>> +
>> +        stailq_foreach_entry(item, rows, next) {
>> +                struct xrow_header *row = &item->row;
>> +                int res = apply_row(row);
>> +                if (res != 0 && skip_conflict) {
>> +                        struct error *e = diag_last_error(diag_get());
>> +                        /*
>> +                         * In case of ER_TUPLE_FOUND error and enabled
>> +                         * replication_skip_conflict configuration
>> +                         * option, skip applying the foreign row and
>> +                         * replace it with NOP in the local write ahead
>> +                         * log.
>> +                         */
>> +                        if (e->type == &type_ClientError &&
>> +                            box_error_code(e) == ER_TUPLE_FOUND &&
>> +                            replication_skip_conflict) {
> 2. That looks kind of confusing - you pass skip_conflict option but
> also use replication_skip_conflict. You could calculate skip_conflict
> based on replication_skip_conflict in your patch.

Yes, indeed. Thanks for noticing!

>
>> +                                diag_clear(diag_get());
>> +                                row->type = IPROTO_NOP;
>> +                                row->bodycnt = 0;
>> +                                res = apply_row(row);
>> +                        }
>> +                }
>> +                if (res != 0)
>> +                        goto fail;
>> +        }
>> +
>> +        /*
>> +         * We are going to commit so it's a high time to check if
>> +         * the current transaction has non-local effects.
>> +         */
>> +        if (txn_is_distributed(txn)) {
>> +                /*
>> +                 * A transaction mixes remote and local rows.
>> +                 * Local rows must be replicated back, which
>> +                 * doesn't make sense since the master likely has
>> +                 * new changes which local rows may overwrite.
>> +                 * Raise an error.
>> +                 */
>> +                diag_set(ClientError, ER_UNSUPPORTED, "Replication",
>> +                         "distributed transactions");
>> +                goto fail;
>> +        }
>> +
>> +        if (use_triggers) {
>> +                /* We are ready to submit txn to wal. */
>> +                struct trigger *on_rollback, *on_wal_write;
>> +                size_t size;
>> +                on_rollback = region_alloc_object(&txn->region, typeof(*on_rollback),
>> +                                                  &size);
>> +                on_wal_write = region_alloc_object(&txn->region, typeof(*on_wal_write),
>> +                                                   &size);
>> +                if (on_rollback == NULL || on_wal_write == NULL) {
>> +                        diag_set(OutOfMemory, size, "region_alloc_object",
>> +                                 "on_rollback/on_wal_write");
>> +                        goto fail;
>> +                }
>> +
>> +                trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
>> +                txn_on_rollback(txn, on_rollback);
>> +
>> +                trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
>> +                txn_on_wal_write(txn, on_wal_write);
>> +        }
>> +
>> +        return txn_commit_try_async(txn);
>> +fail:
>> +        txn_rollback(txn);
>> +        return -1;
>> +}
>> @@ -974,103 +1058,18 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
>>                  assert(first_row == last_row);
>>                  if (apply_synchro_row(first_row) != 0)
>>                          diag_raise();
> 3. Hm. Isn't it a bug that we raise an error here, but don't unlock the
> latch and don't call fiber_gc()? Looks like a separate bug. Could you
> fix it please, and probably with a test? Can it be related to the
> hang you fix in the previous commit?

It is a bug, yes. Will fix in a commit on top.
It's not related to the hang we spoke of in the previous letter though.

>
>> -                goto success;
>> -        }
>> -
>> -        /**
>> -         * Explicitly begin the transaction so that we can
>> -         * control fiber->gc life cycle and, in case of apply
>> -         * conflict safely access failed xrow object and allocate
>> -         * IPROTO_NOP on gc.
>> -         */
>> -        struct txn *txn;
>> -        txn = txn_begin();
>> -        struct applier_tx_row *item;
>> -        if (txn == NULL) {
>> -                latch_unlock(latch);
>> -                return -1;
>> -        }
>> -        stailq_foreach_entry(item, rows, next) {
>> -                struct xrow_header *row = &item->row;
>> -                int res = apply_row(row);
>> -                if (res != 0) {
>> -                        struct error *e = diag_last_error(diag_get());
>> -                        /*
>> -                         * In case of ER_TUPLE_FOUND error and enabled
>> -                         * replication_skip_conflict configuration
>> -                         * option, skip applying the foreign row and
>> -                         * replace it with NOP in the local write ahead
>> -                         * log.
>> -                         */
>> -                        if (e->type == &type_ClientError &&
>> -                            box_error_code(e) == ER_TUPLE_FOUND &&
>> -                            replication_skip_conflict) {
>> -                                diag_clear(diag_get());
>> -                                row->type = IPROTO_NOP;
>> -                                row->bodycnt = 0;
>> -                                res = apply_row(row);
>> -                        }
>> -                }
>> -                if (res != 0)
>> -                        goto rollback;
>> -        }
>> -        /*
>> -         * We are going to commit so it's a high time to check if
>> -         * the current transaction has non-local effects.
>> -         */
>> -        if (txn_is_distributed(txn)) {
>> -                /*
>> -                 * A transaction mixes remote and local rows.
>> -                 * Local rows must be replicated back, which
>> -                 * doesn't make sense since the master likely has
>> -                 * new changes which local rows may overwrite.
>> -                 * Raise an error.
>> -                 */
>> -                diag_set(ClientError, ER_UNSUPPORTED,
>> -                         "Replication", "distributed transactions");
>> -                goto rollback;
>> +                goto written;
>>          }
>>
>> -        /* We are ready to submit txn to wal. */
>> -        struct trigger *on_rollback, *on_wal_write;
>> -        size_t size;
>> -        on_rollback = region_alloc_object(&txn->region, typeof(*on_rollback),
>> -                                          &size);
>> -        on_wal_write = region_alloc_object(&txn->region, typeof(*on_wal_write),
>> -                                           &size);
>> -        if (on_rollback == NULL || on_wal_write == NULL) {
>> -                diag_set(OutOfMemory, size, "region_alloc_object",
>> -                         "on_rollback/on_wal_write");
>> -                goto rollback;
>> +        if ((rc = apply_plain_tx(rows, true, true)) == 0) {
>> +written:
>> +                vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
>> +                              last_row->lsn);
>>          }
>> -
>> -        trigger_create(on_rollback, applier_txn_rollback_cb, NULL, NULL);
>> -        txn_on_rollback(txn, on_rollback);
>> -
>> -        trigger_create(on_wal_write, applier_txn_wal_write_cb, NULL, NULL);
>> -        txn_on_wal_write(txn, on_wal_write);
>> -
>> -        if (txn_commit_try_async(txn) < 0)
>> -                goto fail;
>> -
>> -success:
>> -        /*
>> -         * The transaction was sent to journal so promote vclock.
>> -         *
>> -         * Use the lsn of the last row to guard from 1.10
>> -         * instances, which send every single tx row as a separate
>> -         * transaction.
>> -         */
>> -        vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
>> -                      last_row->lsn);
>> -        latch_unlock(latch);
>> -        return 0;
>> -rollback:
>> -        txn_rollback(txn);
>> -fail:
>> +no_write:
> 4. You go to this label even when write was done. Maybe rename to
> 'end' or 'finish'?
>
> Consider this diff:
>
> ====================
> @@ -1027,7 +1027,7 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
>          latch_lock(latch);
>          if (vclock_get(&replicaset.applier.vclock,
>                         last_row->replica_id) >= last_row->lsn) {
> -                goto no_write;
> +                goto finish;
>          } else if (vclock_get(&replicaset.applier.vclock,
>                                first_row->replica_id) >= first_row->lsn) {
>                  /*
> @@ -1058,15 +1058,12 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
>                  assert(first_row == last_row);
>                  if (apply_synchro_row(first_row) != 0)
>                          diag_raise();
> -                goto written;
> +        } else if ((rc = apply_plain_tx(rows, true, true)) != 0) {
> +                goto finish;
>          }
> -
> -        if ((rc = apply_plain_tx(rows, true, true)) == 0) {
> -written:
> -                vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> -                              last_row->lsn);
> -        }
> -no_write:
> +        vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
> +                      last_row->lsn);
> +finish:
>          latch_unlock(latch);
>          fiber_gc();
>          return rc;
> ====================

Looks good, applied.
Incremental diff below.

========================================
diff --git a/src/box/applier.cc b/src/box/applier.cc
index f396e43a8..e6d9673dd 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -908,7 +908,7 @@ applier_handle_raft(struct applier *applier, struct xrow_header *row)
 static inline int
 apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
 {
-        /**
+        /*
          * Explicitly begin the transaction so that we can
          * control fiber->gc life cycle and, in case of apply
          * conflict safely access failed xrow object and allocate
@@ -932,8 +932,7 @@ apply_plain_tx(struct stailq *rows, bool skip_conflict, bool use_triggers)
                          * log.
                          */
                         if (e->type == &type_ClientError &&
-                            box_error_code(e) == ER_TUPLE_FOUND &&
-                            replication_skip_conflict) {
+                            box_error_code(e) == ER_TUPLE_FOUND) {
                                 diag_clear(diag_get());
                                 row->type = IPROTO_NOP;
                                 row->bodycnt = 0;
@@ -1027,7 +1026,7 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
         latch_lock(latch);
         if (vclock_get(&replicaset.applier.vclock,
                        last_row->replica_id) >= last_row->lsn) {
-                goto no_write;
+                goto finish;
         } else if (vclock_get(&replicaset.applier.vclock,
                               first_row->replica_id) >= first_row->lsn) {
                 /*
@@ -1058,15 +1057,13 @@ applier_apply_tx(struct applier *applier, struct stailq *rows)
                 assert(first_row == last_row);
                 if (apply_synchro_row(first_row) != 0)
                         diag_raise();
-                goto written;
-        }
-
-        if ((rc = apply_plain_tx(rows, true, true)) == 0) {
-written:
-                vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
-                              last_row->lsn);
+        } else if ((rc = apply_plain_tx(rows, replication_skip_conflict,
+                                        true)) != 0) {
+                goto finish;
         }
-no_write:
+        vclock_follow(&replicaset.applier.vclock, last_row->replica_id,
+                      last_row->lsn);
+finish:
         latch_unlock(latch);
         fiber_gc();
         return rc;
========================================

>
>>          latch_unlock(latch);
>>          fiber_gc();
>> -        return -1;
>> +        return rc;
>>  }

-- 
Serge Petrenko
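
For anyone reading comment 4 without the full file at hand, the single-exit
shape discussed there can be sketched in isolation. This is only an
illustration under stated assumptions: struct demo_state, demo_apply() and
demo_apply_tx() are hypothetical stand-ins and not Tarantool code. A bool
models the latch, a counter models fiber_gc() calls, and a single integer
models the vclock component of one replica.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the latch, fiber gc and vclock state. */
struct demo_state {
	bool locked;		/* models latch_lock()/latch_unlock() */
	int gc_calls;		/* counts the modelled fiber_gc() calls */
	int64_t applied_lsn;	/* models one vclock component */
};

/* Hypothetical apply step: fails only when the caller asks it to. */
static int
demo_apply(bool should_fail)
{
	return should_fail ? -1 : 0;
}

/*
 * Every path leaves through 'finish', so the lock is always released
 * and gc always runs, whether the tx was skipped, failed or applied.
 */
static int
demo_apply_tx(struct demo_state *s, int64_t last_lsn, bool should_fail)
{
	int rc = 0;
	s->locked = true;
	if (s->applied_lsn >= last_lsn) {
		/* Already applied: nothing to write, rc stays 0. */
		goto finish;
	} else if ((rc = demo_apply(should_fail)) != 0) {
		/* Apply failed: do not advance the modelled vclock. */
		goto finish;
	}
	/* Reached only after a successful apply. */
	s->applied_lsn = last_lsn;
finish:
	s->locked = false;
	s->gc_calls++;
	return rc;
}
```

The label name matters here for the same reason as in the review: control
reaches it after a successful write as well, so a neutral name such as
'finish' describes it better than 'no_write'.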