From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Thu, 13 Jun 2019 17:11:02 +0300 From: Vladimir Davydov Subject: Re: [tarantool-patches] [PATCH v3 05/14] txn: get rid of autocommit from a txn structure Message-ID: <20190613141102.dw4odqbse6cwawc4@esperanza> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: To: Georgy Kirichenko Cc: tarantool-patches@freelists.org List-ID: On Sun, Jun 09, 2019 at 11:44:34PM +0300, Georgy Kirichenko wrote: > Move transaction auto start and auto commit behavior to the box level. > From now a transaction won't start and commit automatically without > txn_begin/txn_commit invocations. This is a part of a bigger transaction > refactoring in order to implement detachable transactions and a parallel > applier. TBO I don't understand why you need to move autocommit logic out of txn. The applier code already calls txn_begin/txn_commit explicitly and AFAIU this is the only place you need to care about. The code looks less readable and more bulky after this change IMHO. Let's discuss f2f. See a few comments inline. > > Prerequisites: #1254 > --- > src/box/applier.cc | 35 +++++++++++++--- > src/box/box.cc | 94 +++++++++++++++++++++++++++++++++--------- > src/box/index.cc | 10 ++--- > src/box/memtx_engine.c | 10 ++++- > src/box/memtx_space.c | 8 ++-- > src/box/sql.c | 2 +- > src/box/txn.c | 46 +++++++-------------- > src/box/txn.h | 16 +++---- > src/box/vinyl.c | 12 ++---- > src/box/vy_scheduler.c | 6 +-- > 10 files changed, 148 insertions(+), 91 deletions(-) > > diff --git a/src/box/applier.cc b/src/box/applier.cc > index 373e1feb9..e3203a4c8 100644 > --- a/src/box/applier.cc > +++ b/src/box/applier.cc > @@ -172,11 +172,26 @@ applier_writer_f(va_list ap) > static int > apply_initial_join_row(struct xrow_header *row) > { > + struct txn *txn = txn_begin(); > + if (txn == NULL) > + return -1; > struct request request; > xrow_decode_dml(row, &request, dml_request_key_map(row->type)); > - struct space *space = space_cache_find_xc(request.space_id); > + struct space *space = space_cache_find(request.space_id); > + if (space == NULL) > + goto rollback; > /* no access checks here - applier always works with admin privs */ > - return space_apply_initial_join_row(space, &request); > + if (space_apply_initial_join_row(space, &request)) > + goto rollback; > + int rc; > + rc = txn_commit(txn); > + if (rc < 0) > + return -1; > + fiber_gc(); fiber_gc() must be called after txn_commit() while txn_rollback() calls it by itself. This is a mess. As a matter of fact, I don't understand why we need to call this function at all now, when we have txn->region which is used to allocate all txn stuff. AFAIU fiber->gc is only used for allocating temporary objects, in which case the caller is supposed to clean up by himself (using region_truncate). That said, I don't think we need to call fiber_gc() after txn_commit() at all. There's one exception though - it's vdbe_add_new_autoinc_id, which stores autogenerated ids on fiber->gc and doesn't delete them until the corresponding SQL request is complete (this is why all this mess with fiber_gc() started AFAIR). However, I think we could and should use a mempool for such allocations. I think that before pushing this patch, we should clean up fiber_gc() usage, otherwise it's really easy to loose track where we must and where we mustn't call fiber_gc() now. Actually, I think we should do that irrespective of whether we commit this patch or not. > @@ -403,8 +418,16 @@ applier_join(struct applier *applier) > applier->last_row_time = ev_monotonic_now(loop()); > if (iproto_type_is_dml(row.type)) { > vclock_follow_xrow(&replicaset.vclock, &row); > - if (apply_row(&row) != 0) > + struct txn *txn = txn_begin(); > + if (txn == NULL) > + diag_raise(); > + if (apply_row(&row) != 0) { > + txn_rollback(); > + diag_raise(); > + } > + if (txn_commit(txn) != 0) > diag_raise(); > + fiber_gc(); I'd wrap this into a helper function (apply_final_join_row). > diff --git a/src/box/box.cc b/src/box/box.cc > index 57419ee01..7f23716e5 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -169,34 +169,62 @@ int > box_process_rw(struct request *request, struct space *space, > struct tuple **result) > { > + struct tuple *tuple = NULL; > + struct txn *txn = in_txn(); > + bool is_autocommit = txn == NULL; > + if (is_autocommit && (txn = txn_begin()) == NULL) > + return -1; > assert(iproto_type_is_dml(request->type)); > rmean_collect(rmean_box, request->type, 1); > if (access_check_space(space, PRIV_W) != 0) > - return -1; > - struct txn *txn = txn_begin_stmt(space); > - if (txn == NULL) > - return -1; > - struct tuple *tuple; > + goto fail; > + if (txn_begin_stmt(txn, space) == NULL) > + goto fail; > if (space_execute_dml(space, txn, request, &tuple) != 0) { > - txn_rollback_stmt(); > - return -1; > + txn_rollback_stmt(txn); > + goto fail; > + } > + if (result != NULL) > + *result = tuple; > + > + if (result == NULL || tuple == NULL) { > + if (txn_commit_stmt(txn, request) != 0) > + goto fail; > + if (is_autocommit) { > + if (txn_commit(txn) != 0) > + return -1; > + fiber_gc(); > + } > + return 0; > } > - if (result == NULL) > - return txn_commit_stmt(txn, request); > - *result = tuple; > - if (tuple == NULL) > - return txn_commit_stmt(txn, request); > /* > * Pin the tuple locally before the commit, > * otherwise it may go away during yield in > * when WAL is written in autocommit mode. > */ > tuple_ref(tuple); > - int rc = txn_commit_stmt(txn, request); > - if (rc == 0) > - tuple_bless(tuple); > + > + if (txn_commit_stmt(txn, request)) { > + /* Unref tuple and rollback if autocommit. */ > + tuple_unref(tuple); > + goto fail; > + } > + if (is_autocommit) { > + if (txn_commit(txn) != 0) { > + /* Unref tuple and exit. */ > + tuple_unref(tuple); > + return -1; > + } > + fiber_gc(); > + } > + tuple_bless(tuple); This looks like a lot of copy-n-paste. Please think on how to get rid of code duplication. > @@ -299,10 +327,20 @@ apply_wal_row(struct xstream *stream, struct xrow_header *row) > xrow_decode_dml_xc(row, &request, dml_request_key_map(row->type)); > if (request.type != IPROTO_NOP) { > struct space *space = space_cache_find_xc(request.space_id); > + struct txn *txn = txn_begin(); > + if (txn == NULL) { > + say_error("error applying row: %s", request_str(&request)); > + diag_raise(); > + } Why? box_process_rw can handle autocommit statements. > if (box_process_rw(&request, space, NULL) != 0) { > say_error("error applying row: %s", request_str(&request)); > + txn_rollback(); > diag_raise(); > } > + if (txn_commit(txn) != 0) { > + diag_raise(); > + } > + fiber_gc(); > } > struct wal_stream *xstream = > container_of(stream, struct wal_stream, base); > @@ -1313,9 +1351,17 @@ box_sequence_reset(uint32_t seq_id) > static inline void > box_register_replica(uint32_t id, const struct tt_uuid *uuid) > { > + struct txn *txn = txn_begin(); > + if (txn == NULL) > + diag_raise(); > if (boxk(IPROTO_INSERT, BOX_CLUSTER_ID, "[%u%s]", Again, boxk call box_process_rw, which handles autocommit. No need to call txn_begin/txn_commit here explicitly. > @@ -1636,10 +1682,18 @@ box_set_replicaset_uuid(const struct tt_uuid *replicaset_uuid) > uu = *replicaset_uuid; > else > tt_uuid_create(&uu); > + struct txn *txn = txn_begin(); > + if (txn == NULL) > + diag_raise(); > /* Save replica set UUID in _schema */ > if (boxk(IPROTO_INSERT, BOX_SCHEMA_ID, "[%s%s]", "cluster", Same here. > diff --git a/src/box/index.cc b/src/box/index.cc > index 4a444e5d0..7f26c9bc2 100644 > --- a/src/box/index.cc > +++ b/src/box/index.cc > @@ -274,7 +274,7 @@ box_index_min(uint32_t space_id, uint32_t index_id, const char *key, > if (txn_begin_ro_stmt(space, &txn) != 0) > return -1; > if (index_min(index, key, part_count, result) != 0) { > - txn_rollback_stmt(); > + txn_rollback_stmt(txn); In general I like that txn_rollback_stmt and txn_rollback now take txn. This makes the API more symmetric. Probably, worth doing in a separate patch, even if we realize we don't need this one. As a matter of fact, doing this in a separate patch would make review easier. I'd appreciate if you could split this patch. > diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c > index f4312484a..149215b87 100644 > --- a/src/box/memtx_engine.c > +++ b/src/box/memtx_engine.c > @@ -272,16 +272,22 @@ memtx_engine_recover_snapshot_row(struct memtx_engine *memtx, > diag_set(ClientError, ER_CROSS_ENGINE_TRANSACTION); > return -1; > } > + struct txn *txn = txn_begin(); > + if (txn == NULL) > + return -1; > /* no access checks here - applier always works with admin privs */ > - if (space_apply_initial_join_row(space, &request) != 0) > + if (space_apply_initial_join_row(space, &request) != 0) { > + txn_rollback(); > return -1; > + } > + int rc = txn_commit(txn); This is the second place where you surround apply_initial_join_row with txn_begin/txn_commit. May be, worth doing it in the callback itself? Note, we only need this in case of memtx - vinyl doesn't create a txn object for initial recover or join. > diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c > index 1d209033c..1e630783b 100644 > --- a/src/box/memtx_space.c > +++ b/src/box/memtx_space.c > @@ -310,10 +310,10 @@ memtx_space_apply_initial_join_row(struct space *space, struct request *request) > return -1; > } > request->header->replica_id = 0; > - struct txn *txn = txn_begin_stmt(space); > - if (txn == NULL) > + struct txn *txn = in_txn(); > + struct txn_stmt *stmt = txn_begin_stmt(txn, space); > + if (stmt == NULL) > return -1; > - struct txn_stmt *stmt = txn_current_stmt(txn); AFAICS this is the only place where you use txn_stmt returned by txn_begin_stmt() - in other places you simply check if it's NULL (which indicates an error). Why did you decide to make txn_begin_stmt() return txn_stmt rather than an error code? The API looks lopsided now: txn_begin() => txn txn_begin_stmt(txn, request) => txn_stmt txn_commit_stmt(txn) => int txn_commit(txn) => int Let's please make txn_begin_stmt return int and use txn_current_stmt here to get txn_stmt, as we used to. This shouldn't affect performance while the code would look neater this way. > diff --git a/src/box/txn.c b/src/box/txn.c > index 1d8271e51..21f34e526 100644 > --- a/src/box/txn.c > +++ b/src/box/txn.c > @@ -441,14 +428,11 @@ fail: > } > > void > -txn_rollback_stmt() > +txn_rollback_stmt(struct txn *txn) > { > - struct txn *txn = in_txn(); > if (txn == NULL || txn->in_sub_stmt == 0) > return; Hmm, how's that even possible now?