From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 0679A445323 for ; Fri, 24 Jul 2020 01:10:59 +0300 (MSK) References: <20200723122942.196011-1-gorcunov@gmail.com> <20200723122942.196011-7-gorcunov@gmail.com> From: Vladislav Shpilevoy Message-ID: <2e024e88-df98-8f8a-e418-08ed38e570b9@tarantool.org> Date: Fri, 24 Jul 2020 00:10:58 +0200 MIME-Version: 1.0 In-Reply-To: <20200723122942.196011-7-gorcunov@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v2 6/7] qsync: implement direct write of CONFIRM/ROLLBACK into a journal List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Cyrill Gorcunov , tml Thanks for the patch! See 5 comments below. On 23.07.2020 14:29, Cyrill Gorcunov wrote: > When we need to write CONFIRM or ROLLBACK message (which is just > a binary record in msgpack format) into a journal we use txn code > to allocate a new transaction, encode there a message and pass it > to walk the long txn path before it hit the journal. This is not > only resource wasting but also somehow strange from arhitectural 1. arhitectural -> architectural. > point of view. > > Instead lets encode a record on the stack and write it > directly to the journal. > > Closes #5129 > > Signed-off-by: Cyrill Gorcunov > --- > src/box/txn_limbo.c | 92 +++++++++++++++++++++++++-------------------- > 1 file changed, 51 insertions(+), 41 deletions(-) > > diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c > index a74bfe244..3eca244dc 100644 > --- a/src/box/txn_limbo.c > +++ b/src/box/txn_limbo.c > @@ -32,6 +32,9 @@ > #include "txn_limbo.h" > #include "replication.h" > > +#include "iproto_constants.h" > +#include "journal.h" > + > struct txn_limbo txn_limbo; > > static inline void > @@ -237,62 +240,69 @@ txn_limbo_wait_complete(struct txn_limbo *limbo, struct txn_limbo_entry *entry) > return 0; > } > > +static void > +txn_limbo_write_cb(struct journal_entry *entry) > +{ > + /* > + * It is safe to call fiber_wakeup on > + * active fiber but we need to call wakeup > + * in case if we're sitting in WAL thread. 2. Actually it is not safe to call wakeup on the active fiber. It leads to something being broken in the scheduler. This is why we have so many checks like if (txn->fiber != fiber()) fiber_wakeup(txn->fiber); But yeah, in this case this is fine - the fiber is sleeping for sure. > + */ > + assert(entry->complete_data != NULL); > + fiber_wakeup(entry->complete_data); > +} > + > +/** > + * Write CONFIRM or ROLLBACK message to a journal directly > + * without involving transaction engine because using txn > + * engine is far from being cheap while we only need to > + * write a small message. > + */ > static int > -txn_limbo_write_confirm_rollback(struct txn_limbo *limbo, int64_t lsn, > - bool is_confirm) > +txn_limbo_write(uint32_t replica_id, int64_t lsn, int type) > { > + assert(replica_id != REPLICA_ID_NIL); > + assert(type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK); > assert(lsn > 0); > > - struct xrow_header row; > - struct request request = { > - .header = &row, > - }; > + char buf[sizeof(struct journal_entry) + > + sizeof(struct xrow_header *) + > + sizeof(struct xrow_header)]; 3. The last addition is an overkill a bit. Why can't you declare {struct xrow_header row;} on stack separately? I understand sizeof(struct journal_entry) + sizeof(struct xrow_header *) - they should be one object. But the third addition seems not necessarily to be done in the same buffer variable. 4. I think all these changes are going to interfere with https://github.com/tarantool/tarantool/issues/5151. What are you going to do to request_synchro_body to implement 5151? Will they be two separate structs? Or will the struct synchro_request contain the body? - this would lead to data duplication though, because the point of synchro_request is to provide the synchro fields not encoded nor bswaped. Would be good to think through how will these changes live with 5151. > - struct txn *txn = txn_begin(); > - if (txn == NULL) > - return -1; > + struct journal_entry *entry = (void *)buf; > + struct xrow_header *row = (void *)&entry->rows[1]; > + entry->rows[0] = row; > > - int res = 0; > - if (is_confirm) { > - res = xrow_encode_confirm(&row, &txn->region, > - limbo->instance_id, lsn); > - } else { > - /* > - * This LSN is the first to be rolled back, so > - * the last "safe" lsn is lsn - 1. > - */ > - res = xrow_encode_rollback(&row, &txn->region, > - limbo->instance_id, lsn); > + struct request_synchro_body body; 5. It looks like you could make things simpler, if you would create a struct containing the journal_entry, body, *xrow_header[1], and xrow_header. Would simplify things IMO. You wouldn't need the stack manipulations like you did above. > + xrow_encode_confirm_rollback(row, &body, replica_id, > + lsn, type); > + > + journal_entry_create(entry, 1, xrow_approx_len(row), > + txn_limbo_write_cb, fiber()); > + > + if (journal_write(entry) != 0) { > + diag_set(ClientError, ER_WAL_IO); > + diag_log(); > + return -1; > } > - if (res == -1) > - goto rollback; > - /* > - * This is not really a transaction. It just uses txn API > - * to put the data into WAL. And obviously it should not > - * go to the limbo and block on the very same sync > - * transaction which it tries to confirm now. > - */ > - txn_set_flag(txn, TXN_FORCE_ASYNC); > > - if (txn_begin_stmt(txn, NULL) != 0) > - goto rollback; > - if (txn_commit_stmt(txn, &request) != 0) > - goto rollback; > + if (entry->res < 0) { > + diag_set(ClientError, ER_WAL_IO); > + diag_log(); > + return -1; > + } > > - return txn_commit(txn); > -rollback: > - txn_rollback(txn); > - return -1; > + return 0; > }