From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 72078469710 for ; Mon, 1 Jun 2020 20:06:42 +0300 (MSK) References: <266e9768-27ea-e924-e562-846e74df7760@tarantool.org> <20200601160224.GB50@tarantool.org> From: Vladislav Shpilevoy Message-ID: Date: Mon, 1 Jun 2020 19:06:40 +0200 MIME-Version: 1.0 In-Reply-To: <20200601160224.GB50@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH v2 2/2] wal: reorder tx rows so that a tx ends on a global row List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Ostanevich Cc: tarantool-patches@dev.tarantool.org >>> 29.05.2020 01:54, Vladislav Shpilevoy пишет: >>>> Thanks   for   the   patch, >>>> But   I   have  a  comment! >>>> It   is   a   nice  crutch, >>>> Yet we need one more moment. >>>> >>>> The patch basically sacrifices transaction rows order >>>> and WAL correctness for the sake of replication. It >>>> does not look right. Why can't we leave WAL as is, and >>>> tweak all these things in relay? It looks really wrong >>>> to change statements order. Especially taking into >>>> account this is needed *only* for replication. For example, >>>> consider FKs. >>>> >>>> A local space has a FOREIGN KEY reference to a global >>>> space. To make it work, we need to insert into the global >>>> space first, and then into the local space. When you change >>>> the order, the local insert goes first, and violates the >>>> foreign key. So if we will check FKs on recovery (when we >>>> will have FKs in box), this patch will break them. >>>> >>>> Alternative to relay - append a dummy NOP statement in the >>>> end of the transaction, which would be global. But this is >>>> also a crutch. I think the TSNs figuring out should be done >>>> in relay. It could keep track of the current transaction, >>>> change TSNs and is_commit when necessary. >>> >>> Hi! Thanks for the review! >>> >>> I understand this is a crutch, but it's the best solution I could come >>> up with.  Appending dummy  NOPs will increase instance LSN by one, >>> which  also looks  ugly. The correct solution is, indeed, to collect  a tx >>> in relay and mangle with it in any means  we need before sending, however, >>> I faced some problems with this approach. See more in v1 of this patchset >>> (letter [PATCH 2/2] replication: make relay send txs in batches). >>> >>> By the way, your txn_limbo implementation assigns the tx an lsn of the last >>> tx row, which doesn't work in case the last row is a local one. So, looks like >>> we either need to reorder the rows or insert a NOP at the tx end. Or just >>> assign the tx an lsn of the last global row. >>> >>> Still, it's up to you which solution you find a better one, and I'll >>> implement it. >> >> I guess the current solution is ok then. Feel free to add >> Reviewed-by: Vladislav Shpilevoy . First version >> looks better, but seems it will take too much time to rework xlog >> streaming API. >> >> Although did you try? I don't know xlog code well, but perhaps you could >> just forbid to reuse ibuf for sequential rows until explicit reset() or >> something. >> >> I also have a thought about is_commit necessity. Why do we need it? Why isn't >> tsn enough? I saw in xrow code (xrow_header_encode()) that there are 2 >> combinations: >> >> - No tsn, and no is_commit for single statement transactions. Is_commit >> is assumed implicitly; >> >> - tsn is encoded, and the last statement has is_commit. >> >> But in fact the is_commit flag always can be determined implicitly. Consider >> all the possible cases: >> >> - row1: tsn1, row2: tsn2. Then all rows [..., row1] belong to one >> transaction, all rows [row2, ...] belong to a next transaction; >> >> - row1: no tsn, row2: tsn2. Then row1 is first single-statement >> transaction, [row2, ...] - next transaction. >> >> - row1: tsn1, row2: no tsn. Then the same as previous. [..., row1] - >> first transaction, row2 - second single-statement transaction. >> >> - row1: no tsn, row2: no tsn. Two single statement transactions. >> >> It means, transaction border can be determined by tsn difference >> between two sequential rows. This is allowed because transactions >> are never mixed. Tsn difference is border. And since is_rollback >> does not exist (it will in sync replication, but as a separate entry, >> for multiple transactions), and is_commit *already* is assumed >> implicitly for single statement transactions, why not to calculate it >> implicitly for all transactions? > > Shall we need a readahead then? > Consider you have a bactch arrived and you have to guess if the last op > in it actually is_commit? > What if you have no following ops for a long time? Will you hold the txn > until the next one arrived? > > Sergos I didn't think about it. Then the flag removal can't be done. At least not in the way I described it.