From: Vladimir Davydov <vdavydov.dev@gmail.com> To: "Георгий Кириченко" <georgy@tarantool.org> Cc: tarantool-patches@freelists.org Subject: Re: [tarantool-patches] [PATCH v2 1/2] Journal transaction boundaries Date: Thu, 31 Jan 2019 17:54:54 +0300 [thread overview] Message-ID: <20190131145454.hhuwbgff2hznsrkx@esperanza> (raw) In-Reply-To: <5221637.HvkEVp0ifG@home.lan> On Thu, Jan 31, 2019 at 05:25:24PM +0300, Георгий Кириченко wrote: > On Thursday, January 31, 2019 11:19:51 AM MSK Vladimir Davydov wrote: > > On Thu, Jan 31, 2019 at 10:34:43AM +0300, Георгий Кириченко wrote: > > > On Tuesday, January 29, 2019 2:00:10 PM MSK Vladimir Davydov wrote: > > > > On Tue, Jan 29, 2019 at 01:09:50PM +0300, Георгий Кириченко wrote: > > > > > On Monday, January 28, 2019 3:58:59 PM MSK Vladimir Davydov wrote: > > > > > > On Tue, Jan 22, 2019 at 01:57:36PM +0300, Georgy Kirichenko wrote: > > > > > > > diff --git a/src/box/iproto_constants.h > > > > > > > b/src/box/iproto_constants.h > > > > > > > index 728514297..d01cdf840 100644 > > > > > > > --- a/src/box/iproto_constants.h > > > > > > > +++ b/src/box/iproto_constants.h > > > > > > > @@ -60,6 +60,9 @@ enum iproto_key { > > > > > > > > > > > > > > IPROTO_SCHEMA_VERSION = 0x05, > > > > > > > IPROTO_SERVER_VERSION = 0x06, > > > > > > > IPROTO_GROUP_ID = 0x07, > > > > > > > > > > > > > > + IPROTO_TXN_ID = 0x08, > > > > > > > + IPROTO_TXN_REPLICA_ID = 0x09, > > > > > > > > > > > > Do we really need to introduce both TXN_ID and TXN_REPLICA_ID? I > > > > > > would > > > > > > expect TXN_ID to be enough - we could use REPLICA_ID to make sure > > > > > > that > > > > > > transaction identifiers from different instances don't overlap, > > > > > > because > > > > > > there couldn't be a "multi-instance" transaction, could there? > > > > > > txn_replica_id might be differ from replica_id in case when transaction > > > would be finished on other node, e.g. after recovery. > > > > I'm kinda out of scope here. I think you need to write an RFC or > > something explaining all the design decisions you made when you > > introduced transaction boundaries. Or we should get together with > > Kostja and discuss it. Or both. Neither the commit message nor > > comments shed the light on the big picture. > > > > > > > > > + IPROTO_TXN_LAST = 0x0a, > > > > > > > > > > > > I think we should instead introduce BEGIN and COMMIT commands, because: > > > > > I completely do not like any auto-commit logic in a xlog file. You > > > > > suggestion breaks backward compatibility because previous logs do not > > > > > have any BEGIN/ > > > > > > > > It wouldn't break backward compatibility. It might break forward > > > > compatibility, which is fine by me (we do it all the time). > > > > > > Suggested xrow encoding/decoding rules means that any xrow without txn_id, > > > txn_replica_id, txn_last should be processed as a single statement > > > transaction as it was before. > > > If we would require explicit begin/commit then previous logs turns into an > > > invalid stream without autocommit semantic. > > > > Yeah, we would have to add autocommit. > I am disagreed with autocommit. There are my key points: > 1. We could not interleave autocommit transactions with others, or we would > have to have special mark for this transaction but this is no autocommit after > all. Do we need to support interleaving transactions in xlog? I'm not sure. May be, for sync replication we do. Anyway, we could add BEGIN/COMMIT *in addition* to TXN_ID stored in each row. They would serve as a header and footer. This would give us a place holder to store transaction-wide information if we ever need it. Besides, it would make the stream more robust: if we loose BEGIN, we will detect it while if we loose the first transaction row in your case we won't. Yeah, explicit BEGIN/COMMIT will add a few bytes to each transaction, but IMO it's not critical. Let's discuss it f2f when we can, because we seem to repeat ourselves without moving forward. > 2. Applier would not able to check transaction boundaries. > 3. There is no difference between an autocommit transaction and a single-row > transaction. > 4. Autocommit is about interface behavior but not about log format. > > Also let me explain xrow encoding rules: > If we have only one row in transaction, then we do not encode txn_id and > txn_last. So single-row transaction (or autocommit if you wish) does not > change its' representation. And then while decoding we set txn_id = lsn and > txn_last if txn_id was not set. In that case each row from previous xlog acts > as one single-row transaction. > For multi-row transaction we have only penalty in transaction identifiers for > each row and one commit flag for the last. > > > > > > > > @@ -917,6 +918,39 @@ wal_assign_lsn(struct vclock *vclock, struct > > > > > > > xrow_header **row,> > > > > > > > > > > > > > > vclock_follow_xrow(vclock, *row); > > > > > > > > > > > > > > } > > > > > > > > > > > > > > } > > > > > > > > > > > > > > + if ((*begin)->replica_id != instance_id) { > > > > > > > + /* > > > > > > > + * Move all local changes to the end of rows array and > > > > > > > + * a fake local transaction (like an autonomous transaction) > > > > > > > + * because we could not replicate the transaction back. > > > > > > > + */ > > > > > > > + struct xrow_header **row = end - 1; > > > > > > > + while (row >= begin) { > > > > > > > + if (row[0]->replica_id != instance_id) { > > > > > > > + --row; > > > > > > > + continue; > > > > > > > + } > > > > > > > + /* Local row, move it back. */ > > > > > > > + struct xrow_header **local_row = row; > > > > > > > + while (local_row < end - 1 && > > > > > > > + local_row[1]->replica_id != instance_id) { > > > > > > > + struct xrow_header *tmp = local_row[0]; > > > > > > > + local_row[0] = local_row[1]; > > > > > > > + local_row[1] = tmp; > > > > > > > > > > > > There's SWAP for this. > > > > > > > > > > > > > + } > > > > > > > + --row; > > > > > > > + } > > > > > > > + while (begin < end && begin[0]->replica_id != instance_id) > > > > > > > + ++begin; > > > > > > > + } > > > > > > > > > > > > I don't understand why we need to move rows generated locally (by > > > > > > an on_replace trigger I surmise) to the end of a transaction. We > > > > > > have TXN_ID attached to each row so we could leave the transactions > > > > > > interleaved, couldn't we? > > > > > > > > > > You are right, but in that case applier should track a lot of > > > > > transaction > > > > > simultaneously. Also it complicates recovery too. I hope it will be > > > > > fixed > > > > > while parallel applier implementing. > > > > > > > > May be, we should implement parallel applier in the scope of this issue > > > > then? Anyway, without it, sync replication won't scale with the number > > > > of parallel transactions. > > > > > > I do not think so. Parallel applier depends on that feature, but it is > > > completely different issue. > > > > But as I said, sync replication won't scale without it. > > > > > Also parallel applier is ignorant on a xlog format. > > > > Depends on how you implement it. For example, MySQL marks all > > transactions that were committed in parallel on the master so that > > they can be committed in parallel on replicas. > In case of vinyl it is not true. In case of vinyl we can mark transactions that overlapped in time on the master. In fact, that's what MySQL actually does. I guess I misleaded you by saying "committed in parallel". I meant "executed in parallel".
next prev parent reply other threads:[~2019-01-31 14:54 UTC|newest] Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-01-22 10:57 [tarantool-patches] [PATCH v2 0/2] Transaction boundaries in replication protocol Georgy Kirichenko 2019-01-22 10:57 ` [tarantool-patches] [PATCH v2 1/2] Journal transaction boundaries Georgy Kirichenko 2019-01-28 12:58 ` Vladimir Davydov 2019-01-29 10:09 ` Георгий Кириченко 2019-01-29 11:00 ` Vladimir Davydov 2019-01-31 7:34 ` Георгий Кириченко 2019-01-31 8:19 ` Vladimir Davydov 2019-01-31 14:25 ` Георгий Кириченко 2019-01-31 14:54 ` Vladimir Davydov [this message] 2019-02-01 9:31 ` Георгий Кириченко 2019-01-28 13:00 ` Vladimir Davydov 2019-01-28 13:08 ` [tarantool-patches] " Vladislav Shpilevoy 2019-02-08 16:56 ` Konstantin Osipov 2019-01-22 10:57 ` [tarantool-patches] [PATCH v2 2/2] Transaction support for applier Georgy Kirichenko 2019-01-28 13:35 ` Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190131145454.hhuwbgff2hznsrkx@esperanza \ --to=vdavydov.dev@gmail.com \ --cc=georgy@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='Re: [tarantool-patches] [PATCH v2 1/2] Journal transaction boundaries' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox