From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id D11F821D09 for ; Sun, 6 Jan 2019 08:07:57 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zsZbLykRvOPA for ; Sun, 6 Jan 2019 08:07:57 -0500 (EST) Received: from smtp5.mail.ru (smtp5.mail.ru [94.100.179.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 6A58A21028 for ; Sun, 6 Jan 2019 08:07:57 -0500 (EST) Received: by smtp5.mail.ru with esmtpa (envelope-from ) id 1gg89j-0001uS-Bd for tarantool-patches@freelists.org; Sun, 06 Jan 2019 16:07:55 +0300 From: Georgy Kirichenko Subject: [tarantool-patches] Re: [PATCH 1/2] Journal transaction boundaries Date: Sun, 06 Jan 2019 16:07:55 +0300 Message-ID: <1704369.YruXmEIxTt@localhost> In-Reply-To: <24452370cdb749e9bd8ff745947dd903b563be5e.1546723156.git.georgy@tarantool.org> References: <24452370cdb749e9bd8ff745947dd903b563be5e.1546723156.git.georgy@tarantool.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1657153.4aT1ZPPbye"; micalg="pgp-sha256"; protocol="application/pgp-signature" Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org --nextPart1657153.4aT1ZPPbye Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="UTF-8" It is a wrong commit, please review the second version. On Sunday, January 6, 2019 12:26:05 AM MSK Georgy Kirichenko wrote: > Append txn_id, txn_replica_id and txn_last to xrow_header structure. > txn_replica_id identifies replica where transaction was started and > txn_id identifies transaction id on that replica. As transaction id a lsn > of the first row in this transaction is used. > txn_last set to true if it is the last row in a transaction, so we could > commit transaction with last row or use additional NOP requests with > txn_last =3D true ans valid txn_id and txn_replica_id. > For replication all local changes moved to xrows array tail to form a > separate transaction (like autonomous transaction) because it is not > possible to replicate such transaction back to it's creator. >=20 > As encoding/deconding rules assumed: > 1. txn_replica_id is encoded only if it is not equal with replica id. Th= is > might have point because of replication trigger > 2. txn_id and txn_last are encoded only for multi-row transaction. So if= we > do not have txn_id in a xstream then this means that it is a single row > transaction > This rules provides compatibility with previous xlog handling. >=20 > Needed for: 2798 > --- > src/box/iproto_constants.h | 3 +++ > src/box/wal.c | 33 ++++++++++++++++++++++++++++++++- > src/box/xrow.c | 38 ++++++++++++++++++++++++++++++++++++++ > src/box/xrow.h | 5 ++++- > test/unit/xrow.cc | 3 +++ > test/vinyl/errinj.result | 8 ++++---- > test/vinyl/info.result | 38 +++++++++++++++++++------------------- > test/vinyl/layout.result | 24 ++++++++++++------------ > 8 files changed, 115 insertions(+), 37 deletions(-) >=20 > diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h > index 728514297..d01cdf840 100644 > --- a/src/box/iproto_constants.h > +++ b/src/box/iproto_constants.h > @@ -60,6 +60,9 @@ enum iproto_key { > IPROTO_SCHEMA_VERSION =3D 0x05, > IPROTO_SERVER_VERSION =3D 0x06, > IPROTO_GROUP_ID =3D 0x07, > + IPROTO_TXN_ID =3D 0x08, > + IPROTO_TXN_REPLICA_ID =3D 0x09, > + IPROTO_TXN_LAST =3D 0x0a, > /* Leave a gap for other keys in the header. */ > IPROTO_SPACE_ID =3D 0x10, > IPROTO_INDEX_ID =3D 0x11, > diff --git a/src/box/wal.c b/src/box/wal.c > index 4c3537672..584d951c0 100644 > --- a/src/box/wal.c > +++ b/src/box/wal.c > @@ -905,9 +905,10 @@ wal_writer_begin_rollback(struct wal_writer *writer) > } >=20 > static void > -wal_assign_lsn(struct vclock *vclock, struct xrow_header **row, > +wal_assign_lsn(struct vclock *vclock, struct xrow_header **begin, > struct xrow_header **end) > { > + struct xrow_header **row =3D begin; > /** Assign LSN to all local rows. */ > for ( ; row < end; row++) { > if ((*row)->replica_id =3D=3D 0) { > @@ -917,6 +918,36 @@ wal_assign_lsn(struct vclock *vclock, struct > xrow_header **row, vclock_follow_xrow(vclock, *row); > } > } > + if ((*begin)->replica_id !=3D instance_id) { > + /* > + * Move all local changes to the end of rows array and > + * a fake local transaction (like an autonomous=20 transaction) > + * because we could not replicate the transaction back. > + */ > + row =3D begin; > + while (row < end - 1) { > + if (row[0]->replica_id =3D=3D instance_id && > + row[1]->replica_id !=3D instance_id) { > + struct xrow_header *tmp =3D row[0]; > + row[0] =3D row[1]; > + row[1] =3D tmp; > + } > + ++row; > + } > + /* Search begin of local rows tail. */ > + row =3D end; > + while (row > begin && row[-1]->replica_id =3D=3D=20 instance_id) > + --row; > + begin =3D row; > + } > + /* Setup txn_id and tnx_replica_id for localy generated rows. */ > + row =3D begin; > + while (row < end) { > + row[0]->txn_id =3D begin[0]->lsn; > + row[0]->txn_replica_id =3D instance_id; > + row[0]->txn_last =3D row =3D=3D end - 1 ? 1 : 0; > + ++row; > + } > } >=20 > static void > diff --git a/src/box/xrow.c b/src/box/xrow.c > index ef3f81add..db524b3c8 100644 > --- a/src/box/xrow.c > +++ b/src/box/xrow.c > @@ -133,12 +133,32 @@ error: > case IPROTO_SCHEMA_VERSION: > header->schema_version =3D=20 mp_decode_uint(pos); > break; > + case IPROTO_TXN_ID: > + header->txn_id =3D mp_decode_uint(pos); > + break; > + case IPROTO_TXN_REPLICA_ID: > + header->txn_replica_id =3D mp_decode_uint(pos); > + break; > + case IPROTO_TXN_LAST: > + header->txn_last =3D mp_decode_uint(pos); > + break; > default: > /* unknown header */ > mp_next(pos); > } > } > assert(*pos <=3D end); > + if (header->txn_id =3D=3D 0) { > + /* > + * Transaction id is not set so it is a single statement > + * transaction. > + */ > + header->txn_id =3D header->lsn; > + header->txn_last =3D true; > + } > + if (header->txn_replica_id =3D=3D 0) > + header->txn_replica_id =3D header->replica_id; > + > /* Nop requests aren't supposed to have a body. */ > if (*pos < end && header->type !=3D IPROTO_NOP) { > const char *body =3D *pos; > @@ -223,6 +243,24 @@ xrow_header_encode(const struct xrow_header *header, > uint64_t sync, d =3D mp_encode_double(d, header->tm); > map_size++; > } > + if (header->txn_id !=3D header->lsn || header->txn_last =3D=3D 0) { > + /* Encode txn id for multi row transaction members. */ > + d =3D mp_encode_uint(d, IPROTO_TXN_ID); > + d =3D mp_encode_uint(d, header->txn_id); > + map_size++; > + } > + if (header->txn_replica_id !=3D header->replica_id) { > + d =3D mp_encode_uint(d, IPROTO_TXN_REPLICA_ID); > + d =3D mp_encode_uint(d, header->txn_replica_id); > + map_size++; > + } > + if (header->txn_last && !(header->txn_id =3D=3D header->lsn && > + header->txn_replica_id =3D=3D=20 header->replica_id)) { > + /* Set last row for multi row transaction. */ > + d =3D mp_encode_uint(d, IPROTO_TXN_LAST); > + d =3D mp_encode_uint(d, header->txn_last); > + map_size++; > + } > assert(d <=3D data + XROW_HEADER_LEN_MAX); > mp_encode_map(data, map_size); > out->iov_len =3D d - (char *) out->iov_base; > diff --git a/src/box/xrow.h b/src/box/xrow.h > index 6bab0a1fd..4acd84d56 100644 > --- a/src/box/xrow.h > +++ b/src/box/xrow.h > @@ -47,7 +47,7 @@ enum { > XROW_HEADER_IOVMAX =3D 1, > XROW_BODY_IOVMAX =3D 2, > XROW_IOVMAX =3D XROW_HEADER_IOVMAX + XROW_BODY_IOVMAX, > - XROW_HEADER_LEN_MAX =3D 40, > + XROW_HEADER_LEN_MAX =3D 60, > XROW_BODY_LEN_MAX =3D 128, > IPROTO_HEADER_LEN =3D 28, > /** 7 =3D sizeof(iproto_body_bin). */ > @@ -69,6 +69,9 @@ struct xrow_header { > uint64_t sync; > int64_t lsn; /* LSN must be signed for correct comparison */ > double tm; > + int64_t txn_id; > + uint32_t txn_replica_id; > + uint32_t txn_last; >=20 > int bodycnt; > uint32_t schema_version; > diff --git a/test/unit/xrow.cc b/test/unit/xrow.cc > index 165a543cf..0b796d728 100644 > --- a/test/unit/xrow.cc > +++ b/test/unit/xrow.cc > @@ -215,6 +215,9 @@ test_xrow_header_encode_decode() > header.lsn =3D 400; > header.tm =3D 123.456; > header.bodycnt =3D 0; > + header.txn_id =3D header.lsn; > + header.txn_replica_id =3D header.replica_id; > + header.txn_last =3D true; > uint64_t sync =3D 100500; > struct iovec vec[1]; > is(1, xrow_header_encode(&header, sync, vec, 200), "encode"); > diff --git a/test/vinyl/errinj.result b/test/vinyl/errinj.result > index 23ab845b3..7ea5df777 100644 > --- a/test/vinyl/errinj.result > +++ b/test/vinyl/errinj.result > @@ -2164,7 +2164,7 @@ i:stat().disk.compact.queue -- 30 statements > - bytes_compressed: > pages: 3 > rows: 30 > - bytes: 471 > + bytes: 537 > ... > i:stat().disk.compact.queue.bytes =3D=3D box.stat.vinyl().disk.compact.q= ueue > --- > @@ -2178,7 +2178,7 @@ i:stat().disk.compact.queue -- 40 statements > - bytes_compressed: > pages: 4 > rows: 40 > - bytes: 628 > + bytes: 716 > ... > i:stat().disk.compact.queue.bytes =3D=3D box.stat.vinyl().disk.compact.q= ueue > --- > @@ -2192,7 +2192,7 @@ i:stat().disk.compact.queue -- 50 statements > - bytes_compressed: > pages: 5 > rows: 50 > - bytes: 785 > + bytes: 895 > ... > i:stat().disk.compact.queue.bytes =3D=3D box.stat.vinyl().disk.compact.q= ueue > --- > @@ -2206,7 +2206,7 @@ i:stat().disk.compact.queue -- 50 statements > - bytes_compressed: > pages: 5 > rows: 50 > - bytes: 785 > + bytes: 895 > ... > i:stat().disk.compact.queue.bytes =3D=3D box.stat.vinyl().disk.compact.q= ueue > --- > diff --git a/test/vinyl/info.result b/test/vinyl/info.result > index 922728abe..4876ddc68 100644 > --- a/test/vinyl/info.result > +++ b/test/vinyl/info.result > @@ -285,19 +285,19 @@ stat_diff(istat(), st) > bytes: 26525 > count: 1 > out: > - bytes: 26049 > + bytes: 26113 > pages: 7 > bytes_compressed: > rows: 25 > index_size: 294 > rows: 25 > - bytes: 26049 > + bytes: 26113 > bytes_compressed: > bloom_size: 70 > statement: > replaces: 25 > pages: 7 > - bytes: 26049 > + bytes: 26113 > put: > rows: 25 > bytes: 26525 > @@ -325,26 +325,26 @@ stat_diff(istat(), st) > bytes: 53050 > count: 1 > out: > - bytes: 52091 > + bytes: 52217 > pages: 13 > bytes_compressed: > rows: 50 > index_size: 252 > rows: 25 > - bytes: 26042 > + bytes: 26104 > bytes_compressed: > pages: 6 > statement: > replaces: 25 > compact: > in: > - bytes: 78140 > + bytes: 78330 > pages: 20 > bytes_compressed: > rows: 75 > count: 1 > out: > - bytes: 52091 > + bytes: 52217 > pages: 13 > bytes_compressed: > rows: 50 > @@ -352,7 +352,7 @@ stat_diff(istat(), st) > rows: 50 > bytes: 53050 > rows: 25 > - bytes: 26042 > + bytes: 26104 > ... > -- point lookup from disk + cache put > st =3D istat() > @@ -376,7 +376,7 @@ stat_diff(istat(), st) > disk: > iterator: > read: > - bytes: 4167 > + bytes: 4177 > pages: 1 > bytes_compressed: > rows: 4 > @@ -626,7 +626,7 @@ stat_diff(istat(), st) > disk: > iterator: > read: > - bytes: 104300 > + bytes: 104550 > pages: 25 > bytes_compressed: > rows: 100 > @@ -971,7 +971,7 @@ istat() > --- > - rows: 306 > run_avg: 1 > - bytes: 317731 > + bytes: 317981 > upsert: > squashed: 0 > applied: 0 > @@ -1049,7 +1049,7 @@ istat() > bloom_size: 140 > pages: 25 > bytes_compressed: > - bytes: 104300 > + bytes: 104550 > txw: > bytes: 0 > rows: 0 > @@ -1082,7 +1082,7 @@ gstat() > in: 0 > queue: 0 > out: 0 > - data: 104300 > + data: 104550 > index: 1190 > memory: > tuple_cache: 14313 > @@ -1209,7 +1209,7 @@ st2 =3D i2:stat() > ... > s:bsize() > --- > -- 52199 > +- 52313 > ... > i1:len(), i2:len() > --- > @@ -1219,7 +1219,7 @@ i1:len(), i2:len() > i1:bsize(), i2:bsize() > --- > - 364 > -- 920 > +- 1022 > ... > s:bsize() =3D=3D st1.disk.bytes > --- > @@ -1271,7 +1271,7 @@ st2 =3D i2:stat() > ... > s:bsize() > --- > -- 107449 > +- 107563 > ... > i1:len(), i2:len() > --- > @@ -1281,7 +1281,7 @@ i1:len(), i2:len() > i1:bsize(), i2:bsize() > --- > - 49516 > -- 50072 > +- 50174 > ... > s:bsize() =3D=3D st1.memory.bytes + st1.disk.bytes > --- > @@ -1336,7 +1336,7 @@ st2 =3D i2:stat() > ... > s:bsize() > --- > -- 52199 > +- 52313 > ... > i1:len(), i2:len() > --- > @@ -1346,7 +1346,7 @@ i1:len(), i2:len() > i1:bsize(), i2:bsize() > --- > - 364 > -- 920 > +- 1022 > ... > s:bsize() =3D=3D st1.disk.bytes > --- > diff --git a/test/vinyl/layout.result b/test/vinyl/layout.result > index 14201c5dd..a6b577dbc 100644 > --- a/test/vinyl/layout.result > +++ b/test/vinyl/layout.result > @@ -253,8 +253,8 @@ result > BODY: > row_index_offset: > offset: > - size: 108 > - unpacked_size: 89 > + size: 118 > + unpacked_size: 99 > row_count: 4 > min_key: ['=D1=91=D1=91=D1=91'] > - - 00000000000000000008.run > @@ -281,7 +281,7 @@ result > - HEADER: > type: ROWINDEX > BODY: > - row_index: "\0\0\0\0\0\0\0\x10\0\0\0 \0\0\00" > + row_index: "\0\0\0\0\0\0\0\x12\0\0\0$\0\0\06" > - - 00000000000000000012.index > - - HEADER: > type: RUNINFO > @@ -298,8 +298,8 @@ result > BODY: > row_index_offset: > offset: > - size: 102 > - unpacked_size: 83 > + size: 110 > + unpacked_size: 91 > row_count: 3 > min_key: ['=D1=91=D1=91=D1=91'] > - - 00000000000000000012.run > @@ -324,7 +324,7 @@ result > - HEADER: > type: ROWINDEX > BODY: > - row_index: "\0\0\0\0\0\0\0\x14\0\0\0*" > + row_index: "\0\0\0\0\0\0\0\x16\0\0\0." > - - 00000000000000000006.index > - - HEADER: > type: RUNINFO > @@ -341,8 +341,8 @@ result > BODY: > row_index_offset: > offset: > - size: 108 > - unpacked_size: 89 > + size: 118 > + unpacked_size: 99 > row_count: 4 > min_key: [null, '=D1=91=D1=91=D1=91'] > - - 00000000000000000006.run > @@ -369,7 +369,7 @@ result > - HEADER: > type: ROWINDEX > BODY: > - row_index: "\0\0\0\0\0\0\0\x10\0\0\0 \0\0\00" > + row_index: "\0\0\0\0\0\0\0\x12\0\0\0$\0\0\06" > - - 00000000000000000010.index > - - HEADER: > type: RUNINFO > @@ -386,8 +386,8 @@ result > BODY: > row_index_offset: > offset: > - size: 90 > - unpacked_size: 71 > + size: 98 > + unpacked_size: 79 > row_count: 3 > min_key: [123, '=D1=91=D1=91=D1=91'] > - - 00000000000000000010.run > @@ -409,7 +409,7 @@ result > - HEADER: > type: ROWINDEX > BODY: > - row_index: "\0\0\0\0\0\0\0\x10\0\0\0\"" > + row_index: "\0\0\0\0\0\0\0\x12\0\0\0&" > ... > test_run:cmd("clear filter") > --- --nextPart1657153.4aT1ZPPbye Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part. Content-Transfer-Encoding: 7Bit -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEFZT35EtIMRTDS5hJnoTdFFzh6LUFAlwx/asACgkQnoTdFFzh 6LWv/AgAmBmpgJdysxjxOvvArldRbkG/X+LAUOnPYptvaOGHXrdjWu5/gqSeig1A TXuMV096gE+hhqOS0+zbHQZWy62XoUj+YsJNJuWVDZNCeYG8c80OT3ae8O6WCxKd yl+GtRjOyjkoSDJXiwFMOMv/ECCXlH1tA1Y7dZ4kSO/juVHIVSdRW6CGT9sFkWX0 KmI1f3rMKjpyBt1+pK9vs0VDHtPaMc13rR9gj1kgK+b14zshFoWWi1Yw+5iU8NGp IruY85il4ukhTHIymCpBesX7Ov1xe/zRlMQU3V5vAP/PRB4s6D9VQNEv064V4HXl QopUaQR967axeNMaVCt/XKD67BMyLQ== =1FZ8 -----END PGP SIGNATURE----- --nextPart1657153.4aT1ZPPbye--