From: Georgy Kirichenko
Subject: Re: [tarantool-patches] Re: [PATCH v2 3/5] Enforce applier out of order protection
Date: Tue, 29 Jan 2019 13:30:40 +0300
Message-ID: <774394922.0SCKpy4CPs@home.lan>
In-Reply-To: <20190128120901.spkitg7kyrfjp6xz@esperanza>
References: <4c39bbbfcd12c47b9b14fc1a0a0484331939ed63.1548152776.git.georgy@tarantool.org> <20190128120901.spkitg7kyrfjp6xz@esperanza>
To: tarantool-patches@freelists.org
Cc: Vladimir Davydov

On Monday, January 28, 2019 3:09:01 PM MSK Vladimir Davydov wrote:
> On Tue, Jan 22, 2019 at 01:31:11PM +0300, Georgy Kirichenko wrote:
> > Do not skip row until the row is not processed by other appliers.
>
> Looks like a fix for
>
> https://github.com/tarantool/tarantool/issues/3568
>
> Worth adding a test?
>
> > Prerequisite #980
> > ---
> >
> >  src/box/applier.cc | 35 ++++++++++++++++++-----------------
> >  1 file changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/box/applier.cc b/src/box/applier.cc
> > index 87873e970..148c8ce5a 100644
> > --- a/src/box/applier.cc
> > +++ b/src/box/applier.cc
> > @@ -504,6 +504,22 @@ applier_subscribe(struct applier *applier)
> >  		applier->lag = ev_now(loop()) - row.tm;
> >  		applier->last_row_time = ev_monotonic_now(loop());
> >
> > +		struct replica *replica = replica_by_id(row.replica_id);
> > +		struct latch *latch = (replica ? &replica->order_latch :
> > +				       &replicaset.applier.order_latch);
> > +		/*
> > +		 * In a full mesh topology, the same set
> > +		 * of changes may arrive via two
> > +		 * concurrently running appliers. Thanks
> > +		 * to vclock_follow() above, the first row
>
> I don't see any vclock_follow() above. Please fix the comment.
>
> > +		 * in the set will be skipped - but the
> > +		 * remaining may execute out of order,
> > +		 * when the following xstream_write()
> > +		 * yields on WAL. Hence we need a latch to
> > +		 * strictly order all changes which belong
> > +		 * to the same server id.
> > +		 */
> > +		latch_lock(latch);
> >  		if (vclock_get(&replicaset.applier.vclock,
> >  			       row.replica_id) < row.lsn) {
> >  			if (row.replica_id == instance_id &&
>
> AFAIU this patch makes replicaset.applier.vclock, introduced by the
> previous patch, useless.

You are right for now, but I plan to release this latch just before
commit in the parallel applier case.

> > @@ -516,24 +532,7 @@ applier_subscribe(struct applier *applier)
> >  			int64_t old_lsn = vclock_get(&replicaset.applier.vclock,
> >  						     row.replica_id);
> >  			vclock_follow_xrow(&replicaset.applier.vclock, &row);
> > -			struct replica *replica = replica_by_id(row.replica_id);
> > -			struct latch *latch = (replica ? &replica->order_latch :
> > -					       &replicaset.applier.order_latch);
> > -			/*
> > -			 * In a full mesh topology, the same set
> > -			 * of changes may arrive via two
> > -			 * concurrently running appliers. Thanks
> > -			 * to vclock_follow() above, the first row
> > -			 * in the set will be skipped - but the
> > -			 * remaining may execute out of order,
> > -			 * when the following xstream_write()
> > -			 * yields on WAL. Hence we need a latch to
> > -			 * strictly order all changes which belong
> > -			 * to the same server id.
> > -			 */
> > -			latch_lock(latch);
> >  			int res = xstream_write(applier->subscribe_stream, &row);
> > -			latch_unlock(latch);
> >  			if (res != 0) {
> >  				struct error *e = diag_last_error(diag_get());
> >  				/**
> > @@ -548,11 +547,13 @@ applier_subscribe(struct applier *applier)
> >  					/* Rollback lsn to have a chance for a retry. */
> >  					vclock_set(&replicaset.applier.vclock,
> >  						   row.replica_id, old_lsn);
> > +					latch_unlock(latch);
> >  					diag_raise();
> >  				}
> >  			}
> >  		}
> >  done:
> > +		latch_unlock(latch);
> >  		/*
> >  		 * Stay 'orphan' until appliers catch up with
> >  		 * the remote vclock at the time of SUBSCRIBE
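
For illustration only, here is a minimal standalone sketch of the ordering idea discussed in this thread: take the per-replica latch before the LSN check and hold it until the write has finished, so that a row delivered by two appliers is written once and in order. This is not the Tarantool code. Row, ReplicaState, write_row(), apply_row() and run_demo() are hypothetical names, std::mutex stands in for the fiber latch, and a short sleep stands in for xstream_write() yielding on WAL.

```cpp
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins, not Tarantool's types.
struct Row {
	uint32_t replica_id;
	int64_t lsn;
};

struct ReplicaState {
	std::mutex order_latch;     // stands in for replica->order_latch
	int64_t applied_lsn = 0;    // stands in for one applier vclock component
	std::vector<int64_t> log;   // order in which rows were actually written
};

// Simulated write that "yields", as xstream_write() may while waiting for WAL.
static void write_row(ReplicaState &st, const Row &row)
{
	std::this_thread::sleep_for(std::chrono::milliseconds(1));
	st.log.push_back(row.lsn);
}

// Lock before the LSN check and keep the latch until the write is done.
static void apply_row(ReplicaState &st, const Row &row)
{
	std::lock_guard<std::mutex> guard(st.order_latch);
	if (st.applied_lsn >= row.lsn)
		return;             // already written via the other applier
	write_row(st, row);
	st.applied_lsn = row.lsn;
}

// Two "appliers" deliver the same rows concurrently, as in a full mesh.
static std::vector<int64_t> run_demo()
{
	ReplicaState st;
	std::vector<Row> rows;
	for (int64_t lsn = 1; lsn <= 5; lsn++)
		rows.push_back(Row{1, lsn});
	auto applier = [&st, &rows]() {
		for (const Row &row : rows)
			apply_row(st, row);
	};
	std::thread a(applier);
	std::thread b(applier);
	a.join();
	b.join();
	return st.log;
}
```

Calling run_demo() returns the LSNs 1 through 5, each exactly once and in increasing order, whichever thread gets to a row first. If the lock were taken only around write_row(), as before this patch, both threads could pass the LSN check for the same row before either of them writes it.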