[tarantool-patches] Re: [PATCH v2 3/5] Enforce applier out of order protection
Georgy Kirichenko
georgy at tarantool.org
Tue Jan 29 13:30:40 MSK 2019
On Monday, January 28, 2019 3:09:01 PM MSK Vladimir Davydov wrote:
> On Tue, Jan 22, 2019 at 01:31:11PM +0300, Georgy Kirichenko wrote:
> > Do not skip a row until it has been processed by the other appliers.
>
> Looks like a fix for
>
> https://github.com/tarantool/tarantool/issues/3568
>
> Worth adding a test?
>
> > Prerequisite #980
> > ---
> >
> > src/box/applier.cc | 35 ++++++++++++++++++-----------------
> > 1 file changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/box/applier.cc b/src/box/applier.cc
> > index 87873e970..148c8ce5a 100644
> > --- a/src/box/applier.cc
> > +++ b/src/box/applier.cc
> > @@ -504,6 +504,22 @@ applier_subscribe(struct applier *applier)
> >
> > applier->lag = ev_now(loop()) - row.tm;
> > applier->last_row_time = ev_monotonic_now(loop());
> >
> > + struct replica *replica = replica_by_id(row.replica_id);
> > + struct latch *latch = (replica ? &replica->order_latch :
> > + &replicaset.applier.order_latch);
> > + /*
> > + * In a full mesh topology, the same set
> > + * of changes may arrive via two
> > + * concurrently running appliers. Thanks
> > + * to vclock_follow() above, the first row
>
> I don't see any vclock_follow() above. Please fix the comment.
>
> > + * in the set will be skipped - but the
> > + * remaining may execute out of order,
> > + * when the following xstream_write()
> > + * yields on WAL. Hence we need a latch to
> > + * strictly order all changes which belong
> > + * to the same server id.
> > + */
> > + latch_lock(latch);
> >
> > if (vclock_get(&replicaset.applier.vclock,
> >
> > row.replica_id) < row.lsn) {
> >
> > if (row.replica_id == instance_id &&
>
> AFAIU this patch makes replicaset.applier.vclock, introduced by the
> previous patch, useless.
You are right for now, but I plan to release this latch just before
commit in the case of a parallel applier.
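
To make that concrete, a rough sketch of the idea (only
latch_lock()/latch_unlock() come from this patch; the split between
applying a row's statements and committing them is hypothetical,
parallel-applier-era code):

	latch_lock(latch);
	/* vclock check + vclock_follow_xrow(): claim this lsn. */
	/* Apply the row's statements. */
	latch_unlock(latch); /* Release just before commit. */
	/* The commit's WAL write may now overlap between appliers. */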
>
> > @@ -516,24 +532,7 @@ applier_subscribe(struct applier *applier)
> >
> > int64_t old_lsn = vclock_get(&replicaset.applier.vclock,
> >
> > row.replica_id);
> >
> > vclock_follow_xrow(&replicaset.applier.vclock, &row);
> >
> > - struct replica *replica = replica_by_id(row.replica_id);
> > - struct latch *latch = (replica ? &replica->order_latch :
> > - &replicaset.applier.order_latch);
> > - /*
> > - * In a full mesh topology, the same set
> > - * of changes may arrive via two
> > - * concurrently running appliers. Thanks
> > - * to vclock_follow() above, the first row
> > - * in the set will be skipped - but the
> > - * remaining may execute out of order,
> > - * when the following xstream_write()
> > - * yields on WAL. Hence we need a latch to
> > - * strictly order all changes which belong
> > - * to the same server id.
> > - */
> > - latch_lock(latch);
> >
> > int res = xstream_write(applier->subscribe_stream, &row);
> >
> > - latch_unlock(latch);
> >
> > if (res != 0) {
> >
> > struct error *e = diag_last_error(diag_get());
> > /**
> >
> > @@ -548,11 +547,13 @@ applier_subscribe(struct applier *applier)
> >
> > /* Rollback lsn to have a chance for a retry. */
> > vclock_set(&replicaset.applier.vclock,
> >
> > row.replica_id, old_lsn);
> >
> > + latch_unlock(latch);
> >
> > diag_raise();
> >
> > }
> >
> > }
> >
> > }
> >
> > done:
> > + latch_unlock(latch);
> >
> > /*
> >
> > * Stay 'orphan' until appliers catch up with
> > * the remote vclock at the time of SUBSCRIBE
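
For readers outside the thread: a minimal standalone demonstration of
the ordering the latch enforces. This is not tarantool code - a
pthread mutex stands in for the per-replica fiber latch and usleep()
for the yield inside xstream_write(). Build with: cc -pthread demo.c

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t order_latch = PTHREAD_MUTEX_INITIALIZER;
static long applied_lsn = 0; /* stands in for replicaset.applier.vclock */

static void
apply_row(long lsn)
{
	pthread_mutex_lock(&order_latch);	/* latch_lock(latch) */
	if (applied_lsn < lsn) {		/* vclock_get() < row.lsn */
		usleep(1000);			/* xstream_write() yields on WAL */
		applied_lsn = lsn;		/* vclock_follow_xrow() */
		printf("applied lsn %ld\n", lsn);
	}					/* else: duplicate row, skip it */
	pthread_mutex_unlock(&order_latch);	/* latch_unlock(latch) */
}

static void *
applier(void *arg)
{
	(void)arg;
	/* In a full mesh the same rows arrive via both appliers. */
	for (long lsn = 1; lsn <= 5; lsn++)
		apply_row(lsn);
	return NULL;
}

int
main(void)
{
	pthread_t a, b;
	pthread_create(&a, NULL, applier, NULL);
	pthread_create(&b, NULL, applier, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	/*
	 * Each lsn is printed exactly once and in order. Without the
	 * mutex the check-then-apply races: a row can be applied
	 * twice, or a later row can finish before an earlier one.
	 */
	return 0;
}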