On Monday, January 28, 2019 3:09:01 PM MSK Vladimir Davydov wrote:
> On Tue, Jan 22, 2019 at 01:31:11PM +0300, Georgy Kirichenko wrote:
> > Do not skip a row until it has been processed by other appliers.
>
> Looks like a fix for
>
> https://github.com/tarantool/tarantool/issues/3568
>
> Worth adding a test?
>
> > Prerequisite #980
> > ---
> >  src/box/applier.cc | 35 ++++++++++++++++++-----------------
> >  1 file changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/src/box/applier.cc b/src/box/applier.cc
> > index 87873e970..148c8ce5a 100644
> > --- a/src/box/applier.cc
> > +++ b/src/box/applier.cc
> > @@ -504,6 +504,22 @@ applier_subscribe(struct applier *applier)
> >  		applier->lag = ev_now(loop()) - row.tm;
> >  		applier->last_row_time = ev_monotonic_now(loop());
> >  
> > +		struct replica *replica = replica_by_id(row.replica_id);
> > +		struct latch *latch = (replica ? &replica->order_latch :
> > +				       &replicaset.applier.order_latch);
> > +		/*
> > +		 * In a full mesh topology, the same set
> > +		 * of changes may arrive via two
> > +		 * concurrently running appliers. Thanks
> > +		 * to vclock_follow() above, the first row
>
> I don't see any vclock_follow() above. Please fix the comment.
>
> > +		 * in the set will be skipped - but the
> > +		 * remaining may execute out of order,
> > +		 * when the following xstream_write()
> > +		 * yields on WAL. Hence we need a latch to
> > +		 * strictly order all changes which belong
> > +		 * to the same server id.
> > +		 */
> > +		latch_lock(latch);
> >  		if (vclock_get(&replicaset.applier.vclock,
> >  			       row.replica_id) < row.lsn) {
> >  			if (row.replica_id == instance_id &&
>
> AFAIU this patch makes replicaset.applier.vclock, introduced by the
> previous patch, useless.

You are right, for now. But I plan to release this latch just before commit
in the case of the parallel applier.

> > @@ -516,24 +532,7 @@ applier_subscribe(struct applier *applier)
> >  			int64_t old_lsn = vclock_get(&replicaset.applier.vclock,
> >  						     row.replica_id);
> >  			vclock_follow_xrow(&replicaset.applier.vclock, &row);
> > -			struct replica *replica = replica_by_id(row.replica_id);
> > -			struct latch *latch = (replica ? &replica->order_latch :
> > -					       &replicaset.applier.order_latch);
> > -			/*
> > -			 * In a full mesh topology, the same set
> > -			 * of changes may arrive via two
> > -			 * concurrently running appliers. Thanks
> > -			 * to vclock_follow() above, the first row
> > -			 * in the set will be skipped - but the
> > -			 * remaining may execute out of order,
> > -			 * when the following xstream_write()
> > -			 * yields on WAL. Hence we need a latch to
> > -			 * strictly order all changes which belong
> > -			 * to the same server id.
> > -			 */
> > -			latch_lock(latch);
> >  			int res = xstream_write(applier->subscribe_stream, &row);
> > -			latch_unlock(latch);
> >  			if (res != 0) {
> >  				struct error *e = diag_last_error(diag_get());
> >  				/**
> > @@ -548,11 +547,13 @@ applier_subscribe(struct applier *applier)
> >  				/* Rollback lsn to have a chance for a retry. */
> >  				vclock_set(&replicaset.applier.vclock,
> >  					   row.replica_id, old_lsn);
> > +				latch_unlock(latch);
> >  				diag_raise();
> >  			}
> >  		}
> >  	}
> > done:
> > +	latch_unlock(latch);
> >  	/*
> >  	 * Stay 'orphan' until appliers catch up with
> >  	 * the remote vclock at the time of SUBSCRIBE
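
To make the race easier to see outside of Tarantool, here is a toy model of it
in plain C with pthreads - a sketch, not Tarantool code: a mutex stands in for
the order latch, an integer for the per-replica vclock, and usleep() for the
yield inside xstream_write(). With USE_LATCH set to 0 the two "appliers"
usually write lsn 2 and 3 to the fake WAL before lsn 1; with the latch held
across the check and the write the order stays 1, 2, 3.

/*
 * toy_order.c - a sketch of the full-mesh ordering race, NOT
 * Tarantool code.  Two "appliers" receive the same rows (lsn 1..3)
 * of one replica id.  An applier skips a row already covered by the
 * shared vclock, otherwise it bumps the vclock and "writes to WAL",
 * which yields (usleep) the way xstream_write() can yield on WAL.
 *
 * Build: cc -pthread toy_order.c -o toy_order
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define USE_LATCH 1	/* set to 0 to reproduce the reordering */

static int vclock_lsn;			/* highest lsn claimed so far */
static int wal[16], wal_len;		/* fake WAL: lsns in write order */
static pthread_mutex_t vclock_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t wal_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t order_latch = PTHREAD_MUTEX_INITIALIZER;

struct applier_cfg {
	int start_delay_us;	/* lag before the applier starts */
	int write_delay_us;	/* how long a WAL write "yields" */
};

static void
wal_write(int lsn, int delay_us)
{
	usleep(delay_us);		/* the yield on WAL I/O */
	pthread_mutex_lock(&wal_mtx);
	wal[wal_len++] = lsn;
	pthread_mutex_unlock(&wal_mtx);
}

static void *
applier_f(void *arg)
{
	struct applier_cfg *cfg = arg;
	usleep(cfg->start_delay_us);
	for (int lsn = 1; lsn <= 3; lsn++) {
#if USE_LATCH
		pthread_mutex_lock(&order_latch);
#endif
		pthread_mutex_lock(&vclock_mtx);
		int skip = vclock_lsn >= lsn;
		if (!skip)
			vclock_lsn = lsn;	/* like vclock_follow_xrow() */
		pthread_mutex_unlock(&vclock_mtx);
		if (!skip)
			wal_write(lsn, cfg->write_delay_us);
#if USE_LATCH
		pthread_mutex_unlock(&order_latch);
#endif
	}
	return NULL;
}

int
main(void)
{
	/* One applier claims rows first but writes slowly, the other
	 * starts a bit later and writes fast - the setup that lets
	 * later lsns overtake lsn 1 when the latch is off. */
	struct applier_cfg slow = { 0, 50000 };
	struct applier_cfg fast = { 5000, 1000 };
	pthread_t a, b;
	pthread_create(&a, NULL, applier_f, &slow);
	pthread_create(&b, NULL, applier_f, &fast);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	for (int i = 0; i < wal_len; i++)
		printf("wal[%d] = lsn %d\n", i, wal[i]);
	return 0;
}

The extra vclock_mtx and wal_mtx exist only because this model uses real
threads; with cooperative fibers the per-replica-id latch is the only
synchronization the real code needs.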