[tarantool-patches] [PATCH v2 3/5] Enforce applier out of order protection
Vladimir Davydov
vdavydov.dev at gmail.com
Mon Jan 28 15:09:01 MSK 2019
On Tue, Jan 22, 2019 at 01:31:11PM +0300, Georgy Kirichenko wrote:
> Do not skip a row until it has been processed by the other appliers.
Looks like a fix for
https://github.com/tarantool/tarantool/issues/3568
Worth adding a test?
>
> Prerequisite #980
> ---
> src/box/applier.cc | 35 ++++++++++++++++++-----------------
> 1 file changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/src/box/applier.cc b/src/box/applier.cc
> index 87873e970..148c8ce5a 100644
> --- a/src/box/applier.cc
> +++ b/src/box/applier.cc
> @@ -504,6 +504,22 @@ applier_subscribe(struct applier *applier)
>
> applier->lag = ev_now(loop()) - row.tm;
> applier->last_row_time = ev_monotonic_now(loop());
> + struct replica *replica = replica_by_id(row.replica_id);
> + struct latch *latch = (replica ? &replica->order_latch :
> + &replicaset.applier.order_latch);
> + /*
> + * In a full mesh topology, the same set
> + * of changes may arrive via two
> + * concurrently running appliers. Thanks
> + * to vclock_follow() above, the first row
I don't see any vclock_follow() above. Please fix the comment.
> + * in the set will be skipped - but the
> + * remaining may execute out of order,
> + * when the following xstream_write()
> + * yields on WAL. Hence we need a latch to
> + * strictly order all changes which belong
> + * to the same server id.
> + */
> + latch_lock(latch);
> if (vclock_get(&replicaset.applier.vclock,
> row.replica_id) < row.lsn) {
> if (row.replica_id == instance_id &&
AFAIU this patch makes replicaset.applier.vclock, introduced by the
previous patch, useless.
> @@ -516,24 +532,7 @@ applier_subscribe(struct applier *applier)
> int64_t old_lsn = vclock_get(&replicaset.applier.vclock,
> row.replica_id);
> vclock_follow_xrow(&replicaset.applier.vclock, &row);
> - struct replica *replica = replica_by_id(row.replica_id);
> - struct latch *latch = (replica ? &replica->order_latch :
> - &replicaset.applier.order_latch);
> - /*
> - * In a full mesh topology, the same set
> - * of changes may arrive via two
> - * concurrently running appliers. Thanks
> - * to vclock_follow() above, the first row
> - * in the set will be skipped - but the
> - * remaining may execute out of order,
> - * when the following xstream_write()
> - * yields on WAL. Hence we need a latch to
> - * strictly order all changes which belong
> - * to the same server id.
> - */
> - latch_lock(latch);
> int res = xstream_write(applier->subscribe_stream, &row);
> - latch_unlock(latch);
> if (res != 0) {
> struct error *e = diag_last_error(diag_get());
> /**
> @@ -548,11 +547,13 @@ applier_subscribe(struct applier *applier)
> /* Rollback lsn to have a chance for a retry. */
> vclock_set(&replicaset.applier.vclock,
> row.replica_id, old_lsn);
> + latch_unlock(latch);
> diag_raise();
> }
> }
> }
> done:
> + latch_unlock(latch);
> /*
> * Stay 'orphan' until appliers catch up with
> * the remote vclock at the time of SUBSCRIBE