[tarantool-patches] Re: [PATCH] replication: freeze join vclock before read view

Sun Oct 27 20:55:16 MSK 2019

* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/10/27 20:39]:
> Hi! Thanks for the patch!
> 
> > diff --git a/src/box/relay.cc b/src/box/relay.cc
> > index 74588cba7..ee23dd2aa 100644
> > --- a/src/box/relay.cc
> > +++ b/src/box/relay.cc
> > @@ -298,6 +298,8 @@ relay_initial_join(int fd, uint64_t sync, struct vclock *vclock)
> >  		relay_delete(relay);
> >  	});
> >  
> > +	vclock_copy(vclock, &replicaset.vclock);
> > +
> >  	/* Freeze a read view in engines. */
> >  	struct engine_join_ctx ctx;
> >  	engine_prepare_join_xc(&ctx);
> > @@ -312,8 +314,6 @@ relay_initial_join(int fd, uint64_t sync, struct vclock *vclock)
> >  	if (wal_sync() != 0)
> >  		diag_raise();
> >  
> > -	vclock_copy(vclock, &replicaset.vclock);
> > -
> 
> I am not sure that you can just move the copy before wal sync.
> As far as I understand, wal_sync() may commit some records and
> change replicaset.vclock. As a result, after your patch
> vclock will store not the latest vclock, won't it?

It should not store the latest vclock. It should store join vclock, it is used
by replica to subscribe:

A correct fix is to use wal_begin_checkpoint() instead of wal_sync() to obtain
vclock, since it will return the checkpoint vclock, from the checkpoint taken
in engine_prepare_join().  And there should be no yields between
engine_prepare_join and wal_begin_checkpoint(). 

It's the same trick used for usual checkpointing, the code here should
follow the same pattern.

> >  	/* Respond to the JOIN request with the current vclock. */
> >  	struct xrow_header row;
> >  	xrow_encode_vclock_xc(&row, vclock);

-- 
Konstantin Osipov, Moscow, Russia