[Tarantool-patches] [PATCH v7 3/5] box/applier: fix nil dereference in applier rollback

Konstantin Osipov kostja.osipov at gmail.com
Wed Feb 5 01:19:02 MSK 2020


* Cyrill Gorcunov <gorcunov at gmail.com> [20/01/28 22:58]:
> +	/*
> +	 * Something really bad happened, we can't proceed
> +	 * thus stop the applier and throw FiberIsCancelled
> +	 * exception which will be catched by the caller
> +	 * and the fiber gracefully finish.
> +	 *
> +	 * FIXME: Need to make sure that this is a really
> +	 * final error where we can't longer proceed and should
> +	 * zap the applier, probably we could reconnect and
> +	 * retry instead?
> +	 */
>  	fiber_cancel(applier->reader);
> +	diag_set(FiberIsCancelled);

Now that I have seen the entire series and the test case, I think
there are two different issues here and only one of them is
leading to a crash.

One, is that we reset the original error with some vague
ER_WAL_IO. This shouldn't happen, but it's harmless. Still, your
fix for it is good. I don't think though it should check for !e.

Two, is a failure to propagate the error from the cancelled
applier fiber. 

I think it's better to fix the call site to raise
replicaset.applier.diag than to add a yet another vague error
(fiberiscancelled) topped with prolific comments about how
the current code is broken.

-- 
Konstantin Osipov, Moscow, Russia


More information about the Tarantool-patches mailing list