[Tarantool-patches] [PATCH v10 3/4] box/applier: prevent nil dereference on applier rollback

Konstantin Osipov kostja.osipov at gmail.com
Sat Feb 15 20:37:41 MSK 2020


* Cyrill Gorcunov <gorcunov at gmail.com> [20/02/14 17:06]:
> +	/*
> +	 * FIXME: Do not clear fiber()->diag since it
> +	 * cause nil dereference
> +	 *
> +	 *   applier_subscribe
> +	 *     applier_apply_tx
> +	 *       diag_raise
> +	 *
> +	 * In turn we need to redesign this code:
> +	 *  - preserve original error or log it somewhere
> +	 *  - make the error path more clear
> +	 *
> +	 *  We must never reach this point with clean diag
> +	 *  area, if we do it means we're simply screwed
> +	 *  somewhere and there is a bug.

I think this comment is obsolete now with the fix below.

You no longer clear fiber->diag.

> +	 */
> +
> +	if (!diag_is_empty(diag_get()))
> +		diag_log();
> +	else
> +		say_warn_ratelimited("applier_txn_rollback_cb: empty diag");

You can also add assert here for debug mode. It should never
happen.

> +
>  	/* Setup shared applier diagnostic area. */
>  	diag_set(ClientError, ER_WAL_IO);
> -	diag_move(&fiber()->diag, &replicaset.applier.diag);
> +	diag_add_error(&replicaset.applier.diag,
> +		       diag_last_error(&fiber()->diag));
> +

It would be nice to explain in the comment why you want to
preserve the original error in the fiber here: because when later
this fiber is joined in (add call site here), we may want to check
its diagnostics area.

>  	/* Broadcast the rollback event across all appliers. */
>  	trigger_run(&replicaset.applier.on_rollback, event);
>  	/* Rollback applier vclock to the committed one. */

-- 
Konstantin Osipov, Moscow, Russia


More information about the Tarantool-patches mailing list