[Tarantool-patches] [PATCH 2/2] vinyl: drop wasted runs in case range recovery fails

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun May 3 22:21:00 MSK 2020


Thanks for the patch!

On 30/04/2020 21:27, Nikita Pettik wrote:
> If recovery process fails during range restoration, range itself is
> deleted and recovery is assumed to be finished as failed (in case of
> casual i.e. not forced recovery). During recovery of particular range,
> runs to be restored are reffed twice: once when they are created at
> vy_run_new() and once when they are attached to slice. This fact is
> taken into consideration and after all ranges are recovered all runs of
> lsm tree are unreffed so that slices own run resources. However, if

Nit: 'unreffed' -> 'unrefed'. Since this is a shortcut for
'unreferenced', where number of 'f' is 1.

The same for 'reffed'.

> range recovery fails, it is dropped alongside with slices which in turn
> results in unreffing runs - this is not accounted. In this case, once
> again unreffing such runs would lead to their destruction. On the other
> hand, iteration over runs may turn out to be unsafe, so we should use
> rlist_foreach_entry_safe(). Moreover, we should explicitly unaccount
> these runs calling vy_lsm_remove_run().

Sorry, I didn't understand almost everything from the message :D
But I did from the code. You may want to rephrase/restruct the text
if you think it may help.

> diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
> index 3d3f41b7a..81b011c69 100644
> --- a/src/box/vy_lsm.c
> +++ b/src/box/vy_lsm.c
> @@ -604,9 +604,17 @@ vy_lsm_recover(struct vy_lsm *lsm, struct vy_recovery *recovery,
>  	 * of each recovered run. We need to drop the extra
>  	 * references once we are done.
>  	 */
> -	struct vy_run *run;
> -	rlist_foreach_entry(run, &lsm->runs, in_lsm) {
> -		assert(run->refs > 1);
> +	struct vy_run *run, *next_run;
> +	rlist_foreach_entry_safe(run, &lsm->runs, in_lsm, next_run) {
> +		/*
> +		 * In case vy_lsm_recover_range() failed, slices
> +		 * are already deleted and runs are unreffed. So
> +		 * we have nothing to do but finish run clean-up.
> +		 */
> +		if (run->refs == 1) {

Reference counter looks like not a good information channel.
Could you use run->fd to check whether the run was really recovered?
vy_run_recover() leaves it -1, when fails.

Otherwise this won't work the second when we will ref the run anywhere
else.


More information about the Tarantool-patches mailing list