[Tarantool-patches] [PATCH 2/2] vinyl: drop wasted runs in case range recovery fails

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri May 8 00:47:49 MSK 2020


Thanks for the explanation and the new commit message!

>>> diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
>>> index 3d3f41b7a..81b011c69 100644
>>> --- a/src/box/vy_lsm.c
>>> +++ b/src/box/vy_lsm.c
>>> @@ -604,9 +604,17 @@ vy_lsm_recover(struct vy_lsm *lsm, struct vy_recovery *recovery,
>>>  	 * of each recovered run. We need to drop the extra
>>>  	 * references once we are done.
>>>  	 */
>>> -	struct vy_run *run;
>>> -	rlist_foreach_entry(run, &lsm->runs, in_lsm) {
>>> -		assert(run->refs > 1);
>>> +	struct vy_run *run, *next_run;
>>> +	rlist_foreach_entry_safe(run, &lsm->runs, in_lsm, next_run) {
>>> +		/*
>>> +		 * In case vy_lsm_recover_range() failed, slices
>>> +		 * are already deleted and runs are unreffed. So
>>> +		 * we have nothing to do but finish run clean-up.
>>> +		 */
>>> +		if (run->refs == 1) {
>>
>> The reference counter does not look like a good information channel.
>> Could you use run->fd to check whether the run was really recovered?
>> vy_run_recover() leaves it at -1 when it fails.
>>
>> Otherwise this will stop working the moment we ref the run anywhere
>> else.
> 
> Firstly, the lsm is not restored at this point, ergo it is not functional
> and the run can't be reffed somewhere else - its life span is clearly
> defined. Secondly, the problem is not in the last run (which failed to
> recover) but in those which are already recovered at the moment.
> Recovered runs feature valid fds. Finally, slice recovery may fail
> not only in vy_run_recover(), but also due to OOM, a broken vylog, etc.
> All these scenarios lead to the same situation.

Yeah, fair. Then what about run->slice_count? If it is zero, then the
run is not kept by any slice. So we can look at slice_count == 0 and
assert refs == 1. Or look at refs == 1 and assert slice_count == 0.
Whatever. Will that work?
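
A rough sketch of the first variant, just to show the shape of the check.
It runs on a toy structure, not on the real struct vy_run: toy_run,
toy_finish_recovery() and the 'dropped' flag are made up for illustration,
and the real clean-up call is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for a run. Not struct vy_run. */
struct toy_run {
	int refs;        /* reference counter */
	int slice_count; /* number of slices keeping this run */
	int dropped;     /* set to 1 when the run is cleaned up */
};

/*
 * Mimics the loop at the end of recovery: a run that is not kept
 * by any slice must hold only the recovery reference and is
 * dropped; any other run just loses its extra reference.
 * Returns the number of dropped runs.
 */
int
toy_finish_recovery(struct toy_run *runs, size_t count)
{
	int dropped = 0;
	for (size_t i = 0; i < count; i++) {
		struct toy_run *run = &runs[i];
		if (run->slice_count == 0) {
			/* No slice uses the run: only our ref is left. */
			assert(run->refs == 1);
			run->refs = 0;
			run->dropped = 1;
			dropped++;
		} else {
			/* Slices hold their own refs, release ours. */
			assert(run->refs > 1);
			run->refs--;
		}
	}
	return dropped;
}
```

The decision is taken on slice_count, and refs is only cross-checked
by the assertion, so a future ref taken somewhere else would trip the
assert instead of silently changing the behaviour.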

