From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp52.i.mail.ru (smtp52.i.mail.ru [94.100.177.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id D6653469710 for ; Sun, 3 May 2020 22:21:02 +0300 (MSK) References: <5012bd8eb07b5379eb70aa777402c7cd566a3b34.1588273848.git.korablev@tarantool.org> From: Vladislav Shpilevoy Message-ID: <959a3f72-6b95-30b1-c838-3b9e2bebc04c@tarantool.org> Date: Sun, 3 May 2020 21:21:00 +0200 MIME-Version: 1.0 In-Reply-To: <5012bd8eb07b5379eb70aa777402c7cd566a3b34.1588273848.git.korablev@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH 2/2] vinyl: drop wasted runs in case range recovery fails List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikita Pettik , tarantool-patches@dev.tarantool.org Thanks for the patch! On 30/04/2020 21:27, Nikita Pettik wrote: > If recovery process fails during range restoration, range itself is > deleted and recovery is assumed to be finished as failed (in case of > casual i.e. not forced recovery). During recovery of particular range, > runs to be restored are reffed twice: once when they are created at > vy_run_new() and once when they are attached to slice. This fact is > taken into consideration and after all ranges are recovered all runs of > lsm tree are unreffed so that slices own run resources. However, if Nit: 'unreffed' -> 'unrefed'. Since this is a shortcut for 'unreferenced', where number of 'f' is 1. The same for 'reffed'. > range recovery fails, it is dropped alongside with slices which in turn > results in unreffing runs - this is not accounted. In this case, once > again unreffing such runs would lead to their destruction. On the other > hand, iteration over runs may turn out to be unsafe, so we should use > rlist_foreach_entry_safe(). Moreover, we should explicitly unaccount > these runs calling vy_lsm_remove_run(). Sorry, I didn't understand almost everything from the message :D But I did from the code. You may want to rephrase/restruct the text if you think it may help. > diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c > index 3d3f41b7a..81b011c69 100644 > --- a/src/box/vy_lsm.c > +++ b/src/box/vy_lsm.c > @@ -604,9 +604,17 @@ vy_lsm_recover(struct vy_lsm *lsm, struct vy_recovery *recovery, > * of each recovered run. We need to drop the extra > * references once we are done. > */ > - struct vy_run *run; > - rlist_foreach_entry(run, &lsm->runs, in_lsm) { > - assert(run->refs > 1); > + struct vy_run *run, *next_run; > + rlist_foreach_entry_safe(run, &lsm->runs, in_lsm, next_run) { > + /* > + * In case vy_lsm_recover_range() failed, slices > + * are already deleted and runs are unreffed. So > + * we have nothing to do but finish run clean-up. > + */ > + if (run->refs == 1) { Reference counter looks like not a good information channel. Could you use run->fd to check whether the run was really recovered? vy_run_recover() leaves it -1, when fails. Otherwise this won't work the second when we will ref the run anywhere else.