From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 19 Aug 2020 09:04:48 +0300
From: "Alexander V. Tikhonov"
Message-ID: <20200819060448.GA24723@hpalx>
References: <07d6fd4508eda75a98bb9ea49dd58b6b14fbd99a.1597641988.git.avtikhon@tarantool.org>
 <20200818211014.22yrvxpoyhdokonb@tkn_work_nb>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200818211014.22yrvxpoyhdokonb@tkn_work_nb>
Subject: Re: [Tarantool-patches] [PATCH v5 1/2] vinyl: fix check vinyl_dir existence at bootstrap
List-Id: Tarantool development patches
To: Alexander Turenko
Cc: tarantool-patches@dev.tarantool.org

Hi Alexander,

thanks for the review, please check my comments below.

On Wed, Aug 19, 2020 at 12:10:14AM +0300, Alexander Turenko wrote:
> Thanks for the update! Now we can discuss the message.
>
> I left comments, where something confuses me in the message and the
> test.
>
> WBR, Alexander Turenko.
>
> On Mon, Aug 17, 2020 at 08:29:14AM +0300, Alexander V. Tikhonov wrote:
> > During implementation of openSUSE got failed box-tap/cfg.test.lua test.
>
> Implementation of openSUSE... build?

Corrected.

>
> > Found that when memtx_dir didn't exist and vinyl_dir existed and also
> > errno was set to ENOENT, box configuration succeeded, but it shouldn't.
> > Reason of this wrong behaviour was that not all of the failure paths in
> > xdir_scan() set errno, but the caller assumed it.
> >
> > Debugging src/box/xlog.c found that all checks were correct, but at:
> >
> >   src/box/vy_log.c:vy_log_bootstrap()
> >   src/box/vy_log.c:vy_log_begin_recovery()
> >
> > the checks on of the errno on ENOENT blocked the negative return from:
> >
> >   src/box/xlog.c:xdir_scan()
>
> 'blocked negative return' is not clear for me. I guess that you want to
> express the following two points:
>
> - A negative return value is not considered as an error when errno is
>   set to ENOENT.
> - The idea of this check is to handle the situation when vinyl_dir is
>   not exists.
>
> I would rephrase it.
>

Right, I've rewritten it and mentioned these points.

> >
> > Found that errno was already set to ENOENT before the xdir_scan() call.
> > To fix the issue the errno could be clean before the call to xdir_scan,
> > because we are interesting in it only from xdir_scan function.
>
> Strictly speaking, the reason is different: because xdir_scan() may
> return a negative value and leave errno unchanged.
>

Right, I've changed the comment to make it clearer.

> >
> > After discussions found that there were alternative better solution to
> > fix it. The fix with resetting errno was not good because xdir_scan()
> > was not system call in real and some internal routines could set it
> > to ENOENT itself, so it couldn't be controled from outside of function.
>
> 'internal routines could set it to ENOENT itself' -- I would say that
> something that is not vinyl_dir existence check.
>
> One more point: I don't see how it may happen except due to a race:
>
> - a file name is obtained from readdir(),
> - the file is deleted by another process,
> - we open the file and fail with ENOENT.
>
> And I'm not sure the race is possible. I want to say that I'm unable to
> state that 'internal routines could set it to ENOENT': I don't know, in
> fact. But the implementation can be changed in a future and there is no
> guarantee that returning with -1 and ENOENT means only lack of
> vinyl_dir.
>
> I mean, I would highlight that the sentence does not assert that the
> situation is possible with the current implementation. Just that we have
> no corresponding guarantee.
>

Right, xdir_scan() is not a system call and can be changed in a way
that breaks the check.

> Typo: was not system call -> is not a system call
>
> Typo: controled -> controlled.
>

Corrected.

> >
> > To be sure in behaviour of the changing errno decided to pass a flag to
> > xdir_scan() if the directory should exist.
>
> 'behaviour of the changing errno' is vague. I guess the idea of the
> sentence is that the variant with the flag is better, because (I guess)
> it is more explicit and should be less fragile.
>

Here I've added more comments.

> >
> > Closes #4594
> > Needed for #4562
> >
> > Co-authored-by: Alexander Turenko
> > ---
> >
> > Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4562-suse-pack-full-ci
> > Issue: https://github.com/tarantool/tarantool/issues/4562
>
> > diff --git a/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> > new file mode 100755
> > index 000000000..cbf7b1f35
> > --- /dev/null
> > +++ b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> > @@ -0,0 +1,47 @@
> > +#!/usr/bin/env tarantool
> > +
> > +local tap = require('tap')
> > +local test = tap.test('cfg')
> > +local fio = require('fio')
> > +test:plan(1)
> > +
> > +local tarantool_bin = arg[-1]
> > +local PANIC = 256
> > +
> > +function run_script(code)
>
> Nit: It is the rule of thumb to don't fill the global namespace with
> fields (variables and functions) if you actually don't need it.
>

Corrected.

> > +    local dir = fio.tempdir()
> > +    local script_path = fio.pathjoin(dir, 'script.lua')
> > +    local script = fio.open(script_path, {'O_CREAT', 'O_WRONLY', 'O_APPEND'},
> > +        tonumber('0777', 8))
> > +    script:write(code)
> > +    script:write("\nos.exit(0)")
> > +    script:close()
> > +    local cmd = [[/bin/sh -c 'cd "%s" && "%s" ./script.lua 2> /dev/null']]
> > +    local res = os.execute(string.format(cmd, dir, tarantool_bin))
> > +    fio.rmtree(dir)
> > +    return res
> > +end
> > +
> > +--
> > +-- gh-4594: when memtx_dir is not exists, but vinyl_dir exists and
> > +-- errno is set to ENOENT, box configuration succeeds, however it
> > +-- should not
> > +--
> > +vinyl_dir = fio.tempdir()
>
> Same here, it would be good to use 'local' here.
>

Corrected.

> > +run_script(string.format([[
> > +box.cfg{vinyl_dir = '%s'}
> > +s = box.schema.space.create('test', {engine = 'vinyl'})
> > +s:create_index('pk')
> > +os.exit(0)
> > +]], vinyl_dir))
> > +code = string.format([[
>
> Same here.
>

Corrected.

> > +local errno = require('errno')
> > +errno(errno.ENOENT)
> > +box.cfg{vinyl_dir = '%s'}
> > +os.exit(0)
> > +]], vinyl_dir)
> > +test:is(run_script(code), PANIC, "bootstrap with ENOENT from non-empty vinyl_dir")
> > +fio.rmtree(vinyl_dir)
>
> I propose to clarify the test steps a bit:
>

Corrected.

> | vinyl_dir = <...>
> |
> | -- Fill vinyl_dir.
> | run_script(<...>)
> |
> | -- Verify the case described above.
> | local code = <...>
> | test:is(<...>)
> |
> | -- Remove vinyl_dir.
> | fio.rmtree(vinyl_dir)
>
> > +
> > +test:check()
> > +os.exit(0)
>
> I suggest to highlight a test status in the exit code (it follows the
> Lua Style Guide proposal [1]):
>
> | os.exit(test:check() and 0 or 1)
>
> [1]: https://github.com/tarantool/doc/issues/1004

Corrected.
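
To illustrate the point discussed above (a failure path that leaves errno
unchanged, versus passing an explicit flag), here is a minimal standalone
sketch in C. It is not the actual patch and not the real xdir_scan(): the
names scan_dir(), bootstrap_by_errno(), bootstrap_by_flag(), and the
is_dir_required and inject_failure parameters are all hypothetical and exist
only for this example. inject_failure models a failure path that returns -1
without setting errno.

```c
#include <dirent.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a directory scanner (NOT the real xdir_scan()).
 * With inject_failure set it fails without touching errno, modelling a
 * failure path that forgets to set it.
 */
static int
scan_dir(const char *path, bool is_dir_required, bool inject_failure)
{
	if (inject_failure)
		return -1;
	DIR *dh = opendir(path);
	if (dh == NULL) {
		/* A missing directory is an error only if it is required. */
		if (errno == ENOENT && !is_dir_required)
			return 0;
		return -1;
	}
	/* Directory entries would be processed here. */
	closedir(dh);
	return 0;
}

/* Fragile caller: infers "directory is missing" from errno after a failure. */
static int
bootstrap_by_errno(const char *path, bool inject_failure)
{
	if (scan_dir(path, true, inject_failure) < 0 && errno != ENOENT)
		return -1;
	return 0;
}

/* Explicit caller: states up front that a missing directory is acceptable. */
static int
bootstrap_by_flag(const char *path, bool inject_failure)
{
	return scan_dir(path, false, inject_failure);
}
```

If errno already holds ENOENT from some earlier call, bootstrap_by_errno()
treats the injected failure as "directory does not exist" and reports
success, while bootstrap_by_flag() propagates the error. A directory that is
really missing is still accepted by both variants.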