[Tarantool-patches] [PATCH v5 1/2] vinyl: fix check vinyl_dir existence at bootstrap
Alexander V. Tikhonov
avtikhon at tarantool.org
Wed Aug 19 09:04:48 MSK 2020
Hi Alexander, thanks for the review. Please see my comments below.
On Wed, Aug 19, 2020 at 12:10:14AM +0300, Alexander Turenko wrote:
> Thanks for the update! Now we can discuss the message.
>
> I left comments, where something confuses me in the message and the
> test.
>
> WBR, Alexander Turenko.
>
> On Mon, Aug 17, 2020 at 08:29:14AM +0300, Alexander V. Tikhonov wrote:
> > During implementation of openSUSE got failed box-tap/cfg.test.lua test.
>
> Implementation of openSUSE... build?
Corrected.
>
> > Found that when memtx_dir didn't exist and vinyl_dir existed and also
> > errno was set to ENOENT, box configuration succeeded, but it shouldn't.
> > Reason of this wrong behaviour was that not all of the failure paths in
> > xdir_scan() set errno, but the caller assumed it.
> >
> > Debugging src/box/xlog.c found that all checks were correct, but at:
> >
> > src/box/vy_log.c:vy_log_bootstrap()
> > src/box/vy_log.c:vy_log_begin_recovery()
> >
> > the checks on of the errno on ENOENT blocked the negative return from:
> >
> > src/box/xlog.c:xdir_scan()
>
> 'blocked negative return' is not clear for me. I guess that you want to
> express the following two points:
>
> - A negative return value is not considered as an error when errno is
> set to ENOENT.
> - The idea of this check is to handle the situation when vinyl_dir is
> not exists.
>
> I would rephrase it.
>
Right, I've rewritten it and mentioned these points.
> >
> > Found that errno was already set to ENOENT before the xdir_scan() call.
> > To fix the issue the errno could be clean before the call to xdir_scan,
> > because we are interesting in it only from xdir_scan function.
>
> Strictly speaking, the reason is different: because xdir_scan() may
> return a negative value and leave errno unchanged.
>
Right, I've changed the comment to make it clearer.
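
To illustrate the point, here is a minimal C sketch of the fragile pattern. It is not the Tarantool code: scan_without_errno(), fragile_caller_sees_error() and demo_stale_errno() are made-up names, and the scan routine is reduced to a failure path that never touches errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a scan routine. It fails with -1 on a
 * path that does not set errno (an internal check, not a failed
 * system call).
 */
int
scan_without_errno(void)
{
	return -1;
}

/*
 * The fragile caller pattern: a negative return is forgiven when
 * errno == ENOENT, on the assumption that the callee has set it.
 */
bool
fragile_caller_sees_error(void)
{
	int rc = scan_without_errno();
	return rc < 0 && errno != ENOENT;
}

/* Shows that the verdict depends on whatever errno held before. */
void
demo_stale_errno(void)
{
	errno = ENOENT;                       /* stale value from earlier code */
	assert(!fragile_caller_sees_error()); /* the failure is swallowed */
	errno = 0;
	assert(fragile_caller_sees_error());  /* the same failure is reported */
}
```

The callee behaves identically in both calls. Only the leftover errno differs, and that is what the caller ends up testing.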
> >
> > After discussions found that there were alternative better solution to
> > fix it. The fix with resetting errno was not good because xdir_scan()
> > was not system call in real and some internal routines could set it
> > to ENOENT itself, so it couldn't be controled from outside of function.
>
> 'internal routines could set it to ENOENT itself' -- I would say that
> something that is not vinyl_dir existence check.
>
> One more point: I don't see how it may happen except due to a race:
>
> - a file name is obtained from readdir(),
> - the file is deleted by another process,
> - we open the file and fail with ENOENT.
>
> And I'm not sure the race is possible. I want to say that I'm unable to
> state that 'internal routines could set it to ENOENT': I don't know, in
> fact. But the implementation can be changed in a future and there is no
> guarantee that returning with -1 and ENOENT means only lack of
> vinyl_dir.
>
> I mean, I would highlight that the sentence does not assert that the
> situation is possible with the current implementation. Just that we have
> no corresponding guarantee.
>
Right, xdir_scan() is not a system call and can be changed in a way
that can break the check.
> Typo: was not system call -> is not a system call
>
> Typo: controled -> controlled.
>
Corrected.
> >
> > To be sure in behaviour of the changing errno decided to pass a flag to
> > xdir_scan() if the directory should exist.
>
> 'behaviour of the changing errno' is vague. I guess the idea of the
> sentence is that the variant with the flag is better, because (I guess)
> it is more explicit and should be less fragile.
>
Here I've added more comments.
> >
> > Closes #4594
> > Needed for #4562
> >
> > Co-authored-by: Alexander Turenko <alexander.turenko at tarantool.org>
> > ---
> >
> > Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4562-suse-pack-full-ci
> > Issue: https://github.com/tarantool/tarantool/issues/4562
>
> > diff --git a/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> > new file mode 100755
> > index 000000000..cbf7b1f35
> > --- /dev/null
> > +++ b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> > @@ -0,0 +1,47 @@
> > +#!/usr/bin/env tarantool
> > +
> > +local tap = require('tap')
> > +local test = tap.test('cfg')
> > +local fio = require('fio')
> > +test:plan(1)
> > +
> > +local tarantool_bin = arg[-1]
> > +local PANIC = 256
> > +
> > +function run_script(code)
>
> Nit: It is the rule of thumb to don't fill the global namespace with
> fields (variables and functions) if you actually don't need it.
>
Corrected.
> > + local dir = fio.tempdir()
> > + local script_path = fio.pathjoin(dir, 'script.lua')
> > + local script = fio.open(script_path, {'O_CREAT', 'O_WRONLY', 'O_APPEND'},
> > + tonumber('0777', 8))
> > + script:write(code)
> > + script:write("\nos.exit(0)")
> > + script:close()
> > + local cmd = [[/bin/sh -c 'cd "%s" && "%s" ./script.lua 2> /dev/null']]
> > + local res = os.execute(string.format(cmd, dir, tarantool_bin))
> > + fio.rmtree(dir)
> > + return res
> > +end
> > +
> > +--
> > +-- gh-4594: when memtx_dir is not exists, but vinyl_dir exists and
> > +-- errno is set to ENOENT, box configuration succeeds, however it
> > +-- should not
> > +--
> > +vinyl_dir = fio.tempdir()
>
> Same here, it would be good to use 'local' here.
>
Corrected.
> > +run_script(string.format([[
> > +box.cfg{vinyl_dir = '%s'}
> > +s = box.schema.space.create('test', {engine = 'vinyl'})
> > +s:create_index('pk')
> > +os.exit(0)
> > +]], vinyl_dir))
> > +code = string.format([[
>
> Same here.
>
Corrected.
> > +local errno = require('errno')
> > +errno(errno.ENOENT)
> > +box.cfg{vinyl_dir = '%s'}
> > +os.exit(0)
> > +]], vinyl_dir)
> > +test:is(run_script(code), PANIC, "bootstrap with ENOENT from non-empty vinyl_dir")
> > +fio.rmtree(vinyl_dir)
>
> I propose to clarify the test steps a bit:
>
Corrected.
> | vinyl_dir = <...>
> |
> | -- Fill vinyl_dir.
> | run_script(<...>)
> |
> | -- Verify the case described above.
> | local code = <...>
> | test:is(<...>)
> |
> | -- Remove vinyl_dir.
> | fio.rmtree(vinyl_dir)
>
> > +
> > +test:check()
> > +os.exit(0)
>
> I suggest to highlight a test status in the exit code (it follows the
> Lua Style Guide proposal [1]):
>
> | os.exit(test:check() and 0 or 1)
>
> [1]: https://github.com/tarantool/doc/issues/1004
Corrected.