[Tarantool-patches] [PATCH v5 1/2] vinyl: fix check vinyl_dir existence at bootstrap
Alexander Turenko
alexander.turenko at tarantool.org
Wed Aug 19 00:10:14 MSK 2020
Thanks for the update! Now we can discuss the message.
I left comments, where something confuses me in the message and the
test.
WBR, Alexander Turenko.
On Mon, Aug 17, 2020 at 08:29:14AM +0300, Alexander V. Tikhonov wrote:
> During implementation of openSUSE got failed box-tap/cfg.test.lua test.
Implementation of openSUSE... build?
> Found that when memtx_dir didn't exist and vinyl_dir existed and also
> errno was set to ENOENT, box configuration succeeded, but it shouldn't.
> Reason of this wrong behaviour was that not all of the failure paths in
> xdir_scan() set errno, but the caller assumed it.
>
> Debugging src/box/xlog.c found that all checks were correct, but at:
>
> src/box/vy_log.c:vy_log_bootstrap()
> src/box/vy_log.c:vy_log_begin_recovery()
>
> the checks on of the errno on ENOENT blocked the negative return from:
>
> src/box/xlog.c:xdir_scan()
'blocked negative return' is not clear for me. I guess that you want to
express the following two points:
- A negative return value is not considered as an error when errno is
set to ENOENT.
- The idea of this check is to handle the situation when vinyl_dir is
not exists.
I would rephrase it.
>
> Found that errno was already set to ENOENT before the xdir_scan() call.
> To fix the issue the errno could be clean before the call to xdir_scan,
> because we are interesting in it only from xdir_scan function.
Strictly speaking, the reason is different: because xdir_scan() may
return a negative value and leave errno unchanged.
>
> After discussions found that there were alternative better solution to
> fix it. The fix with resetting errno was not good because xdir_scan()
> was not system call in real and some internal routines could set it
> to ENOENT itself, so it couldn't be controled from outside of function.
'internal routines could set it to ENOENT itself' -- I would say that
something that is not vinyl_dir existence check.
One more point: I don't see how it may happen except due to a race:
- a file name is obtained from readdir(),
- the file is deleted by another process,
- we open the file and fail with ENOENT.
And I'm not sure the race is possible. I want to say that I'm unable to
state that 'internal routines could set it to ENOENT': I don't know, in
fact. But the implementation can be changed in a future and there is no
guarantee that returning with -1 and ENOENT means only lack of
vinyl_dir.
I mean, I would highlight that the sentence does not assert that the
situation is possible with the current implementation. Just that we have
no corresponding guarantee.
Typo: was not system call -> is not a system call
Typo: controled -> controlled.
>
> To be sure in behaviour of the changing errno decided to pass a flag to
> xdir_scan() if the directory should exist.
'behaviour of the changing errno' is vague. I guess the idea of the
sentence is that the variant with the flag is better, because (I guess)
it is more explicit and should be less fragile.
>
> Closes #4594
> Needed for #4562
>
> Co-authored-by: Alexander Turenko <alexander.turenko at tarantool.org>
> ---
>
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-4562-suse-pack-full-ci
> Issue: https://github.com/tarantool/tarantool/issues/4562
> diff --git a/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> new file mode 100755
> index 000000000..cbf7b1f35
> --- /dev/null
> +++ b/test/box-tap/gh-4562-errno-at-xdir_scan.test.lua
> @@ -0,0 +1,47 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +local test = tap.test('cfg')
> +local fio = require('fio')
> +test:plan(1)
> +
> +local tarantool_bin = arg[-1]
> +local PANIC = 256
> +
> +function run_script(code)
Nit: It is the rule of thumb to don't fill the global namespace with
fields (variables and functions) if you actually don't need it.
> + local dir = fio.tempdir()
> + local script_path = fio.pathjoin(dir, 'script.lua')
> + local script = fio.open(script_path, {'O_CREAT', 'O_WRONLY', 'O_APPEND'},
> + tonumber('0777', 8))
> + script:write(code)
> + script:write("\nos.exit(0)")
> + script:close()
> + local cmd = [[/bin/sh -c 'cd "%s" && "%s" ./script.lua 2> /dev/null']]
> + local res = os.execute(string.format(cmd, dir, tarantool_bin))
> + fio.rmtree(dir)
> + return res
> +end
> +
> +--
> +-- gh-4594: when memtx_dir is not exists, but vinyl_dir exists and
> +-- errno is set to ENOENT, box configuration succeeds, however it
> +-- should not
> +--
> +vinyl_dir = fio.tempdir()
Same here, it would be good to use 'local' here.
> +run_script(string.format([[
> +box.cfg{vinyl_dir = '%s'}
> +s = box.schema.space.create('test', {engine = 'vinyl'})
> +s:create_index('pk')
> +os.exit(0)
> +]], vinyl_dir))
> +code = string.format([[
Same here.
> +local errno = require('errno')
> +errno(errno.ENOENT)
> +box.cfg{vinyl_dir = '%s'}
> +os.exit(0)
> +]], vinyl_dir)
> +test:is(run_script(code), PANIC, "bootstrap with ENOENT from non-empty vinyl_dir")
> +fio.rmtree(vinyl_dir)
I propose to clarify the test steps a bit:
| vinyl_dir = <...>
|
| -- Fill vinyl_dir.
| run_script(<...>)
|
| -- Verify the case described above.
| local code = <...>
| test:is(<...>)
|
| -- Remove vinyl_dir.
| fio.rmtree(vinyl_dir)
> +
> +test:check()
> +os.exit(0)
I suggest to highlight a test status in the exit code (it follows the
Lua Style Guide proposal [1]):
| os.exit(test:check() and 0 or 1)
[1]: https://github.com/tarantool/doc/issues/1004
More information about the Tarantool-patches
mailing list