[Tarantool-patches] [PATCH v6 1/2] fiber: set diagnostics at madvise/mprotect failure
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Thu Feb 13 03:08:18 MSK 2020
Thanks for the patch!
See 7 comments below.
On 06/02/2020 13:31, Cyrill Gorcunov wrote:
> Both madvise and mprotect calls can fail due to various
> reasons, mostly because of lack of free memory in the
> system.
>
> We log such cases via say_x helpers but this is not enough.
> In particular tarantool/memcached relies on diag error to be
> set to detect an error condition:
>
> | expire_fiber = fiber_new(name, memcached_expire_loop);
> | const box_error_t *err = box_error_last();
> | if (err) {
> | say_error("Can't start the expire fiber");
> | say_error("%s", box_error_message(err));
> | return -1;
> | }
>
> Thus lets use diag_set() helper here and instead of macros
> use inline functions for better readability.
>
> Fixes #4722
>
> Reported-by: Alexander Turenko <alexander.turenko at tarantool.org>
> Reviewed-by: Alexander Turenko <alexander.turenko at tarantool.org>
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
> src/lib/core/errinj.h | 2 +
> src/lib/core/fiber.c | 84 +++++++++++++++++-----
> test/box/errinj.result | 136 ++++++++++++++++++-----------------
> test/unit/CMakeLists.txt | 4 ++
> test/unit/fiber_stack.cc | 83 +++++++++++++++++++++
> test/unit/fiber_stack.result | 7 ++
> test/unit/suite.ini | 2 +-
> 7 files changed, 234 insertions(+), 84 deletions(-)
> create mode 100644 test/unit/fiber_stack.cc
> create mode 100644 test/unit/fiber_stack.result
>
> ENUM0(errinj_id, ERRINJ_LIST);
> extern struct errinj errinjs[];
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index 00ae8cded..57e4cb6ef 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -1017,7 +1047,22 @@ fiber_stack_destroy(struct fiber *fiber, struct slab_cache *slabc)
> guard = page_align_down(fiber->stack - page_size);
> else
> guard = page_align_up(fiber->stack + fiber->stack_size);
> - fiber_mprotect(guard, page_size, PROT_READ | PROT_WRITE);
> +
> + if (fiber_mprotect(guard, page_size, mprotect_flags) != 0) {
> + /*
> + * FIXME: We need some intelligent handling:
> + * say put this slab into a queue and retry
> + * to setup the original protection back in
> + * background.
> + *
> + * Note that in case if we're called from
> + * fiber_stack_create() the @mprotect_flags is
1. Nit: we use Doxygen comment style, so any word right after @ is
treated as a Doxygen command. If you want to mention some parameter,
use command @a. Like this: '@a mprotect_flags'.
> + * the same as the slab been created with, so
> + * calling mprotect for VMA with same flags
> + * won't fail.
> + */
2. Yeah, good idea. We can do it not only in background though, but
do similar to pthread and how it manages stacks to free. We can put
stacks into a list, and try to free something from that list on each
next creation of a fiber. Or even reuse right away, without freeing.
> + diag_log();
> + }
> slab_put(slabc, fiber->stack_slab);
> }
> }
> diff --git a/test/box/errinj.result b/test/box/errinj.result
> index babe36b1b..16f6b40c2 100644
> --- a/test/box/errinj.result
> +++ b/test/box/errinj.result
3. So big diff for such a small change. I propose to fix this. We
can sort result of errinj.info before printing it. It would make all
changes to errinj produce much smaller diff. The patch should be
trivial. Could you please create it?
> diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
> index 4a57597e9..212a02d36 100644
> --- a/test/unit/CMakeLists.txt
> +++ b/test/unit/CMakeLists.txt
> @@ -74,6 +74,10 @@ add_executable(fiber.test fiber.cc)
> set_source_files_properties(fiber.cc PROPERTIES COMPILE_FLAGS -O0)
> target_link_libraries(fiber.test core unit)
>
> +add_executable(fiber_stack.test fiber_stack.cc)
> +set_source_files_properties(fiber_stack.cc PROPERTIES COMPILE_FLAGS -O0)
> +target_link_libraries(fiber_stack.test core unit)
> +
> if (NOT ENABLE_GCOV)
> # This test is known to be broken with GCOV
> add_executable(guard.test guard.cc)
> diff --git a/test/unit/fiber_stack.cc b/test/unit/fiber_stack.cc
> new file mode 100644
> index 000000000..de7fe90e3
> --- /dev/null
> +++ b/test/unit/fiber_stack.cc
4. Do you need C++ here?
5. I would call it fiber_errinj.c. In future we may want to
add more tests here, which use error injections. Up to you.
> @@ -0,0 +1,83 @@
> +#include "memory.h"
> +#include "fiber.h"
> +#include "unit.h"
> +#include "trivia/util.h"
> +#include "errinj.h"
> +
> +static struct fiber_attr default_attr;
> +
> +static int
> +noop_f(va_list ap)
> +{
> + return 0;
> +}
> +
> +static void
> +fiber_stack_fail_test(void)
> +{
> + struct errinj *inj;
> + struct fiber *fiber;
> +
> + header();
6. After header() there should be a plan() call saying how
many tests you are going to do. In the end there should be
check_plan() call, which prints a summary. Could you,
please, add them?
> +
> + /*
> + * Set non-default stack size to prevent reusing of an
> + * existing fiber.
> + */
> + struct fiber_attr *fiber_attr = fiber_attr_new();
> + fiber_attr_setstacksize(fiber_attr, default_attr.stack_size * 2);
> +
> + /*
> + * Clear the fiber's diagnostics area to check that failed
> + * fiber_new() sets an error.
> + */
> + diag_clear(diag_get());
> +
> + /*
> + * Check guard page setup via mprotect. We can't test the fiber
> + * destroy path since it clears fiber's diag.
> + */
> + inj = errinj(ERRINJ_FIBER_MPROTECT, ERRINJ_INT);
> + inj->iparam = PROT_NONE;
> + fiber = fiber_new_ex("test_mprotect", fiber_attr, noop_f);
> + inj->iparam = -1;
> +
> + ok(fiber == NULL, "mprotect: failed to setup fiber guard page");
> + ok(diag_get() != NULL, "mprotect: diag is armed after error");
> +
> + /*
> + * Check madvise. We can't test the fiber destroy
> + * path since it is cleaning error.
> + */
> + diag_clear(diag_get());
> + inj = errinj(ERRINJ_FIBER_MADVISE, ERRINJ_BOOL);
> + inj->bparam = true;
> + fiber = fiber_new_ex("test_madvise", fiber_attr, noop_f);
> + inj->bparam = false;
> +
> + ok(fiber != NULL, "madvise: non critical error on madvise hint");
> + ok(diag_get() != NULL, "madvise: diag is armed after error");
> +
> + footer();
> +}
> +
> +static int
> +main_f(va_list ap)
> +{
7. Main caller of test functions usually also has header(),
plan(), check_plan(), and footer(). This is useful when
you have many tests, and each has lots of cases.
Here you have only one test, so far. Probably these calls
can be omitted in main_f(). Up to you.
More information about the Tarantool-patches
mailing list