[Tarantool-patches] [PATCH v7 2/2] fiber: leak slab if unable to bring prots back

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Feb 14 02:26:04 MSK 2020


Thanks for the fixes!

See 2 comments below.

On 13/02/2020 21:56, Cyrill Gorcunov wrote:
> In case if we unable to revert guard page back to
> read|write we should never use such slab again.
> 
> Initially I thought of just put panic here and
> exit but it is too destructive. I think better
> print an error and continue. If node admin ignore
> this message then one moment at future there won't
> be slab left for use and creating new fibers get
> prohibited.
> 
> In future (hopefully near one) we plan to drop
> guard pages to prevent VMA fracturing and use
> stack marks instead.
> 
> Reviewed-by: Alexander Turenko <alexander.turenko at tarantool.org>
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
>  src/lib/core/fiber.c         | 11 +++++++++--
>  test/unit/fiber_stack.c      | 29 ++++++++++++++++++++++++++---
>  test/unit/fiber_stack.result |  4 +++-
>  3 files changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/test/unit/fiber_stack.c b/test/unit/fiber_stack.c
> index 103dfaf0d..5eb128ea1 100644
> --- a/test/unit/fiber_stack.c
> +++ b/test/unit/fiber_stack.c
> @@ -59,6 +58,30 @@ main_f(va_list ap)
>  	ok(fiber != NULL, "madvise: non critical error on madvise hint");
>  	ok(diag_get() != NULL, "madvise: diag is armed after error");
>  
> +	/*
> +	 * Check if we leak on fiber descrution.

1. You probably have meant 'destruction'.

> +	 * We will print an error and result get
> +	 * compared by testing engine.
> +	 */
> +	fiber_attr_delete(fiber_attr);
> +	fiber_attr = fiber_attr_new();
> +	fiber_attr->flags |= FIBER_CUSTOM_STACK;
> +	fiber_attr->stack_size = 64 << 10;
> +
> +	diag_clear(diag_get());
> +	fiber = fiber_new_ex("test_madvise", fiber_attr, noop_f);
> +	ok(fiber != NULL, "fiber with custom stack");
> +	fiber_set_joinable(fiber, true);
> +
> +	inj = errinj(ERRINJ_FIBER_MPROTECT, ERRINJ_INT);
> +	inj->iparam = PROT_READ | PROT_WRITE;
> +
> +	fiber_start(fiber);
> +	fiber_join(fiber);
> +
> +	inj->iparam = -1;
> +
> +	fiber_attr_delete(fiber_attr);

2. When I run the test without the fix, I got 'Bus error'.
Was it intended? In my email I proposed to compare memory
statistics, but crash also looks ok, maybe. If this is what
you meant. And then there should be a comment saying, that we
expect fiber_join() to crash when no leak is done, because it
would put the slab into the cash, where it would be read/written
against protection. This was not obvious until I reverted your
patch and run the test.

IMO, problem of that way of testing is that if internals of
core/fiber.c one day will stop touching stack of a just
destroyed fiber (for example, a pointer at it would be cached
somewhere for newer fibers), this test will happily pass, even
though it perhaps should not. It will pass regardless of what
happens with a fiber stack unless something tries to read/write
it.

>  	footer();
>  
>  	ev_break(loop(), EVBREAK_ALL);
> diff --git a/test/unit/fiber_stack.result b/test/unit/fiber_stack.result
> index 43ff74b2f..f0f5a59c2 100644
> --- a/test/unit/fiber_stack.result
> +++ b/test/unit/fiber_stack.result
> @@ -1,8 +1,10 @@
>  SystemError fiber mprotect failed: Cannot allocate memory
> +fiber: Can't put guard page to slab. Leak 57344 bytes: Cannot allocate memory
>  	*** main_f ***
> -1..4
> +1..5
>  ok 1 - mprotect: failed to setup fiber guard page
>  ok 2 - mprotect: diag is armed after error
>  ok 3 - madvise: non critical error on madvise hint
>  ok 4 - madvise: diag is armed after error
> +ok 5 - fiber with custom stack
>  	*** main_f: done ***
> 


More information about the Tarantool-patches mailing list