Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
Date: Thu, 11 Jun 2020 18:18:01 +0200
Message-ID: <c00b3856-7938-5d8b-e51a-279f1def539d@tarantool.org>
In-Reply-To: <20200607165816.4r5s7ah4pa5uihyl@tkn_work_nb>

Thanks for the fixes!

See 2 comments below.

>     A merge source API is designed to be quite abstract: the base structure
>     and virtual methods do not depend on Lua anyhow. Each source should
>     implement next() and destroy() virtual methods, which may be called from
>     C without a Lua state. This design allows to use any source as from C as
>     well as from Lua. The Lua API is based on the C API and supports any
>     source. Even merger itself is implemented in pure C according to the
>     merge source API and so may be used from Lua.
>     
>     A particular source implementation may use a Lua state internally, but
>     it is not part of the API and should be hidden under hood. In fact all
>     sources we have now (except merger itself) store some references in
>     LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
>     the next() virtual method.
>     
>     Before this patch, the sources ('buffer', 'table', 'tuple') assume that
>     a Lua state always exists in the fiber storage of a fiber, where next()
>     is called. This looks valid on the first glance, because it may be
>     called either from a Lua code or from merger, which in turn is called
>     from a Lua code. However background fibers (they serve binary protocol
>     requests) do not store a Lua state in the fiber storage even for Lua
>     call / eval requests.
>     
>     Possible solution would be always store a Lua state in a fiber storage.
>     There are two reasons why it is not implemented here:
>     
>     1. There should be a decision about right balance between speed and
>        memory footprint and maybe some eviction strategy for cached Lua
>        states. Not sure we can just always store a state in each background
>        fiber. It would be wasteful for instances that serve box DQL / DML,
>        SQL and/or C procedure calls.
>     2. Technically contract of the next() method would assume that a Lua
>        state should exist in a fiber storage. Such requirement looks quite
>        unnatural for a C API and also looks fragile: what if we'll implement
>        some less wasteful Lua state caching strategy and the assumption
>        about presence of the Lua state will get broken?
>     
>     Obviously, next() will spend extra time to create a temporary state when
>     it is called from a background fiber. We should reuse existing Lua state
>     at least when a Lua call is performed via a binary protocol. I consider
>     it as the optimization and will solve in the next commit.
>     
>     A few words about the implementation. I have added three functions,
>     which acquire a temporary Lua state, call a function and release the
>     state. It may be squashed into one function that would accept a function
>     pointer and variable number of arguments. However GCC does not
>     devirtualize such calls at -O2 level, so it seems it is better to avoid
>     this. It maybe possible to write some weird macro that will technically
>     reduce code duplication, but I prefer to write in C, not some macro
>     based meta-language.
>     
>     Usage of the fiber-local Lua state is not quite correct now: merge

1. merge -> merger.

>     source code may left garbage on a stack in case of failures (like
>     unexpected result of Lua iterator generator function). This behaviour is
>     kept as is here, but it will be resolved by the next patch.
>     
>     Fixes #4954
> 
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
>  	return tuple;
>  }
>  
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> +	if (fiber()->storage.lua.stack != NULL) {
> +		*coro_ref = LUA_REFNIL;
> +		return fiber()->storage.lua.stack;
> +	}
> +
> +	/*
> +	 * luaT_newthread() pops the new Lua state from
> +	 * tarantool_L and it is right thing to do: if we'll push
> +	 * something to it and yield, then another fiber will not
> +	 * know that a stack top is changed and may operate on a
> +	 * wrong slot.
> +	 *
> +	 * Second, many requests that push a value to tarantool_L
> +	 * and yield may exhaust available slots on the stack.
> +	 */
> +	struct lua_State *L = luaT_newthread(tarantool_L);
> +	if (L == NULL)
> +		return NULL;

2. luaT_newthread() does not set a diag. That may lead to a crash,
because, as far as I can see, this function may be called from
lbox_merge_source_gen() indirectly, somewhere deep in the call stack,
and lbox_merge_source_gen() calls luaT_error() when merge_source_next()
fails.
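
To illustrate the concern, here is a minimal self-contained sketch. All
names in it (diag_sketch, newthread_sketch(), temp_state_sketch(),
caller_sketch()) are hypothetical stand-ins, not the Tarantool API. It
only models the contract "return NULL and set a diag at failure": the
wrapper sets the diag itself when the underlying call fails silently, so
a caller that reads the last error on the failure path finds one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a fiber's diagnostics area. */
struct diag_sketch {
	char msg[128];
	int is_set;
};

static struct diag_sketch diag_area;

/* Record an error message in the diagnostics area. */
static void
diag_set_sketch(const char *msg)
{
	snprintf(diag_area.msg, sizeof(diag_area.msg), "%s", msg);
	diag_area.is_set = 1;
}

/*
 * Stand-in for a call that may fail and return NULL without
 * setting a diag (the behaviour the comment above describes).
 */
static void *
newthread_sketch(int should_fail)
{
	static int dummy_state;
	return should_fail ? NULL : (void *)&dummy_state;
}

/*
 * Wrapper that keeps the "NULL means a diag is set" contract:
 * it sets the diag itself when the underlying call fails.
 */
static void *
temp_state_sketch(int should_fail)
{
	void *L = newthread_sketch(should_fail);
	if (L == NULL) {
		diag_set_sketch("failed to create a temporary Lua state");
		return NULL;
	}
	return L;
}

/*
 * Caller that reads the last error on failure, as an
 * error-raising path would. Returns the message on failure
 * and NULL on success.
 */
static const char *
caller_sketch(int should_fail)
{
	if (temp_state_sketch(should_fail) != NULL)
		return NULL;
	/* Without diag_set_sketch() above, is_set could be 0 here. */
	assert(diag_area.is_set);
	return diag_area.msg;
}
```

In this sketch, caller_sketch(1) returns the recorded message instead of
tripping the assertion. Removing the diag_set_sketch() call from
temp_state_sketch() reproduces the kind of failure I am worried about.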

> +	/*
> +	 * The new state is not referenced from anywhere (reasons
> +	 * are above), so we should keep a reference to it in the
> +	 * registry while it is in use.
> +	 */
> +	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> +	return L;
> +}
