From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
Date: Thu, 11 Jun 2020 18:18:01 +0200
Message-ID: <c00b3856-7938-5d8b-e51a-279f1def539d@tarantool.org>
In-Reply-To: <20200607165816.4r5s7ah4pa5uihyl@tkn_work_nb>
Thanks for the fixes!
See 2 comments below.
> A merge source API is designed to be quite abstract: the base structure
> and virtual methods do not depend on Lua anyhow. Each source should
> implement next() and destroy() virtual methods, which may be called from
> C without a Lua state. This design allows to use any source as from C as
> well as from Lua. The Lua API is based on the C API and supports any
> source. Even merger itself is implemented in pure C according to the
> merge source API and so may be used from Lua.
>
> A particular source implementation may use a Lua state internally, but
> it is not part of the API and should be hidden under hood. In fact all
> sources we have now (except merger itself) store some references in
> LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
> the next() virtual method.
>
> Before this patch, the sources ('buffer', 'table', 'tuple') assume that
> a Lua state always exists in the fiber storage of a fiber, where next()
> is called. This looks valid on the first glance, because it may be
> called either from a Lua code or from merger, which in turn is called
> from a Lua code. However background fibers (they serve binary protocol
> requests) do not store a Lua state in the fiber storage even for Lua
> call / eval requests.
>
> Possible solution would be always store a Lua state in a fiber storage.
> There are two reasons why it is not implemented here:
>
> 1. There should be a decision about right balance between speed and
> memory footprint and maybe some eviction strategy for cached Lua
> states. Not sure we can just always store a state in each background
> fiber. It would be wasteful for instances that serve box DQL / DML,
> SQL and/or C procedure calls.
> 2. Technically contract of the next() method would assume that a Lua
> state should exist in a fiber storage. Such requirement looks quite
> unnatural for a C API and also looks fragile: what if we'll implement
> some less wasteful Lua state caching strategy and the assumption
> about presence of the Lua state will get broken?
>
> Obviously, next() will spend extra time to create a temporary state when
> it is called from a background fiber. We should reuse existing Lua state
> at least when a Lua call is performed via a binary protocol. I consider
> it as the optimization and will solve in the next commit.
>
> A few words about the implementation. I have added three functions,
> which acquire a temporary Lua state, call a function and release the
> state. It may be squashed into one function that would accept a function
> pointer and variable number of arguments. However GCC does not
> devirtualize such calls at -O2 level, so it seems it is better to avoid
> this. It maybe possible to write some weird macro that will technically
> reduce code duplication, but I prefer to write in C, not some macro
> based meta-language.
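The trade-off described in the paragraph above can be sketched in plain C. This is only an illustration: `struct temp_state`, `temp_state_acquire()`, `temp_state_release()`, `double_value()` and both wrappers are hypothetical names, not part of the patch. Variant A is the single generic wrapper that takes a function pointer; variant B is a dedicated wrapper per callee, which keeps the call direct and therefore easy for the compiler to inline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a temporary Lua state. */
struct temp_state {
	int in_use;
};

static struct temp_state the_state;

/* Hypothetical acquire/release pair around the temporary state. */
static struct temp_state *
temp_state_acquire(void)
{
	assert(!the_state.in_use);
	the_state.in_use = 1;
	return &the_state;
}

static void
temp_state_release(struct temp_state *s)
{
	assert(s != NULL && s->in_use);
	s->in_use = 0;
}

/* Hypothetical worker that needs a state to do its job. */
static int
double_value(struct temp_state *s, int value)
{
	(void)s;
	return value * 2;
}

/* Variant A: one generic wrapper, the callee is an indirect call. */
typedef int (*state_func)(struct temp_state *s, int value);

static int
call_with_temp_state(state_func func, int value)
{
	struct temp_state *s = temp_state_acquire();
	int rc = func(s, value);
	temp_state_release(s);
	return rc;
}

/* Variant B: a dedicated wrapper, the callee is a direct call. */
static int
double_value_with_temp_state(int value)
{
	struct temp_state *s = temp_state_acquire();
	int rc = double_value(s, value);
	temp_state_release(s);
	return rc;
}
```

Both variants return the same result; the difference is only in how much acquire/release code is repeated versus whether the inner call stays direct.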
>
> Usage of the fiber-local Lua state is not quite correct now: merge
1. merge -> merger.
> source code may left garbage on a stack in case of failures (like
> unexpected result of Lua iterator generator function). This behaviour is
> kept as is here, but it will be resolved by the next patch.
>
> Fixes #4954
>
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
> return tuple;
> }
>
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> +	if (fiber()->storage.lua.stack != NULL) {
> +		*coro_ref = LUA_REFNIL;
> +		return fiber()->storage.lua.stack;
> +	}
> +
> +	/*
> +	 * luaT_newthread() pops the new Lua state from
> +	 * tarantool_L and it is right thing to do: if we'll push
> +	 * something to it and yield, then another fiber will not
> +	 * know that a stack top is changed and may operate on a
> +	 * wrong slot.
> +	 *
> +	 * Second, many requests that push a value to tarantool_L
> +	 * and yield may exhaust available slots on the stack.
> +	 */
> +	struct lua_State *L = luaT_newthread(tarantool_L);
> +	if (L == NULL)
> +		return NULL;
2. luaT_newthread() does not set a diag. That may lead to a crash,
because, as far as I see, this function may be called from
lbox_merge_source_gen() indirectly, somewhere deep in the call stack.
And lbox_merge_source_gen() calls luaT_error() when merge_source_next()
fails, which expects a diag to be set.
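A minimal sketch of the pattern I mean, assuming luaT_newthread() indeed leaves the diag unset on failure. All names here (`last_error`, `diag_set_stub()`, `diag_is_empty_stub()`, `new_state_stub()`, `temp_state_checked()`) are hypothetical stand-ins so the snippet compiles on its own; they are not Tarantool API. The point is only that a function documented as "return NULL and set a diag" has to set the diag itself when the callee does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the fiber's diagnostics area. */
static char last_error[64];

static void
diag_set_stub(const char *msg)
{
	assert(msg != NULL);
	strncpy(last_error, msg, sizeof(last_error) - 1);
	last_error[sizeof(last_error) - 1] = '\0';
}

static int
diag_is_empty_stub(void)
{
	return last_error[0] == '\0';
}

/*
 * Hypothetical stand-in for a constructor that may return NULL
 * without recording why it failed.
 */
static void *
new_state_stub(int should_fail)
{
	static int dummy_state;
	return should_fail ? NULL : (void *)&dummy_state;
}

/*
 * Wrapper that keeps the "NULL means a diag is set" contract:
 * the error is recorded here, because the callee does not do it.
 */
static void *
temp_state_checked(int should_fail)
{
	void *state = new_state_stub(should_fail);
	if (state == NULL) {
		diag_set_stub("failed to create a temporary state");
		return NULL;
	}
	return state;
}
```

With such a wrapper a caller that raises the last error after a NULL result always finds one to raise.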
> +	/*
> +	 * The new state is not referenced from anywhere (reasons
> +	 * are above), so we should keep a reference to it in the
> +	 * registry while it is in use.
> +	 */
> +	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> +	return L;
> +}