[Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Thu Jun 11 19:18:01 MSK 2020
Thanks for the fixes!
See 2 comments below.
> A merge source API is designed to be quite abstract: the base structure
> and virtual methods do not depend on Lua anyhow. Each source should
> implement next() and destroy() virtual methods, which may be called from
> C without a Lua state. This design allows using any source from C as
> well as from Lua. The Lua API is based on the C API and supports any
> source. Even merger itself is implemented in pure C according to the
> merge source API and so may be used from Lua.
>
> A particular source implementation may use a Lua state internally, but
> it is not part of the API and should be hidden under the hood. In fact, all
> sources we have now (except merger itself) store some references in
> LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
> the next() virtual method.
>
> Before this patch, the sources ('buffer', 'table', 'tuple') assume that
> a Lua state always exists in the fiber storage of a fiber, where next()
> is called. This looks valid at first glance, because it may be
> called either from a Lua code or from merger, which in turn is called
> from a Lua code. However background fibers (they serve binary protocol
> requests) do not store a Lua state in the fiber storage even for Lua
> call / eval requests.
>
> A possible solution would be to always store a Lua state in the fiber storage.
> There are two reasons why it is not implemented here:
>
> 1. There should be a decision about the right balance between speed and
> memory footprint and maybe some eviction strategy for cached Lua
> states. Not sure we can just always store a state in each background
> fiber. It would be wasteful for instances that serve box DQL / DML,
> SQL and/or C procedure calls.
> 2. Technically, the contract of the next() method would assume that a Lua
> state should exist in a fiber storage. Such a requirement looks quite
> unnatural for a C API and also looks fragile: what if we implement
> some less wasteful Lua state caching strategy and the assumption
> about presence of the Lua state will get broken?
>
> Obviously, next() will spend extra time to create a temporary state when
> it is called from a background fiber. We should reuse existing Lua state
> at least when a Lua call is performed via a binary protocol. I consider
> it an optimization and will address it in the next commit.
>
> A few words about the implementation. I have added three functions,
> which acquire a temporary Lua state, call a function and release the
> state. It may be squashed into one function that would accept a function
> pointer and variable number of arguments. However GCC does not
> devirtualize such calls at -O2 level, so it seems it is better to avoid
> this. It may be possible to write some weird macro that will technically
> reduce code duplication, but I prefer to write in C, not some macro
> based meta-language.
>
> Usage of the fiber-local Lua state is not quite correct now: merge
1. merge -> merger.
> source code may leave garbage on the stack in case of failures (like
> unexpected result of Lua iterator generator function). This behaviour is
> kept as is here, but it will be resolved by the next patch.
>
> Fixes #4954
>
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
> return tuple;
> }
>
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> + if (fiber()->storage.lua.stack != NULL) {
> + *coro_ref = LUA_REFNIL;
> + return fiber()->storage.lua.stack;
> + }
> +
> + /*
> + * luaT_newthread() pops the new Lua state from
> + * tarantool_L and it is right thing to do: if we'll push
> + * something to it and yield, then another fiber will not
> + * know that a stack top is changed and may operate on a
> + * wrong slot.
> + *
> + * Second, many requests that push a value to tarantool_L
> + * and yield may exhaust available slots on the stack.
> + */
> + struct lua_State *L = luaT_newthread(tarantool_L);
> + if (L == NULL)
> + return NULL;
2. luaT_newthread() does not set a diag. That may lead to a crash,
because, as far as I can see, this function may be called from
lbox_merge_source_gen() indirectly, somewhere deep in the call stack.
And it calls luaT_error() when merge_source_next() fails.
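To illustrate what I mean, here is a minimal standalone sketch of the pattern: when the constructor returns NULL without reporting an error, the caller sets one itself before returning. All sketch_* names are hypothetical stand-ins written for this example; they are not the Tarantool diag API or the Lua API, and the error text is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-fiber diagnostics area. */
static char sketch_diag[128];

static void
sketch_diag_set(const char *msg)
{
	strncpy(sketch_diag, msg, sizeof(sketch_diag) - 1);
	sketch_diag[sizeof(sketch_diag) - 1] = '\0';
}

/* Hypothetical stand-in for a Lua state. */
struct sketch_state {
	int dummy;
};

/*
 * Hypothetical stand-in for a thread constructor that reports a
 * failure by returning NULL, but leaves the diagnostics area
 * untouched.
 */
static struct sketch_state *
sketch_newthread(int simulate_failure)
{
	static struct sketch_state state;
	return simulate_failure ? NULL : &state;
}

/*
 * The caller fills the diagnostics area itself, so code up the
 * call stack that raises "the last error" always finds one.
 */
static struct sketch_state *
sketch_temp_state(int simulate_failure)
{
	struct sketch_state *L = sketch_newthread(simulate_failure);
	if (L == NULL) {
		sketch_diag_set("failed to create a temporary Lua state");
		return NULL;
	}
	return L;
}
```

Whether the error is better set inside luaT_newthread() or at the call site is up to you; the sketch only shows that the NULL return and the error have to come together.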
> + /*
> + * The new state is not referenced from anywhere (reasons
> + * are above), so we should keep a reference to it in the
> + * registry while it is in use.
> + */
> + *coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> + return L;
> +}
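For readers following the acquire/release contract described in the comment above, here is a rough standalone model of it. It is only a sketch under assumptions: the sketch_* names, the fixed-size pool and SKETCH_REFNIL are hypothetical and stand in for the fiber storage, LUA_REGISTRYINDEX and LUA_REFNIL. The body of luaT_release_temp_luastate() is not shown in this part of the patch, so the release side below is a guess at the intended pairing, not the actual code.

```c
#include <stddef.h>

#define SKETCH_REFNIL (-1)
#define SKETCH_POOL_SIZE 4

/* Hypothetical stand-in for a Lua state. */
struct sketch_state {
	int dummy;
};

/* Hypothetical stand-in for registry slots that keep states alive. */
static struct sketch_state sketch_pool[SKETCH_POOL_SIZE];
static int sketch_used[SKETCH_POOL_SIZE];

/* State stored in the current fiber; NULL models a background fiber. */
static struct sketch_state *sketch_fiber_state;

/* Reuse the fiber's state if present, otherwise take a referenced one. */
static struct sketch_state *
sketch_acquire(int *ref)
{
	if (sketch_fiber_state != NULL) {
		*ref = SKETCH_REFNIL;
		return sketch_fiber_state;
	}
	for (int i = 0; i < SKETCH_POOL_SIZE; i++) {
		if (!sketch_used[i]) {
			sketch_used[i] = 1;
			*ref = i;
			return &sketch_pool[i];
		}
	}
	return NULL;
}

/* Drop the reference only if acquire actually took one. */
static void
sketch_release(int ref)
{
	if (ref != SKETCH_REFNIL)
		sketch_used[ref] = 0;
}

/* Count references that are still held. */
static int
sketch_live_refs(void)
{
	int count = 0;
	for (int i = 0; i < SKETCH_POOL_SIZE; i++)
		count += sketch_used[i];
	return count;
}
```

The point of returning the reference through an out parameter is that the caller does not need to know which branch was taken: it passes the same value back on release, and the "no reference" marker makes the release a no-op for a reused fiber state.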