[Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Jun 11 19:18:01 MSK 2020


Thanks for the fixes!

See 2 comments below.

>     A merge source API is designed to be quite abstract: the base structure
>     and virtual methods do not depend on Lua anyhow. Each source should
>     implement next() and destroy() virtual methods, which may be called from
>     C without a Lua state. This design allows any source to be used from C
>     as well as from Lua. The Lua API is based on the C API and supports any
>     source. Even merger itself is implemented in pure C according to the
>     merge source API and so may be used from Lua.
>     
>     A particular source implementation may use a Lua state internally, but
>     it is not part of the API and should be hidden under the hood. In fact all
>     sources we have now (except merger itself) store some references in
>     LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
>     the next() virtual method.
>     
>     Before this patch, the sources ('buffer', 'table', 'tuple') assume that
>     a Lua state always exists in the fiber storage of a fiber, where next()
>     is called. This looks valid at first glance, because it may be
>     called either from a Lua code or from merger, which in turn is called
>     from a Lua code. However background fibers (they serve binary protocol
>     requests) do not store a Lua state in the fiber storage even for Lua
>     call / eval requests.
>     
>     A possible solution would be to always store a Lua state in a fiber storage.
>     There are two reasons why it is not implemented here:
>     
>     1. There should be a decision about right balance between speed and
>        memory footprint and maybe some eviction strategy for cached Lua
>        states. Not sure we can just always store a state in each background
>        fiber. It would be wasteful for instances that serve box DQL / DML,
>        SQL and/or C procedure calls.
>     2. Technically contract of the next() method would assume that a Lua
>        state should exist in a fiber storage. Such requirement looks quite
>        unnatural for a C API and also looks fragile: what if we'll implement
>        some less wasteful Lua state caching strategy and the assumption
>        about presence of the Lua state will get broken?
>     
>     Obviously, next() will spend extra time to create a temporary state when
>     it is called from a background fiber. We should reuse existing Lua state
>     at least when a Lua call is performed via a binary protocol. I consider
>     it as the optimization and will solve in the next commit.
>     
>     A few words about the implementation. I have added three functions,
>     which acquire a temporary Lua state, call a function and release the
>     state. It may be squashed into one function that would accept a function
>     pointer and variable number of arguments. However GCC does not
>     devirtualize such calls at -O2 level, so it seems it is better to avoid
>     this. It may be possible to write some weird macro that will technically
>     reduce code duplication, but I prefer to write in C, not some macro
>     based meta-language.
>     
>     Usage of the fiber-local Lua state is not quite correct now: merge

1. merge -> merger.

>     source code may leave garbage on a stack in case of failures (like
>     unexpected result of Lua iterator generator function). This behaviour is
>     kept as is here, but it will be resolved by the next patch.
>     
>     Fixes #4954
> 
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
>  	return tuple;
>  }
>  
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> +	if (fiber()->storage.lua.stack != NULL) {
> +		*coro_ref = LUA_REFNIL;
> +		return fiber()->storage.lua.stack;
> +	}
> +
> +	/*
> +	 * luaT_newthread() pops the new Lua state from
> +	 * tarantool_L and it is right thing to do: if we'll push
> +	 * something to it and yield, then another fiber will not
> +	 * know that a stack top is changed and may operate on a
> +	 * wrong slot.
> +	 *
> +	 * Second, many requests that push a value to tarantool_L
> +	 * and yield may exhaust available slots on the stack.
> +	 */
> +	struct lua_State *L = luaT_newthread(tarantool_L);
> +	if (L == NULL)
> +		return NULL;

2. luaT_newthread() does not set a diag. That may lead to a crash,
because as far as I can see, this function may be called from
lbox_merge_source_gen() indirectly, somewhere deep in the call stack.
And lbox_merge_source_gen() calls luaT_error() when merge_source_next()
fails, which expects a diag to be set.
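
A minimal sketch of the pattern I mean, written with stand-ins rather than
the real API: `struct state`, `newthread_stub()`, `diag_set_msg()` and
`diag_is_empty()` are hypothetical names for illustration only, not Tarantool
functions. The point is just that the NULL path sets a diag before returning,
so a caller that raises the last error always finds one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the fiber's diagnostics area. */
static char diag_msg[128];

static void
diag_set_msg(const char *msg)
{
	snprintf(diag_msg, sizeof(diag_msg), "%s", msg);
}

static int
diag_is_empty(void)
{
	return diag_msg[0] == '\0';
}

/*
 * Hypothetical stand-in for a thread constructor that may return
 * NULL without setting a diag.
 */
struct state {
	int unused;
};

static struct state the_state;
static int simulate_failure;

static struct state *
newthread_stub(void)
{
	return simulate_failure ? NULL : &the_state;
}

/* The pattern: never return NULL with an empty diag. */
static struct state *
temp_state(void)
{
	struct state *L = newthread_stub();
	if (L == NULL) {
		diag_set_msg("cannot create a temporary Lua state");
		return NULL;
	}
	return L;
}

/*
 * Stand-in for a caller deep in the call stack that raises the
 * last diag on failure. It relies on the diag being set.
 */
static int
caller(void)
{
	if (temp_state() == NULL) {
		assert(!diag_is_empty());
		return -1;
	}
	return 0;
}
```

Alternatively the diag could be set inside the constructor itself, so that
every caller gets it for free. I don't insist on either variant.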

> +	/*
> +	 * The new state is not referenced from anywhere (reasons
> +	 * are above), so we should keep a reference to it in the
> +	 * registry while it is in use.
> +	 */
> +	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> +	return L;
> +}

