From: Vladislav Shpilevoy
Date: Thu, 11 Jun 2020 18:18:01 +0200
Subject: Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
To: Alexander Turenko
Cc: tarantool-patches@dev.tarantool.org
In-Reply-To: <20200607165816.4r5s7ah4pa5uihyl@tkn_work_nb>
References: <0f0ad73e2fce564e22dcc8f9970d8aedd028c279.1591028838.git.alexander.turenko@tarantool.org>
 <920c35d4-150b-bae5-345f-e819ada02367@tarantool.org>
 <20200607165816.4r5s7ah4pa5uihyl@tkn_work_nb>

Thanks for the fixes!

See 2 comments below.

> A merge source API is designed to be quite abstract: the base structure
> and virtual methods do not depend on Lua anyhow. Each source should
> implement next() and destroy() virtual methods, which may be called from
> C without a Lua state. This design allows to use any source as from C as
> well as from Lua. The Lua API is based on the C API and supports any
> source. Even merger itself is implemented in pure C according to the
> merge source API and so may be used from Lua.
>
> A particular source implementation may use a Lua state internally, but
> it is not part of the API and should be hidden under hood. In fact all
> sources we have now (except merger itself) store some references in
> LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
> the next() virtual method.
>
> Before this patch, the sources ('buffer', 'table', 'tuple') assume that
> a Lua state always exists in the fiber storage of a fiber, where next()
> is called. This looks valid on the first glance, because it may be
> called either from a Lua code or from merger, which in turn is called
> from a Lua code. However background fibers (they serve binary protocol
> requests) do not store a Lua state in the fiber storage even for Lua
> call / eval requests.
>
> Possible solution would be always store a Lua state in a fiber storage.
> There are two reasons why it is not implemented here:
>
> 1. There should be a decision about right balance between speed and
>    memory footprint and maybe some eviction strategy for cached Lua
>    states. Not sure we can just always store a state in each background
>    fiber. It would be wasteful for instances that serve box DQL / DML,
>    SQL and/or C procedure calls.
> 2. Technically contract of the next() method would assume that a Lua
>    state should exist in a fiber storage. Such requirement looks quite
>    unnatural for a C API and also looks fragile: what if we'll implement
>    some less wasteful Lua state caching strategy and the assumption
>    about presence of the Lua state will get broken?
>
> Obviously, next() will spend extra time to create a temporary state when
> it is called from a background fiber. We should reuse existing Lua state
> at least when a Lua call is performed via a binary protocol. I consider
> it as the optimization and will solve in the next commit.
>
> A few words about the implementation. I have added three functions,
> which acquire a temporary Lua state, call a function and release the
> state. It may be squashed into one function that would accept a function
> pointer and variable number of arguments. However GCC does not
> devirtualize such calls at -O2 level, so it seems it is better to avoid
> this.
> It maybe possible to write some weird macro that will technically
> reduce code duplication, but I prefer to write in C, not some macro
> based meta-language.
>
> Usage of the fiber-local Lua state is not quite correct now: merge

1. merge -> merger.

> source code may left garbage on a stack in case of failures (like
> unexpected result of Lua iterator generator function). This behaviour is
> kept as is here, but it will be resolved by the next patch.
>
> Fixes #4954
>
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
>  	return tuple;
>  }
>
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> +	if (fiber()->storage.lua.stack != NULL) {
> +		*coro_ref = LUA_REFNIL;
> +		return fiber()->storage.lua.stack;
> +	}
> +
> +	/*
> +	 * luaT_newthread() pops the new Lua state from
> +	 * tarantool_L and it is right thing to do: if we'll push
> +	 * something to it and yield, then another fiber will not
> +	 * know that a stack top is changed and may operate on a
> +	 * wrong slot.
> +	 *
> +	 * Second, many requests that push a value to tarantool_L
> +	 * and yield may exhaust available slots on the stack.
> +	 */
> +	struct lua_State *L = luaT_newthread(tarantool_L);
> +	if (L == NULL)
> +		return NULL;

2. luaT_newthread() does not set a diag. That may lead to a crash,
because, as far as I see, this function may be called by
lbox_merge_source_gen() indirectly, somewhere deep in the callstack.
And that one calls luaT_error() when merge_source_next() fails, which
relies on the diag being set.

> +	/*
> +	 * The new state is not referenced from anywhere (reasons
> +	 * are above), so we should keep a reference to it in the
> +	 * registry while it is in use.
> +	 */
> +	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> +	return L;
> +}
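
To illustrate comment 2, here is a minimal stand-alone sketch of the
"return NULL and set a diag" contract from the doc comment. It is not
Tarantool code and not a proposed patch: fake_newthread(), diag_set_msg()
and diag_last are hypothetical stand-ins for luaT_newthread() and the
diag machinery, and the error text is made up. It only shows the shape:
a wrapper around an allocator that may fail silently records an error
before returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fiber diagnostics area. */
static char diag_last[128];

/* Hypothetical stand-in for setting a diag: remember the last error. */
static void
diag_set_msg(const char *msg)
{
	snprintf(diag_last, sizeof(diag_last), "%s", msg);
}

/*
 * Hypothetical stand-in for luaT_newthread(): on failure it
 * returns NULL and leaves the diagnostics area untouched.
 */
static void *
fake_newthread(int should_fail)
{
	static int state;
	return should_fail ? NULL : (void *)&state;
}

/*
 * Wrapper that keeps the documented contract: NULL is returned
 * only together with a recorded error, so a caller that reports
 * the last error after a failure always finds one.
 */
static void *
temp_state_or_diag(int should_fail)
{
	void *L = fake_newthread(should_fail);
	if (L == NULL) {
		diag_set_msg("failed to create a temporary Lua state");
		return NULL;
	}
	return L;
}
```

With this shape, temp_state_or_diag(1) returns NULL and leaves the
message in diag_last, while temp_state_or_diag(0) returns a non-NULL
state and does not touch diag_last. Whether the diag is better set in
the caller or inside luaT_newthread() itself is a separate choice.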