From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp33.i.mail.ru (smtp33.i.mail.ru [94.100.177.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 9D0AE445320 for ; Thu, 16 Jul 2020 23:11:44 +0300 (MSK) Date: Thu, 16 Jul 2020 23:10:57 +0300 From: Alexander Turenko Message-ID: <20200716201057.67ity4dkaozagvjd@tkn_work_nb> References: <20200701203633.GA5559@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200701203633.GA5559@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH v2 1/3] merger: fix NULL dereference when called via iproto List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy > > A particular source implementation may use a Lua state internally, but > > it is not part of the API and should be hidden under hood. In fact all > > Typo: s/under hood/under the hood/ or s/under hood/under its hood/. Fixed. > > A few words about the implementation. I have added three functions, > > which acquire a temporary Lua state, call a function and release the > > state. It may be squashed into one function that would accept a function > > pointer and variable number of arguments. However GCC does not > > devirtualize such calls at -O2 level, so it seems it is better to avoid > > this. It maybe possible to write some weird macro that will technically > > reduce code duplication, but I prefer to write in C, not some macro > > based meta-language. > > Side note: No one pushes you to create a particular DSL for this case, > but I see nothing criminal to use macros sometimes. I personally prefer > to generalize the occurrences you mentioned above. On the second thought > I guess performance deviation is negligible and the benefits for the > further maintenance are doubtful. I don't find a good way to generate declarations and definitions (or at least definitions) for those functions using some macro. The main problem is that each function has its own arguments list (with its own types). > > +/** > > + * Get a temporary Lua state. > > + * > > + * Use case: a function does not accept a Lua state as an argument > > + * to allow using from C code, but uses a Lua value, which is > > + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed > > + * to get and process the value. > > + * > > + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`. > > Pardon, I don't get this line. I made an attempt to clarify the comment: | * The resulting Lua state has a separate Lua stack, but the same | * globals and registry as `tarantool_L` (and all Lua states in | * tarantool at the moment of writing this). Existing terms are not perfect: both lua_newthread() and luaL_newstate() return the same type of pointer: . So I called any so typed structure 'Lua state' or just 'state'. I hope now the idea 'you can reference something in a registry of some other Lua state and then get it from the registry using this temporary Lua state' is clear. > > > + * > > + * This Lua state should be used only from one fiber: otherwise > > + * one fiber may change the stack and another one will access a > > + * wrong stack slot when it will be scheduled for execution after > > + * yield. > > + * > > + * Return a Lua state on success and set @a coro_ref. This > > + * reference should be passed to `luaT_release_temp_luastate()`, > > + * when the state is not needed anymore. > > + * > > + * Return NULL and set a diag at failure. > > + */ > > +static struct lua_State * > > +luaT_temp_luastate(int *coro_ref) > > +{ > > + if (fiber()->storage.lua.stack != NULL) { > > + *coro_ref = LUA_REFNIL; > > It definitely doesn't affect the implemented behaviour (considering > you're not referencing a value within ); I'm > just too pedantic here: LUA_REFNIL is the ref value obtained from > call anchoring a slot. At the same time there is > another special ref value for your purposes -- LUA_NOREF[1]. > Furthermore, it's the way more convenient to use it for *all* initial > ref values below. > > [1]: https://www.lua.org/manual/5.1/manual.html#pdf-LUA_NOREF LUA_REFNIL is used more often in this sense in tarantool (I don't know why) and there is no practical difference (of course, when is not valid value to reference). I just chose one that is more widely used within our sources. But considering your opinion that LUA_NOREF name better fit here, I changed it to LUA_NOREF now. > > + /* > > + * luaT_newthread() pops the new Lua state from > > + * tarantool_L and it is right thing to do: if we'll push > > + * something to it and yield, then another fiber will not > > + * know that a stack top is changed and may operate on a > > + * wrong slot. > > It seems to relate more to contract, so you can just > mention that it leaves no garbage on the given coroutine stack, ergo > nothing need to be popped in the caller function. I have two goals here: 1. Clarify luaT_newthread() contract on the caller side, because it is unusual for Lua. 2. Explain why we should not leave the new state on top of tarantool_L in luaT_temp_luastate(). There are two reasons, why leaving 'garbage' on tarantool_L is not acceptable here. I want to mention both here. I reformatted the comment a bit to make it more clear: | /* | * Unlike lua_newthread(), luaT_newthread() does not leave | * the new Lua state on tarantool_L. | * | * It is desired behaviour here, because of two reasons. | * | * First, if we'll push something to tarantool_L and | * yield, then another fiber will not know that a stack | * top is changed and may operate on a wrong slot. | * | * Second, many requests that push a value to tarantool_L | * and yield may exhaust available slots on the stack. It | * is limited by LUAI_MAXSTACK build time constant (~65K). | */ > > + * > > + * Second, many requests that push a value to tarantool_L > > + * and yield may exhaust available slots on the stack. > > Pardon, I don't get this line. I don't know how clarify it and don't make it huge. The case is the following: tarantool serves many similar requests. All requests execute luaT_temp_luastate() function and call a merge source fetch function (an iterator generator), which yields. If we would leave new Lua states on tarantool_L, it would be possible that all available stack slots in tarantool_L are occupied and the next request will fails with the 'stack overflow' error. After offline discussion with Igor we agreed to clarify the comment with mention of the LUAI_MAXSTACK constant. The new comment is the following: | * Second, many requests that push a value to tarantool_L | * and yield may exhaust available slots on the stack. It | * is limited by LUAI_MAXSTACK build time constant (~65K). > > +/** > > + * Release a temporary Lua state. > > + * > > + * It is the other half of `luaT_temp_luastate()`. > > It's not a half, it's a complement for function. Thanks! I thought on the right word and failed with it. The new wording: | /** | * Release a temporary Lua state. | * | * It complements `luaT_temp_luastate()`. | */