From: Igor Munkin
Date: Tue, 2 Jun 2020 03:19:48 +0300
Subject: Re: [Tarantool-patches] [PATCH] lua: lua_field_inspect_table without pushcfunction
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy
Message-ID: <20200602001948.GA5745@tarantool.org>
In-Reply-To: <20200518093748.16825-1-skaplun@tarantool.org>

Sergey,

Thanks for the patch! Please consider my comments below.

On 18.05.20, Sergey Kaplun wrote:
> Currently on encoding table we push cfunction (lua_field_try_serialize)
> to lua stack with additional lightuserdata and table value and after
> pcall that function to avoid a raise of error.
>
> In this case LuaJIT creates new object which will not live long time,

[...] just assigns a pointer to the top guest stack slot. Yes, it might
trigger stack reallocation, but no GC object is created. Yes, the
reallocation can fail with either LUA_ERRMEM or LUA_ERRRUN, but the
error is raised out of the protected scope, and is not handled. There
are more corner cases we have already discussed with Vlad here[1].

> so it increase amount of dead object and also increase time and
> frequency of garbage collection inside LuaJIT.
> Also this pcall is necessary only in case when metafield __serialize
> of serilizable object has LUA_TFUNCTION type.

Typo: s/serilizable/serializable/.

>
> So instead pushcfunction with pcall we can directly call the function
> trying to serialize an object.

Well, let's polish the commit message to make it a bit clearer. The
commit subject also looks non-informative and doesn't respect our
contribution guide[2], so I propose the following rewording:

| lua: remove excess Lua call from table encoding
|
| For safe table encoding the lua_field_try_serialize function is
| pushed to Lua stack along with auxiliary lightuserdata and table
| object to be encoded. Its further protected call catches Lua error if
| one is raised while encoding. It is only necessary when the object to
| be serialized has __serialize field in metatable and this field is a
| Lua function.
|
| As a result of this change the function serializing the given object
| is called without excess protected frame and auxiliary status struct.

Feel free to change it on your own.

> ---
>
> branch: https://github.com/tarantool/tarantool/tree/skaplun/no-ticket-lua-inspect-table-refactoring
>
>  src/lua/utils.c | 132 ++++++++++++++++--------------------------------
>  1 file changed, 44 insertions(+), 88 deletions(-)
>
> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index d410a3d03..58715a55e 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -461,91 +461,69 @@ lua_field_inspect_ucdata(struct lua_State *L, struct luaL_serializer *cfg,
>  }
>
>  /**
> - * A helper structure to simplify safe call of __serialize method.
> - * It passes some arguments into the serializer called via pcall,
> - * and carries out some results.
> + * Call __serialize method of a table object by index if the former exists.

The comment exceeds 66 symbols. Please adjust it considering our
contribution guide[3].

> + *
> + * If __serialize not exists function does nothing and the function returns 1;
> + *
> + * If __serialize exists, is correct and is a function then

What is 'correct' in this context?

> + * a result of serialization replaces old value by the index and
> + * the function returns 0;
> + *
> + * If the serialization hints like 'array' or 'map', then field->type,
> + * field->syze and field->compact sets if necessary

Typo: s/syze/size/.

> + * and the function returns 0;

I failed to parse the sentence above; here is what I see in the code:
| If the serialization is a hint string (like 'array' or 'map'), then
| field->type, field->size and field->compact are set if necessary and
| the function returns 0;

> + *
> + * Otherwise it is an error, set diag and the funciton returns -1;
> + *
> + * Return values:
> + * -1 - error, set diag
> + * 0 - has serialize, success replace and have to finish
> + * 1 - hasn't serialize, need to process

Minor: I propose the following rewording to make this part a bit
clearer:
| Return values:
| -1 - error occurs, diag is set, the top of guest stack is undefined.
|  0 - __serialize is set, the result value is in the original slot,
|      encoding is finished.
|  1 - __serialize is not set, proceed with default table encoding.

>   */
> -struct lua_serialize_status {
> -	/**
> -	 * True if an attempt to call __serialize has failed. A
> -	 * diag message is set.
> -	 */
> -	bool is_error;
> -	/**
> -	 * True, if __serialize exists. Otherwise an ordinary
> -	 * default serialization is used.
> -	 */
> -	bool is_serialize_used;
> -	/**
> -	 * True, if __serialize not only exists, but also returned
> -	 * a new value which should replace the original one. On
> -	 * the contrary __serialize could be a string like 'array'
> -	 * or 'map' and do not push anything but rather say how to
> -	 * interpret the target table. In such a case there is
> -	 * nothing to replace with.
> -	 */
> -	bool is_value_returned;
> -	/** Parameters, passed originally to luaL_tofield. */
> -	struct luaL_serializer *cfg;
> -	struct luaL_field *field;
> -};
>
> -/**
> - * Call __serialize method of a table object if the former exists.
> - * The function expects 2 values pushed onto the Lua stack: a
> - * value to serialize, and a pointer at a struct
> - * lua_serialize_status object. If __serialize exists, is correct,
> - * and is a function then one value is pushed as a result of
> - * serialization. If it is correct, but just a serialization hint
> - * like 'array' or 'map', then nothing is pushed. Otherwise it is
> - * an error. All the described outcomes can be distinguished via
> - * lua_serialize_status attributes.
> - */
>  static int
> -lua_field_try_serialize(struct lua_State *L)
> +lua_field_try_serialize(struct lua_State *L, int idx,
> +			struct luaL_serializer *cfg, struct luaL_field *field)

I looked at the surrounding code, and the other signatures order their
arguments differently:
| lua_field_do_something(struct lua_State *, struct luaL_serializer *,
|                        int, struct luaL_field *);
I see no reason to violate this practice.

>  {
> -	struct lua_serialize_status *s =
> -		(struct lua_serialize_status *) lua_touserdata(L, 2);
> -	s->is_serialize_used = (luaL_getmetafield(L, 1, LUAL_SERIALIZE) != 0);
> -	s->is_error = false;
> -	s->is_value_returned = false;
> -	if (! s->is_serialize_used)
> -		return 0;
> -	struct luaL_serializer *cfg = s->cfg;
> -	struct luaL_field *field = s->field;
> +	bool is_serialize_used = (luaL_getmetafield(L, idx,
> +						    LUAL_SERIALIZE) != 0);

I agree with Vlad here. If there are reasons to keep this variable,
naming it has_serialize is also OK, and with that name the statement
fits in 80 chars.
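
To make the two remarks above concrete together with the return contract
from the comment, here is a minimal standalone sketch. It deliberately
does not use the Lua API: mock_state, mock_cfg, mock_field and
mock_try_serialize are hypothetical stand-ins invented for illustration,
and the function-valued __serialize case is left out. Only the argument
order (state, cfg, idx, field), the has_serialize name and the -1/0/1
return codes are taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not the real Lua/Tarantool types. */
enum mock_type { MOCK_UNSET, MOCK_ARRAY, MOCK_MAP };

struct mock_state {
	/* Models the __serialize metafield: NULL means "not set". */
	const char *serialize;
};

struct mock_cfg {
	bool has_compact;
};

struct mock_field {
	enum mock_type type;
	bool compact;
};

/*
 * Return values:
 * -1 - the hint string is invalid.
 *  0 - the hint is applied to the field, encoding is finished.
 *  1 - no hint is set, proceed with default encoding.
 */
int
mock_try_serialize(struct mock_state *L, struct mock_cfg *cfg, int idx,
		   struct mock_field *field)
{
	assert(L != NULL && cfg != NULL && field != NULL);
	(void)idx; /* The mock has a single slot, so idx is unused. */
	bool has_serialize = L->serialize != NULL;
	if (!has_serialize)
		return 1;
	const char *type = L->serialize;
	if (strcmp(type, "array") == 0 || strcmp(type, "seq") == 0 ||
	    strcmp(type, "sequence") == 0) {
		field->type = MOCK_ARRAY;
	} else if (strcmp(type, "map") == 0 || strcmp(type, "mapping") == 0) {
		field->type = MOCK_MAP;
	} else {
		return -1;
	}
	/* Compact mode only for the three-letter hints 'seq' and 'map'. */
	if (cfg->has_compact && type[3] == '\0')
		field->compact = true;
	return 0;
}
```

With this shape the caller can tell all three outcomes apart from the
return value alone, without any auxiliary status struct.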

> +	if (!is_serialize_used)
> +		return 1;
>  	if (lua_isfunction(L, -1)) {
>  		/* copy object itself */
> -		lua_pushvalue(L, 1);
> -		lua_call(L, 1, 1);
> -		s->is_error = (luaL_tofield(L, cfg, NULL, -1, field) != 0);
> -		s->is_value_returned = true;
> -		return 1;
> +		lua_pushvalue(L, idx);
> +		if (lua_pcall(L, 1, 1, 0) != 0) {
> +			diag_set(LuajitError, lua_tostring(L, -1));
> +			return -1;
> +		}
> +		if (luaL_tofield(L, cfg, NULL, -1, field) != 0)
> +			return -1;
> +		lua_replace(L, idx);
> +		return 0;
>  	}
>  	if (!lua_isstring(L, -1)) {
>  		diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> -		s->is_error = true;
> -		lua_pop(L, 1);
> -		return 0;
> +		return -1;
>  	}
>  	const char *type = lua_tostring(L, -1);
>  	if (strcmp(type, "array") == 0 || strcmp(type, "seq") == 0 ||
>  	    strcmp(type, "sequence") == 0) {
>  		field->type = MP_ARRAY; /* Override type */
> -		field->size = luaL_arrlen(L, 1);
> +		field->size = luaL_arrlen(L, idx);
>  		/* YAML: use flow mode if __serialize == 'seq' */
>  		if (cfg->has_compact && type[3] == '\0')
>  			field->compact = true;
>  	} else if (strcmp(type, "map") == 0 || strcmp(type, "mapping") == 0) {
>  		field->type = MP_MAP; /* Override type */
> -		field->size = luaL_maplen(L, 1);
> +		field->size = luaL_maplen(L, idx);
>  		/* YAML: use flow mode if __serialize == 'map' */
>  		if (cfg->has_compact && type[3] == '\0')
>  			field->compact = true;
>  	} else {
>  		diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> -		s->is_error = true;
> +		return -1;
>  	}
> -	lua_pop(L, 1);
> +	lua_pop(L, 1); /* remove value was setted by luaL_getmetafield */
>  	return 0;
>  }
>
> @@ -559,36 +537,14 @@ lua_field_inspect_table(struct lua_State *L, struct luaL_serializer *cfg,
>
>  	if (cfg->encode_load_metatables) {
>  		int top = lua_gettop(L);
> -		struct lua_serialize_status s;
> -		s.cfg = cfg;
> -		s.field = field;
> -		lua_pushcfunction(L, lua_field_try_serialize);
> -		lua_pushvalue(L, idx);
> -		lua_pushlightuserdata(L, &s);
> -		if (lua_pcall(L, 2, 1, 0) != 0) {
> -			diag_set(LuajitError, lua_tostring(L, -1));
> -			return -1;
> -		}
> -		if (s.is_error)
> +		int res = lua_field_try_serialize(L, idx, cfg, field);
> +		if (res == -1)
>  			return -1;
> -		/*
> -		 * lua_call/pcall always returns the specified
> -		 * number of values. Even if the function returned
> -		 * less - others are filled with nils. So when a
> -		 * nil is not needed, it should be popped
> -		 * manually.
> -		 */
> -		assert(lua_gettop(L) == top + 1);
> -		(void) top;
> -		if (s.is_serialize_used) {
> -			if (s.is_value_returned)
> -				lua_replace(L, idx);
> -			else
> -				lua_pop(L, 1);
> +		assert(lua_gettop(L) == top);
> +		(void)top;
> +		if (res == 0)
>  			return 0;
> -		}
> -		assert(! s.is_value_returned);
> -		lua_pop(L, 1);
> +		/* Fall throuth with res == 1 */

Typo: s/Fall throuth/Fallthrough/.

Minor: This part can simply be rewritten the following way:
| int top = lua_gettop(L);
| (void)top;
|
| switch (lua_field_try_serialize(L, idx, cfg, field)) {
| case -1:
| 	return -1;
| case 0:
| 	assert(lua_gettop(L) == top);
| 	return 0;
| case 1:
| 	assert(lua_gettop(L) == top);
| 	/* Continue table encoding */
| 	break;
| default:
| 	/* Unreachable */
| 	assert(0);
| }
IMHO, it's more readable and also checks that nothing except the values
mentioned in the function contract is returned. Feel free to ignore.

>  	}
>
>  	field->type = MP_ARRAY;
> --
> 2.24.1
>

Side note: I can't come up with tests except those you showed to Sasha
here[4], but it looks like they don't directly relate to the changes
you introduce with the patch. No idea for now. I guess we can return to
the question when you fix the review comments.

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-April/015701.html
[2]: https://www.tarantool.io/en/doc/2.2/dev_guide/developer_guidelines/#how-to-write-a-commit-message
[3]: https://www.tarantool.io/en/doc/2.2/dev_guide/c_style_guide/#chapter-2-breaking-long-lines-and-strings
[4]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-May/016994.html

--
Best regards,
IM