[Tarantool-patches] [PATCH] lua: lua_field_inspect_table without pushcfunction
Igor Munkin
imun at tarantool.org
Tue Jun 2 03:19:48 MSK 2020
Sergey,
Thanks for the patch! Please consider my comments below.
On 18.05.20, Sergey Kaplun wrote:
> Currently on encoding table we push cfunction (lua_field_try_serialize)
> to lua stack with additional lightuserdata and table value and after
> pcall that function to avoid a raise of error.
>
> In this case LuaJIT creates new object which will not live long time,
<lua_pushlightuserdata> just assigns a pointer to the top guest stack
slot. Yes, it might trigger stack reallocation, but no GC object is
created.
Yes, the reallocation can fail with either LUA_ERRMEM or LUA_ERRRUN, but
the error is raised outside the protected scope and is not handled.
There are more corner cases we have already discussed with Vlad here[1].
> so it increase amount of dead object and also increase time and
> frequency of garbage collection inside LuaJIT.
> Also this pcall is necessary only in case when metafield __serialize
> of serilizable object has LUA_TFUNCTION type.
Typo: s/serilizable/serializable/.
>
> So instead pushcfunction with pcall we can directly call the function
> trying to serialize an object.
Well, let's polish the commit message to make it a bit clearer. Commit
subject also looks non-informative and doesn't respect our contribution
guide[2], so I propose the following rewording:
| lua: remove excess Lua call from table encoding
|
| For safe table encoding <lua_field_try_serialize> function is pushed
| to Lua stack along with auxiliary lightuserdata and table object to be
| encoded. Its further protected call catches Lua error if one is raised
| while encoding. It is only necessary when the object to be serialized
| has __serialize field in metatable and this field is a Lua function.
|
| As a result of this change the function serializing the given object
| is called without excess protected frame and auxiliary status struct.
Feel free to change it on your own.
> ---
>
> branch: https://github.com/tarantool/tarantool/tree/skaplun/no-ticket-lua-inspect-table-refactoring
>
> src/lua/utils.c | 132 ++++++++++++++++--------------------------------
> 1 file changed, 44 insertions(+), 88 deletions(-)
>
> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index d410a3d03..58715a55e 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -461,91 +461,69 @@ lua_field_inspect_ucdata(struct lua_State *L, struct luaL_serializer *cfg,
> }
>
> /**
> - * A helper structure to simplify safe call of __serialize method.
> - * It passes some arguments into the serializer called via pcall,
> - * and carries out some results.
> + * Call __serialize method of a table object by index if the former exists.
The comment line exceeds 66 characters. Please adjust it according to
our contribution guide[3].
> + *
> + * If __serialize not exists function does nothing and the function returns 1;
> + *
> + * If __serialize exists, is correct and is a function then
What is 'correct' in this context?
> + * a result of serialization replaces old value by the index and
> + * the function returns 0;
> + *
> + * If the serialization hints like 'array' or 'map', then field->type,
> + * field->syze and field->compact sets if necessary
Typo: s/syze/size/.
> + * and the function returns 0;
I failed to get the sentence above and here is what I see in the code:
| If the serialization is a hint string (like 'array' or 'map'), then
| field->type, field->size and field->compact are set if necessary and
| the function returns 0;
> + *
> + * Otherwise it is an error, set diag and the funciton returns -1;
> + *
> + * Return values:
> + * -1 - error, set diag
> + * 0 - has serialize, success replace and have to finish
> + * 1 - hasn't serialize, need to process
Minor: I propose the following rewording to make this part a bit clearer:
| Return values:
| -1 - error occurs, diag is set, the top of guest stack is undefined.
| 0 - __serialize is set, the result value is in the origin slot,
| encoding is finished.
| 1 - __serialize is not set, proceed with default table encoding.
> */
> -struct lua_serialize_status {
> - /**
> - * True if an attempt to call __serialize has failed. A
> - * diag message is set.
> - */
> - bool is_error;
> - /**
> - * True, if __serialize exists. Otherwise an ordinary
> - * default serialization is used.
> - */
> - bool is_serialize_used;
> - /**
> - * True, if __serialize not only exists, but also returned
> - * a new value which should replace the original one. On
> - * the contrary __serialize could be a string like 'array'
> - * or 'map' and do not push anything but rather say how to
> - * interpret the target table. In such a case there is
> - * nothing to replace with.
> - */
> - bool is_value_returned;
> - /** Parameters, passed originally to luaL_tofield. */
> - struct luaL_serializer *cfg;
> - struct luaL_field *field;
> -};
>
> -/**
> - * Call __serialize method of a table object if the former exists.
> - * The function expects 2 values pushed onto the Lua stack: a
> - * value to serialize, and a pointer at a struct
> - * lua_serialize_status object. If __serialize exists, is correct,
> - * and is a function then one value is pushed as a result of
> - * serialization. If it is correct, but just a serialization hint
> - * like 'array' or 'map', then nothing is pushed. Otherwise it is
> - * an error. All the described outcomes can be distinguished via
> - * lua_serialize_status attributes.
> - */
> static int
> -lua_field_try_serialize(struct lua_State *L)
> +lua_field_try_serialize(struct lua_State *L, int idx,
> + struct luaL_serializer *cfg, struct luaL_field *field)
I looked at the surrounding code, and the other signatures use a
different argument order:
| lua_field_do_something(struct lua_State *, struct luaL_serializer *,
| int, struct luaL_field *);
I see no reason to violate this practice.
> {
> - struct lua_serialize_status *s =
> - (struct lua_serialize_status *) lua_touserdata(L, 2);
> - s->is_serialize_used = (luaL_getmetafield(L, 1, LUAL_SERIALIZE) != 0);
> - s->is_error = false;
> - s->is_value_returned = false;
> - if (! s->is_serialize_used)
> - return 0;
> - struct luaL_serializer *cfg = s->cfg;
> - struct luaL_field *field = s->field;
> + bool is_serialize_used = (luaL_getmetafield(L, idx,
> + LUAL_SERIALIZE) != 0);
I agree with Vlad here. If there is a reason to keep this variable,
has_serialize is also OK, and with that name the line fits 80 chars.
> + if (!is_serialize_used)
> + return 1;
> if (lua_isfunction(L, -1)) {
> /* copy object itself */
> - lua_pushvalue(L, 1);
> - lua_call(L, 1, 1);
> - s->is_error = (luaL_tofield(L, cfg, NULL, -1, field) != 0);
> - s->is_value_returned = true;
> - return 1;
> + lua_pushvalue(L, idx);
> + if (lua_pcall(L, 1, 1, 0) != 0) {
> + diag_set(LuajitError, lua_tostring(L, -1));
> + return -1;
> + }
> + if (luaL_tofield(L, cfg, NULL, -1, field) != 0)
> + return -1;
> + lua_replace(L, idx);
> + return 0;
> }
> if (!lua_isstring(L, -1)) {
> diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> - s->is_error = true;
> - lua_pop(L, 1);
> - return 0;
> + return -1;
> }
> const char *type = lua_tostring(L, -1);
> if (strcmp(type, "array") == 0 || strcmp(type, "seq") == 0 ||
> strcmp(type, "sequence") == 0) {
> field->type = MP_ARRAY; /* Override type */
> - field->size = luaL_arrlen(L, 1);
> + field->size = luaL_arrlen(L, idx);
> /* YAML: use flow mode if __serialize == 'seq' */
> if (cfg->has_compact && type[3] == '\0')
> field->compact = true;
> } else if (strcmp(type, "map") == 0 || strcmp(type, "mapping") == 0) {
> field->type = MP_MAP; /* Override type */
> - field->size = luaL_maplen(L, 1);
> + field->size = luaL_maplen(L, idx);
> /* YAML: use flow mode if __serialize == 'map' */
> if (cfg->has_compact && type[3] == '\0')
> field->compact = true;
> } else {
> diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> - s->is_error = true;
> + return -1;
> }
> - lua_pop(L, 1);
> + lua_pop(L, 1); /* remove value was setted by luaL_getmetafield */
> return 0;
> }
>
> @@ -559,36 +537,14 @@ lua_field_inspect_table(struct lua_State *L, struct luaL_serializer *cfg,
>
> if (cfg->encode_load_metatables) {
> int top = lua_gettop(L);
> - struct lua_serialize_status s;
> - s.cfg = cfg;
> - s.field = field;
> - lua_pushcfunction(L, lua_field_try_serialize);
> - lua_pushvalue(L, idx);
> - lua_pushlightuserdata(L, &s);
> - if (lua_pcall(L, 2, 1, 0) != 0) {
> - diag_set(LuajitError, lua_tostring(L, -1));
> - return -1;
> - }
> - if (s.is_error)
> + int res = lua_field_try_serialize(L, idx, cfg, field);
> + if (res == -1)
> return -1;
> - /*
> - * lua_call/pcall always returns the specified
> - * number of values. Even if the function returned
> - * less - others are filled with nils. So when a
> - * nil is not needed, it should be popped
> - * manually.
> - */
> - assert(lua_gettop(L) == top + 1);
> - (void) top;
> - if (s.is_serialize_used) {
> - if (s.is_value_returned)
> - lua_replace(L, idx);
> - else
> - lua_pop(L, 1);
> + assert(lua_gettop(L) == top);
> + (void)top;
> + if (res == 0)
> return 0;
> - }
> - assert(! s.is_value_returned);
> - lua_pop(L, 1);
> + /* Fall throuth with res == 1 */
Typo: s/Fall throuth/Fallthrough/.
Minor: This part can be rewritten more simply as follows:
| int top = lua_gettop(L);
| (void)top;
|
| switch(lua_field_try_serialize(L, idx, cfg, field)) {
| case -1:
| return -1;
| case 0:
| assert(lua_gettop(L) == top);
| return 0;
| case 1:
| assert(lua_gettop(L) == top);
| /* Continue table encoding */
| break;
| default:
| /* Unreachable */
| assert(0);
| }
IMHO, it's more readable and also checks that nothing except the values
mentioned in the function contract is returned. Feel free to ignore.
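To illustrate the dispatch in isolation, here is a minimal standalone
sketch without any Lua dependency. Everything in it is hypothetical and
exists only for illustration: the enum names, stub_try_serialize()
(a stand-in for lua_field_try_serialize that simply echoes a status),
and inspect_table_sketch(), which returns 1 after the default path
purely so the branches can be told apart (the real function returns 0
there).

```c
#include <assert.h>

/* Hypothetical names for the three statuses of the contract. */
enum try_serialize_status {
	TRY_SERIALIZE_ERROR = -1,	/* error occurs, diag is set */
	TRY_SERIALIZE_DONE = 0,		/* __serialize is set, done */
	TRY_SERIALIZE_DEFAULT = 1,	/* __serialize is not set */
};

/*
 * Hypothetical stub standing in for lua_field_try_serialize:
 * it returns the status it is given.
 */
static int
stub_try_serialize(int status)
{
	return status;
}

/*
 * Sketch of the caller. Returns -1 on error, 0 when encoding is
 * finished via __serialize, and 1 (illustration only) when the
 * default table encoding path is taken.
 */
static int
inspect_table_sketch(int status)
{
	switch (stub_try_serialize(status)) {
	case TRY_SERIALIZE_ERROR:
		return -1;
	case TRY_SERIALIZE_DONE:
		return 0;
	case TRY_SERIALIZE_DEFAULT:
		/* Continue table encoding */
		break;
	default:
		/* Unreachable: value outside the contract */
		assert(0);
	}
	/* Default table encoding would go here. */
	return 1;
}
```

Calling inspect_table_sketch() with -1, 0 and 1 takes the error, the
finished and the default path respectively; any other value trips the
assertion in a debug build.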
> }
>
> field->type = MP_ARRAY;
> --
> 2.24.1
>
Side note: I can't come up with tests except those you showed to Sasha
here[4], but it looks like they don't directly relate to the changes
you introduce with the patch. No idea for now. I guess we can return to
the question when you fix the review comments.
[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-April/015701.html
[2]: https://www.tarantool.io/en/doc/2.2/dev_guide/developer_guidelines/#how-to-write-a-commit-message
[3]: https://www.tarantool.io/en/doc/2.2/dev_guide/c_style_guide/#chapter-2-breaking-long-lines-and-strings
[4]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-May/016994.html
--
Best regards,
IM