[Tarantool-patches] [PATCH] serializer: serialize recursive structures

Sergey Kaplun skaplun at tarantool.org
Wed Apr 21 12:27:22 MSK 2021


Hi!

Thanks for the fixes and the benchmarks!

Please consider my comments below.

On 06.04.21, Roman Khabibov wrote:
> Hi! Thanks for your review.
> 
> Perf
> 
> Script:
> 
> local serializer = require('json')
> clock = require('clock')
> serializer.cfg({encode_max_depth = 64})

Why do we need this parameter for benchmark?

> local avg = 0
> for j = 1, 10 do
>     local time = clock.monotonic()
>     for i = 1, 150000 do
>         local rec1 = {{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}}

LuaJIT will create these tables every iteration. It is better to move
this line out of cycles scope.

>         serializer.encode(rec1)
>     end
>     local result = clock.monotonic() - time
>     print(result)
>     avg = avg + result
> end
> print("avg", avg / 10)

Side note: I suppose that there is nothing bad to create an gist to
refer in the commit message. Thoughts?

> 
> 
> JSON
> 
> master
> r:tarantool r.khabibov$ ./src/tarantool a.lua
> 2.3421729998663
> 2.4643029998988
> 2.4067219998688
> 2.5030510001816
> 2.6378739997745
> 2.3903989996761
> 2.9206310003065
> 2.9560130001046
> 2.8415450002067
> 2.8488600002602
> avg	2.6311571000144
> 
> my branch
> 2.3160850000568
> 2.4038570001721
> 2.4244269998744
> 2.4124370003119
> 2.4159540003166
> 2.4096140000038
> 2.4202209999785
> 2.3913590000011
> 2.4089310001582
> 2.4692329997197
> avg	2.4072118000593
> 
> 
> MSGPACK
> 
> master
> 4.1559460000135

Side note: Looks like a statistical outlier, doesn't it?
The standart deviation has the same order than the difference between
tests, hasn't it?

> 3.415306000039
> 3.2962070000358
> 3.0802390002646
> 2.8740380001254
> 2.7747510001063
> 2.7418200001121
> 2.7159779998474
> 2.8011710001156
> 3.1459960001521
> avg	3.1001452000812
> 
> my branch
> 2.6075639999472
> 2.7236989997327
> 3.0367560000159
> 3.071974999737
> 4.0645679999143
> 3.3987269997597
> 3.0836050002836
> 3.1121310000308
> 2.9265290000476
> 3.1954760001972
> avg	3.1221029999666
> 

I've moved object creation out of cycles scope and increased amount of
iterations. Here is my results below for RelWithDebInfo build:

JSON:

Master branch:

| 13.585487608798
| 13.637473559938
| 13.619634022936
| 13.573492439464
| 13.595329248346
| 13.635302185081
| 13.599111944437
| 13.547587860376
| 13.577754129656
| 13.594658776186
| avg     13.596583177522

Your branch:

| 13.061761728488
| 13.070893581957
| 13.043347925879
| 13.016978750005
| 13.092284397222
| 13.056780406274
| 13.108249689452
| 13.080217910931
| 13.016354624182
| 13.077872656286
| avg     13.062474167068

MSGPACK:

Master branch:

| 14.461708432995
| 14.485332051292
| 14.49474313762
| 14.497900255956
| 14.501624618657
| 14.475869619288
| 14.452158411033
| 14.504987037741
| 14.460136301816
| 14.526830960065
| avg     14.486129082646

Your branch:

| 14.325977869332
| 14.226252351888
| 14.210794054903
| 14.332323177718
| 14.643362686969
| 14.701242586598
| 14.272431840189
| 14.164950084873
| 14.151984995231
| 14.226807836443
| avg     14.325612748414

MSGPACKFFI:

Master branch:

| 57.604440504685
| 57.71239098534
| 57.707063949667
| 57.636537715793
| 57.562112994492
| 57.627671867609
| 57.714451034553
| 57.665768134408
| 57.560196113773
| 57.603734403849
| avg     57.639436770417

Your branch:

| 58.02601410076
| 57.854091227986
| 57.902167879045
| 58.103026106022
| 58.019230334088
| 58.185967157595
| 57.99015656393
| 57.940024384297
| 58.309608150274
| 58.555808863603
| avg     58.08860947676

YAML:

Master branch:

| 78.119072271511
| 78.244466565549
| 78.199724264443
| 78.105738090351
| 78.051472786814
| 78.179750479758
| 78.219370298088
| 78.096038719639
| 78.09271354042
| 78.195037381724
| avg     78.15033843983

Your branch:

| 88.296284722164
| 88.319487094879
| 88.342842547223
| 88.313459647819
| 88.354313291609
| 88.400771623477
| 88.3271939978
| 88.307627035305
| 88.334653683007
| 88.330221014097
| avg     88.332685465738

To sum up:

There is a small improving of JSON encoder, there is no improving of
MSGPACK encoder (small degradation for MSGPACKFFI) and there is
significant degradation in yaml decoder performance.

Side note: Do you investigate the reason of
improvements/degradations?

Side note: MSGPACK FFI is x4 times slower than lua MSGPACK...

<snipped>

Known issues:

Deep recursive calls still a problem (run with disabled JIT for full
backtrace):

| tarantool> jit.off() local a = {} a[a] = a local recf recf = function(t) return {setmetatable({}, {__serialize = recf})} end setmetatable(a, {__serialize = recf}) return a
| Segmentation fault (core dumped)

Backtrace is the following:

| #0  0x0000558c599d01c4 in lj_mem_newgco (L=L at entry=0x404fb378, size=size at entry=56) at /home/burii/reviews/tarantool/serialize/third_party/luajit/src/lj_gc.c:875
| #1  0x0000558c599d735a in newtab (L=0x404fb378, asize=3, hbits=0) at /home/burii/reviews/tarantool/serialize/third_party/luajit/src/lj_tab.c:107
| #2  0x0000558c599d7579 in lj_tab_new (L=<optimized out>, asize=<optimized out>, hbits=<optimized out>) at /home/burii/reviews/tarantool/serialize/third_party/luajit/src/lj_tab.c:161
| #3  0x0000558c59a046fa in lj_BC_TNEW () at buildvm_x86.dasc:552
| #4  0x0000558c599cac74 in lua_pcall (L=L at entry=0x404fb378, nargs=nargs at entry=1, nresults=nresults at entry=1, errfunc=errfunc at entry=0)
|     at /home/burii/reviews/tarantool/serialize/third_party/luajit/src/lj_api.c:1158
| #5  0x0000558c5998317a in find_references_and_serialize (L=L at entry=0x404fb378, anchortable_index=anchortable_index at entry=3, serialized_objs_idx=serialized_objs_idx at entry=4)
|     at /home/burii/reviews/tarantool/serialize/src/lua/utils.c:625
| #6  0x0000558c5998306b in find_references_and_serialize (L=L at entry=0x404fb378, anchortable_index=anchortable_index at entry=3, serialized_objs_idx=serialized_objs_idx at entry=4)
|     at /home/burii/reviews/tarantool/serialize/src/lua/utils.c:625
| <5K lines snipped>

Also, the following code leads to core:

| tarantool> local a = {} return setmetatable(a, {__serialize = function(t) return nil end})
| tarantool: /home/burii/reviews/tarantool/serialize/src/lua/utils.c:710: lua_field_try_serialize: Assertion `lua_isnil(L, -1) == 0' failed.
| Aborted (core dumped)

Side note: should `__serialize` field work for userdata?

| tarantool> local u = newproxy(true) getmetatable(u).__serialize = function() return "custom coroutine" end return u
| ---
| - 'userdata: 0x41613c70'

Side note:
I surprized with the following behaviour:

| tarantool> local t = {} t[math.huge] = 0/0 for k,v in pairs(t) do print(k,v) end return t
| inf     nan
| ---
| - []
| ...

But according to documentation [1] encoding and decoding of invalid
numbers is enabled by default. What am I doing wrong?

Also, is it valid representation?

| tarantool> local a = math.huge/math.huge print(a) return a
| nan
| ---
| - -nan
| ...

It should be just `nan` not `-nan`.

General notes:
1) Nit: use brackets for the one-line if statements with comments.
Or even better, move it above before if statement:

| /* They should be the same. */
| if (a != b)
| 	return;

Instead of:
| if (a != b)
| 	/* They should be the same. */
| 	return;

2) Nit: don't use ==/!= for Lua type checks:
| if (!lua_isboolean(L, -1)) {

Instead of:
| if (lua_isboolean(L, -1) == 0) {

The returned value is kinda boolean: 0 or 1. Also, this is preferable
codestyle:

| $ grep -P 'lua_is[^&|]*?\([^&|]*?\) (?![!=]=)' -R ../before_patch/src/ 2>/dev/null | wc -l
| 136
| $ grep -P 'lua_is[^&|]*?\([^&|]*?\) (?=[!=]=)' -R ../before_patch/src/ 2>/dev/null | wc -l
| 6

3) Nit: I suggest to use early return in cases like:
| if (!condition) {
| 	lua_pop(L, 1);
| 	return;
| }
| /* Payload. */

Instead of:

| if (condition) {
| 	/* Payload. */
| } else {
| 	lua_pop(L, 1);
| 	return;
| }

> 
> commit 61c84258d32f121c0a5e75761ae3b347d0f41b5d
> Author: Roman Khabibov <roman.habibov at tarantool.org>
> Date:   Fri Feb 5 16:55:56 2021 +0300
> 
>     serializer: serialize recursive structures
>     
>     Fix bug with bus error during serializing of recursive structures
>     and print aliases if node become referenced after serialization.

Typo: s/become/becomes/

>     
>     This patch adds map with serialized nodes and results. The map is
>     used to mark referenced nodes which appears during serializing.

Typo: s/nodes which/nodes, which/
Typo: s/which appears/which appear/

>     
>     Closes #3228
>     
>     @TarantoolBot
>     __serialize parameter
>     If __serialize parameter is initialized as a function then this

Typo: s/function then/function, then/

>     function should be pure. A pure function is a function which
>     returns identical values for identical arguments (no variation
>     with local static variables, non-local variables, mutable
>     reference arguments or input streams). Otherwise, the behavior
>     is undefined.
> 
> diff --git a/changelogs/unreleased/serializer-bug.md b/changelogs/unreleased/serializer-bug.md
> new file mode 100755
> index 000000000..96f359d64
> --- /dev/null
> +++ b/changelogs/unreleased/serializer-bug.md
> @@ -0,0 +1,4 @@
> +## bugfix/core
> +
> +* Recursive structures appeared after __serialize call don't lead
> +* to segfault anymore in lua and yaml serializers (gh-5392).

Typo: s/lua/Lua/
Typo: s/yaml/YAML/
Typo: s/5392/3228/

Nit: I propose the following rewording:

| Serialization of recursive structures considering the value of
| `__serialize` metafield doesn't lead to segfault anymore in Lua and
| YAML serializers (gh-3228).

Feel free to change it on your own or ignore.

> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index 0315e720c..e4907a3f7 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -193,7 +193,7 @@ luamp_encode_call_16(lua_State *L, struct luaL_serializer *cfg,

<snipped>

> diff --git a/src/box/lua/execute.c b/src/box/lua/execute.c
> index 926a0a61c..76d271f5c 100644
> --- a/src/box/lua/execute.c
> +++ b/src/box/lua/execute.c
> @@ -328,7 +328,7 @@ lua_sql_bind_decode(struct lua_State *L, struct sql_bind *bind, int idx, int i)

<snipped>

> diff --git a/src/box/lua/serialize_lua.c b/src/box/lua/serialize_lua.c
> index caa08a60f..17b434ba5 100644
> --- a/src/box/lua/serialize_lua.c
> +++ b/src/box/lua/serialize_lua.c
> @@ -115,6 +115,7 @@ struct lua_dumper {
>  	/** Anchors for self references */
>  	int anchortable_index;
>  	unsigned int anchor_number;
> +	int serialized_objs_idx;

Lets be consistent with previous naming: use "index" instead "idx".

Nit: I suggest s/serialized_objs_idx/cache_index/g renaming.
In my opinion the meaning is the same, but its shorter.
Thoughts?
Feel free to ignore.

>
>  	/** Error message buffer */
>  	char err_msg[256];
> @@ -215,7 +216,7 @@ trace_node(struct lua_dumper *d)
>  	struct luaL_field field;
>  
>  	memset(&field, 0, sizeof(field));
> -	luaL_checkfield(d->L, d->cfg, lua_gettop(d->L), &field);
> +	luaL_checkfield(d->L, d->cfg, 0, lua_gettop(d->L), &field);
>  
>  	if (field.type < lengthof(mp_type_names)) {
>  		if (field.type == MP_EXT) {
> @@ -234,7 +235,7 @@ trace_node(struct lua_dumper *d)
>  
>  	memset(&field, 0, sizeof(field));
>  
> -	luaL_checkfield(d->L, d->cfg, top, &field);
> +	luaL_checkfield(d->L, d->cfg, 0, top, &field);
>  	say_info("serializer-trace: node    :\tfield type %s (%d)",
>  		 type_str, field.type);
>  }
> @@ -339,41 +340,16 @@ emit_node(struct lua_dumper *d, struct node *nd, int indent,
>  static const char *
>  get_lua_anchor(struct lua_dumper *d)
>  {
> -	const char *s = "";
> -
> -	lua_pushvalue(d->L, -1);
> -	lua_rawget(d->L, d->anchortable_index);
> -	if (!lua_toboolean(d->L, -1)) {
> -		lua_pop(d->L, 1);
> -		return NULL;
> +	const char *anchor = NULL;
> +	int code = get_anchor(d->L, d->anchortable_index, &d->anchor_number,
> +			      &anchor);
> +	if (code == GET_ANCHOR_NAMED_NOT) {
> +		trace_anchor(anchor, false);
> +	} else if (code == GET_ANCHOR_ALIASED) {
> +		trace_anchor(anchor, true);
> +		return "";

Why do we return empty string instead an aliased anchor value?

Nit: Looks like trace_anchor should be unconditional, just:
|		trace_anchor(anchor, code == GET_ANCHOR_ALIASED);

>  	}
> -
> -	if (lua_isboolean(d->L, -1)) {

<snipped>

> -	return s;
> +	return anchor;
>  }
>  
>  static void
> @@ -783,7 +759,7 @@ dump_node(struct lua_dumper *d, struct node *nd, int indent)

<snipped>

> diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
> index f7198a025..2779cf786 100644
> --- a/src/box/lua/tuple.c
> +++ b/src/box/lua/tuple.c

<snipped>

> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9b6179f3a..2c7181f84 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c

<snipped>

> diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
> index 1e74a6a3c..f9503a275 100644
> --- a/src/lua/msgpack.c
> +++ b/src/lua/msgpack.c

<snipped>

> diff --git a/src/lua/pickle.c b/src/lua/pickle.c
> index 65208b5b3..86f0f77e8 100644
> --- a/src/lua/pickle.c
> +++ b/src/lua/pickle.c

<snipped>

> diff --git a/src/lua/utils.c b/src/lua/utils.c
> index b5a6ca5b7..0580f4094 100644
> --- a/src/lua/utils.c
> +++ b/src/lua/utils.c
> @@ -441,7 +441,8 @@ lua_gettable_wrapper(lua_State *L)

<snipped>

> +int
> +get_anchor(struct lua_State *L, int anchortable_index,

Nit: I suppose it will be good to add a prefix to the function name
to underline that it is from utils.

> +	   unsigned int *anchor_number, const char **anchor)

Nit: may be anchor_id is better?
Feel free to ignore.

> +{
> +	lua_pushvalue(L, -1);

What is expected to be on the top of the Lua stack? Am I right, that
it is an object to serialize? Please, add corresponding description
to <lua/utils.h>.

> +	lua_rawget(L, anchortable_index);
> +	if (lua_toboolean(L, -1) == 0) {
> +		/* This element is not referenced. */
> +		lua_pop(L, 1);
> +		*anchor = NULL;
> +		return 0;

Should it be `GET_ANCHOR_NO_REFS`?

> +	}
> +
> +	if (lua_isboolean(L, -1) != 0) {
> +		/*
> +		 * This element is referenced more than once but

Typo: s/once but/once, but/

> +		 * has not been named.
> +		 */
> +		char buf[32];
> +		snprintf(buf, sizeof(buf), "%u", (*anchor_number)++);
> +		lua_pop(L, 1);
> +		/* Generate a string anchor and push to table. */

Nit: As for me "cache table" is more verbose.
Feel free to ignore.

> +		lua_pushvalue(L, -1);
> +		lua_pushstring(L, buf);
> +		*anchor = lua_tostring(L, -1);
> +		lua_rawset(L, anchortable_index);
> +		return GET_ANCHOR_NAMED_NOT;
> +	} else {
> +		/* This is an aliased element. */
> +		*anchor = lua_tostring(L, -1);

Should we add assert for string type here?

> +		lua_pop(L, 1);
> +		return GET_ANCHOR_ALIASED;
> +	}
> +	return GET_ANCHOR_NO_REFS;
> +}
> +
> +void
> +find_references_and_serialize(struct lua_State *L, int anchortable_index,

Nit: I suppose it will be good to add a prefix to the function name
to underline that it is from utils.

> +			      int serialized_objs_idx)

Let's be consistent in namings. Please, choose `idx` or `index` notation.

> +{
> +	int type = lua_type(L, -1);
> +	if (type != LUA_TTABLE)
> +		return;

This `type` variable is used only once, so, it may be ommited like the
following:

| 	if (lua_type(L, -1) != LUA_TTABLE)
| 		return;

> +	int node_idx = lua_gettop(L);
> +
> +	/*
> +	 * Check if the node is already serialized i.e. the record
> +	 * is already in the map of serialized objects.
> +	 */
> +	lua_pushvalue(L, node_idx);

If you want to push the last one element you can just use -1 as
index, like you do in `get_anchor()`.

> +	lua_rawget(L, serialized_objs_idx);
> +	bool srlzd = lua_isnil(L, -1) == 0;

Nit: !lua_isnil(L, -1) looks prettier for me.
Feel free to ignore.

Nit: `srlzd` doesn't shows the meaning, and strange in naming :).
I suggest `is_cached` or something like that.

Also, it is hard to me read the code flow below:
There are 2 boolean variables and one variable with three different
values. It gives us 2*2*3 = 12 combinations of values, that define
code behaviour. In my opinion, it is hard to keep track all variants.
I suggest to separate code with the branches:
* already serialized
* not serialized and have __serialize field (by the way, why do you
  do nothing for __serialize "map" and so on?)
* not serialized and have no __serialize field

Also, you can use helper functions to make the code more readable --
like split back `find_references_and_serialize()` ->
`find_references()` + `serialize()`. Let function do one thing and
do it well.

Also, I don't understand what is purpose of this function and
differnce between it and `lua_field_try_serialize()`?

> +	/*
> +	 * This flag indicates that the object is being serialized
> +	 * in the current function call.
> +	 */
> +	bool serialization = false;

Why is this highlighted in a separate case?

> +
> +	if (!srlzd && luaL_getmetafield(L, node_idx, LUAL_SERIALIZE) != 0 &&
> +	    lua_isfunction(L, -1) != 0) {
> +		/* Delete nil after lua_rawget() above. */
> +		lua_replace(L, -2);
> +		srlzd = true;
> +		serialization = true;
> +		/*
> +		 * Copy object itself and call serialize function

Typo: s/serialize function/a serialize function/

> +		 * for it.
> +		 */
> +		lua_pushvalue(L, node_idx);
> +		if (lua_pcall(L, 1, 1, 0) != 0) {

Nit: It is better to use the name of the code: `LUA_OK`.

> +			diag_set(LuajitError, lua_tostring(L, -1));
> +			luaT_error(L);
> +		}
> +		/*
> +		 * Add the result of serialization to the
> +		 * serialized_objs map. Key is node (node_idx),

Typo: s/node/the node/

> +		 * value is the result of serialization (now on
> +		 * the top of stack).

Typo: s/top of stack/top of the stack/

> +		 */
> +		lua_pushvalue(L, node_idx);
> +		lua_pushvalue(L, -2);
> +		lua_rawset(L, serialized_objs_idx);
> +	}
> +
> +	if (srlzd) {
> +		if (lua_type(L, -1) == LUA_TTABLE) {
> +			/*
> +			 * Now we will process the new node - the
> +			 * result of serialization. It is
> +			 * necessary to take it into account in
> +			 * the anchor table.
> +			 */
> +			node_idx = lua_gettop(L);
> +		} else {
> +			lua_pop(L, 1);
> +			return;
> +		}
> +	} else {
> +		if (lua_isnil(L, -1) == 0)
> +			/*
> +			 * Pop an extra field thrown out
> +			 * by luaL_getmetafield(), it was
> +			 * not a function.
> +			 */
> +			lua_pop(L, 1);
> +		/* Delete nil after lua_rawget() above. */
> +		lua_pop(L, 1);
> +	}
> +	/*
> +	 * Take an entry from the anchor table about the current
> +	 * node.
> +	 */
> +	lua_pushvalue(L, node_idx);
> +	lua_rawget(L, anchortable_index);
> +	int newval = -1;
> +	if (lua_isnil(L, -1) == 1)
> +		/*
> +		 * The node has not yet been encountered, it is
> +		 * bypassed for the first time.
> +		 */
> +		newval = 0;
> +	else if (lua_toboolean(L, -1) == 0)
> +		 /* The node has already met once. */
> +		newval = 1;
> +	lua_pop(L, 1);
> +	if (newval != -1) {
> +		lua_pushvalue(L, node_idx);
> +		lua_pushboolean(L, newval);
> +		lua_rawset(L, anchortable_index);
> +	}
> +	if (srlzd && !serialization) {
> +		/*
> +		 * The node has already been serialized not in the
> +		 * current call, so there is no point in going
> +		 * further in depth and checking the leaves. Pop
> +		 * the processed sterilization result.

Typo: s/sterilization/serialization/

> +		 */
> +		lua_pop(L, 1);
> +		return;
> +	}
> +	if (newval)
> +		/* The node has already met twice or more, so

Nit: Impermissible comment codestyle.

> +		 * there is no point in going further in depth and
> +		 * checking the leaves.
> +		 */

Why do we need to process node if it has already met once?

> +		return;
> +
> +	/* Recursively process other table values. */
> +	lua_pushnil(L);
> +	while (lua_next(L, node_idx) != 0) {
> +		/* Find references on value. */

Nit: "Process value." looks enough.
Feel free to ignore.

> +		find_references_and_serialize(L, anchortable_index,
> +					      serialized_objs_idx);
> +		lua_pop(L, 1);
> +		/* Find references on key. */

Nit: "Process key." looks enough.
Feel free to ignore.

> +		find_references_and_serialize(L, anchortable_index,
> +					      serialized_objs_idx);
> +	}
> +	if (serialization)
> +		/*
> +		 * The loop above processed the nodes inside the

Typo: s/processed/processes/

> +		 * serialization result. Pop it, so as not to
> +		 * break the integrity of the loop of the previous
> +		 * call in recursion.
> +		 */
> +		lua_pop(L, 1);
> +
> +	return;
> +}
> +
> +enum try_serialize_ret_code
> +{
> +
> +	TRY_SERIALIZE_ERROR = -1,
> +	TRY_SERIALIZE_RES_TABLE_NOT = 0,
> +	TRY_SERIALIZE_RES_FROM_MAP,
> +	TRY_SERIALIZE_DEFAULT_TABLE,
> +};
> +
>  /**
>   * Call __serialize method of a table object by index
> - * if the former exists.
> + * if the former exists or pull result of serialization (if
> + * exists) from table with index @a serialized_objs_idx if it

Typo: s/from table/from the table/

> + * isn't 0.
>   *
>   * If __serialize does not exist then function does nothing
> - * and the function returns 1;
> + * and the function returns TRY_SERIALIZE_DEFAULT_TABLE;
>   *
>   * If __serialize exists, is a function (which doesn't
>   * raise any error) then a result of serialization
> - * replaces old value by the index and the function returns 0;
> + * replaces old value by the index and the function returns
> + * TRY_SERIALIZE_RES_TABLE_NOT or TRY_SERIALIZE_DEFAULT_TABLE;
>   *
>   * If the serialization is a hint string (like 'array' or 'map'),
>   * then field->type, field->size and field->compact
> - * are set if necessary and the function returns 0;
> + * are set if necessary and the function returns
> + * TRY_SERIALIZE_RES_TABLE_NOT;
>   *
> - * Otherwise it is an error, set diag and the funciton returns -1;
> + * Otherwise it is an error, set diag and the funciton returns
> + * TRY_SERIALIZE_ERROR;
>   *
>   * Return values:
> - * -1 - error occurs, diag is set, the top of guest stack
> - *      is undefined.
> - *  0 - __serialize field is available in the metatable,
> - *      the result value is put in the origin slot,
> - *      encoding is finished.
> - *  1 - __serialize field is not available in the metatable,
> - *      proceed with default table encoding.
> + * TRY_SERIALIZE_ERROR - error occurs, diag is set, the top of
> + * guest stack is undefined.
> + * TRY_SERIALIZE_RES_TABLE_NOT - __serialize field is available in
> + * the metatable, the result value is put in the origin slot,
> + * encoding is finished.

Sorry, I don't understand what is "NOT" part standing for?
TRY_SERIALIZE_OK looks better, IMHO.
Also, see comment about `lua_field_inspect_table()` function below.

> + * TRY_SERIALIZE_RES_FROM_MAP - __serialize field is available in

Nit: TRY_SERIALIZE_CACHED sounds better, in my opinion.
Feel free to ignore.

> + * the metatable, the result value is table from serialized
> + * objects map.
> + * TRY_SERIALIZE_DEFAULT_TABLE - __serialize field is not

Nit: TRY_SERIALIZE_DEFAULT sounds better, in my opinion.
Feel free to ignore.

> + * available in the metatable, proceed with default table
> + * encoding.
>   */
> +
>  static int
> -lua_field_try_serialize(struct lua_State *L, struct luaL_serializer *cfg,
> -			int idx, struct luaL_field *field)
> +lua_field_try_serialize(struct lua_State *L, int serialized_objs_idx,
> +			struct luaL_serializer *cfg, int idx,
> +			struct luaL_field *field)
>  {
>  	if (luaL_getmetafield(L, idx, LUAL_SERIALIZE) == 0)
> -		return 1;
> +		return TRY_SERIALIZE_DEFAULT_TABLE;

Can we use the default handler here?

>  	if (lua_isfunction(L, -1)) {
> -		/* copy object itself */
> -		lua_pushvalue(L, idx);
> -		if (lua_pcall(L, 1, 1, 0) != 0) {
> -			diag_set(LuajitError, lua_tostring(L, -1));
> -			return -1;
> +		if (serialized_objs_idx != 0) {
> +			/*
> +			 * Pop the __serialize function for the
> +			 * current node. It was already called in
> +			 * find_references_and_serialize().
> +			 */
> +			lua_pop(L, 1);
> +			/*
> +			 * Get the result of serialization from
> +			 * the map.
> +			 */
> +			lua_pushvalue(L, idx);
> +			lua_rawget(L, serialized_objs_idx);
> +			assert(lua_isnil(L, -1) == 0);

This assert fails, when a serialization function returns nil. See the
example above.

> +
> +			/*
> +			 * Replace the serialized node with a new
> +			 * result, if it is a table.
> +			 */
> +			if (lua_type(L, -1) == LUA_TTABLE) {
> +				lua_replace(L, idx);
> +				return TRY_SERIALIZE_RES_FROM_MAP;
> +			}
> +		} else {
> +			/*
> +			 * Serializer don't use map with
> +			 * serialized objects. Copy object itself
> +			 * and call __serialize for it.
> +			 */
> +			lua_pushvalue(L, idx);
> +			if (lua_pcall(L, 1, 1, 0) != 0) {

Nit: It is better to use the name of the code: `LUA_OK`.

> +				diag_set(LuajitError, lua_tostring(L, -1));
> +				return TRY_SERIALIZE_ERROR;
> +			}
>  		}
> -		if (luaL_tofield(L, cfg, NULL, -1, field) != 0)
> -			return -1;
> +		if (luaL_tofield(L, cfg, serialized_objs_idx, NULL, -1,
> +				 field) != 0)
> +			return TRY_SERIALIZE_ERROR;
>  		lua_replace(L, idx);
> -		return 0;
> +		return TRY_SERIALIZE_RES_TABLE_NOT;
>  	}
>  	if (!lua_isstring(L, -1)) {
>  		diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> -		return -1;
> +		return TRY_SERIALIZE_ERROR;
>  	}
>  	const char *type = lua_tostring(L, -1);
>  	if (strcmp(type, "array") == 0 || strcmp(type, "seq") == 0 ||
> @@ -533,16 +755,17 @@ lua_field_try_serialize(struct lua_State *L, struct luaL_serializer *cfg,
>  			field->compact = true;
>  	} else {
>  		diag_set(LuajitError, "invalid " LUAL_SERIALIZE " value");
> -		return -1;
> +		return TRY_SERIALIZE_ERROR;
>  	}
>  	/* Remove value set by luaL_getmetafield. */
>  	lua_pop(L, 1);
> -	return 0;
> +	return TRY_SERIALIZE_RES_TABLE_NOT;
>  }
>  

Nit: see nothing bad to add a comment for this function as for
previous one.

>  static int
> -lua_field_inspect_table(struct lua_State *L, struct luaL_serializer *cfg,
> -			int idx, struct luaL_field *field)
> +lua_field_inspect_table(struct lua_State *L, int serialized_objs_idx,

Nit: I suggest s/serialized_objs_idx/cache_idx/g renaming.
In my opinion the meaning is the same, but its shorter.
Thoughts?
Feel free to ignore.

> +			struct luaL_serializer *cfg, int idx,
> +			struct luaL_field *field)
>  {
>  	assert(lua_type(L, idx) == LUA_TTABLE);
>  	uint32_t size = 0;
> @@ -550,14 +773,20 @@ lua_field_inspect_table(struct lua_State *L, struct luaL_serializer *cfg,
>  
>  	if (cfg->encode_load_metatables) {
>  		int top = lua_gettop(L);
> -		int res = lua_field_try_serialize(L, cfg, idx, field);
> -		if (res == -1)
> +		int res = lua_field_try_serialize(L, serialized_objs_idx, cfg,
> +						  idx, field);
> +		if (res == TRY_SERIALIZE_ERROR)
>  			return -1;
>  		assert(lua_gettop(L) == top);
>  		(void)top;
> -		if (res == 0)
> +		if (res == TRY_SERIALIZE_RES_TABLE_NOT ||
> +		    (res == TRY_SERIALIZE_RES_FROM_MAP &&
> +		    lua_type(L, -1) != LUA_TTABLE))
>  			return 0;
> -		/* Fallthrough with res == 1 */
> +		/*
> +		 * Fallthrough with res TRY_SERIALIZE_RES_FROM_MAP
> +		 * or TRY_SERIALIZE_DEFAULT_TABLE.
> +		 */

Looks like the current work of the function is inconsistent.
In one case an object serializing inside `lua_field_try_serialize()`,
but in others in the caller. As for me, it makes code tricky to understand.

I suggest to introduce the default serializer function (the chunk to
fallthrough), that is called inside `lua_field_try_serialize()`. It
allows to reduce the amount of returning codes(may be even use only
two: 0 on success and -1 on failure, as usual), doesn't it ?

Also, with this refactoring I suggest the following renaming:
s/lua_field_try_serialize/field_serialize/g

Rationale: "try" prefix is redundant as far as this function really
serialize object, or return standard status on failure. "lua" prefix
is conflicted with standard Lua-C API, so we can just omit it for
static functions.

Thoughts?

>  	}
>  
>  	field->type = MP_ARRAY;
> @@ -612,14 +841,14 @@ lua_field_tostring(struct lua_State *L, struct luaL_serializer *cfg, int idx,

<skipped>

>  int
>  luaL_tofield(struct lua_State *L, struct luaL_serializer *cfg,
> -	     const struct serializer_opts *opts, int index,
> -	     struct luaL_field *field)
> +	     int serialized_objs_idx, const struct serializer_opts *opts,
> +	     int index, struct luaL_field *field)

Let's be consistent in namings. Please, choose `idx` or `index` notation.

>  {
>  	if (index < 0)
>  		index = lua_gettop(L) + index + 1;
> @@ -753,7 +982,8 @@ luaL_tofield(struct lua_State *L, struct luaL_serializer *cfg,

<snipped>

> @@ -789,9 +1019,12 @@ luaL_convertfield(struct lua_State *L, struct luaL_serializer *cfg, int idx,
>  			 */
>  			GCcdata *cd = cdataV(L->base + idx - 1);
>  			if (cd->ctypeid > CTID_CTYPEID)
> -				lua_field_inspect_ucdata(L, cfg, idx, field);
> +				lua_field_inspect_ucdata(L, cfg,
> +							 serialized_objs_idx, idx,

Line length is more than 80 symbols.

> +							 field);
>  		} else if (type == LUA_TUSERDATA) {
> -			lua_field_inspect_ucdata(L, cfg, idx, field);
> +			lua_field_inspect_ucdata(L, cfg, serialized_objs_idx, idx,

Line length is more than 80 symbols.

> +						 field);
>  		}
>  	}
>  
> diff --git a/src/lua/utils.h b/src/lua/utils.h
> index 4a164868b..cb1025495 100644
> --- a/src/lua/utils.h
> +++ b/src/lua/utils.h
> @@ -353,6 +353,44 @@ struct luaL_field {
>  	bool compact;                /* a flag used by YAML serializer */
>  };
>  
> +/**
> + * Find references to tables to look up self references in tables
> + * and serialize nodes if __serialize is function. Collect

Typo: s/function/a function/

> + * referenced nodes in a table with stack index @a
> + * anchortable_index and serialized nodes with results in a table
> + * with stack index @a serialized_objs_idx.
> + *
> + * @param L Lua stack.
> + * @param anchortable_index Index of anchor table on stack with

Typo: s/of anchor table/of the anchor table/

> +          pairs: "node address" <-> "false (met once) or true
> +          (met more than once)".
> + * @param serialized_objs_idx Index of serialized objects map on
> + *        stack with pairs: "node address" <-> "result address".

It looks like the second table is superfluous: it is enough for us to
store "node address" -> "result object address". If we go through the
first time, then we will fill in the same table and indicate that
this element has already been serialized. If we serialize the object
a second/third/... time, then the serialization result will already
be in the table.

Thoughts?

> + */
> +void
> +find_references_and_serialize(struct lua_State *L, int anchortable_index,
> +			      int serialized_objs_idx);
> +
> +enum get_anchor_ret_code
> +{
> +	GET_ANCHOR_NO_REFS = 0,
> +	GET_ANCHOR_NAMED_NOT,
> +	GET_ANCHOR_ALIASED,
> +};
> +
> +/**
> + * Generate aliases and anchor numbers for self references.
> + *
> + * @param L Lua stack.
> + * @param anchortable_index Index of anchor table on stack.

Typo: s/of table/of the table/
Typo: s/on stack/on the stack/

> + * @param[out] anchor_number Ptr to the number anchors in a
> +               structure being under dumping.

Please, mention here, when this value is incremented.

> + * @param[out] anchor Ptr to anchor string.
> + */

Please, add comment about return codes like it is done for
`lua_field_try_serialize()` in <lua/utils.c>

> +int
> +get_anchor(struct lua_State *L, int anchortable_index,
> +	   unsigned int *anchor_number, const char **anchor);
> +
>  /**
>   * @brief Convert a value from the Lua stack to a lua_field structure.
>   * This function is designed for use with Lua bindings and data
> @@ -387,6 +425,8 @@ struct luaL_field {
>   *
>   * @param L stack
>   * @param cfg configuration
> + * @param serialized_objs_idx Index of table with serialized

Typo: s/of table/of the table/

> +                              objects map.
>   * @param opts the Lua serializer additional options.
>   * @param index stack index
>   * @param field conversion result
> @@ -396,8 +436,8 @@ struct luaL_field {
>   */
>  int
>  luaL_tofield(struct lua_State *L, struct luaL_serializer *cfg,
> -	     const struct serializer_opts *opts, int index,
> -	     struct luaL_field *field);
> +	     int serialized_objs_idx, const struct serializer_opts *opts,

Let's be consistent in namings. Please, choose `idx` or `index` notation.

> +	     int index, struct luaL_field *field);
>  
>  /**
>   * @brief Try to convert userdata/cdata values using defined conversion logic.
> @@ -405,18 +445,22 @@ luaL_tofield(struct lua_State *L, struct luaL_serializer *cfg,
>   *
>   * @param L stack
>   * @param cfg configuration
> + * @param serialized_objs_idx Index of table with serialized

Typo: s/of table/of the table/

> +                              objects map.
>   * @param idx stack index
>   * @param field conversion result
>   */
>  void
> -luaL_convertfield(struct lua_State *L, struct luaL_serializer *cfg, int idx,
> -		  struct luaL_field *field);
> +luaL_convertfield(struct lua_State *L, struct luaL_serializer *cfg,
> +		  int serialized_objs_idx, int idx, struct luaL_field *field);
>  
>  /**
>   * @brief A wrapper for luaL_tofield() and luaL_convertfield() that
>   * tries to convert value or raise an error.
>   * @param L stack
>   * @param cfg configuration
> + * @param serialized_objs_idx Stack index of table with serialized

Typo: s/of table/of the table/

> +                              objects map.
>   * @param idx stack index
>   * @param field conversion result
>   * @sa lua_tofield()
> @@ -432,14 +476,14 @@ luaL_convertfield(struct lua_State *L, struct luaL_serializer *cfg, int idx,

<snipped>

> diff --git a/test/app/gh-3228-serializer-look-for-recursion.result b/test/app/gh-3228-serializer-look-for-recursion.result
> new file mode 100644
> index 000000000..cb8c73de0
> --- /dev/null
> +++ b/test/app/gh-3228-serializer-look-for-recursion.result

Side note: AFAIK, we try to avoid to create new result files.
But for these tests it looks like the best option.

> @@ -0,0 +1,211 @@

<snipped>

> diff --git a/test/app/gh-3228-serializer-look-for-recursion.test.lua b/test/app/gh-3228-serializer-look-for-recursion.test.lua
> new file mode 100644
> index 000000000..bfa046a16
> --- /dev/null
> +++ b/test/app/gh-3228-serializer-look-for-recursion.test.lua
> @@ -0,0 +1,83 @@
> +test_run = require('test_run').new()
> +
> +--
> +-- gh-3228: Check that recursive structures are serialized
> +-- properly.
> +--
> +setmetatable({}, {__serialize = function(a) return a end})
> +setmetatable({}, {__serialize = function(a) return {a} end})
> +setmetatable({}, {__serialize = function(a) return {a, a} end})
> +setmetatable({}, {__serialize = function(a) return {a, a, a} end})

Looks like this case is the same as the previous one, we can drop
it.

> +setmetatable({}, {__serialize = function(a) return {{a, a}, a} end})
> +setmetatable({}, {__serialize = function(a) return {a, 1} end})
> +setmetatable({}, {__serialize = function(a) return {{{{a}}}} end})
> +setmetatable({}, {__serialize = function(a) return {{{{1}}}} end})
> +
> +b = {}
> +setmetatable({b}, {__serialize = function(a) return {a_1 = a, a_2 = a, b_1 = b, b_2 = b} end})
> +setmetatable({b}, {__serialize = function(a) return {a_1 = a, a_2 = {a, b}, b = b} end})
> +
> +a = {}
> +b = {a = a}
> +b[b] = b
> +a[b] = a
> +setmetatable(b, {__serialize = function(...) return {a} end})
> +a
> +
> +a = {}
> +b = {}
> +a[b] = b
> +setmetatable(b, {__serialize = function(...) return print end})
> +a
> +
> +a = {}
> +a[a] = a
> +recf = function(t) return setmetatable({}, {__serialize = recf}) end
> +setmetatable(a, {__serialize = recf}) return a
> +
> +--
> +-- __serialize function is pure, i.e. always returns identical
> +-- value for identical argument. Otherwise, the behavior is
> +-- undefined. So that, we ignore the side effects and just use the
> +-- value after the first serialization.

I see no reason to check undefined behaviour. I suppose we can just
drop these tests. Otherwise, we shouldn't check the value returned by
the function, just verify that the call doesn't lead to a crash.

Thoughts?

> +--
> +a = {}
> +b = {}
> +b[a] = a
> +show_a = true
> +test_run:cmd('setopt delimiter ";"')
> +serialize = function()
> +    show_a = not show_a
> +    if show_a then
> +        return "a"
> +    else
> +        return "b" end
> +end;
> +test_run:cmd('setopt delimiter ""');
> +setmetatable(a, {__serialize = serialize})
> +b
> +
> +test_run:cmd('setopt delimiter ";"')
> +function example()

Nit: swapped_serializer() as a name is more verbose to me.
Feel free to ignore.

> +    local a = {}
> +    local b = {}
> +    b[a] = a
> +    local reta
> +    local retb
> +    local function swap_ser(o)
> +        local newf
> +        local f = getmetatable(o).__serialize
> +        if f == reta then
> +            newf = retb
> +        else
> +            newf = reta
> +        end
> +        getmetatable(o).__serialize = newf
> +    end
> +    reta = function(o) swap_ser(o) return "a" end
> +    retb = function(o) swap_ser(o) return "b" end
> +    setmetatable(a, {__serialize = reta})
> +    return b
> +end;
> +test_run:cmd('setopt delimiter ""');
> +example()
> diff --git a/test/swim/swim.result b/test/swim/swim.result
> index bfc83c143..539131677 100644
> --- a/test/swim/swim.result
> +++ b/test/swim/swim.result
> @@ -1322,16 +1322,13 @@ m_list

<snipped>

> diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
> index 38e999870..67a9762bc 100644
> --- a/third_party/lua-cjson/lua_cjson.c
> +++ b/third_party/lua-cjson/lua_cjson.c

<snipped>

> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index 9c3a4a646..249603ba3 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -79,6 +79,7 @@ struct lua_yaml_dumper {
>     lua_State *L;
>     struct luaL_serializer *cfg;
>     int anchortable_index;
> +   int serialized_objs_idx;

Let's be consistent in namings. Please, choose `idx` or `index` notation.

>     unsigned int anchor_number;

<snipped>

> 

[1]: https://www.tarantool.io/en/doc/latest/reference/reference_lua/yaml/

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list