[Tarantool-patches] [PATCH] serializer: serialize recursive structures

Alexander Turenko alexander.turenko at tarantool.org
Thu Jul 8 01:42:26 MSK 2021


On Thu, Mar 11, 2021 at 08:49:49AM +0300, Roman Khabibov via Tarantool-patches wrote:
> Fix bug with bus error during serializing of recursive
> structures.
> 
> Closes #3228
> ---
> 
> Branch: https://github.com/tarantool/tarantool/tree/romanhabibov/gh-3228-visited-set
> Issue: https://github.com/tarantool/tarantool/issues/3228

Sorry for the long delay. It is really hard to me to explain complex
topics in a more or less structured way.

Please, see my comments below.

## Overview

I splitted all problems we found into four ones that look atomic. I
described three problems (A, B, C) in the issue of the topic (gh-3228)
and moved one into its own issue (gh-6201). Here I refer those problems
by given names.

My comments are for 7206b41ff1456e71474c2ff84b214f770e5897af
('serializer: serialize recursive structures').

I'll give a short checklist first and my notes from meetings with Roman
next.

## Patchset checklist

- [ ] All refactoring changes should be separated from ones that modify
      user visible behaviour. Example: extracting of common parts of
      yaml and lua serializers.
- [ ] Behaviour of table/userdata/cdata with __serialize is the same
      (there are test cases that verify it).
- [ ] The patches should follow tarantool's code style. Example: I
      didn't saw enums introduced specifically to give names for return
      codes.
- [ ] Ensure that usual nesting (non-recursive structures) is not
      broken. Example: a table w/ __serialize, which returns a table
      with __serialize (see meeting notes for a test case).
- [ ] It is wishful to split solutions for different problems into
      separate commits (where possible).
- [ ] It is wishful to structurize test cases to make testing matrix
      clear: {problem A, problem B, problem C} x {table, cdata,
      userdata}.
- [ ] It is wishful to describe the overall algorithm somewhere. A
      comment for the corresponding header file is the good candidate
      for this.

## Meeting notes

Paragraphs in square brackets are added later to clarify something we
didn't discuss on our meetings.

{{{ Serializer 2021-05-14

The problem (segfault)
----------------------

[I named it 'problem A'.]

setmetatable({}, {__serialize = function(self) return self end})

Single node with recursive __serialize.

Complex example:

local x = {whoami = 'x'}
local a = {whoami = 'a'}
local b = {whoami = 'b'}
setmetatable(x, {__serialize = function(_) return a end})
setmetatable(a, {__serialize = function(_) return b end})
setmetatable(b, {__serialize = function(_) return a end})

(x -> a -> b)
      ^    |
      |    |
      +----+

yaml.encode(x) -> ?

However __serialize for a node may be nested, but not recursive:

local x = {whoami = 'x'}
local a = {whoami = 'a'}
local b = {whoami = 'b'}
setmetatable(x, {__serialize = function(_) return a end})
setmetatable(a, {__serialize = function(_) return b end})
setmetatable(b, {__serialize = function(_) return {whoami = 'result'} end}) -- !!

yaml.encode(x) -> {"whoami": "result"}

Related (?) problem (wrong result)
----------------------------------

[I named it 'problem C'.]

__serialize result do not participate in references search.

local x = {whoami = 'x'}
yaml.encode({
    foo = x,
    bar = setmetatable({}, {__serialize = function(_) return x end})
})

** should be **
---
foo: &1
  whoami: x
bar: *1
...

** now **
---
foo:
  whoami: x
bar:
  whoami: x
...

Are those problems actually related? Can we fix the segfault separately from
the wrong result case?

Details
-------

luaL_tofield() calls lua_field_try_serialize(), which calls luaL_tofield() for
__serialize result. We should detect a loop, so we should maintain a set of
visited (processed) objects.

How yaml works now:

find_references() tracks objects in a table (anchortable):

- key: object
- value:
  nil -- we didn't meet the object
  false -- seen once
  true -- seen twice or more

At dump yaml enumerates anchortable[obj] == true objects and set
anchortable[obj] = '0' ('1', '2' and so on).

In the resulting patch (re wrong result): replacement objects are tracked in
another table

TMP: Whether loop detection set may be replaced with ref map.

### cdata / userdata

How cdata/userdata with recursive __serialize works now? We should test it and
the behaviour should be similar to tables.

### Non-recursive serialize for a single node

[It is the test case for the 'Ensure <...>' checklist bullet.]

Current master works this way:

 | tarantool> a = {}
 | tarantool> b = {}
 | tarantool> setmetatable(b, {__serialize = function(_) return 'string' end})
 | ---
 | - string
 | ...
 | tarantool> setmetatable(a, {__serialize = function(_) return b end})
 | ---
 | - string
 | ...

Current patch has the following behaviour:

 | <...>
 | tarantool> setmetatable(a, {__serialize = function(_) return b end})
 | ---
 | - []
 | ...

We should return 'string' for `a`: call __serialize for `a` and, then, for
`b`. It is how the serializer works now, so backward compatibility requires
it.

}}}

{{{ Serializer 2021-05-19

Known problems (about existing patch)
-------------------------------------

* Non-recusive serialize for a single node (backward compatibility).
* Naming (at least enum constants).
* Test cdata / userdata in the similar cases as we test tables.
* No overall description how it works: it is hard to dive into the topic.

To discuss
----------

* Is it possible to split segfault and wrong result patches?
* Elaborate constants naming.
* Where to write serializer docs: source comments, doc/, tarantool-internals?
* We should discuss a test plan.

Discussion
----------

## The second segfault problem

[I named it 'problem B'.]

setmetatable({}, {__serialize = function(self) return {self} end})

## Is it possible to split segfault and wrong result patches?

Seems so. Don't move __serialize calls before dumping in the first patch. But
introduce visited objects set / map. It seems a set is enough here. But it
depends on a behaviour in the x -> a -> b -> a -> b -> ... case.

It seems that it'll simplify the patch.

NB: Don't forget about cdata / userdata.

It also worth to separate refactoring from user visible changes.

## We should discuss a test plan

Rough testing matrix for the first patch:

Cases:
* a -> a -> a -> ...
  setmetatable({}, {__serialize = function(self) return self end})
* a -> something that contains a
  setmetatable({}, {__serialize = function(self) return {self} end})
* a -> b -> a -> b -> ...
* x -> a -> b -> a -> b -> ...

[Those cases are problems A, B, C. First axis.]

Three types, where __serialize is applicable:
* table
* cdata
* userdata
* plus recursion over table + cdata or cdata + userdata

[This is the second axis.]

[Fourth case goes to gh-6201.]

For the second patch (wrong result):

TBD

[The same, table / userdata / cdata? Are there other axes?]

## Elaborate constants naming

[Or discard it and use usual unnamed return codes.]

It is for the second patch (wrong result). Even more: it'll be an intermediate
patch with refactoring.

+enum try_serialize_ret_code
+{
+
+       TRY_SERIALIZE_ERROR = -1,
+       TRY_SERIALIZE_RES_TABLE_NOT = 0,
+       TRY_SERIALIZE_RES_FROM_MAP,
+       TRY_SERIALIZE_DEFAULT_TABLE,
+};

+ * TRY_SERIALIZE_ERROR - error occurs, diag is set, the top of
+ * guest stack is undefined.
+ * TRY_SERIALIZE_RES_TABLE_NOT - __serialize field is available in
+ * the metatable, the result value is put in the origin slot,
+ * encoding is finished.
+ * TRY_SERIALIZE_RES_FROM_MAP - __serialize field is available in
+ * the metatable, the result value is table from serialized
+ * objects map.
+ * TRY_SERIALIZE_DEFAULT_TABLE - __serialize field is not
+ * available in the metatable, proceed with default table
+ * encoding.

+enum get_anchor_ret_code
+{
+       GET_ANCHOR_NO_REFS = 0,
+       GET_ANCHOR_NAMED_NOT,
+       GET_ANCHOR_ALIASED,
+};

----

Why we need to distinguish situations, when a table remains the same (no
__serialize is not called) and when we call it and replace the Lua object on
the stack.

It seems TRY_SERIALIZE_RES_FROM_MAP is redundant. It means that the result is
from the visited set.

TRY_SERIALIZE_RES_TABLE_NOT -- grammar. Is not a table.

TRY_SERIALIZE_RES_TABLE_NOT -- we can receive cdata / userdata with
__serialize, which'll return a table. (NB: to test cases; return self, return
{self}.)

What information we should return from lua_field_try_serialize() to skip or
proceed with array / map guessing.

1. result not a table -> don't guess whether it is map / array
2. __serialize = 'map/seq/sequence/mapping' -> don't guess whether it is
   map / array
3. __serialize = function(self) <...> end -> check by 1, 2

lua_field_try_serialize() retval:
1. error
2. we set array / map (ask to don't guess it afterwards)
3. success (no __serialize, succeeded __serialize function call)

I don't understand how lua_field_inspect_table() works if __serialize function
returns a string.

TRY_SERIALIZE_ERROR
TRY_SERIALIZE_IS_SEQUENCE_OR_MAPPING_DETERMINED
TRY_SERIALIZE_EVERYTHING_ELSE

(Looks weird.)

-1
0
1

(Maybe keep numbers?)

Another variant:

try_serialize(<...>, bool *known_as_array_or_map)
retval: -1 (error) / 0 (success)

----

get_anchor() -- we should give information, whether we meet the object first
time (and dump `&0 obj`) or next time (and dump `*0`). And 'no anchor / no
refs' situation.

GET_ANCHOR_NAMED_NOT -- grammar. Is not named.

| Case       | Retval               | *anchor |
| ---------- | -------------------- | ------- |
| No refs    | NO_REFS              | NULL    |
| First time | GET_ANCHOR_NAMED_NOT | "0"     |
| Next time  | GET_ANCHOR_ALIASED   | "0"     |

Possible naming:

GET_ANCHOR_NO_REFS
GET_ANCHOR_FIRST_VISIT
GET_ANCHOR_NEXT_VISIT

Another variant:

GET_ANCHOR_UNNAMED
GET_ANCHOR_JUST_NAMED
GET_ANCHOR_ALREADY_NAMED

Another variant:

void
get_anchor(<...>, const char **anchor, bool *is_just_named)

## Where to write serializer docs: source comments, doc/, tarantool-internals?

TBD

[Let's start with a comment in the header file.]

AR Alexander: create a checklist.

[Done.]

}}}


More information about the Tarantool-patches mailing list