[Tarantool-patches] [PATCH] serializer: serialize recursive structures
Alexander Turenko
alexander.turenko at tarantool.org
Thu Jul 8 01:42:26 MSK 2021
On Thu, Mar 11, 2021 at 08:49:49AM +0300, Roman Khabibov via Tarantool-patches wrote:
> Fix bug with bus error during serializing of recursive
> structures.
>
> Closes #3228
> ---
>
> Branch: https://github.com/tarantool/tarantool/tree/romanhabibov/gh-3228-visited-set
> Issue: https://github.com/tarantool/tarantool/issues/3228
Sorry for the long delay. It is really hard to me to explain complex
topics in a more or less structured way.
Please, see my comments below.
## Overview
I splitted all problems we found into four ones that look atomic. I
described three problems (A, B, C) in the issue of the topic (gh-3228)
and moved one into its own issue (gh-6201). Here I refer those problems
by given names.
My comments are for 7206b41ff1456e71474c2ff84b214f770e5897af
('serializer: serialize recursive structures').
I'll give a short checklist first and my notes from meetings with Roman
next.
## Patchset checklist
- [ ] All refactoring changes should be separated from ones that modify
user visible behaviour. Example: extracting of common parts of
yaml and lua serializers.
- [ ] Behaviour of table/userdata/cdata with __serialize is the same
(there are test cases that verify it).
- [ ] The patches should follow tarantool's code style. Example: I
didn't saw enums introduced specifically to give names for return
codes.
- [ ] Ensure that usual nesting (non-recursive structures) is not
broken. Example: a table w/ __serialize, which returns a table
with __serialize (see meeting notes for a test case).
- [ ] It is wishful to split solutions for different problems into
separate commits (where possible).
- [ ] It is wishful to structurize test cases to make testing matrix
clear: {problem A, problem B, problem C} x {table, cdata,
userdata}.
- [ ] It is wishful to describe the overall algorithm somewhere. A
comment for the corresponding header file is the good candidate
for this.
## Meeting notes
Paragraphs in square brackets are added later to clarify something we
didn't discuss on our meetings.
{{{ Serializer 2021-05-14
The problem (segfault)
----------------------
[I named it 'problem A'.]
setmetatable({}, {__serialize = function(self) return self end})
Single node with recursive __serialize.
Complex example:
local x = {whoami = 'x'}
local a = {whoami = 'a'}
local b = {whoami = 'b'}
setmetatable(x, {__serialize = function(_) return a end})
setmetatable(a, {__serialize = function(_) return b end})
setmetatable(b, {__serialize = function(_) return a end})
(x -> a -> b)
^ |
| |
+----+
yaml.encode(x) -> ?
However __serialize for a node may be nested, but not recursive:
local x = {whoami = 'x'}
local a = {whoami = 'a'}
local b = {whoami = 'b'}
setmetatable(x, {__serialize = function(_) return a end})
setmetatable(a, {__serialize = function(_) return b end})
setmetatable(b, {__serialize = function(_) return {whoami = 'result'} end}) -- !!
yaml.encode(x) -> {"whoami": "result"}
Related (?) problem (wrong result)
----------------------------------
[I named it 'problem C'.]
__serialize result do not participate in references search.
local x = {whoami = 'x'}
yaml.encode({
foo = x,
bar = setmetatable({}, {__serialize = function(_) return x end})
})
** should be **
---
foo: &1
whoami: x
bar: *1
...
** now **
---
foo:
whoami: x
bar:
whoami: x
...
Are those problems actually related? Can we fix the segfault separately from
the wrong result case?
Details
-------
luaL_tofield() calls lua_field_try_serialize(), which calls luaL_tofield() for
__serialize result. We should detect a loop, so we should maintain a set of
visited (processed) objects.
How yaml works now:
find_references() tracks objects in a table (anchortable):
- key: object
- value:
nil -- we didn't meet the object
false -- seen once
true -- seen twice or more
At dump yaml enumerates anchortable[obj] == true objects and set
anchortable[obj] = '0' ('1', '2' and so on).
In the resulting patch (re wrong result): replacement objects are tracked in
another table
TMP: Whether loop detection set may be replaced with ref map.
### cdata / userdata
How cdata/userdata with recursive __serialize works now? We should test it and
the behaviour should be similar to tables.
### Non-recursive serialize for a single node
[It is the test case for the 'Ensure <...>' checklist bullet.]
Current master works this way:
| tarantool> a = {}
| tarantool> b = {}
| tarantool> setmetatable(b, {__serialize = function(_) return 'string' end})
| ---
| - string
| ...
| tarantool> setmetatable(a, {__serialize = function(_) return b end})
| ---
| - string
| ...
Current patch has the following behaviour:
| <...>
| tarantool> setmetatable(a, {__serialize = function(_) return b end})
| ---
| - []
| ...
We should return 'string' for `a`: call __serialize for `a` and, then, for
`b`. It is how the serializer works now, so backward compatibility requires
it.
}}}
{{{ Serializer 2021-05-19
Known problems (about existing patch)
-------------------------------------
* Non-recusive serialize for a single node (backward compatibility).
* Naming (at least enum constants).
* Test cdata / userdata in the similar cases as we test tables.
* No overall description how it works: it is hard to dive into the topic.
To discuss
----------
* Is it possible to split segfault and wrong result patches?
* Elaborate constants naming.
* Where to write serializer docs: source comments, doc/, tarantool-internals?
* We should discuss a test plan.
Discussion
----------
## The second segfault problem
[I named it 'problem B'.]
setmetatable({}, {__serialize = function(self) return {self} end})
## Is it possible to split segfault and wrong result patches?
Seems so. Don't move __serialize calls before dumping in the first patch. But
introduce visited objects set / map. It seems a set is enough here. But it
depends on a behaviour in the x -> a -> b -> a -> b -> ... case.
It seems that it'll simplify the patch.
NB: Don't forget about cdata / userdata.
It also worth to separate refactoring from user visible changes.
## We should discuss a test plan
Rough testing matrix for the first patch:
Cases:
* a -> a -> a -> ...
setmetatable({}, {__serialize = function(self) return self end})
* a -> something that contains a
setmetatable({}, {__serialize = function(self) return {self} end})
* a -> b -> a -> b -> ...
* x -> a -> b -> a -> b -> ...
[Those cases are problems A, B, C. First axis.]
Three types, where __serialize is applicable:
* table
* cdata
* userdata
* plus recursion over table + cdata or cdata + userdata
[This is the second axis.]
[Fourth case goes to gh-6201.]
For the second patch (wrong result):
TBD
[The same, table / userdata / cdata? Are there other axes?]
## Elaborate constants naming
[Or discard it and use usual unnamed return codes.]
It is for the second patch (wrong result). Even more: it'll be an intermediate
patch with refactoring.
+enum try_serialize_ret_code
+{
+
+ TRY_SERIALIZE_ERROR = -1,
+ TRY_SERIALIZE_RES_TABLE_NOT = 0,
+ TRY_SERIALIZE_RES_FROM_MAP,
+ TRY_SERIALIZE_DEFAULT_TABLE,
+};
+ * TRY_SERIALIZE_ERROR - error occurs, diag is set, the top of
+ * guest stack is undefined.
+ * TRY_SERIALIZE_RES_TABLE_NOT - __serialize field is available in
+ * the metatable, the result value is put in the origin slot,
+ * encoding is finished.
+ * TRY_SERIALIZE_RES_FROM_MAP - __serialize field is available in
+ * the metatable, the result value is table from serialized
+ * objects map.
+ * TRY_SERIALIZE_DEFAULT_TABLE - __serialize field is not
+ * available in the metatable, proceed with default table
+ * encoding.
+enum get_anchor_ret_code
+{
+ GET_ANCHOR_NO_REFS = 0,
+ GET_ANCHOR_NAMED_NOT,
+ GET_ANCHOR_ALIASED,
+};
----
Why we need to distinguish situations, when a table remains the same (no
__serialize is not called) and when we call it and replace the Lua object on
the stack.
It seems TRY_SERIALIZE_RES_FROM_MAP is redundant. It means that the result is
from the visited set.
TRY_SERIALIZE_RES_TABLE_NOT -- grammar. Is not a table.
TRY_SERIALIZE_RES_TABLE_NOT -- we can receive cdata / userdata with
__serialize, which'll return a table. (NB: to test cases; return self, return
{self}.)
What information we should return from lua_field_try_serialize() to skip or
proceed with array / map guessing.
1. result not a table -> don't guess whether it is map / array
2. __serialize = 'map/seq/sequence/mapping' -> don't guess whether it is
map / array
3. __serialize = function(self) <...> end -> check by 1, 2
lua_field_try_serialize() retval:
1. error
2. we set array / map (ask to don't guess it afterwards)
3. success (no __serialize, succeeded __serialize function call)
I don't understand how lua_field_inspect_table() works if __serialize function
returns a string.
TRY_SERIALIZE_ERROR
TRY_SERIALIZE_IS_SEQUENCE_OR_MAPPING_DETERMINED
TRY_SERIALIZE_EVERYTHING_ELSE
(Looks weird.)
-1
0
1
(Maybe keep numbers?)
Another variant:
try_serialize(<...>, bool *known_as_array_or_map)
retval: -1 (error) / 0 (success)
----
get_anchor() -- we should give information, whether we meet the object first
time (and dump `&0 obj`) or next time (and dump `*0`). And 'no anchor / no
refs' situation.
GET_ANCHOR_NAMED_NOT -- grammar. Is not named.
| Case | Retval | *anchor |
| ---------- | -------------------- | ------- |
| No refs | NO_REFS | NULL |
| First time | GET_ANCHOR_NAMED_NOT | "0" |
| Next time | GET_ANCHOR_ALIASED | "0" |
Possible naming:
GET_ANCHOR_NO_REFS
GET_ANCHOR_FIRST_VISIT
GET_ANCHOR_NEXT_VISIT
Another variant:
GET_ANCHOR_UNNAMED
GET_ANCHOR_JUST_NAMED
GET_ANCHOR_ALREADY_NAMED
Another variant:
void
get_anchor(<...>, const char **anchor, bool *is_just_named)
## Where to write serializer docs: source comments, doc/, tarantool-internals?
TBD
[Let's start with a comment in the header file.]
AR Alexander: create a checklist.
[Done.]
}}}
More information about the Tarantool-patches
mailing list