[Tarantool-patches] [PATCH 0/4] RFC: Isolate serializer helpers

Alexander Turenko alexander.turenko at tarantool.org
Mon Jul 5 09:30:47 MSK 2021


On Sun, Jul 04, 2021 at 03:09:07PM +0200, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> On 23.06.2021 21:12, Alexander Turenko via Tarantool-patches wrote:
> > Moved the serializer helpers into its own compilation unit, add some
> > comments and a basic test: everything is just to simplify diving into
> > this code.
> > 
> > Guys, please, look, whether it seems useful enough to include into
> > tarantool's mainline? Should we name it serializer.[ch] or
> > somehow like serializer_helpers.[ch]?
> > 
> > Part of https://github.com/tarantool/tarantool/issues/3228
> 
> Are you sure you need to fix it? It looks like a regular leg shooting.
> It might be simple to detect in the case described by Mons, but what if
> the recursion is not so easily visible?
> 
> 	setmetatable({},{
> 		__serialize = function(a)
> 			return {{{{a}}}}
> 		end
> 	})
> 
> You would need to use recursion detection algorithms like the one
> we used to ask on interviews. And I am not sure it is worth it if
> it can't be done in a simple way.

I think it is worth rearranging the code and adding a test regardless
of whether we decide to fix the problem or leave it.

I'll update the issue this week with a description of all the problems
found around __serialize (see the end of this email as well). After that
I'll ask Roman to update his patch (I'll add a checklist of what should
be done and how). I'll keep you in CC for those discussions, so you'll
be able to say 'it looks too complex' at any stage.

In my opinion, it is highly undesirable to get a segfault (or even a Lua
error) from a serializer, because it is often used for logging. A more
or less correct result is better than a failure, even if the passed Lua
object is ill-formed in some way. (However, sure, I want to keep the
code as readable as possible and I would not accept a solution that is
hard for me to dive into. I hope we'll implement something well
balanced.)

To be honest, even our usual "unsupported Lua type 'function'" error
(which is raised for a function if `encode_use_tostring` is not `true`)
is often undesirable. Raw idea: provide a helper like
`yaml.encode_noxc()`, which will never raise an error and will be
suitable for logging in the general case (it'll set
`encode_use_tostring` under the hood).
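
A minimal sketch of the 'never raise' idea in plain Lua. `make_noxc` and
`toy_encode` are hypothetical names used only for illustration; nothing
here is an existing tarantool API, and the toy encoder merely stands in
for a real one such as `yaml.encode`.

```lua
-- Hypothetical helper: wrap an encoder so that the result never raises.
-- `encode` is any function that turns a Lua value into a string.
local function make_noxc(encode)
    return function(value)
        local ok, res = pcall(encode, value)
        if ok then
            return res
        end
        -- The encoder failed: fall back to tostring(). It is called
        -- under pcall too, because a broken __tostring may raise.
        local ok_str, str = pcall(tostring, value)
        if ok_str then
            return str
        end
        -- Last resort: a fixed placeholder with the Lua type name.
        return '<unserializable ' .. type(value) .. '>'
    end
end

-- Toy encoder for the demonstration only: it rejects functions,
-- otherwise it just calls tostring().
local function toy_encode(value)
    if type(value) == 'function' then
        error("unsupported Lua type 'function'")
    end
    return tostring(value)
end

local encode_noxc = make_noxc(toy_encode)
```

With this wrapper `encode_noxc(print)` returns some string instead of
raising, which is the property that matters for logging. A pcall cannot
catch a segfault, so the recursion problems still have to be solved in
the serializer itself.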

WBR, Alexander Turenko.

----

There are two problems with __serialize that lead to a segfault:

- recursion within single Lua object serialization;
- recursion over several Lua objects.
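
Both cases can be sketched in plain Lua as a pre-pass that applies
__serialize and remembers which tables are currently being visited. This
is only an illustration under my own assumptions, not the proposed fix:
`resolve` and the '<cycle>' placeholder are hypothetical, and table keys
are left as is for brevity.

```lua
local CYCLE = '<cycle>'

-- Return a copy of `obj` with all __serialize functions applied.
-- `in_progress` holds the tables that are on the current walk path.
local function resolve(obj, in_progress)
    if type(obj) ~= 'table' then
        return obj
    end
    in_progress = in_progress or {}
    if in_progress[obj] then
        -- We came back to a table we are still processing.
        return CYCLE
    end
    in_progress[obj] = true

    local mt = getmetatable(obj)
    local serialize = type(mt) == 'table' and mt.__serialize
    local res
    if type(serialize) == 'function' then
        -- The result may be the object itself or may contain it at
        -- any depth, so it is walked with the same in_progress set.
        res = resolve(serialize(obj), in_progress)
    else
        res = {}
        for k, v in pairs(obj) do
            res[k] = resolve(v, in_progress)
        end
    end

    -- Leaving the table: it may legally appear again elsewhere.
    in_progress[obj] = nil
    return res
end
```

For the `{{{{a}}}}` example quoted above this gives
`{{{{'<cycle>'}}}}`. A set of visited tables does not help against a
__serialize that returns a fresh object with its own __serialize on each
call; that case would need something like a depth limit.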

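There is also a problem of another kind.

The return value of __serialize does not participate in the reference
search, so a table that is returned by __serialize and also appears
elsewhere in the object is never emitted as an anchor and an alias.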

 | local x = {whoami = 'x'}
 | yaml.encode({
 |     foo = x,
 |     bar = setmetatable({}, {__serialize = function(_) return x end})
 | })

** now **
 | ---
 | foo:
 |   whoami: x
 | bar:
 |   whoami: x
 | ...

** should be **
 | ---
 | foo: &1
 |   whoami: x
 | bar: *1
 | ...
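
A rough sketch of a reference search that looks at what __serialize
returns rather than at the original object. It is plain Lua and the names
`unwrap` and `count_refs` are hypothetical; the real serializer is
written in C and is organized differently.

```lua
-- Apply a __serialize function once, if there is one.
local function unwrap(obj)
    local mt = getmetatable(obj)
    local serialize = type(mt) == 'table' and mt.__serialize
    if type(serialize) == 'function' then
        return serialize(obj)
    end
    return obj
end

-- Count how many times each table is reached after unwrapping.
-- A table with a count above one would need an anchor and aliases.
local function count_refs(obj, counts)
    counts = counts or {}
    if type(obj) ~= 'table' then
        return counts
    end
    obj = unwrap(obj)
    if type(obj) ~= 'table' then
        return counts
    end
    counts[obj] = (counts[obj] or 0) + 1
    if counts[obj] == 1 then
        -- Descend only on the first visit, so plain cycles terminate.
        for _, v in pairs(obj) do
            count_refs(v, counts)
        end
    end
    return counts
end
```

For the example above `count_refs` reports `x` twice, once via `foo` and
once via the __serialize result of `bar`. The sketch calls __serialize in
the counting pass, so an encoder built on it would have to reuse those
results: a second call may return a different table.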

