We should test the new code. Look, test/app-tap/lua/serializer_test.lua
(common code to test serializers) contains `if not s.cfg then return
end` blocks to exclude test cases, which are not supported by
msgpackffi. Please, ensure that all new functionality is tested.
You can also try to use luacov (test-run even has --luacov option),
however I didn't tried it with tarantool's build-in code, only with
tarantool/graphql. Don't sure whether it is hard to setup, so I don't
insist on this at all.
Igor (I CCed him) recently wonder whether it worth to use ffi on modern
LuaJIT with implemented traces stitching. Maybe it worth to perform some
microbenchmarks of msgpack and msgpackffi to ensure that there is still
need to work on msgpackffi. (Technically it is out of scope of your
issue, however it is interesting to know. Please, ask Georgy whether it
is worth to spend your time on that now.)
See other notes below.
WBR, Alexander Turenko.
On Thu, Nov 07, 2019 at 01:36:42PM +0300, Maria wrote:
> Lua serializers have common configuration options
> that mostly were not respected in msgpackffi module.
> This patch fixes that for several configurations:
> encode_use_tostring, encode_invalid_as_nil and
> encode_load_metatables.
> Options encode/decode_invalid_numbers are suggested
> to not be considered in msgpackffi module since lua
> processed them properly without this check so adding
> it would mean cutting functionality provided by the
> language in case this field was set to be false.
Lua supports NaN/Inf values, MsgPack (I mean, the format) supports them
too -- here you are right. But a user logic can lean on the fact that
all floating point numbers are not NaN/Inf. Also, we intend to be
compatible with msgpack module, which supports those options. I propose
to implement them.
I think we also should implement encode_sparse_* options to be
compatible with msgpack module.
During the review I gathered the information what options are supported
by each (de)serializer:
https://github.com/tarantool/doc/issues/976
Maybe this will be interesting for you.
I'll paste notes I wrote while looking around serializers (I was not
quite aware of the code), they can be useful to verify the table from
the issue above.
----
The following encode_* options are handled in src/lua/utils.c (in
luaL_tofield() and luaL_convertfield(), which are used in msgpack, json
and yaml modules; sometimes via luaL_checkfield() helper):
- encode_sparse_convert
- encode_sparse_ratio
- encode_sparse_safe
- encode_invalid_numbers
- encode_load_metatables
- encode_use_tostring
- encode_invalid_as_nil
The following encode_* options are handled separately by each
serializer:
- encode_max_depth: msgpack, json
- encode_deep_as_nil: msgpack, json
- encode_number_precision: json, yaml
(msgpackffi does not use those helpers, because they are for Lua-C API.)
All decode_* options are handled separately by each deserializer (but
sometimes using luaL_checkfinite() helper from src/lua/utils.c):
- decode_invalid_numbers: msgpack, json, yaml
- decode_save_metatables: msgpack, json, yaml
- decode_max_depth: json
----
FYI: I also filed
https://github.com/tarantool/tarantool/issues/4622
('lua: forbid options that are not applicable for certain
(de)serializer').
> Closes #4499
I think it worth to add a docbot comment (see tarantool/docbot) and
state that msgpackffi now supports all msgpack options (and gather those
ones that are set with msgpack.cfg()). It worth also mention that while
the API is mostly the same, msgpackffi does not check a buffer
boundaries and lacks of .decode_{array,map}_header(), .cfg() and .new()
functions (maybe there are other differences; I would expect that you
will refine them during the work on tests).
However, to be honest, I'm a bit tentative about making the module
public. I would discuss this with the rest of the team. Igor, Georgy,
what do you think about it?
> Issue:
>
https://github.com/tarantool/tarantool/issues/4499
> Branch:
>
https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options
> diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
> index c2be0b3e8..d3df1a8c1 100644
> --- a/src/lua/msgpack.c
> +++ b/src/lua/msgpack.c
> @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data)
> (unsigned char) **data);
> }
>
> -
> void
> luamp_set_decode_extension(luamp_decode_extension_f handler)
> {
Please, move all those changes into a separate commit.
> diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h
> index d0ade303f..1e60f0a5d 100644
> --- a/src/lua/msgpack.h
> +++ b/src/lua/msgpack.h
> @@ -47,7 +47,7 @@ struct mpstream;
> /**
> * Default instance of msgpack serializer (msgpack = require('msgpack')).
> * This instance is used by all box's Lua/C API bindings (e.g. space:replace()).
> - * All changes made by msgpack.cfg{} function are also affect box's bindings
> + * All changes made by msgpack.cfg{} function also affect box's bindings
> * (this is a feature).
> */
> extern struct luaL_serializer *luaL_msgpack_default;
> diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
> index f7ee44291..43f5d6df1 100644
> --- a/src/lua/msgpackffi.lua
> +++ b/src/lua/msgpackffi.lua
> @@ -224,9 +224,11 @@ local function encode_r(buf, obj, level)
> return
> end
> local serialize = nil
> - local mt = getmetatable(obj)
> - if mt ~= nil then
> - serialize = mt.__serialize
> + if msgpack.cfg.encode_load_metatables then
> + local mt = getmetatable(obj)
> + if mt ~= nil then
> + serialize = mt.__serialize
> + end
> end
> -- calculate the number of array and map elements in the table
> -- TODO: pairs() aborts JIT
> @@ -275,7 +277,19 @@ local function encode_r(buf, obj, level)
> error("can not encode FFI type: '"..ffi.typeof(obj).."'")
> end
> else
> - error("can not encode Lua type: '"..type(obj).."'")
> + if msgpack.cfg.encode_use_tostring then
> + obj = tostring(obj)
> + if obj then
> + goto restart
> + else
> + error("can not encode Lua type: '"..type(obj).."'")
> + end
> + else if msgpack.cfg.encode_invalid_as_nil then
> + encode_nil(buf)
> + else
> + error("can not encode Lua type: '"..type(obj).."'")
> + end
> + end
> end
> end
The code itself looks correct, however we cannot be sure w/o proper
testing.