* [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now @ 2019-11-07 10:36 Maria 2019-11-11 23:21 ` Alexander Turenko 0 siblings, 1 reply; 6+ messages in thread From: Maria @ 2019-11-07 10:36 UTC (permalink / raw) To: tarantool-patches, alexander.turenko, georgy Lua serializers have common configuration options that mostly were not respected in msgpackffi module. This patch fixes that for several configurations: encode_use_tostring, encode_invalid_as_nil and encode_load_metatables. Options encode/decode_invalid_numbers are suggested to not be considered in msgpackffi module since lua processed them properly without this check so adding it would mean cutting functionality provided by the language in case this field was set to be false. Closes #4499 Issue: https://github.com/tarantool/tarantool/issues/4499 Branch: https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options --- src/lua/msgpack.c | 1 - src/lua/msgpack.h | 2 +- src/lua/msgpackffi.lua | 22 ++++++++++++++++++---- src/lua/utils.h | 6 +++--- 4 files changed, 22 insertions(+), 9 deletions(-) diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c index c2be0b3e8..d3df1a8c1 100644 --- a/src/lua/msgpack.c +++ b/src/lua/msgpack.c @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data) (unsigned char) **data); } - void luamp_set_decode_extension(luamp_decode_extension_f handler) { diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h index d0ade303f..1e60f0a5d 100644 --- a/src/lua/msgpack.h +++ b/src/lua/msgpack.h @@ -47,7 +47,7 @@ struct mpstream; /** * Default instance of msgpack serializer (msgpack = require('msgpack')). * This instance is used by all box's Lua/C API bindings (e.g. space:replace()). - * All changes made by msgpack.cfg{} function are also affect box's bindings + * All changes made by msgpack.cfg{} function also affect box's bindings * (this is a feature). */ extern struct luaL_serializer *luaL_msgpack_default; diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua index f7ee44291..43f5d6df1 100644 --- a/src/lua/msgpackffi.lua +++ b/src/lua/msgpackffi.lua @@ -224,9 +224,11 @@ local function encode_r(buf, obj, level) return end local serialize = nil - local mt = getmetatable(obj) - if mt ~= nil then - serialize = mt.__serialize + if msgpack.cfg.encode_load_metatables then + local mt = getmetatable(obj) + if mt ~= nil then + serialize = mt.__serialize + end end -- calculate the number of array and map elements in the table -- TODO: pairs() aborts JIT @@ -275,7 +277,19 @@ local function encode_r(buf, obj, level) error("can not encode FFI type: '"..ffi.typeof(obj).."'") end else - error("can not encode Lua type: '"..type(obj).."'") + if msgpack.cfg.encode_use_tostring then + obj = tostring(obj) + if obj then + goto restart + else + error("can not encode Lua type: '"..type(obj).."'") + end + else if msgpack.cfg.encode_invalid_as_nil then + encode_nil(buf) + else + error("can not encode Lua type: '"..type(obj).."'") + end + end end end diff --git a/src/lua/utils.h b/src/lua/utils.h index d42cc3992..425b9ebf1 100644 --- a/src/lua/utils.h +++ b/src/lua/utils.h @@ -185,7 +185,7 @@ struct luaL_serializer { * + map - at least one table index is not unsigned integer. * + regular array - all array indexes are available. * + sparse array - at least one array index is missing. - * + excessively sparse arrat - the number of values missing + * + excessively sparse array - the number of values missing * exceeds the configured ratio. * * An array is excessively sparse when **all** the following @@ -212,7 +212,7 @@ struct luaL_serializer { int encode_sparse_safe; /** Max recursion depth for encoding (MsgPack, CJSON only) */ int encode_max_depth; - /** Enables encoding of NaN and Inf numbers */ + /** Enables encoding of NaN and Inf numbers (YAML, CJSON only) */ int encode_invalid_numbers; /** Floating point numbers precision (YAML, CJSON only) */ int encode_number_precision; @@ -233,7 +233,7 @@ struct luaL_serializer { /** Use NULL for all unrecognizable types */ int encode_invalid_as_nil; - /** Enables decoding NaN and Inf numbers */ + /** Enables decoding NaN and Inf numbers (YAML, CJSON only) */ int decode_invalid_numbers; /** Save __serialize meta-value for decoded arrays and maps */ int decode_save_metatables; -- 2.20.1 (Apple Git-117) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now 2019-11-07 10:36 [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now Maria @ 2019-11-11 23:21 ` Alexander Turenko 2019-11-29 21:09 ` Maria Khaydich 0 siblings, 1 reply; 6+ messages in thread From: Alexander Turenko @ 2019-11-11 23:21 UTC (permalink / raw) To: Maria; +Cc: tarantool-patches We should test the new code. Look, test/app-tap/lua/serializer_test.lua (common code to test serializers) contains `if not s.cfg then return end` blocks to exclude test cases, which are not supported by msgpackffi. Please, ensure that all new functionality is tested. You can also try to use luacov (test-run even has --luacov option), however I didn't tried it with tarantool's build-in code, only with tarantool/graphql. Don't sure whether it is hard to setup, so I don't insist on this at all. Igor (I CCed him) recently wonder whether it worth to use ffi on modern LuaJIT with implemented traces stitching. Maybe it worth to perform some microbenchmarks of msgpack and msgpackffi to ensure that there is still need to work on msgpackffi. (Technically it is out of scope of your issue, however it is interesting to know. Please, ask Georgy whether it is worth to spend your time on that now.) See other notes below. WBR, Alexander Turenko. On Thu, Nov 07, 2019 at 01:36:42PM +0300, Maria wrote: > Lua serializers have common configuration options > that mostly were not respected in msgpackffi module. > This patch fixes that for several configurations: > encode_use_tostring, encode_invalid_as_nil and > encode_load_metatables. > Options encode/decode_invalid_numbers are suggested > to not be considered in msgpackffi module since lua > processed them properly without this check so adding > it would mean cutting functionality provided by the > language in case this field was set to be false. Lua supports NaN/Inf values, MsgPack (I mean, the format) supports them too -- here you are right. But a user logic can lean on the fact that all floating point numbers are not NaN/Inf. Also, we intend to be compatible with msgpack module, which supports those options. I propose to implement them. I think we also should implement encode_sparse_* options to be compatible with msgpack module. During the review I gathered the information what options are supported by each (de)serializer: https://github.com/tarantool/doc/issues/976 Maybe this will be interesting for you. I'll paste notes I wrote while looking around serializers (I was not quite aware of the code), they can be useful to verify the table from the issue above. ---- The following encode_* options are handled in src/lua/utils.c (in luaL_tofield() and luaL_convertfield(), which are used in msgpack, json and yaml modules; sometimes via luaL_checkfield() helper): - encode_sparse_convert - encode_sparse_ratio - encode_sparse_safe - encode_invalid_numbers - encode_load_metatables - encode_use_tostring - encode_invalid_as_nil The following encode_* options are handled separately by each serializer: - encode_max_depth: msgpack, json - encode_deep_as_nil: msgpack, json - encode_number_precision: json, yaml (msgpackffi does not use those helpers, because they are for Lua-C API.) All decode_* options are handled separately by each deserializer (but sometimes using luaL_checkfinite() helper from src/lua/utils.c): - decode_invalid_numbers: msgpack, json, yaml - decode_save_metatables: msgpack, json, yaml - decode_max_depth: json ---- FYI: I also filed https://github.com/tarantool/tarantool/issues/4622 ('lua: forbid options that are not applicable for certain (de)serializer'). > Closes #4499 I think it worth to add a docbot comment (see tarantool/docbot) and state that msgpackffi now supports all msgpack options (and gather those ones that are set with msgpack.cfg()). It worth also mention that while the API is mostly the same, msgpackffi does not check a buffer boundaries and lacks of .decode_{array,map}_header(), .cfg() and .new() functions (maybe there are other differences; I would expect that you will refine them during the work on tests). However, to be honest, I'm a bit tentative about making the module public. I would discuss this with the rest of the team. Igor, Georgy, what do you think about it? > Issue: > https://github.com/tarantool/tarantool/issues/4499 > Branch: > https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options > diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c > index c2be0b3e8..d3df1a8c1 100644 > --- a/src/lua/msgpack.c > +++ b/src/lua/msgpack.c > @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data) > (unsigned char) **data); > } > > - > void > luamp_set_decode_extension(luamp_decode_extension_f handler) > { Please, move all those changes into a separate commit. > diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h > index d0ade303f..1e60f0a5d 100644 > --- a/src/lua/msgpack.h > +++ b/src/lua/msgpack.h > @@ -47,7 +47,7 @@ struct mpstream; > /** > * Default instance of msgpack serializer (msgpack = require('msgpack')). > * This instance is used by all box's Lua/C API bindings (e.g. space:replace()). > - * All changes made by msgpack.cfg{} function are also affect box's bindings > + * All changes made by msgpack.cfg{} function also affect box's bindings > * (this is a feature). > */ > extern struct luaL_serializer *luaL_msgpack_default; > diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua > index f7ee44291..43f5d6df1 100644 > --- a/src/lua/msgpackffi.lua > +++ b/src/lua/msgpackffi.lua > @@ -224,9 +224,11 @@ local function encode_r(buf, obj, level) > return > end > local serialize = nil > - local mt = getmetatable(obj) > - if mt ~= nil then > - serialize = mt.__serialize > + if msgpack.cfg.encode_load_metatables then > + local mt = getmetatable(obj) > + if mt ~= nil then > + serialize = mt.__serialize > + end > end > -- calculate the number of array and map elements in the table > -- TODO: pairs() aborts JIT > @@ -275,7 +277,19 @@ local function encode_r(buf, obj, level) > error("can not encode FFI type: '"..ffi.typeof(obj).."'") > end > else > - error("can not encode Lua type: '"..type(obj).."'") > + if msgpack.cfg.encode_use_tostring then > + obj = tostring(obj) > + if obj then > + goto restart > + else > + error("can not encode Lua type: '"..type(obj).."'") > + end > + else if msgpack.cfg.encode_invalid_as_nil then > + encode_nil(buf) > + else > + error("can not encode Lua type: '"..type(obj).."'") > + end > + end > end > end The code itself looks correct, however we cannot be sure w/o proper testing. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now 2019-11-11 23:21 ` Alexander Turenko @ 2019-11-29 21:09 ` Maria Khaydich 2019-12-17 23:26 ` Alexander Turenko 0 siblings, 1 reply; 6+ messages in thread From: Maria Khaydich @ 2019-11-29 21:09 UTC (permalink / raw) To: Alexander Turenko; +Cc: tarantool-patches [-- Attachment #1: Type: text/plain, Size: 14595 bytes --] Thank you for the review. I’ve made sure new functionality is tested in test/app-tap/lua/serializer_test.lua by adding ‘cfg’ and ‘new’ fields in msgpackffi module. > Lua supports NaN/Inf values, .. I propose to implement them. Also done. > I think we also should implement encode_sparse_* options to be compatible with msgpack module. Could you please explain how do you propose to implement them? It seems to me the purpose of those configs is to handle excessively sparse arrays as maps (as stated in src/lua/utils.h). Which does not seem relevant to lua module where the only container type is table. What do I miss here? > During the review I gathered the information what options are supported by each (de)serializer That was very helpful, thank you! The patch Subject: [PATCH 1/2] Msgpackffi considers msgpack.cfg options now Msgpack.cfg options changes did not affect msgpackffi serializer state. This patch fixes it for options encode_load_metatables, decode_ and encode_invalid_numbers, encode_use_tostring and encode_invalid_as_nil. Closes #4499 --- Issue: https://github.com/tarantool/tarantool/issues/4499 Branch: https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options src/lua/msgpackffi.lua | 46 +++++++++++++++++++++++++--- test/app-tap/lua/serializer_test.lua | 7 ++++- 2 files changed, 48 insertions(+), 5 deletions(-) diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua index 5331dc75f..2689bb214 100644 --- a/src/lua/msgpackffi.lua +++ b/src/lua/msgpackffi.lua @@ -227,9 +227,11 @@ local function encode_r(buf, obj, level) return end local serialize = nil - local mt = getmetatable(obj) - if mt ~= nil then - serialize = mt.__serialize + if msgpack.cfg.encode_load_metatables then + local mt = getmetatable(obj) + if mt ~= nil then + serialize = mt.__serialize + end end -- calculate the number of array and map elements in the table -- TODO: pairs() aborts JIT @@ -261,6 +263,19 @@ local function encode_r(buf, obj, level) else error("Invalid __serialize value") end + elseif type(obj) == math.huge or type(obj) == -math.huge or + type(obj) == math.nan then + if msgpack.cfg.encode_invalid_numbers then + if obj == math.huge then + obj = 1/0 + elseif obj == -math.huge then + obj = -1/0 + else + obj = 0/0 + end + else + encode_nil(buf) + end elseif obj == nil then encode_nil(buf) elseif type(obj) == "boolean" then @@ -278,9 +293,21 @@ local function encode_r(buf, obj, level) error("can not encode FFI type: '"..ffi.typeof(obj).."'") end else - error("can not encode Lua type: '"..type(obj).."'") + if msgpack.cfg.encode_use_tostring then + obj = tostring(obj) + if obj then + encode_str(buf, obj) + else + error("can not encode Lua type: '"..type(obj).."'") + end + else if msgpack.cfg.encode_invalid_as_nil then + encode_nil(buf) + else + error("can not encode Lua type: '"..type(obj).."'") + end end end +end local function encode(obj) local tmpbuf = buffer.IBUF_SHARED @@ -562,6 +589,15 @@ decode_r = function(data) elseif c >= 0xe0 then return tonumber(ffi.cast('signed char',c)) -- negfixint elseif c == 0xc0 then + if msgpack.cfg.decode_invalid_numbers then + if c == 0/0 then + return math.nan + elseif c == 1/0 then + return math.huge + elseif c == -1/0 then + return -math.huge + end + end return msgpack.NULL elseif c == 0xc2 then return false @@ -628,6 +664,8 @@ return { on_encode = on_encode; decode_unchecked = decode_unchecked; decode = decode_unchecked; -- just for tests + cfg = msgpack.cfg; + new = msgpack.new; -- for serializer tests to work properly internal = { encode_fix = encode_fix; encode_array = encode_array; diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua index 1609f768f..b29eb9da5 100644 --- a/test/app-tap/lua/serializer_test.lua +++ b/test/app-tap/lua/serializer_test.lua @@ -146,7 +146,7 @@ local function test_signed(test, s) end local function test_double(test, s) - test:plan(s.cfg and 15 or 9) + test:plan(s.cfg and 18 or 9) rt(test, s, -1.1) rt(test, s, 3.1415926535898) @@ -168,23 +168,28 @@ local function test_double(test, s) -- local nan = 0/0 local inf = 1/0 + local minf = -1/0 local ss = s.new() ss.cfg{encode_invalid_numbers = false} test:ok(not pcall(ss.encode, nan), "encode exception on nan") test:ok(not pcall(ss.encode, inf), "encode exception on inf") + test:ok(not pcall(ss.encode, minf), "encode exception on minf") ss.cfg{encode_invalid_numbers = true} local xnan = ss.encode(nan) local xinf = ss.encode(inf) + local xminf = ss.encode(minf) ss.cfg{decode_invalid_numbers = false} test:ok(not pcall(ss.decode, xnan), "decode exception on nan") test:ok(not pcall(ss.decode, xinf), "decode exception on inf") + test:ok(not pcall(ss.decode, xminf), "decode exception on minf") ss.cfg{decode_invalid_numbers = true} rt(test, s, nan) rt(test, s, inf) + rt(test, s, minf) ss = nil end -- 2.20.1 (Apple Git-117) Subject: [PATCH 2/2] Typos --- src/lua/msgpack.c | 1 - src/lua/msgpack.h | 2 +- src/lua/utils.h | 2 +- 3 files changed, 2 insertions(+), 3 deletions(-) diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c index ef315b692..4c3fbfba2 100644 --- a/src/lua/msgpack.c +++ b/src/lua/msgpack.c @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data) (unsigned char) **data); } - void luamp_set_decode_extension(luamp_decode_extension_f handler) { diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h index d0ade303f..1e60f0a5d 100644 --- a/src/lua/msgpack.h +++ b/src/lua/msgpack.h @@ -47,7 +47,7 @@ struct mpstream; /** * Default instance of msgpack serializer (msgpack = require('msgpack')). * This instance is used by all box's Lua/C API bindings (e.g. space:replace()). - * All changes made by msgpack.cfg{} function are also affect box's bindings + * All changes made by msgpack.cfg{} function also affect box's bindings * (this is a feature). */ extern struct luaL_serializer *luaL_msgpack_default; diff --git a/src/lua/utils.h b/src/lua/utils.h index d9fb0704f..1e586295a 100644 --- a/src/lua/utils.h +++ b/src/lua/utils.h @@ -186,7 +186,7 @@ struct luaL_serializer { * + map - at least one table index is not unsigned integer. * + regular array - all array indexes are available. * + sparse array - at least one array index is missing. - * + excessively sparse arrat - the number of values missing + * + excessively sparse array - the number of values missing * exceeds the configured ratio. * * An array is excessively sparse when **all** the following -- 2.20.1 (Apple Git-117) >Вторник, 12 ноября 2019, 2:21 +03:00 от Alexander Turenko <alexander.turenko@tarantool.org>: > >We should test the new code. Look, test/app-tap/lua/serializer_test.lua >(common code to test serializers) contains `if not s.cfg then return >end` blocks to exclude test cases, which are not supported by >msgpackffi. Please, ensure that all new functionality is tested. > >You can also try to use luacov (test-run even has --luacov option), >however I didn't tried it with tarantool's build-in code, only with >tarantool/graphql. Don't sure whether it is hard to setup, so I don't >insist on this at all. > >Igor (I CCed him) recently wonder whether it worth to use ffi on modern >LuaJIT with implemented traces stitching. Maybe it worth to perform some >microbenchmarks of msgpack and msgpackffi to ensure that there is still >need to work on msgpackffi. (Technically it is out of scope of your >issue, however it is interesting to know. Please, ask Georgy whether it >is worth to spend your time on that now.) > >See other notes below. > >WBR, Alexander Turenko. > >On Thu, Nov 07, 2019 at 01:36:42PM +0300, Maria wrote: >> Lua serializers have common configuration options >> that mostly were not respected in msgpackffi module. >> This patch fixes that for several configurations: >> encode_use_tostring, encode_invalid_as_nil and >> encode_load_metatables. >> Options encode/decode_invalid_numbers are suggested >> to not be considered in msgpackffi module since lua >> processed them properly without this check so adding >> it would mean cutting functionality provided by the >> language in case this field was set to be false. > >Lua supports NaN/Inf values, MsgPack (I mean, the format) supports them >too -- here you are right. But a user logic can lean on the fact that >all floating point numbers are not NaN/Inf. Also, we intend to be >compatible with msgpack module, which supports those options. I propose >to implement them. > >I think we also should implement encode_sparse_* options to be >compatible with msgpack module. > >During the review I gathered the information what options are supported >by each (de)serializer: https://github.com/tarantool/doc/issues/976 >Maybe this will be interesting for you. > >I'll paste notes I wrote while looking around serializers (I was not >quite aware of the code), they can be useful to verify the table from >the issue above. > >---- > >The following encode_* options are handled in src/lua/utils.c (in >luaL_tofield() and luaL_convertfield(), which are used in msgpack, json >and yaml modules; sometimes via luaL_checkfield() helper): > >- encode_sparse_convert >- encode_sparse_ratio >- encode_sparse_safe >- encode_invalid_numbers >- encode_load_metatables >- encode_use_tostring >- encode_invalid_as_nil > >The following encode_* options are handled separately by each >serializer: > >- encode_max_depth: msgpack, json >- encode_deep_as_nil: msgpack, json >- encode_number_precision: json, yaml > >(msgpackffi does not use those helpers, because they are for Lua-C API.) > >All decode_* options are handled separately by each deserializer (but >sometimes using luaL_checkfinite() helper from src/lua/utils.c): > >- decode_invalid_numbers: msgpack, json, yaml >- decode_save_metatables: msgpack, json, yaml >- decode_max_depth: json > >---- > >FYI: I also filed https://github.com/tarantool/tarantool/issues/4622 >('lua: forbid options that are not applicable for certain >(de)serializer'). > >> Closes #4499 > >I think it worth to add a docbot comment (see tarantool/docbot) and >state that msgpackffi now supports all msgpack options (and gather those >ones that are set with msgpack.cfg()). It worth also mention that while >the API is mostly the same, msgpackffi does not check a buffer >boundaries and lacks of .decode_{array,map}_header(), .cfg() and .new() >functions (maybe there are other differences; I would expect that you >will refine them during the work on tests). > >However, to be honest, I'm a bit tentative about making the module >public. I would discuss this with the rest of the team. Igor, Georgy, >what do you think about it? > >> Issue: >> https://github.com/tarantool/tarantool/issues/4499 >> Branch: >> https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options > >> diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c >> index c2be0b3e8..d3df1a8c1 100644 >> --- a/src/lua/msgpack.c >> +++ b/src/lua/msgpack.c >> @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data) >> (unsigned char) **data); >> } >> >> - >> void >> luamp_set_decode_extension(luamp_decode_extension_f handler) >> { > >Please, move all those changes into a separate commit. > >> diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h >> index d0ade303f..1e60f0a5d 100644 >> --- a/src/lua/msgpack.h >> +++ b/src/lua/msgpack.h >> @@ -47,7 +47,7 @@ struct mpstream; >> /** >> * Default instance of msgpack serializer (msgpack = require('msgpack')). >> * This instance is used by all box's Lua/C API bindings (e.g. space:replace()). >> - * All changes made by msgpack.cfg{} function are also affect box's bindings >> + * All changes made by msgpack.cfg{} function also affect box's bindings >> * (this is a feature). >> */ >> extern struct luaL_serializer *luaL_msgpack_default; >> diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua >> index f7ee44291..43f5d6df1 100644 >> --- a/src/lua/msgpackffi.lua >> +++ b/src/lua/msgpackffi.lua >> @@ -224,9 +224,11 @@ local function encode_r(buf, obj, level) >> return >> end >> local serialize = nil >> - local mt = getmetatable(obj) >> - if mt ~= nil then >> - serialize = mt.__serialize >> + if msgpack.cfg.encode_load_metatables then >> + local mt = getmetatable(obj) >> + if mt ~= nil then >> + serialize = mt.__serialize >> + end >> end >> -- calculate the number of array and map elements in the table >> -- TODO: pairs() aborts JIT >> @@ -275,7 +277,19 @@ local function encode_r(buf, obj, level) >> error("can not encode FFI type: '"..ffi.typeof(obj).."'") >> end >> else >> - error("can not encode Lua type: '"..type(obj).."'") >> + if msgpack.cfg.encode_use_tostring then >> + obj = tostring(obj) >> + if obj then >> + goto restart >> + else >> + error("can not encode Lua type: '"..type(obj).."'") >> + end >> + else if msgpack.cfg.encode_invalid_as_nil then >> + encode_nil(buf) >> + else >> + error("can not encode Lua type: '"..type(obj).."'") >> + end >> + end >> end >> end > >The code itself looks correct, however we cannot be sure w/o proper >testing. -- Maria Khaydich [-- Attachment #2: Type: text/html, Size: 19504 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now 2019-11-29 21:09 ` Maria Khaydich @ 2019-12-17 23:26 ` Alexander Turenko 2020-01-24 17:37 ` [Tarantool-patches] [PATCH] lua: msgpackffi " Maria Khaydich 0 siblings, 1 reply; 6+ messages in thread From: Alexander Turenko @ 2019-12-17 23:26 UTC (permalink / raw) To: Maria Khaydich; +Cc: tarantool-patches Sorry again for the late reply. > > I think we also should implement encode_sparse_* options to > > be compatible with msgpack module. > > Could you please explain how do you propose to implement them? It > seems to me the purpose of those configs is to handle excessively > sparse arrays as maps (as stated in src/lua/utils.h). Which does not > seem relevant to lua module where the only container type is table. > What do I miss here? It is about encoding, not decoding: encode a Lua table with numeric keys as MP_ARRAY, MP_MAP or give an error. It seems that nothing prevent us from implementing this logic on Lua. > Subject: [PATCH 1/2] Msgpackffi considers msgpack.cfg options now > Msgpack.cfg options changes did not affect msgpackffi serializer state. > This patch fixes it for options encode_load_metatables, decode_ and > encode_invalid_numbers, encode_use_tostring and encode_invalid_as_nil. > Closes #4499 > --- > Issue: > https://github.com/tarantool/tarantool/issues/4499 > Branch: > https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options > @@ -261,6 +263,19 @@ local function encode_r(buf, obj, level) > else > error("Invalid __serialize value") > end > + elseif type(obj) == math.huge or type(obj) == -math.huge or > + type(obj) == math.nan then > + if msgpack.cfg.encode_invalid_numbers then > + if obj == math.huge then > + obj = 1/0 > + elseif obj == -math.huge then > + obj = -1/0 > + else > + obj = 0/0 > + end > + else > + encode_nil(buf) > + end I would save 1/0, -1/0, 0/0 as module local variables for readability, like as it is done in test_double(). I also found that math.huge is 1/0 (inf) and there is no math.nan (it just nil). > @@ -562,6 +589,15 @@ decode_r = function(data) > elseif c >= 0xe0 then > return tonumber(ffi.cast('signed char',c)) -- negfixint > elseif c == 0xc0 then > + if msgpack.cfg.decode_invalid_numbers then > + if c == 0/0 then > + return math.nan > + elseif c == 1/0 then > + return math.huge > + elseif c == -1/0 then > + return -math.huge > + end > + end Same as above. > @@ -628,6 +664,8 @@ return { > on_encode = on_encode; > decode_unchecked = decode_unchecked; > decode = decode_unchecked; -- just for tests > + cfg = msgpack.cfg; > + new = msgpack.new; -- for serializer tests to work properly So they'll test msgpack, not msgpackffi. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: msgpackffi considers msgpack.cfg options now 2019-12-17 23:26 ` Alexander Turenko @ 2020-01-24 17:37 ` Maria Khaydich 2020-01-25 15:01 ` Konstantin Osipov 0 siblings, 1 reply; 6+ messages in thread From: Maria Khaydich @ 2020-01-24 17:37 UTC (permalink / raw) To: Alexander Turenko; +Cc: tarantool-patches [-- Attachment #1: Type: text/plain, Size: 18133 bytes --] Thank you for the review! All done now: Msgpack.cfg reconfiguration did not affect ffi module behavior. This patch fixes it for all the options that were not yet covered elsewhere: encode_sparse_convert, encode_sparse_ratio, encode_sparse_safe, encode_invalid_numbers & decode_invalid_numbers, encode_load_metatables & decode_save_metatables, encode_use_tostring, encode_invalid_as_nil. Most of common serializers' test were not applicable to msgpackffi since it has no cfg field of its own. This issue was also solved. Closes #4499 --- Issue: https://github.com/tarantool/tarantool/issues/4499 Branch: https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options src/lua/msgpack.c | 1 - src/lua/msgpack.h | 2 +- src/lua/msgpackffi.lua | 70 +++++++++++++++++++---- src/lua/utils.h | 2 +- test/app-tap/lua/serializer_test.lua | 7 ++- test/app-tap/msgpackffi.test.lua | 85 +++++++++++++++++++++++++++- 6 files changed, 149 insertions(+), 18 deletions(-) diff --git a/src/lua/msgpack.h b/src/lua/msgpack.h index d0ade303f..1e60f0a5d 100644 --- a/src/lua/msgpack.h +++ b/src/lua/msgpack.h @@ -47,7 +47,7 @@ struct mpstream; /** * Default instance of msgpack serializer (msgpack = require('msgpack')). * This instance is used by all box's Lua/C API bindings (e.g. space:replace()). - * All changes made by msgpack.cfg{} function are also affect box's bindings + * All changes made by msgpack.cfg{} function also affect box's bindings * (this is a feature). */ extern struct luaL_serializer *luaL_msgpack_default; diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua index f775f2d41..159a652b9 100644 --- a/src/lua/msgpackffi.lua +++ b/src/lua/msgpackffi.lua @@ -206,6 +206,10 @@ local function encode_nil(buf) p[0] = 0xc0 end +local inf = 1/0 +local minf = -1/0 +local nan = 0/0 + local function encode_r(buf, obj, level) ::restart:: if type(obj) == "number" then @@ -213,6 +217,14 @@ local function encode_r(buf, obj, level) if obj % 1 == 0 and obj > -1e63 and obj < 1e64 then encode_int(buf, obj) else + if (obj == inf or obj == minf or tostring(obj) == "nan") and + not msgpack.cfg.encode_invalid_numbers then + if msgpack.cfg.encode_invalid_as_nil then + encode_nil(buf) + else + error("Invalid numbers encoding is manually prohibited") + end + end encode_double(buf, obj) end elseif type(obj) == "string" then @@ -227,28 +239,46 @@ local function encode_r(buf, obj, level) return end local serialize = nil - local mt = getmetatable(obj) - if mt ~= nil then - serialize = mt.__serialize + if msgpack.cfg.encode_load_metatables then + local mt = getmetatable(obj) + if mt ~= nil then + serialize = mt.__serialize + end end -- calculate the number of array and map elements in the table -- TODO: pairs() aborts JIT - local array_count, map_count = 0, 0 - for key in pairs(obj) do - if type(key) == 'number' and key >= 1 and - key == math.floor(key) and key == array_count + 1 then + local array_count, map_count, max_idx = 0, 0, 0 + for key, el in pairs(obj) do + if type(key) == 'number' and key >= 1 and + key == math.floor(key) then array_count = array_count + 1 + if key > max_idx then + max_idx = key + end else map_count = map_count + 1 end end - if (serialize == nil and map_count == 0) or serialize == 'array' or - serialize == 'seq' or serialize == 'sequence' then + -- excessively sparse tables should be encoded as maps + local is_excessively_sparse = false + if msgpack.cfg.encode_sparse_ratio and + max_idx > msgpack.cfg.encode_sparse_safe and + max_idx > array_count * msgpack.cfg.encode_sparse_ratio then + if not msgpack.cfg.encode_sparse_convert then + error(string.format("Can\'t encode excessively ".. + "sparse tables when encode_sparse_convert ".. + "configuration option is set to false")) + end + is_excessively_sparse = true + end + if (not is_excessively_sparse) and ( serialize == 'array' or + serialize == 'seq' or serialize == 'sequence' or + (serialize == nil and map_count == 0) ) then encode_array(buf, array_count) for i=1,array_count,1 do encode_r(buf, obj[i], level + 1) end - elseif (serialize == nil and map_count > 0) or + elseif serialize == nil or is_excessively_sparse or serialize == 'map' or serialize == 'mapping' then encode_map(buf, array_count + map_count) for key, val in pairs(obj) do @@ -278,7 +308,18 @@ local function encode_r(buf, obj, level) error("can not encode FFI type: '"..ffi.typeof(obj).."'") end else - error("can not encode Lua type: '"..type(obj).."'") + if msgpack.cfg.encode_use_tostring then + obj = tostring(obj) + if obj then + encode_str(buf, obj) + else + error("can not encode Lua type: '"..type(obj).."'") + end + elseif msgpack.cfg.encode_invalid_as_nil then + encode_nil(buf) + else + error("can not encode Lua type: '"..type(obj).."'") + end end end @@ -453,7 +494,12 @@ end local function decode_double(data) data[0] = data[0] - 1 -- mp_decode_double need type code - return tonumber(builtin.mp_decode_double(data)) + local res = tonumber(builtin.mp_decode_double(data)) + if (res == inf or res == minf or tostring(res) == "nan") and + not msgpack.cfg.decode_invalid_numbers then + error("Invalid numbers decoding is manually prohibited") + end + return res end local function decode_str(data, size) diff --git a/src/lua/utils.h b/src/lua/utils.h index 0b3672769..fed83cf8d 100644 --- a/src/lua/utils.h +++ b/src/lua/utils.h @@ -183,7 +183,7 @@ struct luaL_serializer { * + map - at least one table index is not unsigned integer. * + regular array - all array indexes are available. * + sparse array - at least one array index is missing. - * + excessively sparse arrat - the number of values missing + * + excessively sparse array - the number of values missing * exceeds the configured ratio. * * An array is excessively sparse when **all** the following diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua index 47edac621..860f4dceb 100644 --- a/test/app-tap/lua/serializer_test.lua +++ b/test/app-tap/lua/serializer_test.lua @@ -150,7 +150,7 @@ local function test_signed(test, s) end local function test_double(test, s) - test:plan(s.cfg and 15 or 9) + test:plan(s.cfg and 18 or 9) rt(test, s, -1.1) rt(test, s, 3.1415926535898) @@ -172,23 +172,28 @@ local function test_double(test, s) -- local nan = 0/0 local inf = 1/0 + local minf = -1/0 local ss = s.new() ss.cfg{encode_invalid_numbers = false} test:ok(not pcall(ss.encode, nan), "encode exception on nan") test:ok(not pcall(ss.encode, inf), "encode exception on inf") + test:ok(not pcall(ss.encode, minf), "encode exception on minf") ss.cfg{encode_invalid_numbers = true} local xnan = ss.encode(nan) local xinf = ss.encode(inf) + local xminf = ss.encode(minf) ss.cfg{decode_invalid_numbers = false} test:ok(not pcall(ss.decode, xnan), "decode exception on nan") test:ok(not pcall(ss.decode, xinf), "decode exception on inf") + test:ok(not pcall(ss.decode, xminf), "decode exception on minf") ss.cfg{decode_invalid_numbers = true} rt(test, s, nan) rt(test, s, inf) + rt(test, s, minf) ss = nil end diff --git a/test/app-tap/msgpackffi.test.lua b/test/app-tap/msgpackffi.test.lua index 994402d15..6304559f0 100755 --- a/test/app-tap/msgpackffi.test.lua +++ b/test/app-tap/msgpackffi.test.lua @@ -5,6 +5,7 @@ package.path = "lua/?.lua;"..package.path local tap = require('tap') local common = require('serializer_test') local ffi = require('ffi') +local msgpack = require('msgpack') local function is_map(s) local b = string.byte(string.sub(s, 1, 1)) @@ -36,6 +37,86 @@ local function test_offsets(test, s) test:ok(not pcall(s.decode, dump, offset), "invalid offset") end +local function test_configs(test, s) + test:plan(26) + -- + -- gh-4499: ffi module should consider msgpack.cfg options + -- + + -- only excessively sparse arrays should be encoded as maps, + -- i.e. encode_sparse_ratio > 0 && max_idx > encode_sparse_safe + -- && max_idx > num(elements) * encode_sparse_ratio + test:ok(is_array(s.encode({ [1] = 1, [3] = 3, [4] = 3 })), + "not excessively sparse") + test:ok(is_array(s.encode({ [1] = 1, [2] = 2, [3] = 3, [4] = 4, [5] = 5, + [11] = 6 })), "still not excessively sparse") + test:ok(is_map(s.encode({ [1] = 1, [2] = 2, [100] = 3 })), + "is excessively sparse") + + msgpack.cfg({encode_sparse_convert = false}) + local ok, _ = pcall(s.encode(), { [1] = 1, [2] = 2, [100] = 3 }) + test:ok(not ok, "conversion of sparse tables is manually prohibited") + + -- testing (en/de)coding of invalid numbers + local inf = 1/0 + local minf = -1/0 + local nan = 0/0 + + test:ok(s.decode(s.encode(inf)), "decode & encode for inf") + test:ok(s.decode(s.encode(minf)), "decode & encode for minf") + test:ok(s.decode(s.encode(nan)), "decode & encode for nan") + + msgpack.cfg({encode_invalid_numbers = false}) + test:ok(not pcall(s.encode, inf), "prohibited encode for inf") + test:ok(not pcall(s.encode, minf), "prohibited encode for minf") + test:ok(not pcall(s.encode, nan), "prohibited encode for nan") + + msgpack.cfg({decode_invalid_numbers = false}) + test:ok(not pcall(s.decode, inf), "prohibited decode for inf") + test:ok(not pcall(s.decode, minf), "prohibited decode for minf") + test:ok(not pcall(s.decode, nan), "prohibited decode for nan") + + -- invalid numbers can also be encoded as nil values + msgpack.cfg({decode_invalid_numbers = true, encode_invalid_as_nil = true}) + test:is(s.decode(s.encode(inf)), nil, "invalid_as_nil for inf") + test:is(s.decode(s.encode(minf)), nil, "invalid_as_nil for minf") + test:is(s.decode(s.encode(nan)), nil, "invalid_as_nil for nan") + + -- whether __serialize meta-value checking is enabled + local arr = setmetatable({1, 2, 3, k1 = 'v1', k2 = 'v2', 4, 5}, + { __serialize = 'seq'}) + local map = setmetatable({1, 2, 3, 4, 5}, { __serialize = 'map'}) + local obj = setmetatable({}, { + __serialize = function(x) return 'serialize' end + }) + + -- testing encode metatables + msgpack.cfg{encode_load_metatables = false} + test:ok(is_map(s.encode(arr)), "array ignore __serialize") + test:ok(is_array(s.encode(map)), "map ignore __serialize") + test:ok(is_array(s.encode(obj)), "object ignore __serialize") + + msgpack.cfg{encode_load_metatables = true} + test:ok(is_array(s.encode(arr)), "array load __serialize") + test:ok(is_map(s.encode(map)), "map load __serialize") + test:is(s.decode(s.encode(obj)), "serialize", "object load __serialize") + + -- testing decode metatables + msgpack.cfg{decode_save_metatables = false} + test:isnil(getmetatable(s.decode(s.encode(arr))), "array __serialize") + test:isnil(getmetatable(s.decode(s.encode(map))), "map __serialize") + + msgpack.cfg{decode_save_metatables = true} + test:is(getmetatable(s.decode(s.encode(arr))).__serialize, "seq", + "array save __serialize") + test:is(getmetatable(s.decode(s.encode(map))).__serialize, "map", + "map save __serialize") + + -- applying default configs + msgpack.cfg({encode_sparse_convert = true, encode_invalid_numbers = true, + encode_invalid_as_nil = false}) +end + local function test_other(test, s) test:plan(24) local buf = string.char(0x93, 0x6e, 0xcb, 0x42, 0x2b, 0xed, 0x30, 0x47, @@ -82,7 +163,6 @@ local function test_other(test, s) while t ~= nil do level = level + 1 t = t[1] end return level end - local msgpack = require('msgpack') local deep_as_nil = msgpack.cfg.encode_deep_as_nil msgpack.cfg({encode_deep_as_nil = true}) local max_depth = msgpack.cfg.encode_max_depth @@ -118,7 +198,7 @@ end tap.test("msgpackffi", function(test) local serializer = require('msgpackffi') - test:plan(11) + test:plan(12) test:test("unsigned", common.test_unsigned, serializer) test:test("signed", common.test_signed, serializer) test:test("double", common.test_double, serializer) @@ -130,6 +210,7 @@ tap.test("msgpackffi", function(test) -- udata/cdata hooks are not implemented --test:test("ucdata", common.test_ucdata, serializer) test:test("offsets", test_offsets, serializer) + test:test("configs", test_configs, serializer) test:test("other", test_other, serializer) test:test("decode_buffer", common.test_decode_buffer, serializer) end) diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c index edbc15b72..133b2b527 100644 --- a/src/lua/msgpack.c +++ b/src/lua/msgpack.c @@ -95,7 +95,6 @@ luamp_decode_extension_default(struct lua_State *L, const char **data) (unsigned char) **data); } - void luamp_set_decode_extension(luamp_decode_extension_f handler) { -- 2.24.0 > >> >>Среда, 18 декабря 2019, 2:26 +03:00 от Alexander Turenko <alexander.turenko@tarantool.org>: >> >>Sorry again for the late reply. >> >>> > I think we also should implement encode_sparse_* options to >>> > be compatible with msgpack module. >>> >>> Could you please explain how do you propose to implement them? It >>> seems to me the purpose of those configs is to handle excessively >>> sparse arrays as maps (as stated in src/lua/utils.h). Which does not >>> seem relevant to lua module where the only container type is table. >>> What do I miss here? >> >>It is about encoding, not decoding: encode a Lua table with numeric keys >>as MP_ARRAY, MP_MAP or give an error. It seems that nothing prevent us >>from implementing this logic on Lua. >> >>> Subject: [PATCH 1/2] Msgpackffi considers msgpack.cfg options now >>> Msgpack.cfg options changes did not affect msgpackffi serializer state. >>> This patch fixes it for options encode_load_metatables, decode_ and >>> encode_invalid_numbers, encode_use_tostring and encode_invalid_as_nil. >>> Closes #4499 >>> --- >>> Issue: >>> https://github.com/tarantool/tarantool/issues/4499 >>> Branch: >>> https://github.com/tarantool/tarantool/compare/eljashm/gh-4499-msgpackffi-ignores-msgpack.cfg-options >> >>> @@ -261,6 +263,19 @@ local function encode_r(buf, obj, level) >>> else >>> error("Invalid __serialize value") >>> end >>> + elseif type(obj) == math.huge or type(obj) == -math.huge or >>> + type(obj) == math.nan then >>> + if msgpack.cfg.encode_invalid_numbers then >>> + if obj == math.huge then >>> + obj = 1/0 >>> + elseif obj == -math.huge then >>> + obj = -1/0 >>> + else >>> + obj = 0/0 >>> + end >>> + else >>> + encode_nil(buf) >>> + end >> >>I would save 1/0, -1/0, 0/0 as module local variables for readability, >>like as it is done in test_double(). >> >>I also found that math.huge is 1/0 (inf) and there is no math.nan (it >>just nil). >> >>> @@ -562,6 +589,15 @@ decode_r = function(data) >>> elseif c >= 0xe0 then >>> return tonumber(ffi.cast('signed char',c)) -- negfixint >>> elseif c == 0xc0 then >>> + if msgpack.cfg.decode_invalid_numbers then >>> + if c == 0/0 then >>> + return math.nan >>> + elseif c == 1/0 then >>> + return math.huge >>> + elseif c == -1/0 then >>> + return -math.huge >>> + end >>> + end >> >>Same as above. >> >>> @@ -628,6 +664,8 @@ return { >>> on_encode = on_encode; >>> decode_unchecked = decode_unchecked; >>> decode = decode_unchecked; -- just for tests >>> + cfg = msgpack.cfg; >>> + new = msgpack.new; -- for serializer tests to work properly >> >>So they'll test msgpack, not msgpackffi. > > >-- >Maria Khaydich > [-- Attachment #2: Type: text/html, Size: 24901 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [PATCH] lua: msgpackffi considers msgpack.cfg options now 2020-01-24 17:37 ` [Tarantool-patches] [PATCH] lua: msgpackffi " Maria Khaydich @ 2020-01-25 15:01 ` Konstantin Osipov 0 siblings, 0 replies; 6+ messages in thread From: Konstantin Osipov @ 2020-01-25 15:01 UTC (permalink / raw) To: Maria Khaydich; +Cc: tarantool-patches * Maria Khaydich <maria.khaydich@tarantool.org> [20/01/24 20:42]: Goodbye, performance? -- Konstantin Osipov, Moscow, Russia ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-01-25 15:01 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-07 10:36 [Tarantool-patches] [PATCH] Msgpackffi considers msgpack.cfg options now Maria 2019-11-11 23:21 ` Alexander Turenko 2019-11-29 21:09 ` Maria Khaydich 2019-12-17 23:26 ` Alexander Turenko 2020-01-24 17:37 ` [Tarantool-patches] [PATCH] lua: msgpackffi " Maria Khaydich 2020-01-25 15:01 ` Konstantin Osipov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox