[tarantool-patches] Re: [PATCH v2 4/4] app: allow to raise an error on too nested tables

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Mon Sep 23 23:41:10 MSK 2019



On 23/09/2019 03:33, Alexander Turenko wrote:
>>     YAML is a crazy serializer without depth restrictions. But for
>>     JSON and msgpack a user could set encode_max_depth option. That
> 
> Nit: And msgpackffi.
> 
>>     Option encode_crop_too_deep works for JSON and msgpack modules,
> 
> Nit: And msgpackffi.
> 

Agree, new message (with a new option name):

    app: raise an error on too nested tables serialization
    
    Closes #4434
    Follow-up #4366
    
    @TarantoolBot document
    Title: json/msgpack.cfg.encode_deep_as_nil option
    
    Tarantool has several so called serializers to convert data
    between Lua and another format: YAML, JSON, msgpack.
    
    YAML is a crazy serializer without depth restrictions. But for
    JSON, msgpack, and msgpackffi a user could set encode_max_depth
    option. That option led to crop of a table when it had too many
    nested levels. Sometimes such behaviour is undesirable.
    
    Now an error is raised instead of data corruption:
    
        t = nil
        for i = 1, 100 do t = {t} end
        msgpack.encode(t) -- Here an exception is thrown.
    
    To disable it and return the old behaviour back here is a new
    option:
    
        <serializer>.cfg({encode_deep_as_nil = true})
    
    Option encode_deep_as_nil works for JSON, msgpack, and msgpackffi
    modules, and is false by default. It means, that now if some
    existing users have cropping, even intentional, they will get the
    exception.

>>>>     msgpack.cfg({encode_crop_too_deep = false})
>>>
>>> We have encode_invalid_as_nil with the similar meanings, so, maybe,
>>> encode_too_deep_as_nil?
>>>
>>> Or maybe rephrase 'too deep' as one word: foots, dregs, underside? Those
>>> variants don't look good, but maybe you know some word that would fit
>>> better.
>>
>> No, I don't know a good synonym. But perhaps we could use
>> 'encode_max_depth_and_crop'. That 1) makes clear, that the option is
>> related to 'encode_max_depth', 2) is obvious, that data deeper than
>> max_depth is cropped. Is it ok?
> 
> Words in 'encode_max_depth_and_crop' feels as non-concordant, IMHO.
> 
>> Or 'encode_crop_after_max_depth'. Or 'encode_trim_by_max_depth'.
> 
> For me 'encode_crop_after_max_depth' is better then
> 'encode_crop_too_deep'.
> 
> I stick a bit with the idea to use 'as_nil', but I dislike the word
> 'too' in an option name: it feels vague. Maybe 'encode_deep_as_nil'?
> 
> If we'll go away from 'as_nil', then 'encode_raise_after_max_depth' look
> good (except maybe length).
> 
> From all those variants I would choose 'encode_deep_as_nil'.
> 
> I would let you decide finally.

I would not really bother so much, all names look ok. I will use
encode_deep_as_nil, if you think it is better.

diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
index 25f55604c..ef315b692 100644
--- a/src/lua/msgpack.c
+++ b/src/lua/msgpack.c
@@ -143,7 +143,7 @@ restart: /* used by MP_EXT */
 	case MP_MAP:
 		/* Map */
 		if (level >= cfg->encode_max_depth) {
-			if (! cfg->encode_crop_too_deep) {
+			if (! cfg->encode_deep_as_nil) {
 				return luaL_error(L, "Too high nest level - %d",
 						  level + 1);
 			}
@@ -168,7 +168,7 @@ restart: /* used by MP_EXT */
 	case MP_ARRAY:
 		/* Array */
 		if (level >= cfg->encode_max_depth) {
-			if (! cfg->encode_crop_too_deep) {
+			if (! cfg->encode_deep_as_nil) {
 				return luaL_error(L, "Too high nest level - %d",
 						  level + 1);
 			}
diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index 138663ccb..5331dc75f 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -219,7 +219,7 @@ local function encode_r(buf, obj, level)
         encode_str(buf, obj)
     elseif type(obj) == "table" then
         if level >= msgpack.cfg.encode_max_depth then
-            if not msgpack.cfg.encode_crop_too_deep then
+            if not msgpack.cfg.encode_deep_as_nil then
                 error(string.format('Too high nest level - %d',
                                     msgpack.cfg.encode_max_depth + 1))
             end
diff --git a/src/lua/utils.c b/src/lua/utils.c
index bc06c1227..7f7bf4776 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -237,7 +237,7 @@ static struct {
 	OPTION(LUA_TNUMBER,  encode_sparse_ratio, 2),
 	OPTION(LUA_TNUMBER,  encode_sparse_safe, 10),
 	OPTION(LUA_TNUMBER,  encode_max_depth, 32),
-	OPTION(LUA_TBOOLEAN, encode_crop_too_deep, 0),
+	OPTION(LUA_TBOOLEAN, encode_deep_as_nil, 0),
 	OPTION(LUA_TBOOLEAN, encode_invalid_numbers, 1),
 	OPTION(LUA_TNUMBER,  encode_number_precision, 14),
 	OPTION(LUA_TBOOLEAN, encode_load_metatables, 1),
diff --git a/src/lua/utils.h b/src/lua/utils.h
index d581e47fd..f8c34545a 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -215,10 +215,11 @@ struct luaL_serializer {
 	int encode_max_depth;
 	/**
 	 * A flag whether a table with too high nest level should
-	 * be cropped. If not set, than this is considered an
+	 * be cropped. The not-encoded fields are replaced with
+	 * one null. If not set, too high nesting is considered an
 	 * error.
 	 */
-	int encode_crop_too_deep;
+	int encode_deep_as_nil;
 	/** Enables encoding of NaN and Inf numbers */
 	int encode_invalid_numbers;
 	/** Floating point numbers precision (YAML, CJSON only) */
diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
index ba77a8d77..cd210c85f 100755
--- a/test/app-tap/json.test.lua
+++ b/test/app-tap/json.test.lua
@@ -38,10 +38,10 @@ tap.test("json", function(test)
     --
     -- gh-2888: Check the possibility of using options in encode()/decode().
     --
-    local orig_encode_crop_too_deep = serializer.cfg.encode_crop_too_deep
+    local orig_encode_deep_as_nil = serializer.cfg.encode_deep_as_nil
     local orig_encode_max_depth = serializer.cfg.encode_max_depth
     local sub = {a = 1, { b = {c = 1, d = {e = 1}}}}
-    serializer.cfg({encode_max_depth = 1, encode_crop_too_deep = true})
+    serializer.cfg({encode_max_depth = 1, encode_deep_as_nil = true})
     test:ok(serializer.encode(sub) == '{"1":null,"a":1}',
             'depth of encoding is 1 with .cfg')
     serializer.cfg({encode_max_depth = orig_encode_max_depth})
@@ -123,5 +123,5 @@ tap.test("json", function(test)
     test:is(serializer.encode(rec4),
             '{"a":{"a":null,"b":null},"b":{"a":null,"b":null}}')
     serializer.cfg({encode_max_depth = orig_encode_max_depth,
-                    encode_crop_too_deep = orig_encode_crop_too_deep})
+                    encode_deep_as_nil = orig_encode_deep_as_nil})
 end)
diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua
index 85738924a..1609f768f 100644
--- a/test/app-tap/lua/serializer_test.lua
+++ b/test/app-tap/lua/serializer_test.lua
@@ -418,8 +418,8 @@ local function test_depth(test, s)
     -- they want to raise an error on tables with too high nest
     -- level.
     --
-    local crop_too_deep = s.cfg.encode_crop_too_deep
-    s.cfg({encode_crop_too_deep = false})
+    local deep_as_nil = s.cfg.encode_deep_as_nil
+    s.cfg({encode_deep_as_nil = false})
 
     local t = nil
     for i = 1, max_depth + 1 do t = {t} end
@@ -430,7 +430,7 @@ local function test_depth(test, s)
     ok, err = pcall(s.encode, t)
     test:ok(ok, "no throw in a corner case")
 
-    s.cfg({encode_crop_too_deep = crop_too_deep, encode_max_depth = max_depth})
+    s.cfg({encode_deep_as_nil = deep_as_nil, encode_max_depth = max_depth})
 end
 
 return {
diff --git a/test/app-tap/msgpackffi.test.lua b/test/app-tap/msgpackffi.test.lua
index dbae72362..36ac26b7e 100755
--- a/test/app-tap/msgpackffi.test.lua
+++ b/test/app-tap/msgpackffi.test.lua
@@ -82,8 +82,8 @@ local function test_other(test, s)
         return level
     end
     local msgpack = require('msgpack')
-    local crop_too_deep = msgpack.cfg.encode_crop_too_deep
-    msgpack.cfg({encode_crop_too_deep = true})
+    local deep_as_nil = msgpack.cfg.encode_deep_as_nil
+    msgpack.cfg({encode_deep_as_nil = true})
     local max_depth = msgpack.cfg.encode_max_depth
     local result_depth = check_depth(max_depth + 5)
     test:is(result_depth, max_depth,
@@ -107,11 +107,11 @@ local function test_other(test, s)
     while t ~= nil do level = level + 1 t = t.key end
     test:is(level, max_depth + 5, "recursive map")
 
-    msgpack.cfg({encode_crop_too_deep = false})
+    msgpack.cfg({encode_deep_as_nil = false})
     local ok = pcall(check_depth, max_depth + 6)
     test:ok(not ok, "exception is thrown when crop is not allowed")
 
-    msgpack.cfg({encode_crop_too_deep = crop_too_deep,
+    msgpack.cfg({encode_deep_as_nil = deep_as_nil,
                  encode_max_depth = max_depth})
 end
 
diff --git a/test/box/tuple.result b/test/box/tuple.result
index 088400276..f7614b4c4 100644
--- a/test/box/tuple.result
+++ b/test/box/tuple.result
@@ -1401,10 +1401,10 @@ d:update{{'-', 1, dec.new('1e37')}}
 max_depth = msgpack.cfg.encode_max_depth
 ---
 ...
-crop_too_deep = msgpack.cfg.encode_crop_too_deep
+deep_as_nil = msgpack.cfg.encode_deep_as_nil
 ---
 ...
-msgpack.cfg({encode_crop_too_deep = true})
+msgpack.cfg({encode_deep_as_nil = true})
 ---
 ...
 t = nil
@@ -1444,6 +1444,6 @@ level == max_depth + 5 or {level, max_depth}
 ---
 - true
 ...
-msgpack.cfg({encode_max_depth = max_depth, encode_crop_too_deep = crop_too_deep})
+msgpack.cfg({encode_max_depth = max_depth, encode_deep_as_nil = deep_as_nil})
 ---
 ...
diff --git a/test/box/tuple.test.lua b/test/box/tuple.test.lua
index 7660f651c..3ac2902fe 100644
--- a/test/box/tuple.test.lua
+++ b/test/box/tuple.test.lua
@@ -478,8 +478,8 @@ d:update{{'-', 1, dec.new('1e37')}}
 -- gh-4434: tuple should use global msgpack serializer.
 --
 max_depth = msgpack.cfg.encode_max_depth
-crop_too_deep = msgpack.cfg.encode_crop_too_deep
-msgpack.cfg({encode_crop_too_deep = true})
+deep_as_nil = msgpack.cfg.encode_deep_as_nil
+msgpack.cfg({encode_deep_as_nil = true})
 t = nil
 for i = 1, max_depth + 5 do t = {t} end
 tuple = box.tuple.new(t):totable()
@@ -495,4 +495,4 @@ while tuple ~= nil do level = level + 1 tuple = tuple[1] end
 -- serializer allows deeper tables.
 level == max_depth + 5 or {level, max_depth}
 
-msgpack.cfg({encode_max_depth = max_depth, encode_crop_too_deep = crop_too_deep})
+msgpack.cfg({encode_max_depth = max_depth, encode_deep_as_nil = deep_as_nil})
diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
index 0c325d03c..5264b9cd7 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -401,7 +401,7 @@ static void json_append_data(lua_State *l, struct luaL_serializer *cfg,
     break;
     case MP_MAP:
     if (current_depth >= cfg->encode_max_depth) {
-        if (! cfg->encode_crop_too_deep)
+        if (! cfg->encode_deep_as_nil)
             luaL_error(l, "Too high nest level");
         return json_append_nil(cfg, json); /* Limit nested maps */
     }
@@ -410,7 +410,7 @@ static void json_append_data(lua_State *l, struct luaL_serializer *cfg,
     case MP_ARRAY:
     /* Array */
     if (current_depth >= cfg->encode_max_depth) {
-        if (! cfg->encode_crop_too_deep)
+        if (! cfg->encode_deep_as_nil)
             luaL_error(l, "Too high nest level");
         return json_append_nil(cfg, json); /* Limit nested arrays */
     }




More information about the Tarantool-patches mailing list