Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH v2 4/4] app: allow to raise an error on too nested tables
Date: Sat, 14 Sep 2019 00:32:28 +0200
Message-ID: <691762ba-48a6-04b2-d0e0-bafdad317217@tarantool.org>
In-Reply-To: <20190912233231.zs3c6o2zv7pyl6lo@tkn_work_nb>

Thanks for the review!

On 13/09/2019 01:32, Alexander Turenko wrote:
>> app: allow to raise an error on too nested tables
> 
> The commit header and the description look as if 'crop' were still the
> default behaviour, but that is no longer the case.
> 
> I agree that we should change the default behaviour and raise an error
> by default. This however should be clearly stated in the docbot request.

Ok, new commit message:

    app: raise an error on too nested tables serialization
    
    Closes #4434
    Follow-up #4366
    
    @TarantoolBot document
    Title: json/msgpack.cfg.encode_crop_too_deep option
    
    Tarantool has several so-called serializers that convert data
    between Lua and another format: YAML, JSON, msgpack.
    
    YAML is a crazy serializer without depth restrictions. But for
    JSON and msgpack a user can set the encode_max_depth option.
    Until now that option caused a table to be cropped when it had
    too many nested levels. Sometimes such behaviour is undesirable.
    
    Now an error is raised instead of silently corrupting the data:
    
        t = nil
        for i = 1, 100 do t = {t} end
        msgpack.encode(t) -- Here an exception is thrown.
    
    To disable the error and bring the old behaviour back, there
    is a new option:
    
        <serializer>.cfg({encode_crop_too_deep = true})
    
    The encode_crop_too_deep option works for the JSON and msgpack
    modules and is false by default. This means that existing users
    who rely on cropping, even intentionally, will now get the
    exception.
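
For illustration, here is a minimal plain-Lua sketch of the rule the commit
message describes. It is a toy model and not the Tarantool encoder: the
local `cfg` table and the `encode` function are stand-ins invented for this
example, and the option name is the one proposed in this patch, which is
still under discussion. Only the depth check and the error text mirror the
diff to src/lua/msgpack.c.

```lua
-- Toy model of the depth rule. This is NOT Tarantool's encoder: it only
-- handles array-like tables and renders them as a string.
local cfg = {encode_max_depth = 3, encode_crop_too_deep = false}

local function encode(obj, level)
    level = level or 0
    if type(obj) ~= 'table' then
        return tostring(obj)
    end
    if level >= cfg.encode_max_depth then
        if not cfg.encode_crop_too_deep then
            -- Position 0: do not prepend file/line to the message.
            error(string.format('Too high nest level - %d', level + 1), 0)
        end
        return 'nil' -- crop: the too deep table is replaced with nil
    end
    local parts = {}
    for i = 1, #obj do
        parts[i] = encode(obj[i], level + 1)
    end
    return '[' .. table.concat(parts, ',') .. ']'
end

-- Four nested tables, one more than encode_max_depth allows.
local t = nil
for i = 1, 4 do t = {t} end

print(pcall(encode, t))            -- false   Too high nest level - 4
cfg.encode_crop_too_deep = true
print(encode(t))                   -- [[[nil]]]
cfg.encode_max_depth = 4
print(encode(t))                   -- [[[[]]]]
```

The last call corresponds to the "no throw in a corner case" check in the
test: a table nested exactly encode_max_depth levels deep is encoded in
full, with neither cropping nor an error.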

> 
> See more comments below.
> 
> WBR, Alexander Turenko.
> 
>>     msgpack.cfg({encode_crop_too_deep = false})
> 
> We have encode_invalid_as_nil with the similar meanings, so, maybe,
> encode_too_deep_as_nil?
> 
> Or maybe rephrase 'too deep' as one word: foots, dregs, underside? Those
> variants don't look good, but maybe you know some word that would fit
> better.

No, I don't know a good synonym. But perhaps we could use
'encode_max_depth_and_crop'. That name 1) makes it clear that the
option is related to 'encode_max_depth', and 2) makes it obvious that
data deeper than max_depth is cropped. Is that ok?

Or 'encode_crop_after_max_depth'. Or 'encode_trim_by_max_depth'.

>> +			if (! cfg->encode_crop_too_deep)
>> +				return luaL_error(L, "Too high nest level");
> 
> We can easily give more information, say, the current max depth level, so
> I think it is worth adding that to the error message. (The same for
> msgpackffi and json.)
> 

The error message will always just contain cfg.encode_max_depth + 1, so
I don't think it makes much sense to report it. But ok, it won't make
things worse:

diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
index b1354776d..06f6dc53e 100644
--- a/src/lua/msgpack.c
+++ b/src/lua/msgpack.c
@@ -143,8 +143,10 @@ restart: /* used by MP_EXT */
 	case MP_MAP:
 		/* Map */
 		if (level >= cfg->encode_max_depth) {
-			if (! cfg->encode_crop_too_deep)
-				return luaL_error(L, "Too high nest level");
+			if (! cfg->encode_crop_too_deep) {
+				return luaL_error(L, "Too high nest level - %d",
+						  level + 1);
+			}
 			mpstream_encode_nil(stream); /* Limit nested maps */
 			return MP_NIL;
 		}
@@ -167,7 +169,8 @@ restart: /* used by MP_EXT */
 		/* Array */
 		if (level >= cfg->encode_max_depth) {
-			if (! cfg->encode_crop_too_deep)
-				return luaL_error(L, "Too high nest level");
+			if (! cfg->encode_crop_too_deep) {
+				return luaL_error(L, "Too high nest level - %d",
+						  level + 1);
+			}
 			mpstream_encode_nil(stream); /* Limit nested arrays */
 			return MP_NIL;
 		}
diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index 73d0d6fe2..138663ccb 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -220,7 +220,8 @@ local function encode_r(buf, obj, level)
     elseif type(obj) == "table" then
         if level >= msgpack.cfg.encode_max_depth then
             if not msgpack.cfg.encode_crop_too_deep then
-                error('Too high nest level')
+                error(string.format('Too high nest level - %d',
+                                    msgpack.cfg.encode_max_depth + 1))
             end
             encode_nil(buf)
             return

>> +    -- gh-4434 (yes, the same issue): let users choose whether
>> +    -- they want to raise an error on tables with too high nest
>> +    -- level.
>> +    --
>> +    s.cfg({encode_crop_too_deep = false})
>> +
>> +    local t = nil
>> +    for i = 1, max_depth + 1 do t = {t} end
>> +    local ok, err = pcall(s.encode, t)
>> +    test:ok(not ok, "too deep encode depth")
>> +
>> +    s.cfg({encode_max_depth = max_depth + 1})
>> +    ok, err = pcall(s.encode, t)
>> +    test:ok(ok, "no throw in a corner case")
>> +
>> +    s.cfg({encode_crop_too_deep = true, encode_max_depth = max_depth})
> 
> I think we should save a default value of the option and set it here as
> we do with encode_max_depth. (The same for msgpackffi.test.lua.)

Agreed, that looks more correct. Only it is not the 'default' value but
the 'previous' one, whatever was set before the test:

diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
index 0a8966866..ba77a8d77 100755
--- a/test/app-tap/json.test.lua
+++ b/test/app-tap/json.test.lua
@@ -38,9 +38,10 @@ tap.test("json", function(test)
     --
     -- gh-2888: Check the possibility of using options in encode()/decode().
     --
+    local orig_encode_crop_too_deep = serializer.cfg.encode_crop_too_deep
     local orig_encode_max_depth = serializer.cfg.encode_max_depth
     local sub = {a = 1, { b = {c = 1, d = {e = 1}}}}
-    serializer.cfg({encode_max_depth = 1})
+    serializer.cfg({encode_max_depth = 1, encode_crop_too_deep = true})
     test:ok(serializer.encode(sub) == '{"1":null,"a":1}',
             'depth of encoding is 1 with .cfg')
     serializer.cfg({encode_max_depth = orig_encode_max_depth})
@@ -121,5 +122,6 @@ tap.test("json", function(test)
     rec4['b'] = rec4
     test:is(serializer.encode(rec4),
             '{"a":{"a":null,"b":null},"b":{"a":null,"b":null}}')
-    serializer.cfg({encode_max_depth = orig_encode_max_depth})
+    serializer.cfg({encode_max_depth = orig_encode_max_depth,
+                    encode_crop_too_deep = orig_encode_crop_too_deep})
 end)
diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua
index 4fa2924cf..85738924a 100644
--- a/test/app-tap/lua/serializer_test.lua
+++ b/test/app-tap/lua/serializer_test.lua
@@ -418,6 +418,7 @@ local function test_depth(test, s)
     -- they want to raise an error on tables with too high nest
     -- level.
     --
+    local crop_too_deep = s.cfg.encode_crop_too_deep
     s.cfg({encode_crop_too_deep = false})
 
     local t = nil
@@ -429,7 +430,7 @@ local function test_depth(test, s)
     ok, err = pcall(s.encode, t)
     test:ok(ok, "no throw in a corner case")
 
-    s.cfg({encode_crop_too_deep = true, encode_max_depth = max_depth})
+    s.cfg({encode_crop_too_deep = crop_too_deep, encode_max_depth = max_depth})
 end
 
 return {
diff --git a/test/app-tap/msgpackffi.test.lua b/test/app-tap/msgpackffi.test.lua
index a418f5ac1..dbae72362 100755
--- a/test/app-tap/msgpackffi.test.lua
+++ b/test/app-tap/msgpackffi.test.lua
@@ -82,6 +82,7 @@ local function test_other(test, s)
         return level
     end
     local msgpack = require('msgpack')
+    local crop_too_deep = msgpack.cfg.encode_crop_too_deep
     msgpack.cfg({encode_crop_too_deep = true})
     local max_depth = msgpack.cfg.encode_max_depth
     local result_depth = check_depth(max_depth + 5)
@@ -110,7 +111,8 @@ local function test_other(test, s)
     local ok = pcall(check_depth, max_depth + 6)
     test:ok(not ok, "exception is thrown when crop is not allowed")
 
-    msgpack.cfg({encode_max_depth = max_depth})
+    msgpack.cfg({encode_crop_too_deep = crop_too_deep,
+                 encode_max_depth = max_depth})
 end
 
 tap.test("msgpackffi", function(test)
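
The save-and-restore pattern used in the test changes above can be sketched
in plain Lua as follows. Everything here is an assumption made for the
example: `cfg` is a stand-in that only imitates a serializer's callable cfg
table, its option values are arbitrary, and `with_cfg` is a hypothetical
helper that does not exist in Tarantool or in this patch.

```lua
-- Stand-in for a serializer's cfg: fields can be read, and calling it
-- updates them. The option values here are arbitrary.
local cfg = setmetatable(
    {encode_max_depth = 32, encode_crop_too_deep = false},
    {__call = function(self, opts)
        for k, v in pairs(opts) do self[k] = v end
    end})

-- Hypothetical helper: apply opts, run fn, then put back whatever values
-- were set before, even if fn throws. Assumes every option in opts
-- already has a non-nil value in cfg.
local function with_cfg(opts, fn)
    local prev = {}
    for k in pairs(opts) do prev[k] = cfg[k] end
    cfg(opts)
    local ok, res = pcall(fn)
    cfg(prev)
    if not ok then error(res, 0) end
    return res
end

local seen = with_cfg({encode_max_depth = 1, encode_crop_too_deep = true},
                      function() return cfg.encode_max_depth end)
print(seen, cfg.encode_max_depth, cfg.encode_crop_too_deep) -- 1  32  false
```

Restoring the previous values rather than hard-coded ones keeps a test
independent of whatever the defaults happen to be, which is the point of
the change above.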

> 
> BTW, the default is 'false' and we should set the default value here
> (now we set 'true'). Following test cases (e.g. gh-2888 test case in
> app-tap/json.test.lua) should set and restore the option locally.
> 
> Maybe it is also worth adding a test case that verifies that the default
> behaviour is to raise an error?
> 

Not sure. Such a test would assume a certain value of
msgpack.cfg.encode_crop_too_deep. This is exactly what I was trying to
avoid in the tests.

