Tarantool development patches archive
 help / color / mirror / Atom feed
* [tarantool-patches] [PATCH 0/4] Serializer bugs
@ 2019-09-04 21:44 Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 1/4] app: serializers update now is reflected in Lua Vladislav Shpilevoy
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-04 21:44 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

The patchset fixes several problems about JSON and msgpack serializers.

1) msgpackffi didn't use msgpack serializer options;
2) tuple serializer either;
3) update of a serializer option was not reflected in its Lua representation;
4) during serialization too nested tables are silently cropped and there was no
  way to prevent it.

The patchset depends on fix of #4366 provided by Kirill.

Also it was discovered, that msgpackffi does not care *all* options. Not only
about max_depth. I am not sure if it is worth fixing here (or at all) though.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4434-box-lua-msgpack-cfg
Issue: https://github.com/tarantool/tarantool/issues/4434

Vladislav Shpilevoy (4):
  app: serializers update now is reflected in Lua
  msgpack: make msgpackffi use encode_max_depth option
  tuple: use global msgpack serializer in Lua tuple
  app: allow to raise an error on too nested tables

 src/box/lua/tuple.c                  | 32 +++++++++++++------
 src/lua/msgpack.c                    |  4 +++
 src/lua/msgpackffi.lua               |  6 ++--
 src/lua/utils.c                      | 37 ++++++++++++----------
 src/lua/utils.h                      | 15 +++++++++
 test/app-tap/json.test.lua           |  3 +-
 test/app-tap/lua/serializer_test.lua | 31 +++++++++++++++++++
 test/app-tap/msgpack.test.lua        |  3 +-
 test/app-tap/msgpackffi.test.lua     | 30 +++++++++++++++++-
 test/box/tuple.result                | 46 ++++++++++++++++++++++++++++
 test/box/tuple.test.lua              | 21 +++++++++++++
 test/unit/luaT_tuple_new.c           |  2 +-
 third_party/lua-cjson/lua_cjson.c    | 10 ++++--
 13 files changed, 207 insertions(+), 33 deletions(-)

-- 
2.20.1 (Apple Git-117)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] [PATCH 1/4] app: serializers update now is reflected in Lua
  2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
@ 2019-09-04 21:44 ` Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 2/4] msgpack: make msgpackffi use encode_max_depth option Vladislav Shpilevoy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-04 21:44 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

There are some objects called serializers - msgpack, cjson, yaml,
maybe more. They are global objects affecting both Lua and C
modules.

A serializer have settings which can be updated. But before the
patch an update changed only C structure of the serializer. It
made impossible to use settings of the serializers from Lua.

Now any update of any serializer is reflected both in its C and
Lua structures.

Part of #4434
---
 src/lua/utils.c                      | 34 +++++++++++++++-------------
 test/app-tap/json.test.lua           |  3 ++-
 test/app-tap/lua/serializer_test.lua | 13 +++++++++++
 test/app-tap/msgpack.test.lua        |  3 ++-
 4 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 75efe0ed2..a082a2e5b 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -270,10 +270,8 @@ luaL_serializer_parse_option(struct lua_State *L, int i,
 			     struct luaL_serializer *cfg)
 {
 	lua_getfield(L, 2, OPTIONS[i].name);
-	if (lua_isnil(L, -1)) {
-		lua_pop(L, 1);
+	if (lua_isnil(L, -1))
 		return NULL;
-	}
 	/*
 	 * Update struct luaL_serializer using pointer to a
 	 * configuration value (all values must be `int` for that).
@@ -289,7 +287,6 @@ luaL_serializer_parse_option(struct lua_State *L, int i,
 	default:
 		unreachable();
 	}
-	lua_pop(L, 1);
 	return pval;
 }
 
@@ -297,26 +294,31 @@ void
 luaL_serializer_parse_options(struct lua_State *L,
 			      struct luaL_serializer *cfg)
 {
-	for (int i = 0; OPTIONS[i].name != NULL; i++)
+	for (int i = 0; OPTIONS[i].name != NULL; ++i) {
 		luaL_serializer_parse_option(L, i, cfg);
+		lua_pop(L, 1);
+	}
 }
 
 /**
- * @brief serializer.cfg{} Lua binding for serializers.
- * serializer.cfg is a table that contains current configuration values from
- * luaL_serializer structure. serializer.cfg has overriden __call() method
- * to change configuration keys in internal userdata (like box.cfg{}).
- * Please note that direct change in serializer.cfg.key will not affect
- * internal state of userdata.
- * @param L lua stack
- * @return 0
+ * Serializer.cfg.__call implementation. Merge new parameters into
+ * both C structure of the serializer, and into its Lua table
+ * representation: serializer.cfg.
  */
 static int
 luaL_serializer_cfg(struct lua_State *L)
 {
-	luaL_checktype(L, 1, LUA_TTABLE); /* serializer */
-	luaL_checktype(L, 2, LUA_TTABLE); /* serializer.cfg */
-	luaL_serializer_parse_options(L, luaL_checkserializer(L));
+	/* Serializer.cfg */
+	luaL_checktype(L, 1, LUA_TTABLE);
+	/* Updated parameters. */
+	luaL_checktype(L, 2, LUA_TTABLE);
+	struct luaL_serializer *cfg = luaL_checkserializer(L);
+	for (int i = 0; OPTIONS[i].name != NULL; ++i) {
+		if (luaL_serializer_parse_option(L, i, cfg) == NULL)
+			lua_pop(L, 1);
+		else
+			lua_setfield(L, 1, OPTIONS[i].name);
+	}
 	return 0;
 }
 
diff --git a/test/app-tap/json.test.lua b/test/app-tap/json.test.lua
index 10f6f4ab2..0a8966866 100755
--- a/test/app-tap/json.test.lua
+++ b/test/app-tap/json.test.lua
@@ -22,7 +22,7 @@ end
 
 tap.test("json", function(test)
     local serializer = require('json')
-    test:plan(32)
+    test:plan(33)
 
     test:test("unsigned", common.test_unsigned, serializer)
     test:test("signed", common.test_signed, serializer)
@@ -32,6 +32,7 @@ tap.test("json", function(test)
     test:test("nil", common.test_nil, serializer)
     test:test("table", common.test_table, serializer, is_array, is_map)
     test:test("ucdata", common.test_ucdata, serializer)
+    test:test("depth", common.test_depth, serializer)
     test:test("misc", test_misc, serializer)
 
     --
diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua
index e7048da4d..a44247d70 100644
--- a/test/app-tap/lua/serializer_test.lua
+++ b/test/app-tap/lua/serializer_test.lua
@@ -402,6 +402,18 @@ local function test_ucdata(test, s)
     ss = nil
 end
 
+local function test_depth(test, s)
+    test:plan(1)
+    --
+    -- gh-4434: serializer update should be reflected in Lua.
+    --
+    local max_depth = s.cfg.encode_max_depth
+    s.cfg({encode_max_depth = max_depth + 5})
+    test:is(s.cfg.encode_max_depth, max_depth + 5,
+            "cfg({<name> = value}) is reflected in cfg.<name>")
+    s.cfg({encode_max_depth = max_depth})
+end
+
 return {
     test_unsigned = test_unsigned;
     test_signed = test_signed;
@@ -412,4 +424,5 @@ return {
     test_table = test_table;
     test_ucdata = test_ucdata;
     test_decimal = test_decimal;
+    test_depth = test_depth;
 }
diff --git a/test/app-tap/msgpack.test.lua b/test/app-tap/msgpack.test.lua
index bd095e5ae..752f107a8 100755
--- a/test/app-tap/msgpack.test.lua
+++ b/test/app-tap/msgpack.test.lua
@@ -231,7 +231,7 @@ end
 
 tap.test("msgpack", function(test)
     local serializer = require('msgpack')
-    test:plan(11)
+    test:plan(12)
     test:test("unsigned", common.test_unsigned, serializer)
     test:test("signed", common.test_signed, serializer)
     test:test("double", common.test_double, serializer)
@@ -240,6 +240,7 @@ tap.test("msgpack", function(test)
     test:test("nil", common.test_nil, serializer)
     test:test("table", common.test_table, serializer, is_array, is_map)
     test:test("ucdata", common.test_ucdata, serializer)
+    test:test("depth", common.test_depth, serializer)
     test:test("offsets", test_offsets, serializer)
     test:test("misc", test_misc, serializer)
     test:test("decode_array_map", test_decode_array_map_header, serializer)
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] [PATCH 2/4] msgpack: make msgpackffi use encode_max_depth option
  2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 1/4] app: serializers update now is reflected in Lua Vladislav Shpilevoy
@ 2019-09-04 21:44 ` Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple Vladislav Shpilevoy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-04 21:44 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Msgpack Lua module is not a simple set of functions. It is a
global serializer object used by plenty of other Lua and C
modules. Msgpack as a serializer can be configured, and in theory
its configuration updates should affect all other modules. For
example, a user could change encode_max_depth:

    require('msgpack').cfg({encode_max_depth = <new_value>})

And that would make tuple:update() accept tables with <new_value>
depth without a crop.

But in fact msgpack configuration didn't affect some places, such
as this one. And all the others who use msgpackffi.

This patch fixes it, for encode_max_depth option. Other options
are still ignored.

Part of #4434
---
 src/lua/msgpackffi.lua           |  3 +--
 test/app-tap/msgpackffi.test.lua | 26 +++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index f7ee44291..65c57aa6e 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -4,7 +4,6 @@ local ffi = require('ffi')
 local buffer = require('buffer')
 local builtin = ffi.C
 local msgpack = require('msgpack') -- .NULL, .array_mt, .map_mt, .cfg
-local MAXNESTING = 16
 local int8_ptr_t = ffi.typeof('int8_t *')
 local uint8_ptr_t = ffi.typeof('uint8_t *')
 local uint16_ptr_t = ffi.typeof('uint16_t *')
@@ -219,7 +218,7 @@ local function encode_r(buf, obj, level)
     elseif type(obj) == "string" then
         encode_str(buf, obj)
     elseif type(obj) == "table" then
-        if level >= MAXNESTING then -- Limit nested tables
+        if level >= msgpack.cfg.encode_max_depth then
             encode_nil(buf)
             return
         end
diff --git a/test/app-tap/msgpackffi.test.lua b/test/app-tap/msgpackffi.test.lua
index f2a8f254b..e26247625 100755
--- a/test/app-tap/msgpackffi.test.lua
+++ b/test/app-tap/msgpackffi.test.lua
@@ -36,7 +36,7 @@ local function test_offsets(test, s)
 end
 
 local function test_other(test, s)
-    test:plan(19)
+    test:plan(21)
     local buf = string.char(0x93, 0x6e, 0xcb, 0x42, 0x2b, 0xed, 0x30, 0x47,
         0x6f, 0xff, 0xff, 0xac, 0x77, 0x6b, 0x61, 0x71, 0x66, 0x7a, 0x73,
         0x7a, 0x75, 0x71, 0x71, 0x78)
@@ -68,6 +68,30 @@ local function test_other(test, s)
     test:is(#s.encode(-0x8001), 5, "len(encode(-0x8001))")
     test:is(#s.encode(-0x80000000), 5, "len(encode(-0x80000000))")
     test:is(#s.encode(-0x80000001), 9, "len(encode(-0x80000001))")
+
+    --
+    -- gh-4434: msgpackffi does not care about msgpack serializer
+    -- configuration, but it should.
+    --
+    local function check_depth(depth_to_try)
+        local t = nil
+        for i = 1, depth_to_try do t = {t} end
+        t = s.decode_unchecked(s.encode(t))
+        local level = 0
+        while t ~= nil do level = level + 1 t = t[1] end
+        return level
+    end
+    local msgpack = require('msgpack')
+    local max_depth = msgpack.cfg.encode_max_depth
+    local result_depth = check_depth(max_depth + 5)
+    test:is(result_depth, max_depth,
+            "msgpackffi uses msgpack.cfg.encode_max_depth")
+
+    msgpack.cfg({encode_max_depth = max_depth + 5})
+    result_depth = check_depth(max_depth + 5)
+    test:is(result_depth, max_depth + 5, "and uses it dynamically")
+
+    msgpack.cfg({encode_max_depth = max_depth})
 end
 
 tap.test("msgpackffi", function(test)
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple
  2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 1/4] app: serializers update now is reflected in Lua Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 2/4] msgpack: make msgpackffi use encode_max_depth option Vladislav Shpilevoy
@ 2019-09-04 21:44 ` Vladislav Shpilevoy
  2019-09-08 15:02   ` [tarantool-patches] " Vladislav Shpilevoy
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 4/4] app: allow to raise an error on too nested tables Vladislav Shpilevoy
  2019-09-09 18:57 ` [tarantool-patches] Re: [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
  4 siblings, 1 reply; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-04 21:44 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Tuple is a C library exposed to Lua. In Lua to translate Lua
objects into tuples and back luaL_serializer structure is used.

In Tarantool we have several global serializers, one of which is
for msgpack. Tuples store data in msgpack, and in theory should
have used that global msgpack serializer. But in fact the tuple
module had its own private serializer because of tuples encoding
specifics such as never encode sparse arrays as maps.

This patch makes tuple Lua module use global msgpack serializer
always. But how does tuple handle sparse arrays now? In fact,
the tuple module still has its own serializer, but it is updated
each time when the msgpack serializer is changed.

Part of #4434
---
 src/box/lua/tuple.c        | 32 ++++++++++++++++++--------
 src/lua/utils.c            |  2 ++
 src/lua/utils.h            |  9 ++++++++
 test/box/tuple.result      | 46 ++++++++++++++++++++++++++++++++++++++
 test/box/tuple.test.lua    | 21 +++++++++++++++++
 test/unit/luaT_tuple_new.c |  2 +-
 6 files changed, 102 insertions(+), 10 deletions(-)

diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 3902288bf..3ac32601c 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -58,8 +58,27 @@
 
 static const char *tuplelib_name = "box.tuple";
 static const char *tuple_iteratorlib_name = "box.tuple.iterator";
+/*
+ * Special serializer for box.tuple.new() to disable storage
+ * optimization for excessively sparse arrays as a tuple always
+ * must be regular MP_ARRAY.
+ */
 static struct luaL_serializer tuple_serializer;
 
+/**
+ * Take actual version of msgpack serializer slightly modified to
+ * correctly encode tuples.
+ */
+static inline struct luaL_serializer *
+tuple_serializer_actualize(void)
+{
+	if (tuple_serializer.version == luaL_msgpack_default->version)
+		return &tuple_serializer;
+	tuple_serializer = *luaL_msgpack_default;
+	tuple_serializer.encode_sparse_ratio = 0;
+	return &tuple_serializer;
+}
+
 extern char tuple_lua[]; /* Lua source */
 
 uint32_t CTID_STRUCT_TUPLE_REF;
@@ -119,7 +138,8 @@ luaT_tuple_new(struct lua_State *L, int idx, box_tuple_format_t *format)
 		}
 	} else {
 		/* Create the tuple from a Lua table. */
-		luamp_encode_tuple(L, &tuple_serializer, &stream, idx);
+		luamp_encode_tuple(L, tuple_serializer_actualize(), &stream,
+				   idx);
 	}
 	mpstream_flush(&stream);
 	struct tuple *tuple = box_tuple_new(format, buf->buf,
@@ -564,14 +584,8 @@ box_lua_tuple_init(struct lua_State *L)
 
 	luamp_set_encode_extension(luamp_encode_extension_box);
 
-	/*
-	 * Create special serializer for box.tuple.new().
-	 * Disable storage optimization for excessively
-	 * sparse arrays as a tuple always must be regular
-	 * MP_ARRAY.
-	 */
-	luaL_serializer_create(&tuple_serializer);
-	tuple_serializer.encode_sparse_ratio = 0;
+	tuple_serializer.version = -1;
+	tuple_serializer_actualize();
 
 	/* Get CTypeID for `struct tuple' */
 	int rc = luaL_cdef(L, "struct tuple;");
diff --git a/src/lua/utils.c b/src/lua/utils.c
index a082a2e5b..ebbe208b4 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -251,6 +251,7 @@ static struct {
 void
 luaL_serializer_create(struct luaL_serializer *cfg)
 {
+	cfg->version = 0;
 	for (int i = 0; OPTIONS[i].name != NULL; i++) {
 		int *pval = (int *) ((char *) cfg + OPTIONS[i].offset);
 		*pval = OPTIONS[i].defvalue;
@@ -313,6 +314,7 @@ luaL_serializer_cfg(struct lua_State *L)
 	/* Updated parameters. */
 	luaL_checktype(L, 2, LUA_TTABLE);
 	struct luaL_serializer *cfg = luaL_checkserializer(L);
+	++cfg->version;
 	for (int i = 0; OPTIONS[i].name != NULL; ++i) {
 		if (luaL_serializer_parse_option(L, i, cfg) == NULL)
 			lua_pop(L, 1);
diff --git a/src/lua/utils.h b/src/lua/utils.h
index d42cc3992..e802b8c4c 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -242,6 +242,15 @@ struct luaL_serializer {
 
 	/** Enable support for compact represenation (internal, YAML-only). */
 	int has_compact;
+	/**
+	 * Constantly growing number increased each time the
+	 * serializer is changed. It can be used by modules which
+	 * need certain settings constant. Such modules can keep a
+	 * local copy of the serializer, and copy it from the main
+	 * one only when the version is changed. This is what Lua
+	 * tuple implementation did at the moment of writing this.
+	 */
+	int version;
 };
 
 extern int luaL_nil_ref;
diff --git a/test/box/tuple.result b/test/box/tuple.result
index 895462518..83f74d111 100644
--- a/test/box/tuple.result
+++ b/test/box/tuple.result
@@ -1395,3 +1395,49 @@ d:update{{'-', 1, dec.new('1e37')}}
 ---
 - [80000000000000000000000000000000000000]
 ...
+--
+-- gh-4434: tuple should use global msgpack serializer.
+--
+max_depth = msgpack.cfg.encode_max_depth
+---
+...
+t = nil
+---
+...
+for i = 1, max_depth + 5 do t = {t} end
+---
+...
+tuple = box.tuple.new(t):totable()
+---
+...
+level = 0
+---
+...
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+---
+...
+level == max_depth or {level, max_depth}
+---
+- true
+...
+msgpack.cfg({encode_max_depth = max_depth + 5})
+---
+...
+tuple = box.tuple.new(t):totable()
+---
+...
+level = 0
+---
+...
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+---
+...
+-- Level should be bigger now, because the default msgpack
+-- serializer allows deeper tables.
+level == max_depth + 5 or {level, max_depth}
+---
+- true
+...
+msgpack.cfg({encode_max_depth = max_depth})
+---
+...
diff --git a/test/box/tuple.test.lua b/test/box/tuple.test.lua
index 9762fc8b3..c8a0c03c5 100644
--- a/test/box/tuple.test.lua
+++ b/test/box/tuple.test.lua
@@ -473,3 +473,24 @@ d = box.tuple.new(dec.new('9e37'))
 d
 d:update{{'+', 1, dec.new('1e37')}}
 d:update{{'-', 1, dec.new('1e37')}}
+
+--
+-- gh-4434: tuple should use global msgpack serializer.
+--
+max_depth = msgpack.cfg.encode_max_depth
+t = nil
+for i = 1, max_depth + 5 do t = {t} end
+tuple = box.tuple.new(t):totable()
+level = 0
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+level == max_depth or {level, max_depth}
+
+msgpack.cfg({encode_max_depth = max_depth + 5})
+tuple = box.tuple.new(t):totable()
+level = 0
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+-- Level should be bigger now, because the default msgpack
+-- serializer allows deeper tables.
+level == max_depth + 5 or {level, max_depth}
+
+msgpack.cfg({encode_max_depth = max_depth})
diff --git a/test/unit/luaT_tuple_new.c b/test/unit/luaT_tuple_new.c
index 609d64e45..8f25c8e07 100644
--- a/test/unit/luaT_tuple_new.c
+++ b/test/unit/luaT_tuple_new.c
@@ -170,8 +170,8 @@ main()
 	luaL_openlibs(L);
 
 	box_init();
-	box_lua_tuple_init(L);
 	luaopen_msgpack(L);
+	box_lua_tuple_init(L);
 	lua_pop(L, 1);
 
 	return test_basic(L);
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] [PATCH 4/4] app: allow to raise an error on too nested tables
  2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
                   ` (2 preceding siblings ...)
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple Vladislav Shpilevoy
@ 2019-09-04 21:44 ` Vladislav Shpilevoy
  2019-09-09 18:57 ` [tarantool-patches] Re: [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
  4 siblings, 0 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-04 21:44 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Closes #4434
Follow-up #4366

@TarantoolBot document
Title: json/msgpack.cfg.encode_crop_too_deep option

Tarantool has several so called serializers to convert data
between Lua and another format: YAML, JSON, msgpack.

YAML is a crazy serializer without depth restrictions. But for
JSON and msgpack a user could set encode_max_depth option. That
option led to crop of a table when it had too many nested levels.
Sometimes such behaviour is undesirable.

Now a user can choose to raise an error instead of data
corruption:

    msgpack.cfg({encode_crop_too_deep = false})
    t = nil
    for i = 1, 100 do t = {t} end
    msgpack.encode(t) -- Here an exception is thrown.

Option encode_crop_too_deep works for JSON and msgpack modules,
and is true by default, to keep backward compatibility.
---
 src/lua/msgpack.c                    |  4 ++++
 src/lua/msgpackffi.lua               |  3 +++
 src/lua/utils.c                      |  1 +
 src/lua/utils.h                      |  6 ++++++
 test/app-tap/lua/serializer_test.lua | 20 +++++++++++++++++++-
 test/app-tap/msgpackffi.test.lua     |  6 +++++-
 third_party/lua-cjson/lua_cjson.c    | 10 ++++++++--
 7 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/lua/msgpack.c b/src/lua/msgpack.c
index c2be0b3e8..b1354776d 100644
--- a/src/lua/msgpack.c
+++ b/src/lua/msgpack.c
@@ -143,6 +143,8 @@ restart: /* used by MP_EXT */
 	case MP_MAP:
 		/* Map */
 		if (level >= cfg->encode_max_depth) {
+			if (! cfg->encode_crop_too_deep)
+				return luaL_error(L, "Too high nest level");
 			mpstream_encode_nil(stream); /* Limit nested maps */
 			return MP_NIL;
 		}
@@ -164,6 +166,8 @@ restart: /* used by MP_EXT */
 	case MP_ARRAY:
 		/* Array */
 		if (level >= cfg->encode_max_depth) {
+			if (! cfg->encode_crop_too_deep)
+				return luaL_error(L, "Too high nest level");
 			mpstream_encode_nil(stream); /* Limit nested arrays */
 			return MP_NIL;
 		}
diff --git a/src/lua/msgpackffi.lua b/src/lua/msgpackffi.lua
index 65c57aa6e..73d0d6fe2 100644
--- a/src/lua/msgpackffi.lua
+++ b/src/lua/msgpackffi.lua
@@ -219,6 +219,9 @@ local function encode_r(buf, obj, level)
         encode_str(buf, obj)
     elseif type(obj) == "table" then
         if level >= msgpack.cfg.encode_max_depth then
+            if not msgpack.cfg.encode_crop_too_deep then
+                error('Too high nest level')
+            end
             encode_nil(buf)
             return
         end
diff --git a/src/lua/utils.c b/src/lua/utils.c
index ebbe208b4..ddc5f933a 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -237,6 +237,7 @@ static struct {
 	OPTION(LUA_TNUMBER,  encode_sparse_ratio, 2),
 	OPTION(LUA_TNUMBER,  encode_sparse_safe, 10),
 	OPTION(LUA_TNUMBER,  encode_max_depth, 32),
+	OPTION(LUA_TBOOLEAN, encode_crop_too_deep, 1),
 	OPTION(LUA_TBOOLEAN, encode_invalid_numbers, 1),
 	OPTION(LUA_TNUMBER,  encode_number_precision, 14),
 	OPTION(LUA_TBOOLEAN, encode_load_metatables, 1),
diff --git a/src/lua/utils.h b/src/lua/utils.h
index e802b8c4c..00a5d16cf 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -212,6 +212,12 @@ struct luaL_serializer {
 	int encode_sparse_safe;
 	/** Max recursion depth for encoding (MsgPack, CJSON only) */
 	int encode_max_depth;
+	/**
+	 * A flag whether a table with too high nest level should
+	 * be cropped. If not set, than this is considered an
+	 * error.
+	 */
+	int encode_crop_too_deep;
 	/** Enables encoding of NaN and Inf numbers */
 	int encode_invalid_numbers;
 	/** Floating point numbers precision (YAML, CJSON only) */
diff --git a/test/app-tap/lua/serializer_test.lua b/test/app-tap/lua/serializer_test.lua
index a44247d70..4fa2924cf 100644
--- a/test/app-tap/lua/serializer_test.lua
+++ b/test/app-tap/lua/serializer_test.lua
@@ -403,7 +403,7 @@ local function test_ucdata(test, s)
 end
 
 local function test_depth(test, s)
-    test:plan(1)
+    test:plan(3)
     --
     -- gh-4434: serializer update should be reflected in Lua.
     --
@@ -412,6 +412,24 @@ local function test_depth(test, s)
     test:is(s.cfg.encode_max_depth, max_depth + 5,
             "cfg({<name> = value}) is reflected in cfg.<name>")
     s.cfg({encode_max_depth = max_depth})
+
+    --
+    -- gh-4434 (yes, the same issue): let users choose whether
+    -- they want to raise an error on tables with too high nest
+    -- level.
+    --
+    s.cfg({encode_crop_too_deep = false})
+
+    local t = nil
+    for i = 1, max_depth + 1 do t = {t} end
+    local ok, err = pcall(s.encode, t)
+    test:ok(not ok, "too deep encode depth")
+
+    s.cfg({encode_max_depth = max_depth + 1})
+    ok, err = pcall(s.encode, t)
+    test:ok(ok, "no throw in a corner case")
+
+    s.cfg({encode_crop_too_deep = true, encode_max_depth = max_depth})
 end
 
 return {
diff --git a/test/app-tap/msgpackffi.test.lua b/test/app-tap/msgpackffi.test.lua
index e26247625..4417e92bd 100755
--- a/test/app-tap/msgpackffi.test.lua
+++ b/test/app-tap/msgpackffi.test.lua
@@ -36,7 +36,7 @@ local function test_offsets(test, s)
 end
 
 local function test_other(test, s)
-    test:plan(21)
+    test:plan(22)
     local buf = string.char(0x93, 0x6e, 0xcb, 0x42, 0x2b, 0xed, 0x30, 0x47,
         0x6f, 0xff, 0xff, 0xac, 0x77, 0x6b, 0x61, 0x71, 0x66, 0x7a, 0x73,
         0x7a, 0x75, 0x71, 0x71, 0x78)
@@ -91,6 +91,10 @@ local function test_other(test, s)
     result_depth = check_depth(max_depth + 5)
     test:is(result_depth, max_depth + 5, "and uses it dynamically")
 
+    msgpack.cfg({encode_crop_too_deep = false})
+    local ok = pcall(check_depth, max_depth + 6)
+    test:ok(not ok, "exception is thrown when crop is not allowed")
+
     msgpack.cfg({encode_max_depth = max_depth})
 end
 
diff --git a/third_party/lua-cjson/lua_cjson.c b/third_party/lua-cjson/lua_cjson.c
index 6b03e391a..0c325d03c 100644
--- a/third_party/lua-cjson/lua_cjson.c
+++ b/third_party/lua-cjson/lua_cjson.c
@@ -400,14 +400,20 @@ static void json_append_data(lua_State *l, struct luaL_serializer *cfg,
     json_append_nil(cfg, json);
     break;
     case MP_MAP:
-    if (current_depth >= cfg->encode_max_depth)
+    if (current_depth >= cfg->encode_max_depth) {
+        if (! cfg->encode_crop_too_deep)
+            luaL_error(l, "Too high nest level");
         return json_append_nil(cfg, json); /* Limit nested maps */
+    }
     json_append_object(l, cfg, current_depth + 1, json);
     return;
     case MP_ARRAY:
     /* Array */
-    if (current_depth >= cfg->encode_max_depth)
+    if (current_depth >= cfg->encode_max_depth) {
+        if (! cfg->encode_crop_too_deep)
+            luaL_error(l, "Too high nest level");
         return json_append_nil(cfg, json); /* Limit nested arrays */
+    }
     json_append_array(l, cfg, current_depth + 1, json, field.size);
     return;
     case MP_EXT:
-- 
2.20.1 (Apple Git-117)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] Re: [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple Vladislav Shpilevoy
@ 2019-09-08 15:02   ` Vladislav Shpilevoy
  0 siblings, 0 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-08 15:02 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Kostja proposed to use triggers instead of a version
number. Here is a new commit:

diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 3902288bf..b9ea63350 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -58,6 +58,11 @@
 
 static const char *tuplelib_name = "box.tuple";
 static const char *tuple_iteratorlib_name = "box.tuple.iterator";
+/*
+ * Special serializer for box.tuple.new() to disable storage
+ * optimization for excessively sparse arrays as a tuple always
+ * must be regular MP_ARRAY.
+ */
 static struct luaL_serializer tuple_serializer;
 
 extern char tuple_lua[]; /* Lua source */
@@ -548,6 +553,21 @@ static const struct luaL_Reg lbox_tuple_iterator_meta[] = {
 
 /* }}} */
 
+static inline void
+tuple_serializer_fill(void)
+{
+	luaL_serializer_copy_options(&tuple_serializer, luaL_msgpack_default);
+	tuple_serializer.encode_sparse_ratio = 0;
+}
+
+static void
+on_msgpack_serializer_update(struct trigger *trigger, void *event)
+{
+	(void) trigger;
+	(void) event;
+	tuple_serializer_fill();
+}
+
 void
 box_lua_tuple_init(struct lua_State *L)
 {
@@ -564,14 +584,11 @@ box_lua_tuple_init(struct lua_State *L)
 
 	luamp_set_encode_extension(luamp_encode_extension_box);
 
-	/*
-	 * Create special serializer for box.tuple.new().
-	 * Disable storage optimization for excessively
-	 * sparse arrays as a tuple always must be regular
-	 * MP_ARRAY.
-	 */
-	luaL_serializer_create(&tuple_serializer);
-	tuple_serializer.encode_sparse_ratio = 0;
+	tuple_serializer_fill();
+	trigger_create(&tuple_serializer.on_update_up,
+		       on_msgpack_serializer_update, NULL, NULL);
+	trigger_add(&luaL_msgpack_default->on_update_down,
+		    &tuple_serializer.on_update_up);
 
 	/* Get CTypeID for `struct tuple' */
 	int rc = luaL_cdef(L, "struct tuple;");
diff --git a/src/lua/utils.c b/src/lua/utils.c
index a082a2e5b..fa33d5eba 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -251,12 +251,20 @@ static struct {
 void
 luaL_serializer_create(struct luaL_serializer *cfg)
 {
+	rlist_create(&cfg->on_update_down);
 	for (int i = 0; OPTIONS[i].name != NULL; i++) {
 		int *pval = (int *) ((char *) cfg + OPTIONS[i].offset);
 		*pval = OPTIONS[i].defvalue;
 	}
 }
 
+void
+luaL_serializer_copy_options(struct luaL_serializer *dst,
+			     const struct luaL_serializer *src)
+{
+	memcpy(dst, src, offsetof(struct luaL_serializer, end_of_options));
+}
+
 /**
  * Configure one field in @a cfg.
  * @param L Lua stack.
@@ -319,6 +327,7 @@ luaL_serializer_cfg(struct lua_State *L)
 		else
 			lua_setfield(L, 1, OPTIONS[i].name);
 	}
+	trigger_run(&cfg->on_update_down, cfg);
 	return 0;
 }
 
@@ -342,7 +351,7 @@ luaL_newserializer(struct lua_State *L, const char *modname, const luaL_Reg *reg
 			lua_newuserdata(L, sizeof(*serializer));
 	luaL_getmetatable(L, LUAL_SERIALIZER);
 	lua_setmetatable(L, -2);
-	memset(serializer, 0, sizeof(*serializer));
+	luaL_serializer_create(serializer);
 
 	for (; reg->name != NULL; reg++) {
 		/* push luaL_serializer as upvalue */
@@ -362,7 +371,6 @@ luaL_newserializer(struct lua_State *L, const char *modname, const luaL_Reg *reg
 	/* Save configuration values to serializer.cfg */
 	for (int i = 0; OPTIONS[i].name != NULL; i++) {
 		int *pval = (int *) ((char *) serializer + OPTIONS[i].offset);
-		*pval = OPTIONS[i].defvalue;
 		switch (OPTIONS[i].type) {
 		case LUA_TBOOLEAN:
 			lua_pushboolean(L, *pval);
diff --git a/src/lua/utils.h b/src/lua/utils.h
index d42cc3992..4c10a0a8f 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -36,6 +36,7 @@
 #include <math.h> /* modf, isfinite */
 
 #include <msgpuck.h> /* enum mp_type */
+#include "trigger.h"
 
 #if defined(__cplusplus)
 extern "C" {
@@ -242,6 +243,23 @@ struct luaL_serializer {
 
 	/** Enable support for compact represenation (internal, YAML-only). */
 	int has_compact;
+	/**
+	 * Border where copyable fields end. Is used to copy
+	 * serializer options into an existing serializer without
+	 * erasure of its non-option fields like triggers.
+	 */
+	char end_of_options[0];
+	/**
+	 * Trigger object to subscribe on updates of a more
+	 * general serializer. For example, tuple serializer
+	 * subscribes on msgpack.
+	 */
+	struct trigger on_update_up;
+	/**
+	 * List of triggers on update of this serializer. To push
+	 * updates down to dependent serializers.
+	 */
+	struct rlist on_update_down;
 };
 
 extern int luaL_nil_ref;
@@ -254,6 +272,14 @@ extern int luaL_array_metatable_ref;
 struct luaL_serializer *
 luaL_newserializer(struct lua_State *L, const char *modname, const luaL_Reg *reg);
 
+/**
+ * Copy all option fields of @a src into @a dst. Other fields,
+ * such as triggers, are not touched.
+ */
+void
+luaL_serializer_copy_options(struct luaL_serializer *dst,
+			     const struct luaL_serializer *src);
+
 static inline struct luaL_serializer *
 luaL_checkserializer(struct lua_State *L) {
 	return (struct luaL_serializer *)
diff --git a/test/box/tuple.result b/test/box/tuple.result
index 895462518..83f74d111 100644
--- a/test/box/tuple.result
+++ b/test/box/tuple.result
@@ -1395,3 +1395,49 @@ d:update{{'-', 1, dec.new('1e37')}}
 ---
 - [80000000000000000000000000000000000000]
 ...
+--
+-- gh-4434: tuple should use global msgpack serializer.
+--
+max_depth = msgpack.cfg.encode_max_depth
+---
+...
+t = nil
+---
+...
+for i = 1, max_depth + 5 do t = {t} end
+---
+...
+tuple = box.tuple.new(t):totable()
+---
+...
+level = 0
+---
+...
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+---
+...
+level == max_depth or {level, max_depth}
+---
+- true
+...
+msgpack.cfg({encode_max_depth = max_depth + 5})
+---
+...
+tuple = box.tuple.new(t):totable()
+---
+...
+level = 0
+---
+...
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+---
+...
+-- Level should be bigger now, because the default msgpack
+-- serializer allows deeper tables.
+level == max_depth + 5 or {level, max_depth}
+---
+- true
+...
+msgpack.cfg({encode_max_depth = max_depth})
+---
+...
diff --git a/test/box/tuple.test.lua b/test/box/tuple.test.lua
index 9762fc8b3..c8a0c03c5 100644
--- a/test/box/tuple.test.lua
+++ b/test/box/tuple.test.lua
@@ -473,3 +473,24 @@ d = box.tuple.new(dec.new('9e37'))
 d
 d:update{{'+', 1, dec.new('1e37')}}
 d:update{{'-', 1, dec.new('1e37')}}
+
+--
+-- gh-4434: tuple should use global msgpack serializer.
+--
+max_depth = msgpack.cfg.encode_max_depth
+t = nil
+for i = 1, max_depth + 5 do t = {t} end
+tuple = box.tuple.new(t):totable()
+level = 0
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+level == max_depth or {level, max_depth}
+
+msgpack.cfg({encode_max_depth = max_depth + 5})
+tuple = box.tuple.new(t):totable()
+level = 0
+while tuple ~= nil do level = level + 1 tuple = tuple[1] end
+-- Level should be bigger now, because the default msgpack
+-- serializer allows deeper tables.
+level == max_depth + 5 or {level, max_depth}
+
+msgpack.cfg({encode_max_depth = max_depth})
diff --git a/test/unit/luaT_tuple_new.c b/test/unit/luaT_tuple_new.c
index 609d64e45..8f25c8e07 100644
--- a/test/unit/luaT_tuple_new.c
+++ b/test/unit/luaT_tuple_new.c
@@ -170,8 +170,8 @@ main()
 	luaL_openlibs(L);
 
 	box_init();
-	box_lua_tuple_init(L);
 	luaopen_msgpack(L);
+	box_lua_tuple_init(L);
 	lua_pop(L, 1);
 
 	return test_basic(L);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [tarantool-patches] Re: [PATCH 0/4] Serializer bugs
  2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
                   ` (3 preceding siblings ...)
  2019-09-04 21:44 ` [tarantool-patches] [PATCH 4/4] app: allow to raise an error on too nested tables Vladislav Shpilevoy
@ 2019-09-09 18:57 ` Vladislav Shpilevoy
  4 siblings, 0 replies; 7+ messages in thread
From: Vladislav Shpilevoy @ 2019-09-09 18:57 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Sorry, too many changes from Kostja. I send v2 in a separate
thread.

On 04/09/2019 23:44, Vladislav Shpilevoy wrote:
> The patchset fixes several problems about JSON and msgpack serializers.
> 
> 1) msgpackffi didn't use msgpack serializer options;
> 2) tuple serializer either;
> 3) update of a serializer option was not reflected in its Lua representation;
> 4) during serialization too nested tables are silently cropped and there was no
>   way to prevent it.
> 
> The patchset depends on fix of #4366 provided by Kirill.
> 
> Also it was discovered, that msgpackffi does not care *all* options. Not only
> about max_depth. I am not sure if it is worth fixing here (or at all) though.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4434-box-lua-msgpack-cfg
> Issue: https://github.com/tarantool/tarantool/issues/4434
> 
> Vladislav Shpilevoy (4):
>   app: serializers update now is reflected in Lua
>   msgpack: make msgpackffi use encode_max_depth option
>   tuple: use global msgpack serializer in Lua tuple
>   app: allow to raise an error on too nested tables
> 
>  src/box/lua/tuple.c                  | 32 +++++++++++++------
>  src/lua/msgpack.c                    |  4 +++
>  src/lua/msgpackffi.lua               |  6 ++--
>  src/lua/utils.c                      | 37 ++++++++++++----------
>  src/lua/utils.h                      | 15 +++++++++
>  test/app-tap/json.test.lua           |  3 +-
>  test/app-tap/lua/serializer_test.lua | 31 +++++++++++++++++++
>  test/app-tap/msgpack.test.lua        |  3 +-
>  test/app-tap/msgpackffi.test.lua     | 30 +++++++++++++++++-
>  test/box/tuple.result                | 46 ++++++++++++++++++++++++++++
>  test/box/tuple.test.lua              | 21 +++++++++++++
>  test/unit/luaT_tuple_new.c           |  2 +-
>  third_party/lua-cjson/lua_cjson.c    | 10 ++++--
>  13 files changed, 207 insertions(+), 33 deletions(-)
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-09-09 18:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-04 21:44 [tarantool-patches] [PATCH 0/4] Serializer bugs Vladislav Shpilevoy
2019-09-04 21:44 ` [tarantool-patches] [PATCH 1/4] app: serializers update now is reflected in Lua Vladislav Shpilevoy
2019-09-04 21:44 ` [tarantool-patches] [PATCH 2/4] msgpack: make msgpackffi use encode_max_depth option Vladislav Shpilevoy
2019-09-04 21:44 ` [tarantool-patches] [PATCH 3/4] tuple: use global msgpack serializer in Lua tuple Vladislav Shpilevoy
2019-09-08 15:02   ` [tarantool-patches] " Vladislav Shpilevoy
2019-09-04 21:44 ` [tarantool-patches] [PATCH 4/4] app: allow to raise an error on too nested tables Vladislav Shpilevoy
2019-09-09 18:57 ` [tarantool-patches] Re: [PATCH 0/4] Serializer bugs Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox