[Tarantool-patches] [PATCH v5 01/52] sql: enhance vdbe_decode_msgpack_into_mem()

Mergen Imeev imeevma at tarantool.org
Tue Apr 13 15:01:16 MSK 2021


Thank you for the review! I'a sorry to ask you so late, but could you look at
new version of the patch? It was decided that this patch should be pushed to
2.6 and 2.7 along with master. Only this patch of the patch-set.

Diff and new version below. Please note, that I added additional SELECT to
the first test since currently sql-tap test system does not support binary
result. Also, in the last two tests I created new table for each since this way
we could see proper error. If I used space T there then in the last test there
would be the same error as in the test 5 due to UUID being before DECIMAL in
space format. To remove dependence of their position, I added two new spaces.

On Sun, Apr 11, 2021 at 07:42:22PM +0200, Vladislav Shpilevoy wrote:
> Hi! Good job on the patch!
> 
> On 09.04.2021 18:51, Mergen Imeev via Tarantool-patches wrote:
> > Hi! Thank you for the review! My answer and new patch below. I didn't include
> > diffs in answers since due to merge conflicts they are partly useless.
> > 
> > On 30.03.2021 01:57, Vladislav Shpilevoy wrote:
> >> Hi! I appreciate the work you did here!
> >>
> >> Truly huge patch. But it does the important thing which I think
> >> should give a huge help in the task of SQL code rework.
> >>
> >> On 23.03.2021 10:34, Mergen Imeev via Tarantool-patches wrote:
> >>> Currently, vdbe_decode_msgpack_into_mem() creates a MEM that is not
> >>> properly initialized in case msgpack contains MP_EXT, MP_MAP, or
> >>> MP_ARRAY fields. Also, it doesn't set field_type.
> >>
> >> AFAIR, field type wasn't set deliberately. Because we use it only for
> >> strictly typed values obtained from formatted space fields. It wasn't
> >> applied to plain non-formatted values on purpose.
> >>
> >> Why do you change that?
> > 
> > I didn't know about that. I thought that all MEMs that contains data should have
> > field_type set. However, I tried to understand where did this field come from
> > and found that it was added for two purposes: to show difference between NUMBER
> > and INTEGER in MEM before DOUBLE was added and to show column name instead of
> > type determined from MP-type in typeof(). I believe that both these purposes are
> > not needed now and that this field should be removed from struct MEM. I created
> > an issue for this: #5957. However, I was prohibited to remove this field for now
> > by @tsafin, who believes that this field is actually important.
> 
> I tend to agree, that it must be done separately. Although probably could be done
> before this patchset to make it kind of simpler. Anyway, thanks for the
> explanation.


Diff:


diff --git a/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua b/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua
new file mode 100755
index 000000000..60978c2b5
--- /dev/null
+++ b/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua
@@ -0,0 +1,99 @@
+#!/usr/bin/env tarantool
+local test = require("sqltester")
+test:plan(6)
+
+local uuid = require("uuid").fromstr("11111111-1111-1111-1111-111111111111")
+local decimal = require("decimal").new(111.111)
+
+box.schema.create_space('T')
+box.space.T:format({{name = "I", type = "integer"}, {name = "U", type = "uuid"},
+                    {name = "D", type = "decimal"}})
+box.space.T:create_index("primary")
+box.space.T:insert({1, uuid, decimal})
+
+--
+-- Make sure that there is no segmentation fault on select from field that
+-- contains UUID or DECIMAL. Currently SQL does not support UUID and DECIMAL,
+-- so they treated as VARBINARY.
+--
+test:do_execsql_test(
+    "gh-5913-1",
+    [[
+        SELECT i, u, d FROM t;
+        SELECT i from t;
+    ]], {
+        1
+    })
+
+box.schema.create_space('T1')
+box.space.T1:format({{name = "I", type = "integer"},
+                     {name = "U", type = "uuid", is_nullable = true},
+                     {name = "D", type = "decimal", is_nullable = true}})
+box.space.T1:create_index("primary")
+
+--
+-- Since SQL does not support UUID and DECIMAL and they treated as VARBINARY,
+-- they cannot be inserted from SQL.
+--
+test:do_catchsql_test(
+    "gh-5913-2",
+    [[
+        INSERT INTO t1 SELECT i, u, NULL FROM t;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to uuid"
+    })
+
+test:do_catchsql_test(
+    "gh-5913-3",
+    [[
+        INSERT INTO t1 SELECT i, NULL, d FROM t;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to decimal"
+    })
+
+--
+-- Still, if UUID or DECIMAL fields does not selected directly, insert is
+-- working properly.
+--
+test:do_execsql_test(
+    "gh-5913-4",
+    [[
+        INSERT INTO t1 SELECT * FROM t;
+        SELECT count() FROM t1;
+    ]], {
+        1
+    })
+
+box.schema.create_space('TU')
+box.space.TU:format({{name = "I", type = "integer"},
+                     {name = "U", type = "uuid"}})
+box.space.TU:create_index("primary")
+box.space.TU:insert({1, uuid})
+
+box.schema.create_space('TD')
+box.space.TD:format({{name = "I", type = "integer"},
+                     {name = "D", type = "decimal"}})
+box.space.TD:create_index("primary")
+box.space.TD:insert({1, decimal})
+
+--
+-- Update of UUID or VARBINARY also does not lead to segfault, however throws an
+-- error since after changing value cannot be inserted into the field from SQL.
+--
+test:do_catchsql_test(
+    "gh-5913-5",
+    [[
+        UPDATE tu SET u = u;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to uuid"
+    })
+
+test:do_catchsql_test(
+    "gh-5913-6",
+    [[
+        UPDATE td SET d = d;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to decimal"
+    })
+
+test:finish_test()



New patch:


commit d41d8f756f8ab6bfae70ee5787bc4c509ddb844d
Author: Mergen Imeev <imeevma at gmail.com>
Date:   Thu Mar 4 17:17:18 2021 +0300

    sql: enhance vdbe_decode_msgpack_into_mem()
    
    Currently, vdbe_decode_msgpack_into_mem() creates a MEM that is not
    properly initialized in case msgpack contains MP_EXT, MP_MAP, or
    MP_ARRAY fields. This patch makes it so that after execution of this
    function, all created MEMs are properly initialized.
    
    Closes #5011
    Closes #5704
    Closes #5913
    Needed for #5818

diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 3b3b1f01d..9a4f38bb9 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -846,16 +846,6 @@ vdbe_field_ref_fetch_data(struct vdbe_field_ref *field_ref, uint32_t fieldno)
 	return field_begin;
 }
 
-static inline enum field_type
-vdbe_field_ref_fetch_type(struct vdbe_field_ref *field_ref, uint32_t fieldno)
-{
-	const struct tuple_field *tf =
-		vdbe_field_ref_fetch_field(field_ref, fieldno);
-	if (tf == NULL || tf->type == FIELD_TYPE_ANY)
-		return field_type_MAX;
-	return tf->type;
-}
-
 /**
  * Fetch field by fieldno using vdbe_field_ref and store result
  * in dest_mem.
@@ -879,17 +869,6 @@ vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 	if (vdbe_decode_msgpack_into_mem(data, dest_mem, &dummy) != 0)
 		return -1;
 
-	/*
-	 * MsgPack map, array or extension (unsupported in sql).
-	 * Wrap it in a blob verbatim.
-	 */
-	if (dest_mem->flags == 0) {
-		dest_mem->z = (char *) data;
-		dest_mem->n = vdbe_field_ref_fetch_data(field_ref,
-							fieldno + 1) - data;
-		dest_mem->flags = MEM_Blob | MEM_Ephem | MEM_Subtype;
-		dest_mem->subtype = SQL_SUBTYPE_MSGPACK;
-	}
 	/*
 	 * Add 0 termination (at most for strings)
 	 * Not sure why do we check MEM_Ephem
@@ -909,7 +888,6 @@ vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 		dest_mem->flags |= MEM_Term;
 	}
 	UPDATE_MAX_BLOBSIZE(dest_mem);
-	dest_mem->field_type = vdbe_field_ref_fetch_type(field_ref, fieldno);
 	return 0;
 }
 
diff --git a/src/box/sql/vdbeaux.c b/src/box/sql/vdbeaux.c
index 91b64316e..772476377 100644
--- a/src/box/sql/vdbeaux.c
+++ b/src/box/sql/vdbeaux.c
@@ -2793,38 +2793,62 @@ vdbe_decode_msgpack_into_mem(const char *buf, struct Mem *mem, uint32_t *len)
 {
 	const char *start_buf = buf;
 	switch (mp_typeof(*buf)) {
-	case MP_ARRAY:
-	case MP_MAP:
-	case MP_EXT:
-	default: {
-		mem->flags = 0;
+	case MP_ARRAY: {
+		mem->z = (char *)buf;
+		mp_next(&buf);
+		mem->n = buf - mem->z;
+		mem->flags = MEM_Blob | MEM_Ephem | MEM_Subtype;
+		mem->subtype = SQL_SUBTYPE_MSGPACK;
+		mem->field_type = FIELD_TYPE_ARRAY;
+		break;
+	}
+	case MP_MAP: {
+		mem->z = (char *)buf;
+		mp_next(&buf);
+		mem->n = buf - mem->z;
+		mem->flags = MEM_Blob | MEM_Ephem | MEM_Subtype;
+		mem->subtype = SQL_SUBTYPE_MSGPACK;
+		mem->field_type = FIELD_TYPE_MAP;
+		break;
+	}
+	case MP_EXT: {
+		mem->z = (char *)buf;
+		mp_next(&buf);
+		mem->n = buf - mem->z;
+		mem->flags = MEM_Blob | MEM_Ephem;
+		mem->field_type = FIELD_TYPE_VARBINARY;
 		break;
 	}
 	case MP_NIL: {
 		mp_decode_nil(&buf);
 		mem->flags = MEM_Null;
+		mem->field_type = field_type_MAX;
 		break;
 	}
 	case MP_BOOL: {
 		mem->u.b = mp_decode_bool(&buf);
 		mem->flags = MEM_Bool;
+		mem->field_type = FIELD_TYPE_BOOLEAN;
 		break;
 	}
 	case MP_UINT: {
 		uint64_t v = mp_decode_uint(&buf);
 		mem->u.u = v;
 		mem->flags = MEM_UInt;
+		mem->field_type = FIELD_TYPE_INTEGER;
 		break;
 	}
 	case MP_INT: {
 		mem->u.i = mp_decode_int(&buf);
 		mem->flags = MEM_Int;
+		mem->field_type = FIELD_TYPE_INTEGER;
 		break;
 	}
 	case MP_STR: {
 		/* XXX u32->int */
 		mem->n = (int) mp_decode_strl(&buf);
 		mem->flags = MEM_Str | MEM_Ephem;
+		mem->field_type = FIELD_TYPE_STRING;
 install_blob:
 		mem->z = (char *)buf;
 		buf += mem->n;
@@ -2834,18 +2858,33 @@ install_blob:
 		/* XXX u32->int */
 		mem->n = (int) mp_decode_binl(&buf);
 		mem->flags = MEM_Blob | MEM_Ephem;
+		mem->field_type = FIELD_TYPE_VARBINARY;
 		goto install_blob;
 	}
 	case MP_FLOAT: {
 		mem->u.r = mp_decode_float(&buf);
-		mem->flags = sqlIsNaN(mem->u.r) ? MEM_Null : MEM_Real;
+		if (sqlIsNaN(mem->u.r)) {
+			mem->flags = MEM_Null;
+			mem->field_type = FIELD_TYPE_DOUBLE;
+		} else {
+			mem->flags = MEM_Real;
+			mem->field_type = FIELD_TYPE_DOUBLE;
+		}
 		break;
 	}
 	case MP_DOUBLE: {
 		mem->u.r = mp_decode_double(&buf);
-		mem->flags = sqlIsNaN(mem->u.r) ? MEM_Null : MEM_Real;
+		if (sqlIsNaN(mem->u.r)) {
+			mem->flags = MEM_Null;
+			mem->field_type = FIELD_TYPE_DOUBLE;
+		} else {
+			mem->flags = MEM_Real;
+			mem->field_type = FIELD_TYPE_DOUBLE;
+		}
 		break;
 	}
+	default:
+		unreachable();
 	}
 	*len = (uint32_t)(buf - start_buf);
 	return 0;
@@ -2868,15 +2907,8 @@ sqlVdbeRecordUnpackMsgpack(struct key_def *key_def,	/* Information about the rec
 		pMem->z = 0;
 		uint32_t sz = 0;
 		vdbe_decode_msgpack_into_mem(zParse, pMem, &sz);
-		if (sz == 0) {
-			/* MsgPack array, map or ext. Treat as blob. */
-			pMem->z = (char *)zParse;
-			mp_next(&zParse);
-			pMem->n = zParse - pMem->z;
-			pMem->flags = MEM_Blob | MEM_Ephem;
-		} else {
-			zParse += sz;
-		}
+		assert(sz != 0);
+		zParse += sz;
 		pMem++;
 	}
 }
diff --git a/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua b/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua
new file mode 100755
index 000000000..60978c2b5
--- /dev/null
+++ b/test/sql-tap/gh-5913-segfault-on-select-uuid.test.lua
@@ -0,0 +1,99 @@
+#!/usr/bin/env tarantool
+local test = require("sqltester")
+test:plan(6)
+
+local uuid = require("uuid").fromstr("11111111-1111-1111-1111-111111111111")
+local decimal = require("decimal").new(111.111)
+
+box.schema.create_space('T')
+box.space.T:format({{name = "I", type = "integer"}, {name = "U", type = "uuid"},
+                    {name = "D", type = "decimal"}})
+box.space.T:create_index("primary")
+box.space.T:insert({1, uuid, decimal})
+
+--
+-- Make sure that there is no segmentation fault on select from field that
+-- contains UUID or DECIMAL. Currently SQL does not support UUID and DECIMAL,
+-- so they treated as VARBINARY.
+--
+test:do_execsql_test(
+    "gh-5913-1",
+    [[
+        SELECT i, u, d FROM t;
+        SELECT i from t;
+    ]], {
+        1
+    })
+
+box.schema.create_space('T1')
+box.space.T1:format({{name = "I", type = "integer"},
+                     {name = "U", type = "uuid", is_nullable = true},
+                     {name = "D", type = "decimal", is_nullable = true}})
+box.space.T1:create_index("primary")
+
+--
+-- Since SQL does not support UUID and DECIMAL and they treated as VARBINARY,
+-- they cannot be inserted from SQL.
+--
+test:do_catchsql_test(
+    "gh-5913-2",
+    [[
+        INSERT INTO t1 SELECT i, u, NULL FROM t;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to uuid"
+    })
+
+test:do_catchsql_test(
+    "gh-5913-3",
+    [[
+        INSERT INTO t1 SELECT i, NULL, d FROM t;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to decimal"
+    })
+
+--
+-- Still, if UUID or DECIMAL fields does not selected directly, insert is
+-- working properly.
+--
+test:do_execsql_test(
+    "gh-5913-4",
+    [[
+        INSERT INTO t1 SELECT * FROM t;
+        SELECT count() FROM t1;
+    ]], {
+        1
+    })
+
+box.schema.create_space('TU')
+box.space.TU:format({{name = "I", type = "integer"},
+                     {name = "U", type = "uuid"}})
+box.space.TU:create_index("primary")
+box.space.TU:insert({1, uuid})
+
+box.schema.create_space('TD')
+box.space.TD:format({{name = "I", type = "integer"},
+                     {name = "D", type = "decimal"}})
+box.space.TD:create_index("primary")
+box.space.TD:insert({1, decimal})
+
+--
+-- Update of UUID or VARBINARY also does not lead to segfault, however throws an
+-- error since after changing value cannot be inserted into the field from SQL.
+--
+test:do_catchsql_test(
+    "gh-5913-5",
+    [[
+        UPDATE tu SET u = u;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to uuid"
+    })
+
+test:do_catchsql_test(
+    "gh-5913-6",
+    [[
+        UPDATE td SET d = d;
+    ]], {
+        1, "Type mismatch: can not convert varbinary to decimal"
+    })
+
+test:finish_test()


More information about the Tarantool-patches mailing list