[Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Aug 5 02:45:18 MSK 2020


When a tuple had format with multikey indexes in it, any attempt
to get a multikey indexed field by a JSON path from Lua led to a
crash.

That was because of incorrect interpretation of offset slot value
in tuple's field map.

Tuple field map is an array stored before the tuple's MessagePack
data. Each element is a 4 byte offset to an indexed value to be
able to get it for O(1) time without MessagePack decoding of all
the previous fields.

At least it was so before multikeys. Now tuple field map is not
just an array. It is rather a 2-level array, somehow similar to
ext4 FS. Some elements of the root array are positive numbers
pointing at data. Some elements point at a second 'indirect'
array, so called 'extra', size of which is individual for each
tuple. These second arrays are used by multikey indexes to store
offsets to each multikey indexed value in a tuple.

It means, that if there is an offset slot, it can't be just used
as is. It is allowed only if the field is not multikey. Otherwise
it is neccessary to somehow get an index in the second 'indirect'
array.

This is what was happening - a multikey field was found, its
offset slot was valid, but it was pointing at an 'indirect' array,
not at the data. JSON tuple field access tried to use it as a data
offset.

The patch makes JSON field access degrade to fullscan when a field
is multikey, but no multikey array index is provided.

Closes #5224
---
 src/box/tuple.h                               |   8 +
 test/box/gh-5224-multikey-field-access.result | 155 ++++++++++++++++++
 .../gh-5224-multikey-field-access.test.lua    |  66 ++++++++
 3 files changed, 229 insertions(+)
 create mode 100644 test/box/gh-5224-multikey-field-access.result
 create mode 100644 test/box/gh-5224-multikey-field-access.test.lua

diff --git a/src/box/tuple.h b/src/box/tuple.h
index 4752323e4..2b48d6af9 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -628,6 +628,14 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
 			goto parse;
 		if (offset_slot_hint != NULL)
 			*offset_slot_hint = offset_slot;
+		/*
+		 * When the field is multikey, the offset slot points not at the
+		 * data. It points at 'extra' array of offsets for this multikey
+		 * index. That array can only be accessed if index in that array
+		 * is known.
+		 */
+		if (field->is_multikey_part && multikey_idx == MULTIKEY_NONE)
+			goto parse;
 offset_slot_access:
 		/* Indexed field */
 		offset = field_map_get_offset(field_map, offset_slot,
diff --git a/test/box/gh-5224-multikey-field-access.result b/test/box/gh-5224-multikey-field-access.result
new file mode 100644
index 000000000..014e810d0
--- /dev/null
+++ b/test/box/gh-5224-multikey-field-access.result
@@ -0,0 +1,155 @@
+-- test-run result file version 2
+--
+-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey
+-- indexed field.
+--
+format = {}
+ | ---
+ | ...
+format[1] = {name = 'f1', type = 'unsigned'}
+ | ---
+ | ...
+format[2] = {name = 'f2', type = 'array'}
+ | ---
+ | ...
+s = box.schema.create_space('test', {format = format})
+ | ---
+ | ...
+_ = s:create_index('pk')
+ | ---
+ | ...
+_ = s:create_index('sk', {                                                      \
+    parts = {                                                                   \
+        {field = 2, path = "[*].tags", type = "unsigned"}                       \
+    }                                                                           \
+})
+ | ---
+ | ...
+
+t = s:replace{1, {{tags = 2}}}
+ | ---
+ | ...
+t['[2][1].tags']
+ | ---
+ | - 2
+ | ...
+
+t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}}
+ | ---
+ | ...
+t['[2]']
+ | ---
+ | - [{'tags': 2}, {'tags': 3}, {'tags': 4}]
+ | ...
+t['[2][1]']
+ | ---
+ | - {'tags': 2}
+ | ...
+t['[2][1].tags']
+ | ---
+ | - 2
+ | ...
+t['[2][2]']
+ | ---
+ | - {'tags': 3}
+ | ...
+t['[2][2].tags']
+ | ---
+ | - 3
+ | ...
+t['[2][3]']
+ | ---
+ | - {'tags': 4}
+ | ...
+t['[2][3].tags']
+ | ---
+ | - 4
+ | ...
+
+s:truncate()
+ | ---
+ | ...
+s.index.sk:drop()
+ | ---
+ | ...
+_ = s:create_index('sk', {                                                      \
+    parts = {                                                                   \
+        {field = 2, path = "[*].p1.p2", type = "unsigned"}                      \
+    }                                                                           \
+})
+ | ---
+ | ...
+
+t = s:replace{1, {{p1 = {p2 = 2}}}}
+ | ---
+ | ...
+t['[2][1].p1.p2']
+ | ---
+ | - 2
+ | ...
+
+t = s:replace{1, {                                                              \
+    {                                                                           \
+        p1 = {                                                                  \
+            p2 = 2, p3 = 3                                                      \
+        },                                                                      \
+        p4 = 4                                                                  \
+    },                                                                          \
+    {                                                                           \
+        p1 = {p2 = 5}                                                           \
+    },                                                                          \
+    {                                                                           \
+        p1 = {p2 = 6}                                                           \
+    }                                                                           \
+}}
+ | ---
+ | ...
+
+t['[2][1].p1.p2']
+ | ---
+ | - 2
+ | ...
+t['[2][1].p1.p3']
+ | ---
+ | - 3
+ | ...
+t['[2][1].p1']
+ | ---
+ | - {'p2': 2, 'p3': 3}
+ | ...
+t['[2][1]']
+ | ---
+ | - {'p4': 4, 'p1': {'p2': 2, 'p3': 3}}
+ | ...
+t['[2][1].p4']
+ | ---
+ | - 4
+ | ...
+t['[2][2].p1.p2']
+ | ---
+ | - 5
+ | ...
+t['[2][2].p1']
+ | ---
+ | - {'p2': 5}
+ | ...
+t['[2][2]']
+ | ---
+ | - {'p1': {'p2': 5}}
+ | ...
+t['[2][3].p1.p2']
+ | ---
+ | - 6
+ | ...
+t['[2][3].p1']
+ | ---
+ | - {'p2': 6}
+ | ...
+t['[2][3]']
+ | ---
+ | - {'p1': {'p2': 6}}
+ | ...
+
+s:drop()
+ | ---
+ | ...
diff --git a/test/box/gh-5224-multikey-field-access.test.lua b/test/box/gh-5224-multikey-field-access.test.lua
new file mode 100644
index 000000000..19765171e
--- /dev/null
+++ b/test/box/gh-5224-multikey-field-access.test.lua
@@ -0,0 +1,66 @@
+--
+-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey
+-- indexed field.
+--
+format = {}
+format[1] = {name = 'f1', type = 'unsigned'}
+format[2] = {name = 'f2', type = 'array'}
+s = box.schema.create_space('test', {format = format})
+_ = s:create_index('pk')
+_ = s:create_index('sk', {                                                      \
+    parts = {                                                                   \
+        {field = 2, path = "[*].tags", type = "unsigned"}                       \
+    }                                                                           \
+})
+
+t = s:replace{1, {{tags = 2}}}
+t['[2][1].tags']
+
+t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}}
+t['[2]']
+t['[2][1]']
+t['[2][1].tags']
+t['[2][2]']
+t['[2][2].tags']
+t['[2][3]']
+t['[2][3].tags']
+
+s:truncate()
+s.index.sk:drop()
+_ = s:create_index('sk', {                                                      \
+    parts = {                                                                   \
+        {field = 2, path = "[*].p1.p2", type = "unsigned"}                      \
+    }                                                                           \
+})
+
+t = s:replace{1, {{p1 = {p2 = 2}}}}
+t['[2][1].p1.p2']
+
+t = s:replace{1, {                                                              \
+    {                                                                           \
+        p1 = {                                                                  \
+            p2 = 2, p3 = 3                                                      \
+        },                                                                      \
+        p4 = 4                                                                  \
+    },                                                                          \
+    {                                                                           \
+        p1 = {p2 = 5}                                                           \
+    },                                                                          \
+    {                                                                           \
+        p1 = {p2 = 6}                                                           \
+    }                                                                           \
+}}
+
+t['[2][1].p1.p2']
+t['[2][1].p1.p3']
+t['[2][1].p1']
+t['[2][1]']
+t['[2][1].p4']
+t['[2][2].p1.p2']
+t['[2][2].p1']
+t['[2][2]']
+t['[2][3].p1.p2']
+t['[2][3].p1']
+t['[2][3]']
+
+s:drop()
-- 
2.21.1 (Apple Git-122.3)



More information about the Tarantool-patches mailing list