From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: tarantool-patches@dev.tarantool.org, alyapunov@tarantool.org, korablev@tarantool.org Subject: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Date: Wed, 5 Aug 2020 01:45:18 +0200 [thread overview] Message-ID: <f1b60f4011e19aa04c0ff7d49c82e46955a968fc.1596584571.git.v.shpilevoy@tarantool.org> (raw) In-Reply-To: <cover.1596584571.git.v.shpilevoy@tarantool.org> When a tuple had format with multikey indexes in it, any attempt to get a multikey indexed field by a JSON path from Lua led to a crash. That was because of incorrect interpretation of offset slot value in tuple's field map. Tuple field map is an array stored before the tuple's MessagePack data. Each element is a 4 byte offset to an indexed value to be able to get it for O(1) time without MessagePack decoding of all the previous fields. At least it was so before multikeys. Now tuple field map is not just an array. It is rather a 2-level array, somehow similar to ext4 FS. Some elements of the root array are positive numbers pointing at data. Some elements point at a second 'indirect' array, so called 'extra', size of which is individual for each tuple. These second arrays are used by multikey indexes to store offsets to each multikey indexed value in a tuple. It means, that if there is an offset slot, it can't be just used as is. It is allowed only if the field is not multikey. Otherwise it is neccessary to somehow get an index in the second 'indirect' array. This is what was happening - a multikey field was found, its offset slot was valid, but it was pointing at an 'indirect' array, not at the data. JSON tuple field access tried to use it as a data offset. The patch makes JSON field access degrade to fullscan when a field is multikey, but no multikey array index is provided. Closes #5224 --- src/box/tuple.h | 8 + test/box/gh-5224-multikey-field-access.result | 155 ++++++++++++++++++ .../gh-5224-multikey-field-access.test.lua | 66 ++++++++ 3 files changed, 229 insertions(+) create mode 100644 test/box/gh-5224-multikey-field-access.result create mode 100644 test/box/gh-5224-multikey-field-access.test.lua diff --git a/src/box/tuple.h b/src/box/tuple.h index 4752323e4..2b48d6af9 100644 --- a/src/box/tuple.h +++ b/src/box/tuple.h @@ -628,6 +628,14 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple, goto parse; if (offset_slot_hint != NULL) *offset_slot_hint = offset_slot; + /* + * When the field is multikey, the offset slot points not at the + * data. It points at 'extra' array of offsets for this multikey + * index. That array can only be accessed if index in that array + * is known. + */ + if (field->is_multikey_part && multikey_idx == MULTIKEY_NONE) + goto parse; offset_slot_access: /* Indexed field */ offset = field_map_get_offset(field_map, offset_slot, diff --git a/test/box/gh-5224-multikey-field-access.result b/test/box/gh-5224-multikey-field-access.result new file mode 100644 index 000000000..014e810d0 --- /dev/null +++ b/test/box/gh-5224-multikey-field-access.result @@ -0,0 +1,155 @@ +-- test-run result file version 2 +-- +-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey +-- indexed field. +-- +format = {} + | --- + | ... +format[1] = {name = 'f1', type = 'unsigned'} + | --- + | ... +format[2] = {name = 'f2', type = 'array'} + | --- + | ... +s = box.schema.create_space('test', {format = format}) + | --- + | ... +_ = s:create_index('pk') + | --- + | ... +_ = s:create_index('sk', { \ + parts = { \ + {field = 2, path = "[*].tags", type = "unsigned"} \ + } \ +}) + | --- + | ... + +t = s:replace{1, {{tags = 2}}} + | --- + | ... +t['[2][1].tags'] + | --- + | - 2 + | ... + +t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}} + | --- + | ... +t['[2]'] + | --- + | - [{'tags': 2}, {'tags': 3}, {'tags': 4}] + | ... +t['[2][1]'] + | --- + | - {'tags': 2} + | ... +t['[2][1].tags'] + | --- + | - 2 + | ... +t['[2][2]'] + | --- + | - {'tags': 3} + | ... +t['[2][2].tags'] + | --- + | - 3 + | ... +t['[2][3]'] + | --- + | - {'tags': 4} + | ... +t['[2][3].tags'] + | --- + | - 4 + | ... + +s:truncate() + | --- + | ... +s.index.sk:drop() + | --- + | ... +_ = s:create_index('sk', { \ + parts = { \ + {field = 2, path = "[*].p1.p2", type = "unsigned"} \ + } \ +}) + | --- + | ... + +t = s:replace{1, {{p1 = {p2 = 2}}}} + | --- + | ... +t['[2][1].p1.p2'] + | --- + | - 2 + | ... + +t = s:replace{1, { \ + { \ + p1 = { \ + p2 = 2, p3 = 3 \ + }, \ + p4 = 4 \ + }, \ + { \ + p1 = {p2 = 5} \ + }, \ + { \ + p1 = {p2 = 6} \ + } \ +}} + | --- + | ... + +t['[2][1].p1.p2'] + | --- + | - 2 + | ... +t['[2][1].p1.p3'] + | --- + | - 3 + | ... +t['[2][1].p1'] + | --- + | - {'p2': 2, 'p3': 3} + | ... +t['[2][1]'] + | --- + | - {'p4': 4, 'p1': {'p2': 2, 'p3': 3}} + | ... +t['[2][1].p4'] + | --- + | - 4 + | ... +t['[2][2].p1.p2'] + | --- + | - 5 + | ... +t['[2][2].p1'] + | --- + | - {'p2': 5} + | ... +t['[2][2]'] + | --- + | - {'p1': {'p2': 5}} + | ... +t['[2][3].p1.p2'] + | --- + | - 6 + | ... +t['[2][3].p1'] + | --- + | - {'p2': 6} + | ... +t['[2][3]'] + | --- + | - {'p1': {'p2': 6}} + | ... + +s:drop() + | --- + | ... diff --git a/test/box/gh-5224-multikey-field-access.test.lua b/test/box/gh-5224-multikey-field-access.test.lua new file mode 100644 index 000000000..19765171e --- /dev/null +++ b/test/box/gh-5224-multikey-field-access.test.lua @@ -0,0 +1,66 @@ +-- +-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey +-- indexed field. +-- +format = {} +format[1] = {name = 'f1', type = 'unsigned'} +format[2] = {name = 'f2', type = 'array'} +s = box.schema.create_space('test', {format = format}) +_ = s:create_index('pk') +_ = s:create_index('sk', { \ + parts = { \ + {field = 2, path = "[*].tags", type = "unsigned"} \ + } \ +}) + +t = s:replace{1, {{tags = 2}}} +t['[2][1].tags'] + +t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}} +t['[2]'] +t['[2][1]'] +t['[2][1].tags'] +t['[2][2]'] +t['[2][2].tags'] +t['[2][3]'] +t['[2][3].tags'] + +s:truncate() +s.index.sk:drop() +_ = s:create_index('sk', { \ + parts = { \ + {field = 2, path = "[*].p1.p2", type = "unsigned"} \ + } \ +}) + +t = s:replace{1, {{p1 = {p2 = 2}}}} +t['[2][1].p1.p2'] + +t = s:replace{1, { \ + { \ + p1 = { \ + p2 = 2, p3 = 3 \ + }, \ + p4 = 4 \ + }, \ + { \ + p1 = {p2 = 5} \ + }, \ + { \ + p1 = {p2 = 6} \ + } \ +}} + +t['[2][1].p1.p2'] +t['[2][1].p1.p3'] +t['[2][1].p1'] +t['[2][1]'] +t['[2][1].p4'] +t['[2][2].p1.p2'] +t['[2][2].p1'] +t['[2][2]'] +t['[2][3].p1.p2'] +t['[2][3].p1'] +t['[2][3]'] + +s:drop() -- 2.21.1 (Apple Git-122.3)
next prev parent reply other threads:[~2020-08-04 23:45 UTC|newest] Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy 2020-08-04 23:45 ` Vladislav Shpilevoy [this message] 2020-08-06 16:00 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Oleg Babin 2020-08-06 20:04 ` Vladislav Shpilevoy 2020-08-10 16:09 ` Nikita Pettik 2020-08-11 9:44 ` Aleksandr Lyapunov 2020-08-11 21:24 ` Vladislav Shpilevoy 2020-08-12 13:05 ` Aleksandr Lyapunov 2020-08-12 20:34 ` Vladislav Shpilevoy 2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy 2020-08-10 17:52 ` Nikita Pettik 2020-08-11 18:50 ` Aleksandr Lyapunov 2020-08-10 10:10 ` [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Aleksandr Lyapunov 2020-08-10 22:22 ` Vladislav Shpilevoy 2020-08-12 20:34 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=f1b60f4011e19aa04c0ff7d49c82e46955a968fc.1596584571.git.v.shpilevoy@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=alyapunov@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox