* [Tarantool-patches] [PATCH 0/2] JSON field multikey crash
@ 2020-08-04 23:45 Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-04 23:45 UTC (permalink / raw)
To: tarantool-patches, alyapunov, korablev
The patchset fixes 2 crashes related to multikey in JSON path
tuple field access code.
Also during working on this I found
https://github.com/tarantool/tarantool/issues/5226, but couldn't
find a simple solution.
Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5224-tuple-field-by-path-crash
Issue: https://github.com/tarantool/tarantool/issues/5224
@ChangeLog
* Fixed a crash when JSON tuple field access was used to get a multikey indexed field, and when a JSON contained [*] in the beginning;
Vladislav Shpilevoy (2):
tuple: fix multikey field JSON access crash
tuple: fix access by JSON path starting from '[*]'
src/box/tuple.c | 3 +-
src/box/tuple.h | 8 +
test/box/gh-5224-multikey-field-access.result | 164 ++++++++++++++++++
.../gh-5224-multikey-field-access.test.lua | 72 ++++++++
4 files changed, 246 insertions(+), 1 deletion(-)
create mode 100644 test/box/gh-5224-multikey-field-access.result
create mode 100644 test/box/gh-5224-multikey-field-access.test.lua
--
2.21.1 (Apple Git-122.3)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy
@ 2020-08-04 23:45 ` Vladislav Shpilevoy
2020-08-06 16:00 ` Oleg Babin
` (2 more replies)
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy
` (2 subsequent siblings)
3 siblings, 3 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-04 23:45 UTC (permalink / raw)
To: tarantool-patches, alyapunov, korablev
When a tuple had format with multikey indexes in it, any attempt
to get a multikey indexed field by a JSON path from Lua led to a
crash.
That was because of incorrect interpretation of offset slot value
in tuple's field map.
Tuple field map is an array stored before the tuple's MessagePack
data. Each element is a 4 byte offset to an indexed value to be
able to get it for O(1) time without MessagePack decoding of all
the previous fields.
At least it was so before multikeys. Now tuple field map is not
just an array. It is rather a 2-level array, somehow similar to
ext4 FS. Some elements of the root array are positive numbers
pointing at data. Some elements point at a second 'indirect'
array, so called 'extra', size of which is individual for each
tuple. These second arrays are used by multikey indexes to store
offsets to each multikey indexed value in a tuple.
It means, that if there is an offset slot, it can't be just used
as is. It is allowed only if the field is not multikey. Otherwise
it is neccessary to somehow get an index in the second 'indirect'
array.
This is what was happening - a multikey field was found, its
offset slot was valid, but it was pointing at an 'indirect' array,
not at the data. JSON tuple field access tried to use it as a data
offset.
The patch makes JSON field access degrade to fullscan when a field
is multikey, but no multikey array index is provided.
Closes #5224
---
src/box/tuple.h | 8 +
test/box/gh-5224-multikey-field-access.result | 155 ++++++++++++++++++
.../gh-5224-multikey-field-access.test.lua | 66 ++++++++
3 files changed, 229 insertions(+)
create mode 100644 test/box/gh-5224-multikey-field-access.result
create mode 100644 test/box/gh-5224-multikey-field-access.test.lua
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 4752323e4..2b48d6af9 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -628,6 +628,14 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
goto parse;
if (offset_slot_hint != NULL)
*offset_slot_hint = offset_slot;
+ /*
+ * When the field is multikey, the offset slot points not at the
+ * data. It points at 'extra' array of offsets for this multikey
+ * index. That array can only be accessed if index in that array
+ * is known.
+ */
+ if (field->is_multikey_part && multikey_idx == MULTIKEY_NONE)
+ goto parse;
offset_slot_access:
/* Indexed field */
offset = field_map_get_offset(field_map, offset_slot,
diff --git a/test/box/gh-5224-multikey-field-access.result b/test/box/gh-5224-multikey-field-access.result
new file mode 100644
index 000000000..014e810d0
--- /dev/null
+++ b/test/box/gh-5224-multikey-field-access.result
@@ -0,0 +1,155 @@
+-- test-run result file version 2
+--
+-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey
+-- indexed field.
+--
+format = {}
+ | ---
+ | ...
+format[1] = {name = 'f1', type = 'unsigned'}
+ | ---
+ | ...
+format[2] = {name = 'f2', type = 'array'}
+ | ---
+ | ...
+s = box.schema.create_space('test', {format = format})
+ | ---
+ | ...
+_ = s:create_index('pk')
+ | ---
+ | ...
+_ = s:create_index('sk', { \
+ parts = { \
+ {field = 2, path = "[*].tags", type = "unsigned"} \
+ } \
+})
+ | ---
+ | ...
+
+t = s:replace{1, {{tags = 2}}}
+ | ---
+ | ...
+t['[2][1].tags']
+ | ---
+ | - 2
+ | ...
+
+t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}}
+ | ---
+ | ...
+t['[2]']
+ | ---
+ | - [{'tags': 2}, {'tags': 3}, {'tags': 4}]
+ | ...
+t['[2][1]']
+ | ---
+ | - {'tags': 2}
+ | ...
+t['[2][1].tags']
+ | ---
+ | - 2
+ | ...
+t['[2][2]']
+ | ---
+ | - {'tags': 3}
+ | ...
+t['[2][2].tags']
+ | ---
+ | - 3
+ | ...
+t['[2][3]']
+ | ---
+ | - {'tags': 4}
+ | ...
+t['[2][3].tags']
+ | ---
+ | - 4
+ | ...
+
+s:truncate()
+ | ---
+ | ...
+s.index.sk:drop()
+ | ---
+ | ...
+_ = s:create_index('sk', { \
+ parts = { \
+ {field = 2, path = "[*].p1.p2", type = "unsigned"} \
+ } \
+})
+ | ---
+ | ...
+
+t = s:replace{1, {{p1 = {p2 = 2}}}}
+ | ---
+ | ...
+t['[2][1].p1.p2']
+ | ---
+ | - 2
+ | ...
+
+t = s:replace{1, { \
+ { \
+ p1 = { \
+ p2 = 2, p3 = 3 \
+ }, \
+ p4 = 4 \
+ }, \
+ { \
+ p1 = {p2 = 5} \
+ }, \
+ { \
+ p1 = {p2 = 6} \
+ } \
+}}
+ | ---
+ | ...
+
+t['[2][1].p1.p2']
+ | ---
+ | - 2
+ | ...
+t['[2][1].p1.p3']
+ | ---
+ | - 3
+ | ...
+t['[2][1].p1']
+ | ---
+ | - {'p2': 2, 'p3': 3}
+ | ...
+t['[2][1]']
+ | ---
+ | - {'p4': 4, 'p1': {'p2': 2, 'p3': 3}}
+ | ...
+t['[2][1].p4']
+ | ---
+ | - 4
+ | ...
+t['[2][2].p1.p2']
+ | ---
+ | - 5
+ | ...
+t['[2][2].p1']
+ | ---
+ | - {'p2': 5}
+ | ...
+t['[2][2]']
+ | ---
+ | - {'p1': {'p2': 5}}
+ | ...
+t['[2][3].p1.p2']
+ | ---
+ | - 6
+ | ...
+t['[2][3].p1']
+ | ---
+ | - {'p2': 6}
+ | ...
+t['[2][3]']
+ | ---
+ | - {'p1': {'p2': 6}}
+ | ...
+
+s:drop()
+ | ---
+ | ...
diff --git a/test/box/gh-5224-multikey-field-access.test.lua b/test/box/gh-5224-multikey-field-access.test.lua
new file mode 100644
index 000000000..19765171e
--- /dev/null
+++ b/test/box/gh-5224-multikey-field-access.test.lua
@@ -0,0 +1,66 @@
+--
+-- gh-5224: tuple field access by JSON path crashed when tried to get a multikey
+-- indexed field.
+--
+format = {}
+format[1] = {name = 'f1', type = 'unsigned'}
+format[2] = {name = 'f2', type = 'array'}
+s = box.schema.create_space('test', {format = format})
+_ = s:create_index('pk')
+_ = s:create_index('sk', { \
+ parts = { \
+ {field = 2, path = "[*].tags", type = "unsigned"} \
+ } \
+})
+
+t = s:replace{1, {{tags = 2}}}
+t['[2][1].tags']
+
+t = s:replace{1, {{tags = 2}, {tags = 3}, {tags = 4}}}
+t['[2]']
+t['[2][1]']
+t['[2][1].tags']
+t['[2][2]']
+t['[2][2].tags']
+t['[2][3]']
+t['[2][3].tags']
+
+s:truncate()
+s.index.sk:drop()
+_ = s:create_index('sk', { \
+ parts = { \
+ {field = 2, path = "[*].p1.p2", type = "unsigned"} \
+ } \
+})
+
+t = s:replace{1, {{p1 = {p2 = 2}}}}
+t['[2][1].p1.p2']
+
+t = s:replace{1, { \
+ { \
+ p1 = { \
+ p2 = 2, p3 = 3 \
+ }, \
+ p4 = 4 \
+ }, \
+ { \
+ p1 = {p2 = 5} \
+ }, \
+ { \
+ p1 = {p2 = 6} \
+ } \
+}}
+
+t['[2][1].p1.p2']
+t['[2][1].p1.p3']
+t['[2][1].p1']
+t['[2][1]']
+t['[2][1].p4']
+t['[2][2].p1.p2']
+t['[2][2].p1']
+t['[2][2]']
+t['[2][3].p1.p2']
+t['[2][3].p1']
+t['[2][3]']
+
+s:drop()
--
2.21.1 (Apple Git-122.3)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]'
2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
@ 2020-08-04 23:45 ` Vladislav Shpilevoy
2020-08-10 17:52 ` Nikita Pettik
2020-08-11 18:50 ` Aleksandr Lyapunov
2020-08-10 10:10 ` [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Aleksandr Lyapunov
2020-08-12 20:34 ` Vladislav Shpilevoy
3 siblings, 2 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-04 23:45 UTC (permalink / raw)
To: tarantool-patches, alyapunov, korablev
Tuple JSON field access crashed when '[*]' was used as a first
part of the JSON path. The patch makes it treated like 'field not
found'.
Follow-up #5224
---
src/box/tuple.c | 3 ++-
test/box/gh-5224-multikey-field-access.result | 9 +++++++++
test/box/gh-5224-multikey-field-access.test.lua | 6 ++++++
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 9f0f24c64..e77753a5b 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -531,7 +531,8 @@ tuple_field_raw_by_full_path(struct tuple_format *format, const char *tuple,
break;
}
default:
- assert(token.type == JSON_TOKEN_END);
+ assert(token.type == JSON_TOKEN_END ||
+ token.type == JSON_TOKEN_ANY);
return NULL;
}
return tuple_field_raw_by_path(format, tuple, field_map, fieldno,
diff --git a/test/box/gh-5224-multikey-field-access.result b/test/box/gh-5224-multikey-field-access.result
index 014e810d0..20df44cc3 100644
--- a/test/box/gh-5224-multikey-field-access.result
+++ b/test/box/gh-5224-multikey-field-access.result
@@ -150,6 +150,15 @@ t['[2][3]']
| - {'p1': {'p2': 6}}
| ...
+--
+-- Multikey path part could crash when used as a first part of the path during
+-- accessing a tuple field.
+--
+t['[*]']
+ | ---
+ | - null
+ | ...
+
s:drop()
| ---
| ...
diff --git a/test/box/gh-5224-multikey-field-access.test.lua b/test/box/gh-5224-multikey-field-access.test.lua
index 19765171e..6f6691d3c 100644
--- a/test/box/gh-5224-multikey-field-access.test.lua
+++ b/test/box/gh-5224-multikey-field-access.test.lua
@@ -63,4 +63,10 @@ t['[2][3].p1.p2']
t['[2][3].p1']
t['[2][3]']
+--
+-- Multikey path part could crash when used as a first part of the path during
+-- accessing a tuple field.
+--
+t['[*]']
+
s:drop()
--
2.21.1 (Apple Git-122.3)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
@ 2020-08-06 16:00 ` Oleg Babin
2020-08-06 20:04 ` Vladislav Shpilevoy
2020-08-10 16:09 ` Nikita Pettik
2020-08-11 9:44 ` Aleksandr Lyapunov
2 siblings, 1 reply; 15+ messages in thread
From: Oleg Babin @ 2020-08-06 16:00 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, alyapunov, korablev
Hi! Thanks for your patch. It's not a review but I have a question.
On 05/08/2020 02:45, Vladislav Shpilevoy wrote:
> When a tuple had format with multikey indexes in it, any attempt
> to get a multikey indexed field by a JSON path from Lua led to a
> crash.
>
> That was because of incorrect interpretation of offset slot value
> in tuple's field map.
>
> Tuple field map is an array stored before the tuple's MessagePack
> data. Each element is a 4 byte offset to an indexed value to be
> able to get it for O(1) time without MessagePack decoding of all
> the previous fields.
>
> At least it was so before multikeys. Now tuple field map is not
> just an array. It is rather a 2-level array, somehow similar to
> ext4 FS. Some elements of the root array are positive numbers
> pointing at data. Some elements point at a second 'indirect'
> array, so called 'extra', size of which is individual for each
> tuple. These second arrays are used by multikey indexes to store
> offsets to each multikey indexed value in a tuple.
Do json path updates use offsets? Is such issue relevant for them?
I tried to update poisoned tuple but seems it works fine. But maybe I've
missed something.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-06 16:00 ` Oleg Babin
@ 2020-08-06 20:04 ` Vladislav Shpilevoy
0 siblings, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-06 20:04 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, alyapunov, korablev
On 06.08.2020 18:00, Oleg Babin wrote:
> Hi! Thanks for your patch. It's not a review but I have a question.
>
> On 05/08/2020 02:45, Vladislav Shpilevoy wrote:
>> When a tuple had format with multikey indexes in it, any attempt
>> to get a multikey indexed field by a JSON path from Lua led to a
>> crash.
>>
>> That was because of incorrect interpretation of offset slot value
>> in tuple's field map.
>>
>> Tuple field map is an array stored before the tuple's MessagePack
>> data. Each element is a 4 byte offset to an indexed value to be
>> able to get it for O(1) time without MessagePack decoding of all
>> the previous fields.
>>
>> At least it was so before multikeys. Now tuple field map is not
>> just an array. It is rather a 2-level array, somehow similar to
>> ext4 FS. Some elements of the root array are positive numbers
>> pointing at data. Some elements point at a second 'indirect'
>> array, so called 'extra', size of which is individual for each
>> tuple. These second arrays are used by multikey indexes to store
>> offsets to each multikey indexed value in a tuple.
>
>
> Do json path updates use offsets? Is such issue relevant for them?
>
> I tried to update poisoned tuple but seems it works fine. But maybe I've missed something.
No, JSON updates always decode whole tuple, at least all fields <=
max affected field. So offsets are not used. I was thinking about
adding them, but so far there was no a request for it, nor benches
how would it help exactly.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] JSON field multikey crash
2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy
@ 2020-08-10 10:10 ` Aleksandr Lyapunov
2020-08-10 22:22 ` Vladislav Shpilevoy
2020-08-12 20:34 ` Vladislav Shpilevoy
3 siblings, 1 reply; 15+ messages in thread
From: Aleksandr Lyapunov @ 2020-08-10 10:10 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, korablev
Hi! thanks for the patch.
Btw, why do we call them 'JSON fields'? Is it supposed that it should
work somehow
with Java Script Object Notation serialization format?
On 8/5/20 2:45 AM, Vladislav Shpilevoy wrote:
> The patchset fixes 2 crashes related to multikey in JSON path
> tuple field access code.
>
> Also during working on this I found
> https://github.com/tarantool/tarantool/issues/5226, but couldn't
> find a simple solution.
>
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5224-tuple-field-by-path-crash
> Issue: https://github.com/tarantool/tarantool/issues/5224
>
> @ChangeLog
> * Fixed a crash when JSON tuple field access was used to get a multikey indexed field, and when a JSON contained [*] in the beginning;
>
> Vladislav Shpilevoy (2):
> tuple: fix multikey field JSON access crash
> tuple: fix access by JSON path starting from '[*]'
>
> src/box/tuple.c | 3 +-
> src/box/tuple.h | 8 +
> test/box/gh-5224-multikey-field-access.result | 164 ++++++++++++++++++
> .../gh-5224-multikey-field-access.test.lua | 72 ++++++++
> 4 files changed, 246 insertions(+), 1 deletion(-)
> create mode 100644 test/box/gh-5224-multikey-field-access.result
> create mode 100644 test/box/gh-5224-multikey-field-access.test.lua
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
2020-08-06 16:00 ` Oleg Babin
@ 2020-08-10 16:09 ` Nikita Pettik
2020-08-11 9:44 ` Aleksandr Lyapunov
2 siblings, 0 replies; 15+ messages in thread
From: Nikita Pettik @ 2020-08-10 16:09 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
On 05 Aug 01:45, Vladislav Shpilevoy wrote:
> When a tuple had format with multikey indexes in it, any attempt
> to get a multikey indexed field by a JSON path from Lua led to a
> crash.
>
> That was because of incorrect interpretation of offset slot value
> in tuple's field map.
>
> Tuple field map is an array stored before the tuple's MessagePack
> data. Each element is a 4 byte offset to an indexed value to be
> able to get it for O(1) time without MessagePack decoding of all
> the previous fields.
>
> At least it was so before multikeys. Now tuple field map is not
> just an array. It is rather a 2-level array, somehow similar to
> ext4 FS. Some elements of the root array are positive numbers
> pointing at data. Some elements point at a second 'indirect'
> array, so called 'extra', size of which is individual for each
> tuple. These second arrays are used by multikey indexes to store
> offsets to each multikey indexed value in a tuple.
>
> It means, that if there is an offset slot, it can't be just used
> as is. It is allowed only if the field is not multikey. Otherwise
> it is neccessary to somehow get an index in the second 'indirect'
> array.
>
> This is what was happening - a multikey field was found, its
> offset slot was valid, but it was pointing at an 'indirect' array,
> not at the data. JSON tuple field access tried to use it as a data
> offset.
>
> The patch makes JSON field access degrade to fullscan when a field
> is multikey, but no multikey array index is provided.
>
> Closes #5224
LGTM
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]'
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy
@ 2020-08-10 17:52 ` Nikita Pettik
2020-08-11 18:50 ` Aleksandr Lyapunov
1 sibling, 0 replies; 15+ messages in thread
From: Nikita Pettik @ 2020-08-10 17:52 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
On 05 Aug 01:45, Vladislav Shpilevoy wrote:
> Tuple JSON field access crashed when '[*]' was used as a first
> part of the JSON path. The patch makes it treated like 'field not
> found'.
>
> Follow-up #5224
LGTM
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] JSON field multikey crash
2020-08-10 10:10 ` [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Aleksandr Lyapunov
@ 2020-08-10 22:22 ` Vladislav Shpilevoy
0 siblings, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-10 22:22 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, korablev
Hi!
On 10.08.2020 12:10, Aleksandr Lyapunov wrote:
> Hi! thanks for the patch.
> Btw, why do we call them 'JSON fields'? Is it supposed that it should work somehow
> with Java Script Object Notation serialization format?
Yes. This is what JSON means. In context of this patch it is meant 'JSON field access'.
Access of a tuple field by a JSON path.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
2020-08-06 16:00 ` Oleg Babin
2020-08-10 16:09 ` Nikita Pettik
@ 2020-08-11 9:44 ` Aleksandr Lyapunov
2020-08-11 21:24 ` Vladislav Shpilevoy
2 siblings, 1 reply; 15+ messages in thread
From: Aleksandr Lyapunov @ 2020-08-11 9:44 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, korablev
Hi! thanks again for the patch. see one comment below.
On 8/5/20 2:45 AM, Vladislav Shpilevoy wrote:
> goto parse;
> if (offset_slot_hint != NULL)
> *offset_slot_hint = offset_slot;
> + /*
> + * When the field is multikey, the offset slot points not at the
> + * data. It points at 'extra' array of offsets for this multikey
> + * index. That array can only be accessed if index in that array
> + * is known.
> + */
> + if (field->is_multikey_part && multikey_idx == MULTIKEY_NONE)
> + goto parse;
> offset_slot_access:
> /* Indexed field */
> offset = field_map_get_offset(field_map, offset_slot,
I'm sure that your check must be moved for two lines up. I mean the check
must be done before setting *offset_slot_hint.
As I understood offset_slot_hint will contain a hint for further
tuple_field_raw_by_path
calls with the same path. That is a kind of agreement, we may call
tuple_field_raw_by_path
twice and must get the same results.
But in your code you set *offset_slot_hint before a check that could go
to 'parse' label.
Meanwhile in the second call of tuple_field_raw_by_path it'll check
*offset_slot_hint and
will go to 'offset_slot_access' label. That's wrong.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]'
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy
2020-08-10 17:52 ` Nikita Pettik
@ 2020-08-11 18:50 ` Aleksandr Lyapunov
1 sibling, 0 replies; 15+ messages in thread
From: Aleksandr Lyapunov @ 2020-08-11 18:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, korablev
LGTM
On 8/5/20 2:45 AM, Vladislav Shpilevoy wrote:
> Tuple JSON field access crashed when '[*]' was used as a first
> part of the JSON path. The patch makes it treated like 'field not
> found'.
>
> Follow-up #5224
> ---
> src/box/tuple.c | 3 ++-
> test/box/gh-5224-multikey-field-access.result | 9 +++++++++
> test/box/gh-5224-multikey-field-access.test.lua | 6 ++++++
> 3 files changed, 17 insertions(+), 1 deletion(-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-11 9:44 ` Aleksandr Lyapunov
@ 2020-08-11 21:24 ` Vladislav Shpilevoy
2020-08-12 13:05 ` Aleksandr Lyapunov
0 siblings, 1 reply; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-11 21:24 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, korablev
Hi! Thanks for the review!
> On 8/5/20 2:45 AM, Vladislav Shpilevoy wrote:
>> goto parse;
>> if (offset_slot_hint != NULL)
>> *offset_slot_hint = offset_slot;
>> + /*
>> + * When the field is multikey, the offset slot points not at the
>> + * data. It points at 'extra' array of offsets for this multikey
>> + * index. That array can only be accessed if index in that array
>> + * is known.
>> + */
>> + if (field->is_multikey_part && multikey_idx == MULTIKEY_NONE)
>> + goto parse;
>> offset_slot_access:
>> /* Indexed field */
>> offset = field_map_get_offset(field_map, offset_slot,
> I'm sure that your check must be moved for two lines up. I mean the check
> must be done before setting *offset_slot_hint.
>
> As I understood offset_slot_hint will contain a hint for further tuple_field_raw_by_path
> calls with the same path. That is a kind of agreement, we may call tuple_field_raw_by_path
> twice and must get the same results.
>
> But in your code you set *offset_slot_hint before a check that could go to 'parse' label.
> Meanwhile in the second call of tuple_field_raw_by_path it'll check *offset_slot_hint and
> will go to 'offset_slot_access' label. That's wrong.
You would be right if not the fact that there is always a guarantee, that
if offset_slot_hint != NULL, then either multikey_idx != MULTIKEY_NONE or it is not
a multikey part. It is unreachable.
So it wouldn't be correct to put it 2 lines above, nor it wouldn't be incorrect -
it does not change anything.
But it is possible to put it *instead*. Into 'else' branch. Then it will be -1
condition check.
New patch for this file:
====================
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 4752323e4..09ebeecf3 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -626,8 +626,28 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
offset_slot = field->offset_slot;
if (offset_slot == TUPLE_OFFSET_SLOT_NIL)
goto parse;
- if (offset_slot_hint != NULL)
+ if (offset_slot_hint != NULL) {
*offset_slot_hint = offset_slot;
+ /*
+ * Hint is never requested for a multikey field without
+ * providing a concrete multikey index.
+ */
+ assert(!field->is_multikey_part ||
+ (multikey_idx != MULTIKEY_NONE &&
+ field->is_multikey_part));
+ } else if (field->is_multikey_part &&
+ multikey_idx == MULTIKEY_NONE) {
+ /*
+ * When the field is multikey, the offset slot points
+ * not at the data. It points at 'extra' array of
+ * offsets for this multikey index. That array can only
+ * be accessed if index in that array is known. It is
+ * not known when the field is accessed not in an index.
+ * For example, in an application's Lua code by a JSON
+ * path.
+ */
+ goto parse;
+ }
offset_slot_access:
/* Indexed field */
offset = field_map_get_offset(field_map, offset_slot,
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-11 21:24 ` Vladislav Shpilevoy
@ 2020-08-12 13:05 ` Aleksandr Lyapunov
2020-08-12 20:34 ` Vladislav Shpilevoy
0 siblings, 1 reply; 15+ messages in thread
From: Aleksandr Lyapunov @ 2020-08-12 13:05 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, korablev
Hi! thanks for the fix, as for me it became more understandable.
See one minor comment below and LGTM.
On 8/12/20 12:24 AM, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
> + if (offset_slot_hint != NULL) {
> *offset_slot_hint = offset_slot;
> + /*
> + * Hint is never requested for a multikey field without
> + * providing a concrete multikey index.
> + */
> + assert(!field->is_multikey_part ||
> + (multikey_idx != MULTIKEY_NONE &&
> + field->is_multikey_part));
The last '&& field->is_multikey_part' is excess.
> + } else if (field->is_multikey_part &&
> + multikey_idx == MULTIKEY_NONE) {
> + /*
> + * When the field is multikey, the offset slot points
> + * not at the data. It points at 'extra' array of
> + * offsets for this multikey index. That array can only
> + * be accessed if index in that array is known. It is
> + * not known when the field is accessed not in an index.
> + * For example, in an application's Lua code by a JSON
> + * path.
> + */
> + goto parse;
> + }
> offset_slot_access:
> /* Indexed field */
> offset = field_map_get_offset(field_map, offset_slot,
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash
2020-08-12 13:05 ` Aleksandr Lyapunov
@ 2020-08-12 20:34 ` Vladislav Shpilevoy
0 siblings, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-12 20:34 UTC (permalink / raw)
To: Aleksandr Lyapunov, tarantool-patches, korablev
>> + if (offset_slot_hint != NULL) {
>> *offset_slot_hint = offset_slot;
>> + /*
>> + * Hint is never requested for a multikey field without
>> + * providing a concrete multikey index.
>> + */
>> + assert(!field->is_multikey_part ||
>> + (multikey_idx != MULTIKEY_NONE &&
>> + field->is_multikey_part));
> The last '&& field->is_multikey_part' is excess.
Indeed, a stupid mistake. Somewhy I was sure it was needed yesterday.
Dropped now.
====================
assert(!field->is_multikey_part ||
- (multikey_idx != MULTIKEY_NONE &&
- field->is_multikey_part));
+ multikey_idx != MULTIKEY_NONE);
====================
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] JSON field multikey crash
2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy
` (2 preceding siblings ...)
2020-08-10 10:10 ` [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Aleksandr Lyapunov
@ 2020-08-12 20:34 ` Vladislav Shpilevoy
3 siblings, 0 replies; 15+ messages in thread
From: Vladislav Shpilevoy @ 2020-08-12 20:34 UTC (permalink / raw)
To: tarantool-patches, alyapunov, korablev
Pushed to master, 2.5, 2.4.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-08-12 20:34 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-04 23:45 [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 1/2] tuple: fix multikey field JSON access crash Vladislav Shpilevoy
2020-08-06 16:00 ` Oleg Babin
2020-08-06 20:04 ` Vladislav Shpilevoy
2020-08-10 16:09 ` Nikita Pettik
2020-08-11 9:44 ` Aleksandr Lyapunov
2020-08-11 21:24 ` Vladislav Shpilevoy
2020-08-12 13:05 ` Aleksandr Lyapunov
2020-08-12 20:34 ` Vladislav Shpilevoy
2020-08-04 23:45 ` [Tarantool-patches] [PATCH 2/2] tuple: fix access by JSON path starting from '[*]' Vladislav Shpilevoy
2020-08-10 17:52 ` Nikita Pettik
2020-08-11 18:50 ` Aleksandr Lyapunov
2020-08-10 10:10 ` [Tarantool-patches] [PATCH 0/2] JSON field multikey crash Aleksandr Lyapunov
2020-08-10 22:22 ` Vladislav Shpilevoy
2020-08-12 20:34 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox