From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
To: tarantool-patches@freelists.org,
Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>
Subject: Re: [tarantool-patches] Re: [PATCH v3 3/4] box: introduce JSON indexes
Date: Thu, 6 Sep 2018 15:46:50 +0300
Message-ID: <b15535bd-fbf7-a688-5fca-cde8136a9ae7@tarantool.org>
In-Reply-To: <cc078d7c-4ce3-ec9f-b26f-990edd6db844@tarantool.org>
Hi! Thank you for the review and fixes.
> 1. On your branch vinyl/errinj and vinyl/info are failing.
> I do not know is it because you did not cherry-pick
> Vladimir's patch or because of a bug in your patch, but it
> should not fail.
>
> This is what I get sometimes on engine/null test:
I can't reproduce this problem. I have run this specific test on its own
many times and have never seen this failure.
> 2. Your question:
>
>> I've checked this code reachable with assert. But I don't now how handle it manually..
>> Please, help.
>
> Answer: you should write a test, that extends a path in such
> a way that newly indexed field values change order of index
> sorting. Select from the index should return the new order.
> Before you fixed this thing it returned an old order. Test it,
> please.
s = box.schema.space.create('withdata', {engine = engine})
pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned', path = '[1]'}}})
assert(pk_simplified.path == box.NULL)
idx = s:create_index('idx', {parts = {{2, 'integer', path = '[2].a'}}})
s:insert{31, {a = 1, aa = -1}}
s:insert{22, {a = 2, aa = -2}}
s:insert{13, {a = 3, aa = -3}}
idx:select()
idx:alter({parts = {{2, 'integer', path = '[2].aa'}}})
idx:select()
s:drop()
> 3. Why does path != NULL guarantee that a def has no optional parts?
> On the contrary, it means that a part is optional if it is not flat
> and is nullable.
>
> What is funny, even now in all has_optional_parts usage places you
> check both for is_flat and has_optional_parts.
Yes, you are right. I can't imagine why I implemented it that way.
> 4. You do not need announcement of this variable.
> (fixed by me)
> 5. As I remember, I asked you to use gcov to see
> which code is untested and test it. But looks like
> you did not, but I did and here a result is:
>
> 638: 602: if (lexemes == 1) {
> -: 603: /* JSON index is useless. */
> #####: 604: path = part->path;
> #####: 605: part->path = NULL;
> #####: 606: } else {
Introduced a test:
pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned', path = '[1]'}}})
assert(pk_simplified.path == box.NULL)
> 6. Here and above you should set epoch and slot offset.
new_def->parts[pos].offset_slot_epoch = part->offset_slot_epoch;
new_def->parts[pos].offset_slot = part->offset_slot;
> 7. Comparators matrix is useless unless you use it for all
> tuple_compare_slowpath usage places.
>
> The same for tuple_compare_with_key.
Ok.
> 8. Typo.
Fixed.
> 9. Never tested:
>
> 167: 73: } else {
> -: 74: /* Skip key. */
> #####: 75: mp_next(field);
> -: 76: }
>
>> + }
>> + /* Skip value. */
>> + mp_next(field);
>> + }
>> + return -1;
>> +}
This code isn't mine, but I've modified the test to trigger it:
s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
> 10. The same.
>
> 668: 180: type = FIELD_TYPE_ARRAY;
> 1020: 181: if (field->type != FIELD_TYPE_ANY && field->type != type)
> #####: 182: goto error_type_mistmatch;
> -: 183: /* Create or resize field array if required. */
>
Done.
> 11.
>
> 132: 346: } else if (field->type == FIELD_TYPE_ARRAY) {
> 458: 347: if (type != MP_ARRAY) {
> 193: 348: valid_type_str = mp_type_strs[MP_ARRAY];
> #####: 349: goto error_type_mistmatch;
> #####: 350: }
> -: 351: uint32_t count = mp_decode_array(offset);
Done.
> 12.
>
> 34: 365: }
> -: 366: err = tt_sprintf("array size %d is less than item %d "
> #####: 367: "defined in index", i,
> #####: 368: field->array_size);
> #####: 369: goto error_invalid_document;
> #####: 370: }
Done.
> 13. Auxiliary 'childs' makes no sense if you still need to
> init array_size. Please, come up with a better solution.
> Inlined memset from offsetof is not good as well. Or it
> could be wrapped into functions which test for childs,
> nullify them.
> 14. Out of 80.
No longer relevant.
> 15. Why do you check for path !=/== NULL in the code below if
> you already know it is == NULL? The only point of moving non-flat
> part processing there was to remove those checks.
>
> (fixed by me)
> 16. I see that you init field->is_nullable before checking
> for path != NULL, but it is wrong. If a field is not flat,
> then its first level nullability could be != a leaf
> nullability. Now it is impossible due to lack of non-flat
> format, but it will be. Besides, I think you should
> move all this cycle body into a function like
> tuple_format_add_key_part or something.
Moved to tuple_format_add_key_part.
> 17. For flat fields different nullability is allowed
> and the strictest is chosen. Why do you forbid it for
> non-flat fields (in tuple_format_add_json_path)?
> The same for compatible types. Below they are allowed,
> but for non-flat fields are not.
Done.
> 18. Two leaks of path_hash below where 'return NULL' is.
> (fixed by me)
> 19. The comment now is not linked with the code below.
> (fixed by me)
> 20. This code is different on the branch and in the email. Here it is:
>
>> if (likely(part->offset_slot_epoch == format->epoch &&
>> format->epoch != 0)) {
>> offset_slot = part->offset_slot
> And please explain how format epoch can be 0 ever? A format epoch is
> initialized by 0 but then it is always either reset to an old epoch or
> old epoch + 1. When a space has no indexes, it could be 0, but each
> new index alters the space and updates its epoch. So when a format
> epoch is 0, it has no tuples.
I've changed the epoch bump mechanism (as part of the first commit), and now
this check is really useless.
> 21. Why do not you update epoch and slot in such a case?
> 22. This check is not needed for the best case when epochs are
> equal in format and key part. Please, move it to the only place
> where it is needed above.
Done.
> 23. By the way, why do you store offset_slot_epoch in key_part
> instead of in a parent key_def? As I understand, it is the same
> for all parts of a key_def.
Because the caches are per part: tuple_field_by_part_raw works with parts.
I would rather not resolve every key_part indirectly just to bump its
epoch.
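
For illustration, a minimal sketch of a per-part cache validated by an epoch.
The names struct fmt, struct part, fmt_lookup_slot and SLOT_NIL are
hypothetical and simplified; they are not the structures from the patch, and
the slow path only stands in for the real lookup by path.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_NIL INT32_MAX

/* Hypothetical stand-in for a tuple format. */
struct fmt {
	uint64_t epoch;
	/* Slot that a slow lookup would find for the part's path. */
	int32_t resolved_slot;
	/* Counts slow lookups, for demonstration only. */
	int slow_lookups;
};

/* Hypothetical stand-in for a key part with its own cache. */
struct part {
	uint64_t offset_slot_epoch;
	int32_t offset_slot;
};

/* Slow path: stands in for a lookup by path in the format. */
static int32_t
fmt_lookup_slot(struct fmt *f)
{
	f->slow_lookups++;
	return f->resolved_slot;
}

/* Return the offset slot for the part, refreshing a stale cache. */
static int32_t
part_offset_slot(struct part *p, struct fmt *f)
{
	if (p->offset_slot_epoch == f->epoch && p->offset_slot != SLOT_NIL)
		return p->offset_slot;
	p->offset_slot = fmt_lookup_slot(f);
	p->offset_slot_epoch = f->epoch;
	return p->offset_slot;
}
```

Since the epoch is stored next to the slot it validates, each part can refresh
its own cache lazily on first use after the format changes.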
> 24. Untested.
>
> 186: 1171: struct mh_strnptr_node_t *ht_record =
> 372: 1172: json_path_hash_get(format->path_hash, path, path_len,
> 186: 1173: mh_strn_hash(path, path_len));
> 186: 1174: if (ht_record != NULL) {
> #####: 1175: struct tuple_field *leaf = ht_record->val;
> #####: 1176: assert(leaf != NULL);
> #####: 1177: int32_t offset_slot = leaf->offset_slot;
> #####: 1178: assert(offset_slot != TUPLE_OFFSET_SLOT_NIL);
> #####: 1179: if (field_map[offset_slot] != 0)
> #####: 1180: *field = tuple + field_map[offset_slot];
> -: 1181: else
> #####: 1182: *field = NULL;
> #####: 1183: return 0;
> -: 1184: }
> 186: 1185: }
t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
-- Test field_map in tuple speed-up access by indexed path.
t["[3][\"FIO\"][\"fname\"]"]
> 25. path_hash parameter missed.
> 26. offset parameter missed.
Fixed.
> 27. I still think that your walker should be generic and it is not
> ok that you reimplement it when a task is slightly extended, but it
> is up to Vova.
> 28. Please, remove tuple_field->arg field. It is extra ugly. Find a
> better solution. Why not just make it the builder's function argument?
After a productive discussion with Vova I understood that this won't work.
Vova likes my hack, but it is not thread-safe, and that is critical here.
Now I have to construct a temporary hash table mapping field -> iov that is
passed to the function as an argument.
===================================================
From cb48f2f6975725e24513660b371ec0d7cf519ec5 Mon Sep 17 00:00:00 2001
Message-Id: <cb48f2f6975725e24513660b371ec0d7cf519ec5.1536237903.git.kshcherbatov@tarantool.org>
In-Reply-To: <cover.1536237903.git.kshcherbatov@tarantool.org>
References: <cover.1536237903.git.kshcherbatov@tarantool.org>
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Date: Tue, 31 Jul 2018 18:20:15 +0300
Subject: [PATCH 3/4] box: introduce JSON indexes
As we need to store a user-defined JSON path in key_part
and key_part_def, we have introduced the path and path_len
fields. The JSON path is verified and transformed to canonical
form when the index msgpack is unpacked.
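
A minimal sketch of such a conversion, assuming only [N], ["name"] and .name
tokens. The function path_canonicalize is hypothetical and uses a toy parser,
not json_path_parser; it only illustrates the target form.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite a path such as [2].FIO.fname into [2]["FIO"]["fname"].
 * Returns the number of tokens, or -1 on a malformed path or when
 * the output buffer is too small.
 */
static int
path_canonicalize(const char *src, char *dst, size_t dst_size)
{
	int tokens = 0;
	size_t used = 0;
	if (dst_size > 0)
		dst[0] = '\0';
	while (*src != '\0') {
		const char *name;
		size_t len = 0;
		int n;
		if (src[0] == '[' && isdigit((unsigned char)src[1])) {
			name = src + 1;
			while (isdigit((unsigned char)name[len]))
				len++;
			if (name[len] != ']')
				return -1;
			src = name + len + 1;
			n = snprintf(dst + used, dst_size - used, "[%.*s]",
				     (int)len, name);
		} else {
			if (src[0] == '[' && src[1] == '"') {
				name = src + 2;
				while (name[len] != '\0' && name[len] != '"')
					len++;
				if (len == 0 || name[len] != '"' ||
				    name[len + 1] != ']')
					return -1;
				src = name + len + 2;
			} else if (src[0] == '.') {
				name = src + 1;
				while (isalnum((unsigned char)name[len]) ||
				       name[len] == '_')
					len++;
				if (len == 0)
					return -1;
				src = name + len;
			} else {
				return -1;
			}
			n = snprintf(dst + used, dst_size - used,
				     "[\"%.*s\"]", (int)len, name);
		}
		if (n < 0 || (size_t)n >= dst_size - used)
			return -1;
		used += n;
		tokens++;
	}
	return tokens;
}
```

The worst case is .a turning into ["a"], which is why the patch reserves
2.5 * path_len + 1 bytes. A result of one token means the path is just a
first-level field access, so the patch drops it (the lexemes == 1 branch).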
The path string is stored as a part of the key_def allocation:
+-------+---------+-------+---------+-------+-------+-------+
|key_def|key_part1|  ...  |key_partN| path1 | pathK | pathN |
+-------+---------+-------+---------+-------+-------+-------+
          |                         ^
          |-> path _________________|
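
A minimal sketch of this layout, assuming reduced structures. The names
struct mini_part, struct mini_def and their helpers are hypothetical; only
the idea (one allocation, strings in the tail, pointers rebased on copy)
follows key_def_new_with_parts and key_def_dup from the diff below.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced versions of key_part and key_def. */
struct mini_part {
	uint32_t fieldno;
	char *path;	/* Points into the tail of the allocation. */
	uint32_t path_len;
};

struct mini_def {
	uint32_t part_count;
	struct mini_part parts[];
};

static size_t
mini_def_sizeof(uint32_t part_count, size_t paths_size)
{
	return sizeof(struct mini_def) +
	       sizeof(struct mini_part) * part_count + paths_size;
}

/* Build a def; paths[i] may be NULL for a plain (flat) part. */
static struct mini_def *
mini_def_new(const uint32_t *fieldnos, const char **paths, uint32_t count)
{
	size_t paths_size = 0;
	for (uint32_t i = 0; i < count; i++)
		paths_size += paths[i] != NULL ? strlen(paths[i]) + 1 : 0;
	struct mini_def *def = calloc(1, mini_def_sizeof(count, paths_size));
	if (def == NULL)
		return NULL;
	def->part_count = count;
	/* Strings start right after the last part. */
	char *data = (char *)def + mini_def_sizeof(count, 0);
	for (uint32_t i = 0; i < count; i++) {
		def->parts[i].fieldno = fieldnos[i];
		if (paths[i] == NULL)
			continue;
		def->parts[i].path_len = strlen(paths[i]);
		def->parts[i].path = data;
		memcpy(data, paths[i], def->parts[i].path_len + 1);
		data += def->parts[i].path_len + 1;
	}
	return def;
}

/* Copy the whole block, then rebase every path pointer onto the copy. */
static struct mini_def *
mini_def_dup(const struct mini_def *src)
{
	size_t paths_size = 0;
	for (uint32_t i = 0; i < src->part_count; i++) {
		if (src->parts[i].path != NULL)
			paths_size += src->parts[i].path_len + 1;
	}
	size_t sz = mini_def_sizeof(src->part_count, paths_size);
	struct mini_def *res = malloc(sz);
	if (res == NULL)
		return NULL;
	memcpy(res, src, sz);
	for (uint32_t i = 0; i < src->part_count; i++) {
		if (src->parts[i].path == NULL)
			continue;
		size_t off = src->parts[i].path - (const char *)src;
		res->parts[i].path = (char *)res + off;
	}
	return res;
}
```

Without the rebase step the copy would keep pointing into the source
allocation, which is the reason key_def_dup can no longer be a bare memcpy.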
Because field names specified in the format can be changed,
a key_part path persisted in Tarantool must always start
with a first-level field access via array index (not by name).
To work with JSON-defined indexes we use the format's JSON
path hashtable data_path and a tree of intermediate path
fields attached to the format root fields.
<Hashtable>: format->data_path
[2].FIO.fname -> field "fname" {type=str, off_slot=-1}
[2].FIO.sname -> field "sname" {type=str, off_slot=-2}
<Tree>: format->field[2] {type = map}
                           |
                   FIO {type = map}
                           |
        "fname"            |             "sname"
{type=str,off_slot=-1} ____|____ {type = str,off_slot=-2}
Leaf fields used in an index have an initialized offset_slot.
On new tuple creation we walk the fields tree and use leaf
records to init the tuple field_map.
At the same time we use the data_path hashtable on tuple data
access by index (when the cached offset_slot is invalid).
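
A minimal sketch of the access path, under two assumptions: the field map is
addressed with negative slots relative to a pointer past its last element
(consistent with off_slot=-1 and -2 above), and a zero offset means the field
is absent. struct path_slot and both helpers are hypothetical, and a linear
search stands in for the real hashtable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical path -> offset_slot record, standing in for the hashtable. */
struct path_slot {
	const char *path;
	int32_t offset_slot;
};

/* Linear search instead of a real hash lookup, to keep the sketch short. */
static const struct path_slot *
path_slot_find(const struct path_slot *tab, size_t count, const char *path)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(tab[i].path, path) == 0)
			return &tab[i];
	}
	return NULL;
}

/*
 * field_map points one past the last slot, so slots are addressed
 * with negative indexes. A zero offset means "field is absent".
 */
static const char *
field_by_slot(const char *data, const uint32_t *field_map, int32_t slot)
{
	uint32_t offset = field_map[slot];
	return offset != 0 ? data + offset : NULL;
}
```

Once the slot is known, the lookup is a single array read, so the hashtable
is only consulted when the slot cached in the key part cannot be trusted.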
All paths are stored at the end of the format allocation:
JSON-tree fields, same as format->path_hash, point to them.
+------------+------------+-------+------------+-------+
|tuple_format|tuple_field1|  ...  |tuple_fieldN| pathK |
+------------+------------+-------+------------+-------+
The new routine tuple_format_add_json_path is used to construct
all internal structures for a JSON path on new format creation
and duplication.
Paths are deduplicated in the format allocation.
Part of #1012.
---
src/box/errcode.h | 2 +-
src/box/index_def.c | 10 +-
src/box/key_def.c | 300 +++++++++++++--
src/box/key_def.h | 38 +-
src/box/lua/space.cc | 5 +
src/box/memtx_engine.c | 5 +
src/box/schema.cc | 12 +-
src/box/tuple.c | 12 +-
src/box/tuple_compare.cc | 59 ++-
| 112 ++++--
src/box/tuple_format.c | 878 ++++++++++++++++++++++++++++++++++++-------
src/box/tuple_format.h | 63 +++-
src/box/tuple_hash.cc | 18 +-
src/box/vinyl.c | 5 +
src/box/vy_log.c | 3 +-
src/box/vy_lsm.c | 44 +++
src/box/vy_point_lookup.c | 2 -
src/box/vy_stmt.c | 149 ++++++--
test/box/misc.result | 57 +--
test/engine/tuple.result | 387 +++++++++++++++++++
test/engine/tuple.test.lua | 109 ++++++
test/vinyl/info.result | 2 +-
22 files changed, 1963 insertions(+), 309 deletions(-)
diff --git a/src/box/errcode.h b/src/box/errcode.h
index 4115e6b..464f413 100644
--- a/src/box/errcode.h
+++ b/src/box/errcode.h
@@ -107,7 +107,7 @@ struct errcode_record {
/* 52 */_(ER_FUNCTION_EXISTS, "Function '%s' already exists") \
/* 53 */_(ER_BEFORE_REPLACE_RET, "Invalid return value of space:before_replace trigger: expected tuple or nil, got %s") \
/* 54 */_(ER_FUNCTION_MAX, "A limit on the total number of functions has been reached: %u") \
- /* 55 */_(ER_UNUSED4, "") \
+ /* 55 */_(ER_DATA_STRUCTURE_MISMATCH, "Tuple doesn't match document structure: %s") \
/* 56 */_(ER_USER_MAX, "A limit on the total number of users has been reached: %u") \
/* 57 */_(ER_NO_SUCH_ENGINE, "Space engine '%s' does not exist") \
/* 58 */_(ER_RELOAD_CFG, "Can't set option '%s' dynamically") \
diff --git a/src/box/index_def.c b/src/box/index_def.c
index 9cda63c..f67b952 100644
--- a/src/box/index_def.c
+++ b/src/box/index_def.c
@@ -209,8 +209,14 @@ index_def_is_valid(struct index_def *index_def, const char *space_name)
* Courtesy to a user who could have made
* a typo.
*/
- if (index_def->key_def->parts[i].fieldno ==
- index_def->key_def->parts[j].fieldno) {
+ struct key_part *part_a = &index_def->key_def->parts[i];
+ struct key_part *part_b = &index_def->key_def->parts[j];
+ if ((part_a->fieldno == part_b->fieldno &&
+ part_a->path == NULL && part_b->path == NULL) ||
+ (part_a->path_len != 0 &&
+ part_a->path_len == part_b->path_len &&
+ memcmp(part_a->path, part_b->path,
+ part_a->path_len) == 0)) {
diag_set(ClientError, ER_MODIFY_INDEX,
index_def->name, space_name,
"same key part is indexed twice");
diff --git a/src/box/key_def.c b/src/box/key_def.c
index 2ef78c1..216d858 100644
--- a/src/box/key_def.c
+++ b/src/box/key_def.c
@@ -35,12 +35,16 @@
#include "column_mask.h"
#include "schema_def.h"
#include "coll_id_cache.h"
+#include "fiber.h"
+#include "assoc.h"
+#include "json/path.h"
static const struct key_part_def key_part_def_default = {
0,
field_type_MAX,
COLL_NONE,
false,
+ NULL
};
static int64_t
@@ -53,6 +57,7 @@ part_type_by_name_wrapper(const char *str, uint32_t len)
#define PART_OPT_FIELD "field"
#define PART_OPT_COLLATION "collation"
#define PART_OPT_NULLABILITY "is_nullable"
+#define PART_OPT_PATH "path"
const struct opt_def part_def_reg[] = {
OPT_DEF_ENUM(PART_OPT_TYPE, field_type, struct key_part_def, type,
@@ -61,6 +66,7 @@ const struct opt_def part_def_reg[] = {
OPT_DEF(PART_OPT_COLLATION, OPT_UINT32, struct key_part_def, coll_id),
OPT_DEF(PART_OPT_NULLABILITY, OPT_BOOL, struct key_part_def,
is_nullable),
+ OPT_DEF(PART_OPT_PATH, OPT_STRPTR, struct key_part_def, path),
OPT_END,
};
@@ -96,13 +102,24 @@ const uint32_t key_mp_type[] = {
struct key_def *
key_def_dup(const struct key_def *src)
{
- size_t sz = key_def_sizeof(src->part_count);
+ const struct key_part *parts = src->parts;
+ const struct key_part *parts_end = parts + src->part_count;
+ size_t sz = 0;
+ for (; parts < parts_end; parts++)
+ sz += parts->path != NULL ? parts->path_len + 1 : 0;
+ sz = key_def_sizeof(src->part_count, sz);
struct key_def *res = (struct key_def *)malloc(sz);
if (res == NULL) {
diag_set(OutOfMemory, sz, "malloc", "res");
return NULL;
}
memcpy(res, src, sz);
+ for (uint32_t i = 0; i < src->part_count; i++) {
+ if (src->parts[i].path == NULL)
+ continue;
+ size_t path_offset = src->parts[i].path - (char *)src;
+ res->parts[i].path = (char *)res + path_offset;
+ }
return res;
}
@@ -110,8 +127,17 @@ void
key_def_swap(struct key_def *old_def, struct key_def *new_def)
{
assert(old_def->part_count == new_def->part_count);
- for (uint32_t i = 0; i < new_def->part_count; i++)
- SWAP(old_def->parts[i], new_def->parts[i]);
+ for (uint32_t i = 0; i < new_def->part_count; i++) {
+ if (old_def->parts[i].path == NULL) {
+ SWAP(old_def->parts[i], new_def->parts[i]);
+ } else {
+ size_t path_offset =
+ old_def->parts[i].path - (char *)old_def;
+ SWAP(old_def->parts[i], new_def->parts[i]);
+ old_def->parts[i].path = (char *)old_def + path_offset;
+ new_def->parts[i].path = (char *)new_def + path_offset;
+ }
+ }
SWAP(*old_def, *new_def);
}
@@ -131,9 +157,9 @@ key_def_set_cmp(struct key_def *def)
}
struct key_def *
-key_def_new(uint32_t part_count)
+key_def_new(uint32_t part_count, size_t paths_size)
{
- size_t sz = key_def_sizeof(part_count);
+ size_t sz = key_def_sizeof(part_count, paths_size);
/** Use calloc() to zero comparator function pointers. */
struct key_def *key_def = (struct key_def *) calloc(1, sz);
if (key_def == NULL) {
@@ -148,10 +174,13 @@ key_def_new(uint32_t part_count)
struct key_def *
key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
{
- struct key_def *def = key_def_new(part_count);
+ size_t sz = 0;
+ for (uint32_t i = 0; i < part_count; i++)
+ sz += parts[i].path != NULL ? strlen(parts[i].path) + 1 : 0;
+ struct key_def *def = key_def_new(part_count, sz);
if (def == NULL)
return NULL;
-
+ char *data = (char *)def + key_def_sizeof(part_count, 0);
for (uint32_t i = 0; i < part_count; i++) {
struct key_part_def *part = &parts[i];
struct coll *coll = NULL;
@@ -165,14 +194,22 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
}
coll = coll_id->coll;
}
+ uint32_t path_len = 0;
+ if (part->path != NULL) {
+ path_len = strlen(part->path);
+ def->parts[i].path = data;
+ data += path_len + 1;
+ }
key_def_set_part(def, i, part->fieldno, part->type,
- part->is_nullable, coll, part->coll_id);
+ part->is_nullable, coll, part->coll_id,
+ part->path, path_len);
}
return def;
}
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
+int
+key_def_dump_parts(struct region *pool, const struct key_def *def,
+ struct key_part_def *parts)
{
for (uint32_t i = 0; i < def->part_count; i++) {
const struct key_part *part = &def->parts[i];
@@ -181,13 +218,26 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
part_def->type = part->type;
part_def->is_nullable = part->is_nullable;
part_def->coll_id = part->coll_id;
+ if (part->path != NULL) {
+ part_def->path = region_alloc(pool, part->path_len + 1);
+ if (part_def->path == NULL) {
+ diag_set(OutOfMemory, part->path_len + 1,
+ "region_alloc", "part_def->path");
+ return -1;
+ }
+ memcpy(part_def->path, part->path, part->path_len);
+ part_def->path[part->path_len] = '\0';
+ } else {
+ part_def->path = NULL;
+ }
}
+ return 0;
}
box_key_def_t *
box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
{
- struct key_def *key_def = key_def_new(part_count);
+ struct key_def *key_def = key_def_new(part_count, 0);
if (key_def == NULL)
return key_def;
@@ -195,7 +245,7 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
key_def_set_part(key_def, item, fields[item],
(enum field_type)types[item],
key_part_def_default.is_nullable, NULL,
- COLL_NONE);
+ COLL_NONE, NULL, 0);
}
return key_def;
}
@@ -241,6 +291,13 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
if (part1->is_nullable != part2->is_nullable)
return part1->is_nullable <
part2->is_nullable ? -1 : 1;
+ /* Lexicographic strings order. */
+ if (part1->path_len != part2->path_len)
+ return part1->path_len - part2->path_len;
+ int rc = 0;
+ if ((rc = memcmp(part1->path, part2->path,
+ part1->path_len)) != 0)
+ return rc;
}
return part_count1 < part_count2 ? -1 : part_count1 > part_count2;
}
@@ -248,11 +305,12 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
enum field_type type, bool is_nullable, struct coll *coll,
- uint32_t coll_id)
+ uint32_t coll_id, const char *path, uint32_t path_len)
{
assert(part_no < def->part_count);
assert(type < field_type_MAX);
def->is_nullable |= is_nullable;
+ def->has_json_paths |= path != NULL;
def->parts[part_no].is_nullable = is_nullable;
def->parts[part_no].fieldno = fieldno;
def->parts[part_no].type = type;
@@ -260,6 +318,17 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
def->parts[part_no].coll_id = coll_id;
def->parts[part_no].offset_slot = TUPLE_OFFSET_SLOT_NIL;
def->parts[part_no].offset_slot_epoch = 0;
+ if (path != NULL) {
+ def->parts[part_no].path_len = path_len;
+ assert(def->parts[part_no].path != NULL);
+ memcpy(def->parts[part_no].path, path, path_len);
+ def->parts[part_no].path[path_len] = '\0';
+ def->parts[part_no].path_hash = mh_strn_hash(path, path_len);
+ } else {
+ def->parts[part_no].path_len = 0;
+ def->parts[part_no].path = NULL;
+ def->parts[part_no].path_hash = 0;
+ }
column_mask_set_fieldno(&def->column_mask, fieldno);
/**
* When all parts are set, initialize the tuple
@@ -284,13 +353,13 @@ key_def_update_optionality(struct key_def *def, uint32_t min_field_count)
for (uint32_t i = 0; i < def->part_count; ++i) {
struct key_part *part = &def->parts[i];
def->has_optional_parts |= part->is_nullable &&
- min_field_count < part->fieldno + 1;
+ (min_field_count <
+ part->fieldno + 1);
/*
* One optional part is enough to switch to new
* comparators.
*/
- if (def->has_optional_parts)
- break;
+ if (def->has_optional_parts) break;
}
key_def_set_cmp(def);
}
@@ -304,8 +373,15 @@ key_def_snprint_parts(char *buf, int size, const struct key_part_def *parts,
for (uint32_t i = 0; i < part_count; i++) {
const struct key_part_def *part = &parts[i];
assert(part->type < field_type_MAX);
- SNPRINT(total, snprintf, buf, size, "%d, '%s'",
- (int)part->fieldno, field_type_strs[part->type]);
+ if (part->path != NULL) {
+ SNPRINT(total, snprintf, buf, size, "%d, '%s', '%s'",
+ (int) part->fieldno, part->path,
+ field_type_strs[part->type]);
+ } else {
+ SNPRINT(total, snprintf, buf, size, "%d, '%s'",
+ (int) part->fieldno,
+ field_type_strs[part->type]);
+ }
if (i < part_count - 1)
SNPRINT(total, snprintf, buf, size, ", ");
}
@@ -324,6 +400,8 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
count++;
if (part->is_nullable)
count++;
+ if (part->path != NULL)
+ count++;
size += mp_sizeof_map(count);
size += mp_sizeof_str(strlen(PART_OPT_FIELD));
size += mp_sizeof_uint(part->fieldno);
@@ -338,6 +416,10 @@ key_def_sizeof_parts(const struct key_part_def *parts, uint32_t part_count)
size += mp_sizeof_str(strlen(PART_OPT_NULLABILITY));
size += mp_sizeof_bool(part->is_nullable);
}
+ if (part->path != NULL) {
+ size += mp_sizeof_str(strlen(PART_OPT_PATH));
+ size += mp_sizeof_str(strlen(part->path));
+ }
}
return size;
}
@@ -351,6 +433,8 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
int count = 2;
if (part->coll_id != COLL_NONE)
count++;
+ if (part->path != NULL)
+ count++;
if (part->is_nullable)
count++;
data = mp_encode_map(data, count);
@@ -372,6 +456,12 @@ key_def_encode_parts(char *data, const struct key_part_def *parts,
strlen(PART_OPT_NULLABILITY));
data = mp_encode_bool(data, part->is_nullable);
}
+ if (part->path != NULL) {
+ data = mp_encode_str(data, PART_OPT_PATH,
+ strlen(PART_OPT_PATH));
+ data = mp_encode_str(data, part->path,
+ strlen(part->path));
+ }
}
return data;
}
@@ -432,8 +522,111 @@ key_def_decode_parts_166(struct key_part_def *parts, uint32_t part_count,
fields[part->fieldno].is_nullable :
key_part_def_default.is_nullable);
part->coll_id = COLL_NONE;
+ part->path = NULL;
+ }
+ return 0;
+}
+
+/**
+ * Verify key_part JSON path and convert to canonical form.
+ *
+ * @param region Region to make allocations.
+ * @param part Part with path to update.
+ * @param path_extra Extra allocated space to reuse if possible.
+ * @param path_extra_size The @path_extra size.
+ *
+ * @retval -1 on error.
+ * @retval 0 on success.
+ */
+static int
+key_def_normalize_json_path(struct region *region, struct key_part_def *part,
+ char **path_extra, uint32_t *path_extra_size)
+{
+ uint32_t allocated_size = *path_extra_size;
+ char *path = *path_extra;
+
+ uint32_t path_len = strlen(part->path);
+ struct json_path_parser parser;
+ struct json_path_node node;
+ json_path_parser_create(&parser, part->path, path_len);
+ /*
+ * A worst-case scenario is .a -> ["a"]
+ * i.e. 2.5 * path_len + 1 is enough.
+ */
+ uint32_t new_path_size = 2.5 * path_len + 1;
+ if (new_path_size >= allocated_size) {
+ path = region_alloc(region, new_path_size);
+ if (path == NULL) {
+ diag_set(OutOfMemory, new_path_size,
+ "region_alloc", "path");
+ return -1;
+ }
+ allocated_size = new_path_size;
+ }
+ assert(path != NULL);
+ part->path = path;
+ int rc = json_path_next(&parser, &node);
+ if (rc != 0)
+ goto error_invalid_json;
+ if (node.type != JSON_PATH_NUM) {
+ diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+ part->fieldno,
+ "invalid JSON path: first part should "
+ "be defined as array index");
+ return -1;
+ }
+ if (node.num - TUPLE_INDEX_BASE != part->fieldno) {
+ diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+ part->fieldno,
+ "invalid JSON path: first part refers "
+ "to invalid field");
+ return -1;
+ }
+ uint32_t lexemes = 0;
+ do {
+ if (node.type == JSON_PATH_NUM) {
+ path += sprintf(path, "[%llu]",
+ (unsigned long long) node.num);
+ } else if (node.type == JSON_PATH_STR) {
+ path += sprintf(path, "[\"%.*s\"]", node.len, node.str);
+ } else {
+ unreachable();
+ }
+ lexemes++;
+ } while ((rc = json_path_next(&parser, &node)) == 0 &&
+ node.type != JSON_PATH_END);
+ if (rc != 0 || node.type != JSON_PATH_END)
+ goto error_invalid_json;
+ if (lexemes == 1) {
+ /* JSON index is useless. */
+ path = part->path;
+ part->path = NULL;
+ } else {
+ /* Skip terminating zero. */
+ path++;
+ /* Account constructed string size. */
+ allocated_size -= path - part->path;
+ }
+ /* Going to try to reuse extra allocation next time. */
+ if (allocated_size > (uint32_t)parser.src_len) {
+ /* Use rest of new buffer next time. */
+ *path_extra = path;
+ *path_extra_size = allocated_size;
+ } else {
+ /* Reuse old path buffer. */
+ *path_extra = (char *)parser.src;
+ *path_extra_size = parser.src_len;
}
return 0;
+
+error_invalid_json: ;
+ const char *err_msg =
+ tt_sprintf("invalid JSON path '%.*s': path has invalid "\
+ "structure (error at position %d)", parser.src_len,
+ parser.src, rc);
+ diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+ part->fieldno + TUPLE_INDEX_BASE, err_msg);
+ return -1;
}
int
@@ -445,8 +638,11 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
return key_def_decode_parts_166(parts, part_count, data,
fields, field_count);
}
- for (uint32_t i = 0; i < part_count; i++) {
- struct key_part_def *part = &parts[i];
+ char *path = NULL;
+ uint32_t allocated_size = 0;
+ struct key_part_def *part = parts;
+ struct region *region = &fiber()->gc;
+ for (uint32_t i = 0; i < part_count; i++, part++) {
if (mp_typeof(**data) != MP_MAP) {
diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
i + TUPLE_INDEX_BASE,
@@ -456,7 +652,7 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
*part = key_part_def_default;
if (opts_decode(part, part_def_reg, data,
ER_WRONG_INDEX_OPTIONS, i + TUPLE_INDEX_BASE,
- NULL) != 0)
+ region) != 0)
return -1;
if (part->type == field_type_MAX) {
diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
@@ -473,6 +669,10 @@ key_def_decode_parts(struct key_part_def *parts, uint32_t part_count,
"string and scalar parts");
return -1;
}
+ if (part->path != NULL &&
+ key_def_normalize_json_path(region, part, &path,
+ &allocated_size) != 0)
+ return -1;
}
return 0;
}
@@ -497,20 +697,25 @@ key_def_decode_parts_160(struct key_part_def *parts, uint32_t part_count,
fields[part->fieldno].is_nullable :
key_part_def_default.is_nullable);
part->coll_id = COLL_NONE;
+ part->path = NULL;
}
return 0;
}
-const struct key_part *
-key_def_find(const struct key_def *key_def, uint32_t fieldno)
+bool
+key_def_contains_part(const struct key_def *key_def,
+ const struct key_part *to_find)
{
const struct key_part *part = key_def->parts;
const struct key_part *end = part + key_def->part_count;
for (; part != end; part++) {
- if (part->fieldno == fieldno)
- return part;
+ if (part->fieldno == to_find->fieldno &&
+ part->path_len == to_find->path_len &&
+ (part->path == NULL || memcmp(part->path, to_find->path,
+ to_find->path_len) == 0))
+ return true;
}
- return NULL;
+ return false;
}
bool
@@ -519,7 +724,7 @@ key_def_contains(const struct key_def *first, const struct key_def *second)
const struct key_part *part = second->parts;
const struct key_part *end = part + second->part_count;
for (; part != end; part++) {
- if (key_def_find(first, part->fieldno) == NULL)
+ if (! key_def_contains_part(first, part))
return false;
}
return true;
@@ -533,18 +738,25 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
* Find and remove part duplicates, i.e. parts counted
* twice since they are present in both key defs.
*/
- const struct key_part *part = second->parts;
- const struct key_part *end = part + second->part_count;
+ size_t sz = 0;
+ const struct key_part *part = first->parts;
+ const struct key_part *end = part + first->part_count;
for (; part != end; part++) {
- if (key_def_find(first, part->fieldno))
+ if (part->path != NULL)
+ sz += part->path_len + 1;
+ }
+ part = second->parts;
+ end = part + second->part_count;
+ for (; part != end; part++) {
+ if (key_def_contains_part(first, part))
--new_part_count;
+ else if (part->path != NULL)
+ sz += part->path_len + 1;
}
-
- struct key_def *new_def;
- new_def = (struct key_def *)calloc(1, key_def_sizeof(new_part_count));
+ sz = key_def_sizeof(new_part_count, sz);
+ struct key_def *new_def = (struct key_def *)calloc(1, sz);
if (new_def == NULL) {
- diag_set(OutOfMemory, key_def_sizeof(new_part_count), "malloc",
- "new_def");
+ diag_set(OutOfMemory, sz, "calloc", "new_def");
return NULL;
}
new_def->part_count = new_part_count;
@@ -552,14 +764,21 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
new_def->is_nullable = first->is_nullable || second->is_nullable;
new_def->has_optional_parts = first->has_optional_parts ||
second->has_optional_parts;
+ /* Path data write position in the new key_def. */
+ char *data = (char *)new_def + key_def_sizeof(new_part_count, 0);
/* Write position in the new key def. */
uint32_t pos = 0;
/* Append first key def's parts to the new index_def. */
part = first->parts;
end = part + first->part_count;
for (; part != end; part++) {
+ if (part->path != NULL) {
+ new_def->parts[pos].path = data;
+ data += part->path_len + 1;
+ }
key_def_set_part(new_def, pos, part->fieldno, part->type,
- part->is_nullable, part->coll, part->coll_id);
+ part->is_nullable, part->coll, part->coll_id,
+ part->path, part->path_len);
new_def->parts[pos].offset_slot_epoch = part->offset_slot_epoch;
new_def->parts[pos].offset_slot = part->offset_slot;
pos++;
@@ -569,10 +788,15 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
part = second->parts;
end = part + second->part_count;
for (; part != end; part++) {
- if (key_def_find(first, part->fieldno))
+ if (key_def_contains_part(first, part))
continue;
+ if (part->path != NULL) {
+ new_def->parts[pos].path = data;
+ data += part->path_len + 1;
+ }
key_def_set_part(new_def, pos, part->fieldno, part->type,
- part->is_nullable, part->coll, part->coll_id);
+ part->is_nullable, part->coll, part->coll_id,
+ part->path, part->path_len);
new_def->parts[pos].offset_slot_epoch = part->offset_slot_epoch;
new_def->parts[pos].offset_slot = part->offset_slot;
pos++;
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 07997b8..30c0b84 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -54,6 +54,8 @@ struct key_part_def {
uint32_t coll_id;
/** True if a key part can store NULLs. */
bool is_nullable;
+ /** JSON path to data. */
+ char *path;
};
/**
@@ -85,6 +87,12 @@ struct key_part {
uint64_t offset_slot_epoch;
/** Cache with format's field offset slot. */
int32_t offset_slot;
+ /** JSON path to data in canonical form. */
+ char *path;
+ /** JSON path length. */
+ uint32_t path_len;
+ /** JSON path hash. */
+ uint32_t path_hash;
};
struct key_def;
@@ -144,6 +152,10 @@ struct key_def {
* fields assumed to be MP_NIL.
*/
bool has_optional_parts;
+ /**
+ * True if some key part contains a JSON path.
+ */
+ bool has_json_paths;
/** Key fields mask. @sa column_mask.h for details. */
uint64_t column_mask;
/** The size of the 'parts' array. */
@@ -232,16 +244,17 @@ box_tuple_compare_with_key(const box_tuple_t *tuple_a, const char *key_b,
/** \endcond public */
static inline size_t
-key_def_sizeof(uint32_t part_count)
+key_def_sizeof(uint32_t part_count, size_t paths_size)
{
- return sizeof(struct key_def) + sizeof(struct key_part) * part_count;
+ return sizeof(struct key_def) + sizeof(struct key_part) * part_count +
+ paths_size;
}
/**
* Allocate a new key_def with the given part count.
*/
struct key_def *
-key_def_new(uint32_t part_count);
+key_def_new(uint32_t part_count, size_t paths_size);
/**
* Allocate a new key_def with the given part count
@@ -253,8 +266,9 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count);
/**
* Dump part definitions of the given key def.
*/
-void
-key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
+int
+key_def_dump_parts(struct region *pool, const struct key_def *def,
+ struct key_part_def *parts);
/**
* Set a single key part in a key def.
@@ -263,7 +277,7 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
enum field_type type, bool is_nullable, struct coll *coll,
- uint32_t coll_id);
+ uint32_t coll_id, const char *path, uint32_t path_len);
/**
* Update 'has_optional_parts' of @a key_def with correspondence
@@ -325,12 +339,10 @@ key_def_decode_parts_160(struct key_part_def *parts, uint32_t part_count,
const char **data, const struct field_def *fields,
uint32_t field_count);
-/**
- * Returns the part in index_def->parts for the specified fieldno.
- * If fieldno is not in index_def->parts returns NULL.
- */
-const struct key_part *
-key_def_find(const struct key_def *key_def, uint32_t fieldno);
+/** Check if @a key_def contains @a to_find part. */
+bool
+key_def_contains_part(const struct key_def *key_def,
+ const struct key_part *to_find);
/**
* Check if key definition @a first contains all parts of
@@ -377,6 +389,8 @@ key_validate_parts(const struct key_def *key_def, const char *key,
static inline bool
key_def_is_sequential(const struct key_def *key_def)
{
+ if (key_def->has_json_paths)
+ return false;
for (uint32_t part_id = 0; part_id < key_def->part_count; part_id++) {
if (key_def->parts[part_id].fieldno != part_id)
return false;
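
A note on the new key_def_sizeof() signature above: paths_size appears to be the number of extra bytes reserved after the parts array, so that the path strings live in the same allocation as the key_def itself. A minimal sketch of how a caller might compute it, assuming each path is stored with a terminating NUL. The struct part_def_sketch and the helper paths_size_of() are hypothetical and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct key_part_def. */
struct part_def_sketch {
	uint32_t fieldno;
	/* NULL when the part has no JSON path. */
	const char *path;
	uint32_t path_len;
};

/*
 * Sum the bytes needed to store every non-NULL path,
 * assuming one extra byte per path for the terminating NUL.
 */
static size_t
paths_size_of(const struct part_def_sketch *parts, uint32_t part_count)
{
	size_t size = 0;
	for (uint32_t i = 0; i < part_count; i++) {
		if (parts[i].path != NULL)
			size += (size_t)parts[i].path_len + 1;
	}
	return size;
}
```

With such a helper, a key_def without JSON paths would still be allocated with paths_size == 0, which matches the key_def_new(1, 0) calls in schema.cc.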
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 25b7e36..875e51f 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -295,6 +295,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
lua_pushnumber(L, part->fieldno + TUPLE_INDEX_BASE);
lua_setfield(L, -2, "fieldno");
+ if (part->path != NULL) {
+ lua_pushstring(L, part->path);
+ lua_setfield(L, -2, "path");
+ }
+
lua_pushboolean(L, part->is_nullable);
lua_setfield(L, -2, "is_nullable");
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 4b7d377..62c3c9a 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1310,6 +1310,11 @@ memtx_index_def_change_requires_rebuild(struct index *index,
return true;
if (old_part->coll != new_part->coll)
return true;
+ if (old_part->path_len != new_part->path_len)
+ return true;
+ if (memcmp(old_part->path, new_part->path,
+ old_part->path_len) != 0)
+ return true;
}
return false;
}
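
A remark on the path check in memtx_index_def_change_requires_rebuild(): when neither part has a path, both pointers may be NULL with path_len == 0, and they are still passed to memcmp(). This usually works, but as far as I know C11 (7.24.1) requires valid pointers even for a zero size. A hedged sketch of a helper that sidesteps the question; the name key_part_path_differs() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Return true if two JSON paths differ.
 * A part without a path is represented as (NULL, 0).
 */
static bool
key_part_path_differs(const char *old_path, uint32_t old_len,
		      const char *new_path, uint32_t new_len)
{
	if (old_len != new_len)
		return true;
	/* Equal lengths: nothing to compare when both are empty. */
	if (old_len == 0)
		return false;
	return memcmp(old_path, new_path, old_len) != 0;
}
```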
diff --git a/src/box/schema.cc b/src/box/schema.cc
index e52e19d..cf7cf35 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -286,19 +286,19 @@ schema_init()
* (and re-created) first.
*/
/* _schema - key/value space with schema description */
- struct key_def *key_def = key_def_new(1); /* part count */
+ struct key_def *key_def = key_def_new(1, 0);
if (key_def == NULL)
diag_raise();
auto key_def_guard = make_scoped_guard([&] { key_def_delete(key_def); });
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_STRING, false, NULL, COLL_NONE);
+ FIELD_TYPE_STRING, false, NULL, COLL_NONE, NULL, 0);
sc_space_new(BOX_SCHEMA_ID, "_schema", key_def, &on_replace_schema,
NULL);
/* _space - home for all spaces. */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
/* _collation - collation description. */
sc_space_new(BOX_COLLATION_ID, "_collation", key_def,
@@ -341,15 +341,15 @@ schema_init()
NULL);
key_def_delete(key_def);
- key_def = key_def_new(2); /* part count */
+ key_def = key_def_new(2, 0);
if (key_def == NULL)
diag_raise();
/* space no */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
/* index no */
key_def_set_part(key_def, 1 /* part no */, 1 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE, NULL, 0);
sc_space_new(BOX_INDEX_ID, "_index", key_def,
&alter_space_on_replace_index, &on_stmt_begin_index);
diff --git a/src/box/tuple.c b/src/box/tuple.c
index d7dbad3..83bdad1 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -159,14 +159,22 @@ tuple_validate_raw(struct tuple_format *format, const char *tuple)
/* Check field types */
struct tuple_field *field = &format->fields[0];
+ const char *pos = tuple;
uint32_t i = 0;
uint32_t defined_field_count = MIN(field_count, format->field_count);
for (; i < defined_field_count; ++i, ++field) {
- if (key_mp_type_validate(field->type, mp_typeof(*tuple),
+ if (key_mp_type_validate(field->type, mp_typeof(*pos),
ER_FIELD_TYPE, i + TUPLE_INDEX_BASE,
field->is_nullable))
return -1;
- mp_next(&tuple);
+ /* Check all JSON paths. */
+ if (field->childs != NULL) {
+ if (tuple_field_bypass_and_init(field, i, tuple, &pos,
+ NULL) != 0)
+ return -1;
+ } else {
+ mp_next(&pos);
+ }
}
return 0;
}
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index b14ac35..5a3a968 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -463,13 +463,15 @@ static inline int
tuple_compare_slowpath(const struct tuple *tuple_a, const struct tuple *tuple_b,
struct key_def *key_def)
{
+ assert(has_json_path == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
struct key_part *part = key_def->parts;
const char *tuple_a_raw = tuple_data(tuple_a);
const char *tuple_b_raw = tuple_data(tuple_b);
- if (key_def->part_count == 1 && part->fieldno == 0) {
+ if (key_def->part_count == 1 && part->fieldno == 0 &&
+ part->path == NULL) {
/*
* First field can not be optional - empty tuples
* can not exist.
@@ -597,6 +599,7 @@ static inline int
tuple_compare_with_key_slowpath(const struct tuple *tuple, const char *key,
uint32_t part_count, struct key_def *key_def)
{
+ assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || is_nullable);
assert(is_nullable == key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
@@ -1039,23 +1042,35 @@ static const comparator_signature cmp_arr[] = {
#undef COMPARATOR
+static const tuple_compare_t compare_slowpath_funcs[] = {
+ tuple_compare_slowpath<false, false, false>,
+ tuple_compare_slowpath<true, false, false>,
+ tuple_compare_slowpath<false, true, false>,
+ tuple_compare_slowpath<true, true, false>,
+ tuple_compare_slowpath<false, false, true>,
+ tuple_compare_slowpath<true, false, true>,
+ tuple_compare_slowpath<false, true, true>,
+ tuple_compare_slowpath<true, true, true>
+};
+
tuple_compare_t
tuple_compare_create(const struct key_def *def)
{
+ int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+ 2 * (def->has_optional_parts ? 1 : 0) +
+ 4 * (def->has_json_paths ? 1 : 0);
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
if (def->has_optional_parts)
return tuple_compare_sequential<true, true>;
else
return tuple_compare_sequential<true, false>;
- } else if (def->has_optional_parts) {
- return tuple_compare_slowpath<true, true, false>;
} else {
- return tuple_compare_slowpath<true, false, false>;
+ return compare_slowpath_funcs[cmp_func_idx];
}
}
assert(! def->has_optional_parts);
- if (!key_def_has_collation(def)) {
+ if (!key_def_has_collation(def) && !def->has_json_paths) {
/* Precalculated comparators don't use collation */
for (uint32_t k = 0;
k < sizeof(cmp_arr) / sizeof(cmp_arr[0]); k++) {
@@ -1071,10 +1086,9 @@ tuple_compare_create(const struct key_def *def)
return cmp_arr[k].f;
}
}
- if (key_def_is_sequential(def))
- return tuple_compare_sequential<false, false>;
- else
- return tuple_compare_slowpath<false, false, false>;
+ return key_def_is_sequential(def) ?
+ tuple_compare_sequential<false, false> :
+ compare_slowpath_funcs[cmp_func_idx];
}
/* }}} tuple_compare */
@@ -1256,9 +1270,23 @@ static const comparator_with_key_signature cmp_wk_arr[] = {
#undef KEY_COMPARATOR
+static const tuple_compare_with_key_t compare_with_key_slowpath_funcs[] = {
+ tuple_compare_with_key_slowpath<false, false, false>,
+ tuple_compare_with_key_slowpath<true, false, false>,
+ tuple_compare_with_key_slowpath<false, true, false>,
+ tuple_compare_with_key_slowpath<true, true, false>,
+ tuple_compare_with_key_slowpath<false, false, true>,
+ tuple_compare_with_key_slowpath<true, false, true>,
+ tuple_compare_with_key_slowpath<false, true, true>,
+ tuple_compare_with_key_slowpath<true, true, true>
+};
+
tuple_compare_with_key_t
tuple_compare_with_key_create(const struct key_def *def)
{
+ int cmp_func_idx = (def->is_nullable ? 1 : 0) +
+ 2 * (def->has_optional_parts ? 1 : 0) +
+ 4 * (def->has_json_paths ? 1 : 0);
if (def->is_nullable) {
if (key_def_is_sequential(def)) {
if (def->has_optional_parts) {
@@ -1268,14 +1296,12 @@ tuple_compare_with_key_create(const struct key_def *def)
return tuple_compare_with_key_sequential<true,
false>;
}
- } else if (def->has_optional_parts) {
- return tuple_compare_with_key_slowpath<true, true, false>;
} else {
- return tuple_compare_with_key_slowpath<true, false, false>;
+ return compare_with_key_slowpath_funcs[cmp_func_idx];
}
}
assert(! def->has_optional_parts);
- if (!key_def_has_collation(def)) {
+ if (!key_def_has_collation(def) && !def->has_json_paths) {
/* Precalculated comparators don't use collation */
for (uint32_t k = 0;
k < sizeof(cmp_wk_arr) / sizeof(cmp_wk_arr[0]);
@@ -1294,10 +1320,9 @@ tuple_compare_with_key_create(const struct key_def *def)
return cmp_wk_arr[k].f;
}
}
- if (key_def_is_sequential(def))
- return tuple_compare_with_key_sequential<false, false>;
- else
- return tuple_compare_with_key_slowpath<false, false, false>;
+ return key_def_is_sequential(def) ?
+ tuple_compare_with_key_sequential<false, false> :
+ compare_with_key_slowpath_funcs[cmp_func_idx];
}
/* }}} tuple_compare_with_key */
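
Both slowpath tables above are indexed by three flags packed into bits 0..2, so the order of the table entries has to match that packing: the first template argument is bit 0, the second is bit 1, the third is bit 2. A small sketch of the mapping in plain C. The bit assignment is taken from the cmp_func_idx expression in the patch; the function name slowpath_index() is hypothetical.

```c
#include <stdbool.h>

/* Pack the three comparator flags into a table index in [0, 7]. */
static int
slowpath_index(bool is_nullable, bool has_optional_parts, bool has_json_paths)
{
	return (is_nullable ? 1 : 0) +
	       2 * (has_optional_parts ? 1 : 0) +
	       4 * (has_json_paths ? 1 : 0);
}
```

For example, a nullable key with JSON paths and no optional parts gives index 5, which is the <true, false, true> entry in both tables.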
diff --git a/src/box/tuple_extract_key.cc b/src/box/tuple_extract_key.cc
index 6b771f3..2474d98 100644
--- a/src/box/tuple_extract_key.cc
+++ b/src/box/tuple_extract_key.cc
@@ -1,15 +1,31 @@
#include "tuple_extract_key.h"
#include "tuple.h"
#include "fiber.h"
+#include "json/path.h"
enum { MSGPACK_NULL = 0xc0 };
+/** True if key part i and i+1 are sequential. */
+template <bool has_json_paths>
+static inline bool
+key_def_parts_are_sequential(const struct key_def *def, int i)
+{
+ uint32_t fieldno1 = def->parts[i].fieldno + 1;
+ uint32_t fieldno2 = def->parts[i + 1].fieldno;
+ if (!has_json_paths) {
+ return fieldno1 == fieldno2;
+ } else {
+ return fieldno1 == fieldno2 && def->parts[i].path == NULL &&
+ def->parts[i + 1].path == NULL;
+ }
+}
+
/** True, if a key con contain two or more parts in sequence. */
static bool
key_def_contains_sequential_parts(const struct key_def *def)
{
for (uint32_t i = 0; i < def->part_count - 1; ++i) {
- if (def->parts[i].fieldno + 1 == def->parts[i + 1].fieldno)
+ if (key_def_parts_are_sequential<true>(def, i))
return true;
}
return false;
@@ -95,6 +111,7 @@ static char *
tuple_extract_key_slowpath(const struct tuple *tuple,
struct key_def *key_def, uint32_t *key_size)
{
+ assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(contains_sequential_parts ==
@@ -129,8 +146,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
* minimize tuple_field_raw() calls.
*/
for (; i < part_count - 1; i++) {
- if (key_def->parts[i].fieldno + 1 !=
- key_def->parts[i + 1].fieldno) {
+ if (!key_def_parts_are_sequential
+ <has_json_paths>(key_def, i)) {
/*
* End of sequential part.
*/
@@ -176,8 +193,8 @@ tuple_extract_key_slowpath(const struct tuple *tuple,
* minimize tuple_field_raw() calls.
*/
for (; i < part_count - 1; i++) {
- if (key_def->parts[i].fieldno + 1 !=
- key_def->parts[i + 1].fieldno) {
+ if (!key_def_parts_are_sequential
+ <has_json_paths>(key_def, i)) {
/*
* End of sequential part.
*/
@@ -215,6 +232,7 @@ static char *
tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
struct key_def *key_def, uint32_t *key_size)
{
+ assert(has_json_paths == key_def->has_json_paths);
assert(!has_optional_parts || key_def->is_nullable);
assert(has_optional_parts == key_def->has_optional_parts);
assert(mp_sizeof_nil() == 1);
@@ -242,11 +260,12 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
uint32_t fieldno = key_def->parts[i].fieldno;
uint32_t null_count = 0;
for (; i < key_def->part_count - 1; i++) {
- if (key_def->parts[i].fieldno + 1 !=
- key_def->parts[i + 1].fieldno)
+ if (!key_def_parts_are_sequential
+ <has_json_paths>(key_def, i))
break;
}
- uint32_t end_fieldno = key_def->parts[i].fieldno;
+ const struct key_part *part = &key_def->parts[i];
+ uint32_t end_fieldno = part->fieldno;
if (fieldno < current_fieldno) {
/* Rewind. */
@@ -288,6 +307,21 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
current_fieldno++;
}
}
+ const char *field_last, *field_end_last;
+ if (has_json_paths && part->path != NULL) {
+ field_last = field;
+ field_end_last = field_end;
+ struct json_path_parser parser;
+ struct json_path_node node;
+ json_path_parser_create(&parser, part->path,
+ part->path_len);
+ /* Skip fieldno. */
+ int rc = json_path_next(&parser, &node);
+ assert(rc == 0);
+ rc = tuple_field_dig_with_parser(&parser, &field);
+ field_end = field;
+ mp_next(&field_end);
+ }
memcpy(key_buf, field, field_end - field);
key_buf += field_end - field;
if (has_optional_parts && null_count != 0) {
@@ -296,12 +330,27 @@ tuple_extract_key_slowpath_raw(const char *data, const char *data_end,
} else {
assert(key_buf - key <= data_end - data);
}
+ if (has_json_paths && part->path != NULL) {
+ field = field_last;
+ field_end = field_end_last;
+ }
}
if (key_size != NULL)
*key_size = (uint32_t)(key_buf - key);
return key;
}
+static const tuple_extract_key_t extract_key_slowpath_funcs[] = {
+ tuple_extract_key_slowpath<false, false, false>,
+ tuple_extract_key_slowpath<true, false, false>,
+ tuple_extract_key_slowpath<false, true, false>,
+ tuple_extract_key_slowpath<true, true, false>,
+ tuple_extract_key_slowpath<false, false, true>,
+ tuple_extract_key_slowpath<true, false, true>,
+ tuple_extract_key_slowpath<false, true, true>,
+ tuple_extract_key_slowpath<true, true, true>
+};
+
/**
* Initialize tuple_extract_key() and tuple_extract_key_raw()
*/
@@ -322,35 +371,30 @@ tuple_extract_key_set(struct key_def *key_def)
tuple_extract_key_sequential_raw<false>;
}
} else {
- if (key_def->has_optional_parts) {
- assert(key_def->is_nullable);
- if (key_def_contains_sequential_parts(key_def)) {
- key_def->tuple_extract_key =
- tuple_extract_key_slowpath<true, true,
- false>;
- } else {
- key_def->tuple_extract_key =
- tuple_extract_key_slowpath<false, true,
- false>;
- }
- } else {
- if (key_def_contains_sequential_parts(key_def)) {
- key_def->tuple_extract_key =
- tuple_extract_key_slowpath<true, false,
- false>;
- } else {
- key_def->tuple_extract_key =
- tuple_extract_key_slowpath<false, false,
- false>;
- }
- }
+ int func_idx =
+ (key_def_contains_sequential_parts(key_def) ? 1 : 0) +
+ 2 * (key_def->has_optional_parts ? 1 : 0) +
+ 4 * (key_def->has_json_paths ? 1 : 0);
+ key_def->tuple_extract_key =
+ extract_key_slowpath_funcs[func_idx];
+ assert(!key_def->has_optional_parts || key_def->is_nullable);
}
if (key_def->has_optional_parts) {
assert(key_def->is_nullable);
- key_def->tuple_extract_key_raw =
- tuple_extract_key_slowpath_raw<true, false>;
+ if (key_def->has_json_paths) {
+ key_def->tuple_extract_key_raw =
+ tuple_extract_key_slowpath_raw<true, true>;
+ } else {
+ key_def->tuple_extract_key_raw =
+ tuple_extract_key_slowpath_raw<true, false>;
+ }
} else {
- key_def->tuple_extract_key_raw =
- tuple_extract_key_slowpath_raw<false, false>;
+ if (key_def->has_json_paths) {
+ key_def->tuple_extract_key_raw =
+ tuple_extract_key_slowpath_raw<false, true>;
+ } else {
+ key_def->tuple_extract_key_raw =
+ tuple_extract_key_slowpath_raw<false, false>;
+ }
}
}
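
The helper key_def_parts_are_sequential() treats two neighbouring parts as sequential only when their field numbers are consecutive and neither part carries a JSON path. A standalone sketch of that rule in plain C, using a hypothetical simplified struct instead of struct key_part and without the template parameter:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct key_part. */
struct part_sketch {
	uint32_t fieldno;
	/* NULL when the part has no JSON path. */
	const char *path;
};

/* True if parts i and i + 1 can be copied as one contiguous range. */
static bool
parts_are_sequential(const struct part_sketch *parts, uint32_t i)
{
	return parts[i].fieldno + 1 == parts[i + 1].fieldno &&
	       parts[i].path == NULL && parts[i + 1].path == NULL;
}
```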
diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c
index 6ae96e2..00170c9 100644
--- a/src/box/tuple_format.c
+++ b/src/box/tuple_format.c
@@ -30,6 +30,7 @@
*/
#include "json/path.h"
#include "tuple_format.h"
+#include "assoc.h"
/** Global table of tuple formats */
struct tuple_format **tuple_formats;
@@ -38,10 +39,551 @@ static intptr_t recycled_format_ids = FORMAT_ID_NIL;
static uint32_t formats_size = 0, formats_capacity = 0;
static const struct tuple_field tuple_field_default = {
- FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false,
+ FIELD_TYPE_ANY, TUPLE_OFFSET_SLOT_NIL, false, false, {{NULL, 0}}
};
/**
+ * Propagate @a field to MessagePack(field)[key].
+ * @param[in,out] field Field to propagate.
+ * @param key Key to propagate to.
+ * @param len Length of @a key.
+ * @param field_idx Field index in map.
+ *
+ * @retval 0 Success, the index was found.
+ * @retval -1 Not found.
+ */
+static inline int
+tuple_field_go_to_key(const char **field, const char *key, int len,
+ uint32_t *field_idx)
+{
+ enum mp_type type = mp_typeof(**field);
+ if (type != MP_MAP)
+ return -1;
+ uint32_t count = mp_decode_map(field);
+ for (uint32_t idx = 0; idx < count; idx++) {
+ type = mp_typeof(**field);
+ if (type == MP_STR) {
+ uint32_t value_len;
+ const char *value = mp_decode_str(field, &value_len);
+ if (value_len == (uint32_t)len &&
+ memcmp(value, key, len) == 0) {
+ *field_idx = idx;
+ return 0;
+ }
+ } else {
+ /* Skip key. */
+ mp_next(field);
+ }
+ /* Skip value. */
+ mp_next(field);
+ }
+ return -1;
+}
+
+struct mh_strnptr_node_t *
+json_path_hash_get(struct mh_strnptr_t *hashtable, const char *path,
+ uint32_t path_len, uint32_t path_hash)
+{
+ assert(hashtable != NULL);
+ struct mh_strnptr_key_t key = {path, path_len, path_hash};
+ mh_int_t rc = mh_strnptr_find(hashtable, &key, NULL);
+ if (rc == mh_end(hashtable))
+ return NULL;
+ return mh_strnptr_node(hashtable, rc);
+}
+
+/**
+ * Create a new hashtable object.
+ * @param records Count of records to reserve.
+ *
+ * @retval NULL On error, diag is set.
+ * @retval not NULL Pointer to the new hashtable.
+ */
+static struct mh_strnptr_t *
+json_path_hash_create(uint32_t records)
+{
+ struct mh_strnptr_t *ret = mh_strnptr_new();
+ if (ret == NULL) {
+ diag_set(OutOfMemory, sizeof(struct mh_strnptr_t),
+ "mh_strnptr_new", "hashtable");
+ return NULL;
+ }
+ if (records > 0 &&
+ mh_strnptr_reserve(ret, records, NULL) != 0) {
+ mh_strnptr_delete(ret);
+ diag_set(OutOfMemory, records, "mh_strnptr_reserve",
+ "hashtable");
+ return NULL;
+ }
+ return ret;
+}
+
+/**
+ * Delete @hashtable object.
+ * @param hashtable Pointer to object to delete.
+ */
+static void
+json_path_hash_delete(struct mh_strnptr_t *hashtable)
+{
+ assert(hashtable != NULL);
+ while (mh_size(hashtable) != 0) {
+ mh_int_t n = mh_first(hashtable);
+ mh_strnptr_del(hashtable, n, NULL);
+ }
+ mh_strnptr_delete(hashtable);
+}
+
+/**
+ * Insert a new record to hashtable.
+ * @param hashtable Storage to insert new record.
+ * @param path String with path.
+ * @param path_len Length of @path.
+ * @param field Value to store in @hashtable.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+json_path_hash_insert(struct mh_strnptr_t *hashtable, const char *path,
+ uint32_t path_len, struct tuple_field *field)
+{
+ assert(hashtable != NULL);
+ uint32_t path_hash = mh_strn_hash(path, path_len);
+ struct mh_strnptr_node_t name_node = {path, path_len, path_hash, field};
+ mh_int_t rc = mh_strnptr_put(hashtable, &name_node, NULL, NULL);
+ if (rc == mh_end(hashtable)) {
+ diag_set(OutOfMemory, sizeof(*hashtable), "mh_strnptr_put",
+ "hashtable");
+ return -1;
+ }
+ return 0;
+}
+
+/**
+ * Construct field tree level for JSON path part.
+ *
+ * @param[in, out] field_subtree Pointer to the record to start
+ * with; on success it is changed to the
+ * record that matches @part lexeme.
+ * @param fieldno Number of root space field.
+ * @param part JSON path lexeme to represent in field tree.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+json_field_tree_append(struct tuple_field **field_subtree, uint32_t fieldno,
+ struct json_path_node *part)
+{
+ enum field_type type;
+ struct tuple_field *field = *field_subtree;
+ switch (part->type) {
+ case JSON_PATH_NUM: {
+ type = FIELD_TYPE_ARRAY;
+ if (field->type != FIELD_TYPE_ANY && field->type != type)
+ goto error_type_mistmatch;
+ /* Create or resize field array if required. */
+ if (field->array == NULL || part->num > field->array_size) {
+ struct tuple_field **array =
+ realloc(field->array,
+ part->num * sizeof(array[0]));
+ if (array == NULL) {
+ diag_set(OutOfMemory,
+ part->num * sizeof(array[0]),
+ "realloc","array");
+ return -1;
+ }
+ memset(&array[field->array_size], 0,
+ (part->num - field->array_size) *
+ sizeof(array[0]));
+ field->array = array;
+ field->array_size = part->num;
+ field->type = type;
+ } else if (field->array[part->num - TUPLE_INDEX_BASE] != NULL) {
+ /* Record already exists. No actions required */
+ *field_subtree =
+ field->array[part->num - TUPLE_INDEX_BASE];
+ return 0;
+ }
+ break;
+ }
+ case JSON_PATH_STR: {
+ type = FIELD_TYPE_MAP;
+ if (field->type != FIELD_TYPE_ANY && field->type != type)
+ goto error_type_mistmatch;
+ if (field->map == NULL) {
+ field->map = json_path_hash_create(1);
+ if (field->map == NULL)
+ return -1;
+ field->type = type;
+ } else {
+ uint32_t str_hash = mh_strn_hash(part->str, part->len);
+ struct mh_strnptr_node_t *ht_record =
+ json_path_hash_get(field->map, part->str,
+ part->len, str_hash);
+ if (ht_record != NULL) {
+ assert(ht_record->val != NULL);
+ *field_subtree = ht_record->val;
+ return 0;
+ }
+ }
+ break;
+ }
+ default:
+ unreachable();
+ }
+
+ /* Construct and insert a new record. */
+ struct tuple_field *new_field = malloc(sizeof(struct tuple_field));
+ if (new_field == NULL) {
+ diag_set(OutOfMemory, sizeof(struct tuple_field), "malloc",
+ "new_field");
+ return -1;
+ }
+ *new_field = tuple_field_default;
+ if (field->type == FIELD_TYPE_MAP) {
+ if (json_path_hash_insert(field->map, part->str, part->len,
+ new_field) != 0) {
+ free(new_field);
+ return -1;
+ }
+ } else if (field->type == FIELD_TYPE_ARRAY) {
+ field->array[part->num - TUPLE_INDEX_BASE] = new_field;
+ }
+ *field_subtree = new_field;
+ return 0;
+
+error_type_mistmatch:
+ diag_set(ClientError, ER_INDEX_PART_TYPE_MISMATCH,
+ tt_sprintf("%d", fieldno + TUPLE_INDEX_BASE),
+ field_type_strs[type], field_type_strs[field->type]);
+ return -1;
+}
+
+/**
+ * Delete @field_subtree object.
+ * @param field_subtree To delete.
+ */
+static void
+json_field_tree_delete(struct tuple_field *field_subtree)
+{
+ if (field_subtree->type == FIELD_TYPE_MAP &&
+ field_subtree->map != NULL) {
+ mh_int_t i;
+ mh_foreach(field_subtree->map, i) {
+ struct tuple_field *field =
+ mh_strnptr_node(field_subtree->map, i)->val;
+ assert(field != NULL);
+ json_field_tree_delete(field);
+ free(field);
+ }
+ json_path_hash_delete(field_subtree->map);
+ } else if (field_subtree->type == FIELD_TYPE_ARRAY &&
+ field_subtree->array != NULL) {
+ for (uint32_t i = 0; i < field_subtree->array_size; i++) {
+ struct tuple_field *field = field_subtree->array[i];
+ if (field == NULL)
+ continue;
+ json_field_tree_delete(field_subtree->array[i]);
+ free(field_subtree->array[i]);
+ }
+ free(field_subtree->array);
+ }
+}
+
+int
+tuple_field_bypass_and_init(const struct tuple_field *field, uint32_t idx,
+ const char *tuple, const char **offset,
+ uint32_t *field_map)
+{
+ assert(offset != NULL);
+ const char *mp_data = *offset;
+ const char *valid_type_str = NULL;
+ const char *err = NULL;
+ enum mp_type type = mp_typeof(**offset);
+ if (field->type == FIELD_TYPE_MAP) {
+ if (type != MP_MAP) {
+ valid_type_str = mp_type_strs[MP_MAP];
+ goto error_type_mistmatch;
+ }
+ const char *max_offset = *offset;
+ uint32_t max_idx = 0;
+ uint32_t count = mp_decode_map(&max_offset);
+ mh_int_t i;
+ mh_foreach(field->map, i) {
+ struct mh_strnptr_node_t *ht_record =
+ mh_strnptr_node(field->map, i);
+ struct tuple_field *leaf = ht_record->val;
+ assert(leaf != NULL);
+
+ const char *raw = *offset;
+ uint32_t map_idx = 0;
+ int rc = tuple_field_go_to_key(&raw, ht_record->str,
+ (int)ht_record->len,
+ &map_idx);
+ if (rc != 0 && !leaf->is_nullable) {
+ err = tt_sprintf("map doesn't contain key "
+ "'%.*s' defined in index",
+ ht_record->len,ht_record->str);
+ goto error_invalid_document;
+ }
+ if (rc != 0) {
+ if (field_map != NULL &&
+ leaf->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[leaf->offset_slot] = 0;
+ continue;
+ }
+ if (tuple_field_bypass_and_init(leaf, idx, tuple, &raw,
+ field_map) != 0)
+ return -1;
+ max_idx = MAX(max_idx, map_idx + 1);
+ max_offset = MAX(max_offset, raw);
+ }
+ *offset = max_offset;
+ while (count-- > max_idx) {
+ mp_next(offset);
+ mp_next(offset);
+ }
+ return 0;
+ } else if (field->type == FIELD_TYPE_ARRAY) {
+ if (type != MP_ARRAY) {
+ valid_type_str = mp_type_strs[MP_ARRAY];
+ goto error_type_mistmatch;
+ }
+ uint32_t count = mp_decode_array(offset);
+ for (uint32_t i = count; i < field->array_size; i++) {
+ /*
+ * Index fields out of document array
+ * must be nullable.
+ */
+ struct tuple_field *leaf = field->array[i];
+ if (leaf == NULL)
+ continue;
+ if (leaf->is_nullable) {
+ if (field_map != NULL &&
+ leaf->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[leaf->offset_slot] = 0;
+ continue;
+ }
+ err = tt_sprintf("array size %d is less than size of %d "
+ "defined in index", i, i + 1);
+ goto error_invalid_document;
+ }
+ uint32_t fields = MIN(field->array_size, count);
+ for (uint32_t i = 0; i < fields; i++) {
+ if (field->array[i] == NULL) {
+ mp_next(offset);
+ continue;
+ }
+ if (tuple_field_bypass_and_init(field->array[i], idx,
+ tuple,
+ offset, field_map) != 0)
+ return -1;
+ }
+ while (count-- > fields)
+ mp_next(offset);
+ return 0;
+ }
+ /* Tree leaf field */
+ if (key_mp_type_validate(field->type, type, ER_KEY_PART_TYPE, idx,
+ field->is_nullable) != 0) {
+ valid_type_str = field_type_strs[field->type];
+ goto error_type_mistmatch;
+ }
+ assert(offset != NULL);
+ if (field_map != NULL &&
+ field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[field->offset_slot] = (uint32_t) (*offset - tuple);
+ mp_next(offset);
+ return 0;
+
+error_type_mistmatch:
+ err = tt_sprintf("type mismatch: have %s, expected %s",
+ mp_type_strs[type], valid_type_str);
+error_invalid_document:
+ assert(err != NULL);
+ char *data_buff = tt_static_buf();
+ mp_snprint(data_buff, TT_STATIC_BUF_LEN, mp_data);
+ const char *err_msg =
+ tt_sprintf("invalid field %d document content '%s': %s",
+ idx + TUPLE_INDEX_BASE, data_buff, err);
+ diag_set(ClientError, ER_DATA_STRUCTURE_MISMATCH, err_msg);
+ return -1;
+}
+
+/**
+ * Add new JSON @path to @format.
+ * @param format Tuple format to modify.
+ * @param path String to add.
+ * @param path_len Length of @path.
+ * @param path_hash Hash of @path.
+ * @param type Type of field by @path.
+ * @param is_nullable Nullability of field by @path.
+ * @param strings Area to store unique JSON paths (optional).
+ * @param[out] leaf Pointer to leaf field.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+tuple_format_add_json_path(struct tuple_format *format, const char *path,
+ uint32_t path_len, uint32_t path_hash,
+ enum field_type type, bool is_nullable,
+ char **strings, struct tuple_field **leaf)
+{
+ assert(format->path_hash != NULL);
+ /*
+ * Get root field by index.
+ * Path is specified in canonical form: [i]...
+ */
+ struct json_path_parser parser;
+ struct json_path_node node;
+ json_path_parser_create(&parser, path, path_len);
+ int rc = json_path_next(&parser, &node);
+ assert(rc == 0 && node.type == JSON_PATH_NUM);
+ assert(node.num - TUPLE_INDEX_BASE < format->field_count);
+
+ /* Test if path is already registered. */
+ struct mh_strnptr_node_t *ht_record =
+ json_path_hash_get(format->path_hash, path, path_len, path_hash);
+ assert(ht_record != NULL);
+ struct tuple_field *field = ht_record->val;
+ if (unlikely(field != NULL)) {
+ /* Path has been already registered. */
+ if (field->is_nullable != is_nullable)
+ field->is_nullable = false;
+ if (field_type1_contains_type2(field->type, type)) {
+ field->type = type;
+ } else if (!field_type1_contains_type2(type, field->type)) {
+ const char *err =
+ tt_sprintf("JSON path '%.*s' has been already "
+ "constructed for '%s' leaf record",
+ path_len, path,
+ field_type_strs[field->type]);
+ diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
+ node.num, err);
+ return -1;
+ }
+ *leaf = field;
+ return 0;
+ } else if (strings != NULL) {
+ /*
+ * Hashtable should hold memory related to format
+ * chunk allocation.
+ */
+ memcpy(*strings, path, path_len);
+ (*strings)[path_len] = '\0';
+ ht_record->str = *strings;
+ *strings += path_len + 1;
+ }
+
+ /*
+ * We have to re-init parser with path string located in
+ * format chunk.
+ */
+ json_path_parser_create(&parser, ht_record->str + parser.offset,
+ path_len - parser.offset);
+ /* Build data path tree. */
+ uint32_t root_fieldno = node.num - TUPLE_INDEX_BASE;
+ field = &format->fields[root_fieldno];
+ while ((rc = json_path_next(&parser, &node)) == 0 &&
+ node.type != JSON_PATH_END) {
+ if (json_field_tree_append(&field, root_fieldno, &node) != 0)
+ return -1;
+ }
+ assert(rc == 0 && node.type == JSON_PATH_END);
+
+ /* Leaf record is a new object because the JSON path is unique. */
+ field->type = type;
+ field->is_nullable = is_nullable;
+ *leaf = field;
+ ht_record->val = field;
+ return 0;
+}
+
+/**
+ * Add a new key_part to format and initialize format tuple_field
+ * representation.
+ * @param format Format to initialize.
+ * @param fields Fields definition if any.
+ * @param field_count Count of @fields.
+ * @param part An index part to append.
+ * @param is_sequential True if the part belongs to a sequential key.
+ * @param data Memory to store path strings.
+ * @param current_slot Pointer to last offset slot.
+ * @retval -1 On error.
+ * @retval 0 On success.
+ */
+static int
+tuple_format_add_key_part(struct tuple_format *format,
+ const struct field_def *fields, uint32_t field_count,
+ const struct key_part *part, bool is_sequential,
+ char **data, int *current_slot)
+{
+ assert(part->fieldno < format->field_count);
+ struct tuple_field *field = &format->fields[part->fieldno];
+ if (part->path != NULL) {
+ field->is_key_part = true;
+ assert(!is_sequential);
+ struct tuple_field *leaf = NULL;
+ if (tuple_format_add_json_path(format, part->path,
+ part->path_len, part->path_hash,
+ part->type, part->is_nullable,
+ data, &leaf) != 0)
+ return -1;
+ assert(leaf != NULL);
+ if (leaf->offset_slot == TUPLE_OFFSET_SLOT_NIL) {
+ *current_slot = *current_slot - 1;
+ leaf->offset_slot = *current_slot;
+ }
+ return 0;
+ }
+ if (part->fieldno >= field_count) {
+ field->is_nullable = part->is_nullable;
+ } else if (field->is_nullable != part->is_nullable) {
+ /*
+ * In case of mismatch set the most
+ * strict option for is_nullable.
+ */
+ field->is_nullable = false;
+ }
+ /*
+ * Check that there are no conflicts
+ * between index part types and space
+ * fields. If a part type is compatible
+ * with field's one, then the part type is
+ * more strict and the part type must be
+ * used in tuple_format.
+ */
+ if (field_type1_contains_type2(field->type, part->type)) {
+ field->type = part->type;
+ } else if (!field_type1_contains_type2(part->type, field->type)) {
+ int fieldno = part->fieldno + TUPLE_INDEX_BASE;
+ const char *name = part->fieldno >= field_count ?
+ tt_sprintf("%d", fieldno) :
+ tt_sprintf("'%s'",
+ fields[part->fieldno].name);
+ int errcode = !field->is_key_part ?
+ ER_FORMAT_MISMATCH_INDEX_PART :
+ ER_INDEX_PART_TYPE_MISMATCH;
+ diag_set(ClientError, errcode, name,
+ field_type_strs[field->type],
+ field_type_strs[part->type]);
+ return -1;
+ }
+ field->is_key_part = true;
+ /*
+ * In the tuple, store only offsets necessary
+ * to access fields of non-sequential keys.
+ * First field is always simply accessible,
+ * so we don't store an offset for it.
+ */
+ if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL && !is_sequential &&
+ part->fieldno > 0) {
+ *current_slot = *current_slot - 1;
+ field->offset_slot = *current_slot;
+ }
+ return 0;
+}
+
+/**
* Extract all available type info from keys and field
* definitions.
*/
@@ -63,12 +605,18 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
format->fields[i].type = fields[i].type;
format->fields[i].offset_slot = TUPLE_OFFSET_SLOT_NIL;
format->fields[i].is_nullable = fields[i].is_nullable;
+ /* Don't need to init format->fields[i].map. */
+ format->fields[i].childs = NULL;
+ format->fields[i].array_size = 0;
}
/* Initialize remaining fields */
for (uint32_t i = field_count; i < format->field_count; i++)
format->fields[i] = tuple_field_default;
int current_slot = 0;
+ /* Memory allocated for JSON paths if any. */
+ char *data = (char *)format + sizeof(struct tuple_format) +
+ format->field_count * sizeof(struct tuple_field);
/* extract field type info */
for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
@@ -76,65 +624,12 @@ tuple_format_create(struct tuple_format *format, struct key_def * const *keys,
bool is_sequential = key_def_is_sequential(key_def);
const struct key_part *part = key_def->parts;
const struct key_part *parts_end = part + key_def->part_count;
-
for (; part < parts_end; part++) {
- assert(part->fieldno < format->field_count);
- struct tuple_field *field =
- &format->fields[part->fieldno];
- if (part->fieldno >= field_count) {
- field->is_nullable = part->is_nullable;
- } else if (field->is_nullable != part->is_nullable) {
- /*
- * In case of mismatch set the most
- * strict option for is_nullable.
- */
- field->is_nullable = false;
- }
-
- /*
- * Check that there are no conflicts
- * between index part types and space
- * fields. If a part type is compatible
- * with field's one, then the part type is
- * more strict and the part type must be
- * used in tuple_format.
- */
- if (field_type1_contains_type2(field->type,
- part->type)) {
- field->type = part->type;
- } else if (! field_type1_contains_type2(part->type,
- field->type)) {
- const char *name;
- int fieldno = part->fieldno + TUPLE_INDEX_BASE;
- if (part->fieldno >= field_count) {
- name = tt_sprintf("%d", fieldno);
- } else {
- const struct field_def *def =
- &fields[part->fieldno];
- name = tt_sprintf("'%s'", def->name);
- }
- int errcode;
- if (! field->is_key_part)
- errcode = ER_FORMAT_MISMATCH_INDEX_PART;
- else
- errcode = ER_INDEX_PART_TYPE_MISMATCH;
- diag_set(ClientError, errcode, name,
- field_type_strs[field->type],
- field_type_strs[part->type]);
+ if (tuple_format_add_key_part(format, fields,
+ field_count, part,
+ is_sequential, &data,
+ &current_slot) != 0)
return -1;
- }
- field->is_key_part = true;
- /*
- * In the tuple, store only offsets necessary
- * to access fields of non-sequential keys.
- * First field is always simply accessible,
- * so we don't store an offset for it.
- */
- if (field->offset_slot == TUPLE_OFFSET_SLOT_NIL &&
- is_sequential == false && part->fieldno > 0) {
-
- field->offset_slot = --current_slot;
- }
}
}
@@ -201,32 +696,58 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
uint32_t space_field_count, struct tuple_dictionary *dict)
{
uint32_t index_field_count = 0;
+ /* JSON path hashtable. */
+ struct mh_strnptr_t *path_hash = json_path_hash_create(0);
+ if (path_hash == NULL)
+ return NULL;
/* find max max field no */
for (uint16_t key_no = 0; key_no < key_count; ++key_no) {
const struct key_def *key_def = keys[key_no];
const struct key_part *part = key_def->parts;
const struct key_part *pend = part + key_def->part_count;
for (; part < pend; part++) {
+ if (part->path != NULL &&
+ json_path_hash_insert(path_hash, part->path,
+ part->path_len, NULL) != 0)
+ goto error;
index_field_count = MAX(index_field_count,
part->fieldno + 1);
}
}
+ size_t extra_size = 0;
+ if (mh_size(path_hash) == 0) {
+ /* Hashtable is useless. */
+ json_path_hash_delete(path_hash);
+ path_hash = NULL;
+ } else {
+ /*
+ * Calculate the total size of unique JSON paths.
+ * Path data will be copied later in the
+ * tuple_format_create routine.
+ */
+ mh_int_t i;
+ mh_foreach(path_hash, i) {
+ struct mh_strnptr_node_t *node =
+ mh_strnptr_node(path_hash, i);
+ extra_size += node->len + 1;
+ }
+ }
uint32_t field_count = MAX(space_field_count, index_field_count);
uint32_t total = sizeof(struct tuple_format) +
- field_count * sizeof(struct tuple_field);
+ field_count * sizeof(struct tuple_field) + extra_size;
struct tuple_format *format = (struct tuple_format *) malloc(total);
if (format == NULL) {
diag_set(OutOfMemory, sizeof(struct tuple_format), "malloc",
"tuple format");
- return NULL;
+ goto error;
}
if (dict == NULL) {
assert(space_field_count == 0);
format->dict = tuple_dictionary_new(NULL, 0);
if (format->dict == NULL) {
free(format);
- return NULL;
+ goto error;
}
} else {
format->dict = dict;
@@ -243,13 +764,21 @@ tuple_format_alloc(struct key_def * const *keys, uint16_t key_count,
format->index_field_count = index_field_count;
format->exact_field_count = 0;
format->min_field_count = 0;
+ format->path_hash = path_hash;
return format;
+error:
+ json_path_hash_delete(path_hash);
+ return NULL;
}
/** Free tuple format resources, doesn't unregister. */
static inline void
tuple_format_destroy(struct tuple_format *format)
{
+ for (uint32_t i = 0; i < format->field_count; i++)
+ json_field_tree_delete(&format->fields[i]);
+ if (format->path_hash != NULL)
+ json_path_hash_delete(format->path_hash);
tuple_dictionary_unref(format->dict);
}
@@ -334,21 +863,61 @@ tuple_format_dup(struct tuple_format *src)
{
uint32_t total = sizeof(struct tuple_format) +
src->field_count * sizeof(struct tuple_field);
+ if (src->path_hash != NULL) {
+ mh_int_t i;
+ mh_foreach(src->path_hash, i)
+ total += mh_strnptr_node(src->path_hash, i)->len + 1;
+ }
struct tuple_format *format = (struct tuple_format *) malloc(total);
if (format == NULL) {
diag_set(OutOfMemory, total, "malloc", "tuple format");
return NULL;
}
memcpy(format, src, total);
+
+ /* Fill with NULLs for normal destruction on error. */
+ format->path_hash = NULL;
+ for (uint32_t i = 0; i < format->field_count; i++) {
+ format->fields[i].childs = NULL;
+ format->fields[i].array_size = 0;
+ }
+ if (src->path_hash != NULL) {
+ mh_int_t i;
+ format->path_hash =
+ json_path_hash_create(mh_size(src->path_hash));
+ if (format->path_hash == NULL)
+ goto error;
+ mh_foreach(src->path_hash, i) {
+ struct mh_strnptr_node_t *node =
+ mh_strnptr_node(src->path_hash, i);
+ /* Path data has already been copied. */
+ char *path = (char *)format + (node->str - (char *)src);
+ if (json_path_hash_insert(format->path_hash, path,
+ node->len, NULL) != 0)
+ goto error;
+ /* Store source leaf field offset_slot. */
+ struct tuple_field *leaf = node->val;
+ int32_t offset_slot = leaf->offset_slot;
+ uint32_t path_hash = mh_strn_hash(path, node->len);
+ if (tuple_format_add_json_path(format, path, node->len,
+ path_hash, leaf->type,
+ leaf->is_nullable, NULL,
+ &leaf) != 0)
+ goto error;
+ /* Store offset_slot in a new leaf record. */
+ assert(leaf != NULL);
+ leaf->offset_slot = offset_slot;
+ }
+ }
tuple_dictionary_ref(format->dict);
format->id = FORMAT_ID_NIL;
format->refs = 0;
- if (tuple_format_register(format) != 0) {
- tuple_format_destroy(format);
- free(format);
- return NULL;
- }
- return format;
+ if (tuple_format_register(format) == 0)
+ return format;
+error:
+ tuple_format_destroy(format);
+ free(format);
+ return NULL;
}
/** @sa declaration for details. */
@@ -377,18 +946,10 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
return -1;
}
- /* first field is simply accessible, so we do not store offset to it */
- enum mp_type mp_type = mp_typeof(*pos);
+ uint32_t i = 0;
+ enum mp_type mp_type;
const struct tuple_field *field = &format->fields[0];
- if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
- TUPLE_INDEX_BASE, field->is_nullable))
- return -1;
- mp_next(&pos);
- /* other fields...*/
- ++field;
- uint32_t i = 1;
- uint32_t defined_field_count = MIN(field_count, format->field_count);
- if (field_count < format->index_field_count) {
+ if (field_count < format->index_field_count || field->childs != NULL) {
/*
* Nullify field map to be able to detect by 0,
* which key fields are absent in tuple_field().
@@ -396,6 +957,20 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
memset((char *)field_map - format->field_map_size, 0,
format->field_map_size);
}
+ if (field->childs == NULL) {
+ /*
+ * First field is simply accessible, do not store
+ * offset to it.
+ */
+ mp_type = mp_typeof(*pos);
+ if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
+ TUPLE_INDEX_BASE, field->is_nullable))
+ return -1;
+ mp_next(&pos);
+ ++field;
+ ++i;
+ }
+ uint32_t defined_field_count = MIN(field_count, format->field_count);
for (; i < defined_field_count; ++i, ++field) {
mp_type = mp_typeof(*pos);
if (key_mp_type_validate(field->type, mp_type, ER_FIELD_TYPE,
@@ -405,8 +980,12 @@ tuple_init_field_map(const struct tuple_format *format, uint32_t *field_map,
if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
field_map[field->offset_slot] =
(uint32_t) (pos - tuple);
- }
- mp_next(&pos);
+ } else if (field->childs != NULL &&
+ tuple_field_bypass_and_init(field, i, tuple, &pos,
+ field_map) != 0)
+ return -1;
+ if (field->childs == NULL)
+ mp_next(&pos);
}
return 0;
}
@@ -512,55 +1091,106 @@ tuple_field_go_to_index(const char **field, uint64_t index)
return -1;
}
-/**
- * Propagate @a field to MessagePack(field)[key].
- * @param[in][out] field Field to propagate.
- * @param key Key to propagate to.
- * @param len Length of @a key.
- *
- * @retval 0 Success, the index was found.
- * @retval -1 Not found.
- */
-static inline int
-tuple_field_go_to_key(const char **field, const char *key, int len)
+const char *
+tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
+ const uint32_t *field_map, struct key_part *part)
{
- enum mp_type type = mp_typeof(**field);
- if (type != MP_MAP)
- return -1;
- uint64_t count = mp_decode_map(field);
- for (; count > 0; --count) {
- type = mp_typeof(**field);
- if (type == MP_STR) {
- uint32_t value_len;
- const char *value = mp_decode_str(field, &value_len);
- if (value_len == (uint)len &&
- memcmp(value, key, len) == 0)
- return 0;
- } else {
- /* Skip key. */
- mp_next(field);
+ if (likely(part->path == NULL))
+ return tuple_field_raw(format, data, field_map, part->fieldno);
+
+ struct mh_strnptr_node_t *ht_record = NULL;
+ int32_t offset_slot;
+ if (likely(part->offset_slot_epoch == format->epoch)) {
+ offset_slot = part->offset_slot;
+ } else if (format->path_hash != NULL &&
+ (ht_record = json_path_hash_get(format->path_hash, part->path,
+ part->path_len,
+ part->path_hash)) != NULL) {
+ struct tuple_field *field = ht_record->val;
+ assert(field != NULL);
+ offset_slot = field->offset_slot;
+ /* Cache offset_slot if required. */
+ if (part->offset_slot_epoch < format->epoch) {
+ part->offset_slot = offset_slot;
+ part->offset_slot_epoch = format->epoch;
}
- /* Skip value. */
- mp_next(field);
+ } else {
+ /*
+ * Legacy tuple having no field map for
+ * JSON index.
+ */
+ uint32_t path_hash =
+ field_name_hash(part->path, part->path_len);
+ const char *raw = NULL;
+ if (tuple_field_raw_by_path(format, data, field_map,
+ part->path, part->path_len,
+ path_hash, &raw) != 0)
+ raw = NULL;
+ return raw;
}
- return -1;
+ assert(offset_slot < 0);
+ assert(-offset_slot * sizeof(uint32_t) <= format->field_map_size);
+ if (unlikely(field_map[offset_slot] == 0))
+ return NULL;
+ return data + field_map[offset_slot];
}
-const char *
-tuple_field_by_part_raw(const struct tuple_format *format, const char *data,
- const uint32_t *field_map, struct key_part *part)
+int
+tuple_field_dig_with_parser(struct json_path_parser *parser, const char **field)
{
- return tuple_field_raw(format, data, field_map, part->fieldno);
+ int rc;
+ struct json_path_node node;
+ while ((rc = json_path_next(parser, &node)) == 0) {
+ uint32_t dummy;
+ switch(node.type) {
+ case JSON_PATH_NUM:
+ rc = tuple_field_go_to_index(field, node.num);
+ break;
+ case JSON_PATH_STR:
+ rc = tuple_field_go_to_key(field, node.str,
+ node.len, &dummy);
+ break;
+ default:
+ assert(node.type == JSON_PATH_END);
+ return 0;
+ }
+ if (rc != 0) {
+ *field = NULL;
+ return 0;
+ }
+ }
+ return rc;
}
int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_path(const struct tuple_format *format, const char *tuple,
const uint32_t *field_map, const char *path,
uint32_t path_len, uint32_t path_hash,
const char **field)
{
assert(path_len > 0);
uint32_t fieldno;
+ if (format->path_hash != NULL) {
+ /*
+ * The path hash for format->path_hash hashtable
+ * may be different from path_hash specified
+ * as function argument.
+ */
+ struct mh_strnptr_node_t *ht_record =
+ json_path_hash_get(format->path_hash, path, path_len,
+ mh_strn_hash(path, path_len));
+ if (ht_record != NULL) {
+ struct tuple_field *leaf = ht_record->val;
+ assert(leaf != NULL);
+ int32_t offset_slot = leaf->offset_slot;
+ assert(offset_slot != TUPLE_OFFSET_SLOT_NIL);
+ if (likely(field_map[offset_slot] != 0))
+ *field = tuple + field_map[offset_slot];
+ else
+ *field = NULL;
+ return 0;
+ }
+ }
/*
* It is possible, that a field has a name as
* well-formatted JSON. For example 'a.b.c.d' or '[1]' can
@@ -616,23 +1246,9 @@ tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
*field = NULL;
return 0;
}
- while ((rc = json_path_next(&parser, &node)) == 0) {
- switch(node.type) {
- case JSON_PATH_NUM:
- rc = tuple_field_go_to_index(field, node.num);
- break;
- case JSON_PATH_STR:
- rc = tuple_field_go_to_key(field, node.str, node.len);
- break;
- default:
- assert(node.type == JSON_PATH_END);
- return 0;
- }
- if (rc != 0) {
- *field = NULL;
- return 0;
- }
- }
+ rc = tuple_field_dig_with_parser(&parser, field);
+ if (rc == 0)
+ return 0;
error:
assert(rc > 0);
diag_set(ClientError, ER_ILLEGAL_PARAMS,
diff --git a/src/box/tuple_format.h b/src/box/tuple_format.h
index 9406d5b..afdb2aa 100644
--- a/src/box/tuple_format.h
+++ b/src/box/tuple_format.h
@@ -63,6 +63,8 @@ enum { TUPLE_OFFSET_SLOT_NIL = INT32_MAX };
struct tuple;
struct tuple_format;
+struct json_path_parser;
+struct mh_strnptr_t;
/** Engine-specific tuple format methods. */
struct tuple_format_vtab {
@@ -108,6 +110,21 @@ struct tuple_field {
bool is_key_part;
/** True, if a field can store NULL. */
bool is_nullable;
+ /** Tree child records. Must be at the end of the struct. */
+ union {
+ /** Array of fields. */
+ struct {
+ struct tuple_field **array;
+ uint32_t array_size;
+ };
+ /** Hashtable: path -> tuple_field. */
+ struct mh_strnptr_t *map;
+ /**
+ * Auxiliary pointer to test if field has
+ * JSON path subtree.
+ */
+ void *childs;
+ };
};
/**
@@ -167,6 +184,8 @@ struct tuple_format {
* Shared names storage used by all formats of a space.
*/
struct tuple_dictionary *dict;
+ /** JSON path hash table. */
+ struct mh_strnptr_t *path_hash;
/* Formats of the fields */
struct tuple_field fields[0];
};
@@ -394,7 +413,7 @@ tuple_field_raw(const struct tuple_format *format, const char *tuple,
* @retval NULL No field with @a name.
*/
static inline const char *
-tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_name(const struct tuple_format *format, const char *tuple,
const uint32_t *field_map, const char *name,
uint32_t name_len, uint32_t name_hash)
{
@@ -419,11 +438,51 @@ tuple_field_raw_by_name(struct tuple_format *format, const char *tuple,
* @retval -1 Error in JSON path.
*/
int
-tuple_field_raw_by_path(struct tuple_format *format, const char *tuple,
+tuple_field_raw_by_path(const struct tuple_format *format, const char *tuple,
const uint32_t *field_map, const char *path,
uint32_t path_len, uint32_t path_hash,
const char **field);
+/**
+ * Retrieve document data @field with initialized @parser.
+ * @param parser JSON parser.
+ * @param[in, out] field Tuple field to lookup.
+ * @retval 0 On success.
+ * @retval > 0 On error in the path used to initialize @parser.
+ */
+int
+tuple_field_dig_with_parser(struct json_path_parser *parser,
+ const char **field);
+
+/**
+ * Get @hashtable record by key @path, @path_len.
+ * @param hashtable Storage to lookup.
+ * @param path Path string.
+ * @param path_len Length of @path.
+ * @param path_hash Hash of @path.
+ * @retval NULL On nothing found.
+ * @retval not NULL Leaf field pointer for registered path.
+ */
+struct mh_strnptr_node_t *
+json_path_hash_get(struct mh_strnptr_t *hashtable, const char *path,
+ uint32_t path_len, uint32_t path_hash);
+
+/**
+ * Traverse the JSON path tree in @field, comparing it with the
+ * @tuple structure. Initialize field map if specified.
+ * @param field Field to use on initialization.
+ * @param idx Root field index to emit correct error.
+ * @param tuple Source raw data.
+ * @param offset Document field offset to process.
+ * @param field_map Field map to initialize (optional).
+ * @retval 0 On success.
+ * @retval -1 On error.
+ */
+int
+tuple_field_bypass_and_init(const struct tuple_field *field, uint32_t idx,
+ const char *tuple, const char **offset,
+ uint32_t *field_map);
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 01a0983..8ede290 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -222,7 +222,7 @@ key_hash_slowpath(const char *key, struct key_def *key_def);
void
tuple_hash_func_set(struct key_def *key_def) {
- if (key_def->is_nullable)
+ if (key_def->is_nullable || key_def->has_json_paths)
goto slowpath;
/*
* Check that key_def defines sequential a key without holes
@@ -256,10 +256,17 @@ tuple_hash_func_set(struct key_def *key_def) {
}
slowpath:
- if (key_def->has_optional_parts)
- key_def->tuple_hash = tuple_hash_slowpath<true, false>;
- else
- key_def->tuple_hash = tuple_hash_slowpath<false, false>;
+ if (key_def->has_optional_parts) {
+ if (key_def->has_json_paths)
+ key_def->tuple_hash = tuple_hash_slowpath<true, true>;
+ else
+ key_def->tuple_hash = tuple_hash_slowpath<true, false>;
+ } else {
+ if (key_def->has_json_paths)
+ key_def->tuple_hash = tuple_hash_slowpath<false, true>;
+ else
+ key_def->tuple_hash = tuple_hash_slowpath<false, false>;
+ }
key_def->key_hash = key_hash_slowpath;
}
@@ -323,6 +330,7 @@ template <bool has_optional_parts, bool has_json_paths>
uint32_t
tuple_hash_slowpath(const struct tuple *tuple, struct key_def *key_def)
{
+ assert(has_json_paths == key_def->has_json_paths);
assert(has_optional_parts == key_def->has_optional_parts);
uint32_t h = HASH_SEED;
uint32_t carry = 0;
diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index 86a33ec..2da9607 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -956,6 +956,11 @@ vinyl_index_def_change_requires_rebuild(struct index *index,
return true;
if (!field_type1_contains_type2(new_part->type, old_part->type))
return true;
+ if (old_part->path_len != new_part->path_len)
+ return true;
+ if (memcmp(old_part->path, new_part->path,
+ old_part->path_len) != 0)
+ return true;
}
return false;
}
diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index fc8ede5..f396705 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -711,7 +711,8 @@ vy_log_record_dup(struct region *pool, const struct vy_log_record *src)
"struct key_part_def");
goto err;
}
- key_def_dump_parts(src->key_def, dst->key_parts);
+ if (key_def_dump_parts(pool, src->key_def, dst->key_parts) != 0)
+ goto err;
dst->key_part_count = src->key_def->part_count;
dst->key_def = NULL;
}
diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c
index 8fa86d3..0abdd15 100644
--- a/src/box/vy_lsm.c
+++ b/src/box/vy_lsm.c
@@ -36,6 +36,7 @@
#include <sys/stat.h>
#include <sys/types.h>
#include <small/mempool.h>
+#include <assoc.h>
#include "diag.h"
#include "errcode.h"
@@ -158,6 +159,49 @@ vy_lsm_new(struct vy_lsm_env *lsm_env, struct vy_cache_env *cache_env,
NULL);
if (lsm->disk_format == NULL)
goto fail_format;
+ /*
+ * Tuple formats should be compatible to make
+ * epoch-based caching work.
+ */
+ int32_t min_offset_slot = 0;
+ struct tuple_field *dst_fields = lsm->disk_format->fields;
+ struct mh_strnptr_t *dst_ht = lsm->disk_format->path_hash;
+ struct mh_strnptr_t *src_ht = format->path_hash;
+ struct key_part *part = cmp_def->parts;
+ struct key_part *part_end = part + cmp_def->part_count;
+ for (; part < part_end; part++) {
+ struct tuple_field *dst_field =
+ &dst_fields[part->fieldno];
+ struct tuple_field *src_field;
+ if (dst_field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+ src_field = &format->fields[part->fieldno];
+ } else if (part->path != NULL) {
+ struct mh_strnptr_node_t *ht_record;
+ ht_record =
+ json_path_hash_get(dst_ht, part->path,
+ part->path_len,
+ part->path_hash);
+ assert(ht_record != NULL);
+ dst_field = ht_record->val;
+ assert(dst_field != NULL);
+ ht_record =
+ json_path_hash_get(src_ht, part->path,
+ part->path_len,
+ part->path_hash);
+ assert(ht_record != NULL);
+ src_field = ht_record->val;
+ assert(src_field != NULL);
+ } else {
+ continue;
+ }
+ if (src_field->offset_slot == TUPLE_OFFSET_SLOT_NIL)
+ continue;
+ dst_field->offset_slot = src_field->offset_slot;
+ min_offset_slot =
+ MIN(src_field->offset_slot, min_offset_slot);
+ }
+ lsm->disk_format->field_map_size =
+ -min_offset_slot * sizeof(uint32_t);
lsm->disk_format->epoch = format->epoch;
}
tuple_format_ref(lsm->disk_format);
diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c
index 7b704b8..9d5e220 100644
--- a/src/box/vy_point_lookup.c
+++ b/src/box/vy_point_lookup.c
@@ -196,8 +196,6 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx,
const struct vy_read_view **rv,
struct tuple *key, struct tuple **ret)
{
- assert(tuple_field_count(key) >= lsm->cmp_def->part_count);
-
*ret = NULL;
double start_time = ev_monotonic_now(loop());
int rc = 0;
diff --git a/src/box/vy_stmt.c b/src/box/vy_stmt.c
index 8018dee..8259a91 100644
--- a/src/box/vy_stmt.c
+++ b/src/box/vy_stmt.c
@@ -44,6 +44,7 @@
#include "tuple_format.h"
#include "xrow.h"
#include "fiber.h"
+#include "assoc.h"
/**
* Statement metadata keys.
@@ -330,6 +331,71 @@ vy_stmt_replace_from_upsert(const struct tuple *upsert)
return replace;
}
+static void
+vy_stmt_msgpack_build(struct tuple_field *field, char *tuple,
+ uint32_t *field_map, char **offset, bool write_data,
+ struct mh_i64ptr_t *fields_iov_ht)
+{
+ if (field->type == FIELD_TYPE_ARRAY) {
+ if (write_data)
+ *offset = mp_encode_array(*offset, field->array_size);
+ else
+ *offset += mp_sizeof_array(field->array_size);
+ for (uint32_t i = 0; i < field->array_size; i++) {
+ if (field->array[i] == NULL) {
+ if (write_data)
+ *offset = mp_encode_nil(*offset);
+ else
+ *offset += mp_sizeof_nil();
+ continue;
+ }
+ vy_stmt_msgpack_build(field->array[i], tuple, field_map,
+ offset, write_data,
+ fields_iov_ht);
+ }
+ return;
+ } else if (field->type == FIELD_TYPE_MAP) {
+ if (write_data)
+ *offset = mp_encode_map(*offset, mh_size(field->map));
+ else
+ *offset += mp_sizeof_map(mh_size(field->map));
+ mh_int_t i;
+ mh_foreach(field->map, i) {
+ struct mh_strnptr_node_t *node =
+ mh_strnptr_node(field->map, i);
+ assert(node);
+ if (write_data) {
+ *offset = mp_encode_str(*offset, node->str,
+ node->len);
+ } else {
+ *offset += mp_sizeof_str(node->len);
+ }
+ vy_stmt_msgpack_build(node->val, tuple, field_map,
+ offset, write_data,
+ fields_iov_ht);
+ }
+ return;
+ }
+
+ mh_int_t k = mh_i64ptr_find(fields_iov_ht, (uint64_t)field, NULL);
+ struct iovec *iov = k != mh_end(fields_iov_ht) ?
+ mh_i64ptr_node(fields_iov_ht, k)->val : NULL;
+ if (iov == NULL) {
+ if (write_data)
+ *offset = mp_encode_nil(*offset);
+ else
+ *offset += mp_sizeof_nil();
+ } else {
+ if (write_data) {
+ uint32_t data_offset = *offset - tuple;
+ memcpy(*offset, iov->iov_base, iov->iov_len);
+ if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
+ field_map[field->offset_slot] = data_offset;
+ }
+ *offset += iov->iov_len;
+ }
+}
+
static struct tuple *
vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
const struct key_def *cmp_def,
@@ -338,51 +404,80 @@ vy_stmt_new_surrogate_from_key(const char *key, enum iproto_type type,
/* UPSERT can't be surrogate. */
assert(type != IPROTO_UPSERT);
struct region *region = &fiber()->gc;
+ struct tuple *stmt = NULL;
uint32_t field_count = format->index_field_count;
- struct iovec *iov = region_alloc(region, sizeof(*iov) * field_count);
+ uint32_t part_count = mp_decode_array(&key);
+ assert(part_count == cmp_def->part_count);
+ struct iovec *iov = region_alloc(region, sizeof(*iov) * part_count);
if (iov == NULL) {
- diag_set(OutOfMemory, sizeof(*iov) * field_count,
- "region", "iov for surrogate key");
+ diag_set(OutOfMemory, sizeof(*iov) * part_count, "region",
+ "iov for surrogate key");
return NULL;
}
- memset(iov, 0, sizeof(*iov) * field_count);
- uint32_t part_count = mp_decode_array(&key);
- assert(part_count == cmp_def->part_count);
- assert(part_count <= field_count);
- uint32_t nulls_count = field_count - cmp_def->part_count;
- uint32_t bsize = mp_sizeof_array(field_count) +
- mp_sizeof_nil() * nulls_count;
- for (uint32_t i = 0; i < part_count; ++i) {
- const struct key_part *part = &cmp_def->parts[i];
+ struct mh_i64ptr_t *fields_iov_ht = mh_i64ptr_new();
+ if (fields_iov_ht == NULL) {
+ diag_set(OutOfMemory, sizeof(struct mh_i64ptr_t),
+ "mh_i64ptr_new", "fields_iov_ht");
+ return NULL;
+ }
+ if (mh_i64ptr_reserve(fields_iov_ht, part_count, NULL) != 0) {
+ diag_set(OutOfMemory, part_count, "mh_i64ptr_reserve",
+ "fields_iov_ht");
+ goto end;
+ }
+ uint32_t bsize = mp_sizeof_array(field_count);
+ uint32_t nulls_count = field_count;
+ memset(iov, 0, sizeof(*iov) * part_count);
+ const struct key_part *part = cmp_def->parts;
+ for (uint32_t i = 0; i < part_count; ++i, ++part) {
assert(part->fieldno < field_count);
const char *svp = key;
- iov[part->fieldno].iov_base = (char *) key;
+ iov[i].iov_base = (char *) key;
mp_next(&key);
- iov[part->fieldno].iov_len = key - svp;
- bsize += key - svp;
+ iov[i].iov_len = key - svp;
+ struct tuple_field *field;
+ if (part->path == NULL) {
+ field = &format->fields[part->fieldno];
+ --nulls_count;
+ } else {
+ struct mh_strnptr_node_t *ht_record =
+ json_path_hash_get(format->path_hash,
+ part->path, part->path_len,
+ part->path_hash);
+ assert(ht_record != NULL);
+ field = ht_record->val;
+ assert(field != NULL);
+ }
+ struct mh_i64ptr_node_t node = {(uint64_t)field, &iov[i]};
+ mh_int_t k = mh_i64ptr_put(fields_iov_ht, &node, NULL, NULL);
+ if (k == mh_end(fields_iov_ht))
+ goto end;
+ }
+ bsize += nulls_count * mp_sizeof_nil();
+ for (uint32_t i = 0; i < field_count; ++i) {
+ char *data = NULL;
+ vy_stmt_msgpack_build(&format->fields[i], NULL, NULL, &data,
+ false, fields_iov_ht);
+ bsize += data - (char *)NULL;
}
- struct tuple *stmt = vy_stmt_alloc(format, bsize);
+ stmt = vy_stmt_alloc(format, bsize);
if (stmt == NULL)
- return NULL;
+ goto end;
char *raw = (char *) tuple_data(stmt);
uint32_t *field_map = (uint32_t *) raw;
char *wpos = mp_encode_array(raw, field_count);
for (uint32_t i = 0; i < field_count; ++i) {
- const struct tuple_field *field = &format->fields[i];
- if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL)
- field_map[field->offset_slot] = wpos - raw;
- if (iov[i].iov_base == NULL) {
- wpos = mp_encode_nil(wpos);
- } else {
- memcpy(wpos, iov[i].iov_base, iov[i].iov_len);
- wpos += iov[i].iov_len;
- }
+ vy_stmt_msgpack_build(&format->fields[i], raw, field_map, &wpos,
+ true, fields_iov_ht);
}
- assert(wpos == raw + bsize);
+ assert(wpos <= raw + bsize);
vy_stmt_set_type(stmt, type);
+
+end:
+ mh_i64ptr_delete(fields_iov_ht);
return stmt;
}
diff --git a/test/box/misc.result b/test/box/misc.result
index 6237675..6ea97e1 100644
--- a/test/box/misc.result
+++ b/test/box/misc.result
@@ -350,7 +350,7 @@ t;
- 'box.error.CANT_CREATE_COLLATION : 150'
- 'box.error.USER_EXISTS : 46'
- 'box.error.WAL_IO : 40'
- - 'box.error.PROC_RET : 21'
+ - 'box.error.RTREE_RECT : 101'
- 'box.error.PRIV_GRANTED : 89'
- 'box.error.CREATE_SPACE : 9'
- 'box.error.GRANT : 88'
@@ -361,7 +361,7 @@ t;
- 'box.error.VINYL_MAX_TUPLE_SIZE : 139'
- 'box.error.LOAD_FUNCTION : 99'
- 'box.error.INVALID_XLOG : 74'
- - 'box.error.READ_VIEW_ABORTED : 130'
+ - 'box.error.PRIV_NOT_GRANTED : 91'
- 'box.error.TRANSACTION_CONFLICT : 97'
- 'box.error.GUEST_USER_PASSWORD : 96'
- 'box.error.PROC_C : 102'
@@ -371,8 +371,8 @@ t;
- 'box.error.DROP_FUNCTION : 71'
- 'box.error.CFG : 59'
- 'box.error.NO_SUCH_FIELD : 37'
- - 'box.error.CONNECTION_TO_SELF : 117'
- - 'box.error.FUNCTION_MAX : 54'
+ - 'box.error.MORE_THAN_ONE_TUPLE : 41'
+ - 'box.error.PROC_LUA : 32'
- 'box.error.ILLEGAL_PARAMS : 1'
- 'box.error.PARTIAL_KEY : 136'
- 'box.error.SAVEPOINT_NO_TRANSACTION : 114'
@@ -400,34 +400,35 @@ t;
- 'box.error.UPDATE_ARG_TYPE : 26'
- 'box.error.CROSS_ENGINE_TRANSACTION : 81'
- 'box.error.FORMAT_MISMATCH_INDEX_PART : 27'
- - 'box.error.FUNCTION_TX_ACTIVE : 30'
- 'box.error.injection : table: <address>
- - 'box.error.ITERATOR_TYPE : 72'
+ - 'box.error.FUNCTION_TX_ACTIVE : 30'
+ - 'box.error.IDENTIFIER : 70'
+ - 'box.error.TRANSACTION_YIELD : 154'
- 'box.error.NO_SUCH_ENGINE : 57'
- 'box.error.COMMIT_IN_SUB_STMT : 122'
- - 'box.error.TRANSACTION_YIELD : 154'
- - 'box.error.UNSUPPORTED : 5'
- - 'box.error.LAST_DROP : 15'
+ - 'box.error.RELOAD_CFG : 58'
- 'box.error.SPACE_FIELD_IS_DUPLICATE : 149'
+ - 'box.error.LAST_DROP : 15'
+ - 'box.error.SEQUENCE_OVERFLOW : 147'
- 'box.error.DECOMPRESSION : 124'
- 'box.error.CREATE_SEQUENCE : 142'
- 'box.error.CREATE_USER : 43'
- - 'box.error.SEQUENCE_OVERFLOW : 147'
+ - 'box.error.FUNCTION_MAX : 54'
- 'box.error.INSTANCE_UUID_MISMATCH : 66'
- - 'box.error.RELOAD_CFG : 58'
+ - 'box.error.TUPLE_FORMAT_LIMIT : 16'
- 'box.error.SYSTEM : 115'
- 'box.error.KEY_PART_IS_TOO_LONG : 118'
- - 'box.error.MORE_THAN_ONE_TUPLE : 41'
- 'box.error.TRUNCATE_SYSTEM_SPACE : 137'
- - 'box.error.NO_SUCH_SAVEPOINT : 61'
- 'box.error.VY_QUOTA_TIMEOUT : 135'
- - 'box.error.PRIV_NOT_GRANTED : 91'
+ - 'box.error.NO_SUCH_SAVEPOINT : 61'
+ - 'box.error.PROTOCOL : 104'
+ - 'box.error.READ_VIEW_ABORTED : 130'
- 'box.error.WRONG_INDEX_OPTIONS : 108'
- 'box.error.INVALID_VYLOG_FILE : 133'
- 'box.error.INDEX_FIELD_COUNT_LIMIT : 127'
- - 'box.error.BEFORE_REPLACE_RET : 53'
+ - 'box.error.DATA_STRUCTURE_MISMATCH : 55'
- 'box.error.USER_MAX : 56'
- - 'box.error.INVALID_MSGPACK : 20'
+ - 'box.error.BEFORE_REPLACE_RET : 53'
- 'box.error.TUPLE_NOT_ARRAY : 22'
- 'box.error.KEY_PART_COUNT : 31'
- 'box.error.ALTER_SPACE : 12'
@@ -436,47 +437,47 @@ t;
- 'box.error.DROP_SEQUENCE : 144'
- 'box.error.INVALID_XLOG_ORDER : 76'
- 'box.error.UNKNOWN_REQUEST_TYPE : 48'
- - 'box.error.PROC_LUA : 32'
+ - 'box.error.PROC_RET : 21'
- 'box.error.SUB_STMT_MAX : 121'
- 'box.error.ROLE_NOT_GRANTED : 92'
- 'box.error.SPACE_EXISTS : 10'
- - 'box.error.UPDATE_INTEGER_OVERFLOW : 95'
+ - 'box.error.UNSUPPORTED : 5'
- 'box.error.MIN_FIELD_COUNT : 39'
- 'box.error.NO_SUCH_SPACE : 36'
- 'box.error.WRONG_INDEX_PARTS : 107'
- 'box.error.REPLICASET_UUID_MISMATCH : 63'
- 'box.error.UPDATE_FIELD : 29'
- 'box.error.INDEX_EXISTS : 85'
- - 'box.error.SPLICE : 25'
+ - 'box.error.DROP_SPACE : 11'
- 'box.error.COMPRESSION : 119'
- 'box.error.INVALID_ORDER : 68'
- - 'box.error.UNKNOWN : 0'
+ - 'box.error.SPLICE : 25'
- 'box.error.NO_SUCH_GROUP : 155'
- - 'box.error.TUPLE_FORMAT_LIMIT : 16'
+ - 'box.error.INVALID_MSGPACK : 20'
- 'box.error.DROP_PRIMARY_KEY : 17'
- 'box.error.NULLABLE_PRIMARY : 152'
- 'box.error.NO_SUCH_SEQUENCE : 145'
- 'box.error.INJECTION : 8'
- 'box.error.INVALID_UUID : 64'
- - 'box.error.IDENTIFIER : 70'
+ - 'box.error.NO_SUCH_ROLE : 82'
- 'box.error.TIMEOUT : 78'
+ - 'box.error.ITERATOR_TYPE : 72'
- 'box.error.REPLICA_MAX : 73'
- - 'box.error.NO_SUCH_ROLE : 82'
- - 'box.error.DROP_SPACE : 11'
+ - 'box.error.UNKNOWN : 0'
- 'box.error.MISSING_REQUEST_FIELD : 69'
- 'box.error.MISSING_SNAPSHOT : 93'
- 'box.error.WRONG_SPACE_OPTIONS : 111'
- 'box.error.READONLY : 7'
- - 'box.error.RTREE_RECT : 101'
+ - 'box.error.UPDATE_INTEGER_OVERFLOW : 95'
- 'box.error.UPSERT_UNIQUE_SECONDARY_KEY : 105'
- 'box.error.NO_CONNECTION : 77'
- 'box.error.UNSUPPORTED_PRIV : 98'
- 'box.error.WRONG_SCHEMA_VERSION : 109'
- 'box.error.ROLLBACK_IN_SUB_STMT : 123'
- - 'box.error.PROTOCOL : 104'
- - 'box.error.INVALID_XLOG_TYPE : 125'
- - 'box.error.INDEX_PART_TYPE_MISMATCH : 24'
- 'box.error.UNSUPPORTED_INDEX_FEATURE : 112'
+ - 'box.error.CONNECTION_TO_SELF : 117'
+ - 'box.error.INDEX_PART_TYPE_MISMATCH : 24'
+ - 'box.error.INVALID_XLOG_TYPE : 125'
...
test_run:cmd("setopt delimiter ''");
---
diff --git a/test/engine/tuple.result b/test/engine/tuple.result
index 35c700e..b74bb23 100644
--- a/test/engine/tuple.result
+++ b/test/engine/tuple.result
@@ -954,6 +954,393 @@ type(tuple:tomap().fourth)
s:drop()
---
...
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO["fname"]'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': same key
+ part is indexed twice'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 2): ''path'' must be string'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'field.FIO.fname'}}})
+---
+- error: 'Wrong index options (field 2): invalid JSON path: first part should be defined
+ as array index'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = '[3].FIO'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+ ''map'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[3][1]'}}})
+---
+- error: 'Can''t create or modify index ''test1'' in space ''withdata'': field type
+ ''array'' is not supported'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: Field 3 has type 'map' in one index, but type 'string' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3][1].sname'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+---
+- error: Field 3 has type 'map' in one index, but type 'array' in another
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fname'}}})
+---
+- error: 'Wrong index options (field 2): invalid JSON path: first part refers to invalid
+ field'
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}})
+---
+- error: 'Wrong index options (field 3): invalid JSON path ''[3].FIO....fname'': path
+ has invalid structure (error at position 9)'
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+...
+assert(idx ~= nil)
+---
+- true
+...
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+ ''666'': type mismatch: have unsigned, expected map'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+ ''666'': type mismatch: have unsigned, expected string'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+ ''{"fname": "James"}'': map doesn''t contain key ''sname'' defined in index'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+idx:select()
+---
+- - [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+ - [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+idx:min()
+---
+- [7, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:max()
+---
+- [7, 7, {'town': 'Moscow', 'FIO': {'fname': 'Max', 'data': 'extra', 'sname': 'Isaev'}},
+ 4, 5]
+...
+s:drop()
+---
+...
+s = box.schema.create_space('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][2]'}
+---
+...
+pk = s:create_index('pk', {parts = parts})
+---
+...
+s:insert{{1, 2}, 3}
+---
+- [[1, 2], 3]
+...
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+---
+...
+s:get(2)
+---
+- [[1, 2], 8]
+...
+s:drop()
+---
+...
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk = s:create_index('primary', { type = 'tree' })
+---
+...
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+---
+- [1, 7, {'town': 'London', 'FIO': 1234}, 4, 5]
+...
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [2, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+---
+- [4, 7, {'town': 'London', 'FIO': [1, 2, 3]}, 4, 5]
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+ ''1234'': type mismatch: have unsigned, expected map'
+...
+_ = s:delete(1)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: Duplicate key exists in unique index 'test1' in space 'withdata'
+...
+_ = s:delete(2)
+---
+...
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+---
+- error: 'Tuple doesn''t math document structure: invalid field 3 document content
+ ''[1, 2, 3]'': type mismatch: have array, expected map'
+...
+_ = s:delete(4)
+---
+...
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '[3]["FIO"]["sname"]'}, {3, 'str', path = '[3]["FIO"]["extra"]', is_nullable = true}}})
+---
+...
+assert(idx ~= nil)
+---
+- true
+...
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '[3]["FIO"]["fname"]'}}})
+---
+- error: 'Wrong index options (field 3): JSON path ''[3]["FIO"]["fname"]'' has been
+ already constructed for ''string'' leaf record'
+...
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+---
+...
+assert(idx2 ~= nil)
+---
+- true
+...
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+---
+...
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+---
+- Agent
+...
+idx:select()
+---
+- - [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+ - [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:min()
+---
+- [5, 7, {'town': 'Matrix', 'FIO': {'fname': 'Agent', 'sname': 'Smith'}}, 4, 5]
+...
+idx:max()
+---
+- [3, 7, {'town': 'London', 'FIO': {'fname': 'James', 'sname': 'Bond'}}, 4, 5]
+...
+idx:drop()
+---
+...
+s:drop()
+---
+...
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+parts = {}
+---
+...
+parts[1] = {1, 'str', path='[1][3][2].a'}
+---
+...
+parts[2] = {1, 'unsigned', path = '[1][3][1]'}
+---
+...
+parts[3] = {2, 'str', path = '[2][2].d[1]'}
+---
+...
+pk = s:create_index('primary', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+---
+- [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6,
+ [1, 2, 3]]
+...
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+---
+- error: Duplicate key exists in unique index 'primary' in space 'withdata'
+...
+parts = {}
+---
+...
+parts[1] = {4, 'unsigned', path='[4][1]', is_nullable = false}
+---
+...
+parts[2] = {4, 'unsigned', path='[4][2]', is_nullable = true}
+---
+...
+parts[3] = {4, 'unsigned', path='[4][4]', is_nullable = true}
+---
+...
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+---
+...
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+---
+- error: 'Tuple doesn''t math document structure: invalid field 4 document content
+ ''[]'': array size 0 is less than size of 1 defined in index'
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][3][2].b' }
+---
+...
+parts[2] = {3, 'unsigned'}
+---
+...
+crosspart_idx = s:create_index('crosspart', { parts = parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+parts = {}
+---
+...
+parts[1] = {1, 'unsigned', path='[1][3][2].b'}
+---
+...
+num_idx = s:create_index('numeric', {parts = parts})
+---
+...
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:get(2)
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+num_idx:select()
+---
+- - [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+ 9, 2, 3]]
+ - [[1, 2, [3, {1: 3, 'a': 'str', 'b': 5}]], ['c', {'d': ['e', 'f'], 'e': 'g'}],
+ 6, [1, 2, 3]]
+ - [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [
+ 0]]
+...
+num_idx:max()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+num_idx:min()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+assert(crosspart_idx:max() == num_idx:max())
+---
+- true
+...
+assert(crosspart_idx:min() == num_idx:min())
+---
+- true
+...
+trap_idx:max()
+---
+- [[1, 2, [3, {'a': 'str2', 'b': 2}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [9,
+ 2, 3]]
+...
+trap_idx:min()
+---
+- [[1, 2, [3, {'a': 'str3', 'b': 9}]], ['c', {'d': ['e', 'f'], 'e': 'g'}], 6, [0]]
+...
+s:drop()
+---
+...
+s = box.schema.space.create('withdata', {engine = engine})
+---
+...
+pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned', path = '[1]'}}})
+---
+...
+assert(pk_simplified.path == box.NULL)
+---
+- true
+...
+idx = s:create_index('idx', {parts = {{2, 'integer', path = '[2].a'}}})
+---
+...
+s:insert{31, {a = 1, aa = -1}}
+---
+- [31, {'a': 1, 'aa': -1}]
+...
+s:insert{22, {a = 2, aa = -2}}
+---
+- [22, {'a': 2, 'aa': -2}]
+...
+s:insert{13, {a = 3, aa = -3}}
+---
+- [13, {'a': 3, 'aa': -3}]
+...
+idx:select()
+---
+- - [31, {'a': 1, 'aa': -1}]
+ - [22, {'a': 2, 'aa': -2}]
+ - [13, {'a': 3, 'aa': -3}]
+...
+idx:alter({parts = {{2, 'integer', path = '[2].aa'}}})
+---
+...
+idx:select()
+---
+- - [13, {'a': 3, 'aa': -3}]
+ - [22, {'a': 2, 'aa': -2}]
+ - [31, {'a': 1, 'aa': -1}]
+...
+s:drop()
+---
+...
engine = nil
---
...
diff --git a/test/engine/tuple.test.lua b/test/engine/tuple.test.lua
index edc3dab..d563c66 100644
--- a/test/engine/tuple.test.lua
+++ b/test/engine/tuple.test.lua
@@ -312,5 +312,114 @@ tuple:tomap().fourth
type(tuple:tomap().fourth)
s:drop()
+--
+-- gh-1012: Indexes for JSON-defined paths.
+--
+s = box.schema.space.create('withdata', {engine = engine})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO["fname"]'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 666}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = 'field.FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'map', path = '[3].FIO'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'array', path = '[3][1]'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3][1].sname'}, {3, 'str', path = '[3]["FIO"].fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[2].FIO.fname'}}})
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3].FIO....fname'}}})
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+assert(idx ~= nil)
+s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 666, sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = "James"}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{7, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond', data = "extra"}}, 4, 5}
+s:insert{7, 7, {town = 'Moscow', FIO = {fname = 'Max', sname = 'Isaev', data = "extra"}}, 4, 5}
+idx:select()
+idx:min()
+idx:max()
+s:drop()
+
+s = box.schema.create_space('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][2]'}
+pk = s:create_index('pk', {parts = parts})
+s:insert{{1, 2}, 3}
+s:upsert({{box.null, 2}}, {{'+', 2, 5}})
+s:get(2)
+s:drop()
+
+-- Create index on space with data
+s = box.schema.space.create('withdata', {engine = engine})
+pk = s:create_index('primary', { type = 'tree' })
+s:insert{1, 7, {town = 'London', FIO = 1234}, 4, 5}
+s:insert{2, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{3, 7, {town = 'London', FIO = {fname = 'James', sname = 'Bond'}}, 4, 5}
+s:insert{4, 7, {town = 'London', FIO = {1,2,3}}, 4, 5}
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(1)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(2)
+s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}, {3, 'str', path = '[3]["FIO"]["sname"]'}}})
+_ = s:delete(4)
+idx = s:create_index('test1', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]', is_nullable = true}, {3, 'str', path = '[3]["FIO"]["sname"]'}, {3, 'str', path = '[3]["FIO"]["extra"]', is_nullable = true}}})
+assert(idx ~= nil)
+s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = '[3]["FIO"]["fname"]'}}})
+idx2 = s:create_index('test2', {parts = {{2, 'number'}, {3, 'str', path = '[3]["FIO"]["fname"]'}}})
+assert(idx2 ~= nil)
+t = s:insert{5, 7, {town = 'Matrix', FIO = {fname = 'Agent', sname = 'Smith'}}, 4, 5}
+-- Test field_map in tuple speed-up access by indexed path.
+t["[3][\"FIO\"][\"fname\"]"]
+idx:select()
+idx:min()
+idx:max()
+idx:drop()
+s:drop()
+
+-- Test complex JSON indexes
+s = box.schema.space.create('withdata', {engine = engine})
+parts = {}
+parts[1] = {1, 'str', path='[1][3][2].a'}
+parts[2] = {1, 'unsigned', path = '[1][3][1]'}
+parts[3] = {2, 'str', path = '[2][2].d[1]'}
+pk = s:create_index('primary', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {1, 2, 3}}
+s:insert{{1, 2, {3, {a = 'str', b = 1}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6}
+parts = {}
+parts[1] = {4, 'unsigned', path='[4][1]', is_nullable = false}
+parts[2] = {4, 'unsigned', path='[4][2]', is_nullable = true}
+parts[3] = {4, 'unsigned', path='[4][4]', is_nullable = true}
+trap_idx = s:create_index('trap', { type = 'tree', parts = parts})
+s:insert{{1, 2, {3, {3, a = 'str2', b = 5}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][3][2].b' }
+parts[2] = {3, 'unsigned'}
+crosspart_idx = s:create_index('crosspart', { parts = parts})
+s:insert{{1, 2, {3, {a = 'str2', b = 2}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {9, 2, 3}}
+parts = {}
+parts[1] = {1, 'unsigned', path='[1][3][2].b'}
+num_idx = s:create_index('numeric', {parts = parts})
+s:insert{{1, 2, {3, {a = 'str3', b = 9}}}, {'c', {d = {'e', 'f'}, e = 'g'}}, 6, {0}}
+num_idx:get(2)
+num_idx:select()
+num_idx:max()
+num_idx:min()
+assert(crosspart_idx:max() == num_idx:max())
+assert(crosspart_idx:min() == num_idx:min())
+trap_idx:max()
+trap_idx:min()
+s:drop()
+
+s = box.schema.space.create('withdata', {engine = engine})
+pk_simplified = s:create_index('primary', { type = 'tree', parts = {{1, 'unsigned', path = '[1]'}}})
+assert(pk_simplified.path == box.NULL)
+idx = s:create_index('idx', {parts = {{2, 'integer', path = '[2].a'}}})
+s:insert{31, {a = 1, aa = -1}}
+s:insert{22, {a = 2, aa = -2}}
+s:insert{13, {a = 3, aa = -3}}
+idx:select()
+idx:alter({parts = {{2, 'integer', path = '[2].aa'}}})
+idx:select()
+s:drop()
+
engine = nil
test_run = nil
diff --git a/test/vinyl/info.result b/test/vinyl/info.result
index 95e8cc6..134924f 100644
--- a/test/vinyl/info.result
+++ b/test/vinyl/info.result
@@ -1157,7 +1157,7 @@ st2 = i2:stat()
...
s:bsize()
---
-- 107449
+- 107499
...
i1:len(), i2:len()
---
--
2.7.4