From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 780E0469710 for ; Wed, 25 Nov 2020 01:33:44 +0300 (MSK) References: <20201114172823.8217-1-sergepetrenko@tarantool.org> <20201114172823.8217-2-sergepetrenko@tarantool.org> <635cf529-93e6-09e3-73d4-bba8ad04e793@tarantool.org> <2b49855f-3337-0828-465b-38fbd7ef0912@tarantool.org> From: Vladislav Shpilevoy Message-ID: <9eba270a-c0fb-3238-bc2e-2e39fe73eb91@tarantool.org> Date: Tue, 24 Nov 2020 23:33:40 +0100 MIME-Version: 1.0 In-Reply-To: <2b49855f-3337-0828-465b-38fbd7ef0912@tarantool.org> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [Tarantool-patches] [PATCH 1/2] box: speed up tuple_field_map_create List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Serge Petrenko , korablev@tarantool.org Cc: tarantool-patches@dev.tarantool.org Hi! Thanks for the fixes! >> Did you think of trying to flatten the first level of tuple_format.fields >> tree into an array, like it was before JSON indexes? So we would have an >> array of fields, and each one has a json-subtree in it. Which is just >> unused in 99.999% of cases. It could restore even more perf maybe. > > > Your suggestion is good, but looks like it won't work. As I understand, > the first level of JSON schema (or how does one call this?) may be a map, > rather than an array. So, even top-level fields must be organised in a tree, > rather than an array. First level of tuple_format is always an array. Because tuple root type is MP_ARRAY. The first level names are emulated using tuple_dictionary. But in the code we usually just trim the first JSON path part, convert it to a field number using the dictionary, and then work with an array in the root. This is why in many functions we have fieldno + path, not just path. > I was thinking of making fields a union. So that it'd hold an array for > simple cases and a json tree for complex ones. I also thought it should work > even better, but didn't start implementation. Sounds complex. But maybe it would look not as complex as it sounds. However if the idea with json tree root as array works, we probably don't need it. > What's interesting, the JSON tree root has all its children organised in an > array anyway. It's just hidden behind one level of abstraction. So, in > `json_tree_lookup_entry` one actually does `return root->children[token.num]`. Yeah, this is because the tree does not have hash tables in each node. Instead, all nodes store their children in an array. Regardless of their key. Even string keys are stored in the children array. To lookup paths the tree uses a single hash json_tree.hash. Here by a parent node pointer and a child name you can find the next json_token. So even if children looks like an array, it is not entirely correct to use as an array. Although it will work for the root tuple level which is always an array I guess. I don't know if we put root field names to this json tree. If we don't, then the root level safely can be used as an array. > So, I'm not sure whether we should introduce a plain array for top-level > fields. This'll require a bigger refactoring just to check whether the idea > is good or not. We already have it in the tree. So maybe nothing to do here, and I was wrong. We should not bring it up to tuple_format. > I have a suggestion: let's access the root's array directly. I didn't > implement it initially since I thought it'd harm JSON tree encapsulation. Should be fine, as long as root tuple type is MP_ARRAY, I hope. > However, I tried it out, and it actually looks good. > Consider this diff (not applied yet, just an example): > > ``` > diff --git a/src/box/tuple_format.c b/src/box/tuple_format.c > index d8656aa26..6c9b2a255 100644 > --- a/src/box/tuple_format.c > +++ b/src/box/tuple_format.c > @@ -888,17 +888,10 @@ tuple_field_map_create_plain(struct tuple_format *format, const char *tuple, >                        required_fields_sz); >         } > > -       struct json_token token = { > -               .type = JSON_TOKEN_NUM, > -               .num = 0, > -       }; >         struct tuple_field *field; > - > -       for (uint32_t i = 0; i < defined_field_count; i++, mp_next(&pos)) { > -               token.num = i; > -               field = json_tree_lookup_entry(&format->fields, > - &format->fields.root, &token, > -                                              struct tuple_field, token); > +       struct json_token **token = format->fields.root.children; > +       for (uint32_t i = 0; i < defined_field_count; i++, token++, mp_next(&pos)) { > +               field = json_tree_entry(*token, struct tuple_field, token); >                 if (validate) { >                         bool nullable = tuple_field_is_nullable(field); > if(!field_mp_type_is_compatible(field->type, pos, > ``` > > > I actually profiled my original patch and the version with this diff applied: > > > This diff reduces time taken by tuple_field_map_create from 2.12 seconds > to 1.69 seconds. 1.10 still has the best time: 1.31 seconds. > Just for comparison, current 2.x takes 6.1 seconds for the same amount of > tuple_field_map_create calls. Hm. That looks too much. json_tree_lookup_entry() is sure slower than a direct access, but why so much slower? It calls json_tree_lookup, which checks token type, which is num, and then it does the same as you in the diff above. So it is just +2 ifs, not counting the 'unlikely' multikey check. Does a couple of ifs slow down the code so heavily? Maybe the inlining was too aggressive, and we washed out the instruction cache? Don't know how to check it properly. Just out of curiosity, what if we move json_tree_lookup to json.c, and field_map_builder_set_slot to field_map.c? And maybe field_mp_type_is_compatible to field_def.c. If it won't change anything, then probably just 2 ifs really matter that much. > I'm starting to like this variant more. Anyway, I'm not applying this diff > until your ack. Looks fine. Just don't understand why so big difference for such a simple change. Then we probably also could gain some perf by splitting these functions into 2 versions with validate true and validate false. Because we always know it at compile time. Would be cool to use templates for this, but not sure if we can simply change tuple.c to tuple.cc only for that. We could also template tuple_field_raw_by_path, because tuple_field_raw always passes NULL path and NULL offset_slot_hint. >>> +    mp_next(&pos); >>> + >>> +    for (uint32_t i = 1; i < defined_field_count; i++, mp_next(&pos)) { >>> +        token.num = i; >>> +        field = json_tree_lookup_entry(&format->fields, >>> +                           &format->fields.root, &token, >>> +                           struct  tuple_field, token); >>> +        if (validate) { >>> +            check_field_type(field, pos); >>> +            bit_clear(required_fields, field->id); >>> +        } >>> +        if (field->offset_slot != TUPLE_OFFSET_SLOT_NIL && >>> +            field_map_builder_set_slot(builder, field->offset_slot, >>> +                           pos - tuple, MULTIKEY_NONE, >>> +                           0, NULL) != 0) { >>> +            return -1; >>> +        } >>> +    } >>> +