From: Kirill Shcherbatov
To: Vladislav Shpilevoy, Tarantool MailList
Subject: [tarantool-patches] Re: [PATCH v1 1/1] sql: improve vdbe_field_ref fetcher
Date: Thu, 15 Aug 2019 11:39:56 +0300
Message-ID: <2683f8ca-9fe0-a0e6-7433-9bfa4745f4f4@tarantool.org>
In-Reply-To: <4151ee6c-fb07-0801-e6ad-d62cffcdb197@tarantool.org>
References: <978c68684414e4df06fa7b37697c3f348d96cf5b.1563781318.git.kshcherbatov@tarantool.org>
 <967599c4-0e10-8cdc-2860-dc02105cd341@tarantool.org>
 <9a02e44d-cdf8-c2e6-16b9-326aef714021@tarantool.org>
 <4151ee6c-fb07-0801-e6ad-d62cffcdb197@tarantool.org>

> Hi! Thanks for the fixes!

Hi! Thank you for your contribution. I've squashed your changes.

> 1. Now there is a new problem. If a field is > 64, but there are
> filled slots >= 64 and < the requested field, we still decode all
> the fields starting from 64, even if the field was accessed earlier
> and has a slot.
>
> In the 'if' above:
>
>     if (bitmask64_bit_is_set(field_ref->slot_bitmask, fieldno))
>         return field_ref->data + field_ref->slots[fieldno];
>
> we miss fields which actually have a slot but are >= 64.
>
> We could solve that by:
> 1) Filling the slot array with an invalid value, for example 0.
> 2) In the 'if' above, checking that if a slot (not the bit, but the
>    slot itself) is not 0, then it was filled earlier and can be
>    used. That works for all fields, even >= 64.
> 3) Otherwise, for fields < 64, checking the slot bitmask for the
>    left neighbour. For fields >= 64, taking the index of the highest
>    bit in the mask and scanning all slots to the left of the
>    requested field until we find a non-empty one or reach the
>    highest index covered by the mask.
>
> What do you think?

Sounds good. I've implemented this approach.

>> +--
>> +-- gh-4267: Full power of vdbe_field_ref
>
> 2. It would be great to have here a short description of what
> we are testing.

Done.
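To make the scheme from (1) concrete, below is a rough standalone model
of the lookup (a sketch, not the patch code: field_size() is a made-up
stand-in for mp_next() decoding, slot_set()/fetch_offset() are
illustrative names, and __builtin_clzll() stands in for bit_clz_u64()):

/*
 * Standalone model of the slot lookup: slots[] caches field
 * offsets, 0 means "not calculated yet", and slot_bitmask
 * covers fields 0..63 only.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { FIELD_COUNT = 70 };

static uint32_t slots[FIELD_COUNT];
static uint64_t slot_bitmask;

/* Stand-in for mp_next(): pretend field fieldno takes fieldno + 1 bytes. */
static uint32_t
field_size(uint32_t fieldno)
{
	return fieldno + 1;
}

static void
slot_set(uint32_t fieldno, uint32_t offset)
{
	slots[fieldno] = offset;
	if (fieldno < 64)
		slot_bitmask |= (uint64_t)1 << fieldno;
}

/* Point 3: the nearest initialized slot to the left of fieldno. */
static uint32_t
closest_slotno(uint32_t fieldno)
{
	assert(fieldno > 0);
	uint64_t le_mask = fieldno < 64 ?
		slot_bitmask & (((uint64_t)1 << fieldno) - 1) : slot_bitmask;
	assert(le_mask != 0);
	uint32_t prev = 63 - __builtin_clzll(le_mask);
	/* Slots >= 64 are not in the mask: scan them directly (point 2). */
	for (uint32_t i = fieldno - 1; i > 63 && i > prev; i--) {
		if (slots[i] != 0)
			return i;
	}
	return prev;
}

static uint32_t
fetch_offset(uint32_t fieldno)
{
	/* Points 1 + 2: a non-zero slot was filled earlier, reuse it. */
	if (fieldno == 0 || slots[fieldno] != 0)
		return slots[fieldno];
	uint32_t prev = closest_slotno(fieldno);
	uint32_t offset = slots[prev];
	/* Decode only the gap, caching every field passed on the way. */
	while (prev < fieldno) {
		offset += field_size(prev);
		slot_set(++prev, offset);
	}
	return offset;
}

int
main(void)
{
	slot_set(0, 1); /* Like the constructor: skip the array header. */
	printf("field 66 at offset %u\n", fetch_offset(66));
	printf("field 68 at offset %u\n", fetch_offset(68));
	return 0;
}

Here the second fetch decodes only fields 66 and 67: slot 66 is outside
the bitmask, but it is non-zero, so the backward scan in
closest_slotno() finds it instead of restarting from field 0.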
===============================================

From 87e383a01abc6bf63e88dee22a18d1a08aac414e Mon Sep 17 00:00:00 2001
Message-Id: <87e383a01abc6bf63e88dee22a18d1a08aac414e.1565857939.git.kshcherbatov@tarantool.org>
From: Kirill Shcherbatov
Date: Mon, 12 Aug 2019 16:14:55 +0300
Subject: [PATCH v2 1/1] sql: improve vdbe_field_ref fetcher

Vdbe field ref is a dynamic index over tuple fields: it stores
offsets to the fields and fills the offset array on demand. It is
heavily used in SQL, because SQL strongly relies on fast and
repetitive access to any field, not only indexed ones.

There is an optimisation for the case when the requested field
fieldno is indexed and the tuple itself stores the offset to the
field in its own small field map used by indexes: vdbe_field_ref
uses that map to retrieve the offset without decoding anything. But
when SQL requests any field > fieldno, vdbe_field_ref decodes the
tuple from the beginning in the worst case, even when fieldno was
accessed earlier and decoding could start from there.

The updated vdbe_field_ref fetcher class uses a bitmask of
initialized slots to reuse pre-calculated offsets when possible.
This speeds up SQL in some corner cases and doesn't hurt performance
in general scenarios.

Closes #4267
---
 src/box/column_mask.h  |  30 +++++++++++
 src/box/sql.h          |   9 ++--
 src/box/sql.c          |   5 +-
 src/box/sql/vdbe.c     | 116 +++++++++++++++++++++++------------
 test/sql/misc.result   |  53 +++++++++++++++++++
 test/sql/misc.test.lua |  22 ++++++++
 6 files changed, 181 insertions(+), 54 deletions(-)

diff --git a/src/box/column_mask.h b/src/box/column_mask.h
index 14da841f1..5fb0e1fe0 100644
--- a/src/box/column_mask.h
+++ b/src/box/column_mask.h
@@ -123,4 +123,34 @@ column_mask_fieldno_is_set(uint64_t column_mask, uint32_t fieldno)
 	return (column_mask & mask) != 0;
 }
 
+/**
+ * Set the bit with index @a bitno in the given @a bitmask.
+ * Do nothing when @a bitno is greater than the bitmask size.
+ * @param bitmask Bitmask to update.
+ * @param bitno Bit number to set (0-based).
+ */
+static inline void
+bitmask64_set_bit(uint64_t *bitmask, uint32_t bitno)
+{
+	if (bitno >= 64)
+		return;
+	*bitmask |= ((uint64_t)1 << bitno);
+}
+
+/**
+ * Test the bit corresponding to @a bitno in the given bitmask.
+ * @param bitmask Bitmask to test.
+ * @param bitno Bit number to test (0-based).
+ * @return Whether the given bit is set in the given bitmask;
+ *         false when @a bitno is greater than the bitmask
+ *         size.
+ */
+static inline bool
+bitmask64_is_bit_set(uint64_t bitmask, uint32_t bitno)
+{
+	if (bitno >= 64)
+		return false;
+	return (bitmask & ((uint64_t)1 << bitno)) != 0;
+}
+
 #endif
diff --git a/src/box/sql.h b/src/box/sql.h
index 9ccecf28c..7644051b4 100644
--- a/src/box/sql.h
+++ b/src/box/sql.h
@@ -372,10 +372,13 @@ struct vdbe_field_ref {
 	/** Count of fields in tuple. */
 	uint32_t field_count;
 	/**
-	 * Index of the rightmost initialized slot in slots
-	 * array.
+	 * Bitmask of initialized slots. The fieldno == 0 slot
+	 * must be initialized in the vdbe_field_ref constructor.
+	 * This bitmask allows looking up the nearest initialized
+	 * slot for a given fieldno, so as to perform as little
+	 * extra tuple decoding as possible.
 	 */
-	uint32_t rightmost_slot;
+	uint64_t slot_bitmask;
 	/**
 	 * Array of offsets of tuple fields.
 	 * Only values <= rightmost_slot are valid.
diff --git a/src/box/sql.c b/src/box/sql.c
index 0ab3a506f..f1df55571 100644
--- a/src/box/sql.c
+++ b/src/box/sql.c
@@ -1245,7 +1245,10 @@ vdbe_field_ref_create(struct vdbe_field_ref *field_ref, struct tuple *tuple,
 	const char *field0 = data;
 	field_ref->field_count = mp_decode_array((const char **) &field0);
 	field_ref->slots[0] = (uint32_t)(field0 - data);
-	field_ref->rightmost_slot = 0;
+	memset(&field_ref->slots[1], 0,
+	       field_ref->field_count * sizeof(field_ref->slots[0]));
+	field_ref->slot_bitmask = 0;
+	bitmask64_set_bit(&field_ref->slot_bitmask, 0);
 }
 
 void
diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index e096e1f65..21b585364 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -606,27 +606,74 @@ vdbe_field_ref_fetch_field(struct vdbe_field_ref *field_ref, uint32_t fieldno)
 }
 
 /**
- * Try to get a current tuple field using its field map.
+ * Find the closest initialized slot to the left of the one for
+ * a given fieldno in field_ref's slot_bitmask. The fieldno is
+ * expected to be greater than 0.
+ * @param field_ref The vdbe_field_ref instance to use.
+ * @param fieldno Number of a field to find the nearest left
+ *        neighbor of.
+ * @retval >=0 Index of the closest smaller initialized fieldno.
+ */
+static inline uint32_t
+vdbe_field_ref_closest_slotno(struct vdbe_field_ref *field_ref,
+			      uint32_t fieldno)
+{
+	uint64_t slot_bitmask = field_ref->slot_bitmask;
+	assert(slot_bitmask != 0 && fieldno > 0);
+	uint64_t le_mask = fieldno < 64 ? slot_bitmask & ((1LLU << fieldno) - 1)
+			   : slot_bitmask;
+	assert(bit_clz_u64(le_mask) < 64);
+	return 64 - bit_clz_u64(le_mask) - 1;
+}
+
+/**
+ * Get a tuple's field using field_ref's slot_bitmask and the
+ * tuple's field_map when possible. The requested field must be
+ * present in the tuple.
  * @param field_ref The vdbe_field_ref instance to use.
  * @param fieldno Number of a field to get.
- * @param[out] field_size Result field size.
  * @retval not NULL MessagePack field.
- * @retval NULL Can not use field_map - it does not contain
- *         offset to @a fieldno.
  */
-static const void *
-vdbe_field_ref_fast_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
-			  uint32_t *field_size)
+static const char *
+vdbe_field_ref_fetch_data(struct vdbe_field_ref *field_ref, uint32_t fieldno)
 {
-	const struct tuple_field *tf =
-		vdbe_field_ref_fetch_field(field_ref, fieldno);
-	if (tf == NULL || tf->offset_slot == TUPLE_OFFSET_SLOT_NIL)
-		return NULL;
-	const char *field = tuple_field(field_ref->tuple, fieldno);
-	const char *end = field;
-	mp_next(&end);
-	*field_size = end - field;
-	return field;
+	if (field_ref->slots[fieldno] != 0)
+		return field_ref->data + field_ref->slots[fieldno];
+
+	const char *field_begin;
+	const struct tuple_field *field = vdbe_field_ref_fetch_field(field_ref,
+								     fieldno);
+	if (field != NULL && field->offset_slot != TUPLE_OFFSET_SLOT_NIL) {
+		field_begin = tuple_field(field_ref->tuple, fieldno);
+	} else {
+		uint32_t prev = vdbe_field_ref_closest_slotno(field_ref,
+							      fieldno);
+		if (fieldno >= 64) {
+			/*
+			 * There could be initialized slots
+			 * that didn't fit in the bitmask.
+			 * Try to find the biggest initialized
+			 * slot.
+			 */
+			for (uint32_t it = fieldno - 1; it > prev; it--) {
+				if (field_ref->slots[it] == 0)
+					continue;
+				prev = it;
+				break;
+			}
+		}
+		field_begin = field_ref->data + field_ref->slots[prev];
+		for (prev++; prev < fieldno; prev++) {
+			mp_next(&field_begin);
+			field_ref->slots[prev] =
+				(uint32_t)(field_begin - field_ref->data);
+			bitmask64_set_bit(&field_ref->slot_bitmask, prev);
+		}
+		mp_next(&field_begin);
+	}
+	field_ref->slots[fieldno] = (uint32_t)(field_begin - field_ref->data);
+	bitmask64_set_bit(&field_ref->slot_bitmask, fieldno);
+	return field_begin;
 }
 
 static inline enum field_type
@@ -652,45 +699,13 @@ static int
 vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 		     struct Mem *dest_mem)
 {
-	uint32_t *slots = field_ref->slots;
 	if (fieldno >= field_ref->field_count) {
 		UPDATE_MAX_BLOBSIZE(dest_mem);
 		return 0;
 	}
-
-	const char *data;
-	if (field_ref->rightmost_slot <= fieldno) {
-		uint32_t field_sz;
-		data = vdbe_field_ref_fast_fetch(field_ref, fieldno,
-						 &field_sz);
-		if (data != NULL) {
-			/*
-			 * Special case for tarantool spaces: for
-			 * indexed fields a tuple field map can be
-			 * used. Else there is no sense in
-			 * tuple_field usage, because it makes
-			 * foreach field { mp_next(); } to find
-			 * a field. In such a case sql is
-			 * better - it saves offsets to all fields
-			 * visited in mp_next() cycle.
-			 */
-			uint32_t offset = (uint32_t)(data - field_ref->data);
-			slots[fieldno] = offset;
-			slots[fieldno + 1] = offset + field_sz;
-		} else {
-			uint32_t i = field_ref->rightmost_slot;
-			data = field_ref->data + slots[i];
-			do {
-				mp_next(&data);
-				slots[++i] = (uint32_t)(data - field_ref->data);
-			} while (i <= fieldno);
-			field_ref->rightmost_slot = i;
-		}
-	}
-
 	assert(sqlVdbeCheckMemInvariants(dest_mem) != 0);
+	const char *data = vdbe_field_ref_fetch_data(field_ref, fieldno);
 	uint32_t dummy;
-	data = field_ref->data + slots[fieldno];
 	if (vdbe_decode_msgpack_into_mem(data, dest_mem, &dummy) != 0)
 		return -1;
@@ -699,8 +714,9 @@ vdbe_field_ref_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno,
 	 * Wrap it in a blob verbatim.
 	 */
 	if (dest_mem->flags == 0) {
-		dest_mem->n = slots[fieldno + 1] - slots[fieldno];
 		dest_mem->z = (char *) data;
+		dest_mem->n = vdbe_field_ref_fetch_data(field_ref,
+							fieldno + 1) - data;
 		dest_mem->flags = MEM_Blob | MEM_Ephem | MEM_Subtype;
 		dest_mem->subtype = SQL_SUBTYPE_MSGPACK;
 	}
diff --git a/test/sql/misc.result b/test/sql/misc.result
index 2591d9d75..7d183d3a0 100644
--- a/test/sql/misc.result
+++ b/test/sql/misc.result
@@ -133,3 +133,56 @@ box.execute('select * from "s"')
 s:drop()
 ---
 ...
+--
+-- gh-4267: Full power of vdbe_field_ref.
+-- The test consists of sequential reads of multiple fields:
+-- the first one is indexed, while the others are not.
+-- In the scope of this issue an optimization that makes the
+-- second and third lookups faster was introduced. Test that
+-- it works correctly.
+--
+format = {}
+---
+...
+t = {}
+---
+...
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+for i = 1, 70 do
+    format[i] = {name = 'FIELD'..i, type = 'unsigned'}
+    t[i] = i
+end
+test_run:cmd("setopt delimiter ''");
+---
+...
+s = box.schema.create_space('TEST', {format = format})
+---
+...
+pk = s:create_index('pk', {parts = {66}})
+---
+...
+s:insert(t)
+---
+- [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
+  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
+  43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
+  63, 64, 65, 66, 67, 68, 69, 70]
+...
+box.execute('SELECT field66, field68, field70 FROM test')
+---
+- metadata:
+  - name: FIELD66
+    type: unsigned
+  - name: FIELD68
+    type: unsigned
+  - name: FIELD70
+    type: unsigned
+  rows:
+  - [66, 68, 70]
+...
+box.space.TEST:drop()
+---
+...
diff --git a/test/sql/misc.test.lua b/test/sql/misc.test.lua
index fdc19f3ac..afbb2d7f8 100644
--- a/test/sql/misc.test.lua
+++ b/test/sql/misc.test.lua
@@ -35,3 +35,25 @@ s = box.schema.space.create('s',{format=format, temporary=true})
 i = s:create_index('i')
 box.execute('select * from "s"')
 s:drop()
+
+--
+-- gh-4267: Full power of vdbe_field_ref.
+-- The test consists of sequential reads of multiple fields:
+-- the first one is indexed, while the others are not.
+-- In the scope of this issue an optimization that makes the
+-- second and third lookups faster was introduced. Test that
+-- it works correctly.
+--
+format = {}
+t = {}
+test_run:cmd("setopt delimiter ';'")
+for i = 1, 70 do
+    format[i] = {name = 'FIELD'..i, type = 'unsigned'}
+    t[i] = i
+end
+test_run:cmd("setopt delimiter ''");
+s = box.schema.create_space('TEST', {format = format})
+pk = s:create_index('pk', {parts = {66}})
+s:insert(t)
+box.execute('SELECT field66, field68, field70 FROM test')
+box.space.TEST:drop()
-- 
2.22.1