From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Kirill Shcherbatov <kshcherbatov@tarantool.org>, Tarantool MailList <tarantool-patches@freelists.org> Subject: [tarantool-patches] Re: [PATCH v1 1/1] sql: improve vdbe_field_ref fetcher Date: Wed, 14 Aug 2019 22:18:02 +0200 [thread overview] Message-ID: <4151ee6c-fb07-0801-e6ad-d62cffcdb197@tarantool.org> (raw) In-Reply-To: <de91ee71-4b4d-1525-da9f-e8e671a8fece@tarantool.org> Hi! Thanks for the fixes! But you missed some places. Consider my fixes on top of the branch and at the bottom of the email. Also see 2 comments. >> 13. Why do you consider fieldno >= 64 separately? If it is >= 64, >> then ((1LLU << fieldno) - 1) is all ones, a full mask. It won't affect >> slot_bitmask, and you will have the same result without this >> branching. > Because type overflow with << operator has undefined behavior. > This doesn't work in release mode (with optimization flags). Ok, I missed that, thanks for the check. > =============================================== > > Vdbe field ref is a dynamic index over tuple fields storing > offsets to each field and filling the offset array on demand. > It is highly used in SQL, because it strongly relies on fast and > repetitive access to any field, not only indexed. > > There is an optimisation for the case when a requested field > fieldno is indexed, and the tuple itself stores offset to the > field in its own small field map, used by indexes. vdbe_field_ref > then uses that map to retrieve offset value without decoding > anything. But when SQL requests any field > fieldno, then > vdbe_field_ref decodes the tuple from the beginning in the worst > case. Even having previously accessed fieldno. But it could > start decoding from the latter. > > An updated vdbe_field_ref fetcher class uses a bitmask of > initialized slots to use pre-calculated offsets when possible. > This speed-ups SQL in some corner case and doesn't damage > performance in general scenarios. > > Closes #4267 > --- > src/box/column_mask.h | 30 ++++++++++++ > src/box/sql.h | 9 ++-- > src/box/sql.c | 3 +- > src/box/sql/vdbe.c | 102 +++++++++++++++++++++-------------------- > test/sql/misc.result | 46 +++++++++++++++++++ > test/sql/misc.test.lua | 17 +++++++ > 6 files changed, 153 insertions(+), 54 deletions(-) > > diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c > index e096e1f65..cfb487115 100644 > --- a/src/box/sql/vdbe.c > +++ b/src/box/sql/vdbe.c > @@ -606,27 +606,60 @@ vdbe_field_ref_fetch_field(struct vdbe_field_ref *field_ref, uint32_t fieldno) > } > > /** > - * Try to get a current tuple field using its field map. > + * Find the closest field for given fieldno in field_ref's > + * slot_bitmask. The fieldno is expected to be greater than 0. > + * @param field_ref The vdbe_field_ref instance to use. > + * @param fieldno Number of a field to find the nearest left > + * neighbor of. > + * @retval >0 An index of the closest smaller fieldno > + * initialized in slot_bitmask. > + */ > +static inline uint32_t > +vdbe_field_ref_closest_slotno(struct vdbe_field_ref *field_ref, > + uint32_t fieldno) > +{ > + uint64_t slot_bitmask = field_ref->slot_bitmask; > + assert(slot_bitmask != 0 && fieldno > 0); > + uint64_t le_mask = fieldno < 64 ? > + slot_bitmask & ((1LLU << fieldno) - 1) : slot_bitmask; > + assert(bit_clz_u64(le_mask) < 64); > + return 64 - bit_clz_u64(le_mask) - 1; > +} > + > +/** > + * Get a tuple's field using field_ref's slot_bitmask, and tuple's > + * field_map when possible. Required field must be present in > + * tuple. > * @param field_ref The vdbe_field_ref instance to use. > * @param fieldno Number of a field to get. > - * @param[out] field_size Result field size. > * @retval not NULL MessagePack field. > - * @retval NULL Can not use field_map - it does not contain > - * offset to @a fieldno. > */ > -static const void * > -vdbe_field_ref_fast_fetch(struct vdbe_field_ref *field_ref, uint32_t fieldno, > - uint32_t *field_size) > +static const char * > +vdbe_field_ref_fetch_data(struct vdbe_field_ref *field_ref, uint32_t fieldno) > { > - const struct tuple_field *tf = > - vdbe_field_ref_fetch_field(field_ref, fieldno); > - if (tf == NULL || tf->offset_slot == TUPLE_OFFSET_SLOT_NIL) > - return NULL; > - const char *field = tuple_field(field_ref->tuple, fieldno); > - const char *end = field; > - mp_next(&end); > - *field_size = end - field; > - return field; > + if (bitmask64_bit_is_set(field_ref->slot_bitmask, fieldno)) > + return field_ref->data + field_ref->slots[fieldno]; > + > + const char *field_begin; > + const struct tuple_field *field = vdbe_field_ref_fetch_field(field_ref, > + fieldno); > + if (field != NULL && field->offset_slot != TUPLE_OFFSET_SLOT_NIL) { > + field_begin = tuple_field(field_ref->tuple, fieldno); > + } else { > + uint32_t prev = vdbe_field_ref_closest_slotno(field_ref, > + fieldno); > + field_begin = field_ref->data + field_ref->slots[prev]; > + for (prev++; prev < fieldno; prev++) { > + mp_next(&field_begin); 1. Now there is a new problem. If a field is > 64, but there are filled slots >= 64 and < the requested field, we still decode all the fields from 64. Even if the field was accessed earlier and has a slot. In the 'if' above: if (bitmask64_bit_is_set(field_ref->slot_bitmask, fieldno)) return field_ref->data + field_ref->slots[fieldno]; We miss fields, which actually has slot, but >= 64. We could solve that by 1) Filling slot array with an invalid value. For example, 0. 2) In the 'if' above we check, that if a slot (not bit, but the slot) is not 0, then it was filled earlier and can be used. It will work for all fields, even >= 64. 3) Otherwise for < 64 we check slot bitmask of the left neighbour. For >= 64 we get index of the biggest bit in the mask and scan all left neighbour slots of the requested field until find non-empty, or stumble into the max index of the mask. What do you think? > + field_ref->slots[prev] = > + (uint32_t)(field_begin - field_ref->data); > + bitmask64_set_bit(&field_ref->slot_bitmask, prev); > + } > + mp_next(&field_begin); > + } > + field_ref->slots[fieldno] = (uint32_t)(field_begin - field_ref->data); > + bitmask64_set_bit(&field_ref->slot_bitmask, fieldno); > + return field_begin; > } > > static inline enum field_type > @@ -652,45 +685,13 @@ static int > diff --git a/test/sql/misc.result b/test/sql/misc.result > index 2591d9d75..f2357c122 100644 > --- a/test/sql/misc.result > +++ b/test/sql/misc.result > @@ -133,3 +133,49 @@ box.execute('select * from "s"') > s:drop() > --- > ... > +-- > +-- gh-4267: Full power of vdbe_field_ref 2. It would be great to have here a short description of what are we testing here. > +-- > +format = {} > +--- > +... > +t = {} > +--- > +... > +test_run:cmd("setopt delimiter ';'") > +--- > +- true > +... > +for i = 1, 70 do > + format[i] = {name = 'FIELD'..i, type = 'unsigned'} > + t[i] = i > +end > +test_run:cmd("setopt delimiter ''"); > +--- > +... > +s = box.schema.create_space('TEST', {format = format}) > +--- > +... > +pk = s:create_index('pk', {parts = {70}}) > +--- > +... > +s:insert(t) > +--- > +- [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, > + 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, > + 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, > + 63, 64, 65, 66, 67, 68, 69, 70] > +... > +box.execute('SELECT field70, field64 FROM test') > +--- > +- metadata: > + - name: FIELD70 > + type: unsigned > + - name: FIELD64 > + type: unsigned > + rows: > + - [70, 64] > +... > +box.space.TEST:drop() > +--- > +... ================================================================================ diff --git a/src/box/column_mask.h b/src/box/column_mask.h index 0518fa6c3..5fb0e1fe0 100644 --- a/src/box/column_mask.h +++ b/src/box/column_mask.h @@ -138,15 +138,15 @@ bitmask64_set_bit(uint64_t *bitmask, uint32_t bitno) } /** - * Test a bit in the bitmask corresponding to a bitno.. + * Test a bit in the bitmask corresponding to @a bitno. * @param bitmask Bitmask to test. - * @param bitno bit number to test (must be 0-based). + * @param bitno Bit number to test (must be 0-based). * @return Whether given bit is set in a given bitmask. * When bitno is greater than size of bitmask returns * false. */ static inline bool -bitmask64_bit_is_set(uint64_t bitmask, uint32_t bitno) +bitmask64_is_bit_set(uint64_t bitmask, uint32_t bitno) { if (bitno >= 64) return false; diff --git a/src/box/sql.h b/src/box/sql.h index 08fe5c7af..7644051b4 100644 --- a/src/box/sql.h +++ b/src/box/sql.h @@ -375,8 +375,8 @@ struct vdbe_field_ref { * Bitmask of initialized slots. The fieldno == 0 slot * must be initialized in vdbe_field_ref constructor. * This bitmap allows to lookup for the nearest - * initialized slot for a given fieldno to perform a few - * extra tuple decoding, as possible. + * initialized slot for a given fieldno to perform as few + * extra tuple decoding as possible. */ uint64_t slot_bitmask; /** diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c index cfb487115..327dc699a 100644 --- a/src/box/sql/vdbe.c +++ b/src/box/sql/vdbe.c @@ -606,7 +606,7 @@ vdbe_field_ref_fetch_field(struct vdbe_field_ref *field_ref, uint32_t fieldno) } /** - * Find the closest field for given fieldno in field_ref's + * Find the left closest field for a given fieldno in field_ref's * slot_bitmask. The fieldno is expected to be greater than 0. * @param field_ref The vdbe_field_ref instance to use. * @param fieldno Number of a field to find the nearest left @@ -620,8 +620,8 @@ vdbe_field_ref_closest_slotno(struct vdbe_field_ref *field_ref, { uint64_t slot_bitmask = field_ref->slot_bitmask; assert(slot_bitmask != 0 && fieldno > 0); - uint64_t le_mask = fieldno < 64 ? - slot_bitmask & ((1LLU << fieldno) - 1) : slot_bitmask; + uint64_t le_mask = fieldno < 64 ? slot_bitmask & ((1LLU << fieldno) - 1) + : slot_bitmask; assert(bit_clz_u64(le_mask) < 64); return 64 - bit_clz_u64(le_mask) - 1; } @@ -637,7 +637,7 @@ vdbe_field_ref_closest_slotno(struct vdbe_field_ref *field_ref, static const char * vdbe_field_ref_fetch_data(struct vdbe_field_ref *field_ref, uint32_t fieldno) { - if (bitmask64_bit_is_set(field_ref->slot_bitmask, fieldno)) + if (bitmask64_is_bit_set(field_ref->slot_bitmask, fieldno)) return field_ref->data + field_ref->slots[fieldno]; const char *field_begin;
next prev parent reply other threads:[~2019-08-14 20:15 UTC|newest] Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-07-22 7:43 [tarantool-patches] " Kirill Shcherbatov 2019-07-22 8:45 ` [tarantool-patches] " Kirill Shcherbatov 2019-08-08 21:13 ` Vladislav Shpilevoy 2019-08-12 15:14 ` Kirill Shcherbatov 2019-08-13 21:23 ` Vladislav Shpilevoy 2019-08-14 10:24 ` Kirill Shcherbatov 2019-08-14 20:18 ` Vladislav Shpilevoy [this message] 2019-08-15 8:39 ` Kirill Shcherbatov 2019-08-15 21:16 ` Vladislav Shpilevoy 2019-08-16 7:49 ` Kirill Shcherbatov 2019-08-19 20:00 ` Vladislav Shpilevoy 2019-08-19 20:27 ` Kirill Shcherbatov 2019-08-19 20:50 ` Vladislav Shpilevoy 2019-08-20 9:07 ` n.pettik 2019-08-20 18:46 ` Vladislav Shpilevoy 2019-08-22 12:08 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=4151ee6c-fb07-0801-e6ad-d62cffcdb197@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=kshcherbatov@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH v1 1/1] sql: improve vdbe_field_ref fetcher' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox