From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Date: Thu, 1 Nov 2018 16:32:22 +0300
From: Konstantin Osipov
Subject: Re: [tarantool-patches] [PATCH v5 11/12] box: introduce offset slot cache in key_part
Message-ID: <20181101133222.GB30032@chai>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
To: tarantool-patches@freelists.org
Cc: vdavydov.dev@gmail.com, Kirill Shcherbatov
List-ID:

* Kirill Shcherbatov [18/10/29 20:25]:
> Same key_part could be used in different formats multiple
> times

I don't understand this comment. Could you please rephrase?
key_part is a part of key_def. How can it be used in a format at
all?

>, so different field->offset_slot would be allocated.
> In most scenarios we work with series of tuples of same
> format, and (in general) format lookup for field would be
> expensive operation for JSON-paths defined in key_part.

I don't understand this statement either. Could you give an
example?

> New offset_slot_cache field in key_part structure and epoch-based
> mechanism to validate it's actuality should be effective
> approach to improve performance.

Did you consider storing it elsewhere, e.g. in some kind of index
search context?

> -    alter->new_space = space_new_xc(alter->space_def, &alter->key_list);
> +    alter->new_space =
> +        space_new_xc(alter->space_def, &alter->key_list,
> +                     alter->old_space->format != NULL ?
> +                     alter->old_space->format->epoch + 1 : 1);

Can't we make it simpler and simply increase epoch id every time
we create a new space? This is only an optimization, by leaking it
into alter.cc you are making alter worry about stuff which should
not be its concern.

> +    struct space *space = engine_create_space(engine, def, key_list, epoch);

Passing epoch id around explicitly is ugly.
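
To make the suggestion concrete, here is a minimal sketch of what
I have in mind. It is not the real Tarantool code: the structs are
cut-down stand-ins, every sketch_* name and the cached_epoch field
are hypothetical, and the staleness check is a plain equality test
rather than the "newest epoch wins" strategy from the patch. The
point is only that the counter lives next to the format code, so
neither alter.cc nor engine_create_space() has to see an epoch.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for the real struct tuple_format (assumption). */
struct tuple_format {
	/* Unique id assigned on creation; 0 is reserved for "no format". */
	uint64_t epoch;
	/* Offset slot the format assigned to the indexed field. */
	int32_t offset_slot;
};

/* Cut-down stand-in for the real struct key_part (assumption). */
struct key_part {
	/* Epoch of the format the cache was filled from; 0 means empty. */
	uint64_t cached_epoch;
	/* Cached offset slot, valid only while cached_epoch matches. */
	int32_t offset_slot_cache;
};

/* Private counter: bumped on every format creation, never passed around. */
static uint64_t sketch_epoch_counter = 0;

/* Counts slow lookups so the effect of the cache can be observed. */
static int sketch_slow_lookups = 0;

/* Hypothetical constructor: the epoch is assigned here and nowhere else. */
void
sketch_format_create(struct tuple_format *format, int32_t offset_slot)
{
	format->epoch = ++sketch_epoch_counter;
	format->offset_slot = offset_slot;
}

/* Stand-in for the expensive JSON path lookup in the format. */
static int32_t
sketch_format_lookup_slot(const struct tuple_format *format)
{
	sketch_slow_lookups++;
	return format->offset_slot;
}

/* Return the offset slot, refreshing the cache if the epoch differs. */
int32_t
sketch_key_part_offset_slot(struct key_part *part,
			    const struct tuple_format *format)
{
	assert(format->epoch != 0);
	if (part->cached_epoch != format->epoch) {
		part->offset_slot_cache = sketch_format_lookup_slot(format);
		part->cached_epoch = format->epoch;
	}
	return part->offset_slot_cache;
}
```

With something along these lines, a caller zero-initializes the
key_part and calls sketch_key_part_offset_slot() for every tuple;
the slow lookup runs once per format change, not once per tuple.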

> -        key_def_set_part(new_def, pos++, part->fieldno, part->type,
> +        key_def_set_part(new_def, pos, part->fieldno, part->type,
>                           part->nullable_action, part->coll,
>                           part->coll_id, part->sort_order, part->path,
>                           part->path_len);
> +        new_def->parts[pos].offset_slot_cache = part->offset_slot_cache;
> +        new_def->parts[pos].format_cache = part->format_cache;
> +        pos++;

Why can't you do it in key_def_set_part?

> -        key_def_set_part(new_def, pos++, part->fieldno, part->type,
> +        key_def_set_part(new_def, pos, part->fieldno, part->type,
>                           part->nullable_action, part->coll,
>                           part->coll_id, part->sort_order, part->path,
>                           part->path_len);
> +        new_def->parts[pos].offset_slot_cache = part->offset_slot_cache;
> +        new_def->parts[pos].format_cache = part->format_cache;
> +        pos++;

Lack of code reuse, abstraction leak.

> +++ b/src/box/key_def.h
> @@ -101,6 +101,14 @@ struct key_part {
>      char *path;
>      /** The length of JSON path. */
>      uint32_t path_len;
> +    /**
> +     * Source format for offset_slot_cache actuality
> +     * validations. Cache is expected to use "the format with

The source format to check that offset_slot_epoch is not stale.
Please avoid using the word "actuality".

> +     * the newest epoch is most relevant" strategy.
> +     */
> +    struct tuple_format *format_cache;
> +    /** Cache with format's field offset slot. */
> +    int32_t offset_slot_cache;
>  };
>

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov