[tarantool-patches] [PATCH v5 11/12] box: introduce offset slot cache in key_part

Konstantin Osipov kostja at tarantool.org
Thu Nov 1 16:32:22 MSK 2018


* Kirill Shcherbatov <kshcherbatov at tarantool.org> [18/10/29 20:25]:
> Same key_part could be used in different formats multiple
> times

I don't understand this comment. Could you please rephrase?

key_part is a part of key_def. How can it be used in a format at
all?

>, so different field->offset_slot would be allocated.
> In most scenarios we work with series of tuples of same
> format, and (in general) format lookup for field would be
> expensive operation for JSON-paths defined in key_part.

I don't understand this statement either. Could you give an
example?

> New offset_slot_cache field in key_part structure and epoch-based
> mechanism to validate it's actuality should be effective
> approach to improve performance.

Did you consider storing it elsewhere, e.g. in some kind of
index search context?

> -	alter->new_space = space_new_xc(alter->space_def, &alter->key_list);
> +	alter->new_space =
> +		space_new_xc(alter->space_def, &alter->key_list,
> +			     alter->old_space->format != NULL ?
> +			     alter->old_space->format->epoch + 1 : 1);

Can't we make it simpler and just increase the epoch id every
time we create a new space? This is only an optimization; by
leaking it into alter.cc you are making alter worry about
stuff which should not be its concern.
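
A rough sketch of that idea, assuming the counter can live next to
the format constructor. The struct is a reduced stand-in, and
format_epoch_max and tuple_format_new_sketch() are hypothetical
names, not existing code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for struct tuple_format (hypothetical). */
struct tuple_format {
	/* Creation stamp, strictly increasing across formats. */
	uint64_t epoch;
};

/* File-local counter: callers never see or pass the epoch. */
static uint64_t format_epoch_max = 0;

/* Hypothetical constructor: every new format gets a fresh epoch. */
struct tuple_format *
tuple_format_new_sketch(void)
{
	struct tuple_format *format = calloc(1, sizeof(*format));
	if (format == NULL)
		return NULL;
	assert(format_epoch_max < UINT64_MAX);
	format->epoch = ++format_epoch_max;
	return format;
}
```

With something like this, neither alter.cc nor the engine API has to
know that the epoch exists.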

> +	struct space *space = engine_create_space(engine, def, key_list, epoch);

Passing epoch id around explicitly is ugly.

> -		key_def_set_part(new_def, pos++, part->fieldno, part->type,
> +		key_def_set_part(new_def, pos, part->fieldno, part->type,
>  				 part->nullable_action, part->coll,
>  				 part->coll_id, part->sort_order, part->path,
>  				 part->path_len);
> +		new_def->parts[pos].offset_slot_cache = part->offset_slot_cache;
> +		new_def->parts[pos].format_cache = part->format_cache;
> +		pos++;

Why can't you do it in key_def_set_part?
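
Something along these lines, as a sketch only. The structs are
reduced stand-ins, and the extra parameters of
key_def_set_part_sketch() are an assumption about how the setter
could look, not its current signature:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tuple_format;

/* Reduced stand-ins for struct key_part and struct key_def. */
struct key_part {
	uint32_t fieldno;
	struct tuple_format *format_cache;
	int32_t offset_slot_cache;
};

struct key_def {
	uint32_t part_count;
	struct key_part parts[8];
};

/*
 * Hypothetical setter: the cache fields are set in the same place
 * as the rest of the part, so callers do not touch parts[] directly.
 */
void
key_def_set_part_sketch(struct key_def *def, uint32_t part_no,
			uint32_t fieldno, struct tuple_format *format_cache,
			int32_t offset_slot_cache)
{
	assert(part_no < def->part_count);
	struct key_part *part = &def->parts[part_no];
	part->fieldno = fieldno;
	part->format_cache = format_cache;
	part->offset_slot_cache = offset_slot_cache;
}
```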

> -		key_def_set_part(new_def, pos++, part->fieldno, part->type,
> +		key_def_set_part(new_def, pos, part->fieldno, part->type,
>  				 part->nullable_action, part->coll,
>  				 part->coll_id, part->sort_order, part->path,
>  				 part->path_len);
> +		new_def->parts[pos].offset_slot_cache = part->offset_slot_cache;
> +		new_def->parts[pos].format_cache = part->format_cache;
> +		pos++;

Lack of code reuse, abstraction leak.

> +++ b/src/box/key_def.h
> @@ -101,6 +101,14 @@ struct key_part {
>  	char *path;
>  	/** The length of JSON path. */
>  	uint32_t path_len;
> +	/**
> +	 * Source format for offset_slot_cache actuality

> +	 * validations. Cache is expected to use "the format with

The source format to check that offset_slot_cache is not stale.

Please avoid using the word "actuality".

> +	 * the newest epoch is most relevant" strategy.
> +	 */
> +	struct tuple_format *format_cache;
> +	/** Cache with format's field offset slot. */
> +	int32_t offset_slot_cache;
>  };
>  
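
For reference, this is how I read the intended "newest epoch wins"
strategy. It is a sketch under that assumption: the structs are
reduced stand-ins, and format_lookup_slot_slow(), slow_lookup_count
and key_part_offset_slot_sketch() are hypothetical helpers standing
in for the real JSON-path lookup:

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in; 'slot' fakes the result of the real lookup. */
struct tuple_format {
	uint64_t epoch;
	int32_t slot;
};

struct key_part {
	struct tuple_format *format_cache;
	int32_t offset_slot_cache;
};

/* Counts slow lookups so the effect of the cache is observable. */
static int slow_lookup_count = 0;

/* Hypothetical stand-in for the expensive format lookup. */
static int32_t
format_lookup_slot_slow(const struct tuple_format *format)
{
	slow_lookup_count++;
	return format->slot;
}

int32_t
key_part_offset_slot_sketch(struct key_part *part,
			    struct tuple_format *format)
{
	/* Fast path: the cache was filled for exactly this format. */
	if (part->format_cache == format)
		return part->offset_slot_cache;
	int32_t slot = format_lookup_slot_slow(format);
	/* Only a format with a newer epoch may replace the cache. */
	if (part->format_cache == NULL ||
	    format->epoch > part->format_cache->epoch) {
		part->format_cache = format;
		part->offset_slot_cache = slot;
	}
	return slot;
}
```

A tuple of an older format still gets a correct slot through the slow
path, but it does not evict the entry cached for the newer format.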

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov