[patches] [PATCH 4/4] vinyl: on space alter update key definitions in indexes

v.shpilevoy at tarantool.org v.shpilevoy at tarantool.org
Tue Mar 13 17:44:27 MSK 2018



> 13 марта 2018 г., в 16:44, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> 
> On Mon, Mar 12, 2018 at 08:01:21PM +0300, v.shpilevoy at tarantool.org wrote:
>> 
>> 
>>> 12 марта 2018 г., в 19:47, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>> 
>>> On Mon, Mar 12, 2018 at 07:16:31PM +0300, v.shpilevoy at tarantool.org wrote:
>>>> 
>>>> 
>>>>> 12 марта 2018 г., в 19:10, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>>>> 
>>>>> On Mon, Mar 12, 2018 at 05:41:12PM +0300, v.shpilevoy at tarantool.org wrote:
>>>>>> 
>>>>>> 
>>>>>>> 12 марта 2018 г., в 16:03, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>>>>>> 
>>>>>>> On Mon, Mar 12, 2018 at 12:53:36AM +0300, Vladislav Shpilevoy wrote:
>>>>>>>> Vinyl alter can change comparators, for example, by removal of a
>>>>>>>> space format and making some fields optional. This alter is
>>>>>>>> allowed on a non-empty space even, because optionality of a
>>>>>>>> field is not persisted. But index key definitions can not be
>>>>>>>> updated in place, since they are used in multiple threads (vinyl
>>>>>>>> workers). So it is necessary to get new key definitions from a
>>>>>>>> new index definition, and leave old ones to be deleted by
>>>>>>>> existing worker tasks, when they will be finished.
>>>>>>> 
>>>>>>> I don't think that workers alone are worth introducing key_def reference
>>>>>>> counting - we can just copy key_def for them, it is a slow path after
>>>>>>> all.
>>>>>> 
>>>>>> We can, but further we want to store key_def per mem and per run to
>>>>>> allow various formats and keys in different places.
>>>>> 
>>>>> But why? All tuples stored in an index should have the same format and
>>>>> be compared by the same key definition, no?
>>>> 
>>>> Now yes. Later - no. On vinyl alter you can not update formats in all older tuples in all mems and runs.
>>> 
>>> I don't think I understand. If you change space format, all tuples must
>>> conform to it, no? Why do we need a separate format per each run and/or
>>> mem?
>> 
>> Exactly. But they may not be the same. They different, but compatible.
> 
> That doesn't answer my question. Why do we need to store different but
> compatible formats in each mem/run instead of simply using index->format
> in all iterators?

"All tuples stored in an index should have the same format" - this is your words, and I answered, that it is false. Indexes can contain tuples with different compatible formats. Iterator->format has no sense, since format is a property of a tuple, not an iterator.

> 
>> I think, that the same can be applied to key_defs in a future. In a
>> case of vinyl we must avoid full rebuild - it will be much and mush
>> slower that memtx, and if key_def will be changed in a compatible way,
>> then I do not see a reason to do not use it for rebuild avoidance.
>> 
>>> 
>>>> And on index update you can not change key_defs too.
>>> 
>>> Why would you need to do anything like this? Just rebuild the whole
>>> index on key_def alter, that's all. Simple and robust.
>> 
>> It is not effective when a key_def is changed in a compatible way (for example, by extending of part types).
>> 
>>> 
>>> To be honest, I find all those little tricks, like changing key field
>>> type from unsigned to signed without rebuild, useless. Anybody needs
>>> that?
>> 
>> You think, that it useless, but lets see how many time rebuild
>> consumes on a space with, for instance, 5 000 000 000 tuples?
> 
> Doesn't matter if nobody uses it.

You do not know, do somebody use it or not. If you think, that it is not needed, you can ask in a public chat: do people want to avoid index rebuild on any index alter, or it is not matter.

> 
>> 
>>> 
>>>> 
>>>>> 
>>>>>> sk = s:create_index('sk', {parts = {{2, 'unsigned', is_nullable = true}}})
>>>>>> s:format({}) -- this will fail, because min_field_count is changed. But user even does not know about 'min_field_count'. We simply can not explain to a user, why it is forbidden.
>>>>> 
>>>>> Who cares? De facto, vinyl doesn't support alter at all. The user can
>>>>> drop all indexes and recreate the space with another format.
>>>> 
>>>> Vinyl supports alter. At least, it allows to update format in compatible ways.
>>> 
>>> You've recently forbidden altering key_def of an empty vinyl index, and
>>> nobody cared. How's space format any different?
>> 
>> I have forbidden key definition alter, yes, but I did not forbid to
>> create/drop an index. And these hacks are only for empty spaces. On a
>> non-empty space key_definition has no worked ever.  Now you propose to
>> forbid alteration of a non-empty space. And for a reason, that can not
>> be even explained to a user. With the same success we can forbid any
>> alter of Vinyl spaces, except disk options (page sizes etc).
> 
> I'm suggesting to temporarily disable it until we figure out a way to
> fix this key_def problem. Reference counting is a no-go for me, as it
> turns the relationship between index_def and vy_index into even a bigger
> mess. I would resort to it only if I was absolutely sure there's no
> better way. Please come up with a better approach to this problem if you
> don't want to disable ALTER for vinyl, even temporarily.
> 
> Since you're not even trying, let me help. Have you considered updating
> vy_index->key_def and cmp_def in-place? AFAIK all vinyl objects
> reference either vy_index->key_def or cmp_def or both, so if we updated
> field types and comparators of vy_index->key_def and cmp_def in a
> compatible way in-place (i.e. without reallocating those structures),
> all its users (iterators, cache, ranges, etc) would get updated
> automatically, and we wouldn't need any reference counting or that ugly
> iteration over all ranges in ALTER you're trying to push - everything
> would seamlessly switch to the new definition. Everything, except
> workers, of course, for they are running in other threads, but for
> them we can duplicate key_def and cmp_def and store them in vy_task -
> it's a slow path anyway. Currently, I don't see any pitfalls in such a
> design, please look into it.

It was already considered, and it solves the problem, but it 1) is a temporary hack, 2) contradict with a plan, that I have discussed with Kostja. If he approve, that we want to copy key/cmp_def on each task, I would implement it.

> 
>>>>> Anyway, the fact that you can't change a key definition or space format
>>>>> without potentially affecting other key definitions looks cheesy...
>>>> 
>>>> A user does not see changes of non-altered indexes. So it is not cheesy.
>>> 
>>> I didn't mean users, I meant the ALTER code - it's getting more and more
>>> convoluted. And the bug you're fixing is a living proof to that.
>> 
>> It always was slightly "convoluted". Somewhere it complicates, and
>> somewhere it simplifies - the ModifySpaceFormat is "living proof to
>> that" - it removed many of doubtful code from ModifySpace.
> 
> So what? Are you implying that it justifies the mess we already have and
> gives you a permission to make it even more messy?

Messy and complex are not the same. I think, that now alter code is much better, than ever. It is neither messy nor complex.





More information about the Tarantool-patches mailing list