[patches] [PATCH 4/4] vinyl: on space alter update key definitions in indexes

v.shpilevoy at tarantool.org v.shpilevoy at tarantool.org
Wed Mar 14 12:01:22 MSK 2018



> 14 марта 2018 г., в 11:30, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> 
> On Tue, Mar 13, 2018 at 05:44:27PM +0300, v.shpilevoy at tarantool.org wrote:
>> 
>> 
>>> 13 марта 2018 г., в 16:44, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>> 
>>> On Mon, Mar 12, 2018 at 08:01:21PM +0300, v.shpilevoy at tarantool.org wrote:
>>>> 
>>>> 
>>>>> 12 марта 2018 г., в 19:47, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>>>> 
>>>>> On Mon, Mar 12, 2018 at 07:16:31PM +0300, v.shpilevoy at tarantool.org wrote:
>>>>>> 
>>>>>> 
>>>>>>> 12 марта 2018 г., в 19:10, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>>>>>> 
>>>>>>> On Mon, Mar 12, 2018 at 05:41:12PM +0300, v.shpilevoy at tarantool.org wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 12 марта 2018 г., в 16:03, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
>>>>>>>>> 
>>>>>>>>> On Mon, Mar 12, 2018 at 12:53:36AM +0300, Vladislav Shpilevoy wrote:
>>>>>>>>>> Vinyl alter can change comparators, for example, by removal of a
>>>>>>>>>> space format and making some fields optional. This alter is
>>>>>>>>>> allowed on a non-empty space even, because optionality of a
>>>>>>>>>> field is not persisted. But index key definitions can not be
>>>>>>>>>> updated in place, since they are used in multiple threads (vinyl
>>>>>>>>>> workers). So it is necessary to get new key definitions from a
>>>>>>>>>> new index definition, and leave old ones to be deleted by
>>>>>>>>>> existing worker tasks, when they will be finished.
>>>>>>>>> 
>>>>>>>>> I don't think that workers alone are worth introducing key_def reference
>>>>>>>>> counting - we can just copy key_def for them, it is a slow path after
>>>>>>>>> all.
>>>>>>>> 
>>>>>>>> We can, but further we want to store key_def per mem and per run to
>>>>>>>> allow various formats and keys in different places.
>>>>>>> 
>>>>>>> But why? All tuples stored in an index should have the same format and
>>>>>>> be compared by the same key definition, no?
>>>>>> 
>>>>>> Now yes. Later - no. On vinyl alter you can not update formats in all older tuples in all mems and runs.
>>>>> 
>>>>> I don't think I understand. If you change space format, all tuples must
>>>>> conform to it, no? Why do we need a separate format per each run and/or
>>>>> mem?
>>>> 
>>>> Exactly. But they may not be the same. They different, but compatible.
>>> 
>>> That doesn't answer my question. Why do we need to store different but
>>> compatible formats in each mem/run instead of simply using index->format
>>> in all iterators?
>> 
>> "All tuples stored in an index should have the same format" - this is
>> your words, and I answered, that it is false. Indexes can contain
>> tuples with different compatible formats. Iterator->format has no
>> sense, since format is a property of a tuple, not an iterator.
> 
> I guess you either don't understand or don't want to understand what I'm
> saying. Essentially, vy_index->format only defines the way tuples are
> stored, in memory or on disk, hence it can't change without index
> rebuild. Other format fields (min_field_count, dict) are not used in
> vinyl AFAICS. Am I wrong? If not, why do we need to maintain different
> tuple formats (per-mem, per-run) for the same index?

You are wrong. Vy_index->formats are the formats, that tuples have returned to a user. Users sees not space->format, it sees actually vy_index->format, that for a user looks the same.

> 
>> 
>>> 
>>>> I think, that the same can be applied to key_defs in a future. In a
>>>> case of vinyl we must avoid full rebuild - it will be much and mush
>>>> slower that memtx, and if key_def will be changed in a compatible way,
>>>> then I do not see a reason to do not use it for rebuild avoidance.
>>>> 
>>>>> 
>>>>>> And on index update you can not change key_defs too.
>>>>> 
>>>>> Why would you need to do anything like this? Just rebuild the whole
>>>>> index on key_def alter, that's all. Simple and robust.
>>>> 
>>>> It is not effective when a key_def is changed in a compatible way (for example, by extending of part types).
>>>> 
>>>>> 
>>>>> To be honest, I find all those little tricks, like changing key field
>>>>> type from unsigned to signed without rebuild, useless. Anybody needs
>>>>> that?
>>>> 
>>>> You think, that it useless, but lets see how many time rebuild
>>>> consumes on a space with, for instance, 5 000 000 000 tuples?
>>> 
>>> Doesn't matter if nobody uses it.
>> 
>> You do not know, do somebody use it or not. If you think, that it is
>> not needed, you can ask in a public chat: do people want to avoid
>> index rebuild on any index alter, or it is not matter.
> 
> I'm not going to do that, because it's *your* responsibility as this is
> *your* patch, not mine.

Ok, my opinion, that rebuild must be avoided. EOF.

> 
>>>>>>>> sk = s:create_index('sk', {parts = {{2, 'unsigned', is_nullable = true}}})
>>>>>>>> s:format({}) -- this will fail, because min_field_count is changed. But user even does not know about 'min_field_count'. We simply can not explain to a user, why it is forbidden.
>>>>>>> 
>>>>>>> Who cares? De facto, vinyl doesn't support alter at all. The user can
>>>>>>> drop all indexes and recreate the space with another format.
>>>>>> 
>>>>>> Vinyl supports alter. At least, it allows to update format in compatible ways.
>>>>> 
>>>>> You've recently forbidden altering key_def of an empty vinyl index, and
>>>>> nobody cared. How's space format any different?
>>>> 
>>>> I have forbidden key definition alter, yes, but I did not forbid to
>>>> create/drop an index. And these hacks are only for empty spaces. On a
>>>> non-empty space key_definition has no worked ever.  Now you propose to
>>>> forbid alteration of a non-empty space. And for a reason, that can not
>>>> be even explained to a user. With the same success we can forbid any
>>>> alter of Vinyl spaces, except disk options (page sizes etc).
>>> 
>>> I'm suggesting to temporarily disable it until we figure out a way to
>>> fix this key_def problem. Reference counting is a no-go for me, as it
>>> turns the relationship between index_def and vy_index into even a bigger
>>> mess. I would resort to it only if I was absolutely sure there's no
>>> better way. Please come up with a better approach to this problem if you
>>> don't want to disable ALTER for vinyl, even temporarily.
>>> 
>>> Since you're not even trying, let me help. Have you considered updating
>>> vy_index->key_def and cmp_def in-place? AFAIK all vinyl objects
>>> reference either vy_index->key_def or cmp_def or both, so if we updated
>>> field types and comparators of vy_index->key_def and cmp_def in a
>>> compatible way in-place (i.e. without reallocating those structures),
>>> all its users (iterators, cache, ranges, etc) would get updated
>>> automatically, and we wouldn't need any reference counting or that ugly
>>> iteration over all ranges in ALTER you're trying to push - everything
>>> would seamlessly switch to the new definition. Everything, except
>>> workers, of course, for they are running in other threads, but for
>>> them we can duplicate key_def and cmp_def and store them in vy_task -
>>> it's a slow path anyway. Currently, I don't see any pitfalls in such a
>>> design, please look into it.
>> 
>> It was already considered, and it solves the problem, but it
> 
>> 1) is a temporary hack,
> 
> Oh, really? May I beg to ask why?

We can not copy into a worker all objects, that can be changed in TX thread. For example, we do not copy formats regardless of it is "slowpath". After all, copying will be done in TX thread, not in a worker, which is really slowpath, so you actually propose to consume CPU for this. Key_def is relative big object (and will become much bigger after merging with SQL), and its copying is not free.

> 
>> 2) contradict with a plan, that I have discussed with Kostja.
> 
> Oh, now that's getting funny. You might not have noticed that, but I
> haven't participated in any discussions and I'm not aware of any of your
> plans.

I told it to you serval messages before.

> 
>> If he approve, that we want to copy key/cmp_def on each task, I would
>> implement it.
> 
> OK, I see. Then why did you ask me to review this patch set in the first
> place? Just get it pushed by Kostja.

Because now it is your responsibility to review patches before Kostja, as I known from Kirill. If not, then you can discuss it with Kirill.

> 
>> 
>>> 
>>>>>>> Anyway, the fact that you can't change a key definition or space format
>>>>>>> without potentially affecting other key definitions looks cheesy...
>>>>>> 
>>>>>> A user does not see changes of non-altered indexes. So it is not cheesy.
>>>>> 
>>>>> I didn't mean users, I meant the ALTER code - it's getting more and more
>>>>> convoluted. And the bug you're fixing is a living proof to that.
>>>> 
>>>> It always was slightly "convoluted". Somewhere it complicates, and
>>>> somewhere it simplifies - the ModifySpaceFormat is "living proof to
>>>> that" - it removed many of doubtful code from ModifySpace.
>>> 
>>> So what? Are you implying that it justifies the mess we already have and
>>> gives you a permission to make it even more messy?
>> 
>> Messy and complex are not the same. I think, that now alter code is
>> much better, than ever. It is neither messy nor complex.
> 
> The problem is you're not the only developer here, and others might
> disagree, but you obviously don't care.

No comments. EOF.





More information about the Tarantool-patches mailing list