[patches] [PATCH 4/4] vinyl: on space alter update key definitions in indexes

Vladimir Davydov vdavydov.dev at gmail.com
Wed Mar 14 11:30:06 MSK 2018


On Tue, Mar 13, 2018 at 05:44:27PM +0300, v.shpilevoy at tarantool.org wrote:
> 
> 
> > 13 марта 2018 г., в 16:44, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> > 
> > On Mon, Mar 12, 2018 at 08:01:21PM +0300, v.shpilevoy at tarantool.org wrote:
> >> 
> >> 
> >>> 12 марта 2018 г., в 19:47, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>> 
> >>> On Mon, Mar 12, 2018 at 07:16:31PM +0300, v.shpilevoy at tarantool.org wrote:
> >>>> 
> >>>> 
> >>>>> 12 марта 2018 г., в 19:10, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>>>> 
> >>>>> On Mon, Mar 12, 2018 at 05:41:12PM +0300, v.shpilevoy at tarantool.org wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> 12 марта 2018 г., в 16:03, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>>>>>> 
> >>>>>>> On Mon, Mar 12, 2018 at 12:53:36AM +0300, Vladislav Shpilevoy wrote:
> >>>>>>>> Vinyl alter can change comparators, for example, by removal of a
> >>>>>>>> space format and making some fields optional. This alter is
> >>>>>>>> allowed on a non-empty space even, because optionality of a
> >>>>>>>> field is not persisted. But index key definitions can not be
> >>>>>>>> updated in place, since they are used in multiple threads (vinyl
> >>>>>>>> workers). So it is necessary to get new key definitions from a
> >>>>>>>> new index definition, and leave old ones to be deleted by
> >>>>>>>> existing worker tasks, when they will be finished.
> >>>>>>> 
> >>>>>>> I don't think that workers alone are worth introducing key_def reference
> >>>>>>> counting - we can just copy key_def for them, it is a slow path after
> >>>>>>> all.
> >>>>>> 
> >>>>>> We can, but further we want to store key_def per mem and per run to
> >>>>>> allow various formats and keys in different places.
> >>>>> 
> >>>>> But why? All tuples stored in an index should have the same format and
> >>>>> be compared by the same key definition, no?
> >>>> 
> >>>> Now yes. Later - no. On vinyl alter you can not update formats in all older tuples in all mems and runs.
> >>> 
> >>> I don't think I understand. If you change space format, all tuples must
> >>> conform to it, no? Why do we need a separate format per each run and/or
> >>> mem?
> >> 
> >> Exactly. But they may not be the same. They different, but compatible.
> > 
> > That doesn't answer my question. Why do we need to store different but
> > compatible formats in each mem/run instead of simply using index->format
> > in all iterators?
> 
> "All tuples stored in an index should have the same format" - this is
> your words, and I answered, that it is false. Indexes can contain
> tuples with different compatible formats. Iterator->format has no
> sense, since format is a property of a tuple, not an iterator.

I guess you either don't understand or don't want to understand what I'm
saying. Essentially, vy_index->format only defines the way tuples are
stored, in memory or on disk, hence it can't change without index
rebuild. Other format fields (min_field_count, dict) are not used in
vinyl AFAICS. Am I wrong? If not, why do we need to maintain different
tuple formats (per-mem, per-run) for the same index?

> 
> > 
> >> I think, that the same can be applied to key_defs in a future. In a
> >> case of vinyl we must avoid full rebuild - it will be much and mush
> >> slower that memtx, and if key_def will be changed in a compatible way,
> >> then I do not see a reason to do not use it for rebuild avoidance.
> >> 
> >>> 
> >>>> And on index update you can not change key_defs too.
> >>> 
> >>> Why would you need to do anything like this? Just rebuild the whole
> >>> index on key_def alter, that's all. Simple and robust.
> >> 
> >> It is not effective when a key_def is changed in a compatible way (for example, by extending of part types).
> >> 
> >>> 
> >>> To be honest, I find all those little tricks, like changing key field
> >>> type from unsigned to signed without rebuild, useless. Anybody needs
> >>> that?
> >> 
> >> You think, that it useless, but lets see how many time rebuild
> >> consumes on a space with, for instance, 5 000 000 000 tuples?
> > 
> > Doesn't matter if nobody uses it.
> 
> You do not know, do somebody use it or not. If you think, that it is
> not needed, you can ask in a public chat: do people want to avoid
> index rebuild on any index alter, or it is not matter.

I'm not going to do that, because it's *your* responsibility as this is
*your* patch, not mine.

> >>>>>> sk = s:create_index('sk', {parts = {{2, 'unsigned', is_nullable = true}}})
> >>>>>> s:format({}) -- this will fail, because min_field_count is changed. But user even does not know about 'min_field_count'. We simply can not explain to a user, why it is forbidden.
> >>>>> 
> >>>>> Who cares? De facto, vinyl doesn't support alter at all. The user can
> >>>>> drop all indexes and recreate the space with another format.
> >>>> 
> >>>> Vinyl supports alter. At least, it allows to update format in compatible ways.
> >>> 
> >>> You've recently forbidden altering key_def of an empty vinyl index, and
> >>> nobody cared. How's space format any different?
> >> 
> >> I have forbidden key definition alter, yes, but I did not forbid to
> >> create/drop an index. And these hacks are only for empty spaces. On a
> >> non-empty space key_definition has no worked ever.  Now you propose to
> >> forbid alteration of a non-empty space. And for a reason, that can not
> >> be even explained to a user. With the same success we can forbid any
> >> alter of Vinyl spaces, except disk options (page sizes etc).
> > 
> > I'm suggesting to temporarily disable it until we figure out a way to
> > fix this key_def problem. Reference counting is a no-go for me, as it
> > turns the relationship between index_def and vy_index into even a bigger
> > mess. I would resort to it only if I was absolutely sure there's no
> > better way. Please come up with a better approach to this problem if you
> > don't want to disable ALTER for vinyl, even temporarily.
> > 
> > Since you're not even trying, let me help. Have you considered updating
> > vy_index->key_def and cmp_def in-place? AFAIK all vinyl objects
> > reference either vy_index->key_def or cmp_def or both, so if we updated
> > field types and comparators of vy_index->key_def and cmp_def in a
> > compatible way in-place (i.e. without reallocating those structures),
> > all its users (iterators, cache, ranges, etc) would get updated
> > automatically, and we wouldn't need any reference counting or that ugly
> > iteration over all ranges in ALTER you're trying to push - everything
> > would seamlessly switch to the new definition. Everything, except
> > workers, of course, for they are running in other threads, but for
> > them we can duplicate key_def and cmp_def and store them in vy_task -
> > it's a slow path anyway. Currently, I don't see any pitfalls in such a
> > design, please look into it.
> 
> It was already considered, and it solves the problem, but it

> 1) is a temporary hack,

Oh, really? May I beg to ask why?

> 2) contradict with a plan, that I have discussed with Kostja.

Oh, now that's getting funny. You might not have noticed that, but I
haven't participated in any discussions and I'm not aware of any of your
plans.

> If he approve, that we want to copy key/cmp_def on each task, I would
> implement it.

OK, I see. Then why did you ask me to review this patch set in the first
place? Just get it pushed by Kostja.

> 
> > 
> >>>>> Anyway, the fact that you can't change a key definition or space format
> >>>>> without potentially affecting other key definitions looks cheesy...
> >>>> 
> >>>> A user does not see changes of non-altered indexes. So it is not cheesy.
> >>> 
> >>> I didn't mean users, I meant the ALTER code - it's getting more and more
> >>> convoluted. And the bug you're fixing is a living proof to that.
> >> 
> >> It always was slightly "convoluted". Somewhere it complicates, and
> >> somewhere it simplifies - the ModifySpaceFormat is "living proof to
> >> that" - it removed many of doubtful code from ModifySpace.
> > 
> > So what? Are you implying that it justifies the mess we already have and
> > gives you a permission to make it even more messy?
> 
> Messy and complex are not the same. I think, that now alter code is
> much better, than ever. It is neither messy nor complex.

The problem is you're not the only developer here, and others might
disagree, but you obviously don't care.



More information about the Tarantool-patches mailing list