[patches] [PATCH 4/4] vinyl: on space alter update key definitions in indexes

Vladimir Davydov vdavydov.dev at gmail.com
Wed Mar 14 12:14:10 MSK 2018


On Wed, Mar 14, 2018 at 12:01:22PM +0300, v.shpilevoy at tarantool.org wrote:
> 
> 
> > 14 марта 2018 г., в 11:30, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> > 
> > On Tue, Mar 13, 2018 at 05:44:27PM +0300, v.shpilevoy at tarantool.org wrote:
> >> 
> >> 
> >>> 13 марта 2018 г., в 16:44, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>> 
> >>> On Mon, Mar 12, 2018 at 08:01:21PM +0300, v.shpilevoy at tarantool.org wrote:
> >>>> 
> >>>> 
> >>>>> 12 марта 2018 г., в 19:47, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>>>> 
> >>>>> On Mon, Mar 12, 2018 at 07:16:31PM +0300, v.shpilevoy at tarantool.org wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> 12 марта 2018 г., в 19:10, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>>>>>> 
> >>>>>>> On Mon, Mar 12, 2018 at 05:41:12PM +0300, v.shpilevoy at tarantool.org wrote:
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>>> 12 марта 2018 г., в 16:03, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> >>>>>>>>> 
> >>>>>>>>> On Mon, Mar 12, 2018 at 12:53:36AM +0300, Vladislav Shpilevoy wrote:
> >>>>>>>>>> Vinyl alter can change comparators, for example, by removal of a
> >>>>>>>>>> space format and making some fields optional. This alter is
> >>>>>>>>>> allowed on a non-empty space even, because optionality of a
> >>>>>>>>>> field is not persisted. But index key definitions can not be
> >>>>>>>>>> updated in place, since they are used in multiple threads (vinyl
> >>>>>>>>>> workers). So it is necessary to get new key definitions from a
> >>>>>>>>>> new index definition, and leave old ones to be deleted by
> >>>>>>>>>> existing worker tasks, when they will be finished.
> >>>>>>>>> 
> >>>>>>>>> I don't think that workers alone are worth introducing key_def reference
> >>>>>>>>> counting - we can just copy key_def for them, it is a slow path after
> >>>>>>>>> all.
> >>>>>>>> 
> >>>>>>>> We can, but further we want to store key_def per mem and per run to
> >>>>>>>> allow various formats and keys in different places.
> >>>>>>> 
> >>>>>>> But why? All tuples stored in an index should have the same format and
> >>>>>>> be compared by the same key definition, no?
> >>>>>> 
> >>>>>> Now yes. Later - no. On vinyl alter you can not update formats in all older tuples in all mems and runs.
> >>>>> 
> >>>>> I don't think I understand. If you change space format, all tuples must
> >>>>> conform to it, no? Why do we need a separate format per each run and/or
> >>>>> mem?
> >>>> 
> >>>> Exactly. But they may not be the same. They different, but compatible.
> >>> 
> >>> That doesn't answer my question. Why do we need to store different but
> >>> compatible formats in each mem/run instead of simply using index->format
> >>> in all iterators?
> >> 
> >> "All tuples stored in an index should have the same format" - this is
> >> your words, and I answered, that it is false. Indexes can contain
> >> tuples with different compatible formats. Iterator->format has no
> >> sense, since format is a property of a tuple, not an iterator.
> > 
> > I guess you either don't understand or don't want to understand what I'm
> > saying. Essentially, vy_index->format only defines the way tuples are
> > stored, in memory or on disk, hence it can't change without index
> > rebuild. Other format fields (min_field_count, dict) are not used in
> > vinyl AFAICS. Am I wrong? If not, why do we need to maintain different
> > tuple formats (per-mem, per-run) for the same index?
> 
> You are wrong. Vy_index->formats are the formats, that tuples have
> returned to a user. Users sees not space->format, it sees actually
> vy_index->format, that for a user looks the same.

We reallocate tuples anyway. Since formats are compatible we can use
space->format for new tuples.

> 
> > 
> >> 
> >>> 
> >>>> I think, that the same can be applied to key_defs in a future. In a
> >>>> case of vinyl we must avoid full rebuild - it will be much and mush
> >>>> slower that memtx, and if key_def will be changed in a compatible way,
> >>>> then I do not see a reason to do not use it for rebuild avoidance.
> >>>> 
> >>>>> 
> >>>>>> And on index update you can not change key_defs too.
> >>>>> 
> >>>>> Why would you need to do anything like this? Just rebuild the whole
> >>>>> index on key_def alter, that's all. Simple and robust.
> >>>> 
> >>>> It is not effective when a key_def is changed in a compatible way (for example, by extending of part types).
> >>>> 
> >>>>> 
> >>>>> To be honest, I find all those little tricks, like changing key field
> >>>>> type from unsigned to signed without rebuild, useless. Anybody needs
> >>>>> that?
> >>>> 
> >>>> You think, that it useless, but lets see how many time rebuild
> >>>> consumes on a space with, for instance, 5 000 000 000 tuples?
> >>> 
> >>> Doesn't matter if nobody uses it.
> >> 
> >> You do not know, do somebody use it or not. If you think, that it is
> >> not needed, you can ask in a public chat: do people want to avoid
> >> index rebuild on any index alter, or it is not matter.
> > 
> > I'm not going to do that, because it's *your* responsibility as this is
> > *your* patch, not mine.
> 
> Ok, my opinion, that rebuild must be avoided. EOF.
> 
> > 
> >>>>>>>> sk = s:create_index('sk', {parts = {{2, 'unsigned', is_nullable = true}}})
> >>>>>>>> s:format({}) -- this will fail, because min_field_count is changed. But user even does not know about 'min_field_count'. We simply can not explain to a user, why it is forbidden.
> >>>>>>> 
> >>>>>>> Who cares? De facto, vinyl doesn't support alter at all. The user can
> >>>>>>> drop all indexes and recreate the space with another format.
> >>>>>> 
> >>>>>> Vinyl supports alter. At least, it allows to update format in compatible ways.
> >>>>> 
> >>>>> You've recently forbidden altering key_def of an empty vinyl index, and
> >>>>> nobody cared. How's space format any different?
> >>>> 
> >>>> I have forbidden key definition alter, yes, but I did not forbid to
> >>>> create/drop an index. And these hacks are only for empty spaces. On a
> >>>> non-empty space key_definition has no worked ever.  Now you propose to
> >>>> forbid alteration of a non-empty space. And for a reason, that can not
> >>>> be even explained to a user. With the same success we can forbid any
> >>>> alter of Vinyl spaces, except disk options (page sizes etc).
> >>> 
> >>> I'm suggesting to temporarily disable it until we figure out a way to
> >>> fix this key_def problem. Reference counting is a no-go for me, as it
> >>> turns the relationship between index_def and vy_index into even a bigger
> >>> mess. I would resort to it only if I was absolutely sure there's no
> >>> better way. Please come up with a better approach to this problem if you
> >>> don't want to disable ALTER for vinyl, even temporarily.
> >>> 
> >>> Since you're not even trying, let me help. Have you considered updating
> >>> vy_index->key_def and cmp_def in-place? AFAIK all vinyl objects
> >>> reference either vy_index->key_def or cmp_def or both, so if we updated
> >>> field types and comparators of vy_index->key_def and cmp_def in a
> >>> compatible way in-place (i.e. without reallocating those structures),
> >>> all its users (iterators, cache, ranges, etc) would get updated
> >>> automatically, and we wouldn't need any reference counting or that ugly
> >>> iteration over all ranges in ALTER you're trying to push - everything
> >>> would seamlessly switch to the new definition. Everything, except
> >>> workers, of course, for they are running in other threads, but for
> >>> them we can duplicate key_def and cmp_def and store them in vy_task -
> >>> it's a slow path anyway. Currently, I don't see any pitfalls in such a
> >>> design, please look into it.
> >> 
> >> It was already considered, and it solves the problem, but it
> > 
> >> 1) is a temporary hack,
> > 
> > Oh, really? May I beg to ask why?
> 
> We can not copy into a worker all objects, that can be changed in TX
> thread. For example, we do not copy formats regardless of it is
> "slowpath". After all, copying will be done in TX thread, not in a
> worker, which is really slowpath, so you actually propose to consume
> CPU for this. Key_def is relative big object (and will become much
> bigger after merging with SQL), and its copying is not free.

We don't copy tuple_format, because we already have reference counting
for it - it is needed for tuples anyway. Actually, I think that
tuple_format must be frozen per vy_index - it shouldn't change with
ALTER that doesn't require rebuild. May be, we should strip it to a bare
minimum necessary to store tuples on disk and/or in memory.

> 
> > 
> >> 2) contradict with a plan, that I have discussed with Kostja.
> > 
> > Oh, now that's getting funny. You might not have noticed that, but I
> > haven't participated in any discussions and I'm not aware of any of your
> > plans.
> 
> I told it to you serval messages before.
> 
> > 
> >> If he approve, that we want to copy key/cmp_def on each task, I would
> >> implement it.
> > 
> > OK, I see. Then why did you ask me to review this patch set in the first
> > place? Just get it pushed by Kostja.
> 
> Because now it is your responsibility to review patches before Kostja,
> as I known from Kirill. If not, then you can discuss it with Kirill.

OK, I will.

> 
> > 
> >> 
> >>> 
> >>>>>>> Anyway, the fact that you can't change a key definition or space format
> >>>>>>> without potentially affecting other key definitions looks cheesy...
> >>>>>> 
> >>>>>> A user does not see changes of non-altered indexes. So it is not cheesy.
> >>>>> 
> >>>>> I didn't mean users, I meant the ALTER code - it's getting more and more
> >>>>> convoluted. And the bug you're fixing is a living proof to that.
> >>>> 
> >>>> It always was slightly "convoluted". Somewhere it complicates, and
> >>>> somewhere it simplifies - the ModifySpaceFormat is "living proof to
> >>>> that" - it removed many of doubtful code from ModifySpace.
> >>> 
> >>> So what? Are you implying that it justifies the mess we already have and
> >>> gives you a permission to make it even more messy?
> >> 
> >> Messy and complex are not the same. I think, that now alter code is
> >> much better, than ever. It is neither messy nor complex.
> > 
> > The problem is you're not the only developer here, and others might
> > disagree, but you obviously don't care.
> 
> No comments. EOF.
> 
> 



More information about the Tarantool-patches mailing list