[Tarantool-patches] [PATCH] vinyl: bump dump_generation in case scheduler doesn't catch up with DDL
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Fri Apr 3 03:21:37 MSK 2020
Hi! Thanks for the patch!
See 3 comments below.
On 31/03/2020 02:47, Nikita Pettik wrote:
> It may turn out that dump_generation does not catch up with current
> generation and no other dump tasks are in progress. This may happen
> dump process is throttled due to errors. In this case generation is
> bumped but dump_generation is not (since dump is not completed). In
> turn, throttling opens a window for DDL operations. For instance, index
> dropping and creation of new one results in mentioned situation:
>
> box.snapshot() -- fails for some reason; next attempt at dumping will be
> -- taken in one second.
> s:drop() -- drop index to be dumped
> s = box.schema.space.create('test', {engine = 'vinyl'})
> -- create new one (its mem generation is greater than scheduler's one)
> i = s:create_index('pk')
>
> Closes #4821
> ---
1. Add a changelog please.
> Note that current fix is only workaround amd doesn't pretend to be
> the most proper one. Alternatively, we can rollback scheduler
> generation in case dump is failed and index to be dumped is dropped.
2. Not sure this is possible. At least I don't see a way. The dump is
failed before the index is dropped. And only on a next dump you see
that there is a problem. You can try changing the generation in the
index drop, if you see, that its generation matches the dump generation.
But it looks like encapsulation violation if you do it straight forward.
vy_lsm works with the scheduler only via its API. If you go this way,
we need a method like vy_scheduler_lsm_is_dropped(). It would check that
if the LSM's generation = dump_generation, we should recalculate it
not counting this tree.
However I am ok with both solutions. Up to you.
> Branch: https://github.com/tarantool/tarantool/tree/np/gh-4821-scheduler-gen-assert
> Issue: https://github.com/tarantool/tarantool/issues/4821
> > diff --git a/test/vinyl/gh-4821-ddl-during-throttled-dump.result b/test/vinyl/gh-4821-ddl-during-throttled-dump.result
> new file mode 100644
> index 000000000..eee031101
3. I rolledback the patch except the test, and it passed anyway.
The example in the ticket crashes on my machine, but the test
does not.
Why not to take the test from the ticket as is? With a small
change of throttling timeout using ERRINJ_VY_SCHED_TIMEOUT, to
make it fail faster than 1 sec.
More information about the Tarantool-patches
mailing list