From: Aleksandr Lyapunov <alyapunov@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>,
tarantool-patches@dev.tarantool.org
Cc: v.shpilevoy@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] vinyl: rotate mem during index build on demand
Date: Sat, 6 Jun 2020 11:38:04 +0300 [thread overview]
Message-ID: <35e748ed-4826-37f0-24a5-a860508bcdc9@tarantool.org> (raw)
In-Reply-To: <7407b62139349ee3904a674490a6222b0c960a1c.1591292549.git.korablev@tarantool.org>
Thanks for the patch, lgtm.
On 6/4/20 8:49 PM, Nikita Pettik wrote:
> Meanwhile in 17f6af7dc the similar problem has been fixed, still it may
> appear that in-memory level of secondary index being constructed has
> lagging memory generation (in other words, it's values is less than
> the value of current global generation). Such situation can be achieved
> if yield which is required to check uniqueness of value being inserted
> is too long. In this time gap other space may trigger dump process
> bumping global memory generation counter, but dump itself is still not
> yet scheduled.
>
> So to get rid of generations mismatch, let's rotate in-memory level
> after yield on demand.
>
> Closes #5042
> ---
> Branch: https://github.com/tarantool/tarantool/tree/np/gh-5042-rotate-mem-after-yield
> Issue: https://github.com/tarantool/tarantool/issues/5042
>
> @ChangeLog:
> * Fix crash due to triggered dump process during secondary index creation (gh-5042).
>
> src/box/vinyl.c | 14 ++
> src/errinj.h | 1 +
> test/box/errinj.result | 1 +
> .../gh-5042-dump-during-index-build-2.result | 189 ++++++++++++++++++
> ...gh-5042-dump-during-index-build-2.test.lua | 98 +++++++++
> test/vinyl/suite.ini | 2 +-
> 6 files changed, 304 insertions(+), 1 deletion(-)
> create mode 100644 test/vinyl/gh-5042-dump-during-index-build-2.result
> create mode 100644 test/vinyl/gh-5042-dump-during-index-build-2.test.lua
>
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index a90b786a7..19e37947a 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -4083,6 +4083,14 @@ vy_build_insert_tuple(struct vy_env *env, struct vy_lsm *lsm,
> vy_mem_pin(mem);
> rc = vy_check_is_unique_secondary(NULL, &env->xm->p_committed_read_view,
> space_name, index_name, lsm, stmt);
> +#ifndef NDEBUG
> + struct errinj *inj = errinj(ERRINJ_BUILD_UNIQUE_IDX_DELAY, ERRINJ_BOOL);
> + if (inj != NULL) {
> + while (inj->bparam) {
> + fiber_sleep(0.01);
> + }
> + }
> +#endif
> vy_mem_unpin(mem);
> if (rc != 0) {
> tuple_unref(stmt);
> @@ -4097,6 +4105,12 @@ vy_build_insert_tuple(struct vy_env *env, struct vy_lsm *lsm,
> * Hence, to get right mem (during mem rotation it becomes
> * sealed i.e. read-only) we should fetch it from lsm again.
> */
> + if (lsm->mem->generation != *lsm->env->p_generation) {
> + if (vy_lsm_rotate_mem(lsm) != 0) {
> + tuple_unref(stmt);
> + return -1;
> + }
> + }
> mem = lsm->mem;
>
> /* Insert the new tuple into the in-memory index. */
> diff --git a/src/errinj.h b/src/errinj.h
> index 92bff5e4d..1adbb9e05 100644
> --- a/src/errinj.h
> +++ b/src/errinj.h
> @@ -131,6 +131,7 @@ struct errinj {
> _(ERRINJ_VY_READ_VIEW_MERGE_FAIL, ERRINJ_BOOL, {.bparam = false})\
> _(ERRINJ_VY_WRITE_ITERATOR_START_FAIL, ERRINJ_BOOL, {.bparam = false})\
> _(ERRINJ_VY_RUN_OPEN, ERRINJ_INT, {.iparam = -1})\
> + _(ERRINJ_BUILD_UNIQUE_IDX_DELAY, ERRINJ_BOOL, {.bparam = false})\
>
> ENUM0(errinj_id, ERRINJ_LIST);
> extern struct errinj errinjs[];
> diff --git a/test/box/errinj.result b/test/box/errinj.result
> index 3f075a2d0..cf1e81072 100644
> --- a/test/box/errinj.result
> +++ b/test/box/errinj.result
> @@ -33,6 +33,7 @@ end
> evals
> ---
> - - ERRINJ_BUILD_INDEX: -1
> + - ERRINJ_BUILD_UNIQUE_IDX_DELAY: false
> - ERRINJ_DYN_MODULE_COUNT: 0
> - ERRINJ_HTTPC_EXECUTE: false
> - ERRINJ_HTTP_RESPONSE_ADD_WAIT: false
> diff --git a/test/vinyl/gh-5042-dump-during-index-build-2.result b/test/vinyl/gh-5042-dump-during-index-build-2.result
> new file mode 100644
> index 000000000..a4fbc56ec
> --- /dev/null
> +++ b/test/vinyl/gh-5042-dump-during-index-build-2.result
> @@ -0,0 +1,189 @@
> +-- test-run result file version 2
> +test_run = require('test_run').new()
> + | ---
> + | ...
> +
> +test_run:cmd("create server test with script='vinyl/low_quota.lua'")
> + | ---
> + | - true
> + | ...
> +test_run:cmd("start server test with args='13421772'")
> + | ---
> + | - true
> + | ...
> +test_run:cmd('switch test')
> + | ---
> + | - true
> + | ...
> +
> +fiber = require 'fiber'
> + | ---
> + | ...
> +errinj = box.error.injection
> + | ---
> + | ...
> +
> +--
> +-- Test is quite similar to gh-4810, but data is inserted to
> +-- the space different from one which builds the new index.
> +-- Consider following scenario:
> +-- 1. during building secondary index, we yield to scan disk
> +-- in order to check tuple uniqueness;
> +-- 2. insertion to another space bumps memory generation (owing to
> +-- exceeding memory limit), but dump itself is not yet scheduled
> +-- (e.g. scheduler fiber does not get time to execute, or it fails
> +-- due for whatever reason);
> +-- 3. we return from yield point and now generation of mem does not
> +-- match with global one (which was bumped during second step). It may
> +-- lead to unpredictable behavior (starting from fired assertions in
> +-- debug mode to segfaults or lost data in release).
> +-- To equalize memory generations, we must rotate mem of corresponding
> +-- LSM tree before processing further operations with it.
> +--
> +s1 = box.schema.space.create('insert_to', {engine = 'vinyl'})
> + | ---
> + | ...
> +_ = s1:create_index('pk', {parts = {1, 'unsigned'}})
> + | ---
> + | ...
> +
> +s2 = box.schema.space.create('build_index', {engine = 'vinyl'})
> + | ---
> + | ...
> +_ = s2:create_index('pk', {parts = {1, 'unsigned'}})
> + | ---
> + | ...
> +
> +val = 0
> + | ---
> + | ...
> +PADDING = string.rep('x', 100)
> + | ---
> + | ...
> +
> +test_run:cmd("setopt delimiter ';'")
> + | ---
> + | - true
> + | ...
> +
> +function gen_insert(space)
> + pcall(space.insert, space, {val, val, val, val, PADDING})
> + val = val + 1
> +end;
> + | ---
> + | ...
> +
> +function fill_L0_without_dump(space, watermark)
> + while box.stat.vinyl().memory.level0 < watermark do
> + gen_insert(space)
> + end
> +end;
> + | ---
> + | ...
> +
> +function insert_loop()
> + while not stop do
> + gen_insert(s1)
> + end
> + ch:put(true)
> +end;
> + | ---
> + | ...
> +
> +function idx_build()
> + _ = s2:create_index('i1', {unique = true, parts = {2, 'unsigned', 3, 'unsigned'}})
> + ch:put(true)
> +end;
> + | ---
> + | ...
> +
> +dump_watermark = box.cfg.vinyl_memory / 2;
> + | ---
> + | ...
> +fill_L0_without_dump(s1, dump_watermark / 2);
> + | ---
> + | ...
> +fill_L0_without_dump(s2, dump_watermark);
> + | ---
> + | ...
> +
> +box.snapshot();
> + | ---
> + | - ok
> + | ...
> +
> +fill_L0_without_dump(s1, dump_watermark / 2 * 3);
> + | ---
> + | ...
> +
> +function idx_build()
> + errinj.set("ERRINJ_BUILD_UNIQUE_IDX_DELAY", true)
> + _ = s2:create_index('i1', {unique = true, parts = {2, 'unsigned', 3, 'unsigned'}})
> + ch:put(true)
> +end;
> + | ---
> + | ...
> +
> +stop = false;
> + | ---
> + | ...
> +ch = fiber.channel(2);
> + | ---
> + | ...
> +
> +_ = fiber.create(idx_build);
> + | ---
> + | ...
> +_ = fiber.create(insert_loop, s1);
> + | ---
> + | ...
> +errinj.set("ERRINJ_VY_INDEX_DUMP", 1);
> + | ---
> + | - ok
> + | ...
> +
> +fiber.sleep(2);
> + | ---
> + | ...
> +errinj.set("ERRINJ_BUILD_UNIQUE_IDX_DELAY", false);
> + | ---
> + | - ok
> + | ...
> +errinj.set("ERRINJ_VY_INDEX_DUMP", -1);
> + | ---
> + | - ok
> + | ...
> +
> +stop = true;
> + | ---
> + | ...
> +for i = 1, ch:size() do
> + ch:get()
> +end;
> + | ---
> + | ...
> +
> +s1:drop();
> + | ---
> + | ...
> +s2:drop();
> + | ---
> + | ...
> +
> +test_run:cmd("setopt delimiter ''");
> + | ---
> + | - true
> + | ...
> +
> +test_run:cmd('switch default')
> + | ---
> + | - true
> + | ...
> +test_run:cmd("stop server test")
> + | ---
> + | - true
> + | ...
> +test_run:cmd("cleanup server test")
> + | ---
> + | - true
> + | ...
> diff --git a/test/vinyl/gh-5042-dump-during-index-build-2.test.lua b/test/vinyl/gh-5042-dump-during-index-build-2.test.lua
> new file mode 100644
> index 000000000..16cdfa668
> --- /dev/null
> +++ b/test/vinyl/gh-5042-dump-during-index-build-2.test.lua
> @@ -0,0 +1,98 @@
> +test_run = require('test_run').new()
> +
> +test_run:cmd("create server test with script='vinyl/low_quota.lua'")
> +test_run:cmd("start server test with args='13421772'")
> +test_run:cmd('switch test')
> +
> +fiber = require 'fiber'
> +errinj = box.error.injection
> +
> +--
> +-- Test is quite similar to gh-4810, but data is inserted to
> +-- the space different from one which builds the new index.
> +-- Consider following scenario:
> +-- 1. during building secondary index, we yield to scan disk
> +-- in order to check tuple uniqueness;
> +-- 2. insertion to another space bumps memory generation (owing to
> +-- exceeding memory limit), but dump itself is not yet scheduled
> +-- (e.g. scheduler fiber does not get time to execute, or it fails
> +-- due for whatever reason);
> +-- 3. we return from yield point and now generation of mem does not
> +-- match with global one (which was bumped during second step). It may
> +-- lead to unpredictable behavior (starting from fired assertions in
> +-- debug mode to segfaults or lost data in release).
> +-- To equalize memory generations, we must rotate mem of corresponding
> +-- LSM tree before processing further operations with it.
> +--
> +s1 = box.schema.space.create('insert_to', {engine = 'vinyl'})
> +_ = s1:create_index('pk', {parts = {1, 'unsigned'}})
> +
> +s2 = box.schema.space.create('build_index', {engine = 'vinyl'})
> +_ = s2:create_index('pk', {parts = {1, 'unsigned'}})
> +
> +val = 0
> +PADDING = string.rep('x', 100)
> +
> +test_run:cmd("setopt delimiter ';'")
> +
> +function gen_insert(space)
> + pcall(space.insert, space, {val, val, val, val, PADDING})
> + val = val + 1
> +end;
> +
> +function fill_L0_without_dump(space, watermark)
> + while box.stat.vinyl().memory.level0 < watermark do
> + gen_insert(space)
> + end
> +end;
> +
> +function insert_loop()
> + while not stop do
> + gen_insert(s1)
> + end
> + ch:put(true)
> +end;
> +
> +function idx_build()
> + _ = s2:create_index('i1', {unique = true, parts = {2, 'unsigned', 3, 'unsigned'}})
> + ch:put(true)
> +end;
> +
> +dump_watermark = box.cfg.vinyl_memory / 2;
> +fill_L0_without_dump(s1, dump_watermark / 2);
> +fill_L0_without_dump(s2, dump_watermark);
> +
> +box.snapshot();
> +
> +fill_L0_without_dump(s1, dump_watermark / 2 * 3);
> +
> +function idx_build()
> + errinj.set("ERRINJ_BUILD_UNIQUE_IDX_DELAY", true)
> + _ = s2:create_index('i1', {unique = true, parts = {2, 'unsigned', 3, 'unsigned'}})
> + ch:put(true)
> +end;
> +
> +stop = false;
> +ch = fiber.channel(2);
> +
> +_ = fiber.create(idx_build);
> +_ = fiber.create(insert_loop, s1);
> +errinj.set("ERRINJ_VY_INDEX_DUMP", 1);
> +
> +fiber.sleep(2);
> +errinj.set("ERRINJ_BUILD_UNIQUE_IDX_DELAY", false);
> +errinj.set("ERRINJ_VY_INDEX_DUMP", -1);
> +
> +stop = true;
> +for i = 1, ch:size() do
> + ch:get()
> +end;
> +
> +s1:drop();
> +s2:drop();
> +
> +test_run:cmd("setopt delimiter ''");
> +
> +test_run:cmd('switch default')
> +test_run:cmd("stop server test")
> +test_run:cmd("cleanup server test")
> diff --git a/test/vinyl/suite.ini b/test/vinyl/suite.ini
> index 4de29fccf..7f2a2896f 100644
> --- a/test/vinyl/suite.ini
> +++ b/test/vinyl/suite.ini
> @@ -2,7 +2,7 @@
> core = tarantool
> description = vinyl integration tests
> script = vinyl.lua
> -release_disabled = errinj.test.lua errinj_ddl.test.lua errinj_gc.test.lua errinj_stat.test.lua errinj_tx.test.lua errinj_vylog.test.lua partial_dump.test.lua quota_timeout.test.lua recovery_quota.test.lua replica_rejoin.test.lua gh-4864-stmt-alloc-fail-compact.test.lua gh-4805-open-run-err-recovery.test.lua
> +release_disabled = errinj.test.lua errinj_ddl.test.lua errinj_gc.test.lua errinj_stat.test.lua errinj_tx.test.lua errinj_vylog.test.lua partial_dump.test.lua quota_timeout.test.lua recovery_quota.test.lua replica_rejoin.test.lua gh-4864-stmt-alloc-fail-compact.test.lua gh-4805-open-run-err-recovery.test.lua gh-5042-dump-during-index-build-2.test.lua
> config = suite.cfg
> lua_libs = suite.lua stress.lua large.lua txn_proxy.lua ../box/lua/utils.lua
> use_unix_sockets = True
next prev parent reply other threads:[~2020-06-06 8:38 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-04 17:49 Nikita Pettik
2020-06-04 20:23 ` Konstantin Osipov
2020-06-04 22:32 ` Nikita Pettik
2020-06-04 22:52 ` Konstantin Osipov
2020-06-05 11:18 ` Nikita Pettik
2020-06-05 11:33 ` Konstantin Osipov
2020-06-06 8:38 ` Aleksandr Lyapunov [this message]
2020-06-07 15:51 ` Vladislav Shpilevoy
2020-06-09 22:34 ` Nikita Pettik
2020-06-15 13:54 ` Nikita Pettik
2020-06-23 22:43 ` Vladislav Shpilevoy
2020-06-23 23:11 ` Nikita Pettik
2020-07-22 22:54 ` Vladislav Shpilevoy
2020-06-10 15:27 ` Nikita Pettik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=35e748ed-4826-37f0-24a5-a860508bcdc9@tarantool.org \
--to=alyapunov@tarantool.org \
--cc=korablev@tarantool.org \
--cc=tarantool-patches@dev.tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH] vinyl: rotate mem during index build on demand' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox