From: Nikita Pettik <korablev@tarantool.org> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Date: Fri, 22 May 2020 02:42:06 +0000 [thread overview] Message-ID: <20200522024206.GA24698@tarantool.org> (raw) In-Reply-To: <0bdc1b5e-82b4-2e24-ad06-26587b007742@tarantool.org> [-- Attachment #1: Type: text/plain, Size: 10515 bytes --] On 21 May 00:13, Vladislav Shpilevoy wrote: > Thanks for the patchset! > > I grepped for other vy_apply_upsert() usage cases and found these: > > - vy_build_recover_stmt() - should we patch it too? No, I'd say not. If user wants to skip errors and continue recovery process one can use force_recovery option. > - vy_tx_write() - here the upsert is skipped in case of any error, but > what if it was not a ClientError? Originally, there was a comment: /* * Ignore a memory error, because it is * not critical to apply the optimization. */ So I decided to keep this logic. I guess returning -1 in case of OOM is also OK. So I can add check here as well. Let me know. Btw, I've found another one bug concerned with upserts. It is revealed since upserts can turn out to be invalid not only due to OOM. Here's the patch (see also as an attachment). It is located on the top of np/gh-1622-skip-invalid-upserts branch. Author: Nikita Pettik <korablev@tarantool.org> Date: Fri May 22 04:38:45 2020 +0300 vinyl: handle invalid upserts in background squash process When L0 collects more than certain number of upserts (to be more precise VY_UPSERT_THRESHOLD == 128), background process is launched to squash them. During this squash, new statements could be inserted into L0. Until committed prepared statements feature lsn = MAX_LSN as a special marker. It is assumed that all upserts will be successfully squashed. As a result statements with given key in L0 will be able to have only MAX_LSN lsn (except for the squash result). However, it is not so if at least one upsert is failed to be applied: it will get stack in L0 until dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong. Moreover, if we execute enough (> 128) invalid upserts, they will prevent from squashing normal upserts. So let's skip invalid upserts and don't account them while calculating number of upserts residing in L0. Follow-up #1622 diff --git a/src/box/vinyl.c b/src/box/vinyl.c index 55a4c8fb7..b39e3b9d8 100644 --- a/src/box/vinyl.c +++ b/src/box/vinyl.c @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash) if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 || vy_stmt_type(mem_stmt) != IPROTO_UPSERT) break; - assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN); - vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts); - if (n_upserts <= VY_UPSERT_THRESHOLD) - ++n_upserts; + /* + * Upsert was not applied in vy_history_apply(), + * but it still resides in tree memory. Ignore + * such statements and do not account them while + * counting upserts. Otherwise invalid upserts will + * get stack and won't allow other upserts to squash. + */ + if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) { + vy_stmt_set_n_upserts((struct tuple *) mem_stmt, + n_upserts); + if (n_upserts <= VY_UPSERT_THRESHOLD) + ++n_upserts; + } else { + vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0); + } vy_mem_tree_iterator_prev(&mem->tree, &mem_itr); } diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result index b59ed53dc..12ca65c19 100644 --- a/test/vinyl/gh-1622-skip-invalid-upserts.result +++ b/test/vinyl/gh-1622-skip-invalid-upserts.result @@ -130,6 +130,140 @@ s:select{} | - - [2, 13] | ... +-- Try to insert too many invalid upserts in a row so that +-- they trigger squashing procedure. +-- +fiber = require('fiber') + | --- + | ... +s:drop() + | --- + | ... + +s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 }) + | --- + | ... +pk = s:create_index('pk') + | --- + | ... +s:replace({1, 1}) + | --- + | - [1, 1] + | ... + +squashed_before = pk:stat().upsert.squashed + | --- + | ... +applied_before = pk:stat().upsert.applied + | --- + | ... +len_before = s:len() + | --- + | ... +upsert_cnt = 129 + | --- + | ... +for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end + | --- + | ... +-- 1 squash is accounted in vy_squash_process() unconditionally. +-- +while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end + | --- + | ... +-- Make sure invalid upserts are committed into tree. +-- +assert(s:len() == len_before + upsert_cnt) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +-- Adding new invalid upserts will not result in squash until +-- we reach VY_UPSERT_THRESHOLD (128) limit again. +-- +s:upsert({1, 1}, {{'=', 3, 5}}) + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 1) + | --- + | - true + | ... +for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 1) + | --- + | - true + | ... +s:upsert({1, 1}, {{'=', 3, 5}}) + | --- + | ... +while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 2) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +box.snapshot() + | --- + | - ok + | ... +-- No one upsert is applied so number of rows should not change. +-- +assert(s:len() == len_before) + | --- + | - true + | ... +assert(pk:stat().upsert.squashed == squashed_before + 2) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +-- Now let's mix valid and invalid upserts. Make sure that only +-- valid are accounted during squash/apply. +-- +for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end + | --- + | ... +while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end + | --- + | ... +assert(s:len() == len_before + upsert_cnt) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + | --- + | - true + | ... +box.snapshot() + | --- + | - ok + | ... +assert(s:len() == len_before) + | --- + | - true + | ... +assert(pk:stat().upsert.squashed == squashed_before + 3) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + | --- + | - true + | ... + s:drop() | --- | ... diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua index eb93393be..27c168fcb 100644 --- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua +++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua @@ -47,4 +47,55 @@ s:select{} box.snapshot() s:select{} +-- Try to insert too many invalid upserts in a row so that +-- they trigger squashing procedure. +-- +fiber = require('fiber') +s:drop() + +s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 }) +pk = s:create_index('pk') +s:replace({1, 1}) + +squashed_before = pk:stat().upsert.squashed +applied_before = pk:stat().upsert.applied +len_before = s:len() +upsert_cnt = 129 +for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end +-- 1 squash is accounted in vy_squash_process() unconditionally. +-- +while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end +-- Make sure invalid upserts are committed into tree. +-- +assert(s:len() == len_before + upsert_cnt) +assert(pk:stat().upsert.applied == applied_before) +-- Adding new invalid upserts will not result in squash until +-- we reach VY_UPSERT_THRESHOLD (128) limit again. +-- +s:upsert({1, 1}, {{'=', 3, 5}}) +assert(pk:stat().upsert.squashed == squashed_before + 1) +for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end +assert(pk:stat().upsert.squashed == squashed_before + 1) +s:upsert({1, 1}, {{'=', 3, 5}}) +while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end +assert(pk:stat().upsert.squashed == squashed_before + 2) +assert(pk:stat().upsert.applied == applied_before) +box.snapshot() +-- No one upsert is applied so number of rows should not change. +-- +assert(s:len() == len_before) +assert(pk:stat().upsert.squashed == squashed_before + 2) +assert(pk:stat().upsert.applied == applied_before) +-- Now let's mix valid and invalid upserts. Make sure that only +-- valid are accounted during squash/apply. +-- +for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end +while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end +assert(s:len() == len_before + upsert_cnt) +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) +box.snapshot() +assert(s:len() == len_before) +assert(pk:stat().upsert.squashed == squashed_before + 3) +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + s:drop() \ No newline at end of file > On 13/04/2020 23:55, Nikita Pettik wrote: > > Branch: https://github.com/tarantool/tarantool/tree/np/gh-1622-skip-invalid-upserts > > Issue: https://github.com/tarantool/tarantool/issues/1622 > > > > In fact, the first patch in series does not make much sense without > > second since otherwise dump or compaction processes (and any other > > operations that read data as well) will fail whenever they are started. > > But still I've divided fix into two patches for the sake of review > > simplicity. > > > > Nikita Pettik (2): > > vinyl: validate resulting tuple after upsert is applied > > vinyl: skip invalid upserts during squash > > > > src/box/vy_history.c | 20 ++- > > src/box/vy_lsm.c | 13 +- > > src/box/vy_tx.c | 29 ++-- > > src/box/vy_upsert.c | 4 + > > src/box/vy_write_iterator.c | 34 +++-- > > .../vinyl/gh-1622-skip-invalid-upserts.result | 135 ++++++++++++++++++ > > .../gh-1622-skip-invalid-upserts.test.lua | 50 +++++++ > > 7 files changed, 258 insertions(+), 27 deletions(-) > > create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.result > > create mode 100644 test/vinyl/gh-1622-skip-invalid-upserts.test.lua > > [-- Attachment #2: 0001-vinyl-handle-invalid-upserts-in-background-squash-pr.patch --] [-- Type: text/x-diff, Size: 8277 bytes --] From b4a8349217abe5727047751ce55fe09f620b3669 Mon Sep 17 00:00:00 2001 Message-Id: <b4a8349217abe5727047751ce55fe09f620b3669.1590114943.git.korablev@tarantool.org> From: Nikita Pettik <korablev@tarantool.org> Date: Fri, 22 May 2020 04:38:45 +0300 Subject: [PATCH] vinyl: handle invalid upserts in background squash process When L0 collects more than certain number of upserts (to be more precise VY_UPSERT_THRESHOLD == 128), background process is launched to squash them. During this squash, new statements could be inserted into L0. Until committed prepared statements feature lsn = MAX_LSN as a special marker. It is assumed that all upserts will be successfully squashed. As a result statements with given key in L0 will be able to have only MAX_LSN lsn (except for the squash result). However, it is not so if at least one upsert is failed to be applied: it will get stack in L0 until dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong. Moreover, if we execute enough (> 128) invalid upserts, they will prevent from squashing normal upserts. So let's skip invalid upserts and don't account them while calculating number of upserts residing in L0. Follow-up #1622 --- src/box/vinyl.c | 19 ++- .../vinyl/gh-1622-skip-invalid-upserts.result | 134 ++++++++++++++++++ .../gh-1622-skip-invalid-upserts.test.lua | 51 +++++++ 3 files changed, 200 insertions(+), 4 deletions(-) diff --git a/src/box/vinyl.c b/src/box/vinyl.c index 55a4c8fb7..b39e3b9d8 100644 --- a/src/box/vinyl.c +++ b/src/box/vinyl.c @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash) if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 || vy_stmt_type(mem_stmt) != IPROTO_UPSERT) break; - assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN); - vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts); - if (n_upserts <= VY_UPSERT_THRESHOLD) - ++n_upserts; + /* + * Upsert was not applied in vy_history_apply(), + * but it still resides in tree memory. Ignore + * such statements and do not account them while + * counting upserts. Otherwise invalid upserts will + * get stack and won't allow other upserts to squash. + */ + if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) { + vy_stmt_set_n_upserts((struct tuple *) mem_stmt, + n_upserts); + if (n_upserts <= VY_UPSERT_THRESHOLD) + ++n_upserts; + } else { + vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0); + } vy_mem_tree_iterator_prev(&mem->tree, &mem_itr); } diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.result b/test/vinyl/gh-1622-skip-invalid-upserts.result index b59ed53dc..12ca65c19 100644 --- a/test/vinyl/gh-1622-skip-invalid-upserts.result +++ b/test/vinyl/gh-1622-skip-invalid-upserts.result @@ -130,6 +130,140 @@ s:select{} | - - [2, 13] | ... +-- Try to insert too many invalid upserts in a row so that +-- they trigger squashing procedure. +-- +fiber = require('fiber') + | --- + | ... +s:drop() + | --- + | ... + +s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 }) + | --- + | ... +pk = s:create_index('pk') + | --- + | ... +s:replace({1, 1}) + | --- + | - [1, 1] + | ... + +squashed_before = pk:stat().upsert.squashed + | --- + | ... +applied_before = pk:stat().upsert.applied + | --- + | ... +len_before = s:len() + | --- + | ... +upsert_cnt = 129 + | --- + | ... +for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end + | --- + | ... +-- 1 squash is accounted in vy_squash_process() unconditionally. +-- +while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end + | --- + | ... +-- Make sure invalid upserts are committed into tree. +-- +assert(s:len() == len_before + upsert_cnt) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +-- Adding new invalid upserts will not result in squash until +-- we reach VY_UPSERT_THRESHOLD (128) limit again. +-- +s:upsert({1, 1}, {{'=', 3, 5}}) + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 1) + | --- + | - true + | ... +for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 1) + | --- + | - true + | ... +s:upsert({1, 1}, {{'=', 3, 5}}) + | --- + | ... +while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end + | --- + | ... +assert(pk:stat().upsert.squashed == squashed_before + 2) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +box.snapshot() + | --- + | - ok + | ... +-- No one upsert is applied so number of rows should not change. +-- +assert(s:len() == len_before) + | --- + | - true + | ... +assert(pk:stat().upsert.squashed == squashed_before + 2) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == applied_before) + | --- + | - true + | ... +-- Now let's mix valid and invalid upserts. Make sure that only +-- valid are accounted during squash/apply. +-- +for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end + | --- + | ... +while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end + | --- + | ... +assert(s:len() == len_before + upsert_cnt) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + | --- + | - true + | ... +box.snapshot() + | --- + | - ok + | ... +assert(s:len() == len_before) + | --- + | - true + | ... +assert(pk:stat().upsert.squashed == squashed_before + 3) + | --- + | - true + | ... +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + | --- + | - true + | ... + s:drop() | --- | ... diff --git a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua index eb93393be..27c168fcb 100644 --- a/test/vinyl/gh-1622-skip-invalid-upserts.test.lua +++ b/test/vinyl/gh-1622-skip-invalid-upserts.test.lua @@ -47,4 +47,55 @@ s:select{} box.snapshot() s:select{} +-- Try to insert too many invalid upserts in a row so that +-- they trigger squashing procedure. +-- +fiber = require('fiber') +s:drop() + +s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 }) +pk = s:create_index('pk') +s:replace({1, 1}) + +squashed_before = pk:stat().upsert.squashed +applied_before = pk:stat().upsert.applied +len_before = s:len() +upsert_cnt = 129 +for i =1 , upsert_cnt do s:upsert({1, 1}, {{'=', 3, 5}}) end +-- 1 squash is accounted in vy_squash_process() unconditionally. +-- +while pk:stat().upsert.squashed ~= squashed_before + 1 do fiber.sleep(0.01) end +-- Make sure invalid upserts are committed into tree. +-- +assert(s:len() == len_before + upsert_cnt) +assert(pk:stat().upsert.applied == applied_before) +-- Adding new invalid upserts will not result in squash until +-- we reach VY_UPSERT_THRESHOLD (128) limit again. +-- +s:upsert({1, 1}, {{'=', 3, 5}}) +assert(pk:stat().upsert.squashed == squashed_before + 1) +for i = 1, upsert_cnt - 3 do s:upsert({1, 1}, {{'=', 3, 5}}) end +assert(pk:stat().upsert.squashed == squashed_before + 1) +s:upsert({1, 1}, {{'=', 3, 5}}) +while pk:stat().upsert.squashed ~= squashed_before + 2 do fiber.sleep(0.01) end +assert(pk:stat().upsert.squashed == squashed_before + 2) +assert(pk:stat().upsert.applied == applied_before) +box.snapshot() +-- No one upsert is applied so number of rows should not change. +-- +assert(s:len() == len_before) +assert(pk:stat().upsert.squashed == squashed_before + 2) +assert(pk:stat().upsert.applied == applied_before) +-- Now let's mix valid and invalid upserts. Make sure that only +-- valid are accounted during squash/apply. +-- +for i = 1, upsert_cnt do if i % 2 == 0 then s:upsert({1, 1}, {{'=', 3, 5}}) else s:upsert({1, 1}, {{'+', 2, 1}}) end end +while pk:stat().upsert.squashed ~= squashed_before + 3 do fiber.sleep(0.01) end +assert(s:len() == len_before + upsert_cnt) +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) +box.snapshot() +assert(s:len() == len_before) +assert(pk:stat().upsert.squashed == squashed_before + 3) +assert(pk:stat().upsert.applied == math.floor(upsert_cnt/2) + 1) + s:drop() \ No newline at end of file -- 2.17.1
next prev parent reply other threads:[~2020-05-22 2:42 UTC|newest] Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-04-13 21:55 Nikita Pettik 2020-04-13 21:55 ` [Tarantool-patches] [PATCH 1/2] vinyl: validate resulting tuple after upsert is applied Nikita Pettik 2020-06-22 19:28 ` Aleksandr Lyapunov 2020-04-13 21:55 ` [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash Nikita Pettik 2020-04-13 22:12 ` Konstantin Osipov 2020-05-14 2:11 ` Nikita Pettik 2020-05-14 6:56 ` Konstantin Osipov 2020-05-19 19:10 ` Nikita Pettik 2020-05-19 19:39 ` Konstantin Osipov 2020-05-21 2:51 ` Nikita Pettik 2020-05-21 8:36 ` Konstantin Osipov 2020-05-01 0:31 ` Vladislav Shpilevoy 2020-05-14 2:21 ` Nikita Pettik 2020-05-14 21:32 ` Vladislav Shpilevoy 2020-05-19 18:18 ` Nikita Pettik 2020-05-20 22:13 ` Vladislav Shpilevoy 2020-05-26 21:33 ` Vladislav Shpilevoy 2020-05-27 20:05 ` Nikita Pettik 2020-05-29 21:47 ` Vladislav Shpilevoy 2020-06-01 19:24 ` Nikita Pettik 2020-05-20 22:13 ` [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied Vladislav Shpilevoy 2020-05-22 2:42 ` Nikita Pettik [this message] 2020-05-26 21:33 ` Vladislav Shpilevoy 2020-05-27 20:10 ` Nikita Pettik 2020-06-22 14:13 ` Aleksandr Lyapunov 2020-06-22 20:21 ` Nikita Pettik 2020-06-23 12:32 ` Aleksandr Lyapunov 2020-06-02 21:36 ` Vladislav Shpilevoy 2020-06-02 21:37 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200522024206.GA24698@tarantool.org \ --to=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can'\''t be applied' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox