From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Konstantin Osipov <kostja@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [RFC PATCH 04/23] vinyl: make point lookup always return the latest tuple version
Date: Wed, 11 Jul 2018 19:33:07 +0300
Message-ID: <20180711163307.vxwzle6claiazwrg@esperanza>
In-Reply-To: <20180710164343.xgxdwk4ovbhdmmbo@esperanza>

BTW this patch makes the behavior of vy_point_lookup() consistent with
vy_read_iterator: both iterators now return the newest tuple version.
Also, it allows to simplify vy_squash_process() - see the patch below
(I pushed it on the branch, you may want to cherry-pick it as well):

From c001e3f1a320bb804f49ca19ec540fee094dacc3 Mon Sep 17 00:00:00 2001
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Wed, 11 Jul 2018 19:14:24 +0300
Subject: [PATCH] vinyl: simplify vy_squash_process

Since vy_point_lookup() now guarantees that it returns the newest
tuple version, we can remove the code that squashes UPSERTs from
vy_squash_process().

diff --git a/src/box/vinyl.c b/src/box/vinyl.c
index a9603560..2d1a6fc0 100644
--- a/src/box/vinyl.c
+++ b/src/box/vinyl.c
@@ -3585,11 +3585,6 @@ vy_squash_process(struct vy_squash *squash)
 	struct vy_lsm *lsm = squash->lsm;
 	struct vy_env *env = squash->env;
-	/*
-	 * vy_apply_upsert() is used for primary key only,
-	 * so this is the same as lsm->key_def
-	 */
-	struct key_def *def = lsm->cmp_def;
 	/* Upserts enabled only in the primary index LSM tree. */
 	assert(lsm->index_id == 0);
@@ -3607,8 +3602,10 @@ vy_squash_process(struct vy_squash *squash)
 	/*
 	 * While we were reading on-disk runs, new statements could
-	 * have been inserted into the in-memory tree. Apply them to
-	 * the result.
+	 * have been prepared for the squashed key. We mustn't apply
+	 * them, because they may be rolled back, but we must adjust
+	 * their n_upserts counter so that they will get squashed by
+	 * vy_lsm_commit_upsert().
 	 */
 	struct vy_mem *mem = lsm->mem;
 	struct tree_mem_key tree_key = {
@@ -3625,108 +3622,20 @@ vy_squash_process(struct vy_squash *squash)
 		tuple_unref(result);
 		return 0;
 	}
-	/**
-	 * Algorithm of the squashing.
-	 * Assume, during building the non-UPSERT statement
-	 * 'result' in the mem some new UPSERTs were inserted, and
-	 * some of them were commited, while the other were just
-	 * prepared. And lets UPSERT_THRESHOLD to be equal to 3,
-	 * for example.
-	 *                    Mem
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = 1, n_ups = 0           |
-	 *    UPSERT, lsn = 2, n_ups = 1           | Commited
-	 *    UPSERT, lsn = 3, n_ups = 2           |
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = MAX,     n_ups = 3     |
-	 *    UPSERT, lsn = MAX + 1, n_ups = 4     | Prepared
-	 *    UPSERT, lsn = MAX + 2, n_ups = 5     |
-	 *    -------------------------------------+
-	 * In such a case the UPSERT statements with
-	 * lsns = {1, 2, 3} are squashed. But now the n_upsert
-	 * values in the prepared statements are not correct.
-	 * If we will not update values, then the
-	 * vy_lsm_commit_upsert will not be able to squash them.
-	 *
-	 * So after squashing it is necessary to update n_upsert
-	 * value in the prepared statements:
-	 *                    Mem
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = 1, n_ups = 0           |
-	 *    UPSERT, lsn = 2, n_ups = 1           | Commited
-	 *    REPLACE, lsn = 3                     |
-	 *    -------------------------------------+
-	 *    UPSERT, lsn = MAX,     n_ups = 0 !!! |
-	 *    UPSERT, lsn = MAX + 1, n_ups = 1 !!! | Prepared
-	 *    UPSERT, lsn = MAX + 2, n_ups = 2 !!! |
-	 *    -------------------------------------+
-	 */
 	vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
-	const struct tuple *mem_stmt;
-	int64_t stmt_lsn;
-	/*
-	 * According to the described algorithm, squash the
-	 * commited UPSERTs at first.
-	 */
+	uint8_t n_upserts = 0;
 	while (!vy_mem_tree_iterator_is_invalid(&mem_itr)) {
+		const struct tuple *mem_stmt;
 		mem_stmt = *vy_mem_tree_iterator_get_elem(&mem->tree, &mem_itr);
-		stmt_lsn = vy_stmt_lsn(mem_stmt);
-		if (vy_tuple_compare(result, mem_stmt, def) != 0)
-			break;
-		/**
-		 * Leave alone prepared statements; they will be handled
-		 * in vy_range_commit_stmt.
-		 */
-		if (stmt_lsn >= MAX_LSN)
+		if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
+		    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
 			break;
-		if (vy_stmt_type(mem_stmt) != IPROTO_UPSERT) {
-			/**
-			 * Somebody inserted non-upsert statement,
-			 * squashing is useless.
-			 */
-			tuple_unref(result);
-			return 0;
-		}
-		assert(lsm->index_id == 0);
-		struct tuple *applied = vy_apply_upsert(mem_stmt, result, def,
-							mem->format, true);
-		lsm->stat.upsert.applied++;
-		tuple_unref(result);
-		if (applied == NULL)
-			return -1;
-		result = applied;
-		/**
-		 * In normal cases we get a result with the same lsn as
-		 * in mem_stmt.
-		 * But if there are buggy upserts that do wrong things,
-		 * they are ignored and the result has lower lsn.
-		 * We should fix the lsn in any case to replace
-		 * exactly mem_stmt in general and the buggy upsert
-		 * in particular.
-		 */
-		vy_stmt_set_lsn(result, stmt_lsn);
+		assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
+		vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
+		if (n_upserts <= VY_UPSERT_THRESHOLD)
+			++n_upserts;
 		vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
 	}
-	/*
-	 * The second step of the algorithm above is updating of
-	 * n_upsert values of the prepared UPSERTs.
-	 */
-	if (stmt_lsn >= MAX_LSN) {
-		uint8_t n_upserts = 0;
-		while (!vy_mem_tree_iterator_is_invalid(&mem_itr)) {
-			mem_stmt = *vy_mem_tree_iterator_get_elem(&mem->tree,
-								  &mem_itr);
-			if (vy_tuple_compare(result, mem_stmt, def) != 0 ||
-			    vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
-				break;
-			assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
-			vy_stmt_set_n_upserts((struct tuple *)mem_stmt,
-					      n_upserts);
-			if (n_upserts <= VY_UPSERT_THRESHOLD)
-				++n_upserts;
-			vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
-		}
-	}
 	lsm->stat.upsert.squashed++;
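
For illustration only, the renumbering loop that remains after this patch
can be modelled outside of vinyl roughly as follows. This is a sketch, not
the code in vinyl.c: struct stmt, enum stmt_type and
renumber_prepared_upserts() are hypothetical stand-ins, the array is assumed
to hold the statements that follow the squashed result in oldest-to-newest
order, and the threshold is passed as a parameter instead of using
VY_UPSERT_THRESHOLD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vinyl statement types. */
enum stmt_type { STMT_REPLACE, STMT_UPSERT };

struct stmt {
	int key;		/* stand-in for the tuple key */
	enum stmt_type type;
	uint8_t n_upserts;
};

/*
 * Walk the statements that follow the squashed result and reset
 * their n_upserts counters to 0, 1, 2, ... The walk stops at the
 * first statement that has a different key or is not an UPSERT.
 * The running counter stops growing once it exceeds the threshold,
 * as in the loop from the patch.
 * Returns the number of statements that were renumbered.
 */
size_t
renumber_prepared_upserts(struct stmt *stmts, size_t count, int key,
			  uint8_t threshold)
{
	uint8_t n_upserts = 0;
	size_t i;
	for (i = 0; i < count; i++) {
		if (stmts[i].key != key || stmts[i].type != STMT_UPSERT)
			break;
		stmts[i].n_upserts = n_upserts;
		if (n_upserts <= threshold)
			++n_upserts;
	}
	return i;
}
```

With a threshold of 3 and three prepared UPSERTs whose counters were
3, 4 and 5 (the case drawn in the comment that the patch removes), the
counters become 0, 1 and 2, and a following statement for another key is
left untouched.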