From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [PATCH] vinyl: fix recovery after aborted index creation
Date: Thu, 21 Mar 2019 22:11:02 +0300
Message-Id: <38fc0bbe800bb3e876e501ab729bb5c1131f173a.1553195373.git.vdavydov.dev@gmail.com>
To: tarantool-patches@freelists.org
List-ID:

There's a bug in the code building index hash on recovery: we replace
a dropped index with any newer index, even an incomplete one. Apparently,
this is wrong, because a dropped index may have been dropped during
final recovery and hence is still needed for initial recovery. If we
replace it with an incomplete index in the index hash, initial recovery
will fail with

  ER_INVALID_VYLOG_FILE: Invalid VYLOG file: LSM tree 512/1 not found

(see vy_lsm_recover()).

Fix this problem by checking create_lsn of the index that is going to
replace a dropped one - if it's negative, we must link it to the dropped
index via vy_lsm_recovery_info->prepared instead of inserting it into
the hash directly.

Closes #4066
---
https://github.com/tarantool/tarantool/issues/4066

 src/box/vy_log.c                 |  5 ++--
 test/vinyl/errinj_vylog.result   | 55 ++++++++++++++++++++++++++++++++++++++++
 test/vinyl/errinj_vylog.test.lua | 28 ++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/src/box/vy_log.c b/src/box/vy_log.c
index 06ab7247..85b61a84 100644
--- a/src/box/vy_log.c
+++ b/src/box/vy_log.c
@@ -2141,9 +2141,10 @@ vy_recovery_build_index_id_hash(struct vy_recovery *recovery)
 		/*
 		 * If there's no LSM tree for these space_id/index_id
 		 * or it was dropped, simply replace it with the latest
-		 * LSM tree version.
+		 * committed LSM tree version.
 		 */
-		if (hashed_lsm == NULL || hashed_lsm->drop_lsn >= 0) {
+		if (hashed_lsm == NULL ||
+		    (hashed_lsm->drop_lsn >= 0 && lsm->create_lsn >= 0)) {
 			struct mh_i64ptr_node_t node;
 			node.key = vy_recovery_index_id_hash(space_id, index_id);
 			node.val = lsm;
diff --git a/test/vinyl/errinj_vylog.result b/test/vinyl/errinj_vylog.result
index 0e3b79c4..06cc6818 100644
--- a/test/vinyl/errinj_vylog.result
+++ b/test/vinyl/errinj_vylog.result
@@ -368,3 +368,58 @@ s.index.sk:select()
 s:drop()
 ---
 ...
+--
+-- gh-4066: recovery error if an instance is restarted while
+-- building an index and there's an index with the same id in
+-- the snapshot.
+--
+fiber = require('fiber')
+---
+...
+s = box.schema.space.create('test', {engine = 'vinyl'})
+---
+...
+_ = s:create_index('pk')
+---
+...
+_ = s:create_index('sk', {parts = {2, 'unsigned'}})
+---
+...
+s.index[1] ~= nil
+---
+- true
+...
+s:replace{1, 2}
+---
+- [1, 2]
+...
+box.snapshot()
+---
+- ok
+...
+s.index.sk:drop()
+---
+...
+-- Log index creation, but never finish building it due to an error injection.
+box.error.injection.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 9000)
+---
+- ok
+...
+_ = fiber.create(function() s:create_index('sk', {parts = {2, 'unsigned'}}) end)
+---
+...
+fiber.sleep(0.01)
+---
+...
+-- Should ignore the incomplete index on recovery.
+test_run:cmd('restart server default')
+s = box.space.test
+---
+...
+s.index[1] == nil
+---
+- true
+...
+s:drop()
+---
+...
diff --git a/test/vinyl/errinj_vylog.test.lua b/test/vinyl/errinj_vylog.test.lua
index ce9e12e5..2936f879 100644
--- a/test/vinyl/errinj_vylog.test.lua
+++ b/test/vinyl/errinj_vylog.test.lua
@@ -177,3 +177,31 @@ s.index.pk:select()
 s.index.sk:select()
 
 s:drop()
+
+--
+-- gh-4066: recovery error if an instance is restarted while
+-- building an index and there's an index with the same id in
+-- the snapshot.
+--
+fiber = require('fiber')
+
+s = box.schema.space.create('test', {engine = 'vinyl'})
+_ = s:create_index('pk')
+_ = s:create_index('sk', {parts = {2, 'unsigned'}})
+s.index[1] ~= nil
+s:replace{1, 2}
+box.snapshot()
+
+s.index.sk:drop()
+
+-- Log index creation, but never finish building it due to an error injection.
+box.error.injection.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 9000)
+_ = fiber.create(function() s:create_index('sk', {parts = {2, 'unsigned'}}) end)
+fiber.sleep(0.01)
+
+-- Should ignore the incomplete index on recovery.
+test_run:cmd('restart server default')
+
+s = box.space.test
+s.index[1] == nil
+s:drop()
-- 
2.11.0
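
For reviewers who want to check the new condition in isolation, here is a minimal standalone sketch of the decision it makes. It is not the code from vy_log.c: `struct lsm_info` and `hash_update()` are hypothetical names made up for illustration, only the `create_lsn`, `drop_lsn` and `prepared` fields mirror what the patch touches, and the fallback branch (always link via `prepared`) is an assumption based on the commit message rather than on the surrounding source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct vy_lsm_recovery_info. */
struct lsm_info {
	int64_t create_lsn;        /* negative: creation logged, never committed */
	int64_t drop_lsn;          /* negative: not dropped */
	struct lsm_info *prepared; /* incomplete successor linked to this entry */
};

/*
 * Model of the patched check. Given the entry currently hashed for
 * a space_id/index_id pair (or NULL) and a newer entry with the same
 * ids, return the entry that should be in the hash afterwards.
 */
static struct lsm_info *
hash_update(struct lsm_info *hashed, struct lsm_info *lsm)
{
	assert(lsm != NULL);
	if (hashed == NULL ||
	    (hashed->drop_lsn >= 0 && lsm->create_lsn >= 0)) {
		/* Nothing hashed yet, or a dropped entry is superseded
		 * by a committed one: the newer entry goes into the hash. */
		return lsm;
	}
	/* Assumption: otherwise the newer entry is incomplete, so it is
	 * only linked to the hashed one, which stays in the hash. */
	hashed->prepared = lsm;
	return hashed;
}
```

With this model, a dropped index followed by an incomplete one with the same id stays hashed and merely gets its `prepared` pointer set, which is the case the gh-4066 test exercises. A dropped index followed by a committed one is replaced, as before the patch.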