Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: kostja@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: [PATCH 5/5] alter: call space_vtab::commit_alter after WAL write
Date: Tue,  3 Apr 2018 20:37:43 +0300	[thread overview]
Message-ID: <743af45528804b8d61e03bc5a8415015f713f26a.1522775293.git.vdavydov.dev@gmail.com> (raw)
In-Reply-To: <cover.1522775293.git.vdavydov.dev@gmail.com>
In-Reply-To: <cover.1522775293.git.vdavydov.dev@gmail.com>

space_vtab::commit_alter is implemented only by memtx, which uses it to
delete tuples stored in the space and reset bsize when the primary index
is dropped. Currently, it's called before WAL write and so this can
result in use-after-free in memtx if WAL write fails. To avoid that,
this patch moves invocation of this method after WAL write.

A few things to note about this patch:

 - Space bsize should still be updated before WAL write, when the space
   becomes visible. So now we set it in space_vtab::prepare_alter and
   reset it in drop_primary_key, along with memtx_space::replace.

 - Before this patch, space_vtab::drop_primary_key was not called on
   space truncation, because DropIndex::alter only called it if the
   primary key was absent in the new space, which isn't true in case of
   space truncation. So this patch fixed this check: now we check the id
   of the dropped index directly, and call drop_primary_key if it's 0.

 - To discriminate between primary index rebuild and space truncation in
   space_vtab::commit_alter and make the right decision about whether we
   need to delete tuples stored in the space or not, we use a version
   counter attached to memtx_space. The counter is bumped every time the
   primary index is dropped, so if it differs between the old and new
   spaces in commit_alter, it was either primary index drop or space
   truncation and hence we need to purge the space.

Closes #3289
---
 src/box/alter.cc         | 23 +++--------------
 src/box/memtx_space.c    | 16 +++++-------
 src/box/memtx_space.h    |  6 +++++
 test/box/errinj.result   | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 test/box/errinj.test.lua | 18 +++++++++++++
 5 files changed, 101 insertions(+), 28 deletions(-)

diff --git a/src/box/alter.cc b/src/box/alter.cc
index 174d53fa..06ef2923 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -729,6 +729,8 @@ alter_space_commit(struct trigger *trigger, void *event)
 		op->commit(alter, txn->signature);
 	}
 
+	space_commit_alter(alter->old_space, alter->new_space);
+
 	trigger_run_xc(&on_alter_space, alter->new_space);
 
 	alter->new_space = NULL; /* for alter_space_delete(). */
@@ -864,8 +866,6 @@ alter_space_do(struct txn *txn, struct alter_space *alter)
 	 * The new space is ready. Time to update the space
 	 * cache with it.
 	 */
-	space_commit_alter(alter->old_space, alter->new_space);
-
 	struct space *old_space = space_cache_replace(alter->new_space);
 	(void) old_space;
 	assert(old_space == alter->old_space);
@@ -1026,23 +1026,8 @@ DropIndex::alter_def(struct alter_space * /* alter */)
 void
 DropIndex::alter(struct alter_space *alter)
 {
-	/*
-	 * If it's not the primary key, nothing to do --
-	 * the dropped index didn't exist in the new space
-	 * definition, so does not exist in the created space.
-	 */
-	if (space_index(alter->new_space, 0) != NULL)
-		return;
-	/*
-	 * OK to drop the primary key. Inform the engine about it,
-	 * since it may have to reset handler->replace function,
-	 * so that:
-	 * - DML returns proper errors rather than crashes the
-	 *   program
-	 * - when a new primary key is finally added, the space
-	 *   can be put back online properly.
-	 */
-	space_drop_primary_key(alter->new_space);
+	if (old_index_def->iid == 0)
+		space_drop_primary_key(alter->new_space);
 }
 
 void
diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
index ec6f6db6..adc81430 100644
--- a/src/box/memtx_space.c
+++ b/src/box/memtx_space.c
@@ -736,6 +736,8 @@ memtx_space_drop_primary_key(struct space *space)
 {
 	struct memtx_space *memtx_space = (struct memtx_space *)space;
 	memtx_space->replace = memtx_space_replace_no_keys;
+	memtx_space->bsize = 0;
+	memtx_space->version++;
 }
 
 static void
@@ -844,6 +846,8 @@ memtx_space_prepare_alter(struct space *old_space, struct space *new_space)
 	struct memtx_space *old_memtx_space = (struct memtx_space *)old_space;
 	struct memtx_space *new_memtx_space = (struct memtx_space *)new_space;
 	new_memtx_space->replace = old_memtx_space->replace;
+	new_memtx_space->bsize = old_memtx_space->bsize;
+	new_memtx_space->version = old_memtx_space->version;
 	bool is_empty = old_space->index_count == 0 ||
 			index_size(old_space->index[0]) == 0;
 	return space_def_check_compatibility(old_space->def,
@@ -855,17 +859,10 @@ memtx_space_commit_alter(struct space *old_space, struct space *new_space)
 {
 	struct memtx_space *old_memtx_space = (struct memtx_space *)old_space;
 	struct memtx_space *new_memtx_space = (struct memtx_space *)new_space;
-	bool is_empty = new_space->index_count == 0 ||
-			index_size(new_space->index[0]) == 0;
 
-	/*
-	 * Delete all tuples when the last index is dropped
-	 * or the space is truncated.
-	 */
-	if (is_empty)
+	/* Delete all tuples if the primary index was dropped. */
+	if (old_memtx_space->version != new_memtx_space->version)
 		memtx_space_prune(old_space);
-	else
-		new_memtx_space->bsize = old_memtx_space->bsize;
 }
 
 /* }}} DDL */
@@ -937,6 +934,7 @@ memtx_space_new(struct memtx_engine *memtx,
 	tuple_format_unref(format);
 
 	memtx_space->bsize = 0;
+	memtx_space->version = 0;
 	memtx_space->replace = memtx_space_replace_no_keys;
 	return (struct space *)memtx_space;
 }
diff --git a/src/box/memtx_space.h b/src/box/memtx_space.h
index 26d33349..0c4176bc 100644
--- a/src/box/memtx_space.h
+++ b/src/box/memtx_space.h
@@ -43,6 +43,12 @@ struct memtx_space {
 	/* Number of bytes used in memory by tuples in the space. */
 	size_t bsize;
 	/**
+	 * Version of data stored in the space. It is bumped every
+	 * time the primary index is dropped. We use it to detect
+	 * if we need to delete tuples when the space is altered.
+	 */
+	int version;
+	/**
 	 * A pointer to replace function, set to different values
 	 * at different stages of recovery.
 	 */
diff --git a/test/box/errinj.result b/test/box/errinj.result
index 8e4d5742..269de663 100644
--- a/test/box/errinj.result
+++ b/test/box/errinj.result
@@ -1093,3 +1093,69 @@ s:drop()
 box.schema.user.revoke('guest', 'read,write,execute','universe')
 ---
 ...
+--
+-- gh-3289: drop/truncate leaves the space in inconsistent
+-- state if WAL write fails.
+--
+s = box.schema.space.create('test')
+---
+...
+_ = s:create_index('pk')
+---
+...
+for i = 1, 10 do s:replace{i} end
+---
+...
+errinj.set('ERRINJ_WAL_IO', true)
+---
+- ok
+...
+s:drop()
+---
+- error: Failed to write to disk
+...
+s:truncate()
+---
+- error: Failed to write to disk
+...
+s:drop()
+---
+- error: Failed to write to disk
+...
+s:truncate()
+---
+- error: Failed to write to disk
+...
+errinj.set('ERRINJ_WAL_IO', false)
+---
+- ok
+...
+for i = 1, 10 do s:replace{i + 10} end
+---
+...
+s:select()
+---
+- - [1]
+  - [2]
+  - [3]
+  - [4]
+  - [5]
+  - [6]
+  - [7]
+  - [8]
+  - [9]
+  - [10]
+  - [11]
+  - [12]
+  - [13]
+  - [14]
+  - [15]
+  - [16]
+  - [17]
+  - [18]
+  - [19]
+  - [20]
+...
+s:drop()
+---
+...
diff --git a/test/box/errinj.test.lua b/test/box/errinj.test.lua
index d97cd81f..c12f61a4 100644
--- a/test/box/errinj.test.lua
+++ b/test/box/errinj.test.lua
@@ -366,3 +366,21 @@ errinj.set("ERRINJ_IPROTO_TX_DELAY", false)
 
 s:drop()
 box.schema.user.revoke('guest', 'read,write,execute','universe')
+
+--
+-- gh-3289: drop/truncate leaves the space in inconsistent
+-- state if WAL write fails.
+--
+s = box.schema.space.create('test')
+_ = s:create_index('pk')
+for i = 1, 10 do s:replace{i} end
+errinj.set('ERRINJ_WAL_IO', true)
+s:drop()
+s:truncate()
+s:drop()
+s:truncate()
+errinj.set('ERRINJ_WAL_IO', false)
+for i = 1, 10 do s:replace{i + 10} end
+s:select()
+
+s:drop()
-- 
2.11.0

  parent reply	other threads:[~2018-04-03 17:37 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-03 17:37 [PATCH 0/5] alter: fix WAL error handling Vladimir Davydov
2018-04-03 17:37 ` [PATCH 1/5] memtx: rtree: remove pointless index_vtab::begin_build implementation Vladimir Davydov
2018-04-05 20:25   ` Konstantin Osipov
2018-04-03 17:37 ` [PATCH 2/5] memtx: don't call begin_buid and end_build for new pk after recovery Vladimir Davydov
2018-04-03 17:37 ` [PATCH 3/5] vinyl: use disk_format in vy_run_rebuild_index Vladimir Davydov
2018-04-05 20:25   ` Konstantin Osipov
2018-04-03 17:37 ` [PATCH 4/5] vinyl: do not use space_vtab::commit_alter for preparing new indexes Vladimir Davydov
2018-04-03 17:37 ` Vladimir Davydov [this message]
2018-04-05 20:37   ` [PATCH 5/5] alter: call space_vtab::commit_alter after WAL write Konstantin Osipov
2018-04-06 10:59     ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=743af45528804b8d61e03bc5a8415015f713f26a.1522775293.git.vdavydov.dev@gmail.com \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [PATCH 5/5] alter: call space_vtab::commit_alter after WAL write' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox