Tarantool development patches archive
 help / color / mirror / Atom feed
From: Aleksandr Lyapunov <alyapunov@tarantool.org>
To: Nikita Pettik <korablev@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Cc: v.shpilevoy@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v3 2/3] vinyl: clean-up unprocessed read views in *_build_read_views()
Date: Wed, 6 May 2020 12:56:52 +0300	[thread overview]
Message-ID: <fc6693b3-bd4a-cc89-3cc2-5395a2969e6e@tarantool.org> (raw)
In-Reply-To: <1a626bac0c6d0d79934f1e144bb92709f4bd4963.1587948306.git.korablev@tarantool.org>

This will work, but I think we have to make the patch better.

The cause of the issue is that the state of an iterator is not obvious. 
Is the history
collected or not? Is it merged already? Must it be cleaned up? Even the 
cleanup
is split to two stages - tuple unrefs and region truncate.

Now you propose to call  vy_read_view_stmt_destroy from several places 
that also
unrefs tuples from read view (not only from its history). This makes  
the state
of an iterator even more unpredictable: when vy_write_iterator_stop is 
called
sometimes read views are already cleaned up, sometimes are not.

Let's fix this mess.
1. History is collected (rallocated and refed) only in 
vy_write_iterator_build_history.
2. The history must be destructed after call to 
vy_write_iterator_build_history in any case,
whether an error occurred or not (comment about it?).
3. There must be a method for history destruction with two steps: unref 
all and rtruncate.
4. Only vy_write_iterator_build_read_views must call this method. 
vy_write_iterator_stop
or anybody else shouldn't be concerned about history.
5. vy_write_iterator_build_read_views must not unref resulting tuples 
(entry.stmt) in case of error,
vy_write_iterator_stop must do it.


On 4/27/20 3:52 AM, Nikita Pettik wrote:
> vy_write_iterator->read_views[i].history objects are allocated on
> region (see vy_write_iterator_push_rv()) during building history of the
> given key. However, in case of fail of vy_write_iterator_build_history()
> region is truncated but pointers to vy_write_history objects are not
> nullified. As a result, they may be accessed (for instance while
> finalizing write_iterator object in  vy_write_iterator_stop) which in
> turn may lead to crash, segfaul or disk formatting. The same may happen
> if vy_read_view_merge() fails during processing of read view array.
> Let's clean-up those objects in case of error takes place.
>
> Part of #4864
> ---
>   src/box/vy_write_iterator.c                   |  23 +++-
>   src/errinj.h                                  |   1 +
>   test/box/errinj.result                        |   1 +
>   .../gh-4864-stmt-alloc-fail-compact.result    | 125 ++++++++++++++++++
>   .../gh-4864-stmt-alloc-fail-compact.test.lua  |  53 ++++++++
>   5 files changed, 200 insertions(+), 3 deletions(-)
>
> diff --git a/src/box/vy_write_iterator.c b/src/box/vy_write_iterator.c
> index 7a6a20627..efb88d1ae 100644
> --- a/src/box/vy_write_iterator.c
> +++ b/src/box/vy_write_iterator.c
> @@ -790,8 +790,11 @@ next_lsn:
>   	 * statement around if this is major compaction, because
>   	 * there's no tuple it could overwrite.
>   	 */
> -	if (rc == 0 && stream->is_last_level &&
> -	    stream->deferred_delete_stmt != NULL) {
> +	if (rc != 0) {
> +		for (int i = 0; i < stream->rv_count; ++i)
> +			vy_read_view_stmt_destroy(&stream->read_views[i]);
> +	} else if (stream->is_last_level &&
> +		   stream->deferred_delete_stmt != NULL) {
>   		vy_stmt_unref_if_possible(stream->deferred_delete_stmt);
>   		stream->deferred_delete_stmt = NULL;
>   	}
> @@ -834,6 +837,15 @@ vy_read_view_merge(struct vy_write_iterator *stream, struct tuple *hint,
>   		rv->history = NULL;
>   		return 0;
>   	}
> +#ifndef NDEBUG
> +	struct errinj *inj =
> +		errinj(ERRINJ_VY_READ_VIEW_MERGE_FAIL, ERRINJ_BOOL);
> +	if (inj != NULL && inj->bparam) {
> +		inj->bparam = false;
> +		diag_set(OutOfMemory, 666, "malloc", "struct vy_stmt");
> +		return -1;
> +	}
> +#endif
>   	/*
>   	 * Two possible hints to remove the current UPSERT.
>   	 * 1. If the stream is working on the last level, we
> @@ -983,8 +995,13 @@ vy_write_iterator_build_read_views(struct vy_write_iterator *stream, int *count)
>   		if (rv->history == NULL)
>   			continue;
>   		if (vy_read_view_merge(stream, hint, rv,
> -				       is_first_insert) != 0)
> +				       is_first_insert) != 0) {
> +			while (rv >= &stream->read_views[0]) {
> +				vy_read_view_stmt_destroy(rv);
> +				rv--;
> +			}
>   			goto error;
> +		}
>   		assert(rv->history == NULL);
>   		if (rv->tuple == NULL)
>   			continue;
> diff --git a/src/errinj.h b/src/errinj.h
> index 383dafcb5..b7550bb5e 100644
> --- a/src/errinj.h
> +++ b/src/errinj.h
> @@ -128,6 +128,7 @@ struct errinj {
>   	_(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT, {.iparam = 0}) \
>   	_(ERRINJ_INDEX_RESERVE, ERRINJ_BOOL, {.bparam = false})\
>   	_(ERRINJ_VY_STMT_ALLOC, ERRINJ_INT, {.iparam = -1})\
> +	_(ERRINJ_VY_READ_VIEW_MERGE_FAIL, ERRINJ_BOOL, {.bparam = false})\
>   
>   ENUM0(errinj_id, ERRINJ_LIST);
>   extern struct errinj errinjs[];
> diff --git a/test/box/errinj.result b/test/box/errinj.result
> index efbb4e85e..e1b9fbe2a 100644
> --- a/test/box/errinj.result
> +++ b/test/box/errinj.result
> @@ -68,6 +68,7 @@ evals
>     - ERRINJ_VY_READ_PAGE: false
>     - ERRINJ_VY_READ_PAGE_DELAY: false
>     - ERRINJ_VY_READ_PAGE_TIMEOUT: 0
> +  - ERRINJ_VY_READ_VIEW_MERGE_FAIL: false
>     - ERRINJ_VY_RUN_DISCARD: false
>     - ERRINJ_VY_RUN_FILE_RENAME: false
>     - ERRINJ_VY_RUN_WRITE: false
> diff --git a/test/vinyl/gh-4864-stmt-alloc-fail-compact.result b/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> index 1afc02bef..af116a4b4 100644
> --- a/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> +++ b/test/vinyl/gh-4864-stmt-alloc-fail-compact.result
> @@ -121,3 +121,128 @@ errinj.set('ERRINJ_VY_STMT_ALLOC', -1)
>   s:drop()
>    | ---
>    | ...
> +
> +-- All the same except for delayed vy_stmt_alloc() fail.
> +-- Re-create space for the sake of test purity.
> +--
> +s = box.schema.space.create('test', {engine = 'vinyl'})
> + | ---
> + | ...
> +_ = s:create_index('pk', {run_count_per_level = 100, page_size = 128, range_size = 1024})
> + | ---
> + | ...
> +
> +dump(true)
> + | ---
> + | ...
> +dump()
> + | ---
> + | ...
> +
> +compact()
> + | ---
> + | ...
> +
> +dump()
> + | ---
> + | ...
> +
> +errinj = box.error.injection
> + | ---
> + | ...
> +errinj.set('ERRINJ_VY_STMT_ALLOC', 5)
> + | ---
> + | - ok
> + | ...
> +-- Compaction of first range fails, so it is re-scheduled and
> +-- then successfully finishes at the second attempt.
> +--
> +compact()
> + | ---
> + | ...
> +assert(s.index.pk:stat().range_count == 2)
> + | ---
> + | - true
> + | ...
> +assert(s.index.pk:stat().run_count == 2)
> + | ---
> + | - true
> + | ...
> +assert(errinj.get('ERRINJ_VY_STMT_ALLOC') == -1)
> + | ---
> + | - true
> + | ...
> +errinj.set('ERRINJ_VY_STMT_ALLOC', -1)
> + | ---
> + | - ok
> + | ...
> +-- Unthrottle scheduler to allow next dump.
> +--
> +errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 0.0001)
> + | ---
> + | - ok
> + | ...
> +
> +s:drop()
> + | ---
> + | ...
> +
> +-- Once again but test that clean-up is made in case
> +-- vy_read_view_merge() fails.
> +--
> +s = box.schema.space.create('test', {engine = 'vinyl'})
> + | ---
> + | ...
> +_ = s:create_index('pk', {run_count_per_level = 100, page_size = 128, range_size = 1024})
> + | ---
> + | ...
> +
> +dump(true)
> + | ---
> + | ...
> +dump()
> + | ---
> + | ...
> +
> +compact()
> + | ---
> + | ...
> +
> +dump()
> + | ---
> + | ...
> +
> +errinj = box.error.injection
> + | ---
> + | ...
> +errinj.set('ERRINJ_VY_READ_VIEW_MERGE_FAIL', true)
> + | ---
> + | - ok
> + | ...
> +compact()
> + | ---
> + | ...
> +assert(s.index.pk:stat().range_count == 2)
> + | ---
> + | - true
> + | ...
> +assert(s.index.pk:stat().run_count == 2)
> + | ---
> + | - true
> + | ...
> +assert(errinj.get('ERRINJ_VY_READ_VIEW_MERGE_FAIL') == false)
> + | ---
> + | - true
> + | ...
> +errinj.set('ERRINJ_VY_READ_VIEW_MERGE_FAIL', false)
> + | ---
> + | - ok
> + | ...
> +s:drop()
> + | ---
> + | ...
> +
> +errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 0)
> + | ---
> + | - ok
> + | ...
> diff --git a/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua b/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
> index bf70bdf75..a68c73d32 100644
> --- a/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
> +++ b/test/vinyl/gh-4864-stmt-alloc-fail-compact.test.lua
> @@ -53,3 +53,56 @@ assert(errinj.get('ERRINJ_VY_STMT_ALLOC') == -1)
>   errinj.set('ERRINJ_VY_STMT_ALLOC', -1)
>   
>   s:drop()
> +
> +-- All the same except for delayed vy_stmt_alloc() fail.
> +-- Re-create space for the sake of test purity.
> +--
> +s = box.schema.space.create('test', {engine = 'vinyl'})
> +_ = s:create_index('pk', {run_count_per_level = 100, page_size = 128, range_size = 1024})
> +
> +dump(true)
> +dump()
> +
> +compact()
> +
> +dump()
> +
> +errinj = box.error.injection
> +errinj.set('ERRINJ_VY_STMT_ALLOC', 5)
> +-- Compaction of first range fails, so it is re-scheduled and
> +-- then successfully finishes at the second attempt.
> +--
> +compact()
> +assert(s.index.pk:stat().range_count == 2)
> +assert(s.index.pk:stat().run_count == 2)
> +assert(errinj.get('ERRINJ_VY_STMT_ALLOC') == -1)
> +errinj.set('ERRINJ_VY_STMT_ALLOC', -1)
> +-- Unthrottle scheduler to allow next dump.
> +--
> +errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 0.0001)
> +
> +s:drop()
> +
> +-- Once again but test that clean-up is made in case
> +-- vy_read_view_merge() fails.
> +--
> +s = box.schema.space.create('test', {engine = 'vinyl'})
> +_ = s:create_index('pk', {run_count_per_level = 100, page_size = 128, range_size = 1024})
> +
> +dump(true)
> +dump()
> +
> +compact()
> +
> +dump()
> +
> +errinj = box.error.injection
> +errinj.set('ERRINJ_VY_READ_VIEW_MERGE_FAIL', true)
> +compact()
> +assert(s.index.pk:stat().range_count == 2)
> +assert(s.index.pk:stat().run_count == 2)
> +assert(errinj.get('ERRINJ_VY_READ_VIEW_MERGE_FAIL') == false)
> +errinj.set('ERRINJ_VY_READ_VIEW_MERGE_FAIL', false)
> +s:drop()
> +
> +errinj.set("ERRINJ_VY_SCHED_TIMEOUT", 0)

  reply	other threads:[~2020-05-06  9:56 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-27  0:52 [Tarantool-patches] [PATCH v3 0/3] vinyl: fix uninitialized memory accesses Nikita Pettik
2020-04-27  0:52 ` [Tarantool-patches] [PATCH v3 1/3] vinyl: init all vars before cleanup in vy_lsm_split_range() Nikita Pettik
2020-05-06  9:04   ` Aleksandr Lyapunov
2020-05-06 13:12     ` Nikita Pettik
2020-05-06 17:52       ` Aleksandr Lyapunov
2020-05-07  1:09         ` Nikita Pettik
2020-04-27  0:52 ` [Tarantool-patches] [PATCH v3 2/3] vinyl: clean-up unprocessed read views in *_build_read_views() Nikita Pettik
2020-05-06  9:56   ` Aleksandr Lyapunov [this message]
2020-05-07  0:29     ` Nikita Pettik
2020-05-07  8:44       ` Aleksandr Lyapunov
2020-05-07 12:28         ` Nikita Pettik
2020-04-27  0:52 ` [Tarantool-patches] [PATCH v3 3/3] vinyl: clean-up write iterator if vy_task_write_run() fails Nikita Pettik
2020-05-01  0:55   ` Vladislav Shpilevoy
2020-05-03  9:22     ` Konstantin Osipov
2020-05-07  0:38     ` Nikita Pettik
2020-05-06 10:37   ` Aleksandr Lyapunov
2020-05-07  0:36     ` Nikita Pettik
2020-05-07  7:53       ` Aleksandr Lyapunov
2020-05-07 22:16         ` Nikita Pettik
2020-05-08 16:29           ` Aleksandr Lyapunov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fc6693b3-bd4a-cc89-3cc2-5395a2969e6e@tarantool.org \
    --to=alyapunov@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v3 2/3] vinyl: clean-up unprocessed read views in *_build_read_views()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox