From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <v.shpilevoy@tarantool.org>
Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
 bits)) (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id B65BB469710
 for <tarantool-patches@dev.tarantool.org>;
 Mon,  1 Jun 2020 20:40:08 +0300 (MSK)
References: <50a25fbae907f1b0d5406fb2e78d40d3e42a8a8d.1591029888.git.korablev@tarantool.org>
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Message-ID: <4e9d5184-1b3a-d4de-e040-a3edb6ca38cd@tarantool.org>
Date: Mon, 1 Jun 2020 19:40:06 +0200
MIME-Version: 1.0
In-Reply-To: <50a25fbae907f1b0d5406fb2e78d40d3e42a8a8d.1591029888.git.korablev@tarantool.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH v2] vinyl: restart read iterator in
 case of rolled back WAL
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Nikita Pettik <korablev@tarantool.org>, tarantool-patches@dev.tarantool.org

Hi! Thanks for the patch!

On 01/06/2020 18:46, Nikita Pettik wrote:
> Data read in vinyl is known to yield in case of disc access. So it opens
> a window for modifications of in-memory level. Imagine following scenario:
> right before data selection tuple is inserted into space. It passes first
> stage of commit procedure, i.e. it is prepared to be committed but still
> is not yet reached WAL.  Meanwhile iterator is starting to read the same key.
> At this moment prepared statement is already inserted to in-memory tree
> ergo visible to read iterator. So, read iterator fetches this statement
> and proceeds to disk scan.  In turn, disk scan yields and in this moment
> WAL fails to write statement on disk. Next, two cases are possible:
> 1. WAL thread has enough time to complete rollback procedure.
> 2. WAL thread fails to finish rollback in this time gap.
Seems like the second test case is not really different from no rollback
at all. Because basically the WAL thread execution is still in the same
function, wal_write_to_disk(). Both ERRINJ_WAL_DELAY and ERRINJ_RELAY_FASTER_THAN_TX
give the same effect, if the transaction is rolled back. If it is committed,
they give also the same effect from TX<->WAL interaction point of view. The
difference only appears in TX<->WAL<->Relay threads.

So probably you could drop this case from the patch. It would simplify
things. But up to you and Alexander L.

Talking of the tests, I tried running multiple instances of the new
test:

    python test-run.py gh-3395-read-prepared-un gh-3395-read-prepared-un gh-3395-read-prepared-un gh-3395-read-prepared-un gh-3395-read-prepared-un gh-3395-read-prepared-un

And it fails sometimes. So it is flaky. Better fix it before push.

> diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c
> index dcc9a0873..62a8722d9 100644
> --- a/src/box/vy_read_iterator.c
> +++ b/src/box/vy_read_iterator.c
> @@ -382,6 +382,9 @@ vy_read_iterator_scan_disk(struct vy_read_iterator *itr, uint32_t disk_src,
>   * Restore the position of the active in-memory tree iterator
>   * after a yield caused by a disk read and update 'next'
>   * if necessary.
> + * @retval -1 In case of error (e.g. OOM);
> + * @retval 0 Successful execution;
> + * @retval 1 Restart of advance_iterator is required.

What is advance_iterator?

>   */
>  static NODISCARD int
>  vy_read_iterator_restore_mem(struct vy_read_iterator *itr,