[Tarantool-patches] [PATCH 0/2] Validate result of upserts squash & skip invalid upserts which can't be applied

Wed May 27 00:33:49 MSK 2020

Thanks for the patch, nice bug you found!

>> - vy_tx_write() - here the upsert is skipped in case of any error, but
>>   what if it was not a ClientError?
> 
> Originally, there was a comment:
> 
> /*
>  * Ignore a memory error, because it is
>  * not critical to apply the optimization.
>  */
> 
> So I decided to keep this logic. I guess returning -1 in case of OOM
> is also OK. So I can add check here as well. Let me know.

Aha, this is an optimization. I didn't notice that the tuple ends up
in the tree anyway.

> Btw, I've found another one bug concerned with upserts. It is revealed
> since upserts can turn out to be invalid not only due to OOM.
> Here's the patch (see also as an attachment). It is located on the top
> of np/gh-1622-skip-invalid-upserts branch.
> 
> Author: Nikita Pettik <korablev at tarantool.org>
> Date:   Fri May 22 04:38:45 2020 +0300
> 
>     vinyl: handle invalid upserts in background squash process
>     
>     When L0 collects more than certain number of upserts (to be more precise
>     VY_UPSERT_THRESHOLD == 128), background process is launched to squash
>     them. During this squash, new statements could be inserted into L0.
>     Until committed prepared statements feature lsn = MAX_LSN as a special
>     marker.

Did you mean 'until' -> 'while'?

>     It is assumed that all upserts will be successfully squashed. As
>     a result statements with given key in L0 will be able to have only
>     MAX_LSN lsn (except for the squash result).

Honestly, I didn't understand anything from this. I got more info from
reading squashing code source. Probably this needs some rewording. For
example, in the second sentence say 'After squash it is expected all
newer statements on top of the squashed tuple have MAX_LSN, i.e. they
are prepared but not committed. Otherwise they would end up being squashed
too.'

>     However, it is not so if at
>     least one upsert is failed to be applied: it will get stack in L0 until

stack -> stuck? The same in the code comments.

>     dump and it has normal (i.e. < MAX_LSN) lsn. So the assertion which>     verifies that lsn of upsert in L0 is greater than MAX_LSN is wrong.
>     Moreover, if we execute enough (> 128) invalid upserts, they will
>     prevent from squashing normal upserts. So let's skip invalid upserts and
>     don't account them while calculating number of upserts residing in L0.
>     
>     Follow-up #1622
> 
> diff --git a/src/box/vinyl.c b/src/box/vinyl.c
> index 55a4c8fb7..b39e3b9d8 100644
> --- a/src/box/vinyl.c
> +++ b/src/box/vinyl.c
> @@ -3539,10 +3539,21 @@ vy_squash_process(struct vy_squash *squash)
>                 if (vy_tuple_compare(result, mem_stmt, lsm->cmp_def) != 0 ||
>                     vy_stmt_type(mem_stmt) != IPROTO_UPSERT)
>                         break;
> -               assert(vy_stmt_lsn(mem_stmt) >= MAX_LSN);
> -               vy_stmt_set_n_upserts((struct tuple *)mem_stmt, n_upserts);
> -               if (n_upserts <= VY_UPSERT_THRESHOLD)
> -                       ++n_upserts;
> +               /*
> +                * Upsert was not applied in vy_history_apply(),
> +                * but it still resides in tree memory.

Could we remove these upserts from memory right away? The code below was not
simple beforehand, and now looks even more complex. I am looking for a way
to simplify it anyhow.

Could vy_tx_write() fix help? If you would skip them there instead of writing
them to the tree anyway. Or is it something totally not related?

> +                * Ignore
> +                * such statements and do not account them while
> +                * counting upserts. Otherwise invalid upserts will
> +                * get stack and won't allow other upserts to squash.
> +                */
> +               if (vy_stmt_lsn(mem_stmt) >= MAX_LSN) {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt,
> +                                             n_upserts);
> +                       if (n_upserts <= VY_UPSERT_THRESHOLD)
> +                               ++n_upserts;
> +               } else {
> +                       vy_stmt_set_n_upserts((struct tuple *) mem_stmt, 0);
> +               }
>                 vy_mem_tree_iterator_prev(&mem->tree, &mem_itr);
>         }