Tarantool development patches archive
From: Konstantin Osipov <kostja@tarantool.org>
To: tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 1/3] vinyl: fix secondary index divergence on update
Date: Sat, 25 May 2019 09:11:57 +0300
Message-ID: <20190525061157.GB14501@atlas>
In-Reply-To: <8e4175c3f3b857097ccfd264608b046b71635e91.1558733443.git.vdavydov.dev@gmail.com>

* Vladimir Davydov <vdavydov.dev@gmail.com> [19/05/25 06:41]:

Vladimir, 

could you clarify your comments a bit?

> If an UPDATE request doesn't touch key parts of a secondary index, we
> don't need to write it to the index memory level or dump it to disk, as

We don't have a separate memtable for secondary keys. Better say
"we don't need to re-index it in the in-memory secondary index".

> this would only increase IO load. Historically, we use column mask set
> by the UPDATE operation to skip secondary indexes that are not affected
> by the operation on commit. However, there's a problem here: the column
> mask isn't precise - it may have a bit set even if the corresponding
> column doesn't really get updated, e.g. consider {'+', 2, 0}.

The column does get updated, but the update doesn't change its
value. Now I see what you mean.
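
A minimal sketch of the imprecision discussed above, written as a toy
Python model rather than Tarantool code. The helper names
(update_column_mask, apply_update) and the one-bit-per-field layout are
assumptions made for illustration only; the point is just that a mask
built from the update operations reports field 2 as touched even though
{'+', 2, 0} leaves its value unchanged.

```python
def update_column_mask(ops):
    """Hypothetical helper: one bit per field named in an update operation."""
    mask = 0
    for _op, field_no, _arg in ops:
        mask |= 1 << (field_no - 1)  # field numbers are 1-based, as in the Lua example
    return mask


def apply_update(tup, ops):
    """Hypothetical helper: apply '+' and '=' operations to a tuple."""
    new = list(tup)
    for op, field_no, arg in ops:
        if op == '+':
            new[field_no - 1] += arg
        elif op == '=':
            new[field_no - 1] = arg
        else:
            raise ValueError("unsupported operation: %r" % (op,))
    return tuple(new)


old = (1, 1, 1)
ops = [('+', 2, 0)]            # models {'+', 2, 0}
new = apply_update(old, ops)

sk_key_mask = 1 << (2 - 1)     # the secondary index is built over field 2
mask_says_touched = bool(update_column_mask(ops) & sk_key_mask)
key_really_changed = old[1] != new[1]

print(mask_says_touched, key_really_changed)
```

The mask can only answer "was this field named in an operation", not
"did the key value change", which is why it over-reports here.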


> Not taking
> this into account may result in appearance of phantom tuples on disk as
> the write iterator assumes that statements that have no effect aren't
> written to secondary indexes (this is needed to apply INSERT+DELETE
> "annihilation" optimization). We fixed that by clearing column mask bits
> in vy_tx_set in case we detect that the key isn't changed, for more
> details see #3607 and commit e72867cb9169 ("vinyl: fix appearance of
> phantom tuple in secondary index after update"). It was rather an ugly
> hack, but it worked.
> 
> However, it turned out that apart from looking hackish this code has
> a nasty bug that may lead to tuples missing from secondary indexes.
> Consider the following example:
> 
>   s = box.schema.space.create('test', {engine = 'vinyl'})
>   s:create_index('pk')
>   s:create_index('sk', {parts = {2, 'unsigned'}})
>   s:insert{1, 1, 1}
> 
>   box.begin()
>   s:update(1, {{'=', 2, 2}})
>   s:update(1, {{'=', 3, 2}})
>   box.commit()
> 
> The first update operation writes DELETE{1,1} and REPLACE{2,1} to the
> secondary index write set. The second update replaces REPLACE{2,1} with
> DELETE{2,1} and then with REPLACE{2,1}. When replacing DELETE{2,1} with
> REPLACE{2,1} in the write set, we assume that the update doesn't modify
> secondary index key parts and clear the column mask so as not to commit
> a pointless request, see vy_tx_set. As a result, we skip the first
> update too and get key {2,1} missing from the secondary index.
> 
> Actually, it was a dumb idea to use column mask to skip statements in
> the first place, as there's a much easier way to filter out statements
> that have no effect for secondary indexes. The thing is every DELETE
> statement inserted into a secondary index write set acts as a "single
> DELETE", i.e. there's exactly one older statement it is supposed to
> purge. This is because, in contrast to the primary index, we don't write
> DELETE statements blindly - we always look up the tuple overwritten in
> the primary index first. This means that REPLACE+DELETE for the same key
> is basically a no-op and can be safely skipped. Moreover, DELETE+REPLACE
> can be treated as no-op, too, because secondary indexes don't store full
> tuples hence all REPLACE statements for the same key are equivalent.
> By marking such pair of statements as no-op in vy_tx_set, we guarantee
> that no-op statements don't make it to secondary index memory or disk
> levels.

Better say "mark both statements", not a pair, since they are not
present in the tx write list as a pair.
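
A minimal sketch of the no-op rule from the quoted text, as a toy Python
model of a secondary index write set. This is not the vinyl code: the
names Stmt, SecondaryWriteSet and is_nop are assumptions for
illustration, and skipping an older statement that is already a no-op
is one plausible reading of the rule, not something the patch
description spells out.

```python
from dataclasses import dataclass


@dataclass
class Stmt:
    type: str             # 'REPLACE' or 'DELETE'
    key: tuple            # secondary key parts followed by primary key parts
    is_nop: bool = False  # hypothetical flag: statement has no net effect


class SecondaryWriteSet:
    """Toy model: keeps the latest statement per key within one transaction."""

    def __init__(self):
        self.latest = {}  # key -> most recent Stmt for that key

    def set(self, type_, key):
        stmt = Stmt(type_, key)
        old = self.latest.get(key)
        # An older statement that is already a no-op was cancelled by an
        # earlier pairing and must not be paired a second time.
        if old is not None and not old.is_nop:
            # Exactly one of the two is a DELETE: REPLACE+DELETE or
            # DELETE+REPLACE for the same key, so mark both statements.
            if (old.type == 'DELETE') != (stmt.type == 'DELETE'):
                old.is_nop = True
                stmt.is_nop = True
        self.latest[key] = stmt
        return stmt

    def committed(self):
        """Statements that would reach the index: latest per key, not no-op."""
        return [(s.type, s.key) for s in self.latest.values() if not s.is_nop]


# The transaction from the quoted example, after s:insert{1, 1, 1}.
ws = SecondaryWriteSet()
ws.set('DELETE', (1, 1))    # first update: old key {1,1} goes away
ws.set('REPLACE', (2, 1))   # first update: new key {2,1} appears
ws.set('DELETE', (2, 1))    # second update: cancels the REPLACE above
ws.set('REPLACE', (2, 1))   # second update: re-adds {2,1}, stays effective
print(ws.committed())
```

In this model the last REPLACE{2,1} survives because the DELETE before
it was already cancelled, so key {2,1} is not lost from the index.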

Could you also please explain why you decided to introduce a new
flag, and not use is_overwritten?


-- 
Konstantin Osipov, Moscow, Russia

Thread overview: 12+ messages
2019-05-24 21:53 [PATCH 0/3] Fixes of a few Vinyl transaction manager issues Vladimir Davydov
2019-05-24 21:53 ` [PATCH 1/3] vinyl: fix secondary index divergence on update Vladimir Davydov
2019-05-25  6:11   ` Konstantin Osipov [this message]
2019-05-25 19:51     ` [tarantool-patches] " Vladimir Davydov
2019-05-25 20:28       ` Konstantin Osipov
2019-05-26 14:36         ` Vladimir Davydov
     [not found]           ` <CAPZPwLoP+SEO2WbTavgtR3feWN4tX81GAYw5ZYp4_pC5JkyS_A@mail.gmail.com>
2019-05-27  8:28             ` Vladimir Davydov
2019-05-24 21:53 ` [PATCH 2/3] vinyl: don't produce deferred DELETE on commit if key isn't updated Vladimir Davydov
2019-05-25  6:13   ` [tarantool-patches] " Konstantin Osipov
2019-05-24 21:53 ` [PATCH 3/3] vinyl: fix deferred DELETE statement lost on commit Vladimir Davydov
2019-05-25  6:15   ` [tarantool-patches] " Konstantin Osipov
2019-05-27  8:29     ` Vladimir Davydov