From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-f68.google.com (mail-lf1-f68.google.com [209.85.167.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id C4BD8469710 for ; Thu, 14 May 2020 09:56:10 +0300 (MSK) Received: by mail-lf1-f68.google.com with SMTP id d22so1628640lfm.11 for ; Wed, 13 May 2020 23:56:10 -0700 (PDT) Date: Thu, 14 May 2020 09:56:08 +0300 From: Konstantin Osipov Message-ID: <20200514065608.GA12581@atlas> References: <670c3876e58a7cfa14d45db1dc074a10dd034759.1586808463.git.korablev@tarantool.org> <20200413221229.GA3462@atlas> <20200514021155.GB18509@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200514021155.GB18509@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH 2/2] vinyl: skip invalid upserts during squash List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Nikita Pettik Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org * Nikita Pettik [20/05/14 09:50]: > On 14 Apr 01:12, Konstantin Osipov wrote: > > * Nikita Pettik [20/04/14 00:57]: > > > Instead of aborting merge sequence of upserts let's log appeared > > > errors and skip upserts which can't be applied. It makes sense > > > taking into consideration previous commit: now upsert statements which > > > can't be applied may appear in mems/runs (previously squash operation > > > might fail only due to OOM). As a result, if we didn't ignore invalid > > > upserts, dump or compaction processes would not be able to finish (owing > > > to inability to squash upserts). > > > > We should log the upsert itself, base64 encoded. > > It is logged with following format (in vy_read_view_merge()): > > say_info("upsert %s is not applied due to the error: %s", > vy_stmt_str(h->tuple), e->errmsg); > > It is the function which is called during compaction to squash > upserts. > > Also, I've added logs during squash/apply of upserts in in-memory > tree: see diffs in vy_tx_write() and vy_tx_set(). So that user gets > the same error as when executing invalid upserts in memtx. However, > I've found these logs a bit inconsistent: upserts are not always > squashed/applied once they are inserted in tree. For instance: > > s = box.schema.space.create('test', { engine = 'vinyl', field_count = 2 }) > pk = s:create_index('pk') > s:replace{1, 1} > s:upsert({1, 1}, {{'=', 3, 5}}) > main/101/interactive tuple.c:150 E> ER_EXACT_FIELD_COUNT: Tuple field count 3 does not match space field count 2 > > This upsert is attepted to be optimized: in vy_lsm_commit_upsert() > we try to apply it on replace statement, but we fail so error is logged > (still upsert has been inserted in tree). Now, if we execute this > upsert once again: > > s:upsert({1, 1}, {{'=', 3, 5}}) > > error won't be logged, since previous upsert now resides in tree > and we won't attempt at applying it (according to optimization's conditions). > > Hence now I'm not sure that logging failed upsert squashes/applications > is worth it when it comes for in-memory lsm level. This is fine, the correct behaviour is not possible to define anyway, so best effort on tarantool part is good enough. If it becomes an issue we could create a local space which will receive these upserts, instead of a log file, something like _vy_failed_upserts, but I'd not expect it to be very useful. > > Second, vy_history_apply may be called from many contexts - reads > > and writes, for example. We should only log and skip during > > compaction, not when reading, otherwise we'll spam the log file at > > least. > > We don't log anything in vy_history_apply(). So there are no logs > during reads. OK > > Finally, let's double check there are no issues with the used > > format - can it become obsolete by the time it's used, e.g. if > > there is an online/non-blocking schema change that happened in tx > > thread (compaction is running in the write thread)? > > Upserts are supported only by primary index. Meanwhile vinyl does > not support altering primary index of non-empty space. Am I missing > something? I mean the space format object itself. Squashing is happening in a different thread, the > -- Konstantin Osipov, Moscow, Russia