From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp37.i.mail.ru (smtp37.i.mail.ru [94.100.177.97])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 5878945C304
	for ; Fri, 11 Dec 2020 16:39:46 +0300 (MSK)
Date: Fri, 11 Dec 2020 16:39:43 +0300
From: "Alexander V. Tikhonov"
Message-ID: <20201211133943.GA124131@hpalx>
References: <20201210173523.GE1319@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20201210173523.GE1319@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 0/2] reduce performance degradation introduced by JSON path indices
List-Id: Tarantool development patches
To: Nikita Pettik, Serge Petrenko
Cc: tarantool-patches@dev.tarantool.org

Hi All,

thanks for the patch. As I see no new degradation in the gitlab-ci
testing commit criteria pipeline [1], the patch LGTM.

[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/228335114

On Thu, Dec 10, 2020 at 05:35:23PM +0000, Nikita Pettik wrote:
> On 04 Dec 12:52, Serge Petrenko wrote:
> > https://github.com/tarantool/tarantool/issues/4774
> > sp/gh-4774-multikey-refactoring
> >
> > The patchset fixes two degradations found by measuring snapshot recovery time
> > for a 1.5G snapshot containing 30M tuples in a memtx space with a simple primary
> > key and one secondary key over 4 integer and one string field.
> >
> > The first degradation manifests itself during snapshot recovery phase (the one
> > with "3.5M rows processed" messages) and is connected to `memtx_tuple_new`
> > slowdown due to unoptimised `tuple_field_map_create`.
> >
> > First patch deals with this degradation and manages to restore almost all
> > performance lost since 1.10. (The patched version is only 11% slower than 1.10,
> > while the current master is 39% slower on this phase).
> >
> > The second degradation appears during next snapshot recovery phase, secondary
> > index building. Here the degradation is rooted in slow tuple field access via
> > tuple_field_raw().
> >
> > The second patch deals with this issue and manages to restore all the lost
> > performance. (The patched version is 10% faster(!) than 1.10 while the current
> > master is 27% slower).
> > To be honest, the increase in speed between 1.10 and the second patch must be
> > due to tuple comparison hints. Otherwise the patched version should be even with
> > 1.10, since it uses literally the same code as 1.10 did (with minor changes).
>
> To Serge: I guess we should reflect this fix in user's changelog.
> Could you please provide a few lines about patches?
>
> To Alexander: we are going to push this patch to master. Could you verify
> that it doesn't break any tests? Branch is:
> https://github.com/tarantool/tarantool/tree/sp/gh-4774-multikey-refactoring
>
> > Changes in v2:
> >   - win some more performance by accessing top level
> >     tuple format fields directly (bypass the json_tree_lookup)
> >   - instead of relying on offset_slot_hint in the second patch,
> >     rewrite tuple_field_raw so that it doesn't check for path
> >     this wins a whopping 24% of perf compared to the previous
> >     version.
> >
> > Changes in v3:
> >   - minor typo fixes
> >
> > Serge Petrenko (2):
> >   box: speed up tuple_field_map_create
> >   box: refactor tuple_field_raw to omit path checks
> >
> >  src/box/tuple.h        | 29 ++++++++++++++--
> >  src/box/tuple_format.c | 75 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 102 insertions(+), 2 deletions(-)
> >
> > --
> > 2.24.3 (Apple Git-128)
> >
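
For readers who have not followed the v2 change quoted above ("rewrite
tuple_field_raw so that it doesn't check for path"), the idea can be
illustrated with a minimal sketch in C. This is not the code from the
patch: struct toy_tuple, toy_field_by_path() and toy_field_raw() are
hypothetical names, and the flat field_map layout is an assumption made
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy tuple, NOT Tarantool's real layout. */
struct toy_tuple {
	const char *data;          /* raw field bytes, stored back to back */
	const uint32_t *field_map; /* field_map[i] = offset of field i in data */
	uint32_t field_count;      /* number of top-level fields */
};

/*
 * General accessor: takes an optional path, so every call pays for the
 * path branch even when the caller only wants a top-level field.
 */
static const char *
toy_field_by_path(const struct toy_tuple *t, uint32_t fieldno,
		  const char *path, uint32_t path_len)
{
	assert(t != NULL);
	if (fieldno >= t->field_count)
		return NULL;
	const char *field = t->data + t->field_map[fieldno];
	if (path == NULL || path_len == 0)
		return field;
	/*
	 * A real implementation would descend into the field here.
	 * The toy does not model nested data, so it reports "not found".
	 */
	return NULL;
}

/*
 * Specialized accessor for top-level fields: no path argument and
 * therefore no path branch on the hot path.
 */
static inline const char *
toy_field_raw(const struct toy_tuple *t, uint32_t fieldno)
{
	assert(t != NULL);
	if (fieldno >= t->field_count)
		return NULL;
	return t->data + t->field_map[fieldno];
}
```

Both functions return the same pointer for a top-level field; the
specialized one simply has less work to do per call, which matters when
it runs once per tuple per key part during index building.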