From: Serge Petrenko
Date: Fri, 11 Dec 2020 09:47:00 +0300
Subject: Re: [Tarantool-patches] [PATCH v3 0/2] reduce performance degradation introduced by JSON path indices
To: Nikita Pettik, avtikhon@tarantool.org
Cc: tarantool-patches@dev.tarantool.org
In-Reply-To: <20201210173523.GE1319@tarantool.org>
References: <20201210173523.GE1319@tarantool.org>
List-Id: Tarantool development patches

On 10.12.2020 20:35, Nikita Pettik wrote:
> On 04 Dec 12:52, Serge Petrenko wrote:
>> https://github.com/tarantool/tarantool/issues/4774
>> sp/gh-4774-multikey-refactoring
>>
>> The patchset fixes two degradations found by measuring snapshot recovery time
>> for a 1.5G snapshot containing 30M tuples in a memtx space with a simple primary
>> key and one secondary key over 4 integer and one string field.
>>
>> The first degradation manifests itself during snapshot recovery phase (the one
>> with "3.5M rows processed" messages) and is connected to `memtx_tuple_new`
>> slowdown due to unoptimised `tuple_field_map_create`.
>>
>> First patch deals with this degradation and manages to restore almost all
>> performance lost since 1.10. (The patched version is only 11% slower than 1.10,
>> while the current master is 39% slower on this phase).
>>
>> The second degradation appears during next snapshot recovery phase, secondary
>> index building. Here the degradation is rooted in slow tuple field access via
>> tuple_field_raw().
>>
>> The second patch deals with this issue and manages to restore all the lost
>> performance. (The patched version is 10% faster(!) than 1.10 while the current
>> master is 27% slower).
>> To be honest, the increase in speed between 1.10 and the second patch must be
>> due to tuple comparison hints. Otherwise the patched version should be even with
>> 1.10, since it uses literally the same code as 1.10 did (with minor changes).
> To Serge: I guess we should reflect this fix in user's changelog.
> Could you please provide a few lines about patches?

Hi! Thanks for the review!

Ok, sure:

@ChangeLog:
Fix performance degradation in snapshot recovery when no JSON path or
multikey indices are involved. The degradation first appeared in 2.2.1
and raised the recovery time by approximately 30% compared to 1.10.
Now snapshot recovery when JSON path indices are unused is even faster
than it used to be on 1.10. The time difference is around 7% (gh-4774).

> To Alexander: we are going to push this patch to master. Could you verify
> that it doesn't break any tests? Branch is:
> https://github.com/tarantool/tarantool/tree/sp/gh-4774-multikey-refactoring
>
>> Changes in v2:
>>   - win some more performance by accessing top level
>>     tuple format fields directly (bypass the json_tree_lookup)
>>   - instead of relying on offset_slot_hint in the second patch,
>>     rewrite tuple_field_raw so that it doesn't check for path
>>     this wins a whopping 24% of perf compared to the previous
>>     version.
>>
>> Changes in v3:
>>   - minor typo fixes
>>
>> Serge Petrenko (2):
>>   box: speed up tuple_field_map_create
>>   box: refactor tuple_field_raw to omit path checks
>>
>>  src/box/tuple.h        | 29 ++++++++++++++--
>>  src/box/tuple_format.c | 75 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 102 insertions(+), 2 deletions(-)
>>
>> --
>> 2.24.3 (Apple Git-128)
>>

--
Serge Petrenko