From: Serge Petrenko
Date: Fri, 4 Dec 2020 12:52:53 +0300
Subject: [Tarantool-patches] [PATCH v3 0/2] reduce performance degradation introduced by JSON path indices
To: korablev@tarantool.org
Cc: tarantool-patches@dev.tarantool.org

https://github.com/tarantool/tarantool/issues/4774
sp/gh-4774-multikey-refactoring

The patchset fixes two degradations found by measuring snapshot
recovery time for a 1.5G snapshot containing 30M tuples in a memtx
space with a simple primary key and one secondary key over four
integer fields and one string field.

The first degradation manifests itself during the snapshot recovery
phase (the one with "3.5M rows processed" messages) and is caused by a
`memtx_tuple_new` slowdown due to an unoptimised
`tuple_field_map_create`. The first patch deals with this degradation
and restores almost all of the performance lost since 1.10. (The
patched version is only 11% slower than 1.10, while the current master
is 39% slower on this phase).

The second degradation appears during the next snapshot recovery
phase, secondary index building. Here the degradation is rooted in
slow tuple field access via tuple_field_raw(). The second patch deals
with this issue and restores all of the lost performance. (The patched
version is 10% faster(!) than 1.10, while the current master is 27%
slower).

To be honest, the increase in speed between 1.10 and the second patch
must be due to tuple comparison hints.
Otherwise the patched version should be even with 1.10, since it uses
literally the same code as 1.10 did (with minor changes).

Changes in v2:
  - win some more performance by accessing top-level tuple format
    fields directly (bypassing json_tree_lookup)
  - instead of relying on offset_slot_hint in the second patch,
    rewrite tuple_field_raw so that it doesn't check for a path;
    this wins a whopping 24% of perf compared to the previous version.

Changes in v3:
  - minor typo fixes

Serge Petrenko (2):
  box: speed up tuple_field_map_create
  box: refactor tuple_field_raw to omit path checks

 src/box/tuple.h        | 29 ++++++++++++++--
 src/box/tuple_format.c | 75 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+), 2 deletions(-)

-- 
2.24.3 (Apple Git-128)
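
For readers without the tree at hand, the idea behind the v2 changes
(reach top-level fields directly and keep the path check out of the hot
accessor) can be sketched roughly as below. This is only an
illustration under loud assumptions: every `sketch_*` name and the
flat offset-table layout are hypothetical and do not reproduce the
actual Tarantool structures or the code in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not Tarantool's real structures. */
struct sketch_format {
	uint32_t field_count;
	/* Per top-level field: slot in the offset table, or -1 if none. */
	const int32_t *offset_slots;
};

struct sketch_tuple {
	const struct sketch_format *format;
	const uint32_t *field_map; /* byte offsets of indexed fields */
	const char *data;          /* raw field bytes */
};

/* Fast path: top-level fields only, so there is no path argument to test. */
static const char *
sketch_field_raw(const struct sketch_tuple *tuple, uint32_t fieldno)
{
	const struct sketch_format *format = tuple->format;
	if (fieldno >= format->field_count)
		return NULL;
	int32_t slot = format->offset_slots[fieldno];
	if (slot < 0)
		return NULL; /* a real implementation would scan the tuple */
	return tuple->data + tuple->field_map[slot];
}

/* Path-aware accessor kept for callers that do pass a JSON path. */
static const char *
sketch_field_raw_by_path(const struct sketch_tuple *tuple, uint32_t fieldno,
			 const char *path)
{
	if (path == NULL || path[0] == '\0')
		return sketch_field_raw(tuple, fieldno);
	return NULL; /* nested lookup is out of scope for this sketch */
}
```

In this sketch a caller that only ever needs top-level fields, such as
a key comparator during secondary index build, would call
`sketch_field_raw` directly and never pay for the path branch.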