[Tarantool-patches] [PATCH v3 0/2] reduce performance degradation introduced by JSON path indices

Alexander V. Tikhonov avtikhon at tarantool.org
Fri Dec 11 16:39:43 MSK 2020


Hi all, thanks for the patch. I see no new degradation in the gitlab-ci
commit criteria testing pipeline [1], so the patch LGTM.

[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/228335114

On Thu, Dec 10, 2020 at 05:35:23PM +0000, Nikita Pettik wrote:
> On 04 Dec 12:52, Serge Petrenko wrote:
> > https://github.com/tarantool/tarantool/issues/4774
> > sp/gh-4774-multikey-refactoring
> > 
> > The patchset fixes two degradations found by measuring snapshot recovery time
> > for a 1.5G snapshot containing 30M tuples in a memtx space with a simple primary
> > key and one secondary key over 4 integer fields and one string field.
> > 
> > The first degradation manifests itself during the snapshot recovery phase (the
> > one with "3.5M rows processed" messages) and is connected to a `memtx_tuple_new`
> > slowdown caused by an unoptimised `tuple_field_map_create`.
> > 
> > The first patch deals with this degradation and manages to restore almost all
> > of the performance lost since 1.10. (The patched version is only 11% slower than
> > 1.10, while the current master is 39% slower in this phase).
> > 
> > The second degradation appears during the next snapshot recovery phase,
> > secondary index building. Here the degradation is rooted in slow tuple field
> > access via tuple_field_raw().
> > 
> > The second patch deals with this issue and manages to restore all the lost
> > performance. (The patched version is 10% faster(!) than 1.10 while the current
> > master is 27% slower).
> > To be honest, the increase in speed between 1.10 and the second patch must be
> > due to tuple comparison hints. Otherwise the patched version should be even with
> > 1.10, since it uses literally the same code as 1.10 did (with minor changes).
> 
> To Serge: I guess we should reflect this fix in the user changelog.
> Could you please provide a few lines about the patches?
> 
> To Alexander: we are going to push this patch to master. Could you verify
> that it doesn't break any tests? Branch is:
> https://github.com/tarantool/tarantool/tree/sp/gh-4774-multikey-refactoring
>  
> > Changes in v2:
> >   - win some more performance by accessing top level
> >     tuple format fields directly (bypass the json_tree_lookup)
> >   - instead of relying on offset_slot_hint in the second patch,
> >     rewrite tuple_field_raw so that it doesn't check for path.
> >     This wins a whopping 24% of perf compared to the previous
> >     version.
> > 
> > Changes in v3:
> >   - minor typo fixes
> > 
> > Serge Petrenko (2):
> >   box: speed up tuple_field_map_create
> >   box: refactor tuple_field_raw to omit path checks
> > 
> >  src/box/tuple.h        | 29 ++++++++++++++--
> >  src/box/tuple_format.c | 75 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 102 insertions(+), 2 deletions(-)
> > 
> > -- 
> > 2.24.3 (Apple Git-128)
> > 
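
A minimal sketch of the idea behind the second patch as described above:
keep the top-level field accessor free of any path handling, so that the
hot loop of index building does not pay for a branch it never takes. This
is not the Tarantool code; `toy_tuple`, `toy_field_raw` and
`toy_field_raw_by_path` are hypothetical names made up for illustration,
and the path walk itself is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy tuple: raw bytes plus one precomputed offset per
 * top-level field. Not the real Tarantool tuple layout. */
struct toy_tuple {
	const char *data;          /* raw tuple bytes */
	const uint32_t *field_map; /* field_map[i] = offset of field i in data */
	uint32_t field_count;
};

/* Fast path: top-level field access with no path argument and no
 * path-related branches. */
static inline const char *
toy_field_raw(const struct toy_tuple *t, uint32_t fieldno)
{
	assert(t != NULL);
	if (fieldno >= t->field_count)
		return NULL;
	return t->data + t->field_map[fieldno];
}

/* General path: accepts an optional path, so every call has to check
 * whether one was given before doing anything else. */
static inline const char *
toy_field_raw_by_path(const struct toy_tuple *t, uint32_t fieldno,
		      const char *path, uint32_t path_len)
{
	if (path == NULL || path_len == 0)
		return toy_field_raw(t, fieldno);
	/* A real implementation would walk the path here; this toy
	 * version simply reports "not found". */
	return NULL;
}
```

Callers that only ever index top-level fields would use `toy_field_raw`
directly, while callers with JSON path indices would go through
`toy_field_raw_by_path`.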

