[Tarantool-patches] [PATCH v2 3/3] tuple: rework updates to improve code extendibility

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Nov 7 13:58:44 MSK 2019


I gave up trying to split these changes, because they
are either too minor, or are not separable at all. Kostja
proposed to describe them in a changelist, and here is a
new commit message:

====================================================================

tuple: rework updates to improve code extendibility

Before the patch update was implemented as a set of operations
applicable for arrays only. It was ok until field names and JSON
paths appearance, because tuple is an array on the top level.

But now there are four reasons to allow more complex updates of
tuple field internals by JSON paths:

  - tuple field access by JSON path is allowed so for consistency
    JSON paths should be allowed in updates as well;

  - JSON indexes are supported. JSON update should be able to
    change an indexed field without rewriting half of a tuple, and
    its full replacement;

  - Tarantool is going to support documents in storage so JSON
    path updates is one more step forward;

  - JSON updates are going to be faster and more compact in WAL
    than get + in-memory Lua/connector update + replace (or update
    of a whole tuple field).

The patch reworks the current update code in such a way, that now
update is not just an array of operations, applied to tuple's top
level fields. Now it is a tree, just like tuples are.

The concept is to build a tree of xrow_update_field objects. Each
updates a part of a tuple. Leafs in the tree contain update
operations, specified by a user, as xrow_update_op objects.

To make the code support and understanding simpler, the patch
splits update implementation into several independent
files-modules for each type of an updated field. One file
describes how to update an array field, another file - how to
update a map field, etc. This commit introduces only array. Just
because it was already supported before the patch. Next commits
will introduce more types one by one.

Besides, the patch makes some minor changes, not separable from
this commit:

  - The big comment about xrow updates in xrow_update.c is
    updated. Now it describes the tree-idea presented above;

  - Comments were properly aligned by 66 symbols in all the moved
    or changed code. Not affected code is kept as is so as not to
    increase the diff even more;

  - Added missing comments to moved or changed structures and
    their attributes such as struct xrow_update,
    struct xrow_update_op_meta, struct xrow_update_op.

  - Struct xrow_update_field was significantly reworked. Now it is
    not just a couple of pointers at tuple's top level array. From
    now it stores type of the updated field, range of its source
    data in the original tuple, and a subtree of other update
    fields applied to the original data.

  - Added missing comments to some functions which I moved and
    decided worth commenting alongside, such as
    xrow_update_op_adjust_field_no(), xrow_update_alloc().

  - Functions xrow_update_op_do_f, xrow_update_op_read_arg_f,
    xrow_update_op_store_f are separated from struct xrow_update,
    so as they could be called on any updated field in the tree.
    From this moment they are methods of struct xrow_update_op.
    They take an op as a first argument (like 'this' in C++), and
    are applied to a given struct xrow_update_field.

Another notable, but not separable, change is a new naming schema
for the methods of struct xrow_update_field and struct
xrow_update_op. This is motivated by the fact that struct
xrow_update_field now has a type, and might be not a terminal.

There are now two groups of functions. Generic functions working
with struct xrow_update_field of any type:

    xrow_update_field_sizeof
    xrow_update_field_store
    xrow_update_op_do_field_<operation>

And typed functions:

    xrow_update_<type>_sizeof
    xrow_update_<type>_store
    xrow_update_op_do_<type>_<operation>

Where
    operation = insert/delete/set/arith ...
         type = array/map/bar/scalar ...

Common functions are used when type of a field to update is not
known in advance. For example, when an operation is applied to one
of fields of an array, it is not known what a type this field has:
another array, scalar, not changed field, map, etc. Common
functions do nothing more than just a switch by field type to
choose a more specific function.

Typed functions work with a specific type. They may change the
given field (add a new array element, replace it with a new value,
...), or may forward an operation deeper in case they see that its
JSON path is not fully passed yet.

Part of #1261


More information about the Tarantool-patches mailing list