[tarantool-patches] Re: [PATCH 1/3] tuple: rework updates to improve code extendibility

Sat Oct 5 16:49:49 MSK 2019

I don't know why my mutt folder hook are messed up, will
take some time to figure out.

Resending from the right email.

* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/09/28 09:10]:
> Before the patch update was implemented as a set of operations
> applicable for arrays only. It was ok until field names and JSON
> paths appearance, because tuple is array on the top level.
> 
> But now there are four reasons to allow more complex updates of
> tuple field internals via JSON paths:
> * tuple field access by JSON path is allowed so for consistency
>   JSON paths should be allowed in updates as well;
> * JSON indexes are supported. JSON update should be able to
>   change an indexed field without rewriting half of a tuple;
> * Tarantool is going to support documents in storage so JSON
>   path updates is one more step forward;
> * JSON updates are going to be faster than get + in-memory
>   Lua/connector update + replace (or update of a whole tuple
>   field).
> 
> The patch prepares current update code to be extended by updates
> of map, so named isolated 'bar' updates, and multilevel updates.
> 
> The concept is to build a tree of update objects. Each updates a
> scalar terminal value (leaf of that tree), or a complex object
> like array or map. In the latter case these objects contain >= 2
> children updates.
> 
> This organization allows to split update implementation into
> several independent files-modules for each update type. Next
> commits will introduce them one by one.

Now that you provide some rationale for the rename, does the
naming scheme you used reflects this rationale?

1) update - a very general name, there is sql update, tuple
   update, lsm tree update, configuration update
2) update_array - impossible to say by looking at the code whether
   it is a function which udpates an array or an array of updates.

I agree and support the rationale (to a certain extent), but now 
could you please come up with a structure and a naming scheme 
that reflects this rationale and is unambiguous?

Generally, the key reason to exist for the update interface
is being able to represent partial tuple changes in the binary
log - i.e. update micro-language exists only because we want
to send compact changes over the network to replicas. There is
no other reason *for the microlanguage* to exist.

So maybe it should be called xrow_update, to avoid ambiguity with
other types of updates? 

-- 
Konstantin Osipov, Moscow, Russia