[Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/8] tuple: rework updates to improve code extendibility

Sat Oct 19 18:11:32 MSK 2019

Thanks for the review!

On 05/10/2019 15:50, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/09/28 09:11]:
>> Hi! Thanks for the review!
>>
>> On 03/09/2019 21:20, Konstantin Osipov wrote:
>>> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/09/01 00:33]:
>>>
>>> If you create a directory, please make it a standalone library, so
>>> that dependencies are controlled, both at compile-time and at
>>> link-time.
>>
>> I tried - it is not possible. Update depends on tuple_dictionary and
>> TUPLE_INDEX_BASE from tuple_format.h. Updates need to be a part of
>> the tuple library.
>>
>>> I don't see a reason to keep the top-level update api in src/box, 
>>> please move it to update/update.h or alike.
>> I don't see a reason not to keep it, but I don't mind to move
>> all to update/.
> 
> Well, if it is not an independent library then it should *not* be
> moved into a subdirectory. 
> 

Ok, then how would you explain src/box/lua directory? It is not a
separate library, but it is in own directory. It is not a single
example. Just the first I found. And what is more important, I
consider it a good example.

>>> Generally I don't (I guess yet?) get why you make the split, I
>>> hope to learn in a subsequent patches.
>>
>> Because update implementations for each field type are very different
>> and independent from each other. And I decided that it would be simpler
>> to maintain several relatively small and simple files for each field
>> type, than one huge mega-file for all updates. Actually, I borrowed
>> that file organization from some other databases, where I've looked
>> for an inspiration. Such as mongo.
> 
> Extracting update operations for different field types into own
> files could actually reduce header file dependencies (e..g. decimal includes are only 
> used for decimal field updates), so looks like a good idea, but
> you didn't do just that.

My purpose was not related to 'header file dependencies'. Actually,
I don't really see any practical pros in trying to include less header
files. If something in a header file is really needed, you will include
it anyway. Explicitly via that header, or implicitly via another. We
already talked about that, and as I remember you could not explain it
either.

> You chopped everything into little
> pieces, and I am still failing to extract motivation for this.
> 
> Is there a rational reason to do it except you felt like it?

No, there is no other reason. I "felt like" it will be simpler
to implement and support, and it appeared to be true. One file way
had led to creation of a monster file with 3-4k of lines, where each
300-500 line blocks were independent from each other. It appeared to
be unmaintainable. That is all. I explained it already several times,
both verbally when I just started this patchset many months ago, and
in a written form. I don't think that if I will repeat it again, it
will help.

I am going to stick to multiple files.

> We
> generally don't stick to one-class-per-file paradigm, but
> one-module-per-file.

But it is not true. We have SQL, Vinyl, SWIM, and many
other modules consists of multiple files.

And one-class-per-file is not true as well. We have key_def,
field_def, other 'def's. We have memtx_tree/space/engine,
the same for Vinyl, and many other examples. That is, we use
one-class-per-file when it help to organize the code.

> What is the module structure that you are
> trying to get to with the move? 
> 
>> It is debatable. The reasoning above looks fair to me. But we can
>> discuss.
> 
> Wherever you provided any reasoning, it made some sense, and where
> I am not compelled to argue. The questions would not have arisen 
> if this reasoning was in code and changeset comments.

I provided the reason in the commit message:

    This organization allows to split update implementation into
    several independent files-modules for each update type. Next
    commits will introduce them one by one.

Here I said explicitly, that it helps to reduce size of the
huge JSON update task into several smaller tasks: "independent
files-modules for each update type". And that organization has
proven its efficiency. I was able to return to this ticket after
half a year of absence, understand everything, and finish the
patchset. In several independent commits.

With monolithic file way I was not even able to start this
patchset. I tried, and it just didn't work.