From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <v.shpilevoy@tarantool.org>
Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
 bits)) (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id AD5B5469719
 for <tarantool-patches@dev.tarantool.org>;
 Sat, 31 Oct 2020 03:21:36 +0300 (MSK)
References: <20201014133535.224573-1-gorcunov@gmail.com>
 <20201014133535.224573-4-gorcunov@gmail.com>
 <6572e8c8-b801-0db9-80c0-31bdaca89355@tarantool.org>
 <20201030125101.GF198833@grain>
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Message-ID: <7a6a6dd8-5eb0-0c55-2eeb-8aac3f5b11bd@tarantool.org>
Date: Sat, 31 Oct 2020 01:21:34 +0100
MIME-Version: 1.0
In-Reply-To: <20201030125101.GF198833@grain>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH v8 3/4] box/cbox: implement cbox Lua
	module
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: tml <tarantool-patches@dev.tarantool.org>

>> The module leaks somewhere. It never unloads the shared file. I
>> tried this:
>>
>> 	cbox = require('cbox')
>> 	f = cbox.func.load('function1.test_push')
>> 	f()
>> 	f = nil
>> 	cbox.func.unload('function1.test_push')
>> 	collectgarbage('collect')
>>
>> Then I use 'lsof -p' to see what files are loaded, and I see
>> function1.dylib here. When I do all the same using _func space
>> and box.func.call, the file is unloaded correctly.
>>
>> (function1.dylib I took from test/box/.)
> 
> I'll try to repeat. That;s interesting, actually I've been trying
> to force gc to trigger but without much luck. Letme investigate
> this.

It is not about Lua GC. The collectgarbage('collect') works fine
and the function object is deleted. But the module is not unloaded.
Because you never call module_sym_unload().

>>> ``` Lua
>>> ok, err = require('cbox').module.reload('func')
>>> if not ok then
>>>     print(err)
>>> end
>>> ```
>>
>> 2. In the previous review I said:
>>
>> 	3. It seems the function invocation itself is not documented?
>>
>> You answered:
>>
>> 	Good point, seems this would be useful. I though 'cause this
>> 	is mostly an alias for "_func" space it would be obvious. But
>> 	indeed without real example the doc looks incomplete. Thanks!
>>
>> Tbh, I don't see any difference regarding this. It is
>> still not documented how to call a function after you
>> loaded it, and what does it return.
> 
> Here is a paragraph above
> ---
>     Once function is loaded it is possible to execute it
>     in a traditional Lua way, ie to call it as a function.
> 
>     ``` Lua
>     -- call with agruments arg1 and arg2
>     f(arg1, arg2)
>     ```
> ---
> 
> I'm not sure what else we could put here?

This paragraph is false, it seems. Because it does not work like Lua
functions 1-to-1. It returns nil,err or true,results. Lua functions
return whatever you return from them so just 'results'. These cbox
functions change the returned set a bit. And this is good, but you
didn't document it.

>>> diff --git a/src/box/box.cc b/src/box/box.cc
>>> index 2485b79f3..f20761e8f 100644
>>> --- a/src/box/box.cc
>>> +++ b/src/box/box.cc
>>> @@ -75,6 +75,7 @@
>>>  #include "systemd.h"
>>>  #include "call.h"
>>>  #include "module_cache.h"
>>> +#include "lua/cbox.h"
>>>  #include "sequence.h"
>>>  #include "sql_stmt_cache.h"
>>>  #include "msgpack.h"
>>> @@ -2246,6 +2247,7 @@ box_free(void)
>>>  		tuple_free();
>>>  		port_free();
>>>  #endif
>>> +		cbox_free();
>>
>> 3. box_init and box_free are not for Lua modules. Please,
>> use box_lua_init(). Also I don't unserstand, why do you have two
>> cbox init functions. If it is a Lua module, it should have lua init
>> function and be in lua/ folder. If it is a C module, it should have
>> a usual init function and not be in a lua/ folder. Not both. See
>> examples such as tuple and error modules. They have separate Lua
>> and C part. In your case I don't understand why would you need a
>> separate C part. Cbox is entirely for Lua. We don't need cbox in C.
>> So for your case box_lua_init should be enough.
> 
> cbox engine uses rbtree to keep functions in memory and we have
> to initialize it first and clean up it on exit. The lua init provides
> Lua's interface over the engine. I can move initialization to box_lua_init
> but where to clean things up?!

Option 1 - nowhere like other Lua modules. Option 2 - add box_lua_free
and add your cbox destruction there.

> Please look into popen module -- it does
> exactly the same: there is C part and Lua parts.

It is not the same. Popen, on the contrary, proves my point.

lua/popen.h has only tarantool_lua_popen_init(). lua/popen
defines a new module on top of lib/core/popen.

lib/core/popen.h has popen_init().

These two modules are separate. They are not one. lib/core/popen works
and builds without lua part. In your case you just threw everything into
a huge pile inside of lua/cbox.c, which is also fine, but then it is a
single monolithic module, and should have one init function. Also since
it is in Lua, it should not be used in files having nothing to do with
Lua.

>>> +/**
>>> + * Function descriptor.
>>> + */
>>> +struct cbox_func {
>>> +	/**
>>> +	 * Gather functions into rbtree.
>>> +	 */
>>> +	rb_node(struct cbox_func) nd;
>>
>> 4. Why rbtree? Search in the tree is going to be slower than in
>> the hash. Hash would be perfect here, because all the searches
>> are always by ==, not < or >, no iteration. Only lookup.
> 
> And lookup in our hash table causes iterations as well in case
> of collisions.
> 
>> In the cover letter you said
>>
>> 	"i don't like the idea of unexpected rehashing of
>> 	values in case of massive number of functions
>> 	allocated"
>>
>> but AFAIK, according to our benches, when a hash is used as an index,
>> it is always faster than any tree on lookups. Although I don't
>> remember benches hash vs tree out of an index. Since your are doing
>> such dubious "optimizations", is there a bench you can show, that
>> a tree is faster than a hash?
> 
> The main reason for rbtree is the memory.

Ok. Did you measure how much more efficient it is in terms of memory?
Even in theory?

> Initially I used our mhash
> engine but I don't expect there be that many functions which allows
> me to not allocate such big structure as a hash table, instead I only
> need two pointers for lookup (ie 16 bytes per entry).
> 
> Do you really think it worth it to allocate hash table instead? I'm ready
> to make lookup a bit slower just to eliminate redundant memory overhead.

Yes, I think the hash is better here. But since you said the tree
solution is better in terms of speed because of rehash, and in terms of
memory because of I don't know what, you gotta prove it somehow. I can
prove my point because Alexanded had benchmarks showing hash is faster
inside of indexes. Also there is an issue in vshard about hash index being
better: https://github.com/tarantool/vshard/issues/128. Alexey did a bench,
AFAIR, even though we didn't post it in the issue.

We use the hash in a lot of far more critical places in terms of perf.
So either you are right and we need to fix them too, or we need to fix the
hash implementation, or you are not right and we need to keep using the
hash, when all operations are lookup. In either way we can't just say the
tree is better for lookups and silently use it here.

> I expect at worst case to have up to 1000 functions which may require up
> to 10 attempts.  I hate (with a passon) idea that in a middle of inserting
> new key our hash engine start a rehashing procedure.

If a user loads and unloads functions often, the rehash issue is going to
be the least severe one, because much more time will be spent on doing
mmap/munmap to load/unload the shared files.

> And note that lookup happens only _once_ when you loading a new function,
> we keep the pointer later and calling function goes directly without any
> lookups. So I think keeping low memory usage a way more preferred.

I am going to ask for numbers, what is the memory usage difference we are
talking about. Otherwise it all sounds sus.

> Also
> we might consider moving memory allocation for functions in our own "small"
> instace (where mh hashes simply use system's malloc/calloc calls).

I didn't understand this. Can you elaborate?

> And no, I didn't measure performance here because it was not a priority
> when _inserting_ a new function. And I'm very sceptical about "we ran
> a benchmark and it worked a way more better". We've been using both
> trees and hashes for decades and open-hashing (like ours) works fine
> until some moment where collision starts to happen and performace flushed
> to the toiled because you start walking over the table causing pages
> to swap in, massive tlb flushed and all this friends. On big data and
> higly loaded systems it is very fragile area and there is still no
> answer which structures are better :(

Please, go to Alexander L. He has concrete numbers how much faster hash is.
I remember he did it at least in scope of an issue about Lua tables vs spaces
performance.

> Sometimes if your data is small
> enough at sits near in memory old plain linear lookup a way more
> faster than enything else, because theory is theory but in real life
> too many components with own limists involved.
> 
> Summarizing: I can easily switch back to mhash if you prefer.

I think at this point we have gone too far to simply choose one solution,
and we need to do the comparison. Memory comparison is simple to do - insert
the same functions into a hash, printf its size. Insert into a tree, and
printf its size. For perf you can either write your own micro-bench or ask
Alexander L. for existing results.

>>> +
>>> +	/**
>>> +	 * Symbol descriptor for the function in
>>> +	 * an associated module.
>>> +	 */
>>> +	struct module_sym mod_sym;
>>> +
>>> +	/**
>>> +	 * Number of references to the function
>>> +	 * instance.
>>> +	 */
>>> +	ssize_t ref;
>>
>> 5. Why is it ssize_t? Is there any single place, where we use ssize_t
>> for reference counting? Why ssize_t, if it is has nothing to do with
>> size of anything?
>>
>> Also it probably would be better to name it 'load_count'. Because
>> to the end of the patch I was thinking you decided to count Lua refs
>> too somewhy, and realized it is load count only afterwards, when
>> started writing the comments.
> 
> Strictly speaking plain int should fit here better but signed size
> just guaranteed to be large enough to cover address space.

But it is not an address either ...

>>> +/**
>>> + * Reference a function instance.
>>> + */
>>> +static bool
>>> +cbox_func_ref(struct cbox_func *cf)
>>> +{
>>> +	assert(cf->ref >= 0);
>>> +
>>> +	/*
>>> +	 * Hardly to ever happen but just
>>> +	 * to make sure.
>>
>> 9. We already discussed it a ton of times, literally exactly the
>> same 'problem' with struct error objects. A 64 bit integer
>> will not overflow for your lifetime even if it would increment
>> every nanosecond. How is it possible? Why the code should get
>> complicated by making a simple 'ref' function be able to fail?
> 
> This has nothing to do for fair use. It prevents from errors where
> you occasionally screwed the counter.

In the code you always either increment or decrement it. How many
lifes you will need to get it 'screwed'?

> I don't know why but for some
> reason you think that counters are never screwed.

With that we can get to the point that it is not safe to use TCP,
and we need to ensure packet order, send checksums and so on. Also
we can't rely on the main memory stability - some electrons may leak,
and screw a bit in a number. So we need to check each number before
we use it.

What I think is that there is a border of sanity, when the checks
must be stopped in order to keep the code sane and simple. An overflow
of a 64 bit number, changed only by increments and decrements, is beyond
that border.

If the counter is somehow broken, it means its memory was overridden,
and the whole process is compromised. It makes no sense to continue
the execution. You not only continue, you also expose this error to a
user. What is he supposed to do with that error?

> What we really
> need is a general kind of kref_t and it *must* include saturation
> check somewhere inside and panic on overflow.

This is not really applicable in Tarantool. At least not everywhere.
Struct tuple, for instance, is referenced, but it uses a very special
reference counter to save every single byte, because tuple count is
millions and billions.

>>> +/**
>>> + * Allocate a new function instance.
>>> + */
>>> +static struct cbox_func *
>>> +cbox_func_new(const char *name, size_t name_len)
>>> +{
>>> +	const ssize_t cf_size = sizeof(struct cbox_func);
>>> +	ssize_t size = cf_size + name_len + 1;
>>> +	if (size < 0) {
>>
>> 10. If this is intended to check for an overflow, it does not work.
>> If I will pass name_len = UINT64_MAX (on my machine size_t == 8 bytes),
>> then UINT64_MAX + 1 = 0, you will allocate only sizeof(), and then
>> will do memcpy for UINT64_MAX below.
>>
>> I would suggest not to get crazy with such checks. You can't cover
>> everything, and it is not needed.
> 
> It is a typo, it should be `if (size <= 0)`.

It won't work either. Because size won't be 0. Name_len + 1 will be,
and size will be = sizeof(struct cbox_func).

> I think we must check
> what users pass us and don't allow to screw a program.

Name_len is returned by lua_tolstring(). If there is a string, whose
size overflows size_t, I am afraid the program is already screwed,
because there is not enough memory in the world to fit such a string.

> So this test
> is not crazy at all.

It really is, see above.