[Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module

Sat Jan 30 21:53:04 MSK 2021

>>> On success the first returned value set to `true` and `res` represent
>>> function execution result.
>>>
>>> ``` Lua
>>> local ok, res = cfunc_sum(1, 2)
>>> assert(ok);
>>> print(res)
>>> ```
>>>
>>> We will see the number `3` in output.
>>
>> 5. What happens when I do multireturn?
> 
> Not sure I follow you here. But if you mean tradtional
> multireturn then we have it in test case (the next patch)
> 
>  | cfunc_multireturn()
>  |  | ---
>  |  | - true
>  |  | - [1]
>  |  | - [1]
> 
> The function should prepare the result by self. The calling
> convention is the same as to old known `box.func`. Or
> you mean something else?

I mean you should document it better. I know how multireturn
works in your patch because I reviewed it. Users won't look at
the source code or at the tests.

>>> +
>>> +static struct cmod_func *
>>> +cmod_get_func_handle(struct lua_State *L, bool mandatory)
>>
>> 8. According to the code style, flags must have 'is' prefix or
>> a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'.
> 
> Could you please point me where this rule sits?

I couldn't find it either. AFAIU, more detailed rules should come with
the next update of the C style code on the site. But I remember saying
it to you a few times.

> I see both usage (with and without prefix) in our code.

Without prefix is incorrect. It is either old code, or new badly reviewed
code.

>>> +{
>>> +	struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname);
>>> +	if (mandatory && cf == NULL) {
>>> +		const char *fmt = "The function is already unloaded";
>>> +		diag_set(IllegalParams, fmt);
>>> +	}
>>> +	return cf;
>>> +}
>>> +
>>> +static void
>>> +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf)
>>> +{
>>> +	cmod_set_handle(L, cmod_func_handle_uname, cf);
>>> +}
>>> +
>>> +static void
>>> +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf)
>>> +{
>>> +	cmod_setup_handle(L, cmod_func_handle_uname, cf);
>>> +}
>>
>> 9. Amount of boilerplate code and double checks of the same
>> conditions you need just to push some udata is terrifying, tbh.
>>
>> For instance, take lcmod_func_handle_call(). It calls
>> cmod_get_func_handle() and checks result for NULL. This one
>> calls cmod_get_handle() and also checks result for NULL.
>> This one calls luaL_testudata() and also checks result for NULL.
>> You did 3 NULL checks which could be one, if you would have
>> a set of simpler functions working directly with luaL_testudata().
> 
> No, you're wrong. The minimum number of check is 2 (1 - fetch the
> double pointer to userdata, 2 - dereference first level). Instead
> I use two helpers where one sets diag in case of error which allows
> me to not duplicate code. This is a way better.Until you meant
> something different.

I simply mean the code is hard to read. You have a ton of 2-line
functions with strange names which I need to lookup each time I am
looking at one of the upper level functions. Because from the
names I don't always understand what they are doing.

>> 10. Why do you have 'handle' word in each function name? For
>> example, how 'handle_call' is any different from just 'call' in
>> scope of code in this file? Or how is 'cmod_setup_func' different
>> from 'cmod_setup_func_handle'? It seems that by 'handle' you
>> try not to clash names with the non-Lua part of cmod. But you
>> should do it via prefixes, not via suffixes. lcmod instead of cmod.
> 
> The base idea of handle is that "handle" is sitting in udata of Lua
> variable and the functions are fetching it from there.

In udata you literally store the pointer at the function or module.
You don't have 'func_handle' or something.

> User may unload
> a function then its handle set to NULL but variable is still alive
> and subsequent call should print an error.
> 
> If you prefer the prefix then sure, gimme some time to try out.

I would prefer to drop most of these functions if possible, and get
certainly rid of the 'handle' suffix. I suspect you copied it from
popen code which does something similar - deep system code exposed to
Lua. But in popen they literally have 'popen_handle' structure. You
don't. So you don't need 'handle' in function names. It is long and
confusing.

>>> +{
>>> +	mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
>>> +	if (e == mh_end(func_hash))
>>> +		return NULL;
>>> +	return mh_strnptr_node(func_hash, e)->val;
>>> +}
>>> +
>>> +/**
>>> + * Delete a function instance from the hash or decrease
>>> + * a reference if the function is still loaded.
>>> + *
>>> + * @param cf function descriptor.
>>> + *
>>> + * @retval true if the function has no more loaded instances
>>> + * and removed from the hash.
>>> + *
>>> + * @retval false if there are loaded instances left and function
>>> + * is kept in the hash.
>>> + */
>>> +static bool
>>> +cmod_func_del(struct cmod_func *cf)
>>
>> 12. This whole cmod and lcmod API looks extremely inconsistent and complex.
>> Here are the functions you have (not counting the boilerplate code above):
>>
>> // Find a function in the cache, ok.
>> cmod_func_find(name)
>>
>> // It is called 'delete'. But it does not do the
>> // delete. Moreover, it also decreases 'load_count',
>> // which is super surprising. What does it have to do
>> // with loads? You don't call module_sym_unload() here,
>> // but decrease load_count.
>> cmod_func_del(f)
> 
> The same function may be loaded twise into different
> Lua variables, thus we count number of loads. You proposed
> it.

I don't talk here about behaviour of the public API. I talk
about internal functions having seemingly random names not
related to what they are doing.

> I think we can drop this countint and simply use module
> `refs` field instead as you proposed in another email.

I didn't propose to remove the function refs. If you drop them,
how will you determine when a cmod_func object can be deleted?

> cmod_func_add/del operates with function hash they don't manipulate
> symbols. The symbols are handled via module_cache. This is because
> we have two interfaces for functions: box.func and cmod.

But they touch load_count. If you would remove load_count change
from them and rename to something like

	func_hash_add()
	func_hash_del()

It would look better probably. The functions really would become
about hash only.

>> // Ok, this looks like it should add the function to the
>> // hash. But wait, it also increases load_count, and does
>> // not call module_sym_load()! WTF?
>> cmod_func_add(f)
> 
> Tracks number of same function loads.

Exactly, I know this. But the function name is 'add', and it
'tracks loads' - these are unrelated things!

>> // It should free the memory, right? But it calls
>> // module_sym_unload(). Why!? Also, we have a code style
>> // agreement, that function freeing the memory should
>> // have suffix 'delete', not 'free'. As soon as you have
>> // 'new'.
>> cmod_func_free(f)
> 
> OK for naming. As to in general -- new function allocation
> goes into module_cache and resolve the function symbol. The
> symbol resides in memory until last unload/__gc and then
> function gets deleted and module_cache unloads the symbol.

I know the behaviour. My questions are not to the big picture
of behaviour. My questions are about this behaviour being
split into seemingly random parts with strange names.

>> // Fine, it should allocate a new function object, right?
>> // But it also calls module_sym_load()! And does not increase
>> // load_count! I don't understand. It seems like the names
>> // of these functions has very little to do with what they are
>> // really doing. The only way to make them work is to call them
>> // in some special sequence, in which a next function fixes
> 
> "some special sequence" is all our code I fear, the functions
> flow is context dependent grammar not context free.

You know what I mean.

>> // leftovers of previous functions.
>> cmod_func_new(name)
>>
>> This must be seriously redesigned. Starting from the function
>> names.
>>
>> Here is an option:
>>
>> // Find the function in the hash. If found, increase load_count
>> // and return. Otherwise allocate a new function object, fill its
>> // members, do the load, and make load_count = 1. Add to the hash.
>> cmod_func_load(name)
>>
>> // Decrease load_count. If became 0, remove from the hash,
>> // unload mod_sym, and free the memory.
>> cmod_func_unload(f)
>>
>> I don't really see why would you need the other functions. These 2
>> are going to be extra simple anyway. Module_cache API for module
>> objects is literally that simple - load, unload, reload. Why do
>> you need so many for cmod, especially with so unclear behaviour?
> 
> Because otherwise we will have a long function where we will have
> error path to handle. Let me check if I manged to make it readable.
> Note thouh that the number of actions are not goiing to disappear
> it will simply sit in one big function instead.

I looked at module_cache when proposed the API. In module_cache the
public API is relatively good. It is simple and yet totally enough.

Because of this the module functions like lcmod_module_load() look
very simple:

	1. Get arguments from Lua stack.
	2. Forward them to some module_cache function as is.
	3. Push result back to Lua stack.

At the same time for cmod_func functions like lcmod_func_load() you do
too much code which is not about Lua and seems to be internal for
cmod_func objects. But it should be simpler.

Probably it would help if you would think of it like you need a
pure C module. Like there is a cmod.h file, and you need to design API
for the header file to load/unload functions. You would just expose two
functions:

	cmod_func_load()
	cmod_func_unload()

And then it would be extremely easy to wrap this into Lua API. As simple
as it was with module_cache.

Alongside you would probably realize that some of the 'helper' functions
become unnecessary and could be inlined into these 2 functions making the
code ever smaller.

Talking of long functions - they will both fit into one screen I believe.
Even if you inline everything. So I don't see sense in splitting functions
just for the sake of splitting. Functions like 'add to hash', 'remove from
hash', 'allocate new cmod_func and initialize to default values everything'?
Sure, fine, these actions look independent.

But the "public" part of the API should be just 2 functions, you see? Load and
unload. And then you wrap them with a couple of short Lua functions. Plus
ton of Lua shit like __index, __gc, and all. The latter, btw, should just
call cmod_func_unload(). Then you would see that the hash deletion is needed
only in one place, not in two like now.

>>> +static int
>>> +cmod_func_add(struct cmod_func *cf)
>>> +{
>>> +	assert(cf->load_count >= 0);
>>> +	if (cf->load_count++ != 0)
>>
>> 13. I already proposed an option how to rework cmod API above, but
>> still will leave some comments alongside.
> 
> Thanks!

Just tell me you hate these comments and me as well, it is fine. Don't
keep it inside, it is not healthy. You need a lot of mental health if
you work in Tarantool.

>> Why the heck 'add' function increases load_count?
> 
> Because it doesn't need to add same function into hash.

Em. But ... . How are these actions related? Call it simply
'refs' then? If the counter is not related to loads. Or make
the 'load_count' counter incremented by a function at least
having 'load' in its name, right?

>>> +		return 0;
>>> +
>>> +	const struct mh_strnptr_node_t nd = {
>>> +		.str	= cf->name,
>>> +		.len	= cf->len,
>>> +		.hash	= mh_strn_hash(cf->name, cf->len),
>>> +		.val	= cf,
>>> +	};
>>> +
>>> +	mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
>>> +	if (e == mh_end(func_hash)) {
>>> +		diag_set(OutOfMemory, sizeof(nd),
>>> +			 "malloc", "cmod_func node");
>>> +		return -1;
>>> +	}
>>> +	return 0;
>>
>> 14. Why so complex? Just add it to the hash when it is created. This
>> function is used in a single place, where you already do the check
>> if it exists.
> 
> I would like to separate hash helpers from other code. But sure.

It is fine to separate them. For instance, look at module_cache_find,
module_cache_add, module_cache_del - they don't try to do anything
except work with the hash table. And this is atomic. Easy to understand.
Also they have 'module_cache' prefix, so I understand they work with the
cache. Not with the module objects. They accept a module pointer, but
only to save it into the hash. They don't change the modules.

>>> +	TRASH(cf);
>>> +	free(cf);
>>> +}
>>> +
>>> +/**
>>> + * Allocate a new function instance and resolve a symbol address.
>>> + *
>>> + * @param name package path and a function name, ie "module.foo"
>>> + * @param len length of @a name.
>>> + * @param func_name_len function name length, ie "3" for "module.foo"
>>> + *
>>> + * @returns function instance on success, NULL otherwise setting diag area.
>>> + */
>>> +static struct cmod_func *
>>> +cmod_func_new(const char *name, size_t len, size_t func_name_len)
>>> +{
>>> +	const ssize_t cf_size = sizeof(struct cmod_func);
>>> +	size_t size = cf_size + len + 1;
>>> +	struct cmod_func *cf = malloc(size);
>>> +	if (cf == NULL) {
>>> +		diag_set(OutOfMemory, size, "malloc", "cf");
>>> +		return NULL;
>>> +	}
>>> +
>>> +	cf->mod_sym.addr	= NULL;
>>> +	cf->mod_sym.module	= NULL;
>>> +	cf->load_count		= 0;
>>> +	cf->mod_sym.name	= cf->name;
>>> +	cf->func_name		= &cf->name[len - func_name_len];
>>> +	cf->len			= len;
>>> +
>>> +	memcpy(cf->name, name, len);
>>> +	cf->name[len] = '\0';
>>> +
>>> +	if (module_sym_load(&cf->mod_sym) != 0) {
>>
>> 16. You load, and don't increase load_count. Does not it look
>> contradictory to you?
> 
> Well, there are two sides: initial values where address comes from
> module subsystem and the state when function is ready to be used.
> The function is "ready" when it is not only initilized and has
> symbol resolved but when it sits inside function cache, and because
> of this we increas load_count only in cmod_func_add. That is why
> load_count increased not at moment of creation.

I see your point. Maybe just load_count name confuses me. Probably
would help to rename it to 'refs' really. Then some things come
together, at least in my mind.

>>> +static int
>>> +lcmod_func_load(struct lua_State *L)
>>> +{
>>> +	const char *method = "function = module:load";
>>> +	struct cmod_func *cf = NULL;
>>> +
>>> +	if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
>>> +		const char *fmt =
>>> +			"Expects %s(\'name\') but no name passed";
>>> +		diag_set(IllegalParams, fmt, method);
>>
>> 17. Why couldn't you just inline 'fmt' string? The current version
>> is not any shorter nor easier to read. The same in many other places.
> 
> To not split the string - it will overflow 80 symbols iirc

And why not split it? We do it in a lot of places. Don't tell about
grep. The string has %s and escapes quotes. If you see such a string
in a test, you won't know how the source looks anyway.

>>> +		return luaT_push_nil_and_error(L);
>>> +	}
>>> +
>>> +	struct module *m = cmod_get_module_handle(L, false);
>>> +	if (m == NULL) {
>>> +		const char *fmt =
>>> +			"Expects %s(\'name\') but not module object passed";
>>> +		diag_set(IllegalParams, fmt, method);
>>> +		return luaT_push_nil_and_error(L);
>>> +	}
>>> +
>>> +	const char *func_name = lua_tostring(L, 2);
>>> +	const char *name = tt_sprintf("%s.%s", m->package, func_name);
>>> +	size_t len = strlen(name);
>>> +
>>> +	cf = cmod_func_find(name, len);
>>
>> 18. Or you could pass package name and function name to the constructor
>> and copy them right to the allocated memory, without double copying
>> via the static buffer.
> 
> Will take a look, though the hash requires the long name which means
> we will need to count number of symbols in merged name. Will try.

Wouldn't strlen(left) + strlen(right) work?

>>> +static int
>>> +lcmod_module_load(struct lua_State *L)
>>> +{
>>> +	if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
>>> +		const char *fmt =
>>> +			"Expects cmod.load(\'name\') but no name passed";
>>> +		diag_set(IllegalParams, fmt);
>>> +		return luaT_push_nil_and_error(L);
>>> +	}
>>> +
>>> +	size_t name_len;
>>> +	const char *name = lua_tolstring(L, 1, &name_len);
>>> +
>>> +	struct module *module = module_load(name, name_len);
>>> +	if (module == NULL)
>>> +		return luaT_push_nil_and_error(L);
>>> +
>>> +	cmod_setup_module_handle(L, module);
>>> +	/*
>>> +	 * Make sure the module won't disappear until
>>> +	 * it is GC'ed or unloaded explicitly.
>>> +	 */
>>> +	module->ref++;
>>
>> 20. The fact that you need to touch reference counter of an object
>> from a different subsystem should make you start to question
>> sanity of the API. Why module_load() does not do the increase,
>> and module_unload() - decrease?
> 
> Because there were no such API and I though this gonna be over
> the top taking into account the size of the patch.

Total size of the patchset does not matter. And size of one commit
also does not matter if it is atomic.

But talking of this issue - you could make a small commit before
this one, which makes module_load/unload increment the ref.

> Moreover there
> is different senatics of how modules are working in box.fun and
> our cmod module.

How increment of the counter on module_load() is going to change the
behaviour? It is not about when to do the load in box.func. Box.func
also references module object, also calls load() and unload()/gc().
I don't propose to change *when* they are called.

> I agree that this is ugly, maybe it is a good time to improve
> modules code.