[Tarantool-patches] [PATCH v15 09/11] module_cache: use own hash for box.schema.func requests

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Feb 7 20:20:42 MSK 2021


Thanks for the patch!

On 05.02.2021 19:54, Cyrill Gorcunov via Tarantool-patches wrote:
> This is needed to distinguish requests from future `cmod` module
> (which gonna be detecting modules renew automatically) and old
> `box.schema.func` interface. To make it clear use `box_schema_hash`
> for such modules.

After this patch you made module_cache depend on box.schema.func.
Which should not happen. Module_cache is at the lower level. It
should not depend on anything in box except basic things like ClientError,
port.

Besides, you made cmod depend on box.schema.func implicitly. Depend on
the storage. Which was an issue of the first versions of the patchset.
We should not go there again. Both this patch and the previous one should
not really exist. If you have a hash specific for box.schema.func, it
should be there. In box/func.c for example.

AFAIS, this issue should disappear when you will have a second hash on top
of the first one. Then module_sym on first load will get a module from
module_cache. You will put it into the box.schema.func hash, increase
reference counter. And all next loads will see module_sym.module != NULL
even if it was deleted from the lower hash. So it won't lead to load of
a new module.

But even if we wouldn't make the second hash on top, but on the side, the
hack in this commit also could be fixed. You would need to extract
module_sym first load into a new function. The regular sym load would accept
a module object. And also you would need to expose methods to load/unload an
entire module, not just module_sym. Then you could have a separate hash in
another file. However as we discussed verbally, better go for the two-level
hash approach.

I suggest to try this really-really hard. It won't help if you will send
a new patchset almost every day. It will only slow things down.

P.S. I see you even did separate the module load/unload into new
functions, like I advised above. This should be a sign that the module_cache
literally asks us not to make 2 independent hash tables in it.

> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> index 786e49cba..22b906fd7 100644
> --- a/src/box/module_cache.c
> +++ b/src/box/module_cache.c
> @@ -23,7 +23,7 @@
>  #include "module_cache.h"
>  
>  /** Modules name to descriptor hash. */
> -static struct mh_strnptr_t *mod_hash = NULL;
> +static struct mh_strnptr_t *box_schema_hash = NULL;
>  
>  /**
>   * Parsed symbol and package names.
> @@ -45,6 +45,16 @@ struct func_name {
>  	const char *package_end;
>  };
>  
> +/**
> + * Return module hash depending on where request comes
> + * from: if it is legacy `box.schema.func` interface or not.
> + */
> +static inline struct mh_strnptr_t *
> +hash_tbl(bool is_box_schema)
> +{
> +	return is_box_schema ? box_schema_hash : NULL;
> +}
> +
>  /***
>   * Split function name to symbol and package names.
>   *
> @@ -391,7 +401,7 @@ module_sym(struct module *module, const char *name)
>  }
>  
>  int
> -module_sym_load(struct module_sym *mod_sym)
> +module_sym_load(struct module_sym *mod_sym, bool is_box_schema)

The amount of hacks like 'is_box_schema' flag, 'hash_tbl' function,
'module hash pointer' in the module object, and even usage of word 'schema'
in a subsystem which has nothing to do with the storage, should be a
red flag that something is wrong.


More information about the Tarantool-patches mailing list