The cmod module provide a way to execute C stored procedures on read only nodes without registering them in `_func` system space. The series implements a bare minimum. Vlad, please take a look once time permit. v1-v3 are development ones and not sent. v5 (by vlad): - drop exists, list methods: they are redundant - rename cfunc to cbox - when create a function make it callable Lua object - initialize cbox out of modules - fix error in passing module name for reloading - make api been cbox.func.[create|drop] and cbox.module.reload - fix test for OSX sake v6 (by vlad): - move module handling into module_cache file. v7: - development v8: - use rbtree for function instance storage, since i don't like the idea of unexpected rehashing of values in case of massive number of functions allocated - use reference counter and free function instance if only load/unload are coupled - keep a pointer to the function inside Lua object so we don't need to lookup on every function call. this force us to implement __gc method - use new API and update docs v9: - development v10: - use hashes for function names lookup - simply function loads counting - use luaL_register_module and luaL_register_type for easier methods registering - carry functions as userdata object v11: - development v12: - switch to new API as been discussed in https://lists.tarantool.org/tarantool-patches/e186c454-6765-4776-6433-f3f791ff4c27@tarantool.org/ branch gorcunov/gh-4642-func-ro-12 issue https://github.com/tarantool/tarantool/issues/4642 Cyrill Gorcunov (8): box/func: factor out c function entry structure module_cache: move module handling into own subsystem module_cache: improve naming module_cache: direct update a cache value on reload module_cache: rename calls to ref in module structure module_cache: provide helpers to load and unload modules box/cmod: implement cmod Lua module test: box/cfunc -- add cmod test src/box/CMakeLists.txt | 2 + src/box/func.c | 491 +-------------------------- src/box/func.h | 41 +-- src/box/func_def.h | 14 - src/box/lua/cmod.c | 718 ++++++++++++++++++++++++++++++++++++++++ src/box/lua/cmod.h | 24 ++ src/box/lua/init.c | 2 + src/box/module_cache.c | 557 +++++++++++++++++++++++++++++++ src/box/module_cache.h | 158 +++++++++ test/box/CMakeLists.txt | 2 + test/box/cfunc1.c | 58 ++++ test/box/cfunc2.c | 132 ++++++++ test/box/cmod.result | 199 +++++++++++ test/box/cmod.test.lua | 70 ++++ test/box/suite.ini | 2 +- 15 files changed, 1933 insertions(+), 537 deletions(-) create mode 100644 src/box/lua/cmod.c create mode 100644 src/box/lua/cmod.h create mode 100644 src/box/module_cache.c create mode 100644 src/box/module_cache.h create mode 100644 test/box/cfunc1.c create mode 100644 test/box/cfunc2.c create mode 100644 test/box/cmod.result create mode 100644 test/box/cmod.test.lua base-commit: d712f83c105d107bac2491703fef5e154f54f55d -- 2.29.2
Currently func_c structure is a wrapper over struct func which in turn handles function definition, credentials and etc, mostly to handle internal "_func" space. Such tight bound doesn't allow to reuse func_c in any other context. But we're going to support C function execution in read-only instances and for this we better reuse already existing structures as much as possible instead of breeding new ones. Thus lets extract module symbols into module_sym structure, this allows us to reuse module path cache in other code. While extracting the structure rename "func" member to "addr" since this exactly what the member represents: an address of symbol in memory. Same time to be completely independent of "_func" specific lets carry symbolic name inplace (ie member "name" is kind of redundant when module_sym is used for "_func" handling where we can retrieve the name via function definition but such definition won't be available for non-persistent C functions which we will support in next patches). The new structure allows us to load/unload general symbols via module_sym_load() and module_sym_unload() helpers. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/func.c | 190 ++++++++++++++++++++++++------------------------- src/box/func.h | 42 +++++++++++ 2 files changed, 134 insertions(+), 98 deletions(-) diff --git a/src/box/func.c b/src/box/func.c index 9909cee45..8030aa07c 100644 --- a/src/box/func.c +++ b/src/box/func.c @@ -59,19 +59,8 @@ struct func_name { struct func_c { /** Function object base class. */ struct func base; - /** - * Anchor for module membership. - */ - struct rlist item; - /** - * For C functions, the body of the function. - */ - box_function_f func; - /** - * Each stored function keeps a handle to the - * dynamic library for the C callback. - */ - struct module *module; + /** C function module symbol. */ + struct module_sym mod_sym; }; /*** @@ -371,6 +360,73 @@ module_sym(struct module *module, const char *name) return f; } +int +module_sym_load(struct module_sym *mod_sym) +{ + assert(mod_sym->addr == NULL); + + struct func_name name; + func_split_name(mod_sym->name, &name); + + /* + * In case if module has been loaded already by + * some previous call we can eliminate redundant + * loading and take it from the cache. + */ + struct module *cached, *module; + cached = module_cache_find(name.package, name.package_end); + if (cached == NULL) { + module = module_load(name.package, name.package_end); + if (module == NULL) + return -1; + if (module_cache_put(module) != 0) { + module_delete(module); + return -1; + } + } else { + module = cached; + } + + mod_sym->addr = module_sym(module, name.sym); + if (mod_sym->addr == NULL) { + if (cached == NULL) { + /* + * In case if it was a first load we should + * clean the cache immediately otherwise + * the module continue being referenced even + * if there will be no use of it. + * + * Note the module_sym set an error thus be + * careful to not wipe it. + */ + module_cache_del(name.package, name.package_end); + module_delete(module); + } + return -1; + } + mod_sym->module = module; + rlist_add(&module->funcs, &mod_sym->item); + return 0; +} + +void +module_sym_unload(struct module_sym *mod_sym) +{ + if (mod_sym->module == NULL) + return; + + rlist_del(&mod_sym->item); + if (rlist_empty(&mod_sym->module->funcs)) { + struct func_name name; + func_split_name(mod_sym->name, &name); + module_cache_del(name.package, name.package_end); + } + module_gc(mod_sym->module); + + mod_sym->module = NULL; + mod_sym->addr = NULL; +} + int module_reload(const char *package, const char *package_end, struct module **module) { @@ -385,15 +441,15 @@ module_reload(const char *package, const char *package_end, struct module **modu if (new_module == NULL) return -1; - struct func_c *func, *tmp_func; - rlist_foreach_entry_safe(func, &old_module->funcs, item, tmp_func) { + struct module_sym *mod_sym, *tmp; + rlist_foreach_entry_safe(mod_sym, &old_module->funcs, item, tmp) { struct func_name name; - func_split_name(func->base.def->name, &name); - func->func = module_sym(new_module, name.sym); - if (func->func == NULL) + func_split_name(mod_sym->name, &name); + mod_sym->addr = module_sym(new_module, name.sym); + if (mod_sym->addr == NULL) goto restore; - func->module = new_module; - rlist_move(&new_module->funcs, &func->item); + mod_sym->module = new_module; + rlist_move(&new_module->funcs, &mod_sym->item); } module_cache_del(package, package_end); if (module_cache_put(new_module) != 0) @@ -408,9 +464,9 @@ module_reload(const char *package, const char *package_end, struct module **modu */ do { struct func_name name; - func_split_name(func->base.def->name, &name); - func->func = module_sym(old_module, name.sym); - if (func->func == NULL) { + func_split_name(mod_sym->name, &name); + mod_sym->addr = module_sym(old_module, name.sym); + if (mod_sym->addr == NULL) { /* * Something strange was happen, an early loaden * function was not found in an old dso. @@ -418,10 +474,11 @@ module_reload(const char *package, const char *package_end, struct module **modu panic("Can't restore module function, " "server state is inconsistent"); } - func->module = old_module; - rlist_move(&old_module->funcs, &func->item); - } while (func != rlist_first_entry(&old_module->funcs, - struct func_c, item)); + mod_sym->module = old_module; + rlist_move(&old_module->funcs, &mod_sym->item); + } while (mod_sym != rlist_first_entry(&old_module->funcs, + struct module_sym, + item)); assert(rlist_empty(&new_module->funcs)); module_delete(new_module); return -1; @@ -484,94 +541,31 @@ func_c_new(MAYBE_UNUSED struct func_def *def) return NULL; } func->base.vtab = &func_c_vtab; - func->func = NULL; - func->module = NULL; + func->mod_sym.addr = NULL; + func->mod_sym.module = NULL; + func->mod_sym.name = def->name; return &func->base; } -static void -func_c_unload(struct func_c *func) -{ - if (func->module) { - rlist_del(&func->item); - if (rlist_empty(&func->module->funcs)) { - struct func_name name; - func_split_name(func->base.def->name, &name); - module_cache_del(name.package, name.package_end); - } - module_gc(func->module); - } - func->module = NULL; - func->func = NULL; -} - static void func_c_destroy(struct func *base) { assert(base->vtab == &func_c_vtab); assert(base != NULL && base->def->language == FUNC_LANGUAGE_C); struct func_c *func = (struct func_c *) base; - func_c_unload(func); + module_sym_unload(&func->mod_sym); TRASH(base); free(func); } -/** - * Resolve func->func (find the respective DLL and fetch the - * symbol from it). - */ -static int -func_c_load(struct func_c *func) -{ - assert(func->func == NULL); - - struct func_name name; - func_split_name(func->base.def->name, &name); - - struct module *cached, *module; - cached = module_cache_find(name.package, name.package_end); - if (cached == NULL) { - module = module_load(name.package, name.package_end); - if (module == NULL) - return -1; - if (module_cache_put(module)) { - module_delete(module); - return -1; - } - } else { - module = cached; - } - - func->func = module_sym(module, name.sym); - if (func->func == NULL) { - if (cached == NULL) { - /* - * In case if it was a first load we should - * clean the cache immediately otherwise - * the module continue being referenced even - * if there will be no use of it. - * - * Note the module_sym set an error thus be - * careful to not wipe it. - */ - module_cache_del(name.package, name.package_end); - module_delete(module); - } - return -1; - } - func->module = module; - rlist_add(&module->funcs, &func->item); - return 0; -} - int func_c_call(struct func *base, struct port *args, struct port *ret) { assert(base->vtab == &func_c_vtab); assert(base != NULL && base->def->language == FUNC_LANGUAGE_C); struct func_c *func = (struct func_c *) base; - if (func->func == NULL) { - if (func_c_load(func) != 0) + if (func->mod_sym.addr == NULL) { + if (module_sym_load(&func->mod_sym) != 0) return -1; } @@ -586,10 +580,10 @@ func_c_call(struct func *base, struct port *args, struct port *ret) box_function_ctx_t ctx = { ret }; /* Module can be changed after function reload. */ - struct module *module = func->module; + struct module *module = func->mod_sym.module; assert(module != NULL); ++module->calls; - int rc = func->func(&ctx, data, data + data_sz); + int rc = func->mod_sym.addr(&ctx, data, data + data_sz); --module->calls; module_gc(module); region_truncate(region, region_svp); diff --git a/src/box/func.h b/src/box/func.h index 581e468cb..9a7f17446 100644 --- a/src/box/func.h +++ b/src/box/func.h @@ -58,6 +58,28 @@ struct module { char package[0]; }; +/** + * Callable symbol bound to a module. + */ +struct module_sym { + /** + * Anchor for module membership. + */ + struct rlist item; + /** + * For C functions, address of the function. + */ + box_function_f addr; + /** + * A module the symbol belongs to. + */ + struct module *module; + /** + * Function name definition. + */ + char *name; +}; + /** Virtual method table for func object. */ struct func_vtab { /** Call function with given arguments. */ @@ -108,6 +130,26 @@ func_delete(struct func *func); int func_call(struct func *func, struct port *args, struct port *ret); +/** + * Resolve C entry (find the respective DLL and fetch the + * symbol from it). + * + * @param mod_sym module symbol pointer. + * @retval -1 on error. + * @retval 0 on success. + */ +int +module_sym_load(struct module_sym *mod_sym); + +/** + * Unload module symbol and drop it from the package + * cache if there is no users left. + * + * @param mod_sym module symbol pointer. + */ +void +module_sym_unload(struct module_sym *mod_sym); + /** * Reload dynamically loadable module. * -- 2.29.2
The module handling should not be bound to particular function implementation (we will have two users: already existing functions for "_func" space, and a new upcoming one which will be serving cbox submodule in next patch). For this sake all module related code is moved to module_cache file where we do symbol resolving, calling and tracking of symbol usage. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/CMakeLists.txt | 1 + src/box/func.c | 473 +------------------------------------ src/box/func.h | 83 +------ src/box/func_def.h | 14 -- src/box/module_cache.c | 513 +++++++++++++++++++++++++++++++++++++++++ src/box/module_cache.h | 139 +++++++++++ 6 files changed, 657 insertions(+), 566 deletions(-) create mode 100644 src/box/module_cache.c create mode 100644 src/box/module_cache.h diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt index 19203f770..339e2c8a9 100644 --- a/src/box/CMakeLists.txt +++ b/src/box/CMakeLists.txt @@ -187,6 +187,7 @@ add_library(box STATIC sql_stmt_cache.c wal.c call.c + module_cache.c merger.c ibuf.c ${sql_sources} diff --git a/src/box/func.c b/src/box/func.c index 8030aa07c..4e7025fdb 100644 --- a/src/box/func.c +++ b/src/box/func.c @@ -31,30 +31,12 @@ #include "func.h" #include "fiber.h" #include "trivia/config.h" -#include "assoc.h" #include "lua/utils.h" #include "lua/call.h" -#include "error.h" -#include "errinj.h" #include "diag.h" #include "port.h" #include "schema.h" #include "session.h" -#include "libeio/eio.h" -#include <fcntl.h> -#include <dlfcn.h> - -/** - * Parsed symbol and package names. - */ -struct func_name { - /** Null-terminated symbol name, e.g. "func" for "mod.submod.func" */ - const char *sym; - /** Package name, e.g. "mod.submod" for "mod.submod.func" */ - const char *package; - /** A pointer to the last character in ->package + 1 */ - const char *package_end; -}; struct func_c { /** Function object base class. */ @@ -63,427 +45,6 @@ struct func_c { struct module_sym mod_sym; }; -/*** - * Split function name to symbol and package names. - * For example, str = foo.bar.baz => sym = baz, package = foo.bar - * @param str function name, e.g. "module.submodule.function". - * @param[out] name parsed symbol and package names. - */ -static void -func_split_name(const char *str, struct func_name *name) -{ - name->package = str; - name->package_end = strrchr(str, '.'); - if (name->package_end != NULL) { - /* module.submodule.function => module.submodule, function */ - name->sym = name->package_end + 1; /* skip '.' */ - } else { - /* package == function => function, function */ - name->sym = name->package; - name->package_end = str + strlen(str); - } -} - -/** - * Arguments for luaT_module_find used by lua_cpcall() - */ -struct module_find_ctx { - const char *package; - const char *package_end; - char *path; - size_t path_len; -}; - -/** - * A cpcall() helper for module_find() - */ -static int -luaT_module_find(lua_State *L) -{ - struct module_find_ctx *ctx = (struct module_find_ctx *) - lua_topointer(L, 1); - - /* - * Call package.searchpath(name, package.cpath) and use - * the path to the function in dlopen(). - */ - lua_getglobal(L, "package"); - - lua_getfield(L, -1, "search"); - - /* Argument of search: name */ - lua_pushlstring(L, ctx->package, ctx->package_end - ctx->package); - - lua_call(L, 1, 1); - if (lua_isnil(L, -1)) - return luaL_error(L, "module not found"); - /* Convert path to absolute */ - char resolved[PATH_MAX]; - if (realpath(lua_tostring(L, -1), resolved) == NULL) { - diag_set(SystemError, "realpath"); - return luaT_error(L); - } - - snprintf(ctx->path, ctx->path_len, "%s", resolved); - return 0; -} - -/** - * Find path to module using Lua's package.cpath - * @param package package name - * @param package_end a pointer to the last byte in @a package + 1 - * @param[out] path path to shared library - * @param path_len size of @a path buffer - * @retval 0 on success - * @retval -1 on error, diag is set - */ -static int -module_find(const char *package, const char *package_end, char *path, - size_t path_len) -{ - struct module_find_ctx ctx = { package, package_end, path, path_len }; - lua_State *L = tarantool_L; - int top = lua_gettop(L); - if (luaT_cpcall(L, luaT_module_find, &ctx) != 0) { - int package_len = (int) (package_end - package); - diag_set(ClientError, ER_LOAD_MODULE, package_len, package, - lua_tostring(L, -1)); - lua_settop(L, top); - return -1; - } - assert(top == lua_gettop(L)); /* cpcall discard results */ - return 0; -} - -static struct mh_strnptr_t *modules = NULL; - -static void -module_gc(struct module *module); - -int -module_init(void) -{ - modules = mh_strnptr_new(); - if (modules == NULL) { - diag_set(OutOfMemory, sizeof(*modules), "malloc", - "modules hash table"); - return -1; - } - return 0; -} - -void -module_free(void) -{ - while (mh_size(modules) > 0) { - mh_int_t i = mh_first(modules); - struct module *module = - (struct module *) mh_strnptr_node(modules, i)->val; - /* Can't delete modules if they have active calls */ - module_gc(module); - } - mh_strnptr_delete(modules); -} - -/** - * Look up a module in the modules cache. - */ -static struct module * -module_cache_find(const char *name, const char *name_end) -{ - mh_int_t i = mh_strnptr_find_inp(modules, name, name_end - name); - if (i == mh_end(modules)) - return NULL; - return (struct module *)mh_strnptr_node(modules, i)->val; -} - -/** - * Save module to the module cache. - */ -static inline int -module_cache_put(struct module *module) -{ - size_t package_len = strlen(module->package); - uint32_t name_hash = mh_strn_hash(module->package, package_len); - const struct mh_strnptr_node_t strnode = { - module->package, package_len, name_hash, module}; - - if (mh_strnptr_put(modules, &strnode, NULL, NULL) == mh_end(modules)) { - diag_set(OutOfMemory, sizeof(strnode), "malloc", "modules"); - return -1; - } - return 0; -} - -/** - * Delete a module from the module cache - */ -static void -module_cache_del(const char *name, const char *name_end) -{ - mh_int_t i = mh_strnptr_find_inp(modules, name, name_end - name); - if (i == mh_end(modules)) - return; - mh_strnptr_del(modules, i, NULL); -} - -/* - * Load a dso. - * Create a new symlink based on temporary directory and try to - * load via this symink to load a dso twice for cases of a function - * reload. - */ -static struct module * -module_load(const char *package, const char *package_end) -{ - char path[PATH_MAX]; - if (module_find(package, package_end, path, sizeof(path)) != 0) - return NULL; - - int package_len = package_end - package; - struct module *module = (struct module *) - malloc(sizeof(*module) + package_len + 1); - if (module == NULL) { - diag_set(OutOfMemory, sizeof(struct module) + package_len + 1, - "malloc", "struct module"); - return NULL; - } - memcpy(module->package, package, package_len); - module->package[package_len] = 0; - rlist_create(&module->funcs); - module->calls = 0; - - const char *tmpdir = getenv("TMPDIR"); - if (tmpdir == NULL) - tmpdir = "/tmp"; - char dir_name[PATH_MAX]; - int rc = snprintf(dir_name, sizeof(dir_name), "%s/tntXXXXXX", tmpdir); - if (rc < 0 || (size_t) rc >= sizeof(dir_name)) { - diag_set(SystemError, "failed to generate path to tmp dir"); - goto error; - } - - if (mkdtemp(dir_name) == NULL) { - diag_set(SystemError, "failed to create unique dir name: %s", - dir_name); - goto error; - } - char load_name[PATH_MAX]; - rc = snprintf(load_name, sizeof(load_name), "%s/%.*s." TARANTOOL_LIBEXT, - dir_name, package_len, package); - if (rc < 0 || (size_t) rc >= sizeof(dir_name)) { - diag_set(SystemError, "failed to generate path to DSO"); - goto error; - } - - struct stat st; - if (stat(path, &st) < 0) { - diag_set(SystemError, "failed to stat() module %s", path); - goto error; - } - - int source_fd = open(path, O_RDONLY); - if (source_fd < 0) { - diag_set(SystemError, "failed to open module %s file for" \ - " reading", path); - goto error; - } - int dest_fd = open(load_name, O_WRONLY|O_CREAT|O_TRUNC, - st.st_mode & 0777); - if (dest_fd < 0) { - diag_set(SystemError, "failed to open file %s for writing ", - load_name); - close(source_fd); - goto error; - } - - off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size); - close(source_fd); - close(dest_fd); - if (ret != st.st_size) { - diag_set(SystemError, "failed to copy DSO %s to %s", - path, load_name); - goto error; - } - - module->handle = dlopen(load_name, RTLD_NOW | RTLD_LOCAL); - if (unlink(load_name) != 0) - say_warn("failed to unlink dso link %s", load_name); - if (rmdir(dir_name) != 0) - say_warn("failed to delete temporary dir %s", dir_name); - if (module->handle == NULL) { - diag_set(ClientError, ER_LOAD_MODULE, package_len, - package, dlerror()); - goto error; - } - struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT); - if (e != NULL) - ++e->iparam; - return module; -error: - free(module); - return NULL; -} - -static void -module_delete(struct module *module) -{ - struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT); - if (e != NULL) - --e->iparam; - dlclose(module->handle); - TRASH(module); - free(module); -} - -/* - * Check if a dso is unused and can be closed. - */ -static void -module_gc(struct module *module) -{ - if (rlist_empty(&module->funcs) && module->calls == 0) - module_delete(module); -} - -/* - * Import a function from the module. - */ -static box_function_f -module_sym(struct module *module, const char *name) -{ - box_function_f f = (box_function_f)dlsym(module->handle, name); - if (f == NULL) { - diag_set(ClientError, ER_LOAD_FUNCTION, name, dlerror()); - return NULL; - } - return f; -} - -int -module_sym_load(struct module_sym *mod_sym) -{ - assert(mod_sym->addr == NULL); - - struct func_name name; - func_split_name(mod_sym->name, &name); - - /* - * In case if module has been loaded already by - * some previous call we can eliminate redundant - * loading and take it from the cache. - */ - struct module *cached, *module; - cached = module_cache_find(name.package, name.package_end); - if (cached == NULL) { - module = module_load(name.package, name.package_end); - if (module == NULL) - return -1; - if (module_cache_put(module) != 0) { - module_delete(module); - return -1; - } - } else { - module = cached; - } - - mod_sym->addr = module_sym(module, name.sym); - if (mod_sym->addr == NULL) { - if (cached == NULL) { - /* - * In case if it was a first load we should - * clean the cache immediately otherwise - * the module continue being referenced even - * if there will be no use of it. - * - * Note the module_sym set an error thus be - * careful to not wipe it. - */ - module_cache_del(name.package, name.package_end); - module_delete(module); - } - return -1; - } - mod_sym->module = module; - rlist_add(&module->funcs, &mod_sym->item); - return 0; -} - -void -module_sym_unload(struct module_sym *mod_sym) -{ - if (mod_sym->module == NULL) - return; - - rlist_del(&mod_sym->item); - if (rlist_empty(&mod_sym->module->funcs)) { - struct func_name name; - func_split_name(mod_sym->name, &name); - module_cache_del(name.package, name.package_end); - } - module_gc(mod_sym->module); - - mod_sym->module = NULL; - mod_sym->addr = NULL; -} - -int -module_reload(const char *package, const char *package_end, struct module **module) -{ - struct module *old_module = module_cache_find(package, package_end); - if (old_module == NULL) { - /* Module wasn't loaded - do nothing. */ - *module = NULL; - return 0; - } - - struct module *new_module = module_load(package, package_end); - if (new_module == NULL) - return -1; - - struct module_sym *mod_sym, *tmp; - rlist_foreach_entry_safe(mod_sym, &old_module->funcs, item, tmp) { - struct func_name name; - func_split_name(mod_sym->name, &name); - mod_sym->addr = module_sym(new_module, name.sym); - if (mod_sym->addr == NULL) - goto restore; - mod_sym->module = new_module; - rlist_move(&new_module->funcs, &mod_sym->item); - } - module_cache_del(package, package_end); - if (module_cache_put(new_module) != 0) - goto restore; - module_gc(old_module); - *module = new_module; - return 0; -restore: - /* - * Some old-dso func can't be load from new module, restore old - * functions. - */ - do { - struct func_name name; - func_split_name(mod_sym->name, &name); - mod_sym->addr = module_sym(old_module, name.sym); - if (mod_sym->addr == NULL) { - /* - * Something strange was happen, an early loaden - * function was not found in an old dso. - */ - panic("Can't restore module function, " - "server state is inconsistent"); - } - mod_sym->module = old_module; - rlist_move(&old_module->funcs, &mod_sym->item); - } while (mod_sym != rlist_first_entry(&old_module->funcs, - struct module_sym, - item)); - assert(rlist_empty(&new_module->funcs)); - module_delete(new_module); - return -1; -} - static struct func * func_c_new(struct func_def *def); @@ -563,39 +124,9 @@ func_c_call(struct func *base, struct port *args, struct port *ret) { assert(base->vtab == &func_c_vtab); assert(base != NULL && base->def->language == FUNC_LANGUAGE_C); - struct func_c *func = (struct func_c *) base; - if (func->mod_sym.addr == NULL) { - if (module_sym_load(&func->mod_sym) != 0) - return -1; - } - - struct region *region = &fiber()->gc; - size_t region_svp = region_used(region); - uint32_t data_sz; - const char *data = port_get_msgpack(args, &data_sz); - if (data == NULL) - return -1; - - port_c_create(ret); - box_function_ctx_t ctx = { ret }; - /* Module can be changed after function reload. */ - struct module *module = func->mod_sym.module; - assert(module != NULL); - ++module->calls; - int rc = func->mod_sym.addr(&ctx, data, data + data_sz); - --module->calls; - module_gc(module); - region_truncate(region, region_svp); - if (rc != 0) { - if (diag_last_error(&fiber()->diag) == NULL) { - /* Stored procedure forget to set diag */ - diag_set(ClientError, ER_PROC_C, "unknown error"); - } - port_destroy(ret); - return -1; - } - return rc; + struct func_c *func = (struct func_c *)base; + return module_sym_call(&func->mod_sym, args, ret); } static struct func_vtab func_c_vtab = { diff --git a/src/box/func.h b/src/box/func.h index 9a7f17446..11a466b28 100644 --- a/src/box/func.h +++ b/src/box/func.h @@ -34,7 +34,8 @@ #include <stddef.h> #include <stdint.h> #include <stdbool.h> -#include "small/rlist.h" + +#include "module_cache.h" #include "func_def.h" #include "user_def.h" @@ -44,42 +45,6 @@ extern "C" { struct func; -/** - * Dynamic shared module. - */ -struct module { - /** Module dlhandle. */ - void *handle; - /** List of imported functions. */ - struct rlist funcs; - /** Count of active calls. */ - size_t calls; - /** Module's package name. */ - char package[0]; -}; - -/** - * Callable symbol bound to a module. - */ -struct module_sym { - /** - * Anchor for module membership. - */ - struct rlist item; - /** - * For C functions, address of the function. - */ - box_function_f addr; - /** - * A module the symbol belongs to. - */ - struct module *module; - /** - * Function name definition. - */ - char *name; -}; - /** Virtual method table for func object. */ struct func_vtab { /** Call function with given arguments. */ @@ -106,18 +71,6 @@ struct func { struct access access[BOX_USER_MAX]; }; -/** - * Initialize modules subsystem. - */ -int -module_init(void); - -/** - * Cleanup modules subsystem. - */ -void -module_free(void); - struct func * func_new(struct func_def *def); @@ -130,38 +83,6 @@ func_delete(struct func *func); int func_call(struct func *func, struct port *args, struct port *ret); -/** - * Resolve C entry (find the respective DLL and fetch the - * symbol from it). - * - * @param mod_sym module symbol pointer. - * @retval -1 on error. - * @retval 0 on success. - */ -int -module_sym_load(struct module_sym *mod_sym); - -/** - * Unload module symbol and drop it from the package - * cache if there is no users left. - * - * @param mod_sym module symbol pointer. - */ -void -module_sym_unload(struct module_sym *mod_sym); - -/** - * Reload dynamically loadable module. - * - * @param package name begin pointer. - * @param package_end package_end name end pointer. - * @param[out] module a pointer to store module object on success. - * @retval -1 on error. - * @retval 0 on success. - */ -int -module_reload(const char *package, const char *package_end, struct module **module); - #if defined(__cplusplus) } /* extern "C" */ #endif /* defined(__cplusplus) */ diff --git a/src/box/func_def.h b/src/box/func_def.h index d99d89190..75cd6a0d3 100644 --- a/src/box/func_def.h +++ b/src/box/func_def.h @@ -168,20 +168,6 @@ func_def_dup(struct func_def *def); int func_def_check(struct func_def *def); -/** - * API of C stored function. - */ - -struct port; - -struct box_function_ctx { - struct port *port; -}; - -typedef struct box_function_ctx box_function_ctx_t; -typedef int (*box_function_f)(box_function_ctx_t *ctx, - const char *args, const char *args_end); - #ifdef __cplusplus } #endif diff --git a/src/box/module_cache.c b/src/box/module_cache.c new file mode 100644 index 000000000..9fe316807 --- /dev/null +++ b/src/box/module_cache.c @@ -0,0 +1,513 @@ +/* + * SPDX-License-Identifier: BSD-2-Clause + * + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. + */ + +#include <dlfcn.h> +#include <fcntl.h> +#include <stdlib.h> +#include <string.h> + +#include "assoc.h" +#include "diag.h" +#include "error.h" +#include "errinj.h" +#include "fiber.h" +#include "port.h" + +#include "box/error.h" +#include "lua/utils.h" +#include "libeio/eio.h" + +#include "module_cache.h" + +/** Modules name to descriptor hash. */ +static struct mh_strnptr_t *mod_hash = NULL; + +/** + * Parsed symbol and package names. + */ +struct func_name { + /** + * Null-terminated symbol name, e.g. + * "func" for "mod.submod.func". + */ + const char *sym; + /** + * Package name, e.g. "mod.submod" for + * "mod.submod.func". + */ + const char *package; + /** + * A pointer to the last character in ->package + 1. + */ + const char *package_end; +}; + +/*** + * Split function name to symbol and package names. + * + * For example, str = foo.bar.baz => sym = baz, package = foo.bar + * + * @param str function name, e.g. "module.submodule.function". + * @param[out] name parsed symbol and a package name. + */ +static void +func_split_name(const char *str, struct func_name *name) +{ + name->package = str; + name->package_end = strrchr(str, '.'); + if (name->package_end != NULL) { + /* module.submodule.function => module.submodule, function */ + name->sym = name->package_end + 1; /* skip '.' */ + } else { + /* package == function => function, function */ + name->sym = name->package; + name->package_end = str + strlen(str); + } +} + +/** + * Look up a module in the modules cache. + */ +static struct module * +module_cache_find(const char *name, const char *name_end) +{ + mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name); + if (e == mh_end(mod_hash)) + return NULL; + return mh_strnptr_node(mod_hash, e)->val; +} + +/** + * Save a module to the modules cache. + */ +static int +module_cache_add(struct module *module) +{ + size_t package_len = strlen(module->package); + const struct mh_strnptr_node_t nd = { + .str = module->package, + .len = package_len, + .hash = mh_strn_hash(module->package, package_len), + .val = module, + }; + + if (mh_strnptr_put(mod_hash, &nd, NULL, NULL) == mh_end(mod_hash)) { + diag_set(OutOfMemory, sizeof(nd), "malloc", + "module cache node"); + return -1; + } + return 0; +} + +/** + * Delete a module from the modules cache. + */ +static void +module_cache_del(const char *name, const char *name_end) +{ + mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name); + if (e != mh_end(mod_hash)) + mh_strnptr_del(mod_hash, e, NULL); +} + +/** + * Arguments for luaT_module_find used by lua_cpcall(). + */ +struct module_find_ctx { + const char *package; + const char *package_end; + char *path; + size_t path_len; +}; + +/** + * A cpcall() helper for module_find(). + */ +static int +luaT_module_find(lua_State *L) +{ + struct module_find_ctx *ctx = (void *)lua_topointer(L, 1); + + /* + * Call package.searchpath(name, package.cpath) and use + * the path to the function in dlopen(). + */ + lua_getglobal(L, "package"); + lua_getfield(L, -1, "search"); + + /* Argument of search: name */ + lua_pushlstring(L, ctx->package, ctx->package_end - ctx->package); + + lua_call(L, 1, 1); + if (lua_isnil(L, -1)) + return luaL_error(L, "module not found"); + + /* Convert path to absolute */ + char resolved[PATH_MAX]; + if (realpath(lua_tostring(L, -1), resolved) == NULL) { + diag_set(SystemError, "realpath"); + return luaT_error(L); + } + + snprintf(ctx->path, ctx->path_len, "%s", resolved); + return 0; +} + +/** + * Find a path to a module using Lua's package.cpath. + * + * @param package package name + * @param package_end a pointer to the last byte in @a package + 1 + * @param[out] path path to shared library + * @param path_len size of @a path buffer + * + * @retval 0 on success + * @retval -1 on error, diag is set + */ +static int +module_find(const char *package, const char *package_end, + char *path, size_t path_len) +{ + struct module_find_ctx ctx = { + .package = package, + .package_end = package_end, + .path = path, + .path_len = path_len, + }; + lua_State *L = tarantool_L; + int top = lua_gettop(L); + if (luaT_cpcall(L, luaT_module_find, &ctx) != 0) { + diag_set(ClientError, ER_LOAD_MODULE, + (int)(package_end - package), + package, lua_tostring(L, -1)); + lua_settop(L, top); + return -1; + } + assert(top == lua_gettop(L)); /* cpcall discard results */ + return 0; +} + +/** + * Load dynamic shared object, ie module library. + * + * Create a new symlink based on temporary directory + * and try to load via this symink to load a dso twice + * for cases of a function reload. + */ +static struct module * +module_load(const char *package, const char *package_end) +{ + char path[PATH_MAX]; + if (module_find(package, package_end, path, sizeof(path)) != 0) + return NULL; + + int package_len = package_end - package; + struct module *module = malloc(sizeof(*module) + package_len + 1); + if (module == NULL) { + diag_set(OutOfMemory, sizeof(*module) + package_len + 1, + "malloc", "struct module"); + return NULL; + } + memcpy(module->package, package, package_len); + module->package[package_len] = 0; + rlist_create(&module->funcs_list); + module->calls = 0; + + const char *tmpdir = getenv("TMPDIR"); + if (tmpdir == NULL) + tmpdir = "/tmp"; + + char dir_name[PATH_MAX]; + int rc = snprintf(dir_name, sizeof(dir_name), "%s/tntXXXXXX", tmpdir); + if (rc < 0 || (size_t)rc >= sizeof(dir_name)) { + diag_set(SystemError, "failed to generate path to tmp dir"); + goto error; + } + + if (mkdtemp(dir_name) == NULL) { + diag_set(SystemError, "failed to create unique dir name: %s", + dir_name); + goto error; + } + + char load_name[PATH_MAX]; + rc = snprintf(load_name, sizeof(load_name), "%s/%.*s." TARANTOOL_LIBEXT, + dir_name, package_len, package); + if (rc < 0 || (size_t)rc >= sizeof(dir_name)) { + diag_set(SystemError, "failed to generate path to DSO"); + goto error; + } + + struct stat st; + if (stat(path, &st) < 0) { + diag_set(SystemError, "failed to stat() module %s", path); + goto error; + } + + int source_fd = open(path, O_RDONLY); + if (source_fd < 0) { + diag_set(SystemError, "failed to open module %s", path); + goto error; + } + + int dest_fd = open(load_name, O_WRONLY | O_CREAT | O_TRUNC, + st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO)); + if (dest_fd < 0) { + diag_set(SystemError, "failed to open file %s for writing ", + load_name); + close(source_fd); + goto error; + } + + off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size); + close(source_fd); + close(dest_fd); + if (ret != st.st_size) { + diag_set(SystemError, "failed to copy DSO %s to %s", + path, load_name); + goto error; + } + + module->handle = dlopen(load_name, RTLD_NOW | RTLD_LOCAL); + if (unlink(load_name) != 0) + say_warn("failed to unlink dso link %s", load_name); + if (rmdir(dir_name) != 0) + say_warn("failed to delete temporary dir %s", dir_name); + if (module->handle == NULL) { + diag_set(ClientError, ER_LOAD_MODULE, package_len, + package, dlerror()); + goto error; + } + + struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT); + if (e != NULL) + ++e->iparam; + return module; + +error: + free(module); + return NULL; +} + +/** + * Delete a module. + */ +static void +module_delete(struct module *module) +{ + struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT); + if (e != NULL) + --e->iparam; + dlclose(module->handle); + TRASH(module); + free(module); +} + +/** + * Check if a module is unused and delete it then. + */ +static void +module_gc(struct module *module) +{ + if (rlist_empty(&module->funcs_list) && module->calls == 0) + module_delete(module); +} + +/** + * Import a function from a module. + */ +static box_function_f +module_sym(struct module *module, const char *name) +{ + box_function_f f = dlsym(module->handle, name); + if (f == NULL) { + diag_set(ClientError, ER_LOAD_FUNCTION, name, dlerror()); + return NULL; + } + return f; +} + +int +module_sym_load(struct module_sym *mod_sym) +{ + assert(mod_sym->addr == NULL); + + struct func_name name; + func_split_name(mod_sym->name, &name); + + struct module *module = module_cache_find(name.package, name.package_end); + if (module == NULL) { + module = module_load(name.package, name.package_end); + if (module == NULL) + return -1; + if (module_cache_add(module) != 0) { + module_delete(module); + return -1; + } + } + + mod_sym->addr = module_sym(module, name.sym); + if (mod_sym->addr == NULL) + return -1; + + mod_sym->module = module; + rlist_add(&module->funcs_list, &mod_sym->item); + return 0; +} + +void +module_sym_unload(struct module_sym *mod_sym) +{ + if (mod_sym->module == NULL) + return; + + rlist_del(&mod_sym->item); + if (rlist_empty(&mod_sym->module->funcs_list)) { + struct func_name name; + func_split_name(mod_sym->name, &name); + module_cache_del(name.package, name.package_end); + } + module_gc(mod_sym->module); + + mod_sym->module = NULL; + mod_sym->addr = NULL; +} + +int +module_sym_call(struct module_sym *mod_sym, struct port *args, + struct port *ret) +{ + if (mod_sym->addr == NULL) { + if (module_sym_load(mod_sym) != 0) + return -1; + } + + struct region *region = &fiber()->gc; + size_t region_svp = region_used(region); + + uint32_t data_sz; + const char *data = port_get_msgpack(args, &data_sz); + if (data == NULL) + return -1; + + port_c_create(ret); + box_function_ctx_t ctx = { + .port = ret, + }; + + /* + * Module can be changed after function reload. Also + * keep in mind that stored C procedure may yield inside. + */ + struct module *module = mod_sym->module; + assert(module != NULL); + ++module->calls; + int rc = mod_sym->addr(&ctx, data, data + data_sz); + --module->calls; + module_gc(module); + region_truncate(region, region_svp); + + if (rc != 0) { + if (diag_last_error(&fiber()->diag) == NULL) { + /* Stored procedure forget to set diag */ + diag_set(ClientError, ER_PROC_C, "unknown error"); + } + port_destroy(ret); + return -1; + } + + return rc; +} + +int +module_reload(const char *package, const char *package_end, + struct module **module) +{ + struct module *old = module_cache_find(package, package_end); + if (old == NULL) { + /* Module wasn't loaded - do nothing. */ + *module = NULL; + return 0; + } + + struct module *new = module_load(package, package_end); + if (new == NULL) + return -1; + + struct module_sym *mod_sym, *tmp; + rlist_foreach_entry_safe(mod_sym, &old->funcs_list, item, tmp) { + struct func_name name; + func_split_name(mod_sym->name, &name); + + mod_sym->addr = module_sym(new, name.sym); + if (mod_sym->addr == NULL) { + say_error("module: reload %s, symbol %s not found", + package, name.sym); + goto restore; + } + + mod_sym->module = new; + rlist_move(&new->funcs_list, &mod_sym->item); + } + + module_cache_del(package, package_end); + if (module_cache_add(new) != 0) + goto restore; + + module_gc(old); + *module = new; + return 0; + +restore: + /* + * Some old-dso func can't be load from new module, + * restore old functions. + */ + do { + struct func_name name; + func_split_name(mod_sym->name, &name); + mod_sym->addr = module_sym(old, name.sym); + if (mod_sym->addr == NULL) { + /* + * Something strange was happen, an early loaden + * function was not found in an old dso. + */ + panic("Can't restore module function, " + "server state is inconsistent"); + } + mod_sym->module = old; + rlist_move(&old->funcs_list, &mod_sym->item); + } while (mod_sym != rlist_first_entry(&old->funcs_list, + struct module_sym, + item)); + assert(rlist_empty(&new->funcs_list)); + module_delete(new); + return -1; +} + +int +module_init(void) +{ + mod_hash = mh_strnptr_new(); + if (mod_hash == NULL) { + diag_set(OutOfMemory, sizeof(*mod_hash), + "malloc", "modules hash table"); + return -1; + } + return 0; +} + +void +module_free(void) +{ + while (mh_size(mod_hash) > 0) { + mh_int_t i = mh_first(mod_hash); + struct module *m = mh_strnptr_node(mod_hash, i)->val; + module_gc(m); + } + mh_strnptr_delete(mod_hash); + mod_hash = NULL; +} diff --git a/src/box/module_cache.h b/src/box/module_cache.h new file mode 100644 index 000000000..fd789f603 --- /dev/null +++ b/src/box/module_cache.h @@ -0,0 +1,139 @@ +/* + * SPDX-License-Identifier: BSD-2-Clause + * + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. + */ + +#pragma once + +#include <small/rlist.h> + +#if defined(__cplusplus) +extern "C" { +#endif /* defined(__cplusplus) */ + + +/** + * API of C stored function. + */ + +struct port; + +struct box_function_ctx { + struct port *port; +}; + +typedef struct box_function_ctx box_function_ctx_t; +typedef int (*box_function_f)(box_function_ctx_t *ctx, + const char *args, + const char *args_end); + +/** + * Dynamic shared module. + */ +struct module { + /** + * Module dlhandle. + */ + void *handle; + /** + * List of associated symbols (functions). + */ + struct rlist funcs_list; + /** + * Count of active calls. + */ + size_t calls; + /** + * Module's package name. + */ + char package[0]; +}; + +/** + * Callable symbol bound to a module. + */ +struct module_sym { + /** + * Anchor for module::funcs_list. + */ + struct rlist item; + /** + * For C functions, address of the function. + */ + box_function_f addr; + /** + * A module the symbol belongs to. + */ + struct module *module; + /** + * Symbol (function) name definition. + */ + char *name; +}; + +/** + * Load a new module symbol. + * + * @param mod_sym symbol to load. + * + * @returns 0 on succse, -1 otherwise, diag is set. + */ +int +module_sym_load(struct module_sym *mod_sym); + +/** + * Unload a module's symbol. + * + * @param mod_sym symbol to unload. + */ +void +module_sym_unload(struct module_sym *mod_sym); + +/** + * Execute a module symbol (run a function). + * + * The function packs function arguments into a message pack + * and send it as a function argument. Function may return + * results via @a ret stream. + * + * @param mod_sym module symbol to run. + * @param args function arguments. + * @param[out] ret execution results. + * + * @returns 0 on success, -1 otherwise, diag is set. + */ +int +module_sym_call(struct module_sym *mod_sym, struct port *args, + struct port *ret); + +/** + * Reload a module and all associated symbols. + * + * @param package shared library path start. + * @param package_end shared library path end. + * @param[out] module pointer to the reloaded module. + * + * @return 0 on succes, -1 otherwise, diag is set. + */ +int +module_reload(const char *package, const char *package_end, + struct module **module); + +/** + * Initialize modules subsystem. + * + * @return 0 on succes, -1 otherwise, diag is set. + */ +int +module_init(void); + +/** + * Free modules subsystem. + */ +void +module_free(void); + +#if defined(__cplusplus) +} +#endif /* defined(__plusplus) */ -- 2.29.2
- rename func_name to func_name_desc because it is not just a name but rather a name descriptor which includes symbol address - rename func_split_name to parse_func_name because the action of splitting name to parts is not what the function about, the function parses specification to fetch a descriptor Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 52 +++++++++++++++++++++--------------------- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index 9fe316807..8ae61a883 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -26,9 +26,9 @@ static struct mh_strnptr_t *mod_hash = NULL; /** - * Parsed symbol and package names. + * Function name descriptor: a symbol and a package. */ -struct func_name { +struct func_name_desc { /** * Null-terminated symbol name, e.g. * "func" for "mod.submod.func". @@ -46,25 +46,25 @@ struct func_name { }; /*** - * Split function name to symbol and package names. + * Parse function name into a name descriptor. * * For example, str = foo.bar.baz => sym = baz, package = foo.bar * * @param str function name, e.g. "module.submodule.function". - * @param[out] name parsed symbol and a package name. + * @param[out] d parsed symbol and a package name. */ static void -func_split_name(const char *str, struct func_name *name) +parse_func_name(const char *str, struct func_name_desc *d) { - name->package = str; - name->package_end = strrchr(str, '.'); - if (name->package_end != NULL) { + d->package = str; + d->package_end = strrchr(str, '.'); + if (d->package_end != NULL) { /* module.submodule.function => module.submodule, function */ - name->sym = name->package_end + 1; /* skip '.' */ + d->sym = d->package_end + 1; /* skip '.' */ } else { /* package == function => function, function */ - name->sym = name->package; - name->package_end = str + strlen(str); + d->sym = d->package; + d->package_end = str + strlen(str); } } @@ -335,12 +335,12 @@ module_sym_load(struct module_sym *mod_sym) { assert(mod_sym->addr == NULL); - struct func_name name; - func_split_name(mod_sym->name, &name); + struct func_name_desc d; + parse_func_name(mod_sym->name, &d); - struct module *module = module_cache_find(name.package, name.package_end); + struct module *module = module_cache_find(d.package, d.package_end); if (module == NULL) { - module = module_load(name.package, name.package_end); + module = module_load(d.package, d.package_end); if (module == NULL) return -1; if (module_cache_add(module) != 0) { @@ -349,7 +349,7 @@ module_sym_load(struct module_sym *mod_sym) } } - mod_sym->addr = module_sym(module, name.sym); + mod_sym->addr = module_sym(module, d.sym); if (mod_sym->addr == NULL) return -1; @@ -366,9 +366,9 @@ module_sym_unload(struct module_sym *mod_sym) rlist_del(&mod_sym->item); if (rlist_empty(&mod_sym->module->funcs_list)) { - struct func_name name; - func_split_name(mod_sym->name, &name); - module_cache_del(name.package, name.package_end); + struct func_name_desc d; + parse_func_name(mod_sym->name, &d); + module_cache_del(d.package, d.package_end); } module_gc(mod_sym->module); @@ -439,13 +439,13 @@ module_reload(const char *package, const char *package_end, struct module_sym *mod_sym, *tmp; rlist_foreach_entry_safe(mod_sym, &old->funcs_list, item, tmp) { - struct func_name name; - func_split_name(mod_sym->name, &name); + struct func_name_desc d; + parse_func_name(mod_sym->name, &d); - mod_sym->addr = module_sym(new, name.sym); + mod_sym->addr = module_sym(new, d.sym); if (mod_sym->addr == NULL) { say_error("module: reload %s, symbol %s not found", - package, name.sym); + package, d.sym); goto restore; } @@ -467,9 +467,9 @@ module_reload(const char *package, const char *package_end, * restore old functions. */ do { - struct func_name name; - func_split_name(mod_sym->name, &name); - mod_sym->addr = module_sym(old, name.sym); + struct func_name_desc d; + parse_func_name(mod_sym->name, &d); + mod_sym->addr = module_sym(old, d.sym); if (mod_sym->addr == NULL) { /* * Something strange was happen, an early loaden -- 2.29.2
Currently when we reload modules we remove old instance from the module cache and then try to insert a new one back. Note that module cache is a string based hash table: we lookup for a pointer to the module via package name. Thus when reload initiated we simply remove same key from hash and put it back with a new pointer. This is too complex and completely useless. Instead we should simply update the value associated this the key in the hash table. For this sake we introduce module_cache_update helper. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index 8ae61a883..48e55f421 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -422,6 +422,21 @@ module_sym_call(struct module_sym *mod_sym, struct port *args, return rc; } +/** + * Update the module cache. Since the lookup is string + * key based it is safe to just update the value. + */ +static int +module_cache_update(const char *name, const char *name_end, + struct module *module) +{ + mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name); + if (e == mh_end(mod_hash)) + return -1; + mh_strnptr_node(mod_hash, e)->val = module; + return 0; +} + int module_reload(const char *package, const char *package_end, struct module **module) @@ -453,9 +468,15 @@ module_reload(const char *package, const char *package_end, rlist_move(&new->funcs_list, &mod_sym->item); } - module_cache_del(package, package_end); - if (module_cache_add(new) != 0) - goto restore; + if (module_cache_update(package, package_end, new) != 0) { + /* + * Module cache must be consistent at this moment, + * we've looked up for the package recently. If + * someone has updated the cache in unexpected way + * the consistency is lost and we must not continue. + */ + panic("module: can't update module cache (%s)", package); + } module_gc(old); *module = new; -- 2.29.2
We will use this field not only to count a number of active calls to the module but to prevent the module from disappear when it get loaded via "cmod" interface (will be implemented in next patches). Same time make it 64 bit long since there might be a number of such references size_t didn't prevent from numeric overflow. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 8 ++++---- src/box/module_cache.h | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index 48e55f421..ae874caab 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -214,7 +214,7 @@ module_load(const char *package, const char *package_end) memcpy(module->package, package, package_len); module->package[package_len] = 0; rlist_create(&module->funcs_list); - module->calls = 0; + module->ref = 0; const char *tmpdir = getenv("TMPDIR"); if (tmpdir == NULL) @@ -312,7 +312,7 @@ module_delete(struct module *module) static void module_gc(struct module *module) { - if (rlist_empty(&module->funcs_list) && module->calls == 0) + if (rlist_empty(&module->funcs_list) && module->ref == 0) module_delete(module); } @@ -404,9 +404,9 @@ module_sym_call(struct module_sym *mod_sym, struct port *args, */ struct module *module = mod_sym->module; assert(module != NULL); - ++module->calls; + ++module->ref; int rc = mod_sym->addr(&ctx, data, data + data_sz); - --module->calls; + --module->ref; module_gc(module); region_truncate(region, region_svp); diff --git a/src/box/module_cache.h b/src/box/module_cache.h index fd789f603..feb1f2266 100644 --- a/src/box/module_cache.h +++ b/src/box/module_cache.h @@ -41,9 +41,9 @@ struct module { */ struct rlist funcs_list; /** - * Count of active calls. + * Count of active references to the module. */ - size_t calls; + int64_t ref; /** * Module's package name. */ -- 2.29.2
We need to be able to load and unload modules without functions bound to them, just plain dlopen inside. This is a part of API for future patches. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 47 +++++++++++++++++++++++++++++++----------- src/box/module_cache.h | 19 +++++++++++++++++ 2 files changed, 54 insertions(+), 12 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index ae874caab..cb580f342 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -198,7 +198,7 @@ module_find(const char *package, const char *package_end, * for cases of a function reload. */ static struct module * -module_load(const char *package, const char *package_end) +module_dlopen(const char *package, const char *package_end) { char path[PATH_MAX]; if (module_find(package, package_end, path, sizeof(path)) != 0) @@ -338,16 +338,10 @@ module_sym_load(struct module_sym *mod_sym) struct func_name_desc d; parse_func_name(mod_sym->name, &d); - struct module *module = module_cache_find(d.package, d.package_end); - if (module == NULL) { - module = module_load(d.package, d.package_end); - if (module == NULL) - return -1; - if (module_cache_add(module) != 0) { - module_delete(module); - return -1; - } - } + size_t len = d.package_end - d.package; + struct module *module = module_load(d.package, len); + if (module == NULL) + return -1; mod_sym->addr = module_sym(module, d.sym); if (mod_sym->addr == NULL) @@ -437,6 +431,34 @@ module_cache_update(const char *name, const char *name_end, return 0; } +/** + * Load a new module. + */ +struct module * +module_load(const char *path, size_t path_len) +{ + struct module *module = module_cache_find(path, &path[path_len]); + if (module == NULL) { + module = module_dlopen(path, &path[path_len]); + if (module == NULL) + return NULL; + if (module_cache_add(module) != 0) { + module_delete(module); + return NULL; + } + } + return module; +} + +/** + * Unload a module. + */ +void +module_unload(struct module *module) +{ + module_gc(module); +} + int module_reload(const char *package, const char *package_end, struct module **module) @@ -448,7 +470,8 @@ module_reload(const char *package, const char *package_end, return 0; } - struct module *new = module_load(package, package_end); + size_t len = package_end - package; + struct module *new = module_dlopen(package, &package[len]); if (new == NULL) return -1; diff --git a/src/box/module_cache.h b/src/box/module_cache.h index feb1f2266..87108ae81 100644 --- a/src/box/module_cache.h +++ b/src/box/module_cache.h @@ -107,6 +107,25 @@ int module_sym_call(struct module_sym *mod_sym, struct port *args, struct port *ret); +/** + * Load a module. + * + * @param path path to a module file without extension. + * @param path_len length of @path. + * + * @return pointer to a module or NULL on error. + */ +struct module * +module_load(const char *path, size_t path_len); + +/** + * Unload a module. + * + * @param module pointer to a module. + */ +void +module_unload(struct module *module); + /** * Reload a module and all associated symbols. * -- 2.29.2
Currently to run "C" function from some external module one have to register it first in "_func" system space. This is a problem if node is in read-only mode (replica). Still people would like to have a way to run such functions even in ro mode. For this sake we implement "cmod" lua module. Fixes #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> @TarantoolBot document Title: cmod module Overview ======== `cmod` module provides a way to create, delete and execute `C` procedures. Unlike `box.schema.func` methods the functions created with `cmod` help are not persistent and live purely in memory. Once a node get turned off they are vanished. An initial purpose for them is to execute them on nodes which are running in read-only mode. Module functions ================ `require('cmod').load(path) -> obj | nil, err` ---------------------------------------------- Loads a module from `path` and return an object instance associate with the module. The `path` should not end up with shared object extension (such as `.so`), only a file name shall be there. Possible errors: - IllegalParams: module path is either not supplied or not a string. - SystemError: unable to open a module due to a system error. - ClientError: a module does not exist. - OutOfMemory: unable to allocate a module. `module:unload() -> true | nil, err` ------------------------------------ Unloads a module. Once the module is unloaded one can't load new functions from this module instance. Possible errors: - IllegalParams: a module is not supplied. - IllegalParams: a module is already unloaded. Example: ``` Lua m, err = require('cmod').load('path') if not m then print(err) end ok, err = m:unload() if not ok then print(err) end ``` If there are functions from this module referenced somewhere in other Lua code they still can be called because module continue sitting in memory until the last reference is closed. If the module become a target to the Lua's garbage collector then unload is called implicitly. `module:reload() -> true | nil, err` ------------------------------------ Reloads a module and all functions which were associated with the module. Each module keeps a list of functions belonging to the module and reload procedure cause the bound function to update their addresses such that function execution will be routed via a new library. Modules are loaded with that named local binding which means that reload of module symbols won't affect the functions which are started execution already, only new calls will be rerouted. Possible errors: - IllegalParams: a module is not supplied. - ClientError: a module does not exist. On success `true` is returned, otherwise `nil,error` pair. Example: ``` Lua m, err = require('cmod').load('path') if not m then print(err) end ok, err = m:reload() if not ok then print(err) end ``` module:load(name) -> obj | nil, err` ------------------------------------ Loads a new function with name `name` from the `module` object and return a callable object instance associate with the function. Possible errors: - IllegalParams: function name is either not supplied or not a string. - OutOfMemory: unable to allocate a function. Example: ``` Lua m, err = require('cmod').load('path') if not m then print(err) end f, err = m:load('foo') if not ok then print(err) end ``` `function:unload() -> true | nil, err` -------------------------------------- Unloads a function. Possible errors: - IllegalParams: function name is either not supplied or not a string. - IllegalParams: the function does not exist. Example: ``` Lua m, err = require('cmod').load('path') if not m then print(err) end f, err = m:load('foo') if not ok then print(err) end ok, err = f:unload() if not ok then print(err) end ``` Executing a loaded function =========================== Once function is loaded it can be executed by ordinary Lua call. Lets consider the following example. We have a `C` function which takes two numbers and returns their sum. ``` C int cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end) { uint32_t arg_count = mp_decode_array(&args); if (arg_count != 2) { return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", "invalid argument count"); } uint64_t a = mp_decode_uint(&args); uint64_t b = mp_decode_uint(&args); char res[16]; char *end = mp_encode_uint(res, a + b); box_return_mp(ctx, res, end); return 0; } ``` The name of the function is `cfunc_sum` and the function is built into `cfunc.so` shared library. First we should load it as ``` Lua m, err = require('cmod').load('cfunc') if not m then print(err) end cfunc_sum, err = m:load('cfunc_sum') if not cfunc_sum then print(err) end ``` Once successfully loaded we can execute it. Note that unlike regular Lua functions the context of `C` functions is different. They never thrown an exception but return `true|nil, res` form where first value set to `nil` in case of error condition and `res` carries an error description. Lets call the `cfunc_sum` with wrong number of arguments ``` Lua local ok, res = cfunc_sum() if not ok then print(res) end ``` We will the `"invalid argument count"` message in output. The error message has been set by the `box_error_set` in `C` code above. On success the first returned value set to `true` and `res` represent function execution result. ``` Lua local ok, res = cfunc_sum(1, 2) assert(ok); print(res) ``` We will see the number `3` in output. --- src/box/CMakeLists.txt | 1 + src/box/lua/cmod.c | 718 +++++++++++++++++++++++++++++++++++++++++ src/box/lua/cmod.h | 24 ++ src/box/lua/init.c | 2 + 4 files changed, 745 insertions(+) create mode 100644 src/box/lua/cmod.c create mode 100644 src/box/lua/cmod.h diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt index 339e2c8a9..feba5a037 100644 --- a/src/box/CMakeLists.txt +++ b/src/box/CMakeLists.txt @@ -195,6 +195,7 @@ add_library(box STATIC lua/init.c lua/call.c lua/cfg.cc + lua/cmod.c lua/console.c lua/serialize_lua.c lua/tuple.c diff --git a/src/box/lua/cmod.c b/src/box/lua/cmod.c new file mode 100644 index 000000000..c3a44e3df --- /dev/null +++ b/src/box/lua/cmod.c @@ -0,0 +1,718 @@ +/* + * SPDX-License-Identifier: BSD-2-Clause + * + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. + */ + +#include <string.h> +#include <lua.h> + +#include "assoc.h" +#include "diag.h" + +#include "box/module_cache.h" +#include "box/error.h" +#include "box/port.h" +#include "tt_static.h" + +#include "trivia/util.h" +#include "lua/utils.h" + +/** + * Function descriptor. + */ +struct cmod_func { + /** + * Symbol descriptor for the function in + * an associated module. + */ + struct module_sym mod_sym; + /** + * Number of loads of the function. + */ + int64_t load_count; + /** + * The function name (without a package path). + */ + const char *func_name; + /** + * Length of @a name member. + */ + size_t len; + /** + * Module path with function name separated + * by a point, like "module.func". + * + * See parse_func_name helper, we simply have + * to keep this strange architecture for backward + * compatibility sake (ie for functions which + * are created via box.schema.func). + */ + char name[0]; +}; + +/** Get a handle associated with an object. */ +static void * +cmod_get_handle(struct lua_State *L, const char *uname) +{ + void **pptr = luaL_testudata(L, 1, uname); + return pptr != NULL ? *pptr : NULL; +} + +/** Set a handle to a new value. */ +static void +cmod_set_handle(struct lua_State *L, const char *uname, void *ptr) +{ + void **pptr = luaL_testudata(L, 1, uname); + if (pptr != NULL) + *pptr = ptr; +} + +/** Setup a new handle and associate it with an object. */ +static void +cmod_setup_handle(struct lua_State *L, const char *uname, void *ptr) +{ + *(void **)lua_newuserdata(L, sizeof(void *)) = ptr; + luaL_getmetatable(L, uname); + lua_setmetatable(L, -2); +} + +/** A type to find a module handle from an object. */ +static const char *cmod_module_handle_uname = "cmod_module_handle"; + +static struct module * +cmod_get_module_handle(struct lua_State *L, bool mandatory) +{ + struct module *m = cmod_get_handle(L, cmod_module_handle_uname); + if (mandatory && m == NULL) { + const char *fmt = "The module is already unloaded"; + diag_set(IllegalParams, fmt); + } + return m; +} + +static void +cmod_set_module_handle(struct lua_State *L, struct module *module) +{ + cmod_set_handle(L, cmod_module_handle_uname, module); +} + +static void +cmod_setup_module_handle(struct lua_State *L, struct module *module) +{ + cmod_setup_handle(L, cmod_module_handle_uname, module); +} + +/** A type to find a function handle from an object. */ +static const char *cmod_func_handle_uname = "cmod_func_handle"; + +static struct cmod_func * +cmod_get_func_handle(struct lua_State *L, bool mandatory) +{ + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname); + if (mandatory && cf == NULL) { + const char *fmt = "The function is already unloaded"; + diag_set(IllegalParams, fmt); + } + return cf; +} + +static void +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf) +{ + cmod_set_handle(L, cmod_func_handle_uname, cf); +} + +static void +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf) +{ + cmod_setup_handle(L, cmod_func_handle_uname, cf); +} + +/** + * Function name to cmod_func hash. The name includes + * module package path without file extension. + */ +static struct mh_strnptr_t *func_hash = NULL; + +/** + * Find function in cmod_func hash. + * + * @param name function name. + * @param name_len function name length. + * + * @returns function descriptor if found, NULL otherwise. + */ +struct cmod_func * +cmod_func_find(const char *name, size_t name_len) +{ + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len); + if (e == mh_end(func_hash)) + return NULL; + return mh_strnptr_node(func_hash, e)->val; +} + +/** + * Delete a function instance from the hash or decrease + * a reference if the function is still loaded. + * + * @param cf function descriptor. + * + * @retval true if the function has no more loaded instances + * and removed from the hash. + * + * @retval false if there are loaded instances left and function + * is kept in the hash. + */ +static bool +cmod_func_del(struct cmod_func *cf) +{ + assert(cf->load_count > 0); + if (cf->load_count-- != 1) + return false; + + mh_int_t e = mh_strnptr_find_inp(func_hash, cf->name, cf->len); + assert(e != mh_end(func_hash)); + mh_strnptr_del(func_hash, e, NULL); + return true; +} + +/** + * Add a function instance into the hash or increase + * a reference if the function is already exist. + * + * @param cf Function descriptor. + * + * Possible errors: + * + * - OutOfMemory: unable to allocate a hash entry. + * + * @retval 0 on success. + * @retval -1 on error, diag is set. + */ +static int +cmod_func_add(struct cmod_func *cf) +{ + assert(cf->load_count >= 0); + if (cf->load_count++ != 0) + return 0; + + const struct mh_strnptr_node_t nd = { + .str = cf->name, + .len = cf->len, + .hash = mh_strn_hash(cf->name, cf->len), + .val = cf, + }; + + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL); + if (e == mh_end(func_hash)) { + diag_set(OutOfMemory, sizeof(nd), + "malloc", "cmod_func node"); + return -1; + } + return 0; +} + +/** + * Unload a symbol and free a function instance. + * + * @param cf function descriptor. + */ +static void +cmod_func_free(struct cmod_func *cf) +{ + module_sym_unload(&cf->mod_sym); + TRASH(cf); + free(cf); +} + +/** + * Allocate a new function instance and resolve a symbol address. + * + * @param name package path and a function name, ie "module.foo" + * @param len length of @a name. + * @param func_name_len function name length, ie "3" for "module.foo" + * + * @returns function instance on success, NULL otherwise setting diag area. + */ +static struct cmod_func * +cmod_func_new(const char *name, size_t len, size_t func_name_len) +{ + const ssize_t cf_size = sizeof(struct cmod_func); + size_t size = cf_size + len + 1; + struct cmod_func *cf = malloc(size); + if (cf == NULL) { + diag_set(OutOfMemory, size, "malloc", "cf"); + return NULL; + } + + cf->mod_sym.addr = NULL; + cf->mod_sym.module = NULL; + cf->load_count = 0; + cf->mod_sym.name = cf->name; + cf->func_name = &cf->name[len - func_name_len]; + cf->len = len; + + memcpy(cf->name, name, len); + cf->name[len] = '\0'; + + if (module_sym_load(&cf->mod_sym) != 0) { + cmod_func_free(cf); + return NULL; + } + + return cf; +} + +/** + * Load a new function. + * + * This function takes a function name from the caller + * stack @a L and creates a new function object. If + * the function is already loaded we simply return + * a reference to existing one. + * + * Possible errors: + * + * - IllegalParams: function name is either not supplied + * or not a string. + * - IllegalParams: function references limit exceeded. + * - OutOfMemory: unable to allocate a function. + * + * @returns function object on success or {nil,error} on error, + * the error is set to the diagnostics area. + */ +static int +lcmod_func_load(struct lua_State *L) +{ + const char *method = "function = module:load"; + struct cmod_func *cf = NULL; + + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) { + const char *fmt = + "Expects %s(\'name\') but no name passed"; + diag_set(IllegalParams, fmt, method); + return luaT_push_nil_and_error(L); + } + + struct module *m = cmod_get_module_handle(L, false); + if (m == NULL) { + const char *fmt = + "Expects %s(\'name\') but not module object passed"; + diag_set(IllegalParams, fmt, method); + return luaT_push_nil_and_error(L); + } + + const char *func_name = lua_tostring(L, 2); + const char *name = tt_sprintf("%s.%s", m->package, func_name); + size_t len = strlen(name); + + cf = cmod_func_find(name, len); + if (cf == NULL) { + cf = cmod_func_new(name, len, strlen(func_name)); + if (cf == NULL) + return luaT_push_nil_and_error(L); + } + + if (cmod_func_add(cf) != 0) { + cmod_func_free(cf); + return luaT_push_nil_and_error(L); + } + + cmod_setup_func_handle(L, cf); + return 1; +} + +/** + * Unload a function. + * + * This function takes a function object from + * the caller stack @a L and unloads it. + * + * Possible errors: + * + * - IllegalParams: function is not supplied. + * - IllegalParams: the function does not exist. + * + * @returns true on success or {nil,error} on error, + * the error is set to the diagnostics area. + */ +static int +lcmod_func_unload(struct lua_State *L) +{ + if (lua_gettop(L) != 1) { + const char *fmt = "Expects function:unload()"; + diag_set(IllegalParams, fmt); + return luaT_push_nil_and_error(L); + } + + struct cmod_func *cf = cmod_get_func_handle(L, true); + if (cf == NULL) + return luaT_push_nil_and_error(L); + + cmod_set_func_handle(L, NULL); + if (cmod_func_del(cf)) + cmod_func_free(cf); + + lua_pushboolean(L, true); + return 1; +} + +/** + * Load a new module. + * + * This function takes a module patch from the caller + * stack @a L and creates a new module object. + * + * Possible errors: + * + * - IllegalParams: module path is either not supplied + * or not a string. + * - SystemError: unable to open a module due to a system error. + * - ClientError: a module does not exist. + * - OutOfMemory: unable to allocate a module. + * + * @returns module object on success or {nil,error} on error, + * the error is set to the diagnostics area. + */ +static int +lcmod_module_load(struct lua_State *L) +{ + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) { + const char *fmt = + "Expects cmod.load(\'name\') but no name passed"; + diag_set(IllegalParams, fmt); + return luaT_push_nil_and_error(L); + } + + size_t name_len; + const char *name = lua_tolstring(L, 1, &name_len); + + struct module *module = module_load(name, name_len); + if (module == NULL) + return luaT_push_nil_and_error(L); + + cmod_setup_module_handle(L, module); + /* + * Make sure the module won't disappear until + * it is GC'ed or unloaded explicitly. + */ + module->ref++; + return 1; +} + +/** + * Unload a module. + * + * This function takes a module object from + * the caller stack @a L and unloads it. + * + * If there are some active functions left then + * module won't be freed internally until last function + * from this module is unloaded, this is guaranteed by + * module_cache engine. + * + * Possible errors: + * + * - IllegalParams: a module is not supplied. + * - IllegalParams: a module does not exist. + * + * @returns true on success or {nil,error} on error, + * the error is set to the diagnostics area. + */ +static int +lcmod_module_unload(struct lua_State *L) +{ + if (lua_gettop(L) != 1) { + const char *fmt = "Expects module:unload()"; + diag_set(IllegalParams, fmt); + return luaT_push_nil_and_error(L); + } + + struct module *m = cmod_get_module_handle(L, true); + if (m == NULL) + return luaT_push_nil_and_error(L); + m->ref--; + cmod_set_module_handle(L, NULL); + lua_pushboolean(L, true); + return 1; +} + +/** + * Reload a module. + * + * This function takes a module object from the caller + * stack @a L and reloads all functions associated with + * it. + * + * Possible errors: + * + * - IllegalParams: a module is not supplied. + * - IllegalParams: a module does not exist. + * - ClientError: a module with the name provided does + * not exist. + * + * @returns true on success or {nil,error} on error, + * the error is set to the diagnostics area. + */ +static int +lcmod_module_reload(struct lua_State *L) +{ + if (lua_gettop(L) != 1) { + const char *fmt = "Expects module:reload()"; + diag_set(IllegalParams, fmt); + return luaT_push_nil_and_error(L); + } + + struct module *m = cmod_get_module_handle(L, true); + if (m == NULL) + return luaT_push_nil_and_error(L); + + struct module *new = NULL; + size_t len = strlen(m->package); + if (module_reload(m->package, &m->package[len], &new) == 0) { + if (new != NULL) { + m->ref--; + module_unload(m); + new->ref++; + cmod_set_module_handle(L, new); + lua_pushboolean(L, true); + return 1; + } + diag_set(ClientError, ER_NO_SUCH_MODULE, m->package); + } + return luaT_push_nil_and_error(L); +} + +/** + * Handle __index request for a module object. + */ +static int +lcmod_module_handle_index(struct lua_State *L) +{ + /* + * Instead of showing userdata pointer + * lets provide a serialized value. + */ + lua_getmetatable(L, 1); + lua_pushvalue(L, 2); + lua_rawget(L, -2); + if (!lua_isnil(L, -1)) + return 1; + + struct module *m = cmod_get_module_handle(L, false); + if (m == NULL) { + lua_pushnil(L); + return 1; + } + + const char *key = lua_tostring(L, 2); + + if (lua_type(L, 2) != LUA_TSTRING || key == NULL) { + diag_set(IllegalParams, + "Bad params, use __index(obj, <string>)"); + return luaT_error(L); + } + + if (strcmp(key, "path") == 0) { + lua_pushstring(L, m->package); + return 1; + } + + return 0; +} + +/** + * Module handle representation for REPL (console). + */ +static int +lcmod_module_handle_serialize(struct lua_State *L) +{ + struct module *m = cmod_get_module_handle(L, true); + if (m == NULL) { + lua_pushnil(L); + return 1; + } + + lua_createtable(L, 0, 0); + lua_pushstring(L, m->package); + lua_setfield(L, -2, "path"); + + return 1; +} + +/** + * Collect a module handle. + */ +static int +lcmod_module_handle_gc(struct lua_State *L) +{ + struct module *m = cmod_get_module_handle(L, false); + if (m != NULL) { + /* + * If there are some referenced functions + * alive then final module GC will happen + * on last function unload (indirectly when + * module symbol get collected by module_sym_unload + * code). + */ + cmod_set_module_handle(L, NULL); + m->ref--; + module_unload(m); + } + return 0; +} + +/** + * Function handle representation for REPL (console). + */ +static int +lcmod_func_handle_serialize(struct lua_State *L) +{ + struct cmod_func *cf = cmod_get_func_handle(L, true); + if (cf == NULL) { + lua_pushnil(L); + return 1; + } + + lua_createtable(L, 0, 0); + lua_pushstring(L, cf->name); + lua_setfield(L, -2, "name"); + + return 1; +} + +/** + * Handle __index request for a function object. + */ +static int +lcmod_func_handle_index(struct lua_State *L) +{ + /* + * Instead of showing userdata pointer + * lets provide a serialized value. + */ + lua_getmetatable(L, 1); + lua_pushvalue(L, 2); + lua_rawget(L, -2); + if (!lua_isnil(L, -1)) + return 1; + + struct cmod_func *cf = cmod_get_func_handle(L, true); + if (cf == NULL) + return luaT_error(L); + + const char *key = lua_tostring(L, 2); + + if (lua_type(L, 2) != LUA_TSTRING || key == NULL) { + diag_set(IllegalParams, + "Bad params, use __index(obj, <string>)"); + return luaT_error(L); + } + + if (strcmp(key, "name") == 0) { + lua_pushstring(L, cf->name); + return 1; + } + + return 0; +} + +/** + * Collect function handle if there is no active loads left. + */ +static int +lcmod_func_handle_gc(struct lua_State *L) +{ + struct cmod_func *cf = cmod_get_func_handle(L, false); + if (cf != NULL) { + cmod_set_func_handle(L, NULL); + if (cmod_func_del(cf)) + cmod_func_free(cf); + } + return 0; +} + +/** + * Call a function by its name from the Lua code. + */ +static int +lcmod_func_handle_call(struct lua_State *L) +{ + struct cmod_func *cf = cmod_get_func_handle(L, true); + if (cf == NULL) + return luaT_push_nil_and_error(L); + + /* + * FIXME: We should get rid of luaT_newthread but this + * requires serious modifications. In particular + * port_lua_do_dump uses tarantool_L reference and + * coro_ref must be valid as well. + */ + lua_State *args_L = luaT_newthread(tarantool_L); + if (args_L == NULL) + return luaT_push_nil_and_error(L); + + int coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX); + lua_xmove(L, args_L, lua_gettop(L) - 1); + + struct port args; + port_lua_create(&args, args_L); + ((struct port_lua *)&args)->ref = coro_ref; + + struct port ret; + if (module_sym_call(&cf->mod_sym, &args, &ret) != 0) { + port_destroy(&args); + return luaT_push_nil_and_error(L); + } + + int top = lua_gettop(L); + lua_pushboolean(L, true); + port_dump_lua(&ret, L, true); + int cnt = lua_gettop(L) - top; + + port_destroy(&ret); + port_destroy(&args); + + return cnt; +} + +/** + * Initialize cmod module. + */ +void +box_lua_cmod_init(struct lua_State *L) +{ + func_hash = mh_strnptr_new(); + if (func_hash == NULL) { + panic("Can't allocate cmod hash table"); + } + + static const struct luaL_Reg module_methods[] = { + { "load", lcmod_module_load }, + { NULL, NULL }, + }; + luaL_register_module(L, "cmod", module_methods); + lua_pop(L, 1); + + static const struct luaL_Reg module_handle_methods[] = { + { "load", lcmod_func_load }, + { "reload", lcmod_module_reload }, + { "unload", lcmod_module_unload }, + { "__index", lcmod_module_handle_index }, + { "__serialize", lcmod_module_handle_serialize }, + { "__gc", lcmod_module_handle_gc }, + { NULL, NULL }, + }; + luaL_register_type(L, cmod_module_handle_uname, module_handle_methods); + + static const struct luaL_Reg func_handle_methods[] = { + { "unload", lcmod_func_unload }, + { "__index", lcmod_func_handle_index }, + { "__serialize", lcmod_func_handle_serialize }, + { "__call", lcmod_func_handle_call }, + { "__gc", lcmod_func_handle_gc }, + { NULL, NULL }, + }; + luaL_register_type(L, cmod_func_handle_uname, func_handle_methods); +} diff --git a/src/box/lua/cmod.h b/src/box/lua/cmod.h new file mode 100644 index 000000000..f0ea2d34d --- /dev/null +++ b/src/box/lua/cmod.h @@ -0,0 +1,24 @@ +/* + * SPDX-License-Identifier: BSD-2-Clause + * + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. + */ + +#pragma once + +#if defined(__cplusplus) +extern "C" { +#endif /* defined(__cplusplus) */ + +struct lua_State; + +/** + * Initialize cmod Lua module. + * + * @param L Lua state where to register the cmod module. + */ +void +box_lua_cmod_init(struct lua_State *L); +#if defined(__cplusplus) +} +#endif /* defined(__plusplus) */ diff --git a/src/box/lua/init.c b/src/box/lua/init.c index fbcdfb20b..bad2b7ca9 100644 --- a/src/box/lua/init.c +++ b/src/box/lua/init.c @@ -60,6 +60,7 @@ #include "box/lua/cfg.h" #include "box/lua/xlog.h" #include "box/lua/console.h" +#include "box/lua/cmod.h" #include "box/lua/tuple.h" #include "box/lua/execute.h" #include "box/lua/key_def.h" @@ -465,6 +466,7 @@ box_lua_init(struct lua_State *L) box_lua_tuple_init(L); box_lua_call_init(L); box_lua_cfg_init(L); + box_lua_cmod_init(L); box_lua_slab_init(L); box_lua_index_init(L); box_lua_space_init(L); -- 2.29.2
Note that the test is disabled for a while since we need to update test-run first like | diff --git a/pretest_clean.lua b/pretest_clean.lua | index 9b5ac9d..b0280c4 100644 | --- a/pretest_clean.lua | +++ b/pretest_clean.lua | @@ -272,6 +272,7 @@ local function clean() | package = true, | pickle = true, | popen = true, | + cmod = true, | pwd = true, | socket = true, | strict = true, ie need to enable cmod module. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- test/box/CMakeLists.txt | 2 + test/box/cfunc1.c | 58 ++++++++++++ test/box/cfunc2.c | 132 ++++++++++++++++++++++++++ test/box/cmod.result | 199 ++++++++++++++++++++++++++++++++++++++++ test/box/cmod.test.lua | 70 ++++++++++++++ test/box/suite.ini | 2 +- 6 files changed, 462 insertions(+), 1 deletion(-) create mode 100644 test/box/cfunc1.c create mode 100644 test/box/cfunc2.c create mode 100644 test/box/cmod.result create mode 100644 test/box/cmod.test.lua diff --git a/test/box/CMakeLists.txt b/test/box/CMakeLists.txt index 06bfbbe9d..2afbeadc3 100644 --- a/test/box/CMakeLists.txt +++ b/test/box/CMakeLists.txt @@ -3,3 +3,5 @@ build_module(function1 function1.c) build_module(reload1 reload1.c) build_module(reload2 reload2.c) build_module(tuple_bench tuple_bench.c) +build_module(cfunc1 cfunc1.c) +build_module(cfunc2 cfunc2.c) diff --git a/test/box/cfunc1.c b/test/box/cfunc1.c new file mode 100644 index 000000000..8f6d3990c --- /dev/null +++ b/test/box/cfunc1.c @@ -0,0 +1,58 @@ +#include <stdio.h> +#include <stdbool.h> +#include <msgpuck.h> + +#include "module.h" + +/* + * Before the reload functions are just declared + * and simply exit with zero. + * + * After the module reload we should provide real + * functionality. + */ + +int +cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} + +int +cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} + +int +cfunc_multireturn(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} + +int +cfunc_args(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} + +int +cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} diff --git a/test/box/cfunc2.c b/test/box/cfunc2.c new file mode 100644 index 000000000..f6d826183 --- /dev/null +++ b/test/box/cfunc2.c @@ -0,0 +1,132 @@ +#include <stdio.h> +#include <stdbool.h> +#include <msgpuck.h> + +#include "module.h" + +/* + * Just make sure we've been called. + */ +int +cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void)ctx; + (void)args; + (void)args_end; + return 0; +} + +/* + * Fetch first N even numbers (just to make sure the order of + * arguments is not screwed). + */ +int +cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + int arg_count = mp_decode_array(&args); + if (arg_count != 1) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "invalid argument count"); + } + int field_count = mp_decode_array(&args); + if (field_count < 1) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "invalid array size"); + } + + for (int i = 0; i < field_count; i++) { + int val = mp_decode_uint(&args); + int needed = 2 * (i+1); + if (val != needed) { + char res[128]; + snprintf(res, sizeof(res), "%s %d != %d", + "invalid argument", val, needed); + return box_error_set(__FILE__, __LINE__, + ER_PROC_C, "%s", res); + } + } + + return 0; +} + +/* + * Return one element array twice. + */ +int +cfunc_multireturn(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + char tuple_buf[512]; + char *d = tuple_buf; + d = mp_encode_array(d, 1); + d = mp_encode_uint(d, 1); + assert(d <= tuple_buf + sizeof(tuple_buf)); + + box_tuple_format_t *fmt = box_tuple_format_default(); + box_tuple_t *tuple_a = box_tuple_new(fmt, tuple_buf, d); + if (tuple_a == NULL) + return -1; + int rc = box_return_tuple(ctx, tuple_a); + if (rc != 0) + return rc; + return box_return_tuple(ctx, tuple_a); +} + +/* + * Encode int + string pair back. + */ +int +cfunc_args(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + uint32_t arg_count = mp_decode_array(&args); + if (arg_count != 2) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "invalid argument count"); + } + + if (mp_typeof(*args) != MP_UINT) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "tuple field must be uint"); + } + uint32_t num = mp_decode_uint(&args); + + if (mp_typeof(*args) != MP_STR) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "tuple field must be string"); + } + const char *str = args; + uint32_t len = mp_decode_strl(&str); + + char tuple_buf[512]; + char *d = tuple_buf; + d = mp_encode_array(d, 2); + d = mp_encode_uint(d, num); + d = mp_encode_str(d, str, len); + assert(d <= tuple_buf + sizeof(tuple_buf)); + + box_tuple_format_t *fmt = box_tuple_format_default(); + box_tuple_t *tuple = box_tuple_new(fmt, tuple_buf, d); + if (tuple == NULL) + return -1; + + return box_return_tuple(ctx, tuple); +} + +/* + * Sum two integers. + */ +int +cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + uint32_t arg_count = mp_decode_array(&args); + if (arg_count != 2) { + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", + "invalid argument count"); + } + uint64_t a = mp_decode_uint(&args); + uint64_t b = mp_decode_uint(&args); + + char res[16]; + char *end = mp_encode_uint(res, a + b); + box_return_mp(ctx, res, end); + return 0; +} diff --git a/test/box/cmod.result b/test/box/cmod.result new file mode 100644 index 000000000..f8558c191 --- /dev/null +++ b/test/box/cmod.result @@ -0,0 +1,199 @@ +-- test-run result file version 2 +build_path = os.getenv("BUILDDIR") + | --- + | ... +package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath + | --- + | ... + +cmod = require('cmod') + | --- + | ... +fio = require('fio') + | --- + | ... + +ext = (jit.os == "OSX" and "dylib" or "so") + | --- + | ... + +cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext + | --- + | ... +cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext + | --- + | ... +cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext + | --- + | ... + +_ = pcall(fio.unlink(cfunc_path)) + | --- + | ... +fio.symlink(cfunc1_path, cfunc_path) + | --- + | - true + | ... + +module, err = cmod.load('non-existing-cfunc') + | --- + | ... +assert(err ~= nil) + | --- + | - true + | ... + +-- +-- They all are sitting in cfunc.so +module, err = cmod.load('cfunc') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... + +cfunc_nop, err = module:load('no-such-func') + | --- + | ... +assert(err ~= nil) + | --- + | - true + | ... +cfunc_nop, err = module:load('cfunc_nop') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... +cfunc_fetch_evens, err = module:load('cfunc_fetch_evens') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... +cfunc_multireturn, err = module:load('cfunc_multireturn') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... +cfunc_args, err = module:load('cfunc_args') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... +cfunc_sum, err = module:load('cfunc_sum') + | --- + | ... +assert(err == nil) + | --- + | - true + | ... + +-- +-- Make sure they all are callable +cfunc_nop() + | --- + | - true + | ... +cfunc_fetch_evens() + | --- + | - true + | ... +cfunc_multireturn() + | --- + | - true + | ... +cfunc_args() + | --- + | - true + | ... + +-- +-- Clean old function references and reload a new one. +_ = pcall(fio.unlink(cfunc_path)) + | --- + | ... +fio.symlink(cfunc2_path, cfunc_path) + | --- + | - true + | ... + +module:reload() + | --- + | - true + | ... + +cfunc_nop() + | --- + | - true + | ... +cfunc_multireturn() + | --- + | - true + | - [1] + | - [1] + | ... +cfunc_fetch_evens({2,4,6}) + | --- + | - true + | ... +cfunc_fetch_evens({1,2,3}) -- error + | --- + | - null + | - invalid argument 1 != 2 + | ... +cfunc_args(1, "hello") + | --- + | - true + | - [1, 'hello'] + | ... +cfunc_sum(1) -- error + | --- + | - null + | - invalid argument count + | ... +cfunc_sum(1,2) + | --- + | - true + | - 3 + | ... + +-- +-- Clean them up +cfunc_nop:unload() + | --- + | - true + | ... +cfunc_multireturn:unload() + | --- + | - true + | ... +cfunc_fetch_evens:unload() + | --- + | - true + | ... +cfunc_args:unload() + | --- + | - true + | ... +cfunc_sum:unload() + | --- + | - true + | ... +module:unload() + | --- + | - true + | ... + +-- +-- Cleanup the generated symlink +_ = pcall(fio.unlink(cfunc_path)) + | --- + | ... diff --git a/test/box/cmod.test.lua b/test/box/cmod.test.lua new file mode 100644 index 000000000..e9837e693 --- /dev/null +++ b/test/box/cmod.test.lua @@ -0,0 +1,70 @@ +build_path = os.getenv("BUILDDIR") +package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath + +cmod = require('cmod') +fio = require('fio') + +ext = (jit.os == "OSX" and "dylib" or "so") + +cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext +cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext +cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext + +_ = pcall(fio.unlink(cfunc_path)) +fio.symlink(cfunc1_path, cfunc_path) + +module, err = cmod.load('non-existing-cfunc') +assert(err ~= nil) + +-- +-- They all are sitting in cfunc.so +module, err = cmod.load('cfunc') +assert(err == nil) + +cfunc_nop, err = module:load('no-such-func') +assert(err ~= nil) +cfunc_nop, err = module:load('cfunc_nop') +assert(err == nil) +cfunc_fetch_evens, err = module:load('cfunc_fetch_evens') +assert(err == nil) +cfunc_multireturn, err = module:load('cfunc_multireturn') +assert(err == nil) +cfunc_args, err = module:load('cfunc_args') +assert(err == nil) +cfunc_sum, err = module:load('cfunc_sum') +assert(err == nil) + +-- +-- Make sure they all are callable +cfunc_nop() +cfunc_fetch_evens() +cfunc_multireturn() +cfunc_args() + +-- +-- Clean old function references and reload a new one. +_ = pcall(fio.unlink(cfunc_path)) +fio.symlink(cfunc2_path, cfunc_path) + +module:reload() + +cfunc_nop() +cfunc_multireturn() +cfunc_fetch_evens({2,4,6}) +cfunc_fetch_evens({1,2,3}) -- error +cfunc_args(1, "hello") +cfunc_sum(1) -- error +cfunc_sum(1,2) + +-- +-- Clean them up +cfunc_nop:unload() +cfunc_multireturn:unload() +cfunc_fetch_evens:unload() +cfunc_args:unload() +cfunc_sum:unload() +module:unload() + +-- +-- Cleanup the generated symlink +_ = pcall(fio.unlink(cfunc_path)) diff --git a/test/box/suite.ini b/test/box/suite.ini index e700d0b9e..fc16c5951 100644 --- a/test/box/suite.ini +++ b/test/box/suite.ini @@ -2,7 +2,7 @@ core = tarantool description = Database tests script = box.lua -disabled = rtree_errinj.test.lua tuple_bench.test.lua +disabled = rtree_errinj.test.lua tuple_bench.test.lua cmod.test.lua long_run = huge_field_map_long.test.lua config = engine.cfg release_disabled = errinj.test.lua errinj_index.test.lua rtree_errinj.test.lua upsert_errinj.test.lua iproto_stress.test.lua gh-4648-func-load-unload.test.lua -- 2.29.2
On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote: > Currently when we reload modules we remove old instance > from the module cache and then try to insert a new one > back. Note that module cache is a string based hash table: > we lookup for a pointer to the module via package name. This approach doesn't work. An update patch attached. I force pushed the update. --- From: Cyrill Gorcunov <gorcunov@gmail.com> Date: Tue, 12 Jan 2021 11:53:50 +0300 Subject: [PATCH v12 4/8] module_cache: panic if module cache update failed Currently when we reload modules we remove old instance from the module cache and then try to insert a new one back. If error happens inbetween (say machine reached lack of memory and hash resize failed) we try to restore functions on old module. This is fine but we can't continue working with modules -- next reload action won't find the module in hash and won't reload. Since this is a really rare case plain panic should be fine here. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index 8ae61a883..ccf079d7a 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -422,6 +422,24 @@ module_sym_call(struct module_sym *mod_sym, struct port *args, return rc; } +/** + * Update the module cache. + */ +static void +module_cache_update(const char *package, const char *package_end, + struct module *module) +{ + /* + * Due to hash internals we need to remove old node + * and insert a new one. If there some memory pressure + * happen between these actions we would loose the hash + * node and this make inconsistent the next modules reload. + */ + module_cache_del(package, package_end); + if (module_cache_add(module) != 0) + panic("module: can't update module cache (%s)", package); +} + int module_reload(const char *package, const char *package_end, struct module **module) @@ -453,10 +471,7 @@ module_reload(const char *package, const char *package_end, rlist_move(&new->funcs_list, &mod_sym->item); } - module_cache_del(package, package_end); - if (module_cache_add(new) != 0) - goto restore; - + module_cache_update(package, package_end, new); module_gc(old); *module = new; return 0; -- 2.29.2
Hi! Thanks for the patchset!
> v12:
> - switch to new API as been discussed in
> https://lists.tarantool.org/tarantool-patches/e186c454-6765-4776-6433-f3f791ff4c27@tarantool.org/
This is what I meant when I asked for a proper API design
which we could strictly follow. The thing by the link is not a
discussion. It is some questions from me + your response talking
about implementation and other assumptions/proposals. I still
don't see an approved API design on which all the interested people
would agree.
I simply don't know how to review the user-visible part of this patch
because I don't know how it should work. There is no design, the API
in the patch just happened to be this.
Thanks for the patch! See 4 comments below. On 18.01.2021 21:35, Cyrill Gorcunov wrote: > The module handling should not be bound to particular > function implementation (we will have two users: already > existing functions for "_func" space, and a new upcoming > one which will be serving cbox submodule in next patch). 1. It is now cmod, not cbox. > For this sake all module related code is moved to > module_cache file where we do symbol resolving, calling > and tracking of symbol usage. > > Part-of #4642 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > diff --git a/src/box/module_cache.c b/src/box/module_cache.c > new file mode 100644 > index 000000000..9fe316807 > --- /dev/null > +++ b/src/box/module_cache.c > @@ -0,0 +1,513 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#include <dlfcn.h> > +#include <fcntl.h> > +#include <stdlib.h> > +#include <string.h> > + > +#include "assoc.h" > +#include "diag.h" > +#include "error.h" > +#include "errinj.h" > +#include "fiber.h" > +#include "port.h" > + > +#include "box/error.h" 2. #include "error.h" and #include "box/error.h" are the same files. Because you are in box/ folder. > +#include "lua/utils.h" > +#include "libeio/eio.h" > + > +#include "module_cache.h" > + > +/** Modules name to descriptor hash. */ > +static struct mh_strnptr_t *mod_hash = NULL; > + > +int > +module_sym_load(struct module_sym *mod_sym) > +{ > + assert(mod_sym->addr == NULL); > + > + struct func_name name; > + func_split_name(mod_sym->name, &name); > + > + struct module *module = module_cache_find(name.package, name.package_end); > + if (module == NULL) { > + module = module_load(name.package, name.package_end); > + if (module == NULL) > + return -1; > + if (module_cache_add(module) != 0) { > + module_delete(module); > + return -1; > + } > + } > + > + mod_sym->addr = module_sym(module, name.sym); > + if (mod_sym->addr == NULL) > + return -1; 3. Is it correct, that you second time has deleted the bugfix which you did about module not being unloaded when first symbol load fails? > + > + mod_sym->module = module; > + rlist_add(&module->funcs_list, &mod_sym->item); > + return 0; > +} > diff --git a/src/box/module_cache.h b/src/box/module_cache.h > new file mode 100644 > index 000000000..fd789f603 > --- /dev/null > +++ b/src/box/module_cache.h > @@ -0,0 +1,139 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#pragma once > + > +#include <small/rlist.h> 4. Please, use "" instead of <> for non-system headers.
Thanks for the patch! Very dubious commit. To me the old and new names look the same. 'func_name' was even better than 'func_name_desc'. At least because it was shorter, and 'desc' does not add any more meaning. On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote: > - rename func_name to func_name_desc because > it is not just a name but rather a name descriptor > which includes symbol address I don't see any address in it. Only name tokens. > - rename func_split_name to parse_func_name because > the action of splitting name to parts is not what > the function about, the function parses specification > to fetch a descriptor > > Part-of #4642 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > ---
Thanks for the patch! On 19.01.2021 13:46, Cyrill Gorcunov wrote: > On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote: >> Currently when we reload modules we remove old instance >> from the module cache and then try to insert a new one >> back. Note that module cache is a string based hash table: >> we lookup for a pointer to the module via package name. > > This approach doesn't work. Why? > --- > From: Cyrill Gorcunov <gorcunov@gmail.com> > Date: Tue, 12 Jan 2021 11:53:50 +0300 > Subject: [PATCH v12 4/8] module_cache: panic if module cache update failed > > Currently when we reload modules we remove old instance > from the module cache and then try to insert a new one > back. If error happens inbetween (say machine reached > lack of memory and hash resize failed) we try to restore > functions on old module. This is fine but we can't continue > working with modules -- next reload action won't find > the module in hash and won't reload. Since this is a really > rare case plain panic should be fine here. > > Part-of #4642 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > ---
Thanks for the patch! Why isn't it in the third commit ("improve naming")? On 18.01.2021 21:35, Cyrill Gorcunov wrote: > We will use this field not only to count a number of > active calls to the module but to prevent the module > from disappear when it get loaded via "cmod" interface > (will be implemented in next patches). > > Same time make it 64 bit long since there might be a > number of such references size_t didn't prevent from > numeric overflow. > > Part-of #4642 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > --- > src/box/module_cache.c | 8 ++++---- > src/box/module_cache.h | 4 ++-- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/src/box/module_cache.c b/src/box/module_cache.c > index 48e55f421..ae874caab 100644 > --- a/src/box/module_cache.c > +++ b/src/box/module_cache.c > @@ -214,7 +214,7 @@ module_load(const char *package, const char *package_end) > memcpy(module->package, package, package_len); > module->package[package_len] = 0; > rlist_create(&module->funcs_list); > - module->calls = 0; > + module->ref = 0; I propose to be consistent with the other code and call it 'refs' instead of 'ref' (see struct error, struct tuple, struct tuple_format). > const char *tmpdir = getenv("TMPDIR"); > if (tmpdir == NULL)
Thanks for the patch!
> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> index ae874caab..cb580f342 100644
> --- a/src/box/module_cache.c
> +++ b/src/box/module_cache.c
> @@ -437,6 +431,34 @@ module_cache_update(const char *name, const char *name_end,
> return 0;
> }
>
> +/**
> + * Load a new module.
> + */
> +struct module *
> +module_load(const char *path, size_t path_len)
> +{
> + struct module *module = module_cache_find(path, &path[path_len]);
> + if (module == NULL) {
> + module = module_dlopen(path, &path[path_len]);
> + if (module == NULL)
> + return NULL;
> + if (module_cache_add(module) != 0) {
> + module_delete(module);
> + return NULL;
> + }
> + }
> + return module;
> +}
> +
> +/**
> + * Unload a module.
> + */
> +void
> +module_unload(struct module *module)
> +{
> + module_gc(module);
From what I see in the main commit, you should increase
the reference counter in load and decrease it in unload.
When you do that - you get another issue: --ref + module_gc() become
a pattern. So it makes sense to move them to a function module_unref().
Also it looks super strange, that the module deletion from the cache
is done in module_sym_unload(). I don't think it should change anything
except its mod_sym argument.
Another issue - module.funcs_list is an implicit reference counter,
which is extra weird. The function list should only be used for
reloads. Each function should increase the counter, so it couldn't happen
that the ref counter is 0, but there are functions keeping a pointer at
the module.
As you can see there is really a pile of issues with the current
reference counting and deletion of modules. Maybe better fix all of
them while you are here. At least the issue in your main patch about
touching ref counter manually.
Thanks for the patch! See 21 comments below. On 18.01.2021 21:35, Cyrill Gorcunov wrote: > Currently to run "C" function from some external module > one have to register it first in "_func" system space. This > is a problem if node is in read-only mode (replica). > > Still people would like to have a way to run such functions > even in ro mode. For this sake we implement "cmod" lua module. > > Fixes #4642 > > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> > > @TarantoolBot document > Title: cmod module > > Overview > ======== > > `cmod` module provides a way to create, delete and execute > `C` procedures. Unlike `box.schema.func` methods the functions > created with `cmod` help are not persistent and live purely > in memory. Once a node get turned off they are vanished. > An initial purpose for them is to execute them on nodes > which are running in read-only mode. > > Module functions > ================ > > `require('cmod').load(path) -> obj | nil, err` 1. I don't like it, but I must say: according to one another change in Lua guidelines now we must again throw exceptions from all new functions present on the board instead of returning errors. You can try to raise it in the chat if you want. > ---------------------------------------------- > > Loads a module from `path` and return an object instance > associate with the module. The `path` should not end up > with shared object extension (such as `.so`), only a file > name shall be there. > > Possible errors: > > - IllegalParams: module path is either not supplied > or not a string. > - SystemError: unable to open a module due to a system error. > - ClientError: a module does not exist. > - OutOfMemory: unable to allocate a module. > > `module:unload() -> true | nil, err` > ------------------------------------ > > Unloads a module. Once the module is unloaded one can't load > new functions from this module instance. > > Possible errors: > > - IllegalParams: a module is not supplied. > - IllegalParams: a module is already unloaded. > > Example: > > ``` Lua > m, err = require('cmod').load('path') > if not m then > print(err) > end > ok, err = m:unload() > if not ok then > print(err) > end > ``` > > If there are functions from this module referenced somewhere > in other Lua code they still can be called because module > continue sitting in memory until the last reference is closed. > > If the module become a target to the Lua's garbage collector > then unload is called implicitly. > > `module:reload() -> true | nil, err` > ------------------------------------ > > Reloads a module and all functions which were associated with > the module. Each module keeps a list of functions belonging to > the module and reload procedure cause the bound function to update > their addresses such that function execution will be routed via > a new library. > > Modules are loaded with that named local binding which means > that reload of module symbols won't affect the functions which > are started execution already, only new calls will be rerouted. > > Possible errors: > - IllegalParams: a module is not supplied. > - ClientError: a module does not exist. 2. What happens to the existing function objects on reload()? How is reload different from unload() + load()? > On success `true` is returned, otherwise `nil,error` pair. > > Example: > > ``` Lua > m, err = require('cmod').load('path') > if not m then > print(err) > end > ok, err = m:reload() > if not ok then > print(err) > end > ``` > > module:load(name) -> obj | nil, err` > ------------------------------------ > > Loads a new function with name `name` from the `module` object > and return a callable object instance associate with the function. > > Possible errors: > - IllegalParams: function name is either not supplied > or not a string. > - OutOfMemory: unable to allocate a function. 3. What if the function is not found? > Example: > > ``` Lua > m, err = require('cmod').load('path') > if not m then > print(err) > end > f, err = m:load('foo') > if not ok then > print(err) > end > ``` > > `function:unload() -> true | nil, err` > -------------------------------------- > > Unloads a function. > > Possible errors: > - IllegalParams: function name is either not supplied > or not a string. > - IllegalParams: the function does not exist. 4. Is this unloaded automatically, when the function object is GCed? > Example: > > ``` Lua > m, err = require('cmod').load('path') > if not m then > print(err) > end > f, err = m:load('foo') > if not ok then > print(err) > end > ok, err = f:unload() > if not ok then > print(err) > end > ``` > > Executing a loaded function > =========================== > > Once function is loaded it can be executed by ordinary Lua call. > Lets consider the following example. We have a `C` function which > takes two numbers and returns their sum. > > ``` C > int > cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end) > { > uint32_t arg_count = mp_decode_array(&args); > if (arg_count != 2) { > return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", > "invalid argument count"); > } > uint64_t a = mp_decode_uint(&args); > uint64_t b = mp_decode_uint(&args); > > char res[16]; > char *end = mp_encode_uint(res, a + b); > box_return_mp(ctx, res, end); > return 0; > } > ``` > > The name of the function is `cfunc_sum` and the function is built into > `cfunc.so` shared library. > > First we should load it as > > ``` Lua > m, err = require('cmod').load('cfunc') > if not m then > print(err) > end > cfunc_sum, err = m:load('cfunc_sum') > if not cfunc_sum then > print(err) > end > ``` > > Once successfully loaded we can execute it. Note that unlike regular > Lua functions the context of `C` functions is different. They never > thrown an exception but return `true|nil, res` form where first value > set to `nil` in case of error condition and `res` carries an error > description. > > Lets call the `cfunc_sum` with wrong number of arguments > > ``` Lua > local ok, res = cfunc_sum() > if not ok then > print(res) > end > ``` > > We will the `"invalid argument count"` message in output. > The error message has been set by the `box_error_set` in `C` > code above. > > On success the first returned value set to `true` and `res` represent > function execution result. > > ``` Lua > local ok, res = cfunc_sum(1, 2) > assert(ok); > print(res) > ``` > > We will see the number `3` in output. 5. What happens when I do multireturn? > diff --git a/src/box/lua/cmod.c b/src/box/lua/cmod.c > new file mode 100644 > index 000000000..c3a44e3df > --- /dev/null > +++ b/src/box/lua/cmod.c > @@ -0,0 +1,718 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#include <string.h> > +#include <lua.h> > + > +#include "assoc.h" > +#include "diag.h" > + > +#include "box/module_cache.h" > +#include "box/error.h" > +#include "box/port.h" > +#include "tt_static.h" > + > +#include "trivia/util.h" > +#include "lua/utils.h" > + > +/** > + * Function descriptor. > + */ > +struct cmod_func { > + /** > + * Symbol descriptor for the function in > + * an associated module. > + */ > + struct module_sym mod_sym; > + /** > + * Number of loads of the function. > + */ > + int64_t load_count; > + /** > + * The function name (without a package path). > + */ > + const char *func_name; 6. This member is never used. I removed it and nothing changed. ==================== @@ -31,10 +31,6 @@ struct cmod_func { * Number of loads of the function. */ int64_t load_count; - /** - * The function name (without a package path). - */ - const char *func_name; /** * Length of @a name member. */ @@ -231,12 +227,11 @@ cmod_func_free(struct cmod_func *cf) * * @param name package path and a function name, ie "module.foo" * @param len length of @a name. - * @param func_name_len function name length, ie "3" for "module.foo" * * @returns function instance on success, NULL otherwise setting diag area. */ static struct cmod_func * -cmod_func_new(const char *name, size_t len, size_t func_name_len) +cmod_func_new(const char *name, size_t len) { const ssize_t cf_size = sizeof(struct cmod_func); size_t size = cf_size + len + 1; @@ -250,7 +245,6 @@ cmod_func_new(const char *name, size_t len, size_t func_name_len) cf->mod_sym.module = NULL; cf->load_count = 0; cf->mod_sym.name = cf->name; - cf->func_name = &cf->name[len - func_name_len]; cf->len = len; memcpy(cf->name, name, len); @@ -309,7 +303,7 @@ lcmod_func_load(struct lua_State *L) cf = cmod_func_find(name, len); if (cf == NULL) { - cf = cmod_func_new(name, len, strlen(func_name)); + cf = cmod_func_new(name, len); if (cf == NULL) return luaT_push_nil_and_error(L); ==================== > + /** > + * Length of @a name member. > + */ > + size_t len; > + /** > + * Module path with function name separated > + * by a point, like "module.func". > + * > + * See parse_func_name helper, we simply have > + * to keep this strange architecture for backward > + * compatibility sake (ie for functions which > + * are created via box.schema.func). > + */ > + char name[0]; > +}; > + > +/** Get a handle associated with an object. */ > +static void * > +cmod_get_handle(struct lua_State *L, const char *uname) > +{ > + void **pptr = luaL_testudata(L, 1, uname); > + return pptr != NULL ? *pptr : NULL; > +} > + > +/** Set a handle to a new value. */ > +static void > +cmod_set_handle(struct lua_State *L, const char *uname, void *ptr) > +{ > + void **pptr = luaL_testudata(L, 1, uname); > + if (pptr != NULL) > + *pptr = ptr; > +} > + > +/** Setup a new handle and associate it with an object. */ > +static void > +cmod_setup_handle(struct lua_State *L, const char *uname, void *ptr) > +{ > + *(void **)lua_newuserdata(L, sizeof(void *)) = ptr; > + luaL_getmetatable(L, uname); > + lua_setmetatable(L, -2); > +} > + > +/** A type to find a module handle from an object. */ > +static const char *cmod_module_handle_uname = "cmod_module_handle"; > + > +static struct module * > +cmod_get_module_handle(struct lua_State *L, bool mandatory) > +{ > + struct module *m = cmod_get_handle(L, cmod_module_handle_uname); > + if (mandatory && m == NULL) { > + const char *fmt = "The module is already unloaded"; > + diag_set(IllegalParams, fmt); > + } > + return m; > +} > + > +static void > +cmod_set_module_handle(struct lua_State *L, struct module *module) > +{ > + cmod_set_handle(L, cmod_module_handle_uname, module); > +} > + > +static void > +cmod_setup_module_handle(struct lua_State *L, struct module *module) > +{ > + cmod_setup_handle(L, cmod_module_handle_uname, module); > +} > + > +/** A type to find a function handle from an object. */ > +static const char *cmod_func_handle_uname = "cmod_func_handle"; 7. Please, move global declarations to the beginning of the file so as it would be easier to find them, and to be consistent with other files. > + > +static struct cmod_func * > +cmod_get_func_handle(struct lua_State *L, bool mandatory) 8. According to the code style, flags must have 'is' prefix or a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'. But a better idea - just drop it. The single place where it is not mandatory is GC handler. Another option - introduce separate get() and check(). Get() only returns it without checking for NULL. Check() will ensure it is not NULL and set the error otherwise. Also the function takes a Lua state, so it seems it must have lcmod prefix, not cmod. According to the naming convention you established in this file. > +{ > + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname); > + if (mandatory && cf == NULL) { > + const char *fmt = "The function is already unloaded"; > + diag_set(IllegalParams, fmt); > + } > + return cf; > +} > + > +static void > +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf) > +{ > + cmod_set_handle(L, cmod_func_handle_uname, cf); > +} > + > +static void > +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf) > +{ > + cmod_setup_handle(L, cmod_func_handle_uname, cf); > +} 9. Amount of boilerplate code and double checks of the same conditions you need just to push some udata is terrifying, tbh. For instance, take lcmod_func_handle_call(). It calls cmod_get_func_handle() and checks result for NULL. This one calls cmod_get_handle() and also checks result for NULL. This one calls luaL_testudata() and also checks result for NULL. You did 3 NULL checks which could be one, if you would have a set of simpler functions working directly with luaL_testudata(). 10. Why do you have 'handle' word in each function name? For example, how 'handle_call' is any different from just 'call' in scope of code in this file? Or how is 'cmod_setup_func' different from 'cmod_setup_func_handle'? It seems that by 'handle' you try not to clash names with the non-Lua part of cmod. But you should do it via prefixes, not via suffixes. lcmod instead of cmod. > + > +/** > + * Function name to cmod_func hash. The name includes > + * module package path without file extension. > + */ > +static struct mh_strnptr_t *func_hash = NULL; > + > +/** > + * Find function in cmod_func hash. > + * > + * @param name function name. > + * @param name_len function name length. > + * > + * @returns function descriptor if found, NULL otherwise. > + */ > +struct cmod_func * > +cmod_func_find(const char *name, size_t name_len) 11. The function must be static. > +{ > + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len); > + if (e == mh_end(func_hash)) > + return NULL; > + return mh_strnptr_node(func_hash, e)->val; > +} > + > +/** > + * Delete a function instance from the hash or decrease > + * a reference if the function is still loaded. > + * > + * @param cf function descriptor. > + * > + * @retval true if the function has no more loaded instances > + * and removed from the hash. > + * > + * @retval false if there are loaded instances left and function > + * is kept in the hash. > + */ > +static bool > +cmod_func_del(struct cmod_func *cf) 12. This whole cmod and lcmod API looks extremely inconsistent and complex. Here are the functions you have (not counting the boilerplate code above): // Find a function in the cache, ok. cmod_func_find(name) // It is called 'delete'. But it does not do the // delete. Moreover, it also decreases 'load_count', // which is super surprising. What does it have to do // with loads? You don't call module_sym_unload() here, // but decrease load_count. cmod_func_del(f) // Ok, this looks like it should add the function to the // hash. But wait, it also increases load_count, and does // not call module_sym_load()! WTF? cmod_func_add(f) // It should free the memory, right? But it calls // module_sym_unload(). Why!? Also, we have a code style // agreement, that function freeing the memory should // have suffix 'delete', not 'free'. As soon as you have // 'new'. cmod_func_free(f) // Fine, it should allocate a new function object, right? // But it also calls module_sym_load()! And does not increase // load_count! I don't understand. It seems like the names // of these functions has very little to do with what they are // really doing. The only way to make them work is to call them // in some special sequence, in which a next function fixes // leftovers of previous functions. cmod_func_new(name) This must be seriously redesigned. Starting from the function names. Here is an option: // Find the function in the hash. If found, increase load_count // and return. Otherwise allocate a new function object, fill its // members, do the load, and make load_count = 1. Add to the hash. cmod_func_load(name) // Decrease load_count. If became 0, remove from the hash, // unload mod_sym, and free the memory. cmod_func_unload(f) I don't really see why would you need the other functions. These 2 are going to be extra simple anyway. Module_cache API for module objects is literally that simple - load, unload, reload. Why do you need so many for cmod, especially with so unclear behaviour? > +{ > + assert(cf->load_count > 0); > + if (cf->load_count-- != 1) > + return false; > + > + mh_int_t e = mh_strnptr_find_inp(func_hash, cf->name, cf->len); > + assert(e != mh_end(func_hash)); > + mh_strnptr_del(func_hash, e, NULL); > + return true; > +} > + > +/** > + * Add a function instance into the hash or increase > + * a reference if the function is already exist. > + * > + * @param cf Function descriptor. > + * > + * Possible errors: > + * > + * - OutOfMemory: unable to allocate a hash entry. > + * > + * @retval 0 on success. > + * @retval -1 on error, diag is set. > + */ > +static int > +cmod_func_add(struct cmod_func *cf) > +{ > + assert(cf->load_count >= 0); > + if (cf->load_count++ != 0) 13. I already proposed an option how to rework cmod API above, but still will leave some comments alongside. Why the heck 'add' function increases load_count? > + return 0; > + > + const struct mh_strnptr_node_t nd = { > + .str = cf->name, > + .len = cf->len, > + .hash = mh_strn_hash(cf->name, cf->len), > + .val = cf, > + }; > + > + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL); > + if (e == mh_end(func_hash)) { > + diag_set(OutOfMemory, sizeof(nd), > + "malloc", "cmod_func node"); > + return -1; > + } > + return 0; 14. Why so complex? Just add it to the hash when it is created. This function is used in a single place, where you already do the check if it exists. > +} > + > +/** > + * Unload a symbol and free a function instance. > + * > + * @param cf function descriptor. > + */ > +static void > +cmod_func_free(struct cmod_func *cf) > +{ > + module_sym_unload(&cf->mod_sym); 15. Worth adding load_count == 0 assertion. Or rather do the unload where it belongs - where load_count becomes 0. > + TRASH(cf); > + free(cf); > +} > + > +/** > + * Allocate a new function instance and resolve a symbol address. > + * > + * @param name package path and a function name, ie "module.foo" > + * @param len length of @a name. > + * @param func_name_len function name length, ie "3" for "module.foo" > + * > + * @returns function instance on success, NULL otherwise setting diag area. > + */ > +static struct cmod_func * > +cmod_func_new(const char *name, size_t len, size_t func_name_len) > +{ > + const ssize_t cf_size = sizeof(struct cmod_func); > + size_t size = cf_size + len + 1; > + struct cmod_func *cf = malloc(size); > + if (cf == NULL) { > + diag_set(OutOfMemory, size, "malloc", "cf"); > + return NULL; > + } > + > + cf->mod_sym.addr = NULL; > + cf->mod_sym.module = NULL; > + cf->load_count = 0; > + cf->mod_sym.name = cf->name; > + cf->func_name = &cf->name[len - func_name_len]; > + cf->len = len; > + > + memcpy(cf->name, name, len); > + cf->name[len] = '\0'; > + > + if (module_sym_load(&cf->mod_sym) != 0) { 16. You load, and don't increase load_count. Does not it look contradictory to you? > + cmod_func_free(cf); > + return NULL; > + } > + > + return cf; > +} > + > +/** > + * Load a new function. > + * > + * This function takes a function name from the caller > + * stack @a L and creates a new function object. If > + * the function is already loaded we simply return > + * a reference to existing one. > + * > + * Possible errors: > + * > + * - IllegalParams: function name is either not supplied > + * or not a string. > + * - IllegalParams: function references limit exceeded. > + * - OutOfMemory: unable to allocate a function. > + * > + * @returns function object on success or {nil,error} on error, > + * the error is set to the diagnostics area. > + */ > +static int > +lcmod_func_load(struct lua_State *L) > +{ > + const char *method = "function = module:load"; > + struct cmod_func *cf = NULL; > + > + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) { > + const char *fmt = > + "Expects %s(\'name\') but no name passed"; > + diag_set(IllegalParams, fmt, method); 17. Why couldn't you just inline 'fmt' string? The current version is not any shorter nor easier to read. The same in many other places. > + return luaT_push_nil_and_error(L); > + } > + > + struct module *m = cmod_get_module_handle(L, false); > + if (m == NULL) { > + const char *fmt = > + "Expects %s(\'name\') but not module object passed"; > + diag_set(IllegalParams, fmt, method); > + return luaT_push_nil_and_error(L); > + } > + > + const char *func_name = lua_tostring(L, 2); > + const char *name = tt_sprintf("%s.%s", m->package, func_name); > + size_t len = strlen(name); > + > + cf = cmod_func_find(name, len); 18. Or you could pass package name and function name to the constructor and copy them right to the allocated memory, without double copying via the static buffer. > + if (cf == NULL) { > + cf = cmod_func_new(name, len, strlen(func_name)); > + if (cf == NULL) > + return luaT_push_nil_and_error(L); > + } > + > + if (cmod_func_add(cf) != 0) { 19. From this code it looks like you can fail addition of an already existing function and you just delete it. I know it can't happen, but the code says the opposite, and it is confusing. Another reason why 'add' is a bad name - it seems like it should fail when added second time due to being already added. > + cmod_func_free(cf); > + return luaT_push_nil_and_error(L); > + } > + > + cmod_setup_func_handle(L, cf); > + return 1; > +} > + > +/** > + * Unload a function. > + * > + * This function takes a function object from > + * the caller stack @a L and unloads it. > + * > + * Possible errors: > + * > + * - IllegalParams: function is not supplied. > + * - IllegalParams: the function does not exist. > + * > + * @returns true on success or {nil,error} on error, > + * the error is set to the diagnostics area. > + */ > +static int > +lcmod_func_unload(struct lua_State *L) > +{ > + if (lua_gettop(L) != 1) { > + const char *fmt = "Expects function:unload()"; > + diag_set(IllegalParams, fmt); > + return luaT_push_nil_and_error(L); > + } > + > + struct cmod_func *cf = cmod_get_func_handle(L, true); > + if (cf == NULL) > + return luaT_push_nil_and_error(L); > + > + cmod_set_func_handle(L, NULL); > + if (cmod_func_del(cf)) > + cmod_func_free(cf); > + > + lua_pushboolean(L, true); > + return 1; > +} > + > +/** > + * Load a new module. > + * > + * This function takes a module patch from the caller > + * stack @a L and creates a new module object. > + * > + * Possible errors: > + * > + * - IllegalParams: module path is either not supplied > + * or not a string. > + * - SystemError: unable to open a module due to a system error. > + * - ClientError: a module does not exist. > + * - OutOfMemory: unable to allocate a module. > + * > + * @returns module object on success or {nil,error} on error, > + * the error is set to the diagnostics area. > + */ > +static int > +lcmod_module_load(struct lua_State *L) > +{ > + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) { > + const char *fmt = > + "Expects cmod.load(\'name\') but no name passed"; > + diag_set(IllegalParams, fmt); > + return luaT_push_nil_and_error(L); > + } > + > + size_t name_len; > + const char *name = lua_tolstring(L, 1, &name_len); > + > + struct module *module = module_load(name, name_len); > + if (module == NULL) > + return luaT_push_nil_and_error(L); > + > + cmod_setup_module_handle(L, module); > + /* > + * Make sure the module won't disappear until > + * it is GC'ed or unloaded explicitly. > + */ > + module->ref++; 20. The fact that you need to touch reference counter of an object from a different subsystem should make you start to question sanity of the API. Why module_load() does not do the increase, and module_unload() - decrease? > + return 1; > +}> + > +/** > + * Module handle representation for REPL (console). > + */ > +static int > +lcmod_module_handle_serialize(struct lua_State *L) > +{ > + struct module *m = cmod_get_module_handle(L, true); > + if (m == NULL) { > + lua_pushnil(L); > + return 1; > + } > + > + lua_createtable(L, 0, 0); 21. Since you decided to use createtable instead of newtable, I suggest you to fill the proper parameters, which should be (0, 1) in this case I think. > + lua_pushstring(L, m->package); > + lua_setfield(L, -2, "path"); > + > + return 1; > +}
Thanks for the patch! See 5 comments below. > diff --git a/test/box/cfunc2.c b/test/box/cfunc2.c > new file mode 100644 > index 000000000..f6d826183 > --- /dev/null > +++ b/test/box/cfunc2.c > @@ -0,0 +1,132 @@ > +#include <stdio.h> > +#include <stdbool.h> > +#include <msgpuck.h> > + > +#include "module.h" > + > +/* > + * Just make sure we've been called. > + */ > +int > +cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end) > +{ > + (void)ctx; > + (void)args; > + (void)args_end; > + return 0; > +} > + > +/* > + * Fetch first N even numbers (just to make sure the order of > + * arguments is not screwed).> + */ > +int > +cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end) > +{ > + int arg_count = mp_decode_array(&args); > + if (arg_count != 1) { > + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", > + "invalid argument count"); 1. Broken alignment. Here and in other error places. > + } > + int field_count = mp_decode_array(&args); > + if (field_count < 1) { > + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s", > + "invalid array size"); > + } > + > + for (int i = 0; i < field_count; i++) { > + int val = mp_decode_uint(&args); > + int needed = 2 * (i+1); 2. Please, use whitespaces before and after binary operators. I really suggest you to revisit the code style guide. Why do you need such a complex function? Why couldn't you simply make a function like 'echo', which validates msgpack and returns it as is? You would be able to check that the args are not corrupted, and are exactly what you have passed. > + if (val != needed) { > + char res[128]; > + snprintf(res, sizeof(res), "%s %d != %d", > + "invalid argument", val, needed); > + return box_error_set(__FILE__, __LINE__, > + ER_PROC_C, "%s", res); > + } > + } > + > + return 0; > +} > diff --git a/test/box/cmod.result b/test/box/cmod.result > new file mode 100644 > index 000000000..f8558c191 > --- /dev/null > +++ b/test/box/cmod.result > @@ -0,0 +1,199 @@ > +-- test-run result file version 2 > +build_path = os.getenv("BUILDDIR") > + | --- > + | ... > +package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath > + | --- > + | ... > + > +cmod = require('cmod') 3. In the beginning of a test we usually write a comment with the issue number and its description. > + | --- > + | ... > +fio = require('fio') > + | --- > + | ... > + > +ext = (jit.os == "OSX" and "dylib" or "so") > + | --- > + | ... > + > +cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext > + | --- > + | ... > +cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext > + | --- > + | ... > +cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext > + | --- > + | ... > + > +_ = pcall(fio.unlink(cfunc_path)) > + | --- > + | ... > +fio.symlink(cfunc1_path, cfunc_path) > + | --- > + | - true > + | ... > + > +module, err = cmod.load('non-existing-cfunc') > + | --- > + | ... > +assert(err ~= nil) > + | --- > + | - true > + | ... > + > +-- > +-- They all are sitting in cfunc.so > +module, err = cmod.load('cfunc') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > + > +cfunc_nop, err = module:load('no-such-func') > + | --- > + | ... > +assert(err ~= nil) > + | --- > + | - true > + | ... > +cfunc_nop, err = module:load('cfunc_nop') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > +cfunc_fetch_evens, err = module:load('cfunc_fetch_evens') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > +cfunc_multireturn, err = module:load('cfunc_multireturn') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > +cfunc_args, err = module:load('cfunc_args') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > +cfunc_sum, err = module:load('cfunc_sum') > + | --- > + | ... > +assert(err == nil) > + | --- > + | - true > + | ... > + > +-- > +-- Make sure they all are callable > +cfunc_nop() > + | --- > + | - true > + | ... > +cfunc_fetch_evens() > + | --- > + | - true > + | ... > +cfunc_multireturn() > + | --- > + | - true > + | ... > +cfunc_args() > + | --- > + | - true > + | ... > + > +-- > +-- Clean old function references and reload a new one. > +_ = pcall(fio.unlink(cfunc_path)) > + | --- > + | ... > +fio.symlink(cfunc2_path, cfunc_path) > + | --- > + | - true > + | ... > + > +module:reload() > + | --- > + | - true > + | ... > + > +cfunc_nop() > + | --- > + | - true > + | ... > +cfunc_multireturn() 4. I am not sure it is correct. The function wasn't reloaded. I thought we decided that to get a new function after reload you need to call load() on the module again. Reason for not changing the existing function objects was that the reload is supposed to happen when the whole application reloads. It means there can be some fibers running the old code for a while after reload until they notice that they must be restarted. If you do reload like you did in this patch, the old fibers will get new functions immediately right under their feet, and this may result into unexpected behaviour for code in these fibers. Another way to get the desired behaviour - use unload() + load() - I assume in this case the old function objects will remain unchanged, right? I suggest to ask Mons what to do with this. And this is why I asked for a proper RFC (or GitHub discussion) before starting to rework the patch. > + | --- > + | - true > + | - [1] > + | - [1] > + | ... > +cfunc_fetch_evens({2,4,6}) > + | --- > + | - true > + | ... > +cfunc_fetch_evens({1,2,3}) -- error > + | --- > + | - null > + | - invalid argument 1 != 2 > + | ... > +cfunc_args(1, "hello") > + | --- > + | - true > + | - [1, 'hello'] > + | ... > +cfunc_sum(1) -- error > + | --- > + | - null > + | - invalid argument count > + | ... > +cfunc_sum(1,2) > + | --- > + | - true > + | - 3 > + | ... > + > +-- > +-- Clean them up > +cfunc_nop:unload() > + | --- > + | - true > + | ... > +cfunc_multireturn:unload() > + | --- > + | - true > + | ... > +cfunc_fetch_evens:unload() > + | --- > + | - true > + | ... > +cfunc_args:unload() > + | --- > + | - true > + | ... > +cfunc_sum:unload() > + | --- > + | - true > + | ... > +module:unload() > + | --- > + | - true > + | ... > + > +-- > +-- Cleanup the generated symlink > +_ = pcall(fio.unlink(cfunc_path)) 5. You didn't test GC, multiple unload (should raise an error), reload when the shared file was deleted, and that the functions are kept loaded even if the module was unloaded. Maybe more, but this is just what I thought of right away.
On Sun, Jan 24, 2021 at 05:27:18PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> On 19.01.2021 13:46, Cyrill Gorcunov wrote:
> > On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote:
> >> Currently when we reload modules we remove old instance
> >> from the module cache and then try to insert a new one
> >> back. Note that module cache is a string based hash table:
> >> we lookup for a pointer to the module via package name.
> >
> > This approach doesn't work.
>
> Why?
False alarm, it does work. I fear due to shadow copy it won't
be consistent but looking more precisely to mhash code I think
it should be safe to update the pointer inplace without modifying
the hash table. I still think our hashing code is just unreadable
pile but whatever...
On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> Very dubious commit. To me the old and new names look the same.
> 'func_name' was even better than 'func_name_desc'. At least
> because it was shorter, and 'desc' does not add any more meaning.
>
> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote:
> > - rename func_name to func_name_desc because
> > it is not just a name but rather a name descriptor
> > which includes symbol address
>
> I don't see any address in it. Only name tokens.
struct func_name_desc {
/**
* Null-terminated symbol name, e.g.
* "func" for "mod.submod.func".
*/
---> const char *sym;
/**
* Package name, e.g. "mod.submod" for
* "mod.submod.func".
*/
const char *package;
/**
* A pointer to the last character in ->package + 1.
*/
const char *package_end;
};
I marked the address. The structure is not a function name
but consists of two parts - package path and function name
itself and maybe something new will be added in future. So
func_name is suitable for plain names but not for strings
where some part of it has a special meaning with hidden
extension inside, hereby a descriptor. And don't forget
about grepability, without this name change grep returns
a number of irrelevant results.
On Sun, Jan 24, 2021 at 05:26:48PM +0100, Vladislav Shpilevoy wrote: > Thanks for the patch! > > See 4 comments below. > > On 18.01.2021 21:35, Cyrill Gorcunov wrote: > > The module handling should not be bound to particular > > function implementation (we will have two users: already > > existing functions for "_func" space, and a new upcoming > > one which will be serving cbox submodule in next patch). > > 1. It is now cmod, not cbox. Thanks! > > + > > +#include "box/error.h" > > 2. #include "error.h" and #include "box/error.h" are the same > files. Because you are in box/ folder. OK > > + > > + mod_sym->addr = module_sym(module, name.sym); > > + if (mod_sym->addr == NULL) > > + return -1; > > 3. Is it correct, that you second time has deleted the bugfix > which you did about module not being unloaded when first > symbol load fails? Sigh... This sneaky issue triggers for n'th time :( Thanks for catching, Vlad! > > +#pragma once > > + > > +#include <small/rlist.h> > > 4. Please, use "" instead of <> for non-system headers. OK, an update on top of the patch. I'll squash it later --- src/box/module_cache.c | 30 ++++++++++++++++++++++++++---- src/box/module_cache.h | 2 +- 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index 9fe316807..a66b84efb 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -16,7 +16,7 @@ #include "fiber.h" #include "port.h" -#include "box/error.h" +#include "error.h" #include "lua/utils.h" #include "libeio/eio.h" @@ -338,8 +338,14 @@ module_sym_load(struct module_sym *mod_sym) struct func_name name; func_split_name(mod_sym->name, &name); - struct module *module = module_cache_find(name.package, name.package_end); - if (module == NULL) { + /* + * In case if module has been loaded already by + * some previous call we can eliminate redundant + * loading and take it from the cache. + */ + struct module *cached, *module; + cached = module_cache_find(name.package, name.package_end); + if (cached == NULL) { module = module_load(name.package, name.package_end); if (module == NULL) return -1; @@ -347,11 +353,27 @@ module_sym_load(struct module_sym *mod_sym) module_delete(module); return -1; } + } else { + module = cached; } mod_sym->addr = module_sym(module, name.sym); - if (mod_sym->addr == NULL) + if (mod_sym->addr == NULL) { + if (cached == NULL) { + /* + * In case if it was a first load we should + * clean the cache immediately otherwise + * the module continue being referenced even + * if there will be no use of it. + * + * Note the module_sym set an error thus be + * careful to not wipe it. + */ + module_cache_del(name.package, name.package_end); + module_delete(module); + } return -1; + } mod_sym->module = module; rlist_add(&module->funcs_list, &mod_sym->item); diff --git a/src/box/module_cache.h b/src/box/module_cache.h index fd789f603..0f5d2b64a 100644 --- a/src/box/module_cache.h +++ b/src/box/module_cache.h @@ -6,7 +6,7 @@ #pragma once -#include <small/rlist.h> +#include "small/rlist.h" #if defined(__cplusplus) extern "C" { -- 2.29.2
On Sun, Jan 24, 2021 at 05:27:35PM +0100, Vladislav Shpilevoy wrote: > Thanks for the patch! > > Why isn't it in the third commit ("improve naming")? Because I wanna separate function descriptor changes from module refs. They are serving different purposes and moreover the refs patch changes the element type. > > - module->calls = 0; > > + module->ref = 0; > > I propose to be consistent with the other code and > call it 'refs' instead of 'ref' (see struct error, struct tuple, > struct tuple_format). Sure
On Sun, Jan 24, 2021 at 05:28:27PM +0100, Vladislav Shpilevoy wrote: > > > > Module functions > > ================ > > > > `require('cmod').load(path) -> obj | nil, err` > > 1. I don't like it, but I must say: according to one another > change in Lua guidelines now we must again throw exceptions > from all new functions present on the board instead of returning > errors. You can try to raise it in the chat if you want. Sigh... Thanks for pointing, will switch to a new approach. > > > > Modules are loaded with that named local binding which means > > that reload of module symbols won't affect the functions which > > are started execution already, only new calls will be rerouted. > > > > Possible errors: > > - IllegalParams: a module is not supplied. > > - ClientError: a module does not exist. > > 2. What happens to the existing function objects on reload()? > Modules are loaded with that named local binding which means > that reload of module symbols won't affect the functions which > are started execution already, only new calls will be rerouted. I suspect the paragraph above was not clear. What if I rephrase it as "Modules are loaded with that named local binding which means that reload of the module updates all functions tied up with, the only place where a kind of race possible is function execution: a function may start execution and call reschedule inside, thus this function will not be unloaded until its execution complete. Next call to the same function will be routed via new symbol". Does it sound better? > How is reload different from unload() + load()? This is rather a shortcut to unload+load where previously allocated functions on Lua level are simply updates their internal state. > > module:load(name) -> obj | nil, err` > > ------------------------------------ > > > > Loads a new function with name `name` from the `module` object > > and return a callable object instance associate with the function. > > > > Possible errors: > > - IllegalParams: function name is either not supplied > > or not a string. > > - OutOfMemory: unable to allocate a function. > > 3. What if the function is not found? ClientError. I'll add, thanks! > > > > `function:unload() -> true | nil, err` > > -------------------------------------- > > > > Unloads a function. > > > > Possible errors: > > - IllegalParams: function name is either not supplied > > or not a string. > > - IllegalParams: the function does not exist. > > 4. Is this unloaded automatically, when the function object is > GCed? True. When __gc called on variable we unload function automatically. Probably I should mention this in documentation. > > On success the first returned value set to `true` and `res` represent > > function execution result. > > > > ``` Lua > > local ok, res = cfunc_sum(1, 2) > > assert(ok); > > print(res) > > ``` > > > > We will see the number `3` in output. > > 5. What happens when I do multireturn? Not sure I follow you here. But if you mean tradtional multireturn then we have it in test case (the next patch) | cfunc_multireturn() | | --- | | - true | | - [1] | | - [1] The function should prepare the result by self. The calling convention is the same as to old known `box.func`. Or you mean something else? > > 6. This member is never used. I removed it and nothing > changed. > > ==================== > @@ -31,10 +31,6 @@ struct cmod_func { > * Number of loads of the function. > */ > int64_t load_count; > - /** > - * The function name (without a package path). > - */ > - const char *func_name; > /** Initially I thought to print short function name in obj:__index, but since the function hash is path+name I switched to the long function name (but continue keeping short "func_name" if we deside to print a short name as well). Probably indeed better to drop it and don't confuse code readers. > > + > > +/** A type to find a function handle from an object. */ > > +static const char *cmod_func_handle_uname = "cmod_func_handle"; > > 7. Please, move global declarations to the beginning of the file > so as it would be easier to find them, and to be consistent with > other files. OK. I put them first at the top but then desided to group by function context (ie close to the functions which use them). > > > + > > +static struct cmod_func * > > +cmod_get_func_handle(struct lua_State *L, bool mandatory) > > 8. According to the code style, flags must have 'is' prefix or > a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'. Could you please point me where this rule sits? I see both usage (with and without prefix) in our code. Surely I update it (or probably drop as you suggest below). > But a better idea - just drop it. The single place where it is not > mandatory is GC handler. > > Another option - introduce separate get() and check(). Get() only > returns it without checking for NULL. Check() will ensure it is not NULL > and set the error otherwise. > > Also the function takes a Lua state, so it seems it must have lcmod > prefix, not cmod. According to the naming convention you established > in this file. Yeah, let me try this way. > > +{ > > + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname); > > + if (mandatory && cf == NULL) { > > + const char *fmt = "The function is already unloaded"; > > + diag_set(IllegalParams, fmt); > > + } > > + return cf; > > +} > > + > > +static void > > +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf) > > +{ > > + cmod_set_handle(L, cmod_func_handle_uname, cf); > > +} > > + > > +static void > > +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf) > > +{ > > + cmod_setup_handle(L, cmod_func_handle_uname, cf); > > +} > > 9. Amount of boilerplate code and double checks of the same > conditions you need just to push some udata is terrifying, tbh. > > For instance, take lcmod_func_handle_call(). It calls > cmod_get_func_handle() and checks result for NULL. This one > calls cmod_get_handle() and also checks result for NULL. > This one calls luaL_testudata() and also checks result for NULL. > You did 3 NULL checks which could be one, if you would have > a set of simpler functions working directly with luaL_testudata(). No, you're wrong. The minimum number of check is 2 (1 - fetch the double pointer to userdata, 2 - dereference first level). Instead I use two helpers where one sets diag in case of error which allows me to not duplicate code. This is a way better. Until you meant something different. > 10. Why do you have 'handle' word in each function name? For > example, how 'handle_call' is any different from just 'call' in > scope of code in this file? Or how is 'cmod_setup_func' different > from 'cmod_setup_func_handle'? It seems that by 'handle' you > try not to clash names with the non-Lua part of cmod. But you > should do it via prefixes, not via suffixes. lcmod instead of cmod. The base idea of handle is that "handle" is sitting in udata of Lua variable and the functions are fetching it from there. User may unload a function then its handle set to NULL but variable is still alive and subsequent call should print an error. If you prefer the prefix then sure, gimme some time to try out. > > +struct cmod_func * > > +cmod_func_find(const char *name, size_t name_len) > > 11. The function must be static. Yup. Thanks! > > +{ > > + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len); > > + if (e == mh_end(func_hash)) > > + return NULL; > > + return mh_strnptr_node(func_hash, e)->val; > > +} > > + > > +/** > > + * Delete a function instance from the hash or decrease > > + * a reference if the function is still loaded. > > + * > > + * @param cf function descriptor. > > + * > > + * @retval true if the function has no more loaded instances > > + * and removed from the hash. > > + * > > + * @retval false if there are loaded instances left and function > > + * is kept in the hash. > > + */ > > +static bool > > +cmod_func_del(struct cmod_func *cf) > > 12. This whole cmod and lcmod API looks extremely inconsistent and complex. > Here are the functions you have (not counting the boilerplate code above): > > // Find a function in the cache, ok. > cmod_func_find(name) > > // It is called 'delete'. But it does not do the > // delete. Moreover, it also decreases 'load_count', > // which is super surprising. What does it have to do > // with loads? You don't call module_sym_unload() here, > // but decrease load_count. > cmod_func_del(f) The same function may be loaded twise into different Lua variables, thus we count number of loads. You proposed it. I think we can drop this countint and simply use module `refs` field instead as you proposed in another email. cmod_func_add/del operates with function hash they don't manipulate symbols. The symbols are handled via module_cache. This is because we have two interfaces for functions: box.func and cmod. In turn cmod_func_new/free calls module_sym_x helpers to run a real work with symbols. > > // Ok, this looks like it should add the function to the > // hash. But wait, it also increases load_count, and does > // not call module_sym_load()! WTF? > cmod_func_add(f) Tracks number of same function loads. > > // It should free the memory, right? But it calls > // module_sym_unload(). Why!? Also, we have a code style > // agreement, that function freeing the memory should > // have suffix 'delete', not 'free'. As soon as you have > // 'new'. > cmod_func_free(f) OK for naming. As to in general -- new function allocation goes into module_cache and resolve the function symbol. The symbol resides in memory until last unload/__gc and then function gets deleted and module_cache unloads the symbol. > > // Fine, it should allocate a new function object, right? > // But it also calls module_sym_load()! And does not increase > // load_count! I don't understand. It seems like the names > // of these functions has very little to do with what they are > // really doing. The only way to make them work is to call them > // in some special sequence, in which a next function fixes "some special sequence" is all our code I fear, the functions flow is context dependent grammar not context free. > // leftovers of previous functions. > cmod_func_new(name) > > This must be seriously redesigned. Starting from the function > names. > > Here is an option: > > // Find the function in the hash. If found, increase load_count > // and return. Otherwise allocate a new function object, fill its > // members, do the load, and make load_count = 1. Add to the hash. > cmod_func_load(name) > > // Decrease load_count. If became 0, remove from the hash, > // unload mod_sym, and free the memory. > cmod_func_unload(f) > > I don't really see why would you need the other functions. These 2 > are going to be extra simple anyway. Module_cache API for module > objects is literally that simple - load, unload, reload. Why do > you need so many for cmod, especially with so unclear behaviour? Because otherwise we will have a long function where we will have error path to handle. Let me check if I manged to make it readable. Note thouh that the number of actions are not goiing to disappear it will simply sit in one big function instead. > > +static int > > +cmod_func_add(struct cmod_func *cf) > > +{ > > + assert(cf->load_count >= 0); > > + if (cf->load_count++ != 0) > > 13. I already proposed an option how to rework cmod API above, but > still will leave some comments alongside. Thanks! > > Why the heck 'add' function increases load_count? Because it doesn't need to add same function into hash. > > + return 0; > > + > > + const struct mh_strnptr_node_t nd = { > > + .str = cf->name, > > + .len = cf->len, > > + .hash = mh_strn_hash(cf->name, cf->len), > > + .val = cf, > > + }; > > + > > + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL); > > + if (e == mh_end(func_hash)) { > > + diag_set(OutOfMemory, sizeof(nd), > > + "malloc", "cmod_func node"); > > + return -1; > > + } > > + return 0; > > 14. Why so complex? Just add it to the hash when it is created. This > function is used in a single place, where you already do the check > if it exists. I would like to separate hash helpers from other code. But sure. > > +} > > + > > +/** > > + * Unload a symbol and free a function instance. > > + * > > + * @param cf function descriptor. > > + */ > > +static void > > +cmod_func_free(struct cmod_func *cf) > > +{ > > + module_sym_unload(&cf->mod_sym); > > 15. Worth adding load_count == 0 assertion. > Or rather do the unload where it belongs - where > load_count becomes 0. OK > > > + TRASH(cf); > > + free(cf); > > +} > > + > > +/** > > + * Allocate a new function instance and resolve a symbol address. > > + * > > + * @param name package path and a function name, ie "module.foo" > > + * @param len length of @a name. > > + * @param func_name_len function name length, ie "3" for "module.foo" > > + * > > + * @returns function instance on success, NULL otherwise setting diag area. > > + */ > > +static struct cmod_func * > > +cmod_func_new(const char *name, size_t len, size_t func_name_len) > > +{ > > + const ssize_t cf_size = sizeof(struct cmod_func); > > + size_t size = cf_size + len + 1; > > + struct cmod_func *cf = malloc(size); > > + if (cf == NULL) { > > + diag_set(OutOfMemory, size, "malloc", "cf"); > > + return NULL; > > + } > > + > > + cf->mod_sym.addr = NULL; > > + cf->mod_sym.module = NULL; > > + cf->load_count = 0; > > + cf->mod_sym.name = cf->name; > > + cf->func_name = &cf->name[len - func_name_len]; > > + cf->len = len; > > + > > + memcpy(cf->name, name, len); > > + cf->name[len] = '\0'; > > + > > + if (module_sym_load(&cf->mod_sym) != 0) { > > 16. You load, and don't increase load_count. Does not it look > contradictory to you? Well, there are two sides: initial values where address comes from module subsystem and the state when function is ready to be used. The function is "ready" when it is not only initilized and has symbol resolved but when it sits inside function cache, and because of this we increas load_count only in cmod_func_add. That is why load_count increased not at moment of creation. > > +static int > > +lcmod_func_load(struct lua_State *L) > > +{ > > + const char *method = "function = module:load"; > > + struct cmod_func *cf = NULL; > > + > > + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) { > > + const char *fmt = > > + "Expects %s(\'name\') but no name passed"; > > + diag_set(IllegalParams, fmt, method); > > 17. Why couldn't you just inline 'fmt' string? The current version > is not any shorter nor easier to read. The same in many other places. To not split the string - it will overflow 80 symbols iirc > > + return luaT_push_nil_and_error(L); > > + } > > + > > + struct module *m = cmod_get_module_handle(L, false); > > + if (m == NULL) { > > + const char *fmt = > > + "Expects %s(\'name\') but not module object passed"; > > + diag_set(IllegalParams, fmt, method); > > + return luaT_push_nil_and_error(L); > > + } > > + > > + const char *func_name = lua_tostring(L, 2); > > + const char *name = tt_sprintf("%s.%s", m->package, func_name); > > + size_t len = strlen(name); > > + > > + cf = cmod_func_find(name, len); > > 18. Or you could pass package name and function name to the constructor > and copy them right to the allocated memory, without double copying > via the static buffer. Will take a look, though the hash requires the long name which means we will need to count number of symbols in merged name. Will try. > > + if (cf == NULL) { > > + cf = cmod_func_new(name, len, strlen(func_name)); > > + if (cf == NULL) > > + return luaT_push_nil_and_error(L); > > + } > > + > > + if (cmod_func_add(cf) != 0) { > > 19. From this code it looks like you can fail addition of an > already existing function and you just delete it. I know it > can't happen, but the code says the opposite, and it is > confusing. > > Another reason why 'add' is a bad name - it seems like it > should fail when added second time due to being already > added. OK > > +static int > > +lcmod_module_load(struct lua_State *L) > > +{ > > + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) { > > + const char *fmt = > > + "Expects cmod.load(\'name\') but no name passed"; > > + diag_set(IllegalParams, fmt); > > + return luaT_push_nil_and_error(L); > > + } > > + > > + size_t name_len; > > + const char *name = lua_tolstring(L, 1, &name_len); > > + > > + struct module *module = module_load(name, name_len); > > + if (module == NULL) > > + return luaT_push_nil_and_error(L); > > + > > + cmod_setup_module_handle(L, module); > > + /* > > + * Make sure the module won't disappear until > > + * it is GC'ed or unloaded explicitly. > > + */ > > + module->ref++; > > 20. The fact that you need to touch reference counter of an object > from a different subsystem should make you start to question > sanity of the API. Why module_load() does not do the increase, > and module_unload() - decrease? Because there were no such API and I though this gonna be over the top taking into account the size of the patch. Moreover there is different senatics of how modules are working in box.fun and our cmod module. I agree that this is ugly, maybe it is a good time to improve modules code. > > +static int > > +lcmod_module_handle_serialize(struct lua_State *L) > > +{ > > + struct module *m = cmod_get_module_handle(L, true); > > + if (m == NULL) { > > + lua_pushnil(L); > > + return 1; > > + } > > + > > + lua_createtable(L, 0, 0); > > 21. Since you decided to use createtable instead of newtable, > I suggest you to fill the proper parameters, which should > be (0, 1) in this case I think. OK. Thanks a huge for review, Vlad!
Hi! On 24.01.2021 23:32, Cyrill Gorcunov wrote: > On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote: >> Thanks for the patch! >> >> Very dubious commit. To me the old and new names look the same. >> 'func_name' was even better than 'func_name_desc'. At least >> because it was shorter, and 'desc' does not add any more meaning. >> >> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote: >>> - rename func_name to func_name_desc because >>> it is not just a name but rather a name descriptor >>> which includes symbol address >> >> I don't see any address in it. Only name tokens. > > struct func_name_desc { > /** > * Null-terminated symbol name, e.g. > * "func" for "mod.submod.func". > */ > ---> const char *sym; > /** > * Package name, e.g. "mod.submod" for > * "mod.submod.func". > */ > const char *package; > /** > * A pointer to the last character in ->package + 1. > */ > const char *package_end; > }; > > I marked the address. You said "symbol address". But it is not a symbol address. It is an address pointing at a part of the name string. Not at the actual symbol, which you need to load later. > The structure is not a function name > but consists of two parts - package path and function name > itself and maybe something new will be added in future. In this case 'name desc' is also invalid - it does not describe the name. Because there is also a path which is not a part of the name. It would be more correct to call it func_path then. > So func_name is suitable for plain names but not for strings > where some part of it has a special meaning with hidden > extension inside, hereby a descriptor. In this case you didn't really change it - it was about function name and stayed about function name. It should be func_path then. If you really want to rename. > And don't forget > about grepability, without this name change grep returns > a number of irrelevant results. This struct is defined and used in a single file. It is not that hard to find it here. Especially if you use some smarter tools. For instance, I use full text search in Sublime, and have never had any issues with finding anything in Tarantool sources. Plus for structs and functions you usually want to use ctags (Sublime also generates these tags). Maybe it is only hard when you use command line and bare grep.
>>> On success the first returned value set to `true` and `res` represent >>> function execution result. >>> >>> ``` Lua >>> local ok, res = cfunc_sum(1, 2) >>> assert(ok); >>> print(res) >>> ``` >>> >>> We will see the number `3` in output. >> >> 5. What happens when I do multireturn? > > Not sure I follow you here. But if you mean tradtional > multireturn then we have it in test case (the next patch) > > | cfunc_multireturn() > | | --- > | | - true > | | - [1] > | | - [1] > > The function should prepare the result by self. The calling > convention is the same as to old known `box.func`. Or > you mean something else? I mean you should document it better. I know how multireturn works in your patch because I reviewed it. Users won't look at the source code or at the tests. >>> + >>> +static struct cmod_func * >>> +cmod_get_func_handle(struct lua_State *L, bool mandatory) >> >> 8. According to the code style, flags must have 'is' prefix or >> a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'. > > Could you please point me where this rule sits? I couldn't find it either. AFAIU, more detailed rules should come with the next update of the C style code on the site. But I remember saying it to you a few times. > I see both usage (with and without prefix) in our code. Without prefix is incorrect. It is either old code, or new badly reviewed code. >>> +{ >>> + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname); >>> + if (mandatory && cf == NULL) { >>> + const char *fmt = "The function is already unloaded"; >>> + diag_set(IllegalParams, fmt); >>> + } >>> + return cf; >>> +} >>> + >>> +static void >>> +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf) >>> +{ >>> + cmod_set_handle(L, cmod_func_handle_uname, cf); >>> +} >>> + >>> +static void >>> +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf) >>> +{ >>> + cmod_setup_handle(L, cmod_func_handle_uname, cf); >>> +} >> >> 9. Amount of boilerplate code and double checks of the same >> conditions you need just to push some udata is terrifying, tbh. >> >> For instance, take lcmod_func_handle_call(). It calls >> cmod_get_func_handle() and checks result for NULL. This one >> calls cmod_get_handle() and also checks result for NULL. >> This one calls luaL_testudata() and also checks result for NULL. >> You did 3 NULL checks which could be one, if you would have >> a set of simpler functions working directly with luaL_testudata(). > > No, you're wrong. The minimum number of check is 2 (1 - fetch the > double pointer to userdata, 2 - dereference first level). Instead > I use two helpers where one sets diag in case of error which allows > me to not duplicate code. This is a way better.Until you meant > something different. I simply mean the code is hard to read. You have a ton of 2-line functions with strange names which I need to lookup each time I am looking at one of the upper level functions. Because from the names I don't always understand what they are doing. >> 10. Why do you have 'handle' word in each function name? For >> example, how 'handle_call' is any different from just 'call' in >> scope of code in this file? Or how is 'cmod_setup_func' different >> from 'cmod_setup_func_handle'? It seems that by 'handle' you >> try not to clash names with the non-Lua part of cmod. But you >> should do it via prefixes, not via suffixes. lcmod instead of cmod. > > The base idea of handle is that "handle" is sitting in udata of Lua > variable and the functions are fetching it from there. In udata you literally store the pointer at the function or module. You don't have 'func_handle' or something. > User may unload > a function then its handle set to NULL but variable is still alive > and subsequent call should print an error. > > If you prefer the prefix then sure, gimme some time to try out. I would prefer to drop most of these functions if possible, and get certainly rid of the 'handle' suffix. I suspect you copied it from popen code which does something similar - deep system code exposed to Lua. But in popen they literally have 'popen_handle' structure. You don't. So you don't need 'handle' in function names. It is long and confusing. >>> +{ >>> + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len); >>> + if (e == mh_end(func_hash)) >>> + return NULL; >>> + return mh_strnptr_node(func_hash, e)->val; >>> +} >>> + >>> +/** >>> + * Delete a function instance from the hash or decrease >>> + * a reference if the function is still loaded. >>> + * >>> + * @param cf function descriptor. >>> + * >>> + * @retval true if the function has no more loaded instances >>> + * and removed from the hash. >>> + * >>> + * @retval false if there are loaded instances left and function >>> + * is kept in the hash. >>> + */ >>> +static bool >>> +cmod_func_del(struct cmod_func *cf) >> >> 12. This whole cmod and lcmod API looks extremely inconsistent and complex. >> Here are the functions you have (not counting the boilerplate code above): >> >> // Find a function in the cache, ok. >> cmod_func_find(name) >> >> // It is called 'delete'. But it does not do the >> // delete. Moreover, it also decreases 'load_count', >> // which is super surprising. What does it have to do >> // with loads? You don't call module_sym_unload() here, >> // but decrease load_count. >> cmod_func_del(f) > > The same function may be loaded twise into different > Lua variables, thus we count number of loads. You proposed > it. I don't talk here about behaviour of the public API. I talk about internal functions having seemingly random names not related to what they are doing. > I think we can drop this countint and simply use module > `refs` field instead as you proposed in another email. I didn't propose to remove the function refs. If you drop them, how will you determine when a cmod_func object can be deleted? > cmod_func_add/del operates with function hash they don't manipulate > symbols. The symbols are handled via module_cache. This is because > we have two interfaces for functions: box.func and cmod. But they touch load_count. If you would remove load_count change from them and rename to something like func_hash_add() func_hash_del() It would look better probably. The functions really would become about hash only. >> // Ok, this looks like it should add the function to the >> // hash. But wait, it also increases load_count, and does >> // not call module_sym_load()! WTF? >> cmod_func_add(f) > > Tracks number of same function loads. Exactly, I know this. But the function name is 'add', and it 'tracks loads' - these are unrelated things! >> // It should free the memory, right? But it calls >> // module_sym_unload(). Why!? Also, we have a code style >> // agreement, that function freeing the memory should >> // have suffix 'delete', not 'free'. As soon as you have >> // 'new'. >> cmod_func_free(f) > > OK for naming. As to in general -- new function allocation > goes into module_cache and resolve the function symbol. The > symbol resides in memory until last unload/__gc and then > function gets deleted and module_cache unloads the symbol. I know the behaviour. My questions are not to the big picture of behaviour. My questions are about this behaviour being split into seemingly random parts with strange names. >> // Fine, it should allocate a new function object, right? >> // But it also calls module_sym_load()! And does not increase >> // load_count! I don't understand. It seems like the names >> // of these functions has very little to do with what they are >> // really doing. The only way to make them work is to call them >> // in some special sequence, in which a next function fixes > > "some special sequence" is all our code I fear, the functions > flow is context dependent grammar not context free. You know what I mean. >> // leftovers of previous functions. >> cmod_func_new(name) >> >> This must be seriously redesigned. Starting from the function >> names. >> >> Here is an option: >> >> // Find the function in the hash. If found, increase load_count >> // and return. Otherwise allocate a new function object, fill its >> // members, do the load, and make load_count = 1. Add to the hash. >> cmod_func_load(name) >> >> // Decrease load_count. If became 0, remove from the hash, >> // unload mod_sym, and free the memory. >> cmod_func_unload(f) >> >> I don't really see why would you need the other functions. These 2 >> are going to be extra simple anyway. Module_cache API for module >> objects is literally that simple - load, unload, reload. Why do >> you need so many for cmod, especially with so unclear behaviour? > > Because otherwise we will have a long function where we will have > error path to handle. Let me check if I manged to make it readable. > Note thouh that the number of actions are not goiing to disappear > it will simply sit in one big function instead. I looked at module_cache when proposed the API. In module_cache the public API is relatively good. It is simple and yet totally enough. Because of this the module functions like lcmod_module_load() look very simple: 1. Get arguments from Lua stack. 2. Forward them to some module_cache function as is. 3. Push result back to Lua stack. At the same time for cmod_func functions like lcmod_func_load() you do too much code which is not about Lua and seems to be internal for cmod_func objects. But it should be simpler. Probably it would help if you would think of it like you need a pure C module. Like there is a cmod.h file, and you need to design API for the header file to load/unload functions. You would just expose two functions: cmod_func_load() cmod_func_unload() And then it would be extremely easy to wrap this into Lua API. As simple as it was with module_cache. Alongside you would probably realize that some of the 'helper' functions become unnecessary and could be inlined into these 2 functions making the code ever smaller. Talking of long functions - they will both fit into one screen I believe. Even if you inline everything. So I don't see sense in splitting functions just for the sake of splitting. Functions like 'add to hash', 'remove from hash', 'allocate new cmod_func and initialize to default values everything'? Sure, fine, these actions look independent. But the "public" part of the API should be just 2 functions, you see? Load and unload. And then you wrap them with a couple of short Lua functions. Plus ton of Lua shit like __index, __gc, and all. The latter, btw, should just call cmod_func_unload(). Then you would see that the hash deletion is needed only in one place, not in two like now. >>> +static int >>> +cmod_func_add(struct cmod_func *cf) >>> +{ >>> + assert(cf->load_count >= 0); >>> + if (cf->load_count++ != 0) >> >> 13. I already proposed an option how to rework cmod API above, but >> still will leave some comments alongside. > > Thanks! Just tell me you hate these comments and me as well, it is fine. Don't keep it inside, it is not healthy. You need a lot of mental health if you work in Tarantool. >> Why the heck 'add' function increases load_count? > > Because it doesn't need to add same function into hash. Em. But ... . How are these actions related? Call it simply 'refs' then? If the counter is not related to loads. Or make the 'load_count' counter incremented by a function at least having 'load' in its name, right? >>> + return 0; >>> + >>> + const struct mh_strnptr_node_t nd = { >>> + .str = cf->name, >>> + .len = cf->len, >>> + .hash = mh_strn_hash(cf->name, cf->len), >>> + .val = cf, >>> + }; >>> + >>> + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL); >>> + if (e == mh_end(func_hash)) { >>> + diag_set(OutOfMemory, sizeof(nd), >>> + "malloc", "cmod_func node"); >>> + return -1; >>> + } >>> + return 0; >> >> 14. Why so complex? Just add it to the hash when it is created. This >> function is used in a single place, where you already do the check >> if it exists. > > I would like to separate hash helpers from other code. But sure. It is fine to separate them. For instance, look at module_cache_find, module_cache_add, module_cache_del - they don't try to do anything except work with the hash table. And this is atomic. Easy to understand. Also they have 'module_cache' prefix, so I understand they work with the cache. Not with the module objects. They accept a module pointer, but only to save it into the hash. They don't change the modules. >>> + TRASH(cf); >>> + free(cf); >>> +} >>> + >>> +/** >>> + * Allocate a new function instance and resolve a symbol address. >>> + * >>> + * @param name package path and a function name, ie "module.foo" >>> + * @param len length of @a name. >>> + * @param func_name_len function name length, ie "3" for "module.foo" >>> + * >>> + * @returns function instance on success, NULL otherwise setting diag area. >>> + */ >>> +static struct cmod_func * >>> +cmod_func_new(const char *name, size_t len, size_t func_name_len) >>> +{ >>> + const ssize_t cf_size = sizeof(struct cmod_func); >>> + size_t size = cf_size + len + 1; >>> + struct cmod_func *cf = malloc(size); >>> + if (cf == NULL) { >>> + diag_set(OutOfMemory, size, "malloc", "cf"); >>> + return NULL; >>> + } >>> + >>> + cf->mod_sym.addr = NULL; >>> + cf->mod_sym.module = NULL; >>> + cf->load_count = 0; >>> + cf->mod_sym.name = cf->name; >>> + cf->func_name = &cf->name[len - func_name_len]; >>> + cf->len = len; >>> + >>> + memcpy(cf->name, name, len); >>> + cf->name[len] = '\0'; >>> + >>> + if (module_sym_load(&cf->mod_sym) != 0) { >> >> 16. You load, and don't increase load_count. Does not it look >> contradictory to you? > > Well, there are two sides: initial values where address comes from > module subsystem and the state when function is ready to be used. > The function is "ready" when it is not only initilized and has > symbol resolved but when it sits inside function cache, and because > of this we increas load_count only in cmod_func_add. That is why > load_count increased not at moment of creation. I see your point. Maybe just load_count name confuses me. Probably would help to rename it to 'refs' really. Then some things come together, at least in my mind. >>> +static int >>> +lcmod_func_load(struct lua_State *L) >>> +{ >>> + const char *method = "function = module:load"; >>> + struct cmod_func *cf = NULL; >>> + >>> + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) { >>> + const char *fmt = >>> + "Expects %s(\'name\') but no name passed"; >>> + diag_set(IllegalParams, fmt, method); >> >> 17. Why couldn't you just inline 'fmt' string? The current version >> is not any shorter nor easier to read. The same in many other places. > > To not split the string - it will overflow 80 symbols iirc And why not split it? We do it in a lot of places. Don't tell about grep. The string has %s and escapes quotes. If you see such a string in a test, you won't know how the source looks anyway. >>> + return luaT_push_nil_and_error(L); >>> + } >>> + >>> + struct module *m = cmod_get_module_handle(L, false); >>> + if (m == NULL) { >>> + const char *fmt = >>> + "Expects %s(\'name\') but not module object passed"; >>> + diag_set(IllegalParams, fmt, method); >>> + return luaT_push_nil_and_error(L); >>> + } >>> + >>> + const char *func_name = lua_tostring(L, 2); >>> + const char *name = tt_sprintf("%s.%s", m->package, func_name); >>> + size_t len = strlen(name); >>> + >>> + cf = cmod_func_find(name, len); >> >> 18. Or you could pass package name and function name to the constructor >> and copy them right to the allocated memory, without double copying >> via the static buffer. > > Will take a look, though the hash requires the long name which means > we will need to count number of symbols in merged name. Will try. Wouldn't strlen(left) + strlen(right) work? >>> +static int >>> +lcmod_module_load(struct lua_State *L) >>> +{ >>> + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) { >>> + const char *fmt = >>> + "Expects cmod.load(\'name\') but no name passed"; >>> + diag_set(IllegalParams, fmt); >>> + return luaT_push_nil_and_error(L); >>> + } >>> + >>> + size_t name_len; >>> + const char *name = lua_tolstring(L, 1, &name_len); >>> + >>> + struct module *module = module_load(name, name_len); >>> + if (module == NULL) >>> + return luaT_push_nil_and_error(L); >>> + >>> + cmod_setup_module_handle(L, module); >>> + /* >>> + * Make sure the module won't disappear until >>> + * it is GC'ed or unloaded explicitly. >>> + */ >>> + module->ref++; >> >> 20. The fact that you need to touch reference counter of an object >> from a different subsystem should make you start to question >> sanity of the API. Why module_load() does not do the increase, >> and module_unload() - decrease? > > Because there were no such API and I though this gonna be over > the top taking into account the size of the patch. Total size of the patchset does not matter. And size of one commit also does not matter if it is atomic. But talking of this issue - you could make a small commit before this one, which makes module_load/unload increment the ref. > Moreover there > is different senatics of how modules are working in box.fun and > our cmod module. How increment of the counter on module_load() is going to change the behaviour? It is not about when to do the load in box.func. Box.func also references module object, also calls load() and unload()/gc(). I don't propose to change *when* they are called. > I agree that this is ugly, maybe it is a good time to improve > modules code.
On Sat, Jan 30, 2021 at 07:53:03PM +0100, Vladislav Shpilevoy wrote: > Hi! > > On 24.01.2021 23:32, Cyrill Gorcunov wrote: > > On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote: > >> Thanks for the patch! > >> > >> Very dubious commit. To me the old and new names look the same. > >> 'func_name' was even better than 'func_name_desc'. At least > >> because it was shorter, and 'desc' does not add any more meaning. > >> > >> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote: > >>> - rename func_name to func_name_desc because > >>> it is not just a name but rather a name descriptor > >>> which includes symbol address > >> > >> I don't see any address in it. Only name tokens. > > > > struct func_name_desc { > > /** > > * Null-terminated symbol name, e.g. > > * "func" for "mod.submod.func". > > */ > > ---> const char *sym; > > /** > > * Package name, e.g. "mod.submod" for > > * "mod.submod.func". > > */ > > const char *package; > > /** > > * A pointer to the last character in ->package + 1. > > */ > > const char *package_end; > > }; > > > > I marked the address. > > You said "symbol address". But it is not a symbol address. It is > an address pointing at a part of the name string. Not at the > actual symbol, which you need to load later. This is an address of part of the string which represents symbol. When symbol is loaded it will have "resolved" address. Plain address and resolved address are pretty different things in loaders terminology. In real loaders initial address is position inside strings section which you're resolving later upon symbols lookup. We have the similar approach, except our initial address is position inside package string... > > > The structure is not a function name > > but consists of two parts - package path and function name > > itself and maybe something new will be added in future. > > In this case 'name desc' is also invalid - it does not describe > the name. Because there is also a path which is not a part of > the name. > > It would be more correct to call it func_path then. It is not a path. It is a *descriptor* which identify function in unique way on a storage device. But sure I will do struct func_path { const char *sym_name; const char *package; const char *package_end; }; > > > So func_name is suitable for plain names but not for strings > > where some part of it has a special meaning with hidden > > extension inside, hereby a descriptor. > > In this case you didn't really change it - it was about function > name and stayed about function name. It should be func_path then. > If you really want to rename. OK, I would prefer to rename. func_name already used in many places with rather plain string context, and forcing to use some fulltext search is not a good idea. To be honest I would keep func_name_desc here because the structure might extend in future but func_path fine as well. > > > And don't forget > > about grepability, without this name change grep returns > > a number of irrelevant results. > > This struct is defined and used in a single file. It is not that > hard to find it here. Especially if you use some smarter tools. > For instance, I use full text search in Sublime, and have never had > any issues with finding anything in Tarantool sources. Plus for > structs and functions you usually want to use ctags (Sublime also > generates these tags). Maybe it is only hard when you use command > line and bare grep.