* [Tarantool-patches] [PATCH v12 1/8] box/func: factor out c function entry structure
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem Cyrill Gorcunov via Tarantool-patches
` (7 subsequent siblings)
8 siblings, 0 replies; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
Currently func_c structure is a wrapper over struct func
which in turn handles function definition, credentials and etc,
mostly to handle internal "_func" space.
Such tight bound doesn't allow to reuse func_c in any
other context. But we're going to support C function
execution in read-only instances and for this we better
reuse already existing structures as much as possible
instead of breeding new ones.
Thus lets extract module symbols into module_sym structure,
this allows us to reuse module path cache in other code.
While extracting the structure rename "func" member to
"addr" since this exactly what the member represents:
an address of symbol in memory.
Same time to be completely independent of "_func" specific lets
carry symbolic name inplace (ie member "name" is kind of redundant
when module_sym is used for "_func" handling where we can retrieve
the name via function definition but such definition won't be
available for non-persistent C functions which we will support
in next patches).
The new structure allows us to load/unload general symbols via
module_sym_load() and module_sym_unload() helpers.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/func.c | 190 ++++++++++++++++++++++++-------------------------
src/box/func.h | 42 +++++++++++
2 files changed, 134 insertions(+), 98 deletions(-)
diff --git a/src/box/func.c b/src/box/func.c
index 9909cee45..8030aa07c 100644
--- a/src/box/func.c
+++ b/src/box/func.c
@@ -59,19 +59,8 @@ struct func_name {
struct func_c {
/** Function object base class. */
struct func base;
- /**
- * Anchor for module membership.
- */
- struct rlist item;
- /**
- * For C functions, the body of the function.
- */
- box_function_f func;
- /**
- * Each stored function keeps a handle to the
- * dynamic library for the C callback.
- */
- struct module *module;
+ /** C function module symbol. */
+ struct module_sym mod_sym;
};
/***
@@ -371,6 +360,73 @@ module_sym(struct module *module, const char *name)
return f;
}
+int
+module_sym_load(struct module_sym *mod_sym)
+{
+ assert(mod_sym->addr == NULL);
+
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+
+ /*
+ * In case if module has been loaded already by
+ * some previous call we can eliminate redundant
+ * loading and take it from the cache.
+ */
+ struct module *cached, *module;
+ cached = module_cache_find(name.package, name.package_end);
+ if (cached == NULL) {
+ module = module_load(name.package, name.package_end);
+ if (module == NULL)
+ return -1;
+ if (module_cache_put(module) != 0) {
+ module_delete(module);
+ return -1;
+ }
+ } else {
+ module = cached;
+ }
+
+ mod_sym->addr = module_sym(module, name.sym);
+ if (mod_sym->addr == NULL) {
+ if (cached == NULL) {
+ /*
+ * In case if it was a first load we should
+ * clean the cache immediately otherwise
+ * the module continue being referenced even
+ * if there will be no use of it.
+ *
+ * Note the module_sym set an error thus be
+ * careful to not wipe it.
+ */
+ module_cache_del(name.package, name.package_end);
+ module_delete(module);
+ }
+ return -1;
+ }
+ mod_sym->module = module;
+ rlist_add(&module->funcs, &mod_sym->item);
+ return 0;
+}
+
+void
+module_sym_unload(struct module_sym *mod_sym)
+{
+ if (mod_sym->module == NULL)
+ return;
+
+ rlist_del(&mod_sym->item);
+ if (rlist_empty(&mod_sym->module->funcs)) {
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+ module_cache_del(name.package, name.package_end);
+ }
+ module_gc(mod_sym->module);
+
+ mod_sym->module = NULL;
+ mod_sym->addr = NULL;
+}
+
int
module_reload(const char *package, const char *package_end, struct module **module)
{
@@ -385,15 +441,15 @@ module_reload(const char *package, const char *package_end, struct module **modu
if (new_module == NULL)
return -1;
- struct func_c *func, *tmp_func;
- rlist_foreach_entry_safe(func, &old_module->funcs, item, tmp_func) {
+ struct module_sym *mod_sym, *tmp;
+ rlist_foreach_entry_safe(mod_sym, &old_module->funcs, item, tmp) {
struct func_name name;
- func_split_name(func->base.def->name, &name);
- func->func = module_sym(new_module, name.sym);
- if (func->func == NULL)
+ func_split_name(mod_sym->name, &name);
+ mod_sym->addr = module_sym(new_module, name.sym);
+ if (mod_sym->addr == NULL)
goto restore;
- func->module = new_module;
- rlist_move(&new_module->funcs, &func->item);
+ mod_sym->module = new_module;
+ rlist_move(&new_module->funcs, &mod_sym->item);
}
module_cache_del(package, package_end);
if (module_cache_put(new_module) != 0)
@@ -408,9 +464,9 @@ module_reload(const char *package, const char *package_end, struct module **modu
*/
do {
struct func_name name;
- func_split_name(func->base.def->name, &name);
- func->func = module_sym(old_module, name.sym);
- if (func->func == NULL) {
+ func_split_name(mod_sym->name, &name);
+ mod_sym->addr = module_sym(old_module, name.sym);
+ if (mod_sym->addr == NULL) {
/*
* Something strange was happen, an early loaden
* function was not found in an old dso.
@@ -418,10 +474,11 @@ module_reload(const char *package, const char *package_end, struct module **modu
panic("Can't restore module function, "
"server state is inconsistent");
}
- func->module = old_module;
- rlist_move(&old_module->funcs, &func->item);
- } while (func != rlist_first_entry(&old_module->funcs,
- struct func_c, item));
+ mod_sym->module = old_module;
+ rlist_move(&old_module->funcs, &mod_sym->item);
+ } while (mod_sym != rlist_first_entry(&old_module->funcs,
+ struct module_sym,
+ item));
assert(rlist_empty(&new_module->funcs));
module_delete(new_module);
return -1;
@@ -484,94 +541,31 @@ func_c_new(MAYBE_UNUSED struct func_def *def)
return NULL;
}
func->base.vtab = &func_c_vtab;
- func->func = NULL;
- func->module = NULL;
+ func->mod_sym.addr = NULL;
+ func->mod_sym.module = NULL;
+ func->mod_sym.name = def->name;
return &func->base;
}
-static void
-func_c_unload(struct func_c *func)
-{
- if (func->module) {
- rlist_del(&func->item);
- if (rlist_empty(&func->module->funcs)) {
- struct func_name name;
- func_split_name(func->base.def->name, &name);
- module_cache_del(name.package, name.package_end);
- }
- module_gc(func->module);
- }
- func->module = NULL;
- func->func = NULL;
-}
-
static void
func_c_destroy(struct func *base)
{
assert(base->vtab == &func_c_vtab);
assert(base != NULL && base->def->language == FUNC_LANGUAGE_C);
struct func_c *func = (struct func_c *) base;
- func_c_unload(func);
+ module_sym_unload(&func->mod_sym);
TRASH(base);
free(func);
}
-/**
- * Resolve func->func (find the respective DLL and fetch the
- * symbol from it).
- */
-static int
-func_c_load(struct func_c *func)
-{
- assert(func->func == NULL);
-
- struct func_name name;
- func_split_name(func->base.def->name, &name);
-
- struct module *cached, *module;
- cached = module_cache_find(name.package, name.package_end);
- if (cached == NULL) {
- module = module_load(name.package, name.package_end);
- if (module == NULL)
- return -1;
- if (module_cache_put(module)) {
- module_delete(module);
- return -1;
- }
- } else {
- module = cached;
- }
-
- func->func = module_sym(module, name.sym);
- if (func->func == NULL) {
- if (cached == NULL) {
- /*
- * In case if it was a first load we should
- * clean the cache immediately otherwise
- * the module continue being referenced even
- * if there will be no use of it.
- *
- * Note the module_sym set an error thus be
- * careful to not wipe it.
- */
- module_cache_del(name.package, name.package_end);
- module_delete(module);
- }
- return -1;
- }
- func->module = module;
- rlist_add(&module->funcs, &func->item);
- return 0;
-}
-
int
func_c_call(struct func *base, struct port *args, struct port *ret)
{
assert(base->vtab == &func_c_vtab);
assert(base != NULL && base->def->language == FUNC_LANGUAGE_C);
struct func_c *func = (struct func_c *) base;
- if (func->func == NULL) {
- if (func_c_load(func) != 0)
+ if (func->mod_sym.addr == NULL) {
+ if (module_sym_load(&func->mod_sym) != 0)
return -1;
}
@@ -586,10 +580,10 @@ func_c_call(struct func *base, struct port *args, struct port *ret)
box_function_ctx_t ctx = { ret };
/* Module can be changed after function reload. */
- struct module *module = func->module;
+ struct module *module = func->mod_sym.module;
assert(module != NULL);
++module->calls;
- int rc = func->func(&ctx, data, data + data_sz);
+ int rc = func->mod_sym.addr(&ctx, data, data + data_sz);
--module->calls;
module_gc(module);
region_truncate(region, region_svp);
diff --git a/src/box/func.h b/src/box/func.h
index 581e468cb..9a7f17446 100644
--- a/src/box/func.h
+++ b/src/box/func.h
@@ -58,6 +58,28 @@ struct module {
char package[0];
};
+/**
+ * Callable symbol bound to a module.
+ */
+struct module_sym {
+ /**
+ * Anchor for module membership.
+ */
+ struct rlist item;
+ /**
+ * For C functions, address of the function.
+ */
+ box_function_f addr;
+ /**
+ * A module the symbol belongs to.
+ */
+ struct module *module;
+ /**
+ * Function name definition.
+ */
+ char *name;
+};
+
/** Virtual method table for func object. */
struct func_vtab {
/** Call function with given arguments. */
@@ -108,6 +130,26 @@ func_delete(struct func *func);
int
func_call(struct func *func, struct port *args, struct port *ret);
+/**
+ * Resolve C entry (find the respective DLL and fetch the
+ * symbol from it).
+ *
+ * @param mod_sym module symbol pointer.
+ * @retval -1 on error.
+ * @retval 0 on success.
+ */
+int
+module_sym_load(struct module_sym *mod_sym);
+
+/**
+ * Unload module symbol and drop it from the package
+ * cache if there is no users left.
+ *
+ * @param mod_sym module symbol pointer.
+ */
+void
+module_sym_unload(struct module_sym *mod_sym);
+
/**
* Reload dynamically loadable module.
*
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 1/8] box/func: factor out c function entry structure Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:26 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming Cyrill Gorcunov via Tarantool-patches
` (6 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
The module handling should not be bound to particular
function implementation (we will have two users: already
existing functions for "_func" space, and a new upcoming
one which will be serving cbox submodule in next patch).
For this sake all module related code is moved to
module_cache file where we do symbol resolving, calling
and tracking of symbol usage.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/CMakeLists.txt | 1 +
src/box/func.c | 473 +------------------------------------
src/box/func.h | 83 +------
src/box/func_def.h | 14 --
src/box/module_cache.c | 513 +++++++++++++++++++++++++++++++++++++++++
src/box/module_cache.h | 139 +++++++++++
6 files changed, 657 insertions(+), 566 deletions(-)
create mode 100644 src/box/module_cache.c
create mode 100644 src/box/module_cache.h
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 19203f770..339e2c8a9 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -187,6 +187,7 @@ add_library(box STATIC
sql_stmt_cache.c
wal.c
call.c
+ module_cache.c
merger.c
ibuf.c
${sql_sources}
diff --git a/src/box/func.c b/src/box/func.c
index 8030aa07c..4e7025fdb 100644
--- a/src/box/func.c
+++ b/src/box/func.c
@@ -31,30 +31,12 @@
#include "func.h"
#include "fiber.h"
#include "trivia/config.h"
-#include "assoc.h"
#include "lua/utils.h"
#include "lua/call.h"
-#include "error.h"
-#include "errinj.h"
#include "diag.h"
#include "port.h"
#include "schema.h"
#include "session.h"
-#include "libeio/eio.h"
-#include <fcntl.h>
-#include <dlfcn.h>
-
-/**
- * Parsed symbol and package names.
- */
-struct func_name {
- /** Null-terminated symbol name, e.g. "func" for "mod.submod.func" */
- const char *sym;
- /** Package name, e.g. "mod.submod" for "mod.submod.func" */
- const char *package;
- /** A pointer to the last character in ->package + 1 */
- const char *package_end;
-};
struct func_c {
/** Function object base class. */
@@ -63,427 +45,6 @@ struct func_c {
struct module_sym mod_sym;
};
-/***
- * Split function name to symbol and package names.
- * For example, str = foo.bar.baz => sym = baz, package = foo.bar
- * @param str function name, e.g. "module.submodule.function".
- * @param[out] name parsed symbol and package names.
- */
-static void
-func_split_name(const char *str, struct func_name *name)
-{
- name->package = str;
- name->package_end = strrchr(str, '.');
- if (name->package_end != NULL) {
- /* module.submodule.function => module.submodule, function */
- name->sym = name->package_end + 1; /* skip '.' */
- } else {
- /* package == function => function, function */
- name->sym = name->package;
- name->package_end = str + strlen(str);
- }
-}
-
-/**
- * Arguments for luaT_module_find used by lua_cpcall()
- */
-struct module_find_ctx {
- const char *package;
- const char *package_end;
- char *path;
- size_t path_len;
-};
-
-/**
- * A cpcall() helper for module_find()
- */
-static int
-luaT_module_find(lua_State *L)
-{
- struct module_find_ctx *ctx = (struct module_find_ctx *)
- lua_topointer(L, 1);
-
- /*
- * Call package.searchpath(name, package.cpath) and use
- * the path to the function in dlopen().
- */
- lua_getglobal(L, "package");
-
- lua_getfield(L, -1, "search");
-
- /* Argument of search: name */
- lua_pushlstring(L, ctx->package, ctx->package_end - ctx->package);
-
- lua_call(L, 1, 1);
- if (lua_isnil(L, -1))
- return luaL_error(L, "module not found");
- /* Convert path to absolute */
- char resolved[PATH_MAX];
- if (realpath(lua_tostring(L, -1), resolved) == NULL) {
- diag_set(SystemError, "realpath");
- return luaT_error(L);
- }
-
- snprintf(ctx->path, ctx->path_len, "%s", resolved);
- return 0;
-}
-
-/**
- * Find path to module using Lua's package.cpath
- * @param package package name
- * @param package_end a pointer to the last byte in @a package + 1
- * @param[out] path path to shared library
- * @param path_len size of @a path buffer
- * @retval 0 on success
- * @retval -1 on error, diag is set
- */
-static int
-module_find(const char *package, const char *package_end, char *path,
- size_t path_len)
-{
- struct module_find_ctx ctx = { package, package_end, path, path_len };
- lua_State *L = tarantool_L;
- int top = lua_gettop(L);
- if (luaT_cpcall(L, luaT_module_find, &ctx) != 0) {
- int package_len = (int) (package_end - package);
- diag_set(ClientError, ER_LOAD_MODULE, package_len, package,
- lua_tostring(L, -1));
- lua_settop(L, top);
- return -1;
- }
- assert(top == lua_gettop(L)); /* cpcall discard results */
- return 0;
-}
-
-static struct mh_strnptr_t *modules = NULL;
-
-static void
-module_gc(struct module *module);
-
-int
-module_init(void)
-{
- modules = mh_strnptr_new();
- if (modules == NULL) {
- diag_set(OutOfMemory, sizeof(*modules), "malloc",
- "modules hash table");
- return -1;
- }
- return 0;
-}
-
-void
-module_free(void)
-{
- while (mh_size(modules) > 0) {
- mh_int_t i = mh_first(modules);
- struct module *module =
- (struct module *) mh_strnptr_node(modules, i)->val;
- /* Can't delete modules if they have active calls */
- module_gc(module);
- }
- mh_strnptr_delete(modules);
-}
-
-/**
- * Look up a module in the modules cache.
- */
-static struct module *
-module_cache_find(const char *name, const char *name_end)
-{
- mh_int_t i = mh_strnptr_find_inp(modules, name, name_end - name);
- if (i == mh_end(modules))
- return NULL;
- return (struct module *)mh_strnptr_node(modules, i)->val;
-}
-
-/**
- * Save module to the module cache.
- */
-static inline int
-module_cache_put(struct module *module)
-{
- size_t package_len = strlen(module->package);
- uint32_t name_hash = mh_strn_hash(module->package, package_len);
- const struct mh_strnptr_node_t strnode = {
- module->package, package_len, name_hash, module};
-
- if (mh_strnptr_put(modules, &strnode, NULL, NULL) == mh_end(modules)) {
- diag_set(OutOfMemory, sizeof(strnode), "malloc", "modules");
- return -1;
- }
- return 0;
-}
-
-/**
- * Delete a module from the module cache
- */
-static void
-module_cache_del(const char *name, const char *name_end)
-{
- mh_int_t i = mh_strnptr_find_inp(modules, name, name_end - name);
- if (i == mh_end(modules))
- return;
- mh_strnptr_del(modules, i, NULL);
-}
-
-/*
- * Load a dso.
- * Create a new symlink based on temporary directory and try to
- * load via this symink to load a dso twice for cases of a function
- * reload.
- */
-static struct module *
-module_load(const char *package, const char *package_end)
-{
- char path[PATH_MAX];
- if (module_find(package, package_end, path, sizeof(path)) != 0)
- return NULL;
-
- int package_len = package_end - package;
- struct module *module = (struct module *)
- malloc(sizeof(*module) + package_len + 1);
- if (module == NULL) {
- diag_set(OutOfMemory, sizeof(struct module) + package_len + 1,
- "malloc", "struct module");
- return NULL;
- }
- memcpy(module->package, package, package_len);
- module->package[package_len] = 0;
- rlist_create(&module->funcs);
- module->calls = 0;
-
- const char *tmpdir = getenv("TMPDIR");
- if (tmpdir == NULL)
- tmpdir = "/tmp";
- char dir_name[PATH_MAX];
- int rc = snprintf(dir_name, sizeof(dir_name), "%s/tntXXXXXX", tmpdir);
- if (rc < 0 || (size_t) rc >= sizeof(dir_name)) {
- diag_set(SystemError, "failed to generate path to tmp dir");
- goto error;
- }
-
- if (mkdtemp(dir_name) == NULL) {
- diag_set(SystemError, "failed to create unique dir name: %s",
- dir_name);
- goto error;
- }
- char load_name[PATH_MAX];
- rc = snprintf(load_name, sizeof(load_name), "%s/%.*s." TARANTOOL_LIBEXT,
- dir_name, package_len, package);
- if (rc < 0 || (size_t) rc >= sizeof(dir_name)) {
- diag_set(SystemError, "failed to generate path to DSO");
- goto error;
- }
-
- struct stat st;
- if (stat(path, &st) < 0) {
- diag_set(SystemError, "failed to stat() module %s", path);
- goto error;
- }
-
- int source_fd = open(path, O_RDONLY);
- if (source_fd < 0) {
- diag_set(SystemError, "failed to open module %s file for" \
- " reading", path);
- goto error;
- }
- int dest_fd = open(load_name, O_WRONLY|O_CREAT|O_TRUNC,
- st.st_mode & 0777);
- if (dest_fd < 0) {
- diag_set(SystemError, "failed to open file %s for writing ",
- load_name);
- close(source_fd);
- goto error;
- }
-
- off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size);
- close(source_fd);
- close(dest_fd);
- if (ret != st.st_size) {
- diag_set(SystemError, "failed to copy DSO %s to %s",
- path, load_name);
- goto error;
- }
-
- module->handle = dlopen(load_name, RTLD_NOW | RTLD_LOCAL);
- if (unlink(load_name) != 0)
- say_warn("failed to unlink dso link %s", load_name);
- if (rmdir(dir_name) != 0)
- say_warn("failed to delete temporary dir %s", dir_name);
- if (module->handle == NULL) {
- diag_set(ClientError, ER_LOAD_MODULE, package_len,
- package, dlerror());
- goto error;
- }
- struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT);
- if (e != NULL)
- ++e->iparam;
- return module;
-error:
- free(module);
- return NULL;
-}
-
-static void
-module_delete(struct module *module)
-{
- struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT);
- if (e != NULL)
- --e->iparam;
- dlclose(module->handle);
- TRASH(module);
- free(module);
-}
-
-/*
- * Check if a dso is unused and can be closed.
- */
-static void
-module_gc(struct module *module)
-{
- if (rlist_empty(&module->funcs) && module->calls == 0)
- module_delete(module);
-}
-
-/*
- * Import a function from the module.
- */
-static box_function_f
-module_sym(struct module *module, const char *name)
-{
- box_function_f f = (box_function_f)dlsym(module->handle, name);
- if (f == NULL) {
- diag_set(ClientError, ER_LOAD_FUNCTION, name, dlerror());
- return NULL;
- }
- return f;
-}
-
-int
-module_sym_load(struct module_sym *mod_sym)
-{
- assert(mod_sym->addr == NULL);
-
- struct func_name name;
- func_split_name(mod_sym->name, &name);
-
- /*
- * In case if module has been loaded already by
- * some previous call we can eliminate redundant
- * loading and take it from the cache.
- */
- struct module *cached, *module;
- cached = module_cache_find(name.package, name.package_end);
- if (cached == NULL) {
- module = module_load(name.package, name.package_end);
- if (module == NULL)
- return -1;
- if (module_cache_put(module) != 0) {
- module_delete(module);
- return -1;
- }
- } else {
- module = cached;
- }
-
- mod_sym->addr = module_sym(module, name.sym);
- if (mod_sym->addr == NULL) {
- if (cached == NULL) {
- /*
- * In case if it was a first load we should
- * clean the cache immediately otherwise
- * the module continue being referenced even
- * if there will be no use of it.
- *
- * Note the module_sym set an error thus be
- * careful to not wipe it.
- */
- module_cache_del(name.package, name.package_end);
- module_delete(module);
- }
- return -1;
- }
- mod_sym->module = module;
- rlist_add(&module->funcs, &mod_sym->item);
- return 0;
-}
-
-void
-module_sym_unload(struct module_sym *mod_sym)
-{
- if (mod_sym->module == NULL)
- return;
-
- rlist_del(&mod_sym->item);
- if (rlist_empty(&mod_sym->module->funcs)) {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
- module_cache_del(name.package, name.package_end);
- }
- module_gc(mod_sym->module);
-
- mod_sym->module = NULL;
- mod_sym->addr = NULL;
-}
-
-int
-module_reload(const char *package, const char *package_end, struct module **module)
-{
- struct module *old_module = module_cache_find(package, package_end);
- if (old_module == NULL) {
- /* Module wasn't loaded - do nothing. */
- *module = NULL;
- return 0;
- }
-
- struct module *new_module = module_load(package, package_end);
- if (new_module == NULL)
- return -1;
-
- struct module_sym *mod_sym, *tmp;
- rlist_foreach_entry_safe(mod_sym, &old_module->funcs, item, tmp) {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
- mod_sym->addr = module_sym(new_module, name.sym);
- if (mod_sym->addr == NULL)
- goto restore;
- mod_sym->module = new_module;
- rlist_move(&new_module->funcs, &mod_sym->item);
- }
- module_cache_del(package, package_end);
- if (module_cache_put(new_module) != 0)
- goto restore;
- module_gc(old_module);
- *module = new_module;
- return 0;
-restore:
- /*
- * Some old-dso func can't be load from new module, restore old
- * functions.
- */
- do {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
- mod_sym->addr = module_sym(old_module, name.sym);
- if (mod_sym->addr == NULL) {
- /*
- * Something strange was happen, an early loaden
- * function was not found in an old dso.
- */
- panic("Can't restore module function, "
- "server state is inconsistent");
- }
- mod_sym->module = old_module;
- rlist_move(&old_module->funcs, &mod_sym->item);
- } while (mod_sym != rlist_first_entry(&old_module->funcs,
- struct module_sym,
- item));
- assert(rlist_empty(&new_module->funcs));
- module_delete(new_module);
- return -1;
-}
-
static struct func *
func_c_new(struct func_def *def);
@@ -563,39 +124,9 @@ func_c_call(struct func *base, struct port *args, struct port *ret)
{
assert(base->vtab == &func_c_vtab);
assert(base != NULL && base->def->language == FUNC_LANGUAGE_C);
- struct func_c *func = (struct func_c *) base;
- if (func->mod_sym.addr == NULL) {
- if (module_sym_load(&func->mod_sym) != 0)
- return -1;
- }
-
- struct region *region = &fiber()->gc;
- size_t region_svp = region_used(region);
- uint32_t data_sz;
- const char *data = port_get_msgpack(args, &data_sz);
- if (data == NULL)
- return -1;
-
- port_c_create(ret);
- box_function_ctx_t ctx = { ret };
- /* Module can be changed after function reload. */
- struct module *module = func->mod_sym.module;
- assert(module != NULL);
- ++module->calls;
- int rc = func->mod_sym.addr(&ctx, data, data + data_sz);
- --module->calls;
- module_gc(module);
- region_truncate(region, region_svp);
- if (rc != 0) {
- if (diag_last_error(&fiber()->diag) == NULL) {
- /* Stored procedure forget to set diag */
- diag_set(ClientError, ER_PROC_C, "unknown error");
- }
- port_destroy(ret);
- return -1;
- }
- return rc;
+ struct func_c *func = (struct func_c *)base;
+ return module_sym_call(&func->mod_sym, args, ret);
}
static struct func_vtab func_c_vtab = {
diff --git a/src/box/func.h b/src/box/func.h
index 9a7f17446..11a466b28 100644
--- a/src/box/func.h
+++ b/src/box/func.h
@@ -34,7 +34,8 @@
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
-#include "small/rlist.h"
+
+#include "module_cache.h"
#include "func_def.h"
#include "user_def.h"
@@ -44,42 +45,6 @@ extern "C" {
struct func;
-/**
- * Dynamic shared module.
- */
-struct module {
- /** Module dlhandle. */
- void *handle;
- /** List of imported functions. */
- struct rlist funcs;
- /** Count of active calls. */
- size_t calls;
- /** Module's package name. */
- char package[0];
-};
-
-/**
- * Callable symbol bound to a module.
- */
-struct module_sym {
- /**
- * Anchor for module membership.
- */
- struct rlist item;
- /**
- * For C functions, address of the function.
- */
- box_function_f addr;
- /**
- * A module the symbol belongs to.
- */
- struct module *module;
- /**
- * Function name definition.
- */
- char *name;
-};
-
/** Virtual method table for func object. */
struct func_vtab {
/** Call function with given arguments. */
@@ -106,18 +71,6 @@ struct func {
struct access access[BOX_USER_MAX];
};
-/**
- * Initialize modules subsystem.
- */
-int
-module_init(void);
-
-/**
- * Cleanup modules subsystem.
- */
-void
-module_free(void);
-
struct func *
func_new(struct func_def *def);
@@ -130,38 +83,6 @@ func_delete(struct func *func);
int
func_call(struct func *func, struct port *args, struct port *ret);
-/**
- * Resolve C entry (find the respective DLL and fetch the
- * symbol from it).
- *
- * @param mod_sym module symbol pointer.
- * @retval -1 on error.
- * @retval 0 on success.
- */
-int
-module_sym_load(struct module_sym *mod_sym);
-
-/**
- * Unload module symbol and drop it from the package
- * cache if there is no users left.
- *
- * @param mod_sym module symbol pointer.
- */
-void
-module_sym_unload(struct module_sym *mod_sym);
-
-/**
- * Reload dynamically loadable module.
- *
- * @param package name begin pointer.
- * @param package_end package_end name end pointer.
- * @param[out] module a pointer to store module object on success.
- * @retval -1 on error.
- * @retval 0 on success.
- */
-int
-module_reload(const char *package, const char *package_end, struct module **module);
-
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/box/func_def.h b/src/box/func_def.h
index d99d89190..75cd6a0d3 100644
--- a/src/box/func_def.h
+++ b/src/box/func_def.h
@@ -168,20 +168,6 @@ func_def_dup(struct func_def *def);
int
func_def_check(struct func_def *def);
-/**
- * API of C stored function.
- */
-
-struct port;
-
-struct box_function_ctx {
- struct port *port;
-};
-
-typedef struct box_function_ctx box_function_ctx_t;
-typedef int (*box_function_f)(box_function_ctx_t *ctx,
- const char *args, const char *args_end);
-
#ifdef __cplusplus
}
#endif
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
new file mode 100644
index 000000000..9fe316807
--- /dev/null
+++ b/src/box/module_cache.c
@@ -0,0 +1,513 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#include <dlfcn.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "assoc.h"
+#include "diag.h"
+#include "error.h"
+#include "errinj.h"
+#include "fiber.h"
+#include "port.h"
+
+#include "box/error.h"
+#include "lua/utils.h"
+#include "libeio/eio.h"
+
+#include "module_cache.h"
+
+/** Modules name to descriptor hash. */
+static struct mh_strnptr_t *mod_hash = NULL;
+
+/**
+ * Parsed symbol and package names.
+ */
+struct func_name {
+ /**
+ * Null-terminated symbol name, e.g.
+ * "func" for "mod.submod.func".
+ */
+ const char *sym;
+ /**
+ * Package name, e.g. "mod.submod" for
+ * "mod.submod.func".
+ */
+ const char *package;
+ /**
+ * A pointer to the last character in ->package + 1.
+ */
+ const char *package_end;
+};
+
+/***
+ * Split function name to symbol and package names.
+ *
+ * For example, str = foo.bar.baz => sym = baz, package = foo.bar
+ *
+ * @param str function name, e.g. "module.submodule.function".
+ * @param[out] name parsed symbol and a package name.
+ */
+static void
+func_split_name(const char *str, struct func_name *name)
+{
+ name->package = str;
+ name->package_end = strrchr(str, '.');
+ if (name->package_end != NULL) {
+ /* module.submodule.function => module.submodule, function */
+ name->sym = name->package_end + 1; /* skip '.' */
+ } else {
+ /* package == function => function, function */
+ name->sym = name->package;
+ name->package_end = str + strlen(str);
+ }
+}
+
+/**
+ * Look up a module in the modules cache.
+ */
+static struct module *
+module_cache_find(const char *name, const char *name_end)
+{
+ mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name);
+ if (e == mh_end(mod_hash))
+ return NULL;
+ return mh_strnptr_node(mod_hash, e)->val;
+}
+
+/**
+ * Save a module to the modules cache.
+ */
+static int
+module_cache_add(struct module *module)
+{
+ size_t package_len = strlen(module->package);
+ const struct mh_strnptr_node_t nd = {
+ .str = module->package,
+ .len = package_len,
+ .hash = mh_strn_hash(module->package, package_len),
+ .val = module,
+ };
+
+ if (mh_strnptr_put(mod_hash, &nd, NULL, NULL) == mh_end(mod_hash)) {
+ diag_set(OutOfMemory, sizeof(nd), "malloc",
+ "module cache node");
+ return -1;
+ }
+ return 0;
+}
+
+/**
+ * Delete a module from the modules cache.
+ */
+static void
+module_cache_del(const char *name, const char *name_end)
+{
+ mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name);
+ if (e != mh_end(mod_hash))
+ mh_strnptr_del(mod_hash, e, NULL);
+}
+
+/**
+ * Arguments for luaT_module_find used by lua_cpcall().
+ */
+struct module_find_ctx {
+ const char *package;
+ const char *package_end;
+ char *path;
+ size_t path_len;
+};
+
+/**
+ * A cpcall() helper for module_find().
+ */
+static int
+luaT_module_find(lua_State *L)
+{
+ struct module_find_ctx *ctx = (void *)lua_topointer(L, 1);
+
+ /*
+ * Call package.searchpath(name, package.cpath) and use
+ * the path to the function in dlopen().
+ */
+ lua_getglobal(L, "package");
+ lua_getfield(L, -1, "search");
+
+ /* Argument of search: name */
+ lua_pushlstring(L, ctx->package, ctx->package_end - ctx->package);
+
+ lua_call(L, 1, 1);
+ if (lua_isnil(L, -1))
+ return luaL_error(L, "module not found");
+
+ /* Convert path to absolute */
+ char resolved[PATH_MAX];
+ if (realpath(lua_tostring(L, -1), resolved) == NULL) {
+ diag_set(SystemError, "realpath");
+ return luaT_error(L);
+ }
+
+ snprintf(ctx->path, ctx->path_len, "%s", resolved);
+ return 0;
+}
+
+/**
+ * Find a path to a module using Lua's package.cpath.
+ *
+ * @param package package name
+ * @param package_end a pointer to the last byte in @a package + 1
+ * @param[out] path path to shared library
+ * @param path_len size of @a path buffer
+ *
+ * @retval 0 on success
+ * @retval -1 on error, diag is set
+ */
+static int
+module_find(const char *package, const char *package_end,
+ char *path, size_t path_len)
+{
+ struct module_find_ctx ctx = {
+ .package = package,
+ .package_end = package_end,
+ .path = path,
+ .path_len = path_len,
+ };
+ lua_State *L = tarantool_L;
+ int top = lua_gettop(L);
+ if (luaT_cpcall(L, luaT_module_find, &ctx) != 0) {
+ diag_set(ClientError, ER_LOAD_MODULE,
+ (int)(package_end - package),
+ package, lua_tostring(L, -1));
+ lua_settop(L, top);
+ return -1;
+ }
+ assert(top == lua_gettop(L)); /* cpcall discard results */
+ return 0;
+}
+
+/**
+ * Load dynamic shared object, ie module library.
+ *
+ * Create a new symlink based on temporary directory
+ * and try to load via this symink to load a dso twice
+ * for cases of a function reload.
+ */
+static struct module *
+module_load(const char *package, const char *package_end)
+{
+ char path[PATH_MAX];
+ if (module_find(package, package_end, path, sizeof(path)) != 0)
+ return NULL;
+
+ int package_len = package_end - package;
+ struct module *module = malloc(sizeof(*module) + package_len + 1);
+ if (module == NULL) {
+ diag_set(OutOfMemory, sizeof(*module) + package_len + 1,
+ "malloc", "struct module");
+ return NULL;
+ }
+ memcpy(module->package, package, package_len);
+ module->package[package_len] = 0;
+ rlist_create(&module->funcs_list);
+ module->calls = 0;
+
+ const char *tmpdir = getenv("TMPDIR");
+ if (tmpdir == NULL)
+ tmpdir = "/tmp";
+
+ char dir_name[PATH_MAX];
+ int rc = snprintf(dir_name, sizeof(dir_name), "%s/tntXXXXXX", tmpdir);
+ if (rc < 0 || (size_t)rc >= sizeof(dir_name)) {
+ diag_set(SystemError, "failed to generate path to tmp dir");
+ goto error;
+ }
+
+ if (mkdtemp(dir_name) == NULL) {
+ diag_set(SystemError, "failed to create unique dir name: %s",
+ dir_name);
+ goto error;
+ }
+
+ char load_name[PATH_MAX];
+ rc = snprintf(load_name, sizeof(load_name), "%s/%.*s." TARANTOOL_LIBEXT,
+ dir_name, package_len, package);
+ if (rc < 0 || (size_t)rc >= sizeof(dir_name)) {
+ diag_set(SystemError, "failed to generate path to DSO");
+ goto error;
+ }
+
+ struct stat st;
+ if (stat(path, &st) < 0) {
+ diag_set(SystemError, "failed to stat() module %s", path);
+ goto error;
+ }
+
+ int source_fd = open(path, O_RDONLY);
+ if (source_fd < 0) {
+ diag_set(SystemError, "failed to open module %s", path);
+ goto error;
+ }
+
+ int dest_fd = open(load_name, O_WRONLY | O_CREAT | O_TRUNC,
+ st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
+ if (dest_fd < 0) {
+ diag_set(SystemError, "failed to open file %s for writing ",
+ load_name);
+ close(source_fd);
+ goto error;
+ }
+
+ off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size);
+ close(source_fd);
+ close(dest_fd);
+ if (ret != st.st_size) {
+ diag_set(SystemError, "failed to copy DSO %s to %s",
+ path, load_name);
+ goto error;
+ }
+
+ module->handle = dlopen(load_name, RTLD_NOW | RTLD_LOCAL);
+ if (unlink(load_name) != 0)
+ say_warn("failed to unlink dso link %s", load_name);
+ if (rmdir(dir_name) != 0)
+ say_warn("failed to delete temporary dir %s", dir_name);
+ if (module->handle == NULL) {
+ diag_set(ClientError, ER_LOAD_MODULE, package_len,
+ package, dlerror());
+ goto error;
+ }
+
+ struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT);
+ if (e != NULL)
+ ++e->iparam;
+ return module;
+
+error:
+ free(module);
+ return NULL;
+}
+
+/**
+ * Delete a module.
+ */
+static void
+module_delete(struct module *module)
+{
+ struct errinj *e = errinj(ERRINJ_DYN_MODULE_COUNT, ERRINJ_INT);
+ if (e != NULL)
+ --e->iparam;
+ dlclose(module->handle);
+ TRASH(module);
+ free(module);
+}
+
+/**
+ * Check if a module is unused and delete it then.
+ */
+static void
+module_gc(struct module *module)
+{
+ if (rlist_empty(&module->funcs_list) && module->calls == 0)
+ module_delete(module);
+}
+
+/**
+ * Import a function from a module.
+ */
+static box_function_f
+module_sym(struct module *module, const char *name)
+{
+ box_function_f f = dlsym(module->handle, name);
+ if (f == NULL) {
+ diag_set(ClientError, ER_LOAD_FUNCTION, name, dlerror());
+ return NULL;
+ }
+ return f;
+}
+
+int
+module_sym_load(struct module_sym *mod_sym)
+{
+ assert(mod_sym->addr == NULL);
+
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+
+ struct module *module = module_cache_find(name.package, name.package_end);
+ if (module == NULL) {
+ module = module_load(name.package, name.package_end);
+ if (module == NULL)
+ return -1;
+ if (module_cache_add(module) != 0) {
+ module_delete(module);
+ return -1;
+ }
+ }
+
+ mod_sym->addr = module_sym(module, name.sym);
+ if (mod_sym->addr == NULL)
+ return -1;
+
+ mod_sym->module = module;
+ rlist_add(&module->funcs_list, &mod_sym->item);
+ return 0;
+}
+
+void
+module_sym_unload(struct module_sym *mod_sym)
+{
+ if (mod_sym->module == NULL)
+ return;
+
+ rlist_del(&mod_sym->item);
+ if (rlist_empty(&mod_sym->module->funcs_list)) {
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+ module_cache_del(name.package, name.package_end);
+ }
+ module_gc(mod_sym->module);
+
+ mod_sym->module = NULL;
+ mod_sym->addr = NULL;
+}
+
+int
+module_sym_call(struct module_sym *mod_sym, struct port *args,
+ struct port *ret)
+{
+ if (mod_sym->addr == NULL) {
+ if (module_sym_load(mod_sym) != 0)
+ return -1;
+ }
+
+ struct region *region = &fiber()->gc;
+ size_t region_svp = region_used(region);
+
+ uint32_t data_sz;
+ const char *data = port_get_msgpack(args, &data_sz);
+ if (data == NULL)
+ return -1;
+
+ port_c_create(ret);
+ box_function_ctx_t ctx = {
+ .port = ret,
+ };
+
+ /*
+ * Module can be changed after function reload. Also
+ * keep in mind that stored C procedure may yield inside.
+ */
+ struct module *module = mod_sym->module;
+ assert(module != NULL);
+ ++module->calls;
+ int rc = mod_sym->addr(&ctx, data, data + data_sz);
+ --module->calls;
+ module_gc(module);
+ region_truncate(region, region_svp);
+
+ if (rc != 0) {
+ if (diag_last_error(&fiber()->diag) == NULL) {
+ /* Stored procedure forget to set diag */
+ diag_set(ClientError, ER_PROC_C, "unknown error");
+ }
+ port_destroy(ret);
+ return -1;
+ }
+
+ return rc;
+}
+
+int
+module_reload(const char *package, const char *package_end,
+ struct module **module)
+{
+ struct module *old = module_cache_find(package, package_end);
+ if (old == NULL) {
+ /* Module wasn't loaded - do nothing. */
+ *module = NULL;
+ return 0;
+ }
+
+ struct module *new = module_load(package, package_end);
+ if (new == NULL)
+ return -1;
+
+ struct module_sym *mod_sym, *tmp;
+ rlist_foreach_entry_safe(mod_sym, &old->funcs_list, item, tmp) {
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+
+ mod_sym->addr = module_sym(new, name.sym);
+ if (mod_sym->addr == NULL) {
+ say_error("module: reload %s, symbol %s not found",
+ package, name.sym);
+ goto restore;
+ }
+
+ mod_sym->module = new;
+ rlist_move(&new->funcs_list, &mod_sym->item);
+ }
+
+ module_cache_del(package, package_end);
+ if (module_cache_add(new) != 0)
+ goto restore;
+
+ module_gc(old);
+ *module = new;
+ return 0;
+
+restore:
+ /*
+ * Some old-dso func can't be load from new module,
+ * restore old functions.
+ */
+ do {
+ struct func_name name;
+ func_split_name(mod_sym->name, &name);
+ mod_sym->addr = module_sym(old, name.sym);
+ if (mod_sym->addr == NULL) {
+ /*
+ * Something strange was happen, an early loaden
+ * function was not found in an old dso.
+ */
+ panic("Can't restore module function, "
+ "server state is inconsistent");
+ }
+ mod_sym->module = old;
+ rlist_move(&old->funcs_list, &mod_sym->item);
+ } while (mod_sym != rlist_first_entry(&old->funcs_list,
+ struct module_sym,
+ item));
+ assert(rlist_empty(&new->funcs_list));
+ module_delete(new);
+ return -1;
+}
+
+int
+module_init(void)
+{
+ mod_hash = mh_strnptr_new();
+ if (mod_hash == NULL) {
+ diag_set(OutOfMemory, sizeof(*mod_hash),
+ "malloc", "modules hash table");
+ return -1;
+ }
+ return 0;
+}
+
+void
+module_free(void)
+{
+ while (mh_size(mod_hash) > 0) {
+ mh_int_t i = mh_first(mod_hash);
+ struct module *m = mh_strnptr_node(mod_hash, i)->val;
+ module_gc(m);
+ }
+ mh_strnptr_delete(mod_hash);
+ mod_hash = NULL;
+}
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
new file mode 100644
index 000000000..fd789f603
--- /dev/null
+++ b/src/box/module_cache.h
@@ -0,0 +1,139 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#pragma once
+
+#include <small/rlist.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+
+/**
+ * API of C stored function.
+ */
+
+struct port;
+
+struct box_function_ctx {
+ struct port *port;
+};
+
+typedef struct box_function_ctx box_function_ctx_t;
+typedef int (*box_function_f)(box_function_ctx_t *ctx,
+ const char *args,
+ const char *args_end);
+
+/**
+ * Dynamic shared module.
+ */
+struct module {
+ /**
+ * Module dlhandle.
+ */
+ void *handle;
+ /**
+ * List of associated symbols (functions).
+ */
+ struct rlist funcs_list;
+ /**
+ * Count of active calls.
+ */
+ size_t calls;
+ /**
+ * Module's package name.
+ */
+ char package[0];
+};
+
+/**
+ * Callable symbol bound to a module.
+ */
+struct module_sym {
+ /**
+ * Anchor for module::funcs_list.
+ */
+ struct rlist item;
+ /**
+ * For C functions, address of the function.
+ */
+ box_function_f addr;
+ /**
+ * A module the symbol belongs to.
+ */
+ struct module *module;
+ /**
+ * Symbol (function) name definition.
+ */
+ char *name;
+};
+
+/**
+ * Load a new module symbol.
+ *
+ * @param mod_sym symbol to load.
+ *
+ * @returns 0 on succse, -1 otherwise, diag is set.
+ */
+int
+module_sym_load(struct module_sym *mod_sym);
+
+/**
+ * Unload a module's symbol.
+ *
+ * @param mod_sym symbol to unload.
+ */
+void
+module_sym_unload(struct module_sym *mod_sym);
+
+/**
+ * Execute a module symbol (run a function).
+ *
+ * The function packs function arguments into a message pack
+ * and send it as a function argument. Function may return
+ * results via @a ret stream.
+ *
+ * @param mod_sym module symbol to run.
+ * @param args function arguments.
+ * @param[out] ret execution results.
+ *
+ * @returns 0 on success, -1 otherwise, diag is set.
+ */
+int
+module_sym_call(struct module_sym *mod_sym, struct port *args,
+ struct port *ret);
+
+/**
+ * Reload a module and all associated symbols.
+ *
+ * @param package shared library path start.
+ * @param package_end shared library path end.
+ * @param[out] module pointer to the reloaded module.
+ *
+ * @return 0 on succes, -1 otherwise, diag is set.
+ */
+int
+module_reload(const char *package, const char *package_end,
+ struct module **module);
+
+/**
+ * Initialize modules subsystem.
+ *
+ * @return 0 on succes, -1 otherwise, diag is set.
+ */
+int
+module_init(void);
+
+/**
+ * Free modules subsystem.
+ */
+void
+module_free(void);
+
+#if defined(__cplusplus)
+}
+#endif /* defined(__plusplus) */
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:26 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-25 8:52 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:26 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
See 4 comments below.
On 18.01.2021 21:35, Cyrill Gorcunov wrote:
> The module handling should not be bound to particular
> function implementation (we will have two users: already
> existing functions for "_func" space, and a new upcoming
> one which will be serving cbox submodule in next patch).
1. It is now cmod, not cbox.
> For this sake all module related code is moved to
> module_cache file where we do symbol resolving, calling
> and tracking of symbol usage.
>
> Part-of #4642
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> new file mode 100644
> index 000000000..9fe316807
> --- /dev/null
> +++ b/src/box/module_cache.c
> @@ -0,0 +1,513 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
> + */
> +
> +#include <dlfcn.h>
> +#include <fcntl.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "assoc.h"
> +#include "diag.h"
> +#include "error.h"
> +#include "errinj.h"
> +#include "fiber.h"
> +#include "port.h"
> +
> +#include "box/error.h"
2. #include "error.h" and #include "box/error.h" are the same
files. Because you are in box/ folder.
> +#include "lua/utils.h"
> +#include "libeio/eio.h"
> +
> +#include "module_cache.h"
> +
> +/** Modules name to descriptor hash. */
> +static struct mh_strnptr_t *mod_hash = NULL;
> +
> +int
> +module_sym_load(struct module_sym *mod_sym)
> +{
> + assert(mod_sym->addr == NULL);
> +
> + struct func_name name;
> + func_split_name(mod_sym->name, &name);
> +
> + struct module *module = module_cache_find(name.package, name.package_end);
> + if (module == NULL) {
> + module = module_load(name.package, name.package_end);
> + if (module == NULL)
> + return -1;
> + if (module_cache_add(module) != 0) {
> + module_delete(module);
> + return -1;
> + }
> + }
> +
> + mod_sym->addr = module_sym(module, name.sym);
> + if (mod_sym->addr == NULL)
> + return -1;
3. Is it correct, that you second time has deleted the bugfix
which you did about module not being unloaded when first
symbol load fails?
> +
> + mod_sym->module = module;
> + rlist_add(&module->funcs_list, &mod_sym->item);
> + return 0;
> +}
> diff --git a/src/box/module_cache.h b/src/box/module_cache.h
> new file mode 100644
> index 000000000..fd789f603
> --- /dev/null
> +++ b/src/box/module_cache.h
> @@ -0,0 +1,139 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
> + */
> +
> +#pragma once
> +
> +#include <small/rlist.h>
4. Please, use "" instead of <> for non-system headers.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem
2021-01-24 16:26 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-01-25 8:52 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-25 8:52 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sun, Jan 24, 2021 at 05:26:48PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> See 4 comments below.
>
> On 18.01.2021 21:35, Cyrill Gorcunov wrote:
> > The module handling should not be bound to particular
> > function implementation (we will have two users: already
> > existing functions for "_func" space, and a new upcoming
> > one which will be serving cbox submodule in next patch).
>
> 1. It is now cmod, not cbox.
Thanks!
> > +
> > +#include "box/error.h"
>
> 2. #include "error.h" and #include "box/error.h" are the same
> files. Because you are in box/ folder.
OK
> > +
> > + mod_sym->addr = module_sym(module, name.sym);
> > + if (mod_sym->addr == NULL)
> > + return -1;
>
> 3. Is it correct, that you second time has deleted the bugfix
> which you did about module not being unloaded when first
> symbol load fails?
Sigh... This sneaky issue triggers for n'th time :( Thanks
for catching, Vlad!
> > +#pragma once
> > +
> > +#include <small/rlist.h>
>
> 4. Please, use "" instead of <> for non-system headers.
OK, an update on top of the patch. I'll squash it later
---
src/box/module_cache.c | 30 ++++++++++++++++++++++++++----
src/box/module_cache.h | 2 +-
2 files changed, 27 insertions(+), 5 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 9fe316807..a66b84efb 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -16,7 +16,7 @@
#include "fiber.h"
#include "port.h"
-#include "box/error.h"
+#include "error.h"
#include "lua/utils.h"
#include "libeio/eio.h"
@@ -338,8 +338,14 @@ module_sym_load(struct module_sym *mod_sym)
struct func_name name;
func_split_name(mod_sym->name, &name);
- struct module *module = module_cache_find(name.package, name.package_end);
- if (module == NULL) {
+ /*
+ * In case if module has been loaded already by
+ * some previous call we can eliminate redundant
+ * loading and take it from the cache.
+ */
+ struct module *cached, *module;
+ cached = module_cache_find(name.package, name.package_end);
+ if (cached == NULL) {
module = module_load(name.package, name.package_end);
if (module == NULL)
return -1;
@@ -347,11 +353,27 @@ module_sym_load(struct module_sym *mod_sym)
module_delete(module);
return -1;
}
+ } else {
+ module = cached;
}
mod_sym->addr = module_sym(module, name.sym);
- if (mod_sym->addr == NULL)
+ if (mod_sym->addr == NULL) {
+ if (cached == NULL) {
+ /*
+ * In case if it was a first load we should
+ * clean the cache immediately otherwise
+ * the module continue being referenced even
+ * if there will be no use of it.
+ *
+ * Note the module_sym set an error thus be
+ * careful to not wipe it.
+ */
+ module_cache_del(name.package, name.package_end);
+ module_delete(module);
+ }
return -1;
+ }
mod_sym->module = module;
rlist_add(&module->funcs_list, &mod_sym->item);
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
index fd789f603..0f5d2b64a 100644
--- a/src/box/module_cache.h
+++ b/src/box/module_cache.h
@@ -6,7 +6,7 @@
#pragma once
-#include <small/rlist.h>
+#include "small/rlist.h"
#if defined(__cplusplus)
extern "C" {
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 1/8] box/func: factor out c function entry structure Cyrill Gorcunov via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 2/8] module_cache: move module handling into own subsystem Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload Cyrill Gorcunov via Tarantool-patches
` (5 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
- rename func_name to func_name_desc because
it is not just a name but rather a name descriptor
which includes symbol address
- rename func_split_name to parse_func_name because
the action of splitting name to parts is not what
the function about, the function parses specification
to fetch a descriptor
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/module_cache.c | 52 +++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 26 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 9fe316807..8ae61a883 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -26,9 +26,9 @@
static struct mh_strnptr_t *mod_hash = NULL;
/**
- * Parsed symbol and package names.
+ * Function name descriptor: a symbol and a package.
*/
-struct func_name {
+struct func_name_desc {
/**
* Null-terminated symbol name, e.g.
* "func" for "mod.submod.func".
@@ -46,25 +46,25 @@ struct func_name {
};
/***
- * Split function name to symbol and package names.
+ * Parse function name into a name descriptor.
*
* For example, str = foo.bar.baz => sym = baz, package = foo.bar
*
* @param str function name, e.g. "module.submodule.function".
- * @param[out] name parsed symbol and a package name.
+ * @param[out] d parsed symbol and a package name.
*/
static void
-func_split_name(const char *str, struct func_name *name)
+parse_func_name(const char *str, struct func_name_desc *d)
{
- name->package = str;
- name->package_end = strrchr(str, '.');
- if (name->package_end != NULL) {
+ d->package = str;
+ d->package_end = strrchr(str, '.');
+ if (d->package_end != NULL) {
/* module.submodule.function => module.submodule, function */
- name->sym = name->package_end + 1; /* skip '.' */
+ d->sym = d->package_end + 1; /* skip '.' */
} else {
/* package == function => function, function */
- name->sym = name->package;
- name->package_end = str + strlen(str);
+ d->sym = d->package;
+ d->package_end = str + strlen(str);
}
}
@@ -335,12 +335,12 @@ module_sym_load(struct module_sym *mod_sym)
{
assert(mod_sym->addr == NULL);
- struct func_name name;
- func_split_name(mod_sym->name, &name);
+ struct func_name_desc d;
+ parse_func_name(mod_sym->name, &d);
- struct module *module = module_cache_find(name.package, name.package_end);
+ struct module *module = module_cache_find(d.package, d.package_end);
if (module == NULL) {
- module = module_load(name.package, name.package_end);
+ module = module_load(d.package, d.package_end);
if (module == NULL)
return -1;
if (module_cache_add(module) != 0) {
@@ -349,7 +349,7 @@ module_sym_load(struct module_sym *mod_sym)
}
}
- mod_sym->addr = module_sym(module, name.sym);
+ mod_sym->addr = module_sym(module, d.sym);
if (mod_sym->addr == NULL)
return -1;
@@ -366,9 +366,9 @@ module_sym_unload(struct module_sym *mod_sym)
rlist_del(&mod_sym->item);
if (rlist_empty(&mod_sym->module->funcs_list)) {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
- module_cache_del(name.package, name.package_end);
+ struct func_name_desc d;
+ parse_func_name(mod_sym->name, &d);
+ module_cache_del(d.package, d.package_end);
}
module_gc(mod_sym->module);
@@ -439,13 +439,13 @@ module_reload(const char *package, const char *package_end,
struct module_sym *mod_sym, *tmp;
rlist_foreach_entry_safe(mod_sym, &old->funcs_list, item, tmp) {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
+ struct func_name_desc d;
+ parse_func_name(mod_sym->name, &d);
- mod_sym->addr = module_sym(new, name.sym);
+ mod_sym->addr = module_sym(new, d.sym);
if (mod_sym->addr == NULL) {
say_error("module: reload %s, symbol %s not found",
- package, name.sym);
+ package, d.sym);
goto restore;
}
@@ -467,9 +467,9 @@ module_reload(const char *package, const char *package_end,
* restore old functions.
*/
do {
- struct func_name name;
- func_split_name(mod_sym->name, &name);
- mod_sym->addr = module_sym(old, name.sym);
+ struct func_name_desc d;
+ parse_func_name(mod_sym->name, &d);
+ mod_sym->addr = module_sym(old, d.sym);
if (mod_sym->addr == NULL) {
/*
* Something strange was happen, an early loaden
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-24 22:32 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:27 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
Very dubious commit. To me the old and new names look the same.
'func_name' was even better than 'func_name_desc'. At least
because it was shorter, and 'desc' does not add any more meaning.
On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote:
> - rename func_name to func_name_desc because
> it is not just a name but rather a name descriptor
> which includes symbol address
I don't see any address in it. Only name tokens.
> - rename func_split_name to parse_func_name because
> the action of splitting name to parts is not what
> the function about, the function parses specification
> to fetch a descriptor
>
> Part-of #4642
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-01-24 22:32 ` Cyrill Gorcunov via Tarantool-patches
2021-01-30 18:53 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-24 22:32 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> Very dubious commit. To me the old and new names look the same.
> 'func_name' was even better than 'func_name_desc'. At least
> because it was shorter, and 'desc' does not add any more meaning.
>
> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote:
> > - rename func_name to func_name_desc because
> > it is not just a name but rather a name descriptor
> > which includes symbol address
>
> I don't see any address in it. Only name tokens.
struct func_name_desc {
/**
* Null-terminated symbol name, e.g.
* "func" for "mod.submod.func".
*/
---> const char *sym;
/**
* Package name, e.g. "mod.submod" for
* "mod.submod.func".
*/
const char *package;
/**
* A pointer to the last character in ->package + 1.
*/
const char *package_end;
};
I marked the address. The structure is not a function name
but consists of two parts - package path and function name
itself and maybe something new will be added in future. So
func_name is suitable for plain names but not for strings
where some part of it has a special meaning with hidden
extension inside, hereby a descriptor. And don't forget
about grepability, without this name change grep returns
a number of irrelevant results.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming
2021-01-24 22:32 ` Cyrill Gorcunov via Tarantool-patches
@ 2021-01-30 18:53 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-01 11:41 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-30 18:53 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
Hi!
On 24.01.2021 23:32, Cyrill Gorcunov wrote:
> On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote:
>> Thanks for the patch!
>>
>> Very dubious commit. To me the old and new names look the same.
>> 'func_name' was even better than 'func_name_desc'. At least
>> because it was shorter, and 'desc' does not add any more meaning.
>>
>> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote:
>>> - rename func_name to func_name_desc because
>>> it is not just a name but rather a name descriptor
>>> which includes symbol address
>>
>> I don't see any address in it. Only name tokens.
>
> struct func_name_desc {
> /**
> * Null-terminated symbol name, e.g.
> * "func" for "mod.submod.func".
> */
> ---> const char *sym;
> /**
> * Package name, e.g. "mod.submod" for
> * "mod.submod.func".
> */
> const char *package;
> /**
> * A pointer to the last character in ->package + 1.
> */
> const char *package_end;
> };
>
> I marked the address.
You said "symbol address". But it is not a symbol address. It is
an address pointing at a part of the name string. Not at the
actual symbol, which you need to load later.
> The structure is not a function name
> but consists of two parts - package path and function name
> itself and maybe something new will be added in future.
In this case 'name desc' is also invalid - it does not describe
the name. Because there is also a path which is not a part of
the name.
It would be more correct to call it func_path then.
> So func_name is suitable for plain names but not for strings
> where some part of it has a special meaning with hidden
> extension inside, hereby a descriptor.
In this case you didn't really change it - it was about function
name and stayed about function name. It should be func_path then.
If you really want to rename.
> And don't forget
> about grepability, without this name change grep returns
> a number of irrelevant results.
This struct is defined and used in a single file. It is not that
hard to find it here. Especially if you use some smarter tools.
For instance, I use full text search in Sublime, and have never had
any issues with finding anything in Tarantool sources. Plus for
structs and functions you usually want to use ctags (Sublime also
generates these tags). Maybe it is only hard when you use command
line and bare grep.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming
2021-01-30 18:53 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-01 11:41 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-02-01 11:41 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sat, Jan 30, 2021 at 07:53:03PM +0100, Vladislav Shpilevoy wrote:
> Hi!
>
> On 24.01.2021 23:32, Cyrill Gorcunov wrote:
> > On Sun, Jan 24, 2021 at 05:27:12PM +0100, Vladislav Shpilevoy wrote:
> >> Thanks for the patch!
> >>
> >> Very dubious commit. To me the old and new names look the same.
> >> 'func_name' was even better than 'func_name_desc'. At least
> >> because it was shorter, and 'desc' does not add any more meaning.
> >>
> >> On 18.01.2021 21:35, Cyrill Gorcunov via Tarantool-patches wrote:
> >>> - rename func_name to func_name_desc because
> >>> it is not just a name but rather a name descriptor
> >>> which includes symbol address
> >>
> >> I don't see any address in it. Only name tokens.
> >
> > struct func_name_desc {
> > /**
> > * Null-terminated symbol name, e.g.
> > * "func" for "mod.submod.func".
> > */
> > ---> const char *sym;
> > /**
> > * Package name, e.g. "mod.submod" for
> > * "mod.submod.func".
> > */
> > const char *package;
> > /**
> > * A pointer to the last character in ->package + 1.
> > */
> > const char *package_end;
> > };
> >
> > I marked the address.
>
> You said "symbol address". But it is not a symbol address. It is
> an address pointing at a part of the name string. Not at the
> actual symbol, which you need to load later.
This is an address of part of the string which represents symbol.
When symbol is loaded it will have "resolved" address. Plain
address and resolved address are pretty different things in
loaders terminology. In real loaders initial address is position
inside strings section which you're resolving later upon symbols
lookup. We have the similar approach, except our initial address is
position inside package string...
>
> > The structure is not a function name
> > but consists of two parts - package path and function name
> > itself and maybe something new will be added in future.
>
> In this case 'name desc' is also invalid - it does not describe
> the name. Because there is also a path which is not a part of
> the name.
>
> It would be more correct to call it func_path then.
It is not a path. It is a *descriptor* which identify function
in unique way on a storage device. But sure I will do
struct func_path {
const char *sym_name;
const char *package;
const char *package_end;
};
>
> > So func_name is suitable for plain names but not for strings
> > where some part of it has a special meaning with hidden
> > extension inside, hereby a descriptor.
>
> In this case you didn't really change it - it was about function
> name and stayed about function name. It should be func_path then.
> If you really want to rename.
OK, I would prefer to rename. func_name already used in many places
with rather plain string context, and forcing to use some fulltext
search is not a good idea. To be honest I would keep func_name_desc
here because the structure might extend in future but func_path
fine as well.
>
> > And don't forget
> > about grepability, without this name change grep returns
> > a number of irrelevant results.
>
> This struct is defined and used in a single file. It is not that
> hard to find it here. Especially if you use some smarter tools.
> For instance, I use full text search in Sublime, and have never had
> any issues with finding anything in Tarantool sources. Plus for
> structs and functions you usually want to use ctags (Sublime also
> generates these tags). Maybe it is only hard when you use command
> line and bare grep.
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (2 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 3/8] module_cache: improve naming Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-19 12:46 ` Cyrill Gorcunov via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure Cyrill Gorcunov via Tarantool-patches
` (4 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
Currently when we reload modules we remove old instance
from the module cache and then try to insert a new one
back. Note that module cache is a string based hash table:
we lookup for a pointer to the module via package name.
Thus when reload initiated we simply remove same key
from hash and put it back with a new pointer. This is
too complex and completely useless. Instead we should
simply update the value associated this the key in the
hash table. For this sake we introduce module_cache_update
helper.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/module_cache.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 8ae61a883..48e55f421 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -422,6 +422,21 @@ module_sym_call(struct module_sym *mod_sym, struct port *args,
return rc;
}
+/**
+ * Update the module cache. Since the lookup is string
+ * key based it is safe to just update the value.
+ */
+static int
+module_cache_update(const char *name, const char *name_end,
+ struct module *module)
+{
+ mh_int_t e = mh_strnptr_find_inp(mod_hash, name, name_end - name);
+ if (e == mh_end(mod_hash))
+ return -1;
+ mh_strnptr_node(mod_hash, e)->val = module;
+ return 0;
+}
+
int
module_reload(const char *package, const char *package_end,
struct module **module)
@@ -453,9 +468,15 @@ module_reload(const char *package, const char *package_end,
rlist_move(&new->funcs_list, &mod_sym->item);
}
- module_cache_del(package, package_end);
- if (module_cache_add(new) != 0)
- goto restore;
+ if (module_cache_update(package, package_end, new) != 0) {
+ /*
+ * Module cache must be consistent at this moment,
+ * we've looked up for the package recently. If
+ * someone has updated the cache in unexpected way
+ * the consistency is lost and we must not continue.
+ */
+ panic("module: can't update module cache (%s)", package);
+ }
module_gc(old);
*module = new;
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload Cyrill Gorcunov via Tarantool-patches
@ 2021-01-19 12:46 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-19 12:46 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote:
> Currently when we reload modules we remove old instance
> from the module cache and then try to insert a new one
> back. Note that module cache is a string based hash table:
> we lookup for a pointer to the module via package name.
This approach doesn't work. An update patch attached. I force pushed the update.
---
From: Cyrill Gorcunov <gorcunov@gmail.com>
Date: Tue, 12 Jan 2021 11:53:50 +0300
Subject: [PATCH v12 4/8] module_cache: panic if module cache update failed
Currently when we reload modules we remove old instance
from the module cache and then try to insert a new one
back. If error happens inbetween (say machine reached
lack of memory and hash resize failed) we try to restore
functions on old module. This is fine but we can't continue
working with modules -- next reload action won't find
the module in hash and won't reload. Since this is a really
rare case plain panic should be fine here.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/module_cache.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 8ae61a883..ccf079d7a 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -422,6 +422,24 @@ module_sym_call(struct module_sym *mod_sym, struct port *args,
return rc;
}
+/**
+ * Update the module cache.
+ */
+static void
+module_cache_update(const char *package, const char *package_end,
+ struct module *module)
+{
+ /*
+ * Due to hash internals we need to remove old node
+ * and insert a new one. If there some memory pressure
+ * happen between these actions we would loose the hash
+ * node and this make inconsistent the next modules reload.
+ */
+ module_cache_del(package, package_end);
+ if (module_cache_add(module) != 0)
+ panic("module: can't update module cache (%s)", package);
+}
+
int
module_reload(const char *package, const char *package_end,
struct module **module)
@@ -453,10 +471,7 @@ module_reload(const char *package, const char *package_end,
rlist_move(&new->funcs_list, &mod_sym->item);
}
- module_cache_del(package, package_end);
- if (module_cache_add(new) != 0)
- goto restore;
-
+ module_cache_update(package, package_end, new);
module_gc(old);
*module = new;
return 0;
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload
2021-01-19 12:46 ` Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-24 22:26 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:27 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
On 19.01.2021 13:46, Cyrill Gorcunov wrote:
> On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote:
>> Currently when we reload modules we remove old instance
>> from the module cache and then try to insert a new one
>> back. Note that module cache is a string based hash table:
>> we lookup for a pointer to the module via package name.
>
> This approach doesn't work.
Why?
> ---
> From: Cyrill Gorcunov <gorcunov@gmail.com>
> Date: Tue, 12 Jan 2021 11:53:50 +0300
> Subject: [PATCH v12 4/8] module_cache: panic if module cache update failed
>
> Currently when we reload modules we remove old instance
> from the module cache and then try to insert a new one
> back. If error happens inbetween (say machine reached
> lack of memory and hash resize failed) we try to restore
> functions on old module. This is fine but we can't continue
> working with modules -- next reload action won't find
> the module in hash and won't reload. Since this is a really
> rare case plain panic should be fine here.
>
> Part-of #4642
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-01-24 22:26 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-24 22:26 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sun, Jan 24, 2021 at 05:27:18PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> On 19.01.2021 13:46, Cyrill Gorcunov wrote:
> > On Mon, Jan 18, 2021 at 11:35:52PM +0300, Cyrill Gorcunov wrote:
> >> Currently when we reload modules we remove old instance
> >> from the module cache and then try to insert a new one
> >> back. Note that module cache is a string based hash table:
> >> we lookup for a pointer to the module via package name.
> >
> > This approach doesn't work.
>
> Why?
False alarm, it does work. I fear due to shadow copy it won't
be consistent but looking more precisely to mhash code I think
it should be safe to update the pointer inplace without modifying
the hash table. I still think our hashing code is just unreadable
pile but whatever...
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (3 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 4/8] module_cache: direct update a cache value on reload Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 6/8] module_cache: provide helpers to load and unload modules Cyrill Gorcunov via Tarantool-patches
` (3 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
We will use this field not only to count a number of
active calls to the module but to prevent the module
from disappear when it get loaded via "cmod" interface
(will be implemented in next patches).
Same time make it 64 bit long since there might be a
number of such references size_t didn't prevent from
numeric overflow.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/module_cache.c | 8 ++++----
src/box/module_cache.h | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 48e55f421..ae874caab 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -214,7 +214,7 @@ module_load(const char *package, const char *package_end)
memcpy(module->package, package, package_len);
module->package[package_len] = 0;
rlist_create(&module->funcs_list);
- module->calls = 0;
+ module->ref = 0;
const char *tmpdir = getenv("TMPDIR");
if (tmpdir == NULL)
@@ -312,7 +312,7 @@ module_delete(struct module *module)
static void
module_gc(struct module *module)
{
- if (rlist_empty(&module->funcs_list) && module->calls == 0)
+ if (rlist_empty(&module->funcs_list) && module->ref == 0)
module_delete(module);
}
@@ -404,9 +404,9 @@ module_sym_call(struct module_sym *mod_sym, struct port *args,
*/
struct module *module = mod_sym->module;
assert(module != NULL);
- ++module->calls;
+ ++module->ref;
int rc = mod_sym->addr(&ctx, data, data + data_sz);
- --module->calls;
+ --module->ref;
module_gc(module);
region_truncate(region, region_svp);
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
index fd789f603..feb1f2266 100644
--- a/src/box/module_cache.h
+++ b/src/box/module_cache.h
@@ -41,9 +41,9 @@ struct module {
*/
struct rlist funcs_list;
/**
- * Count of active calls.
+ * Count of active references to the module.
*/
- size_t calls;
+ int64_t ref;
/**
* Module's package name.
*/
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-25 10:29 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:27 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
Why isn't it in the third commit ("improve naming")?
On 18.01.2021 21:35, Cyrill Gorcunov wrote:
> We will use this field not only to count a number of
> active calls to the module but to prevent the module
> from disappear when it get loaded via "cmod" interface
> (will be implemented in next patches).
>
> Same time make it 64 bit long since there might be a
> number of such references size_t didn't prevent from
> numeric overflow.
>
> Part-of #4642
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> src/box/module_cache.c | 8 ++++----
> src/box/module_cache.h | 4 ++--
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> index 48e55f421..ae874caab 100644
> --- a/src/box/module_cache.c
> +++ b/src/box/module_cache.c
> @@ -214,7 +214,7 @@ module_load(const char *package, const char *package_end)
> memcpy(module->package, package, package_len);
> module->package[package_len] = 0;
> rlist_create(&module->funcs_list);
> - module->calls = 0;
> + module->ref = 0;
I propose to be consistent with the other code and
call it 'refs' instead of 'ref' (see struct error, struct tuple,
struct tuple_format).
> const char *tmpdir = getenv("TMPDIR");
> if (tmpdir == NULL)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure
2021-01-24 16:27 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-01-25 10:29 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-25 10:29 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sun, Jan 24, 2021 at 05:27:35PM +0100, Vladislav Shpilevoy wrote:
> Thanks for the patch!
>
> Why isn't it in the third commit ("improve naming")?
Because I wanna separate function descriptor changes from
module refs. They are serving different purposes and moreover
the refs patch changes the element type.
> > - module->calls = 0;
> > + module->ref = 0;
>
> I propose to be consistent with the other code and
> call it 'refs' instead of 'ref' (see struct error, struct tuple,
> struct tuple_format).
Sure
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 6/8] module_cache: provide helpers to load and unload modules
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (4 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 5/8] module_cache: rename calls to ref in module structure Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (2 subsequent siblings)
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
We need to be able to load and unload modules without
functions bound to them, just plain dlopen inside. This
is a part of API for future patches.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/box/module_cache.c | 47 +++++++++++++++++++++++++++++++-----------
src/box/module_cache.h | 19 +++++++++++++++++
2 files changed, 54 insertions(+), 12 deletions(-)
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index ae874caab..cb580f342 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -198,7 +198,7 @@ module_find(const char *package, const char *package_end,
* for cases of a function reload.
*/
static struct module *
-module_load(const char *package, const char *package_end)
+module_dlopen(const char *package, const char *package_end)
{
char path[PATH_MAX];
if (module_find(package, package_end, path, sizeof(path)) != 0)
@@ -338,16 +338,10 @@ module_sym_load(struct module_sym *mod_sym)
struct func_name_desc d;
parse_func_name(mod_sym->name, &d);
- struct module *module = module_cache_find(d.package, d.package_end);
- if (module == NULL) {
- module = module_load(d.package, d.package_end);
- if (module == NULL)
- return -1;
- if (module_cache_add(module) != 0) {
- module_delete(module);
- return -1;
- }
- }
+ size_t len = d.package_end - d.package;
+ struct module *module = module_load(d.package, len);
+ if (module == NULL)
+ return -1;
mod_sym->addr = module_sym(module, d.sym);
if (mod_sym->addr == NULL)
@@ -437,6 +431,34 @@ module_cache_update(const char *name, const char *name_end,
return 0;
}
+/**
+ * Load a new module.
+ */
+struct module *
+module_load(const char *path, size_t path_len)
+{
+ struct module *module = module_cache_find(path, &path[path_len]);
+ if (module == NULL) {
+ module = module_dlopen(path, &path[path_len]);
+ if (module == NULL)
+ return NULL;
+ if (module_cache_add(module) != 0) {
+ module_delete(module);
+ return NULL;
+ }
+ }
+ return module;
+}
+
+/**
+ * Unload a module.
+ */
+void
+module_unload(struct module *module)
+{
+ module_gc(module);
+}
+
int
module_reload(const char *package, const char *package_end,
struct module **module)
@@ -448,7 +470,8 @@ module_reload(const char *package, const char *package_end,
return 0;
}
- struct module *new = module_load(package, package_end);
+ size_t len = package_end - package;
+ struct module *new = module_dlopen(package, &package[len]);
if (new == NULL)
return -1;
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
index feb1f2266..87108ae81 100644
--- a/src/box/module_cache.h
+++ b/src/box/module_cache.h
@@ -107,6 +107,25 @@ int
module_sym_call(struct module_sym *mod_sym, struct port *args,
struct port *ret);
+/**
+ * Load a module.
+ *
+ * @param path path to a module file without extension.
+ * @param path_len length of @path.
+ *
+ * @return pointer to a module or NULL on error.
+ */
+struct module *
+module_load(const char *path, size_t path_len);
+
+/**
+ * Unload a module.
+ *
+ * @param module pointer to a module.
+ */
+void
+module_unload(struct module *module);
+
/**
* Reload a module and all associated symbols.
*
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 6/8] module_cache: provide helpers to load and unload modules
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 6/8] module_cache: provide helpers to load and unload modules Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:28 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> index ae874caab..cb580f342 100644
> --- a/src/box/module_cache.c
> +++ b/src/box/module_cache.c
> @@ -437,6 +431,34 @@ module_cache_update(const char *name, const char *name_end,
> return 0;
> }
>
> +/**
> + * Load a new module.
> + */
> +struct module *
> +module_load(const char *path, size_t path_len)
> +{
> + struct module *module = module_cache_find(path, &path[path_len]);
> + if (module == NULL) {
> + module = module_dlopen(path, &path[path_len]);
> + if (module == NULL)
> + return NULL;
> + if (module_cache_add(module) != 0) {
> + module_delete(module);
> + return NULL;
> + }
> + }
> + return module;
> +}
> +
> +/**
> + * Unload a module.
> + */
> +void
> +module_unload(struct module *module)
> +{
> + module_gc(module);
From what I see in the main commit, you should increase
the reference counter in load and decrease it in unload.
When you do that - you get another issue: --ref + module_gc() become
a pattern. So it makes sense to move them to a function module_unref().
Also it looks super strange, that the module deletion from the cache
is done in module_sym_unload(). I don't think it should change anything
except its mod_sym argument.
Another issue - module.funcs_list is an implicit reference counter,
which is extra weird. The function list should only be used for
reloads. Each function should increase the counter, so it couldn't happen
that the ref counter is 0, but there are functions keeping a pointer at
the module.
As you can see there is really a pile of issues with the current
reference counting and deletion of modules. Maybe better fix all of
them while you are here. At least the issue in your main patch about
touching ref counter manually.
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (5 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 6/8] module_cache: provide helpers to load and unload modules Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 8/8] test: box/cfunc -- add cmod test Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:26 ` [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Vladislav Shpilevoy via Tarantool-patches
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
Currently to run "C" function from some external module
one have to register it first in "_func" system space. This
is a problem if node is in read-only mode (replica).
Still people would like to have a way to run such functions
even in ro mode. For this sake we implement "cmod" lua module.
Fixes #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
@TarantoolBot document
Title: cmod module
Overview
========
`cmod` module provides a way to create, delete and execute
`C` procedures. Unlike `box.schema.func` methods the functions
created with `cmod` help are not persistent and live purely
in memory. Once a node get turned off they are vanished.
An initial purpose for them is to execute them on nodes
which are running in read-only mode.
Module functions
================
`require('cmod').load(path) -> obj | nil, err`
----------------------------------------------
Loads a module from `path` and return an object instance
associate with the module. The `path` should not end up
with shared object extension (such as `.so`), only a file
name shall be there.
Possible errors:
- IllegalParams: module path is either not supplied
or not a string.
- SystemError: unable to open a module due to a system error.
- ClientError: a module does not exist.
- OutOfMemory: unable to allocate a module.
`module:unload() -> true | nil, err`
------------------------------------
Unloads a module. Once the module is unloaded one can't load
new functions from this module instance.
Possible errors:
- IllegalParams: a module is not supplied.
- IllegalParams: a module is already unloaded.
Example:
``` Lua
m, err = require('cmod').load('path')
if not m then
print(err)
end
ok, err = m:unload()
if not ok then
print(err)
end
```
If there are functions from this module referenced somewhere
in other Lua code they still can be called because module
continue sitting in memory until the last reference is closed.
If the module become a target to the Lua's garbage collector
then unload is called implicitly.
`module:reload() -> true | nil, err`
------------------------------------
Reloads a module and all functions which were associated with
the module. Each module keeps a list of functions belonging to
the module and reload procedure cause the bound function to update
their addresses such that function execution will be routed via
a new library.
Modules are loaded with that named local binding which means
that reload of module symbols won't affect the functions which
are started execution already, only new calls will be rerouted.
Possible errors:
- IllegalParams: a module is not supplied.
- ClientError: a module does not exist.
On success `true` is returned, otherwise `nil,error` pair.
Example:
``` Lua
m, err = require('cmod').load('path')
if not m then
print(err)
end
ok, err = m:reload()
if not ok then
print(err)
end
```
module:load(name) -> obj | nil, err`
------------------------------------
Loads a new function with name `name` from the `module` object
and return a callable object instance associate with the function.
Possible errors:
- IllegalParams: function name is either not supplied
or not a string.
- OutOfMemory: unable to allocate a function.
Example:
``` Lua
m, err = require('cmod').load('path')
if not m then
print(err)
end
f, err = m:load('foo')
if not ok then
print(err)
end
```
`function:unload() -> true | nil, err`
--------------------------------------
Unloads a function.
Possible errors:
- IllegalParams: function name is either not supplied
or not a string.
- IllegalParams: the function does not exist.
Example:
``` Lua
m, err = require('cmod').load('path')
if not m then
print(err)
end
f, err = m:load('foo')
if not ok then
print(err)
end
ok, err = f:unload()
if not ok then
print(err)
end
```
Executing a loaded function
===========================
Once function is loaded it can be executed by ordinary Lua call.
Lets consider the following example. We have a `C` function which
takes two numbers and returns their sum.
``` C
int
cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
uint32_t arg_count = mp_decode_array(&args);
if (arg_count != 2) {
return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
"invalid argument count");
}
uint64_t a = mp_decode_uint(&args);
uint64_t b = mp_decode_uint(&args);
char res[16];
char *end = mp_encode_uint(res, a + b);
box_return_mp(ctx, res, end);
return 0;
}
```
The name of the function is `cfunc_sum` and the function is built into
`cfunc.so` shared library.
First we should load it as
``` Lua
m, err = require('cmod').load('cfunc')
if not m then
print(err)
end
cfunc_sum, err = m:load('cfunc_sum')
if not cfunc_sum then
print(err)
end
```
Once successfully loaded we can execute it. Note that unlike regular
Lua functions the context of `C` functions is different. They never
thrown an exception but return `true|nil, res` form where first value
set to `nil` in case of error condition and `res` carries an error
description.
Lets call the `cfunc_sum` with wrong number of arguments
``` Lua
local ok, res = cfunc_sum()
if not ok then
print(res)
end
```
We will the `"invalid argument count"` message in output.
The error message has been set by the `box_error_set` in `C`
code above.
On success the first returned value set to `true` and `res` represent
function execution result.
``` Lua
local ok, res = cfunc_sum(1, 2)
assert(ok);
print(res)
```
We will see the number `3` in output.
---
src/box/CMakeLists.txt | 1 +
src/box/lua/cmod.c | 718 +++++++++++++++++++++++++++++++++++++++++
src/box/lua/cmod.h | 24 ++
src/box/lua/init.c | 2 +
4 files changed, 745 insertions(+)
create mode 100644 src/box/lua/cmod.c
create mode 100644 src/box/lua/cmod.h
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 339e2c8a9..feba5a037 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -195,6 +195,7 @@ add_library(box STATIC
lua/init.c
lua/call.c
lua/cfg.cc
+ lua/cmod.c
lua/console.c
lua/serialize_lua.c
lua/tuple.c
diff --git a/src/box/lua/cmod.c b/src/box/lua/cmod.c
new file mode 100644
index 000000000..c3a44e3df
--- /dev/null
+++ b/src/box/lua/cmod.c
@@ -0,0 +1,718 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#include <string.h>
+#include <lua.h>
+
+#include "assoc.h"
+#include "diag.h"
+
+#include "box/module_cache.h"
+#include "box/error.h"
+#include "box/port.h"
+#include "tt_static.h"
+
+#include "trivia/util.h"
+#include "lua/utils.h"
+
+/**
+ * Function descriptor.
+ */
+struct cmod_func {
+ /**
+ * Symbol descriptor for the function in
+ * an associated module.
+ */
+ struct module_sym mod_sym;
+ /**
+ * Number of loads of the function.
+ */
+ int64_t load_count;
+ /**
+ * The function name (without a package path).
+ */
+ const char *func_name;
+ /**
+ * Length of @a name member.
+ */
+ size_t len;
+ /**
+ * Module path with function name separated
+ * by a point, like "module.func".
+ *
+ * See parse_func_name helper, we simply have
+ * to keep this strange architecture for backward
+ * compatibility sake (ie for functions which
+ * are created via box.schema.func).
+ */
+ char name[0];
+};
+
+/** Get a handle associated with an object. */
+static void *
+cmod_get_handle(struct lua_State *L, const char *uname)
+{
+ void **pptr = luaL_testudata(L, 1, uname);
+ return pptr != NULL ? *pptr : NULL;
+}
+
+/** Set a handle to a new value. */
+static void
+cmod_set_handle(struct lua_State *L, const char *uname, void *ptr)
+{
+ void **pptr = luaL_testudata(L, 1, uname);
+ if (pptr != NULL)
+ *pptr = ptr;
+}
+
+/** Setup a new handle and associate it with an object. */
+static void
+cmod_setup_handle(struct lua_State *L, const char *uname, void *ptr)
+{
+ *(void **)lua_newuserdata(L, sizeof(void *)) = ptr;
+ luaL_getmetatable(L, uname);
+ lua_setmetatable(L, -2);
+}
+
+/** A type to find a module handle from an object. */
+static const char *cmod_module_handle_uname = "cmod_module_handle";
+
+static struct module *
+cmod_get_module_handle(struct lua_State *L, bool mandatory)
+{
+ struct module *m = cmod_get_handle(L, cmod_module_handle_uname);
+ if (mandatory && m == NULL) {
+ const char *fmt = "The module is already unloaded";
+ diag_set(IllegalParams, fmt);
+ }
+ return m;
+}
+
+static void
+cmod_set_module_handle(struct lua_State *L, struct module *module)
+{
+ cmod_set_handle(L, cmod_module_handle_uname, module);
+}
+
+static void
+cmod_setup_module_handle(struct lua_State *L, struct module *module)
+{
+ cmod_setup_handle(L, cmod_module_handle_uname, module);
+}
+
+/** A type to find a function handle from an object. */
+static const char *cmod_func_handle_uname = "cmod_func_handle";
+
+static struct cmod_func *
+cmod_get_func_handle(struct lua_State *L, bool mandatory)
+{
+ struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname);
+ if (mandatory && cf == NULL) {
+ const char *fmt = "The function is already unloaded";
+ diag_set(IllegalParams, fmt);
+ }
+ return cf;
+}
+
+static void
+cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf)
+{
+ cmod_set_handle(L, cmod_func_handle_uname, cf);
+}
+
+static void
+cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf)
+{
+ cmod_setup_handle(L, cmod_func_handle_uname, cf);
+}
+
+/**
+ * Function name to cmod_func hash. The name includes
+ * module package path without file extension.
+ */
+static struct mh_strnptr_t *func_hash = NULL;
+
+/**
+ * Find function in cmod_func hash.
+ *
+ * @param name function name.
+ * @param name_len function name length.
+ *
+ * @returns function descriptor if found, NULL otherwise.
+ */
+struct cmod_func *
+cmod_func_find(const char *name, size_t name_len)
+{
+ mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
+ if (e == mh_end(func_hash))
+ return NULL;
+ return mh_strnptr_node(func_hash, e)->val;
+}
+
+/**
+ * Delete a function instance from the hash or decrease
+ * a reference if the function is still loaded.
+ *
+ * @param cf function descriptor.
+ *
+ * @retval true if the function has no more loaded instances
+ * and removed from the hash.
+ *
+ * @retval false if there are loaded instances left and function
+ * is kept in the hash.
+ */
+static bool
+cmod_func_del(struct cmod_func *cf)
+{
+ assert(cf->load_count > 0);
+ if (cf->load_count-- != 1)
+ return false;
+
+ mh_int_t e = mh_strnptr_find_inp(func_hash, cf->name, cf->len);
+ assert(e != mh_end(func_hash));
+ mh_strnptr_del(func_hash, e, NULL);
+ return true;
+}
+
+/**
+ * Add a function instance into the hash or increase
+ * a reference if the function is already exist.
+ *
+ * @param cf Function descriptor.
+ *
+ * Possible errors:
+ *
+ * - OutOfMemory: unable to allocate a hash entry.
+ *
+ * @retval 0 on success.
+ * @retval -1 on error, diag is set.
+ */
+static int
+cmod_func_add(struct cmod_func *cf)
+{
+ assert(cf->load_count >= 0);
+ if (cf->load_count++ != 0)
+ return 0;
+
+ const struct mh_strnptr_node_t nd = {
+ .str = cf->name,
+ .len = cf->len,
+ .hash = mh_strn_hash(cf->name, cf->len),
+ .val = cf,
+ };
+
+ mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
+ if (e == mh_end(func_hash)) {
+ diag_set(OutOfMemory, sizeof(nd),
+ "malloc", "cmod_func node");
+ return -1;
+ }
+ return 0;
+}
+
+/**
+ * Unload a symbol and free a function instance.
+ *
+ * @param cf function descriptor.
+ */
+static void
+cmod_func_free(struct cmod_func *cf)
+{
+ module_sym_unload(&cf->mod_sym);
+ TRASH(cf);
+ free(cf);
+}
+
+/**
+ * Allocate a new function instance and resolve a symbol address.
+ *
+ * @param name package path and a function name, ie "module.foo"
+ * @param len length of @a name.
+ * @param func_name_len function name length, ie "3" for "module.foo"
+ *
+ * @returns function instance on success, NULL otherwise setting diag area.
+ */
+static struct cmod_func *
+cmod_func_new(const char *name, size_t len, size_t func_name_len)
+{
+ const ssize_t cf_size = sizeof(struct cmod_func);
+ size_t size = cf_size + len + 1;
+ struct cmod_func *cf = malloc(size);
+ if (cf == NULL) {
+ diag_set(OutOfMemory, size, "malloc", "cf");
+ return NULL;
+ }
+
+ cf->mod_sym.addr = NULL;
+ cf->mod_sym.module = NULL;
+ cf->load_count = 0;
+ cf->mod_sym.name = cf->name;
+ cf->func_name = &cf->name[len - func_name_len];
+ cf->len = len;
+
+ memcpy(cf->name, name, len);
+ cf->name[len] = '\0';
+
+ if (module_sym_load(&cf->mod_sym) != 0) {
+ cmod_func_free(cf);
+ return NULL;
+ }
+
+ return cf;
+}
+
+/**
+ * Load a new function.
+ *
+ * This function takes a function name from the caller
+ * stack @a L and creates a new function object. If
+ * the function is already loaded we simply return
+ * a reference to existing one.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: function name is either not supplied
+ * or not a string.
+ * - IllegalParams: function references limit exceeded.
+ * - OutOfMemory: unable to allocate a function.
+ *
+ * @returns function object on success or {nil,error} on error,
+ * the error is set to the diagnostics area.
+ */
+static int
+lcmod_func_load(struct lua_State *L)
+{
+ const char *method = "function = module:load";
+ struct cmod_func *cf = NULL;
+
+ if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
+ const char *fmt =
+ "Expects %s(\'name\') but no name passed";
+ diag_set(IllegalParams, fmt, method);
+ return luaT_push_nil_and_error(L);
+ }
+
+ struct module *m = cmod_get_module_handle(L, false);
+ if (m == NULL) {
+ const char *fmt =
+ "Expects %s(\'name\') but not module object passed";
+ diag_set(IllegalParams, fmt, method);
+ return luaT_push_nil_and_error(L);
+ }
+
+ const char *func_name = lua_tostring(L, 2);
+ const char *name = tt_sprintf("%s.%s", m->package, func_name);
+ size_t len = strlen(name);
+
+ cf = cmod_func_find(name, len);
+ if (cf == NULL) {
+ cf = cmod_func_new(name, len, strlen(func_name));
+ if (cf == NULL)
+ return luaT_push_nil_and_error(L);
+ }
+
+ if (cmod_func_add(cf) != 0) {
+ cmod_func_free(cf);
+ return luaT_push_nil_and_error(L);
+ }
+
+ cmod_setup_func_handle(L, cf);
+ return 1;
+}
+
+/**
+ * Unload a function.
+ *
+ * This function takes a function object from
+ * the caller stack @a L and unloads it.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: function is not supplied.
+ * - IllegalParams: the function does not exist.
+ *
+ * @returns true on success or {nil,error} on error,
+ * the error is set to the diagnostics area.
+ */
+static int
+lcmod_func_unload(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1) {
+ const char *fmt = "Expects function:unload()";
+ diag_set(IllegalParams, fmt);
+ return luaT_push_nil_and_error(L);
+ }
+
+ struct cmod_func *cf = cmod_get_func_handle(L, true);
+ if (cf == NULL)
+ return luaT_push_nil_and_error(L);
+
+ cmod_set_func_handle(L, NULL);
+ if (cmod_func_del(cf))
+ cmod_func_free(cf);
+
+ lua_pushboolean(L, true);
+ return 1;
+}
+
+/**
+ * Load a new module.
+ *
+ * This function takes a module patch from the caller
+ * stack @a L and creates a new module object.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: module path is either not supplied
+ * or not a string.
+ * - SystemError: unable to open a module due to a system error.
+ * - ClientError: a module does not exist.
+ * - OutOfMemory: unable to allocate a module.
+ *
+ * @returns module object on success or {nil,error} on error,
+ * the error is set to the diagnostics area.
+ */
+static int
+lcmod_module_load(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
+ const char *fmt =
+ "Expects cmod.load(\'name\') but no name passed";
+ diag_set(IllegalParams, fmt);
+ return luaT_push_nil_and_error(L);
+ }
+
+ size_t name_len;
+ const char *name = lua_tolstring(L, 1, &name_len);
+
+ struct module *module = module_load(name, name_len);
+ if (module == NULL)
+ return luaT_push_nil_and_error(L);
+
+ cmod_setup_module_handle(L, module);
+ /*
+ * Make sure the module won't disappear until
+ * it is GC'ed or unloaded explicitly.
+ */
+ module->ref++;
+ return 1;
+}
+
+/**
+ * Unload a module.
+ *
+ * This function takes a module object from
+ * the caller stack @a L and unloads it.
+ *
+ * If there are some active functions left then
+ * module won't be freed internally until last function
+ * from this module is unloaded, this is guaranteed by
+ * module_cache engine.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: a module is not supplied.
+ * - IllegalParams: a module does not exist.
+ *
+ * @returns true on success or {nil,error} on error,
+ * the error is set to the diagnostics area.
+ */
+static int
+lcmod_module_unload(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1) {
+ const char *fmt = "Expects module:unload()";
+ diag_set(IllegalParams, fmt);
+ return luaT_push_nil_and_error(L);
+ }
+
+ struct module *m = cmod_get_module_handle(L, true);
+ if (m == NULL)
+ return luaT_push_nil_and_error(L);
+ m->ref--;
+ cmod_set_module_handle(L, NULL);
+ lua_pushboolean(L, true);
+ return 1;
+}
+
+/**
+ * Reload a module.
+ *
+ * This function takes a module object from the caller
+ * stack @a L and reloads all functions associated with
+ * it.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: a module is not supplied.
+ * - IllegalParams: a module does not exist.
+ * - ClientError: a module with the name provided does
+ * not exist.
+ *
+ * @returns true on success or {nil,error} on error,
+ * the error is set to the diagnostics area.
+ */
+static int
+lcmod_module_reload(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1) {
+ const char *fmt = "Expects module:reload()";
+ diag_set(IllegalParams, fmt);
+ return luaT_push_nil_and_error(L);
+ }
+
+ struct module *m = cmod_get_module_handle(L, true);
+ if (m == NULL)
+ return luaT_push_nil_and_error(L);
+
+ struct module *new = NULL;
+ size_t len = strlen(m->package);
+ if (module_reload(m->package, &m->package[len], &new) == 0) {
+ if (new != NULL) {
+ m->ref--;
+ module_unload(m);
+ new->ref++;
+ cmod_set_module_handle(L, new);
+ lua_pushboolean(L, true);
+ return 1;
+ }
+ diag_set(ClientError, ER_NO_SUCH_MODULE, m->package);
+ }
+ return luaT_push_nil_and_error(L);
+}
+
+/**
+ * Handle __index request for a module object.
+ */
+static int
+lcmod_module_handle_index(struct lua_State *L)
+{
+ /*
+ * Instead of showing userdata pointer
+ * lets provide a serialized value.
+ */
+ lua_getmetatable(L, 1);
+ lua_pushvalue(L, 2);
+ lua_rawget(L, -2);
+ if (!lua_isnil(L, -1))
+ return 1;
+
+ struct module *m = cmod_get_module_handle(L, false);
+ if (m == NULL) {
+ lua_pushnil(L);
+ return 1;
+ }
+
+ const char *key = lua_tostring(L, 2);
+
+ if (lua_type(L, 2) != LUA_TSTRING || key == NULL) {
+ diag_set(IllegalParams,
+ "Bad params, use __index(obj, <string>)");
+ return luaT_error(L);
+ }
+
+ if (strcmp(key, "path") == 0) {
+ lua_pushstring(L, m->package);
+ return 1;
+ }
+
+ return 0;
+}
+
+/**
+ * Module handle representation for REPL (console).
+ */
+static int
+lcmod_module_handle_serialize(struct lua_State *L)
+{
+ struct module *m = cmod_get_module_handle(L, true);
+ if (m == NULL) {
+ lua_pushnil(L);
+ return 1;
+ }
+
+ lua_createtable(L, 0, 0);
+ lua_pushstring(L, m->package);
+ lua_setfield(L, -2, "path");
+
+ return 1;
+}
+
+/**
+ * Collect a module handle.
+ */
+static int
+lcmod_module_handle_gc(struct lua_State *L)
+{
+ struct module *m = cmod_get_module_handle(L, false);
+ if (m != NULL) {
+ /*
+ * If there are some referenced functions
+ * alive then final module GC will happen
+ * on last function unload (indirectly when
+ * module symbol get collected by module_sym_unload
+ * code).
+ */
+ cmod_set_module_handle(L, NULL);
+ m->ref--;
+ module_unload(m);
+ }
+ return 0;
+}
+
+/**
+ * Function handle representation for REPL (console).
+ */
+static int
+lcmod_func_handle_serialize(struct lua_State *L)
+{
+ struct cmod_func *cf = cmod_get_func_handle(L, true);
+ if (cf == NULL) {
+ lua_pushnil(L);
+ return 1;
+ }
+
+ lua_createtable(L, 0, 0);
+ lua_pushstring(L, cf->name);
+ lua_setfield(L, -2, "name");
+
+ return 1;
+}
+
+/**
+ * Handle __index request for a function object.
+ */
+static int
+lcmod_func_handle_index(struct lua_State *L)
+{
+ /*
+ * Instead of showing userdata pointer
+ * lets provide a serialized value.
+ */
+ lua_getmetatable(L, 1);
+ lua_pushvalue(L, 2);
+ lua_rawget(L, -2);
+ if (!lua_isnil(L, -1))
+ return 1;
+
+ struct cmod_func *cf = cmod_get_func_handle(L, true);
+ if (cf == NULL)
+ return luaT_error(L);
+
+ const char *key = lua_tostring(L, 2);
+
+ if (lua_type(L, 2) != LUA_TSTRING || key == NULL) {
+ diag_set(IllegalParams,
+ "Bad params, use __index(obj, <string>)");
+ return luaT_error(L);
+ }
+
+ if (strcmp(key, "name") == 0) {
+ lua_pushstring(L, cf->name);
+ return 1;
+ }
+
+ return 0;
+}
+
+/**
+ * Collect function handle if there is no active loads left.
+ */
+static int
+lcmod_func_handle_gc(struct lua_State *L)
+{
+ struct cmod_func *cf = cmod_get_func_handle(L, false);
+ if (cf != NULL) {
+ cmod_set_func_handle(L, NULL);
+ if (cmod_func_del(cf))
+ cmod_func_free(cf);
+ }
+ return 0;
+}
+
+/**
+ * Call a function by its name from the Lua code.
+ */
+static int
+lcmod_func_handle_call(struct lua_State *L)
+{
+ struct cmod_func *cf = cmod_get_func_handle(L, true);
+ if (cf == NULL)
+ return luaT_push_nil_and_error(L);
+
+ /*
+ * FIXME: We should get rid of luaT_newthread but this
+ * requires serious modifications. In particular
+ * port_lua_do_dump uses tarantool_L reference and
+ * coro_ref must be valid as well.
+ */
+ lua_State *args_L = luaT_newthread(tarantool_L);
+ if (args_L == NULL)
+ return luaT_push_nil_and_error(L);
+
+ int coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
+ lua_xmove(L, args_L, lua_gettop(L) - 1);
+
+ struct port args;
+ port_lua_create(&args, args_L);
+ ((struct port_lua *)&args)->ref = coro_ref;
+
+ struct port ret;
+ if (module_sym_call(&cf->mod_sym, &args, &ret) != 0) {
+ port_destroy(&args);
+ return luaT_push_nil_and_error(L);
+ }
+
+ int top = lua_gettop(L);
+ lua_pushboolean(L, true);
+ port_dump_lua(&ret, L, true);
+ int cnt = lua_gettop(L) - top;
+
+ port_destroy(&ret);
+ port_destroy(&args);
+
+ return cnt;
+}
+
+/**
+ * Initialize cmod module.
+ */
+void
+box_lua_cmod_init(struct lua_State *L)
+{
+ func_hash = mh_strnptr_new();
+ if (func_hash == NULL) {
+ panic("Can't allocate cmod hash table");
+ }
+
+ static const struct luaL_Reg module_methods[] = {
+ { "load", lcmod_module_load },
+ { NULL, NULL },
+ };
+ luaL_register_module(L, "cmod", module_methods);
+ lua_pop(L, 1);
+
+ static const struct luaL_Reg module_handle_methods[] = {
+ { "load", lcmod_func_load },
+ { "reload", lcmod_module_reload },
+ { "unload", lcmod_module_unload },
+ { "__index", lcmod_module_handle_index },
+ { "__serialize", lcmod_module_handle_serialize },
+ { "__gc", lcmod_module_handle_gc },
+ { NULL, NULL },
+ };
+ luaL_register_type(L, cmod_module_handle_uname, module_handle_methods);
+
+ static const struct luaL_Reg func_handle_methods[] = {
+ { "unload", lcmod_func_unload },
+ { "__index", lcmod_func_handle_index },
+ { "__serialize", lcmod_func_handle_serialize },
+ { "__call", lcmod_func_handle_call },
+ { "__gc", lcmod_func_handle_gc },
+ { NULL, NULL },
+ };
+ luaL_register_type(L, cmod_func_handle_uname, func_handle_methods);
+}
diff --git a/src/box/lua/cmod.h b/src/box/lua/cmod.h
new file mode 100644
index 000000000..f0ea2d34d
--- /dev/null
+++ b/src/box/lua/cmod.h
@@ -0,0 +1,24 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#pragma once
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+struct lua_State;
+
+/**
+ * Initialize cmod Lua module.
+ *
+ * @param L Lua state where to register the cmod module.
+ */
+void
+box_lua_cmod_init(struct lua_State *L);
+#if defined(__cplusplus)
+}
+#endif /* defined(__plusplus) */
diff --git a/src/box/lua/init.c b/src/box/lua/init.c
index fbcdfb20b..bad2b7ca9 100644
--- a/src/box/lua/init.c
+++ b/src/box/lua/init.c
@@ -60,6 +60,7 @@
#include "box/lua/cfg.h"
#include "box/lua/xlog.h"
#include "box/lua/console.h"
+#include "box/lua/cmod.h"
#include "box/lua/tuple.h"
#include "box/lua/execute.h"
#include "box/lua/key_def.h"
@@ -465,6 +466,7 @@ box_lua_init(struct lua_State *L)
box_lua_tuple_init(L);
box_lua_call_init(L);
box_lua_cfg_init(L);
+ box_lua_cmod_init(L);
box_lua_slab_init(L);
box_lua_index_init(L);
box_lua_space_init(L);
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-25 16:50 ` Cyrill Gorcunov via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:28 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
See 21 comments below.
On 18.01.2021 21:35, Cyrill Gorcunov wrote:
> Currently to run "C" function from some external module
> one have to register it first in "_func" system space. This
> is a problem if node is in read-only mode (replica).
>
> Still people would like to have a way to run such functions
> even in ro mode. For this sake we implement "cmod" lua module.
>
> Fixes #4642
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
>
> @TarantoolBot document
> Title: cmod module
>
> Overview
> ========
>
> `cmod` module provides a way to create, delete and execute
> `C` procedures. Unlike `box.schema.func` methods the functions
> created with `cmod` help are not persistent and live purely
> in memory. Once a node get turned off they are vanished.
> An initial purpose for them is to execute them on nodes
> which are running in read-only mode.
>
> Module functions
> ================
>
> `require('cmod').load(path) -> obj | nil, err`
1. I don't like it, but I must say: according to one another
change in Lua guidelines now we must again throw exceptions
from all new functions present on the board instead of returning
errors. You can try to raise it in the chat if you want.
> ----------------------------------------------
>
> Loads a module from `path` and return an object instance
> associate with the module. The `path` should not end up
> with shared object extension (such as `.so`), only a file
> name shall be there.
>
> Possible errors:
>
> - IllegalParams: module path is either not supplied
> or not a string.
> - SystemError: unable to open a module due to a system error.
> - ClientError: a module does not exist.
> - OutOfMemory: unable to allocate a module.
>
> `module:unload() -> true | nil, err`
> ------------------------------------
>
> Unloads a module. Once the module is unloaded one can't load
> new functions from this module instance.
>
> Possible errors:
>
> - IllegalParams: a module is not supplied.
> - IllegalParams: a module is already unloaded.
>
> Example:
>
> ``` Lua
> m, err = require('cmod').load('path')
> if not m then
> print(err)
> end
> ok, err = m:unload()
> if not ok then
> print(err)
> end
> ```
>
> If there are functions from this module referenced somewhere
> in other Lua code they still can be called because module
> continue sitting in memory until the last reference is closed.
>
> If the module become a target to the Lua's garbage collector
> then unload is called implicitly.
>
> `module:reload() -> true | nil, err`
> ------------------------------------
>
> Reloads a module and all functions which were associated with
> the module. Each module keeps a list of functions belonging to
> the module and reload procedure cause the bound function to update
> their addresses such that function execution will be routed via
> a new library.
>
> Modules are loaded with that named local binding which means
> that reload of module symbols won't affect the functions which
> are started execution already, only new calls will be rerouted.
>
> Possible errors:
> - IllegalParams: a module is not supplied.
> - ClientError: a module does not exist.
2. What happens to the existing function objects on reload()?
How is reload different from unload() + load()?
> On success `true` is returned, otherwise `nil,error` pair.
>
> Example:
>
> ``` Lua
> m, err = require('cmod').load('path')
> if not m then
> print(err)
> end
> ok, err = m:reload()
> if not ok then
> print(err)
> end
> ```
>
> module:load(name) -> obj | nil, err`
> ------------------------------------
>
> Loads a new function with name `name` from the `module` object
> and return a callable object instance associate with the function.
>
> Possible errors:
> - IllegalParams: function name is either not supplied
> or not a string.
> - OutOfMemory: unable to allocate a function.
3. What if the function is not found?
> Example:
>
> ``` Lua
> m, err = require('cmod').load('path')
> if not m then
> print(err)
> end
> f, err = m:load('foo')
> if not ok then
> print(err)
> end
> ```
>
> `function:unload() -> true | nil, err`
> --------------------------------------
>
> Unloads a function.
>
> Possible errors:
> - IllegalParams: function name is either not supplied
> or not a string.
> - IllegalParams: the function does not exist.
4. Is this unloaded automatically, when the function object is
GCed?
> Example:
>
> ``` Lua
> m, err = require('cmod').load('path')
> if not m then
> print(err)
> end
> f, err = m:load('foo')
> if not ok then
> print(err)
> end
> ok, err = f:unload()
> if not ok then
> print(err)
> end
> ```
>
> Executing a loaded function
> ===========================
>
> Once function is loaded it can be executed by ordinary Lua call.
> Lets consider the following example. We have a `C` function which
> takes two numbers and returns their sum.
>
> ``` C
> int
> cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
> {
> uint32_t arg_count = mp_decode_array(&args);
> if (arg_count != 2) {
> return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
> "invalid argument count");
> }
> uint64_t a = mp_decode_uint(&args);
> uint64_t b = mp_decode_uint(&args);
>
> char res[16];
> char *end = mp_encode_uint(res, a + b);
> box_return_mp(ctx, res, end);
> return 0;
> }
> ```
>
> The name of the function is `cfunc_sum` and the function is built into
> `cfunc.so` shared library.
>
> First we should load it as
>
> ``` Lua
> m, err = require('cmod').load('cfunc')
> if not m then
> print(err)
> end
> cfunc_sum, err = m:load('cfunc_sum')
> if not cfunc_sum then
> print(err)
> end
> ```
>
> Once successfully loaded we can execute it. Note that unlike regular
> Lua functions the context of `C` functions is different. They never
> thrown an exception but return `true|nil, res` form where first value
> set to `nil` in case of error condition and `res` carries an error
> description.
>
> Lets call the `cfunc_sum` with wrong number of arguments
>
> ``` Lua
> local ok, res = cfunc_sum()
> if not ok then
> print(res)
> end
> ```
>
> We will the `"invalid argument count"` message in output.
> The error message has been set by the `box_error_set` in `C`
> code above.
>
> On success the first returned value set to `true` and `res` represent
> function execution result.
>
> ``` Lua
> local ok, res = cfunc_sum(1, 2)
> assert(ok);
> print(res)
> ```
>
> We will see the number `3` in output.
5. What happens when I do multireturn?
> diff --git a/src/box/lua/cmod.c b/src/box/lua/cmod.c
> new file mode 100644
> index 000000000..c3a44e3df
> --- /dev/null
> +++ b/src/box/lua/cmod.c
> @@ -0,0 +1,718 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
> + */
> +
> +#include <string.h>
> +#include <lua.h>
> +
> +#include "assoc.h"
> +#include "diag.h"
> +
> +#include "box/module_cache.h"
> +#include "box/error.h"
> +#include "box/port.h"
> +#include "tt_static.h"
> +
> +#include "trivia/util.h"
> +#include "lua/utils.h"
> +
> +/**
> + * Function descriptor.
> + */
> +struct cmod_func {
> + /**
> + * Symbol descriptor for the function in
> + * an associated module.
> + */
> + struct module_sym mod_sym;
> + /**
> + * Number of loads of the function.
> + */
> + int64_t load_count;
> + /**
> + * The function name (without a package path).
> + */
> + const char *func_name;
6. This member is never used. I removed it and nothing
changed.
====================
@@ -31,10 +31,6 @@ struct cmod_func {
* Number of loads of the function.
*/
int64_t load_count;
- /**
- * The function name (without a package path).
- */
- const char *func_name;
/**
* Length of @a name member.
*/
@@ -231,12 +227,11 @@ cmod_func_free(struct cmod_func *cf)
*
* @param name package path and a function name, ie "module.foo"
* @param len length of @a name.
- * @param func_name_len function name length, ie "3" for "module.foo"
*
* @returns function instance on success, NULL otherwise setting diag area.
*/
static struct cmod_func *
-cmod_func_new(const char *name, size_t len, size_t func_name_len)
+cmod_func_new(const char *name, size_t len)
{
const ssize_t cf_size = sizeof(struct cmod_func);
size_t size = cf_size + len + 1;
@@ -250,7 +245,6 @@ cmod_func_new(const char *name, size_t len, size_t func_name_len)
cf->mod_sym.module = NULL;
cf->load_count = 0;
cf->mod_sym.name = cf->name;
- cf->func_name = &cf->name[len - func_name_len];
cf->len = len;
memcpy(cf->name, name, len);
@@ -309,7 +303,7 @@ lcmod_func_load(struct lua_State *L)
cf = cmod_func_find(name, len);
if (cf == NULL) {
- cf = cmod_func_new(name, len, strlen(func_name));
+ cf = cmod_func_new(name, len);
if (cf == NULL)
return luaT_push_nil_and_error(L);
====================
> + /**
> + * Length of @a name member.
> + */
> + size_t len;
> + /**
> + * Module path with function name separated
> + * by a point, like "module.func".
> + *
> + * See parse_func_name helper, we simply have
> + * to keep this strange architecture for backward
> + * compatibility sake (ie for functions which
> + * are created via box.schema.func).
> + */
> + char name[0];
> +};
> +
> +/** Get a handle associated with an object. */
> +static void *
> +cmod_get_handle(struct lua_State *L, const char *uname)
> +{
> + void **pptr = luaL_testudata(L, 1, uname);
> + return pptr != NULL ? *pptr : NULL;
> +}
> +
> +/** Set a handle to a new value. */
> +static void
> +cmod_set_handle(struct lua_State *L, const char *uname, void *ptr)
> +{
> + void **pptr = luaL_testudata(L, 1, uname);
> + if (pptr != NULL)
> + *pptr = ptr;
> +}
> +
> +/** Setup a new handle and associate it with an object. */
> +static void
> +cmod_setup_handle(struct lua_State *L, const char *uname, void *ptr)
> +{
> + *(void **)lua_newuserdata(L, sizeof(void *)) = ptr;
> + luaL_getmetatable(L, uname);
> + lua_setmetatable(L, -2);
> +}
> +
> +/** A type to find a module handle from an object. */
> +static const char *cmod_module_handle_uname = "cmod_module_handle";
> +
> +static struct module *
> +cmod_get_module_handle(struct lua_State *L, bool mandatory)
> +{
> + struct module *m = cmod_get_handle(L, cmod_module_handle_uname);
> + if (mandatory && m == NULL) {
> + const char *fmt = "The module is already unloaded";
> + diag_set(IllegalParams, fmt);
> + }
> + return m;
> +}
> +
> +static void
> +cmod_set_module_handle(struct lua_State *L, struct module *module)
> +{
> + cmod_set_handle(L, cmod_module_handle_uname, module);
> +}
> +
> +static void
> +cmod_setup_module_handle(struct lua_State *L, struct module *module)
> +{
> + cmod_setup_handle(L, cmod_module_handle_uname, module);
> +}
> +
> +/** A type to find a function handle from an object. */
> +static const char *cmod_func_handle_uname = "cmod_func_handle";
7. Please, move global declarations to the beginning of the file
so as it would be easier to find them, and to be consistent with
other files.
> +
> +static struct cmod_func *
> +cmod_get_func_handle(struct lua_State *L, bool mandatory)
8. According to the code style, flags must have 'is' prefix or
a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'.
But a better idea - just drop it. The single place where it is not
mandatory is GC handler.
Another option - introduce separate get() and check(). Get() only
returns it without checking for NULL. Check() will ensure it is not NULL
and set the error otherwise.
Also the function takes a Lua state, so it seems it must have lcmod
prefix, not cmod. According to the naming convention you established
in this file.
> +{
> + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname);
> + if (mandatory && cf == NULL) {
> + const char *fmt = "The function is already unloaded";
> + diag_set(IllegalParams, fmt);
> + }
> + return cf;
> +}
> +
> +static void
> +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf)
> +{
> + cmod_set_handle(L, cmod_func_handle_uname, cf);
> +}
> +
> +static void
> +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf)
> +{
> + cmod_setup_handle(L, cmod_func_handle_uname, cf);
> +}
9. Amount of boilerplate code and double checks of the same
conditions you need just to push some udata is terrifying, tbh.
For instance, take lcmod_func_handle_call(). It calls
cmod_get_func_handle() and checks result for NULL. This one
calls cmod_get_handle() and also checks result for NULL.
This one calls luaL_testudata() and also checks result for NULL.
You did 3 NULL checks which could be one, if you would have
a set of simpler functions working directly with luaL_testudata().
10. Why do you have 'handle' word in each function name? For
example, how 'handle_call' is any different from just 'call' in
scope of code in this file? Or how is 'cmod_setup_func' different
from 'cmod_setup_func_handle'? It seems that by 'handle' you
try not to clash names with the non-Lua part of cmod. But you
should do it via prefixes, not via suffixes. lcmod instead of cmod.
> +
> +/**
> + * Function name to cmod_func hash. The name includes
> + * module package path without file extension.
> + */
> +static struct mh_strnptr_t *func_hash = NULL;
> +
> +/**
> + * Find function in cmod_func hash.
> + *
> + * @param name function name.
> + * @param name_len function name length.
> + *
> + * @returns function descriptor if found, NULL otherwise.
> + */
> +struct cmod_func *
> +cmod_func_find(const char *name, size_t name_len)
11. The function must be static.
> +{
> + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
> + if (e == mh_end(func_hash))
> + return NULL;
> + return mh_strnptr_node(func_hash, e)->val;
> +}
> +
> +/**
> + * Delete a function instance from the hash or decrease
> + * a reference if the function is still loaded.
> + *
> + * @param cf function descriptor.
> + *
> + * @retval true if the function has no more loaded instances
> + * and removed from the hash.
> + *
> + * @retval false if there are loaded instances left and function
> + * is kept in the hash.
> + */
> +static bool
> +cmod_func_del(struct cmod_func *cf)
12. This whole cmod and lcmod API looks extremely inconsistent and complex.
Here are the functions you have (not counting the boilerplate code above):
// Find a function in the cache, ok.
cmod_func_find(name)
// It is called 'delete'. But it does not do the
// delete. Moreover, it also decreases 'load_count',
// which is super surprising. What does it have to do
// with loads? You don't call module_sym_unload() here,
// but decrease load_count.
cmod_func_del(f)
// Ok, this looks like it should add the function to the
// hash. But wait, it also increases load_count, and does
// not call module_sym_load()! WTF?
cmod_func_add(f)
// It should free the memory, right? But it calls
// module_sym_unload(). Why!? Also, we have a code style
// agreement, that function freeing the memory should
// have suffix 'delete', not 'free'. As soon as you have
// 'new'.
cmod_func_free(f)
// Fine, it should allocate a new function object, right?
// But it also calls module_sym_load()! And does not increase
// load_count! I don't understand. It seems like the names
// of these functions has very little to do with what they are
// really doing. The only way to make them work is to call them
// in some special sequence, in which a next function fixes
// leftovers of previous functions.
cmod_func_new(name)
This must be seriously redesigned. Starting from the function
names.
Here is an option:
// Find the function in the hash. If found, increase load_count
// and return. Otherwise allocate a new function object, fill its
// members, do the load, and make load_count = 1. Add to the hash.
cmod_func_load(name)
// Decrease load_count. If became 0, remove from the hash,
// unload mod_sym, and free the memory.
cmod_func_unload(f)
I don't really see why would you need the other functions. These 2
are going to be extra simple anyway. Module_cache API for module
objects is literally that simple - load, unload, reload. Why do
you need so many for cmod, especially with so unclear behaviour?
> +{
> + assert(cf->load_count > 0);
> + if (cf->load_count-- != 1)
> + return false;
> +
> + mh_int_t e = mh_strnptr_find_inp(func_hash, cf->name, cf->len);
> + assert(e != mh_end(func_hash));
> + mh_strnptr_del(func_hash, e, NULL);
> + return true;
> +}
> +
> +/**
> + * Add a function instance into the hash or increase
> + * a reference if the function is already exist.
> + *
> + * @param cf Function descriptor.
> + *
> + * Possible errors:
> + *
> + * - OutOfMemory: unable to allocate a hash entry.
> + *
> + * @retval 0 on success.
> + * @retval -1 on error, diag is set.
> + */
> +static int
> +cmod_func_add(struct cmod_func *cf)
> +{
> + assert(cf->load_count >= 0);
> + if (cf->load_count++ != 0)
13. I already proposed an option how to rework cmod API above, but
still will leave some comments alongside.
Why the heck 'add' function increases load_count?
> + return 0;
> +
> + const struct mh_strnptr_node_t nd = {
> + .str = cf->name,
> + .len = cf->len,
> + .hash = mh_strn_hash(cf->name, cf->len),
> + .val = cf,
> + };
> +
> + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
> + if (e == mh_end(func_hash)) {
> + diag_set(OutOfMemory, sizeof(nd),
> + "malloc", "cmod_func node");
> + return -1;
> + }
> + return 0;
14. Why so complex? Just add it to the hash when it is created. This
function is used in a single place, where you already do the check
if it exists.
> +}
> +
> +/**
> + * Unload a symbol and free a function instance.
> + *
> + * @param cf function descriptor.
> + */
> +static void
> +cmod_func_free(struct cmod_func *cf)
> +{
> + module_sym_unload(&cf->mod_sym);
15. Worth adding load_count == 0 assertion.
Or rather do the unload where it belongs - where
load_count becomes 0.
> + TRASH(cf);
> + free(cf);
> +}
> +
> +/**
> + * Allocate a new function instance and resolve a symbol address.
> + *
> + * @param name package path and a function name, ie "module.foo"
> + * @param len length of @a name.
> + * @param func_name_len function name length, ie "3" for "module.foo"
> + *
> + * @returns function instance on success, NULL otherwise setting diag area.
> + */
> +static struct cmod_func *
> +cmod_func_new(const char *name, size_t len, size_t func_name_len)
> +{
> + const ssize_t cf_size = sizeof(struct cmod_func);
> + size_t size = cf_size + len + 1;
> + struct cmod_func *cf = malloc(size);
> + if (cf == NULL) {
> + diag_set(OutOfMemory, size, "malloc", "cf");
> + return NULL;
> + }
> +
> + cf->mod_sym.addr = NULL;
> + cf->mod_sym.module = NULL;
> + cf->load_count = 0;
> + cf->mod_sym.name = cf->name;
> + cf->func_name = &cf->name[len - func_name_len];
> + cf->len = len;
> +
> + memcpy(cf->name, name, len);
> + cf->name[len] = '\0';
> +
> + if (module_sym_load(&cf->mod_sym) != 0) {
16. You load, and don't increase load_count. Does not it look
contradictory to you?
> + cmod_func_free(cf);
> + return NULL;
> + }
> +
> + return cf;
> +}
> +
> +/**
> + * Load a new function.
> + *
> + * This function takes a function name from the caller
> + * stack @a L and creates a new function object. If
> + * the function is already loaded we simply return
> + * a reference to existing one.
> + *
> + * Possible errors:
> + *
> + * - IllegalParams: function name is either not supplied
> + * or not a string.
> + * - IllegalParams: function references limit exceeded.
> + * - OutOfMemory: unable to allocate a function.
> + *
> + * @returns function object on success or {nil,error} on error,
> + * the error is set to the diagnostics area.
> + */
> +static int
> +lcmod_func_load(struct lua_State *L)
> +{
> + const char *method = "function = module:load";
> + struct cmod_func *cf = NULL;
> +
> + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
> + const char *fmt =
> + "Expects %s(\'name\') but no name passed";
> + diag_set(IllegalParams, fmt, method);
17. Why couldn't you just inline 'fmt' string? The current version
is not any shorter nor easier to read. The same in many other places.
> + return luaT_push_nil_and_error(L);
> + }
> +
> + struct module *m = cmod_get_module_handle(L, false);
> + if (m == NULL) {
> + const char *fmt =
> + "Expects %s(\'name\') but not module object passed";
> + diag_set(IllegalParams, fmt, method);
> + return luaT_push_nil_and_error(L);
> + }
> +
> + const char *func_name = lua_tostring(L, 2);
> + const char *name = tt_sprintf("%s.%s", m->package, func_name);
> + size_t len = strlen(name);
> +
> + cf = cmod_func_find(name, len);
18. Or you could pass package name and function name to the constructor
and copy them right to the allocated memory, without double copying
via the static buffer.
> + if (cf == NULL) {
> + cf = cmod_func_new(name, len, strlen(func_name));
> + if (cf == NULL)
> + return luaT_push_nil_and_error(L);
> + }
> +
> + if (cmod_func_add(cf) != 0) {
19. From this code it looks like you can fail addition of an
already existing function and you just delete it. I know it
can't happen, but the code says the opposite, and it is
confusing.
Another reason why 'add' is a bad name - it seems like it
should fail when added second time due to being already
added.
> + cmod_func_free(cf);
> + return luaT_push_nil_and_error(L);
> + }
> +
> + cmod_setup_func_handle(L, cf);
> + return 1;
> +}
> +
> +/**
> + * Unload a function.
> + *
> + * This function takes a function object from
> + * the caller stack @a L and unloads it.
> + *
> + * Possible errors:
> + *
> + * - IllegalParams: function is not supplied.
> + * - IllegalParams: the function does not exist.
> + *
> + * @returns true on success or {nil,error} on error,
> + * the error is set to the diagnostics area.
> + */
> +static int
> +lcmod_func_unload(struct lua_State *L)
> +{
> + if (lua_gettop(L) != 1) {
> + const char *fmt = "Expects function:unload()";
> + diag_set(IllegalParams, fmt);
> + return luaT_push_nil_and_error(L);
> + }
> +
> + struct cmod_func *cf = cmod_get_func_handle(L, true);
> + if (cf == NULL)
> + return luaT_push_nil_and_error(L);
> +
> + cmod_set_func_handle(L, NULL);
> + if (cmod_func_del(cf))
> + cmod_func_free(cf);
> +
> + lua_pushboolean(L, true);
> + return 1;
> +}
> +
> +/**
> + * Load a new module.
> + *
> + * This function takes a module patch from the caller
> + * stack @a L and creates a new module object.
> + *
> + * Possible errors:
> + *
> + * - IllegalParams: module path is either not supplied
> + * or not a string.
> + * - SystemError: unable to open a module due to a system error.
> + * - ClientError: a module does not exist.
> + * - OutOfMemory: unable to allocate a module.
> + *
> + * @returns module object on success or {nil,error} on error,
> + * the error is set to the diagnostics area.
> + */
> +static int
> +lcmod_module_load(struct lua_State *L)
> +{
> + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
> + const char *fmt =
> + "Expects cmod.load(\'name\') but no name passed";
> + diag_set(IllegalParams, fmt);
> + return luaT_push_nil_and_error(L);
> + }
> +
> + size_t name_len;
> + const char *name = lua_tolstring(L, 1, &name_len);
> +
> + struct module *module = module_load(name, name_len);
> + if (module == NULL)
> + return luaT_push_nil_and_error(L);
> +
> + cmod_setup_module_handle(L, module);
> + /*
> + * Make sure the module won't disappear until
> + * it is GC'ed or unloaded explicitly.
> + */
> + module->ref++;
20. The fact that you need to touch reference counter of an object
from a different subsystem should make you start to question
sanity of the API. Why module_load() does not do the increase,
and module_unload() - decrease?
> + return 1;
> +}> +
> +/**
> + * Module handle representation for REPL (console).
> + */
> +static int
> +lcmod_module_handle_serialize(struct lua_State *L)
> +{
> + struct module *m = cmod_get_module_handle(L, true);
> + if (m == NULL) {
> + lua_pushnil(L);
> + return 1;
> + }
> +
> + lua_createtable(L, 0, 0);
21. Since you decided to use createtable instead of newtable,
I suggest you to fill the proper parameters, which should
be (0, 1) in this case I think.
> + lua_pushstring(L, m->package);
> + lua_setfield(L, -2, "path");
> +
> + return 1;
> +}
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module
2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-01-25 16:50 ` Cyrill Gorcunov via Tarantool-patches
2021-01-30 18:53 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-25 16:50 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Sun, Jan 24, 2021 at 05:28:27PM +0100, Vladislav Shpilevoy wrote:
> >
> > Module functions
> > ================
> >
> > `require('cmod').load(path) -> obj | nil, err`
>
> 1. I don't like it, but I must say: according to one another
> change in Lua guidelines now we must again throw exceptions
> from all new functions present on the board instead of returning
> errors. You can try to raise it in the chat if you want.
Sigh... Thanks for pointing, will switch to a new approach.
> >
> > Modules are loaded with that named local binding which means
> > that reload of module symbols won't affect the functions which
> > are started execution already, only new calls will be rerouted.
> >
> > Possible errors:
> > - IllegalParams: a module is not supplied.
> > - ClientError: a module does not exist.
>
> 2. What happens to the existing function objects on reload()?
> Modules are loaded with that named local binding which means
> that reload of module symbols won't affect the functions which
> are started execution already, only new calls will be rerouted.
I suspect the paragraph above was not clear. What if I rephrase it as "Modules
are loaded with that named local binding which means that reload of the module
updates all functions tied up with, the only place where a kind of race
possible is function execution: a function may start execution and call
reschedule inside, thus this function will not be unloaded until its
execution complete. Next call to the same function will be routed via
new symbol".
Does it sound better?
> How is reload different from unload() + load()?
This is rather a shortcut to unload+load where previously allocated
functions on Lua level are simply updates their internal state.
> > module:load(name) -> obj | nil, err`
> > ------------------------------------
> >
> > Loads a new function with name `name` from the `module` object
> > and return a callable object instance associate with the function.
> >
> > Possible errors:
> > - IllegalParams: function name is either not supplied
> > or not a string.
> > - OutOfMemory: unable to allocate a function.
>
> 3. What if the function is not found?
ClientError. I'll add, thanks!
> >
> > `function:unload() -> true | nil, err`
> > --------------------------------------
> >
> > Unloads a function.
> >
> > Possible errors:
> > - IllegalParams: function name is either not supplied
> > or not a string.
> > - IllegalParams: the function does not exist.
>
> 4. Is this unloaded automatically, when the function object is
> GCed?
True. When __gc called on variable we unload function automatically.
Probably I should mention this in documentation.
> > On success the first returned value set to `true` and `res` represent
> > function execution result.
> >
> > ``` Lua
> > local ok, res = cfunc_sum(1, 2)
> > assert(ok);
> > print(res)
> > ```
> >
> > We will see the number `3` in output.
>
> 5. What happens when I do multireturn?
Not sure I follow you here. But if you mean tradtional
multireturn then we have it in test case (the next patch)
| cfunc_multireturn()
| | ---
| | - true
| | - [1]
| | - [1]
The function should prepare the result by self. The calling
convention is the same as to old known `box.func`. Or
you mean something else?
>
> 6. This member is never used. I removed it and nothing
> changed.
>
> ====================
> @@ -31,10 +31,6 @@ struct cmod_func {
> * Number of loads of the function.
> */
> int64_t load_count;
> - /**
> - * The function name (without a package path).
> - */
> - const char *func_name;
> /**
Initially I thought to print short function name in obj:__index,
but since the function hash is path+name I switched to the long
function name (but continue keeping short "func_name" if we deside
to print a short name as well). Probably indeed better to drop it
and don't confuse code readers.
> > +
> > +/** A type to find a function handle from an object. */
> > +static const char *cmod_func_handle_uname = "cmod_func_handle";
>
> 7. Please, move global declarations to the beginning of the file
> so as it would be easier to find them, and to be consistent with
> other files.
OK. I put them first at the top but then desided to group by function
context (ie close to the functions which use them).
>
> > +
> > +static struct cmod_func *
> > +cmod_get_func_handle(struct lua_State *L, bool mandatory)
>
> 8. According to the code style, flags must have 'is' prefix or
> a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'.
Could you please point me where this rule sits? I see both usage (with
and without prefix) in our code. Surely I update it (or probably drop
as you suggest below).
> But a better idea - just drop it. The single place where it is not
> mandatory is GC handler.
>
> Another option - introduce separate get() and check(). Get() only
> returns it without checking for NULL. Check() will ensure it is not NULL
> and set the error otherwise.
>
> Also the function takes a Lua state, so it seems it must have lcmod
> prefix, not cmod. According to the naming convention you established
> in this file.
Yeah, let me try this way.
> > +{
> > + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname);
> > + if (mandatory && cf == NULL) {
> > + const char *fmt = "The function is already unloaded";
> > + diag_set(IllegalParams, fmt);
> > + }
> > + return cf;
> > +}
> > +
> > +static void
> > +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf)
> > +{
> > + cmod_set_handle(L, cmod_func_handle_uname, cf);
> > +}
> > +
> > +static void
> > +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf)
> > +{
> > + cmod_setup_handle(L, cmod_func_handle_uname, cf);
> > +}
>
> 9. Amount of boilerplate code and double checks of the same
> conditions you need just to push some udata is terrifying, tbh.
>
> For instance, take lcmod_func_handle_call(). It calls
> cmod_get_func_handle() and checks result for NULL. This one
> calls cmod_get_handle() and also checks result for NULL.
> This one calls luaL_testudata() and also checks result for NULL.
> You did 3 NULL checks which could be one, if you would have
> a set of simpler functions working directly with luaL_testudata().
No, you're wrong. The minimum number of check is 2 (1 - fetch the
double pointer to userdata, 2 - dereference first level). Instead
I use two helpers where one sets diag in case of error which allows
me to not duplicate code. This is a way better. Until you meant
something different.
> 10. Why do you have 'handle' word in each function name? For
> example, how 'handle_call' is any different from just 'call' in
> scope of code in this file? Or how is 'cmod_setup_func' different
> from 'cmod_setup_func_handle'? It seems that by 'handle' you
> try not to clash names with the non-Lua part of cmod. But you
> should do it via prefixes, not via suffixes. lcmod instead of cmod.
The base idea of handle is that "handle" is sitting in udata of Lua
variable and the functions are fetching it from there. User may unload
a function then its handle set to NULL but variable is still alive
and subsequent call should print an error.
If you prefer the prefix then sure, gimme some time to try out.
> > +struct cmod_func *
> > +cmod_func_find(const char *name, size_t name_len)
>
> 11. The function must be static.
Yup. Thanks!
> > +{
> > + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
> > + if (e == mh_end(func_hash))
> > + return NULL;
> > + return mh_strnptr_node(func_hash, e)->val;
> > +}
> > +
> > +/**
> > + * Delete a function instance from the hash or decrease
> > + * a reference if the function is still loaded.
> > + *
> > + * @param cf function descriptor.
> > + *
> > + * @retval true if the function has no more loaded instances
> > + * and removed from the hash.
> > + *
> > + * @retval false if there are loaded instances left and function
> > + * is kept in the hash.
> > + */
> > +static bool
> > +cmod_func_del(struct cmod_func *cf)
>
> 12. This whole cmod and lcmod API looks extremely inconsistent and complex.
> Here are the functions you have (not counting the boilerplate code above):
>
> // Find a function in the cache, ok.
> cmod_func_find(name)
>
> // It is called 'delete'. But it does not do the
> // delete. Moreover, it also decreases 'load_count',
> // which is super surprising. What does it have to do
> // with loads? You don't call module_sym_unload() here,
> // but decrease load_count.
> cmod_func_del(f)
The same function may be loaded twise into different
Lua variables, thus we count number of loads. You proposed
it. I think we can drop this countint and simply use module
`refs` field instead as you proposed in another email.
cmod_func_add/del operates with function hash they don't manipulate
symbols. The symbols are handled via module_cache. This is because
we have two interfaces for functions: box.func and cmod.
In turn cmod_func_new/free calls module_sym_x helpers to run
a real work with symbols.
>
> // Ok, this looks like it should add the function to the
> // hash. But wait, it also increases load_count, and does
> // not call module_sym_load()! WTF?
> cmod_func_add(f)
Tracks number of same function loads.
>
> // It should free the memory, right? But it calls
> // module_sym_unload(). Why!? Also, we have a code style
> // agreement, that function freeing the memory should
> // have suffix 'delete', not 'free'. As soon as you have
> // 'new'.
> cmod_func_free(f)
OK for naming. As to in general -- new function allocation
goes into module_cache and resolve the function symbol. The
symbol resides in memory until last unload/__gc and then
function gets deleted and module_cache unloads the symbol.
>
> // Fine, it should allocate a new function object, right?
> // But it also calls module_sym_load()! And does not increase
> // load_count! I don't understand. It seems like the names
> // of these functions has very little to do with what they are
> // really doing. The only way to make them work is to call them
> // in some special sequence, in which a next function fixes
"some special sequence" is all our code I fear, the functions
flow is context dependent grammar not context free.
> // leftovers of previous functions.
> cmod_func_new(name)
>
> This must be seriously redesigned. Starting from the function
> names.
>
> Here is an option:
>
> // Find the function in the hash. If found, increase load_count
> // and return. Otherwise allocate a new function object, fill its
> // members, do the load, and make load_count = 1. Add to the hash.
> cmod_func_load(name)
>
> // Decrease load_count. If became 0, remove from the hash,
> // unload mod_sym, and free the memory.
> cmod_func_unload(f)
>
> I don't really see why would you need the other functions. These 2
> are going to be extra simple anyway. Module_cache API for module
> objects is literally that simple - load, unload, reload. Why do
> you need so many for cmod, especially with so unclear behaviour?
Because otherwise we will have a long function where we will have
error path to handle. Let me check if I manged to make it readable.
Note thouh that the number of actions are not goiing to disappear
it will simply sit in one big function instead.
> > +static int
> > +cmod_func_add(struct cmod_func *cf)
> > +{
> > + assert(cf->load_count >= 0);
> > + if (cf->load_count++ != 0)
>
> 13. I already proposed an option how to rework cmod API above, but
> still will leave some comments alongside.
Thanks!
>
> Why the heck 'add' function increases load_count?
Because it doesn't need to add same function into hash.
> > + return 0;
> > +
> > + const struct mh_strnptr_node_t nd = {
> > + .str = cf->name,
> > + .len = cf->len,
> > + .hash = mh_strn_hash(cf->name, cf->len),
> > + .val = cf,
> > + };
> > +
> > + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
> > + if (e == mh_end(func_hash)) {
> > + diag_set(OutOfMemory, sizeof(nd),
> > + "malloc", "cmod_func node");
> > + return -1;
> > + }
> > + return 0;
>
> 14. Why so complex? Just add it to the hash when it is created. This
> function is used in a single place, where you already do the check
> if it exists.
I would like to separate hash helpers from other code. But sure.
> > +}
> > +
> > +/**
> > + * Unload a symbol and free a function instance.
> > + *
> > + * @param cf function descriptor.
> > + */
> > +static void
> > +cmod_func_free(struct cmod_func *cf)
> > +{
> > + module_sym_unload(&cf->mod_sym);
>
> 15. Worth adding load_count == 0 assertion.
> Or rather do the unload where it belongs - where
> load_count becomes 0.
OK
>
> > + TRASH(cf);
> > + free(cf);
> > +}
> > +
> > +/**
> > + * Allocate a new function instance and resolve a symbol address.
> > + *
> > + * @param name package path and a function name, ie "module.foo"
> > + * @param len length of @a name.
> > + * @param func_name_len function name length, ie "3" for "module.foo"
> > + *
> > + * @returns function instance on success, NULL otherwise setting diag area.
> > + */
> > +static struct cmod_func *
> > +cmod_func_new(const char *name, size_t len, size_t func_name_len)
> > +{
> > + const ssize_t cf_size = sizeof(struct cmod_func);
> > + size_t size = cf_size + len + 1;
> > + struct cmod_func *cf = malloc(size);
> > + if (cf == NULL) {
> > + diag_set(OutOfMemory, size, "malloc", "cf");
> > + return NULL;
> > + }
> > +
> > + cf->mod_sym.addr = NULL;
> > + cf->mod_sym.module = NULL;
> > + cf->load_count = 0;
> > + cf->mod_sym.name = cf->name;
> > + cf->func_name = &cf->name[len - func_name_len];
> > + cf->len = len;
> > +
> > + memcpy(cf->name, name, len);
> > + cf->name[len] = '\0';
> > +
> > + if (module_sym_load(&cf->mod_sym) != 0) {
>
> 16. You load, and don't increase load_count. Does not it look
> contradictory to you?
Well, there are two sides: initial values where address comes from
module subsystem and the state when function is ready to be used.
The function is "ready" when it is not only initilized and has
symbol resolved but when it sits inside function cache, and because
of this we increas load_count only in cmod_func_add. That is why
load_count increased not at moment of creation.
> > +static int
> > +lcmod_func_load(struct lua_State *L)
> > +{
> > + const char *method = "function = module:load";
> > + struct cmod_func *cf = NULL;
> > +
> > + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
> > + const char *fmt =
> > + "Expects %s(\'name\') but no name passed";
> > + diag_set(IllegalParams, fmt, method);
>
> 17. Why couldn't you just inline 'fmt' string? The current version
> is not any shorter nor easier to read. The same in many other places.
To not split the string - it will overflow 80 symbols iirc
> > + return luaT_push_nil_and_error(L);
> > + }
> > +
> > + struct module *m = cmod_get_module_handle(L, false);
> > + if (m == NULL) {
> > + const char *fmt =
> > + "Expects %s(\'name\') but not module object passed";
> > + diag_set(IllegalParams, fmt, method);
> > + return luaT_push_nil_and_error(L);
> > + }
> > +
> > + const char *func_name = lua_tostring(L, 2);
> > + const char *name = tt_sprintf("%s.%s", m->package, func_name);
> > + size_t len = strlen(name);
> > +
> > + cf = cmod_func_find(name, len);
>
> 18. Or you could pass package name and function name to the constructor
> and copy them right to the allocated memory, without double copying
> via the static buffer.
Will take a look, though the hash requires the long name which means
we will need to count number of symbols in merged name. Will try.
> > + if (cf == NULL) {
> > + cf = cmod_func_new(name, len, strlen(func_name));
> > + if (cf == NULL)
> > + return luaT_push_nil_and_error(L);
> > + }
> > +
> > + if (cmod_func_add(cf) != 0) {
>
> 19. From this code it looks like you can fail addition of an
> already existing function and you just delete it. I know it
> can't happen, but the code says the opposite, and it is
> confusing.
>
> Another reason why 'add' is a bad name - it seems like it
> should fail when added second time due to being already
> added.
OK
> > +static int
> > +lcmod_module_load(struct lua_State *L)
> > +{
> > + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
> > + const char *fmt =
> > + "Expects cmod.load(\'name\') but no name passed";
> > + diag_set(IllegalParams, fmt);
> > + return luaT_push_nil_and_error(L);
> > + }
> > +
> > + size_t name_len;
> > + const char *name = lua_tolstring(L, 1, &name_len);
> > +
> > + struct module *module = module_load(name, name_len);
> > + if (module == NULL)
> > + return luaT_push_nil_and_error(L);
> > +
> > + cmod_setup_module_handle(L, module);
> > + /*
> > + * Make sure the module won't disappear until
> > + * it is GC'ed or unloaded explicitly.
> > + */
> > + module->ref++;
>
> 20. The fact that you need to touch reference counter of an object
> from a different subsystem should make you start to question
> sanity of the API. Why module_load() does not do the increase,
> and module_unload() - decrease?
Because there were no such API and I though this gonna be over
the top taking into account the size of the patch. Moreover there
is different senatics of how modules are working in box.fun and
our cmod module.
I agree that this is ugly, maybe it is a good time to improve
modules code.
> > +static int
> > +lcmod_module_handle_serialize(struct lua_State *L)
> > +{
> > + struct module *m = cmod_get_module_handle(L, true);
> > + if (m == NULL) {
> > + lua_pushnil(L);
> > + return 1;
> > + }
> > +
> > + lua_createtable(L, 0, 0);
>
> 21. Since you decided to use createtable instead of newtable,
> I suggest you to fill the proper parameters, which should
> be (0, 1) in this case I think.
OK.
Thanks a huge for review, Vlad!
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module
2021-01-25 16:50 ` Cyrill Gorcunov via Tarantool-patches
@ 2021-01-30 18:53 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-30 18:53 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
>>> On success the first returned value set to `true` and `res` represent
>>> function execution result.
>>>
>>> ``` Lua
>>> local ok, res = cfunc_sum(1, 2)
>>> assert(ok);
>>> print(res)
>>> ```
>>>
>>> We will see the number `3` in output.
>>
>> 5. What happens when I do multireturn?
>
> Not sure I follow you here. But if you mean tradtional
> multireturn then we have it in test case (the next patch)
>
> | cfunc_multireturn()
> | | ---
> | | - true
> | | - [1]
> | | - [1]
>
> The function should prepare the result by self. The calling
> convention is the same as to old known `box.func`. Or
> you mean something else?
I mean you should document it better. I know how multireturn
works in your patch because I reviewed it. Users won't look at
the source code or at the tests.
>>> +
>>> +static struct cmod_func *
>>> +cmod_get_func_handle(struct lua_State *L, bool mandatory)
>>
>> 8. According to the code style, flags must have 'is' prefix or
>> a similar one like 'has', 'does', etc. Here it should be 'is_mandatory'.
>
> Could you please point me where this rule sits?
I couldn't find it either. AFAIU, more detailed rules should come with
the next update of the C style code on the site. But I remember saying
it to you a few times.
> I see both usage (with and without prefix) in our code.
Without prefix is incorrect. It is either old code, or new badly reviewed
code.
>>> +{
>>> + struct cmod_func *cf = cmod_get_handle(L, cmod_func_handle_uname);
>>> + if (mandatory && cf == NULL) {
>>> + const char *fmt = "The function is already unloaded";
>>> + diag_set(IllegalParams, fmt);
>>> + }
>>> + return cf;
>>> +}
>>> +
>>> +static void
>>> +cmod_set_func_handle(struct lua_State *L, struct cmod_func *cf)
>>> +{
>>> + cmod_set_handle(L, cmod_func_handle_uname, cf);
>>> +}
>>> +
>>> +static void
>>> +cmod_setup_func_handle(struct lua_State *L, struct cmod_func *cf)
>>> +{
>>> + cmod_setup_handle(L, cmod_func_handle_uname, cf);
>>> +}
>>
>> 9. Amount of boilerplate code and double checks of the same
>> conditions you need just to push some udata is terrifying, tbh.
>>
>> For instance, take lcmod_func_handle_call(). It calls
>> cmod_get_func_handle() and checks result for NULL. This one
>> calls cmod_get_handle() and also checks result for NULL.
>> This one calls luaL_testudata() and also checks result for NULL.
>> You did 3 NULL checks which could be one, if you would have
>> a set of simpler functions working directly with luaL_testudata().
>
> No, you're wrong. The minimum number of check is 2 (1 - fetch the
> double pointer to userdata, 2 - dereference first level). Instead
> I use two helpers where one sets diag in case of error which allows
> me to not duplicate code. This is a way better.Until you meant
> something different.
I simply mean the code is hard to read. You have a ton of 2-line
functions with strange names which I need to lookup each time I am
looking at one of the upper level functions. Because from the
names I don't always understand what they are doing.
>> 10. Why do you have 'handle' word in each function name? For
>> example, how 'handle_call' is any different from just 'call' in
>> scope of code in this file? Or how is 'cmod_setup_func' different
>> from 'cmod_setup_func_handle'? It seems that by 'handle' you
>> try not to clash names with the non-Lua part of cmod. But you
>> should do it via prefixes, not via suffixes. lcmod instead of cmod.
>
> The base idea of handle is that "handle" is sitting in udata of Lua
> variable and the functions are fetching it from there.
In udata you literally store the pointer at the function or module.
You don't have 'func_handle' or something.
> User may unload
> a function then its handle set to NULL but variable is still alive
> and subsequent call should print an error.
>
> If you prefer the prefix then sure, gimme some time to try out.
I would prefer to drop most of these functions if possible, and get
certainly rid of the 'handle' suffix. I suspect you copied it from
popen code which does something similar - deep system code exposed to
Lua. But in popen they literally have 'popen_handle' structure. You
don't. So you don't need 'handle' in function names. It is long and
confusing.
>>> +{
>>> + mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
>>> + if (e == mh_end(func_hash))
>>> + return NULL;
>>> + return mh_strnptr_node(func_hash, e)->val;
>>> +}
>>> +
>>> +/**
>>> + * Delete a function instance from the hash or decrease
>>> + * a reference if the function is still loaded.
>>> + *
>>> + * @param cf function descriptor.
>>> + *
>>> + * @retval true if the function has no more loaded instances
>>> + * and removed from the hash.
>>> + *
>>> + * @retval false if there are loaded instances left and function
>>> + * is kept in the hash.
>>> + */
>>> +static bool
>>> +cmod_func_del(struct cmod_func *cf)
>>
>> 12. This whole cmod and lcmod API looks extremely inconsistent and complex.
>> Here are the functions you have (not counting the boilerplate code above):
>>
>> // Find a function in the cache, ok.
>> cmod_func_find(name)
>>
>> // It is called 'delete'. But it does not do the
>> // delete. Moreover, it also decreases 'load_count',
>> // which is super surprising. What does it have to do
>> // with loads? You don't call module_sym_unload() here,
>> // but decrease load_count.
>> cmod_func_del(f)
>
> The same function may be loaded twise into different
> Lua variables, thus we count number of loads. You proposed
> it.
I don't talk here about behaviour of the public API. I talk
about internal functions having seemingly random names not
related to what they are doing.
> I think we can drop this countint and simply use module
> `refs` field instead as you proposed in another email.
I didn't propose to remove the function refs. If you drop them,
how will you determine when a cmod_func object can be deleted?
> cmod_func_add/del operates with function hash they don't manipulate
> symbols. The symbols are handled via module_cache. This is because
> we have two interfaces for functions: box.func and cmod.
But they touch load_count. If you would remove load_count change
from them and rename to something like
func_hash_add()
func_hash_del()
It would look better probably. The functions really would become
about hash only.
>> // Ok, this looks like it should add the function to the
>> // hash. But wait, it also increases load_count, and does
>> // not call module_sym_load()! WTF?
>> cmod_func_add(f)
>
> Tracks number of same function loads.
Exactly, I know this. But the function name is 'add', and it
'tracks loads' - these are unrelated things!
>> // It should free the memory, right? But it calls
>> // module_sym_unload(). Why!? Also, we have a code style
>> // agreement, that function freeing the memory should
>> // have suffix 'delete', not 'free'. As soon as you have
>> // 'new'.
>> cmod_func_free(f)
>
> OK for naming. As to in general -- new function allocation
> goes into module_cache and resolve the function symbol. The
> symbol resides in memory until last unload/__gc and then
> function gets deleted and module_cache unloads the symbol.
I know the behaviour. My questions are not to the big picture
of behaviour. My questions are about this behaviour being
split into seemingly random parts with strange names.
>> // Fine, it should allocate a new function object, right?
>> // But it also calls module_sym_load()! And does not increase
>> // load_count! I don't understand. It seems like the names
>> // of these functions has very little to do with what they are
>> // really doing. The only way to make them work is to call them
>> // in some special sequence, in which a next function fixes
>
> "some special sequence" is all our code I fear, the functions
> flow is context dependent grammar not context free.
You know what I mean.
>> // leftovers of previous functions.
>> cmod_func_new(name)
>>
>> This must be seriously redesigned. Starting from the function
>> names.
>>
>> Here is an option:
>>
>> // Find the function in the hash. If found, increase load_count
>> // and return. Otherwise allocate a new function object, fill its
>> // members, do the load, and make load_count = 1. Add to the hash.
>> cmod_func_load(name)
>>
>> // Decrease load_count. If became 0, remove from the hash,
>> // unload mod_sym, and free the memory.
>> cmod_func_unload(f)
>>
>> I don't really see why would you need the other functions. These 2
>> are going to be extra simple anyway. Module_cache API for module
>> objects is literally that simple - load, unload, reload. Why do
>> you need so many for cmod, especially with so unclear behaviour?
>
> Because otherwise we will have a long function where we will have
> error path to handle. Let me check if I manged to make it readable.
> Note thouh that the number of actions are not goiing to disappear
> it will simply sit in one big function instead.
I looked at module_cache when proposed the API. In module_cache the
public API is relatively good. It is simple and yet totally enough.
Because of this the module functions like lcmod_module_load() look
very simple:
1. Get arguments from Lua stack.
2. Forward them to some module_cache function as is.
3. Push result back to Lua stack.
At the same time for cmod_func functions like lcmod_func_load() you do
too much code which is not about Lua and seems to be internal for
cmod_func objects. But it should be simpler.
Probably it would help if you would think of it like you need a
pure C module. Like there is a cmod.h file, and you need to design API
for the header file to load/unload functions. You would just expose two
functions:
cmod_func_load()
cmod_func_unload()
And then it would be extremely easy to wrap this into Lua API. As simple
as it was with module_cache.
Alongside you would probably realize that some of the 'helper' functions
become unnecessary and could be inlined into these 2 functions making the
code ever smaller.
Talking of long functions - they will both fit into one screen I believe.
Even if you inline everything. So I don't see sense in splitting functions
just for the sake of splitting. Functions like 'add to hash', 'remove from
hash', 'allocate new cmod_func and initialize to default values everything'?
Sure, fine, these actions look independent.
But the "public" part of the API should be just 2 functions, you see? Load and
unload. And then you wrap them with a couple of short Lua functions. Plus
ton of Lua shit like __index, __gc, and all. The latter, btw, should just
call cmod_func_unload(). Then you would see that the hash deletion is needed
only in one place, not in two like now.
>>> +static int
>>> +cmod_func_add(struct cmod_func *cf)
>>> +{
>>> + assert(cf->load_count >= 0);
>>> + if (cf->load_count++ != 0)
>>
>> 13. I already proposed an option how to rework cmod API above, but
>> still will leave some comments alongside.
>
> Thanks!
Just tell me you hate these comments and me as well, it is fine. Don't
keep it inside, it is not healthy. You need a lot of mental health if
you work in Tarantool.
>> Why the heck 'add' function increases load_count?
>
> Because it doesn't need to add same function into hash.
Em. But ... . How are these actions related? Call it simply
'refs' then? If the counter is not related to loads. Or make
the 'load_count' counter incremented by a function at least
having 'load' in its name, right?
>>> + return 0;
>>> +
>>> + const struct mh_strnptr_node_t nd = {
>>> + .str = cf->name,
>>> + .len = cf->len,
>>> + .hash = mh_strn_hash(cf->name, cf->len),
>>> + .val = cf,
>>> + };
>>> +
>>> + mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
>>> + if (e == mh_end(func_hash)) {
>>> + diag_set(OutOfMemory, sizeof(nd),
>>> + "malloc", "cmod_func node");
>>> + return -1;
>>> + }
>>> + return 0;
>>
>> 14. Why so complex? Just add it to the hash when it is created. This
>> function is used in a single place, where you already do the check
>> if it exists.
>
> I would like to separate hash helpers from other code. But sure.
It is fine to separate them. For instance, look at module_cache_find,
module_cache_add, module_cache_del - they don't try to do anything
except work with the hash table. And this is atomic. Easy to understand.
Also they have 'module_cache' prefix, so I understand they work with the
cache. Not with the module objects. They accept a module pointer, but
only to save it into the hash. They don't change the modules.
>>> + TRASH(cf);
>>> + free(cf);
>>> +}
>>> +
>>> +/**
>>> + * Allocate a new function instance and resolve a symbol address.
>>> + *
>>> + * @param name package path and a function name, ie "module.foo"
>>> + * @param len length of @a name.
>>> + * @param func_name_len function name length, ie "3" for "module.foo"
>>> + *
>>> + * @returns function instance on success, NULL otherwise setting diag area.
>>> + */
>>> +static struct cmod_func *
>>> +cmod_func_new(const char *name, size_t len, size_t func_name_len)
>>> +{
>>> + const ssize_t cf_size = sizeof(struct cmod_func);
>>> + size_t size = cf_size + len + 1;
>>> + struct cmod_func *cf = malloc(size);
>>> + if (cf == NULL) {
>>> + diag_set(OutOfMemory, size, "malloc", "cf");
>>> + return NULL;
>>> + }
>>> +
>>> + cf->mod_sym.addr = NULL;
>>> + cf->mod_sym.module = NULL;
>>> + cf->load_count = 0;
>>> + cf->mod_sym.name = cf->name;
>>> + cf->func_name = &cf->name[len - func_name_len];
>>> + cf->len = len;
>>> +
>>> + memcpy(cf->name, name, len);
>>> + cf->name[len] = '\0';
>>> +
>>> + if (module_sym_load(&cf->mod_sym) != 0) {
>>
>> 16. You load, and don't increase load_count. Does not it look
>> contradictory to you?
>
> Well, there are two sides: initial values where address comes from
> module subsystem and the state when function is ready to be used.
> The function is "ready" when it is not only initilized and has
> symbol resolved but when it sits inside function cache, and because
> of this we increas load_count only in cmod_func_add. That is why
> load_count increased not at moment of creation.
I see your point. Maybe just load_count name confuses me. Probably
would help to rename it to 'refs' really. Then some things come
together, at least in my mind.
>>> +static int
>>> +lcmod_func_load(struct lua_State *L)
>>> +{
>>> + const char *method = "function = module:load";
>>> + struct cmod_func *cf = NULL;
>>> +
>>> + if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
>>> + const char *fmt =
>>> + "Expects %s(\'name\') but no name passed";
>>> + diag_set(IllegalParams, fmt, method);
>>
>> 17. Why couldn't you just inline 'fmt' string? The current version
>> is not any shorter nor easier to read. The same in many other places.
>
> To not split the string - it will overflow 80 symbols iirc
And why not split it? We do it in a lot of places. Don't tell about
grep. The string has %s and escapes quotes. If you see such a string
in a test, you won't know how the source looks anyway.
>>> + return luaT_push_nil_and_error(L);
>>> + }
>>> +
>>> + struct module *m = cmod_get_module_handle(L, false);
>>> + if (m == NULL) {
>>> + const char *fmt =
>>> + "Expects %s(\'name\') but not module object passed";
>>> + diag_set(IllegalParams, fmt, method);
>>> + return luaT_push_nil_and_error(L);
>>> + }
>>> +
>>> + const char *func_name = lua_tostring(L, 2);
>>> + const char *name = tt_sprintf("%s.%s", m->package, func_name);
>>> + size_t len = strlen(name);
>>> +
>>> + cf = cmod_func_find(name, len);
>>
>> 18. Or you could pass package name and function name to the constructor
>> and copy them right to the allocated memory, without double copying
>> via the static buffer.
>
> Will take a look, though the hash requires the long name which means
> we will need to count number of symbols in merged name. Will try.
Wouldn't strlen(left) + strlen(right) work?
>>> +static int
>>> +lcmod_module_load(struct lua_State *L)
>>> +{
>>> + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
>>> + const char *fmt =
>>> + "Expects cmod.load(\'name\') but no name passed";
>>> + diag_set(IllegalParams, fmt);
>>> + return luaT_push_nil_and_error(L);
>>> + }
>>> +
>>> + size_t name_len;
>>> + const char *name = lua_tolstring(L, 1, &name_len);
>>> +
>>> + struct module *module = module_load(name, name_len);
>>> + if (module == NULL)
>>> + return luaT_push_nil_and_error(L);
>>> +
>>> + cmod_setup_module_handle(L, module);
>>> + /*
>>> + * Make sure the module won't disappear until
>>> + * it is GC'ed or unloaded explicitly.
>>> + */
>>> + module->ref++;
>>
>> 20. The fact that you need to touch reference counter of an object
>> from a different subsystem should make you start to question
>> sanity of the API. Why module_load() does not do the increase,
>> and module_unload() - decrease?
>
> Because there were no such API and I though this gonna be over
> the top taking into account the size of the patch.
Total size of the patchset does not matter. And size of one commit
also does not matter if it is atomic.
But talking of this issue - you could make a small commit before
this one, which makes module_load/unload increment the ref.
> Moreover there
> is different senatics of how modules are working in box.fun and
> our cmod module.
How increment of the counter on module_load() is going to change the
behaviour? It is not about when to do the load in box.func. Box.func
also references module object, also calls load() and unload()/gc().
I don't propose to change *when* they are called.
> I agree that this is ugly, maybe it is a good time to improve
> modules code.
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Tarantool-patches] [PATCH v12 8/8] test: box/cfunc -- add cmod test
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (6 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 7/8] box/cmod: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
@ 2021-01-18 20:35 ` Cyrill Gorcunov via Tarantool-patches
2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
2021-01-24 16:26 ` [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Vladislav Shpilevoy via Tarantool-patches
8 siblings, 1 reply; 26+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-01-18 20:35 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
Note that the test is disabled for a while since we
need to update test-run first like
| diff --git a/pretest_clean.lua b/pretest_clean.lua
| index 9b5ac9d..b0280c4 100644
| --- a/pretest_clean.lua
| +++ b/pretest_clean.lua
| @@ -272,6 +272,7 @@ local function clean()
| package = true,
| pickle = true,
| popen = true,
| + cmod = true,
| pwd = true,
| socket = true,
| strict = true,
ie need to enable cmod module.
Part-of #4642
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
test/box/CMakeLists.txt | 2 +
test/box/cfunc1.c | 58 ++++++++++++
test/box/cfunc2.c | 132 ++++++++++++++++++++++++++
test/box/cmod.result | 199 ++++++++++++++++++++++++++++++++++++++++
test/box/cmod.test.lua | 70 ++++++++++++++
test/box/suite.ini | 2 +-
6 files changed, 462 insertions(+), 1 deletion(-)
create mode 100644 test/box/cfunc1.c
create mode 100644 test/box/cfunc2.c
create mode 100644 test/box/cmod.result
create mode 100644 test/box/cmod.test.lua
diff --git a/test/box/CMakeLists.txt b/test/box/CMakeLists.txt
index 06bfbbe9d..2afbeadc3 100644
--- a/test/box/CMakeLists.txt
+++ b/test/box/CMakeLists.txt
@@ -3,3 +3,5 @@ build_module(function1 function1.c)
build_module(reload1 reload1.c)
build_module(reload2 reload2.c)
build_module(tuple_bench tuple_bench.c)
+build_module(cfunc1 cfunc1.c)
+build_module(cfunc2 cfunc2.c)
diff --git a/test/box/cfunc1.c b/test/box/cfunc1.c
new file mode 100644
index 000000000..8f6d3990c
--- /dev/null
+++ b/test/box/cfunc1.c
@@ -0,0 +1,58 @@
+#include <stdio.h>
+#include <stdbool.h>
+#include <msgpuck.h>
+
+#include "module.h"
+
+/*
+ * Before the reload functions are just declared
+ * and simply exit with zero.
+ *
+ * After the module reload we should provide real
+ * functionality.
+ */
+
+int
+cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
+
+int
+cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
+
+int
+cfunc_multireturn(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
+
+int
+cfunc_args(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
+
+int
+cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
diff --git a/test/box/cfunc2.c b/test/box/cfunc2.c
new file mode 100644
index 000000000..f6d826183
--- /dev/null
+++ b/test/box/cfunc2.c
@@ -0,0 +1,132 @@
+#include <stdio.h>
+#include <stdbool.h>
+#include <msgpuck.h>
+
+#include "module.h"
+
+/*
+ * Just make sure we've been called.
+ */
+int
+cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ (void)ctx;
+ (void)args;
+ (void)args_end;
+ return 0;
+}
+
+/*
+ * Fetch first N even numbers (just to make sure the order of
+ * arguments is not screwed).
+ */
+int
+cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ int arg_count = mp_decode_array(&args);
+ if (arg_count != 1) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "invalid argument count");
+ }
+ int field_count = mp_decode_array(&args);
+ if (field_count < 1) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "invalid array size");
+ }
+
+ for (int i = 0; i < field_count; i++) {
+ int val = mp_decode_uint(&args);
+ int needed = 2 * (i+1);
+ if (val != needed) {
+ char res[128];
+ snprintf(res, sizeof(res), "%s %d != %d",
+ "invalid argument", val, needed);
+ return box_error_set(__FILE__, __LINE__,
+ ER_PROC_C, "%s", res);
+ }
+ }
+
+ return 0;
+}
+
+/*
+ * Return one element array twice.
+ */
+int
+cfunc_multireturn(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ char tuple_buf[512];
+ char *d = tuple_buf;
+ d = mp_encode_array(d, 1);
+ d = mp_encode_uint(d, 1);
+ assert(d <= tuple_buf + sizeof(tuple_buf));
+
+ box_tuple_format_t *fmt = box_tuple_format_default();
+ box_tuple_t *tuple_a = box_tuple_new(fmt, tuple_buf, d);
+ if (tuple_a == NULL)
+ return -1;
+ int rc = box_return_tuple(ctx, tuple_a);
+ if (rc != 0)
+ return rc;
+ return box_return_tuple(ctx, tuple_a);
+}
+
+/*
+ * Encode int + string pair back.
+ */
+int
+cfunc_args(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ uint32_t arg_count = mp_decode_array(&args);
+ if (arg_count != 2) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "invalid argument count");
+ }
+
+ if (mp_typeof(*args) != MP_UINT) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "tuple field must be uint");
+ }
+ uint32_t num = mp_decode_uint(&args);
+
+ if (mp_typeof(*args) != MP_STR) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "tuple field must be string");
+ }
+ const char *str = args;
+ uint32_t len = mp_decode_strl(&str);
+
+ char tuple_buf[512];
+ char *d = tuple_buf;
+ d = mp_encode_array(d, 2);
+ d = mp_encode_uint(d, num);
+ d = mp_encode_str(d, str, len);
+ assert(d <= tuple_buf + sizeof(tuple_buf));
+
+ box_tuple_format_t *fmt = box_tuple_format_default();
+ box_tuple_t *tuple = box_tuple_new(fmt, tuple_buf, d);
+ if (tuple == NULL)
+ return -1;
+
+ return box_return_tuple(ctx, tuple);
+}
+
+/*
+ * Sum two integers.
+ */
+int
+cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+ uint32_t arg_count = mp_decode_array(&args);
+ if (arg_count != 2) {
+ return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
+ "invalid argument count");
+ }
+ uint64_t a = mp_decode_uint(&args);
+ uint64_t b = mp_decode_uint(&args);
+
+ char res[16];
+ char *end = mp_encode_uint(res, a + b);
+ box_return_mp(ctx, res, end);
+ return 0;
+}
diff --git a/test/box/cmod.result b/test/box/cmod.result
new file mode 100644
index 000000000..f8558c191
--- /dev/null
+++ b/test/box/cmod.result
@@ -0,0 +1,199 @@
+-- test-run result file version 2
+build_path = os.getenv("BUILDDIR")
+ | ---
+ | ...
+package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath
+ | ---
+ | ...
+
+cmod = require('cmod')
+ | ---
+ | ...
+fio = require('fio')
+ | ---
+ | ...
+
+ext = (jit.os == "OSX" and "dylib" or "so")
+ | ---
+ | ...
+
+cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext
+ | ---
+ | ...
+cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext
+ | ---
+ | ...
+cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext
+ | ---
+ | ...
+
+_ = pcall(fio.unlink(cfunc_path))
+ | ---
+ | ...
+fio.symlink(cfunc1_path, cfunc_path)
+ | ---
+ | - true
+ | ...
+
+module, err = cmod.load('non-existing-cfunc')
+ | ---
+ | ...
+assert(err ~= nil)
+ | ---
+ | - true
+ | ...
+
+--
+-- They all are sitting in cfunc.so
+module, err = cmod.load('cfunc')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+
+cfunc_nop, err = module:load('no-such-func')
+ | ---
+ | ...
+assert(err ~= nil)
+ | ---
+ | - true
+ | ...
+cfunc_nop, err = module:load('cfunc_nop')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+cfunc_fetch_evens, err = module:load('cfunc_fetch_evens')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+cfunc_multireturn, err = module:load('cfunc_multireturn')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+cfunc_args, err = module:load('cfunc_args')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+cfunc_sum, err = module:load('cfunc_sum')
+ | ---
+ | ...
+assert(err == nil)
+ | ---
+ | - true
+ | ...
+
+--
+-- Make sure they all are callable
+cfunc_nop()
+ | ---
+ | - true
+ | ...
+cfunc_fetch_evens()
+ | ---
+ | - true
+ | ...
+cfunc_multireturn()
+ | ---
+ | - true
+ | ...
+cfunc_args()
+ | ---
+ | - true
+ | ...
+
+--
+-- Clean old function references and reload a new one.
+_ = pcall(fio.unlink(cfunc_path))
+ | ---
+ | ...
+fio.symlink(cfunc2_path, cfunc_path)
+ | ---
+ | - true
+ | ...
+
+module:reload()
+ | ---
+ | - true
+ | ...
+
+cfunc_nop()
+ | ---
+ | - true
+ | ...
+cfunc_multireturn()
+ | ---
+ | - true
+ | - [1]
+ | - [1]
+ | ...
+cfunc_fetch_evens({2,4,6})
+ | ---
+ | - true
+ | ...
+cfunc_fetch_evens({1,2,3}) -- error
+ | ---
+ | - null
+ | - invalid argument 1 != 2
+ | ...
+cfunc_args(1, "hello")
+ | ---
+ | - true
+ | - [1, 'hello']
+ | ...
+cfunc_sum(1) -- error
+ | ---
+ | - null
+ | - invalid argument count
+ | ...
+cfunc_sum(1,2)
+ | ---
+ | - true
+ | - 3
+ | ...
+
+--
+-- Clean them up
+cfunc_nop:unload()
+ | ---
+ | - true
+ | ...
+cfunc_multireturn:unload()
+ | ---
+ | - true
+ | ...
+cfunc_fetch_evens:unload()
+ | ---
+ | - true
+ | ...
+cfunc_args:unload()
+ | ---
+ | - true
+ | ...
+cfunc_sum:unload()
+ | ---
+ | - true
+ | ...
+module:unload()
+ | ---
+ | - true
+ | ...
+
+--
+-- Cleanup the generated symlink
+_ = pcall(fio.unlink(cfunc_path))
+ | ---
+ | ...
diff --git a/test/box/cmod.test.lua b/test/box/cmod.test.lua
new file mode 100644
index 000000000..e9837e693
--- /dev/null
+++ b/test/box/cmod.test.lua
@@ -0,0 +1,70 @@
+build_path = os.getenv("BUILDDIR")
+package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath
+
+cmod = require('cmod')
+fio = require('fio')
+
+ext = (jit.os == "OSX" and "dylib" or "so")
+
+cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext
+cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext
+cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext
+
+_ = pcall(fio.unlink(cfunc_path))
+fio.symlink(cfunc1_path, cfunc_path)
+
+module, err = cmod.load('non-existing-cfunc')
+assert(err ~= nil)
+
+--
+-- They all are sitting in cfunc.so
+module, err = cmod.load('cfunc')
+assert(err == nil)
+
+cfunc_nop, err = module:load('no-such-func')
+assert(err ~= nil)
+cfunc_nop, err = module:load('cfunc_nop')
+assert(err == nil)
+cfunc_fetch_evens, err = module:load('cfunc_fetch_evens')
+assert(err == nil)
+cfunc_multireturn, err = module:load('cfunc_multireturn')
+assert(err == nil)
+cfunc_args, err = module:load('cfunc_args')
+assert(err == nil)
+cfunc_sum, err = module:load('cfunc_sum')
+assert(err == nil)
+
+--
+-- Make sure they all are callable
+cfunc_nop()
+cfunc_fetch_evens()
+cfunc_multireturn()
+cfunc_args()
+
+--
+-- Clean old function references and reload a new one.
+_ = pcall(fio.unlink(cfunc_path))
+fio.symlink(cfunc2_path, cfunc_path)
+
+module:reload()
+
+cfunc_nop()
+cfunc_multireturn()
+cfunc_fetch_evens({2,4,6})
+cfunc_fetch_evens({1,2,3}) -- error
+cfunc_args(1, "hello")
+cfunc_sum(1) -- error
+cfunc_sum(1,2)
+
+--
+-- Clean them up
+cfunc_nop:unload()
+cfunc_multireturn:unload()
+cfunc_fetch_evens:unload()
+cfunc_args:unload()
+cfunc_sum:unload()
+module:unload()
+
+--
+-- Cleanup the generated symlink
+_ = pcall(fio.unlink(cfunc_path))
diff --git a/test/box/suite.ini b/test/box/suite.ini
index e700d0b9e..fc16c5951 100644
--- a/test/box/suite.ini
+++ b/test/box/suite.ini
@@ -2,7 +2,7 @@
core = tarantool
description = Database tests
script = box.lua
-disabled = rtree_errinj.test.lua tuple_bench.test.lua
+disabled = rtree_errinj.test.lua tuple_bench.test.lua cmod.test.lua
long_run = huge_field_map_long.test.lua
config = engine.cfg
release_disabled = errinj.test.lua errinj_index.test.lua rtree_errinj.test.lua upsert_errinj.test.lua iproto_stress.test.lua gh-4648-func-load-unload.test.lua
--
2.29.2
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 8/8] test: box/cfunc -- add cmod test
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 8/8] test: box/cfunc -- add cmod test Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:28 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:28 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Thanks for the patch!
See 5 comments below.
> diff --git a/test/box/cfunc2.c b/test/box/cfunc2.c
> new file mode 100644
> index 000000000..f6d826183
> --- /dev/null
> +++ b/test/box/cfunc2.c
> @@ -0,0 +1,132 @@
> +#include <stdio.h>
> +#include <stdbool.h>
> +#include <msgpuck.h>
> +
> +#include "module.h"
> +
> +/*
> + * Just make sure we've been called.
> + */
> +int
> +cfunc_nop(box_function_ctx_t *ctx, const char *args, const char *args_end)
> +{
> + (void)ctx;
> + (void)args;
> + (void)args_end;
> + return 0;
> +}
> +
> +/*
> + * Fetch first N even numbers (just to make sure the order of
> + * arguments is not screwed).> + */
> +int
> +cfunc_fetch_evens(box_function_ctx_t *ctx, const char *args, const char *args_end)
> +{
> + int arg_count = mp_decode_array(&args);
> + if (arg_count != 1) {
> + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
> + "invalid argument count");
1. Broken alignment. Here and in other error places.
> + }
> + int field_count = mp_decode_array(&args);
> + if (field_count < 1) {
> + return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
> + "invalid array size");
> + }
> +
> + for (int i = 0; i < field_count; i++) {
> + int val = mp_decode_uint(&args);
> + int needed = 2 * (i+1);
2. Please, use whitespaces before and after binary operators. I really
suggest you to revisit the code style guide.
Why do you need such a complex function? Why couldn't you simply make a
function like 'echo', which validates msgpack and returns it as is? You
would be able to check that the args are not corrupted, and are exactly
what you have passed.
> + if (val != needed) {
> + char res[128];
> + snprintf(res, sizeof(res), "%s %d != %d",
> + "invalid argument", val, needed);
> + return box_error_set(__FILE__, __LINE__,
> + ER_PROC_C, "%s", res);
> + }
> + }
> +
> + return 0;
> +}
> diff --git a/test/box/cmod.result b/test/box/cmod.result
> new file mode 100644
> index 000000000..f8558c191
> --- /dev/null
> +++ b/test/box/cmod.result
> @@ -0,0 +1,199 @@
> +-- test-run result file version 2
> +build_path = os.getenv("BUILDDIR")
> + | ---
> + | ...
> +package.cpath = build_path..'/test/box/?.so;'..build_path..'/test/box/?.dylib;'..package.cpath
> + | ---
> + | ...
> +
> +cmod = require('cmod')
3. In the beginning of a test we usually write a comment with the
issue number and its description.
> + | ---
> + | ...
> +fio = require('fio')
> + | ---
> + | ...
> +
> +ext = (jit.os == "OSX" and "dylib" or "so")
> + | ---
> + | ...
> +
> +cfunc_path = fio.pathjoin(build_path, "test/box/cfunc.") .. ext
> + | ---
> + | ...
> +cfunc1_path = fio.pathjoin(build_path, "test/box/cfunc1.") .. ext
> + | ---
> + | ...
> +cfunc2_path = fio.pathjoin(build_path, "test/box/cfunc2.") .. ext
> + | ---
> + | ...
> +
> +_ = pcall(fio.unlink(cfunc_path))
> + | ---
> + | ...
> +fio.symlink(cfunc1_path, cfunc_path)
> + | ---
> + | - true
> + | ...
> +
> +module, err = cmod.load('non-existing-cfunc')
> + | ---
> + | ...
> +assert(err ~= nil)
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- They all are sitting in cfunc.so
> +module, err = cmod.load('cfunc')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +
> +cfunc_nop, err = module:load('no-such-func')
> + | ---
> + | ...
> +assert(err ~= nil)
> + | ---
> + | - true
> + | ...
> +cfunc_nop, err = module:load('cfunc_nop')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +cfunc_fetch_evens, err = module:load('cfunc_fetch_evens')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +cfunc_multireturn, err = module:load('cfunc_multireturn')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +cfunc_args, err = module:load('cfunc_args')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +cfunc_sum, err = module:load('cfunc_sum')
> + | ---
> + | ...
> +assert(err == nil)
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Make sure they all are callable
> +cfunc_nop()
> + | ---
> + | - true
> + | ...
> +cfunc_fetch_evens()
> + | ---
> + | - true
> + | ...
> +cfunc_multireturn()
> + | ---
> + | - true
> + | ...
> +cfunc_args()
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Clean old function references and reload a new one.
> +_ = pcall(fio.unlink(cfunc_path))
> + | ---
> + | ...
> +fio.symlink(cfunc2_path, cfunc_path)
> + | ---
> + | - true
> + | ...
> +
> +module:reload()
> + | ---
> + | - true
> + | ...
> +
> +cfunc_nop()
> + | ---
> + | - true
> + | ...
> +cfunc_multireturn()
4. I am not sure it is correct. The function wasn't
reloaded. I thought we decided that to get a new
function after reload you need to call load() on the
module again.
Reason for not changing the existing function objects was
that the reload is supposed to happen when the whole
application reloads. It means there can be some fibers
running the old code for a while after reload until they
notice that they must be restarted. If you do reload like
you did in this patch, the old fibers will get new functions
immediately right under their feet, and this may result
into unexpected behaviour for code in these fibers.
Another way to get the desired behaviour - use unload() + load()
- I assume in this case the old function objects will remain
unchanged, right?
I suggest to ask Mons what to do with this. And this is why
I asked for a proper RFC (or GitHub discussion) before starting
to rework the patch.
> + | ---
> + | - true
> + | - [1]
> + | - [1]
> + | ...
> +cfunc_fetch_evens({2,4,6})
> + | ---
> + | - true
> + | ...
> +cfunc_fetch_evens({1,2,3}) -- error
> + | ---
> + | - null
> + | - invalid argument 1 != 2
> + | ...
> +cfunc_args(1, "hello")
> + | ---
> + | - true
> + | - [1, 'hello']
> + | ...
> +cfunc_sum(1) -- error
> + | ---
> + | - null
> + | - invalid argument count
> + | ...
> +cfunc_sum(1,2)
> + | ---
> + | - true
> + | - 3
> + | ...
> +
> +--
> +-- Clean them up
> +cfunc_nop:unload()
> + | ---
> + | - true
> + | ...
> +cfunc_multireturn:unload()
> + | ---
> + | - true
> + | ...
> +cfunc_fetch_evens:unload()
> + | ---
> + | - true
> + | ...
> +cfunc_args:unload()
> + | ---
> + | - true
> + | ...
> +cfunc_sum:unload()
> + | ---
> + | - true
> + | ...
> +module:unload()
> + | ---
> + | - true
> + | ...
> +
> +--
> +-- Cleanup the generated symlink
> +_ = pcall(fio.unlink(cfunc_path))
5. You didn't test GC, multiple unload (should raise an
error), reload when the shared file was deleted, and
that the functions are kept loaded even if the module
was unloaded. Maybe more, but this is just what I thought
of right away.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module
2021-01-18 20:35 [Tarantool-patches] [PATCH v12 0/8] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches
` (7 preceding siblings ...)
2021-01-18 20:35 ` [Tarantool-patches] [PATCH v12 8/8] test: box/cfunc -- add cmod test Cyrill Gorcunov via Tarantool-patches
@ 2021-01-24 16:26 ` Vladislav Shpilevoy via Tarantool-patches
8 siblings, 0 replies; 26+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-01-24 16:26 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Hi! Thanks for the patchset!
> v12:
> - switch to new API as been discussed in
> https://lists.tarantool.org/tarantool-patches/e186c454-6765-4776-6433-f3f791ff4c27@tarantool.org/
This is what I meant when I asked for a proper API design
which we could strictly follow. The thing by the link is not a
discussion. It is some questions from me + your response talking
about implementation and other assumptions/proposals. I still
don't see an approved API design on which all the interested people
would agree.
I simply don't know how to review the user-visible part of this patch
because I don't know how it should work. There is no design, the API
in the patch just happened to be this.
^ permalink raw reply [flat|nested] 26+ messages in thread