From: Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: tml <tarantool-patches@dev.tarantool.org> Cc: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> Subject: [Tarantool-patches] [PATCH v14 10/12] module_cache: provide module_load/unload calls Date: Wed, 3 Feb 2021 01:12:05 +0300 [thread overview] Message-ID: <20210202221207.383101-11-gorcunov@gmail.com> (raw) In-Reply-To: <20210202221207.383101-1-gorcunov@gmail.com> These calls provide an ability to load and unload modules taking into account cache invalidation. In particular if module file is changed on disk then all previous copies become invalid and any attempt to resolve a function symbol in such copies will fail. Still any existing instances will be able to run until explicitly unloaded. This is done for sake of new upcoming `cmod` module (will be implemented in next patch). Need to mention some moments here. Currently we already have `box.schema.func` interface which is strange designed but can't be dropped off without breaking backward compatibility. Very inconvenient moment here is lazy symbols binding, iow we can create a function but until we execute it the module won't be read and parsed. Moreover one can execute the function then overwrite the module on disk and define an another function from the same module and run it. This new function will be read from previously cached copy and one have to run module reload procedure explicitly to update the module. Module update on its own is a very fragile procedure: if there are running functions which are yielding then old module won't be reload until execution is complete even if explicit reload is called. And in case if function return signature is changed (say occasionally) this will lead to unpredicted results. In summary: new module_load/unload API with cache invalidation is incompatible with old `box.schema.func` thus to keep backward compatibility (until we deprecate old API) we keep modules in two separate caches: one for module_load/unload and one for `box.schema.func`. Later we will keep only new API and get rid of old code. Part-of #4642 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> --- src/box/module_cache.c | 179 +++++++++++++++++++++++++++++++++++------ src/box/module_cache.h | 27 +++++++ 2 files changed, 181 insertions(+), 25 deletions(-) diff --git a/src/box/module_cache.c b/src/box/module_cache.c index f9ed6e7d8..aba55bbe3 100644 --- a/src/box/module_cache.c +++ b/src/box/module_cache.c @@ -19,11 +19,33 @@ #include "error.h" #include "lua/utils.h" #include "libeio/eio.h" +#include "trivia/util.h" #include "module_cache.h" -/** Modules name to descriptor hash. */ +/** + * Modules names to descriptor hashes. The first one + * for modules created with old `box.schema.func` + * interface. + * + * Here is an important moment for backward compatibility. + * The `box.schema.func` operations always use cache and + * if a module is updated on a storage device or even + * no longer present, then lazy symbol resolving is done + * via previously loaded copy. To update modules one have + * to reload them manually. + * + * In turn new API implies to use module_load/unload explicit + * interface, and when module is re-loaded from cache then + * we make a cache validation to be sure the copy on storage + * is up to date. + * + * Due to all this we have to keep two hash tables. Probably + * we should deprecate explicit reload at all and require + * manual load/unload instead. But later. + */ static struct mh_strnptr_t *box_schema_hash = NULL; +static struct mh_strnptr_t *mod_hash = NULL; /** * Parsed symbol and package names. @@ -52,7 +74,7 @@ struct func_name { static inline struct mh_strnptr_t * hash_tbl(bool is_box_schema) { - return is_box_schema ? box_schema_hash : NULL; + return is_box_schema ? box_schema_hash : mod_hash; } /*** @@ -289,13 +311,9 @@ module_unref(struct module *module) * for cases of a function reload. */ static struct module * -module_load(struct mh_strnptr_t *h, const char *package, - const char *package_end) +module_new(const char *path, struct mh_strnptr_t *h, + const char *package, const char *package_end) { - char path[PATH_MAX]; - if (module_find(package, package_end, path, sizeof(path)) != 0) - return NULL; - int package_len = package_end - package; struct module *module = malloc(sizeof(*module) + package_len + 1); if (module == NULL) { @@ -334,8 +352,8 @@ module_load(struct mh_strnptr_t *h, const char *package, goto error; } - struct stat st; - if (stat(path, &st) < 0) { + struct stat *st = &module->st; + if (stat(path, st) < 0) { diag_set(SystemError, "failed to stat() module %s", path); goto error; } @@ -347,7 +365,7 @@ module_load(struct mh_strnptr_t *h, const char *package, } int dest_fd = open(load_name, O_WRONLY | O_CREAT | O_TRUNC, - st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO)); + st->st_mode & (S_IRWXU | S_IRWXG | S_IRWXO)); if (dest_fd < 0) { diag_set(SystemError, "failed to open file %s for writing ", load_name); @@ -355,10 +373,10 @@ module_load(struct mh_strnptr_t *h, const char *package, goto error; } - off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size); + off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st->st_size); close(source_fd); close(dest_fd); - if (ret != st.st_size) { + if (ret != st->st_size) { diag_set(SystemError, "failed to copy DSO %s to %s", path, load_name); goto error; @@ -405,6 +423,24 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema) { assert(mod_sym->addr == NULL); + /* + * When module is created via new interface and + * cached, it might be updated on disk afterward + * so that symbols might not longer match the loaded + * instance. Thus new resolving on same module requires + * the whole module explicit reload. + */ + if (!is_box_schema) { + struct module *m = mod_sym->module; + assert(m != NULL); + if (module_is_orphan(m)) { + diag_set(ClientError, ER_UNSUPPORTED, + "module updated on disk,", + "needs whole reload"); + return -1; + } + } + struct func_name name; func_split_name(mod_sym->name, &name); @@ -417,7 +453,12 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema) struct mh_strnptr_t *h = hash_tbl(is_box_schema); cached = module_cache_find(h, name.package, name.package_end); if (cached == NULL) { - module = module_load(h, name.package, name.package_end); + char path[PATH_MAX]; + if (module_find(name.package, name.package_end, + path, sizeof(path)) != 0) { + return -1; + } + module = module_new(path, h, name.package, name.package_end); if (module == NULL) return -1; if (module_cache_add(module) != 0) { @@ -435,7 +476,9 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema) return -1; } - mod_sym->module = module; + if (is_box_schema) + mod_sym->module = module; + rlist_add(&module->funcs_list, &mod_sym->item); return 0; } @@ -514,6 +557,72 @@ module_sym_call(struct module_sym *mod_sym, struct port *args, return rc; } +struct module * +module_load(const char *package, const char *package_end) +{ + char path[PATH_MAX]; + if (module_find(package, package_end, path, sizeof(path)) != 0) + return NULL; + + struct module *cached, *module; + struct mh_strnptr_t *h = hash_tbl(false); + cached = module_cache_find(h, package, package_end); + if (cached == NULL) { + module = module_new(path, h, package, package_end); + if (module == NULL) + return NULL; + if (module_cache_add(module) != 0) { + module_unref(module); + return NULL; + } + return module; + } + + struct stat st; + if (stat(path, &st) != 0) { + diag_set(SystemError, "module: stat() module %s", path); + return NULL; + } + + /* + * When module comes from cache make sure that + * it is not changed on the storage device. The + * test below still can miss update if cpu data + * been manually moved backward and device/inode + * persisted but this is a really rare situation. + */ + if (cached->st.st_dev == st.st_dev && + cached->st.st_ino == st.st_ino && + cached->st.st_size == st.st_size && + memcmp(&cached->st.st_mtim, &st.st_mtim, + sizeof(st.st_mtim)) == 0) { + module_ref(cached); + return cached; + } + + /* + * Load a new module, update the cache and + * zap old module: every attempt to resolve + * symbols on old instance won't success. + */ + module = module_new(path, h, package, package_end); + if (module == NULL) + return NULL; + if (module_cache_update(module) != 0) { + module_unref(module); + return NULL; + } + + module_set_orphan(cached); + return module; +} + +void +module_unload(struct module *module) +{ + module_unref(module); +} + int module_reload(const char *package, const char *package_end) { @@ -529,7 +638,11 @@ module_reload(const char *package, const char *package_end) return -1; } - new = module_load(box_schema_hash, package, package_end); + char path[PATH_MAX]; + if (module_find(package, package_end, path, sizeof(path)) != 0) + return -1; + + new = module_new(path, box_schema_hash, package, package_end); if (new == NULL) return -1; @@ -606,11 +719,21 @@ module_reload(const char *package, const char *package_end) int module_init(void) { - box_schema_hash = mh_strnptr_new(); - if (box_schema_hash == NULL) { - diag_set(OutOfMemory, sizeof(*box_schema_hash), - "malloc", "modules box_schema_hash"); - return -1; + struct mh_strnptr_t **ht[] = { + &box_schema_hash, + &mod_hash, + }; + for (size_t i = 0; i < lengthof(ht); i++) { + *ht[i] = mh_strnptr_new(); + if (*ht[i] == NULL) { + diag_set(OutOfMemory, sizeof(*ht[i]), + "malloc", "modules hash"); + for (ssize_t j = i - 1; j >= 0; j--) { + mh_strnptr_delete(*ht[j]); + *ht[j] = NULL; + } + return -1; + } } return 0; } @@ -618,12 +741,18 @@ module_init(void) void module_free(void) { - struct mh_strnptr_t *h = box_schema_hash; - while (mh_size(h) > 0) { + struct mh_strnptr_t **ht[] = { + &box_schema_hash, + &mod_hash, + }; + for (size_t i = 0; i < lengthof(ht); i++) { + struct mh_strnptr_t *h = *ht[i]; + mh_int_t i = mh_first(h); struct module *m = mh_strnptr_node(h, i)->val; module_unref(m); + + mh_strnptr_delete(h); + *ht[i] = NULL; } - mh_strnptr_delete(box_schema_hash); - box_schema_hash = NULL; } diff --git a/src/box/module_cache.h b/src/box/module_cache.h index 875f2eb3c..ba2cbc7dd 100644 --- a/src/box/module_cache.h +++ b/src/box/module_cache.h @@ -6,6 +6,10 @@ #pragma once +#include <sys/types.h> +#include <sys/stat.h> +#include <unistd.h> + #include "small/rlist.h" #if defined(__cplusplus) @@ -48,6 +52,10 @@ struct module { * Count of active references to the module. */ int64_t refs; + /** + * Storage stat for identity check. + */ + struct stat st; /** * Module's package name. */ @@ -112,6 +120,25 @@ int module_sym_call(struct module_sym *mod_sym, struct port *args, struct port *ret); +/** + * Load new module instance. + * + * @param package shared library path start. + * @param package_end shared library path end. + * + * @return 0 on succes, -1 otherwise, diag is set. + */ +struct module * +module_load(const char *package, const char *package_end); + +/** + * Unload module instance. + * + * @param module instance to unload. + */ +void +module_unload(struct module *module); + /** * Reload a module and all associated symbols. * -- 2.29.2
next prev parent reply other threads:[~2021-02-02 22:17 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-02-02 22:11 [Tarantool-patches] [PATCH v14 00/12] box: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:11 ` [Tarantool-patches] [PATCH v14 01/12] box/func: factor out c function entry structure Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:11 ` [Tarantool-patches] [PATCH v14 02/12] module_cache: move module handling into own subsystem Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:11 ` [Tarantool-patches] [PATCH v14 03/12] module_cache: direct update a cache value on reload Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:11 ` [Tarantool-patches] [PATCH v14 04/12] module_cache: rename calls to ref in module structure Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 05/12] module_cache: add comment about weird resolving Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 06/12] module_cache: module_reload - drop redundant parameter Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 07/12] module_cache: use references as a main usage counter Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 08/12] module_cache: make module to carry hash it belongs to Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 09/12] module_cache: use own hash for box.schema.func requests Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` Cyrill Gorcunov via Tarantool-patches [this message] 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 11/12] box/cmod: implement cmod Lua module Cyrill Gorcunov via Tarantool-patches 2021-02-02 22:12 ` [Tarantool-patches] [PATCH v14 12/12] test: box/cfunc -- add cmod test Cyrill Gorcunov via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210202221207.383101-11-gorcunov@gmail.com \ --to=tarantool-patches@dev.tarantool.org \ --cc=gorcunov@gmail.com \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v14 10/12] module_cache: provide module_load/unload calls' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox