From mboxrd@z Thu Jan 1 00:00:00 1970
To: tml
Date: Wed, 3 Feb 2021 01:12:05 +0300
Message-Id: <20210202221207.383101-11-gorcunov@gmail.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20210202221207.383101-1-gorcunov@gmail.com>
References: <20210202221207.383101-1-gorcunov@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH v14 10/12] module_cache: provide module_load/unload calls
List-Id: Tarantool development patches
From: Cyrill Gorcunov via Tarantool-patches
Reply-To: Cyrill Gorcunov
Cc: Vladislav Shpilevoy

These calls make it possible to load and unload modules while taking
cache invalidation into account. In particular, if a module file is
changed on disk, all previously loaded copies become invalid and any
attempt to resolve a function symbol in such a copy fails. Existing
instances can still run until they are explicitly unloaded.

This is done for the sake of the upcoming `cmod` module (implemented
in the next patch).

A few points need mentioning here. We already have the
`box.schema.func` interface, which is strangely designed but cannot be
dropped without breaking backward compatibility.
A very inconvenient aspect of it is lazy symbol binding: we can create
a function, but until we execute it the module is not read and parsed.
Moreover, one can execute the function, then overwrite the module on
disk, define another function from the same module, and run it. This
new function is resolved from the previously cached copy, and one has
to run the module reload procedure explicitly to update the module.

Module update on its own is a very fragile procedure: if there are
running functions that are yielding, the old module won't be reloaded
until their execution completes, even if an explicit reload is called.
And if a function's return signature has changed (say, accidentally),
this leads to unpredictable results.

In summary: the new module_load/unload API with cache invalidation is
incompatible with the old `box.schema.func`. Thus, to keep backward
compatibility (until we deprecate the old API), we keep modules in two
separate caches: one for module_load/unload and one for
`box.schema.func`. Later we will keep only the new API and get rid of
the old code.

Part-of #4642

Signed-off-by: Cyrill Gorcunov
---
 src/box/module_cache.c | 187 +++++++++++++++++++++++++++++++++++------
 src/box/module_cache.h |  27 +++++++
 2 files changed, 186 insertions(+), 28 deletions(-)

diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index f9ed6e7d8..aba55bbe3 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -19,11 +19,33 @@
 #include "error.h"
 #include "lua/utils.h"
 #include "libeio/eio.h"
+#include "trivia/util.h"
 
 #include "module_cache.h"
 
-/** Modules name to descriptor hash. */
+/**
+ * Module name to descriptor hashes. The first one is
+ * for modules created with the old `box.schema.func`
+ * interface.
+ *
+ * This matters for backward compatibility.
+ * The `box.schema.func` operations always use the cache:
+ * if a module is updated on a storage device or is even
+ * no longer present, lazy symbol resolving is done via
+ * the previously loaded copy. To update modules one has
+ * to reload them manually.
+ *
+ * In turn, the new API implies an explicit module_load/unload
+ * interface, and when a module is re-loaded from the cache
+ * we validate the cache to be sure the copy on storage
+ * is up to date.
+ *
+ * Due to all this we have to keep two hash tables. Probably
+ * we should deprecate explicit reload altogether and require
+ * manual load/unload instead. But later.
+ */
 static struct mh_strnptr_t *box_schema_hash = NULL;
+static struct mh_strnptr_t *mod_hash = NULL;
 
 /**
  * Parsed symbol and package names.
@@ -52,7 +74,7 @@ struct func_name {
 static inline struct mh_strnptr_t *
 hash_tbl(bool is_box_schema)
 {
-	return is_box_schema ? box_schema_hash : NULL;
+	return is_box_schema ? box_schema_hash : mod_hash;
 }
 
 /***
@@ -289,13 +311,9 @@ module_unref(struct module *module)
  * for cases of a function reload.
  */
 static struct module *
-module_load(struct mh_strnptr_t *h, const char *package,
-	    const char *package_end)
+module_new(const char *path, struct mh_strnptr_t *h,
+	   const char *package, const char *package_end)
 {
-	char path[PATH_MAX];
-	if (module_find(package, package_end, path, sizeof(path)) != 0)
-		return NULL;
-
 	int package_len = package_end - package;
 	struct module *module = malloc(sizeof(*module) + package_len + 1);
 	if (module == NULL) {
@@ -334,8 +352,8 @@ module_load(struct mh_strnptr_t *h, const char *package,
 		goto error;
 	}
 
-	struct stat st;
-	if (stat(path, &st) < 0) {
+	struct stat *st = &module->st;
+	if (stat(path, st) < 0) {
 		diag_set(SystemError, "failed to stat() module %s", path);
 		goto error;
 	}
@@ -347,7 +365,7 @@ module_load(struct mh_strnptr_t *h, const char *package,
 	}
 
 	int dest_fd = open(load_name, O_WRONLY | O_CREAT | O_TRUNC,
-			   st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
+			   st->st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
 	if (dest_fd < 0) {
 		diag_set(SystemError, "failed to open file %s for writing ",
 			 load_name);
@@ -355,10 +373,10 @@ module_load(struct mh_strnptr_t *h, const char *package,
 		goto error;
 	}
 
-	off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size);
+	off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st->st_size);
 	close(source_fd);
 	close(dest_fd);
-	if (ret != st.st_size) {
+	if (ret != st->st_size) {
 		diag_set(SystemError, "failed to copy DSO %s to %s",
 			 path, load_name);
 		goto error;
@@ -405,6 +423,24 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema)
 {
 	assert(mod_sym->addr == NULL);
 
+	/*
+	 * When a module is created via the new interface and
+	 * cached, it might be updated on disk afterwards
+	 * so that symbols might no longer match the loaded
+	 * instance. Thus resolving new symbols on the same
+	 * module requires an explicit reload of the whole module.
+	 */
+	if (!is_box_schema) {
+		struct module *m = mod_sym->module;
+		assert(m != NULL);
+		if (module_is_orphan(m)) {
+			diag_set(ClientError, ER_UNSUPPORTED,
+				 "module updated on disk,",
+				 "needs whole reload");
+			return -1;
+		}
+	}
+
 	struct func_name name;
 	func_split_name(mod_sym->name, &name);
 
@@ -417,7 +453,12 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema)
 	struct mh_strnptr_t *h = hash_tbl(is_box_schema);
 	cached = module_cache_find(h, name.package, name.package_end);
 	if (cached == NULL) {
-		module = module_load(h, name.package, name.package_end);
+		char path[PATH_MAX];
+		if (module_find(name.package, name.package_end,
+				path, sizeof(path)) != 0) {
+			return -1;
+		}
+		module = module_new(path, h, name.package, name.package_end);
 		if (module == NULL)
 			return -1;
 		if (module_cache_add(module) != 0) {
@@ -435,7 +476,9 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema)
 		return -1;
 	}
 
-	mod_sym->module = module;
+	if (is_box_schema)
+		mod_sym->module = module;
+
 	rlist_add(&module->funcs_list, &mod_sym->item);
 	return 0;
 }
@@ -514,6 +557,72 @@ module_sym_call(struct module_sym *mod_sym, struct port *args,
 	return rc;
 }
 
+struct module *
+module_load(const char *package, const char *package_end)
+{
+	char path[PATH_MAX];
+	if (module_find(package, package_end, path, sizeof(path)) != 0)
+		return NULL;
+
+	struct module *cached, *module;
+	struct mh_strnptr_t *h = hash_tbl(false);
+	cached = module_cache_find(h, package, package_end);
+	if (cached == NULL) {
+		module = module_new(path, h, package, package_end);
+		if (module == NULL)
+			return NULL;
+		if (module_cache_add(module) != 0) {
+			module_unref(module);
+			return NULL;
+		}
+		return module;
+	}
+
+	struct stat st;
+	if (stat(path, &st) != 0) {
+		diag_set(SystemError, "module: stat() module %s", path);
+		return NULL;
+	}
+
+	/*
+	 * When a module comes from the cache, make sure
+	 * it has not changed on the storage device. The
+	 * test below can still miss an update if the system
+	 * clock was manually moved backward while device/inode
+	 * stayed the same, but this is a really rare situation.
+	 */
+	if (cached->st.st_dev == st.st_dev &&
+	    cached->st.st_ino == st.st_ino &&
+	    cached->st.st_size == st.st_size &&
+	    memcmp(&cached->st.st_mtim, &st.st_mtim,
+		   sizeof(st.st_mtim)) == 0) {
+		module_ref(cached);
+		return cached;
+	}
+
+	/*
+	 * Load a new module, update the cache and
+	 * zap the old module: every attempt to resolve
+	 * symbols on the old instance will fail.
+	 */
+	module = module_new(path, h, package, package_end);
+	if (module == NULL)
+		return NULL;
+	if (module_cache_update(module) != 0) {
+		module_unref(module);
+		return NULL;
+	}
+
+	module_set_orphan(cached);
+	return module;
+}
+
+void
+module_unload(struct module *module)
+{
+	module_unref(module);
+}
+
 int
 module_reload(const char *package, const char *package_end)
 {
@@ -529,7 +638,11 @@ module_reload(const char *package, const char *package_end)
 		return -1;
 	}
 
-	new = module_load(box_schema_hash, package, package_end);
+	char path[PATH_MAX];
+	if (module_find(package, package_end, path, sizeof(path)) != 0)
+		return -1;
+
+	new = module_new(path, box_schema_hash, package, package_end);
 	if (new == NULL)
 		return -1;
 
@@ -606,11 +719,21 @@ module_reload(const char *package, const char *package_end)
 int
 module_init(void)
 {
-	box_schema_hash = mh_strnptr_new();
-	if (box_schema_hash == NULL) {
-		diag_set(OutOfMemory, sizeof(*box_schema_hash),
-			 "malloc", "modules box_schema_hash");
-		return -1;
+	struct mh_strnptr_t **ht[] = {
+		&box_schema_hash,
+		&mod_hash,
+	};
+	for (size_t i = 0; i < lengthof(ht); i++) {
+		*ht[i] = mh_strnptr_new();
+		if (*ht[i] == NULL) {
+			diag_set(OutOfMemory, sizeof(**ht[i]),
+				 "malloc", "modules hash");
+			for (ssize_t j = i - 1; j >= 0; j--) {
+				mh_strnptr_delete(*ht[j]);
+				*ht[j] = NULL;
+			}
+			return -1;
+		}
 	}
 	return 0;
 }
@@ -618,12 +741,20 @@ module_init(void)
 void
 module_free(void)
 {
-	struct mh_strnptr_t *h = box_schema_hash;
-	while (mh_size(h) > 0) {
-		mh_int_t i = mh_first(h);
-		struct module *m = mh_strnptr_node(h, i)->val;
-		module_unref(m);
+	struct mh_strnptr_t **ht[] = {
+		&box_schema_hash,
+		&mod_hash,
+	};
+	for (size_t i = 0; i < lengthof(ht); i++) {
+		struct mh_strnptr_t *h = *ht[i];
+
+		while (mh_size(h) > 0) {
+			mh_int_t pos = mh_first(h);
+			struct module *m = mh_strnptr_node(h, pos)->val;
+			module_unref(m);
+		}
+
+		mh_strnptr_delete(h);
+		*ht[i] = NULL;
 	}
-	mh_strnptr_delete(box_schema_hash);
-	box_schema_hash = NULL;
 }
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
index 875f2eb3c..ba2cbc7dd 100644
--- a/src/box/module_cache.h
+++ b/src/box/module_cache.h
@@ -6,6 +6,10 @@
 
 #pragma once
 
+#include
+#include
+#include
+
 #include "small/rlist.h"
 
 #if defined(__cplusplus)
@@ -48,6 +52,10 @@ struct module {
 	 * Count of active references to the module.
 	 */
 	int64_t refs;
+	/**
+	 * Storage stat for identity check.
+	 */
+	struct stat st;
 	/**
 	 * Module's package name.
 	 */
@@ -112,6 +120,25 @@ int
 module_sym_call(struct module_sym *mod_sym, struct port *args,
 		struct port *ret);
 
+/**
+ * Load new module instance.
+ *
+ * @param package shared library path start.
+ * @param package_end shared library path end.
+ *
+ * @return module on success, NULL otherwise (diag is set).
+ */
+struct module *
+module_load(const char *package, const char *package_end);
+
+/**
+ * Unload module instance.
+ *
+ * @param module instance to unload.
+ */
+void
+module_unload(struct module *module);
+
 /**
  * Reload a module and all associated symbols.
  *
-- 
2.29.2