From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 6C3FD6EC59; Tue, 9 Mar 2021 01:45:05 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 6C3FD6EC59 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1615243505; bh=vMdVCcAScffLXCFoN/UOxIq/HrzuA6VWBdP7IPzuoV8=; h=To:References:Date:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=yYBbwi3oEsvw+lDwSEFigKnAqb753hp1iuvtm7W75qqBASzLNJs17y5tQL8KMCRW+ rFwJxh5xNtZyDUGVL/0sdHI2AQhNbcMgKkKjSabK6Wdap9mKsJceLrtOEIl3HWQdVK LA61y8xlyNpmmhl6K7jOUeqgDFqwlWjYsUS3caMU= Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 6D6006EC59 for ; Tue, 9 Mar 2021 01:45:04 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 6D6006EC59 Received: by smtpng3.m.smailru.net with esmtpa (envelope-from ) id 1lJOcZ-0005ud-Q4; Tue, 09 Mar 2021 01:45:04 +0300 To: Cyrill Gorcunov , tml References: <20210301212343.422372-1-gorcunov@gmail.com> <20210301212343.422372-4-gorcunov@gmail.com> Message-ID: Date: Mon, 8 Mar 2021 23:45:02 +0100 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.8.0 MIME-Version: 1.0 In-Reply-To: <20210301212343.422372-4-gorcunov@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD9D3134714A9BDB69BE66984610072BB1D09A72A4B5C8F966400894C459B0CD1B9F8371CC41B76B4C42A20D9931BB70B1AA4066BDE562F54EEFEA01C85A8357090 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7D9C4478D0B876341EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F790063729D9E4EC97225AF68638F802B75D45FF914D58D5BE9E6BC131B5C99E7648C95C1F4A82545730C76126940A370E7467328C6BC8EA3DD3A5FEA471835C12D1D9774AD6D5ED66289B5259CC434672EE6371117882F4460429724CE54428C33FAD30A8DF7F3B2552694AC26CFBAC0749D213D2E47CDBA5A9658378DA827A17800CE7328B01A8D746D8839FA2833FD35BB23DF004C9065253843057739F23D657EF2B13377AFFFEAFD26923F8577A6DFFEA7CB24F08513AFFAC7993EC92FD9297F6715571747095F342E857739F23D657EF2BD5E8D9A59859A8B6F6B3B5B76D0D00FF089D37D7C0E48F6C5571747095F342E857739F23D657EF2B6825BDBE14D8E70227F01F88B6EF2528BD9CCCA9EDD067B1EDA766A37F9254B7 X-B7AD71C0: AC4F5C86D027EB782CDD5689AFBDA7A2BBE337FB72E92315FF39D8DB89857825743D937135AA13FF7120598F3739033D109B151A58CD633F834459D11680B505C87839E103577AE77D5F68EA631A6B49 X-C1DE0DAB: 0D63561A33F958A527477EF4CE4B7468CA06C2545209BBA09332EC5E9B6FAADAD59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA75448CF9D3A7B2C848410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34A246DFF68F0EE19E3922B76B0174231B1B9A031DED02394BD952145F6D1C1B1F32509D80F92569EB1D7E09C32AA3244C45E09EE659DD37A21B95278EDEFB51207101BF96129E4011729B2BEF169E0186 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojUqxEkzHTt/gABZOddxQ3Rw== X-Mailru-Sender: 689FA8AB762F73936BC43F508A0638221743AA5368EA03A247750C1F7829694A3841015FED1DE5223CC9A89AB576DD93FB559BB5D741EB963CF37A108A312F5C27E8A8C3839CE0E267EA787935ED9F1B X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH v19 3/6] box/module_cache: introduce modules subsystem X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Vladislav Shpilevoy via Tarantool-patches Reply-To: Vladislav Shpilevoy Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Thanks for the patch! See 10 comments below. > diff --git a/src/box/module_cache.c b/src/box/module_cache.c > new file mode 100644 > index 000000000..bc2184e5f > --- /dev/null > +++ b/src/box/module_cache.c > @@ -0,0 +1,476 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#include > +#include > +#include > +#include > +#include > + > +#include "assoc.h" > +#include "diag.h" > +#include "fiber.h" > +#include "module_cache.h" > + > +#include "box/error.h" > +#include "box/port.h" > + > +#include "lua/utils.h" > +#include "libeio/eio.h" > + > +static struct mh_strnptr_t *module_cache = NULL; > + > +/** > + * Helpers for cache manipulations. > + */ > +static void * > +cache_find(const char *str, size_t len) > +{ > + mh_int_t e = mh_strnptr_find_inp(module_cache, str, len); > + if (e == mh_end(module_cache)) > + return NULL; > + return mh_strnptr_node(module_cache, e)->val; > +} > + > +static void > +cache_update(struct module *m) > +{ > + const char *str = m->package; > + size_t len = m->package_len; > + > + mh_int_t e = mh_strnptr_find_inp(module_cache, str, len); > + if (e == mh_end(module_cache)) > + panic("module: failed to update cache: %s", str); > + > + mh_strnptr_node(module_cache, e)->str = m->package; > + mh_strnptr_node(module_cache, e)->val = m; > +} > + > +static int > +cache_add(struct module *m) 1. In func.c it is called `cache_put`, but does the same. What is the difference between `put` and `add` then? If none, then why not use only one verb? > +{ > + const struct mh_strnptr_node_t nd = { > + .str = m->package, > + .len = m->package_len, > + .hash = mh_strn_hash(m->package, m->package_len), > + .val = m, > + }; > + > + mh_int_t e = mh_strnptr_put(module_cache, &nd, NULL, NULL); > + if (e == mh_end(module_cache)) { > + diag_set(OutOfMemory, sizeof(nd), "malloc", > + "module_cache node"); > + return -1; > + } > + return 0; > +} > + > +static void > +cache_del(struct module *m) > +{ > + const char *str = m->package; > + size_t len = m->package_len; > + > + mh_int_t e = mh_strnptr_find_inp(module_cache, str, len); > + if (e != mh_end(module_cache)) { > + struct module *v = mh_strnptr_node(module_cache, e)->val; > + if (v == m) > + mh_strnptr_del(module_cache, e, NULL); 2. Please, add a comment about why the module pointer in the cache might be different. > + } > +} > + > +struct module * > +module_load_force(const char *package, size_t package_len) > +{ > + char path[PATH_MAX]; > + size_t size = sizeof(path); > + > + if (find_package(package, package_len, path, size) != 0) > + return NULL; > + > + struct module *m = module_new(package, package_len, path); > + if (m == NULL) > + return NULL; > + > + struct module *c = cache_find(package, package_len); > + if (c != NULL) { > + cache_update(m); > + say_info("module_cache: force load: %s", package); 3. That is not very helpful for a user - he does not know what 'module_cache' is, and what is 'force load'. Try to make a more user-friendly message. It should not use any internal names. > + } else { > + if (cache_add(m) != 0) { > + module_unload(m); > + return NULL; > + } > + } > + > + return m; > +} > + > +struct module * > +module_load(const char *package, size_t package_len) > +{ > + char path[PATH_MAX]; > + > + if (find_package(package, package_len, path, sizeof(path)) != 0) > + return NULL; > + > + struct module *m = cache_find(package, package_len); > + if (m != NULL) { > + struct module_attr attr; > + struct stat st; > + if (stat(path, &st) != 0) { > + diag_set(SystemError, "failed to stat() %s", path); > + return NULL; > + } > + > + /* > + * In case of cache hit we may reuse existing > + * module which speedup load procedure. > + */ > + module_attr_fill(&attr, &st); > + if (memcmp(&attr, &m->attr, sizeof(attr)) == 0) { 4. That is not safe in case the struct has padding between members and/or in the end. You either must compare the members manually, or must ensure there is no any padding. For instance, reorder the members not to have the padding in between, add a dummy member in the end to fill the end-padding if necessary and fill it with 0 always, and add a static_assert its size is equal to the sum of member sizes. Then memcmp should be safe. Otherwise you risk to compare garbage. > + module_ref(m); > + return m; > + } > + > + /* > + * Module has been updated on a storage device, > + * so load a new instance and update the cache, > + * old entry get evicted but continue residing > + * in memory, fully functional, until last > + * function is unloaded. > + */ > + m = module_new(package, package_len, path); > + if (m != NULL) { > + cache_update(m); > + say_info("module_cache: attr changed: %s", > + package); > + } > + } else { > + m = module_new(package, package_len, path); > + if (m != NULL) { > + if (cache_add(m) != 0) { 5. You could improve readability a bit if you would do early returns on errors (return NULL right away if m == NULL), or reduce the indentation. For instance, by using a single `if` with the condition `m != NULL && cache_add(m) != 0` instead of 2 `if`s. > + module_unload(m); > + return NULL; > + } > + } > + } > + > + return m; > +} > + > +void > +module_unload(struct module *m) > +{ > + module_unref(m); > +} > + > +void > +module_free(void) > +{ > + size_t nr_busy = 0; > + mh_int_t e; > + > + mh_foreach(module_cache, e) { > + struct module *m = mh_strnptr_node(module_cache, e)->val; > + if (m->refs > 1) > + nr_busy++; > + module_unload(m); > + } > + > + mh_strnptr_delete(module_cache); > + module_cache = NULL; > + > + if (nr_busy > 0) > + say_info("module_cache: %zu entries left", nr_busy); 6. This probably must be panic. If there are modules, they are going to try to delete themselves from the cache later, which is NULL, so it would crash anyway but without any message. `module_cache` must be empty when module_free() is called. > +} > diff --git a/src/box/module_cache.h b/src/box/module_cache.h > new file mode 100644 > index 000000000..78dd5c7ad > --- /dev/null > +++ b/src/box/module_cache.h > @@ -0,0 +1,206 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#pragma once > + > +#include > +#include > + > +#include "trivia/config.h" > + > +#if defined(__cplusplus) > +extern "C" { > +#endif /* defined(__cplusplus) */ > + > +/** > + * API of C stored function. > + */ > + > +struct port; > + > +struct box_function_ctx { > + struct port *port; > +}; > + > +typedef struct box_function_ctx box_function_ctx_t; > +typedef int (*box_function_t)(box_function_ctx_t *ctx, > + const char *args, > + const char *args_end); > + > +/** > + * Shared library file attributes for > + * module cache invalidation. > + */ > +struct module_attr { > +#ifdef TARGET_OS_DARWIN > + struct timespec st_mtimespec; > +#else > + struct timespec st_mtim; > +#endif > + dev_t st_dev; > + ino_t st_ino; > + off_t st_size; 7. For the sake of easier memcmp() you could use platform-agnostic types. For instance, u/int64_t or u/int32_t. > +}; > + > +/** > + * Dynamic shared module. > + */ > +struct module { > + /** > + * Module handle, dlopen() result. > + */ > + void *handle; > + /** > + * File attributes. > + */ > + struct module_attr attr; > + /** > + * Count of active references. > + */ > + int64_t refs; > + /** > + * Length of @a package. > + */ > + size_t package_len; > + /** > + * Module's name without file extension. > + */ > + char package[0]; > +}; > + > +/** > + * Module function. > + */ > +struct module_func { > + /** > + * Function's address, iow dlsym() result. > + */ > + box_function_t func; > + /** > + * Function's module. > + */ > + struct module *module; > +}; > + > +/** > + * Load a module. > + * > + * Lookup for a module instance in cache and if not found > + * the module is loaded from a storage device. 8. It also checks for the attributes being changed. > + * > + * @param package module package (without file extension). > + * @param package_len length of @a package. > + * > + * Possible errors: > + * ClientError: the package is not found on a storage device. > + * ClientError: an error happened when been loading the package. > + * SystemError: a system error happened during procedure. > + * OutOfMemory: unable to allocate new memory for module instance. > + * > + * @return a module instance on success, NULL otherwise (diag is set) > + */ > +struct module * > +module_load(const char *package, size_t package_len); > + > +/** > + * Force load a module. > + * > + * Load a module from a storage device and invalidate > + * an associated cache entry. 9. Reload, not invalidate. It stays cached, but with a new pointer. > + * > + * @param package module package (without file extension). > + * @param package_len length of @a package. > + * > + * Possible errors: > + * ClientError: the package is not found on a storage device. > + * ClientError: an error happened when been loading the package. > + * SystemError: a system error happened during procedure. > + * OutOfMemory: unable to allocate new memory for module instance. > + * > + * @return a module instance on success, NULL otherwise (diag is set) > + */ > +struct module * > +module_load_force(const char *package, size_t package_len); > + > +/** > + * Unload a module instance. > + * > + * @param m a module to unload. > + */ > +void > +module_unload(struct module *m); > + > +/** Test if module function is empty. */ > +static inline bool > +is_module_func_empty(struct module_func *mf) 10. Struct prefix is "stronger" than 'is' prefix. Otherwise we would end up with tons of 'is' checkers in the code not related to each other but having the same prefix, and conflicting with variable names. Here it should be module_func_is_empty(). > +{ > + return mf->module == NULL; > +}