Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v20 4/7] box/module_cache: introduce modules subsystem
Date: Mon, 5 Apr 2021 17:52:47 +0200	[thread overview]
Message-ID: <14b1c3fa-da30-e7f1-6d17-32e575d71272@tarantool.org> (raw)
In-Reply-To: <20210402123420.885834-5-gorcunov@gmail.com>

Thanks for the patch!

See 6 comments below.

> diff --git a/src/box/module_cache.c b/src/box/module_cache.c
> new file mode 100644
> index 000000000..2cd2f2e8b
> --- /dev/null
> +++ b/src/box/module_cache.c
> @@ -0,0 +1,474 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
> + */
> +
> +#include <unistd.h>
> +#include <string.h>
> +#include <fcntl.h>
> +#include <dlfcn.h>
> +#include <lua.h>
> +
> +#include "assoc.h"
> +#include "diag.h"
> +#include "fiber.h"
> +#include "module_cache.h"
> +
> +#include "box/error.h"
> +#include "box/port.h"
> +
> +#include "lua/utils.h"
> +#include "libeio/eio.h"
> +
> +static struct mh_strnptr_t *module_cache = NULL;
> +
> +/**
> + * Helpers for cache manipulations.
> + */
> +static void *

1. It returns struct module, therefore the return type must be
'struct module *', not 'void *'. The same for cache_find() in box.lib
implementation.

> +cache_find(const char *str, size_t len)
> +{
> +	mh_int_t e = mh_strnptr_find_inp(module_cache, str, len);
> +	if (e == mh_end(module_cache))
> +		return NULL;
> +	return mh_strnptr_node(module_cache, e)->val;
> +}
> +
> +static void
> +cache_update(struct module *m)
> +{
> +	const char *str = m->package;
> +	size_t len = m->package_len;
> +
> +	mh_int_t e = mh_strnptr_find_inp(module_cache, str, len);
> +	if (e == mh_end(module_cache))
> +		panic("module: failed to update cache: %s", str);
> +
> +	mh_strnptr_node(module_cache, e)->str = m->package;
> +	mh_strnptr_node(module_cache, e)->val = m;
> +}
> +
> +static int
> +cache_put(struct module *m)
> +{
> +	const struct mh_strnptr_node_t nd = {
> +		.str	= m->package,
> +		.len	= m->package_len,
> +		.hash	= mh_strn_hash(m->package, m->package_len),
> +		.val	= m,
> +	};
> +
> +	mh_int_t e = mh_strnptr_put(module_cache, &nd, NULL, NULL);

2. Put() silently replaces the old value if it is present. I would
recommend to use the next to the last argument to get the old value
and ensure it is mh_end() via an assertion/panic. The same for the other
new put() functions in the other commits.

> +	if (e == mh_end(module_cache)) {
> +		diag_set(OutOfMemory, sizeof(nd), "malloc",
> +			 "module_cache node");
> +		return -1;
> +	}
> +	return 0;
> +}
> +
> +static void
> +cache_del(struct module *m)
> +{
> +	const char *str = m->package;
> +	size_t len = m->package_len;
> +
> +	mh_int_t e = mh_strnptr_find_inp(module_cache, str, len);
> +	if (e != mh_end(module_cache)) {

3. Maybe this must be an assertion/panic. I don't see a valid case when
del() is called on an already deleted module. The same for the other
new del() functions in the other commits.

> +		struct module *v = mh_strnptr_node(module_cache, e)->val;
> +		if (v == m) {
> +			/*
> +			 * The module in cache might be updated
> +			 * via force load and old instance is kept
> +			 * by a reference only.
> +			 */
> +			mh_strnptr_del(module_cache, e, NULL);
> +		}
> +	}
> +}

<...>

> +
> +struct module *
> +module_load(const char *package, size_t package_len)
> +{
> +	char path[PATH_MAX];
> +
> +	if (find_package(package, package_len, path, sizeof(path)) != 0)
> +		return NULL;
> +
> +	struct module *m = cache_find(package, package_len);
> +	if (m != NULL) {
> +		struct module_attr attr;
> +		struct stat st;
> +		if (stat(path, &st) != 0) {
> +			diag_set(SystemError, "failed to stat() %s", path);
> +			return NULL;
> +		}
> +
> +		/*
> +		 * In case of cache hit we may reuse existing
> +		 * module which speedup load procedure.
> +		 */
> +		module_attr_fill(&attr, &st);
> +		if (memcmp(&attr, &m->attr, sizeof(attr)) == 0) {

4. Please, add a static assertion, that sizeof(module_attr) == 40.
Otherwise somebody might add a new attribute, which won't be uint64_t,
and would break the comparison without noticing. Also you can make the
attributes be stored as a byte array char[40] to make it impossible to
add any padding into it. Also you can compare the attributes one by
one.

> +			module_ref(m);
> +			return m;
> +		}
> +
> +		/*
> +		 * Module has been updated on a storage device,
> +		 * so load a new instance and update the cache,
> +		 * old entry get evicted but continue residing
> +		 * in memory, fully functional, until last
> +		 * function is unloaded.
> +		 */
> +		m = module_new(package, package_len, path);
> +		if (m != NULL)
> +			cache_update(m);
> +	} else {
> +		m = module_new(package, package_len, path);
> +		if (m != NULL && cache_put(m) != 0) {
> +			module_unload(m);
> +			return NULL;
> +		}
> +	}
> +
> +	return m;
> +}
> +
> +void
> +module_unload(struct module *m)
> +{
> +	module_unref(m);
> +}
> +
> +void
> +module_free(void)
> +{
> +	mh_int_t e;
> +
> +	mh_foreach(module_cache, e) {
> +		struct module *m = mh_strnptr_node(module_cache, e)->val;
> +		module_unload(m);

5. As I said in the previous review, it does not make much sense.
If there are any not unloaded modules, and they try to unload later,
they will see module_cache == NULL and will crash.

Also you can't do unload here, because the module_cache itself does
not keep any references. All the unloads must be done by the module
objects owners. Not by module_cache on its own. For example, if there
is a module having a single reference and used in some other subsystem,
your unload will free it and make it memory invalid. That will crash
in case the module owner will try to access it again.

There should be a panic-check that the module cache is empty already.

> +	}
> +
> +	mh_strnptr_delete(module_cache);
> +	module_cache = NULL;
> +}
> diff --git a/src/box/module_cache.h b/src/box/module_cache.h
> new file mode 100644
> index 000000000..18eb3866a
> --- /dev/null
> +++ b/src/box/module_cache.h
> @@ -0,0 +1,208 @@
> +/*
> + * SPDX-License-Identifier: BSD-2-Clause
> + *
> + * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
> + */
> +
> +#pragma once
> +
> +#include <stdint.h>
> +
> +#include <sys/types.h>
> +#include <sys/stat.h>

6. You don't need these headers in module_cache.h. They are
needed only in the .c file.

  parent reply	other threads:[~2021-04-05 15:52 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-02 12:34 [Tarantool-patches] [PATCH v20 0/7] box: implement box.lib Lua module Cyrill Gorcunov via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 1/7] box/schema: make sure hashes are created Cyrill Gorcunov via Tarantool-patches
2021-04-05  9:28   ` Serge Petrenko via Tarantool-patches
2021-04-05  9:50     ` Cyrill Gorcunov via Tarantool-patches
2021-04-05 10:13       ` Serge Petrenko via Tarantool-patches
2021-04-05 15:45   ` Vladislav Shpilevoy via Tarantool-patches
2021-04-06  7:44     ` Cyrill Gorcunov via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 2/7] box/func: module_reload -- drop redundant argument Cyrill Gorcunov via Tarantool-patches
2021-04-05 10:23   ` Serge Petrenko via Tarantool-patches
2021-04-05 10:26     ` Serge Petrenko via Tarantool-patches
2021-04-05 10:31       ` Cyrill Gorcunov via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 3/7] box/func: fix modules functions restore Cyrill Gorcunov via Tarantool-patches
2021-04-05 10:53   ` Serge Petrenko via Tarantool-patches
2021-04-05 11:26     ` Cyrill Gorcunov via Tarantool-patches
2021-04-05 11:42       ` Serge Petrenko via Tarantool-patches
2021-04-05 15:47   ` Vladislav Shpilevoy via Tarantool-patches
2021-04-06  8:38     ` Cyrill Gorcunov via Tarantool-patches
2021-04-06 20:02       ` Vladislav Shpilevoy via Tarantool-patches
2021-04-06 20:42         ` Cyrill Gorcunov via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 4/7] box/module_cache: introduce modules subsystem Cyrill Gorcunov via Tarantool-patches
2021-04-05 12:34   ` Serge Petrenko via Tarantool-patches
2021-04-05 15:52   ` Vladislav Shpilevoy via Tarantool-patches [this message]
2021-04-06 14:33     ` Cyrill Gorcunov via Tarantool-patches
2021-04-06 20:09       ` Vladislav Shpilevoy via Tarantool-patches
2021-04-06 22:05         ` Cyrill Gorcunov via Tarantool-patches
2021-04-06 23:43           ` Vladislav Shpilevoy via Tarantool-patches
2021-04-07  7:03             ` Cyrill Gorcunov via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 5/7] box/schema.func: switch to new module api Cyrill Gorcunov via Tarantool-patches
2021-04-05 15:56   ` Vladislav Shpilevoy via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 6/7] box: implement box.lib module Cyrill Gorcunov via Tarantool-patches
2021-04-05 16:04   ` Vladislav Shpilevoy via Tarantool-patches
2021-04-07 16:59     ` Cyrill Gorcunov via Tarantool-patches
2021-04-07 20:22       ` Cyrill Gorcunov via Tarantool-patches
2021-04-07 20:28         ` Vladislav Shpilevoy via Tarantool-patches
2021-04-07 20:37           ` Cyrill Gorcunov via Tarantool-patches
2021-04-07 20:45             ` Cyrill Gorcunov via Tarantool-patches
2021-04-07 21:04               ` Vladislav Shpilevoy via Tarantool-patches
2021-04-02 12:34 ` [Tarantool-patches] [PATCH v20 7/7] test: add box.lib test Cyrill Gorcunov via Tarantool-patches
2021-04-05 16:04   ` Vladislav Shpilevoy via Tarantool-patches
2021-04-05 15:45 ` [Tarantool-patches] [PATCH v20 0/7] box: implement box.lib Lua module Vladislav Shpilevoy via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=14b1c3fa-da30-e7f1-6d17-32e575d71272@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v20 4/7] box/module_cache: introduce modules subsystem' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox