[Tarantool-patches] [PATCH v15 10/11] box/cmod: implement cmod Lua module

Cyrill Gorcunov gorcunov at gmail.com
Fri Feb 5 21:54:35 MSK 2021


Currently, to run a "C" function from an external module one
has to register it first in the "_func" system space. This is
a problem if the node is in read-only mode (a replica).

Still, people would like to have a way to run such functions
even in read-only mode. For this purpose we implement the "cmod"
Lua module.

The cmod interface implies explicit module loading and unloading
before resolving symbols. For this purpose we introduce the
module_load and module_unload calls.

module_load tries to reuse the module cache if the shared library
has already been loaded. This speeds up module loading when a
complex application with a number of different Lua scripts uses
the same shared library.

Internally module_load checks the shared library file attributes
(device, inode, mtime, size) to make sure the library has not
been overwritten; otherwise the cache entry gets evicted and a
new instance is loaded instead. Note that all previous instances
of the loaded module are not changed and continue working as is.
This is one of the main differences from the box.schema.func
interface, which never expires cache entries until explicitly
"reloaded". Presumably we will deprecate the old interface
completely in time. For now, to keep backward compatibility, we
track modules in two different caches -- one for box.schema.func
and one for the cmod interface; they do not interfere.

@TarantoolBot document
Title: cmod module

Overview
========

The `cmod` module provides a way to create, delete and execute
`C` procedures from shared libraries. Unlike the `box.schema.func`
methods, functions created with `cmod` are not persistent
and live purely in memory. Once a node is turned off they
vanish. Their initial purpose is to be executed on
nodes which are running in read-only mode.

Module functions
================

`require('cmod').load(path) -> obj | error`
-------------------------------------------

Loads a module from `path` and returns an object instance
associated with the module, otherwise an error is thrown.

The `path` should not end with a shared library extension
(such as `.so`); only the file name shall be there.

Possible errors:

- IllegalParams: module path is either not supplied
  or not a string.
- SystemError: unable to open a module due to a system error.
- ClientError: a module does not exist.
- OutOfMemory: unable to allocate a module.

Example:

``` Lua
-- Without error handling
m = require('cmod').load('path/to/library')

-- With error handling: pcall returns a status flag first,
-- then either the module object or the error.
ok, m = pcall(require('cmod').load, 'path/to/library')
if not ok then
    print(m)
end
```

`module:unload() -> true | error`
---------------------------------

Unloads a module. Returns `true` on success, otherwise an error
is thrown. Once the module is unloaded one can't load new
functions from this module instance.

Possible errors:

- IllegalParams: a module is not supplied.
- IllegalParams: a module is already unloaded.

Example:

``` Lua
m = require('cmod').load('path/to/library')
--
-- do something with module
--
m:unload()
```

If functions from this module are still referenced elsewhere
in Lua code, they can still be executed because
the module stays in memory until the last reference
to it is closed.

If the module becomes a target of Lua's garbage collector
then unload is called implicitly.
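
The lifetime rule above is plain reference counting. Below is a
minimal sketch of the idea, not the code from this patch: the
`mod_sketch` type and its helpers are hypothetical names made up
for illustration, and `freed_flag` exists only to make the moment
of release observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a loaded shared library. */
struct mod_sketch {
	/* Active references: the module handle plus loaded functions. */
	int refs;
	/* Set to true when the last reference is dropped. */
	bool *freed_flag;
};

/* Create an instance holding one reference (the module handle). */
static struct mod_sketch *
mod_sketch_new(bool *freed_flag)
{
	struct mod_sketch *m = malloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	m->refs = 1;
	m->freed_flag = freed_flag;
	*freed_flag = false;
	return m;
}

/* Take one more reference, e.g. when a function is loaded. */
static void
mod_sketch_ref(struct mod_sketch *m)
{
	assert(m->refs > 0);
	m->refs++;
}

/* Drop one reference and release the instance on the last one. */
static void
mod_sketch_unref(struct mod_sketch *m)
{
	assert(m->refs > 0);
	if (--m->refs == 0) {
		*m->freed_flag = true;
		free(m);
	}
}
```

With this scheme, unloading the module handle while a function is
still loaded only drops one reference; the memory is released when
the function is unloaded too.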

`module:load(name) -> obj | error`
---------------------------------

Loads a new function with name `name` from the previously
loaded `module` and returns a callable object instance
associated with the function. On failure an error is thrown.

Possible errors:
 - IllegalParams: function name is either not supplied
   or not a string.
 - IllegalParams: attempt to load a function but module
   has been unloaded already.
 - ClientError: no such function in the module.
 - ClientError: module has been updated on disk and not
   reloaded.
 - OutOfMemory: unable to allocate a function.

Example:

``` Lua
-- Load a module if it has not been loaded yet.
m = require('cmod').load('path/to/library')
-- Load a function with the `foo` name from the module `m`.
func = m:load('foo')
```

If there is no need to load other functions from the same
module, the module may be unloaded immediately.

``` Lua
m = require('cmod').load('path/to/library')
func = m:load('foo')
m:unload()
```

`function:unload() -> true | error`
-----------------------------------

Unloads a function. Returns `true` on success, otherwise
an error is thrown.

Possible errors:
 - IllegalParams: the function is not supplied.
 - IllegalParams: the function does not exist.

Example:

``` Lua
m = require('cmod').load('path/to/library')
func = m:load('foo')
func:unload()
```

If the function becomes a target of Lua's garbage collector
then unload is called implicitly.

Executing a loaded function
===========================

Once a function is loaded it can be executed as an ordinary Lua call.
Let's consider the following example. We have a `C` function which
takes two numbers and returns their sum.

``` C
#include "module.h"
#include "msgpuck.h"

int
cfunc_sum(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
	uint32_t arg_count = mp_decode_array(&args);
	if (arg_count != 2) {
		return box_error_set(__FILE__, __LINE__, ER_PROC_C, "%s",
				     "invalid argument count");
	}
	uint64_t a = mp_decode_uint(&args);
	uint64_t b = mp_decode_uint(&args);

	char res[16];
	char *end = mp_encode_uint(res, a + b);
	box_return_mp(ctx, res, end);
	return 0;
}
```

The name of the function is `cfunc_sum` and the function is built into
the `cfunc.so` shared library.

First we should load it:

``` Lua
m = require('cmod').load('cfunc')
cfunc_sum = m:load('cfunc_sum')
```

Once successfully loaded we can execute it. Note that unlike regular
Lua functions the context of `C` functions is different. They never
throw an exception but return values in the `true|nil, res` form,
where the first value is set to `nil` in case of an error and `res`
carries the error description.

Let's call `cfunc_sum` with a wrong number of arguments:

``` Lua
local ok, res = cfunc_sum()
if not ok then
    print(res)
end
```

We will see the `"invalid argument count"` message in the output.
The error message has been set by `box_error_set` in the `C`
code above.

On success the first returned value is set to `true` and `res`
represents the function execution result.

``` Lua
local ok, res = cfunc_sum(1, 2)
assert(ok);
print(res)
```

We will see the number `3` in the output.

Functions might return multiple results. In this case the first
returned value is `true` and the rest are the ones provided by
the function.

Module and function caches
==========================

Loading a module is a relatively slow procedure because the operating
system needs to read the library, resolve its symbols, and so on.
Thus, to speed up this procedure, when a module is loaded for the
first time we put it into an internal cache. If the module is already
in the cache and a new request to load it comes in, we simply
reuse the previous copy immediately. The same applies to functions:
while symbol lookup is much faster than loading a module from
disk, it is not completely cheap, thus we cache functions as
well. Function entries in the cache are identified by the module
path and the function name.
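
The function cache keying can be pictured with the following
sketch. It is an illustration under assumptions, not the patch
code: the patch uses a hash table, while this sketch uses a small
fixed-size array, and all `func_cache_sketch*` and `func_key_build`
names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { FUNC_CACHE_MAX = 8, FUNC_KEY_MAX = 64 };

/* Hypothetical tiny cache: keys are "package.function" strings. */
struct func_cache_sketch {
	char keys[FUNC_CACHE_MAX][FUNC_KEY_MAX];
	int refs[FUNC_CACHE_MAX];
	int count;
};

/* Build the key; returns its length or -1 if it does not fit. */
static int
func_key_build(char *buf, size_t size, const char *package,
	       const char *func_name)
{
	int len = snprintf(buf, size, "%s.%s", package, func_name);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}

/* Returns the slot index for the key, or -1 if it is not cached. */
static int
func_cache_sketch_find(const struct func_cache_sketch *c, const char *key)
{
	for (int i = 0; i < c->count; i++) {
		if (strcmp(c->keys[i], key) == 0)
			return i;
	}
	return -1;
}

/*
 * On a cache hit take one more reference to the existing entry,
 * on a miss add a new entry. Returns the slot or -1 on overflow.
 */
static int
func_cache_sketch_load(struct func_cache_sketch *c, const char *package,
		       const char *func_name)
{
	assert(c != NULL);
	char key[FUNC_KEY_MAX];
	if (func_key_build(key, sizeof(key), package, func_name) < 0)
		return -1;
	int slot = func_cache_sketch_find(c, key);
	if (slot >= 0) {
		c->refs[slot]++;
		return slot;
	}
	if (c->count == FUNC_CACHE_MAX)
		return -1;
	slot = c->count++;
	/* The key is known to fit, see func_key_build() above. */
	strcpy(c->keys[slot], key);
	c->refs[slot] = 1;
	return slot;
}
```

Loading the same function twice lands in the same slot and only
bumps its reference count, which is the saving the cache is
meant to provide.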

Still, the following situation is possible: the module is loaded but
the user recompiles it and overwrites it on the storage device. Thus
the cache content no longer matches the shared library on the disk.

To handle this situation we use a so-called cache invalidation
procedure: on every attempt to load the same module again we test
low-level file attributes (such as storage device number, inode, size
and modification time) and if they differ from the ones kept by the
cache then the old module is marked as orphan, a new instance is
loaded and becomes the valid cache entry. The module state can be
seen in the module variable output:

```Lua
m = require('cmod').load('cfunc')
m
```
which will output
```
tarantool> m
---
- path: cfunc
  state: cached
```

The `state` is either `cached` if the module is present in the cache
and valid, or `orphan` if the entry in the cache has been updated.
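
The attribute test behind the `cached`/`orphan` decision can be
sketched as below. This is a simplified illustration, not the
patch code: `stat_is_same` and `cache_entry_is_valid` are
hypothetical names, and the sketch compares `st_mtime` with
one-second granularity for portability, which is coarser than
a full timestamp comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true when two stat snapshots describe
 * the same, unmodified file.
 */
static bool
stat_is_same(const struct stat *a, const struct stat *b)
{
	return a->st_dev == b->st_dev &&
	       a->st_ino == b->st_ino &&
	       a->st_size == b->st_size &&
	       a->st_mtime == b->st_mtime;
}

/*
 * Hypothetical helper: returns 1 if the file at path still
 * matches the cached snapshot, 0 if it has changed (the cache
 * entry should be replaced), -1 if stat() failed.
 */
static int
cache_entry_is_valid(const char *path, const struct stat *cached)
{
	assert(path != NULL && cached != NULL);
	struct stat st;
	if (stat(path, &st) != 0)
		return -1;
	return stat_is_same(&st, cached) ? 1 : 0;
}
```

A result of 0 corresponds to the case where the old instance
becomes `orphan` and a fresh copy is loaded in its place.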

If there is a strong need to reload a module then it is better
to unload all functions and the module explicitly, load it from
scratch and load all functions again. This will prevent
unexpected errors.
---
 src/box/CMakeLists.txt |   1 +
 src/box/lua/cmod.c     | 610 +++++++++++++++++++++++++++++++++++++++++
 src/box/lua/cmod.h     |  24 ++
 src/box/lua/init.c     |   2 +
 src/box/module_cache.c | 211 +++++++++++---
 src/box/module_cache.h |  36 +++
 6 files changed, 842 insertions(+), 42 deletions(-)
 create mode 100644 src/box/lua/cmod.c
 create mode 100644 src/box/lua/cmod.h

diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 339e2c8a9..feba5a037 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -195,6 +195,7 @@ add_library(box STATIC
     lua/init.c
     lua/call.c
     lua/cfg.cc
+    lua/cmod.c
     lua/console.c
     lua/serialize_lua.c
     lua/tuple.c
diff --git a/src/box/lua/cmod.c b/src/box/lua/cmod.c
new file mode 100644
index 000000000..60fd2e812
--- /dev/null
+++ b/src/box/lua/cmod.c
@@ -0,0 +1,610 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#include <string.h>
+#include <lua.h>
+
+#include "assoc.h"
+#include "diag.h"
+
+#include "box/module_cache.h"
+#include "box/error.h"
+#include "box/port.h"
+#include "tt_static.h"
+
+#include "trivia/util.h"
+#include "lua/utils.h"
+
+/**
+ * Function descriptor.
+ */
+struct cmod_func {
+	/**
+	 * Symbol descriptor for the function in
+	 * an associated module.
+	 */
+	struct module_sym mod_sym;
+	/**
+	 * Length of @a name member.
+	 */
+	size_t len;
+	/**
+	 * Count of active references to the function.
+	 */
+	int64_t refs;
+	/**
+	 * Module path with function name separated
+	 * by a point, like "module.func".
+	 */
+	char name[0];
+};
+
+/** A type to find a module from an object. */
+static const char *cmod_module_uname = "cmod_module_uname";
+
+/** A type to find a function from an object. */
+static const char *cmod_func_uname = "cmod_func_uname";
+
+/** Get data associated with an object. */
+static void *
+get_udata(struct lua_State *L, const char *uname)
+{
+	void **pptr = luaL_testudata(L, 1, uname);
+	return pptr != NULL ? *pptr : NULL;
+}
+
+/** Set data to a new value. */
+static void
+set_udata(struct lua_State *L, const char *uname, void *ptr)
+{
+	void **pptr = luaL_testudata(L, 1, uname);
+	assert(pptr != NULL);
+	*pptr = ptr;
+}
+
+/** Setup a new data and associate it with an object. */
+static void
+new_udata(struct lua_State *L, const char *uname, void *ptr)
+{
+	*(void **)lua_newuserdata(L, sizeof(void *)) = ptr;
+	luaL_getmetatable(L, uname);
+	lua_setmetatable(L, -2);
+}
+
+/**
+ * Function name to cmod_func hash. The name includes
+ * module package path without file extension.
+ */
+static struct mh_strnptr_t *func_hash = NULL;
+
+/**
+ * Find function in cmod_func hash.
+ */
+struct cmod_func *
+func_cache_find(const char *name, size_t name_len)
+{
+	mh_int_t e = mh_strnptr_find_inp(func_hash, name, name_len);
+	if (e == mh_end(func_hash))
+		return NULL;
+	return mh_strnptr_node(func_hash, e)->val;
+}
+
+/**
+ * Delete a function instance from cmod_func hash.
+ */
+static void
+func_cache_del(struct cmod_func *cf)
+{
+	assert(cf->refs == 0);
+
+	mh_int_t e = mh_strnptr_find_inp(func_hash, cf->name, cf->len);
+	assert(e != mh_end(func_hash));
+	mh_strnptr_del(func_hash, e, NULL);
+}
+
+/**
+ * Add a function instance into cmod_func hash.
+ */
+static int
+func_cache_add(struct cmod_func *cf)
+{
+	const struct mh_strnptr_node_t nd = {
+		.str	= cf->name,
+		.len	= cf->len,
+		.hash	= mh_strn_hash(cf->name, cf->len),
+		.val	= cf,
+	};
+
+	mh_int_t e = mh_strnptr_put(func_hash, &nd, NULL, NULL);
+	if (e == mh_end(func_hash)) {
+		diag_set(OutOfMemory, sizeof(nd),
+			 "malloc", "cmod_func node");
+		return -1;
+	}
+	return 0;
+}
+
+/**
+ * Unload a symbol and free a function instance.
+ */
+static void
+func_delete(struct cmod_func *cf)
+{
+	assert(cf->refs == 0);
+	module_sym_unload(&cf->mod_sym);
+	TRASH(cf);
+	free(cf);
+}
+
+/**
+ * Increase reference to a function.
+ */
+static void
+func_ref(struct cmod_func *cf)
+{
+	assert(cf->refs >= 0);
+	cf->refs++;
+}
+
+/**
+ * Decrease a function reference and delete it if last one.
+ */
+static void
+func_unref(struct cmod_func *cf)
+{
+	assert(cf->refs > 0);
+	if (cf->refs-- == 1) {
+		func_cache_del(cf);
+		func_delete(cf);
+	}
+}
+
+/**
+ * Allocate a new function instance and resolve a symbol address.
+ *
+ * @param module module the function load from.
+ * @param name package path and a function name, ie "module.foo".
+ * @param len length of @a name.
+ *
+ * @returns function instance on success, NULL otherwise setting diag area.
+ */
+static struct cmod_func *
+func_new(struct module *m, const char *name, size_t len)
+{
+	size_t size = sizeof(struct cmod_func) + len + 1;
+	struct cmod_func *cf = malloc(size);
+	if (cf == NULL) {
+		diag_set(OutOfMemory, size, "malloc", "cf");
+		return NULL;
+	}
+
+	cf->mod_sym.addr	= NULL;
+	cf->mod_sym.module	= m;
+	cf->mod_sym.name	= cf->name;
+	cf->len			= len;
+	cf->refs		= 0;
+
+	memcpy(cf->name, name, len);
+	cf->name[len] = '\0';
+
+	if (module_sym_load(&cf->mod_sym, false) != 0) {
+		func_delete(cf);
+		return NULL;
+	}
+
+	func_ref(cf);
+	return cf;
+}
+
+/**
+ * Load a new function.
+ *
+ * This function takes a function name from the caller
+ * stack @a L and creates a new function object. If
+ * the function is already loaded we simply return
+ * a reference to existing one.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: function name is either not supplied
+ *   or not a string.
+ * - IllegalParams: function references limit exceeded.
+ * - OutOfMemory: unable to allocate a function.
+ * - ClientError: no such function in the module.
+ * - ClientError: module has been updated on disk and not
+ *   yet unloaded and loaded back.
+ *
+ * @returns function object on success or throws an error.
+ */
+static int
+lcmod_func_load(struct lua_State *L)
+{
+	const char *method = "function = module:load";
+
+	if (lua_gettop(L) != 2 || !lua_isstring(L, 2)) {
+		const char *fmt =
+			"Expects %s(\'name\') but no name passed";
+		diag_set(IllegalParams, fmt, method);
+		return luaT_error(L);
+	}
+
+	struct module *m = get_udata(L, cmod_module_uname);
+	if (m == NULL) {
+		const char *fmt =
+			"Expects %s(\'name\') but not module object passed";
+		diag_set(IllegalParams, fmt, method);
+		return luaT_error(L);
+	}
+
+	const char *func_name = lua_tostring(L, 2);
+	const char *name = tt_sprintf("%s.%s", m->package, func_name);
+	size_t len = strlen(name);
+
+	/*
+	 * We try to reuse already allocated function in
+	 * case if someone is loading same function twice.
+	 * This will save memory and eliminates redundant
+	 * symbol address resolving.
+	 */
+	struct cmod_func *cf = func_cache_find(name, len);
+	if (cf == NULL) {
+		cf = func_new(m, name, len);
+		if (cf == NULL)
+			return luaT_error(L);
+		if (func_cache_add(cf) != 0) {
+			func_unref(cf);
+			return luaT_error(L);
+		}
+	} else {
+		func_ref(cf);
+	}
+
+	new_udata(L, cmod_func_uname, cf);
+	return 1;
+}
+
+/**
+ * Unload a function.
+ *
+ * This function takes a function object from
+ * the caller stack @a L and unloads it.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: function is not supplied.
+ * - IllegalParams: the function does not exist.
+ *
+ * @returns true on success or throws an error.
+ */
+static int
+lcmod_func_unload(struct lua_State *L)
+{
+	if (lua_gettop(L) != 1) {
+		diag_set(IllegalParams, "Expects function:unload()");
+		return luaT_error(L);
+	}
+
+	struct cmod_func *cf = get_udata(L, cmod_func_uname);
+	if (cf == NULL) {
+		diag_set(IllegalParams, "The function is unloaded");
+		return luaT_error(L);
+	}
+
+	set_udata(L, cmod_func_uname, NULL);
+	func_unref(cf);
+
+	lua_pushboolean(L, true);
+	return 1;
+}
+
+/**
+ * Load a new module.
+ *
+ * This function takes a module path from the caller
+ * stack @a L and creates a new module object.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: module path is either not supplied
+ *   or not a string.
+ * - SystemError: unable to open a module due to a system error.
+ * - ClientError: a module does not exist.
+ * - OutOfMemory: unable to allocate a module.
+ *
+ * @returns module object on success or throws an error.
+ */
+static int
+lcmod_module_load(struct lua_State *L)
+{
+	if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) {
+		diag_set(IllegalParams, "Expects cmod.load(\'name\') "
+			 "but no name passed");
+		return luaT_error(L);
+	}
+
+	size_t name_len;
+	const char *name = lua_tolstring(L, 1, &name_len);
+
+	struct module *module = module_load(name, &name[name_len]);
+	if (module == NULL)
+		return luaT_error(L);
+
+	new_udata(L, cmod_module_uname, module);
+	return 1;
+}
+
+/**
+ * Unload a module.
+ *
+ * This function takes a module object from
+ * the caller stack @a L and unloads it.
+ *
+ * If there are some active functions left then
+ * module won't be freed internally until last function
+ * from this module is unloaded, this is guaranteed by
+ * module_cache engine.
+ *
+ * Possible errors:
+ *
+ * - IllegalParams: module is not supplied.
+ * - IllegalParams: module already unloaded.
+ *
+ * @returns true on success or throws an error.
+ */
+static int
+lcmod_module_unload(struct lua_State *L)
+{
+	if (lua_gettop(L) != 1) {
+		diag_set(IllegalParams, "Expects module:unload()");
+		return luaT_error(L);
+	}
+
+	struct module *m = get_udata(L, cmod_module_uname);
+	if (m == NULL) {
+		diag_set(IllegalParams, "The module is already unloaded");
+		return luaT_error(L);
+	}
+	set_udata(L, cmod_module_uname, NULL);
+	module_unload(m);
+	lua_pushboolean(L, true);
+	return 1;
+}
+
+static const char *
+module_state_str(struct module *m)
+{
+	return module_is_orphan(m) ? "orphan" : "cached";
+}
+
+/**
+ * Handle __index request for a module object.
+ */
+static int
+lcmod_module_index(struct lua_State *L)
+{
+	/*
+	 * Instead of showing userdata pointer
+	 * lets provide a serialized value.
+	 */
+	lua_getmetatable(L, 1);
+	lua_pushvalue(L, 2);
+	lua_rawget(L, -2);
+	if (!lua_isnil(L, -1))
+		return 1;
+
+	struct module *m = get_udata(L, cmod_module_uname);
+	if (m == NULL) {
+		lua_pushnil(L);
+		return 1;
+	}
+
+	const char *key = lua_tostring(L, 2);
+	if (key == NULL || lua_type(L, 2) != LUA_TSTRING) {
+		diag_set(IllegalParams,
+			 "Bad params, use __index(obj, <string>)");
+		return luaT_error(L);
+	}
+
+	if (strcmp(key, "path") == 0) {
+		lua_pushstring(L, m->package);
+		return 1;
+	} else if (strcmp(key, "state") == 0) {
+		lua_pushstring(L, module_state_str(m));
+		return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * Module handle representation for REPL (console).
+ */
+static int
+lcmod_module_serialize(struct lua_State *L)
+{
+	struct module *m = get_udata(L, cmod_module_uname);
+	if (m == NULL) {
+		lua_pushnil(L);
+		return 1;
+	}
+
+	lua_createtable(L, 0, 2);
+	lua_pushstring(L, m->package);
+	lua_setfield(L, -2, "path");
+	lua_pushstring(L, module_state_str(m));
+	lua_setfield(L, -2, "state");
+
+	return 1;
+}
+
+/**
+ * Collect a module handle.
+ */
+static int
+lcmod_module_gc(struct lua_State *L)
+{
+	struct module *m = get_udata(L, cmod_module_uname);
+	if (m != NULL) {
+		set_udata(L, cmod_module_uname, NULL);
+		module_unload(m);
+	}
+	return 0;
+}
+
+/**
+ * Function handle representation for REPL (console).
+ */
+static int
+lcmod_func_serialize(struct lua_State *L)
+{
+	struct cmod_func *cf = get_udata(L, cmod_func_uname);
+	if (cf == NULL) {
+		lua_pushnil(L);
+		return 1;
+	}
+
+	lua_createtable(L, 0, 1);
+	lua_pushstring(L, cf->name);
+	lua_setfield(L, -2, "name");
+
+	return 1;
+}
+
+/**
+ * Handle __index request for a function object.
+ */
+static int
+lcmod_func_index(struct lua_State *L)
+{
+	/*
+	 * Instead of showing userdata pointer
+	 * lets provide a serialized value.
+	 */
+	lua_getmetatable(L, 1);
+	lua_pushvalue(L, 2);
+	lua_rawget(L, -2);
+	if (!lua_isnil(L, -1))
+		return 1;
+
+	struct cmod_func *cf = get_udata(L, cmod_func_uname);
+	if (cf == NULL) {
+		lua_pushnil(L);
+		return 1;
+	}
+
+	const char *key = lua_tostring(L, 2);
+	if (key == NULL || lua_type(L, 2) != LUA_TSTRING) {
+		diag_set(IllegalParams,
+			 "Bad params, use __index(obj, <string>)");
+		return luaT_error(L);
+	}
+
+	if (strcmp(key, "name") == 0) {
+		lua_pushstring(L, cf->name);
+		return 1;
+	}
+
+	return 0;
+}
+
+/**
+ * Collect function handle if there is no active loads left.
+ */
+static int
+lcmod_func_gc(struct lua_State *L)
+{
+	struct cmod_func *cf = get_udata(L, cmod_func_uname);
+	if (cf != NULL) {
+		set_udata(L, cmod_func_uname, NULL);
+		func_unref(cf);
+	}
+	return 0;
+}
+
+/**
+ * Call a function by its name from the Lua code.
+ */
+static int
+lcmod_func_call(struct lua_State *L)
+{
+	struct cmod_func *cf = get_udata(L, cmod_func_uname);
+	if (cf == NULL) {
+		diag_set(IllegalParams, "The function is unloaded");
+		return luaT_error(L);
+	}
+
+	/*
+	 * FIXME: We should get rid of luaT_newthread but this
+	 * requires serious modifications. In particular
+	 * port_lua_do_dump uses tarantool_L reference and
+	 * coro_ref must be valid as well.
+	 */
+	lua_State *args_L = luaT_newthread(tarantool_L);
+	if (args_L == NULL)
+		return luaT_error(L);
+
+	int coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
+	lua_xmove(L, args_L, lua_gettop(L) - 1);
+
+	struct port args;
+	port_lua_create(&args, args_L);
+	((struct port_lua *)&args)->ref = coro_ref;
+
+	struct port ret;
+	if (module_sym_call(&cf->mod_sym, &args, &ret) != 0) {
+		port_destroy(&args);
+		return luaT_error(L);
+	}
+
+	int top = lua_gettop(L);
+	lua_pushboolean(L, true);
+	port_dump_lua(&ret, L, true);
+	int cnt = lua_gettop(L) - top;
+
+	port_destroy(&ret);
+	port_destroy(&args);
+
+	return cnt;
+}
+
+/**
+ * Initialize cmod module.
+ */
+void
+box_lua_cmod_init(struct lua_State *L)
+{
+	func_hash = mh_strnptr_new();
+	if (func_hash == NULL) {
+		panic("Can't allocate cmod hash table");
+	}
+
+	static const struct luaL_Reg top_methods[] = {
+		{ "load",		lcmod_module_load	},
+		{ NULL, NULL },
+	};
+	luaL_register_module(L, "cmod", top_methods);
+	lua_pop(L, 1);
+
+	static const struct luaL_Reg module_methods[] = {
+		{ "load",		lcmod_func_load		},
+		{ "unload",		lcmod_module_unload	},
+		{ "__index",		lcmod_module_index	},
+		{ "__serialize",	lcmod_module_serialize	},
+		{ "__gc",		lcmod_module_gc		},
+		{ NULL, NULL },
+	};
+	luaL_register_type(L, cmod_module_uname, module_methods);
+
+	static const struct luaL_Reg func_methods[] = {
+		{ "unload",		lcmod_func_unload	},
+		{ "__index",		lcmod_func_index	},
+		{ "__serialize",	lcmod_func_serialize	},
+		{ "__call",		lcmod_func_call		},
+		{ "__gc",		lcmod_func_gc		},
+		{ NULL, NULL },
+	};
+	luaL_register_type(L, cmod_func_uname, func_methods);
+}
diff --git a/src/box/lua/cmod.h b/src/box/lua/cmod.h
new file mode 100644
index 000000000..f0ea2d34d
--- /dev/null
+++ b/src/box/lua/cmod.h
@@ -0,0 +1,24 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2021, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#pragma once
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+struct lua_State;
+
+/**
+ * Initialize cmod Lua module.
+ *
+ * @param L Lua state where to register the cmod module.
+ */
+void
+box_lua_cmod_init(struct lua_State *L);
+#if defined(__cplusplus)
+}
+#endif /* defined(__cplusplus) */
diff --git a/src/box/lua/init.c b/src/box/lua/init.c
index fbcdfb20b..bad2b7ca9 100644
--- a/src/box/lua/init.c
+++ b/src/box/lua/init.c
@@ -60,6 +60,7 @@
 #include "box/lua/cfg.h"
 #include "box/lua/xlog.h"
 #include "box/lua/console.h"
+#include "box/lua/cmod.h"
 #include "box/lua/tuple.h"
 #include "box/lua/execute.h"
 #include "box/lua/key_def.h"
@@ -465,6 +466,7 @@ box_lua_init(struct lua_State *L)
 	box_lua_tuple_init(L);
 	box_lua_call_init(L);
 	box_lua_cfg_init(L);
+	box_lua_cmod_init(L);
 	box_lua_slab_init(L);
 	box_lua_index_init(L);
 	box_lua_space_init(L);
diff --git a/src/box/module_cache.c b/src/box/module_cache.c
index 22b906fd7..e96cbd1f8 100644
--- a/src/box/module_cache.c
+++ b/src/box/module_cache.c
@@ -19,11 +19,33 @@
 #include "error.h"
 #include "lua/utils.h"
 #include "libeio/eio.h"
+#include "trivia/util.h"
 
 #include "module_cache.h"
 
-/** Modules name to descriptor hash. */
+/**
+ * Modules names to descriptor hashes. The first one
+ * for modules created with old `box.schema.func`
+ * interface.
+ *
+ * Here is an important moment for backward compatibility.
+ * The `box.schema.func` operations always use cache and
+ * if a module is updated on a storage device or even
+ * no longer present, then lazy symbol resolving is done
+ * via previously loaded copy. To update modules one have
+ * to reload them manually.
+ *
+ * In turn new API implies to use module_load/unload explicit
+ * interface, and when module is re-loaded from cache then
+ * we make a cache validation to be sure the copy on storage
+ * is up to date.
+ *
+ * Due to all this we have to keep two hash tables. Probably
+ * we should deprecate explicit reload at all and require
+ * manual load/unload instead. But later.
+ */
 static struct mh_strnptr_t *box_schema_hash = NULL;
+static struct mh_strnptr_t *mod_hash = NULL;
 
 /**
  * Parsed symbol and package names.
@@ -52,7 +74,7 @@ struct func_name {
 static inline struct mh_strnptr_t *
 hash_tbl(bool is_box_schema)
 {
-	return is_box_schema ? box_schema_hash : NULL;
+	return is_box_schema ? box_schema_hash : mod_hash;
 }
 
 /***
@@ -160,7 +182,7 @@ module_set_orphan(struct module *module)
 /**
  * Test if module is out of cache.
  */
-static bool
+bool
 module_is_orphan(struct module *module)
 {
 	return module->hash == NULL;
@@ -289,13 +311,9 @@ module_unref(struct module *module)
  * for cases of a function reload.
  */
 static struct module *
-module_load(struct mh_strnptr_t *h, const char *package,
-	    const char *package_end)
+module_new(const char *path, struct mh_strnptr_t *h,
+	   const char *package, const char *package_end)
 {
-	char path[PATH_MAX];
-	if (module_find(package, package_end, path, sizeof(path)) != 0)
-		return NULL;
-
 	int package_len = package_end - package;
 	struct module *module = malloc(sizeof(*module) + package_len + 1);
 	if (module == NULL) {
@@ -334,8 +352,8 @@ module_load(struct mh_strnptr_t *h, const char *package,
 		goto error;
 	}
 
-	struct stat st;
-	if (stat(path, &st) < 0) {
+	struct stat *st = &module->st;
+	if (stat(path, st) < 0) {
 		diag_set(SystemError, "failed to stat() module %s", path);
 		goto error;
 	}
@@ -347,7 +365,7 @@ module_load(struct mh_strnptr_t *h, const char *package,
 	}
 
 	int dest_fd = open(load_name, O_WRONLY | O_CREAT | O_TRUNC,
-			   st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
+			   st->st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
 	if (dest_fd < 0) {
 		diag_set(SystemError, "failed to open file %s for writing ",
 			 load_name);
@@ -355,10 +373,10 @@ module_load(struct mh_strnptr_t *h, const char *package,
 		goto error;
 	}
 
-	off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st.st_size);
+	off_t ret = eio_sendfile_sync(dest_fd, source_fd, 0, st->st_size);
 	close(source_fd);
 	close(dest_fd);
-	if (ret != st.st_size) {
+	if (ret != st->st_size) {
 		diag_set(SystemError, "failed to copy DSO %s to %s",
 			 path, load_name);
 		goto error;
@@ -403,30 +421,52 @@ module_sym(struct module *module, const char *name)
 int
 module_sym_load(struct module_sym *mod_sym, bool is_box_schema)
 {
+	struct module *cached, *module;
 	assert(mod_sym->addr == NULL);
 
 	struct func_name name;
 	func_split_name(mod_sym->name, &name);
 
-	/*
-	 * In case if module has been loaded already by
-	 * some previous call we can eliminate redundant
-	 * loading and take it from the cache.
-	 */
-	struct module *cached, *module;
-	struct mh_strnptr_t *h = hash_tbl(is_box_schema);
-	cached = module_cache_find(h, name.package, name.package_end);
-	if (cached == NULL) {
-		module = module_load(h, name.package, name.package_end);
-		if (module == NULL)
-			return -1;
-		if (module_cache_add(module) != 0) {
-			module_unref(module);
-			return -1;
+	if (is_box_schema) {
+		/*
+		 * Deprecated interface -- request comes
+		 * from box.schema.func.
+		 *
+		 * In case if module has been loaded already by
+		 * some previous call we can eliminate redundant
+		 * loading and take it from the cache.
+		 */
+		struct mh_strnptr_t *h = hash_tbl(is_box_schema);
+		cached = module_cache_find(h, name.package, name.package_end);
+		if (cached == NULL) {
+			char path[PATH_MAX];
+			if (module_find(name.package, name.package_end,
+					path, sizeof(path)) != 0) {
+				return -1;
+			}
+			module = module_new(path, h, name.package,
+					    name.package_end);
+			if (module == NULL)
+				return -1;
+			if (module_cache_add(module) != 0) {
+				module_unref(module);
+				return -1;
+			}
+		} else {
+			module_ref(cached);
+			module = cached;
 		}
+		mod_sym->module = module;
 	} else {
-		module_ref(cached);
-		module = cached;
+		/*
+		 * New approach is always load module
+		 * explicitly and pass it inside symbol,
+		 * the reference to the module already has
+		 * to be incremented.
+		 */
+		assert(mod_sym->module->refs > 0);
+		module_ref(mod_sym->module);
+		module = mod_sym->module;
 	}
 
 	mod_sym->addr = module_sym(module, name.sym);
@@ -435,7 +475,6 @@ module_sym_load(struct module_sym *mod_sym, bool is_box_schema)
 		return -1;
 	}
 
-	mod_sym->module = module;
 	rlist_add(&module->funcs_list, &mod_sym->item);
 	return 0;
 }
@@ -514,6 +553,74 @@ module_sym_call(struct module_sym *mod_sym, struct port *args,
 	return rc;
 }
 
+struct module *
+module_load(const char *package, const char *package_end)
+{
+	char path[PATH_MAX];
+	if (module_find(package, package_end, path, sizeof(path)) != 0)
+		return NULL;
+
+	struct module *cached, *module;
+	struct mh_strnptr_t *h = hash_tbl(false);
+	cached = module_cache_find(h, package, package_end);
+	if (cached == NULL) {
+		module = module_new(path, h, package, package_end);
+		if (module == NULL)
+			return NULL;
+		if (module_cache_add(module) != 0) {
+			module_unref(module);
+			return NULL;
+		}
+		return module;
+	}
+
+	struct stat st;
+	if (stat(path, &st) != 0) {
+		diag_set(SystemError, "module: stat() module %s", path);
+		return NULL;
+	}
+
+	/*
+	 * When module comes from cache make sure that
+	 * it is not changed on the storage device. The
+	 * test below still can miss update if cpu data
+	 * been manually moved backward and device/inode
+	 * persisted but this is a really rare situation.
+	 *
+	 * If update is needed one can simply "touch file.so"
+	 * to invalidate the cache entry.
+	 */
+	if (cached->st.st_dev == st.st_dev &&
+	    cached->st.st_ino == st.st_ino &&
+	    cached->st.st_size == st.st_size &&
+	    memcmp(&cached->st.st_mtim, &st.st_mtim,
+		   sizeof(st.st_mtim)) == 0) {
+		module_ref(cached);
+		return cached;
+	}
+
+	/*
+	 * Load a new module, update the cache
+	 * and orphan an old module instance.
+	 */
+	module = module_new(path, h, package, package_end);
+	if (module == NULL)
+		return NULL;
+	if (module_cache_update(module) != 0) {
+		module_unref(module);
+		return NULL;
+	}
+
+	module_set_orphan(cached);
+	return module;
+}
+
+void
+module_unload(struct module *module)
+{
+	module_unref(module);
+}
+
 int
 module_reload(const char *package, const char *package_end)
 {
@@ -529,7 +636,11 @@ module_reload(const char *package, const char *package_end)
 		return -1;
 	}
 
-	new = module_load(box_schema_hash, package, package_end);
+	char path[PATH_MAX];
+	if (module_find(package, package_end, path, sizeof(path)) != 0)
+		return -1;
+
+	new = module_new(path, box_schema_hash, package, package_end);
 	if (new == NULL)
 		return -1;
 
@@ -606,11 +717,21 @@ module_reload(const char *package, const char *package_end)
 int
 module_init(void)
 {
-	box_schema_hash = mh_strnptr_new();
-	if (box_schema_hash == NULL) {
-		diag_set(OutOfMemory, sizeof(*box_schema_hash),
-			 "malloc", "modules box_schema_hash");
-		return -1;
+	struct mh_strnptr_t **ht[] = {
+		&box_schema_hash,
+		&mod_hash,
+	};
+	for (size_t i = 0; i < lengthof(ht); i++) {
+		*ht[i] = mh_strnptr_new();
+		if (*ht[i] == NULL) {
+			diag_set(OutOfMemory, sizeof(*ht[i]),
+				 "malloc", "modules hash");
+			for (ssize_t j = i - 1; j >= 0; j--) {
+				mh_strnptr_delete(*ht[j]);
+				*ht[j] = NULL;
+			}
+			return -1;
+		}
 	}
 	return 0;
 }
@@ -618,12 +739,18 @@ module_init(void)
 void
 module_free(void)
 {
-	struct mh_strnptr_t *h = box_schema_hash;
-	while (mh_size(h) > 0) {
+	struct mh_strnptr_t **ht[] = {
+		&box_schema_hash,
+		&mod_hash,
+	};
+	for (size_t i = 0; i < lengthof(ht); i++) {
+		struct mh_strnptr_t *h = *ht[i];
+
 		mh_int_t i = mh_first(h);
 		struct module *m = mh_strnptr_node(h, i)->val;
 		module_unref(m);
+
+		mh_strnptr_delete(h);
+		*ht[i] = NULL;
 	}
-	mh_strnptr_delete(box_schema_hash);
-	box_schema_hash = NULL;
 }
diff --git a/src/box/module_cache.h b/src/box/module_cache.h
index 875f2eb3c..17a2e27bb 100644
--- a/src/box/module_cache.h
+++ b/src/box/module_cache.h
@@ -6,6 +6,10 @@
 
 #pragma once
 
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
 #include "small/rlist.h"
 
 #if defined(__cplusplus)
@@ -48,6 +52,10 @@ struct module {
 	 * Count of active references to the module.
 	 */
 	int64_t refs;
+	/**
+	 * Storage stat for identity check.
+	 */
+	struct stat st;
 	/**
 	 * Module's package name.
 	 */
@@ -76,6 +84,15 @@ struct module_sym {
 	char *name;
 };
 
+/**
+ * Test if module is orphan and cache carries
+ * up to date version instead.
+ *
+ * @retval true if module is orphan, false otherwise.
+ */
+bool
+module_is_orphan(struct module *module);
+
 /**
  * Load a new module symbol.
  *
@@ -112,6 +129,25 @@ int
 module_sym_call(struct module_sym *mod_sym, struct port *args,
 		struct port *ret);
 
+/**
+ * Load new module instance.
+ *
+ * @param package shared library path start.
+ * @param package_end shared library path end.
+ *
+ * @return module instance on success, NULL otherwise, diag is set.
+ */
+struct module *
+module_load(const char *package, const char *package_end);
+
+/**
+ * Unload module instance.
+ *
+ * @param module instance to unload.
+ */
+void
+module_unload(struct module *module);
+
 /**
  * Reload a module and all associated symbols.
  *
-- 
2.29.2


