[tarantool-patches] Re: [PATCH v1 2/3] schema: extend _func space to persist lua functions

Kirill Shcherbatov kshcherbatov at tarantool.org
Fri May 17 15:33:09 MSK 2019


Hi! Thank you for your feedback.

> The column name should be 'code'. There is a separate column
> 'language' that defines function language.
> 
>> box.schema.func.create('sum', {lua_code = lua_code})
> 
> bikeshed, but I would call both the column and the option 'body',
> or 'text', not 'code'.

This field is called 'body' now.

> 
> While we are at it, I would also add 'type' - function or
> procedure, and other related to SQL procedures fields.
> 
> See for example here:
> 
> https://mariadb.com/kb/en/library/mysqlproc-table/
> (not all of these fields are relevant, and perhaps we need an
> 'options' map, to make the whole thing easily extensible in the
> future, but it's better to put as many fields as possible into
> columns, not options).

Thank you for this reference.
I think we'll need 'is_deterministic' flag soon enough (in scope of
functional indexes), so I've introduced it.
To make the process of introducing such options extensible,
@locker and I decided to put 'is_deterministic' and other new
flags in the 'opts' map field. I've reused existent opts_decode
mechanism in scope of this patch.

> This belongs to an own function in util.c, luaT_prepare_sandbox(). It could
> perhaps be done entirely in Lua and in a more efficient way.
> Perhaps you could have a prepared sandbox which you could just
> clone. Please investigate.

I've introduced luaT_get_sandbox public method that clones an existent
'defaullt' sandbox and pushes it on the top of Lua stack in utils.h.
They use private references luaL_default_sandbox_ref and
luaL_deepcopy_func_ref are initialized on first demand.

I've discussed this problem with @Totktonada an he approved my
concept.

> 
>> +/**
>> + * Take over the modules that belonged to the old function.
>> + * Reset module and function pointers in onl function.
>> +*/
> 
> I don't understand this comment, please rephrase, and please do
> not forget to describe the usage and the purpose, not just what
> the function does. 

This allows backward compatibility with reusing loaded modules when
updating the function definition, as is done in Tarantool now.

/**
 * Take over the modules that belonged to the old function.
 * Reset module and function pointers in old function.
 * This helper allows you to inherit already loaded function
 * objects, that is useful on function object update.
*/
void
func_capture_module(struct func *new_func, struct func *old_func);


>> +/**
>> + * Execute the persistent Lua function.
>> + * Assemble Lua object when required.
>> + */
>> +static int
>> +execute_lua_ref_call(lua_State *L)
> 
> What's a ref-call? Why the name of the function does not match the
> comment?

The new name is
static int
execute_persistent_function(lua_State *L)

=================================================
This patch extends _func system space with a new string fields
body and opts to persist Lua functions.

Refactored func hash machinery to pass a new function object and
to return previous one. This allows to implement fault-tolerant
func_new call that constructs a lua function object.

Each persistent function may use limited amount of Lua functions
and modules:
-assert -error -pairs -ipairs -next -pcall -xpcall -type
-print -select -string -tonumber -tostring -unpack -math -utf8
Global variables are forbidden in persistent Lua functions.

This patch makes persistent Lua functions are available via
net.box.connect() :call method.
Example:
lua_code = [[function(a, b) return a + b end]]
box.schema.func.create('sum', {body = lua_code,
                       opts = {is_deterministic = true}})
conn:call("sum", {1, 3})

Part of #4182
Needed for #1260
---
 src/box/alter.cc                  |  94 +++++++++++-----
 src/box/bootstrap.snap            | Bin 4374 -> 4403 bytes
 src/box/call.c                    |   2 +-
 src/box/func.c                    |  69 +++++++++++-
 src/box/func.h                    |  17 ++-
 src/box/func_def.c                |   9 ++
 src/box/func_def.h                |  36 +++++-
 src/box/lua/call.c                |  36 +++++-
 src/box/lua/call.h                |   4 +-
 src/box/lua/schema.lua            |   9 +-
 src/box/lua/upgrade.lua           |  21 +++-
 src/box/schema.cc                 |  46 ++++----
 src/box/schema.h                  |  15 ++-
 src/box/schema_def.h              |   2 +
 src/lua/utils.c                   |  67 +++++++++++
 src/lua/utils.h                   |   8 ++
 test/box-py/bootstrap.result      |   6 +-
 test/box/access_misc.result       |   4 +-
 test/box/persistent_func.result   | 179 ++++++++++++++++++++++++++++++
 test/box/persistent_func.test.lua |  83 ++++++++++++++
 20 files changed, 623 insertions(+), 84 deletions(-)
 create mode 100644 test/box/persistent_func.result
 create mode 100644 test/box/persistent_func.test.lua

diff --git a/src/box/alter.cc b/src/box/alter.cc
index 9279426d2..e3fd2e7ba 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -2402,21 +2402,42 @@ func_def_get_ids_from_tuple(struct tuple *tuple, uint32_t *fid, uint32_t *uid)
 static struct func_def *
 func_def_new_from_tuple(struct tuple *tuple)
 {
-	uint32_t len;
-	const char *name = tuple_field_str_xc(tuple, BOX_FUNC_FIELD_NAME,
-					      &len);
-	if (len > BOX_NAME_MAX)
+	uint32_t name_len, body_len;
+	const char *name, *body;
+	name = tuple_field_str_xc(tuple, BOX_FUNC_FIELD_NAME, &name_len);
+	if (name_len > BOX_NAME_MAX) {
 		tnt_raise(ClientError, ER_CREATE_FUNCTION,
 			  tt_cstr(name, BOX_INVALID_NAME_MAX),
 			  "function name is too long");
-	identifier_check_xc(name, len);
-	struct func_def *def = (struct func_def *) malloc(func_def_sizeof(len));
+	}
+	identifier_check_xc(name, name_len);
+	if (tuple_field_count(tuple) > BOX_FUNC_FIELD_BODY) {
+		body = tuple_field_str_xc(tuple, BOX_FUNC_FIELD_BODY,
+					  &body_len);
+	} else {
+		body = NULL;
+		body_len = 0;
+	}
+
+	uint32_t def_sz = func_def_sizeof(name_len, body_len);
+	struct func_def *def = (struct func_def *) malloc(def_sz);
 	if (def == NULL)
-		tnt_raise(OutOfMemory, func_def_sizeof(len), "malloc", "def");
+		tnt_raise(OutOfMemory, def_sz, "malloc", "def");
 	auto def_guard = make_scoped_guard([=] { free(def); });
+	func_opts_create(&def->opts);
 	func_def_get_ids_from_tuple(tuple, &def->fid, &def->uid);
-	memcpy(def->name, name, len);
-	def->name[len] = 0;
+
+	def->name = (char *)def + sizeof(struct func_def);
+	memcpy(def->name, name, name_len);
+	def->name[name_len] = 0;
+	if (body_len > 0) {
+		def->body = def->name + name_len + 1;
+		memcpy(def->body, body, body_len);
+		def->body[body_len] = 0;
+	} else {
+		def->body = NULL;
+	}
+
 	if (tuple_field_count(tuple) > BOX_FUNC_FIELD_SETUID)
 		def->setuid = tuple_field_u32_xc(tuple, BOX_FUNC_FIELD_SETUID);
 	else
@@ -2433,30 +2454,40 @@ func_def_new_from_tuple(struct tuple *tuple)
 		/* Lua is the default. */
 		def->language = FUNC_LANGUAGE_LUA;
 	}
+	if (def->language != FUNC_LANGUAGE_LUA && body_len > 0) {
+		tnt_raise(ClientError, ER_CREATE_FUNCTION, name,
+			  "function body may be specified only for "
+			  "Lua language");
+	}
+	if (tuple_field_count(tuple) > BOX_FUNC_FIELD_OPTS) {
+		const char *opts = tuple_field(tuple, BOX_FUNC_FIELD_OPTS);
+		if (opts_decode(&def->opts, func_opts_reg, &opts,
+				ER_WRONG_SPACE_OPTIONS, BOX_FUNC_FIELD_OPTS,
+				NULL) != 0)
+			diag_raise();
+	}
 	def_guard.is_active = false;
 	return def;
 }
 
 /** Remove a function from function cache */
 static void
-func_cache_remove_func(struct trigger * /* trigger */, void *event)
+func_cache_remove_func(struct trigger *trigger, void * /* event */)
 {
-	struct txn_stmt *stmt = txn_last_stmt((struct txn *) event);
-	uint32_t fid = tuple_field_u32_xc(stmt->old_tuple ?
-				       stmt->old_tuple : stmt->new_tuple,
-				       BOX_FUNC_FIELD_ID);
-	func_cache_delete(fid);
+	struct func *old_func = (struct func *) trigger->data;
+	func_cache_delete(old_func->def->fid);
+	func_delete(old_func);
 }
 
 /** Replace a function in the function cache */
 static void
-func_cache_replace_func(struct trigger * /* trigger */, void *event)
+func_cache_replace_func(struct trigger *trigger, void * /* event */)
 {
-	struct txn_stmt *stmt = txn_last_stmt((struct txn*) event);
-	struct func_def *def = func_def_new_from_tuple(stmt->new_tuple);
-	auto def_guard = make_scoped_guard([=] { free(def); });
-	func_cache_replace(def);
-	def_guard.is_active = false;
+	struct func *new_func = (struct func *) trigger->data;
+	struct func *old_func;
+	func_cache_replace(new_func, &old_func);
+	assert(old_func != NULL);
+	func_delete(old_func);
 }
 
 /**
@@ -2477,13 +2508,20 @@ on_replace_dd_func(struct trigger * /* trigger */, void *event)
 	struct func *old_func = func_by_id(fid);
 	if (new_tuple != NULL && old_func == NULL) { /* INSERT */
 		struct func_def *def = func_def_new_from_tuple(new_tuple);
+		auto def_guard = make_scoped_guard([=] { free(def); });
 		access_check_ddl(def->name, def->fid, def->uid, SC_FUNCTION,
 				 PRIV_C);
-		auto def_guard = make_scoped_guard([=] { free(def); });
-		func_cache_replace(def);
+		struct func *func = func_new(def);
+		if (func == NULL)
+			diag_raise();
+		auto func_guard = make_scoped_guard([=] { func_delete(func); });
 		def_guard.is_active = false;
+		struct func *old_func = NULL;
+		func_cache_replace(func, &old_func);
+		assert(old_func == NULL);
+		func_guard.is_active = false;
 		struct trigger *on_rollback =
-			txn_alter_trigger_new(func_cache_remove_func, NULL);
+			txn_alter_trigger_new(func_cache_remove_func, func);
 		txn_on_rollback(txn, on_rollback);
 	} else if (new_tuple == NULL) {         /* DELETE */
 		uint32_t uid;
@@ -2501,15 +2539,19 @@ on_replace_dd_func(struct trigger * /* trigger */, void *event)
 				  "function has grants");
 		}
 		struct trigger *on_commit =
-			txn_alter_trigger_new(func_cache_remove_func, NULL);
+			txn_alter_trigger_new(func_cache_remove_func, old_func);
 		txn_on_commit(txn, on_commit);
 	} else {                                /* UPDATE, REPLACE */
 		struct func_def *def = func_def_new_from_tuple(new_tuple);
 		auto def_guard = make_scoped_guard([=] { free(def); });
 		access_check_ddl(def->name, def->fid, def->uid, SC_FUNCTION,
 				 PRIV_A);
+		struct func *func = func_new(def);
+		if (func == NULL)
+			diag_raise();
+		def_guard.is_active = false;
 		struct trigger *on_commit =
-			txn_alter_trigger_new(func_cache_replace_func, NULL);
+			txn_alter_trigger_new(func_cache_replace_func, func);
 		txn_on_commit(txn, on_commit);
 	}
 }

diff --git a/src/box/call.c b/src/box/call.c
index 56da53fb3..773b914b1 100644
--- a/src/box/call.c
+++ b/src/box/call.c
@@ -202,7 +202,7 @@ box_process_call(struct call_request *request, struct port *port)
 	if (func && func->def->language == FUNC_LANGUAGE_C) {
 		rc = box_c_call(func, request, port);
 	} else {
-		rc = box_lua_call(request, port);
+		rc = box_lua_call(func, request, port);
 	}
 	/* Restore the original user */
 	if (orig_credentials)
diff --git a/src/box/func.c b/src/box/func.c
index a817851fd..d650b0144 100644
--- a/src/box/func.c
+++ b/src/box/func.c
@@ -34,6 +34,7 @@
 #include "lua/utils.h"
 #include "error.h"
 #include "diag.h"
+#include "fiber.h"
 #include <dlfcn.h>
 
 /**
@@ -355,6 +356,49 @@ restore:
 	return -1;
 }
 
+/**
+ * Assemble a Lua function object on Lua stack and return
+ * the reference.
+ * Returns func object reference on success, LUA_REFNIL otherwise.
+ */
+static int
+func_lua_code_load(struct func_def *def)
+{
+	int rc = LUA_REFNIL;
+	struct region *region = &fiber()->gc;
+	size_t region_svp = region_used(region);
+	struct lua_State *L = lua_newthread(tarantool_L);
+	int coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
+
+	/*
+	 * Assemble a Lua function object with luaL_loadstring of
+	 * special 'return FUNCTION_BODY' expression and call it.
+	 * Set default sandbox to configure it to use only a
+	 * limited number of functions and modules.
+	 */
+	const char *load_pref = "return ";
+	uint32_t load_str_sz = strlen(load_pref) + strlen(def->body) + 1;
+	char *load_str = region_alloc(region, load_str_sz);
+	if (load_str == NULL) {
+		diag_set(OutOfMemory, load_str_sz, "region", "load_str");
+		goto end;
+	}
+	sprintf(load_str, "%s%s", load_pref, def->body);
+	if (luaL_loadstring(L, load_str) != 0 ||
+	    lua_pcall(L, 0, 1, 0) != 0 || !lua_isfunction(L, -1) ||
+	    luaT_get_sandbox(L) != 0) {
+		diag_set(ClientError, ER_LOAD_FUNCTION, def->name,
+			def->body);
+		goto end;
+	}
+	lua_setfenv(L, -2);
+	rc = luaL_ref(L, LUA_REGISTRYINDEX);
+end:
+	region_truncate(region, region_svp);
+	luaL_unref(L, LUA_REGISTRYINDEX, coro_ref);
+	return rc;
+}
+
 struct func *
 func_new(struct func_def *def)
 {
@@ -380,6 +424,15 @@ func_new(struct func_def *def)
 	func->owner_credentials.auth_token = BOX_USER_MAX; /* invalid value */
 	func->func = NULL;
 	func->module = NULL;
+	if (func->def->body != NULL) {
+		func->lua_func_ref = func_lua_code_load(def);
+		if (func->lua_func_ref == LUA_REFNIL) {
+			free(func);
+			return NULL;
+		}
+	} else {
+		func->lua_func_ref = LUA_REFNIL;
+	}
 	return func;
 }
 
@@ -395,8 +448,11 @@ func_unload(struct func *func)
 		}
 		module_gc(func->module);
 	}
+	if (func->lua_func_ref != -1)
+		luaL_unref(tarantool_L, LUA_REGISTRYINDEX, func->lua_func_ref);
 	func->module = NULL;
 	func->func = NULL;
+	func->lua_func_ref = -1;
 }
 
 /**
@@ -452,17 +508,18 @@ func_call(struct func *func, box_function_ctx_t *ctx, const char *args,
 }
 
 void
-func_update(struct func *func, struct func_def *def)
+func_delete(struct func *func)
 {
 	func_unload(func);
 	free(func->def);
-	func->def = def;
+	free(func);
 }
 
 void
-func_delete(struct func *func)
+func_capture_module(struct func *new_func, struct func *old_func)
 {
-	func_unload(func);
-	free(func->def);
-	free(func);
+	new_func->module = old_func->module;
+	new_func->func = old_func->func;
+	old_func->module = NULL;
+	old_func->func = NULL;
 }
diff --git a/src/box/func.h b/src/box/func.h
index 8dcd61d7b..eaf5a3902 100644
--- a/src/box/func.h
+++ b/src/box/func.h
@@ -65,6 +65,11 @@ struct func {
 	 * Anchor for module membership.
 	 */
 	struct rlist item;
+	/**
+	 * The reference index of Lua function object.
+	 * Is equal to LUA_REFNIL when undefined.
+	 */
+	int lua_func_ref;
 	/**
 	 * For C functions, the body of the function.
 	 */
@@ -100,9 +105,6 @@ module_free(void);
 struct func *
 func_new(struct func_def *def);
 
-void
-func_update(struct func *func, struct func_def *def);
-
 void
 func_delete(struct func *func);
 
@@ -113,6 +115,15 @@ int
 func_call(struct func *func, box_function_ctx_t *ctx, const char *args,
 	  const char *args_end);
 
+/**
+ * Take over the modules that belonged to the old function.
+ * Reset module and function pointers in old function.
+ * This helper allows you to inherit already loaded function
+ * objects, that can is useful on update.
+*/
+void
+func_capture_module(struct func *new_func, struct func *old_func);
+
 /**
  * Reload dynamically loadable module.
  *
diff --git a/src/box/func_def.c b/src/box/func_def.c
index 76ed77b24..6c143f075 100644
--- a/src/box/func_def.c
+++ b/src/box/func_def.c
@@ -1,3 +1,12 @@
 #include "func_def.h"
+#include "opt_def.h"
 
 const char *func_language_strs[] = {"LUA", "C"};
+
+const struct func_opts func_opts_default = {
+	/* .is_deterministic = */ false,
+};
+
+const struct opt_def func_opts_reg[] = {
+	OPT_DEF("is_deterministic", OPT_BOOL, struct func_opts, is_deterministic),
+};
diff --git a/src/box/func_def.h b/src/box/func_def.h
index 5b52ab498..6c35a684b 100644
--- a/src/box/func_def.h
+++ b/src/box/func_def.h
@@ -32,6 +32,7 @@
  */
 
 #include "trivia/util.h"
+#include "opt_def.h"
 #include <stdbool.h>
 
 /**
@@ -45,6 +46,19 @@ enum func_language {
 
 extern const char *func_language_strs[];
 
+/** Function options. */
+struct func_opts {
+	/**
+	 * Whether the routine is deterministic (can produce
+	 * only one result for a given list of parameters)
+	 * or not.
+	 */
+	bool is_deterministic;
+};
+
+extern const struct func_opts func_opts_default;
+extern const struct opt_def func_opts_reg[];
+
 /**
  * Definition of a function. Function body is not stored
  * or replicated (yet).
@@ -54,6 +68,10 @@ struct func_def {
 	uint32_t fid;
 	/** Owner of the function. */
 	uint32_t uid;
+	/** Function name. */
+	char *name;
+	/** Definition of the routine. */
+	char *body;
 	/**
 	 * True if the function requires change of user id before
 	 * invocation.
@@ -63,8 +81,8 @@ struct func_def {
 	 * The language of the stored function.
 	 */
 	enum func_language language;
-	/** Function name. */
-	char name[0];
+	/** The function options. */
+	struct func_opts opts;
 };
 
 /**
@@ -73,10 +91,20 @@ struct func_def {
  * for a function of length @a a name_len.
  */
 static inline size_t
-func_def_sizeof(uint32_t name_len)
+func_def_sizeof(uint32_t name_len, uint32_t body_len)
 {
 	/* +1 for '\0' name terminating. */
-	return sizeof(struct func_def) + name_len + 1;
+	size_t sz = sizeof(struct func_def) + name_len + 1;
+	if (body_len > 0)
+		sz += body_len + 1;
+	return sz;
+}
+
+/** Create index options using default values. */
+static inline void
+func_opts_create(struct func_opts *opts)
+{
+	*opts = func_opts_default;
 }
 
 /**
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index c729778c4..dcd505cc3 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -32,6 +32,8 @@
 #include "box/call.h"
 #include "box/error.h"
 #include "fiber.h"
+#include "box/func.h"
+#include "box/schema.h"
 
 #include "lua/utils.h"
 #include "lua/msgpack.h"
@@ -253,6 +255,33 @@ execute_lua_call(lua_State *L)
 	return lua_gettop(L);
 }
 
+static int
+execute_persistent_function(lua_State *L)
+{
+	struct call_request *request = (struct call_request *)
+		lua_topointer(L, 1);
+	lua_settop(L, 0);
+
+	const char *name = request->name;
+	uint32_t name_len = mp_decode_strl(&name);
+
+	struct func *func = func_by_name(name, name_len);
+	assert(func != NULL);
+	/* Construct Lua function if required. */
+	assert(func->lua_func_ref != LUA_REFNIL);
+	lua_rawgeti(L, LUA_REGISTRYINDEX, func->lua_func_ref);
+
+	/* Push the rest of args (a tuple). */
+	const char *args = request->args;
+	uint32_t arg_count = mp_decode_array(&args);
+	luaL_checkstack(L, arg_count, "lua_func: out of stack");
+	for (uint32_t i = 0; i < arg_count; i++)
+		luamp_decode(L, luaL_msgpack_default, &args);
+
+	lua_call(L, arg_count, LUA_MULTRET);
+	return lua_gettop(L);
+}
+
 static int
 execute_lua_eval(lua_State *L)
 {
@@ -396,9 +425,12 @@ box_process_lua(struct call_request *request, struct port *base,
 }
 
 int
-box_lua_call(struct call_request *request, struct port *port)
+box_lua_call(struct func *func, struct call_request *request, struct port *port)
 {
-	return box_process_lua(request, port, execute_lua_call);
+	if (func != NULL && func->def->body != NULL)
+		return box_process_lua(request, port, execute_persistent_function);
+	else
+		return box_process_lua(request, port, execute_lua_call);
 }
 
 int
diff --git a/src/box/lua/call.h b/src/box/lua/call.h
index 0542123da..d8ef65900 100644
--- a/src/box/lua/call.h
+++ b/src/box/lua/call.h
@@ -42,13 +42,15 @@ box_lua_call_init(struct lua_State *L);
 
 struct port;
 struct call_request;
+struct func;
 
 /**
  * Invoke a Lua stored procedure from the binary protocol
  * (implementation of 'CALL' command code).
  */
 int
-box_lua_call(struct call_request *request, struct port *port);
+box_lua_call(struct func *func, struct call_request *request,
+	     struct port *port);
 
 int
 box_lua_eval(struct call_request *request, struct port *port);
diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
index f31cf7f2c..8cf3867f4 100644
--- a/src/box/lua/schema.lua
+++ b/src/box/lua/schema.lua
@@ -1981,7 +1981,8 @@ box.schema.func.create = function(name, opts)
     opts = opts or {}
     check_param_table(opts, { setuid = 'boolean',
                               if_not_exists = 'boolean',
-                              language = 'string'})
+                              language = 'string', body = 'string',
+                              opts = 'table'})
     local _func = box.space[box.schema.FUNC_ID]
     local _vfunc = box.space[box.schema.VFUNC_ID]
     local func = _vfunc.index.name:get{name}
@@ -1991,10 +1992,12 @@ box.schema.func.create = function(name, opts)
         end
         return
     end
-    opts = update_param_table(opts, { setuid = false, language = 'lua'})
+    opts = update_param_table(opts, { setuid = false, language = 'lua',
+                                      body = '', opts = setmap{}})
     opts.language = string.upper(opts.language)
     opts.setuid = opts.setuid and 1 or 0
-    _func:auto_increment{session.euid(), name, opts.setuid, opts.language}
+    _func:auto_increment{session.euid(), name, opts.setuid, opts.language,
+                         opts.body, opts.opts}
 end
 
 box.schema.func.drop = function(name, opts)
diff --git a/src/box/lua/upgrade.lua b/src/box/lua/upgrade.lua
index 89d6e3d52..ef3d23c6b 100644
--- a/src/box/lua/upgrade.lua
+++ b/src/box/lua/upgrade.lua
@@ -324,7 +324,7 @@ local function initial_1_7_5()
 
     -- create "box.schema.user.info" function
     log.info('create function "box.schema.user.info" with setuid')
-    _func:replace{1, ADMIN, 'box.schema.user.info', 1, 'LUA'}
+    _func:replace{1, ADMIN, 'box.schema.user.info', 1, 'LUA', '', MAP}
 
     -- grant 'public' role access to 'box.schema.user.info' function
     log.info('grant execute on function "box.schema.user.info" to public')
@@ -737,6 +737,24 @@ local function upgrade_to_2_1_3()
     end
 end
 
+local function upgrade_to_2_2_0()
+    log.info("Update _func format")
+    local _func = box.space[box.schema.FUNC_ID]
+    local format = {}
+    format[1] = {name='id', type='unsigned'}
+    format[2] = {name='owner', type='unsigned'}
+    format[3] = {name='name', type='string'}
+    format[4] = {name='setuid', type='unsigned'}
+    format[5] = {name='language', type='string'}
+    format[6] = {name='body', type='string'}
+    format[7] = {name='opts', type='map'}
+    for _, v in box.space._func:pairs() do
+        _ = box.space._func:replace({v.id, v.owner, v.name, v.setuid,
+                                     v[5] or 'LUA', '', setmap({})})
+    end
+    _func:format(format)
+end
+
 local function get_version()
     local version = box.space._schema:get{'version'}
     if version == nil then
@@ -768,6 +786,7 @@ local function upgrade(options)
         {version = mkversion(2, 1, 1), func = upgrade_to_2_1_1, auto = true},
         {version = mkversion(2, 1, 2), func = upgrade_to_2_1_2, auto = true},
         {version = mkversion(2, 1, 3), func = upgrade_to_2_1_3, auto = true},
+        {version = mkversion(2, 2, 0), func = upgrade_to_2_2_0, auto = true},
     }
 
     for _, handler in ipairs(handlers) do
diff --git a/src/box/schema.cc b/src/box/schema.cc
index 9a55c2f14..b70992297 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -533,40 +533,35 @@ schema_free(void)
 }
 
 void
-func_cache_replace(struct func_def *def)
+func_cache_replace(struct func *new_func, struct func **old_func)
 {
-	struct func *old = func_by_id(def->fid);
-	if (old) {
-		func_update(old, def);
-		return;
-	}
 	if (mh_size(funcs) >= BOX_FUNCTION_MAX)
 		tnt_raise(ClientError, ER_FUNCTION_MAX, BOX_FUNCTION_MAX);
-	struct func *func = func_new(def);
-	if (func == NULL) {
-error:
-		panic_syserror("Out of memory for the data "
-			       "dictionary cache (stored function).");
-	}
-	const struct mh_i32ptr_node_t node = { def->fid, func };
-	mh_int_t k1 = mh_i32ptr_put(funcs, &node, NULL, NULL);
-	if (k1 == mh_end(funcs)) {
-		func->def = NULL;
-		func_delete(func);
-		goto error;
-	}
-	size_t def_name_len = strlen(func->def->name);
-	uint32_t name_hash = mh_strn_hash(func->def->name, def_name_len);
+
+	mh_int_t k1, k2;
+	struct mh_i32ptr_node_t old_node, *old_node_ptr = &old_node;
+	const struct mh_i32ptr_node_t node = { new_func->def->fid, new_func };
+	size_t def_name_len = strlen(new_func->def->name);
+	uint32_t name_hash = mh_strn_hash(new_func->def->name, def_name_len);
 	const struct mh_strnptr_node_t strnode = {
-		func->def->name, def_name_len, name_hash, func };
+		new_func->def->name, def_name_len, name_hash, new_func };
 
-	mh_int_t k2 = mh_strnptr_put(funcs_by_name, &strnode, NULL, NULL);
+	k1 = mh_i32ptr_put(funcs, &node, &old_node_ptr, NULL);
+	if (k1 == mh_end(funcs))
+		goto error;
+	if (old_node_ptr != NULL)
+		*old_func = (struct func *)old_node_ptr->val;
+	k2 = mh_strnptr_put(funcs_by_name, &strnode, NULL, NULL);
 	if (k2 == mh_end(funcs_by_name)) {
 		mh_i32ptr_del(funcs, k1, NULL);
-		func->def = NULL;
-		func_delete(func);
 		goto error;
 	}
+	if (*old_func != NULL)
+		func_capture_module(new_func, *old_func);
+	return;
+error:
+	panic_syserror("Out of memory for the data dictionary cache "
+		       "(stored function).");
 }
 
 void
@@ -582,7 +577,6 @@ func_cache_delete(uint32_t fid)
 				strlen(func->def->name));
 	if (k != mh_end(funcs))
 		mh_strnptr_del(funcs_by_name, k, NULL);
-	func_delete(func);
 }
 
 struct func *
diff --git a/src/box/schema.h b/src/box/schema.h
index 6f9a96117..ffc41f401 100644
--- a/src/box/schema.h
+++ b/src/box/schema.h
@@ -163,14 +163,17 @@ schema_free();
 struct space *schema_space(uint32_t id);
 
 /**
- * Insert a new function or update the old one.
- *
- * @param def Function definition. In a case of success the ownership
- *        of @a def is transfered to the data dictionary, thus the caller
- *        must not delete it.
+ * Replace an existent (if any) function or insert a new one
+ * in function cache.
+ * @param func_new Function object to insert. In case of success,
+ *                 when old function object is exists, all loaded
+ *                 modules data are inherent with
+ *                 func_capture_module() when possible.
+ * @param func_old[out] The replaced function object if any.
+ * @retval Returns 0 on success, -1 otherwise.
  */
 void
-func_cache_replace(struct func_def *def);
+func_cache_replace(struct func *new_func, struct func **old_func);
 
 void
 func_cache_delete(uint32_t fid);
diff --git a/src/box/schema_def.h b/src/box/schema_def.h
index eeeeb950b..8cabc7e24 100644
--- a/src/box/schema_def.h
+++ b/src/box/schema_def.h
@@ -163,6 +163,8 @@ enum {
 	BOX_FUNC_FIELD_NAME = 2,
 	BOX_FUNC_FIELD_SETUID = 3,
 	BOX_FUNC_FIELD_LANGUAGE = 4,
+	BOX_FUNC_FIELD_BODY = 5,
+	BOX_FUNC_FIELD_OPTS = 6,
 };
 
 /** _collation fields. */
diff --git a/src/lua/utils.c b/src/lua/utils.c
index 27ff6b396..973d8b15c 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -46,6 +46,14 @@ static uint32_t CTID_STRUCT_IBUF_PTR;
 uint32_t CTID_CHAR_PTR;
 uint32_t CTID_CONST_CHAR_PTR;
 
+static const char *default_sandbox_exports[] =
+	{"assert", "error", "ipairs", "math", "next", "pairs", "pcall",
+	"print", "select", "string", "table", "tonumber", "tostring",
+	"type", "unpack", "xpcall", "utf8"};
+
+static int luaL_deepcopy_func_ref = LUA_REFNIL;
+static int luaL_default_sandbox_ref = LUA_REFNIL;
+
 void *
 luaL_pushcdata(struct lua_State *L, uint32_t ctypeid)
 {
@@ -1248,6 +1256,65 @@ luaT_func_find(struct lua_State *L, const char *name, const char *name_end,
 	return 0;
 }
 
+/**
+ * Assemble a new sandbox with given exports table on top of the
+ * Lua stack. All modules in exports list are copying deeply
+ * to ensure the immutablility of this system object.
+ */
+static int
+luaT_prepare_sandbox(struct lua_State *L, const char *exports[],
+		     uint32_t exports_count)
+{
+	assert(luaL_deepcopy_func_ref != LUA_REFNIL);
+	lua_createtable(L, exports_count, 0);
+	for (unsigned i = 0; i < exports_count; i++) {
+		int count;
+		uint32_t name_len = strlen(exports[i]);
+		if (luaT_func_find(L, exports[i], exports[i] + name_len,
+				   &count) != 0)
+			return -1;
+		switch (lua_type(L, -1)) {
+		case LUA_TTABLE:
+			lua_rawgeti(L, LUA_REGISTRYINDEX,
+				    luaL_deepcopy_func_ref);
+			lua_insert(L, -2);
+			lua_call(L, 1, LUA_MULTRET);
+			FALLTHROUGH;
+		case LUA_TFUNCTION:
+			break;
+		default:
+			unreachable();
+		}
+		lua_setfield(L, -2, exports[i]);
+	}
+	return 0;
+}
+
+int
+luaT_get_sandbox(struct lua_State *L)
+{
+	if (unlikely(luaL_deepcopy_func_ref == LUA_REFNIL)) {
+		int count;
+		const char *deepcopy = "table.deepcopy";
+		if (luaT_func_find(L, deepcopy, deepcopy + strlen(deepcopy),
+				&count) != 0)
+			return -1;
+		luaL_deepcopy_func_ref = luaL_ref(L, LUA_REGISTRYINDEX);
+		assert(luaL_deepcopy_func_ref != LUA_REFNIL);
+	}
+	if (unlikely(luaL_default_sandbox_ref == LUA_REFNIL)) {
+		if (luaT_prepare_sandbox(L, default_sandbox_exports,
+					 nelem(default_sandbox_exports)) != 0)
+			return -1;
+		luaL_default_sandbox_ref = luaL_ref(L, LUA_REGISTRYINDEX);
+		assert(luaL_default_sandbox_ref != LUA_REFNIL);
+	}
+	lua_rawgeti(L, LUA_REGISTRYINDEX, luaL_deepcopy_func_ref);
+	lua_rawgeti(L, LUA_REGISTRYINDEX, luaL_default_sandbox_ref);
+	lua_call(L, 1, LUA_MULTRET);
+	return 0;
+}
+
 int
 tarantool_lua_utils_init(struct lua_State *L)
 {
diff --git a/src/lua/utils.h b/src/lua/utils.h
index 81e936bee..36bb2e53b 100644
--- a/src/lua/utils.h
+++ b/src/lua/utils.h
@@ -603,6 +603,14 @@ int
 luaT_func_find(struct lua_State *L, const char *name, const char *name_end,
 	       int *count);
 
+/**
+ * Prepare a new 'default' sandbox table on the top of the Lua
+ * stack. The 'default' sandbox consists of a minimum set of
+ * functions that are sufficient to serve persistent functions.
+ */
+int
+luaT_get_sandbox(struct lua_State *L);
+
 int
 tarantool_lua_utils_init(struct lua_State *L);
 
diff --git a/test/box-py/bootstrap.result b/test/box-py/bootstrap.result
index 379f6c51f..ea8ae0984 100644
--- a/test/box-py/bootstrap.result
+++ b/test/box-py/bootstrap.result
@@ -4,7 +4,7 @@ box.internal.bootstrap()
 box.space._schema:select{}
 ---
 - - ['max_id', 511]
-  - ['version', 2, 1, 3]
+  - ['version', 2, 2, 0]
 ...
 box.space._cluster:select{}
 ---
@@ -49,7 +49,7 @@ box.space._space:select{}
         'type': 'string'}, {'name': 'opts', 'type': 'map'}, {'name': 'parts', 'type': 'array'}]]
   - [296, 1, '_func', 'memtx', 0, {}, [{'name': 'id', 'type': 'unsigned'}, {'name': 'owner',
         'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'setuid',
-        'type': 'unsigned'}]]
+        'type': 'unsigned'}, {'name': 'lua_code', 'type': 'string'}]]
   - [297, 1, '_vfunc', 'sysview', 0, {}, [{'name': 'id', 'type': 'unsigned'}, {'name': 'owner',
         'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'setuid',
         'type': 'unsigned'}]]
@@ -141,7 +141,7 @@ box.space._user:select{}
 ...
 box.space._func:select{}
 ---
-- - [1, 1, 'box.schema.user.info', 1, 'LUA']
+- - [1, 1, 'box.schema.user.info', 1, 'LUA', '']
 ...
 box.space._priv:select{}
 ---
diff --git a/test/box/access_misc.result b/test/box/access_misc.result
index 36ebfae09..514cfaab0 100644
--- a/test/box/access_misc.result
+++ b/test/box/access_misc.result
@@ -789,7 +789,7 @@ box.space._space:select()
         'type': 'string'}, {'name': 'opts', 'type': 'map'}, {'name': 'parts', 'type': 'array'}]]
   - [296, 1, '_func', 'memtx', 0, {}, [{'name': 'id', 'type': 'unsigned'}, {'name': 'owner',
         'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'setuid',
-        'type': 'unsigned'}]]
+        'type': 'unsigned'}, {'name': 'body', 'type': 'string'}, {'name': 'opts', 'type': 'map'}]]
   - [297, 1, '_vfunc', 'sysview', 0, {}, [{'name': 'id', 'type': 'unsigned'}, {'name': 'owner',
         'type': 'unsigned'}, {'name': 'name', 'type': 'string'}, {'name': 'setuid',
         'type': 'unsigned'}]]
@@ -821,7 +821,7 @@ box.space._space:select()
 ...
 box.space._func:select()
 ---
-- - [1, 1, 'box.schema.user.info', 1, 'LUA']
+- - [1, 1, 'box.schema.user.info', 1, 'LUA', '']
 ...
 session = nil
 ---
diff --git a/test/box/persistent_func.result b/test/box/persistent_func.result
new file mode 100644
index 000000000..0aae0a4ee
--- /dev/null
+++ b/test/box/persistent_func.result
@@ -0,0 +1,179 @@
+env = require('test_run')
+---
+...
+test_run = env.new()
+---
+...
+--
+-- gh-4182: Add persistent LUA functions.
+--
+box.schema.user.grant('guest', 'execute', 'universe')
+---
+...
+net = require('net.box')
+---
+...
+conn = net.connect(box.cfg.listen)
+---
+...
+-- Test valid function.
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+body = [[function(tuple)
+	if type(tuple.address) ~= 'string' then return nil, 'Invalid field type' end
+	local t = tuple.address:lower():split()
+	for k,v in pairs(t) do t[k] = {v} end
+	return t
+end
+]]
+test_run:cmd("setopt delimiter ''");
+---
+...
+box.schema.func.create('test', {body = body, language = "C"})
+---
+- error: 'Failed to create function ''test'': function body may be specified only
+    for Lua language'
+...
+box.schema.func.create('test', {body = body})
+---
+...
+box.schema.func.exists('test')
+---
+- true
+...
+conn:call("test", {{address = "Moscow Dolgoprudny"}})
+---
+- [['moscow'], ['dolgoprudny']]
+...
+box.schema.func.create('test2', {body = body, opts = {is_deterministic = true}})
+---
+...
+box.schema.func.create('test3', {body = body, opts = {is_deterministic = true, extra = true}})
+---
+- error: 'Wrong space options (field 6): unexpected option ''extra'''
+...
+-- Test that monkey-patch attack is not possible.
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+body_monkey = [[function(tuple)
+	math.abs = math.log
+	return tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+---
+...
+box.schema.func.create('body_monkey', {body = body_monkey})
+---
+...
+conn:call("body_monkey", {{address = "Moscow Dolgoprudny"}})
+---
+- {'address': 'Moscow Dolgoprudny'}
+...
+math.abs(-666.666)
+---
+- 666.666
+...
+-- Test taht 'require' is forbidden.
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+body_bad1 = [[function(tuple)
+	local json = require('json')
+	return json.encode(tuple)
+end
+]]
+test_run:cmd("setopt delimiter ''");
+---
+...
+box.schema.func.create('json_serializer', {body = body_bad1})
+---
+...
+conn:call("json_serializer", {{address = "Moscow Dolgoprudny"}})
+---
+- error: '[string "return function(tuple) ..."]:1: attempt to call global ''require''
+    (a nil value)'
+...
+-- Test function with spell error - case 1.
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+body_bad2 = [[function(tuple)
+	ret tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+---
+...
+box.schema.func.create('body_bad2', {body = body_bad2})
+---
+- error: "Failed to dynamically load function 'body_bad2': function(tuple) \tret tuple
+    end "
+...
+-- Test function with spell error - case 2.
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+body_bad3 = [[func(tuple)
+	return tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+---
+...
+box.schema.func.create('body_bad3', {body = body_bad3})
+---
+- error: "Failed to dynamically load function 'body_bad3': func(tuple) \treturn tuple
+    end "
+...
+conn:call("body_bad3", {{address = "Moscow Dolgoprudny"}})
+---
+- error: Procedure 'body_bad3' is not defined
+...
+conn:close()
+---
+...
+-- Restart server.
+test_run:cmd("restart server default")
+net = require('net.box')
+---
+...
+test_run = require('test_run').new()
+---
+...
+conn = net.connect(box.cfg.listen)
+---
+...
+conn:call("test", {{address = "Moscow Dolgoprudny"}})
+---
+- [['moscow'], ['dolgoprudny']]
+...
+conn:close()
+---
+...
+box.schema.func.drop('test')
+---
+...
+box.schema.func.exists('test')
+---
+- false
+...
+box.schema.func.drop('body_monkey')
+---
+...
+box.schema.func.drop('json_serializer')
+---
+...
+box.schema.func.drop('test2')
+---
+...
+box.schema.user.revoke('guest', 'execute', 'universe')
+---
+...
diff --git a/test/box/persistent_func.test.lua b/test/box/persistent_func.test.lua
new file mode 100644
index 000000000..7e321fb5c
--- /dev/null
+++ b/test/box/persistent_func.test.lua
@@ -0,0 +1,83 @@
+env = require('test_run')
+test_run = env.new()
+
+--
+-- gh-4182: Add persistent LUA functions.
+--
+box.schema.user.grant('guest', 'execute', 'universe')
+net = require('net.box')
+conn = net.connect(box.cfg.listen)
+
+-- Test valid function.
+test_run:cmd("setopt delimiter ';'")
+body = [[function(tuple)
+	if type(tuple.address) ~= 'string' then return nil, 'Invalid field type' end
+	local t = tuple.address:lower():split()
+	for k,v in pairs(t) do t[k] = {v} end
+	return t
+end
+]]
+test_run:cmd("setopt delimiter ''");
+box.schema.func.create('test', {body = body, language = "C"})
+box.schema.func.create('test', {body = body})
+box.schema.func.exists('test')
+conn:call("test", {{address = "Moscow Dolgoprudny"}})
+box.schema.func.create('test2', {body = body, opts = {is_deterministic = true}})
+box.schema.func.create('test3', {body = body, opts = {is_deterministic = true, extra = true}})
+
+-- Test that monkey-patch attack is not possible.
+test_run:cmd("setopt delimiter ';'")
+body_monkey = [[function(tuple)
+	math.abs = math.log
+	return tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+box.schema.func.create('body_monkey', {body = body_monkey})
+conn:call("body_monkey", {{address = "Moscow Dolgoprudny"}})
+math.abs(-666.666)
+
+-- Test taht 'require' is forbidden.
+test_run:cmd("setopt delimiter ';'")
+body_bad1 = [[function(tuple)
+	local json = require('json')
+	return json.encode(tuple)
+end
+]]
+test_run:cmd("setopt delimiter ''");
+box.schema.func.create('json_serializer', {body = body_bad1})
+conn:call("json_serializer", {{address = "Moscow Dolgoprudny"}})
+
+-- Test function with spell error - case 1.
+test_run:cmd("setopt delimiter ';'")
+body_bad2 = [[function(tuple)
+	ret tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+box.schema.func.create('body_bad2', {body = body_bad2})
+
+-- Test function with spell error - case 2.
+test_run:cmd("setopt delimiter ';'")
+body_bad3 = [[func(tuple)
+	return tuple
+end
+]]
+test_run:cmd("setopt delimiter ''");
+box.schema.func.create('body_bad3', {body = body_bad3})
+conn:call("body_bad3", {{address = "Moscow Dolgoprudny"}})
+
+conn:close()
+-- Restart server.
+test_run:cmd("restart server default")
+net = require('net.box')
+test_run = require('test_run').new()
+conn = net.connect(box.cfg.listen)
+conn:call("test", {{address = "Moscow Dolgoprudny"}})
+conn:close()
+box.schema.func.drop('test')
+box.schema.func.exists('test')
+box.schema.func.drop('body_monkey')
+box.schema.func.drop('json_serializer')
+box.schema.func.drop('test2')
+box.schema.user.revoke('guest', 'execute', 'universe')
-- 
2.21.0





More information about the Tarantool-patches mailing list