Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence
@ 2020-06-01 18:10 Alexander Turenko
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it Alexander Turenko
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-01 18:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Alexander Turenko

The first patch is a bunch of renames: I dropped luaL prefix from
several functions to highlight the contract: they can be called from
usual C code, w/o any requirements to pass or to place a Lua state
somewhere.

The second patch just fixes NULL pointer dereference that occurs due to
the wrong assumption: I did think that fiber().storage.lua.stack is
always exists and non-NULL.

The third commit is the optimization: it allows to don't create a new
Lua state in merger when possible.

https://github.com/tarantool/tarantool/issues/4954
Totktonada/gh-4954-fix-merger-segfault-full-ci

Alexander Turenko (3):
  merger: drop luaL prefix where contract allows it
  merger: fix NULL dereference when called via iproto
  lua: expose temporary Lua state for iproto calls

 src/box/lua/call.c                            |  27 ++
 src/box/lua/merger.c                          | 233 ++++++++++++++----
 src/lib/core/fiber.h                          |   8 +-
 .../gh-4954-merger-via-net-box.test.lua       | 129 ++++++++++
 4 files changed, 349 insertions(+), 48 deletions(-)
 create mode 100755 test/box-tap/gh-4954-merger-via-net-box.test.lua

-- 
2.25.0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
@ 2020-06-01 18:10 ` Alexander Turenko
  2020-06-02 22:47   ` Vladislav Shpilevoy
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto Alexander Turenko
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-01 18:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Alexander Turenko

This change highlights the contract of merge source virtual methods:
they don't require a Lua state to be passed with arguments. Internal use
of a temporary Lua state is the implementation detail. Technically the
functions lean on a Lua state existence in a fiber storage, but the next
commit will relax this requirement.

Made the following renames:

* luaL_merge_source_buffer_fetch   -> merge_source_buffer_fetch
* luaL_merge_source_buffer_next    -> merge_source_buffer_next
* luaL_merge_source_buffer_destroy -> merge_source_buffer_destroy

* keep luaL_merge_source_table_fetch (pass <struct lua_State *>)
* luaL_merge_source_table_next     -> merge_source_table_next
* luaL_merge_source_table_destroy  -> merge_source_table_destroy

* keep luaL_merge_source_tuple_fetch (change arguments order)
* luaL_merge_source_tuple_next     -> merge_source_tuple_next
* luaL_merge_source_tuple_destroy  -> merge_source_tuple_destroy

Also added API comments for destroy() and next() virtual methods to
uniform them visually with other merge source functions.

Part of #4954
---
 src/box/lua/merger.c | 106 +++++++++++++++++++++++++++----------------
 1 file changed, 68 insertions(+), 38 deletions(-)

diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
index 1b155152b..16814c041 100644
--- a/src/box/lua/merger.c
+++ b/src/box/lua/merger.c
@@ -354,11 +354,11 @@ struct merge_source_buffer {
 /* Virtual methods declarations */
 
 static void
-luaL_merge_source_buffer_destroy(struct merge_source *base);
+merge_source_buffer_destroy(struct merge_source *base);
 static int
-luaL_merge_source_buffer_next(struct merge_source *base,
-			      struct tuple_format *format,
-			      struct tuple **out);
+merge_source_buffer_next(struct merge_source *base,
+			 struct tuple_format *format,
+			 struct tuple **out);
 
 /* Non-virtual methods */
 
@@ -373,8 +373,8 @@ static struct merge_source *
 luaL_merge_source_buffer_new(struct lua_State *L)
 {
 	static struct merge_source_vtab merge_source_buffer_vtab = {
-		.destroy = luaL_merge_source_buffer_destroy,
-		.next = luaL_merge_source_buffer_next,
+		.destroy = merge_source_buffer_destroy,
+		.next = merge_source_buffer_next,
 	};
 
 	struct merge_source_buffer *source = malloc(
@@ -403,7 +403,7 @@ luaL_merge_source_buffer_new(struct lua_State *L)
  * iterator ends and -1 at error and set a diag.
  */
 static int
-luaL_merge_source_buffer_fetch(struct merge_source_buffer *source)
+merge_source_buffer_fetch(struct merge_source_buffer *source)
 {
 	struct lua_State *L = fiber()->storage.lua.stack;
 	int nresult = luaL_iterator_next(L, source->fetch_it);
@@ -448,8 +448,13 @@ luaL_merge_source_buffer_fetch(struct merge_source_buffer *source)
 
 /* Virtual methods */
 
+/**
+ * destroy() virtual method implementation for a buffer source.
+ *
+ * @see struct merge_source_vtab
+ */
 static void
-luaL_merge_source_buffer_destroy(struct merge_source *base)
+merge_source_buffer_destroy(struct merge_source *base)
 {
 	struct merge_source_buffer *source = container_of(base,
 		struct merge_source_buffer, base);
@@ -462,10 +467,15 @@ luaL_merge_source_buffer_destroy(struct merge_source *base)
 	free(source);
 }
 
+/**
+ * next() virtual method implementation for a buffer source.
+ *
+ * @see struct merge_source_vtab
+ */
 static int
-luaL_merge_source_buffer_next(struct merge_source *base,
-			      struct tuple_format *format,
-			      struct tuple **out)
+merge_source_buffer_next(struct merge_source *base,
+			 struct tuple_format *format,
+			 struct tuple **out)
 {
 	struct merge_source_buffer *source = container_of(base,
 		struct merge_source_buffer, base);
@@ -476,7 +486,7 @@ luaL_merge_source_buffer_next(struct merge_source *base,
 	 * chunks iterator ends.
 	 */
 	while (source->remaining_tuple_count == 0) {
-		int rc = luaL_merge_source_buffer_fetch(source);
+		int rc = merge_source_buffer_fetch(source);
 		if (rc < 0)
 			return -1;
 		if (rc == 0) {
@@ -541,11 +551,11 @@ struct merge_source_table {
 /* Virtual methods declarations */
 
 static void
-luaL_merge_source_table_destroy(struct merge_source *base);
+merge_source_table_destroy(struct merge_source *base);
 static int
-luaL_merge_source_table_next(struct merge_source *base,
-			     struct tuple_format *format,
-			     struct tuple **out);
+merge_source_table_next(struct merge_source *base,
+			struct tuple_format *format,
+			struct tuple **out);
 
 /* Non-virtual methods */
 
@@ -558,8 +568,8 @@ static struct merge_source *
 luaL_merge_source_table_new(struct lua_State *L)
 {
 	static struct merge_source_vtab merge_source_table_vtab = {
-		.destroy = luaL_merge_source_table_destroy,
-		.next = luaL_merge_source_table_next,
+		.destroy = merge_source_table_destroy,
+		.next = merge_source_table_next,
 	};
 
 	struct merge_source_table *source = malloc(
@@ -586,9 +596,9 @@ luaL_merge_source_table_new(struct lua_State *L)
  * received and -1 at an error (set a diag).
  */
 static int
-luaL_merge_source_table_fetch(struct merge_source_table *source)
+luaL_merge_source_table_fetch(struct lua_State *L,
+			      struct merge_source_table *source)
 {
-	struct lua_State *L = fiber()->storage.lua.stack;
 	int nresult = luaL_iterator_next(L, source->fetch_it);
 
 	/* Handle a Lua error in a gen function. */
@@ -625,8 +635,13 @@ luaL_merge_source_table_fetch(struct merge_source_table *source)
 
 /* Virtual methods */
 
+/**
+ * destroy() virtual method implementation for a table source.
+ *
+ * @see struct merge_source_vtab
+ */
 static void
-luaL_merge_source_table_destroy(struct merge_source *base)
+merge_source_table_destroy(struct merge_source *base)
 {
 	struct merge_source_table *source = container_of(base,
 		struct merge_source_table, base);
@@ -639,10 +654,15 @@ luaL_merge_source_table_destroy(struct merge_source *base)
 	free(source);
 }
 
+/**
+ * next() virtual method implementation for a table source.
+ *
+ * @see struct merge_source_vtab
+ */
 static int
-luaL_merge_source_table_next(struct merge_source *base,
-			     struct tuple_format *format,
-			     struct tuple **out)
+merge_source_table_next(struct merge_source *base,
+			struct tuple_format *format,
+			struct tuple **out)
 {
 	struct lua_State *L = fiber()->storage.lua.stack;
 	struct merge_source_table *source = container_of(base,
@@ -659,7 +679,7 @@ luaL_merge_source_table_next(struct merge_source *base,
 	while (source->ref == 0 || lua_isnil(L, -1)) {
 		if (source->ref > 0)
 			lua_pop(L, 2);
-		int rc = luaL_merge_source_table_fetch(source);
+		int rc = luaL_merge_source_table_fetch(L, source);
 		if (rc < 0)
 			return -1;
 		if (rc == 0) {
@@ -715,11 +735,11 @@ struct merge_source_tuple {
 /* Virtual methods declarations */
 
 static void
-luaL_merge_source_tuple_destroy(struct merge_source *base);
+merge_source_tuple_destroy(struct merge_source *base);
 static int
-luaL_merge_source_tuple_next(struct merge_source *base,
-			     struct tuple_format *format,
-			     struct tuple **out);
+merge_source_tuple_next(struct merge_source *base,
+			struct tuple_format *format,
+			struct tuple **out);
 
 /* Non-virtual methods */
 
@@ -732,8 +752,8 @@ static struct merge_source *
 luaL_merge_source_tuple_new(struct lua_State *L)
 {
 	static struct merge_source_vtab merge_source_tuple_vtab = {
-		.destroy = luaL_merge_source_tuple_destroy,
-		.next = luaL_merge_source_tuple_next,
+		.destroy = merge_source_tuple_destroy,
+		.next = merge_source_tuple_next,
 	};
 
 	struct merge_source_tuple *source = malloc(
@@ -763,8 +783,8 @@ luaL_merge_source_tuple_new(struct lua_State *L)
  * Return -1 at error (set a diag).
  */
 static int
-luaL_merge_source_tuple_fetch(struct merge_source_tuple *source,
-			       struct lua_State *L)
+luaL_merge_source_tuple_fetch(struct lua_State *L,
+			      struct merge_source_tuple *source)
 {
 	int nresult = luaL_iterator_next(L, source->fetch_it);
 
@@ -792,8 +812,13 @@ luaL_merge_source_tuple_fetch(struct merge_source_tuple *source,
 
 /* Virtual methods */
 
+/**
+ * destroy() virtual method implementation for a tuple source.
+ *
+ * @see struct merge_source_vtab
+ */
 static void
-luaL_merge_source_tuple_destroy(struct merge_source *base)
+merge_source_tuple_destroy(struct merge_source *base)
 {
 	struct merge_source_tuple *source = container_of(base,
 		struct merge_source_tuple, base);
@@ -804,16 +829,21 @@ luaL_merge_source_tuple_destroy(struct merge_source *base)
 	free(source);
 }
 
+/**
+ * next() virtual method implementation for a tuple source.
+ *
+ * @see struct merge_source_vtab
+ */
 static int
-luaL_merge_source_tuple_next(struct merge_source *base,
-			     struct tuple_format *format,
-			     struct tuple **out)
+merge_source_tuple_next(struct merge_source *base,
+			struct tuple_format *format,
+			struct tuple **out)
 {
 	struct lua_State *L = fiber()->storage.lua.stack;
 	struct merge_source_tuple *source = container_of(base,
 		struct merge_source_tuple, base);
 
-	int rc = luaL_merge_source_tuple_fetch(source, L);
+	int rc = luaL_merge_source_tuple_fetch(L, source);
 	if (rc < 0)
 		return -1;
 	/*
-- 
2.25.0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it Alexander Turenko
@ 2020-06-01 18:10 ` Alexander Turenko
  2020-06-02 22:48   ` Vladislav Shpilevoy
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls Alexander Turenko
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-01 18:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Alexander Turenko

A merge source API is designed to be quite abstract: the base structure
and virtual methods do not depend on Lua anyhow. Each source should
implement next() and destroy() virtual methods, which may be called from
C without a Lua state. This design allows to use any source as from C as
well as from Lua. The Lua API is based on the C API and supports any
source. Even merger itself is implemented in pure C according to the
merge source API and so may be used from Lua.

A particular source implementation may use a Lua state internally, but
it is not part of the API and should be hid under hood. In fact all
sources we have now (except merger itself) store some references in
LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
the next() virtual method.

Before this patch, the sources ('buffer', 'table', 'tuple') assume that
a Lua state always exists in the fiber storage of a fiber, where next()
is called. This looks valid on the first glance, because it may be
called either from a Lua code or from merger, which in turn is called
from a Lua code. However background fibers (they serve binary protocol
requests) do not store a Lua state in the fiber storage even for Lua
call / eval requests.

Possible solution would be always store a Lua state in a fiber storage.
There are two reasons why it is not implemented here:

1. There should be a decision about right balance between speed and
   memory footprint and maybe some eviction strategy for cached Lua
   states. Don't sure we can just always store a state in each
   background fiber. It would be wasteful for instances that serve box
   DQL / DML, SQL and/or C procedure calls.
2. Technically contract of the next() method would assume that a Lua
   state should exist in a fiber storage. Such requirement looks quite
   unnatural for a C API and also looks fragile: what if we'll implement
   some less wasteful Lua state caching strategy and the assumption
   about presence of the Lua state will get broken?

Obviously, next() will spend extra time to create a temporary state when
it is called from a background fiber. We should reuse existing Lua state
at least when a Lua call is performed via a binary protocol. I consider
it as the optimization and will solve in the next commit.

A few words about the implementation. I have added three functions,
which acquire a temporary Lua state, call a function and release the
state. It may be squashed into one function that would accept a function
pointer and variable number of arguments. However GCC does not
devirtualize such calls at -O2 level, so it seems it is better to avoid
this. It maybe possible to write some weird macro that will technically
reduce code duplication, but I prefer to write in C, not some macro
based meta-language.

Fixes #4954
---
 src/box/lua/merger.c                          | 153 +++++++++++++++---
 .../gh-4954-merger-via-net-box.test.lua       | 129 +++++++++++++++
 2 files changed, 261 insertions(+), 21 deletions(-)
 create mode 100755 test/box-tap/gh-4954-merger-via-net-box.test.lua

diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
index 16814c041..25df18442 100644
--- a/src/box/lua/merger.c
+++ b/src/box/lua/merger.c
@@ -149,6 +149,68 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
 	return tuple;
 }
 
+/**
+ * Get a temporary Lua state.
+ *
+ * Use case: a function does not accept a Lua state as an argument
+ * to allow using from C code, but uses a Lua value, which is
+ * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
+ * to get and process the value.
+ *
+ * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
+ *
+ * This Lua state should be used only from one fiber: otherwise
+ * one fiber may change the stack and another one will access a
+ * wrong stack slot when it will be scheduled for execution after
+ * yield.
+ *
+ * Return a Lua state on success and set @a coro_ref. This
+ * reference should be passed to `luaT_release_temp_luastate()`,
+ * when the state is not needed anymore.
+ *
+ * Return NULL and set a diag at failure.
+ */
+static struct lua_State *
+luaT_temp_luastate(int *coro_ref)
+{
+	if (fiber()->storage.lua.stack != NULL) {
+		*coro_ref = LUA_REFNIL;
+		return fiber()->storage.lua.stack;
+	}
+
+	/*
+	 * luaT_newthread() pops the new Lua state from
+	 * tarantool_L and it is right thing to do: if we'll push
+	 * something to it and yield, then another fiber will not
+	 * know that a stack top is changed and may operate on a
+	 * wrong slot.
+	 *
+	 * Second, many requests that push a value to tarantool_L
+	 * and yield may exhaust available slots on the stack.
+	 */
+	struct lua_State *L = luaT_newthread(tarantool_L);
+	if (L == NULL)
+		return NULL;
+	/*
+	 * The new state is not referenced from anywhere (reasons
+	 * are above), so we should keep a reference to it in the
+	 * registry while it is in use.
+	 */
+	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
+	return L;
+}
+
+/**
+ * Release a temporary Lua state.
+ *
+ * It is the other half of `luaT_temp_luastate()`.
+ */
+static void
+luaT_release_temp_luastate(int coro_ref)
+{
+	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, coro_ref);
+}
+
 /* }}} */
 
 /* {{{ Create, destroy structures from Lua */
@@ -396,16 +458,12 @@ luaL_merge_source_buffer_new(struct lua_State *L)
 }
 
 /**
- * Call a user provided function to get a next data chunk (a
- * buffer).
- *
- * Return 1 when a new buffer is received, 0 when a buffers
- * iterator ends and -1 at error and set a diag.
+ * Helper for `merge_source_buffer_fetch()`.
  */
 static int
-merge_source_buffer_fetch(struct merge_source_buffer *source)
+luaL_merge_source_buffer_fetch_impl(struct lua_State *L,
+				    struct merge_source_buffer *source)
 {
-	struct lua_State *L = fiber()->storage.lua.stack;
 	int nresult = luaL_iterator_next(L, source->fetch_it);
 
 	/* Handle a Lua error in a gen function. */
@@ -446,6 +504,25 @@ merge_source_buffer_fetch(struct merge_source_buffer *source)
 	return 1;
 }
 
+/**
+ * Call a user provided function to get a next data chunk (a
+ * buffer).
+ *
+ * Return 1 when a new buffer is received, 0 when a buffers
+ * iterator ends and -1 at error and set a diag.
+ */
+static int
+merge_source_buffer_fetch(struct merge_source_buffer *source)
+{
+	int coro_ref = LUA_REFNIL;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	if (L == NULL)
+		return -1;
+	int rc = luaL_merge_source_buffer_fetch_impl(L, source);
+	luaT_release_temp_luastate(coro_ref);
+	return rc;
+}
+
 /* Virtual methods */
 
 /**
@@ -655,16 +732,14 @@ merge_source_table_destroy(struct merge_source *base)
 }
 
 /**
- * next() virtual method implementation for a table source.
- *
- * @see struct merge_source_vtab
+ * Helper for `merge_source_table_next()`.
  */
 static int
-merge_source_table_next(struct merge_source *base,
-			struct tuple_format *format,
-			struct tuple **out)
+luaL_merge_source_table_next_impl(struct lua_State *L,
+				  struct merge_source *base,
+				  struct tuple_format *format,
+				  struct tuple **out)
 {
-	struct lua_State *L = fiber()->storage.lua.stack;
 	struct merge_source_table *source = container_of(base,
 		struct merge_source_table, base);
 
@@ -707,6 +782,25 @@ merge_source_table_next(struct merge_source *base,
 	return 0;
 }
 
+/**
+ * next() virtual method implementation for a table source.
+ *
+ * @see struct merge_source_vtab
+ */
+static int
+merge_source_table_next(struct merge_source *base,
+			     struct tuple_format *format,
+			     struct tuple **out)
+{
+	int coro_ref = LUA_REFNIL;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	if (L == NULL)
+		return -1;
+	int rc = luaL_merge_source_table_next_impl(L, base, format, out);
+	luaT_release_temp_luastate(coro_ref);
+	return rc;
+}
+
 /* Lua functions */
 
 /**
@@ -830,16 +924,14 @@ merge_source_tuple_destroy(struct merge_source *base)
 }
 
 /**
- * next() virtual method implementation for a tuple source.
- *
- * @see struct merge_source_vtab
+ * Helper for `merge_source_tuple_next()`.
  */
 static int
-merge_source_tuple_next(struct merge_source *base,
-			struct tuple_format *format,
-			struct tuple **out)
+luaL_merge_source_tuple_next_impl(struct lua_State *L,
+				  struct merge_source *base,
+				  struct tuple_format *format,
+				  struct tuple **out)
 {
-	struct lua_State *L = fiber()->storage.lua.stack;
 	struct merge_source_tuple *source = container_of(base,
 		struct merge_source_tuple, base);
 
@@ -864,6 +956,25 @@ merge_source_tuple_next(struct merge_source *base,
 	return 0;
 }
 
+/**
+ * next() virtual method implementation for a tuple source.
+ *
+ * @see struct merge_source_vtab
+ */
+static int
+merge_source_tuple_next(struct merge_source *base,
+			struct tuple_format *format,
+			struct tuple **out)
+{
+	int coro_ref = LUA_REFNIL;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	if (L == NULL)
+		return -1;
+	int rc = luaL_merge_source_tuple_next_impl(L, base, format, out);
+	luaT_release_temp_luastate(coro_ref);
+	return rc;
+}
+
 /* Lua functions */
 
 /**
diff --git a/test/box-tap/gh-4954-merger-via-net-box.test.lua b/test/box-tap/gh-4954-merger-via-net-box.test.lua
new file mode 100755
index 000000000..e2bd6f8b9
--- /dev/null
+++ b/test/box-tap/gh-4954-merger-via-net-box.test.lua
@@ -0,0 +1,129 @@
+#!/usr/bin/env tarantool
+
+local merger_lib = require('merger')
+local buffer = require('buffer')
+local msgpack = require('msgpack')
+local net_box = require('net.box')
+local fiber = require('fiber')
+local tap = require('tap')
+
+
+-- {{{ Helpers
+
+-- Lua iterator generator function to iterate over an array.
+local function array_next(arr, idx)
+    idx = idx or 1
+    local item = arr[idx]
+    if item == nil then
+        return
+    end
+    return idx + 1, item
+end
+
+-- Lua iterator generator to iterate over an array with yields.
+local function array_yield_next(arr, idx)
+    fiber.sleep(0)
+    return array_next(arr, idx)
+end
+
+-- }}}
+
+-- {{{ Code that is run in a background fiber (via net.box)
+
+local function use_table_source(tuples)
+    local source = merger_lib.new_source_fromtable(tuples)
+    return source:select()
+end
+_G.use_table_source = use_table_source
+
+local function use_buffer_source(tuples)
+    local buf = buffer.ibuf()
+    msgpack.encode(tuples, buf)
+    local source = merger_lib.new_source_frombuffer(buf)
+    return source:select()
+end
+_G.use_buffer_source = use_buffer_source
+
+local function use_tuple_source(tuples)
+    local source = merger_lib.new_tuple_source(array_next, tuples)
+    return source:select()
+end
+_G.use_tuple_source = use_tuple_source
+
+local function use_table_source_yield(tuples)
+    local chunks = {}
+    for i, t in ipairs(tuples) do
+        chunks[i] = {t}
+    end
+    local source = merger_lib.new_table_source(array_yield_next, chunks)
+    return source:select()
+end
+_G.use_table_source_yield = use_table_source_yield
+
+local function use_buffer_source_yield(tuples)
+    local buffers = {}
+    for i, t in ipairs(tuples) do
+        buffers[i] = buffer.ibuf()
+        msgpack.encode({t}, buffers[i])
+    end
+    local source = merger_lib.new_buffer_source(array_yield_next, buffers)
+    return source:select()
+end
+_G.use_buffer_source_yield = use_buffer_source_yield
+
+local function use_tuple_source_yield(tuples)
+    local source = merger_lib.new_tuple_source(array_yield_next, tuples)
+    return source:select()
+end
+_G.use_tuple_source_yield = use_tuple_source_yield
+
+-- }}}
+
+box.cfg({
+    listen = os.getenv('LISTEN') or 'localhost:3301'
+})
+box.schema.user.grant('guest', 'execute', 'universe', nil,
+                      {if_not_exists = true})
+
+local test = tap.test('gh-4954-merger-via-net-box.test.lua')
+test:plan(6)
+
+local tuples = {
+    {1},
+    {2},
+    {3},
+}
+
+local connection = net_box.connect(box.cfg.listen)
+
+local res = connection:call('use_table_source', {tuples})
+test:is_deeply(res, tuples, 'verify table source')
+local res = connection:call('use_buffer_source', {tuples})
+test:is_deeply(res, tuples, 'verify buffer source')
+local res = connection:call('use_tuple_source', {tuples})
+test:is_deeply(res, tuples, 'verify tuple source')
+
+local function test_verify_source_async(test, func_name, request_count)
+    test:plan(request_count)
+
+    local futures = {}
+    for _ = 1, request_count do
+        local future = connection:call(func_name, {tuples}, {is_async = true})
+        table.insert(futures, future)
+    end
+    for i = 1, request_count do
+        local res = unpack(futures[i]:wait_result())
+        test:is_deeply(res, tuples, ('verify request %d'):format(i))
+    end
+end
+
+test:test('verify table source, which yields', test_verify_source_async,
+          'use_table_source_yield', 100)
+test:test('verify buffer source, which yields', test_verify_source_async,
+          'use_buffer_source_yield', 100)
+test:test('verify tuple source, which yields', test_verify_source_async,
+          'use_tuple_source_yield', 100)
+
+box.schema.user.revoke('guest', 'execute', 'universe')
+
+os.exit(test:check() and 0 or 1)
-- 
2.25.0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it Alexander Turenko
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto Alexander Turenko
@ 2020-06-01 18:10 ` Alexander Turenko
  2020-06-02 22:48   ` Vladislav Shpilevoy
  2020-06-02 22:47 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Vladislav Shpilevoy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-01 18:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Alexander Turenko

There is code that may save some time and resources for creating a new
Lua state when it is present in the fiber storage of a current fiber.
There are not so much of them: running a Lua trigger and construction of
a next tuple in a merge source.

Before the patch, fiber->storage.lua.stack is filled only for the main
fiber and fibers created from Lua using fiber.create() or fiber.new(),
but not for background fibers (which serve binary protocol requests).

This patch fills fiber->storage.lua.stack for background fibers that
serve a Lua call or eval: we already have this state and nothing prevent
us from exposing it via the fiber storage.

Follows up #4954
---
 src/box/lua/call.c   | 27 +++++++++++++++++++++++++++
 src/lib/core/fiber.h |  8 ++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 6588ec2fa..f69d1adbe 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -537,12 +537,39 @@ box_process_lua(lua_CFunction handler, struct execute_lua_ctx *ctx,
 	port_lua_create(ret, L);
 	((struct port_lua *) ret)->ref = coro_ref;
 
+	/*
+	 * A code that need a temporary fiber-local Lua state may
+	 * save some time and resources for creating a new state
+	 * and use this one.
+	 */
+	bool has_lua_stack = fiber()->storage.lua.stack != NULL;
+	if (! has_lua_stack)
+		fiber()->storage.lua.stack = L;
+
 	lua_pushcfunction(L, handler);
 	lua_pushlightuserdata(L, ctx);
 	if (luaT_call(L, 1, LUA_MULTRET) != 0) {
+		if (! has_lua_stack)
+			fiber()->storage.lua.stack = NULL;
 		port_lua_destroy(ret);
 		return -1;
 	}
+
+	/*
+	 * Since this field is optional we're not obligated to
+	 * keep it until the Lua state will be unreferenced in
+	 * port_lua_destroy().
+	 *
+	 * There is no much sense to keep it beyond the Lua call,
+	 * so let's zap now.
+	 *
+	 * But: keep the stack if it was present before the call,
+	 * because it would be counter-intuitive if the existing
+	 * state pointer would be zapped after this function call.
+	 */
+	if (! has_lua_stack)
+		fiber()->storage.lua.stack = NULL;
+
 	return 0;
 }
 
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index cd9346a55..db68fb47e 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -496,8 +496,12 @@ struct fiber {
 		struct credentials *credentials;
 		struct txn *txn;
 		/**
-		 * Lua stack and the optional
-		 * fiber.storage Lua reference.
+		 * Optional Lua state (may be NULL). Useful as a
+		 * temporary Lua state to save time and resources
+		 * on creating it. Should not be used in other
+		 * fibers.
+		 *
+		 * Optional fiber.storage Lua reference.
 		 */
 		struct {
 			struct lua_State *stack;
-- 
2.25.0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
                   ` (2 preceding siblings ...)
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls Alexander Turenko
@ 2020-06-02 22:47 ` Vladislav Shpilevoy
  2020-06-07 17:17   ` Alexander Turenko
  2020-06-07 16:58 ` [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next() Alexander Turenko
  2020-06-17 17:54 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
  5 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 22:47 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Hi! Thanks for the patchset!

I would also ask Igor for a review. Because the patchset does
some manipulations with luaL names, and attempts to optimize
Lua-C code.

On 01/06/2020 20:10, Alexander Turenko wrote:
> The first patch is a bunch of renames: I dropped luaL prefix from
> several functions to highlight the contract: they can be called from
> usual C code, w/o any requirements to pass or to place a Lua state
> somewhere.
> 
> The second patch just fixes NULL pointer dereference that occurs due to
> the wrong assumption: I did think that fiber().storage.lua.stack is
> always exists and non-NULL.
> 
> The third commit is the optimization: it allows to don't create a new
> Lua state in merger when possible.
> 
> https://github.com/tarantool/tarantool/issues/4954
> Totktonada/gh-4954-fix-merger-segfault-full-ci

Do we need a changelog? Bugfixes are visible behaviour change.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it Alexander Turenko
@ 2020-06-02 22:47   ` Vladislav Shpilevoy
  2020-06-07 16:57     ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 22:47 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the patch!

On 01/06/2020 20:10, Alexander Turenko wrote:
> This change highlights the contract of merge source virtual methods:
> they don't require a Lua state to be passed with arguments. Internal use
> of a temporary Lua state is the implementation detail. Technically the
> functions lean on a Lua state existence in a fiber storage, but the next
> commit will relax this requirement.
> 
> Made the following renames:
> 
> * luaL_merge_source_buffer_fetch   -> merge_source_buffer_fetch
> * luaL_merge_source_buffer_next    -> merge_source_buffer_next
> * luaL_merge_source_buffer_destroy -> merge_source_buffer_destroy
> 
> * keep luaL_merge_source_table_fetch (pass <struct lua_State *>)
> * luaL_merge_source_table_next     -> merge_source_table_next
> * luaL_merge_source_table_destroy  -> merge_source_table_destroy
> 
> * keep luaL_merge_source_tuple_fetch (change arguments order)
> * luaL_merge_source_tuple_next     -> merge_source_tuple_next
> * luaL_merge_source_tuple_destroy  -> merge_source_tuple_destroy
> 
> Also added API comments for destroy() and next() virtual methods to
> uniform them visually with other merge source functions.

I don't get why do you need these renames. merge_source API is
located in box/merger.h and box/merger.c. In lua/merger you have
children of struct merge_source. So they are not merge_source. The
latter is a virtual struct. lua/merger merge_source structs are
implementations of this virtual struct. So better not to use the same
prefix as for the top level merge_source API. IMO.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto Alexander Turenko
@ 2020-06-02 22:48   ` Vladislav Shpilevoy
  2020-06-07 16:58     ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 22:48 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the patch!

See 3 comments below.

On 01/06/2020 20:10, Alexander Turenko wrote:
> A merge source API is designed to be quite abstract: the base structure
> and virtual methods do not depend on Lua anyhow. Each source should
> implement next() and destroy() virtual methods, which may be called from
> C without a Lua state. This design allows to use any source as from C as
> well as from Lua. The Lua API is based on the C API and supports any
> source. Even merger itself is implemented in pure C according to the
> merge source API and so may be used from Lua.
> 
> A particular source implementation may use a Lua state internally, but
> it is not part of the API and should be hid under hood. In fact all

1. 'hid' -> 'hidden'. In passive voice V3 is used.

> sources we have now (except merger itself) store some references in
> LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
> the next() virtual method.
> 
> Before this patch, the sources ('buffer', 'table', 'tuple') assume that
> a Lua state always exists in the fiber storage of a fiber, where next()
> is called. This looks valid on the first glance, because it may be
> called either from a Lua code or from merger, which in turn is called
> from a Lua code. However background fibers (they serve binary protocol
> requests) do not store a Lua state in the fiber storage even for Lua
> call / eval requests.
> 
> Possible solution would be always store a Lua state in a fiber storage.
> There are two reasons why it is not implemented here:
> 
> 1. There should be a decision about right balance between speed and
>    memory footprint and maybe some eviction strategy for cached Lua
>    states. Don't sure we can just always store a state in each

2. "Don't sure" -> "Not sure", or "I am not sure".

>    background fiber. It would be wasteful for instances that serve box
>    DQL / DML, SQL and/or C procedure calls.
> 2. Technically contract of the next() method would assume that a Lua
>    state should exist in a fiber storage. Such requirement looks quite
>    unnatural for a C API and also looks fragile: what if we'll implement
>    some less wasteful Lua state caching strategy and the assumption
>    about presence of the Lua state will get broken?
> 
> Obviously, next() will spend extra time to create a temporary state when
> it is called from a background fiber. We should reuse existing Lua state
> at least when a Lua call is performed via a binary protocol. I consider
> it as the optimization and will solve in the next commit.
> 
> A few words about the implementation. I have added three functions,
> which acquire a temporary Lua state, call a function and release the
> state. It may be squashed into one function that would accept a function
> pointer and variable number of arguments. However GCC does not
> devirtualize such calls at -O2 level, so it seems it is better to avoid
> this. It maybe possible to write some weird macro that will technically
> reduce code duplication, but I prefer to write in C, not some macro
> based meta-language.
> 
> Fixes #4954
> ---
>  src/box/lua/merger.c                          | 153 +++++++++++++++---
>  .../gh-4954-merger-via-net-box.test.lua       | 129 +++++++++++++++
>  2 files changed, 261 insertions(+), 21 deletions(-)
>  create mode 100755 test/box-tap/gh-4954-merger-via-net-box.test.lua
> 
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..25df18442 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c> @@ -446,6 +504,25 @@ merge_source_buffer_fetch(struct merge_source_buffer *source)
>  	return 1;
>  }
>  
> +/**
> + * Call a user provided function to get a next data chunk (a
> + * buffer).
> + *
> + * Return 1 when a new buffer is received, 0 when a buffers
> + * iterator ends and -1 at error and set a diag.
> + */
> +static int
> +merge_source_buffer_fetch(struct merge_source_buffer *source)
> +{
> +	int coro_ref = LUA_REFNIL;
> +	struct lua_State *L = luaT_temp_luastate(&coro_ref);
> +	if (L == NULL)
> +		return -1;
> +	int rc = luaL_merge_source_buffer_fetch_impl(L, source);

3. Looks like if luaL_merge_source_tuple_fetch() gets
luaL_iterator_next() result != 2, you leave on the stack whatever is
left here. If fiber's Lua stack is used, there will be left garbage
on it. It is popped in case the merger is used from Lua, when an error
is thrown. But if the merger would be used from C, the stack would
contain garbage afterwards.

The same for luaL_merge_source_buffer_fetch_impl(), but not only if
luaL_iterator_next() returns value != 2.

Not related to this patchset.

> +	luaT_release_temp_luastate(coro_ref);
> +	return rc;
> +}

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls
  2020-06-01 18:10 ` [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls Alexander Turenko
@ 2020-06-02 22:48   ` Vladislav Shpilevoy
  2020-06-07 16:58     ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-02 22:48 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the patch!

See 3 comments below.

On 01/06/2020 20:10, Alexander Turenko wrote:
> There is code that may save some time and resources for creating a new
> Lua state when it is present in the fiber storage of a current fiber.
> There are not so much of them: running a Lua trigger and construction of
> a next tuple in a merge source.
> 
> Before the patch, fiber->storage.lua.stack is filled only for the main
> fiber and fibers created from Lua using fiber.create() or fiber.new(),
> but not for background fibers (which serve binary protocol requests).
> 
> This patch fills fiber->storage.lua.stack for background fibers that
> serve a Lua call or eval: we already have this state and nothing prevent
> us from exposing it via the fiber storage.
> 
> Follows up #4954
> ---
>  src/box/lua/call.c   | 27 +++++++++++++++++++++++++++
>  src/lib/core/fiber.h |  8 ++++++--
>  2 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index 6588ec2fa..f69d1adbe 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -537,12 +537,39 @@ box_process_lua(lua_CFunction handler, struct execute_lua_ctx *ctx,
>  	port_lua_create(ret, L);
>  	((struct port_lua *) ret)->ref = coro_ref;
>  
> +	/*
> +	 * A code that need a temporary fiber-local Lua state may
> +	 * save some time and resources for creating a new state
> +	 * and use this one.
> +	 */
> +	bool has_lua_stack = fiber()->storage.lua.stack != NULL;
> +	if (! has_lua_stack)
> +		fiber()->storage.lua.stack = L;

1. According to recent code style changes, we should omit single
whitespace after unary operators.

> +
>  	lua_pushcfunction(L, handler);
>  	lua_pushlightuserdata(L, ctx);
>  	if (luaT_call(L, 1, LUA_MULTRET) != 0) {
> +		if (! has_lua_stack)
> +			fiber()->storage.lua.stack = NULL;
>  		port_lua_destroy(ret);
>  		return -1;
>  	}
> +
> +	/*
> +	 * Since this field is optional we're not obligated to
> +	 * keep it until the Lua state will be unreferenced in
> +	 * port_lua_destroy().
> +	 *
> +	 * There is no much sense to keep it beyond the Lua call,
> +	 * so let's zap now.
> +	 *
> +	 * But: keep the stack if it was present before the call,
> +	 * because it would be counter-intuitive if the existing
> +	 * state pointer would be zapped after this function call.

2. It is not just counter-intuitive. It would break Lua-born fibers,
since they would suddenly loose their stack.

> +	 */
> +	if (! has_lua_stack)
> +		fiber()->storage.lua.stack = NULL;
> +
>  	return 0;
>  }
>  
> diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
> index cd9346a55..db68fb47e 100644
> --- a/src/lib/core/fiber.h
> +++ b/src/lib/core/fiber.h
> @@ -496,8 +496,12 @@ struct fiber {
>  		struct credentials *credentials;
>  		struct txn *txn;
>  		/**
> -		 * Lua stack and the optional
> -		 * fiber.storage Lua reference.
> +		 * Optional Lua state (may be NULL). Useful as a
> +		 * temporary Lua state to save time and resources
> +		 * on creating it. Should not be used in other
> +		 * fibers.
> +		 *
> +		 * Optional fiber.storage Lua reference.

3. You change the comment anyway, so it would be better to make it
right and put the member-related comments just above the members.
One comment for the stack, and separate comment for the ref.

>  		 */
>  		struct {
>  			struct lua_State *stack;
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-02 22:47   ` Vladislav Shpilevoy
@ 2020-06-07 16:57     ` Alexander Turenko
  2020-06-11 16:17       ` Vladislav Shpilevoy
  2020-06-16 11:59       ` Igor Munkin
  0 siblings, 2 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-07 16:57 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Wed, Jun 03, 2020 at 12:47:45AM +0200, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> On 01/06/2020 20:10, Alexander Turenko wrote:
> > This change highlights the contract of merge source virtual methods:
> > they don't require a Lua state to be passed with arguments. Internal use
> > of a temporary Lua state is the implementation detail. Technically the
> > functions lean on a Lua state existence in a fiber storage, but the next
> > commit will relax this requirement.
> > 
> > Made the following renames:
> > 
> > * luaL_merge_source_buffer_fetch   -> merge_source_buffer_fetch
> > * luaL_merge_source_buffer_next    -> merge_source_buffer_next
> > * luaL_merge_source_buffer_destroy -> merge_source_buffer_destroy
> > 
> > * keep luaL_merge_source_table_fetch (pass <struct lua_State *>)
> > * luaL_merge_source_table_next     -> merge_source_table_next
> > * luaL_merge_source_table_destroy  -> merge_source_table_destroy
> > 
> > * keep luaL_merge_source_tuple_fetch (change arguments order)
> > * luaL_merge_source_tuple_next     -> merge_source_tuple_next
> > * luaL_merge_source_tuple_destroy  -> merge_source_tuple_destroy
> > 
> > Also added API comments for destroy() and next() virtual methods to
> > uniform them visually with other merge source functions.
> 
> I don't get why do you need these renames. merge_source API is
> located in box/merger.h and box/merger.c. In lua/merger you have
> children of struct merge_source. So they are not merge_source. The
> latter is a virtual struct. lua/merger merge_source structs are
> implementations of this virtual struct. So better not to use the same
> prefix as for the top level merge_source API. IMO.

Not sure I got your idea. We have no special prefix for merge source
implementations except 'merge_source' itself. Say, 'struct
merge_source_buffer' has no prefix. Functions to construct a merge
source from Lua have 'luaL' prefix, because they work on a passed Lua
state.

Functions that use a Lua state (but don't correspond to 'lua_CFunction'
prototype) are prefixed with 'luaL' (maybe will be prefixed with 'luaT'
/ 'luaE' in future if Igor will push us strong enough). The functions of
the question use a Lua state, but find/create it on demand internally.
So they are a kind of usual C functions and should not be prefixed with
luaL. IMO.

I'll CC Igor to see what will look worthful for him.

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-02 22:48   ` Vladislav Shpilevoy
@ 2020-06-07 16:58     ` Alexander Turenko
  2020-06-11 16:18       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-07 16:58 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

> > A particular source implementation may use a Lua state internally, but
> > it is not part of the API and should be hid under hood. In fact all
> 
> 1. 'hid' -> 'hidden'. In passive voice V3 is used.

Fixed.

> > 1. There should be a decision about right balance between speed and
> >    memory footprint and maybe some eviction strategy for cached Lua
> >    states. Don't sure we can just always store a state in each
> 
> 2. "Don't sure" -> "Not sure", or "I am not sure".

Fixed.

> > +/**
> > + * Call a user provided function to get a next data chunk (a
> > + * buffer).
> > + *
> > + * Return 1 when a new buffer is received, 0 when a buffers
> > + * iterator ends and -1 at error and set a diag.
> > + */
> > +static int
> > +merge_source_buffer_fetch(struct merge_source_buffer *source)
> > +{
> > +	int coro_ref = LUA_REFNIL;
> > +	struct lua_State *L = luaT_temp_luastate(&coro_ref);
> > +	if (L == NULL)
> > +		return -1;
> > +	int rc = luaL_merge_source_buffer_fetch_impl(L, source);
> 
> 3. Looks like if luaL_merge_source_tuple_fetch() gets
> luaL_iterator_next() result != 2, you leave on the stack whatever is
> left here. If fiber's Lua stack is used, there will be left garbage
> on it. It is popped in case the merger is used from Lua, when an error
> is thrown. But if the merger would be used from C, the stack would
> contain garbage afterwards.
> 
> The same for luaL_merge_source_buffer_fetch_impl(), but not only if
> luaL_iterator_next() returns value != 2.
> 
> Not related to this patchset.

Nice catch!

I didn't bother much about this, because it does not matter much for
usual Lua/C functions. However this case is different.

I think it should be handled by luaT_temp_luastate() /
luaT_release_temp_luastate() within this patchset. I have added one more
patch to the series (as third). I'll send it as answer to the cover
letter ('[PATCH 2.5/3] merger: clean fiber-local Lua stack after
next()').

NB: The test that is added within this patch will not work on 2.4 and
below, because those tarantool versions do not expose all symbols from
the executable.

Updated 2nd patch (this one) with FIXME (which will be resolved and
removed by the next patch).

 |  static void
 |  luaT_release_temp_luastate(int coro_ref)
 |  {
 | +	/*
 | +	 * FIXME: The reusable fiber-local Lua state is not
 | +	 * unreferenced here (coro_ref == LUA_REFNIL), but
 | +	 * it must be truncated to its past top to prevent
 | +	 * stack overflow.
 | +	 */
 |  	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, coro_ref);
 |  }
 
Added the following comment into the commit message:

 | Usage of the fiber-local Lua state is not quite correct now: merge
 | source code may left garbage on a stack in case of failures (like
 | unexpected result of Lua iterator generator function). This behaviour is
 | kept as is here, but it will be resolved by the next patch.

While writting the test, found [1]. I'll send a fix separately.

[1]: https://github.com/tarantool/tarantool/issues/5048

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls
  2020-06-02 22:48   ` Vladislav Shpilevoy
@ 2020-06-07 16:58     ` Alexander Turenko
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-07 16:58 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

> > +	/*
> > +	 * A code that need a temporary fiber-local Lua state may
> > +	 * save some time and resources for creating a new state
> > +	 * and use this one.
> > +	 */
> > +	bool has_lua_stack = fiber()->storage.lua.stack != NULL;
> > +	if (! has_lua_stack)
> > +		fiber()->storage.lua.stack = L;
> 
> 1. According to recent code style changes, we should omit single
> whitespace after unary operators.

Fixed.

> > +	/*
> > +	 * Since this field is optional we're not obligated to
> > +	 * keep it until the Lua state will be unreferenced in
> > +	 * port_lua_destroy().
> > +	 *
> > +	 * There is no much sense to keep it beyond the Lua call,
> > +	 * so let's zap now.
> > +	 *
> > +	 * But: keep the stack if it was present before the call,
> > +	 * because it would be counter-intuitive if the existing
> > +	 * state pointer would be zapped after this function call.
> 
> 2. It is not just counter-intuitive. It would break Lua-born fibers,
> since they would suddenly loose their stack.

Nope, it'll work.

Once we put the stack into a Lua-born fiber, this field may be safely
zapped. (I looked into this, because I want to create a test case that
would verify whether we keep the existing state here, but failed to find
such scenario.)

> >  		/**
> > -		 * Lua stack and the optional
> > -		 * fiber.storage Lua reference.
> > +		 * Optional Lua state (may be NULL). Useful as a
> > +		 * temporary Lua state to save time and resources
> > +		 * on creating it. Should not be used in other
> > +		 * fibers.
> > +		 *
> > +		 * Optional fiber.storage Lua reference.
> 
> 3. You change the comment anyway, so it would be better to make it
> right and put the member-related comments just above the members.
> One comment for the stack, and separate comment for the ref.

Fixed.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next()
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
                   ` (3 preceding siblings ...)
  2020-06-02 22:47 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Vladislav Shpilevoy
@ 2020-06-07 16:58 ` Alexander Turenko
  2020-06-11 16:20   ` Vladislav Shpilevoy
  2020-06-17 17:54 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-07 16:58 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Alexander Turenko

merge_source_next() implementations for 'buffer', 'table' and 'tuple'
sources use a temporary Lua state to process data passed from Lua. It
may use the fiber-local Lua state when it is present. In this case the
state is not freed when all operations are done and may be reused later.
We should ensure that merge_source_next() code do not leave extra values
on the stack, because otherwise the stack may be overflowed.

Now those functions may be called only from Lua and if the fiber-local
Lua state is present it is the same as one that is passed to a Lua/C
function. After a Lua/C call a Lua interpreter automatically removes
everything below returned values. So the stack will not accumulate any
garbage.

However the merge source API is desined in the way to allow to use any
merge source from C. If we'll use it this way in a future, we can meet
the described problem.

The merge_source_next() implementations do not leave any garbage on a
Lua stack at success path, but may left something when an error occurs
(say, when a Lua iterator generator returns more then two values). I
would not bother with finding and fixing all such cases, considering
that it would be valid for usual Lua/C code. However it seems reasonable
to ensure that a stack is even when we releasing a temporary Lua state.

Reported-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>

Follows up #4954
---

When cherry-picking to 2.4 and below, all changes in test/ directory
should be removed, because the test leans on the fact that all symbols
are exposed from the tarantool executable. It is performed since
2.5.0-42-g03790ac55 ('cmake: remove dynamic-list linker option').

 src/box/lua/merger.c                       |  47 ++--
 test/CMakeLists.txt                        |   1 +
 test/box-tap/CMakeLists.txt                |   4 +
 test/box-tap/check_merge_source.c          | 101 +++++++++
 test/box-tap/gh-4954-merger-via-c.test.lua | 247 +++++++++++++++++++++
 5 files changed, 381 insertions(+), 19 deletions(-)
 create mode 100644 test/box-tap/CMakeLists.txt
 create mode 100644 test/box-tap/check_merge_source.c
 create mode 100755 test/box-tap/gh-4954-merger-via-c.test.lua

diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
index b8c432114..c7947d7da 100644
--- a/src/box/lua/merger.c
+++ b/src/box/lua/merger.c
@@ -164,18 +164,27 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
  * wrong stack slot when it will be scheduled for execution after
  * yield.
  *
- * Return a Lua state on success and set @a coro_ref. This
- * reference should be passed to `luaT_release_temp_luastate()`,
- * when the state is not needed anymore.
+ * Return a Lua state on success and set @a coro_ref and @a top.
+ * These values should be passed to
+ * `luaT_release_temp_luastate()`, when the state is not needed
+ * anymore.
  *
  * Return NULL and set a diag at failure.
  */
 static struct lua_State *
-luaT_temp_luastate(int *coro_ref)
+luaT_temp_luastate(int *coro_ref, int *top)
 {
 	if (fiber()->storage.lua.stack != NULL) {
+		/*
+		 * Reuse existing stack. In the releasing function
+		 * we should drop a stack top to its initial
+		 * value to don't exhaust available slots by
+		 * many requests in row.
+		 */
+		struct lua_State *L = fiber()->storage.lua.stack;
 		*coro_ref = LUA_REFNIL;
-		return fiber()->storage.lua.stack;
+		*top = lua_gettop(L);
+		return L;
 	}
 
 	/*
@@ -197,6 +206,7 @@ luaT_temp_luastate(int *coro_ref)
 	 * registry while it is in use.
 	 */
 	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
+	*top = -1;
 	return L;
 }
 
@@ -206,14 +216,10 @@ luaT_temp_luastate(int *coro_ref)
  * It is the other half of `luaT_temp_luastate()`.
  */
 static void
-luaT_release_temp_luastate(int coro_ref)
+luaT_release_temp_luastate(struct lua_State *L, int coro_ref, int top)
 {
-	/*
-	 * FIXME: The reusable fiber-local Lua state is not
-	 * unreferenced here (coro_ref == LUA_REFNIL), but
-	 * it must be truncated to its past top to prevent
-	 * stack overflow.
-	 */
+	if (top >= 0)
+		lua_settop(L, top);
 	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, coro_ref);
 }
 
@@ -521,11 +527,12 @@ static int
 merge_source_buffer_fetch(struct merge_source_buffer *source)
 {
 	int coro_ref = LUA_REFNIL;
-	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	int top = -1;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref, &top);
 	if (L == NULL)
 		return -1;
 	int rc = luaL_merge_source_buffer_fetch_impl(L, source);
-	luaT_release_temp_luastate(coro_ref);
+	luaT_release_temp_luastate(L, coro_ref, top);
 	return rc;
 }
 
@@ -799,11 +806,12 @@ merge_source_table_next(struct merge_source *base,
 			     struct tuple **out)
 {
 	int coro_ref = LUA_REFNIL;
-	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	int top = -1;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref, &top);
 	if (L == NULL)
 		return -1;
 	int rc = luaL_merge_source_table_next_impl(L, base, format, out);
-	luaT_release_temp_luastate(coro_ref);
+	luaT_release_temp_luastate(L, coro_ref, top);
 	return rc;
 }
 
@@ -898,7 +906,7 @@ luaL_merge_source_tuple_fetch(struct lua_State *L,
 
 	/* Handle incorrect results count. */
 	if (nresult != 2) {
-		diag_set(IllegalParams, "Expected <state>, <tuple> got %d "
+		diag_set(IllegalParams, "Expected <state>, <tuple>, got %d "
 			 "return values", nresult);
 		return -1;
 	}
@@ -973,11 +981,12 @@ merge_source_tuple_next(struct merge_source *base,
 			struct tuple **out)
 {
 	int coro_ref = LUA_REFNIL;
-	struct lua_State *L = luaT_temp_luastate(&coro_ref);
+	int top = -1;
+	struct lua_State *L = luaT_temp_luastate(&coro_ref, &top);
 	if (L == NULL)
 		return -1;
 	int rc = luaL_merge_source_tuple_next_impl(L, base, format, out);
-	luaT_release_temp_luastate(coro_ref);
+	luaT_release_temp_luastate(L, coro_ref, top);
 	return rc;
 }
 
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 8d9d0462a..49985b7d4 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -57,6 +57,7 @@ add_custom_target(test-force
 add_subdirectory(app)
 add_subdirectory(app-tap)
 add_subdirectory(box)
+add_subdirectory(box-tap)
 add_subdirectory(unit)
 add_subdirectory(${PROJECT_SOURCE_DIR}/third_party/luajit/test/gh-4427-ffi-sandwich ${PROJECT_BINARY_DIR}/third_party/luajit/test/gh-4427-ffi-sandwich)
 add_subdirectory(${PROJECT_SOURCE_DIR}/third_party/luajit/test/lj-flush-on-trace ${PROJECT_BINARY_DIR}/third_party/luajit/test/lj-flush-on-trace)
diff --git a/test/box-tap/CMakeLists.txt b/test/box-tap/CMakeLists.txt
new file mode 100644
index 000000000..a39898bd8
--- /dev/null
+++ b/test/box-tap/CMakeLists.txt
@@ -0,0 +1,4 @@
+# fiber.h requires tarantool_ev.h from third_party directory.
+include_directories(${CMAKE_SOURCE_DIR}/third_party)
+
+build_module(check_merge_source check_merge_source.c)
diff --git a/test/box-tap/check_merge_source.c b/test/box-tap/check_merge_source.c
new file mode 100644
index 000000000..dbbf27bd1
--- /dev/null
+++ b/test/box-tap/check_merge_source.c
@@ -0,0 +1,101 @@
+#include <lua.h>           /* lua_*() */
+#include <lauxlib.h>       /* struct luaL_Reg */
+#include "lib/core/diag.h" /* struct error, diag_*() */
+#include "fiber.h"         /* fiber_self() */
+#include "lua/utils.h"     /* luaL_checkcdata() */
+#include "box/merger.h"    /* struct merge_source,
+			      merge_source_next() */
+
+/**
+ * Verify whether a temporary fiber-local Lua state has the same
+ * amount of stack slots before and after merge_source_next()
+ * call.
+ *
+ * A merge source is designed to be used from plain C code without
+ * passing any Lua state explicitly. There are merge sources
+ * ('table', 'buffer', 'tuple') that require temporary Lua stack
+ * to fetch next tuple and they use fiber-local Lua stack when it
+ * is available.
+ *
+ * Such calls should not left garbage on the fiber-local Lua
+ * stack, because many of them in row may overflow the stack.
+ *
+ * The module built as a separate dynamic library, but it uses
+ * internal tarantool functions. So it is not a 'real' external
+ * module, but the stub that imitates usage of a merge source from
+ * tarantool code.
+ */
+
+struct tuple;
+
+/**
+ * Extract a merge source from the Lua stack.
+ */
+static struct merge_source *
+luaT_check_merge_source(struct lua_State *L, int idx)
+{
+	uint32_t cdata_type;
+	struct merge_source **source_ptr = luaL_checkcdata(L, idx, &cdata_type);
+	assert(source_ptr != NULL);
+	return *source_ptr;
+}
+
+/**
+ * Call merge_source_next() virtual method of a merge source.
+ *
+ * The purpose of this function is to verify whether the
+ * fiber-local Lua stack is properly cleaned after
+ * merge_source_next() call on the passed merge source.
+ *
+ * @param merge_source    a merge source to call
+ *                        merge_source_next() on it
+ *
+ * @retval is_next_ok     whether the call is successful
+ * @retval err_msg        error message from the call or nil
+ * @retval is_stack_even  whether the fiber-local Lua stack is
+ *                        even after the call
+ */
+static int
+lbox_check_merge_source_call_next(struct lua_State *L)
+{
+	assert(lua_gettop(L) == 1);
+
+	/*
+	 * Ensure that there is reusable temporary Lua stack.
+	 *
+	 * Note: It may be the same as L (and usually do).
+	 */
+	struct lua_State *temporary_L = fiber_self()->storage.lua.stack;
+	assert(temporary_L != NULL);
+
+	struct tuple *tuple;
+	struct merge_source *source = luaT_check_merge_source(L, 1);
+
+	int top = lua_gettop(temporary_L);
+	int rc = merge_source_next(source, NULL, &tuple);
+	(void) tuple;
+	bool is_stack_even = lua_gettop(temporary_L) == top;
+	struct error *e = diag_last_error(diag_get());
+
+	lua_pushboolean(L, rc == 0);
+	if (rc == 0)
+		lua_pushnil(L);
+	else
+		lua_pushstring(L, e->errmsg);
+	lua_pushboolean(L, is_stack_even);
+	return 3;
+}
+
+/**
+ * Register the module.
+ */
+LUA_API int
+luaopen_check_merge_source(struct lua_State *L)
+{
+	static const struct luaL_Reg meta[] = {
+		{"call_next", lbox_check_merge_source_call_next},
+		{NULL, NULL}
+	};
+	luaL_register(L, "merge_source", meta);
+	return 1;
+}
diff --git a/test/box-tap/gh-4954-merger-via-c.test.lua b/test/box-tap/gh-4954-merger-via-c.test.lua
new file mode 100755
index 000000000..963b5825a
--- /dev/null
+++ b/test/box-tap/gh-4954-merger-via-c.test.lua
@@ -0,0 +1,247 @@
+#!/usr/bin/env tarantool
+
+--
+-- gh-4954: The fiber-local Lua stack should be even after
+-- merge_source_next() call from C code.
+--
+-- See test/box-tap/merge_source.c for more information.
+--
+
+local fio = require('fio')
+
+-- Use BUILDDIR passed from test-run or cwd when run w/o
+-- test-run to find test/box-tap/merge_source.{so,dylib}.
+local build_path = os.getenv('BUILDDIR') or '.'
+package.cpath = fio.pathjoin(build_path, 'test/box-tap/?.so'   ) .. ';' ..
+                fio.pathjoin(build_path, 'test/box-tap/?.dylib') .. ';' ..
+                package.cpath
+
+local buffer = require('buffer')
+local msgpack = require('msgpack')
+local merger = require('merger')
+local tap = require('tap')
+local check_merge_source = require('check_merge_source')
+
+-- {{{ Lua iterator generator functions
+
+local function triplet()
+    return 1, 2, 3
+end
+
+local function wrong_type()
+    return 1, 2
+end
+
+local function no_chunks()
+    return nil
+end
+
+local function bad_buffer()
+    local buf = buffer.ibuf()
+    msgpack.encode({foo = 'bar'}, buf)
+    return 1, buf
+end
+
+local function bad_tuple_in_buffer()
+    local tuple = 1
+    local buf = buffer.ibuf()
+    msgpack.encode({tuple}, buf)
+    return 1, buf
+end
+
+local function empty_buffer(_, state)
+    if state ~= nil then
+        return nil
+    end
+    local buf = buffer.ibuf()
+    return 1, buf
+end
+
+local function no_tuples_buffer(_, state)
+    if state ~= nil then
+        return nil
+    end
+    local buf = buffer.ibuf()
+    msgpack.encode({}, buf)
+    return 1, buf
+end
+
+local function good_buffer(_, state)
+    if state ~= nil then
+        return nil
+    end
+    local buf = buffer.ibuf()
+    local tuple = {1, 2, 3}
+    msgpack.encode({tuple}, buf)
+    return 1, buf
+end
+
+local function bad_tuple_in_table()
+    local tuple = 1
+    return 1, {tuple}
+end
+
+local function empty_table(_, state)
+    if state ~= nil then
+        return nil
+    end
+    return 1, {}
+end
+
+local function good_table(_, state)
+    if state ~= nil then
+        return nil
+    end
+    local tuple = {1, 2, 3}
+    return 1, {tuple}
+end
+
+local function bad_tuple()
+    local tuple = 1
+    return 1, tuple
+end
+
+local function good_tuple(_, state)
+    if state ~= nil then
+        return nil
+    end
+    local tuple = {1, 2, 3}
+    return 1, tuple
+end
+
+-- }}}
+
+local cases = {
+    {
+        'buffer source, bad gen function',
+        source_new = merger.new_buffer_source,
+        source_gen = triplet,
+        exp_err = '^Expected <state>, <buffer>, got 3 return values$',
+    },
+    {
+        'buffer source, bad gen result',
+        source_new = merger.new_buffer_source,
+        source_gen = wrong_type,
+        exp_err = '^Expected <state>, <buffer>$',
+    },
+    {
+        'buffer source, bad buffer',
+        source_new = merger.new_buffer_source,
+        source_gen = bad_buffer,
+        exp_err = '^Invalid merge source 0x[0-9a-f]+$',
+    },
+    -- FIXME: Enable after gh-5048: ('non-array tuple in a buffer
+    -- leads to assertion fail').
+    --[[
+    {
+        'buffer source, bad tuple in buffer',
+        source_new = merger.new_buffer_source,
+        source_gen = bad_tuple_in_buffer,
+        exp_err = '^A tuple must be an array$',
+    },
+    ]]--
+    {
+        'buffer source, no buffers',
+        source_new = merger.new_buffer_source,
+        source_gen = no_chunks,
+    },
+    {
+        'buffer source, empty buffer',
+        source_new = merger.new_buffer_source,
+        source_gen = empty_buffer,
+    },
+    {
+        'buffer source, no tuples buffer',
+        source_new = merger.new_buffer_source,
+        source_gen = no_tuples_buffer,
+    },
+    {
+        'buffer source, good buffer',
+        source_new = merger.new_buffer_source,
+        source_gen = good_buffer,
+    },
+    {
+        'table source, bad gen function',
+        source_new = merger.new_table_source,
+        source_gen = triplet,
+        exp_err = '^Expected <state>, <table>, got 3 return values$',
+    },
+    {
+        'table source, bad gen result',
+        source_new = merger.new_table_source,
+        source_gen = wrong_type,
+        exp_err = '^Expected <state>, <table>$',
+    },
+    {
+        'table source, bad tuple in table',
+        source_new = merger.new_table_source,
+        source_gen = bad_tuple_in_table,
+        exp_err = '^A tuple or a table expected, got number$',
+    },
+    {
+        'buffer source, no tables',
+        source_new = merger.new_table_source,
+        source_gen = no_chunks,
+    },
+    {
+        'table source, empty table',
+        source_new = merger.new_table_source,
+        source_gen = empty_table,
+    },
+    {
+        'table source, good table',
+        source_new = merger.new_table_source,
+        source_gen = good_table,
+    },
+    {
+        'tuple source, bad gen function',
+        source_new = merger.new_tuple_source,
+        source_gen = triplet,
+        exp_err = '^Expected <state>, <tuple>, got 3 return values$',
+    },
+    {
+        'tuple source, bad gen result',
+        source_new = merger.new_tuple_source,
+        source_gen = wrong_type,
+        exp_err = '^A tuple or a table expected, got number$',
+    },
+    {
+        'tuple source, bad tuple',
+        source_new = merger.new_tuple_source,
+        source_gen = bad_tuple,
+        exp_err = '^A tuple or a table expected, got number$',
+    },
+    {
+        'tuple source, no tuples',
+        source_new = merger.new_tuple_source,
+        source_gen = no_chunks,
+    },
+    {
+        'tuple source, good tuple',
+        source_new = merger.new_tuple_source,
+        source_gen = good_tuple,
+    },
+}
+
+local test = tap.test('gh-4954-merger-via-c')
+test:plan(#cases)
+
+for _, case in ipairs(cases) do
+    test:test(case[1], function(test)
+        test:plan(3)
+        local source = case.source_new(case.source_gen)
+        local is_next_ok, err_msg, is_stack_even =
+            check_merge_source.call_next(source)
+        if case.exp_err == nil then
+            test:ok(is_next_ok, 'merge_source_next() should succeed')
+            test:ok(err_msg == nil, 'no error message')
+        else
+            test:ok(not is_next_ok, 'merge_source_next() should fail')
+            test:ok(string.match(err_msg, case.exp_err), 'verify error message',
+                                 {err_msg = err_msg, exp_err = case.exp_err})
+        end
+        test:ok(is_stack_even, 'fiber-local Lua stack should be even')
+    end)
+end
+
+os.exit(test:check() and 0 or 1)
-- 
2.25.0

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence
  2020-06-02 22:47 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Vladislav Shpilevoy
@ 2020-06-07 17:17   ` Alexander Turenko
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-07 17:17 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Wed, Jun 03, 2020 at 12:47:27AM +0200, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patchset!
> 
> I would also ask Igor for a review. Because the patchset does
> some manipulations with luaL names, and attempts to optimize
> Lua-C code.
> 
> On 01/06/2020 20:10, Alexander Turenko wrote:
> > The first patch is a bunch of renames: I dropped luaL prefix from
> > several functions to highlight the contract: they can be called from
> > usual C code, w/o any requirements to pass or to place a Lua state
> > somewhere.
> > 
> > The second patch just fixes NULL pointer dereference that occurs due to
> > the wrong assumption: I did think that fiber().storage.lua.stack is
> > always exists and non-NULL.
> > 
> > The third commit is the optimization: it allows to don't create a new
> > Lua state in merger when possible.
> > 
> > https://github.com/tarantool/tarantool/issues/4954
> > Totktonada/gh-4954-fix-merger-segfault-full-ci
> 
> Do we need a changelog? Bugfixes are visible behaviour change.

Thanks for the reminder!

@ChangeLog

- merger: fix NULL pointer dereference when merger is called via the
  binary protocol (say, via net.box) (gh-4954)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-07 16:57     ` Alexander Turenko
@ 2020-06-11 16:17       ` Vladislav Shpilevoy
  2020-06-16 11:59       ` Igor Munkin
  1 sibling, 0 replies; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-11 16:17 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

On 07/06/2020 18:57, Alexander Turenko wrote:
> On Wed, Jun 03, 2020 at 12:47:45AM +0200, Vladislav Shpilevoy wrote:
>> Thanks for the patch!
>>
>> On 01/06/2020 20:10, Alexander Turenko wrote:
>>> This change highlights the contract of merge source virtual methods:
>>> they don't require a Lua state to be passed with arguments. Internal use
>>> of a temporary Lua state is the implementation detail. Technically the
>>> functions lean on a Lua state existence in a fiber storage, but the next
>>> commit will relax this requirement.
>>>
>>> Made the following renames:
>>>
>>> * luaL_merge_source_buffer_fetch   -> merge_source_buffer_fetch
>>> * luaL_merge_source_buffer_next    -> merge_source_buffer_next
>>> * luaL_merge_source_buffer_destroy -> merge_source_buffer_destroy
>>>
>>> * keep luaL_merge_source_table_fetch (pass <struct lua_State *>)
>>> * luaL_merge_source_table_next     -> merge_source_table_next
>>> * luaL_merge_source_table_destroy  -> merge_source_table_destroy
>>>
>>> * keep luaL_merge_source_tuple_fetch (change arguments order)
>>> * luaL_merge_source_tuple_next     -> merge_source_tuple_next
>>> * luaL_merge_source_tuple_destroy  -> merge_source_tuple_destroy
>>>
>>> Also added API comments for destroy() and next() virtual methods to
>>> uniform them visually with other merge source functions.
>>
>> I don't get why do you need these renames. merge_source API is
>> located in box/merger.h and box/merger.c. In lua/merger you have
>> children of struct merge_source. So they are not merge_source. The
>> latter is a virtual struct. lua/merger merge_source structs are
>> implementations of this virtual struct. So better not to use the same
>> prefix as for the top level merge_source API. IMO.
> 
> Not sure I got your idea. We have no special prefix for merge source
> implementations except 'merge_source' itself. Say, 'struct
> merge_source_buffer' has no prefix. Functions to construct a merge
> source from Lua have 'luaL' prefix, because they work on a passed Lua
> state.

But merge_source_buffer != merge_source. Generally we name methods
using <struct name>_<method_name>. So the correct way was to use
merge_source_buffer_ prefix. Not just merge_source_. Talking of luaL_
prefix, IMO its role here is secondary. You just need to either use
it for all merge_source_buffer methods as an additional prefix, or
don't use it in either of them.

Currently your merge_source_buffer methods are:

    merge_source_buffer_destroy
    merge_source_buffer_next
    luaL_merge_source_buffer_new
    luaL_merge_source_buffer_fetch_impl
    merge_source_buffer_fetch

Seems inconsistent. Both names and argument order in some of them.
If a function takes luaL_State, it does not take luaL_ prefix
automatically and unconditionally. It is just a parameter. These
functions in the first place are merge_source_buffer methods. Not
luaL_State methods.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-07 16:58     ` Alexander Turenko
@ 2020-06-11 16:18       ` Vladislav Shpilevoy
  2020-06-17 17:53         ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-11 16:18 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the fixes!

See 2 comments below.

>     A merge source API is designed to be quite abstract: the base structure
>     and virtual methods do not depend on Lua anyhow. Each source should
>     implement next() and destroy() virtual methods, which may be called from
>     C without a Lua state. This design allows to use any source as from C as
>     well as from Lua. The Lua API is based on the C API and supports any
>     source. Even merger itself is implemented in pure C according to the
>     merge source API and so may be used from Lua.
>     
>     A particular source implementation may use a Lua state internally, but
>     it is not part of the API and should be hidden under hood. In fact all
>     sources we have now (except merger itself) store some references in
>     LUA_REGISTRYINDEX and need a temporary Lua stack to work with them in
>     the next() virtual method.
>     
>     Before this patch, the sources ('buffer', 'table', 'tuple') assume that
>     a Lua state always exists in the fiber storage of a fiber, where next()
>     is called. This looks valid on the first glance, because it may be
>     called either from a Lua code or from merger, which in turn is called
>     from a Lua code. However background fibers (they serve binary protocol
>     requests) do not store a Lua state in the fiber storage even for Lua
>     call / eval requests.
>     
>     Possible solution would be always store a Lua state in a fiber storage.
>     There are two reasons why it is not implemented here:
>     
>     1. There should be a decision about right balance between speed and
>        memory footprint and maybe some eviction strategy for cached Lua
>        states. Not sure we can just always store a state in each background
>        fiber. It would be wasteful for instances that serve box DQL / DML,
>        SQL and/or C procedure calls.
>     2. Technically contract of the next() method would assume that a Lua
>        state should exist in a fiber storage. Such requirement looks quite
>        unnatural for a C API and also looks fragile: what if we'll implement
>        some less wasteful Lua state caching strategy and the assumption
>        about presence of the Lua state will get broken?
>     
>     Obviously, next() will spend extra time to create a temporary state when
>     it is called from a background fiber. We should reuse existing Lua state
>     at least when a Lua call is performed via a binary protocol. I consider
>     it as the optimization and will solve in the next commit.
>     
>     A few words about the implementation. I have added three functions,
>     which acquire a temporary Lua state, call a function and release the
>     state. It may be squashed into one function that would accept a function
>     pointer and variable number of arguments. However GCC does not
>     devirtualize such calls at -O2 level, so it seems it is better to avoid
>     this. It maybe possible to write some weird macro that will technically
>     reduce code duplication, but I prefer to write in C, not some macro
>     based meta-language.
>     
>     Usage of the fiber-local Lua state is not quite correct now: merge

1. merge -> merger.

>     source code may left garbage on a stack in case of failures (like
>     unexpected result of Lua iterator generator function). This behaviour is
>     kept as is here, but it will be resolved by the next patch.
>     
>     Fixes #4954
> 
> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index 16814c041..b8c432114 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -149,6 +149,74 @@ luaT_gettuple(struct lua_State *L, int idx, struct tuple_format *format)
>  	return tuple;
>  }
>  
> +/**
> + * Get a temporary Lua state.
> + *
> + * Use case: a function does not accept a Lua state as an argument
> + * to allow using from C code, but uses a Lua value, which is
> + * referenced in LUA_REGISTRYINDEX. A temporary Lua stack is needed
> + * to get and process the value.
> + *
> + * The returned state shares LUA_REGISTRYINDEX with `tarantool_L`.
> + *
> + * This Lua state should be used only from one fiber: otherwise
> + * one fiber may change the stack and another one will access a
> + * wrong stack slot when it will be scheduled for execution after
> + * yield.
> + *
> + * Return a Lua state on success and set @a coro_ref. This
> + * reference should be passed to `luaT_release_temp_luastate()`,
> + * when the state is not needed anymore.
> + *
> + * Return NULL and set a diag at failure.
> + */
> +static struct lua_State *
> +luaT_temp_luastate(int *coro_ref)
> +{
> +	if (fiber()->storage.lua.stack != NULL) {
> +		*coro_ref = LUA_REFNIL;
> +		return fiber()->storage.lua.stack;
> +	}
> +
> +	/*
> +	 * luaT_newthread() pops the new Lua state from
> +	 * tarantool_L and it is right thing to do: if we'll push
> +	 * something to it and yield, then another fiber will not
> +	 * know that a stack top is changed and may operate on a
> +	 * wrong slot.
> +	 *
> +	 * Second, many requests that push a value to tarantool_L
> +	 * and yield may exhaust available slots on the stack.
> +	 */
> +	struct lua_State *L = luaT_newthread(tarantool_L);
> +	if (L == NULL)
> +		return NULL;

2. luaT_newthread() does not set a diag. That may lead to a crash,
because as far as I see, this function may be called
lbox_merge_source_gen() indirectly, somewhere deep in the callstack.
And it luaT_error(), when merge_source_next() fails.

> +	/*
> +	 * The new state is not referenced from anywhere (reasons
> +	 * are above), so we should keep a reference to it in the
> +	 * registry while it is in use.
> +	 */
> +	*coro_ref = luaL_ref(tarantool_L, LUA_REGISTRYINDEX);
> +	return L;
> +}

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next()
  2020-06-07 16:58 ` [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next() Alexander Turenko
@ 2020-06-11 16:20   ` Vladislav Shpilevoy
  2020-06-17 17:53     ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-11 16:20 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the patch!

See 6 comments below.

> diff --git a/src/box/lua/merger.c b/src/box/lua/merger.c
> index b8c432114..c7947d7da 100644
> --- a/src/box/lua/merger.c
> +++ b/src/box/lua/merger.c
> @@ -206,14 +216,10 @@ luaT_temp_luastate(int *coro_ref)
>   * It is the other half of `luaT_temp_luastate()`.
>   */
>  static void
> -luaT_release_temp_luastate(int coro_ref)
> +luaT_release_temp_luastate(struct lua_State *L, int coro_ref, int top)
>  {
> -	/*
> -	 * FIXME: The reusable fiber-local Lua state is not
> -	 * unreferenced here (coro_ref == LUA_REFNIL), but
> -	 * it must be truncated to its past top to prevent
> -	 * stack overflow.
> -	 */
> +	if (top >= 0)
> +		lua_settop(L, top);

1. lua_settop() works fine even when top is -1. It basically means
'set top to the latest element' = 'leave the stack untouched'. I
checked the implementation, should work.

>  	luaL_unref(tarantool_L, LUA_REGISTRYINDEX, coro_ref);
>  }
> @@ -898,7 +906,7 @@ luaL_merge_source_tuple_fetch(struct lua_State *L,
>  
>  	/* Handle incorrect results count. */
>  	if (nresult != 2) {
> -		diag_set(IllegalParams, "Expected <state>, <tuple> got %d "
> +		diag_set(IllegalParams, "Expected <state>, <tuple>, got %d "

2. Unnecessary diff.

>  			 "return values", nresult);
>  		return -1;
>  	}
> diff --git a/test/box-tap/check_merge_source.c b/test/box-tap/check_merge_source.c
> new file mode 100644
> index 000000000..dbbf27bd1
> --- /dev/null
> +++ b/test/box-tap/check_merge_source.c
> @@ -0,0 +1,101 @@
> +#include <lua.h>           /* lua_*() */
> +#include <lauxlib.h>       /* struct luaL_Reg */
> +#include "lib/core/diag.h" /* struct error, diag_*() */
> +#include "fiber.h"         /* fiber_self() */
> +#include "lua/utils.h"     /* luaL_checkcdata() */
> +#include "box/merger.h"    /* struct merge_source,
> +			      merge_source_next() */

3. Do you really need these comments? Anyway they tend to outdate
fast, because no one watches these comments when changes the code,
uses some new functions from these files, etc.

> +
> +/**
> + * Verify whether a temporary fiber-local Lua state has the same
> + * amount of stack slots before and after merge_source_next()
> + * call.
> + *
> + * A merge source is designed to be used from plain C code without
> + * passing any Lua state explicitly. There are merge sources
> + * ('table', 'buffer', 'tuple') that require temporary Lua stack
> + * to fetch next tuple and they use fiber-local Lua stack when it
> + * is available.
> + *
> + * Such calls should not left garbage on the fiber-local Lua
> + * stack, because many of them in row may overflow the stack.
> + *
> + * The module built as a separate dynamic library, but it uses
> + * internal tarantool functions. So it is not a 'real' external
> + * module, but the stub that imitates usage of a merge source from
> + * tarantool code.
> + */
> +
> +struct tuple;

4. If some merger API returns tuples, I guess it is merger's header
responsibility to announce all the needed structures. And it does.
So why do you need it here?

> +
> +/**
> + * Extract a merge source from the Lua stack.
> + */
> +static struct merge_source *
> +luaT_check_merge_source(struct lua_State *L, int idx)
> +{
> +	uint32_t cdata_type;
> +	struct merge_source **source_ptr = luaL_checkcdata(L, idx, &cdata_type);
> +	assert(source_ptr != NULL);
> +	return *source_ptr;
> +}
> +
> +/**
> + * Call merge_source_next() virtual method of a merge source.
> + *
> + * The purpose of this function is to verify whether the
> + * fiber-local Lua stack is properly cleaned after
> + * merge_source_next() call on the passed merge source.
> + *
> + * @param merge_source    a merge source to call
> + *                        merge_source_next() on it
> + *
> + * @retval is_next_ok     whether the call is successful
> + * @retval err_msg        error message from the call or nil
> + * @retval is_stack_even  whether the fiber-local Lua stack is
> + *                        even after the call

5. This function takes L, and returns an int. These things are
details of how L is changed. So they can't be described like
that.

Also, please, start sentences from capital letters, and end with
a dot. It is really hard to read the parameter comments above, they
look like one big piece of monolothic text.

> + */
> +static int
> +lbox_check_merge_source_call_next(struct lua_State *L)
> +{
> +	assert(lua_gettop(L) == 1);
> +
> +	/*
> +	 * Ensure that there is reusable temporary Lua stack.
> +	 *
> +	 * Note: It may be the same as L (and usually do).
> +	 */
> +	struct lua_State *temporary_L = fiber_self()->storage.lua.stack;
> +	assert(temporary_L != NULL);
> +
> +	struct tuple *tuple;
> +	struct merge_source *source = luaT_check_merge_source(L, 1);
> +
> diff --git a/test/box-tap/gh-4954-merger-via-c.test.lua b/test/box-tap/gh-4954-merger-via-c.test.lua
> new file mode 100755
> index 000000000..963b5825a
> --- /dev/null
> +++ b/test/box-tap/gh-4954-merger-via-c.test.lua
> @@ -0,0 +1,247 @@
> +#!/usr/bin/env tarantool
> +-- }}}
> +
> +local cases = {
> +    {
> +        'buffer source, bad gen function',
> +        source_new = merger.new_buffer_source,
> +        source_gen = triplet,
> +        exp_err = '^Expected <state>, <buffer>, got 3 return values$',
> +    },
> +    {
> +        'buffer source, bad gen result',
> +        source_new = merger.new_buffer_source,
> +        source_gen = wrong_type,
> +        exp_err = '^Expected <state>, <buffer>$',
> +    },
> +    {
> +        'buffer source, bad buffer',
> +        source_new = merger.new_buffer_source,
> +        source_gen = bad_buffer,
> +        exp_err = '^Invalid merge source 0x[0-9a-f]+$',
> +    },
> +    -- FIXME: Enable after gh-5048: ('non-array tuple in a buffer
> +    -- leads to assertion fail').

6. The file is currently called gh-4954-merger-via-c.test.lua. So
will you change the file name, when the test is enabled? Will 5048
get its own test, or it will be covered by this file without any
file name changes?

> +    --[[
> +    {
> +        'buffer source, bad tuple in buffer',
> +        source_new = merger.new_buffer_source,
> +        source_gen = bad_tuple_in_buffer,
> +        exp_err = '^A tuple must be an array$',
> +    },

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-07 16:57     ` Alexander Turenko
  2020-06-11 16:17       ` Vladislav Shpilevoy
@ 2020-06-16 11:59       ` Igor Munkin
  2020-06-17 17:53         ` Alexander Turenko
  1 sibling, 1 reply; 24+ messages in thread
From: Igor Munkin @ 2020-06-16 11:59 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches, Vladislav Shpilevoy

Sasha,

On 07.06.20, Alexander Turenko wrote:
> On Wed, Jun 03, 2020 at 12:47:45AM +0200, Vladislav Shpilevoy wrote:
> > Thanks for the patch!
> > 
> > On 01/06/2020 20:10, Alexander Turenko wrote:
> > > This change highlights the contract of merge source virtual methods:
> > > they don't require a Lua state to be passed with arguments. Internal use
> > > of a temporary Lua state is the implementation detail. Technically the
> > > functions lean on a Lua state existence in a fiber storage, but the next
> > > commit will relax this requirement.
> > > 
> > > Made the following renames:
> > > 
> > > * luaL_merge_source_buffer_fetch   -> merge_source_buffer_fetch
> > > * luaL_merge_source_buffer_next    -> merge_source_buffer_next
> > > * luaL_merge_source_buffer_destroy -> merge_source_buffer_destroy
> > > 
> > > * keep luaL_merge_source_table_fetch (pass <struct lua_State *>)
> > > * luaL_merge_source_table_next     -> merge_source_table_next
> > > * luaL_merge_source_table_destroy  -> merge_source_table_destroy
> > > 
> > > * keep luaL_merge_source_tuple_fetch (change arguments order)
> > > * luaL_merge_source_tuple_next     -> merge_source_tuple_next
> > > * luaL_merge_source_tuple_destroy  -> merge_source_tuple_destroy
> > > 
> > > Also added API comments for destroy() and next() virtual methods to
> > > uniform them visually with other merge source functions.
> > 
> > I don't get why do you need these renames. merge_source API is
> > located in box/merger.h and box/merger.c. In lua/merger you have
> > children of struct merge_source. So they are not merge_source. The
> > latter is a virtual struct. lua/merger merge_source structs are
> > implementations of this virtual struct. So better not to use the same
> > prefix as for the top level merge_source API. IMO.
> 

As for me I see a mess with the naming already exists:
* What is the rule for using <lbox_> prefix? I see its common usage for
  "exported" merger API functions (e.g. <lbox_merger_new_buffer_source>,
  <lbox_merger_new_table_source>, <lbox_merger_new_tuple_source>) but
  you also use it for <lbox_merge_source_new> helper routine. How come?
* There is a <luaT_check_merge_source> auxiliary function with no
  Tarantool specifics underneath. Thereby it ought to be named with
  <luaL_> prefix instead of <luaT_> one as you mentioned here[1]. Looks
  like another naming rule "violation", doesn't it? However, you left
  this one unchanged.

> Not sure I got your idea. We have no special prefix for merge source
> implementations except 'merge_source' itself. Say, 'struct
> merge_source_buffer' has no prefix. Functions to construct a merge
> source from Lua have 'luaL' prefix, because they work on a passed Lua
> state.
> 
> Functions that use a Lua state (but don't correspond to 'lua_CFunction'
> prototype) are prefixed with 'luaL' (maybe will be prefixed with 'luaT'
> / 'luaE' in future if Igor will push us strong enough). The functions of
> the question use a Lua state, but find/create it on demand internally.
> So they are a kind of usual C functions and should not be prefixed with
> luaL. IMO.

Well, let's consider <luaL_addvalue> function from lauxlib.h[2], that
has the following signature:
| LUALIB_API void luaL_addvalue(luaL_Buffer *B)
It neither corresponds lua_CFunction prototype, nor has explicit
lua_State argument in signature, *but* yet is prefixed with <luaL_>. It
uses lua_State from luaL_Buffer that is quite similar usage to the way
you obtain lua_State for merger needs.

> 
> I'll CC Igor to see what will look worthful for him.

I'm not fussy here (it's an old code) and we don't have strict naming
policy for now, so let's discard these changes until we made any
decision regarding #4577.

> 
> WBR, Alexander Turenko.

[1]: https://github.com/tarantool/tarantool/issues/4577
[2]: https://www.lua.org/manual/5.1/manual.html#luaL_addvalue

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it
  2020-06-16 11:59       ` Igor Munkin
@ 2020-06-17 17:53         ` Alexander Turenko
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-17 17:53 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy

> As for me I see a mess with the naming already exists:
> * What is the rule for using <lbox_> prefix? I see its common usage for
>   "exported" merger API functions (e.g. <lbox_merger_new_buffer_source>,
>   <lbox_merger_new_table_source>, <lbox_merger_new_tuple_source>) but
>   you also use it for <lbox_merge_source_new> helper routine. How come?

It should be luaT, agreed. But I finally decided to don't touch names
within this patchset.

> * There is a <luaT_check_merge_source> auxiliary function with no
>   Tarantool specifics underneath. Thereby it ought to be named with
>   <luaL_> prefix instead of <luaT_> one as you mentioned here[1]. Looks
>   like another naming rule "violation", doesn't it? However, you left
>   this one unchanged.

It operates on struct merge_source, so the naming here looks reasonable
for me.

> > I'll CC Igor to see what will look worthful for him.
> 
> I'm not fussy here (it's an old code) and we don't have strict naming
> policy for now, so let's discard these changes until we made any
> decision regarding #4577.

Considering Vlad's and your objections, it seems it is not worth to
touch these names.

I reverted the renames, but kept API comments and
luaL_merge_source_tuple_fetch() arguments reordering. I squashed those
changes into the main commit ('merger: fix NULL dereference when called
via iproto'). Added the following notes to the commit message:

 | Added API comments for destroy() and next() virtual methods to uniform
 | them visually with other merge source functions.
 |
 | Changed order of luaL_merge_source_tuple_fetch() arguments to unify it
 | with other functions (<struct lua_State *>, <struct merge_source *>).

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-11 16:18       ` Vladislav Shpilevoy
@ 2020-06-17 17:53         ` Alexander Turenko
  2020-06-18 22:47           ` Vladislav Shpilevoy
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-17 17:53 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

> >     Usage of the fiber-local Lua state is not quite correct now: merge
> >     source code may left garbage on a stack in case of failures (like
> 
> 1. merge -> merger.

I use term 'merge source' for sources, we agreed on it with Vladimir D.,
when the merger was designed. Removed 'code' to don't confuse a reader:
'merge source code' -> 'a merge source'.

> > +	struct lua_State *L = luaT_newthread(tarantool_L);
> > +	if (L == NULL)
> > +		return NULL;
> 
> 2. luaT_newthread() does not set a diag. That may lead to a crash,
> because as far as I see, this function may be called
> lbox_merge_source_gen() indirectly, somewhere deep in the callstack.
> And it luaT_error(), when merge_source_next() fails.

As I see, it is not so.

luaT_newthread_wrapper() may raise a Lua error (only 'not enough memory'
I guess), luaT_cpcall() calls luaT_toerror() in this case, which invokes
diag_set(LuajitError, <...>).

Isn't I miss something?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next()
  2020-06-11 16:20   ` Vladislav Shpilevoy
@ 2020-06-17 17:53     ` Alexander Turenko
  2020-06-18 22:48       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Turenko @ 2020-06-17 17:53 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Thanks for the careful review!

> > +	if (top >= 0)
> > +		lua_settop(L, top);
> 
> 1. lua_settop() works fine even when top is -1. It basically means
> 'set top to the latest element' = 'leave the stack untouched'. I
> checked the implementation, should work.

I'm not sure we can lean on this. See, Lua 5.1 Reference Manual [1] states the
following about lua_settop():

 | Accepts any acceptable index, or 0, and sets the stack top to this index.

And defines an acceptable index as follows:

 | More formally, we define an acceptable index as follows:
 |
 |     (index < 0 && abs(index) <= top) ||
 |     (index > 0 && index <= stackspace)

However LuaJIT has the following condition:

 | LUA_API void lua_settop(lua_State *L, int idx)
 | {
 |   if (idx >= 0) {
 |     <...>
 |   } else {
 |     api_check(L, -(idx+1) <= (L->top - L->base));
 |     L->top += idx+1;  /* Shrinks top (idx < 0). */
 |   }
 | }

The api_check() call allows a negative index to point to a stack slot
below the current top by 1. It looks as the implementation detail.

I was wonder from where the condition comes. It seems, LuaJIT borrows it
from PUC-Rio Lua 5.1 and it was introduced in [2] right with this extra
+1.

NB: Widely used index2addr() LuaJIT internal function has the following
check for a negative index:

 | api_check(L, idx != 0 && -idx <= L->top - L->base);

And index2addr() is used in lua_toboolean(), which accepts an acceptable
index according to the manual.

I would better follow the API rather then lean on implementation
details and provide a comment with clarification why it works.

[1]: https://www.lua.org/manual/5.1/manual.html
[2]: https://github.com/lua/lua/commit/255052b6c6a1b109c93b104c03de8968b56dcd5a

> >  	/* Handle incorrect results count. */
> >  	if (nresult != 2) {
> > -		diag_set(IllegalParams, "Expected <state>, <tuple> got %d "
> > +		diag_set(IllegalParams, "Expected <state>, <tuple>, got %d "
> 
> 2. Unnecessary diff.

box-tap/gh-4954-merger-via-c.test.lua has three 'bad gen function' test
cases and all expected errors in them are structured in the same way. It
would be strange to verify an that an error message contains a typo
(especially, when only one of them has it).

> > +#include <lua.h>           /* lua_*() */
> > +#include <lauxlib.h>       /* struct luaL_Reg */
> > +#include "lib/core/diag.h" /* struct error, diag_*() */
> > +#include "fiber.h"         /* fiber_self() */
> > +#include "lua/utils.h"     /* luaL_checkcdata() */
> > +#include "box/merger.h"    /* struct merge_source,
> > +			      merge_source_next() */
> 
> 3. Do you really need these comments? Anyway they tend to outdate
> fast, because no one watches these comments when changes the code,
> uses some new functions from these files, etc.

I actively use them during the initial development: when I remove some
experimental code, I verify whether I should remove some header.

There is nothing bad if it becomes a bit outdated: some of headers used
more, some becomes unused. If one want to clean them up, those comments
will give an idea how to obtain possibly unused headers using simple
pattern matching. Then the list of possibly unused headers may be
verified by removing them and try to compile.

I find it convenient sometimes, so I would prefer to leave the comments
(if you don't strongly disagree).

> > +struct tuple;
> 
> 4. If some merger API returns tuples, I guess it is merger's header
> responsibility to announce all the needed structures. And it does.
> So why do you need it here?

It sounds reasonable: I cannot use merge source methods without at least
opaque <struct tuple> definition. Removed the declaration and added
'struct tuple (opaque)' to the '#include "box/merger.h"' comment.

Also fixed comment style:

 #include "box/merger.h"    /* struct merge_source,
-                             merge_source_next() */
+                           * merge_source_next(),
+                           * struct tuple (opaque)
+                           */

> > +/**
> > + * Call merge_source_next() virtual method of a merge source.
> > + *
> > + * The purpose of this function is to verify whether the
> > + * fiber-local Lua stack is properly cleaned after
> > + * merge_source_next() call on the passed merge source.
> > + *
> > + * @param merge_source    a merge source to call
> > + *                        merge_source_next() on it
> > + *
> > + * @retval is_next_ok     whether the call is successful
> > + * @retval err_msg        error message from the call or nil
> > + * @retval is_stack_even  whether the fiber-local Lua stack is
> > + *                        even after the call
> 
> 5. This function takes L, and returns an int. These things are
> details of how L is changed. So they can't be described like
> that.

The function has the Lua API and I want to describe it in Lua terms: it
accepts a merge source and returns three values. However I agree that I
should not use Doxygen markup for this sake, because it will contradict
with actual C definition.

Since we have no formal agreement how to document Lua and Lua/C
functions (and tooling around it), it seems I should not use any special
markup and just describe parameters and return values in a free form.

I wrote it this way:

  | * The function is to be called from Lua. Lua API is the
  | * following:
  | *
  | * Parameters:
  | *
  | * - merge_source   A merge source object to call
  | *                  merge_source_next() on it.
  | *
  | * Return values:
  | *
  | * - is_next_ok     Whether the call is successful.
  | * - err_msg        Error message from the call or nil.
  | * - is_stack_even  Whether the fiber-local Lua stack is
  | *                  even after the call.

Note: Vlad points me a couple of examples that describe Lua API in
Lua/C terms:

 | /**
 |  * Push a message using a protocol, depending on a session type.
 |  * @param L Lua state. First argument on the stack is data to
 |  *        push.
 |  * @retval 1 Success, true is pushed.
 |  * @retval 2 Error. Nil and error object are pushed.
 |  */
 | static int
 | lbox_session_push(struct lua_State *L)

 | /**
 |  * Create a new SWIM instance. SWIM is not created via FFI,
 |  * because this operation yields.
 |  * @retval 1 Success. A SWIM instance pointer is on the stack.
 |  * @retval 2 Error. Nil and an error object are pushed.
 |  */
 | static int
 | lua_swim_new(struct lua_State *L)

I'm not very comfortable with this way, the reasons are the following:

- Just not so intuitive as a description in usual Lua terms.
- Usually amount of returned values is not most important information,
  but, say, whether a first one is nil is important. We should point a
  user a way to use the API right in the API description: success case
  is when the first retval is not nil in the first place, not when a
  function returns one value.
- It requires some knowledge about Lua/C API and so it is not acceptable
  as a source to generate user-visible API documentation for public Lua
  methods.
- It will change if a function will be reimplemented using pure Lua /
  LuaJIT FFI despite that an actual API (a way to call and use) will not
  be changed.

> Also, please, start sentences from capital letters, and end with
> a dot. It is really hard to read the parameter comments above, they
> look like one big piece of monolothic text.

Okay, fixed.

> > +    -- FIXME: Enable after gh-5048: ('non-array tuple in a buffer
> > +    -- leads to assertion fail').
> 
> 6. The file is currently called gh-4954-merger-via-c.test.lua. So
> will you change the file name, when the test is enabled? Will 5048
> get its own test, or it will be covered by this file without any
> file name changes?

It is one of test cases for gh-4954, but it is blocked by gh-5048. I'll
enable it here with gh-5048 and will add a specific test case either to
box-tap/merger.test.lua (because it already has truncated buffer cases)
or as a separate file (not sure what is actually better, but a formal
process requires the latter).

The gh-5048 case will not check stack size (just error message), call a
Lua/C procedure and so: it will be easier to understand and to verify it
separately from gh-4954.

I'll send a fix when this patchset will land to master to don't forget
to uncomment the case here.

----

During self-review of the changes found and fixed two typos:

diff --git a/test/box-tap/gh-4954-merger-via-c.test.lua b/test/box-tap/gh-4954-merger-via-c.test.lua
index 963b5825a..9fbb0919a 100755
--- a/test/box-tap/gh-4954-merger-via-c.test.lua
+++ b/test/box-tap/gh-4954-merger-via-c.test.lua
@@ -4,13 +4,13 @@
 -- gh-4954: The fiber-local Lua stack should be even after
 -- merge_source_next() call from C code.
 --
--- See test/box-tap/merge_source.c for more information.
+-- See test/box-tap/check_merge_source.c for more information.
 --

 local fio = require('fio')

 -- Use BUILDDIR passed from test-run or cwd when run w/o
--- test-run to find test/box-tap/merge_source.{so,dylib}.
+-- test-run to find test/box-tap/check_merge_source.{so,dylib}.
 local build_path = os.getenv('BUILDDIR') or '.'
 package.cpath = fio.pathjoin(build_path, 'test/box-tap/?.so'   ) .. ';' ..
                 fio.pathjoin(build_path, 'test/box-tap/?.dylib') .. ';' ..

And also commented the unused function to make luacheck happy (and
simplify luacheck integration patchset rebasing if it'll land into
master after this patchset):

diff --git a/test/box-tap/gh-4954-merger-via-c.test.lua b/test/box-tap/gh-4954-merger-via-c.test.lua
index 59ff731a5..9fbb0919a 100755
--- a/test/box-tap/gh-4954-merger-via-c.test.lua
+++ b/test/box-tap/gh-4954-merger-via-c.test.lua
@@ -42,12 +42,16 @@ local function bad_buffer()
     return 1, buf
 end

+-- FIXME: Enable after gh-5048: ('non-array tuple in a buffer
+-- leads to assertion fail').
+--[[
 local function bad_tuple_in_buffer()
     local tuple = 1
     local buf = buffer.ibuf()
     msgpack.encode({tuple}, buf)
     return 1, buf
 end
+]]--

 local function empty_buffer(_, state)
     if state ~= nil then

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence
  2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
                   ` (4 preceding siblings ...)
  2020-06-07 16:58 ` [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next() Alexander Turenko
@ 2020-06-17 17:54 ` Alexander Turenko
  5 siblings, 0 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-17 17:54 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

> https://github.com/tarantool/tarantool/issues/4954
> Totktonada/gh-4954-fix-merger-segfault-full-ci

I'll resend the patchset as v2 to ease review: one commit was removed
and one was added.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto
  2020-06-17 17:53         ` Alexander Turenko
@ 2020-06-18 22:47           ` Vladislav Shpilevoy
  0 siblings, 0 replies; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-18 22:47 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Hi! Thanks for the fixes!

On 17/06/2020 19:53, Alexander Turenko wrote:
>>>     Usage of the fiber-local Lua state is not quite correct now: merge
>>>     source code may left garbage on a stack in case of failures (like
>>
>> 1. merge -> merger.
> 
> I use term 'merge source' for sources, we agreed on it with Vladimir D.,
> when the merger was designed. Removed 'code' to don't confuse a reader:
> 'merge source code' -> 'a merge source'.

Exactly. I read it as 'source code', not 'merge source'. Now it is fine.

>>> +	struct lua_State *L = luaT_newthread(tarantool_L);
>>> +	if (L == NULL)
>>> +		return NULL;
>>
>> 2. luaT_newthread() does not set a diag. That may lead to a crash,
>> because as far as I see, this function may be called
>> lbox_merge_source_gen() indirectly, somewhere deep in the callstack.
>> And it luaT_error(), when merge_source_next() fails.
> 
> As I see, it is not so.
> 
> luaT_newthread_wrapper() may raise a Lua error (only 'not enough memory'
> I guess), luaT_cpcall() calls luaT_toerror() in this case, which invokes
> diag_set(LuajitError, <...>).
> 
> Isn't I miss something?

You are right, all is fine here.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next()
  2020-06-17 17:53     ` Alexander Turenko
@ 2020-06-18 22:48       ` Vladislav Shpilevoy
  2020-06-19  7:41         ` Alexander Turenko
  0 siblings, 1 reply; 24+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-18 22:48 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Thanks for the fixes!

On 17/06/2020 19:53, Alexander Turenko wrote:
> Thanks for the careful review!
> 
>>> +	if (top >= 0)
>>> +		lua_settop(L, top);
>>
>> 1. lua_settop() works fine even when top is -1. It basically means
>> 'set top to the latest element' = 'leave the stack untouched'. I
>> checked the implementation, should work.
> 
> I'm not sure we can lean on this. See, Lua 5.1 Reference Manual [1] states the
> following about lua_settop():
> 
>  | Accepts any acceptable index, or 0, and sets the stack top to this index.
> 
> And defines an acceptable index as follows:
> 
>  | More formally, we define an acceptable index as follows:
>  |
>  |     (index < 0 && abs(index) <= top) ||
>  |     (index > 0 && index <= stackspace)
> 
> However LuaJIT has the following condition:

You said a few lines above that we shouldn't rely on implementation
specifics, and yet you appeal to it here.

As I see, both the implementation, and the format definition of the
valid index mean that lua_settop(-1) is no op. Means the same as
lua_settop(lua_gettop()).

>>> +#include <lua.h>           /* lua_*() */
>>> +#include <lauxlib.h>       /* struct luaL_Reg */
>>> +#include "lib/core/diag.h" /* struct error, diag_*() */
>>> +#include "fiber.h"         /* fiber_self() */
>>> +#include "lua/utils.h"     /* luaL_checkcdata() */
>>> +#include "box/merger.h"    /* struct merge_source,
>>> +			      merge_source_next() */
>>
>> 3. Do you really need these comments? Anyway they tend to outdate
>> fast, because no one watches these comments when changes the code,
>> uses some new functions from these files, etc.
> 
> I actively use them during the initial development: when I remove some
> experimental code, I verify whether I should remove some header.
> 
> There is nothing bad if it becomes a bit outdated: some of headers used
> more, some becomes unused. If one want to clean them up, those comments
> will give an idea how to obtain possibly unused headers using simple
> pattern matching. Then the list of possibly unused headers may be
> verified by removing them and try to compile.
> 
> I find it convenient sometimes, so I would prefer to leave the comments
> (if you don't strongly disagree).

I don't care about include comments much. I just warn you that
it is not in our code style (AFAIK, but I didn't check), and if
someone but you will change the merger code, the comments are likely
to outdate and turn into just confusing text not meaning anything.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next()
  2020-06-18 22:48       ` Vladislav Shpilevoy
@ 2020-06-19  7:41         ` Alexander Turenko
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Turenko @ 2020-06-19  7:41 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Fri, Jun 19, 2020 at 12:48:23AM +0200, Vladislav Shpilevoy wrote:
> Thanks for the fixes!
> 
> On 17/06/2020 19:53, Alexander Turenko wrote:
> > Thanks for the careful review!
> > 
> >>> +	if (top >= 0)
> >>> +		lua_settop(L, top);
> >>
> >> 1. lua_settop() works fine even when top is -1. It basically means
> >> 'set top to the latest element' = 'leave the stack untouched'. I
> >> checked the implementation, should work.
> > 
> > I'm not sure we can lean on this. See, Lua 5.1 Reference Manual [1] states the
> > following about lua_settop():
> > 
> >  | Accepts any acceptable index, or 0, and sets the stack top to this index.
> > 
> > And defines an acceptable index as follows:
> > 
> >  | More formally, we define an acceptable index as follows:
> >  |
> >  |     (index < 0 && abs(index) <= top) ||
> >  |     (index > 0 && index <= stackspace)
> > 
> > However LuaJIT has the following condition:
> 
> You said a few lines above that we shouldn't rely on implementation
> specifics, and yet you appeal to it here.

I just show where Lua and LuaJIT allows more than the reference manual
guarantees to be successful.

> As I see, both the implementation, and the format definition of the
> valid index mean that lua_settop(-1) is no op. Means the same as
> lua_settop(lua_gettop()).

Let `top` be zero (empty stack) and `index` be -1. (index < 0 &&
abs(index) <= top) condition fails. I don't see any mistake in this
math.

> 
> >>> +#include <lua.h>           /* lua_*() */
> >>> +#include <lauxlib.h>       /* struct luaL_Reg */
> >>> +#include "lib/core/diag.h" /* struct error, diag_*() */
> >>> +#include "fiber.h"         /* fiber_self() */
> >>> +#include "lua/utils.h"     /* luaL_checkcdata() */
> >>> +#include "box/merger.h"    /* struct merge_source,
> >>> +			      merge_source_next() */
> >>
> >> 3. Do you really need these comments? Anyway they tend to outdate
> >> fast, because no one watches these comments when changes the code,
> >> uses some new functions from these files, etc.
> > 
> > I actively use them during the initial development: when I remove some
> > experimental code, I verify whether I should remove some header.
> > 
> > There is nothing bad if it becomes a bit outdated: some of headers used
> > more, some becomes unused. If one want to clean them up, those comments
> > will give an idea how to obtain possibly unused headers using simple
> > pattern matching. Then the list of possibly unused headers may be
> > verified by removing them and try to compile.
> > 
> > I find it convenient sometimes, so I would prefer to leave the comments
> > (if you don't strongly disagree).
> 
> I don't care about include comments much. I just warn you that
> it is not in our code style (AFAIK, but I didn't check), and if
> someone but you will change the merger code, the comments are likely
> to outdate and turn into just confusing text not meaning anything.

I asked teammates and several of them speak against it. So I'll remove
them.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-06-19  7:42 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01 18:10 [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko
2020-06-01 18:10 ` [Tarantool-patches] [PATCH 1/3] merger: drop luaL prefix where contract allows it Alexander Turenko
2020-06-02 22:47   ` Vladislav Shpilevoy
2020-06-07 16:57     ` Alexander Turenko
2020-06-11 16:17       ` Vladislav Shpilevoy
2020-06-16 11:59       ` Igor Munkin
2020-06-17 17:53         ` Alexander Turenko
2020-06-01 18:10 ` [Tarantool-patches] [PATCH 2/3] merger: fix NULL dereference when called via iproto Alexander Turenko
2020-06-02 22:48   ` Vladislav Shpilevoy
2020-06-07 16:58     ` Alexander Turenko
2020-06-11 16:18       ` Vladislav Shpilevoy
2020-06-17 17:53         ` Alexander Turenko
2020-06-18 22:47           ` Vladislav Shpilevoy
2020-06-01 18:10 ` [Tarantool-patches] [PATCH 3/3] lua: expose temporary Lua state for iproto calls Alexander Turenko
2020-06-02 22:48   ` Vladislav Shpilevoy
2020-06-07 16:58     ` Alexander Turenko
2020-06-02 22:47 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Vladislav Shpilevoy
2020-06-07 17:17   ` Alexander Turenko
2020-06-07 16:58 ` [Tarantool-patches] [PATCH 2.5/3] merger: clean fiber-local Lua stack after next() Alexander Turenko
2020-06-11 16:20   ` Vladislav Shpilevoy
2020-06-17 17:53     ` Alexander Turenko
2020-06-18 22:48       ` Vladislav Shpilevoy
2020-06-19  7:41         ` Alexander Turenko
2020-06-17 17:54 ` [Tarantool-patches] [PATCH 0/3] Merger's NULL defererence Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox