Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/3] box_return_mp
@ 2020-03-08 17:47 Vladislav Shpilevoy
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw)
  To: tarantool-patches, korablev, imun

The patchset extends the public C API with a function
box_return_mp(). It allows to return arbitrary MessagePack from
user's code.

First patch is not really related to the ticket, but related to
struct port, which is touched here.

Second patch introduces a new struct port - port_c. It replaces
port_tuple for stored C functions, and is able to store both
tuples and raw MessagePack.

Third patch introduces the new public function.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4641-c-function-ret-mp
Issue: https://github.com/tarantool/tarantool/issues/4641

@ChangeLog
- box_return_mp() - new C API function to return arbitrary
  MessagePack from stored C functions (gh-4641).

Vladislav Shpilevoy (3):
  box: fix struct port_tuple.size wrong type in Lua
  box: introduce port_c
  box: introduce box_return_mp() public C function

 extra/exports               |   1 +
 src/box/box.cc              |   8 +-
 src/box/box.h               |  19 ++++
 src/box/func.c              |   2 +-
 src/box/lua/misc.cc         |  20 ++++
 src/box/lua/schema.lua      |   2 +-
 src/box/port.c              | 191 ++++++++++++++++++++++++++++++++++--
 src/box/port.h              |  50 ++++++++++
 src/box/sql/func.c          |  22 +++--
 test/box/function1.c        |  37 +++++++
 test/box/function1.result   |  31 ++++++
 test/box/function1.test.lua |  14 +++
 12 files changed, 375 insertions(+), 22 deletions(-)

-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua
  2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy
@ 2020-03-08 17:47 ` Vladislav Shpilevoy
  2020-03-10 13:42   ` Nikita Pettik
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy
  2 siblings, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw)
  To: tarantool-patches, korablev, imun

Original port_tuple in C has 'int size;' field. It was
'size_t size' in Lua. Since sizeof(size_t) usually is
8, and sizeof(int) is 4, this was a really bad typo.
---
 src/box/lua/schema.lua | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
index f537c3cec..aba439ffb 100644
--- a/src/box/lua/schema.lua
+++ b/src/box/lua/schema.lua
@@ -84,7 +84,7 @@ ffi.cdef[[
 
     struct port_tuple {
         const struct port_vtab *vtab;
-        size_t size;
+        int size;
         struct port_tuple_entry *first;
         struct port_tuple_entry *last;
         struct port_tuple_entry first_entry;
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy
@ 2020-03-08 17:47 ` Vladislav Shpilevoy
  2020-03-26 17:49   ` Nikita Pettik
  2020-04-03 14:12   ` Igor Munkin
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy
  2 siblings, 2 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw)
  To: tarantool-patches, korablev, imun

Port_c is a new descendant of struct port. It is used now for
public C functions to store their result. Currently they can
return only a tuple, but it will change soon, they will be able to
return arbitrary MessagePack.

Port_tuple is not removed, because still is used for box_select(),
for functional indexes, and in SQL as a base for port_sql.
Although that may be changed later. Functional indexes really need
only a single MessagePack object from their function. While
box_select() working via port_tuple or port_c didn't show any
significant difference during micro benchmarks.

Part of #4641
---

As you can see, port_tuple is used in a very few places now. And
by performance it is not really different from the new port_c,
according to micro benchmarks I did. Moreover, it may be actually
faster for functional indexes - users won't need to create a tuple
just to return a functional index key, which is usually small, I
believe.

If you (reviewers) want me to drop port_tuple in scope of this
patchset, I can do that. At the moment of sending the patchset I
decided to postpone it.

 src/box/box.cc      |   2 +-
 src/box/func.c      |   2 +-
 src/box/lua/misc.cc |  20 +++++
 src/box/port.c      | 191 +++++++++++++++++++++++++++++++++++++++++---
 src/box/port.h      |  50 ++++++++++++
 src/box/sql/func.c  |  22 ++---
 6 files changed, 266 insertions(+), 21 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 09dd67ab4..276430913 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1070,7 +1070,7 @@ boxk(int type, uint32_t space_id, const char *format, ...)
 int
 box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple)
 {
-	return port_tuple_add(ctx->port, tuple);
+	return port_c_add_tuple(ctx->port, tuple);
 }
 
 /* schema_find_id()-like method using only public API */
diff --git a/src/box/func.c b/src/box/func.c
index 431577127..04b04b6ad 100644
--- a/src/box/func.c
+++ b/src/box/func.c
@@ -523,7 +523,7 @@ func_c_call(struct func *base, struct port *args, struct port *ret)
 	if (data == NULL)
 		return -1;
 
-	port_tuple_create(ret);
+	port_c_create(ret);
 	box_function_ctx_t ctx = { ret };
 
 	/* Module can be changed after function reload. */
diff --git a/src/box/lua/misc.cc b/src/box/lua/misc.cc
index 79b6cfe3a..a6ba024ea 100644
--- a/src/box/lua/misc.cc
+++ b/src/box/lua/misc.cc
@@ -78,6 +78,26 @@ port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat)
 	}
 }
 
+extern "C" void
+port_c_dump_lua(struct port *base, struct lua_State *L, bool is_flat)
+{
+	struct port_c *port = (struct port_c *)base;
+	if (!is_flat)
+		lua_createtable(L, port->size, 0);
+	struct port_c_entry *pe = port->first;
+	const char *mp;
+	for (int i = 0; pe != NULL; pe = pe->next) {
+		if (pe->mp_size == 0) {
+			luaT_pushtuple(L, pe->tuple);
+		} else {
+			mp = pe->mp;
+			luamp_decode(L, luaL_msgpack_default, &mp);
+		}
+		if (!is_flat)
+			lua_rawseti(L, -2, ++i);
+	}
+}
+
 extern "C" void
 port_msgpack_dump_lua(struct port *base, struct lua_State *L, bool is_flat)
 {
diff --git a/src/box/port.c b/src/box/port.c
index 6e2fe3a6e..925a4b2d5 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -37,7 +37,12 @@
 #include <fiber.h>
 #include "errinj.h"
 
-static struct mempool port_tuple_entry_pool;
+static struct mempool port_entry_pool;
+
+enum {
+	PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry),
+			      sizeof(struct port_tuple_entry)),
+};
 
 int
 port_tuple_add(struct port *base, struct tuple *tuple)
@@ -49,7 +54,7 @@ port_tuple_add(struct port *base, struct tuple *tuple)
 		e = &port->first_entry;
 		port->first = port->last = e;
 	} else {
-		e = mempool_alloc(&port_tuple_entry_pool);
+		e = mempool_alloc(&port_entry_pool);
 		if (e == NULL) {
 			diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e");
 			return -1;
@@ -87,7 +92,7 @@ port_tuple_destroy(struct port *base)
 		struct port_tuple_entry *cur = e;
 		e = e->next;
 		tuple_unref(cur->tuple);
-		mempool_free(&port_tuple_entry_pool, cur);
+		mempool_free(&port_entry_pool, cur);
 	}
 }
 
@@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out)
 extern void
 port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
 
+static inline void
+port_c_destroy_entry(struct port_c_entry *pe)
+{
+	if (pe->mp_size == 0)
+		tuple_unref(pe->tuple);
+	else if (pe->mp_size <= PORT_ENTRY_SIZE)
+		mempool_free(&port_entry_pool, pe->mp);
+	else
+		free(pe->mp);
+}
+
+static void
+port_c_destroy(struct port *base)
+{
+	struct port_c *port = (struct port_c *)base;
+	struct port_c_entry *pe = port->first;
+	if (pe == NULL)
+		return;
+	port_c_destroy_entry(pe);
+	pe = pe->next;
+	while (pe != NULL) {
+		struct port_c_entry *cur = pe;
+		pe = pe->next;
+		port_c_destroy_entry(cur);
+		mempool_free(&port_entry_pool, cur);
+	}
+}
+
+static inline struct port_c_entry *
+port_c_new_entry(struct port_c *port)
+{
+	struct port_c_entry *e;
+	if (port->size == 0) {
+		e = &port->first_entry;
+		port->first = e;
+		port->last = e;
+	} else {
+		e = mempool_alloc(&port_entry_pool);
+		if (e == NULL) {
+			diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e");
+			return NULL;
+		}
+		port->last->next = e;
+		port->last = e;
+	}
+	e->next = NULL;
+	++port->size;
+	return e;
+}
+
+int
+port_c_add_tuple(struct port *base, struct tuple *tuple)
+{
+	struct port_c *port = (struct port_c *)base;
+	struct port_c_entry *pe = port_c_new_entry(port);
+	if (pe != NULL) {
+		/* 0 mp_size means the entry stores a tuple. */
+		pe->mp_size = 0;
+		pe->tuple = tuple;
+		tuple_ref(tuple);
+		return 0;
+	}
+	return -1;
+}
+
+int
+port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
+{
+	struct port_c *port = (struct port_c *)base;
+	struct port_c_entry *pe;
+	uint32_t size = mp_end - mp;
+	char *dst;
+	if (size <= PORT_ENTRY_SIZE) {
+		/*
+		 * Alloc on a mempool is several times faster than
+		 * on the heap. And it perfectly fits any
+		 * MessagePack number, a short string, a boolean.
+		 */
+		dst = mempool_alloc(&port_entry_pool);
+		if (dst == NULL) {
+			diag_set(OutOfMemory, size, "mempool_alloc", "dst");
+			return -1;
+		}
+	} else {
+		dst = malloc(size);
+		if (dst == NULL) {
+			diag_set(OutOfMemory, size, "malloc", "dst");
+			return -1;
+		}
+	}
+	pe = port_c_new_entry(port);
+	if (pe != NULL) {
+		memcpy(dst, mp, size);
+		pe->mp = dst;
+		pe->mp_size = size;
+		return 0;
+	}
+	if (size <= PORT_ENTRY_SIZE)
+		mempool_free(&port_entry_pool, dst);
+	else
+		free(dst);
+	return -1;
+}
+
+static int
+port_c_dump_msgpack_16(struct port *base, struct obuf *out)
+{
+	struct port_c *port = (struct port_c *)base;
+	struct port_c_entry *pe;
+	for (pe = port->first; pe != NULL; pe = pe->next) {
+		uint32_t size = pe->mp_size;
+		if (size == 0) {
+			if (tuple_to_obuf(pe->tuple, out) != 0)
+				return -1;
+		} else if (obuf_dup(out, pe->mp, size) != size) {
+			diag_set(OutOfMemory, size, "obuf_dup", "data");
+			return -1;
+		}
+		ERROR_INJECT(ERRINJ_PORT_DUMP, {
+			diag_set(OutOfMemory,
+				 size == 0 ? tuple_size(pe->tuple) : size,
+				 "obuf_dup", "data");
+			return -1;
+		});
+	}
+	return port->size;
+}
+
+static int
+port_c_dump_msgpack(struct port *base, struct obuf *out)
+{
+	struct port_c *port = (struct port_c *)base;
+	char *size_buf = obuf_alloc(out, mp_sizeof_array(port->size));
+	if (size_buf == NULL) {
+		diag_set(OutOfMemory, mp_sizeof_array(port->size), "obuf_alloc",
+			 "size_buf");
+		return -1;
+	}
+	mp_encode_array(size_buf, port->size);
+	if (port_c_dump_msgpack_16(base, out) < 0)
+		return -1;
+	return 1;
+}
+
+extern void
+port_c_dump_lua(struct port *port, struct lua_State *L, bool is_flat);
+
+extern struct sql_value *
+port_c_get_vdbemem(struct port *base, uint32_t *size);
+
+static const struct port_vtab port_c_vtab = {
+	.dump_msgpack = port_c_dump_msgpack,
+	.dump_msgpack_16 = port_c_dump_msgpack_16,
+	.dump_lua = port_c_dump_lua,
+	.dump_plain = NULL,
+	.get_msgpack = NULL,
+	.get_vdbemem = port_c_get_vdbemem,
+	.destroy = port_c_destroy,
+};
+
+void
+port_c_create(struct port *base)
+{
+	struct port_c *port = (struct port_c *)base;
+	port->vtab = &port_c_vtab;
+	port->first = NULL;
+	port->last = NULL;
+	port->size = 0;
+}
+
 void
 port_init(void)
 {
-	mempool_create(&port_tuple_entry_pool, &cord()->slabc,
-		       sizeof(struct port_tuple_entry));
+	mempool_create(&port_entry_pool, &cord()->slabc, PORT_ENTRY_SIZE);
 }
 
 void
 port_free(void)
 {
-	mempool_destroy(&port_tuple_entry_pool);
+	mempool_destroy(&port_entry_pool);
 }
 
-extern struct sql_value *
-port_tuple_get_vdbemem(struct port *base, uint32_t *size);
-
 const struct port_vtab port_tuple_vtab = {
 	.dump_msgpack = port_tuple_dump_msgpack,
 	.dump_msgpack_16 = port_tuple_dump_msgpack_16,
 	.dump_lua = port_tuple_dump_lua,
 	.dump_plain = NULL,
 	.get_msgpack = NULL,
-	.get_vdbemem = port_tuple_get_vdbemem,
+	.get_vdbemem = NULL,
 	.destroy = port_tuple_destroy,
 };
diff --git a/src/box/port.h b/src/box/port.h
index 9d3d02b3c..cd40d8a15 100644
--- a/src/box/port.h
+++ b/src/box/port.h
@@ -130,6 +130,56 @@ void
 port_vdbemem_create(struct port *base, struct sql_value *mem,
 		    uint32_t mem_count);
 
+/**
+ * The struct is PACKED to avoid unnecessary 4 byte padding
+ * after mp_size. These objects are never stored on stack, neither
+ * allocated as an array, so the padding is not needed.
+ */
+struct PACKED port_c_entry {
+	struct port_c_entry *next;
+	union {
+		/** Valid if mp_size is 0. */
+		struct tuple *tuple;
+		/**
+		 * Valid if mp_size is > 0. MessagePack is
+		 * allocated either on heap or on the port entry
+		 * mempool, if it fits into a pool object.
+		 */
+		char *mp;
+	};
+	uint32_t mp_size;
+};
+
+static_assert(sizeof(struct port_c_entry) == 20,
+	      "port_c_entry is expected to be perfectly aligned and packed");
+
+/**
+ * C port is used by C functions from the public API. They can
+ * return tuples and arbitrary MessagePack.
+ */
+struct port_c {
+	const struct port_vtab *vtab;
+	struct port_c_entry *first;
+	struct port_c_entry *last;
+	struct port_c_entry first_entry;
+	int size;
+};
+
+static_assert(sizeof(struct port_c) <= sizeof(struct port),
+	      "sizeof(struct port_c) must be <= sizeof(struct port)");
+
+/** Create a C port object. */
+void
+port_c_create(struct port *base);
+
+/** Append a tuple to the port. Tuple is referenced. */
+int
+port_c_add_tuple(struct port *port, struct tuple *tuple);
+
+/** Append raw MessagePack to the port. It is copied. */
+int
+port_c_add_mp(struct port *port, const char *mp, const char *mp_end);
+
 void
 port_init(void);
 
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 6e724c824..70147da1e 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -299,9 +299,9 @@ error:
 }
 
 struct sql_value *
-port_tuple_get_vdbemem(struct port *base, uint32_t *size)
+port_c_get_vdbemem(struct port *base, uint32_t *size)
 {
-	struct port_tuple *port = (struct port_tuple *)base;
+	struct port_c *port = (struct port_c *)base;
 	*size = port->size;
 	if (*size == 0 || *size > 1) {
 		diag_set(ClientError, ER_SQL_FUNC_WRONG_RET_COUNT, "C", *size);
@@ -315,14 +315,18 @@ port_tuple_get_vdbemem(struct port *base, uint32_t *size)
 	if (val == NULL)
 		return NULL;
 	int i = 0;
-	struct port_tuple_entry *pe;
+	const char *data;
+	struct port_c_entry *pe;
 	for (pe = port->first; pe != NULL; pe = pe->next) {
-		const char *data = tuple_data(pe->tuple);
-		if (mp_typeof(*data) != MP_ARRAY ||
-			mp_decode_array(&data) != 1) {
-			diag_set(ClientError, ER_SQL_EXECUTE,
-				 "Unsupported type passed from C");
-			goto error;
+		if (pe->mp_size == 0) {
+			data = tuple_data(pe->tuple);
+			if (mp_decode_array(&data) != 1) {
+				diag_set(ClientError, ER_SQL_EXECUTE,
+					 "Unsupported type passed from C");
+				goto error;
+			}
+		} else {
+			data = pe->mp;
 		}
 		uint32_t len;
 		const char *str;
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function
  2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy
@ 2020-03-08 17:47 ` Vladislav Shpilevoy
  2020-03-26 17:51   ` Nikita Pettik
  2020-04-03 14:13   ` Igor Munkin
  2 siblings, 2 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw)
  To: tarantool-patches, korablev, imun

Closes #4641

@TarantoolBot document
Title: box_return_mp() public C function

Stored C functions could return a result only via
`box_return_tuple()` function. That made users create a tuple
every time they wanted to return something from a C function.

Now public C API offers another way to return - `box_return_mp()`.
It allows to return arbitrary MessagePack, not wrapped into a
tuple object. This is simpler to use for small results like a
number, boolean, or a short string. Besides, `box_return_mp()` is
much faster than `box_return_tuple()`, especially for small
MessagePack.

Note, that it is faster only if an alternative is to create a
tuple by yourself. If an already existing tuple was obtained from
an iterator, and you want to return it, then of course it is
faster to return via `box_return_tuple()`, than via extraction of
tuple data, and calling `box_return_mp()`.

Here is the function declaration from module.h:
```C
/**
 * Return MessagePack from a stored C procedure. The MessagePack
 * is copied, so it is safe to free/reuse the passed arguments
 * after the call.
 * MessagePack is not validated, for the sake of speed. It is
 * expected to be a single encoded object. An attempt to encode
 * and return multiple objects without wrapping them into an
 * MP_ARRAY or MP_MAP is undefined behaviour.
 *
 * \param ctx An opaque structure passed to the stored C procedure
 *        by Tarantool.
 * \param mp Begin of MessagePack.
 * \param mp_end End of MessagePack.
 * \retval -1 Error.
 * \retval 0 Success.
 */
API_EXPORT int
box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end);
```
---
 extra/exports               |  1 +
 src/box/box.cc              |  6 ++++++
 src/box/box.h               | 19 +++++++++++++++++++
 test/box/function1.c        | 37 +++++++++++++++++++++++++++++++++++++
 test/box/function1.result   | 31 +++++++++++++++++++++++++++++++
 test/box/function1.test.lua | 14 ++++++++++++++
 6 files changed, 108 insertions(+)

diff --git a/extra/exports b/extra/exports
index 3a0637317..d793edfe4 100644
--- a/extra/exports
+++ b/extra/exports
@@ -205,6 +205,7 @@ box_tuple_extract_key
 box_tuple_compare
 box_tuple_compare_with_key
 box_return_tuple
+box_return_mp
 box_space_id_by_name
 box_index_id_by_name
 box_select
diff --git a/src/box/box.cc b/src/box/box.cc
index 276430913..4e15f21fc 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1073,6 +1073,12 @@ box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple)
 	return port_c_add_tuple(ctx->port, tuple);
 }
 
+int
+box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end)
+{
+	return port_c_add_mp(ctx->port, mp, mp_end);
+}
+
 /* schema_find_id()-like method using only public API */
 uint32_t
 box_space_id_by_name(const char *name, uint32_t len)
diff --git a/src/box/box.h b/src/box/box.h
index f37a945eb..a23383630 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -284,6 +284,25 @@ typedef struct box_function_ctx box_function_ctx_t;
 API_EXPORT int
 box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple);
 
+/**
+ * Return MessagePack from a stored C procedure. The MessagePack
+ * is copied, so it is safe to free/reuse the passed arguments
+ * after the call.
+ * MessagePack is not validated, for the sake of speed. It is
+ * expected to be a single encoded object. An attempt to encode
+ * and return multiple objects without wrapping them into an
+ * MP_ARRAY or MP_MAP is undefined behaviour.
+ *
+ * \param ctx An opaque structure passed to the stored C procedure
+ *        by Tarantool.
+ * \param mp Begin of MessagePack.
+ * \param mp_end End of MessagePack.
+ * \retval -1 Error.
+ * \retval 0 Success.
+ */
+API_EXPORT int
+box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end);
+
 /**
  * Find space id by name.
  *
diff --git a/test/box/function1.c b/test/box/function1.c
index 87062d6a8..48a561100 100644
--- a/test/box/function1.c
+++ b/test/box/function1.c
@@ -245,3 +245,40 @@ test_sleep(box_function_ctx_t *ctx, const char *args, const char *args_end)
 		fiber_sleep(0);
 	return 0;
 }
+
+int
+test_return_mp(box_function_ctx_t *ctx, const char *args, const char *args_end)
+{
+	(void) args;
+	(void) args_end;
+	char buf[512];
+	char *pos = mp_encode_uint(buf, 1);
+	int rc = box_return_mp(ctx, buf, pos);
+	if (rc != 0)
+		return rc;
+
+	pos = mp_encode_int(buf, -1);
+	rc = box_return_mp(ctx, buf, pos);
+	if (rc != 0)
+		return rc;
+
+	pos = mp_encode_uint(buf, UINT64_MAX);
+	rc = box_return_mp(ctx, buf, pos);
+	if (rc != 0)
+		return rc;
+
+	const char *str = "123456789101112131415";
+	pos = mp_encode_str(buf, str, strlen(str));
+	rc = box_return_mp(ctx, buf, pos);
+	if (rc != 0)
+		return rc;
+
+	pos = mp_encode_array(buf, 1);
+	pos = mp_encode_uint(pos, 2);
+	box_tuple_t *tuple = box_tuple_new(box_tuple_format_default(),
+					   buf, pos);
+	if (tuple == NULL)
+		return -1;
+	rc = box_return_tuple(ctx, tuple);
+	return rc;
+}
diff --git a/test/box/function1.result b/test/box/function1.result
index b91d63c51..301f666ef 100644
--- a/test/box/function1.result
+++ b/test/box/function1.result
@@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn")
 ---
 ...
 --
+-- gh-4641: box_return_mp() C API to return arbitrary MessagePack
+-- from C functions.
+--
+name = 'function1.test_return_mp'
+---
+...
+box.schema.func.create(name, {language = "C", exports = {'LUA'}})
+---
+...
+box.func[name]:call()
+---
+- 1
+- -1
+- 18446744073709551615
+- '123456789101112131415'
+- [2]
+...
+box.schema.user.grant('guest', 'super')
+---
+...
+net:connect(box.cfg.listen):call(name)
+---
+- [1, -1, 18446744073709551615, '123456789101112131415', [2]]
+...
+box.schema.user.revoke('guest', 'super')
+---
+...
+box.schema.func.drop(name)
+---
+...
+--
 -- gh-4182: Introduce persistent Lua functions.
 --
 test_run:cmd("setopt delimiter ';'")
diff --git a/test/box/function1.test.lua b/test/box/function1.test.lua
index b1841f3ad..670d63a44 100644
--- a/test/box/function1.test.lua
+++ b/test/box/function1.test.lua
@@ -268,6 +268,20 @@ box.schema.func.create('function1.multireturn', {language = "C", exports = {'LUA
 box.execute("SELECT \"function1.multireturn\"()")
 box.schema.func.drop("function1.multireturn")
 
+--
+-- gh-4641: box_return_mp() C API to return arbitrary MessagePack
+-- from C functions.
+--
+name = 'function1.test_return_mp'
+box.schema.func.create(name, {language = "C", exports = {'LUA'}})
+box.func[name]:call()
+
+box.schema.user.grant('guest', 'super')
+net:connect(box.cfg.listen):call(name)
+box.schema.user.revoke('guest', 'super')
+
+box.schema.func.drop(name)
+
 --
 -- gh-4182: Introduce persistent Lua functions.
 --
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy
@ 2020-03-10 13:42   ` Nikita Pettik
  2020-03-11  0:17     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 16+ messages in thread
From: Nikita Pettik @ 2020-03-10 13:42 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 08 Mar 18:47, Vladislav Shpilevoy wrote:
> Original port_tuple in C has 'int size;' field. It was
> 'size_t size' in Lua. Since sizeof(size_t) usually is
> 8, and sizeof(int) is 4, this was a really bad typo.

LGTM (feel free to push out of order as obvious).

> ---
>  src/box/lua/schema.lua | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
> index f537c3cec..aba439ffb 100644
> --- a/src/box/lua/schema.lua
> +++ b/src/box/lua/schema.lua
> @@ -84,7 +84,7 @@ ffi.cdef[[
>  
>      struct port_tuple {
>          const struct port_vtab *vtab;
> -        size_t size;
> +        int size;
>          struct port_tuple_entry *first;
>          struct port_tuple_entry *last;
>          struct port_tuple_entry first_entry;
> -- 
> 2.21.1 (Apple Git-122.3)
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua
  2020-03-10 13:42   ` Nikita Pettik
@ 2020-03-11  0:17     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-03-11  0:17 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches



On 10/03/2020 14:42, Nikita Pettik wrote:
> On 08 Mar 18:47, Vladislav Shpilevoy wrote:
>> Original port_tuple in C has 'int size;' field. It was
>> 'size_t size' in Lua. Since sizeof(size_t) usually is
>> 8, and sizeof(int) is 4, this was a really bad typo.
> 
> LGTM (feel free to push out of order as obvious).

Done. Pushed to master, 2.3, 2.2, 1.10.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy
@ 2020-03-26 17:49   ` Nikita Pettik
  2020-04-23  0:14     ` Vladislav Shpilevoy
  2020-04-03 14:12   ` Igor Munkin
  1 sibling, 1 reply; 16+ messages in thread
From: Nikita Pettik @ 2020-03-26 17:49 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 08 Mar 18:47, Vladislav Shpilevoy wrote:

In general, patch is okay. Below some comments which can be
considered as nits.

I see that your branch doesn't pass CI testing:
https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status

I haven't managed to find box/alter_limits among existing flaky
tests. If this test fail is not really related to your patches,
could you please file an issue?

> If you (reviewers) want me to drop port_tuple in scope of this
> patchset, I can do that. At the moment of sending the patchset I
> decided to postpone it.

I'd do it as a follow-up in case you have time to do it.
Otherwise please just file an issue.

> diff --git a/src/box/port.c b/src/box/port.c
> index 6e2fe3a6e..925a4b2d5 100644
> --- a/src/box/port.c
> +++ b/src/box/port.c
> @@ -37,7 +37,12 @@
>  #include <fiber.h>
>  #include "errinj.h"
>  
> -static struct mempool port_tuple_entry_pool;

I'd add comment explaining that port_entry_pool holds both
port_c and port_tuple entries now. Moreover, as an optimization
some msgpacks stored in port_c also can be allocated in this
pool (as far as I understand).

> +static struct mempool port_entry_pool;
> +
> +enum {
> +	PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry),
> +			      sizeof(struct port_tuple_entry)),
> +};
>  
> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out)
>  extern void
>  port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
>  
> +static inline void
> +port_c_destroy_entry(struct port_c_entry *pe)
> +{

I'd add a brief reference like:

/* For allocation policy see port_c_add_tuple() and port_c_add_mp(). */

> +	if (pe->mp_size == 0)
> +		tuple_unref(pe->tuple);
> +	else if (pe->mp_size <= PORT_ENTRY_SIZE)
> +		mempool_free(&port_entry_pool, pe->mp);
> +	else
> +		free(pe->mp);
> +}
> +
> +static inline struct port_c_entry *
> +port_c_new_entry(struct port_c *port)
> +{
> +	struct port_c_entry *e;
> +	if (port->size == 0) {
> +		e = &port->first_entry;

I don't really get this 'first_entry' optimization. Why do you
need it at all?

> +		port->first = e;
> +		port->last = e;
> +	} else {
> +		e = mempool_alloc(&port_entry_pool);
> +		if (e == NULL) {
> +			diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e");
> +			return NULL;
> +		}
> +		port->last->next = e;
> +		port->last = e;
> +	}
> +	e->next = NULL;
> +	++port->size;
> +	return e;
> +}
> +
> +int
> +port_c_add_tuple(struct port *base, struct tuple *tuple)
> +{
> +	struct port_c *port = (struct port_c *)base;
> +	struct port_c_entry *pe = port_c_new_entry(port);

I guess reverting condition would make code flow more straightforward:
if (pe == NULL)
	return -1;
...

> +	if (pe != NULL) {
> +		/* 0 mp_size means the entry stores a tuple. */
> +		pe->mp_size = 0;
> +		pe->tuple = tuple;
> +		tuple_ref(tuple);
> +		return 0;
> +	}
> +	return -1;
> +}
> +
> +int
> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
> +{
> +	struct port_c *port = (struct port_c *)base;
> +	struct port_c_entry *pe;
> +	uint32_t size = mp_end - mp;
> +	char *dst;
> +	if (size <= PORT_ENTRY_SIZE) {
> +		/*
> +		 * Alloc on a mempool is several times faster than
> +		 * on the heap. And it perfectly fits any
> +		 * MessagePack number, a short string, a boolean.
> +		 */
> +		dst = mempool_alloc(&port_entry_pool);

Dubious optimization...I mean is this code complication really
worth it?

> +		if (dst == NULL) {
> +			diag_set(OutOfMemory, size, "mempool_alloc", "dst");
> +			return -1;
> +		}
> +
> +static int
> +port_c_dump_msgpack_16(struct port *base, struct obuf *out)
> +{
> +	struct port_c *port = (struct port_c *)base;
> +	struct port_c_entry *pe;
> +	for (pe = port->first; pe != NULL; pe = pe->next) {
> +		uint32_t size = pe->mp_size;
> +		if (size == 0) {
> +			if (tuple_to_obuf(pe->tuple, out) != 0)
> +				return -1;
> +		} else if (obuf_dup(out, pe->mp, size) != size) {
> +			diag_set(OutOfMemory, size, "obuf_dup", "data");
> +			return -1;
> +		}
> +		ERROR_INJECT(ERRINJ_PORT_DUMP, {
> +			diag_set(OutOfMemory,
> +				 size == 0 ? tuple_size(pe->tuple) : size,
> +				 "obuf_dup", "data");
> +			return -1;
> +		});

You added error injection, but did not test it. Could you provide test
(or point me at it please)?

> +	}
> +	return port->size;
> +}
> +
> diff --git a/src/box/port.h b/src/box/port.h
> index 9d3d02b3c..cd40d8a15 100644
> --- a/src/box/port.h
> +++ b/src/box/port.h
> @@ -130,6 +130,56 @@ void
>  port_vdbemem_create(struct port *base, struct sql_value *mem,
>  		    uint32_t mem_count);
>  
> +/**
> + * The struct is PACKED to avoid unnecessary 4 byte padding

Is this really profitable?

> + * after mp_size. These objects are never stored on stack, neither
> + * allocated as an array, so the padding is not needed.
> + */
> +struct PACKED port_c_entry {
> +	struct port_c_entry *next;
> +	union {
> +		/** Valid if mp_size is 0. */
> +		struct tuple *tuple;
> +		/**
> +		 * Valid if mp_size is > 0. MessagePack is
> +		 * allocated either on heap or on the port entry
> +		 * mempool, if it fits into a pool object.
> +		 */
> +		char *mp;
> +	};
> +	uint32_t mp_size;
> +};
> +
> +/**
> + * C port is used by C functions from the public API. They can
> + * return tuples and arbitrary MessagePack.
> + */
> +struct port_c {
> +	const struct port_vtab *vtab;
> +	struct port_c_entry *first;
> +	struct port_c_entry *last;
> +	struct port_c_entry first_entry;
> +	int size;

Why do you call it size? I'd name it entry_cnt or count
(imho port size = sum of all entries' sizes).

> +};
> +
> +static_assert(sizeof(struct port_c) <= sizeof(struct port),
> +	      "sizeof(struct port_c) must be <= sizeof(struct port)");
> +
>  port_init(void);
>  

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy
@ 2020-03-26 17:51   ` Nikita Pettik
  2020-04-03 14:13   ` Igor Munkin
  1 sibling, 0 replies; 16+ messages in thread
From: Nikita Pettik @ 2020-03-26 17:51 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 08 Mar 18:47, Vladislav Shpilevoy wrote:
> Closes #4641

LGTM as obvious.
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy
  2020-03-26 17:49   ` Nikita Pettik
@ 2020-04-03 14:12   ` Igor Munkin
  2020-04-23  0:14     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 16+ messages in thread
From: Igor Munkin @ 2020-04-03 14:12 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for the patch! I see no flaws, but I left several nits below in
addition to Nikita's comments.

On 08.03.20, Vladislav Shpilevoy wrote:
> Port_c is a new descendant of struct port. It is used now for
> public C functions to store their result. Currently they can
> return only a tuple, but it will change soon, they will be able to
> return arbitrary MessagePack.
> 
> Port_tuple is not removed, because still is used for box_select(),
> for functional indexes, and in SQL as a base for port_sql.
> Although that may be changed later. Functional indexes really need
> only a single MessagePack object from their function. While
> box_select() working via port_tuple or port_c didn't show any
> significant difference during micro benchmarks.
> 
> Part of #4641
> ---
> 
> As you can see, port_tuple is used in a very few places now. And
> by performance it is not really different from the new port_c,
> according to micro benchmarks I did. Moreover, it may be actually
> faster for functional indexes - users won't need to create a tuple
> just to return a functional index key, which is usually small, I
> believe.
> 
> If you (reviewers) want me to drop port_tuple in scope of this
> patchset, I can do that. At the moment of sending the patchset I
> decided to postpone it.
> 
>  src/box/box.cc      |   2 +-
>  src/box/func.c      |   2 +-
>  src/box/lua/misc.cc |  20 +++++
>  src/box/port.c      | 191 +++++++++++++++++++++++++++++++++++++++++---
>  src/box/port.h      |  50 ++++++++++++
>  src/box/sql/func.c  |  22 ++---
>  6 files changed, 266 insertions(+), 21 deletions(-)
> 

<snipped>

> diff --git a/src/box/port.c b/src/box/port.c
> index 6e2fe3a6e..925a4b2d5 100644
> --- a/src/box/port.c
> +++ b/src/box/port.c

<snipped>

> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out)
>  extern void
>  port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
>  

<snipped>

> +
> +static void
> +port_c_destroy(struct port *base)
> +{
> +	struct port_c *port = (struct port_c *)base;
> +	struct port_c_entry *pe = port->first;
> +	if (pe == NULL)
> +		return;
> +	port_c_destroy_entry(pe);

Minor: It's worth to drop a few words here that the port->first points
to port->first_entry and it is released with the port_c structure.

> +	pe = pe->next;
> +	while (pe != NULL) {
> +		struct port_c_entry *cur = pe;
> +		pe = pe->next;
> +		port_c_destroy_entry(cur);
> +		mempool_free(&port_entry_pool, cur);
> +	}
> +}
> +

<snipped>

> +
> +int
> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
> +{
> +	struct port_c *port = (struct port_c *)base;
> +	struct port_c_entry *pe;
> +	uint32_t size = mp_end - mp;

Minor: since there are no guarantees except the existing contract and
common sense making this value be non-negative, I propose to add a
corresponding assert. Feel free to ignore.

> +	char *dst;
> +	if (size <= PORT_ENTRY_SIZE) {
> +		/*
> +		 * Alloc on a mempool is several times faster than
> +		 * on the heap. And it perfectly fits any
> +		 * MessagePack number, a short string, a boolean.
> +		 */
> +		dst = mempool_alloc(&port_entry_pool);
> +		if (dst == NULL) {
> +			diag_set(OutOfMemory, size, "mempool_alloc", "dst");
> +			return -1;
> +		}
> +	} else {
> +		dst = malloc(size);
> +		if (dst == NULL) {
> +			diag_set(OutOfMemory, size, "malloc", "dst");
> +			return -1;
> +		}
> +	}
> +	pe = port_c_new_entry(port);
> +	if (pe != NULL) {
> +		memcpy(dst, mp, size);
> +		pe->mp = dst;
> +		pe->mp_size = size;
> +		return 0;
> +	}
> +	if (size <= PORT_ENTRY_SIZE)
> +		mempool_free(&port_entry_pool, dst);
> +	else
> +		free(dst);
> +	return -1;
> +}
> +

<snipped>

> -- 
> 2.21.1 (Apple Git-122.3)
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function
  2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy
  2020-03-26 17:51   ` Nikita Pettik
@ 2020-04-03 14:13   ` Igor Munkin
  2020-04-23  0:14     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 16+ messages in thread
From: Igor Munkin @ 2020-04-03 14:13 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for the patch! LGTM except the one nit.

On 08.03.20, Vladislav Shpilevoy wrote:
> Closes #4641
> 
> @TarantoolBot document
> Title: box_return_mp() public C function
> 
> Stored C functions could return a result only via
> `box_return_tuple()` function. That made users create a tuple
> every time they wanted to return something from a C function.
> 
> Now public C API offers another way to return - `box_return_mp()`.
> It allows to return arbitrary MessagePack, not wrapped into a
> tuple object. This is simpler to use for small results like a
> number, boolean, or a short string. Besides, `box_return_mp()` is
> much faster than `box_return_tuple()`, especially for small
> MessagePack.
> 
> Note, that it is faster only if an alternative is to create a
> tuple by yourself. If an already existing tuple was obtained from
> an iterator, and you want to return it, then of course it is
> faster to return via `box_return_tuple()`, than via extraction of
> tuple data, and calling `box_return_mp()`.
> 
> Here is the function declaration from module.h:
> ```C
> /**
>  * Return MessagePack from a stored C procedure. The MessagePack
>  * is copied, so it is safe to free/reuse the passed arguments
>  * after the call.
>  * MessagePack is not validated, for the sake of speed. It is
>  * expected to be a single encoded object. An attempt to encode
>  * and return multiple objects without wrapping them into an
>  * MP_ARRAY or MP_MAP is undefined behaviour.
>  *
>  * \param ctx An opaque structure passed to the stored C procedure
>  *        by Tarantool.
>  * \param mp Begin of MessagePack.
>  * \param mp_end End of MessagePack.
>  * \retval -1 Error.
>  * \retval 0 Success.
>  */
> API_EXPORT int
> box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end);
> ```
> ---
>  extra/exports               |  1 +
>  src/box/box.cc              |  6 ++++++
>  src/box/box.h               | 19 +++++++++++++++++++
>  test/box/function1.c        | 37 +++++++++++++++++++++++++++++++++++++
>  test/box/function1.result   | 31 +++++++++++++++++++++++++++++++
>  test/box/function1.test.lua | 14 ++++++++++++++
>  6 files changed, 108 insertions(+)
> 

<snipped>

> diff --git a/test/box/function1.result b/test/box/function1.result
> index b91d63c51..301f666ef 100644
> --- a/test/box/function1.result
> +++ b/test/box/function1.result
> @@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn")
>  ---
>  ...
>  --
> +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack
> +-- from C functions.
> +--
> +name = 'function1.test_return_mp'
> +---
> +...
> +box.schema.func.create(name, {language = "C", exports = {'LUA'}})
> +---
> +...
> +box.func[name]:call()
> +---
> +- 1
> +- -1
> +- 18446744073709551615
> +- '123456789101112131415'
> +- [2]
> +...
> +box.schema.user.grant('guest', 'super')
> +---
> +...

Minor: I guess it worth to add a comment below referring the #4799 since
I was confused with the output below.

> +net:connect(box.cfg.listen):call(name)
> +---
> +- [1, -1, 18446744073709551615, '123456789101112131415', [2]]
> +...
> +box.schema.user.revoke('guest', 'super')
> +---
> +...
> +box.schema.func.drop(name)
> +---
> +...
> +--
>  -- gh-4182: Introduce persistent Lua functions.
>  --
>  test_run:cmd("setopt delimiter ';'")

<snipped>

> -- 
> 2.21.1 (Apple Git-122.3)
> 

-- 
Best regards,
IM

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-03-26 17:49   ` Nikita Pettik
@ 2020-04-23  0:14     ` Vladislav Shpilevoy
  2020-04-27 14:09       ` Nikita Pettik
  0 siblings, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-04-23  0:14 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Hi! Thanks for the review!

On 26/03/2020 18:49, Nikita Pettik wrote:
> On 08 Mar 18:47, Vladislav Shpilevoy wrote:
> 
> In general, patch is okay. Below some comments which can be
> considered as nits.
> 
> I see that your branch doesn't pass CI testing:
> https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status
> 
> I haven't managed to find box/alter_limits among existing flaky
> tests. If this test fail is not really related to your patches,
> could you please file an issue?

It is not related (at least I think so), I saw it failing on other
branches too.

Created an issue:
https://github.com/tarantool/tarantool/issues/4926

>> If you (reviewers) want me to drop port_tuple in scope of this
>> patchset, I can do that. At the moment of sending the patchset I
>> decided to postpone it.
> 
> I'd do it as a follow-up in case you have time to do it.
> Otherwise please just file an issue.

Now I have time, so it is done in a separate commit in
v2 thread.

>> diff --git a/src/box/port.c b/src/box/port.c
>> index 6e2fe3a6e..925a4b2d5 100644
>> --- a/src/box/port.c
>> +++ b/src/box/port.c
>> @@ -37,7 +37,12 @@
>>  #include <fiber.h>
>>  #include "errinj.h"
>>  
>> -static struct mempool port_tuple_entry_pool;
> 
> I'd add comment explaining that port_entry_pool holds both
> port_c and port_tuple entries now. Moreover, as an optimization
> some msgpacks stored in port_c also can be allocated in this
> pool (as far as I understand).

Ok, no problem:

====================
diff --git a/src/box/port.c b/src/box/port.c
index 3498f7af0..fea98ecc5 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -37,6 +37,12 @@
 #include <fiber.h>
 #include "errinj.h"
 
+/**
+ * The pools is used both by port_c and port_tuple, since their
+ * entires are almost of the same size. Also port_c can use
+ * objects from the pool to store result data in their memory,
+ * when it fits.
+ */
 static struct mempool port_entry_pool;
 
 enum {

====================

>> +static struct mempool port_entry_pool;
>> +
>> +enum {
>> +	PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry),
>> +			      sizeof(struct port_tuple_entry)),
>> +};
>>  
>> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out)
>>  extern void
>>  port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
>>  
>> +static inline void
>> +port_c_destroy_entry(struct port_c_entry *pe)
>> +{
> 
> I'd add a brief reference like:
> 
> /* For allocation policy see port_c_add_tuple() and port_c_add_mp(). */

Ok, done:

====================
diff --git a/src/box/port.c b/src/box/port.c
index fea98ecc5..2dfd0118d 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -141,6 +141,10 @@ port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
 static inline void
 port_c_destroy_entry(struct port_c_entry *pe)
 {
+	/*
+	 * See port_c_add_*() for algorithm of how and where to
+	 * store data, to understand why it is freed differently.
+	 */
 	if (pe->mp_size == 0)
 		tuple_unref(pe->tuple);
 	else if (pe->mp_size <= PORT_ENTRY_SIZE)

====================

>> +	if (pe->mp_size == 0)
>> +		tuple_unref(pe->tuple);
>> +	else if (pe->mp_size <= PORT_ENTRY_SIZE)
>> +		mempool_free(&port_entry_pool, pe->mp);
>> +	else
>> +		free(pe->mp);
>> +}
>> +
>> +static inline struct port_c_entry *
>> +port_c_new_entry(struct port_c *port)
>> +{
>> +	struct port_c_entry *e;
>> +	if (port->size == 0) {
>> +		e = &port->first_entry;
> 
> I don't really get this 'first_entry' optimization. Why do you
> need it at all?

Because most functions return only 1 result. And most of these
return it as a tuple. So having this value inside the port
preallocated allows to avoid any allocations/frees for these
cases. Zero allocation. Although I didn't really bench how
much it helps. I just stole it from port_tuple as a low hanging
fruit.

>> +		port->first = e;
>> +		port->last = e;
>> +	} else {
>> +		e = mempool_alloc(&port_entry_pool);
>> +		if (e == NULL) {
>> +			diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e");
>> +			return NULL;
>> +		}
>> +		port->last->next = e;
>> +		port->last = e;
>> +	}
>> +	e->next = NULL;
>> +	++port->size;
>> +	return e;
>> +}
>> +
>> +int
>> +port_c_add_tuple(struct port *base, struct tuple *tuple)
>> +{
>> +	struct port_c *port = (struct port_c *)base;
>> +	struct port_c_entry *pe = port_c_new_entry(port);
> 
> I guess reverting condition would make code flow more straightforward:
> if (pe == NULL)
> 	return -1;
> ...

I am infected with the paranoia that the most likely result should lead
to the positive branch. But I don't mind to make it simpler.

====================
diff --git a/src/box/port.c b/src/box/port.c
index 2dfd0118d..2c1fadb5c 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -202,14 +202,13 @@ port_c_add_tuple(struct port *base, struct tuple *tuple)
 {
 	struct port_c *port = (struct port_c *)base;
 	struct port_c_entry *pe = port_c_new_entry(port);
-	if (pe != NULL) {
-		/* 0 mp_size means the entry stores a tuple. */
-		pe->mp_size = 0;
-		pe->tuple = tuple;
-		tuple_ref(tuple);
-		return 0;
-	}
-	return -1;
+	if (pe == NULL)
+		return -1;
+	/* 0 mp_size means the entry stores a tuple. */
+	pe->mp_size = 0;
+	pe->tuple = tuple;
+	tuple_ref(tuple);
+	return 0;
 }
 
 int

====================

>> +	if (pe != NULL) {
>> +		/* 0 mp_size means the entry stores a tuple. */
>> +		pe->mp_size = 0;
>> +		pe->tuple = tuple;
>> +		tuple_ref(tuple);
>> +		return 0;
>> +	}
>> +	return -1;
>> +}
>> +
>> +int
>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
>> +{
>> +	struct port_c *port = (struct port_c *)base;
>> +	struct port_c_entry *pe;
>> +	uint32_t size = mp_end - mp;
>> +	char *dst;
>> +	if (size <= PORT_ENTRY_SIZE) {
>> +		/*
>> +		 * Alloc on a mempool is several times faster than
>> +		 * on the heap. And it perfectly fits any
>> +		 * MessagePack number, a short string, a boolean.
>> +		 */
>> +		dst = mempool_alloc(&port_entry_pool);
> 
> Dubious optimization...I mean is this code complication really
> worth it?

mempool_alloc() of this size is x4-5 times faster than malloc()
according to my microbenchmarks on release build. I decided not
to write concrete numbers x4-5, since they probably depend on
machine.

By saying 'on this size' I mean fitting into port entry object.
And this is the perfect match for functions which return something
small and scalar like a number, a boolean, or a small string.

>> +		if (dst == NULL) {
>> +			diag_set(OutOfMemory, size, "mempool_alloc", "dst");
>> +			return -1;
>> +		}
>> +
>> +static int
>> +port_c_dump_msgpack_16(struct port *base, struct obuf *out)
>> +{
>> +	struct port_c *port = (struct port_c *)base;
>> +	struct port_c_entry *pe;
>> +	for (pe = port->first; pe != NULL; pe = pe->next) {
>> +		uint32_t size = pe->mp_size;
>> +		if (size == 0) {
>> +			if (tuple_to_obuf(pe->tuple, out) != 0)
>> +				return -1;
>> +		} else if (obuf_dup(out, pe->mp, size) != size) {
>> +			diag_set(OutOfMemory, size, "obuf_dup", "data");
>> +			return -1;
>> +		}
>> +		ERROR_INJECT(ERRINJ_PORT_DUMP, {
>> +			diag_set(OutOfMemory,
>> +				 size == 0 ? tuple_size(pe->tuple) : size,
>> +				 "obuf_dup", "data");
>> +			return -1;
>> +		});
> 
> You added error injection, but did not test it. Could you provide test
> (or point me at it please)?

In the v2 thread it is used in box/errinj.test.lua to check that
"port_dump can fail".

>> +	}
>> +	return port->size;
>> +}
>> +
>> diff --git a/src/box/port.h b/src/box/port.h
>> index 9d3d02b3c..cd40d8a15 100644
>> --- a/src/box/port.h
>> +++ b/src/box/port.h
>> @@ -130,6 +130,56 @@ void
>>  port_vdbemem_create(struct port *base, struct sql_value *mem,
>>  		    uint32_t mem_count);
>>  
>> +/**
>> + * The struct is PACKED to avoid unnecessary 4 byte padding
> 
> Is this really profitable?

-4 bytes for every object of this type by just adding one word to
the struct? Sounds good to me. Very easy to do, to prove it works
correct, and profit is obvious - the structure becomes 16% smaller.
This means 16% more objects in every slab used for port_c_entry
objects. Although I didn't measure how it affects things under heavy
load on C functions.

>> + * after mp_size. These objects are never stored on stack, neither
>> + * allocated as an array, so the padding is not needed.
>> + */
>> +struct PACKED port_c_entry {
>> +	struct port_c_entry *next;
>> +	union {
>> +		/** Valid if mp_size is 0. */
>> +		struct tuple *tuple;
>> +		/**
>> +		 * Valid if mp_size is > 0. MessagePack is
>> +		 * allocated either on heap or on the port entry
>> +		 * mempool, if it fits into a pool object.
>> +		 */
>> +		char *mp;
>> +	};
>> +	uint32_t mp_size;
>> +};
>> +
>> +/**
>> + * C port is used by C functions from the public API. They can
>> + * return tuples and arbitrary MessagePack.
>> + */
>> +struct port_c {
>> +	const struct port_vtab *vtab;
>> +	struct port_c_entry *first;
>> +	struct port_c_entry *last;
>> +	struct port_c_entry first_entry;
>> +	int size;
> 
> Why do you call it size? I'd name it entry_cnt or count
> (imho port size = sum of all entries' sizes).

For that we have term 'bsize'. But I don't have anything against
count, personally. The problem here is that it will be inconsistent
with other ports (port_lua, and port_tuple which still exists here),
and with local variables named 'size', which are assigned to
port->size member - that is the problem in src/box/lua/call.c.

When I renamed size to count, I got notable amount of unnecessary
diff. So I would like to keep size unless you insist, and in that
case I will do it in a separate patch.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function
  2020-04-03 14:13   ` Igor Munkin
@ 2020-04-23  0:14     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-04-23  0:14 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Thanks for the review!

>> diff --git a/test/box/function1.result b/test/box/function1.result
>> index b91d63c51..301f666ef 100644
>> --- a/test/box/function1.result
>> +++ b/test/box/function1.result
>> @@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn")
>>  ---
>>  ...
>>  --
>> +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack
>> +-- from C functions.
>> +--
>> +name = 'function1.test_return_mp'
>> +---
>> +...
>> +box.schema.func.create(name, {language = "C", exports = {'LUA'}})
>> +---
>> +...
>> +box.func[name]:call()
>> +---
>> +- 1
>> +- -1
>> +- 18446744073709551615
>> +- '123456789101112131415'
>> +- [2]
>> +...
>> +box.schema.user.grant('guest', 'super')
>> +---
>> +...
> 
> Minor: I guess it worth to add a comment below referring the #4799 since
> I was confused with the output below.

Fair.

====================
diff --git a/test/box/function1.result b/test/box/function1.result
index 301f666ef..905a4cdab 100644
--- a/test/box/function1.result
+++ b/test/box/function1.result
@@ -811,6 +811,8 @@ box.func[name]:call()
 box.schema.user.grant('guest', 'super')
 ---
 ...
+-- Netbox:call() returns not the same as local call for C
+-- functions, see #4799.
 net:connect(box.cfg.listen):call(name)
 ---
 - [1, -1, 18446744073709551615, '123456789101112131415', [2]]
diff --git a/test/box/function1.test.lua b/test/box/function1.test.lua
index 670d63a44..ab7b586a0 100644
--- a/test/box/function1.test.lua
+++ b/test/box/function1.test.lua
@@ -277,6 +277,8 @@ box.schema.func.create(name, {language = "C", exports = {'LUA'}})
 box.func[name]:call()
 
 box.schema.user.grant('guest', 'super')
+-- Netbox:call() returns not the same as local call for C
+-- functions, see #4799.
 net:connect(box.cfg.listen):call(name)
 box.schema.user.revoke('guest', 'super')
 
====================

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-04-03 14:12   ` Igor Munkin
@ 2020-04-23  0:14     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-04-23  0:14 UTC (permalink / raw)
  To: Igor Munkin; +Cc: tarantool-patches

Hi! Thanks for the review!

>> diff --git a/src/box/port.c b/src/box/port.c
>> index 6e2fe3a6e..925a4b2d5 100644
>> --- a/src/box/port.c
>> +++ b/src/box/port.c
>> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out)
>>  extern void
>>  port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat);
>>  
> 
> <snipped>
> 
>> +
>> +static void
>> +port_c_destroy(struct port *base)
>> +{
>> +	struct port_c *port = (struct port_c *)base;
>> +	struct port_c_entry *pe = port->first;
>> +	if (pe == NULL)
>> +		return;
>> +	port_c_destroy_entry(pe);
> 
> Minor: It's worth to drop a few words here that the port->first points
> to port->first_entry and it is released with the port_c structure.

No problem:

====================
diff --git a/src/box/port.c b/src/box/port.c
index 925a4b2d5..124625c7e 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -151,6 +151,11 @@ port_c_destroy(struct port *base)
 	if (pe == NULL)
 		return;
 	port_c_destroy_entry(pe);
+	/*
+	 * Port->first is skipped, it is pointing at
+	 * port_c.first_entry, and is freed together with the
+	 * port.
+	 */
 	pe = pe->next;
 	while (pe != NULL) {
 		struct port_c_entry *cur = pe;

====================

>> +	pe = pe->next;
>> +	while (pe != NULL) {
>> +		struct port_c_entry *cur = pe;
>> +		pe = pe->next;
>> +		port_c_destroy_entry(cur);
>> +		mempool_free(&port_entry_pool, cur);
>> +	}
>> +}
>> +
> 
> <snipped>
> 
>> +
>> +int
>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
>> +{
>> +	struct port_c *port = (struct port_c *)base;
>> +	struct port_c_entry *pe;
>> +	uint32_t size = mp_end - mp;
> 
> Minor: since there are no guarantees except the existing contract and
> common sense making this value be non-negative, I propose to add a
> corresponding assert. Feel free to ignore.

I don't mind.

====================
diff --git a/src/box/port.c b/src/box/port.c
index 925a4b2d5..3498f7af0 100644
--- a/src/box/port.c
+++ b/src/box/port.c
@@ -202,6 +207,7 @@ port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
 {
 	struct port_c *port = (struct port_c *)base;
 	struct port_c_entry *pe;
+	assert(mp_end > mp);
 	uint32_t size = mp_end - mp;
 	char *dst;
 	if (size <= PORT_ENTRY_SIZE) {

====================

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-04-23  0:14     ` Vladislav Shpilevoy
@ 2020-04-27 14:09       ` Nikita Pettik
  2020-04-27 21:28         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 16+ messages in thread
From: Nikita Pettik @ 2020-04-27 14:09 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 23 Apr 02:14, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
> 
> On 26/03/2020 18:49, Nikita Pettik wrote:
> > On 08 Mar 18:47, Vladislav Shpilevoy wrote:
> > 
> > In general, patch is okay. Below some comments which can be
> > considered as nits.
> > 
> > I see that your branch doesn't pass CI testing:
> > https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status
> > 
> > I haven't managed to find box/alter_limits among existing flaky
> > tests. If this test fail is not really related to your patches,
> > could you please file an issue?
> 
> It is not related (at least I think so), I saw it failing on other
> branches too.
> 
> Created an issue:
> https://github.com/tarantool/tarantool/issues/4926

Thanks. (No doubts it is not related to your patch, just want to
make sure that all flaky tests are registrated.)
 
> >> +}
> >> +
> >> +static inline struct port_c_entry *
> >> +port_c_new_entry(struct port_c *port)
> >> +{
> >> +	struct port_c_entry *e;
> >> +	if (port->size == 0) {
> >> +		e = &port->first_entry;
> > 
> > I don't really get this 'first_entry' optimization. Why do you
> > need it at all?
> 
> Because most functions return only 1 result. And most of these
> return it as a tuple. So having this value inside the port
> preallocated allows to avoid any allocations/frees for these
> cases. Zero allocation. Although I didn't really bench how
> much it helps. I just stole it from port_tuple as a low hanging
> fruit.

On the other hand you get memory overhead, which you fight for
so much even using packed attribute...
Why not then keeping 2, 3 or whatever number benchmarks say 
entries inlined?
  
> >> +	if (pe != NULL) {
> >> +		/* 0 mp_size means the entry stores a tuple. */
> >> +		pe->mp_size = 0;
> >> +		pe->tuple = tuple;
> >> +		tuple_ref(tuple);
> >> +		return 0;
> >> +	}
> >> +	return -1;
> >> +}
> >> +
> >> +int
> >> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
> >> +{
> >> +	struct port_c *port = (struct port_c *)base;
> >> +	struct port_c_entry *pe;
> >> +	uint32_t size = mp_end - mp;
> >> +	char *dst;
> >> +	if (size <= PORT_ENTRY_SIZE) {
> >> +		/*
> >> +		 * Alloc on a mempool is several times faster than
> >> +		 * on the heap. And it perfectly fits any
> >> +		 * MessagePack number, a short string, a boolean.
> >> +		 */
> >> +		dst = mempool_alloc(&port_entry_pool);
> > 
> > Dubious optimization...I mean is this code complication really
> > worth it?
> 
> mempool_alloc() of this size is x4-5 times faster than malloc()
> according to my microbenchmarks on release build. I decided not
> to write concrete numbers x4-5, since they probably depend on
> machine.

Is there at least one real scenario where entry allocation
policy can turn out to be bottleneck in overall performance?
 
> >> diff --git a/src/box/port.h b/src/box/port.h
> >> index 9d3d02b3c..cd40d8a15 100644
> >> --- a/src/box/port.h
> >> +++ b/src/box/port.h
> >> @@ -130,6 +130,56 @@ void
> >>  port_vdbemem_create(struct port *base, struct sql_value *mem,
> >>  		    uint32_t mem_count);
> >>  
> >> +/**
> >> + * The struct is PACKED to avoid unnecessary 4 byte padding
> > 
> > Is this really profitable?
> 
> -4 bytes for every object of this type by just adding one word to
> the struct? Sounds good to me. Very easy to do, to prove it works
> correct, and profit is obvious - the structure becomes 16% smaller.
> This means 16% more objects in every slab used for port_c_entry
> objects. Although I didn't measure how it affects things under heavy
> load on C functions.

Well, okay. At least its profit is clear (in contrast to others).

> >> +struct PACKED port_c_entry {
> >> +	struct port_c_entry *next;
> >> +	union {
> >> +		/** Valid if mp_size is 0. */
> >> +		struct tuple *tuple;
> >> +		/**
> >> +		 * Valid if mp_size is > 0. MessagePack is
> >> +		 * allocated either on heap or on the port entry
> >> +		 * mempool, if it fits into a pool object.
> >> +		 */
> >> +		char *mp;
> >> +	};
> >> +	uint32_t mp_size;
> >> +};
> >> +
> >> +/**
> >> + * C port is used by C functions from the public API. They can
> >> + * return tuples and arbitrary MessagePack.
> >> + */
> >> +struct port_c {
> >> +	const struct port_vtab *vtab;
> >> +	struct port_c_entry *first;
> >> +	struct port_c_entry *last;
> >> +	struct port_c_entry first_entry;
> >> +	int size;
> > 
> > Why do you call it size? I'd name it entry_cnt or count
> > (imho port size = sum of all entries' sizes).
> 
> For that we have term 'bsize'. But I don't have anything against
> count, personally. The problem here is that it will be inconsistent
> with other ports (port_lua, and port_tuple which still exists here),
> and with local variables named 'size', which are assigned to
> port->size member - that is the problem in src/box/lua/call.c.
> 
> When I renamed size to count, I got notable amount of unnecessary
> diff. So I would like to keep size unless you insist, and in that
> case I will do it in a separate patch.

Personally I would still rename it, perhaps for all port classes.
Up to you as non-functional change.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-04-27 14:09       ` Nikita Pettik
@ 2020-04-27 21:28         ` Vladislav Shpilevoy
  2020-04-27 23:24           ` Nikita Pettik
  0 siblings, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy @ 2020-04-27 21:28 UTC (permalink / raw)
  To: Nikita Pettik; +Cc: tarantool-patches

Hi! Thanks for the review!

>>>> +}
>>>> +
>>>> +static inline struct port_c_entry *
>>>> +port_c_new_entry(struct port_c *port)
>>>> +{
>>>> +	struct port_c_entry *e;
>>>> +	if (port->size == 0) {
>>>> +		e = &port->first_entry;
>>>
>>> I don't really get this 'first_entry' optimization. Why do you
>>> need it at all?
>>
>> Because most functions return only 1 result. And most of these
>> return it as a tuple. So having this value inside the port
>> preallocated allows to avoid any allocations/frees for these
>> cases. Zero allocation. Although I didn't really bench how
>> much it helps. I just stole it from port_tuple as a low hanging
>> fruit.
> 
> On the other hand you get memory overhead, which you fight for
> so much even using packed attribute...

There is no overhead for preallocated entries (for our allocators
and the heap), since port is on the stack always.

At least there is no overhead with reasonable number of reserved
entries. Of course, if you preallocate, say, thousand, it would
affect the stack cache locality in a bad way.

> Why not then keeping 2, 3 or whatever number benchmarks say 
> entries inlined?

So yes, in theory adding 2 or 3 or a few more will probably speed up
something, but not the most common case, when result is just one,
or when there are tens or hundreds or even thousands results (although
thousands are unlikely, since tx thread will have problems with
selecting so much without yields).

>>>> +	if (pe != NULL) {
>>>> +		/* 0 mp_size means the entry stores a tuple. */
>>>> +		pe->mp_size = 0;
>>>> +		pe->tuple = tuple;
>>>> +		tuple_ref(tuple);
>>>> +		return 0;
>>>> +	}
>>>> +	return -1;
>>>> +}
>>>> +
>>>> +int
>>>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
>>>> +{
>>>> +	struct port_c *port = (struct port_c *)base;
>>>> +	struct port_c_entry *pe;
>>>> +	uint32_t size = mp_end - mp;
>>>> +	char *dst;
>>>> +	if (size <= PORT_ENTRY_SIZE) {
>>>> +		/*
>>>> +		 * Alloc on a mempool is several times faster than
>>>> +		 * on the heap. And it perfectly fits any
>>>> +		 * MessagePack number, a short string, a boolean.
>>>> +		 */
>>>> +		dst = mempool_alloc(&port_entry_pool);
>>>
>>> Dubious optimization...I mean is this code complication really
>>> worth it?
>>
>> mempool_alloc() of this size is x4-5 times faster than malloc()
>> according to my microbenchmarks on release build. I decided not
>> to write concrete numbers x4-5, since they probably depend on
>> machine.
> 
> Is there at least one real scenario where entry allocation
> policy can turn out to be bottleneck in overall performance?

I guess we have plenty of such scenarios - choose any favorite among
any of our GC problems, when solution/sales team need to write
functions in C because of bad memory management in Lua or its high
overusage because of uncontrolled unpacking of everything. C stored
functions are supposed to be much much faster than Lua, be called
hundreds of thousands times per second easily.

If we use heap for each such request, we loose some real perf, and
aggravate heap fragmentation for other requests with lots of small
allocations. Moreover, allocations of about the same size - exactly
from what mempools are supposed to protect.

I measure here by numbers I got on my machine. I was running a C
function, which returned 100 unsigned numbers, each <= 128. The
function was called just 10k times, not a big number.

With using memory pool the bench finished in 35ms. With using heap
it finished in 143-144ms. So just by allocating results on the pool
instead of the heap I won more than 100ms in 1kk mere returns. Not
selects, or gets, or other non trivial operations. Entirely on
returns, of pure TX thread time.

On the summary, I consider this low hanging perf fruit a good
enough result which is worth the code complication.

Talking of memory problems, my entirely subjective guess is that
partially we don't see malloc/free in our top results in flamegraphs,
because we avoid their usage always when possible. All hot data
structures are pooled - transactions, iterators, port entries (used
not only for C functions, but for all selects too). All hot temporary
data is batched on regions (lsregion in case of vinyl). So if we don't
see malloc in top used functions, it does not mean we can start using it
in hot places, because it seems to be light. It is actually not.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c
  2020-04-27 21:28         ` Vladislav Shpilevoy
@ 2020-04-27 23:24           ` Nikita Pettik
  0 siblings, 0 replies; 16+ messages in thread
From: Nikita Pettik @ 2020-04-27 23:24 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On 27 Apr 23:28, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
> 
> >>>> +int
> >>>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end)
> >>>> +{
> >>>> +	struct port_c *port = (struct port_c *)base;
> >>>> +	struct port_c_entry *pe;
> >>>> +	uint32_t size = mp_end - mp;
> >>>> +	char *dst;
> >>>> +	if (size <= PORT_ENTRY_SIZE) {
> >>>> +		/*
> >>>> +		 * Alloc on a mempool is several times faster than
> >>>> +		 * on the heap. And it perfectly fits any
> >>>> +		 * MessagePack number, a short string, a boolean.
> >>>> +		 */
> >>>> +		dst = mempool_alloc(&port_entry_pool);
> >>>
> >>> Dubious optimization...I mean is this code complication really
> >>> worth it?
> >>
> >> mempool_alloc() of this size is x4-5 times faster than malloc()
> >> according to my microbenchmarks on release build. I decided not
> >> to write concrete numbers x4-5, since they probably depend on
> >> machine.
> > 
> > Is there at least one real scenario where entry allocation
> > policy can turn out to be bottleneck in overall performance?
> 
> I guess we have plenty of such scenarios - choose any favorite among
> any of our GC problems, when solution/sales team need to write
> functions in C because of bad memory management in Lua or its high
> overusage because of uncontrolled unpacking of everything. C stored
> functions are supposed to be much much faster than Lua, be called
> hundreds of thousands times per second easily.

I should have said '... entry allocation policy in this particular case..'
implying exactly port_c_add_mp(). Surely, variety of allocators in
Tarantool exist for a reason..
 
> If we use heap for each such request, we loose some real perf, and
> aggravate heap fragmentation for other requests with lots of small
> allocations. Moreover, allocations of about the same size - exactly
> from what mempools are supposed to protect.
> 
> I measure here by numbers I got on my machine. I was running a C
> function, which returned 100 unsigned numbers, each <= 128. The
> function was called just 10k times, not a big number.

100 uints returned by one function sounds like an overestimation.
I mean in terms of synthetic tests perhaps it is ok, but is there
any profit on real workload? Let's leave this question as rhetorical,
since our current benchmarks can't tell us. Patch LGTM for sure.

> With using memory pool the bench finished in 35ms. With using heap
> it finished in 143-144ms. So just by allocating results on the pool
> instead of the heap I won more than 100ms in 1kk mere returns. Not
> selects, or gets, or other non trivial operations. Entirely on
> returns, of pure TX thread time.
> 
> On the summary, I consider this low hanging perf fruit a good
> enough result which is worth the code complication.

I just want to make sure that this optimization is really worth it.
"Premature optimization is the root of all evil."

> Talking of memory problems, my entirely subjective guess is that
> partially we don't see malloc/free in our top results in flamegraphs,
> because we avoid their usage always when possible. All hot data
> structures are pooled - transactions, iterators, port entries (used
> not only for C functions, but for all selects too). All hot temporary
> data is batched on regions (lsregion in case of vinyl). So if we don't
> see malloc in top used functions, it does not mean we can start using it
> in hot places, because it seems to be light. It is actually not.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-04-27 23:24 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy
2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy
2020-03-10 13:42   ` Nikita Pettik
2020-03-11  0:17     ` Vladislav Shpilevoy
2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy
2020-03-26 17:49   ` Nikita Pettik
2020-04-23  0:14     ` Vladislav Shpilevoy
2020-04-27 14:09       ` Nikita Pettik
2020-04-27 21:28         ` Vladislav Shpilevoy
2020-04-27 23:24           ` Nikita Pettik
2020-04-03 14:12   ` Igor Munkin
2020-04-23  0:14     ` Vladislav Shpilevoy
2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy
2020-03-26 17:51   ` Nikita Pettik
2020-04-03 14:13   ` Igor Munkin
2020-04-23  0:14     ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox