* [Tarantool-patches] [PATCH 0/3] box_return_mp @ 2020-03-08 17:47 Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw) To: tarantool-patches, korablev, imun The patchset extends the public C API with a function box_return_mp(). It allows to return arbitrary MessagePack from user's code. First patch is not really related to the ticket, but related to struct port, which is touched here. Second patch introduces a new struct port - port_c. It replaces port_tuple for stored C functions, and is able to store both tuples and raw MessagePack. Third patch introduces the new public function. Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-4641-c-function-ret-mp Issue: https://github.com/tarantool/tarantool/issues/4641 @ChangeLog - box_return_mp() - new C API function to return arbitrary MessagePack from stored C functions (gh-4641). Vladislav Shpilevoy (3): box: fix struct port_tuple.size wrong type in Lua box: introduce port_c box: introduce box_return_mp() public C function extra/exports | 1 + src/box/box.cc | 8 +- src/box/box.h | 19 ++++ src/box/func.c | 2 +- src/box/lua/misc.cc | 20 ++++ src/box/lua/schema.lua | 2 +- src/box/port.c | 191 ++++++++++++++++++++++++++++++++++-- src/box/port.h | 50 ++++++++++ src/box/sql/func.c | 22 +++-- test/box/function1.c | 37 +++++++ test/box/function1.result | 31 ++++++ test/box/function1.test.lua | 14 +++ 12 files changed, 375 insertions(+), 22 deletions(-) -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua 2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy @ 2020-03-08 17:47 ` Vladislav Shpilevoy 2020-03-10 13:42 ` Nikita Pettik 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy 2 siblings, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw) To: tarantool-patches, korablev, imun Original port_tuple in C has 'int size;' field. It was 'size_t size' in Lua. Since sizeof(size_t) usually is 8, and sizeof(int) is 4, this was a really bad typo. --- src/box/lua/schema.lua | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua index f537c3cec..aba439ffb 100644 --- a/src/box/lua/schema.lua +++ b/src/box/lua/schema.lua @@ -84,7 +84,7 @@ ffi.cdef[[ struct port_tuple { const struct port_vtab *vtab; - size_t size; + int size; struct port_tuple_entry *first; struct port_tuple_entry *last; struct port_tuple_entry first_entry; -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy @ 2020-03-10 13:42 ` Nikita Pettik 2020-03-11 0:17 ` Vladislav Shpilevoy 0 siblings, 1 reply; 16+ messages in thread From: Nikita Pettik @ 2020-03-10 13:42 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On 08 Mar 18:47, Vladislav Shpilevoy wrote: > Original port_tuple in C has 'int size;' field. It was > 'size_t size' in Lua. Since sizeof(size_t) usually is > 8, and sizeof(int) is 4, this was a really bad typo. LGTM (feel free to push out of order as obvious). > --- > src/box/lua/schema.lua | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua > index f537c3cec..aba439ffb 100644 > --- a/src/box/lua/schema.lua > +++ b/src/box/lua/schema.lua > @@ -84,7 +84,7 @@ ffi.cdef[[ > > struct port_tuple { > const struct port_vtab *vtab; > - size_t size; > + int size; > struct port_tuple_entry *first; > struct port_tuple_entry *last; > struct port_tuple_entry first_entry; > -- > 2.21.1 (Apple Git-122.3) > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua 2020-03-10 13:42 ` Nikita Pettik @ 2020-03-11 0:17 ` Vladislav Shpilevoy 0 siblings, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-03-11 0:17 UTC (permalink / raw) To: Nikita Pettik; +Cc: tarantool-patches On 10/03/2020 14:42, Nikita Pettik wrote: > On 08 Mar 18:47, Vladislav Shpilevoy wrote: >> Original port_tuple in C has 'int size;' field. It was >> 'size_t size' in Lua. Since sizeof(size_t) usually is >> 8, and sizeof(int) is 4, this was a really bad typo. > > LGTM (feel free to push out of order as obvious). Done. Pushed to master, 2.3, 2.2, 1.10. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy @ 2020-03-08 17:47 ` Vladislav Shpilevoy 2020-03-26 17:49 ` Nikita Pettik 2020-04-03 14:12 ` Igor Munkin 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy 2 siblings, 2 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw) To: tarantool-patches, korablev, imun Port_c is a new descendant of struct port. It is used now for public C functions to store their result. Currently they can return only a tuple, but it will change soon, they will be able to return arbitrary MessagePack. Port_tuple is not removed, because still is used for box_select(), for functional indexes, and in SQL as a base for port_sql. Although that may be changed later. Functional indexes really need only a single MessagePack object from their function. While box_select() working via port_tuple or port_c didn't show any significant difference during micro benchmarks. Part of #4641 --- As you can see, port_tuple is used in a very few places now. And by performance it is not really different from the new port_c, according to micro benchmarks I did. Moreover, it may be actually faster for functional indexes - users won't need to create a tuple just to return a functional index key, which is usually small, I believe. If you (reviewers) want me to drop port_tuple in scope of this patchset, I can do that. At the moment of sending the patchset I decided to postpone it. src/box/box.cc | 2 +- src/box/func.c | 2 +- src/box/lua/misc.cc | 20 +++++ src/box/port.c | 191 +++++++++++++++++++++++++++++++++++++++++--- src/box/port.h | 50 ++++++++++++ src/box/sql/func.c | 22 ++--- 6 files changed, 266 insertions(+), 21 deletions(-) diff --git a/src/box/box.cc b/src/box/box.cc index 09dd67ab4..276430913 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -1070,7 +1070,7 @@ boxk(int type, uint32_t space_id, const char *format, ...) int box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple) { - return port_tuple_add(ctx->port, tuple); + return port_c_add_tuple(ctx->port, tuple); } /* schema_find_id()-like method using only public API */ diff --git a/src/box/func.c b/src/box/func.c index 431577127..04b04b6ad 100644 --- a/src/box/func.c +++ b/src/box/func.c @@ -523,7 +523,7 @@ func_c_call(struct func *base, struct port *args, struct port *ret) if (data == NULL) return -1; - port_tuple_create(ret); + port_c_create(ret); box_function_ctx_t ctx = { ret }; /* Module can be changed after function reload. */ diff --git a/src/box/lua/misc.cc b/src/box/lua/misc.cc index 79b6cfe3a..a6ba024ea 100644 --- a/src/box/lua/misc.cc +++ b/src/box/lua/misc.cc @@ -78,6 +78,26 @@ port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat) } } +extern "C" void +port_c_dump_lua(struct port *base, struct lua_State *L, bool is_flat) +{ + struct port_c *port = (struct port_c *)base; + if (!is_flat) + lua_createtable(L, port->size, 0); + struct port_c_entry *pe = port->first; + const char *mp; + for (int i = 0; pe != NULL; pe = pe->next) { + if (pe->mp_size == 0) { + luaT_pushtuple(L, pe->tuple); + } else { + mp = pe->mp; + luamp_decode(L, luaL_msgpack_default, &mp); + } + if (!is_flat) + lua_rawseti(L, -2, ++i); + } +} + extern "C" void port_msgpack_dump_lua(struct port *base, struct lua_State *L, bool is_flat) { diff --git a/src/box/port.c b/src/box/port.c index 6e2fe3a6e..925a4b2d5 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -37,7 +37,12 @@ #include <fiber.h> #include "errinj.h" -static struct mempool port_tuple_entry_pool; +static struct mempool port_entry_pool; + +enum { + PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry), + sizeof(struct port_tuple_entry)), +}; int port_tuple_add(struct port *base, struct tuple *tuple) @@ -49,7 +54,7 @@ port_tuple_add(struct port *base, struct tuple *tuple) e = &port->first_entry; port->first = port->last = e; } else { - e = mempool_alloc(&port_tuple_entry_pool); + e = mempool_alloc(&port_entry_pool); if (e == NULL) { diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e"); return -1; @@ -87,7 +92,7 @@ port_tuple_destroy(struct port *base) struct port_tuple_entry *cur = e; e = e->next; tuple_unref(cur->tuple); - mempool_free(&port_tuple_entry_pool, cur); + mempool_free(&port_entry_pool, cur); } } @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) extern void port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); +static inline void +port_c_destroy_entry(struct port_c_entry *pe) +{ + if (pe->mp_size == 0) + tuple_unref(pe->tuple); + else if (pe->mp_size <= PORT_ENTRY_SIZE) + mempool_free(&port_entry_pool, pe->mp); + else + free(pe->mp); +} + +static void +port_c_destroy(struct port *base) +{ + struct port_c *port = (struct port_c *)base; + struct port_c_entry *pe = port->first; + if (pe == NULL) + return; + port_c_destroy_entry(pe); + pe = pe->next; + while (pe != NULL) { + struct port_c_entry *cur = pe; + pe = pe->next; + port_c_destroy_entry(cur); + mempool_free(&port_entry_pool, cur); + } +} + +static inline struct port_c_entry * +port_c_new_entry(struct port_c *port) +{ + struct port_c_entry *e; + if (port->size == 0) { + e = &port->first_entry; + port->first = e; + port->last = e; + } else { + e = mempool_alloc(&port_entry_pool); + if (e == NULL) { + diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e"); + return NULL; + } + port->last->next = e; + port->last = e; + } + e->next = NULL; + ++port->size; + return e; +} + +int +port_c_add_tuple(struct port *base, struct tuple *tuple) +{ + struct port_c *port = (struct port_c *)base; + struct port_c_entry *pe = port_c_new_entry(port); + if (pe != NULL) { + /* 0 mp_size means the entry stores a tuple. */ + pe->mp_size = 0; + pe->tuple = tuple; + tuple_ref(tuple); + return 0; + } + return -1; +} + +int +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) +{ + struct port_c *port = (struct port_c *)base; + struct port_c_entry *pe; + uint32_t size = mp_end - mp; + char *dst; + if (size <= PORT_ENTRY_SIZE) { + /* + * Alloc on a mempool is several times faster than + * on the heap. And it perfectly fits any + * MessagePack number, a short string, a boolean. + */ + dst = mempool_alloc(&port_entry_pool); + if (dst == NULL) { + diag_set(OutOfMemory, size, "mempool_alloc", "dst"); + return -1; + } + } else { + dst = malloc(size); + if (dst == NULL) { + diag_set(OutOfMemory, size, "malloc", "dst"); + return -1; + } + } + pe = port_c_new_entry(port); + if (pe != NULL) { + memcpy(dst, mp, size); + pe->mp = dst; + pe->mp_size = size; + return 0; + } + if (size <= PORT_ENTRY_SIZE) + mempool_free(&port_entry_pool, dst); + else + free(dst); + return -1; +} + +static int +port_c_dump_msgpack_16(struct port *base, struct obuf *out) +{ + struct port_c *port = (struct port_c *)base; + struct port_c_entry *pe; + for (pe = port->first; pe != NULL; pe = pe->next) { + uint32_t size = pe->mp_size; + if (size == 0) { + if (tuple_to_obuf(pe->tuple, out) != 0) + return -1; + } else if (obuf_dup(out, pe->mp, size) != size) { + diag_set(OutOfMemory, size, "obuf_dup", "data"); + return -1; + } + ERROR_INJECT(ERRINJ_PORT_DUMP, { + diag_set(OutOfMemory, + size == 0 ? tuple_size(pe->tuple) : size, + "obuf_dup", "data"); + return -1; + }); + } + return port->size; +} + +static int +port_c_dump_msgpack(struct port *base, struct obuf *out) +{ + struct port_c *port = (struct port_c *)base; + char *size_buf = obuf_alloc(out, mp_sizeof_array(port->size)); + if (size_buf == NULL) { + diag_set(OutOfMemory, mp_sizeof_array(port->size), "obuf_alloc", + "size_buf"); + return -1; + } + mp_encode_array(size_buf, port->size); + if (port_c_dump_msgpack_16(base, out) < 0) + return -1; + return 1; +} + +extern void +port_c_dump_lua(struct port *port, struct lua_State *L, bool is_flat); + +extern struct sql_value * +port_c_get_vdbemem(struct port *base, uint32_t *size); + +static const struct port_vtab port_c_vtab = { + .dump_msgpack = port_c_dump_msgpack, + .dump_msgpack_16 = port_c_dump_msgpack_16, + .dump_lua = port_c_dump_lua, + .dump_plain = NULL, + .get_msgpack = NULL, + .get_vdbemem = port_c_get_vdbemem, + .destroy = port_c_destroy, +}; + +void +port_c_create(struct port *base) +{ + struct port_c *port = (struct port_c *)base; + port->vtab = &port_c_vtab; + port->first = NULL; + port->last = NULL; + port->size = 0; +} + void port_init(void) { - mempool_create(&port_tuple_entry_pool, &cord()->slabc, - sizeof(struct port_tuple_entry)); + mempool_create(&port_entry_pool, &cord()->slabc, PORT_ENTRY_SIZE); } void port_free(void) { - mempool_destroy(&port_tuple_entry_pool); + mempool_destroy(&port_entry_pool); } -extern struct sql_value * -port_tuple_get_vdbemem(struct port *base, uint32_t *size); - const struct port_vtab port_tuple_vtab = { .dump_msgpack = port_tuple_dump_msgpack, .dump_msgpack_16 = port_tuple_dump_msgpack_16, .dump_lua = port_tuple_dump_lua, .dump_plain = NULL, .get_msgpack = NULL, - .get_vdbemem = port_tuple_get_vdbemem, + .get_vdbemem = NULL, .destroy = port_tuple_destroy, }; diff --git a/src/box/port.h b/src/box/port.h index 9d3d02b3c..cd40d8a15 100644 --- a/src/box/port.h +++ b/src/box/port.h @@ -130,6 +130,56 @@ void port_vdbemem_create(struct port *base, struct sql_value *mem, uint32_t mem_count); +/** + * The struct is PACKED to avoid unnecessary 4 byte padding + * after mp_size. These objects are never stored on stack, neither + * allocated as an array, so the padding is not needed. + */ +struct PACKED port_c_entry { + struct port_c_entry *next; + union { + /** Valid if mp_size is 0. */ + struct tuple *tuple; + /** + * Valid if mp_size is > 0. MessagePack is + * allocated either on heap or on the port entry + * mempool, if it fits into a pool object. + */ + char *mp; + }; + uint32_t mp_size; +}; + +static_assert(sizeof(struct port_c_entry) == 20, + "port_c_entry is expected to be perfectly aligned and packed"); + +/** + * C port is used by C functions from the public API. They can + * return tuples and arbitrary MessagePack. + */ +struct port_c { + const struct port_vtab *vtab; + struct port_c_entry *first; + struct port_c_entry *last; + struct port_c_entry first_entry; + int size; +}; + +static_assert(sizeof(struct port_c) <= sizeof(struct port), + "sizeof(struct port_c) must be <= sizeof(struct port)"); + +/** Create a C port object. */ +void +port_c_create(struct port *base); + +/** Append a tuple to the port. Tuple is referenced. */ +int +port_c_add_tuple(struct port *port, struct tuple *tuple); + +/** Append raw MessagePack to the port. It is copied. */ +int +port_c_add_mp(struct port *port, const char *mp, const char *mp_end); + void port_init(void); diff --git a/src/box/sql/func.c b/src/box/sql/func.c index 6e724c824..70147da1e 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -299,9 +299,9 @@ error: } struct sql_value * -port_tuple_get_vdbemem(struct port *base, uint32_t *size) +port_c_get_vdbemem(struct port *base, uint32_t *size) { - struct port_tuple *port = (struct port_tuple *)base; + struct port_c *port = (struct port_c *)base; *size = port->size; if (*size == 0 || *size > 1) { diag_set(ClientError, ER_SQL_FUNC_WRONG_RET_COUNT, "C", *size); @@ -315,14 +315,18 @@ port_tuple_get_vdbemem(struct port *base, uint32_t *size) if (val == NULL) return NULL; int i = 0; - struct port_tuple_entry *pe; + const char *data; + struct port_c_entry *pe; for (pe = port->first; pe != NULL; pe = pe->next) { - const char *data = tuple_data(pe->tuple); - if (mp_typeof(*data) != MP_ARRAY || - mp_decode_array(&data) != 1) { - diag_set(ClientError, ER_SQL_EXECUTE, - "Unsupported type passed from C"); - goto error; + if (pe->mp_size == 0) { + data = tuple_data(pe->tuple); + if (mp_decode_array(&data) != 1) { + diag_set(ClientError, ER_SQL_EXECUTE, + "Unsupported type passed from C"); + goto error; + } + } else { + data = pe->mp; } uint32_t len; const char *str; -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy @ 2020-03-26 17:49 ` Nikita Pettik 2020-04-23 0:14 ` Vladislav Shpilevoy 2020-04-03 14:12 ` Igor Munkin 1 sibling, 1 reply; 16+ messages in thread From: Nikita Pettik @ 2020-03-26 17:49 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On 08 Mar 18:47, Vladislav Shpilevoy wrote: In general, patch is okay. Below some comments which can be considered as nits. I see that your branch doesn't pass CI testing: https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status I haven't managed to find box/alter_limits among existing flaky tests. If this test fail is not really related to your patches, could you please file an issue? > If you (reviewers) want me to drop port_tuple in scope of this > patchset, I can do that. At the moment of sending the patchset I > decided to postpone it. I'd do it as a follow-up in case you have time to do it. Otherwise please just file an issue. > diff --git a/src/box/port.c b/src/box/port.c > index 6e2fe3a6e..925a4b2d5 100644 > --- a/src/box/port.c > +++ b/src/box/port.c > @@ -37,7 +37,12 @@ > #include <fiber.h> > #include "errinj.h" > > -static struct mempool port_tuple_entry_pool; I'd add comment explaining that port_entry_pool holds both port_c and port_tuple entries now. Moreover, as an optimization some msgpacks stored in port_c also can be allocated in this pool (as far as I understand). > +static struct mempool port_entry_pool; > + > +enum { > + PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry), > + sizeof(struct port_tuple_entry)), > +}; > > @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) > extern void > port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); > > +static inline void > +port_c_destroy_entry(struct port_c_entry *pe) > +{ I'd add a brief reference like: /* For allocation policy see port_c_add_tuple() and port_c_add_mp(). */ > + if (pe->mp_size == 0) > + tuple_unref(pe->tuple); > + else if (pe->mp_size <= PORT_ENTRY_SIZE) > + mempool_free(&port_entry_pool, pe->mp); > + else > + free(pe->mp); > +} > + > +static inline struct port_c_entry * > +port_c_new_entry(struct port_c *port) > +{ > + struct port_c_entry *e; > + if (port->size == 0) { > + e = &port->first_entry; I don't really get this 'first_entry' optimization. Why do you need it at all? > + port->first = e; > + port->last = e; > + } else { > + e = mempool_alloc(&port_entry_pool); > + if (e == NULL) { > + diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e"); > + return NULL; > + } > + port->last->next = e; > + port->last = e; > + } > + e->next = NULL; > + ++port->size; > + return e; > +} > + > +int > +port_c_add_tuple(struct port *base, struct tuple *tuple) > +{ > + struct port_c *port = (struct port_c *)base; > + struct port_c_entry *pe = port_c_new_entry(port); I guess reverting condition would make code flow more straightforward: if (pe == NULL) return -1; ... > + if (pe != NULL) { > + /* 0 mp_size means the entry stores a tuple. */ > + pe->mp_size = 0; > + pe->tuple = tuple; > + tuple_ref(tuple); > + return 0; > + } > + return -1; > +} > + > +int > +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) > +{ > + struct port_c *port = (struct port_c *)base; > + struct port_c_entry *pe; > + uint32_t size = mp_end - mp; > + char *dst; > + if (size <= PORT_ENTRY_SIZE) { > + /* > + * Alloc on a mempool is several times faster than > + * on the heap. And it perfectly fits any > + * MessagePack number, a short string, a boolean. > + */ > + dst = mempool_alloc(&port_entry_pool); Dubious optimization...I mean is this code complication really worth it? > + if (dst == NULL) { > + diag_set(OutOfMemory, size, "mempool_alloc", "dst"); > + return -1; > + } > + > +static int > +port_c_dump_msgpack_16(struct port *base, struct obuf *out) > +{ > + struct port_c *port = (struct port_c *)base; > + struct port_c_entry *pe; > + for (pe = port->first; pe != NULL; pe = pe->next) { > + uint32_t size = pe->mp_size; > + if (size == 0) { > + if (tuple_to_obuf(pe->tuple, out) != 0) > + return -1; > + } else if (obuf_dup(out, pe->mp, size) != size) { > + diag_set(OutOfMemory, size, "obuf_dup", "data"); > + return -1; > + } > + ERROR_INJECT(ERRINJ_PORT_DUMP, { > + diag_set(OutOfMemory, > + size == 0 ? tuple_size(pe->tuple) : size, > + "obuf_dup", "data"); > + return -1; > + }); You added error injection, but did not test it. Could you provide test (or point me at it please)? > + } > + return port->size; > +} > + > diff --git a/src/box/port.h b/src/box/port.h > index 9d3d02b3c..cd40d8a15 100644 > --- a/src/box/port.h > +++ b/src/box/port.h > @@ -130,6 +130,56 @@ void > port_vdbemem_create(struct port *base, struct sql_value *mem, > uint32_t mem_count); > > +/** > + * The struct is PACKED to avoid unnecessary 4 byte padding Is this really profitable? > + * after mp_size. These objects are never stored on stack, neither > + * allocated as an array, so the padding is not needed. > + */ > +struct PACKED port_c_entry { > + struct port_c_entry *next; > + union { > + /** Valid if mp_size is 0. */ > + struct tuple *tuple; > + /** > + * Valid if mp_size is > 0. MessagePack is > + * allocated either on heap or on the port entry > + * mempool, if it fits into a pool object. > + */ > + char *mp; > + }; > + uint32_t mp_size; > +}; > + > +/** > + * C port is used by C functions from the public API. They can > + * return tuples and arbitrary MessagePack. > + */ > +struct port_c { > + const struct port_vtab *vtab; > + struct port_c_entry *first; > + struct port_c_entry *last; > + struct port_c_entry first_entry; > + int size; Why do you call it size? I'd name it entry_cnt or count (imho port size = sum of all entries' sizes). > +}; > + > +static_assert(sizeof(struct port_c) <= sizeof(struct port), > + "sizeof(struct port_c) must be <= sizeof(struct port)"); > + > port_init(void); > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-03-26 17:49 ` Nikita Pettik @ 2020-04-23 0:14 ` Vladislav Shpilevoy 2020-04-27 14:09 ` Nikita Pettik 0 siblings, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-04-23 0:14 UTC (permalink / raw) To: Nikita Pettik; +Cc: tarantool-patches Hi! Thanks for the review! On 26/03/2020 18:49, Nikita Pettik wrote: > On 08 Mar 18:47, Vladislav Shpilevoy wrote: > > In general, patch is okay. Below some comments which can be > considered as nits. > > I see that your branch doesn't pass CI testing: > https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status > > I haven't managed to find box/alter_limits among existing flaky > tests. If this test fail is not really related to your patches, > could you please file an issue? It is not related (at least I think so), I saw it failing on other branches too. Created an issue: https://github.com/tarantool/tarantool/issues/4926 >> If you (reviewers) want me to drop port_tuple in scope of this >> patchset, I can do that. At the moment of sending the patchset I >> decided to postpone it. > > I'd do it as a follow-up in case you have time to do it. > Otherwise please just file an issue. Now I have time, so it is done in a separate commit in v2 thread. >> diff --git a/src/box/port.c b/src/box/port.c >> index 6e2fe3a6e..925a4b2d5 100644 >> --- a/src/box/port.c >> +++ b/src/box/port.c >> @@ -37,7 +37,12 @@ >> #include <fiber.h> >> #include "errinj.h" >> >> -static struct mempool port_tuple_entry_pool; > > I'd add comment explaining that port_entry_pool holds both > port_c and port_tuple entries now. Moreover, as an optimization > some msgpacks stored in port_c also can be allocated in this > pool (as far as I understand). Ok, no problem: ==================== diff --git a/src/box/port.c b/src/box/port.c index 3498f7af0..fea98ecc5 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -37,6 +37,12 @@ #include <fiber.h> #include "errinj.h" +/** + * The pools is used both by port_c and port_tuple, since their + * entires are almost of the same size. Also port_c can use + * objects from the pool to store result data in their memory, + * when it fits. + */ static struct mempool port_entry_pool; enum { ==================== >> +static struct mempool port_entry_pool; >> + >> +enum { >> + PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry), >> + sizeof(struct port_tuple_entry)), >> +}; >> >> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) >> extern void >> port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); >> >> +static inline void >> +port_c_destroy_entry(struct port_c_entry *pe) >> +{ > > I'd add a brief reference like: > > /* For allocation policy see port_c_add_tuple() and port_c_add_mp(). */ Ok, done: ==================== diff --git a/src/box/port.c b/src/box/port.c index fea98ecc5..2dfd0118d 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -141,6 +141,10 @@ port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); static inline void port_c_destroy_entry(struct port_c_entry *pe) { + /* + * See port_c_add_*() for algorithm of how and where to + * store data, to understand why it is freed differently. + */ if (pe->mp_size == 0) tuple_unref(pe->tuple); else if (pe->mp_size <= PORT_ENTRY_SIZE) ==================== >> + if (pe->mp_size == 0) >> + tuple_unref(pe->tuple); >> + else if (pe->mp_size <= PORT_ENTRY_SIZE) >> + mempool_free(&port_entry_pool, pe->mp); >> + else >> + free(pe->mp); >> +} >> + >> +static inline struct port_c_entry * >> +port_c_new_entry(struct port_c *port) >> +{ >> + struct port_c_entry *e; >> + if (port->size == 0) { >> + e = &port->first_entry; > > I don't really get this 'first_entry' optimization. Why do you > need it at all? Because most functions return only 1 result. And most of these return it as a tuple. So having this value inside the port preallocated allows to avoid any allocations/frees for these cases. Zero allocation. Although I didn't really bench how much it helps. I just stole it from port_tuple as a low hanging fruit. >> + port->first = e; >> + port->last = e; >> + } else { >> + e = mempool_alloc(&port_entry_pool); >> + if (e == NULL) { >> + diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e"); >> + return NULL; >> + } >> + port->last->next = e; >> + port->last = e; >> + } >> + e->next = NULL; >> + ++port->size; >> + return e; >> +} >> + >> +int >> +port_c_add_tuple(struct port *base, struct tuple *tuple) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe = port_c_new_entry(port); > > I guess reverting condition would make code flow more straightforward: > if (pe == NULL) > return -1; > ... I am infected with the paranoia that the most likely result should lead to the positive branch. But I don't mind to make it simpler. ==================== diff --git a/src/box/port.c b/src/box/port.c index 2dfd0118d..2c1fadb5c 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -202,14 +202,13 @@ port_c_add_tuple(struct port *base, struct tuple *tuple) { struct port_c *port = (struct port_c *)base; struct port_c_entry *pe = port_c_new_entry(port); - if (pe != NULL) { - /* 0 mp_size means the entry stores a tuple. */ - pe->mp_size = 0; - pe->tuple = tuple; - tuple_ref(tuple); - return 0; - } - return -1; + if (pe == NULL) + return -1; + /* 0 mp_size means the entry stores a tuple. */ + pe->mp_size = 0; + pe->tuple = tuple; + tuple_ref(tuple); + return 0; } int ==================== >> + if (pe != NULL) { >> + /* 0 mp_size means the entry stores a tuple. */ >> + pe->mp_size = 0; >> + pe->tuple = tuple; >> + tuple_ref(tuple); >> + return 0; >> + } >> + return -1; >> +} >> + >> +int >> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe; >> + uint32_t size = mp_end - mp; >> + char *dst; >> + if (size <= PORT_ENTRY_SIZE) { >> + /* >> + * Alloc on a mempool is several times faster than >> + * on the heap. And it perfectly fits any >> + * MessagePack number, a short string, a boolean. >> + */ >> + dst = mempool_alloc(&port_entry_pool); > > Dubious optimization...I mean is this code complication really > worth it? mempool_alloc() of this size is x4-5 times faster than malloc() according to my microbenchmarks on release build. I decided not to write concrete numbers x4-5, since they probably depend on machine. By saying 'on this size' I mean fitting into port entry object. And this is the perfect match for functions which return something small and scalar like a number, a boolean, or a small string. >> + if (dst == NULL) { >> + diag_set(OutOfMemory, size, "mempool_alloc", "dst"); >> + return -1; >> + } >> + >> +static int >> +port_c_dump_msgpack_16(struct port *base, struct obuf *out) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe; >> + for (pe = port->first; pe != NULL; pe = pe->next) { >> + uint32_t size = pe->mp_size; >> + if (size == 0) { >> + if (tuple_to_obuf(pe->tuple, out) != 0) >> + return -1; >> + } else if (obuf_dup(out, pe->mp, size) != size) { >> + diag_set(OutOfMemory, size, "obuf_dup", "data"); >> + return -1; >> + } >> + ERROR_INJECT(ERRINJ_PORT_DUMP, { >> + diag_set(OutOfMemory, >> + size == 0 ? tuple_size(pe->tuple) : size, >> + "obuf_dup", "data"); >> + return -1; >> + }); > > You added error injection, but did not test it. Could you provide test > (or point me at it please)? In the v2 thread it is used in box/errinj.test.lua to check that "port_dump can fail". >> + } >> + return port->size; >> +} >> + >> diff --git a/src/box/port.h b/src/box/port.h >> index 9d3d02b3c..cd40d8a15 100644 >> --- a/src/box/port.h >> +++ b/src/box/port.h >> @@ -130,6 +130,56 @@ void >> port_vdbemem_create(struct port *base, struct sql_value *mem, >> uint32_t mem_count); >> >> +/** >> + * The struct is PACKED to avoid unnecessary 4 byte padding > > Is this really profitable? -4 bytes for every object of this type by just adding one word to the struct? Sounds good to me. Very easy to do, to prove it works correct, and profit is obvious - the structure becomes 16% smaller. This means 16% more objects in every slab used for port_c_entry objects. Although I didn't measure how it affects things under heavy load on C functions. >> + * after mp_size. These objects are never stored on stack, neither >> + * allocated as an array, so the padding is not needed. >> + */ >> +struct PACKED port_c_entry { >> + struct port_c_entry *next; >> + union { >> + /** Valid if mp_size is 0. */ >> + struct tuple *tuple; >> + /** >> + * Valid if mp_size is > 0. MessagePack is >> + * allocated either on heap or on the port entry >> + * mempool, if it fits into a pool object. >> + */ >> + char *mp; >> + }; >> + uint32_t mp_size; >> +}; >> + >> +/** >> + * C port is used by C functions from the public API. They can >> + * return tuples and arbitrary MessagePack. >> + */ >> +struct port_c { >> + const struct port_vtab *vtab; >> + struct port_c_entry *first; >> + struct port_c_entry *last; >> + struct port_c_entry first_entry; >> + int size; > > Why do you call it size? I'd name it entry_cnt or count > (imho port size = sum of all entries' sizes). For that we have term 'bsize'. But I don't have anything against count, personally. The problem here is that it will be inconsistent with other ports (port_lua, and port_tuple which still exists here), and with local variables named 'size', which are assigned to port->size member - that is the problem in src/box/lua/call.c. When I renamed size to count, I got notable amount of unnecessary diff. So I would like to keep size unless you insist, and in that case I will do it in a separate patch. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-04-23 0:14 ` Vladislav Shpilevoy @ 2020-04-27 14:09 ` Nikita Pettik 2020-04-27 21:28 ` Vladislav Shpilevoy 0 siblings, 1 reply; 16+ messages in thread From: Nikita Pettik @ 2020-04-27 14:09 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On 23 Apr 02:14, Vladislav Shpilevoy wrote: > Hi! Thanks for the review! > > On 26/03/2020 18:49, Nikita Pettik wrote: > > On 08 Mar 18:47, Vladislav Shpilevoy wrote: > > > > In general, patch is okay. Below some comments which can be > > considered as nits. > > > > I see that your branch doesn't pass CI testing: > > https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status > > > > I haven't managed to find box/alter_limits among existing flaky > > tests. If this test fail is not really related to your patches, > > could you please file an issue? > > It is not related (at least I think so), I saw it failing on other > branches too. > > Created an issue: > https://github.com/tarantool/tarantool/issues/4926 Thanks. (No doubts it is not related to your patch, just want to make sure that all flaky tests are registrated.) > >> +} > >> + > >> +static inline struct port_c_entry * > >> +port_c_new_entry(struct port_c *port) > >> +{ > >> + struct port_c_entry *e; > >> + if (port->size == 0) { > >> + e = &port->first_entry; > > > > I don't really get this 'first_entry' optimization. Why do you > > need it at all? > > Because most functions return only 1 result. And most of these > return it as a tuple. So having this value inside the port > preallocated allows to avoid any allocations/frees for these > cases. Zero allocation. Although I didn't really bench how > much it helps. I just stole it from port_tuple as a low hanging > fruit. On the other hand you get memory overhead, which you fight for so much even using packed attribute... Why not then keeping 2, 3 or whatever number benchmarks say entries inlined? > >> + if (pe != NULL) { > >> + /* 0 mp_size means the entry stores a tuple. */ > >> + pe->mp_size = 0; > >> + pe->tuple = tuple; > >> + tuple_ref(tuple); > >> + return 0; > >> + } > >> + return -1; > >> +} > >> + > >> +int > >> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) > >> +{ > >> + struct port_c *port = (struct port_c *)base; > >> + struct port_c_entry *pe; > >> + uint32_t size = mp_end - mp; > >> + char *dst; > >> + if (size <= PORT_ENTRY_SIZE) { > >> + /* > >> + * Alloc on a mempool is several times faster than > >> + * on the heap. And it perfectly fits any > >> + * MessagePack number, a short string, a boolean. > >> + */ > >> + dst = mempool_alloc(&port_entry_pool); > > > > Dubious optimization...I mean is this code complication really > > worth it? > > mempool_alloc() of this size is x4-5 times faster than malloc() > according to my microbenchmarks on release build. I decided not > to write concrete numbers x4-5, since they probably depend on > machine. Is there at least one real scenario where entry allocation policy can turn out to be bottleneck in overall performance? > >> diff --git a/src/box/port.h b/src/box/port.h > >> index 9d3d02b3c..cd40d8a15 100644 > >> --- a/src/box/port.h > >> +++ b/src/box/port.h > >> @@ -130,6 +130,56 @@ void > >> port_vdbemem_create(struct port *base, struct sql_value *mem, > >> uint32_t mem_count); > >> > >> +/** > >> + * The struct is PACKED to avoid unnecessary 4 byte padding > > > > Is this really profitable? > > -4 bytes for every object of this type by just adding one word to > the struct? Sounds good to me. Very easy to do, to prove it works > correct, and profit is obvious - the structure becomes 16% smaller. > This means 16% more objects in every slab used for port_c_entry > objects. Although I didn't measure how it affects things under heavy > load on C functions. Well, okay. At least its profit is clear (in contrast to others). > >> +struct PACKED port_c_entry { > >> + struct port_c_entry *next; > >> + union { > >> + /** Valid if mp_size is 0. */ > >> + struct tuple *tuple; > >> + /** > >> + * Valid if mp_size is > 0. MessagePack is > >> + * allocated either on heap or on the port entry > >> + * mempool, if it fits into a pool object. > >> + */ > >> + char *mp; > >> + }; > >> + uint32_t mp_size; > >> +}; > >> + > >> +/** > >> + * C port is used by C functions from the public API. They can > >> + * return tuples and arbitrary MessagePack. > >> + */ > >> +struct port_c { > >> + const struct port_vtab *vtab; > >> + struct port_c_entry *first; > >> + struct port_c_entry *last; > >> + struct port_c_entry first_entry; > >> + int size; > > > > Why do you call it size? I'd name it entry_cnt or count > > (imho port size = sum of all entries' sizes). > > For that we have term 'bsize'. But I don't have anything against > count, personally. The problem here is that it will be inconsistent > with other ports (port_lua, and port_tuple which still exists here), > and with local variables named 'size', which are assigned to > port->size member - that is the problem in src/box/lua/call.c. > > When I renamed size to count, I got notable amount of unnecessary > diff. So I would like to keep size unless you insist, and in that > case I will do it in a separate patch. Personally I would still rename it, perhaps for all port classes. Up to you as non-functional change. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-04-27 14:09 ` Nikita Pettik @ 2020-04-27 21:28 ` Vladislav Shpilevoy 2020-04-27 23:24 ` Nikita Pettik 0 siblings, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-04-27 21:28 UTC (permalink / raw) To: Nikita Pettik; +Cc: tarantool-patches Hi! Thanks for the review! >>>> +} >>>> + >>>> +static inline struct port_c_entry * >>>> +port_c_new_entry(struct port_c *port) >>>> +{ >>>> + struct port_c_entry *e; >>>> + if (port->size == 0) { >>>> + e = &port->first_entry; >>> >>> I don't really get this 'first_entry' optimization. Why do you >>> need it at all? >> >> Because most functions return only 1 result. And most of these >> return it as a tuple. So having this value inside the port >> preallocated allows to avoid any allocations/frees for these >> cases. Zero allocation. Although I didn't really bench how >> much it helps. I just stole it from port_tuple as a low hanging >> fruit. > > On the other hand you get memory overhead, which you fight for > so much even using packed attribute... There is no overhead for preallocated entries (for our allocators and the heap), since port is on the stack always. At least there is no overhead with reasonable number of reserved entries. Of course, if you preallocate, say, thousand, it would affect the stack cache locality in a bad way. > Why not then keeping 2, 3 or whatever number benchmarks say > entries inlined? So yes, in theory adding 2 or 3 or a few more will probably speed up something, but not the most common case, when result is just one, or when there are tens or hundreds or even thousands results (although thousands are unlikely, since tx thread will have problems with selecting so much without yields). >>>> + if (pe != NULL) { >>>> + /* 0 mp_size means the entry stores a tuple. */ >>>> + pe->mp_size = 0; >>>> + pe->tuple = tuple; >>>> + tuple_ref(tuple); >>>> + return 0; >>>> + } >>>> + return -1; >>>> +} >>>> + >>>> +int >>>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) >>>> +{ >>>> + struct port_c *port = (struct port_c *)base; >>>> + struct port_c_entry *pe; >>>> + uint32_t size = mp_end - mp; >>>> + char *dst; >>>> + if (size <= PORT_ENTRY_SIZE) { >>>> + /* >>>> + * Alloc on a mempool is several times faster than >>>> + * on the heap. And it perfectly fits any >>>> + * MessagePack number, a short string, a boolean. >>>> + */ >>>> + dst = mempool_alloc(&port_entry_pool); >>> >>> Dubious optimization...I mean is this code complication really >>> worth it? >> >> mempool_alloc() of this size is x4-5 times faster than malloc() >> according to my microbenchmarks on release build. I decided not >> to write concrete numbers x4-5, since they probably depend on >> machine. > > Is there at least one real scenario where entry allocation > policy can turn out to be bottleneck in overall performance? I guess we have plenty of such scenarios - choose any favorite among any of our GC problems, when solution/sales team need to write functions in C because of bad memory management in Lua or its high overusage because of uncontrolled unpacking of everything. C stored functions are supposed to be much much faster than Lua, be called hundreds of thousands times per second easily. If we use heap for each such request, we loose some real perf, and aggravate heap fragmentation for other requests with lots of small allocations. Moreover, allocations of about the same size - exactly from what mempools are supposed to protect. I measure here by numbers I got on my machine. I was running a C function, which returned 100 unsigned numbers, each <= 128. The function was called just 10k times, not a big number. With using memory pool the bench finished in 35ms. With using heap it finished in 143-144ms. So just by allocating results on the pool instead of the heap I won more than 100ms in 1kk mere returns. Not selects, or gets, or other non trivial operations. Entirely on returns, of pure TX thread time. On the summary, I consider this low hanging perf fruit a good enough result which is worth the code complication. Talking of memory problems, my entirely subjective guess is that partially we don't see malloc/free in our top results in flamegraphs, because we avoid their usage always when possible. All hot data structures are pooled - transactions, iterators, port entries (used not only for C functions, but for all selects too). All hot temporary data is batched on regions (lsregion in case of vinyl). So if we don't see malloc in top used functions, it does not mean we can start using it in hot places, because it seems to be light. It is actually not. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-04-27 21:28 ` Vladislav Shpilevoy @ 2020-04-27 23:24 ` Nikita Pettik 0 siblings, 0 replies; 16+ messages in thread From: Nikita Pettik @ 2020-04-27 23:24 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On 27 Apr 23:28, Vladislav Shpilevoy wrote: > Hi! Thanks for the review! > > >>>> +int > >>>> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) > >>>> +{ > >>>> + struct port_c *port = (struct port_c *)base; > >>>> + struct port_c_entry *pe; > >>>> + uint32_t size = mp_end - mp; > >>>> + char *dst; > >>>> + if (size <= PORT_ENTRY_SIZE) { > >>>> + /* > >>>> + * Alloc on a mempool is several times faster than > >>>> + * on the heap. And it perfectly fits any > >>>> + * MessagePack number, a short string, a boolean. > >>>> + */ > >>>> + dst = mempool_alloc(&port_entry_pool); > >>> > >>> Dubious optimization...I mean is this code complication really > >>> worth it? > >> > >> mempool_alloc() of this size is x4-5 times faster than malloc() > >> according to my microbenchmarks on release build. I decided not > >> to write concrete numbers x4-5, since they probably depend on > >> machine. > > > > Is there at least one real scenario where entry allocation > > policy can turn out to be bottleneck in overall performance? > > I guess we have plenty of such scenarios - choose any favorite among > any of our GC problems, when solution/sales team need to write > functions in C because of bad memory management in Lua or its high > overusage because of uncontrolled unpacking of everything. C stored > functions are supposed to be much much faster than Lua, be called > hundreds of thousands times per second easily. I should have said '... entry allocation policy in this particular case..' implying exactly port_c_add_mp(). Surely, variety of allocators in Tarantool exist for a reason.. > If we use heap for each such request, we loose some real perf, and > aggravate heap fragmentation for other requests with lots of small > allocations. Moreover, allocations of about the same size - exactly > from what mempools are supposed to protect. > > I measure here by numbers I got on my machine. I was running a C > function, which returned 100 unsigned numbers, each <= 128. The > function was called just 10k times, not a big number. 100 uints returned by one function sounds like an overestimation. I mean in terms of synthetic tests perhaps it is ok, but is there any profit on real workload? Let's leave this question as rhetorical, since our current benchmarks can't tell us. Patch LGTM for sure. > With using memory pool the bench finished in 35ms. With using heap > it finished in 143-144ms. So just by allocating results on the pool > instead of the heap I won more than 100ms in 1kk mere returns. Not > selects, or gets, or other non trivial operations. Entirely on > returns, of pure TX thread time. > > On the summary, I consider this low hanging perf fruit a good > enough result which is worth the code complication. I just want to make sure that this optimization is really worth it. "Premature optimization is the root of all evil." > Talking of memory problems, my entirely subjective guess is that > partially we don't see malloc/free in our top results in flamegraphs, > because we avoid their usage always when possible. All hot data > structures are pooled - transactions, iterators, port entries (used > not only for C functions, but for all selects too). All hot temporary > data is batched on regions (lsregion in case of vinyl). So if we don't > see malloc in top used functions, it does not mean we can start using it > in hot places, because it seems to be light. It is actually not. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy 2020-03-26 17:49 ` Nikita Pettik @ 2020-04-03 14:12 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy 1 sibling, 1 reply; 16+ messages in thread From: Igor Munkin @ 2020-04-03 14:12 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks for the patch! I see no flaws, but I left several nits below in addition to Nikita's comments. On 08.03.20, Vladislav Shpilevoy wrote: > Port_c is a new descendant of struct port. It is used now for > public C functions to store their result. Currently they can > return only a tuple, but it will change soon, they will be able to > return arbitrary MessagePack. > > Port_tuple is not removed, because still is used for box_select(), > for functional indexes, and in SQL as a base for port_sql. > Although that may be changed later. Functional indexes really need > only a single MessagePack object from their function. While > box_select() working via port_tuple or port_c didn't show any > significant difference during micro benchmarks. > > Part of #4641 > --- > > As you can see, port_tuple is used in a very few places now. And > by performance it is not really different from the new port_c, > according to micro benchmarks I did. Moreover, it may be actually > faster for functional indexes - users won't need to create a tuple > just to return a functional index key, which is usually small, I > believe. > > If you (reviewers) want me to drop port_tuple in scope of this > patchset, I can do that. At the moment of sending the patchset I > decided to postpone it. > > src/box/box.cc | 2 +- > src/box/func.c | 2 +- > src/box/lua/misc.cc | 20 +++++ > src/box/port.c | 191 +++++++++++++++++++++++++++++++++++++++++--- > src/box/port.h | 50 ++++++++++++ > src/box/sql/func.c | 22 ++--- > 6 files changed, 266 insertions(+), 21 deletions(-) > <snipped> > diff --git a/src/box/port.c b/src/box/port.c > index 6e2fe3a6e..925a4b2d5 100644 > --- a/src/box/port.c > +++ b/src/box/port.c <snipped> > @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) > extern void > port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); > <snipped> > + > +static void > +port_c_destroy(struct port *base) > +{ > + struct port_c *port = (struct port_c *)base; > + struct port_c_entry *pe = port->first; > + if (pe == NULL) > + return; > + port_c_destroy_entry(pe); Minor: It's worth to drop a few words here that the port->first points to port->first_entry and it is released with the port_c structure. > + pe = pe->next; > + while (pe != NULL) { > + struct port_c_entry *cur = pe; > + pe = pe->next; > + port_c_destroy_entry(cur); > + mempool_free(&port_entry_pool, cur); > + } > +} > + <snipped> > + > +int > +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) > +{ > + struct port_c *port = (struct port_c *)base; > + struct port_c_entry *pe; > + uint32_t size = mp_end - mp; Minor: since there are no guarantees except the existing contract and common sense making this value be non-negative, I propose to add a corresponding assert. Feel free to ignore. > + char *dst; > + if (size <= PORT_ENTRY_SIZE) { > + /* > + * Alloc on a mempool is several times faster than > + * on the heap. And it perfectly fits any > + * MessagePack number, a short string, a boolean. > + */ > + dst = mempool_alloc(&port_entry_pool); > + if (dst == NULL) { > + diag_set(OutOfMemory, size, "mempool_alloc", "dst"); > + return -1; > + } > + } else { > + dst = malloc(size); > + if (dst == NULL) { > + diag_set(OutOfMemory, size, "malloc", "dst"); > + return -1; > + } > + } > + pe = port_c_new_entry(port); > + if (pe != NULL) { > + memcpy(dst, mp, size); > + pe->mp = dst; > + pe->mp_size = size; > + return 0; > + } > + if (size <= PORT_ENTRY_SIZE) > + mempool_free(&port_entry_pool, dst); > + else > + free(dst); > + return -1; > +} > + <snipped> > -- > 2.21.1 (Apple Git-122.3) > -- Best regards, IM ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c 2020-04-03 14:12 ` Igor Munkin @ 2020-04-23 0:14 ` Vladislav Shpilevoy 0 siblings, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-04-23 0:14 UTC (permalink / raw) To: Igor Munkin; +Cc: tarantool-patches Hi! Thanks for the review! >> diff --git a/src/box/port.c b/src/box/port.c >> index 6e2fe3a6e..925a4b2d5 100644 >> --- a/src/box/port.c >> +++ b/src/box/port.c >> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) >> extern void >> port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); >> > > <snipped> > >> + >> +static void >> +port_c_destroy(struct port *base) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe = port->first; >> + if (pe == NULL) >> + return; >> + port_c_destroy_entry(pe); > > Minor: It's worth to drop a few words here that the port->first points > to port->first_entry and it is released with the port_c structure. No problem: ==================== diff --git a/src/box/port.c b/src/box/port.c index 925a4b2d5..124625c7e 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -151,6 +151,11 @@ port_c_destroy(struct port *base) if (pe == NULL) return; port_c_destroy_entry(pe); + /* + * Port->first is skipped, it is pointing at + * port_c.first_entry, and is freed together with the + * port. + */ pe = pe->next; while (pe != NULL) { struct port_c_entry *cur = pe; ==================== >> + pe = pe->next; >> + while (pe != NULL) { >> + struct port_c_entry *cur = pe; >> + pe = pe->next; >> + port_c_destroy_entry(cur); >> + mempool_free(&port_entry_pool, cur); >> + } >> +} >> + > > <snipped> > >> + >> +int >> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe; >> + uint32_t size = mp_end - mp; > > Minor: since there are no guarantees except the existing contract and > common sense making this value be non-negative, I propose to add a > corresponding assert. Feel free to ignore. I don't mind. ==================== diff --git a/src/box/port.c b/src/box/port.c index 925a4b2d5..3498f7af0 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -202,6 +207,7 @@ port_c_add_mp(struct port *base, const char *mp, const char *mp_end) { struct port_c *port = (struct port_c *)base; struct port_c_entry *pe; + assert(mp_end > mp); uint32_t size = mp_end - mp; char *dst; if (size <= PORT_ENTRY_SIZE) { ==================== ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function 2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy @ 2020-03-08 17:47 ` Vladislav Shpilevoy 2020-03-26 17:51 ` Nikita Pettik 2020-04-03 14:13 ` Igor Munkin 2 siblings, 2 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-03-08 17:47 UTC (permalink / raw) To: tarantool-patches, korablev, imun Closes #4641 @TarantoolBot document Title: box_return_mp() public C function Stored C functions could return a result only via `box_return_tuple()` function. That made users create a tuple every time they wanted to return something from a C function. Now public C API offers another way to return - `box_return_mp()`. It allows to return arbitrary MessagePack, not wrapped into a tuple object. This is simpler to use for small results like a number, boolean, or a short string. Besides, `box_return_mp()` is much faster than `box_return_tuple()`, especially for small MessagePack. Note, that it is faster only if an alternative is to create a tuple by yourself. If an already existing tuple was obtained from an iterator, and you want to return it, then of course it is faster to return via `box_return_tuple()`, than via extraction of tuple data, and calling `box_return_mp()`. Here is the function declaration from module.h: ```C /** * Return MessagePack from a stored C procedure. The MessagePack * is copied, so it is safe to free/reuse the passed arguments * after the call. * MessagePack is not validated, for the sake of speed. It is * expected to be a single encoded object. An attempt to encode * and return multiple objects without wrapping them into an * MP_ARRAY or MP_MAP is undefined behaviour. * * \param ctx An opaque structure passed to the stored C procedure * by Tarantool. * \param mp Begin of MessagePack. * \param mp_end End of MessagePack. * \retval -1 Error. * \retval 0 Success. */ API_EXPORT int box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end); ``` --- extra/exports | 1 + src/box/box.cc | 6 ++++++ src/box/box.h | 19 +++++++++++++++++++ test/box/function1.c | 37 +++++++++++++++++++++++++++++++++++++ test/box/function1.result | 31 +++++++++++++++++++++++++++++++ test/box/function1.test.lua | 14 ++++++++++++++ 6 files changed, 108 insertions(+) diff --git a/extra/exports b/extra/exports index 3a0637317..d793edfe4 100644 --- a/extra/exports +++ b/extra/exports @@ -205,6 +205,7 @@ box_tuple_extract_key box_tuple_compare box_tuple_compare_with_key box_return_tuple +box_return_mp box_space_id_by_name box_index_id_by_name box_select diff --git a/src/box/box.cc b/src/box/box.cc index 276430913..4e15f21fc 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -1073,6 +1073,12 @@ box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple) return port_c_add_tuple(ctx->port, tuple); } +int +box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end) +{ + return port_c_add_mp(ctx->port, mp, mp_end); +} + /* schema_find_id()-like method using only public API */ uint32_t box_space_id_by_name(const char *name, uint32_t len) diff --git a/src/box/box.h b/src/box/box.h index f37a945eb..a23383630 100644 --- a/src/box/box.h +++ b/src/box/box.h @@ -284,6 +284,25 @@ typedef struct box_function_ctx box_function_ctx_t; API_EXPORT int box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple); +/** + * Return MessagePack from a stored C procedure. The MessagePack + * is copied, so it is safe to free/reuse the passed arguments + * after the call. + * MessagePack is not validated, for the sake of speed. It is + * expected to be a single encoded object. An attempt to encode + * and return multiple objects without wrapping them into an + * MP_ARRAY or MP_MAP is undefined behaviour. + * + * \param ctx An opaque structure passed to the stored C procedure + * by Tarantool. + * \param mp Begin of MessagePack. + * \param mp_end End of MessagePack. + * \retval -1 Error. + * \retval 0 Success. + */ +API_EXPORT int +box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end); + /** * Find space id by name. * diff --git a/test/box/function1.c b/test/box/function1.c index 87062d6a8..48a561100 100644 --- a/test/box/function1.c +++ b/test/box/function1.c @@ -245,3 +245,40 @@ test_sleep(box_function_ctx_t *ctx, const char *args, const char *args_end) fiber_sleep(0); return 0; } + +int +test_return_mp(box_function_ctx_t *ctx, const char *args, const char *args_end) +{ + (void) args; + (void) args_end; + char buf[512]; + char *pos = mp_encode_uint(buf, 1); + int rc = box_return_mp(ctx, buf, pos); + if (rc != 0) + return rc; + + pos = mp_encode_int(buf, -1); + rc = box_return_mp(ctx, buf, pos); + if (rc != 0) + return rc; + + pos = mp_encode_uint(buf, UINT64_MAX); + rc = box_return_mp(ctx, buf, pos); + if (rc != 0) + return rc; + + const char *str = "123456789101112131415"; + pos = mp_encode_str(buf, str, strlen(str)); + rc = box_return_mp(ctx, buf, pos); + if (rc != 0) + return rc; + + pos = mp_encode_array(buf, 1); + pos = mp_encode_uint(pos, 2); + box_tuple_t *tuple = box_tuple_new(box_tuple_format_default(), + buf, pos); + if (tuple == NULL) + return -1; + rc = box_return_tuple(ctx, tuple); + return rc; +} diff --git a/test/box/function1.result b/test/box/function1.result index b91d63c51..301f666ef 100644 --- a/test/box/function1.result +++ b/test/box/function1.result @@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn") --- ... -- +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack +-- from C functions. +-- +name = 'function1.test_return_mp' +--- +... +box.schema.func.create(name, {language = "C", exports = {'LUA'}}) +--- +... +box.func[name]:call() +--- +- 1 +- -1 +- 18446744073709551615 +- '123456789101112131415' +- [2] +... +box.schema.user.grant('guest', 'super') +--- +... +net:connect(box.cfg.listen):call(name) +--- +- [1, -1, 18446744073709551615, '123456789101112131415', [2]] +... +box.schema.user.revoke('guest', 'super') +--- +... +box.schema.func.drop(name) +--- +... +-- -- gh-4182: Introduce persistent Lua functions. -- test_run:cmd("setopt delimiter ';'") diff --git a/test/box/function1.test.lua b/test/box/function1.test.lua index b1841f3ad..670d63a44 100644 --- a/test/box/function1.test.lua +++ b/test/box/function1.test.lua @@ -268,6 +268,20 @@ box.schema.func.create('function1.multireturn', {language = "C", exports = {'LUA box.execute("SELECT \"function1.multireturn\"()") box.schema.func.drop("function1.multireturn") +-- +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack +-- from C functions. +-- +name = 'function1.test_return_mp' +box.schema.func.create(name, {language = "C", exports = {'LUA'}}) +box.func[name]:call() + +box.schema.user.grant('guest', 'super') +net:connect(box.cfg.listen):call(name) +box.schema.user.revoke('guest', 'super') + +box.schema.func.drop(name) + -- -- gh-4182: Introduce persistent Lua functions. -- -- 2.21.1 (Apple Git-122.3) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy @ 2020-03-26 17:51 ` Nikita Pettik 2020-04-03 14:13 ` Igor Munkin 1 sibling, 0 replies; 16+ messages in thread From: Nikita Pettik @ 2020-03-26 17:51 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On 08 Mar 18:47, Vladislav Shpilevoy wrote: > Closes #4641 LGTM as obvious. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy 2020-03-26 17:51 ` Nikita Pettik @ 2020-04-03 14:13 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy 1 sibling, 1 reply; 16+ messages in thread From: Igor Munkin @ 2020-04-03 14:13 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches Vlad, Thanks for the patch! LGTM except the one nit. On 08.03.20, Vladislav Shpilevoy wrote: > Closes #4641 > > @TarantoolBot document > Title: box_return_mp() public C function > > Stored C functions could return a result only via > `box_return_tuple()` function. That made users create a tuple > every time they wanted to return something from a C function. > > Now public C API offers another way to return - `box_return_mp()`. > It allows to return arbitrary MessagePack, not wrapped into a > tuple object. This is simpler to use for small results like a > number, boolean, or a short string. Besides, `box_return_mp()` is > much faster than `box_return_tuple()`, especially for small > MessagePack. > > Note, that it is faster only if an alternative is to create a > tuple by yourself. If an already existing tuple was obtained from > an iterator, and you want to return it, then of course it is > faster to return via `box_return_tuple()`, than via extraction of > tuple data, and calling `box_return_mp()`. > > Here is the function declaration from module.h: > ```C > /** > * Return MessagePack from a stored C procedure. The MessagePack > * is copied, so it is safe to free/reuse the passed arguments > * after the call. > * MessagePack is not validated, for the sake of speed. It is > * expected to be a single encoded object. An attempt to encode > * and return multiple objects without wrapping them into an > * MP_ARRAY or MP_MAP is undefined behaviour. > * > * \param ctx An opaque structure passed to the stored C procedure > * by Tarantool. > * \param mp Begin of MessagePack. > * \param mp_end End of MessagePack. > * \retval -1 Error. > * \retval 0 Success. > */ > API_EXPORT int > box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end); > ``` > --- > extra/exports | 1 + > src/box/box.cc | 6 ++++++ > src/box/box.h | 19 +++++++++++++++++++ > test/box/function1.c | 37 +++++++++++++++++++++++++++++++++++++ > test/box/function1.result | 31 +++++++++++++++++++++++++++++++ > test/box/function1.test.lua | 14 ++++++++++++++ > 6 files changed, 108 insertions(+) > <snipped> > diff --git a/test/box/function1.result b/test/box/function1.result > index b91d63c51..301f666ef 100644 > --- a/test/box/function1.result > +++ b/test/box/function1.result > @@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn") > --- > ... > -- > +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack > +-- from C functions. > +-- > +name = 'function1.test_return_mp' > +--- > +... > +box.schema.func.create(name, {language = "C", exports = {'LUA'}}) > +--- > +... > +box.func[name]:call() > +--- > +- 1 > +- -1 > +- 18446744073709551615 > +- '123456789101112131415' > +- [2] > +... > +box.schema.user.grant('guest', 'super') > +--- > +... Minor: I guess it worth to add a comment below referring the #4799 since I was confused with the output below. > +net:connect(box.cfg.listen):call(name) > +--- > +- [1, -1, 18446744073709551615, '123456789101112131415', [2]] > +... > +box.schema.user.revoke('guest', 'super') > +--- > +... > +box.schema.func.drop(name) > +--- > +... > +-- > -- gh-4182: Introduce persistent Lua functions. > -- > test_run:cmd("setopt delimiter ';'") <snipped> > -- > 2.21.1 (Apple Git-122.3) > -- Best regards, IM ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function 2020-04-03 14:13 ` Igor Munkin @ 2020-04-23 0:14 ` Vladislav Shpilevoy 0 siblings, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy @ 2020-04-23 0:14 UTC (permalink / raw) To: Igor Munkin; +Cc: tarantool-patches Thanks for the review! >> diff --git a/test/box/function1.result b/test/box/function1.result >> index b91d63c51..301f666ef 100644 >> --- a/test/box/function1.result >> +++ b/test/box/function1.result >> @@ -791,6 +791,37 @@ box.schema.func.drop("function1.multireturn") >> --- >> ... >> -- >> +-- gh-4641: box_return_mp() C API to return arbitrary MessagePack >> +-- from C functions. >> +-- >> +name = 'function1.test_return_mp' >> +--- >> +... >> +box.schema.func.create(name, {language = "C", exports = {'LUA'}}) >> +--- >> +... >> +box.func[name]:call() >> +--- >> +- 1 >> +- -1 >> +- 18446744073709551615 >> +- '123456789101112131415' >> +- [2] >> +... >> +box.schema.user.grant('guest', 'super') >> +--- >> +... > > Minor: I guess it worth to add a comment below referring the #4799 since > I was confused with the output below. Fair. ==================== diff --git a/test/box/function1.result b/test/box/function1.result index 301f666ef..905a4cdab 100644 --- a/test/box/function1.result +++ b/test/box/function1.result @@ -811,6 +811,8 @@ box.func[name]:call() box.schema.user.grant('guest', 'super') --- ... +-- Netbox:call() returns not the same as local call for C +-- functions, see #4799. net:connect(box.cfg.listen):call(name) --- - [1, -1, 18446744073709551615, '123456789101112131415', [2]] diff --git a/test/box/function1.test.lua b/test/box/function1.test.lua index 670d63a44..ab7b586a0 100644 --- a/test/box/function1.test.lua +++ b/test/box/function1.test.lua @@ -277,6 +277,8 @@ box.schema.func.create(name, {language = "C", exports = {'LUA'}}) box.func[name]:call() box.schema.user.grant('guest', 'super') +-- Netbox:call() returns not the same as local call for C +-- functions, see #4799. net:connect(box.cfg.listen):call(name) box.schema.user.revoke('guest', 'super') ==================== ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2020-04-27 23:24 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy 2020-03-10 13:42 ` Nikita Pettik 2020-03-11 0:17 ` Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy 2020-03-26 17:49 ` Nikita Pettik 2020-04-23 0:14 ` Vladislav Shpilevoy 2020-04-27 14:09 ` Nikita Pettik 2020-04-27 21:28 ` Vladislav Shpilevoy 2020-04-27 23:24 ` Nikita Pettik 2020-04-03 14:12 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy 2020-03-26 17:51 ` Nikita Pettik 2020-04-03 14:13 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox