From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Nikita Pettik <korablev@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c Date: Thu, 23 Apr 2020 02:14:36 +0200 [thread overview] Message-ID: <7301da02-16f8-2629-b673-8d6125032111@tarantool.org> (raw) In-Reply-To: <20200326174911.GA5718@tarantool.org> Hi! Thanks for the review! On 26/03/2020 18:49, Nikita Pettik wrote: > On 08 Mar 18:47, Vladislav Shpilevoy wrote: > > In general, patch is okay. Below some comments which can be > considered as nits. > > I see that your branch doesn't pass CI testing: > https://travis-ci.org/github/tarantool/tarantool/jobs/659861432?utm_medium=notification&utm_source=github_status > > I haven't managed to find box/alter_limits among existing flaky > tests. If this test fail is not really related to your patches, > could you please file an issue? It is not related (at least I think so), I saw it failing on other branches too. Created an issue: https://github.com/tarantool/tarantool/issues/4926 >> If you (reviewers) want me to drop port_tuple in scope of this >> patchset, I can do that. At the moment of sending the patchset I >> decided to postpone it. > > I'd do it as a follow-up in case you have time to do it. > Otherwise please just file an issue. Now I have time, so it is done in a separate commit in v2 thread. >> diff --git a/src/box/port.c b/src/box/port.c >> index 6e2fe3a6e..925a4b2d5 100644 >> --- a/src/box/port.c >> +++ b/src/box/port.c >> @@ -37,7 +37,12 @@ >> #include <fiber.h> >> #include "errinj.h" >> >> -static struct mempool port_tuple_entry_pool; > > I'd add comment explaining that port_entry_pool holds both > port_c and port_tuple entries now. Moreover, as an optimization > some msgpacks stored in port_c also can be allocated in this > pool (as far as I understand). Ok, no problem: ==================== diff --git a/src/box/port.c b/src/box/port.c index 3498f7af0..fea98ecc5 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -37,6 +37,12 @@ #include <fiber.h> #include "errinj.h" +/** + * The pools is used both by port_c and port_tuple, since their + * entires are almost of the same size. Also port_c can use + * objects from the pool to store result data in their memory, + * when it fits. + */ static struct mempool port_entry_pool; enum { ==================== >> +static struct mempool port_entry_pool; >> + >> +enum { >> + PORT_ENTRY_SIZE = MAX(sizeof(struct port_c_entry), >> + sizeof(struct port_tuple_entry)), >> +}; >> >> @@ -127,28 +132,194 @@ port_tuple_dump_msgpack(struct port *base, struct obuf *out) >> extern void >> port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); >> >> +static inline void >> +port_c_destroy_entry(struct port_c_entry *pe) >> +{ > > I'd add a brief reference like: > > /* For allocation policy see port_c_add_tuple() and port_c_add_mp(). */ Ok, done: ==================== diff --git a/src/box/port.c b/src/box/port.c index fea98ecc5..2dfd0118d 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -141,6 +141,10 @@ port_tuple_dump_lua(struct port *base, struct lua_State *L, bool is_flat); static inline void port_c_destroy_entry(struct port_c_entry *pe) { + /* + * See port_c_add_*() for algorithm of how and where to + * store data, to understand why it is freed differently. + */ if (pe->mp_size == 0) tuple_unref(pe->tuple); else if (pe->mp_size <= PORT_ENTRY_SIZE) ==================== >> + if (pe->mp_size == 0) >> + tuple_unref(pe->tuple); >> + else if (pe->mp_size <= PORT_ENTRY_SIZE) >> + mempool_free(&port_entry_pool, pe->mp); >> + else >> + free(pe->mp); >> +} >> + >> +static inline struct port_c_entry * >> +port_c_new_entry(struct port_c *port) >> +{ >> + struct port_c_entry *e; >> + if (port->size == 0) { >> + e = &port->first_entry; > > I don't really get this 'first_entry' optimization. Why do you > need it at all? Because most functions return only 1 result. And most of these return it as a tuple. So having this value inside the port preallocated allows to avoid any allocations/frees for these cases. Zero allocation. Although I didn't really bench how much it helps. I just stole it from port_tuple as a low hanging fruit. >> + port->first = e; >> + port->last = e; >> + } else { >> + e = mempool_alloc(&port_entry_pool); >> + if (e == NULL) { >> + diag_set(OutOfMemory, sizeof(*e), "mempool_alloc", "e"); >> + return NULL; >> + } >> + port->last->next = e; >> + port->last = e; >> + } >> + e->next = NULL; >> + ++port->size; >> + return e; >> +} >> + >> +int >> +port_c_add_tuple(struct port *base, struct tuple *tuple) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe = port_c_new_entry(port); > > I guess reverting condition would make code flow more straightforward: > if (pe == NULL) > return -1; > ... I am infected with the paranoia that the most likely result should lead to the positive branch. But I don't mind to make it simpler. ==================== diff --git a/src/box/port.c b/src/box/port.c index 2dfd0118d..2c1fadb5c 100644 --- a/src/box/port.c +++ b/src/box/port.c @@ -202,14 +202,13 @@ port_c_add_tuple(struct port *base, struct tuple *tuple) { struct port_c *port = (struct port_c *)base; struct port_c_entry *pe = port_c_new_entry(port); - if (pe != NULL) { - /* 0 mp_size means the entry stores a tuple. */ - pe->mp_size = 0; - pe->tuple = tuple; - tuple_ref(tuple); - return 0; - } - return -1; + if (pe == NULL) + return -1; + /* 0 mp_size means the entry stores a tuple. */ + pe->mp_size = 0; + pe->tuple = tuple; + tuple_ref(tuple); + return 0; } int ==================== >> + if (pe != NULL) { >> + /* 0 mp_size means the entry stores a tuple. */ >> + pe->mp_size = 0; >> + pe->tuple = tuple; >> + tuple_ref(tuple); >> + return 0; >> + } >> + return -1; >> +} >> + >> +int >> +port_c_add_mp(struct port *base, const char *mp, const char *mp_end) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe; >> + uint32_t size = mp_end - mp; >> + char *dst; >> + if (size <= PORT_ENTRY_SIZE) { >> + /* >> + * Alloc on a mempool is several times faster than >> + * on the heap. And it perfectly fits any >> + * MessagePack number, a short string, a boolean. >> + */ >> + dst = mempool_alloc(&port_entry_pool); > > Dubious optimization...I mean is this code complication really > worth it? mempool_alloc() of this size is x4-5 times faster than malloc() according to my microbenchmarks on release build. I decided not to write concrete numbers x4-5, since they probably depend on machine. By saying 'on this size' I mean fitting into port entry object. And this is the perfect match for functions which return something small and scalar like a number, a boolean, or a small string. >> + if (dst == NULL) { >> + diag_set(OutOfMemory, size, "mempool_alloc", "dst"); >> + return -1; >> + } >> + >> +static int >> +port_c_dump_msgpack_16(struct port *base, struct obuf *out) >> +{ >> + struct port_c *port = (struct port_c *)base; >> + struct port_c_entry *pe; >> + for (pe = port->first; pe != NULL; pe = pe->next) { >> + uint32_t size = pe->mp_size; >> + if (size == 0) { >> + if (tuple_to_obuf(pe->tuple, out) != 0) >> + return -1; >> + } else if (obuf_dup(out, pe->mp, size) != size) { >> + diag_set(OutOfMemory, size, "obuf_dup", "data"); >> + return -1; >> + } >> + ERROR_INJECT(ERRINJ_PORT_DUMP, { >> + diag_set(OutOfMemory, >> + size == 0 ? tuple_size(pe->tuple) : size, >> + "obuf_dup", "data"); >> + return -1; >> + }); > > You added error injection, but did not test it. Could you provide test > (or point me at it please)? In the v2 thread it is used in box/errinj.test.lua to check that "port_dump can fail". >> + } >> + return port->size; >> +} >> + >> diff --git a/src/box/port.h b/src/box/port.h >> index 9d3d02b3c..cd40d8a15 100644 >> --- a/src/box/port.h >> +++ b/src/box/port.h >> @@ -130,6 +130,56 @@ void >> port_vdbemem_create(struct port *base, struct sql_value *mem, >> uint32_t mem_count); >> >> +/** >> + * The struct is PACKED to avoid unnecessary 4 byte padding > > Is this really profitable? -4 bytes for every object of this type by just adding one word to the struct? Sounds good to me. Very easy to do, to prove it works correct, and profit is obvious - the structure becomes 16% smaller. This means 16% more objects in every slab used for port_c_entry objects. Although I didn't measure how it affects things under heavy load on C functions. >> + * after mp_size. These objects are never stored on stack, neither >> + * allocated as an array, so the padding is not needed. >> + */ >> +struct PACKED port_c_entry { >> + struct port_c_entry *next; >> + union { >> + /** Valid if mp_size is 0. */ >> + struct tuple *tuple; >> + /** >> + * Valid if mp_size is > 0. MessagePack is >> + * allocated either on heap or on the port entry >> + * mempool, if it fits into a pool object. >> + */ >> + char *mp; >> + }; >> + uint32_t mp_size; >> +}; >> + >> +/** >> + * C port is used by C functions from the public API. They can >> + * return tuples and arbitrary MessagePack. >> + */ >> +struct port_c { >> + const struct port_vtab *vtab; >> + struct port_c_entry *first; >> + struct port_c_entry *last; >> + struct port_c_entry first_entry; >> + int size; > > Why do you call it size? I'd name it entry_cnt or count > (imho port size = sum of all entries' sizes). For that we have term 'bsize'. But I don't have anything against count, personally. The problem here is that it will be inconsistent with other ports (port_lua, and port_tuple which still exists here), and with local variables named 'size', which are assigned to port->size member - that is the problem in src/box/lua/call.c. When I renamed size to count, I got notable amount of unnecessary diff. So I would like to keep size unless you insist, and in that case I will do it in a separate patch.
next prev parent reply other threads:[~2020-04-23 0:14 UTC|newest] Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-03-08 17:47 [Tarantool-patches] [PATCH 0/3] box_return_mp Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 1/3] box: fix struct port_tuple.size wrong type in Lua Vladislav Shpilevoy 2020-03-10 13:42 ` Nikita Pettik 2020-03-11 0:17 ` Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 2/3] box: introduce port_c Vladislav Shpilevoy 2020-03-26 17:49 ` Nikita Pettik 2020-04-23 0:14 ` Vladislav Shpilevoy [this message] 2020-04-27 14:09 ` Nikita Pettik 2020-04-27 21:28 ` Vladislav Shpilevoy 2020-04-27 23:24 ` Nikita Pettik 2020-04-03 14:12 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy 2020-03-08 17:47 ` [Tarantool-patches] [PATCH 3/3] box: introduce box_return_mp() public C function Vladislav Shpilevoy 2020-03-26 17:51 ` Nikita Pettik 2020-04-03 14:13 ` Igor Munkin 2020-04-23 0:14 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=7301da02-16f8-2629-b673-8d6125032111@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 2/3] box: introduce port_c' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox