Tarantool development patches archive
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] lua: refactor port_lua_do_dump and encode_lua_call
Date: Tue, 29 Jun 2021 10:07:59 +0300	[thread overview]
Message-ID: <YNrGzydr84utZH0x@root> (raw)
In-Reply-To: <f7828c1c-f878-0c57-d496-e02006b44086@tarantool.org>

Thanks for the questions!

On 24.06.21, Vladislav Shpilevoy wrote:
> Thanks for the investigation!
> 
> >>> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> >>> index 0315e720c..3b2572096 100644
> >>> --- a/src/box/lua/call.c
> >>> +++ b/src/box/lua/call.c
> >>> @@ -450,13 +482,20 @@ port_lua_do_dump(struct port *base, struct mpstream *stream,
> >>>  	struct encode_lua_ctx ctx;
> >>>  	ctx.port = port;
> >>>  	ctx.stream = stream;
> >>> -	struct lua_State *L = tarantool_L;
> >>> -	int top = lua_gettop(L);
> >>> -	if (lua_cpcall(L, handler, &ctx) != 0) {
> >>> -		luaT_toerror(port->L);
> >>> +	lua_State *L = port->L;
> >>> +	/*
> >>> +	 * At the moment Lua stack holds only values to encode.
> >>> +	 * Insert corresponding encoder to the bottom and push
> >>> +	 * encode context as lightuserdata to the top.
> >>> +	 */
> >>> +	const int size = lua_gettop(L);
> >>> +	lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
> >>> +	assert(lua_isfunction(L, -1) && lua_iscfunction(L, -1));
> >>> +	lua_insert(L, 1);
> >>
> >> If I remember correctly, this leads to moving all the existing
> >> stack elements forward. Which might be expensive. I know from
> >> Vlad Grubov's words that they have code with hundreds of values in
> >> multireturn from stored functions. Did you bench what happens when
> >> the Lua coroutine contains a lot of values? In the bench by the
> >> link above I see only 1-element array and map. Size of the array
> >> and map does not matter though. Only multireturn is interesting
> >> here. Like 'return 1, 2, 3, ...'.
> > 
> > I've added this benchmark (200 numbers to return) to the gist.
> > Local results for the bench:
> > 
> > Master:
> > | Encode mul: 237900953 mcs, 12.6 K ps
> > 
> > My branch:
> > | Encode mul: 235735350 mcs, 12.7 K ps
> > 
> > `luamp_encode()` has the biggest impact in `port_do_lua_dump()`
> > (`lua_insert()` costs ~0.1% of the whole runtime).
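To make the stack manipulation under discussion concrete, here is an illustrative model in plain Python (a toy list standing in for the Lua stack, not the Lua C API) of what `lua_insert(L, 1)` does in the patched `port_lua_do_dump()`: the encoder pushed on top is rotated to the bottom, shifting every existing value up by one slot.

```python
# Illustrative model only: a Python list stands in for the Lua stack.
def lua_insert_bottom(stack):
    """Rotate the top element to position 1, like lua_insert(L, 1)."""
    top = stack.pop()      # the encoder function just pushed on top
    stack.insert(0, top)   # O(n): every existing slot shifts up by one
    return stack

# The stack holds the values to encode; the encoder is pushed last.
stack = ["val1", "val2", "val3", "encoder"]
print(lua_insert_bottom(stack))  # ['encoder', 'val1', 'val2', 'val3']
```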
> 
> That looks suspicious. Why does it have such a small impact? Did

I've run the benchmark for the return case only with `perf stat` (and 5K
return values) and found three interesting metrics (the others are
practically the same, within 0.07%):

Master:
| 10,255,748,650 cache-misses      # 13.686 % of all cache refs         (12.49%)
| 1,364,762,310  LLC-load-misses   # 15.92% of all LL-cache accesses    (12.50%)
| 401,136,504    iTLB-load-misses  # 39.68% of all iTLB cache accesses  (12.51%)

My branch:
| 9,964,939,197  cache-misses      # 13.413 % of all cache refs         (12.52%)
| 1,296,545,699  LLC-load-misses   # 15.08% of all LL-cache accesses    (12.50%)
| 285,852,865    iTLB-load-misses  # 29.13% of all iTLB cache accesses  (12.49%)
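As a quick arithmetic sanity check on these counters (plain Python, numbers copied from the `perf stat` output above), the branch cuts the absolute iTLB load miss count by roughly 29% relative to master:

```python
# Counter values taken from the perf stat output quoted above.
master_itlb = 401_136_504
branch_itlb = 285_852_865
reduction = (master_itlb - branch_itlb) / master_itlb * 100
print(f"iTLB-load-misses reduced by {reduction:.1f}%")  # ~28.7%
```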

1) LLC (last-level cache, L2 in my case) misses decrease slightly.
I suppose the insert's movement operations hit the cache (L1 or L2)
instead of going to memory, which makes them rather cheap.

2) The iTLB (instruction translation look-aside buffer) [1] exists to
avoid the penalty of page walks through the page table. So it looks like
code locality is improved by this patch (maybe due to using
`luaT_call()` instead of `lua_cpcall()` and `luaT_toerror()`).

> you check that multireturn really made the stack bigger? Shouldn't

Yes, I did.

> it have linear complexity depending on the stack size after the
> target index?

Yes, with 5K items to return it becomes roughly equal to, or even a
little slower (up to ~1%) than, the original code. But encoding still
takes the biggest part of the runtime.
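For intuition about that linear cost, here is a hypothetical counting model (plain Python, not the real interpreter): rotating the encoder to the bottom moves every existing slot exactly once, so a 5K-value multireturn pays ~5K element moves on top of the 5K values it has to encode anyway.

```python
def moves_for_bottom_insert(values):
    """Simulate rotating the top element of an array-backed stack to the
    bottom (as lua_insert(L, 1) does) and count the element moves."""
    stack = list(values)
    top = stack[-1]
    moves = 0
    # Shift every slot below the top up by one position.
    for i in range(len(stack) - 1, 0, -1):
        stack[i] = stack[i - 1]
        moves += 1
    stack[0] = top
    return stack, moves

stack, moves = moves_for_bottom_insert(["v1", "v2", "encoder"])
print(stack, moves)  # ['encoder', 'v1', 'v2'] 2 -- moves grow with stack size
```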

Side note: if a user wants to return 5K values (or a table with 5K
values) from Tarantool, we can't stop them; we can only offer our
empathy.

> 
> The patch looks fine to me, if the benchmarks show there is even a
> slight perf win. But I think Igor should take another look.

[1]: https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/front-end-bound/itlb-overhead.html

-- 
Best regards,
Sergey Kaplun


Thread overview: 13+ messages
2021-06-18 18:14 Sergey Kaplun via Tarantool-patches
2021-06-21 20:36 ` Vladislav Shpilevoy via Tarantool-patches
2021-06-22 13:38   ` Sergey Kaplun via Tarantool-patches
2021-06-24 20:00     ` Vladislav Shpilevoy via Tarantool-patches
2021-06-29  7:07       ` Sergey Kaplun via Tarantool-patches [this message]
2021-08-01 12:34 ` Igor Munkin via Tarantool-patches
2021-08-02 14:25   ` Sergey Kaplun via Tarantool-patches
2021-08-12 17:35     ` Igor Munkin via Tarantool-patches
2021-08-13  7:30       ` Sergey Kaplun via Tarantool-patches
2021-08-13  7:41         ` Vitaliia Ioffe via Tarantool-patches
2021-08-13  9:27         ` Sergey Kaplun via Tarantool-patches
2021-08-04 22:29 ` Vladislav Shpilevoy via Tarantool-patches
2021-08-14 10:16 ` Igor Munkin via Tarantool-patches
