From: Igor Munkin <imun@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] fiber: abort trace recording on fiber yield
Date: Sat, 19 Sep 2020 18:29:58 +0300
Message-ID: <20200919152958.GN18920@tarantool.org>
In-Reply-To: <a69ec1ec-32cb-5c9e-e9bb-df9320b404ee@tarantool.org>

Vlad,

Thanks for your reply! This is a PoC (so the patch was not polished).
I'll address all your comments in a separate series, but let's finish
the discussion here first.

On 17.09.20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the investigation!
>
> See 5 comments below.
>
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index 0b05d72..7d0962f 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -1308,3 +1308,7 @@ tarantool_lua_utils_init(struct lua_State *L)
> >  	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
> >  	return 0;
> >  }
> > +
> > +void lua_on_yield(void)
> > +{
> > +}
> >
> > ================================================================================
> >
> > * Vanilla -> Patched [extern noop callback] (min, median, mean, max):
> > | fibers: 10;    iters: 100        0%   2%   0%   0%
> > | fibers: 10;    iters: 1000       1%   3%   1%  -1%
> > | fibers: 10;    iters: 10000     -1%   0%  -1%  -3%
> > | fibers: 10;    iters: 100000    -2%   0%  -1%   0%
> > | fibers: 100;   iters: 100        0%  -1%   0%  -4%
> > | fibers: 100;   iters: 1000       0%   1%   0%   0%
> > | fibers: 100;   iters: 10000      0%   0%   0%  -3%
> > | fibers: 100;   iters: 100000     0%   1%   0%  -2%
> > | fibers: 1000;  iters: 100        0%   0%  -1%  -3%
> > | fibers: 1000;  iters: 1000       0%   0%   0%   1%
> > | fibers: 1000;  iters: 10000      0%   0%   0%   0%
> > | fibers: 1000;  iters: 100000     0%   0%   0%  -1%
> > | fibers: 10000; iters: 100        0%   0%   0%   2%
> > | fibers: 10000; iters: 1000       0%  -1%   0%   2%
> > | fibers: 10000; iters: 10000     -1%  -1%   0%   0%
> > | fibers: 10000; iters: 100000    -1%   0%  -1%  -3%
> >
> > And here is a final one. I personally don't like it (considering my
> > comments in the previous reply), but *for now* it can be a solution.
>
> 1. I couldn't find - why don't you like it? It seems to be the fastest
> solution, not affecting the microbench at all, and definitely not
> affecting any more complex scenarios.

Here is the quote from my first reply[1]:
| > Why can't we call lj_trace_abort() directly?
|
| It's the internal API. Its usage complicates a switch between various
| LuaJIT implementations (we faced several challenges when we tried to
| build Tarantool with uJIT). There is a public API to be used here
| (though in a bit hacky way).
|
| >
| > And most importantly, how does it affect perf? New trigger
| > is +1 virtual call on each yield of every Lua fiber and +1
| > execution of non-trivial function luaJIT_setmode(). I think
| > it is better to write a micro bench checking how many yields
| > can we do per time unit before and after this patch. From Lua
| > fibers.

You're right, it's the fastest solution considering the microbench
results, but we pay for it with a code maintenance trade-off. *This* is
the fact I don't like.

I tested a noop workload (there is only a yield in the Lua bench), so
for a non-synthetic workload the numbers might differ less (or might
not).

Since we're using our bundled LuaJIT fork, I guess we can postpone these
thoughts for now and simply fix the issue. But I definitely would like
to return to it. Hope we'll have a nicer solution in the future.

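For reading the (min, median, mean, max) columns in the benchmark
tables, here is a minimal sketch in C. It assumes each cell is the
relative change of the corresponding statistic from the vanilla build to
the patched one; the thread does not state the exact formula or the sign
convention. The names bench_stats, bench_summarize and
relative_change_percent are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Summary of one benchmark series (e.g. one timing per run). */
struct bench_stats {
	double min;
	double median;
	double mean;
	double max;
};

/* qsort() comparator for doubles, ascending order. */
static int
cmp_double(const void *a, const void *b)
{
	double x = *(const double *)a;
	double y = *(const double *)b;
	return (x > y) - (x < y);
}

/* Sort the samples in place and compute min, median, mean and max. */
struct bench_stats
bench_summarize(double *samples, size_t n)
{
	assert(n > 0);
	qsort(samples, n, sizeof(*samples), cmp_double);
	double sum = 0;
	for (size_t i = 0; i < n; i++)
		sum += samples[i];
	struct bench_stats s;
	s.min = samples[0];
	s.max = samples[n - 1];
	s.mean = sum / n;
	s.median = n % 2 ? samples[n / 2]
			 : (samples[n / 2 - 1] + samples[n / 2]) / 2;
	return s;
}

/* Assumed formula: change from vanilla to patched, in percent. */
double
relative_change_percent(double vanilla, double patched)
{
	assert(vanilla != 0);
	return (patched - vanilla) / vanilla * 100.0;
}
```

With this reading, one would summarize the vanilla and the patched
series separately and then compare them field by field, e.g.
relative_change_percent(vanilla.median, patched.median).
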
>
> > ================================================================================
> >
> > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> > index 483ae3ce1..ed6104c8d 100644
> > --- a/src/lib/core/fiber.c
> > +++ b/src/lib/core/fiber.c
> > @@ -46,6 +46,8 @@
> >  #if ENABLE_FIBER_TOP
> >  #include <x86intrin.h> /* __rdtscp() */
> >
> > +extern void lua_on_yield(void);
> > +
> >  static inline void
> >  clock_stat_add_delta(struct clock_stat *stat, uint64_t clock_delta)
> >  {
> > @@ -416,6 +418,10 @@ fiber_call(struct fiber *callee)
> >  	/** By convention, these triggers must not throw. */
> >  	if (! rlist_empty(&caller->on_yield))
> >  		trigger_run(&caller->on_yield, NULL);
> > +
> > +	if (cord_is_main())
> > +		lua_on_yield();
>
> 2. Why not inside fiber_call_impl? I thought we need to call
> the abort on each coro_transfer().

<fiber_schedule_list> switches from the scheduler fiber to the ready
one. AFAICS, the scheduler doesn't enter the Lua world, so no abort is
necessary here.

>
> > +
> >  	clock_set_on_csw(caller);
> >  	callee->caller = caller;
> >  	callee->flags |= FIBER_IS_READY;
> > @@ -645,6 +651,10 @@ fiber_yield(void)
> >  	/** By convention, these triggers must not throw. */
> >  	if (! rlist_empty(&caller->on_yield))
> >  		trigger_run(&caller->on_yield, NULL);
> > +
> > +	if (cord_is_main())
> > +		lua_on_yield();
> > +
> >  	clock_set_on_csw(caller);
> >
> >  	assert(callee->flags & FIBER_IS_READY || callee == &cord->sched);
> > diff --git a/src/lua/utils.c b/src/lua/utils.c
> > index af114b0a2..49e3c2bf0 100644
> > --- a/src/lua/utils.c
> > +++ b/src/lua/utils.c
> > @@ -1308,3 +1308,9 @@ tarantool_lua_utils_init(struct lua_State *L)
> >  	luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX);
> >  	return 0;
> >  }
> > +
> > +#include "lj_trace.h"
>
> 3. Why is the header included here, and not in the beginning?
>
> 4. It is worth adding a comment.

This is a draft; I'll address your comments in the v2 series.

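The hook placement in the quoted diff can be modelled with a toy
sketch: the hook runs right before the context switch in the
fiber_call()/fiber_yield() paths and only on the main cord, while the
scheduler's own switch skips it. All toy_* names below are hypothetical
stand-ins and not Tarantool or LuaJIT API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cord state; not Tarantool's real type. */
struct toy_cord {
	bool is_main;   /* what cord_is_main() would report */
	int hook_calls; /* how many times the yield hook ran */
	int switches;   /* how many context switches happened */
};

/* Stand-in for lua_on_yield(); the real one aborts trace recording. */
void
toy_lua_on_yield(struct toy_cord *cord)
{
	cord->hook_calls++;
}

/* Stand-in for coro_transfer(); it only counts the switch. */
void
toy_transfer(struct toy_cord *cord)
{
	cord->switches++;
}

/* Models fiber_call()/fiber_yield(): hook first, main cord only. */
void
toy_fiber_yield(struct toy_cord *cord)
{
	assert(cord != NULL);
	if (cord->is_main)
		toy_lua_on_yield(cord);
	toy_transfer(cord);
}

/* Models the scheduler switching to a ready fiber: no hook call. */
void
toy_sched_switch(struct toy_cord *cord)
{
	assert(cord != NULL);
	toy_transfer(cord);
}
```

In this model a yield on the main cord bumps both counters, a yield on
any other cord bumps only the switch counter, and a scheduler switch
never touches the hook.
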
>
> > +void lua_on_yield(void)
> > +{
> > +	lj_trace_abort(G(tarantool_L));

I forgot to add the check whether yield occurs on the running trace.
Here are the corresponding *draft* changes:

================================================================================

diff --git a/src/lua/utils.c b/src/lua/utils.c
index 49e3c2bf0..8d72abdb9 100644
--- a/src/lua/utils.c
+++ b/src/lua/utils.c
@@ -1310,7 +1310,18 @@ tarantool_lua_utils_init(struct lua_State *L)
 }
 #include "lj_trace.h"
+#include "lj_err.h"
 void lua_on_yield(void)
 {
-	lj_trace_abort(G(tarantool_L));
+	struct global_State *g = G(tarantool_L);
+	/* Forbid Lua world re-entrancy while running the trace */
+	if (unlikely(tvref(g->jit_base))) {
+		struct lua_State *L = fiber()->storage.lua.stack;
+		setstrV(L, L->top++, lj_err_str(L, LJ_ERR_JITCALL));
+		if (g->panic)
+#undef panic
+			g->panic(L);
+		exit(EXIT_FAILURE);
+	}
+	lj_trace_abort(g);
 }

================================================================================

Here are benchmark results for the new implementation:

* Vanilla -> Patched [extern macro callback] (min, median, mean, max):
| fibers: 10;    iters: 100        0%   1%   0%   0%
| fibers: 10;    iters: 1000       2%  -2%  -3%  -5%
| fibers: 10;    iters: 10000     -3%  -2%  -2%  -2%
| fibers: 10;    iters: 100000    -1%  -1%  -2%  -4%
| fibers: 100;   iters: 100        0%   0%  -2%  -9%
| fibers: 100;   iters: 1000       0%   1%   1%   0%
| fibers: 100;   iters: 10000      0%   0%   0%   0%
| fibers: 100;   iters: 100000     0%   0%   0%  -1%
| fibers: 1000;  iters: 100        0%   1%   0%  -3%
| fibers: 1000;  iters: 1000       0%   1%   1%   4%
| fibers: 1000;  iters: 10000      0%   0%   0%   2%
| fibers: 1000;  iters: 100000     0%   0%   0%   1%
| fibers: 10000; iters: 100        0%   0%   0%   3%
| fibers: 10000; iters: 1000       0%   0%   0%   0%
| fibers: 10000; iters: 10000      0%   0%   0%   4%
| fibers: 10000; iters: 100000     0%   2%   1%   0%

> > +}
> >
> > ================================================================================
> >
> > * Vanilla -> Patched [extern macro callback] (min, median, mean, max):
> > | fibers: 10;    iters: 100        1%   1%   0%   0%
> > | fibers: 10;    iters: 1000       0%   4%   0%  -1%
> > | fibers: 10;    iters: 10000      0%   5%   2%   6%
> > | fibers: 10;    iters: 100000     0%   0%   0%   0%
> > | fibers: 100;   iters: 100        0%  -4%  -3%  -6%
> > | fibers: 100;   iters: 1000       0%   3%   1%   0%
> > | fibers: 100;   iters: 10000      0%   0%   0%  -2%
> > | fibers: 100;   iters: 100000     0%   1%   0%  -2%
> > | fibers: 1000;  iters: 100        0%   0%   0%  -4%
> > | fibers: 1000;  iters: 1000       0%   0%   0%  -1%
> > | fibers: 1000;  iters: 10000      0%   0%   0%   0%
> > | fibers: 1000;  iters: 100000     0%   0%   0%  -1%
> > | fibers: 10000; iters: 100       -1%   1%   1%   2%
> > | fibers: 10000; iters: 1000      -1%   0%   0%   2%
> > | fibers: 10000; iters: 10000      0%   0%   0%   0%
> > | fibers: 10000; iters: 100000     0%   0%   0%   0%
> >
> > There was also an alternative idea by Sergos: introduce a special
> > parameter to enable such feature by demand.
>
> 5. I am not sure it is so necessary - from your bench it looks the overhead
> is almost 0, not counting the rare noise about +-1%.

I agree, but Sergos proposed this way when results were not so good.

--
Best regards,
IM