* [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
@ 2020-06-19 20:40 ` Igor Munkin
2020-06-21 10:26 ` Sergey Ostanevich
2020-06-23 18:04 ` Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage Igor Munkin
` (3 subsequent siblings)
4 siblings, 2 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-19 20:40 UTC (permalink / raw)
To: Vladislav Shpilevoy, Sergey Ostanevich, Alexander V. Tikhonov
Cc: tarantool-patches
JIT compiler can generate invalid trace for <totable> function breaking
its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
produces right results, disabling JIT for this function (and all its
subfunctions) stops execution failures.
Relates to LuaJIT/LuaJIT#584
Fixes #4252
Signed-off-by: Igor Munkin <imun@tarantool.org>
---
test/box-tap/key_def.test.lua | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
index d7dbf5b88..ce7a3cb63 100755
--- a/test/box-tap/key_def.test.lua
+++ b/test/box-tap/key_def.test.lua
@@ -4,6 +4,14 @@ local tap = require('tap')
local ffi = require('ffi')
local json = require('json')
local fun = require('fun')
+
+-- Fix for gh-4252: to prevent invalid trace assembling (see
+-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
+-- method (these functions are different GCfunc objects) and all
+-- their subfunctions.
+jit.off(fun.totable, true)
+jit.off(fun.iter({}).totable, true)
+
local key_def_lib = require('key_def')
local usage_error = 'Bad params, use: key_def.new({' ..
--
2.25.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
@ 2020-06-19 20:40 ` Igor Munkin
2020-06-20 17:42 ` Vladislav Shpilevoy
2020-06-22 22:54 ` [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Vladislav Shpilevoy
` (2 subsequent siblings)
4 siblings, 1 reply; 22+ messages in thread
From: Igor Munkin @ 2020-06-19 20:40 UTC (permalink / raw)
To: Vladislav Shpilevoy, Sergey Ostanevich; +Cc: tarantool-patches
<box_process_lua> function created a new GCfunc object for a handler
having no upvalues depending on the request context on each call.
The changes introduces the folliwing mapping:
| <handler id> -> <handler GCfunc object>
Initializing this mapping on Tarantool startup is aimed to reduce Lua GC
memory usage.
Signed-off-by: Igor Munkin <imun@tarantool.org>
---
src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index 6588ec2fa..e1b1a5e81 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -48,6 +48,15 @@
#include "mpstream/mpstream.h"
#include "box/session.h"
+enum handlers {
+ HANDLER_CALL,
+ HANDLER_CALL_BY_REF,
+ HANDLER_EVAL,
+ HANDLER_MAX,
+};
+
+static int execute_lua_refs[HANDLER_MAX];
+
/**
* A helper to find a Lua function by name and put it
* on top of the stack.
@@ -527,7 +536,7 @@ static const struct port_vtab port_lua_vtab = {
};
static inline int
-box_process_lua(lua_CFunction handler, struct execute_lua_ctx *ctx,
+box_process_lua(enum handlers handler, struct execute_lua_ctx *ctx,
struct port *ret)
{
lua_State *L = luaT_newthread(tarantool_L);
@@ -537,7 +546,8 @@ box_process_lua(lua_CFunction handler, struct execute_lua_ctx *ctx,
port_lua_create(ret, L);
((struct port_lua *) ret)->ref = coro_ref;
- lua_pushcfunction(L, handler);
+ lua_rawgeti(L, LUA_REGISTRYINDEX, execute_lua_refs[handler]);
+ assert(lua_isfunction(L, -1));
lua_pushlightuserdata(L, ctx);
if (luaT_call(L, 1, LUA_MULTRET) != 0) {
port_lua_destroy(ret);
@@ -554,7 +564,7 @@ box_lua_call(const char *name, uint32_t name_len,
ctx.name = name;
ctx.name_len = name_len;
ctx.args = args;
- return box_process_lua(execute_lua_call, &ctx, ret);
+ return box_process_lua(HANDLER_CALL, &ctx, ret);
}
int
@@ -565,7 +575,7 @@ box_lua_eval(const char *expr, uint32_t expr_len,
ctx.name = expr;
ctx.name_len = expr_len;
ctx.args = args;
- return box_process_lua(execute_lua_eval, &ctx, ret);
+ return box_process_lua(HANDLER_EVAL, &ctx, ret);
}
struct func_lua {
@@ -781,7 +791,7 @@ func_persistent_lua_call(struct func *base, struct port *args, struct port *ret)
struct execute_lua_ctx ctx;
ctx.lua_ref = func->lua_ref;
ctx.args = args;
- return box_process_lua(execute_lua_call_by_ref, &ctx, ret);
+ return box_process_lua(HANDLER_CALL_BY_REF, &ctx, ret);
}
@@ -998,6 +1008,18 @@ box_lua_call_init(struct lua_State *L)
*/
on_alter_func_in_lua.data = L;
trigger_add(&on_alter_func, &on_alter_func_in_lua);
+
+ lua_CFunction handles[] = {
+ [HANDLER_CALL] = execute_lua_call,
+ [HANDLER_CALL_BY_REF] = execute_lua_call_by_ref,
+ [HANDLER_EVAL] = execute_lua_eval,
+ };
+
+ for (int i = 0; i < HANDLER_MAX; i++) {
+ lua_pushcfunction(L, handles[i]);
+ execute_lua_refs[i] = luaL_ref(L, LUA_REGISTRYINDEX);
+ }
+
#if 0
/* Get CTypeID for `struct port *' */
int rc = luaL_cdef(L, "struct port;");
--
2.25.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
@ 2020-06-19 20:50 Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
` (4 more replies)
0 siblings, 5 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-19 20:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, Sergey Ostanevich; +Cc: tarantool-patches
This series consists of the two patches that seems unrelated. However
you can look at the CI results[1] for the first incarnation[2] and see
the reason why the test is also attached.
The first patch just makes the flaky box-tap/key_def.test.lua stable and
thereby fixes #4252[3]. The fix is quite simple: since it hits the known
LuaJIT problem[4] disabling JIT for particular Lua Fun methods is a
solution for now. When the JIT misbehave is fixed, the patch can be
reverted. Furthermore, Sasha Tu. asked to open another ticket in our
queue tracking the origin LuaJIT issue, since #4252 will be closed.
I failed to find the exact way the patch affects this test, but again,
the result is in the CI.
The second patch was born in scope of Sasha L. research[5]. I glanced on
the flamegraphs and then to the sources and saw no reason why the same
functions need to be created on each iproto CALL/EVAL or "stored"
procedure call. These GCfunc objects live only in scope of the coroutine
(i.e. fiber) serving a request, but can be reused for the others since
there are no upvalues depending on the request context. As a result of
the change GC pressure is reduced.
Vlad asked me personally to make and provide some benchmarks for the
changes. I made the one you can find here[6]. One need to start
<callee.lua> instance and run <caller.lua> several times with the same
Tarantool binary.
After 15 runs with default args I obtained the following results:
* Vanilla bleeding master (mean):
| ===== 2.5.0-147-ge7a70be4e =====
| call by ref GC: 953127 Kb
| call by ref time: 1.401803 sec
| call GC: 507813 Kb
| eval GC: 686523 Kb
* Patched bleeding master (mean):
| ===== 2.5.0-149-g71b2bbc8a =====
| call by ref GC: 921877 Kb
| call by ref time: 1.318509 sec
| call GC: 476563 Kb
| eval GC: 655273 Kb
* Relative measurements (before -> after):
| call by ref GC: -3% (-31250 Kb)
| call by ref time: -5% (-0.083294 sec)
| call GC: -6% (-31250 Kb)
| eval GC: -4% (-31250 Kb)
Even if one consider benchmarks a bit synthetic, this change improves
platform performance a priori: as a result less garbage is produced,
ergo GC machinery is triggered less often.
Branch: https://github.com/tarantool/tarantool/tree/imun/experimental-execute-lua-handlers
I guess a ChangeLog entry is irrelevant, since nothing is changed for
the user space.
Igor Munkin (2):
test: disable JIT for Lua Fun totable function
box: reduce box_process_lua Lua GC memory usage
src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
test/box-tap/key_def.test.lua | 8 ++++++++
2 files changed, 35 insertions(+), 5 deletions(-)
[1]: https://gitlab.com/tarantool/tarantool/-/pipelines/138781540/builds
[2]: https://github.com/tarantool/tarantool/commit/8fe7f00
[3]: https://github.com/tarantool/tarantool/issues/4252
[4]: https://github.com/LuaJIT/LuaJIT/issues/584
[5]: https://github.com/tarantool/vshard/issues/224
[6]: https://gist.github.com/igormunkin/c941074fa9fdf0f7a4f068498fb5e24c
--
2.25.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage Igor Munkin
@ 2020-06-20 17:42 ` Vladislav Shpilevoy
2020-06-20 21:24 ` Igor Munkin
0 siblings, 1 reply; 22+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-20 17:42 UTC (permalink / raw)
To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches
Hi! Thanks for the patch!
On 19/06/2020 22:40, Igor Munkin wrote:
> <box_process_lua> function created a new GCfunc object for a handler
> having no upvalues depending on the request context on each call.
>
> The changes introduces the folliwing mapping:
Either 'changes' -> 'change', or 'introduces' -> 'introduce'.
'folliwing' -> 'following'.
> | <handler id> -> <handler GCfunc object>
> Initializing this mapping on Tarantool startup is aimed to reduce Lua GC
> memory usage.
>
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
> 1 file changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index 6588ec2fa..e1b1a5e81 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -48,6 +48,15 @@
> #include "mpstream/mpstream.h"
> #include "box/session.h"
>
> +enum handlers {
> + HANDLER_CALL,
> + HANDLER_CALL_BY_REF,
> + HANDLER_EVAL,
> + HANDLER_MAX,
Would be nice to have a comment here explaining why so complex.
Why lua_pushcfunction() can't be used on each call.
> +};
> +
> +static int execute_lua_refs[HANDLER_MAX];
> +
> /**
> * A helper to find a Lua function by name and put it
> * on top of the stack.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-20 17:42 ` Vladislav Shpilevoy
@ 2020-06-20 21:24 ` Igor Munkin
2020-06-21 10:27 ` Sergey Ostanevich
2020-06-21 15:35 ` Vladislav Shpilevoy
0 siblings, 2 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-20 21:24 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
Vlad,
Thanks for the review! I fixed your comments and updated the remote
branch.
On 20.06.20, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> On 19/06/2020 22:40, Igor Munkin wrote:
> > <box_process_lua> function created a new GCfunc object for a handler
> > having no upvalues depending on the request context on each call.
> >
> > The changes introduces the folliwing mapping:
>
> Either 'changes' -> 'change', or 'introduces' -> 'introduce'.
>
> 'folliwing' -> 'following'.
My bad, fixed.
>
> > | <handler id> -> <handler GCfunc object>
> > Initializing this mapping on Tarantool startup is aimed to reduce Lua GC
> > memory usage.
> >
> > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > ---
> > src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
> > 1 file changed, 27 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > index 6588ec2fa..e1b1a5e81 100644
> > --- a/src/box/lua/call.c
> > +++ b/src/box/lua/call.c
> > @@ -48,6 +48,15 @@
> > #include "mpstream/mpstream.h"
> > #include "box/session.h"
> >
> > +enum handlers {
> > + HANDLER_CALL,
> > + HANDLER_CALL_BY_REF,
> > + HANDLER_EVAL,
> > + HANDLER_MAX,
>
> Would be nice to have a comment here explaining why so complex.
> Why lua_pushcfunction() can't be used on each call.
OK, here is the diff:
================================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index e1b1a5e81..e52f16ca4 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -48,6 +48,15 @@
#include "mpstream/mpstream.h"
#include "box/session.h"
+/*
+ * Handlers identifiers to obtain lua_Cfunction reference from
+ * Lua registry table. These handlers are initialized on Tarantool
+ * startup and are used until the Lua universe is destroyed.
+ * Such approach reduces Lua GC usage since there is no need to
+ * create short-lived GCfunc objects for the corresponding C
+ * function on each iproto CALL/EVAL request or stored Lua
+ * procedure call.
+ */
enum handlers {
HANDLER_CALL,
HANDLER_CALL_BY_REF,
================================================================================
>
> > +};
> > +
> > +static int execute_lua_refs[HANDLER_MAX];
> > +
> > /**
> > * A helper to find a Lua function by name and put it
> > * on top of the stack.
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
@ 2020-06-21 10:26 ` Sergey Ostanevich
2020-06-21 20:24 ` Igor Munkin
2020-06-23 18:04 ` Igor Munkin
1 sibling, 1 reply; 22+ messages in thread
From: Sergey Ostanevich @ 2020-06-21 10:26 UTC (permalink / raw)
To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy
Hi!
Thanks for the patch!
AFAIU you just fix one of our test. Is it make sence to fix the library
code in fun.lua to bring the fix for all cutomers?
Also, as soon as we know the #584 essence - how hard will it be to test
all Tarantool libs for this bug?
Regards,
Sergos
On 19 Jun 23:40, Igor Munkin wrote:
> JIT compiler can generate invalid trace for <totable> function breaking
> its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> produces right results, disabling JIT for this function (and all its
> subfunctions) stops execution failures.
>
> Relates to LuaJIT/LuaJIT#584
> Fixes #4252
>
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> test/box-tap/key_def.test.lua | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> index d7dbf5b88..ce7a3cb63 100755
> --- a/test/box-tap/key_def.test.lua
> +++ b/test/box-tap/key_def.test.lua
> @@ -4,6 +4,14 @@ local tap = require('tap')
> local ffi = require('ffi')
> local json = require('json')
> local fun = require('fun')
> +
> +-- Fix for gh-4252: to prevent invalid trace assembling (see
> +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> +-- method (these functions are different GCfunc objects) and all
> +-- their subfunctions.
> +jit.off(fun.totable, true)
> +jit.off(fun.iter({}).totable, true)
> +
> local key_def_lib = require('key_def')
>
> local usage_error = 'Bad params, use: key_def.new({' ..
> --
> 2.25.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-20 21:24 ` Igor Munkin
@ 2020-06-21 10:27 ` Sergey Ostanevich
2020-06-21 10:40 ` Igor Munkin
2020-06-21 15:35 ` Vladislav Shpilevoy
1 sibling, 1 reply; 22+ messages in thread
From: Sergey Ostanevich @ 2020-06-21 10:27 UTC (permalink / raw)
To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy
Hi!
Thanks for the patch!
LGTM.
Regards,
Sergos
On 21 Jun 00:24, Igor Munkin wrote:
> Vlad,
>
> Thanks for the review! I fixed your comments and updated the remote
> branch.
>
> On 20.06.20, Vladislav Shpilevoy wrote:
> > Hi! Thanks for the patch!
> >
> > On 19/06/2020 22:40, Igor Munkin wrote:
> > > <box_process_lua> function created a new GCfunc object for a handler
> > > having no upvalues depending on the request context on each call.
> > >
> > > The changes introduces the folliwing mapping:
> >
> > Either 'changes' -> 'change', or 'introduces' -> 'introduce'.
> >
> > 'folliwing' -> 'following'.
>
> My bad, fixed.
>
> >
> > > | <handler id> -> <handler GCfunc object>
> > > Initializing this mapping on Tarantool startup is aimed to reduce Lua GC
> > > memory usage.
> > >
> > > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > > ---
> > > src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
> > > 1 file changed, 27 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > > index 6588ec2fa..e1b1a5e81 100644
> > > --- a/src/box/lua/call.c
> > > +++ b/src/box/lua/call.c
> > > @@ -48,6 +48,15 @@
> > > #include "mpstream/mpstream.h"
> > > #include "box/session.h"
> > >
> > > +enum handlers {
> > > + HANDLER_CALL,
> > > + HANDLER_CALL_BY_REF,
> > > + HANDLER_EVAL,
> > > + HANDLER_MAX,
> >
> > Would be nice to have a comment here explaining why so complex.
> > Why lua_pushcfunction() can't be used on each call.
>
> OK, here is the diff:
>
> ================================================================================
>
> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index e1b1a5e81..e52f16ca4 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -48,6 +48,15 @@
> #include "mpstream/mpstream.h"
> #include "box/session.h"
>
> +/*
> + * Handlers identifiers to obtain lua_Cfunction reference from
> + * Lua registry table. These handlers are initialized on Tarantool
> + * startup and are used until the Lua universe is destroyed.
> + * Such approach reduces Lua GC usage since there is no need to
> + * create short-lived GCfunc objects for the corresponding C
> + * function on each iproto CALL/EVAL request or stored Lua
> + * procedure call.
> + */
> enum handlers {
> HANDLER_CALL,
> HANDLER_CALL_BY_REF,
>
> ================================================================================
>
> >
> > > +};
> > > +
> > > +static int execute_lua_refs[HANDLER_MAX];
> > > +
> > > /**
> > > * A helper to find a Lua function by name and put it
> > > * on top of the stack.
>
> --
> Best regards,
> IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-21 10:27 ` Sergey Ostanevich
@ 2020-06-21 10:40 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-21 10:40 UTC (permalink / raw)
To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy
Sergos,
Thanks for your review!
On 21.06.20, Sergey Ostanevich wrote:
> Hi!
>
> Thanks for the patch!
>
> LGTM.
Added your tag:
| Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
>
> Regards,
> Sergos
>
<snipped>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-20 21:24 ` Igor Munkin
2020-06-21 10:27 ` Sergey Ostanevich
@ 2020-06-21 15:35 ` Vladislav Shpilevoy
2020-06-21 19:09 ` Igor Munkin
1 sibling, 1 reply; 22+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-21 15:35 UTC (permalink / raw)
To: Igor Munkin; +Cc: tarantool-patches
> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index e1b1a5e81..e52f16ca4 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -48,6 +48,15 @@
> #include "mpstream/mpstream.h"
> #include "box/session.h"
>
> +/*
Please, change to /**. In out of function comments we use /**
as a comment start.
> + * Handlers identifiers to obtain lua_Cfunction reference from
> + * Lua registry table. These handlers are initialized on Tarantool
> + * startup and are used until the Lua universe is destroyed.
> + * Such approach reduces Lua GC usage since there is no need to
> + * create short-lived GCfunc objects for the corresponding C
> + * function on each iproto CALL/EVAL request or stored Lua
> + * procedure call.
> + */
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage
2020-06-21 15:35 ` Vladislav Shpilevoy
@ 2020-06-21 19:09 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-21 19:09 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
Vlad,
On 21.06.20, Vladislav Shpilevoy wrote:
> > diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> > index e1b1a5e81..e52f16ca4 100644
> > --- a/src/box/lua/call.c
> > +++ b/src/box/lua/call.c
> > @@ -48,6 +48,15 @@
> > #include "mpstream/mpstream.h"
> > #include "box/session.h"
> >
> > +/*
>
> Please, change to /**. In out of function comments we use /**
> as a comment start.
Yes, it's mentioned in commenting example[1] but I see license comment
doesn't conform these guidelines. Nevertheless, I fixed, squashed,
force-pushed to the branch. Diff is below:
================================================================================
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index e52f16ca4..c3c60a4aa 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -48,7 +48,7 @@
#include "mpstream/mpstream.h"
#include "box/session.h"
-/*
+/**
* Handlers identifiers to obtain lua_Cfunction reference from
* Lua registry table. These handlers are initialized on Tarantool
* startup and are used until the Lua universe is destroyed.
================================================================================
>
> > + * Handlers identifiers to obtain lua_Cfunction reference from
> > + * Lua registry table. These handlers are initialized on Tarantool
> > + * startup and are used until the Lua universe is destroyed.
> > + * Such approach reduces Lua GC usage since there is no need to
> > + * create short-lived GCfunc objects for the corresponding C
> > + * function on each iproto CALL/EVAL request or stored Lua
> > + * procedure call.
> > + */
[1]: https://www.tarantool.io/en/doc/2.3/dev_guide/c_style_guide/#commenting-style
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-21 10:26 ` Sergey Ostanevich
@ 2020-06-21 20:24 ` Igor Munkin
2020-06-25 9:43 ` Sergey Ostanevich
0 siblings, 1 reply; 22+ messages in thread
From: Igor Munkin @ 2020-06-21 20:24 UTC (permalink / raw)
To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy
Sergos,
On 21.06.20, Sergey Ostanevich wrote:
> Hi!
>
> Thanks for the patch!
>
> AFAIU you just fix one of our test. Is it make sence to fix the library
> code in fun.lua to bring the fix for all cutomers?
Yes, I "fixed" the test (i.e. made it non-flaky) that reproduces the
mentioned issue.
We have already discussed it offline with Sasha Tu. Of course it makes
sense to fix the "source" of the bug, but what exactly we need to fix?
Assume there is exactly the chunk in client's code making JIT to emit an
invalid RENAME. So there is no option to fix it in scope of the platform
except turning JIT off by default. The same issue is with Lua Fun: it's
just a "third party" Lua code leading to invalid traces assembling.
Furthermore, we have lots of code in the platform using Lua Fun methods
with not a single JIT-related problem (at least I don't know any). Lua
Fun is a high-performance functional programming library for Lua
designed with LuaJIT's trace compiler in mind (I copied this from GitHub
repo), ergo disabling JIT for the whole library will nerf performance
for the well-compiled Lua code.
As a result I see two solutions (besides fixing the root cause):
* Turn JIT engine off for the whole platform (or even disable JIT
machinery at all). It definitely will lead to dramatic performance
drop and might also lead to further dramatic source code changes,
since some Tarantool parts are tuned a lot regarding JIT specifics.
* Provide a recipe for ones facing the similar problem, so they can use
Lua Fun until the fix is ready. I guess the way the patch is done can
be shared with all customers.
>
> Also, as soon as we know the #584 essence - how hard will it be to test
> all Tarantool libs for this bug?
Since JIT tries to optimize and narrow Lua dynamic specifics, it's like
finding a needle in a haystack. As you can see in my reproducer[1] I
just reduce register pressure via enabling sink optimization and amiss
RENAME is emitted. With disabled sink optimization everything works
fine.
It would be nice to enrich our suite with such tests for other Tarantool
libs, but even this single case is way too complex to find: we need to
emit destructive x86 op with the different "source" and "destination"
registers allocated between two side exits related to the same snapshot
-- I'm not sure I didn't missed something.
It looks like we can add a trace abort (or even assert) for a RENAME
emitted between two side exits with the same snapno. As a result broken
traces are not compiled. Thoughts? I need some time to think about it.
Long story short: the stars were aligned and the failure occurred. For
now I have no idea how to catch such traces without patching assembling
machinery.
>
> Regards,
> Sergos
>
>
> On 19 Jun 23:40, Igor Munkin wrote:
> > JIT compiler can generate invalid trace for <totable> function breaking
> > its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> > produces right results, disabling JIT for this function (and all its
> > subfunctions) stops execution failures.
> >
> > Relates to LuaJIT/LuaJIT#584
> > Fixes #4252
> >
> > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > ---
> > test/box-tap/key_def.test.lua | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> > index d7dbf5b88..ce7a3cb63 100755
> > --- a/test/box-tap/key_def.test.lua
> > +++ b/test/box-tap/key_def.test.lua
> > @@ -4,6 +4,14 @@ local tap = require('tap')
> > local ffi = require('ffi')
> > local json = require('json')
> > local fun = require('fun')
> > +
> > +-- Fix for gh-4252: to prevent invalid trace assembling (see
> > +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> > +-- method (these functions are different GCfunc objects) and all
> > +-- their subfunctions.
> > +jit.off(fun.totable, true)
> > +jit.off(fun.iter({}).totable, true)
> > +
> > local key_def_lib = require('key_def')
> >
> > local usage_error = 'Bad params, use: key_def.new({' ..
> > --
> > 2.25.0
> >
[1]: https://gist.github.com/igormunkin/18cb6afe5a495bce31f772d453f3117e#file-4252-lua
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage Igor Munkin
@ 2020-06-22 22:54 ` Vladislav Shpilevoy
2020-06-23 21:06 ` Vladislav Shpilevoy
2020-06-23 21:59 ` Igor Munkin
4 siblings, 0 replies; 22+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-22 22:54 UTC (permalink / raw)
To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches
LGTM.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
2020-06-21 10:26 ` Sergey Ostanevich
@ 2020-06-23 18:04 ` Igor Munkin
2020-06-23 18:45 ` Alexander V. Tikhonov
1 sibling, 1 reply; 22+ messages in thread
From: Igor Munkin @ 2020-06-23 18:04 UTC (permalink / raw)
To: Alexander V. Tikhonov, Vladislav Shpilevoy; +Cc: tarantool-patches
Sasha,
Many thanks for such precise testing you've made for this patch!
It turned out the patch was totally wrong, since it disables JIT for the
wrong function (i.e. <totable> method) and its subfunctions. However it
doesn't disable JIT for <totable> *callees*, so invalid traces continue
to be compiled. I've made a totally new patch (at the end of the reply)
that is quite similar to this one. Sasha has already tested it manually
and CI is green (I've rebased my branch on the bleeding master).
Sasha, Vlad, could you please glance at the changes once more?
On 19.06.20, Igor Munkin wrote:
> JIT compiler can generate invalid trace for <totable> function breaking
> its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> produces right results, disabling JIT for this function (and all its
> subfunctions) stops execution failures.
>
> Relates to LuaJIT/LuaJIT#584
> Fixes #4252
>
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> test/box-tap/key_def.test.lua | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> index d7dbf5b88..ce7a3cb63 100755
> --- a/test/box-tap/key_def.test.lua
> +++ b/test/box-tap/key_def.test.lua
> @@ -4,6 +4,14 @@ local tap = require('tap')
> local ffi = require('ffi')
> local json = require('json')
> local fun = require('fun')
> +
> +-- Fix for gh-4252: to prevent invalid trace assembling (see
> +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> +-- method (these functions are different GCfunc objects) and all
> +-- their subfunctions.
> +jit.off(fun.totable, true)
> +jit.off(fun.iter({}).totable, true)
> +
> local key_def_lib = require('key_def')
>
> local usage_error = 'Bad params, use: key_def.new({' ..
> --
> 2.25.0
>
================================================================================
[PATCH 1/2] test: disable JIT for Lua Fun chain iterator
JIT compiler can generate an invalid trace for <fun.chain> iterator
(i.e. chain_gen_r1) breaking its semantics (see LuaJIT/LuaJIT#584).
Since interpreter works fine and produces the right results, disabling
JIT for this function stops execution failures.
As a result box-tap/key_def.test.lua is removed from box-tap suite
fragile tests list.
Relates to LuaJIT/LuaJIT#584
Fixes #4252
Signed-off-by: Igor Munkin <imun@tarantool.org>
---
test/box-tap/key_def.test.lua | 8 ++++++++
test/box-tap/suite.ini | 1 -
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
index d7dbf5b88..3a4aad687 100755
--- a/test/box-tap/key_def.test.lua
+++ b/test/box-tap/key_def.test.lua
@@ -4,6 +4,14 @@ local tap = require('tap')
local ffi = require('ffi')
local json = require('json')
local fun = require('fun')
+
+-- XXX fix for gh-4252: to prevent invalid trace assembling (see
+-- https://github.com/LuaJIT/LuaJIT/issues/584) disable JIT for
+-- <fun.chain> iterator (i.e. <chain_gen_r1>). Since the function
+-- is local, the dummy chain generator is created to obtain the
+-- function GC object.
+jit.off(fun.chain({}).gen)
+
local key_def_lib = require('key_def')
local usage_error = 'Bad params, use: key_def.new({' ..
diff --git a/test/box-tap/suite.ini b/test/box-tap/suite.ini
index e62399a7e..07f8880b6 100644
--- a/test/box-tap/suite.ini
+++ b/test/box-tap/suite.ini
@@ -5,5 +5,4 @@ is_parallel = True
pretest_clean = True
use_unix_sockets_iproto = True
fragile = cfg.test.lua ; gh-4344
- key_def.test.lua ; gh-4252
config = suite.cfg
--
2.25.0
================================================================================
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-23 18:04 ` Igor Munkin
@ 2020-06-23 18:45 ` Alexander V. Tikhonov
2020-06-23 21:58 ` Igor Munkin
0 siblings, 1 reply; 22+ messages in thread
From: Alexander V. Tikhonov @ 2020-06-23 18:45 UTC (permalink / raw)
To: Igor Munkin; +Cc: tarantool-patches, v.shpilevoy
Hi Igor, many thanks for the patch. It was successfully checked
for a long time run, the patch LGTM.
On Tue, Jun 23, 2020 at 09:04:08PM +0300, Igor Munkin wrote:
> Sasha,
>
> Many thanks for such precise testing you've made for this patch!
>
> It turned out the patch was totally wrong, since it disables JIT for the
> wrong function (i.e. <totable> method) and its subfunctions. However it
> doesn't disable JIT for <totable> *callees*, so invalid traces continue
> to be compiled. I've made a totally new patch (at the end of the reply)
> that is quite similar to this one. Sasha has already tested it manually
> and CI is green (I've rebased my branch on the bleeding master).
>
> Sasha, Vlad, could you please glance at the changes once more?
>
> On 19.06.20, Igor Munkin wrote:
> > JIT compiler can generate invalid trace for <totable> function breaking
> > its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> > produces right results, disabling JIT for this function (and all its
> > subfunctions) stops execution failures.
> >
> > Relates to LuaJIT/LuaJIT#584
> > Fixes #4252
> >
> > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > ---
> > test/box-tap/key_def.test.lua | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> > index d7dbf5b88..ce7a3cb63 100755
> > --- a/test/box-tap/key_def.test.lua
> > +++ b/test/box-tap/key_def.test.lua
> > @@ -4,6 +4,14 @@ local tap = require('tap')
> > local ffi = require('ffi')
> > local json = require('json')
> > local fun = require('fun')
> > +
> > +-- Fix for gh-4252: to prevent invalid trace assembling (see
> > +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> > +-- method (these functions are different GCfunc objects) and all
> > +-- their subfunctions.
> > +jit.off(fun.totable, true)
> > +jit.off(fun.iter({}).totable, true)
> > +
> > local key_def_lib = require('key_def')
> >
> > local usage_error = 'Bad params, use: key_def.new({' ..
> > --
> > 2.25.0
> >
>
> ================================================================================
>
> [PATCH 1/2] test: disable JIT for Lua Fun chain iterator
>
> JIT compiler can generate an invalid trace for <fun.chain> iterator
> (i.e. chain_gen_r1) breaking its semantics (see LuaJIT/LuaJIT#584).
> Since interpreter works fine and produces the right results, disabling
> JIT for this function stops execution failures.
>
> As a result box-tap/key_def.test.lua is removed from box-tap suite
> fragile tests list.
>
> Relates to LuaJIT/LuaJIT#584
> Fixes #4252
>
> Signed-off-by: Igor Munkin <imun@tarantool.org>
> ---
> test/box-tap/key_def.test.lua | 8 ++++++++
> test/box-tap/suite.ini | 1 -
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> index d7dbf5b88..3a4aad687 100755
> --- a/test/box-tap/key_def.test.lua
> +++ b/test/box-tap/key_def.test.lua
> @@ -4,6 +4,14 @@ local tap = require('tap')
> local ffi = require('ffi')
> local json = require('json')
> local fun = require('fun')
> +
> +-- XXX fix for gh-4252: to prevent invalid trace assembling (see
> +-- https://github.com/LuaJIT/LuaJIT/issues/584) disable JIT for
> +-- <fun.chain> iterator (i.e. <chain_gen_r1>). Since the function
> +-- is local, the dummy chain generator is created to obtain the
> +-- function GC object.
> +jit.off(fun.chain({}).gen)
> +
> local key_def_lib = require('key_def')
>
> local usage_error = 'Bad params, use: key_def.new({' ..
> diff --git a/test/box-tap/suite.ini b/test/box-tap/suite.ini
> index e62399a7e..07f8880b6 100644
> --- a/test/box-tap/suite.ini
> +++ b/test/box-tap/suite.ini
> @@ -5,5 +5,4 @@ is_parallel = True
> pretest_clean = True
> use_unix_sockets_iproto = True
> fragile = cfg.test.lua ; gh-4344
> - key_def.test.lua ; gh-4252
> config = suite.cfg
> --
> 2.25.0
>
> ================================================================================
>
> --
> Best regards,
> IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
` (2 preceding siblings ...)
2020-06-22 22:54 ` [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Vladislav Shpilevoy
@ 2020-06-23 21:06 ` Vladislav Shpilevoy
2020-06-23 21:58 ` Igor Munkin
2020-06-23 21:59 ` Igor Munkin
4 siblings, 1 reply; 22+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-23 21:06 UTC (permalink / raw)
To: Igor Munkin, Sergey Ostanevich; +Cc: tarantool-patches
LGTM.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-23 18:45 ` Alexander V. Tikhonov
@ 2020-06-23 21:58 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-23 21:58 UTC (permalink / raw)
To: Alexander V. Tikhonov; +Cc: tarantool-patches, Vladislav Shpilevoy
Sasha,
Thanks for your review!
On 23.06.20, Alexander V. Tikhonov wrote:
> Hi Igor, many thanks for the patch. It was successfully checked
> for a long time run, the patch LGTM.
Added your tag:
| Reviewed-by: Alexander V. Tikhonov <avtikhon@tarantool.org>
>
<snipped>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
2020-06-23 21:06 ` Vladislav Shpilevoy
@ 2020-06-23 21:58 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-23 21:58 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tarantool-patches
Vlad,
Thanks for your review!
On 23.06.20, Vladislav Shpilevoy wrote:
> LGTM.
Added your tag to both patches:
| Reviewed-by: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
` (3 preceding siblings ...)
2020-06-23 21:06 ` Vladislav Shpilevoy
@ 2020-06-23 21:59 ` Igor Munkin
2020-06-27 13:22 ` Igor Munkin
4 siblings, 1 reply; 22+ messages in thread
From: Igor Munkin @ 2020-06-23 21:59 UTC (permalink / raw)
To: Kirill Yukhin; +Cc: tarantool-patches, Vladislav Shpilevoy
Kirill,
Please proceed with the series.
On 19.06.20, Igor Munkin wrote:
> This series consists of the two patches that seems unrelated. However
> you can look at the CI results[1] for the first incarnation[2] and see
> the reason why the test is also attached.
>
> The first patch just makes the flaky box-tap/key_def.test.lua stable and
> thereby fixes #4252[3]. The fix is quite simple: since it hits the known
> LuaJIT problem[4] disabling JIT for particular Lua Fun methods is a
> solution for now. When the JIT misbehave is fixed, the patch can be
> reverted. Furthermore, Sasha Tu. asked to open another ticket in our
> queue tracking the origin LuaJIT issue, since #4252 will be closed.
>
> I failed to find the exact way the patch affects this test, but again,
> the result is in the CI.
>
> The second patch was born in scope of Sasha L. research[5]. I glanced on
> the flamegraphs and then to the sources and saw no reason why the same
> functions need to be created on each iproto CALL/EVAL or "stored"
> procedure call. These GCfunc objects live only in scope of the coroutine
> (i.e. fiber) serving a request, but can be reused for the others since
> there are no upvalues depending on the request context. As a result of
> the change GC pressure is reduced.
>
> Vlad asked me personally to make and provide some benchmarks for the
> changes. I made the one you can find here[6]. One need to start
> <callee.lua> instance and run <caller.lua> several times with the same
> Tarantool binary.
>
> After 15 runs with default args I obtained the following results:
> * Vanilla bleeding master (mean):
> | ===== 2.5.0-147-ge7a70be4e =====
> | call by ref GC: 953127 Kb
> | call by ref time: 1.401803 sec
> | call GC: 507813 Kb
> | eval GC: 686523 Kb
> * Patched bleeding master (mean):
> | ===== 2.5.0-149-g71b2bbc8a =====
> | call by ref GC: 921877 Kb
> | call by ref time: 1.318509 sec
> | call GC: 476563 Kb
> | eval GC: 655273 Kb
> * Relative measurements (before -> after):
> | call by ref GC: -3% (-31250 Kb)
> | call by ref time: -5% (-0.083294 sec)
> | call GC: -6% (-31250 Kb)
> | eval GC: -4% (-31250 Kb)
>
> Even if one consider benchmarks a bit synthetic, this change improves
> platform performance a priori: as a result less garbage is produced,
> ergo GC machinery is triggered less often.
>
> Branch: https://github.com/tarantool/tarantool/tree/imun/experimental-execute-lua-handlers
>
> I guess a ChangeLog entry is irrelevant, since nothing is changed for
> the user space.
>
> Igor Munkin (2):
> test: disable JIT for Lua Fun totable function
> box: reduce box_process_lua Lua GC memory usage
>
> src/box/lua/call.c | 32 +++++++++++++++++++++++++++-----
> test/box-tap/key_def.test.lua | 8 ++++++++
> 2 files changed, 35 insertions(+), 5 deletions(-)
>
> [1]: https://gitlab.com/tarantool/tarantool/-/pipelines/138781540/builds
> [2]: https://github.com/tarantool/tarantool/commit/8fe7f00
> [3]: https://github.com/tarantool/tarantool/issues/4252
> [4]: https://github.com/LuaJIT/LuaJIT/issues/584
> [5]: https://github.com/tarantool/vshard/issues/224
> [6]: https://gist.github.com/igormunkin/c941074fa9fdf0f7a4f068498fb5e24c
>
> --
> 2.25.0
>
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-21 20:24 ` Igor Munkin
@ 2020-06-25 9:43 ` Sergey Ostanevich
2020-06-26 11:11 ` Igor Munkin
0 siblings, 1 reply; 22+ messages in thread
From: Sergey Ostanevich @ 2020-06-25 9:43 UTC (permalink / raw)
To: Igor Munkin; +Cc: tarantool-patches, Vladislav Shpilevoy
Hi!
Thanks for explanation!
My question was about hypothetic solution - as you mentioned for a
trace with wrong RENAME - how long will it take to implement such a
workaround?
It whould affect only traces with such problem, and next approach can
be to retry the trace with less optimizations, not just abort it?
Regards,
Sergos
On 21 Jun 23:24, Igor Munkin wrote:
> Sergos,
>
> On 21.06.20, Sergey Ostanevich wrote:
> > Hi!
> >
> > Thanks for the patch!
> >
> > AFAIU you just fix one of our test. Is it make sence to fix the library
> > code in fun.lua to bring the fix for all cutomers?
>
> Yes, I "fixed" the test (i.e. made it non-flaky) that reproduces the
> mentioned issue.
>
> We have already discussed it offline with Sasha Tu. Of course it makes
> sense to fix the "source" of the bug, but what exactly we need to fix?
>
> Assume there is exactly the chunk in client's code making JIT to emit an
> invalid RENAME. So there is no option to fix it in scope of the platform
> except turning JIT off by default. The same issue is with Lua Fun: it's
> just a "third party" Lua code leading to invalid traces assembling.
>
> Furthermore, we have lots of code in the platform using Lua Fun methods
> with not a single JIT-related problem (at least I don't know any). Lua
> Fun is a high-performance functional programming library for Lua
> designed with LuaJIT's trace compiler in mind (I copied this from GitHub
> repo), ergo disabling JIT for the whole library will nerf performance
> for the well-compiled Lua code.
>
> As a result I see two solutions (besides fixing the root cause):
> * Turn JIT engine off for the whole platform (or even disable JIT
> machinery at all). It definitely will lead to dramatic performance
> drop and might also lead to further dramatic source code changes,
> since some Tarantool parts are tuned a lot regarding JIT specifics.
> * Provide a recipe for ones facing the similar problem, so they can use
> Lua Fun until the fix is ready. I guess the way the patch is done can
> be shared with all customers.
>
> >
> > Also, as soon as we know the #584 essence - how hard will it be to test
> > all Tarantool libs for this bug?
>
> Since JIT tries to optimize and narrow Lua dynamic specifics, it's like
> finding a needle in a haystack. As you can see in my reproducer[1] I
> just reduce register pressure via enabling sink optimization and amiss
> RENAME is emitted. With disabled sink optimization everything works
> fine.
>
> It would be nice to enrich our suite with such tests for other Tarantool
> libs, but even this single case is way too complex to find: we need to
> emit destructive x86 op with the different "source" and "destination"
> registers allocated between two side exits related to the same snapshot
> -- I'm not sure I didn't missed something.
>
> It looks like we can add a trace abort (or even assert) for a RENAME
> emitted between two side exits with the same snapno. As a result broken
> traces are not compiled. Thoughts? I need some time to think about it.
>
> Long story short: the stars were aligned and the failure occurred. For
> now I have no idea how to catch such traces without patching assembling
> machinery.
>
> >
> > Regards,
> > Sergos
> >
> >
> > On 19 Jun 23:40, Igor Munkin wrote:
> > > JIT compiler can generate invalid trace for <totable> function breaking
> > > its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> > > produces right results, disabling JIT for this function (and all its
> > > subfunctions) stops execution failures.
> > >
> > > Relates to LuaJIT/LuaJIT#584
> > > Fixes #4252
> > >
> > > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > > ---
> > > test/box-tap/key_def.test.lua | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> > > index d7dbf5b88..ce7a3cb63 100755
> > > --- a/test/box-tap/key_def.test.lua
> > > +++ b/test/box-tap/key_def.test.lua
> > > @@ -4,6 +4,14 @@ local tap = require('tap')
> > > local ffi = require('ffi')
> > > local json = require('json')
> > > local fun = require('fun')
> > > +
> > > +-- Fix for gh-4252: to prevent invalid trace assembling (see
> > > +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> > > +-- method (these functions are different GCfunc objects) and all
> > > +-- their subfunctions.
> > > +jit.off(fun.totable, true)
> > > +jit.off(fun.iter({}).totable, true)
> > > +
> > > local key_def_lib = require('key_def')
> > >
> > > local usage_error = 'Bad params, use: key_def.new({' ..
> > > --
> > > 2.25.0
> > >
>
> [1]: https://gist.github.com/igormunkin/18cb6afe5a495bce31f772d453f3117e#file-4252-lua
>
> --
> Best regards,
> IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-25 9:43 ` Sergey Ostanevich
@ 2020-06-26 11:11 ` Igor Munkin
2020-06-26 13:11 ` Igor Munkin
0 siblings, 1 reply; 22+ messages in thread
From: Igor Munkin @ 2020-06-26 11:11 UTC (permalink / raw)
To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy
Sergos,
On 25.06.20, Sergey Ostanevich wrote:
> Hi!
>
> Thanks for explanation!
>
> My question was about hypothetic solution - as you mentioned for a
> trace with wrong RENAME - how long will it take to implement such a
> workaround?
I'll need some time to precisely evaluate it, but for now it looks less
complex than implementing any of the approaches mentioned in the
original issue[1].
>
> It whould affect only traces with such problem, and next approach can
> be to retry the trace with less optimizations, not just abort it?
Yes, only for the invalid traces *to be assembled*: if RENAME should be
emmited for a destructive x86 op, it would check whether there are
guarded IRs with the same snapno prior to the assembling one. Other
RENAMEs occurrences seems to be OK. After the several failures this spot
will be blacklisted and JIT will leave this route uncompiled. It's worth
to check that such drastic aborts doesn't lead to a "blacklisting
cancer".
At the same time implementing "retry the trace with less optimizations"
is way more complex than fix invalid RENAMEs emitting and ergo has no
benefits. As you said, we need a workaround until the problem is not
fixed and I guess we should aim to the least complex one here.
I'll drop all my thoughts to an issue in our GH queue (since #4252 is
closed via this patch) and will send a link here.
>
> Regards,
> Sergos
>
> On 21 Jun 23:24, Igor Munkin wrote:
> > Sergos,
> >
> > On 21.06.20, Sergey Ostanevich wrote:
> > > Hi!
> > >
> > > Thanks for the patch!
> > >
> > > AFAIU you just fix one of our test. Is it make sence to fix the library
> > > code in fun.lua to bring the fix for all cutomers?
> >
> > Yes, I "fixed" the test (i.e. made it non-flaky) that reproduces the
> > mentioned issue.
> >
> > We have already discussed it offline with Sasha Tu. Of course it makes
> > sense to fix the "source" of the bug, but what exactly we need to fix?
> >
> > Assume there is exactly the chunk in client's code making JIT to emit an
> > invalid RENAME. So there is no option to fix it in scope of the platform
> > except turning JIT off by default. The same issue is with Lua Fun: it's
> > just a "third party" Lua code leading to invalid traces assembling.
> >
> > Furthermore, we have lots of code in the platform using Lua Fun methods
> > with not a single JIT-related problem (at least I don't know any). Lua
> > Fun is a high-performance functional programming library for Lua
> > designed with LuaJIT's trace compiler in mind (I copied this from GitHub
> > repo), ergo disabling JIT for the whole library will nerf performance
> > for the well-compiled Lua code.
> >
> > As a result I see two solutions (besides fixing the root cause):
> > * Turn JIT engine off for the whole platform (or even disable JIT
> > machinery at all). It definitely will lead to dramatic performance
> > drop and might also lead to further dramatic source code changes,
> > since some Tarantool parts are tuned a lot regarding JIT specifics.
> > * Provide a recipe for ones facing the similar problem, so they can use
> > Lua Fun until the fix is ready. I guess the way the patch is done can
> > be shared with all customers.
> >
> > >
> > > Also, as soon as we know the #584 essence - how hard will it be to test
> > > all Tarantool libs for this bug?
> >
> > Since JIT tries to optimize and narrow Lua dynamic specifics, it's like
> > finding a needle in a haystack. As you can see in my reproducer[1] I
> > just reduce register pressure via enabling sink optimization and amiss
> > RENAME is emitted. With disabled sink optimization everything works
> > fine.
> >
> > It would be nice to enrich our suite with such tests for other Tarantool
> > libs, but even this single case is way too complex to find: we need to
> > emit destructive x86 op with the different "source" and "destination"
> > registers allocated between two side exits related to the same snapshot
> > -- I'm not sure I didn't missed something.
> >
> > It looks like we can add a trace abort (or even assert) for a RENAME
> > emitted between two side exits with the same snapno. As a result broken
> > traces are not compiled. Thoughts? I need some time to think about it.
> >
> > Long story short: the stars were aligned and the failure occurred. For
> > now I have no idea how to catch such traces without patching assembling
> > machinery.
> >
> > >
> > > Regards,
> > > Sergos
> > >
> > >
> > > On 19 Jun 23:40, Igor Munkin wrote:
> > > > JIT compiler can generate invalid trace for <totable> function breaking
> > > > its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and
> > > > produces right results, disabling JIT for this function (and all its
> > > > subfunctions) stops execution failures.
> > > >
> > > > Relates to LuaJIT/LuaJIT#584
> > > > Fixes #4252
> > > >
> > > > Signed-off-by: Igor Munkin <imun@tarantool.org>
> > > > ---
> > > > test/box-tap/key_def.test.lua | 8 ++++++++
> > > > 1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua
> > > > index d7dbf5b88..ce7a3cb63 100755
> > > > --- a/test/box-tap/key_def.test.lua
> > > > +++ b/test/box-tap/key_def.test.lua
> > > > @@ -4,6 +4,14 @@ local tap = require('tap')
> > > > local ffi = require('ffi')
> > > > local json = require('json')
> > > > local fun = require('fun')
> > > > +
> > > > +-- Fix for gh-4252: to prevent invalid trace assembling (see
> > > > +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and
> > > > +-- method (these functions are different GCfunc objects) and all
> > > > +-- their subfunctions.
> > > > +jit.off(fun.totable, true)
> > > > +jit.off(fun.iter({}).totable, true)
> > > > +
> > > > local key_def_lib = require('key_def')
> > > >
> > > > local usage_error = 'Bad params, use: key_def.new({' ..
> > > > --
> > > > 2.25.0
> > > >
> >
> > [1]: https://gist.github.com/igormunkin/18cb6afe5a495bce31f772d453f3117e#file-4252-lua
> >
> > --
> > Best regards,
> > IM
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function
2020-06-26 11:11 ` Igor Munkin
@ 2020-06-26 13:11 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-26 13:11 UTC (permalink / raw)
To: Sergey Ostanevich; +Cc: tarantool-patches, Vladislav Shpilevoy
Sergos,
Here you are: https://github.com/tarantool/tarantool/issues/5118.
Let's proceed our discussion there.
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool
2020-06-23 21:59 ` Igor Munkin
@ 2020-06-27 13:22 ` Igor Munkin
0 siblings, 0 replies; 22+ messages in thread
From: Igor Munkin @ 2020-06-27 13:22 UTC (permalink / raw)
To: Kirill Yukhin; +Cc: tarantool-patches, Vladislav Shpilevoy
I see Kirill has merged the patchset to all stable branches:
* 1.10[1] (with some minor changes)
* 2.3[2]
* 2.4[3]
* bleeding[4]
I've dropped the remaining remote branch.
On 24.06.20, Igor Munkin wrote:
> Kirill,
>
> Please proceed with the series.
>
<snipped>
>
> --
> Best regards,
> IM
[1]: https://github.com/tarantool/tarantool/commit/3a9dcbb
[2]: https://github.com/tarantool/tarantool/commit/e13698f
[3]: https://github.com/tarantool/tarantool/commit/33c7fff
[4]: https://github.com/tarantool/tarantool/commit/e88c0d2
--
Best regards,
IM
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2020-06-27 13:31 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-19 20:50 [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function Igor Munkin
2020-06-21 10:26 ` Sergey Ostanevich
2020-06-21 20:24 ` Igor Munkin
2020-06-25 9:43 ` Sergey Ostanevich
2020-06-26 11:11 ` Igor Munkin
2020-06-26 13:11 ` Igor Munkin
2020-06-23 18:04 ` Igor Munkin
2020-06-23 18:45 ` Alexander V. Tikhonov
2020-06-23 21:58 ` Igor Munkin
2020-06-19 20:40 ` [Tarantool-patches] [PATCH 2/2] box: reduce box_process_lua Lua GC memory usage Igor Munkin
2020-06-20 17:42 ` Vladislav Shpilevoy
2020-06-20 21:24 ` Igor Munkin
2020-06-21 10:27 ` Sergey Ostanevich
2020-06-21 10:40 ` Igor Munkin
2020-06-21 15:35 ` Vladislav Shpilevoy
2020-06-21 19:09 ` Igor Munkin
2020-06-22 22:54 ` [Tarantool-patches] [PATCH 0/2] Reduce Lua GC pressure in Tarantool Vladislav Shpilevoy
2020-06-23 21:06 ` Vladislav Shpilevoy
2020-06-23 21:58 ` Igor Munkin
2020-06-23 21:59 ` Igor Munkin
2020-06-27 13:22 ` Igor Munkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox