From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp29.i.mail.ru (smtp29.i.mail.ru [94.100.177.89]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id AF7C241C5DA for ; Thu, 25 Jun 2020 12:43:52 +0300 (MSK) Date: Thu, 25 Jun 2020 12:43:51 +0300 From: Sergey Ostanevich Message-ID: <20200625094351.GC16381@tarantool.org> References: <20200621102626.GA16381@tarantool.org> <20200621202459.GE3503@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20200621202459.GE3503@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH 1/2] test: disable JIT for Lua Fun totable function List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org, Vladislav Shpilevoy Hi! Thanks for explanation! My question was about hypothetic solution - as you mentioned for a trace with wrong RENAME - how long will it take to implement such a workaround? It whould affect only traces with such problem, and next approach can be to retry the trace with less optimizations, not just abort it? Regards, Sergos On 21 Jun 23:24, Igor Munkin wrote: > Sergos, > > On 21.06.20, Sergey Ostanevich wrote: > > Hi! > > > > Thanks for the patch! > > > > AFAIU you just fix one of our test. Is it make sence to fix the library > > code in fun.lua to bring the fix for all cutomers? > > Yes, I "fixed" the test (i.e. made it non-flaky) that reproduces the > mentioned issue. > > We have already discussed it offline with Sasha Tu. Of course it makes > sense to fix the "source" of the bug, but what exactly we need to fix? > > Assume there is exactly the chunk in client's code making JIT to emit an > invalid RENAME. So there is no option to fix it in scope of the platform > except turning JIT off by default. The same issue is with Lua Fun: it's > just a "third party" Lua code leading to invalid traces assembling. > > Furthermore, we have lots of code in the platform using Lua Fun methods > with not a single JIT-related problem (at least I don't know any). Lua > Fun is a high-performance functional programming library for Lua > designed with LuaJIT's trace compiler in mind (I copied this from GitHub > repo), ergo disabling JIT for the whole library will nerf performance > for the well-compiled Lua code. > > As a result I see two solutions (besides fixing the root cause): > * Turn JIT engine off for the whole platform (or even disable JIT > machinery at all). It definitely will lead to dramatic performance > drop and might also lead to further dramatic source code changes, > since some Tarantool parts are tuned a lot regarding JIT specifics. > * Provide a recipe for ones facing the similar problem, so they can use > Lua Fun until the fix is ready. I guess the way the patch is done can > be shared with all customers. > > > > > Also, as soon as we know the #584 essence - how hard will it be to test > > all Tarantool libs for this bug? > > Since JIT tries to optimize and narrow Lua dynamic specifics, it's like > finding a needle in a haystack. As you can see in my reproducer[1] I > just reduce register pressure via enabling sink optimization and amiss > RENAME is emitted. With disabled sink optimization everything works > fine. > > It would be nice to enrich our suite with such tests for other Tarantool > libs, but even this single case is way too complex to find: we need to > emit destructive x86 op with the different "source" and "destination" > registers allocated between two side exits related to the same snapshot > -- I'm not sure I didn't missed something. > > It looks like we can add a trace abort (or even assert) for a RENAME > emitted between two side exits with the same snapno. As a result broken > traces are not compiled. Thoughts? I need some time to think about it. > > Long story short: the stars were aligned and the failure occurred. For > now I have no idea how to catch such traces without patching assembling > machinery. > > > > > Regards, > > Sergos > > > > > > On 19 Jun 23:40, Igor Munkin wrote: > > > JIT compiler can generate invalid trace for function breaking > > > its semantics (see LuaJIT/LuaJIT#584). Since interpreter works fine and > > > produces right results, disabling JIT for this function (and all its > > > subfunctions) stops execution failures. > > > > > > Relates to LuaJIT/LuaJIT#584 > > > Fixes #4252 > > > > > > Signed-off-by: Igor Munkin > > > --- > > > test/box-tap/key_def.test.lua | 8 ++++++++ > > > 1 file changed, 8 insertions(+) > > > > > > diff --git a/test/box-tap/key_def.test.lua b/test/box-tap/key_def.test.lua > > > index d7dbf5b88..ce7a3cb63 100755 > > > --- a/test/box-tap/key_def.test.lua > > > +++ b/test/box-tap/key_def.test.lua > > > @@ -4,6 +4,14 @@ local tap = require('tap') > > > local ffi = require('ffi') > > > local json = require('json') > > > local fun = require('fun') > > > + > > > +-- Fix for gh-4252: to prevent invalid trace assembling (see > > > +-- LuaJIT/LuaJIT#584) disable JIT for fun.totable function and > > > +-- method (these functions are different GCfunc objects) and all > > > +-- their subfunctions. > > > +jit.off(fun.totable, true) > > > +jit.off(fun.iter({}).totable, true) > > > + > > > local key_def_lib = require('key_def') > > > > > > local usage_error = 'Bad params, use: key_def.new({' .. > > > -- > > > 2.25.0 > > > > > [1]: https://gist.github.com/igormunkin/18cb6afe5a495bce31f772d453f3117e#file-4252-lua > > -- > Best regards, > IM