Hi! Thanks for the patch, LGTM. I have no other explanation than that ‘missed flag as a result of compare’. But you did it better - took time for me to get to the point. BTW, will make easier to grasp with a simple C equivalent, like - if (iswite(str) || closed != 0) + if (iswite(str) || closed == 0) Sergos. > On 9 Aug 2021, at 19:01, Sergey Kaplun wrote: > > Igor, thanks for the feedback! > > On 08.08.21, Igor Munkin wrote: >> Sergey, >> >> Thanks for the fixes! See some new comments below. >> >> On 01.08.21, Sergey Kaplun wrote: >>> Igor, >>> >>> Thanks for the review! >>> Update commit message on the branch, considering you comments. >> >> Got it, but I still have some more comments regarding it. >> >>> >>> See answers to you questions below. >>> >> >> >> >>> >>>> >>>>> | ccmp TMP0w, #0, #0, ne >>>>> | beq <1 // branch out from barrier movement >>>>> `TMP0w` contains `upvalue->closed` field. If it equals NULL (the first >>>>> `#0`). The second zero is the value of NZCV condition flags set if the >>>>> condition (`ne`) is FALSE [1][2]. If the set value is not white, then >>>>> flags are set to zero and branch is not taken (no Zero flag). If it >>>>> happens at propagate or atomic GC State and the `lj_gc_barrieruv()` >>>>> function is called then the gray value to set is marked as white. That >>>>> leads to the assertion failure in the `gc_mark()` function. >>>> >>>> OK, I understand almost nothing from the part above. Here are the >>>> comments: >>>> 1. "If it equals NULL (the first `#0`)", then what? >>> >>> My bad: >>> I mean here: >>> If it equals NULL (the first `#0`), then the upvalue is open. >> >> So why do you use NULL instead of 0? The field is uint8_t type, so 0 is >> much clearer. > > Changed. > >> >>> Added this. >>> >>>> 2. Just to check we are on the same page: the second "immediate" >>>> mentioned in docs[1] is NZCV? >>> >>> Yes. >>> >>>> Then beq <1 branch is not taken since >>>> (TMP0w != 0) is FALSE (i.e. upvalue is not closed), but zero flag in >>>> NZCV value is not set? >>> >>> Yes. >>> >>>> So how does the color of the value to be stored >>>> relate to this control flow? >>> >>> This NZCV value isn't set if the upvalue is white, because condition is >>> of the following instruction >>> >>> | tst TMP1w, #LJ_GC_WHITES // iswhite(str) >>> >>> is TRUE. So the <1 branch is taken, because the upvalue is closed. >> >> Well... I can't imagine how I needed to find this... This relates mostly >> to ARM docs you've mentioned, but it would be nice to describe this >> behaviour in the commit message (since you're writing a verbose one). >> >>> >>>> 3. AFAICS, if the branch is not taken and is called at >>>> propagate or atomic phase, the value is colored either to gray or black. >>> >>> Yes, that leads to the assertion failure mentioned in the ticket in the >>> LuaJIT upstream. >>> >>>> >>>>> >>>>> This patch changes yielded NZCV condition flag to 4 (Zero flag is up) to >>>>> take the correct branch after `ccmp` instruction. >>>>> >>>>> Sergey Kaplun: >>>>> * added the description and the test for the problem >>>>> >>>>> [1]: https://developer.arm.com/documentation/dui0801/g/pge1427897656225 >>>>> [2]: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/condition-codes-1-condition-flags-and-codes >>>> >>>> Minor: Why #5629 is not mentioned? >>> >>> Added. >> >> Considering everything above, I propose the following wording: >> | Contributed by Javier Guerra Giraldez. >> | >> | (cherry picked from commit c785131ca5a6d24adc519e5e0bf1b69b671d912f) >> | >> | >> | Closed upvalues are never gray. Hence when closed upvalue is marked, it >> | is marked as black. Black objects can't refer white objects, so for >> | storing a white value in a closed upvalue, we need to move the barrier >> | forward and color our value to gray by using `lj_gc_barrieruv()`. This >> | function can't be called on closed upvalues with non-white values since >> | there is no need to mark it again. >> | >> | USETS bytecode for arm64 architecture has the incorrect NZCV condition >> | flag value in the instruction that checks the upvalue is closed: >> | | tst TMP1w, #LJ_GC_WHITES >> | | ccmp TMP0w, #0, #0, ne >> | | beq <1 // branch out from barrier movement >> | `TMP0w` contains `upvalue->closed` field, so the upvalue is open if this >> | field equals to zero (the first one in `ccmp`). The second zero is the >> | value of NZCV condition flags[1] yielded if the specified condition >> | (`ne`) is met for the current values of the condition flags[2]. Hence, >> | if the value to be stored is not white (`TMP1w` holds its color), then >> | the condition is FALSE and all flags bits are set to zero so branch is >> | not taken (Zero flag is not set). If this happens at propagate or atomic >> | GC phase, the `lj_gc_barrieruv()` function is called and the gray value >> | to be set is marked like if it is white. That leads to the assertion >> | failure in the `gc_mark()` function. >> | >> | This patch changes NZCV condition flag to 4 (Zero flag is set) to take >> | the correct branch after `ccmp` instruction. >> | >> | Sergey Kaplun: >> | * added the description and the test for the problem >> | >> | [1]: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/condition-codes-1-condition-flags-and-codes >> | [2]: https://developer.arm.com/documentation/dui0801/g/pge1427897656225 >> | >> | Part of tarantool/tarantool#5629 > > Updated, as you've suggested. > >> >>> >> >> >> >>>> >>>>> >>>>> src/vm_arm64.dasc | 2 +- >>>>> ...6-arm64-incorrect-check-closed-uv.test.lua | 38 +++++++++++++++++++ >>>>> 2 files changed, 39 insertions(+), 1 deletion(-) >>>>> create mode 100644 test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua >>>>> >>>> >>>> >>>> >>>>> diff --git a/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua >>>>> new file mode 100644 >>>>> index 00000000..b757133f >>>>> --- /dev/null >>>>> +++ b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua >>>>> @@ -0,0 +1,38 @@ >>>>> +local tap = require('tap') >>>>> + >>>>> +local test = tap.test('lj-426-arm64-incorrect-check-closed-uv') >>>>> +test:plan(1) >>>>> + >>>>> +-- Test file to demonstrate LuaJIT USETS bytecode incorrect >>>>> +-- behaviour on arm64 in case when non-white object is set to >>>>> +-- closed upvalue. >>>>> +-- See also, https://github.com/LuaJIT/LuaJIT/issues/426. >>>>> + >>>>> +-- First, create a closed upvalue. >>>>> +do >>>> >>>> Minor: I'm not sure, we need a separate lexical block here. Could you >>>> please clarify the reason in the comment? >>> >>> We need a closed upvalue. I suppose that it is the simpiest way to >>> create one. Please, provide a simplier example if you know one. >> >> My bad. Yes, the easiest way to emit UCLO bytecode is using a separate >> lexical block. >> >>> >>>> >>>>> + local uv -- luacheck: no unused >>>>> + -- The function's prototype is created with the following >>>>> + -- constants at chunk parsing. After adding this constant to >>>>> + -- the function's prototype it will be marked as gray during >>>>> + -- propogate phase. >>>> >>>> Then what does it test, if the constant is marked as gray? Will this >>>> string be white later? >>> >>> It shouldn't be white, it should be gray, otherwise the aforementioned >>> condition is TRUE (remember, we need FALSE). >> >> Again, PEBKAC, thanks for the explanation. >> >>> >>>> >>>>> + local function usets() uv = '' end >>>>> + _G.usets = usets >>>>> +end >>>>> + >>>>> +-- Set GC state to GCpause. >>>>> +collectgarbage() >>>>> +-- Do GC step as often as possible. >>>>> +collectgarbage('setstepmul', 100) >>>> >>>> Minor: Don't get, why you need to make GC less aggressive for the test. >>>> The test is run, until propagate phase is finished. >>> >>> More likely, that it is run, until the upvalue is marked as black >>> during traversing (with the bug). I can remove this line if you insist. >> >> Drop it, please. I can't even *feel* its effect ;) > > Done. > >> >>> >>>> >>>>> + >>>>> +-- We don't know on what exactly step our upvalue is marked as >>>>> +-- black and USETS become dangerous, so just check it at each >>>>> +-- step. >>>>> +-- Don't need to do the full GC cycle step by step. >> >> Minor: It would be nice to drop a few words about string and upvalue >> colours during this loop, but it's up to you. > > Added. > > The iterative patch is the following: > > =================================================================== > diff --git a/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua > index b757133f..4cdf1211 100644 > --- a/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua > +++ b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua > @@ -21,9 +21,10 @@ end > > -- Set GC state to GCpause. > collectgarbage() > --- Do GC step as often as possible. > -collectgarbage('setstepmul', 100) > > +-- We want to wait for the situation, when upvalue is black, > +-- the string is gray. Both conditions are satisfied, when the > +-- corresponding `usets()` function is marked, for example. > -- We don't know on what exactly step our upvalue is marked as > -- black and USETS become dangerous, so just check it at each > -- step. > =================================================================== > >> >>>>> +local old_steps_atomic = misc.getmetrics().gc_steps_atomic >>>>> +while (misc.getmetrics().gc_steps_atomic == old_steps_atomic) do >>>>> + collectgarbage('step') >>>>> + usets() -- luacheck: no global >>>>> +end >>>>> + >>>>> +test:ok(true) >>>>> +os.exit(test:check() and 0 or 1) >>>>> -- >>>>> 2.31.0 >>>>> >>>> >>>> [1]: https://lists.tarantool.org/tarantool-patches/20210719073632.12008-1-skaplun@tarantool.org/T/#u >>>> >>>> -- >>>> Best regards, >>>> IM >>> >>> -- >>> Best regards, >>> Sergey Kaplun >> >> -- >> Best regards, >> IM > > -- > Best regards, > Sergey Kaplun