[Tarantool-patches] [PATCH luajit] ARM64: Fix write barrier in BC_USETS.

Igor Munkin imun at tarantool.org
Sun Aug 1 13:39:55 MSK 2021


Sergey,

Thanks for the patch! Please consider the comments below.

On 07.07.21, Sergey Kaplun wrote:
> From: Mike Pall <mike>
> 
> Contributed by Javier Guerra Giraldez.
> 
> (cherry picked from commit c785131ca5a6d24adc519e5e0bf1b69b671d912f)
> 
> Closed upvalues are never gray. So after closed upvalue is marked, it is
> marked as black. Black objects can't refer white objects, so for storing
> a white value in closed upvalue, we need to move the barrier forward and
> color our value to gray by using `lj_gc_barrieruv()`. This function
> can't be called on closed upvalues with non-white values (at least there
> is no need to mark it again).

Minor: Considering the comments in parenthesis, "can't" looks more like
"shouldn't". Anyway, I looked to the sources, and see the assertion,
that only white and alive objects need to be marked, so I'm confused
with your remark.

> 
> USETS bytecode for arm64 architecture has the incorrect instruction to
> check that upvalue is closed:

AFAIU, the instruction is correct, but the nzcv value is not.

> | ccmp TMP0w, #0, #0, ne
> | beq <1 // branch out from barrier movement
> `TMP0w` contains `upvalue->closed` field. If it equals NULL (the first
> `#0`). The second zero is the value of NZCV condition flags set if the
> condition (`ne`) is FALSE [1][2]. If the set value is not white, then
> flags are set to zero and branch is not taken (no Zero flag). If it
> happens at propagate or atomic GC State and the `lj_gc_barrieruv()`
> function is called then the gray value to set is marked as white. That
> leads to the assertion failure in the `gc_mark()` function.

OK, I understand almost nothing from the part above. Here are the
comments:
1. "If it equals NULL (the first `#0`)", then what?
2. Just to check we are on the same page: the second "immediate"
mentioned in docs[1] is NZCV? Then beq <1 branch is not taken since
(TMP0w != 0) is FALSE (i.e. upvalue is not closed), but zero flag in
NZCV value is not set? So how does the color of the value to be stored
relate to this control flow?
3. AFAICS, if the branch is not taken and <lj_gc_barrieruv> is called at
propagate or atomic phase, the value is colored either to gray or black.

> 
> This patch changes yielded NZCV condition flag to 4 (Zero flag is up) to
> take the correct branch after `ccmp` instruction.
> 
> Sergey Kaplun:
> * added the description and the test for the problem
> 
> [1]: https://developer.arm.com/documentation/dui0801/g/pge1427897656225
> [2]: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/condition-codes-1-condition-flags-and-codes

Minor: Why #5629 is not mentioned?

> ---
> 
> LuaJIT branch: https://github.com/tarantool/luajit/tree/skaplun/lj-426-incorrect-check-closed-uv
> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/lj-426-incorrect-check-closed-uv
> 
> Assertion failure [1] is not related to the patch (I've reproduced it on
> master branch). Looks like another one GC64 issue.

Is this failure described in #6227 fixed by this patch[1]?

> 
> How to reproduce:
> 1) Run the following command from the Tarantool repo on Odroid:
> | $ i=0; while [[ $? == 0 ]]; do i=$(($i+1)); echo $i; make LuaJIT-tests; done
> 2) Wait (need 4-15 iterations).
> 
> [1]: https://github.com/tarantool/tarantool/runs/3009273464#step:4:4013
> 
> Side note: Thanks to the Lord, that there is no #0 issue and it is not
> mentioned that way...

Heh, GitHub is not ready for ARM64 support, but Tarantool almost is!

> 
>  src/vm_arm64.dasc                             |  2 +-
>  ...6-arm64-incorrect-check-closed-uv.test.lua | 38 +++++++++++++++++++
>  2 files changed, 39 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua
> 

<snipped>

> diff --git a/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua
> new file mode 100644
> index 00000000..b757133f
> --- /dev/null
> +++ b/test/tarantool-tests/lj-426-arm64-incorrect-check-closed-uv.test.lua
> @@ -0,0 +1,38 @@
> +local tap = require('tap')
> +
> +local test = tap.test('lj-426-arm64-incorrect-check-closed-uv')
> +test:plan(1)
> +
> +-- Test file to demonstrate LuaJIT USETS bytecode incorrect
> +-- behaviour on arm64 in case when non-white object is set to
> +-- closed upvalue.
> +-- See also, https://github.com/LuaJIT/LuaJIT/issues/426.
> +
> +-- First, create a closed upvalue.
> +do

Minor: I'm not sure, we need a separate lexical block here. Could you
please clarify the reason in the comment?

> +  local uv -- luacheck: no unused
> +  -- The function's prototype is created with the following
> +  -- constants at chunk parsing. After adding this constant to
> +  -- the function's prototype it will be marked as gray during
> +  -- propogate phase.

Then what does it test, if the constant is marked as gray? Will this
string be white later?

> +  local function usets() uv = '' end
> +  _G.usets = usets
> +end
> +
> +-- Set GC state to GCpause.
> +collectgarbage()
> +-- Do GC step as often as possible.
> +collectgarbage('setstepmul', 100)

Minor: Don't get, why you need to make GC less aggressive for the test.
The test is run, until propagate phase is finished.

> +
> +-- We don't know on what exactly step our upvalue is marked as
> +-- black and USETS become dangerous, so just check it at each
> +-- step.
> +-- Don't need to do the full GC cycle step by step.
> +local old_steps_atomic = misc.getmetrics().gc_steps_atomic
> +while (misc.getmetrics().gc_steps_atomic == old_steps_atomic) do
> +  collectgarbage('step')
> +  usets() -- luacheck: no global
> +end
> +
> +test:ok(true)
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.31.0
> 

[1]: https://lists.tarantool.org/tarantool-patches/20210719073632.12008-1-skaplun@tarantool.org/T/#u

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list