Tarantool development patches archive
 help / color / mirror / Atom feed
From: Igor Munkin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: sergos <sergos@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().
Date: Fri, 11 Nov 2022 13:20:32 +0300	[thread overview]
Message-ID: <Y24h8K1qDeBz8mkS@tarantool.org> (raw)
In-Reply-To: <813CA8F4-3B19-4370-A859-12B956ACCF3C@tarantool.org>

Sergos,

On 06.07.22, sergos wrote:
> Hi!
> 
> Thanks for the patch!
> 
> I won’t veto on the patch, since it’s from upstream. Still I have got
> a number of questions on it’s crutch nature. Perhaps we need to follow
> up with some test cases to luajit.

Thanks for your precise review. I would like to shed some light onto the
patch regarding LuaJIT semantics. Even if you're familiar with all
specifics, it might be useful for other guys *still* reading ML and some
parts can be used either in commit message or in test comments.

> 
> Some description fixes are needed also.
> 
> Sergos
> 
> 
> > On 4 Jul 2022, at 12:33, Sergey Kaplun <skaplun@tarantool.org> wrote:
> > 
> > From: Mike Pall <mike>
> > 
> > Thanks to Peter Cawley.
> > 
> > (cherry picked from commit fb5e522fbc0750c838ef6a926b11c5d870826183)
> > 
> > To reproduce this issue, we need:
> > 1) a register which contains the constant zero value
> 
> > 2) a floating point comparison operation
> 
> The integer one will set the same flags (CF for the case of JNB) and XOR will
> clear it anyways. 

Strictly saying, yes, I see no reason for "float" requirement here. I
believe this can be changed just to "comparison operation occurs", but
Sergey can correct me if I'm wrong.

> 
> > 3) the comparison operation to perform a fused load, which in
> >   turn needs to allocate a register, and for there to be no
> >   free registers at that moment, and for the register chosen
> >   for sacrifice to be holding the constant zero.
> > 
> Unfortunately, it’s not clear what this register (I suppose it’s r14d) is used for.
> Is it an argument, needed after the fall-through in the same trace?
> Why not it sank down below the branch? 
> IIRC it is a dedicated register used for dispatch, so why is it used for sacrificing?

Like many things in LuaJIT register allocation is "state-of-the-art" (or
as I called this "ad-hoc") entity. Since it's implemented as a linear
scan RA working in a reverse direction of the trace being recorded,
register is "allocated" for IR slot being used as a source and "freed"
at the moment it becomes a destination.

To describe the process Sergey fit in a three-bullet list above, we have
the following.
0) IR for either "internal" (e.g. type check, hmask check) or "external"
   (e.g. branch or loop condition) guard is begin emitted to mcode.
1) JCC to side exit is emitted to the trace mcode at the beginning.
2) Condition (i.e. comparison) is going to be emitted.
3) Fuse optimization takes its place, that ought to allocate a register
   for the load base.
4) There is no free registers at this point.
5) The one storing the constant zero is chosen to be sacrificed and
   reallocated (consider allocation cost in ra_alloc for constant
   materialization).
6) Before (or in the sense of trace execution, after) register is
   being used in the aforementioned comparison, register (r14 in our
   case) is reset by XOR emitted right after (before) jump instruction.
7) The comparison with fused load within is emitted.

As a result flags set by comparison are reset by XOR emitted in between
of condition and jump instructions.

> 
> > This leads to assembly code like the following:
> > | ucomisd xmm7, [r14+0x18]
> > | xor r14d, r14d
> > | jnb 0x12a0e001c ->3
> > 
> > That xor is a big problem, as it modifies flags between the
> > ucomisd and the jnb, thereby causing the jnb to do the wrong
> > thing.
> > 

<snipped>

> > 
> > diff --git a/src/lj_emit_x86.h b/src/lj_emit_x86.h
> > index 5207f9da..6b58306b 100644
> > --- a/src/lj_emit_x86.h
> > +++ b/src/lj_emit_x86.h
> > @@ -274,10 +274,12 @@ static void emit_movmroi(ASMState *as, Reg base, int32_t ofs, int32_t i)
> > /* mov r, i / xor r, r */
> > static void emit_loadi(ASMState *as, Reg r, int32_t i)
> > {
> > -  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP. */
> > +  /* XOR r,r is shorter, but modifies the flags. This is bad for HIOP/jcc. */
> >   if (i == 0 && !(LJ_32 && (IR(as->curins)->o == IR_HIOP ||
> > 			    (as->curins+1 < as->T->nins &&
> > -			     IR(as->curins+1)->o == IR_HIOP)))) {
> > +			     IR(as->curins+1)->o == IR_HIOP))) &&
> > +		!((*as->mcp == 0x0f && (as->mcp[1] & 0xf0) == XI_JCCn) ||
> > +		  (*as->mcp & 0xf0) == XI_JCCs)) {
> 
> I wonder if there can be a case with two instructions emitted between the comparison
> and the jump. In such a case the xor still can sneak in? 
> Another idea: there could be different instructions that relies on the flags, such as
> x86Arith:XOg_ADC or x86Op:XO_CMOV, so the if condition above could be bigger?

Nice question indeed! I believe, we need to re-check this case and send
the patch to upstream if the issue occurs.

> 
> >     emit_rr(as, XO_ARITH(XOg_XOR), r, r);
> >   } else {
> >     MCode *p = as->mcp;

<snipped>

> > -- 
> > 2.34.1
> > 
> 

-- 
Best regards,
IM

  reply	other threads:[~2022-11-11 10:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-04  9:33 Sergey Kaplun via Tarantool-patches
2022-07-05 15:10 ` Igor Munkin via Tarantool-patches
2022-07-06 10:30 ` sergos via Tarantool-patches
2022-11-11 10:20   ` Igor Munkin via Tarantool-patches [this message]
2022-11-22  4:36     ` Sergey Kaplun via Tarantool-patches
2022-11-22  5:45       ` Igor Munkin via Tarantool-patches
2022-08-31  8:48 ` Sergey Kaplun via Tarantool-patches
2022-08-31 16:04   ` sergos via Tarantool-patches
2022-09-01 10:32     ` Sergey Kaplun via Tarantool-patches
2022-11-23  7:50 ` Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y24h8K1qDeBz8mkS@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=imun@tarantool.org \
    --cc=sergos@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] x86/x64: Check for jcc when using xor r, r in emit_loadi().' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox