Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Igor Munkin <imun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit] Fix string.char() recording with no arguments.
Date: Fri, 28 Jan 2022 11:26:16 +0300	[thread overview]
Message-ID: <YfOoqPjpR7Brwmib@root> (raw)
In-Reply-To: <YfMSu8jAfEdupEfC@tarantool.org>

Igor,
Thanks for the review!

On 28.01.22, Igor Munkin wrote:
> Sergey,
> 
> Thanks for the patch! Please consider my comments below.
> 
> On 20.08.21, Sergey Kaplun wrote:
> > From: Mike Pall <mike>
> > 
> > (cherry picked from commit dfa692b746c9de067857d5fc992a41730be3d99a)
> > 
> > `string.char()` call without arguments yields an empty string. When JIT
> > machinery records the aforementioned call it doesn't handle this case.
> > Each recording fast function expects 1 result by default.  Hence, when
> 
> Typo: double space prior to "Hence".

Fixed.

> 
> > return from this call is recorded the framelink slot is used as a
> 
> Strictly saying, this might be not the framelink slot. If you look onto
> the trace dump, you'll see there is an SLOAD emitted for 12th stack
> slot, but the framelink of the string.char (to be inlined on the trace)
> is 11th AFAICS. So the loaded slot is just a garbage left after tap
> initialization, or something else (though it definitely looks to be a
> framelink earlier). See the artefacts of my investigation below.
> 

<snipped>

> 
> As I've mentioned above, IR 0025 is SLOAD #12, and the 12th slot while
> running the compiled trace is 0x400269e8 which was both base and top
> slot for <string.char> call, but wasn't not its framelink. It wasn't so
> clear, since luajit-gdb.py dumps nothing if both base and top points to
> a single slot.
> 
> I propose to omit this framelink part and just mention that some garbage
> can be loaded from stack as a result of <string.char> recording in case
> when no arguments are given.

The new commit message is the following:

===================================================================
Fix string.char() recording with no arguments.

(cherry picked from commit dfa692b746c9de067857d5fc992a41730be3d99a)

`string.char()` call without arguments yields an empty string. JIT
recording machinery doesn’t handle this case. Each recording of a fast
function expects 1 result by default. Hence, when return from this call
is recorded some garbarge value is considered as a result to yield. It
is loaded into the corresponding slot as an IR with `IRT_NUM` type. It
leads to assertion failure in `rec_check_slots()`, when a next bytecode
is recorded, because type of TValue on the stack (`LJ_STR`) isn't the
same as IR (and TRef) type.

This patch handles the case without arguments by the loading of IR with
empty string reference into the corresponding slot. It reuses assumption
of one result by default, hence there is no case for `i == 1` in the
code.

Sergey Kaplun:
* added the description and the test for the problem

Resolves tarantool/tarantool#6371
===================================================================

> 
> > result. It is loaded into the corresponding slot as an IR with `IRT_NUM`
> > type. It leads to assertion failure in `rec_check_slots()`, when a next
> > bytecode is recorded, because type of TValue on the stack (`LJ_STR`)
> > isn't the same as IR (and TRef) type.
> 
> I see no assertion running the test in Debug, unfortunately. Could you
> please provide the way to hit it?

I build as usual:
| $ cmake . -DCMAKE_BUILD_TYPE=Debug -DLUA_USE_APICHECK=ON -DLUA_USE_ASSERT=ON && make -j

and run this chunk (like test but simplier).
| $ src/luajit -e '
| local results = {}
| jit.opt.start("hotloop=1")
| for _ = 1, 3 do
|   print(string.char())
| end
| '

With the following result:

| luajit: /home/burii/builds_workspace/luajit/gh-6371-string-char-no-arg/src/lj_record.c:137: rec_check_slots: Assertion `itype2irt(tv) == ((IRType)(((tr)>>24) &
|  IRT_TYPE))' failed.
| Aborted (core dumped)

But there are many ways to fail (for example on fold optimisation):

| luajit: /home/burii/builds_workspace/luajit/gh-6371-string-char-no-arg/src/lj_opt_fold.c:2318: fold_fwd_sload: Assertion `J->slot[(&J->fold.ins)->op1] != 0' failed.

> 
> > 
> > This patch handles the case without arguments by the loading of IR with
> > empty string reference into the corresponding slot.
> > 
> > Sergey Kaplun:
> > * added the description and the test for the problem
> > 
> > Resolves tarantool/tarantool#6371
> > ---
> > 
> > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-6371-string-char-no-arg
> > Issue: https://github.com/tarantool/tarantool/issues/6371
> > Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6371-string-char-no-arg
> > Side note: CI is totally red, but AFAICS it's unrelated with my patch.
> 
> Side note: Try to rebase on the bleeding master, please.

The updated tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-6371-string-char-no-arg-full-ci

> 
> > Side note: See also Changelog at the Tarantool branch.
> > 
> >  src/lj_ffrecord.c                             |  2 ++
> >  .../gh-6371-string-char-no-arg.test.lua       | 28 +++++++++++++++++++
> >  2 files changed, 30 insertions(+)
> >  create mode 100644 test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
> > 
> > diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> > index 8dfa80ed..be890a93 100644
> > --- a/src/lj_ffrecord.c
> > +++ b/src/lj_ffrecord.c
> > @@ -866,6 +866,8 @@ static void LJ_FASTCALL recff_string_char(jit_State *J, RecordFFData *rd)
> >      for (i = 0; J->base[i] != 0; i++)
> >        tr = emitir(IRT(IR_BUFPUT, IRT_PGC), tr, J->base[i]);
> >      J->base[0] = emitir(IRT(IR_BUFSTR, IRT_STR), tr, hdr);
> > +  } else if (i == 0) {
> > +    J->base[0] = lj_ir_kstr(J, &J2G(J)->strempty);
> >    }
> >    UNUSED(rd);
> >  }
> > diff --git a/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua b/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
> > new file mode 100644
> > index 00000000..6df93f07
> > --- /dev/null
> > +++ b/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
> > @@ -0,0 +1,28 @@
> > +local tap = require('tap')
> > +
> > +-- Test file to demonstrate assertion after `string.char()`
> > +-- recording.
> > +-- See also, https://github.com/tarantool/tarantool/issues/6371.
> > +
> > +local test = tap.test('gh-6371-string-char-no-arg')
> > +-- XXX: Number of loop iterations.
> > +-- 1, 2 -- instruction becomes hot
> > +-- 3 -- trace is recorded (considering loop recording specifics),
> > +-- but bytecodes are still executed via VM
> > +-- 4 -- trace is executed, need to check that emitted mcode is
> > +--      correct
> 
> Minor: Strictly saying, this is not quite true. The right description is
> the following:
> * 1 -- instruction becomes hot.
> * 2 -- recording of the loop body.
> * 3 -- required for trace finalization, but this iteration runs the
>   generated mcode. NB: the issue doesn't occur, since execution leaves
>   the trace on table resizing guard.
> * 4 -- reproduces the issue.
> 
> Hence, if you allocate the table with necessary size (3 in our case),
> then only 3 iterations are needed. Please adjust the comment above and
> the code below.

Fixed, thanks!

See the iterative patch below.
Branch is force-pushed.
===================================================================
diff --git a/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua b/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
index 6df93f07..61d02101 100644
--- a/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
+++ b/test/tarantool-tests/gh-6371-string-char-no-arg.test.lua
@@ -6,11 +6,12 @@ local tap = require('tap')
 
 local test = tap.test('gh-6371-string-char-no-arg')
 -- XXX: Number of loop iterations.
--- 1, 2 -- instruction becomes hot
--- 3 -- trace is recorded (considering loop recording specifics),
--- but bytecodes are still executed via VM
--- 4 -- trace is executed, need to check that emitted mcode is
---      correct
+-- * 1 -- instruction becomes hot.
+-- * 2 -- recording of the loop body.
+-- * 3 -- required for trace finalization, but this iteration runs the
+--   generated mcode. NB: the issue doesn't occur, since execution leaves
+--   the trace on table resizing guard.
+-- * 4 -- reproduces the issue.
 local NTEST = 4
 test:plan(NTEST)
 
===================================================================

> 
> > +local NTEST = 4
> > +test:plan(NTEST)
> > +
> > +-- Storage for the results to avoid trace aborting by `test:ok()`.
> > +local results = {}
> > +jit.opt.start('hotloop=1')
> > +for _ = 1, NTEST do
> > +  table.insert(results, string.char())
> > +end
> > +
> > +for i = 1, NTEST do
> > +  test:ok(results[i] == '', 'correct recording of string.char() without args')
> > +end
> > +
> > +os.exit(test:check() and 0 or 1)
> > -- 
> > 2.31.0
> > 
> 
> -- 
> Best regards,
> IM

-- 
Best regards,
Sergey Kaplun

  parent reply	other threads:[~2022-01-28  8:28 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-20 15:48 Sergey Kaplun via Tarantool-patches
2021-08-31 11:39 ` Sergey Ostanevich via Tarantool-patches
2021-09-01  7:10   ` Sergey Kaplun via Tarantool-patches
2022-01-10 11:48     ` sergos via Tarantool-patches
2022-01-27 21:46 ` Igor Munkin via Tarantool-patches
2022-01-27 22:02   ` Igor Munkin via Tarantool-patches
2022-01-28  8:26   ` Sergey Kaplun via Tarantool-patches [this message]
2022-01-28 12:49     ` Sergey Kaplun via Tarantool-patches
2022-01-28 13:19     ` Igor Munkin via Tarantool-patches
2022-01-28 15:12       ` Sergey Kaplun via Tarantool-patches
2022-02-11 19:09 ` Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YfOoqPjpR7Brwmib@root \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=imun@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit] Fix string.char() recording with no arguments.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox