Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Bronnikov <sergeyb@tarantool.org>
Cc: Sergey Bronnikov <estetus@gmail.com>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit v1] Fix BC_UCLO insertion for returns.
Date: Sun, 9 Jul 2023 16:15:32 +0300	[thread overview]
Message-ID: <ZKqy9E6BeKTgVgPS@root> (raw)
In-Reply-To: <895991f2-93ca-0901-031b-2b39e0612a39@tarantool.org>

Hi, Sergey!
Thanks for the fixes!
LGTM, except a few nits and rewordings below.

> Fix BC_UCLO insertion for returns.
>
> Contributed by XmiliaH.
>
> (cherry-picked from commit 93a65d3cc263aef2d2feb3d7ff2206aca3bee17e)
>
> Patch fixes a problem when LuaJIT generates a wrong bytecode with a
> missed BC_UCLO instruction. When some of BC_RET bytecode instructions are

Nit: commit line length is more than 72 symbols (see rationale here [1]).

> not fixup-ed, due to an early return, if UCLO is obtained before, those
> leads to VM inconsistency after return from the function.
>
> Patch makes the following changes in bytecode (thats it, emits extra

Typo: s/thats/that's/
Nit: I suggest to drop introductory phrase "thats it".


> BC_UCLO instruction that closes upvalues):
>
> @@ -11,11 +11,12 @@
>  0006 => LOOP     1 => 0012
>  0007    ISF          0
>  0008    JMP      1 => 0010
> -0009    RET1     0   2
> +0009    UCLO     0 => 0014
>  0010 => FNEW     0   0      ; uclo.lua:56
>  0011    JMP      1 => 0006
>  0012 => UCLO     0 => 0001
>  0013 => RET0     0   1
> +0014 => RET1     0   2

Side note: like this diff very much, good idea!

>
> NOTE: After emitting the bytecode instruction BC_FNEW fixup is not

Typo: s/NOTE: //

> required, because FuncState will set a flag PROTO_CHILD that will
> trigger emitting a pair of instructions BC_UCLO and BC_RET (see
> <src/lj_parse.c:2355>) and BC_RET will close all upvalues from a base
> equal to 0.
>
> JIT compilation of missing_uclo() function without a patch with fix is failed:
> src/lj_record.c:135: rec_check_slots: Assertion `((((((tr))>>24) & IRT_TYPE) - (TRef)(IRT_NUM) <= (TRef)(IRT_INT-IRT_NUM)))' failed.

I suppose there is no need to cite this assertion failure. So, I suggest
to reword this paragraph in the following way:

| JIT compilation of `missing_uclo()` function without a patch leads to
| assertion failure in `rec_check_slots()` due to Lua stack inconsistency.
| The root cause is still parsing misbehaviour. Nevertheless, the test
| case is added to be sure that the JIT compilation isn't affected.

Also, there is no need to mention me in the commit message :).

> (Thanks to Sergey Kaplun for discovering this!)
> Thus second testcase in a test covers a case with compilation as well.
>
> Sergey Bronnikov:
> * added the description and the test for the problem
>
> Signed-off-by: Sergey Bronnikov <sergeyb@tarantool.org>
> Co-authored-by: Sergey Kaplun <skaplun@tarantool.org>
>
> diff --git a/src/lj_parse.c b/src/lj_parse.c
> index af0dc53f..343fa797 100644
> --- a/src/lj_parse.c
> +++ b/src/lj_parse.c
> @@ -1546,7 +1546,7 @@ static void fs_fixup_ret(FuncState *fs)
>  	/* Replace with UCLO plus branch. */
>  	fs->bcbase[pc].ins = BCINS_AD(BC_UCLO, 0, offset);
>  	break;
> -      case BC_UCLO:
> +      case BC_FNEW:
>  	return;  /* We're done. */
>        default:
>  	break;
> diff --git a/test/tarantool-tests/lj-819-fix-missing-uclo.test.lua b/test/tarantool-tests/lj-819-fix-missing-uclo.test.lua
> new file mode 100644
> index 00000000..942c22b2
> --- /dev/null
> +++ b/test/tarantool-tests/lj-819-fix-missing-uclo.test.lua
> @@ -0,0 +1,115 @@

Please, restrict comment length to the 66 symbols in the test.

> +local tap = require('tap')
> +-- Test contains a reproducer for a problem when LuaJIT generates a wrong
> +-- bytecode with a missed BC_UCLO instruction.
> +local test = tap.test('lj-819-fix-missing-uclo'):skipcond({
> +  ['Test requires JIT enabled'] = not jit.status(),
> +})
> +
> +test:plan(2)
> +
> +-- Let's take a look at listings Listing 1 and Listing 2 below with bytecode

Typo: s/bytecode/the bytecode/

> +-- generated for a function missing_uclo() with and without a patch.

Minor: I suggest to use `` to line-up that this is functions names in
the code. Feel free to ignore.

> +-- Both listings contains two BC_UCLO instructions:

Typo: s/contains/contain/

> +-- - first one with id 0004 is generated for a statement 'break' inside

Typo: s/first/the first/

> +--   condition, see label BC_UCLO1;
> +-- - second one with id 0009 is generated for a statement 'return' inside

Typo: s/second/the second/

> +--   a nested loop, see label BC_UCLO2;
> +-- Both BC_UCLO's closes upvalues after leaving a function's scope.
> +--
> +-- The problem is happen when fs_fixup_ret() traverses bytecode instructions in

Typo: s/happen/happened/

> +-- a function prototype, meets first BC_UCLO instruction (break) and forgives a

Typo: s/first/the first/

> +-- second one (return). This leads to a wrong result produced by a function

Nit: I suggest the reword this in the following way:
| and misses the second one return to fixup

> +-- returned by missing_uclo() function. This also explains why do we need a
> +-- dead code in reproducer - without first BC_UCLO fs_fixup_ret() successfully

Typo: s/first/the first/

> +-- fixup BC_UCLO and problem does not appear.

Typo: s/problem/the problem/

> +--
> +-- Listing 1. Bytecode with a fix.
> +--
> +-- -- BYTECODE -- uclo.lua:1-59

I suggest to use listing from the test itself, i.e. use
lj-819-fix-missing-uclo.test.lua:79-97 (line numbers are used before
comment rewording and reallignment) here and below in headers.

> +-- 0001 => LOOP     0 => 0013
> +-- 0002    JMP      0 => 0003
> +-- 0003 => JMP      0 => 0005
> +-- 0004    UCLO     0 => 0013
> +-- 0005 => KPRI     0   0
> +-- 0006 => LOOP     1 => 0012
> +-- 0007    ISF          0
> +-- 0008    JMP      1 => 0010
> +-- 0009    UCLO     0 => 0014
> +-- 0010 => FNEW     0   0      ; uclo.lua:54

And lj-819-fix-missing-uclo.test.lua:92 here and below in listings.

> +-- 0011    JMP      1 => 0006
> +-- 0012 => UCLO     0 => 0001
> +-- 0013 => RET0     0   1
> +-- 0014 => RET1     0   2
> +--
> +-- Listing 2. Bytecode without a fix.
> +--
> +-- BYTECODE -- uclo.lua:1-59
> +-- 0001 => LOOP     0 => 0013
> +-- 0002    JMP      0 => 0003
> +-- 0003 => JMP      0 => 0005
> +-- 0004    UCLO     0 => 0013
> +-- 0005 => KPRI     0   0
> +-- 0006 => LOOP     1 => 0012
> +-- 0007    ISF          0
> +-- 0008    JMP      1 => 0010
> +-- 0009    UCLO     0 => 0014
> +-- 0010 => FNEW     0   0      ; uclo.lua:54
> +-- 0011    JMP      1 => 0006
> +-- 0012 => UCLO     0 => 0001
> +-- 0013 => RET0     0   1
> +-- 0014 => RET1     0   2
> +--
> +-- Listing 3. Changes in bytecode before and after a fix.
> +--
> +-- @@ -11,11 +11,12 @@
> +--  0006 => LOOP     1 => 0012
> +--  0007    ISF          0
> +--  0008    JMP      1 => 0010
> +-- -0009    RET1     0   2
> +-- +0009    UCLO     0 => 0014
> +--  0010 => FNEW     0   0      ; uclo.lua:56
> +--  0011    JMP      1 => 0006
> +--  0012 => UCLO     0 => 0001
> +--  0013 => RET0     0   1
> +-- +0014 => RET1     0   2
> +--
> +-- First testcase checks a correct bytecode generation by frontend

Typo: s/First/The first/
Typo: s/frontend/frontend,/

> +-- and the second testcase checks consistency on a JIT compilation.
> +
> +local function missing_uclo()
> +  while true do -- luacheck: ignore
> +    -- Attention: it is not a dead code, it is a part of reproducer.

Typo: s/reproducer/the reproducer/

I suggest the following rewording:
| XXX: it is not a dead code, it is a part of the reproducer, see the
| comment above.


> +    -- label: BC_UCLO1
> +    if false then
> +      break
> +    end
> +    local f
> +    while true do
> +      if f then
> +        -- label: BC_UCLO2
> +        return f
> +      end
> +      f = function()
> +        return f
> +      end
> +    end
> +  end
> +end
> +
> +local f = missing_uclo()
> +local res = f()
> +-- Without a patch we don't get here a function, because upvalue isn't closed

Typo: s/Without a patch/Without a patch,/

> +-- as desirable.
> +test:ok(type(res) == 'function', 'virtual machine consistency: type of returned value is correct')

Code width is more than 80 symbols.
Minor: s/virtual machine/VM/. Feel free to ignore.

> +
> +-- Make JIT compiler aggressive.

I suggest to drop this comment (there is no need to comment what
`hotloop=1` do, but if you want you may explain why we use 1 here (this
is still common practice in our tests, so I suggest just drop the
comment)).

> +jit.opt.start('hotloop=1')
> +
> +f = missing_uclo()
> +f()
> +f = missing_uclo()
> +local _
> +_, res = pcall(f)
> +test:ok(type(res) == 'function', 'consistency on compilation: type of returned value is correct')

Code width is more than 80 symbols.
Do we need pcall here?

Also, the test isn't failed with assertion failure as declared. But the
following one is:
| LUA_PATH="src/?.lua;;" src/luajit -Ohotloop=1 -e '
|
| local function missing_uclo()
|     while true do -- luacheck: ignore
|         local f
|         if false then break end
|         while true do
|             if f then
|                 return f
|             end
|             f = function()
|                 return f
|             end
|         end
|     end
| end
|
| -- Function to pollute Lua stack.
| local function ret_arg(f) return f end
|
| f = missing_uclo()
| ret_arg(f())
| ret_arg(f())
| '

> +
> +os.exit(test:check() and 0 or 1)

[1]: https://github.com/tarantool/tarantool/wiki/Code-review-procedure#commit-message

-- 
Best regards,
Sergey Kaplun

  reply	other threads:[~2023-07-09 13:19 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-30 16:56 Sergey Bronnikov via Tarantool-patches
2023-06-06 12:51 ` Sergey Kaplun via Tarantool-patches
2023-06-07 11:35   ` Maxim Kokryashkin via Tarantool-patches
2023-07-06  9:43     ` Sergey Bronnikov via Tarantool-patches
2023-07-06 11:31       ` Maxim Kokryashkin via Tarantool-patches
2023-07-06 13:45         ` Sergey Bronnikov via Tarantool-patches
2023-07-06 21:12           ` Maxim Kokryashkin via Tarantool-patches
2023-07-06  9:40   ` Sergey Bronnikov via Tarantool-patches
2023-07-09 13:15     ` Sergey Kaplun via Tarantool-patches [this message]
2023-07-10 14:53       ` Sergey Bronnikov via Tarantool-patches
2023-07-13  7:57         ` Sergey Kaplun via Tarantool-patches
2023-07-13  9:55           ` Sergey Bronnikov via Tarantool-patches
2023-07-13 10:25             ` Sergey Kaplun via Tarantool-patches
2023-07-20 18:37 ` Igor Munkin via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZKqy9E6BeKTgVgPS@root \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=estetus@gmail.com \
    --cc=sergeyb@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit v1] Fix BC_UCLO insertion for returns.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox