[Tarantool-patches] [PATCH luajit] Fix predict_next() in parser (again).

Wed Aug 30 13:53:22 MSK 2023

Hi, Sergey!
Thanks for the patch!
LGTM, except for a few nits below.

On Tue, Aug 29, 2023 at 01:42:40PM +0300, Sergey Bronnikov via Tarantool-patches wrote:
> From: sergeyb at tarantool.org
> 
> Reported by Sergey Bronnikov. #1054
> 
> (cherry picked from commit 309fb42b871b6414f53e0e0e708bce0b0d62daff)
> 
> The following Lua snippet triggers an out of boundary access to a stack:
> 
> ```lua
> a, b, c = 1, 2, 3
> local d
> for _ in nil do end
> ```
> 
> With execution snippet by LuaJIT instrumented by ASAN it leads to
> a heap-buffer-overflow.
I suggest the following rephrasing with grammar fixes:
| Executing this snippet with LuaJIT instrumented by ASAN leads to
| a heap buffer overflow.
> 
> In a function `predict_next` variable `exprpc` looks forward and expects
Typo: s/In a/In/
> extra bytecodes on the stack. However, `KPRI` is merged to the `KNIL`
Typo: s/to the/to/
> and there is no new bytecode to add, so `exprpc == fs->bclim` and it
> leads to out of boundary access.

The last sentence, which is missing here but present on GitHub, should read
as follows:
| The issue has been fixed by an early return when `pc >= fs->bclim`.
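
For illustration only, here is a minimal standalone sketch of the
"check the limit before reading" pattern the fix relies on. The `Toy*`
types and `toy_predict()` are hypothetical stand-ins, not LuaJIT's real
`FuncState`/`BCIns` definitions, and the sketch assumes `bclim` is the
number of valid slots in `bcbase`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real LuaJIT definitions. */
typedef uint32_t ToyIns;
typedef uint32_t ToyPos;

typedef struct ToyFuncState {
  const ToyIns *bcbase;  /* Start of the instruction buffer. */
  ToyPos bclim;          /* Assumed: number of valid slots in bcbase. */
} ToyFuncState;

/*
** Return 1 if the instruction at pc carries opcode `want` in its low
** byte, 0 otherwise. An out-of-range pc also yields 0 and is never
** dereferenced.
*/
static int toy_predict(const ToyFuncState *fs, ToyPos pc, uint8_t want)
{
  ToyIns ins;
  assert(fs != NULL && fs->bcbase != NULL);
  if (pc >= fs->bclim) return 0;  /* Early return: nothing to inspect. */
  ins = fs->bcbase[pc];           /* Safe here: pc < bclim. */
  return (uint8_t)(ins & 0xff) == want;
}
```

With a two-instruction buffer, `toy_predict(&fs, 2, op)` returns 0
instead of reading one slot past the end, which is the situation the
patch guards against when `exprpc == fs->bclim`.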
> 
> Sergey Bronnikov:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#8825
> ---
> 
> PR: https://github.com/tarantool/tarantool/pull/9054
> Branch: https://github.com/tarantool/luajit/tree/ligurio/lj-1054-incorrect-pc-value-predict_next
> Related issue:
> * https://github.com/LuaJIT/LuaJIT/issues/1054
> 
>  src/lj_parse.c                                 |  4 +++-
>  ...incorrect-pc-value-in-predict_next.test.lua | 18 ++++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
>  create mode 100644 test/tarantool-tests/lj-1054-incorrect-pc-value-in-predict_next.test.lua
> 
> diff --git a/src/lj_parse.c b/src/lj_parse.c
> index 343fa797..f1015960 100644
> --- a/src/lj_parse.c
> +++ b/src/lj_parse.c
> @@ -2511,9 +2511,11 @@ static void parse_for_num(LexState *ls, GCstr *varname, BCLine line)
>  */
>  static int predict_next(LexState *ls, FuncState *fs, BCPos pc)
>  {
> -  BCIns ins = fs->bcbase[pc].ins;
> +  BCIns ins;
>    GCstr *name;
>    cTValue *o;
> +  if (pc >= fs->bclim) return 0;
> +  ins = fs->bcbase[pc].ins;
>    switch (bc_op(ins)) {
>    case BC_MOV:
>      name = gco2str(gcref(var_get(ls, fs, bc_d(ins)).name));
> diff --git a/test/tarantool-tests/lj-1054-incorrect-pc-value-in-predict_next.test.lua b/test/tarantool-tests/lj-1054-incorrect-pc-value-in-predict_next.test.lua
> new file mode 100644
> index 00000000..17f1b994
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1054-incorrect-pc-value-in-predict_next.test.lua
> @@ -0,0 +1,18 @@
> +local tap = require('tap')
> +local test = tap.test('lj-1054-incorrect-pc-value-in-predict_next')
> +test:plan(1)
> +
> +
> +-- The test demonstrates a problem with out of boundary access to a stack.
> +-- Sample executed in LuaJIT instrumented by ASAN leads to
> +-- a heap-buffer-overflow.
> +-- See also https://github.com/LuaJIT/LuaJIT/issues/528
This chunk is a bit dated, and I don't want to dig through a bunch of
emails and sequential diffs, so I'll just bring the current version
here myself.

Here it is:
-- The test demonstrates a problem with out-of-boundary
-- access to a stack. The problem can be easily observed
-- on execution the sample by LuaJIT by ASAN, sanitizer
Typo: s/execution/execution of/
Typo: s/sanitizer/where the sanitizer/
-- reports a heap-based buffer overflow.
-- See also https://github.com/LuaJIT/LuaJIT/issues/1054.

Otherwise, considering the changes you've already made after
Sergey's comments, this part is ok.
> +local lua_code = [[
> +a, b, c = 1, 2, 3
> +local d
> +for _ in nil do end
> +]]
> +
> +test:ok(loadstring(lua_code), 'parsing is correct')
> +
> +test:done(true)
> -- 
> 2.34.1
>