[Tarantool-patches] [PATCH luajit 2/5] Handle on-trace OOM errors from helper functions.

Sergey Kaplun skaplun at tarantool.org
Thu Feb 16 14:36:47 MSK 2023


Hi, Maxim!
Thanks for the patch!
It's really well described and generally LGTM, but I still have some
more minor questions and comments.

On 15.02.23, Maxim Kokryashkin wrote:
> From: Mike Pall <mike>
> 
> (cherry-picked from commit 4bba29e697d00df5f020e76c2003bb9ce51c5d38)
> 
> This patch introduces handling of errors from internal helper
> functions on traces. FFI C++ exception interoperability is
> not yet implemented.
> 
> For each throwing trace, its mcode entry is augmented with a
> DWARF2 frame description entry. After that, a dynamic DWARF2

Minor: Common information entry and frame description entry.

> frame info is registered based on that entry with
> `__register_frame()`. Because the ARM32 architecture lacks

Is there any documentation links for this function (I see only something
like: "document me!")?
If not, may you please describe it shortly?

> the __register_frame, unwinding is not supported on it.
> 
> For each throwing function call, a snapshot is allocated.
> When we have a parent trace, our side trace head requires
> additional snapshot allocation, so the additional

Typo: s/additional/an additional/

> `asm_snap_prev()` call is added.
> 
> Unwinding on traces is disabled on Darwin, due to quirks of
> Apple's version of libunwind.

I don't see those changes, what are you talking about?

> 
> Maxim Kokryashkin:
> * added the description and the test for the problem
> 
> Part of tarantool/tarantool#7745

I suggest to add here:
| Part of tarantool/tarantool#8069

Also, it will be nice to describe new fields in structures (`asm_State`,
`SnapShot`) introduced by Mike.

Also, please mention what registers are chosen as EHRAREG (you can use a
bullet list with arches, numbers and names of registers).

Also, is good to mention why do we now need snapshots right after each
`pcall()` and `xpcall()` frame.

> ---
>  doc/status.html                               |   7 -
>  src/lj_arch.h                                 |  12 +
>  src/lj_asm.c                                  |  77 ++++-
>  src/lj_dispatch.h                             |   4 +-
>  src/lj_err.c                                  | 274 +++++++++++++++++-
>  src/lj_err.h                                  |  19 +-
>  src/lj_ffrecord.c                             |   2 +
>  src/lj_jit.h                                  |   2 +
>  src/lj_mcode.c                                |   5 +-
>  src/lj_opt_loop.c                             |   1 +
>  src/lj_record.c                               |   3 +-
>  src/lj_snap.c                                 |   1 +
>  src/lj_state.c                                |   1 +
>  src/lj_target_x86.h                           |   2 +
>  src/lj_trace.c                                |  61 +++-
>  src/lj_trace.h                                |   3 +
>  src/lj_vm.h                                   |   3 +
>  src/vm_arm.dasc                               |   3 +-
>  src/vm_arm64.dasc                             |   4 +-
>  src/vm_mips.dasc                              |   9 +-
>  src/vm_mips64.dasc                            |  10 +-
>  src/vm_ppc.dasc                               |   3 +-
>  src/vm_x64.dasc                               |   6 +-
>  src/vm_x86.dasc                               |   4 +-
>  .../gh-7745-oom-on-trace.test.lua             |  20 ++
>  25 files changed, 478 insertions(+), 58 deletions(-)
>  create mode 100644 test/tarantool-tests/gh-7745-oom-on-trace.test.lua
> 
> diff --git a/doc/status.html b/doc/status.html
> index c89255b6..c9cd9071 100644
> --- a/doc/status.html
> +++ b/doc/status.html

<snipped>

> diff --git a/src/lj_arch.h b/src/lj_arch.h
> index 2e458d20..8643aa4a 100644
> --- a/src/lj_arch.h
> +++ b/src/lj_arch.h

<snipped>

> diff --git a/src/lj_asm.c b/src/lj_asm.c
> index a154547b..adfaf286 100644
> --- a/src/lj_asm.c
> +++ b/src/lj_asm.c

<snipped>

> @@ -944,6 +946,14 @@ static void asm_snap_alloc(ASMState *as)
>      if (!irref_isk(ref)) {
>        asm_snap_alloc1(as, ref);
>        if (LJ_SOFTFP && (sn & SNAP_SOFTFPNUM)) {
> +        /*
> +        ** FIXME: The following assert was replaced with

Minor: I'm not so sure about this heavy comment. I suggest just to use
`lua_assert()` as is, it will be replaced back after backporting
"Improve assertions".
Feel free to ignore.

> +        ** the conventional `lua_assert`.
> +        **
> +        ** lj_assertA(irt_type(IR(ref+1)->t) == IRT_SOFTFP,
> +		    ** "snap %d[%d] points to bad SOFTFP IR %04d",
> +		    ** snapno, n, ref - REF_BIAS);
> +        */
>  	lua_assert(irt_type(IR(ref+1)->t) == IRT_SOFTFP);
>  	asm_snap_alloc1(as, ref+1);
>        }
> @@ -970,19 +980,16 @@ static int asm_snap_checkrename(ASMState *as, IRRef ren)

<snipped>

> diff --git a/src/lj_dispatch.h b/src/lj_dispatch.h
> index addf5572..b8bc2594 100644
> --- a/src/lj_dispatch.h
> +++ b/src/lj_dispatch.h

<snipped>

> diff --git a/src/lj_err.c b/src/lj_err.c
> index d0223384..c7fd9e65 100644
> --- a/src/lj_err.c
> +++ b/src/lj_err.c

<snipped>

> @@ -304,12 +309,59 @@ LJ_FUNCA int lj_err_unwind_win(EXCEPTION_RECORD *rec,

<snipped>

> +
> +#if LJ_UNWIND_JIT
> +/* DWARF2 personality handler for JIT-compiled code. */
> +static int err_unwind_jit(int version, int actions,
> +  uint64_t uexclass, _Unwind_Exception *uex, _Unwind_Context *ctx)
> +{
> +  /* NYI: FFI C++ exception interoperability. */
> +  if (version != 1 || !LJ_UEXCLASS_CHECK(uexclass))
> +    return _URC_FATAL_PHASE1_ERROR;
> +  if ((actions & _UA_SEARCH_PHASE)) {
> +    return _URC_HANDLER_FOUND;
> +  }
> +  if ((actions & _UA_CLEANUP_PHASE)) {
> +    global_State *g = *(global_State **)(uex+1);

Side note: Just get the next field of the structure, nothing to see
here (cust to the whole structure is too easy way...).

> +    ExitNo exitno;
> +    uintptr_t addr = _Unwind_GetIP(ctx);  /* Return address _after_ call. */
> +    uintptr_t stub = lj_trace_unwind(G2J(g), addr - sizeof(MCode), &exitno);
> +    /*
> +    ** FIXME: The following assert was replaced with
> +    ** the conventional `lua_assert`.
> +    **
> +    ** lj_assertG(tvref(g->jit_base), "unexpected throw across mcode frame");
> +    */
> +    lua_assert(tvref(g->jit_base));
> +    if (stub) {  /* Jump to side exit to unwind the trace. */
> +      G2J(g)->exitcode = LJ_UEXCLASS_ERRCODE(uexclass);
> +#ifdef LJ_TARGET_MIPS
> +      _Unwind_SetGR(ctx, 4, stub);
> +      _Unwind_SetGR(ctx, 5, exitno);
> +      _Unwind_SetIP(ctx, (uintptr_t)(void *)lj_vm_unwind_stub);
> +#else
> +      _Unwind_SetIP(ctx, stub);
> +#endif
> +      return _URC_INSTALL_CONTEXT;
> +    }
> +    return _URC_FATAL_PHASE2_ERROR;
> +  }
> +  return _URC_FATAL_PHASE1_ERROR;
> +}
> +
> +/* DWARF2 template frame info for JIT-compiled code.
> +**
> +** After copying the template to the start of the mcode segment,
> +** the frame handler function and the code size is patched.
> +** The frame handler always installs a new context to jump to the exit,
> +** so don't bother to add any unwind opcodes.
> +*/
> +static const uint8_t err_frame_jit_template[] = {

Side note: Still don't want to use defines (or structures:)) for all
those constatns and strings...

> +#if LJ_BE
> +  0,0,0,
> +#endif
> +  LJ_64 ? 0x1c : 0x14,  /* CIE length. */
> +#if LJ_LE
> +  0,0,0,
> +#endif
> +  0,0,0,0, 1, 'z','P','R',0,  /* CIE mark, CIE version, augmentation. */
> +  1, LJ_64 ? 0x78 : 0x7c, LJ_TARGET_EHRAREG,  /* Code/data align, RA. */

Side note:
Code alignment factor is 1 (uleb128).
Data alignment factor is LJ_64 ? 8 : 4 (sleb128).

> +#if LJ_64
> +  10, 0, 0,0,0,0,0,0,0,0, 0x1b,  /* Aug. data ABS handler, PCREL|SDATA4 code. */

Don't get this. Why do we use PCREL|SDATA4 code for 8-byte pointers
(instead PCREL|SDATA8) (I mean we already reserve room for it)?

What is "ABS handler"? I don't see any docs about it.

> +  0,0,0,0,0,  /* Alignment. */
> +#else
> +  6, 0, 0,0,0,0, 0x1b,  /* Aug. data ABS handler, PCREL|SDATA4 code. */
> +  0,  /* Alignment. */
> +#endif
> +#if LJ_BE
> +  0,0,0,
> +#endif
> +  LJ_64 ? 0x14 : 0x10,  /* FDE length. */

Side note: Is differ due to "Alignment".

> +  0,0,0,
> +  LJ_64 ? 0x24 : 0x1c,  /* CIE offset. */
> +  0,0,0,
> +  LJ_64 ? 0x14 : 0x10,  /* Code offset. After Final FDE. */

Side note: Equals to FDE length, because this entry is right before the
corresponding MCode.

> +#if LJ_LE
> +  0,0,0,
> +#endif
> +  0,0,0,0, 0, 0,0,0, /* Code size, augmentation length, alignment. */
> +#if LJ_64
> +  0,0,0,0,  /* Alignment. */
> +#endif
> +  0,0,0,0  /* Final FDE. */
> +};
> +

<snipped>

> +#define ERR_FRAME_JIT_OFS_HANDLER       0x12
> +#define ERR_FRAME_JIT_OFS_FDE           (LJ_64 ? 0x20 : 0x18)
> +#define ERR_FRAME_JIT_OFS_CODE_SIZE     (LJ_64 ? 0x2c : 0x24)
> +#if LJ_TARGET_OSX
> +#define ERR_FRAME_JIT_OFS_REGISTER      ERR_FRAME_JIT_OFS_FDE

Why this field is different for macOS and Linux (macOS uses FDE start, and
Linux CIE start as an argument)?
Please describe it in the commit message.
(May be the link to `__register_frame()` documentation is enough).

> +#else
> +#define ERR_FRAME_JIT_OFS_REGISTER      0
> +#endif
> +
> +extern void __register_frame(const void *);
> +extern void __deregister_frame(const void *);
> +
> +uint8_t *lj_err_register_mcode(void *base, size_t sz, uint8_t *info)
> +{
> +  void **handler;
> +  memcpy(info, err_frame_jit_template, sizeof(err_frame_jit_template));
> +  handler = (void *)err_unwind_jit;
> +  memcpy(info + ERR_FRAME_JIT_OFS_HANDLER, &handler, sizeof(handler));
> +  *(uint32_t *)(info + ERR_FRAME_JIT_OFS_CODE_SIZE) =
> +    (uint32_t)(sz - sizeof(err_frame_jit_template) - (info - (uint8_t *)base));
> +  __register_frame(info + ERR_FRAME_JIT_OFS_REGISTER);
> +#ifdef LUA_USE_ASSERT
> +  {
> +    struct dwarf_eh_bases ehb;
> +    lj_assertX(_Unwind_Find_FDE(info + sizeof(err_frame_jit_template)+1, &ehb),
> +               "bad JIT unwind table registration");
> +  }
> +#endif
> +  return info + sizeof(err_frame_jit_template);
> +}
> +
> +void lj_err_deregister_mcode(void *base, size_t sz, uint8_t *info)
> +{
> +  UNUSED(base); UNUSED(sz);
> +  __deregister_frame(info + ERR_FRAME_JIT_OFS_REGISTER);

Should it be the opposite assert here?

> +}

<snipped>

> @@ -492,25 +685,68 @@ LJ_FUNCA int lj_err_unwind_arm(int state, _Unwind_Control_Block *ucb,
>    }
>    if (__gnu_unwind_frame(ucb, ctx) != _URC_OK)
>      return _URC_FAILURE;
> +#ifdef LUA_USE_ASSERT
> +  /* We should never get here unless this is a forced unwind aka backtrace. */
> +  if (_Unwind_GetGR(ctx, 0) == 0xff33aa77) {
> +    _Unwind_SetGR(ctx, 0, 0xff33aa88);
> +  }
> +#endif
>    return _URC_CONTINUE_UNWIND;
>  }
>  
> -#if LJ_UNWIND_EXT
> -static __thread _Unwind_Control_Block static_uex;
> +#if LJ_UNWIND_EXT && defined(LUA_USE_ASSERT)
> +typedef int (*_Unwind_Trace_Fn)(_Unwind_Context *, void *);
> +extern int _Unwind_Backtrace(_Unwind_Trace_Fn, void *);
> +
> +static int err_verify_bt(_Unwind_Context *ctx, int *got)
> +{
> +  if (_Unwind_GetGR(ctx, 0) == 0xff33aa88) { *got = 2; }
> +  else if (*got == 0) { *got = 1; _Unwind_SetGR(ctx, 0, 0xff33aa77); }

Side note: OMG we totally, don't want to use defines with meaningfull
names here, just use random constants:)...

> +  return _URC_OK;
> +}
> +
> +/* Verify that external error handling actually has a chance to work. */
> +void lj_err_verify(void)
> +{
> +  int got = 0;
> +  _Unwind_Backtrace((_Unwind_Trace_Fn)err_verify_bt, &got);
> +  /*
> +  ** FIXME: The following assert was replaced with
> +  ** the conventional `lua_assert`.
> +  **
> +  ** lj_assertX(got == 2, "broken build: external frame unwinding enabled, but missing -funwind-tables");
> +  */
> +  lua_assert(got == 2);
> +}
>  #endif

<snipped>

> @@ -620,7 +856,7 @@ static ptrdiff_t finderrfunc(lua_State *L)
>  /* Runtime error. */
>  LJ_NOINLINE void LJ_FASTCALL lj_err_run(lua_State *L)
>  {
> -  ptrdiff_t ef = finderrfunc(L);
> +  ptrdiff_t ef = (LJ_HASJIT && tvref(G(L)->jit_base)) ? 0 : finderrfunc(L);

If we don't want to find error function when we on trace why do we take
`if (errcode == LUA_ERRRUN)` below?
Does it mean that we will still try to rethrow error raising inside
`lj_trace_exit()` (AFAICS this is the only place when `jit_base == 0`
and we call `lj_err_trace()`)?
If so, please mention this in the commit message.

Also, please mention that `lj_err_trace()` is introduced to use instead
`lj_err_run()` in rethrowing of the error.

>    if (ef) {
>      TValue *errfunc = restorestack(L, ef);
>      TValue *top = L->top;
> @@ -639,6 +875,16 @@ LJ_NOINLINE void LJ_FASTCALL lj_err_run(lua_State *L)
>    lj_err_throw(L, LUA_ERRRUN);
>  }
>  
> +#if LJ_HASJIT
> +LJ_NOINLINE void LJ_FASTCALL lj_err_trace(lua_State *L, int errcode)
> +{
> +  if (errcode == LUA_ERRRUN)
> +    lj_err_run(L);
> +  else
> +    lj_err_throw(L, errcode);
> +}
> +#endif
> +
>  /* Formatted runtime error message. */
>  LJ_NORET LJ_NOINLINE static void err_msgv(lua_State *L, ErrMsg em, ...)
>  {
> diff --git a/src/lj_err.h b/src/lj_err.h
> index aa4b7e0d..b0c72c24 100644
> --- a/src/lj_err.h
> +++ b/src/lj_err.h

<snipped>

> diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
> index 649ac705..8af9da1d 100644
> --- a/src/lj_ffrecord.c
> +++ b/src/lj_ffrecord.c

<snipped>

> diff --git a/src/lj_jit.h b/src/lj_jit.h
> index d82292f8..f2ad3c6e 100644
> --- a/src/lj_jit.h
> +++ b/src/lj_jit.h

<snipped>

> diff --git a/src/lj_mcode.c b/src/lj_mcode.c
> index 77035bf7..7184d3b4 100644
> --- a/src/lj_mcode.c
> +++ b/src/lj_mcode.c

<snipped>

> diff --git a/src/lj_opt_loop.c b/src/lj_opt_loop.c
> index 441b8add..10613641 100644
> --- a/src/lj_opt_loop.c
> +++ b/src/lj_opt_loop.c

<snipped>

> diff --git a/src/lj_record.c b/src/lj_record.c
> index 9e2e1d9e..e7dac7ac 100644
> --- a/src/lj_record.c
> +++ b/src/lj_record.c
> @@ -800,7 +800,7 @@ void lj_record_ret(jit_State *J, BCReg rbase, ptrdiff_t gotresults)
>      J->base -= cbase;
>      J->base[--rbase] = TREF_TRUE;  /* Prepend true to results. */
>      frame = frame_prevd(frame);
> +    J->needsnap = 1;  /* Stop catching on-trace errors. */
>    }
>    /* Return to lower frame via interpreter for unhandled cases. */
>    if (J->framedepth == 0 && J->pt && bc_isret(bc_op(*J->pc)) &&
> @@ -2021,7 +2022,7 @@ void lj_record_ins(jit_State *J)
>    /* Need snapshot before recording next bytecode (e.g. after a store). */
>    if (J->needsnap) {
>      J->needsnap = 0;
> -    lj_snap_purge(J);
> +    if (J->pt) lj_snap_purge(J);

In which cases there is no prototype?

>      lj_snap_add(J);
>      J->mergesnap = 1;
>    }
> diff --git a/src/lj_snap.c b/src/lj_snap.c
> index 2f7cf80a..a8b49fcb 100644
> --- a/src/lj_snap.c
> +++ b/src/lj_snap.c

<snipped>

> diff --git a/src/lj_state.c b/src/lj_state.c
> index cc6f92f1..4add3d65 100644
> --- a/src/lj_state.c
> +++ b/src/lj_state.c

<snipped>

> diff --git a/src/lj_target_x86.h b/src/lj_target_x86.h
> index 194f8e70..4efb566b 100644
> --- a/src/lj_target_x86.h
> +++ b/src/lj_target_x86.h

<snipped>

> diff --git a/src/lj_trace.c b/src/lj_trace.c
> index c6e2f72e..17743159 100644
> --- a/src/lj_trace.c
> +++ b/src/lj_trace.c

<snipped>

> @@ -932,4 +944,47 @@ int LJ_FASTCALL lj_trace_exit(jit_State *J, void *exptr)
>    }
>  }
>  
> +#if LJ_UNWIND_JIT
> +/* Given an mcode address determine trace exit address for unwinding. */
> +uintptr_t LJ_FASTCALL lj_trace_unwind(jit_State *J, uintptr_t addr, ExitNo *ep)
> +{
> +#if EXITTRACE_VMSTATE
> +  TraceNo traceno = J2G(J)->vmstate;
> +#else
> +  TraceNo traceno = trace_exit_find(J, (MCode *)addr);
> +#endif

Side note: Looks like we should use the same approach for our symtab
dumping in profilers.

> +  GCtrace *T = traceref(J, traceno);
> +  if (T
> +#if EXITTRACE_VMSTATE
> +      && addr >= (uintptr_t)T->mcode && addr < (uintptr_t)T->mcode + T->szmcode
> +#endif
> +     ) {

<snipped>

> +}
> +#endif
> +
>  #endif
> diff --git a/src/lj_trace.h b/src/lj_trace.h
> index 22cae741..0bfb606f 100644
> --- a/src/lj_trace.h
> +++ b/src/lj_trace.h

<snipped>

> diff --git a/src/lj_vm.h b/src/lj_vm.h
> index 1cc7eed7..411caafa 100644
> --- a/src/lj_vm.h
> +++ b/src/lj_vm.h

<snipped>

> diff --git a/src/vm_arm.dasc b/src/vm_arm.dasc
> index a29292f1..436a9fd7 100644
> --- a/src/vm_arm.dasc
> +++ b/src/vm_arm.dasc

<snipped>

> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc
> index f517a808..7066b051 100644
> --- a/src/vm_arm64.dasc
> +++ b/src/vm_arm64.dasc

<snipped>

> diff --git a/src/vm_mips.dasc b/src/vm_mips.dasc
> index 93c772ff..32caabf7 100644
> --- a/src/vm_mips.dasc
> +++ b/src/vm_mips.dasc

<snipped>

> diff --git a/src/vm_mips64.dasc b/src/vm_mips64.dasc
> index 9a749f93..04be38f0 100644
> --- a/src/vm_mips64.dasc
> +++ b/src/vm_mips64.dasc
> @@ -545,6 +545,10 @@ static void build_subroutines(BuildCtx *ctx)

<snipped>

> diff --git a/src/vm_ppc.dasc b/src/vm_ppc.dasc
> index 980176a2..7ad8df37 100644
> --- a/src/vm_ppc.dasc
> +++ b/src/vm_ppc.dasc

<snipped>

> diff --git a/src/vm_x64.dasc b/src/vm_x64.dasc
> index 59f117ba..27af164a 100644
> --- a/src/vm_x64.dasc
> +++ b/src/vm_x64.dasc

<snipped>

> diff --git a/src/vm_x86.dasc b/src/vm_x86.dasc
> index f7ffe5d2..85807129 100644
> --- a/src/vm_x86.dasc
> +++ b/src/vm_x86.dasc

<snipped>

> diff --git a/test/tarantool-tests/gh-7745-oom-on-trace.test.lua b/test/tarantool-tests/gh-7745-oom-on-trace.test.lua
> new file mode 100644
> index 00000000..1366af30
> --- /dev/null
> +++ b/test/tarantool-tests/gh-7745-oom-on-trace.test.lua
> @@ -0,0 +1,20 @@
> +require('utils').skipcond(jit.os == 'OSX', 'Disabled due to incorrect behavior of unwinder in tarantool_panic_handler')

Line length is more than 80 symbols.

> +local ffi = require('ffi')
> +
> +local tap = require('tap')
> +
> +local test = tap.test('OOM on trace')
> +test:plan(1)
> +
> +local function memory_payload()
> +  local t = {}
> +  for i = 1, 1e10 do
> +    t[ffi.new("uint64_t")] = i

Typo: s/"uint64_t"/'uint64_t'/

This case returns OOM, error (and may return the TABOV error).
Can we check exactly TABOV error too(may be with the help of `table.new()`?)?

> +  end
> +  print(t)
> +end
> +
> +local res = pcall(memory_payload)
> +test:ok(res == false)
> +
> +os.exit(test:check() and 0 or 1)
> -- 
> 2.39.0
> 

-- 
Best regards,
Sergey Kaplun


More information about the Tarantool-patches mailing list