<HTML><BODY><div>Hi!</div><div>Thanks for the review!</div><div> </div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div> <blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div id=""><div class="js-helper js-readmsg-msg"><div><div id="style_16692956320495467829_BODY">Hi, Maksim!<br><br>Thanks for the patch!<br><br>LGTM, but I have a bunch of questions to clarify it.<br><br>On 28.10.22, Maksim Kokryashkin wrote:<br>> Contributed by Edmund Kapusniak. For more info,<br>> see #698 and #757.<br>><br>> (cherry picked from commit c38747b626b978555324504ec29a110f6b04902f)<br>><br>> To allow compiler generate compact unwind info generation<br>> for Mach-O, fp must point to the saved fp, and the frame<br>> must be specified relative to fp+16.<br><br>Is there any link to documentation or source code to inspect this<br>behaviour?</div></div></div></div></blockquote><div>Unfortunately, there are no official docs for that. However, there is</div><div>a community effort to create one here[1]. Also, this header file from</div><div>the Apple’s sources is quite useful. Added both of them to commit</div><div>message for those who will get the masculine urge to dive into this.</div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>> ELF unwind info has<br>> been updated to also use fp+16 rather than sp+CFRAME_SIZE.<br><br>Also, I see that `.byte` notation is replaced with `.uleb128` or<br>`.sleb128`. What is the reason?<br>Same question for removing names of entries ("_lj_vm_ffi_call.eh:" for<br>example).</div></div></div></div></blockquote><div>IINM, it is something Apple expects from the compact unwind info[2].</div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>><br>> Offset to pointer to personality routine specified as @GOT-. rather<br>> than @GOTPCREL.<br><br>Does it mean that we use incorrect encoded offset (I see encoding for<br>offset is still the same) for our personality routine?<br>If so, maybe the other changes are just refactoring?</div></div></div></div></blockquote><div>No, that is not correct. Offset has changed, because any </div><div>`func@GOT` expression is translated into offset.</div><div>Moreover, I doubt it is possible to fill in offset to GOT by hand.</div><div>What goes after the pointer to the personality routine</div><div>is LSDA, according to the Apple’s source[2]. @GOTPCREL and @GOT</div><div>are interchangeable most of the time, except for the cases when signed</div><div>32-bit RIP references are not enough for you, which seems to be the case here.</div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><br>><br>> Re-enabled LUAJIT_UNWIND_EXTERNAL by default on OSX.<br>><br>> Maxim Kokryashkin:<br>> * added the description for the issue and the test<br>><br>> Resolves tarantool/tarantool#6096<br>> Part of tarantool/tarantool#7230<br>> ---<br>> cmake/SetTargetFlags.cmake | 4 +-<br>> src/Makefile.original | 5 +-<br>> src/vm_arm64.dasc | 89 ++++++++-----------<br>> ...-6096-external-unwinding-on-arm64.test.lua | 13 +++<br>> 4 files changed, 54 insertions(+), 57 deletions(-)<br>> create mode 100644 test/tarantool-tests/gh-6096-external-unwinding-on-arm64.test.lua<br>><br>> diff --git a/cmake/SetTargetFlags.cmake b/cmake/SetTargetFlags.cmake<br>> index 8abb6288..7cf26be1 100644<br>> --- a/cmake/SetTargetFlags.cmake<br>> +++ b/cmake/SetTargetFlags.cmake<br><br><snipped><br><br>> diff --git a/src/Makefile.original b/src/Makefile.original<br>> index 2d014e43..813d9c12 100644<br>> --- a/src/Makefile.original<br>> +++ b/src/Makefile.original<br><br><snipped><br><br>> diff --git a/src/vm_arm64.dasc b/src/vm_arm64.dasc<br>> index ccfa72bd..f517a808 100644<br>> --- a/src/vm_arm64.dasc<br>> +++ b/src/vm_arm64.dasc<br><br><snipped><br><br>> @@ -504,8 +503,9 @@ static void build_subroutines(BuildCtx *ctx)<br>> | ldr GL, L->glref // Setup pointer to global state.<br>> | mov BASE, CARG2<br>> | str CARG1, SAVE_PC // Any value outside of bytecode is ok.<br>> - | str RC, SAVE_CFRAME<br>> - | str fp, L->cframe // Add our C frame to cframe chain.<br>> + | add TMP0, sp, #0<br>> + | str RC, SAVE_CFRAME<br>> + | str TMP0, L->cframe // Add our C frame to cframe chain.<br><br>Why can't we just use the following?<br>| str RC, SAVE_CFRAME<br>| str sp, L->cframe // Add our C frame to cframe chain.<br><br>> |<br>> |3: // Entry point for vm_cpcall/vm_resume (BASE = base, PC = ftype).<br>> | str L, GL->cur_L<br>> @@ -540,8 +540,9 @@ static void build_subroutines(BuildCtx *ctx)<br>> | sub RA, RA, RB // Compute -savestack(L, L->top).<br>> | str RAw, SAVE_NRES // Neg. delta means cframe w/o frame.<br>> | str wzr, SAVE_ERRF // No error function.<br>> - | str RC, SAVE_CFRAME<br>> - | str fp, L->cframe // Add our C frame to cframe chain.<br>> + | add TMP0, sp, #0<br>> + | str RC, SAVE_CFRAME<br>> + | str TMP0, L->cframe // Add our C frame to cframe chain.<br><br>Ditto.<br><br>> | str L, GL->cur_L<br>> | blr CARG4 // (lua_State *L, lua_CFunction func, void *ud)<br>> | mov BASE, CRET1<br>> @@ -2129,14 +2130,14 @@ static void build_subroutines(BuildCtx *ctx)<br><br><snipped><br><br>> @@ -3879,7 +3880,7 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.uleb128 0x1\n"<br>> "\t.sleb128 -8\n"<br>> "\t.byte 30\n" /* Return address is in lr. */<br>> - "\t.byte 0xc\n\t.uleb128 31\n\t.uleb128 0\n" /* def_cfa sp */<br>> + "\t.byte 0xc\n\t.uleb128 29\n\t.uleb128 16\n" /* def_cfa fp 16 */<br>> "\t.align 3\n"<br>> ".LECIE0:\n\n");<br>> fprintf(ctx->fp,<br>> @@ -3889,10 +3890,9 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long .Lframe0\n"<br>> "\t.quad .Lbegin\n"<br>> "\t.quad %d\n"<br>> - "\t.byte 0xe\n\t.uleb128 %d\n" /* def_cfa_offset */<br>> "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> "\t.byte 0x9d\n\t.uleb128 2\n", /* offset fp */<br>> - fcofs, CFRAME_SIZE);<br>> + fcofs);<br>> for (i = 19; i <= 28; i++) /* offset x19-x28 */<br>> fprintf(ctx->fp, "\t.byte 0x%x\n\t.uleb128 %d\n", 0x80+i, i+(3-19));<br>> for (i = 8; i <= 15; i++) /* offset d8-d15 */<br>> @@ -3909,12 +3909,10 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long .Lframe0\n"<br>> "\t.quad lj_vm_ffi_call\n"<br>> "\t.quad %d\n"<br>> - "\t.byte 0xe\n\t.uleb128 32\n" /* def_cfa_offset */<br>> "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> "\t.byte 0x9d\n\t.uleb128 2\n" /* offset fp */<br>> "\t.byte 0x93\n\t.uleb128 3\n" /* offset x19 */<br>> "\t.byte 0x94\n\t.uleb128 4\n" /* offset x20 */<br>> - "\t.byte 0xd\n\t.uleb128 0x1d\n" /* def_cfa_register fp */<br>> "\t.align 3\n"<br>> ".LEFDE1:\n\n", (int)ctx->codesz - fcofs);<br>> #endif<br>> @@ -3933,7 +3931,7 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.byte 0x1b\n" /* pcrel|sdata4 */<br>> "\t.long lj_err_unwind_dwarf-.\n"<br>> "\t.byte 0x1b\n" /* pcrel|sdata4 */<br>> - "\t.byte 0xc\n\t.uleb128 31\n\t.uleb128 0\n" /* def_cfa sp */<br>> + "\t.byte 0xc\n\t.uleb128 29\n\t.uleb128 16\n" /* def_cfa fp 16 */<br>> "\t.align 3\n"<br>> ".LECIE1:\n\n");<br>> fprintf(ctx->fp,<br>> @@ -3944,10 +3942,9 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long .Lbegin-.\n"<br>> "\t.long %d\n"<br>> "\t.uleb128 0\n" /* augmentation length */<br>> - "\t.byte 0xe\n\t.uleb128 %d\n" /* def_cfa_offset */<br>> "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> "\t.byte 0x9d\n\t.uleb128 2\n", /* offset fp */<br>> - fcofs, CFRAME_SIZE);<br>> + fcofs);<br>> for (i = 19; i <= 28; i++) /* offset x19-x28 */<br>> fprintf(ctx->fp, "\t.byte 0x%x\n\t.uleb128 %d\n", 0x80+i, i+(3-19));<br>> for (i = 8; i <= 15; i++) /* offset d8-d15 */<br>> @@ -3969,7 +3966,7 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.byte 30\n" /* Return address is in lr. */<br>> "\t.uleb128 1\n" /* augmentation length */<br>> "\t.byte 0x1b\n" /* pcrel|sdata4 */<br>> - "\t.byte 0xc\n\t.uleb128 31\n\t.uleb128 0\n" /* def_cfa sp */<br>> + "\t.byte 0xc\n\t.uleb128 29\n\t.uleb128 16\n" /* def_cfa fp 16 */<br>> "\t.align 3\n"<br>> ".LECIE2:\n\n");<br>> fprintf(ctx->fp,<br>> @@ -3980,18 +3977,15 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long lj_vm_ffi_call-.\n"<br>> "\t.long %d\n"<br>> "\t.uleb128 0\n" /* augmentation length */<br>> - "\t.byte 0xe\n\t.uleb128 32\n" /* def_cfa_offset */<br>> "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> "\t.byte 0x9d\n\t.uleb128 2\n" /* offset fp */<br>> "\t.byte 0x93\n\t.uleb128 3\n" /* offset x19 */<br>> "\t.byte 0x94\n\t.uleb128 4\n" /* offset x20 */<br>> - "\t.byte 0xd\n\t.uleb128 0x1d\n" /* def_cfa_register fp */<br>> "\t.align 3\n"<br>> ".LEFDE3:\n\n", (int)ctx->codesz - fcofs);<br>> #endif<br>> break;<br>> - /* Disabled until someone finds a fix. See #698. */<br>> -#if !LJ_NO_UNWIND && 0<br>> +#if !LJ_NO_UNWIND<br>> case BUILD_machasm: {<br>> #if LJ_HASFFI<br>> int fcsize = 0;<br>> @@ -4006,14 +4000,14 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long 0\n"<br>> "\t.byte 0x1\n"<br>> "\t.ascii \"zPR\\0\"\n"<br>> - "\t.byte 0x1\n"<br>> - "\t.byte 128-8\n"<br>> + "\t.uleb128 0x1\n"<br>> + "\t.sleb128 -8\n"<br>> "\t.byte 30\n" /* Return address is in lr. */<br>> - "\t.byte 6\n" /* augmentation length */<br>> + "\t.uleb128 6\n" /* augmentation length */<br>> "\t.byte 0x9b\n" /* indirect|pcrel|sdata4 */<br>> - "\t.long _lj_err_unwind_dwarf@GOTPCREL\n"<br>> + "\t.long _lj_err_unwind_dwarf@GOT-.\n"<br>> "\t.byte 0x1b\n" /* pcrel|sdata4 */<br>> - "\t.byte 0xc\n\t.byte 31\n\t.byte 0\n" /* def_cfa sp */<br>> + "\t.byte 0xc\n\t.uleb128 29\n\t.uleb128 16\n" /* def_cfa fp 16 */<br>> "\t.align 3\n"<br>> "LECIEX:\n\n");<br>> for (j = 0; j < ctx->nsym; j++) {<br>> @@ -4024,7 +4018,6 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> if (!strcmp(name, "_lj_vm_ffi_call")) { fcsize = size; continue; }<br>> #endif<br>> fprintf(ctx->fp,<br>> - "%s.eh:\n"<br>> "LSFDE%d:\n"<br>> "\t.set L$set$%d,LEFDE%d-LASFDE%d\n"<br>> "\t.long L$set$%d\n"<br>> @@ -4032,15 +4025,14 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long LASFDE%d-EH_frame1\n"<br>> "\t.long %s-.\n"<br>> "\t.long %d\n"<br>> - "\t.byte 0\n" /* augmentation length */<br>> - "\t.byte 0xe\n\t.byte %d\n\t.byte 1\n" /* def_cfa_offset */<br>> - "\t.byte 0x9e\n\t.byte 1\n" /* offset lr */<br>> - "\t.byte 0x9d\n\t.byte 2\n", /* offset fp */<br>> - name, j, j, j, j, j, j, j, name, size, CFRAME_SIZE);<br>> + "\t.uleb128 0\n" /* augmentation length */<br>> + "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> + "\t.byte 0x9d\n\t.uleb128 2\n", /* offset fp */<br>> + j, j, j, j, j, j, j, name, size);<br>> for (i = 19; i <= 28; i++) /* offset x19-x28 */<br>> - fprintf(ctx->fp, "\t.byte 0x%x\n\t.byte %d\n", 0x80+i, i+(3-19));<br>> + fprintf(ctx->fp, "\t.byte 0x%x\n\t.uleb128 %d\n", 0x80+i, i+(3-19));<br>> for (i = 8; i <= 15; i++) /* offset d8-d15 */<br>> - fprintf(ctx->fp, "\t.byte 5\n\t.byte 0x%x\n\t.byte %d\n",<br>> + fprintf(ctx->fp, "\t.byte 5\n\t.uleb128 0x%x\n\t.uleb128 %d\n",<br>> 64+i, i+(3+(28-19+1)-8));<br>> fprintf(ctx->fp,<br>> "\t.align 3\n"<br>> @@ -4056,16 +4048,15 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long 0\n"<br>> "\t.byte 0x1\n"<br>> "\t.ascii \"zR\\0\"\n"<br>> - "\t.byte 0x1\n"<br>> - "\t.byte 128-8\n"<br>> + "\t.uleb128 0x1\n"<br>> + "\t.sleb128 -8\n"<br>> "\t.byte 30\n" /* Return address is in lr. */<br>> - "\t.byte 1\n" /* augmentation length */<br>> + "\t.uleb128 1\n" /* augmentation length */<br>> "\t.byte 0x1b\n" /* pcrel|sdata4 */<br>> - "\t.byte 0xc\n\t.byte 31\n\t.byte 0\n" /* def_cfa sp */<br>> + "\t.byte 0xc\n\t.uleb128 29\n\t.uleb128 16\n" /* def_cfa fp 16 */<br>> "\t.align 3\n"<br>> "LECIEY:\n\n");<br>> fprintf(ctx->fp,<br>> - "_lj_vm_ffi_call.eh:\n"<br>> "LSFDEY:\n"<br>> "\t.set L$set$yy,LEFDEY-LASFDEY\n"<br>> "\t.long L$set$yy\n"<br>> @@ -4073,13 +4064,11 @@ static void emit_asm_debug(BuildCtx *ctx)<br>> "\t.long LASFDEY-EH_frame2\n"<br>> "\t.long _lj_vm_ffi_call-.\n"<br>> "\t.long %d\n"<br>> - "\t.byte 0\n" /* augmentation length */<br>> - "\t.byte 0xe\n\t.byte 32\n" /* def_cfa_offset */<br>> - "\t.byte 0x9e\n\t.byte 1\n" /* offset lr */<br>> - "\t.byte 0x9d\n\t.byte 2\n" /* offset fp */<br>> - "\t.byte 0x93\n\t.byte 3\n" /* offset x19 */<br>> - "\t.byte 0x94\n\t.byte 4\n" /* offset x20 */<br>> - "\t.byte 0xd\n\t.uleb128 0x1d\n" /* def_cfa_register fp */<br>> + "\t.uleb128 0\n" /* augmentation length */<br>> + "\t.byte 0x9e\n\t.uleb128 1\n" /* offset lr */<br>> + "\t.byte 0x9d\n\t.uleb128 2\n" /* offset fp */<br>> + "\t.byte 0x93\n\t.uleb128 3\n" /* offset x19 */<br>> + "\t.byte 0x94\n\t.uleb128 4\n" /* offset x20 */<br>> "\t.align 3\n"<br>> "LEFDEY:\n\n", fcsize);<br>> }<br>> diff --git a/test/tarantool-tests/gh-6096-external-unwinding-on-arm64.test.lua b/test/tarantool-tests/gh-6096-external-unwinding-on-arm64.test.lua<br>> new file mode 100644<br>> index 00000000..cdeea441<br>> --- /dev/null<br>> +++ b/test/tarantool-tests/gh-6096-external-unwinding-on-arm64.test.lua<br><br><snipped><br><br>> --<br>> 2.37.0 (Apple Git-136)<br>><br><br>--<br>Best regards,<br>Sergey Kaplun</div></div></div></div></blockquote><div>[1]: <a href="https://faultlore.com/blah/compact-unwinding/#unwinding-tables-dwarf-cfi">https://faultlore.com/blah/compact-unwinding/#unwinding-tables-dwarf-cfi</a></div><div>[2]: <a href="https://opensource.apple.com/source/libunwind/libunwind-35.3/include/mach-o/compact_unwind_encoding.h">https://opensource.apple.com/source/libunwind/libunwind-35.3/include/mach-o/compact_unwind_encoding.h</a></div><div><div>--<br>Best regards,</div><div>Maxim Kokryashkin</div></div><div> </div></div></blockquote></BODY></HTML>