[Tarantool-patches] [PATCH luajit v1] tools: fix luajit-gdb stack dump
Igor Munkin
imun at tarantool.org
Tue Jul 20 15:29:52 MSK 2021
Sergey,
Thanks for the patch! Nice catch, actually! I wonder, how much impact
this bug made while digging the crash reports. Anyway, LGTM, but
consider several nits below.
On 18.06.21, Sergey Kaplun wrote:
> When there is only one frame (so-called dummy frame) with a dummy thread
> on stack bottom at L->base - 1 - LJ_FR2, stack dump does not show stack
> slots from bottom to top, because the frame link is already pointed to
> stack bottom (dummy thread mentioned above). Its output looks like the
Too much information in a single sentence. I consider the following info
vital to understand the patch.
* Dummy frame is the particular and kinda "initial" coroutine state,
when the framelink slot (i.e. L->base - (1 + LJ_FR2)) is the most
bottom slot of the guest stack (i.e. L->stack).
* Since coroutine stack unwinding is implemented via precondition loop,
no slots are dumped from L->top to L->base in case when only dummy
frame is on the stack.
* Python is a crap language, and seems like postcondition loop violates
The Zen (this is too hard for users, isn't it?), so Guido decided not
to introduce such construction in the language at all.
#ifdef offtop
I dare to guess, it was just too hard to distinguish precondition
from postcondition in "colon&spaces" paradigm; glad Larry Wall
decided to use braces for blocks, and Roberto just left Pascalish
syntax for this. I'm upset with the GDB guys choice every time we're
back to developing this extension.
#endif
comment.yield()
> following:
>
> | 0x7fb512ac40:0x7fb512ac70 [ ] 7 slots: Red zone
> | 0x7fb512ac38 [ M]
> | 0x7fb512ab28:0x7fb512ac30 [ ] 34 slots: Free stack slots
> | 0x7fb512ab20 [ T ]
> | 0x7fb512ab08:0x7fb512ab10 [S ] FRAME: dummy L
Minor: I guess, you can align the output. At least I see no reason not
to do this.
>
comment.resume()
Hence, there is no other option except unroll the first iteration and
reorder stack slots dump with moving framelinks downwards to the
coroutine stack bottom.
> This patch unrolled first loop iteration. Stack dump first inspect all
> fields for the top frame and then continue if there are other frames.
> After patch the output looks like the following:
>
> | 0x7fb512ac40:0x7fb512ac70 [ ] 7 slots: Red zone
> | 0x7fb512ac38 [ M]
> | 0x7fb512ab28:0x7fb512ac30 [ ] 34 slots: Free stack slots
> | 0x7fb512ab20 [ T ]
> | 0x7fb512ab18 [ ] VALUE: string 0 "/tmp/net_box.lua:6: err in ser" @ 0x7fb512ade8
> | 0x7fb512ab10 [ B ] VALUE: table @ 0x7fb512ac80 (asize: 0, hmask: 0x0)
> | 0x7fb512ab00:0x7fb512ab08 [S ] FRAME: dummy L
Feel free to align the output here too.
> ---
>
> Branch: https://github.com/tarantool/luajit/tree/skaplun/fix-luajit-gdb
>
> Steps to test behaviour:
> 0) Clone branch
> 1) Build tarantool with GC64 or not (I've tested both cases)
> 2) Run the following script </tmp/net_box.lua> by Tarantool via gdb.
>
> | $ cat /tmp/netbox.lua
> | box.cfg{log_level = 50, listen = "127.0.0.1:3802"}
> | box.schema.user.grant('guest','read,write,execute','universe', nil, {if_not_exists=true})
> | local net_box = require"net.box"
> | local cn = net_box:connect("127.0.0.1:3802")
> | local function f()
> | return setmetatable({}, {__serialize = function() error[[err in ser]] end})
> | end
> | _G.f = f
> | local r, st = cn:call("f")
> | print(r,st)
>
> | $ gdb --args ./tarantool /tmp/netbox.lua
> | ...
> | (gdb) source ~/builds_workspace/luajit/fix-luajit-gdb/src/luajit-gdb.py
> | ...
>
> 3) Set breakpoint to ./src/lua/msgpack.c:234, and run
>
> | (gdb) b ./src/lua/msgpack.c:234
>
> I'm not sure about your sources version, so you can just set breakpoin to
> mp_encode(), run and continue (need two entries). We are interested in
> the line #234:
>
> | 233|.......if (luaL_tofield(L, cfg, opts, lua_gettop(L), &field) < 0)
> | 234|.......|.......return luaT_error(L);
>
> And dump the stack:
>
> | (gdb) lj-stack L
> | 0x7fb512ac40:0x7fb512ac70 [ ] 7 slots: Red zone
> | 0x7fb512ac38 [ M]
> | 0x7fb512ab28:0x7fb512ac30 [ ] 34 slots: Free stack slots
> | 0x7fb512ab20 [ T ]
> | 0x7fb512ab18 [ ] VALUE: string 0 "/tmp/net_box.lua:6: err in ser" @ 0x7fb512ade8
> | 0x7fb512ab10 [ B ] VALUE: table @ 0x7fb512ac80 (asize: 0, hmask: 0x0)
> | 0x7fb512ab00:0x7fb512ab08 [S ] FRAME: dummy L
>
> src/luajit-gdb.py | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
> index f1fd6230..a3dad585 100644
> --- a/src/luajit-gdb.py
> +++ b/src/luajit-gdb.py
> @@ -424,16 +424,26 @@ def dump_stack(L, base=None, top=None):
> )
> ]) + '\n'
>
> + # Set framelink for the top frame. Dump frame slots even for dummy frame.
It would be also great to drop a few words why the iteration is unrolled
(the lack of postcondition loops in Python) via XXX comment.
> slot = top - 1
> framelink = base - (1 + LJ_FR2)
I suggest to split this part into two blocks via blank line:
* variable initialization
* unrolled iteration
> + while slot > framelink + LJ_FR2:
> + dump += dump_stack_slot(L, slot, base, top)
> + slot -= 1
>
> + # If there are other frames -- continue.
Again, this is not the particular reason. As for me the main reason you
can't implement stack unwinding via do-while construction.
> while framelink > mref('TValue *', L['stack']):
> - while slot > framelink + LJ_FR2:
> - dump += dump_stack_slot(L, slot, base, top)
> - slot -= 1
> + assert slot == framelink + LJ_FR2, "Invalid slot during frame unwind"
Please, use parentheses for assert call.
> dump += dump_framelink(L, framelink)
> framelink = frame_prev(framelink + LJ_FR2) - LJ_FR2
> slot -= 1 + LJ_FR2
> + while slot > framelink + LJ_FR2:
> + dump += dump_stack_slot(L, slot, base, top)
> + slot -= 1
> +
> + assert slot == framelink + LJ_FR2, "Invalid slot after frame unwind"
Ditto.
> + # Skip nil slot for the last frame for 2-slot frames.
> + slot -= LJ_FR2
>
> dump += '{fr}{padding} [S ] FRAME: dummy L'.format(
> fr = slot,
> --
> 2.31.0
>
--
Best regards,
IM
More information about the Tarantool-patches
mailing list