From: Igor Munkin via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Maksim Kokryashkin <max.kokryashkin@gmail.com> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit v2] lldb: introduce luajit-lldb Date: Mon, 24 Apr 2023 14:26:58 +0000 [thread overview] Message-ID: <ZEaROhG2en/vXHCl@tarantool.org> (raw) In-Reply-To: <20221027131807.31287-1-max.kokryashkin@gmail.com> Max, Thanks for your patch! Nicely done with isolating debugger specific helpers, I'm looking forward to merging it with luajit-gdb.py On 27.10.22, Maksim Kokryashkin wrote: > From: "m.kokryashkin" <m.kokryashkin@m-kokryashkin.local> Oops, looks like you've committed the change on your local macOS. > > It is impossible to run gdb on M1 devices, the only available > debugger is lldb. The luajit-gdb extension doesn't work with > lldb, so this patch introduces the luajit-lldb extension, > which re-implements exactly the same functionality. > > It is worth noting that the extension is called > `luajit_lldb.py` instead of `luajit-lldb.py` because lldb > imports it as a Python module, and it is prohibited to use > dashes in a python module name. > > Part of tarantool/tarantool#4808 > --- > Branch: https://github.com/tarantool/luajit/tree/gh-fckxorg/luajit-lldb > > Changes in v2: > - Fixed comments as per review by Sergey > - Reorganized the `c_structs` spec. > > src/luajit_lldb.py | 1006 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 1006 insertions(+) > create mode 100644 src/luajit_lldb.py > > diff --git a/src/luajit_lldb.py b/src/luajit_lldb.py > new file mode 100644 > index 00000000..edd2884e > --- /dev/null > +++ b/src/luajit_lldb.py > @@ -0,0 +1,1006 @@ <snipped> > + def __add__(self, other): > + assert isinstance(other, int) > + return self.__class__(cast(self.normal_type.__name__ + ' *', cast('uintptr_t', self.value.unsigned + other * self.value.deref.size))) > + Minor: I see no reason in such assymetry in type variety supported by __add__ and __sub__. Anyway, it doesn't affect extension usage, so feel free to ignore. > + def __sub__(self, other): > + assert isinstance(other, int) or isinstance(other, Ptr) > + if isinstance(other, int): > + return self.__add__(-other) > + else: > + return self.value.unsigned - other.value.unsigned > + > + <snipped> > + Could you please add some comments to <cast> function? AFAIU, there are type detection and value cast: some spots are quite cryptic... > +def cast(typename, value): > + pointer_type = False > + name = None > + if isinstance(value, Struct) or isinstance(value, Ptr): > + value = value.value > + if isinstance(typename, type): > + name = typename.__name__ > + if name.endswith('Ptr'): > + pointer_type = True > + name = name[:-3] > + else: > + name = typename > + if name[-1] == '*': > + name = name[:-1].strip() > + pointer_type = True > + t = target.FindFirstType(name) > + if pointer_type: > + t = t.GetPointerType() > + > + if isinstance(value, int): > + if pointer_type: > + casted = target.CreateValueFromAddress('value', lldb.SBAddress(value, target), t.GetPointeeType()).address_of > + else: > + casted = target.CreateValueFromData(name = 'value', data = lldb.SBData.CreateDataFromInt(value, size=8), type=t) > + else: > + casted = value.Cast(t) > + > + if isinstance(typename, type): > + return typename(casted) > + else: > + return casted > + > +def lookup_global(name): > + global target Minor: Do we need explicitly mention <target> as global here? > + return target.FindFirstGlobalVariable(name) > + > +def type_member(type_obj, name): > + return next((x for x in type_obj.members if x.name == name), None) > + > +def find_type(typename): > + return target.FindFirstType(typename) I believe, you can "inline" this function to their callers, if you use it only in "Debugger helpers" scope. > + > +def offsetof(typename, membername): > + type_obj = find_type(typename) > + member = type_member(type_obj, membername) > + return member.GetOffsetInBytes() AFAIU, <type_member> can yield None, so the following "dereference" can fail in this case. Furthermore, what about inlining <type_member> here? > + > +def sizeof(type_obj): > + return type_obj.GetByteSize() I'd rather add <find_type> right to the function, like it's done in <offsetof> helper. > + <snipped> > +def J(g): > + return cast(jit_StatePtr, vtou64(cast('char *', g)) - offsetof('GG_State', 'g') + offsetof('GG_State', 'J')) Please, split this into several lines. > + <snipped> > +def strdata(obj): > + try: > + ptr = cast('char *', obj + 1) > + return ptr.summary > + except UnicodeEncodeError: > + return "<luajit-lldb: error occured while rendering non-ascii slot>" <snipped> > +def frame_prevl(framelink): > + return framelink - (1 + LJ_FR2 + bc_a(vtou64(dbg_eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')))) Ouch... Could you please clarify this black voodoo magic with the comment? I don't know, whether it's possible to do it in another more clear way, so I ask only for clarification. > + <snipped> > +def dump_framelink(L, fr): > + fr2 = fr + LJ_FR2 > + > + return '{fr}{padding} [ ] FRAME: [{pp}] delta={d}, {f}\n'.format( > + fr = strx64(fr), > + padding = ':{fr2: <{width}}'.format(fr2 = strx64(fr2), width=len(PADDING) - 1) if LJ_FR2 else PADDING, > + pp = 'PP' if frame_ispcall(fr2) else '{frname}{p}'.format( > + frname = frametypes(frame_type(fr2)), > + p = 'P' if frame_typep(fr2) & FRAME_P else '' > + ), > + d = fr2 - frame_prev(fr2), > + f = dump_lj_tfunc(fr), > + ) > + > +def dump_stack_slot(L, slot, base=None, top=None, eol='\n'): > + base = base or L.base > + top = top or L.top > + > + return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}{eol}'.format( > + addr = strx64(slot), > + padding = PADDING, > + B = 'B' if slot == base else ' ', > + T = 'T' if slot == top else ' ', > + M = 'M' if slot == mref(TValuePtr, L.maxstack) else ' ', > + value = dump_tvalue(slot), > + eol = eol, > + ) > + I'd rather implement <dump_stack> via Python iterators, like Misha has done in scope of 2692f36[1]. It will definitely make maintenance easier. > +def dump_stack(L, base=None, top=None): > + base = base or L.base > + top = top or L.top > + stack = mref(TValuePtr, L.stack) > + maxstack = mref(TValuePtr, L.maxstack) > + red = 5 + 2 * LJ_FR2 > + > + dump = '\n'.join([ > + '{padding} Red zone: {nredslots: >2} slots {padding}'.format( > + padding = '-' * len(PADDING), > + nredslots = red, > + ), > + *( > + dump_stack_slot(L, maxstack + offset, base, top, '') > + for offset in range(red, 0, -1) > + ), > + '{padding} Stack: {nstackslots: >5} slots {padding}'.format( > + padding = '-' * len(PADDING), > + nstackslots = int((maxstack - stack) >> 3), > + ), > + dump_stack_slot(L, maxstack, base, top, ''), > + '{start}:{end: <{width}} [ ] {nfreeslots} slots: Free stack slots'.format( > + start = strx64(top + 1), > + end = strx64(maxstack - 1), > + width = len(PADDING) - 1, > + nfreeslots = int((maxstack - top - 8) >> 3), > + ), > + ]) + '\n' > + > + slot = top > + framelink = base - (1 + LJ_FR2) > + > + # XXX: Lua stack unwinding algorithm consists of the following steps: > + # 1. dump all data slots in the (framelink, top) interval > + # 2. check whether there are remaining frames > + # 3. if there are no slots further, stop the unwinding loop > + # 4. otherwise, resolve the next framelink and top and go to (1) > + # > + # Postcondition (i.e. do-while) loops is the most fitting idiom for such > + # case, but Python doesn't provide such lexical construction. Hence step (1) > + # is unrolled for the topmost stack frame. > + while slot > framelink + LJ_FR2: > + dump += dump_stack_slot(L, slot, base, top) > + slot -= 1 > + > + while framelink > stack: > + assert slot == framelink + LJ_FR2, "Invalid slot during frame unwind" > + dump += dump_framelink(L, framelink) > + framelink = frame_prev(framelink + LJ_FR2) - LJ_FR2 > + slot -= 1 + LJ_FR2 > + while slot > framelink + LJ_FR2: > + dump += dump_stack_slot(L, slot, base, top) > + slot -= 1 > + > + assert slot == framelink + LJ_FR2, "Invalid slot after frame unwind" > + # Skip a nil slot for the last frame for 2-slot frames. > + slot -= LJ_FR2 > + > + dump += '{fr}{padding} [S ] FRAME: dummy L'.format( > + fr = strx64(slot), > + padding = ':{nilslot: <{offset}}'.format(nilslot = strx64(slot + 1), offset=len(PADDING) - 1) if LJ_FR2 else PADDING > + ) > + > + return dump <snipped> > +class LJDumpStack(Command): <snipped> > + command = 'lj-stack' > + > + def execute(self, debugger, args, result): > + global target > + target = debugger.GetSelectedTarget() As we discussed I suggest to move target and <command> field initialization to the parent constructor, like it's done in GDB script. > + l = self.parse(args) > + print('{}\n'.format(dump_stack(L(l)))) > + > + > + > + > + > +def register_commands(debugger): > + for cls in Command.__subclasses__(): > + debugger.HandleCommand( > + 'command script add --overwrite --class luajit_lldb.{cls} {command}'.format( > + cls=cls.__name__, > + command=cls.command, > + ) > + ) It would be also nice to report that the extension is loaded. Consider the output produced by GDB: | (gdb) source luajit-gdb.py | lj-arch command initialized | lj-tv command initialized | lj-str command initialized | lj-tab command initialized | lj-stack command initialized | lj-state command initialized | lj-gc command initialized | luajit-gdb.py is successfully loaded > + > +def configure(debugger): > + global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM > + target = debugger.GetSelectedTarget() > + module = target.modules[0] > + LJ_DUALNUM = module.FindSymbol('lj_lib_checknumber') != None > + > + try: > + irtype_enum = target.FindFirstType('IRType').enum_members > + for member in irtype_enum: > + if member.name == 'IRT_PTR': > + LJ_64 = member.unsigned & 0x1f == IRT_P64 > + if member.name == 'IRT_PGC': > + LJ_FR2 = LJ_GC64 = member.unsigned & 0x1f == IRT_P64 > + except: > + print('luajit-lldb.py failed to load: ' > + 'no debugging symbols found for libluajit\n') Initialization is failed, hence return is missing in the except branch. > + > + PADDING = ' ' * len(':' + hex((1 << (47 if LJ_GC64 else 32)) - 1)) > + LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] > + > + > + > +def __lldb_init_module(debugger, internal_dict): > + configure(debugger) > + register_commands(debugger) > -- > 2.37.0 (Apple Git-136) > [1]: https://github.com/tarantool/luajit/commit/2692f36 -- Best regards, IM
parent reply other threads:[~2023-04-24 14:37 UTC|newest] Thread overview: expand[flat|nested] mbox.gz Atom feed [parent not found: <20221027131807.31287-1-max.kokryashkin@gmail.com>]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=ZEaROhG2en/vXHCl@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=imun@tarantool.org \ --cc=max.kokryashkin@gmail.com \ --subject='Re: [Tarantool-patches] [PATCH luajit v2] lldb: introduce luajit-lldb' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox