From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 74A20408A21; Mon, 24 Apr 2023 17:37:56 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 74A20408A21 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1682347076; bh=W5MLwsQb++8qc79Zt/u4tgCZc1ad5TFw7MIAGsamdTk=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=kb26kXG4fYd2yM8vLNvQbgdQRzJ7iOcBGJfuP1Aw0Fagxh3di4OxOYlBuPYEIoLyC nGIRDx0SnTnj82f8Ox+djtFlR2rl6duEaH7M8ITZsQLJByE4vl5vbm1IFJluIiMIlN ZFzLDw6/xVBS2rJA3bJ1Rr8gtrjuouMUNEdg52J0= Received: from smtp31.i.mail.ru (smtp31.i.mail.ru [95.163.41.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id B9F066ECCD for ; Mon, 24 Apr 2023 17:37:54 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org B9F066ECCD Received: by smtp31.i.mail.ru with esmtpa (envelope-from ) id 1pqxKD-005pci-HS; Mon, 24 Apr 2023 17:37:54 +0300 Date: Mon, 24 Apr 2023 14:26:58 +0000 To: Maksim Kokryashkin Message-ID: References: <20221027131807.31287-1-max.kokryashkin@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20221027131807.31287-1-max.kokryashkin@gmail.com> X-Clacks-Overhead: GNU Terry Pratchett X-Mailru-Src: smtp X-7564579A: 78E4E2B564C1792B X-77F55803: 4F1203BC0FB41BD9D2463843EA06979D4186056CE8B52E4BB197C31DE68DC085182A05F538085040FF7E96B56C2EE36F06EDB6CB0C175A70D5196BD8C8B8A6044E8135D78A5B91E0 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE77E216A0E97507353EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F79006373DBA3D7E24987517EA1F7E6F0F101C6723150C8DA25C47586E58E00D9D99D84E1BDDB23E98D2D38B6F1F7B995052D5CEAC8ED2A80FD66E4D10B167E12C31D678CC7F00164DA146DAFE8445B8C89999728AA50765F79006371442AD7DCBC27D0F389733CBF5DBD5E9C8A9BA7A39EFB766F5D81C698A659EA7CC7F00164DA146DA9985D098DBDEAEC886D40F53BA192295F6B57BC7E6449061A352F6E88A58FB86F5D81C698A659EA73AA81AA40904B5D9A18204E546F3947C893991AD1F2BFC6A2D242C3BD2E3F4C64AD6D5ED66289B52698AB9A7B718F8C46E0066C2D8992A16725E5C173C3A84C3DF404FCFBB7C2CD3BA3038C0950A5D36B5C8C57E37DE458B330BD67F2E7D9AF16D1867E19FE14079C09775C1D3CA48CF3D321E7403792E342EB15956EA79C166A417C69337E82CC275ECD9A6C639B01B78DA827A17800CE785AFBB0BB0954C1D731C566533BA786AA5CC5B56E945C8DA X-C1DE0DAB: 0D63561A33F958A57D69E5F09C43C3DAD4478FF1A92C38D889C833FA28E438D6F87CCE6106E1FC07E67D4AC08A07B9B0CE135D2742255B35CB5012B2E24CD356 X-C8649E89: 1C3962B70DF3F0ADE00A9FD3E00BEEDF77DD89D51EBB7742D3581295AF09D3DF87807E0823442EA2ED31085941D9CD0AF7F820E7B07EA4CF2E90626561A6E453DBE2A098F92F4C6566FD9247760406EAF08E9C762982B92E9C8755E3AEA661EFD6450F9A19A4D28D068DF46B853A365503AB4F2AD71F8B8B54CE6006BCE07E73 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2bioj8e+LbpfNszFOCs9F7hK3jw== X-Mailru-Sender: 2FEBA92C8E508479FE7B9A1DF348D531911D71E4C8EC28E8A01371ACC50D12ED3120E6DF4C9998C92326FE6F2A341ACE0FB9F97486540B4CD9E8847AB8CFED4D9ABF8A61C016C2CFB0DAF586E7D11B3E67EA787935ED9F1B X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH luajit v2] lldb: introduce luajit-lldb X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Igor Munkin via Tarantool-patches Reply-To: Igor Munkin Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Max, Thanks for your patch! Nicely done with isolating debugger specific helpers, I'm looking forward to merging it with luajit-gdb.py On 27.10.22, Maksim Kokryashkin wrote: > From: "m.kokryashkin" Oops, looks like you've committed the change on your local macOS. > > It is impossible to run gdb on M1 devices, the only available > debugger is lldb. The luajit-gdb extension doesn't work with > lldb, so this patch introduces the luajit-lldb extension, > which re-implements exactly the same functionality. > > It is worth noting that the extension is called > `luajit_lldb.py` instead of `luajit-lldb.py` because lldb > imports it as a Python module, and it is prohibited to use > dashes in a python module name. > > Part of tarantool/tarantool#4808 > --- > Branch: https://github.com/tarantool/luajit/tree/gh-fckxorg/luajit-lldb > > Changes in v2: > - Fixed comments as per review by Sergey > - Reorganized the `c_structs` spec. > > src/luajit_lldb.py | 1006 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 1006 insertions(+) > create mode 100644 src/luajit_lldb.py > > diff --git a/src/luajit_lldb.py b/src/luajit_lldb.py > new file mode 100644 > index 00000000..edd2884e > --- /dev/null > +++ b/src/luajit_lldb.py > @@ -0,0 +1,1006 @@ > + def __add__(self, other): > + assert isinstance(other, int) > + return self.__class__(cast(self.normal_type.__name__ + ' *', cast('uintptr_t', self.value.unsigned + other * self.value.deref.size))) > + Minor: I see no reason in such assymetry in type variety supported by __add__ and __sub__. Anyway, it doesn't affect extension usage, so feel free to ignore. > + def __sub__(self, other): > + assert isinstance(other, int) or isinstance(other, Ptr) > + if isinstance(other, int): > + return self.__add__(-other) > + else: > + return self.value.unsigned - other.value.unsigned > + > + > + Could you please add some comments to function? AFAIU, there are type detection and value cast: some spots are quite cryptic... > +def cast(typename, value): > + pointer_type = False > + name = None > + if isinstance(value, Struct) or isinstance(value, Ptr): > + value = value.value > + if isinstance(typename, type): > + name = typename.__name__ > + if name.endswith('Ptr'): > + pointer_type = True > + name = name[:-3] > + else: > + name = typename > + if name[-1] == '*': > + name = name[:-1].strip() > + pointer_type = True > + t = target.FindFirstType(name) > + if pointer_type: > + t = t.GetPointerType() > + > + if isinstance(value, int): > + if pointer_type: > + casted = target.CreateValueFromAddress('value', lldb.SBAddress(value, target), t.GetPointeeType()).address_of > + else: > + casted = target.CreateValueFromData(name = 'value', data = lldb.SBData.CreateDataFromInt(value, size=8), type=t) > + else: > + casted = value.Cast(t) > + > + if isinstance(typename, type): > + return typename(casted) > + else: > + return casted > + > +def lookup_global(name): > + global target Minor: Do we need explicitly mention as global here? > + return target.FindFirstGlobalVariable(name) > + > +def type_member(type_obj, name): > + return next((x for x in type_obj.members if x.name == name), None) > + > +def find_type(typename): > + return target.FindFirstType(typename) I believe, you can "inline" this function to their callers, if you use it only in "Debugger helpers" scope. > + > +def offsetof(typename, membername): > + type_obj = find_type(typename) > + member = type_member(type_obj, membername) > + return member.GetOffsetInBytes() AFAIU, can yield None, so the following "dereference" can fail in this case. Furthermore, what about inlining here? > + > +def sizeof(type_obj): > + return type_obj.GetByteSize() I'd rather add right to the function, like it's done in helper. > + > +def J(g): > + return cast(jit_StatePtr, vtou64(cast('char *', g)) - offsetof('GG_State', 'g') + offsetof('GG_State', 'J')) Please, split this into several lines. > + > +def strdata(obj): > + try: > + ptr = cast('char *', obj + 1) > + return ptr.summary > + except UnicodeEncodeError: > + return "" > +def frame_prevl(framelink): > + return framelink - (1 + LJ_FR2 + bc_a(vtou64(dbg_eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')))) Ouch... Could you please clarify this black voodoo magic with the comment? I don't know, whether it's possible to do it in another more clear way, so I ask only for clarification. > + > +def dump_framelink(L, fr): > + fr2 = fr + LJ_FR2 > + > + return '{fr}{padding} [ ] FRAME: [{pp}] delta={d}, {f}\n'.format( > + fr = strx64(fr), > + padding = ':{fr2: <{width}}'.format(fr2 = strx64(fr2), width=len(PADDING) - 1) if LJ_FR2 else PADDING, > + pp = 'PP' if frame_ispcall(fr2) else '{frname}{p}'.format( > + frname = frametypes(frame_type(fr2)), > + p = 'P' if frame_typep(fr2) & FRAME_P else '' > + ), > + d = fr2 - frame_prev(fr2), > + f = dump_lj_tfunc(fr), > + ) > + > +def dump_stack_slot(L, slot, base=None, top=None, eol='\n'): > + base = base or L.base > + top = top or L.top > + > + return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}{eol}'.format( > + addr = strx64(slot), > + padding = PADDING, > + B = 'B' if slot == base else ' ', > + T = 'T' if slot == top else ' ', > + M = 'M' if slot == mref(TValuePtr, L.maxstack) else ' ', > + value = dump_tvalue(slot), > + eol = eol, > + ) > + I'd rather implement via Python iterators, like Misha has done in scope of 2692f36[1]. It will definitely make maintenance easier. > +def dump_stack(L, base=None, top=None): > + base = base or L.base > + top = top or L.top > + stack = mref(TValuePtr, L.stack) > + maxstack = mref(TValuePtr, L.maxstack) > + red = 5 + 2 * LJ_FR2 > + > + dump = '\n'.join([ > + '{padding} Red zone: {nredslots: >2} slots {padding}'.format( > + padding = '-' * len(PADDING), > + nredslots = red, > + ), > + *( > + dump_stack_slot(L, maxstack + offset, base, top, '') > + for offset in range(red, 0, -1) > + ), > + '{padding} Stack: {nstackslots: >5} slots {padding}'.format( > + padding = '-' * len(PADDING), > + nstackslots = int((maxstack - stack) >> 3), > + ), > + dump_stack_slot(L, maxstack, base, top, ''), > + '{start}:{end: <{width}} [ ] {nfreeslots} slots: Free stack slots'.format( > + start = strx64(top + 1), > + end = strx64(maxstack - 1), > + width = len(PADDING) - 1, > + nfreeslots = int((maxstack - top - 8) >> 3), > + ), > + ]) + '\n' > + > + slot = top > + framelink = base - (1 + LJ_FR2) > + > + # XXX: Lua stack unwinding algorithm consists of the following steps: > + # 1. dump all data slots in the (framelink, top) interval > + # 2. check whether there are remaining frames > + # 3. if there are no slots further, stop the unwinding loop > + # 4. otherwise, resolve the next framelink and top and go to (1) > + # > + # Postcondition (i.e. do-while) loops is the most fitting idiom for such > + # case, but Python doesn't provide such lexical construction. Hence step (1) > + # is unrolled for the topmost stack frame. > + while slot > framelink + LJ_FR2: > + dump += dump_stack_slot(L, slot, base, top) > + slot -= 1 > + > + while framelink > stack: > + assert slot == framelink + LJ_FR2, "Invalid slot during frame unwind" > + dump += dump_framelink(L, framelink) > + framelink = frame_prev(framelink + LJ_FR2) - LJ_FR2 > + slot -= 1 + LJ_FR2 > + while slot > framelink + LJ_FR2: > + dump += dump_stack_slot(L, slot, base, top) > + slot -= 1 > + > + assert slot == framelink + LJ_FR2, "Invalid slot after frame unwind" > + # Skip a nil slot for the last frame for 2-slot frames. > + slot -= LJ_FR2 > + > + dump += '{fr}{padding} [S ] FRAME: dummy L'.format( > + fr = strx64(slot), > + padding = ':{nilslot: <{offset}}'.format(nilslot = strx64(slot + 1), offset=len(PADDING) - 1) if LJ_FR2 else PADDING > + ) > + > + return dump > +class LJDumpStack(Command): > + command = 'lj-stack' > + > + def execute(self, debugger, args, result): > + global target > + target = debugger.GetSelectedTarget() As we discussed I suggest to move target and field initialization to the parent constructor, like it's done in GDB script. > + l = self.parse(args) > + print('{}\n'.format(dump_stack(L(l)))) > + > + > + > + > + > +def register_commands(debugger): > + for cls in Command.__subclasses__(): > + debugger.HandleCommand( > + 'command script add --overwrite --class luajit_lldb.{cls} {command}'.format( > + cls=cls.__name__, > + command=cls.command, > + ) > + ) It would be also nice to report that the extension is loaded. Consider the output produced by GDB: | (gdb) source luajit-gdb.py | lj-arch command initialized | lj-tv command initialized | lj-str command initialized | lj-tab command initialized | lj-stack command initialized | lj-state command initialized | lj-gc command initialized | luajit-gdb.py is successfully loaded > + > +def configure(debugger): > + global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM > + target = debugger.GetSelectedTarget() > + module = target.modules[0] > + LJ_DUALNUM = module.FindSymbol('lj_lib_checknumber') != None > + > + try: > + irtype_enum = target.FindFirstType('IRType').enum_members > + for member in irtype_enum: > + if member.name == 'IRT_PTR': > + LJ_64 = member.unsigned & 0x1f == IRT_P64 > + if member.name == 'IRT_PGC': > + LJ_FR2 = LJ_GC64 = member.unsigned & 0x1f == IRT_P64 > + except: > + print('luajit-lldb.py failed to load: ' > + 'no debugging symbols found for libluajit\n') Initialization is failed, hence return is missing in the except branch. > + > + PADDING = ' ' * len(':' + hex((1 << (47 if LJ_GC64 else 32)) - 1)) > + LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] > + > + > + > +def __lldb_init_module(debugger, internal_dict): > + configure(debugger) > + register_commands(debugger) > -- > 2.37.0 (Apple Git-136) > [1]: https://github.com/tarantool/luajit/commit/2692f36 -- Best regards, IM