* [Tarantool-patches] [PATCH luajit v6 0/2] debug: generalized extension
From: Maxim Kokryashkin via Tarantool-patches @ 2024-04-03 22:21 UTC
To: tarantool-patches, skaplun, sergeyb

Changes in v6:
- Rebased onto the CTest machinery.

Branch: https://github.com/tarantool/luajit/tree/fckxorg/generalized-debugger

Maksim Kokryashkin (1):
  test: add tests for debugging extensions

Maxim Kokryashkin (1):
  debug: generalized extension

 .flake8rc                             |   4 +
 src/luajit-gdb.py                     | 885 ------------------
 src/{luajit_lldb.py => luajit_dbg.py} | 616 ++++++++----
 test/CMakeLists.txt                   |   1 +
 .../CMakeLists.txt                    |  80 ++
 .../debug-extension-tests.py          | 250 +++++
 6 files changed, 751 insertions(+), 1085 deletions(-)
 delete mode 100644 src/luajit-gdb.py
 rename src/{luajit_lldb.py => luajit_dbg.py} (63%)
 create mode 100644 test/LuaJIT-debug-extensions-tests/CMakeLists.txt
 create mode 100644 test/LuaJIT-debug-extensions-tests/debug-extension-tests.py

--
2.44.0

* [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension
From: Maxim Kokryashkin via Tarantool-patches @ 2024-04-03 22:21 UTC
To: tarantool-patches, skaplun, sergeyb

This patch joins the LLDB and GDB LuaJIT debugging extensions into one,
so now the extension logic can be debugger-agnostic. To do that, an
adapter class is introduced, and all of the debugger-specific behavior
is encapsulated there. The extension auto-detects the debugger it was
loaded into and selects the correct low-level logic implementation.
---
 src/luajit-gdb.py                     | 885 --------------------------
 src/{luajit_lldb.py => luajit_dbg.py} | 616 ++++++++++++------
 2 files changed, 416 insertions(+), 1085 deletions(-)
 delete mode 100644 src/luajit-gdb.py
 rename src/{luajit_lldb.py => luajit_dbg.py} (63%)

diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
deleted file mode 100644
index d2070e9b..00000000
--- a/src/luajit-gdb.py
+++ /dev/null
@@ -1,885 +0,0 @@
-# GDB extension for LuaJIT post-mortem analysis.
-# To use, just put 'source <path-to-repo>/src/luajit-gdb.py' in gdb.
- -import re -import gdb -import sys - -# make script compatible with the ancient Python {{{ - - -LEGACY = re.match(r'^2\.', sys.version) - -if LEGACY: - CONNECTED = False - int = long - range = xrange - - -# }}} - - -gtype_cache = {} - - -def gtype(typestr): - global gtype_cache - if typestr in gtype_cache: - return gtype_cache[typestr] - - m = re.match(r'((?:(?:struct|union) )?\S*)\s*[*]', typestr) - - gtype = gdb.lookup_type(typestr) if m is None \ - else gdb.lookup_type(m.group(1)).pointer() - - gtype_cache[typestr] = gtype - return gtype - - -def cast(typestr, val): - return gdb.Value(val).cast(gtype(typestr)) - - -def lookup(symbol): - variable, _ = gdb.lookup_symbol(symbol) - return variable.value() if variable else None - - -def parse_arg(arg): - if not arg: - return None - - ret = gdb.parse_and_eval(arg) - - if not ret: - raise gdb.GdbError('table argument empty') - - return ret - - -def tou64(val): - return cast('uint64_t', val) & 0xFFFFFFFFFFFFFFFF - - -def tou32(val): - return cast('uint32_t', val) & 0xFFFFFFFF - - -def i2notu32(val): - return ~int(val) & 0xFFFFFFFF - - -def strx64(val): - return re.sub('L?$', '', - hex(int(cast('uint64_t', val) & 0xFFFFFFFFFFFFFFFF))) - - -# Types {{{ - - -LJ_T = { - 'NIL': i2notu32(0), - 'FALSE': i2notu32(1), - 'TRUE': i2notu32(2), - 'LIGHTUD': i2notu32(3), - 'STR': i2notu32(4), - 'UPVAL': i2notu32(5), - 'THREAD': i2notu32(6), - 'PROTO': i2notu32(7), - 'FUNC': i2notu32(8), - 'TRACE': i2notu32(9), - 'CDATA': i2notu32(10), - 'TAB': i2notu32(11), - 'UDATA': i2notu32(12), - 'NUMX': i2notu32(13), -} - - -def typenames(value): - return { - LJ_T[k]: 'LJ_T' + k for k in LJ_T.keys() - }.get(int(value), 'LJ_TINVALID') - - -# }}} - -# Frames {{{ - - -FRAME_TYPE = 0x3 -FRAME_P = 0x4 -FRAME_TYPEP = FRAME_TYPE | FRAME_P - -FRAME = { - 'LUA': 0x0, - 'C': 0x1, - 'CONT': 0x2, - 'VARG': 0x3, - 'LUAP': 0x4, - 'CP': 0x5, - 'PCALL': 0x6, - 'PCALLH': 0x7, -} - - -def frametypes(ft): - return { - FRAME['LUA']: 'L', - FRAME['C']: 'C', - 
FRAME['CONT']: 'M', - FRAME['VARG']: 'V', - }.get(ft, '?') - - -def bc_a(ins): - return (ins >> 8) & 0xff - - -def frame_ftsz(framelink): - return cast('ptrdiff_t', framelink['ftsz'] if LJ_FR2 - else framelink['fr']['tp']['ftsz']) - - -def frame_pc(framelink): - return cast('BCIns *', frame_ftsz(framelink)) if LJ_FR2 \ - else mref('BCIns *', framelink['fr']['tp']['pcr']) - - -def frame_prevl(framelink): - return framelink - (1 + LJ_FR2 + bc_a(frame_pc(framelink)[-1])) - - -def frame_ispcall(framelink): - return (frame_ftsz(framelink) & FRAME['PCALL']) == FRAME['PCALL'] - - -def frame_sized(framelink): - return (frame_ftsz(framelink) & ~FRAME_TYPEP) - - -def frame_prevd(framelink): - return cast('TValue *', cast('char *', framelink) - frame_sized(framelink)) - - -def frame_type(framelink): - return frame_ftsz(framelink) & FRAME_TYPE - - -def frame_typep(framelink): - return frame_ftsz(framelink) & FRAME_TYPEP - - -def frame_islua(framelink): - return frametypes(int(frame_type(framelink))) == 'L' \ - and int(frame_ftsz(framelink)) > 0 - - -def frame_prev(framelink): - return frame_prevl(framelink) if frame_islua(framelink) \ - else frame_prevd(framelink) - - -def frame_sentinel(L): - return mref('TValue *', L['stack']) + LJ_FR2 - - -# }}} - -# Const {{{ - - -LJ_64 = None -LJ_GC64 = None -LJ_FR2 = None -LJ_DUALNUM = None - -LJ_GCVMASK = ((1 << 47) - 1) -LJ_TISNUM = None -PADDING = None - -# These constants are meaningful only for 'LJ_64' mode. 
-LJ_LIGHTUD_BITS_SEG = 8 -LJ_LIGHTUD_BITS_LO = 47 - LJ_LIGHTUD_BITS_SEG -LIGHTUD_SEG_MASK = (1 << LJ_LIGHTUD_BITS_SEG) - 1 -LIGHTUD_LO_MASK = (1 << LJ_LIGHTUD_BITS_LO) - 1 - - -# }}} - - -def itype(o): - return cast('uint32_t', o['it64'] >> 47) if LJ_GC64 else o['it'] - - -def mref(typename, obj): - return cast(typename, obj['ptr64'] if LJ_GC64 else obj['ptr32']) - - -def gcref(obj): - return cast('GCobj *', obj['gcptr64'] if LJ_GC64 - else cast('uintptr_t', obj['gcptr32'])) - - -def gcval(obj): - return cast('GCobj *', obj['gcptr64'] & LJ_GCVMASK if LJ_GC64 - else cast('uintptr_t', obj['gcptr32'])) - - -def gcnext(obj): - return gcref(obj)['gch']['nextgc'] - - -def L(L=None): - # lookup a symbol for the main coroutine considering the host app - # XXX Fragile: though the loop initialization looks like a crap but it - # respects both Python 2 and Python 3. - for lstate in [L] + list(map(lambda main: lookup(main), ( - # LuaJIT main coro (see luajit/src/luajit.c) - 'globalL', - # Tarantool main coro (see tarantool/src/lua/init.h) - 'tarantool_L', - # TODO: Add more - ))): - if lstate: - return cast('lua_State *', lstate) - - -def G(L): - return mref('global_State *', L['glref']) - - -def J(g): - typeGG = gtype('GG_State') - - return cast('jit_State *', int(cast('char *', g)) - - int(typeGG['g'].bitpos / 8) - + int(typeGG['J'].bitpos / 8)) - - -def vm_state(g): - return { - i2notu32(0): 'INTERP', - i2notu32(1): 'LFUNC', - i2notu32(2): 'FFUNC', - i2notu32(3): 'CFUNC', - i2notu32(4): 'GC', - i2notu32(5): 'EXIT', - i2notu32(6): 'RECORD', - i2notu32(7): 'OPT', - i2notu32(8): 'ASM', - }.get(int(tou32(g['vmstate'])), 'TRACE') - - -def gc_state(g): - return { - 0: 'PAUSE', - 1: 'PROPAGATE', - 2: 'ATOMIC', - 3: 'SWEEPSTRING', - 4: 'SWEEP', - 5: 'FINALIZE', - 6: 'LAST', - }.get(int(g['gc']['state']), 'INVALID') - - -def jit_state(g): - return { - 0: 'IDLE', - 0x10: 'ACTIVE', - 0x11: 'RECORD', - 0x12: 'START', - 0x13: 'END', - 0x14: 'ASM', - 0x15: 'ERR', - 
}.get(int(J(g)['state']), 'INVALID') - - -def tvisint(o): - return LJ_DUALNUM and itype(o) == LJ_TISNUM - - -def tvisnumber(o): - return itype(o) <= LJ_TISNUM - - -def tvislightud(o): - if LJ_64 and not LJ_GC64: - return (cast('int32_t', itype(o)) >> 15) == -2 - else: - return itype(o) == LJ_T['LIGHTUD'] - - -def strdata(obj): - # String is printed with pointer to it, thanks to gdb. Just strip it. - try: - return str(cast('char *', cast('GCstr *', obj) + 1))[len(PADDING):] - except UnicodeEncodeError: - return "<luajit-gdb: error occurred while rendering non-ascii slot>" - - -def itypemap(o): - if LJ_64 and not LJ_GC64: - return LJ_T['NUMX'] if tvisnumber(o) \ - else LJ_T['LIGHTUD'] if tvislightud(o) \ - else itype(o) - else: - return LJ_T['NUMX'] if tvisnumber(o) else itype(o) - - -def funcproto(func): - assert func['ffid'] == 0 - - return cast('GCproto *', - mref('char *', func['pc']) - gdb.lookup_type('GCproto').sizeof) - - -def gclistlen(root, end=0x0): - count = 0 - while (gcref(root) != end): - count += 1 - root = gcnext(root) - return count - - -def gcringlen(root): - if not gcref(root): - return 0 - elif gcref(root) == gcref(gcnext(root)): - return 1 - else: - return 1 + gclistlen(gcnext(root), gcref(root)) - - -gclen = { - 'root': gclistlen, - 'gray': gclistlen, - 'grayagain': gclistlen, - 'weak': gclistlen, - # XXX: gc.mmudata is a ring-list. - 'mmudata': gcringlen, -} - - -# The generator that implements frame iterator. -# Every frame is represented as a tuple of framelink and frametop. -def frames(L): - frametop = L['top'] - framelink = L['base'] - 1 - framelink_sentinel = frame_sentinel(L) - while True: - yield framelink, frametop - frametop = framelink - (1 + LJ_FR2) - if framelink <= framelink_sentinel: - break - framelink = frame_prev(framelink) - - -def lightudV(tv): - if LJ_64: - u = int(tv['u64']) - # lightudseg macro expanded. 
- seg = (u >> LJ_LIGHTUD_BITS_LO) & LIGHTUD_SEG_MASK - segmap = mref('uint32_t *', G(L(None))['gc']['lightudseg']) - # lightudlo macro expanded. - return (int(segmap[seg]) << 32) | (u & LIGHTUD_LO_MASK) - else: - return gcval(tv['gcr']) - - -# Dumpers {{{ - - -def dump_lj_tnil(tv): - return 'nil' - - -def dump_lj_tfalse(tv): - return 'false' - - -def dump_lj_ttrue(tv): - return 'true' - - -def dump_lj_tlightud(tv): - return 'light userdata @ {}'.format(strx64(lightudV(tv))) - - -def dump_lj_tstr(tv): - return 'string {body} @ {address}'.format( - body=strdata(gcval(tv['gcr'])), - address=strx64(gcval(tv['gcr'])) - ) - - -def dump_lj_tupval(tv): - return 'upvalue @ {}'.format(strx64(gcval(tv['gcr']))) - - -def dump_lj_tthread(tv): - return 'thread @ {}'.format(strx64(gcval(tv['gcr']))) - - -def dump_lj_tproto(tv): - return 'proto @ {}'.format(strx64(gcval(tv['gcr']))) - - -def dump_lj_tfunc(tv): - func = cast('struct GCfuncC *', gcval(tv['gcr'])) - ffid = func['ffid'] - - if ffid == 0: - pt = funcproto(func) - return 'Lua function @ {addr}, {nups} upvalues, {chunk}:{line}'.format( - addr=strx64(func), - nups=int(func['nupvalues']), - chunk=strdata(cast('GCstr *', gcval(pt['chunkname']))), - line=pt['firstline'] - ) - elif ffid == 1: - return 'C function @ {}'.format(strx64(func['f'])) - else: - return 'fast function #{}'.format(int(ffid)) - - -def dump_lj_ttrace(tv): - trace = cast('struct GCtrace *', gcval(tv['gcr'])) - return 'trace {traceno} @ {addr}'.format( - traceno=strx64(trace['traceno']), - addr=strx64(trace) - ) - - -def dump_lj_tcdata(tv): - return 'cdata @ {}'.format(strx64(gcval(tv['gcr']))) - - -def dump_lj_ttab(tv): - table = cast('GCtab *', gcval(tv['gcr'])) - return 'table @ {gcr} (asize: {asize}, hmask: {hmask})'.format( - gcr=strx64(table), - asize=table['asize'], - hmask=strx64(table['hmask']), - ) - - -def dump_lj_tudata(tv): - return 'userdata @ {}'.format(strx64(gcval(tv['gcr']))) - - -def dump_lj_tnumx(tv): - if tvisint(tv): - return 'integer 
{}'.format(cast('int32_t', tv['i'])) - else: - return 'number {}'.format(cast('double', tv['n'])) - - -def dump_lj_invalid(tv): - return 'not valid type @ {}'.format(strx64(gcval(tv['gcr']))) - - -# }}} - - -dumpers = { - 'LJ_TNIL': dump_lj_tnil, - 'LJ_TFALSE': dump_lj_tfalse, - 'LJ_TTRUE': dump_lj_ttrue, - 'LJ_TLIGHTUD': dump_lj_tlightud, - 'LJ_TSTR': dump_lj_tstr, - 'LJ_TUPVAL': dump_lj_tupval, - 'LJ_TTHREAD': dump_lj_tthread, - 'LJ_TPROTO': dump_lj_tproto, - 'LJ_TFUNC': dump_lj_tfunc, - 'LJ_TTRACE': dump_lj_ttrace, - 'LJ_TCDATA': dump_lj_tcdata, - 'LJ_TTAB': dump_lj_ttab, - 'LJ_TUDATA': dump_lj_tudata, - 'LJ_TNUMX': dump_lj_tnumx, -} - - -def dump_tvalue(tvalue): - return dumpers.get(typenames(itypemap(tvalue)), dump_lj_invalid)(tvalue) - - -def dump_framelink_slot_address(fr): - return '{}:{}'.format(fr - 1, fr) if LJ_FR2 \ - else '{}'.format(fr) + PADDING - - -def dump_framelink(L, fr): - if fr == frame_sentinel(L): - return '{addr} [S ] FRAME: dummy L'.format( - addr=dump_framelink_slot_address(fr), - ) - return '{addr} [ ] FRAME: [{pp}] delta={d}, {f}'.format( - addr=dump_framelink_slot_address(fr), - pp='PP' if frame_ispcall(fr) else '{frname}{p}'.format( - frname=frametypes(int(frame_type(fr))), - p='P' if frame_typep(fr) & FRAME_P else '' - ), - d=cast('TValue *', fr) - cast('TValue *', frame_prev(fr)), - f=dump_lj_tfunc(fr - LJ_FR2), - ) - - -def dump_stack_slot(L, slot, base=None, top=None): - base = base or L['base'] - top = top or L['top'] - - return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}'.format( - addr=strx64(slot), - padding=PADDING, - B='B' if slot == base else ' ', - T='T' if slot == top else ' ', - M='M' if slot == mref('TValue *', L['maxstack']) else ' ', - value=dump_tvalue(slot), - ) - - -def dump_stack(L, base=None, top=None): - base = base or L['base'] - top = top or L['top'] - stack = mref('TValue *', L['stack']) - maxstack = mref('TValue *', L['maxstack']) - red = 5 + 2 * LJ_FR2 - - dump = [ - '{padding} Red zone: {nredslots: >2} 
slots {padding}'.format( - padding='-' * len(PADDING), - nredslots=red, - ), - ] - dump.extend([ - dump_stack_slot(L, maxstack + offset, base, top) - for offset in range(red, 0, -1) # noqa: E131 - ]) - dump.extend([ - '{padding} Stack: {nstackslots: >5} slots {padding}'.format( - padding='-' * len(PADDING), - nstackslots=int((tou64(maxstack) - tou64(stack)) >> 3), - ), - dump_stack_slot(L, maxstack, base, top), - '{start}:{end} [ ] {nfreeslots} slots: Free stack slots'.format( - start=strx64(top + 1), - end=strx64(maxstack - 1), - nfreeslots=int((tou64(maxstack) - tou64(top) - 8) >> 3), - ), - ]) - - for framelink, frametop in frames(L): - # Dump all data slots in the (framelink, top) interval. - dump.extend([ - dump_stack_slot(L, framelink + offset, base, top) - for offset in range(frametop - framelink, 0, -1) # noqa: E131 - ]) - # Dump frame slot (2 slots in case of GC64). - dump.append(dump_framelink(L, framelink)) - - return '\n'.join(dump) - - -def dump_gc(g): - gc = g['gc'] - stats = ['{key}: {value}'.format(key=f, value=gc[f]) for f in ( - 'total', 'threshold', 'debt', 'estimate', 'stepmul', 'pause' - )] - - stats += ['sweepstr: {sweepstr}/{strmask}'.format( - sweepstr=gc['sweepstr'], - # String hash mask (size of hash table - 1). - strmask=g['strmask'] + 1, - )] - - stats += ['{key}: {number} objects'.format( - key=stat, - number=handler(gc[stat]) - ) for stat, handler in gclen.items()] - - return '\n'.join(map(lambda s: '\t' + s, stats)) - - -class LJBase(gdb.Command): - - def __init__(self, name): - # XXX Fragile: though the command initialization looks like a crap but - # it respects both Python 2 and Python 3. - gdb.Command.__init__(self, name, gdb.COMMAND_DATA) - gdb.write('{} command initialized\n'.format(name)) - - -class LJDumpArch(LJBase): - ''' -lj-arch - -The command requires no args and dumps values of LJ_64 and LJ_GC64 -compile-time flags. These values define the sizes of host and GC -pointers respectively. 
- ''' - - def invoke(self, arg, from_tty): - gdb.write( - 'LJ_64: {LJ_64}, LJ_GC64: {LJ_GC64}, LJ_DUALNUM: {LJ_DUALNUM}\n' - .format( - LJ_64=LJ_64, - LJ_GC64=LJ_GC64, - LJ_DUALNUM=LJ_DUALNUM - ) - ) - - -class LJDumpTValue(LJBase): - ''' -lj-tv <TValue *> - -The command receives a pointer to <tv> (TValue address) and dumps -the type and some info related to it. - -* LJ_TNIL: nil -* LJ_TFALSE: false -* LJ_TTRUE: true -* LJ_TLIGHTUD: light userdata @ <gcr> -* LJ_TSTR: string <string payload> @ <gcr> -* LJ_TUPVAL: upvalue @ <gcr> -* LJ_TTHREAD: thread @ <gcr> -* LJ_TPROTO: proto @ <gcr> -* LJ_TFUNC: <LFUNC|CFUNC|FFUNC> - <LFUNC>: Lua function @ <gcr>, <nupvals> upvalues, <chunk:line> - <CFUNC>: C function <mcode address> - <FFUNC>: fast function #<ffid> -* LJ_TTRACE: trace <traceno> @ <gcr> -* LJ_TCDATA: cdata @ <gcr> -* LJ_TTAB: table @ <gcr> (asize: <asize>, hmask: <hmask>) -* LJ_TUDATA: userdata @ <gcr> -* LJ_TNUMX: number <numeric payload> - -Whether the type of the given address differs from the listed above, then -error message occurs. - ''' - - def invoke(self, arg, from_tty): - tv = cast('TValue *', parse_arg(arg)) - gdb.write('{}\n'.format(dump_tvalue(tv))) - - -class LJDumpString(LJBase): - ''' -lj-str <GCstr *> - -The command receives a <gcr> of the corresponding GCstr object and dumps -the payload, size in bytes and hash. - -*Caveat*: Since Python 2 provides no native Unicode support, the payload -is replaced with the corresponding error when decoding fails. 
- ''' - - def invoke(self, arg, from_tty): - string = cast('GCstr *', parse_arg(arg)) - gdb.write("String: {body} [{len} bytes] with hash {hash}\n".format( - body=strdata(string), - hash=strx64(string['hash']), - len=string['len'], - )) - - -class LJDumpTable(LJBase): - ''' -lj-tab <GCtab *> - -The command receives a GCtab address and dumps the table contents: -* Metatable address whether the one is set -* Array part <asize> slots: - <aslot ptr>: [<index>]: <tv> -* Hash part <hsize> nodes: - <hnode ptr>: { <tv> } => { <tv> }; next = <next hnode ptr> - ''' - - def invoke(self, arg, from_tty): - t = cast('GCtab *', parse_arg(arg)) - array = mref('TValue *', t['array']) - nodes = mref('struct Node *', t['node']) - mt = gcval(t['metatable']) - capacity = { - 'apart': int(t['asize']), - 'hpart': int(t['hmask'] + 1) if t['hmask'] > 0 else 0 - } - - if mt != 0: - gdb.write('Metatable detected: {}\n'.format(strx64(mt))) - - gdb.write('Array part: {} slots\n'.format(capacity['apart'])) - for i in range(capacity['apart']): - slot = array + i - gdb.write('{ptr}: [{index}]: {value}\n'.format( - ptr=slot, - index=i, - value=dump_tvalue(slot) - )) - - gdb.write('Hash part: {} nodes\n'.format(capacity['hpart'])) - # See hmask comment in lj_obj.h - for i in range(capacity['hpart']): - node = nodes + i - gdb.write('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}\n'.format( - ptr=node, - key=dump_tvalue(node['key']), - val=dump_tvalue(node['val']), - n=mref('struct Node *', node['next']) - )) - - -class LJDumpStack(LJBase): - ''' -lj-stack [<lua_State *>] - -The command receives a lua_State address and dumps the given Lua -coroutine guest stack: - -<slot ptr> [<slot attributes>] <VALUE|FRAME> - -* <slot ptr>: guest stack slot address -* <slot attributes>: - - S: Bottom of the stack (the slot L->stack points to) - - B: Base of the current guest frame (the slot L->base points to) - - T: Top of the current guest frame (the slot L->top points to) - - M: Last slot of the stack (the slot 
L->maxstack points to) -* <VALUE>: see help lj-tv for more info -* <FRAME>: framelink slot differs from the value slot: it contains info - related to the function being executed within this guest frame, its - type and link to the parent guest frame - [<frame type>] delta=<slots in frame>, <lj-tv for LJ_TFUNC slot> - - <frame type>: - + L: VM performs a call as a result of bytecode execution - + C: VM performs a call as a result of lj_vm_call - + M: VM performs a call to a metamethod as a result of bytecode - execution - + V: Variable-length frame for storing arguments of a variadic - function - + CP: Protected C frame - + PP: VM performs a call as a result of executinig pcall or xpcall - -If L is omitted the main coroutine is used. - ''' - - def invoke(self, arg, from_tty): - gdb.write('{}\n'.format(dump_stack(L(parse_arg(arg))))) - - -class LJState(LJBase): - ''' -lj-state -The command requires no args and dumps current VM and GC states -* VM state: <INTERP|C|GC|EXIT|RECORD|OPT|ASM|TRACE> -* GC state: <PAUSE|PROPAGATE|ATOMIC|SWEEPSTRING|SWEEP|FINALIZE|LAST> -* JIT state: <IDLE|ACTIVE|RECORD|START|END|ASM|ERR> - ''' - - def invoke(self, arg, from_tty): - g = G(L(None)) - gdb.write('{}\n'.format('\n'.join( - map(lambda t: '{} state: {}'.format(*t), { - 'VM': vm_state(g), - 'GC': gc_state(g), - 'JIT': jit_state(g), - }.items()) - ))) - - -class LJGC(LJBase): - ''' -lj-gc - -The command requires no args and dumps current GC stats: -* total: <total number of allocated bytes in GC area> -* threshold: <limit when gc step is triggered> -* debt: <how much GC is behind schedule> -* estimate: <estimate of memory actually in use> -* stepmul: <incremental GC step granularity> -* pause: <pause between successive GC cycles> -* sweepstr: <sweep position in string table> -* root: <number of all collectable objects> -* gray: <number of gray objects> -* grayagain: <number of objects for atomic traversal> -* weak: <number of weak tables (to be cleared)> -* mmudata: <number of 
udata|cdata to be finalized> - ''' - - def invoke(self, arg, from_tty): - g = G(L(None)) - gdb.write('GC stats: {state}\n{stats}\n'.format( - state=gc_state(g), - stats=dump_gc(g) - )) - - -def init(commands): - global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, LJ_TISNUM, PADDING - - # XXX Fragile: though connecting the callback looks like a crap but it - # respects both Python 2 and Python 3 (see #4828). - def connect(callback): - if LEGACY: - global CONNECTED - CONNECTED = True - gdb.events.new_objfile.connect(callback) - - # XXX Fragile: though disconnecting the callback looks like a crap but it - # respects both Python 2 and Python 3 (see #4828). - def disconnect(callback): - if LEGACY: - global CONNECTED - if not CONNECTED: - return - CONNECTED = False - gdb.events.new_objfile.disconnect(callback) - - try: - # Try to remove the callback at first to not append duplicates to - # gdb.events.new_objfile internal list. - disconnect(load) - except Exception: - # Callback is not connected. - pass - - try: - # Detect whether libluajit objfile is loaded. - gdb.parse_and_eval('luaJIT_setmode') - except Exception: - gdb.write('luajit-gdb.py initialization is postponed ' - 'until libluajit objfile is loaded\n') - # Add a callback to be executed when the next objfile is loaded. 
- connect(load) - return - - try: - LJ_64 = str(gdb.parse_and_eval('IRT_PTR')) == 'IRT_P64' - LJ_FR2 = LJ_GC64 = str(gdb.parse_and_eval('IRT_PGC')) == 'IRT_P64' - LJ_DUALNUM = gdb.lookup_global_symbol('lj_lib_checknumber') is not None - except Exception: - gdb.write('luajit-gdb.py failed to load: ' - 'no debugging symbols found for libluajit\n') - return - - for name, command in commands.items(): - command(name) - - PADDING = ' ' * len(':' + hex((1 << (47 if LJ_GC64 else 32)) - 1)) - LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] - - gdb.write('luajit-gdb.py is successfully loaded\n') - - -def load(event=None): - init({ - 'lj-arch': LJDumpArch, - 'lj-tv': LJDumpTValue, - 'lj-str': LJDumpString, - 'lj-tab': LJDumpTable, - 'lj-stack': LJDumpStack, - 'lj-state': LJState, - 'lj-gc': LJGC, - }) - - -load(None) diff --git a/src/luajit_lldb.py b/src/luajit_dbg.py similarity index 63% rename from src/luajit_lldb.py rename to src/luajit_dbg.py index 5ac11b65..a42d8f25 100644 --- a/src/luajit_lldb.py +++ b/src/luajit_dbg.py @@ -1,10 +1,230 @@ -# LLDB extension for LuaJIT post-mortem analysis. -# To use, just put 'command script import <path-to-repo>/src/luajit_lldb.py' -# in lldb. +# Debug extension for LuaJIT post-mortem analysis. 
+# To use in LLDB: 'command script import <path-to-repo>/src/luajit_dbg.py' +# To use in GDB: 'source <path-to-repo>/src/luajit_dbg.py' import abc import re -import lldb +import sys +import types + +from importlib import import_module + +# make script compatible with the ancient Python {{{ + + +LEGACY = re.match(r'^2\.', sys.version) + +if LEGACY: + CONNECTED = False + int = long + range = xrange + + +def is_integer_type(val): + return isinstance(val, int) or (LEGACY and isinstance(val, types.IntType)) + + +# }}} + + +class Debugger(object): + def __init__(self): + self.GDB = False + self.LLDB = False + + debuggers = { + 'gdb': lambda lib: True, + 'lldb': lambda lib: lib.debugger is not None, + } + for name, healthcheck in debuggers.items(): + lib = None + try: + lib = import_module(name) + if healthcheck(lib): + setattr(self, name.upper(), True) + globals()[name] = lib + self.name = name + except Exception: + continue + + assert self.LLDB != self.GDB + + def setup_target(self, debugger): + global target + if self.LLDB: + target = debugger.GetSelectedTarget() + + def write(self, msg): + if self.LLDB: + print(msg) + else: + gdb.write(msg + '\n') + + def cmd_init(self, cmd_cls, debugger=None): + if self.LLDB: + debugger.HandleCommand( + 'command script add --overwrite --class ' + 'luajit_dbg.{cls} {cmd}' + .format( + cls=cmd_cls.__name__, + cmd=cmd_cls.command, + ) + ) + else: + cmd_cls() + + def event_connect(self, callback): + if not self.LLDB: + # XXX Fragile: though connecting the callback looks like a crap but + # it respects both Python 2 and Python 3 (see #4828). + if LEGACY: + global CONNECTED + CONNECTED = True + gdb.events.new_objfile.connect(callback) + + def event_disconnect(self, callback): + if not self.LLDB: + # XXX Fragile: though disconnecting the callback looks like a crap + # but it respects both Python 2 and Python 3 (see #4828). 
+ if LEGACY: + global CONNECTED + if not CONNECTED: + return + CONNECTED = False + gdb.events.new_objfile.disconnect(callback) + + def lookup_variable(self, name): + if self.LLDB: + return target.FindFirstGlobalVariable(name) + else: + variable, _ = gdb.lookup_symbol(name) + return variable.value() if variable else None + + def lookup_symbol(self, sym): + if self.LLDB: + return target.modules[0].FindSymbol(sym) + else: + return gdb.lookup_global_symbol(sym) + + def to_unsigned(self, val): + return val.unsigned if self.LLDB else int(val) + + def to_signed(self, val): + return val.signed if self.LLDB else int(val) + + def to_str(self, val): + return val.value if self.LLDB else str(val) + + def find_type(self, typename): + if self.LLDB: + return target.FindFirstType(typename) + else: + return gdb.lookup_type(typename) + + def type_to_pointer_type(self, tp): + if self.LLDB: + return tp.GetPointerType() + else: + return tp.pointer() + + def cast_impl(self, value, t, pointer_type): + if self.LLDB: + if is_integer_type(value): + # Integer casts require some black magic + # for lldb to behave properly. 
+ if pointer_type: + return target.CreateValueFromAddress( + 'value', + lldb.SBAddress(value, target), + t.GetPointeeType(), + ).address_of + else: + return target.CreateValueFromData( + name='value', + data=lldb.SBData.CreateDataFromInt(value, size=8), + type=t, + ) + else: + return value.Cast(t) + else: + return gdb.Value(value).cast(t) + + def dereference(self, val): + if self.LLDB: + return val.Dereference() + else: + return val.dereference() + + def eval(self, expression): + if self.LLDB: + process = target.GetProcess() + thread = process.GetSelectedThread() + frame = thread.GetSelectedFrame() + + if not expression: + return None + + return frame.EvaluateExpression(expression) + else: + return gdb.parse_and_eval(expression) + + def type_sizeof_impl(self, tp): + if self.LLDB: + return tp.GetByteSize() + else: + return tp.sizeof + + def summary(self, val): + if self.LLDB: + return val.summary + else: + return str(val)[len(PADDING):].strip() + + def type_member(self, type_obj, name): + if self.LLDB: + return next((x for x in type_obj.members if x.name == name), None) + else: + return type_obj[name] + + def type_member_offset(self, member): + if self.LLDB: + return member.GetOffsetInBytes() + else: + return member.bitpos / 8 + + def get_member(self, value, member_name): + if self.LLDB: + return value.GetChildMemberWithName(member_name) + else: + return value[member_name] + + def address_of(self, value): + if self.LLDB: + return value.address_of + else: + return value.address + + def arch_init(self): + global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM, target + if self.LLDB: + irtype_enum = dbg.find_type('IRType').enum_members + for member in irtype_enum: + if member.name == 'IRT_PTR': + LJ_64 = dbg.to_unsigned(member) & 0x1f == IRT_P64 + if member.name == 'IRT_PGC': + LJ_GC64 = dbg.to_unsigned(member) & 0x1f == IRT_P64 + else: + LJ_64 = str(dbg.eval('IRT_PTR')) == 'IRT_P64' + LJ_GC64 = str(dbg.eval('IRT_PGC')) == 'IRT_P64' + + LJ_FR2 = LJ_GC64 + 
LJ_DUALNUM = dbg.lookup_symbol('lj_lib_checknumber') is not None + # Two extra characters are required to fit in the `0x` part. + PADDING = ' ' * len(strx64(L())) + LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] + + +dbg = Debugger() LJ_64 = None LJ_GC64 = None @@ -17,68 +237,73 @@ IRT_P64 = 9 LJ_GCVMASK = ((1 << 47) - 1) LJ_TISNUM = None -# Debugger specific {{{ - - # Global target = None -class Ptr: +class Ptr(object): def __init__(self, value, normal_type): self.value = value self.normal_type = normal_type @property def __deref(self): - return self.normal_type(self.value.Dereference()) + return self.normal_type(dbg.dereference(self.value)) def __add__(self, other): - assert isinstance(other, int) + assert is_integer_type(other) return self.__class__( cast( self.normal_type.__name__ + ' *', cast( 'uintptr_t', - self.value.unsigned + other * self.value.deref.size, + dbg.to_unsigned(self.value) + other * sizeof( + self.normal_type.__name__ + ), ), ), ) def __sub__(self, other): - assert isinstance(other, int) or isinstance(other, Ptr) - if isinstance(other, int): + assert is_integer_type(other) or isinstance(other, Ptr) + if is_integer_type(other): return self.__add__(-other) else: - return int((self.value.unsigned - other.value.unsigned) - / sizeof(self.normal_type.__name__)) + return int( + ( + dbg.to_unsigned(self.value) - dbg.to_unsigned(other.value) + ) / sizeof(self.normal_type.__name__) + ) def __eq__(self, other): - assert isinstance(other, Ptr) or isinstance(other, int) and other >= 0 + assert isinstance(other, Ptr) or is_integer_type(other) if isinstance(other, Ptr): - return self.value.unsigned == other.value.unsigned + return dbg.to_unsigned(self.value) == dbg.to_unsigned(other.value) else: - return self.value.unsigned == other + return dbg.to_unsigned(self.value) == other def __ne__(self, other): return not self == other def __gt__(self, other): assert isinstance(other, Ptr) - return self.value.unsigned > other.value.unsigned + 
return dbg.to_unsigned(self.value) > dbg.to_unsigned(other.value) def __ge__(self, other): assert isinstance(other, Ptr) - return self.value.unsigned >= other.value.unsigned + return dbg.to_unsigned(self.value) >= dbg.to_unsigned(other.value) def __bool__(self): - return self.value.unsigned != 0 + return dbg.to_unsigned(self.value) != 0 def __int__(self): - return self.value.unsigned + return dbg.to_unsigned(self.value) + + def __long__(self): + return dbg.to_unsigned(self.value) def __str__(self): - return self.value.value + return dbg.to_str(self.value) def __getattr__(self, name): if name != '__deref': @@ -86,53 +311,26 @@ class Ptr: return self.__deref -class MetaStruct(type): - def __init__(cls, name, bases, nmspc): - super(MetaStruct, cls).__init__(name, bases, nmspc) - - def make_general(field, tp): - builtin = { - 'uint': 'unsigned', - 'int': 'signed', - 'string': 'value', - } - if tp in builtin.keys(): - return lambda self: getattr(self[field], builtin[tp]) - else: - return lambda self: globals()[tp](self[field]) - - if hasattr(cls, 'metainfo'): - for field in cls.metainfo: - if not isinstance(field[0], str): - setattr(cls, field[1], field[0]) - else: - setattr( - cls, - field[1], - property(make_general(field[1], field[0])), - ) - - -class Struct(metaclass=MetaStruct): +class Struct(object): def __init__(self, value): self.value = value def __getitem__(self, name): - return self.value.GetChildMemberWithName(name) + return dbg.get_member(self.value, name) @property def addr(self): - return self.value.address_of + return dbg.address_of(self.value) c_structs = { 'MRef': [ - (property(lambda self: self['ptr64'].unsigned if LJ_GC64 - else self['ptr32'].unsigned), 'ptr') + (property(lambda self: dbg.to_unsigned(self['ptr64']) if LJ_GC64 + else dbg.to_unsigned(self['ptr32'])), 'ptr') ], 'GCRef': [ - (property(lambda self: self['gcptr64'].unsigned if LJ_GC64 - else self['gcptr32'].unsigned), 'gcptr') + (property(lambda self: dbg.to_unsigned(self['gcptr64']) if 
LJ_GC64 + else dbg.to_unsigned(self['gcptr32'])), 'gcptr') ], 'TValue': [ ('GCRef', 'gcr'), @@ -141,8 +339,12 @@ c_structs = { ('int', 'it64'), ('string', 'n'), (property(lambda self: FR(self['fr']) if not LJ_GC64 else None), 'fr'), - (property(lambda self: self['ftsz'].signed if LJ_GC64 else None), - 'ftsz') + ( + property( + lambda self: dbg.to_signed(self['ftsz']) if LJ_GC64 else None + ), + 'ftsz' + ) ], 'GCState': [ ('GCRef', 'root'), @@ -216,26 +418,51 @@ c_structs = { ('TValue', 'val'), ('MRef', 'next') ], - 'BCIns': [] + 'BCIns': [], } -for cls in c_structs.keys(): - globals()[cls] = type(cls, (Struct, ), {'metainfo': c_structs[cls]}) +def make_property_from_metadata(field, tp): + builtin = { + 'uint': dbg.to_unsigned, + 'int': dbg.to_signed, + 'string': dbg.to_str, + } + if tp in builtin.keys(): + return lambda self: builtin[tp](self[field]) + else: + return lambda self: globals()[tp](self[field]) + + +for cls, metainfo in c_structs.items(): + cls_dict = {} + for field in metainfo: + if not isinstance(field[0], str): + cls_dict[field[1]] = field[0] + else: + cls_dict[field[1]] = property( + make_property_from_metadata(field[1], field[0]) + ) + globals()[cls] = type(cls, (Struct, ), cls_dict) for cls in Struct.__subclasses__(): ptr_name = cls.__name__ + 'Ptr' + def make_init(cls): + return lambda self, value: super(type(self), self).__init__(value, cls) + globals()[ptr_name] = type(ptr_name, (Ptr,), { - '__init__': - lambda self, value: super(type(self), self).__init__(value, cls) + '__init__': make_init(cls) }) -class Command(object): - def __init__(self, debugger, unused): - pass +class Command(object if dbg.LLDB else gdb.Command): + def __init__(self, debugger=None, unused=None): + if dbg.GDB: + # XXX Fragile: though initialization looks like a crap but it + # respects both Python 2 and Python 3 (see #4828). 
+ gdb.Command.__init__(self, self.command, gdb.COMMAND_DATA) def get_short_help(self): return self.__doc__.splitlines()[0] @@ -245,21 +472,15 @@ class Command(object): def __call__(self, debugger, command, exe_ctx, result): try: - self.execute(debugger, command, result) + self.execute(command) except Exception as e: msg = 'Failed to execute command `{}`: {}'.format(self.command, e) result.SetError(msg) def parse(self, command): - process = target.GetProcess() - thread = process.GetSelectedThread() - frame = thread.GetSelectedFrame() - if not command: return None - - ret = frame.EvaluateExpression(command) - return ret + return dbg.to_unsigned(dbg.eval(command)) @abc.abstractproperty def command(self): @@ -270,7 +491,7 @@ class Command(object): """ @abc.abstractmethod - def execute(self, debugger, args, result): + def execute(self, args): """Implementation of the command. Subclasses override this method to implement the logic of a given command, e.g. printing a stacktrace. The command output should be @@ -278,6 +499,11 @@ class Command(object): properly routed to LLDB frontend. Any unhandled exception will be automatically transformed into proper errors. """ + def invoke(self, arg, from_tty): + try: + self.execute(arg) + except Exception as e: + dbg.write(e) def cast(typename, value): @@ -299,75 +525,38 @@ def cast(typename, value): name = name[:-1].strip() pointer_type = True - # Get the lldb type representation. - t = target.FindFirstType(name) + # Get the inferior type representation. + t = dbg.find_type(name) if pointer_type: - t = t.GetPointerType() - - if isinstance(value, int): - # Integer casts require some black magic for lldb to behave properly. 
- if pointer_type: - casted = target.CreateValueFromAddress( - 'value', - lldb.SBAddress(value, target), - t.GetPointeeType(), - ).address_of - else: - casted = target.CreateValueFromData( - name='value', - data=lldb.SBData.CreateDataFromInt(value, size=8), - type=t, - ) - else: - casted = value.Cast(t) + t = dbg.type_to_pointer_type(t) + + casted = dbg.cast_impl(value, t, pointer_type) if isinstance(typename, type): - # Wrap lldb object, if possible + # Wrap inferior object, if possible return typename(casted) else: return casted -def lookup_global(name): - return target.FindFirstGlobalVariable(name) - - -def type_member(type_obj, name): - return next((x for x in type_obj.members if x.name == name), None) - - -def find_type(typename): - return target.FindFirstType(typename) - - def offsetof(typename, membername): - type_obj = find_type(typename) - member = type_member(type_obj, membername) + type_obj = dbg.find_type(typename) + member = dbg.type_member(type_obj, membername) assert member is not None - return member.GetOffsetInBytes() + return dbg.type_member_offset(member) def sizeof(typename): - type_obj = find_type(typename) - return type_obj.GetByteSize() + type_obj = dbg.find_type(typename) + return dbg.type_sizeof_impl(type_obj) def vtou64(value): - return value.unsigned & 0xFFFFFFFFFFFFFFFF + return dbg.to_unsigned(value) & 0xFFFFFFFFFFFFFFFF def vtoi(value): - return value.signed - - -def dbg_eval(expr): - process = target.GetProcess() - thread = process.GetSelectedThread() - frame = thread.GetSelectedFrame() - return frame.EvaluateExpression(expr) - - -# }}} Debugger specific + return dbg.to_signed(value) def gcval(obj): @@ -393,7 +582,7 @@ def gclistlen(root, end=0x0): def gcringlen(root): - if not gcref(root): + if gcref(root) == 0: return 0 elif gcref(root) == gcref(gcnext(root)): return 1 @@ -439,7 +628,7 @@ def J(g): J_offset = offsetof('GG_State', 'J') return cast( jit_StatePtr, - vtou64(cast('char *', g)) - g_offset + J_offset, + 
int(vtou64(cast('char *', g)) - g_offset + J_offset), ) @@ -451,7 +640,7 @@ def L(L=None): # lookup a symbol for the main coroutine considering the host app # XXX Fragile: though the loop initialization looks like a crap but it # respects both Python 2 and Python 3. - for lstate in [L] + list(map(lambda main: lookup_global(main), ( + for lstate in [L] + list(map(lambda main: dbg.lookup_variable(main), ( # LuaJIT main coro (see luajit/src/luajit.c) 'globalL', # Tarantool main coro (see tarantool/src/lua/init.h) @@ -459,7 +648,7 @@ def L(L=None): # TODO: Add more ))): if lstate: - return lua_State(lstate) + return lua_StatePtr(lstate) def tou32(val): @@ -523,9 +712,9 @@ def funcproto(func): def strdata(obj): try: ptr = cast('char *', obj + 1) - return ptr.summary + return dbg.summary(ptr) except UnicodeEncodeError: - return "<luajit-lldb: error occurred while rendering non-ascii slot>" + return "<luajit_dbg: error occurred while rendering non-ascii slot>" def itype(o): @@ -730,12 +919,12 @@ def frame_pc(framelink): def frame_prevl(framelink): - # We are evaluating the `frame_pc(framelink)[-1])` with lldb's + # We are evaluating the `frame_pc(framelink)[-1])` with # REPL, because the lldb API is faulty and it's not possible to cast # a struct member of 32-bit type to 64-bit type without getting onto # the next property bits, despite the fact that it's an actual value, not # a pointer to it. 
- bcins = vtou64(dbg_eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')) + bcins = vtou64(dbg.eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')) return framelink - (1 + LJ_FR2 + bc_a(bcins)) @@ -789,12 +978,12 @@ def frames(L): def dump_framelink_slot_address(fr): return '{start:{padding}}:{end:{padding}}'.format( - start=hex(int(fr - 1)), - end=hex(int(fr)), + start=strx64(fr - 1), + end=strx64(fr), padding=len(PADDING), ) if LJ_FR2 else '{addr:{padding}}'.format( - addr=hex(int(fr)), - padding=len(PADDING), + addr=strx64(fr), + padding=2 * len(PADDING) + 1, ) @@ -863,7 +1052,6 @@ def dump_stack(L, base=None, top=None): nfreeslots=int((maxstack - top - 8) >> 3), ), ]) - for framelink, frametop in frames(L): # Dump all data slots in the (framelink, top) interval. dump.extend([ @@ -904,9 +1092,11 @@ the type and some info related to it. Whether the type of the given address differs from the listed above, then error message occurs. ''' - def execute(self, debugger, args, result): + command = 'lj-tv' + + def execute(self, args): tvptr = TValuePtr(cast('TValue *', self.parse(args))) - print('{}'.format(dump_tvalue(tvptr))) + dbg.write('{}'.format(dump_tvalue(tvptr))) class LJState(Command): @@ -917,9 +1107,11 @@ The command requires no args and dumps current VM and GC states * GC state: <PAUSE|PROPAGATE|ATOMIC|SWEEPSTRING|SWEEP|FINALIZE|LAST> * JIT state: <IDLE|ACTIVE|RECORD|START|END|ASM|ERR> ''' - def execute(self, debugger, args, result): + command = 'lj-state' + + def execute(self, args): g = G(L(None)) - print('{}'.format('\n'.join( + dbg.write('{}'.format('\n'.join( map(lambda t: '{} state: {}'.format(*t), { 'VM': vm_state(g), 'GC': gc_state(g), @@ -936,8 +1128,10 @@ The command requires no args and dumps values of LJ_64 and LJ_GC64 compile-time flags. These values define the sizes of host and GC pointers respectively. 
''' - def execute(self, debugger, args, result): - print( + command = 'lj-arch' + + def execute(self, args): + dbg.write( 'LJ_64: {LJ_64}, LJ_GC64: {LJ_GC64}, LJ_DUALNUM: {LJ_DUALNUM}' .format( LJ_64=LJ_64, @@ -965,9 +1159,11 @@ The command requires no args and dumps current GC stats: * weak: <number of weak tables (to be cleared)> * mmudata: <number of udata|cdata to be finalized> ''' - def execute(self, debugger, args, result): + command = 'lj-gc' + + def execute(self, args): g = G(L(None)) - print('GC stats: {state}\n{stats}'.format( + dbg.write('GC stats: {state}\n{stats}'.format( state=gc_state(g), stats=dump_gc(g) )) @@ -983,9 +1179,11 @@ the payload, size in bytes and hash. *Caveat*: Since Python 2 provides no native Unicode support, the payload is replaced with the corresponding error when decoding fails. ''' - def execute(self, debugger, args, result): + command = 'lj-str' + + def execute(self, args): string_ptr = GCstrPtr(cast('GCstr *', self.parse(args))) - print("String: {body} [{len} bytes] with hash {hash}".format( + dbg.write("String: {body} [{len} bytes] with hash {hash}".format( body=strdata(string_ptr), hash=strx64(string_ptr.hash), len=string_ptr.len, @@ -1003,7 +1201,9 @@ The command receives a GCtab address and dumps the table contents: * Hash part <hsize> nodes: <hnode ptr>: { <tv> } => { <tv> }; next = <next hnode ptr> ''' - def execute(self, debugger, args, result): + command = 'lj-tab' + + def execute(self, args): t = GCtabPtr(cast('GCtab *', self.parse(args))) array = mref(TValuePtr, t.array) nodes = mref(NodePtr, t.node) @@ -1014,22 +1214,22 @@ The command receives a GCtab address and dumps the table contents: } if mt: - print('Metatable detected: {}'.format(strx64(mt))) + dbg.write('Metatable detected: {}'.format(strx64(mt))) - print('Array part: {} slots'.format(capacity['apart'])) + dbg.write('Array part: {} slots'.format(capacity['apart'])) for i in range(capacity['apart']): slot = array + i - print('{ptr}: [{index}]: {value}'.format( 
+ dbg.write('{ptr}: [{index}]: {value}'.format( ptr=strx64(slot), index=i, value=dump_tvalue(slot) )) - print('Hash part: {} nodes'.format(capacity['hpart'])) + dbg.write('Hash part: {} nodes'.format(capacity['hpart'])) # See hmask comment in lj_obj.h for i in range(capacity['hpart']): node = nodes + i - print('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}'.format( + dbg.write('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}'.format( ptr=strx64(node), key=dump_tvalue(TValuePtr(node.key.addr)), val=dump_tvalue(TValuePtr(node.val.addr)), @@ -1069,56 +1269,72 @@ coroutine guest stack: If L is omitted the main coroutine is used. ''' - def execute(self, debugger, args, result): + command = 'lj-stack' + + def execute(self, args): lstate = self.parse(args) - lstate_ptr = cast('lua_State *', lstate) if coro is not None else None - print('{}'.format(dump_stack(L(lstate_ptr)))) + lstate_ptr = cast('lua_State *', lstate) if lstate else None + dbg.write('{}'.format(dump_stack(L(lstate_ptr)))) -def register_commands(debugger, commands): - for command, cls in commands.items(): - cls.command = command - debugger.HandleCommand( - 'command script add --overwrite --class luajit_lldb.{cls} {cmd}' - .format( - cls=cls.__name__, - cmd=cls.command, - ) - ) - print('{cmd} command intialized'.format(cmd=cls.command)) +LJ_COMMANDS = [ + LJDumpTValue, + LJState, + LJDumpArch, + LJGC, + LJDumpString, + LJDumpTable, + LJDumpStack, +] + +def register_commands(commands, debugger=None): + for cls in commands: + dbg.cmd_init(cls, debugger) + dbg.write('{cmd} command intialized'.format(cmd=cls.command)) -def configure(debugger): - global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM, target - target = debugger.GetSelectedTarget() - module = target.modules[0] - LJ_DUALNUM = module.FindSymbol('lj_lib_checknumber') is not None +def configure(debugger=None): + global PADDING, LJ_TISNUM, LJ_DUALNUM + dbg.setup_target(debugger) try: - irtype_enum = target.FindFirstType('IRType').enum_members - 
for member in irtype_enum: - if member.name == 'IRT_PTR': - LJ_64 = member.unsigned & 0x1f == IRT_P64 - if member.name == 'IRT_PGC': - LJ_FR2 = LJ_GC64 = member.unsigned & 0x1f == IRT_P64 + # Try to remove the callback at first to not append duplicates to + # gdb.events.new_objfile internal list. + dbg.event_disconnect(load) except Exception: - print('luajit_lldb.py failed to load: ' - 'no debugging symbols found for libluajit') - return - - PADDING = ' ' * len(strx64((TValuePtr(L().addr)))) - LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] - - -def __lldb_init_module(debugger, internal_dict): - configure(debugger) - register_commands(debugger, { - 'lj-tv': LJDumpTValue, - 'lj-state': LJState, - 'lj-arch': LJDumpArch, - 'lj-gc': LJGC, - 'lj-str': LJDumpString, - 'lj-tab': LJDumpTable, - 'lj-stack': LJDumpStack, - }) - print('luajit_lldb.py is successfully loaded') + # Callback is not connected. + pass + + try: + # Detect whether libluajit objfile is loaded. + dbg.eval('luaJIT_setmode') + except Exception: + dbg.write('luajit_dbg.py initialization is postponed ' + 'until libluajit objfile is loaded\n') + # Add a callback to be executed when the next objfile is loaded. + dbg.event_connect(load) + return False + + try: + dbg.arch_init() + except Exception: + dbg.write('LuaJIT debug extension failed to load: ' + 'no debugging symbols found for libluajit') + return False + return True + + +# XXX: The dummy parameter is needed for this function to +# work as a gdb callback. +def load(_=None, debugger=None): + if configure(debugger): + register_commands(LJ_COMMANDS, debugger) + dbg.write('LuaJIT debug extension is successfully loaded') + + +def __lldb_init_module(debugger, _=None): + load(None, debugger) + + +if dbg.GDB: + load() -- 2.44.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
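Note on the approach: the commit message describes an adapter that hides debugger-specific calls and picks an implementation based on the debugger the script was loaded into. The sketch below is only an illustration of that idea, not the code from luajit_dbg.py. The names `detect_debugger` and `DebuggerAdapter` are hypothetical; the `GDB`/`LLDB` flags and `to_unsigned` echo names used in the patch, but their bodies here are assumptions (the lldb branch relies on the `unsigned` property seen in the removed lines, the gdb branch assumes `int()` conversion of a value).

```python
import importlib


def detect_debugger():
    """Return 'gdb' or 'lldb' if that module is importable, else None.

    Assumption: inside a debugger's embedded interpreter the matching
    module can be imported; outside of one, typically neither can.
    """
    for name in ('gdb', 'lldb'):
        try:
            importlib.import_module(name)
            return name
        except ImportError:
            continue
    return None


class DebuggerAdapter(object):
    """Hypothetical adapter: one interface, debugger-specific bodies."""

    def __init__(self, backend):
        self.backend = backend
        self.GDB = backend == 'gdb'
        self.LLDB = backend == 'lldb'

    def to_unsigned(self, value):
        if self.GDB:
            # Assumed: the gdb value object converts via int().
            return int(value)
        if self.LLDB:
            # The lldb value object exposes an `unsigned` property.
            return value.unsigned
        raise RuntimeError('no supported debugger detected')


# Extension logic would talk only to this object, never to gdb/lldb
# directly.
dbg = DebuggerAdapter(detect_debugger())
```

With this shape, helpers such as `vtou64` can call `dbg.to_unsigned(value)` without knowing which debugger is underneath, which is the property the patch is after.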
* Re: [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 1/2] " Maxim Kokryashkin via Tarantool-patches @ 2024-04-04 10:14 ` Sergey Bronnikov via Tarantool-patches 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches 2024-08-14 19:34 ` Mikhail Elhimov via Tarantool-patches 2 siblings, 0 replies; 11+ messages in thread From: Sergey Bronnikov via Tarantool-patches @ 2024-04-04 10:14 UTC (permalink / raw) To: Maxim Kokryashkin, tarantool-patches, skaplun Hi, Max LGTM, as I said for a previous version. On 4/4/24 01:21, Maxim Kokryashkin wrote: > This patch joins the LLDB and GDB LuaJIT debugging extensions > into one, so now the extension logic can be debugger-agnostic. > To do that, an adapter class is introduced, and all of the > debugger-specific behavior is encapsulated there. The extension > auto-detects the debugger it was loaded into and selects the > correct low-level logic implementation. > --- > src/luajit-gdb.py | 885 -------------------------- > src/{luajit_lldb.py => luajit_dbg.py} | 616 ++++++++++++------ > 2 files changed, 416 insertions(+), 1085 deletions(-) > delete mode 100644 src/luajit-gdb.py > rename src/{luajit_lldb.py => luajit_dbg.py} (63%) > > diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py > deleted file mode 100644 > index d2070e9b..00000000 > --- a/src/luajit-gdb.py > +++ /dev/null > @@ -1,885 +0,0 @@ > -# GDB extension for LuaJIT post-mortem analysis. > -# To use, just put 'source <path-to-repo>/src/luajit-gdb.py' in gdb. 
> - > -import re > -import gdb > -import sys > - > -# make script compatible with the ancient Python {{{ > - > - > -LEGACY = re.match(r'^2\.', sys.version) > - > -if LEGACY: > - CONNECTED = False > - int = long > - range = xrange > - > - > -# }}} > - > - > -gtype_cache = {} > - > - > -def gtype(typestr): > - global gtype_cache > - if typestr in gtype_cache: > - return gtype_cache[typestr] > - > - m = re.match(r'((?:(?:struct|union) )?\S*)\s*[*]', typestr) > - > - gtype = gdb.lookup_type(typestr) if m is None \ > - else gdb.lookup_type(m.group(1)).pointer() > - > - gtype_cache[typestr] = gtype > - return gtype > - > - > -def cast(typestr, val): > - return gdb.Value(val).cast(gtype(typestr)) > - > - > -def lookup(symbol): > - variable, _ = gdb.lookup_symbol(symbol) > - return variable.value() if variable else None > - > - > -def parse_arg(arg): > - if not arg: > - return None > - > - ret = gdb.parse_and_eval(arg) > - > - if not ret: > - raise gdb.GdbError('table argument empty') > - > - return ret > - > - > -def tou64(val): > - return cast('uint64_t', val) & 0xFFFFFFFFFFFFFFFF > - > - > -def tou32(val): > - return cast('uint32_t', val) & 0xFFFFFFFF > - > - > -def i2notu32(val): > - return ~int(val) & 0xFFFFFFFF > - > - > -def strx64(val): > - return re.sub('L?$', '', > - hex(int(cast('uint64_t', val) & 0xFFFFFFFFFFFFFFFF))) > - > - > -# Types {{{ > - > - > -LJ_T = { > - 'NIL': i2notu32(0), > - 'FALSE': i2notu32(1), > - 'TRUE': i2notu32(2), > - 'LIGHTUD': i2notu32(3), > - 'STR': i2notu32(4), > - 'UPVAL': i2notu32(5), > - 'THREAD': i2notu32(6), > - 'PROTO': i2notu32(7), > - 'FUNC': i2notu32(8), > - 'TRACE': i2notu32(9), > - 'CDATA': i2notu32(10), > - 'TAB': i2notu32(11), > - 'UDATA': i2notu32(12), > - 'NUMX': i2notu32(13), > -} > - > - > -def typenames(value): > - return { > - LJ_T[k]: 'LJ_T' + k for k in LJ_T.keys() > - }.get(int(value), 'LJ_TINVALID') > - > - > -# }}} > - > -# Frames {{{ > - > - > -FRAME_TYPE = 0x3 > -FRAME_P = 0x4 > -FRAME_TYPEP = FRAME_TYPE | 
FRAME_P > - > -FRAME = { > - 'LUA': 0x0, > - 'C': 0x1, > - 'CONT': 0x2, > - 'VARG': 0x3, > - 'LUAP': 0x4, > - 'CP': 0x5, > - 'PCALL': 0x6, > - 'PCALLH': 0x7, > -} > - > - > -def frametypes(ft): > - return { > - FRAME['LUA']: 'L', > - FRAME['C']: 'C', > - FRAME['CONT']: 'M', > - FRAME['VARG']: 'V', > - }.get(ft, '?') > - > - > -def bc_a(ins): > - return (ins >> 8) & 0xff > - > - > -def frame_ftsz(framelink): > - return cast('ptrdiff_t', framelink['ftsz'] if LJ_FR2 > - else framelink['fr']['tp']['ftsz']) > - > - > -def frame_pc(framelink): > - return cast('BCIns *', frame_ftsz(framelink)) if LJ_FR2 \ > - else mref('BCIns *', framelink['fr']['tp']['pcr']) > - > - > -def frame_prevl(framelink): > - return framelink - (1 + LJ_FR2 + bc_a(frame_pc(framelink)[-1])) > - > - > -def frame_ispcall(framelink): > - return (frame_ftsz(framelink) & FRAME['PCALL']) == FRAME['PCALL'] > - > - > -def frame_sized(framelink): > - return (frame_ftsz(framelink) & ~FRAME_TYPEP) > - > - > -def frame_prevd(framelink): > - return cast('TValue *', cast('char *', framelink) - frame_sized(framelink)) > - > - > -def frame_type(framelink): > - return frame_ftsz(framelink) & FRAME_TYPE > - > - > -def frame_typep(framelink): > - return frame_ftsz(framelink) & FRAME_TYPEP > - > - > -def frame_islua(framelink): > - return frametypes(int(frame_type(framelink))) == 'L' \ > - and int(frame_ftsz(framelink)) > 0 > - > - > -def frame_prev(framelink): > - return frame_prevl(framelink) if frame_islua(framelink) \ > - else frame_prevd(framelink) > - > - > -def frame_sentinel(L): > - return mref('TValue *', L['stack']) + LJ_FR2 > - > - > -# }}} > - > -# Const {{{ > - > - > -LJ_64 = None > -LJ_GC64 = None > -LJ_FR2 = None > -LJ_DUALNUM = None > - > -LJ_GCVMASK = ((1 << 47) - 1) > -LJ_TISNUM = None > -PADDING = None > - > -# These constants are meaningful only for 'LJ_64' mode. 
> -LJ_LIGHTUD_BITS_SEG = 8 > -LJ_LIGHTUD_BITS_LO = 47 - LJ_LIGHTUD_BITS_SEG > -LIGHTUD_SEG_MASK = (1 << LJ_LIGHTUD_BITS_SEG) - 1 > -LIGHTUD_LO_MASK = (1 << LJ_LIGHTUD_BITS_LO) - 1 > - > - > -# }}} > - > - > -def itype(o): > - return cast('uint32_t', o['it64'] >> 47) if LJ_GC64 else o['it'] > - > - > -def mref(typename, obj): > - return cast(typename, obj['ptr64'] if LJ_GC64 else obj['ptr32']) > - > - > -def gcref(obj): > - return cast('GCobj *', obj['gcptr64'] if LJ_GC64 > - else cast('uintptr_t', obj['gcptr32'])) > - > - > -def gcval(obj): > - return cast('GCobj *', obj['gcptr64'] & LJ_GCVMASK if LJ_GC64 > - else cast('uintptr_t', obj['gcptr32'])) > - > - > -def gcnext(obj): > - return gcref(obj)['gch']['nextgc'] > - > - > -def L(L=None): > - # lookup a symbol for the main coroutine considering the host app > - # XXX Fragile: though the loop initialization looks like a crap but it > - # respects both Python 2 and Python 3. > - for lstate in [L] + list(map(lambda main: lookup(main), ( > - # LuaJIT main coro (see luajit/src/luajit.c) > - 'globalL', > - # Tarantool main coro (see tarantool/src/lua/init.h) > - 'tarantool_L', > - # TODO: Add more > - ))): > - if lstate: > - return cast('lua_State *', lstate) > - > - > -def G(L): > - return mref('global_State *', L['glref']) > - > - > -def J(g): > - typeGG = gtype('GG_State') > - > - return cast('jit_State *', int(cast('char *', g)) > - - int(typeGG['g'].bitpos / 8) > - + int(typeGG['J'].bitpos / 8)) > - > - > -def vm_state(g): > - return { > - i2notu32(0): 'INTERP', > - i2notu32(1): 'LFUNC', > - i2notu32(2): 'FFUNC', > - i2notu32(3): 'CFUNC', > - i2notu32(4): 'GC', > - i2notu32(5): 'EXIT', > - i2notu32(6): 'RECORD', > - i2notu32(7): 'OPT', > - i2notu32(8): 'ASM', > - }.get(int(tou32(g['vmstate'])), 'TRACE') > - > - > -def gc_state(g): > - return { > - 0: 'PAUSE', > - 1: 'PROPAGATE', > - 2: 'ATOMIC', > - 3: 'SWEEPSTRING', > - 4: 'SWEEP', > - 5: 'FINALIZE', > - 6: 'LAST', > - }.get(int(g['gc']['state']), 'INVALID') > - > 
- > -def jit_state(g): > - return { > - 0: 'IDLE', > - 0x10: 'ACTIVE', > - 0x11: 'RECORD', > - 0x12: 'START', > - 0x13: 'END', > - 0x14: 'ASM', > - 0x15: 'ERR', > - }.get(int(J(g)['state']), 'INVALID') > - > - > -def tvisint(o): > - return LJ_DUALNUM and itype(o) == LJ_TISNUM > - > - > -def tvisnumber(o): > - return itype(o) <= LJ_TISNUM > - > - > -def tvislightud(o): > - if LJ_64 and not LJ_GC64: > - return (cast('int32_t', itype(o)) >> 15) == -2 > - else: > - return itype(o) == LJ_T['LIGHTUD'] > - > - > -def strdata(obj): > - # String is printed with pointer to it, thanks to gdb. Just strip it. > - try: > - return str(cast('char *', cast('GCstr *', obj) + 1))[len(PADDING):] > - except UnicodeEncodeError: > - return "<luajit-gdb: error occurred while rendering non-ascii slot>" > - > - > -def itypemap(o): > - if LJ_64 and not LJ_GC64: > - return LJ_T['NUMX'] if tvisnumber(o) \ > - else LJ_T['LIGHTUD'] if tvislightud(o) \ > - else itype(o) > - else: > - return LJ_T['NUMX'] if tvisnumber(o) else itype(o) > - > - > -def funcproto(func): > - assert func['ffid'] == 0 > - > - return cast('GCproto *', > - mref('char *', func['pc']) - gdb.lookup_type('GCproto').sizeof) > - > - > -def gclistlen(root, end=0x0): > - count = 0 > - while (gcref(root) != end): > - count += 1 > - root = gcnext(root) > - return count > - > - > -def gcringlen(root): > - if not gcref(root): > - return 0 > - elif gcref(root) == gcref(gcnext(root)): > - return 1 > - else: > - return 1 + gclistlen(gcnext(root), gcref(root)) > - > - > -gclen = { > - 'root': gclistlen, > - 'gray': gclistlen, > - 'grayagain': gclistlen, > - 'weak': gclistlen, > - # XXX: gc.mmudata is a ring-list. > - 'mmudata': gcringlen, > -} > - > - > -# The generator that implements frame iterator. > -# Every frame is represented as a tuple of framelink and frametop. 
> -def frames(L): > - frametop = L['top'] > - framelink = L['base'] - 1 > - framelink_sentinel = frame_sentinel(L) > - while True: > - yield framelink, frametop > - frametop = framelink - (1 + LJ_FR2) > - if framelink <= framelink_sentinel: > - break > - framelink = frame_prev(framelink) > - > - > -def lightudV(tv): > - if LJ_64: > - u = int(tv['u64']) > - # lightudseg macro expanded. > - seg = (u >> LJ_LIGHTUD_BITS_LO) & LIGHTUD_SEG_MASK > - segmap = mref('uint32_t *', G(L(None))['gc']['lightudseg']) > - # lightudlo macro expanded. > - return (int(segmap[seg]) << 32) | (u & LIGHTUD_LO_MASK) > - else: > - return gcval(tv['gcr']) > - > - > -# Dumpers {{{ > - > - > -def dump_lj_tnil(tv): > - return 'nil' > - > - > -def dump_lj_tfalse(tv): > - return 'false' > - > - > -def dump_lj_ttrue(tv): > - return 'true' > - > - > -def dump_lj_tlightud(tv): > - return 'light userdata @ {}'.format(strx64(lightudV(tv))) > - > - > -def dump_lj_tstr(tv): > - return 'string {body} @ {address}'.format( > - body=strdata(gcval(tv['gcr'])), > - address=strx64(gcval(tv['gcr'])) > - ) > - > - > -def dump_lj_tupval(tv): > - return 'upvalue @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -def dump_lj_tthread(tv): > - return 'thread @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -def dump_lj_tproto(tv): > - return 'proto @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -def dump_lj_tfunc(tv): > - func = cast('struct GCfuncC *', gcval(tv['gcr'])) > - ffid = func['ffid'] > - > - if ffid == 0: > - pt = funcproto(func) > - return 'Lua function @ {addr}, {nups} upvalues, {chunk}:{line}'.format( > - addr=strx64(func), > - nups=int(func['nupvalues']), > - chunk=strdata(cast('GCstr *', gcval(pt['chunkname']))), > - line=pt['firstline'] > - ) > - elif ffid == 1: > - return 'C function @ {}'.format(strx64(func['f'])) > - else: > - return 'fast function #{}'.format(int(ffid)) > - > - > -def dump_lj_ttrace(tv): > - trace = cast('struct GCtrace *', gcval(tv['gcr'])) > - return 'trace {traceno} @ 
{addr}'.format( > - traceno=strx64(trace['traceno']), > - addr=strx64(trace) > - ) > - > - > -def dump_lj_tcdata(tv): > - return 'cdata @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -def dump_lj_ttab(tv): > - table = cast('GCtab *', gcval(tv['gcr'])) > - return 'table @ {gcr} (asize: {asize}, hmask: {hmask})'.format( > - gcr=strx64(table), > - asize=table['asize'], > - hmask=strx64(table['hmask']), > - ) > - > - > -def dump_lj_tudata(tv): > - return 'userdata @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -def dump_lj_tnumx(tv): > - if tvisint(tv): > - return 'integer {}'.format(cast('int32_t', tv['i'])) > - else: > - return 'number {}'.format(cast('double', tv['n'])) > - > - > -def dump_lj_invalid(tv): > - return 'not valid type @ {}'.format(strx64(gcval(tv['gcr']))) > - > - > -# }}} > - > - > -dumpers = { > - 'LJ_TNIL': dump_lj_tnil, > - 'LJ_TFALSE': dump_lj_tfalse, > - 'LJ_TTRUE': dump_lj_ttrue, > - 'LJ_TLIGHTUD': dump_lj_tlightud, > - 'LJ_TSTR': dump_lj_tstr, > - 'LJ_TUPVAL': dump_lj_tupval, > - 'LJ_TTHREAD': dump_lj_tthread, > - 'LJ_TPROTO': dump_lj_tproto, > - 'LJ_TFUNC': dump_lj_tfunc, > - 'LJ_TTRACE': dump_lj_ttrace, > - 'LJ_TCDATA': dump_lj_tcdata, > - 'LJ_TTAB': dump_lj_ttab, > - 'LJ_TUDATA': dump_lj_tudata, > - 'LJ_TNUMX': dump_lj_tnumx, > -} > - > - > -def dump_tvalue(tvalue): > - return dumpers.get(typenames(itypemap(tvalue)), dump_lj_invalid)(tvalue) > - > - > -def dump_framelink_slot_address(fr): > - return '{}:{}'.format(fr - 1, fr) if LJ_FR2 \ > - else '{}'.format(fr) + PADDING > - > - > -def dump_framelink(L, fr): > - if fr == frame_sentinel(L): > - return '{addr} [S ] FRAME: dummy L'.format( > - addr=dump_framelink_slot_address(fr), > - ) > - return '{addr} [ ] FRAME: [{pp}] delta={d}, {f}'.format( > - addr=dump_framelink_slot_address(fr), > - pp='PP' if frame_ispcall(fr) else '{frname}{p}'.format( > - frname=frametypes(int(frame_type(fr))), > - p='P' if frame_typep(fr) & FRAME_P else '' > - ), > - d=cast('TValue *', fr) - cast('TValue *', 
frame_prev(fr)), > - f=dump_lj_tfunc(fr - LJ_FR2), > - ) > - > - > -def dump_stack_slot(L, slot, base=None, top=None): > - base = base or L['base'] > - top = top or L['top'] > - > - return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}'.format( > - addr=strx64(slot), > - padding=PADDING, > - B='B' if slot == base else ' ', > - T='T' if slot == top else ' ', > - M='M' if slot == mref('TValue *', L['maxstack']) else ' ', > - value=dump_tvalue(slot), > - ) > - > - > -def dump_stack(L, base=None, top=None): > - base = base or L['base'] > - top = top or L['top'] > - stack = mref('TValue *', L['stack']) > - maxstack = mref('TValue *', L['maxstack']) > - red = 5 + 2 * LJ_FR2 > - > - dump = [ > - '{padding} Red zone: {nredslots: >2} slots {padding}'.format( > - padding='-' * len(PADDING), > - nredslots=red, > - ), > - ] > - dump.extend([ > - dump_stack_slot(L, maxstack + offset, base, top) > - for offset in range(red, 0, -1) # noqa: E131 > - ]) > - dump.extend([ > - '{padding} Stack: {nstackslots: >5} slots {padding}'.format( > - padding='-' * len(PADDING), > - nstackslots=int((tou64(maxstack) - tou64(stack)) >> 3), > - ), > - dump_stack_slot(L, maxstack, base, top), > - '{start}:{end} [ ] {nfreeslots} slots: Free stack slots'.format( > - start=strx64(top + 1), > - end=strx64(maxstack - 1), > - nfreeslots=int((tou64(maxstack) - tou64(top) - 8) >> 3), > - ), > - ]) > - > - for framelink, frametop in frames(L): > - # Dump all data slots in the (framelink, top) interval. > - dump.extend([ > - dump_stack_slot(L, framelink + offset, base, top) > - for offset in range(frametop - framelink, 0, -1) # noqa: E131 > - ]) > - # Dump frame slot (2 slots in case of GC64). 
> - dump.append(dump_framelink(L, framelink)) > - > - return '\n'.join(dump) > - > - > -def dump_gc(g): > - gc = g['gc'] > - stats = ['{key}: {value}'.format(key=f, value=gc[f]) for f in ( > - 'total', 'threshold', 'debt', 'estimate', 'stepmul', 'pause' > - )] > - > - stats += ['sweepstr: {sweepstr}/{strmask}'.format( > - sweepstr=gc['sweepstr'], > - # String hash mask (size of hash table - 1). > - strmask=g['strmask'] + 1, > - )] > - > - stats += ['{key}: {number} objects'.format( > - key=stat, > - number=handler(gc[stat]) > - ) for stat, handler in gclen.items()] > - > - return '\n'.join(map(lambda s: '\t' + s, stats)) > - > - > -class LJBase(gdb.Command): > - > - def __init__(self, name): > - # XXX Fragile: though the command initialization looks like a crap but > - # it respects both Python 2 and Python 3. > - gdb.Command.__init__(self, name, gdb.COMMAND_DATA) > - gdb.write('{} command initialized\n'.format(name)) > - > - > -class LJDumpArch(LJBase): > - ''' > -lj-arch > - > -The command requires no args and dumps values of LJ_64 and LJ_GC64 > -compile-time flags. These values define the sizes of host and GC > -pointers respectively. > - ''' > - > - def invoke(self, arg, from_tty): > - gdb.write( > - 'LJ_64: {LJ_64}, LJ_GC64: {LJ_GC64}, LJ_DUALNUM: {LJ_DUALNUM}\n' > - .format( > - LJ_64=LJ_64, > - LJ_GC64=LJ_GC64, > - LJ_DUALNUM=LJ_DUALNUM > - ) > - ) > - > - > -class LJDumpTValue(LJBase): > - ''' > -lj-tv <TValue *> > - > -The command receives a pointer to <tv> (TValue address) and dumps > -the type and some info related to it. 
> - > -* LJ_TNIL: nil > -* LJ_TFALSE: false > -* LJ_TTRUE: true > -* LJ_TLIGHTUD: light userdata @ <gcr> > -* LJ_TSTR: string <string payload> @ <gcr> > -* LJ_TUPVAL: upvalue @ <gcr> > -* LJ_TTHREAD: thread @ <gcr> > -* LJ_TPROTO: proto @ <gcr> > -* LJ_TFUNC: <LFUNC|CFUNC|FFUNC> > - <LFUNC>: Lua function @ <gcr>, <nupvals> upvalues, <chunk:line> > - <CFUNC>: C function <mcode address> > - <FFUNC>: fast function #<ffid> > -* LJ_TTRACE: trace <traceno> @ <gcr> > -* LJ_TCDATA: cdata @ <gcr> > -* LJ_TTAB: table @ <gcr> (asize: <asize>, hmask: <hmask>) > -* LJ_TUDATA: userdata @ <gcr> > -* LJ_TNUMX: number <numeric payload> > - > -Whether the type of the given address differs from the listed above, then > -error message occurs. > - ''' > - > - def invoke(self, arg, from_tty): > - tv = cast('TValue *', parse_arg(arg)) > - gdb.write('{}\n'.format(dump_tvalue(tv))) > - > - > -class LJDumpString(LJBase): > - ''' > -lj-str <GCstr *> > - > -The command receives a <gcr> of the corresponding GCstr object and dumps > -the payload, size in bytes and hash. > - > -*Caveat*: Since Python 2 provides no native Unicode support, the payload > -is replaced with the corresponding error when decoding fails. 
> - ''' > - > - def invoke(self, arg, from_tty): > - string = cast('GCstr *', parse_arg(arg)) > - gdb.write("String: {body} [{len} bytes] with hash {hash}\n".format( > - body=strdata(string), > - hash=strx64(string['hash']), > - len=string['len'], > - )) > - > - > -class LJDumpTable(LJBase): > - ''' > -lj-tab <GCtab *> > - > -The command receives a GCtab address and dumps the table contents: > -* Metatable address whether the one is set > -* Array part <asize> slots: > - <aslot ptr>: [<index>]: <tv> > -* Hash part <hsize> nodes: > - <hnode ptr>: { <tv> } => { <tv> }; next = <next hnode ptr> > - ''' > - > - def invoke(self, arg, from_tty): > - t = cast('GCtab *', parse_arg(arg)) > - array = mref('TValue *', t['array']) > - nodes = mref('struct Node *', t['node']) > - mt = gcval(t['metatable']) > - capacity = { > - 'apart': int(t['asize']), > - 'hpart': int(t['hmask'] + 1) if t['hmask'] > 0 else 0 > - } > - > - if mt != 0: > - gdb.write('Metatable detected: {}\n'.format(strx64(mt))) > - > - gdb.write('Array part: {} slots\n'.format(capacity['apart'])) > - for i in range(capacity['apart']): > - slot = array + i > - gdb.write('{ptr}: [{index}]: {value}\n'.format( > - ptr=slot, > - index=i, > - value=dump_tvalue(slot) > - )) > - > - gdb.write('Hash part: {} nodes\n'.format(capacity['hpart'])) > - # See hmask comment in lj_obj.h > - for i in range(capacity['hpart']): > - node = nodes + i > - gdb.write('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}\n'.format( > - ptr=node, > - key=dump_tvalue(node['key']), > - val=dump_tvalue(node['val']), > - n=mref('struct Node *', node['next']) > - )) > - > - > -class LJDumpStack(LJBase): > - ''' > -lj-stack [<lua_State *>] > - > -The command receives a lua_State address and dumps the given Lua > -coroutine guest stack: > - > -<slot ptr> [<slot attributes>] <VALUE|FRAME> > - > -* <slot ptr>: guest stack slot address > -* <slot attributes>: > - - S: Bottom of the stack (the slot L->stack points to) > - - B: Base of the current guest 
frame (the slot L->base points to) > - - T: Top of the current guest frame (the slot L->top points to) > - - M: Last slot of the stack (the slot L->maxstack points to) > -* <VALUE>: see help lj-tv for more info > -* <FRAME>: framelink slot differs from the value slot: it contains info > - related to the function being executed within this guest frame, its > - type and link to the parent guest frame > - [<frame type>] delta=<slots in frame>, <lj-tv for LJ_TFUNC slot> > - - <frame type>: > - + L: VM performs a call as a result of bytecode execution > - + C: VM performs a call as a result of lj_vm_call > - + M: VM performs a call to a metamethod as a result of bytecode > - execution > - + V: Variable-length frame for storing arguments of a variadic > - function > - + CP: Protected C frame > - + PP: VM performs a call as a result of executinig pcall or xpcall > - > -If L is omitted the main coroutine is used. > - ''' > - > - def invoke(self, arg, from_tty): > - gdb.write('{}\n'.format(dump_stack(L(parse_arg(arg))))) > - > - > -class LJState(LJBase): > - ''' > -lj-state > -The command requires no args and dumps current VM and GC states > -* VM state: <INTERP|C|GC|EXIT|RECORD|OPT|ASM|TRACE> > -* GC state: <PAUSE|PROPAGATE|ATOMIC|SWEEPSTRING|SWEEP|FINALIZE|LAST> > -* JIT state: <IDLE|ACTIVE|RECORD|START|END|ASM|ERR> > - ''' > - > - def invoke(self, arg, from_tty): > - g = G(L(None)) > - gdb.write('{}\n'.format('\n'.join( > - map(lambda t: '{} state: {}'.format(*t), { > - 'VM': vm_state(g), > - 'GC': gc_state(g), > - 'JIT': jit_state(g), > - }.items()) > - ))) > - > - > -class LJGC(LJBase): > - ''' > -lj-gc > - > -The command requires no args and dumps current GC stats: > -* total: <total number of allocated bytes in GC area> > -* threshold: <limit when gc step is triggered> > -* debt: <how much GC is behind schedule> > -* estimate: <estimate of memory actually in use> > -* stepmul: <incremental GC step granularity> > -* pause: <pause between successive GC cycles> > -* 
sweepstr: <sweep position in string table> > -* root: <number of all collectable objects> > -* gray: <number of gray objects> > -* grayagain: <number of objects for atomic traversal> > -* weak: <number of weak tables (to be cleared)> > -* mmudata: <number of udata|cdata to be finalized> > - ''' > - > - def invoke(self, arg, from_tty): > - g = G(L(None)) > - gdb.write('GC stats: {state}\n{stats}\n'.format( > - state=gc_state(g), > - stats=dump_gc(g) > - )) > - > - > -def init(commands): > - global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, LJ_TISNUM, PADDING > - > - # XXX Fragile: though connecting the callback looks like a crap but it > - # respects both Python 2 and Python 3 (see #4828). > - def connect(callback): > - if LEGACY: > - global CONNECTED > - CONNECTED = True > - gdb.events.new_objfile.connect(callback) > - > - # XXX Fragile: though disconnecting the callback looks like a crap but it > - # respects both Python 2 and Python 3 (see #4828). > - def disconnect(callback): > - if LEGACY: > - global CONNECTED > - if not CONNECTED: > - return > - CONNECTED = False > - gdb.events.new_objfile.disconnect(callback) > - > - try: > - # Try to remove the callback at first to not append duplicates to > - # gdb.events.new_objfile internal list. > - disconnect(load) > - except Exception: > - # Callback is not connected. > - pass > - > - try: > - # Detect whether libluajit objfile is loaded. > - gdb.parse_and_eval('luaJIT_setmode') > - except Exception: > - gdb.write('luajit-gdb.py initialization is postponed ' > - 'until libluajit objfile is loaded\n') > - # Add a callback to be executed when the next objfile is loaded. 
> - connect(load) > - return > - > - try: > - LJ_64 = str(gdb.parse_and_eval('IRT_PTR')) == 'IRT_P64' > - LJ_FR2 = LJ_GC64 = str(gdb.parse_and_eval('IRT_PGC')) == 'IRT_P64' > - LJ_DUALNUM = gdb.lookup_global_symbol('lj_lib_checknumber') is not None > - except Exception: > - gdb.write('luajit-gdb.py failed to load: ' > - 'no debugging symbols found for libluajit\n') > - return > - > - for name, command in commands.items(): > - command(name) > - > - PADDING = ' ' * len(':' + hex((1 << (47 if LJ_GC64 else 32)) - 1)) > - LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] > - > - gdb.write('luajit-gdb.py is successfully loaded\n') > - > - > -def load(event=None): > - init({ > - 'lj-arch': LJDumpArch, > - 'lj-tv': LJDumpTValue, > - 'lj-str': LJDumpString, > - 'lj-tab': LJDumpTable, > - 'lj-stack': LJDumpStack, > - 'lj-state': LJState, > - 'lj-gc': LJGC, > - }) > - > - > -load(None) > diff --git a/src/luajit_lldb.py b/src/luajit_dbg.py > similarity index 63% > rename from src/luajit_lldb.py > rename to src/luajit_dbg.py > index 5ac11b65..a42d8f25 100644 > --- a/src/luajit_lldb.py > +++ b/src/luajit_dbg.py > @@ -1,10 +1,230 @@ > -# LLDB extension for LuaJIT post-mortem analysis. > -# To use, just put 'command script import <path-to-repo>/src/luajit_lldb.py' > -# in lldb. > +# Debug extension for LuaJIT post-mortem analysis. 
> +# To use in LLDB: 'command script import <path-to-repo>/src/luajit_dbg.py' > +# To use in GDB: 'source <path-to-repo>/src/luajit_dbg.py' > > import abc > import re > -import lldb > +import sys > +import types > + > +from importlib import import_module > + > +# make script compatible with the ancient Python {{{ > + > + > +LEGACY = re.match(r'^2\.', sys.version) > + > +if LEGACY: > + CONNECTED = False > + int = long > + range = xrange > + > + > +def is_integer_type(val): > + return isinstance(val, int) or (LEGACY and isinstance(val, types.IntType)) > + > + > +# }}} > + > + > +class Debugger(object): > + def __init__(self): > + self.GDB = False > + self.LLDB = False > + > + debuggers = { > + 'gdb': lambda lib: True, > + 'lldb': lambda lib: lib.debugger is not None, > + } > + for name, healthcheck in debuggers.items(): > + lib = None > + try: > + lib = import_module(name) > + if healthcheck(lib): > + setattr(self, name.upper(), True) > + globals()[name] = lib > + self.name = name > + except Exception: > + continue > + > + assert self.LLDB != self.GDB > + > + def setup_target(self, debugger): > + global target > + if self.LLDB: > + target = debugger.GetSelectedTarget() > + > + def write(self, msg): > + if self.LLDB: > + print(msg) > + else: > + gdb.write(msg + '\n') > + > + def cmd_init(self, cmd_cls, debugger=None): > + if self.LLDB: > + debugger.HandleCommand( > + 'command script add --overwrite --class ' > + 'luajit_dbg.{cls} {cmd}' > + .format( > + cls=cmd_cls.__name__, > + cmd=cmd_cls.command, > + ) > + ) > + else: > + cmd_cls() > + > + def event_connect(self, callback): > + if not self.LLDB: > + # XXX Fragile: though connecting the callback looks like a crap but > + # it respects both Python 2 and Python 3 (see #4828). 
> + if LEGACY: > + global CONNECTED > + CONNECTED = True > + gdb.events.new_objfile.connect(callback) > + > + def event_disconnect(self, callback): > + if not self.LLDB: > + # XXX Fragile: though disconnecting the callback looks like a crap > + # but it respects both Python 2 and Python 3 (see #4828). > + if LEGACY: > + global CONNECTED > + if not CONNECTED: > + return > + CONNECTED = False > + gdb.events.new_objfile.disconnect(callback) > + > + def lookup_variable(self, name): > + if self.LLDB: > + return target.FindFirstGlobalVariable(name) > + else: > + variable, _ = gdb.lookup_symbol(name) > + return variable.value() if variable else None > + > + def lookup_symbol(self, sym): > + if self.LLDB: > + return target.modules[0].FindSymbol(sym) > + else: > + return gdb.lookup_global_symbol(sym) > + > + def to_unsigned(self, val): > + return val.unsigned if self.LLDB else int(val) > + > + def to_signed(self, val): > + return val.signed if self.LLDB else int(val) > + > + def to_str(self, val): > + return val.value if self.LLDB else str(val) > + > + def find_type(self, typename): > + if self.LLDB: > + return target.FindFirstType(typename) > + else: > + return gdb.lookup_type(typename) > + > + def type_to_pointer_type(self, tp): > + if self.LLDB: > + return tp.GetPointerType() > + else: > + return tp.pointer() > + > + def cast_impl(self, value, t, pointer_type): > + if self.LLDB: > + if is_integer_type(value): > + # Integer casts require some black magic > + # for lldb to behave properly. 
> + if pointer_type: > + return target.CreateValueFromAddress( > + 'value', > + lldb.SBAddress(value, target), > + t.GetPointeeType(), > + ).address_of > + else: > + return target.CreateValueFromData( > + name='value', > + data=lldb.SBData.CreateDataFromInt(value, size=8), > + type=t, > + ) > + else: > + return value.Cast(t) > + else: > + return gdb.Value(value).cast(t) > + > + def dereference(self, val): > + if self.LLDB: > + return val.Dereference() > + else: > + return val.dereference() > + > + def eval(self, expression): > + if self.LLDB: > + process = target.GetProcess() > + thread = process.GetSelectedThread() > + frame = thread.GetSelectedFrame() > + > + if not expression: > + return None > + > + return frame.EvaluateExpression(expression) > + else: > + return gdb.parse_and_eval(expression) > + > + def type_sizeof_impl(self, tp): > + if self.LLDB: > + return tp.GetByteSize() > + else: > + return tp.sizeof > + > + def summary(self, val): > + if self.LLDB: > + return val.summary > + else: > + return str(val)[len(PADDING):].strip() > + > + def type_member(self, type_obj, name): > + if self.LLDB: > + return next((x for x in type_obj.members if x.name == name), None) > + else: > + return type_obj[name] > + > + def type_member_offset(self, member): > + if self.LLDB: > + return member.GetOffsetInBytes() > + else: > + return member.bitpos / 8 > + > + def get_member(self, value, member_name): > + if self.LLDB: > + return value.GetChildMemberWithName(member_name) > + else: > + return value[member_name] > + > + def address_of(self, value): > + if self.LLDB: > + return value.address_of > + else: > + return value.address > + > + def arch_init(self): > + global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM, target > + if self.LLDB: > + irtype_enum = dbg.find_type('IRType').enum_members > + for member in irtype_enum: > + if member.name == 'IRT_PTR': > + LJ_64 = dbg.to_unsigned(member) & 0x1f == IRT_P64 > + if member.name == 'IRT_PGC': > + LJ_GC64 = 
dbg.to_unsigned(member) & 0x1f == IRT_P64 > + else: > + LJ_64 = str(dbg.eval('IRT_PTR')) == 'IRT_P64' > + LJ_GC64 = str(dbg.eval('IRT_PGC')) == 'IRT_P64' > + > + LJ_FR2 = LJ_GC64 > + LJ_DUALNUM = dbg.lookup_symbol('lj_lib_checknumber') is not None > + # Two extra characters are required to fit in the `0x` part. > + PADDING = ' ' * len(strx64(L())) > + LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] > + > + > +dbg = Debugger() > > LJ_64 = None > LJ_GC64 = None > @@ -17,68 +237,73 @@ IRT_P64 = 9 > LJ_GCVMASK = ((1 << 47) - 1) > LJ_TISNUM = None > > -# Debugger specific {{{ > - > - > # Global > target = None > > > -class Ptr: > +class Ptr(object): > def __init__(self, value, normal_type): > self.value = value > self.normal_type = normal_type > > @property > def __deref(self): > - return self.normal_type(self.value.Dereference()) > + return self.normal_type(dbg.dereference(self.value)) > > def __add__(self, other): > - assert isinstance(other, int) > + assert is_integer_type(other) > return self.__class__( > cast( > self.normal_type.__name__ + ' *', > cast( > 'uintptr_t', > - self.value.unsigned + other * self.value.deref.size, > + dbg.to_unsigned(self.value) + other * sizeof( > + self.normal_type.__name__ > + ), > ), > ), > ) > > def __sub__(self, other): > - assert isinstance(other, int) or isinstance(other, Ptr) > - if isinstance(other, int): > + assert is_integer_type(other) or isinstance(other, Ptr) > + if is_integer_type(other): > return self.__add__(-other) > else: > - return int((self.value.unsigned - other.value.unsigned) > - / sizeof(self.normal_type.__name__)) > + return int( > + ( > + dbg.to_unsigned(self.value) - dbg.to_unsigned(other.value) > + ) / sizeof(self.normal_type.__name__) > + ) > > def __eq__(self, other): > - assert isinstance(other, Ptr) or isinstance(other, int) and other >= 0 > + assert isinstance(other, Ptr) or is_integer_type(other) > if isinstance(other, Ptr): > - return self.value.unsigned == other.value.unsigned > + 
return dbg.to_unsigned(self.value) == dbg.to_unsigned(other.value) > else: > - return self.value.unsigned == other > + return dbg.to_unsigned(self.value) == other > > def __ne__(self, other): > return not self == other > > def __gt__(self, other): > assert isinstance(other, Ptr) > - return self.value.unsigned > other.value.unsigned > + return dbg.to_unsigned(self.value) > dbg.to_unsigned(other.value) > > def __ge__(self, other): > assert isinstance(other, Ptr) > - return self.value.unsigned >= other.value.unsigned > + return dbg.to_unsigned(self.value) >= dbg.to_unsigned(other.value) > > def __bool__(self): > - return self.value.unsigned != 0 > + return dbg.to_unsigned(self.value) != 0 > > def __int__(self): > - return self.value.unsigned > + return dbg.to_unsigned(self.value) > + > + def __long__(self): > + return dbg.to_unsigned(self.value) > > def __str__(self): > - return self.value.value > + return dbg.to_str(self.value) > > def __getattr__(self, name): > if name != '__deref': > @@ -86,53 +311,26 @@ class Ptr: > return self.__deref > > > -class MetaStruct(type): > - def __init__(cls, name, bases, nmspc): > - super(MetaStruct, cls).__init__(name, bases, nmspc) > - > - def make_general(field, tp): > - builtin = { > - 'uint': 'unsigned', > - 'int': 'signed', > - 'string': 'value', > - } > - if tp in builtin.keys(): > - return lambda self: getattr(self[field], builtin[tp]) > - else: > - return lambda self: globals()[tp](self[field]) > - > - if hasattr(cls, 'metainfo'): > - for field in cls.metainfo: > - if not isinstance(field[0], str): > - setattr(cls, field[1], field[0]) > - else: > - setattr( > - cls, > - field[1], > - property(make_general(field[1], field[0])), > - ) > - > - > -class Struct(metaclass=MetaStruct): > +class Struct(object): > def __init__(self, value): > self.value = value > > def __getitem__(self, name): > - return self.value.GetChildMemberWithName(name) > + return dbg.get_member(self.value, name) > > @property > def addr(self): > - return 
self.value.address_of > + return dbg.address_of(self.value) > > > c_structs = { > 'MRef': [ > - (property(lambda self: self['ptr64'].unsigned if LJ_GC64 > - else self['ptr32'].unsigned), 'ptr') > + (property(lambda self: dbg.to_unsigned(self['ptr64']) if LJ_GC64 > + else dbg.to_unsigned(self['ptr32'])), 'ptr') > ], > 'GCRef': [ > - (property(lambda self: self['gcptr64'].unsigned if LJ_GC64 > - else self['gcptr32'].unsigned), 'gcptr') > + (property(lambda self: dbg.to_unsigned(self['gcptr64']) if LJ_GC64 > + else dbg.to_unsigned(self['gcptr32'])), 'gcptr') > ], > 'TValue': [ > ('GCRef', 'gcr'), > @@ -141,8 +339,12 @@ c_structs = { > ('int', 'it64'), > ('string', 'n'), > (property(lambda self: FR(self['fr']) if not LJ_GC64 else None), 'fr'), > - (property(lambda self: self['ftsz'].signed if LJ_GC64 else None), > - 'ftsz') > + ( > + property( > + lambda self: dbg.to_signed(self['ftsz']) if LJ_GC64 else None > + ), > + 'ftsz' > + ) > ], > 'GCState': [ > ('GCRef', 'root'), > @@ -216,26 +418,51 @@ c_structs = { > ('TValue', 'val'), > ('MRef', 'next') > ], > - 'BCIns': [] > + 'BCIns': [], > } > > > -for cls in c_structs.keys(): > - globals()[cls] = type(cls, (Struct, ), {'metainfo': c_structs[cls]}) > +def make_property_from_metadata(field, tp): > + builtin = { > + 'uint': dbg.to_unsigned, > + 'int': dbg.to_signed, > + 'string': dbg.to_str, > + } > + if tp in builtin.keys(): > + return lambda self: builtin[tp](self[field]) > + else: > + return lambda self: globals()[tp](self[field]) > + > + > +for cls, metainfo in c_structs.items(): > + cls_dict = {} > + for field in metainfo: > + if not isinstance(field[0], str): > + cls_dict[field[1]] = field[0] > + else: > + cls_dict[field[1]] = property( > + make_property_from_metadata(field[1], field[0]) > + ) > + globals()[cls] = type(cls, (Struct, ), cls_dict) > > > for cls in Struct.__subclasses__(): > ptr_name = cls.__name__ + 'Ptr' > > + def make_init(cls): > + return lambda self, value: super(type(self), self).__init__(value, 
cls) > + > globals()[ptr_name] = type(ptr_name, (Ptr,), { > - '__init__': > - lambda self, value: super(type(self), self).__init__(value, cls) > + '__init__': make_init(cls) > }) > > > -class Command(object): > - def __init__(self, debugger, unused): > - pass > +class Command(object if dbg.LLDB else gdb.Command): > + def __init__(self, debugger=None, unused=None): > + if dbg.GDB: > + # XXX Fragile: though initialization looks like a crap but it > + # respects both Python 2 and Python 3 (see #4828). > + gdb.Command.__init__(self, self.command, gdb.COMMAND_DATA) > > def get_short_help(self): > return self.__doc__.splitlines()[0] > @@ -245,21 +472,15 @@ class Command(object): > > def __call__(self, debugger, command, exe_ctx, result): > try: > - self.execute(debugger, command, result) > + self.execute(command) > except Exception as e: > msg = 'Failed to execute command `{}`: {}'.format(self.command, e) > result.SetError(msg) > > def parse(self, command): > - process = target.GetProcess() > - thread = process.GetSelectedThread() > - frame = thread.GetSelectedFrame() > - > if not command: > return None > - > - ret = frame.EvaluateExpression(command) > - return ret > + return dbg.to_unsigned(dbg.eval(command)) > > @abc.abstractproperty > def command(self): > @@ -270,7 +491,7 @@ class Command(object): > """ > > @abc.abstractmethod > - def execute(self, debugger, args, result): > + def execute(self, args): > """Implementation of the command. > Subclasses override this method to implement the logic of a given > command, e.g. printing a stacktrace. The command output should be > @@ -278,6 +499,11 @@ class Command(object): > properly routed to LLDB frontend. Any unhandled exception will be > automatically transformed into proper errors. 
> """ > + def invoke(self, arg, from_tty): > + try: > + self.execute(arg) > + except Exception as e: > + dbg.write(e) > > > def cast(typename, value): > @@ -299,75 +525,38 @@ def cast(typename, value): > name = name[:-1].strip() > pointer_type = True > > - # Get the lldb type representation. > - t = target.FindFirstType(name) > + # Get the inferior type representation. > + t = dbg.find_type(name) > if pointer_type: > - t = t.GetPointerType() > - > - if isinstance(value, int): > - # Integer casts require some black magic for lldb to behave properly. > - if pointer_type: > - casted = target.CreateValueFromAddress( > - 'value', > - lldb.SBAddress(value, target), > - t.GetPointeeType(), > - ).address_of > - else: > - casted = target.CreateValueFromData( > - name='value', > - data=lldb.SBData.CreateDataFromInt(value, size=8), > - type=t, > - ) > - else: > - casted = value.Cast(t) > + t = dbg.type_to_pointer_type(t) > + > + casted = dbg.cast_impl(value, t, pointer_type) > > if isinstance(typename, type): > - # Wrap lldb object, if possible > + # Wrap inferior object, if possible > return typename(casted) > else: > return casted > > > -def lookup_global(name): > - return target.FindFirstGlobalVariable(name) > - > - > -def type_member(type_obj, name): > - return next((x for x in type_obj.members if x.name == name), None) > - > - > -def find_type(typename): > - return target.FindFirstType(typename) > - > - > def offsetof(typename, membername): > - type_obj = find_type(typename) > - member = type_member(type_obj, membername) > + type_obj = dbg.find_type(typename) > + member = dbg.type_member(type_obj, membername) > assert member is not None > - return member.GetOffsetInBytes() > + return dbg.type_member_offset(member) > > > def sizeof(typename): > - type_obj = find_type(typename) > - return type_obj.GetByteSize() > + type_obj = dbg.find_type(typename) > + return dbg.type_sizeof_impl(type_obj) > > > def vtou64(value): > - return value.unsigned & 0xFFFFFFFFFFFFFFFF > + return 
dbg.to_unsigned(value) & 0xFFFFFFFFFFFFFFFF > > > def vtoi(value): > - return value.signed > - > - > -def dbg_eval(expr): > - process = target.GetProcess() > - thread = process.GetSelectedThread() > - frame = thread.GetSelectedFrame() > - return frame.EvaluateExpression(expr) > - > - > -# }}} Debugger specific > + return dbg.to_signed(value) > > > def gcval(obj): > @@ -393,7 +582,7 @@ def gclistlen(root, end=0x0): > > > def gcringlen(root): > - if not gcref(root): > + if gcref(root) == 0: > return 0 > elif gcref(root) == gcref(gcnext(root)): > return 1 > @@ -439,7 +628,7 @@ def J(g): > J_offset = offsetof('GG_State', 'J') > return cast( > jit_StatePtr, > - vtou64(cast('char *', g)) - g_offset + J_offset, > + int(vtou64(cast('char *', g)) - g_offset + J_offset), > ) > > > @@ -451,7 +640,7 @@ def L(L=None): > # lookup a symbol for the main coroutine considering the host app > # XXX Fragile: though the loop initialization looks like a crap but it > # respects both Python 2 and Python 3. > - for lstate in [L] + list(map(lambda main: lookup_global(main), ( > + for lstate in [L] + list(map(lambda main: dbg.lookup_variable(main), ( > # LuaJIT main coro (see luajit/src/luajit.c) > 'globalL', > # Tarantool main coro (see tarantool/src/lua/init.h) > @@ -459,7 +648,7 @@ def L(L=None): > # TODO: Add more > ))): > if lstate: > - return lua_State(lstate) > + return lua_StatePtr(lstate) > > > def tou32(val): > @@ -523,9 +712,9 @@ def funcproto(func): > def strdata(obj): > try: > ptr = cast('char *', obj + 1) > - return ptr.summary > + return dbg.summary(ptr) > except UnicodeEncodeError: > - return "<luajit-lldb: error occurred while rendering non-ascii slot>" > + return "<luajit_dbg: error occured while rendering non-ascii slot>" > > > def itype(o): > @@ -730,12 +919,12 @@ def frame_pc(framelink): > > > def frame_prevl(framelink): > - # We are evaluating the `frame_pc(framelink)[-1])` with lldb's > + # We are evaluating the `frame_pc(framelink)[-1])` with > # REPL, because the 
lldb API is faulty and it's not possible to cast > # a struct member of 32-bit type to 64-bit type without getting onto > # the next property bits, despite the fact that it's an actual value, not > # a pointer to it. > - bcins = vtou64(dbg_eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')) > + bcins = vtou64(dbg.eval('((BCIns *)' + str(frame_pc(framelink)) + ')[-1]')) > return framelink - (1 + LJ_FR2 + bc_a(bcins)) > > > @@ -789,12 +978,12 @@ def frames(L): > > def dump_framelink_slot_address(fr): > return '{start:{padding}}:{end:{padding}}'.format( > - start=hex(int(fr - 1)), > - end=hex(int(fr)), > + start=strx64(fr - 1), > + end=strx64(fr), > padding=len(PADDING), > ) if LJ_FR2 else '{addr:{padding}}'.format( > - addr=hex(int(fr)), > - padding=len(PADDING), > + addr=strx64(fr), > + padding=2 * len(PADDING) + 1, > ) > > > @@ -863,7 +1052,6 @@ def dump_stack(L, base=None, top=None): > nfreeslots=int((maxstack - top - 8) >> 3), > ), > ]) > - > for framelink, frametop in frames(L): > # Dump all data slots in the (framelink, top) interval. > dump.extend([ > @@ -904,9 +1092,11 @@ the type and some info related to it. > Whether the type of the given address differs from the listed above, then > error message occurs. 
> ''' > - def execute(self, debugger, args, result): > + command = 'lj-tv' > + > + def execute(self, args): > tvptr = TValuePtr(cast('TValue *', self.parse(args))) > - print('{}'.format(dump_tvalue(tvptr))) > + dbg.write('{}'.format(dump_tvalue(tvptr))) > > > class LJState(Command): > @@ -917,9 +1107,11 @@ The command requires no args and dumps current VM and GC states > * GC state: <PAUSE|PROPAGATE|ATOMIC|SWEEPSTRING|SWEEP|FINALIZE|LAST> > * JIT state: <IDLE|ACTIVE|RECORD|START|END|ASM|ERR> > ''' > - def execute(self, debugger, args, result): > + command = 'lj-state' > + > + def execute(self, args): > g = G(L(None)) > - print('{}'.format('\n'.join( > + dbg.write('{}'.format('\n'.join( > map(lambda t: '{} state: {}'.format(*t), { > 'VM': vm_state(g), > 'GC': gc_state(g), > @@ -936,8 +1128,10 @@ The command requires no args and dumps values of LJ_64 and LJ_GC64 > compile-time flags. These values define the sizes of host and GC > pointers respectively. > ''' > - def execute(self, debugger, args, result): > - print( > + command = 'lj-arch' > + > + def execute(self, args): > + dbg.write( > 'LJ_64: {LJ_64}, LJ_GC64: {LJ_GC64}, LJ_DUALNUM: {LJ_DUALNUM}' > .format( > LJ_64=LJ_64, > @@ -965,9 +1159,11 @@ The command requires no args and dumps current GC stats: > * weak: <number of weak tables (to be cleared)> > * mmudata: <number of udata|cdata to be finalized> > ''' > - def execute(self, debugger, args, result): > + command = 'lj-gc' > + > + def execute(self, args): > g = G(L(None)) > - print('GC stats: {state}\n{stats}'.format( > + dbg.write('GC stats: {state}\n{stats}'.format( > state=gc_state(g), > stats=dump_gc(g) > )) > @@ -983,9 +1179,11 @@ the payload, size in bytes and hash. > *Caveat*: Since Python 2 provides no native Unicode support, the payload > is replaced with the corresponding error when decoding fails. 
> ''' > - def execute(self, debugger, args, result): > + command = 'lj-str' > + > + def execute(self, args): > string_ptr = GCstrPtr(cast('GCstr *', self.parse(args))) > - print("String: {body} [{len} bytes] with hash {hash}".format( > + dbg.write("String: {body} [{len} bytes] with hash {hash}".format( > body=strdata(string_ptr), > hash=strx64(string_ptr.hash), > len=string_ptr.len, > @@ -1003,7 +1201,9 @@ The command receives a GCtab address and dumps the table contents: > * Hash part <hsize> nodes: > <hnode ptr>: { <tv> } => { <tv> }; next = <next hnode ptr> > ''' > - def execute(self, debugger, args, result): > + command = 'lj-tab' > + > + def execute(self, args): > t = GCtabPtr(cast('GCtab *', self.parse(args))) > array = mref(TValuePtr, t.array) > nodes = mref(NodePtr, t.node) > @@ -1014,22 +1214,22 @@ The command receives a GCtab address and dumps the table contents: > } > > if mt: > - print('Metatable detected: {}'.format(strx64(mt))) > + dbg.write('Metatable detected: {}'.format(strx64(mt))) > > - print('Array part: {} slots'.format(capacity['apart'])) > + dbg.write('Array part: {} slots'.format(capacity['apart'])) > for i in range(capacity['apart']): > slot = array + i > - print('{ptr}: [{index}]: {value}'.format( > + dbg.write('{ptr}: [{index}]: {value}'.format( > ptr=strx64(slot), > index=i, > value=dump_tvalue(slot) > )) > > - print('Hash part: {} nodes'.format(capacity['hpart'])) > + dbg.write('Hash part: {} nodes'.format(capacity['hpart'])) > # See hmask comment in lj_obj.h > for i in range(capacity['hpart']): > node = nodes + i > - print('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}'.format( > + dbg.write('{ptr}: {{ {key} }} => {{ {val} }}; next = {n}'.format( > ptr=strx64(node), > key=dump_tvalue(TValuePtr(node.key.addr)), > val=dump_tvalue(TValuePtr(node.val.addr)), > @@ -1069,56 +1269,72 @@ coroutine guest stack: > > If L is omitted the main coroutine is used. 
> ''' > - def execute(self, debugger, args, result): > + command = 'lj-stack' > + > + def execute(self, args): > lstate = self.parse(args) > - lstate_ptr = cast('lua_State *', lstate) if coro is not None else None > - print('{}'.format(dump_stack(L(lstate_ptr)))) > + lstate_ptr = cast('lua_State *', lstate) if lstate else None > + dbg.write('{}'.format(dump_stack(L(lstate_ptr)))) > > > -def register_commands(debugger, commands): > - for command, cls in commands.items(): > - cls.command = command > - debugger.HandleCommand( > - 'command script add --overwrite --class luajit_lldb.{cls} {cmd}' > - .format( > - cls=cls.__name__, > - cmd=cls.command, > - ) > - ) > - print('{cmd} command intialized'.format(cmd=cls.command)) > +LJ_COMMANDS = [ > + LJDumpTValue, > + LJState, > + LJDumpArch, > + LJGC, > + LJDumpString, > + LJDumpTable, > + LJDumpStack, > +] > + > > +def register_commands(commands, debugger=None): > + for cls in commands: > + dbg.cmd_init(cls, debugger) > + dbg.write('{cmd} command intialized'.format(cmd=cls.command)) > > -def configure(debugger): > - global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, PADDING, LJ_TISNUM, target > - target = debugger.GetSelectedTarget() > - module = target.modules[0] > - LJ_DUALNUM = module.FindSymbol('lj_lib_checknumber') is not None > > +def configure(debugger=None): > + global PADDING, LJ_TISNUM, LJ_DUALNUM > + dbg.setup_target(debugger) > try: > - irtype_enum = target.FindFirstType('IRType').enum_members > - for member in irtype_enum: > - if member.name == 'IRT_PTR': > - LJ_64 = member.unsigned & 0x1f == IRT_P64 > - if member.name == 'IRT_PGC': > - LJ_FR2 = LJ_GC64 = member.unsigned & 0x1f == IRT_P64 > + # Try to remove the callback at first to not append duplicates to > + # gdb.events.new_objfile internal list. 
> + dbg.event_disconnect(load) > except Exception: > - print('luajit_lldb.py failed to load: ' > - 'no debugging symbols found for libluajit') > - return > - > - PADDING = ' ' * len(strx64((TValuePtr(L().addr)))) > - LJ_TISNUM = 0xfffeffff if LJ_64 and not LJ_GC64 else LJ_T['NUMX'] > - > - > -def __lldb_init_module(debugger, internal_dict): > - configure(debugger) > - register_commands(debugger, { > - 'lj-tv': LJDumpTValue, > - 'lj-state': LJState, > - 'lj-arch': LJDumpArch, > - 'lj-gc': LJGC, > - 'lj-str': LJDumpString, > - 'lj-tab': LJDumpTable, > - 'lj-stack': LJDumpStack, > - }) > - print('luajit_lldb.py is successfully loaded') > + # Callback is not connected. > + pass > + > + try: > + # Detect whether libluajit objfile is loaded. > + dbg.eval('luaJIT_setmode') > + except Exception: > + dbg.write('luajit_dbg.py initialization is postponed ' > + 'until libluajit objfile is loaded\n') > + # Add a callback to be executed when the next objfile is loaded. > + dbg.event_connect(load) > + return False > + > + try: > + dbg.arch_init() > + except Exception: > + dbg.write('LuaJIT debug extension failed to load: ' > + 'no debugging symbols found for libluajit') > + return False > + return True > + > + > +# XXX: The dummy parameter is needed for this function to > +# work as a gdb callback. > +def load(_=None, debugger=None): > + if configure(debugger): > + register_commands(LJ_COMMANDS, debugger) > + dbg.write('LuaJIT debug extension is successfully loaded') > + > + > +def __lldb_init_module(debugger, _=None): > + load(None, debugger) > + > + > +if dbg.GDB: > + load()
* Re: [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 1/2] " Maxim Kokryashkin via Tarantool-patches 2024-04-04 10:14 ` Sergey Bronnikov via Tarantool-patches @ 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches 2024-04-17 22:42 ` Maxim Kokryashkin via Tarantool-patches 2024-08-14 19:34 ` Mikhail Elhimov via Tarantool-patches 2 siblings, 1 reply; 11+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2024-04-17 16:00 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Hi, Maxim! Thanks for the patch! Please consider my comments below. On 04.04.24, Maxim Kokryashkin wrote: <snipped> > --- > src/luajit-gdb.py | 885 -------------------------- > src/{luajit_lldb.py => luajit_dbg.py} | 616 ++++++++++++------ Since luajit_lldb.py is gone, please change the comment in <.flake8rc>. > 2 files changed, 416 insertions(+), 1085 deletions(-) > delete mode 100644 src/luajit-gdb.py > rename src/{luajit_lldb.py => luajit_dbg.py} (63%) > > diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py > deleted file mode 100644 <snipped> > diff --git a/src/luajit_lldb.py b/src/luajit_dbg.py > similarity index 63% > rename from src/luajit_lldb.py > rename to src/luajit_dbg.py > index 5ac11b65..a42d8f25 100644 > --- a/src/luajit_lldb.py > +++ b/src/luajit_dbg.py > @@ -1,10 +1,230 @@ > -# LLDB extension for LuaJIT post-mortem analysis. > -# To use, just put 'command script import <path-to-repo>/src/luajit_lldb.py' > -# in lldb. > +# Debug extension for LuaJIT post-mortem analysis. 
> +# To use in LLDB: 'command script import <path-to-repo>/src/luajit_dbg.py' > +# To use in GDB: 'source <path-to-repo>/src/luajit_dbg.py' > > import abc > import re > -import lldb > +import sys > +import types > + > +from importlib import import_module > + > +# make script compatible with the ancient Python {{{ Typo: s/make script/Make the script/ <snipped> > +class Debugger(object): > + def __init__(self): > + self.GDB = False > + self.LLDB = False > + > + debuggers = { > + 'gdb': lambda lib: True, > + 'lldb': lambda lib: lib.debugger is not None, > + } > + for name, healthcheck in debuggers.items(): > + lib = None > + try: > + lib = import_module(name) > + if healthcheck(lib): Why do we need this healthcheck? Why just import of the module isn't enough? Please add a comment near `debuggers` definition. > + setattr(self, name.upper(), True) > + globals()[name] = lib > + self.name = name > + except Exception: > + continue > + > + assert self.LLDB != self.GDB <snipped> > + > + def find_type(self, typename): > + if self.LLDB: > + return target.FindFirstType(typename) > + else: > + return gdb.lookup_type(typename) Why do you drop the cache for types here? It may be critical when running scripts for the search of objects on big coredumps or the attached process. > + > + def type_to_pointer_type(self, tp): > + if self.LLDB: > + return tp.GetPointerType() > + else: > + return tp.pointer() <snipped> > + > + def type_member_offset(self, member): > + if self.LLDB: > + return member.GetOffsetInBytes() > + else: > + return member.bitpos / 8 Should it be `//`? <snipped> > +class Struct(object): Should we do this part for GDB too? I thought that this class generation may be skipped for GDB. 
> def __init__(self, value): > self.value = value > > def __getitem__(self, name): <snipped> > +def make_property_from_metadata(field, tp): > + builtin = { > + 'uint': dbg.to_unsigned, > + 'int': dbg.to_signed, > + 'string': dbg.to_str, > + } > + if tp in builtin.keys(): > + return lambda self: builtin[tp](self[field]) > + else: > + return lambda self: globals()[tp](self[field]) > + > + > +for cls, metainfo in c_structs.items(): > + cls_dict = {} > + for field in metainfo: May you please name field[0], field[1] as local variables for better readability? > + if not isinstance(field[0], str): > + cls_dict[field[1]] = field[0] > + else: > + cls_dict[field[1]] = property( > + make_property_from_metadata(field[1], field[0]) > + ) > + globals()[cls] = type(cls, (Struct, ), cls_dict) > > > for cls in Struct.__subclasses__(): > ptr_name = cls.__name__ + 'Ptr' > <snipped> > - > - ret = frame.EvaluateExpression(command) > - return ret > + return dbg.to_unsigned(dbg.eval(command)) Why do we need return unsigned here? > > @abc.abstractproperty > def command(self): > @@ -270,7 +491,7 @@ class Command(object): <snipped> > @@ -278,6 +499,11 @@ class Command(object): > properly routed to LLDB frontend. Any unhandled exception will be > automatically transformed into proper errors. > """ > + def invoke(self, arg, from_tty): > + try: > + self.execute(arg) > + except Exception as e: > + dbg.write(e) Why do we need this change? The error message for such situation is changed and non informative: | Breakpoint 1, lj_cf_dofile (L=0x2) at /home/burii/reviews/luajit/lj-dbg/src/lib_base.c:429 | 429 { | (gdb) lj-stack L | Python Exception <class 'TypeError'>: unsupported operand type(s) for +: 'MemoryError' and 'str' | Error occurred in Python: unsupported operand type(s) for +: 'MemoryError' and 'str' Within the following implementation all works as expected. 
| def invoke(self, arg, from_tty): | self.execute(arg) This produces a more understandable reason for the error: | (gdb) lj-stack L | Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x26 | Error occurred in Python: Cannot access memory at address 0x26 Also, it may be good to add a test for this error. <snipped> -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 11+ messages in thread
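Editorial note: the TypeError quoted in the review above is consistent with an output helper that concatenates its argument with a string. The sketch below reproduces that effect under this assumption. The write() and format_error() helpers are hypothetical, and the builtin MemoryError stands in for gdb.MemoryError, which is only available inside a GDB session.

```python
import sys


def write(msg):
    # Assumed shape of the output helper: it expects a string.
    sys.stdout.write(msg + '\n')


def format_error(exc):
    # Convert the exception to text before it reaches write().
    return '{}: {}'.format(type(exc).__name__, exc)


err = MemoryError('Cannot access memory at address 0x26')

try:
    # Passing the exception object itself fails inside write().
    write(err)
    raised = None
except TypeError as type_error:
    raised = str(type_error)

# Passing a formatted string keeps the original reason visible.
write(format_error(err))
```

Formatting the exception before output is one way to keep the message readable in both debuggers. Whether the extension should catch the exception at all is the open question in the thread.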
* Re: [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches @ 2024-04-17 22:42 ` Maxim Kokryashkin via Tarantool-patches 2024-04-18 8:00 ` Sergey Kaplun via Tarantool-patches 0 siblings, 1 reply; 11+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2024-04-17 22:42 UTC (permalink / raw) To: Sergey Kaplun; +Cc: Maxim Kokryashkin, tarantool-patches Hi, Sergey! Thanks for the review! See my answers below. On Wed, Apr 17, 2024 at 07:00:00PM +0300, Sergey Kaplun wrote: > Hi, Maxim! > Thanks for the patch! > Please consider my comments below. > > On 04.04.24, Maxim Kokryashkin wrote: > > <snipped> > > > --- > > src/luajit-gdb.py | 885 -------------------------- > > src/{luajit_lldb.py => luajit_dbg.py} | 616 ++++++++++++------ > > Since luajit_lldb.py is gone, please change the comment in <.flake8rc>. Fixed. > > > 2 files changed, 416 insertions(+), 1085 deletions(-) > > delete mode 100644 src/luajit-gdb.py > > rename src/{luajit_lldb.py => luajit_dbg.py} (63%) > > > > diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py > > deleted file mode 100644 > > <snipped> > > > diff --git a/src/luajit_lldb.py b/src/luajit_dbg.py > > similarity index 63% > > rename from src/luajit_lldb.py > > rename to src/luajit_dbg.py > > index 5ac11b65..a42d8f25 100644 > > --- a/src/luajit_lldb.py > > +++ b/src/luajit_dbg.py > > @@ -1,10 +1,230 @@ > > -# LLDB extension for LuaJIT post-mortem analysis. > > -# To use, just put 'command script import <path-to-repo>/src/luajit_lldb.py' > > -# in lldb. > > +# Debug extension for LuaJIT post-mortem analysis. 
> > +# To use in LLDB: 'command script import <path-to-repo>/src/luajit_dbg.py' > > +# To use in GDB: 'source <path-to-repo>/src/luajit_dbg.py' > > > > import abc > > import re > > -import lldb > > +import sys > > +import types > > + > > +from importlib import import_module > > + > > +# make script compatible with the ancient Python {{{ > > Typo: s/make script/Make the script/ > > <snipped> Fixed. > > > +class Debugger(object): > > + def __init__(self): > > + self.GDB = False > > + self.LLDB = False > > + > > + debuggers = { > > + 'gdb': lambda lib: True, > > + 'lldb': lambda lib: lib.debugger is not None, > > + } > > + for name, healthcheck in debuggers.items(): > > + lib = None > > + try: > > + lib = import_module(name) > > + if healthcheck(lib): > > Why do we need this healthcheck? Why just import of the module isn't > enough? > Please add a comment near `debuggers` definition. Added. > > > + setattr(self, name.upper(), True) > > + globals()[name] = lib > > + self.name = name > > + except Exception: > > + continue > > + > > + assert self.LLDB != self.GDB > > <snipped> > > > + > > + def find_type(self, typename): > > + if self.LLDB: > > + return target.FindFirstType(typename) > > + else: > > + return gdb.lookup_type(typename) > > Why do you drop the cache for types here? It may be critical when > running scripts for the search of objects on big coredumps or the > attached process. Well, I believe I just forgot about that. Added. Side note: if only we could do the python3-only version of that, the fix would just be a single decorator in front of the function definition... > > + > > + def type_to_pointer_type(self, tp): > > + if self.LLDB: > > + return tp.GetPointerType() > > + else: > > + return tp.pointer() > > <snipped> > > > + > > + def type_member_offset(self, member): > > + if self.LLDB: > > + return member.GetOffsetInBytes() > > + else: > > + return member.bitpos / 8 > > Should it be `//`? Doesn't really matter, but ok. Fixed. 
> > <snipped> > > > +class Struct(object): > > Should we do this part for GDB too? I thought that this class generation > may be skipped for GDB. No, the whole idea is to encapsulate the debugger-specific things into the debugger object and these classes. So we cannot skip them, they have a crucial role in this adapter. > > > def __init__(self, value): > > self.value = value > > > > def __getitem__(self, name): > > <snipped> > > > +def make_property_from_metadata(field, tp): > > + builtin = { > > + 'uint': dbg.to_unsigned, > > + 'int': dbg.to_signed, > > + 'string': dbg.to_str, > > + } > > + if tp in builtin.keys(): > > + return lambda self: builtin[tp](self[field]) > > + else: > > + return lambda self: globals()[tp](self[field]) > > + > > + > > +for cls, metainfo in c_structs.items(): > > + cls_dict = {} > > + for field in metainfo: > > May you please name field[0], field[1] as local variables for better > readability? Fixed. > > > + if not isinstance(field[0], str): > > + cls_dict[field[1]] = field[0] > > + else: > > + cls_dict[field[1]] = property( > > + make_property_from_metadata(field[1], field[0]) > > + ) > > + globals()[cls] = type(cls, (Struct, ), cls_dict) > > > > > > for cls in Struct.__subclasses__(): > > ptr_name = cls.__name__ + 'Ptr' > > > > > <snipped> > > > - > > - ret = frame.EvaluateExpression(command) > > - return ret > > + return dbg.to_unsigned(dbg.eval(command)) > > Why do we need return unsigned here? Because all of our commands accept either pointers or numbers as arguments, and lldb's eval may return a string instead. > > > > > @abc.abstractproperty > > def command(self): > > @@ -270,7 +491,7 @@ class Command(object): > > <snipped> > > > @@ -278,6 +499,11 @@ class Command(object): > > properly routed to LLDB frontend. Any unhandled exception will be > > automatically transformed into proper errors.
> > """ > > + def invoke(self, arg, from_tty): > > + try: > > + self.execute(arg) > > + except Exception as e: > > + dbg.write(e) > > Why do we need this change? > > The error message for such situation is changed and non informative: > > | Breakpoint 1, lj_cf_dofile (L=0x2) at /home/burii/reviews/luajit/lj-dbg/src/lib_base.c:429 > | 429 { > | (gdb) lj-stack L > | Python Exception <class 'TypeError'>: unsupported operand type(s) for +: 'MemoryError' and 'str' > | Error occurred in Python: unsupported operand type(s) for +: 'MemoryError' and 'str' > > Within the following implementation all works as expected. > | def invoke(self, arg, from_tty): > | self.execute(arg) Fixed. > > This produces more understandable reason of an error: > | (gdb) lj-stack L > | Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x26 > | Error occurred in Python: Cannot access memory at address 0x26 > Without the explicit exception output LLDB command just fails silently. Choosing from a less readable error and not understanding what happened (and whether it even happened at all) I would personally prefer the first option. > Also, maybe it is good to add a test for this error. Kind of a strange idea to test whether the extension throws an unhandled exception. It's like checking if Tarantool gives a segfault with a certain backtrace after a NULL dereference. > > <snipped Here is the diff with changes: === diff --git a/.flake8rc b/.flake8rc index 6766ed41..6c1200fc 100644 --- a/.flake8rc +++ b/.flake8rc @@ -1,7 +1,7 @@ [flake8] extend-ignore = # XXX: Suppress F821, since we have autogenerated names for - # 'ptr' type complements in luajit_lldb.py. + # 'ptr' type complements in luajit_dbg.py. 
F821 per-file-ignores = # XXX: Flake considers regexp special characters to be diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py index a42d8f25..e800b514 100644 --- a/src/luajit_dbg.py +++ b/src/luajit_dbg.py @@ -9,7 +9,7 @@ import types from importlib import import_module -# make script compatible with the ancient Python {{{ +# Make the script compatible with the ancient Python {{{ LEGACY = re.match(r'^2\.', sys.version) @@ -31,7 +31,13 @@ class Debugger(object): def __init__(self): self.GDB = False self.LLDB = False + self.type_cache = {} + # XXX: While the `gdb` library is only available inside + # a debug session, the `lldb` library can be loaded in + # any Python script. To address that, we need to perform + # an additional check to ensure a debug session is + # actually running. debuggers = { 'gdb': lambda lib: True, 'lldb': lambda lib: lib.debugger is not None, @@ -116,10 +122,12 @@ class Debugger(object): return val.value if self.LLDB else str(val) def find_type(self, typename): - if self.LLDB: - return target.FindFirstType(typename) - else: - return gdb.lookup_type(typename) + if typename not in self.type_cache: + if self.LLDB: + self.type_cache[typename] = target.FindFirstType(typename) + else: + self.type_cache[typename] = gdb.lookup_type(typename) + return self.type_cache[typename] def type_to_pointer_type(self, tp): if self.LLDB: @@ -190,7 +198,7 @@ class Debugger(object): if self.LLDB: return member.GetOffsetInBytes() else: - return member.bitpos / 8 + return member.bitpos // 8 def get_member(self, value, member_name): if self.LLDB: @@ -437,11 +445,13 @@ def make_property_from_metadata(field, tp): for cls, metainfo in c_structs.items(): cls_dict = {} for field in metainfo: - if not isinstance(field[0], str): - cls_dict[field[1]] = field[0] + prop_constructor = field[0] + prop_name = field[1] + if not isinstance(prop_constructor, str): + cls_dict[prop_name] = prop_constructor else: - cls_dict[field[1]] = property( - 
make_property_from_metadata(field[1], field[0]) + cls_dict[prop_name] = property( + make_property_from_metadata(prop_name, prop_constructor) ) globals()[cls] = type(cls, (Struct, ), cls_dict) === > > -- > Best regards, > Sergey Kaplun ^ permalink raw reply [flat|nested] 11+ messages in thread
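Editorial note: the type cache restored in the diff above is plain dictionary memoization. The sketch below shows that approach next to a decorator-based variant. It is an assumption that the "single decorator" mentioned in the reply means something like functools.lru_cache, which exists only in Python 3. slow_lookup is a hypothetical stand-in for gdb.lookup_type or target.FindFirstType.

```python
import functools

lookups = []


def slow_lookup(typename):
    # Hypothetical stand-in for the debugger's type lookup; it
    # records each call so the caching effect is observable.
    lookups.append(typename)
    return 'type<{}>'.format(typename)


class DictCache(object):
    # Works on both Python 2 and Python 3.
    def __init__(self):
        self.type_cache = {}

    def find_type(self, typename):
        if typename not in self.type_cache:
            self.type_cache[typename] = slow_lookup(typename)
        return self.type_cache[typename]


# Python 3 only: the same memoization expressed as a decorator.
@functools.lru_cache(maxsize=None)
def find_type_py3(typename):
    return slow_lookup(typename)


cache = DictCache()
cache.find_type('GCstr')
cache.find_type('GCstr')
find_type_py3('GCtab')
find_type_py3('GCtab')
```

Each type name reaches slow_lookup only once in either variant, which is the property the review asked to preserve for large coredumps.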
* Re: [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension 2024-04-17 22:42 ` Maxim Kokryashkin via Tarantool-patches @ 2024-04-18 8:00 ` Sergey Kaplun via Tarantool-patches 0 siblings, 0 replies; 11+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2024-04-18 8:00 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: Maxim Kokryashkin, tarantool-patches Hi, Maxim! Thanks for the fixes! Unfortunately, I am troubled with other regressions. Assume, we run the following command: | gdb --args src/luajit -e 'print(1)' and try to load the extension before running the child process. Then we got the following error: | (gdb) b lj_cf_print | Breakpoint 1 at 0x39ce3: file /home/burii/reviews/luajit/lj-dbg/src/lib_base.c, line 496. | (gdb) source src/luajit_dbg.py | LuaJIT debug extension failed to load: no debugging symbols found for libluajit While the original gdb extension is loaded as expected: | (gdb) source ~/builds_workspace/luajit/master/src/luajit-gdb.py | lj-arch command initialized | lj-tv command initialized | lj-str command initialized | lj-tab command initialized | lj-stack command initialized | lj-state command initialized | lj-gc command initialized | luajit-gdb.py is successfully loaded Also, when I used something like the following for the generalized debugger extension, I got the following: | (gdb) lj-stack 0x0 | ---------- Red zone: 5 slots ---------- | 0x40001a80 [ ] VALUE: nil | ... | (gdb) info locals | i = 0 | nargs = 93824992266313 | tv = 0x0 | __func__ = "lj_cf_print" | shortcut = 21845 | (gdb) lj-stack tv | ---------- Red zone: 5 slots ---------- | ... While in the old version of debugger I got the following output: | (gdb) lj-stack tv | table argument empty | (gdb) lj-stack 0x0 | table argument empty I suppose, that the new debugger extension version used default global `main_L` when the given argument is 0. On 18.04.24, Maxim Kokryashkin wrote: > Hi, Sergey! > Thanks for the review! > See my answers below. 
> On Wed, Apr 17, 2024 at 07:00:00PM +0300, Sergey Kaplun wrote: <snipped> > > > - > > > - ret = frame.EvaluateExpression(command) > > > - return ret > > > + return dbg.to_unsigned(dbg.eval(command)) > > > > Why do we need return unsigned here? > Because all of our commands accept either pointers, or numbers as > argumnents and lldb's eval may return a string instead. Got it, thanks! A comment would be appreciated. > > > > > > > > @abc.abstractproperty > > > def command(self): > > > @@ -270,7 +491,7 @@ class Command(object): > > > > <snipped> > > > > > @@ -278,6 +499,11 @@ class Command(object): > > > properly routed to LLDB frontend. Any unhandled exception will be > > > automatically transformed into proper errors. > > > """ > > > + def invoke(self, arg, from_tty): > > > + try: > > > + self.execute(arg) > > > + except Exception as e: > > > + dbg.write(e) > > > > Why do we need this change? > > > > The error message for such situation is changed and non informative: > > > > | Breakpoint 1, lj_cf_dofile (L=0x2) at /home/burii/reviews/luajit/lj-dbg/src/lib_base.c:429 > > | 429 { > > | (gdb) lj-stack L > > | Python Exception <class 'TypeError'>: unsupported operand type(s) for +: 'MemoryError' and 'str' > > | Error occurred in Python: unsupported operand type(s) for +: 'MemoryError' and 'str' > > > > Within the following implementation all works as expected. > > | def invoke(self, arg, from_tty): > > | self.execute(arg) > Fixed. Sorry, but I see no related changes, and the old error message still persists. > > > > This produces more understandable reason of an error: > > | (gdb) lj-stack L > > | Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x26 > > | Error occurred in Python: Cannot access memory at address 0x26 > > > Without the explicit exception output LLDB command just fails silently.
> Choosing from a less readable error and not understanding what happened > (and whether it even happened at all) I would personally prefer the > first option. So, this case should be handled specially for the lldb provider? <snipped> -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 11+ messages in thread
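Editorial note: the supposition above, that an argument of 0 silently selects the default state, matches how a truthiness check such as `if lstate` treats zero. The sketch below illustrates only that Python behavior. DEFAULT_L and both resolver functions are hypothetical, and it is an assumption that the parsed argument is an integer address, or None when no argument is given.

```python
# Hypothetical marker for the fallback to the global main state.
DEFAULT_L = 'main_L'


def resolve_truthy(parsed):
    # A zero address is falsy, so it is treated like "no argument".
    return parsed if parsed else DEFAULT_L


def resolve_explicit(parsed):
    # Only a missing argument selects the default state.
    return parsed if parsed is not None else DEFAULT_L


no_argument = None
null_pointer = 0x0
valid_pointer = 0x40001a80
```

With the explicit None check, `lj-stack 0x0` would reach the command with a zero address instead of falling back to the default, so the command itself could decide how to report it.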
* Re: [Tarantool-patches] [PATCH luajit v6 1/2] debug: generalized extension 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 1/2] " Maxim Kokryashkin via Tarantool-patches 2024-04-04 10:14 ` Sergey Bronnikov via Tarantool-patches 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches @ 2024-08-14 19:34 ` Mikhail Elhimov via Tarantool-patches 2 siblings, 0 replies; 11+ messages in thread From: Mikhail Elhimov via Tarantool-patches @ 2024-08-14 19:34 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Hi, Maxim! Thanks for the patch! Please consider my comments below. On 04.04.2024 01:21, Maxim Kokryashkin via Tarantool-patches wrote: <snipped> > +class Debugger(object): > + def __init__(self): > + self.GDB = False > + self.LLDB = False > + > + debuggers = { > + 'gdb': lambda lib: True, > + 'lldb': lambda lib: lib.debugger is not None, > + } > + for name, healthcheck in debuggers.items(): > + lib = None > + try: > + lib = import_module(name) > + if healthcheck(lib): > + setattr(self, name.upper(), True) > + globals()[name] = lib > + self.name = name > + except Exception: > + continue > + > + assert self.LLDB != self.GDB I'd suggest using two separate implementations of the Debugger interface for GDB and LLDB, so you would not need all these checks (like `if self.LLDB`) in every single method of the all-in-one implementation. With this approach, it seems any initial setup that is specific to a certain debugger (like the setup of event_connect/event_disconnect handlers for GDB) could be done as part of the corresponding __init__ method. <snipped> -- Best regards, Mikhail Elhimov ^ permalink raw reply [flat|nested] 11+ messages in thread
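Editorial note: the suggestion above, one implementation of the Debugger interface per debugger, could look roughly like the sketch below. It is not code from the patch. The class names, the make_debugger factory and the Fake* objects are hypothetical. Only the wrapped calls (pointer(), GetPointerType(), bitpos, GetOffsetInBytes()) are taken from the diff quoted earlier in the thread.

```python
class BaseDebugger(object):
    """Interface shared by the debugger-specific adapters."""
    name = None

    def type_to_pointer_type(self, tp):
        raise NotImplementedError

    def type_member_offset(self, member):
        raise NotImplementedError


class GDBDebugger(BaseDebugger):
    name = 'gdb'

    def type_to_pointer_type(self, tp):
        return tp.pointer()

    def type_member_offset(self, member):
        return member.bitpos // 8


class LLDBDebugger(BaseDebugger):
    name = 'lldb'

    def type_to_pointer_type(self, tp):
        return tp.GetPointerType()

    def type_member_offset(self, member):
        return member.GetOffsetInBytes()


def make_debugger(name):
    # Hypothetical factory; the real extension would select the
    # adapter after detecting which debugger module is importable.
    return {'gdb': GDBDebugger, 'lldb': LLDBDebugger}[name]()


class FakeGdbMember(object):
    """Hypothetical stand-in for a gdb.Field object."""
    bitpos = 64


class FakeLldbMember(object):
    """Hypothetical stand-in for an lldb.SBTypeMember object."""
    def GetOffsetInBytes(self):
        return 8


gdb_offset = make_debugger('gdb').type_member_offset(FakeGdbMember())
lldb_offset = make_debugger('lldb').type_member_offset(FakeLldbMember())
```

The per-method `if self.LLDB` branches disappear, and debugger-specific setup can live in each subclass constructor, at the cost of keeping two classes in sync with the interface.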
* [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions 2024-04-03 22:21 [Tarantool-patches] [PATCH luajit v6 0/2] debug: generalized extension Maxim Kokryashkin via Tarantool-patches 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 1/2] " Maxim Kokryashkin via Tarantool-patches @ 2024-04-03 22:21 ` Maxim Kokryashkin via Tarantool-patches 2024-04-04 10:27 ` Sergey Bronnikov via Tarantool-patches 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches 1 sibling, 2 replies; 11+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2024-04-03 22:21 UTC (permalink / raw) To: tarantool-patches, skaplun, sergeyb; +Cc: Maksim Kokryashkin From: Maksim Kokryashkin <max.kokryashkin@gmail.com> This patch adds tests for LuaJIT debugging extensions for lldb and gdb. --- .flake8rc | 4 + test/CMakeLists.txt | 1 + .../CMakeLists.txt | 80 ++++++ .../debug-extension-tests.py | 250 ++++++++++++++++++ 4 files changed, 335 insertions(+) create mode 100644 test/LuaJIT-debug-extensions-tests/CMakeLists.txt create mode 100644 test/LuaJIT-debug-extensions-tests/debug-extension-tests.py diff --git a/.flake8rc b/.flake8rc index 13e6178f..6766ed41 100644 --- a/.flake8rc +++ b/.flake8rc @@ -3,3 +3,7 @@ extend-ignore = # XXX: Suppress F821, since we have autogenerated names for # 'ptr' type complements in luajit_lldb.py. F821 +per-file-ignores = + # XXX: Flake considers regexp special characters to be + # escape sequences. 
+ test/LuaJIT-debug-extensions-tests/debug-extension-tests.py:W605 diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt index 19726f5a..a3b48939 100644 --- a/test/CMakeLists.txt +++ b/test/CMakeLists.txt @@ -148,6 +148,7 @@ add_subdirectory(PUC-Rio-Lua-5.1-tests) add_subdirectory(lua-Harness-tests) add_subdirectory(tarantool-c-tests) add_subdirectory(tarantool-tests) +add_subdirectory(LuaJIT-debug-extensions-tests) # Each testsuite has its own CMake target, but combining these # target into a single one is not desired, because each target diff --git a/test/LuaJIT-debug-extensions-tests/CMakeLists.txt b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt new file mode 100644 index 00000000..9ac626ec --- /dev/null +++ b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt @@ -0,0 +1,80 @@ +SET(TEST_SUITE_NAME "LuaJIT-dbg-extension-tests") +add_test_suite_target(LuaJIT-dbg-extension-tests + LABELS ${TEST_SUITE_NAME} + DEPENDS ${LUAJIT_TEST_BINARY} +) + +# Debug info is required for testing of extensions. +if(NOT (CMAKE_BUILD_TYPE MATCHES Debug)) + message(WARNING + "not a DEBUG build, LuaJIT-lldb-extension-tests and " + "LuaJIT-gdb-extension-tests are dummy" + ) + return() +endif() + +# MacOS asks for permission to debug a process even when the +# machine is set into development mode. To solve the issue, +# it is required to add relevant users to the `_developer` user +# group in MacOS. Disabled for now. +if(CMAKE_SYSTEM_NAME STREQUAL "Darwin" AND DEFINED ENV{CI}) + message(WARNING + "Interactive debugging is unavailable for macOS CI builds," + "LuaJIT-lldb-extension-tests is dummy" + ) + return() +endif() + +find_package(PythonInterp) +if(NOT PYTHONINTERP_FOUND) + message(WARNING + "`python` is not found, LuaJIT-lldb-extension-tests and " + "LuaJIT-gdb-extension-tests are dummy" + ) + return() +endif() + +set(DEBUGGER_TEST_ENV + "LUAJIT_TEST_BINARY=${LUAJIT_TEST_BINARY}" + # Suppresses __pycache__ generation. 
+ "PYTHONDONTWRITEBYTECODE=1" + "DEBUGGER_EXTENSION_PATH=${PROJECT_SOURCE_DIR}/src/luajit_dbg.py" +) + +set(TEST_SCRIPT_PATH + ${PROJECT_SOURCE_DIR}/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py +) + +find_program(GDB gdb) +if(GDB) + set(test_title "test/${TEST_SUITE_NAME}/gdb") + set(GDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${GDB}") + add_test(NAME "${test_title}" + COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} + ) + set_tests_properties("${test_title}" PROPERTIES + ENVIRONMENT "${GDB_TEST_ENV}" + LABELS ${TEST_SUITE_NAME} + DEPENDS LuaJIT-dbg-extension-tests-deps + ) +else() + message(WARNING "`gdb' is not found, so LuaJIT-gdb-extension-tests is dummy") +endif() + +find_program(LLDB lldb) +if(LLDB) + set(test_title "test/${TEST_SUITE_NAME}/lldb") + set(LLDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${LLDB}") + add_test(NAME "test/${TEST_SUITE_NAME}/lldb" + COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} + ) + set_tests_properties("${test_title}" PROPERTIES + ENVIRONMENT "${LLDB_TEST_ENV}" + LABELS ${TEST_SUITE_NAME} + DEPENDS LuaJIT-dbg-extension-tests-deps + ) +else() + message(WARNING "`lldb' is not found, so LuaJIT-lldb-extension-tests is dummy") +endif() diff --git a/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py new file mode 100644 index 00000000..6ef87473 --- /dev/null +++ b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py @@ -0,0 +1,250 @@ +# This file provides tests for LuaJIT debug extensions for lldb and gdb. 
+import os +import re +import subprocess +import sys +import tempfile +import unittest + +from threading import Timer + +LEGACY = re.match(r'^2\.', sys.version) + +LUAJIT_BINARY = os.environ['LUAJIT_TEST_BINARY'] +EXTENSION = os.environ['DEBUGGER_EXTENSION_PATH'] +DEBUGGER = os.environ['DEBUGGER_COMMAND'] +LLDB = 'lldb' in DEBUGGER +TIMEOUT = 10 + +RUN_CMD_FILE = '-s' if LLDB else '-x' +INFERIOR_ARGS = '--' if LLDB else '--args' +PROCESS_RUN = 'process launch' if LLDB else 'r' +LOAD_EXTENSION = ( + 'command script import {ext}' if LLDB else 'source {ext}' +).format(ext=EXTENSION) + + +def persist(data): + tmp = tempfile.NamedTemporaryFile(mode='w') + tmp.write(data) + tmp.flush() + return tmp + + +def execute_process(cmd, timeout=TIMEOUT): + if LEGACY: + # XXX: The Python 2.7 version of `subprocess.Popen` doesn't have a + # timeout option, so the required functionality was implemented via + # `threading.Timer`. + process = subprocess.Popen(cmd, stdout=subprocess.PIPE) + timer = Timer(TIMEOUT, process.kill) + timer.start() + stdout, _ = process.communicate() + timer.cancel() + + # XXX: If the timeout is exceeded and the process is killed by the + # timer, then the return code is non-zero, and we are going to blow up. 
+ assert process.returncode == 0 + return stdout.decode('ascii') + else: + process = subprocess.run(cmd, capture_output=True, timeout=TIMEOUT) + return process.stdout.decode('ascii') + + +def filter_debugger_output(output): + descriptor = '(lldb)' if LLDB else '(gdb)' + return ''.join( + filter( + lambda line: not line.startswith(descriptor), + output.splitlines(True), + ), + ) + + +class TestCaseBase(unittest.TestCase): + @classmethod + def construct_cmds(cls): + return '\n'.join([ + 'b {loc}'.format(loc=cls.location), + PROCESS_RUN, + 'n', + LOAD_EXTENSION, + cls.extension_cmds.strip(), + 'q', + ]) + + @classmethod + def setUpClass(cls): + cmd_file = persist(cls.construct_cmds()) + script_file = persist(cls.lua_script) + process_cmd = [ + DEBUGGER, + RUN_CMD_FILE, + cmd_file.name, + INFERIOR_ARGS, + LUAJIT_BINARY, + script_file.name, + ] + cls.output = filter_debugger_output(execute_process(process_cmd)) + cmd_file.close() + script_file.close() + + def check(self): + if LEGACY: + self.assertRegexpMatches(self.output, self.pattern.strip()) + else: + self.assertRegex(self.output, self.pattern.strip()) + + +class TestLoad(TestCaseBase): + extension_cmds = '' + location = 'lj_cf_print' + lua_script = 'print(1)' + pattern = ( + 'lj-tv command intialized\n' + 'lj-state command intialized\n' + 'lj-arch command intialized\n' + 'lj-gc command intialized\n' + 'lj-str command intialized\n' + 'lj-tab command intialized\n' + 'lj-stack command intialized\n' + 'LuaJIT debug extension is successfully loaded\n' + ) + + +class TestLJArch(TestCaseBase): + extension_cmds = 'lj-arch' + location = 'lj_cf_print' + lua_script = 'print(1)' + pattern = ( + 'LJ_64: (True|False), ' + 'LJ_GC64: (True|False), ' + 'LJ_DUALNUM: (True|False)' + ) + + +class TestLJState(TestCaseBase): + extension_cmds = 'lj-state' + location = 'lj_cf_print' + lua_script = 'print(1)' + pattern = ( + 'VM state: [A-Z]+\n' + 'GC state: [A-Z]+\n' + 'JIT state: [A-Z]+\n' + ) + + +class TestLJGC(TestCaseBase): + 
extension_cmds = 'lj-gc' + location = 'lj_cf_print' + lua_script = 'print(1)' + pattern = ( + 'GC stats: [A-Z]+\n' + '\ttotal: \d+\n' + '\tthreshold: \d+\n' + '\tdebt: \d+\n' + '\testimate: \d+\n' + '\tstepmul: \d+\n' + '\tpause: \d+\n' + '\tsweepstr: \d+/\d+\n' + '\troot: \d+ objects\n' + '\tgray: \d+ objects\n' + '\tgrayagain: \d+ objects\n' + '\tweak: \d+ objects\n' + '\tmmudata: \d+ objects\n' + ) + + +class TestLJStack(TestCaseBase): + extension_cmds = 'lj-stack' + location = 'lj_cf_print' + lua_script = 'print(1)' + pattern = ( + '-+ Red zone:\s+\d+ slots -+\n' + '(0x[a-zA-Z0-9]+\s+\[(S|\s)(B|\s)(T|\s)(M|\s)\] VALUE: nil\n?)*\n' + '-+ Stack:\s+\d+ slots -+\n' + '(0x[A-Za-z0-9]+(:0x[A-Za-z0-9]+)?\s+' + '\[(S|\s)(B|\s)(T|\s)(M|\s)\].*\n?)+\n' + ) + + +class TestLJTV(TestCaseBase): + location = 'lj_cf_print' + lua_script = 'print(1)' + extension_cmds = ( + 'lj-tv L->base\n' + 'lj-tv L->base + 1\n' + 'lj-tv L->base + 2\n' + 'lj-tv L->base + 3\n' + 'lj-tv L->base + 4\n' + 'lj-tv L->base + 5\n' + 'lj-tv L->base + 6\n' + 'lj-tv L->base + 7\n' + 'lj-tv L->base + 8\n' + 'lj-tv L->base + 9\n' + 'lj-tv L->base + 10\n' + 'lj-tv L->base + 11\n' + ) + + lua_script = ( + 'local ffi = require("ffi")\n' + 'print(\n' + ' nil,\n' + ' false,\n' + ' true,\n' + ' "hello",\n' + ' {1},\n' + ' 1,\n' + ' 1.1,\n' + ' coroutine.create(function() end),\n' + ' ffi.new("int*"),\n' + ' function() end,\n' + ' print,\n' + ' require\n' + ')\n' + ) + + pattern = ( + 'nil\n' + 'false\n' + 'true\n' + 'string \"hello\" @ 0x[a-zA-Z0-9]+\n' + 'table @ 0x[a-zA-Z0-9]+ \(asize: \d+, hmask: 0x[a-zA-Z0-9]+\)\n' + '(number|integer) .*1.*\n' + 'number 1.1\d+\n' + 'thread @ 0x[a-zA-Z0-9]+\n' + 'cdata @ 0x[a-zA-Z0-9]+\n' + 'Lua function @ 0x[a-zA-Z0-9]+, [0-9]+ upvalues, .+:[0-9]+\n' + 'fast function #[0-9]+\n' + 'C function @ 0x[a-zA-Z0-9]+\n' + ) + + +class TestLJStr(TestCaseBase): + extension_cmds = 'lj-str fname' + location = 'lj_cf_dofile' + lua_script = 'pcall(dofile("name"))' + pattern = 'String: .* 
\[\d+ bytes\] with hash 0x[a-zA-Z0-9]+' + + +class TestLJTab(TestCaseBase): + extension_cmds = 'lj-tab t' + location = 'lj_cf_unpack' + lua_script = 'unpack({1; a = 1})' + pattern = ( + 'Array part: 3 slots\n' + '0x[a-zA-Z0-9]+: \[0\]: nil\n' + '0x[a-zA-Z0-9]+: \[1\]: .+ 1\n' + '0x[a-zA-Z0-9]+: \[2\]: nil\n' + 'Hash part: 2 nodes\n' + '0x[a-zA-Z0-9]+: { string "a" @ 0x[a-zA-Z0-9]+ } => ' + '{ .+ 1 }; next = 0x0\n' + '0x[a-zA-Z0-9]+: { nil } => { nil }; next = 0x0\n' + ) + + +for test_cls in TestCaseBase.__subclasses__(): + test_cls.test = lambda self: self.check() + +if __name__ == '__main__': + unittest.main(verbosity=2) -- 2.44.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
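Editorial note: the W605 suppression added to .flake8rc is needed because the test patterns put regexp escapes such as \d into ordinary string literals. A possible alternative, shown here only as a sketch, is to write the patterns as raw strings. The sample output below is invented for illustration and is not real debugger output.

```python
import re

# Raw strings pass the backslashes to the regexp engine untouched.
# The engine itself understands \n, \t and \d, so the pattern reads
# the same as the one in the test file.
pattern = (
    r'GC stats: [A-Z]+\n'
    r'\ttotal: \d+\n'
    r'\tthreshold: \d+\n'
)

# Hypothetical fragment shaped like the lj-gc output under test.
sample = 'GC stats: PAUSE\n\ttotal: 1024\n\tthreshold: 2048\n'

match = re.search(pattern, sample)
```

With raw strings, the per-file W605 ignore would probably become unnecessary, though that has not been checked against the project's flake8 configuration.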
* Re: [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions Maxim Kokryashkin via Tarantool-patches @ 2024-04-04 10:27 ` Sergey Bronnikov via Tarantool-patches 2024-04-08 9:45 ` Maxim Kokryashkin via Tarantool-patches 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches 1 sibling, 1 reply; 11+ messages in thread From: Sergey Bronnikov via Tarantool-patches @ 2024-04-04 10:27 UTC (permalink / raw) To: Maxim Kokryashkin, tarantool-patches, skaplun Hi, Max thanks for the patch. See my comments below: On 4/4/24 01:21, Maxim Kokryashkin wrote: > From: Maksim Kokryashkin <max.kokryashkin@gmail.com> > > This patch adds tests for LuaJIT debugging > extensions for lldb and gdb. > --- > .flake8rc | 4 + > test/CMakeLists.txt | 1 + > .../CMakeLists.txt | 80 ++++++ > .../debug-extension-tests.py | 250 ++++++++++++++++++ > 4 files changed, 335 insertions(+) > create mode 100644 test/LuaJIT-debug-extensions-tests/CMakeLists.txt > create mode 100644 test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > > diff --git a/.flake8rc b/.flake8rc > index 13e6178f..6766ed41 100644 > --- a/.flake8rc > +++ b/.flake8rc > @@ -3,3 +3,7 @@ extend-ignore = > # XXX: Suppress F821, since we have autogenerated names for > # 'ptr' type complements in luajit_lldb.py. > F821 > +per-file-ignores = > + # XXX: Flake considers regexp special characters to be > + # escape sequences. 
> + test/LuaJIT-debug-extensions-tests/debug-extension-tests.py:W605 > diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt > index 19726f5a..a3b48939 100644 > --- a/test/CMakeLists.txt > +++ b/test/CMakeLists.txt > @@ -148,6 +148,7 @@ add_subdirectory(PUC-Rio-Lua-5.1-tests) > add_subdirectory(lua-Harness-tests) > add_subdirectory(tarantool-c-tests) > add_subdirectory(tarantool-tests) > +add_subdirectory(LuaJIT-debug-extensions-tests) > > # Each testsuite has its own CMake target, but combining these > # target into a single one is not desired, because each target > diff --git a/test/LuaJIT-debug-extensions-tests/CMakeLists.txt b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > new file mode 100644 > index 00000000..9ac626ec > --- /dev/null > +++ b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > @@ -0,0 +1,80 @@ > +SET(TEST_SUITE_NAME "LuaJIT-dbg-extension-tests") > +add_test_suite_target(LuaJIT-dbg-extension-tests > + LABELS ${TEST_SUITE_NAME} > + DEPENDS ${LUAJIT_TEST_BINARY} > +) > + > +# Debug info is required for testing of extensions. > +if(NOT (CMAKE_BUILD_TYPE MATCHES Debug)) > + message(WARNING > + "not a DEBUG build, LuaJIT-lldb-extension-tests and " > + "LuaJIT-gdb-extension-tests are dummy" > + ) > + return() > +endif() > + > +# MacOS asks for permission to debug a process even when the > +# machine is set into development mode. To solve the issue, > +# it is required to add relevant users to the `_developer` user > +# group in MacOS. Disabled for now. 
> +if(CMAKE_SYSTEM_NAME STREQUAL "Darwin" AND DEFINED ENV{CI}) > + message(WARNING > + "Interactive debugging is unavailable for macOS CI builds," > + "LuaJIT-lldb-extension-tests is dummy" > + ) > + return() > +endif() > + > +find_package(PythonInterp) > +if(NOT PYTHONINTERP_FOUND) > + message(WARNING > + "`python` is not found, LuaJIT-lldb-extension-tests and " > + "LuaJIT-gdb-extension-tests are dummy" > + ) > + return() > +endif() > + > +set(DEBUGGER_TEST_ENV > + "LUAJIT_TEST_BINARY=${LUAJIT_TEST_BINARY}" > + # Suppresses __pycache__ generation. > + "PYTHONDONTWRITEBYTECODE=1" > + "DEBUGGER_EXTENSION_PATH=${PROJECT_SOURCE_DIR}/src/luajit_dbg.py" > +) > + > +set(TEST_SCRIPT_PATH > + ${PROJECT_SOURCE_DIR}/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > +) > + > +find_program(GDB gdb) > +if(GDB) > + set(test_title "test/${TEST_SUITE_NAME}/gdb") > + set(GDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${GDB}") > + add_test(NAME "${test_title}" > + COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} > + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} > + ) > + set_tests_properties("${test_title}" PROPERTIES > + ENVIRONMENT "${GDB_TEST_ENV}" > + LABELS ${TEST_SUITE_NAME} > + DEPENDS LuaJIT-dbg-extension-tests-deps > + ) > +else() > + message(WARNING "`gdb' is not found, so LuaJIT-gdb-extension-tests is dummy") > +endif() > + > +find_program(LLDB lldb) > +if(LLDB) > + set(test_title "test/${TEST_SUITE_NAME}/lldb") I would use a real path as a test title like we do in other testsuites (PUC Rio Lua and LuaJIT tests are exception in this rule). For these test we have two flavors, so I suggest to create symlinks: ~/sources/MRG/tarantool/third_party/luajit/test/LuaJIT-debug-extensions-tests$ ls -la total 20 drwxrwxr-x 2 sergeyb sergeyb 4096 Apr 4 13:15 . drwxrwxr-x 10 sergeyb sergeyb 4096 Apr 4 12:44 .. 
-rw-rw-r-- 1 sergeyb sergeyb 2451 Apr 4 12:44 CMakeLists.txt -rw-rw-r-- 1 sergeyb sergeyb 6787 Apr 4 12:44 debug-extension-tests.py lrwxrwxrwx 1 sergeyb sergeyb 24 Apr 4 13:15 gdb-debug-extension-tests.py -> debug-extension-tests.py lrwxrwxrwx 1 sergeyb sergeyb 24 Apr 4 13:15 lldb-debug-extension-tests.py -> debug-extension-tests.py And generate CMake tests for these files: --- a/test/LuaJIT-debug-extensions-tests/CMakeLists.txt +++ b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt @@ -41,14 +41,11 @@ set(DEBUGGER_TEST_ENV "DEBUGGER_EXTENSION_PATH=${PROJECT_SOURCE_DIR}/src/luajit_dbg.py" ) -set(TEST_SCRIPT_PATH - ${PROJECT_SOURCE_DIR}/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py -) - find_program(GDB gdb) if(GDB) - set(test_title "test/${TEST_SUITE_NAME}/gdb") + set(test_title "test/${TEST_SUITE_NAME}/gdb-debug-extension-tests.py") set(GDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${GDB}") + set(TEST_SCRIPT_PATH ${CMAKE_CURRENT_SOURCE_DIR}/gdb-debug-extension-tests.py) add_test(NAME "${test_title}" COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} @@ -64,8 +61,9 @@ endif() find_program(LLDB lldb) if(LLDB) - set(test_title "test/${TEST_SUITE_NAME}/lldb") + set(test_title "test/${TEST_SUITE_NAME}/lldb-debug-extension-tests.py") set(LLDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${LLDB}") + set(TEST_SCRIPT_PATH ${CMAKE_CURRENT_SOURCE_DIR}/lldb-debug-extension-tests.py) add_test(NAME "test/${TEST_SUITE_NAME}/lldb" COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} In CTest test titles looks as a real file names: $ ctest -L LuaJIT-dbg-extension-tests Test project /home/sergeyb/sources/MRG/tarantool/third_party/luajit/build/gc64 Start 212: LuaJIT-dbg-extension-tests-deps 1/2 Test #212: LuaJIT-dbg-extension-tests-deps ................................ 
Passed 0.01 sec Start 213: test/LuaJIT-dbg-extension-tests/gdb-debug-extension-tests.py 2/2 Test #213: test/LuaJIT-dbg-extension-tests/gdb-debug-extension-tests.py ... Passed 2.11 sec 100% tests passed, 0 tests failed out of 2 Label Time Summary: LuaJIT-dbg-extension-tests = 2.12 sec*proc (2 tests) Total Test time (real) = 2.14 sec What do you think? > + set(LLDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${LLDB}") > + add_test(NAME "test/${TEST_SUITE_NAME}/lldb" > + COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} > + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} > + ) > + set_tests_properties("${test_title}" PROPERTIES > + ENVIRONMENT "${LLDB_TEST_ENV}" > + LABELS ${TEST_SUITE_NAME} > + DEPENDS LuaJIT-dbg-extension-tests-deps > + ) > +else() > + message(WARNING "`lldb' is not found, so LuaJIT-lldb-extension-tests is dummy") > +endif() > diff --git a/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > new file mode 100644 > index 00000000..6ef87473 > --- /dev/null > +++ b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > @@ -0,0 +1,250 @@ > +# This file provides tests for LuaJIT debug extensions for lldb and gdb. 
> +import os > +import re > +import subprocess > +import sys > +import tempfile > +import unittest > + > +from threading import Timer > + > +LEGACY = re.match(r'^2\.', sys.version) > + > +LUAJIT_BINARY = os.environ['LUAJIT_TEST_BINARY'] > +EXTENSION = os.environ['DEBUGGER_EXTENSION_PATH'] > +DEBUGGER = os.environ['DEBUGGER_COMMAND'] > +LLDB = 'lldb' in DEBUGGER > +TIMEOUT = 10 > + > +RUN_CMD_FILE = '-s' if LLDB else '-x' > +INFERIOR_ARGS = '--' if LLDB else '--args' > +PROCESS_RUN = 'process launch' if LLDB else 'r' > +LOAD_EXTENSION = ( > + 'command script import {ext}' if LLDB else 'source {ext}' > +).format(ext=EXTENSION) > + > + > +def persist(data): > + tmp = tempfile.NamedTemporaryFile(mode='w') > + tmp.write(data) > + tmp.flush() > + return tmp > + > + > +def execute_process(cmd, timeout=TIMEOUT): > + if LEGACY: > + # XXX: The Python 2.7 version of `subprocess.Popen` doesn't have a > + # timeout option, so the required functionality was implemented via > + # `threading.Timer`. > + process = subprocess.Popen(cmd, stdout=subprocess.PIPE) > + timer = Timer(TIMEOUT, process.kill) > + timer.start() > + stdout, _ = process.communicate() > + timer.cancel() > + > + # XXX: If the timeout is exceeded and the process is killed by the > + # timer, then the return code is non-zero, and we are going to blow up. 
> + assert process.returncode == 0 > + return stdout.decode('ascii') > + else: > + process = subprocess.run(cmd, capture_output=True, timeout=TIMEOUT) > + return process.stdout.decode('ascii') > + > + > +def filter_debugger_output(output): > + descriptor = '(lldb)' if LLDB else '(gdb)' > + return ''.join( > + filter( > + lambda line: not line.startswith(descriptor), > + output.splitlines(True), > + ), > + ) > + > + > +class TestCaseBase(unittest.TestCase): > + @classmethod > + def construct_cmds(cls): > + return '\n'.join([ > + 'b {loc}'.format(loc=cls.location), > + PROCESS_RUN, > + 'n', > + LOAD_EXTENSION, > + cls.extension_cmds.strip(), > + 'q', > + ]) > + > + @classmethod > + def setUpClass(cls): > + cmd_file = persist(cls.construct_cmds()) > + script_file = persist(cls.lua_script) > + process_cmd = [ > + DEBUGGER, > + RUN_CMD_FILE, > + cmd_file.name, > + INFERIOR_ARGS, > + LUAJIT_BINARY, > + script_file.name, > + ] > + cls.output = filter_debugger_output(execute_process(process_cmd)) > + cmd_file.close() > + script_file.close() > + > + def check(self): > + if LEGACY: > + self.assertRegexpMatches(self.output, self.pattern.strip()) > + else: > + self.assertRegex(self.output, self.pattern.strip()) > + > + > +class TestLoad(TestCaseBase): > + extension_cmds = '' > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + 'lj-tv command intialized\n' > + 'lj-state command intialized\n' > + 'lj-arch command intialized\n' > + 'lj-gc command intialized\n' > + 'lj-str command intialized\n' > + 'lj-tab command intialized\n' > + 'lj-stack command intialized\n' > + 'LuaJIT debug extension is successfully loaded\n' > + ) > + > + > +class TestLJArch(TestCaseBase): > + extension_cmds = 'lj-arch' > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + 'LJ_64: (True|False), ' > + 'LJ_GC64: (True|False), ' > + 'LJ_DUALNUM: (True|False)' > + ) > + > + > +class TestLJState(TestCaseBase): > + extension_cmds = 'lj-state' > + location = 
'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + 'VM state: [A-Z]+\n' > + 'GC state: [A-Z]+\n' > + 'JIT state: [A-Z]+\n' > + ) > + > + > +class TestLJGC(TestCaseBase): > + extension_cmds = 'lj-gc' > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + 'GC stats: [A-Z]+\n' > + '\ttotal: \d+\n' > + '\tthreshold: \d+\n' > + '\tdebt: \d+\n' > + '\testimate: \d+\n' > + '\tstepmul: \d+\n' > + '\tpause: \d+\n' > + '\tsweepstr: \d+/\d+\n' > + '\troot: \d+ objects\n' > + '\tgray: \d+ objects\n' > + '\tgrayagain: \d+ objects\n' > + '\tweak: \d+ objects\n' > + '\tmmudata: \d+ objects\n' > + ) > + > + > +class TestLJStack(TestCaseBase): > + extension_cmds = 'lj-stack' > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + '-+ Red zone:\s+\d+ slots -+\n' > + '(0x[a-zA-Z0-9]+\s+\[(S|\s)(B|\s)(T|\s)(M|\s)\] VALUE: nil\n?)*\n' > + '-+ Stack:\s+\d+ slots -+\n' > + '(0x[A-Za-z0-9]+(:0x[A-Za-z0-9]+)?\s+' > + '\[(S|\s)(B|\s)(T|\s)(M|\s)\].*\n?)+\n' > + ) > + > + > +class TestLJTV(TestCaseBase): > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + extension_cmds = ( > + 'lj-tv L->base\n' > + 'lj-tv L->base + 1\n' > + 'lj-tv L->base + 2\n' > + 'lj-tv L->base + 3\n' > + 'lj-tv L->base + 4\n' > + 'lj-tv L->base + 5\n' > + 'lj-tv L->base + 6\n' > + 'lj-tv L->base + 7\n' > + 'lj-tv L->base + 8\n' > + 'lj-tv L->base + 9\n' > + 'lj-tv L->base + 10\n' > + 'lj-tv L->base + 11\n' > + ) > + > + lua_script = ( > + 'local ffi = require("ffi")\n' > + 'print(\n' > + ' nil,\n' > + ' false,\n' > + ' true,\n' > + ' "hello",\n' > + ' {1},\n' > + ' 1,\n' > + ' 1.1,\n' > + ' coroutine.create(function() end),\n' > + ' ffi.new("int*"),\n' > + ' function() end,\n' > + ' print,\n' > + ' require\n' > + ')\n' > + ) > + > + pattern = ( > + 'nil\n' > + 'false\n' > + 'true\n' > + 'string \"hello\" @ 0x[a-zA-Z0-9]+\n' > + 'table @ 0x[a-zA-Z0-9]+ \(asize: \d+, hmask: 0x[a-zA-Z0-9]+\)\n' > + '(number|integer) .*1.*\n' > + 'number 1.1\d+\n' > + 'thread @ 
0x[a-zA-Z0-9]+\n' > + 'cdata @ 0x[a-zA-Z0-9]+\n' > + 'Lua function @ 0x[a-zA-Z0-9]+, [0-9]+ upvalues, .+:[0-9]+\n' > + 'fast function #[0-9]+\n' > + 'C function @ 0x[a-zA-Z0-9]+\n' > + ) > + > + > +class TestLJStr(TestCaseBase): > + extension_cmds = 'lj-str fname' > + location = 'lj_cf_dofile' > + lua_script = 'pcall(dofile("name"))' > + pattern = 'String: .* \[\d+ bytes\] with hash 0x[a-zA-Z0-9]+' > + > + > +class TestLJTab(TestCaseBase): > + extension_cmds = 'lj-tab t' > + location = 'lj_cf_unpack' > + lua_script = 'unpack({1; a = 1})' > + pattern = ( > + 'Array part: 3 slots\n' > + '0x[a-zA-Z0-9]+: \[0\]: nil\n' > + '0x[a-zA-Z0-9]+: \[1\]: .+ 1\n' > + '0x[a-zA-Z0-9]+: \[2\]: nil\n' > + 'Hash part: 2 nodes\n' > + '0x[a-zA-Z0-9]+: { string "a" @ 0x[a-zA-Z0-9]+ } => ' > + '{ .+ 1 }; next = 0x0\n' > + '0x[a-zA-Z0-9]+: { nil } => { nil }; next = 0x0\n' > + ) > + > + > +for test_cls in TestCaseBase.__subclasses__(): > + test_cls.test = lambda self: self.check() > + > +if __name__ == '__main__': > + unittest.main(verbosity=2) ^ permalink raw reply [flat|nested] 11+ messages in thread
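The quoted `execute_process` accepts a `timeout` argument, but both branches pass the module-level `TIMEOUT` instead. Below is a minimal Python 3 only sketch that honours the argument. The return code check in this branch is an assumption added for illustration (the patch asserts it only in the Python 2 branch), and `subprocess.run(..., capture_output=True)` requires Python 3.7 or newer.

```python
import subprocess
import sys

TIMEOUT = 10  # Seconds; the same default as in the quoted test file.


def execute_process(cmd, timeout=TIMEOUT):
    """Run `cmd` and return its stdout decoded as ASCII.

    Raises subprocess.TimeoutExpired if the child runs longer than
    `timeout` seconds; subprocess.run() kills the child in that case.
    """
    process = subprocess.run(cmd, capture_output=True, timeout=timeout)
    # Assumption, not in the patch for this branch: treat a non-zero
    # exit status as a failure and show stderr to ease debugging.
    assert process.returncode == 0, process.stderr.decode('ascii', 'replace')
    return process.stdout.decode('ascii')


if __name__ == '__main__':
    # The child here is the Python interpreter itself, so the sketch
    # runs without gdb or lldb installed.
    print(execute_process([sys.executable, '-c', 'print("ok")']))
```

With this shape a hung debugger surfaces as `TimeoutExpired` raised from `setUpClass`, which `unittest` reports as an error for that class.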
* Re: [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions 2024-04-04 10:27 ` Sergey Bronnikov via Tarantool-patches @ 2024-04-08 9:45 ` Maxim Kokryashkin via Tarantool-patches 0 siblings, 0 replies; 11+ messages in thread From: Maxim Kokryashkin via Tarantool-patches @ 2024-04-08 9:45 UTC (permalink / raw) To: Sergey Bronnikov; +Cc: Maxim Kokryashkin, tarantool-patches Hi, Sergey! Thanks for the review! See my comments below. On Thu, Apr 04, 2024 at 01:27:19PM +0300, Sergey Bronnikov via Tarantool-patches wrote: > Hi, Max > > thanks for the patch. See my comments below: > > > On 4/4/24 01:21, Maxim Kokryashkin wrote: > > From: Maksim Kokryashkin <max.kokryashkin@gmail.com> > > > > This patch adds tests for LuaJIT debugging > > extensions for lldb and gdb. > > --- <snipped> > > +else() > > + message(WARNING "`gdb' is not found, so LuaJIT-gdb-extension-tests is dummy") > > +endif() > > + > > +find_program(LLDB lldb) > > +if(LLDB) > > + set(test_title "test/${TEST_SUITE_NAME}/lldb") > > I would use a real path as a test title like we do in other testsuites > > (PUC Rio Lua and LuaJIT tests are exception in this rule). > > For these test we have two flavors, so I suggest to create symlinks: > > ~/sources/MRG/tarantool/third_party/luajit/test/LuaJIT-debug-extensions-tests$ > ls -la > total 20 > drwxrwxr-x 2 sergeyb sergeyb 4096 Apr 4 13:15 . > drwxrwxr-x 10 sergeyb sergeyb 4096 Apr 4 12:44 .. 
> -rw-rw-r-- 1 sergeyb sergeyb 2451 Apr 4 12:44 CMakeLists.txt > -rw-rw-r-- 1 sergeyb sergeyb 6787 Apr 4 12:44 debug-extension-tests.py > lrwxrwxrwx 1 sergeyb sergeyb 24 Apr 4 13:15 gdb-debug-extension-tests.py > -> debug-extension-tests.py > lrwxrwxrwx 1 sergeyb sergeyb 24 Apr 4 13:15 > lldb-debug-extension-tests.py -> debug-extension-tests.py > > And generate CMake tests for these files: > > --- a/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > +++ b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > @@ -41,14 +41,11 @@ set(DEBUGGER_TEST_ENV > "DEBUGGER_EXTENSION_PATH=${PROJECT_SOURCE_DIR}/src/luajit_dbg.py" > ) > > -set(TEST_SCRIPT_PATH > - ${PROJECT_SOURCE_DIR}/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > -) > - > find_program(GDB gdb) > if(GDB) > - set(test_title "test/${TEST_SUITE_NAME}/gdb") > + set(test_title "test/${TEST_SUITE_NAME}/gdb-debug-extension-tests.py") > set(GDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${GDB}") > + set(TEST_SCRIPT_PATH > ${CMAKE_CURRENT_SOURCE_DIR}/gdb-debug-extension-tests.py) > add_test(NAME "${test_title}" > COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} > WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} > @@ -64,8 +61,9 @@ endif() > > find_program(LLDB lldb) > if(LLDB) > - set(test_title "test/${TEST_SUITE_NAME}/lldb") > + set(test_title "test/${TEST_SUITE_NAME}/lldb-debug-extension-tests.py") > set(LLDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${LLDB}") > + set(TEST_SCRIPT_PATH > ${CMAKE_CURRENT_SOURCE_DIR}/lldb-debug-extension-tests.py) > add_test(NAME "test/${TEST_SUITE_NAME}/lldb" > COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} > WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} > > In CTest test titles looks as a real file names: > > $ ctest -L LuaJIT-dbg-extension-tests > Test project > /home/sergeyb/sources/MRG/tarantool/third_party/luajit/build/gc64 > Start 212: LuaJIT-dbg-extension-tests-deps > 1/2 Test #212: LuaJIT-dbg-extension-tests-deps > ................................ 
Passed 0.01 sec > Start 213: test/LuaJIT-dbg-extension-tests/gdb-debug-extension-tests.py > 2/2 Test #213: test/LuaJIT-dbg-extension-tests/gdb-debug-extension-tests.py > ... Passed 2.11 sec > > 100% tests passed, 0 tests failed out of 2 > > Label Time Summary: > LuaJIT-dbg-extension-tests = 2.12 sec*proc (2 tests) > > Total Test time (real) = 2.14 sec > > > What do you think? > Well, while I see the clear reason why you suggest creating those symlinks, I don't favor this idea. The issue is that symlinks are going to mislead people from outside of the team. You are not likely to check whether the file you decided to open is a symlink or a hardlink, but you are likely to think that these are duplicates. At that point, the test file kind of defeats its own purpose of being an easy-to-maintain single source of truth for our debugging extensions testing. If we want this consistency with test names in other suites, maybe we can set the same path for both tests and add the corresponding prefix with the debugger name in front of the path. What do you think? <snipped> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions 2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions Maxim Kokryashkin via Tarantool-patches 2024-04-04 10:27 ` Sergey Bronnikov via Tarantool-patches @ 2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches 1 sibling, 0 replies; 11+ messages in thread From: Sergey Kaplun via Tarantool-patches @ 2024-04-17 16:00 UTC (permalink / raw) To: Maxim Kokryashkin; +Cc: tarantool-patches Hi, Maxim! Thanks for the patch! Please consider my comments below. On 04.04.24, Maxim Kokryashkin wrote: > From: Maksim Kokryashkin <max.kokryashkin@gmail.com> > > This patch adds tests for LuaJIT debugging > extensions for lldb and gdb. > --- > .flake8rc | 4 + > test/CMakeLists.txt | 1 + > .../CMakeLists.txt | 80 ++++++ > .../debug-extension-tests.py | 250 ++++++++++++++++++ > 4 files changed, 335 insertions(+) > create mode 100644 test/LuaJIT-debug-extensions-tests/CMakeLists.txt > create mode 100644 test/LuaJIT-debug-extensions-tests/debug-extension-tests.py I see no updates of the CI actions for these tests, so tests are skipped if the CI runner has no installed python|lldb|gdb. > > diff --git a/.flake8rc b/.flake8rc > index 13e6178f..6766ed41 100644 > --- a/.flake8rc > +++ b/.flake8rc > @@ -3,3 +3,7 @@ extend-ignore = > # XXX: Suppress F821, since we have autogenerated names for > # 'ptr' type complements in luajit_lldb.py. > F821 > +per-file-ignores = > + # XXX: Flake considers regexp special characters to be > + # escape sequences. > + test/LuaJIT-debug-extensions-tests/debug-extension-tests.py:W605 Do we need this ignore? IINM, we can just use the `r` prefix for all regex strings that produce the warning. See [1]. 
> diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt > index 19726f5a..a3b48939 100644 > --- a/test/CMakeLists.txt > +++ b/test/CMakeLists.txt > @@ -148,6 +148,7 @@ add_subdirectory(PUC-Rio-Lua-5.1-tests) > add_subdirectory(lua-Harness-tests) > add_subdirectory(tarantool-c-tests) > add_subdirectory(tarantool-tests) > +add_subdirectory(LuaJIT-debug-extensions-tests) I suggest renaming "LuaJIT-debug-extensions-tests" -> "tarantool-debugger-tests" to be consistent with our other tests (tarantool-tests, tarantool-c-tests) since the debugger itself is a part of our test suite, not LuaJIT's. Also, please note that the entries are sorted alphabetically. > > # Each testsuite has its own CMake target, but combining these > # target into a single one is not desired, because each target > diff --git a/test/LuaJIT-debug-extensions-tests/CMakeLists.txt b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > new file mode 100644 > index 00000000..9ac626ec > --- /dev/null > +++ b/test/LuaJIT-debug-extensions-tests/CMakeLists.txt > @@ -0,0 +1,80 @@ > +SET(TEST_SUITE_NAME "LuaJIT-dbg-extension-tests") > +add_test_suite_target(LuaJIT-dbg-extension-tests > + LABELS ${TEST_SUITE_NAME} > + DEPENDS ${LUAJIT_TEST_BINARY} > +) > + > +# Debug info is required for testing of extensions. Strictly speaking, we can have debug symbols for the "RelWithDebInfo" build too. OTOH, values in registers return <optimized> instead of the desired output for that build, so skipping the test for non-Debug builds is not a bad idea after all (at least until we notice some regression related to this build). Please adjust the comment if you still don't want to test the RelWithDebInfo build. > +if(NOT (CMAKE_BUILD_TYPE MATCHES Debug)) > + message(WARNING > + "not a DEBUG build, LuaJIT-lldb-extension-tests and " I see a lot of usage of these names below. It would be better to set variables for them. Also, the names were changed after the previous review iteration, so please bring them up to date.
> + "LuaJIT-gdb-extension-tests are dummy" > + ) > + return() > +endif() > + > +# MacOS asks for permission to debug a process even when the > +# machine is set into development mode. To solve the issue, > +# it is required to add relevant users to the `_developer` user > +# group in MacOS. Disabled for now. Minor: I suppose we should add the corresponding ticket to the tarantool-qa repo. It's understandable that MacOS isn't a top priority, so probably this will not be done soon, but having a ticket with a rationale and linking to it here would still be nice. > +if(CMAKE_SYSTEM_NAME STREQUAL "Darwin" AND DEFINED ENV{CI}) > + message(WARNING > + "Interactive debugging is unavailable for macOS CI builds," > + "LuaJIT-lldb-extension-tests is dummy" > + ) > + return() > +endif() > + > +find_package(PythonInterp) Please fix the following policy warning. | CMake Warning (dev) at test/LuaJIT-debug-extensions-tests/CMakeLists.txt:28 (find_package): | Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules | are removed. Run "cmake --help-policy CMP0148" for policy details. Use | the cmake_policy command to set the policy and suppress this warning. | | This warning is for project developers. Use -Wno-dev to suppress it. IINM, you may work around this for old CMake versions as follows [2]: | if(POLICY CMP0148) | # This policy is not known to old CMake. | cmake_policy(SET CMP0148 OLD) | endif() > +if(NOT PYTHONINTERP_FOUND) > + message(WARNING > + "`python` is not found, LuaJIT-lldb-extension-tests and " > + "LuaJIT-gdb-extension-tests are dummy" > + ) > + return() > +endif() > + > +set(DEBUGGER_TEST_ENV > + "LUAJIT_TEST_BINARY=${LUAJIT_TEST_BINARY}" > + # Suppresses __pycache__ generation. > + "PYTHONDONTWRITEBYTECODE=1" Alternatively, this may be done via the -B flag. Since the -E flag will(?) ignore the value of this env variable, I suggest using flags instead.
> + "DEBUGGER_EXTENSION_PATH=${PROJECT_SOURCE_DIR}/src/luajit_dbg.py" > +) > + > +set(TEST_SCRIPT_PATH > + ${PROJECT_SOURCE_DIR}/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py Why not use ${CMAKE_CURRENT_SOURCE_DIR} instead? > +) > + > +find_program(GDB gdb) > +if(GDB) > + set(test_title "test/${TEST_SUITE_NAME}/gdb") > + set(GDB_TEST_ENV ${DEBUGGER_TEST_ENV} "DEBUGGER_COMMAND=${GDB}") > + add_test(NAME "${test_title}" > + COMMAND ${PYTHON_EXECUTABLE} ${TEST_SCRIPT_PATH} I think it would be nice to add the -E option here to avoid interference with the user's env variables that are set locally. > + WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR} > + ) > + set_tests_properties("${test_title}" PROPERTIES > + ENVIRONMENT "${GDB_TEST_ENV}" > + LABELS ${TEST_SUITE_NAME} > + DEPENDS LuaJIT-dbg-extension-tests-deps > + ) > +else() > + message(WARNING "`gdb' is not found, so LuaJIT-gdb-extension-tests is dummy") > +endif() > + > +find_program(LLDB lldb) Can we use a macro here to avoid copy-pasting? I suggest using the following prototype: | macro(AddTestDebugger target dbg) | # ... | endmacro() Side note: The macro presumes that target goes first to be similar to the rest of the CMake semantics. > +if(LLDB) <snipped> > diff --git a/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > new file mode 100644 > index 00000000..6ef87473 > --- /dev/null > +++ b/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py > @@ -0,0 +1,250 @@ > +# This file provides tests for LuaJIT debug extensions for lldb and gdb. > +import os > +import re > +import subprocess > +import sys > +import tempfile > +import unittest > + > +from threading import Timer > + > +LEGACY = re.match(r'^2\.', sys.version) Do we have any runners with Python 2? If the answer is "no", do we need these workarounds if there are no guarantees that tests will work for Python 2?
> + > +LUAJIT_BINARY = os.environ['LUAJIT_TEST_BINARY'] > +EXTENSION = os.environ['DEBUGGER_EXTENSION_PATH'] > +DEBUGGER = os.environ['DEBUGGER_COMMAND'] > +LLDB = 'lldb' in DEBUGGER > +TIMEOUT = 10 Side note: I suppose the timeout should be adjusted after adding the CI checks. > + > +RUN_CMD_FILE = '-s' if LLDB else '-x' > +INFERIOR_ARGS = '--' if LLDB else '--args' It would be nice to add some flags to avoid loading the user's .gdbinit (`--nx` [3][4]) and .lldbinit (`--no-lldbinit` [5]). > +PROCESS_RUN = 'process launch' if LLDB else 'r' > +LOAD_EXTENSION = ( > + 'command script import {ext}' if LLDB else 'source {ext}' > +).format(ext=EXTENSION) > + > + > +def persist(data): > + tmp = tempfile.NamedTemporaryFile(mode='w') > + tmp.write(data) > + tmp.flush() > + return tmp > + > + <snipped> > +class TestCaseBase(unittest.TestCase): > + @classmethod > + def construct_cmds(cls): > + return '\n'.join([ > + 'b {loc}'.format(loc=cls.location), > + PROCESS_RUN, > + 'n', > + LOAD_EXTENSION, > + cls.extension_cmds.strip(), > + 'q', > + ]) > + > + @classmethod > + def setUpClass(cls): > + cmd_file = persist(cls.construct_cmds()) > + script_file = persist(cls.lua_script) > + process_cmd = [ > + DEBUGGER, > + RUN_CMD_FILE, > + cmd_file.name, > + INFERIOR_ARGS, > + LUAJIT_BINARY, > + script_file.name, > + ] > + cls.output = filter_debugger_output(execute_process(process_cmd)) > + cmd_file.close() > + script_file.close() It would be nice to mention that `NamedTemporaryFile()` has the `delete=True` default, so those files are removed on close. > + > + def check(self): > + if LEGACY: > + self.assertRegexpMatches(self.output, self.pattern.strip()) > + else: > + self.assertRegex(self.output, self.pattern.strip()) > + > + > +class TestLoad(TestCaseBase): > + extension_cmds = '' > + location = 'lj_cf_print' > + lua_script = 'print(1)' I suppose we may use these values of `location` and `lua_script` as defaults. This is useful for base tests and simplifies reading.
Also, we may add a comment there that `print()` isn't actually executed since we quit before it finishes, so it's not spoiling the output. > + pattern = ( > + 'lj-tv command intialized\n' > + 'lj-state command intialized\n' > + 'lj-arch command intialized\n' > + 'lj-gc command intialized\n' > + 'lj-str command intialized\n' > + 'lj-tab command intialized\n' > + 'lj-stack command intialized\n' > + 'LuaJIT debug extension is successfully loaded\n' > + ) > + > + > +class TestLJArch(TestCaseBase): <snipped> > + ) > + > + > +class TestLJState(TestCaseBase): <snipped> > + ) > + > + > +class TestLJGC(TestCaseBase): <snipped> > + ) > + > + > +class TestLJStack(TestCaseBase): > + extension_cmds = 'lj-stack' > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + pattern = ( > + '-+ Red zone:\s+\d+ slots -+\n' > + '(0x[a-zA-Z0-9]+\s+\[(S|\s)(B|\s)(T|\s)(M|\s)\] VALUE: nil\n?)*\n' I suppose you mean 0x[a-fA-F0-9]? Here and below. I see that this pattern for hexadecimal addresses is used a lot. Can we predefine it above? The same goes for the stack pattern, which is used twice. > + '-+ Stack:\s+\d+ slots -+\n' > + '(0x[A-Za-z0-9]+(:0x[A-Za-z0-9]+)?\s+' > + '\[(S|\s)(B|\s)(T|\s)(M|\s)\].*\n?)+\n' > + ) > + > + > +class TestLJTV(TestCaseBase): > + location = 'lj_cf_print' > + lua_script = 'print(1)' > + extension_cmds = ( > + 'lj-tv L->base\n' <snipped> > + 'lj-tv L->base + 11\n' > + ) > + > + lua_script = ( > + 'local ffi = require("ffi")\n' > + 'print(\n' > + ' nil,\n' > + ' false,\n' > + ' true,\n' > + ' "hello",\n' > + ' {1},\n' > + ' 1,\n' > + ' 1.1,\n' > + ' coroutine.create(function() end),\n' > + ' ffi.new("int*"),\n' > + ' function() end,\n' > + ' print,\n' > + ' require\n' > + ')\n' It would be nice to add a check for userdata too, using `newproxy()`. Also, it would be nice to sort the inputs in LJ_T* order. See <test/tarantool-tests/lj-351-print-tostring-number.test.lua>, for example.
> + ) > + > + pattern = ( > + 'nil\n' > + 'false\n' > + 'true\n' > + 'string \"hello\" @ 0x[a-zA-Z0-9]+\n' > + 'table @ 0x[a-zA-Z0-9]+ \(asize: \d+, hmask: 0x[a-zA-Z0-9]+\)\n' > + '(number|integer) .*1.*\n' > + 'number 1.1\d+\n' > + 'thread @ 0x[a-zA-Z0-9]+\n' > + 'cdata @ 0x[a-zA-Z0-9]+\n' > + 'Lua function @ 0x[a-zA-Z0-9]+, [0-9]+ upvalues, .+:[0-9]+\n' > + 'fast function #[0-9]+\n' > + 'C function @ 0x[a-zA-Z0-9]+\n' > + ) > + > + > +class TestLJStr(TestCaseBase): I've got the following timeouts locally, when build in debug mode: | 3/3 Test #2: test/LuaJIT-dbg-extension-tests/gdb ....***Failed 304.54 sec | test (__main__.TestLJArch.test) ... ok | test (__main__.TestLJGC.test) ... ok | test (__main__.TestLJStack.test) ... ok | test (__main__.TestLJState.test) ... ok | setUpClass (__main__.TestLJStr) ... ERROR | test (__main__.TestLJTV.test) ... ok | test (__main__.TestLJTab.test) ... ok | test (__main__.TestLoad.test) ... ok | test (__main__.TestLJTab.test) ... ok | | ====================================================================== | ERROR: setUpClass (__main__.TestLJStr) | ---------------------------------------------------------------------- | Traceback (most recent call last): | File "/home/burii/reviews/luajit/lj-dbg/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py", line 88, in setUpClass | cls.output = filter_debugger_output(execute_process(process_cmd)) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | File "/home/burii/reviews/luajit/lj-dbg/test/LuaJIT-debug-extensions-tests/debug-extension-tests.py", line 50, in execute_process | process = subprocess.run(cmd, capture_output=True, timeout=TIMEOUT) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | File "/usr/lib/python3.11/subprocess.py", line 550, in run | stdout, stderr = process.communicate(input, timeout=timeout) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | File "/usr/lib/python3.11/subprocess.py", line 1209, in communicate | stdout, stderr = self._communicate(input, endtime, timeout) | 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | File "/usr/lib/python3.11/subprocess.py", line 2114, in _communicate | self._check_timeout(endtime, orig_timeout, stdout, stderr) | File "/usr/lib/python3.11/subprocess.py", line 1253, in _check_timeout | raise TimeoutExpired( The problem is that the `fname` contains invalid address at the moment of breakpoint (for the Debug build), so the error is raised. | /tmp/tmpltp1awoe:5: Error in sourced command file: The following dirty-patch fixes the issue for me: | class TestLJStr(TestCaseBase): | - extension_cmds = 'lj-str fname' | + extension_cmds = 'n\nlj-str fname' I suppose other places where we use local variables should be adjusted as well. > + extension_cmds = 'lj-str fname' > + location = 'lj_cf_dofile' > + lua_script = 'pcall(dofile("name"))' I suppose you mean: | pcall(dofile, "name") If we are not interested in the status of child process we may just avoid the `pcall()` at all. Also, I suppose we may use something from the `string` built-in library to check the `lj-str` command. 
> + pattern = 'String: .* \[\d+ bytes\] with hash 0x[a-zA-Z0-9]+' > + > + > +class TestLJTab(TestCaseBase): > + extension_cmds = 'lj-tab t' > + location = 'lj_cf_unpack' > + lua_script = 'unpack({1; a = 1})' > + pattern = ( > + 'Array part: 3 slots\n' > + '0x[a-zA-Z0-9]+: \[0\]: nil\n' > + '0x[a-zA-Z0-9]+: \[1\]: .+ 1\n' > + '0x[a-zA-Z0-9]+: \[2\]: nil\n' > + 'Hash part: 2 nodes\n' > + '0x[a-zA-Z0-9]+: { string "a" @ 0x[a-zA-Z0-9]+ } => ' > + '{ .+ 1 }; next = 0x0\n' > + '0x[a-zA-Z0-9]+: { nil } => { nil }; next = 0x0\n' > + ) > + > + > +for test_cls in TestCaseBase.__subclasses__(): > + test_cls.test = lambda self: self.check() > + > +if __name__ == '__main__': > + unittest.main(verbosity=2) > -- > 2.44.0 > [1]: https://www.flake8rules.com/rules/W605.html [2]: https://cmake.org/cmake/help/book/mastering-cmake/chapter/Policies.html#supporting-multiple-cmake-versions [3]: https://sourceware.org/gdb/current/onlinedocs/gdb.html/Mode-Options.html [4]: https://sourceware.org/gdb/current/onlinedocs/gdb.html/Initialization-Files.html#Initialization-Files [5]: https://lldb.llvm.org/man/lldb.html -- Best regards, Sergey Kaplun ^ permalink raw reply [flat|nested] 11+ messages in thread
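Several of the review comments above (raw strings instead of the W605 ignore, one predefined pattern for hexadecimal addresses, default `location` and `lua_script` values, and the `--nx` / `--no-lldbinit` flags) can be sketched together as follows. This is only an illustration of the suggestions, not the code that was eventually applied. The names `HEX`, `SLOT_FLAGS`, `Defaults` and `debugger_argv()` are hypothetical.

```python
import re

# One definition for a hexadecimal address, restricted to hex digits.
HEX = r'0x[a-fA-F0-9]+'
# Slot flags as printed by `lj-stack`; reused by both stack sections.
SLOT_FLAGS = r'\[(S|\s)(B|\s)(T|\s)(M|\s)\]'

# Raw strings keep `\[` and `\d` from being flagged as invalid escape
# sequences (flake8 W605), so the per-file ignore becomes unnecessary.
STR_PATTERN = r'String: .* \[\d+ bytes\] with hash ' + HEX
RED_ZONE_SLOT = HEX + r'\s+' + SLOT_FLAGS + r' VALUE: nil\n?'


class Defaults(object):
    # Hypothetical stand-in for TestCaseBase: the values shared by most
    # tests become class-level defaults that a subclass may override.
    location = 'lj_cf_print'
    lua_script = 'print(1)'


def debugger_argv(debugger, cmd_file, binary, script):
    """Build the debugger command line, skipping the user's init files."""
    lldb = 'lldb' in debugger
    return [
        debugger,
        '--no-lldbinit' if lldb else '--nx',
        '-s' if lldb else '-x',
        cmd_file,
        '--' if lldb else '--args',
        binary,
        script,
    ]
```

Composing patterns from named pieces also means the address format is narrowed in one place instead of in every test class.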
end of thread, other threads:[~2024-08-14 19:34 UTC | newest]

Thread overview: 11+ messages
2024-04-03 22:21 [Tarantool-patches] [PATCH luajit v6 0/2] debug: generalized extension Maxim Kokryashkin via Tarantool-patches
2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 1/2] " Maxim Kokryashkin via Tarantool-patches
2024-04-04 10:14 ` Sergey Bronnikov via Tarantool-patches
2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches
2024-04-17 22:42 ` Maxim Kokryashkin via Tarantool-patches
2024-04-18 8:00 ` Sergey Kaplun via Tarantool-patches
2024-08-14 19:34 ` Mikhail Elhimov via Tarantool-patches
2024-04-03 22:21 ` [Tarantool-patches] [PATCH luajit v6 2/2] test: add tests for debugging extensions Maxim Kokryashkin via Tarantool-patches
2024-04-04 10:27 ` Sergey Bronnikov via Tarantool-patches
2024-04-08 9:45 ` Maxim Kokryashkin via Tarantool-patches
2024-04-17 16:00 ` Sergey Kaplun via Tarantool-patches