[Tarantool-patches] [PATCH luajit 0/3] Extend debug extension

* [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension
@ 2026-06-25 20:29 Sergey Kaplun via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-25 20:29 UTC (permalink / raw)
  To: Sergey Bronnikov, Evgeniy Temirgaleev; +Cc: tarantool-patches

This patchset adds commands to inspect the FFI-related and JIT-related
components of LuaJIT. The last patch simplifies testing by adding a
verbose mode enabled via an environment variable.

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-4808-jslots-trace-ir-cdata-ctype
Related issue: https://github.com/tarantool/tarantool/issues/4808

Sergey Kaplun (3):
  dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  dbg: introduce lj-ctype command, extend cdata dump
  test: add verbose mode for debug extension tests

 src/luajit_dbg.py                             | 1547 ++++++++++++++++-
 .../debug-extension-tests.py                  |  580 +++++-
 2 files changed, 2111 insertions(+), 16 deletions(-)

-- 
2.54.0


* [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
@ 2026-06-25 20:29 ` Sergey Kaplun via Tarantool-patches
  2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-30 14:45   ` Sergey Bronnikov via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-25 20:29 UTC (permalink / raw)
  To: Sergey Bronnikov, Evgeniy Temirgaleev; +Cc: tarantool-patches

This patch adds dumpers for a single IR instruction (`lj-ir`) and for
all bytecodes inside one trace (`lj-trace`). The output is similar to
that of the -jdump flag but also reports the types of the operands
(`ref`, `lit`, `cst`) and the operation mode (`N`, `A`, `W`, etc.).
The `lj-trace` command accepts the optional /r and /s flags to dump the
registers associated with the IR and the snapshots of the trace,
respectively. The `lj-ir` command can dump IR constants as well.
The `lj-jslots` command dumps the content of `J->slot`, which simplifies
debugging of `rec_check_slots()` assertion failures.
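
For illustration, the flag handling described above can be sketched as
standalone Python. This is a simplified model of the `extract_flags()`
and `parse_flags()` helpers from the diff below, not the code as it runs
inside the debugger: `ValueError` stands in for the debugger-specific
error object, and the argument `1` is only a placeholder.

```python
import re


def extract_flags(arg, permitted_flags):
    """Split '/flags rest' into (rest, {flag: True}).

    Standalone sketch: ValueError replaces the debugger error object.
    """
    if not arg:
        return None, None
    flags = {}
    if arg.startswith('/'):
        # Flags are the non-space characters right after the slash.
        match = re.match(r'/(\S*)\s+(.*)$', arg)
        if not match:
            # No argument follows the flags: return the input untouched.
            return arg, flags
        raw_flags, arg = match.group(1, 2)
        for flag in raw_flags:
            if flag not in permitted_flags:
                raise ValueError('Unrecognized option: "{}"'.format(flag))
            flags[flag] = True
    return arg, flags


print(extract_flags('/rs 1', 'rs'))
print(extract_flags('1', 'rs'))
```

Each character after the slash is treated as a separate flag, so `/rs`
requests both the register and the snapshot dump in this model.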

For lldb.value, the `__getitem__` metamethod now accepts bool keys.
Also, `__index__` is set so that an lldb.value can be used as an index
without explicit conversion to int. Old GDB versions (below 7.12) are
not supported because gdb.Value lacks the `__index__` metamethod there
and can't be monkey-patched. Support for these versions may be added
on demand.
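
A minimal sketch of what `__index__` provides, using a hypothetical
`FakeValue` class in place of lldb.value (the class and the `NAMES` list
are made up for the example and are not part of the patch):

```python
class FakeValue(object):
    """Hypothetical stand-in for a debugger value wrapping an integer."""

    def __init__(self, raw):
        self.raw = raw

    def __int__(self):
        return self.raw

    def __index__(self):
        # Lets the object be used directly as a sequence index.
        return int(self)


NAMES = ['ref', 'lit', 'cst', '']
mode = FakeValue(1)

# Without __index__ this lookup raises TypeError and needs int(mode).
print(NAMES[mode])
```

With the method in place, lookups such as `IRS[ir['o']]` in the diff
below need no explicit `int()` around the debugger value.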

Part of tarantool/tarantool#4808
---
 src/luajit_dbg.py                             | 1216 ++++++++++++++++-
 .../debug-extension-tests.py                  |  365 +++++
 2 files changed, 1570 insertions(+), 11 deletions(-)

diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 2edb199a..fd6ca8a5 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -58,6 +58,26 @@ class Debugger(object):
             self.LLDB = True
             return super(Debugger, self).__new__(_LLDBDebugger)
 
+    def parse_flags(self, raw_flags, permitted_flags):
+        flags = {}
+        for flag in raw_flags:
+            if flag not in permitted_flags:
+                raise self.error('Unrecognized option: "{}"'.format(flag))
+            flags[flag] = True
+        return flags
+
+    def extract_flags(self, arg, permitted_flags):
+        if not arg:
+            return None, None
+        flags = {}
+        if arg.startswith('/'):
+            match = re.match(r'/(\S*)\s+(.*)$', arg)
+            if not match:
+                return arg, flags
+            raw_flags, arg = match.group(1, 2)
+            flags = self.parse_flags(raw_flags, permitted_flags)
+        return arg, flags
+
     def configure(self):
         global PADDING, LJ_TISNUM
         if not self.check_libluajit():
@@ -70,6 +90,17 @@ class Debugger(object):
             self.write('luajit_dbg.py failed to load: '
                        'no debugging symbols found for libluajit\n')
             return False
+
+        # Setup arch.
+        try:
+            self.arch = str(self.eval('LJ_ARCH_NAME')).split('"')[1]
+        except Exception:
+            try:
+                self.arch = self.detect_arch()
+            except Exception:
+                # Setup on demand if necessary.
+                pass
+
         return True
 
     def initialize_extension(self, commands):
@@ -99,21 +130,42 @@ class Debugger(object):
         '''Return the content of the string by the given pointer.'''
         pass
 
+    @abc.abstractmethod
+    def address(self, obj):
+        '''Return the address in memory of the given object.'''
+        pass
+
     @abc.abstractmethod
     def lookup_global(self, symbol):
         '''Look up the global C symbol by the given name.'''
         pass
 
+    @abc.abstractmethod
+    def member_by_offset(self, typename, offset, prev_name=None):
+        '''Return the path of the member at the given offset in the type.'''
+        pass
+
     @abc.abstractmethod
     def eval(self, command):
         '''Parse and evaluate the given debugger command.'''
         pass
 
+    @abc.abstractmethod
+    def detect_arch(self):
+        '''Detect the CPU architecture and canonicalize it to the LuaJIT
+        notation.'''
+        pass
+
     @abc.abstractmethod
     def write(self, msg):
         '''Print the message.'''
         pass
 
+    @abc.abstractmethod
+    def error(self, msg):
+        '''Create the error object with message.'''
+        pass
+
     @abc.abstractmethod
     def check_libluajit(self):
         '''Check that libluajit is loaded.
@@ -172,10 +224,50 @@ class _GDBDebugger(Debugger):
         # A string is printed with a pointer to it. Just strip it.
         return re.sub(r'^0x[a-f0-9]+\s+(?=")', '', str(strptr))
 
+    def address(self, obj):
+        return obj.address
+
     def lookup_global(self, symbol):
         variable, _ = gdb.lookup_symbol(symbol)
         return variable.value() if variable else None
 
+    def member_by_offset(self, tp, offset, prev_name=None):
+        if isinstance(tp, str):
+            tp = self._dbgtype(tp)
+        assert offset < tp.sizeof, 'offset is bigger than object size'
+        if tp.code == gdb.TYPE_CODE_TYPEDEF:
+            tp = tp.strip_typedefs()
+        if tp.code == gdb.TYPE_CODE_STRUCT:
+            fields = tp.fields()
+            for n_field in range(len(fields)):
+                islast = n_field == (len(fields) - 1)
+                field = fields[n_field]
+                start_field = field.bitpos // 8
+                end_field = fields[n_field + 1].bitpos // 8 if not islast \
+                    else tp.sizeof
+                if start_field <= offset and offset < end_field:
+                    next_name = self.member_by_offset(
+                        field.type,
+                        offset - start_field,
+                        prev_name=field.name
+                    )
+                    return '.{field}{suffix}'.format(
+                        field=field.name,
+                        suffix=next_name if next_name else ''
+                    )
+        elif tp.code == gdb.TYPE_CODE_ARRAY:
+            # Get array field type.
+            target = tp.target()
+            tsize = target.sizeof
+            idx = int(offset // tsize)
+            next_name = self.member_by_offset(target, offset - idx * tsize)
+            idxname = idx_name(prev_name)
+            if idxname and idx in idxname:
+                idx = idxname[idx]
+            return '[{}]{}'.format(idx, next_name if next_name else '')
+        else:
+            return None
+
     def eval(self, command):
         if not command:
             return None
@@ -185,9 +277,23 @@ class _GDBDebugger(Debugger):
             raise gdb.GdbError('table argument empty')
         return ret
 
+    def detect_arch(self):
+        if hasattr(self, 'arch'):
+            return self.arch
+        target = str(gdb.execute('info target', False, True))
+        if re.match('.*x86-64.*', target, flags=re.DOTALL):
+            return 'x64'
+        elif re.match('.*aarch64.*', target, flags=re.DOTALL):
+            return 'arm64'
+        else:
+            return ''
+
     def write(self, msg):
         gdb.write(msg)
 
+    def error(self, errmsg):
+        return gdb.GdbError(errmsg)
+
     def check_libluajit(self):
         # XXX Fragile: Though connecting the callback looks bad,
         # it respects both Python 2 and Python 3 (see #4828).
@@ -322,8 +428,26 @@ class _LLDBDebugger(Debugger):
         def lldb__getitem__(lldbval, key):
             if type(key) is lldb.value:
                 key = int(key)
+            if type(key) is bool:
+                key = int(key)
             if type(key) is int:
                 # Allow array access.
+                ltp = lldbval.sbvalue.GetType()
+                # XXX: LLDB in versions 17 - 19 can't use an array
+                # object as the initializer for `lldb.value` since
+                # `GetValue()` for it returns `None` leading to
+                # the invalid result. See
+                # https://github.com/llvm/llvm-project/pull/90144.
+                if (self.version < 17 or self.version > 19) or \
+                   ltp.GetTypeClass() != lldb.eTypeClassArray:
+                    pass
+                else:
+                    ptr_tp = ltp.GetArrayElementType().GetPointerType()
+                    lldbval = self._lldb_value_from_raw(
+                        lldbval.sbvalue.GetLoadAddress(),
+                        ptr_tp.GetByteSize(),
+                        ptr_tp
+                    )
                 if key >= 0 and not lldbval.sbvalue.TypeIsPointerType():
                     return lldb.value(
                         lldbval.sbvalue.GetValueForExpressionPath('[%i]' % key)
@@ -349,6 +473,9 @@ class _LLDBDebugger(Debugger):
         def lldb__gt__(lldbval, other):
             return int(lldbval) > int(other)
 
+        def lldb__index__(lldbval):
+            return int(lldbval)
+
         def lldb__le__(lldbval, other):
             return int(lldbval) <= int(other)
 
@@ -406,6 +533,7 @@ class _LLDBDebugger(Debugger):
         lldb.value.__ge__ = lldb__ge__
         lldb.value.__getitem__ = lldb__getitem__
         lldb.value.__gt__ = lldb__gt__
+        lldb.value.__index__ = lldb__index__
         lldb.value.__le__ = lldb__le__
         lldb.value.__lt__ = lldb__lt__
         lldb.value.__str__ = lldb__str__
@@ -474,6 +602,9 @@ class _LLDBDebugger(Debugger):
     def cstr(self, strptr):
         return strptr.sbvalue.summary
 
+    def address(self, obj):
+        return lldb.value(obj.sbvalue.address_of)
+
     def lookup_global(self, symbol):
         sbvalue = self.target.FindFirstGlobalVariable(symbol)
         tp = sbvalue.GetType()
@@ -492,6 +623,46 @@ class _LLDBDebugger(Debugger):
                 ptr_tp
             )
 
+    def member_by_offset(self, tp, offset, prev_name=None):
+        if isinstance(tp, str):
+            tp = self._dbgtype(tp)
+        assert offset < tp.GetByteSize(), 'offset is bigger than object size'
+        tp = tp.GetCanonicalType()
+        if tp.GetTypeClass() == lldb.eTypeClassStruct:
+            len_fields = tp.GetNumberOfFields()
+            for n_field in range(len_fields):
+                islast = n_field == (len_fields - 1)
+                field = tp.GetFieldAtIndex(n_field)
+                start_field = field.GetOffsetInBytes()
+                if not islast:
+                    end_field = tp.GetFieldAtIndex(
+                        n_field + 1
+                    ).GetOffsetInBytes()
+                else:
+                    end_field = tp.GetByteSize()
+                if start_field <= offset and offset < end_field:
+                    next_name = self.member_by_offset(
+                        field.GetType(),
+                        offset - start_field,
+                        prev_name=field.GetName()
+                    )
+                    return '.{field}{suffix}'.format(
+                        field=field.GetName(),
+                        suffix=next_name if next_name else ''
+                    )
+        if tp.GetTypeClass() == lldb.eTypeClassArray:
+            # Get array field type.
+            target = tp.GetArrayElementType()
+            tsize = target.GetByteSize()
+            idx = int(offset // tsize)
+            next_name = self.member_by_offset(target, offset - idx * tsize)
+            idxname = idx_name(prev_name)
+            if idxname and idx in idxname:
+                idx = idxname[idx]
+            return '[{}]{}'.format(idx, next_name if next_name else '')
+        else:
+            return None
+
     def eval(self, command):
         if not command:
             return None
@@ -502,9 +673,23 @@ class _LLDBDebugger(Debugger):
         ret = frame.EvaluateExpression(command)
         return ret
 
+    def detect_arch(self):
+        if hasattr(self, 'arch'):
+            return self.arch
+        target = self.target.GetTriple().split('-')[0]
+        if target == 'x86_64':
+            return 'x64'
+        elif target == 'arm64' or target == 'aarch64':
+            return 'arm64'
+        else:
+            return ''
+
     def write(self, msg):
         sys.stdout.write(msg)
 
+    def error(self, errmsg):
+        return Exception(errmsg)
+
     def check_libluajit(self):
         # TODO: Implement postpone loading for LLDB too.
         return True
@@ -997,6 +1182,86 @@ def J(g):
     return dbg.cast('jit_State *', dbg.cast('char *', g) - g_offset + J_offset)
 
 
+# Matched `MMDEF(_)`.
+MM_NAMES = [
+    'index',
+    'newindex',
+    'gc',
+    'mode',
+    'eq',
+    'len',
+    'lt',
+    'le',
+    'concat',
+    'call',
+    'add',
+    'sub',
+    'mul',
+    'div',
+    'mod',
+    'pow',
+    'unm',
+    'metatable',
+    'tostring',
+    # TODO: depends on LJ_HASFFI, see `MMDEF_FFI(_)`.
+    'new',
+    # TODO: depends on LJ_52 || LJ_HASFFI, see `MMDEF_PAIRS(_)`.
+    'pairs',
+    'ipairs',
+]
+
+
+GCROOT_MMNAME = 0
+GCROOT_BASEMT = GCROOT_MMNAME + len(MM_NAMES)
+GCROOT_IO_INPUT = GCROOT_BASEMT + i2notu32(LJ_T['NUMX']) + 1
+GCROOT_IO_OUTPUT = GCROOT_IO_INPUT + 1
+
+
+# Get the name of the index in the predefined arrays.
+def idx_name(field_name):
+    # Don't use **{ to be compatible with Python 2.
+    gcroot = {}
+    gcroot.update({
+        i: 'GCROOT_MMNAME_' + MM_NAMES[i] for i in range(len(MM_NAMES))
+    })
+    gcroot.update({
+        i2notu32(LJ_T[k]) + GCROOT_BASEMT: 'GCROOT_BASEMT_' + k
+        for k in LJ_T.keys()
+    })
+    gcroot.update({
+        GCROOT_IO_INPUT:  'GCROOT_IO_INPUT',
+        GCROOT_IO_OUTPUT: 'GCROOT_IO_OUTPUT',
+    })
+    return {
+        # May be one of 2 slots depending on the result address.
+        'ksimd': {
+            0 * 2 + 0: 'LJ_KSIMD_ABS',
+            0 * 2 + 1: 'LJ_KSIMD_ABS',
+            1 * 2 + 0: 'LJ_KSIMD_NEG',
+            1 * 2 + 1: 'LJ_KSIMD_NEG',
+        },
+        'gcroot': gcroot,
+    }.get(field_name, None)
+
+
+ggfname_cache = {}
+
+
+# Get GG field name by given offset. Use in JIT dump.
+def ggfname_by_offset(offset):
+    if offset in ggfname_cache:
+        return ggfname_cache[offset]
+
+    field_path = dbg.member_by_offset('GG_State', offset)
+    if not field_path:
+        return None
+
+    # Remove first '.'.
+    ggfname = 'offsetof(GG, {})'.format(field_path[1:])
+    ggfname_cache[offset] = ggfname
+    return ggfname
+
+
 def vm_state(g):
     return {
         i2notu32(0): 'INTERP',
@@ -1087,6 +1352,555 @@ def lightudV(tv):
         return gcval(tv['gcr'])
 
 
+# JIT engine.
+
+
+IRS = [
+    # Guarded assertions.
+    'LT',
+    'GE',
+    'LE',
+    'GT',
+
+    'ULT',
+    'UGE',
+    'ULE',
+    'UGT',
+
+    'EQ',
+    'NE',
+
+    'ABC',
+    'RETF',
+
+    # Miscellaneous ops.
+    'NOP',
+    'BASE',
+    'PVAL',
+    'GCSTEP',
+    'HIOP',
+    'LOOP',
+    'USE',
+    'PHI',
+    'RENAME',
+    'PROF',
+
+    # Constants.
+    'KPRI',
+    'KINT',
+    'KGC',
+    'KPTR',
+    'KKPTR',
+    'KNULL',
+    'KNUM',
+    'KINT64',
+    'KSLOT',
+
+    # Bit ops.
+    'BNOT',
+    'BSWAP',
+    'BAND',
+    'BOR',
+    'BXOR',
+    'BSHL',
+    'BSHR',
+    'BSAR',
+    'BROL',
+    'BROR',
+
+    # Arithmetic ops. ORDER ARITH
+    'ADD',
+    'SUB',
+    'MUL',
+    'DIV',
+    'MOD',
+    'POW',
+    'NEG',
+
+    'ABS',
+    'LDEXP',
+    'MIN',
+    'MAX',
+    'FPMATH',
+
+    # Overflow-checking arithmetic ops.
+    'ADDOV',
+    'SUBOV',
+    'MULOV',
+
+    # Memory ops. A = array, H = hash, U = upvalue, F = field,
+    # S = stack.
+
+    # Memory references.
+    'AREF',
+    'HREFK',
+    'HREF',
+    'NEWREF',
+    'UREFO',
+    'UREFC',
+    'FREF',
+    'STRREF',
+    'LREF',
+
+    # Loads and Stores. These must be in the same order.
+    'ALOAD',
+    'HLOAD',
+    'ULOAD',
+    'FLOAD',
+    'XLOAD',
+    'SLOAD',
+    'VLOAD',
+
+    'ASTORE',
+    'HSTORE',
+    'USTORE',
+    'FSTORE',
+    'XSTORE',
+
+    # Allocations.
+    'SNEW',
+    'XSNEW',
+    'TNEW',
+    'TDUP',
+    'CNEW',
+    'CNEWI',
+
+    # Buffer operations.
+    'BUFHDR',
+    'BUFPUT',
+    'BUFSTR',
+
+    # Barriers.
+    'TBAR',
+    'OBAR',
+    'XBAR',
+
+    # Type conversions.
+    'CONV',
+    'TOBIT',
+    'TOSTR',
+    'STRTO',
+
+    # Calls.
+    'CALLN',
+    'CALLA',
+    'CALLL',
+    'CALLS',
+    'CALLXS',
+    'CARG',
+]
+
+
+# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
+# Non-weak guard.
+IRM_C = 0x10
+IRM_A = 0x20
+IRM_L = 0x40
+IRM_S = 0x60
+IRM_W = 0x80
+
+
+# IR operand mode (2 bit).
+IRM = [
+  'ref',
+  'lit',
+  'cst',
+  '',  # none
+]
+
+
+lj_ir_mode_ = None
+
+
+def lj_ir_mode():
+    global lj_ir_mode_
+    if lj_ir_mode_:
+        return lj_ir_mode_
+    lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
+    return lj_ir_mode_
+
+
+def ir_left(op):
+    return IRM[int(lj_ir_mode()[op] & 3)]
+
+
+def ir_right(op):
+    return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
+
+
+def ir_mode(op):
+    irm = int(lj_ir_mode()[op] ^ IRM_W)
+    kind = irm & (IRM_C | IRM_S)  # Drop the operand mode bits.
+    if kind == IRM_C:
+        mode = 'C'
+    elif kind == IRM_A:
+        mode = 'A'
+    elif kind == IRM_L:
+        mode = 'L'
+    elif kind == IRM_S:
+        mode = 'S'
+    else:
+        mode = 'N'
+    mode += 'W' if irm & IRM_W else ''
+    return mode
+
+
+IRTYPES = [
+  'nil',
+  'fal',
+  'tru',
+  'lud',
+  'str',
+  'p32',
+  'thr',
+  'pro',
+  'fun',
+  'p64',
+  'cdt',
+  'tab',
+  'udt',
+  'flt',
+  'num',
+  'i8 ',
+  'u8 ',
+  'i16',
+  'u16',
+  'int',
+  'u32',
+  'i64',
+  'u64',
+  'sfp',
+]
+
+
+IRT_NUM = 14
+assert IRTYPES[IRT_NUM] == 'num', 'incorrect IRT_NUM definition'
+
+
+IRFIELDS = [
+    'str.len',
+    'func.env',
+    'func.pc',
+    'func.ffid',
+    'thread.env',
+    'tab.meta',
+    'tab.array',
+    'tab.node',
+    'tab.asize',
+    'tab.hmask',
+    'tab.nomm',
+    'udata.meta',
+    'udata.udtype',
+    'udata.file',
+    'cdata.ctypeid',
+    'cdata.ptr',
+    'cdata.int',
+    'cdata.int64',
+    'cdata.int64_4',
+]
+
+
+IRFPMS = [
+    'floor',
+    'ceil',
+    'trunc',
+    'sqrt',
+    'exp2',
+    'log',
+    'log2',
+    'other'
+]
+
+
+# Don't use *[ to be compatible with Python 2.
+REGISTERS = {'x64': [
+    'rax',
+    'rcx',
+    'rdx',
+    'rbx',
+    'rsp',
+    'rbp',
+    'rsi',
+    'rdi',
+] + [
+    'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
+] + [
+    'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
+], 'arm64': [
+    'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
+] + ['sp'] + [  # x31
+    'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
+]}
+
+
+IR_CALLS = [
+    'lj_str_cmp',
+    'lj_str_find',
+    'lj_str_new',
+    'lj_strscan_num',
+    'lj_strfmt_int',
+    'lj_strfmt_num',
+    'lj_strfmt_char',
+    'lj_strfmt_putint',
+    'lj_strfmt_putnum',
+    'lj_strfmt_putquoted',
+    'lj_strfmt_putfxint',
+    'lj_strfmt_putfnum_int',
+    'lj_strfmt_putfnum_uint',
+    'lj_strfmt_putfnum',
+    'lj_strfmt_putfstr',
+    'lj_strfmt_putfchar',
+    'lj_buf_putmem',
+    'lj_buf_putstr',
+    'lj_buf_putchar',
+    'lj_buf_putstr_reverse',
+    'lj_buf_putstr_lower',
+    'lj_buf_putstr_upper',
+    'lj_buf_putstr_rep',
+    'lj_buf_puttab',
+    'lj_buf_tostr',
+    'lj_tab_new_ah',
+    'lj_tab_new1',
+    'lj_tab_dup',
+    'lj_tab_clear',
+    'lj_tab_newkey',
+    'lj_tab_len',
+    'lj_gc_step_jit',
+    'lj_gc_barrieruv',
+    'lj_mem_newgco',
+    'lj_math_random_step',
+    'lj_vm_modi',
+    'log10',
+    'exp',
+    'sin',
+    'cos',
+    'tan',
+    'asin',
+    'acos',
+    'atan',
+    'sinh',
+    'cosh',
+    'tanh',
+    'fputc',
+    'fwrite',
+    'fflush',
+    'lj_vm_floor',
+    'lj_vm_ceil',
+    'lj_vm_trunc',
+    'sqrt',
+    'log',
+    'lj_vm_log2',
+    'pow',
+    'atan2',
+    'ldexp',
+    'lj_vm_tobit',
+    'softfp_add',
+    'softfp_sub',
+    'softfp_mul',
+    'softfp_div',
+    'softfp_cmp',
+    'softfp_i2d',
+    'softfp_d2i',
+    'lj_vm_sfmin',
+    'lj_vm_sfmax',
+    'lj_vm_tointg',
+    'softfp_ui2d',
+    'softfp_f2d',
+    'softfp_d2ui',
+    'softfp_d2f',
+    'softfp_i2f',
+    'softfp_ui2f',
+    'softfp_f2i',
+    'softfp_f2ui',
+    'fp64_l2d',
+    'fp64_ul2d',
+    'fp64_l2f',
+    'fp64_ul2f',
+    'fp64_d2l',
+    'fp64_d2ul',
+    'fp64_f2l',
+    'fp64_f2ul',
+    'lj_carith_divi64',
+    'lj_carith_divu64',
+    'lj_carith_modi64',
+    'lj_carith_modu64',
+    'lj_carith_powi64',
+    'lj_carith_powu64',
+    'lj_cdata_newv',
+    'lj_cdata_setfin',
+    'strlen',
+    'memcpy',
+    'memset',
+    'lj_vm_errno',
+    'lj_carith_mul64',
+    'lj_carith_shl64',
+    'lj_carith_shr64',
+    'lj_carith_sar64',
+    'lj_carith_rol64',
+    'lj_carith_ror64',
+]
+
+
+def regname(reg_number):
+    if not hasattr(dbg, 'arch'):
+        dbg.arch = dbg.detect_arch()
+    return REGISTERS[dbg.arch][reg_number]
+
+
+def litname_sload(mode):
+    modes_str = ''
+    modes_str += 'P' if mode & 0x1 else ''
+    modes_str += 'F' if mode & 0x2 else ''
+    modes_str += 'T' if mode & 0x4 else ''
+    modes_str += 'C' if mode & 0x8 else ''
+    modes_str += 'R' if mode & 0x10 else ''
+    modes_str += 'I' if mode & 0x20 else ''
+    return modes_str
+
+
+def litname_xload(mode):
+    flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
+    return flags[mode]
+
+
+def litname_conv(mode):
+    IRCONV_DSH = 5
+    IRCONV_CSH = 12
+    IRCONV_SEXT = 0x800
+    IRCONV_SRCMASK = 0x1f
+    conv_str = '{to}.{frm}'.format(
+        to=IRTYPES[(mode >> IRCONV_DSH) & IRCONV_SRCMASK],
+        frm=IRTYPES[mode & IRCONV_SRCMASK]
+    )
+    conv_str += ' sext' if mode & IRCONV_SEXT else ''
+    num2int_mode = mode >> IRCONV_CSH
+    if num2int_mode == 2:
+        conv_str += ' index'
+    elif num2int_mode == 3:
+        conv_str += ' check'
+    return conv_str
+
+
+def litname_irfield(mode):
+    if mode >= len(IRFIELDS):
+        return 'unknown irfield'
+    return IRFIELDS[mode]
+
+
+def litname_fpm(mode):
+    if mode >= len(IRFPMS):
+        return 'unknown irfpm'
+    return IRFPMS[mode]
+
+
+def litname_bufhdr(mode):
+    modes = ['RESET', 'APPEND']
+    if mode >= len(modes):
+        return 'unknown bufhdr mode'
+    return modes[mode]
+
+
+def litname_tostr(mode):
+    modes = ['INT', 'NUM', 'CHAR']
+    if mode >= len(modes):
+        return 'unknown tostr mode'
+    return modes[mode]
+
+
+IR_LITNAMES = {
+    'SLOAD':  litname_sload,
+    'XLOAD':  litname_xload,
+    'CONV':   litname_conv,
+    'FLOAD':  litname_irfield,
+    'FREF':   litname_irfield,
+    'FPMATH': litname_fpm,
+    'BUFHDR': litname_bufhdr,
+    'TOSTR':  litname_tostr
+}
+
+# Additional flags.
+IRT_MARK = 0x20  # Marker for misc. purposes.
+IRT_ISPHI = 0x40  # Instruction is left or right PHI operand.
+IRT_GUARD = 0x80  # Instruction is a guard.
+# Masks.
+IRT_TYPE = 0x1f
+
+RID_NONE = 0x80
+RID_MASK = 0x7f
+RID_INIT = (RID_NONE | RID_MASK)
+RID_SINK = (RID_INIT - 1)
+RID_SUNK = (RID_INIT - 2)
+# Spill slot 0 means no spill slot has been allocated.
+SPS_NONE = 0
+
+REF_BIAS = 0x8000
+
+TREF_SHIFT = 24
+
+TREF_REFMASK = 0x0000ffff
+TREF_FRAME = 0x00010000
+TREF_CONT = 0x00020000
+# Snapshot flags and masks.
+SNAP_FRAME = 0x010000
+SNAP_SOFTFPNUM = 0x080000
+
+
+def irt_type(t):
+    return dbg.cast('IRType', t['irt'] & IRT_TYPE)
+
+
+def tref_type(tr):
+    return dbg.cast('IRType', (tr >> TREF_SHIFT) & IRT_TYPE)
+
+
+def tref_ref(tr):
+    return int(tr & TREF_REFMASK)
+
+
+def irt_ismarked(t):
+    return t['irt'] & IRT_MARK
+
+
+def irt_isphi(t):
+    return t['irt'] & IRT_ISPHI
+
+
+def irt_isguard(t):
+    return t['irt'] & IRT_GUARD
+
+
+def irt_toitype(irt):
+    t = irt_type(irt)
+    if LJ_DUALNUM and t > IRT_NUM:
+        return LJ_T['NUMX']
+    else:
+        return i2notu32(t)
+
+
+def ir_kptr(ir):
+    irname = IRS[ir['o']]
+    assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_kptr()'
+    return mref('void *', dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['ptr'])
+
+
+def ir_kgc(ir):
+    irname = IRS[ir['o']]
+    assert irname == 'KGC', 'wrong IR for ir_kgc()'
+    return gcref(dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['gcr'])
+
+
+def ir_knum(ir):
+    irname = IRS[ir['o']]
+    assert irname == 'KNUM', 'wrong IR for ir_knum()'
+    return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
+
+
+def ir_kint64(ir):
+    irname = IRS[ir['o']]
+    assert irname == 'KINT64', 'wrong IR for ir_kint64()'
+    return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
+
+
 # Dumpers.
 
 # GCobj dumpers.
@@ -1467,6 +2281,325 @@ def dump_func(func):
         return 'fast function #{}\n'.format(int(ffid))
 
 
+# JIT dumpers.
+
+
+def dump_call_func(trace, callop):
+    ctype = ''
+    if callop > 0:
+        ir = trace['ir'][REF_BIAS + callop]
+        if IRTYPES[irt_type(ir['t'])] == 'nil':  # nil == CARG(func, ctype)
+            callop = int(ir['op1']) - REF_BIAS
+            cdt_idx_irk = trace['ir'][ir['op2']]
+            assert IRS[cdt_idx_irk['o']] == 'KINT', \
+                   'unexpected IR for ctype storage'
+            ctype_idx = cdt_idx_irk['i']
+            ctype = 'ctype: {}'.format(ctype_idx)
+
+    func_str = ''
+    if callop < 0:
+        irk = trace['ir'][REF_BIAS + callop]
+        assert IRS[irk['o']] == 'KINT64', \
+               'unexpected IR for FFI function storage'
+        func_addr = int(ir_kint64(irk)['u64'])
+        # TODO: Symbol demangling.
+        func_str = '[{:#x}]'.format(func_addr)
+    else:
+        func_str = '[{:04d}]'.format(callop)
+
+    return func_str, ctype
+
+
+def dump_call_args(trace, ins):
+    if ins < 0:
+        return '{{{}}}'.format(dump_irk(trace, ins))
+    else:
+        ir = trace['ir'][REF_BIAS + ins]
+        irname = IRS[ir['o']]
+        if irname == 'CARG':
+            last_arg = ''
+            args = dump_call_args(trace, int(ir['op1']) - REF_BIAS)
+            op2 = int(ir['op2']) - REF_BIAS
+            if op2 < 0:
+                last_arg = '{{{}}}'.format(dump_irk(trace, op2))
+            else:
+                last_arg = '{{{:04d}}}'.format(op2)
+            return args + ', ' + last_arg
+        else:
+            return '{{{:04d}}}'.format(ins)
+
+
+# Special FP constant.
+CONST_BIAS = 2 ** 52 + 2 ** 51
+
+
+def dump_irk(trace, idx):
+    ref = idx + REF_BIAS
+    assert ref >= trace['nk'] and ref < REF_BIAS, 'bad constant in IR dump'
+    irins = trace['ir'][ref]
+    irname = IRS[irins['o']]
+    slot = ''
+    if irname == 'KSLOT':
+        slot = ' KSLOT: @{}'.format(int(irins['op2']))
+        irins = trace['ir'][irins['op1']]
+        irname = IRS[irins['o']]
+
+    irtype = irins['t']
+    if irname == 'KPRI':
+        typename = typenames(irt_toitype(irtype))
+        # Trivial dump for primitives.
+        irk = tv_dumpers.get(
+            typename, dump_lj_tv_invalid  # noqa: F821 # Generated.
+        )(0)
+    elif irname == 'KINT':
+        irk = 'integer {}'.format(dbg.cast('int32_t', irins['i']))
+    elif irname == 'KGC':
+        typename = typenames(irt_toitype(irtype))
+        irk = gco_dumpers.get(typename, dump_lj_gco_invalid)(ir_kgc(irins))
+    elif irname == 'KKPTR':
+        addr = ir_kptr(irins)
+        if addr == dbg.address(G(L())['nilnode']):
+            return '[g->nilnode]' + slot
+        irk = '[{}]'.format(strx64(addr))
+    elif irname == 'KPTR':
+        irk = '[{}]'.format(strx64(ir_kptr(irins)))
+    elif irname == 'KNULL':
+        irk = 'NULL'
+    elif irname == 'KNUM':
+        tv_num = ir_knum(irins)
+        if float(tv_num['n']) == CONST_BIAS:
+            return 'bias'
+        irk = dump_lj_tv_numx(tv_num)
+    elif irname == 'KINT64':
+        irk = 'int64_t {}'.format(dbg.cast(
+            'int64_t', int(ir_kint64(irins)['u64'])
+        ))
+    else:
+        return 'Unknown IRK: ' + irname
+    return irk + slot
+
+
+def dump_irins(irins, trace=None):
+    irop = int(irins['o'])
+    if irop >= len(IRS):
+        return 'INVALID'
+
+    irname = IRS[irop]
+    leftop = ir_left(irop)
+    rightop = ir_right(irop)
+    irt = irins['t']
+    is_sinksunk = irins['r'] == RID_SINK or irins['r'] == RID_SUNK
+    flags = '{is_sinksunk}{is_marked}{is_guard}{is_phi}'.format(
+        # Sink flag should be the first to match sink slots during
+        # the dump of registers.
+        is_sinksunk='}' if is_sinksunk else ' ',
+        is_marked='!' if irt_ismarked(irt) else ' ',
+        is_guard='>' if irt_isguard(irt) else ' ',
+        is_phi='+' if irt_isphi(irt) else ' '
+    )
+
+    if not trace:
+        g = G(L(None))
+        compiling = jit_state(g) != 'IDLE'
+        assert compiling, 'attempt to dump IR for J.cur trace in bad VM state'
+        trace = J(g)['cur']
+
+    left = ''
+    right = ''
+    lisref = leftop == 'ref'
+    risref = rightop == 'ref'
+    op1 = int((irins['op1'] - REF_BIAS) if lisref else irins['op1'])
+    op2 = int((irins['op2'] - REF_BIAS) if risref else irins['op2'])
+
+    skip_right = False
+    if re.match('CALL', irname):
+        ctype = ''
+        args = ''
+        if rightop == 'lit':
+            func = IR_CALLS[op2]
+        else:
+            func, ctype = dump_call_func(trace, op2)
+
+        if op1 != -1:
+            args = dump_call_args(trace, int(op1))
+
+        return '{flags} {type} {name:6} [{mode:2}] {f}({args}) {ct}\n'.format(
+            flags=flags,
+            name=irname,
+            mode=ir_mode(irop),
+            type=IRTYPES[irt_type(irt)],
+            ct=ctype,
+            args=args,
+            f=func,
+        )
+    elif irname == 'CNEW' and op2 == -1:
+        left = dump_irk(trace, op1)
+        skip_right = True
+    elif leftop:
+        if op1 < 0:
+            left = dump_irk(trace, op1)
+        elif leftop == 'cst':
+            idx = irins - dbg.address(trace['ir'][REF_BIAS])
+            left = dump_irk(trace, idx)
+        else:
+            left = ('{:04d}' if lisref else '#{:<3d}').format(op1)
+
+        if rightop:
+            if rightop == 'lit':
+                litname = IR_LITNAMES.get(irname, None)
+                if litname:
+                    # Try to handle `lj_ir_ggfload()`.
+                    ggfname = None
+                    if irname == 'FLOAD' and left == 'nil' \
+                       and op2 >= len(IRFIELDS):
+                        ggfname = ggfname_by_offset(op2 << 2)
+
+                    if ggfname:
+                        right = ggfname
+                    else:
+                        right = litname(op2)
+                elif irname == 'UREFO' or irname == 'UREFC':
+                    right = '#{:<3d}'.format(op2 >> 8)
+                else:
+                    right = '#{:<3d}'.format(op2)
+            elif op2 < 0:
+                right = dump_irk(trace, op2)
+            else:
+                right = ('{:04d}').format(op2)
+
+    typename = ''
+    if irname == 'LOOP':
+        typename = '---'
+    elif irname == 'NOP':
+        typename = '   '
+    else:
+        typename = IRTYPES[irt_type(irt)]
+
+    return '{flags} {type} {name:6} [{mode:2}] {left:<9s} {right}\n'.format(
+        flags=flags,
+        name=irname,
+        mode=ir_mode(irop),
+        type=typename,
+        left=(leftop + ': ' + left) if leftop else '',
+        right=(rightop + ': ' + right) if rightop and not skip_right else '',
+    )
+
+
+def dump_snap(trace, snapno, snap):
+    dump = 'SNAP   #{:<3d} ['.format(snapno)
+    snap_map = dbg.address(trace['snapmap'][snap['mapofs']])
+    snap_entry_num = 0
+    for slot in range(0, snap['nslots']):
+        dump += ' '
+        snap_entry = int(snap_map[snap_entry_num])
+        if snap_entry_num < snap['nent'] and snap_entry >> TREF_SHIFT == slot:
+            snap_entry_num += 1
+            ref = int((snap_entry & TREF_REFMASK) - REF_BIAS)
+            if ref < 0:
+                if int(snap_entry) == 0x1057fff:
+                    dump += '----'
+                    continue
+                elif (snap_entry & TREF_CONT):
+                    dump += 'contpc'
+                elif (snap_entry & TREF_FRAME):
+                    dump += 'ftsz '
+                else:
+                    dump += '{{{const}}}'.format(const=dump_irk(trace, ref))
+            elif snap_entry & SNAP_SOFTFPNUM:
+                dump += '{:04d}/{:04d}'.format(ref, ref + 1)
+            else:
+                dump += '{:04d}'.format(ref)
+
+            if snap_entry & SNAP_FRAME:
+                dump += '|'
+        else:
+            dump += '----'
+
+    dump += ' ]\n'
+    return dump
+
+
+def dump_sink_slot(rid, spill, ins_number):
+    assert rid == RID_SINK or rid == RID_SUNK, 'incorrect rid in sink dump'
+    tp = 'sink' if rid == RID_SINK else 'sunk'
+    return '{{{}'.format(tp) if spill == RID_INIT or spill == SPS_NONE \
+           else '{{{:04d}'.format(int(ins_number - spill))
+
+
+def dump_regsp(irins, ins_number):
+    rid = irins['r']
+    spill = irins['s']
+    if rid == RID_SINK or rid == RID_SUNK:
+        return dump_sink_slot(rid, spill, ins_number)
+    elif irins['prev'] > 255:
+        return '[{:#05x}]'.format(int(spill * 4))
+    elif rid < 128:
+        return regname(rid)
+    else:
+        return ''
+
+
+def dump_trace(trace, flags):
+    dump = 'Trace {num} start\n\tproto: {start_pt}\n\tBC: {start_bc}\n'.format(
+        num=trace['traceno'],
+        start_pt=gcref(trace['startpt']),
+        start_bc=mref('BCIns *', trace['startpc']),
+    )
+
+    nins = trace['nins'] - REF_BIAS
+    dump += '---- TRACE IR\n'
+    nsnap = 0
+    snap = trace['snap'][nsnap]
+    snapref = snap['ref']
+    for irnum in range(1, nins):
+        irref = REF_BIAS + irnum
+        if 's' in flags and irref >= snapref and nsnap < trace['nsnap']:
+            dump += '....          '
+            if 'r' in flags:
+                dump += ' ' * 7
+            dump += dump_snap(trace, nsnap, snap)
+            nsnap += 1
+            snap = trace['snap'][nsnap]
+            snapref = snap['ref']
+        dump += '{:04d} '.format(irnum)
+        if 'r' in flags:
+            dump += '{:>7}'.format(dump_regsp(trace['ir'][irref], irnum))
+        dump += dump_irins(trace['ir'][irref], trace)
+    return dump
+
+
+def dump_tref(tref):
+    return '[{F}{C}] {tp} {ref:#x}'.format(
+        F='F' if tref & TREF_FRAME else ' ',
+        C='C' if tref & TREF_CONT else ' ',
+        tp=IRTYPES[tref_type(tref)],
+        ref=tref_ref(tref)
+    )
+
+
+def dump_jslots(coroutine):
+    lstate = L(None)
+    g = G(lstate or coroutine)
+    j = J(g)
+
+    dump = ''
+    maxslot = j['baseslot'] + j['maxslot']
+    first_base_slot = 1 + LJ_FR2
+    for n in reversed(range(first_base_slot, maxslot)):
+        tref = j['slot'][n]
+        ref = tref_ref(tref)
+        address = dbg.address(tref)
+        dump += '{addr} {nslot:04d} {base:1s} {tref}{const}\n'.format(
+            addr=address,
+            base='B' if address == j['base'] else ' ',
+            nslot=n,
+            tref=dump_tref(tref),
+            const=' ' + dump_irk(j['cur'], ref - REF_BIAS)
+                if ref != 0 and ref < REF_BIAS else ''
+        )
+    return dump
+
+
 # Extension commands. ############################################
 
 
@@ -1600,6 +2733,42 @@ error message occurs.
         dbg.write('{}\n'.format(dump_gcobj(gcobj)))
 
 
+class LJDumpIR(dbg.LJBase):
+    '''
+lj-ir <IRIns *>
+
+The command receives a pointer to <ir> (IRIns address) and dumps
+the IR type and some info related to it. The format is similar to
+the `jit.dump` tool but also provides information about the IR mode
+and operand modes.
+
+For the list of IR names and modes (operand types), see:
+https://github.com/tarantool/tarantool/wiki/LuaJIT-SSA-IR.
+    '''
+
+    def execute(self, arg):
+        dbg.write('{}'.format(dump_irins(dbg.cast('IRIns *', dbg.eval(arg)))))
+
+
+class LJDumpJSlots(dbg.LJBase):
+    '''
+lj-jslots [<lua_State *>]
+
+The command receives an optional lua_State address and dumps the
+slots of the JIT stack map:
+
+<slot ptr> <slot number> [B] [<FRAME|CONTINUATION>] <IR type> <IR reference>
+
+The lua_State pointer is used to find the VM's JIT state when there is
+no coroutine to be inspected in the debugged frame.
+    '''
+
+    def execute(self, arg):
+        dbg.write('{}'.format(
+            dump_jslots(dbg.cast('lua_State *', dbg.eval(arg)))
+        ))
+
+
 class LJDumpProto(dbg.LJBase):
     '''
 lj-proto <GCproto *>
@@ -1784,19 +2953,44 @@ error message occurs.
         dbg.write('{}\n'.format(dump_tvalue(tv)))
 
 
+class LJDumpTrace(dbg.LJBase):
+    '''
+lj-trace [/FLAGS] <GCtrace *>
+
+The command receives a pointer to <trace> (GCtrace address) and dumps
+its number, IRs, and information about the start location. The format
+is similar to the `jit.dump` tool but also provides information about
+the IR mode and operand modes.
+
+Trace may be preceded with /FLAGS:
+* r: Dump registers associated with IR, if any.
+* s: Dump snapshots for the trace.
+    '''
+
+    def execute(self, arg):
+        arg, flags = dbg.extract_flags(arg, 'rs')
+        dbg.write('{}'.format(dump_trace(
+            dbg.cast('GCtrace *', dbg.eval(arg)),
+            flags
+        )))
+
+
 def load(event=None):
     dbg.initialize_extension({
-        'lj-arch':  LJDumpArch,
-        'lj-bc':    LJDumpBC,
-        'lj-func':  LJDumpFunc,
-        'lj-gc':    LJGC,
-        'lj-gco':   LJDumpGCobj,
-        'lj-proto': LJDumpProto,
-        'lj-stack': LJDumpStack,
-        'lj-state': LJState,
-        'lj-str':   LJDumpString,
-        'lj-tab':   LJDumpTable,
-        'lj-tv':    LJDumpTValue,
+        'lj-arch':   LJDumpArch,
+        'lj-bc':     LJDumpBC,
+        'lj-func':   LJDumpFunc,
+        'lj-gc':     LJGC,
+        'lj-gco':    LJDumpGCobj,
+        'lj-ir':     LJDumpIR,
+        'lj-jslots': LJDumpJSlots,
+        'lj-proto':  LJDumpProto,
+        'lj-stack':  LJDumpStack,
+        'lj-state':  LJState,
+        'lj-str':    LJDumpString,
+        'lj-tab':    LJDumpTable,
+        'lj-trace':  LJDumpTrace,
+        'lj-tv':     LJDumpTValue,
     })
 
 
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 7e8ea5a2..76543daa 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -46,7 +46,9 @@ else:
 RX_ADDR = r'0x[a-f0-9]+'
 RX_HASH = RX_ADDR  # The same pattern for hexademic values.
 RX_BCN = r'00\d\d'
+RX_IRN = RX_BCN  # The same as for the bytecodes.
 RX_FRAME = r'\[(S|\s)(B|\s)(T|\s)(M|\s)\]'
+RX_IRREF = r'0x\d\d\d\d'
 
 
 def persist(data):
@@ -101,6 +103,9 @@ IS_GC64 = execute_process([
     LUAJIT_BINARY, '-e', "print(require('ffi').abi('gc64'))"
 ]).strip() == 'true'
 
+# Regexp for pointer type in IR.
+RX_P = 'p64' if IS_GC64 else 'p32'
+
 # If it is the guaranteed DUALNUM build (for example, on aarch64),
 # we use this regexp for the guaranteed 'integer' check and
 # 'number' for single-number build.
@@ -108,6 +113,18 @@ RX_INT = r'integer' if IS_DUALNUM else r'number'
 RX_ISDUALNUM = r'True' if IS_DUALNUM else r'False'
 
 
+# Assume that debugging is not cross-platform.
+machine = os.uname().machine
+if machine == 'x86_64':
+    RX_GPR = r'r\w\w'
+    RX_FPR = r'xmm\d+'
+elif machine == 'arm64' or machine == 'aarch64':
+    RX_GPR = r'x\d+'
+    RX_FPR = r'd\d+'
+else:
+    raise Exception('Unknown architecture in testing')
+
+
 class TestCaseBase(unittest.TestCase):
     @classmethod
     def construct_cmds(cls):
@@ -193,6 +210,16 @@ def mref(arg, tp):
             return '((' + tp + '*)(' + arg + ').ptr32)'
 
 
+def gcref(arg):
+    if SUPPORT_MACRO_EXPAND:
+        return 'gcref(' + arg + ')'
+    else:
+        if IS_GC64:
+            return '(' + arg + ').gcptr64'
+        else:
+            return '(' + arg + ').gcptr32'
+
+
 class TestLoad(TestCaseBase):
     extension_cmds = ''
     location = 'lj_cf_print'
@@ -203,11 +230,14 @@ class TestLoad(TestCaseBase):
         r'lj-func command initialized\n'
         r'lj-gc command initialized\n'
         r'lj-gco command initialized\n'
+        r'lj-ir command initialized\n'
+        r'lj-jslots command initialized\n'
         r'lj-proto command initialized\n'
         r'lj-stack command initialized\n'
         r'lj-state command initialized\n'
         r'lj-str command initialized\n'
         r'lj-tab command initialized\n'
+        r'lj-trace command initialized\n'
         r'lj-tv command initialized\n'
         r'LuaJIT debug extension is successfully loaded'
     )
@@ -473,6 +503,341 @@ class TestLJBC(TestCaseBase):
     )
 
 
+# JIT engine.
+
+
+class TestLJTraceBase(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'for _ = 1, 4 do end\n'
+        'print()\n'
+    )
+    pattern = (
+        r'Trace 1 start\n'
+        r'\t*proto: ' + RX_ADDR + r'\n' +
+        r'\t*BC: ' + RX_ADDR + r'\n' +
+        r'---- TRACE IR\n' +
+        RX_IRN + r'\s+    int SLOAD  \[N \] lit: #[12]   lit: C?I\n' +
+        RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
+                 r' ref: integer 1\n' +
+        RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
+                 r' ref: integer 4\n' +
+        RX_IRN + r'\s+ >  --- LOOP   \[N \]\s*\n' +
+        RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
+                 r' ref: integer 1\n' +
+        RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
+                 r' ref: integer 4\n' +
+        RX_IRN + r'\s+    int PHI    \[S \] ref: ' + RX_IRN + r' ref: ' +
+                 RX_IRN + r'\n' +
+        RX_IRN + r'\s+        NOP    \[N \]\s*\n'
+    )
+
+
+# Check the correctness of the IR enumeration by testing the lowest (LT)
+# and the highest (CARG) IRs. Also checks CALL* along the way.
+class TestLJTraceIRRange(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'ffi.cdef[[int getpid(int, int);]]\n'  # Use argument for testing.
+        'jit.opt.start("hotloop=1")\n'
+        'for i = 1, 4 do\n'
+        '  if i < 100 then\n'  # LT.
+        '    ffi.C.getpid(i, 1LL)\n'  # CARG and CALLXS.
+        '  end\n'
+        'end\n'
+        'print()\n'
+    )
+    # IRs from variant part of the trace.
+    pattern = (
+        RX_IRN + r'\s+ >  int LT     \[N \] ref: ' +
+                 RX_IRN + r' ref: integer 100\n' +
+        RX_IRN + r'\s+    nil CARG   \[N \] ref: ' +
+                 RX_IRN + r' ref: integer 1\n' +
+        RX_IRN + r'\s+    int CALLXS \[S \] \[' + RX_ADDR +
+                 r'\]\(\{' + RX_IRN + r'\}, \{integer 1\}\)'
+    )
+
+
+# Test /rs flags.
+class TestLJTraceFlags(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace /rs ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local r = 0.1\n'
+        'for i = 1, 4 do\n'
+        '  r = i + r\n'
+        'end\n'
+        'print()\n'
+    )
+    # IRs and snapshot from variant part of the trace.
+    pattern = (
+        RX_IRN + r'\s+' + RX_FPR + r'\s* \+ num ADD.*\n' +
+        RX_IRN + r'\s+' + RX_GPR + r'\s* \+ int ADD.*\n' +
+        r'\.\.\.\.\s* SNAP   #\d   \[ (---- )*' + RX_IRN + r' \]'
+    )
+
+
+class TestLJIRConst(TestCaseBase):
+    location = 'trace_stop'
+
+    # No narrowing of 42.
+    if IS_DUALNUM:
+        # KNUM occupies 2 slots.
+        _knum_irnum = '6'
+        _kgc_irnum = '8' if IS_GC64 else '7'
+        _kptr_irnum = '10' if IS_GC64 else '8'
+    else:
+        # KNUM occupies 2 slots.
+        _knum_irnum = '8'
+        _kgc_irnum = '10' if IS_GC64 else '9'
+        _kptr_irnum = '12' if IS_GC64 else '10'
+    extension_cmds = (
+        'n\n'  # Load J.
+        'lj-ir &J->cur.ir[0x8000 - 0]\n'
+        'lj-ir &J->cur.ir[0x8000 - 1]\n'
+        'lj-ir &J->cur.ir[0x8000 - 2]\n'
+        'lj-ir &J->cur.ir[0x8000 - 3]\n'
+        'lj-ir &J->cur.ir[0x8000 - 4]\n'
+        # Skip non-DUALNUM narrowed value.
+        'lj-ir &J->cur.ir[0x8000 - ' + _knum_irnum + ']\n'
+        'lj-ir &J->cur.ir[0x8000 - ' + _kgc_irnum + ']\n'
+        'lj-ir &J->cur.ir[0x8000 - ' + _kptr_irnum + ']\n'
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local function trace(x)\n'
+        '   return x + 42, x + 0.5, x .. "1"\n'
+        'end\n'
+        'trace(1)\n'
+        'trace(1)\n'
+    )
+    pattern = (
+        RX_P + r' BASE.*\n' +
+        r'\s* nil KPRI.*\n'
+        r'\s* fal KPRI.*\n'
+        r'\s* tru KPRI.*\n'
+        r'\s* int KINT.*cst: integer 42\s*\n'
+        r'\s* num KNUM.*cst: number 0.5\s*\n'
+        r'\s* str KGC.*cst: string "1".*\n' +
+        r'\s*' + RX_P + r' KPTR.*cst: \[' + RX_ADDR + r'\]'
+    )
+
+
+class TestLJIRFloadNeg(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local function trace(a)\n'
+        '  local x = -a\n'
+        '  return x\n'
+        'end\n'
+        'trace(1.1)\n'
+        'trace(1.1)\n'
+        'print()\n'
+    )
+    pattern = (
+        r'num FLOAD .* ref: nil  lit: offsetof\(GG, J\.ksimd\[LJ_KSIMD_NEG\]\)'
+    )
+
+
+class TestLJIRFloadAbs(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local math_abs = math.abs\n'
+        'local function trace(a)\n'
+        '  local x = math_abs(a)\n'
+        '  return x\n'
+        'end\n'
+        'trace(1)\n'
+        'trace(1)\n'
+        'print()\n'
+    )
+    pattern = (
+        r'num FLOAD .* ref: nil  lit: offsetof\(GG, J\.ksimd\[LJ_KSIMD_ABS\]\)'
+    )
+
+
+# XXX: Implemented only for GC64 in LuaJIT until backporting the
+# corresponding commit.
+if IS_GC64:
+    class TestLJIRFloadGCRootBaseMT(TestCaseBase):
+        location = 'lj_cf_print'
+        extension_cmds = (
+            'n\n'  # Load L.
+            'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+        )
+        lua_script = (
+            'jit.opt.start("hotloop=1")\n'
+            'local function trace(a)\n'
+            '  local x = a.sub(1, 2)\n'
+            '  return x\n'
+            'end\n'
+            'trace("12")\n'
+            'trace("12")\n'
+            'print()\n'
+        )
+        pattern = (
+            r'tab FLOAD .* ref: nil  lit: '
+            r'offsetof\(GG, g\.gcroot\[GCROOT_BASEMT_STR\]\.gcptr64\)'
+        )
+
+    class TestLJIRFloadGCRootIO(TestCaseBase):
+        location = 'lj_cf_print'
+        extension_cmds = (
+            'n\n'  # Load L.
+            'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+        )
+        lua_script = (
+            'jit.opt.start("hotloop=1")\n'
+            'local io_flush = io.flush\n'
+            'local function trace()\n'
+            '  io_flush()\n'
+            'end\n'
+            'trace()\n'
+            'trace()\n'
+            'print()\n'
+        )
+        pattern = (
+            r'udt FLOAD .* ref: nil  lit: '
+            r'offsetof\(GG, g\.gcroot\[GCROOT_IO_OUTPUT\]\.gcptr64\)'
+        )
+
+
+# Some IRs related to tables.
+class TestLJIRTable(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local function trace(t)\n'
+        '  t.a = nil\n'
+        '  t.b = 1\n'
+        '  return t\n'
+        'end\n'
+        'trace({a = 1})\n'
+        'trace({a = 1})\n'
+        'print()\n'
+    )
+    pattern = (
+        r'(?s)int FLOAD .* tab\.hmask\n'
+        r'.*' + RX_P + r' FLOAD .* tab\.node\n'
+        r'.*' + RX_P + r' HREFK .* string "a" @ ' + RX_ADDR +
+                       r' KSLOT: @\d\n'
+        r'.*' + RX_P + r' HREF .* string "b" @ ' + RX_ADDR + r'\s*\n'
+        r'.*' + RX_P + r' EQ .* \[g->nilnode\]'
+    )
+
+
+class TestLJIRUref(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'local uv = 0\n'
+        'local function trace(a)\n'
+        '  uv = a\n'
+        '  return uv\n'
+        'end\n'
+        'trace(1)\n'
+        'trace(1)\n'
+        'print()\n'
+    )
+    pattern = r'UREFO .* lit: #0'
+
+
+# Check border values (that are always available) of CALL IRs.
+class TestLJIRCall(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'jit.opt.start("hotloop=1")\n'
+        'local function trace(a, b)\n'
+        '  return a < b, ffi.errno()\n'
+        'end\n'
+        'trace("abc", "abd")\n'
+        'trace("abc", "abd")\n'
+        'print(1)\n'
+    )
+    pattern = (
+        r'(?s)int CALLN .* '
+        r'lj_str_cmp\(\{' + RX_IRN + r'\}, \{' + RX_IRN + r'\}\)'
+        r'.*int CALLS .* lj_vm_errno\(\)'
+    )
+
+
+# Test ffi call with ctype stored in CARG.
+class TestLJIRCallXSCType(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'ffi.cdef[[int printf(const char *fmt, ...);]]\n'
+        'jit.opt.start("hotloop=1")\n'
+        'local function trace()\n'
+        '  local t = ffi.C.printf("")\n'
+        '  return t\n'
+        'end\n'
+        'trace()\n'
+        'trace()\n'
+        'print()\n'
+    )
+    pattern = r'int CALLXS .* \[' + RX_ADDR + r'\]\(.*\) ctype: \d+'
+
+
+class TestLJJSlotsBase(TestCaseBase):
+    location = 'trace_stop'
+    extension_cmds = (
+        'n\n'  # Load J.
+        'lj-jslots J->L\n'
+    )
+    lua_script = (
+        'jit.opt.start("hotloop=1")\n'
+        'for _ = 1, 4 do end\n'
+    )
+    pattern = (
+        r'(?s)(.*' +
+        RX_ADDR + ' ' + RX_IRN + r' (B|\s) \[(F|\s)(C|\s)\] \w\w\w ' +
+        RX_IRREF +
+        r'.*)+'
+    )
+
+
 for test_cls in TestCaseBase.__subclasses__():
     test_cls.test = lambda self: self.check()
 
-- 
2.54.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
@ 2026-06-25 20:29 ` Sergey Kaplun via Tarantool-patches
  2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-30 14:53   ` Sergey Bronnikov via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
  2026-07-07 10:44 ` [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
  3 siblings, 2 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-25 20:29 UTC (permalink / raw)
  To: Sergey Bronnikov, Evgeniy Temirgaleev; +Cc: tarantool-patches

This patch extends the information dumped for a given cdata object. Now
the dump resolves the object's `CType` and prints it in a format similar
to the one used by the `__tostring` metamethod. The `lj-ctype` command
is introduced to dump this information when there is only a `CType`
pointer and no cdata associated with it.

The `__or__` and `__ror__` metamethods are monkey-patched for the LLDB
value object. In the `__sub__` metamethod for LLDB pointers,
`GetPointeeType()` is now used to get the pointee type instead of the
incorrectly used `GetDereferencedType()`, which always returns the same
type with size 8. Casting negative values to unsigned values is now
supported; it is needed to check `CTF_UCHAR`.

Part of tarantool/tarantool#4808
---
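
Note for reviewers: the `CType` resolution described above starts from
the 32-bit `info` word. Below is a minimal standalone sketch of how that
word is split, runnable without a debugger. The constants are copied
from this patch; `CT_NAMES` and `decode_info()` are hypothetical names
used only for illustration and are not part of `luajit_dbg.py`. The low
16 bits are a child type ID only for types that have a child.

```python
# Constants copied from the patch below.
CTSHIFT_NUM = 28
CTMASK_CID = 0x0000ffff
CTF_CONST = 0x02000000
CTF_VOLATILE = 0x01000000

# Hypothetical lookup table: index is the CT_* type number.
CT_NAMES = [
    'NUM', 'STRUCT', 'PTR', 'ARRAY', 'VOID',
    'ENUM', 'FUNC', 'TYPEDEF', 'ATTRIB',
]


def decode_info(info):
    """Split a CType info word into type name, low 16 bits, qualifiers."""
    quals = []
    if info & CTF_CONST:
        quals.append('const')
    if info & CTF_VOLATILE:
        quals.append('volatile')
    return {
        'type': CT_NAMES[info >> CTSHIFT_NUM],  # Top 4 bits.
        'cid': info & CTMASK_CID,               # Low 16 bits.
        'quals': quals,
    }


# A const pointer whose child type ID is 9 (an arbitrary example value).
CT_PTR = 2
example = decode_info((CT_PTR << CTSHIFT_NUM) | CTF_CONST | 9)
```

`ctype_repr()` in the patch does the same split on every step while it
walks the chain of child types.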
 src/luajit_dbg.py                             | 333 +++++++++++++++++-
 .../debug-extension-tests.py                  | 208 ++++++++++-
 2 files changed, 535 insertions(+), 6 deletions(-)
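
Note for reviewers: the unsigned cast mentioned in the commit message
comes down to masking the value before packing it. A minimal sketch,
assuming only the standard `struct` module; `pack_unsigned64()` is a
hypothetical helper name, not a function from the patch.

```python
import struct


def pack_unsigned64(raw_value):
    # struct.pack('<Q', ...) rejects negative numbers, so wrap the value
    # modulo 2**64 first. This is the same masking the patch adds.
    raw_value &= 0xFFFFFFFFFFFFFFFF
    return struct.pack('<Q', raw_value)


# Without the mask, packing -1 as unsigned fails.
try:
    struct.pack('<Q', -1)
    unmasked_failed = False
except struct.error:
    unmasked_failed = True

packed = pack_unsigned64(-1)
```

With the mask, -1 becomes eight 0xff bytes, which is what a cast of -1
to an unsigned type is expected to produce.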

diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index fd6ca8a5..62cd65d5 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -386,6 +386,8 @@ class _LLDBDebugger(Debugger):
             pack_flag = '<q'
         else:
             pack_flag = '<Q'
+            # Cast to unsigned.
+            raw_value &= 0xFFFFFFFFFFFFFFFF
         raw_data = struct.pack(pack_flag, raw_value)
         sbdata = lldb.SBData()
         sbdata.SetData(
@@ -482,6 +484,9 @@ class _LLDBDebugger(Debugger):
         def lldb__lt__(lldbval, other):
             return int(lldbval) < int(other)
 
+        def lldb__or__(lldbval, other):
+            return int(lldbval) | int(other)
+
         def lldb__str__(lldbval):
             # Instead of default GetSummary.
             if not lldbval.sbvalue.TypeIsPointerType():
@@ -512,8 +517,8 @@ class _LLDBDebugger(Debugger):
                 lldbval_tp = sbval.GetType()
                 other_tp = osbval.GetType()
                 # Subtract pointers of the same size only.
-                elsz = lldbval_tp.GetDereferencedType().size
-                if other_tp.GetDereferencedType().size != elsz:
+                elsz = lldbval_tp.GetPointeeType().size
+                if other_tp.GetPointeeType().size != elsz:
                     raise Exception(
                         'Attempt to substruct {otp} from {stp}'.format(
                             stp=lldbval_tp.name,
@@ -536,6 +541,8 @@ class _LLDBDebugger(Debugger):
         lldb.value.__index__ = lldb__index__
         lldb.value.__le__ = lldb__le__
         lldb.value.__lt__ = lldb__lt__
+        lldb.value.__or__ = lldb__or__
+        lldb.value.__ror__ = lldb__or__  # Same semantics.
         lldb.value.__str__ = lldb__str__
         lldb.value.__sub__ = lldb__sub__
 
@@ -1352,6 +1359,119 @@ def lightudV(tv):
         return gcval(tv['gcr'])
 
 
+# FFI.
+
+
+def ctype_ctsG(g):
+    return mref('CTState *', g['ctype_state'])
+
+
+def ctype_get(cts, id):
+    return dbg.address(cts['tab'][id])
+
+
+# Externally visible types.
+CT_NUM = 0  # Integer or floating-point numbers.
+CT_STRUCT = 1  # Struct or union.
+CT_PTR = 2  # Pointer or reference.
+CT_ARRAY = 3  # Array or complex type.
+CT_MAYCONVERT = CT_ARRAY
+CT_VOID = 4  # Void type.
+CT_ENUM = 5  # Enumeration.
+CT_HASSIZE = CT_ENUM  # Last type where ct->size holds the actual size.
+CT_FUNC = 6  # Function.
+CT_TYPEDEF = 7  # Typedef.
+CT_ATTRIB = 8  # Miscellaneous attributes.
+
+# Common types.
+CTID_CTYPEID = 21
+
+# C type info flags.
+CTF_BOOL = 0x08000000  # Boolean: NUM, BITFIELD.
+CTF_FP = 0x04000000  # Floating-point: NUM.
+CTF_CONST = 0x02000000  # Const qualifier.
+CTF_VOLATILE = 0x01000000  # Volatile qualifier.
+CTF_UNSIGNED = 0x00800000  # Unsigned: NUM, BITFIELD.
+CTF_LONG = 0x00400000  # Long: NUM.
+CTF_VLA = 0x00100000  # Variable-length: ARRAY, STRUCT.
+CTF_REF = 0x00800000  # Reference: PTR.
+CTF_VECTOR = 0x08000000  # Vector: ARRAY.
+CTF_COMPLEX = 0x04000000  # Complex: ARRAY.
+CTF_UNION = 0x00800000  # Union: STRUCT.
+CTF_VARARG = 0x00800000  # Vararg: FUNC.
+CTF_SSEREGPARM = 0x00400000  # SSE register parameters: FUNC.
+
+CTF_UCHAR = CTF_UNSIGNED if int(dbg.cast('char', -1)) > 0 else 0
+
+CTMASK_ATTRIB = 255  # Max. 256 attributes.
+CTSHIFT_ATTRIB = 16
+
+# Attribute numbers.
+CTA_QUAL = 1  # Unmerged qualifiers.
+
+CTSHIFT_NUM = 28
+CTMASK_CID = 0x0000ffff
+CTMASK_NUM = 0xf0000000  # Max. 16 type numbers.
+
+# Special sizes.
+CTSIZE_INVALID = 0xffffffff
+DWORDSZ = 4
+QWORDSZ = 8
+
+
+def ctype_type(info):
+    return info >> CTSHIFT_NUM
+
+
+def ctype_attrib(info):
+    return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
+
+
+def ctinfo(ct, flags):
+    return (tou32(ct) << CTSHIFT_NUM) + flags
+
+
+def ctype_isptr(info):
+    return ctype_type(info) == CT_PTR
+
+
+def ctype_iscomplex(info):
+    return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY, CTF_COMPLEX)
+
+
+def ctype_isinteger(info):
+    return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)
+
+
+def ctype_isrefarray(info):
+    return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
+           ctinfo(CT_ARRAY, 0)
+
+
+def ctype_cid(info):
+    return info & CTMASK_CID
+
+
+def ctype_child(cts, ctype):
+    return ctype_get(cts, ctype_cid(ctype['info']))
+
+
+def cdataptr(cd):
+    return dbg.cast('void *', (cd + 1))
+
+
+def cdata_getptr(p, size):
+    if LJ_64 and size == 4:
+        return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
+    else:
+        return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
+
+
+# Get C type ID for a C type.
+def ctype_typeid(cts, ct):
+    return ct - cts['tab']
+
+
 # JIT engine.
 
 
@@ -1951,7 +2071,26 @@ def dump_lj_gco_trace(gcobj):
 
 
 def dump_lj_gco_cdata(gcobj):
-    return 'cdata @ {}'.format(strx64(gcobj))
+    cdata = dbg.cast('struct GCcdata *', gcobj)
+    cts = ctype_ctsG(G(L()))
+    cid = cdata['ctypeid']
+    ctype = ctype_get(cts, cid)
+    info = ctype['info']
+    size = ctype['size']
+    value = ''
+    if ctype_iscomplex(info):
+        value = cdata_val_complex(cdata, ctype)
+    elif size == 8 and ctype_isinteger(info):
+        value = cdata_val_int64(cdata, ctype)
+    else:
+        value = cdataptr(cdata)
+        if ctype_isptr(info):
+            value = cdata_getptr(value, size)
+    return 'cdata @ {addr} {ctype} {value}'.format(
+        addr=strx64(gcobj),
+        ctype=dump_ctype(ctype),
+        value=value,
+    )
 
 
 def dump_lj_gco_tab(gcobj):
@@ -2281,6 +2420,176 @@ def dump_func(func):
         return 'fast function #{}\n'.format(int(ffid))
 
 
+# FFI dumpers.
+
+
+def cdata_val_int64(cdata, ctype):
+    info = ctype['info']
+    isunsigned = info & CTF_UNSIGNED
+    cdataval = cdataptr(cdata)
+    valueptr = None
+    usuffix = ''
+    if isunsigned:
+        usuffix = 'U'
+        valueptr = dbg.cast('uint64_t *', cdataval)
+    else:
+        valueptr = dbg.cast('int64_t *', cdataval)
+    return str(valueptr[0]) + usuffix + 'LL'
+
+
+def cdata_val_complex(cdata, ctype):
+    size = ctype['size']
+    cdataval = cdataptr(cdata)
+    casttype = None
+    if size == QWORDSZ * 2:
+        casttype = 'double *'
+    else:
+        assert size == DWORDSZ * 2, 'bad (complex float) size'
+        casttype = 'float *'
+    re = dbg.cast(casttype, cdataval)[0]
+    im = dbg.cast(casttype, cdataval)[1]
+    sign = '+' if im > 0 else ''
+    return '{re}{sign}{im}i'.format(re=re, im=im, sign=sign)
+
+
+def ctype_preplit(ctypestr, lit):
+    # Prevent extra space in the end of the string.
+    space = ' ' if ctypestr != '' else ''
+    return lit + space + ctypestr
+
+
+def ctype_prepqual(ctypestr, info):
+    if (info & CTF_VOLATILE):
+        ctypestr = ctype_preplit(ctypestr, 'volatile')
+    if (info & CTF_CONST):
+        ctypestr = ctype_preplit(ctypestr, 'const')
+    return ctypestr
+
+
+def ctype_preptype(cts, ctypestr, ctype, qual, tp):
+    nameref = gcref(ctype['name'])
+    if nameref:
+        ctypestr = ctype_preplit(ctypestr, re.sub('"', '', strdata(nameref)))
+    else:
+        ctypestr = ctype_preplit(ctypestr, str(ctype_typeid(cts, ctype)))
+    ctypestr = ctype_preplit(ctypestr, tp)
+    ctypestr = ctype_prepqual(ctypestr, qual)
+    return ctypestr
+
+
+def ctype_prepnum(ctypestr, info, size):
+    if info & CTF_BOOL:
+        ctypestr = ctype_preplit(ctypestr, 'bool')
+    elif info & CTF_FP:
+        if size == QWORDSZ:
+            ctypestr = ctype_preplit(ctypestr, 'double')
+        elif size == DWORDSZ:
+            ctypestr = ctype_preplit(ctypestr, 'float')
+        else:
+            assert size == QWORDSZ * 2, 'bad (long double) size'
+            ctypestr = ctype_preplit(ctypestr, 'long double')
+    elif size == 1:
+        if not ((info ^ CTF_UCHAR) & CTF_UNSIGNED):
+            ctypestr = ctype_preplit(ctypestr, 'char')
+        elif CTF_UCHAR:
+            ctypestr = ctype_preplit(ctypestr, 'signed char')
+        else:
+            ctypestr = ctype_preplit(ctypestr, 'unsigned char')
+    elif size < 8:
+        if size == 4:
+            ctypestr = ctype_preplit(ctypestr, 'int')
+        else:
+            assert size == DWORDSZ // 2, 'bad (short) size'
+            ctypestr = ctype_preplit(ctypestr, 'short')
+        if info & CTF_UNSIGNED:
+            ctypestr = ctype_preplit(ctypestr, 'unsigned')
+    else:
+        size_t = '{u}int{sz}_t'.format(
+            u='u' if info & CTF_UNSIGNED else '',
+            sz=size * 8,
+        )
+        ctypestr = ctype_preplit(ctypestr, size_t)
+    return ctypestr
+
+
+def ctype_repr(cts, id):
+    ctype = ctype_get(cts, id)
+    ctypestr = ''
+    qual = 0
+    ptrto = 0
+    while True:
+        info = ctype['info']
+        size = ctype['size']
+        ctp = ctype_type(info)
+        if ctp == CT_NUM:
+            ctypestr = ctype_prepnum(ctypestr, info, size)
+            return ctype_prepqual(ctypestr, qual | info)
+        elif ctp == CT_VOID:
+            ctypestr = ctype_preplit(ctypestr, 'void')
+            return ctype_prepqual(ctypestr, qual | info)
+        elif ctp == CT_STRUCT:
+            tp = 'union' if (info & CTF_UNION) else 'struct'
+            return ctype_preptype(cts, ctypestr, ctype, qual, tp)
+        elif ctp == CT_ENUM:
+            if id == CTID_CTYPEID:
+                return ctype_preplit(ctypestr, 'ctype')
+            return ctype_preptype(cts, ctypestr, ctype, qual, 'enum')
+        elif ctp == CT_ATTRIB:
+            if ctype_attrib(info) == CTA_QUAL:
+                qual |= size
+        elif ctp == CT_PTR:
+            if info & CTF_REF:
+                ctypestr = ctype_preplit(ctypestr, '&')
+            else:
+                ctypestr = ctype_prepqual(ctypestr, qual | info)
+                if LJ_64 and size == 4:
+                    ctypestr = ctype_preplit(ctypestr, '__ptr32')
+                ctypestr = ctype_preplit(ctypestr, '*')
+            qual = 0
+            ptrto = 1
+        elif ctp == CT_ARRAY:
+            if ctype_isrefarray(info):
+                if ptrto:
+                    ptrto = 0
+                    ctypestr = '(' + ctypestr + ')'
+                arrsize = ''
+                if size != CTSIZE_INVALID:
+                    child_size = ctype_child(cts, ctype)['size']
+                    arrsize = str(int(size / child_size) if child_size > 0
+                                  else 0)
+                elif info & CTF_VLA:
+                    arrsize = '?'
+                ctypestr = ctypestr + '[{}]'.format(arrsize)
+            elif ctype_iscomplex(info):
+                if size == DWORDSZ * 2:
+                    ctypestr = ctype_preplit(ctypestr, 'float')
+                else:
+                    assert size == QWORDSZ * 2, 'bad (complex double) size'
+                return ctype_preplit(ctypestr, 'complex')
+            else:
+                ctypestr = ctype_preplit(
+                    ctypestr,
+                    '__attribute__((vector_size({})))'.format(size)
+                )
+        elif ctp == CT_FUNC:
+            if ptrto:
+                ptrto = 0
+                ctypestr = '(' + ctypestr + ')'
+            ctypestr += '()'
+        ctype = ctype_child(cts, ctype)
+    return 'NYI'
+
+
+def dump_ctype(ct):
+    cts = ctype_ctsG(G(L()))
+    cid = ctype_typeid(cts, ct)
+    name = ctype_repr(cts, cid)
+    return '[{id}] <{name}>'.format(
+        id=cid,
+        name=name,
+    )
+
+
 # JIT dumpers.
 
 
@@ -2294,7 +2603,8 @@ def dump_call_func(trace, callop):
             assert IRS[cdt_idx_irk['o']] == 'KINT', \
                    'unexpected IR for ctype storage'
             ctype_idx = cdt_idx_irk['i']
-            ctype = 'ctype: {}'.format(ctype_idx)
+            cts = ctype_ctsG(G(L()))
+            ctype = 'ctype: {}'.format(dump_ctype(ctype_get(cts, ctype_idx)))
 
     func_str = ''
     if callop < 0:
@@ -2652,6 +2962,20 @@ https://github.com/tarantool/tarantool/wiki/LuaJIT-Bytecodes.
         ))
 
 
+class LJDumpCType(dbg.LJBase):
+    '''
+lj-ctype <CType *>
+
+The command receives a pointer <ctype> of the corresponding CType
+and dumps the ID and the name for this C data type.
+    '''
+
+    def execute(self, arg):
+        dbg.write('{}\n'.format(
+            dump_ctype(dbg.cast('CType *', dbg.eval(arg)))
+        ))
+
+
 class LJDumpFunc(dbg.LJBase):
     '''
 lj-func <GCfunc *>
@@ -2979,6 +3303,7 @@ def load(event=None):
     dbg.initialize_extension({
         'lj-arch':   LJDumpArch,
         'lj-bc':     LJDumpBC,
+        'lj-ctype':  LJDumpCType,
         'lj-func':   LJDumpFunc,
         'lj-gc':     LJGC,
         'lj-gco':    LJDumpGCobj,
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 76543daa..fc5d2c7b 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -227,6 +227,7 @@ class TestLoad(TestCaseBase):
     pattern = (
         r'lj-arch command initialized\n'
         r'lj-bc command initialized\n'
+        r'lj-ctype command initialized\n'
         r'lj-func command initialized\n'
         r'lj-gc command initialized\n'
         r'lj-gco command initialized\n'
@@ -331,7 +332,7 @@ GCO_RX = (
     r'Lua function @ ' + RX_ADDR + r', [0-9]+ upvalues, .+:[0-9]+\n'
     r'C function @ ' + RX_ADDR + r'\n'
     r'fast function #[0-9]+\n'
-    r'cdata @ ' + RX_ADDR + r'\n'
+    r'cdata @ ' + RX_ADDR + r' \[\d+\] <int \*> 0x0\n'
     r'table @ ' + RX_ADDR + r' \(asize: \d+, hmask: ' + RX_HASH + r'\)\n'
     r'userdata @ ' + RX_ADDR + r'\n'
 )
@@ -817,7 +818,9 @@ class TestLJIRCallXSCType(TestCaseBase):
         'trace()\n'
         'print()\n'
     )
-    pattern = r'int CALLXS .* [' + RX_ADDR + r'\]\(.*\) ctype: \d+'
+    pattern = (
+        r'int CALLXS .* [' + RX_ADDR + r'\]\(.*\) ctype: \[\d+\] <int \(\)>'
+    )
 
 
 class TestLJJSlotsBase(TestCaseBase):
@@ -838,6 +841,207 @@ class TestLJJSlotsBase(TestCaseBase):
     )
 
 
+def cdata_rx(tpstr, suffix=None):
+    return r'cdata @ ' + RX_ADDR + r' \[\d+\] <' + tpstr + '> ' + (
+        RX_ADDR if not suffix else suffix
+    )
+
+
+CHAR_SIGNED = machine in ['arm64', 'aarch64'] and sys.platform != 'darwin'
+HAS_LONG_DOUBLE = not (machine in ['arm64', 'aarch64'] and
+                       sys.platform == 'darwin')
+
+
+class TestLJCTypePrim(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-tv L->base\n'
+        'lj-tv L->base + 1\n'
+        'lj-tv L->base + 2\n'
+        'lj-tv L->base + 3\n'
+        'lj-tv L->base + 4\n'
+        'lj-tv L->base + 5\n'
+        'lj-tv L->base + 6\n'
+        'lj-tv L->base + 7\n'
+        'lj-tv L->base + 8\n'
+        'lj-tv L->base + 9\n'
+        'lj-tv L->base + 10\n'
+        'lj-tv L->base + 11\n'
+        'lj-tv L->base + 12\n'
+        'lj-tv L->base + 13\n'
+        'lj-tv L->base + 14\n'
+        'lj-tv L->base + 15\n'
+        'lj-tv L->base + 16\n'
+        'lj-tv L->base + 17\n'
+        'lj-tv L->base + 18\n'
+        'lj-tv L->base + 19\n'
+        'lj-tv L->base + 20\n'
+        'lj-tv L->base + 21\n'
+        'lj-tv L->base + 22\n'
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'print(\n'
+        '  ffi.new("bool"),\n'
+        '  ffi.new("char"),\n'
+        '  ffi.new("signed char"),\n'
+        '  ffi.new("unsigned char"),\n'
+        '  ffi.new("int"),\n'
+        '  ffi.new("short"),\n'
+        '  ffi.new("unsigned"),\n'
+        '  ffi.new("int8_t"),\n'
+        '  ffi.new("int16_t"),\n'
+        '  ffi.new("int32_t"),\n'
+        '  ffi.new("int64_t"),\n'
+        '  ffi.new("uint8_t"),\n'
+        '  ffi.new("uint64_t"),\n'
+        '  ffi.new("float"),\n'
+        '  ffi.new("double"),\n'
+        '  ffi.new("long double"),\n'
+        '  1i,\n'
+        '  ffi.new("complex float", 1, -2),\n'
+        '  ffi.new("const volatile int"),\n'
+        '  ffi.new("void *"),\n'
+        '  ffi.new("void * __ptr32"),\n'
+        '  ffi.new("int &"),\n'
+        '  ffi.typeof(1LL)\n'
+        ')\n'
+    )
+    pattern = (
+        cdata_rx('bool') + r'\n' +
+        cdata_rx('char') + r'\n' +
+        cdata_rx(('signed ' if CHAR_SIGNED else '') + 'char') + r'\n' +
+        cdata_rx(('unsigned ' if not CHAR_SIGNED else '') + 'char') + r'\n' +
+        cdata_rx('int') + r'\n' +
+        cdata_rx('short') + r'\n' +
+        cdata_rx('unsigned int') + r'\n' +
+        cdata_rx(('signed ' if CHAR_SIGNED else '') + 'char') + r'\n' +
+        cdata_rx('short') + r'\n' +
+        cdata_rx('int') + r'\n' +
+        cdata_rx('int64_t', '0LL') + r'\n' +
+        cdata_rx(('unsigned ' if not CHAR_SIGNED else '') + 'char') + r'\n' +
+        cdata_rx('uint64_t', '0ULL') + r'\n' +
+        cdata_rx('float') + r'\n' +
+        cdata_rx('double') + r'\n' +
+        cdata_rx(('long ' if HAS_LONG_DOUBLE else '') + 'double') + r'\n' +
+        cdata_rx('complex', r'0\+1i') + r'\n' +
+        cdata_rx('complex float', '1-2i') + r'\n' +
+        cdata_rx('const volatile int') + r'\n' +
+        cdata_rx(r'void \*') + r'\n' +
+        cdata_rx(r'void \* __ptr32') + r'\n' +
+        cdata_rx('int &') + r'\n' +
+        cdata_rx('ctype') + r'\n'
+    )
+
+
+class TestLJCTypeStructUnionEnum(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-tv L->base\n'
+        'lj-tv L->base + 1\n'
+        'lj-tv L->base + 2\n'
+        'lj-tv L->base + 3\n'
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'ffi.cdef[[\n'
+        '  struct test {int a;};\n'
+        ']]\n'
+        'print(\n'
+        '  ffi.new("struct test"),\n'
+        '  ffi.new("struct {int a;}"),\n'
+        '  ffi.new("union {int a;}"),\n'
+        '  ffi.new("enum {ENUM1}")\n'
+        ')\n'
+    )
+    pattern = (
+        cdata_rx('struct test') + r'\n' +
+        cdata_rx(r'struct \d+') + r'\n' +
+        cdata_rx(r'union \d+') + r'\n' +
+        cdata_rx(r'enum \d+') + r'\n'
+    )
+
+
+class TestLJCTypeArray(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-tv L->base\n'
+        'lj-tv L->base + 1\n'
+        'lj-tv L->base + 2\n'
+        'lj-tv L->base + 3\n'
+        'lj-tv L->base + 4\n'
+        'lj-tv L->base + 5\n'
+        'lj-tv L->base + 6\n'
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'print(\n'
+        '  ffi.new("char [0]"),\n'
+        '  ffi.new("int [1]"),\n'
+        '  ffi.new("complex [2]"),\n'
+        '  ffi.new("complex float [3]"),\n'
+        '  ffi.new("float __attribute__((vector_size(4)))"),\n'
+        '  ffi.new("int (&)[5]"),\n'
+        '  ffi.new("int[?]", 6)\n'
+        ')\n'
+    )
+    pattern = (
+        cdata_rx(r'char \[0\]') + r'\n' +
+        cdata_rx(r'int \[1\]') + r'\n' +
+        cdata_rx(r'complex \[2\]') + r'\n' +
+        cdata_rx(r'complex float \[3\]') + r'\n' +
+        cdata_rx(r'float __attribute__\(\(vector_size\(4\)\)\)') + r'\n' +
+        cdata_rx(r'int \(&\)\[5\]') + r'\n' +
+        cdata_rx(r'int \[\?\]') + r'\n'
+    )
+
+
+class TestLJCTypeFunc(TestCaseBase):
+    location = 'lj_cf_print'
+    extension_cmds = (
+        'n\n'  # Load L.
+        'lj-tv L->base\n'
+        'lj-tv L->base + 1\n'
+        'lj-tv L->base + 2\n'
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'ffi.cdef[[void getpid(void);]]\n'
+        'print(\n'
+        '  ffi.C.getpid,\n'
+        '  ffi.new("int (*)()"),\n'
+        '  ffi.new("int (*(*)(void))[2]")\n'
+        ')\n'
+    )
+    pattern = (
+        cdata_rx(r'void \(\)') + r'\n' +
+        cdata_rx(r'int \(\*\)\(\)') + r'\n' +
+        cdata_rx(r'int \(\* \(\*\)\(\)\)\[2\]') + r'\n'
+    )
+
+
+class TestLJCTypeBase(TestCaseBase):
+    location = 'lj_cf_ffi_new'
+    extension_cmds = (
+        # Load `ct`. Skip inlined functions for LLDB.
+        'n\n'
+        'n\n'
+        'n\n'
+        'n\n'
+        'n\n'
+        'n\n'
+        'lj-ctype ct\n'
+    )
+    lua_script = (
+        'local ffi = require("ffi")\n'
+        'ffi.new("int")\n'
+    )
+    pattern = r'\[\d+\] <int>'
+
+
 for test_cls in TestCaseBase.__subclasses__():
     test_cls.test = lambda self: self.check()
 
-- 
2.54.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests
  2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
@ 2026-06-25 20:29 ` Sergey Kaplun via Tarantool-patches
  2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-30 14:54   ` Sergey Bronnikov via Tarantool-patches
  2026-07-07 10:44 ` [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
  3 siblings, 2 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-25 20:29 UTC (permalink / raw)
  To: Sergey Bronnikov, Evgeniy Temirgaleev; +Cc: tarantool-patches

If the environment variable `DUBUGGER_TEST_VERBOSE` is set, each test
prints the generated command and its output and doesn't delete the files
generated for it.
---
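
For reference, a minimal standalone sketch of the behaviour described
above. It is not part of the patch: `env_flag` and the `keep` parameter
are hypothetical names used only for illustration.

```python
import os
import tempfile


def env_flag(name):
    # Hypothetical helper: any non-empty value enables the flag.
    return bool(os.getenv(name))


def persist(data, keep=False):
    # With delete=False the file survives close(), so it can be
    # inspected after the test run.
    tmp = tempfile.NamedTemporaryFile(mode='w', delete=not keep)
    tmp.write(data)
    tmp.flush()
    return tmp
```

Since `os.getenv()` returns a string, any non-empty value (including
`0`) enables the verbose mode in this sketch.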
 test/tarantool-debugger-tests/debug-extension-tests.py | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index fc5d2c7b..adb83e1e 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -43,6 +43,8 @@ else:
     # Don't run any initialization scripts.
     RUN_CMD_FILE = ['--batch', '--nx', '--quiet', '--command']
 
+TEST_VERBOSE = os.getenv('DUBUGGER_TEST_VERBOSE', default=False)
+
 RX_ADDR = r'0x[a-f0-9]+'
 RX_HASH = RX_ADDR  # The same pattern for hexademic values.
 RX_BCN = r'00\d\d'
@@ -52,7 +54,7 @@ RX_IRREF = r'0x\d\d\d\d'
 
 
 def persist(data):
-    tmp = tempfile.NamedTemporaryFile(mode='w')
+    tmp = tempfile.NamedTemporaryFile(mode='w', delete=not TEST_VERBOSE)
     tmp.write(data)
     tmp.flush()
     return tmp
@@ -149,7 +151,12 @@ class TestCaseBase(unittest.TestCase):
             LUAJIT_BINARY,
             script_file.name,
         ]
+        if TEST_VERBOSE:
+            print('# Test name: {}'.format(cls.__name__))
+            print('# Test command: {}'.format(' '.join(process_cmd)))
         cls.output = execute_process(process_cmd)
+        if TEST_VERBOSE:
+            print('# Command output: {}'.format(cls.output))
         cmd_file.close()
         script_file.close()
 
-- 
2.54.0


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
@ 2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-28 16:32     ` Sergey Kaplun via Tarantool-patches
  2026-06-30 14:45   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-28  1:03 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 46392 bytes --]

Hi, Sergey!

Thanks for the patch. Please see my comments below.

--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> Date: Thursday, June 25, 2026, 23:29 +03:00
> This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> similar to the -jdump flag but also reports types of register operands
> (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> The `lj-trace` command accepts optional /rs flags to dump registers
> associated with IR and snapshots for the trace correspondingly.
> The `lj-ir` command can be used for dumping IR constants as well.
> The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> simplify debugging of `rec_check_slots()` assertion failures.
> 
> For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> Also, `__index__` is set to allow lldb.value to be used as an index
> without explicit conversion to int. Old GDB versions (below 7.12) are
> not supported because gdb.Value lacks the `__index__` metamethod
> and can't be monkey-patched. Support for these versions may be added
> on demand.
> 
> Part of tarantool/tarantool#4808
> ---
> src/luajit_dbg.py | 1216 ++++++++++++++++-
> .../debug-extension-tests.py | 365 +++++
> 2 files changed, 1570 insertions(+), 11 deletions(-)
> 
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 2edb199a..fd6ca8a5 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -58,6 +58,26 @@ class Debugger(object):
> self.LLDB = True
> return super(Debugger, self).__new__(_LLDBDebugger)
> 
> + def parse_flags(self, raw_flags, permitted_flags):
> + flags = {}
> + for flag in raw_flags:
> + if flag not in permitted_flags:
> + raise self.error('Unrecongnized option: "{}"'.format(flag))
> + flags[flag] = True
> + return flags
> +
> + def extract_flags(self, arg, permitted_flags):
> + if not arg:
> + return None, None
> + flags = {}
> + if arg.startswith('/'):
> + match = re.match(r'/(\S*)\s+(.*)$', arg)
> + if not match:
> + return arg, flags
> + raw_flags, arg = match.group(1, 2)
> + flags = self.parse_flags(raw_flags, permitted_flags)
> + return arg, flags
> +
> def configure(self):
> global PADDING, LJ_TISNUM
> if not self.check_libluajit():
> @@ -70,6 +90,17 @@ class Debugger(object):
> self.write('luajit_dbg.py failed to load: '
> 'no debugging symbols found for libluajit\n')
> return False
> +
> + # Setup arch.
> + try:
> + self.arch = str(self.eval('LJ_ARCH_NAME')).split('"')[1]
> + except Exception:
> + try:
> + self.arch = self.detect_arch()
> + except Exception:
> + # Setup on demand if necessary.
> + pass
> +
> return True
> 
> def initialize_extension(self, commands):
> @@ -99,21 +130,42 @@ class Debugger(object):
> '''Return the content of the string by the given pointer.'''
> pass
> 
> + @abc.abstractmethod
> + def address(self, obj):
> + '''Return the address in memory of the given object.'''
> + pass
> +
> @abc.abstractmethod
> def lookup_global(self, symbol):
> '''Look up the global C symbol by the given name.'''
> pass
> 
> + @abc.abstractmethod
> + def member_by_offset(self, typename, offset, prev_name=None):
> + '''Look up the global C symbol by the given name.'''
> + pass
> +
> @abc.abstractmethod
> def eval(self, command):
> '''Parse and evaluate the given debugger command.'''
> pass
> 
> + @abc.abstractmethod
> + def detect_arch(self):
> + '''Detect the CPU architecture and canonicalize it to the LuaJIT
> + notation.'''
> + pass
> +
> @abc.abstractmethod
> def write(self, msg):
> '''Print the message.'''
> pass
> 
> + @abc.abstractmethod
> + def error(self, msg):
> + '''Create the error object with message.'''
> + pass
> +
> @abc.abstractmethod
> def check_libluajit(self):
> '''Check that libluajit is loaded.
> @@ -172,10 +224,50 @@ class _GDBDebugger(Debugger):
> # A string is printed with a pointer to it. Just strip it.
> return re.sub(r'^0x[a-f0-9]+\s+(?=")', '', str(strptr))
> 
> + def address(self, obj):
> + return obj.address
> +
> def lookup_global(self, symbol):
> variable, _ = gdb.lookup_symbol(symbol)
> return variable.value() if variable else None
> 
> + def member_by_offset(self, tp, offset, prev_name=None):
> + if isinstance(tp, str):
> + tp = self._dbgtype(tp)
> + assert offset < tp.sizeof, 'offset is bigger than object size'
> + if tp.code == gdb.TYPE_CODE_TYPEDEF:
> + tp = tp.strip_typedefs()
> + if tp.code == gdb.TYPE_CODE_STRUCT:
> + fields = tp.fields()
> + for n_field in range(len(fields)):
> + islast = n_field == (len(fields) - 1)
> + field = fields[n_field]
> + start_field = field.bitpos / 8
> + end_field = fields[n_field + 1].bitpos / 8 if not islast \
> + else tp.sizeof
> + if start_field <= offset and offset < end_field:
> + next_name = self.member_by_offset(
> + field.type,
> + offset - start_field,
> + prev_name=field.name
> + )
> + return '.{field}{suffix}'.format(
> + field=field.name,
> + suffix=next_name if next_name else ''
> + )
> + elif tp.code == gdb.TYPE_CODE_ARRAY:
> + # Get array field type.
> + target = tp.target()
> + tsize = target.sizeof
> + idx = int(offset // tsize)
> + next_name = self.member_by_offset(target, offset - idx * tsize)
> + idxname = idx_name(prev_name)
> + if idxname and idx in idxname:
> + idx = idxname[idx]
> + return '[{}]{}'.format(idx, next_name if next_name else '')
> + else:
> + return None
> +
> def eval(self, command):
> if not command:
> return None
> @@ -185,9 +277,23 @@ class _GDBDebugger(Debugger):
> raise gdb.GdbError('table argument empty')
> return ret
> 
> + def detect_arch(self):
> + if hasattr(self, 'arch'):
> + return self.arch
> + target = str(gdb.execute('info target', False, True))
> + if re.match('.*x86-64.*', target, flags=re.DOTALL):
> + return 'x64'
> + elif re.match('.*aarch64.*', target, flags=re.DOTALL):
> + return 'arm64'
> + else:
> + return ''
> +
> def write(self, msg):
> gdb.write(msg)
> 
> + def error(self, errmsg):
> + return gdb.GdbError(errmsg)
> +
> def check_libluajit(self):
> # XXX Fragile: Though connecting the callback looks bad,
> # it respects both Python 2 and Python 3 (see #4828).
> @@ -322,8 +428,26 @@ class _LLDBDebugger(Debugger):
> def lldb__getitem__(lldbval, key):
> if type(key) is lldb.value:
> key = int(key)
> + if type(key) is bool:
> + key = int(key)
> if type(key) is int:
> # Allow array access.
> + ltp = lldbval.sbvalue.GetType()
> + # XXX: LLDB in versions 17 - 19 can't use an array
> + # object as the initializer for `lldb.value` since
> + # `GetValue()` for it returns `None` leading to
> + # the invalid result. See
> + # https://github.com/llvm/llvm-project/pull/90144.
> + if (self.version < 17 or self.version > 19) or \
> + ltp.GetTypeClass() != lldb.eTypeClassArray:
> + pass
> + else:
> + ptr_tp = ltp.GetArrayElementType().GetPointerType()
> + lldbval = self._lldb_value_from_raw(
> + lldbval.sbvalue.GetLoadAddress(),
> + ptr_tp.GetByteSize(),
> + ptr_tp
> + )
> if key >= 0 and not lldbval.sbvalue.TypeIsPointerType():
> return lldb.value(
> lldbval.sbvalue.GetValueForExpressionPath('[%i]' % key)
> @@ -349,6 +473,9 @@ class _LLDBDebugger(Debugger):
> def lldb__gt__(lldbval, other):
> return int(lldbval) > int(other)
> 
> + def lldb__index__(lldbval):
> + return int(lldbval)
> +
> def lldb__le__(lldbval, other):
> return int(lldbval) <= int(other)
> 
> @@ -406,6 +533,7 @@ class _LLDBDebugger(Debugger):
> lldb.value.__ge__ = lldb__ge__
> lldb.value.__getitem__ = lldb__getitem__
> lldb.value.__gt__ = lldb__gt__
> + lldb.value.__index__ = lldb__index__
> lldb.value.__le__ = lldb__le__
> lldb.value.__lt__ = lldb__lt__
> lldb.value.__str__ = lldb__str__
> @@ -474,6 +602,9 @@ class _LLDBDebugger(Debugger):
> def cstr(self, strptr):
> return strptr.sbvalue.summary
> 
> + def address(self, obj):
> + return lldb.value(obj.sbvalue.address_of)
> +
> def lookup_global(self, symbol):
> sbvalue = self.target.FindFirstGlobalVariable(symbol)
> tp = sbvalue.GetType()
> @@ -492,6 +623,46 @@ class _LLDBDebugger(Debugger):
> ptr_tp
> )
> 
> + def member_by_offset(self, tp, offset, prev_name=None):
> + if isinstance(tp, str):
> + tp = self._dbgtype(tp)
> + assert offset < tp.GetByteSize(), 'offset is bigger than object size'
> + tp = tp.GetCanonicalType()
> + if tp.GetTypeClass() == lldb.eTypeClassStruct:
> + len_fields = tp.GetNumberOfFields()
> + for n_field in range(len_fields):
> + islast = n_field == (len_fields - 1)
> + field = tp.GetFieldAtIndex(n_field)
> + start_field = field.GetOffsetInBytes()
> + if not islast:
> + end_field = tp.GetFieldAtIndex(
> + n_field + 1
> + ).GetOffsetInBytes()
> + else:
> + end_field = tp.GetByteSize()
> + if start_field <= offset and offset < end_field:
> + next_name = self.member_by_offset(
> + field.GetType(),
> + offset - start_field,
> + prev_name=field.GetName()
> + )
> + return '.{field}{suffix}'.format(
> + field=field.GetName(),
> + suffix=next_name if next_name else ''
> + )
> + if tp.GetTypeClass() == lldb.eTypeClassArray:
> 

Typo? This should probably be `elif`, as in the GDB implementation.

> 
> + # Get array field type.
> + target = tp.GetArrayElementType()
> + tsize = target.GetByteSize()
> + idx = int(offset // tsize)
> + next_name = self.member_by_offset(target, offset - idx * tsize)
> + idxname = idx_name(prev_name)
> + if idxname and idx in idxname:
> + idx = idxname[idx]
> + return '[{}]{}'.format(idx, next_name if next_name else '')
> + else:
> + return None
> +
> def eval(self, command):
> if not command:
> return None
> @@ -502,9 +673,23 @@ class _LLDBDebugger(Debugger):
> ret = frame.EvaluateExpression(command)
> return ret
> 
> + def detect_arch(self):
> + if hasattr(self, 'arch'):
> + return self.arch
> + target = self.target.GetTriple().split('-')[0]
> + if target == 'x86_64':
> + return 'x64'
> + elif target == 'arm64' or target == 'aarch64':
> + return 'arm64'
> + else:
> + return ''
> +
> def write(self, msg):
> sys.stdout.write(msg)
> 
> + def error(self, errmsg):
> + return Exception(errmsg)
> +
> def check_libluajit(self):
> # TODO: Implement postpone loading for LLDB too.
> return True
> @@ -997,6 +1182,86 @@ def J(g):
> return dbg.cast('jit_State *', dbg.cast('char *', g) - g_offset +
> J_offset)
> 
> 
> +# Matched `MMDEF(_)`.
> +MM_NAMES = [
> + 'index',
> + 'newindex',
> + 'gc',
> + 'mode',
> + 'eq',
> + 'len',
> + 'lt',
> + 'le',
> + 'concat',
> + 'call',
> + 'add',
> + 'sub',
> + 'mul',
> + 'div',
> + 'mod',
> + 'pow',
> + 'unm',
> + 'metatable',
> + 'tostring',
> + # TODO: depends on LJ_HASFFI, see `MMDEF_FFI(_)`.
> + 'new',
> + # TODO: depends on LJ_52 || LJ_HASFFI, see `MMDEF_PAIRS(_)`.
> + 'pairs',
> + 'ipairs',
> +]
> +
> +
> +GCROOT_MMNAME = 0
> +GCROOT_BASEMT = GCROOT_MMNAME + len(MM_NAMES)
> +GCROOT_IO_INPUT = GCROOT_BASEMT + i2notu32(LJ_T['NUMX']) + 1
> +GCROOT_IO_OUTPUT = GCROOT_IO_INPUT + 1
> +
> +
> +# Get the name of the index in the predefined arrays.
> +def idx_name(field_name):
> + # Don't use **{ to be compatible with Python 2.
> + gcroot = {}
> + gcroot.update({
> + i: 'GCROOT_MMNAME_' + MM_NAMES[i] for i in range(len(MM_NAMES))
> + })
> + gcroot.update({
> + i2notu32(LJ_T[k]) + GCROOT_BASEMT: 'GCROOT_BASEMT_' + k
> + for k in LJ_T.keys()
> + })
> + gcroot.update({
> + GCROOT_IO_INPUT: 'GCROOT_IO_INPUT',
> + GCROOT_IO_OUTPUT: 'GCROOT_IO_OUTPUT',
> + })
> + return {
> + # May be one of 2 slots depending on the result address.
> + 'ksimd': {
> + 0 * 2 + 0: 'LJ_KSIMD_ABS',
> + 0 * 2 + 1: 'LJ_KSIMD_ABS',
> + 1 * 2 + 0: 'LJ_KSIMD_NEG',
> + 1 * 2 + 1: 'LJ_KSIMD_NEG',
> + },
> + 'gcroot': gcroot,
> + }.get(field_name, None)
> +
> +
> +ggfname_cache = {}
> +
> +
> +# Get GG field name by given offset. Use in JIT dump.
> +def ggfname_by_offset(offset):
> + if offset in ggfname_cache:
> + return ggfname_cache[offset]
> +
> + field_path = dbg.member_by_offset('GG_State', offset)
> + if not field_path:
> + return None
> +
> + # Remove first '.'.
> + ggfname = 'offsetof(GG, {})'.format(field_path[1:])
> + ggfname_cache[offset] = ggfname
> + return ggfname
> +
> +
> def vm_state(g):
> return {
> i2notu32(0): 'INTERP',
> @@ -1087,6 +1352,555 @@ def lightudV(tv):
> return gcval(tv['gcr'])
> 
> 
> +# JIT engine.
> +
> +
> +IRS = [
> + # Guarded assertions.
> + 'LT',
> + 'GE',
> + 'LE',
> + 'GT',
> +
> + 'ULT',
> + 'UGE',
> + 'ULE',
> + 'UGT',
> +
> + 'EQ',
> + 'NE',
> +
> + 'ABC',
> + 'RETF',
> +
> + # Miscellaneous ops.
> + 'NOP',
> + 'BASE',
> + 'PVAL',
> + 'GCSTEP',
> + 'HIOP',
> + 'LOOP',
> + 'USE',
> + 'PHI',
> + 'RENAME',
> + 'PROF',
> +
> + # Constants.
> + 'KPRI',
> + 'KINT',
> + 'KGC',
> + 'KPTR',
> + 'KKPTR',
> + 'KNULL',
> + 'KNUM',
> + 'KINT64',
> + 'KSLOT',
> +
> + # Bit ops.
> + 'BNOT',
> + 'BSWAP',
> + 'BAND',
> + 'BOR',
> + 'BXOR',
> + 'BSHL',
> + 'BSHR',
> + 'BSAR',
> + 'BROL',
> + 'BROR',
> +
> + # Arithmetic ops. ORDER ARITH
> + 'ADD',
> + 'SUB',
> + 'MUL',
> + 'DIV',
> + 'MOD',
> + 'POW',
> + 'NEG',
> +
> + 'ABS',
> + 'LDEXP',
> + 'MIN',
> + 'MAX',
> + 'FPMATH',
> +
> + # Overflow-checking arithmetic ops.
> + 'ADDOV',
> + 'SUBOV',
> + 'MULOV',
> +
> + # Memory ops. A = array, H = hash, U = upvalue, F = field,
> + # S = stack.
> +
> + # Memory references.
> + 'AREF',
> + 'HREFK',
> + 'HREF',
> + 'NEWREF',
> + 'UREFO',
> + 'UREFC',
> + 'FREF',
> + 'STRREF',
> + 'LREF',
> +
> + # Loads and Stores. These must be in the same order.
> + 'ALOAD',
> + 'HLOAD',
> + 'ULOAD',
> + 'FLOAD',
> + 'XLOAD',
> + 'SLOAD',
> + 'VLOAD',
> +
> + 'ASTORE',
> + 'HSTORE',
> + 'USTORE',
> + 'FSTORE',
> + 'XSTORE',
> +
> + # Allocations.
> + 'SNEW',
> + 'XSNEW',
> + 'TNEW',
> + 'TDUP',
> + 'CNEW',
> + 'CNEWI',
> +
> + # Buffer operations.
> + 'BUFHDR',
> + 'BUFPUT',
> + 'BUFSTR',
> +
> + # Barriers.
> + 'TBAR',
> + 'OBAR',
> + 'XBAR',
> +
> + # Type conversions.
> + 'CONV',
> + 'TOBIT',
> + 'TOSTR',
> + 'STRTO',
> +
> + # Calls.
> + 'CALLN',
> + 'CALLA',
> + 'CALLL',
> + 'CALLS',
> + 'CALLXS',
> + 'CARG',
> +]
> +
> +
> +# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> +# Non-weak guard. */
> 

Typo: stray C comment terminator `*/` at the end of the Python comment.

> 
> +IRM_C = 0x10
> +IRM_A = 0x20
> +IRM_L = 0x40
> +IRM_S = 0x60
> +IRM_W = 0x80
> +
> +
> +# IR operand mode (2 bit).
> +IRM = [
> + 'ref',
> + 'lit',
> + 'cst',
> + '', # none
> +]
> +
> +
> +lj_ir_mode_ = None
> +
> +
> +def lj_ir_mode():
> + global lj_ir_mode_
> + if lj_ir_mode_:
> + return lj_ir_mode_
> + lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
> + return lj_ir_mode_
> +
> +
> +def ir_left(op):
> + return IRM[int(lj_ir_mode()[op] & 3)]
> 

Maybe a binary constant would be clearer? E.g. `xxx & 0b0011`.

> 
> +
> +
> +def ir_right(op):
> + return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
> 

Maybe a binary constant would be clearer? E.g. `(xxx & 0b1100) >> 2`.
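
A minimal sketch of how both helpers could look with binary masks. For
illustration only, it takes the raw mode byte directly instead of
looking it up via `lj_ir_mode()[op]`; that signature change is an
assumption, not what the patch does.

```python
# IR operand mode (2 bit), same order as in the patch.
IRM = ['ref', 'lit', 'cst', '']


def ir_left(mode):
    # Bits 0-1 hold the mode of the left operand.
    return IRM[mode & 0b0011]


def ir_right(mode):
    # Bits 2-3 hold the mode of the right operand.
    return IRM[(mode & 0b1100) >> 2]
```

The upper four bits of the byte are ignored by both masks.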

> 
> +
> +
> +def ir_mode(op):
> + mode = ''
> + ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> 

> 
> + if ir_mode == IRM_C:
> + mode = 'C'
> + elif ir_mode == IRM_A:
> + mode = 'A'
> + elif ir_mode == IRM_L:
> + mode = 'L'
> + elif ir_mode == IRM_S:
> + mode = 'S'
> + else:
> + mode = 'N'
> 

> 
> + mode += 'W' if ir_mode & IRM_W else ''
> 

Maybe a table with 16 items and comments would be clearer? E.g. return XXX[(lj_ir_mode()[op] & 0b11110000) >> 4]
It would also cover the invalid values.
# <flag bits in a comment>
XXX[0b0000] = 'NW' # Normal/Ref | !Non-weak guard
XXX[0b0001] = 'CW' # Commutative | !Non-weak guard
XXX[0b0011] = 'Invalid'
...
XXX[0b1000] = 'N' # Normal/Ref | Non-weak guard
XXX[0b1001] = 'C' # Commutative | Non-weak guard
XXX[0b1011] = 'Invalid'
...
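
A rough sketch of such a table, generated rather than written out by
hand. The bit layout follows the IRM_* constants from the patch (0x10
commutative, 0x20/0x40/0x60 alloc/load/store, 0x80 the inverted weak
bit). Treating every "commutative + alloc/load/store" combination as
invalid is my assumption, extrapolated from the 0b0011 entry above;
`build_mode_table` and `MODE_TABLE` are hypothetical names.

```python
# Bits 1-2 of the upper nibble: Normal/Ref, Alloc, Load, Store.
KINDS = {0b00: 'N', 0b01: 'A', 0b10: 'L', 0b11: 'S'}


def build_mode_table():
    table = []
    for nibble in range(16):
        commutative = nibble & 0b0001
        kind = (nibble >> 1) & 0b11
        # The stored top bit is inverted: clear means 'W'.
        weak = not (nibble & 0b1000)
        if commutative and kind:
            # Assumption: commutative ops are always Normal/Ref.
            table.append('Invalid')
            continue
        name = 'C' if commutative else KINDS[kind]
        table.append(name + ('W' if weak else ''))
    return table


MODE_TABLE = build_mode_table()


def ir_mode(mode):
    # Index the table by the upper four bits of the mode byte.
    return MODE_TABLE[(mode & 0b11110000) >> 4]
```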

> 
> + return mode
> 

> 
> +
> +
> +IRTYPES = [
> + 'nil',
> + 'fal',
> + 'tru',
> + 'lud',
> + 'str',
> + 'p32',
> + 'thr',
> + 'pro',
> + 'fun',
> + 'p64',
> + 'cdt',
> + 'tab',
> + 'udt',
> + 'flt',
> + 'num',
> + 'i8 ',
> + 'u8 ',
> + 'i16',
> + 'u16',
> + 'int',
> + 'u32',
> + 'i64',
> + 'u64',
> + 'sfp',
> +]
> +
> +
> +IRT_NUM = 14
> +assert IRTYPES[IRT_NUM] == 'num', 'incorrect IRT_NUM definition'
> +
> +
> +IRFIELDS = [
> + 'str.len',
> + 'func.env',
> + 'func.pc',
> + 'func.ffid',
> + 'thread.env',
> + 'tab.meta',
> + 'tab.array',
> + 'tab.node',
> + 'tab.asize',
> + 'tab.hmask',
> + 'tab.nomm',
> + 'udata.meta',
> + 'udata.udtype',
> + 'udata.file',
> + 'cdata.ctypeid',
> + 'cdata.ptr',
> + 'cdata.int',
> + 'cdata.int64',
> + 'cdata.int64_4',
> +]
> +
> +
> +IRFPMS = [
> + 'floor',
> + 'ceil',
> + 'trunc',
> + 'sqrt',
> + 'exp2',
> + 'log',
> + 'log2',
> + 'other'
> +]
> +
> +
> +# Don't use *[ to be compatible with Python 2.
> +REGISTERS = {'x64': [
> + 'rax',
> + 'rcx',
> + 'rdx',
> + 'rbx',
> + 'rsp',
> + 'rbp',
> + 'rsi',
> + 'rdi',
> +] + [
> + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> +] + [
> + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> +], 'arm64': [
> + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> +] + ['sp'] + [ # x31
> + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> +]}
> 

It seems the 'arm64' registers are missing.

> 
> +
> +
> +IR_CALLS = [
> + 'lj_str_cmp',
> + 'lj_str_find',
> + 'lj_str_new',
> + 'lj_strscan_num',
> + 'lj_strfmt_int',
> + 'lj_strfmt_num',
> + 'lj_strfmt_char',
> + 'lj_strfmt_putint',
> + 'lj_strfmt_putnum',
> + 'lj_strfmt_putquoted',
> + 'lj_strfmt_putfxint',
> + 'lj_strfmt_putfnum_int',
> + 'lj_strfmt_putfnum_uint',
> + 'lj_strfmt_putfnum',
> + 'lj_strfmt_putfstr',
> + 'lj_strfmt_putfchar',
> + 'lj_buf_putmem',
> + 'lj_buf_putstr',
> + 'lj_buf_putchar',
> + 'lj_buf_putstr_reverse',
> + 'lj_buf_putstr_lower',
> + 'lj_buf_putstr_upper',
> + 'lj_buf_putstr_rep',
> + 'lj_buf_puttab',
> + 'lj_buf_tostr',
> + 'lj_tab_new_ah',
> + 'lj_tab_new1',
> + 'lj_tab_dup',
> + 'lj_tab_clear',
> + 'lj_tab_newkey',
> + 'lj_tab_len',
> + 'lj_gc_step_jit',
> + 'lj_gc_barrieruv',
> + 'lj_mem_newgco',
> + 'lj_math_random_step',
> + 'lj_vm_modi',
> + 'log10',
> + 'exp',
> + 'sin',
> + 'cos',
> + 'tan',
> + 'asin',
> + 'acos',
> + 'atan',
> + 'sinh',
> + 'cosh',
> + 'tanh',
> + 'fputc',
> + 'fwrite',
> + 'fflush',
> + 'lj_vm_floor',
> + 'lj_vm_ceil',
> + 'lj_vm_trunc',
> + 'sqrt',
> + 'log',
> + 'lj_vm_log2',
> + 'pow',
> + 'atan2',
> + 'ldexp',
> + 'lj_vm_tobit',
> + 'softfp_add',
> + 'softfp_sub',
> + 'softfp_mul',
> + 'softfp_div',
> + 'softfp_cmp',
> + 'softfp_i2d',
> + 'softfp_d2i',
> + 'lj_vm_sfmin',
> + 'lj_vm_sfmax',
> + 'lj_vm_tointg',
> + 'softfp_ui2d',
> + 'softfp_f2d',
> + 'softfp_d2ui',
> + 'softfp_d2f',
> + 'softfp_i2f',
> + 'softfp_ui2f',
> + 'softfp_f2i',
> + 'softfp_f2ui',
> + 'fp64_l2d',
> + 'fp64_ul2d',
> + 'fp64_l2f',
> + 'fp64_ul2f',
> + 'fp64_d2l',
> + 'fp64_d2ul',
> + 'fp64_f2l',
> + 'fp64_f2ul',
> + 'lj_carith_divi64',
> + 'lj_carith_divu64',
> + 'lj_carith_modi64',
> + 'lj_carith_modu64',
> + 'lj_carith_powi64',
> + 'lj_carith_powu64',
> + 'lj_cdata_newv',
> + 'lj_cdata_setfin',
> + 'strlen',
> + 'memcpy',
> + 'memset',
> + 'lj_vm_errno',
> + 'lj_carith_mul64',
> + 'lj_carith_shl64',
> + 'lj_carith_shr64',
> + 'lj_carith_sar64',
> + 'lj_carith_rol64',
> + 'lj_carith_ror64',
> +]
> +
> +
> +def regname(reg_number):
> + if not hasattr(dbg, 'arch'):
> + dbg.arch = dbg.detect_arch()
> + return REGISTERS[dbg.arch][reg_number]
> +
> +
> +def litname_sload(mode):
> + modes_str = ''
> + modes_str += 'P' if mode & 0x1 else ''
> + modes_str += 'F' if mode & 0x2 else ''
> + modes_str += 'T' if mode & 0x4 else ''
> + modes_str += 'C' if mode & 0x8 else ''
> + modes_str += 'R' if mode & 0x10 else ''
> + modes_str += 'I' if mode & 0x20 else ''
> + return modes_str
> +
> +
> +def litname_xload(mode):
> + flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
> 

Do we need a range check as in litname_bufhdr()?
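
A minimal sketch of such a check, assuming the same "unknown ..." fallback convention that litname_bufhdr() uses (the fallback string is illustrative, not taken from the patch):

```python
def litname_xload(mode):
    flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
    # Guard against an unexpected literal instead of raising
    # IndexError in the middle of a dump.
    if mode < 0 or mode >= len(flags):
        return 'unknown xload mode'
    return flags[mode]
```

With this guard, a corrupted IR literal shows up as a readable marker in the dump rather than aborting the whole command.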

> 
> + return flags[mode]
> +
> +
> +def litname_conv(mode):
> 

Do we need some range checking here?
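
One possible shape for it, as a sketch only: the 5-bit IRCONV_SRCMASK admits indices up to 31, which may exceed len(IRTYPES). The helper name irtype_name() is hypothetical, and the IRTYPES list below is a stand-in assumed to follow the order used in the patch (IRT_NUM == 14).

```python
# Stand-in for the IRTYPES list from the patch; only its length
# matters for the range check.
IRTYPES = [
    'nil', 'fal', 'tru', 'lud', 'str', 'p32', 'thr', 'pro', 'fun', 'p64',
    'cdt', 'tab', 'udt', 'flt', 'num', 'i8', 'u8', 'i16', 'u16', 'int',
    'u32', 'i64', 'u64', 'sfp',
]

IRCONV_DSH = 5
IRCONV_CSH = 12
IRCONV_SEXT = 0x800
IRCONV_SRCMASK = 0x1f


def irtype_name(idx):
    # Hypothetical helper: fall back to a marker for indices that
    # the 5-bit mask allows but IRTYPES does not cover.
    return IRTYPES[idx] if idx < len(IRTYPES) else 'unknown'


def litname_conv(mode):
    conv_str = '{to}.{frm}'.format(
        to=irtype_name((mode >> IRCONV_DSH) & IRCONV_SRCMASK),
        frm=irtype_name(mode & IRCONV_SRCMASK)
    )
    conv_str += ' sext' if mode & IRCONV_SEXT else ''
    num2int_mode = mode >> IRCONV_CSH
    if num2int_mode == 2:
        conv_str += ' index'
    elif num2int_mode == 3:
        conv_str += ' check'
    return conv_str
```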

> 
> + IRCONV_DSH = 5
> + IRCONV_CSH = 12
> + IRCONV_SEXT = 0x800
> + IRCONV_SRCMASK = 0x1f
> + conv_str = '{to}.{frm}'.format(
> + to=IRTYPES[(mode >> IRCONV_DSH) & IRCONV_SRCMASK],
> + frm=IRTYPES[mode & IRCONV_SRCMASK]
> + )
> + conv_str += ' sext' if mode & IRCONV_SEXT else ''
> + num2int_mode = mode >> IRCONV_CSH
> + if num2int_mode == 2:
> + conv_str += ' index'
> + elif num2int_mode == 3:
> + conv_str += ' check'
> + return conv_str
> +
> +
> +def litname_irfield(mode):
> + if mode >= len(IRFIELDS):
> + return 'unknown irfield'
> + return IRFIELDS[mode]
> +
> +
> +def litname_fpm(mode):
> + if mode >= len(IRFPMS):
> + return 'unknown irfpm'
> + return IRFPMS[mode]
> +
> +
> +def litname_bufhdr(mode):
> + modes = ['RESET', 'APPEND']
> + if mode >= len(modes):
> + return 'unknown bufhdr mode'
> + return modes[mode]
> +
> +
> +def litname_tostr(mode):
> + modes = ['INT', 'NUM', 'CHAR']
> + if mode >= len(modes):
> + return 'unknown tostr mode'
> + return modes[mode]
> +
> +
> +IR_LITNAMES = {
> + 'SLOAD': litname_sload,
> + 'XLOAD': litname_xload,
> + 'CONV': litname_conv,
> + 'FLOAD': litname_irfield,
> + 'FREF': litname_irfield,
> + 'FPMATH': litname_fpm,
> + 'BUFHDR': litname_bufhdr,
> + 'TOSTR': litname_tostr
> +}
> +
> +# Additional flags.
> +IRT_MARK = 0x20 # Marker for misc. purposes.
> +IRT_ISPHI = 0x40 # Instruction is left or right PHI operand.
> +IRT_GUARD = 0x80 # Instruction is a guard.
> +# Masks.
> +IRT_TYPE = 0x1f
> +
> +RID_NONE = 0x80
> +RID_MASK = 0x7f
> +RID_INIT = (RID_NONE | RID_MASK)
> +RID_SINK = (RID_INIT - 1)
> +RID_SUNK = (RID_INIT - 2)
> +# Spill slot 0 means no spill slot has been allocated.
> +SPS_NONE = 0
> +
> +REF_BIAS = 0x8000
> +
> +TREF_SHIFT = 24
> +
> +TREF_REFMASK = 0x0000ffff
> +TREF_FRAME = 0x00010000
> +TREF_CONT = 0x00020000
> +# Snapshot flags and masks.
> +SNAP_FRAME = 0x010000
> +SNAP_SOFTFPNUM = 0x080000
> +
> +
> +def irt_type(t):
> + return dbg.cast('IRType', t['irt'] & IRT_TYPE)
> +
> +
> +def tref_type(tr):
> + return dbg.cast('IRType', (tr >> TREF_SHIFT) & IRT_TYPE)
> +
> +
> +def tref_ref(tr):
> + return int(tr & TREF_REFMASK)
> +
> +
> +def irt_ismarked(t):
> + return t['irt'] & IRT_MARK
> 

I propose an explicit bool cast (!= 0) here and below.
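
A sketch of the proposed form, using a plain dict as a stand-in for the debugger value (an assumption for illustration; in the extension `t` is a debugger-side IRType1 value):

```python
IRT_MARK = 0x20   # Marker for misc. purposes.
IRT_ISPHI = 0x40  # Instruction is left or right PHI operand.
IRT_GUARD = 0x80  # Instruction is a guard.


def irt_ismarked(t):
    # Return a real bool rather than the masked integer.
    return (t['irt'] & IRT_MARK) != 0


def irt_isphi(t):
    return (t['irt'] & IRT_ISPHI) != 0


def irt_isguard(t):
    return (t['irt'] & IRT_GUARD) != 0
```

The callers in dump_irins() only test truthiness, so the behaviour there stays the same; the explicit comparison mainly makes the return type obvious.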

> 
> +
> +
> +def irt_isphi(t):
> + return t['irt'] & IRT_ISPHI
> +
> +
> +def irt_isguard(t):
> + return t['irt'] & IRT_GUARD
> +
> +
> +def irt_toitype(irt):
> + t = irt_type(irt)
> + if LJ_DUALNUM and t > IRT_NUM:
> + return LJ_T['NUMX']
> + else:
> + return i2notu32(t)
> +
> +
> +def ir_kptr(ir):
> + irname = IRS[ir['o']]
> + assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_kptr()'
> + return mref('void *', dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['ptr'])
> +
> +
> +def ir_kgc(ir):
> + irname = IRS[ir['o']]
> + assert irname == 'KGC', 'wrong IR for ir_kgc()'
> + return gcref(dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['gcr'])
> +
> +
> +def ir_knum(ir):
> + irname = IRS[ir['o']]
> + assert irname == 'KNUM', 'wrong IR for ir_knum()'
> + return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
> +
> +
> +def ir_kint64(ir):
> + irname = IRS[ir['o']]
> + assert irname == 'KINT64', 'wrong IR for ir_kint64()'
> + return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
> +
> +
> # Dumpers.
> 
> # GCobj dumpers.
> @@ -1467,6 +2281,325 @@ def dump_func(func):
> return 'fast function #{}\n'.format(int(ffid))
> 
> 
> +# JIT dumpers.
> +
> +
> +def dump_call_func(trace, callop):
> + ctype = ''
> + if callop > 0:
> + ir = trace['ir'][REF_BIAS + callop]
> + if IRTYPES[irt_type(ir['t'])] == 'nil': # nil == CARG(func, ctype)
> + callop = int(ir['op1']) - REF_BIAS
> + cdt_idx_irk = trace['ir'][ir['op2']]
> + assert IRS[cdt_idx_irk['o']] == 'KINT', \
> + 'unexpected IR for ctype storage'
> + ctype_idx = cdt_idx_irk['i']
> + ctype = 'ctype: {}'.format(ctype_idx)
> +
> + func_str = ''
> + if callop < 0:
> + irk = trace['ir'][REF_BIAS + callop]
> + assert IRS[irk['o']] == 'KINT64', \
> + 'unexpected IR for FFI function storage'
> + func_addr = int(ir_kint64(irk)['u64'])
> + # TODO: Symbol demangling.
> + func_str = '[{:#x}]'.format(func_addr)
> + else:
> + func_str = '[{:04d}]'.format(callop)
> +
> + return func_str, ctype
> +
> +
> +def dump_call_args(trace, ins):
> + if ins < 0:
> + return '{{{}}}'.format(dump_irk(trace, ins))
> + else:
> + ir = trace['ir'][REF_BIAS + ins]
> + irname = IRS[ir['o']]
> + if irname == 'CARG':
> + last_arg = ''
> + args = dump_call_args(trace, int(ir['op1']) - REF_BIAS)
> + op2 = int(ir['op2']) - REF_BIAS
> + if op2 < 0:
> + last_arg = '{{{}}}'.format(dump_irk(trace, op2))
> + else:
> + last_arg = '{{{:04d}}}'.format(op2)
> + return args + ', ' + last_arg
> + else:
> + return '{{{:04d}}}'.format(ins)
> +
> +
> +# Special FP constant.
> +CONST_BIAS = 2 ** 52 + 2 ** 51
> +
> +
> +def dump_irk(trace, idx):
> + ref = idx + REF_BIAS
> + assert ref >= trace['nk'] and ref < REF_BIAS, 'bad constant in IR dump'
> + irins = trace['ir'][ref]
> + irname = IRS[irins['o']]
> + slot = ''
> + if irname == 'KSLOT':
> + slot = ' KSLOT: @{}'.format(int(irins['op2']))
> + irins = trace['ir'][irins['op1']]
> + irname = IRS[irins['o']]
> +
> + irtype = irins['t']
> + if irname == 'KPRI':
> + typename = typenames(irt_toitype(irtype))
> + # Trivial dump for primitives.
> + irk = tv_dumpers.get(
> + typename, dump_lj_tv_invalid # noqa: F821 # Generated.
> + )(0)
> + elif irname == 'KINT':
> + irk = 'integer {}'.format(dbg.cast('int32_t', irins['i']))
> + elif irname == 'KGC':
> + typename = typenames(irt_toitype(irtype))
> + irk = gco_dumpers.get(typename, dump_lj_gco_invalid)(ir_kgc(irins))
> + elif irname == 'KKPTR':
> + addr = ir_kptr(irins)
> + if addr == dbg.address(G(L())['nilnode']):
> + return '[g->nilnode]' + slot
> + irk = '[{}]'.format(strx64(addr))
> + elif irname == 'KPTR':
> + irk = '[{}]'.format(strx64(ir_kptr(irins)))
> + elif irname == 'KNULL':
> + irk = 'NULL'
> + elif irname == 'KNUM':
> + tv_num = ir_knum(irins)
> + if float(tv_num['n']) == CONST_BIAS:
> + return 'bias'
> + irk = dump_lj_tv_numx(tv_num)
> + elif irname == 'KINT64':
> + irk = 'int64_t {}'.format(dbg.cast(
> + 'int64_t', int(ir_kint64(irins)['u64'])
> + ))
> + else:
> + return 'Unknown IRK: ' + irname
> + return irk + slot
> +
> +
> +def dump_irins(irins, trace=None):
> + irop = int(irins['o'])
> + if irop >= len(IRS):
> + return 'INVALID'
> +
> + irname = IRS[irop]
> + leftop = ir_left(irop)
> + rightop = ir_right(irop)
> + irt = irins['t']
> + is_sinksunk = irins['r'] == RID_SINK or irins['r'] == RID_SUNK
> + flags = '{is_sinksunk}{is_marked}{is_guard}{is_phi}'.format(
> + # Sink flag should be the first to match sink slots during
> + # the dump of registers.
> + is_sinksunk='}' if is_sinksunk else ' ',
> + is_marked='!' if irt_ismarked(irt) else ' ',
> + is_guard='>' if irt_isguard(irt) else ' ',
> + is_phi='+' if irt_isphi(irt) else ' '
> + )
> +
> + if not trace:
> + g = G(L(None))
> + compiling = jit_state(g) != 'IDLE'
> + assert compiling, 'attempt to dump IR for J.cur trace in bad VM state'
> + trace = J(g)['cur']
> +
> + left = ''
> + right = ''
> + lisref = leftop == 'ref'
> + risref = rightop == 'ref'
> + op1 = int((irins['op1'] - REF_BIAS) if lisref else irins['op1'])
> + op2 = int((irins['op2'] - REF_BIAS) if risref else irins['op2'])
> +
> + skip_right = False
> + if re.match('CALL', irname):
> + ctype = ''
> + args = ''
> + if rightop == 'lit':
> + func = IR_CALLS[op2]
> + else:
> + func, ctype = dump_call_func(trace, op2)
> +
> + if op1 != -1:
> + args = dump_call_args(trace, int(op1))
> +
> + return '{flags} {type} {name:6} [{mode:2}] {f}({args}) {ct}\n'.format(
> + flags=flags,
> + name=irname,
> + mode=ir_mode(irop),
> + type=IRTYPES[irt_type(irt)],
> + ct=ctype,
> + args=args,
> + f=func,
> + )
> + elif irname == 'CNEW' and op2 == -1:
> + left = dump_irk(trace, op1)
> + skip_right = True
> + elif leftop:
> + if op1 < 0:
> + left = dump_irk(trace, op1)
> + elif leftop == 'cst':
> + idx = irins - dbg.address(trace['ir'][REF_BIAS])
> + left = dump_irk(trace, idx)
> + else:
> + left = ('{:04d}' if lisref else '#{:<3d}').format(op1)
> +
> + if rightop:
> + if rightop == 'lit':
> + litname = IR_LITNAMES.get(irname, None)
> + if litname:
> + # Try to handle `lj_ir_ggfload()`.
> + ggfname = None
> + if irname == 'FLOAD' and left == 'nil' \
> + and op2 >= len(IRFIELDS):
> + ggfname = ggfname_by_offset(op2 << 2)
> +
> + if ggfname:
> + right = ggfname
> + else:
> + right = litname(op2)
> + elif irname == 'UREFO' or irname == 'UREFC':
> + right = '#{:<3d}'.format(op2 >> 8)
> + else:
> + right = '#{:<3d}'.format(op2)
> + elif op2 < 0:
> + right = dump_irk(trace, op2)
> + else:
> + right = ('{:04d}').format(op2)
> +
> + typename = ''
> + if irname == 'LOOP':
> + typename = '---'
> + elif irname == 'NOP':
> + typename = ' '
> + else:
> + typename = IRTYPES[irt_type(irt)]
> +
> + return '{flags} {type} {name:6} [{mode:2}] {left:<9s} {right}\n'.format(
> + flags=flags,
> + name=irname,
> + mode=ir_mode(irop),
> + type=typename,
> + left=(leftop + ': ' + left) if leftop else '',
> + right=(rightop + ': ' + right) if rightop and not skip_right else '',
> + )
> +
> +
> +def dump_snap(trace, snapno, snap):
> + dump = 'SNAP #{:<3d} ['.format(snapno)
> + snap_map = dbg.address(trace['snapmap'][snap['mapofs']])
> + snap_entry_num = 0
> + for slot in range(0, snap['nslots']):
> + dump += ' '
> + snap_entry = int(snap_map[snap_entry_num])
> + if snap_entry_num < snap['nent'] and snap_entry >> TREF_SHIFT == slot:
> + snap_entry_num += 1
> + ref = int((snap_entry & TREF_REFMASK) - REF_BIAS)
> + if ref < 0:
> + if int(snap_entry) == 0x1057fff:
> + dump += '----'
> + continue
> + elif (snap_entry & TREF_CONT):
> + dump += 'contpc'
> + elif (snap_entry & TREF_FRAME):
> + dump += 'ftsz '
> + else:
> + dump += '{{{const}}}'.format(const=dump_irk(trace, ref))
> + elif snap_entry & SNAP_SOFTFPNUM:
> + dump += '{:04d}/{:04d}'.format(ref, ref + 1)
> + else:
> + dump += '{:04d}'.format(ref)
> +
> + if snap_entry & SNAP_FRAME:
> + dump += '|'
> + else:
> + dump += '----'
> +
> + dump += ' ]\n'
> + return dump
> +
> +
> +def dump_sink_slot(rid, spill, ins_number):
> + assert rid == RID_SINK or rid == RID_SUNK, 'incorrect rid in sink dump'
> + tp = 'sink' if rid == RID_SINK else 'sunk'
> + return '{{{}'.format(tp) if spill == RID_INIT or spill == SPS_NONE \
> + else '{{{:04d}'.format(int(ins_number - spill))
> +
> +
> +def dump_regsp(irins, ins_number):
> + rid = irins['r']
> + spill = irins['s']
> + if rid == RID_SINK or rid == RID_SUNK:
> + return dump_sink_slot(rid, spill, ins_number)
> + elif irins['prev'] > 255:
> + return '[{:#05x}]'.format(int(spill * 4))
> + elif rid < 128:
> + return regname(rid)
> + else:
> + return ''
> +
> +
> +def dump_trace(trace, flags):
> + dump = 'Trace {num} start\n\tproto: {start_pt}\n\tBC: {start_bc}\n'.format(
> + num=trace['traceno'],
> + start_pt=gcref(trace['startpt']),
> + start_bc=mref('BCIns *', trace['startpc']),
> + )
> +
> + nins = trace['nins'] - REF_BIAS
> + dump += '---- TRACE IR\n'
> + nsnap = 0
> + snap = trace['snap'][nsnap]
> + snapref = snap['ref']
> + for irnum in range(1, nins):
> + irref = REF_BIAS + irnum
> + if 's' in flags and irref >= snapref and nsnap < trace['nsnap']:
> + dump += '.... '
> + if 'r' in flags:
> + dump += ' ' * 7
> + dump += dump_snap(trace, nsnap, snap)
> + nsnap += 1
> + snap = trace['snap'][nsnap]
> + snapref = snap['ref']
> + dump += '{:04d} '.format(irnum)
> + if 'r' in flags:
> + dump += '{:>7}'.format(dump_regsp(trace['ir'][irref], irnum))
> + dump += dump_irins(trace['ir'][irref], trace)
> + return dump
> +
> +
> +def dump_tref(tref):
> + return '[{F}{C}] {tp} {ref:#x}'.format(
> + F='F' if tref & TREF_FRAME else ' ',
> + C='C' if tref & TREF_CONT else ' ',
> + tp=IRTYPES[tref_type(tref)],
> + ref=tref_ref(tref)
> + )
> +
> +
> +def dump_jslots(coroutine):
> + lstate = L(None)
> + g = G(lstate or coroutine)
> + j = J(g)
> +
> + dump = ''
> + maxslot = j['baseslot'] + j['maxslot']
> + first_base_slot = 1 + LJ_FR2
> + for n in reversed(range(first_base_slot, maxslot)):
> + tref = j['slot'][n]
> + ref = tref_ref(tref)
> + address = dbg.address(tref)
> + dump += '{addr} {nslot:04d} {base:1s} {tref}{const}\n'.format(
> + addr=address,
> + base='B' if address == j['base'] else ' ',
> + nslot=n,
> + tref=dump_tref(tref),
> + const=' ' + dump_irk(j['cur'], ref - REF_BIAS)
> + if ref != 0 and ref < REF_BIAS else ''
> + )
> + return dump
> +
> +
> # Extension commands. ############################################
> 
> 
> @@ -1600,6 +2733,42 @@ error message occurs.
> dbg.write('{}\n'.format(dump_gcobj(gcobj)))
> 
> 
> +class LJDumpIR(dbg.LJBase):
> + '''
> +lj-ir <IRIns *>
> +
> +The command receives a pointer to <ir> (IRIns address) and dumps
> +the IR type and some info related to it. The format is similar to
> +the `jit.dump` tool but also provides information about IR mode and
> +operands modes.
> +
> +For the list of IR names and modes (operand types), see:
> + https://github.com/tarantool/tarantool/wiki/LuaJIT-SSA-IR.
> + '''
> +
> + def execute(self, arg):
> + dbg.write('{}'.format(dump_irins(dbg.cast('IRIns *', dbg.eval(arg)))))
> +
> +
> +class LJDumpJSlots(dbg.LJBase):
> + '''
> +lj-jslots [<lua_State *>]
> +
> +The command receives an optional lua_State address and dumps the
> +slots of JIT stack map:
> +
> +<slot ptr> <slot number> [<FRAME|CONTINUATION>] <IR reference>
> +
> +The lua_State pointer is optional to help in finding the VM's JIT state
> +when there is no coroutine to be inspected in the debugged frame.
> + '''
> +
> + def execute(self, arg):
> + dbg.write('{}'.format(
> + dump_jslots(dbg.cast('lua_State *', dbg.eval(arg)))
> + ))
> +
> +
> class LJDumpProto(dbg.LJBase):
> '''
> lj-proto <GCproto *>
> @@ -1784,19 +2953,44 @@ error message occurs.
> dbg.write('{}\n'.format(dump_tvalue(tv)))
> 
> 
> +class LJDumpTrace(dbg.LJBase):
> + '''
> +lj-trace [/FLAGS] <GCtrace *>
> +
> +The command receives a pointer to <trace> (IRIns address) and dumps
> +its number, IRs, and information about start location. The format is
> +similar to the `jit.dump` tool but also provides information about
> +IR mode and operands modes.
> +
> +Trace may be preceded with /FLAGS:
> +* r: Dump registers associated with IR, if any.
> +* s: Dump snapshots for the trace.
> + '''
> +
> + def execute(self, arg):
> + arg, flags = dbg.extract_flags(arg, 'rs')
> + dbg.write('{}'.format(dump_trace(
> + dbg.cast('GCtrace *', dbg.eval(arg)),
> + flags
> + )))
> +
> +
> def load(event=None):
> dbg.initialize_extension({
> - 'lj-arch': LJDumpArch,
> - 'lj-bc': LJDumpBC,
> - 'lj-func': LJDumpFunc,
> - 'lj-gc': LJGC,
> - 'lj-gco': LJDumpGCobj,
> - 'lj-proto': LJDumpProto,
> - 'lj-stack': LJDumpStack,
> - 'lj-state': LJState,
> - 'lj-str': LJDumpString,
> - 'lj-tab': LJDumpTable,
> - 'lj-tv': LJDumpTValue,
> + 'lj-arch': LJDumpArch,
> + 'lj-bc': LJDumpBC,
> + 'lj-func': LJDumpFunc,
> + 'lj-gc': LJGC,
> + 'lj-gco': LJDumpGCobj,
> + 'lj-ir': LJDumpIR,
> + 'lj-jslots': LJDumpJSlots,
> + 'lj-proto': LJDumpProto,
> + 'lj-stack': LJDumpStack,
> + 'lj-state': LJState,
> + 'lj-str': LJDumpString,
> + 'lj-tab': LJDumpTable,
> + 'lj-trace': LJDumpTrace,
> + 'lj-tv': LJDumpTValue,
> })
> 
> 
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
> index 7e8ea5a2..76543daa 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -46,7 +46,9 @@ else:
> RX_ADDR = r'0x[a-f0-9]+'
> RX_HASH = RX_ADDR # The same pattern for hexademic values.
> RX_BCN = r'00\d\d'
> +RX_IRN = RX_BCN # The same as for the bytecodes.
> RX_FRAME = r'\[(S|\s)(B|\s)(T|\s)(M|\s)\]'
> +RX_IRREF = r'0x\d\d\d\d'
> 
> 
> def persist(data):
> @@ -101,6 +103,9 @@ IS_GC64 = execute_process([
> LUAJIT_BINARY, '-e', "print(require('ffi').abi('gc64'))"
> ]).strip() == 'true'
> 
> +# Regexp for pointer type in IR.
> +RX_P = 'p64' if IS_GC64 else 'p32'
> +
> # If it is the guaranteed DUALNUM build (for example, on aarch64),
> # we use this regexp for the guaranteed 'integer' check and
> # 'number' for single-number build.
> @@ -108,6 +113,18 @@ RX_INT = r'integer' if IS_DUALNUM else r'number'
> RX_ISDUALNUM = r'True' if IS_DUALNUM else r'False'
> 
> 
> +# Assume not cross-platform debugging.
> +machine = os.uname().machine
> +if machine == 'x86_64':
> + RX_GPR = r'r\w\w'
> + RX_FPR = r'xmm\d+'
> +elif machine == 'arm64' or machine == 'aarch64':
> + RX_GPR = r'x\d+'
> + RX_FPR = r'd\d+'
> +else:
> + raise Exception('Unknown architecture in testing')
> +
> +
> class TestCaseBase(unittest.TestCase):
> @classmethod
> def construct_cmds(cls):
> @@ -193,6 +210,16 @@ def mref(arg, tp):
> return '((' + tp + '*)(' + arg + ').ptr32)'
> 
> 
> +def gcref(arg):
> + if SUPPORT_MACRO_EXPAND:
> + return 'gcref(' + arg + ')'
> + else:
> + if IS_GC64:
> + return '(' + arg + ').gcptr64'
> + else:
> + return '(' + arg + ').gcptr32'
> +
> +
> class TestLoad(TestCaseBase):
> extension_cmds = ''
> location = 'lj_cf_print'
> @@ -203,11 +230,14 @@ class TestLoad(TestCaseBase):
> r'lj-func command initialized\n'
> r'lj-gc command initialized\n'
> r'lj-gco command initialized\n'
> + r'lj-ir command initialized\n'
> + r'lj-jslots command initialized\n'
> r'lj-proto command initialized\n'
> r'lj-stack command initialized\n'
> r'lj-state command initialized\n'
> r'lj-str command initialized\n'
> r'lj-tab command initialized\n'
> + r'lj-trace command initialized\n'
> r'lj-tv command initialized\n'
> r'LuaJIT debug extension is successfully loaded'
> )
> @@ -473,6 +503,341 @@ class TestLJBC(TestCaseBase):
> )
> 
> 
> +# JIT engine.
> +
> +
> +class TestLJTraceBase(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'for _ = 1, 4 do end\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'Trace 1 start\n'
> + r'\t*proto: ' + RX_ADDR + r'\n' +
> + r'\t*BC: ' + RX_ADDR + r'\n' +
> + r'---- TRACE IR\n' +
> + RX_IRN + r'\s+ int SLOAD \[N \] lit: #[12] lit: C?I\n' +
> + RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
> + r' ref: integer 1\n' +
> + RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
> + r' ref: integer 4\n' +
> + RX_IRN + r'\s+ > --- LOOP \[N \]\s*\n' +
> + RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
> + r' ref: integer 1\n' +
> + RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
> + r' ref: integer 4\n' +
> + RX_IRN + r'\s+ int PHI \[S \] ref: ' + RX_IRN + r' ref: ' +
> + RX_IRN + r'\n' +
> + RX_IRN + r'\s+ NOP \[N \]\s*\n'
> + )
> +
> +
> +# Check the IR enumeration correctness by testing the lowest (LT) and
> +# the highest (CARG) IRs. Also checks CALL* incidentally.
> +class TestLJTraceIRRange(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'ffi.cdef[[int getpid(int, int);]]\n' # Use argument for testing.
> + 'jit.opt.start("hotloop=1")\n'
> + 'for i = 1, 4 do\n'
> + ' if i < 100 then\n' # LT.
> + ' ffi.C.getpid(i, 1LL)\n' # CARG and CALLXS.
> + ' end\n'
> + 'end\n'
> + 'print()\n'
> + )
> + # IRs from variant part of the trace.
> + pattern = (
> + RX_IRN + r'\s+ > int LT \[N \] ref: ' +
> + RX_IRN + r' ref: integer 100\n' +
> + RX_IRN + r'\s+ nil CARG \[N \] ref: ' +
> + RX_IRN + r' ref: integer 1\n' +
> + RX_IRN + r'\s+ int CALLXS \[S \] \[' + RX_ADDR +
> + r'\]\(\{' + RX_IRN + r'\}, \{integer 1\}\)'
> + )
> +
> +
> +# Test /rs flags.
> +class TestLJTraceFlags(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace /rs ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local r = 0.1\n'
> + 'for i = 1, 4 do\n'
> + ' r = i + r\n'
> + 'end\n'
> + 'print()\n'
> + )
> + # IRs and snapshot from variant part of the trace.
> + pattern = (
> + RX_IRN + r'\s+' + RX_FPR + r'\s* \+ num ADD.*\n' +
> + RX_IRN + r'\s+' + RX_GPR + r'\s* \+ int ADD.*\n' +
> + r'\.\.\.\.\s* SNAP #\d \[ (---- )*' + RX_IRN + r' \]'
> + )
> +
> +
> +class TestLJIRConst(TestCaseBase):
> + location = 'trace_stop'
> +
> + # No narrowing of 42.
> + if IS_DUALNUM:
> + # KNUM occupies 2 slots.
> + _knum_irnum = '6'
> + _kgc_irnum = '8' if IS_GC64 else '7'
> + _kptr_irnum = '10' if IS_GC64 else '8'
> + else:
> + # KNUM occupies 2 slots.
> + _knum_irnum = '8'
> + _kgc_irnum = '10' if IS_GC64 else '9'
> + _kptr_irnum = '12' if IS_GC64 else '10'
> + extension_cmds = (
> + 'n\n' # Load J.
> + 'lj-ir &J->cur.ir[0x8000 - 0]\n'
> + 'lj-ir &J->cur.ir[0x8000 - 1]\n'
> + 'lj-ir &J->cur.ir[0x8000 - 2]\n'
> + 'lj-ir &J->cur.ir[0x8000 - 3]\n'
> + 'lj-ir &J->cur.ir[0x8000 - 4]\n'
> + # Skip non-DUALNUM narrowed value.
> + 'lj-ir &J->cur.ir[0x8000 - ' + _knum_irnum + ']\n'
> + 'lj-ir &J->cur.ir[0x8000 - ' + _kgc_irnum + ']\n'
> + 'lj-ir &J->cur.ir[0x8000 - ' + _kptr_irnum + ']\n'
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace(x)\n'
> + ' return x + 42, x + 0.5, x .. "1"\n'
> + 'end\n'
> + 'trace(1)\n'
> + 'trace(1)\n'
> + )
> + pattern = (
> + RX_P + r' BASE.*\n' +
> + r'\s* nil KPRI.*\n'
> + r'\s* fal KPRI.*\n'
> + r'\s* tru KPRI.*\n'
> + r'\s* int KINT.*cst: integer 42\s*\n'
> + r'\s* num KNUM.*cst: number 0.5\s*\n'
> + r'\s* str KGC.*cst: string "1".*\n' +
> + r'\s*' + RX_P + r' KPTR.*cst: \[' + RX_ADDR + r'\]'
> + )
> +
> +
> +class TestLJIRFloadNeg(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace(a)\n'
> + ' local x = -a\n'
> + ' return x\n'
> + 'end\n'
> + 'trace(1.1)\n'
> + 'trace(1.1)\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'num FLOAD .* ref: nil lit: offsetof\(GG, J\.ksimd\[LJ_KSIMD_NEG\]\)'
> + )
> +
> +
> +class TestLJIRFloadAbs(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local math_abs = math.abs\n'
> + 'local function trace(a)\n'
> + ' local x = math_abs(a)\n'
> + ' return x\n'
> + 'end\n'
> + 'trace(1)\n'
> + 'trace(1)\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'num FLOAD .* ref: nil lit: offsetof\(GG, J\.ksimd\[LJ_KSIMD_ABS\]\)'
> + )
> +
> +
> +# XXX: Implemented only for GC64 in LuaJIT until backporting the
> +# corresponding commit.
> +if IS_GC64:
> + class TestLJIRFloadGCRootBaseMT(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace(a)\n'
> + 'local x = a.sub(1, 2)\n'
> + ' return x\n'
> + 'end\n'
> + 'trace("12")\n'
> + 'trace("12")\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'tab FLOAD .* ref: nil lit: '
> + r'offsetof\(GG, g\.gcroot\[GCROOT_BASEMT_STR\]\.gcptr64\)'
> + )
> +
> + class TestLJIRFloadGCRootIO(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local io_flush = io.flush\n'
> + 'local function trace()\n'
> + ' io_flush()\n'
> + 'end\n'
> + 'trace()\n'
> + 'trace()\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'udt FLOAD .* ref: nil lit: '
> + r'offsetof\(GG, g\.gcroot\[GCROOT_IO_OUTPUT\]\.gcptr64\)'
> + )
> +
> +
> +# Some IRs related to tables.
> +class TestLJIRTable(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace(t)\n'
> + ' t.a = nil\n'
> + ' t.b = 1\n'
> + ' return t\n'
> + 'end\n'
> + 'trace({a = 1})\n'
> + 'trace({a = 1})\n'
> + 'print()\n'
> + )
> + pattern = (
> + r'(?s)int FLOAD .* tab\.hmask\n'
> + r'.*' + RX_P + r' FLOAD .* tab\.node\n'
> + r'.*' + RX_P + r' HREFK .* string "a" @ ' + RX_ADDR +
> + r' KSLOT: @\d\n'
> + r'.*' + RX_P + r' HREF .* string "b" @ ' + RX_ADDR + r'\s*\n'
> + r'.*' + RX_P + r' EQ .* \[g->nilnode\]'
> + )
> +
> +
> +class TestLJIRUref(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'local uv = 0\n'
> + 'local function trace(a)\n'
> + ' uv = a\n'
> + ' return uv\n'
> + 'end\n'
> + 'trace(1)\n'
> + 'trace(1)\n'
> + 'print()\n'
> + )
> + pattern = r'UREFO .* lit: #0'
> +
> +
> +# Check border values (that are always available) of CALL IRs.
> +class TestLJIRCall(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace(a, b)\n'
> + ' return a < b, ffi.errno()\n'
> + 'end\n'
> + 'trace("abc", "abd")\n'
> + 'trace("abc", "abd")\n'
> + 'print(1)\n'
> + )
> + pattern = (
> + r'(?s)int CALLN .* '
> + r'lj_str_cmp\(\{' + RX_IRN + r'\}, \{' + RX_IRN + r'\}\)'
> + r'.*int CALLS .* lj_vm_errno\(\)'
> + )
> +
> +
> +# Test ffi call with ctype stored in CARG.
> +class TestLJIRCallXSCType(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-trace ' + gcref('((GG_State *)L)->J->trace[1]')
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'ffi.cdef[[int printf(const char *fmt, ...);]]\n'
> + 'jit.opt.start("hotloop=1")\n'
> + 'local function trace()\n'
> + ' local t = ffi.C.printf("")\n'
> + ' return t\n'
> + 'end\n'
> + 'trace()\n'
> + 'trace()\n'
> + 'print()\n'
> + )
> + pattern = r'int CALLXS .* \[' + RX_ADDR + r'\]\(.*\) ctype: \d+'
> +
> +
> +class TestLJJSlotsBase(TestCaseBase):
> + location = 'trace_stop'
> + extension_cmds = (
> + 'n\n' # Load J.
> + 'lj-jslots J->L\n'
> + )
> + lua_script = (
> + 'jit.opt.start("hotloop=1")\n'
> + 'for _ = 1, 4 do end\n'
> + )
> + pattern = (
> + r'(?s)(.*' +
> + RX_ADDR + ' ' + RX_IRN + r' (B|\s) \[(F|\s)(C|\s)\] \w\w\w ' +
> + RX_IRREF +
> + r'.*)+'
> + )
> +
> +
> for test_cls in TestCaseBase.__subclasses__():
> test_cls.test = lambda self: self.check()
> 
> --
> 2.54.0
>

[-- Attachment #2: Type: text/html, Size: 53767 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 3/3] test: add verbose mode for debug extension tests
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
@ 2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-28 15:19     ` Sergey Kaplun via Tarantool-patches
  2026-06-30 14:54   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-28  1:31 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2065 bytes --]

Hi, Sergey!

Thanks for the patch!

LGTM after fixing the comments.

--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> Date: Thursday, 25 June 2026, 23:29 +03:00
> If the environment variable `DUBUGGER_TEST_VERBOSE` is set, each test
> prints the generated command and its output and doesn't delete the files
> generated for it.
> ---
> test/tarantool-debugger-tests/debug-extension-tests.py | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
> index fc5d2c7b..adb83e1e 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -43,6 +43,8 @@ else:
> # Don't run any initialization scripts.
> RUN_CMD_FILE = ['--batch', '--nx', '--quiet', '--command']
> 
> +TEST_VERBOSE = os.getenv('DUBUGGER_TEST_VERBOSE', default=False)
> 

Typo: /DU/DE/
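
A related sketch, offered as an assumption rather than a requirement of the patch: os.getenv() returns a string, so with default=False any non-empty value (including '0') enables verbose mode. If that matters, the flag could be parsed explicitly. The helper name env_flag() is hypothetical.

```python
import os


def env_flag(name):
    # Treat unset, empty, and common "off" spellings as False;
    # any other value enables the flag.
    value = os.getenv(name, '')
    return value.strip().lower() not in ('', '0', 'false', 'no', 'off')


TEST_VERBOSE = env_flag('DEBUGGER_TEST_VERBOSE')
```

The resulting bool can still be passed as `delete=not TEST_VERBOSE` to tempfile.NamedTemporaryFile().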

> 
> +
> RX_ADDR = r'0x[a-f0-9]+'
> RX_HASH = RX_ADDR # The same pattern for hexademic values.
> RX_BCN = r'00\d\d'
> @@ -52,7 +54,7 @@ RX_IRREF = r'0x\d\d\d\d'
> 
> 
> def persist(data):
> - tmp = tempfile.NamedTemporaryFile(mode='w')
> + tmp = tempfile.NamedTemporaryFile(mode='w', delete=not TEST_VERBOSE)
> tmp.write(data)
> tmp.flush()
> return tmp
> @@ -149,7 +151,12 @@ class TestCaseBase(unittest.TestCase):
> LUAJIT_BINARY,
> script_file.name,
> ]
> + if TEST_VERBOSE:
> + print('# Test name: {}'.format(cls.__name__))
> + print('# Test command: {}'.format(' '.join(process_cmd)))
> cls.output = execute_process(process_cmd)
> + if TEST_VERBOSE:
> + print('# Command output: {}'.format(cls.output))
> 

Please add a newline before the (multiline) command output.
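
For illustration, a sketch of the suggested formatting; the helper name format_verbose_output() is hypothetical and only isolates the format string:

```python
def format_verbose_output(output):
    # Put the header on its own line so that multiline debugger
    # output starts at column zero.
    return '# Command output:\n{}'.format(output)


print(format_verbose_output('lj-arch command initialized\n'
                            'lj-bc command initialized'))
```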

> 
> cmd_file.close()
> script_file.close()
> 
> --
> 2.54.0
>

[-- Attachment #2: Type: text/html, Size: 3577 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests
  2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-28 15:19     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-28 15:19 UTC (permalink / raw)
  To: Evgeniy Temirgaleev; +Cc: tarantool-patches

Hi, Evgeniy!
Thanks for the review!
Fixed your comments and force-pushed the branch.

On 28.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
> 
> Thanks for the patch!
> 
> LGTM after fixing the comments.
> 
> --
> Best regards,
> Evgeniy Temirgaleev
> 
> > 
> > From: Sergey Kaplun <skaplun@tarantool.org>
> > To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> > Date: Thursday, June 25, 2026, 23:29 +03:00
> > If the environment variable `DUBUGGER_TEST_VERBOSE` is set, each test

Updated the commit message:

| test: add verbose mode for debug extension tests
|
| If the environment variable `DEBUGGER_TEST_VERBOSE` is set, each test
| prints the generated command and its output and doesn't delete the files
| generated for it.
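
The "doesn't delete the files" part relies on the `delete` argument of
`tempfile.NamedTemporaryFile`. A minimal sketch of that behaviour; the `keep`
parameter is hypothetical and stands in for `TEST_VERBOSE` from the patch:

```python
import os
import tempfile


def persist(data, keep=False):
    # keep=True mirrors delete=not TEST_VERBOSE from the patch.
    tmp = tempfile.NamedTemporaryFile(mode='w', delete=not keep)
    tmp.write(data)
    tmp.flush()
    return tmp


# With keep=True the file is still on disk after close().
kept = persist('print(1)\n', keep=True)
kept.close()
kept_exists = os.path.exists(kept.name)
os.unlink(kept.name)  # Kept files have to be cleaned up by hand.

# With the default, close() removes the file.
gone = persist('print(1)\n')
gone.close()
gone_exists = os.path.exists(gone.name)
```

In verbose mode the kept files accumulate in the temporary directory, so they
need to be removed manually after inspection.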

> > prints the generated command and its output and doesn't delete the files
> > generated for it.
> > ---
> > test/tarantool-debugger-tests/debug-extension-tests.py | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py
> > b/test/tarantool-debugger-tests/debug-extension-tests.py
> > index fc5d2c7b..adb83e1e 100644
> > --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> > +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> > @@ -43,6 +43,8 @@ else:
> > # Don't run any initialization scripts.
> > RUN_CMD_FILE = ['--batch', '--nx', '--quiet', '--command']
> > 
> > +TEST_VERBOSE = os.getenv('DUBUGGER_TEST_VERBOSE', default=False)
> > 
> 
> Typo: /DU/DE/

Fixed, thanks!

===================================================================
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index adb83e1e..9bf375d5 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -43,7 +43,7 @@ else:
     # Don't run any initialization scripts.
     RUN_CMD_FILE = ['--batch', '--nx', '--quiet', '--command']
 
-TEST_VERBOSE = os.getenv('DUBUGGER_TEST_VERBOSE', default=False)
+TEST_VERBOSE = os.getenv('DEBUGGER_TEST_VERBOSE', default=False)
 
 RX_ADDR = r'0x[a-f0-9]+'
 RX_HASH = RX_ADDR  # The same pattern for hexademic values.
===================================================================
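
One side note on this variable: `os.getenv` returns a string whenever the
variable is set, so `DEBUGGER_TEST_VERBOSE=0` enables verbose mode too. That
matches the commit message ("if ... is set"). If stricter parsing were ever
wanted, a sketch could look like the following; `env_flag` and the list of
"off" spellings are hypothetical and not part of the patch:

```python
import os

# Hypothetical helper, not part of the patch: treat common "off"
# spellings as False instead of relying on string truthiness.
FALSY = ('', '0', 'false', 'no', 'off')


def env_flag(name, environ=os.environ):
    value = environ.get(name)
    if value is None:
        return False
    return value.strip().lower() not in FALSY
```

The `environ` parameter only exists so the helper can be exercised with a
plain dict.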

> 
> > 
> > +
> > RX_ADDR = r'0x[a-f0-9]+'
> > RX_HASH = RX_ADDR # The same pattern for hexademic values.
> > RX_BCN = r'00\d\d'
> > @@ -52,7 +54,7 @@ RX_IRREF = r'0x\d\d\d\d'
> > 
> > 
> > def persist(data):
> > - tmp = tempfile.NamedTemporaryFile(mode='w')
> > + tmp = tempfile.NamedTemporaryFile(mode='w', delete=not TEST_VERBOSE)
> > tmp.write(data)
> > tmp.flush()
> > return tmp
> > @@ -149,7 +151,12 @@ class TestCaseBase(unittest.TestCase):
> > LUAJIT_BINARY,
> > script_file.name,
> > ]
> > + if TEST_VERBOSE:
> > + print('# Test name: {}'.format(cls.__name__))
> > + print('# Test command: {}'.format(' '.join(process_cmd)))
> > cls.output = execute_process(process_cmd)
> > + if TEST_VERBOSE:
> > + print('# Command output: {}'.format(cls.output))
> > 
> 
> Please add a newline before the (multiline) command output.

Added:

===================================================================
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 9bf375d5..8e069fe0 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -156,7 +156,7 @@ class TestCaseBase(unittest.TestCase):
             print('# Test command: {}'.format(' '.join(process_cmd)))
         cls.output = execute_process(process_cmd)
         if TEST_VERBOSE:
-            print('# Command output: {}'.format(cls.output))
+            print('# Command output:\n{}'.format(cls.output))
         cmd_file.close()
         script_file.close()
 
===================================================================

> 
> > 
> > cmd_file.close()
> > script_file.close()
> > 
> > --
> > 2.54.0
> >

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-28 16:32     ` Sergey Kaplun via Tarantool-patches
  2026-06-29 16:35       ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-28 16:32 UTC (permalink / raw)
  To: Evgeniy Temirgaleev; +Cc: tarantool-patches

Hi, Evgeniy!
Thanks for the review!
See my answers below.

Branch is force-pushed with the fixes.

On 28.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
> 
> Thanks for the patch. Please, see my comments.
> 
> --
> Best regards,
> Evgeniy Temirgaleev
> 
> > 
> > From: Sergey Kaplun <skaplun@tarantool.org>
> > To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> > Date: Thursday, June 25, 2026, 23:29 +03:00
> > This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> > as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> > similar to the -jdump flag but also reports types of register operands
> > (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> > The `lj-trace` command accepts optional /rs flags to dump registers
> > associated with IR and snapshots for the trace correspondingly.
> > The `lj-ir` command can be used for dumping IR constants as well.
> > The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> > simplify debugging of `rec_check_slots()` assertion failures.
> > 
> > For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> > Also, `__index__` is set to allow lldb.value to be used as an index
> > without explicit conversion to int. Old GDB versions (below 7.12) are
> > not supported because gdb.Value lacks the `__index__` metamethod
> > and can't be monkey-patched. Support for these versions may be added
> > on demand.
> > 
> > Part of tarantool/tarantool#4808
> > ---
> > src/luajit_dbg.py | 1216 ++++++++++++++++-
> > .../debug-extension-tests.py | 365 +++++
> > 2 files changed, 1570 insertions(+), 11 deletions(-)
> > 
> > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > index 2edb199a..fd6ca8a5 100644
> > --- a/src/luajit_dbg.py
> > +++ b/src/luajit_dbg.py

<snipped>

> > + if tp.GetTypeClass() == lldb.eTypeClassStruct:
> > + len_fields = tp.GetNumberOfFields()
> > + for n_field in range(len_fields):
> > + islast = n_field == (len_fields - 1)
> > + field = tp.GetFieldAtIndex(n_field)
> > + start_field = field.GetOffsetInBytes()
> > + if not islast:
> > + end_field = tp.GetFieldAtIndex(
> > + n_field + 1
> > + ).GetOffsetInBytes()
> > + else:
> > + end_field = tp.GetByteSize()
> > + if start_field <= offset and offset < end_field:
> > + next_name = self.member_by_offset(
> > + field.GetType(),
> > + offset - start_field,
> > + prev_name=field.GetName()
> > + )
> > + return '.{field}{suffix}'.format(
> > + field=field.GetName(),
> > + suffix=next_name if next_name else ''
> > + )
> > + if tp.GetTypeClass() == lldb.eTypeClassArray:
> > 
> 
> Typo?: elif

Fixed, thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -657,7 +657,7 @@ class _LLDBDebugger(Debugger):
                         field=field.GetName(),
                         suffix=next_name if next_name else ''
                     )
-        if tp.GetTypeClass() == lldb.eTypeClassArray:
+        elif tp.GetTypeClass() == lldb.eTypeClassArray:
             # Get array field type.
             target = tp.GetArrayElementType()
             tsize = target.GetByteSize()
===================================================================

<snipped>

> > +# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> > +# Non-weak guard. */
> > 
> 
> Typo: C comment end */

Removed, thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
@@ -1612,12 +1612,16 @@ IRS = [


 # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
-# Non-weak guard. */
+# Non-weak guard.
 IRM_C = 0x10
 IRM_A = 0x20
 IRM_L = 0x40
===================================================================

> 
> > 
> > +IRM_C = 0x10
> > +IRM_A = 0x20
> > +IRM_L = 0x40
> > +IRM_S = 0x60
> > +IRM_W = 0x80
> > +
> > +
> > +# IR operand mode (2 bit).
> > +IRM = [
> > + 'ref',
> > + 'lit',
> > + 'cst',
> > + '', # none
> > +]
> > +
> > +
> > +lj_ir_mode_ = None
> > +
> > +
> > +def lj_ir_mode():
> > + global lj_ir_mode_
> > + if lj_ir_mode_:
> > + return lj_ir_mode_
> > + lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
> > + return lj_ir_mode_
> > +
> > +
> > +def ir_left(op):
> > + return IRM[int(lj_ir_mode()[op] & 3)]
> > 
> 
> Maybe a binary constant would be clearer? xxx & 0b0011

I'd prefer to keep it consistent with the original sources, see
<src/lj_ir.h>. The bytecode decoding uses the same format as well.

> 
> > 
> > +
> > +
> > +def ir_right(op):
> > + return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
> > 
> 
> Maybe a binary constant would be clearer? (xxx & 0b1100) >> 2

Ditto.
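
For reference, a minimal sketch of how the two 2-bit operand fields come out
of a mode byte, written in the same `& 3` / `>> 2 & 3` form as the patch. The
`operand_modes` helper is illustrative and works on plain ints, not on
debugger values:

```python
# IR operand mode names, indexed by the 2-bit field value (as in the patch).
IRM = ['ref', 'lit', 'cst', '']


def operand_modes(mode):
    left = IRM[mode & 3]        # Bits 0..1.
    right = IRM[mode >> 2 & 3]  # Bits 2..3; >> binds tighter than &.
    return left, right
```

In Python the shift has higher precedence than the bitwise and, so
`mode >> 2 & 3` reads as `(mode >> 2) & 3`, the same as in the C sources.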

> 
> > 
> > +
> > +
> > +def ir_mode(op):
> > + mode = ''
> > + ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> > 
> 
> > 
> > + if ir_mode == IRM_C:
> > + mode = 'C'
> > + elif ir_mode == IRM_A:
> > + mode = 'A'
> > + elif ir_mode == IRM_L:
> > + mode = 'L'
> > + elif ir_mode == IRM_S:
> > + mode = 'S'
> > + else:
> > + mode = 'N'
> > 
> 
> > 
> > + mode += 'W' if ir_mode & IRM_W else ''
> > 
> 
> Maybe a table with 16 items and comments would be clearer? E.g. return XXX[(lj_ir_mode()[op] & 0b11110000) >> 4]
> It would cover the invalid values as well.
> # <flag bits in a comment>
> XXX[0b0000] = 'NW' # Normal/Ref | !Non-weak guard
> XXX[0b0001] = 'CW' # Commutative | !Non-weak guard
> XXX[0b0011] = 'Invalid'
> ...
> XXX[0b1000] = 'N' # Normal/Ref | Non-weak guard
> XXX[0b1001] = 'C' # Commutative | Non-weak guard
> XXX[0b1011] = 'Invalid'
> ...

Rewrote it to use a table, see the diff below.
Also, you helped me notice that the original implementation was
incorrect (the operand mode bits were still set after the xor). The
tests were wrong as well; fixed. Thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index d4f89eb5..32b0cea7 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1613,11 +1613,15 @@ IRS = [
 
 # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
 # Non-weak guard.
-IRM_C = 0x10
-IRM_A = 0x20
-IRM_L = 0x40
-IRM_S = 0x60
-IRM_W = 0x80
+IRM_BITS_W = 0x80
+IRM_BITS = {
+    0x00: 'N',
+    0x10: 'C',
+    0x20: 'A',
+    0x40: 'L',
+    0x60: 'S',
+}
+IRM_BITS_MASK = 0x70
 
 
 # IR operand mode (2 bit).
@@ -1649,19 +1653,10 @@ def ir_right(op):
 
 
 def ir_mode(op):
-    mode = ''
-    ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
-    if ir_mode == IRM_C:
-        mode = 'C'
-    elif ir_mode == IRM_A:
-        mode = 'A'
-    elif ir_mode == IRM_L:
-        mode = 'L'
-    elif ir_mode == IRM_S:
-        mode = 'S'
-    else:
-        mode = 'N'
-    mode += 'W' if ir_mode & IRM_W else ''
+    irmode = int((lj_ir_mode()[op]))
+    isweak = not bool(irmode & IRM_BITS_W)
+    mode = IRM_BITS[irmode & IRM_BITS_MASK]
+    mode += 'W' if isweak else ''
     return mode
 
 
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 8e069fe0..f17de27e 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -530,12 +530,12 @@ class TestLJTraceBase(TestCaseBase):
         r'\t*proto: ' + RX_ADDR + r'\n' +
         r'\t*BC: ' + RX_ADDR + r'\n' +
         r'---- TRACE IR\n' +
-        RX_IRN + r'\s+    int SLOAD  \[N \] lit: #[12]   lit: C?I\n' +
+        RX_IRN + r'\s+    int SLOAD  \[L \] lit: #[12]   lit: C?I\n' +
         RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
                  r' ref: integer 1\n' +
         RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
                  r' ref: integer 4\n' +
-        RX_IRN + r'\s+ >  --- LOOP   \[N \]\s*\n' +
+        RX_IRN + r'\s+ >  --- LOOP   \[S \]\s*\n' +
         RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
                  r' ref: integer 1\n' +
         RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
===================================================================

But I intentionally didn't use bit notation, to stay consistent with the
original declarations in <src/lj_ir.h>. I've used the mask to strip the
"NonWeak" guard bit and the lower bits that hold the operand modes.
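
Why the xor-based version misreported modes can be seen on a single value.
A sketch, assuming the bit layout from the diff above (mode bits under the
0x70 mask, bit 0x80 set for a non-weak guard, operand modes in the low four
bits). `old_ir_mode`, `new_ir_mode`, and the sample byte are illustrative and
operate on plain ints rather than debugger values:

```python
IRM_BITS_W = 0x80
IRM_BITS_MASK = 0x70
IRM_BITS = {0x00: 'N', 0x10: 'C', 0x20: 'A', 0x40: 'L', 0x60: 'S'}


def old_ir_mode(raw):
    # Pre-review logic: the xor only flips bit 0x80, so the operand bits
    # 0..3 stay set and the equality checks miss unless both are 'ref'.
    m = raw ^ IRM_BITS_W
    names = {0x10: 'C', 0x20: 'A', 0x40: 'L', 0x60: 'S'}
    mode = names.get(m, 'N')
    return mode + ('W' if m & IRM_BITS_W else '')


def new_ir_mode(raw):
    # Post-review logic: mask out everything except the mode bits.
    mode = IRM_BITS[raw & IRM_BITS_MASK]
    return mode + ('' if raw & IRM_BITS_W else 'W')


# Made-up mode byte: load mode (0x40), both operands 'lit' (0b0101),
# stored with bit 0x80 set.
sample = 0x80 | 0x40 | 0b0101
```

For this sample the old logic yields 'N' and the new one yields 'L', which is
the same change as in the SLOAD line of the test pattern.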

<snipped>

> > +# Don't use *[ to be compatible with Python 2.
> > +REGISTERS = {'x64': [
> > + 'rax',
> > + 'rcx',
> > + 'rdx',
> > + 'rbx',
> > + 'rsp',
> > + 'rbp',
> > + 'rsi',
> > + 'rdi',
> > +] + [
> > + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> > +] + [
> > + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> > +], 'arm64': [
> > + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> > +] + ['sp'] + [ # x31
> > + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> > +]}
> > 
> 
> It seems the 'arm64' registers are missing.

Actually, no, but I understand the confusion.
Reformatted as follows:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 32b0cea7..a79caad0 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1728,24 +1728,29 @@ IRFPMS = [
 
 
 # Don't use *[ to be compatible with Python 2.
-REGISTERS = {'x64': [
-    'rax',
-    'rcx',
-    'rdx',
-    'rbx',
-    'rsp',
-    'rbp',
-    'rsi',
-    'rdi',
-] + [
-    'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
-] + [
-    'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
-], 'arm64': [
-    'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
-] + ['sp'] + [  # x31
-    'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
-]}
+REGISTERS = {
+    'x64': [
+        'rax',
+        'rcx',
+        'rdx',
+        'rbx',
+        'rsp',
+        'rbp',
+        'rsi',
+        'rdi',
+    ] + [
+        'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
+    ] + [
+        'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
+    ],
+    'arm64': [
+        'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
+    ] + [
+        'sp'  # x31
+    ] + [
+        'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
+    ]
+}
 
 
 IR_CALLS = [
===================================================================
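
A quick way to check that both architectures are present after the reformat
is to look at the resulting lists. A sketch with the dictionary copied from
the diff above; the length and index checks are not part of the patch:

```python
REGISTERS = {
    'x64': [
        'rax', 'rcx', 'rdx', 'rbx', 'rsp', 'rbp', 'rsi', 'rdi',
    ] + [
        'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
    ] + [
        'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
    ],
    'arm64': [
        'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
    ] + [
        'sp'  # x31
    ] + [
        'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
    ]
}

# 8 named + 8 numbered GPRs + 16 xmm registers.
x64_count = len(REGISTERS['x64'])
# 31 x registers + sp + 32 d registers.
arm64_count = len(REGISTERS['arm64'])
```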

> 

<snipped>

> > +def litname_xload(mode):
> > + flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
> > 
> 
> Do we need a range check as in litname_bufhdr()?

I prefer raising an error if something goes wrong here (invalid IR or
an incorrect extension implementation).
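
To illustrate what "error raising" means here: indexing the list with an
out-of-range mode raises `IndexError` on its own. A sketch with plain ints
(the real extension passes debugger values); `try_litname` is a hypothetical
wrapper used only to show the behaviour:

```python
def litname_xload(mode):
    flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
    return flags[mode]


def try_litname(mode):
    # Hypothetical wrapper: report the exception instead of propagating it.
    try:
        return litname_xload(mode)
    except IndexError:
        return 'IndexError'
```

One caveat with this approach: a negative plain-int index wraps around in
Python instead of raising, so only values past the end of the list are
caught.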

> 
> > 
> > + return flags[mode]
> > +
> > +
> > +def litname_conv(mode):
> > 
> 
> Do we need some range checking here?

Ditto.

> 
> > 

<snipped>

> > +
> > +
> > +def irt_ismarked(t):
> > + return t['irt'] & IRT_MARK
> > 
> 
> I propose explicit bool cast (!= 0) here and below.

Added:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index a79caad0..6b0827d9 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1978,15 +1978,15 @@ def tref_ref(tr):
 
 
 def irt_ismarked(t):
-    return t['irt'] & IRT_MARK
+    return bool(t['irt'] & IRT_MARK)
 
 
 def irt_isphi(t):
-    return t['irt'] & IRT_ISPHI
+    return bool(t['irt'] & IRT_ISPHI)
 
 
 def irt_isguard(t):
-    return t['irt'] & IRT_GUARD
+    return bool(t['irt'] & IRT_GUARD)
 
 
 def irt_toitype(irt):
===================================================================
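
The difference the cast makes, as a small sketch. The `IRT_GUARD` value and
the dict standing in for the IR type are assumptions for illustration only:

```python
IRT_GUARD = 0x80  # Assumed value, for illustration only.


def irt_isguard_raw(t):
    # Without the cast: returns the masked integer.
    return t['irt'] & IRT_GUARD


def irt_isguard(t):
    # With the cast: returns a real bool.
    return bool(t['irt'] & IRT_GUARD)
```

Both forms behave the same inside an `if`, but the raw result prints as 128
(or 0) and does not compare equal to `True`.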

> 
> > 
> > +
> > +
> > +def irt_isphi(t):
> > + return t['irt'] & IRT_ISPHI
> > +
> > +
> > +def irt_isguard(t):
> > + return t['irt'] & IRT_GUARD
> > +
> > +

<snipped>

> > 
> > --
> > 2.54.0
> >

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
@ 2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-29 19:20     ` Sergey Kaplun via Tarantool-patches
  2026-06-30 14:53   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-29 13:55 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 20368 bytes --]

Hi, Sergey!

Thanks for the patch!
Please see the comments below.

--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> Date: Thursday, June 25, 2026 11:29 PM +03:00
> This patch extends dumped information for the given cdata object. Now
> it resolves the given `CType` and prints it in the format similar to the
> `__tostring` metamethod. The `lj-ctype` command is introduced to dump
> this information where there is only the `CType` pointer but no cdata
> associated with it.
> 
> `__or__` and `__ror__` metamethods are monkey-patched for the LLDB value
> object. In `__sub__` metamethod for LLDB pointers `GetPointeeType()` is
> used to get the pointee type instead of the incorrectly used
> `GetDereferencedType()` which always returns the same type with size 8.
> Casting from negative values to the unsigned values is supported to
> check `CTF_UCHAR`.
> 
> Part of tarantool/tarantool#4808
> ---
> src/luajit_dbg.py | 333 +++++++++++++++++-
> .../debug-extension-tests.py | 208 ++++++++++-
> 2 files changed, 535 insertions(+), 6 deletions(-)
> 
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index fd6ca8a5..62cd65d5 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -386,6 +386,8 @@ class _LLDBDebugger(Debugger):
> pack_flag = '<q'
> else:
> pack_flag = '<Q'
> + # Cast to unsigned.
> 

Would /unsigned/uint64_t/ be clearer?
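
For context, the mask matters because `struct.pack` with the '<Q' format
rejects negative integers. A sketch on plain Python ints; `pack_u64` is a
hypothetical name, the patch does this inline:

```python
import struct


def pack_u64(raw_value):
    # Reinterpret a possibly negative int as a 64-bit unsigned value
    # (two's complement), then pack it little-endian.
    raw_value &= 0xFFFFFFFFFFFFFFFF
    return struct.pack('<Q', raw_value)
```

With the mask, -1 packs as eight 0xff bytes; without it, `struct.pack('<Q', -1)`
raises `struct.error`.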

> 
> + raw_value &= 0xFFFFFFFFFFFFFFFF
> raw_data = struct.pack(pack_flag, raw_value)
> sbdata = lldb.SBData()
> sbdata.SetData(
> @@ -482,6 +484,9 @@ class _LLDBDebugger(Debugger):
> def lldb__lt__(lldbval, other):
> return int(lldbval) < int(other)
> 
> + def lldb__or__(lldbval, other):
> + return int(lldbval) | int(other)
> +
> def lldb__str__(lldbval):
> # Instead of default GetSummary.
> if not lldbval.sbvalue.TypeIsPointerType():
> @@ -512,8 +517,8 @@ class _LLDBDebugger(Debugger):
> lldbval_tp = sbval.GetType()
> other_tp = osbval.GetType()
> # Subtract pointers of the same size only.
> - elsz = lldbval_tp.GetDereferencedType().size
> - if other_tp.GetDereferencedType().size != elsz:
> + elsz = lldbval_tp.GetPointeeType().size
> + if other_tp.GetPointeeType().size != elsz:
> raise Exception(
> 'Attempt to substruct {otp} from {stp}'.format(
> stp=lldbval_tp.name,
> @@ -536,6 +541,8 @@ class _LLDBDebugger(Debugger):
> lldb.value.__index__ = lldb__index__
> lldb.value.__le__ = lldb__le__
> lldb.value.__lt__ = lldb__lt__
> + lldb.value.__or__ = lldb__or__
> + lldb.value.__ror__ = lldb__or__ # Same semantics.
> lldb.value.__str__ = lldb__str__
> lldb.value.__sub__ = lldb__sub__
> 
> @@ -1352,6 +1359,119 @@ def lightudV(tv):
> return gcval(tv['gcr'])
> 
> 
> +# FFI.
> +
> +
> +def ctype_ctsG(g):
> + return mref('CTState *', g['ctype_state'])
> +
> +
> +def ctype_get(cts, id):
> + return dbg.address(cts['tab'][id])
> +
> +
> +# Externally visible types.
> +CT_NUM = 0 # Integer or floating-point numbers.
> +CT_STRUCT = 1 # Struct or union.
> +CT_PTR = 2 # Pointer or reference.
> +CT_ARRAY = 3 # Array or complex type.
> +CT_MAYCONVERT = CT_ARRAY
> +CT_VOID = 4 # Void type.
> +CT_ENUM = 5 # Enumeration.
> +CT_HASSIZE = CT_ENUM # Last type where ct->size holds the actual size.
> +CT_FUNC = 6 # Function.
> +CT_TYPEDEF = 7 # Typedef.
> +CT_ATTRIB = 8 # Miscellaneous attributes.
> +
> +# Common types.
> +CTID_CTYPEID = 21
> +
> +# C type info flags.
> +CTF_BOOL = 0x08000000 # Boolean: NUM, BITFIELD.
> +CTF_FP = 0x04000000 # Floating-point: NUM.
> +CTF_CONST = 0x02000000 # Const qualifier.
> +CTF_VOLATILE = 0x01000000 # Volatile qualifier.
> +CTF_UNSIGNED = 0x00800000 # Unsigned: NUM, BITFIELD.
> +CTF_LONG = 0x00400000 # Long: NUM.
> +CTF_VLA = 0x00100000 # Variable-length: ARRAY, STRUCT.
> +CTF_REF = 0x00800000 # Reference: PTR.
> +CTF_VECTOR = 0x08000000 # Vector: ARRAY.
> +CTF_COMPLEX = 0x04000000 # Complex: ARRAY.
> +CTF_UNION = 0x00800000 # Union: STRUCT.
> +CTF_VARARG = 0x00800000 # Vararg: FUNC.
> +CTF_SSEREGPARM = 0x00400000 # SSE register parameters: FUNC.
> +
> +CTF_UCHAR = CTF_UNSIGNED if int(dbg.cast('char', -1)) > 0 else 0
> +
> +CTMASK_ATTRIB = 255 # Max. 256 attributes.
> +CTSHIFT_ATTRIB = 16
> +
> +# Attribute numbers.
> +CTA_QUAL = 1 # Unmerged qualifiers.
> +
> +CTSHIFT_NUM = 28
> +CTMASK_CID = 0x0000ffff
> +CTMASK_NUM = 0xf0000000 # Max. 16 type numbers.
> +
> +# Special sizes.
> +CTSIZE_INVALID = 0xffffffff
> +DWORDSZ = 4
> +QWORDSZ = 8
> +
> +
> +def ctype_type(info):
> + return info >> CTSHIFT_NUM
> +
> +
> +def ctype_attrib(info):
> + return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
> +
> +
> +def ctinfo(ct, flags):
> 

Could we name this function 'CTINFO' as in 'lj_ctype.h'? Or leave a comment with the original name for quick grepping.
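
For readers following along, a sketch of how these helpers fit together,
using the constants quoted above and plain ints. The explicit 32-bit mask
stands in for `tou32()` from the patch, which is an assumption about what
that helper does:

```python
CT_ARRAY = 3
CTSHIFT_NUM = 28
CTMASK_NUM = 0xf0000000
CTF_COMPLEX = 0x04000000


def ctinfo(ct, flags):
    # Type number goes into the top 4 bits, flags below it.
    return ((ct << CTSHIFT_NUM) & 0xffffffff) + flags


def ctype_type(info):
    return info >> CTSHIFT_NUM


def ctype_iscomplex(info):
    return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY,
                                                         CTF_COMPLEX)
```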

> 
> + return (tou32(ct) << CTSHIFT_NUM) + flags
> +
> +
> +def ctype_isptr(info):
> + return ctype_type(info) == CT_PTR
> +
> +
> +def ctype_iscomplex(info):
> + return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY,
> CTF_COMPLEX)
> +
> +
> +def ctype_isinteger(info):
> + return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)
> +
> +
> +def ctype_isrefarray(info):
> + return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
> + ctinfo(CT_ARRAY, 0)
> +
> +
> +def ctype_cid(info):
> 

Could we put these function definitions in the same order as in 'lj_ctype.h'?
Could we also group the definitions by the corresponding C files? # lj_ctype.h … # lj_cdata.h … # lj_xxx.c …

> 
> + return info & CTMASK_CID
> +
> +
> +def ctype_child(cts, ctype):
> + return ctype_get(cts, ctype_cid(ctype['info']))
> +
> +
> +def cdataptr(cd):
> + return dbg.cast('void *', (cd + 1))
> +
> +
> +def cdata_getptr(p, size):
> + if LJ_64 and size == 4:
> + return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
> + else:
> 

Add an assert for size == 8?

> 
> + return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
> +
> +
> +# Get C type ID for a C type.
> +def ctype_typeid(cts, ct):
> + return ct - cts['tab']
> +
> +
> # JIT engine.
> 
> 
> @@ -1951,7 +2071,26 @@ def dump_lj_gco_trace(gcobj):
> 
> 
> def dump_lj_gco_cdata(gcobj):
> - return 'cdata @ {}'.format(strx64(gcobj))
> + cdata = dbg.cast('struct GCcdata *', gcobj)
> + cts = ctype_ctsG(G(L()))
> + cid = cdata['ctypeid']
> + ctype = ctype_get(cts, cid)
> + info = ctype['info']
> + size = ctype['size']
> + value = ''
> + if ctype_iscomplex(info):
> + value = cdata_val_complex(cdata, ctype)
> + elif size == 8 and ctype_isinteger(info):
> + value = cdata_val_int64(cdata, ctype)
> + else:
> + value = cdataptr(cdata)
> + if ctype_isptr(info):
> + value = cdata_getptr(value, size)
> + return 'cdata @ {addr} {ctype} {value}'.format(
> + addr=strx64(gcobj),
> + ctype=dump_ctype(ctype),
> + value=value,
> + )
> 
> 
> def dump_lj_gco_tab(gcobj):
> @@ -2281,6 +2420,176 @@ def dump_func(func):
> return 'fast function #{}\n'.format(int(ffid))
> 
> 
> +# FFI dumpers.
> +
> +
> +def cdata_val_int64(cdata, ctype):
> + info = ctype['info']
> + isunsigned = info & CTF_UNSIGNED
> + cdataval = cdataptr(cdata)
> + valueptr = None
> + usuffix = ''
> + if isunsigned:
> + usuffix = 'U'
> + valueptr = dbg.cast('uint64_t *', cdataval)
> + else:
> + valueptr = dbg.cast('int64_t *', cdataval)
> + return str(valueptr[0]) + usuffix + 'LL'
> +
> +
> +def cdata_val_complex(cdata, ctype):
> + size = ctype['size']
> + cdataval = cdataptr(cdata)
> + casttype = None
> + if size == QWORDSZ * 2:
> + casttype = 'double *'
> + else:
> + assert size == DWORDSZ * 2, 'bad (complex float) size'
> + casttype = 'float *'
> + re = dbg.cast(casttype, cdataval)[0]
> + im = dbg.cast(casttype, cdataval)[1]
> + sign = '+' if im > 0 else ''
> + return '{re}{sign}{im}i'.format(re=re, im=im, sign=sign)
> +
> +
> +def ctype_preplit(ctypestr, lit):
> + # Prevent extra space in the end of the string.
> + space = ' ' if ctypestr != '' else ''
> + return lit + space + ctypestr
> +
> +
> +def ctype_prepqual(ctypestr, info):
> + if (info & CTF_VOLATILE):
> + ctypestr = ctype_preplit(ctypestr, 'volatile')
> + if (info & CTF_CONST):
> + ctypestr = ctype_preplit(ctypestr, 'const')
> + return ctypestr
> +
> +
> +def ctype_preptype(cts, ctypestr, ctype, qual, tp):
> + nameref = gcref(ctype['name'])
> + if nameref:
> + ctypestr = ctype_preplit(ctypestr, re.sub('"', '', strdata(nameref)))
> + else:
> + ctypestr = ctype_preplit(ctypestr, str(ctype_typeid(cts, ctype)))
> + ctypestr = ctype_preplit(ctypestr, tp)
> + ctypestr = ctype_prepqual(ctypestr, qual)
> + return ctypestr
> +
> +
> +def ctype_prepnum(ctypestr, info, size):
> 

The function prototype differs from the one in lj_ctype.c (static void ctype_prepnum(CTRepr *ctr, uint32_t n)).
It seems you moved some of the ctype_repr() code here. Let's add a comment about it?
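
The part of this function that is easiest to misread is the 1-byte case,
where the name depends on whether plain `char` is unsigned on the target.
A sketch of just that branch; `char_name` and passing `ctf_uchar` as a
parameter are illustrative (in the patch it is the module-level `CTF_UCHAR`):

```python
CTF_UNSIGNED = 0x00800000


def char_name(info, ctf_uchar):
    # ctf_uchar is CTF_UNSIGNED where plain char is unsigned, else 0.
    if not ((info ^ ctf_uchar) & CTF_UNSIGNED):
        # Signedness matches the platform default: plain 'char'.
        return 'char'
    elif ctf_uchar:
        return 'signed char'
    else:
        return 'unsigned char'
```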

> 
> + if info & CTF_BOOL:
> + ctypestr = ctype_preplit(ctypestr, 'bool')
> + elif info & CTF_FP:
> + if size == QWORDSZ:
> + ctypestr = ctype_preplit(ctypestr, 'double')
> + elif size == DWORDSZ:
> + ctypestr = ctype_preplit(ctypestr, 'float')
> + else:
> + assert size == QWORDSZ * 2, 'bad (long double) size'
> + ctypestr = ctype_preplit(ctypestr, 'long double')
> + elif size == 1:
> + if not ((info ^ CTF_UCHAR) & CTF_UNSIGNED):
> + ctypestr = ctype_preplit(ctypestr, 'char')
> + elif CTF_UCHAR:
> + ctypestr = ctype_preplit(ctypestr, 'signed char')
> 

> 
> + else:
> + ctypestr = ctype_preplit(ctypestr, 'unsigned char')
> + elif size < 8:
> + if size == 4:
> + ctypestr = ctype_preplit(ctypestr, 'int')
> + else:
> + assert size == DWORDSZ // 2, 'bad (short) size'
> + ctypestr = ctype_preplit(ctypestr, 'short')
> + if info & CTF_UNSIGNED:
> + ctypestr = ctype_preplit(ctypestr, 'unsigned')
> + else:
> + size_t = '{u}int{sz}_t'.format(
> + u='u' if info & CTF_UNSIGNED else '',
> + sz=size * 8,
> + )
> + ctypestr = ctype_preplit(ctypestr, size_t)
> + return ctypestr
> +
> +
> +def ctype_repr(cts, id):
> + ctype = ctype_get(cts, id)
> + ctypestr = ''
> + qual = 0
> + ptrto = 0
> + while True:
> + info = ctype['info']
> + size = ctype['size']
> + ctp = ctype_type(info)
> + if ctp == CT_NUM:
> + ctypestr = ctype_prepnum(ctypestr, info, size)
> + return ctype_prepqual(ctypestr, qual | info)
> + elif ctp == CT_VOID:
> + ctypestr = ctype_preplit(ctypestr, 'void')
> + return ctype_prepqual(ctypestr, qual | info)
> + elif ctp == CT_STRUCT:
> + tp = 'union' if (info & CTF_UNION) else 'struct'
> + return ctype_preptype(cts, ctypestr, ctype, qual, tp)
> + elif ctp == CT_ENUM:
> + if id == CTID_CTYPEID:
> + return ctype_preplit(ctypestr, 'ctype')
> + return ctype_preptype(cts, ctypestr, ctype, qual, 'enum')
> + elif ctp == CT_ATTRIB:
> + if ctype_attrib(info) == CTA_QUAL:
> + qual |= size
> + elif ctp == CT_PTR:
> + if info & CTF_REF:
> + ctypestr = ctype_preplit(ctypestr, '&')
> + else:
> + ctypestr = ctype_prepqual(ctypestr, qual | info)
> + if LJ_64 and size == 4:
> + ctypestr = ctype_preplit(ctypestr, '__ptr32')
> + ctypestr = ctype_preplit(ctypestr, '*')
> + qual = 0
> + ptrto = 1
> + elif ctp == CT_ARRAY:
> + if ctype_isrefarray(info):
> + if ptrto:
> + ptrto = 0
> + ctypestr = '(' + ctypestr + ')'
> + arrsize = ''
> + if size != CTSIZE_INVALID:
> + child_size = ctype_child(cts, ctype)['size']
> + arrsize = str(int(size / child_size) if child_size > 0
> + else 0)
> + elif info & CTF_VLA:
> + arrsize = '?'
> + ctypestr = ctypestr + '[{}]'.format(arrsize)
> + elif ctype_iscomplex(info):
> + if size == DWORDSZ * 2:
> + ctypestr = ctype_preplit(ctypestr, 'float')
> + else:
> + assert size == QWORDSZ * 2, 'bad (complex double) size'
> + return ctype_preplit(ctypestr, 'complex')
> + else:
> + ctypestr = ctype_preplit(
> + ctypestr,
> + '__attribute__((vector_size({})))'.format(size)
> + )
> + elif ctp == CT_FUNC:
> + if ptrto:
> + ptrto = 0
> + ctypestr = '(' + ctypestr + ')'
> + ctypestr += '()'
> + ctype = ctype_child(cts, ctype)
> + return 'NYI'
> +
> +
> +def dump_ctype(ct):
> 

Also, it seems the code would be easier to read if the ported functions could be distinguished from the extension's own ones, maybe by using a 'dbg_' prefix for the extension function names.

> 
> + cts = ctype_ctsG(G(L()))
> + cid = ctype_typeid(cts, ct)
> + name = ctype_repr(cts, cid)
> + return '[{id}] <{name}>'.format(
> + id=cid,
> + name=name,
> + )
> +
> +
> # JIT dumpers.
> 
> 
> @@ -2294,7 +2603,8 @@ def dump_call_func(trace, callop):
> assert IRS[cdt_idx_irk['o']] == 'KINT', \
> 'unexpected IR for ctype storage'
> ctype_idx = cdt_idx_irk['i']
> - ctype = 'ctype: {}'.format(ctype_idx)
> + cts = ctype_ctsG(G(L()))
> + ctype = 'ctype: {}'.format(dump_ctype(ctype_get(cts, ctype_idx)))
> 
> func_str = ''
> if callop < 0:
> @@ -2652,6 +2962,20 @@ https://github.com/tarantool/tarantool/wiki/LuaJIT-Bytecodes
> .
> ))
> 
> 
> +class LJDumpCType(dbg.LJBase):
> + '''
> +lj-ctype <CType *>
> +
> +The command receives a pointer <ctype> of the corresponding CType
> +and dumps the ID and the name for this C data type.
> + '''
> +
> + def execute(self, arg):
> + dbg.write('{}\n'.format(
> + dump_ctype(dbg.cast('CType *', dbg.eval(arg)))
> + ))
> +
> +
> class LJDumpFunc(dbg.LJBase):
> '''
> lj-func <GCfunc *>
> @@ -2979,6 +3303,7 @@ def load(event=None):
> dbg.initialize_extension({
> 'lj-arch': LJDumpArch,
> 'lj-bc': LJDumpBC,
> + 'lj-ctype': LJDumpCType,
> 'lj-func': LJDumpFunc,
> 'lj-gc': LJGC,
> 'lj-gco': LJDumpGCobj,
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py
> b/test/tarantool-debugger-tests/debug-extension-tests.py
> index 76543daa..fc5d2c7b 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -227,6 +227,7 @@ class TestLoad(TestCaseBase):
> pattern = (
> r'lj-arch command initialized\n'
> r'lj-bc command initialized\n'
> + r'lj-ctype command initialized\n'
> r'lj-func command initialized\n'
> r'lj-gc command initialized\n'
> r'lj-gco command initialized\n'
> @@ -331,7 +332,7 @@ GCO_RX = (
> r'Lua function @ ' + RX_ADDR + r', [0-9]+ upvalues, .+:[0-9]+\n'
> r'C function @ ' + RX_ADDR + r'\n'
> r'fast function #[0-9]+\n'
> - r'cdata @ ' + RX_ADDR + r'\n'
> + r'cdata @ ' + RX_ADDR + r' \[\d+\] <int \*> 0x0\n'
> r'table @ ' + RX_ADDR + r' \(asize: \d+, hmask: ' + RX_HASH + r'\)\n'
> r'userdata @ ' + RX_ADDR + r'\n'
> )
> @@ -817,7 +818,9 @@ class TestLJIRCallXSCType(TestCaseBase):
> 'trace()\n'
> 'print()\n'
> )
> - pattern = r'int CALLXS .* [' + RX_ADDR + r'\]\(.*\) ctype: \d+'
> + pattern = (
> + r'int CALLXS .* [' + RX_ADDR + r'\]\(.*\) ctype: \[\d+\] <int \(\)>'
> + )
> 
> 
> class TestLJJSlotsBase(TestCaseBase):
> @@ -838,6 +841,207 @@ class TestLJJSlotsBase(TestCaseBase):
> )
> 
> 
> +def cdata_rx(tpstr, suffix=None):
> + return r'cdata @ ' + RX_ADDR + r' \[\d+\] <' + tpstr + '> ' + (
> + RX_ADDR if not suffix else suffix
> + )
> +
> +
> +CHAR_SIGNED = machine in ['arm64', 'aarch64'] and sys.platform !=
> 'darwin'
> +HAS_LONG_DOUBLE = not (machine in ['arm64', 'aarch64'] and
> + sys.platform == 'darwin')
> +
> +
> +class TestLJCTypePrim(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-tv L->base\n'
> + 'lj-tv L->base + 1\n'
> + 'lj-tv L->base + 2\n'
> + 'lj-tv L->base + 3\n'
> + 'lj-tv L->base + 4\n'
> + 'lj-tv L->base + 5\n'
> + 'lj-tv L->base + 6\n'
> + 'lj-tv L->base + 7\n'
> + 'lj-tv L->base + 8\n'
> + 'lj-tv L->base + 9\n'
> + 'lj-tv L->base + 10\n'
> + 'lj-tv L->base + 11\n'
> + 'lj-tv L->base + 12\n'
> + 'lj-tv L->base + 13\n'
> + 'lj-tv L->base + 14\n'
> + 'lj-tv L->base + 15\n'
> + 'lj-tv L->base + 16\n'
> + 'lj-tv L->base + 17\n'
> + 'lj-tv L->base + 18\n'
> + 'lj-tv L->base + 19\n'
> + 'lj-tv L->base + 20\n'
> + 'lj-tv L->base + 21\n'
> + 'lj-tv L->base + 22\n'
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'print(\n'
> + ' ffi.new("bool"),\n'
> + ' ffi.new("char"),\n'
> + ' ffi.new("signed char"),\n'
> + ' ffi.new("unsigned char"),\n'
> + ' ffi.new("int"),\n'
> + ' ffi.new("short"),\n'
> + ' ffi.new("unsigned"),\n'
> + ' ffi.new("int8_t"),\n'
> + ' ffi.new("int16_t"),\n'
> + ' ffi.new("int32_t"),\n'
> + ' ffi.new("int64_t"),\n'
> + ' ffi.new("uint8_t"),\n'
> + ' ffi.new("uint64_t"),\n'
> + ' ffi.new("float"),\n'
> + ' ffi.new("double"),\n'
> + ' ffi.new("long double"),\n'
> + ' 1i,\n'
> + ' ffi.new("complex float", 1, -2),\n'
> + ' ffi.new("const volatile int"),\n'
> + ' ffi.new("void *"),\n'
> + ' ffi.new("void * __ptr32"),\n'
> + ' ffi.new("int &"),\n'
> + ' ffi.typeof(1LL)\n'
> + ')\n'
> + )
> + pattern = (
> + cdata_rx('bool') + r'\n' +
> + cdata_rx('char') + r'\n' +
> + cdata_rx(('signed ' if CHAR_SIGNED else '') + 'char') + r'\n' +
> + cdata_rx(('unsigned ' if not CHAR_SIGNED else '') + 'char') + r'\n' +
> + cdata_rx('int') + r'\n' +
> + cdata_rx('short') + r'\n' +
> + cdata_rx('unsigned int') + r'\n' +
> + cdata_rx(('signed ' if CHAR_SIGNED else '') + 'char') + r'\n' +
> + cdata_rx('short') + r'\n' +
> + cdata_rx('int') + r'\n' +
> + cdata_rx('int64_t', '0LL') + r'\n' +
> + cdata_rx(('unsigned ' if not CHAR_SIGNED else '') + 'char') + r'\n' +
> + cdata_rx('uint64_t', '0ULL') + r'\n' +
> + cdata_rx('float') + r'\n' +
> + cdata_rx('double') + r'\n' +
> + cdata_rx(('long ' if HAS_LONG_DOUBLE else '') + 'double') + r'\n' +
> + cdata_rx('complex', r'0\+1i') + r'\n' +
> + cdata_rx('complex float', '1-2i') + r'\n' +
> + cdata_rx('const volatile int') + r'\n' +
> + cdata_rx(r'void \*') + r'\n' +
> + cdata_rx(r'void \* __ptr32') + r'\n' +
> + cdata_rx('int &') + r'\n' +
> + cdata_rx('ctype') + r'\n'
> + )
> +
> +
> +class TestLJCTypeStructUnionEnum(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-tv L->base\n'
> + 'lj-tv L->base + 1\n'
> + 'lj-tv L->base + 2\n'
> + 'lj-tv L->base + 3\n'
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'ffi.cdef[[\n'
> + ' struct test {int a;};\n'
> + ']]\n'
> + 'print(\n'
> + ' ffi.new("struct test"),\n'
> + ' ffi.new("struct {int a;}"),\n'
> + ' ffi.new("union {int a;}"),\n'
> + ' ffi.new("enum {ENUM1}")\n'
> + ')\n'
> + )
> + pattern = (
> + cdata_rx('struct test') + r'\n' +
> + cdata_rx(r'struct \d+') + r'\n' +
> + cdata_rx(r'union \d+') + r'\n' +
> + cdata_rx(r'enum \d+') + r'\n'
> + )
> +
> +
> +class TestLJCTypeArray(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-tv L->base\n'
> + 'lj-tv L->base + 1\n'
> + 'lj-tv L->base + 2\n'
> + 'lj-tv L->base + 3\n'
> + 'lj-tv L->base + 4\n'
> + 'lj-tv L->base + 5\n'
> + 'lj-tv L->base + 6\n'
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'print(\n'
> + ' ffi.new("char [0]"),\n'
> + ' ffi.new("int [1]"),\n'
> + ' ffi.new("complex [2]"),\n'
> + ' ffi.new("complex float [3]"),\n'
> + ' ffi.new("float __attribute__((vector_size(4)))"),\n'
> + ' ffi.new("int (&)[5]"),\n'
> + ' ffi.new("int[?]", 6)\n'
> + ')\n'
> + )
> + pattern = (
> + cdata_rx(r'char \[0\]') + r'\n' +
> + cdata_rx(r'int \[1\]') + r'\n' +
> + cdata_rx(r'complex \[2\]') + r'\n' +
> + cdata_rx(r'complex float \[3\]') + r'\n' +
> + cdata_rx(r'float __attribute__\(\(vector_size\(4\)\)\)') + r'\n' +
> + cdata_rx(r'int \(&\)\[5\]') + r'\n' +
> + cdata_rx(r'int \[\?\]') + r'\n'
> + )
> +
> +
> +class TestLJCTypeFunc(TestCaseBase):
> + location = 'lj_cf_print'
> + extension_cmds = (
> + 'n\n' # Load L.
> + 'lj-tv L->base\n'
> + 'lj-tv L->base + 1\n'
> + 'lj-tv L->base + 2\n'
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'ffi.cdef[[void getpid(void);]]\n'
> + 'print(\n'
> + ' ffi.C.getpid,\n'
> + ' ffi.new("int (*)()"),\n'
> + ' ffi.new("int (*(*)(void))[2]")\n'
> + ')\n'
> + )
> + pattern = (
> + cdata_rx(r'void \(\)') + r'\n' +
> + cdata_rx(r'int \(\*\)\(\)') + r'\n' +
> + cdata_rx(r'int \(\* \(\*\)\(\)\)\[2\]') + r'\n'
> + )
> +
> +
> +class TestLJCTypeBase(TestCaseBase):
> + location = 'lj_cf_ffi_new'
> + extension_cmds = (
> + # Load `ct`. Skip inlined functions for LLDB.
> 

The extension command set is common for GDB and LLDB. Do we skip for GDB as well?

> 
> + 'n\n'
> + 'n\n'
> + 'n\n'
> + 'n\n'
> + 'n\n'
> + 'n\n'
> + 'lj-ctype ct\n'
> + )
> + lua_script = (
> + 'local ffi = require("ffi")\n'
> + 'ffi.new("int")\n'
> + )
> + pattern = r'\[\d+\] <int>'
> +
> +
> for test_cls in TestCaseBase.__subclasses__():
> test_cls.test = lambda self: self.check()
> 
> --
> 2.54.0
>

[-- Attachment #2: Type: text/html, Size: 24755 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-28 16:32     ` Sergey Kaplun via Tarantool-patches
@ 2026-06-29 16:35       ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 0 replies; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-29 16:35 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 12445 bytes --]

Hi, Sergey!

Thanks for your updates and for your answers, LGTM.

--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> Date: Sunday, June 28, 2026 7:33 PM +03:00
> Hi, Evgeniy!
> Thanks for the review!
> See my answers below.
> 
> Branch is force-pushed with the fixes.
> 
> On 28.06.26, Evgeniy Temirgaleev wrote:
> > Hi, Sergey!
> >
> > Thanks for the patch. Please, see my comments.
> >
> > --
> > Best regards,
> > Evgeniy Temirgaleev
> >
> > >
> > > From: Sergey Kaplun <skaplun@tarantool.org>
> > > To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > > Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> > > Date: Thursday, June 25, 2026 11:29 PM +03:00
> > > This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> > > as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> > > similar to the -jdump flag but also reports types of register operands
> > > (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> > > The `lj-trace` command accepts optional /rs flags to dump registers
> > > associated with IR and snapshots for the trace respectively.
> > > The `lj-ir` command can be used for dumping IR constants as well.
> > > The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> > > simplify debugging of `rec_check_slots()` assertion failures.
> > >
> > > For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> > > Also, `__index__` is set to allow lldb.value to be used as an index
> > > without explicit conversion to int. Old GDB versions (below 7.12) are
> > > not supported because gdb.Value lacks the `__index__` metamethod
> > > and can't be monkey-patched. The support for these versions may be added
> > > on demand.
> > >
> > > Part of tarantool/tarantool#4808
> > > ---
> > > src/luajit_dbg.py | 1216 ++++++++++++++++-
> > > .../debug-extension-tests.py | 365 +++++
> > > 2 files changed, 1570 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > > index 2edb199a..fd6ca8a5 100644
> > > --- a/src/luajit_dbg.py
> > > +++ b/src/luajit_dbg.py
> 
> <snipped>
> 
> > > + if tp.GetTypeClass() == lldb.eTypeClassStruct:
> > > + len_fields = tp.GetNumberOfFields()
> > > + for n_field in range(len_fields):
> > > + islast = n_field == (len_fields - 1)
> > > + field = tp.GetFieldAtIndex(n_field)
> > > + start_field = field.GetOffsetInBytes()
> > > + if not islast:
> > > + end_field = tp.GetFieldAtIndex(
> > > + n_field + 1
> > > + ).GetOffsetInBytes()
> > > + else:
> > > + end_field = tp.GetByteSize()
> > > + if start_field <= offset and offset < end_field:
> > > + next_name = self.member_by_offset(
> > > + field.GetType(),
> > > + offset - start_field,
> > > + prev_name=field.GetName()
> > > + )
> > > + return '.{field}{suffix}'.format(
> > > + field=field.GetName(),
> > > + suffix=next_name if next_name else ''
> > > + )
> > > + if tp.GetTypeClass() == lldb.eTypeClassArray:
> > >
> >
> > Typo?: elif
> 
> Fixed, thanks!
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 62cd65d5..3b7cf7a1 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -657,7 +657,7 @@ class _LLDBDebugger(Debugger):
> field=field.GetName(),
> suffix=next_name if next_name else ''
> )
> - if tp.GetTypeClass() == lldb.eTypeClassArray:
> + elif tp.GetTypeClass() == lldb.eTypeClassArray:
> # Get array field type.
> target = tp.GetArrayElementType()
> tsize = target.GetByteSize()
> ===================================================================
> 
> <snipped>
> 
> > > +# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> > > +# Non-weak guard. */
> > >
> >
> > Typo: C comment end */
> 
> Removed, thanks!
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 62cd65d5..3b7cf7a1 100644
> @@ -1612,12 +1612,16 @@ IRS = [
> 
> 
> # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> -# Non-weak guard. */
> +# Non-weak guard.
> IRM_C = 0x10
> IRM_A = 0x20
> IRM_L = 0x40
> ===================================================================
> 
> >
> > >
> > > +IRM_C = 0x10
> > > +IRM_A = 0x20
> > > +IRM_L = 0x40
> > > +IRM_S = 0x60
> > > +IRM_W = 0x80
> > > +
> > > +
> > > +# IR operand mode (2 bit).
> > > +IRM = [
> > > + 'ref',
> > > + 'lit',
> > > + 'cst',
> > > + '', # none
> > > +]
> > > +
> > > +
> > > +lj_ir_mode_ = None
> > > +
> > > +
> > > +def lj_ir_mode():
> > > + global lj_ir_mode_
> > > + if lj_ir_mode_:
> > > + return lj_ir_mode_
> > > + lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
> > > + return lj_ir_mode_
> > > +
> > > +
> > > +def ir_left(op):
> > > + return IRM[int(lj_ir_mode()[op] & 3)]
> > >
> >
> > Maybe a binary constant would be clearer? xxx & 0b0011
> 
> I prefer to leave it consistent with the original sources, see
> <src/lj_ir.h>. Also, the bc decoding has the same format.
> 
> >
> > >
> > > +
> > > +
> > > +def ir_right(op):
> > > + return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
> > >
> >
> > Maybe a binary constant would be clearer? (xxx & 0b1100) >> 2
> 
> Ditto.
> 
> >
> > >
> > > +
> > > +
> > > +def ir_mode(op):
> > > + mode = ''
> > > + ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> > >
> >
> > >
> > > + if ir_mode == IRM_C:
> > > + mode = 'C'
> > > + elif ir_mode == IRM_A:
> > > + mode = 'A'
> > > + elif ir_mode == IRM_L:
> > > + mode = 'L'
> > > + elif ir_mode == IRM_S:
> > > + mode = 'S'
> > > + else:
> > > + mode = 'N'
> > >
> >
> > >
> > > + mode += 'W' if ir_mode & IRM_W else ''
> > >
> >
> > Maybe a table with 16 items and comments would be clearer? E.g. return XXX[(lj_ir_mode()[op] & 0b11110000) >> 4]
> > It would contain the invalid values as well.
> > # <flag bits in a comment>
> > XXX[0b0000] = 'NW' # Normal/Ref | !Non-weak guard
> > XXX[0b0001] = 'CW' # Commutative | !Non-weak guard
> > XXX[0b0011] = 'Invalid'
> > ...
> > XXX[0b1000] = 'N' # Normal/Ref | Non-weak guard
> > XXX[0b1001] = 'C' # Commutative | Non-weak guard
> > XXX[0b1011] = 'Invalid'
> > ...
> 
> Rewrote it with a table, see the diff below.
> Also, you helped me notice that the original implementation was
> incorrect (the operand mode bits were still set after the xor). The
> tests were wrong as well; fixed. Thanks!
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index d4f89eb5..32b0cea7 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -1613,11 +1613,15 @@ IRS = [
> 
> # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> # Non-weak guard.
> -IRM_C = 0x10
> -IRM_A = 0x20
> -IRM_L = 0x40
> -IRM_S = 0x60
> -IRM_W = 0x80
> +IRM_BITS_W = 0x80
> +IRM_BITS = {
> + 0x00: 'N',
> + 0x10: 'C',
> + 0x20: 'A',
> + 0x40: 'L',
> + 0x60: 'S',
> +}
> +IRM_BITS_MASK = 0x70
> 
> 
> # IR operand mode (2 bit).
> @@ -1649,19 +1653,10 @@ def ir_right(op):
> 
> 
> def ir_mode(op):
> - mode = ''
> - ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> - if ir_mode == IRM_C:
> - mode = 'C'
> - elif ir_mode == IRM_A:
> - mode = 'A'
> - elif ir_mode == IRM_L:
> - mode = 'L'
> - elif ir_mode == IRM_S:
> - mode = 'S'
> - else:
> - mode = 'N'
> - mode += 'W' if ir_mode & IRM_W else ''
> + irmode = int((lj_ir_mode()[op]))
> + isweak = not bool(irmode & IRM_BITS_W)
> + mode = IRM_BITS[irmode & IRM_BITS_MASK]
> + mode += 'W' if isweak else ''
> return mode
> 
> 
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py
> b/test/tarantool-debugger-tests/debug-extension-tests.py
> index 8e069fe0..f17de27e 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -530,12 +530,12 @@ class TestLJTraceBase(TestCaseBase):
> r'\t*proto: ' + RX_ADDR + r'\n' +
> r'\t*BC: ' + RX_ADDR + r'\n' +
> r'---- TRACE IR\n' +
> - RX_IRN + r'\s+ int SLOAD \[N \] lit: #[12] lit: C?I\n' +
> + RX_IRN + r'\s+ int SLOAD \[L \] lit: #[12] lit: C?I\n' +
> RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
> r' ref: integer 1\n' +
> RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
> r' ref: integer 4\n' +
> - RX_IRN + r'\s+ > --- LOOP \[N \]\s*\n' +
> + RX_IRN + r'\s+ > --- LOOP \[S \]\s*\n' +
> RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
> r' ref: integer 1\n' +
> RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
> ===================================================================
> 
> But I intentionally didn't use binary notation, to stay consistent with
> the original declarations in <src/lj_ir.h>. I've used the mask to strip
> the bits related to the "NonWeak" guard and the operand modes.
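
To illustrate the fix above, here is a minimal standalone sketch (not part of the patch). It assumes the mode byte is built as `(m1 | (m2 << 2) | mode) ^ IRM_W`, which is how I read the `IRM()` macro in <src/lj_ir.h>. The names `encode_mode()`, `decode_mode()` and `decode_mode_old()` are hypothetical, and plain ints stand in for the `lj_ir_mode` array values. The SLOAD and LOOP declarations are taken from the test expectations in the diff above.

```python
# Standalone sketch: plain ints replace the debugger values.
IRM_BITS_W = 0x80
IRM_BITS = {
    0x00: 'N',
    0x10: 'C',
    0x20: 'A',
    0x40: 'L',
    0x60: 'S',
}
IRM_BITS_MASK = 0x70

# IR operand mode (2 bit).
IRM = ['ref', 'lit', 'cst', '']


def encode_mode(kind, left, right):
    # Assumption: mirrors the IRM() macro, which stores the
    # "Non-weak guard" bit inverted.
    return (left | (right << 2) | kind) ^ IRM_BITS_W


def decode_mode_old(irmode):
    # First version of the patch: after the xor the operand mode
    # bits are still set, so the equality checks may fail.
    ir_mode = irmode ^ IRM_BITS_W
    mode = 'N'
    for bits, name in ((0x10, 'C'), (0x20, 'A'), (0x40, 'L'), (0x60, 'S')):
        if ir_mode == bits:
            mode = name
            break
    return mode + ('W' if ir_mode & IRM_BITS_W else '')


def decode_mode(irmode):
    # Fixed version: mask out everything except the mode bits.
    isweak = not bool(irmode & IRM_BITS_W)
    return IRM_BITS[irmode & IRM_BITS_MASK] + ('W' if isweak else '')


sload = encode_mode(0x40, 1, 1)  # (L, lit, lit)
loop = encode_mode(0x60, 3, 3)   # (S, none, none)
add = encode_mode(0x10, 0, 0)    # (C, ref, ref)
print(decode_mode_old(sload), decode_mode(sload))
print(decode_mode_old(loop), decode_mode(loop))
print(decode_mode_old(add), decode_mode(add))
print(IRM[sload & 3], IRM[sload >> 2 & 3])
```

Only instructions whose operands are both `ref` (encoded as 0) decode the same way in both versions, which is consistent with the ADD lines staying untouched in the test diff.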
> 
> <snipped>
> 
> > > +# Don't use *[ to be compatible with Python 2.
> > > +REGISTERS = {'x64': [
> > > + 'rax',
> > > + 'rcx',
> > > + 'rdx',
> > > + 'rbx',
> > > + 'rsp',
> > > + 'rbp',
> > > + 'rsi',
> > > + 'rdi',
> > > +] + [
> > > + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> > > +] + [
> > > + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> > > +], 'arm64': [
> > > + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> > > +] + ['sp'] + [ # x31
> > > + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> > > +]}
> > >
> >
> > It seems, the ‘arm64’ registers are missed.
> 
> Actually no, but I understand the confusion.
> Reformatted as follows:
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 32b0cea7..a79caad0 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -1728,24 +1728,29 @@ IRFPMS = [
> 
> 
> # Don't use *[ to be compatible with Python 2.
> -REGISTERS = {'x64': [
> - 'rax',
> - 'rcx',
> - 'rdx',
> - 'rbx',
> - 'rsp',
> - 'rbp',
> - 'rsi',
> - 'rdi',
> -] + [
> - 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> -] + [
> - 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> -], 'arm64': [
> - 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> -] + ['sp'] + [ # x31
> - 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> -]}
> +REGISTERS = {
> + 'x64': [
> + 'rax',
> + 'rcx',
> + 'rdx',
> + 'rbx',
> + 'rsp',
> + 'rbp',
> + 'rsi',
> + 'rdi',
> + ] + [
> + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> + ] + [
> + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> + ],
> + 'arm64': [
> + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> + ] + [
> + 'sp' # x31
> + ] + [
> + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> + ]
> +}
> 
> 
> IR_CALLS = [
> ===================================================================
> 
> >
> 
> <snipped>
> 
> > > +def litname_xload(mode):
> > > + flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
> > >
> >
> > Do we need a range check as in litname_bufhdr()?
> 
> I prefer raising an error if something goes wrong here (invalid IR or
> incorrect extension implementation).
> 
> >
> > >
> > > + return flags[mode]
> > > +
> > > +
> > > +def litname_conv(mode):
> > >
> >
> > Do we need some range checking here?
> 
> Ditto.
> 
> >
> > >
> 
> <snipped>
> 
> > > +
> > > +
> > > +def irt_ismarked(t):
> > > + return t['irt'] & IRT_MARK
> > >
> >
> > I propose an explicit bool cast (!= 0) here and below.
> 
> Added:
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index a79caad0..6b0827d9 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -1978,15 +1978,15 @@ def tref_ref(tr):
> 
> 
> def irt_ismarked(t):
> - return t['irt'] & IRT_MARK
> + return bool(t['irt'] & IRT_MARK)
> 
> 
> def irt_isphi(t):
> - return t['irt'] & IRT_ISPHI
> + return bool(t['irt'] & IRT_ISPHI)
> 
> 
> def irt_isguard(t):
> - return t['irt'] & IRT_GUARD
> + return bool(t['irt'] & IRT_GUARD)
> 
> 
> def irt_toitype(irt):
> ===================================================================
> 
> >
> > >
> > > +
> > > +
> > > +def irt_isphi(t):
> > > + return t['irt'] & IRT_ISPHI
> > > +
> > > +
> > > +def irt_isguard(t):
> > > + return t['irt'] & IRT_GUARD
> > > +
> > > +
> 
> <snipped>
> 
> > >
> > > --
> > > 2.54.0
> > >
> 
> --
> Best regards,
> Sergey Kaplun
>

[-- Attachment #2: Type: text/html, Size: 15234 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-29 19:20     ` Sergey Kaplun via Tarantool-patches
  2026-06-30 11:48       ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-29 19:20 UTC (permalink / raw)
  To: Evgeniy Temirgaleev; +Cc: tarantool-patches

Hi, Evgeniy!
Thanks for the review!
Fixed your comments and force-pushed the branch.

On 29.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
> 
> Thanks for the patch!
> Please, see the comments below.
> 
> --
> Best regards,
> Evgeniy Temirgaleev
> 
> > 
> > From: Sergey Kaplun <skaplun@tarantool.org>
> > To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> > Date: Thursday, June 25, 2026 11:29 PM +03:00
> > This patch extends dumped information for the given cdata object. Now
> > it resolves the given `CType` and prints it in the format similar to the
> > `__tostring` metamethod. The `lj-ctype` command is introduced to dump
> > this information where there is only the `CType` pointer but no cdata
> > associated with it.
> > 
> > `__or__` and `__ror__` metamethods are monkey-patched for the LLDB value
> > object. In `__sub__` metamethod for LLDB pointers `GetPointeeType()` is
> > used to get the pointee type instead of the incorrectly used
> > `GetDereferencedType()` which always returns the same type with size 8.
> > Casting from negative values to the unsigned values is supported to
> > check `CTF_UCHAR`.
> > 
> > Part of tarantool/tarantool#4808
> > ---
> > src/luajit_dbg.py | 333 +++++++++++++++++-
> > .../debug-extension-tests.py | 208 ++++++++++-
> > 2 files changed, 535 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > index fd6ca8a5..62cd65d5 100644
> > --- a/src/luajit_dbg.py
> > +++ b/src/luajit_dbg.py
> > @@ -386,6 +386,8 @@ class _LLDBDebugger(Debugger):
> > pack_flag = '<q'
> > else:
> > pack_flag = '<Q'
> > + # Cast to unsigned.
> > 
> 
> Would s/unsigned/uint64_t/ be clearer?

Rephrased as you suggested. Also, lowercased the value to be consistent
with other hexadecimal values.

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 6b0827d9..0769f2ee 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -386,8 +386,8 @@ class _LLDBDebugger(Debugger):
             pack_flag = '<q'
         else:
             pack_flag = '<Q'
-            # Cast to unsigned.
-            raw_value &= 0xFFFFFFFFFFFFFFFF
+            # Cast to 64-bit unsigned value in Python.
+            raw_value &= 0xffffffffffffffff
         raw_data = struct.pack(pack_flag, raw_value)
         sbdata = lldb.SBData()
         sbdata.SetData(
===================================================================
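
The masking step can be checked in isolation. A minimal sketch, assuming only the standard `struct` module; `to_u64()` is a hypothetical helper name, not a function from the patch:

```python
import struct


def to_u64(raw_value):
    # Python ints are unbounded, so a negative value has to be
    # wrapped into the 64-bit unsigned range by hand.
    return raw_value & 0xffffffffffffffff


# Without the mask, struct.pack('<Q', -1) raises struct.error.
packed = struct.pack('<Q', to_u64(-1))
print(repr(packed))
```

The mask is a no-op for values that already fit into 64 bits, so it is safe to apply unconditionally on the unsigned path.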

> 
> > 
> > + raw_value &= 0xFFFFFFFFFFFFFFFF
> > raw_data = struct.pack(pack_flag, raw_value)
> > sbdata = lldb.SBData()
> > sbdata.SetData(

<snipped>

> > +
> > +
> > +def ctinfo(ct, flags):
> > 
> 
> May we name this function ‘CTINFO’ as in ‘lj_ctype.h’? Or leave a comment with an original name for quick grep.

Added a comment since the upper case is used for constants.

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 0769f2ee..de83a2b5 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1427,6 +1427,7 @@ def ctype_attrib(info):
     return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
 
 
+# Implementation of the `CTINFO()` macro.
 def ctinfo(ct, flags):
     return (tou32(ct) << CTSHIFT_NUM) + flags
 
===================================================================
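
For readers without the LuaJIT sources at hand, here is a small sketch of how the packed `info` word round-trips through these helpers. The constant values are assumptions meant to mirror <src/lj_ctype.h>, and `tou32()` is a stand-in for the extension helper (here it just wraps to 32 bits):

```python
# Assumed values, mirroring <src/lj_ctype.h>.
CT_NUM = 0
CT_ARRAY = 3
CTSHIFT_NUM = 28
CTMASK_NUM = 0xf0000000
CTMASK_CID = 0x0000ffff
CTF_COMPLEX = 0x04000000


def tou32(value):
    # Stand-in for the extension helper: wrap to 32 bits.
    return value & 0xffffffff


# Implementation of the `CTINFO()` macro.
def ctinfo(ct, flags):
    return (tou32(ct) << CTSHIFT_NUM) + flags


def ctype_type(info):
    return info >> CTSHIFT_NUM


def ctype_cid(info):
    return info & CTMASK_CID


def ctype_iscomplex(info):
    return (info & (CTMASK_NUM | CTF_COMPLEX)) == \
           ctinfo(CT_ARRAY, CTF_COMPLEX)


# A complex type is an array with the CTF_COMPLEX flag; the low
# 16 bits hold the child type ID (9 is an arbitrary value here).
info = ctinfo(CT_ARRAY, CTF_COMPLEX) + 9
print(ctype_type(info), ctype_cid(info), ctype_iscomplex(info))
```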


> 
> > 
> > + return (tou32(ct) << CTSHIFT_NUM) + flags
> > +
> > +
> > +def ctype_isptr(info):
> > + return ctype_type(info) == CT_PTR
> > +
> > +
> > +def ctype_iscomplex(info):
> > + return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY,
> > CTF_COMPLEX)
> > +
> > +
> > +def ctype_isinteger(info):
> > + return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)
> > +
> > +
> > +def ctype_isrefarray(info):
> > + return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
> > + ctinfo(CT_ARRAY, 0)
> > +
> > +
> > +def ctype_cid(info):
> > 
> 
> Let’s put these function definitions in the ‘lj_ctype.h’ order?
> May we group the definitions by corresponding C files also? # lj_ctype.h … # lj_cdata.h … # lj_xxx.c …


Sorted as you suggested. The order is the following:
* lj_ctype.h
* lj_cdata.h -- `cdata_getptr()`
* lj_obj.h -- `cdataptr()`
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index de83a2b5..183dda3b 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1362,14 +1362,6 @@ def lightudV(tv):
 # FFI.
 
 
-def ctype_ctsG(g):
-    return mref('CTState *', g['ctype_state'])
-
-
-def ctype_get(cts, id):
-    return dbg.address(cts['tab'][id])
-
-
 # Externally visible types.
 CT_NUM = 0  # Integer or floating-point numbers.
 CT_STRUCT = 1  # Struct or union.
@@ -1419,46 +1411,55 @@ DWORDSZ = 4
 QWORDSZ = 8
 
 
+# Implementation of the `CTINFO()` macro.
+def ctinfo(ct, flags):
+    return (tou32(ct) << CTSHIFT_NUM) + flags
+
+
 def ctype_type(info):
     return info >> CTSHIFT_NUM
 
 
-def ctype_attrib(info):
-    return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
+def ctype_cid(info):
+    return info & CTMASK_CID
 
 
-# Implementation of the `CTINFO()` macro.
-def ctinfo(ct, flags):
-    return (tou32(ct) << CTSHIFT_NUM) + flags
+def ctype_attrib(info):
+    return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
 
 
 def ctype_isptr(info):
     return ctype_type(info) == CT_PTR
 
 
-def ctype_iscomplex(info):
-    return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY, CTF_COMPLEX)
-
-
 def ctype_isinteger(info):
     return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)
 
 
+def ctype_iscomplex(info):
+    return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY, CTF_COMPLEX)
+
+
 def ctype_isrefarray(info):
     return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
            ctinfo(CT_ARRAY, 0)
 
 
-def ctype_cid(info):
-    return info & CTMASK_CID
-
-
 def ctype_child(cts, ctype):
     return ctype_get(cts, ctype_cid(ctype['info']))
 
 
-def cdataptr(cd):
-    return dbg.cast('void *', (cd + 1))
+def ctype_ctsG(g):
+    return mref('CTState *', g['ctype_state'])
+
+
+def ctype_get(cts, id):
+    return dbg.address(cts['tab'][id])
+
+
+# Get C type ID for a C type.
+def ctype_typeid(cts, ct):
+    return ct - cts['tab']
 
 
 def cdata_getptr(p, size):
@@ -1468,9 +1469,8 @@ def cdata_getptr(p, size):
         return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
 
 
-# Get C type ID for a C type.
-def ctype_typeid(cts, ct):
-    return ct - cts['tab']
+def cdataptr(cd):
+    return dbg.cast('void *', (cd + 1))
 
 
 # JIT engine.
===================================================================
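
As a side note, `ctype_get()` and `ctype_typeid()` are inverses of each other only when pointer subtraction is measured in elements of the pointee type, which is what the `__sub__` fix from the commit message is about. A toy model of that arithmetic follows; the `Ptr` class is hypothetical, the helpers take `tab` directly instead of `cts`, and the element size of 24 is illustrative, not a claim about `sizeof(CType)`:

```python
class Ptr(object):
    """Toy typed pointer: an address plus the pointee size."""

    def __init__(self, addr, pointee_size):
        self.addr = addr
        self.pointee_size = pointee_size

    def __add__(self, n):
        return Ptr(self.addr + n * self.pointee_size, self.pointee_size)

    def __sub__(self, other):
        if isinstance(other, Ptr):
            # Pointer difference is measured in elements, so the
            # byte distance is divided by the pointee size.
            return (self.addr - other.addr) // self.pointee_size
        return Ptr(self.addr - other * self.pointee_size, self.pointee_size)


def ctype_get(tab, id):
    return tab + id


# Get C type ID for a C type.
def ctype_typeid(tab, ct):
    return ct - tab


tab = Ptr(0x1000, 24)
ct = ctype_get(tab, 9)
print(hex(ct.addr), ctype_typeid(tab, ct))
```

If the pointee size were wrongly reported as 8 for every type, the same subtraction would return 27 instead of 9 in this model.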

> 
> > 
> > + return info & CTMASK_CID
> > +
> > +
> > +def ctype_child(cts, ctype):
> > + return ctype_get(cts, ctype_cid(ctype['info']))
> > +
> > +
> > +def cdataptr(cd):
> > + return dbg.cast('void *', (cd + 1))
> > +
> > +
> > +def cdata_getptr(p, size):
> > + if LJ_64 and size == 4:
> > + return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
> > + else:
> > 
> 
> assert for size == 8 ?

Added, since otherwise it may lead to an incorrect (truncated) pointer
result if we ever see 16-bit pointers. Also, added support for 32-bit
systems (not LJ_64).

================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 183dda3b..de22d450 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1463,9 +1463,10 @@ def ctype_typeid(cts, ct):
 
 
 def cdata_getptr(p, size):
-    if LJ_64 and size == 4:
+    if (LJ_64 and size == 4) or not LJ_64:
         return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
     else:
+        assert size == 8, 'incorrect pointer size'
         return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
 
 
================================================================
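
The branch logic can be tried outside a debugger. This is a sketch under stated assumptions: a little-endian byte string stands in for the memory behind `p`, the `lj_64` parameter replaces the `LJ_64` global, and plain ints are returned instead of `void *` values. The 4-byte case on a 64-bit build corresponds to the `void * __ptr32` test case in the patch.

```python
import struct


def cdata_getptr(buf, size, lj_64=True):
    # Read a 32-bit pointer for __ptr32 on 64-bit builds and for
    # any pointer on 32-bit builds; otherwise expect 8 bytes.
    if (lj_64 and size == 4) or not lj_64:
        return struct.unpack_from('<I', buf)[0]
    else:
        assert size == 8, 'incorrect pointer size'
        return struct.unpack_from('<Q', buf)[0]


mem = struct.pack('<Q', 0x00007fffdeadbeef)
print(hex(cdata_getptr(mem, 8)))
print(hex(cdata_getptr(mem, 4)))
```

Any other size on a 64-bit build now fails loudly with an `AssertionError` instead of silently reading eight bytes.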

> 
> > 
> > + return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
> > +
> > +

<snipped>

> > +def ctype_prepnum(ctypestr, info, size):
> > 
> 
> The function prototype differs from the one in lj_ctype.c (static void ctype_prepnum(CTRepr *ctr, uint32_t n)).
> It seems you moved some of the ctype_repr() code here. Let’s add a comment about it?

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index de22d450..28cbe97d 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -2479,6 +2479,8 @@ def ctype_preptype(cts, ctypestr, ctype, qual, tp):
     return ctypestr
 
 
+# Partially moved the code from `ctype_repr()` here to make it
+# more readable.
 def ctype_prepnum(ctypestr, info, size):
     if info & CTF_BOOL:
         ctypestr = ctype_preplit(ctypestr, 'bool')
===================================================================

<snipped>

> > +def dump_ctype(ct):
> > 
> 
> Also, it seems the code would be easier to read if it were possible to distinguish between the ported functions and the extension's own ones. Maybe by using the ‘dbg_’ prefix for the extension function names.

I suppose this refactoring can be done in a separate issue, since it
is related to all functions. Also, `dbg` is already used for the
instance of the corresponding class. The `dump_` prefix looks common for
all dumpers of our extension.


<skipped>

> > +class TestLJCTypeBase(TestCaseBase):
> > + location = 'lj_cf_ffi_new'
> > + extension_cmds = (
> > + # Load `ct`. Skip inlined functions for LLDB.
> > 
> 
> The extension command set is common for GDB and LLDB. Do we skip for GDB as well?

For GDB this function isn't inlined, but these n-s are harmless.
Adjusted the comment.

===================================================================
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index f17de27e..71b763d2 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -1033,7 +1033,9 @@ class TestLJCTypeFunc(TestCaseBase):
 class TestLJCTypeBase(TestCaseBase):
     location = 'lj_cf_ffi_new'
     extension_cmds = (
-        # Load `ct`. Skip inlined functions for LLDB.
+        # Load `ct`. Skip inlined functions for LLDB. The skip is
+        # harmless for GDB since we are still in the body of the
+        # function.
         'n\n'
         'n\n'
         'n\n'
===================================================================


> 
> > 
> > + 'n\n'
> > + 'n\n'
> > + 'n\n'
> > + 'n\n'
> > + 'n\n'
> > + 'n\n'
> > + 'lj-ctype ct\n'
> > + )

<snipped>

> > --
> > 2.54.0
> >

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-29 19:20     ` Sergey Kaplun via Tarantool-patches
@ 2026-06-30 11:48       ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-30 12:27         ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-30 11:48 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 9585 bytes --]

Hi, Sergey!

LGTM. And please see my note about the renaming issue.

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> Date: Monday, June 29, 2026 10:21 PM +03:00
> Hi, Evgeniy!
> Thanks for the review!
> Fixed your comments and force-pushed the branch.
> 
> On 29.06.26, Evgeniy Temirgaleev wrote:
> > Hi, Sergey!
> >
> > Thanks for the patch!
> > Please, see the comments below.
> >
> > --
> > Best regards,
> > Evgeniy Temirgaleev
> >
> > >
> > > From: Sergey Kaplun <skaplun@tarantool.org>
> > > To: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > > Cc: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org>
> > > Date: Thursday, June 25, 2026 11:29 PM +03:00
> > > This patch extends dumped information for the given cdata object. Now
> > > it resolves the given `CType` and prints it in the format similar to the
> > > `__tostring` metamethod. The `lj-ctype` command is introduced to dump
> > > this information where there is only the `CType` pointer but no cdata
> > > associated with it.
> > >
> > > `__or__` and `__ror__` metamethods are monkey-patched for the LLDB value
> > > object. In `__sub__` metamethod for LLDB pointers `GetPointeeType()` is
> > > used to get the pointee type instead of the incorrectly used
> > > `GetDereferencedType()` which always returns the same type with size 8.
> > > Casting from negative values to the unsigned values is supported to
> > > check `CTF_UCHAR`.
> > >
> > > Part of tarantool/tarantool#4808
> > > ---
> > > src/luajit_dbg.py | 333 +++++++++++++++++-
> > > .../debug-extension-tests.py | 208 ++++++++++-
> > > 2 files changed, 535 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > > index fd6ca8a5..62cd65d5 100644
> > > --- a/src/luajit_dbg.py
> > > +++ b/src/luajit_dbg.py
> > > @@ -386,6 +386,8 @@ class _LLDBDebugger(Debugger):
> > > pack_flag = '<q'
> > > else:
> > > pack_flag = '<Q'
> > > + # Cast to unsigned.
> > >
> >
> > Would s/unsigned/uint64_t/ be clearer?
> 
> Rephrased as you suggested. Also, lowercased the value to be consistent
> with the other hexadecimal values.
> 

Thanks!

> 
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 6b0827d9..0769f2ee 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -386,8 +386,8 @@ class _LLDBDebugger(Debugger):
> pack_flag = '<q'
> else:
> pack_flag = '<Q'
> - # Cast to unsigned.
> - raw_value &= 0xFFFFFFFFFFFFFFFF
> + # Cast to 64-bit unsigned value in Python.
> + raw_value &= 0xffffffffffffffff
> raw_data = struct.pack(pack_flag, raw_value)
> sbdata = lldb.SBData()
> sbdata.SetData(
> ===================================================================
> 
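The masking in the hunk above can be illustrated in isolation. This is a minimal sketch, assuming plain Python integers and only the unsigned `'<Q'` format; `to_u64_bytes` is a hypothetical helper name, not part of `luajit_dbg.py`:

```python
import struct


def to_u64_bytes(raw_value):
    # Hypothetical helper: reinterpret a possibly negative Python int
    # as a 64-bit unsigned value, then pack it as little-endian bytes.
    raw_value &= 0xffffffffffffffff
    return struct.pack('<Q', raw_value)


# Without the mask, struct.pack('<Q', -1) raises struct.error.
print(to_u64_bytes(-1).hex())
```

Since Python integers are unbounded, the explicit mask is what gives the value its two's-complement 64-bit representation.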
> >
> > >
> > > + raw_value &= 0xFFFFFFFFFFFFFFFF
> > > raw_data = struct.pack(pack_flag, raw_value)
> > > sbdata = lldb.SBData()
> > > sbdata.SetData(
> 
> <snipped>
> 
> > > +
> > > +
> > > +def ctinfo(ct, flags):
> > >
> >
> > May we name this function ‘CTINFO’ as in ‘lj_ctype.h’? Or leave a
> > comment with the original name for quick grep.
> 
> Added a comment since the upper case is used for constants.
> 

Thanks!

> 
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 0769f2ee..de83a2b5 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -1427,6 +1427,7 @@ def ctype_attrib(info):
> return (info >> CTSHIFT_ATTRIB) & CTMASK_ATTRIB
> 
> 
> +# Implementation of the `CTINFO()` macro.
> def ctinfo(ct, flags):
> return (tou32(ct) << CTSHIFT_NUM) + flags
> 
> ===================================================================
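To show how the `ctinfo()` packing and the `ctype_is*()` predicates from the patch fit together, here is a minimal standalone sketch. The constant values are assumed to mirror `lj_ctype.h` and should be treated as illustrative; `tou32()` from the patch is replaced with a plain 32-bit mask:

```python
# Assumed to mirror lj_ctype.h; illustrative, not authoritative.
CTSHIFT_NUM = 28
CTMASK_NUM = 0xf0000000
CT_NUM, CT_STRUCT, CT_PTR, CT_ARRAY = 0, 1, 2, 3
CTF_BOOL = 0x08000000
CTF_FP = 0x04000000
CTF_COMPLEX = 0x04000000


def ctinfo(ct, flags):
    # Counterpart of the `CTINFO()` macro: the type number lives in
    # the top bits, the flags in the remaining ones.
    return ((ct & 0xffffffff) << CTSHIFT_NUM) + flags


def ctype_type(info):
    return info >> CTSHIFT_NUM


def ctype_isptr(info):
    return ctype_type(info) == CT_PTR


def ctype_isinteger(info):
    # A number type with neither the bool nor the FP flag set.
    return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)


def ctype_iscomplex(info):
    mask = CTMASK_NUM | CTF_COMPLEX
    return (info & mask) == ctinfo(CT_ARRAY, CTF_COMPLEX)
```

Each predicate masks out the bits it cares about and compares the result with a freshly built `ctinfo()` value, which is the same shape as the C macros.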
> 
> 
> >
> > >
> > > + return (tou32(ct) << CTSHIFT_NUM) + flags
> > > +
> > > +
> > > +def ctype_isptr(info):
> > > + return ctype_type(info) == CT_PTR
> > > +
> > > +
> > > +def ctype_iscomplex(info):
> > > + return (info & (CTMASK_NUM | CTF_COMPLEX)) == ctinfo(CT_ARRAY, CTF_COMPLEX)
> > > +
> > > +
> > > +def ctype_isinteger(info):
> > > + return (info & (CTMASK_NUM | CTF_BOOL | CTF_FP)) == ctinfo(CT_NUM, 0)
> > > +
> > > +
> > > +def ctype_isrefarray(info):
> > > + return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
> > > + ctinfo(CT_ARRAY, 0)
> > > +
> > > +
> > > +def ctype_cid(info):
> > >
> >
> > Let’s put these function definitions in the ‘lj_ctype.h’ order?
> > May we group the definitions by corresponding C files also? # lj_ctype.h
> > … # lj_cdata.h … # lj_xxx.c …
> 
> 
> Sorted as you suggested. The sorting is the following:
> * lj_ctype.h
> * lj_cdata.h -- `cdata_getptr()`
> * lj_obj.h -- `cdataptr()`
> 

Thanks! We could add a comment with the file name before each block to improve readability a bit more. Feel free to ignore.

> 
> 
> 

<snipped>

> 
> 
> >
> > >
> > > + return info & CTMASK_CID
> > > +
> > > +
> > > +def ctype_child(cts, ctype):
> > > + return ctype_get(cts, ctype_cid(ctype['info']))
> > > +
> > > +
> > > +def cdataptr(cd):
> > > + return dbg.cast('void *', (cd + 1))
> > > +
> > > +
> > > +def cdata_getptr(p, size):
> > > + if LJ_64 and size == 4:
> > > + return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
> > > + else:
> > >
> >
> > assert for size == 8 ?
> 
> Added, since otherwise it may lead to an incorrect (truncated) pointer
> result if we ever see 16-bit pointers. Also, added support for 32-bit
> systems (not LJ_64) as well.
> 

Thanks!

> 
> 
> ================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 183dda3b..de22d450 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -1463,9 +1463,10 @@ def ctype_typeid(cts, ct):
> 
> 
> def cdata_getptr(p, size):
> - if LJ_64 and size == 4:
> + if (LJ_64 and size == 4) or not LJ_64:
> return dbg.cast('void *', dbg.cast('uint32_t *', p)[0])
> else:
> + assert size == 8, 'incorrect pointer size'
> return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
> 
> 
> ================================================================
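The size dispatch in `cdata_getptr()` can be tried outside a debugger. This is a minimal sketch, assuming the cdata payload is available as little-endian bytes; `read_ptr` and its `lj_64` parameter are hypothetical stand-ins for the debugger-backed function and the `LJ_64` global. Note that `(LJ_64 and size == 4) or not LJ_64` is logically equivalent to `size == 4 or not LJ_64`, which is the form used here:

```python
import struct


def read_ptr(buf, size, lj_64=True):
    # Hypothetical stand-in for `cdata_getptr()`: `buf` holds the raw
    # cdata payload as little-endian bytes.
    if size == 4 or not lj_64:
        # 32-bit pointer, or a 32-bit build in general.
        return struct.unpack_from('<I', buf)[0]
    assert size == 8, 'incorrect pointer size'
    return struct.unpack_from('<Q', buf)[0]


payload = struct.pack('<Q', 0x1122334455667788)
print(hex(read_ptr(payload, 8)))
print(hex(read_ptr(payload, 4)))
```

With the assertion in place, an unexpected size such as 2 fails loudly on a 64-bit build instead of silently reading 8 bytes.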
> 
> >
> > >
> > > + return dbg.cast('void *', dbg.cast('uint64_t *', p)[0])
> > > +
> > > +
> 
> <snipped>
> 
> > > +def ctype_prepnum(ctypestr, info, size):
> > >
> >
> > The function prototype differs from the one in lj_ctype.c
> > (static void ctype_prepnum(CTRepr *ctr, uint32_t n)).
> > It seems you moved some of the ctype_repr() code here. Let’s add a comment about it?
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index de22d450..28cbe97d 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -2479,6 +2479,8 @@ def ctype_preptype(cts, ctypestr, ctype, qual, tp):
> return ctypestr
> 
> 
> +# Partially moved the code from `ctype_repr()` here to make it
> +# more readable.
> def ctype_prepnum(ctypestr, info, size):
> if info & CTF_BOOL:
> ctypestr = ctype_preplit(ctypestr, 'bool')
> ===================================================================
> 

Thanks!

> 
> 
> <snipped>
> 
> > > +def dump_ctype(ct):
> > >
> >
> > Also, it seems the code would be easier to read if it were possible
> > to distinguish between ported functions and the extension's own ones. Maybe
> > by using the ‘dbg_’ prefix for extension function names.
> 
> I suppose this refactoring can be done in a separate issue, since it
> is related to all functions. Also, `dbg` is already used for the
> instance of the corresponding class. The `dump_` prefix looks common for
> all dumpers of our extension.
> 

Agreed. And I vote for this patch.
Maybe there could be several documented prefixes. It would be more verbose, but I think it would be very helpful in the long run for maintaining the extension, making it quick to distinguish a LuaJIT-ported routine, e.g. `ctype_preplit`, from an extension routine, e.g. `cdata_val_int64`.
Can you suggest a prefix name that works for you now? Maybe we can start naming with it at this point, what do you think?

> 
> 
> 
> <skipped>
> 
> > > +class TestLJCTypeBase(TestCaseBase):
> > > + location = 'lj_cf_ffi_new'
> > > + extension_cmds = (
> > > + # Load `ct`. Skip inlined functions for LLDB.
> > >
> >
> > The extension command set is common for GDB and LLDB. Do we skip for
> > GDB also?
> 
> For GDB this function isn't inlined, but these extra `n` steps are harmless.
> Adjusted the comment.
> 

Thanks!

> 
> 
> ===================================================================
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
> index f17de27e..71b763d2 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -1033,7 +1033,9 @@ class TestLJCTypeFunc(TestCaseBase):
> class TestLJCTypeBase(TestCaseBase):
> location = 'lj_cf_ffi_new'
> extension_cmds = (
> - # Load `ct`. Skip inlined functions for LLDB.
> + # Load `ct`. Skip inlined functions for LLDB. The skip is
> + # harmless for GDB since we are still in the body of the
> + # function.
> 'n\n'
> 'n\n'
> 'n\n'
> ===================================================================
> 
> 
> >
> > >
> > > + 'n\n'
> > > + 'n\n'
> > > + 'n\n'
> > > + 'n\n'
> > > + 'n\n'
> > > + 'n\n'
> > > + 'lj-ctype ct\n'
> > > + )
> 
> <snipped>
> 
> > > --
> > > 2.54.0
> > >
> 
> --
> Best regards,
> Sergey Kaplun
> 

--
Best regards,
Evgeniy Temirgaleev

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 11:48       ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-30 12:27         ` Sergey Kaplun via Tarantool-patches
  2026-06-30 13:07           ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-30 12:27 UTC (permalink / raw)
  To: Evgeniy Temirgaleev; +Cc: tarantool-patches

Evgeniy,
Thanks for the answer, see my thoughts below.

On 30.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
> 
> LGTM. Please also see my note about the renaming issue.
> 
> > 
> > From: Sergey Kaplun <skaplun@tarantool.org>
> > To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> > Date: Monday, June 29, 2026 10:21 PM +03:00

<snipped>

> > > > +
> > > > +def ctype_isrefarray(info):
> > > > + return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
> > > > + ctinfo(CT_ARRAY, 0)
> > > > +
> > > > +
> > > > +def ctype_cid(info):
> > > >
> > >
> > > > Let’s put these function definitions in the ‘lj_ctype.h’ order?
> > > > May we group the definitions by corresponding C files also? # lj_ctype.h
> > > > … # lj_cdata.h … # lj_xxx.c …
> > 
> > 
> > Sorted as you suggested. The sorting is the following:
> > * lj_ctype.h
> > * lj_cdata.h -- `cdata_getptr()`
> > * lj_obj.h -- `cdataptr()`
> > 
> 
> > Thanks! We could add a comment with the file name before each block to improve readability a bit more. Feel free to ignore.

I suppose that ctags or something similar will do the trick. Ignoring since you
don't insist.

<snipped>

> > > >
> > >
> > > > Also, it seems the code would be easier to read if it were possible
> > > > to distinguish between ported functions and the extension's own ones. Maybe
> > > > by using the ‘dbg_’ prefix for extension function names.
> > 
> > > I suppose this refactoring can be done in a separate issue, since it
> > > is related to all functions. Also, `dbg` is already used for the
> > > instance of the corresponding class. The `dump_` prefix looks common for
> > > all dumpers of our extension.
> > 
> 
> > Agreed. And I vote for this patch.
> > Maybe there could be several documented prefixes. It would be more verbose, but I think it would be very helpful in the long run for maintaining the extension, making it quick to distinguish a LuaJIT-ported routine, e.g. `ctype_preplit`, from an extension routine, e.g. `cdata_val_int64`.
> > Can you suggest a prefix name that works for you now? Maybe we can start naming with it at this point, what do you think?

For now we have several "prefixes":
* `dbg.` for the debugger implementation defined helpers.
* `dump_` for the dumper function of any kind (even helpers).
* `LJ*` for the classes to be the entry points for our extensions.

I'm not sure that ctype_preplit -> dump_ctype_preplit helps to find the
original logic for this dumper in the LuaJIT source code. So, I'm open
to ideas ;).

I would prefer the renaming to be done separately, so that we don't end
up with one part of the naming in this patch series and another part in
the next one. That would be inconsistent.

<snipped>

> 
> --
> Best regards,
> Evgeniy Temirgaleev

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 12:27         ` Sergey Kaplun via Tarantool-patches
@ 2026-06-30 13:07           ` Evgeniy Temirgaleev via Tarantool-patches
  2026-06-30 14:03             ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-30 13:07 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Sergey, thanks for the answer!

> I'm not sure that ctype_preplit -> dump_ctype_preplit helps to find the
> original logic for this dumper in the LuaJIT source code. So, I'm open
> to ideas ;).

I think we can keep the original LuaJIT names for ported routines (ctype_preplit stays ctype_preplit) and use different names for routines which are not present in LuaJIT. No more ideas at the moment :)

I can’t find cdata_val_int64 or cdata_val_complex in the LuaJIT code. Is the ‘dump_’ prefix missing?

--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> Date: Tuesday, June 30, 2026 3:27 PM +03:00
> Evgeniy,
> Thanks for the answer, see my thoughts below.
> 
> On 30.06.26, Evgeniy Temirgaleev wrote:
> > Hi, Sergey!
> >
> > LGTM. Please also see my note about the renaming issue.
> >
> > >
> > > From: Sergey Kaplun <skaplun@tarantool.org>
> > > To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > > Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> > > Date: Monday, June 29, 2026 10:21 PM +03:00
> 
> <snipped>
> 
> > > > > +
> > > > > +def ctype_isrefarray(info):
> > > > > + return (info & (CTMASK_NUM | CTF_VECTOR | CTF_COMPLEX)) == \
> > > > > + ctinfo(CT_ARRAY, 0)
> > > > > +
> > > > > +
> > > > > +def ctype_cid(info):
> > > > >
> > > >
> > > > > Let’s put these function definitions in the ‘lj_ctype.h’ order?
> > > > > May we group the definitions by corresponding C files also? # lj_ctype.h
> > > > > … # lj_cdata.h … # lj_xxx.c …
> > >
> > >
> > > Sorted as you suggested. The sorting is the following:
> > > * lj_ctype.h
> > > * lj_cdata.h -- `cdata_getptr()`
> > > * lj_obj.h -- `cdataptr()`
> > >
> >
> > > Thanks! We could add a comment with the file name before each block to improve
> > > readability a bit more. Feel free to ignore.
> 
> I suppose that ctags or something similar will do the trick. Ignoring since you
> don't insist.
> 
> <snipped>
> 
> > > > >
> > > >
> > > > > Also, it seems the code would be easier to read if it were possible
> > > > > to distinguish between ported functions and the extension's own ones. Maybe
> > > > > by using the ‘dbg_’ prefix for extension function names.
> > >
> > > > I suppose this refactoring can be done in a separate issue, since it
> > > > is related to all functions. Also, `dbg` is already used for the
> > > > instance of the corresponding class. The `dump_` prefix looks common for
> > > > all dumpers of our extension.
> > >
> >
> > > Agreed. And I vote for this patch.
> > > Maybe there could be several documented prefixes. It would be more
> > > verbose, but I think it would be very helpful in the long run for
> > > maintaining the extension, making it quick to distinguish a LuaJIT-ported
> > > routine, e.g. `ctype_preplit`, from an extension routine, e.g. `cdata_val_int64`.
> > > Can you suggest a prefix name that works for you now? Maybe we can start
> > > naming with it at this point, what do you think?
> 
> For now we have several "prefixes":
> * `dbg.` for the debugger implementation defined helpers.
> * `dump_` for the dumper function of any kind (even helpers).
> * `LJ*` for the classes to be the entry points for our extensions.
> 
> I'm not sure that ctype_preplit -> dump_ctype_preplit helps to find the
> original logic for this dumper in the LuaJIT source code. So, I'm open
> to ideas ;).
> 
> I would prefer the renaming to be done separately, so that we don't end
> up with one part of the naming in this patch series and another part in
> the next one. That would be inconsistent.
> 
> <snipped>
> 
> >
> > --
> > Best regards,
> > Evgeniy Temirgaleev
> 
> --
> Best regards,
> Sergey Kaplun
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 13:07           ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-30 14:03             ` Sergey Kaplun via Tarantool-patches
  2026-06-30 15:07               ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-30 14:03 UTC (permalink / raw)
  To: Evgeniy Temirgaleev; +Cc: tarantool-patches

Evgeniy,

On 30.06.26, Evgeniy Temirgaleev wrote:
> Sergey, thanks for the answer!
> 
> > I'm not sure that ctype_preplit -> dump_ctype_preplit helps to find the
> > original logic for this dumper in the LuaJIT source code. So, I'm open
> > to ideas ;).
> 
> I think we can keep the original LuaJIT names for ported routines (ctype_preplit stays ctype_preplit) and use different names for routines which are not present in LuaJIT. No more ideas at the moment :)
> 
> I can’t find cdata_val_int64 or cdata_val_complex in the LuaJIT code. Is the ‘dump_’ prefix missing?

They are partially inspired by the `__tostring` metamethod for FFI.
Added the prefix as you suggested:

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 28cbe97d..19cefaed 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -2081,9 +2081,9 @@ def dump_lj_gco_cdata(gcobj):
     size = ctype['size']
     value = ''
     if ctype_iscomplex(info):
-        value = cdata_val_complex(cdata, ctype)
+        value = dump_cdata_val_complex(cdata, ctype)
     elif size == 8 and ctype_isinteger(info):
-        value = cdata_val_int64(cdata, ctype)
+        value = dump_cdata_val_int64(cdata, ctype)
     else:
         value = cdataptr(cdata)
         if ctype_isptr(info):
@@ -2425,7 +2425,7 @@ def dump_func(func):
 # FFI dumpers.
 
 
-def cdata_val_int64(cdata, ctype):
+def dump_cdata_val_int64(cdata, ctype):
     info = ctype['info']
     isunsigned = info & CTF_UNSIGNED
     cdataval = cdataptr(cdata)
@@ -2439,7 +2439,7 @@ def cdata_val_int64(cdata, ctype):
     return str(valueptr[0]) + usuffix + 'LL'
 
 
-def cdata_val_complex(cdata, ctype):
+def dump_cdata_val_complex(cdata, ctype):
     size = ctype['size']
     cdataval = cdataptr(cdata)
     casttype = None
===================================================================
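For reference, the suffix logic visible in `dump_cdata_val_int64()` can be sketched standalone. This is a minimal sketch operating on raw little-endian bytes rather than on debugger values; `format_int64` is a hypothetical name used only for illustration:

```python
import struct


def format_int64(payload, is_unsigned):
    # Hypothetical helper: `payload` is the 8-byte cdata value as
    # little-endian bytes, `is_unsigned` mirrors the CTF_UNSIGNED check.
    fmt = '<Q' if is_unsigned else '<q'
    usuffix = 'U' if is_unsigned else ''
    value = struct.unpack(fmt, payload)[0]
    # Same shape as the FFI `__tostring` output: 42LL or 42ULL.
    return str(value) + usuffix + 'LL'


print(format_int64(struct.pack('<q', -42), False))
print(format_int64(b'\xff' * 8, True))
```

The same 8 bytes render differently depending on signedness, which is why the dumper needs the `CType` info and not only the payload.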

Branch is force-pushed.

> 
> --
> Best regards,
> Evgeniy Temirgaleev
> 
> > 
> > From: Sergey Kaplun <skaplun@tarantool.org>
> > To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> > Date: Tuesday, June 30, 2026 3:27 PM +03:00
> > Evgeniy,
> > Thanks for the answer, see my thoughts below.
> > 
> > On 30.06.26, Evgeniy Temirgaleev wrote:

<snipped>

> > --
> > Best regards,
> > Sergey Kaplun
> >

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
  2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-30 14:45   ` Sergey Bronnikov via Tarantool-patches
  2026-06-30 16:01     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 25+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2026-06-30 14:45 UTC (permalink / raw)
  To: Sergey Kaplun, Evgeniy Temirgaleev; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! LGTM with minor comments.

Sergey

On 6/25/26 23:29, Sergey Kaplun wrote:
> This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> similar to the -jdump flag but also reports types of register operands
> (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> The `lj-trace` command accepts optional /rs flags to dump registers
> associated with IR and snapshots for the trace correspondingly.
> The `lj-ir` command can be used for dumping IR constants as well.
> The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> simplify debugging of `rec_check_slots()` assertion failures.
>
> For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> Also, `__index__` is set to allow lldb.value to be used as an index
> without explicit conversion to int. Old GDB versions (below 7.12) are
> not supported because of the gdb.Value lacks the `__index__` metamethod
> and can't be monkey-patched. The support for these versions may be added
> by demand.
>
> Part of tarantool/tarantool#4808
> ---
>   src/luajit_dbg.py                             | 1216 ++++++++++++++++-
>   .../debug-extension-tests.py                  |  365 +++++
>   2 files changed, 1570 insertions(+), 11 deletions(-)
>
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 2edb199a..fd6ca8a5 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -58,6 +58,26 @@ class Debugger(object):
>               self.LLDB = True
>               return super(Debugger, self).__new__(_LLDBDebugger)
>   
> +    def parse_flags(self, raw_flags, permitted_flags):
> +        flags = {}
> +        for flag in raw_flags:
> +            if flag not in permitted_flags:
> +                raise self.error('Unrecongnized option: "{}"'.format(flag))
typo: s/Unrecongnized/Unrecognized/
> +            flags[flag] = True
> +        return flags
> +
> +    def extract_flags(self, arg, permitted_flags):
> +        if not arg:
> +            return None, None
> +        flags = {}
> +        if arg.startswith('/'):
> +            match = re.match(r'/(\S*)\s+(.*)$', arg)
> +            if not match:
> +                return arg, flags
> +            raw_flags, arg = match.group(1, 2)
> +            flags = self.parse_flags(raw_flags, permitted_flags)
> +        return arg, flags
> +
>       def configure(self):
>           global PADDING, LJ_TISNUM
>           if not self.check_libluajit():
> @@ -70,6 +90,17 @@ class Debugger(object):
>               self.write('luajit_dbg.py failed to load: '
>                          'no debugging symbols found for libluajit\n')
>               return False
> +
> +        # Setup arch.
> +        try:
> +            self.arch = str(self.eval('LJ_ARCH_NAME')).split('"')[1]
> +        except Exception:
> +            try:
> +                self.arch = self.detect_arch()
> +            except Exception:
> +                # Setup on demand if necessary.
> +                pass
> +
>           return True
>   
>       def initialize_extension(self, commands):
> @@ -99,21 +130,42 @@ class Debugger(object):
>           '''Return the content of the string by the given pointer.'''
>           pass
>   
> +    @abc.abstractmethod
> +    def address(self, obj):
> +        '''Return the address in memory of the given object.'''
> +        pass
> +
>       @abc.abstractmethod
>       def lookup_global(self, symbol):
>           '''Look up the global C symbol by the given name.'''
>           pass
>   
> +    @abc.abstractmethod
> +    def member_by_offset(self, typename, offset, prev_name=None):
> +        '''Look up the global C symbol by the given name.'''
> +        pass
> +
>       @abc.abstractmethod
>       def eval(self, command):
>           '''Parse and evaluate the given debugger command.'''
>           pass
>   
> +    @abc.abstractmethod
> +    def detect_arch(self):
> +        '''Detect the CPU architecture and canonicalize it to the LuaJIT
> +        notation.'''
> +        pass
> +
>       @abc.abstractmethod
>       def write(self, msg):
>           '''Print the message.'''
>           pass
>   
> +    @abc.abstractmethod
> +    def error(self, msg):
> +        '''Create the error object with message.'''
> +        pass
> +
>       @abc.abstractmethod
>       def check_libluajit(self):
>           '''Check that libluajit is loaded.
> @@ -172,10 +224,50 @@ class _GDBDebugger(Debugger):
>           # A string is printed with a pointer to it. Just strip it.
>           return re.sub(r'^0x[a-f0-9]+\s+(?=")', '', str(strptr))
>   
> +    def address(self, obj):
> +        return obj.address
> +
>       def lookup_global(self, symbol):
>           variable, _ = gdb.lookup_symbol(symbol)
>           return variable.value() if variable else None
>   
> +    def member_by_offset(self, tp, offset, prev_name=None):
> +        if isinstance(tp, str):
> +            tp = self._dbgtype(tp)
> +        assert offset < tp.sizeof, 'offset is bigger than object size'
> +        if tp.code == gdb.TYPE_CODE_TYPEDEF:
> +            tp = tp.strip_typedefs()
> +        if tp.code == gdb.TYPE_CODE_STRUCT:
> +            fields = tp.fields()
> +            for n_field in range(len(fields)):
> +                islast = n_field == (len(fields) - 1)
> +                field = fields[n_field]
> +                start_field = field.bitpos / 8
Maybe use `//` here? In Python 3, `/` yields a float.
> +                end_field = fields[n_field + 1].bitpos / 8 if not islast \
> +                    else tp.sizeof
> +                if start_field <= offset and offset < end_field:
> +                    next_name = self.member_by_offset(
> +                        field.type,
> +                        offset - start_field,
> +                        prev_name=field.name
> +                    )
> +                    return '.{field}{suffix}'.format(
> +                        field=field.name,
> +                        suffix=next_name if next_name else ''
> +                    )
> +        elif tp.code == gdb.TYPE_CODE_ARRAY:
> +            # Get array field type.
> +            target = tp.target()
> +            tsize = target.sizeof
> +            idx = int(offset // tsize)
> +            next_name = self.member_by_offset(target, offset - idx * tsize)
> +            idxname = idx_name(prev_name)
> +            if idxname and idx in idxname:
> +                idx = idxname[idx]
> +            return '[{}]{}'.format(idx, next_name if next_name else '')
> +        else:
> +            return None
> +
>       def eval(self, command):
>           if not command:
>               return None
> @@ -185,9 +277,23 @@ class _GDBDebugger(Debugger):
>               raise gdb.GdbError('table argument empty')
>           return ret
>   
> +    def detect_arch(self):
> +        if hasattr(self, 'arch'):
> +            return self.arch
> +        target = str(gdb.execute('info target', False, True))
> +        if re.match('.*x86-64.*', target, flags=re.DOTALL):
> +            return 'x64'
> +        elif re.match('.*aarch64.*', target, flags=re.DOTALL):
> +            return 'arm64'
> +        else:
> +            return ''
> +
>       def write(self, msg):
>           gdb.write(msg)
>   
> +    def error(self, errmsg):
> +        return gdb.GdbError(errmsg)
> +
>       def check_libluajit(self):
>           # XXX Fragile: Though connecting the callback looks bad,
>           # it respects both Python 2 and Python 3 (see #4828).
> @@ -322,8 +428,26 @@ class _LLDBDebugger(Debugger):
>           def lldb__getitem__(lldbval, key):
>               if type(key) is lldb.value:
>                   key = int(key)
> +            if type(key) is bool:
> +                key = int(key)
>               if type(key) is int:
>                   # Allow array access.
> +                ltp = lldbval.sbvalue.GetType()
> +                # XXX: LLDB in versions 17 - 19 can't use an array
> +                # object as the initializer for `lldb.value` since
> +                # `GetValue()` for it returns `None` leading to
> +                # the invalid result. See
> +                #https://github.com/llvm/llvm-project/pull/90144.
> +                if (self.version < 17 or self.version > 19) or \
> +                   ltp.GetTypeClass() != lldb.eTypeClassArray:
> +                    pass
It is probably better to invert the condition and remove the branch with "pass".
<snipped>
> +
> +
> +def ir_kptr(ir):
> +    irname = IRS[ir['o']]
> +    assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_iptr()'
typo: s/ir_iptr()/ir_kptr() or ir_kkptr()/
> +    return mref('void *', dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['ptr'])
> +
> +
> +def ir_kgc(ir):
> +    irname = IRS[ir['o']]
> +    assert irname == 'KGC', 'wrong IR for ir_kgc()'
> +    return gcref(dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['gcr'])
> +
> +
> +def ir_knum(ir):
> +    irname = IRS[ir['o']]
> +    assert irname == 'KNUM', 'wrong IR for ir_knum()'
> +    return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
> +
> +
> +def ir_kint64(ir):
> +    irname = IRS[ir['o']]
> +    assert irname == 'KINT64', 'wrong IR for ir_knum()'
typo: s/ir_knum/ir_kint64/
> +    return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
> +
> +
>   # Dumpers.
>   
>   # GCobj dumpers.
> @@ -1467,6 +2281,325 @@ def dump_func(func):
>           return 'fast function #{}\n'.format(int(ffid))
>   
>   
> +# JIT dumpers.
> +
> +
> +def dump_call_func(trace, callop):
> +    ctype = ''
> +    if callop > 0:
> +        ir = trace['ir'][REF_BIAS + callop]
> +        if IRTYPES[irt_type(ir['t'])] == 'nil':  # nil == CARG(func, ctype)
> +            callop = int(ir['op1']) - REF_BIAS
> +            cdt_idx_irk = trace['ir'][ir['op2']]
> +            assert IRS[cdt_idx_irk['o']] == 'KINT', \
> +                   'unexpected IR for ctype storage'
> +            ctype_idx = cdt_idx_irk['i']
> +            ctype = 'ctype: {}'.format(ctype_idx)
> +
> +    func_str = ''
> +    if callop < 0:
> +        irk = trace['ir'][REF_BIAS + callop]
> +        assert IRS[irk['o']] == 'KINT64', \
> +               'unexpected IR for FFI function storage'
> +        func_addr = int(ir_kint64(irk)['u64'])
> +        # TODO: Symbol demangling.
> +        func_str = '[{:#x}]'.format(func_addr)
> +    else:
> +        func_str = '[{:04d}]'.format(callop)
> +
> +    return func_str, ctype
> +
> +
> +def dump_call_args(trace, ins):
> +    if ins < 0:
> +        return '{{{}}}'.format(dump_irk(trace, ins))
> +    else:
> +        ir = trace['ir'][REF_BIAS + ins]
> +        irname = IRS[ir['o']]
> +        if irname == 'CARG':
> +            last_arg = ''
> +            args = dump_call_args(trace, int(ir['op1']) - REF_BIAS)
> +            op2 = int(ir['op2']) - REF_BIAS
> +            if op2 < 0:
> +                last_arg = '{{{}}}'.format(dump_irk(trace, op2))
> +            else:
> +                last_arg = '{{{:04d}}}'.format(op2)
> +            return args + ', ' + last_arg
> +        else:
> +            return '{{{:04d}}}'.format(ins)
> +
> +
> +# Special FP constant.
> +CONST_BIAS = 2 ** 52 + 2 ** 51
> +
> +
> +def dump_irk(trace, idx):
> +    ref = idx + REF_BIAS
> +    assert ref >= trace['nk'] and ref < REF_BIAS, 'bad constant in IR dump'
> +    irins = trace['ir'][ref]
> +    irname = IRS[irins['o']]
> +    slot = ''
> +    if irname == 'KSLOT':
> +        slot = ' KSLOT: @{}'.format(int(irins['op2']))
> +        irins = trace['ir'][irins['op1']]
> +        irname = IRS[irins['o']]
> +
> +    irtype = irins['t']
> +    if irname == 'KPRI':
> +        typename = typenames(irt_toitype(irtype))
> +        # Trivial dump for primitives.
> +        irk = tv_dumpers.get(
> +            typename, dump_lj_tv_invalid  # noqa: F821 # Generated.
> +        )(0)
> +    elif irname == 'KINT':
> +        irk = 'integer {}'.format(dbg.cast('int32_t', irins['i']))
> +    elif irname == 'KGC':
> +        typename = typenames(irt_toitype(irtype))
> +        irk = gco_dumpers.get(typename, dump_lj_gco_invalid)(ir_kgc(irins))
> +    elif irname == 'KKPTR':
> +        addr = ir_kptr(irins)
> +        if addr == dbg.address(G(L())['nilnode']):
> +            return '[g->nilnode]' + slot
> +        irk = '[{}]'.format(strx64(addr))
> +    elif irname == 'KPTR':
> +        irk = '[{}]'.format(strx64(ir_kptr(irins)))
> +    elif irname == 'KNULL':
> +        irk = 'NULL'
> +    elif irname == 'KNUM':
> +        tv_num = ir_knum(irins)
> +        if float(tv_num['n']) == CONST_BIAS:
> +            return 'bias'
> +        irk = dump_lj_tv_numx(tv_num)
> +    elif irname == 'KINT64':
> +        irk = 'int64_t {}'.format(dbg.cast(
> +            'int64_t', int(ir_kint64(irins)['u64'])
> +        ))
> +    else:
> +        return 'Unknown IRK: ' + irname
> +    return irk + slot
> +
> +
<snipped>
> +
> +def dump_snap(trace, snapno, snap):
> +    dump = 'SNAP   #{:<3d} ['.format(snapno)
> +    snap_map = dbg.address(trace['snapmap'][snap['mapofs']])
> +    snap_entry_num = 0
> +    for slot in range(0, snap['nslots']):
> +        dump += ' '
> +        snap_entry = int(snap_map[snap_entry_num])
> +        if snap_entry_num < snap['nent'] and snap_entry >> TREF_SHIFT == slot:
> +            snap_entry_num += 1
> +            ref = int((snap_entry & TREF_REFMASK) - REF_BIAS)
> +            if ref < 0:
> +                if int(snap_entry) == 0x1057fff:
magic number
<snipped>
> +# Assume not cross-platform debugging.
> +machine = os.uname().machine
> +if machine == 'x86_64':
> +    RX_GPR = r'r\w\w'
> +    RX_FPR = r'xmm\d+'
> +elif machine == 'arm64' or machine == 'aarch64':
> +    RX_GPR = r'x\d+'
> +    RX_FPR = r'd\d+'
> +else:
> +    raise Exception('Unknown archeticture in testing')
typo: s/archeticture/architecture/
<snipped>
> +
> +class TestLJIRConst(TestCaseBase):
> +    location = 'trace_stop'
> +
> +    # No narrowing of 42.
> +    if IS_DUALNUM:
> +        # KNUM occupies 2 slots.
> +        _knum_irnum = '6'
> +        _kgc_irnum = '8' if IS_GC64 else '7'
> +        _kptr_irnum = '10' if IS_GC64 else '8'
> +    else:
> +        # KNUM occupies 2 slots.
> +        _knum_irnum = '8'
> +        _kgc_irnum = '10' if IS_GC64 else '9'
> +        _kptr_irnum = '12' if IS_GC64 else '10'
Both branches contain the same comment; is it a typo or not?
> <snipped>
>   

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
  2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-30 14:53   ` Sergey Bronnikov via Tarantool-patches
  2026-06-30 15:03     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 25+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2026-06-30 14:53 UTC (permalink / raw)
  To: Sergey Kaplun, Evgeniy Temirgaleev; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 427 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with minor comments.

Sergey

On 6/25/26 23:29, Sergey Kaplun wrote:


<snipped>

> +
> +
> +def ctype_repr(cts, id):
> +    ctype = ctype_get(cts, id)
> +    ctypestr = ''
> +    qual = 0
> +    ptrto = 0
> +    while True:

Probably it should be limited somehow; with a broken memory/coredump,
the loop may be infinite.

<snipped>
>       test_cls.test = lambda self: self.check()
>   

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
  2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
@ 2026-06-30 14:54   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 0 replies; 25+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2026-06-30 14:54 UTC (permalink / raw)
  To: Sergey Kaplun, Evgeniy Temirgaleev; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1790 bytes --]

Hi, Sergey,

thanks for the patch! LGTM

Sergey

On 6/25/26 23:29, Sergey Kaplun wrote:
> If the environment variable `DUBUGGER_TEST_VERBOSE` is set, each test
> prints the generated command and its output and doesn't delete the files
> generated for it.
> ---
>   test/tarantool-debugger-tests/debug-extension-tests.py | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
> index fc5d2c7b..adb83e1e 100644
> --- a/test/tarantool-debugger-tests/debug-extension-tests.py
> +++ b/test/tarantool-debugger-tests/debug-extension-tests.py
> @@ -43,6 +43,8 @@ else:
>       # Don't run any initialization scripts.
>       RUN_CMD_FILE = ['--batch', '--nx', '--quiet', '--command']
>   
> +TEST_VERBOSE = os.getenv('DUBUGGER_TEST_VERBOSE', default=False)
> +
>   RX_ADDR = r'0x[a-f0-9]+'
>   RX_HASH = RX_ADDR  # The same pattern for hexademic values.
>   RX_BCN = r'00\d\d'
> @@ -52,7 +54,7 @@ RX_IRREF = r'0x\d\d\d\d'
>   
>   
>   def persist(data):
> -    tmp = tempfile.NamedTemporaryFile(mode='w')
> +    tmp = tempfile.NamedTemporaryFile(mode='w', delete=not TEST_VERBOSE)
>       tmp.write(data)
>       tmp.flush()
>       return tmp
> @@ -149,7 +151,12 @@ class TestCaseBase(unittest.TestCase):
>               LUAJIT_BINARY,
>               script_file.name,
>           ]
> +        if TEST_VERBOSE:
> +            print('# Test name: {}'.format(cls.__name__))
> +            print('# Test command: {}'.format(' '.join(process_cmd)))
>           cls.output = execute_process(process_cmd)
> +        if TEST_VERBOSE:
> +            print('# Command output: {}'.format(cls.output))
>           cmd_file.close()
>           script_file.close()
>   
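
A side note on the flag parsing above, as a minimal sketch rather than a
change request: `os.getenv()` returns a string, so any non-empty value
(including '0') enables verbose mode. The `env_flag()` helper below is
hypothetical and not part of the patch; it only illustrates one stricter
way to read such a variable.

```python
import os


def env_flag(name, environ=os.environ):
    """Hypothetical helper: '', '0', 'false', 'no', 'off' mean disabled."""
    value = environ.get(name, '')
    return value.strip().lower() not in ('', '0', 'false', 'no', 'off')


# Any non-empty string is truthy, so '0' would still enable the mode
# when the raw value is used as a boolean.
print(bool('0'))                      # True
print(env_flag('FLAG', {'FLAG': '0'}))  # False
print(env_flag('FLAG', {'FLAG': '1'}))  # True
print(env_flag('FLAG', {}))           # False
```

The environment is passed as a plain mapping here only to keep the
example testable without touching the real process environment.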

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 14:53   ` Sergey Bronnikov via Tarantool-patches
@ 2026-06-30 15:03     ` Sergey Kaplun via Tarantool-patches
  2026-07-01  8:25       ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-30 15:03 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey,

Thanks for the review!
See my answers below.

On 30.06.26, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with minor comments.
> 
> Sergey
> 
> On 6/25/26 23:29, Sergey Kaplun wrote:
> 
> 
> <snipped>
> 
> > +
> > +
> > +def ctype_repr(cts, id):
> > +    ctype = ctype_get(cts, id)
> > +    ctypestr = ''
> > +    qual = 0
> > +    ptrto = 0
> > +    while True:
> 
> Probably it should be limited somehow; with a broken memory/coredump,
> the loop may be infinite.

An infinite loop here would mean an infinite chain of ctypes, which in
turn means infinite memory. I suppose this is not a possible case.
The other option is that ctype_child() returns the same ctype again.
I don't want to add an artificial limit for it.
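
For reference, the kind of guard discussed above could look like the
sketch below. It is not part of the patch, and `walk_chain()`, `child_of`,
and `max_steps` are hypothetical names: the toy dictionaries stand in for
a ctype table where each id maps to its child id.

```python
def walk_chain(start, child_of, max_steps=1024):
    """Follow child links from `start` until a terminal node (None).

    Raises ValueError if a node repeats or the step limit is exceeded,
    which is how a corrupted chain would show up.
    """
    seen = set()
    path = []
    node = start
    while node is not None:
        if node in seen:
            raise ValueError('cycle detected at node {}'.format(node))
        if len(path) >= max_steps:
            raise ValueError('chain is longer than {} steps'.format(max_steps))
        seen.add(node)
        path.append(node)
        node = child_of(node)
    return path


# Toy "ctype table": id -> child id (None terminates the chain).
healthy = {3: 2, 2: 1, 1: None}
broken = {3: 2, 2: 3}  # Corrupted: 2 points back to 3.

print(walk_chain(3, healthy.get))  # [3, 2, 1]
try:
    walk_chain(3, broken.get)
except ValueError as err:
    print(err)  # cycle detected at node 3
```

The visited set catches a chain that loops back on itself, while the
step limit covers a long run of distinct garbage entries.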

> 
> <snipped>
> >       test_cls.test = lambda self: self.check()
> >   

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches]  [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 14:03             ` Sergey Kaplun via Tarantool-patches
@ 2026-06-30 15:07               ` Evgeniy Temirgaleev via Tarantool-patches
  0 siblings, 0 replies; 25+ messages in thread
From: Evgeniy Temirgaleev via Tarantool-patches @ 2026-06-30 15:07 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2827 bytes --]

Sergey! Thanks for the additional updates and for the explanation!
(The patch LGTM, as it was stated earlier.)
--
Best regards,
Evgeniy Temirgaleev

> 
> From: Sergey Kaplun <skaplun@tarantool.org>
> To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> Date: Tuesday, June 30, 2026 5:04 PM +03:00
> Evgeniy,
> 
> On 30.06.26, Evgeniy Temirgaleev wrote:
> > Sergey, thanks for the answer!
> >
> > > I'm not sure that ctype_preplit -> dump_ctype_preplit helps to find the
> > > original logic for this dumper in the LuaJIT source code. So, I'm open
> > > to ideas ;).
> >
> > I think we can use the original names from LuaJIT (ctype_preplit for
> > ctype_preplit) and different names for routines which are not present in
> > LuaJIT. No more ideas at the moment :)
> >
> > I can’t find cdata_val_int64 or cdata_val_complex in the LuaJIT code;
> > is the ‘dump_’ prefix missing?
> 
> They are partially inspired by the __tostring metamethod for FFI.
> Added the prefix as you suggested:
> 
> ===================================================================
> diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> index 28cbe97d..19cefaed 100644
> --- a/src/luajit_dbg.py
> +++ b/src/luajit_dbg.py
> @@ -2081,9 +2081,9 @@ def dump_lj_gco_cdata(gcobj):
> size = ctype['size']
> value = ''
> if ctype_iscomplex(info):
> - value = cdata_val_complex(cdata, ctype)
> + value = dump_cdata_val_complex(cdata, ctype)
> elif size == 8 and ctype_isinteger(info):
> - value = cdata_val_int64(cdata, ctype)
> + value = dump_cdata_val_int64(cdata, ctype)
> else:
> value = cdataptr(cdata)
> if ctype_isptr(info):
> @@ -2425,7 +2425,7 @@ def dump_func(func):
> # FFI dumpers.
> 
> 
> -def cdata_val_int64(cdata, ctype):
> +def dump_cdata_val_int64(cdata, ctype):
> info = ctype['info']
> isunsigned = info & CTF_UNSIGNED
> cdataval = cdataptr(cdata)
> @@ -2439,7 +2439,7 @@ def cdata_val_int64(cdata, ctype):
> return str(valueptr[0]) + usuffix + 'LL'
> 
> 
> -def cdata_val_complex(cdata, ctype):
> +def dump_cdata_val_complex(cdata, ctype):
> size = ctype['size']
> cdataval = cdataptr(cdata)
> casttype = None
> ===================================================================
> 
> Branch is force-pushed.
> 
> >
> > --
> > Best regards,
> > Evgeniy Temirgaleev
> >
> > >
> > > From: Sergey Kaplun <skaplun@tarantool.org>
> > > To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
> > > Cc: tarantool-patches@dev.tarantool.org, Sergey Bronnikov <sergeyb@tarantool.org>
> > > Date: Tuesday, June 30, 2026 3:27 PM +03:00
> > > Evgeniy,
> > > Thanks for the answer, see my thoughts below.
> > >
> > > On 30.06.26, Evgeniy Temirgaleev wrote:
> 
> <snipped>
> 
> > > --
> > > Best regards,
> > > Sergey Kaplun
> > >
> 
> --
> Best regards,
> Sergey Kaplun
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-30 14:45   ` Sergey Bronnikov via Tarantool-patches
@ 2026-06-30 16:01     ` Sergey Kaplun via Tarantool-patches
  2026-07-01 11:23       ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-06-30 16:01 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and force-pushed the branch.

On 30.06.26, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with minor comments.
> 
> Sergey
> 
> On 6/25/26 23:29, Sergey Kaplun wrote:
> > This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> > as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> > similar to the -jdump flag but also reports types of register operands
> > (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> > The `lj-trace` command accepts optional /rs flags to dump registers
> > associated with IR and snapshots for the trace correspondingly.
> > The `lj-ir` command can be used for dumping IR constants as well.
> > The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> > simplify debugging of `rec_check_slots()` assertion failures.
> >
> > For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> > Also, `__index__` is set to allow lldb.value to be used as an index
> > without explicit conversion to int. Old GDB versions (below 7.12) are
> > not supported because gdb.Value lacks the `__index__` metamethod
> > and can't be monkey-patched. Support for these versions may be added
> > on demand.
> >
> > Part of tarantool/tarantool#4808
> > ---
> >   src/luajit_dbg.py                             | 1216 ++++++++++++++++-
> >   .../debug-extension-tests.py                  |  365 +++++
> >   2 files changed, 1570 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > index 2edb199a..fd6ca8a5 100644
> > --- a/src/luajit_dbg.py
> > +++ b/src/luajit_dbg.py
> > @@ -58,6 +58,26 @@ class Debugger(object):
> >               self.LLDB = True
> >               return super(Debugger, self).__new__(_LLDBDebugger)
> >   
> > +    def parse_flags(self, raw_flags, permitted_flags):
> > +        flags = {}
> > +        for flag in raw_flags:
> > +            if flag not in permitted_flags:
> > +                raise self.error('Unrecongnized option: "{}"'.format(flag))
> typo: s/Unrecongnized/Unrecognized/

Fixed, thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 19cefaed..6412575c 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -62,7 +62,7 @@ class Debugger(object):
         flags = {}
         for flag in raw_flags:
             if flag not in permitted_flags:
-                raise self.error('Unrecongnized option: "{}"'.format(flag))
+                raise self.error('Unrecognized option: "{}"'.format(flag))
             flags[flag] = True
         return flags
 
===================================================================

> > +            flags[flag] = True
> > +        return flags
> > +

<snipped>

> > +    def member_by_offset(self, tp, offset, prev_name=None):
> > +        if isinstance(tp, str):
> > +            tp = self._dbgtype(tp)
> > +        assert offset < tp.sizeof, 'offset is bigger than object size'
> > +        if tp.code == gdb.TYPE_CODE_TYPEDEF:
> > +            tp = tp.strip_typedefs()
> > +        if tp.code == gdb.TYPE_CODE_STRUCT:
> > +            fields = tp.fields()
> > +            for n_field in range(len(fields)):
> > +                islast = n_field == (len(fields) - 1)
> > +                field = fields[n_field]
> > +                start_field = field.bitpos / 8
> may be //?

I'm not sure that this is crucial. If the bitpos isn't a multiple of 8,
we will get an error, so the bug is found earlier instead of silently
producing an incorrect result. I'd prefer to ignore it if you don't
insist.
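
To make the difference concrete, here is a standalone Python 3 sketch
(not taken from the patch; `bitpos` and `fields` are made-up values).
Whether the surrounding code actually raises depends on how the result
is used later; the snippet only shows what the two operators return.

```python
bitpos = 12  # Hypothetical bit offset that is not a multiple of 8.

true_div = bitpos / 8    # 1.5: `/` always yields a float in Python 3.
floor_div = bitpos // 8  # 1: `//` silently drops the remainder.

fields = ['a', 'b', 'c']
try:
    fields[true_div]
except TypeError as err:
    # A float is rejected wherever an integer index is required.
    print('float index rejected:', err)
print(fields[floor_div])  # b

# An aligned offset still gives a float, although it compares equal
# to the corresponding int.
print(16 / 8, 16 / 8 == 2)  # 2.0 True
```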

> > +                end_field = fields[n_field + 1].bitpos / 8 if not islast \
> > +                    else tp.sizeof
> > +                if start_field <= offset and offset < end_field:
> > +                    next_name = self.member_by_offset(
> > +                        field.type,
> > +                        offset - start_field,
> > +                        prev_name=field.name
> > +                    )
> > +                    return '.{field}{suffix}'.format(
> > +                        field=field.name,
> > +                        suffix=next_name if next_name else ''
> > +                    )

<snipped>

> > @@ -322,8 +428,26 @@ class _LLDBDebugger(Debugger):
> >           def lldb__getitem__(lldbval, key):
> >               if type(key) is lldb.value:
> >                   key = int(key)
> > +            if type(key) is bool:
> > +                key = int(key)
> >               if type(key) is int:
> >                   # Allow array access.
> > +                ltp = lldbval.sbvalue.GetType()
> > +                # XXX: LLDB in versions 17 - 19 can't use an array
> > +                # object as the initializer for `lldb.value` since
> > +                # `GetValue()` for it returns `None` leading to
> > +                # the invalid result. See
> > +                #https://github.com/llvm/llvm-project/pull/90144.
> > +                if (self.version < 17 or self.version > 19) or \
> > +                   ltp.GetTypeClass() != lldb.eTypeClassArray:
> > +                    pass
> probably it is better to invert condition and remove section with "pass"

The same condition is used for loading the global symbol, so it is more
bulletproof, I suppose. Also, it draws attention to the mentioned
issue. I'd prefer to leave it as is if you don't insist.

> > +
> > +
> > +def ir_kptr(ir):
> > +    irname = IRS[ir['o']]
> > +    assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_iptr()'
> typo: s/ir_iptr()/ir_kptr() or ir_kkptr()/

Fixed, thanks:

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 6412575c..d5967716 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -2001,7 +2001,7 @@ def irt_toitype(irt):
 
 def ir_kptr(ir):
     irname = IRS[ir['o']]
-    assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_iptr()'
+    assert irname == 'KPTR' or irname == 'KKPTR', 'wrong IR for ir_kptr()'
     return mref('void *', dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['ptr'])
 
 
===================================================================

> > +    return mref('void *', dbg.cast('IRIns *', dbg.address(ir))[LJ_GC64]['ptr'])

<snipped>

> > +def ir_kint64(ir):
> > +    irname = IRS[ir['o']]
> > +    assert irname == 'KINT64', 'wrong IR for ir_knum()'
> typo: s/ir_knum/ir_kint64/

Fixed, thanks:

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index d5967716..68f7970b 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -2019,7 +2019,7 @@ def ir_knum(ir):
 
 def ir_kint64(ir):
     irname = IRS[ir['o']]
-    assert irname == 'KINT64', 'wrong IR for ir_knum()'
+    assert irname == 'KINT64', 'wrong IR for ir_kint64()'
     return dbg.address(dbg.cast('IRIns *', dbg.address(ir))[1]['tv'])
 
 
===================================================================

<snipped>

> > +    for slot in range(0, snap['nslots']):
> > +        dump += ' '
> > +        snap_entry = int(snap_map[snap_entry_num])
> > +        if snap_entry_num < snap['nent'] and snap_entry >> TREF_SHIFT == slot:
> > +            snap_entry_num += 1
> > +            ref = int((snap_entry & TREF_REFMASK) - REF_BIAS)
> > +            if ref < 0:
> > +                if int(snap_entry) == 0x1057fff:
> magic number

Indeed, fixed:

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 68f7970b..9534dfad 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1956,6 +1956,7 @@ RID_SUNK = (RID_INIT - 2)
 SPS_NONE = 0
 
 REF_BIAS = 0x8000
+REF_NIL = REF_BIAS - 1
 
 TREF_SHIFT = 24
 
@@ -1964,8 +1965,11 @@ TREF_FRAME = 0x00010000
 TREF_CONT = 0x00020000
 # Snapshot flags and masks.
 SNAP_FRAME = 0x010000
+SNAP_NORESTORE = 0x040000
 SNAP_SOFTFPNUM = 0x080000
 
+SNAP_FR2_SLOT = (1 << TREF_SHIFT) | SNAP_FRAME | SNAP_NORESTORE + REF_NIL
+
 def irt_type(t):
     return dbg.cast('IRType', t['irt'] & IRT_TYPE)
 
@@ -2810,7 +2813,7 @@ def dump_snap(trace, snapno, snap):
             snap_entry_num += 1
             ref = int((snap_entry & TREF_REFMASK) - REF_BIAS)
             if ref < 0:
-                if int(snap_entry) == 0x1057fff:
+                if int(snap_entry) == SNAP_FR2_SLOT:
                     dump += '----'
                     continue
                 elif (snap_entry & TREF_CONT):
===================================================================
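
A quick standalone check (plain Python, no debugger needed) that the
named constant reproduces the original literal. The values are copied
from the diff above. Note that `+` binds tighter than `|`, so the last
two terms are summed first; the explicitly parenthesized variant is
added here only for comparison.

```python
REF_BIAS = 0x8000
REF_NIL = REF_BIAS - 1
TREF_SHIFT = 24
SNAP_FRAME = 0x010000
SNAP_NORESTORE = 0x040000

# Same expression as in the diff: evaluated as
# (1 << 24) | SNAP_FRAME | (SNAP_NORESTORE + REF_NIL).
SNAP_FR2_SLOT = (1 << TREF_SHIFT) | SNAP_FRAME | SNAP_NORESTORE + REF_NIL
explicit = (1 << TREF_SHIFT) | SNAP_FRAME | (SNAP_NORESTORE + REF_NIL)

print(hex(SNAP_FR2_SLOT))  # 0x1057fff
print(SNAP_FR2_SLOT == explicit == 0x1057fff)  # True
```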

> <snipped>
> > +# Assume not cross-platform debugging.
> > +machine = os.uname().machine
> > +if machine == 'x86_64':
> > +    RX_GPR = r'r\w\w'
> > +    RX_FPR = r'xmm\d+'
> > +elif machine == 'arm64' or machine == 'aarch64':
> > +    RX_GPR = r'x\d+'
> > +    RX_FPR = r'd\d+'
> > +else:
> > +    raise Exception('Unknown archeticture in testing')
> typo: s/archeticture/architecture/

Fixed, thanks!
===================================================================
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 71b763d2..895171a4 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -124,7 +124,7 @@ elif machine == 'arm64' or machine == 'aarch64':
     RX_GPR = r'x\d+'
     RX_FPR = r'd\d+'
 else:
-    raise Exception('Unknown archeticture in testing')
+    raise Exception('Unknown architecture in testing')
 
 
 class TestCaseBase(unittest.TestCase):
===================================================================


> <snipped>
> > +
> > +class TestLJIRConst(TestCaseBase):
> > +    location = 'trace_stop'
> > +
> > +    # No narrowing of 42.
> > +    if IS_DUALNUM:
> > +        # KNUM occupies 2 slots.
> > +        _knum_irnum = '6'
> > +        _kgc_irnum = '8' if IS_GC64 else '7'
> > +        _kptr_irnum = '10' if IS_GC64 else '8'
> > +    else:
> > +        # KNUM occupies 2 slots.
> > +        _knum_irnum = '8'
> > +        _kgc_irnum = '10' if IS_GC64 else '9'
> > +        _kptr_irnum = '12' if IS_GC64 else '10'
> Both branches contain the same comment; is it a typo or not?

It is intentional. KNUM occupies 2 slots anyway.

> > <snipped>
> >   

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-06-30 15:03     ` Sergey Kaplun via Tarantool-patches
@ 2026-07-01  8:25       ` Sergey Kaplun via Tarantool-patches
  2026-07-01 11:20         ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 1 reply; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-07-01  8:25 UTC (permalink / raw)
  To: Sergey Bronnikov, tarantool-patches

Sergey,

On 30.06.26, Sergey Kaplun via Tarantool-patches wrote:
> Hi, Sergey,
> 
> Thanks for the review!
> See my answers below.
> 
> On 30.06.26, Sergey Bronnikov wrote:
> > Hi, Sergey,
> > 
> > thanks for the patch! LGTM with minor comments.
> > 
> > Sergey
> > 
> > On 6/25/26 23:29, Sergey Kaplun wrote:
> > 
> > 
> > <snipped>
> > 
> > > +
> > > +
> > > +def ctype_repr(cts, id):
> > > +    ctype = ctype_get(cts, id)
> > > +    ctypestr = ''
> > > +    qual = 0
> > > +    ptrto = 0
> > > +    while True:
> > 
> > Probably it should be limited somehow; with a broken memory/coredump,
> > the loop may be infinite.
> 
> The infinite loop here means the infinite ctypes chain, so it means
> infinite memory. I suppose this is not a possible case.
> And maybe that ctype_child() returns the same class.
> I don't want to add an artificial limit for it.

After some consideration, I've removed the unreachable code (since the
loop has no break statement):

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index b1a7182b..80057a4e 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -2585,7 +2585,6 @@ def ctype_repr(cts, id):
                 ctypestr = '(' + ctypestr + ')'
             ctypestr += '()'
         ctype = ctype_child(cts, ctype)
-    return 'NYI'
 
 
 def dump_ctype(ct):
===================================================================

<snipped>

> -- 
> Best regards,
> Sergey Kaplun

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump
  2026-07-01  8:25       ` Sergey Kaplun via Tarantool-patches
@ 2026-07-01 11:20         ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2026-07-01 11:20 UTC (permalink / raw)
  To: Sergey Kaplun, tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 64 bytes --]

Sergey,

LGTM

On 7/1/26 11:25, Sergey Kaplun wrote:

<snipped>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
  2026-06-30 16:01     ` Sergey Kaplun via Tarantool-patches
@ 2026-07-01 11:23       ` Sergey Bronnikov via Tarantool-patches
  0 siblings, 0 replies; 25+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2026-07-01 11:23 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 174 bytes --]

Sergey,

On 6/30/26 19:01, Sergey Kaplun wrote:
> Hi, Sergey!
> Thanks for the review!
> Fixed your comments and force-pushed the branch.
>
Thanks for the fixes! LGTM


<snipped>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension
  2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
                   ` (2 preceding siblings ...)
  2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
@ 2026-07-07 10:44 ` Sergey Kaplun via Tarantool-patches
  3 siblings, 0 replies; 25+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-07-07 10:44 UTC (permalink / raw)
  To: Sergey Bronnikov, Evgeniy Temirgaleev; +Cc: tarantool-patches

I've applied the patchset to all long-term branches in
tarantool/luajit and bumped the new version in Tarantool's master [1],
release/3.7 [2], release/3.6 [3] and release/2.11 [4].

[1]: https://github.com/tarantool/tarantool/pull/12883
[2]: https://github.com/tarantool/tarantool/pull/12884
[3]: https://github.com/tarantool/tarantool/pull/12885
[4]: https://github.com/tarantool/tarantool/pull/12886

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2026-07-07 10:44 UTC | newest]

Thread overview: 25+ messages
2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-28 16:32     ` Sergey Kaplun via Tarantool-patches
2026-06-29 16:35       ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-30 14:45   ` Sergey Bronnikov via Tarantool-patches
2026-06-30 16:01     ` Sergey Kaplun via Tarantool-patches
2026-07-01 11:23       ` Sergey Bronnikov via Tarantool-patches
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-29 19:20     ` Sergey Kaplun via Tarantool-patches
2026-06-30 11:48       ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-30 12:27         ` Sergey Kaplun via Tarantool-patches
2026-06-30 13:07           ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-30 14:03             ` Sergey Kaplun via Tarantool-patches
2026-06-30 15:07               ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-30 14:53   ` Sergey Bronnikov via Tarantool-patches
2026-06-30 15:03     ` Sergey Kaplun via Tarantool-patches
2026-07-01  8:25       ` Sergey Kaplun via Tarantool-patches
2026-07-01 11:20         ` Sergey Bronnikov via Tarantool-patches
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-28 15:19     ` Sergey Kaplun via Tarantool-patches
2026-06-30 14:54   ` Sergey Bronnikov via Tarantool-patches
2026-07-07 10:44 ` [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
