Hi, Evgeniy!
Thanks for the review!
See my answers below.
Branch is force-pushed with the fixes.
On 28.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
>
> Thanks for the patch. Please, see my comments.
>
> --
> Best regards,
> Evgeniy Temirgaleev
>
> >
> > От кого: Sergey Kaplun <
skaplun@tarantool.org>
> > Кому: Sergey Bronnikov <
sergeyb@tarantool.org>, Evgeniy Temirgaleev <
e.temirgaleev@tarantool.org> > >
> > Копия: tarantool-patches@dev.tarantool.org, Sergey Kaplun <
skaplun@tarantool.org> > >
> > Дата: Четверг, 25 июня 2026, 23:29 +03:00
> > This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> > as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> > similar to the -jdump flag but also reports types of register operands
> > (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> > The `lj-trace` command accepts optional /rs flags to dump registers
> > associated with IR and snapshots for the trace correspondingly.
> > The `lj-ir` command can be used for dumping IR constants as well.
> > The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> > simplify debugging of `rec_check_slots()` assertion failures.
> >
> > For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> > Also, `__index__` is set to allow lldb.value to be used as an index
> > without explicit conversion to int. Old GDB versions (below 7.12) are
> > not supported because of the gdb.Value lacks the `__index__` metamethod
> > and can't be monkey-patched. The support for these versions may be added
> > by demand.
> >
> > Part of tarantool/tarantool#4808
> > ---
> > src/luajit_dbg.py | 1216 ++++++++++++++++-
> > .../debug-extension-tests.py | 365 +++++
> > 2 files changed, 1570 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > index 2edb199a..fd6ca8a5 100644
> > --- a/src/luajit_dbg.py
> > +++ b/src/luajit_dbg.py
<snipped>
> > + if tp.GetTypeClass() == lldb.eTypeClassStruct:
> > + len_fields = tp.GetNumberOfFields()
> > + for n_field in range(len_fields):
> > + islast = n_field == (len_fields - 1)
> > + field = tp.GetFieldAtIndex(n_field)
> > + start_field = field.GetOffsetInBytes()
> > + if not islast:
> > + end_field = tp.GetFieldAtIndex(
> > + n_field + 1
> > + ).GetOffsetInBytes()
> > + else:
> > + end_field = tp.GetByteSize()
> > + if start_field <= offset and offset < end_field:
> > + next_name = self.member_by_offset(
> > + field.GetType(),
> > + offset - start_field,
> > + prev_name=field.GetName()
> > + )
> > + return '.{field}{suffix}'.format(
> > + field=field.GetName(),
> > + suffix=next_name if next_name else ''
> > + )
> > + if tp.GetTypeClass() == lldb.eTypeClassArray:
> >
>
> Typo?: elif
Fixed, thanks!
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -657,7 +657,7 @@ class _LLDBDebugger(Debugger):
field=field.GetName(),
suffix=next_name if next_name else ''
)
- if tp.GetTypeClass() == lldb.eTypeClassArray:
+ elif tp.GetTypeClass() == lldb.eTypeClassArray:
# Get array field type.
target = tp.GetArrayElementType()
tsize = target.GetByteSize()
===================================================================
<snipped>
> > +# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> > +# Non-weak guard. */
> >
>
> Typo: C comment end */
Removed, thanks!
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
@@ -1612,12 +1612,16 @@ IRS = [
# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
-# Non-weak guard. */
+# Non-weak guard.
IRM_C = 0x10
IRM_A = 0x20
IRM_L = 0x40
===================================================================
>
> >
> > +IRM_C = 0x10
> > +IRM_A = 0x20
> > +IRM_L = 0x40
> > +IRM_S = 0x60
> > +IRM_W = 0x80
> > +
> > +
> > +# IR operand mode (2 bit).
> > +IRM = [
> > + 'ref',
> > + 'lit',
> > + 'cst',
> > + '', # none
> > +]
> > +
> > +
> > +lj_ir_mode_ = None
> > +
> > +
> > +def lj_ir_mode():
> > + global lj_ir_mode_
> > + if lj_ir_mode_:
> > + return lj_ir_mode_
> > + lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
> > + return lj_ir_mode_
> > +
> > +
> > +def ir_left(op):
> > + return IRM[int(lj_ir_mode()[op] & 3)]
> >
>
> May be binary constant will be more clear? xxx & 0b0011
I prefer to leave it consistent with the original sources, see
<src/lj_ir.h>. Also, the bc decoding has the same format.
>
> >
> > +
> > +
> > +def ir_right(op):
> > + return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
> >
>
> May be binary constant will be more clear? (xxx & 0b1100) >> 2
Ditto.
>
> >
> > +
> > +
> > +def ir_mode(op):
> > + mode = ''
> > + ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> >
>
> >
> > + if ir_mode == IRM_C:
> > + mode = 'C'
> > + elif ir_mode == IRM_A:
> > + mode = 'A'
> > + elif ir_mode == IRM_L:
> > + mode = 'L'
> > + elif ir_mode == IRM_S:
> > + mode = 'S'
> > + else:
> > + mode = 'N'
> >
>
> >
> > + mode += 'W' if ir_mode & IRM_W else ''
> >
>
> May be table with 16 items and comments will be more clear? E. g. return XXX[(lj_ir_mode()[op] & 0b11110000) >> 4]
> And it will contain invalid values also.
> # <flag bits in a comment>
> XXX[0b0000] = ‘NW’ # Normal/Ref | !Non-weak guard
> XXX[0b0001] = ‘CW’ # Commutative | !Non-weak guard
> XXX[0b0011] = ‘Invalid’
> ...
> XXX[0b1000] = ‘N’ # Normal/Ref | Non-weak guard
> XXX[0b1001] = ‘C’ # Commutative | Non-weak guard
> XXX[0b1011] = ‘Invalid’
> ...
Rewrote with table usage as the following:
Also, you help me to notice that the original implementation was
incorrect (due to bits of operand modes after xor). Tests was corrupted
as well, fixed. Thanks!
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index d4f89eb5..32b0cea7 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1613,11 +1613,15 @@ IRS = [
# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
# Non-weak guard.
-IRM_C = 0x10
-IRM_A = 0x20
-IRM_L = 0x40
-IRM_S = 0x60
-IRM_W = 0x80
+IRM_BITS_W = 0x80
+IRM_BITS = {
+ 0x00: 'N',
+ 0x10: 'C',
+ 0x20: 'A',
+ 0x40: 'L',
+ 0x60: 'S',
+}
+IRM_BITS_MASK = 0x70
# IR operand mode (2 bit).
@@ -1649,19 +1653,10 @@ def ir_right(op):
def ir_mode(op):
- mode = ''
- ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
- if ir_mode == IRM_C:
- mode = 'C'
- elif ir_mode == IRM_A:
- mode = 'A'
- elif ir_mode == IRM_L:
- mode = 'L'
- elif ir_mode == IRM_S:
- mode = 'S'
- else:
- mode = 'N'
- mode += 'W' if ir_mode & IRM_W else ''
+ irmode = int((lj_ir_mode()[op]))
+ isweak = not bool(irmode & IRM_BITS_W)
+ mode = IRM_BITS[irmode & IRM_BITS_MASK]
+ mode += 'W' if isweak else ''
return mode
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 8e069fe0..f17de27e 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -530,12 +530,12 @@ class TestLJTraceBase(TestCaseBase):
r'\t*proto: ' + RX_ADDR + r'\n' +
r'\t*BC: ' + RX_ADDR + r'\n' +
r'---- TRACE IR\n' +
- RX_IRN + r'\s+ int SLOAD \[N \] lit: #[12] lit: C?I\n' +
+ RX_IRN + r'\s+ int SLOAD \[L \] lit: #[12] lit: C?I\n' +
RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
r' ref: integer 1\n' +
RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
r' ref: integer 4\n' +
- RX_IRN + r'\s+ > --- LOOP \[N \]\s*\n' +
+ RX_IRN + r'\s+ > --- LOOP \[S \]\s*\n' +
RX_IRN + r'\s+ \+ int ADD \[C \] ref: ' + RX_IRN +
r' ref: integer 1\n' +
RX_IRN + r'\s+ > int LE \[N \] ref: ' + RX_IRN +
===================================================================
But intentionally didn't use bit notation to be consistent with original
declarations in <src/lj_ir.h>. I've used the mask to strip lower bits
related to "NonWeak" guard and operand modes.
<snipped>
> > +# Don't use *[ to be compatible with Python 2.
> > +REGISTERS = {'x64': [
> > + 'rax',
> > + 'rcx',
> > + 'rdx',
> > + 'rbx',
> > + 'rsp',
> > + 'rbp',
> > + 'rsi',
> > + 'rdi',
> > +] + [
> > + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> > +] + [
> > + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> > +], 'arm64': [
> > + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> > +] + ['sp'] + [ # x31
> > + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> > +]}
> >
>
> It seems, the ‘arm64’ registers are missed.
Actiually no, but I understand your confusion.
Reformated as the following:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 32b0cea7..a79caad0 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1728,24 +1728,29 @@ IRFPMS = [
# Don't use *[ to be compatible with Python 2.
-REGISTERS = {'x64': [
- 'rax',
- 'rcx',
- 'rdx',
- 'rbx',
- 'rsp',
- 'rbp',
- 'rsi',
- 'rdi',
-] + [
- 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
-] + [
- 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
-], 'arm64': [
- 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
-] + ['sp'] + [ # x31
- 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
-]}
+REGISTERS = {
+ 'x64': [
+ 'rax',
+ 'rcx',
+ 'rdx',
+ 'rbx',
+ 'rsp',
+ 'rbp',
+ 'rsi',
+ 'rdi',
+ ] + [
+ 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
+ ] + [
+ 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
+ ],
+ 'arm64': [
+ 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
+ ] + [
+ 'sp' # x31
+ ] + [
+ 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
+ ]
+}
IR_CALLS = [
===================================================================
>
<snipped>
> > +def litname_xload(mode):
> > + flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
> >
>
> Does we need a range check as in litname_bufhdr()?
I prefer error raising if something goes wrong here (invalid IR or
incorrect extension implementation).
>
> >
> > + return flags[mode]
> > +
> > +
> > +def litname_conv(mode):
> >
>
> Does we need some range checking here?
Ditto.
>
> >
<snippped>
> > +
> > +
> > +def irt_ismarked(t):
> > + return t['irt'] & IRT_MARK
> >
>
> I propose explicit bool cast (!= 0) here and below.
Added:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index a79caad0..6b0827d9 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1978,15 +1978,15 @@ def tref_ref(tr):
def irt_ismarked(t):
- return t['irt'] & IRT_MARK
+ return bool(t['irt'] & IRT_MARK)
def irt_isphi(t):
- return t['irt'] & IRT_ISPHI
+ return bool(t['irt'] & IRT_ISPHI)
def irt_isguard(t):
- return t['irt'] & IRT_GUARD
+ return bool(t['irt'] & IRT_GUARD)
def irt_toitype(irt):
===================================================================
>
> >
> > +
> > +
> > +def irt_isphi(t):
> > + return t['irt'] & IRT_ISPHI
> > +
> > +
> > +def irt_isguard(t):
> > + return t['irt'] & IRT_GUARD
> > +
> > +
<snipped>
> >
> > --
> > 2.54.0
> >
--
Best regards,
Sergey Kaplun