Tarantool development patches archive
 help / color / mirror / Atom feed
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Evgeniy Temirgaleev <e.temirgaleev@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers
Date: Sun, 28 Jun 2026 19:32:38 +0300	[thread overview]
Message-ID: <akFMplezjjPtVBsW@root> (raw)
In-Reply-To: <1782608610.240669306@f725.i.mail.ru>

Hi, Evgeniy!
Thanks for the review!
See my answers below.

Branch is force-pushed with the fixes.

On 28.06.26, Evgeniy Temirgaleev wrote:
> Hi, Sergey!
> 
> Thanks for the patch. Please, see my comments.
> 
> --
> Best regards,
> Evgeniy Temirgaleev
> 
> > 
> > От кого: Sergey Kaplun <skaplun@tarantool.org>
> > Кому: Sergey Bronnikov <sergeyb@tarantool.org>, Evgeniy Temirgaleev <e.temirgaleev@tarantool.org
> > >
> > Копия: tarantool-patches@dev.tarantool.org, Sergey Kaplun <skaplun@tarantool.org
> > >
> > Дата: Четверг, 25 июня 2026, 23:29 +03:00
> > This patch adds dumpers for a single IR instruction (`lj-ir`), as well
> > as for all bytecodes inside one trace (`lj-trace`). Its dump is quite
> > similar to the -jdump flag but also reports types of register operands
> > (`ref`, `lit`, `cst`) and operation mode (`N`, `A`, `W`, etc.).
> > The `lj-trace` command accepts optional /rs flags to dump registers
> > associated with IR and snapshots for the trace correspondingly.
> > The `lj-ir` command can be used for dumping IR constants as well.
> > The `lj-jslots` command dumps the content of `J->slot`. It is useful to
> > simplify debugging of `rec_check_slots()` assertion failures.
> > 
> > For LLDB value, the `__getitem__` metamethod now accepts bool keys.
> > Also, `__index__` is set to allow lldb.value to be used as an index
> > without explicit conversion to int. Old GDB versions (below 7.12) are
> > not supported because of the gdb.Value lacks the `__index__` metamethod
> > and can't be monkey-patched. The support for these versions may be added
> > by demand.
> > 
> > Part of tarantool/tarantool#4808
> > ---
> > src/luajit_dbg.py | 1216 ++++++++++++++++-
> > .../debug-extension-tests.py | 365 +++++
> > 2 files changed, 1570 insertions(+), 11 deletions(-)
> > 
> > diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
> > index 2edb199a..fd6ca8a5 100644
> > --- a/src/luajit_dbg.py
> > +++ b/src/luajit_dbg.py

<snipped>

> > + if tp.GetTypeClass() == lldb.eTypeClassStruct:
> > + len_fields = tp.GetNumberOfFields()
> > + for n_field in range(len_fields):
> > + islast = n_field == (len_fields - 1)
> > + field = tp.GetFieldAtIndex(n_field)
> > + start_field = field.GetOffsetInBytes()
> > + if not islast:
> > + end_field = tp.GetFieldAtIndex(
> > + n_field + 1
> > + ).GetOffsetInBytes()
> > + else:
> > + end_field = tp.GetByteSize()
> > + if start_field <= offset and offset < end_field:
> > + next_name = self.member_by_offset(
> > + field.GetType(),
> > + offset - start_field,
> > + prev_name=field.GetName()
> > + )
> > + return '.{field}{suffix}'.format(
> > + field=field.GetName(),
> > + suffix=next_name if next_name else ''
> > + )
> > + if tp.GetTypeClass() == lldb.eTypeClassArray:
> > 
> 
> Typo?: elif

Fixed, thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -657,7 +657,7 @@ class _LLDBDebugger(Debugger):
                         field=field.GetName(),
                         suffix=next_name if next_name else ''
                     )
-        if tp.GetTypeClass() == lldb.eTypeClassArray:
+        elif tp.GetTypeClass() == lldb.eTypeClassArray:
             # Get array field type.
             target = tp.GetArrayElementType()
             tsize = target.GetByteSize()
===================================================================

<snipped>

> > +# Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
> > +# Non-weak guard. */
> > 
> 
> Typo: C comment end */

Removed, thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 62cd65d5..3b7cf7a1 100644
@@ -1612,12 +1612,16 @@ IRS = [


 # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
-# Non-weak guard. */
+# Non-weak guard.
 IRM_C = 0x10
 IRM_A = 0x20
 IRM_L = 0x40
===================================================================

> 
> > 
> > +IRM_C = 0x10
> > +IRM_A = 0x20
> > +IRM_L = 0x40
> > +IRM_S = 0x60
> > +IRM_W = 0x80
> > +
> > +
> > +# IR operand mode (2 bit).
> > +IRM = [
> > + 'ref',
> > + 'lit',
> > + 'cst',
> > + '', # none
> > +]
> > +
> > +
> > +lj_ir_mode_ = None
> > +
> > +
> > +def lj_ir_mode():
> > + global lj_ir_mode_
> > + if lj_ir_mode_:
> > + return lj_ir_mode_
> > + lj_ir_mode_ = dbg.lookup_global('lj_ir_mode')
> > + return lj_ir_mode_
> > +
> > +
> > +def ir_left(op):
> > + return IRM[int(lj_ir_mode()[op] & 3)]
> > 
> 
> May be binary constant will be more clear? xxx & 0b0011

I prefer to leave it consistent with the original sources, see
<src/lj_ir.h>. Also, the bc decoding has the same format.

> 
> > 
> > +
> > +
> > +def ir_right(op):
> > + return IRM[int(lj_ir_mode()[op] >> 2 & 3)]
> > 
> 
> May be binary constant will be more clear? (xxx & 0b1100) >> 2

Ditto.

> 
> > 
> > +
> > +
> > +def ir_mode(op):
> > + mode = ''
> > + ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
> > 
> 
> > 
> > + if ir_mode == IRM_C:
> > + mode = 'C'
> > + elif ir_mode == IRM_A:
> > + mode = 'A'
> > + elif ir_mode == IRM_L:
> > + mode = 'L'
> > + elif ir_mode == IRM_S:
> > + mode = 'S'
> > + else:
> > + mode = 'N'
> > 
> 
> > 
> > + mode += 'W' if ir_mode & IRM_W else ''
> > 
> 
> May be table with 16 items and comments will be more clear? E. g. return XXX[(lj_ir_mode()[op] & 0b11110000) >> 4]
> And it will contain invalid values also.
> # <flag bits in a comment>
> XXX[0b0000] = ‘NW’ # Normal/Ref | !Non-weak guard
> XXX[0b0001] = ‘CW’ # Commutative | !Non-weak guard
> XXX[0b0011] = ‘Invalid’
> ...
> XXX[0b1000] = ‘N’ # Normal/Ref | Non-weak guard
> XXX[0b1001] = ‘C’ # Commutative | Non-weak guard
> XXX[0b1011] = ‘Invalid’
> ...

Rewrote with table usage as the following:
Also, you help me to notice that the original implementation was
incorrect (due to bits of operand modes after xor). Tests was corrupted
as well, fixed. Thanks!

===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index d4f89eb5..32b0cea7 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1613,11 +1613,15 @@ IRS = [
 
 # Mode bits: Commutative, {Normal/Ref, Alloc, Load, Store},
 # Non-weak guard.
-IRM_C = 0x10
-IRM_A = 0x20
-IRM_L = 0x40
-IRM_S = 0x60
-IRM_W = 0x80
+IRM_BITS_W = 0x80
+IRM_BITS = {
+    0x00: 'N',
+    0x10: 'C',
+    0x20: 'A',
+    0x40: 'L',
+    0x60: 'S',
+}
+IRM_BITS_MASK = 0x70
 
 
 # IR operand mode (2 bit).
@@ -1649,19 +1653,10 @@ def ir_right(op):
 
 
 def ir_mode(op):
-    mode = ''
-    ir_mode = int(lj_ir_mode()[op] ^ IRM_W)
-    if ir_mode == IRM_C:
-        mode = 'C'
-    elif ir_mode == IRM_A:
-        mode = 'A'
-    elif ir_mode == IRM_L:
-        mode = 'L'
-    elif ir_mode == IRM_S:
-        mode = 'S'
-    else:
-        mode = 'N'
-    mode += 'W' if ir_mode & IRM_W else ''
+    irmode = int((lj_ir_mode()[op]))
+    isweak = not bool(irmode & IRM_BITS_W)
+    mode = IRM_BITS[irmode & IRM_BITS_MASK]
+    mode += 'W' if isweak else ''
     return mode
 
 
diff --git a/test/tarantool-debugger-tests/debug-extension-tests.py b/test/tarantool-debugger-tests/debug-extension-tests.py
index 8e069fe0..f17de27e 100644
--- a/test/tarantool-debugger-tests/debug-extension-tests.py
+++ b/test/tarantool-debugger-tests/debug-extension-tests.py
@@ -530,12 +530,12 @@ class TestLJTraceBase(TestCaseBase):
         r'\t*proto: ' + RX_ADDR + r'\n' +
         r'\t*BC: ' + RX_ADDR + r'\n' +
         r'---- TRACE IR\n' +
-        RX_IRN + r'\s+    int SLOAD  \[N \] lit: #[12]   lit: C?I\n' +
+        RX_IRN + r'\s+    int SLOAD  \[L \] lit: #[12]   lit: C?I\n' +
         RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
                  r' ref: integer 1\n' +
         RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
                  r' ref: integer 4\n' +
-        RX_IRN + r'\s+ >  --- LOOP   \[N \]\s*\n' +
+        RX_IRN + r'\s+ >  --- LOOP   \[S \]\s*\n' +
         RX_IRN + r'\s+ \+ int ADD    \[C \] ref: ' + RX_IRN +
                  r' ref: integer 1\n' +
         RX_IRN + r'\s+ >  int LE     \[N \] ref: ' + RX_IRN +
===================================================================

But intentionally didn't use bit notation to be consistent with original
declarations in <src/lj_ir.h>. I've used the mask to strip lower bits
related to "NonWeak" guard and operand modes.

<snipped>

> > +# Don't use *[ to be compatible with Python 2.
> > +REGISTERS = {'x64': [
> > + 'rax',
> > + 'rcx',
> > + 'rdx',
> > + 'rbx',
> > + 'rsp',
> > + 'rbp',
> > + 'rsi',
> > + 'rdi',
> > +] + [
> > + 'r{}'.format(i) for i in range(8, 16) # r8 .. r15
> > +] + [
> > + 'xmm{}'.format(i) for i in range(0, 16) # xmm0 .. xmm15
> > +], 'arm64': [
> > + 'x{}'.format(i) for i in range(0, 31) # x0 .. x30
> > +] + ['sp'] + [ # x31
> > + 'd{}'.format(i) for i in range(0, 32) # d0 .. d31
> > +]}
> > 
> 
> It seems, the ‘arm64’ registers are missed.

Actiually no, but I understand your confusion.
Reformated as the following:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index 32b0cea7..a79caad0 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1728,24 +1728,29 @@ IRFPMS = [
 
 
 # Don't use *[ to be compatible with Python 2.
-REGISTERS = {'x64': [
-    'rax',
-    'rcx',
-    'rdx',
-    'rbx',
-    'rsp',
-    'rbp',
-    'rsi',
-    'rdi',
-] + [
-    'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
-] + [
-    'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
-], 'arm64': [
-    'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
-] + ['sp'] + [  # x31
-    'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
-]}
+REGISTERS = {
+    'x64': [
+        'rax',
+        'rcx',
+        'rdx',
+        'rbx',
+        'rsp',
+        'rbp',
+        'rsi',
+        'rdi',
+    ] + [
+        'r{}'.format(i) for i in range(8, 16)  # r8 .. r15
+    ] + [
+        'xmm{}'.format(i) for i in range(0, 16)  # xmm0 .. xmm15
+    ],
+    'arm64': [
+        'x{}'.format(i) for i in range(0, 31)  # x0 .. x30
+    ] + [
+        'sp'  # x31
+    ] + [
+        'd{}'.format(i) for i in range(0, 32)  # d0 .. d31
+    ]
+}
 
 
 IR_CALLS = [
===================================================================

> 

<snipped>

> > +def litname_xload(mode):
> > + flags = ['-', 'R', 'V', 'RV', 'U', 'RU', 'VU', 'RVU']
> > 
> 
> Does we need a range check as in litname_bufhdr()?

I prefer error raising if something goes wrong here (invalid IR or
incorrect extension implementation).

> 
> > 
> > + return flags[mode]
> > +
> > +
> > +def litname_conv(mode):
> > 
> 
> Does we need some range checking here?

Ditto.

> 
> > 

<snippped>

> > +
> > +
> > +def irt_ismarked(t):
> > + return t['irt'] & IRT_MARK
> > 
> 
> I propose explicit bool cast (!= 0) here and below.

Added:
===================================================================
diff --git a/src/luajit_dbg.py b/src/luajit_dbg.py
index a79caad0..6b0827d9 100644
--- a/src/luajit_dbg.py
+++ b/src/luajit_dbg.py
@@ -1978,15 +1978,15 @@ def tref_ref(tr):
 
 
 def irt_ismarked(t):
-    return t['irt'] & IRT_MARK
+    return bool(t['irt'] & IRT_MARK)
 
 
 def irt_isphi(t):
-    return t['irt'] & IRT_ISPHI
+    return bool(t['irt'] & IRT_ISPHI)
 
 
 def irt_isguard(t):
-    return t['irt'] & IRT_GUARD
+    return bool(t['irt'] & IRT_GUARD)
 
 
 def irt_toitype(irt):
===================================================================

> 
> > 
> > +
> > +
> > +def irt_isphi(t):
> > + return t['irt'] & IRT_ISPHI
> > +
> > +
> > +def irt_isguard(t):
> > + return t['irt'] & IRT_GUARD
> > +
> > +

<snipped>

> > 
> > --
> > 2.54.0
> >

-- 
Best regards,
Sergey Kaplun

  reply	other threads:[~2026-06-28 16:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-25 20:29 [Tarantool-patches] [PATCH luajit 0/3] Extend debug extension Sergey Kaplun via Tarantool-patches
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers Sergey Kaplun via Tarantool-patches
2026-06-28  1:03   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-28 16:32     ` Sergey Kaplun via Tarantool-patches [this message]
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 2/3] dbg: introduce lj-ctype command, extend cdata dump Sergey Kaplun via Tarantool-patches
2026-06-29 13:55   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-25 20:29 ` [Tarantool-patches] [PATCH luajit 3/3] test: add verbose mode for debug extension tests Sergey Kaplun via Tarantool-patches
2026-06-28  1:31   ` Evgeniy Temirgaleev via Tarantool-patches
2026-06-28 15:19     ` Sergey Kaplun via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=akFMplezjjPtVBsW@root \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=e.temirgaleev@tarantool.org \
    --cc=skaplun@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH luajit 1/3] dbg: introduce lj-ir, lj-jslots, lj-trace dumpers' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox