Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit 0/2] Introduce dumpers for bytecodes in gdb
@ 2022-06-09 10:11 Sergey Kaplun via Tarantool-patches
  2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 1/2] gdb: introduce dumpers for GCobj Sergey Kaplun via Tarantool-patches
  2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 2/2] gdb: introduce lj-bc, lj-func and lj-proto dumpers Sergey Kaplun via Tarantool-patches
  0 siblings, 2 replies; 3+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-06-09 10:11 UTC (permalink / raw)
  To: Maxim Kokryashkin, Igor Munkin; +Cc: tarantool-patches

Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-luajit-gdb-dump-bc

This patchset allows to inspect bytecodes as for single instruction, as for
all bytecodes inside function or its prototype via gdb. The first
auxiliary patch is needed to introduce dumpers for GCobject similar to
TValues dumpers. The second patch introduces 3 new commands:

* lj-bc <GCIns *> -- dump single bytecode instruction
* lj-func <GCfunc *> -- dump all bytecode instructions for Lua function
  or report type of C or F function
* lj-proto <GCproto *> -- dump all bytecode instructions for the
  prototype

For example, we have the following Lua script named <tmp.lua>:

| 1 local function mywhile(a)
| 2 	local r = 0
| 3 	print(a)
| 4 	while (a < 30) do
| 5 		r = r + a * r/2
| 6 	end
| 7 	return r
| 8 end
| 9
| 10 local uvname1 = false
| 11 local uvname2 = false
| 12 local function myif(a)
| 13 	local s1 = a + 4
| 14 	local s2 = s1 + 4
| 15 	uvname1 = "s10"
| 16 	uvname2 = "s11"
| 17 	print(a)
| 18 	if a > 10 then
| 19 		return a + s2 + s1
| 20 	else
| 21 		return a - 10 - s2 - s1
| 22 	end
| 23 end
| 24
| 25 local f1 = myif
| 26 local f2 = mywhile
| 27 myif(12)
| 28 mywhile(12)

Assume, we set a breakpoint at `lj_cf_print` (line 3).
The lj-stack output contains the following lines:

| 0x40001970            [    ] VALUE: Lua function @ 0x400083c0, 0 upvalues, "@../tmp.lua":1
| 0x40001968            [    ] VALUE: Lua function @ 0x40002148, 2 upvalues, "@../tmp.lua":12
| ...
| 0x40001940            [    ] FRAME: [V] delta=1, Lua function @ 0x400084a0, 0 upvalues, "@../tmp.lua":0

The first one is `myif()` function, the second is `mywhile()` and the
last one is function loaded via `dofile()`.

The resulting output for the functions is the following:

1)
| (gdb) lj-func 0x400083c0
| "@../tmp.lua":1-8
| 0000 FUNCF  rbase:   4
| 0001 KSHORT dst:     1 lits:    0
| 0002 GGET   dst:     2 str:     0 ; string "print" @ 0x400037f0
| 0003 MOV    dst:     3 var:     0
| 0004 CALL   base:    2 lit:     1 lit:     2
| 0005 KSHORT dst:     2 lits:   30
| 0006 ISGE   var:     0 var:     2
| 0007 JMP    rbase:   2 jump:  => 0013
| 0008 LOOP   rbase:   2 jump:  => 0013
| 0009 MULVV  dst:     2 var:     0 var:     1
| 0010 DIVVN  dst:     2 var:     2 num:     0 ; number 2
| 0011 ADDVV  dst:     1 var:     1 var:     2
| 0012 JMP    rbase:   2 jump:  => 0005
| 0013 RET1   rbase:   1 lit:     2

The report is the same as for the following command:
| lj-proto (GCproto *)(((char *)(((GCfuncL *)0x400083c0)->pc.ptr32))-sizeof(GCproto))

2)
| (gdb) lj-func 0x40002148
| "@../tmp.lua":12-23
| 0000 FUNCF  rbase:   5
| 0001 ADDVN  dst:     1 var:     0 num:     0 ; number 4
| 0002 ADDVN  dst:     2 var:     1 num:     0 ; number 4
| 0003 USETS  uv:      0 str:     0 ; 0x40002527 "uvname1" ; string "s10" @ 0x40002298
| 0004 USETS  uv:      1 str:     1 ; 0x4000252f "uvname2" ; string "s11" @ 0x400022b8
| 0005 GGET   dst:     3 str:     2 ; string "print" @ 0x400037f0
| 0006 MOV    dst:     4 var:     0
| 0007 CALL   base:    3 lit:     1 lit:     2
| 0008 KSHORT dst:     3 lits:   10
| 0009 ISGE   var:     3 var:     0
| 0010 JMP    rbase:   3 jump:  => 0015
| 0011 ADDVV  dst:     3 var:     0 var:     2
| 0012 ADDVV  dst:     3 var:     3 var:     1
| 0013 RET1   rbase:   3 lit:     2
| 0014 JMP    rbase:   3 jump:  => 0019
| 0015 SUBVN  dst:     3 var:     0 num:     1 ; number 10
| 0016 SUBVV  dst:     3 var:     3 var:     2
| 0017 SUBVV  dst:     3 var:     3 var:     1
| 0018 RET1   rbase:   3 lit:     2
| 0019 RET0   rbase:   0 lit:     1

3)

| (gdb) lj-func 0x400084a0
| "@../tmp.lua":0-30
| 0000 FUNCV  rbase:   8
| 0001 FNEW   dst:     0 func:    0 ; "@../tmp.lua":1
| 0002 KPRI   dst:     1 pri:     1
| 0003 KPRI   dst:     2 pri:     1
| 0004 FNEW   dst:     3 func:    1 ; "@../tmp.lua":12
| 0005 MOV    dst:     4 var:     3
| 0006 MOV    dst:     5 var:     0
| 0007 MOV    dst:     6 var:     3
| 0008 KSHORT dst:     7 lits:   12
| 0009 CALL   base:    6 lit:     1 lit:     2
| 0010 MOV    dst:     6 var:     0
| 0011 KSHORT dst:     7 lits:   12
| 0012 CALL   base:    6 lit:     1 lit:     2
| 0013 UCLO   rbase:   0 jump:  => 0014
| 0014 RET0   rbase:   0 lit:     1

The single bytecode instruction may be useful, when you debug VM:

| (gdb) b lj_BC_ISGE
| Breakpoint 2 at 0x5555555f0a08
| (gdb) c
| Continuing.
| Breakpoint 2, 0x00005555555f0a08 in lj_BC_ISGE ()
| (gdb) lj-bc $rbx # PC refers __the next instruction__
| JMP    rbase:   3 jump:  +5
| (gdb) lj-bc ((BCIns *)$rbx) - 1 # current instruction
| ISGE   var:     3 var:     0

Sergey Kaplun (2):
  gdb: introduce dumpers for GCobj
  gdb: introduce lj-bc, lj-func and lj-proto dumpers

 src/luajit-gdb.py | 475 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 425 insertions(+), 50 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Tarantool-patches] [PATCH luajit 1/2] gdb: introduce dumpers for GCobj
  2022-06-09 10:11 [Tarantool-patches] [PATCH luajit 0/2] Introduce dumpers for bytecodes in gdb Sergey Kaplun via Tarantool-patches
@ 2022-06-09 10:11 ` Sergey Kaplun via Tarantool-patches
  2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 2/2] gdb: introduce lj-bc, lj-func and lj-proto dumpers Sergey Kaplun via Tarantool-patches
  1 sibling, 0 replies; 3+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-06-09 10:11 UTC (permalink / raw)
  To: Maxim Kokryashkin, Igor Munkin; +Cc: tarantool-patches

Our gdb extension already has dumpers for TValues. But sometimes it
may be useful to dump GCobjects without stack context. This patch adds
additional wrappers around dumpers for GC objects, to get the
corresponding GC object from a TValue.
---
 src/luajit-gdb.py | 151 +++++++++++++++++++++++++++++++---------------
 1 file changed, 102 insertions(+), 49 deletions(-)

diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
index baf66f66..779a25f8 100644
--- a/src/luajit-gdb.py
+++ b/src/luajit-gdb.py
@@ -301,35 +301,28 @@ gclen = {
 
 # Dumpers {{{
 
-def dump_lj_tnil(tv):
-    return 'nil'
-
-def dump_lj_tfalse(tv):
-    return 'false'
+# GCobj dumpers.
 
-def dump_lj_ttrue(tv):
-    return 'true'
+def dump_lj_gco_lightud(gcobj):
+    return 'light userdata @ {}'.format(strx64(gcobj))
 
-def dump_lj_tlightud(tv):
-    return 'light userdata @ {}'.format(strx64(gcval(tv['gcr'])))
-
-def dump_lj_tstr(tv):
+def dump_lj_gco_str(gcobj):
     return 'string {body} @ {address}'.format(
-        body = strdata(gcval(tv['gcr'])),
-        address = strx64(gcval(tv['gcr']))
+        body = strdata(gcobj),
+        address = strx64(gcobj)
     )
 
-def dump_lj_tupval(tv):
-    return 'upvalue @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_gco_upval(gcobj):
+    return 'upvalue @ {}'.format(strx64(gcobj))
 
-def dump_lj_tthread(tv):
-    return 'thread @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_gco_thread(gcobj):
+    return 'thread @ {}'.format(strx64(gcobj))
 
-def dump_lj_tproto(tv):
-    return 'proto @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_gco_proto(gcobj):
+    return 'proto @ {}'.format(strx64(gcobj))
 
-def dump_lj_tfunc(tv):
-    func = cast('struct GCfuncC *', gcval(tv['gcr']))
+def dump_lj_gco_func(gcobj):
+    func = cast('struct GCfuncC *', gcobj)
     ffid = func['ffid']
 
     if ffid == 0:
@@ -345,57 +338,117 @@ def dump_lj_tfunc(tv):
     else:
         return 'fast function #{}'.format(int(ffid))
 
-def dump_lj_ttrace(tv):
-    trace = cast('struct GCtrace *', gcval(tv['gcr']))
+def dump_lj_gco_trace(gcobj):
+    trace = cast('struct GCtrace *', gcobj)
     return 'trace {traceno} @ {addr}'.format(
         traceno = strx64(trace['traceno']),
         addr = strx64(trace)
     )
 
-def dump_lj_tcdata(tv):
-    return 'cdata @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_gco_cdata(gcobj):
+    return 'cdata @ {}'.format(strx64(gcobj))
 
-def dump_lj_ttab(tv):
-    table = cast('GCtab *', gcval(tv['gcr']))
+def dump_lj_gco_tab(gcobj):
+    table = cast('GCtab *', gcobj)
     return 'table @ {gcr} (asize: {asize}, hmask: {hmask})'.format(
         gcr = strx64(table),
         asize = table['asize'],
         hmask = strx64(table['hmask']),
     )
 
-def dump_lj_tudata(tv):
-    return 'userdata @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_gco_udata(gcobj):
+    return 'userdata @ {}'.format(strx64(gcobj))
+
+def dump_lj_gco_invalid(gcobj):
+    return 'not valid type @ {}'.format(strx64(gcobj))
+
+# TValue dumpers.
+
+def dump_lj_tv_nil(tv):
+    return 'nil'
+
+def dump_lj_tv_false(tv):
+    return 'false'
+
+def dump_lj_tv_true(tv):
+    return 'true'
+
+def dump_lj_tv_lightud(tv):
+    return dump_lj_gco_lightud(gcval(tv['gcr']))
+
+def dump_lj_tv_str(tv):
+    return dump_lj_gco_str(gcval(tv['gcr']))
+
+def dump_lj_tv_upval(tv):
+    return dump_lj_gco_upval(gcval(tv['gcr']))
+
+def dump_lj_tv_thread(tv):
+    return dump_lj_gco_thread(gcval(tv['gcr']))
+
+def dump_lj_tv_proto(tv):
+    return dump_lj_gco_proto(gcval(tv['gcr']))
+
+def dump_lj_tv_func(tv):
+    return dump_lj_gco_func(gcval(tv['gcr']))
+
+def dump_lj_tv_trace(tv):
+    return dump_lj_gco_trace(gcval(tv['gcr']))
 
-def dump_lj_tnumx(tv):
+def dump_lj_tv_cdata(tv):
+    return dump_lj_gco_cdata(gcval(tv['gcr']))
+
+def dump_lj_tv_tab(tv):
+    return dump_lj_gco_tab(gcval(tv['gcr']))
+
+def dump_lj_tv_udata(tv):
+    return dump_lj_gco_udata(gcval(tv['gcr']))
+
+def dump_lj_tv_numx(tv):
     if tvisint(tv):
         return 'integer {}'.format(cast('int32_t', tv['i']))
     else:
         return 'number {}'.format(cast('double', tv['n']))
 
-def dump_lj_invalid(tv):
-    return 'not valid type @ {}'.format(strx64(gcval(tv['gcr'])))
+def dump_lj_tv_invalid(tv):
+    return dump_lj_tv_udata(gcval(tv['gcr']))
 
 # }}}
 
-dumpers = {
-    'LJ_TNIL': dump_lj_tnil,
-    'LJ_TFALSE': dump_lj_tfalse,
-    'LJ_TTRUE': dump_lj_ttrue,
-    'LJ_TLIGHTUD': dump_lj_tlightud,
-    'LJ_TSTR': dump_lj_tstr,
-    'LJ_TUPVAL': dump_lj_tupval,
-    'LJ_TTHREAD': dump_lj_tthread,
-    'LJ_TPROTO': dump_lj_tproto,
-    'LJ_TFUNC': dump_lj_tfunc,
-    'LJ_TTRACE': dump_lj_ttrace,
-    'LJ_TCDATA': dump_lj_tcdata,
-    'LJ_TTAB': dump_lj_ttab,
-    'LJ_TUDATA': dump_lj_tudata,
-    'LJ_TNUMX': dump_lj_tnumx,
+gco_dumpers = {
+    'LJ_TLIGHTUD': dump_lj_gco_lightud,
+    'LJ_TSTR':     dump_lj_gco_str,
+    'LJ_TUPVAL':   dump_lj_gco_upval,
+    'LJ_TTHREAD':  dump_lj_gco_thread,
+    'LJ_TPROTO':   dump_lj_gco_proto,
+    'LJ_TFUNC':    dump_lj_gco_func,
+    'LJ_TTRACE':   dump_lj_gco_trace,
+    'LJ_TCDATA':   dump_lj_gco_cdata,
+    'LJ_TTAB':     dump_lj_gco_tab,
+    'LJ_TUDATA':   dump_lj_gco_udata,
 }
 
+tv_dumpers = {
+    'LJ_TNIL':     dump_lj_tv_nil,
+    'LJ_TFALSE':   dump_lj_tv_false,
+    'LJ_TTRUE':    dump_lj_tv_true,
+    'LJ_TLIGHTUD': dump_lj_tv_lightud,
+    'LJ_TSTR':     dump_lj_tv_str,
+    'LJ_TUPVAL':   dump_lj_tv_upval,
+    'LJ_TTHREAD':  dump_lj_tv_thread,
+    'LJ_TPROTO':   dump_lj_tv_proto,
+    'LJ_TFUNC':    dump_lj_tv_func,
+    'LJ_TTRACE':   dump_lj_tv_trace,
+    'LJ_TCDATA':   dump_lj_tv_cdata,
+    'LJ_TTAB':     dump_lj_tv_tab,
+    'LJ_TUDATA':   dump_lj_tv_udata,
+    'LJ_TNUMX':    dump_lj_tv_numx,
+}
+
+def dump_gcobj(gcobj):
+    return gco_dumpers.get(typenames(i2notu32(gcobj['gch']['gct'])), dump_lj_gco_invalid)(gcobj)
+
 def dump_tvalue(tvalue):
-    return dumpers.get(typenames(itypemap(tvalue)), dump_lj_invalid)(tvalue)
+    return tv_dumpers.get(typenames(itypemap(tvalue)), dump_lj_tv_invalid)(tvalue)
 
 def dump_framelink(L, fr):
     fr2 = fr + LJ_FR2
@@ -408,7 +461,7 @@ def dump_framelink(L, fr):
             p = 'P' if frame_typep(fr2) & FRAME_P else ''
         ),
         d = cast('TValue *', fr2) - cast('TValue *', frame_prev(fr2)),
-        f = dump_lj_tfunc(fr),
+        f = dump_lj_tv_func(fr),
     )
 
 def dump_stack_slot(L, slot, base=None, top=None, eol='\n'):
-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Tarantool-patches] [PATCH luajit 2/2] gdb: introduce lj-bc, lj-func and lj-proto dumpers
  2022-06-09 10:11 [Tarantool-patches] [PATCH luajit 0/2] Introduce dumpers for bytecodes in gdb Sergey Kaplun via Tarantool-patches
  2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 1/2] gdb: introduce dumpers for GCobj Sergey Kaplun via Tarantool-patches
@ 2022-06-09 10:11 ` Sergey Kaplun via Tarantool-patches
  1 sibling, 0 replies; 3+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2022-06-09 10:11 UTC (permalink / raw)
  To: Maxim Kokryashkin, Igor Munkin; +Cc: tarantool-patches

This patch adds dumpers as for single bytecode instruction (`lj-bc`), as
for all bytecodes inside one function (`lj-func`) or prototype
(`lj-proto`). Its dump is quite similar with -bl flag, but also reports
types of registers operands (`jmp`, `dst`, `str`, etc.).
---
 src/luajit-gdb.py | 324 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 323 insertions(+), 1 deletion(-)

diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
index 779a25f8..339b57ed 100644
--- a/src/luajit-gdb.py
+++ b/src/luajit-gdb.py
@@ -113,8 +113,162 @@ def frametypes(ft):
         FRAME['VARG'] : 'V',
     }.get(ft, '?')
 
+def bc_op(ins):
+    return int(ins) & 0xff
+
 def bc_a(ins):
-    return (ins >> 8) & 0xff
+    return (int(ins) >> 8) & 0xff
+
+def bc_b(ins):
+    return int(ins) >> 24
+
+def bc_c(ins):
+    return (int(ins) >> 16) & 0xff
+
+def bc_d(ins):
+    return int(ins) >> 16
+
+___ = 0
+BC_NAME = 0
+BC_A = 1
+BC_B = 2
+BC_CD = 3
+BC_MM = 4
+
+BYTECODES = [
+    # Comparison ops. ORDER OPR.
+    ['ISLT',   'var',  ___,     'var',    'lt'],
+    ['ISGE',   'var',  ___,     'var',    'lt'],
+    ['ISLE',   'var',  ___,     'var',    'le'],
+    ['ISGT',   'var',  ___,     'var',    'le'],
+
+    ['ISEQV',  'var',  ___,     'var',    'eq'],
+    ['ISNEV',  'var',  ___,     'var',    'eq'],
+    ['ISEQS',  'var',  ___,     'str',    'eq'],
+    ['ISNES',  'var',  ___,     'str',    'eq'],
+    ['ISEQN',  'var',  ___,     'num',    'eq'],
+    ['ISNEN',  'var',  ___,     'num',    'eq'],
+    ['ISEQP',  'var',  ___,     'pri',    'eq'],
+    ['ISNEP',  'var',  ___,     'pri',    'eq'],
+
+    # Unary test and copy ops.
+    ['ISTC',   'dst',  ___,     'var',    ___],
+    ['ISFC',   'dst',  ___,     'var',    ___],
+    ['IST',    ___,    ___,     'var',    ___],
+    ['ISF',    ___,    ___,     'var',    ___],
+    ['ISTYPE', 'var',  ___,     'lit',    ___],
+    ['ISNUM',  'var',  ___,     'lit',    ___],
+    ['MOV',    'dst',  ___,     'var',    ___],
+    ['NOT',    'dst',  ___,     'var',    ___],
+    ['UNM',    'dst',  ___,     'var',    'unm'],
+    ['LEN',    'dst',  ___,     'var',    'len'],
+    ['ADDVN',  'dst',  'var',   'num',    'add'],
+    ['SUBVN',  'dst',  'var',   'num',    'sub'],
+    ['MULVN',  'dst',  'var',   'num',    'mul'],
+    ['DIVVN',  'dst',  'var',   'num',    'div'],
+    ['MODVN',  'dst',  'var',   'num',    'mod'],
+
+    # Binary ops. ORDER OPR.
+    ['ADDNV',  'dst',  'var',   'num',    'add'],
+    ['SUBNV',  'dst',  'var',   'num',    'sub'],
+    ['MULNV',  'dst',  'var',   'num',    'mul'],
+    ['DIVNV',  'dst',  'var',   'num',    'div'],
+    ['MODNV',  'dst',  'var',   'num',    'mod'],
+
+    ['ADDVV',  'dst',  'var',   'var',    'add'],
+    ['SUBVV',  'dst',  'var',   'var',    'sub'],
+    ['MULVV',  'dst',  'var',   'var',    'mul'],
+    ['DIVVV',  'dst',  'var',   'var',    'div'],
+    ['MODVV',  'dst',  'var',   'var',    'mod'],
+
+    ['POW',    'dst',  'var',   'var',    'pow'],
+    ['CAT',    'dst',  'rbase', 'rbase',  'concat'],
+
+    # Constant ops.
+    ['KSTR',   'dst',   ___,    'str',    ___],
+    ['KCDATA', 'dst',   ___,    'cdata',  ___],
+    ['KSHORT', 'dst',   ___,    'lits',   ___],
+    ['KNUM',   'dst',   ___,    'num',    ___],
+    ['KPRI',   'dst',   ___,    'pri',    ___],
+    ['KNIL',   'base',  ___,    'base',   ___],
+
+    # Upvalue and function ops.
+    ['UGET',   'dst',    ___,    'uv',     ___],
+    ['USETV',  'uv',     ___,    'var',    ___],
+    ['USETS',  'uv',     ___,    'str',    ___],
+    ['USETN',  'uv',     ___,    'num',    ___],
+    ['USETP',  'uv',     ___,    'pri',    ___],
+    ['UCLO',   'rbase',  ___,    'jump',   ___],
+    ['FNEW',   'dst',    ___,    'func',   ___],
+
+    # Table ops.
+    ['TNEW',   'dst',   ___,    'lit',    ___],
+    ['TDUP',   'dst',   ___,    'tab',    ___],
+    ['GGET',   'dst',   ___,    'str',    'index'],
+    ['GSET',   'var',   ___,    'str',    'newindex'],
+    ['TGETV',  'dst',   'var',  'var',    'index'],
+    ['TGETS',  'dst',   'var',  'str',    'index'],
+    ['TGETB',  'dst',   'var',  'lit',    'index'],
+    ['TGETR',  'dst',   'var',  'var',    'index'],
+    ['TSETV',  'var',   'var',  'var',    'newindex'],
+    ['TSETS',  'var',   'var',  'str',    'newindex'],
+    ['TSETB',  'var',   'var',  'lit',    'newindex'],
+    ['TSETM',  'base',  ___,    'num',    'newindex'],
+    ['TSETR',  'var',   'var',  'var',    'newindex'],
+
+    # Calls and vararg handling. T = tail call.
+    ['CALLM',  'base',  'lit',  'lit',    'call'],
+    ['CALL',   'base',  'lit',  'lit',    'call'],
+    ['CALLMT', 'base',  ___,    'lit',    'call'],
+    ['CALLT',  'base',  ___,    'lit',    'call'],
+    ['ITERC',  'base',  'lit',  'lit',    'call'],
+    ['ITERN',  'base',  'lit',  'lit',    'call'],
+    ['VARG',   'base',  'lit',  'lit',    ___],
+    ['ISNEXT', 'base',  ___,    'jump',   ___],
+
+    # Returns.
+    ['RETM',   'base',  ___,    'lit',    ___],
+    ['RET',    'rbase', ___,    'lit',    ___],
+    ['RET0',   'rbase', ___,    'lit',    ___],
+    ['RET1',   'rbase', ___,    'lit',    ___],
+
+    # Loops and branches. I/J = interp/JIT, I/C/L = init/call/loop.
+    ['FORI',   'base',  ___,    'jump',   ___],
+    ['JFORI',  'base',  ___,    'jump',   ___],
+
+    ['FORL',   'base',  ___,    'jump',   ___],
+    ['IFORL',  'base',  ___,    'jump',   ___],
+    ['JFORL',  'base',  ___,    'lit',    ___],
+
+    ['ITERL',  'base',  ___,    'jump',   ___],
+    ['IITERL', 'base',  ___,    'jump',   ___],
+    ['JITERL', 'base',  ___,    'lit',    ___],
+
+    ['LOOP',   'rbase', ___,    'jump',   ___],
+    ['ILOOP',  'rbase', ___,    'jump',   ___],
+    ['JLOOP',  'rbase', ___,    'lit',    ___],
+
+    ['JMP',    'rbase', ___,    'jump',   ___],
+
+    # Function headers. I/J = interp/JIT, F/V/C = fixarg/vararg/C func.
+    ['FUNCF',  'rbase', ___,    ___,      ___],
+    ['IFUNCF', 'rbase', ___,    ___,      ___],
+    ['JFUNCF', 'rbase', ___,    'lit',    ___],
+    ['FUNCV',  'rbase', ___,    ___,      ___],
+    ['IFUNCV', 'rbase', ___,    ___,      ___],
+    ['JFUNCV', 'rbase', ___,    'lit',    ___],
+    ['FUNCC',  'rbase', ___,    ___,      ___],
+    ['FUNCCW', 'rbase', ___,    ___,      ___],
+]
+
+def proto_bc(proto):
+    return cast('BCIns *', cast('char *', proto) + gdb.lookup_type('GCproto').sizeof)
+
+def proto_kgc(pt, idx):
+    return gcref(mref('GCRef *', pt['k'])[idx])
+
+def proto_knumtv(pt, idx):
+    return mref('TValue *', pt['k'])[idx]
 
 def frame_ftsz(framelink):
     return cast('ptrdiff_t', framelink['ftsz'] if LJ_FR2 \
@@ -561,6 +715,132 @@ def dump_gc(g):
 
     return '\n'.join(map(lambda s: '\t' + s, stats))
 
+def proto_loc(proto):
+    return '{chunk}:{firstline}'.format(
+        chunk = strdata(cast('GCstr *', gcval(proto['chunkname']))),
+        firstline = proto['firstline'],
+    )
+
+def funck(pt, idx):
+    if idx >= 0:
+        assert idx < pt['sizekn'], 'invalid idx for numeric constant in proto'
+        tv = proto_knumtv(pt, idx)
+        return dump_tvalue(tv)
+    else:
+        assert ~idx < pt['sizekgc'], 'invalid idx for GC constant in proto'
+        gcobj = proto_kgc(pt, idx)
+        if typenames(i2notu32(gcobj['gch']['gct'])) == 'LJ_TPROTO':
+            return proto_loc(cast('GCproto *', gcobj))
+        return dump_gcobj(gcobj)
+
+def funcuvname(pt, idx):
+    assert idx < pt['sizeuv'], 'invalid idx for upvalue in proto'
+    uvinfo = mref('uint8_t *', pt['uvinfo'])
+    if not uvinfo:
+        return ''
+
+    # if (idx) while (*uvinfo++ || --idx);
+    while idx > 0:
+        while uvinfo[0]:
+            uvinfo += 1
+        uvinfo += 1
+        idx -= 1
+
+    return str(cast('char *', uvinfo))
+
+def dump_reg(bc, reg, value, jmp_format=None, jmp_ctx=None):
+    rtype = bc[reg]
+    is_jmp = rtype == 'jump'
+    padding = ':' + ' ' * (5 - len(rtype))
+
+    if rtype == 'jump':
+        # Destination of jump instruction encoded as offset from BCBIAS_J.
+        delta = value - 0x7fff
+        if jmp_format:
+            value = jmp_format(jmp_ctx, delta)
+        else:
+            prefix = '+' if delta >= 0 else ''
+            value = prefix + str(delta)
+    else:
+        value = '{:3d}'.format(value)
+
+    return '{rtype}{padding} {value}'.format(
+        rtype = rtype,
+        padding = padding,
+        value = value,
+    )
+
+def dump_kc(bc, reg, value, proto):
+    rtype = bc[reg]
+    kc = ''
+    if proto:
+        if rtype == 'str' or rtype == 'func':
+            kc = funck(proto, ~value)
+        elif rtype == 'num':
+            kc = funck(proto, value)
+        elif rtype == 'uv':
+            kc = funcuvname(proto, value)
+
+        if kc != '':
+            kc = ' ; ' + kc
+    return kc
+
+def dump_bc(ins, jmp_format=None, jmp_ctx=None, proto=None):
+    op = bc_op(ins)
+    if op >= len(BYTECODES):
+        return 'INVALID'
+
+    bc = BYTECODES[op]
+    bc_name = bc[BC_NAME]
+    name_padding = ' ' * (6 - len(bc_name))
+
+    bc_hasa = bc[BC_A]
+    bc_hasb = bc[BC_B]
+
+    kca = dump_kc(bc, BC_A, bc_a(ins), proto) if bc_hasa else ''
+    kcc = dump_kc(bc, BC_CD, bc_c(ins) if bc_hasb else bc_d(ins), proto) if bc[BC_CD] else ''
+
+    return '{name}{npad} {ra}{rb}{rcd}{kc}'.format(
+        name = bc_name,
+        npad = name_padding,
+        ra = dump_reg(bc, BC_A, bc_a(ins)) + ' ' if bc_hasa else '',
+        rb = dump_reg(bc, BC_B, bc_b(ins)) + ' ' if bc_hasb else '',
+        rcd = dump_reg(
+            bc, BC_CD, bc_c(ins) if bc_hasb else bc_d(ins),
+            jmp_format=jmp_format, jmp_ctx=jmp_ctx
+        ) if bc[BC_CD] else '',
+        kc=kca+kcc
+    )
+
+def dump_proto(proto):
+    startbc = proto_bc(proto)
+    func_loc = proto_loc(proto)
+    # Location has the following format: '{chunk}:{firstline}'.
+    dump = '{func_loc}-{lastline}\n'.format(
+        func_loc = func_loc,
+        lastline = proto['firstline'] + proto['numline'],
+    )
+
+    def jmp_format(npc_from, delta):
+        return '=> ' + str(npc_from + delta).zfill(4)
+
+    for bcnum in range(0, proto['sizebc']):
+        dump += (str(bcnum).zfill(4) + ' ' + dump_bc(
+            startbc[bcnum], jmp_format=jmp_format, jmp_ctx=bcnum,
+            proto = proto,
+        ) + '\n')
+    return dump
+
+def dump_func(func):
+    ffid = func['ffid']
+
+    if ffid == 0:
+        pt = funcproto(func)
+        return dump_proto(pt)
+    elif ffid == 1:
+        return 'C function @ {}'.format(strx64(func['f']))
+    else:
+        return 'fast function #{}'.format(int(ffid))
 
 class LJBase(gdb.Command):
 
@@ -767,6 +1047,45 @@ The command requires no args and dumps current GC stats:
             stats = dump_gc(g)
         ))
 
+class LJDumpBC(LJBase):
+    '''
+lj-bc <BCIns *>
+
+The command receives a pointer to bytecode instruction and dumps
+type of an instruction, the values of RA, RB and RC (or RD) registers.
+    '''
+
+    def invoke(self, arg, from_tty):
+        gdb.write('{}\n'.format(dump_bc(cast("BCIns *", parse_arg(arg))[0])))
+
+class LJDumpProto(LJBase):
+    '''
+lj-proto <GCproto *>
+
+The command receives a <gcr> of the corresponding GCproto object and dumps
+the chunk name, where the corresponding function is defined, corresponding
+range of lines and a list of bytecodes related to this function.
+
+The constants or upvalues of the prototype are decoded after ';'.
+    '''
+
+    def invoke(self, arg, from_tty):
+        gdb.write('{}'.format(dump_proto(cast("GCproto *", parse_arg(arg)))))
+
+class LJDumpFunc(LJBase):
+    '''
+lj-funcl <GCfunc *>
+
+The command receives a <gcr> of the corresponding GCfunc object and dumps
+the chunk name, where the corresponding function is defined, corresponding
+range of lines and a list of bytecodes related to this function.
+
+The constants or upvalues of the function are decoded after ';'.
+    '''
+
+    def invoke(self, arg, from_tty):
+        gdb.write('{}'.format(dump_func(cast("GCfuncC *", parse_arg(arg)))))
+
 def init(commands):
     global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, LJ_TISNUM, PADDING
 
@@ -832,6 +1151,9 @@ def load(event=None):
         'lj-stack': LJDumpStack,
         'lj-state': LJState,
         'lj-gc': LJGC,
+        'lj-bc': LJDumpBC,
+        'lj-proto': LJDumpProto,
+        'lj-func': LJDumpFunc,
     })
 
 load(None)
-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-06-09 10:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-09 10:11 [Tarantool-patches] [PATCH luajit 0/2] Introduce dumpers for bytecodes in gdb Sergey Kaplun via Tarantool-patches
2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 1/2] gdb: introduce dumpers for GCobj Sergey Kaplun via Tarantool-patches
2022-06-09 10:11 ` [Tarantool-patches] [PATCH luajit 2/2] gdb: introduce lj-bc, lj-func and lj-proto dumpers Sergey Kaplun via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox