[Tarantool-patches] [PATCH luajit 0/2] Introduce dumpers for bytecodes in gdb

Sergey Kaplun skaplun at tarantool.org
Thu Jun 9 13:11:12 MSK 2022


Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-luajit-gdb-dump-bc

This patchset makes it possible to inspect bytecode via gdb, either a
single instruction or all instructions of a function or its prototype.
The first, auxiliary patch introduces dumpers for GC objects, similar to
the existing TValue dumpers. The second patch introduces 3 new commands:

* lj-bc <BCIns *> -- dump a single bytecode instruction
* lj-func <GCfunc *> -- dump all bytecode instructions of a Lua function,
  or report the function type for a C or fast function
* lj-proto <GCproto *> -- dump all bytecode instructions for the
  prototype

For example, we have the following Lua script named <tmp.lua>:

| 1 local function mywhile(a)
| 2 	local r = 0
| 3 	print(a)
| 4 	while (a < 30) do
| 5 		r = r + a * r/2
| 6 	end
| 7 	return r
| 8 end
| 9
| 10 local uvname1 = false
| 11 local uvname2 = false
| 12 local function myif(a)
| 13 	local s1 = a + 4
| 14 	local s2 = s1 + 4
| 15 	uvname1 = "s10"
| 16 	uvname2 = "s11"
| 17 	print(a)
| 18 	if a > 10 then
| 19 		return a + s2 + s1
| 20 	else
| 21 		return a - 10 - s2 - s1
| 22 	end
| 23 end
| 24
| 25 local f1 = myif
| 26 local f2 = mywhile
| 27 myif(12)
| 28 mywhile(12)

Assume we set a breakpoint at `lj_cf_print` (line 3).
The lj-stack output then contains the following lines:

| 0x40001970            [    ] VALUE: Lua function @ 0x400083c0, 0 upvalues, "@../tmp.lua":1
| 0x40001968            [    ] VALUE: Lua function @ 0x40002148, 2 upvalues, "@../tmp.lua":12
| ...
| 0x40001940            [    ] FRAME: [V] delta=1, Lua function @ 0x400084a0, 0 upvalues, "@../tmp.lua":0

The first one is the `mywhile()` function (defined at line 1), the second
is `myif()` (defined at line 12, with 2 upvalues), and the last one is the
main chunk loaded via `dofile()`.

The resulting output for the functions is the following:

1)
| (gdb) lj-func 0x400083c0
| "@../tmp.lua":1-8
| 0000 FUNCF  rbase:   4
| 0001 KSHORT dst:     1 lits:    0
| 0002 GGET   dst:     2 str:     0 ; string "print" @ 0x400037f0
| 0003 MOV    dst:     3 var:     0
| 0004 CALL   base:    2 lit:     1 lit:     2
| 0005 KSHORT dst:     2 lits:   30
| 0006 ISGE   var:     0 var:     2
| 0007 JMP    rbase:   2 jump:  => 0013
| 0008 LOOP   rbase:   2 jump:  => 0013
| 0009 MULVV  dst:     2 var:     0 var:     1
| 0010 DIVVN  dst:     2 var:     2 num:     0 ; number 2
| 0011 ADDVV  dst:     1 var:     1 var:     2
| 0012 JMP    rbase:   2 jump:  => 0005
| 0013 RET1   rbase:   1 lit:     2

The report is the same as the output of the following command:
| lj-proto (GCproto *)(((char *)(((GCfuncL *)0x400083c0)->pc.ptr32))-sizeof(GCproto))
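The pointer arithmetic in the command above relies on the bytecode of a
Lua function being stored right after its GCproto header, so the
prototype address is the function's `pc` minus `sizeof(GCproto)`. A
minimal sketch of that calculation in plain Python follows. The helper
name `proto_address` and the numeric values are hypothetical and serve
only as illustration; inside gdb the real values would come from the
inferior.

```python
def proto_address(pc_addr: int, sizeof_gcproto: int) -> int:
    """Return the address of the GCproto header that precedes the
    bytecode array starting at pc_addr.

    Mirrors the C expression from the command above:
        (GCproto *)((char *)pc - sizeof(GCproto))
    """
    return pc_addr - sizeof_gcproto


# Hypothetical values, for illustration only: a bytecode array at
# 0x2040 and a 64-byte GCproto header.
print(hex(proto_address(0x2040, 64)))
```

The size of GCproto depends on the build (for example, on the GC64
mode), so it is passed in as a parameter instead of being hard-coded.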

2)
| (gdb) lj-func 0x40002148
| "@../tmp.lua":12-23
| 0000 FUNCF  rbase:   5
| 0001 ADDVN  dst:     1 var:     0 num:     0 ; number 4
| 0002 ADDVN  dst:     2 var:     1 num:     0 ; number 4
| 0003 USETS  uv:      0 str:     0 ; 0x40002527 "uvname1" ; string "s10" @ 0x40002298
| 0004 USETS  uv:      1 str:     1 ; 0x4000252f "uvname2" ; string "s11" @ 0x400022b8
| 0005 GGET   dst:     3 str:     2 ; string "print" @ 0x400037f0
| 0006 MOV    dst:     4 var:     0
| 0007 CALL   base:    3 lit:     1 lit:     2
| 0008 KSHORT dst:     3 lits:   10
| 0009 ISGE   var:     3 var:     0
| 0010 JMP    rbase:   3 jump:  => 0015
| 0011 ADDVV  dst:     3 var:     0 var:     2
| 0012 ADDVV  dst:     3 var:     3 var:     1
| 0013 RET1   rbase:   3 lit:     2
| 0014 JMP    rbase:   3 jump:  => 0019
| 0015 SUBVN  dst:     3 var:     0 num:     1 ; number 10
| 0016 SUBVV  dst:     3 var:     3 var:     2
| 0017 SUBVV  dst:     3 var:     3 var:     1
| 0018 RET1   rbase:   3 lit:     2
| 0019 RET0   rbase:   0 lit:     1

3)

| (gdb) lj-func 0x400084a0
| "@../tmp.lua":0-30
| 0000 FUNCV  rbase:   8
| 0001 FNEW   dst:     0 func:    0 ; "@../tmp.lua":1
| 0002 KPRI   dst:     1 pri:     1
| 0003 KPRI   dst:     2 pri:     1
| 0004 FNEW   dst:     3 func:    1 ; "@../tmp.lua":12
| 0005 MOV    dst:     4 var:     3
| 0006 MOV    dst:     5 var:     0
| 0007 MOV    dst:     6 var:     3
| 0008 KSHORT dst:     7 lits:   12
| 0009 CALL   base:    6 lit:     1 lit:     2
| 0010 MOV    dst:     6 var:     0
| 0011 KSHORT dst:     7 lits:   12
| 0012 CALL   base:    6 lit:     1 lit:     2
| 0013 UCLO   rbase:   0 jump:  => 0014
| 0014 RET0   rbase:   0 lit:     1

Dumping a single bytecode instruction may be useful when you debug the VM:

| (gdb) b lj_BC_ISGE
| Breakpoint 2 at 0x5555555f0a08
| (gdb) c
| Continuing.
| Breakpoint 2, 0x00005555555f0a08 in lj_BC_ISGE ()
| (gdb) lj-bc $rbx # PC refers to __the next instruction__
| JMP    rbase:   3 jump:  +5
| (gdb) lj-bc ((BCIns *)$rbx) - 1 # current instruction
| ISGE   var:     3 var:     0

Sergey Kaplun (2):
  gdb: introduce dumpers for GCobj
  gdb: introduce lj-bc, lj-func and lj-proto dumpers

 src/luajit-gdb.py | 475 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 425 insertions(+), 50 deletions(-)

-- 
2.34.1


