[Tarantool-patches] [PATCH 2/2] gdb: refactor iteration over frames while dumping stack

Mikhail Elhimov m.elhimov at vk.team
Thu Sep 1 02:20:26 MSK 2022


Hi, Igor!

Thanks for the review!

On 31.08.2022 17:47, Igor Munkin wrote:
> Misha,
>
> Thanks for the patch! Please consider my comments below.
>
> On 23.07.22, Mikhail Elhimov via Tarantool-patches wrote:
>> Changes:
> Minor: You can omit "Changes:" here; just list them :)
Fixed
>> - Introduce generator to iterate over frames in a Python way
>> - Dump framelink of the bottom frame in a common way and identify it as
>>    a dummy framelink only inside of common dump function
>> - Framelink always refers to the slot right before frame's base
> Side note: These three bullets above are simply great, thanks a lot!
>
>> - Introduce constant LJ_FRAMELINK_NUM_SLOTS to make slot arithmetic
>>    more obvious. Previous using of boolean LJ_RT2 in arithmetic looked
>>    like a tricky way, both to read and maintain (however places where
>>    LJ_RT2 approach had been originally replicated from the source code
>>    are left intact, because it is expected that maintenance efforts for
>>    them will be minimal)
> Well, I'm also see no sense in introducing this variable.
> 1. It doesn't exist in LuaJIT sources.
> 2. It's simply LJ_FR2 + 1 (you can convert LJ_FR2 from the boolean
>     context to the numeric one, if you want)
> 3. It is required only in a single place: <frames> function. Another
>     place where it is used is <get_framelink_sentinel> but as Sergey
>     already mentioned, LJ_FR2 fits much better there.
>
> I suggest to stay as close as possible to LuaJIT defines and LuaJIT code
> base, since this extension also violates DRY approach, that is already
> ignored almost everywhere in LuaJIT codebase.
>
> All in all, I see no reason to introduce a new variable for a single
> case, and propose to continue using LJ_FR2 for such cases.
Fixed
>> ---
>>   src/luajit-gdb.py | 101 ++++++++++++++++++++++------------------------
>>   1 file changed, 49 insertions(+), 52 deletions(-)
>>
>> diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
>> index 1e9a96fb..d7d34c70 100644
>> --- a/src/luajit-gdb.py
>> +++ b/src/luajit-gdb.py
>> @@ -158,6 +158,7 @@ LJ_64 = None
>>   LJ_GC64 = None
>>   LJ_FR2 = None
>>   LJ_DUALNUM = None
>> +LJ_FRAMELINK_NUM_SLOTS = None
> Considering the comment right above, this can be dropped.
Fixed
>>   
>>   LJ_GCVMASK = ((1 << 47) - 1)
>>   LJ_TISNUM = None
>> @@ -299,6 +300,21 @@ gclen = {
>>       'mmudata': gcringlen,
>>   }
>>   
>> +def get_framelink_sentinel(L):
>> +    stack = mref('TValue *', L['stack'])
>> +    return stack + LJ_FRAMELINK_NUM_SLOTS - 1
> IMHO, you can move this into Frames section with the name kinda
> <frame_sentinel>. And use LJ_FR2 here to make it a oneliner.
Fixed
>> +
>> +# Generator that implements frame iterator
>> +# every frame is represented as a tuple of framelink and frametop
>> +def frames(L):
>> +    frametop = L['top']
>> +    framelink = L['base'] - 1
>> +    framelink_sentinel = get_framelink_sentinel(L)
>> +    while framelink is not None:
>> +        yield framelink, frametop
>> +        frametop = framelink - LJ_FRAMELINK_NUM_SLOTS
>> +        framelink = frame_prev(framelink) if framelink > framelink_sentinel else None
> Side note: I've seen the newest version.
>
>> +
>>   # Dumpers {{{
>>   
>>   def dump_lj_tnil(tv):
>> @@ -397,32 +413,38 @@ dumpers = {
>>   def dump_tvalue(tvalue):
>>       return dumpers.get(typenames(itypemap(tvalue)), dump_lj_invalid)(tvalue)
>>   
>> +def dump_framelink_slot_address(fr):
>> +    return '{}:{}'.format(fr - 1, fr) if LJ_FR2 else '{}'.format(fr) + PADDING
> Minor: It's better to split this line into two (like it's done for
> <frame_ftsz> or <frame_pc>). Feel free to ignore, since I was just
> confused whether PADDING is added only to the else part or to the whole
> if-else result.
Fixed
>> +
>> +def dump_framelink_func(fr):
>> +    return dump_lj_tfunc(fr - 1 if LJ_FR2 else fr)
> Oh.. Just use fr - LJ_FR2 in the call below, please: there is no need to
> introduce a function to be used once.
Fixed
>> +
>>   def dump_framelink(L, fr):
>> -    fr2 = fr + LJ_FR2
>> -
>> -    return '{fr}{padding} [    ] FRAME: [{pp}] delta={d}, {f}\n'.format(
>> -        fr = fr,
>> -        padding = ':{fr2}'.format(fr2 = fr2) if LJ_FR2 else PADDING,
>> -        pp = 'PP' if frame_ispcall(fr2) else '{frname}{p}'.format(
>> -            frname = frametypes(int(frame_type(fr2))),
>> -            p = 'P' if frame_typep(fr2) & FRAME_P else ''
>> +    if fr == get_framelink_sentinel(L):
>> +        return '{addr} [S   ] FRAME: dummy L'.format(
>> +            addr = dump_framelink_slot_address(fr),
>> +        )
>> +    return '{addr} [    ] FRAME: [{pp}] delta={d}, {f}'.format(
>> +        addr = dump_framelink_slot_address(fr),
>> +        pp = 'PP' if frame_ispcall(fr) else '{frname}{p}'.format(
>> +            frname = frametypes(int(frame_type(fr))),
>> +            p = 'P' if frame_typep(fr) & FRAME_P else ''
>>           ),
>> -        d = cast('TValue *', fr2) - cast('TValue *', frame_prev(fr2)),
>> -        f = dump_lj_tfunc(fr),
>> +        d = cast('TValue *', fr) - cast('TValue *', frame_prev(fr)),
>> +        f = dump_framelink_func(fr),
> Considering the comment right above, <dump_lj_tfunc> can be used here.
Fixed
>>       )
>>   
> <snipped>
>
>> @@ -717,7 +713,7 @@ The command requires no args and dumps current GC stats:
>>           ))
>>   
>>   def init(commands):
>> -    global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, LJ_TISNUM, PADDING
>> +    global LJ_64, LJ_GC64, LJ_FR2, LJ_DUALNUM, LJ_FRAMELINK_NUM_SLOTS, LJ_TISNUM, PADDING
> Considering the comment at the beginning, this part can be dropped.
Fixed (other occurrences of LJ_FRAMELINK_NUM_SLOTS are dropped as well)
>>   
>>       # XXX Fragile: though connecting the callback looks like a crap but it
>>       # respects both Python 2 and Python 3 (see #4828).
>> @@ -759,6 +755,7 @@ def init(commands):
>>           LJ_64 = str(gdb.parse_and_eval('IRT_PTR')) == 'IRT_P64'
>>           LJ_FR2 = LJ_GC64 = str(gdb.parse_and_eval('IRT_PGC')) == 'IRT_P64'
>>           LJ_DUALNUM = gdb.lookup_global_symbol('lj_lib_checknumber') is not None
>> +        LJ_FRAMELINK_NUM_SLOTS = 2 if LJ_FR2 else 1
>>       except:
>>           gdb.write('luajit-gdb.py failed to load: '
>>                     'no debugging symbols found for libluajit\n')
>> -- 
>> 2.34.1
>>
===================================================
diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
index 9846f3a2..85dffa7b 100644
--- a/src/luajit-gdb.py
+++ b/src/luajit-gdb.py
@@ -150,6 +150,9 @@ def frame_prev(framelink):
      return frame_prevl(framelink) if frame_islua(framelink) \
          else frame_prevd(framelink)
  
+def frame_sentinel(L):
+    return mref('TValue *', L['stack']) + LJ_FR2
+
  # }}}
  
  # Const {{{
@@ -299,6 +302,19 @@ gclen = {
      'mmudata': gcringlen,
  }
  
+# The generator that implements frame iterator.
+# Every frame is represented as a tuple of framelink and frametop.
+def frames(L):
+    frametop = L['top']
+    framelink = L['base'] - 1
+    framelink_sentinel = frame_sentinel(L)
+    while True:
+        yield framelink, frametop
+        frametop = framelink - (1 + LJ_FR2)
+        if framelink <= framelink_sentinel:
+            break
+        framelink = frame_prev(framelink)
+
  # Dumpers {{{
  
  def dump_lj_tnil(tv):
@@ -397,32 +413,36 @@ dumpers = {
  def dump_tvalue(tvalue):
      return dumpers.get(typenames(itypemap(tvalue)), dump_lj_invalid)(tvalue)
  
+def dump_framelink_slot_address(fr):
+    return '{}:{}'.format(fr - 1, fr) if LJ_FR2 \
+        else '{}'.format(fr) + PADDING
+
  def dump_framelink(L, fr):
-    fr2 = fr + LJ_FR2
-
-    return '{fr}{padding} [    ] FRAME: [{pp}] delta={d}, {f}\n'.format(
-        fr = fr,
-        padding = ':{fr2}'.format(fr2 = fr2) if LJ_FR2 else PADDING,
-        pp = 'PP' if frame_ispcall(fr2) else '{frname}{p}'.format(
-            frname = frametypes(int(frame_type(fr2))),
-            p = 'P' if frame_typep(fr2) & FRAME_P else ''
+    if fr == frame_sentinel(L):
+        return '{addr} [S   ] FRAME: dummy L'.format(
+            addr = dump_framelink_slot_address(fr),
+        )
+    return '{addr} [    ] FRAME: [{pp}] delta={d}, {f}'.format(
+        addr = dump_framelink_slot_address(fr),
+        pp = 'PP' if frame_ispcall(fr) else '{frname}{p}'.format(
+            frname = frametypes(int(frame_type(fr))),
+            p = 'P' if frame_typep(fr) & FRAME_P else ''
          ),
-        d = cast('TValue *', fr2) - cast('TValue *', frame_prev(fr2)),
-        f = dump_lj_tfunc(fr),
+        d = cast('TValue *', fr) - cast('TValue *', frame_prev(fr)),
+        f = dump_lj_tfunc(fr - LJ_FR2),
      )
  
-def dump_stack_slot(L, slot, base=None, top=None, eol='\n'):
+def dump_stack_slot(L, slot, base=None, top=None):
      base = base or L['base']
      top = top or L['top']
  
-    return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}{eol}'.format(
+    return '{addr}{padding} [ {B}{T}{M}] VALUE: {value}'.format(
          addr = strx64(slot),
          padding = PADDING,
          B = 'B' if slot == base else ' ',
          T = 'T' if slot == top else ' ',
          M = 'M' if slot == mref('TValue *', L['maxstack']) else ' ',
          value = dump_tvalue(slot),
-        eol = eol,
      )
  
  def dump_stack(L, base=None, top=None):
@@ -439,7 +459,7 @@ def dump_stack(L, base=None, top=None):
          ),
      ]
      dump.extend([
-        dump_stack_slot(L, maxstack + offset, base, top, '')
+        dump_stack_slot(L, maxstack + offset, base, top)
              for offset in range(red, 0, -1)
      ])
      dump.extend([
@@ -447,50 +467,24 @@ def dump_stack(L, base=None, top=None):
              padding = '-' * len(PADDING),
              nstackslots = int((tou64(maxstack) - tou64(stack)) >> 3),
          ),
-        dump_stack_slot(L, maxstack, base, top, ''),
+        dump_stack_slot(L, maxstack, base, top),
          '{start}:{end} [    ] {nfreeslots} slots: Free stack slots'.format(
              start = strx64(top + 1),
              end = strx64(maxstack - 1),
              nfreeslots = int((tou64(maxstack) - tou64(top) - 8) >> 3),
          ),
      ])
-    dump = '\n'.join(dump) + '\n'
-
-    slot = top
-    framelink = base - (1 + LJ_FR2)
-
-    # XXX: Lua stack unwinding algorithm consists of the following steps:
-    # 1. dump all data slots in the (framelink, top) interval
-    # 2. check whether there are remaining frames
-    # 3. if there are no slots further, stop the unwinding loop
-    # 4. otherwise, resolve the next framelink and top and go to (1)
-    #
-    # Postcondition (i.e. do-while) loops is the most fitting idiom for such
-    # case, but Python doesn't provide such lexical construction. Hence step (1)
-    # is unrolled for the topmost stack frame.
-    while slot > framelink + LJ_FR2:
-        dump += dump_stack_slot(L, slot, base, top)
-        slot -= 1
-
-    while framelink > stack:
-        assert slot == framelink + LJ_FR2, "Invalid slot during frame unwind"
-        dump += dump_framelink(L, framelink)
-        framelink = frame_prev(framelink + LJ_FR2) - LJ_FR2
-        slot -= 1 + LJ_FR2
-        while slot > framelink + LJ_FR2:
-            dump += dump_stack_slot(L, slot, base, top)
-            slot -= 1
-
-    assert slot == framelink + LJ_FR2, "Invalid slot after frame unwind"
-    # Skip a nil slot for the last frame for 2-slot frames.
-    slot -= LJ_FR2
-
-    dump += '{fr}{padding} [S   ] FRAME: dummy L'.format(
-        fr = slot,
-        padding = ':{nilslot}'.format(nilslot = slot + 1) if LJ_FR2 else PADDING
-    )
  
-    return dump
+    for framelink, frametop in frames(L):
+        # Dump all data slots in the (framelink, top) interval.
+        dump.extend([
+            dump_stack_slot(L, framelink + offset, base, top)
+                for offset in range(frametop - framelink, 0, -1)
+        ])
+        # Dump frame slot (2 slots in case of GC64).
+        dump.append( dump_framelink(L, framelink) )
+
+    return '\n'.join(dump)
  
  def dump_gc(g):
      gc = g['gc']

-- 
Best regards,
Mikhail Elhimov



More information about the Tarantool-patches mailing list