Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype
@ 2021-08-21 12:49 Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:49 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Here the results of benchmark [1] for introduced patchset.
The time is average for 15 runs:

| -------------------------------------- |
|            JIT-off, memprof-on         |
| -------------------------------------- |
|       BEFORE       |       AFTER       |
| ------------------ | ----------------- |
| 5.3859 (0.0325)    | +0.2534 (0.0248)  |
| -------------------------------------- |

BEFORE branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5679-report-jit-allocations-as-internal

[1]: https://gist.github.com/Shishqa/94152d538816fc64b68200336e6305d3

Issue: https://github.com/tarantool/tarantool/issues/5815
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated

Mikhail Shishatskiy (4):
  memprof: enrich symtab when meeting new prototype
  memprof: demangle symbols on the spot
  memprof: substitute long proto names with aliases
  luajit: change order of modules

 src/lj_bcread.c                               |   1 +
 src/lj_memprof.c                              |  19 +++-
 src/lj_memprof.h                              |  16 ++-
 src/lj_obj.h                                  |   7 ++
 src/lj_parse.c                                |   1 +
 .../misclib-memprof-lapi.test.lua             | 102 +++++++++++++-----
 tools/memprof.lua                             |   5 +-
 tools/memprof/humanize.lua                    |  51 +++++++--
 tools/memprof/parse.lua                       |  67 ++++++++----
 tools/memprof/process.lua                     |  12 +--
 tools/utils/symtab.lua                        |  20 +++-
 11 files changed, 230 insertions(+), 71 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Tarantool-patches] [PATCH luajit v1 1/4] memprof: enrich symtab when meeting new prototype
  2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches
@ 2021-08-21 12:49 ` Mikhail Shishatskiy via Tarantool-patches
  2021-08-24 14:16   ` Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:49 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Previously, the profiler dumped all the function prototypes only
at the start of profiling. It was enough for most cases, but
sometimes we may want to investigate the memory profile of module
loading. In such a case, we expect memprof to dump new prototypes
from parsed source code.

This patch extends memprof's streaming format with entries providing
extra information about allocation source's symbols if they were not
streamed at the start of profiling.

| loc         := loc-lua | loc-lua-sym | loc-c | loc-trace
| loc-lua-sym := sym-addr sym-chunk sym-line line-no

The `sym-chunk` and `sym-line` are the prototype's chunk name and line
where it was defined.

To recognize entries with new symbols, event-header format changed:

| event-header: [FUTSSSEE]
| * T : 1 bit indicating if additional symtab entry provided or not

Also, the profiler parser is adjusted to recognize entries described
above and enrich the symtab on the fly while parsing events. For this
reason, <utils/symtab.lua> API has been changed: function
`parse_sym_lfunc` became public and now returns sym-addr to be used
in the event parser module.

Resolves tarantool/tarantool#5815
---

Issue: https://github.com/tarantool/tarantool/issues/5815
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated

 src/lj_bcread.c                               |   1 +
 src/lj_memprof.c                              |  19 +++-
 src/lj_memprof.h                              |  16 ++-
 src/lj_obj.h                                  |   7 ++
 src/lj_parse.c                                |   1 +
 .../misclib-memprof-lapi.test.lua             | 102 +++++++++++++-----
 tools/memprof/parse.lua                       |  55 +++++++---
 tools/utils/symtab.lua                        |   6 +-
 8 files changed, 160 insertions(+), 47 deletions(-)

diff --git a/src/lj_bcread.c b/src/lj_bcread.c
index 48c5e7c7..692e1a9d 100644
--- a/src/lj_bcread.c
+++ b/src/lj_bcread.c
@@ -353,6 +353,7 @@ GCproto *lj_bcread_proto(LexState *ls)
   pt->sizeuv = (uint8_t)sizeuv;
   pt->flags = (uint8_t)flags;
   pt->trace = 0;
+  pt->is_streamed = 0;
   setgcref(pt->chunkname, obj2gco(ls->chunkname));
 
   /* Close potentially uninitialized gap between bc and kgc. */
diff --git a/src/lj_memprof.c b/src/lj_memprof.c
index 14e4e67b..fb217a2c 100644
--- a/src/lj_memprof.c
+++ b/src/lj_memprof.c
@@ -67,7 +67,7 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
 static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
 {
   const GCRef *iter = &g->gc.root;
-  const GCobj *o;
+  GCobj *o;
   const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
 
   /* Write prologue. */
@@ -76,11 +76,12 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
   while ((o = gcref(*iter)) != NULL) {
     switch (o->gch.gct) {
     case (~LJ_TPROTO): {
-      const GCproto *pt = gco2pt(o);
+      GCproto *pt = gco2pt(o);
       lj_wbuf_addbyte(out, SYMTAB_LFUNC);
       lj_wbuf_addu64(out, (uintptr_t)pt);
       lj_wbuf_addstring(out, proto_chunknamestr(pt));
       lj_wbuf_addu64(out, (uint64_t)pt->firstline);
+      pt->is_streamed = 1;
       break;
     }
     case (~LJ_TTRACE): {
@@ -138,6 +139,7 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent,
   ** -DLUAJIT_DISABLE_DEBUGINFO flag.
   */
   const BCLine line = lj_debug_frameline(L, fn, nextframe);
+  GCproto *pt = funcproto(fn);
 
   if (line < 0) {
     /*
@@ -149,9 +151,20 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent,
     ** We report such allocations as internal in order not to confuse users.
     */
     lj_wbuf_addbyte(out, aevent | ASOURCE_INT);
+
+  } else if (LJ_UNLIKELY(!pt->is_streamed)) {
+
+    lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC | LJM_SYMTAB);
+    lj_wbuf_addu64(out, (uintptr_t)pt);
+    lj_wbuf_addstring(out, proto_chunknamestr(pt));
+    lj_wbuf_addu64(out, (uint64_t)pt->firstline);
+    lj_wbuf_addu64(out, (uint64_t)line);
+    pt->is_streamed = 1;
+
   } else {
+
     lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC);
-    lj_wbuf_addu64(out, (uintptr_t)funcproto(fn));
+    lj_wbuf_addu64(out, (uintptr_t)pt);
     lj_wbuf_addu64(out, (uint64_t)line);
   }
 }
diff --git a/src/lj_memprof.h b/src/lj_memprof.h
index e72dadf7..0f5b4c6d 100644
--- a/src/lj_memprof.h
+++ b/src/lj_memprof.h
@@ -57,7 +57,7 @@
 #define SYMTAB_TRACE ((uint8_t)1)
 #define SYMTAB_FINAL ((uint8_t)0x80)
 
-#define LJM_CURRENT_FORMAT_VERSION 0x02
+#define LJM_CURRENT_FORMAT_VERSION 0x03
 
 /*
 ** Event stream format:
@@ -73,11 +73,14 @@
 ** event-realloc  := event-header loc? oaddr osize naddr nsize
 ** event-free     := event-header loc? oaddr osize
 ** event-header   := <BYTE>
-** loc            := loc-lua | loc-c | loc-trace
+** loc            := loc-lua | loc-lua-sym | loc-c | loc-trace
 ** loc-lua        := sym-addr line-no
+** loc-lua-sym    := sym-addr sym-chunk sym-line line-no
 ** loc-c          := sym-addr
 ** loc-trace      := trace-addr trace-no
 ** sym-addr       := <ULEB128>
+** sym-chunk      := string
+** sym-line       := <ULEB128>
 ** line-no        := <ULEB128>
 ** trace-addr     := <ULEB128>
 ** trace-no       := <ULEB128>
@@ -85,6 +88,9 @@
 ** naddr          := <ULEB128>
 ** osize          := <ULEB128>
 ** nsize          := <ULEB128>
+** string         := string-len string-payload
+** string-len     := <ULEB128>
+** string-payload := <BYTE> {string-len}
 ** epilogue       := event-header
 **
 ** <BYTE>   :  A single byte (no surprises here)
@@ -95,10 +101,11 @@
 ** version: [VVVVVVVV]
 **  * VVVVVVVV: Byte interpreted as a plain integer version number
 **
-** event-header: [FUUSSSEE]
+** event-header: [FUTSSSEE]
 **  * EE   : 2 bits for representing allocation event type (AEVENT_*)
 **  * SSS  : 3 bits for representing allocation source type (ASOURCE_*)
-**  * UU   : 2 unused bits
+**  * T    : 1 bit indicating if additional symtab entry provided or not
+**  * U    : 1 unused bit
 **  * F    : 0 for regular events, 1 for epilogue's *F*inal header
 **           (if F is set to 1, all other bits are currently ignored)
 */
@@ -114,6 +121,7 @@
 #define ASOURCE_CFUNC ((uint8_t)(3 << 2))
 #define ASOURCE_TRACE ((uint8_t)(4 << 2))
 
+#define LJM_SYMTAB          0x20
 #define LJM_EPILOGUE_HEADER 0x80
 
 /* Profiler public API. */
diff --git a/src/lj_obj.h b/src/lj_obj.h
index d26e60be..c8f6b13a 100644
--- a/src/lj_obj.h
+++ b/src/lj_obj.h
@@ -385,6 +385,13 @@ typedef struct GCproto {
   MRef lineinfo;	/* Compressed map from bytecode ins. to source line. */
   MRef uvinfo;		/* Upvalue names. */
   MRef varinfo;		/* Names and compressed extents of local variables. */
+#if LJ_HASMEMPROF
+  /*
+  ** Flag, indicating, if this proto has been streamed to the
+  ** memprof's binary output.
+  */
+  uint8_t is_streamed;
+#endif
 } GCproto;
 
 /* Flags for prototype. */
diff --git a/src/lj_parse.c b/src/lj_parse.c
index a6325a76..1a540610 100644
--- a/src/lj_parse.c
+++ b/src/lj_parse.c
@@ -1577,6 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
   pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN));
   pt->numparams = fs->numparams;
   pt->framesize = fs->framesize;
+  pt->is_streamed = 0;
   setgcref(pt->chunkname, obj2gco(ls->chunkname));
 
   /* Close potentially uninitialized gap between bc and kgc. */
diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
index 17bcb2f1..f4666da5 100644
--- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
+++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
@@ -7,7 +7,7 @@ require("utils").skipcond(
 local tap = require("tap")
 
 local test = tap.test("misc-memprof-lapi")
-test:plan(4)
+test:plan(5)
 
 local jit_opt_default = {
     3, -- level
@@ -70,7 +70,7 @@ local function generate_parsed_output(payload)
 
   local reader = bufread.new(TMP_BINFILE)
   local symbols = symtab.parse(reader)
-  local events = memprof.parse(reader)
+  local events = memprof.parse(reader, symbols)
 
   -- We don't need it any more.
   os.remove(TMP_BINFILE)
@@ -79,7 +79,9 @@ local function generate_parsed_output(payload)
 end
 
 local function fill_ev_type(events, symbols, event_type)
-  local ev_type = {}
+  local ev_type = {
+    trace = {},
+  }
   for _, event in pairs(events[event_type]) do
     local addr = event.loc.addr
     local traceno = event.loc.traceno
@@ -87,7 +89,7 @@ local function fill_ev_type(events, symbols, event_type)
     if traceno ~= 0 and symbols.trace[traceno] then
       local trace_loc = symbols.trace[traceno].sym_loc
       addr = trace_loc.addr
-      ev_type[trace_loc.line] = {
+      ev_type.trace[traceno] = {
         name = string.format("TRACE [%d] %s:%d",
           traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
         ),
@@ -99,9 +101,13 @@ local function fill_ev_type(events, symbols, event_type)
         num = event.num,
     }
     elseif symbols.lfunc[addr] then
-      ev_type[event.loc.line] = {
+      local source = symbols.lfunc[addr].source
+      if ev_type[source] == nil then
+        ev_type[source] = {}
+      end
+      ev_type[source][event.loc.line] = {
         name = string.format(
-          "%s:%d", symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
+          "%s:%d", source, symbols.lfunc[addr].linedefined
         ),
         num = event.num,
       }
@@ -110,23 +116,35 @@ local function fill_ev_type(events, symbols, event_type)
   return ev_type
 end
 
-local function form_source_line(line)
-  return string.format("@%s:%d", arg[0], line)
+local function form_source_line(loc)
+  return string.format("%s:%d", loc.source, loc.function_line)
 end
 
-local function check_alloc_report(alloc, traceno, line, function_line, nevents)
-  assert(alloc[line], ("no event on line %d"):format(line))
-  local event = alloc[line]
-  local expected_name
-  if traceno ~= 0 then
-    expected_name = string.format("TRACE [%d] ", traceno)..
-                    form_source_line(function_line)
+local function get_event_name_trace(loc, alloc)
+  local event = alloc.trace[loc.traceno]
+  return event, string.format("TRACE [%d] ", loc.traceno)..form_source_line(loc)
+end
+
+local function get_event_name_lua(loc, alloc)
+  local event = alloc[loc.source][loc.line]
+  return event, form_source_line(loc)
+end
+
+local function check_alloc_report(loc, alloc, nevents)
+  if loc.source == nil then
+    loc.source = '@'..arg[0]
+  end
+
+  local event, name
+  if loc.traceno ~= nil then
+    event, name = get_event_name_trace(loc, alloc)
   else
-    expected_name = form_source_line(function_line)
+    event, name = get_event_name_lua(loc, alloc)
   end
-  assert(expected_name == event.name, ("got='%s', expected='%s'"):format(
+
+  assert(name == event.name, ("got='%s', expected='%s'"):format(
     event.name,
-    expected_name
+    name
   ))
   assert(event.num == nevents, ("got=%d, expected=%d"):format(
     event.num,
@@ -175,9 +193,9 @@ test:test("output", function(subtest)
   -- the number of allocations.
   -- 1 event - alocation of table by itself + 1 allocation
   -- of array part as far it is bigger than LJ_MAX_COLOSIZE (16).
-  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
+  subtest:ok(check_alloc_report({line = 34, function_line = 32}, alloc, 2))
   -- 20 strings allocations.
-  subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20))
+  subtest:ok(check_alloc_report({line = 39, function_line = 32}, alloc, 20))
 
   -- Collect all previous allocated objects.
   subtest:ok(free.INTERNAL.num == 22)
@@ -185,8 +203,8 @@ test:test("output", function(subtest)
   -- Tests for leak-only option.
   -- See also https://github.com/tarantool/tarantool/issues/5812.
   local heap_delta = process.form_heap_delta(events, symbols)
-  local tab_alloc_stats = heap_delta[form_source_line(34)]
-  local str_alloc_stats = heap_delta[form_source_line(39)]
+  local tab_alloc_stats = heap_delta[form_source_line({ function_line = 34 })]
+  local str_alloc_stats = heap_delta[form_source_line({ function_line = 39 })]
   subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree)
   subtest:ok(tab_alloc_stats.dbytes == 0)
   subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree)
@@ -215,6 +233,38 @@ test:test("stack-resize", function(subtest)
   misc.memprof.stop()
 end)
 
+-- Test for extending symtab with function prototypes
+-- while profiler is running.
+test:test("symtab-enriching", function(subtest)
+  subtest:plan(2)
+
+  local payload_str = [[
+  local M = {
+    tmp = string.rep("1", 100)  -- line 2.
+  }
+
+  function M.payload()
+    local str = string.rep("42", 100)  -- line 6.
+  end
+
+  return M
+  ]]
+
+  local symbols, events = generate_parsed_output(function()
+    local str_chunk = assert(loadstring(payload_str, 'str_chunk'))()
+    str_chunk.payload()
+  end)
+
+  local alloc = fill_ev_type(events, symbols, "alloc")
+
+  subtest:ok(check_alloc_report(
+    {source = 'str_chunk', line = 6, function_line = 5}, alloc, 1)
+  )
+  subtest:ok(check_alloc_report(
+    {source = 'str_chunk', line = 2, function_line = 0}, alloc, 1)
+  )
+end)
+
 -- Test profiler with enabled JIT.
 jit.on()
 
@@ -247,9 +297,13 @@ test:test("jit-output", function(subtest)
   alloc = fill_ev_type(events, symbols, "alloc")
 
   -- We expect, that loop will be compiled into a trace.
-  subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20))
+  subtest:ok(check_alloc_report(
+    { traceno = 1, line = 37, function_line = 32 }, alloc, 20
+  ))
   -- See same checks with jit.off().
-  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
+  subtest:ok(check_alloc_report(
+    { line = 34, function_line = 32 }, alloc, 2
+  ))
 
   -- Restore default JIT settings.
   jit.opt.start(unpack(jit_opt_default))
diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
index adc7c072..888c6636 100644
--- a/tools/memprof/parse.lua
+++ b/tools/memprof/parse.lua
@@ -8,11 +8,14 @@ local bit = require "bit"
 local band = bit.band
 local lshift = bit.lshift
 
+local symtab = require "utils.symtab"
+
 local string_format = string.format
 
 local LJM_MAGIC = "ljm"
-local LJM_CURRENT_VERSION = 0x02
+local LJM_CURRENT_VERSION = 0x03
 
+local LJM_SYMTAB          = 0x20
 local LJM_EPILOGUE_HEADER = 0x80
 
 local AEVENT_ALLOC = 1
@@ -26,9 +29,10 @@ local ASOURCE_LFUNC = lshift(2, 2)
 local ASOURCE_CFUNC = lshift(3, 2)
 local ASOURCE_TRACE = lshift(4, 2)
 
-local ASOURCE_MASK = lshift(0x7, 2)
+local ASOURCE_MASK      = lshift(0xF, 2)
+local ASOURCE_TYPE_MASK = lshift(0x7, 2)
 
-local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC
+local EV_HEADER_MAX = LJM_SYMTAB + ASOURCE_TRACE + AEVENT_REALLOC
 
 local M = {}
 
@@ -70,7 +74,18 @@ local function id_location(addr, line, traceno)
   }
 end
 
-local function parse_location(reader, asource)
+local function parse_location_symbols(reader, asource, symbols)
+  if asource == ASOURCE_LFUNC then
+    return id_location(
+      symtab.parse_sym_lfunc(reader, symbols),
+      reader:read_uleb128(),
+      0
+    )
+  end
+  error("Unknown asource "..asource)
+end
+
+local function parse_location_common(reader, asource)
   if asource == ASOURCE_INT then
     return id_location(0, 0, 0)
   elseif asource == ASOURCE_CFUNC then
@@ -83,8 +98,20 @@ local function parse_location(reader, asource)
   error("Unknown asource "..asource)
 end
 
-local function parse_alloc(reader, asource, events, heap)
-  local id, loc = parse_location(reader, asource)
+local function parse_location(reader, asource, symbols)
+  local has_symbols  = band(asource, LJM_SYMTAB)
+  local asource_type = band(asource, ASOURCE_TYPE_MASK)
+  local id, loc
+  if has_symbols == LJM_SYMTAB then
+    id, loc = parse_location_symbols(reader, asource_type, symbols)
+  else
+    id, loc = parse_location_common(reader, asource_type)
+  end
+  return id, loc
+end
+
+local function parse_alloc(reader, asource, events, heap, symbols)
+  local id, loc = parse_location(reader, asource, symbols)
 
   local naddr = reader:read_uleb128()
   local nsize = reader:read_uleb128()
@@ -99,8 +126,8 @@ local function parse_alloc(reader, asource, events, heap)
   heap[naddr] = {nsize, id, loc}
 end
 
-local function parse_realloc(reader, asource, events, heap)
-  local id, loc = parse_location(reader, asource)
+local function parse_realloc(reader, asource, events, heap, symbols)
+  local id, loc = parse_location(reader, asource, symbols)
 
   local oaddr = reader:read_uleb128()
   local osize = reader:read_uleb128()
@@ -121,8 +148,8 @@ local function parse_realloc(reader, asource, events, heap)
   heap[naddr] = {nsize, id, loc}
 end
 
-local function parse_free(reader, asource, events, heap)
-  local id, loc = parse_location(reader, asource)
+local function parse_free(reader, asource, events, heap, symbols)
+  local id, loc = parse_location(reader, asource, symbols)
 
   local oaddr = reader:read_uleb128()
   local osize = reader:read_uleb128()
@@ -155,7 +182,7 @@ local function ev_header_split(evh)
   return band(evh, AEVENT_MASK), band(evh, ASOURCE_MASK)
 end
 
-local function parse_event(reader, events)
+local function parse_event(reader, events, symbols)
   local ev_header = reader:read_octet()
 
   assert(ev_header_is_valid(ev_header), "Bad ev_header "..ev_header)
@@ -169,12 +196,12 @@ local function parse_event(reader, events)
 
   assert(parser, "Bad aevent "..aevent)
 
-  parser.parse(reader, asource, events[parser.evname], events.heap)
+  parser.parse(reader, asource, events[parser.evname], events.heap, symbols)
 
   return true
 end
 
-function M.parse(reader)
+function M.parse(reader, symbols)
   local events = {
     alloc = {},
     realloc = {},
@@ -199,7 +226,7 @@ function M.parse(reader)
     ))
   end
 
-  while parse_event(reader, events) do
+  while parse_event(reader, events, symbols) do
     -- Empty body.
   end
 
diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
index 0e742ee1..601c9563 100644
--- a/tools/utils/symtab.lua
+++ b/tools/utils/symtab.lua
@@ -20,7 +20,7 @@ local SYMTAB_TRACE = 1
 local M = {}
 
 -- Parse a single entry in a symtab: lfunc symbol.
-local function parse_sym_lfunc(reader, symtab)
+function M.parse_sym_lfunc(reader, symtab)
   local sym_addr = reader:read_uleb128()
   local sym_chunk = reader:read_string()
   local sym_line = reader:read_uleb128()
@@ -29,6 +29,8 @@ local function parse_sym_lfunc(reader, symtab)
     source = sym_chunk,
     linedefined = sym_line,
   }
+
+  return sym_addr
 end
 
 local function parse_sym_trace(reader, symtab)
@@ -48,7 +50,7 @@ local function parse_sym_trace(reader, symtab)
 end
 
 local parsers = {
-  [SYMTAB_LFUNC] = parse_sym_lfunc,
+  [SYMTAB_LFUNC] = M.parse_sym_lfunc,
   [SYMTAB_TRACE] = parse_sym_trace,
 }
 
-- 
2.32.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot
  2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches
@ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches
  3 siblings, 0 replies; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

There may be situations, when new prototype has the same address than
the previously dumped one. In this case, the new symtab entry will
shadow the preceeding, and if we demangle symbols when all the events
are parsed, we can get irrelevant information:

| Dump symtab on the start:
|   addr  name
| * 0x001 proto_1
| * 0x002 proto_2
| * 0x003 proto_3
|
| Memprof running:
| ...
| * alloc: 0x001 proto_1
| // 0x001 proto_1 collected by the GC
| // new proto loaded on the 0x001 address
| * alloc: 0x001 proto_NEW
|   + symtab enriching -> [0x001] = proto_NEW

If we demangle symbols in the end, we will get 2 allocations
by proto_NEW because of the collision. But since symtab enriching
is implemented, we have enough information to avoid collision
while parsing events, demangling now happens on the spot, while
symtab entries are not shadowed by new ones.

Follows up tarantool/tarantool#5815
---

Issue: https://github.com/tarantool/tarantool/issues/5815
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated

 tools/memprof.lua          |  4 ++--
 tools/memprof/humanize.lua | 16 ++++++++--------
 tools/memprof/parse.lua    | 26 +++++++++++++-------------
 tools/memprof/process.lua  | 12 +++++-------
 4 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/tools/memprof.lua b/tools/memprof.lua
index 18b44fdd..496a0d78 100644
--- a/tools/memprof.lua
+++ b/tools/memprof.lua
@@ -102,9 +102,9 @@ local function dump(inputfile)
   local symbols = symtab.parse(reader)
   local events = memprof.parse(reader, symbols)
   if not leak_only then
-    view.profile_info(events, symbols)
+    view.profile_info(events)
   end
-  local dheap = process.form_heap_delta(events, symbols)
+  local dheap = process.form_heap_delta(events)
   view.leak_info(dheap)
   os.exit(0)
 end
diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
index 800a465e..0360362d 100644
--- a/tools/memprof/humanize.lua
+++ b/tools/memprof/humanize.lua
@@ -7,7 +7,7 @@ local symtab = require "utils.symtab"
 
 local M = {}
 
-function M.render(events, symbols)
+function M.render(events)
   local ids = {}
 
   for id, _ in pairs(events) do
@@ -21,15 +21,15 @@ function M.render(events, symbols)
   for i = 1, #ids do
     local event = events[ids[i]]
     print(string.format("%s: %d events\t+%d bytes\t-%d bytes",
-      M.describe_location(symbols, event.loc),
+      ids[i],
       event.num,
       event.alloc,
       event.free
     ))
 
     local prim_loc = {}
-    for _, heap_chunk in pairs(event.primary) do
-      table.insert(prim_loc, M.describe_location(symbols, heap_chunk.loc))
+    for id, _ in pairs(event.primary) do
+      table.insert(prim_loc, id)
     end
     if #prim_loc ~= 0 then
       table.sort(prim_loc)
@@ -42,17 +42,17 @@ function M.render(events, symbols)
   end
 end
 
-function M.profile_info(events, symbols)
+function M.profile_info(events)
   print("ALLOCATIONS")
-  M.render(events.alloc, symbols)
+  M.render(events.alloc)
   print("")
 
   print("REALLOCATIONS")
-  M.render(events.realloc, symbols)
+  M.render(events.realloc)
   print("")
 
   print("DEALLOCATIONS")
-  M.render(events.free, symbols)
+  M.render(events.free)
   print("")
 end
 
diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
index 888c6636..a540360b 100644
--- a/tools/memprof/parse.lua
+++ b/tools/memprof/parse.lua
@@ -9,6 +9,7 @@ local band = bit.band
 local lshift = bit.lshift
 
 local symtab = require "utils.symtab"
+local humanize = require "memprof.humanize"
 
 local string_format = string.format
 
@@ -66,12 +67,13 @@ local function link_to_previous(heap_chunk, e, nsize)
   end
 end
 
-local function id_location(addr, line, traceno)
-  return string_format("f%#xl%dt%d", addr, line, traceno), {
+local function id_location(addr, line, traceno, symbols)
+  local loc =  {
     addr = addr,
     line = line,
     traceno = traceno,
   }
+  return humanize.describe_location(symbols, loc), loc
 end
 
 local function parse_location_symbols(reader, asource, symbols)
@@ -79,21 +81,22 @@ local function parse_location_symbols(reader, asource, symbols)
     return id_location(
       symtab.parse_sym_lfunc(reader, symbols),
       reader:read_uleb128(),
-      0
+      0,
+      symbols
     )
   end
   error("Unknown asource "..asource)
 end
 
-local function parse_location_common(reader, asource)
+local function parse_location_common(reader, asource, symbols)
   if asource == ASOURCE_INT then
-    return id_location(0, 0, 0)
+    return id_location(0, 0, 0, symbols)
   elseif asource == ASOURCE_CFUNC then
-    return id_location(reader:read_uleb128(), 0, 0)
+    return id_location(reader:read_uleb128(), 0, 0, symbols)
   elseif asource == ASOURCE_LFUNC then
-    return id_location(reader:read_uleb128(), reader:read_uleb128(), 0)
+    return id_location(reader:read_uleb128(), reader:read_uleb128(), 0, symbols)
   elseif asource == ASOURCE_TRACE then
-    return id_location(reader:read_uleb128(), 0, reader:read_uleb128())
+    return id_location(reader:read_uleb128(), 0, reader:read_uleb128(), symbols)
   end
   error("Unknown asource "..asource)
 end
@@ -101,13 +104,10 @@ end
 local function parse_location(reader, asource, symbols)
   local has_symbols  = band(asource, LJM_SYMTAB)
   local asource_type = band(asource, ASOURCE_TYPE_MASK)
-  local id, loc
   if has_symbols == LJM_SYMTAB then
-    id, loc = parse_location_symbols(reader, asource_type, symbols)
-  else
-    id, loc = parse_location_common(reader, asource_type)
+    return parse_location_symbols(reader, asource_type, symbols)
   end
-  return id, loc
+  return parse_location_common(reader, asource_type, symbols)
 end
 
 local function parse_alloc(reader, asource, events, heap, symbols)
diff --git a/tools/memprof/process.lua b/tools/memprof/process.lua
index f277ed84..0a39d333 100644
--- a/tools/memprof/process.lua
+++ b/tools/memprof/process.lua
@@ -2,9 +2,7 @@
 
 local M = {}
 
-local humanize = require "memprof.humanize"
-
-function M.form_heap_delta(events, symbols)
+function M.form_heap_delta(events)
   -- Auto resurrects source event lines for counting/reporting.
   local dheap = setmetatable({}, {__index = function(t, line)
     rawset(t, line, {
@@ -15,9 +13,9 @@ function M.form_heap_delta(events, symbols)
     return t[line]
   end})
 
-  for _, event in pairs(events.alloc) do
+  for id, event in pairs(events.alloc) do
     if event.loc then
-      local ev_line = humanize.describe_location(symbols, event.loc)
+      local ev_line = id
 
       if (event.alloc > 0) then
         dheap[ev_line].dbytes = dheap[ev_line].dbytes + event.alloc
@@ -36,8 +34,8 @@ function M.form_heap_delta(events, symbols)
       -- Realloc and free events always have key named "primary"
       -- that references the table with memory changed
       -- (may be empty).
-      for _, heap_chunk in pairs(event.primary) do
-        local ev_line = humanize.describe_location(symbols, heap_chunk.loc)
+      for id, heap_chunk in pairs(event.primary) do
+        local ev_line = id
 
         if (heap_chunk.alloced > 0) then
           dheap[ev_line].dbytes = dheap[ev_line].dbytes + heap_chunk.alloced
-- 
2.32.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases
  2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches
@ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches
  3 siblings, 0 replies; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Sometimes a loaded chunk name can be multiline (actually, it
is the Lua code itself). In order not to burden memprof parser
output with big multiline names, aliases were introduced.

The chunk name is replaced by `function_alias_N` (where N is a unique id)
to be displayed in the allocation events report. All the aliases are
printed in the end of parser's output under the header "ALIASES".

Because of changes mentioned above, the API of <utils/symtab.lua>
changed: now symtab has additional `alias` assotiative table for
storing aliases and `alias_count` counter to store number of aliases.

Follows up tarantool/tarantool#5815
---

Issue: https://github.com/tarantool/tarantool/issues/5815
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated

 tools/memprof.lua          |  1 +
 tools/memprof/humanize.lua | 35 +++++++++++++++++++++++++++++++++++
 tools/utils/symtab.lua     | 14 +++++++++++++-
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/tools/memprof.lua b/tools/memprof.lua
index 496a0d78..52e09fd9 100644
--- a/tools/memprof.lua
+++ b/tools/memprof.lua
@@ -106,6 +106,7 @@ local function dump(inputfile)
   end
   local dheap = process.form_heap_delta(events)
   view.leak_info(dheap)
+  view.aliases(symbols)
   os.exit(0)
 end
 
diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua
index 0360362d..72e71080 100644
--- a/tools/memprof/humanize.lua
+++ b/tools/memprof/humanize.lua
@@ -5,6 +5,9 @@
 
 local symtab = require "utils.symtab"
 
+-- Assume that loaded string chunks have less than 9999 lines.
+local ALIAS_MAX_LINE_NUMBER_LEN = 4
+
 local M = {}
 
 function M.render(events)
@@ -98,4 +101,36 @@ function M.describe_location(symbols, loc)
   return symtab.demangle(symbols, loc)
 end
 
+local function get_aliases(symbols)
+  local aliases = {}
+  for source, alias in pairs(symbols.alias) do
+    table.insert(aliases, { alias, source })
+  end
+  table.sort(aliases, function(a, b)
+    return a[1] < b[1]
+  end)
+  return aliases
+end
+
+local function format_source_prefix(lineno)
+  local line_str = tostring(lineno)
+  local line_str_len = line_str:len()
+  return line_str..string.rep(" ", ALIAS_MAX_LINE_NUMBER_LEN - line_str_len)
+end
+
+function M.aliases(symbols)
+  if symbols.alias_count == 0 then return end
+  print("ALIASES:")
+  local aliases = get_aliases(symbols)
+  for _, alias in ipairs(aliases) do
+    print(alias[1]..":")
+    local lineno = 1
+    for line in alias[2]:gmatch("(.-)\n") do
+      print(format_source_prefix(lineno)..'|'..line)
+      lineno = lineno + 1
+    end
+    print("~\n")
+  end
+end
+
 return M
diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
index 601c9563..88fdb42e 100644
--- a/tools/utils/symtab.lua
+++ b/tools/utils/symtab.lua
@@ -58,6 +58,8 @@ function M.parse(reader)
   local symtab = {
     lfunc = {},
     trace = {},
+    alias = {},
+    alias_count = 0,
   }
   local magic = reader:read_octets(3)
   local version = reader:read_octets(1)
@@ -102,7 +104,17 @@ local function demangle_lfunc(symtab, loc)
   if addr == 0 then
     return "INTERNAL"
   elseif symtab.lfunc[addr] then
-    return string_format("%s:%d", symtab.lfunc[loc.addr].source, loc.line)
+    local name = symtab.lfunc[addr].source
+    if name:find('\n') == nil then
+      return string_format("%s:%d", name, loc.line)
+    end
+    if not symtab.alias[name] then
+      symtab.alias[name] = string_format(
+        "function_alias_%d", symtab.alias_count
+      )
+      symtab.alias_count = symtab.alias_count + 1
+    end
+    return string_format("%s:%d", symtab.alias[name], loc.line)
   end
   return string_format("CFUNC %#x", addr)
 end
-- 
2.32.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules
  2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches
@ 2021-08-21 12:50 ` Mikhail Shishatskiy via Tarantool-patches
  3 siblings, 0 replies; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-21 12:50 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Since commit [1], <memprof/parse.lua> module depends
on <memprof/humanize.lua> module, so we need to change
order of these modules in <src/lua/init.c> to avoid
unprotected API calls.

Follows up tarantool/tarantool#5815
---

Issue: https://github.com/tarantool/tarantool/issues/5815
Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated

 src/lua/init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/lua/init.c b/src/lua/init.c
index 8ac2f3592..fcb1d2c90 100644
--- a/src/lua/init.c
+++ b/src/lua/init.c
@@ -126,8 +126,8 @@ extern char strict_lua[],
 	/* tools.* libraries. */
 	bufread_lua[],
 	symtab_lua[],
-	parse_lua[],
 	humanize_lua[],
+	parse_lua[],
 	process_lua[],
 	memprof_lua[]
 ;
@@ -180,8 +180,8 @@ static const char *lua_modules[] = {
 	/* tools.* libraries. Order is important. */
 	"utils.bufread", bufread_lua,
 	"utils.symtab", symtab_lua,
-	"memprof.parse", parse_lua,
 	"memprof.humanize", humanize_lua,
+	"memprof.parse", parse_lua,
 	"memprof.process", process_lua,
 	"memprof", memprof_lua,
 	NULL
-- 
2.32.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Tarantool-patches] [PATCH luajit v1 1/4] memprof: enrich symtab when meeting new prototype
  2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches
@ 2021-08-24 14:16   ` Mikhail Shishatskiy via Tarantool-patches
  0 siblings, 0 replies; 6+ messages in thread
From: Mikhail Shishatskiy via Tarantool-patches @ 2021-08-24 14:16 UTC (permalink / raw)
  To: tarantool-patches, imun, skaplun

Sorry, I forgot about LJ_HASMEMPROF in some files:

diff --git a/src/lj_bcread.c b/src/lj_bcread.c
index 692e1a9d..242ea465 100644
--- a/src/lj_bcread.c
+++ b/src/lj_bcread.c
@@ -353,7 +353,9 @@ GCproto *lj_bcread_proto(LexState *ls)
    pt->sizeuv = (uint8_t)sizeuv;
    pt->flags = (uint8_t)flags;
    pt->trace = 0;
+#if LJ_HASMEMPROF
    pt->is_streamed = 0;
+#endif
    setgcref(pt->chunkname, obj2gco(ls->chunkname));
  
    /* Close potentially uninitialized gap between bc and kgc. */
diff --git a/src/lj_parse.c b/src/lj_parse.c
index 1a540610..746c6e33 100644
--- a/src/lj_parse.c
+++ b/src/lj_parse.c
@@ -1577,7 +1577,9 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
    pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN));
    pt->numparams = fs->numparams;
    pt->framesize = fs->framesize;
+#if LJ_HASMEMPROF
    pt->is_streamed = 0;
+#endif
    setgcref(pt->chunkname, obj2gco(ls->chunkname));
  
    /* Close potentially uninitialized gap between bc and kgc. */

On 21.08.2021 19:49, Mikhail Shishatskiy wrote:
>Previously, the profiler dumped all the function prototypes only
>at the start of profiling. It was enough for most cases, but
>sometimes we may want to investigate the memory profile of module
>loading. In such a case, we expect memprof to dump new prototypes
>from parsed source code.
>
>This patch extends memprof's streaming format with entries providing
>extra information about allocation source's symbols if they were not
>streamed at the start of profiling.
>
>| loc         := loc-lua | loc-lua-sym | loc-c | loc-trace
>| loc-lua-sym := sym-addr sym-chunk sym-line line-no
>
>The `sym-chunk` and `sym-line` are the prototype's chunk name and line
>where it was defined.
>
>To recognize entries with new symbols, event-header format changed:
>
>| event-header: [FUTSSSEE]
>| * T : 1 bit indicating if additional symtab entry provided or not
>
>Also, the profiler parser is adjusted to recognize entries described
>above and enrich the symtab on the fly while parsing events. For this
>reason, <utils/symtab.lua> API has been changed: function
>`parse_sym_lfunc` became public and now returns sym-addr to be used
>in the event parser module.
>
>Resolves tarantool/tarantool#5815
>---
>
>Issue: https://github.com/tarantool/tarantool/issues/5815
>Branch: https://github.com/tarantool/luajit/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
>Tarantool branch: https://github.com/tarantool/tarantool/tree/shishqa/gh-5815-enrich-symtab-when-prototype-is-allocated
>
> src/lj_bcread.c                               |   1 +
> src/lj_memprof.c                              |  19 +++-
> src/lj_memprof.h                              |  16 ++-
> src/lj_obj.h                                  |   7 ++
> src/lj_parse.c                                |   1 +
> .../misclib-memprof-lapi.test.lua             | 102 +++++++++++++-----
> tools/memprof/parse.lua                       |  55 +++++++---
> tools/utils/symtab.lua                        |   6 +-
> 8 files changed, 160 insertions(+), 47 deletions(-)
>
>diff --git a/src/lj_bcread.c b/src/lj_bcread.c
>index 48c5e7c7..692e1a9d 100644
>--- a/src/lj_bcread.c
>+++ b/src/lj_bcread.c
>@@ -353,6 +353,7 @@ GCproto *lj_bcread_proto(LexState *ls)
>   pt->sizeuv = (uint8_t)sizeuv;
>   pt->flags = (uint8_t)flags;
>   pt->trace = 0;
>+  pt->is_streamed = 0;
>   setgcref(pt->chunkname, obj2gco(ls->chunkname));
>
>   /* Close potentially uninitialized gap between bc and kgc. */
>diff --git a/src/lj_memprof.c b/src/lj_memprof.c
>index 14e4e67b..fb217a2c 100644
>--- a/src/lj_memprof.c
>+++ b/src/lj_memprof.c
>@@ -67,7 +67,7 @@ static void dump_symtab_trace(struct lj_wbuf *out, const GCtrace *trace)
> static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
> {
>   const GCRef *iter = &g->gc.root;
>-  const GCobj *o;
>+  GCobj *o;
>   const size_t ljs_header_len = sizeof(ljs_header) / sizeof(ljs_header[0]);
>
>   /* Write prologue. */
>@@ -76,11 +76,12 @@ static void dump_symtab(struct lj_wbuf *out, const struct global_State *g)
>   while ((o = gcref(*iter)) != NULL) {
>     switch (o->gch.gct) {
>     case (~LJ_TPROTO): {
>-      const GCproto *pt = gco2pt(o);
>+      GCproto *pt = gco2pt(o);
>       lj_wbuf_addbyte(out, SYMTAB_LFUNC);
>       lj_wbuf_addu64(out, (uintptr_t)pt);
>       lj_wbuf_addstring(out, proto_chunknamestr(pt));
>       lj_wbuf_addu64(out, (uint64_t)pt->firstline);
>+      pt->is_streamed = 1;
>       break;
>     }
>     case (~LJ_TTRACE): {
>@@ -138,6 +139,7 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent,
>   ** -DLUAJIT_DISABLE_DEBUGINFO flag.
>   */
>   const BCLine line = lj_debug_frameline(L, fn, nextframe);
>+  GCproto *pt = funcproto(fn);
>
>   if (line < 0) {
>     /*
>@@ -149,9 +151,20 @@ static void memprof_write_lfunc(struct lj_wbuf *out, uint8_t aevent,
>     ** We report such allocations as internal in order not to confuse users.
>     */
>     lj_wbuf_addbyte(out, aevent | ASOURCE_INT);
>+
>+  } else if (LJ_UNLIKELY(!pt->is_streamed)) {
>+
>+    lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC | LJM_SYMTAB);
>+    lj_wbuf_addu64(out, (uintptr_t)pt);
>+    lj_wbuf_addstring(out, proto_chunknamestr(pt));
>+    lj_wbuf_addu64(out, (uint64_t)pt->firstline);
>+    lj_wbuf_addu64(out, (uint64_t)line);
>+    pt->is_streamed = 1;
>+
>   } else {
>+
>     lj_wbuf_addbyte(out, aevent | ASOURCE_LFUNC);
>-    lj_wbuf_addu64(out, (uintptr_t)funcproto(fn));
>+    lj_wbuf_addu64(out, (uintptr_t)pt);
>     lj_wbuf_addu64(out, (uint64_t)line);
>   }
> }
>diff --git a/src/lj_memprof.h b/src/lj_memprof.h
>index e72dadf7..0f5b4c6d 100644
>--- a/src/lj_memprof.h
>+++ b/src/lj_memprof.h
>@@ -57,7 +57,7 @@
> #define SYMTAB_TRACE ((uint8_t)1)
> #define SYMTAB_FINAL ((uint8_t)0x80)
>
>-#define LJM_CURRENT_FORMAT_VERSION 0x02
>+#define LJM_CURRENT_FORMAT_VERSION 0x03
>
> /*
> ** Event stream format:
>@@ -73,11 +73,14 @@
> ** event-realloc  := event-header loc? oaddr osize naddr nsize
> ** event-free     := event-header loc? oaddr osize
> ** event-header   := <BYTE>
>-** loc            := loc-lua | loc-c | loc-trace
>+** loc            := loc-lua | loc-lua-sym | loc-c | loc-trace
> ** loc-lua        := sym-addr line-no
>+** loc-lua-sym    := sym-addr sym-chunk sym-line line-no
> ** loc-c          := sym-addr
> ** loc-trace      := trace-addr trace-no
> ** sym-addr       := <ULEB128>
>+** sym-chunk      := string
>+** sym-line       := <ULEB128>
> ** line-no        := <ULEB128>
> ** trace-addr     := <ULEB128>
> ** trace-no       := <ULEB128>
>@@ -85,6 +88,9 @@
> ** naddr          := <ULEB128>
> ** osize          := <ULEB128>
> ** nsize          := <ULEB128>
>+** string         := string-len string-payload
>+** string-len     := <ULEB128>
>+** string-payload := <BYTE> {string-len}
> ** epilogue       := event-header
> **
> ** <BYTE>   :  A single byte (no surprises here)
>@@ -95,10 +101,11 @@
> ** version: [VVVVVVVV]
> **  * VVVVVVVV: Byte interpreted as a plain integer version number
> **
>-** event-header: [FUUSSSEE]
>+** event-header: [FUTSSSEE]
> **  * EE   : 2 bits for representing allocation event type (AEVENT_*)
> **  * SSS  : 3 bits for representing allocation source type (ASOURCE_*)
>-**  * UU   : 2 unused bits
>+**  * T    : 1 bit indicating if additional symtab entry provided or not
>+**  * U    : 1 unused bit
> **  * F    : 0 for regular events, 1 for epilogue's *F*inal header
> **           (if F is set to 1, all other bits are currently ignored)
> */
>@@ -114,6 +121,7 @@
> #define ASOURCE_CFUNC ((uint8_t)(3 << 2))
> #define ASOURCE_TRACE ((uint8_t)(4 << 2))
>
>+#define LJM_SYMTAB          0x20
> #define LJM_EPILOGUE_HEADER 0x80
>
> /* Profiler public API. */
>diff --git a/src/lj_obj.h b/src/lj_obj.h
>index d26e60be..c8f6b13a 100644
>--- a/src/lj_obj.h
>+++ b/src/lj_obj.h
>@@ -385,6 +385,13 @@ typedef struct GCproto {
>   MRef lineinfo;	/* Compressed map from bytecode ins. to source line. */
>   MRef uvinfo;		/* Upvalue names. */
>   MRef varinfo;		/* Names and compressed extents of local variables. */
>+#if LJ_HASMEMPROF
>+  /*
>+  ** Flag, indicating, if this proto has been streamed to the
>+  ** memprof's binary output.
>+  */
>+  uint8_t is_streamed;
>+#endif
> } GCproto;
>
> /* Flags for prototype. */
>diff --git a/src/lj_parse.c b/src/lj_parse.c
>index a6325a76..1a540610 100644
>--- a/src/lj_parse.c
>+++ b/src/lj_parse.c
>@@ -1577,6 +1577,7 @@ static GCproto *fs_finish(LexState *ls, BCLine line)
>   pt->flags = (uint8_t)(fs->flags & ~(PROTO_HAS_RETURN|PROTO_FIXUP_RETURN));
>   pt->numparams = fs->numparams;
>   pt->framesize = fs->framesize;
>+  pt->is_streamed = 0;
>   setgcref(pt->chunkname, obj2gco(ls->chunkname));
>
>   /* Close potentially uninitialized gap between bc and kgc. */
>diff --git a/test/tarantool-tests/misclib-memprof-lapi.test.lua b/test/tarantool-tests/misclib-memprof-lapi.test.lua
>index 17bcb2f1..f4666da5 100644
>--- a/test/tarantool-tests/misclib-memprof-lapi.test.lua
>+++ b/test/tarantool-tests/misclib-memprof-lapi.test.lua
>@@ -7,7 +7,7 @@ require("utils").skipcond(
> local tap = require("tap")
>
> local test = tap.test("misc-memprof-lapi")
>-test:plan(4)
>+test:plan(5)
>
> local jit_opt_default = {
>     3, -- level
>@@ -70,7 +70,7 @@ local function generate_parsed_output(payload)
>
>   local reader = bufread.new(TMP_BINFILE)
>   local symbols = symtab.parse(reader)
>-  local events = memprof.parse(reader)
>+  local events = memprof.parse(reader, symbols)
>
>   -- We don't need it any more.
>   os.remove(TMP_BINFILE)
>@@ -79,7 +79,9 @@ local function generate_parsed_output(payload)
> end
>
> local function fill_ev_type(events, symbols, event_type)
>-  local ev_type = {}
>+  local ev_type = {
>+    trace = {},
>+  }
>   for _, event in pairs(events[event_type]) do
>     local addr = event.loc.addr
>     local traceno = event.loc.traceno
>@@ -87,7 +89,7 @@ local function fill_ev_type(events, symbols, event_type)
>     if traceno ~= 0 and symbols.trace[traceno] then
>       local trace_loc = symbols.trace[traceno].sym_loc
>       addr = trace_loc.addr
>-      ev_type[trace_loc.line] = {
>+      ev_type.trace[traceno] = {
>         name = string.format("TRACE [%d] %s:%d",
>           traceno, symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
>         ),
>@@ -99,9 +101,13 @@ local function fill_ev_type(events, symbols, event_type)
>         num = event.num,
>     }
>     elseif symbols.lfunc[addr] then
>-      ev_type[event.loc.line] = {
>+      local source = symbols.lfunc[addr].source
>+      if ev_type[source] == nil then
>+        ev_type[source] = {}
>+      end
>+      ev_type[source][event.loc.line] = {
>         name = string.format(
>-          "%s:%d", symbols.lfunc[addr].source, symbols.lfunc[addr].linedefined
>+          "%s:%d", source, symbols.lfunc[addr].linedefined
>         ),
>         num = event.num,
>       }
>@@ -110,23 +116,35 @@ local function fill_ev_type(events, symbols, event_type)
>   return ev_type
> end
>
>-local function form_source_line(line)
>-  return string.format("@%s:%d", arg[0], line)
>+local function form_source_line(loc)
>+  return string.format("%s:%d", loc.source, loc.function_line)
> end
>
>-local function check_alloc_report(alloc, traceno, line, function_line, nevents)
>-  assert(alloc[line], ("no event on line %d"):format(line))
>-  local event = alloc[line]
>-  local expected_name
>-  if traceno ~= 0 then
>-    expected_name = string.format("TRACE [%d] ", traceno)..
>-                    form_source_line(function_line)
>+local function get_event_name_trace(loc, alloc)
>+  local event = alloc.trace[loc.traceno]
>+  return event, string.format("TRACE [%d] ", loc.traceno)..form_source_line(loc)
>+end
>+
>+local function get_event_name_lua(loc, alloc)
>+  local event = alloc[loc.source][loc.line]
>+  return event, form_source_line(loc)
>+end
>+
>+local function check_alloc_report(loc, alloc, nevents)
>+  if loc.source == nil then
>+    loc.source = '@'..arg[0]
>+  end
>+
>+  local event, name
>+  if loc.traceno ~= nil then
>+    event, name = get_event_name_trace(loc, alloc)
>   else
>-    expected_name = form_source_line(function_line)
>+    event, name = get_event_name_lua(loc, alloc)
>   end
>-  assert(expected_name == event.name, ("got='%s', expected='%s'"):format(
>+
>+  assert(name == event.name, ("got='%s', expected='%s'"):format(
>     event.name,
>-    expected_name
>+    name
>   ))
>   assert(event.num == nevents, ("got=%d, expected=%d"):format(
>     event.num,
>@@ -175,9 +193,9 @@ test:test("output", function(subtest)
>   -- the number of allocations.
>   -- 1 event - alocation of table by itself + 1 allocation
>   -- of array part as far it is bigger than LJ_MAX_COLOSIZE (16).
>-  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
>+  subtest:ok(check_alloc_report({line = 34, function_line = 32}, alloc, 2))
>   -- 20 strings allocations.
>-  subtest:ok(check_alloc_report(alloc, 0, 39, 32, 20))
>+  subtest:ok(check_alloc_report({line = 39, function_line = 32}, alloc, 20))
>
>   -- Collect all previous allocated objects.
>   subtest:ok(free.INTERNAL.num == 22)
>@@ -185,8 +203,8 @@ test:test("output", function(subtest)
>   -- Tests for leak-only option.
>   -- See also https://github.com/tarantool/tarantool/issues/5812.
>   local heap_delta = process.form_heap_delta(events, symbols)
>-  local tab_alloc_stats = heap_delta[form_source_line(34)]
>-  local str_alloc_stats = heap_delta[form_source_line(39)]
>+  local tab_alloc_stats = heap_delta[form_source_line({ function_line = 34 })]
>+  local str_alloc_stats = heap_delta[form_source_line({ function_line = 39 })]
>   subtest:ok(tab_alloc_stats.nalloc == tab_alloc_stats.nfree)
>   subtest:ok(tab_alloc_stats.dbytes == 0)
>   subtest:ok(str_alloc_stats.nalloc == str_alloc_stats.nfree)
>@@ -215,6 +233,38 @@ test:test("stack-resize", function(subtest)
>   misc.memprof.stop()
> end)
>
>+-- Test for extending symtab with function prototypes
>+-- while profiler is running.
>+test:test("symtab-enriching", function(subtest)
>+  subtest:plan(2)
>+
>+  local payload_str = [[
>+  local M = {
>+    tmp = string.rep("1", 100)  -- line 2.
>+  }
>+
>+  function M.payload()
>+    local str = string.rep("42", 100)  -- line 6.
>+  end
>+
>+  return M
>+  ]]
>+
>+  local symbols, events = generate_parsed_output(function()
>+    local str_chunk = assert(loadstring(payload_str, 'str_chunk'))()
>+    str_chunk.payload()
>+  end)
>+
>+  local alloc = fill_ev_type(events, symbols, "alloc")
>+
>+  subtest:ok(check_alloc_report(
>+    {source = 'str_chunk', line = 6, function_line = 5}, alloc, 1)
>+  )
>+  subtest:ok(check_alloc_report(
>+    {source = 'str_chunk', line = 2, function_line = 0}, alloc, 1)
>+  )
>+end)
>+
> -- Test profiler with enabled JIT.
> jit.on()
>
>@@ -247,9 +297,13 @@ test:test("jit-output", function(subtest)
>   alloc = fill_ev_type(events, symbols, "alloc")
>
>   -- We expect, that loop will be compiled into a trace.
>-  subtest:ok(check_alloc_report(alloc, 1, 37, 32, 20))
>+  subtest:ok(check_alloc_report(
>+    { traceno = 1, line = 37, function_line = 32 }, alloc, 20
>+  ))
>   -- See same checks with jit.off().
>-  subtest:ok(check_alloc_report(alloc, 0, 34, 32, 2))
>+  subtest:ok(check_alloc_report(
>+    { line = 34, function_line = 32 }, alloc, 2
>+  ))
>
>   -- Restore default JIT settings.
>   jit.opt.start(unpack(jit_opt_default))
>diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua
>index adc7c072..888c6636 100644
>--- a/tools/memprof/parse.lua
>+++ b/tools/memprof/parse.lua
>@@ -8,11 +8,14 @@ local bit = require "bit"
> local band = bit.band
> local lshift = bit.lshift
>
>+local symtab = require "utils.symtab"
>+
> local string_format = string.format
>
> local LJM_MAGIC = "ljm"
>-local LJM_CURRENT_VERSION = 0x02
>+local LJM_CURRENT_VERSION = 0x03
>
>+local LJM_SYMTAB          = 0x20
> local LJM_EPILOGUE_HEADER = 0x80
>
> local AEVENT_ALLOC = 1
>@@ -26,9 +29,10 @@ local ASOURCE_LFUNC = lshift(2, 2)
> local ASOURCE_CFUNC = lshift(3, 2)
> local ASOURCE_TRACE = lshift(4, 2)
>
>-local ASOURCE_MASK = lshift(0x7, 2)
>+local ASOURCE_MASK      = lshift(0xF, 2)
>+local ASOURCE_TYPE_MASK = lshift(0x7, 2)
>
>-local EV_HEADER_MAX = ASOURCE_TRACE + AEVENT_REALLOC
>+local EV_HEADER_MAX = LJM_SYMTAB + ASOURCE_TRACE + AEVENT_REALLOC
>
> local M = {}
>
>@@ -70,7 +74,18 @@ local function id_location(addr, line, traceno)
>   }
> end
>
>-local function parse_location(reader, asource)
>+local function parse_location_symbols(reader, asource, symbols)
>+  if asource == ASOURCE_LFUNC then
>+    return id_location(
>+      symtab.parse_sym_lfunc(reader, symbols),
>+      reader:read_uleb128(),
>+      0
>+    )
>+  end
>+  error("Unknown asource "..asource)
>+end
>+
>+local function parse_location_common(reader, asource)
>   if asource == ASOURCE_INT then
>     return id_location(0, 0, 0)
>   elseif asource == ASOURCE_CFUNC then
>@@ -83,8 +98,20 @@ local function parse_location(reader, asource)
>   error("Unknown asource "..asource)
> end
>
>-local function parse_alloc(reader, asource, events, heap)
>-  local id, loc = parse_location(reader, asource)
>+local function parse_location(reader, asource, symbols)
>+  local has_symbols  = band(asource, LJM_SYMTAB)
>+  local asource_type = band(asource, ASOURCE_TYPE_MASK)
>+  local id, loc
>+  if has_symbols == LJM_SYMTAB then
>+    id, loc = parse_location_symbols(reader, asource_type, symbols)
>+  else
>+    id, loc = parse_location_common(reader, asource_type)
>+  end
>+  return id, loc
>+end
>+
>+local function parse_alloc(reader, asource, events, heap, symbols)
>+  local id, loc = parse_location(reader, asource, symbols)
>
>   local naddr = reader:read_uleb128()
>   local nsize = reader:read_uleb128()
>@@ -99,8 +126,8 @@ local function parse_alloc(reader, asource, events, heap)
>   heap[naddr] = {nsize, id, loc}
> end
>
>-local function parse_realloc(reader, asource, events, heap)
>-  local id, loc = parse_location(reader, asource)
>+local function parse_realloc(reader, asource, events, heap, symbols)
>+  local id, loc = parse_location(reader, asource, symbols)
>
>   local oaddr = reader:read_uleb128()
>   local osize = reader:read_uleb128()
>@@ -121,8 +148,8 @@ local function parse_realloc(reader, asource, events, heap)
>   heap[naddr] = {nsize, id, loc}
> end
>
>-local function parse_free(reader, asource, events, heap)
>-  local id, loc = parse_location(reader, asource)
>+local function parse_free(reader, asource, events, heap, symbols)
>+  local id, loc = parse_location(reader, asource, symbols)
>
>   local oaddr = reader:read_uleb128()
>   local osize = reader:read_uleb128()
>@@ -155,7 +182,7 @@ local function ev_header_split(evh)
>   return band(evh, AEVENT_MASK), band(evh, ASOURCE_MASK)
> end
>
>-local function parse_event(reader, events)
>+local function parse_event(reader, events, symbols)
>   local ev_header = reader:read_octet()
>
>   assert(ev_header_is_valid(ev_header), "Bad ev_header "..ev_header)
>@@ -169,12 +196,12 @@ local function parse_event(reader, events)
>
>   assert(parser, "Bad aevent "..aevent)
>
>-  parser.parse(reader, asource, events[parser.evname], events.heap)
>+  parser.parse(reader, asource, events[parser.evname], events.heap, symbols)
>
>   return true
> end
>
>-function M.parse(reader)
>+function M.parse(reader, symbols)
>   local events = {
>     alloc = {},
>     realloc = {},
>@@ -199,7 +226,7 @@ function M.parse(reader)
>     ))
>   end
>
>-  while parse_event(reader, events) do
>+  while parse_event(reader, events, symbols) do
>     -- Empty body.
>   end
>
>diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
>index 0e742ee1..601c9563 100644
>--- a/tools/utils/symtab.lua
>+++ b/tools/utils/symtab.lua
>@@ -20,7 +20,7 @@ local SYMTAB_TRACE = 1
> local M = {}
>
> -- Parse a single entry in a symtab: lfunc symbol.
>-local function parse_sym_lfunc(reader, symtab)
>+function M.parse_sym_lfunc(reader, symtab)
>   local sym_addr = reader:read_uleb128()
>   local sym_chunk = reader:read_string()
>   local sym_line = reader:read_uleb128()
>@@ -29,6 +29,8 @@ local function parse_sym_lfunc(reader, symtab)
>     source = sym_chunk,
>     linedefined = sym_line,
>   }
>+
>+  return sym_addr
> end
>
> local function parse_sym_trace(reader, symtab)
>@@ -48,7 +50,7 @@ local function parse_sym_trace(reader, symtab)
> end
>
> local parsers = {
>-  [SYMTAB_LFUNC] = parse_sym_lfunc,
>+  [SYMTAB_LFUNC] = M.parse_sym_lfunc,
>   [SYMTAB_TRACE] = parse_sym_trace,
> }
>
>-- 
>2.32.0
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-08-24 14:16 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-21 12:49 [Tarantool-patches] [PATCH luajit v1 0/4] memprof: enrich symtab when meeting new prototype Mikhail Shishatskiy via Tarantool-patches
2021-08-21 12:49 ` [Tarantool-patches] [PATCH luajit v1 1/4] " Mikhail Shishatskiy via Tarantool-patches
2021-08-24 14:16   ` Mikhail Shishatskiy via Tarantool-patches
2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 2/4] memprof: demangle symbols on the spot Mikhail Shishatskiy via Tarantool-patches
2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 3/4] memprof: substitute long proto names with aliases Mikhail Shishatskiy via Tarantool-patches
2021-08-21 12:50 ` [Tarantool-patches] [PATCH luajit v1 4/4] luajit: change order of modules Mikhail Shishatskiy via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox