From: Sergey Ostanevich via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Kaplun <skaplun@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH luajit v1] tools: introduce --leak-only memprof parser option Date: Fri, 29 Jan 2021 18:17:45 +0300 [thread overview] Message-ID: <D56710BE-9C25-4190-9026-EF240C007919@tarantool.org> (raw) In-Reply-To: <20210125193544.18050-1-skaplun@tarantool.org> Hi! Thanks for the patch, some comments below. Regards, Sergos > On 25 Jan 2021, at 22:35, Sergey Kaplun <skaplun@tarantool.org> wrote: > > This patch adds a new --leak-only memory profiler parser option. > When the parser runs with that option, it will report only lines > (or "INTERNAL") that allocate and do not free some amount of bytes. > The parser also reports the number of allocation and deallocation > events related to each line. > --- > > Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-memprof-memleaks-option > Testing branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-memprof-memleaks-option > > Side note: I only update commit message thats why CI from the "old" > commit. > CI: https://gitlab.com/tarantool/tarantool/-/pipelines/246541599 > > This patch is a result of offline discussion of memory profiler > and its usage and feedback from Mons. > > I don't know - is it reasonable to reference [1] issue here or create a > new one? I think it’s ok to mention [1] > > The example of output. Assuming we have file <format_concat.lua>: > | 1 jit.off() -- more verbose reports > | 2 > | 3 local function concat(a) > | 4 local nstr = a.."a" > | 5 return nstr > | 6 end > | 7 > | 8 local function format(a) > | 9 local nstr = string.format("%sa", a) > | 10 return nstr > | 11 end > | 12 > | 13 collectgarbage() -- cleanup > | 14 > | 15 local binfile = "/tmp/memprof_"..(arg[0]):match("/([^/]*).lua")..".bin" > | 16 > | 17 local st, err = misc.memprof.start(binfile) > | 18 assert(st, err) > | 19 > | 20 for i = 1, 10000 do > | 21 local f = format(i) > | 22 local c = concat(i) > | 23 end > | 24 > | 25 local st, err = misc.memprof.stop() > | 26 assert(st, err) > | 27 > | 28 os.exit() > > Parser's output without option: > | ALLOCATIONS > | @/home/burii/reports/demo_memprof/format_concat.lua:8, line 9: 19998 624322 0 > | > | REALLOCATIONS > | > | DEALLOCATIONS > | INTERNAL: 283 0 5602 > | Overrides: > | @/home/burii/reports/demo_memprof/format_concat.lua:8, line 9 > | > | @/home/burii/reports/demo_memprof/format_concat.lua:8, line 9: 2 0 98304 > | Overrides: > | @/home/burii/reports/demo_memprof/format_concat.lua:8, line 9 > > And with: > > | HEAP SUMMARY: > | @/home/burii/reports/demo_memprof/format_concat.lua:8, line 9 holds 553214 bytes 19998 allocs, 283 frees > > Side note: The attentive reader will notice that we have 283+2 frees in > the first case and 283 in the second. > This is because --leak-only considers deallocation/reallocation events > for which we know the source address. The default report aggregates > these all deallocations into one. See `link_to_previous()` in > <tools/memprof/parse.lua> for details. > > tools/memprof.lua | 23 +++++----- > tools/memprof/humanize.lua | 91 +++++++++++++++++++++++++++++++++++++- > tools/memprof/parse.lua | 20 +++++++-- > 3 files changed, 115 insertions(+), 19 deletions(-) > > diff --git a/tools/memprof.lua b/tools/memprof.lua > index 9f96208..5d09fbd 100644 > --- a/tools/memprof.lua > +++ b/tools/memprof.lua > @@ -37,6 +37,11 @@ Supported options are: > os.exit(0) > end > > +local leak_only = false > +opt_map["leak-only"] = function() Should you add this into the SYNOPSIS part? > + leak_only = true > +end > + > -- Print error and exit with error status. > local function opterror(...) > stderr:write("luajit-parse-memprof.lua: ERROR: ", ...) > @@ -94,19 +99,11 @@ local function dump(inputfile) > local reader = bufread.new(inputfile) > local symbols = symtab.parse(reader) > local events = memprof.parse(reader, symbols) > - > - stdout:write("ALLOCATIONS", "\n") > - view.render(events.alloc, symbols) > - stdout:write("\n") > - > - stdout:write("REALLOCATIONS", "\n") > - view.render(events.realloc, symbols) > - stdout:write("\n") > - > - stdout:write("DEALLOCATIONS", "\n") > - view.render(events.free, symbols) > - stdout:write("\n") > - Should the HEAP SUMMARY appear in the end of the ‘all’ report? Otherwise the ‘_only’ in the option name is mesleading: show part of the report, while it shows completely different one. Same for the code below. > + if leak_only then > + view.leak_only(events, symbols) > + else > + view.all(events, symbols) > + end > os.exit(0) > end > > diff --git a/tools/memprof/humanize.lua b/tools/memprof/humanize.lua > index 109a39d..bad0597 100644 > --- a/tools/memprof/humanize.lua > +++ b/tools/memprof/humanize.lua > @@ -28,8 +28,8 @@ function M.render(events, symbols) > )) > > local prim_loc = {} > - for _, loc in pairs(event.primary) do > - table.insert(prim_loc, symtab.demangle(symbols, loc)) > + for _, heap_chunk in pairs(event.primary) do > + table.insert(prim_loc, symtab.demangle(symbols, heap_chunk.loc)) The order of allocations is changed, was it incorrect in some way? > end > if #prim_loc ~= 0 then > table.sort(prim_loc) > @@ -42,4 +42,91 @@ function M.render(events, symbols) > end > end > > +function M.all(events, symbols) > + print("ALLOCATIONS") > + M.render(events.alloc, symbols) > + print("") > + > + print("REALLOCATIONS") > + M.render(events.realloc, symbols) > + print("") > + > + print("DEALLOCATIONS") > + M.render(events.free, symbols) > + print("") > +end > + > +function M.leak_only(events, symbols) > + -- Auto resurrects source event lines for counting/reporting. > + local heap = setmetatable({}, {__index = function(t, line) > + t[line] = { > + size = 0, > + cnt_alloc = 0, > + cnt_free = 0, > + } > + return t[line] > + end}) > + > + for _, event in pairs(events.alloc) do > + if event.loc then > + local ev_line = symtab.demangle(symbols, event.loc) > + > + heap[ev_line].size = heap[ev_line].size + event.alloc > + if (event.alloc > 0) then > + heap[ev_line].cnt_alloc = heap[ev_line].cnt_alloc + event.num > + end > + end > + end > + > + -- Realloc and free events are pretty the same. > + -- We don't interesting in aggregated alloc/free sizes for aren’t interested > + -- the event, but only for new and old size values inside > + -- alloc-realloc-free chain. Assuming that we have > + -- no collisions between different object addresses. I believe the address is source of the chunk {_|de|re}allocation, is there any different one? > + local function process_non_alloc_events(events_by_type) > + for _, event in pairs(events_by_type) do > + -- Realloc and free events always have "primary" key > + -- that references table with rewrited memory > + -- (may be empty). > + for _, heap_chunk in pairs(event.primary) do > + local ev_line = symtab.demangle(symbols, heap_chunk.loc) > + > + heap[ev_line].size = heap[ev_line].size + heap_chunk.alloced > + if (heap_chunk.alloced > 0) then Can it be negative? If not - the above line can be put under this if also. > + heap[ev_line].cnt_alloc = heap[ev_line].cnt_alloc + heap_chunk.cnt > + end > + > + heap[ev_line].size = heap[ev_line].size - heap_chunk.freed > + if (heap_chunk.freed > 0) then > + heap[ev_line].cnt_free = heap[ev_line].cnt_free + heap_chunk.cnt > + end > + end > + end > + end > + process_non_alloc_events(events.realloc) > + process_non_alloc_events(events.free) > + > + local rest_heap = {} > + for line, info in pairs(heap) do > + -- Report "INTERNAL" events inconsistencies for profiling > + -- with enabled jit. > + if info.size > 0 then > + table.insert(rest_heap, {line = line, hold_bytes = info.size}) > + end > + end > + > + table.sort(rest_heap, function(h1, h2) > + return h1.hold_bytes > h2.hold_bytes > + end) > + > + print("HEAP SUMMARY:") > + for _, h in pairs(rest_heap) do > + print(string.format( > + "%s holds %d bytes %d allocs, %d frees", > + h.line, h.hold_bytes, heap[h.line].cnt_alloc, heap[h.line].cnt_free > + )) > + end > + print("") > +end > + > return M > diff --git a/tools/memprof/parse.lua b/tools/memprof/parse.lua > index 6dae22d..374686c 100644 > --- a/tools/memprof/parse.lua > +++ b/tools/memprof/parse.lua > @@ -39,11 +39,23 @@ local function new_event(loc) > } > end > > -local function link_to_previous(heap_chunk, e) > +local function link_to_previous(heap_chunk, e, nsize) > -- Memory at this chunk was allocated before we start tracking. > if heap_chunk then > -- Save Lua code location (line) by address (id). > - e.primary[heap_chunk[2]] = heap_chunk[3] > + if e.primary[heap_chunk[2]] == nil then > + e.primary[heap_chunk[2]] = { > + loc = heap_chunk[3], > + alloced = 0, > + freed = 0, > + cnt = 0, > + } > + end > + -- Save memory diff heap information. > + local location_data = e.primary[heap_chunk[2]] > + location_data.alloced = location_data.alloced + nsize > + location_data.freed = location_data.freed + heap_chunk[1] > + location_data.cnt = location_data.cnt + 1 > end > end > > @@ -97,7 +109,7 @@ local function parse_realloc(reader, asource, events, heap) > e.free = e.free + osize > e.alloc = e.alloc + nsize > > - link_to_previous(heap[oaddr], e) > + link_to_previous(heap[oaddr], e, nsize) > > heap[oaddr] = nil > heap[naddr] = {nsize, id, loc} > @@ -116,7 +128,7 @@ local function parse_free(reader, asource, events, heap) > e.num = e.num + 1 > e.free = e.free + osize > > - link_to_previous(heap[oaddr], e) > + link_to_previous(heap[oaddr], e, 0) > > heap[oaddr] = nil > end > -- > 2.28.0 > > [1]: https://github.com/tarantool/tarantool/issues/5657
next prev parent reply other threads:[~2021-01-29 15:17 UTC|newest] Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-01-25 19:35 Sergey Kaplun via Tarantool-patches 2021-01-29 15:17 ` Sergey Ostanevich via Tarantool-patches [this message] 2021-03-17 15:03 ` Sergey Kaplun via Tarantool-patches 2021-03-29 15:50 ` Sergey Ostanevich via Tarantool-patches 2021-03-29 9:41 ` Igor Munkin via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=D56710BE-9C25-4190-9026-EF240C007919@tarantool.org \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergos@tarantool.org \ --cc=skaplun@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH luajit v1] tools: introduce --leak-only memprof parser option' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox