From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4B7744765E0 for ; Fri, 25 Dec 2020 02:09:50 +0300 (MSK) Date: Fri, 25 Dec 2020 02:09:44 +0300 From: Igor Munkin Message-ID: <20201224230944.GC5396@tarantool.org> References: <9928314fb60091f99ca9bec1c17ca7cbb4bb79de.1608142899.git.skaplun@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <9928314fb60091f99ca9bec1c17ca7cbb4bb79de.1608142899.git.skaplun@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH luajit v1 11/11] profile: introduce profile parser List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Kaplun Cc: tarantool-patches@dev.tarantool.org Sergey, Thanks for the patch! Please consider my comments below. On 16.12.20, Sergey Kaplun wrote: > This patch adds parser for profiler dumped binary data. > It provides a script that parses given binary file. It parses symtab using > ffi first and after map memory events with this symtab. Finally, it > renders the data in human-readable format. Side note: I saw the new version for the commit message, so left no comments for this one. > > Part of tarantool/tarantool#5442 > Part of tarantool/tarantool#5490 > --- > test/misclib-memprof-lapi.test.lua | 125 +++++++++++++++++ > tools/luajit-parse-memprof | 20 +++ > tools/parse_memprof/bufread.lua | 143 +++++++++++++++++++ > tools/parse_memprof/main.lua | 104 ++++++++++++++ > tools/parse_memprof/parse_memprof.lua | 195 ++++++++++++++++++++++++++ > tools/parse_memprof/parse_symtab.lua | 88 ++++++++++++ > tools/parse_memprof/view_plain.lua | 45 ++++++ > 7 files changed, 720 insertions(+) > create mode 100755 test/misclib-memprof-lapi.test.lua > create mode 100755 tools/luajit-parse-memprof > create mode 100644 tools/parse_memprof/bufread.lua > create mode 100644 tools/parse_memprof/main.lua > create mode 100644 tools/parse_memprof/parse_memprof.lua > create mode 100644 tools/parse_memprof/parse_symtab.lua > create mode 100644 tools/parse_memprof/view_plain.lua General notes for all comments in the sources below: * Write it above the code to be commented * Start it with a capital letter * End it with a dot > > diff --git a/test/misclib-memprof-lapi.test.lua b/test/misclib-memprof-lapi.test.lua > new file mode 100755 > index 0000000..e3663bb > --- /dev/null > +++ b/test/misclib-memprof-lapi.test.lua > @@ -0,0 +1,125 @@ > +local path = arg[0]:gsub('/[^/]+%.test%.lua', '') > +local parse_suffix = '../tools/parse_memprof/?.lua;' > +package.path = ('%s/%s;'):format(path, parse_suffix)..package.path This is why I hate patching package.path in sources. There is a patchset where I move this suite to another place. Now I see that I need to patch this test. OK, we can do nothing with tests for now, so please add a FIXME note for these lines to fix the issue via environment variable. > + > +local table_new = require "table.new" > + > +local bufread = require "bufread" > +local memprof = require "parse_memprof" > +local symtab = require "parse_symtab" Typo: Something is wrong with the whitespace above. > + > +local TMP_BINFILE = arg[0]:gsub('[^/]+%.test%.lua', '%.%1.memprofdata.tmp.bin') > +local BAD_PATH = arg[0]:gsub('[^/]+%.test%.lua', '%1/memprofdata.tmp.bin') Typo: Something is wrong with the whitespace above. > + > +local tap = require('tap') > + > +local test = tap.test("misc-memprof-lapi") > +test:plan(6) Why is the test initialized *here*? > + > +local function check_alloc_report(alloc, line, function_line, nevents) > + assert(string.format("@%s:%d", arg[0], function_line) == alloc[line].name) > + print(nevents, alloc[line].num) Do you need this print? > + assert(alloc[line].num == nevents) > + return true > +end > + > +-- Not a directory. > +local res, err = misc.memprof.start(BAD_PATH) > +test:ok(res == nil and err:match("Not a directory")) This case can also check so-called "system-dependent error code", right? > + > +local reader = bufread.new(TMP_BINFILE) > +local symbols = symtab.parse(reader) > +local events = memprof.parse(reader, symbols) Typo: Something is wrong with the whitespace above. > + > +-- 1 event -- alocation of table by itself + 1 allocation > +-- of array part as far it bigger then LJ_MAX_COLOSIZE (16). > +test:ok(check_alloc_report(alloc, 21, 19, 2)) > +-- 100 strings allocations. > +test:ok(check_alloc_report(alloc, 26, 19, 100)) > + > +-- Collect all previous allocated objects. > +test:ok(free.INTERNAL.num == 102) I guess you there is no need to name these magic numbers, but it's totally worth to leave a comment describing them. > + > +os.exit(test:check() and 0 or 1) > diff --git a/tools/luajit-parse-memprof b/tools/luajit-parse-memprof > new file mode 100755 > index 0000000..b9b16d7 > --- /dev/null > +++ b/tools/luajit-parse-memprof I believe it's better to implement runner to be installed to the system instead of the one to be used in sources. By the way, take a look on the way luajit.pc is installed. > @@ -0,0 +1,20 @@ > diff --git a/tools/parse_memprof/bufread.lua b/tools/parse_memprof/bufread.lua > new file mode 100644 > index 0000000..d48d6e8 > --- /dev/null > +++ b/tools/parse_memprof/bufread.lua > @@ -0,0 +1,143 @@ > +local ffi_C = ffi.C > +local band = bit.band Typo: Something is wrong with the whitespace above. > + > + > +local function _read_stream(reader, n) > + local free_size Minor: It's better to declare this variable where it is used. > + local tail_size = reader._end - reader._pos > +function M.read_uleb128(reader) > + local value = ffi.new('uint64_t', 0) > + local shift = 0 > + > + repeat > + local oct = M.read_octet(reader) > + > + if oct == nil then > + break > + end I guess this is an exceptional case, isn't it? If we hit this condition the ULEB128 value is misencoded, so it's worth to raise an error. > + > + -- Alas, bit library works only with 32-bit arguments: > + local oct_u64 = ffi.new('uint64_t', band(oct, 0x7f)) Please do not use magic number in code. > + value = value + oct_u64 * (2 ^ shift) > + shift = shift + 7 > + > + until band(oct, 0x80) == 0 Please do not use magic number in code. > + > + return tonumber(value) > +end > diff --git a/tools/parse_memprof/main.lua b/tools/parse_memprof/main.lua > new file mode 100644 > index 0000000..9a161b1 > --- /dev/null > +++ b/tools/parse_memprof/main.lua > @@ -0,0 +1,104 @@ > +local bufread = require "bufread" > +local memprof = require "parse_memprof" > +local symtab = require "parse_symtab" > +local view = require "view_plain" Typo: Something is wrong with the whitespace above. > + > +local stdout, stderr = io.stdout, io.stderr > +local _s = string This variable looks excess: it is used once for the lookup below. > +local match, gmatch = _s.match, _s.gmatch > +-- Parse arguments. > +local function parseargs(args) > + -- Process all option arguments. > + args.argn = 1 > + repeat > + local a = args[args.argn] > + if not a then break end > + local lopt, opt = match(a, "^%-(%-?)(.+)") > + if not opt then break end > + args.argn = args.argn + 1 > + if lopt == "" then > + -- Loop through short options. > + for o in gmatch(opt, ".") do parseopt(o, args) end Please make this loop multiline. > + else > + -- Long option. > + parseopt(opt, args) > + end > + until false > + > + -- Check for proper number of arguments. > + local nargs = #args - args.argn + 1 > + if nargs ~= 1 then > + opt_map.help() > + end > + > + -- Translate a single input file. > + -- TODO: Handle multiple files? > + return args[args.argn] > +end > + > +local inputfile = parseargs{...} > + > +local reader = bufread.new(inputfile) > +local symbols = symtab.parse(reader) > +local events = memprof.parse(reader, symbols) Typo: Something is wrong with the whitespace above. > + > diff --git a/tools/parse_memprof/parse_memprof.lua b/tools/parse_memprof/parse_memprof.lua > new file mode 100644 > index 0000000..dc56fed > --- /dev/null > +++ b/tools/parse_memprof/parse_memprof.lua > @@ -0,0 +1,195 @@ > +local bit = require 'bit' > +local band = bit.band > +local lshift = bit.lshift > + > +local string_format = string.format > + > +local LJM_MAGIC = 'ljm' > +local LJM_CURRENT_VERSION = 2 > + > +local LJM_EPILOGUE_HEADER = 0x80 > + > +local AEVENT_ALLOC = 1 > +local AEVENT_FREE = 2 > +local AEVENT_REALLOC = 3 > + > +local ASOURCE_INT = lshift(1, 2) > +local ASOURCE_LFUNC = lshift(2, 2) > +local ASOURCE_CFUNC = lshift(3, 2) Typo: Something is wrong with the whitespace above. > + > +local function link_to_previous(heap, e, oaddr) > + -- memory at oaddr was allocated before we started tracking: > + if heap[oaddr] then > + e.primary[heap[oaddr][2]] = heap[oaddr][3] Please leave the comment that the key is id and the value is location. > + end > +end > + > +local function parse_location(reader, asource) What about introducing another function for it? | local function id_location(addr, line) | return string_format('f%#xl%d', addr, line), { | addr = addr, | line = line, | } | end | | local function parse_location(reader, asource) | if asource == ASOURCE_INT then | return id_location(0, 0) | elseif asource == ASOURCE_CFUNC then | return id_location(reader:read_uleb128(), 0) | elseif asource == ASOURCE_LFUNC then | return id_location(reader:read_uleb128(), reader:read_uleb128()) | end | error('Unknown asource '..asource) | end > + local e = events[id] > + e.num = e.num + 1 > + e.alloc = e.alloc + nsize Typo: Something is wrong with the whitespace above. > + > + local e = events[id] > + e.num = e.num + 1 > + e.free = e.free + osize > + e.alloc = e.alloc + nsize Typo: Something is wrong with the whitespace above. > + > + local e = events[id] > + e.num = e.num + 1 > + e.free = e.free + osize Typo: Something is wrong with the whitespace above. > + > +local parsers = { > + [AEVENT_ALLOC] = {evname = 'alloc', parse = parse_alloc}, > + [AEVENT_REALLOC] = {evname = 'realloc', parse = parse_realloc}, > + [AEVENT_FREE] = {evname = 'free', parse = parse_free}, Typo: Something is wrong with the whitespace above. > +} > +local function ev_header_is_epilogue(evh) > + return evh == LJM_EPILOGUE_HEADER > +end This function is excess. Just use this comparison in the condition. > + > + local magic = reader:read_octets(3) > + local version = reader:read_octets(1) > + -- dummy-consume reserved bytes > + local _ = reader:read_octets(3) Typo: Something is wrong with the whitespace above. > + > diff --git a/tools/parse_memprof/parse_symtab.lua b/tools/parse_memprof/parse_symtab.lua > new file mode 100644 > index 0000000..54e9337 > --- /dev/null > +++ b/tools/parse_memprof/parse_symtab.lua > @@ -0,0 +1,88 @@ > +local band = bit.band > +local string_format = string.format > + > +local LJS_MAGIC = 'ljs' > +local LJS_CURRENT_VERSION = 2 > +local LJS_EPILOGUE_HEADER = 0x80 > +local LJS_SYMTYPE_MASK = 0x03 Typo: Something is wrong with the whitespace above. > + > +-- Parse a single entry in a symtab: lfunc symbol > +local function parse_sym_lfunc(reader, symtab) > + local sym_addr = reader:read_uleb128() > + local sym_chunk = reader:read_string() > + local sym_line = reader:read_uleb128() Typo: Something is wrong with the whitespace above. > + > +function M.parse(reader) > + local symtab = {} > + local magic = reader:read_octets(3) > + local version = reader:read_octets(1) Typo: Something is wrong with the whitespace above. > + > + > + while not reader:eof() do > + local header = reader:read_octet() > + local is_final = band(header, LJS_EPILOGUE_HEADER) ~= 0 Typo: Something is wrong with the whitespace above. > + > -- > 2.28.0 > -- Best regards, IM