[Tarantool-patches] [PATCH luajit v1 11/11] profile: introduce profile parser

Igor Munkin imun at tarantool.org
Fri Dec 25 02:09:44 MSK 2020


Sergey,

Thanks for the patch! Please consider my comments below.

On 16.12.20, Sergey Kaplun wrote:
> This patch adds parser for profiler dumped binary data.
> It provides a script that parses given binary file. It parses symtab using
> ffi first and after map memory events with this symtab. Finally, it
> renders the data in human-readable format.

Side note: I saw the new version for the commit message, so left no
comments for this one.

> 
> Part of tarantool/tarantool#5442
> Part of tarantool/tarantool#5490
> ---
>  test/misclib-memprof-lapi.test.lua    | 125 +++++++++++++++++
>  tools/luajit-parse-memprof            |  20 +++
>  tools/parse_memprof/bufread.lua       | 143 +++++++++++++++++++
>  tools/parse_memprof/main.lua          | 104 ++++++++++++++
>  tools/parse_memprof/parse_memprof.lua | 195 ++++++++++++++++++++++++++
>  tools/parse_memprof/parse_symtab.lua  |  88 ++++++++++++
>  tools/parse_memprof/view_plain.lua    |  45 ++++++
>  7 files changed, 720 insertions(+)
>  create mode 100755 test/misclib-memprof-lapi.test.lua
>  create mode 100755 tools/luajit-parse-memprof
>  create mode 100644 tools/parse_memprof/bufread.lua
>  create mode 100644 tools/parse_memprof/main.lua
>  create mode 100644 tools/parse_memprof/parse_memprof.lua
>  create mode 100644 tools/parse_memprof/parse_symtab.lua
>  create mode 100644 tools/parse_memprof/view_plain.lua

General notes for all comments in the sources below:
* Write it above the code to be commented
* Start it with a capital letter
* End it with a dot

> 
> diff --git a/test/misclib-memprof-lapi.test.lua b/test/misclib-memprof-lapi.test.lua
> new file mode 100755
> index 0000000..e3663bb
> --- /dev/null
> +++ b/test/misclib-memprof-lapi.test.lua
> @@ -0,0 +1,125 @@

<snipped>

> +local path = arg[0]:gsub('/[^/]+%.test%.lua', '')
> +local parse_suffix = '../tools/parse_memprof/?.lua;'
> +package.path = ('%s/%s;'):format(path, parse_suffix)..package.path

This is why I hate patching package.path in sources. There is a patchset
where I move this suite to another place. Now I see that I need to patch
this test. OK, we can do nothing with tests for now, so please add a
FIXME note for these lines to fix the issue via environment variable.

> +
> +local table_new = require "table.new"
> +
> +local bufread = require "bufread"
> +local memprof = require "parse_memprof"
> +local symtab  = require "parse_symtab"

Typo: Something is wrong with the whitespace above.

> +
> +local TMP_BINFILE = arg[0]:gsub('[^/]+%.test%.lua', '%.%1.memprofdata.tmp.bin')
> +local BAD_PATH    = arg[0]:gsub('[^/]+%.test%.lua', '%1/memprofdata.tmp.bin')

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +local tap = require('tap')
> +
> +local test = tap.test("misc-memprof-lapi")
> +test:plan(6)

Why is the test initialized *here*?

> +

<snipped>

> +local function check_alloc_report(alloc, line, function_line, nevents)
> +  assert(string.format("@%s:%d", arg[0], function_line) == alloc[line].name)
> +  print(nevents, alloc[line].num)

Do you need this print?

> +  assert(alloc[line].num == nevents)
> +  return true
> +end
> +
> +-- Not a directory.
> +local res, err = misc.memprof.start(BAD_PATH)
> +test:ok(res == nil and err:match("Not a directory"))

This case can also check so-called "system-dependent error code", right?

> +

<snipped>

> +local reader  = bufread.new(TMP_BINFILE)
> +local symbols = symtab.parse(reader)
> +local events  = memprof.parse(reader, symbols)

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +-- 1 event -- alocation of table by itself + 1 allocation
> +-- of array part as far it bigger then LJ_MAX_COLOSIZE (16).
> +test:ok(check_alloc_report(alloc, 21, 19, 2))
> +-- 100 strings allocations.
> +test:ok(check_alloc_report(alloc, 26, 19, 100))
> +
> +-- Collect all previous allocated objects.
> +test:ok(free.INTERNAL.num == 102)

I guess you there is no need to name these magic numbers, but it's
totally worth to leave a comment describing them.

> +
> +os.exit(test:check() and 0 or 1)
> diff --git a/tools/luajit-parse-memprof b/tools/luajit-parse-memprof
> new file mode 100755
> index 0000000..b9b16d7
> --- /dev/null
> +++ b/tools/luajit-parse-memprof

I believe it's better to implement runner to be installed to the system
instead of the one to be used in sources. By the way, take a look on the
way luajit.pc is installed.

> @@ -0,0 +1,20 @@

<snipped>

> diff --git a/tools/parse_memprof/bufread.lua b/tools/parse_memprof/bufread.lua
> new file mode 100644
> index 0000000..d48d6e8
> --- /dev/null
> +++ b/tools/parse_memprof/bufread.lua
> @@ -0,0 +1,143 @@

<snipped>

> +local ffi_C  = ffi.C
> +local band   = bit.band

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +
> +local function _read_stream(reader, n)
> +  local free_size

Minor: It's better to declare this variable where it is used.

> +  local tail_size = reader._end - reader._pos

<snipped>

> +function M.read_uleb128(reader)
> +  local value = ffi.new('uint64_t', 0)
> +  local shift = 0
> +
> +  repeat
> +    local oct = M.read_octet(reader)
> +
> +    if oct == nil then
> +      break
> +    end

I guess this is an exceptional case, isn't it? If we hit this condition
the ULEB128 value is misencoded, so it's worth to raise an error.

> +
> +    -- Alas, bit library works only with 32-bit arguments:
> +    local oct_u64 = ffi.new('uint64_t', band(oct, 0x7f))

Please do not use magic number in code.

> +    value = value + oct_u64 * (2 ^ shift)
> +    shift = shift + 7
> +
> +  until band(oct, 0x80) == 0

Please do not use magic number in code.

> +
> +  return tonumber(value)
> +end

<snipped>

> diff --git a/tools/parse_memprof/main.lua b/tools/parse_memprof/main.lua
> new file mode 100644
> index 0000000..9a161b1
> --- /dev/null
> +++ b/tools/parse_memprof/main.lua
> @@ -0,0 +1,104 @@

<snipped>

> +local bufread = require "bufread"
> +local memprof = require "parse_memprof"
> +local symtab  = require "parse_symtab"
> +local view    = require "view_plain"

Typo: Something is wrong with the whitespace above.

> +
> +local stdout, stderr = io.stdout, io.stderr
> +local _s = string

This variable looks excess: it is used once for the lookup below.

> +local match, gmatch = _s.match, _s.gmatch

<snipped>

> +-- Parse arguments.
> +local function parseargs(args)
> +  -- Process all option arguments.
> +  args.argn = 1
> +  repeat
> +    local a = args[args.argn]
> +    if not a then break end
> +    local lopt, opt = match(a, "^%-(%-?)(.+)")
> +    if not opt then break end
> +    args.argn = args.argn + 1
> +    if lopt == "" then
> +      -- Loop through short options.
> +      for o in gmatch(opt, ".") do parseopt(o, args) end

Please make this loop multiline.

> +    else
> +      -- Long option.
> +      parseopt(opt, args)
> +    end
> +  until false
> +
> +  -- Check for proper number of arguments.
> +  local nargs = #args - args.argn + 1
> +  if nargs ~= 1 then
> +    opt_map.help()
> +  end
> +
> +  -- Translate a single input file.
> +  -- TODO: Handle multiple files?
> +  return args[args.argn]
> +end
> +
> +local inputfile = parseargs{...}
> +
> +local reader  = bufread.new(inputfile)
> +local symbols = symtab.parse(reader)
> +local events  = memprof.parse(reader, symbols)

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> diff --git a/tools/parse_memprof/parse_memprof.lua b/tools/parse_memprof/parse_memprof.lua
> new file mode 100644
> index 0000000..dc56fed
> --- /dev/null
> +++ b/tools/parse_memprof/parse_memprof.lua
> @@ -0,0 +1,195 @@

<snipped>

> +local bit    = require 'bit'
> +local band   = bit.band
> +local lshift = bit.lshift
> +
> +local string_format = string.format
> +
> +local LJM_MAGIC           = 'ljm'
> +local LJM_CURRENT_VERSION = 2
> +
> +local LJM_EPILOGUE_HEADER = 0x80
> +
> +local AEVENT_ALLOC   = 1
> +local AEVENT_FREE    = 2
> +local AEVENT_REALLOC = 3
> +
> +local ASOURCE_INT   = lshift(1, 2)
> +local ASOURCE_LFUNC = lshift(2, 2)
> +local ASOURCE_CFUNC = lshift(3, 2)

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +local function link_to_previous(heap, e, oaddr)
> +  -- memory at oaddr was allocated before we started tracking:
> +  if heap[oaddr] then
> +    e.primary[heap[oaddr][2]] = heap[oaddr][3]

Please leave the comment that the key is id and the value is location.

> +  end
> +end
> +
> +local function parse_location(reader, asource)

What about introducing another function for it?
| local function id_location(addr, line)
|   return string_format('f%#xl%d', addr, line), {
|     addr = addr,
|     line = line,
|   }
| end
|
| local function parse_location(reader, asource)
|   if asource == ASOURCE_INT then
|     return id_location(0, 0)
|   elseif asource == ASOURCE_CFUNC then
|     return id_location(reader:read_uleb128(), 0)
|   elseif asource == ASOURCE_LFUNC then
|     return id_location(reader:read_uleb128(), reader:read_uleb128())
|   end
|   error('Unknown asource '..asource)
| end

<snipped>

> +  local e = events[id]
> +  e.num   = e.num + 1
> +  e.alloc = e.alloc + nsize

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +  local e = events[id]
> +  e.num   = e.num + 1
> +  e.free  = e.free + osize
> +  e.alloc = e.alloc + nsize

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +  local e = events[id]
> +  e.num   = e.num + 1
> +  e.free  = e.free + osize

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +local parsers = {
> +  [AEVENT_ALLOC]   = {evname =   'alloc', parse = parse_alloc},
> +  [AEVENT_REALLOC] = {evname = 'realloc', parse = parse_realloc},
> +  [AEVENT_FREE]    = {evname =    'free', parse = parse_free},

Typo: Something is wrong with the whitespace above.

> +}

<snipped>

> +local function ev_header_is_epilogue(evh)
> +  return evh == LJM_EPILOGUE_HEADER
> +end

This function is excess. Just use this comparison in the condition.

> +

<snipped>

> +  local magic   = reader:read_octets(3)
> +  local version = reader:read_octets(1)
> +  -- dummy-consume reserved bytes
> +  local _       = reader:read_octets(3)

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> diff --git a/tools/parse_memprof/parse_symtab.lua b/tools/parse_memprof/parse_symtab.lua
> new file mode 100644
> index 0000000..54e9337
> --- /dev/null
> +++ b/tools/parse_memprof/parse_symtab.lua
> @@ -0,0 +1,88 @@

<snipped>

> +local band          = bit.band
> +local string_format = string.format
> +
> +local LJS_MAGIC           = 'ljs'
> +local LJS_CURRENT_VERSION = 2
> +local LJS_EPILOGUE_HEADER = 0x80
> +local LJS_SYMTYPE_MASK    = 0x03

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +-- Parse a single entry in a symtab: lfunc symbol
> +local function parse_sym_lfunc(reader, symtab)
> +  local sym_addr  = reader:read_uleb128()
> +  local sym_chunk = reader:read_string()
> +  local sym_line  = reader:read_uleb128()

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +function M.parse(reader)
> +  local symtab   = {}
> +  local magic    = reader:read_octets(3)
> +  local version  = reader:read_octets(1)

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> +
> +  while not reader:eof() do
> +    local header   = reader:read_octet()
> +    local is_final = band(header, LJS_EPILOGUE_HEADER) ~= 0

Typo: Something is wrong with the whitespace above.

> +

<snipped>

> -- 
> 2.28.0
> 

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list