From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <tarantool-patches-bounces@dev.tarantool.org>
Received: from [87.239.111.99] (localhost [127.0.0.1])
	by dev.tarantool.org (Postfix) with ESMTP id 2F9042ACFC1;
	Tue, 23 May 2023 15:41:05 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 2F9042ACFC1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev;
	t=1684845665; bh=/pByznJ1kFeY9wAudE4ulQ8EiFqcxkOStKgvT1vZKPg=;
	h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe:
	 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:
	 From;
	b=uGQ1R5hrAVc/1FZE/U7FDzJaplKdrZgtzi7is8kV2ObPhvQMQVbM5pnc9gdy9SizZ
	 +ay7UAKrhpy6H1EPACBlP+wveI84RcgFVTvz0rVUZRhcXuQiC8MJ6xLGriV9a+si/R
	 5bF7kPI1OGQy+VxO61n/g+3MQrJs/PvrDsvWYI88=
Received: from smtpng3.i.mail.ru (smtpng3.i.mail.ru [94.100.177.149])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id 598512ACFC1
 for <tarantool-patches@dev.tarantool.org>;
 Tue, 23 May 2023 15:41:04 +0300 (MSK)
DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 598512ACFC1
Received: by smtpng3.m.smailru.net with esmtpa (envelope-from
 <skaplun@tarantool.org>)
 id 1q1RK1-0004bL-IP; Tue, 23 May 2023 15:41:03 +0300
Date: Tue, 23 May 2023 15:36:55 +0300
To: Maxim Kokryashkin <max.kokryashkin@gmail.com>
Message-ID: <ZGyzZ1D8LFOsmfz6@root>
References: <20230518114927.277888-1-m.kokryashkin@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20230518114927.277888-1-m.kokryashkin@tarantool.org>
X-Mailru-Src: smtp
X-Mras: Ok
Subject: Re: [Tarantool-patches] [PATCH luajit] sysprof: improve parser's
 memory footprint
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Tarantool development patches <tarantool-patches.dev.tarantool.org>
List-Unsubscribe: <https://lists.tarantool.org/mailman/options/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=unsubscribe>
List-Archive: <https://lists.tarantool.org/pipermail/tarantool-patches/>
List-Post: <mailto:tarantool-patches@dev.tarantool.org>
List-Help: <mailto:tarantool-patches-request@dev.tarantool.org?subject=help>
List-Subscribe: <https://lists.tarantool.org/mailman/listinfo/tarantool-patches>, 
 <mailto:tarantool-patches-request@dev.tarantool.org?subject=subscribe>
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
Reply-To: Sergey Kaplun <skaplun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches" <tarantool-patches-bounces@dev.tarantool.org>

Hi, Maxim!
Thanks for the patch!
LGTM, with some minor comments below.

On 18.05.23, Maxim Kokryashkin wrote:
> This patch reduces sysprof's parser memory footprint,
> by avoiding reading all callchains before collapsing them.
> Instead of it, parser merges stacks immediately after
> reading them and stores counts in a lua table.
> 
> Also, it fixes a bug in the AVL-tree implementation,
> which produced unnecessary inserts of values into nodes.

Should there be a test for this?
Also, maybe this should be done in a separate commit (not a separate patch?).

> ---
> Branch: https://github.com/tarantool/luajit/tree/fckxorg/gh-noticket-sysprof-parser-refactoring
> PR: https://github.com/tarantool/tarantool/pull/8670
> 
> NB: CI is red in LuaJIT repo because this patch requires changes in the
> tarantool repo, so please refer to CI runs in PR.
> 
>  tools/CMakeLists.txt       |   2 -
>  tools/sysprof.lua          |  27 +-------
>  tools/sysprof/collapse.lua | 124 ------------------------------------
>  tools/sysprof/parse.lua    | 125 ++++++++++++++++++++++++++-----------
>  tools/utils/avl.lua        |   2 +-
>  tools/utils/symtab.lua     |   2 +-
>  6 files changed, 95 insertions(+), 187 deletions(-)
>  delete mode 100755 tools/sysprof/collapse.lua
> 
> diff --git a/tools/CMakeLists.txt b/tools/CMakeLists.txt
> index dd7ec6bd..3a919433 100644
> --- a/tools/CMakeLists.txt
> +++ b/tools/CMakeLists.txt

<snipped>

> diff --git a/tools/sysprof.lua b/tools/sysprof.lua
> index 1afab195..be2a0565 100644
> --- a/tools/sysprof.lua
> +++ b/tools/sysprof.lua

<snipped>

> diff --git a/tools/sysprof/collapse.lua b/tools/sysprof/collapse.lua
> deleted file mode 100755
> index 3d83d5ea..00000000
> --- a/tools/sysprof/collapse.lua
> +++ /dev/null

<snipped>

> diff --git a/tools/sysprof/parse.lua b/tools/sysprof/parse.lua
> index 5b52f104..3db36472 100755
> --- a/tools/sysprof/parse.lua
> +++ b/tools/sysprof/parse.lua

<snipped>

>  end
>  
> -local function parse_ffunc(reader, event, _)
> +local function parse_ffunc(reader, _)
>    local ffid = reader:read_uleb128()
> -  table.insert(event.lua.callchain, 1, {
> -    type = M.FRAME.FFUNC,
> -    ffid = ffid,
> -  })
> +  return vmdef.ffnames[ffid]

Nice, good changes!

>  end
>  

<snipped>

>  local function parse_lua_callchain(reader, event, symbols)
>    while true do
>      local frame_header = reader:read_octet()
> -    if frame_header == M.FRAME.BOTTOM then
> +    if frame_header == FRAME.BOTTOM then
>        break
>      end
> -    frame_parsers[frame_header](reader, event, symbols)
> +    local name = frame_parsers[frame_header](reader, symbols)
> +    table.insert(event.lua.callchain, 1, {name=name, type=frame_header})

Nit: missing whitespace around `=`.

>    end
>  end
>  
> @@ -100,7 +100,7 @@ local function parse_host_callchain(reader, event, symbols)

<snipped>

> @@ -108,10 +108,20 @@ end
>  --~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--
>  
>  local function parse_trace_callchain(reader, event, symbols)
> -  event.lua.trace.traceno  = reader:read_uleb128()
> -  event.lua.trace.addr = reader:read_uleb128()
> -  event.lua.trace.line = reader:read_uleb128()
> -  event.lua.trace.gen = symtab.loc(symbols, event.lua.trace).gen
> +  local loc = {
> +    traceno  = reader:read_uleb128(),
> +    addr = reader:read_uleb128(),
> +    line = reader:read_uleb128()

OK, this looks fragile. Yes, the LuaJIT parser emits bytecode that
executes these reads in order, but I suggest rewriting it in a clearer way:

| local loc = {}
| loc.traceno = reader:read_uleb128()
| loc.addr = reader:read_uleb128()
| loc.line = reader:read_uleb128()
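
To illustrate why separate statements are the safer form: as far as I
know, the Lua reference manual does not promise an evaluation order for
table constructor fields, while statements always run top to bottom. A
minimal sketch, assuming a stub reader (`new_stub_reader` is a made-up
helper for this example, not part of the patch) in place of the real
stream reader:

```lua
-- Stub reader (hypothetical): returns the queued values one by one,
-- standing in for reader:read_uleb128() on a real stream.
local function new_stub_reader(values)
  local pos = 0
  return {
    read_uleb128 = function(_)
      pos = pos + 1
      return values[pos]
    end,
  }
end

local reader = new_stub_reader({ 42, 0xdead, 7 })

-- One read per statement: the read order follows the statement order
-- and does not depend on how a table constructor evaluates its fields.
local loc = {}
loc.traceno = reader:read_uleb128()
loc.addr = reader:read_uleb128()
loc.line = reader:read_uleb128()

print(loc.traceno, loc.addr, loc.line)
```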

> +  }

<snipped>

>  
> +local function insert_lua_callchain(chain, lua)
> +  local ins_cnt = 0
> +  local name_lua
> +  for _, fr in ipairs(lua.callchain) do
> +    ins_cnt = ins_cnt + 1
> +    if fr.type == FRAME.CFUNC then
> +      -- C function encountered, the next chunk
> +      -- of frames is located on the C stack.
> +      break
> +    end
> +    name_lua = fr.name
> +
> +    if fr.type == FRAME.LFUNC and lua.trace.traceno ~= nil and
> +        lua.trace.addr == fr.addr and lua.trace.line == fr.line then
> +            name_lua = lua.trace.name
> +    end

Minor: I suggest formatting it like the following:

| if fr.type == FRAME.LFUNC
|    and lua.trace.traceno ~= nil
|    and lua.trace.addr == fr.addr
|    and lua.trace.line == fr.line
| then
|   name_lua = lua.trace.name
| end

or

| if
|   fr.type == FRAME.LFUNC
|   and lua.trace.traceno ~= nil
|   and lua.trace.addr == fr.addr
|   and lua.trace.line == fr.line
| then
|   name_lua = lua.trace.name
| end

> +
> +    table.insert(chain, name_lua)
> +  end
> +  table.remove(lua.callchain, ins_cnt)
> +end
> +
> +local function merge(event)
> +  local cc = {}
> +
> +  for _, name_host in ipairs(event.host.callchain) do
> +    table.insert(cc, name_host)
> +    if string.match(name_host, '^lua_cpcall') ~= nil then
> +      -- Any C function is present on both the C and the Lua
> +      -- stacks. It is more convenient to get its info from the
> +      -- host stack, since it has information about child frames.
> +      table.remove(event.lua.callchain)
> +    end
> +
> +    if string.match(name_host, '^lua_p?call') ~= nil then
> +      insert_lua_callchain(cc, event.lua)
> +    end
> +
> +  end
> +  return cc
> +end
> +
>  local function parse_event(reader, events, symbols)
>    local event = new_event()
>  
> @@ -171,8 +223,10 @@ local function parse_event(reader, events, symbols)
>    event.lua.vmstate = vmstate
>  
>    event_parsers[vmstate](reader, event, symbols)
> -
> -  table.insert(events, event)
> +  local cc = merge(event)
> +  local cc_str = table.concat(cc, ';') .. ';'

Should we just return cc_str from merge?
Then the result will look like a truly merged stack.

Also, what do cc and cc_str stand for? :)
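
To show the shape I have in mind: a rough sketch only, where `collapse`
and `count_stack` are made-up names for this example and the frame lists
are invented data. The merged stack is returned as a string and used
directly as the key of the counter table:

```lua
-- Hypothetical helper: collapse a list of frame names into a single
-- key such as 'main;lua_pcall;foo;'.
local function collapse(frames)
  return table.concat(frames, ';') .. ';'
end

-- Count one more occurrence of the collapsed stack in `events`.
local function count_stack(events, frames)
  local key = collapse(frames)
  events[key] = (events[key] or 0) + 1
  return key
end

local events = {}
count_stack(events, { 'main', 'lua_pcall', 'foo' })
count_stack(events, { 'main', 'lua_pcall', 'foo' })
count_stack(events, { 'main', 'lua_pcall', 'bar' })

-- Iteration order of pairs() is unspecified.
for key, cnt in pairs(events) do
  print(key, cnt)
end
```

With this shape only one counter per distinct stack is kept, so memory
grows with the number of unique stacks rather than with the number of
samples.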

> +  local cur_cnt = events[cc_str]
> +  events[cc_str] = (cur_cnt or 0) + 1
>    return true
>  end
>  
> @@ -203,4 +257,5 @@ function M.parse(reader, symbols)
>    return events
>  end
>  
> +
>  return M
> diff --git a/tools/utils/avl.lua b/tools/utils/avl.lua
> index d5baa534..098f58ec 100644
> --- a/tools/utils/avl.lua
> +++ b/tools/utils/avl.lua

<snipped>

> diff --git a/tools/utils/symtab.lua b/tools/utils/symtab.lua
> index c26a9e8c..7f6c78f0 100644
> --- a/tools/utils/symtab.lua
> +++ b/tools/utils/symtab.lua

<snipped>

> -- 
> 2.40.1
> 

-- 
Best regards,
Sergey Kaplun