* Re: [Tarantool-patches] [PATCH luajit][v2] Always close profiler output file.
2025-02-12 12:20 [Tarantool-patches] [PATCH luajit][v2] Always close profiler output file Sergey Bronnikov via Tarantool-patches
@ 2025-02-13 15:31 ` Sergey Kaplun via Tarantool-patches
0 siblings, 0 replies; 2+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-02-13 15:31 UTC (permalink / raw)
To: Sergey Bronnikov; +Cc: tarantool-patches
Hi, Sergey!
Thanks for the fixes and clarification!
Please consider by answers below.
On 12.02.25, Sergey Bronnikov wrote:
> From: Mike Pall <mike>
>
> Reported by Guilherme Batalheiro.
>
> (cherry picked from commit fca66335d131669cf017420af6963a7565babb58)
>
> Before the patch, a function `prof_finish` wrote a string
> `No samples collected` to a profiler output file and then exited.
> Due to early exit, thethe output file handle stay open. This patch
Typo: s/thethe /the /
> fixes the condition and the file handle is closed even if the
> number of samples is equal to 0.
>
> Sergey Bronnikov:
> * added the description and the test for the problem
>
> Part of tarantool/tarantool#11055
> ---
> Changes v2:
> - Added fixes according to comments by Sergey Kaplun.
>
> Branch: https://github.com/tarantool/luajit/tree/ligurio/gh-xxxx-close-file-profiler
>
> Related issues:
> - https://github.com/luajIT/luajIT/issues/1304
> - https://github.com/tarantool/tarantool/issues/11055
>
> src/jit/p.lua | 4 +--
> ...close-profile-dump-with-0-samples.test.lua | 31 +++++++++++++++++++
> .../script.lua | 19 ++++++++++++
> test/tarantool-tests/utils/tools.lua | 4 +++
> 4 files changed, 55 insertions(+), 3 deletions(-)
> create mode 100644 test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua
> create mode 100644 test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua
>
> diff --git a/src/jit/p.lua b/src/jit/p.lua
> index 4569d69e..89b49584 100644
> --- a/src/jit/p.lua
> +++ b/src/jit/p.lua
<snipped>
> diff --git a/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua
> new file mode 100644
> index 00000000..614ecc69
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua
> @@ -0,0 +1,31 @@
<snipped>
> +local script = utils.exec.makecmd(arg)
> +-- Execute a Lua script with start and stop LuaJIT profiler,
> +-- it is expected no samples found by profiler. The script's
> +-- output is suppressed, it is not interested.
> +local _ = script(p_filename, p_postfix)
> # > I don't get it. Why do we need the separate script for it?
> # prof_finish() is executed first time on calling `jit.p.stop()`
> #
> # and executed second time when GC finalizer is calling [1] and therefore
> # message "'[No samples collected]'" appears
> #
> # two times in the output file.
> #
> # 1. https://github.com/tarantool/luajit/blob/80360fb2e570ce8d54ff59ccf0bcd5dfb6b98527/src/jit/p.lua#L290
> #
> # We can inline script.lua to a test body and call garbage collector manually,
> # but I propose to leave it as is because executing as a separate process is
> # more real use case.
Now it is clear to me, thanks!
This is another face of the bug that was fixed by this commit.
Nice catch!
In the original issue, automated parsing shows empty file content.
I suppose it was done with something like the following:
| LUA_PATH="src/?.lua;;" src/luajit -e '
| local jp = require"jit.p"
| local filename = "/tmp/j.p"
|
| collectgarbage("stop")
| jp.start("i9999999", filename)
| jp.stop()
|
| local f = io.open(filename, "r")
| print(("<%s>"):format(f:read("*all")))
| f.close()
| collectgarbage("restart")
| os.remove(filename)
| '
Before the patch, I get the following output:
| <>
After the patch, I get the following output:
| <[No samples collected]
| >
I guess this is the use case for guys from the NeoVim:
They run the profiler inside the application but have no samples
collected. The inspector may read the file from the different process,
but there is no need for it. And, yes, in the test we have to manually
stop and restart the GC, but only for the stability of the reproducer.
It is not a part of a use case itself.
Your test case may be simplified as the following (in terms of my
script):
| LUA_PATH="src/?.lua;;" src/luajit -e '
| local jp = require"jit.p"
| local filename = "/tmp/j.p"
|
| collectgarbage("stop")
| jp.start("i9999999", filename)
| jp.stop()
|
| -- Unload the module and clean the local.
| package.loaded["jit.p"] = nil
| jp = nil
| collectgarbage("collect")
|
| local f = io.open(filename, "r")
| print(("<%s>"):format(f:read("*all")))
| f.close()
| collectgarbage("restart")
| os.remove(filename)
| '
| <[No samples collected]
| [No samples collected]
| >
Since, IMHO, there is no need to check the </proc/$$/fd> content (see
the rationale below), we may use this test case in the original file
without the helper script.
So, let's check both cases (the original one and that found by you) in
the test.
> +
> +local p_content = io.open(p_filename):read('a*')
> +test:is(utils.tools.trim(p_content), '[No samples collected]',
> + 'profile dump has no samples')
> # > Minor: I suppose simple `test:like()` will be enough. In that case there
> # > is no need to use the newly introduced `trim()` function here.
> #
> # The check checks that exactly one "[No samples collected]" is in a file.
> #
> # Before the patch, a file contains two "[No samples collected]".
OK, then let's use the comparison with '[No samples collected]\n', since
we want exact matching. That way we will be sure that there are no
garbage whitespaces, and there is no need to see what exactly (strip
only \n or all whitespaces, etc.) `trim()` helper does.
> +
> +-- Teardown.
> +os.remove(p_filename)
> +
> +test:done(true)
> diff --git a/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua
> new file mode 100644
> index 00000000..864958f5
> --- /dev/null
> +++ b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua
<snipped>
> +local ls_output = io.popen('ls -l /proc/$$/fd'):read('a*')
> # > Minor: I am not sure that we really need this check. The descriptor will
> # > be closed when it will be garbage collected, IINM, so there is no leak,
> # > actually. I suggest dropping it and checking only the content of the
> # > file -- this is the main issue regarding `jit.p` solved by that commit.
> #
> # This is for checking that `prof_finish` behaves correctly.
> #
> # fd leak is a resource leak and there is a separate CWE for this -
> #
> # https://cwe.mitre.org/data/definitions/775.html
> #
> # On my machine max number of file descriptors is 126693 per process:
> #
> # [0] ~ $ ulimit -n
> # 126693
> #
> # You will get file descriptor exhaustion if you will run
> #
> # `require('jit.p').start()` 126693 times with different output files.
I still believe that this is not a leak.
Since we have 1 storage for file descriptor (`out` in <jit/p.lua>) all
previous objects stored to this upvalue become dead and will be
collected by the GC.
On my machine, I have only 1024 fds available for the process by
default:
| $ ulimit -n
| 1024
So we can easily check the fds leakage:
| LUA_PATH="src/?.lua;;" src/luajit -e '
| local jp = require"jit.p"
| local filename = "/tmp/j.p"
|
| -- collectgarbage("stop")
| for i = 1, 2000 do
| jp.start("i9999999", filename)
| jp.stop()
| if i % 100 == 0 then print(i) end
| end
|
| os.remove(filename)
| '
| 100
| ...
| 2000
Prints numbers up to 2000 successfully (i.e., there is no leakage of
descriptors).
Plus, we have a bunch of errors like the following:
| ERROR in finalizer: src/jit/p.lua:230: attempt to use a closed file
This happens on `lua_close()` since the udata with the `io` handle is
finalized before the `newproxy()`. Thus, the finalizer for the proxy
can't be used `io` in its finalizer.
Obviously, if we uncomment the `collectgarbage("stop")` line, we get the
corresponding error:
| 1000
| src/luajit: src/jit/p.lua:300: /tmp/j.p: Too many open files
But, anyway, at the `lua_close()` all descriptors will be closed.
The similar behaviour we have for the pure Lua:
| lua -e '
| local filename = "/tmp/j.p"
| collectgarbage("stop")
| for i = 1, 2000 do
| assert(io.open(filename, "w"))
| if i % 100 == 0 then print(i) end
| end
|
| os.remove(filename)
| '
| ..
| 1000
| lua: (command line):7: /tmp/j.p: Too many open files
Thus, there is no need for the to check </proc/$$/fd> content. Hence, we
don't need to run this inside the separate script.
<snipped>
> --
> 2.34.1
>
--
Best regards,
Sergey Kaplun
^ permalink raw reply [flat|nested] 2+ messages in thread