Hi, Sergey, thanks for review! Updates applied and force-pushed. Sergey On 13.02.2025 18:31, Sergey Kaplun via Tarantool-patches wrote: > Hi, Sergey! > Thanks for the fixes and clarification! > Please consider by answers below. > > On 12.02.25, Sergey Bronnikov wrote: >> From: Mike Pall <mike> >> >> Reported by Guilherme Batalheiro. >> >> (cherry picked from commit fca66335d131669cf017420af6963a7565babb58) >> >> Before the patch, a function `prof_finish` wrote a string >> `No samples collected` to a profiler output file and then exited. >> Due to early exit, thethe output file handle stay open. This patch > Typo: s/thethe /the / Fixed. > >> fixes the condition and the file handle is closed even if the >> number of samples is equal to 0. >> >> Sergey Bronnikov: >> * added the description and the test for the problem >> >> Part of tarantool/tarantool#11055 >> --- >> Changes v2: >> - Added fixes according to comments by Sergey Kaplun. >> >> Branch:https://github.com/tarantool/luajit/tree/ligurio/gh-xxxx-close-file-profiler >> >> Related issues: >> -https://github.com/luajIT/luajIT/issues/1304 >> -https://github.com/tarantool/tarantool/issues/11055 >> >> src/jit/p.lua | 4 +-- >> ...close-profile-dump-with-0-samples.test.lua | 31 +++++++++++++++++++ >> .../script.lua | 19 ++++++++++++ >> test/tarantool-tests/utils/tools.lua | 4 +++ >> 4 files changed, 55 insertions(+), 3 deletions(-) >> create mode 100644 test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua >> create mode 100644 test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua >> >> diff --git a/src/jit/p.lua b/src/jit/p.lua >> index 4569d69e..89b49584 100644 >> --- a/src/jit/p.lua >> +++ b/src/jit/p.lua > <snipped> > >> diff --git a/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua >> new file mode 100644 >> index 00000000..614ecc69 >> --- /dev/null >> +++ b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua >> @@ -0,0 +1,31 @@ > <snipped> > >> +local script = utils.exec.makecmd(arg) >> +-- Execute a Lua script with start and stop LuaJIT profiler, >> +-- it is expected no samples found by profiler. The script's >> +-- output is suppressed, it is not interested. >> +local _ = script(p_filename, p_postfix) >> # > I don't get it. Why do we need the separate script for it? >> # prof_finish() is executed first time on calling `jit.p.stop()` >> # >> # and executed second time when GC finalizer is calling [1] and therefore >> # message "'[No samples collected]'" appears >> # >> # two times in the output file. >> # >> # 1.https://github.com/tarantool/luajit/blob/80360fb2e570ce8d54ff59ccf0bcd5dfb6b98527/src/jit/p.lua#L290 >> # >> # We can inline script.lua to a test body and call garbage collector manually, >> # but I propose to leave it as is because executing as a separate process is >> # more real use case. > Now it is clear to me, thanks! > > This is another face of the bug that was fixed by this commit. > Nice catch! > > In the original issue, automated parsing shows empty file content. > I suppose it was done with something like the following: > > | LUA_PATH="src/?.lua;;" src/luajit -e ' > | local jp = require"jit.p" > | local filename = "/tmp/j.p" > | > | collectgarbage("stop") > | jp.start("i9999999", filename) > | jp.stop() > | > | local f = io.open(filename, "r") > | print(("<%s>"):format(f:read("*all"))) > | f.close() > | collectgarbage("restart") > | os.remove(filename) > | ' > > Before the patch, I get the following output: > > | <> > > After the patch, I get the following output: > > | <[No samples collected] > | > ok, replaced testcase > > I guess this is the use case for guys from the NeoVim: > They run the profiler inside the application but have no samples > collected. The inspector may read the file from the different process, > but there is no need for it. And, yes, in the test we have to manually > stop and restart the GC, but only for the stability of the reproducer. > It is not a part of a use case itself. > > Your test case may be simplified as the following (in terms of my > script): > | LUA_PATH="src/?.lua;;" src/luajit -e ' > | local jp = require"jit.p" > | local filename = "/tmp/j.p" > | > | collectgarbage("stop") > | jp.start("i9999999", filename) > | jp.stop() > | > | -- Unload the module and clean the local. > | package.loaded["jit.p"] = nil > | jp = nil > | collectgarbage("collect") > | > | local f = io.open(filename, "r") > | print(("<%s>"):format(f:read("*all"))) > | f.close() > | collectgarbage("restart") > | os.remove(filename) > | ' > | <[No samples collected] > | [No samples collected] > | > > > Since, IMHO, there is no need to check the </proc/$$/fd> content (see > the rationale below), we may use this test case in the original file > without the helper script. > > So, let's check both cases (the original one and that found by you) in > the test. What is a rationale for this? This is a helper script for LuaJIT profiler and according to our backporting rules test must cover a backported patch. This rule is fulfilled by my test. > >> + >> +local p_content = io.open(p_filename):read('a*') >> +test:is(utils.tools.trim(p_content), '[No samples collected]', >> + 'profile dump has no samples') >> # > Minor: I suppose simple `test:like()` will be enough. In that case there >> # > is no need to use the newly introduced `trim()` function here. >> # >> # The check checks that exactly one "[No samples collected]" is in a file. >> # >> # Before the patch, a file contains two "[No samples collected]". > OK, then let's use the comparison with '[No samples collected]\n', since > we want exact matching. That way we will be sure that there are no > garbage whitespaces, and there is no need to see what exactly (strip > only \n or all whitespaces, etc.) `trim()` helper does. It looks reasonable, fixed: --- a/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua +++ b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples.test.lua @@ -22,8 +22,7 @@ local script = utils.exec.makecmd(arg) local _ = script(p_filename, p_postfix) local p_content = io.open(p_filename):read('a*') -test:is(utils.tools.trim(p_content), '[No samples collected]', - 'profile dump has no samples') +test:is(p_content, '[No samples collected]\n', 'profile dump has no samples') -- Teardown. os.remove(p_filename) > >> + >> +-- Teardown. >> +os.remove(p_filename) >> + >> +test:done(true) >> diff --git a/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua >> new file mode 100644 >> index 00000000..864958f5 >> --- /dev/null >> +++ b/test/tarantool-tests/lj-1304-close-profile-dump-with-0-samples/script.lua > <snipped> > >> +local ls_output = io.popen('ls -l /proc/$$/fd'):read('a*') >> # > Minor: I am not sure that we really need this check. The descriptor will >> # > be closed when it will be garbage collected, IINM, so there is no leak, >> # > actually. I suggest dropping it and checking only the content of the >> # > file -- this is the main issue regarding `jit.p` solved by that commit. >> # >> # This is for checking that `prof_finish` behaves correctly. >> # >> # fd leak is a resource leak and there is a separate CWE for this - >> # >> #https://cwe.mitre.org/data/definitions/775.html >> # >> # On my machine max number of file descriptors is 126693 per process: >> # >> # [0] ~ $ ulimit -n >> # 126693 >> # >> # You will get file descriptor exhaustion if you will run >> # >> # `require('jit.p').start()` 126693 times with different output files. > I still believe that this is not a leak. > Since we have 1 storage for file descriptor (`out` in <jit/p.lua>) all > previous objects stored to this upvalue become dead and will be > collected by the GC. Right, I've missed this. > > On my machine, I have only 1024 fds available for the process by > default: > | $ ulimit -n > | 1024 > > So we can easily check the fds leakage: > | LUA_PATH="src/?.lua;;" src/luajit -e ' > | local jp = require"jit.p" > | local filename = "/tmp/j.p" > | > | -- collectgarbage("stop") > | for i = 1, 2000 do > | jp.start("i9999999", filename) > | jp.stop() > | if i % 100 == 0 then print(i) end > | end > | > | os.remove(filename) > | ' > | 100 > | ... > | 2000 > > Prints numbers up to 2000 successfully (i.e., there is no leakage of > descriptors). > Plus, we have a bunch of errors like the following: > | ERROR in finalizer: src/jit/p.lua:230: attempt to use a closed file > > This happens on `lua_close()` since the udata with the `io` handle is > finalized before the `newproxy()`. Thus, the finalizer for the proxy > can't be used `io` in its finalizer. > > Obviously, if we uncomment the `collectgarbage("stop")` line, we get the > corresponding error: > > | 1000 > | src/luajit: src/jit/p.lua:300: /tmp/j.p: Too many open files > > But, anyway, at the `lua_close()` all descriptors will be closed. > > The similar behaviour we have for the pure Lua: > | lua -e ' > | local filename = "/tmp/j.p" > | collectgarbage("stop") > | for i = 1, 2000 do > | assert(io.open(filename, "w")) > | if i % 100 == 0 then print(i) end > | end > | > | os.remove(filename) > | ' > | .. > | 1000 > | lua: (command line):7: /tmp/j.p: Too many open files > > Thus, there is no need for the to check </proc/$$/fd> content. Hence, we > don't need to run this inside the separate script. > > <snipped> > >> -- >> 2.34.1 >>