[Tarantool-patches] [PATCH 2/2] test: popen: fix popen test 'hang' under test-run
Alexander Turenko
alexander.turenko at tarantool.org
Mon May 18 14:42:58 MSK 2020
killpg() on Mac OS may don't deliver a signal to a process: it seems
that there is a race when a process is just forked. It means that
popen_handle:close() may leave a process alive, when `opts.setsid` and
`opts.group_signal` are set.
There is simple reproducer, which does not leave alive `sleep 120`
processes on Linux, but does it on Mac OS (within three-four runs in
rows):
| #include <signal.h>
| #include <unistd.h>
| #include <fcntl.h>
|
| int
| main()
| {
| char *child_argv[] = {
| "/bin/sh",
| "-c",
| "sleep 120",
| NULL,
| };
| pid_t pid;
| int fd[2];
| pipe(fd);
| fcntl(fd[0], F_SETFD, FD_CLOEXEC);
| fcntl(fd[1], F_SETFD, FD_CLOEXEC);
|
| if ((pid = fork()) == 0) {
| /* Child. */
| close(fd[0]);
| setpgrp();
| for (int i = 0; i < 10; ++i) {
| /* Proceed with killpg. */
| if (i == 5)
| close(fd[1]);
| if (fork() == 0) {
| /* Child. */
| execve("/bin/sh", child_argv, NULL);
| }
| }
| } else {
| /* Parent. */
| close(fd[1]);
| char c;
| read(fd[0], &c, 1);
| killpg(pid, SIGKILL);
| }
| return 0;
| }
Compile it (`cc test.c -o test`) and run several times:
$ for i in $(seq 1 1000); do \
echo $i; \
./test \
&& ps -o pid,pgid,command -ax | grep [s]leep \
&& break; \
done
This is the reason why `sleep 120` process may be alive even when the
whole test passes.
test-run captures stdout and stderr of a 'core = app' test and waits EOF
on them. If a child process inherit one of them or both, the fd is still
open for writing and so EOF situation will not appear until `sleep 120`
will exit.
This commit doesn't try to overcome the root of the problem, but close
stdout and stderr for the child process that may not be killed / exited
in time.
Aside of this, updated found Mac OS peculiars in API comments of C and
Lua popen modules.
Fixes #4938
@TarantoolBot document
Title: popen: add note re group signaling on Mac OS
Copyed from the popen_handle:signal() updated description:
> Note: Mac OS may don't deliver a signal to a process in a group when
> opts.setsid and opts.group_signal are set. It seems there is a race
> here: when a process is just forked it may be not signaled.
Copyed from the popen_handle:close() updated description:
> Details about signaling:
>
> <...>
> - There are peculiars in group signaling on Mac OS,
> @see popen_handle:signal() for details.
Follows up https://github.com/tarantool/doc/issues/1248
---
src/lib/core/popen.c | 5 +++++
src/lua/popen.c | 7 +++++++
test/app-tap/popen.test.lua | 24 +++++++++++++++++++++++-
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/src/lib/core/popen.c b/src/lib/core/popen.c
index da4b89757..3b5fd1062 100644
--- a/src/lib/core/popen.c
+++ b/src/lib/core/popen.c
@@ -683,6 +683,11 @@ popen_state_str(unsigned int state)
* signal to the process group even when ..._SETSID and
* ..._GROUP_SIGNAL are set.
*
+ * Mac OS may don't deliver a signal to a processes in a group
+ * when ..._SETSID and ..._GROUP_SIGNAL are set. It seems there
+ * is a race here: when a process is just forked it may be not
+ * signaled.
+ *
* Return 0 at success or -1 at failure (and set a diag).
*
* Possible errors:
diff --git a/src/lua/popen.c b/src/lua/popen.c
index 471964ee6..18c8282f1 100644
--- a/src/lua/popen.c
+++ b/src/lua/popen.c
@@ -1502,6 +1502,11 @@ lbox_popen_shell(struct lua_State *L)
* Note: The module offers popen.signal.SIG* constants, because
* some signals have different numbers on different platforms.
*
+ * Note: Mac OS may don't deliver a signal to a process in a
+ * group when opts.setsid and opts.group_signal are set. It
+ * seems there is a race here: when a process is just forked it
+ * may be not signaled.
+ *
* Raise an error on incorrect parameters:
*
* - IllegalParams: an incorrect handle parameter.
@@ -2207,6 +2212,8 @@ lbox_popen_info(struct lua_State *L)
* - The signal is sent to a process or a grocess group depending
* of opts.group_signal. (@see lbox_popen_new() for details of
* group signaling).
+ * - There are peculiars in group signaling on Mac OS,
+ * @see lbox_popen_signal() for details.
*
* Resources are released disregarding of whether a signal
* sending succeeds: fds are closed, memory is released,
diff --git a/test/app-tap/popen.test.lua b/test/app-tap/popen.test.lua
index e1aef9ef9..3b6d9469f 100755
--- a/test/app-tap/popen.test.lua
+++ b/test/app-tap/popen.test.lua
@@ -269,7 +269,29 @@ local function test_read_chunk(test)
local payload = 'hello'
local script = ('printf "%s"; sleep 120'):format(payload)
- local ph = popen.shell(script, 'r')
+
+ -- It is important to close stderr here.
+ --
+ -- killpg() on Mac OS may don't deliver a signal to a process
+ -- that is just forked at killpg() call. So :close() at end of
+ -- the test does not guarantee that 'sleep 120' will be
+ -- signaled.
+ --
+ -- test-run captures stdout and stderr for a 'core = app' test
+ -- (like this one) using pipes and waits EOF on its side for
+ -- both. If `sleep 120` process inherits stderr, it will not
+ -- close it and the file descriptor will remain open for
+ -- writing. In this case EOF situation will not occur on the
+ -- reading end of pipe even despite that the test process
+ -- itself already exited.
+ local ph = popen.new({script}, {
+ shell = true,
+ setsid = true,
+ group_signal = true,
+ stdin = popen.opts.CLOSE,
+ stdout = popen.opts.PIPE,
+ stderr = popen.opts.CLOSE,
+ })
-- When a first byte is available, read() should return all
-- bytes arrived at the time.
--
2.25.0
More information about the Tarantool-patches
mailing list