[Tarantool-patches] [PATCH v10 2/3] popen: introduce a backend engine

Mon Feb 17 02:04:42 MSK 2020

It seems I constantly have no time to concentrate on this nice feature.

I would however share my fear about it: libev doing a lot of work to
make epoll() work properly. See [1], EVBACKEND_EPOLL description.

For example,

 | The biggest issue is fork races, however - if a program forks then
 | both parent and child process have to recreate the epoll set, which
 | can take considerable time (one syscall per file descriptor) and is
 | of course hard to detect.

Is it applicable for vfork()?

I would feel much more comfortable if we would look though libev docs /
code to at least be aware about such possibilities. After this we can
say, whether popen engine is safe comparing to libev (which should be
good) or not (or how much).

[1]: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod

>  - popen_write_timeout
> 	to write data into child's stdin with
> 	timeout
>  - popen_read_timeout
> 	to read data from child's stdout/stderr
> 	with timeout

My initial thought (see [2]) was that the popen engine will just give
several file descriptors, but coio_create() / coio_read_timeout() /
coio_write_timeout() / coio_close() will be called from a module that
implements Lua API for read / write streams.

This approach draws a solid line between process management and IO
management and would simplify them both. Are there problems with this
way?

[2]: https://lists.tarantool.org/pipermail/tarantool-patches/2019-December/013040.html

> +/**
> + * Handle SIGCHLD when a child process exit.
> + */
> +static void
> +popen_sigchld_handler(EV_P_ ev_child *w, int revents)

Are we really need to use those libev macros within our code? Our code
usually do:

 | ev_loop *loop = loop();
 | ev_<something>(loop, <...>);

> +/**
> + * popen_send_signal - send a signal to a child process
> + * @handle:	popen handle
> + * @signo:	signal number
> + *
> + * Returns 0 on success, -1 otherwise.
> + */
> +int
> +popen_send_signal(struct popen_handle *handle, int signo)
> +{
> +	int ret;
> +
> +	/*
> +	 * A child may be killed or exited already.
> +	 */
> +	if (!popen_may_pidop(handle))
> +		return -1;
> +
> +	say_debug("popen: kill %d signo %d", handle->pid, signo);
> +	ret = kill(handle->pid, signo);
> +	if (ret < 0) {
> +		diag_set(SystemError, "Unable to kill %d signo %d",
> +			 handle->pid, signo);
> +	}
> +	return ret;
> +}

In some of previous versions of the patchset I saw unconditional
killpg() here. The ability to do it is often requested together with
setsid() in context of Python's subprocess.Popen(). Looks as important
feature, especially when a shell script is executed.

I think this should be configurable at least from the backend engine
perspective.

> +	/*
> +	 * A caller must preserve space for this.
> +	 */
> +	if (opts->flags & POPEN_FLAG_SHELL) {
> +		opts->argv[0] = "sh";
> +		opts->argv[1] = "-c";
> +	}

I would let a caller do this. The code of the backend engine tends to be
general and whether to add 'sh -c' and whether it should assume setsid()
+ killpg() looks more as calling code matter.