[PATCH v3] core: Non-blocking io.popen

Vladimir Davydov vdavydov.dev at gmail.com
Fri Jun 21 15:31:21 MSK 2019


On Mon, Jun 17, 2019 at 09:54:59PM +0300, Stanislav Zudin wrote:
> Adds nonblocking implementation of popen.
> The method is available in namespace fio.
> fio.popen() returns an object providing facilities
> for dealing with standard input/output, getting
> status of the child process.
> 
> Closes #4031
> 
> @TarantoolBot document
> Title: Nonblocking fio.popen
> 
> handle, err = fio.popen(parameters)
> 
> fio.popen starts a process and redirects its input/output.
> 
> parameters - a table containing arguments to run a process.
> The following arguments are expected:
> 
> argv - [mandatory] is a table of argument strings passed to
> the new program. By convention, the first of these strings should
> contain the filename associated with the file being executed.
> 
> environment - [optional] is a table of strings, conventionally of the
> form key=value, which are passed as environment to the new program.
> If not specified a parent's environment is used.

I'd rename 'argv' to 'args', 'environment' to 'env', 'parameters' to
'opts' - IMO this would be more consistent with Tarantool interfaces.
Also, since 'args' is mandatory, I would make it a separate argument:

  fio.popen(args[, opts])

where opts is a table that may contain the following keys:

    env
    stdin
    stdout
    stderr

Example:

  fio.popen({'cat', 'file'}, {stdout = fio.DEVNULL})

> 
> By default stdin, stdout,stderr of the associated process are
> available for writing/reading using object's methods
> handle:write() and handle:read().
> One can override default behavior to redirect streams to/from file
> or to input/output of another process or to the default input/output
> of the parent process.
> 
> stdin - [optional] overrides the child process's standard input.
> stdout - [optional] overrides the child process's standard output.
> stderr - [optional] overrides the the child process's standard
> error output.
> May accept one of the following values:
> Handle of the file open with fio.open()
> Handle of the standard input/output of another process,
> open by fio.popen
> A constant defining the parent's STDIN, STDOUT or STDERR.
> A constants:
> fio.PIPE - Opens a file descriptor available for reading/writing
> or redirection to/from another process.
> fio.DEVNULL - Makes fio.popen redirect the output to /dev/null.
> 
> On success returns an object providing methods for
> communication with the running process.
> Returns nil if underlying functions calls fail;
> in this case the second return value, err, contains a error message.
> 
> The object created by fio.popen provides the following methods:
> read()
> write()
> kill()
> wait()
> status()
> stdin()
> stdout()
> stderr()
> 
> number handle:stdin()
> 
> Returns handle of the child process's standard input.
> The handle is available only if it was created with
> fio.PIPE option.
> Use this handle to setup a redirection
> from file or other process to the input of the associated process.
> If handle is unavailable the method returns nil.
> 
> number handle:stdout()
> number handle:stderr()
> 
> Return STDOUT and STDIN of the associated process accordingly.
> See handle:stdin() for details.
> 
> rc,err = handle:wait(timeout)
> 
> The wait() waits for the associated process to terminate.
> 
> timeout - an integer specifies number of seconds to wait.
> If the requested time has elapsed the method returns false,
> the second return value, err, contains a error message.
> To distinguish timeout from the the other errors use errno.
> If timeout is nil, the method waits infinitely until
> the associated process is terminated.
> On success function returns true.
> If failed, rc is false and err contains a error message.
> 
> If the associated process is terminated, one can use the following
> methods get the exit status:
> 
> rc = handle:status()

Looking at the code, I see that one can't read() the remaining output
from stdout/stderr pipe after wait(). I thought we'd agreed to allow
that.

> 
> returns nil if process is still running
>  == 0 if the process exited normally
>  error code > 0 if the process terminated with an error
>  -signal if the process was killed by a signal
> 
> rc, err = handle:kill(sig)
> 
> The kill() sends a specified signal to the associated process.
> On success the method returns true,
> if failed - false and error message.
> If the sig is nil the default "SIGTERM" is being sent to the process.
> If the signal is unknown then the method fails (with error EINVAL).
> The method accepts string names of signal as well as their numeric
> values.

Why string literals? Why not simply define constants in the signal
module, like signal.SIGTERM, signal.SIGKILL, etc. IMO it would be more
convenient, compare:

	p:kill('SIGTERM')
	p:wait()
	if p:status() == signal.c.signals['SIGTERM'] ...

and

	p:kill(signal.SIGTERM)
	p:wait()
	if (p:status() == signal.SIGTERM) ...

> 
> rc,src,err = handle:read(buffer,size,timeout)
> 
> read stdout & stderr of the process started by fio.popen
> Usage:
>  read(size) -> str, source, err
>  read(buf, size) -> length, source, err
>  read(size, timeout) -> str, source, err
>  read(buf, size, timeout) -> length, source, err

No option to read until EOF?

> 
>  timeout - number of seconds to wait (optional)
>  source contains id of the stream, fio.STDOUT or fio.STDERR
>  err - error message if method has failed or nil on success
> 
> rc, err = handle:write(buf[, size])
> rc, err = handle:write(opts), where opts are:
>   * buf
>   * size
>   * timeout

So write may take opts while read may not. Looks inconsistent. Please
change the API so that read() and write() have similar signatures.

> 
> Writes specified number of bytes
> On success returns number of written bytes.
> If failed the rc is nil and err contains an error message.

Please mention that 'read' and 'write' are only available if
stdin/stdout has been redirected to a pipe.

I expected to also see explicit 'close' method to close stdin/stdout in
case they are redirected to a pipe. Here's why we need it. Suppose you
want to grep something in a string you have defined in your code. So you
run popen({'grep', 'what'}) and feed the string to it with write(). Then
you read the output and expect the program to terminate. However, it
won't, because grep doesn't exit until it receives EOF. That's why we
need to be able to close stdin:

  p = popen({'grep', 'what'})
  p:write(my_str)
  p:close(STDIN)
  result = p:read()

Note, we need to be able to explicitly specify which end we want to
close (STDIN, STDOUT or STDERR or all of them), similarly to shutdown(2)
system call.

> 
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 33b64f6a..96f1d751 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -120,6 +120,7 @@ set (server_sources
>       lua/string.c
>       lua/buffer.c
>       lua/swim.c
> +     lua/lua_signal.c

Why not simply lua/signal.c?

>       ${lua_sources}
>       ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
>       ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
> diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
> index 66e430a2..64152f08 100644
> --- a/src/lib/core/CMakeLists.txt
> +++ b/src/lib/core/CMakeLists.txt
> @@ -27,8 +27,16 @@ set(core_sources
>      mpstream.c
>      port.c
>      decimal.c
> +    coio_popen.c
>  )
>  
> +# Disable gcc compiler error
> +if (CMAKE_COMPILER_IS_GNUCC)
> +    set_source_files_properties(coio_popen.c PROPERTIES COMPILE_FLAGS
> +            -Wno-clobbered)
> +endif()
> +
> +

I got no compilation error after removing this. I have gcc 6.3.0.
Please explain what happens without this option. I want to try to
figure out if we can fix it somehow else, without patching cmake.

>  if (TARGET_OS_NETBSD)
>      # A workaround for "undefined reference to `__gcc_personality_v0'"
>      # on x86_64-rumprun-netbsd-gcc
> diff --git a/src/lib/core/coio_popen.c b/src/lib/core/coio_popen.c
> new file mode 100644
> index 00000000..dcc50136
> --- /dev/null
> +++ b/src/lib/core/coio_popen.c
> @@ -0,0 +1,827 @@
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <stddef.h>
> +#include <signal.h>
> +#include "coio_popen.h"
> +#include "coio_task.h"
> +#include "fiber.h"
> +#include "fio.h"
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/wait.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <pthread.h>
> +#include <float.h>
> +#include <sysexits.h>
> +
> +/*
> + * On OSX this global variable is not declared
> + * in <unistd.h>
> + */
> +extern char **environ;
> +
> +
> +struct popen_handle {
> +	/* Process id */
> +	pid_t pid;
> +
> +	/*
> +	 * Three descriptors:
> +	 * [0] write to stdin of the child process
> +	 * [1] read from stdout of the child process
> +	 * [2] read from stderr of the child process
> +	 * Valid only for pipe.

Otherwise they are set to what? -1 I assume? Please mention in the
comment.

> +	 */
> +	int fd[3];
> +
> +	/*
> +	 * Handle to /dev/null.
> +	 */
> +	int devnull_fd;

Do we need to open /dev/null per each popen? Can't we use the same file
descriptor for all child processes?

> +
> +	/*
> +	 * Current process status.
> +	 * The SIGCHLD handler changes this status.
> +	 */
> +	enum popen_status status;
> +
> +	/*
> +	 * Exit status of the associated process
> +	 * or number of the signal that caused the
> +	 * associated process to terminate.
> +	 */
> +	int exit_code;

This is confusing. First, 'status' is used for reporting exit code or
signal in Lua, but here it's something completely different. Second
'exit_code' isn't necessarily an exit code, but -signo. May be, it would
be more readable to write something like this instead:

	/* Set to true if the process has terminated. */
	bool terminated;
	/*
	 * If the process has terminated, this variable stores
	 * the exit code if the process exited normally, by
	 * calling exit(), or -signo if the process was killed
	 * by a signal.
	 */
	int status;

> +};
> +
> +/*
> + * Map: (pid) => (popen_handle *)
> + */
> +#define mh_name _popen_storage
> +#define mh_key_t pid_t
> +#define mh_node_t struct popen_handle*
> +#define mh_arg_t void *
> +#define mh_hash(a, arg) ((*a)->pid)
> +#define mh_hash_key(a, arg) (a)
> +#define mh_cmp(a, b, arg) (((*a)->pid) != ((*b)->pid))
> +#define mh_cmp_key(a, b, arg) ((a) != ((*b)->pid))
> +#define MH_SOURCE 1
> +#include "salad/mhash.h"
> +
> +
> +static pthread_mutex_t mutex;

Bad name for a mutex - what does it protect and why?

> +static struct mh_popen_storage_t* popen_hash_table = NULL;

According to our coding style, we put * closer to variable name, in
C-stle. There quite a few places where you put it in C++-style, i.e.
closer to type name. Please fix.

> +
> +void
> +popen_initialize()

Please rename to popen_init. There should be popen_free to clean up
objects allocated by it.

> +{
> +	pthread_mutexattr_t errorcheck;
> +	pthread_mutexattr_init(&errorcheck);
> +	pthread_mutexattr_settype(&errorcheck,
> +		PTHREAD_MUTEX_ERRORCHECK);
> +	pthread_mutex_init(&mutex, &errorcheck);
> +	pthread_mutexattr_destroy(&errorcheck);

We have tt_pthread_* wrappers. You should use them, as they warn about
system errors.

> +
> +	popen_hash_table = mh_popen_storage_new();
> +}
> +
> +static void
> +popen_lock_data_list()
> +{
> +	pthread_mutex_lock(&mutex);
> +}
> +
> +static void
> +popen_unlock_data_list()
> +{
> +	pthread_mutex_unlock(&mutex);
> +}
> +
> +static void
> +popen_append_to_list(struct popen_handle *data)
> +{
> +	struct popen_handle **old = NULL;
> +	mh_int_t id = mh_popen_storage_put(popen_hash_table,
> +		(const struct popen_handle **)&data, &old, NULL);
> +	(void)id;
> +}
> +
> +static struct popen_handle *
> +popen_lookup_data_by_pid(pid_t pid)
> +{
> +	mh_int_t pos = mh_popen_storage_find(popen_hash_table, pid, NULL);
> +	if (pos == mh_end(popen_hash_table))
> +		return NULL;
> +	else {
> +		struct popen_handle ** ptr =
> +			mh_popen_storage_node(popen_hash_table, pos);
> +		return *ptr;
> +	}
> +}
> +
> +static void
> +popen_exclude_from_list(struct popen_handle *data)
> +{
> +	mh_popen_storage_remove(popen_hash_table,
> +		(const struct popen_handle **)&data, NULL);
> +}

What's the point in this one-liners each of which is used exactly once?
IMO they only obfuscate the code. Please fold them in.

> +
> +static struct popen_handle *
> +popen_data_new()
> +{
> +	struct popen_handle *data =
> +		(struct popen_handle *)calloc(1, sizeof(*data));

Please handle OOM here and everywhere else. It might look pointless,
but we have to until https://github.com/tarantool/tarantool/issues/3534
is fixed.

> +	data->fd[0] = -1;
> +	data->fd[1] = -1;
> +	data->fd[2] = -1;
> +	data->devnull_fd = -1;
> +	data->status = POPEN_RUNNING;
> +	return data;
> +}
> +
> +enum pipe_end {
> +	PIPE_READ = 0,
> +	PIPE_WRITE = 1
> +};
> +
> +static inline enum pipe_end
> +	popen_opposite_pipe(enum pipe_end side)

Bad indentation. So is it an 'end' or a 'side'? Please use the same
terminology throughout the code.

> +{
> +	return (enum pipe_end)(side ^ 1);
> +	/*
> +	 * The code is equal to:
> +	 * return (side == PIPE_READ) ? PIPE_WRITE
> +	 * 			      : PIPE_READ;
> +	 */

Why not write it like that than in the first place? Anyway, the helper
looks pointless as it's a trivial one liner used exactly once. Please
fold it.

> +}
> +
> +static inline bool
> +popen_create_pipe(int fd, int pipe_pair[2], enum pipe_end parent_side)
> +{
> +	if (fd == FIO_PIPE) {
> +		if (pipe(pipe_pair) < 0 ||
> +		    fcntl(pipe_pair[parent_side], F_SETFL, O_NONBLOCK) < 0) {
> +			return false;
> +		}
> +	}
> +	return true;
> +}
> +
> +static inline void
> +popen_close_child_fd(int std_fd, int pipe_pair[2],
> +	int *saved_fd, enum pipe_end child_side)
> +{
> +	if (std_fd == FIO_PIPE) {
> +		/* Close child's side. */
> +		close(pipe_pair[child_side]);
> +
> +		enum pipe_end parent_side = popen_opposite_pipe(child_side);
> +		*saved_fd = pipe_pair[parent_side];
> +	}
> +}
> +
> +static inline void
> +popen_close_pipe(int pipe_pair[2])
> +{
> +	if (pipe_pair[0] >= 0) {
> +		close(pipe_pair[0]);
> +		close(pipe_pair[1]);
> +	}
> +}
> +
> +
> +/**
> + * Implementation of fio.popen.
> + * The function opens a process by creating a pipe
> + * forking.
> + *
> + * @param argv - is an array of character pointers
> + * to the arguments terminated by a null pointer.
> + *
> + * @param envp - is the pointer to an array
> + * of character pointers to the environment strings.
> + *
> + * @param stdin_fd - the file handle to be redirected to the
> + * child process's STDIN.
> + *
> + * @param stdout_fd - the file handle receiving the STDOUT
> + * output of the child process.
> + *
> + * @param stderr_fd - the file handle receiving the STDERR
> + * output of the child process.
> + *
> + * The stdin_fd, stdout_fd & stderr_fd accept file descriptors
> + * from open() or the following values:
> + *
> + * FIO_PIPE - opens a pipe, binds it with child's
> + * input/output. The pipe is available for reading/writing.
> + *
> + * FIO_DEVNULL - redirects output from process to /dev/null.
> + *
> + * @return handle of the pipe for reading or writing
> + * (depends on value of type).
> + * In a case of error returns NULL.
> + */
> +static struct popen_handle *
> +popen_new_impl(char **argv, char **envp,
> +	int stdin_fd, int stdout_fd, int stderr_fd)
> +{
> +	bool popen_list_locked = false;
> +	pid_t pid;
> +	int pipe_rd[2] = {-1,-1};
> +	int pipe_wr[2] = {-1,-1};
> +	int pipe_er[2] = {-1,-1};
> +	errno = 0;
> +
> +	struct popen_handle *data = popen_data_new();
> +	if (data == NULL)
> +		return NULL;
> +
> +	/*
> +	 * Setup a /dev/null if necessary.
> +	 */
> +	bool read_devnull = (stdin_fd == FIO_DEVNULL);
> +	bool write_devnull = (stdout_fd == FIO_DEVNULL) ||
> +			     (stderr_fd == FIO_DEVNULL);
> +	int devnull_flags = O_RDWR;
> +	if (!read_devnull)
> +		devnull_flags = O_WRONLY;
> +	else if (!write_devnull)
> +		devnull_flags = O_RDONLY;
> +
> +	if (read_devnull || write_devnull) {
> +		data->devnull_fd = open("/dev/null", devnull_flags);
> +		if (data->devnull_fd < 0)
> +			goto on_error;
> +		else {
> +			if (stdin_fd == FIO_DEVNULL)
> +				stdin_fd = data->devnull_fd;
> +			if (stdout_fd == FIO_DEVNULL)
> +				stdout_fd = data->devnull_fd;
> +			if (stderr_fd == FIO_DEVNULL)
> +				stderr_fd = data->devnull_fd;
> +		}
> +	}
> +
> +	if (!popen_create_pipe(stdin_fd, pipe_rd, PIPE_WRITE))

The function name is a bit confusing. It's called 'create_pipe', but it
doesn't necessarily create a pipe. Same goes for popen_close_child_fd.
I think we should rename them to something like

	popen_prepare_fd - called before fork
	popen_setup_child_fd - called in the child after fork
	popen_setup_parent_fd - called in the parent after fork
	popen_cleanup_fd - called on error

(names aren't perfect - you might want to think more on them)

Also, I think that you should fold all fd manipulation into those
functions, including /dev/null setup and dup. Please try.

> +		goto on_error;
> +	if (!popen_create_pipe(stdout_fd, pipe_wr, PIPE_READ))
> +		goto on_error;
> +	if (!popen_create_pipe(stderr_fd, pipe_er, PIPE_READ))
> +		goto on_error;
> +
> +	/*
> +	 * Prepare data for the child process.
> +	 * There must be no branches, to avoid compiler
> +	 * error: "argument ‘xxx’ might be clobbered by
> +	 * ‘longjmp’ or ‘vfork’ [-Werror=clobbered]".
> +	 */
> +	if (envp == NULL)
> +		envp = environ;
> +
> +	/* Handles to be closed in child process */
> +	int close_fd[3] = {-1, -1, -1};
> +	/* Handles to be duplicated in child process */
> +	int dup_fd[3] = {-1, -1, -1};
> +	/* Handles to be closed after dup2 in child process */
> +	int close_after_dup_fd[3] = {-1, -1, -1};
> +
> +	if (stdin_fd == FIO_PIPE) {
> +		close_fd[STDIN_FILENO] = pipe_rd[PIPE_WRITE];
> +		dup_fd[STDIN_FILENO] = pipe_rd[PIPE_READ];
> +	} else if (stdin_fd != STDIN_FILENO) {
> +		dup_fd[STDIN_FILENO] = stdin_fd;
> +		close_after_dup_fd[STDIN_FILENO] = stdin_fd;
> +	}
> +
> +	if (stdout_fd == FIO_PIPE) {
> +		close_fd[STDOUT_FILENO] = pipe_wr[PIPE_READ];
> +		dup_fd[STDOUT_FILENO] = pipe_wr[PIPE_WRITE];
> +	} else if (stdout_fd != STDOUT_FILENO){
> +		dup_fd[STDOUT_FILENO] = stdout_fd;
> +		if (stdout_fd != STDERR_FILENO)
> +			close_after_dup_fd[STDOUT_FILENO] = stdout_fd;
> +	}
> +
> +	if (stderr_fd == FIO_PIPE) {
> +		close_fd[STDERR_FILENO] = pipe_er[PIPE_READ];
> +		dup_fd[STDERR_FILENO] = pipe_er[PIPE_WRITE];
> +	} else if (stderr_fd != STDERR_FILENO) {
> +		dup_fd[STDERR_FILENO] = stderr_fd;
> +		if (stderr_fd != STDOUT_FILENO)
> +			close_after_dup_fd[STDERR_FILENO] = stderr_fd;
> +	}
> +
> +
> +	popen_lock_data_list();
> +	popen_list_locked = true;
> +
> +	pid = vfork();

I thought we'd agreed to block all signals in the parent before vfork()
and unblock them right after vfork(), so as to prevent the child from
corrupting parent's memory while handling a signal.

> +
> +	if (pid < 0)
> +		goto on_error;
> +	else if (pid == 0) /* child */ {
> +		/* Reset all signals to their defaults. */
> +		struct sigaction sa;
> +		memset(&sa, 0, sizeof(sa));
> +		sigemptyset(&sa.sa_mask);
> +		sa.sa_handler = SIG_DFL;
> +
> +		if (sigaction(SIGUSR1, &sa, NULL) == -1 ||
> +		    sigaction(SIGINT, &sa, NULL) == -1 ||
> +		    sigaction(SIGTERM, &sa, NULL) == -1 ||
> +		    sigaction(SIGHUP, &sa, NULL) == -1 ||
> +		    sigaction(SIGWINCH, &sa, NULL) == -1 ||
> +		    sigaction(SIGSEGV, &sa, NULL) == -1 ||
> +		    sigaction(SIGFPE, &sa, NULL) == -1 ||
> +		    sigaction(SIGCHLD, &sa, NULL) == -1)
> +			exit(EX_OSERR);
> +
> +		/* Unblock any signals blocked by libev. */
> +		sigset_t sigset;
> +		sigfillset(&sigset);
> +		if (sigprocmask(SIG_UNBLOCK, &sigset, NULL) == -1)
> +			exit(EX_OSERR);

This function is huge. Let's try to split it even more. E.g. we could
move signal setup to a helper function.

> +
> +		/* Setup stdin/stdout */
> +		for(int i = 0; i < 3; ++i) {
> +			if (close_fd[i] >= 0)
> +				close(close_fd[i]);
> +			if (dup_fd[i] >= 0)
> +				dup2(dup_fd[i], i);
> +			if (close_after_dup_fd[i] >= 0)
> +				close(close_after_dup_fd[i]);
> +		}
> +
> +		execve( argv[0], argv, envp);
> +		exit(EX_OSERR);
> +		unreachable();
> +	}
> +
> +	/* Parent process */
> +	popen_close_child_fd(stdin_fd,  pipe_rd,
> +		&data->fd[STDIN_FILENO], PIPE_READ);
> +	popen_close_child_fd(stdout_fd, pipe_wr,
> +		&data->fd[STDOUT_FILENO], PIPE_WRITE);
> +	popen_close_child_fd(stderr_fd, pipe_er,
> +		&data->fd[STDERR_FILENO], PIPE_WRITE);
> +
> +	data->pid = pid;
> +
> +on_cleanup:
> +	if (data){
> +		popen_append_to_list(data);
> +	}
> +
> +	if (popen_list_locked)
> +		popen_unlock_data_list();
> +
> +	if (argv){
> +		for(int i = 0; argv[i] != NULL; ++i)
> +			free(argv[i]);
> +		free(argv);
> +	}
> +	if (envp && envp != environ) {
> +		for(int i = 0; envp[i] != NULL; ++i)
> +			free(envp[i]);
> +		free(envp);
> +	}

It should be a responsibility of the caller to free environment and
args. E.g. they could be allocated on the region. Requiring them to be
allocated with malloc() obscures the function protocol.

> +
> +	return data;
> +
> +on_error:
> +	popen_close_pipe(pipe_rd);
> +	popen_close_pipe(pipe_wr);
> +	popen_close_pipe(pipe_er);
> +
> +	if (data) {
> +		if (data->devnull_fd >= 0)
> +			close(data->devnull_fd);
> +		free(data);
> +	}
> +	data = NULL;
> +
> +	goto on_cleanup;
> +	unreachable();
> +}
> +
> +ssize_t
> +popen_new(va_list ap)
> +{
> +	char **argv = va_arg(ap, char **);
> +	char **envp = va_arg(ap, char **);
> +	int stdin_fd = va_arg(ap, int);
> +	int stdout_fd = va_arg(ap, int);
> +	int stderr_fd = va_arg(ap, int);
> +	struct popen_handle **handle = va_arg(ap, struct popen_handle **);
> +
> +	*handle = popen_new_impl(argv, envp, stdin_fd, stdout_fd, stderr_fd);
> +	return (*handle) ? 0 : -1;
> +}
> +
> +static void
> +popen_close_handles(struct popen_handle *data)

Sometimes you call variables referring to a popen_handle object
'handle', sometimes 'data', sometimes 'fh'. Please be consistent.

> +{
> +	for(int i = 0; i < 3; ++i) {
> +		if (data->fd[i] >= 0) {
> +			close(data->fd[i]);
> +			data->fd[i] = -1;
> +		}
> +	}
> +	if (data->devnull_fd >= 0) {
> +		close(data->devnull_fd);
> +		data->devnull_fd = -1;
> +	}
> +}
> +
> +int
> +popen_destroy(struct popen_handle *fh)
> +{
> +	assert(fh);

In general, assertions like this are pointless, because the code below
will crash anyway if you pass NULL for fh.

> +
> +	popen_lock_data_list();
> +	popen_close_handles(fh);
> +	popen_exclude_from_list(fh);
> +	popen_unlock_data_list();
> +
> +	free(fh);
> +	return 0;
> +}
> +
> +/**
> + * Check if an errno, returned from a sio function, means a

sio?

> + * non-critical error: EAGAIN, EWOULDBLOCK, EINTR.
> + */
> +static inline bool
> +popen_wouldblock(int err)
> +{
> +	return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
> +}
> +
> +static int
> +popen_do_read(struct popen_handle *data, void *buf, size_t count, int *source_id)

The line's longer than 80 characters and could be split without hurting
readability.

> +{
> +	/*
> +	 * STDERR has higher priority, read it first.
> +	 */
> +	int rc = 0;
> +	errno = 0;
> +	int fd_count = 0;
> +	if (data->fd[STDERR_FILENO] >= 0) {
> +		++fd_count;
> +		rc = read(data->fd[STDERR_FILENO], buf, count);
> +
> +		if (rc >= 0)
> +			*source_id = STDERR_FILENO;
> +		if (rc > 0)
> +			return rc;
> +
> +		if (rc < 0 && !popen_wouldblock(errno))
> +			return rc;
> +
> +	}
> +
> +	/*
> +	 * STDERR is not available or not ready, try STDOUT.
> +	 */
> +	if (data->fd[STDOUT_FILENO] >= 0) {
> +		++fd_count;
> +		rc = read(data->fd[STDOUT_FILENO], buf, count);
> +
> +		if (rc >= 0) {
> +			*source_id = STDOUT_FILENO;
> +			return rc;
> +		}
> +	}
> +
> +	if (!fd_count) {
> +		/*
> +		 * There are no open handles for reading.
> +		 */
> +		errno = EBADF;
> +		rc = -1;
> +	}
> +	return rc;
> +}
> +
> +static void
> +popen_coio_create(struct ev_io *coio, int fd)
> +{
> +	coio->data = fiber();
> +	ev_init(coio, (ev_io_cb) fiber_schedule_cb);
> +	coio->fd = fd;

AFAICS setting coio->fd is unnecessary, because ev_io_set sets it anyway.

Since this is a trivial two-line routine which is only used in a couple
of places, I'd inline it.

> +}
> +
> +int
> +popen_read(struct popen_handle *fh, void *buf, size_t count,
> +	size_t *read_bytes, int *source_id,

Why don't you return ssize_t, like read(2) does? Then you wouldn't need
the read_bytes argument.

> +	ev_tstamp timeout)
> +{
> +	assert(fh);
> +	if (timeout < 0.0)
> +		timeout = DBL_MAX;

We don't treat negative timeouts as infinity. Instead we pass
TIMEOUT_INFINITY, which is simply a very large number. Take
a look at lua/socket.lua for example.

> +
> +	ev_tstamp start, delay;
> +	evio_timeout_init(loop(), &start, &delay, timeout);
> +
> +	struct ev_io coio_rd;
> +	struct ev_io coio_er;

In the function above you use 'pipe_rd' for STDIN while here you use
'coio_rd' for STDOUT. This is confusing. Let's rename these variables
to coio_stdout and coio_stderr and pipe variables to pipe_stdout, etc.

> +	popen_coio_create(&coio_er, fh->fd[STDERR_FILENO]);
> +	popen_coio_create(&coio_rd, fh->fd[STDOUT_FILENO]);
> +
> +	int result = 0;
> +
> +	while (true) {
> +		int rc = popen_do_read(fh, buf, count, source_id);
> +		if (rc >= 0) {
> +			*read_bytes = rc;
> +			break;
> +		}
> +
> +		if (!popen_wouldblock(errno)) {
> +			result = -1;
> +			break;
> +		}
> +
> +		/*
> +		 * The handlers are not ready, yield.

Those are not 'handlers', but 'handles' or 'descriptors'.

> +		 */
> +		if (!ev_is_active(&coio_rd) &&
> +		     fh->fd[STDOUT_FILENO] >= 0) {
> +			ev_io_set(&coio_rd, fh->fd[STDOUT_FILENO], EV_READ);
> +			ev_io_start(loop(), &coio_rd);
> +		}
> +		if (!ev_is_active(&coio_er) &&
> +		    fh->fd[STDERR_FILENO] >= 0) {
> +			ev_io_set(&coio_er, fh->fd[STDERR_FILENO], EV_READ);
> +			ev_io_start(loop(), &coio_er);
> +		}
> +		/*
> +		 * Yield control to other fibers until the
> +		 * timeout is reached.
> +		 */
> +		bool is_timedout = fiber_yield_timeout(delay);
> +		if (is_timedout) {
> +			errno = ETIMEDOUT;
> +			result = -1;
> +			break;
> +		}
> +
> +		if (fiber_is_cancelled()) {
> +			errno = EINTR;
> +			result = -1;
> +			break;
> +		}
> +
> +		evio_timeout_update(loop(), &start, &delay);
> +	}
> +
> +	ev_io_stop(loop(), &coio_er);
> +	ev_io_stop(loop(), &coio_rd);
> +	return result;
> +}
> +
> +int
> +popen_write(struct popen_handle *fh, const void *buf, size_t count,
> +	size_t *written, ev_tstamp timeout)
> +{
> +	assert(fh);
> +	if (count == 0) {
> +		*written = 0;
> +		return  0;
> +	}
> +
> +	if (timeout < 0.0)
> +		timeout = DBL_MAX;
> +
> +	if (fh->fd[STDIN_FILENO] < 0) {
> +		*written = 0;
> +		errno = EBADF;
> +		return -1;
> +	}
> +
> +	ev_tstamp start, delay;
> +	evio_timeout_init(loop(), &start, &delay, timeout);
> +
> +	struct ev_io coio;
> +	popen_coio_create(&coio, fh->fd[STDIN_FILENO]);
> +	int result = 0;
> +
> +	while(true) {
> +		ssize_t rc = write(fh->fd[STDIN_FILENO], buf, count);
> +		if (rc < 0 && !popen_wouldblock(errno)) {
> +			result = -1;
> +			break;
> +		}
> +

> +		size_t urc = (size_t)rc;
> +
> +		if (urc == count) {
> +			*written = count;
> +			break;
> +		}
> +
> +		if (rc > 0) {
> +			buf += rc;
> +			count -= urc;
> +		}

Let's rewrite this part without urc:

		buf += rc;
		count -= rc;
		if (count == 0)
			break;

> +
> +		/*
> +		 * The handlers are not ready, yield.
> +		 */
> +		if (!ev_is_active(&coio)) {
> +			ev_io_set(&coio, fh->fd[STDIN_FILENO], EV_WRITE);
> +			ev_io_start(loop(), &coio);
> +		}
> +
> +		/*
> +		 * Yield control to other fibers until the
> +		 * timeout is reached.
> +		 */
> +		bool is_timedout = fiber_yield_timeout(delay);
> +		if (is_timedout) {
> +			errno = ETIMEDOUT;
> +			result = -1;
> +			break;
> +		}
> +
> +		if (fiber_is_cancelled()) {
> +			errno = EINTR;
> +			result = -1;
> +			break;

IMHO it's uncommon to return EINTR in case the fiber is cancelled.
Why don't you use diag for propagating errors from this module? There's
SystemError for errors returned by syscalls and FiberIsCancelled error
for this case.

> +		}
> +
> +		evio_timeout_update(loop(), &start, &delay);
> +	}
> +
> +	ev_io_stop(loop(), &coio);
> +	return result;
> +}
> +
> +int
> +popen_kill(struct popen_handle *fh, int signal_id)
> +{
> +	assert(fh);
> +	return kill(fh->pid, signal_id);

I guess we should return an error if the process happens to have been
reaped.

> +}
> +
> +int
> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code)

What's the point to return the exit code? One can access it directly
after the function completes.

> +{
> +	assert(fh);
> +	if (timeout < 0.0)
> +		timeout = DBL_MAX;
> +
> +	ev_tstamp start, delay;
> +	evio_timeout_init(loop(), &start, &delay, timeout);
> +
> +	int result = 0;
> +
> +	while (true) {
> +		/* Wait for SIGCHLD */
> +		int code = 0;
> +
> +		int rc = popen_get_status(fh, &code);
> +		if (rc != POPEN_RUNNING) {
> +			*exit_code = (rc == POPEN_EXITED) ? code
> +							  : -code;
> +			break;
> +		}
> +
> +		/*
> +		 * Yield control to other fibers until the
> +		 * timeout is reached.
> +		 * Let's sleep for 20 msec.
> +		 */
> +		fiber_yield_timeout(0.02);

Why 20 ms? I think what you need is fiber_cond.

> +
> +		if (fiber_is_cancelled()) {
> +			errno = EINTR;
> +			result = -1;
> +			break;
> +		}
> +
> +		evio_timeout_update(loop(), &start, &delay);
> +		bool is_timedout = (delay == 0.0);
> +		if (is_timedout) {
> +			errno = ETIMEDOUT;
> +			result = -1;
> +			break;
> +		}
> +	}
> +
> +	return result;
> +}
> +
> +int
> +popen_get_std_file_handle(struct popen_handle *fh, int file_no)
> +{
> +	assert(fh);
> +	if (file_no < STDIN_FILENO || STDERR_FILENO < file_no){
> +		errno = EINVAL;
> +		return -1;
> +	}
> +
> +	errno = 0;
> +	return fh->fd[file_no];
> +}
> +
> +int
> +popen_get_status(struct popen_handle *fh, int *exit_code)
> +{

Don't see any point in having these helper function - it's okay to
access the status and fd fields directly.

> +	assert(fh);
> +	errno = 0;
> +
> +	if (exit_code)
> +		*exit_code = fh->exit_code;
> +
> +	return fh->status;
> +}
> +
> +/*
> + * evio data to control SIGCHLD
> + */
> +static ev_child cw;
> +
> +static void
> +popen_sigchld_cb(ev_loop *loop, ev_child *watcher, int revents)
> +{
> +	(void)loop;
> +	(void)revents;
> +
> +	popen_lock_data_list();
> +
> +	struct popen_handle *data = popen_lookup_data_by_pid(watcher->rpid);
> +	if (data) {
> +		if (WIFEXITED(watcher->rstatus)) {
> +			data->exit_code = WEXITSTATUS(watcher->rstatus);
> +			data->status = POPEN_EXITED;
> +		} else if (WIFSIGNALED(watcher->rstatus)) {
> +			data->exit_code = WTERMSIG(watcher->rstatus);
> +
> +			if (WCOREDUMP(watcher->rstatus))
> +				data->status = POPEN_DUMPED;

I don't see any point at all in this status. We don't even report it.
Please remove.

> +			else
> +				data->status = POPEN_KILLED;
> +		} else {
> +			/*
> +			 * The status is not determined, treat as EXITED
> +			 */
> +			data->exit_code = EX_SOFTWARE;
> +			data->status = POPEN_EXITED;
> +		}
> +
> +		/*
> +		 * We shouldn't close file descriptors here.
> +		 * The child process may exit earlier than
> +		 * the parent process finishes reading data.
> +		 * In this case the reading fails.
> +		 */
> +	}
> +
> +	popen_unlock_data_list();
> +}
> +
> +void
> +popen_setup_sigchld_handler()
> +{
> +	ev_child_init (&cw, popen_sigchld_cb, 0/*pid*/, 0);
> +	ev_child_start(loop(), &cw);

Can't you set it up in popen_init instead?

> +
> +}
> +
> +void
> +popen_reset_sigchld_handler()
> +{
> +	ev_child_stop(loop(), &cw);
> +}
> diff --git a/src/lib/core/coio_popen.h b/src/lib/core/coio_popen.h
> new file mode 100644
> index 00000000..57057e97
> --- /dev/null
> +++ b/src/lib/core/coio_popen.h
> @@ -0,0 +1,229 @@
> +#ifndef TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED
> +#define TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED
> +/*
> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif /* defined(__cplusplus) */
> +
> +#include "evio.h"
> +#include <stdarg.h>
> +
> +/**
> + * Special values of the file descriptors passed to fio.popen
> + * */

Bad comment formatting. It's really annoying to point to such minor
things in a review. There are other places in the patch where there's
an extra new line or a missing space. Please always self-review your
patches so as to make sure it doesn't happen.

Regarding the comment: it would be nice to point out why you use
negative constants.

> +enum {
> +	/**
> +	 * Tells fio.popen to open a handle for
> +	 * direct reading/writing.
> +	 */
> +	FIO_PIPE = -2,

Why do you start from -2? Why not -1?

> +
> +	/**
> +	 * Tells fio.popen to redirect the given standard
> +	 * stream into /dev/null.
> +	 */
> +	FIO_DEVNULL = -3
> +};
> +
> +struct popen_handle;
> +
> +/**
> + * Possible status of the process started via fio.popen
> + **/
> +enum popen_status {
> +
> +	/**
> +	 * The process is alive and well.
> +	 */
> +	POPEN_RUNNING = 1,
> +
> +	/**
> +	 * The process exited.
> +	 */
> +	POPEN_EXITED = 2,
> +
> +	/**
> +	 * The process terminated by a signal.
> +	 */
> +	POPEN_KILLED = 3,
> +
> +	/**
> +	 * The process terminated abnormally.
> +	 */
> +	POPEN_DUMPED = 4
> +};
> +
> +/**
> + * Initializes inner data of fio.popen
> + * */
> +void
> +popen_initialize();
> +
> +ssize_t
> +popen_new(va_list ap);
> +
> +/**
> + * The function releases allocated resources.
> + * The function doesn't wait for the associated process
> + * to terminate.
> + *
> + * @param fh handle returned by fio.popen.
> + *
> + * @return 0 if the process is terminated
> + * @return -1 for an error
> + */
> +int
> +popen_destroy(struct popen_handle *fh);
> +
> +/**
> + * The function reads up to count bytes from the handle
> + * associated with the child process.
> + * Returns immediately
> + *
> + * @param fd handle returned by fio.popen.
> + * @param buf a buffer to be read into
> + * @param count size of buffer in bytes
> + * @param read_bytes A pointer to the
> + * variable that receives the number of bytes read.
> + * @param source_id A pointer to the variable that receives a
> + * source stream id, 1 - for STDOUT, 2 - for STDERR.
> + *
> + * @return 0 data were successfully read
> + * @return -1 an error occurred, see errno for error code
> + *
> + * If there is nothing to read yet function returns -1
> + * and errno set no EAGAIN.
> + */
> +int
> +popen_read(struct popen_handle *fh, void *buf, size_t count,
> +	size_t *read_bytes, int *source_id,
> +	ev_tstamp timeout);
> +
> +/**
> + * The function writes up to count bytes to the handle
> + * associated with the child process.
> + * Tries to write as much as possible without blocking
> + * and immediately returns.
> + *
> + * @param fd handle returned by fio.popen.
> + * @param buf a buffer to be written from
> + * @param count size of buffer in bytes
> + * @param written A pointer to the
> + * variable that receives the number of bytes actually written.
> + * If function fails the number of written bytes is undefined.
> + *
> + * @return 0 data were successfully written
> + * Compare values of <count> and <written> to check
> + * whether all data were written or not.
> + * @return -1 an error occurred, see errno for error code
> + *
> + * If the writing can block, function returns -1
> + * and errno set no EAGAIN.
> + */
> +int
> +popen_write(struct popen_handle *fh, const void *buf, size_t count,
> +	size_t *written, ev_tstamp timeout);
> +
> +
> +/**
> + * The function send the specified signal
> + * to the associated process.
> + *
> + * @param fd - handle returned by fio.popen.
> + *
> + * @return 0 on success
> + * @return -1 an error occurred, see errno for error code
> + */
> +int
> +popen_kill(struct popen_handle *fh, int signal_id);
> +
> +/**
> + * Wait for the associated process to terminate.
> + * The function doesn't release the allocated resources.
> + *
> + * @param fd handle returned by fio.popen.

'fd'? The argument is named 'fh'.

> + *
> + * @param timeout number of second to wait before function exit with error.
> + * If function exited due to timeout the errno equals to ETIMEDOUT.
> + *
> + * @exit_code On success contains the exit code as a positive number
> + * or signal id as a negative number.
> +
> + * @return On success function returns 0, and -1 on error.

What happens if the process has already exited? Has been reaped? Has
been waited for using this function? Please describe in the comment.

> + */
> +int
> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code);

Please use double for timeouts so as not to include evio.h into header
files.

> +
> +/**
> + * Returns descriptor of the specified file.
> + *
> + * @param fd - handle returned by fio.popen.
> + * @param file_no accepts one of the
> + * following values:
> + * STDIN_FILENO,
> + * STDOUT_FILENO,
> + * STDERR_FILENO
> + *
> + * @return file descriptor or -1 if not available
> + */
> +int
> +popen_get_std_file_handle(struct popen_handle *fh, int file_no);
> +
> +
> +/**
> + * Returns status of the associated process.
> + *
> + * @param fd - handle returned by fio.popen.
> + *
> + * @param exit_code - if not NULL accepts the exit code
> + * if the process terminated normally or signal id
> + * if process was termianted by signal.
> + *
> + * @return one of the following values:
> + * POPEN_RUNNING if the process is alive
> + * POPEN_EXITED if the process was terminated normally
> + * POPEN_KILLED if the process was terminated by a signal
> + */
> +int
> +popen_get_status(struct popen_handle *fh, int *exit_code);
> +
> +void
> +popen_setup_sigchld_handler();
> +void
> +popen_reset_sigchld_handler();
> +
> +#if defined(__cplusplus)
> +} /* extern "C" */
> +#endif /* defined(__cplusplus) */
> +
> +#endif /* TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED */
> diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c
> index 908b336e..e6a1b327 100644
> --- a/src/lib/core/coio_task.c
> +++ b/src/lib/core/coio_task.c
> @@ -40,6 +40,7 @@
>  
>  #include "fiber.h"
>  #include "third_party/tarantool_ev.h"
> +#include "coio_popen.h"
>  
>  /*
>   * Asynchronous IO Tasks (libeio wrapper).
> @@ -129,6 +130,7 @@ coio_on_stop(void *data)
>  void
>  coio_init(void)
>  {
> +	popen_initialize();
>  	eio_set_thread_on_start(coio_on_start, NULL);
>  	eio_set_thread_on_stop(coio_on_stop, NULL);
>  }
> diff --git a/src/lua/fio.c b/src/lua/fio.c
> index 806f4256..873ee165 100644
> --- a/src/lua/fio.c
> +++ b/src/lua/fio.c
> @@ -46,6 +46,9 @@
>  
>  #include "lua/utils.h"
>  #include "coio_file.h"
> +#include "coio_popen.h"
> +
> +static uint32_t CTID_STRUCT_POPEN_HANDLE_REF = 0;
>  
>  static inline void
>  lbox_fio_pushsyserror(struct lua_State *L)
> @@ -703,6 +706,269 @@ lbox_fio_copyfile(struct lua_State *L)
>  	return lbox_fio_pushbool(L, coio_copyfile(source, dest) == 0);
>  }
>  
> +static bool
> +popen_verify_argv(struct lua_State *L)
> +{
> +	if (!lua_istable(L, 1))
> +		return false;
> +	int num = (int)lua_objlen(L, 1);  /*luaL_getn(L,1);*/

popen_extract_strarray does the same. Do we really need this function?

> +	return num >= 1;
> +}
> +
> +static char**
> +popen_extract_strarray(struct lua_State *L, int index, int* array_size)

The name is confusing. Let's please name the function to point out that
it's used for extracting args/env from Lua stack. Something like

  lbox_fio_popen_get_args

Also, a comment would be helpful.

> +{
> +	if (lua_type(L, index) != LUA_TTABLE) {

Why don't you use lua_istable here?

> +		if (array_size)
> +			*array_size = 0;
> +		return NULL;
> +	}
> +
> +	size_t num = lua_objlen(L, index);  /*luaL_getn(L,index);*/

What this comment is for?

> +
> +	char** array = calloc(num+1, sizeof(char*));

Since the array is temporary and freed right upon popen_new completion,
I think it's okay to allocate it on the region. Please take a look at
how region_alloc and region_truncate are used.

> +	/*
> +	 * The last item in the array must be NULL
> +	 */
> +
> +	if (array == NULL)
> +		return NULL;
> +
> +	for(size_t i = 0; i < num; ++i) {
> +		lua_rawgeti(L, index, i+1);
> +		size_t slen = 0;
> +		const char* str = lua_tolstring(L, -1, &slen);
> +		if (!str)
> +			str = "";

This looks like an invalid argument to me. I think we should raise a Lua
exception in this case.

> +		array[i] = strdup(str);
> +		lua_pop(L, 1);
> +	}
> +
> +	if (array_size)
> +		*array_size = num;

Why do you return array_size? AFAICS you never use it.

> +	/*
> +	 * The number of elements doesn't include
> +	 * the trailing NULL pointer
> +	 */
> +	return array;
> +}
> +
> +static struct popen_handle *
> +fio_popen_get_handle(lua_State *L, int idx)
> +{
> +	uint32_t cdata_type;
> +	struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type);
> +	if (handle_ptr == NULL || cdata_type != CTID_STRUCT_POPEN_HANDLE_REF)
> +		return NULL;
> +	return *handle_ptr;
> +}
> +
> +static void
> +fio_popen_invalidate_handle(lua_State *L, int idx)

Let's please prefix all function names dealing with lua with lbox_.

> +{
> +	uint32_t cdata_type;
> +	struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type);
> +	if (handle_ptr != NULL && cdata_type == CTID_STRUCT_POPEN_HANDLE_REF) {
> +		*handle_ptr = NULL;
> +	}
> +}
> +
> +static int
> +lbox_fio_popen_gc(lua_State *L)
> +{
> +	struct popen_handle *handle = fio_popen_get_handle(L,1);
> +
> +	if (handle)
> +		popen_destroy(handle);
> +	return 0;
> +}
> +
> +static int
> +lbox_fio_popen(struct lua_State *L)
> +{
> +	if (lua_gettop(L) < 1) {

Just one argument is enough? But below you use up to 5 arguments.
Please clean up Lua stack checks.

> +		usage:
> +		luaL_error(L, "fio.popen: Invalid arguments");
> +	}
> +
> +	if (!popen_verify_argv(L))
> +		goto usage;
> +
> +	int stdin_fd = FIO_PIPE;
> +	int stdout_fd = FIO_PIPE;
> +	int stderr_fd = FIO_PIPE;
> +
> +	char** argv = popen_extract_strarray(L, 1, NULL);
> +	char** env = popen_extract_strarray(L, 2, NULL);
> +
> +	if (lua_isnumber(L, 3))
> +		stdin_fd = lua_tonumber(L, 3);
> +	if (lua_isnumber(L, 4))
> +		stdout_fd = lua_tonumber(L, 4);
> +	if (lua_isnumber(L, 5))
> +		stderr_fd = lua_tonumber(L, 5);
> +
> +	struct popen_handle* handle = NULL;
> +	if (coio_call(popen_new, argv, env, stdin_fd, stdout_fd,

coio_call should be hidden behind the popen_new implementation.

Anyway, why do we need coio_call at all? Now since we use vfork,
popen_new should be fast enough to be called right from tx.
BTW, you wouldn't need pthread_mutex guarding popen hash if you
didn't use coio_call.

> +		      stderr_fd, &handle) < 0) {
> +		lua_pushnil(L);
> +		lbox_fio_pushsyserror(L);
> +		return 2;
> +	} else {
> +		luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF);
> +		*(struct popen_handle **)
> +			luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF) = handle;
> +		lua_pushcfunction(L, lbox_fio_popen_gc);
> +		luaL_setcdatagc(L, -2);
> +
> +		return 1;
> +	}
> +}
> +
> +static int
> +lbox_fio_popen_read(struct lua_State *L)
> +{
> +	/* popen_read(self.fh, buf, size, seconds) */
> +
> +	void* fh = fio_popen_get_handle(L, 1);
> +	uint32_t ctypeid;
> +	char *buf = *(char **)luaL_checkcdata(L, 2, &ctypeid);
> +	size_t len = lua_tonumber(L, 3);
> +	ev_tstamp seconds = lua_tonumber(L, 4);
> +
> +	if (!len) {
> +		lua_pushinteger(L, 0);
> +		lua_pushinteger(L, STDOUT_FILENO);
> +		return 2;
> +	}
> +
> +	int output_number = 0;
> +	size_t received = 0;
> +	int rc = popen_read(fh, buf, len,
> +			    &received, &output_number,
> +			    seconds);
> +	if (rc == 0) {                /* The reading's succeeded */
> +		lua_pushinteger(L, received);
> +		lua_pushinteger(L, output_number);
> +		return 2;
> +	} else {
> +		lua_pushnil(L);
> +		lua_pushnil(L);
> +		lbox_fio_pushsyserror(L);
> +		return 3;
> +	}
> +}
> +
> +static int
> +lbox_fio_popen_write(struct lua_State *L)
> +{
> +	struct popen_handle* fh = fio_popen_get_handle(L, 1);
> +	const char *buf = lua_tostring(L, 2);
> +	uint32_t ctypeid = 0;
> +	if (buf == NULL)
> +		buf = *(const char **)luaL_checkcdata(L, 2, &ctypeid);
> +	size_t len = lua_tointeger(L, 3);
> +	double timeout = lua_tonumber(L, 4);
> +
> +	size_t written = 0;
> +	int rc = popen_write(fh, buf, len,
> +			     &written, timeout);
> +	if (rc == 0 && written == len) {
> +		/* The writing's succeeded */
> +		lua_pushinteger(L, (ssize_t) written);

Why do you cast 'written' to ssize_t?

> +		return 1;
> +	} else {
> +		lua_pushnil(L);
> +		lbox_fio_pushsyserror(L);
> +		return 2;
> +	}
> +}
> +
> +static int
> +lbox_fio_popen_get_status(struct lua_State *L)
> +{
> +	struct popen_handle* fh = fio_popen_get_handle(L, 1);
> +	int exit_code = 0;
> +	int res = popen_get_status(fh, &exit_code);
> +
> +	switch (res) {
> +		case POPEN_RUNNING:
> +			lua_pushnil(L);
> +			break;
> +
> +		case POPEN_KILLED:
> +			lua_pushinteger(L, -exit_code);
> +			break;
> +
> +		default:
> +			lua_pushinteger(L, exit_code);
> +			break;
> +	}
> +
> +	return 1;
> +}
> +
> +static int
> +lbox_fio_popen_get_std_file_handle(struct lua_State *L)
> +{
> +	struct popen_handle* fh = fio_popen_get_handle(L, 1);
> +	int file_no = lua_tonumber(L, 2);
> +	int res = popen_get_std_file_handle(fh, file_no);
> +
> +	if (res < 0)
> +		lua_pushnil(L);
> +	else
> +		lua_pushinteger(L, res);
> +	return 1;
> +}
> +
> +static int
> +lbox_fio_popen_kill(struct lua_State *L)
> +{
> +	struct popen_handle* fh = fio_popen_get_handle(L, 1);
> +	int signal_id = lua_tonumber(L, 2);
> +
> +	int res = popen_kill(fh, signal_id);
> +	if (res < 0){
> +		lua_pushboolean(L, false);
> +		lbox_fio_pushsyserror(L);
> +		return 2;
> +	} else {
> +		lua_pushboolean(L, true);
> +		return 1;
> +	}
> +}
> +
> +static int
> +lbox_fio_popen_wait(struct lua_State *L)
> +{
> +	struct popen_handle *fh = fio_popen_get_handle(L, 1);
> +	assert(fh);
> +	ev_tstamp timeout = lua_tonumber(L, 2);
> +
> +	/*
> +	 * On success function returns an
> +	 * exit code as a positive number
> +	 * or signal id as a negative number.
> +	 * If failed, rc is nul and err
> +	 * contains a error message.
> +	 */
> +
> +	int exit_code =0;
> +	int res = popen_wait(fh, timeout, &exit_code);
> +	if (res < 0){
> +		lua_pushnil(L);
> +		lbox_fio_pushsyserror(L);
> +		return 2;
> +	} else {
> +		/* Release the allocated resources */
> +		popen_destroy(fh);
> +		fio_popen_invalidate_handle(L, 1);

Why invalidate it now? This means we won't be able to read stdout of a
reaped process, which looks unexpected to me. I think all resources
should be released by gc. There should be an explicit call to close file
descriptors. Please also see my comments to the commit message.

> +
> +		lua_pushinteger(L, exit_code);
> +		return 1;
> +	}
> +}
>  
>  
>  void
> @@ -747,6 +1013,13 @@ tarantool_lua_fio_init(struct lua_State *L)
>  		{ "listdir",		lbox_fio_listdir		},
>  		{ "fstat",		lbox_fio_fstat			},
>  		{ "copyfile",		lbox_fio_copyfile,		},
> +		{ "popen",		lbox_fio_popen			},
> +		{ "popen_read",		lbox_fio_popen_read 		},
> +		{ "popen_write",	lbox_fio_popen_write 		},
> +		{ "popen_get_status",	lbox_fio_popen_get_status	},
> +		{ "popen_get_std_file_handle",	lbox_fio_popen_get_std_file_handle },
> +		{ "popen_kill",		lbox_fio_popen_kill 		},
> +		{ "popen_wait",		lbox_fio_popen_wait 		},
>  		{ NULL,			NULL				}
>  	};
>  	luaL_register(L, NULL, internal_methods);
> @@ -849,4 +1122,12 @@ tarantool_lua_fio_init(struct lua_State *L)
>  
>  	lua_settable(L, -3);
>  	lua_pop(L, 1);
> +
> +	/* Get CTypeID for `struct tuple' */
> +	int rc = luaL_cdef(L, "struct popen_handle;");
> +	assert(rc == 0);
> +	(void) rc;
> +	CTID_STRUCT_POPEN_HANDLE_REF = luaL_ctypeid(L, "struct popen_handle &");
> +	assert(CTID_STRUCT_POPEN_HANDLE_REF != 0);
> +
>  }
> diff --git a/src/lua/fio.lua b/src/lua/fio.lua
> index ba8c47ec..d07cac55 100644
> --- a/src/lua/fio.lua
> +++ b/src/lua/fio.lua
> @@ -3,6 +3,8 @@
>  local fio = require('fio')
>  local ffi = require('ffi')
>  local buffer = require('buffer')
> +local signal = require('signal')
> +local errno = require('errno')
>  
>  ffi.cdef[[
>      int umask(int mask);
> @@ -15,6 +17,13 @@ local const_char_ptr_t = ffi.typeof('const char *')
>  local internal = fio.internal
>  fio.internal = nil
>  
> +fio.STDIN = 0
> +fio.STDOUT = 1
> +fio.STDERR = 2
> +fio.PIPE = -2
> +fio.DEVNULL = -3
> +
> +
>  local function sprintf(fmt, ...)
>      if select('#', ...) == 0 then
>          return fmt
> @@ -206,6 +215,216 @@ fio.open = function(path, flags, mode)
>      return fh
>  end
>  
> +local popen_methods = {}
> +
> +-- read stdout & stderr of the process started by fio.popen
> +-- Usage:
> +-- read(size) -> str, source, err
> +-- read(buf, size) -> length, source, err
> +-- read(size, timeout) -> str, source, err
> +-- read(buf, size, timeout) -> length, source, err
> +--
> +-- timeout - number of seconds to wait (optional)
> +-- source contains id of the stream, fio.STDOUT or fio.STDERR
> +-- err - error message if method has failed or nil on success
> +popen_methods.read = function(self, buf, size, timeout)
> +    if self.fh == nil then
> +        return nil, nil, 'Invalid object'
> +    end
> +
> +    local tmpbuf
> +
> +    if ffi.istype(const_char_ptr_t, buf) then
> +        -- ext. buffer is specified
> +        if type(size) ~= 'number' then
> +            error('fio.popen.read: invalid size argument')
> +        end
> +        timeout = timeout or -1
> +    elseif type(buf) == 'number' then
> +        -- use temp. buffer
> +        timeout = size or -1
> +        size = buf
> +
> +        tmpbuf = buffer.ibuf()
> +        buf = tmpbuf:reserve(size)
> +    else
> +        error("fio.popen.read: invalid arguments")
> +    end
> +
> +    local res, output_no, err = internal.popen_read(self.fh, buf, size, timeout)
> +    if res == nil then
> +        if tmpbuf ~= nil then
> +            tmpbuf:recycle()
> +        end
> +        return nil, nil, err
> +    end
> +
> +    if tmpbuf ~= nil then
> +        tmpbuf:alloc(res)
> +        res = ffi.string(tmpbuf.rpos, tmpbuf:size())
> +        tmpbuf:recycle()
> +    end
> +    return res, output_no
> +end
> +
> +-- write(str)
> +-- write(buf, len)
> +popen_methods.write = function(self, data, size)
> +    local timeout = -1.0
> +    if type(data) == 'table' then
> +        timeout = data.timeout or timeout
> +        size = data.size or size
> +        data = data.buf
> +    end
> +
> +    if type(data) == 'string' then
> +        if size == nil then
> +            size = string.len(data)
> +        end
> +    elseif not ffi.istype(const_char_ptr_t, data) then
> +        data = tostring(data)
> +        size = #data
> +    end
> +
> +    local res, err = internal.popen_write(self.fh, data, tonumber(size), tonumber(timeout))
> +    if err ~= nil then
> +        return false, err
> +    end
> +    return res >= 0
> +end
> +
> +popen_methods.status = function(self)
> +    if self.fh ~= nil then
> +        return internal.popen_get_status(self.fh)
> +    else
> +        return self.exit_code
> +    end
> +end
> +
> +popen_methods.stdin = function (self)
> +    if self.fh == nil then
> +        return nil, 'Invalid object'
> +    end
> +
> +    return internal.popen_get_std_file_handle(self.fh, fio.STDIN)
> +end
> +
> +popen_methods.stdout = function (self)
> +    if self.fh == nil then
> +        return nil, 'Invalid object'
> +    end
> +
> +    return internal.popen_get_std_file_handle(self.fh, fio.STDOUT)
> +end
> +
> +popen_methods.stderr = function (self)
> +    if self.fh == nil then
> +        return nil, 'Invalid object'
> +    end
> +
> +    return internal.popen_get_std_file_handle(self.fh, fio.STDERR)
> +end
> +
> +popen_methods.kill = function(self, sig)
> +    if self.fh == nil then
> +        return false, errno.strerror(errno.ESRCH)
> +    end
> +
> +    if sig == nil then
> +        sig = 'SIGTERM'
> +    end
> +    if type(sig) == 'string' then
> +        sig = signal.c.signals[sig]
> +        if sig == nil then
> +            errno(errno.EINVAL)
> +            return false, sprintf("fio.popen.kill(): unknown signal: %s", sig)
> +        end
> +    else
> +        sig = tonumber(sig)
> +    end
> +
> +    return internal.popen_kill(self.fh, sig)
> +end
> +
> +popen_methods.wait = function(self, timeout)
> +    if self.fh == nil then
> +        return false, 'Invalid object'
> +    end
> +
> +    if timeout == nil then
> +        timeout = -1
> +    else
> +        timeout = tonumber(timeout)
> +    end
> +
> +    local rc, err = internal.popen_wait(self.fh, timeout)
> +    if rc ~= nil then
> +        self.exit_code = tonumber(rc)
> +        self.fh = nil
> +        return true
> +    else
> +        return false,err
> +    end
> +end
> +
> +
> +local popen_mt = { __index = popen_methods }
> +
> +fio.popen = function(params)
> +    local argv = params.argv
> +    local env = params.environment
> +    local hstdin = params.stdin
> +    local hstdout = params.stdout
> +    local hstderr = params.stderr
> +
> +    if type(hstdin) == 'table' then
> +        hstdin = hstdin.fh
> +    end
> +    if type(hstdout) == 'table' then
> +        hstdout = hstdout.fh
> +    end
> +    if type(hstderr) == 'table' then
> +        hstderr = hstderr.fh
> +    end
> +
> +    if argv == nil or
> +       type(argv) ~= 'table' or
> +       table.getn(argv) < 1 then
> +        local errmsg = [[Usage: fio.popen({parameters}),
> +parameters - a table containing arguments to run a process.
> +The following arguments are expected:
> +
> +argv - [mandatory] is a table of argument strings passed
> +       to the new program.
> +       By convention, the first of these strings should
> +       contain the filename associated with the file
> +       being executed.
> +
> +environment - [optional] is a table of strings,
> +              conventionally of the form key=value,
> +              which are passed as environment to the
> +              new program.
> +
> +stdin  - [optional] overrides the child process's
> +         standard input.
> +stdout - [optional] overrides the child process's
> +         standard output.
> +stderr - [optional] overrides the child process's
> +         standard error output.
> +]]

AFAIR we don't typically write a detailed help on error. Just a one-line
"Usage..." reminder. There's a documentation for the rest.



More information about the Tarantool-patches mailing list