[PATCH v3] core: Non-blocking io.popen
Stanislav Zudin
szudin at tarantool.org
Tue Jul 2 09:56:00 MSK 2019
Please find the comments inplace.
The patch is in the separate letter.
On 21.06.2019 15:31, Vladimir Davydov wrote:
> On Mon, Jun 17, 2019 at 09:54:59PM +0300, Stanislav Zudin wrote:
>> Adds nonblocking implementation of popen.
>> The method is available in namespace fio.
>> fio.popen() returns an object providing facilities
>> for dealing with standard input/output, getting
>> status of the child process.
>>
>> Closes #4031
>>
>> @TarantoolBot document
>> Title: Nonblocking fio.popen
>>
>> handle, err = fio.popen(parameters)
>>
>> fio.popen starts a process and redirects its input/output.
>>
>> parameters - a table containing arguments to run a process.
>> The following arguments are expected:
>>
>> argv - [mandatory] is a table of argument strings passed to
>> the new program. By convention, the first of these strings should
>> contain the filename associated with the file being executed.
>>
>> environment - [optional] is a table of strings, conventionally of the
>> form key=value, which are passed as environment to the new program.
>> If not specified a parent's environment is used.
>
> I'd rename 'argv' to 'args', 'environment' to 'env', 'parameters' to
> 'opts' - IMO this would be more consistent with Tarantool interfaces.
> Also, since 'args' is mandatory, I would make it a separate argument:
>
> fio.popen(args[, opts])
>
> where opts is a table that may contain the following keys:
>
> env
> stdin
> stdout
> stderr
>
> Example:
>
> fio.popen({'cat', 'file'}, {stdout = fio.DEVNULL})
>
Fixed.
>>
>> By default stdin, stdout,stderr of the associated process are
>> available for writing/reading using object's methods
>> handle:write() and handle:read().
>> One can override default behavior to redirect streams to/from file
>> or to input/output of another process or to the default input/output
>> of the parent process.
>>
>> stdin - [optional] overrides the child process's standard input.
>> stdout - [optional] overrides the child process's standard output.
>> stderr - [optional] overrides the the child process's standard
>> error output.
>> May accept one of the following values:
>> Handle of the file open with fio.open()
>> Handle of the standard input/output of another process,
>> open by fio.popen
>> A constant defining the parent's STDIN, STDOUT or STDERR.
>> A constants:
>> fio.PIPE - Opens a file descriptor available for reading/writing
>> or redirection to/from another process.
>> fio.DEVNULL - Makes fio.popen redirect the output to /dev/null.
>>
>> On success returns an object providing methods for
>> communication with the running process.
>> Returns nil if underlying functions calls fail;
>> in this case the second return value, err, contains a error message.
>>
>> The object created by fio.popen provides the following methods:
>> read()
>> write()
>> kill()
>> wait()
>> status()
>> stdin()
>> stdout()
>> stderr()
>>
>> number handle:stdin()
>>
>> Returns handle of the child process's standard input.
>> The handle is available only if it was created with
>> fio.PIPE option.
>> Use this handle to setup a redirection
>> from file or other process to the input of the associated process.
>> If handle is unavailable the method returns nil.
>>
>> number handle:stdout()
>> number handle:stderr()
>>
>> Return STDOUT and STDIN of the associated process accordingly.
>> See handle:stdin() for details.
>>
>> rc,err = handle:wait(timeout)
>>
>> The wait() waits for the associated process to terminate.
>>
>> timeout - an integer specifies number of seconds to wait.
>> If the requested time has elapsed the method returns false,
>> the second return value, err, contains a error message.
>> To distinguish timeout from the the other errors use errno.
>> If timeout is nil, the method waits infinitely until
>> the associated process is terminated.
>> On success function returns true.
>> If failed, rc is false and err contains a error message.
>>
>> If the associated process is terminated, one can use the following
>> methods get the exit status:
>>
>> rc = handle:status()
>
> Looking at the code, I see that one can't read() the remaining output
> from stdout/stderr pipe after wait(). I thought we'd agreed to allow
> that.
>
I was sure that we agreed that wait() is a final method :)
Anyway, now it's possible to read after wait().
>>
>> returns nil if process is still running
>> == 0 if the process exited normally
>> error code > 0 if the process terminated with an error
>> -signal if the process was killed by a signal
>>
>> rc, err = handle:kill(sig)
>>
>> The kill() sends a specified signal to the associated process.
>> On success the method returns true,
>> if failed - false and error message.
>> If the sig is nil the default "SIGTERM" is being sent to the process.
>> If the signal is unknown then the method fails (with error EINVAL).
>> The method accepts string names of signal as well as their numeric
>> values.
>
> Why string literals? Why not simply define constants in the signal
> module, like signal.SIGTERM, signal.SIGKILL, etc. IMO it would be more
> convenient, compare:
>
> p:kill('SIGTERM')
> p:wait()
> if p:status() == signal.c.signals['SIGTERM'] ...
>
> and
>
> p:kill(signal.SIGTERM)
> p:wait()
> if (p:status() == signal.SIGTERM) ...
>
Fixed.
>>
>> rc,src,err = handle:read(buffer,size,timeout)
>>
>> read stdout & stderr of the process started by fio.popen
>> Usage:
>> read(size) -> str, source, err
>> read(buf, size) -> length, source, err
>> read(size, timeout) -> str, source, err
>> read(buf, size, timeout) -> length, source, err
>
> No option to read until EOF?
>
It's a pipe, we can't use fstat() to prepare buffer enough size to
read until EOF. So it's a user's responsibility to choose the size
of buffer and repeat reading till the end.
>>
>> timeout - number of seconds to wait (optional)
>> source contains id of the stream, fio.STDOUT or fio.STDERR
>> err - error message if method has failed or nil on success
>>
>> rc, err = handle:write(buf[, size])
>> rc, err = handle:write(opts), where opts are:
>> * buf
>> * size
>> * timeout
>
> So write may take opts while read may not. Looks inconsistent. Please
> change the API so that read() and write() have similar signatures.
>
Done.
>>
>> Writes specified number of bytes
>> On success returns number of written bytes.
>> If failed the rc is nil and err contains an error message.
>
> Please mention that 'read' and 'write' are only available if
> stdin/stdout has been redirected to a pipe.
>
Done
> I expected to also see explicit 'close' method to close stdin/stdout in
> case they are redirected to a pipe. Here's why we need it. Suppose you
> want to grep something in a string you have defined in your code. So you
> run popen({'grep', 'what'}) and feed the string to it with write(). Then
> you read the output and expect the program to terminate. However, it
> won't, because grep doesn't exit until it receives EOF. That's why we
> need to be able to close stdin:
>
> p = popen({'grep', 'what'})
> p:write(my_str)
> p:close(STDIN)
> result = p:read()
>
> Note, we need to be able to explicitly specify which end we want to
> close (STDIN, STDOUT or STDERR or all of them), similarly to shutdown(2)
> system call.
>
Sounds reasonable.
Done.
>>
>> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
>> index 33b64f6a..96f1d751 100644
>> --- a/src/CMakeLists.txt
>> +++ b/src/CMakeLists.txt
>> @@ -120,6 +120,7 @@ set (server_sources
>> lua/string.c
>> lua/buffer.c
>> lua/swim.c
>> + lua/lua_signal.c
>
> Why not simply lua/signal.c?
To avoid possible conflicts with system files.
>
>> ${lua_sources}
>> ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
>> ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
>> diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
>> index 66e430a2..64152f08 100644
>> --- a/src/lib/core/CMakeLists.txt
>> +++ b/src/lib/core/CMakeLists.txt
>> @@ -27,8 +27,16 @@ set(core_sources
>> mpstream.c
>> port.c
>> decimal.c
>> + coio_popen.c
>> )
>>
>> +# Disable gcc compiler error
>> +if (CMAKE_COMPILER_IS_GNUCC)
>> + set_source_files_properties(coio_popen.c PROPERTIES COMPILE_FLAGS
>> + -Wno-clobbered)
>> +endif()
>> +
>> +
>
> I got no compilation error after removing this. I have gcc 6.3.0.
> Please explain what happens without this option. I want to try to
> figure out if we can fix it somehow else, without patching cmake.
>
if you build release version using
cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WERROR=ON
You're getting the following errors:
/home/szudin/work/ttool03/src/lib/core/coio_popen.c: In function
‘popen_new_impl’:
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:302:7: error:
variable ‘popen_list_locked’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
bool popen_list_locked = false;
^~~~~~~~~~~~~~~~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:158:23: error:
variable ‘handle’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
struct popen_handle *handle =
^~~~~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:299:36: error:
argument ‘env’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
popen_new_impl(char **argv, char **env,
^~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:6: error:
argument ‘stdin_fd’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
int stdin_fd, int stdout_fd, int stderr_fd)
^~~~~~~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:20: error:
argument ‘stdout_fd’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
int stdin_fd, int stdout_fd, int stderr_fd)
^~~~~~~~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:35: error:
argument ‘stderr_fd’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
int stdin_fd, int stdout_fd, int stderr_fd)
^~~~~~~~~
/home/szudin/work/ttool03/src/lib/core/coio_popen.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-cast-function-type’
[-Werror]
cc1: all warnings being treated as errors
src/lib/core/CMakeFiles/core.dir/build.make:734: recipe for target
'src/lib/core/CMakeFiles/core.dir/coio_popen.c.o' failed
make[2]: *** [src/lib/core/CMakeFiles/core.dir/coio_popen.c.o] Error 1
CMakeFiles/Makefile2:3518: recipe for target 'src/lib/core/CMakeFiles
/core.dir/all' failed
make[1]: *** [src/lib/core/CMakeFiles/core.dir/all] Error 2
The version of gcc doesn't matter.
The error is reproducible on the travis as well as on a local
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
>> if (TARGET_OS_NETBSD)
>> # A workaround for "undefined reference to `__gcc_personality_v0'"
>> # on x86_64-rumprun-netbsd-gcc
>> diff --git a/src/lib/core/coio_popen.c b/src/lib/core/coio_popen.c
>> new file mode 100644
>> index 00000000..dcc50136
>> --- /dev/null
>> +++ b/src/lib/core/coio_popen.c
>> @@ -0,0 +1,827 @@
>> +/*
>> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
>> + *
>> + * Redistribution and use in source and binary forms, with or
>> + * without modification, are permitted provided that the following
>> + * conditions are met:
>> + *
>> + * 1. Redistributions of source code must retain the above
>> + * copyright notice, this list of conditions and the
>> + * following disclaimer.
>> + *
>> + * 2. Redistributions in binary form must reproduce the above
>> + * copyright notice, this list of conditions and the following
>> + * disclaimer in the documentation and/or other materials
>> + * provided with the distribution.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
>> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
>> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
>> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
>> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
>> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
>> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
>> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>> + * SUCH DAMAGE.
>> + */
>> +
>> +#include <stddef.h>
>> +#include <signal.h>
>> +#include "coio_popen.h"
>> +#include "coio_task.h"
>> +#include "fiber.h"
>> +#include "fio.h"
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <sys/wait.h>
>> +#include <unistd.h>
>> +#include <sys/types.h>
>> +#include <sys/socket.h>
>> +#include <pthread.h>
>> +#include <float.h>
>> +#include <sysexits.h>
>> +
>> +/*
>> + * On OSX this global variable is not declared
>> + * in <unistd.h>
>> + */
>> +extern char **environ;
>> +
>> +
>> +struct popen_handle {
>> + /* Process id */
>> + pid_t pid;
>> +
>> + /*
>> + * Three descriptors:
>> + * [0] write to stdin of the child process
>> + * [1] read from stdout of the child process
>> + * [2] read from stderr of the child process
>> + * Valid only for pipe.
>
> Otherwise they are set to what? -1 I assume? Please mention in the
> comment.
>
yes, they are.
Fixed the comment.
>> + */
>> + int fd[3];
>> +
>> + /*
>> + * Handle to /dev/null.
>> + */
>> + int devnull_fd;
>
> Do we need to open /dev/null per each popen? Can't we use the same file
> descriptor for all child processes?
>
Ok, the /dev/null is shared between all child processes.
>> +
>> + /*
>> + * Current process status.
>> + * The SIGCHLD handler changes this status.
>> + */
>> + enum popen_status status;
>> +
>> + /*
>> + * Exit status of the associated process
>> + * or number of the signal that caused the
>> + * associated process to terminate.
>> + */
>> + int exit_code;
>
> This is confusing. First, 'status' is used for reporting exit code or
> signal in Lua, but here it's something completely different. Second
> 'exit_code' isn't necessarily an exit code, but -signo. May be, it would
> be more readable to write something like this instead:
>
> /* Set to true if the process has terminated. */
> bool terminated;
> /*
> * If the process has terminated, this variable stores
> * the exit code if the process exited normally, by
> * calling exit(), or -signo if the process was killed
> * by a signal.
> */
> int status;
>
Done.
>> +};
>> +
>> +/*
>> + * Map: (pid) => (popen_handle *)
>> + */
>> +#define mh_name _popen_storage
>> +#define mh_key_t pid_t
>> +#define mh_node_t struct popen_handle*
>> +#define mh_arg_t void *
>> +#define mh_hash(a, arg) ((*a)->pid)
>> +#define mh_hash_key(a, arg) (a)
>> +#define mh_cmp(a, b, arg) (((*a)->pid) != ((*b)->pid))
>> +#define mh_cmp_key(a, b, arg) ((a) != ((*b)->pid))
>> +#define MH_SOURCE 1
>> +#include "salad/mhash.h"
>> +
>> +
>> +static struct mh_popen_storage_t* popen_hash_table = NULL;
>
> According to our coding style, we put * closer to variable name, in
> C-stle. There quite a few places where you put it in C++-style, i.e.
> closer to type name. Please fix.
>
ok
>> +
>> +void
>> +popen_initialize()
>
> Please rename to popen_init. There should be popen_free to clean up
> objects allocated by it.
>
Done.
>> +
>> + popen_hash_table = mh_popen_storage_new();
>> +}
>> +
>> +static void
>> +popen_lock_data_list()
>> +{
>> + pthread_mutex_lock(&mutex);
>> +}
>> +
>> +static void
>> +popen_unlock_data_list()
>> +{
>> + pthread_mutex_unlock(&mutex);
>> +}
>> +
>> +static void
>> +popen_append_to_list(struct popen_handle *data)
>> +{
>> + struct popen_handle **old = NULL;
>> + mh_int_t id = mh_popen_storage_put(popen_hash_table,
>> + (const struct popen_handle **)&data, &old, NULL);
>> + (void)id;
>> +}
>> +
>> +static struct popen_handle *
>> +popen_lookup_data_by_pid(pid_t pid)
>> +{
>> + mh_int_t pos = mh_popen_storage_find(popen_hash_table, pid, NULL);
>> + if (pos == mh_end(popen_hash_table))
>> + return NULL;
>> + else {
>> + struct popen_handle ** ptr =
>> + mh_popen_storage_node(popen_hash_table, pos);
>> + return *ptr;
>> + }
>> +}
>> +
>> +static void
>> +popen_exclude_from_list(struct popen_handle *data)
>> +{
>> + mh_popen_storage_remove(popen_hash_table,
>> + (const struct popen_handle **)&data, NULL);
>> +}
>
> What's the point in this one-liners each of which is used exactly once?
> IMO they only obfuscate the code. Please fold them in.
>
Done.
>> +
>> +static struct popen_handle *
>> +popen_data_new()
>> +{
>> + struct popen_handle *data =
>> + (struct popen_handle *)calloc(1, sizeof(*data));
>
> Please handle OOM here and everywhere else. It might look pointless,
> but we have to until https://github.com/tarantool/tarantool/issues/3534
> is fixed.
>
Done.
>> + data->fd[0] = -1;
>> + data->fd[1] = -1;
>> + data->fd[2] = -1;
>> + data->devnull_fd = -1;
>> + data->status = POPEN_RUNNING;
>> + return data;
>> +}
>> +
>> +enum pipe_end {
>> + PIPE_READ = 0,
>> + PIPE_WRITE = 1
>> +};
>> +
>> +static inline enum pipe_end
>> + popen_opposite_pipe(enum pipe_end side)
>
> Bad indentation. So is it an 'end' or a 'side'? Please use the same
> terminology throughout the code.
>
Ok.
>> +{
>> + return (enum pipe_end)(side ^ 1);
>> + /*
>> + * The code is equal to:
>> + * return (side == PIPE_READ) ? PIPE_WRITE
>> + * : PIPE_READ;
>> + */
>
> Why not write it like that than in the first place? Anyway, the helper
> looks pointless as it's a trivial one liner used exactly once. Please
> fold it.
>
Done.
>> +}
>> +
>> +static inline bool
>> +popen_create_pipe(int fd, int pipe_pair[2], enum pipe_end parent_side)
>> +{
>> + if (fd == FIO_PIPE) {
>> + if (pipe(pipe_pair) < 0 ||
>> + fcntl(pipe_pair[parent_side], F_SETFL, O_NONBLOCK) < 0) {
>> + return false;
>> + }
>> + }
>> + return true;
>> +}
>> +
>> +static inline void
>> +popen_close_child_fd(int std_fd, int pipe_pair[2],
>> + int *saved_fd, enum pipe_end child_side)
>> +{
>> + if (std_fd == FIO_PIPE) {
>> + /* Close child's side. */
>> + close(pipe_pair[child_side]);
>> +
>> + enum pipe_end parent_side = popen_opposite_pipe(child_side);
>> + *saved_fd = pipe_pair[parent_side];
>> + }
>> +}
>> +
>> +static inline void
>> +popen_close_pipe(int pipe_pair[2])
>> +{
>> + if (pipe_pair[0] >= 0) {
>> + close(pipe_pair[0]);
>> + close(pipe_pair[1]);
>> + }
>> +}
>> +
>> +
>> +/**
>> + * Implementation of fio.popen.
>> + * The function opens a process by creating a pipe
>> + * forking.
>> + *
>> + * @param argv - is an array of character pointers
>> + * to the arguments terminated by a null pointer.
>> + *
>> + * @param envp - is the pointer to an array
>> + * of character pointers to the environment strings.
>> + *
>> + * @param stdin_fd - the file handle to be redirected to the
>> + * child process's STDIN.
>> + *
>> + * @param stdout_fd - the file handle receiving the STDOUT
>> + * output of the child process.
>> + *
>> + * @param stderr_fd - the file handle receiving the STDERR
>> + * output of the child process.
>> + *
>> + * The stdin_fd, stdout_fd & stderr_fd accept file descriptors
>> + * from open() or the following values:
>> + *
>> + * FIO_PIPE - opens a pipe, binds it with child's
>> + * input/output. The pipe is available for reading/writing.
>> + *
>> + * FIO_DEVNULL - redirects output from process to /dev/null.
>> + *
>> + * @return handle of the pipe for reading or writing
>> + * (depends on value of type).
>> + * In a case of error returns NULL.
>> + */
>> +static struct popen_handle *
>> +popen_new_impl(char **argv, char **envp,
>> + int stdin_fd, int stdout_fd, int stderr_fd)
>> +{
>> + bool popen_list_locked = false;
>> + pid_t pid;
>> + int pipe_rd[2] = {-1,-1};
>> + int pipe_wr[2] = {-1,-1};
>> + int pipe_er[2] = {-1,-1};
>> + errno = 0;
>> +
>> + struct popen_handle *data = popen_data_new();
>> + if (data == NULL)
>> + return NULL;
>> +
>> + /*
>> + * Setup a /dev/null if necessary.
>> + */
>> + bool read_devnull = (stdin_fd == FIO_DEVNULL);
>> + bool write_devnull = (stdout_fd == FIO_DEVNULL) ||
>> + (stderr_fd == FIO_DEVNULL);
>> + int devnull_flags = O_RDWR;
>> + if (!read_devnull)
>> + devnull_flags = O_WRONLY;
>> + else if (!write_devnull)
>> + devnull_flags = O_RDONLY;
>> +
>> + if (read_devnull || write_devnull) {
>> + data->devnull_fd = open("/dev/null", devnull_flags);
>> + if (data->devnull_fd < 0)
>> + goto on_error;
>> + else {
>> + if (stdin_fd == FIO_DEVNULL)
>> + stdin_fd = data->devnull_fd;
>> + if (stdout_fd == FIO_DEVNULL)
>> + stdout_fd = data->devnull_fd;
>> + if (stderr_fd == FIO_DEVNULL)
>> + stderr_fd = data->devnull_fd;
>> + }
>> + }
>> +
>> + if (!popen_create_pipe(stdin_fd, pipe_rd, PIPE_WRITE))
>
> The function name is a bit confusing. It's called 'create_pipe', but it
> doesn't necessarily create a pipe. Same goes for popen_close_child_fd.
> I think we should rename them to something like
>
> popen_prepare_fd - called before fork
> popen_setup_child_fd - called in the child after fork
> popen_setup_parent_fd - called in the parent after fork
> popen_cleanup_fd - called on error
>
> (names aren't perfect - you might want to think more on them)
>
> Also, I think that you should fold all fd manipulation into those
> functions, including /dev/null setup and dup. Please try.
>
Done.
>> + goto on_error;
>> + if (!popen_create_pipe(stdout_fd, pipe_wr, PIPE_READ))
>> + goto on_error;
>> + if (!popen_create_pipe(stderr_fd, pipe_er, PIPE_READ))
>> + goto on_error;
>> +
>> + /*
>> + * Prepare data for the child process.
>> + * There must be no branches, to avoid compiler
>> + * error: "argument ‘xxx’ might be clobbered by
>> + * ‘longjmp’ or ‘vfork’ [-Werror=clobbered]".
>> + */
>> + if (envp == NULL)
>> + envp = environ;
>> +
>> + /* Handles to be closed in child process */
>> + int close_fd[3] = {-1, -1, -1};
>> + /* Handles to be duplicated in child process */
>> + int dup_fd[3] = {-1, -1, -1};
>> + /* Handles to be closed after dup2 in child process */
>> + int close_after_dup_fd[3] = {-1, -1, -1};
>> +
>> + if (stdin_fd == FIO_PIPE) {
>> + close_fd[STDIN_FILENO] = pipe_rd[PIPE_WRITE];
>> + dup_fd[STDIN_FILENO] = pipe_rd[PIPE_READ];
>> + } else if (stdin_fd != STDIN_FILENO) {
>> + dup_fd[STDIN_FILENO] = stdin_fd;
>> + close_after_dup_fd[STDIN_FILENO] = stdin_fd;
>> + }
>> +
>> + if (stdout_fd == FIO_PIPE) {
>> + close_fd[STDOUT_FILENO] = pipe_wr[PIPE_READ];
>> + dup_fd[STDOUT_FILENO] = pipe_wr[PIPE_WRITE];
>> + } else if (stdout_fd != STDOUT_FILENO){
>> + dup_fd[STDOUT_FILENO] = stdout_fd;
>> + if (stdout_fd != STDERR_FILENO)
>> + close_after_dup_fd[STDOUT_FILENO] = stdout_fd;
>> + }
>> +
>> + if (stderr_fd == FIO_PIPE) {
>> + close_fd[STDERR_FILENO] = pipe_er[PIPE_READ];
>> + dup_fd[STDERR_FILENO] = pipe_er[PIPE_WRITE];
>> + } else if (stderr_fd != STDERR_FILENO) {
>> + dup_fd[STDERR_FILENO] = stderr_fd;
>> + if (stderr_fd != STDOUT_FILENO)
>> + close_after_dup_fd[STDERR_FILENO] = stderr_fd;
>> + }
>> +
>> +
>> + popen_lock_data_list();
>> + popen_list_locked = true;
>> +
>> + pid = vfork();
>
> I thought we'd agreed to block all signals in the parent before vfork()
> and unblock them right after vfork(), so as to prevent the child from
> corrupting parent's memory while handling a signal.
>
Done.
>> +
>> + if (pid < 0)
>> + goto on_error;
>> + else if (pid == 0) /* child */ {
>> + /* Reset all signals to their defaults. */
>> + struct sigaction sa;
>> + memset(&sa, 0, sizeof(sa));
>> + sigemptyset(&sa.sa_mask);
>> + sa.sa_handler = SIG_DFL;
>> +
>> + if (sigaction(SIGUSR1, &sa, NULL) == -1 ||
>> + sigaction(SIGINT, &sa, NULL) == -1 ||
>> + sigaction(SIGTERM, &sa, NULL) == -1 ||
>> + sigaction(SIGHUP, &sa, NULL) == -1 ||
>> + sigaction(SIGWINCH, &sa, NULL) == -1 ||
>> + sigaction(SIGSEGV, &sa, NULL) == -1 ||
>> + sigaction(SIGFPE, &sa, NULL) == -1 ||
>> + sigaction(SIGCHLD, &sa, NULL) == -1)
>> + exit(EX_OSERR);
>> +
>> + /* Unblock any signals blocked by libev. */
>> + sigset_t sigset;
>> + sigfillset(&sigset);
>> + if (sigprocmask(SIG_UNBLOCK, &sigset, NULL) == -1)
>> + exit(EX_OSERR);
>
> This function is huge. Let's try to split it even more. E.g. we could
> move signal setup to a helper function.
>
Done.
>> +
>> + /* Setup stdin/stdout */
>> + for(int i = 0; i < 3; ++i) {
>> + if (close_fd[i] >= 0)
>> + close(close_fd[i]);
>> + if (dup_fd[i] >= 0)
>> + dup2(dup_fd[i], i);
>> + if (close_after_dup_fd[i] >= 0)
>> + close(close_after_dup_fd[i]);
>> + }
>> +
>> + execve( argv[0], argv, envp);
>> + exit(EX_OSERR);
>> + unreachable();
>> + }
>> +
>> + /* Parent process */
>> + popen_close_child_fd(stdin_fd, pipe_rd,
>> + &data->fd[STDIN_FILENO], PIPE_READ);
>> + popen_close_child_fd(stdout_fd, pipe_wr,
>> + &data->fd[STDOUT_FILENO], PIPE_WRITE);
>> + popen_close_child_fd(stderr_fd, pipe_er,
>> + &data->fd[STDERR_FILENO], PIPE_WRITE);
>> +
>> + data->pid = pid;
>> +
>> +on_cleanup:
>> + if (data){
>> + popen_append_to_list(data);
>> + }
>> +
>> + if (popen_list_locked)
>> + popen_unlock_data_list();
>> +
>> + if (argv){
>> + for(int i = 0; argv[i] != NULL; ++i)
>> + free(argv[i]);
>> + free(argv);
>> + }
>> + if (envp && envp != environ) {
>> + for(int i = 0; envp[i] != NULL; ++i)
>> + free(envp[i]);
>> + free(envp);
>> + }
>
> It should be a responsibility of the caller to free environment and
> args. E.g. they could be allocated on the region. Requiring them to be
> allocated with malloc() obscures the function protocol.
>
Done.
>> +
>> + return data;
>> +
>> +on_error:
>> + popen_close_pipe(pipe_rd);
>> + popen_close_pipe(pipe_wr);
>> + popen_close_pipe(pipe_er);
>> +
>> + if (data) {
>> + if (data->devnull_fd >= 0)
>> + close(data->devnull_fd);
>> + free(data);
>> + }
>> + data = NULL;
>> +
>> + goto on_cleanup;
>> + unreachable();
>> +}
>> +
>> +ssize_t
>> +popen_new(va_list ap)
>> +{
>> + char **argv = va_arg(ap, char **);
>> + char **envp = va_arg(ap, char **);
>> + int stdin_fd = va_arg(ap, int);
>> + int stdout_fd = va_arg(ap, int);
>> + int stderr_fd = va_arg(ap, int);
>> + struct popen_handle **handle = va_arg(ap, struct popen_handle **);
>> +
>> + *handle = popen_new_impl(argv, envp, stdin_fd, stdout_fd, stderr_fd);
>> + return (*handle) ? 0 : -1;
>> +}
>> +
>> +static void
>> +popen_close_handles(struct popen_handle *data)
>
> Sometimes you call variables referring to a popen_handle object
> 'handle', sometimes 'data', sometimes 'fh'. Please be consistent.
>
Ok.
>> +{
>> + for(int i = 0; i < 3; ++i) {
>> + if (data->fd[i] >= 0) {
>> + close(data->fd[i]);
>> + data->fd[i] = -1;
>> + }
>> + }
>> + if (data->devnull_fd >= 0) {
>> + close(data->devnull_fd);
>> + data->devnull_fd = -1;
>> + }
>> +}
>> +
>> +int
>> +popen_destroy(struct popen_handle *fh)
>> +{
>> + assert(fh);
>
> In general, assertions like this are pointless, because the code below
> will crash anyway if you pass NULL for fh.
>
The runtime asserts are the way to specify a contract.
They define and check the acceptable values and provide a readable error
message on exit.
>> +
>> + popen_lock_data_list();
>> + popen_close_handles(fh);
>> + popen_exclude_from_list(fh);
>> + popen_unlock_data_list();
>> +
>> + free(fh);
>> + return 0;
>> +}
>> +
>> +/**
>> + * Check if an errno, returned from a sio function, means a
>
> sio?
Bad copypaste. Fixed.
>
>> + * non-critical error: EAGAIN, EWOULDBLOCK, EINTR.
>> + */
>> +static inline bool
>> +popen_wouldblock(int err)
>> +{
>> + return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
>> +}
>> +
>> +static int
>> +popen_do_read(struct popen_handle *data, void *buf, size_t count, int *source_id)
>
> The line's longer than 80 characters and could be split without hurting
> readability.
>
Done.
>> +{
>> + /*
>> + * STDERR has higher priority, read it first.
>> + */
>> + int rc = 0;
>> + errno = 0;
>> + int fd_count = 0;
>> + if (data->fd[STDERR_FILENO] >= 0) {
>> + ++fd_count;
>> + rc = read(data->fd[STDERR_FILENO], buf, count);
>> +
>> + if (rc >= 0)
>> + *source_id = STDERR_FILENO;
>> + if (rc > 0)
>> + return rc;
>> +
>> + if (rc < 0 && !popen_wouldblock(errno))
>> + return rc;
>> +
>> + }
>> +
>> + /*
>> + * STDERR is not available or not ready, try STDOUT.
>> + */
>> + if (data->fd[STDOUT_FILENO] >= 0) {
>> + ++fd_count;
>> + rc = read(data->fd[STDOUT_FILENO], buf, count);
>> +
>> + if (rc >= 0) {
>> + *source_id = STDOUT_FILENO;
>> + return rc;
>> + }
>> + }
>> +
>> + if (!fd_count) {
>> + /*
>> + * There are no open handles for reading.
>> + */
>> + errno = EBADF;
>> + rc = -1;
>> + }
>> + return rc;
>> +}
>> +
>> +static void
>> +popen_coio_create(struct ev_io *coio, int fd)
>> +{
>> + coio->data = fiber();
>> + ev_init(coio, (ev_io_cb) fiber_schedule_cb);
>> + coio->fd = fd;
>
> AFAICS setting coio->fd is unnecessary, because ev_io_set sets it anyway.
>
> Since this is a trivial two-line routine which is only used in a couple
> of places, I'd inline it.
>
Done.
>> +}
>> +
>> +int
>> +popen_read(struct popen_handle *fh, void *buf, size_t count,
>> + size_t *read_bytes, int *source_id,
>
> Why don't you return ssize_t, like read(2) does? Then you wouldn't need
> the read_bytes argument.
>
Fixed.
>> + ev_tstamp timeout)
>> +{
>> + assert(fh);
>> + if (timeout < 0.0)
>> + timeout = DBL_MAX;
>
> We don't treat negative timeouts as infinity. Instead we pass
> TIMEOUT_INFINITY, which is simply a very large number. Take
> a look at lua/socket.lua for example.
>
Fixed.
>> +
>> + ev_tstamp start, delay;
>> + evio_timeout_init(loop(), &start, &delay, timeout);
>> +
>> + struct ev_io coio_rd;
>> + struct ev_io coio_er;
>
> In the function above you use 'pipe_rd' for STDIN while here you use
> 'coio_rd' for STDOUT. This is confusing. Let's rename these variables
> to coio_stdout and coio_stderr and pipe variables to pipe_stdout, etc.
>
Fixed.
>> + popen_coio_create(&coio_er, fh->fd[STDERR_FILENO]);
>> + popen_coio_create(&coio_rd, fh->fd[STDOUT_FILENO]);
>> +
>> + int result = 0;
>> +
>> + while (true) {
>> + int rc = popen_do_read(fh, buf, count, source_id);
>> + if (rc >= 0) {
>> + *read_bytes = rc;
>> + break;
>> + }
>> +
>> + if (!popen_wouldblock(errno)) {
>> + result = -1;
>> + break;
>> + }
>> +
>> + /*
>> + * The handlers are not ready, yield.
>
> Those are not 'handlers', but 'handles' or 'descriptors'.
>
Ok.
>> + */
>> + if (!ev_is_active(&coio_rd) &&
>> + fh->fd[STDOUT_FILENO] >= 0) {
>> + ev_io_set(&coio_rd, fh->fd[STDOUT_FILENO], EV_READ);
>> + ev_io_start(loop(), &coio_rd);
>> + }
>> + if (!ev_is_active(&coio_er) &&
>> + fh->fd[STDERR_FILENO] >= 0) {
>> + ev_io_set(&coio_er, fh->fd[STDERR_FILENO], EV_READ);
>> + ev_io_start(loop(), &coio_er);
>> + }
>> + /*
>> + * Yield control to other fibers until the
>> + * timeout is reached.
>> + */
>> + bool is_timedout = fiber_yield_timeout(delay);
>> + if (is_timedout) {
>> + errno = ETIMEDOUT;
>> + result = -1;
>> + break;
>> + }
>> +
>> + if (fiber_is_cancelled()) {
>> + errno = EINTR;
>> + result = -1;
>> + break;
>> + }
>> +
>> + evio_timeout_update(loop(), &start, &delay);
>> + }
>> +
>> + ev_io_stop(loop(), &coio_er);
>> + ev_io_stop(loop(), &coio_rd);
>> + return result;
>> +}
>> +
>> +int
>> +popen_write(struct popen_handle *fh, const void *buf, size_t count,
>> + size_t *written, ev_tstamp timeout)
>> +{
>> + assert(fh);
>> + if (count == 0) {
>> + *written = 0;
>> + return 0;
>> + }
>> +
>> + if (timeout < 0.0)
>> + timeout = DBL_MAX;
>> +
>> + if (fh->fd[STDIN_FILENO] < 0) {
>> + *written = 0;
>> + errno = EBADF;
>> + return -1;
>> + }
>> +
>> + ev_tstamp start, delay;
>> + evio_timeout_init(loop(), &start, &delay, timeout);
>> +
>> + struct ev_io coio;
>> + popen_coio_create(&coio, fh->fd[STDIN_FILENO]);
>> + int result = 0;
>> +
>> + while(true) {
>> + ssize_t rc = write(fh->fd[STDIN_FILENO], buf, count);
>> + if (rc < 0 && !popen_wouldblock(errno)) {
>> + result = -1;
>> + break;
>> + }
>> +
>
>> + size_t urc = (size_t)rc;
>> +
>> + if (urc == count) {
>> + *written = count;
>> + break;
>> + }
>> +
>> + if (rc > 0) {
>> + buf += rc;
>> + count -= urc;
>> + }
>
> Let's rewrite this part without urc:
>
> buf += rc;
> count -= rc;
> if (count == 0)
> break;
>
Done.
>> +
>> + /*
>> + * The handlers are not ready, yield.
>> + */
>> + if (!ev_is_active(&coio)) {
>> + ev_io_set(&coio, fh->fd[STDIN_FILENO], EV_WRITE);
>> + ev_io_start(loop(), &coio);
>> + }
>> +
>> + /*
>> + * Yield control to other fibers until the
>> + * timeout is reached.
>> + */
>> + bool is_timedout = fiber_yield_timeout(delay);
>> + if (is_timedout) {
>> + errno = ETIMEDOUT;
>> + result = -1;
>> + break;
>> + }
>> +
>> + if (fiber_is_cancelled()) {
>> + errno = EINTR;
>> + result = -1;
>> + break;
>
> IMHO it's uncommon to return EINTR in case the fiber is cancelled.
> Why don't you use diag for propagating errors from this module? There's
> SystemError for errors returned by syscalls and FiberIsCancelled error
> for this case.
>
Fixed.
>> + }
>> +
>> + evio_timeout_update(loop(), &start, &delay);
>> + }
>> +
>> + ev_io_stop(loop(), &coio);
>> + return result;
>> +}
>> +
>> +int
>> +popen_kill(struct popen_handle *fh, int signal_id)
>> +{
>> + assert(fh);
>> + return kill(fh->pid, signal_id);
>
> I guess we should return an error if the process happens to have been
> reaped.
>
Done.
>> +}
>> +
>> +int
>> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code)
>
> What's the point to return the exit code? One can access it directly
> after the function completes.
>
Fixed. popen_wait() returns 0 for success or -1 for error.
>> +{
>> + assert(fh);
>> + if (timeout < 0.0)
>> + timeout = DBL_MAX;
>> +
>> + ev_tstamp start, delay;
>> + evio_timeout_init(loop(), &start, &delay, timeout);
>> +
>> + int result = 0;
>> +
>> + while (true) {
>> + /* Wait for SIGCHLD */
>> + int code = 0;
>> +
>> + int rc = popen_get_status(fh, &code);
>> + if (rc != POPEN_RUNNING) {
>> + *exit_code = (rc == POPEN_EXITED) ? code
>> + : -code;
>> + break;
>> + }
>> +
>> + /*
>> + * Yield control to other fibers until the
>> + * timeout is reached.
>> + * Let's sleep for 20 msec.
>> + */
>> + fiber_yield_timeout(0.02);
>
> Why 20 ms? I think what you need is fiber_cond.
The fiber_cond was not applicable in multithreaded environment.
For now when everything works in a single thread fiber_cond
can be applied.
>
>> +
>> + if (fiber_is_cancelled()) {
>> + errno = EINTR;
>> + result = -1;
>> + break;
>> + }
>> +
>> + evio_timeout_update(loop(), &start, &delay);
>> + bool is_timedout = (delay == 0.0);
>> + if (is_timedout) {
>> + errno = ETIMEDOUT;
>> + result = -1;
>> + break;
>> + }
>> + }
>> +
>> + return result;
>> +}
>> +
>> +int
>> +popen_get_std_file_handle(struct popen_handle *fh, int file_no)
>> +{
>> + assert(fh);
>> + if (file_no < STDIN_FILENO || STDERR_FILENO < file_no){
>> + errno = EINVAL;
>> + return -1;
>> + }
>> +
>> + errno = 0;
>> + return fh->fd[file_no];
>> +}
>> +
>> +int
>> +popen_get_status(struct popen_handle *fh, int *exit_code)
>> +{
>
> Don't see any point in having these helper function - it's okay to
> access the status and fd fields directly.
>
It's a bad idea to reveal the implementation details.
Consider popen_get_status() as a getter.
>> + assert(fh);
>> + errno = 0;
>> +
>> + if (exit_code)
>> + *exit_code = fh->exit_code;
>> +
>> + return fh->status;
>> +}
>> +
>> +/*
>> + * evio data to control SIGCHLD
>> + */
>> +static ev_child cw;
>> +
>> +static void
>> +popen_sigchld_cb(ev_loop *loop, ev_child *watcher, int revents)
>> +{
>> + (void)loop;
>> + (void)revents;
>> +
>> + popen_lock_data_list();
>> +
>> + struct popen_handle *data = popen_lookup_data_by_pid(watcher->rpid);
>> + if (data) {
>> + if (WIFEXITED(watcher->rstatus)) {
>> + data->exit_code = WEXITSTATUS(watcher->rstatus);
>> + data->status = POPEN_EXITED;
>> + } else if (WIFSIGNALED(watcher->rstatus)) {
>> + data->exit_code = WTERMSIG(watcher->rstatus);
>> +
>> + if (WCOREDUMP(watcher->rstatus))
>> + data->status = POPEN_DUMPED;
>
> I don't see any point at all in this status. We don't even report it.
> Please remove.
>
Done.
>> + else
>> + data->status = POPEN_KILLED;
>> + } else {
>> + /*
>> + * The status is not determined, treat as EXITED
>> + */
>> + data->exit_code = EX_SOFTWARE;
>> + data->status = POPEN_EXITED;
>> + }
>> +
>> + /*
>> + * We shouldn't close file descriptors here.
>> + * The child process may exit earlier than
>> + * the parent process finishes reading data.
>> + * In this case the reading fails.
>> + */
>> + }
>> +
>> + popen_unlock_data_list();
>> +}
>> +
>> +void
>> +popen_setup_sigchld_handler()
>> +{
>> + ev_child_init (&cw, popen_sigchld_cb, 0/*pid*/, 0);
>> + ev_child_start(loop(), &cw);
>
> Can't you set it up in popen_init instead?
>
Done.
>> +
>> +}
>> +
>> +void
>> +popen_reset_sigchld_handler()
>> +{
>> + ev_child_stop(loop(), &cw);
>> +}
>> diff --git a/src/lib/core/coio_popen.h b/src/lib/core/coio_popen.h
>> new file mode 100644
>> index 00000000..57057e97
>> --- /dev/null
>> +++ b/src/lib/core/coio_popen.h
>> @@ -0,0 +1,229 @@
>> +#ifndef TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED
>> +#define TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED
>> +/*
>> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
>> + *
>> + * Redistribution and use in source and binary forms, with or
>> + * without modification, are permitted provided that the following
>> + * conditions are met:
>> + *
>> + * 1. Redistributions of source code must retain the above
>> + * copyright notice, this list of conditions and the
>> + * following disclaimer.
>> + *
>> + * 2. Redistributions in binary form must reproduce the above
>> + * copyright notice, this list of conditions and the following
>> + * disclaimer in the documentation and/or other materials
>> + * provided with the distribution.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
>> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
>> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
>> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
>> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
>> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
>> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
>> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
>> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>> + * SUCH DAMAGE.
>> + */
>> +
>> +#if defined(__cplusplus)
>> +extern "C" {
>> +#endif /* defined(__cplusplus) */
>> +
>> +#include "evio.h"
>> +#include <stdarg.h>
>> +
>> +/**
>> + * Special values of the file descriptors passed to fio.popen
>> + * */
>
> Bad comment formatting. It's really annoying to point to such minor
> things in a review. There are other places in the patch where there's
> an extra new line or a missing space. Please always self-review your
> patches so as to make sure it doesn't happen.
Ok.
>
> Regarding the comment: it would be nice to point out why you use
> negative constants.
>
I use negative constants to distinguish them from the valid file
descriptors.
>> +enum {
>> + /**
>> + * Tells fio.popen to open a handle for
>> + * direct reading/writing.
>> + */
>> + FIO_PIPE = -2,
>
> Why do you start from -2? Why not -1?
>
The -1 reserved for "invalid file descriptor".
>> +
>> + /**
>> + * Tells fio.popen to redirect the given standard
>> + * stream into /dev/null.
>> + */
>> + FIO_DEVNULL = -3
>> +};
>> +
>> +struct popen_handle;
>> +
>> +/**
>> + * Possible status of the process started via fio.popen
>> + **/
>> +enum popen_status {
>> +
>> + /**
>> + * The process is alive and well.
>> + */
>> + POPEN_RUNNING = 1,
>> +
>> + /**
>> + * The process exited.
>> + */
>> + POPEN_EXITED = 2,
>> +
>> + /**
>> + * The process terminated by a signal.
>> + */
>> + POPEN_KILLED = 3,
>> +
>> + /**
>> + * The process terminated abnormally.
>> + */
>> + POPEN_DUMPED = 4
>> +};
>> +
>> +/**
>> + * Initializes inner data of fio.popen
>> + * */
>> +void
>> +popen_initialize();
>> +
>> +ssize_t
>> +popen_new(va_list ap);
>> +
>> +/**
>> + * The function releases allocated resources.
>> + * The function doesn't wait for the associated process
>> + * to terminate.
>> + *
>> + * @param fh handle returned by fio.popen.
>> + *
>> + * @return 0 if the process is terminated
>> + * @return -1 for an error
>> + */
>> +int
>> +popen_destroy(struct popen_handle *fh);
>> +
>> +/**
>> + * The function reads up to count bytes from the handle
>> + * associated with the child process.
>> + * Returns immediately
>> + *
>> + * @param fd handle returned by fio.popen.
>> + * @param buf a buffer to be read into
>> + * @param count size of buffer in bytes
>> + * @param read_bytes A pointer to the
>> + * variable that receives the number of bytes read.
>> + * @param source_id A pointer to the variable that receives a
>> + * source stream id, 1 - for STDOUT, 2 - for STDERR.
>> + *
>> + * @return 0 data were successfully read
>> + * @return -1 an error occurred, see errno for error code
>> + *
>> + * If there is nothing to read yet function returns -1
>> + * and errno set no EAGAIN.
>> + */
>> +int
>> +popen_read(struct popen_handle *fh, void *buf, size_t count,
>> + size_t *read_bytes, int *source_id,
>> + ev_tstamp timeout);
>> +
>> +/**
>> + * The function writes up to count bytes to the handle
>> + * associated with the child process.
>> + * Tries to write as much as possible without blocking
>> + * and immediately returns.
>> + *
>> + * @param fd handle returned by fio.popen.
>> + * @param buf a buffer to be written from
>> + * @param count size of buffer in bytes
>> + * @param written A pointer to the
>> + * variable that receives the number of bytes actually written.
>> + * If function fails the number of written bytes is undefined.
>> + *
>> + * @return 0 data were successfully written
>> + * Compare values of <count> and <written> to check
>> + * whether all data were written or not.
>> + * @return -1 an error occurred, see errno for error code
>> + *
>> + * If the writing can block, function returns -1
>> + * and errno set no EAGAIN.
>> + */
>> +int
>> +popen_write(struct popen_handle *fh, const void *buf, size_t count,
>> + size_t *written, ev_tstamp timeout);
>> +
>> +
>> +/**
>> + * The function send the specified signal
>> + * to the associated process.
>> + *
>> + * @param fd - handle returned by fio.popen.
>> + *
>> + * @return 0 on success
>> + * @return -1 an error occurred, see errno for error code
>> + */
>> +int
>> +popen_kill(struct popen_handle *fh, int signal_id);
>> +
>> +/**
>> + * Wait for the associated process to terminate.
>> + * The function doesn't release the allocated resources.
>> + *
>> + * @param fd handle returned by fio.popen.
>
> 'fd'? The argument is named 'fh'.
>
Fixed.
>> + *
>> + * @param timeout number of second to wait before function exit with error.
>> + * If function exited due to timeout the errno equals to ETIMEDOUT.
>> + *
>> + * @exit_code On success contains the exit code as a positive number
>> + * or signal id as a negative number.
>> +
>> + * @return On success function returns 0, and -1 on error.
>
> What happens if the process has already exited? Has been reaped? Has
> been waited for using this function? Please describe in the comment.
>
Ok.
>> + */
>> +int
>> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code);
>
> Please use double for timeouts so as not to include evio.h into header
> files.
>
Ok.
>> +
>> +/**
>> + * Returns descriptor of the specified file.
>> + *
>> + * @param fd - handle returned by fio.popen.
>> + * @param file_no accepts one of the
>> + * following values:
>> + * STDIN_FILENO,
>> + * STDOUT_FILENO,
>> + * STDERR_FILENO
>> + *
>> + * @return file descriptor or -1 if not available
>> + */
>> +int
>> +popen_get_std_file_handle(struct popen_handle *fh, int file_no);
>> +
>> +
>> +/**
>> + * Returns status of the associated process.
>> + *
>> + * @param fd - handle returned by fio.popen.
>> + *
>> + * @param exit_code - if not NULL accepts the exit code
>> + * if the process terminated normally or signal id
>> + * if process was termianted by signal.
>> + *
>> + * @return one of the following values:
>> + * POPEN_RUNNING if the process is alive
>> + * POPEN_EXITED if the process was terminated normally
>> + * POPEN_KILLED if the process was terminated by a signal
>> + */
>> +int
>> +popen_get_status(struct popen_handle *fh, int *exit_code);
>> +
>> +void
>> +popen_setup_sigchld_handler();
>> +void
>> +popen_reset_sigchld_handler();
>> +
>> +#if defined(__cplusplus)
>> +} /* extern "C" */
>> +#endif /* defined(__cplusplus) */
>> +
>> +#endif /* TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED */
>> diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c
>> index 908b336e..e6a1b327 100644
>> --- a/src/lib/core/coio_task.c
>> +++ b/src/lib/core/coio_task.c
>> @@ -40,6 +40,7 @@
>>
>> #include "fiber.h"
>> #include "third_party/tarantool_ev.h"
>> +#include "coio_popen.h"
>>
>> /*
>> * Asynchronous IO Tasks (libeio wrapper).
>> @@ -129,6 +130,7 @@ coio_on_stop(void *data)
>> void
>> coio_init(void)
>> {
>> + popen_initialize();
>> eio_set_thread_on_start(coio_on_start, NULL);
>> eio_set_thread_on_stop(coio_on_stop, NULL);
>> }
>> diff --git a/src/lua/fio.c b/src/lua/fio.c
>> index 806f4256..873ee165 100644
>> --- a/src/lua/fio.c
>> +++ b/src/lua/fio.c
>> @@ -46,6 +46,9 @@
>>
>> #include "lua/utils.h"
>> #include "coio_file.h"
>> +#include "coio_popen.h"
>> +
>> +static uint32_t CTID_STRUCT_POPEN_HANDLE_REF = 0;
>>
>> static inline void
>> lbox_fio_pushsyserror(struct lua_State *L)
>> @@ -703,6 +706,269 @@ lbox_fio_copyfile(struct lua_State *L)
>> return lbox_fio_pushbool(L, coio_copyfile(source, dest) == 0);
>> }
>>
>> +static bool
>> +popen_verify_argv(struct lua_State *L)
>> +{
>> + if (!lua_istable(L, 1))
>> + return false;
>> + int num = (int)lua_objlen(L, 1); /*luaL_getn(L,1);*/
>
> popen_extract_strarray does the same. Do we really need this function?
>
Well, we won't die without this function.
Removed.
>> + return num >= 1;
>> +}
>> +
>> +static char**
>> +popen_extract_strarray(struct lua_State *L, int index, int* array_size)
>
> The name is confusing. Let's please name the function to point out that
> it's used for extracting args/env from Lua stack. Something like
>
> lbox_fio_popen_get_args
>
> Also, a comment would be helpful.
>
Done.
>> +{
>> + if (lua_type(L, index) != LUA_TTABLE) {
>
> Why don't you use lua_istable here?
>
Fixed.
>> + if (array_size)
>> + *array_size = 0;
>> + return NULL;
>> + }
>> +
>> + size_t num = lua_objlen(L, index); /*luaL_getn(L,index);*/
>
> What this comment is for?
>
Removed.
In the recent version of Lua they use luaL_getn() to retrieve the i-th
element of array. Our current version doesn't have this function.
>> +
>> + char** array = calloc(num+1, sizeof(char*));
>
> Since the array is temporary and freed right upon popen_new completion,
> I think it's okay to allocate it on the region. Please take a look at
> how region_alloc and region_truncate are used.
>
Done.
>> + /*
>> + * The last item in the array must be NULL
>> + */
>> +
>> + if (array == NULL)
>> + return NULL;
>> +
>> + for(size_t i = 0; i < num; ++i) {
>> + lua_rawgeti(L, index, i+1);
>> + size_t slen = 0;
>> + const char* str = lua_tolstring(L, -1, &slen);
>> + if (!str)
>> + str = "";
>
> This looks like an invalid argument to me. I think we should raise a Lua
> exception in this case.
Fixed.
>
>> + array[i] = strdup(str);
>> + lua_pop(L, 1);
>> + }
>> +
>> + if (array_size)
>> + *array_size = num;
>
> Why do you return array_size? AFAICS you never use it.
>
Removed.
>> + /*
>> + * The number of elements doesn't include
>> + * the trailing NULL pointer
>> + */
>> + return array;
>> +}
>> +
>> +static struct popen_handle *
>> +fio_popen_get_handle(lua_State *L, int idx)
>> +{
>> + uint32_t cdata_type;
>> + struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type);
>> + if (handle_ptr == NULL || cdata_type != CTID_STRUCT_POPEN_HANDLE_REF)
>> + return NULL;
>> + return *handle_ptr;
>> +}
>> +
>> +static void
>> +fio_popen_invalidate_handle(lua_State *L, int idx)
>
> Let's please prefix all function names dealing with lua with lbox_.
>
Fixed.
>> +{
>> + uint32_t cdata_type;
>> + struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type);
>> + if (handle_ptr != NULL && cdata_type == CTID_STRUCT_POPEN_HANDLE_REF) {
>> + *handle_ptr = NULL;
>> + }
>> +}
>> +
>> +static int
>> +lbox_fio_popen_gc(lua_State *L)
>> +{
>> + struct popen_handle *handle = fio_popen_get_handle(L,1);
>> +
>> + if (handle)
>> + popen_destroy(handle);
>> + return 0;
>> +}
>> +
>> +static int
>> +lbox_fio_popen(struct lua_State *L)
>> +{
>> + if (lua_gettop(L) < 1) {
>
> Just one argument is enough? But below you use up to 5 arguments.
> Please clean up Lua stack checks.
>
Fixed.
>> + usage:
>> + luaL_error(L, "fio.popen: Invalid arguments");
>> + }
>> +
>> + if (!popen_verify_argv(L))
>> + goto usage;
>> +
>> + int stdin_fd = FIO_PIPE;
>> + int stdout_fd = FIO_PIPE;
>> + int stderr_fd = FIO_PIPE;
>> +
>> + char** argv = popen_extract_strarray(L, 1, NULL);
>> + char** env = popen_extract_strarray(L, 2, NULL);
>> +
>> + if (lua_isnumber(L, 3))
>> + stdin_fd = lua_tonumber(L, 3);
>> + if (lua_isnumber(L, 4))
>> + stdout_fd = lua_tonumber(L, 4);
>> + if (lua_isnumber(L, 5))
>> + stderr_fd = lua_tonumber(L, 5);
>> +
>> + struct popen_handle* handle = NULL;
>> + if (coio_call(popen_new, argv, env, stdin_fd, stdout_fd,
>
> coio_call should be hidden behind the popen_new implementation.
>
> Anyway, why do we need coio_call at all? Now since we use vfork,
> popen_new should be fast enough to be called right from tx.
> BTW, you wouldn't need pthread_mutex guarding popen hash if you
> didn't use coio_call.
Ok, done.
>
>> + stderr_fd, &handle) < 0) {
>> + lua_pushnil(L);
>> + lbox_fio_pushsyserror(L);
>> + return 2;
>> + } else {
>> + luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF);
>> + *(struct popen_handle **)
>> + luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF) = handle;
>> + lua_pushcfunction(L, lbox_fio_popen_gc);
>> + luaL_setcdatagc(L, -2);
>> +
>> + return 1;
>> + }
>> +}
>> +
>> +static int
>> +lbox_fio_popen_read(struct lua_State *L)
>> +{
>> + /* popen_read(self.fh, buf, size, seconds) */
>> +
>> + void* fh = fio_popen_get_handle(L, 1);
>> + uint32_t ctypeid;
>> + char *buf = *(char **)luaL_checkcdata(L, 2, &ctypeid);
>> + size_t len = lua_tonumber(L, 3);
>> + ev_tstamp seconds = lua_tonumber(L, 4);
>> +
>> + if (!len) {
>> + lua_pushinteger(L, 0);
>> + lua_pushinteger(L, STDOUT_FILENO);
>> + return 2;
>> + }
>> +
>> + int output_number = 0;
>> + size_t received = 0;
>> + int rc = popen_read(fh, buf, len,
>> + &received, &output_number,
>> + seconds);
>> + if (rc == 0) { /* The reading's succeeded */
>> + lua_pushinteger(L, received);
>> + lua_pushinteger(L, output_number);
>> + return 2;
>> + } else {
>> + lua_pushnil(L);
>> + lua_pushnil(L);
>> + lbox_fio_pushsyserror(L);
>> + return 3;
>> + }
>> +}
>> +
>> +static int
>> +lbox_fio_popen_write(struct lua_State *L)
>> +{
>> + struct popen_handle* fh = fio_popen_get_handle(L, 1);
>> + const char *buf = lua_tostring(L, 2);
>> + uint32_t ctypeid = 0;
>> + if (buf == NULL)
>> + buf = *(const char **)luaL_checkcdata(L, 2, &ctypeid);
>> + size_t len = lua_tointeger(L, 3);
>> + double timeout = lua_tonumber(L, 4);
>> +
>> + size_t written = 0;
>> + int rc = popen_write(fh, buf, len,
>> + &written, timeout);
>> + if (rc == 0 && written == len) {
>> + /* The writing's succeeded */
>> + lua_pushinteger(L, (ssize_t) written);
>
> Why do you cast 'written' to ssize_t?
Fixed.
>
>> + return 1;
>> + } else {
>> + lua_pushnil(L);
>> + lbox_fio_pushsyserror(L);
>> + return 2;
>> + }
>> +}
>> +
>> +static int
>> +lbox_fio_popen_get_status(struct lua_State *L)
>> +{
>> + struct popen_handle* fh = fio_popen_get_handle(L, 1);
>> + int exit_code = 0;
>> + int res = popen_get_status(fh, &exit_code);
>> +
>> + switch (res) {
>> + case POPEN_RUNNING:
>> + lua_pushnil(L);
>> + break;
>> +
>> + case POPEN_KILLED:
>> + lua_pushinteger(L, -exit_code);
>> + break;
>> +
>> + default:
>> + lua_pushinteger(L, exit_code);
>> + break;
>> + }
>> +
>> + return 1;
>> +}
>> +
>> +static int
>> +lbox_fio_popen_get_std_file_handle(struct lua_State *L)
>> +{
>> + struct popen_handle* fh = fio_popen_get_handle(L, 1);
>> + int file_no = lua_tonumber(L, 2);
>> + int res = popen_get_std_file_handle(fh, file_no);
>> +
>> + if (res < 0)
>> + lua_pushnil(L);
>> + else
>> + lua_pushinteger(L, res);
>> + return 1;
>> +}
>> +
>> +static int
>> +lbox_fio_popen_kill(struct lua_State *L)
>> +{
>> + struct popen_handle* fh = fio_popen_get_handle(L, 1);
>> + int signal_id = lua_tonumber(L, 2);
>> +
>> + int res = popen_kill(fh, signal_id);
>> + if (res < 0){
>> + lua_pushboolean(L, false);
>> + lbox_fio_pushsyserror(L);
>> + return 2;
>> + } else {
>> + lua_pushboolean(L, true);
>> + return 1;
>> + }
>> +}
>> +
>> +static int
>> +lbox_fio_popen_wait(struct lua_State *L)
>> +{
>> + struct popen_handle *fh = fio_popen_get_handle(L, 1);
>> + assert(fh);
>> + ev_tstamp timeout = lua_tonumber(L, 2);
>> +
>> + /*
>> + * On success function returns an
>> + * exit code as a positive number
>> + * or signal id as a negative number.
>> + * If failed, rc is nul and err
>> + * contains a error message.
>> + */
>> +
>> + int exit_code =0;
>> + int res = popen_wait(fh, timeout, &exit_code);
>> + if (res < 0){
>> + lua_pushnil(L);
>> + lbox_fio_pushsyserror(L);
>> + return 2;
>> + } else {
>> + /* Release the allocated resources */
>> + popen_destroy(fh);
>> + fio_popen_invalidate_handle(L, 1);
>
> Why invalidate it now? This means we won't be able to read stdout of a
> reaped process, which looks unexpected to me. I think all resources
> should be released by gc. There should be an explicit call to close file
> descriptors. Please also see my comments to the commit message.
>
Fixed.
The cleanup is performed in a separate call of popen:shutdown().
Or in gc if shutdown() was not called.
>> +
>> + lua_pushinteger(L, exit_code);
>> + return 1;
>> + }
>> +}
>>
>>
>> void
>> @@ -747,6 +1013,13 @@ tarantool_lua_fio_init(struct lua_State *L)
>> { "listdir", lbox_fio_listdir },
>> { "fstat", lbox_fio_fstat },
>> { "copyfile", lbox_fio_copyfile, },
>> + { "popen", lbox_fio_popen },
>> + { "popen_read", lbox_fio_popen_read },
>> + { "popen_write", lbox_fio_popen_write },
>> + { "popen_get_status", lbox_fio_popen_get_status },
>> + { "popen_get_std_file_handle", lbox_fio_popen_get_std_file_handle },
>> + { "popen_kill", lbox_fio_popen_kill },
>> + { "popen_wait", lbox_fio_popen_wait },
>> { NULL, NULL }
>> };
>> luaL_register(L, NULL, internal_methods);
>> @@ -849,4 +1122,12 @@ tarantool_lua_fio_init(struct lua_State *L)
>>
>> lua_settable(L, -3);
>> lua_pop(L, 1);
>> +
>> + /* Get CTypeID for `struct tuple' */
>> + int rc = luaL_cdef(L, "struct popen_handle;");
>> + assert(rc == 0);
>> + (void) rc;
>> + CTID_STRUCT_POPEN_HANDLE_REF = luaL_ctypeid(L, "struct popen_handle &");
>> + assert(CTID_STRUCT_POPEN_HANDLE_REF != 0);
>> +
>> }
>> diff --git a/src/lua/fio.lua b/src/lua/fio.lua
>> index ba8c47ec..d07cac55 100644
>> --- a/src/lua/fio.lua
>> +++ b/src/lua/fio.lua
>> @@ -3,6 +3,8 @@
>> local fio = require('fio')
>> local ffi = require('ffi')
>> local buffer = require('buffer')
>> +local signal = require('signal')
>> +local errno = require('errno')
>>
>> ffi.cdef[[
>> int umask(int mask);
>> @@ -15,6 +17,13 @@ local const_char_ptr_t = ffi.typeof('const char *')
>> local internal = fio.internal
>> fio.internal = nil
>>
>> +fio.STDIN = 0
>> +fio.STDOUT = 1
>> +fio.STDERR = 2
>> +fio.PIPE = -2
>> +fio.DEVNULL = -3
>> +
>> +
>> local function sprintf(fmt, ...)
>> if select('#', ...) == 0 then
>> return fmt
>> @@ -206,6 +215,216 @@ fio.open = function(path, flags, mode)
>> return fh
>> end
>>
>> +local popen_methods = {}
>> +
>> +-- read stdout & stderr of the process started by fio.popen
>> +-- Usage:
>> +-- read(size) -> str, source, err
>> +-- read(buf, size) -> length, source, err
>> +-- read(size, timeout) -> str, source, err
>> +-- read(buf, size, timeout) -> length, source, err
>> +--
>> +-- timeout - number of seconds to wait (optional)
>> +-- source contains id of the stream, fio.STDOUT or fio.STDERR
>> +-- err - error message if method has failed or nil on success
>> +popen_methods.read = function(self, buf, size, timeout)
>> + if self.fh == nil then
>> + return nil, nil, 'Invalid object'
>> + end
>> +
>> + local tmpbuf
>> +
>> + if ffi.istype(const_char_ptr_t, buf) then
>> + -- ext. buffer is specified
>> + if type(size) ~= 'number' then
>> + error('fio.popen.read: invalid size argument')
>> + end
>> + timeout = timeout or -1
>> + elseif type(buf) == 'number' then
>> + -- use temp. buffer
>> + timeout = size or -1
>> + size = buf
>> +
>> + tmpbuf = buffer.ibuf()
>> + buf = tmpbuf:reserve(size)
>> + else
>> + error("fio.popen.read: invalid arguments")
>> + end
>> +
>> + local res, output_no, err = internal.popen_read(self.fh, buf, size, timeout)
>> + if res == nil then
>> + if tmpbuf ~= nil then
>> + tmpbuf:recycle()
>> + end
>> + return nil, nil, err
>> + end
>> +
>> + if tmpbuf ~= nil then
>> + tmpbuf:alloc(res)
>> + res = ffi.string(tmpbuf.rpos, tmpbuf:size())
>> + tmpbuf:recycle()
>> + end
>> + return res, output_no
>> +end
>> +
>> +-- write(str)
>> +-- write(buf, len)
>> +popen_methods.write = function(self, data, size)
>> + local timeout = -1.0
>> + if type(data) == 'table' then
>> + timeout = data.timeout or timeout
>> + size = data.size or size
>> + data = data.buf
>> + end
>> +
>> + if type(data) == 'string' then
>> + if size == nil then
>> + size = string.len(data)
>> + end
>> + elseif not ffi.istype(const_char_ptr_t, data) then
>> + data = tostring(data)
>> + size = #data
>> + end
>> +
>> + local res, err = internal.popen_write(self.fh, data, tonumber(size), tonumber(timeout))
>> + if err ~= nil then
>> + return false, err
>> + end
>> + return res >= 0
>> +end
>> +
>> +popen_methods.status = function(self)
>> + if self.fh ~= nil then
>> + return internal.popen_get_status(self.fh)
>> + else
>> + return self.exit_code
>> + end
>> +end
>> +
>> +popen_methods.stdin = function (self)
>> + if self.fh == nil then
>> + return nil, 'Invalid object'
>> + end
>> +
>> + return internal.popen_get_std_file_handle(self.fh, fio.STDIN)
>> +end
>> +
>> +popen_methods.stdout = function (self)
>> + if self.fh == nil then
>> + return nil, 'Invalid object'
>> + end
>> +
>> + return internal.popen_get_std_file_handle(self.fh, fio.STDOUT)
>> +end
>> +
>> +popen_methods.stderr = function (self)
>> + if self.fh == nil then
>> + return nil, 'Invalid object'
>> + end
>> +
>> + return internal.popen_get_std_file_handle(self.fh, fio.STDERR)
>> +end
>> +
>> +popen_methods.kill = function(self, sig)
>> + if self.fh == nil then
>> + return false, errno.strerror(errno.ESRCH)
>> + end
>> +
>> + if sig == nil then
>> + sig = 'SIGTERM'
>> + end
>> + if type(sig) == 'string' then
>> + sig = signal.c.signals[sig]
>> + if sig == nil then
>> + errno(errno.EINVAL)
>> + return false, sprintf("fio.popen.kill(): unknown signal: %s", sig)
>> + end
>> + else
>> + sig = tonumber(sig)
>> + end
>> +
>> + return internal.popen_kill(self.fh, sig)
>> +end
>> +
>> +popen_methods.wait = function(self, timeout)
>> + if self.fh == nil then
>> + return false, 'Invalid object'
>> + end
>> +
>> + if timeout == nil then
>> + timeout = -1
>> + else
>> + timeout = tonumber(timeout)
>> + end
>> +
>> + local rc, err = internal.popen_wait(self.fh, timeout)
>> + if rc ~= nil then
>> + self.exit_code = tonumber(rc)
>> + self.fh = nil
>> + return true
>> + else
>> + return false,err
>> + end
>> +end
>> +
>> +
>> +local popen_mt = { __index = popen_methods }
>> +
>> +fio.popen = function(params)
>> + local argv = params.argv
>> + local env = params.environment
>> + local hstdin = params.stdin
>> + local hstdout = params.stdout
>> + local hstderr = params.stderr
>> +
>> + if type(hstdin) == 'table' then
>> + hstdin = hstdin.fh
>> + end
>> + if type(hstdout) == 'table' then
>> + hstdout = hstdout.fh
>> + end
>> + if type(hstderr) == 'table' then
>> + hstderr = hstderr.fh
>> + end
>> +
>> + if argv == nil or
>> + type(argv) ~= 'table' or
>> + table.getn(argv) < 1 then
>> + local errmsg = [[Usage: fio.popen({parameters}),
>> +parameters - a table containing arguments to run a process.
>> +The following arguments are expected:
>> +
>> +argv - [mandatory] is a table of argument strings passed
>> + to the new program.
>> + By convention, the first of these strings should
>> + contain the filename associated with the file
>> + being executed.
>> +
>> +environment - [optional] is a table of strings,
>> + conventionally of the form key=value,
>> + which are passed as environment to the
>> + new program.
>> +
>> +stdin - [optional] overrides the child process's
>> + standard input.
>> +stdout - [optional] overrides the child process's
>> + standard output.
>> +stderr - [optional] overrides the child process's
>> + standard error output.
>> +]]
>
> AFAIR we don't typically write a detailed help on error. Just a one-line
> "Usage..." reminder. There's a documentation for the rest.
>
Fixed.
More information about the Tarantool-patches
mailing list