From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Subject: Re: [PATCH v3] core: Non-blocking io.popen References: <20190617185459.13193-1-szudin@tarantool.org> <20190621123121.g4lel4ytmmxlkjyk@esperanza> From: Stanislav Zudin Message-ID: Date: Tue, 2 Jul 2019 09:56:00 +0300 MIME-Version: 1.0 In-Reply-To: <20190621123121.g4lel4ytmmxlkjyk@esperanza> Content-Type: text/plain; charset="utf-8"; format="flowed" Content-Language: en-US Content-Transfer-Encoding: 8bit To: Vladimir Davydov Cc: tarantool-patches@freelists.org, alexander.turenko@tarantool.org List-ID: Please find the comments inplace. The patch is in the separate letter. On 21.06.2019 15:31, Vladimir Davydov wrote: > On Mon, Jun 17, 2019 at 09:54:59PM +0300, Stanislav Zudin wrote: >> Adds nonblocking implementation of popen. >> The method is available in namespace fio. >> fio.popen() returns an object providing facilities >> for dealing with standard input/output, getting >> status of the child process. >> >> Closes #4031 >> >> @TarantoolBot document >> Title: Nonblocking fio.popen >> >> handle, err = fio.popen(parameters) >> >> fio.popen starts a process and redirects its input/output. >> >> parameters - a table containing arguments to run a process. >> The following arguments are expected: >> >> argv - [mandatory] is a table of argument strings passed to >> the new program. By convention, the first of these strings should >> contain the filename associated with the file being executed. >> >> environment - [optional] is a table of strings, conventionally of the >> form key=value, which are passed as environment to the new program. >> If not specified a parent's environment is used. > > I'd rename 'argv' to 'args', 'environment' to 'env', 'parameters' to > 'opts' - IMO this would be more consistent with Tarantool interfaces. > Also, since 'args' is mandatory, I would make it a separate argument: > > fio.popen(args[, opts]) > > where opts is a table that may contain the following keys: > > env > stdin > stdout > stderr > > Example: > > fio.popen({'cat', 'file'}, {stdout = fio.DEVNULL}) > Fixed. >> >> By default stdin, stdout,stderr of the associated process are >> available for writing/reading using object's methods >> handle:write() and handle:read(). >> One can override default behavior to redirect streams to/from file >> or to input/output of another process or to the default input/output >> of the parent process. >> >> stdin - [optional] overrides the child process's standard input. >> stdout - [optional] overrides the child process's standard output. >> stderr - [optional] overrides the the child process's standard >> error output. >> May accept one of the following values: >> Handle of the file open with fio.open() >> Handle of the standard input/output of another process, >> open by fio.popen >> A constant defining the parent's STDIN, STDOUT or STDERR. >> A constants: >> fio.PIPE - Opens a file descriptor available for reading/writing >> or redirection to/from another process. >> fio.DEVNULL - Makes fio.popen redirect the output to /dev/null. >> >> On success returns an object providing methods for >> communication with the running process. >> Returns nil if underlying functions calls fail; >> in this case the second return value, err, contains a error message. >> >> The object created by fio.popen provides the following methods: >> read() >> write() >> kill() >> wait() >> status() >> stdin() >> stdout() >> stderr() >> >> number handle:stdin() >> >> Returns handle of the child process's standard input. >> The handle is available only if it was created with >> fio.PIPE option. >> Use this handle to setup a redirection >> from file or other process to the input of the associated process. >> If handle is unavailable the method returns nil. >> >> number handle:stdout() >> number handle:stderr() >> >> Return STDOUT and STDIN of the associated process accordingly. >> See handle:stdin() for details. >> >> rc,err = handle:wait(timeout) >> >> The wait() waits for the associated process to terminate. >> >> timeout - an integer specifies number of seconds to wait. >> If the requested time has elapsed the method returns false, >> the second return value, err, contains a error message. >> To distinguish timeout from the the other errors use errno. >> If timeout is nil, the method waits infinitely until >> the associated process is terminated. >> On success function returns true. >> If failed, rc is false and err contains a error message. >> >> If the associated process is terminated, one can use the following >> methods get the exit status: >> >> rc = handle:status() > > Looking at the code, I see that one can't read() the remaining output > from stdout/stderr pipe after wait(). I thought we'd agreed to allow > that. > I was sure that we agreed that wait() is a final method :) Anyway, now it's possible to read after wait(). >> >> returns nil if process is still running >> == 0 if the process exited normally >> error code > 0 if the process terminated with an error >> -signal if the process was killed by a signal >> >> rc, err = handle:kill(sig) >> >> The kill() sends a specified signal to the associated process. >> On success the method returns true, >> if failed - false and error message. >> If the sig is nil the default "SIGTERM" is being sent to the process. >> If the signal is unknown then the method fails (with error EINVAL). >> The method accepts string names of signal as well as their numeric >> values. > > Why string literals? Why not simply define constants in the signal > module, like signal.SIGTERM, signal.SIGKILL, etc. IMO it would be more > convenient, compare: > > p:kill('SIGTERM') > p:wait() > if p:status() == signal.c.signals['SIGTERM'] ... > > and > > p:kill(signal.SIGTERM) > p:wait() > if (p:status() == signal.SIGTERM) ... > Fixed. >> >> rc,src,err = handle:read(buffer,size,timeout) >> >> read stdout & stderr of the process started by fio.popen >> Usage: >> read(size) -> str, source, err >> read(buf, size) -> length, source, err >> read(size, timeout) -> str, source, err >> read(buf, size, timeout) -> length, source, err > > No option to read until EOF? > It's a pipe, we can't use fstat() to prepare buffer enough size to read until EOF. So it's a user's responsibility to choose the size of buffer and repeat reading till the end. >> >> timeout - number of seconds to wait (optional) >> source contains id of the stream, fio.STDOUT or fio.STDERR >> err - error message if method has failed or nil on success >> >> rc, err = handle:write(buf[, size]) >> rc, err = handle:write(opts), where opts are: >> * buf >> * size >> * timeout > > So write may take opts while read may not. Looks inconsistent. Please > change the API so that read() and write() have similar signatures. > Done. >> >> Writes specified number of bytes >> On success returns number of written bytes. >> If failed the rc is nil and err contains an error message. > > Please mention that 'read' and 'write' are only available if > stdin/stdout has been redirected to a pipe. > Done > I expected to also see explicit 'close' method to close stdin/stdout in > case they are redirected to a pipe. Here's why we need it. Suppose you > want to grep something in a string you have defined in your code. So you > run popen({'grep', 'what'}) and feed the string to it with write(). Then > you read the output and expect the program to terminate. However, it > won't, because grep doesn't exit until it receives EOF. That's why we > need to be able to close stdin: > > p = popen({'grep', 'what'}) > p:write(my_str) > p:close(STDIN) > result = p:read() > > Note, we need to be able to explicitly specify which end we want to > close (STDIN, STDOUT or STDERR or all of them), similarly to shutdown(2) > system call. > Sounds reasonable. Done. >> >> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt >> index 33b64f6a..96f1d751 100644 >> --- a/src/CMakeLists.txt >> +++ b/src/CMakeLists.txt >> @@ -120,6 +120,7 @@ set (server_sources >> lua/string.c >> lua/buffer.c >> lua/swim.c >> + lua/lua_signal.c > > Why not simply lua/signal.c? To avoid possible conflicts with system files. > >> ${lua_sources} >> ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc >> ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c >> diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt >> index 66e430a2..64152f08 100644 >> --- a/src/lib/core/CMakeLists.txt >> +++ b/src/lib/core/CMakeLists.txt >> @@ -27,8 +27,16 @@ set(core_sources >> mpstream.c >> port.c >> decimal.c >> + coio_popen.c >> ) >> >> +# Disable gcc compiler error >> +if (CMAKE_COMPILER_IS_GNUCC) >> + set_source_files_properties(coio_popen.c PROPERTIES COMPILE_FLAGS >> + -Wno-clobbered) >> +endif() >> + >> + > > I got no compilation error after removing this. I have gcc 6.3.0. > Please explain what happens without this option. I want to try to > figure out if we can fix it somehow else, without patching cmake. > if you build release version using cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WERROR=ON You're getting the following errors: /home/szudin/work/ttool03/src/lib/core/coio_popen.c: In function ‘popen_new_impl’: /home/szudin/work/ttool03/src/lib/core/coio_popen.c:302:7: error: variable ‘popen_list_locked’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] bool popen_list_locked = false; ^~~~~~~~~~~~~~~~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c:158:23: error: variable ‘handle’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] struct popen_handle *handle = ^~~~~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c:299:36: error: argument ‘env’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] popen_new_impl(char **argv, char **env, ^~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:6: error: argument ‘stdin_fd’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] int stdin_fd, int stdout_fd, int stderr_fd) ^~~~~~~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:20: error: argument ‘stdout_fd’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] int stdin_fd, int stdout_fd, int stderr_fd) ^~~~~~~~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c:300:35: error: argument ‘stderr_fd’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] int stdin_fd, int stdout_fd, int stderr_fd) ^~~~~~~~~ /home/szudin/work/ttool03/src/lib/core/coio_popen.c: At top level: cc1: error: unrecognized command line option ‘-Wno-cast-function-type’ [-Werror] cc1: all warnings being treated as errors src/lib/core/CMakeFiles/core.dir/build.make:734: recipe for target 'src/lib/core/CMakeFiles/core.dir/coio_popen.c.o' failed make[2]: *** [src/lib/core/CMakeFiles/core.dir/coio_popen.c.o] Error 1 CMakeFiles/Makefile2:3518: recipe for target 'src/lib/core/CMakeFiles /core.dir/all' failed make[1]: *** [src/lib/core/CMakeFiles/core.dir/all] Error 2 The version of gcc doesn't matter. The error is reproducible on the travis as well as on a local gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 >> if (TARGET_OS_NETBSD) >> # A workaround for "undefined reference to `__gcc_personality_v0'" >> # on x86_64-rumprun-netbsd-gcc >> diff --git a/src/lib/core/coio_popen.c b/src/lib/core/coio_popen.c >> new file mode 100644 >> index 00000000..dcc50136 >> --- /dev/null >> +++ b/src/lib/core/coio_popen.c >> @@ -0,0 +1,827 @@ >> +/* >> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file. >> + * >> + * Redistribution and use in source and binary forms, with or >> + * without modification, are permitted provided that the following >> + * conditions are met: >> + * >> + * 1. Redistributions of source code must retain the above >> + * copyright notice, this list of conditions and the >> + * following disclaimer. >> + * >> + * 2. Redistributions in binary form must reproduce the above >> + * copyright notice, this list of conditions and the following >> + * disclaimer in the documentation and/or other materials >> + * provided with the distribution. >> + * >> + * THIS SOFTWARE IS PROVIDED BY ``AS IS'' AND >> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED >> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL >> + * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, >> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL >> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF >> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR >> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF >> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF >> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF >> + * SUCH DAMAGE. >> + */ >> + >> +#include >> +#include >> +#include "coio_popen.h" >> +#include "coio_task.h" >> +#include "fiber.h" >> +#include "fio.h" >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +/* >> + * On OSX this global variable is not declared >> + * in >> + */ >> +extern char **environ; >> + >> + >> +struct popen_handle { >> + /* Process id */ >> + pid_t pid; >> + >> + /* >> + * Three descriptors: >> + * [0] write to stdin of the child process >> + * [1] read from stdout of the child process >> + * [2] read from stderr of the child process >> + * Valid only for pipe. > > Otherwise they are set to what? -1 I assume? Please mention in the > comment. > yes, they are. Fixed the comment. >> + */ >> + int fd[3]; >> + >> + /* >> + * Handle to /dev/null. >> + */ >> + int devnull_fd; > > Do we need to open /dev/null per each popen? Can't we use the same file > descriptor for all child processes? > Ok, the /dev/null is shared between all child processes. >> + >> + /* >> + * Current process status. >> + * The SIGCHLD handler changes this status. >> + */ >> + enum popen_status status; >> + >> + /* >> + * Exit status of the associated process >> + * or number of the signal that caused the >> + * associated process to terminate. >> + */ >> + int exit_code; > > This is confusing. First, 'status' is used for reporting exit code or > signal in Lua, but here it's something completely different. Second > 'exit_code' isn't necessarily an exit code, but -signo. May be, it would > be more readable to write something like this instead: > > /* Set to true if the process has terminated. */ > bool terminated; > /* > * If the process has terminated, this variable stores > * the exit code if the process exited normally, by > * calling exit(), or -signo if the process was killed > * by a signal. > */ > int status; > Done. >> +}; >> + >> +/* >> + * Map: (pid) => (popen_handle *) >> + */ >> +#define mh_name _popen_storage >> +#define mh_key_t pid_t >> +#define mh_node_t struct popen_handle* >> +#define mh_arg_t void * >> +#define mh_hash(a, arg) ((*a)->pid) >> +#define mh_hash_key(a, arg) (a) >> +#define mh_cmp(a, b, arg) (((*a)->pid) != ((*b)->pid)) >> +#define mh_cmp_key(a, b, arg) ((a) != ((*b)->pid)) >> +#define MH_SOURCE 1 >> +#include "salad/mhash.h" >> + >> + >> +static struct mh_popen_storage_t* popen_hash_table = NULL; > > According to our coding style, we put * closer to variable name, in > C-stle. There quite a few places where you put it in C++-style, i.e. > closer to type name. Please fix. > ok >> + >> +void >> +popen_initialize() > > Please rename to popen_init. There should be popen_free to clean up > objects allocated by it. > Done. >> + >> + popen_hash_table = mh_popen_storage_new(); >> +} >> + >> +static void >> +popen_lock_data_list() >> +{ >> + pthread_mutex_lock(&mutex); >> +} >> + >> +static void >> +popen_unlock_data_list() >> +{ >> + pthread_mutex_unlock(&mutex); >> +} >> + >> +static void >> +popen_append_to_list(struct popen_handle *data) >> +{ >> + struct popen_handle **old = NULL; >> + mh_int_t id = mh_popen_storage_put(popen_hash_table, >> + (const struct popen_handle **)&data, &old, NULL); >> + (void)id; >> +} >> + >> +static struct popen_handle * >> +popen_lookup_data_by_pid(pid_t pid) >> +{ >> + mh_int_t pos = mh_popen_storage_find(popen_hash_table, pid, NULL); >> + if (pos == mh_end(popen_hash_table)) >> + return NULL; >> + else { >> + struct popen_handle ** ptr = >> + mh_popen_storage_node(popen_hash_table, pos); >> + return *ptr; >> + } >> +} >> + >> +static void >> +popen_exclude_from_list(struct popen_handle *data) >> +{ >> + mh_popen_storage_remove(popen_hash_table, >> + (const struct popen_handle **)&data, NULL); >> +} > > What's the point in this one-liners each of which is used exactly once? > IMO they only obfuscate the code. Please fold them in. > Done. >> + >> +static struct popen_handle * >> +popen_data_new() >> +{ >> + struct popen_handle *data = >> + (struct popen_handle *)calloc(1, sizeof(*data)); > > Please handle OOM here and everywhere else. It might look pointless, > but we have to until https://github.com/tarantool/tarantool/issues/3534 > is fixed. > Done. >> + data->fd[0] = -1; >> + data->fd[1] = -1; >> + data->fd[2] = -1; >> + data->devnull_fd = -1; >> + data->status = POPEN_RUNNING; >> + return data; >> +} >> + >> +enum pipe_end { >> + PIPE_READ = 0, >> + PIPE_WRITE = 1 >> +}; >> + >> +static inline enum pipe_end >> + popen_opposite_pipe(enum pipe_end side) > > Bad indentation. So is it an 'end' or a 'side'? Please use the same > terminology throughout the code. > Ok. >> +{ >> + return (enum pipe_end)(side ^ 1); >> + /* >> + * The code is equal to: >> + * return (side == PIPE_READ) ? PIPE_WRITE >> + * : PIPE_READ; >> + */ > > Why not write it like that than in the first place? Anyway, the helper > looks pointless as it's a trivial one liner used exactly once. Please > fold it. > Done. >> +} >> + >> +static inline bool >> +popen_create_pipe(int fd, int pipe_pair[2], enum pipe_end parent_side) >> +{ >> + if (fd == FIO_PIPE) { >> + if (pipe(pipe_pair) < 0 || >> + fcntl(pipe_pair[parent_side], F_SETFL, O_NONBLOCK) < 0) { >> + return false; >> + } >> + } >> + return true; >> +} >> + >> +static inline void >> +popen_close_child_fd(int std_fd, int pipe_pair[2], >> + int *saved_fd, enum pipe_end child_side) >> +{ >> + if (std_fd == FIO_PIPE) { >> + /* Close child's side. */ >> + close(pipe_pair[child_side]); >> + >> + enum pipe_end parent_side = popen_opposite_pipe(child_side); >> + *saved_fd = pipe_pair[parent_side]; >> + } >> +} >> + >> +static inline void >> +popen_close_pipe(int pipe_pair[2]) >> +{ >> + if (pipe_pair[0] >= 0) { >> + close(pipe_pair[0]); >> + close(pipe_pair[1]); >> + } >> +} >> + >> + >> +/** >> + * Implementation of fio.popen. >> + * The function opens a process by creating a pipe >> + * forking. >> + * >> + * @param argv - is an array of character pointers >> + * to the arguments terminated by a null pointer. >> + * >> + * @param envp - is the pointer to an array >> + * of character pointers to the environment strings. >> + * >> + * @param stdin_fd - the file handle to be redirected to the >> + * child process's STDIN. >> + * >> + * @param stdout_fd - the file handle receiving the STDOUT >> + * output of the child process. >> + * >> + * @param stderr_fd - the file handle receiving the STDERR >> + * output of the child process. >> + * >> + * The stdin_fd, stdout_fd & stderr_fd accept file descriptors >> + * from open() or the following values: >> + * >> + * FIO_PIPE - opens a pipe, binds it with child's >> + * input/output. The pipe is available for reading/writing. >> + * >> + * FIO_DEVNULL - redirects output from process to /dev/null. >> + * >> + * @return handle of the pipe for reading or writing >> + * (depends on value of type). >> + * In a case of error returns NULL. >> + */ >> +static struct popen_handle * >> +popen_new_impl(char **argv, char **envp, >> + int stdin_fd, int stdout_fd, int stderr_fd) >> +{ >> + bool popen_list_locked = false; >> + pid_t pid; >> + int pipe_rd[2] = {-1,-1}; >> + int pipe_wr[2] = {-1,-1}; >> + int pipe_er[2] = {-1,-1}; >> + errno = 0; >> + >> + struct popen_handle *data = popen_data_new(); >> + if (data == NULL) >> + return NULL; >> + >> + /* >> + * Setup a /dev/null if necessary. >> + */ >> + bool read_devnull = (stdin_fd == FIO_DEVNULL); >> + bool write_devnull = (stdout_fd == FIO_DEVNULL) || >> + (stderr_fd == FIO_DEVNULL); >> + int devnull_flags = O_RDWR; >> + if (!read_devnull) >> + devnull_flags = O_WRONLY; >> + else if (!write_devnull) >> + devnull_flags = O_RDONLY; >> + >> + if (read_devnull || write_devnull) { >> + data->devnull_fd = open("/dev/null", devnull_flags); >> + if (data->devnull_fd < 0) >> + goto on_error; >> + else { >> + if (stdin_fd == FIO_DEVNULL) >> + stdin_fd = data->devnull_fd; >> + if (stdout_fd == FIO_DEVNULL) >> + stdout_fd = data->devnull_fd; >> + if (stderr_fd == FIO_DEVNULL) >> + stderr_fd = data->devnull_fd; >> + } >> + } >> + >> + if (!popen_create_pipe(stdin_fd, pipe_rd, PIPE_WRITE)) > > The function name is a bit confusing. It's called 'create_pipe', but it > doesn't necessarily create a pipe. Same goes for popen_close_child_fd. > I think we should rename them to something like > > popen_prepare_fd - called before fork > popen_setup_child_fd - called in the child after fork > popen_setup_parent_fd - called in the parent after fork > popen_cleanup_fd - called on error > > (names aren't perfect - you might want to think more on them) > > Also, I think that you should fold all fd manipulation into those > functions, including /dev/null setup and dup. Please try. > Done. >> + goto on_error; >> + if (!popen_create_pipe(stdout_fd, pipe_wr, PIPE_READ)) >> + goto on_error; >> + if (!popen_create_pipe(stderr_fd, pipe_er, PIPE_READ)) >> + goto on_error; >> + >> + /* >> + * Prepare data for the child process. >> + * There must be no branches, to avoid compiler >> + * error: "argument ‘xxx’ might be clobbered by >> + * ‘longjmp’ or ‘vfork’ [-Werror=clobbered]". >> + */ >> + if (envp == NULL) >> + envp = environ; >> + >> + /* Handles to be closed in child process */ >> + int close_fd[3] = {-1, -1, -1}; >> + /* Handles to be duplicated in child process */ >> + int dup_fd[3] = {-1, -1, -1}; >> + /* Handles to be closed after dup2 in child process */ >> + int close_after_dup_fd[3] = {-1, -1, -1}; >> + >> + if (stdin_fd == FIO_PIPE) { >> + close_fd[STDIN_FILENO] = pipe_rd[PIPE_WRITE]; >> + dup_fd[STDIN_FILENO] = pipe_rd[PIPE_READ]; >> + } else if (stdin_fd != STDIN_FILENO) { >> + dup_fd[STDIN_FILENO] = stdin_fd; >> + close_after_dup_fd[STDIN_FILENO] = stdin_fd; >> + } >> + >> + if (stdout_fd == FIO_PIPE) { >> + close_fd[STDOUT_FILENO] = pipe_wr[PIPE_READ]; >> + dup_fd[STDOUT_FILENO] = pipe_wr[PIPE_WRITE]; >> + } else if (stdout_fd != STDOUT_FILENO){ >> + dup_fd[STDOUT_FILENO] = stdout_fd; >> + if (stdout_fd != STDERR_FILENO) >> + close_after_dup_fd[STDOUT_FILENO] = stdout_fd; >> + } >> + >> + if (stderr_fd == FIO_PIPE) { >> + close_fd[STDERR_FILENO] = pipe_er[PIPE_READ]; >> + dup_fd[STDERR_FILENO] = pipe_er[PIPE_WRITE]; >> + } else if (stderr_fd != STDERR_FILENO) { >> + dup_fd[STDERR_FILENO] = stderr_fd; >> + if (stderr_fd != STDOUT_FILENO) >> + close_after_dup_fd[STDERR_FILENO] = stderr_fd; >> + } >> + >> + >> + popen_lock_data_list(); >> + popen_list_locked = true; >> + >> + pid = vfork(); > > I thought we'd agreed to block all signals in the parent before vfork() > and unblock them right after vfork(), so as to prevent the child from > corrupting parent's memory while handling a signal. > Done. >> + >> + if (pid < 0) >> + goto on_error; >> + else if (pid == 0) /* child */ { >> + /* Reset all signals to their defaults. */ >> + struct sigaction sa; >> + memset(&sa, 0, sizeof(sa)); >> + sigemptyset(&sa.sa_mask); >> + sa.sa_handler = SIG_DFL; >> + >> + if (sigaction(SIGUSR1, &sa, NULL) == -1 || >> + sigaction(SIGINT, &sa, NULL) == -1 || >> + sigaction(SIGTERM, &sa, NULL) == -1 || >> + sigaction(SIGHUP, &sa, NULL) == -1 || >> + sigaction(SIGWINCH, &sa, NULL) == -1 || >> + sigaction(SIGSEGV, &sa, NULL) == -1 || >> + sigaction(SIGFPE, &sa, NULL) == -1 || >> + sigaction(SIGCHLD, &sa, NULL) == -1) >> + exit(EX_OSERR); >> + >> + /* Unblock any signals blocked by libev. */ >> + sigset_t sigset; >> + sigfillset(&sigset); >> + if (sigprocmask(SIG_UNBLOCK, &sigset, NULL) == -1) >> + exit(EX_OSERR); > > This function is huge. Let's try to split it even more. E.g. we could > move signal setup to a helper function. > Done. >> + >> + /* Setup stdin/stdout */ >> + for(int i = 0; i < 3; ++i) { >> + if (close_fd[i] >= 0) >> + close(close_fd[i]); >> + if (dup_fd[i] >= 0) >> + dup2(dup_fd[i], i); >> + if (close_after_dup_fd[i] >= 0) >> + close(close_after_dup_fd[i]); >> + } >> + >> + execve( argv[0], argv, envp); >> + exit(EX_OSERR); >> + unreachable(); >> + } >> + >> + /* Parent process */ >> + popen_close_child_fd(stdin_fd, pipe_rd, >> + &data->fd[STDIN_FILENO], PIPE_READ); >> + popen_close_child_fd(stdout_fd, pipe_wr, >> + &data->fd[STDOUT_FILENO], PIPE_WRITE); >> + popen_close_child_fd(stderr_fd, pipe_er, >> + &data->fd[STDERR_FILENO], PIPE_WRITE); >> + >> + data->pid = pid; >> + >> +on_cleanup: >> + if (data){ >> + popen_append_to_list(data); >> + } >> + >> + if (popen_list_locked) >> + popen_unlock_data_list(); >> + >> + if (argv){ >> + for(int i = 0; argv[i] != NULL; ++i) >> + free(argv[i]); >> + free(argv); >> + } >> + if (envp && envp != environ) { >> + for(int i = 0; envp[i] != NULL; ++i) >> + free(envp[i]); >> + free(envp); >> + } > > It should be a responsibility of the caller to free environment and > args. E.g. they could be allocated on the region. Requiring them to be > allocated with malloc() obscures the function protocol. > Done. >> + >> + return data; >> + >> +on_error: >> + popen_close_pipe(pipe_rd); >> + popen_close_pipe(pipe_wr); >> + popen_close_pipe(pipe_er); >> + >> + if (data) { >> + if (data->devnull_fd >= 0) >> + close(data->devnull_fd); >> + free(data); >> + } >> + data = NULL; >> + >> + goto on_cleanup; >> + unreachable(); >> +} >> + >> +ssize_t >> +popen_new(va_list ap) >> +{ >> + char **argv = va_arg(ap, char **); >> + char **envp = va_arg(ap, char **); >> + int stdin_fd = va_arg(ap, int); >> + int stdout_fd = va_arg(ap, int); >> + int stderr_fd = va_arg(ap, int); >> + struct popen_handle **handle = va_arg(ap, struct popen_handle **); >> + >> + *handle = popen_new_impl(argv, envp, stdin_fd, stdout_fd, stderr_fd); >> + return (*handle) ? 0 : -1; >> +} >> + >> +static void >> +popen_close_handles(struct popen_handle *data) > > Sometimes you call variables referring to a popen_handle object > 'handle', sometimes 'data', sometimes 'fh'. Please be consistent. > Ok. >> +{ >> + for(int i = 0; i < 3; ++i) { >> + if (data->fd[i] >= 0) { >> + close(data->fd[i]); >> + data->fd[i] = -1; >> + } >> + } >> + if (data->devnull_fd >= 0) { >> + close(data->devnull_fd); >> + data->devnull_fd = -1; >> + } >> +} >> + >> +int >> +popen_destroy(struct popen_handle *fh) >> +{ >> + assert(fh); > > In general, assertions like this are pointless, because the code below > will crash anyway if you pass NULL for fh. > The runtime asserts are the way to specify a contract. They define and check the acceptable values and provide a readable error message on exit. >> + >> + popen_lock_data_list(); >> + popen_close_handles(fh); >> + popen_exclude_from_list(fh); >> + popen_unlock_data_list(); >> + >> + free(fh); >> + return 0; >> +} >> + >> +/** >> + * Check if an errno, returned from a sio function, means a > > sio? Bad copypaste. Fixed. > >> + * non-critical error: EAGAIN, EWOULDBLOCK, EINTR. >> + */ >> +static inline bool >> +popen_wouldblock(int err) >> +{ >> + return err == EAGAIN || err == EWOULDBLOCK || err == EINTR; >> +} >> + >> +static int >> +popen_do_read(struct popen_handle *data, void *buf, size_t count, int *source_id) > > The line's longer than 80 characters and could be split without hurting > readability. > Done. >> +{ >> + /* >> + * STDERR has higher priority, read it first. >> + */ >> + int rc = 0; >> + errno = 0; >> + int fd_count = 0; >> + if (data->fd[STDERR_FILENO] >= 0) { >> + ++fd_count; >> + rc = read(data->fd[STDERR_FILENO], buf, count); >> + >> + if (rc >= 0) >> + *source_id = STDERR_FILENO; >> + if (rc > 0) >> + return rc; >> + >> + if (rc < 0 && !popen_wouldblock(errno)) >> + return rc; >> + >> + } >> + >> + /* >> + * STDERR is not available or not ready, try STDOUT. >> + */ >> + if (data->fd[STDOUT_FILENO] >= 0) { >> + ++fd_count; >> + rc = read(data->fd[STDOUT_FILENO], buf, count); >> + >> + if (rc >= 0) { >> + *source_id = STDOUT_FILENO; >> + return rc; >> + } >> + } >> + >> + if (!fd_count) { >> + /* >> + * There are no open handles for reading. >> + */ >> + errno = EBADF; >> + rc = -1; >> + } >> + return rc; >> +} >> + >> +static void >> +popen_coio_create(struct ev_io *coio, int fd) >> +{ >> + coio->data = fiber(); >> + ev_init(coio, (ev_io_cb) fiber_schedule_cb); >> + coio->fd = fd; > > AFAICS setting coio->fd is unnecessary, because ev_io_set sets it anyway. > > Since this is a trivial two-line routine which is only used in a couple > of places, I'd inline it. > Done. >> +} >> + >> +int >> +popen_read(struct popen_handle *fh, void *buf, size_t count, >> + size_t *read_bytes, int *source_id, > > Why don't you return ssize_t, like read(2) does? Then you wouldn't need > the read_bytes argument. > Fixed. >> + ev_tstamp timeout) >> +{ >> + assert(fh); >> + if (timeout < 0.0) >> + timeout = DBL_MAX; > > We don't treat negative timeouts as infinity. Instead we pass > TIMEOUT_INFINITY, which is simply a very large number. Take > a look at lua/socket.lua for example. > Fixed. >> + >> + ev_tstamp start, delay; >> + evio_timeout_init(loop(), &start, &delay, timeout); >> + >> + struct ev_io coio_rd; >> + struct ev_io coio_er; > > In the function above you use 'pipe_rd' for STDIN while here you use > 'coio_rd' for STDOUT. This is confusing. Let's rename these variables > to coio_stdout and coio_stderr and pipe variables to pipe_stdout, etc. > Fixed. >> + popen_coio_create(&coio_er, fh->fd[STDERR_FILENO]); >> + popen_coio_create(&coio_rd, fh->fd[STDOUT_FILENO]); >> + >> + int result = 0; >> + >> + while (true) { >> + int rc = popen_do_read(fh, buf, count, source_id); >> + if (rc >= 0) { >> + *read_bytes = rc; >> + break; >> + } >> + >> + if (!popen_wouldblock(errno)) { >> + result = -1; >> + break; >> + } >> + >> + /* >> + * The handlers are not ready, yield. > > Those are not 'handlers', but 'handles' or 'descriptors'. > Ok. >> + */ >> + if (!ev_is_active(&coio_rd) && >> + fh->fd[STDOUT_FILENO] >= 0) { >> + ev_io_set(&coio_rd, fh->fd[STDOUT_FILENO], EV_READ); >> + ev_io_start(loop(), &coio_rd); >> + } >> + if (!ev_is_active(&coio_er) && >> + fh->fd[STDERR_FILENO] >= 0) { >> + ev_io_set(&coio_er, fh->fd[STDERR_FILENO], EV_READ); >> + ev_io_start(loop(), &coio_er); >> + } >> + /* >> + * Yield control to other fibers until the >> + * timeout is reached. >> + */ >> + bool is_timedout = fiber_yield_timeout(delay); >> + if (is_timedout) { >> + errno = ETIMEDOUT; >> + result = -1; >> + break; >> + } >> + >> + if (fiber_is_cancelled()) { >> + errno = EINTR; >> + result = -1; >> + break; >> + } >> + >> + evio_timeout_update(loop(), &start, &delay); >> + } >> + >> + ev_io_stop(loop(), &coio_er); >> + ev_io_stop(loop(), &coio_rd); >> + return result; >> +} >> + >> +int >> +popen_write(struct popen_handle *fh, const void *buf, size_t count, >> + size_t *written, ev_tstamp timeout) >> +{ >> + assert(fh); >> + if (count == 0) { >> + *written = 0; >> + return 0; >> + } >> + >> + if (timeout < 0.0) >> + timeout = DBL_MAX; >> + >> + if (fh->fd[STDIN_FILENO] < 0) { >> + *written = 0; >> + errno = EBADF; >> + return -1; >> + } >> + >> + ev_tstamp start, delay; >> + evio_timeout_init(loop(), &start, &delay, timeout); >> + >> + struct ev_io coio; >> + popen_coio_create(&coio, fh->fd[STDIN_FILENO]); >> + int result = 0; >> + >> + while(true) { >> + ssize_t rc = write(fh->fd[STDIN_FILENO], buf, count); >> + if (rc < 0 && !popen_wouldblock(errno)) { >> + result = -1; >> + break; >> + } >> + > >> + size_t urc = (size_t)rc; >> + >> + if (urc == count) { >> + *written = count; >> + break; >> + } >> + >> + if (rc > 0) { >> + buf += rc; >> + count -= urc; >> + } > > Let's rewrite this part without urc: > > buf += rc; > count -= rc; > if (count == 0) > break; > Done. >> + >> + /* >> + * The handlers are not ready, yield. >> + */ >> + if (!ev_is_active(&coio)) { >> + ev_io_set(&coio, fh->fd[STDIN_FILENO], EV_WRITE); >> + ev_io_start(loop(), &coio); >> + } >> + >> + /* >> + * Yield control to other fibers until the >> + * timeout is reached. >> + */ >> + bool is_timedout = fiber_yield_timeout(delay); >> + if (is_timedout) { >> + errno = ETIMEDOUT; >> + result = -1; >> + break; >> + } >> + >> + if (fiber_is_cancelled()) { >> + errno = EINTR; >> + result = -1; >> + break; > > IMHO it's uncommon to return EINTR in case the fiber is cancelled. > Why don't you use diag for propagating errors from this module? There's > SystemError for errors returned by syscalls and FiberIsCancelled error > for this case. > Fixed. >> + } >> + >> + evio_timeout_update(loop(), &start, &delay); >> + } >> + >> + ev_io_stop(loop(), &coio); >> + return result; >> +} >> + >> +int >> +popen_kill(struct popen_handle *fh, int signal_id) >> +{ >> + assert(fh); >> + return kill(fh->pid, signal_id); > > I guess we should return an error if the process happens to have been > reaped. > Done. >> +} >> + >> +int >> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code) > > What's the point to return the exit code? One can access it directly > after the function completes. > Fixed. popen_wait() returns 0 for success or -1 for error. >> +{ >> + assert(fh); >> + if (timeout < 0.0) >> + timeout = DBL_MAX; >> + >> + ev_tstamp start, delay; >> + evio_timeout_init(loop(), &start, &delay, timeout); >> + >> + int result = 0; >> + >> + while (true) { >> + /* Wait for SIGCHLD */ >> + int code = 0; >> + >> + int rc = popen_get_status(fh, &code); >> + if (rc != POPEN_RUNNING) { >> + *exit_code = (rc == POPEN_EXITED) ? code >> + : -code; >> + break; >> + } >> + >> + /* >> + * Yield control to other fibers until the >> + * timeout is reached. >> + * Let's sleep for 20 msec. >> + */ >> + fiber_yield_timeout(0.02); > > Why 20 ms? I think what you need is fiber_cond. The fiber_cond was not applicable in multithreaded environment. For now when everything works in a single thread fiber_cond can be applied. > >> + >> + if (fiber_is_cancelled()) { >> + errno = EINTR; >> + result = -1; >> + break; >> + } >> + >> + evio_timeout_update(loop(), &start, &delay); >> + bool is_timedout = (delay == 0.0); >> + if (is_timedout) { >> + errno = ETIMEDOUT; >> + result = -1; >> + break; >> + } >> + } >> + >> + return result; >> +} >> + >> +int >> +popen_get_std_file_handle(struct popen_handle *fh, int file_no) >> +{ >> + assert(fh); >> + if (file_no < STDIN_FILENO || STDERR_FILENO < file_no){ >> + errno = EINVAL; >> + return -1; >> + } >> + >> + errno = 0; >> + return fh->fd[file_no]; >> +} >> + >> +int >> +popen_get_status(struct popen_handle *fh, int *exit_code) >> +{ > > Don't see any point in having these helper function - it's okay to > access the status and fd fields directly. > It's a bad idea to reveal the implementation details. Consider popen_get_status() as a getter. >> + assert(fh); >> + errno = 0; >> + >> + if (exit_code) >> + *exit_code = fh->exit_code; >> + >> + return fh->status; >> +} >> + >> +/* >> + * evio data to control SIGCHLD >> + */ >> +static ev_child cw; >> + >> +static void >> +popen_sigchld_cb(ev_loop *loop, ev_child *watcher, int revents) >> +{ >> + (void)loop; >> + (void)revents; >> + >> + popen_lock_data_list(); >> + >> + struct popen_handle *data = popen_lookup_data_by_pid(watcher->rpid); >> + if (data) { >> + if (WIFEXITED(watcher->rstatus)) { >> + data->exit_code = WEXITSTATUS(watcher->rstatus); >> + data->status = POPEN_EXITED; >> + } else if (WIFSIGNALED(watcher->rstatus)) { >> + data->exit_code = WTERMSIG(watcher->rstatus); >> + >> + if (WCOREDUMP(watcher->rstatus)) >> + data->status = POPEN_DUMPED; > > I don't see any point at all in this status. We don't even report it. > Please remove. > Done. >> + else >> + data->status = POPEN_KILLED; >> + } else { >> + /* >> + * The status is not determined, treat as EXITED >> + */ >> + data->exit_code = EX_SOFTWARE; >> + data->status = POPEN_EXITED; >> + } >> + >> + /* >> + * We shouldn't close file descriptors here. >> + * The child process may exit earlier than >> + * the parent process finishes reading data. >> + * In this case the reading fails. >> + */ >> + } >> + >> + popen_unlock_data_list(); >> +} >> + >> +void >> +popen_setup_sigchld_handler() >> +{ >> + ev_child_init (&cw, popen_sigchld_cb, 0/*pid*/, 0); >> + ev_child_start(loop(), &cw); > > Can't you set it up in popen_init instead? > Done. >> + >> +} >> + >> +void >> +popen_reset_sigchld_handler() >> +{ >> + ev_child_stop(loop(), &cw); >> +} >> diff --git a/src/lib/core/coio_popen.h b/src/lib/core/coio_popen.h >> new file mode 100644 >> index 00000000..57057e97 >> --- /dev/null >> +++ b/src/lib/core/coio_popen.h >> @@ -0,0 +1,229 @@ >> +#ifndef TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED >> +#define TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED >> +/* >> + * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file. >> + * >> + * Redistribution and use in source and binary forms, with or >> + * without modification, are permitted provided that the following >> + * conditions are met: >> + * >> + * 1. Redistributions of source code must retain the above >> + * copyright notice, this list of conditions and the >> + * following disclaimer. >> + * >> + * 2. Redistributions in binary form must reproduce the above >> + * copyright notice, this list of conditions and the following >> + * disclaimer in the documentation and/or other materials >> + * provided with the distribution. >> + * >> + * THIS SOFTWARE IS PROVIDED BY ``AS IS'' AND >> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED >> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR >> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL >> + * OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, >> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL >> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF >> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR >> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF >> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT >> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF >> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF >> + * SUCH DAMAGE. >> + */ >> + >> +#if defined(__cplusplus) >> +extern "C" { >> +#endif /* defined(__cplusplus) */ >> + >> +#include "evio.h" >> +#include >> + >> +/** >> + * Special values of the file descriptors passed to fio.popen >> + * */ > > Bad comment formatting. It's really annoying to point to such minor > things in a review. There are other places in the patch where there's > an extra new line or a missing space. Please always self-review your > patches so as to make sure it doesn't happen. Ok. > > Regarding the comment: it would be nice to point out why you use > negative constants. > I use negative constants to distinguish them from the valid file descriptors. >> +enum { >> + /** >> + * Tells fio.popen to open a handle for >> + * direct reading/writing. >> + */ >> + FIO_PIPE = -2, > > Why do you start from -2? Why not -1? > The -1 reserved for "invalid file descriptor". >> + >> + /** >> + * Tells fio.popen to redirect the given standard >> + * stream into /dev/null. >> + */ >> + FIO_DEVNULL = -3 >> +}; >> + >> +struct popen_handle; >> + >> +/** >> + * Possible status of the process started via fio.popen >> + **/ >> +enum popen_status { >> + >> + /** >> + * The process is alive and well. >> + */ >> + POPEN_RUNNING = 1, >> + >> + /** >> + * The process exited. >> + */ >> + POPEN_EXITED = 2, >> + >> + /** >> + * The process terminated by a signal. >> + */ >> + POPEN_KILLED = 3, >> + >> + /** >> + * The process terminated abnormally. >> + */ >> + POPEN_DUMPED = 4 >> +}; >> + >> +/** >> + * Initializes inner data of fio.popen >> + * */ >> +void >> +popen_initialize(); >> + >> +ssize_t >> +popen_new(va_list ap); >> + >> +/** >> + * The function releases allocated resources. >> + * The function doesn't wait for the associated process >> + * to terminate. >> + * >> + * @param fh handle returned by fio.popen. >> + * >> + * @return 0 if the process is terminated >> + * @return -1 for an error >> + */ >> +int >> +popen_destroy(struct popen_handle *fh); >> + >> +/** >> + * The function reads up to count bytes from the handle >> + * associated with the child process. >> + * Returns immediately >> + * >> + * @param fd handle returned by fio.popen. >> + * @param buf a buffer to be read into >> + * @param count size of buffer in bytes >> + * @param read_bytes A pointer to the >> + * variable that receives the number of bytes read. >> + * @param source_id A pointer to the variable that receives a >> + * source stream id, 1 - for STDOUT, 2 - for STDERR. >> + * >> + * @return 0 data were successfully read >> + * @return -1 an error occurred, see errno for error code >> + * >> + * If there is nothing to read yet function returns -1 >> + * and errno set no EAGAIN. >> + */ >> +int >> +popen_read(struct popen_handle *fh, void *buf, size_t count, >> + size_t *read_bytes, int *source_id, >> + ev_tstamp timeout); >> + >> +/** >> + * The function writes up to count bytes to the handle >> + * associated with the child process. >> + * Tries to write as much as possible without blocking >> + * and immediately returns. >> + * >> + * @param fd handle returned by fio.popen. >> + * @param buf a buffer to be written from >> + * @param count size of buffer in bytes >> + * @param written A pointer to the >> + * variable that receives the number of bytes actually written. >> + * If function fails the number of written bytes is undefined. >> + * >> + * @return 0 data were successfully written >> + * Compare values of and to check >> + * whether all data were written or not. >> + * @return -1 an error occurred, see errno for error code >> + * >> + * If the writing can block, function returns -1 >> + * and errno set no EAGAIN. >> + */ >> +int >> +popen_write(struct popen_handle *fh, const void *buf, size_t count, >> + size_t *written, ev_tstamp timeout); >> + >> + >> +/** >> + * The function send the specified signal >> + * to the associated process. >> + * >> + * @param fd - handle returned by fio.popen. >> + * >> + * @return 0 on success >> + * @return -1 an error occurred, see errno for error code >> + */ >> +int >> +popen_kill(struct popen_handle *fh, int signal_id); >> + >> +/** >> + * Wait for the associated process to terminate. >> + * The function doesn't release the allocated resources. >> + * >> + * @param fd handle returned by fio.popen. > > 'fd'? The argument is named 'fh'. > Fixed. >> + * >> + * @param timeout number of second to wait before function exit with error. >> + * If function exited due to timeout the errno equals to ETIMEDOUT. >> + * >> + * @exit_code On success contains the exit code as a positive number >> + * or signal id as a negative number. >> + >> + * @return On success function returns 0, and -1 on error. > > What happens if the process has already exited? Has been reaped? Has > been waited for using this function? Please describe in the comment. > Ok. >> + */ >> +int >> +popen_wait(struct popen_handle *fh, ev_tstamp timeout, int *exit_code); > > Please use double for timeouts so as not to include evio.h into header > files. > Ok. >> + >> +/** >> + * Returns descriptor of the specified file. >> + * >> + * @param fd - handle returned by fio.popen. >> + * @param file_no accepts one of the >> + * following values: >> + * STDIN_FILENO, >> + * STDOUT_FILENO, >> + * STDERR_FILENO >> + * >> + * @return file descriptor or -1 if not available >> + */ >> +int >> +popen_get_std_file_handle(struct popen_handle *fh, int file_no); >> + >> + >> +/** >> + * Returns status of the associated process. >> + * >> + * @param fd - handle returned by fio.popen. >> + * >> + * @param exit_code - if not NULL accepts the exit code >> + * if the process terminated normally or signal id >> + * if process was termianted by signal. >> + * >> + * @return one of the following values: >> + * POPEN_RUNNING if the process is alive >> + * POPEN_EXITED if the process was terminated normally >> + * POPEN_KILLED if the process was terminated by a signal >> + */ >> +int >> +popen_get_status(struct popen_handle *fh, int *exit_code); >> + >> +void >> +popen_setup_sigchld_handler(); >> +void >> +popen_reset_sigchld_handler(); >> + >> +#if defined(__cplusplus) >> +} /* extern "C" */ >> +#endif /* defined(__cplusplus) */ >> + >> +#endif /* TARANTOOL_LIB_CORE_COIO_POPEN_H_INCLUDED */ >> diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c >> index 908b336e..e6a1b327 100644 >> --- a/src/lib/core/coio_task.c >> +++ b/src/lib/core/coio_task.c >> @@ -40,6 +40,7 @@ >> >> #include "fiber.h" >> #include "third_party/tarantool_ev.h" >> +#include "coio_popen.h" >> >> /* >> * Asynchronous IO Tasks (libeio wrapper). >> @@ -129,6 +130,7 @@ coio_on_stop(void *data) >> void >> coio_init(void) >> { >> + popen_initialize(); >> eio_set_thread_on_start(coio_on_start, NULL); >> eio_set_thread_on_stop(coio_on_stop, NULL); >> } >> diff --git a/src/lua/fio.c b/src/lua/fio.c >> index 806f4256..873ee165 100644 >> --- a/src/lua/fio.c >> +++ b/src/lua/fio.c >> @@ -46,6 +46,9 @@ >> >> #include "lua/utils.h" >> #include "coio_file.h" >> +#include "coio_popen.h" >> + >> +static uint32_t CTID_STRUCT_POPEN_HANDLE_REF = 0; >> >> static inline void >> lbox_fio_pushsyserror(struct lua_State *L) >> @@ -703,6 +706,269 @@ lbox_fio_copyfile(struct lua_State *L) >> return lbox_fio_pushbool(L, coio_copyfile(source, dest) == 0); >> } >> >> +static bool >> +popen_verify_argv(struct lua_State *L) >> +{ >> + if (!lua_istable(L, 1)) >> + return false; >> + int num = (int)lua_objlen(L, 1); /*luaL_getn(L,1);*/ > > popen_extract_strarray does the same. Do we really need this function? > Well, we won't die without this function. Removed. >> + return num >= 1; >> +} >> + >> +static char** >> +popen_extract_strarray(struct lua_State *L, int index, int* array_size) > > The name is confusing. Let's please name the function to point out that > it's used for extracting args/env from Lua stack. Something like > > lbox_fio_popen_get_args > > Also, a comment would be helpful. > Done. >> +{ >> + if (lua_type(L, index) != LUA_TTABLE) { > > Why don't you use lua_istable here? > Fixed. >> + if (array_size) >> + *array_size = 0; >> + return NULL; >> + } >> + >> + size_t num = lua_objlen(L, index); /*luaL_getn(L,index);*/ > > What this comment is for? > Removed. In the recent version of Lua they use luaL_getn() to retrieve the i-th element of array. Our current version doesn't have this function. >> + >> + char** array = calloc(num+1, sizeof(char*)); > > Since the array is temporary and freed right upon popen_new completion, > I think it's okay to allocate it on the region. Please take a look at > how region_alloc and region_truncate are used. > Done. >> + /* >> + * The last item in the array must be NULL >> + */ >> + >> + if (array == NULL) >> + return NULL; >> + >> + for(size_t i = 0; i < num; ++i) { >> + lua_rawgeti(L, index, i+1); >> + size_t slen = 0; >> + const char* str = lua_tolstring(L, -1, &slen); >> + if (!str) >> + str = ""; > > This looks like an invalid argument to me. I think we should raise a Lua > exception in this case. Fixed. > >> + array[i] = strdup(str); >> + lua_pop(L, 1); >> + } >> + >> + if (array_size) >> + *array_size = num; > > Why do you return array_size? AFAICS you never use it. > Removed. >> + /* >> + * The number of elements doesn't include >> + * the trailing NULL pointer >> + */ >> + return array; >> +} >> + >> +static struct popen_handle * >> +fio_popen_get_handle(lua_State *L, int idx) >> +{ >> + uint32_t cdata_type; >> + struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type); >> + if (handle_ptr == NULL || cdata_type != CTID_STRUCT_POPEN_HANDLE_REF) >> + return NULL; >> + return *handle_ptr; >> +} >> + >> +static void >> +fio_popen_invalidate_handle(lua_State *L, int idx) > > Let's please prefix all function names dealing with lua with lbox_. > Fixed. >> +{ >> + uint32_t cdata_type; >> + struct popen_handle **handle_ptr = luaL_checkcdata(L, idx, &cdata_type); >> + if (handle_ptr != NULL && cdata_type == CTID_STRUCT_POPEN_HANDLE_REF) { >> + *handle_ptr = NULL; >> + } >> +} >> + >> +static int >> +lbox_fio_popen_gc(lua_State *L) >> +{ >> + struct popen_handle *handle = fio_popen_get_handle(L,1); >> + >> + if (handle) >> + popen_destroy(handle); >> + return 0; >> +} >> + >> +static int >> +lbox_fio_popen(struct lua_State *L) >> +{ >> + if (lua_gettop(L) < 1) { > > Just one argument is enough? But below you use up to 5 arguments. > Please clean up Lua stack checks. > Fixed. >> + usage: >> + luaL_error(L, "fio.popen: Invalid arguments"); >> + } >> + >> + if (!popen_verify_argv(L)) >> + goto usage; >> + >> + int stdin_fd = FIO_PIPE; >> + int stdout_fd = FIO_PIPE; >> + int stderr_fd = FIO_PIPE; >> + >> + char** argv = popen_extract_strarray(L, 1, NULL); >> + char** env = popen_extract_strarray(L, 2, NULL); >> + >> + if (lua_isnumber(L, 3)) >> + stdin_fd = lua_tonumber(L, 3); >> + if (lua_isnumber(L, 4)) >> + stdout_fd = lua_tonumber(L, 4); >> + if (lua_isnumber(L, 5)) >> + stderr_fd = lua_tonumber(L, 5); >> + >> + struct popen_handle* handle = NULL; >> + if (coio_call(popen_new, argv, env, stdin_fd, stdout_fd, > > coio_call should be hidden behind the popen_new implementation. > > Anyway, why do we need coio_call at all? Now since we use vfork, > popen_new should be fast enough to be called right from tx. > BTW, you wouldn't need pthread_mutex guarding popen hash if you > didn't use coio_call. Ok, done. > >> + stderr_fd, &handle) < 0) { >> + lua_pushnil(L); >> + lbox_fio_pushsyserror(L); >> + return 2; >> + } else { >> + luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF); >> + *(struct popen_handle **) >> + luaL_pushcdata(L, CTID_STRUCT_POPEN_HANDLE_REF) = handle; >> + lua_pushcfunction(L, lbox_fio_popen_gc); >> + luaL_setcdatagc(L, -2); >> + >> + return 1; >> + } >> +} >> + >> +static int >> +lbox_fio_popen_read(struct lua_State *L) >> +{ >> + /* popen_read(self.fh, buf, size, seconds) */ >> + >> + void* fh = fio_popen_get_handle(L, 1); >> + uint32_t ctypeid; >> + char *buf = *(char **)luaL_checkcdata(L, 2, &ctypeid); >> + size_t len = lua_tonumber(L, 3); >> + ev_tstamp seconds = lua_tonumber(L, 4); >> + >> + if (!len) { >> + lua_pushinteger(L, 0); >> + lua_pushinteger(L, STDOUT_FILENO); >> + return 2; >> + } >> + >> + int output_number = 0; >> + size_t received = 0; >> + int rc = popen_read(fh, buf, len, >> + &received, &output_number, >> + seconds); >> + if (rc == 0) { /* The reading's succeeded */ >> + lua_pushinteger(L, received); >> + lua_pushinteger(L, output_number); >> + return 2; >> + } else { >> + lua_pushnil(L); >> + lua_pushnil(L); >> + lbox_fio_pushsyserror(L); >> + return 3; >> + } >> +} >> + >> +static int >> +lbox_fio_popen_write(struct lua_State *L) >> +{ >> + struct popen_handle* fh = fio_popen_get_handle(L, 1); >> + const char *buf = lua_tostring(L, 2); >> + uint32_t ctypeid = 0; >> + if (buf == NULL) >> + buf = *(const char **)luaL_checkcdata(L, 2, &ctypeid); >> + size_t len = lua_tointeger(L, 3); >> + double timeout = lua_tonumber(L, 4); >> + >> + size_t written = 0; >> + int rc = popen_write(fh, buf, len, >> + &written, timeout); >> + if (rc == 0 && written == len) { >> + /* The writing's succeeded */ >> + lua_pushinteger(L, (ssize_t) written); > > Why do you cast 'written' to ssize_t? Fixed. > >> + return 1; >> + } else { >> + lua_pushnil(L); >> + lbox_fio_pushsyserror(L); >> + return 2; >> + } >> +} >> + >> +static int >> +lbox_fio_popen_get_status(struct lua_State *L) >> +{ >> + struct popen_handle* fh = fio_popen_get_handle(L, 1); >> + int exit_code = 0; >> + int res = popen_get_status(fh, &exit_code); >> + >> + switch (res) { >> + case POPEN_RUNNING: >> + lua_pushnil(L); >> + break; >> + >> + case POPEN_KILLED: >> + lua_pushinteger(L, -exit_code); >> + break; >> + >> + default: >> + lua_pushinteger(L, exit_code); >> + break; >> + } >> + >> + return 1; >> +} >> + >> +static int >> +lbox_fio_popen_get_std_file_handle(struct lua_State *L) >> +{ >> + struct popen_handle* fh = fio_popen_get_handle(L, 1); >> + int file_no = lua_tonumber(L, 2); >> + int res = popen_get_std_file_handle(fh, file_no); >> + >> + if (res < 0) >> + lua_pushnil(L); >> + else >> + lua_pushinteger(L, res); >> + return 1; >> +} >> + >> +static int >> +lbox_fio_popen_kill(struct lua_State *L) >> +{ >> + struct popen_handle* fh = fio_popen_get_handle(L, 1); >> + int signal_id = lua_tonumber(L, 2); >> + >> + int res = popen_kill(fh, signal_id); >> + if (res < 0){ >> + lua_pushboolean(L, false); >> + lbox_fio_pushsyserror(L); >> + return 2; >> + } else { >> + lua_pushboolean(L, true); >> + return 1; >> + } >> +} >> + >> +static int >> +lbox_fio_popen_wait(struct lua_State *L) >> +{ >> + struct popen_handle *fh = fio_popen_get_handle(L, 1); >> + assert(fh); >> + ev_tstamp timeout = lua_tonumber(L, 2); >> + >> + /* >> + * On success function returns an >> + * exit code as a positive number >> + * or signal id as a negative number. >> + * If failed, rc is nul and err >> + * contains a error message. >> + */ >> + >> + int exit_code =0; >> + int res = popen_wait(fh, timeout, &exit_code); >> + if (res < 0){ >> + lua_pushnil(L); >> + lbox_fio_pushsyserror(L); >> + return 2; >> + } else { >> + /* Release the allocated resources */ >> + popen_destroy(fh); >> + fio_popen_invalidate_handle(L, 1); > > Why invalidate it now? This means we won't be able to read stdout of a > reaped process, which looks unexpected to me. I think all resources > should be released by gc. There should be an explicit call to close file > descriptors. Please also see my comments to the commit message. > Fixed. The cleanup is performed in a separate call of popen:shutdown(). Or in gc if shutdown() was not called. >> + >> + lua_pushinteger(L, exit_code); >> + return 1; >> + } >> +} >> >> >> void >> @@ -747,6 +1013,13 @@ tarantool_lua_fio_init(struct lua_State *L) >> { "listdir", lbox_fio_listdir }, >> { "fstat", lbox_fio_fstat }, >> { "copyfile", lbox_fio_copyfile, }, >> + { "popen", lbox_fio_popen }, >> + { "popen_read", lbox_fio_popen_read }, >> + { "popen_write", lbox_fio_popen_write }, >> + { "popen_get_status", lbox_fio_popen_get_status }, >> + { "popen_get_std_file_handle", lbox_fio_popen_get_std_file_handle }, >> + { "popen_kill", lbox_fio_popen_kill }, >> + { "popen_wait", lbox_fio_popen_wait }, >> { NULL, NULL } >> }; >> luaL_register(L, NULL, internal_methods); >> @@ -849,4 +1122,12 @@ tarantool_lua_fio_init(struct lua_State *L) >> >> lua_settable(L, -3); >> lua_pop(L, 1); >> + >> + /* Get CTypeID for `struct tuple' */ >> + int rc = luaL_cdef(L, "struct popen_handle;"); >> + assert(rc == 0); >> + (void) rc; >> + CTID_STRUCT_POPEN_HANDLE_REF = luaL_ctypeid(L, "struct popen_handle &"); >> + assert(CTID_STRUCT_POPEN_HANDLE_REF != 0); >> + >> } >> diff --git a/src/lua/fio.lua b/src/lua/fio.lua >> index ba8c47ec..d07cac55 100644 >> --- a/src/lua/fio.lua >> +++ b/src/lua/fio.lua >> @@ -3,6 +3,8 @@ >> local fio = require('fio') >> local ffi = require('ffi') >> local buffer = require('buffer') >> +local signal = require('signal') >> +local errno = require('errno') >> >> ffi.cdef[[ >> int umask(int mask); >> @@ -15,6 +17,13 @@ local const_char_ptr_t = ffi.typeof('const char *') >> local internal = fio.internal >> fio.internal = nil >> >> +fio.STDIN = 0 >> +fio.STDOUT = 1 >> +fio.STDERR = 2 >> +fio.PIPE = -2 >> +fio.DEVNULL = -3 >> + >> + >> local function sprintf(fmt, ...) >> if select('#', ...) == 0 then >> return fmt >> @@ -206,6 +215,216 @@ fio.open = function(path, flags, mode) >> return fh >> end >> >> +local popen_methods = {} >> + >> +-- read stdout & stderr of the process started by fio.popen >> +-- Usage: >> +-- read(size) -> str, source, err >> +-- read(buf, size) -> length, source, err >> +-- read(size, timeout) -> str, source, err >> +-- read(buf, size, timeout) -> length, source, err >> +-- >> +-- timeout - number of seconds to wait (optional) >> +-- source contains id of the stream, fio.STDOUT or fio.STDERR >> +-- err - error message if method has failed or nil on success >> +popen_methods.read = function(self, buf, size, timeout) >> + if self.fh == nil then >> + return nil, nil, 'Invalid object' >> + end >> + >> + local tmpbuf >> + >> + if ffi.istype(const_char_ptr_t, buf) then >> + -- ext. buffer is specified >> + if type(size) ~= 'number' then >> + error('fio.popen.read: invalid size argument') >> + end >> + timeout = timeout or -1 >> + elseif type(buf) == 'number' then >> + -- use temp. buffer >> + timeout = size or -1 >> + size = buf >> + >> + tmpbuf = buffer.ibuf() >> + buf = tmpbuf:reserve(size) >> + else >> + error("fio.popen.read: invalid arguments") >> + end >> + >> + local res, output_no, err = internal.popen_read(self.fh, buf, size, timeout) >> + if res == nil then >> + if tmpbuf ~= nil then >> + tmpbuf:recycle() >> + end >> + return nil, nil, err >> + end >> + >> + if tmpbuf ~= nil then >> + tmpbuf:alloc(res) >> + res = ffi.string(tmpbuf.rpos, tmpbuf:size()) >> + tmpbuf:recycle() >> + end >> + return res, output_no >> +end >> + >> +-- write(str) >> +-- write(buf, len) >> +popen_methods.write = function(self, data, size) >> + local timeout = -1.0 >> + if type(data) == 'table' then >> + timeout = data.timeout or timeout >> + size = data.size or size >> + data = data.buf >> + end >> + >> + if type(data) == 'string' then >> + if size == nil then >> + size = string.len(data) >> + end >> + elseif not ffi.istype(const_char_ptr_t, data) then >> + data = tostring(data) >> + size = #data >> + end >> + >> + local res, err = internal.popen_write(self.fh, data, tonumber(size), tonumber(timeout)) >> + if err ~= nil then >> + return false, err >> + end >> + return res >= 0 >> +end >> + >> +popen_methods.status = function(self) >> + if self.fh ~= nil then >> + return internal.popen_get_status(self.fh) >> + else >> + return self.exit_code >> + end >> +end >> + >> +popen_methods.stdin = function (self) >> + if self.fh == nil then >> + return nil, 'Invalid object' >> + end >> + >> + return internal.popen_get_std_file_handle(self.fh, fio.STDIN) >> +end >> + >> +popen_methods.stdout = function (self) >> + if self.fh == nil then >> + return nil, 'Invalid object' >> + end >> + >> + return internal.popen_get_std_file_handle(self.fh, fio.STDOUT) >> +end >> + >> +popen_methods.stderr = function (self) >> + if self.fh == nil then >> + return nil, 'Invalid object' >> + end >> + >> + return internal.popen_get_std_file_handle(self.fh, fio.STDERR) >> +end >> + >> +popen_methods.kill = function(self, sig) >> + if self.fh == nil then >> + return false, errno.strerror(errno.ESRCH) >> + end >> + >> + if sig == nil then >> + sig = 'SIGTERM' >> + end >> + if type(sig) == 'string' then >> + sig = signal.c.signals[sig] >> + if sig == nil then >> + errno(errno.EINVAL) >> + return false, sprintf("fio.popen.kill(): unknown signal: %s", sig) >> + end >> + else >> + sig = tonumber(sig) >> + end >> + >> + return internal.popen_kill(self.fh, sig) >> +end >> + >> +popen_methods.wait = function(self, timeout) >> + if self.fh == nil then >> + return false, 'Invalid object' >> + end >> + >> + if timeout == nil then >> + timeout = -1 >> + else >> + timeout = tonumber(timeout) >> + end >> + >> + local rc, err = internal.popen_wait(self.fh, timeout) >> + if rc ~= nil then >> + self.exit_code = tonumber(rc) >> + self.fh = nil >> + return true >> + else >> + return false,err >> + end >> +end >> + >> + >> +local popen_mt = { __index = popen_methods } >> + >> +fio.popen = function(params) >> + local argv = params.argv >> + local env = params.environment >> + local hstdin = params.stdin >> + local hstdout = params.stdout >> + local hstderr = params.stderr >> + >> + if type(hstdin) == 'table' then >> + hstdin = hstdin.fh >> + end >> + if type(hstdout) == 'table' then >> + hstdout = hstdout.fh >> + end >> + if type(hstderr) == 'table' then >> + hstderr = hstderr.fh >> + end >> + >> + if argv == nil or >> + type(argv) ~= 'table' or >> + table.getn(argv) < 1 then >> + local errmsg = [[Usage: fio.popen({parameters}), >> +parameters - a table containing arguments to run a process. >> +The following arguments are expected: >> + >> +argv - [mandatory] is a table of argument strings passed >> + to the new program. >> + By convention, the first of these strings should >> + contain the filename associated with the file >> + being executed. >> + >> +environment - [optional] is a table of strings, >> + conventionally of the form key=value, >> + which are passed as environment to the >> + new program. >> + >> +stdin - [optional] overrides the child process's >> + standard input. >> +stdout - [optional] overrides the child process's >> + standard output. >> +stderr - [optional] overrides the child process's >> + standard error output. >> +]] > > AFAIR we don't typically write a detailed help on error. Just a one-line > "Usage..." reminder. There's a documentation for the rest. > Fixed.