From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id A03674696C3 for ; Sun, 19 Apr 2020 15:45:52 +0300 (MSK) Date: Sun, 19 Apr 2020 15:38:45 +0300 From: Igor Munkin Message-ID: <20200419123845.GP8314@tarantool.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Subject: Re: [Tarantool-patches] [PATCH 2/2] popen: add popen Lua module List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Alexander Turenko Cc: tarantool-patches@dev.tarantool.org Sasha, Thanks for the patch! I left several comments regarding a docbot request with API description and usage, though they can be freely ignored. On 18.04.20, Alexander Turenko wrote: > Co-developed-by: Cyrill Gorcunov > Co-developed-by: Igor Munkin > > Fixes #4031 > > @TarantoolBot document > Title: popen module > > > Module functions > ================ > > The `popen` module provides two functions to create that named popen > object `popen.shell` which is the similar to libc `popen` syscall and Typo: I guess a colon is missing here. > `popen.new` to create popen object with more specific options. > > `popen.shell(command[, mode]) -> handle, err` > --------------------------------------------- > > Execute a shell command. > > @param command a command to run, mandatory > @param mode communication mode, optional > 'w' to use ph:write() > 'r' to use ph:read() > 'R' to use ph:read({stderr = true}) > nil inherit parent's std* file descriptors > > Several mode characters can be set together: 'rw', 'rRw', etc. > > This function is just shortcut for popen.new({command}, opts) > with opts.{shell,setsid,group_signal} set to `true` and > and opts.{stdin,stdout,stderr} set based on `mode` parameter. > > All std* streams are inherited from parent by default if it is > not changed using mode: 'r' for stdout, 'R' for stderr, 'w' for > stdin. > > Raise an error on incorrect parameters: > > - IllegalParams: incorrect type or value of a parameter. > > Return a popen handle on success. > > Return `nil, err` on a failure. > @see popen.new() for possible reasons. > > Example: > > | local popen = require('popen') > | > | local ph = popen.shell('date', 'r') > | local date = ph:read():rstrip() Minor: IMHO call might lead to a misuse, considering there is no word about it in description below. I guess you can drop it. > | ph:close() > | print(date) > > Execute 'sh -c date' command, read the output and close the > popen object. > > `popen.new(argv[, opts]) -> handle, err` > ---------------------------------------- > > Execute a child program in a new process. > > @param argv an array of a program to run with > command line options, mandatory; > absolute path to the program is required Absolute path is not required when shell option is set to . > > @param opts table of options > > @param opts.stdin action on STDIN_FILENO > @param opts.stdout action on STDOUT_FILENO > @param opts.stderr action on STDERR_FILENO > > File descriptor actions: > > popen.opts.INHERIT (== 'inherit') [default] > inherit the fd from the parent > popen.opts.DEVNULL (== 'devnull') > open /dev/null on the fd > popen.opts.CLOSE (== 'close') > close the fd > popen.opts.PIPE (== 'pipe') > feed data from/to the fd to parent > using a pipe > > @param opts.env an array of environment variables to > be used inside a process; > - when is not set then the current > environment is inherited; > - if set then the environment will be > replaced > - if set to an empty array then the > environment will be dropped Minor: I looks more convenient to sort the lines above the following way: * when is not set <...> * if set to an empty table <...> * if set to a non-empty table <...> > > @param opts.shell (boolean, default: false) > true run a child process via > 'sh -c "${opts.argv}"' > false call the executable directly > > @param opts.setsid (boolean, default: false) > true start executable inside a new > session > false don't do that Minor: I reword it in more clear way below: | @param opts.setsid (boolean, default: false) | true run the program in a new | session | false run the program in the | tarantool instance's | session > > @param opts.close_fds (boolean, default: true) > true close all inherited fds from a > parent > false don't do that > > @param opts.restore_signals (boolean, default: true) > true reset all signal actions > modified in parent's process > false inherit changed actions > > @param opts.group_signal (boolean, default: false) > true send signal to a child process > group (only when opts.setsid is > enabled) > false send signal to a child process > only > > @param opts.keep_child (boolean, default: false) > true don't send SIGKILL to a child > process at freeing (by :close() > or Lua GC) > false send SIGKILL to a child process > (or a process group if > opts.group_signal is enabled) at > :close() or collecting of the > handle by Lua GC > > The returned handle is subject to garbage collection, but > it may be freed explicitly using :close(). SIGKILL is sent > to the child process at :close() / GC if it is alive and > opts.keep_child is not set. Minor: Let's adjust the terminology. What do you mean speaking about "handle"? If it's a GC object, then there is no way to release it manually or "explicitly", only via platform GC. If it's a GC object *payload* then it's not managed by Lua GC and need to be explicitly released via special method or the corresponding __gc metamethod. > > It is recommended to use opts.setsid + opts.group_signal > if a child process may spawn its own childs and they all > should be killed together. > > Note: A signal will not be sent if the child process is > already dead: otherwise we might kill another process that > occupies the same PID later. This means that if the child > process dies before its own childs, the function will not > send a signal to the process group even when opts.setsid and > opts.group_signal are set. > > Use os.environ() to pass copy of current environment with > several replacements (see example 2 below). > > Raise an error on incorrect parameters: > > - IllegalParams: incorrect type or value of a parameter. > - IllegalParams: group signal is set, while setsid is not. > > Return a popen handle on success. > > Return `nil, err` on a failure. Possible reasons: > > - SystemError: dup(), fcntl(), pipe(), vfork() or close() > fails in the parent process. > - SystemError: (temporary restriction) the parent process > has closed stdin, stdout or stderr. > - OutOfMemory: unable to allocate the handle or a temporary > buffer. > > Example 1: > > | local popen = require('popen') > | > | local ph = popen.new({'/bin/date'}, { > | stdout = popen.opts.PIPE, > | }) > | local date = ph:read():rstrip() > | ph:close() > | print(date) -- Thu 16 Apr 2020 01:40:56 AM MSK > > Execute 'date' command, read the result and close the > popen object. Minor: Again, no mention about rstrip. > > Example 2: > > | local popen = require('popen') > | > | local env = os.environ() > | env['FOO'] = 'bar' > | > | local ph = popen.new({'echo "${FOO}"'}, { > | stdout = popen.opts.PIPE, > | shell = true, > | env = env, > | }) > | local res = ph:read():rstrip() > | ph:close() > | print(res) -- bar > > It is quite similar to the previous one, but sets the > environment variable and uses shell builtin 'echo' to > show it. > > Example 3: > > | local popen = require('popen') > | > | local ph = popen.new({'echo hello >&2'}, { -- !! > | stderr = popen.opts.PIPE, -- !! > | shell = true, > | }) > | local res = ph:read({stderr = true}):rstrip() > | ph:close() > | print(res) -- hello > > This example demonstrates how to capture child's stderr. > > Example 4: > > | local function call_jq(stdin, filter) Minor: look a bit misleading. What about , , etc? > | -- Start jq process, connect to stdin, stdout and stderr. > | local jq_argv = {'/usr/bin/jq', '-M', '--unbuffered', filter} > | local ph, err = popen.new(jq_argv, { > | stdin = popen.opts.PIPE, > | stdout = popen.opts.PIPE, > | stderr = popen.opts.PIPE, > | }) > | if ph == nil then return nil, err end > | > | -- Write input data to child's stdin and send EOF. > | local ok, err = ph:write(stdin) > | if not ok then return nil, err end > | ph:shutdown({stdin = true}) > | > | -- Read everything until EOF. > | local chunks = {} > | while true do > | local chunk, err = ph:read() > | if chunk == nil then > | ph:close() > | return nil, err > | end > | if chunk == '' then break end -- EOF > | table.insert(chunks, chunk) > | end > | > | -- Read diagnostics from stderr if any. > | local err = ph:read({stderr = true}) > | if err ~= '' then > | ph:close() > | return nil, err > | end > | > | -- Glue all chunks, strip trailing newline. Side note: This comment is definitely needed near calls above. > | return table.concat(chunks):rstrip() > | end > > Demonstrates how to run a stream program (like `grep`, `sed` > and so), write to its stdin and read from its stdout. > > The example assumes that input data are small enough to fit > a pipe buffer (typically 64 KiB, but depends on a platform > and its configuration). It will stuck in :write() for large > data. How to handle this case: call :read() in a loop in > another fiber (start it before a first :write()). > > If a process writes large text to stderr, it may fill out > stderr pipe buffer and block in write(2, ...). So we need to > read stderr too in another fiber to handle this case. Typo: s/stderr too in another fiber/stderr in another fiber too/. > > Handle methods > ============== > > `popen_handle:read([opts]) -> str, err` > --------------------------------------- > > Read data from a child peer. > > @param handle handle of a child process > @param opts an options table > @param opts.stdout whether to read from stdout, boolean > (default: true) > @param opts.stderr whether to read from stderr, boolean > (default: false) > @param opts.timeout time quota in seconds > (default: 100 years) > > Read data from stdout or stderr streams with @a timeout. > By default it reads from stdout. Set @a opts.stderr to > `true` to read from stderr. If only is set to true, is stdout output also captured? > > Raise an error on incorrect parameters or when the fiber is > cancelled: > > - IllegalParams: incorrect type or value of a parameter. > - IllegalParams: called on a closed handle. > - IllegalParams: opts.stdout and opts.stderr are set both > - IllegalParams: a requested IO operation is not supported > by the handle (stdout / stderr is not > piped). > - IllegalParams: attempt to operate on a closed file > descriptor. > - FiberIsCancelled: cancelled by an outside code. > > Return a string on success, an empty string at EOF. > > Return `nil, err` on a failure. Possible reasons: > > - SocketError: an IO error occurs at read(). SocketError on pipe looks ridiculous to me. Maybe an IO one? > - TimedOut: @a timeout quota is exceeded. > - OutOfMemory: no memory space for a buffer to read into. > - LuajitError: ("not enough memory"): no memory space for > the Lua string. > > > `popen_handle:shutdown(opts) -> true` Minor: Considering the method description, looks more convenient name here. On the other hand name might be misguiding one considering feof(3) standart C function. > ------------------------------------------ > > Close parent's ends of std* fds. > > @param handle handle of a child process > @param opts an options table > @param opts.stdin close parent's end of stdin, boolean > @param opts.stdout close parent's end of stdout, boolean > @param opts.stderr close parent's end of stderr, boolean > > The main reason to use this function is to send EOF to > child's stdin. However parent's end of stdout / stderr > may be closed too. > > The function does not fail on already closed fds (idempotence). > However it fails on attempt to close the end of a pipe that was > never exist. In other words, only those std* options that > were set to popen.opts.PIPE at a handle creation may be used > here (for popen.shell: 'r' corresponds to stdout, 'R' to stderr > and 'w' to stdin). > > The function does not close any fds on a failure: either all > requested fds are closed or neither of them. > > Example: > > | local popen = require('popen') > | > | local ph = popen.shell('sed s/foo/bar/', 'rw') > | ph:write('lorem foo ipsum') > | ph:shutdown({stdin = true}) > | local res = ph:read() > | ph:close() > | print(res) -- lorem bar ipsum > > Raise an error on incorrect parameters: > > - IllegalParams: an incorrect handle parameter. > - IllegalParams: called on a closed handle. > - IllegalParams: neither stdin, stdout nor stderr is choosen. > - IllegalParams: a requested IO operation is not supported > by the handle (one of std* is not piped). > > Return `true` on success. > > > `popen_handle:info() -> res, err` is never returned. > --------------------------------- > > Return information about popen handle. > > @param handle a handle of a child process > > Raise an error on incorrect parameters: > > - IllegalParams: an incorrect handle parameter. > - IllegalParams: called on a closed handle. > > Return information about the handle in the following > format: > > { > pid = or , > command = , > opts = , > status =
, > stdin = one-of( > popen.stream.OPENED (== 'opened'), > popen.stream.CLOSED (== 'closed'), > nil, > ), > stdout = one-of( > popen.stream.OPENED (== 'opened'), > popen.stream.CLOSED (== 'closed'), > nil, > ), > stderr = one-of( > popen.stream.OPENED (== 'opened'), > popen.stream.CLOSED (== 'closed'), > nil, > ), > } > > `pid` is a process id of the process when it is alive, > otherwise `pid` is nil. > > `command` is a concatenation of space separated arguments > that were passed to execve(). Multiword arguments are quoted. > Quotes inside arguments are not escaped. > > `opts` is a table of handle options in the format of > popen.new() `opts` parameter. `opts.env` is not shown here, > because the environment variables map is not stored in a > handle. > > `status` is a table that represents a process status in the > following format: > > { > state = one-of( > popen.state.ALIVE (== 'alive'), > popen.state.EXITED (== 'exited'), > popen.state.SIGNALED (== 'signaled'), > ) > > -- Present when `state` is 'exited'. > exit_code = , > > -- Present when `state` is 'signaled'. > signo = , > signame = , > } > > `stdin`, `stdout`, `stderr` reflect status of parent's end > of a piped stream. When a stream is not piped the field is > not present (`nil`). When it is piped, the status may be > one of the following: > > - popen.stream.OPENED (== 'opened') > - popen.stream.CLOSED (== 'closed') > > The status may be changed from 'opened' to 'closed' > by :shutdown({std... = true}) call. > > Example 1 (tarantool console): > > | tarantool> require('popen').new({'/usr/bin/touch', '/tmp/foo'}) > | --- > | - command: /usr/bin/touch /tmp/foo > | status: > | state: alive > | opts: > | stdout: inherit > | stdin: inherit > | group_signal: false > | keep_child: false > | close_fds: true > | restore_signals: true > | shell: false > | setsid: false > | stderr: inherit > | pid: 9499 > | ... > > Example 2 (tarantool console): > > | tarantool> require('popen').shell('grep foo', 'wrR') > | --- > | - stdout: opened > | command: sh -c 'grep foo' > | stderr: opened > | status: > | state: alive > | stdin: opened > | opts: > | stdout: pipe > | stdin: pipe > | group_signal: true > | keep_child: false > | close_fds: true > | restore_signals: true > | shell: true > | setsid: true > | stderr: pipe > | pid: 10497 > | ... > > `popen_handle:wait() -> res, err` is never returned. > --------------------------------- > > Wait until a child process get exited or signaled. > > @param handle a handle of process to wait > > Raise an error on incorrect parameters or when the fiber is > cancelled: > > - IllegalParams: an incorrect handle parameter. > - IllegalParams: called on a closed handle. > - FiberIsCancelled: cancelled by an outside code. > > Return a process status table (the same as ph.status and > ph.info().status). @see popen_handle:info() for the format > of the table. > > --- > -- > 2.25.0 > -- Best regards, IM