[Tarantool-patches] [PATCH 2/2] popen: add popen Lua module
Igor Munkin
imun at tarantool.org
Sun Apr 19 15:38:45 MSK 2020
Sasha,
Thanks for the patch! I left several comments regarding a docbot request
with API description and usage, though they can be freely ignored.
On 18.04.20, Alexander Turenko wrote:
> Co-developed-by: Cyrill Gorcunov <gorcunov at gmail.com>
> Co-developed-by: Igor Munkin <imun at tarantool.org>
>
> Fixes #4031
>
> @TarantoolBot document
> Title: popen module
>
<snipped>
>
> Module functions
> ================
>
> The `popen` module provides two functions to create that named popen
> object `popen.shell` which is the similar to libc `popen` syscall and
Typo: I guess a colon is missing here.
> `popen.new` to create popen object with more specific options.
>
> `popen.shell(command[, mode]) -> handle, err`
> ---------------------------------------------
>
> Execute a shell command.
>
> @param command a command to run, mandatory
> @param mode communication mode, optional
> 'w' to use ph:write()
> 'r' to use ph:read()
> 'R' to use ph:read({stderr = true})
> nil inherit parent's std* file descriptors
>
> Several mode characters can be set together: 'rw', 'rRw', etc.
>
> This function is just shortcut for popen.new({command}, opts)
> with opts.{shell,setsid,group_signal} set to `true` and
> and opts.{stdin,stdout,stderr} set based on `mode` parameter.
>
> All std* streams are inherited from parent by default if it is
> not changed using mode: 'r' for stdout, 'R' for stderr, 'w' for
> stdin.
>
> Raise an error on incorrect parameters:
>
> - IllegalParams: incorrect type or value of a parameter.
>
> Return a popen handle on success.
>
> Return `nil, err` on a failure.
> @see popen.new() for possible reasons.
>
> Example:
>
> | local popen = require('popen')
> |
> | local ph = popen.shell('date', 'r')
> | local date = ph:read():rstrip()
Minor: IMHO <rstrip> call might lead to a misuse, considering there is
no word about it in description below. I guess you can drop it.
> | ph:close()
> | print(date)
>
> Execute 'sh -c date' command, read the output and close the
> popen object.
>
> `popen.new(argv[, opts]) -> handle, err`
> ----------------------------------------
>
> Execute a child program in a new process.
>
> @param argv an array of a program to run with
> command line options, mandatory;
> absolute path to the program is required
Absolute path is not required when shell option is set to <true>.
>
> @param opts table of options
>
> @param opts.stdin action on STDIN_FILENO
> @param opts.stdout action on STDOUT_FILENO
> @param opts.stderr action on STDERR_FILENO
>
> File descriptor actions:
>
> popen.opts.INHERIT (== 'inherit') [default]
> inherit the fd from the parent
> popen.opts.DEVNULL (== 'devnull')
> open /dev/null on the fd
> popen.opts.CLOSE (== 'close')
> close the fd
> popen.opts.PIPE (== 'pipe')
> feed data from/to the fd to parent
> using a pipe
>
> @param opts.env an array of environment variables to
> be used inside a process;
> - when is not set then the current
> environment is inherited;
> - if set then the environment will be
> replaced
> - if set to an empty array then the
> environment will be dropped
Minor: I looks more convenient to sort the lines above the following
way:
* when is not set <...>
* if set to an empty table <...>
* if set to a non-empty table <...>
>
> @param opts.shell (boolean, default: false)
> true run a child process via
> 'sh -c "${opts.argv}"'
> false call the executable directly
>
> @param opts.setsid (boolean, default: false)
> true start executable inside a new
> session
> false don't do that
Minor: I reword it in more clear way below:
| @param opts.setsid (boolean, default: false)
| true run the program in a new
| session
| false run the program in the
| tarantool instance's
| session
>
> @param opts.close_fds (boolean, default: true)
> true close all inherited fds from a
> parent
> false don't do that
>
> @param opts.restore_signals (boolean, default: true)
> true reset all signal actions
> modified in parent's process
> false inherit changed actions
>
> @param opts.group_signal (boolean, default: false)
> true send signal to a child process
> group (only when opts.setsid is
> enabled)
> false send signal to a child process
> only
>
> @param opts.keep_child (boolean, default: false)
> true don't send SIGKILL to a child
> process at freeing (by :close()
> or Lua GC)
> false send SIGKILL to a child process
> (or a process group if
> opts.group_signal is enabled) at
> :close() or collecting of the
> handle by Lua GC
>
> The returned handle is subject to garbage collection, but
> it may be freed explicitly using :close(). SIGKILL is sent
> to the child process at :close() / GC if it is alive and
> opts.keep_child is not set.
Minor: Let's adjust the terminology. What do you mean speaking about
"handle"? If it's a GC object, then there is no way to release it
manually or "explicitly", only via platform GC. If it's a GC object
*payload* then it's not managed by Lua GC and need to be explicitly
released via special method or the corresponding __gc metamethod.
>
> It is recommended to use opts.setsid + opts.group_signal
> if a child process may spawn its own childs and they all
> should be killed together.
>
> Note: A signal will not be sent if the child process is
> already dead: otherwise we might kill another process that
> occupies the same PID later. This means that if the child
> process dies before its own childs, the function will not
> send a signal to the process group even when opts.setsid and
> opts.group_signal are set.
>
> Use os.environ() to pass copy of current environment with
> several replacements (see example 2 below).
>
> Raise an error on incorrect parameters:
>
> - IllegalParams: incorrect type or value of a parameter.
> - IllegalParams: group signal is set, while setsid is not.
>
> Return a popen handle on success.
>
> Return `nil, err` on a failure. Possible reasons:
>
> - SystemError: dup(), fcntl(), pipe(), vfork() or close()
> fails in the parent process.
> - SystemError: (temporary restriction) the parent process
> has closed stdin, stdout or stderr.
> - OutOfMemory: unable to allocate the handle or a temporary
> buffer.
>
> Example 1:
>
> | local popen = require('popen')
> |
> | local ph = popen.new({'/bin/date'}, {
> | stdout = popen.opts.PIPE,
> | })
> | local date = ph:read():rstrip()
> | ph:close()
> | print(date) -- Thu 16 Apr 2020 01:40:56 AM MSK
>
> Execute 'date' command, read the result and close the
> popen object.
Minor: Again, no mention about rstrip.
>
> Example 2:
>
> | local popen = require('popen')
> |
> | local env = os.environ()
> | env['FOO'] = 'bar'
> |
> | local ph = popen.new({'echo "${FOO}"'}, {
> | stdout = popen.opts.PIPE,
> | shell = true,
> | env = env,
> | })
> | local res = ph:read():rstrip()
> | ph:close()
> | print(res) -- bar
>
> It is quite similar to the previous one, but sets the
> environment variable and uses shell builtin 'echo' to
> show it.
>
> Example 3:
>
> | local popen = require('popen')
> |
> | local ph = popen.new({'echo hello >&2'}, { -- !!
> | stderr = popen.opts.PIPE, -- !!
> | shell = true,
> | })
> | local res = ph:read({stderr = true}):rstrip()
> | ph:close()
> | print(res) -- hello
>
> This example demonstrates how to capture child's stderr.
>
> Example 4:
>
> | local function call_jq(stdin, filter)
Minor: <stdin> look a bit misleading. What about <input>, <json>, etc?
> | -- Start jq process, connect to stdin, stdout and stderr.
> | local jq_argv = {'/usr/bin/jq', '-M', '--unbuffered', filter}
> | local ph, err = popen.new(jq_argv, {
> | stdin = popen.opts.PIPE,
> | stdout = popen.opts.PIPE,
> | stderr = popen.opts.PIPE,
> | })
> | if ph == nil then return nil, err end
> |
> | -- Write input data to child's stdin and send EOF.
> | local ok, err = ph:write(stdin)
> | if not ok then return nil, err end
> | ph:shutdown({stdin = true})
> |
> | -- Read everything until EOF.
> | local chunks = {}
> | while true do
> | local chunk, err = ph:read()
> | if chunk == nil then
> | ph:close()
> | return nil, err
> | end
> | if chunk == '' then break end -- EOF
> | table.insert(chunks, chunk)
> | end
> |
> | -- Read diagnostics from stderr if any.
> | local err = ph:read({stderr = true})
> | if err ~= '' then
> | ph:close()
> | return nil, err
> | end
> |
> | -- Glue all chunks, strip trailing newline.
Side note: This comment is definitely needed near <rstrip> calls above.
> | return table.concat(chunks):rstrip()
> | end
>
> Demonstrates how to run a stream program (like `grep`, `sed`
> and so), write to its stdin and read from its stdout.
>
> The example assumes that input data are small enough to fit
> a pipe buffer (typically 64 KiB, but depends on a platform
> and its configuration). It will stuck in :write() for large
> data. How to handle this case: call :read() in a loop in
> another fiber (start it before a first :write()).
>
> If a process writes large text to stderr, it may fill out
> stderr pipe buffer and block in write(2, ...). So we need to
> read stderr too in another fiber to handle this case.
Typo: s/stderr too in another fiber/stderr in another fiber too/.
>
> Handle methods
> ==============
>
> `popen_handle:read([opts]) -> str, err`
> ---------------------------------------
>
> Read data from a child peer.
>
> @param handle handle of a child process
> @param opts an options table
> @param opts.stdout whether to read from stdout, boolean
> (default: true)
> @param opts.stderr whether to read from stderr, boolean
> (default: false)
> @param opts.timeout time quota in seconds
> (default: 100 years)
>
> Read data from stdout or stderr streams with @a timeout.
> By default it reads from stdout. Set @a opts.stderr to
> `true` to read from stderr.
If only <stderr> is set to true, is stdout output also captured?
>
> Raise an error on incorrect parameters or when the fiber is
> cancelled:
>
> - IllegalParams: incorrect type or value of a parameter.
> - IllegalParams: called on a closed handle.
> - IllegalParams: opts.stdout and opts.stderr are set both
> - IllegalParams: a requested IO operation is not supported
> by the handle (stdout / stderr is not
> piped).
> - IllegalParams: attempt to operate on a closed file
> descriptor.
> - FiberIsCancelled: cancelled by an outside code.
>
> Return a string on success, an empty string at EOF.
>
> Return `nil, err` on a failure. Possible reasons:
>
> - SocketError: an IO error occurs at read().
SocketError on pipe looks ridiculous to me. Maybe an IO one?
> - TimedOut: @a timeout quota is exceeded.
> - OutOfMemory: no memory space for a buffer to read into.
> - LuajitError: ("not enough memory"): no memory space for
> the Lua string.
>
<snipped>
>
> `popen_handle:shutdown(opts) -> true`
Minor: Considering the method description, <eof> looks more convenient
name here. On the other hand <eof> name might be misguiding one
considering feof(3) standart C function.
> ------------------------------------------
>
> Close parent's ends of std* fds.
>
> @param handle handle of a child process
> @param opts an options table
> @param opts.stdin close parent's end of stdin, boolean
> @param opts.stdout close parent's end of stdout, boolean
> @param opts.stderr close parent's end of stderr, boolean
>
> The main reason to use this function is to send EOF to
> child's stdin. However parent's end of stdout / stderr
> may be closed too.
>
> The function does not fail on already closed fds (idempotence).
> However it fails on attempt to close the end of a pipe that was
> never exist. In other words, only those std* options that
> were set to popen.opts.PIPE at a handle creation may be used
> here (for popen.shell: 'r' corresponds to stdout, 'R' to stderr
> and 'w' to stdin).
>
> The function does not close any fds on a failure: either all
> requested fds are closed or neither of them.
>
> Example:
>
> | local popen = require('popen')
> |
> | local ph = popen.shell('sed s/foo/bar/', 'rw')
> | ph:write('lorem foo ipsum')
> | ph:shutdown({stdin = true})
> | local res = ph:read()
> | ph:close()
> | print(res) -- lorem bar ipsum
>
> Raise an error on incorrect parameters:
>
> - IllegalParams: an incorrect handle parameter.
> - IllegalParams: called on a closed handle.
> - IllegalParams: neither stdin, stdout nor stderr is choosen.
> - IllegalParams: a requested IO operation is not supported
> by the handle (one of std* is not piped).
>
> Return `true` on success.
>
<snipped>
>
> `popen_handle:info() -> res, err`
<err> is never returned.
> ---------------------------------
>
> Return information about popen handle.
>
> @param handle a handle of a child process
>
> Raise an error on incorrect parameters:
>
> - IllegalParams: an incorrect handle parameter.
> - IllegalParams: called on a closed handle.
>
> Return information about the handle in the following
> format:
>
> {
> pid = <number> or <nil>,
> command = <string>,
> opts = <table>,
> status = <table>,
> stdin = one-of(
> popen.stream.OPENED (== 'opened'),
> popen.stream.CLOSED (== 'closed'),
> nil,
> ),
> stdout = one-of(
> popen.stream.OPENED (== 'opened'),
> popen.stream.CLOSED (== 'closed'),
> nil,
> ),
> stderr = one-of(
> popen.stream.OPENED (== 'opened'),
> popen.stream.CLOSED (== 'closed'),
> nil,
> ),
> }
>
> `pid` is a process id of the process when it is alive,
> otherwise `pid` is nil.
>
> `command` is a concatenation of space separated arguments
> that were passed to execve(). Multiword arguments are quoted.
> Quotes inside arguments are not escaped.
>
> `opts` is a table of handle options in the format of
> popen.new() `opts` parameter. `opts.env` is not shown here,
> because the environment variables map is not stored in a
> handle.
>
> `status` is a table that represents a process status in the
> following format:
>
> {
> state = one-of(
> popen.state.ALIVE (== 'alive'),
> popen.state.EXITED (== 'exited'),
> popen.state.SIGNALED (== 'signaled'),
> )
>
> -- Present when `state` is 'exited'.
> exit_code = <number>,
>
> -- Present when `state` is 'signaled'.
> signo = <number>,
> signame = <string>,
> }
>
> `stdin`, `stdout`, `stderr` reflect status of parent's end
> of a piped stream. When a stream is not piped the field is
> not present (`nil`). When it is piped, the status may be
> one of the following:
>
> - popen.stream.OPENED (== 'opened')
> - popen.stream.CLOSED (== 'closed')
>
> The status may be changed from 'opened' to 'closed'
> by :shutdown({std... = true}) call.
>
> Example 1 (tarantool console):
>
> | tarantool> require('popen').new({'/usr/bin/touch', '/tmp/foo'})
> | ---
> | - command: /usr/bin/touch /tmp/foo
> | status:
> | state: alive
> | opts:
> | stdout: inherit
> | stdin: inherit
> | group_signal: false
> | keep_child: false
> | close_fds: true
> | restore_signals: true
> | shell: false
> | setsid: false
> | stderr: inherit
> | pid: 9499
> | ...
>
> Example 2 (tarantool console):
>
> | tarantool> require('popen').shell('grep foo', 'wrR')
> | ---
> | - stdout: opened
> | command: sh -c 'grep foo'
> | stderr: opened
> | status:
> | state: alive
> | stdin: opened
> | opts:
> | stdout: pipe
> | stdin: pipe
> | group_signal: true
> | keep_child: false
> | close_fds: true
> | restore_signals: true
> | shell: true
> | setsid: true
> | stderr: pipe
> | pid: 10497
> | ...
>
> `popen_handle:wait() -> res, err`
<err> is never returned.
> ---------------------------------
>
> Wait until a child process get exited or signaled.
>
> @param handle a handle of process to wait
>
> Raise an error on incorrect parameters or when the fiber is
> cancelled:
>
> - IllegalParams: an incorrect handle parameter.
> - IllegalParams: called on a closed handle.
> - FiberIsCancelled: cancelled by an outside code.
>
> Return a process status table (the same as ph.status and
> ph.info().status). @see popen_handle:info() for the format
> of the table.
>
<snipped>
> ---
<snipped>
> --
> 2.25.0
>
--
Best regards,
IM
More information about the Tarantool-patches
mailing list