[tarantool-patches] Re: [PATCH v2] core: Non-blocking io.popen

Tue Jun 4 14:14:01 MSK 2019

On Mon, Jun 03, 2019 at 06:52:35PM +0300, Stanislav Zudin wrote:
> > > rc,src,err = handle:read(buffer,size)
> > > rc,src,err = handle:read2(buffer,size,seconds)
> > > 
> > > read stdout & stderr of the process started by fio.popen
> > > read() -> str, source
> > > read(buf) -> len, source
> > > read(size) -> str, source
> > > read(buf, size) -> len, source
> > > read2(seconds) -> str, source
> > > read2(buf,seconds) -> len, source
> > > read2(size,seconds) -> str, source
> > > read2(buf, size,seconds) -> len, source
> > 
> > Please use the same function name for both variants - read() - as
> > you can figure out what to do by looking at function arguments, no?
> > 
> 
> Well, actually there is an obstacle.
> The size and the seconds are both 'number' so we can't distinguish them.
> The possible way is to pass the fixed number of arguments, e.g.
> read(nil,size) instead of read(size),
> read(buf, nil, seconds) instead of read2(buf,seconds)
> and so on.
> If this approach is acceptable then i'll make the changes.

Dunno. Need to think. May be, pass seconds in a table? AKA options?
I'll talk to Alexander T. - may be he knows a better way.

> > > +static int
> > > +coio_do_nonblock_popen_read(void *fh, void *buf, size_t count,
> > > +	size_t *read_bytes, int *source_id)
> > > +{
> > > +	INIT_COEIO_FILE(eio);
> > > +	eio.popen_read.buf = buf;
> > > +	eio.popen_read.count = count;
> > > +	eio.popen_read.handle = fh;
> > > +	eio.popen_read.read_bytes = read_bytes;
> > > +	eio.popen_read.output_number = source_id;
> > > +	eio_req *req = eio_custom(coio_do_popen_read, 0,
> > > +				  coio_complete, &eio);
> > > +	return coio_wait_done(req, &eio);
> > > +}
> > > +
> > > +ssize_t
> > > +coio_popen_read(void *fh, void *buf, size_t count,
> > > +		int *output_number, int timeout)
> > > +{
> > > +	size_t received = 0;
> > > +	int rc = coio_popen_try_to_read(fh, buf, count,
> > > +		&received, output_number);
> > 
> > You call the same function coio_popen_try_to_read from both tx and coio.
> > How's that? If it's blocking, it blocks tx, which is unacceptable. If it
> > isn't, why call it from coio at all?
> 
> coio_popen_try_to_read is not blocking.

Then it doesn't need to be called via coio.

> But somebody should call fiber_yield(). And coio_file.c contains all
> facilities.
> 
> > > +int
> > > +coio_popen_wait(void *fh, int timeout, int *exit_code)
> > > +{
> > 
> > This function doesn't depend on coio_file infrastructure. Why define it
> > here at all?
> > 
> 
> popen is part of fio.
> fio is implemented in coio_* functions.

You could define it in coio_popen.c then. Actually, I think you could
move all popen stuff there and use coio_call for blocking syscalls.

> > > +	pid = fork();
> > 
> > Please use vfork().
> > 
> 
> I'm afraid it won't work because of at_fork handlers.

Why? Please elaborate.

> > > +	popen_lock_data_list();
> > 
> > This function is called from a signal handler => it might deadlock with
> > coio_popen_destroy, for instance. Please consider using evio signal
> > handlers (take a look at ev_signal_init).
> 
> evio drops all information related to the signal and just reports that the
> signal had been caught.
> And since evio is implemented upon the signalfd() the further calls of
> waitpid are useless - the information about child process was discarded.
> All you can get is the fact that the process is not running.
> 
> So i see the following ways to resolve this problem:
> 1. Refactoring of evio - to keep siginfo_t received with a signal
> (the worst idea IMO).
> 2. Use signalfd and run the own thread to dispatch SIGCHLD
> 3. Deal with the lack of child's exit code
> 4. make the wait() is mandatory, call waitpid() and do not use SIGCHLD
> handler at all.

Please take a look at ev_child.

> > > -#define fiber() cord()->fiber
> > > -#define loop() (cord()->loop)
> > > +#define fiber() (cord() ? cord()->fiber : NULL)
> > > +#define loop() (cord() ? cord()->loop : NULL)
> > 
> > Why is that? A comment would be nice to have.
> 
> It's an at_fork() consequences.

What consequences? What happens without this change and why?
Please elaborate.

> Not sure if we can safely remove at_fork handlers.
> The parent process is forking to run as a daemon.
> 
> > > +	PUSHTABLE("SIGUSR1", lua_pushinteger, SIGUSR1);
> > > +	PUSHTABLE("SIGUSR2", lua_pushinteger, SIGUSR2);
> > > +	lua_settable(L, -3); /* "signals" */
> > 
> > This could be done from Lua - signal numbers are standard AFAIK.
> Did you mean to use constants like fio.STDIN?

Yes. Don't see much point in defining them in C.

> > > diff --git a/test/box-tap/fio_popen.test.lua b/test/box-tap/fio_popen.test.lua
> > 
> > Should be app-tap? Anyway, why tap test? Normal tests are easier to
> > write and understand IMO.
> > 
> 
> Ok, let it be app-tap.
> 
> What a 'Normal' test? The ones in e.g. test/app directory?
> It's hard to find the error when result files are quite big.
> Have to use external tools, e.g. 'diff' etc.

Yes, but I find it more convenient than analyzing a tap test result
file. Please ask Alexander T. about our test policy.

> > > +-- Test 2. Run application, write to stdin, read from stdout
> > > +app1 = fio.popen({argv = {'/bin/sh', '-c', 'cat'},
> > 
> > What's the point using '/bin/sh' just to run 'cat'?
> > 
> 
> cat fails without it.

Why? It should be possible to run 'cat' directly, without the help of
'sh' AFAIU.

> > > diff --git a/third_party/libeio/eio.c b/third_party/libeio/eio.c
> > > index 7351d5dda..e433b1b3b 100644
> > > --- a/third_party/libeio/eio.c
> > > +++ b/third_party/libeio/eio.c
> > > @@ -1741,7 +1741,24 @@ static int eio_threads;
> > >   static void ecb_cold
> > >   eio_prefork()
> > >   {
> > > -    eio_threads = etp_set_max_parallel(EIO_POOL, 0);
> > > +	/*
> > > +	 * When fork() is called libeio shuts
> > > +	 * down all working threads.
> > > +	 * But it causes a deadlock if fork() was
> > > +	 * called from the libeio thread.
> > > +	 * To avoid this do not close the
> > > +	 * thread who called fork().
> > > +	 * This behaviour is acceptable for the
> > > +	 * case when fork() is immediately followed
> > > +	 * by exec().
> > > +	 * To clone a process call fork() from the
> > > +	 * main thread.
> > > +	 */
> > > +
> > > +	if (etp_is_in_pool_thread())
> > > +		eio_threads = etp_get_max_parallel(EIO_POOL);
> > > +	else
> > > +    		eio_threads = etp_set_max_parallel(EIO_POOL, 0);
> > 
> > Please investigate why it was initially done that way, because
> > I'm afraid this change might break something.
> > 
> 
> A usual precautions before fork: stop all thread before fork and resume
> after fork.

If it was just a precaution, then we could remove it altogether rather
than adding a workaround, couldn't we?

However, judging by the commit log, this behavior isn't a part of the
standard libev - it was added by us deliberately. Please check out why.
May be, the commit history and/or ticket description will shed some
light on it.

> This works fine if fork() is called in the main thread.
> Of course the current workaround has its disadvantages. But they're not
> worth mentioning in the case when fork() is followed by exec.

What are the disadvantages?

> > > diff --git a/third_party/libev/ev.c b/third_party/libev/ev.c
> > > index 6a2648591..5fa8293a1 100644
> > > --- a/third_party/libev/ev.c
> > > +++ b/third_party/libev/ev.c
> > > @@ -4214,6 +4214,8 @@ noinline
> > >   void
> > >   ev_signal_stop (EV_P_ ev_signal *w) EV_THROW
> > >   {
> > > +	if (!loop)
> > > +		return;
> > 
> > Why is that? At least deserves a comment. Better handle it at the upper
> > level.
> 
> It's an at_fork() consequences.
> See my comments above.

Please elaborate what exactly happens and why. Can we move this check to
the place where ev_signal_stop is called? It looks very suspicious here.
Especially without a comment.

> >> +static struct popen_data *
> >> +popen_lookup_data_by_pid(pid_t pid)
> >> +{
> >> +	struct popen_data *cur = popen_data_list;
> >> +	for(; cur; cur = cur->next) {
> >> +		if (cur->pid == pid)
> >> +			return cur;
> >> +	}
> >
> > Please write a comment explaining why you think linear search is fine
> > here and why we don't use some kind of hash or tree.
> >
> 
> I believe the number of simultaneously running popen processes is rather
> small - tens. In this case hash won't give a significant performance boost.

I don't really bother, but since Kostja insists on using hash, I'd
consider using mhash for mapping pid->popen_handle if I were you.
It should be pretty trivial to do without cluttering the code and
hence isn't worth arguing about IMO.

> >> +/*
> >> + * Returns next socket to read.
> >> + * Use this function when both STDOUT and STDERR outputs
> >> + * are ready for reading.
> >
> > But what does it do? How does it prioritize? Though short, this function
> > looks kinda arcane to me due to this bit manipulation magic. Please
> > clarify. However, if you switch to ev io, you'll probably won't be
> > needing it.
> >
> 
> It's just a round-robin.
> Used in a case when both STDIN & STDOUT are ready for reading while
> popen.read() returns only one result. Doesn't let any stream stuck.

TBO I don't see any point in RR scheduling in this case. I'd simply use
stderr if it's available (as it's considered high prio typically),
otherwise use stdout. If stdout gets stuck, the process will block and
hence stop writing to stderr eventually and we switch to stdout.
Anyway, worth explaining the logic behind this code in a comment.

> >> +void *
> >> +coio_popen_impl(char** argv, int argc,
> >> +	char** environment, int environment_size,
> >> +	int stdin_fd, int stdout_fd, int stderr_fd)
> >> +{
> >> +	pid_t pid;
> >> +	int socket_rd[2] = {-1,-1};
> >> +	int socket_wr[2] = {-1,-1};
> >> +	int socket_er[2] = {-1,-1};
> >> +	errno = 0;
> >> +
> >> +	struct popen_data *data = popen_data_new();
> >> +	if (data == NULL)
> >> +		return NULL;
> >> +
> >> +	/* Setup a /dev/null if necessary */
> >> +	bool read_devnull = (stdin_fd == FIO_DEVNULL);
> >> +	bool write_devnull = (stdout_fd == FIO_DEVNULL) ||
> >> +			     (stderr_fd == FIO_DEVNULL);
> >> +	int devnull_flags = O_RDWR | O_CREAT;
> >> +	if (!read_devnull)
> >> +		devnull_flags = O_WRONLY | O_CREAT;
> >> +	else if (!write_devnull)
> >> +		devnull_flags = O_RDONLY | O_CREAT;
> >> +
> >> +	if (read_devnull || write_devnull) {
> >> +		data->devnull_fd = open("/dev/null", devnull_flags, 0666);
> >
> > No need to pass mode unless you're creating a file.
> 
> It doesn't affect the existing files. Thus one line is shorted than two.
>  :-)

Oh, I think I see - you use O_CREAT. Please don't. We aren't suppose to
create /dev/null if it doesn't exist - better fail in this case.

> 
> >
> >> +		if (data->devnull_fd < 0)
> >> +			goto on_error;
> >> +		else {
> >> +			if (stdin_fd == FIO_DEVNULL)
> >> +				stdin_fd = data->devnull_fd;
> >> +			if (stdout_fd == FIO_DEVNULL)
> >> +				stdout_fd = data->devnull_fd;
> >> +			if (stderr_fd == FIO_DEVNULL)
> >> +				stderr_fd = data->devnull_fd;
> >> +		}
> >> +	}
> >> +
> >> +	if (stdin_fd == FIO_PIPE) {
> >> +		/*
> >> +		 * Enable non-blocking for the parent side
> >> +		 * and close-on-exec on the child's side.
> >> +		 * The socketpair on OSX doesn't support
> >> +		 * SOCK_NONBLOCK & SOCK_CLOEXEC flags.
> >> +		 */
> >> +		if (socketpair(AF_UNIX, SOCK_STREAM, 0, socket_rd) < 0 ||
> >
> > I overlooked it when I reviewed the patch last time. Why do you use
> > sockets? Why is a simple pipe not enough? Sockets are full-duplex
> > while in your case a simplex connection would be enough, no?
> 
> Ask Georgy, he insisted on replacing pipes by socketpair.

Once you've submitted a patch, it's your responsibility to stand up for
every aspect of each design decision made in the code. Simply listening
to anyone without understanding and, most importantly, accepting his
argumentation isn't a good idea.

Please elaborate on why using socketpair() is important. What are the
problems with pipe()? Can we circumvent the problems somehow? Because
to me socketpair() seems to be a bit of an overkill - as I said it's
bi-directional while for our purposes a uni-directional channel is
enough.

> >> +		execve( argv[0], argv,
> >> +			environment ? environment : environ);
> >> +		_exit(127);
> >
> > Why _exit(), why not exit()? If there's a reason, please explain in a
> > comment. Also, why 127? We prefer to avoid explicit numeric constants.
> > May be, you could use EXIT_FAILURE instead?
> 
> _exit() doesn't call atexit(), but in particular this case it doesn't
> matter.

Then we should use exit() so as not to raise questions.

> As to exit code i need to distinguish exit code returned by child process
> from execve error.
> The native implementation of popen does return 127.

Okay, I see. Please put this reasoning in a comment. Also, it might be
worth defining this exit code in a constant (or use an existing one if
it already exists) with a proper comment.

> >> +void
> >> +coio_popen_child_is_dead(int sig, siginfo_t *si, void *context)
> >> +{
> >> +	(void)context;	/* UNUSED */
> >
> > Why pass it at all if it's unused?
> >
> 
> Because C compiler requires the argument to be defined.
> 
> [6.9.1.5] If the declarator includes a parameter type list, the declaration
> of each parameter shall include an identifier, except for the special case
> of a parameter list consisting of a single parameter of type void, in which
> case there shall not be an identifier. No declaration list shall follow.
> 
> Changing the compiler flags is too much, IMO.

Oh, you pass this function directly to sigaction. Missed that.
IMO this isn't a good idea as it makes the popen internal API
depend on siginfo_t. I'd add a wrapper to main.cc and call
this function from it, passing only arguments it requires.

> >> +/**
> >> + * Handle SIGCHLD signal
> >> + */
> >> +void
> >> +coio_popen_child_is_dead(int sig, siginfo_t *si, void *);
> >
> > Better not use siginfo_t here - instead pass what you really need.
> > This makes the API clearer and cuts the dependency list.
> >
> 
> coio_popen_child_is_dead is passed to sigaction.
> To separate it from siginfo_t we need one more function.
> No problem, will add.

Yes, please do.

> 
> >> diff --git a/src/lua/fio.c b/src/lua/fio.c
> >> +/**
> >> + * A wrapper applicable for using with lua's GC.
> >> + * */
> >> +struct fio_popen_wrap {
> >
> > Why do you need a wrapper at all?
> 
> Right now it contains only a "handle" (a pointer to
> struct popen_data *).
> Later we may need to keep something else in the managed memory.
> It's more readable than just a pointer to pointer.

AFAIK we don't usually add such trivial wrappers to export objects to
Lua. Please ask Alexander T. as he's an expert in Lua-C interaction.

> 
> >
> >> +	void* handle;
> >> +};
> >> +static const char* fio_popen_typename = "fio.popen.data";
> >> +
> >> +static struct fio_popen_wrap *
> >> +fio_popen_get_handle_wrap(lua_State *L, int index)
> >> +{
> >> +	luaL_checktype(L, index, LUA_TUSERDATA);
> >> +	struct fio_popen_wrap *wrap = (struct fio_popen_wrap *)
> >> +		luaL_checkudata(L, index, fio_popen_typename);
> >
> > Hmm, why do you use udata? Why not plain cdata? I keep forgetting
> > the difference, to tell the truth.
> 
> The Lua finalizer deals only with "managed" memory.
> It's not aware of cdata and can't release it.

Hmm, but we use finalizers with cdata - take a look at luaT_pushtuble
and how it uses luaL_setcdatagc. Again, it's worth talking to Alex T.
re this.

> 
> >
> >> diff --git a/src/lua/fio.lua b/src/lua/fio.lua
> >> +popen_methods.do_read = function(self, buf, size, seconds)
> >> +    if size == nil or type(size) ~= 'number' then
> >> +        error('fio.popen.read: invalid size argument')
> >> +    end
> >> +    if seconds == nil or type(seconds) ~= 'number' then
> >> +        error('fio.popen.read: invalid seconds argument')
> >> +    end
> >> +
> >> +    local tmpbuf
> >> +    if not ffi.istype(const_char_ptr_t, buf) then
> >> +        tmpbuf = buffer.ibuf()
> >> +        buf = tmpbuf:reserve(size)
> >> +    end
> >> +
> >> +    local res, output_no, err = internal.popen_read(self.fh, buf, size,
> seconds)
> >> +    if res == nil then
> >> +        if tmpbuf ~= nil then
> >> +            tmpbuf:recycle()
> >> +        end
> >> +        return nil, nil, err
> >> +    end
> >> +
> >> +    if tmpbuf ~= nil then
> >> +        tmpbuf:alloc(res)
> >> +        res = ffi.string(tmpbuf.rpos, tmpbuf:size())
> >> +        tmpbuf:recycle()
> >> +    end
> >> +    return res, output_no
> >> +end
> >> +
> >> +-- read stdout & stderr of the process started by fio.popen
> >> +-- read() -> str, source
> >> +-- read(buf) -> len, source
> >> +-- read(size) -> str, source
> >> +-- read(buf, size) -> len, source
> >> +-- source contains id of the stream, fio.STDOUT or fio.STDERR
> >> +popen_methods.read = function(self, buf, size)
> >> +    if self.fh == nil then
> >> +        return nil, nil, 'Invalid object'
> >> +    end
> >> +
> >> +    if buf == nil and size == nil then
> >> +        -- read()
> >> +        size = 512
> >
> > Why 512? Aren't we supposed to read everything till EOF in this case?
> 
> The behaviour was borrowed from fio.read()

Hmm, I don't see any 512 in fio.read().

> The reading everything till EOF causes problems in reading from both
> sources: STDIN and STDOUT. While you waiting one source another one may
> get stuck.

I don't quite understand. We don't really wait on a source - we use
select and non-blocking reads. Once a source is ready, we read from it.
If another is full, well, okay, we continue reading the first one until
there's data in it. Am I missing something obvious?

> 
> >
> >> +    elseif ffi.istype(const_char_ptr_t, buf) and size == nil then
> >> +        -- read(buf)
> >> +        size = 512
> >> +    elseif not ffi.istype(const_char_ptr_t, buf) and
> >> +            buf ~= nil and size == nil then
> >> +        -- read(size)
> >> +        size = tonumber(buf)
> >> +        buf = nil
> >> +    elseif ffi.istype(const_char_ptr_t, buf) and size ~= nil then
> >> +        -- read(buf, size)
> >> +        size = tonumber(size)
> >> +    else
> >> +        error("fio.popen.read: invalid arguments")
> >> +    end
> >> +
> >> +    return self:do_read(buf, size, -1)
> >> +end
> >
> >> +popen_methods.wait = function(self, timeout)
> >> +    if self.fh == nil then
> >> +        return nil, 'Invalid object'
> >> +    end
> >> +
> >> +    if timeout == nil then
> >> +        timeout = -1
> >> +    else
> >> +        timeout = tonumber(timeout)
> >> +    end
> >> +
> >> +    local rc, err = internal.popen_wait(self.fh, timeout)
> >> +    if rc ~= nil then
> >> +        self.exit_code = tonumber(rc)
> >> +        self.fh = nil
> >
> > This is kinda unexpected. What if we want to read the process output
> > after waiting for it to complete.
> 
> And what do you expect to read after process is complete?
> If application is a some kind of linear script it produces some output and
> then exits. In this case the caller reads everything till the EOF.
> BTW, in our tests test scripts exit earlier than the parent starts
> reading their output.
> 
> For interactive applications who may wait for input forever we've got a
> kill(). It's allowed to read after sending signal and before calling
> wait(). In this case we won't miss anything from the child process output.
> 
> So my point is: wait() is a synonim to close() (or join() in C++ world).
> It releases all resources.

Okay, makes sense. If George and Alexander agree, I'm okay with it.

> >> +app1:write(test2str)
> >> +rc,src,err = app1:read(256)
> >
> > Please also test the case when the child writes both to
> > stdin and stdout (looks like you missed it).
> 
> Did you mean stderr & stdout?

Yeah, s/stdin/stderr.