[Tarantool-patches] [PATCH v3] fio/coio: Handle partial writes

Fri Feb 21 01:10:26 MSK 2020

I have three comments and all are up to you.

Please, rebase the patch on top of the current master (errinj.result was
changed).

WBR, Alexander Turenko.

On Wed, Dec 04, 2019 at 05:35:11PM +0300, Cyrill Gorcunov wrote:
> Writting less bytes than requested is fine. In turn our
> fio.write/pwrite api simply returns 'true' even if only
> some part of a buffer has been written. Thus make coio_write
> and coio_pwrite to write the whole data in a cycle.

I would expect that coio_*() functions will have the same behaviour as
its libc counterparts except that they'll yield a current fiber instead
of blocking a current thread.

So spin in a loop looks more as fio.c work for me, however I would not
say that I have a strict vision here.

If you'll decide to keep this logic in coio, at least left comments in
the header file for those functions.

> 
> Fixes #4651
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
> ---
> branch gorcunov/gh-4651-partial-write-3

> +	struct errinj *inj = errinj(ERRINJ_COIO_WRITE_CHUNK, ERRINJ_INT);
> <...>
> +
> +		if (inj != NULL && inj->iparam > 0)
> +			chunk = (ssize_t)inj->iparam;
> +		else
> +			chunk = left;
> +

AFAIR, we have macros for error injections, which allow to avoid
producing any extra machine code for a release build. It seems they can
be used here. Can you elaborate, please?

Nit: We usually split a cast operator from its argument with a
whitespace, e.g. `(ssize_t) inj->iparam`.

>  
>  ssize_t
> @@ -200,7 +221,12 @@ coio_preadn(int fd, void *buf, size_t count, off_t offset)
>  static void
>  coio_do_write(eio_req *req)
>  {
> +	struct errinj *inj = errinj(ERRINJ_COIO_WRITE_CHUNK, ERRINJ_INT);
>  	struct coio_file_task *eio = (struct coio_file_task *)req->data;
> +
> +	if (inj != NULL && inj->iparam > 0)
> +		eio->write.count = (ssize_t)inj->iparam;
> +

(If you'll keep the loop in coio.) I would do it in coio_write() itself:
this way the logic in coio_pwrite() and coio_write() will be quite
similar and so easier to verify if I don't miss something.

Just wonder: is there any reason to use eio_custom() for write? pwrite
uses eio_write(), which support both write and pwrite (it checks whether
the offset >= 0.

I tracked it down to:

commit b9616280533a6bba1bf4739ce11a57461262560f
Author: Dmitry E. Oboukhov <unera at debian.org>
Date:   Mon Aug 11 20:03:30 2014 +0400

    fiber-io tests.

And it is the whole description.

Anyway, I just curious.

>  	req->result = write(eio->write.fd, eio->write.buf, eio->write.count);
>  	eio->errorno = errno;
>  }
> @@ -208,13 +234,28 @@ coio_do_write(eio_req *req)
>  ssize_t
>  coio_write(int fd, const void *buf, size_t count)
>  {
> -	INIT_COEIO_FILE(eio);
> -	eio.write.buf = buf;
> -	eio.write.count = count;
> -	eio.write.fd = fd;
> -	eio_req *req = eio_custom(coio_do_write, 0,
> -				  coio_complete, &eio);
> -	return coio_wait_done(req, &eio);
> +	ssize_t left = count, pos = 0, res;
> +	eio_req *req;
> +
> +	while (left > 0) {
> +		INIT_COEIO_FILE(eio);
> +
> +		eio.write.buf	= buf + pos;
> +		eio.write.count	= left;
> +		eio.write.fd	= fd;
> +
> +		req = eio_custom(coio_do_write, EIO_PRI_DEFAULT,
> +				 coio_complete, &eio);
> +		res = coio_wait_done(req, &eio);
> +		if (res < 0) {
> +			pos = -1;
> +			break;
> +		} else {
> +			left -= res;
> +			pos += res;
> +		}
> +	}
> +	return pos;
>  }

> diff --git a/test/box/errinj.result b/test/box/errinj.result
> index a148346e8..9b831769f 100644
> --- a/test/box/errinj.result
> +++ b/test/box/errinj.result
> @@ -25,6 +25,8 @@ errinj.info()
>      state: 0
>    ERRINJ_WAL_WRITE:
>      state: false
> +  ERRINJ_COIO_WRITE_CHUNK:
> +    state: -1

Now it should be rebased on top of master, this hunk was changed.