[PATCH 2/2] recovery: fix incorrect handling of empty-body requests.

Vladimir Davydov vdavydov.dev at gmail.com
Fri Sep 14 19:27:49 MSK 2018


On Fri, Sep 14, 2018 at 05:36:41PM +0300, Serge Petrenko wrote:
> In some cases no-ops are written to xlog. They have no effect but are
> needed to bump lsn. Such ops have no body, and empty body requests are
> not handled in xrow_header_decode(). This leads to recovery errors in
> special case: when we have a multi-statement transaction containing
> no-ops written to xlog, upon recovering from such xlog, all data after
> the no-op end till the start of new transaction will become no-ops body,
> so, effectively, it will be ignored. Here's example `tarantoolctl cat`
> output showing this (BODY contains next request data):
> ```
>     ---
>     HEADER:
>       lsn: 5
>       replica_id: 1
>       type: NOP
>       timestamp: 1536656270.5092
>     BODY:
>       type: 3
>       timestamp: 1536656270.5092
>       lsn: 6
>       replica_id: 1
>     ---
>     HEADER:
>       type: 0
>     ...
> ```
> This patch handles no-ops correctly in xrow_header_decode().
> 
> Closes #3678
> ---
>  src/box/xrow.c                 |  6 +++-
>  test/xlog/recover_nop.result   | 72 ++++++++++++++++++++++++++++++++++++++++++
>  test/xlog/recover_nop.test.lua | 28 ++++++++++++++++
>  3 files changed, 105 insertions(+), 1 deletion(-)
>  create mode 100644 test/xlog/recover_nop.result
>  create mode 100644 test/xlog/recover_nop.test.lua
> 
> diff --git a/src/box/xrow.c b/src/box/xrow.c
> index 7a35d0db1..99dce5395 100644
> --- a/src/box/xrow.c
> +++ b/src/box/xrow.c
> @@ -99,6 +99,8 @@ error:
>  	if (mp_typeof(**pos) != MP_MAP)
>  		goto error;
>  
> +	bool expect_body = true;
> +
>  	uint32_t size = mp_decode_map(pos);
>  	for (uint32_t i = 0; i < size; i++) {
>  		if (mp_typeof(**pos) != MP_UINT)
> @@ -110,6 +112,8 @@ error:
>  		switch (key) {
>  		case IPROTO_REQUEST_TYPE:
>  			header->type = mp_decode_uint(pos);
> +			if (header->type == IPROTO_NOP)
> +				expect_body = false;
>  			break;
>  		case IPROTO_SYNC:
>  			header->sync = mp_decode_uint(pos);
> @@ -135,7 +139,7 @@ error:
>  		}
>  	}
>  	assert(*pos <= end);
> -	if (*pos < end) {
> +	if (*pos < end && expect_body) {

Why not simply

	/* NOP request isn't supposed to have a body. */
	if (*pos < end && header->type != IPROTO_NOP) {

?

Anyway, there's a problem here: NOP requests used to have a body, see
commit 89e5b7846c9d ("xrow: make NOP requests bodiless"), i.e. this
function has to decode NOP requests with and without a body.

I don't know what to do about it, because we can't discern a BODY from a
HEADER in an xlog without decoding it. Solutions I can see:

 1. Assume nobody has ever used before_replace and hence there shouldn't
    be NOP requests in old xlogs. Then this patch would be OK.
 2. Take a peek inside the next map and try to determine whether it's a
    body or a header.
 3. Introduce a separate op-code for NOP without body.

> diff --git a/test/xlog/recover_nop.result b/test/xlog/recover_nop.result
> new file mode 100644

NOP recovery is already tested in box/before_replace.test.lua.
I think you'd better extend that rather than introducing a new test.



More information about the Tarantool-patches mailing list