From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 14 Sep 2018 19:27:49 +0300 From: Vladimir Davydov Subject: Re: [PATCH 2/2] recovery: fix incorrect handling of empty-body requests. Message-ID: <20180914162749.dijxyytj6b3y7fk7@esperanza> References: <724c197f81170aac18d37d976c835177925959f6.1536934854.git.sergepetrenko@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <724c197f81170aac18d37d976c835177925959f6.1536934854.git.sergepetrenko@tarantool.org> To: Serge Petrenko Cc: tarantool-patches@freelists.org List-ID: On Fri, Sep 14, 2018 at 05:36:41PM +0300, Serge Petrenko wrote: > In some cases no-ops are written to xlog. They have no effect but are > needed to bump lsn. Such ops have no body, and empty body requests are > not handled in xrow_header_decode(). This leads to recovery errors in > special case: when we have a multi-statement transaction containing > no-ops written to xlog, upon recovering from such xlog, all data after > the no-op end till the start of new transaction will become no-ops body, > so, effectively, it will be ignored. Here's example `tarantoolctl cat` > output showing this (BODY contains next request data): > ``` > --- > HEADER: > lsn: 5 > replica_id: 1 > type: NOP > timestamp: 1536656270.5092 > BODY: > type: 3 > timestamp: 1536656270.5092 > lsn: 6 > replica_id: 1 > --- > HEADER: > type: 0 > ... > ``` > This patch handles no-ops correctly in xrow_header_decode(). > > Closes #3678 > --- > src/box/xrow.c | 6 +++- > test/xlog/recover_nop.result | 72 ++++++++++++++++++++++++++++++++++++++++++ > test/xlog/recover_nop.test.lua | 28 ++++++++++++++++ > 3 files changed, 105 insertions(+), 1 deletion(-) > create mode 100644 test/xlog/recover_nop.result > create mode 100644 test/xlog/recover_nop.test.lua > > diff --git a/src/box/xrow.c b/src/box/xrow.c > index 7a35d0db1..99dce5395 100644 > --- a/src/box/xrow.c > +++ b/src/box/xrow.c > @@ -99,6 +99,8 @@ error: > if (mp_typeof(**pos) != MP_MAP) > goto error; > > + bool expect_body = true; > + > uint32_t size = mp_decode_map(pos); > for (uint32_t i = 0; i < size; i++) { > if (mp_typeof(**pos) != MP_UINT) > @@ -110,6 +112,8 @@ error: > switch (key) { > case IPROTO_REQUEST_TYPE: > header->type = mp_decode_uint(pos); > + if (header->type == IPROTO_NOP) > + expect_body = false; > break; > case IPROTO_SYNC: > header->sync = mp_decode_uint(pos); > @@ -135,7 +139,7 @@ error: > } > } > assert(*pos <= end); > - if (*pos < end) { > + if (*pos < end && expect_body) { Why not simply /* NOP request isn't supposed to have a body. */ if (*pos < end && header->type != IPROTO_NOP) { ? Anyway, there's a problem here: NOP requests used to have a body, see commit 89e5b7846c9d ("xrow: make NOP requests bodiless"), i.e. this function has to decode NOP requests with and without a body. I don't know what to do about it, because we can't discern a BODY from a HEADER in an xlog without decoding it. Solutions I can see: 1. Assume nobody has ever used before_replace and hence there shouldn't be NOP requests in old xlogs. Then this patch would be OK. 2. Take a peek inside the next map and try to determine whether it's a body or a header. 3. Introduce a separate op-code for NOP without body. > diff --git a/test/xlog/recover_nop.result b/test/xlog/recover_nop.result > new file mode 100644 NOP recovery is already tested in box/before_replace.test.lua. I think you'd better extend that rather than introducing a new test.