From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 0811F6EC55; Thu, 15 Jul 2021 13:20:56 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 0811F6EC55 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1626344456; bh=J3imSuFX6cGhYN2/LuiMXp0tEGeSxjYEeOg7K5F9UhU=; h=To:Cc:References:Date:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=v+xfp/isqHYh3Sl3tDUMNZO+hHnXB1KsS80RUflpCHmw/hqW/8Qv9WC2P6UrKT68v BCTPMoDYhFycJmUYKfdIgIrb+wRW1WOOBnpAGLamUT0cQbAclE6A9VZHai8928J1dC YSE3jet1M7lB5/7pM044q6ycEI1RTqM+2dVE1qj4= Received: from smtp33.i.mail.ru (smtp33.i.mail.ru [94.100.177.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 6540D6EC55 for ; Thu, 15 Jul 2021 13:20:54 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 6540D6EC55 Received: by smtp33.i.mail.ru with esmtpa (envelope-from ) id 1m3yU9-00017B-Da; Thu, 15 Jul 2021 13:20:53 +0300 To: Vladislav Shpilevoy , gorcunov@gmail.com Cc: tarantool-patches@dev.tarantool.org References: <20210709074048.18169-1-sergepetrenko@tarantool.org> <9cec77da-70b3-e228-f092-2df40609d9f3@tarantool.org> Message-ID: <5765246e-6855-7baf-95bd-d6b16a2cbf38@tarantool.org> Date: Thu, 15 Jul 2021 13:20:52 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <9cec77da-70b3-e228-f092-2df40609d9f3@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD941C43E597735A9C38DCC94ADF53B46036ECF04D6E32A6D03182A05F538085040F04F81E2443616F0580B2FA2316ABD25F6D5D5F39B834FB0757A61899A9A5756 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE75A6765C746F51968EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F79006375F0BD5CF353A411D8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D83E30575CCEFBA5C4E751DA82AE5F595A117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCF1175FABE1C0F9B6A471835C12D1D9774AD6D5ED66289B52BA9C0B312567BB23117882F446042972877693876707352033AC447995A7AD18F04B652EEC242312D2E47CDBA5A96583BA9C0B312567BB231DD303D21008E29813377AFFFEAFD269A417C69337E82CC2E827F84554CEF50127C277FBC8AE2E8BA83251EDC214901ED5E8D9A59859A8B6BF3059D42242344A089D37D7C0E48F6C5571747095F342E88FB05168BE4CE3AF X-C1DE0DAB: 0D63561A33F958A587B3291BCBFC4874A57BEF80054486382E5A81A4944E2F82D59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA7501A9DF589746230F410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D346EDE9E12965E8CD5094FAC5915CC413A24F859ADA2F3B2CD0DD2EB1F77B2BE1B5FAF3E0A0086D7B91D7E09C32AA3244C2BE4689C051A6614B04D40AA2840870FE3D93501275E802FFACE5A9C96DEB163 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojSyb42jm8PHJwtJFM6crlGg== X-Mailru-Sender: 3B9A0136629DC9125D61937A2360A446451BEF24985B1198AB9DDD720A237A8426FE0C83B255A38F424AE0EB1F3D1D21E2978F233C3FAE6EE63DB1732555E4A8EE80603BA4A5B0BC112434F685709FCF0DA7A0AF5A3A8387 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH] replication: stop pushing TimedOut error to the replica X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Serge Petrenko via Tarantool-patches Reply-To: Serge Petrenko Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" 14.07.2021 00:49, Vladislav Shpilevoy пишет: > Hi! Good job on the patch! Especially on the > explanation in the commit message. Hi! Thanks for the review! TLDR: applied your diff. Please find my answers below. > >> diff --git a/src/box/xrow.c b/src/box/xrow.c >> index 16cb2484c..c4647c6cb 100644 >> --- a/src/box/xrow.c >> +++ b/src/box/xrow.c >> @@ -549,6 +550,10 @@ iproto_write_error(int fd, const struct error *e, uint32_t schema_version, >> >> size_t region_svp = region_used(region); >> mpstream_iproto_encode_error(&stream, e); >> + struct errinj *inj = errinj(ERRINJ_IPROTO_WRITE_ERROR_LARGE, >> + ERRINJ_INT); >> + if (inj != NULL && inj->iparam > 0) >> + mpstream_memset(&stream, 'E', inj->iparam); > Error injections are supposed to simulate something what can happen > in reality. Otherwise they might test an impossible case, like here - > the error object can't be corrupted. If I understood correctly, this I agree, but I couldn't find a better way to reproduce the issue. > way you are trying to prove that iproto_write_error() is not called > at all. Because if it is, you break the data and see it in the test. > You could make it panic() with the same result I suppose. It is one way. What I'm trying to do is to achieve the exact same error. I am trying to write an incomplete row, indeed, but that's not enough to trigger the error: once replica receives an incomplete row, it just exits with "Unexpected EOF when reading from socket", which is recoverable. In order to receive ER_INVALID_MSGPACK, not only we need to write an incomplete row, we also need len(incomplete row) + len(iproto error) >= len(original row) This way length check passes and replication is broken on ER_INVALID_MSGPACK, because obviously you can't have an error object in the middle of other row body. Unfortunately I could only achieve writing an incomplete row with relatively large rows (60 kb in one case and ~ 8 kb in the other). The row is usually split in half, so even while the problem is here (replica receives an iproto error right in the middle of another row), it can't be tested, because all we see is "Unexpected EOF". (8kb / 2 + 180  < 8kb) I tried to turn off length checks on the replica with an error injection, but stumbled upon a couple of problems. So I decided to make the error object huge, just so it would pass the length check when written after the partial row. This way the exact error we need is triggered, ER_INVALID_MSGPACK. And it's triggered not because iproto error contains garbage, but because the row is padded with an iproto error. I checked with verbose logs, and the broken row had the beginning of a normal row, and an iproto error instead of the rest of the row body. > Another way would be to simulate exactly the happened case - fill the > buffer with an incomplete xrow. > > You tried to do it, from what I see in the test. You create a big > tuple which should fill the buffer. > > But still it didn't help somewhy and you added the invalid error > object, right? Yes, as described above. > > I managed to remove this injection and still make the test fail 100% > without your patch. I also think I found an explanation of why the > corrupted error object was needed before. > > See my comment in the test. > >> diff --git a/test/replication/gh-4040-invalid-msgpack.result b/test/replication/gh-4040-invalid-msgpack.result >> new file mode 100644 >> index 000000000..4348ebd90 >> --- /dev/null >> +++ b/test/replication/gh-4040-invalid-msgpack.result >> @@ -0,0 +1,180 @@ > <...> > >> +-- Generate enough data to fill the sendbuf. >> +-- This will make the relay yield in the middle of writing a row waiting for the >> +-- socket to become writeable. >> +tbl = {1} >> + | --- >> + | ... >> +filler = string.rep('b', 100) >> + | --- >> + | ... >> +for i = 2, 0.7 * bufsize / 100 do\ >> + tbl[i] = filler\ > The reason why you needed the corrupted error object is that the > tuples are not big enough. > > Consider my machine (I think you should have the same values). I > have bufsize = 8192 (8KB). Now consider the tuples you create: each > is < 6KB. Each tuple even with all the MessagePack meta fits into > the buffer alone. My buffer size is 8kb as well, yes. > > Relay writes the xrows using writev(). On my machine this is what I > see when I print its results on your tuples: > > Written 5744 > Written 5744 > > So the xrows are written one by one, not in parts. I don't know why. > Probably in the kernel it does not accept even a part of the buffer > if sees it does not fit. So you never had in this test a partially > written xrow. That's strange. I did have partially written rows. And I could only reproduce the error reliably only with row size 0.7 * bufsize. I checked the logs (there's a hex dump of the row in verbose mode), the error is due to partially written rows, not due to IPROTO_WRITE_ERROR_LARGE. > > I changed it slightly: > > -for i = 2, 0.7 * bufsize / 100 do\ > +for i = 2, bufsize / 100 + 1 do\ > > And removed your ERRINJ_IPROTO_WRITE_ERROR_LARGE injection. Now > each tuple is > 8KB. The kernel is forced to write it in parts. > This is what I see: > > Written 8192 > Written 102 > > The total size of one xrow is > 8KB. The kernel allows to write a > part of the tuple, and now any error, even a valid one, will corrupt > the buffer. That happens if I revert your timeout error ignorance. > > What I don't get is why did you have 0.7 coefficient in the loop? With other row lengths the reproducer was not stable, Tarantool didn't fail every time. It failed once in a couple of runs. And again, I know the large error errinj looks strange, but it doesn't make the test fail due to the errinj. The test fails due to partially written rows. > > My full diff: Cool! I tried your version and it does exactly what's needed. Applied. I don't know why that didn't work out when I first tried this approach. I guess you really need row_size ~ buf size, this way the part of the row that doesn't fit is small (< iproto error size), and normal iproto error on top is enough to trigger the error. Thanks for the help! > > ==================== > diff --git a/src/box/xrow.c b/src/box/xrow.c > index c4647c6cb..db72437a9 100644 > --- a/src/box/xrow.c > +++ b/src/box/xrow.c > @@ -550,10 +550,6 @@ iproto_write_error(int fd, const struct error *e, uint32_t schema_version, > > size_t region_svp = region_used(region); > mpstream_iproto_encode_error(&stream, e); > - struct errinj *inj = errinj(ERRINJ_IPROTO_WRITE_ERROR_LARGE, > - ERRINJ_INT); > - if (inj != NULL && inj->iparam > 0) > - mpstream_memset(&stream, 'E', inj->iparam); > mpstream_flush(&stream); > if (is_error) > goto cleanup; > diff --git a/src/lib/core/errinj.h b/src/lib/core/errinj.h > index 0d8ec967d..e49242807 100644 > --- a/src/lib/core/errinj.h > +++ b/src/lib/core/errinj.h > @@ -153,7 +153,6 @@ struct errinj { > _(ERRINJ_SNAP_COMMIT_FAIL, ERRINJ_BOOL, {.bparam = false}) \ > _(ERRINJ_IPROTO_SINGLE_THREAD_STAT, ERRINJ_INT, {.iparam = -1}) \ > _(ERRINJ_IPROTO_WRITE_ERROR_DELAY, ERRINJ_BOOL, {.bparam = false})\ > - _(ERRINJ_IPROTO_WRITE_ERROR_LARGE, ERRINJ_INT, {.iparam = -1})\ > _(ERRINJ_APPLIER_READ_TX_ROW_DELAY, ERRINJ_BOOL, {.bparam = false})\ > > ENUM0(errinj_id, ERRINJ_LIST); > diff --git a/test/box/errinj.result b/test/box/errinj.result > index e5193a3d3..44f86a54e 100644 > --- a/test/box/errinj.result > +++ b/test/box/errinj.result > @@ -61,7 +61,6 @@ evals > - ERRINJ_IPROTO_SINGLE_THREAD_STAT: -1 > - ERRINJ_IPROTO_TX_DELAY: false > - ERRINJ_IPROTO_WRITE_ERROR_DELAY: false > - - ERRINJ_IPROTO_WRITE_ERROR_LARGE: -1 > - ERRINJ_LOG_ROTATE: false > - ERRINJ_MEMTX_DELAY_GC: false > - ERRINJ_PORT_DUMP: false > diff --git a/test/replication/gh-4040-invalid-msgpack.result b/test/replication/gh-4040-invalid-msgpack.result > index 4348ebd90..def47ed3c 100644 > --- a/test/replication/gh-4040-invalid-msgpack.result > +++ b/test/replication/gh-4040-invalid-msgpack.result > @@ -47,9 +47,6 @@ test_run:switch('replica') > | - true > | ... > > -box.cfg{log_level=6} > - | --- > - | ... > sign = box.info.signature > | --- > | ... > @@ -81,10 +78,6 @@ box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', true) > | --- > | - ok > | ... > -box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', bufsize) > - | --- > - | - ok > - | ... > -- Generate enough data to fill the sendbuf. > -- This will make the relay yield in the middle of writing a row waiting for the > -- socket to become writeable. > @@ -94,7 +87,7 @@ tbl = {1} > filler = string.rep('b', 100) > | --- > | ... > -for i = 2, 0.7 * bufsize / 100 do\ > +for i = 2, bufsize / 100 + 1 do\ > tbl[i] = filler\ > end > | --- > @@ -136,10 +129,6 @@ box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', false) > | --- > | - ok > | ... > -box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', -1) > - | --- > - | - ok > - | ... > > test_run:switch('replica') > | --- > diff --git a/test/replication/gh-4040-invalid-msgpack.test.lua b/test/replication/gh-4040-invalid-msgpack.test.lua > index 0e51affac..f7f1c3953 100644 > --- a/test/replication/gh-4040-invalid-msgpack.test.lua > +++ b/test/replication/gh-4040-invalid-msgpack.test.lua > @@ -18,7 +18,6 @@ _ = box.space.test:create_index('pk') > test_run:cmd('start server replica with args="1000"') > test_run:switch('replica') > > -box.cfg{log_level=6} > sign = box.info.signature > box.error.injection.set('ERRINJ_APPLIER_READ_TX_ROW_DELAY', true) > test_run:switch('master') > @@ -31,13 +30,12 @@ bufsize = soc:getsockopt('SOL_SOCKET', 'SO_SNDBUF') > require('log').info("SO_SNDBUF size is %d", bufsize) > -- Master shouldn't try to write the error while the socket isn't writeable. > box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', true) > -box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', bufsize) > -- Generate enough data to fill the sendbuf. > -- This will make the relay yield in the middle of writing a row waiting for the > -- socket to become writeable. > tbl = {1} > filler = string.rep('b', 100) > -for i = 2, 0.7 * bufsize / 100 do\ > +for i = 2, bufsize / 100 + 1 do\ > tbl[i] = filler\ > end > for i = 1,10 do\ > @@ -57,7 +55,6 @@ test_run:wait_cond(function() return box.info.signature > sign end) > > test_run:switch('master') > box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', false) > -box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', -1) > > test_run:switch('replica') > -- Serge Petrenko