[Tarantool-patches] [PATCH] replication: stop pushing TimedOut error to the replica

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Jul 14 00:49:53 MSK 2021


Hi! Good job on the patch! Especially on the
explanation in the commit message.

> diff --git a/src/box/xrow.c b/src/box/xrow.c
> index 16cb2484c..c4647c6cb 100644
> --- a/src/box/xrow.c
> +++ b/src/box/xrow.c
> @@ -549,6 +550,10 @@ iproto_write_error(int fd, const struct error *e, uint32_t schema_version,
>  
>  	size_t region_svp = region_used(region);
>  	mpstream_iproto_encode_error(&stream, e);
> +	struct errinj *inj = errinj(ERRINJ_IPROTO_WRITE_ERROR_LARGE,
> +				    ERRINJ_INT);
> +	if (inj != NULL && inj->iparam > 0)
> +		mpstream_memset(&stream, 'E', inj->iparam);

Error injections are supposed to simulate something what can happen
in reality. Otherwise they might test an impossible case, like here -
the error object can't be corrupted. If I understood correctly, this
way you are trying to prove that iproto_write_error() is not called
at all. Because if it is, you break the data and see it in the test.
You could make it panic() with the same result I suppose. It is one way.
Another way would be to simulate exactly the happened case - fill the
buffer with an incomplete xrow.

You tried to do it, from what I see in the test. You create a big
tuple which should fill the buffer.

But still it didn't help somewhy and you added the invalid error
object, right?

I managed to remove this injection and still make the test fail 100%
without your patch. I also think I found an explanation of why the
corrupted error object was needed before.

See my comment in the test.

> diff --git a/test/replication/gh-4040-invalid-msgpack.result b/test/replication/gh-4040-invalid-msgpack.result
> new file mode 100644
> index 000000000..4348ebd90
> --- /dev/null
> +++ b/test/replication/gh-4040-invalid-msgpack.result
> @@ -0,0 +1,180 @@

<...>

> +-- Generate enough data to fill the sendbuf.
> +-- This will make the relay yield in the middle of writing a row waiting for the
> +-- socket to become writeable.
> +tbl = {1}
> + | ---
> + | ...
> +filler = string.rep('b', 100)
> + | ---
> + | ...
> +for i = 2, 0.7 * bufsize / 100 do\
> +    tbl[i] = filler\

The reason why you needed the corrupted error object is that the
tuples are not big enough.

Consider my machine (I think you should have the same values). I
have bufsize = 8192 (8KB). Now consider the tuples you create: each
is < 6KB. Each tuple even with all the MessagePack meta fits into
the buffer alone.

Relay writes the xrows using writev(). On my machine this is what I
see when I print its results on your tuples:

	Written 5744
	Written 5744

So the xrows are written one by one, not in parts. I don't know why.
Probably in the kernel it does not accept even a part of the buffer
if sees it does not fit. So you never had in this test a partially
written xrow.

I changed it slightly:

	-for i = 2, 0.7 * bufsize / 100 do\
	+for i = 2, bufsize / 100 + 1 do\

And removed your ERRINJ_IPROTO_WRITE_ERROR_LARGE injection. Now
each tuple is > 8KB. The kernel is forced to write it in parts.
This is what I see:

	Written 8192
	Written 102

The total size of one xrow is > 8KB. The kernel allows to write a
part of the tuple, and now any error, even a valid one, will corrupt
the buffer. That happens if I revert your timeout error ignorance.

What I don't get is why did you have 0.7 coefficient in the loop?

My full diff:

====================
diff --git a/src/box/xrow.c b/src/box/xrow.c
index c4647c6cb..db72437a9 100644
--- a/src/box/xrow.c
+++ b/src/box/xrow.c
@@ -550,10 +550,6 @@ iproto_write_error(int fd, const struct error *e, uint32_t schema_version,
 
 	size_t region_svp = region_used(region);
 	mpstream_iproto_encode_error(&stream, e);
-	struct errinj *inj = errinj(ERRINJ_IPROTO_WRITE_ERROR_LARGE,
-				    ERRINJ_INT);
-	if (inj != NULL && inj->iparam > 0)
-		mpstream_memset(&stream, 'E', inj->iparam);
 	mpstream_flush(&stream);
 	if (is_error)
 		goto cleanup;
diff --git a/src/lib/core/errinj.h b/src/lib/core/errinj.h
index 0d8ec967d..e49242807 100644
--- a/src/lib/core/errinj.h
+++ b/src/lib/core/errinj.h
@@ -153,7 +153,6 @@ struct errinj {
 	_(ERRINJ_SNAP_COMMIT_FAIL, ERRINJ_BOOL, {.bparam = false}) \
 	_(ERRINJ_IPROTO_SINGLE_THREAD_STAT, ERRINJ_INT, {.iparam = -1}) \
 	_(ERRINJ_IPROTO_WRITE_ERROR_DELAY, ERRINJ_BOOL, {.bparam = false})\
-	_(ERRINJ_IPROTO_WRITE_ERROR_LARGE, ERRINJ_INT, {.iparam = -1})\
 	_(ERRINJ_APPLIER_READ_TX_ROW_DELAY, ERRINJ_BOOL, {.bparam = false})\
 
 ENUM0(errinj_id, ERRINJ_LIST);
diff --git a/test/box/errinj.result b/test/box/errinj.result
index e5193a3d3..44f86a54e 100644
--- a/test/box/errinj.result
+++ b/test/box/errinj.result
@@ -61,7 +61,6 @@ evals
   - ERRINJ_IPROTO_SINGLE_THREAD_STAT: -1
   - ERRINJ_IPROTO_TX_DELAY: false
   - ERRINJ_IPROTO_WRITE_ERROR_DELAY: false
-  - ERRINJ_IPROTO_WRITE_ERROR_LARGE: -1
   - ERRINJ_LOG_ROTATE: false
   - ERRINJ_MEMTX_DELAY_GC: false
   - ERRINJ_PORT_DUMP: false
diff --git a/test/replication/gh-4040-invalid-msgpack.result b/test/replication/gh-4040-invalid-msgpack.result
index 4348ebd90..def47ed3c 100644
--- a/test/replication/gh-4040-invalid-msgpack.result
+++ b/test/replication/gh-4040-invalid-msgpack.result
@@ -47,9 +47,6 @@ test_run:switch('replica')
  | - true
  | ...
 
-box.cfg{log_level=6}
- | ---
- | ...
 sign = box.info.signature
  | ---
  | ...
@@ -81,10 +78,6 @@ box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', true)
  | ---
  | - ok
  | ...
-box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', bufsize)
- | ---
- | - ok
- | ...
 -- Generate enough data to fill the sendbuf.
 -- This will make the relay yield in the middle of writing a row waiting for the
 -- socket to become writeable.
@@ -94,7 +87,7 @@ tbl = {1}
 filler = string.rep('b', 100)
  | ---
  | ...
-for i = 2, 0.7 * bufsize / 100 do\
+for i = 2, bufsize / 100 + 1 do\
     tbl[i] = filler\
 end
  | ---
@@ -136,10 +129,6 @@ box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', false)
  | ---
  | - ok
  | ...
-box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', -1)
- | ---
- | - ok
- | ...
 
 test_run:switch('replica')
  | ---
diff --git a/test/replication/gh-4040-invalid-msgpack.test.lua b/test/replication/gh-4040-invalid-msgpack.test.lua
index 0e51affac..f7f1c3953 100644
--- a/test/replication/gh-4040-invalid-msgpack.test.lua
+++ b/test/replication/gh-4040-invalid-msgpack.test.lua
@@ -18,7 +18,6 @@ _ = box.space.test:create_index('pk')
 test_run:cmd('start server replica with args="1000"')
 test_run:switch('replica')
 
-box.cfg{log_level=6}
 sign = box.info.signature
 box.error.injection.set('ERRINJ_APPLIER_READ_TX_ROW_DELAY', true)
 test_run:switch('master')
@@ -31,13 +30,12 @@ bufsize = soc:getsockopt('SOL_SOCKET', 'SO_SNDBUF')
 require('log').info("SO_SNDBUF size is %d", bufsize)
 -- Master shouldn't try to write the error while the socket isn't writeable.
 box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', true)
-box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', bufsize)
 -- Generate enough data to fill the sendbuf.
 -- This will make the relay yield in the middle of writing a row waiting for the
 -- socket to become writeable.
 tbl = {1}
 filler = string.rep('b', 100)
-for i = 2, 0.7 * bufsize / 100 do\
+for i = 2, bufsize / 100 + 1 do\
     tbl[i] = filler\
 end
 for i = 1,10 do\
@@ -57,7 +55,6 @@ test_run:wait_cond(function() return box.info.signature > sign end)
 
 test_run:switch('master')
 box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_DELAY', false)
-box.error.injection.set('ERRINJ_IPROTO_WRITE_ERROR_LARGE', -1)
 
 test_run:switch('replica')
 


More information about the Tarantool-patches mailing list