Tarantool development patches archive
 help / color / mirror / Atom feed
* [tarantool-patches] Re: [PATCH v1] test: fix error message check pattern
       [not found] <217570966faa84e877eed6b6eb52a07e40053622.1562587240.git.avtikhon@tarantool.org>
@ 2019-07-09  1:46 ` Alexander Turenko
  0 siblings, 0 replies; only message in thread
From: Alexander Turenko @ 2019-07-09  1:46 UTC (permalink / raw)
  To: Alexander V. Tikhonov; +Cc: tarantool-patches

I made changes to the patch and the commit message and pushed the patch
to master, 2.1 and 1.10. But you need to consider comments below.

WBR, Alexander Turenko.

> test: fix error message check pattern

I unable to undestand anything from this commit message header. What is
the problem? Which test or test case is subject to fix? What is the
pattern you said about?

> Found that in box/net.box test the checker of the error
> message can find the previous error message under highload,
> when log file updates with time delay. Fix is to wait for

is -> it?

> the needed message and check it completely to avoid of
> mistakes. Fixed issue:

I found that it is hard to understand the message. I'm start reading and
have no idea what is 'the checker' and 'the error' you said about.

I cannot give you a formal criteria how to a write good commit message.
Imagine that you know nothing about the problem and want to look at a
human readable description w/o digging into a patch itself. Avoid
forward references, try to minimize assumed context. Read your text, fix
everything that does not look good and repeat.

I rewrote the commit message and that was the bigger part of the job I
made on this patch (30min at least and much more to write this answer).
It becomes really annoying: I don't want to pass such simple patches
back and make several review rounds on them, but I also don't want to
rewrite your texts each time. It is your responsibility to make a patch
good enough to land into master.

This is the problem. I cannot (and will not) pay so much time to review
QA patches. They should be good enough to pass review (most of the time
at least).

> 
>     Test failed! Result content mismatch:
>     --- box/net.box.result	Mon Jun 24 17:23:49 2019
>     +++ box/net.box.reject	Mon Jun 24 17:51:52 2019
>     @@ -1404,7 +1404,7 @@
>      ...
>      test_run:grep_log('default', 'ER_INVALID_MSGPACK.*')
>     ---
>     -- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body'
>     +- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet length'
>     ...
>     -- gh-983 selecting a lot of data crashes the server or hangs the
>     -- connection
> 
> Closes: #4311

I rewrote it in this way:

 | test: net.box: fix case re invalid msgpack warning
 | 
 | The test case has two problems that appear from time to time and lead to
 | flaky fails. Those fails are look as shown below in a test-run output.
 | 
 |  | Test failed! Result content mismatch:
 |  | --- box/net.box.result       Mon Jun 24 17:23:49 2019
 |  | +++ box/net.box.reject       Mon Jun 24 17:51:52 2019
 |  | @@ -1404,7 +1404,7 @@
 |  |  ...
 |  |  test_run:grep_log('default', 'ER_INVALID_MSGPACK.*')
 |  | ---
 |  | -- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body'
 |  | +- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet length'
 |  | ...
 |  | -- gh-983 selecting a lot of data crashes the server or hangs the
 |  | -- connection
 | 
 | 'ER_INVALID_MSGPACK.*' regexp should match 'ER_INVALID_MSGPACK: Invalid
 | MsgPack - packet body' log message, but if it is not in a log file at a
 | time of grep_log() call (just don't flushed to the file yet) a message
 | produced by another test case can be matched ('ER_INVALID_MSGPACK:
 | Invalid MsgPack - packet length'). The fix here is to match the entire
 | message and check for the message periodically during several seconds
 | (use wait_log() instead of grep_log()).
 | 
 | Another problem is the race between writing a response to an iproto
 | socket on a server side and closing the socket on a client end. If
 | tarantool is unable to write a response, it does not produce the warning
 | re invalid msgpack, but shows 'broken pipe' message instead. We need
 | first grep for the message in logs and only then close the socket on a
 | client. The similar problem (with another test case) is described in
 | [1].
 | 
 | [1]: https://github.com/tarantool/tarantool/issues/4273#issuecomment-508939695
 | 
 | Closes: #4311

> ---
> 
> Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh4311-net-box-msg-fix
> Issue: https://github.com/tarantool/tarantool/issues/4311
> 
>  test/box/net.box.result   | 2 +-
>  test/box/net.box.test.lua | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/test/box/net.box.result b/test/box/net.box.result
> index efaed11de..b7ac5e4f2 100644
> --- a/test/box/net.box.result
> +++ b/test/box/net.box.result
> @@ -1402,7 +1402,7 @@ sock:close()
>  ---
>  - true
>  ...
> -test_run:grep_log('default', 'ER_INVALID_MSGPACK.*')
> +test_run:wait_log('default', 'ER_INVALID_MSGPACK: Invalid MsgPack .* packet body', nil, 10)
>  ---
>  - 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body'
>  ...
> diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua
> index a14099024..cb1f2d341 100644
> --- a/test/box/net.box.test.lua
> +++ b/test/box/net.box.test.lua
> @@ -549,7 +549,7 @@ sock = socket.tcp_connect(LISTEN.host, LISTEN.service)
>  data = string.fromhex("6783000000000000000000000000000000000000000000800000C8000000000000000000000000000000000000000000FFFF210100373208000000FFFF000055AAEB66486472530D02000000000010A0350001008000001000000000000000000000000000D05700")
>  sock:write(data)
>  sock:close()
> -test_run:grep_log('default', 'ER_INVALID_MSGPACK.*')
> +test_run:wait_log('default', 'ER_INVALID_MSGPACK: Invalid MsgPack .* packet body', nil, 10)

It is better to just use %- to escape the dash. Please, read [1] and
[2]. Lua regexps are simple, but it worth to know its peculiars to use
them.

I run the test 1000 times in 16 threads. The test case still fails. The
reason is the same as here in [3]. We need to close the socket after the
log check.

It seems you should adopt a way how you verify such flaky fails to
ensure that a change actually fixes a flaky fail.

[1]: http://lua-users.org/wiki/PatternsTutorial
[2]: https://www.lua.org/pil/20.2.html
[3]: https://github.com/tarantool/tarantool/issues/4273#issuecomment-508939695

>  
>  -- gh-983 selecting a lot of data crashes the server or hangs the
>  -- connection
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2019-07-09  1:47 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <217570966faa84e877eed6b6eb52a07e40053622.1562587240.git.avtikhon@tarantool.org>
2019-07-09  1:46 ` [tarantool-patches] Re: [PATCH v1] test: fix error message check pattern Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox