From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 1571B21C32 for ; Mon, 8 Jul 2019 21:47:50 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UEYgSja6rc6v for ; Mon, 8 Jul 2019 21:47:49 -0400 (EDT) Received: from smtp32.i.mail.ru (smtp32.i.mail.ru [94.100.177.92]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 5D0A021BF9 for ; Mon, 8 Jul 2019 21:47:49 -0400 (EDT) Date: Tue, 9 Jul 2019 04:46:55 +0300 From: Alexander Turenko Subject: [tarantool-patches] Re: [PATCH v1] test: fix error message check pattern Message-ID: <20190709014654.3fx4u7yhh2lm2bte@tkn_work_nb> References: <217570966faa84e877eed6b6eb52a07e40053622.1562587240.git.avtikhon@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <217570966faa84e877eed6b6eb52a07e40053622.1562587240.git.avtikhon@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: "Alexander V. Tikhonov" Cc: tarantool-patches@freelists.org I made changes to the patch and the commit message and pushed the patch to master, 2.1 and 1.10. But you need to consider comments below. WBR, Alexander Turenko. > test: fix error message check pattern I unable to undestand anything from this commit message header. What is the problem? Which test or test case is subject to fix? What is the pattern you said about? > Found that in box/net.box test the checker of the error > message can find the previous error message under highload, > when log file updates with time delay. Fix is to wait for is -> it? > the needed message and check it completely to avoid of > mistakes. Fixed issue: I found that it is hard to understand the message. I'm start reading and have no idea what is 'the checker' and 'the error' you said about. I cannot give you a formal criteria how to a write good commit message. Imagine that you know nothing about the problem and want to look at a human readable description w/o digging into a patch itself. Avoid forward references, try to minimize assumed context. Read your text, fix everything that does not look good and repeat. I rewrote the commit message and that was the bigger part of the job I made on this patch (30min at least and much more to write this answer). It becomes really annoying: I don't want to pass such simple patches back and make several review rounds on them, but I also don't want to rewrite your texts each time. It is your responsibility to make a patch good enough to land into master. This is the problem. I cannot (and will not) pay so much time to review QA patches. They should be good enough to pass review (most of the time at least). > > Test failed! Result content mismatch: > --- box/net.box.result Mon Jun 24 17:23:49 2019 > +++ box/net.box.reject Mon Jun 24 17:51:52 2019 > @@ -1404,7 +1404,7 @@ > ... > test_run:grep_log('default', 'ER_INVALID_MSGPACK.*') > --- > -- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body' > +- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet length' > ... > -- gh-983 selecting a lot of data crashes the server or hangs the > -- connection > > Closes: #4311 I rewrote it in this way: | test: net.box: fix case re invalid msgpack warning | | The test case has two problems that appear from time to time and lead to | flaky fails. Those fails are look as shown below in a test-run output. | | | Test failed! Result content mismatch: | | --- box/net.box.result Mon Jun 24 17:23:49 2019 | | +++ box/net.box.reject Mon Jun 24 17:51:52 2019 | | @@ -1404,7 +1404,7 @@ | | ... | | test_run:grep_log('default', 'ER_INVALID_MSGPACK.*') | | --- | | -- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body' | | +- 'ER_INVALID_MSGPACK: Invalid MsgPack - packet length' | | ... | | -- gh-983 selecting a lot of data crashes the server or hangs the | | -- connection | | 'ER_INVALID_MSGPACK.*' regexp should match 'ER_INVALID_MSGPACK: Invalid | MsgPack - packet body' log message, but if it is not in a log file at a | time of grep_log() call (just don't flushed to the file yet) a message | produced by another test case can be matched ('ER_INVALID_MSGPACK: | Invalid MsgPack - packet length'). The fix here is to match the entire | message and check for the message periodically during several seconds | (use wait_log() instead of grep_log()). | | Another problem is the race between writing a response to an iproto | socket on a server side and closing the socket on a client end. If | tarantool is unable to write a response, it does not produce the warning | re invalid msgpack, but shows 'broken pipe' message instead. We need | first grep for the message in logs and only then close the socket on a | client. The similar problem (with another test case) is described in | [1]. | | [1]: https://github.com/tarantool/tarantool/issues/4273#issuecomment-508939695 | | Closes: #4311 > --- > > Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh4311-net-box-msg-fix > Issue: https://github.com/tarantool/tarantool/issues/4311 > > test/box/net.box.result | 2 +- > test/box/net.box.test.lua | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/test/box/net.box.result b/test/box/net.box.result > index efaed11de..b7ac5e4f2 100644 > --- a/test/box/net.box.result > +++ b/test/box/net.box.result > @@ -1402,7 +1402,7 @@ sock:close() > --- > - true > ... > -test_run:grep_log('default', 'ER_INVALID_MSGPACK.*') > +test_run:wait_log('default', 'ER_INVALID_MSGPACK: Invalid MsgPack .* packet body', nil, 10) > --- > - 'ER_INVALID_MSGPACK: Invalid MsgPack - packet body' > ... > diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua > index a14099024..cb1f2d341 100644 > --- a/test/box/net.box.test.lua > +++ b/test/box/net.box.test.lua > @@ -549,7 +549,7 @@ sock = socket.tcp_connect(LISTEN.host, LISTEN.service) > data = string.fromhex("6783000000000000000000000000000000000000000000800000C8000000000000000000000000000000000000000000FFFF210100373208000000FFFF000055AAEB66486472530D02000000000010A0350001008000001000000000000000000000000000D05700") > sock:write(data) > sock:close() > -test_run:grep_log('default', 'ER_INVALID_MSGPACK.*') > +test_run:wait_log('default', 'ER_INVALID_MSGPACK: Invalid MsgPack .* packet body', nil, 10) It is better to just use %- to escape the dash. Please, read [1] and [2]. Lua regexps are simple, but it worth to know its peculiars to use them. I run the test 1000 times in 16 threads. The test case still fails. The reason is the same as here in [3]. We need to close the socket after the log check. It seems you should adopt a way how you verify such flaky fails to ensure that a change actually fixes a flaky fail. [1]: http://lua-users.org/wiki/PatternsTutorial [2]: https://www.lua.org/pil/20.2.html [3]: https://github.com/tarantool/tarantool/issues/4273#issuecomment-508939695 > > -- gh-983 selecting a lot of data crashes the server or hangs the > -- connection > -- > 2.17.1 >