From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id CD64446970F
	for ; Mon, 25 Nov 2019 18:31:44 +0300 (MSK)
Date: Mon, 25 Nov 2019 18:31:41 +0300
From: Alexander Turenko
Message-ID: <20191125153140.pgpok3phtvqqts7n@tkn_work_nb>
References: <20191123145012.16074-1-i.kosarev@tarantool.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20191123145012.16074-1-i.kosarev@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
List-Id: Tarantool development patches
To: Ilya Kosarev
Cc: tarantool-patches@dev.tarantool.org

On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
> For some tests, for example, replication/box_set_replication_stress,
> socket.tcp_connect() in test_run:cmd() might sometimes fail when
> running under high load. Now it is fixed.
>
> Closes #193
> ---
> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
> https://github.com/tarantool/test-run/issues/193
>
>  test_run.lua | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/test_run.lua b/test_run.lua
> index 63dfdef..0d450bd 100644
> --- a/test_run.lua
> +++ b/test_run.lua
> @@ -11,6 +11,9 @@ local clock = require('clock')
>
>  local function cmd(self, msg)
>      local sock = socket.tcp_connect(self.host, self.port)
> +    while sock == nil do
> +        sock = socket.tcp_connect(self.host, self.port)
> +    end
>      local data = msg .. '\n'
>      sock:send(data)

I'm wary of a possibly infinite loop here. I know test-run will fail a
hung test, but it would be better to fail gracefully and provide some
information about the error (does socket.tcp_connect return anything
about it?).
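
To make the concern concrete, here is a rough, untested sketch of a
bounded retry that reports why it gave up. connect_with_deadline is a
hypothetical helper (it is not part of test-run), and the retry pause is
an arbitrary value. I assume that socket.tcp_connect() accepts a timeout
as its third argument and leaves the failure reason in errno; both
should be checked against the Tarantool version in use.

```lua
local socket = require('socket')
local fiber = require('fiber')
local clock = require('clock')
local errno = require('errno')

-- Hypothetical helper: retry tcp_connect until `timeout` seconds have
-- passed, then return nil and a message instead of looping forever.
local function connect_with_deadline(host, port, timeout)
    local deadline = clock.monotonic() + timeout
    local attempts = 0
    local last_err = 'no attempt was made'
    while clock.monotonic() < deadline do
        attempts = attempts + 1
        -- Give each attempt only the time that is left (assumption:
        -- the third argument is a connect timeout in seconds).
        local remaining = math.max(deadline - clock.monotonic(), 0.001)
        local sock = socket.tcp_connect(host, port, remaining)
        if sock ~= nil then
            return sock
        end
        -- Assumption: the reason of the failure is left in errno.
        last_err = errno.strerror()
        -- Yield for a moment so the listener gets a chance to accept.
        fiber.sleep(0.01)
    end
    return nil, string.format('tcp_connect to %s:%s failed after %d attempts: %s',
                              tostring(host), tostring(port), attempts, last_err)
end
```

In cmd() it could be called as
`local sock, err = connect_with_deadline(self.host, self.port, 10)`
followed by `if sock == nil then error(err) end`, so that a failing test
reports the connection problem rather than hanging. The 10-second budget
is only an example value.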

Let's consider using wait_cond or, better, setting a connection timeout
(if possible). It would also be interesting to know what causes the
connection error in the first place. Is test-run actually listening at
that moment? Maybe it is unable to process many incoming connection
requests at a time? Please don't bury yourself in that, but look around
briefly.

WBR, Alexander Turenko.