* [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
From: Ilya Kosarev @ 2019-11-23 14:50 UTC
To: tarantool-patches

For some tests, for example, replication/box_set_replication_stress,
socket.tcp_connect() in test_run:cmd() might sometimes fail when
running under high load. Now it is fixed.

Closes #193
---
https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
https://github.com/tarantool/test-run/issues/193

 test_run.lua | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test_run.lua b/test_run.lua
index 63dfdef..0d450bd 100644
--- a/test_run.lua
+++ b/test_run.lua
@@ -11,6 +11,9 @@ local clock = require('clock')
 
 local function cmd(self, msg)
     local sock = socket.tcp_connect(self.host, self.port)
+    while sock == nil do
+        sock = socket.tcp_connect(self.host, self.port)
+    end
     local data = msg .. '\n'
     sock:send(data)
-- 
2.17.1

* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
From: Alexander Turenko @ 2019-11-25 15:31 UTC
To: Ilya Kosarev; +Cc: tarantool-patches

On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
> For some tests, for example, replication/box_set_replication_stress,
> socket.tcp_connect() in test_run:cmd() might sometimes fail when
> running under high load. Now it is fixed.
>
> Closes #193
> ---
> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
> https://github.com/tarantool/test-run/issues/193
>
>  test_run.lua | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/test_run.lua b/test_run.lua
> index 63dfdef..0d450bd 100644
> --- a/test_run.lua
> +++ b/test_run.lua
> @@ -11,6 +11,9 @@ local clock = require('clock')
>
>  local function cmd(self, msg)
>      local sock = socket.tcp_connect(self.host, self.port)
> +    while sock == nil do
> +        sock = socket.tcp_connect(self.host, self.port)
> +    end
>      local data = msg .. '\n'
>      sock:send(data)

I'm wary of the possibly infinite loop. I know test-run will fail a
hung test, but it would be better to fail gracefully and provide some
information about the error (does socket.tcp_connect return anything
about it?).

Let's consider using wait_cond or, better, setting a connection timeout
(if possible).

It would also be interesting to know the reason for the connection
error in the first place. Is test-run actually listening at that
moment? Maybe it is unable to process many incoming connection requests
at a time?

Please don't bury yourself in that, but take a brief look around.

WBR, Alexander Turenko.
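
A minimal sketch of the bounded retry suggested in the review above. It
is not the v2 patch, which does not appear in this thread. The helper
name connect_with_deadline and all of its parameters are hypothetical;
connect, now and sleep are passed in as functions so that the snippet
runs in plain Lua without Tarantool.

```lua
-- Hypothetical helper, not part of test-run.
-- Calls connect() until it returns a non-nil value or `timeout` seconds
-- have passed, pausing `delay` seconds between attempts.
--   connect: function returning sock, or nil plus an optional error value
--   now:     function returning the current time in seconds
--   sleep:   function that pauses for the given number of seconds
local function connect_with_deadline(connect, now, sleep, timeout, delay)
    local deadline = now() + timeout
    local attempts = 0
    local last_err
    while true do
        attempts = attempts + 1
        local sock, err = connect()
        if sock ~= nil then
            return sock
        end
        last_err = err
        -- Give up once the deadline has passed instead of looping forever.
        if now() >= deadline then
            break
        end
        sleep(delay)
    end
    -- Fail gracefully: report how many attempts were made and why they failed.
    return nil, string.format('connect failed after %d attempts: %s',
                              attempts, tostring(last_err or 'unknown error'))
end
```

In test_run.lua the three callbacks could plausibly be a closure around
socket.tcp_connect(self.host, self.port), clock.monotonic and
fiber.sleep. Whether tcp_connect reports an error as a second return
value is an assumption here, which is why the sketch falls back to
'unknown error'.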

* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
From: Ilya Kosarev @ 2019-11-26 0:19 UTC
To: Alexander Turenko; +Cc: tarantool-patches

Hi!

Thanks for your review.

The real reason socket.tcp_connect returns nil in this case is the
Linux open files limit:

2019-11-26 02:34:49.426 [12405] main/202/console/unix/: test_run.lua:15 E> test_run:cmd
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.c:925 !> accept(6): Too many open files
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.lua:1090 E> accept(fd 6, aka unix/:/home/kosarev/tarantool/test/var/004_replication/master_quorum1.socket-admin) failed: Too many open files
2019-11-26 02:34:49.427 [12405] main/202/console/unix/: test_run.lua:18 E> sock == nil

I have sent v2 of the patch, which addresses the drawbacks you
mentioned.

>Monday, 25 November 2019, 18:31 +03:00 from Alexander Turenko <alexander.turenko@tarantool.org>:
>
>On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
>> For some tests, for example, replication/box_set_replication_stress,
>> socket.tcp_connect() in test_run:cmd() might sometimes fail when
>> running under high load. Now it is fixed.
>>
>> Closes #193
>> ---
>> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
>> https://github.com/tarantool/test-run/issues/193
>>
>>  test_run.lua | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/test_run.lua b/test_run.lua
>> index 63dfdef..0d450bd 100644
>> --- a/test_run.lua
>> +++ b/test_run.lua
>> @@ -11,6 +11,9 @@ local clock = require('clock')
>>
>>  local function cmd(self, msg)
>>      local sock = socket.tcp_connect(self.host, self.port)
>> +    while sock == nil do
>> +        sock = socket.tcp_connect(self.host, self.port)
>> +    end
>>      local data = msg .. '\n'
>>      sock:send(data)
>
>I'm wary of the possibly infinite loop. I know test-run will fail a
>hung test, but it would be better to fail gracefully and provide some
>information about the error (does socket.tcp_connect return anything
>about it?).
>
>Let's consider using wait_cond or, better, setting a connection timeout
>(if possible).
>
>It would also be interesting to know the reason for the connection
>error in the first place. Is test-run actually listening at that
>moment? Maybe it is unable to process many incoming connection requests
>at a time?
>
>Please don't bury yourself in that, but take a brief look around.
>
>WBR, Alexander Turenko.

-- 
Ilya Kosarev