* [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
@ 2019-11-23 14:50 Ilya Kosarev
2019-11-25 15:31 ` Alexander Turenko
0 siblings, 1 reply; 3+ messages in thread
From: Ilya Kosarev @ 2019-11-23 14:50 UTC (permalink / raw)
To: tarantool-patches
For some tests, for example, replication/box_set_replication_stress,
socket.tcp_connect() in test_run:cmd() might sometimes fail when
running under high load. Now it is fixed.
Closes #193
---
https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
https://github.com/tarantool/test-run/issues/193
test_run.lua | 3 +++
1 file changed, 3 insertions(+)
diff --git a/test_run.lua b/test_run.lua
index 63dfdef..0d450bd 100644
--- a/test_run.lua
+++ b/test_run.lua
@@ -11,6 +11,9 @@ local clock = require('clock')
 local function cmd(self, msg)
     local sock = socket.tcp_connect(self.host, self.port)
+    while sock == nil do
+        sock = socket.tcp_connect(self.host, self.port)
+    end
     local data = msg .. '\n'
     sock:send(data)
--
2.17.1
* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
2019-11-23 14:50 [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd() Ilya Kosarev
@ 2019-11-25 15:31 ` Alexander Turenko
2019-11-26 0:19 ` Ilya Kosarev
0 siblings, 1 reply; 3+ messages in thread
From: Alexander Turenko @ 2019-11-25 15:31 UTC (permalink / raw)
To: Ilya Kosarev; +Cc: tarantool-patches
On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
> For some tests, for example, replication/box_set_replication_stress,
> socket.tcp_connect() in test_run:cmd() might sometimes fail when
> running under high load. Now it is fixed.
>
> Closes #193
> ---
> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
> https://github.com/tarantool/test-run/issues/193
>
> test_run.lua | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/test_run.lua b/test_run.lua
> index 63dfdef..0d450bd 100644
> --- a/test_run.lua
> +++ b/test_run.lua
> @@ -11,6 +11,9 @@ local clock = require('clock')
>
>  local function cmd(self, msg)
>      local sock = socket.tcp_connect(self.host, self.port)
> +    while sock == nil do
> +        sock = socket.tcp_connect(self.host, self.port)
> +    end
>      local data = msg .. '\n'
>      sock:send(data)
I'm wary of a possibly infinite loop. I know test-run will fail a
hung test, but it would be better to fail gracefully and provide some
information about the error (does socket.tcp_connect return anything
about it?)
Let's consider using wait_cond or, better, setting a connection timeout
(if possible).
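To illustrate the bounded alternative suggested here, below is a minimal sketch of a retry loop with a deadline that reports the last error instead of spinning forever. The helper name connect_with_deadline and its parameters are hypothetical and not part of the patch. The connect, clock and sleep functions are passed in so the sketch runs in plain Lua; inside Tarantool they could presumably be wired to socket.tcp_connect, clock.monotonic and fiber.sleep, which is an assumption rather than something this thread confirms.

```lua
-- Hypothetical helper (not part of the patch): retry `connect` until it
-- succeeds or `timeout` has elapsed, then fail with a descriptive error.
-- `connect` must return a socket on success, or nil plus an optional error.
-- `now` returns the current time; `sleep(delay)` pauses between attempts.
local function connect_with_deadline(connect, now, sleep, timeout, delay)
    local deadline = now() + timeout
    local attempts = 0
    local last_err
    repeat
        attempts = attempts + 1
        local sock, err = connect()
        if sock ~= nil then
            return sock
        end
        last_err = err
        -- Back off instead of busy-looping, so the peer has a chance
        -- to release resources before the next attempt.
        sleep(delay)
    until now() >= deadline
    return nil, string.format('connect failed after %d attempts: %s',
                              attempts, tostring(last_err))
end

-- Usage with a fake clock and a connect stub that succeeds on the third call.
local t = 0
local calls = 0
local function fake_now() return t end
local function fake_sleep(d) t = t + d end
local function flaky_connect()
    calls = calls + 1
    if calls < 3 then
        return nil, 'Too many open files'
    end
    return {fd = 42}
end

local sock, err = connect_with_deadline(flaky_connect, fake_now, fake_sleep, 10, 1)
```

With this shape the caller can log the returned error and fail the command, rather than relying on test-run to kill a hung test.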
It is also interesting what the reason for the connection error is in
the first place. Is test-run actually listening at that moment? Maybe it
is unable to process many incoming connection requests at a time?
Please don't bury yourself in that, but look around briefly.
WBR, Alexander Turenko.
* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
2019-11-25 15:31 ` Alexander Turenko
@ 2019-11-26 0:19 ` Ilya Kosarev
0 siblings, 0 replies; 3+ messages in thread
From: Ilya Kosarev @ 2019-11-26 0:19 UTC (permalink / raw)
To: Alexander Turenko; +Cc: tarantool-patches
Hi!
Thanks for your review.
The real reason socket.tcp_connect returns nil in this case
is the Linux open files limit:
2019-11-26 02:34:49.426 [12405] main/202/console/unix/: test_run.lua:15 E> test_run:cmd
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.c:925 !> accept(6): Too many open files
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.lua:1090 E> accept(fd 6, aka unix/:/home/kosarev/tarantool/test/var/004_replication/master_quorum1.socket-admin) failed: Too many open files
2019-11-26 02:34:49.427 [12405] main/202/console/unix/: test_run.lua:18 E> sock == nil
I have sent v2 of the patch, which takes the mentioned drawbacks into account.
>Monday, 25 November 2019, 18:31 +03:00 from Alexander Turenko <alexander.turenko@tarantool.org>:
>
>On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
>> For some tests, for example, replication/box_set_replication_stress,
>> socket.tcp_connect() in test_run:cmd() might sometimes fail when
>> running under high load. Now it is fixed.
>>
>> Closes #193
>> ---
>> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
>> https://github.com/tarantool/test-run/issues/193
>>
>> test_run.lua | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/test_run.lua b/test_run.lua
>> index 63dfdef..0d450bd 100644
>> --- a/test_run.lua
>> +++ b/test_run.lua
>> @@ -11,6 +11,9 @@ local clock = require('clock')
>>
>>  local function cmd(self, msg)
>>      local sock = socket.tcp_connect(self.host, self.port)
>> +    while sock == nil do
>> +        sock = socket.tcp_connect(self.host, self.port)
>> +    end
>>      local data = msg .. '\n'
>>      sock:send(data)
>
>I'm wary of a possibly infinite loop. I know test-run will fail a
>hung test, but it would be better to fail gracefully and provide some
>information about the error (does socket.tcp_connect return anything
>about it?)
>
>Let's consider using wait_cond or, better, setting a connection timeout
>(if possible).
>
>It is also interesting what the reason for the connection error is in
>the first place. Is test-run actually listening at that moment? Maybe it
>is unable to process many incoming connection requests at a time?
>
>Please don't bury yourself in that, but look around briefly.
>
>WBR, Alexander Turenko.
--
Ilya Kosarev