Tarantool development patches archive
* [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
@ 2019-11-23 14:50 Ilya Kosarev
  2019-11-25 15:31 ` Alexander Turenko
From: Ilya Kosarev @ 2019-11-23 14:50 UTC
  To: tarantool-patches

For some tests, such as replication/box_set_replication_stress,
socket.tcp_connect() in test_run:cmd() can occasionally fail when
running under high load. Retry the connection until it succeeds.

Closes #193
---
https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
https://github.com/tarantool/test-run/issues/193

 test_run.lua | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test_run.lua b/test_run.lua
index 63dfdef..0d450bd 100644
--- a/test_run.lua
+++ b/test_run.lua
@@ -11,6 +11,9 @@ local clock = require('clock')
 
 local function cmd(self, msg)
     local sock = socket.tcp_connect(self.host, self.port)
+    while sock == nil do
+        sock = socket.tcp_connect(self.host, self.port)
+    end
     local data = msg .. '\n'
     sock:send(data)
 
-- 
2.17.1

* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
  2019-11-23 14:50 [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd() Ilya Kosarev
@ 2019-11-25 15:31 ` Alexander Turenko
  2019-11-26  0:19   ` Ilya Kosarev
From: Alexander Turenko @ 2019-11-25 15:31 UTC
  To: Ilya Kosarev; +Cc: tarantool-patches

On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
> For some tests, for example, replication/box_set_replication_stress,
> socket.tcp_connect() in test_run:cmd() might sometimes fail when
> running under high load. Now it is fixed.
> 
> Closes #193
> ---
> https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
> https://github.com/tarantool/test-run/issues/193
> 
>  test_run.lua | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/test_run.lua b/test_run.lua
> index 63dfdef..0d450bd 100644
> --- a/test_run.lua
> +++ b/test_run.lua
> @@ -11,6 +11,9 @@ local clock = require('clock')
>  
>  local function cmd(self, msg)
>      local sock = socket.tcp_connect(self.host, self.port)
> +    while sock == nil do
> +        sock = socket.tcp_connect(self.host, self.port)
> +    end
>      local data = msg .. '\n'
>      sock:send(data)

I'm wary of a potentially infinite loop. I know test-run will fail a
hung test, but it would be better to fail gracefully and provide some
information about the error (does socket.tcp_connect return anything
about it?)

Let's consider using wait_cond or, better, setting a connection timeout
(if possible).
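
A minimal sketch of the bounded-retry idea, not the v2 patch itself. The
helper name connect_with_retry and its options (attempts, delay, sleep)
are hypothetical; the connect function is injected so that the sketch
runs in plain Lua without Tarantool.

```lua
-- Hypothetical helper, not part of test-run: try to connect a bounded
-- number of times and report the last error instead of looping forever.
--
-- connect:       function() -> sock | nil, err (injected by the caller)
-- opts.attempts: maximum number of tries (default 10)
-- opts.delay:    pause between tries, in seconds (default 0.1)
-- opts.sleep:    function(seconds) called between tries (default: no-op)
local function connect_with_retry(connect, opts)
    opts = opts or {}
    local attempts = opts.attempts or 10
    local delay = opts.delay or 0.1
    local sleep = opts.sleep or function() end
    local last_err
    for i = 1, attempts do
        local sock, err = connect()
        if sock ~= nil then
            return sock
        end
        last_err = err or 'unknown error'
        -- Do not sleep after the final failed attempt.
        if i < attempts then
            sleep(delay)
        end
    end
    return nil, string.format('connect failed after %d attempts: %s',
                              attempts, tostring(last_err))
end

-- Inside Tarantool this might be wired up roughly as follows
-- (an assumption, not tested here):
--
-- local sock, err = connect_with_retry(function()
--     local s = socket.tcp_connect(self.host, self.port)
--     if s == nil then
--         return nil, errno.strerror()
--     end
--     return s
-- end, {sleep = fiber.sleep})
-- if sock == nil then
--     error(err)
-- end
```

With this shape the caller gets a descriptive error once the attempts
are exhausted, so a persistent failure shows up as a readable message
rather than as a hung test.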

It would also be interesting to know the reason for the connection
error in the first place. Is test-run actually listening at that
moment? Maybe it is unable to process many incoming connection requests
at a time?

Please don't bury yourself in that, but look around briefly.

WBR, Alexander Turenko.

* Re: [Tarantool-patches] [PATCH] Stabilize tcp_connect in test_run:cmd()
  2019-11-25 15:31 ` Alexander Turenko
@ 2019-11-26  0:19   ` Ilya Kosarev
From: Ilya Kosarev @ 2019-11-26  0:19 UTC
  To: Alexander Turenko; +Cc: tarantool-patches

Hi!

Thanks for your review.

The real reason socket.tcp_connect returns nil in this case is the
Linux "open files limit":
2019-11-26 02:34:49.426 [12405] main/202/console/unix/: test_run.lua:15 E> test_run:cmd
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.c:925 !> accept(6): Too many open files
2019-11-26 02:34:49.427 [12405] main/103/console/unix/:/home/kosarev/tara socket.lua:1090 E> accept(fd 6, aka unix/:/home/kosarev/tarantool/test/var/004_replication/master_quorum1.socket-admin) failed: Too many open files
2019-11-26 02:34:49.427 [12405] main/202/console/unix/: test_run.lua:18 E> sock == nil

I have sent v2 of the patch, which addresses the drawbacks mentioned.


>Monday, 25 November 2019, 18:31 +03:00 from Alexander Turenko <alexander.turenko@tarantool.org>:
>
>On Sat, Nov 23, 2019 at 05:50:12PM +0300, Ilya Kosarev wrote:
>> For some tests, for example, replication/box_set_replication_stress,
>> socket.tcp_connect() in test_run:cmd() might sometimes fail when
>> running under high load. Now it is fixed.
>> 
>> Closes #193
>> ---
>>  https://github.com/tarantool/test-run/tree/i.kosarev/gh-193-stabilize-test-run-cmd
>>  https://github.com/tarantool/test-run/issues/193
>> 
>>  test_run.lua | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/test_run.lua b/test_run.lua
>> index 63dfdef..0d450bd 100644
>> --- a/test_run.lua
>> +++ b/test_run.lua
>> @@ -11,6 +11,9 @@ local clock = require('clock')
>> 
>>  local function cmd(self, msg)
>>      local sock = socket.tcp_connect(self.host, self.port)
>> +    while sock == nil do
>> +        sock = socket.tcp_connect(self.host, self.port)
>> +    end
>>      local data = msg .. '\n'
>>      sock:send(data)
>
>I'm wary of a potentially infinite loop. I know test-run will fail a
>hung test, but it would be better to fail gracefully and provide some
>information about the error (does socket.tcp_connect return anything
>about it?)
>
>Let's consider using wait_cond or, better, setting a connection timeout
>(if possible).
>
>It would also be interesting to know the reason for the connection
>error in the first place. Is test-run actually listening at that
>moment? Maybe it is unable to process many incoming connection requests
>at a time?
>
>Please don't bury yourself in that, but look around briefly.
>
>WBR, Alexander Turenko.


-- 
Ilya Kosarev
