Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Ilya Kosarev <i.kosarev@tarantool.org>,
	tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] test: fix flaky socket test
Date: Thu, 5 Dec 2019 22:31:17 +0100	[thread overview]
Message-ID: <ec633a19-20d5-fc31-2c67-23d05e26e278@tarantool.org> (raw)
In-Reply-To: <20191204134320.27760-1-i.kosarev@tarantool.org>

Hi! Thanks for the patch!

See 7 comments below.

On 04/12/2019 14:43, Ilya Kosarev wrote:
> socket.test had a number of flaky problems:
> - socket readableness expectation
> - race conditions on socket shutdown in emulation test cases
> - tcp_server stability in socket receive inconsistent behavior case
> Now they are solved. Port randomization is improved.
> Socket test is not fragile anymore.
> 
> Closes #4451, #4426, #4469
> ---
> Branch: https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4426-4451-fix-socket-test
> Issues: https://github.com/tarantool/tarantool/issues/4426
>         https://github.com/tarantool/tarantool/issues/4451
>         https://github.com/tarantool/tarantool/issues/4469
> 
>  test/app/socket.result   | 84 +++++++++++++++++++++++++---------------
>  test/app/socket.test.lua | 60 +++++++++++++++++++---------
>  test/app/suite.ini       |  1 -
>  3 files changed, 94 insertions(+), 51 deletions(-)
> 
> diff --git a/test/app/socket.result b/test/app/socket.result
> index fd299424c9..42dde8f375 100644
> --- a/test/app/socket.result
> +++ b/test/app/socket.result
> @@ -45,6 +45,9 @@ test_run:cmd("push filter '(error: .builtin/.*[.]lua):[0-9]+' to '\\1'")
>  WAIT_COND_TIME = 10

1. This is now unused.

>  ---
>  ...
> +WAIT_TCP_CONNECT_TIME = 240
> +---
> +...
>  socket('PF_INET', 'SOCK_STREAM', 'tcp121222');
>  ---
>  - null
> @@ -107,7 +110,7 @@ s:nonblock(true)
>  ---
>  - true
>  ...
> -s:readable(.1)
> +s:readable(WAIT_TCP_CONNECT_TIME)
>  ---
>  - true
>  ...
> @@ -227,7 +230,7 @@ s:syswrite(ffi.cast('const char *', ping), #ping)
>  ---
>  - 6
>  ...
> -s:readable(1)
> +s:readable(WAIT_TCP_CONNECT_TIME)

2. I understand, why it is 'connect time' in the previous
hunk. But here the socket is already connected. So it is
not connect time.

On the other hand, there are places right after connect,
which don't use this timeout. For example, line 457
in socket.result file:

    sc:sysconnect('127.0.0.1', s:name().port) or errno() == errno.EINPROGRESS
    ---
    - true
    ...
    sc:writable(10)
    ---
    - true
    ...

Here it is 10 seconds, while in other places you started
using 240.

>  ---
>  - true
>  ...
> @@ -830,7 +833,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, world')
>  ---
>  - 12
>  ...
> -s:readable(10)
> +s:readable(WAIT_TCP_CONNECT_TIME)
>  ---
>  - true
>  ...

3. There is no connect time. This is DGRAM socket with UDP.
No connections.

The same below. Moreover, there are now 2 waits for
'connect time' on the same socket.

> @@ -842,7 +845,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, world, 2')
>  ---
>  - 15
>  ...
> -s:readable(10)
> +s:readable(WAIT_TCP_CONNECT_TIME)
>  ---
>  - true
>  ...
> @@ -898,7 +901,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, World!')
>  ---
>  - 13
>  ...
> -s:readable(1)
> +s:readable(WAIT_TCP_CONNECT_TIME)
>  ---
>  - true
>  ...
> @@ -1074,6 +1077,15 @@ master:setsockopt('SOL_SOCKET', 'SO_REUSEADDR', true)
>  ---
>  - true
>  ...
> +seed = ''
> +---
> +...
> +for d in string.gmatch(box.info.uuid, '%d') do  seed = seed .. d end

4. Well, this is probably the most intricate way to
get a seed. Have you considered fiber.time()? And why do
you need it? Isn't it initialized already?

> +---
> +...
> +math.randomseed(tonumber(seed))
> +---
> +...
>  port = 32768 + math.random(0, 32767)
>  ---
>  ...
> @@ -1092,7 +1104,7 @@ test_run:wait_cond(function()
>          return false, master:error()
>      end
>      return true
> -end, WAIT_COND_TIME);
> +end, 100);

5. Why? 10 seconds was not enough to try a few random
ports?

>  ---
>  - true
>  ...
> @@ -1822,8 +1834,14 @@ test_run:cmd("setopt delimiter ';'")
>  ---
>  - true
>  ...
> +socket_opened = true
>  cfiber = fiber.create(function(sc, rch, wch)
> -    while sc:send(wch:get()) and rch:put(sc:receive("*l")) do end
> +    while socket_opened do
> +        sc:send(wch:get())
> +        local data = sc:receive("*l")
> +        if not socket_opened then sc:close() end
> +        rch:put(data)
> +    end
>  end, sc, rch, wch);

6. What was a problem here? In the commit message you said
'race conditions'. Where? You just moved socket close to
the background fiber from the main fiber.

>  ---
>  ...
> @@ -1936,6 +1954,9 @@ c:receive("*l")
>  ---
>  - 
>  ...
> +socket_opened = false
> +---
> +...
>  wch:put("Fu")
>  ---
>  - true
> @@ -1944,10 +1965,6 @@ c:send("354 Please type your message\n")
>  ---
>  - 29
>  ...
> -sc:close()
> ----
> -- 1
> -...
>  c:receive("*l", "Line: ")
>  ---
>  - null
> @@ -2816,7 +2833,7 @@ test_run:cmd("clear filter")
>  ---
>  - true
>  ...
> --- case: sicket receive inconsistent behavior
> +-- case: socket receive inconsistent behavior
>  chan = fiber.channel()
>  ---
>  ...
> @@ -2826,41 +2843,46 @@ counter = 0
>  fn = function(s) counter = 0; while true do s:write((tostring(counter)):rep(chan:get())); counter = counter + 1 end end
>  ---
>  ...
> -srv = socket.tcp_server('0.0.0.0', 8888, fn)
> +srv = nil
>  ---
>  ...
> -s = socket.connect('localhost', 8888)
> +test_run:cmd("setopt delimiter ';'")
>  ---
> +- true
>  ...
> -chan:put(5)
> +test_run:wait_cond(function()
> +    port = 32768 + math.random(0, 32767)
> +    srv = socket.tcp_server('0.0.0.0', port, fn)
> +    return srv ~= nil
> +end, 100);
>  ---
>  - true
>  ...
> -chan:put(5)
> +receive1 = nil; receive2 = nil;
>  ---
> -- true
>  ...
> -s:receive(5)
> +if srv ~= nil then
> +    s = socket.connect('localhost', port)
> +    chan:put(5)
> +    chan:put(5)
> +    receive1 = s:receive(5)
> +    chan:put(5)
> +    s:settimeout(1)
> +    receive2 = s:receive('*a')
> +    s:close()
> +    srv:close()
> +end;

7. How is it possible, that the server still is nil here?
At least one of 32k ports should have been free.

>  ---
> -- '00000'
>  ...
> -chan:put(5)
> +test_run:cmd("setopt delimiter ''");
>  ---
>  - true
>  ...
> -s:settimeout(1)
> +receive1
>  ---
> -- 1
> +- '00000'
>  ...
> -s:receive('*a')
> +receive2
>  ---
>  - '1111122222'
>  ...
> -s:close()
> ----
> -- 1
> -...
> -srv:close()
> ----
> -- true
> -...

  reply	other threads:[~2019-12-05 21:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-04 13:43 Ilya Kosarev
2019-12-05 21:31 ` Vladislav Shpilevoy [this message]
2019-12-06 16:01   ` Ilya Kosarev
2019-12-08 15:52     ` Vladislav Shpilevoy
2019-12-10 14:36       ` Ilya Kosarev
2019-12-17  0:03         ` Vladislav Shpilevoy
  -- strict thread matches above, loose matches on Subject: below --
2019-11-26 21:45 Ilya Kosarev
2019-11-26  1:19 Ilya Kosarev
2019-08-30 15:28 [tarantool-patches] " Ilya Kosarev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec633a19-20d5-fc31-2c67-23d05e26e278@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=i.kosarev@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] test: fix flaky socket test' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox