From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: Ilya Kosarev <i.kosarev@tarantool.org>, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH] test: fix flaky socket test Date: Thu, 5 Dec 2019 22:31:17 +0100 [thread overview] Message-ID: <ec633a19-20d5-fc31-2c67-23d05e26e278@tarantool.org> (raw) In-Reply-To: <20191204134320.27760-1-i.kosarev@tarantool.org> Hi! Thanks for the patch! See 7 comments below. On 04/12/2019 14:43, Ilya Kosarev wrote: > socket.test had a number of flaky problems: > - socket readableness expectation > - race conditions on socket shutdown in emulation test cases > - tcp_server stability in socket receive inconsistent behavior case > Now they are solved. Port randomization is improved. > Socket test is not fragile anymore. > > Closes #4451, #4426, #4469 > --- > Branch: https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4426-4451-fix-socket-test > Issues: https://github.com/tarantool/tarantool/issues/4426 > https://github.com/tarantool/tarantool/issues/4451 > https://github.com/tarantool/tarantool/issues/4469 > > test/app/socket.result | 84 +++++++++++++++++++++++++--------------- > test/app/socket.test.lua | 60 +++++++++++++++++++--------- > test/app/suite.ini | 1 - > 3 files changed, 94 insertions(+), 51 deletions(-) > > diff --git a/test/app/socket.result b/test/app/socket.result > index fd299424c9..42dde8f375 100644 > --- a/test/app/socket.result > +++ b/test/app/socket.result > @@ -45,6 +45,9 @@ test_run:cmd("push filter '(error: .builtin/.*[.]lua):[0-9]+' to '\\1'") > WAIT_COND_TIME = 10 1. This is now unused. > --- > ... > +WAIT_TCP_CONNECT_TIME = 240 > +--- > +... > socket('PF_INET', 'SOCK_STREAM', 'tcp121222'); > --- > - null > @@ -107,7 +110,7 @@ s:nonblock(true) > --- > - true > ... > -s:readable(.1) > +s:readable(WAIT_TCP_CONNECT_TIME) > --- > - true > ... > @@ -227,7 +230,7 @@ s:syswrite(ffi.cast('const char *', ping), #ping) > --- > - 6 > ... > -s:readable(1) > +s:readable(WAIT_TCP_CONNECT_TIME) 2. I understand, why it is 'connect time' in the previous hunk. But here the socket is already connected. So it is not connect time. On the other hand, there are places right after connect, which don't use this timeout. For example, line 457 in socket.result file: sc:sysconnect('127.0.0.1', s:name().port) or errno() == errno.EINPROGRESS --- - true ... sc:writable(10) --- - true ... Here it is 10 seconds, while in other places you started using 240. > --- > - true > ... > @@ -830,7 +833,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, world') > --- > - 12 > ... > -s:readable(10) > +s:readable(WAIT_TCP_CONNECT_TIME) > --- > - true > ... 3. There is no connect time. This is DGRAM socket with UDP. No connections. The same below. Moreover, there are now 2 waits for 'connect time' on the same socket. > @@ -842,7 +845,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, world, 2') > --- > - 15 > ... > -s:readable(10) > +s:readable(WAIT_TCP_CONNECT_TIME) > --- > - true > ... > @@ -898,7 +901,7 @@ sc:sendto('127.0.0.1', s:name().port, 'Hello, World!') > --- > - 13 > ... > -s:readable(1) > +s:readable(WAIT_TCP_CONNECT_TIME) > --- > - true > ... > @@ -1074,6 +1077,15 @@ master:setsockopt('SOL_SOCKET', 'SO_REUSEADDR', true) > --- > - true > ... > +seed = '' > +--- > +... > +for d in string.gmatch(box.info.uuid, '%d') do seed = seed .. d end 4. Well, this is probably the most intricate way to get a seed. Have you considered fiber.time()? And why do you need it? Isn't it initialized already? > +--- > +... > +math.randomseed(tonumber(seed)) > +--- > +... > port = 32768 + math.random(0, 32767) > --- > ... > @@ -1092,7 +1104,7 @@ test_run:wait_cond(function() > return false, master:error() > end > return true > -end, WAIT_COND_TIME); > +end, 100); 5. Why? 10 seconds was not enough to try a few random ports? > --- > - true > ... > @@ -1822,8 +1834,14 @@ test_run:cmd("setopt delimiter ';'") > --- > - true > ... > +socket_opened = true > cfiber = fiber.create(function(sc, rch, wch) > - while sc:send(wch:get()) and rch:put(sc:receive("*l")) do end > + while socket_opened do > + sc:send(wch:get()) > + local data = sc:receive("*l") > + if not socket_opened then sc:close() end > + rch:put(data) > + end > end, sc, rch, wch); 6. What was a problem here? In the commit message you said 'race conditions'. Where? You just moved socket close to the background fiber from the main fiber. > --- > ... > @@ -1936,6 +1954,9 @@ c:receive("*l") > --- > - > ... > +socket_opened = false > +--- > +... > wch:put("Fu") > --- > - true > @@ -1944,10 +1965,6 @@ c:send("354 Please type your message\n") > --- > - 29 > ... > -sc:close() > ---- > -- 1 > -... > c:receive("*l", "Line: ") > --- > - null > @@ -2816,7 +2833,7 @@ test_run:cmd("clear filter") > --- > - true > ... > --- case: sicket receive inconsistent behavior > +-- case: socket receive inconsistent behavior > chan = fiber.channel() > --- > ... > @@ -2826,41 +2843,46 @@ counter = 0 > fn = function(s) counter = 0; while true do s:write((tostring(counter)):rep(chan:get())); counter = counter + 1 end end > --- > ... > -srv = socket.tcp_server('0.0.0.0', 8888, fn) > +srv = nil > --- > ... > -s = socket.connect('localhost', 8888) > +test_run:cmd("setopt delimiter ';'") > --- > +- true > ... > -chan:put(5) > +test_run:wait_cond(function() > + port = 32768 + math.random(0, 32767) > + srv = socket.tcp_server('0.0.0.0', port, fn) > + return srv ~= nil > +end, 100); > --- > - true > ... > -chan:put(5) > +receive1 = nil; receive2 = nil; > --- > -- true > ... > -s:receive(5) > +if srv ~= nil then > + s = socket.connect('localhost', port) > + chan:put(5) > + chan:put(5) > + receive1 = s:receive(5) > + chan:put(5) > + s:settimeout(1) > + receive2 = s:receive('*a') > + s:close() > + srv:close() > +end; 7. How is it possible, that the server still is nil here? At least one of 32k ports should have been free. > --- > -- '00000' > ... > -chan:put(5) > +test_run:cmd("setopt delimiter ''"); > --- > - true > ... > -s:settimeout(1) > +receive1 > --- > -- 1 > +- '00000' > ... > -s:receive('*a') > +receive2 > --- > - '1111122222' > ... > -s:close() > ---- > -- 1 > -... > -srv:close() > ---- > -- true > -...
next prev parent reply other threads:[~2019-12-05 21:31 UTC|newest] Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-12-04 13:43 Ilya Kosarev 2019-12-05 21:31 ` Vladislav Shpilevoy [this message] 2019-12-06 16:01 ` Ilya Kosarev 2019-12-08 15:52 ` Vladislav Shpilevoy 2019-12-10 14:36 ` Ilya Kosarev 2019-12-17 0:03 ` Vladislav Shpilevoy -- strict thread matches above, loose matches on Subject: below -- 2019-11-26 21:45 Ilya Kosarev 2019-11-26 1:19 Ilya Kosarev 2019-08-30 15:28 [tarantool-patches] " Ilya Kosarev
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=ec633a19-20d5-fc31-2c67-23d05e26e278@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=i.kosarev@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH] test: fix flaky socket test' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox