From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 5AFCA25973 for ; Mon, 6 Aug 2018 10:27:42 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HYhCwp1kyFYr for ; Mon, 6 Aug 2018 10:27:42 -0400 (EDT) Received: from smtp50.i.mail.ru (smtp50.i.mail.ru [94.100.177.110]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 15A3B24711 for ; Mon, 6 Aug 2018 10:27:42 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH] Fix lua socket polling in case of a spurious wakeup References: <859b06efe0827949f49623ccec7bd4e4738683ac.1533224602.git.georgy@tarantool.org> From: Vladislav Shpilevoy Message-ID: <33a8cd8c-c295-2224-605a-7b3bbb5e72da@tarantool.org> Date: Mon, 6 Aug 2018 17:27:39 +0300 MIME-Version: 1.0 In-Reply-To: <859b06efe0827949f49623ccec7bd4e4738683ac.1533224602.git.georgy@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org, Georgy Kirichenko Hi! See 7 comments below. Last 6 of them I've fixed on the branch in a separate commit. Please, look, squash and fix the first comment yourself. On 02/08/2018 18:44, Georgy Kirichenko wrote: > socket_writable/socket_readable handles socket.iowait spurious wakeup > until event is happened or timeout is exceeded. > > Fixes #3344 > --- > Issue: https://github.com/tarantool/tarantool/issues/3344 > Branch: https://github.com/tarantool/tarantool/tree/g.kirichenko/gh-3344-socket-io-spurios-wakeup 1. The test constantly fails: ====================================================================================== WORKR TEST PARAMS RESULT --------------------------------------------------------------------------------- [001] app/socket.test.lua [ fail ] [001] [001] Test failed! Result content mismatch: [001] --- app/socket.result Mon Aug 6 16:36:23 2018 [001] +++ app/socket.reject Mon Aug 6 17:06:23 2018 [001] @@ -2294,7 +2294,7 @@ [001] ... [001] client:read(1, 0.1) == '' [001] --- [001] -- true [001] +- false [001] ... [001] server:close() [001] --- I have tried to fix it and it looks like socket close event is not triggered or checked in socket_read() function. Please, fix. > src/lua/socket.lua | 67 ++++++++++++++++++++------------------ > test/app/socket.result | 70 ++++++++++++++++++++++++++++++++++++++++ > test/app/socket.test.lua | 29 +++++++++++++++++ > 3 files changed, 135 insertions(+), 31 deletions(-) > > diff --git a/src/lua/socket.lua b/src/lua/socket.lua > index 06306eae2..9fcaee0e8 100644 > --- a/src/lua/socket.lua > +++ b/src/lua/socket.lua > @@ -328,12 +328,18 @@ local function do_wait(self, what, timeout) > self._errno = nil > timeout = timeout or TIMEOUT_INFINITY > > - local res = internal.iowait(fd, what, timeout) > - if res == 0 then > - self._errno = boxerrno.ETIMEDOUT > - return 0 > + local deadline = fiber.clock() + timeout > + local events > + > + while fiber.clock() <= deadline do > + events = internal.iowait(fd, what, deadline - fiber.clock()) > + fiber.testcancel() > + if events ~= 0 and events ~= '' then> + return events > + end > end > - return res > + self._errno = boxerrno.ETIMEDOUT > + return events 2. Is it possible that you have an event after timeout? I believe, events here is always 0. So you can return 0, and put 'local events' declaration inside the 'while' loop above. > end > > local function socket_readable(self, timeout) > @@ -935,16 +934,19 @@ local function socket_tcp_connect(s, address, port, timeout) > -- Wait until the connection is established or ultimately fails. > -- In either condition the socket becomes writable. To tell these > -- conditions appart SO_ERROR must be consulted (man connect). > - if socket_writable(s, timeout) then > - s._errno = socket_getsockopt(s, 'SOL_SOCKET', 'SO_ERROR') > - else > - s._errno = boxerrno.ETIMEDOUT > - end > - if s._errno ~= 0 then > - return nil > + local deadline = timeout + fiber.clock() > + while deadline - fiber.clock() >= 0 do 3. This 'while' can be removed since the wait is done inside socket_writable. > + if socket_writable(s, deadline - fiber.clock()) then > + s._errno = socket_getsockopt(s, 'SOL_SOCKET', 'SO_ERROR') > + if s._errno == 0 then > + return true > + else > + return nil > + end > + end > end > - -- Connected > - return true > + s._errno = boxerrno.ETIMEDOUT 4. Socket_writable already sets errno. > + return nil > end > > local function tcp_connect(host, port, timeout) > @@ -1005,7 +1007,8 @@ end > local function tcp_server_loop(server, s, addr) > fiber.name(format("%s/%s:%s", server.name, addr.host, addr.port), {truncate = true}) > log.info("started") > - while socket_readable(s) do > + while true do > + socket_readable(s) 5. If the socket is not writable you should not accept anything. The previous version was right. > local sc, from = socket_accept(s) > if sc == nil then > local errno = s._errno > @@ -1325,7 +1328,8 @@ end > local function lsocket_tcp_accept(self) > check_socket(self) > local deadline = fiber.clock() + (self.timeout or TIMEOUT_INFINITY) > - repeat > + while deadline - fiber.clock() >= 0 do > + socket_readable(self, deadline - fiber.clock()) 6. The same. If the socket is not writable, here is nothing to accept. > local client = socket_accept(self) > if client then > setmetatable(client, lsocket_tcp_client_mt) > @@ -1390,7 +1394,8 @@ local function lsocket_tcp_receive(self, pattern, prefix) > elseif pattern == "*a" then > local result = { prefix } > local deadline = fiber.clock() + (self.timeout or TIMEOUT_INFINITY) > - repeat > + while deadline - fiber.clock() >= 0 do > + socket_readable(self, deadline - fiber.clock()) 7. The same. > local data = socket_sysread(self) > if data == nil then > if not errno_is_transient[self._errno] then