From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 9D4C32F234 for ; Sun, 23 Jun 2019 16:32:06 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id y-3A5dy0LINw for ; Sun, 23 Jun 2019 16:32:06 -0400 (EDT) Received: from smtpng1.m.smailru.net (smtpng1.m.smailru.net [94.100.181.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id E63B42EBBC for ; Sun, 23 Jun 2019 16:32:05 -0400 (EDT) Date: Sun, 23 Jun 2019 23:31:43 +0300 From: Alexander Turenko Subject: [tarantool-patches] Re: [PATCH] lua: return getaddrinfo() errors Message-ID: <20190623203141.snnjgyqrntr6okp4@tkn_work_nb> References: <20190615165829.11888-1-roman.habibov@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20190615165829.11888-1-roman.habibov@tarantool.org> Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: Roman Khabibov Cc: tarantool-patches@freelists.org Re C part of the commit (src/lib/core/coio_task.c). The origin of the (rc < 0) check is the following commit (marked relevant phrase with !!): commit ea1da04d5add51c308efb3fd2d71cdfabed8411c Author: Roman Tsisyk Date: Mon Dec 5 20:28:59 2016 +0300 Refactor coio_task * Add coio_task_create() and coio_task_destroy() * Rename coio_task() to coio_task_post() * Fix the core invariant of coio_task_post(): - On timeout or when fiber was cancelled set diag, return -1 and guarantee that on_timeout will be called somewhen. - Otherwise don't touch diag, don't call on_timeout callback, return 0. Please check task->base.result and task->diag to get the original return code and diag from the callback. * Change the return value of coio_task_post() to "int". * Add diag to coio_getaddrinfo() and fix a possible bug in replication; !! ignore uncoventional getaddrinfo(3) error codes for now. * Fix buggy box.snapshot() tests. Needed for #1954 I don't find any reason to ignoring 'unconventional' errors. getaddrinfo() returns negative values on Linux, however it is not so on Mac OS. Values and corresponding messages can be checked like so: $ gcc -Wall -Wextra -x c <(echo -e '#include \n#include \n#include \n#include \nint main() { printf("%d\\n", EAI_NONAME); return 0; }') && ./a.out; rm a.out $ gcc -Wall -Wextra -x c <(echo -e '#include \n#include \n#include \n#include \nint main() { printf("%s\\n", gai_strerror(EAI_NONAME)); return 0; }') && ./a.out; rm a.out The commit message should mention Roman's commit and cleanly state that (rc < 0) assumption does not work on Mac OS. Aside of that there is nothing bad in calling gai_strerror() with some unusual value, it just return "Unknown error" (I checked both Linux and Mac OS). I would add relevant test case for coio_getaddrinfo() into test/unit/coio.cc. Consider comments re Lua part of the commit below. > Before this patch, branch when coio_getaddrinfo() returns > getaddrinfo() errors has never reached. Add this errors into the > socket and thenet.box modules. Typo: thenet.box -> net.box. Aside that I would add more investigation results and contracts change (see other comments in this email). > > Closes #4138 > --- > > Branch: https://github.com/tarantool/tarantool/tree/romanhabibov/gh-4138-getaddrinfo > Issue: https://github.com/tarantool/tarantool/issues/4138 > diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c > index 908b336ed..83f669d05 100644 > --- a/src/lib/core/coio_task.c > +++ b/src/lib/core/coio_task.c > @@ -413,7 +413,7 @@ coio_getaddrinfo(const char *host, const char *port, > return -1; /* timed out or cancelled */ > > /* Task finished */ > - if (task->rc < 0) { > + if (task->rc != 0) { See comments at the start of the email. > diff --git a/src/lua/socket.c b/src/lua/socket.c > index 130378caf..5a8469ddf 100644 > --- a/src/lua/socket.c > +++ b/src/lua/socket.c > @@ -54,6 +54,7 @@ > #include > #include "lua/utils.h" > #include "lua/fiber.h" > +#include "reflection.h" It is only needed to deference 'err->type', so I would remove it with this check. > > extern int coio_wait(int fd, int event, double timeout); > > @@ -816,6 +817,11 @@ lbox_socket_getaddrinfo(struct lua_State *L) > > if (dns_res != 0) { > lua_pushnil(L); > + struct error *err = diag_get()->last; > + if (strcmp(err->type->name, "SystemError") == 0) { I don't think that we should check a type here. Why not report any error as is? After that the convention for lbox_socket_getaddrinfo() will be simple: it returns a table of results when successul; `nil`, err_msg otherwise. Now it can return just `nil` or `nil`, err_msg -- the contract is more complex. Lua's code will be simpler if we'll simplify this contract. I suggest to add a comment (in free form, but other reviewers can enforce specific format) to the function that shows it contract explicitly, like so: return <...> at success, otherwise return <...>. > diff --git a/src/lua/socket.lua b/src/lua/socket.lua > index 2dba0a8d2..f0b432925 100644 > --- a/src/lua/socket.lua > +++ b/src/lua/socket.lua > @@ -1028,11 +1028,14 @@ local function tcp_connect(host, port, timeout) > end > local timeout = timeout or TIMEOUT_INFINITY > local stop = fiber.clock() + timeout > - local dns = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM', > + local dns, err = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM', > protocol = 'tcp' }) > if dns == nil or #dns == 0 then > - boxerrno(boxerrno.EINVAL) > - return nil > + if not err then > + boxerrno(boxerrno.EINVAL) > + return nil > + end > + return nil, err Here we change the contract of socket.getaddrinfo() -- the user visible function. I would mention that in the commit message. > diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua > index dab168f90..140baf22f 100644 > --- a/test/app/socket.test.lua > +++ b/test/app/socket.test.lua > @@ -301,7 +301,7 @@ sc:close() > -- tcp_connect > > -- test timeout > -socket.tcp_connect('127.0.0.1', 80, 0.00000000001) > +socket.tcp_connect('127.0.0.1', 80, 0.00000000000001) It seems this change is not related to your commit. > +--gh-4138 Check getaddrinfo() error. > + > +test_run:cmd("setopt delimiter ';'") > +function check_err(err) > + if err == 'getaddrinfo: nodename nor servname provided, or not known' or > + err == 'getaddrinfo: Servname not supported for ai_socktype' or > + err == 'getaddrinfo: Name or service not known' then > + return true > + end > + return false > +end; Flush the delimiter to '' here. Please, comment (outside of semicolon-delimiter block) that these messages corresponds to the following codes: * EAI_NONAME (Mac OS) * EAI_SERVICE (Linux and Mac OS) * EAI_NONAME (Linux) It seems that you reveice either EAI_NONAME or EAI_SERVICE, not? Please, elaborate. I guess it worth to see which certain error is returned by getaddrinfo() here and check only for it (but consider difference between Linux and Mac OS gai_strerror() messages if it is EAI_NONAME). > + > +s, err = socket:connect('hostname:3301'); I would use less generic hostname like 'non_exists_hostname'. > diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua > index bea43002d..13ed4e200 100644 > --- a/test/box/net.box.test.lua > +++ b/test/box/net.box.test.lua > @@ -1538,4 +1538,19 @@ test_run:grep_log('default', '00000020:.*') > test_run:grep_log('default', '00000030:.*') > test_run:grep_log('default', '00000040:.*') > > +--gh-4138 Check getaddrinfo() error. > +test_run:cmd("setopt delimiter ';'") > + > +function check_err(err) > + if err == 'getaddrinfo: nodename nor servname provided, or not known' or > + err == 'getaddrinfo: Servname not supported for ai_socktype' or > + err == 'getaddrinfo: Name or service not known' then > + return true > + end > + return false > +end; The same comments as above are applicable for this test case too.