Begin forwarded message:

From: Roman Khabibov <roman.habibov@tarantool.org>
Subject: Re: [tarantool-patches] [server-dev] Re: Re: [PATCH v2 1/2] lua: return getaddrinfo() errors
Date: December 26, 2019 at 20:29:16 GMT+3
To: Sergey Ostanevich <sergos@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org, Alexander Turenko <alexander.turenko@tarantool.org>

Hi! Alexander, thanks for the review. Sergos, can you, please, do a second review?

https://github.com/tarantool/tarantool/tree/romanhabibov/gh-4138-getaddrinfo-full-ci

On Dec 23, 2019, at 16:36, Alexander Turenko <alexander.turenko@tarantool.org> wrote:

Comments below are trivial. Let's fix and proceed with the next reviewer
(Sergey O.). Please, resend the patchset for him.

I still think you should run CI on all targets (push to ...-full-ci
branch).

LGTM except minor comments below.

WBR, Alexander Turenko.

@@ -1360,8 +1363,12 @@ local function lsocket_tcp_connect(self, host, port)
   -- This function is broken by design
   local ga_opts = { family = 'AF_INET', type = 'SOCK_STREAM' }
   local timeout = deadline - fiber.clock()
-    local dns = getaddrinfo(host, port, timeout, ga_opts)
-    if dns == nil or #dns == 0 then
+    local dns, err = getaddrinfo(host, port, timeout, ga_opts)
+    if dns == nil then
+        self._errno = boxerrno()

Master's socket _errno was set to EINVAL before this patch. Let's keep
this behaviour, but change the 2nd return value.

I mean:

| if dns == nil then
|     self._errno = boxerrno.EINVAL
|     return nil, err
| end
| if #dns == 0 then
|     self._errno = boxerrno.EINVAL
|     return nil, socket_error(self)
| end
@@ -1360,8 +1363,12 @@ local function lsocket_tcp_connect(self, host, port)
    -- This function is broken by design
    local ga_opts = { family = 'AF_INET', type = 'SOCK_STREAM' }
    local timeout = deadline - fiber.clock()
-    local dns = getaddrinfo(host, port, timeout, ga_opts)
-    if dns == nil or #dns == 0 then
+    local dns, err = getaddrinfo(host, port, timeout, ga_opts)
+    if dns == nil then
+        self._errno = boxerrno.EINVAL
+        return nil, err
+    end
+    if #dns == 0 then
        self._errno = boxerrno.EINVAL
        return nil, socket_error(self)
    end

+        return nil, err
+    end
+    if #dns == 0 then
       self._errno = boxerrno.EINVAL
       return nil, socket_error(self)
   end

diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
index c72d41763..7086e496b 100644
--- a/test/app/socket.test.lua
+++ b/test/app/socket.test.lua
@@ -960,6 +968,27 @@ server:close()

test_run:cmd("clear filter")

+-- gh-4138 Check getaddrinfo() error from socket:connect() only.
+-- Error code and error message returned by getaddrinfo() depends
+-- on system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' then
+        return true
+    end
+    return false
+end;

You added the following error message to coio_getaddrinfo() unit test in
the first commit:

| const char *exp_errmsg_4 = "getaddrinfo: hostname nor servname provided"
|         ", or not known";

So it worth to add it here too. We'll enable the test of FreeBSD sooner
or later.

Also I got EAI_AGAIN on my system now (maybe something was changed on
DNS server side). Let's add it too (for both commits).
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' or
+       err == 'getaddrinfo: hostname nor servname provided, or not known' or
+       err == 'getaddrinfo: temporary failure in name resolution' then
+        return true
+    end
+    return false
+end;

diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua
index 8e65ff470..9c9aee1ad 100644
--- a/test/box/net.box.test.lua
+++ b/test/box/net.box.test.lua
@@ -1582,4 +1582,21 @@ test_run:wait_log('default', '00000030:.*', nil, 10)
-- we expect nothing below, so don't wait
test_run:grep_log('default', '00000040:.*')

+-- gh-4138 Check getaddrinfo() error from connect() only. Error
+-- code and error message returned by getaddrinfo() depends on
+-- system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' then
+            return true
+    end
+    return false
+end;
+test_run:cmd("setopt delimiter ''");

Same here.
Done.


commit 22937d2fce2d127a762a32e392d2c4d525c22c0b
Author: Roman Khabibov <roman.habibov@tarantool.org>
Date:   Thu Nov 21 14:37:54 2019 +0300

   lua: return getaddrinfo() errors

   Add getaddrinfo() errors into the several fuctions of socket. Now
   getaddrinfo() can return a pair of values (nil and error message)
   in case of error.

   Closes #4138

   @TarantoolBot document
   Title: socket API changes

   * socket.getaddrinfo()

   Can return error message as second return value in case of
   internal getaddrinfo() error.

   Example:
   tarantool> socket.getaddrinfo('non_exists_hostname', 3301)
   ---
   - null
   - 'getaddrinfo: nodename nor servname provided, or not known'
   ...

   * socket.tcp_connect()

   Can return socket.getaddrinfo() error message as second return
   value.

   Example:
   tarantool> socket.tcp_connect('non_exists_hostname', 3301)
   ---
   - null
   - 'getaddrinfo: nodename nor servname provided, or not known'
   ...

diff --git a/src/box/lua/net_box.lua b/src/box/lua/net_box.lua
index c2e1bb9c4..2fb3e6df6 100644
--- a/src/box/lua/net_box.lua
+++ b/src/box/lua/net_box.lua
@@ -150,9 +150,12 @@ local function next_id(id) return band(id + 1, 0x7FFFFFFF) end
local function establish_connection(host, port, timeout)
    local timeout = timeout or DEFAULT_CONNECT_TIMEOUT
    local begin = fiber.clock()
-    local s = socket.tcp_connect(host, port, timeout)
+    local s, err = socket.tcp_connect(host, port, timeout)
    if not s then
-        return nil, errno.strerror(errno())
+        if not err then
+            return nil, errno.strerror(errno())
+        end
+        return nil, err
    end
    local msg = s:read({chunk = IPROTO_GREETING_SIZE},
                        timeout - (fiber.clock() - begin))
diff --git a/src/lua/socket.c b/src/lua/socket.c
index 130378caf..dbade2d27 100644
--- a/src/lua/socket.c
+++ b/src/lua/socket.c
@@ -774,6 +774,18 @@ lbox_getaddrinfo_result_wrapper(struct lua_State *L)
return 1;
}

+/**
+ * Wrap coio_getaddrinfo() and call it. Push returned values onto
+ * @a L Lua stack.
+ *
+ * @param L Lua stack.
+ *
+ * @retval 1 Number of returned values by Lua function if
+ * coio_getaddrinfo() success.
+ * @retval 2 Number of returned values by Lua function if
+ * coio_getaddrinfo() is failed (first is nil, second is error
+ * message).
+ */
static int
lbox_socket_getaddrinfo(struct lua_State *L)
{
@@ -816,7 +828,9 @@ lbox_socket_getaddrinfo(struct lua_State *L)

if (dns_res != 0) {
lua_pushnil(L);
- return 1;
+ struct error *err = diag_get()->last;
+ lua_pushstring(L, err->errmsg);
+ return 2;
}

/* no results */
diff --git a/src/lua/socket.lua b/src/lua/socket.lua
index a334ad45b..ce6dab301 100644
--- a/src/lua/socket.lua
+++ b/src/lua/socket.lua
@@ -1041,9 +1041,12 @@ local function tcp_connect(host, port, timeout)
    end
    local timeout = timeout or TIMEOUT_INFINITY
    local stop = fiber.clock() + timeout
-    local dns = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM',
+    local dns, err = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM',
        protocol = 'tcp' })
-    if dns == nil or #dns == 0 then
+    if dns == nil then
+        return nil, err
+    end
+    if #dns == 0 then
        boxerrno(boxerrno.EINVAL)
        return nil
    end
@@ -1360,8 +1363,12 @@ local function lsocket_tcp_connect(self, host, port)
    -- This function is broken by design
    local ga_opts = { family = 'AF_INET', type = 'SOCK_STREAM' }
    local timeout = deadline - fiber.clock()
-    local dns = getaddrinfo(host, port, timeout, ga_opts)
-    if dns == nil or #dns == 0 then
+    local dns, err = getaddrinfo(host, port, timeout, ga_opts)
+    if dns == nil then
+        self._errno = boxerrno.EINVAL
+        return nil, err
+    end
+    if #dns == 0 then
        self._errno = boxerrno.EINVAL
        return nil, socket_error(self)
    end
@@ -1547,9 +1554,12 @@ local function lsocket_connect(host, port)
    if host == nil or port == nil then
        error("Usage: luasocket.connect(host, port)")
    end
-    local s = tcp_connect(host, port)
+    local s, err = tcp_connect(host, port)
    if not s then
-        return nil, boxerrno.strerror()
+        if not err then
+            return nil, boxerrno.strerror()
+        end
+        return nil, err
    end
    setmetatable(s, lsocket_tcp_client_mt)
    return s
diff --git a/test/app/socket.result b/test/app/socket.result
index fd299424c..aca189f4c 100644
--- a/test/app/socket.result
+++ b/test/app/socket.result
@@ -941,10 +941,20 @@ sc:close()
- true
...
-- tcp_connect
--- test timeout
-socket.tcp_connect('127.0.0.1', 80, 0.00000000001)
+-- Test timeout. In this test, tcp_connect can return the second
+-- output value from internal.getaddrinfo (usually on Mac OS, but
+-- theoretically it can happen on Linux too). Sometimes
+-- getaddrinfo() is timed out, sometimes connect. On Linux however
+-- getaddrinfo is fast enough to never give timeout error in
+-- the case. So, there are two sources of timeout errors that are
+-- reported differently. This difference has appeared after
+-- gh-4138 patch.
+s, err = socket.tcp_connect('127.0.0.1', 80, 0.00000000001)
---
-- null
+...
+s == nil
+---
+- true
...
-- AF_INET
s = socket('AF_INET', 'SOCK_STREAM', 'tcp')
@@ -1606,7 +1616,7 @@ fio.stat(path) == nil
{ socket.tcp_connect('abrakadabra#123') == nil, errno.strerror() }
---
- - true
-  - Invalid argument
+  - Input/output error
...
-- wrong options for getaddrinfo
socket.getaddrinfo('host', 'port', { type = 'WRONG' }) == nil and errno() == errno.EINVAL
@@ -2816,6 +2826,50 @@ test_run:cmd("clear filter")
---
- true
...
+-- gh-4138 Check getaddrinfo() error from socket:connect() only.
+-- Error code and error message returned by getaddrinfo() depends
+-- on system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' or
+       err == 'getaddrinfo: hostname nor servname provided, or not known' or
+       err == 'getaddrinfo: temporary failure in name resolution' then
+        return true
+    end
+    return false
+end;
+---
+...
+test_run:cmd("setopt delimiter ''");
+---
+- true
+...
+s, err = socket.getaddrinfo('non_exists_hostname', 3301)
+---
+...
+check_err(err)
+---
+- true
+...
+s, err = socket.connect('non_exists_hostname', 3301)
+---
+...
+check_err(err)
+---
+- true
+...
+s, err = socket.tcp_connect('non_exists_hostname', 3301)
+---
+...
+check_err(err)
+---
+- true
+...
-- case: sicket receive inconsistent behavior
chan = fiber.channel()
---
diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
index c72d41763..223935e66 100644
--- a/test/app/socket.test.lua
+++ b/test/app/socket.test.lua
@@ -300,8 +300,16 @@ sc:close()

-- tcp_connect

--- test timeout
-socket.tcp_connect('127.0.0.1', 80, 0.00000000001)
+-- Test timeout. In this test, tcp_connect can return the second
+-- output value from internal.getaddrinfo (usually on Mac OS, but
+-- theoretically it can happen on Linux too). Sometimes
+-- getaddrinfo() is timed out, sometimes connect. On Linux however
+-- getaddrinfo is fast enough to never give timeout error in
+-- the case. So, there are two sources of timeout errors that are
+-- reported differently. This difference has appeared after
+-- gh-4138 patch.
+s, err = socket.tcp_connect('127.0.0.1', 80, 0.00000000001)
+s == nil

-- AF_INET
s = socket('AF_INET', 'SOCK_STREAM', 'tcp')
@@ -960,6 +968,29 @@ server:close()

test_run:cmd("clear filter")

+-- gh-4138 Check getaddrinfo() error from socket:connect() only.
+-- Error code and error message returned by getaddrinfo() depends
+-- on system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' or
+       err == 'getaddrinfo: hostname nor servname provided, or not known' or
+       err == 'getaddrinfo: temporary failure in name resolution' then
+        return true
+    end
+    return false
+end;
+test_run:cmd("setopt delimiter ''");
+
+s, err = socket.getaddrinfo('non_exists_hostname', 3301)
+check_err(err)
+s, err = socket.connect('non_exists_hostname', 3301)
+check_err(err)
+s, err = socket.tcp_connect('non_exists_hostname', 3301)
+check_err(err)
+
-- case: sicket receive inconsistent behavior
chan = fiber.channel()
counter = 0
diff --git a/test/box/net.box.result b/test/box/net.box.result
index e3dabf7d9..11914dd68 100644
--- a/test/box/net.box.result
+++ b/test/box/net.box.result
@@ -3927,6 +3927,36 @@ test_run:grep_log('default', '00000040:.*')
---
- null
...
+-- gh-4138 Check getaddrinfo() error from connect() only. Error
+-- code and error message returned by getaddrinfo() depends on
+-- system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+---
+- true
+...
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' or
+       err == 'getaddrinfo: hostname nor servname provided, or not known' or
+       err == 'getaddrinfo: temporary failure in name resolution' then
+            return true
+    end
+    return false
+end;
+---
+...
+test_run:cmd("setopt delimiter ''");
+---
+- true
+...
+s = remote.connect('non_exists_hostname:3301')
+---
+...
+check_err(s['error'])
+---
+- true
+...
box.cfg{log_level=log_level}
---
...
diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua
index 8e65ff470..79707eaa4 100644
--- a/test/box/net.box.test.lua
+++ b/test/box/net.box.test.lua
@@ -1582,4 +1582,23 @@ test_run:wait_log('default', '00000030:.*', nil, 10)
-- we expect nothing below, so don't wait
test_run:grep_log('default', '00000040:.*')

+-- gh-4138 Check getaddrinfo() error from connect() only. Error
+-- code and error message returned by getaddrinfo() depends on
+-- system's gai_strerror().
+test_run:cmd("setopt delimiter ';'")
+function check_err(err)
+    if err == 'getaddrinfo: nodename nor servname provided, or not known' or
+       err == 'getaddrinfo: Servname not supported for ai_socktype' or
+       err == 'getaddrinfo: Name or service not known' or
+       err == 'getaddrinfo: hostname nor servname provided, or not known' or
+       err == 'getaddrinfo: temporary failure in name resolution' then
+            return true
+    end
+    return false
+end;
+test_run:cmd("setopt delimiter ''");
+
+s = remote.connect('non_exists_hostname:3301')
+check_err(s['error'])
+
box.cfg{log_level=log_level}