[tarantool-patches] Re: [PATCH] lua: return getaddrinfo() errors
Alexander Turenko
alexander.turenko at tarantool.org
Sun Jun 23 23:31:43 MSK 2019
Re C part of the commit (src/lib/core/coio_task.c).
The origin of the (rc < 0) check is the following commit (I marked the
relevant phrase with !!):
commit ea1da04d5add51c308efb3fd2d71cdfabed8411c
Author: Roman Tsisyk <roman at tsisyk.com>
Date: Mon Dec 5 20:28:59 2016 +0300
Refactor coio_task
* Add coio_task_create() and coio_task_destroy()
* Rename coio_task() to coio_task_post()
* Fix the core invariant of coio_task_post():
- On timeout or when fiber was cancelled set diag, return -1 and
guarantee that on_timeout will be called somewhen.
- Otherwise don't touch diag, don't call on_timeout callback,
return 0. Please check task->base.result and task->diag
to get the original return code and diag from the callback.
* Change the return value of coio_task_post() to "int".
* Add diag to coio_getaddrinfo() and fix a possible bug in replication;
!! ignore uncoventional getaddrinfo(3) error codes for now.
* Fix buggy box.snapshot() tests.
Needed for #1954
I don't see any reason to ignore 'unconventional' errors.
getaddrinfo() returns negative error codes on Linux, but on Mac OS the
codes are positive. Values and corresponding messages can be checked like so:
$ gcc -Wall -Wextra -x c <(echo -e '#include <sys/types.h>\n#include <sys/socket.h>\n#include <netdb.h>\n#include <stdio.h>\nint main() { printf("%d\\n", EAI_NONAME); return 0; }') && ./a.out; rm a.out
$ gcc -Wall -Wextra -x c <(echo -e '#include <sys/types.h>\n#include <sys/socket.h>\n#include <netdb.h>\n#include <stdio.h>\nint main() { printf("%s\\n", gai_strerror(EAI_NONAME)); return 0; }') && ./a.out; rm a.out
The commit message should mention Roman's commit and clearly state that
the (rc < 0) assumption does not work on Mac OS.
Aside from that, there is nothing bad in calling gai_strerror() with some
unusual value: it just returns "Unknown error" (I checked both Linux and
Mac OS).
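A minimal sketch of the point above, assuming two hypothetical helpers
(gai_failed() and resolve_numeric() are illustration only, they are not
part of the patch or of coio_task.c). It treats any non-zero
getaddrinfo() code as an error, which holds both for the negative Linux
codes and for the positive Mac OS ones. AI_NUMERICHOST is used only to
keep the example free of real DNS lookups.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: a getaddrinfo() return code is an error whenever
 * it is non-zero. A (rc < 0) test would miss positive EAI_* values.
 */
static int
gai_failed(int rc)
{
	return rc != 0;
}

/*
 * Hypothetical helper: resolve a string that must be a numeric address.
 * AI_NUMERICHOST forbids name resolution, so a non-numeric string fails
 * locally. Returns the raw getaddrinfo() code.
 */
static int
resolve_numeric(const char *host)
{
	assert(host != NULL);
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_NUMERICHOST;
	int rc = getaddrinfo(host, NULL, &hints, &res);
	if (rc == 0)
		freeaddrinfo(res);
	return rc;
}
```

With these helpers, gai_failed(resolve_numeric("non_exists_hostname"))
is expected to be true on both platforms, while the sign of the raw code
differs between them.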
I would add a relevant test case for coio_getaddrinfo() into
test/unit/coio.cc.
Consider comments re Lua part of the commit below.
> Before this patch, branch when coio_getaddrinfo() returns
> getaddrinfo() errors has never reached. Add this errors into the
> socket and thenet.box modules.
Typo: thenet.box -> net.box.
Aside from that, I would add more investigation results and describe the
contract changes (see other comments in this email).
>
> Closes #4138
> ---
>
> Branch: https://github.com/tarantool/tarantool/tree/romanhabibov/gh-4138-getaddrinfo
> Issue: https://github.com/tarantool/tarantool/issues/4138
> diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c
> index 908b336ed..83f669d05 100644
> --- a/src/lib/core/coio_task.c
> +++ b/src/lib/core/coio_task.c
> @@ -413,7 +413,7 @@ coio_getaddrinfo(const char *host, const char *port,
> return -1; /* timed out or cancelled */
>
> /* Task finished */
> - if (task->rc < 0) {
> + if (task->rc != 0) {
See comments at the start of the email.
> diff --git a/src/lua/socket.c b/src/lua/socket.c
> index 130378caf..5a8469ddf 100644
> --- a/src/lua/socket.c
> +++ b/src/lua/socket.c
> @@ -54,6 +54,7 @@
> #include <fiber.h>
> #include "lua/utils.h"
> #include "lua/fiber.h"
> +#include "reflection.h"
It is only needed to dereference 'err->type', so I would remove it
together with this check.
>
> extern int coio_wait(int fd, int event, double timeout);
>
> @@ -816,6 +817,11 @@ lbox_socket_getaddrinfo(struct lua_State *L)
>
> if (dns_res != 0) {
> lua_pushnil(L);
> + struct error *err = diag_get()->last;
> + if (strcmp(err->type->name, "SystemError") == 0) {
I don't think that we should check the type here. Why not report any error
as is?
After that the convention for lbox_socket_getaddrinfo() will be simple:
it returns a table of results when successful; `nil`, err_msg otherwise.
Now it can return either just `nil` or `nil`, err_msg, so the contract is
more complex.
The Lua code will be simpler if we simplify this contract.
I suggest adding a comment (in free form, but other reviewers can
enforce a specific format) to the function that shows its contract
explicitly, like so: return <...> at success, otherwise return <...>.
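To illustrate the simpler contract, here is a plain-C analogue. It is
not the real lbox_socket_getaddrinfo() and does not use the Lua C API;
numeric_lookup() and its parameters are hypothetical names. The point is
only the shape: one documented success result, and every failure path
produces a message, without inspecting an error type.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical analogue of the suggested contract comment.
 *
 * Return 0 at success and leave errbuf untouched; otherwise return -1
 * and store a non-empty "getaddrinfo: <message>" string in errbuf.
 */
static int
numeric_lookup(const char *host, char *errbuf, size_t errlen)
{
	assert(errbuf != NULL && errlen > 0);
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	memset(&hints, 0, sizeof(hints));
	hints.ai_socktype = SOCK_STREAM;
	/* Numeric addresses only: keeps the sketch free of DNS lookups. */
	hints.ai_flags = AI_NUMERICHOST;
	int rc = getaddrinfo(host, NULL, &hints, &res);
	if (rc != 0) {
		/* Any failure is reported as is, no type check. */
		snprintf(errbuf, errlen, "getaddrinfo: %s", gai_strerror(rc));
		return -1;
	}
	freeaddrinfo(res);
	return 0;
}
```

A caller then needs a single branch: check the return value and, on
failure, pass the message on.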
> diff --git a/src/lua/socket.lua b/src/lua/socket.lua
> index 2dba0a8d2..f0b432925 100644
> --- a/src/lua/socket.lua
> +++ b/src/lua/socket.lua
> @@ -1028,11 +1028,14 @@ local function tcp_connect(host, port, timeout)
> end
> local timeout = timeout or TIMEOUT_INFINITY
> local stop = fiber.clock() + timeout
> - local dns = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM',
> + local dns, err = getaddrinfo(host, port, timeout, { type = 'SOCK_STREAM',
> protocol = 'tcp' })
> if dns == nil or #dns == 0 then
> - boxerrno(boxerrno.EINVAL)
> - return nil
> + if not err then
> + boxerrno(boxerrno.EINVAL)
> + return nil
> + end
> + return nil, err
Here we change the contract of socket.getaddrinfo() -- the user-visible
function. I would mention that in the commit message.
> diff --git a/test/app/socket.test.lua b/test/app/socket.test.lua
> index dab168f90..140baf22f 100644
> --- a/test/app/socket.test.lua
> +++ b/test/app/socket.test.lua
> @@ -301,7 +301,7 @@ sc:close()
> -- tcp_connect
>
> -- test timeout
> -socket.tcp_connect('127.0.0.1', 80, 0.00000000001)
> +socket.tcp_connect('127.0.0.1', 80, 0.00000000000001)
It seems this change is not related to your commit.
> +--gh-4138 Check getaddrinfo() error.
> +
> +test_run:cmd("setopt delimiter ';'")
> +function check_err(err)
> + if err == 'getaddrinfo: nodename nor servname provided, or not known' or
> + err == 'getaddrinfo: Servname not supported for ai_socktype' or
> + err == 'getaddrinfo: Name or service not known' then
> + return true
> + end
> + return false
> +end;
Flush the delimiter to '' here.
Please comment (outside of the semicolon-delimiter block) that these
messages correspond to the following codes:
* EAI_NONAME (Mac OS)
* EAI_SERVICE (Linux and Mac OS)
* EAI_NONAME (Linux)
It seems that you receive either EAI_NONAME or EAI_SERVICE, no? Please
elaborate. I guess it is worth checking which exact error is returned by
getaddrinfo() here and testing only for it (but consider the difference
between Linux and Mac OS gai_strerror() messages if it is EAI_NONAME).
> +
> +s, err = socket:connect('hostname:3301');
I would use a less generic hostname like 'non_exists_hostname'.
> diff --git a/test/box/net.box.test.lua b/test/box/net.box.test.lua
> index bea43002d..13ed4e200 100644
> --- a/test/box/net.box.test.lua
> +++ b/test/box/net.box.test.lua
> @@ -1538,4 +1538,19 @@ test_run:grep_log('default', '00000020:.*')
> test_run:grep_log('default', '00000030:.*')
> test_run:grep_log('default', '00000040:.*')
>
> +--gh-4138 Check getaddrinfo() error.
> +test_run:cmd("setopt delimiter ';'")
> +
> +function check_err(err)
> + if err == 'getaddrinfo: nodename nor servname provided, or not known' or
> + err == 'getaddrinfo: Servname not supported for ai_socktype' or
> + err == 'getaddrinfo: Name or service not known' then
> + return true
> + end
> + return false
> +end;
The same comments as above apply to this test case too.