[Tarantool-patches] [PATCH v1] test: flaky box/net.box_wait_connected_gh-3856
Alexander V. Tikhonov
avtikhon at tarantool.org
Sat Jun 20 07:47:33 MSK 2020
Hi Alexander, thanks for your deep review, please chec my comments below.
On Thu, Jun 18, 2020 at 09:25:13PM +0300, Alexander Turenko wrote:
> The reason of the fail is that getaddrinfo() returns EIA_SERVICE for an
> incorrect TCP/IP port on FreeBSD, but crops it as modulo of 65536 on
> Linux/glibc. You may check it youself:
>
> | /* cc getaddrinfo.c -o getaddrinfo */
> |
> | #include <sys/types.h>
> | #include <sys/socket.h>
> | #include <netdb.h>
> | #include <netinet/in.h>
> | #include <stdio.h>
> | #include <stdlib.h>
> | #include <string.h>
> | #include <errno.h>
> |
> | const char *
> | family_str(int family)
> | {
> | if (family == AF_INET)
> | return "AF_INET";
> | if (family == AF_INET6)
> | return "AF_INET6";
> | return "?";
> | }
> |
> | const char *
> | socktype_str(int socktype)
> | {
> | if (socktype == SOCK_DGRAM)
> | return "SOCK_DGRAM";
> | if (socktype == SOCK_STREAM)
> | return "SOCK_STREAM";
> | if (socktype == SOCK_RAW)
> | return "SOCK_RAW";
> | return "?";
> | }
> |
> | const char *
> | protocol_str(int protocol)
> | {
> | if (protocol == IPPROTO_TCP)
> | return "IPPROTO_TCP";
> | if (protocol == IPPROTO_UDP)
> | return "IPPROTO_UDP";
> | return "?";
> | }
> |
> | int
> | main(int argc, char **argv)
> | {
> | static char host[1024];
> | static char serv[1024];
> |
> | struct addrinfo hints;
> | memset(&hints, (char) 0, sizeof(hints));
> | hints.ai_family = AF_UNSPEC;
> | hints.ai_socktype = SOCK_STREAM;
> |
> | if (argc != 3) {
> | fprintf(stderr, "Usage: %s host port\n", argv[0]);
> | return 1;
> | }
> |
> | struct addrinfo *addrs;
> | int rc = getaddrinfo(argv[1], argv[2], &hints, &addrs);
> | if (rc != 0) {
> | fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
> | exit(1);
> | }
> |
> | int flags = NI_NUMERICHOST | NI_NUMERICSERV;
> | struct addrinfo *addr;
> | for (addr = addrs; addr != NULL; addr = addr->ai_next) {
> | int rc = getnameinfo(addr->ai_addr, addr->ai_addrlen,
> | host, 1024, serv, 1024, flags);
> | if (rc != 0) {
> | fprintf(stderr, "getnameinfo error\n");
> | exit(1);
> | }
> | printf("----\n");
> | printf("family: %s\n", family_str(addr->ai_family));
> | printf("socktype: %s\n", socktype_str(addr->ai_socktype));
> | printf("protocol: %s\n", protocol_str(addr->ai_protocol));
> | printf("host: %s\n", host);
> | printf("serv: %s\n", serv);
> |
> | #if 0
> | printf("Connecting...\n");
> | int fd = socket(addr->ai_family, addr->ai_socktype, 0);
> | if (connect(fd, addr->ai_addr, addr->ai_addrlen) == -1) {
> | fprintf(stderr, "connect errno: %d\n", errno);
> | perror("connect");
> | } else {
> | printf("connected successfully\n");
> | }
> | #endif
> | }
> |
> | freeaddrinfo(addrs);
> |
> | return 0;
> | }
>
> (Linux/glibc) $ ./getaddrinfo 8.8.8.8 123456
> ----
> family: AF_INET
> socktype: SOCK_STREAM
> protocol: IPPROTO_TCP
> host: 8.8.8.8
> serv: 57920
>
> (FreeBSD) $ ./getaddrinfo 8.8.8.8 123456
> getaddrinfo: Service was not recognized for socket type
>
> So obvious fix would be change 123456 to something less or equal to
> 65535. Say, 1234.
>
Ok, sure, changed.
> > diff --git a/test/box/net.box_wait_connected_gh-3856.test.lua b/test/box/net.box_wait_connected_gh-3856.test.lua
> > index 29e997fb5..d9fa80f3f 100644
> > --- a/test/box/net.box_wait_connected_gh-3856.test.lua
> > +++ b/test/box/net.box_wait_connected_gh-3856.test.lua
> > @@ -1,8 +1,12 @@
> > net = require('net.box')
> > +test_run = require('test_run').new()
> >
> > --
> > -- gh-3856: wait_connected = false is ignored.
> > +-- Test uses Google DNS IP for testing:
> > +-- https://developers.google.com/speed/public-dns/docs/using
> > --
> > c = net.connect('8.8.8.8:123456', {wait_connected = false})
> > +test_run:wait_cond(function() return c.state == 'initial' end)
> > c
> > c:close()
>
> It should not work and does not. I checked it with the following command
> on a FreeBSD virtual machine:
>
> $ ( cd test && ./test-run.py -j 20 `yes box/net.box_wait_connected_gh-3856.test.lua | head -n 1000` )
>
> The 123456 -> 1234 change, however, passes.
>
Right, I see it too now.
> The test still depend on an order in which fibers will be scheduled
> (net_box.connect() creates a separate fiber for connecting in background
> using fiber.create(), which yields). Unlikely our fiber will not get
> execution time during the connection attempt, so it is more like a
> formal thing.
>
> But we can decrease probability of this situation even more if we'll
> grab all connection fields just when net_box.connect() returns, not
> after yield in console (which is due to waiting a next command from
> test-run).
>
> Consider this way:
>
> | $ cat test/box/net.box_wait_connected_gh-3856.test.lua
> | net = require('net.box')
> |
> | --
> | -- gh-3856: wait_connected = false is ignored.
> | --
> | do \
> | c = net.connect('8.8.8.8:1234', {wait_connected = false}) \
> | return c.state \
> | end
> | c:close()
>
Ok, I've used your code as you suggested, thank you.
> CCed Vlad as author of the test case.
>
Sure, I'll send him too.
> WBR, Alexander Turenko.
More information about the Tarantool-patches
mailing list