Date: Thu, 29 Aug 2019 00:34:31 +0300
From: Alexander Turenko
Subject: [tarantool-patches] Re: [PATCH v2 2/2] say: take getaddrinfo() errors into account
Message-ID: <20190828213431.3yd4kwcahe2oizgs@tkn_work_nb>
References: <3603f7507651b37ddd549a8c247709cc7ff43f44.1561469272.git.roman.habibov@tarantool.org> <20190723145249.5xwc2td6omphwwzw@tkn_work_nb>
To: Roman Khabibov
Cc: tarantool-patches@freelists.org

This particular patch is mostly okay, but I would work a bit more on
tests and minor details. Please consider the comments below.

WBR, Alexander Turenko.

On Mon, Aug 05, 2019 at 04:32:41PM +0300, Roman Khabibov wrote:
>
>
> > On Jul 23, 2019, at 17:52, Alexander Turenko wrote:
> >
> >> @@ -594,6 +594,7 @@ log_syslog_init(struct log *log, const char *init_str)
> >> 	log->fd = log_syslog_connect(log);
> >> 	if (log->fd < 0) {
> >> 		/* syslog indent is freed in atexit(). */
> >> +		diag_log();
> >
> > 1. It is need to be properly commented: we need to log a diagnostics
> > here until stacked diagnostics will be implemented (#1148).
> >
> > 2. syslog_connect_unix() does not set a diag, diag_log() will lead to an
> > assertion fail.
> >
> > 3. I would mention this change in the commit message, because it is not
> > part of the problem with Mac OS you described in it.
> @@ -506,10 +506,15 @@ syslog_connect_remote(const char *server_address)
>  	hints.ai_protocol = IPPROTO_UDP;
>
>  	ret = getaddrinfo(remote, portnum, &hints, &inf);
> -	if (ret < 0) {
> +	if (ret != 0) {
>  		errno = EIO;
>  		diag_set(SystemError, "getaddrinfo: %s",
>  			 gai_strerror(ret));
> +		/*
> +		 * We need to log a diagnostics here until stacked
> +		 * diagnostics will be implemented (#1148).
> +		 */
> +		diag_log();

This is not the only error possible in this function, but the others
will not be logged. I think the logging should be added before replacing
the diagnostic with the next one:

 | --- a/src/lib/core/say.c
 | +++ b/src/lib/core/say.c
 | @@ -593,6 +593,7 @@ log_syslog_init(struct log *log, const char *init_str)
 |  	say_free_syslog_opts(&opts);
 |  	log->fd = log_syslog_connect(log);
 |  	if (log->fd < 0) {
 | +		/* XXX: comment. */
 | +		diag_log();
 |  		/* syslog indent is freed in atexit(). */
 |  		diag_set(SystemError, "syslog logger: %s", strerror(errno));
 |  		return -1;

> commit f5b19e933fbf2eb3784a0c3cc28d999a3fa85abe
> Author: Roman Khabibov
> Date:   Tue Jul 30 15:39:21 2019 +0300
>
>     coio/say: take getaddrinfo() errors into account

I would state in the commit header that this fix is for Mac OS, if
possible. Say, 'coio/say: fix getaddrinfo error handling on Mac OS'.

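A side note on the `ret != 0` check in the hunk above: POSIX only
promises that getaddrinfo() returns zero on success and a nonzero EAI_*
code on failure. The sign of those codes is left to the platform (they
are negative in glibc and positive on Mac OS), so comparing with zero is
the portable form. Below is a minimal standalone sketch of that check.
The helper names resolve_numeric() and resolve_failed() are hypothetical
and are not part of say.c; AI_NUMERICHOST is used only so that the
example fails without a DNS lookup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from say.c): resolve a numeric address
 * only, so no DNS request is made. Returns the raw getaddrinfo()
 * code: 0 on success, a nonzero EAI_* value on failure.
 */
static int
resolve_numeric(const char *host, const char *port)
{
	assert(host != NULL && port != NULL);
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_flags = AI_NUMERICHOST;
	int rc = getaddrinfo(host, port, &hints, &res);
	if (rc == 0)
		freeaddrinfo(res);
	return rc;
}

/*
 * Portable failure check: compare with zero instead of "< 0",
 * because the sign of EAI_* codes differs between platforms.
 */
static int
resolve_failed(int rc)
{
	return rc != 0;
}
```

With these helpers, resolve_failed(resolve_numeric("non_exists_hostname",
"3301")) is true on both Linux and Mac OS, while a check written as
`rc < 0` would miss the failure wherever the EAI_* codes are positive.
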
>
>     Before this patch, branch when getaddrinfo() returns error codes
>     couldn't be reached on Mac OS, because they are greater than 0 on
>     Mac OS (assumption "rc < 0" in commit ea1da04 is incorrect for
>     Mac OS).
>
>     * diag log() in say.c was added, because we need to log a
>     diagnostics here until stacked diagnostics will be implemented
>     (#1148).

I would say that it is added because otherwise the diagnostic will be
hidden by the following one, and then say that it should be handled in a
better way after #1148.

The asterisk here is redundant, because it is not a list.

>
>     Need for #4138
>
> diff --git a/src/lib/core/coio_task.c b/src/lib/core/coio_task.c
> index 908b336ed..83f669d05 100644
> --- a/src/lib/core/coio_task.c
> +++ b/src/lib/core/coio_task.c
> @@ -413,7 +413,7 @@ coio_getaddrinfo(const char *host, const char *port,
>  		return -1; /* timed out or cancelled */
>
>  	/* Task finished */
> -	if (task->rc < 0) {
> +	if (task->rc != 0) {
>  		/* getaddrinfo() failed */
>  		errno = EIO;
>  		diag_set(SystemError, "getaddrinfo: %s",
> diff --git a/src/lib/core/say.c b/src/lib/core/say.c
> index 0b2cf2c34..a45595443 100644
> --- a/src/lib/core/say.c
> +++ b/src/lib/core/say.c
> @@ -506,10 +506,15 @@ syslog_connect_remote(const char *server_address)
>  	hints.ai_protocol = IPPROTO_UDP;
>
>  	ret = getaddrinfo(remote, portnum, &hints, &inf);
> -	if (ret < 0) {
> +	if (ret != 0) {
>  		errno = EIO;
>  		diag_set(SystemError, "getaddrinfo: %s",
>  			 gai_strerror(ret));
> +		/*
> +		 * We need to log a diagnostics here until stacked
> +		 * diagnostics will be implemented (#1148).
> +		 */
> +		diag_log();
>  		goto out;
>  	}
>  	for (ptr = inf; ptr; ptr = ptr->ai_next) {
> diff --git a/test/box-tap/cfg.test.lua b/test/box-tap/cfg.test.lua
> index 55de5e41c..d92e9140d 100755
> --- a/test/box-tap/cfg.test.lua
> +++ b/test/box-tap/cfg.test.lua
> @@ -6,7 +6,7 @@ local socket = require('socket')
>  local fio = require('fio')
>  local uuid = require('uuid')
>  local msgpack = require('msgpack')
> -test:plan(104)
> +test:plan(105)
>
>  --------------------------------------------------------------------------------
>  -- Invalid values
> @@ -566,6 +566,23 @@ os.exit(0)
>  ]]
>  test:is(run_script(code), 0, "log_nonblock")
>
> +--
> +-- gh-4138: check getaddrinfo() error and panic after that.
> +--
> +code=[[
> +local socket = require('socket')
> +local log = require('log')
> +local fio = require('fio')
> +
> +path = fio.pathjoin(fio.cwd(), 'log_unix_socket_test.sock')
> +unix_socket = socket('AF_UNIX', 'SOCK_DGRAM', 0)
> +unix_socket:bind('unix/', path)
> +
> +opt = string.format("syslog:server=non_exists_hostname:%s,identity=tarantool", path)
> +box.cfg{log = opt, log_nonblock=true}

log_nonblock is not needed here, so it is better to remove it.

box.cfg{log = 'syslog:server=non_exists_hostname:3301'} is enough: there
is no need to form a file path, no need for an identity, and no need to
require socket, log and fio.

The test passes even before the patch, so what is it intended to test? I
think we should write a test that verifies stderr output to find all log
messages we expect to appear in the case:

Linux:

 | SystemError getaddrinfo: Temporary failure in name resolution: Input/output error
 | SystemError syslog logger: Input/output error: Input/output error
 | failed to initialize logging subsystem

gai_strerror() message corresponds to EAI_AGAIN.

Mac OS:

 | SystemError getaddrinfo: nodename nor servname provided, or not known: Input/output error
 | SystemError syslog logger: Input/output error: Input/output error
 | failed to initialize logging subsystem

gai_strerror() message corresponds to EAI_NONAME.

I propose to call ffi.C.gai_strerror() right from a test to form two
error messages and verify that the actual output matches one of them.

If it is hard to catch stderr, then let's proceed w/o this test. However
I think it is doable.

I also propose to test error messages in a similar way (using
ffi.C.gai_strerror(EAI_AGAIN) and ffi.C.gai_strerror(EAI_NONAME)) in the
test cases in the second patch of the patchset.

> +]]
> +test:is(run_script(code), PANIC, "log_nonblock")
> +
>  --
>  -- Crash (instead of panic) when trying to recover a huge tuple.
>  --
> diff --git a/test/unit/coio.cc b/test/unit/coio.cc
> index bb8bd7131..a70d3254d 100644
> --- a/test/unit/coio.cc
> +++ b/test/unit/coio.cc
> @@ -72,7 +72,7 @@ static void
>  test_getaddrinfo(void)
>  {
>  	header();
> -	plan(1);
> +	plan(3);
>  	const char *host = "127.0.0.1";
>  	const char *port = "3333";
>  	struct addrinfo *i;
> @@ -81,6 +81,12 @@ test_getaddrinfo(void)
>  	is(rc, 0, "getaddrinfo");
>  	freeaddrinfo(i);
>
> +	/* gh-4138: Check getaddrinfo() error. */
> +	isnt(coio_getaddrinfo("non_exists_hostname", port, NULL, &i, 1), 0,
> +	     "getaddrinfo error");

I would say 'getaddrinfo retval' instead of 'getaddrinfo error'.

Use `rc = coio_getaddrinfo(<...>)` as above within this function, it
reads easier.

> +	isnt(strstr(diag_get()->last->errmsg, "getaddrinfo"), NULL,
> +	     "getaddrinfo error message");
> +

I propose to verify the entire error message using
gai_strerror(EAI_AGAIN) and gai_strerror(EAI_NONAME), just as proposed
above for a log message.

>  	/*
>  	 * gh-4209: 0 timeout should not be a special value and
>  	 * detach a task. Before a fix it led to segfault
> diff --git a/test/unit/coio.result b/test/unit/coio.result
> index 5019fa48a..49759b747 100644
> --- a/test/unit/coio.result
> +++ b/test/unit/coio.result
> @@ -7,6 +7,8 @@
>  # call done with res 0
>  	*** test_call_f: done ***
>  	*** test_getaddrinfo ***
> -1..1
> +1..3
>  ok 1 - getaddrinfo
> +ok 2 - getaddrinfo error
> +ok 3 - getaddrinfo error message
>  	*** test_getaddrinfo: done ***
>
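
To illustrate the full-message check proposed for test/unit/coio.cc, here
is a rough sketch of a helper that accepts either platform's message. It
is only an assumption of how the check could look:
is_expected_gai_errmsg() is a hypothetical name, and the sketch assumes
the diagnostic's errmsg is exactly the string produced by the
"getaddrinfo: %s" format. If a suffix is appended to errmsg, the expected
strings need to be adjusted accordingly.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical test helper: returns 1 if errmsg is equal to
 * "getaddrinfo: <gai_strerror(code)>" for EAI_AGAIN or EAI_NONAME,
 * and 0 otherwise. The expected text is built at run time, so the
 * check does not hardcode platform-specific wording.
 */
static int
is_expected_gai_errmsg(const char *errmsg)
{
	assert(errmsg != NULL);
	const int codes[] = {EAI_AGAIN, EAI_NONAME};
	char expected[256];
	for (size_t i = 0; i < sizeof(codes) / sizeof(codes[0]); i++) {
		snprintf(expected, sizeof(expected), "getaddrinfo: %s",
			 gai_strerror(codes[i]));
		if (strcmp(errmsg, expected) == 0)
			return 1;
	}
	return 0;
}
```

In the unit test this could replace the strstr() check, for example
is(is_expected_gai_errmsg(diag_get()->last->errmsg), 1, "getaddrinfo
error message"), assuming errmsg has the format described above.
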