* [Tarantool-patches] [PATCH 0/2] Swim broadcast errors @ 2021-03-18 0:02 Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches ` (3 more replies) 0 siblings, 4 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:02 UTC (permalink / raw) To: tarantool-patches, gorcunov, sergepetrenko The patchset fixes "Permission denied" error on broadcast in SWIM. Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5864-swim-broadcast Issue: https://github.com/tarantool/tarantool/issues/5864 Vladislav Shpilevoy (2): swim: add SO_BROADCAST option lua: separate sched and script diag changelogs/unreleased/swim-broadcast-error.md | 6 +++++ src/lib/swim/swim_io.c | 19 +++++++++++++-- src/lib/swim/swim_transport_udp.c | 23 +++++++++++-------- src/lua/init.c | 12 +++++++++- 4 files changed, 48 insertions(+), 12 deletions(-) create mode 100644 changelogs/unreleased/swim-broadcast-error.md -- 2.24.3 (Apple Git-128) ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:02 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:49 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 19:37 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches ` (2 subsequent siblings) 3 siblings, 2 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:02 UTC (permalink / raw) To: tarantool-patches, gorcunov, sergepetrenko Swim node couldn't talk to broadcast network interfaces because the option SO_BROADCAST wasn't set. It worked fine for localhost broadcast, but failed for all the other IPs. There is no a test, because the tests work for the localhost only anyway. It still fails on Mac though in case the swim node was bound to 127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt to broadcast beyond the local machine. Not present on Linux, where such an error simply can't be returned from sendto(). This error is ignored on Mac, because it is not critical. Part of #5906 --- src/lib/swim/swim_io.c | 19 +++++++++++++++++-- src/lib/swim/swim_transport_udp.c | 23 ++++++++++++++--------- 2 files changed, 31 insertions(+), 11 deletions(-) diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c index c8558c43e..45df36ba4 100644 --- a/src/lib/swim/swim_io.c +++ b/src/lib/swim/swim_io.c @@ -512,8 +512,23 @@ static inline void swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, ssize_t size) { - if (size < 0) - diag_log(); + if (size < 0) { + bool is_critical = false; +#if TARGET_OS_DARWIN + /* + * On Mac this error happens regularly if SWIM is bound to + * the localhost and tries to broadcast out of the machine. This + * is not critical, because will happen in the tests a lot, and + * in prod it simply should not bind to localhost if there are + * multiple machines in the cluster. Besides, Mac as a platform + * is not supposed to be used in prod. + */ + struct error *last = diag_last_error(diag_get()); + is_critical = (last->saved_errno == EADDRNOTAVAIL); +#endif + if (is_critical) + diag_log(); + } if (task->complete != NULL) task->complete(task, scheduler, size); } diff --git a/src/lib/swim/swim_transport_udp.c b/src/lib/swim/swim_transport_udp.c index c0317a20b..12626cc49 100644 --- a/src/lib/swim/swim_transport_udp.c +++ b/src/lib/swim/swim_transport_udp.c @@ -80,25 +80,27 @@ swim_transport_bind(struct swim_transport *transport, return 0; } + int is_on = 1; + int real_port = new_addr->sin_port; int fd = sio_socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (fd < 0) return -1; - if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0 || - evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) { + if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0) { if (errno == EADDRINUSE) diag_set(SocketError, sio_socketname(fd), "bind"); - close(fd); - return -1; + goto end_error; } - int real_port = new_addr->sin_port; + if (sio_setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &is_on, + sizeof(is_on)) != 0) + goto end_error; + if (evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) + goto end_error; if (is_new_port_any) { struct sockaddr_in real_addr; addr_len = sizeof(real_addr); if (sio_getsockname(fd, (struct sockaddr *) &real_addr, - &addr_len) != 0) { - close(fd); - return -1; - } + &addr_len) != 0) + goto end_error; real_port = real_addr.sin_port; } if (transport->fd != -1) @@ -107,6 +109,9 @@ swim_transport_bind(struct swim_transport *transport, transport->addr = *new_addr; transport->addr.sin_port = real_port; return 0; +end_error: + close(fd); + return -1; } void -- 2.24.3 (Apple Git-128) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:49 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 19:37 ` Cyrill Gorcunov via Tarantool-patches 1 sibling, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:49 UTC (permalink / raw) To: tarantool-patches, gorcunov, sergepetrenko Sorry, wrong issue reference in this and the next commit. I force pushed a fix on the branch. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:49 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 19:37 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 20:47 ` Vladislav Shpilevoy via Tarantool-patches 1 sibling, 1 reply; 16+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 19:37 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Thu, Mar 18, 2021 at 01:02:03AM +0100, Vladislav Shpilevoy wrote: > Swim node couldn't talk to broadcast network interfaces because > the option SO_BROADCAST wasn't set. > > It worked fine for localhost broadcast, but failed for all the > other IPs. There is no a test, because the tests work for the > localhost only anyway. > > It still fails on Mac though in case the swim node was bound to > 127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt > to broadcast beyond the local machine. Not present on Linux, where > such an error simply can't be returned from sendto(). This error > is ignored on Mac, because it is not critical. > > Part of #5906 > --- > src/lib/swim/swim_io.c | 19 +++++++++++++++++-- > src/lib/swim/swim_transport_udp.c | 23 ++++++++++++++--------- > 2 files changed, 31 insertions(+), 11 deletions(-) > > diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c > index c8558c43e..45df36ba4 100644 > --- a/src/lib/swim/swim_io.c > +++ b/src/lib/swim/swim_io.c > @@ -512,8 +512,23 @@ static inline void > swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, > ssize_t size) > { > - if (size < 0) > - diag_log(); > + if (size < 0) { > + bool is_critical = false; > +#if TARGET_OS_DARWIN > + /* > + * On Mac this error happens regularly if SWIM is bound to > + * the localhost and tries to broadcast out of the machine. This > + * is not critical, because will happen in the tests a lot, and > + * in prod it simply should not bind to localhost if there are > + * multiple machines in the cluster. Besides, Mac as a platform > + * is not supposed to be used in prod. > + */ > + struct error *last = diag_last_error(diag_get()); > + is_critical = (last->saved_errno == EADDRNOTAVAIL); > +#endif > + if (is_critical) > + diag_log(); > + } > if (task->complete != NULL) > task->complete(task, scheduler, size); Vlad, I don't understand. For non-mac users this @is_critical will always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN then? Ie if (size < 0) { #if TARGET_OS_DARWIN if (is_critical) diag_log(); #endif } or you made it this way to escape unused parameter compiler warning? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 19:37 ` Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 20:47 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 21:27 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 20:47 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches Hi! Thanks for the review! >> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c >> index c8558c43e..45df36ba4 100644 >> --- a/src/lib/swim/swim_io.c >> +++ b/src/lib/swim/swim_io.c >> @@ -512,8 +512,23 @@ static inline void >> swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, >> ssize_t size) >> { >> - if (size < 0) >> - diag_log(); >> + if (size < 0) { >> + bool is_critical = false; >> +#if TARGET_OS_DARWIN >> + /* >> + * On Mac this error happens regularly if SWIM is bound to >> + * the localhost and tries to broadcast out of the machine. This >> + * is not critical, because will happen in the tests a lot, and >> + * in prod it simply should not bind to localhost if there are >> + * multiple machines in the cluster. Besides, Mac as a platform >> + * is not supposed to be used in prod. >> + */ >> + struct error *last = diag_last_error(diag_get()); >> + is_critical = (last->saved_errno == EADDRNOTAVAIL); >> +#endif >> + if (is_critical) >> + diag_log(); >> + } >> if (task->complete != NULL) >> task->complete(task, scheduler, size); > > Vlad, I don't understand. For non-mac users this @is_critical will > always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN > then? Ie > > if (size < 0) { > #if TARGET_OS_DARWIN > if (is_critical) > diag_log(); > #endif > } > > or you made it this way to escape unused parameter compiler warning? I added the variable on all platforms on purpose. Because if the code would be done the way you propose above, non-Mac platforms won't log anything. I, on the contrary, want to log on all platforms, but not certain errors in certain platforms. Only EADDRNOTAVAIL and only on Mac. Everything else must be logged. Because the user does not call 'send' directly, and has no other way to know there was an error except via the logs. Another option would be to have it like this: #if TARGET_OS_DARWIN bool is_critical = ...; if (is_critical) diag_log(); #else diag_log(); #endif But I don't like having 2 places handling the critical error. I wanted to have only one diag_log(). Talking of the warning - I am not sure I understand. It is used on both platforms. On non-Mac it is the 'if' condition. Even if it is never changed and inlined by the compiler, it is still used. There can't be a warning. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 20:47 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 21:27 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 22:34 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 23:18 ` Vladislav Shpilevoy via Tarantool-patches 0 siblings, 2 replies; 16+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 21:27 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Thu, Mar 18, 2021 at 09:47:36PM +0100, Vladislav Shpilevoy wrote: > Hi! Thanks for the review! > > >> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c > >> index c8558c43e..45df36ba4 100644 > >> --- a/src/lib/swim/swim_io.c > >> +++ b/src/lib/swim/swim_io.c > >> @@ -512,8 +512,23 @@ static inline void > >> swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, > >> ssize_t size) > >> { > >> - if (size < 0) > >> - diag_log(); > >> + if (size < 0) { > >> + bool is_critical = false; > >> +#if TARGET_OS_DARWIN > >> + /* > >> + * On Mac this error happens regularly if SWIM is bound to > >> + * the localhost and tries to broadcast out of the machine. This > >> + * is not critical, because will happen in the tests a lot, and > >> + * in prod it simply should not bind to localhost if there are > >> + * multiple machines in the cluster. Besides, Mac as a platform > >> + * is not supposed to be used in prod. > >> + */ > >> + struct error *last = diag_last_error(diag_get()); > >> + is_critical = (last->saved_errno == EADDRNOTAVAIL); > >> +#endif > >> + if (is_critical) > >> + diag_log(); > >> + } > >> if (task->complete != NULL) > >> task->complete(task, scheduler, size); > > > > Vlad, I don't understand. For non-mac users this @is_critical will > > always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN > > then? Ie > > > > if (size < 0) { > > #if TARGET_OS_DARWIN > > if (is_critical) > > diag_log(); > > #endif > > } > > > > or you made it this way to escape unused parameter compiler warning? > > I added the variable on all platforms on purpose. Because if the code would > be done the way you propose above, non-Mac platforms won't log anything. The do not log anything with the patch. Lets drop TARGET_OS_DARWIN path from your patch and we will have swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, ssize_t size) { if (size < 0) { bool is_critical = false; if (is_critical) diag_log(); } } The is_critical is always false, or there is a typo in the code above. > I, on the contrary, want to log on all platforms, but not certain errors > in certain platforms. Only EADDRNOTAVAIL and only on Mac. Everything else > must be logged. Because the user does not call 'send' directly, and has no > other way to know there was an error except via the logs. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 21:27 ` Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 22:34 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 23:18 ` Vladislav Shpilevoy via Tarantool-patches 1 sibling, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 22:34 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches >>>> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c >>>> index c8558c43e..45df36ba4 100644 >>>> --- a/src/lib/swim/swim_io.c >>>> +++ b/src/lib/swim/swim_io.c >>>> @@ -512,8 +512,23 @@ static inline void >>>> swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, >>>> ssize_t size) >>>> { >>>> - if (size < 0) >>>> - diag_log(); >>>> + if (size < 0) { >>>> + bool is_critical = false; >>>> +#if TARGET_OS_DARWIN >>>> + /* >>>> + * On Mac this error happens regularly if SWIM is bound to >>>> + * the localhost and tries to broadcast out of the machine. This >>>> + * is not critical, because will happen in the tests a lot, and >>>> + * in prod it simply should not bind to localhost if there are >>>> + * multiple machines in the cluster. Besides, Mac as a platform >>>> + * is not supposed to be used in prod. >>>> + */ >>>> + struct error *last = diag_last_error(diag_get()); >>>> + is_critical = (last->saved_errno == EADDRNOTAVAIL); >>>> +#endif >>>> + if (is_critical) >>>> + diag_log(); >>>> + } >>>> if (task->complete != NULL) >>>> task->complete(task, scheduler, size); >>> >>> Vlad, I don't understand. For non-mac users this @is_critical will >>> always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN >>> then? Ie >>> >>> if (size < 0) { >>> #if TARGET_OS_DARWIN >>> if (is_critical) >>> diag_log(); >>> #endif >>> } >>> >>> or you made it this way to escape unused parameter compiler warning? >> >> I added the variable on all platforms on purpose. Because if the code would >> be done the way you propose above, non-Mac platforms won't log anything. > > The do not log anything with the patch. Lets drop TARGET_OS_DARWIN path > from your patch and we will have > > swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, > ssize_t size) > { > if (size < 0) { > bool is_critical = false; > if (is_critical) > diag_log(); > } > } > > The is_critical is always false, or there is a typo in the code above. Oh shit, indeed :) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 21:27 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 22:34 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 23:18 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 6:53 ` Cyrill Gorcunov via Tarantool-patches 1 sibling, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 23:18 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches I force updated the patch on the branch. Could you please check if it works on your machine? For example, this test: uuid = require('uuid') swim = require('swim') s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'}) s:broadcast(3313) On Mac it used to print EADDRNOTAVAIL message. Probably would fail with something on Linux too, but I don't know. Might just work fine as well. ==================== diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c index 45df36ba4..c3fca3d7f 100644 --- a/src/lib/swim/swim_io.c +++ b/src/lib/swim/swim_io.c @@ -513,7 +513,7 @@ swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, ssize_t size) { if (size < 0) { - bool is_critical = false; + bool is_critical = true; #if TARGET_OS_DARWIN /* * On Mac this error happens regularly if SWIM is bound to @@ -524,7 +524,7 @@ swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, * is not supposed to be used in prod. */ struct error *last = diag_last_error(diag_get()); - is_critical = (last->saved_errno == EADDRNOTAVAIL); + is_critical = (last->saved_errno != EADDRNOTAVAIL); #endif if (is_critical) diag_log(); ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-18 23:18 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 6:53 ` Cyrill Gorcunov via Tarantool-patches 2021-03-19 20:18 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 21:46 ` Vladislav Shpilevoy via Tarantool-patches 0 siblings, 2 replies; 16+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 6:53 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote: > I force updated the patch on the branch. > > Could you please check if it works on your machine? For > example, this test: > > uuid = require('uuid') > swim = require('swim') > > s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'}) > > s:broadcast(3313) > > On Mac it used to print EADDRNOTAVAIL message. Probably > would fail with something on Linux too, but I don't know. > Might just work fine as well. Ack. Thanks! FWIW on linux this fails with | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-19 6:53 ` Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 20:18 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 20:29 ` Cyrill Gorcunov via Tarantool-patches 2021-03-19 21:46 ` Vladislav Shpilevoy via Tarantool-patches 1 sibling, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 20:18 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches On 19.03.2021 07:53, Cyrill Gorcunov wrote: > On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote: >> I force updated the patch on the branch. >> >> Could you please check if it works on your machine? For >> example, this test: >> >> uuid = require('uuid') >> swim = require('swim') >> >> s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'}) >> >> s:broadcast(3313) >> >> On Mac it used to print EADDRNOTAVAIL message. Probably >> would fail with something on Linux too, but I don't know. >> Might just work fine as well. > > Ack. Thanks! FWIW on linux this fails with > | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument What if you use a non-local address? For example, I have an IP from my router with 192.168... . Could you please try that one on your machine? If it will work, I will silence 'Invalid argument' on Linux for the same reason as EADDRNOTAVAIL on Mac. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-19 20:18 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 20:29 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 0 replies; 16+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 20:29 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Fri, Mar 19, 2021 at 09:18:52PM +0100, Vladislav Shpilevoy wrote: > > What if you use a non-local address? For example, I have an IP from my > router with 192.168... . Could you please try that one on your machine? > If it will work, I will silence 'Invalid argument' on Linux for the > same reason as EADDRNOTAVAIL on Mac. tarantool> s,err = swim.new({uuid = uuid(), uri = '192.168.88.249:3313'}) tarantool> s --- - uuid: 053f4d3f-3f59-429e-8677-f2f97883e439 uri: 192.168.88.249:3313 ... tarantool> s:broadcast(3313) --- - true Cyrill ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-19 6:53 ` Cyrill Gorcunov via Tarantool-patches 2021-03-19 20:18 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 21:46 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 22:15 ` Cyrill Gorcunov via Tarantool-patches 1 sibling, 1 reply; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 21:46 UTC (permalink / raw) To: Cyrill Gorcunov; +Cc: tarantool-patches On 19.03.2021 07:53, Cyrill Gorcunov via Tarantool-patches wrote: > On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote: >> I force updated the patch on the branch. >> >> Could you please check if it works on your machine? For >> example, this test: >> >> uuid = require('uuid') >> swim = require('swim') >> >> s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'}) >> >> s:broadcast(3313) >> >> On Mac it used to print EADDRNOTAVAIL message. Probably >> would fail with something on Linux too, but I don't know. >> Might just work fine as well. > > Ack. Thanks! FWIW on linux this fails with > | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument I made the error not critical on Linux so it is ignored now. The new patch: ==================== swim: add SO_BROADCAST option Swim node couldn't talk to broadcast network interfaces because the option SO_BROADCAST wasn't set. It worked fine for localhost broadcast, but failed for all the other IPs. There is no a test, because the tests work for the localhost only anyway. It still fails on Mac though in case the swim node was bound to 127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt to broadcast beyond the local machine. It happens on Linux too, but with EINVAL error. These errors are ignored because are not critical. Part of #5864 diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c index c8558c43e..0002472fa 100644 --- a/src/lib/swim/swim_io.c +++ b/src/lib/swim/swim_io.c @@ -512,8 +512,26 @@ static inline void swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task, ssize_t size) { - if (size < 0) - diag_log(); + if (size < 0) { + bool is_critical = true; + int err = diag_last_error(diag_get())->saved_errno; +#if TARGET_OS_DARWIN + /* + * On Mac this error happens regularly if SWIM is bound to + * the localhost and tries to broadcast out of the machine. This + * is not critical, because will happen in the tests a lot, and + * in prod it simply should not bind to localhost if there are + * multiple machines in the cluster. Besides, Mac as a platform + * is not supposed to be used in prod. + */ + is_critical = (err != EADDRNOTAVAIL); +#else + /* The same as EADDRNOTAVAIL, but happens on Linux as EINVAL. */ + is_critical = (err != EINVAL); +#endif + if (is_critical) + diag_log(); + } if (task->complete != NULL) task->complete(task, scheduler, size); } diff --git a/src/lib/swim/swim_transport_udp.c b/src/lib/swim/swim_transport_udp.c index c0317a20b..12626cc49 100644 --- a/src/lib/swim/swim_transport_udp.c +++ b/src/lib/swim/swim_transport_udp.c @@ -80,25 +80,27 @@ swim_transport_bind(struct swim_transport *transport, return 0; } + int is_on = 1; + int real_port = new_addr->sin_port; int fd = sio_socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (fd < 0) return -1; - if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0 || - evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) { + if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0) { if (errno == EADDRINUSE) diag_set(SocketError, sio_socketname(fd), "bind"); - close(fd); - return -1; + goto end_error; } - int real_port = new_addr->sin_port; + if (sio_setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &is_on, + sizeof(is_on)) != 0) + goto end_error; + if (evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) + goto end_error; if (is_new_port_any) { struct sockaddr_in real_addr; addr_len = sizeof(real_addr); if (sio_getsockname(fd, (struct sockaddr *) &real_addr, - &addr_len) != 0) { - close(fd); - return -1; - } + &addr_len) != 0) + goto end_error; real_port = real_addr.sin_port; } if (transport->fd != -1) @@ -107,6 +109,9 @@ swim_transport_bind(struct swim_transport *transport, transport->addr = *new_addr; transport->addr.sin_port = real_port; return 0; +end_error: + close(fd); + return -1; } void ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option 2021-03-19 21:46 ` Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 22:15 ` Cyrill Gorcunov via Tarantool-patches 0 siblings, 0 replies; 16+ messages in thread From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 22:15 UTC (permalink / raw) To: Vladislav Shpilevoy; +Cc: tarantool-patches On Fri, Mar 19, 2021 at 10:46:02PM +0100, Vladislav Shpilevoy wrote: > >> > >> On Mac it used to print EADDRNOTAVAIL message. Probably > >> would fail with something on Linux too, but I don't know. > >> Might just work fine as well. > > > > Ack. Thanks! FWIW on linux this fails with > > | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument > > I made the error not critical on Linux so it is ignored now. The > new patch: Ack ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag 2021-03-18 0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:02 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches 2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches 3 siblings, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 0:02 UTC (permalink / raw) To: tarantool-patches, gorcunov, sergepetrenko When Lua main script was launched, the sched fiber passed its own diag to the script's fiber. When the script was finished, it put its error into the diag. The sched fiber then checked if the diag is empty to detect an error. But it wasn't really correct. The error could also happen right in the scheduler fiber in a libev callback. For example, in one of ev_io callbacks in SWIM. Then the process would end with an error even if the script was finished successfully. These errors were not related to the main fiber executing the script. The patch makes so the scheduler fiber's diag no longer is used as an indication of an error in the script. Instead, a new diag is created on the stack of the scheduler's fiber, where the Lua script saves the error. Closes #5906 --- changelogs/unreleased/swim-broadcast-error.md | 6 ++++++ src/lua/init.c | 12 +++++++++++- 2 files changed, 17 insertions(+), 1 deletion(-) create mode 100644 changelogs/unreleased/swim-broadcast-error.md diff --git a/changelogs/unreleased/swim-broadcast-error.md b/changelogs/unreleased/swim-broadcast-error.md new file mode 100644 index 000000000..19ad44ca2 --- /dev/null +++ b/changelogs/unreleased/swim-broadcast-error.md @@ -0,0 +1,6 @@ +## bugfix/swim + +* Fix `<swim_instance>:broadcast()` which does not work on non-local addresses + and spams "Permission denied" errors to the log. Also after instance + termination it could return a non-0 exit code even if there was no errors in + the script, and spam the error again (gh-5864). diff --git a/src/lua/init.c b/src/lua/init.c index 4729ad7a2..767abdfc5 100644 --- a/src/lua/init.c +++ b/src/lua/init.c @@ -718,8 +718,16 @@ tarantool_lua_run_script(char *path, bool interactive, if (script_fiber == NULL) panic("%s", diag_last_error(diag_get())->errmsg); script_fiber->storage.lua.stack = tarantool_L; + /* + * Create a new diag on the stack. Don't pass fiber's diag, because it + * might be overwritten by libev callbacks invoked in the scheduler + * fiber (which is this), and therefore can't be used as a sign of fail + * in the script itself. + */ + struct diag script_diag; + diag_create(&script_diag); fiber_start(script_fiber, tarantool_L, path, interactive, - optc, optv, argc, argv, diag_get()); + optc, optv, argc, argv, &script_diag); /* * Run an auxiliary event loop to re-schedule run_script fiber. @@ -729,6 +737,8 @@ tarantool_lua_run_script(char *path, bool interactive, ev_run(loop(), 0); /* The fiber running the startup script has ended. */ script_fiber = NULL; + diag_move(&script_diag, diag_get()); + diag_destroy(&script_diag); /* * Result can't be obtained via fiber_join - script fiber * never dies if os.exit() was called. This is why diag -- 2.24.3 (Apple Git-128) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Swim broadcast errors 2021-03-18 0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 13:17 ` Serge Petrenko via Tarantool-patches 2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches 3 siblings, 0 replies; 16+ messages in thread From: Serge Petrenko via Tarantool-patches @ 2021-03-19 13:17 UTC (permalink / raw) To: Vladislav Shpilevoy, tarantool-patches, gorcunov 18.03.2021 03:02, Vladislav Shpilevoy пишет: > The patchset fixes "Permission denied" error on broadcast in SWIM. > > Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5864-swim-broadcast > Issue: https://github.com/tarantool/tarantool/issues/5864 > > Vladislav Shpilevoy (2): > swim: add SO_BROADCAST option > lua: separate sched and script diag > > changelogs/unreleased/swim-broadcast-error.md | 6 +++++ > src/lib/swim/swim_io.c | 19 +++++++++++++-- > src/lib/swim/swim_transport_udp.c | 23 +++++++++++-------- > src/lua/init.c | 12 +++++++++- > 4 files changed, 48 insertions(+), 12 deletions(-) > create mode 100644 changelogs/unreleased/swim-broadcast-error.md > Hi! Thanks for the patchset! LGTM. -- Serge Petrenko ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Tarantool-patches] [PATCH 0/2] Swim broadcast errors 2021-03-18 0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches ` (2 preceding siblings ...) 2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches @ 2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches 3 siblings, 0 replies; 16+ messages in thread From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 22:37 UTC (permalink / raw) To: tarantool-patches, gorcunov, sergepetrenko Pushed to master, 2.7, 2.6. ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2021-03-19 22:37 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-18 0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches 2021-03-18 0:49 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 19:37 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 20:47 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 21:27 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 22:34 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-18 23:18 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 6:53 ` Cyrill Gorcunov via Tarantool-patches 2021-03-19 20:18 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 20:29 ` Cyrill Gorcunov via Tarantool-patches 2021-03-19 21:46 ` Vladislav Shpilevoy via Tarantool-patches 2021-03-19 22:15 ` Cyrill Gorcunov via Tarantool-patches 2021-03-18 0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches 2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches 2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox