Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 0/2] Swim broadcast errors
@ 2021-03-18  0:02 Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18  0:02 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

The patchset fixes "Permission denied" error on broadcast in SWIM.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5864-swim-broadcast
Issue: https://github.com/tarantool/tarantool/issues/5864

Vladislav Shpilevoy (2):
  swim: add SO_BROADCAST option
  lua: separate sched and script diag

 changelogs/unreleased/swim-broadcast-error.md |  6 +++++
 src/lib/swim/swim_io.c                        | 19 +++++++++++++--
 src/lib/swim/swim_transport_udp.c             | 23 +++++++++++--------
 src/lua/init.c                                | 12 +++++++++-
 4 files changed, 48 insertions(+), 12 deletions(-)
 create mode 100644 changelogs/unreleased/swim-broadcast-error.md

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18  0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18  0:02 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:49   ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 19:37   ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18  0:02 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

Swim node couldn't talk to broadcast network interfaces because
the option SO_BROADCAST wasn't set.

It worked fine for localhost broadcast, but failed for all the
other IPs. There is no a test, because the tests work for the
localhost only anyway.

It still fails on Mac though in case the swim node was bound to
127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt
to broadcast beyond the local machine. Not present on Linux, where
such an error simply can't be returned from sendto(). This error
is ignored on Mac, because it is not critical.

Part of #5906
---
 src/lib/swim/swim_io.c            | 19 +++++++++++++++++--
 src/lib/swim/swim_transport_udp.c | 23 ++++++++++++++---------
 2 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
index c8558c43e..45df36ba4 100644
--- a/src/lib/swim/swim_io.c
+++ b/src/lib/swim/swim_io.c
@@ -512,8 +512,23 @@ static inline void
 swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
 		   ssize_t size)
 {
-	if (size < 0)
-		diag_log();
+	if (size < 0) {
+		bool is_critical = false;
+#if TARGET_OS_DARWIN
+		/*
+		 * On Mac this error happens regularly if SWIM is bound to
+		 * the localhost and tries to broadcast out of the machine. This
+		 * is not critical, because will happen in the tests a lot, and
+		 * in prod it simply should not bind to localhost if there are
+		 * multiple machines in the cluster. Besides, Mac as a platform
+		 * is not supposed to be used in prod.
+		 */
+		struct error *last = diag_last_error(diag_get());
+		is_critical = (last->saved_errno == EADDRNOTAVAIL);
+#endif
+		if (is_critical)
+			diag_log();
+	}
 	if (task->complete != NULL)
 		task->complete(task, scheduler, size);
 }
diff --git a/src/lib/swim/swim_transport_udp.c b/src/lib/swim/swim_transport_udp.c
index c0317a20b..12626cc49 100644
--- a/src/lib/swim/swim_transport_udp.c
+++ b/src/lib/swim/swim_transport_udp.c
@@ -80,25 +80,27 @@ swim_transport_bind(struct swim_transport *transport,
 		return 0;
 	}
 
+	int is_on = 1;
+	int real_port = new_addr->sin_port;
 	int fd = sio_socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
 	if (fd < 0)
 		return -1;
-	if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0 ||
-	    evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) {
+	if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0) {
 		if (errno == EADDRINUSE)
 			diag_set(SocketError, sio_socketname(fd), "bind");
-		close(fd);
-		return -1;
+		goto end_error;
 	}
-	int real_port = new_addr->sin_port;
+	if (sio_setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &is_on,
+			   sizeof(is_on)) != 0)
+		goto end_error;
+	if (evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0)
+		goto end_error;
 	if (is_new_port_any) {
 		struct sockaddr_in real_addr;
 		addr_len = sizeof(real_addr);
 		if (sio_getsockname(fd, (struct sockaddr *) &real_addr,
-				    &addr_len) != 0) {
-			close(fd);
-			return -1;
-		}
+				    &addr_len) != 0)
+			goto end_error;
 		real_port = real_addr.sin_port;
 	}
 	if (transport->fd != -1)
@@ -107,6 +109,9 @@ swim_transport_bind(struct swim_transport *transport,
 	transport->addr = *new_addr;
 	transport->addr.sin_port = real_port;
 	return 0;
+end_error:
+	close(fd);
+	return -1;
 }
 
 void
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag
  2021-03-18  0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18  0:02 ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches
  2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18  0:02 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

When Lua main script was launched, the sched fiber passed its own
diag to the script's fiber. When the script was finished, it put
its error into the diag. The sched fiber then checked if the diag
is empty to detect an error.

But it wasn't really correct. The error could also happen right in
the scheduler fiber in a libev callback. For example, in one of
ev_io callbacks in SWIM. Then the process would end with an error
even if the script was finished successfully.

These errors were not related to the main fiber executing the
script.

The patch makes so the scheduler fiber's diag no longer is used as
an indication of an error in the script. Instead, a new diag is
created on the stack of the scheduler's fiber, where the Lua
script saves the error.

Closes #5906
---
 changelogs/unreleased/swim-broadcast-error.md |  6 ++++++
 src/lua/init.c                                | 12 +++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 changelogs/unreleased/swim-broadcast-error.md

diff --git a/changelogs/unreleased/swim-broadcast-error.md b/changelogs/unreleased/swim-broadcast-error.md
new file mode 100644
index 000000000..19ad44ca2
--- /dev/null
+++ b/changelogs/unreleased/swim-broadcast-error.md
@@ -0,0 +1,6 @@
+## bugfix/swim
+
+* Fix `<swim_instance>:broadcast()` which does not work on non-local addresses
+  and spams "Permission denied" errors to the log. Also after instance
+  termination it could return a non-0 exit code even if there was no errors in
+  the script, and spam the error again (gh-5864).
diff --git a/src/lua/init.c b/src/lua/init.c
index 4729ad7a2..767abdfc5 100644
--- a/src/lua/init.c
+++ b/src/lua/init.c
@@ -718,8 +718,16 @@ tarantool_lua_run_script(char *path, bool interactive,
 	if (script_fiber == NULL)
 		panic("%s", diag_last_error(diag_get())->errmsg);
 	script_fiber->storage.lua.stack = tarantool_L;
+	/*
+	 * Create a new diag on the stack. Don't pass fiber's diag, because it
+	 * might be overwritten by libev callbacks invoked in the scheduler
+	 * fiber (which is this), and therefore can't be used as a sign of fail
+	 * in the script itself.
+	 */
+	struct diag script_diag;
+	diag_create(&script_diag);
 	fiber_start(script_fiber, tarantool_L, path, interactive,
-		    optc, optv, argc, argv, diag_get());
+		    optc, optv, argc, argv, &script_diag);
 
 	/*
 	 * Run an auxiliary event loop to re-schedule run_script fiber.
@@ -729,6 +737,8 @@ tarantool_lua_run_script(char *path, bool interactive,
 		ev_run(loop(), 0);
 	/* The fiber running the startup script has ended. */
 	script_fiber = NULL;
+	diag_move(&script_diag, diag_get());
+	diag_destroy(&script_diag);
 	/*
 	 * Result can't be obtained via fiber_join - script fiber
 	 * never dies if os.exit() was called. This is why diag
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18  0:49   ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 19:37   ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18  0:49 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

Sorry, wrong issue reference in this and the next commit.
I force pushed a fix on the branch.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:49   ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18 19:37   ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 20:47     ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 19:37 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Thu, Mar 18, 2021 at 01:02:03AM +0100, Vladislav Shpilevoy wrote:
> Swim node couldn't talk to broadcast network interfaces because
> the option SO_BROADCAST wasn't set.
> 
> It worked fine for localhost broadcast, but failed for all the
> other IPs. There is no a test, because the tests work for the
> localhost only anyway.
> 
> It still fails on Mac though in case the swim node was bound to
> 127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt
> to broadcast beyond the local machine. Not present on Linux, where
> such an error simply can't be returned from sendto(). This error
> is ignored on Mac, because it is not critical.
> 
> Part of #5906
> ---
>  src/lib/swim/swim_io.c            | 19 +++++++++++++++++--
>  src/lib/swim/swim_transport_udp.c | 23 ++++++++++++++---------
>  2 files changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
> index c8558c43e..45df36ba4 100644
> --- a/src/lib/swim/swim_io.c
> +++ b/src/lib/swim/swim_io.c
> @@ -512,8 +512,23 @@ static inline void
>  swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
>  		   ssize_t size)
>  {
> -	if (size < 0)
> -		diag_log();
> +	if (size < 0) {
> +		bool is_critical = false;
> +#if TARGET_OS_DARWIN
> +		/*
> +		 * On Mac this error happens regularly if SWIM is bound to
> +		 * the localhost and tries to broadcast out of the machine. This
> +		 * is not critical, because will happen in the tests a lot, and
> +		 * in prod it simply should not bind to localhost if there are
> +		 * multiple machines in the cluster. Besides, Mac as a platform
> +		 * is not supposed to be used in prod.
> +		 */
> +		struct error *last = diag_last_error(diag_get());
> +		is_critical = (last->saved_errno == EADDRNOTAVAIL);
> +#endif
> +		if (is_critical)
> +			diag_log();
> +	}
>  	if (task->complete != NULL)
>  		task->complete(task, scheduler, size);

Vlad, I don't understand. For non-mac users this @is_critical will
always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN
then? Ie

	if (size < 0) {
	#if TARGET_OS_DARWIN
		if (is_critical)
			diag_log();
	#endif
	}

or you made it this way to escape unused parameter compiler warning?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18 19:37   ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18 20:47     ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 21:27       ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 20:47 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tarantool-patches

Hi! Thanks for the review!

>> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
>> index c8558c43e..45df36ba4 100644
>> --- a/src/lib/swim/swim_io.c
>> +++ b/src/lib/swim/swim_io.c
>> @@ -512,8 +512,23 @@ static inline void
>>  swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
>>  		   ssize_t size)
>>  {
>> -	if (size < 0)
>> -		diag_log();
>> +	if (size < 0) {
>> +		bool is_critical = false;
>> +#if TARGET_OS_DARWIN
>> +		/*
>> +		 * On Mac this error happens regularly if SWIM is bound to
>> +		 * the localhost and tries to broadcast out of the machine. This
>> +		 * is not critical, because will happen in the tests a lot, and
>> +		 * in prod it simply should not bind to localhost if there are
>> +		 * multiple machines in the cluster. Besides, Mac as a platform
>> +		 * is not supposed to be used in prod.
>> +		 */
>> +		struct error *last = diag_last_error(diag_get());
>> +		is_critical = (last->saved_errno == EADDRNOTAVAIL);
>> +#endif
>> +		if (is_critical)
>> +			diag_log();
>> +	}
>>  	if (task->complete != NULL)
>>  		task->complete(task, scheduler, size);
> 
> Vlad, I don't understand. For non-mac users this @is_critical will
> always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN
> then? Ie
> 
> 	if (size < 0) {
> 	#if TARGET_OS_DARWIN
> 		if (is_critical)
> 			diag_log();
> 	#endif
> 	}
> 
> or you made it this way to escape unused parameter compiler warning?

I added the variable on all platforms on purpose. Because if the code would
be done the way you propose above, non-Mac platforms won't log anything.
I, on the contrary, want to log on all platforms, but not certain errors
in certain platforms. Only EADDRNOTAVAIL and only on Mac. Everything else
must be logged. Because the user does not call 'send' directly, and has no
other way to know there was an error except via the logs.

Another option would be to have it like this:

	#if TARGET_OS_DARWIN
		bool is_critical = ...;
		if (is_critical)
			diag_log();
	#else
		diag_log();
	#endif

But I don't like having 2 places handling the critical error. I wanted to
have only one diag_log().

Talking of the warning - I am not sure I understand. It is used on both
platforms. On non-Mac it is the 'if' condition. Even if it is never changed
and inlined by the compiler, it is still used. There can't be a warning.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18 20:47     ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18 21:27       ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 22:34         ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 23:18         ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 2 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-18 21:27 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Thu, Mar 18, 2021 at 09:47:36PM +0100, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
> 
> >> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
> >> index c8558c43e..45df36ba4 100644
> >> --- a/src/lib/swim/swim_io.c
> >> +++ b/src/lib/swim/swim_io.c
> >> @@ -512,8 +512,23 @@ static inline void
> >>  swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
> >>  		   ssize_t size)
> >>  {
> >> -	if (size < 0)
> >> -		diag_log();
> >> +	if (size < 0) {
> >> +		bool is_critical = false;
> >> +#if TARGET_OS_DARWIN
> >> +		/*
> >> +		 * On Mac this error happens regularly if SWIM is bound to
> >> +		 * the localhost and tries to broadcast out of the machine. This
> >> +		 * is not critical, because will happen in the tests a lot, and
> >> +		 * in prod it simply should not bind to localhost if there are
> >> +		 * multiple machines in the cluster. Besides, Mac as a platform
> >> +		 * is not supposed to be used in prod.
> >> +		 */
> >> +		struct error *last = diag_last_error(diag_get());
> >> +		is_critical = (last->saved_errno == EADDRNOTAVAIL);
> >> +#endif
> >> +		if (is_critical)
> >> +			diag_log();
> >> +	}
> >>  	if (task->complete != NULL)
> >>  		task->complete(task, scheduler, size);
> > 
> > Vlad, I don't understand. For non-mac users this @is_critical will
> > always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN
> > then? Ie
> > 
> > 	if (size < 0) {
> > 	#if TARGET_OS_DARWIN
> > 		if (is_critical)
> > 			diag_log();
> > 	#endif
> > 	}
> > 
> > or you made it this way to escape unused parameter compiler warning?
> 
> I added the variable on all platforms on purpose. Because if the code would
> be done the way you propose above, non-Mac platforms won't log anything.

The do not log anything with the patch. Lets drop TARGET_OS_DARWIN path
from your patch and we will have

swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
                  ssize_t size)
{
	if (size < 0) {
               bool is_critical = false;
               if (is_critical)
                       diag_log();
	}
}

The is_critical is always false, or there is a typo in the code above.


> I, on the contrary, want to log on all platforms, but not certain errors
> in certain platforms. Only EADDRNOTAVAIL and only on Mac. Everything else
> must be logged. Because the user does not call 'send' directly, and has no
> other way to know there was an error except via the logs.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18 21:27       ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-18 22:34         ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-18 23:18         ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 22:34 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tarantool-patches

>>>> diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
>>>> index c8558c43e..45df36ba4 100644
>>>> --- a/src/lib/swim/swim_io.c
>>>> +++ b/src/lib/swim/swim_io.c
>>>> @@ -512,8 +512,23 @@ static inline void
>>>>  swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
>>>>  		   ssize_t size)
>>>>  {
>>>> -	if (size < 0)
>>>> -		diag_log();
>>>> +	if (size < 0) {
>>>> +		bool is_critical = false;
>>>> +#if TARGET_OS_DARWIN
>>>> +		/*
>>>> +		 * On Mac this error happens regularly if SWIM is bound to
>>>> +		 * the localhost and tries to broadcast out of the machine. This
>>>> +		 * is not critical, because will happen in the tests a lot, and
>>>> +		 * in prod it simply should not bind to localhost if there are
>>>> +		 * multiple machines in the cluster. Besides, Mac as a platform
>>>> +		 * is not supposed to be used in prod.
>>>> +		 */
>>>> +		struct error *last = diag_last_error(diag_get());
>>>> +		is_critical = (last->saved_errno == EADDRNOTAVAIL);
>>>> +#endif
>>>> +		if (is_critical)
>>>> +			diag_log();
>>>> +	}
>>>>  	if (task->complete != NULL)
>>>>  		task->complete(task, scheduler, size);
>>>
>>> Vlad, I don't understand. For non-mac users this @is_critical will
>>> always be false, maybe we better move the whoe branch to TARGET_OS_DARWIN
>>> then? Ie
>>>
>>> 	if (size < 0) {
>>> 	#if TARGET_OS_DARWIN
>>> 		if (is_critical)
>>> 			diag_log();
>>> 	#endif
>>> 	}
>>>
>>> or you made it this way to escape unused parameter compiler warning?
>>
>> I added the variable on all platforms on purpose. Because if the code would
>> be done the way you propose above, non-Mac platforms won't log anything.
> 
> The do not log anything with the patch. Lets drop TARGET_OS_DARWIN path
> from your patch and we will have
> 
> swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
>                   ssize_t size)
> {
> 	if (size < 0) {
>                bool is_critical = false;
>                if (is_critical)
>                        diag_log();
> 	}
> }
> 
> The is_critical is always false, or there is a typo in the code above.

Oh shit, indeed :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18 21:27       ` Cyrill Gorcunov via Tarantool-patches
  2021-03-18 22:34         ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-18 23:18         ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19  6:53           ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-18 23:18 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tarantool-patches

I force updated the patch on the branch.

Could you please check if it works on your machine? For
example, this test:

	uuid = require('uuid')
	swim = require('swim')

	s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'})

	s:broadcast(3313)

On Mac it used to print EADDRNOTAVAIL message. Probably
would fail with something on Linux too, but I don't know.
Might just work fine as well.

====================
diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
index 45df36ba4..c3fca3d7f 100644
--- a/src/lib/swim/swim_io.c
+++ b/src/lib/swim/swim_io.c
@@ -513,7 +513,7 @@ swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
 		   ssize_t size)
 {
 	if (size < 0) {
-		bool is_critical = false;
+		bool is_critical = true;
 #if TARGET_OS_DARWIN
 		/*
 		 * On Mac this error happens regularly if SWIM is bound to
@@ -524,7 +524,7 @@ swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
 		 * is not supposed to be used in prod.
 		 */
 		struct error *last = diag_last_error(diag_get());
-		is_critical = (last->saved_errno == EADDRNOTAVAIL);
+		is_critical = (last->saved_errno != EADDRNOTAVAIL);
 #endif
 		if (is_critical)
 			diag_log();

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-18 23:18         ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19  6:53           ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19 20:18             ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19 21:46             ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 2 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19  6:53 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote:
> I force updated the patch on the branch.
> 
> Could you please check if it works on your machine? For
> example, this test:
> 
> 	uuid = require('uuid')
> 	swim = require('swim')
> 
> 	s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'})
> 
> 	s:broadcast(3313)
> 
> On Mac it used to print EADDRNOTAVAIL message. Probably
> would fail with something on Linux too, but I don't know.
> Might just work fine as well.

Ack. Thanks! FWIW on linux this fails with
 | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Swim broadcast errors
  2021-03-18  0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
  2021-03-18  0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19 13:17 ` Serge Petrenko via Tarantool-patches
  2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 0 replies; 16+ messages in thread
From: Serge Petrenko via Tarantool-patches @ 2021-03-19 13:17 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, gorcunov



18.03.2021 03:02, Vladislav Shpilevoy пишет:
> The patchset fixes "Permission denied" error on broadcast in SWIM.
>
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5864-swim-broadcast
> Issue: https://github.com/tarantool/tarantool/issues/5864
>
> Vladislav Shpilevoy (2):
>    swim: add SO_BROADCAST option
>    lua: separate sched and script diag
>
>   changelogs/unreleased/swim-broadcast-error.md |  6 +++++
>   src/lib/swim/swim_io.c                        | 19 +++++++++++++--
>   src/lib/swim/swim_transport_udp.c             | 23 +++++++++++--------
>   src/lua/init.c                                | 12 +++++++++-
>   4 files changed, 48 insertions(+), 12 deletions(-)
>   create mode 100644 changelogs/unreleased/swim-broadcast-error.md
>
Hi! Thanks for the patchset!
LGTM.

-- 
Serge Petrenko


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-19  6:53           ` Cyrill Gorcunov via Tarantool-patches
@ 2021-03-19 20:18             ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19 20:29               ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19 21:46             ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 20:18 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tarantool-patches



On 19.03.2021 07:53, Cyrill Gorcunov wrote:
> On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote:
>> I force updated the patch on the branch.
>>
>> Could you please check if it works on your machine? For
>> example, this test:
>>
>> 	uuid = require('uuid')
>> 	swim = require('swim')
>>
>> 	s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'})
>>
>> 	s:broadcast(3313)
>>
>> On Mac it used to print EADDRNOTAVAIL message. Probably
>> would fail with something on Linux too, but I don't know.
>> Might just work fine as well.
> 
> Ack. Thanks! FWIW on linux this fails with
>  | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument

What if you use a non-local address? For example, I have an IP from my
router with 192.168... . Could you please try that one on your machine?
If it will work, I will silence 'Invalid argument' on Linux for the
same reason as EADDRNOTAVAIL on Mac.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-19 20:18             ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19 20:29               ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 20:29 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Fri, Mar 19, 2021 at 09:18:52PM +0100, Vladislav Shpilevoy wrote:
> 
> What if you use a non-local address? For example, I have an IP from my
> router with 192.168... . Could you please try that one on your machine?
> If it will work, I will silence 'Invalid argument' on Linux for the
> same reason as EADDRNOTAVAIL on Mac.

 tarantool> s,err = swim.new({uuid = uuid(), uri = '192.168.88.249:3313'})
 tarantool> s
 ---
  - uuid: 053f4d3f-3f59-429e-8677-f2f97883e439
    uri: 192.168.88.249:3313
  ...

 tarantool> s:broadcast(3313)
 ---
  - true

	Cyrill

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-19  6:53           ` Cyrill Gorcunov via Tarantool-patches
  2021-03-19 20:18             ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19 21:46             ` Vladislav Shpilevoy via Tarantool-patches
  2021-03-19 22:15               ` Cyrill Gorcunov via Tarantool-patches
  1 sibling, 1 reply; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 21:46 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tarantool-patches

On 19.03.2021 07:53, Cyrill Gorcunov via Tarantool-patches wrote:
> On Fri, Mar 19, 2021 at 12:18:46AM +0100, Vladislav Shpilevoy wrote:
>> I force updated the patch on the branch.
>>
>> Could you please check if it works on your machine? For
>> example, this test:
>>
>> 	uuid = require('uuid')
>> 	swim = require('swim')
>>
>> 	s = swim.new({uuid = uuid(), uri = '127.0.0.1:3313'})
>>
>> 	s:broadcast(3313)
>>
>> On Mac it used to print EADDRNOTAVAIL message. Probably
>> would fail with something on Linux too, but I don't know.
>> Might just work fine as well.
> 
> Ack. Thanks! FWIW on linux this fails with
>  | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument

I made the error not critical on Linux so it is ignored now. The
new patch:

====================
    swim: add SO_BROADCAST option
    
    Swim node couldn't talk to broadcast network interfaces because
    the option SO_BROADCAST wasn't set.
    
    It worked fine for localhost broadcast, but failed for all the
    other IPs. There is no a test, because the tests work for the
    localhost only anyway.
    
    It still fails on Mac though in case the swim node was bound to
    127.0.0.1. Then somewhy sendto() raises EADDRNOTAVAIL on attempt
    to broadcast beyond the local machine. It happens on Linux too,
    but with EINVAL error. These errors are ignored because are not
    critical.
    
    Part of #5864

diff --git a/src/lib/swim/swim_io.c b/src/lib/swim/swim_io.c
index c8558c43e..0002472fa 100644
--- a/src/lib/swim/swim_io.c
+++ b/src/lib/swim/swim_io.c
@@ -512,8 +512,26 @@ static inline void
 swim_complete_send(struct swim_scheduler *scheduler, struct swim_task *task,
 		   ssize_t size)
 {
-	if (size < 0)
-		diag_log();
+	if (size < 0) {
+		bool is_critical = true;
+		int err = diag_last_error(diag_get())->saved_errno;
+#if TARGET_OS_DARWIN
+		/*
+		 * On Mac this error happens regularly if SWIM is bound to
+		 * the localhost and tries to broadcast out of the machine. This
+		 * is not critical, because will happen in the tests a lot, and
+		 * in prod it simply should not bind to localhost if there are
+		 * multiple machines in the cluster. Besides, Mac as a platform
+		 * is not supposed to be used in prod.
+		 */
+		is_critical = (err != EADDRNOTAVAIL);
+#else
+		/* The same as EADDRNOTAVAIL, but happens on Linux as EINVAL. */
+		is_critical = (err != EINVAL);
+#endif
+		if (is_critical)
+			diag_log();
+	}
 	if (task->complete != NULL)
 		task->complete(task, scheduler, size);
 }
diff --git a/src/lib/swim/swim_transport_udp.c b/src/lib/swim/swim_transport_udp.c
index c0317a20b..12626cc49 100644
--- a/src/lib/swim/swim_transport_udp.c
+++ b/src/lib/swim/swim_transport_udp.c
@@ -80,25 +80,27 @@ swim_transport_bind(struct swim_transport *transport,
 		return 0;
 	}
 
+	int is_on = 1;
+	int real_port = new_addr->sin_port;
 	int fd = sio_socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
 	if (fd < 0)
 		return -1;
-	if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0 ||
-	    evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) {
+	if (sio_bind(fd, (struct sockaddr *) addr, addr_len) != 0) {
 		if (errno == EADDRINUSE)
 			diag_set(SocketError, sio_socketname(fd), "bind");
-		close(fd);
-		return -1;
+		goto end_error;
 	}
-	int real_port = new_addr->sin_port;
+	if (sio_setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &is_on,
+			   sizeof(is_on)) != 0)
+		goto end_error;
+	if (evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0)
+		goto end_error;
 	if (is_new_port_any) {
 		struct sockaddr_in real_addr;
 		addr_len = sizeof(real_addr);
 		if (sio_getsockname(fd, (struct sockaddr *) &real_addr,
-				    &addr_len) != 0) {
-			close(fd);
-			return -1;
-		}
+				    &addr_len) != 0)
+			goto end_error;
 		real_port = real_addr.sin_port;
 	}
 	if (transport->fd != -1)
@@ -107,6 +109,9 @@ swim_transport_bind(struct swim_transport *transport,
 	transport->addr = *new_addr;
 	transport->addr.sin_port = real_port;
 	return 0;
+end_error:
+	close(fd);
+	return -1;
 }
 
 void


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option
  2021-03-19 21:46             ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-03-19 22:15               ` Cyrill Gorcunov via Tarantool-patches
  0 siblings, 0 replies; 16+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-03-19 22:15 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Fri, Mar 19, 2021 at 10:46:02PM +0100, Vladislav Shpilevoy wrote:
> >>
> >> On Mac it used to print EADDRNOTAVAIL message. Probably
> >> would fail with something on Linux too, but I don't know.
> >> Might just work fine as well.
> > 
> > Ack. Thanks! FWIW on linux this fails with
> >  | tarantool> SystemError sendto(177), called on fd 10, aka 127.0.0.1:3313: Invalid argument
> 
> I made the error not critical on Linux so it is ignored now. The
> new patch:
Ack

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] Swim broadcast errors
  2021-03-18  0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches
@ 2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 0 replies; 16+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-19 22:37 UTC (permalink / raw)
  To: tarantool-patches, gorcunov, sergepetrenko

Pushed to master, 2.7, 2.6.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-03-19 22:37 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-18  0:02 [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Vladislav Shpilevoy via Tarantool-patches
2021-03-18  0:02 ` [Tarantool-patches] [PATCH 1/2] swim: add SO_BROADCAST option Vladislav Shpilevoy via Tarantool-patches
2021-03-18  0:49   ` Vladislav Shpilevoy via Tarantool-patches
2021-03-18 19:37   ` Cyrill Gorcunov via Tarantool-patches
2021-03-18 20:47     ` Vladislav Shpilevoy via Tarantool-patches
2021-03-18 21:27       ` Cyrill Gorcunov via Tarantool-patches
2021-03-18 22:34         ` Vladislav Shpilevoy via Tarantool-patches
2021-03-18 23:18         ` Vladislav Shpilevoy via Tarantool-patches
2021-03-19  6:53           ` Cyrill Gorcunov via Tarantool-patches
2021-03-19 20:18             ` Vladislav Shpilevoy via Tarantool-patches
2021-03-19 20:29               ` Cyrill Gorcunov via Tarantool-patches
2021-03-19 21:46             ` Vladislav Shpilevoy via Tarantool-patches
2021-03-19 22:15               ` Cyrill Gorcunov via Tarantool-patches
2021-03-18  0:02 ` [Tarantool-patches] [PATCH 2/2] lua: separate sched and script diag Vladislav Shpilevoy via Tarantool-patches
2021-03-19 13:17 ` [Tarantool-patches] [PATCH 0/2] Swim broadcast errors Serge Petrenko via Tarantool-patches
2021-03-19 22:37 ` Vladislav Shpilevoy via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox