Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous
@ 2020-04-13 10:34 Serge Petrenko
  2020-04-13 10:37 ` Serge Petrenko
  2020-04-15 10:16 ` Kirill Yukhin
  0 siblings, 2 replies; 4+ messages in thread
From: Serge Petrenko @ 2020-04-13 10:34 UTC (permalink / raw)
  To: gorcunov; +Cc: tarantool-patches

relay_subscribe_f sets a recovery trigger notifying tx when a full log
is read and gc consumer corresponding to the replica may be advanced.
Since anonymous replicas do not have gc consumers, the trigger isn't
added for them. However, on relay exit, the trigger deletion depends
on replica->anon flag. This is buggy in case relay stalls on exit due to
replica disconnect. Replica has time to reconnect and register as a
normal instance, hence its replica->anon flag will be false by the time
we check whether to clear triggers or not, effectively making us to
clear unset triggers and segfault.

Fix this by initializing the triggers with trigger_create(), which
allows a trigger_clear() call, even if the triggers are not set, and
omit the replica->anon check.

Closes #4731

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
https://github.com/tarantool/tarantool/tree/sp/gh-4731-anon-segfault-v2
https://github.com/tarantool/tarantool/issues/4731

 src/box/relay.cc | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/box/relay.cc b/src/box/relay.cc
index c634348a4..5f34753fc 100644
--- a/src/box/relay.cc
+++ b/src/box/relay.cc
@@ -580,9 +580,8 @@ relay_subscribe_f(va_list ap)
 	 * Not needed for anonymous replicas, since they
 	 * aren't registered with gc at all.
 	 */
-	struct trigger on_close_log = {
-		RLIST_LINK_INITIALIZER, relay_on_close_log_f, relay, NULL
-	};
+	struct trigger on_close_log;
+	trigger_create(&on_close_log, relay_on_close_log_f, relay, NULL);
 	if (!relay->replica->anon)
 		trigger_add(&r->on_close_log, &on_close_log);
 
@@ -662,9 +661,12 @@ relay_subscribe_f(va_list ap)
 	diag_log();
 	say_crit("exiting the relay loop");
 
-	/* Clear garbage collector trigger and WAL watcher. */
-	if (!relay->replica->anon)
-		trigger_clear(&on_close_log);
+	/*
+	 * Clear garbage collector trigger and WAL watcher.
+	 * trigger_clear() does nothing in case the triggers
+	 * aren't set (the replica is anonymous).
+	 */
+	trigger_clear(&on_close_log);
 	wal_clear_watcher(&relay->wal_watcher, cbus_process);
 
 	/* Join ack reader fiber. */
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous
  2020-04-13 10:34 [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous Serge Petrenko
@ 2020-04-13 10:37 ` Serge Petrenko
  2020-04-13 13:24   ` Sergey Bronnikov
  2020-04-15 10:16 ` Kirill Yukhin
  1 sibling, 1 reply; 4+ messages in thread
From: Serge Petrenko @ 2020-04-13 10:37 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tml

Sergey, would you kindly check this patch with your reproducer?
I failed to reproduce the issue on my side.

--
Serge Petrenko
sergepetrenko@tarantool.org




> 13 апр. 2020 г., в 13:34, Serge Petrenko <sergepetrenko@tarantool.org> написал(а):
> 
> relay_subscribe_f sets a recovery trigger notifying tx when a full log
> is read and gc consumer corresponding to the replica may be advanced.
> Since anonymous replicas do not have gc consumers, the trigger isn't
> added for them. However, on relay exit, the trigger deletion depends
> on replica->anon flag. This is buggy in case relay stalls on exit due to
> replica disconnect. Replica has time to reconnect and register as a
> normal instance, hence its replica->anon flag will be false by the time
> we check whether to clear triggers or not, effectively making us to
> clear unset triggers and segfault.
> 
> Fix this by initializing the triggers with trigger_create(), which
> allows a trigger_clear() call, even if the triggers are not set, and
> omit the replica->anon check.
> 
> Closes #4731
> 
> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> https://github.com/tarantool/tarantool/tree/sp/gh-4731-anon-segfault-v2
> https://github.com/tarantool/tarantool/issues/4731
> 
> src/box/relay.cc | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/src/box/relay.cc b/src/box/relay.cc
> index c634348a4..5f34753fc 100644
> --- a/src/box/relay.cc
> +++ b/src/box/relay.cc
> @@ -580,9 +580,8 @@ relay_subscribe_f(va_list ap)
> 	 * Not needed for anonymous replicas, since they
> 	 * aren't registered with gc at all.
> 	 */
> -	struct trigger on_close_log = {
> -		RLIST_LINK_INITIALIZER, relay_on_close_log_f, relay, NULL
> -	};
> +	struct trigger on_close_log;
> +	trigger_create(&on_close_log, relay_on_close_log_f, relay, NULL);
> 	if (!relay->replica->anon)
> 		trigger_add(&r->on_close_log, &on_close_log);
> 
> @@ -662,9 +661,12 @@ relay_subscribe_f(va_list ap)
> 	diag_log();
> 	say_crit("exiting the relay loop");
> 
> -	/* Clear garbage collector trigger and WAL watcher. */
> -	if (!relay->replica->anon)
> -		trigger_clear(&on_close_log);
> +	/*
> +	 * Clear garbage collector trigger and WAL watcher.
> +	 * trigger_clear() does nothing in case the triggers
> +	 * aren't set (the replica is anonymous).
> +	 */
> +	trigger_clear(&on_close_log);
> 	wal_clear_watcher(&relay->wal_watcher, cbus_process);
> 
> 	/* Join ack reader fiber. */
> -- 
> 2.21.1 (Apple Git-122.3)
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous
  2020-04-13 10:37 ` Serge Petrenko
@ 2020-04-13 13:24   ` Sergey Bronnikov
  0 siblings, 0 replies; 4+ messages in thread
From: Sergey Bronnikov @ 2020-04-13 13:24 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tml

Hi,

thanks for the fix!

On 13:37 Mon 13 Apr , Serge Petrenko wrote:
> Sergey, would you kindly check this patch with your reproducer?
> I failed to reproduce the issue on my side.

reproducer passed 200 times out of 200 runs.

> --
> Serge Petrenko
> sergepetrenko@tarantool.org
> 
> > 13 апр. 2020 г., в 13:34, Serge Petrenko <sergepetrenko@tarantool.org> написал(а):
> > 
> > relay_subscribe_f sets a recovery trigger notifying tx when a full log
> > is read and gc consumer corresponding to the replica may be advanced.
> > Since anonymous replicas do not have gc consumers, the trigger isn't
> > added for them. However, on relay exit, the trigger deletion depends
> > on replica->anon flag. This is buggy in case relay stalls on exit due to
> > replica disconnect. Replica has time to reconnect and register as a
> > normal instance, hence its replica->anon flag will be false by the time
> > we check whether to clear triggers or not, effectively making us to
> > clear unset triggers and segfault.
> > 
> > Fix this by initializing the triggers with trigger_create(), which
> > allows a trigger_clear() call, even if the triggers are not set, and
> > omit the replica->anon check.
> > 
> > Closes #4731
> > 
> > Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
> > ---
> > https://github.com/tarantool/tarantool/tree/sp/gh-4731-anon-segfault-v2
> > https://github.com/tarantool/tarantool/issues/4731
> > 
> > src/box/relay.cc | 14 ++++++++------
> > 1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/box/relay.cc b/src/box/relay.cc
> > index c634348a4..5f34753fc 100644
> > --- a/src/box/relay.cc
> > +++ b/src/box/relay.cc
> > @@ -580,9 +580,8 @@ relay_subscribe_f(va_list ap)
> > 	 * Not needed for anonymous replicas, since they
> > 	 * aren't registered with gc at all.
> > 	 */
> > -	struct trigger on_close_log = {
> > -		RLIST_LINK_INITIALIZER, relay_on_close_log_f, relay, NULL
> > -	};
> > +	struct trigger on_close_log;
> > +	trigger_create(&on_close_log, relay_on_close_log_f, relay, NULL);
> > 	if (!relay->replica->anon)
> > 		trigger_add(&r->on_close_log, &on_close_log);
> > 
> > @@ -662,9 +661,12 @@ relay_subscribe_f(va_list ap)
> > 	diag_log();
> > 	say_crit("exiting the relay loop");
> > 
> > -	/* Clear garbage collector trigger and WAL watcher. */
> > -	if (!relay->replica->anon)
> > -		trigger_clear(&on_close_log);
> > +	/*
> > +	 * Clear garbage collector trigger and WAL watcher.
> > +	 * trigger_clear() does nothing in case the triggers
> > +	 * aren't set (the replica is anonymous).
> > +	 */
> > +	trigger_clear(&on_close_log);
> > 	wal_clear_watcher(&relay->wal_watcher, cbus_process);
> > 
> > 	/* Join ack reader fiber. */
> > -- 
> > 2.21.1 (Apple Git-122.3)
> > 
> 

-- 
sergeyb@

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous
  2020-04-13 10:34 [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous Serge Petrenko
  2020-04-13 10:37 ` Serge Petrenko
@ 2020-04-15 10:16 ` Kirill Yukhin
  1 sibling, 0 replies; 4+ messages in thread
From: Kirill Yukhin @ 2020-04-15 10:16 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches

Hello,

On 13 апр 13:34, Serge Petrenko wrote:
> relay_subscribe_f sets a recovery trigger notifying tx when a full log
> is read and gc consumer corresponding to the replica may be advanced.
> Since anonymous replicas do not have gc consumers, the trigger isn't
> added for them. However, on relay exit, the trigger deletion depends
> on replica->anon flag. This is buggy in case relay stalls on exit due to
> replica disconnect. Replica has time to reconnect and register as a
> normal instance, hence its replica->anon flag will be false by the time
> we check whether to clear triggers or not, effectively making us to
> clear unset triggers and segfault.
> 
> Fix this by initializing the triggers with trigger_create(), which
> allows a trigger_clear() call, even if the triggers are not set, and
> omit the replica->anon check.
> 
> Closes #4731
> 
> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
> https://github.com/tarantool/tarantool/tree/sp/gh-4731-anon-segfault-v2
> https://github.com/tarantool/tarantool/issues/4731

I've checked your patch into 2.3 and master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-04-15 10:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-13 10:34 [Tarantool-patches] [PATCH v2] relay: fix segfault on replica transition from anonymous Serge Petrenko
2020-04-13 10:37 ` Serge Petrenko
2020-04-13 13:24   ` Sergey Bronnikov
2020-04-15 10:16 ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox