Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber
@ 2020-11-06 23:43 Vladislav Shpilevoy
  2020-11-08 18:04 ` Vladislav Shpilevoy
  2020-11-10 22:05 ` Vladislav Shpilevoy
  0 siblings, 2 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-06 23:43 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Raft worker fiber does all the heavy and yielding jobs. These are
2 - disk write, and network broadcast. Disk write yields. Network
broadcast is slow, so it happens at most once per event loop
iteration.

The worker on each iteration check if any of these 2 jobs is
active, and if not, it goes to sleep until an explicit wakeup.

But there was a bug. Before going to sleep it did a yield + a
check that there is nothing to do. However during the yield new
tasks could appear, and the check failed, leading to a crash.

The patch reorganizes this part of the code so now the worker does
not yield between checking new tasks and going to sleep.

No test, because extremely hard to reproduce, and don't want to
clog this part of the code with error injections.
---
Branch: http://github.com/tarantool/tarantool/tree/gerold103/raft-crash

 src/box/raft.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/box/raft.c b/src/box/raft.c
index e1e60ce94..914b0d68f 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -682,11 +682,11 @@ raft_worker_f(va_list args)
 			raft_worker_handle_broadcast();
 			is_idle = false;
 		}
+		if (is_idle) {
+			assert(raft_is_fully_on_disk());
+			fiber_yield();
+		}
 		fiber_sleep(0);
-		if (!is_idle)
-			continue;
-		assert(raft_is_fully_on_disk());
-		fiber_yield();
 	}
 	return 0;
 }
-- 
2.21.1 (Apple Git-122.3)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber
  2020-11-06 23:43 [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber Vladislav Shpilevoy
@ 2020-11-08 18:04 ` Vladislav Shpilevoy
  2020-11-09 11:10   ` Serge Petrenko
  2020-11-10 20:59   ` Alexander V. Tikhonov
  2020-11-10 22:05 ` Vladislav Shpilevoy
  1 sibling, 2 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-08 18:04 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

> diff --git a/src/box/raft.c b/src/box/raft.c
> index e1e60ce94..914b0d68f 100644
> --- a/src/box/raft.c
> +++ b/src/box/raft.c
> @@ -682,11 +682,11 @@ raft_worker_f(va_list args)
>  			raft_worker_handle_broadcast();
>  			is_idle = false;
>  		}
> +		if (is_idle) {
> +			assert(raft_is_fully_on_disk());
> +			fiber_yield();
> +		}
>  		fiber_sleep(0);
> -		if (!is_idle)
> -			continue;
> -		assert(raft_is_fully_on_disk());
> -		fiber_yield();
>  	}
>  	return 0;

I changed this part just a little. Because it makes no sense to
sleep(0) right after a yield. Force pushed to the branch.

====================
@@ -685,8 +685,9 @@ raft_worker_f(va_list args)
 		if (is_idle) {
 			assert(raft_is_fully_on_disk());
 			fiber_yield();
+		} else {
+			fiber_sleep(0);
 		}
-		fiber_sleep(0);
 	}
 	return 0;
 }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber
  2020-11-08 18:04 ` Vladislav Shpilevoy
@ 2020-11-09 11:10   ` Serge Petrenko
  2020-11-10 20:59   ` Alexander V. Tikhonov
  1 sibling, 0 replies; 5+ messages in thread
From: Serge Petrenko @ 2020-11-09 11:10 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches


08.11.2020 21:04, Vladislav Shpilevoy пишет:
>> diff --git a/src/box/raft.c b/src/box/raft.c
>> index e1e60ce94..914b0d68f 100644
>> --- a/src/box/raft.c
>> +++ b/src/box/raft.c
>> @@ -682,11 +682,11 @@ raft_worker_f(va_list args)
>>   			raft_worker_handle_broadcast();
>>   			is_idle = false;
>>   		}
>> +		if (is_idle) {
>> +			assert(raft_is_fully_on_disk());
>> +			fiber_yield();
>> +		}
>>   		fiber_sleep(0);
>> -		if (!is_idle)
>> -			continue;
>> -		assert(raft_is_fully_on_disk());
>> -		fiber_yield();
>>   	}
>>   	return 0;
> I changed this part just a little. Because it makes no sense to
> sleep(0) right after a yield. Force pushed to the branch.
>
> ====================
> @@ -685,8 +685,9 @@ raft_worker_f(va_list args)
>   		if (is_idle) {
>   			assert(raft_is_fully_on_disk());
>   			fiber_yield();
> +		} else {
> +			fiber_sleep(0);
>   		}
> -		fiber_sleep(0);
>   	}
>   	return 0;
>   }
Hi! Thanks for the patch!
LGTM.

-- 
Serge Petrenko

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber
  2020-11-08 18:04 ` Vladislav Shpilevoy
  2020-11-09 11:10   ` Serge Petrenko
@ 2020-11-10 20:59   ` Alexander V. Tikhonov
  1 sibling, 0 replies; 5+ messages in thread
From: Alexander V. Tikhonov @ 2020-11-10 20:59 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi Vlad, I've checked all results in gitlab-ci, and no new degradations
found [1], patch LGTM.

[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/213343120

On Sun, Nov 08, 2020 at 07:04:09PM +0100, Vladislav Shpilevoy wrote:
> > diff --git a/src/box/raft.c b/src/box/raft.c
> > index e1e60ce94..914b0d68f 100644
> > --- a/src/box/raft.c
> > +++ b/src/box/raft.c
> > @@ -682,11 +682,11 @@ raft_worker_f(va_list args)
> >  			raft_worker_handle_broadcast();
> >  			is_idle = false;
> >  		}
> > +		if (is_idle) {
> > +			assert(raft_is_fully_on_disk());
> > +			fiber_yield();
> > +		}
> >  		fiber_sleep(0);
> > -		if (!is_idle)
> > -			continue;
> > -		assert(raft_is_fully_on_disk());
> > -		fiber_yield();
> >  	}
> >  	return 0;
> 
> I changed this part just a little. Because it makes no sense to
> sleep(0) right after a yield. Force pushed to the branch.
> 
> ====================
> @@ -685,8 +685,9 @@ raft_worker_f(va_list args)
>  		if (is_idle) {
>  			assert(raft_is_fully_on_disk());
>  			fiber_yield();
> +		} else {
> +			fiber_sleep(0);
>  		}
> -		fiber_sleep(0);
>  	}
>  	return 0;
>  }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber
  2020-11-06 23:43 [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber Vladislav Shpilevoy
  2020-11-08 18:04 ` Vladislav Shpilevoy
@ 2020-11-10 22:05 ` Vladislav Shpilevoy
  1 sibling, 0 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-11-10 22:05 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Pushed to master and 2.6.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-10 22:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-06 23:43 [Tarantool-patches] [PATCH 1/1] raft: fix crash in worker fiber Vladislav Shpilevoy
2020-11-08 18:04 ` Vladislav Shpilevoy
2020-11-09 11:10   ` Serge Petrenko
2020-11-10 20:59   ` Alexander V. Tikhonov
2020-11-10 22:05 ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox