From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Serge Petrenko Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_EA7E43CF-656F-4AE8-89EB-C8C305272DF7" Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: [tarantool-patches] Re: [PATCH] Fix fiber_join() hang in case fiber_cancel() was called Date: Tue, 5 Feb 2019 18:02:58 +0300 In-Reply-To: <20190205090042.vjtymjg7vp52aaki@esperanza> References: <20190204135138.51960-1-sergepetrenko@tarantool.org> <20190204174404.qyalziyrhz6use5g@esperanza> <08B70EC3-6EA3-405B-964F-7C10B5BA62F3@tarantool.org> <20190205090042.vjtymjg7vp52aaki@esperanza> To: Vladimir Davydov Cc: Georgy Kirichenko , tarantool-patches@freelists.org List-ID: --Apple-Mail=_EA7E43CF-656F-4AE8-89EB-C8C305272DF7 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > 5 =D1=84=D0=B5=D0=B2=D1=80. 2019 =D0=B3., =D0=B2 12:00, Vladimir = Davydov =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0=B0=D0=BB= (=D0=B0): >=20 > On Tue, Feb 05, 2019 at 09:28:01AM +0300, Serge Petrenko wrote: >>=20 >>=20 >>> 4 =D1=84=D0=B5=D0=B2=D1=80. 2019 =D0=B3., =D0=B2 20:44, Vladimir = Davydov =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0=B0=D0=BB= (=D0=B0): >>>=20 >>>>=20 >>>> diff --git a/src/fiber.c b/src/fiber.c >>>> index 6f3d0ab78..4dc6b3c5a 100644 >>>> --- a/src/fiber.c >>>> +++ b/src/fiber.c >>>> @@ -392,9 +392,17 @@ fiber_join(struct fiber *fiber) >>>> assert(fiber->flags & FIBER_IS_JOINABLE); >>>>=20 >>>> if (! fiber_is_dead(fiber)) { >>>> - rlist_add_tail_entry(&fiber->wake, fiber(), state); >>>>=20 >>>> do { >>>> + /* >>>> + * In case fiber is cancelled during yield >>>> + * it will be removed from wake queue by a >>>> + * wakeup following the cancel. >>>> + * Having multiple entries for the same fiber >>>> + * doesn't hurt, since wakeup is executed only >>>> + * once per fiber. >>>> + */ >>>> + rlist_add_tail_entry(&fiber->wake, fiber(), = state); >>>=20 >>> I don't quite like the idea that cancelling a fiber that is joining >>> another fiber will have no effect until the other fiber has exited. >>> Can't we break the loop if fiber_is_cancelled()? >>=20 >> We can do that. But then we have to set FIBER_IS_JOINABLE to false >> for the joined fiber so that it executes fiber_recycle(). >=20 > Why should we? IMO the user should be free to kill a fiber executing > fiber_join. If that happens, the joinable fiber shouldn't be collected > until another fiber joins it successfully. This would be consistent = with > pthread_join behavior. >=20 >> Otherwise it will leak. Is it ok? As discussed verbally, on cancellation lets make the fiber non-joinable = and exit. I addressed your other comments in v2. Please, take a look. --Apple-Mail=_EA7E43CF-656F-4AE8-89EB-C8C305272DF7 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

5 =D1=84=D0=B5=D0=B2=D1=80. 2019 =D0=B3., =D0=B2 12:00, = Vladimir Davydov <vdavydov.dev@gmail.com> =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0= =B0=D0=BB(=D0=B0):

On Tue, Feb = 05, 2019 at 09:28:01AM +0300, Serge Petrenko wrote:


4 = =D1=84=D0=B5=D0=B2=D1=80. 2019 =D0=B3., =D0=B2 20:44, Vladimir Davydov = <vdavydov.dev@gmail.com> =D0=BD=D0=B0=D0=BF=D0=B8=D1=81=D0= =B0=D0=BB(=D0=B0):


diff --git a/src/fiber.c b/src/fiber.c
index 6f3d0ab78..4dc6b3c5a 100644
--- = a/src/fiber.c
+++ b/src/fiber.c
@@ -392,9 = +392,17 @@ fiber_join(struct fiber *fiber)
= assert(fiber->flags & FIBER_IS_JOINABLE);

= if (! fiber_is_dead(fiber)) {
- = rlist_add_tail_entry(&fiber->wake, fiber(), state);

do {
+ /*
+ = = =  * In case = fiber is cancelled during yield
+  * it will be removed from = wake queue by a
+  * wakeup following the = cancel.
+  * Having multiple entries = for the same fiber
+  * doesn't hurt, since = wakeup is executed only
+  * once per fiber.
+ = = =  */
+ = = = rlist_add_tail_entry(&fiber->wake, fiber(), state);

I don't quite like the idea that = cancelling a fiber that is joining
another fiber will have = no effect until the other fiber has exited.
Can't we break = the loop if fiber_is_cancelled()?

We can do that. But then we have to set FIBER_IS_JOINABLE to = false
for the joined fiber so that it executes = fiber_recycle().

Why should we? IMO the user should be free to kill a fiber = executing
fiber_join. = If that happens, the joinable fiber shouldn't be collected
until another fiber joins it = successfully. This would be consistent with
pthread_join behavior.

Otherwise it will leak.  Is it = ok?

As = discussed verbally, on cancellation lets make the fiber non-joinable and = exit.
I addressed your other comments in v2. Please, take a = look.

= --Apple-Mail=_EA7E43CF-656F-4AE8-89EB-C8C305272DF7--