From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 096BD469719 for ; Tue, 17 Nov 2020 01:12:44 +0300 (MSK) References: <20201031162911.61876-1-sergos@tarantool.org> From: Vladislav Shpilevoy Message-ID: Date: Mon, 16 Nov 2020 23:12:42 +0100 MIME-Version: 1.0 In-Reply-To: <20201031162911.61876-1-sergos@tarantool.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: sergos@tarantool.org, tarantool-patches@dev.tarantool.org Cc: alexander.turenko@tarantool.org Hi! Thanks for the patch! Please, change subsystem name to 'fiber:'. 'core:' is too general. We have tons of stuff in libcore in addition to fibers. > Before this patch fiber.cond():wait() just returns for cancelled > fiber. In contrast fiber.channel():get() threw "fiber is > canceled" error. > This patch unify behaviour of channels and condvars and also fixes > related net.box module problem - it was impossible to interrupt > net.box call with fiber.cancel because it used fiber.cond under > the hood. Test cases for both bugs are added. Netbox hangs not because of using fiber.cond. But because it never calls testcancel(). Normally all looped fibers should do that. Talking of compatibility - I think it always was supposed to throw. luaT_fiber_cond_wait() calls luaL_testcancel(), but only when fiber_cond_wait_timeout() returns not 0, which was never the case for cancellation. So it was supposed to throw, but nobody covered it with a test. See 2 comments below. > Closes #4834 > Closes #5013 > > Co-authored-by: Oleg Babin > diff --git a/src/box/box.cc b/src/box/box.cc > index 18568df3b..29f74e94b 100644 > --- a/src/box/box.cc > +++ b/src/box/box.cc > @@ -305,10 +305,9 @@ box_wait_ro(bool ro, double timeout) > { > double deadline = ev_monotonic_now(loop()) + timeout; > while (is_box_configured == false || box_is_ro() != ro) { > - if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) > - return -1; > - if (fiber_is_cancelled()) { > - diag_set(FiberIsCancelled); > + if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) { > + if (fiber_is_cancelled()) > + diag_set(FiberIsCancelled); > return -1; > } > } > diff --git a/src/lib/core/fiber_cond.c b/src/lib/core/fiber_cond.c > index 904a350d9..cc59eaafb 100644 > --- a/src/lib/core/fiber_cond.c > +++ b/src/lib/core/fiber_cond.c > @@ -108,6 +108,11 @@ fiber_cond_wait_timeout(struct fiber_cond *c, double timeout) > diag_set(TimedOut); > return -1; > } > + if (fiber_is_cancelled()) { > + if (diag_is_empty(diag_get())) > + diag_set(FiberIsCancelled); > + return -1; 1. Wtf? Why don't you set an error, when this is an error? And why do you do exactly the same in box_wait_ro() above? > + } > return 0; > }> diff --git a/test/box/net.box_fiber_cancel_gh-4834.result b/test/box/net.box_fiber_cancel_gh-4834.result > new file mode 100644 > index 000000000..4ed04bb61 > --- /dev/null > +++ b/test/box/net.box_fiber_cancel_gh-4834.result 2. This does not conform to our test file name pattern. Read this carefully: https://github.com/tarantool/tarantool/wiki/Code-review-procedure#testing