[Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Nov 17 01:12:42 MSK 2020


Hi! Thanks for the patch!

Please, change subsystem name to 'fiber:'. 'core:' is too general.
We have tons of stuff in libcore in addition to fibers.

> Before this patch fiber.cond():wait() just returns for cancelled
> fiber. In contrast fiber.channel():get() threw "fiber is
> canceled" error.
> This patch unify behaviour of channels and condvars and also fixes
> related net.box module problem - it was impossible to interrupt
> net.box call with fiber.cancel because it used fiber.cond under
> the hood. Test cases for both bugs are added.

Netbox hangs not because of using fiber.cond. But because it never
calls testcancel(). Normally all looped fibers should do that.

Talking of compatibility - I think it always was supposed to throw.
luaT_fiber_cond_wait() calls luaL_testcancel(), but only when
fiber_cond_wait_timeout() returns not 0, which was never the case for
cancellation. So it was supposed to throw, but nobody covered it with
a test.

See 2 comments below.

> Closes #4834
> Closes #5013
> 
> Co-authored-by: Oleg Babin <olegrok at tarantool.org>
> diff --git a/src/box/box.cc b/src/box/box.cc
> index 18568df3b..29f74e94b 100644
> --- a/src/box/box.cc
> +++ b/src/box/box.cc
> @@ -305,10 +305,9 @@ box_wait_ro(bool ro, double timeout)
>  {
>  	double deadline = ev_monotonic_now(loop()) + timeout;
>  	while (is_box_configured == false || box_is_ro() != ro) {
> -		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0)
> -			return -1;
> -		if (fiber_is_cancelled()) {
> -			diag_set(FiberIsCancelled);
> +		if (fiber_cond_wait_deadline(&ro_cond, deadline) != 0) {
> +			if (fiber_is_cancelled())
> +				diag_set(FiberIsCancelled);
>  			return -1;
>  		}
>  	}
> diff --git a/src/lib/core/fiber_cond.c b/src/lib/core/fiber_cond.c
> index 904a350d9..cc59eaafb 100644
> --- a/src/lib/core/fiber_cond.c
> +++ b/src/lib/core/fiber_cond.c
> @@ -108,6 +108,11 @@ fiber_cond_wait_timeout(struct fiber_cond *c, double timeout)
>  		diag_set(TimedOut);
>  		return -1;
>  	}
> +	if (fiber_is_cancelled()) {
> +		if (diag_is_empty(diag_get()))
> +                        diag_set(FiberIsCancelled);
> +		return -1;

1. Wtf? Why don't you set an error, when this is an error? And why do
you do exactly the same in box_wait_ro() above?

> +	}
>  	return 0;
>  }> diff --git a/test/box/net.box_fiber_cancel_gh-4834.result b/test/box/net.box_fiber_cancel_gh-4834.result
> new file mode 100644
> index 000000000..4ed04bb61
> --- /dev/null
> +++ b/test/box/net.box_fiber_cancel_gh-4834.result

2. This does not conform to our test file name pattern. Read this
carefully: https://github.com/tarantool/tarantool/wiki/Code-review-procedure#testing


More information about the Tarantool-patches mailing list