[Tarantool-patches] [PATCH v2 2/2] fiber: fiber_join -- don't crash on misuse

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Apr 29 00:16:20 MSK 2021


> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index 02ec3d158..0c8238cab 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -791,9 +793,11 @@ lbox_fiber_join(struct lua_State *L)
>  	int num_ret = 0;
>  	int coro_ref = 0;
>  
> -	if (!(fiber->flags & FIBER_IS_JOINABLE))
> -		luaL_error(L, "the fiber is not joinable");
> -	fiber_join(fiber);
> +	if (fiber_join(fiber) != 0) {
> +		e = diag_last_error(&fiber()->diag);
> +		if (e->type == &type_IllegalParams)
> +			luaL_error(L, e->errmsg);

After looking at this hunk I realized that it might be wrong to
allow calling join on a non-joinable fiber. Firstly, you have no
way to check why join returned -1: because the fiber wasn't
joinable or because this is what the fiber's function returned.
It is simply impossible in the public API (module.h).
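
To illustrate the ambiguity, here is a minimal self-contained sketch.
It does not use the real API: struct mock_fiber, mock_fiber_join() and
the two fiber functions are hypothetical stand-ins that only model the
behaviour of the patched fiber_join() as I understand it from the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct fiber, not the real Tarantool type. */
struct mock_fiber {
	bool is_joinable;
	int (*func)(void);
};

/* A fiber function that legitimately fails. */
int
failing_func(void)
{
	return -1;
}

/* A fiber function that never fails. */
int
ok_func(void)
{
	return 0;
}

/*
 * Models the patched behaviour (an assumption, not the real code):
 * -1 on misuse, otherwise whatever the fiber's function returned.
 */
int
mock_fiber_join(struct mock_fiber *f)
{
	if (!f->is_joinable)
		return -1;
	return f->func();
}

/* Both calls return -1, the caller cannot tell the two cases apart. */
void
demo_ambiguity(void)
{
	struct mock_fiber failed = { true, failing_func };
	struct mock_fiber misused = { false, ok_func };
	assert(mock_fiber_join(&failed) == -1);
	assert(mock_fiber_join(&misused) == -1);
}
```

In the second case ok_func() was never run at all, yet the return
value looks exactly like the first case.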

Secondly, fiber_join() is documented to always return the
fiber's function result. I see it in module.h and on the site.
Here the behaviour has kind of changed - it might return something
even if the fiber didn't really end. This is especially bad if the
fiber was using some resources which are freed right after the join.
And doubly bad if the user's function never fails, so the
fiber_join() result wasn't even checked in their code.

Thirdly, this leads to inconsistent behaviour. In this example
fiber.join does not raise an error - it returns false + error:

	fiber = require('fiber')
	do
	    f = fiber.new(function() box.error('other error') end)
	    f:set_joinable(true)
	end

	tarantool> f:join()
	---
	- false
	- '[string "do..."]:2: box.error(): bad arguments'
	...

But when I change the error type, it raises the error:

	fiber = require('fiber')
	do
	    f = fiber.new(function() box.lib.load() end)
	    f:set_joinable(true)
	end

	tarantool> f:join()
	---
	- error: Expects box.lib.load('name') but no name passed
	...

It didn't happen before your patch. The same problem exists for
fiber_join_timeout(), but at least it is documented to be able to
return before the fiber has joined.

With that said, I think we must call panic() on an attempt to join
a non-joinable fiber.

For easier usage we might need to introduce fiber_join_ex(), which
wouldn't mix its own failure with the failure of the fiber's
function. But maybe not now since nobody really asked for that.
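
One possible shape for such a function, as a rough sketch only.
Nothing here exists in Tarantool: struct sketch_fiber, the signature
of sketch_fiber_join_ex() and the out-parameter are all assumptions.
The idea is that the return value reports only the join's own status,
and the fiber function's result goes through a separate pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fiber, not the real Tarantool type. */
struct sketch_fiber {
	bool is_joinable;
	int (*func)(void);
};

/* A fiber function that legitimately fails. */
int
sketch_failing_func(void)
{
	return -1;
}

/* A fiber function that never fails. */
int
sketch_ok_func(void)
{
	return 0;
}

/*
 * Hypothetical join variant. Returns 0 if the join itself succeeded
 * and -1 if the fiber is not joinable. The fiber function's result
 * is stored in *func_result only when the join succeeded.
 */
int
sketch_fiber_join_ex(struct sketch_fiber *f, int *func_result)
{
	assert(func_result != NULL);
	if (!f->is_joinable)
		return -1;
	*func_result = f->func();
	return 0;
}
```

With this split a caller can see rc == 0 with *func_result == -1
(the function failed) as distinct from rc == -1 (join misuse).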

