[Tarantool-patches] [PATCH] fiber: abort trace recording on fiber yield

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Apr 1 02:57:17 MSK 2020


Hi! Thanks for the patch!

See 7 comments below.

> diff --git a/src/lua/fiber.c b/src/lua/fiber.c
> index 45bc03787..c79aa7c2b 100644
> --- a/src/lua/fiber.c
> +++ b/src/lua/fiber.c
> @@ -454,6 +455,25 @@ lua_fiber_run_f(MAYBE_UNUSED va_list ap)
>  	return result;
>  }
>  
> +static int
> +fiber_on_yield(struct trigger *trigger, void *event)
> +{
> +	(void) trigger;
> +	(void) event;
> +
> +	/*
> +	 * XXX: According to LuaJIT API reference luaJIT_setmode

1. What is 'XXX'?

> +	 * function returns 0 on failure and 1 on success. Since
> +	 * this call is aimed to abort the trace recording on
> +	 * fiber_yield call and prevent already compiled traces
> +	 * execution, the mode parameter is an invalid value (-1)
> +	 * so JIT mode change is expected to fail and 0 is
> +	 * returned. Otherwise non-zero return value signals
> +	 * trigger_run routine about this trigger failure.

2. So correct me if I am wrong - we deliberately fail this
function assuming that its call stops current trace collection/execution,
right? And we pass -1 because there is no other way to stop
trace collection/execution except this function.

Why can't we call lj_trace_abort() directly?

And most importantly, how does it affect perf? New trigger
is +1 virtual call on each yield of every Lua fiber and +1
execution of non-trival function luaJIT_setmode(). I think
it is better to write a micro bench checking how many yields
can we do per time unit before and after this patch. From Lua
fibers.

> +	 */
> +	return luaJIT_setmode(tarantool_L, 0, -1);

3. In case that function ever returns 1, that breaks our
error return convention, that error means -1 (it will work
in triggers, since they do '!= 0', but anyway). Also when
there is an error, diag is expected to be set. Better do
diag_set(LuajitError) here when this fails. Or even panic()
since it is not supposed to return 1 when the mode is bad.
Or assume it is always 0 (add assert()), and return const 0
from the trigger.

> +}
> +
>  /**
>   * Utility function for fiber.create and fiber.new
>   */
> @@ -467,10 +487,18 @@ fiber_create(struct lua_State *L)
>  
>  	struct fiber *f = fiber_new("lua", lua_fiber_run_f);
>  	if (f == NULL) {
> -		luaL_unref(L, LUA_REGISTRYINDEX, coro_ref);
> -		luaT_error(L);
> +		/* diagnostics is set in fiber_new. */
> +		goto error;
>  	}
>  
> +	struct trigger *t = malloc(sizeof(*t));
> +	if (t == NULL) {
> +		diag_set(OutOfMemory, sizeof(*t), "malloc", "t");
> +		goto error;
> +	}
> +	trigger_create(t, fiber_on_yield, NULL, (trigger_f0) free);
> +	trigger_add(&f->on_yield, t);

4. Yeah, now I understand why did you ask about when triggers are
deleted. In this particular case there is a better option. You can
allocate the trigger right on the stack, in lua_fiber_run_f(). This
is the trampoline for all Lua fibers, regardless how they are
created: via fiber.new(), or fiber.create(). You can add the trigger
before

    result = luaT_call(L, lua_gettop(L) - 1, LUA_MULTRET);

And remove afterwards. No need to use the heap then, nor move some
error handling code.

5. Also there is a question what will happen, if Lua is accessed from
a fiber, created not in Lua? That is basically all pooled fibers,
which serve IProto requests. See struct fiber_pool and its single
usage place. These fibers are started from C, and may end up executing
Lua code, if, for example, an IPROTO_CALL arrived.

The same question can be applied to fibers, created from C API:

    API_EXPORT struct fiber *
    fiber_new(const char *name, fiber_func f);

These fibers won't have the on_yield trigger, but can access Lua.
For example, they can run Lua on_replace triggers in a space by making
an insertion into it. Also they can access Lua API directly. By
calling things like luaT_call() (see module.h).

You may consider adding Jit collection/tracing termination to the
scheduler fiber. It is just one fiber, and from what I know, every
fiber after yield switches current thread to the scheduler fiber. This
also maybe less expensive. We probably won't even notice.

6. There is also a question about how does it affect collection of
valid traces? I don't know how Jit does it, so ask just in case.
Won't this make traces never collected? Because every fiber yields
soon or late.

7. Does this patch allow to yield in ffi?


More information about the Tarantool-patches mailing list