[Tarantool-patches] [PATCH v2] core: handle fiber cancellation for fiber.cond

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Tue Nov 17 01:12:46 MSK 2020


On 03.11.2020 11:20, Sergey Ostanevich wrote:
> Hi Oleg!
> 
> I believe the point about 'consistency' is not valid here. I put a
> simple check that if diag is already set, then print it out. For the
> fiber_cond_wait_timeout() it happened multiple times with various
> reports, including this one:
> 
> 2020-11-03 10:28:01.630 [72411] relay/unix/:(socket)/101/main C> Did not
> set the DIAG to FiberIsCancelled, original diag: Missing .xlog file
> between LSN 5 {1: 5} and 6 {1: 6}
> 
> that is used in the test system:
> 
> test_run:wait_upstream(1, {message_re = 'Missing %.xlog file', status =
> 'loading'})
> 
> So, my resolution will be: it is wrong to set a diag in an arbitrary
> place, without clear understanding of the reason. This is the case for
> the cond_wait machinery, since it doesn't know _why_ the fiber is
> cancelled.

It is a wrong resolution, IMO. You just hacked cond wait so as not to change
the other places. It is not about tests. Tests only show what the internal
subsystems provide. And if they depend on fiber cond not setting a diag on
failure, then that looks wrong.

I suggest you fix the call sites where the caller code assumes that
cond_wait never sets a diag on cancellation.

If a function fails, we set a diag. It is not something we do optionally.
Otherwise you make things a bit simpler in this patch, but make the cond
harder to work with in the future.
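
To illustrate the convention, here is a minimal sketch in plain C. All
names in it (mock_diag, mock_cond_wait, mock_relay_step) are hypothetical
stand-ins, not Tarantool's real API. It only shows the shape of the idea:
the wait always sets a diag when it fails, and a caller that wants to keep
an earlier error saves and restores it on its own side. That is one
possible way to fix a call site, not necessarily the right one for relay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a per-fiber diagnostics area. */
struct mock_diag {
	bool is_set;
	char msg[128];
};

/* Store an error message in the diag, overwriting any previous one. */
static void
mock_diag_set(struct mock_diag *d, const char *msg)
{
	assert(d != NULL && msg != NULL);
	snprintf(d->msg, sizeof(d->msg), "%s", msg);
	d->is_set = true;
}

/*
 * Mock wait (assumption: cancellation is modelled as a flag).
 * Returns 0 on success. On failure returns -1 and always sets the
 * diag, regardless of what was stored there before.
 */
static int
mock_cond_wait(struct mock_diag *d, bool cancelled)
{
	if (cancelled) {
		mock_diag_set(d, "fiber is cancelled");
		return -1;
	}
	return 0;
}

/*
 * Mock caller that wants to report an earlier error. It saves the
 * diag before waiting and restores it if the wait fails, so the
 * decision lives in the caller and not in the wait itself.
 */
static int
mock_relay_step(struct mock_diag *d, bool cancelled)
{
	struct mock_diag saved = *d;
	if (mock_cond_wait(d, cancelled) != 0) {
		if (saved.is_set)
			*d = saved;
		return -1;
	}
	return 0;
}
```

With this shape, a caller that had "Missing .xlog file" in its diag before
the wait still reports it after a cancelled wait, while a caller with an
empty diag gets the cancellation error.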

Talking of your statement:

	I believe the stack diag also is not supported there yet.

It is supported at the level of lib/core, i.e. everywhere, but it is not
present in 1.10. However, that is not the point. The point is that it is
not needed here.

