[Tarantool-patches] [PATCH] Add a cancellation guard to cpipe flush callback

Konstantin Osipov kostja.osipov at gmail.com
Thu Dec 5 10:27:06 MSK 2019


* Leonid Vasiliev <lvasiliev at tarantool.org> [19/12/05 10:24]:
> On 12/3/19 9:02 PM, Konstantin Osipov wrote:
> > * Leonid Vasiliev <lvasiliev at tarantool.org> [19/12/03 19:36]:
> > > https://github.com/tarantool/tarantool/issues/4127
> > > https://github.com/tarantool/tarantool/tree/lvasiliev/gh-4127-WAL-thread-stucks
> > 
> > Looks like a great catch.
> > 
> > > We need to set a thread cancellation guard, because
> > > another thread may cancel the current thread
> > > (write() is a cancellation point in ev_async_send)
> > > and the activation of the ev_async watcher
> > > through ev_async_send will fail.
> > 
> > I still don't get from the explanation why it is relevant that
> > ev_async_send mustn't fail?
> 
> ev_async_send mustn't fail because a failure leaves the tarantool
> instance in an unwanted state. For example: the first thread flushes its
> cpipe input to an endpoint output and is cancelled while calling
> ev_async_send (write() is a cancellation point). Now
> stailq_empty(&endpoint->output) is false. After that, another thread
> flushes its cpipe input to the same endpoint, but it does not call
> ev_async_send, because output_was_empty is false. As a result, the
> endpoint->consumer thread is never woken up (it stays blocked in
> epoll_wait). The same situation is described in
> https://github.com/tarantool/tarantool/issues/4127:
> 

Looks like an explanation that deserves to be in a comment prior
to pthread_setcancelstate().

The issue is, however, that if a thread is cancelled and disappears
before deregistering from cbus, a lot of other bad things will
happen, because all of its registration entries will stay there.
The memory, luckily, will not go away, but I am not sure anyone
but the creating thread can deregister the memory structures
safely. Looks like this could be covered with a unit test, what do
you think?

> In the main thread:
> 	from void wal_free(void):
> 		cbus_stop_loop(&writer->wal_pipe);
> 		if (cord_join(&writer->cord)) {...}  // waits for the "wal" thread
> 
> In the "wal" thread:
> 	cbus_stop_loop_f is never called (for the reasons described above)
> 	the thread stays blocked in epoll_wait()
> 
> > 
> > 

-- 
Konstantin Osipov, Moscow, Russia

