From: Vladimir Davydov <vdavydov.dev@gmail.com> To: tarantool-patches@freelists.org Subject: Re: [PATCH] memtx: cancel checkpoint thread at exit Date: Thu, 18 Apr 2019 20:15:51 +0300 [thread overview] Message-ID: <20190418171551.5zfqurxnyrdkwxjj@esperanza> (raw) In-Reply-To: <8278bbd0a73bbd0e18a4e8633eb0ad71f2ffa3f7.1555601315.git.vdavydov.dev@gmail.com> On Thu, Apr 18, 2019 at 06:46:54PM +0300, Vladimir Davydov wrote: > +static void > +checkpoint_cancel(struct checkpoint *ckpt) > +{ > + /* > + * Cancel the checkpoint thread if it's running and wait > + * for it to terminate so as to eliminate the possibility > + * of use-after-free. > + */ > + if (ckpt->cord.id != 0 && > + tt_pthread_cancel(ckpt->cord.id) != ESRCH) > + tt_pthread_join(ckpt->cord.id, NULL); > + checkpoint_delete(ckpt); > +} Oops, turned out this check isn't quite correct as we don't reset cord.id after joining the thread. As a result, we might call pthread_cancel on an already joined thread, which leads to a crash (caught by Travis CI): | #0 0x557f6332e9a5 in print_backtrace+1b | #1 0x557f6310614f in _ZL12sig_fatal_cbiP9siginfo_tPv+451 | #2 0x7f98504e10e0 in __restore_rt+0 | #3 0x7f98504de499 in pthread_cancel+9 | #4 0x557f6314aa48 in checkpoint_cancel+52 | #5 0x557f6314866c in memtx_engine_shutdown+55 | #6 0x557f63144fb1 in engine_shutdown+6f | #7 0x557f6326bb52 in box_free+98 | #8 0x557f631083cf in _Z14tarantool_freev+a9 | #9 0x557f63109ae2 in main+10cf | #10 0x7f984f24e2e1 in __libc_start_main+f1 | #11 0x557f63104f6a in _start+2a | #12 (nil) in +2a Actually, we don't need to check cord.id at all - we can use already existing waiting_for_snap_thread flag instead. Fixed this as follows (this is an incremental diff; for the full diff see below): diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c index 2b9bef24..4d99910c 100644 --- a/src/box/memtx_engine.c +++ b/src/box/memtx_engine.c @@ -595,11 +595,6 @@ checkpoint_new(const char *snap_dirname, uint64_t snap_io_rate_limit) xdir_create(&ckpt->dir, snap_dirname, SNAP, &INSTANCE_UUID, &opts); vclock_create(&ckpt->vclock); ckpt->touch = false; - /* - * Reset the checkpoint thread id so that checkpoint_cancel() - * doesn't attempt to cancel the thread if it isn't running. - */ - ckpt->cord.id = 0; return ckpt; } @@ -623,9 +618,10 @@ checkpoint_cancel(struct checkpoint *ckpt) * for it to terminate so as to eliminate the possibility * of use-after-free. */ - if (ckpt->cord.id != 0 && - tt_pthread_cancel(ckpt->cord.id) != ESRCH) + if (ckpt->waiting_for_snap_thread) { + tt_pthread_cancel(ckpt->cord.id); tt_pthread_join(ckpt->cord.id, NULL); + } checkpoint_delete(ckpt); } From 66ff448a89864c3de7d34563a515295d62f7f944 Mon Sep 17 00:00:00 2001 From: Vladimir Davydov <vdavydov.dev@gmail.com> Date: Thu, 18 Apr 2019 18:03:26 +0300 Subject: [PATCH] memtx: cancel checkpoint thread at exit If a tarantool instance exits while checkpointing is in progress, the memtx checkpoint thread, which writes the snap file, can access already freed data resulting in a crash. Let's fix this the same way we did for relay and vinyl threads - simply cancel the thread forcefully and wait for it to terminate. Closes #4170 diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c index 48f700a0..4d99910c 100644 --- a/src/box/memtx_engine.c +++ b/src/box/memtx_engine.c @@ -52,6 +52,9 @@ /* sync snapshot every 16MB */ #define SNAP_SYNC_INTERVAL (1 << 24) +static void +checkpoint_cancel(struct checkpoint *ckpt); + /* * Memtx yield-in-transaction trigger: roll back the effects * of the transaction and mark the transaction as aborted. @@ -179,6 +182,8 @@ static void memtx_engine_shutdown(struct engine *engine) { struct memtx_engine *memtx = (struct memtx_engine *)engine; + if (memtx->checkpoint != NULL) + checkpoint_cancel(memtx->checkpoint); mempool_destroy(&memtx->iterator_pool); if (mempool_is_initialized(&memtx->rtree_iterator_pool)) mempool_destroy(&memtx->rtree_iterator_pool); @@ -605,6 +610,20 @@ checkpoint_delete(struct checkpoint *ckpt) free(ckpt); } +static void +checkpoint_cancel(struct checkpoint *ckpt) +{ + /* + * Cancel the checkpoint thread if it's running and wait + * for it to terminate so as to eliminate the possibility + * of use-after-free. + */ + if (ckpt->waiting_for_snap_thread) { + tt_pthread_cancel(ckpt->cord.id); + tt_pthread_join(ckpt->cord.id, NULL); + } + checkpoint_delete(ckpt); +} static int checkpoint_add_space(struct space *sp, void *data)
next prev parent reply other threads:[~2019-04-18 17:15 UTC|newest] Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-04-18 15:46 Vladimir Davydov 2019-04-18 17:15 ` Vladimir Davydov [this message] 2019-04-18 17:24 ` [tarantool-patches] " Konstantin Osipov 2019-04-18 18:23 ` Vladimir Davydov 2019-04-18 17:24 ` Konstantin Osipov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20190418171551.5zfqurxnyrdkwxjj@esperanza \ --to=vdavydov.dev@gmail.com \ --cc=tarantool-patches@freelists.org \ --subject='Re: [PATCH] memtx: cancel checkpoint thread at exit' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox