From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: tarantool-patches@freelists.org
Subject: Re: [PATCH] memtx: cancel checkpoint thread at exit
Date: Thu, 18 Apr 2019 20:15:51 +0300 [thread overview]
Message-ID: <20190418171551.5zfqurxnyrdkwxjj@esperanza> (raw)
In-Reply-To: <8278bbd0a73bbd0e18a4e8633eb0ad71f2ffa3f7.1555601315.git.vdavydov.dev@gmail.com>
On Thu, Apr 18, 2019 at 06:46:54PM +0300, Vladimir Davydov wrote:
> +static void
> +checkpoint_cancel(struct checkpoint *ckpt)
> +{
> + /*
> + * Cancel the checkpoint thread if it's running and wait
> + * for it to terminate so as to eliminate the possibility
> + * of use-after-free.
> + */
> + if (ckpt->cord.id != 0 &&
> + tt_pthread_cancel(ckpt->cord.id) != ESRCH)
> + tt_pthread_join(ckpt->cord.id, NULL);
> + checkpoint_delete(ckpt);
> +}
Oops, turned out this check isn't quite correct as we don't reset
cord.id after joining the thread. As a result, we might call
pthread_cancel on an already joined thread, which leads to a crash
(caught by Travis CI):
| #0 0x557f6332e9a5 in print_backtrace+1b
| #1 0x557f6310614f in _ZL12sig_fatal_cbiP9siginfo_tPv+451
| #2 0x7f98504e10e0 in __restore_rt+0
| #3 0x7f98504de499 in pthread_cancel+9
| #4 0x557f6314aa48 in checkpoint_cancel+52
| #5 0x557f6314866c in memtx_engine_shutdown+55
| #6 0x557f63144fb1 in engine_shutdown+6f
| #7 0x557f6326bb52 in box_free+98
| #8 0x557f631083cf in _Z14tarantool_freev+a9
| #9 0x557f63109ae2 in main+10cf
| #10 0x7f984f24e2e1 in __libc_start_main+f1
| #11 0x557f63104f6a in _start+2a
| #12 (nil) in +2a
Actually, we don't need to check cord.id at all - we can use already
existing waiting_for_snap_thread flag instead. Fixed this as follows
(this is an incremental diff; for the full diff see below):
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 2b9bef24..4d99910c 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -595,11 +595,6 @@ checkpoint_new(const char *snap_dirname, uint64_t snap_io_rate_limit)
xdir_create(&ckpt->dir, snap_dirname, SNAP, &INSTANCE_UUID, &opts);
vclock_create(&ckpt->vclock);
ckpt->touch = false;
- /*
- * Reset the checkpoint thread id so that checkpoint_cancel()
- * doesn't attempt to cancel the thread if it isn't running.
- */
- ckpt->cord.id = 0;
return ckpt;
}
@@ -623,9 +618,10 @@ checkpoint_cancel(struct checkpoint *ckpt)
* for it to terminate so as to eliminate the possibility
* of use-after-free.
*/
- if (ckpt->cord.id != 0 &&
- tt_pthread_cancel(ckpt->cord.id) != ESRCH)
+ if (ckpt->waiting_for_snap_thread) {
+ tt_pthread_cancel(ckpt->cord.id);
tt_pthread_join(ckpt->cord.id, NULL);
+ }
checkpoint_delete(ckpt);
}
From 66ff448a89864c3de7d34563a515295d62f7f944 Mon Sep 17 00:00:00 2001
From: Vladimir Davydov <vdavydov.dev@gmail.com>
Date: Thu, 18 Apr 2019 18:03:26 +0300
Subject: [PATCH] memtx: cancel checkpoint thread at exit
If a tarantool instance exits while checkpointing is in progress, the
memtx checkpoint thread, which writes the snap file, can access already
freed data resulting in a crash. Let's fix this the same way we did for
relay and vinyl threads - simply cancel the thread forcefully and wait
for it to terminate.
Closes #4170
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 48f700a0..4d99910c 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -52,6 +52,9 @@
/* sync snapshot every 16MB */
#define SNAP_SYNC_INTERVAL (1 << 24)
+static void
+checkpoint_cancel(struct checkpoint *ckpt);
+
/*
* Memtx yield-in-transaction trigger: roll back the effects
* of the transaction and mark the transaction as aborted.
@@ -179,6 +182,8 @@ static void
memtx_engine_shutdown(struct engine *engine)
{
struct memtx_engine *memtx = (struct memtx_engine *)engine;
+ if (memtx->checkpoint != NULL)
+ checkpoint_cancel(memtx->checkpoint);
mempool_destroy(&memtx->iterator_pool);
if (mempool_is_initialized(&memtx->rtree_iterator_pool))
mempool_destroy(&memtx->rtree_iterator_pool);
@@ -605,6 +610,20 @@ checkpoint_delete(struct checkpoint *ckpt)
free(ckpt);
}
+static void
+checkpoint_cancel(struct checkpoint *ckpt)
+{
+ /*
+ * Cancel the checkpoint thread if it's running and wait
+ * for it to terminate so as to eliminate the possibility
+ * of use-after-free.
+ */
+ if (ckpt->waiting_for_snap_thread) {
+ tt_pthread_cancel(ckpt->cord.id);
+ tt_pthread_join(ckpt->cord.id, NULL);
+ }
+ checkpoint_delete(ckpt);
+}
static int
checkpoint_add_space(struct space *sp, void *data)
next prev parent reply other threads:[~2019-04-18 17:15 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-18 15:46 Vladimir Davydov
2019-04-18 17:15 ` Vladimir Davydov [this message]
2019-04-18 17:24 ` [tarantool-patches] " Konstantin Osipov
2019-04-18 18:23 ` Vladimir Davydov
2019-04-18 17:24 ` Konstantin Osipov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190418171551.5zfqurxnyrdkwxjj@esperanza \
--to=vdavydov.dev@gmail.com \
--cc=tarantool-patches@freelists.org \
--subject='Re: [PATCH] memtx: cancel checkpoint thread at exit' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox