* [PATCH 0/4] Fix DML vs DDL race @ 2019-03-23 21:07 Vladimir Davydov 2019-03-23 21:07 ` [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction Vladimir Davydov ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Vladimir Davydov @ 2019-03-23 21:07 UTC (permalink / raw) To: kostja; +Cc: tarantool-patches In Vinyl a DML request may yield on disk read while still having a pointer to the target space saved on stack. If the space is altered during this time window, the DML request will most certainly crash while trying to dereference the stale pointer when it resumes. The solution is simple: abort all affected transactions whenever we delete a space from the cache. https://github.com/tarantool/tarantool/issues/3420 https://github.com/tarantool/tarantool/commits/dv/gh-3420-vy-abort-tx-in-ddl NOTE the branch is based on top of https://github.com/tarantool/tarantool/commits/dv/gh-4070-vy-fail-aborted-transactions-early which was submitted for review earlier: https://www.freelists.org/post/tarantool-patches/PATCH-12-vinyl-drop-useless-vy-read-viewis-aborted-flag https://www.freelists.org/post/tarantool-patches/PATCH-22-vinyl-fail-aborted-transactions-early Vladimir Davydov (4): vinyl: introduce hash of spaces affected by transaction vinyl: don't abort transactions that modify only local spaces for ro vinyl: abort affected transactions when space is removed from cache Revert "test: skip ddl test for vinyl on travis" rpm/tarantool.spec | 2 - src/box/blackhole.c | 1 + src/box/memtx_space.c | 1 + src/box/schema.cc | 3 ++ src/box/space.c | 6 +++ src/box/space.h | 15 ++++++ src/box/sysview.c | 1 + src/box/vinyl.c | 33 ++++++++++--- src/box/vy_point_lookup.c | 14 ++++++ src/box/vy_read_iterator.c | 13 +++++ src/box/vy_tx.c | 45 ++++++++++++++--- src/box/vy_tx.h | 44 +++++++++++++++-- test/vinyl/ddl.skipcond | 6 --- test/vinyl/errinj_ddl.result | 110 +++++++++++++++++++++++++++++++++++++++++ test/vinyl/errinj_ddl.test.lua | 51 +++++++++++++++++++ test/vinyl/misc.result | 38 ++++++++++++++ test/vinyl/misc.test.lua | 16 ++++++ 17 files changed, 376 insertions(+), 23 deletions(-) delete mode 100644 test/vinyl/ddl.skipcond -- 2.11.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction 2019-03-23 21:07 [PATCH 0/4] Fix DML vs DDL race Vladimir Davydov @ 2019-03-23 21:07 ` Vladimir Davydov 2019-03-28 13:25 ` [tarantool-patches] " Konstantin Osipov 2019-03-23 21:07 ` [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro Vladimir Davydov ` (2 subsequent siblings) 3 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-23 21:07 UTC (permalink / raw) To: kostja; +Cc: tarantool-patches We need to abort all transactions writing to an altered space when a new index is build. Currently, we use the write set to look up such transactions, but it isn't quite correct, because a transaction could yield on disk read before inserting a statement into the write set. To address this problem, this patch introduces a hash of spaces affected by each transaction and makes tx_manager_abort_writers_for_ddl() use it instead of the write set to abort transactions for DDL. Needed for #3420 --- src/box/vinyl.c | 12 ++++++------ src/box/vy_tx.c | 26 ++++++++++++++++++++++---- src/box/vy_tx.h | 44 +++++++++++++++++++++++++++++++++++++++++--- 3 files changed, 69 insertions(+), 13 deletions(-) diff --git a/src/box/vinyl.c b/src/box/vinyl.c index 3ef43e18..a6a5f187 100644 --- a/src/box/vinyl.c +++ b/src/box/vinyl.c @@ -1034,14 +1034,14 @@ vinyl_space_prepare_alter(struct space *old_space, struct space *new_space) /** * This function is called after installing on_replace trigger * used for propagating changes done during DDL. It aborts all - * rw transactions affecting the given LSM tree that began + * rw transactions affecting the given space that began * before the trigger was installed so that DDL doesn't miss * their working set. */ static void -vy_abort_writers_for_ddl(struct vy_env *env, struct vy_lsm *lsm) +vy_abort_writers_for_ddl(struct vy_env *env, struct space *space) { - tx_manager_abort_writers_for_ddl(env->xm, lsm); + tx_manager_abort_writers_for_ddl(env->xm, space); /* * Wait for prepared transactions to complete * (we can't abort them as they reached WAL). @@ -1115,7 +1115,7 @@ vinyl_space_check_format(struct space *space, struct tuple_format *format) trigger_create(&on_replace, vy_check_format_on_replace, &ctx, NULL); trigger_add(&space->on_replace, &on_replace); - vy_abort_writers_for_ddl(env, pk); + vy_abort_writers_for_ddl(env, space); struct vy_read_iterator itr; vy_read_iterator_open(&itr, pk, NULL, ITER_ALL, pk->env->empty_key, @@ -2434,7 +2434,7 @@ vinyl_engine_begin_statement(struct engine *engine, struct txn *txn) struct vy_tx *tx = txn->engine_tx; struct txn_stmt *stmt = txn_current_stmt(txn); assert(tx != NULL); - return vy_tx_begin_statement(tx, &stmt->engine_savepoint); + return vy_tx_begin_statement(tx, stmt->space, &stmt->engine_savepoint); } static void @@ -4229,7 +4229,7 @@ vinyl_space_build_index(struct space *src_space, struct index *new_index, trigger_create(&on_replace, vy_build_on_replace, &ctx, NULL); trigger_add(&src_space->on_replace, &on_replace); - vy_abort_writers_for_ddl(env, pk); + vy_abort_writers_for_ddl(env, src_space); struct vy_read_iterator itr; vy_read_iterator_open(&itr, pk, NULL, ITER_ALL, pk->env->empty_key, diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c index ae660dd8..56d594e5 100644 --- a/src/box/vy_tx.c +++ b/src/box/vy_tx.c @@ -311,6 +311,7 @@ vy_tx_create(struct tx_manager *xm, struct vy_tx *tx) tx->is_applier_session = false; tx->read_view = (struct vy_read_view *)xm->p_global_read_view; vy_tx_read_set_new(&tx->read_set); + vy_tx_space_hash_new(&tx->space_hash); tx->psn = 0; rlist_create(&tx->on_destroy); rlist_create(&tx->in_writers); @@ -849,13 +850,31 @@ vy_tx_rollback(struct vy_tx *tx) } int -vy_tx_begin_statement(struct vy_tx *tx, void **savepoint) +vy_tx_begin_statement(struct vy_tx *tx, struct space *space, void **savepoint) { if (tx->state == VINYL_TX_ABORT) { diag_set(ClientError, ER_TRANSACTION_CONFLICT); return -1; } assert(tx->state == VINYL_TX_READY); + + struct vy_tx_space_hash_entry *space_entry; + space_entry = vy_tx_space_hash_search(&tx->space_hash, space); + if (space_entry == NULL) { + /* + * Allocate a slot for the modified space in the hash. + * It will be freed automatically along with the region + * when the transaction is committed or rolled back. + */ + space_entry = region_alloc(&fiber()->gc, sizeof(*space_entry)); + if (space_entry == NULL) { + diag_set(OutOfMemory, sizeof(*space_entry), + "region", "space hash entry"); + return -1; + } + space_entry->space = space; + vy_tx_space_hash_insert(&tx->space_hash, space_entry); + } if (stailq_empty(&tx->log)) rlist_add_entry(&tx->xm->writers, tx, in_writers); *savepoint = stailq_last(&tx->log); @@ -1086,13 +1105,12 @@ vy_tx_set_with_colmask(struct vy_tx *tx, struct vy_lsm *lsm, } void -tx_manager_abort_writers_for_ddl(struct tx_manager *xm, struct vy_lsm *lsm) +tx_manager_abort_writers_for_ddl(struct tx_manager *xm, struct space *space) { struct vy_tx *tx; rlist_foreach_entry(tx, &xm->writers, in_writers) { if (tx->state == VINYL_TX_READY && - write_set_search_key(&tx->write_set, lsm, - lsm->env->empty_key) != NULL) + vy_tx_space_hash_search(&tx->space_hash, space) != NULL) vy_tx_abort(tx); } } diff --git a/src/box/vy_tx.h b/src/box/vy_tx.h index aaa31bee..fada41f1 100644 --- a/src/box/vy_tx.h +++ b/src/box/vy_tx.h @@ -51,6 +51,7 @@ extern "C" { #endif /* defined(__cplusplus) */ +struct space; struct tuple; struct tx_manager; struct vy_mem; @@ -133,6 +134,38 @@ write_set_search_key(write_set_t *tree, struct vy_lsm *lsm, return write_set_search(tree, &key); } +/** + * Represents a space affected by a transaction. + * See vy_tx::space_hash. + */ +struct vy_tx_space_hash_entry { + struct space *space; + rb_node(struct vy_tx_space_hash_entry) in_hash; +}; + +static inline int +vy_tx_space_hash_cmp(struct vy_tx_space_hash_entry *a, + struct vy_tx_space_hash_entry *b) +{ + if (a->space != b->space) + return a->space < b->space ? -1 : 1; + return 0; +} + +static inline int +vy_tx_space_hash_key_cmp(struct space *space, + struct vy_tx_space_hash_entry *entry) +{ + if (space != entry->space) + return space < entry->space ? -1 : 1; + return 0; +} + +typedef rb_tree(struct vy_tx_space_hash_entry) vy_tx_space_hash_t; +rb_gen_ext_key(MAYBE_UNUSED static inline, vy_tx_space_hash_, + vy_tx_space_hash_t, struct vy_tx_space_hash_entry, in_hash, + vy_tx_space_hash_cmp, struct space *, vy_tx_space_hash_key_cmp); + /** Transaction object. */ struct vy_tx { /** Link in tx_manager::writers. */ @@ -181,6 +214,11 @@ struct vy_tx { */ vy_tx_read_set_t read_set; /** + * Spaces affected by this transaction. + * Used for aborting transactions on DDL. + */ + vy_tx_space_hash_t space_hash; + /** * Prepare sequence number or -1 if the transaction * is not prepared. */ @@ -277,12 +315,12 @@ size_t tx_manager_mem_used(struct tx_manager *xm); /** - * Abort all rw transactions that affect the given LSM tree + * Abort all rw transactions that affect the given space * and haven't reached WAL yet. Called before executing a DDL * operation. */ void -tx_manager_abort_writers_for_ddl(struct tx_manager *xm, struct vy_lsm *lsm); +tx_manager_abort_writers_for_ddl(struct tx_manager *xm, struct space *space); /** * Abort all local rw transactions that haven't reached WAL yet. @@ -327,7 +365,7 @@ vy_tx_rollback(struct vy_tx *tx); * to a save point with vy_tx_rollback_statement(). */ int -vy_tx_begin_statement(struct vy_tx *tx, void **savepoint); +vy_tx_begin_statement(struct vy_tx *tx, struct space *space, void **savepoint); /** * Rollback a transaction statement. -- 2.11.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [tarantool-patches] Re: [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction 2019-03-23 21:07 ` [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction Vladimir Davydov @ 2019-03-28 13:25 ` Konstantin Osipov 2019-03-28 13:58 ` Vladimir Davydov 0 siblings, 1 reply; 21+ messages in thread From: Konstantin Osipov @ 2019-03-28 13:25 UTC (permalink / raw) To: tarantool-patches * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/24 00:10]: > We need to abort all transactions writing to an altered space when > a new index is build. Currently, we use the write set to look up such > transactions, but it isn't quite correct, because a transaction could > yield on disk read before inserting a statement into the write set. > To address this problem, this patch introduces a hash of spaces affected > by each transaction and makes tx_manager_abort_writers_for_ddl() use it > instead of the write set to abort transactions for DDL. I think this is an overkill for the problem you're dealing with. The root of the problem is that a vinyl statement is only added to tx write set after a read/write has been performed, not before. The moment we started using tx write set for aborts, we should have fixed this, since one locks before one writes. The approach you chose also adds struct space to vy_tx. I know that there already is a dependency on struct space in vinyl, but why make it stronger? Now, to track before use, you would need to add a surrogate txv to tx write set at the beginning of vy_update/vy_replace/etc, not at the end of it. It doesn't seem difficult to do, but let's assume for a moment it is, and it is indeed perhaps more than we want to do in 1.10 at this point. Provided we can't track before use, do we really need a hash of all spaces just to cover the small gap between a read and a track? Why not have a single vy_tx member struct space *last_stmt_space, for the purposes of abort? It could be set at begin_statement() and cleared in rollback_statement() just the same. In future XM object will be global and we'll keep a list of struct txn in it, not struct vy_tx. Once we begin doing it, aborting writers for ddl and ro will also become straightforward. This is a yet another direction at which one could develop this patch - but I guess it's only acceptable for 2.x, not for 1.10. I know there is a subsequent patch that adds a nice feature - local transactions are not aborted when switching to RO. Well, it's nice, but not important for this problem, so I think this patch could be dropped. -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] Re: [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction 2019-03-28 13:25 ` [tarantool-patches] " Konstantin Osipov @ 2019-03-28 13:58 ` Vladimir Davydov 0 siblings, 0 replies; 21+ messages in thread From: Vladimir Davydov @ 2019-03-28 13:58 UTC (permalink / raw) To: Konstantin Osipov; +Cc: tarantool-patches On Thu, Mar 28, 2019 at 04:25:52PM +0300, Konstantin Osipov wrote: > * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/24 00:10]: > > We need to abort all transactions writing to an altered space when > > a new index is build. Currently, we use the write set to look up such > > transactions, but it isn't quite correct, because a transaction could > > yield on disk read before inserting a statement into the write set. > > To address this problem, this patch introduces a hash of spaces affected > > by each transaction and makes tx_manager_abort_writers_for_ddl() use it > > instead of the write set to abort transactions for DDL. > > I think this is an overkill for the problem you're dealing with. > > The root of the problem is that a vinyl statement is only added to tx > write set after a read/write has been performed, not before. The > moment we started using tx write set for aborts, we should have > fixed this, since one locks before one writes. > > The approach you chose also adds struct space to vy_tx. I know > that there already is a dependency on struct space in vinyl, but > why make it stronger? > > Now, to track before use, you would need to add a surrogate txv to > tx write set at the beginning of vy_update/vy_replace/etc, not at > the end of it. It doesn't seem difficult to do, but let's assume We'd have to handle this in txw iterator. I'd rather avoid it. At least, at this point. > for a moment it is, and it is indeed perhaps more than we want to > do in 1.10 at this point. > > > Provided we can't track before use, do we really need a hash of all > spaces just to cover the small gap between a read and a track? Why > not have a single vy_tx member struct space *last_stmt_space, > for the purposes of abort? It could be set at begin_statement() and > cleared in rollback_statement() just the same. This sounds perfect to me. I'll rework the patch accordingly. > > In future XM object will be global and we'll keep a list of > struct txn in it, not struct vy_tx. Once we begin doing it, > aborting writers for ddl and ro will also become straightforward. > This is a yet another direction at which one could develop this > patch - but I guess it's only acceptable for 2.x, not for 1.10. Yeah. I'd even refrain from doing this in 2.x until we have a roadmap for the common transaction manager. > > I know there is a subsequent patch that adds a nice feature - > local transactions are not aborted when switching to RO. Well, > it's nice, but not important for this problem, so I think this > patch could be dropped. Okay. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro 2019-03-23 21:07 [PATCH 0/4] Fix DML vs DDL race Vladimir Davydov 2019-03-23 21:07 ` [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction Vladimir Davydov @ 2019-03-23 21:07 ` Vladimir Davydov 2019-03-25 5:27 ` [tarantool-patches] " Георгий Кириченко 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov 2019-03-23 21:07 ` [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" Vladimir Davydov 3 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-23 21:07 UTC (permalink / raw) To: kostja; +Cc: tarantool-patches Local spaces can be modified even in read-only mode hence we don't need to abort transactions that modify only local spaces when switching to read-only mode. Follow-up #4016 --- src/box/vy_tx.c | 19 +++++++++++++++++-- test/vinyl/misc.result | 38 ++++++++++++++++++++++++++++++++++++++ test/vinyl/misc.test.lua | 16 ++++++++++++++++ 3 files changed, 71 insertions(+), 2 deletions(-) diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c index 56d594e5..67a9b384 100644 --- a/src/box/vy_tx.c +++ b/src/box/vy_tx.c @@ -52,6 +52,7 @@ #include "trivia/util.h" #include "tuple.h" #include "column_mask.h" +#include "vclock.h" #include "vy_cache.h" #include "vy_lsm.h" #include "vy_mem.h" @@ -1121,8 +1122,22 @@ tx_manager_abort_writers_for_ro(struct tx_manager *xm) struct vy_tx *tx; rlist_foreach_entry(tx, &xm->writers, in_writers) { /* Applier ignores ro flag. */ - if (tx->state == VINYL_TX_READY && !tx->is_applier_session) - vy_tx_abort(tx); + if (tx->state == VINYL_TX_READY && !tx->is_applier_session) { + /* + * Don't abort transactions that modify only + * local spaces as they can be modified even + * in ro mode. + */ + struct vy_tx_space_hash_entry *entry; + struct vy_tx_space_hash_iterator it; + vy_tx_space_hash_ifirst(&tx->space_hash, &it); + while ((entry = vy_tx_space_hash_inext(&it)) != NULL) { + if (space_group_id(entry->space) != GROUP_LOCAL) { + vy_tx_abort(tx); + break; + } + } + } } } diff --git a/test/vinyl/misc.result b/test/vinyl/misc.result index 4f613cb0..751f8795 100644 --- a/test/vinyl/misc.result +++ b/test/vinyl/misc.result @@ -298,6 +298,16 @@ s:replace({1, 1}) --- - [1, 1] ... +s_local = box.schema.space.create('test_local', {engine = 'vinyl', is_local = true}) +--- +... +_ = s_local:create_index('pk') +--- +... +s_local:replace({1, 1}) +--- +- [1, 1] +... test_run:cmd("setopt delimiter ';'") --- - true @@ -334,6 +344,19 @@ _ = fiber.create(function() end); --- ... +-- Start rw transaction modifying only local spaces. +ch3 = fiber.channel(1); +--- +... +_ = fiber.create(function() + box.begin() + s_local:replace{1, 2} + ch3:get() + local status, err = pcall(box.commit) + ch3:put(status or err) +end); +--- +... test_run:cmd("setopt delimiter ''"); --- - true @@ -351,6 +374,10 @@ ch2:put(true) --- - true ... +ch3:put(true) +--- +- true +... ch1:get() --- - Can't modify data because this instance is in read-only mode. @@ -371,6 +398,10 @@ ch2:get() --- - true ... +ch3:get() +--- +- true +... -- Cleanup. box.cfg{read_only = false} --- @@ -382,3 +413,10 @@ s:select() s:drop() --- ... +s_local:select() +--- +- - [1, 2] +... +s_local:drop() +--- +... diff --git a/test/vinyl/misc.test.lua b/test/vinyl/misc.test.lua index bffaaf16..0616722f 100644 --- a/test/vinyl/misc.test.lua +++ b/test/vinyl/misc.test.lua @@ -124,6 +124,9 @@ test_run:cmd('cleanup server test') s = box.schema.space.create('test', {engine = 'vinyl'}) _ = s:create_index('pk') s:replace({1, 1}) +s_local = box.schema.space.create('test_local', {engine = 'vinyl', is_local = true}) +_ = s_local:create_index('pk') +s_local:replace({1, 1}) test_run:cmd("setopt delimiter ';'") -- Start rw transaction. ch1 = fiber.channel(1); @@ -149,18 +152,31 @@ _ = fiber.create(function() local status, err = pcall(box.commit) ch2:put(status or err) end); +-- Start rw transaction modifying only local spaces. +ch3 = fiber.channel(1); +_ = fiber.create(function() + box.begin() + s_local:replace{1, 2} + ch3:get() + local status, err = pcall(box.commit) + ch3:put(status or err) +end); test_run:cmd("setopt delimiter ''"); -- Switch to ro mode. box.cfg{read_only = true} -- Resume the transactions. ch1:put(true) ch2:put(true) +ch3:put(true) ch1:get() ch1:get() ch1:get() ch2:get() ch2:get() +ch3:get() -- Cleanup. box.cfg{read_only = false} s:select() s:drop() +s_local:select() +s_local:drop() -- 2.11.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro 2019-03-23 21:07 ` [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro Vladimir Davydov @ 2019-03-25 5:27 ` Георгий Кириченко 2019-03-25 8:13 ` Vladimir Davydov 0 siblings, 1 reply; 21+ messages in thread From: Георгий Кириченко @ 2019-03-25 5:27 UTC (permalink / raw) To: tarantool-patches; +Cc: Vladimir Davydov, kostja [-- Attachment #1: Type: text/plain, Size: 4403 bytes --] I think it is out of scope of the task. On Sunday, March 24, 2019 12:07:22 AM MSK Vladimir Davydov wrote: > Local spaces can be modified even in read-only mode hence we don't need > to abort transactions that modify only local spaces when switching to > read-only mode. > > Follow-up #4016 > --- > src/box/vy_tx.c | 19 +++++++++++++++++-- > test/vinyl/misc.result | 38 ++++++++++++++++++++++++++++++++++++++ > test/vinyl/misc.test.lua | 16 ++++++++++++++++ > 3 files changed, 71 insertions(+), 2 deletions(-) > > diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c > index 56d594e5..67a9b384 100644 > --- a/src/box/vy_tx.c > +++ b/src/box/vy_tx.c > @@ -52,6 +52,7 @@ > #include "trivia/util.h" > #include "tuple.h" > #include "column_mask.h" > +#include "vclock.h" > #include "vy_cache.h" > #include "vy_lsm.h" > #include "vy_mem.h" > @@ -1121,8 +1122,22 @@ tx_manager_abort_writers_for_ro(struct tx_manager > *xm) struct vy_tx *tx; > rlist_foreach_entry(tx, &xm->writers, in_writers) { > /* Applier ignores ro flag. */ > - if (tx->state == VINYL_TX_READY && !tx->is_applier_session) > - vy_tx_abort(tx); > + if (tx->state == VINYL_TX_READY && !tx->is_applier_session) { > + /* > + * Don't abort transactions that modify only > + * local spaces as they can be modified even > + * in ro mode. > + */ > + struct vy_tx_space_hash_entry *entry; > + struct vy_tx_space_hash_iterator it; > + vy_tx_space_hash_ifirst(&tx->space_hash, &it); > + while ((entry = vy_tx_space_hash_inext(&it)) != NULL) { > + if (space_group_id(entry->space) != GROUP_LOCAL) { > + vy_tx_abort(tx); > + break; > + } > + } > + } > } > } > > diff --git a/test/vinyl/misc.result b/test/vinyl/misc.result > index 4f613cb0..751f8795 100644 > --- a/test/vinyl/misc.result > +++ b/test/vinyl/misc.result > @@ -298,6 +298,16 @@ s:replace({1, 1}) > --- > - [1, 1] > ... > +s_local = box.schema.space.create('test_local', {engine = 'vinyl', is_local > = true}) +--- > +... > +_ = s_local:create_index('pk') > +--- > +... > +s_local:replace({1, 1}) > +--- > +- [1, 1] > +... > test_run:cmd("setopt delimiter ';'") > --- > - true > @@ -334,6 +344,19 @@ _ = fiber.create(function() > end); > --- > ... > +-- Start rw transaction modifying only local spaces. > +ch3 = fiber.channel(1); > +--- > +... > +_ = fiber.create(function() > + box.begin() > + s_local:replace{1, 2} > + ch3:get() > + local status, err = pcall(box.commit) > + ch3:put(status or err) > +end); > +--- > +... > test_run:cmd("setopt delimiter ''"); > --- > - true > @@ -351,6 +374,10 @@ ch2:put(true) > --- > - true > ... > +ch3:put(true) > +--- > +- true > +... > ch1:get() > --- > - Can't modify data because this instance is in read-only mode. > @@ -371,6 +398,10 @@ ch2:get() > --- > - true > ... > +ch3:get() > +--- > +- true > +... > -- Cleanup. > box.cfg{read_only = false} > --- > @@ -382,3 +413,10 @@ s:select() > s:drop() > --- > ... > +s_local:select() > +--- > +- - [1, 2] > +... > +s_local:drop() > +--- > +... > diff --git a/test/vinyl/misc.test.lua b/test/vinyl/misc.test.lua > index bffaaf16..0616722f 100644 > --- a/test/vinyl/misc.test.lua > +++ b/test/vinyl/misc.test.lua > @@ -124,6 +124,9 @@ test_run:cmd('cleanup server test') > s = box.schema.space.create('test', {engine = 'vinyl'}) > _ = s:create_index('pk') > s:replace({1, 1}) > +s_local = box.schema.space.create('test_local', {engine = 'vinyl', is_local > = true}) +_ = s_local:create_index('pk') > +s_local:replace({1, 1}) > test_run:cmd("setopt delimiter ';'") > -- Start rw transaction. > ch1 = fiber.channel(1); > @@ -149,18 +152,31 @@ _ = fiber.create(function() > local status, err = pcall(box.commit) > ch2:put(status or err) > end); > +-- Start rw transaction modifying only local spaces. > +ch3 = fiber.channel(1); > +_ = fiber.create(function() > + box.begin() > + s_local:replace{1, 2} > + ch3:get() > + local status, err = pcall(box.commit) > + ch3:put(status or err) > +end); > test_run:cmd("setopt delimiter ''"); > -- Switch to ro mode. > box.cfg{read_only = true} > -- Resume the transactions. > ch1:put(true) > ch2:put(true) > +ch3:put(true) > ch1:get() > ch1:get() > ch1:get() > ch2:get() > ch2:get() > +ch3:get() > -- Cleanup. > box.cfg{read_only = false} > s:select() > s:drop() > +s_local:select() > +s_local:drop() [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro 2019-03-25 5:27 ` [tarantool-patches] " Георгий Кириченко @ 2019-03-25 8:13 ` Vladimir Davydov 2019-03-25 8:58 ` Георгий Кириченко 0 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-25 8:13 UTC (permalink / raw) To: Георгий Кириченко Cc: tarantool-patches, kostja On Mon, Mar 25, 2019 at 08:27:28AM +0300, Георгий Кириченко wrote: > I think it is out of scope of the task. So what? This was a known issue when the switch_to_ro was introduced. Better fix it now than never, especially counting the fact that it's a low hanging fruit after patch 1. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro 2019-03-25 8:13 ` Vladimir Davydov @ 2019-03-25 8:58 ` Георгий Кириченко 2019-03-25 9:30 ` Vladimir Davydov 0 siblings, 1 reply; 21+ messages in thread From: Георгий Кириченко @ 2019-03-25 8:58 UTC (permalink / raw) To: Vladimir Davydov; +Cc: tarantool-patches, kostja [-- Attachment #1: Type: text/plain, Size: 441 bytes --] On Monday, March 25, 2019 11:13:51 AM MSK Vladimir Davydov wrote: > On Mon, Mar 25, 2019 at 08:27:28AM +0300, Георгий Кириченко wrote: > > I think it is out of scope of the task. > > So what? This was a known issue when the switch_to_ro was introduced. > Better fix it now than never, especially counting the fact that it's > a low hanging fruit after patch 1. So let it be in a separate patch with assigned ticket. [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro 2019-03-25 8:58 ` Георгий Кириченко @ 2019-03-25 9:30 ` Vladimir Davydov 0 siblings, 0 replies; 21+ messages in thread From: Vladimir Davydov @ 2019-03-25 9:30 UTC (permalink / raw) To: Георгий Кириченко Cc: tarantool-patches, kostja On Mon, Mar 25, 2019 at 11:58:18AM +0300, Георгий Кириченко wrote: > On Monday, March 25, 2019 11:13:51 AM MSK Vladimir Davydov wrote: > > On Mon, Mar 25, 2019 at 08:27:28AM +0300, Георгий Кириченко wrote: > > > I think it is out of scope of the task. > > > > So what? This was a known issue when the switch_to_ro was introduced. > > Better fix it now than never, especially counting the fact that it's > > a low hanging fruit after patch 1. > So let it be in a separate patch with assigned ticket. Why? It depends on the previous patch. I ask to move a patch out of a series only if there's a great chance to commit it before the rest of the series. Filing a ticket when the patch is ready looks like excessive bureaucracy to me. In fact, it's just a follow-up for #4016. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-23 21:07 [PATCH 0/4] Fix DML vs DDL race Vladimir Davydov 2019-03-23 21:07 ` [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction Vladimir Davydov 2019-03-23 21:07 ` [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro Vladimir Davydov @ 2019-03-23 21:07 ` Vladimir Davydov 2019-03-25 5:26 ` [tarantool-patches] " Георгий Кириченко ` (2 more replies) 2019-03-23 21:07 ` [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" Vladimir Davydov 3 siblings, 3 replies; 21+ messages in thread From: Vladimir Davydov @ 2019-03-23 21:07 UTC (permalink / raw) To: kostja; +Cc: tarantool-patches A DDL operation creates a new struct space container, moving unaffected indexes from the old container, then destroying it. The problem is there may be a DML request for this space, which was passed the old container in the argument list and then yielded on disk read. When it resumes, it may try to dereference the old space container, which may have already been destroyed. This will most certainly result in a crash. To address this problem, we introduce a new space callback, invalidate, which is called for the old space on space_cache_replace(). In case of vinyl, this callback aborts all transactions involving the space. To prevent a DML request from dereferencing a destroyed space, we also make the iterator code check the current transaction state after each yield and return an error if it was aborted. This should make any DML request bail out early without dereferencing the space anymore. Closes #3420 --- src/box/blackhole.c | 1 + src/box/memtx_space.c | 1 + src/box/schema.cc | 3 ++ src/box/space.c | 6 +++ src/box/space.h | 15 ++++++ src/box/sysview.c | 1 + src/box/vinyl.c | 21 ++++++++ src/box/vy_point_lookup.c | 14 ++++++ src/box/vy_read_iterator.c | 13 +++++ test/vinyl/errinj_ddl.result | 110 +++++++++++++++++++++++++++++++++++++++++ test/vinyl/errinj_ddl.test.lua | 51 +++++++++++++++++++ 11 files changed, 236 insertions(+) diff --git a/src/box/blackhole.c b/src/box/blackhole.c index 95064e1f..b69e543a 100644 --- a/src/box/blackhole.c +++ b/src/box/blackhole.c @@ -129,6 +129,7 @@ static const struct space_vtab blackhole_space_vtab = { /* .build_index = */ generic_space_build_index, /* .swap_index = */ generic_space_swap_index, /* .prepare_alter = */ generic_space_prepare_alter, + /* .invalidate = */ generic_space_invalidate, }; static void diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c index 92ec0b30..d8529fe0 100644 --- a/src/box/memtx_space.c +++ b/src/box/memtx_space.c @@ -961,6 +961,7 @@ static const struct space_vtab memtx_space_vtab = { /* .build_index = */ memtx_space_build_index, /* .swap_index = */ generic_space_swap_index, /* .prepare_alter = */ memtx_space_prepare_alter, + /* .invalidate = */ generic_space_invalidate, }; struct space * diff --git a/src/box/schema.cc b/src/box/schema.cc index 74d70d8d..4123dc9e 100644 --- a/src/box/schema.cc +++ b/src/box/schema.cc @@ -251,6 +251,9 @@ space_cache_replace(struct space *old_space, struct space *new_space) mh_strnptr_del(spaces_by_name, k, NULL); } space_cache_version++; + + if (old_space != NULL) + space_invalidate(old_space); } /** A wrapper around space_new() for data dictionary spaces. */ diff --git a/src/box/space.c b/src/box/space.c index e140f3d5..1379fa34 100644 --- a/src/box/space.c +++ b/src/box/space.c @@ -637,4 +637,10 @@ generic_space_prepare_alter(struct space *old_space, struct space *new_space) return 0; } +void +generic_space_invalidate(struct space *space) +{ + (void)space; +} + /* }}} */ diff --git a/src/box/space.h b/src/box/space.h index 211706d2..1bcb1284 100644 --- a/src/box/space.h +++ b/src/box/space.h @@ -133,6 +133,14 @@ struct space_vtab { */ int (*prepare_alter)(struct space *old_space, struct space *new_space); + /** + * Called right after removing the space from the cache. + * The engine should abort all transactions involving + * the space, because the space will be destroyed soon. + * + * This function isn't allowed to yield or fail. + */ + void (*invalidate)(struct space *space); }; struct space { @@ -407,6 +415,12 @@ space_prepare_alter(struct space *old_space, struct space *new_space) return new_space->vtab->prepare_alter(old_space, new_space); } +static inline void +space_invalidate(struct space *space) +{ + return space->vtab->invalidate(space); +} + static inline bool space_is_memtx(struct space *space) { return space->engine->id == 0; } @@ -469,6 +483,7 @@ int generic_space_check_format(struct space *, struct tuple_format *); int generic_space_build_index(struct space *, struct index *, struct tuple_format *); int generic_space_prepare_alter(struct space *, struct space *); +void generic_space_invalidate(struct space *); #if defined(__cplusplus) } /* extern "C" */ diff --git a/src/box/sysview.c b/src/box/sysview.c index 63b669d0..f9edd2d0 100644 --- a/src/box/sysview.c +++ b/src/box/sysview.c @@ -495,6 +495,7 @@ static const struct space_vtab sysview_space_vtab = { /* .build_index = */ generic_space_build_index, /* .swap_index = */ generic_space_swap_index, /* .prepare_alter = */ generic_space_prepare_alter, + /* .invalidate = */ generic_space_invalidate, }; static void diff --git a/src/box/vinyl.c b/src/box/vinyl.c index a6a5f187..da891025 100644 --- a/src/box/vinyl.c +++ b/src/box/vinyl.c @@ -1031,6 +1031,26 @@ vinyl_space_prepare_alter(struct space *old_space, struct space *new_space) return 0; } +static void +vinyl_space_invalidate(struct space *space) +{ + struct vy_env *env = vy_env(space->engine); + /* + * Abort all transactions involving the invalidated space. + * An aborted transaction doesn't allow any DML/DQL requests + * so the space won't be used anymore and can be safely + * destroyed. + * + * There's a subtle corner case though - a transaction can + * be reading disk from a DML request right now, with this + * space passed to it in the argument list. However, it's + * handled as well: the iterator will return an error as + * soon as it's done reading disk, which will make the DML + * request bail out early, without dereferencing the space. + */ + tx_manager_abort_writers_for_ddl(env->xm, space); +} + /** * This function is called after installing on_replace trigger * used for propagating changes done during DDL. It aborts all @@ -4543,6 +4563,7 @@ static const struct space_vtab vinyl_space_vtab = { /* .build_index = */ vinyl_space_build_index, /* .swap_index = */ vinyl_space_swap_index, /* .prepare_alter = */ vinyl_space_prepare_alter, + /* .invalidate = */ vinyl_space_invalidate, }; static const struct index_vtab vinyl_index_vtab = { diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c index 9f7796eb..9e1f3ca7 100644 --- a/src/box/vy_point_lookup.c +++ b/src/box/vy_point_lookup.c @@ -198,6 +198,7 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx, { /* All key parts must be set for a point lookup. */ assert(vy_stmt_is_full_key(key, lsm->cmp_def)); + assert(tx == NULL || tx->state == VINYL_TX_READY); *ret = NULL; double start_time = ev_monotonic_now(loop()); @@ -239,6 +240,19 @@ restart: errinj(ERRINJ_VY_POINT_ITER_WAIT, ERRINJ_BOOL)->bparam = false; }); + if (tx != NULL && tx->state == VINYL_TX_ABORT) { + /* + * The transaction was aborted while we were reading + * disk. We must stop now and return an error as this + * function could be called by a DML request aborted + * by a DDL operation: failing early will prevent it + * from dereferencing a destroyed space. + */ + diag_set(ClientError, ER_TRANSACTION_CONFLICT); + rc = -1; + goto done; + } + if (mem_list_version != lsm->mem_list_version) { /* * Mem list was changed during yield. This could be rotation diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c index 2864a280..60c43555 100644 --- a/src/box/vy_read_iterator.c +++ b/src/box/vy_read_iterator.c @@ -501,6 +501,17 @@ rescan_disk: } vy_read_iterator_unpin_slices(itr); /* + * The transaction could have been aborted while we were + * reading disk. We must stop now and return an error as + * this function could be called by a DML request that + * was aborted by a DDL operation: failing will prevent + * it from dereferencing a destroyed space. + */ + if (itr->tx != NULL && itr->tx->state == VINYL_TX_ABORT) { + diag_set(ClientError, ER_TRANSACTION_CONFLICT); + return -1; + } + /* * The list of in-memory indexes and/or the range tree could * have been modified by dump/compaction while we were fetching * data from disk. Restart the iterator if this is the case. @@ -842,6 +853,8 @@ vy_read_iterator_track_read(struct vy_read_iterator *itr, struct tuple *stmt) NODISCARD int vy_read_iterator_next(struct vy_read_iterator *itr, struct tuple **result) { + assert(itr->tx == NULL || itr->tx->state == VINYL_TX_READY); + ev_tstamp start_time = ev_monotonic_now(loop()); struct vy_lsm *lsm = itr->lsm; diff --git a/test/vinyl/errinj_ddl.result b/test/vinyl/errinj_ddl.result index 04cb581f..794125e8 100644 --- a/test/vinyl/errinj_ddl.result +++ b/test/vinyl/errinj_ddl.result @@ -682,3 +682,113 @@ box.commit() s:drop() --- ... +-- +-- gh-3420: crash if DML races with DDL. +-- +fiber = require('fiber') +--- +... +-- turn off cache for error injection to work +default_vinyl_cache = box.cfg.vinyl_cache +--- +... +box.cfg{vinyl_cache = 0} +--- +... +s = box.schema.space.create('test', {engine = 'vinyl'}) +--- +... +_ = s:create_index('i1') +--- +... +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) +--- +... +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) +--- +... +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) +--- +... +for i = 1, 5 do s:replace({i, i, i, i}) end +--- +... +box.snapshot() +--- +- ok +... +test_run:cmd("setopt delimiter ';'") +--- +- true +... +function async(f) + local ch = fiber.channel(1) + fiber.create(function() + local _, res = pcall(f) + ch:put(res) + end) + return ch +end; +--- +... +test_run:cmd("setopt delimiter ''"); +--- +- true +... +-- Issue a few DML requests that require reading disk. +-- Stall disk reads. +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) +--- +- ok +... +ch1 = async(function() s:insert({1, 1, 1, 1}) end) +--- +... +ch2 = async(function() s:replace({2, 3, 2, 3}) end) +--- +... +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) +--- +... +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) +--- +... +-- Execute a DDL operation on the space. +s.index.i4:drop() +--- +... +-- Resume the DML requests. Check that they have been aborted. +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) +--- +- ok +... +ch1:get() +--- +- Transaction has been aborted by conflict +... +ch2:get() +--- +- Transaction has been aborted by conflict +... +ch3:get() +--- +- Transaction has been aborted by conflict +... +ch4:get() +--- +- Transaction has been aborted by conflict +... +s:select() +--- +- - [1, 1, 1, 1] + - [2, 2, 2, 2] + - [3, 3, 3, 3] + - [4, 4, 4, 4] + - [5, 5, 5, 5] +... +s:drop() +--- +... +box.cfg{vinyl_cache = default_vinyl_cache} +--- +... diff --git a/test/vinyl/errinj_ddl.test.lua b/test/vinyl/errinj_ddl.test.lua index 50dc4aa1..84f33277 100644 --- a/test/vinyl/errinj_ddl.test.lua +++ b/test/vinyl/errinj_ddl.test.lua @@ -298,3 +298,54 @@ s:replace{1, 3} box.commit() s:drop() + +-- +-- gh-3420: crash if DML races with DDL. +-- +fiber = require('fiber') + +-- turn off cache for error injection to work +default_vinyl_cache = box.cfg.vinyl_cache +box.cfg{vinyl_cache = 0} + +s = box.schema.space.create('test', {engine = 'vinyl'}) +_ = s:create_index('i1') +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) +for i = 1, 5 do s:replace({i, i, i, i}) end +box.snapshot() + +test_run:cmd("setopt delimiter ';'") +function async(f) + local ch = fiber.channel(1) + fiber.create(function() + local _, res = pcall(f) + ch:put(res) + end) + return ch +end; +test_run:cmd("setopt delimiter ''"); + +-- Issue a few DML requests that require reading disk. +-- Stall disk reads. +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) + +ch1 = async(function() s:insert({1, 1, 1, 1}) end) +ch2 = async(function() s:replace({2, 3, 2, 3}) end) +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) + +-- Execute a DDL operation on the space. +s.index.i4:drop() + +-- Resume the DML requests. Check that they have been aborted. +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) +ch1:get() +ch2:get() +ch3:get() +ch4:get() +s:select() +s:drop() + +box.cfg{vinyl_cache = default_vinyl_cache} -- 2.11.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov @ 2019-03-25 5:26 ` Георгий Кириченко 2019-03-25 8:17 ` Vladimir Davydov 2019-03-25 5:45 ` Георгий Кириченко 2019-03-28 13:45 ` [tarantool-patches] " Konstantin Osipov 2 siblings, 1 reply; 21+ messages in thread From: Георгий Кириченко @ 2019-03-25 5:26 UTC (permalink / raw) To: tarantool-patches; +Cc: Vladimir Davydov, kostja [-- Attachment #1: Type: text/plain, Size: 13470 bytes --] On Sunday, March 24, 2019 12:07:23 AM MSK Vladimir Davydov wrote: I definitely dislike introducing a new callback, there are already a lot of dependencies. Also I'm afraid that this could prevent us from the transactional ddl implementation. We have a space vtab and destroy, why shouldn't we use them? > A DDL operation creates a new struct space container, moving unaffected > indexes from the old container, then destroying it. The problem is there > may be a DML request for this space, which was passed the old container > in the argument list and then yielded on disk read. When it resumes, it > may try to dereference the old space container, which may have already > been destroyed. This will most certainly result in a crash. > > To address this problem, we introduce a new space callback, invalidate, > which is called for the old space on space_cache_replace(). In case of > vinyl, this callback aborts all transactions involving the space. To > prevent a DML request from dereferencing a destroyed space, we also make > the iterator code check the current transaction state after each yield > and return an error if it was aborted. This should make any DML request > bail out early without dereferencing the space anymore. > > Closes #3420 > --- > src/box/blackhole.c | 1 + > src/box/memtx_space.c | 1 + > src/box/schema.cc | 3 ++ > src/box/space.c | 6 +++ > src/box/space.h | 15 ++++++ > src/box/sysview.c | 1 + > src/box/vinyl.c | 21 ++++++++ > src/box/vy_point_lookup.c | 14 ++++++ > src/box/vy_read_iterator.c | 13 +++++ > test/vinyl/errinj_ddl.result | 110 > +++++++++++++++++++++++++++++++++++++++++ test/vinyl/errinj_ddl.test.lua | > 51 +++++++++++++++++++ > 11 files changed, 236 insertions(+) > > diff --git a/src/box/blackhole.c b/src/box/blackhole.c > index 95064e1f..b69e543a 100644 > --- a/src/box/blackhole.c > +++ b/src/box/blackhole.c > @@ -129,6 +129,7 @@ static const struct space_vtab blackhole_space_vtab = { > /* .build_index = */ generic_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ generic_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > static void > diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c > index 92ec0b30..d8529fe0 100644 > --- a/src/box/memtx_space.c > +++ b/src/box/memtx_space.c > @@ -961,6 +961,7 @@ static const struct space_vtab memtx_space_vtab = { > /* .build_index = */ memtx_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ memtx_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > struct space * > diff --git a/src/box/schema.cc b/src/box/schema.cc > index 74d70d8d..4123dc9e 100644 > --- a/src/box/schema.cc > +++ b/src/box/schema.cc > @@ -251,6 +251,9 @@ space_cache_replace(struct space *old_space, struct > space *new_space) mh_strnptr_del(spaces_by_name, k, NULL); > } > space_cache_version++; > + > + if (old_space != NULL) > + space_invalidate(old_space); > } > > /** A wrapper around space_new() for data dictionary spaces. */ > diff --git a/src/box/space.c b/src/box/space.c > index e140f3d5..1379fa34 100644 > --- a/src/box/space.c > +++ b/src/box/space.c > @@ -637,4 +637,10 @@ generic_space_prepare_alter(struct space *old_space, > struct space *new_space) return 0; > } > > +void > +generic_space_invalidate(struct space *space) > +{ > + (void)space; > +} > + > /* }}} */ > diff --git a/src/box/space.h b/src/box/space.h > index 211706d2..1bcb1284 100644 > --- a/src/box/space.h > +++ b/src/box/space.h > @@ -133,6 +133,14 @@ struct space_vtab { > */ > int (*prepare_alter)(struct space *old_space, > struct space *new_space); > + /** > + * Called right after removing the space from the cache. > + * The engine should abort all transactions involving > + * the space, because the space will be destroyed soon. > + * > + * This function isn't allowed to yield or fail. > + */ > + void (*invalidate)(struct space *space); > }; > > struct space { > @@ -407,6 +415,12 @@ space_prepare_alter(struct space *old_space, struct > space *new_space) return new_space->vtab->prepare_alter(old_space, > new_space); > } > > +static inline void > +space_invalidate(struct space *space) > +{ > + return space->vtab->invalidate(space); > +} > + > static inline bool > space_is_memtx(struct space *space) { return space->engine->id == 0; } > > @@ -469,6 +483,7 @@ int generic_space_check_format(struct space *, struct > tuple_format *); int generic_space_build_index(struct space *, struct index > *, > struct tuple_format *); > int generic_space_prepare_alter(struct space *, struct space *); > +void generic_space_invalidate(struct space *); > > #if defined(__cplusplus) > } /* extern "C" */ > diff --git a/src/box/sysview.c b/src/box/sysview.c > index 63b669d0..f9edd2d0 100644 > --- a/src/box/sysview.c > +++ b/src/box/sysview.c > @@ -495,6 +495,7 @@ static const struct space_vtab sysview_space_vtab = { > /* .build_index = */ generic_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ generic_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > static void > diff --git a/src/box/vinyl.c b/src/box/vinyl.c > index a6a5f187..da891025 100644 > --- a/src/box/vinyl.c > +++ b/src/box/vinyl.c > @@ -1031,6 +1031,26 @@ vinyl_space_prepare_alter(struct space *old_space, > struct space *new_space) return 0; > } > > +static void > +vinyl_space_invalidate(struct space *space) > +{ > + struct vy_env *env = vy_env(space->engine); > + /* > + * Abort all transactions involving the invalidated space. > + * An aborted transaction doesn't allow any DML/DQL requests > + * so the space won't be used anymore and can be safely > + * destroyed. > + * > + * There's a subtle corner case though - a transaction can > + * be reading disk from a DML request right now, with this > + * space passed to it in the argument list. However, it's > + * handled as well: the iterator will return an error as > + * soon as it's done reading disk, which will make the DML > + * request bail out early, without dereferencing the space. > + */ > + tx_manager_abort_writers_for_ddl(env->xm, space); > +} > + > /** > * This function is called after installing on_replace trigger > * used for propagating changes done during DDL. It aborts all > @@ -4543,6 +4563,7 @@ static const struct space_vtab vinyl_space_vtab = { > /* .build_index = */ vinyl_space_build_index, > /* .swap_index = */ vinyl_space_swap_index, > /* .prepare_alter = */ vinyl_space_prepare_alter, > + /* .invalidate = */ vinyl_space_invalidate, > }; > > static const struct index_vtab vinyl_index_vtab = { > diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c > index 9f7796eb..9e1f3ca7 100644 > --- a/src/box/vy_point_lookup.c > +++ b/src/box/vy_point_lookup.c > @@ -198,6 +198,7 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx, > { > /* All key parts must be set for a point lookup. */ > assert(vy_stmt_is_full_key(key, lsm->cmp_def)); > + assert(tx == NULL || tx->state == VINYL_TX_READY); > > *ret = NULL; > double start_time = ev_monotonic_now(loop()); > @@ -239,6 +240,19 @@ restart: > errinj(ERRINJ_VY_POINT_ITER_WAIT, ERRINJ_BOOL)->bparam = false; > }); > > + if (tx != NULL && tx->state == VINYL_TX_ABORT) { > + /* > + * The transaction was aborted while we were reading > + * disk. We must stop now and return an error as this > + * function could be called by a DML request aborted > + * by a DDL operation: failing early will prevent it > + * from dereferencing a destroyed space. > + */ > + diag_set(ClientError, ER_TRANSACTION_CONFLICT); > + rc = -1; > + goto done; > + } > + > if (mem_list_version != lsm->mem_list_version) { > /* > * Mem list was changed during yield. This could be rotation > diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c > index 2864a280..60c43555 100644 > --- a/src/box/vy_read_iterator.c > +++ b/src/box/vy_read_iterator.c > @@ -501,6 +501,17 @@ rescan_disk: > } > vy_read_iterator_unpin_slices(itr); > /* > + * The transaction could have been aborted while we were > + * reading disk. We must stop now and return an error as > + * this function could be called by a DML request that > + * was aborted by a DDL operation: failing will prevent > + * it from dereferencing a destroyed space. > + */ > + if (itr->tx != NULL && itr->tx->state == VINYL_TX_ABORT) { > + diag_set(ClientError, ER_TRANSACTION_CONFLICT); > + return -1; > + } > + /* > * The list of in-memory indexes and/or the range tree could > * have been modified by dump/compaction while we were fetching > * data from disk. Restart the iterator if this is the case. > @@ -842,6 +853,8 @@ vy_read_iterator_track_read(struct vy_read_iterator > *itr, struct tuple *stmt) NODISCARD int > vy_read_iterator_next(struct vy_read_iterator *itr, struct tuple **result) > { > + assert(itr->tx == NULL || itr->tx->state == VINYL_TX_READY); > + > ev_tstamp start_time = ev_monotonic_now(loop()); > > struct vy_lsm *lsm = itr->lsm; > diff --git a/test/vinyl/errinj_ddl.result b/test/vinyl/errinj_ddl.result > index 04cb581f..794125e8 100644 > --- a/test/vinyl/errinj_ddl.result > +++ b/test/vinyl/errinj_ddl.result > @@ -682,3 +682,113 @@ box.commit() > s:drop() > --- > ... > +-- > +-- gh-3420: crash if DML races with DDL. > +-- > +fiber = require('fiber') > +--- > +... > +-- turn off cache for error injection to work > +default_vinyl_cache = box.cfg.vinyl_cache > +--- > +... > +box.cfg{vinyl_cache = 0} > +--- > +... > +s = box.schema.space.create('test', {engine = 'vinyl'}) > +--- > +... > +_ = s:create_index('i1') > +--- > +... > +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) > +--- > +... > +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) > +--- > +... > +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) > +--- > +... > +for i = 1, 5 do s:replace({i, i, i, i}) end > +--- > +... > +box.snapshot() > +--- > +- ok > +... > +test_run:cmd("setopt delimiter ';'") > +--- > +- true > +... > +function async(f) > + local ch = fiber.channel(1) > + fiber.create(function() > + local _, res = pcall(f) > + ch:put(res) > + end) > + return ch > +end; > +--- > +... > +test_run:cmd("setopt delimiter ''"); > +--- > +- true > +... > +-- Issue a few DML requests that require reading disk. > +-- Stall disk reads. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) > +--- > +- ok > +... > +ch1 = async(function() s:insert({1, 1, 1, 1}) end) > +--- > +... > +ch2 = async(function() s:replace({2, 3, 2, 3}) end) > +--- > +... > +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) > +--- > +... > +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) > +--- > +... > +-- Execute a DDL operation on the space. > +s.index.i4:drop() > +--- > +... > +-- Resume the DML requests. Check that they have been aborted. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) > +--- > +- ok > +... > +ch1:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch2:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch3:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch4:get() > +--- > +- Transaction has been aborted by conflict > +... > +s:select() > +--- > +- - [1, 1, 1, 1] > + - [2, 2, 2, 2] > + - [3, 3, 3, 3] > + - [4, 4, 4, 4] > + - [5, 5, 5, 5] > +... > +s:drop() > +--- > +... > +box.cfg{vinyl_cache = default_vinyl_cache} > +--- > +... > diff --git a/test/vinyl/errinj_ddl.test.lua b/test/vinyl/errinj_ddl.test.lua > index 50dc4aa1..84f33277 100644 > --- a/test/vinyl/errinj_ddl.test.lua > +++ b/test/vinyl/errinj_ddl.test.lua > @@ -298,3 +298,54 @@ s:replace{1, 3} > box.commit() > > s:drop() > + > +-- > +-- gh-3420: crash if DML races with DDL. > +-- > +fiber = require('fiber') > + > +-- turn off cache for error injection to work > +default_vinyl_cache = box.cfg.vinyl_cache > +box.cfg{vinyl_cache = 0} > + > +s = box.schema.space.create('test', {engine = 'vinyl'}) > +_ = s:create_index('i1') > +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) > +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) > +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) > +for i = 1, 5 do s:replace({i, i, i, i}) end > +box.snapshot() > + > +test_run:cmd("setopt delimiter ';'") > +function async(f) > + local ch = fiber.channel(1) > + fiber.create(function() > + local _, res = pcall(f) > + ch:put(res) > + end) > + return ch > +end; > +test_run:cmd("setopt delimiter ''"); > + > +-- Issue a few DML requests that require reading disk. > +-- Stall disk reads. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) > + > +ch1 = async(function() s:insert({1, 1, 1, 1}) end) > +ch2 = async(function() s:replace({2, 3, 2, 3}) end) > +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) > +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) > + > +-- Execute a DDL operation on the space. > +s.index.i4:drop() > + > +-- Resume the DML requests. Check that they have been aborted. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) > +ch1:get() > +ch2:get() > +ch3:get() > +ch4:get() > +s:select() > +s:drop() > + > +box.cfg{vinyl_cache = default_vinyl_cache} [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-25 5:26 ` [tarantool-patches] " Георгий Кириченко @ 2019-03-25 8:17 ` Vladimir Davydov [not found] ` <1564677.EMV258VVK2@home.lan> 0 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-25 8:17 UTC (permalink / raw) To: Георгий Кириченко Cc: tarantool-patches, kostja On Mon, Mar 25, 2019 at 08:26:25AM +0300, Георгий Кириченко wrote: > I definitely dislike introducing a new callback, there are already a lot of > dependencies. Also I'm afraid that this could prevent us from the > transactional ddl implementation. I don't see a way to fix this issue without introducing a new callback. And frankly I don't quite understand your concern. If the new callback proves to be useless / harmful for transaction DDL, we will remove it - after all engine API isn't carved in stone. > We have a space vtab and destroy, why shouldn't we use them? Because the space becomes invalid before it gets destroyed - ALTER moves indexes to the new space before writing to WAL. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <1564677.EMV258VVK2@home.lan>]
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache [not found] ` <1564677.EMV258VVK2@home.lan> @ 2019-03-25 9:51 ` Vladimir Davydov 0 siblings, 0 replies; 21+ messages in thread From: Vladimir Davydov @ 2019-03-25 9:51 UTC (permalink / raw) To: Георгий Кириченко Cc: tarantool-patches, kostja [back to the list] On Mon, Mar 25, 2019 at 11:57:07AM +0300, Георгий Кириченко wrote: > On Monday, March 25, 2019 11:17:43 AM MSK you wrote: > > On Mon, Mar 25, 2019 at 08:26:25AM +0300, Георгий Кириченко wrote: > > > I definitely dislike introducing a new callback, there are already a lot > > > of > > > dependencies. Also I'm afraid that this could prevent us from the > > > transactional ddl implementation. > > > > I don't see a way to fix this issue without introducing a new callback. > > And frankly I don't quite understand your concern. If the new callback > > proves to be useless / harmful for transaction DDL, we will remove it - > > after all engine API isn't carved in stone. > But leads to increase in amount of work we should be done to implement in > future. And we already have a space destroy callback. > > > > > We have a space vtab and destroy, why shouldn't we use them? > > > > Because the space becomes invalid before it gets destroyed - ALTER moves > > indexes to the new space before writing to WAL. > So it might be a high time to change it. > It's feasible, I think. However, It'd blow the scope of this particular patch set as it would most likely require moving reference counting from vinyl to the upper level. Besides, I don't quite like mixing two separate functions (space deletion and transaction abortion) in one callback when we can easily add a new one with a clear semantic. I don't see anything wrong about removing this callback in future, when transactional DDL is finally ready. In fact, I've added and removed ALTER callbacks multiple times and it never posed a problem to me. Up to Kostja. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov 2019-03-25 5:26 ` [tarantool-patches] " Георгий Кириченко @ 2019-03-25 5:45 ` Георгий Кириченко 2019-03-25 8:21 ` Vladimir Davydov 2019-03-28 13:45 ` [tarantool-patches] " Konstantin Osipov 2 siblings, 1 reply; 21+ messages in thread From: Георгий Кириченко @ 2019-03-25 5:45 UTC (permalink / raw) To: tarantool-patches; +Cc: Vladimir Davydov, kostja [-- Attachment #1: Type: text/plain, Size: 13372 bytes --] On Sunday, March 24, 2019 12:07:23 AM MSK Vladimir Davydov wrote: > A DDL operation creates a new struct space container, moving unaffected > indexes from the old container, then destroying it. The problem is there > may be a DML request for this space, which was passed the old container > in the argument list and then yielded on disk read. When it resumes, it > may try to dereference the old space container, which may have already > been destroyed. This will most certainly result in a crash. > > To address this problem, we introduce a new space callback, invalidate, > which is called for the old space on space_cache_replace(). In case of > vinyl, this callback aborts all transactions involving the space. To > prevent a DML request from dereferencing a destroyed space, we also make > the iterator code check the current transaction state after each yield > and return an error if it was aborted. This should make any DML request > bail out early without dereferencing the space anymore. > > Closes #3420 > --- > src/box/blackhole.c | 1 + > src/box/memtx_space.c | 1 + > src/box/schema.cc | 3 ++ > src/box/space.c | 6 +++ > src/box/space.h | 15 ++++++ > src/box/sysview.c | 1 + > src/box/vinyl.c | 21 ++++++++ > src/box/vy_point_lookup.c | 14 ++++++ > src/box/vy_read_iterator.c | 13 +++++ > test/vinyl/errinj_ddl.result | 110 > +++++++++++++++++++++++++++++++++++++++++ test/vinyl/errinj_ddl.test.lua | > 51 +++++++++++++++++++ > 11 files changed, 236 insertions(+) > > diff --git a/src/box/blackhole.c b/src/box/blackhole.c > index 95064e1f..b69e543a 100644 > --- a/src/box/blackhole.c > +++ b/src/box/blackhole.c > @@ -129,6 +129,7 @@ static const struct space_vtab blackhole_space_vtab = { > /* .build_index = */ generic_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ generic_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > static void > diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c > index 92ec0b30..d8529fe0 100644 > --- a/src/box/memtx_space.c > +++ b/src/box/memtx_space.c > @@ -961,6 +961,7 @@ static const struct space_vtab memtx_space_vtab = { > /* .build_index = */ memtx_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ memtx_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > struct space * > diff --git a/src/box/schema.cc b/src/box/schema.cc > index 74d70d8d..4123dc9e 100644 > --- a/src/box/schema.cc > +++ b/src/box/schema.cc > @@ -251,6 +251,9 @@ space_cache_replace(struct space *old_space, struct > space *new_space) mh_strnptr_del(spaces_by_name, k, NULL); > } > space_cache_version++; > + > + if (old_space != NULL) > + space_invalidate(old_space); > } > > /** A wrapper around space_new() for data dictionary spaces. */ > diff --git a/src/box/space.c b/src/box/space.c > index e140f3d5..1379fa34 100644 > --- a/src/box/space.c > +++ b/src/box/space.c > @@ -637,4 +637,10 @@ generic_space_prepare_alter(struct space *old_space, > struct space *new_space) return 0; > } > > +void > +generic_space_invalidate(struct space *space) > +{ > + (void)space; > +} > + > /* }}} */ > diff --git a/src/box/space.h b/src/box/space.h > index 211706d2..1bcb1284 100644 > --- a/src/box/space.h > +++ b/src/box/space.h > @@ -133,6 +133,14 @@ struct space_vtab { > */ > int (*prepare_alter)(struct space *old_space, > struct space *new_space); > + /** > + * Called right after removing the space from the cache. > + * The engine should abort all transactions involving > + * the space, because the space will be destroyed soon. > + * > + * This function isn't allowed to yield or fail. > + */ > + void (*invalidate)(struct space *space); > }; > > struct space { > @@ -407,6 +415,12 @@ space_prepare_alter(struct space *old_space, struct > space *new_space) return new_space->vtab->prepare_alter(old_space, > new_space); > } > > +static inline void > +space_invalidate(struct space *space) > +{ > + return space->vtab->invalidate(space); > +} > + > static inline bool > space_is_memtx(struct space *space) { return space->engine->id == 0; } > > @@ -469,6 +483,7 @@ int generic_space_check_format(struct space *, struct > tuple_format *); int generic_space_build_index(struct space *, struct index > *, > struct tuple_format *); > int generic_space_prepare_alter(struct space *, struct space *); > +void generic_space_invalidate(struct space *); > > #if defined(__cplusplus) > } /* extern "C" */ > diff --git a/src/box/sysview.c b/src/box/sysview.c > index 63b669d0..f9edd2d0 100644 > --- a/src/box/sysview.c > +++ b/src/box/sysview.c > @@ -495,6 +495,7 @@ static const struct space_vtab sysview_space_vtab = { > /* .build_index = */ generic_space_build_index, > /* .swap_index = */ generic_space_swap_index, > /* .prepare_alter = */ generic_space_prepare_alter, > + /* .invalidate = */ generic_space_invalidate, > }; > > static void > diff --git a/src/box/vinyl.c b/src/box/vinyl.c > index a6a5f187..da891025 100644 > --- a/src/box/vinyl.c > +++ b/src/box/vinyl.c > @@ -1031,6 +1031,26 @@ vinyl_space_prepare_alter(struct space *old_space, > struct space *new_space) return 0; > } > > +static void > +vinyl_space_invalidate(struct space *space) > +{ > + struct vy_env *env = vy_env(space->engine); > + /* > + * Abort all transactions involving the invalidated space. > + * An aborted transaction doesn't allow any DML/DQL requests > + * so the space won't be used anymore and can be safely > + * destroyed. > + * > + * There's a subtle corner case though - a transaction can > + * be reading disk from a DML request right now, with this > + * space passed to it in the argument list. However, it's > + * handled as well: the iterator will return an error as > + * soon as it's done reading disk, which will make the DML > + * request bail out early, without dereferencing the space. > + */ > + tx_manager_abort_writers_for_ddl(env->xm, space); As tx_manager_abort_writers_for_ddl ignores applier sessions it wouldn't fix the issue. I think a separate function should be introduced. > +} > + > /** > * This function is called after installing on_replace trigger > * used for propagating changes done during DDL. It aborts all > @@ -4543,6 +4563,7 @@ static const struct space_vtab vinyl_space_vtab = { > /* .build_index = */ vinyl_space_build_index, > /* .swap_index = */ vinyl_space_swap_index, > /* .prepare_alter = */ vinyl_space_prepare_alter, > + /* .invalidate = */ vinyl_space_invalidate, > }; > > static const struct index_vtab vinyl_index_vtab = { > diff --git a/src/box/vy_point_lookup.c b/src/box/vy_point_lookup.c > index 9f7796eb..9e1f3ca7 100644 > --- a/src/box/vy_point_lookup.c > +++ b/src/box/vy_point_lookup.c > @@ -198,6 +198,7 @@ vy_point_lookup(struct vy_lsm *lsm, struct vy_tx *tx, > { > /* All key parts must be set for a point lookup. */ > assert(vy_stmt_is_full_key(key, lsm->cmp_def)); > + assert(tx == NULL || tx->state == VINYL_TX_READY); > > *ret = NULL; > double start_time = ev_monotonic_now(loop()); > @@ -239,6 +240,19 @@ restart: > errinj(ERRINJ_VY_POINT_ITER_WAIT, ERRINJ_BOOL)->bparam = false; > }); > > + if (tx != NULL && tx->state == VINYL_TX_ABORT) { > + /* > + * The transaction was aborted while we were reading > + * disk. We must stop now and return an error as this > + * function could be called by a DML request aborted > + * by a DDL operation: failing early will prevent it > + * from dereferencing a destroyed space. > + */ > + diag_set(ClientError, ER_TRANSACTION_CONFLICT); > + rc = -1; > + goto done; > + } > + > if (mem_list_version != lsm->mem_list_version) { > /* > * Mem list was changed during yield. This could be rotation > diff --git a/src/box/vy_read_iterator.c b/src/box/vy_read_iterator.c > index 2864a280..60c43555 100644 > --- a/src/box/vy_read_iterator.c > +++ b/src/box/vy_read_iterator.c > @@ -501,6 +501,17 @@ rescan_disk: > } > vy_read_iterator_unpin_slices(itr); > /* > + * The transaction could have been aborted while we were > + * reading disk. We must stop now and return an error as > + * this function could be called by a DML request that > + * was aborted by a DDL operation: failing will prevent > + * it from dereferencing a destroyed space. > + */ > + if (itr->tx != NULL && itr->tx->state == VINYL_TX_ABORT) { > + diag_set(ClientError, ER_TRANSACTION_CONFLICT); > + return -1; > + } > + /* > * The list of in-memory indexes and/or the range tree could > * have been modified by dump/compaction while we were fetching > * data from disk. Restart the iterator if this is the case. > @@ -842,6 +853,8 @@ vy_read_iterator_track_read(struct vy_read_iterator > *itr, struct tuple *stmt) NODISCARD int > vy_read_iterator_next(struct vy_read_iterator *itr, struct tuple **result) > { > + assert(itr->tx == NULL || itr->tx->state == VINYL_TX_READY); > + > ev_tstamp start_time = ev_monotonic_now(loop()); > > struct vy_lsm *lsm = itr->lsm; > diff --git a/test/vinyl/errinj_ddl.result b/test/vinyl/errinj_ddl.result > index 04cb581f..794125e8 100644 > --- a/test/vinyl/errinj_ddl.result > +++ b/test/vinyl/errinj_ddl.result > @@ -682,3 +682,113 @@ box.commit() > s:drop() > --- > ... > +-- > +-- gh-3420: crash if DML races with DDL. > +-- > +fiber = require('fiber') > +--- > +... > +-- turn off cache for error injection to work > +default_vinyl_cache = box.cfg.vinyl_cache > +--- > +... > +box.cfg{vinyl_cache = 0} > +--- > +... > +s = box.schema.space.create('test', {engine = 'vinyl'}) > +--- > +... > +_ = s:create_index('i1') > +--- > +... > +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) > +--- > +... > +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) > +--- > +... > +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) > +--- > +... > +for i = 1, 5 do s:replace({i, i, i, i}) end > +--- > +... > +box.snapshot() > +--- > +- ok > +... > +test_run:cmd("setopt delimiter ';'") > +--- > +- true > +... > +function async(f) > + local ch = fiber.channel(1) > + fiber.create(function() > + local _, res = pcall(f) > + ch:put(res) > + end) > + return ch > +end; > +--- > +... > +test_run:cmd("setopt delimiter ''"); > +--- > +- true > +... > +-- Issue a few DML requests that require reading disk. > +-- Stall disk reads. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) > +--- > +- ok > +... > +ch1 = async(function() s:insert({1, 1, 1, 1}) end) > +--- > +... > +ch2 = async(function() s:replace({2, 3, 2, 3}) end) > +--- > +... > +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) > +--- > +... > +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) > +--- > +... > +-- Execute a DDL operation on the space. > +s.index.i4:drop() > +--- > +... > +-- Resume the DML requests. Check that they have been aborted. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) > +--- > +- ok > +... > +ch1:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch2:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch3:get() > +--- > +- Transaction has been aborted by conflict > +... > +ch4:get() > +--- > +- Transaction has been aborted by conflict > +... > +s:select() > +--- > +- - [1, 1, 1, 1] > + - [2, 2, 2, 2] > + - [3, 3, 3, 3] > + - [4, 4, 4, 4] > + - [5, 5, 5, 5] > +... > +s:drop() > +--- > +... > +box.cfg{vinyl_cache = default_vinyl_cache} > +--- > +... > diff --git a/test/vinyl/errinj_ddl.test.lua b/test/vinyl/errinj_ddl.test.lua > index 50dc4aa1..84f33277 100644 > --- a/test/vinyl/errinj_ddl.test.lua > +++ b/test/vinyl/errinj_ddl.test.lua > @@ -298,3 +298,54 @@ s:replace{1, 3} > box.commit() > > s:drop() > + > +-- > +-- gh-3420: crash if DML races with DDL. > +-- > +fiber = require('fiber') > + > +-- turn off cache for error injection to work > +default_vinyl_cache = box.cfg.vinyl_cache > +box.cfg{vinyl_cache = 0} > + > +s = box.schema.space.create('test', {engine = 'vinyl'}) > +_ = s:create_index('i1') > +_ = s:create_index('i2', {parts = {2, 'unsigned'}}) > +_ = s:create_index('i3', {parts = {3, 'unsigned'}}) > +_ = s:create_index('i4', {parts = {4, 'unsigned'}}) > +for i = 1, 5 do s:replace({i, i, i, i}) end > +box.snapshot() > + > +test_run:cmd("setopt delimiter ';'") > +function async(f) > + local ch = fiber.channel(1) > + fiber.create(function() > + local _, res = pcall(f) > + ch:put(res) > + end) > + return ch > +end; > +test_run:cmd("setopt delimiter ''"); > + > +-- Issue a few DML requests that require reading disk. > +-- Stall disk reads. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0.01) > + > +ch1 = async(function() s:insert({1, 1, 1, 1}) end) > +ch2 = async(function() s:replace({2, 3, 2, 3}) end) > +ch3 = async(function() s:update({3}, {{'+', 4, 1}}) end) > +ch4 = async(function() s:upsert({5, 5, 5, 5}, {{'+', 4, 1}}) end) > + > +-- Execute a DDL operation on the space. > +s.index.i4:drop() > + > +-- Resume the DML requests. Check that they have been aborted. > +errinj.set('ERRINJ_VY_READ_PAGE_TIMEOUT', 0) > +ch1:get() > +ch2:get() > +ch3:get() > +ch4:get() > +s:select() > +s:drop() > + > +box.cfg{vinyl_cache = default_vinyl_cache} [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-25 5:45 ` Георгий Кириченко @ 2019-03-25 8:21 ` Vladimir Davydov 2019-03-25 9:03 ` Георгий Кириченко 0 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-25 8:21 UTC (permalink / raw) To: Георгий Кириченко Cc: tarantool-patches, kostja On Mon, Mar 25, 2019 at 08:45:41AM +0300, Георгий Кириченко wrote: > > +static void > > +vinyl_space_invalidate(struct space *space) > > +{ > > + struct vy_env *env = vy_env(space->engine); > > + /* > > + * Abort all transactions involving the invalidated space. > > + * An aborted transaction doesn't allow any DML/DQL requests > > + * so the space won't be used anymore and can be safely > > + * destroyed. > > + * > > + * There's a subtle corner case though - a transaction can > > + * be reading disk from a DML request right now, with this > > + * space passed to it in the argument list. However, it's > > + * handled as well: the iterator will return an error as > > + * soon as it's done reading disk, which will make the DML > > + * request bail out early, without dereferencing the space. > > + */ > > + tx_manager_abort_writers_for_ddl(env->xm, space); > > As tx_manager_abort_writers_for_ddl ignores applier sessions it wouldn't fix > the issue. I think a separate function should be introduced. Wrong. It's tx_manager_abort_writers_for_ro that treats appliers in a special way; tx_manager_abort_writers_for_ddl aborts all transactions involving the given space, irrespective of the session type. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-25 8:21 ` Vladimir Davydov @ 2019-03-25 9:03 ` Георгий Кириченко 0 siblings, 0 replies; 21+ messages in thread From: Георгий Кириченко @ 2019-03-25 9:03 UTC (permalink / raw) To: Vladimir Davydov; +Cc: tarantool-patches, kostja [-- Attachment #1: Type: text/plain, Size: 1410 bytes --] On Monday, March 25, 2019 11:21:07 AM MSK Vladimir Davydov wrote: > On Mon, Mar 25, 2019 at 08:45:41AM +0300, Георгий Кириченко wrote: > > > +static void > > > +vinyl_space_invalidate(struct space *space) > > > +{ > > > + struct vy_env *env = vy_env(space->engine); > > > + /* > > > + * Abort all transactions involving the invalidated space. > > > + * An aborted transaction doesn't allow any DML/DQL requests > > > + * so the space won't be used anymore and can be safely > > > + * destroyed. > > > + * > > > + * There's a subtle corner case though - a transaction can > > > + * be reading disk from a DML request right now, with this > > > + * space passed to it in the argument list. However, it's > > > + * handled as well: the iterator will return an error as > > > + * soon as it's done reading disk, which will make the DML > > > + * request bail out early, without dereferencing the space. > > > + */ > > > + tx_manager_abort_writers_for_ddl(env->xm, space); > > > > As tx_manager_abort_writers_for_ddl ignores applier sessions it wouldn't > > fix the issue. I think a separate function should be introduced. > > Wrong. It's tx_manager_abort_writers_for_ro that treats appliers in a > special way; tx_manager_abort_writers_for_ddl aborts all transactions > involving the given space, irrespective of the session type. Sorry, it was my mistake [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* [tarantool-patches] Re: [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov 2019-03-25 5:26 ` [tarantool-patches] " Георгий Кириченко 2019-03-25 5:45 ` Георгий Кириченко @ 2019-03-28 13:45 ` Konstantin Osipov 2019-03-28 14:02 ` Vladimir Davydov 2 siblings, 1 reply; 21+ messages in thread From: Konstantin Osipov @ 2019-03-28 13:45 UTC (permalink / raw) To: tarantool-patches * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/24 00:10]: > A DDL operation creates a new struct space container, moving unaffected > indexes from the old container, then destroying it. The problem is there > may be a DML request for this space, which was passed the old container > in the argument list and then yielded on disk read. When it resumes, it > may try to dereference the old space container, which may have already > been destroyed. This will most certainly result in a crash. > > To address this problem, we introduce a new space callback, invalidate, > which is called for the old space on space_cache_replace(). In case of > vinyl, this callback aborts all transactions involving the space. To > prevent a DML request from dereferencing a destroyed space, we also make > the iterator code check the current transaction state after each yield > and return an error if it was aborted. This should make any DML request > bail out early without dereferencing the space anymore. This patch looks good to me. I don't share Georgy's concerns. I would make the name of the callback more explicit, like abort_transactions_in_engine, so that it clear what it is doing now. In future, should we find more use for it, I would rename it. -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] Re: [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-28 13:45 ` [tarantool-patches] " Konstantin Osipov @ 2019-03-28 14:02 ` Vladimir Davydov 2019-03-28 14:12 ` Konstantin Osipov 0 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-28 14:02 UTC (permalink / raw) To: Konstantin Osipov; +Cc: tarantool-patches On Thu, Mar 28, 2019 at 04:45:50PM +0300, Konstantin Osipov wrote: > * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/24 00:10]: > > A DDL operation creates a new struct space container, moving unaffected > > indexes from the old container, then destroying it. The problem is there > > may be a DML request for this space, which was passed the old container > > in the argument list and then yielded on disk read. When it resumes, it > > may try to dereference the old space container, which may have already > > been destroyed. This will most certainly result in a crash. > > > > To address this problem, we introduce a new space callback, invalidate, > > which is called for the old space on space_cache_replace(). In case of > > vinyl, this callback aborts all transactions involving the space. To > > prevent a DML request from dereferencing a destroyed space, we also make > > the iterator code check the current transaction state after each yield > > and return an error if it was aborted. This should make any DML request > > bail out early without dereferencing the space anymore. > > This patch looks good to me. I don't share Georgy's concerns. I > would make the name of the callback more explicit, like > abort_transactions_in_engine, so that it clear what it is doing > now. In future, should we find more use for it, I would rename it. 'invalidate' sounds so much better IMO. Besides, the name suggests that the space becomes invalid when it's called and hence the function should take extra care. 'abort_transactions' is, in fact, a more generic name. Looking at it, I wouldn't say that it should take extra precautions to assure that currently running transactions won't dereference the old space by any chance. So I'm quite convinced we'd better call it 'invalidate'. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [tarantool-patches] Re: [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache 2019-03-28 14:02 ` Vladimir Davydov @ 2019-03-28 14:12 ` Konstantin Osipov 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2019-03-28 14:12 UTC (permalink / raw) To: Vladimir Davydov; +Cc: tarantool-patches * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/28 17:05]: > > > A DDL operation creates a new struct space container, moving unaffected > > > indexes from the old container, then destroying it. The problem is there > > > may be a DML request for this space, which was passed the old container > > > in the argument list and then yielded on disk read. When it resumes, it > > > may try to dereference the old space container, which may have already > > > been destroyed. This will most certainly result in a crash. > > > > > > To address this problem, we introduce a new space callback, invalidate, > > > which is called for the old space on space_cache_replace(). In case of > > > vinyl, this callback aborts all transactions involving the space. To > > > prevent a DML request from dereferencing a destroyed space, we also make > > > the iterator code check the current transaction state after each yield > > > and return an error if it was aborted. This should make any DML request > > > bail out early without dereferencing the space anymore. > > > > This patch looks good to me. I don't share Georgy's concerns. I > > would make the name of the callback more explicit, like > > abort_transactions_in_engine, so that it clear what it is doing > > now. In future, should we find more use for it, I would rename it. > > 'invalidate' sounds so much better IMO. Besides, the name suggests that > the space becomes invalid when it's called and hence the function should > take extra care. > > 'abort_transactions' is, in fact, a more generic name. Looking at it, > I wouldn't say that it should take extra precautions to assure that > currently running transactions won't dereference the old space by any > chance. > > So I'm quite convinced we'd better call it 'invalidate'. :) -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" 2019-03-23 21:07 [PATCH 0/4] Fix DML vs DDL race Vladimir Davydov ` (2 preceding siblings ...) 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov @ 2019-03-23 21:07 ` Vladimir Davydov 2019-03-28 13:46 ` [tarantool-patches] " Konstantin Osipov 3 siblings, 1 reply; 21+ messages in thread From: Vladimir Davydov @ 2019-03-23 21:07 UTC (permalink / raw) To: kostja; +Cc: tarantool-patches This reverts commit c2de45c4686405d199eb22273706a3d52cadea3b. Not needed anymore as the DDL bug it was supposed to work around has been finally fixed. Follow-up #3420 --- rpm/tarantool.spec | 2 -- test/vinyl/ddl.skipcond | 6 ------ 2 files changed, 8 deletions(-) delete mode 100644 test/vinyl/ddl.skipcond diff --git a/rpm/tarantool.spec b/rpm/tarantool.spec index bd2469d9..c87b1667 100644 --- a/rpm/tarantool.spec +++ b/rpm/tarantool.spec @@ -149,8 +149,6 @@ rm -rf %{buildroot}%{_datarootdir}/doc/tarantool/ echo "self.skip = True" > ./test/app/socket.skipcond # https://github.com/tarantool/tarantool/issues/1322 echo "self.skip = True" > ./test/app/digest.skipcond -# https://github.com/tarantool/tarantool/issues/3420 -echo "self.skip = True" > ./test/vinyl/ddl.skipcond # run a safe subset of the test suite cd test && ./test-run.py -j 1 unit/ app/ app-tap/ box/ box-tap/ engine/ vinyl/ %endif diff --git a/test/vinyl/ddl.skipcond b/test/vinyl/ddl.skipcond deleted file mode 100644 index 3ef55854..00000000 --- a/test/vinyl/ddl.skipcond +++ /dev/null @@ -1,6 +0,0 @@ -# vim: set ft=python : -import os - -# Travis CI fails because of bug #3420 -if os.environ.get('TRAVIS_JOB_ID', False): - self.skip = 1 -- 2.11.0 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [tarantool-patches] Re: [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" 2019-03-23 21:07 ` [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" Vladimir Davydov @ 2019-03-28 13:46 ` Konstantin Osipov 0 siblings, 0 replies; 21+ messages in thread From: Konstantin Osipov @ 2019-03-28 13:46 UTC (permalink / raw) To: tarantool-patches * Vladimir Davydov <vdavydov.dev@gmail.com> [19/03/24 00:10]: > This reverts commit c2de45c4686405d199eb22273706a3d52cadea3b. > > Not needed anymore as the DDL bug it was supposed to work around has > been finally fixed. > > Follow-up #3420 This is OK to push, obviously. -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2019-03-28 14:12 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-23 21:07 [PATCH 0/4] Fix DML vs DDL race Vladimir Davydov 2019-03-23 21:07 ` [PATCH 1/4] vinyl: introduce hash of spaces affected by transaction Vladimir Davydov 2019-03-28 13:25 ` [tarantool-patches] " Konstantin Osipov 2019-03-28 13:58 ` Vladimir Davydov 2019-03-23 21:07 ` [PATCH 2/4] vinyl: don't abort transactions that modify only local spaces for ro Vladimir Davydov 2019-03-25 5:27 ` [tarantool-patches] " Георгий Кириченко 2019-03-25 8:13 ` Vladimir Davydov 2019-03-25 8:58 ` Георгий Кириченко 2019-03-25 9:30 ` Vladimir Davydov 2019-03-23 21:07 ` [PATCH 3/4] vinyl: abort affected transactions when space is removed from cache Vladimir Davydov 2019-03-25 5:26 ` [tarantool-patches] " Георгий Кириченко 2019-03-25 8:17 ` Vladimir Davydov [not found] ` <1564677.EMV258VVK2@home.lan> 2019-03-25 9:51 ` Vladimir Davydov 2019-03-25 5:45 ` Георгий Кириченко 2019-03-25 8:21 ` Vladimir Davydov 2019-03-25 9:03 ` Георгий Кириченко 2019-03-28 13:45 ` [tarantool-patches] " Konstantin Osipov 2019-03-28 14:02 ` Vladimir Davydov 2019-03-28 14:12 ` Konstantin Osipov 2019-03-23 21:07 ` [PATCH 4/4] Revert "test: skip ddl test for vinyl on travis" Vladimir Davydov 2019-03-28 13:46 ` [tarantool-patches] " Konstantin Osipov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox