From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Wed, 3 Jul 2019 22:35:41 +0300 From: Konstantin Osipov Subject: Re: [PATCH 6/6] Replace schema lock with fine-grained locking Message-ID: <20190703193541.GH17318@atlas> References: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: To: Vladimir Davydov Cc: tarantool-patches@freelists.org List-ID: * Vladimir Davydov [19/07/01 10:04]: > Now, as we don't need to take the schema lock for checkpointing, it is > only used to synchronize concurrent space modifications (drop, truncate, > alter). Actually, a global lock is a way too heavy means to achieve this > goal, because we only care about forbidding concurrent modifications of > the same space while concurrent modifications of different spaces should > work just fine. So this patch replaces schema lock with a per-space > flag. The flag is called is_in_alter and set by alter_space_new() and > cleared by alter_space_delete(). If the flag is already set when > alter_space_new() is called, an error is thrown. Uh-oh. Could you please do a bit more coding? There are inherent dangers in using a boolean flag rather than a normal lock: - lock life time is bound to object life time - there is no way to put itself into a wait queue - there is an implicit assumption that a fiber only takes one lock and no way to inspect/free all locks of a fiber. - deadlock detection is impossible. Let's add a normal name-based locking for this: struct lock { enum object_type object_type; char *object_name; enum { S, X } type; struct lock *pending; struct fiber *owner; // ideally it should be struct txn, or int txn_id, not struct fiber }; hash metadata_locks; This could be a separate module in box.cc. The api should take locks by name: struct lock *metadata_lock_get(enum object_type type, char *object_name, enum lock_type type, int txn_id); and unlock by value or txn_id: void metadata_lock_free(struct lock *lock); void metadata_lock_free_all(int txn_id); > Removal of the schema lock allows us to remove latch_steal() helper and > on_begin_stmt txn trigger altogether, as they were introduced solely to > support locking. > > This is a prerequisite for transactional DDL, because it's unclear how > to preserve the global schema lock while allowing to combine several DDL > statements in the same transaction. > --- > src/box/alter.cc | 68 +++++----------------------- > src/box/alter.h | 3 -- > src/box/schema.cc | 41 +++++++---------- > src/box/schema.h | 6 --- > src/box/space.c | 2 - > src/box/space.h | 7 ++- > src/box/txn.c | 3 -- > src/box/vy_lsm.c | 1 + > src/lib/core/latch.h | 10 ----- > test/box/on_replace.result | 8 ++-- > test/box/transaction.result | 2 +- > test/engine/errinj_ddl.result | 98 +++++++++++++++++++++++++++++++++++++++++ > test/engine/errinj_ddl.test.lua | 38 ++++++++++++++++ > 13 files changed, 176 insertions(+), 111 deletions(-) > > diff --git a/src/box/alter.cc b/src/box/alter.cc > index e76b9e68..f82993c3 100644 > --- a/src/box/alter.cc > +++ b/src/box/alter.cc > @@ -568,7 +568,6 @@ space_swap_triggers(struct space *new_space, struct space *old_space) > { > rlist_swap(&new_space->before_replace, &old_space->before_replace); > rlist_swap(&new_space->on_replace, &old_space->on_replace); > - rlist_swap(&new_space->on_stmt_begin, &old_space->on_stmt_begin); > /** Swap SQL Triggers pointer. */ > struct sql_trigger *new_value = new_space->sql_triggers; > new_space->sql_triggers = old_space->sql_triggers; > @@ -707,6 +706,10 @@ struct alter_space { > static struct alter_space * > alter_space_new(struct space *old_space) > { > + if (old_space->is_in_alter) { > + tnt_raise(ClientError, ER_ALTER_SPACE, space_name(old_space), > + "the space is already being modified"); > + } > struct alter_space *alter = > region_calloc_object_xc(&in_txn()->region, struct alter_space); > rlist_create(&alter->ops); > @@ -716,6 +719,8 @@ alter_space_new(struct space *old_space) > alter->new_min_field_count = old_space->format->min_field_count; > else > alter->new_min_field_count = 0; > + > + old_space->is_in_alter = true; > return alter; > } > > @@ -723,6 +728,11 @@ alter_space_new(struct space *old_space) > static void > alter_space_delete(struct alter_space *alter) > { > + if (alter->old_space != NULL) { > + assert(alter->old_space->is_in_alter); > + alter->old_space->is_in_alter = false; > + } > + > /* Destroy the ops. */ > while (! rlist_empty(&alter->ops)) { > AlterSpaceOp *op = rlist_shift_entry(&alter->ops, > @@ -769,6 +779,7 @@ alter_space_commit(struct trigger *trigger, void *event) > * going to use it. > */ > space_delete(alter->old_space); > + alter->old_space = NULL; > alter_space_delete(alter); > } > > @@ -3570,49 +3581,6 @@ on_replace_dd_space_sequence(struct trigger * /* trigger */, void *event) > > /* }}} sequence */ > > -static void > -unlock_after_dd(struct trigger *trigger, void *event) > -{ > - (void) trigger; > - (void) event; > - /* > - * In case of yielding journal this trigger will be processed > - * in a context of tx_prio endpoint instead of a context of > - * a fiber which has this latch locked. So steal the latch first. > - */ > - latch_steal(&schema_lock); > - latch_unlock(&schema_lock); > -} > - > -static void > -lock_before_dd(struct trigger *trigger, void *event) > -{ > - (void) trigger; > - if (fiber() == latch_owner(&schema_lock)) > - return; > - struct txn *txn = (struct txn *)event; > - /* > - * This trigger is executed before any check and may yield > - * on the latch lock. But a yield in a non-autocommit > - * memtx transaction will roll it back silently, rather > - * than produce an error, which is very confusing. > - * So don't try to lock a latch if there is > - * a multi-statement transaction. > - */ > - txn_check_singlestatement_xc(txn, "DDL"); > - struct trigger *on_commit = > - txn_alter_trigger_new(unlock_after_dd, NULL); > - struct trigger *on_rollback = > - txn_alter_trigger_new(unlock_after_dd, NULL); > - /* > - * Setting triggers doesn't fail. Lock the latch last > - * to avoid leaking the latch in case of exception. > - */ > - txn_on_commit(txn, on_commit); > - txn_on_rollback(txn, on_rollback); > - latch_lock(&schema_lock); > -} > - > /** > * Trigger invoked on rollback in the _trigger space. > * Since rollback trigger is invoked after insertion to hash and space, > @@ -4473,18 +4441,6 @@ struct trigger on_replace_space_sequence = { > RLIST_LINK_INITIALIZER, on_replace_dd_space_sequence, NULL, NULL > }; > > -struct trigger on_stmt_begin_space = { > - RLIST_LINK_INITIALIZER, lock_before_dd, NULL, NULL > -}; > - > -struct trigger on_stmt_begin_index = { > - RLIST_LINK_INITIALIZER, lock_before_dd, NULL, NULL > -}; > - > -struct trigger on_stmt_begin_truncate = { > - RLIST_LINK_INITIALIZER, lock_before_dd, NULL, NULL > -}; > - > struct trigger on_replace_trigger = { > RLIST_LINK_INITIALIZER, on_replace_dd_trigger, NULL, NULL > }; > diff --git a/src/box/alter.h b/src/box/alter.h > index b9ba7b84..c339ccea 100644 > --- a/src/box/alter.h > +++ b/src/box/alter.h > @@ -47,8 +47,5 @@ extern struct trigger on_replace_space_sequence; > extern struct trigger on_replace_trigger; > extern struct trigger on_replace_fk_constraint; > extern struct trigger on_replace_ck_constraint; > -extern struct trigger on_stmt_begin_space; > -extern struct trigger on_stmt_begin_index; > -extern struct trigger on_stmt_begin_truncate; > > #endif /* INCLUDES_TARANTOOL_BOX_ALTER_H */ > diff --git a/src/box/schema.cc b/src/box/schema.cc > index 64412fac..0ab256ed 100644 > --- a/src/box/schema.cc > +++ b/src/box/schema.cc > @@ -37,6 +37,7 @@ > #include "scoped_guard.h" > #include "user.h" > #include "vclock.h" > +#include "fiber.h" > > /** > * @module Data Dictionary > @@ -74,11 +75,6 @@ struct rlist on_alter_space = RLIST_HEAD_INITIALIZER(on_alter_space); > struct rlist on_alter_sequence = RLIST_HEAD_INITIALIZER(on_alter_sequence); > struct rlist on_alter_func = RLIST_HEAD_INITIALIZER(on_alter_func); > > -/** > - * Lock of scheme modification > - */ > -struct latch schema_lock = LATCH_INITIALIZER(schema_lock); > - > struct entity_access entity_access; > > bool > @@ -268,8 +264,7 @@ static void > sc_space_new(uint32_t id, const char *name, > struct key_part_def *key_parts, > uint32_t key_part_count, > - struct trigger *replace_trigger, > - struct trigger *stmt_begin_trigger) > + struct trigger *replace_trigger) > { > struct key_def *key_def = key_def_new(key_parts, key_part_count); > if (key_def == NULL) > @@ -298,8 +293,6 @@ sc_space_new(uint32_t id, const char *name, > space_cache_replace(NULL, space); > if (replace_trigger) > trigger_add(&space->on_replace, replace_trigger); > - if (stmt_begin_trigger) > - trigger_add(&space->on_stmt_begin, stmt_begin_trigger); > /* > * Data dictionary spaces are fully built since: > * - they contain data right from the start > @@ -394,56 +387,56 @@ schema_init() > key_parts[0].fieldno = 0; > key_parts[0].type = FIELD_TYPE_STRING; > sc_space_new(BOX_SCHEMA_ID, "_schema", key_parts, 1, > - &on_replace_schema, NULL); > + &on_replace_schema); > > /* _collation - collation description. */ > key_parts[0].fieldno = 0; > key_parts[0].type = FIELD_TYPE_UNSIGNED; > sc_space_new(BOX_COLLATION_ID, "_collation", key_parts, 1, > - &on_replace_collation, NULL); > + &on_replace_collation); > > /* _space - home for all spaces. */ > sc_space_new(BOX_SPACE_ID, "_space", key_parts, 1, > - &alter_space_on_replace_space, &on_stmt_begin_space); > + &alter_space_on_replace_space); > > /* _truncate - auxiliary space for triggering space truncation. */ > sc_space_new(BOX_TRUNCATE_ID, "_truncate", key_parts, 1, > - &on_replace_truncate, &on_stmt_begin_truncate); > + &on_replace_truncate); > > /* _sequence - definition of all sequence objects. */ > sc_space_new(BOX_SEQUENCE_ID, "_sequence", key_parts, 1, > - &on_replace_sequence, NULL); > + &on_replace_sequence); > > /* _sequence_data - current sequence value. */ > sc_space_new(BOX_SEQUENCE_DATA_ID, "_sequence_data", key_parts, 1, > - &on_replace_sequence_data, NULL); > + &on_replace_sequence_data); > > /* _space_seq - association space <-> sequence. */ > sc_space_new(BOX_SPACE_SEQUENCE_ID, "_space_sequence", key_parts, 1, > - &on_replace_space_sequence, NULL); > + &on_replace_space_sequence); > > /* _user - all existing users */ > - sc_space_new(BOX_USER_ID, "_user", key_parts, 1, &on_replace_user, NULL); > + sc_space_new(BOX_USER_ID, "_user", key_parts, 1, &on_replace_user); > > /* _func - all executable objects on which one can have grants */ > - sc_space_new(BOX_FUNC_ID, "_func", key_parts, 1, &on_replace_func, NULL); > + sc_space_new(BOX_FUNC_ID, "_func", key_parts, 1, &on_replace_func); > /* > * _priv - association user <-> object > * The real index is defined in the snapshot. > */ > - sc_space_new(BOX_PRIV_ID, "_priv", key_parts, 1, &on_replace_priv, NULL); > + sc_space_new(BOX_PRIV_ID, "_priv", key_parts, 1, &on_replace_priv); > /* > * _cluster - association instance uuid <-> instance id > * The real index is defined in the snapshot. > */ > sc_space_new(BOX_CLUSTER_ID, "_cluster", key_parts, 1, > - &on_replace_cluster, NULL); > + &on_replace_cluster); > > /* _trigger - all existing SQL triggers. */ > key_parts[0].fieldno = 0; > key_parts[0].type = FIELD_TYPE_STRING; > sc_space_new(BOX_TRIGGER_ID, "_trigger", key_parts, 1, > - &on_replace_trigger, NULL); > + &on_replace_trigger); > > /* _index - definition of all space indexes. */ > key_parts[0].fieldno = 0; /* space id */ > @@ -451,7 +444,7 @@ schema_init() > key_parts[1].fieldno = 1; /* index id */ > key_parts[1].type = FIELD_TYPE_UNSIGNED; > sc_space_new(BOX_INDEX_ID, "_index", key_parts, 2, > - &alter_space_on_replace_index, &on_stmt_begin_index); > + &alter_space_on_replace_index); > > /* _fk_сonstraint - foreign keys constraints. */ > key_parts[0].fieldno = 0; /* constraint name */ > @@ -459,7 +452,7 @@ schema_init() > key_parts[1].fieldno = 1; /* child space */ > key_parts[1].type = FIELD_TYPE_UNSIGNED; > sc_space_new(BOX_FK_CONSTRAINT_ID, "_fk_constraint", key_parts, 2, > - &on_replace_fk_constraint, NULL); > + &on_replace_fk_constraint); > > /* _ck_сonstraint - check constraints. */ > key_parts[0].fieldno = 0; /* space id */ > @@ -467,7 +460,7 @@ schema_init() > key_parts[1].fieldno = 1; /* constraint name */ > key_parts[1].type = FIELD_TYPE_STRING; > sc_space_new(BOX_CK_CONSTRAINT_ID, "_ck_constraint", key_parts, 2, > - &on_replace_ck_constraint, NULL); > + &on_replace_ck_constraint); > > /* > * _vinyl_deferred_delete - blackhole that is needed > diff --git a/src/box/schema.h b/src/box/schema.h > index f0039b29..9f8f6345 100644 > --- a/src/box/schema.h > +++ b/src/box/schema.h > @@ -35,7 +35,6 @@ > #include /* snprintf */ > #include "error.h" > #include "space.h" > -#include "latch.h" > > #if defined(__cplusplus) > extern "C" { > @@ -48,11 +47,6 @@ extern uint32_t space_cache_version; > extern struct rlist on_schema_init; > > /** > - * Lock of schema modification > - */ > -extern struct latch schema_lock; > - > -/** > * Try to look up a space by space number in the space cache. > * FFI-friendly no-exception-thrown space lookup function. > * > diff --git a/src/box/space.c b/src/box/space.c > index b6ad87bf..e7babb2a 100644 > --- a/src/box/space.c > +++ b/src/box/space.c > @@ -137,7 +137,6 @@ space_create(struct space *space, struct engine *engine, > space->index_id_max = index_id_max; > rlist_create(&space->before_replace); > rlist_create(&space->on_replace); > - rlist_create(&space->on_stmt_begin); > space->run_triggers = true; > > space->format = format; > @@ -219,7 +218,6 @@ space_delete(struct space *space) > tuple_format_unref(space->format); > trigger_destroy(&space->before_replace); > trigger_destroy(&space->on_replace); > - trigger_destroy(&space->on_stmt_begin); > space_def_delete(space->def); > /* > * SQL Triggers should be deleted with > diff --git a/src/box/space.h b/src/box/space.h > index 949f37d4..2cc0643d 100644 > --- a/src/box/space.h > +++ b/src/box/space.h > @@ -154,8 +154,6 @@ struct space { > struct rlist before_replace; > /** Triggers fired after space_replace() -- see txn_commit_stmt(). */ > struct rlist on_replace; > - /** Triggers fired before space statement */ > - struct rlist on_stmt_begin; > /** SQL Trigger list. */ > struct sql_trigger *sql_triggers; > /** > @@ -183,6 +181,11 @@ struct space { > /** Enable/disable triggers. */ > bool run_triggers; > /** > + * Set if the space is being altered and hence any other > + * DDL request must be rejected. > + */ > + bool is_in_alter; > + /** > * Space format or NULL if space does not have format > * (sysview engine, for example). > */ > diff --git a/src/box/txn.c b/src/box/txn.c > index 95076773..818f405b 100644 > --- a/src/box/txn.c > +++ b/src/box/txn.c > @@ -242,9 +242,6 @@ txn_begin_stmt(struct txn *txn, struct space *space) > if (space == NULL) > return 0; > > - if (trigger_run(&space->on_stmt_begin, txn) != 0) > - goto fail; > - > struct engine *engine = space->engine; > if (txn_begin_in_engine(engine, txn) != 0) > goto fail; > diff --git a/src/box/vy_lsm.c b/src/box/vy_lsm.c > index 3e134cc1..48590521 100644 > --- a/src/box/vy_lsm.c > +++ b/src/box/vy_lsm.c > @@ -38,6 +38,7 @@ > #include > > #include "diag.h" > +#include "fiber.h" > #include "errcode.h" > #include "histogram.h" > #include "index_def.h" > diff --git a/src/lib/core/latch.h b/src/lib/core/latch.h > index 58094256..49c59cf6 100644 > --- a/src/lib/core/latch.h > +++ b/src/lib/core/latch.h > @@ -156,16 +156,6 @@ latch_trylock(struct latch *l) > } > > /** > - * Take a latch ownership > - */ > -static inline void > -latch_steal(struct latch *l) > -{ > - assert(l->owner != NULL); > - l->owner = fiber(); > -} > - > -/** > * \copydoc box_latch_unlock > */ > static inline void > diff --git a/test/box/on_replace.result b/test/box/on_replace.result > index a2f51409..b71c9878 100644 > --- a/test/box/on_replace.result > +++ b/test/box/on_replace.result > @@ -477,7 +477,7 @@ t = s:on_replace(function () s:create_index('sec') end, t) > ... > s:replace({2, 3}) > --- > -- error: DDL does not support multi-statement transactions > +- error: Space _index does not support multi-statement transactions > ... > t = s:on_replace(function () box.schema.user.create('newu') end, t) > --- > @@ -512,7 +512,7 @@ t = s:on_replace(function () s:drop() end, t) > ... > s:replace({5, 6}) > --- > -- error: DDL does not support multi-statement transactions > +- error: Space _index does not support multi-statement transactions > ... > t = s:on_replace(function () box.schema.func.create('newf') end, t) > --- > @@ -533,14 +533,14 @@ t = s:on_replace(function () s:rename('newname') end, t) > ... > s:replace({8, 9}) > --- > -- error: DDL does not support multi-statement transactions > +- error: Space _space does not support multi-statement transactions > ... > t = s:on_replace(function () s.index.pk:rename('newname') end, t) > --- > ... > s:replace({9, 10}) > --- > -- error: DDL does not support multi-statement transactions > +- error: Space _index does not support multi-statement transactions > ... > s:select() > --- > diff --git a/test/box/transaction.result b/test/box/transaction.result > index bc50735b..9da53e5b 100644 > --- a/test/box/transaction.result > +++ b/test/box/transaction.result > @@ -87,7 +87,7 @@ while f:status() ~= 'dead' do fiber.sleep(0) end; > -- some operation involves more than one ddl spaces, so they should fail > box.begin() box.schema.space.create('test'); > --- > -- error: DDL does not support multi-statement transactions > +- error: Space _space does not support multi-statement transactions > ... > box.rollback(); > --- > diff --git a/test/engine/errinj_ddl.result b/test/engine/errinj_ddl.result > index a28223a6..60bf34d9 100644 > --- a/test/engine/errinj_ddl.result > +++ b/test/engine/errinj_ddl.result > @@ -404,3 +404,101 @@ s:select() > s:drop() > --- > ... > +-- > +-- Concurrent DDL. > +-- > +s1 = box.schema.space.create('test1', {engine = engine}) > +--- > +... > +_ = s1:create_index('pk') > +--- > +... > +for i = 1, 5 do s1:insert{i, i} end > +--- > +... > +s2 = box.schema.space.create('test2', {engine = engine}) > +--- > +... > +_ = s2:create_index('pk') > +--- > +... > +for i = 1, 5 do s2:insert{i, i} end > +--- > +... > +errinj.set("ERRINJ_BUILD_INDEX_DELAY", true) > +--- > +- ok > +... > +ch = fiber.channel(1) > +--- > +... > +_ = fiber.create(function() s1:create_index('sk', {parts = {2, 'unsigned'}}) ch:put(true) end) > +--- > +... > +-- Modification of the same space must fail while an index creation > +-- is in progress. > +s1:format{{'a', 'unsigned'}, {'b', 'unsigned'}} > +--- > +- error: 'Can''t modify space ''test1'': the space is already being modified' > +... > +s1:truncate() > +--- > +- error: 'Can''t modify space ''test1'': the space is already being modified' > +... > +_ = s1:create_index('tk', {parts = {3, 'unsigned'}}) > +--- > +- error: 'Can''t modify space ''test1'': the space is already being modified' > +... > +s1:drop() > +--- > +- error: 'Can''t modify space ''test1'': the space is already being modified' > +... > +-- Modification of another space must work though. > +s2:format{{'a', 'unsigned'}, {'b', 'unsigned'}} > +--- > +... > +s2:truncate() > +--- > +... > +_ = s2:create_index('sk', {parts = {2, 'unsigned'}}) > +--- > +... > +s2:drop() > +--- > +... > +s2 = box.schema.space.create('test2', {engine = engine}) > +--- > +... > +_ = s2:create_index('pk') > +--- > +... > +s2:drop() > +--- > +... > +errinj.set("ERRINJ_BUILD_INDEX_DELAY", false) > +--- > +- ok > +... > +ch:get() > +--- > +- true > +... > +s1.index.pk:select() > +--- > +- - [1, 1] > + - [2, 2] > + - [3, 3] > + - [4, 4] > + - [5, 5] > +... > +s1.index.sk:select() > +--- > +- - [1, 1] > + - [2, 2] > + - [3, 3] > + - [4, 4] > + - [5, 5] > +... > +s1:drop() > +--- > +... > diff --git a/test/engine/errinj_ddl.test.lua b/test/engine/errinj_ddl.test.lua > index c40eae93..207610c4 100644 > --- a/test/engine/errinj_ddl.test.lua > +++ b/test/engine/errinj_ddl.test.lua > @@ -188,3 +188,41 @@ ch:get() > > s:select() > s:drop() > + > +-- > +-- Concurrent DDL. > +-- > +s1 = box.schema.space.create('test1', {engine = engine}) > +_ = s1:create_index('pk') > +for i = 1, 5 do s1:insert{i, i} end > + > +s2 = box.schema.space.create('test2', {engine = engine}) > +_ = s2:create_index('pk') > +for i = 1, 5 do s2:insert{i, i} end > + > +errinj.set("ERRINJ_BUILD_INDEX_DELAY", true) > +ch = fiber.channel(1) > +_ = fiber.create(function() s1:create_index('sk', {parts = {2, 'unsigned'}}) ch:put(true) end) > + > +-- Modification of the same space must fail while an index creation > +-- is in progress. > +s1:format{{'a', 'unsigned'}, {'b', 'unsigned'}} > +s1:truncate() > +_ = s1:create_index('tk', {parts = {3, 'unsigned'}}) > +s1:drop() > + > +-- Modification of another space must work though. > +s2:format{{'a', 'unsigned'}, {'b', 'unsigned'}} > +s2:truncate() > +_ = s2:create_index('sk', {parts = {2, 'unsigned'}}) > +s2:drop() > +s2 = box.schema.space.create('test2', {engine = engine}) > +_ = s2:create_index('pk') > +s2:drop() > + > +errinj.set("ERRINJ_BUILD_INDEX_DELAY", false) > +ch:get() > + > +s1.index.pk:select() > +s1.index.sk:select() > +s1:drop() > -- > 2.11.0 > -- Konstantin Osipov, Moscow, Russia