[Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri May 21 22:26:17 MSK 2021


Hi! Thanks for the review!

On 13.05.2021 22:23, Oleg Babin wrote:
> Hi! Thanks for your patch! Three comments below.
> 
> On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
>> +local function schema_install_triggers()
>> +    local _bucket = box.space._bucket
>> +    if M.bucket_on_replace then
>> +        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
>> +                              M.bucket_on_replace)
>> +        if not ok then
>> +            log.warn('Could not drop old trigger from '..
>> +                     '_bucket: %s', err)
>> +        end
>> +    end
>> +    _bucket:on_replace(bucket_generation_increment)
>> +    M.bucket_on_replace = bucket_generation_increment
>> +end
>> +
>> +local function schema_install_on_replace(_, new)
>> +    -- Wait not just for _bucket to appear, but for the entire
>> +    -- schema. This might be important if the schema will ever
>> +    -- consist of more than just _bucket.
>> +    if new[1] ~= 'vshard_version' then
> 
> 
> Is it possible that someone drops something from _schema space? I mean that in
> 
> such case new will be nil.
> 
> I could provide one such case - when user needs to execute box.once once again, (s)he
> 
> drops corresponding record from _schema space.

Great note! Fixed:

====================
diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
index 3c5a08c..d6eaca5 100644
--- a/test/router/boot_replica_first.result
+++ b/test/router/boot_replica_first.result
@@ -42,6 +42,25 @@ util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
  | - attempt to index field '_bucket' (a nil value)
  | ...
 
+-- While waiting for the schema, gracefully handle deletions from _schema.
+ro = box.cfg.read_only
+ | ---
+ | ...
+box.cfg{read_only = false}
+ | ---
+ | ...
+box.space._schema:insert({'gh-276'})
+ | ---
+ | - ['gh-276']
+ | ...
+box.space._schema:delete({'gh-276'})
+ | ---
+ | - ['gh-276']
+ | ...
+box.cfg{read_only = ro}
+ | ---
+ | ...
+
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
index f973f2b..f1a3c19 100644
--- a/test/router/boot_replica_first.test.lua
+++ b/test/router/boot_replica_first.test.lua
@@ -19,6 +19,13 @@ vshard.storage.cfg(cfg, instance_uuid)
 -- _bucket is not created yet. Will fail.
 util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
 
+-- While waiting for the schema, gracefully handle deletions from _schema.
+ro = box.cfg.read_only
+box.cfg{read_only = false}
+box.space._schema:insert({'gh-276'})
+box.space._schema:delete({'gh-276'})
+box.cfg{read_only = ro}
+
 test_run:switch('default')
 util.map_evals(test_run, {REPLICASET_1}, 'bootstrap_storage(\'memtx\')')
 
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index b1a20d6..5285ee0 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -424,7 +424,7 @@ local function schema_install_on_replace(_, new)
     -- Wait not just for _bucket to appear, but for the entire
     -- schema. This might be important if the schema will ever
     -- consist of more than just _bucket.
-    if new[1] ~= 'vshard_version' then
+    if new == nil or new[1] ~= 'vshard_version' then
         return
     end
     schema_install_triggers()
====================

>> +        return
>> +    end
>> +    schema_install_triggers()
>> +
>> +    local _schema = box.space._schema
> 
> nit: you could get it from the third argument of the trigger. But up to you.

What do you mean? AFAIS, the third argument is space name, not the space
itself. And I need the space itself to clear the trigger.

>> +    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
>> +    if not ok then
>> +        log.warn('Could not drop trigger from _schema inside of the '..
>> +                 'trigger: %s', err)
>> +    end
>> +    M.schema_on_replace = nil
>> +    -- Drop the caches which might have been created while the
>> +    -- schema was being replicated.
>> +    bucket_generation_increment()
>> +end
>> @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>>       local uri = luri.parse(this_replica.uri)
>>       schema_upgrade(is_master, uri.login, uri.password)
>>   -    if M.bucket_on_replace then
>> -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
>> -        M.bucket_on_replace = nil
>> -    end
>> -    lref.cfg()
>> -    if is_master then
>> -        box.space._bucket:on_replace(bucket_generation_increment)
>> -        M.bucket_on_replace = bucket_generation_increment
>> +    if is_master or box.space._bucket then
> 
> 
> Is it possible that "is_master" is true and _bucket is nil?
> 
> Why simple `if box.space._bucket` is not enough?

I want to emphasize that on the master the bootstrap is finished anyway,
because it performs the bootstrap. On the master no need to check for
_bucket nor wait for the schema anyhow.


More information about the Tarantool-patches mailing list