Tarantool development patches archive
* [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug
@ 2021-05-13 11:07 Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-13 11:07 UTC (permalink / raw)
  To: tarantool-patches, olegrok, yaroslav.dynnikov

The patchset fixes the issue with buckets_count() returning incorrect values on
replica nodes due to cache inconsistency.
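
To make the symptom concrete, here is a minimal sketch of the bug as
seen from a replica (hypothetical session; the bucket ids and counts
are illustrative, the function names are real vshard API):

    -- On the master:
    vshard.storage.bucket_force_create(1, 5)   -- buckets 1..5 appear

    -- On the replica, after replication applied the rows to _bucket:
    vshard.storage.buckets_count()  -- expected 5, but it keeps
                                    -- returning whatever value was
                                    -- cached on the first call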

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-276-bucket-count-outdated
Issue: https://github.com/tarantool/tarantool/issues/276

Vladislav Shpilevoy (2):
  test: fix test output on latest Tarantool
  vshard: fix buckets_count() on replicas

 test/misc/reconfigure.result            | 21 ++++---
 test/misc/reconfigure.test.lua          | 13 +++--
 test/router/boot_replica_first.result   |  9 ++-
 test/router/boot_replica_first.test.lua |  7 ++-
 test/storage/storage.result             | 52 ++++++++++++++++-
 test/storage/storage.test.lua           | 22 ++++++-
 vshard/storage/init.lua                 | 78 ++++++++++++++++++++++---
 7 files changed, 171 insertions(+), 31 deletions(-)

-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool
  2021-05-13 11:07 [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-13 11:07 ` Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
  2021-05-19 21:50   ` Yaroslav Dynnikov via Tarantool-patches
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas Vladislav Shpilevoy via Tarantool-patches
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-13 11:07 UTC (permalink / raw)
  To: tarantool-patches, olegrok, yaroslav.dynnikov

Duplicate key error at insertion into a space on the latest
Tarantool changed its message, which broke one of the tests. The
patch updates the test so it checks only the needed part of the
message and does not depend on the Tarantool version anymore.
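
For context, the wording change that broke the test looked roughly
like this (the new message is approximate, which is exactly why the
test now pins only the stable prefix):

    Old: Duplicate key exists in unique index 'pk' in space '_bucket'
    New: Duplicate key exists in unique index "pk" in space "_bucket"
         with old tuple - [...] and new tuple - [...]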
---
 test/storage/storage.result   | 8 +++++---
 test/storage/storage.test.lua | 3 ++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/test/storage/storage.result b/test/storage/storage.result
index 2c9784a..d18b7f8 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -179,10 +179,12 @@ vshard.storage.buckets_info()
     status: active
     id: 1
 ...
-vshard.storage.bucket_force_create(1) -- error
+ok, err = vshard.storage.bucket_force_create(1)
 ---
-- null
-- Duplicate key exists in unique index 'pk' in space '_bucket'
+...
+assert(not ok and err.message:match("Duplicate key exists"))
+---
+- Duplicate key exists
 ...
 vshard.storage.bucket_force_drop(1)
 ---
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index 33f0498..97558f6 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -56,7 +56,8 @@ vshard.storage.sync(100500)
 vshard.storage.buckets_info()
 vshard.storage.bucket_force_create(1)
 vshard.storage.buckets_info()
-vshard.storage.bucket_force_create(1) -- error
+ok, err = vshard.storage.bucket_force_create(1)
+assert(not ok and err.message:match("Duplicate key exists"))
 vshard.storage.bucket_force_drop(1)
 
 vshard.storage.buckets_info()
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-13 11:07 [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-13 11:07 ` Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
  2021-05-19 21:51   ` Yaroslav Dynnikov via Tarantool-patches
  2021-05-13 20:23 ` [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Oleg Babin via Tarantool-patches
  2021-05-25 20:43 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 2 replies; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-13 11:07 UTC (permalink / raw)
  To: tarantool-patches, olegrok, yaroslav.dynnikov

vshard.storage.buckets_count() uses a cached count value when there
are no changes in the _bucket space. The changes are detected using
an on_replace trigger on the _bucket space. But it wasn't installed
on the replicas.

As a result, on a replica vshard.storage.buckets_count() didn't
change at all after being called once.

The patch makes replicas install the trigger on _bucket and update
the bucket generation + drop the caches properly.
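
A simplified sketch of the generation-based cache this refers to (the
real code in vshard/storage/init.lua keeps more state, but the names
below match it):

    local M = {bucket_generation = 0, bucket_count_cache = nil}

    local function bucket_generation_increment()
        -- Any change in _bucket invalidates the cached count.
        M.bucket_generation = M.bucket_generation + 1
        M.bucket_count_cache = nil
    end

    local function buckets_count()
        if M.bucket_count_cache == nil then
            M.bucket_count_cache = box.space._bucket:count()
        end
        return M.bucket_count_cache
    end

    -- The trigger below is what was missing on the replicas:
    box.space._bucket:on_replace(bucket_generation_increment)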

The tricky part is that _bucket might not exist on a replica if it
was vshard-configured earlier than the master. In that case the
trigger installation on _bucket is delayed until the vshard schema
is replicated from the master.

An alternative was to introduce a special version of
buckets_count() on replicas which does not use the cache at all.
But the trigger is going to be useful in scope of #173 where it
should drop bucket refs on the replica.

Closes #276
Needed for #173
---
 test/misc/reconfigure.result            | 21 ++++---
 test/misc/reconfigure.test.lua          | 13 +++--
 test/router/boot_replica_first.result   |  9 ++-
 test/router/boot_replica_first.test.lua |  7 ++-
 test/storage/storage.result             | 44 ++++++++++++++
 test/storage/storage.test.lua           | 19 ++++++
 vshard/storage/init.lua                 | 78 ++++++++++++++++++++++---
 7 files changed, 164 insertions(+), 27 deletions(-)

diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index 3b34841..a03ab5a 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -271,9 +271,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...
 _ = test_run:switch('storage_2_a')
 ---
@@ -305,10 +305,13 @@ box.cfg.read_only
 ---
 - true
 ...
--- Should be zero on the slave node. Even though earlier the node was a master.
-#box.space._bucket:on_replace()
+--
+-- gh-276: replica should have triggers. This is important for proper update of
+-- caches and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 0
+- true
 ...
 _ = test_run:switch('storage_2_b')
 ---
@@ -340,9 +343,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...
 _ = test_run:switch('storage_3_a')
 ---
@@ -373,9 +376,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...
 _ = test_run:switch('router_1')
 ---
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 348628c..61ea3c0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -109,7 +109,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 
 _ = test_run:switch('storage_2_a')
 info = vshard.storage.info()
@@ -119,8 +119,11 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
--- Should be zero on the slave node. Even though earlier the node was a master.
-#box.space._bucket:on_replace()
+--
+-- gh-276: replica should have triggers. This is important for proper update of
+-- caches and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
 
 _ = test_run:switch('storage_2_b')
 info = vshard.storage.info()
@@ -130,7 +133,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 
 _ = test_run:switch('storage_3_a')
 info = vshard.storage.info()
@@ -140,7 +143,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 
 _ = test_run:switch('router_1')
 info = vshard.router.info()
diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
index 1705230..3c5a08c 100644
--- a/test/router/boot_replica_first.result
+++ b/test/router/boot_replica_first.result
@@ -76,10 +76,13 @@ vshard.storage.call(1, 'read', 'echo', {100})
  |   message: 'Cannot perform action with bucket 1, reason: Not found'
  |   name: WRONG_BUCKET
  | ...
--- Should not have triggers.
-#box.space._bucket:on_replace()
+--
+-- gh-276: should have triggers. This is important for proper update of caches
+-- and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
  | ---
- | - 0
+ | - true
  | ...
 
 test_run:switch('router')
diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
index 7b1b3fd..f973f2b 100644
--- a/test/router/boot_replica_first.test.lua
+++ b/test/router/boot_replica_first.test.lua
@@ -29,8 +29,11 @@ test_run:switch('box_1_b')
 test_run:wait_lsn('box_1_b', 'box_1_a')
 -- Fails, but gracefully. Because the bucket is not found here.
 vshard.storage.call(1, 'read', 'echo', {100})
--- Should not have triggers.
-#box.space._bucket:on_replace()
+--
+-- gh-276: should have triggers. This is important for proper update of caches
+-- and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
 
 test_run:switch('router')
 vshard.router.bootstrap()
diff --git a/test/storage/storage.result b/test/storage/storage.result
index d18b7f8..570d9c6 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -708,6 +708,24 @@ assert(vshard.storage.buckets_count() == 0)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+--
+-- gh-276: bucket count cache should be properly updated on the replica nodes.
+-- For that the replicas must also install on_replace trigger on _bucket space
+-- to watch for changes.
+--
+assert(vshard.storage.buckets_count() == 0)
+---
+- true
+...
+_ = test_run:switch('storage_1_a')
+---
+...
 vshard.storage.bucket_force_create(1, 5)
 ---
 - true
@@ -716,6 +734,19 @@ assert(vshard.storage.buckets_count() == 5)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+assert(vshard.storage.buckets_count() == 5)
+---
+- true
+...
+_ = test_run:switch('storage_1_a')
+---
+...
 vshard.storage.bucket_force_create(6, 5)
 ---
 - true
@@ -724,9 +755,22 @@ assert(vshard.storage.buckets_count() == 10)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+assert(vshard.storage.buckets_count() == 10)
+---
+- true
+...
 --
 -- Bucket_generation_wait() registry function.
 --
+_ = test_run:switch('storage_1_a')
+---
+...
 lstorage = require('vshard.registry').storage
 ---
 ...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index 97558f6..494e2e8 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -201,14 +201,33 @@ for bid, _ in pairs(buckets) do vshard.storage.bucket_force_drop(bid) end
 
 _ = test_run:switch('storage_1_a')
 assert(vshard.storage.buckets_count() == 0)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+--
+-- gh-276: bucket count cache should be properly updated on the replica nodes.
+-- For that the replicas must also install on_replace trigger on _bucket space
+-- to watch for changes.
+--
+assert(vshard.storage.buckets_count() == 0)
+
+_ = test_run:switch('storage_1_a')
 vshard.storage.bucket_force_create(1, 5)
 assert(vshard.storage.buckets_count() == 5)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+assert(vshard.storage.buckets_count() == 5)
+
+_ = test_run:switch('storage_1_a')
 vshard.storage.bucket_force_create(6, 5)
 assert(vshard.storage.buckets_count() == 10)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+assert(vshard.storage.buckets_count() == 10)
 
 --
 -- Bucket_generation_wait() registry function.
 --
+_ = test_run:switch('storage_1_a')
 lstorage = require('vshard.registry').storage
 ok, err = lstorage.bucket_generation_wait(-1)
 assert(not ok and err.message)
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index c4400f7..b1a20d6 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -115,6 +115,14 @@ if not M then
         -- replace the old function is to keep its reference.
         --
         bucket_on_replace = nil,
+        --
+        -- Reference to the function used as on_replace trigger on
+        -- _schema space. Saved explicitly for the same reason as
+        -- _bucket on_replace.
+        -- It is used by replicas to wait for schema bootstrap
+        -- because they might be configured earlier than the
+        -- master.
+        schema_on_replace = nil,
         -- Fast alternative to box.space._bucket:count(). But may be nil. Reset
         -- on each generation change.
         bucket_count_cache = nil,
@@ -398,6 +406,62 @@ local function schema_version_make(ver)
     return setmetatable(ver, schema_version_mt)
 end
 
+local function schema_install_triggers()
+    local _bucket = box.space._bucket
+    if M.bucket_on_replace then
+        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
+                              M.bucket_on_replace)
+        if not ok then
+            log.warn('Could not drop old trigger from '..
+                     '_bucket: %s', err)
+        end
+    end
+    _bucket:on_replace(bucket_generation_increment)
+    M.bucket_on_replace = bucket_generation_increment
+end
+
+local function schema_install_on_replace(_, new)
+    -- Wait not just for _bucket to appear, but for the entire
+    -- schema. This might be important if the schema will ever
+    -- consist of more than just _bucket.
+    if new[1] ~= 'vshard_version' then
+        return
+    end
+    schema_install_triggers()
+
+    local _schema = box.space._schema
+    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
+    if not ok then
+        log.warn('Could not drop trigger from _schema inside of the '..
+                 'trigger: %s', err)
+    end
+    M.schema_on_replace = nil
+    -- Drop the caches which might have been created while the
+    -- schema was being replicated.
+    bucket_generation_increment()
+end
+
+--
+-- Install the triggers later when there is an actual schema to install them on.
+-- On replicas it might happen that they are vshard-configured earlier than the
+-- master and therefore don't have the schema right away.
+--
+local function schema_install_triggers_delayed()
+    log.info('Could not find _bucket space to install triggers - delayed '..
+             'until the schema is replicated')
+    assert(not box.space._bucket)
+    local _schema = box.space._schema
+    if M.schema_on_replace then
+        local ok, err = pcall(_schema.on_replace, _schema, nil,
+                              M.schema_on_replace)
+        if not ok then
+            log.warn('Could not drop trigger from _schema: %s', err)
+        end
+    end
+    _schema:on_replace(schema_install_on_replace)
+    M.schema_on_replace = schema_install_on_replace
+end
+
 -- VShard versioning works in 4 numbers: major, minor, patch, and
 -- a last helper number incremented on every schema change, if
 -- first 3 numbers stay not changed. That happens when users take
@@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
     local uri = luri.parse(this_replica.uri)
     schema_upgrade(is_master, uri.login, uri.password)
 
-    if M.bucket_on_replace then
-        box.space._bucket:on_replace(nil, M.bucket_on_replace)
-        M.bucket_on_replace = nil
-    end
-    lref.cfg()
-    if is_master then
-        box.space._bucket:on_replace(bucket_generation_increment)
-        M.bucket_on_replace = bucket_generation_increment
+    if is_master or box.space._bucket then
+        schema_install_triggers()
+    else
+        schema_install_triggers_delayed()
     end
 
+    lref.cfg()
     lsched.cfg(vshard_cfg)
+
     lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
     lreplicaset.outdate_replicasets(M.replicasets)
     M.replicasets = new_replicasets
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug
  2021-05-13 11:07 [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-13 20:23 ` Oleg Babin via Tarantool-patches
  2021-05-25 20:43 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 0 replies; 14+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-05-13 20:23 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for the patchset. I think the links should refer to the
vshard repo, not the tarantool one.

See my comments in following e-mails.

On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
> The patchset fixes the issue with buckets_count() returning incorrect values on
> replica nodes due to cache inconsistency.
>
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-276-bucket-count-outdated
> Issue: https://github.com/tarantool/tarantool/issues/276
>
> Vladislav Shpilevoy (2):
>    test: fix test output on latest Tarantool
>    vshard: fix buckets_count() on replicas
>
>   test/misc/reconfigure.result            | 21 ++++---
>   test/misc/reconfigure.test.lua          | 13 +++--
>   test/router/boot_replica_first.result   |  9 ++-
>   test/router/boot_replica_first.test.lua |  7 ++-
>   test/storage/storage.result             | 52 ++++++++++++++++-
>   test/storage/storage.test.lua           | 22 ++++++-
>   vshard/storage/init.lua                 | 78 ++++++++++++++++++++++---
>   7 files changed, 171 insertions(+), 31 deletions(-)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
  2021-05-19 21:50   ` Yaroslav Dynnikov via Tarantool-patches
  1 sibling, 0 replies; 14+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-05-13 20:23 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for your patch. LGTM.

On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
> Duplicate key error at insertion into a space on the latest
> Tarantool changed its message, which broke one of the tests. The
> patch updates the test so it checks only the needed part of the
> message and does not depend on the Tarantool version anymore.
> ---
>   test/storage/storage.result   | 8 +++++---
>   test/storage/storage.test.lua | 3 ++-
>   2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 2c9784a..d18b7f8 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -179,10 +179,12 @@ vshard.storage.buckets_info()
>       status: active
>       id: 1
>   ...
> -vshard.storage.bucket_force_create(1) -- error
> +ok, err = vshard.storage.bucket_force_create(1)
>   ---
> -- null
> -- Duplicate key exists in unique index 'pk' in space '_bucket'
> +...
> +assert(not ok and err.message:match("Duplicate key exists"))
> +---
> +- Duplicate key exists
>   ...
>   vshard.storage.bucket_force_drop(1)
>   ---
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index 33f0498..97558f6 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -56,7 +56,8 @@ vshard.storage.sync(100500)
>   vshard.storage.buckets_info()
>   vshard.storage.bucket_force_create(1)
>   vshard.storage.buckets_info()
> -vshard.storage.bucket_force_create(1) -- error
> +ok, err = vshard.storage.bucket_force_create(1)
> +assert(not ok and err.message:match("Duplicate key exists"))
>   vshard.storage.bucket_force_drop(1)
>   
>   vshard.storage.buckets_info()

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
  2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
  2021-05-19 21:51   ` Yaroslav Dynnikov via Tarantool-patches
  1 sibling, 1 reply; 14+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-05-13 20:23 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for your patch! Three comments below.

On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
> +local function schema_install_triggers()
> +    local _bucket = box.space._bucket
> +    if M.bucket_on_replace then
> +        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
> +                              M.bucket_on_replace)
> +        if not ok then
> +            log.warn('Could not drop old trigger from '..
> +                     '_bucket: %s', err)
> +        end
> +    end
> +    _bucket:on_replace(bucket_generation_increment)
> +    M.bucket_on_replace = bucket_generation_increment
> +end
> +
> +local function schema_install_on_replace(_, new)
> +    -- Wait not just for _bucket to appear, but for the entire
> +    -- schema. This might be important if the schema will ever
> +    -- consist of more than just _bucket.
> +    if new[1] ~= 'vshard_version' then


Is it possible that someone drops something from _schema space? I mean
that in such case new will be nil.

I could provide one such case - when a user needs to execute box.once
once again, (s)he drops the corresponding record from _schema space.
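
A sketch of that scenario (do_bootstrap and the 'once'-prefixed marker
key are illustrative, not taken from the patch):

    box.once('bootstrap', do_bootstrap)  -- leaves a marker in _schema
    -- Re-arming box.once later means deleting that marker:
    box.space._schema:delete('oncebootstrap')
    -- The on_replace trigger then fires with new == nil, so the
    -- new[1] access above would raise an "attempt to index a nil
    -- value" error.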

> +        return
> +    end
> +    schema_install_triggers()
> +
> +    local _schema = box.space._schema

nit: you could get it from the third argument of the trigger. But up to you.

> +    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
> +    if not ok then
> +        log.warn('Could not drop trigger from _schema inside of the '..
> +                 'trigger: %s', err)
> +    end
> +    M.schema_on_replace = nil
> +    -- Drop the caches which might have been created while the
> +    -- schema was being replicated.
> +    bucket_generation_increment()
> +end
> +
> +--
> +-- Install the triggers later when there is an actual schema to install them on.
> +-- On replicas it might happen that they are vshard-configured earlier than the
> +-- master and therefore don't have the schema right away.
> +--
> +local function schema_install_triggers_delayed()
> +    log.info('Could not find _bucket space to install triggers - delayed '..
> +             'until the schema is replicated')
> +    assert(not box.space._bucket)
> +    local _schema = box.space._schema
> +    if M.schema_on_replace then
> +        local ok, err = pcall(_schema.on_replace, _schema, nil,
> +                              M.schema_on_replace)
> +        if not ok then
> +            log.warn('Could not drop trigger from _schema: %s', err)
> +        end
> +    end
> +    _schema:on_replace(schema_install_on_replace)
> +    M.schema_on_replace = schema_install_on_replace
> +end
> +
>   -- VShard versioning works in 4 numbers: major, minor, patch, and
>   -- a last helper number incremented on every schema change, if
>   -- first 3 numbers stay not changed. That happens when users take
> @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>       local uri = luri.parse(this_replica.uri)
>       schema_upgrade(is_master, uri.login, uri.password)
>   
> -    if M.bucket_on_replace then
> -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
> -        M.bucket_on_replace = nil
> -    end
> -    lref.cfg()
> -    if is_master then
> -        box.space._bucket:on_replace(bucket_generation_increment)
> -        M.bucket_on_replace = bucket_generation_increment
> +    if is_master or box.space._bucket then


Is it possible that "is_master" is true and _bucket is nil?

Why is a simple `if box.space._bucket` check not enough?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
@ 2021-05-19 21:50   ` Yaroslav Dynnikov via Tarantool-patches
  2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 1 reply; 14+ messages in thread
From: Yaroslav Dynnikov via Tarantool-patches @ 2021-05-19 21:50 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Yaroslav Dynnikov, tml

[-- Attachment #1: Type: text/plain, Size: 1954 bytes --]

Hi, Vlad

Thanks for the patch.
Find one comment below.

On Thu, 13 May 2021 at 14:07, Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
wrote:

> Duplicate key error at insertion into a space on the latest
> Tarantool changed its message, which broke one of the tests. The
> patch updates the test so it checks only the needed part of the
> message and does not depend on the Tarantool version anymore.
> ---
>  test/storage/storage.result   | 8 +++++---
>  test/storage/storage.test.lua | 3 ++-
>  2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 2c9784a..d18b7f8 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -179,10 +179,12 @@ vshard.storage.buckets_info()
>      status: active
>      id: 1
>  ...
> -vshard.storage.bucket_force_create(1) -- error
> +ok, err = vshard.storage.bucket_force_create(1)
>  ---
> -- null
> -- Duplicate key exists in unique index 'pk' in space '_bucket'
> +...
> +assert(not ok and err.message:match("Duplicate key exists"))
> +---
> +- Duplicate key exists
>  ...
>  vshard.storage.bucket_force_drop(1)
>  ---
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index 33f0498..97558f6 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -56,7 +56,8 @@ vshard.storage.sync(100500)
>  vshard.storage.buckets_info()
>  vshard.storage.bucket_force_create(1)
>  vshard.storage.buckets_info()
> -vshard.storage.bucket_force_create(1) -- error
> +ok, err = vshard.storage.bucket_force_create(1)
> +assert(not ok and err.message:match("Duplicate key exists"))
>

I'd suggest splitting the check in two:

1. ok -- should be false
2. Then check the message matches.

Assertions usually don't provide useful errors.


>  vshard.storage.bucket_force_drop(1)
>
>  vshard.storage.buckets_info()
> --
> 2.24.3 (Apple Git-128)
>
>
Best regards
Yaroslav Dynnikov

[-- Attachment #2: Type: text/html, Size: 3099 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-13 11:07 ` [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas Vladislav Shpilevoy via Tarantool-patches
  2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
@ 2021-05-19 21:51   ` Yaroslav Dynnikov via Tarantool-patches
  2021-05-21 19:30     ` Vladislav Shpilevoy via Tarantool-patches
  1 sibling, 1 reply; 14+ messages in thread
From: Yaroslav Dynnikov via Tarantool-patches @ 2021-05-19 21:51 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Yaroslav Dynnikov, tml

[-- Attachment #1: Type: text/plain, Size: 13769 bytes --]

Hi!

I've got several questions about this patch, find them below.

On Thu, 13 May 2021 at 14:07, Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
wrote:

> vshard.storage.buckets_count() uses a cached count value when there
> are no changes in the _bucket space. The changes are detected using
> an on_replace trigger on the _bucket space. But it wasn't installed
> on the replicas.
>
> As a result, on a replica vshard.storage.buckets_count() didn't
> change at all after being called once.
>
> The patch makes replicas install the trigger on _bucket and update
> the bucket generation + drop the caches properly.
>
> The tricky part is that _bucket might not exist on a replica if it
> was vshard-configured earlier than the master. In that case the
> trigger installation on _bucket is delayed until the vshard schema
> is replicated from the master.
>
> An alternative was to introduce a special version of
> buckets_count() on replicas which does not use the cache at all.
> But the trigger is going to be useful in scope of #173 where it
> should drop bucket refs on the replica.
>
> Closes #276
> Needed for #173
> ---
>  test/misc/reconfigure.result            | 21 ++++---
>  test/misc/reconfigure.test.lua          | 13 +++--
>  test/router/boot_replica_first.result   |  9 ++-
>  test/router/boot_replica_first.test.lua |  7 ++-
>  test/storage/storage.result             | 44 ++++++++++++++
>  test/storage/storage.test.lua           | 19 ++++++
>  vshard/storage/init.lua                 | 78 ++++++++++++++++++++++---
>  7 files changed, 164 insertions(+), 27 deletions(-)
>
> diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
> index 3b34841..a03ab5a 100644
> --- a/test/misc/reconfigure.result
> +++ b/test/misc/reconfigure.result
> @@ -271,9 +271,9 @@ box.cfg.read_only
>  ---
>  - false
>  ...
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>  ---
> -- 1
> +- true
>  ...
>

Why is this change necessary? It seems you now lose the actual value
in case the test fails.
This question applies to many cases below.

No matter what testing framework is used, I think that

assert_equals(value, 1) -- assertion failed: expected 1, got 2

is better (more helpful) than eager

assert(value == 1) -- assertion failed!
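
A tiny helper in that spirit (hypothetical, not part of the patch)
would keep the actual value visible while staying a one-liner at the
call sites:

    local function assert_equals(actual, expected)
        if actual ~= expected then
            error(('assertion failed: expected %s, got %s')
                  :format(expected, actual), 2)
        end
        return true
    end

    assert_equals(#box.space._bucket:on_replace(), 1)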



>  _ = test_run:switch('storage_2_a')
>  ---
> @@ -305,10 +305,13 @@ box.cfg.read_only
>  ---
>  - true
>  ...
> --- Should be zero on the slave node. Even though earlier the node was a master.
> -#box.space._bucket:on_replace()
> +--
> +-- gh-276: replica should have triggers. This is important for proper update of
> +-- caches and in future for discarding refs in scope of gh-173.
> +--
> +assert(#box.space._bucket:on_replace() == 1)
>  ---
> -- 0
> +- true
>  ...
>  _ = test_run:switch('storage_2_b')
>  ---
> @@ -340,9 +343,9 @@ box.cfg.read_only
>  ---
>  - false
>  ...
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>  ---
> -- 1
> +- true
>  ...
>  _ = test_run:switch('storage_3_a')
>  ---
> @@ -373,9 +376,9 @@ box.cfg.read_only
>  ---
>  - false
>  ...
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>  ---
> -- 1
> +- true
>  ...
>  _ = test_run:switch('router_1')
>  ---
> diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
> index 348628c..61ea3c0 100644
> --- a/test/misc/reconfigure.test.lua
> +++ b/test/misc/reconfigure.test.lua
> @@ -109,7 +109,7 @@ table.sort(uris)
>  uris
>  box.cfg.replication
>  box.cfg.read_only
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>
>  _ = test_run:switch('storage_2_a')
>  info = vshard.storage.info()
> @@ -119,8 +119,11 @@ table.sort(uris)
>  uris
>  box.cfg.replication
>  box.cfg.read_only
> --- Should be zero on the slave node. Even though earlier the node was a master.
> -#box.space._bucket:on_replace()
> +--
> +-- gh-276: replica should have triggers. This is important for proper update of
> +-- caches and in future for discarding refs in scope of gh-173.
> +--
> +assert(#box.space._bucket:on_replace() == 1)
>
>  _ = test_run:switch('storage_2_b')
>  info = vshard.storage.info()
> @@ -130,7 +133,7 @@ table.sort(uris)
>  uris
>  box.cfg.replication
>  box.cfg.read_only
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>
>  _ = test_run:switch('storage_3_a')
>  info = vshard.storage.info()
> @@ -140,7 +143,7 @@ table.sort(uris)
>  uris
>  box.cfg.replication
>  box.cfg.read_only
> -#box.space._bucket:on_replace()
> +assert(#box.space._bucket:on_replace() == 1)
>
>  _ = test_run:switch('router_1')
>  info = vshard.router.info()
> diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
> index 1705230..3c5a08c 100644
> --- a/test/router/boot_replica_first.result
> +++ b/test/router/boot_replica_first.result
> @@ -76,10 +76,13 @@ vshard.storage.call(1, 'read', 'echo', {100})
>   |   message: 'Cannot perform action with bucket 1, reason: Not found'
>   |   name: WRONG_BUCKET
>   | ...
> --- Should not have triggers.
> -#box.space._bucket:on_replace()
> +--
> +-- gh-276: should have triggers. This is important for proper update of caches
> +-- and in future for discarding refs in scope of gh-173.
> +--
> +assert(#box.space._bucket:on_replace() == 1)
>   | ---
> - | - 0
> + | - true
>   | ...
>
>  test_run:switch('router')
> diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
> index 7b1b3fd..f973f2b 100644
> --- a/test/router/boot_replica_first.test.lua
> +++ b/test/router/boot_replica_first.test.lua
> @@ -29,8 +29,11 @@ test_run:switch('box_1_b')
>  test_run:wait_lsn('box_1_b', 'box_1_a')
>  -- Fails, but gracefully. Because the bucket is not found here.
>  vshard.storage.call(1, 'read', 'echo', {100})
> --- Should not have triggers.
> -#box.space._bucket:on_replace()
> +--
> +-- gh-276: should have triggers. This is important for proper update of caches
> +-- and in future for discarding refs in scope of gh-173.
> +--
> +assert(#box.space._bucket:on_replace() == 1)
>
>  test_run:switch('router')
>  vshard.router.bootstrap()
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index d18b7f8..570d9c6 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -708,6 +708,24 @@ assert(vshard.storage.buckets_count() == 0)
>  ---
>  - true
>  ...
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +---
> +...
> +_ = test_run:switch('storage_1_b')
> +---
> +...
> +--
> +-- gh-276: bucket count cache should be properly updated on the replica nodes.
> +-- For that the replicas must also install on_replace trigger on _bucket space
> +-- to watch for changes.
> +--
> +assert(vshard.storage.buckets_count() == 0)
> +---
> +- true
> +...
> +_ = test_run:switch('storage_1_a')
> +---
> +...
>  vshard.storage.bucket_force_create(1, 5)
>  ---
>  - true
> @@ -716,6 +734,19 @@ assert(vshard.storage.buckets_count() == 5)
>  ---
>  - true
>  ...
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +---
> +...
> +_ = test_run:switch('storage_1_b')
> +---
> +...
> +assert(vshard.storage.buckets_count() == 5)
> +---
> +- true
> +...
> +_ = test_run:switch('storage_1_a')
> +---
> +...
>  vshard.storage.bucket_force_create(6, 5)
>  ---
>  - true
> @@ -724,9 +755,22 @@ assert(vshard.storage.buckets_count() == 10)
>  ---
>  - true
>  ...
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +---
> +...
> +_ = test_run:switch('storage_1_b')
> +---
> +...
> +assert(vshard.storage.buckets_count() == 10)
> +---
> +- true
> +...
>  --
>  -- Bucket_generation_wait() registry function.
>  --
> +_ = test_run:switch('storage_1_a')
> +---
> +...
>  lstorage = require('vshard.registry').storage
>  ---
>  ...
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index 97558f6..494e2e8 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -201,14 +201,33 @@ for bid, _ in pairs(buckets) do vshard.storage.bucket_force_drop(bid) end
>
>  _ = test_run:switch('storage_1_a')
>  assert(vshard.storage.buckets_count() == 0)
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +_ = test_run:switch('storage_1_b')
> +--
> +-- gh-276: bucket count cache should be properly updated on the replica nodes.
> +-- For that the replicas must also install on_replace trigger on _bucket space
> +-- to watch for changes.
> +--
> +assert(vshard.storage.buckets_count() == 0)
> +
> +_ = test_run:switch('storage_1_a')
>  vshard.storage.bucket_force_create(1, 5)
>  assert(vshard.storage.buckets_count() == 5)
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +_ = test_run:switch('storage_1_b')
> +assert(vshard.storage.buckets_count() == 5)
> +
> +_ = test_run:switch('storage_1_a')
>  vshard.storage.bucket_force_create(6, 5)
>  assert(vshard.storage.buckets_count() == 10)
> +test_run:wait_lsn('storage_1_b', 'storage_1_a')
> +_ = test_run:switch('storage_1_b')
> +assert(vshard.storage.buckets_count() == 10)
>
>  --
>  -- Bucket_generation_wait() registry function.
>  --
> +_ = test_run:switch('storage_1_a')
>  lstorage = require('vshard.registry').storage
>  ok, err = lstorage.bucket_generation_wait(-1)
>  assert(not ok and err.message)
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index c4400f7..b1a20d6 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -115,6 +115,14 @@ if not M then
>          -- replace the old function is to keep its reference.
>          --
>          bucket_on_replace = nil,
> +        --
> +        -- Reference to the function used as on_replace trigger on
> +        -- _schema space. Saved explicitly for the same reason as
> +        -- _bucket on_replace.
> +        -- It is used by replicas to wait for schema bootstrap
> +        -- because they might be configured earlier than the
> +        -- master.
> +        schema_on_replace = nil,
>          -- Fast alternative to box.space._bucket:count(). But may be nil. Reset
>          -- on each generation change.
>          bucket_count_cache = nil,
> @@ -398,6 +406,62 @@ local function schema_version_make(ver)
>      return setmetatable(ver, schema_version_mt)
>  end
>
> +local function schema_install_triggers()
> +    local _bucket = box.space._bucket
> +    if M.bucket_on_replace then
> +        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
> +                              M.bucket_on_replace)
> +        if not ok then
> +            log.warn('Could not drop old trigger from '..
> +                     '_bucket: %s', err)
> +        end
> +    end
> +    _bucket:on_replace(bucket_generation_increment)
> +    M.bucket_on_replace = bucket_generation_increment
> +end
> +
> +local function schema_install_on_replace(_, new)
> +    -- Wait not just for _bucket to appear, but for the entire
> +    -- schema. This might be important if the schema will ever
> +    -- consist of more than just _bucket.
> +    if new[1] ~= 'vshard_version' then
> +        return
> +    end
> +    schema_install_triggers()
> +
> +    local _schema = box.space._schema
> +    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
> +    if not ok then
> +        log.warn('Could not drop trigger from _schema inside of the '..
> +                 'trigger: %s', err)
> +    end
> +    M.schema_on_replace = nil
> +    -- Drop the caches which might have been created while the
> +    -- schema was being replicated.
> +    bucket_generation_increment()
> +end
> +
> +--
> +-- Install the triggers later when there is an actual schema to install them on.
> +-- On replicas it might happen that they are vshard-configured earlier than the
> +-- master and therefore don't have the schema right away.
> +--
> +local function schema_install_triggers_delayed()
> +    log.info('Could not find _bucket space to install triggers - delayed '..
> +             'until the schema is replicated')
> +    assert(not box.space._bucket)
> +    local _schema = box.space._schema
> +    if M.schema_on_replace then
> +        local ok, err = pcall(_schema.on_replace, _schema, nil,
> +                              M.schema_on_replace)
> +        if not ok then
> +            log.warn('Could not drop trigger from _schema: %s', err)
> +        end
> +    end
> +    _schema:on_replace(schema_install_on_replace)
> +    M.schema_on_replace = schema_install_on_replace
> +end
> +
>  -- VShard versioning works in 4 numbers: major, minor, patch, and
>  -- a last helper number incremented on every schema change, if
>  -- first 3 numbers stay not changed. That happens when users take
> @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>      local uri = luri.parse(this_replica.uri)
>      schema_upgrade(is_master, uri.login, uri.password)
>
> -    if M.bucket_on_replace then
> -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
> -        M.bucket_on_replace = nil
> -    end
> -    lref.cfg()
> -    if is_master then
> -        box.space._bucket:on_replace(bucket_generation_increment)
> -        M.bucket_on_replace = bucket_generation_increment
> +    if is_master or box.space._bucket then
> +        schema_install_triggers()
>

+1 to Oleg, isn't the _bucket check enough?


> +    else
> +        schema_install_triggers_delayed()
>      end
>
> +    lref.cfg()
>      lsched.cfg(vshard_cfg)
> +
>      lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>      lreplicaset.outdate_replicasets(M.replicasets)
>      M.replicasets = new_replicasets
> --
> 2.24.3 (Apple Git-128)
>
>
P.S. I've tested it on Cartridge. Basic tests (luatest + WebUI inspection)
seem to be OK.

Best regards
Yaroslav Dynnikov

[-- Attachment #2: Type: text/html, Size: 17330 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
@ 2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
  2021-05-24  6:57       ` Oleg Babin via Tarantool-patches
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-21 19:26 UTC (permalink / raw)
  To: Oleg Babin, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for the review!

On 13.05.2021 22:23, Oleg Babin wrote:
> Hi! Thanks for your patch! Three comments below.
> 
> On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
>> +local function schema_install_triggers()
>> +    local _bucket = box.space._bucket
>> +    if M.bucket_on_replace then
>> +        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
>> +                              M.bucket_on_replace)
>> +        if not ok then
>> +            log.warn('Could not drop old trigger from '..
>> +                     '_bucket: %s', err)
>> +        end
>> +    end
>> +    _bucket:on_replace(bucket_generation_increment)
>> +    M.bucket_on_replace = bucket_generation_increment
>> +end
>> +
>> +local function schema_install_on_replace(_, new)
>> +    -- Wait not just for _bucket to appear, but for the entire
>> +    -- schema. This might be important if the schema will ever
>> +    -- consist of more than just _bucket.
>> +    if new[1] ~= 'vshard_version' then
> 
> 
> Is it possible that someone drops something from _schema space? I mean
> that in such case new will be nil.
> 
> I could provide one such case - when a user needs to execute box.once
> once again, (s)he drops the corresponding record from _schema space.

Great note! Fixed:

====================
diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
index 3c5a08c..d6eaca5 100644
--- a/test/router/boot_replica_first.result
+++ b/test/router/boot_replica_first.result
@@ -42,6 +42,25 @@ util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
  | - attempt to index field '_bucket' (a nil value)
  | ...
 
+-- While waiting for the schema, gracefully handle deletions from _schema.
+ro = box.cfg.read_only
+ | ---
+ | ...
+box.cfg{read_only = false}
+ | ---
+ | ...
+box.space._schema:insert({'gh-276'})
+ | ---
+ | - ['gh-276']
+ | ...
+box.space._schema:delete({'gh-276'})
+ | ---
+ | - ['gh-276']
+ | ...
+box.cfg{read_only = ro}
+ | ---
+ | ...
+
 test_run:switch('default')
  | ---
  | - true
diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
index f973f2b..f1a3c19 100644
--- a/test/router/boot_replica_first.test.lua
+++ b/test/router/boot_replica_first.test.lua
@@ -19,6 +19,13 @@ vshard.storage.cfg(cfg, instance_uuid)
 -- _bucket is not created yet. Will fail.
 util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
 
+-- While waiting for the schema, gracefully handle deletions from _schema.
+ro = box.cfg.read_only
+box.cfg{read_only = false}
+box.space._schema:insert({'gh-276'})
+box.space._schema:delete({'gh-276'})
+box.cfg{read_only = ro}
+
 test_run:switch('default')
 util.map_evals(test_run, {REPLICASET_1}, 'bootstrap_storage(\'memtx\')')
 
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index b1a20d6..5285ee0 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -424,7 +424,7 @@ local function schema_install_on_replace(_, new)
     -- Wait not just for _bucket to appear, but for the entire
     -- schema. This might be important if the schema will ever
     -- consist of more than just _bucket.
-    if new[1] ~= 'vshard_version' then
+    if new == nil or new[1] ~= 'vshard_version' then
         return
     end
     schema_install_triggers()
====================

>> +        return
>> +    end
>> +    schema_install_triggers()
>> +
>> +    local _schema = box.space._schema
> 
> nit: you could get it from the third argument of the trigger. But up to you.

What do you mean? AFAIS, the third argument is the space name, not the
space itself. And I need the space itself to clear the trigger.
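
For reference, a sketch of the trigger signature in question (Tarantool
passes the space name as a string, which is the point here):

    box.space._schema:on_replace(function(old, new, space_name)
        -- space_name == '_schema', a string, not the space object.
        -- Dropping the trigger would still need box.space[space_name].
    end)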

>> +    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
>> +    if not ok then
>> +        log.warn('Could not drop trigger from _schema inside of the '..
>> +                 'trigger: %s', err)
>> +    end
>> +    M.schema_on_replace = nil
>> +    -- Drop the caches which might have been created while the
>> +    -- schema was being replicated.
>> +    bucket_generation_increment()
>> +end
>> @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>>       local uri = luri.parse(this_replica.uri)
>>       schema_upgrade(is_master, uri.login, uri.password)
>>   -    if M.bucket_on_replace then
>> -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
>> -        M.bucket_on_replace = nil
>> -    end
>> -    lref.cfg()
>> -    if is_master then
>> -        box.space._bucket:on_replace(bucket_generation_increment)
>> -        M.bucket_on_replace = bucket_generation_increment
>> +    if is_master or box.space._bucket then
> 
> 
> Is it possible that "is_master" is true and _bucket is nil?
> 
> Why is a simple `if box.space._bucket` check not enough?

I want to emphasize that on the master the bootstrap is finished anyway,
because the master is the one performing it. On the master there is no
need to check for _bucket nor to wait for the schema.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool
  2021-05-19 21:50   ` Yaroslav Dynnikov via Tarantool-patches
@ 2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
  2021-05-21 21:54       ` Yaroslav Dynnikov via Tarantool-patches
  0 siblings, 1 reply; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-21 19:26 UTC (permalink / raw)
  To: Yaroslav Dynnikov; +Cc: tml

Hi! Thanks for the review!

On 19.05.2021 23:50, Yaroslav Dynnikov wrote:
> Hi, Vlad
> 
> Thanks for the patch.
> Find one comment below.
> 
> On Thu, 13 May 2021 at 14:07, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> 
>     Duplicate key error at insertion into a space on the latest
>     Tarantool changed its message, which broke one of the tests. The
>     patch updates the test so it checks only the needed part of the
>     message and does not depend on the Tarantool version anymore.
>     ---
>      test/storage/storage.result   | 8 +++++---
>      test/storage/storage.test.lua | 3 ++-
>      2 files changed, 7 insertions(+), 4 deletions(-)
> 
>     diff --git a/test/storage/storage.result b/test/storage/storage.result
>     index 2c9784a..d18b7f8 100644
>     --- a/test/storage/storage.result
>     +++ b/test/storage/storage.result
>     @@ -179,10 +179,12 @@ vshard.storage.buckets_info()
>          status: active
>          id: 1
>      ...
>     -vshard.storage.bucket_force_create(1) -- error
>     +ok, err = vshard.storage.bucket_force_create(1)
>      ---
>     -- null
>     -- Duplicate key exists in unique index 'pk' in space '_bucket'
>     +...
>     +assert(not ok and err.message:match("Duplicate key exists"))
>     +---
>     +- Duplicate key exists
>      ...
>      vshard.storage.bucket_force_drop(1)
>      ---
>     diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
>     index 33f0498..97558f6 100644
>     --- a/test/storage/storage.test.lua
>     +++ b/test/storage/storage.test.lua
>     @@ -56,7 +56,8 @@ vshard.storage.sync(100500)
>      vshard.storage.buckets_info()
>      vshard.storage.bucket_force_create(1)
>      vshard.storage.buckets_info()
>     -vshard.storage.bucket_force_create(1) -- error
>     +ok, err = vshard.storage.bucket_force_create(1)
>     +assert(not ok and err.message:match("Duplicate key exists"))
> 
> 
> I'd suggest splitting the check in two:
> 
> 1. ok -- should be false
> 2. Then check the message matches.
> 
> Assertions usually don't provide useful errors.

In the core we try to use assertions more, because this makes the
tests easier to read. You can see right away what the target points
of the test are. This is kind of a simulation of tap tests.

I consider vshard to be "half-core", so I decided to use assertions
here as well.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-19 21:51   ` Yaroslav Dynnikov via Tarantool-patches
@ 2021-05-21 19:30     ` Vladislav Shpilevoy via Tarantool-patches
  0 siblings, 0 replies; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-21 19:30 UTC (permalink / raw)
  To: Yaroslav Dynnikov; +Cc: tml

Thanks for the review!

>     diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
>     index 3b34841..a03ab5a 100644
>     --- a/test/misc/reconfigure.result
>     +++ b/test/misc/reconfigure.result
>     @@ -271,9 +271,9 @@ box.cfg.read_only
>      ---
>      - false
>      ...
>     -#box.space._bucket:on_replace()
>     +assert(#box.space._bucket:on_replace() == 1)
>      ---
>     -- 1
>     +- true
>      ...
> 
> 
> Why is this change necessary? It seems you now lose the actual value in case the test fails.
> This question applies to many cases below.

See my response in the first email about the assertions.

> No matter what testing framework is used, I think that
> 
> assert_equals(value, 1) -- assertion failed: expected 1, got 2
> 
> is better (more helpful) than eager
> 
> assert(value == 1) -- assertion failed!

Yes, but what you are describing is present in the tap tests
framework, which currently is hardly usable for complex multi-instance
tests. Hence I use what is available. I decided not to implement my
own testing suite for these checks as it is anyway trivial to
get more info if they start failing.

Until tap becomes easier to use for multi-instance tests.

>     @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>          local uri = luri.parse(this_replica.uri)
>          schema_upgrade(is_master, uri.login, uri.password)
> 
>     -    if M.bucket_on_replace then
>     -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
>     -        M.bucket_on_replace = nil
>     -    end
>     -    lref.cfg()
>     -    if is_master then
>     -        box.space._bucket:on_replace(bucket_generation_increment)
>     -        M.bucket_on_replace = bucket_generation_increment
>     +    if is_master or box.space._bucket then
>     +        schema_install_triggers()
> 
> 
> +1 to Oleg, isn't the _bucket check enough?

I want to emphasize that on the master the bootstrap is finished anyway,
because the master is the one performing it. On the master there is no
need to check for _bucket nor to wait for the schema.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool
  2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-21 21:54       ` Yaroslav Dynnikov via Tarantool-patches
  0 siblings, 0 replies; 14+ messages in thread
From: Yaroslav Dynnikov via Tarantool-patches @ 2021-05-21 21:54 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: Yaroslav Dynnikov, tml

[-- Attachment #1: Type: text/plain, Size: 2620 bytes --]

OK, no objections then.

On Fri, 21 May 2021, 22:26 Vladislav Shpilevoy, <v.shpilevoy@tarantool.org>
wrote:

> Hi! Thanks for the review!
>
> On 19.05.2021 23:50, Yaroslav Dynnikov wrote:
> > Hi, Vlad
> >
> > Thanks for the patch.
> > Find one comment below.
> >
> > On Thu, 13 May 2021 at 14:07, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
> >
> >     Duplicate key error at insertion into a space on the latest
> >     Tarantool changed its message, which broke one of the tests. The
> >     patch updates the test so it checks only the needed part of the
> >     message and does not depend on the Tarantool version anymore.
> >     ---
> >      test/storage/storage.result   | 8 +++++---
> >      test/storage/storage.test.lua | 3 ++-
> >      2 files changed, 7 insertions(+), 4 deletions(-)
> >
> >     diff --git a/test/storage/storage.result b/test/storage/storage.result
> >     index 2c9784a..d18b7f8 100644
> >     --- a/test/storage/storage.result
> >     +++ b/test/storage/storage.result
> >     @@ -179,10 +179,12 @@ vshard.storage.buckets_info()
> >          status: active
> >          id: 1
> >      ...
> >     -vshard.storage.bucket_force_create(1) -- error
> >     +ok, err = vshard.storage.bucket_force_create(1)
> >      ---
> >     -- null
> >     -- Duplicate key exists in unique index 'pk' in space '_bucket'
> >     +...
> >     +assert(not ok and err.message:match("Duplicate key exists"))
> >     +---
> >     +- Duplicate key exists
> >      ...
> >      vshard.storage.bucket_force_drop(1)
> >      ---
> >     diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> >     index 33f0498..97558f6 100644
> >     --- a/test/storage/storage.test.lua
> >     +++ b/test/storage/storage.test.lua
> >     @@ -56,7 +56,8 @@ vshard.storage.sync(100500)
> >      vshard.storage.buckets_info()
> >      vshard.storage.bucket_force_create(1)
> >      vshard.storage.buckets_info()
> >     -vshard.storage.bucket_force_create(1) -- error
> >     +ok, err = vshard.storage.bucket_force_create(1)
> >     +assert(not ok and err.message:match("Duplicate key exists"))
> >
> >
> > I'd suggest splitting the check in two:
> >
> > 1. ok -- should be false
> > 2. Then check the message matches.
> >
> > Assertions usually don't provide useful errors.
>
> In the core we try to use assertions more, because this makes the
> tests easier to read. You can see right away what the target points
> of the test are. This is kind of a simulation of tap tests.
>
> I consider vshard to be "half-core", so I decided to use assertions
> here as well.
>

[-- Attachment #2: Type: text/html, Size: 3626 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
  2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-05-24  6:57       ` Oleg Babin via Tarantool-patches
  0 siblings, 0 replies; 14+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-05-24  6:57 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov

Hi! Thanks for your fixes. LGTM.

On 21.05.2021 22:26, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
> On 13.05.2021 22:23, Oleg Babin wrote:
>> Hi! Thanks for your patch! Three comments below.
>>
>> On 13.05.2021 14:07, Vladislav Shpilevoy wrote:
>>> +local function schema_install_triggers()
>>> +    local _bucket = box.space._bucket
>>> +    if M.bucket_on_replace then
>>> +        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
>>> +                              M.bucket_on_replace)
>>> +        if not ok then
>>> +            log.warn('Could not drop old trigger from '..
>>> +                     '_bucket: %s', err)
>>> +        end
>>> +    end
>>> +    _bucket:on_replace(bucket_generation_increment)
>>> +    M.bucket_on_replace = bucket_generation_increment
>>> +end
>>> +
>>> +local function schema_install_on_replace(_, new)
>>> +    -- Wait not just for _bucket to appear, but for the entire
>>> +    -- schema. This might be important if the schema will ever
>>> +    -- consist of more than just _bucket.
>>> +    if new[1] ~= 'vshard_version' then
>>
>> Is it possible that someone drops something from the _schema space? I
>> mean that in such a case new will be nil.
>>
>> I can give one such case: when a user needs to execute box.once once
>> again, (s)he drops the corresponding record from the _schema space.
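
An illustration of that scenario (a sketch; the 'once'-prefixed key format
in _schema is an assumption about box.once bookkeeping):

    -- Runs the function once and records that fact in _schema.
    box.once('gh-276', function() --[[ some migration ]] end)
    -- To make box.once run again, the user deletes the record. The
    -- deletion fires _schema:on_replace() with new == nil, so the
    -- trigger must not index new unconditionally.
    box.space._schema:delete{'oncegh-276'}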
> Great note! Fixed:
>
> ====================
> diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
> index 3c5a08c..d6eaca5 100644
> --- a/test/router/boot_replica_first.result
> +++ b/test/router/boot_replica_first.result
> @@ -42,6 +42,25 @@ util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
>    | - attempt to index field '_bucket' (a nil value)
>    | ...
>   
> +-- While waiting for the schema, gracefully handle deletions from _schema.
> +ro = box.cfg.read_only
> + | ---
> + | ...
> +box.cfg{read_only = false}
> + | ---
> + | ...
> +box.space._schema:insert({'gh-276'})
> + | ---
> + | - ['gh-276']
> + | ...
> +box.space._schema:delete({'gh-276'})
> + | ---
> + | - ['gh-276']
> + | ...
> +box.cfg{read_only = ro}
> + | ---
> + | ...
> +
>   test_run:switch('default')
>    | ---
>    | - true
> diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
> index f973f2b..f1a3c19 100644
> --- a/test/router/boot_replica_first.test.lua
> +++ b/test/router/boot_replica_first.test.lua
> @@ -19,6 +19,13 @@ vshard.storage.cfg(cfg, instance_uuid)
>   -- _bucket is not created yet. Will fail.
>   util.check_error(vshard.storage.call, 1, 'read', 'echo', {100})
>   
> +-- While waiting for the schema, gracefully handle deletions from _schema.
> +ro = box.cfg.read_only
> +box.cfg{read_only = false}
> +box.space._schema:insert({'gh-276'})
> +box.space._schema:delete({'gh-276'})
> +box.cfg{read_only = ro}
> +
>   test_run:switch('default')
>   util.map_evals(test_run, {REPLICASET_1}, 'bootstrap_storage(\'memtx\')')
>   
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index b1a20d6..5285ee0 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -424,7 +424,7 @@ local function schema_install_on_replace(_, new)
>       -- Wait not just for _bucket to appear, but for the entire
>       -- schema. This might be important if the schema will ever
>       -- consist of more than just _bucket.
> -    if new[1] ~= 'vshard_version' then
> +    if new == nil or new[1] ~= 'vshard_version' then
>           return
>       end
>       schema_install_triggers()
> ====================
>
>>> +        return
>>> +    end
>>> +    schema_install_triggers()
>>> +
>>> +    local _schema = box.space._schema
>> nit: you could get it from the third argument of the trigger. But up to you.
> What do you mean? AFAIS, the third argument is the space name, not the
> space itself. And I need the space object itself to clear the trigger.

My bad. Yes, it's a space name, not a space object.

>>> +    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
>>> +    if not ok then
>>> +        log.warn('Could not drop trigger from _schema inside of the '..
>>> +                 'trigger: %s', err)
>>> +    end
>>> +    M.schema_on_replace = nil
>>> +    -- Drop the caches which might have been created while the
>>> +    -- schema was being replicated.
>>> +    bucket_generation_increment()
>>> +end
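
For context, the generation bump matters because of the caching scheme
behind buckets_count(); roughly (a sketch with assumed field names, not
the exact vshard code):

    local M = {
        bucket_generation = 0,
        bucket_count_cache = nil,
        bucket_count_cache_generation = -1,
    }

    local function bucket_generation_increment()
        M.bucket_generation = M.bucket_generation + 1
    end

    local function buckets_count()
        -- Recount only if _bucket changed since the last call. Without
        -- the _bucket trigger on replicas the generation never grows,
        -- which is exactly the stale-count bug being fixed.
        if M.bucket_count_cache_generation ~= M.bucket_generation then
            M.bucket_count_cache = box.space._bucket:count()
            M.bucket_count_cache_generation = M.bucket_generation
        end
        return M.bucket_count_cache
    end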
>>> @@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
>>>        local uri = luri.parse(this_replica.uri)
>>>        schema_upgrade(is_master, uri.login, uri.password)
>>>    -    if M.bucket_on_replace then
>>> -        box.space._bucket:on_replace(nil, M.bucket_on_replace)
>>> -        M.bucket_on_replace = nil
>>> -    end
>>> -    lref.cfg()
>>> -    if is_master then
>>> -        box.space._bucket:on_replace(bucket_generation_increment)
>>> -        M.bucket_on_replace = bucket_generation_increment
>>> +    if is_master or box.space._bucket then
>>
>> Is it possible that "is_master" is true and _bucket is nil?
>>
>> Why isn't a simple `if box.space._bucket` check enough?
> I want to emphasize that on the master the bootstrap is always finished,
> because the master is the one performing it. So on the master there is no
> need to check for _bucket nor to wait for the schema at all.
Ok.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug
  2021-05-13 11:07 [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Vladislav Shpilevoy via Tarantool-patches
                   ` (2 preceding siblings ...)
  2021-05-13 20:23 ` [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Oleg Babin via Tarantool-patches
@ 2021-05-25 20:43 ` Vladislav Shpilevoy via Tarantool-patches
  3 siblings, 0 replies; 14+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-05-25 20:43 UTC (permalink / raw)
  To: tarantool-patches, olegrok, yaroslav.dynnikov

Pushed to master.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-05-25 20:43 UTC | newest]

Thread overview: 14+ messages
2021-05-13 11:07 [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Vladislav Shpilevoy via Tarantool-patches
2021-05-13 11:07 ` [Tarantool-patches] [PATCH 1/2] test: fix test output on latest Tarantool Vladislav Shpilevoy via Tarantool-patches
2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
2021-05-19 21:50   ` Yaroslav Dynnikov via Tarantool-patches
2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
2021-05-21 21:54       ` Yaroslav Dynnikov via Tarantool-patches
2021-05-13 11:07 ` [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas Vladislav Shpilevoy via Tarantool-patches
2021-05-13 20:23   ` Oleg Babin via Tarantool-patches
2021-05-21 19:26     ` Vladislav Shpilevoy via Tarantool-patches
2021-05-24  6:57       ` Oleg Babin via Tarantool-patches
2021-05-19 21:51   ` Yaroslav Dynnikov via Tarantool-patches
2021-05-21 19:30     ` Vladislav Shpilevoy via Tarantool-patches
2021-05-13 20:23 ` [Tarantool-patches] [PATCH 0/2] vshard.storage.buckets_count() bug Oleg Babin via Tarantool-patches
2021-05-25 20:43 ` Vladislav Shpilevoy via Tarantool-patches
