From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yaroslav Dynnikov via Tarantool-patches
Reply-To: Yaroslav Dynnikov
To: Vladislav Shpilevoy
Cc: Yaroslav Dynnikov, tml
Date: Thu, 20 May 2021 00:51:23 +0300
Subject: Re: [Tarantool-patches] [PATCH 2/2] vshard: fix buckets_count() on replicas
In-Reply-To: <04d7d05f09a5ee7ed52b27c480e81232c406e415.1620903962.git.v.shpilevoy@tarantool.org>
List-Id: Tarantool development patches

Hi!

I've got several questions about this patch, find them below.

On Thu, 13 May 2021 at 14:07, Vladislav Shpilevoy <v.shpilevoy@tarantool.org> wrote:
vshard.storage.buckets_count() uses a cached count value when no
changes in _bucket space. The changes are detected using an
on_replace trigger on _bucket space. But it wasn't installed on
the replicas.

As a result, on replica vshard.storage.buckets_count() didn't
change at all after being called once.

The patch makes replicas install the trigger on _bucket and update
the bucket generation + drop the caches properly.

The tricky part is that _bucket might not exist on a replica if it
was vshard-configured earlier than the master. But for that
occasion the trigger installation on _bucket is delayed until
vshard schema is replicated from the master.

An alternative was to introduce a special version of
buckets_count() on replicas which does not use the cache at all.
But the trigger is going to be useful in scope of #173 where it
should drop bucket refs on the replica.

Closes #276
Needed for #173
---
 test/misc/reconfigure.result            | 21 ++++---
 test/misc/reconfigure.test.lua          | 13 +++--
 test/router/boot_replica_first.result   |  9 ++-
 test/router/boot_replica_first.test.lua |  7 ++-
 test/storage/storage.result             | 44 ++++++++++++++
 test/storage/storage.test.lua           | 19 ++++++
 vshard/storage/init.lua                 | 78 ++++++++++++++++++++++---
 7 files changed, 164 insertions(+), 27 deletions(-)
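
The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch in plain Lua, not vshard code: `fake_space_new` and `storage_new` are hypothetical stand-ins for a Tarantool space and for the storage module state, and only the idea (a cached count that is dropped by an on_replace trigger) is taken from the commit message.

```lua
-- Hypothetical stand-in for a Tarantool space: it stores values by
-- key and calls every registered on_replace trigger on each write.
local function fake_space_new()
    local space = {tuples = {}, size = 0, triggers = {}}
    function space:on_replace(trigger)
        table.insert(self.triggers, trigger)
    end
    function space:replace(key, value)
        local old = self.tuples[key]
        if old == nil then
            self.size = self.size + 1
        end
        self.tuples[key] = value
        for _, trigger in ipairs(self.triggers) do
            trigger(old, value)
        end
    end
    function space:count()
        return self.size
    end
    return space
end

-- Hypothetical storage state with a cached bucket count. The cache
-- is dropped only by the trigger, as the commit message describes.
local function storage_new(space, install_trigger)
    local state = {generation = 0, count_cache = nil}
    if install_trigger then
        space:on_replace(function()
            state.generation = state.generation + 1
            state.count_cache = nil
        end)
    end
    function state.buckets_count()
        if state.count_cache == nil then
            state.count_cache = space:count()
        end
        return state.count_cache
    end
    return state
end

local old_space, new_space = fake_space_new(), fake_space_new()
local old_replica = storage_new(old_space, false) -- no trigger (before)
local new_replica = storage_new(new_space, true)  -- with trigger (after)

-- The first call fills the cache with 0 on both replicas.
old_replica.buckets_count()
new_replica.buckets_count()

-- Five buckets "arrive by replication".
for id = 1, 5 do
    old_space:replace(id, 'active')
    new_space:replace(id, 'active')
end

print(old_replica.buckets_count()) -- 0: the stale cache is still used
print(new_replica.buckets_count()) -- 5: the trigger dropped the cache
```

The replica without the trigger keeps returning the first cached value, which matches the "didn't change at all after being called once" symptom.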

diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index 3b34841..a03ab5a 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -271,9 +271,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...

Why is this change necessary? It seems that now you lose the actual value when the test fails.
This question applies to many cases below.

No matter what testing framework is used, I think that

assert_equals(value, 1) -- assertion failed: expected 1, got 2

is better (more helpful) than a bare

assert(value == 1) -- assertion failed!
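
A minimal sketch of the difference, in plain Lua. `assert_equals` here is a hypothetical helper written only for this example, not an existing vshard or test-run function (luatest ships a similarly named assertion, but the sketch does not depend on it):

```lua
-- Hypothetical helper: fails with both the expected and the actual
-- value in the message, and returns true on success so that a
-- .result file would still show "- true".
local function assert_equals(actual, expected)
    if actual ~= expected then
        error(string.format('assertion failed: expected %s, got %s',
                            tostring(expected), tostring(actual)), 2)
    end
    return true
end

-- Passing case.
print(assert_equals(1, 1))

-- Failing case: the error message carries both values, while a plain
-- assert(value == 1) would only say "assertion failed!".
local ok, err = pcall(assert_equals, 2, 1)
print(ok, err)
```

With a helper like this a failed check would show the unexpected trigger count directly in the test output, instead of requiring a re-run to find it.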

 _ = test_run:switch('storage_2_a')
 ---
@@ -305,10 +305,13 @@ box.cfg.read_only
 ---
 - true
 ...
--- Should be zero on the slave node. Even though earlier the node was a master.
-#box.space._bucket:on_replace()
+--
+-- gh-276: replica should have triggers. This is important for proper update of
+-- caches and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 0
+- true
 ...
 _ = test_run:switch('storage_2_b')
 ---
@@ -340,9 +343,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...
 _ = test_run:switch('storage_3_a')
 ---
@@ -373,9 +376,9 @@ box.cfg.read_only
 ---
 - false
 ...
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)
 ---
-- 1
+- true
 ...
 _ = test_run:switch('router_1')
 ---
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index 348628c..61ea3c0 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -109,7 +109,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)

 _ = test_run:switch('storage_2_a')
 info = vshard.storage.info()
@@ -119,8 +119,11 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
--- Should be zero on the slave node. Even though earlier the node was a master.
-#box.space._bucket:on_replace()
+--
+-- gh-276: replica should have triggers. This is important for proper update of
+-- caches and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)

 _ = test_run:switch('storage_2_b')
 info = vshard.storage.info()
@@ -130,7 +133,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)

 _ = test_run:switch('storage_3_a')
 info = vshard.storage.info()
@@ -140,7 +143,7 @@ table.sort(uris)
 uris
 box.cfg.replication
 box.cfg.read_only
-#box.space._bucket:on_replace()
+assert(#box.space._bucket:on_replace() == 1)

 _ = test_run:switch('router_1')
 info = vshard.router.info()
diff --git a/test/router/boot_replica_first.result b/test/router/boot_replica_first.result
index 1705230..3c5a08c 100644
--- a/test/router/boot_replica_first.result
+++ b/test/router/boot_replica_first.result
@@ -76,10 +76,13 @@ vshard.storage.call(1, 'read', 'echo', {100})
  |   message: 'Cannot perform action with bucket 1, reason: Not found'
  |   name: WRONG_BUCKET
  | ...
--- Should not have triggers.
-#box.space._bucket:on_replace()
+--
+-- gh-276: should have triggers. This is important for proper update of caches
+-- and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)
  | ---
- | - 0
+ | - true
  | ...

 test_run:switch('router')
diff --git a/test/router/boot_replica_first.test.lua b/test/router/boot_replica_first.test.lua
index 7b1b3fd..f973f2b 100644
--- a/test/router/boot_replica_first.test.lua
+++ b/test/router/boot_replica_first.test.lua
@@ -29,8 +29,11 @@ test_run:switch('box_1_b')
 test_run:wait_lsn('box_1_b', 'box_1_a')
 -- Fails, but gracefully. Because the bucket is not found here.
 vshard.storage.call(1, 'read', 'echo', {100})
--- Should not have triggers.
-#box.space._bucket:on_replace()
+--
+-- gh-276: should have triggers. This is important for proper update of caches
+-- and in future for discarding refs in scope of gh-173.
+--
+assert(#box.space._bucket:on_replace() == 1)

 test_run:switch('router')
 vshard.router.bootstrap()
diff --git a/test/storage/storage.result b/test/storage/storage.result
index d18b7f8..570d9c6 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -708,6 +708,24 @@ assert(vshard.storage.buckets_count() == 0)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+--
+-- gh-276: bucket count cache should be properly updated on the replica nodes.
+-- For that the replicas must also install on_replace trigger on _bucket space
+-- to watch for changes.
+--
+assert(vshard.storage.buckets_count() == 0)
+---
+- true
+...
+_ = test_run:switch('storage_1_a')
+---
+...
 vshard.storage.bucket_force_create(1, 5)
 ---
 - true
@@ -716,6 +734,19 @@ assert(vshard.storage.buckets_count() == 5)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+assert(vshard.storage.buckets_count() == 5)
+---
+- true
+...
+_ = test_run:switch('storage_1_a')
+---
+...
 vshard.storage.bucket_force_create(6, 5)
 ---
 - true
@@ -724,9 +755,22 @@ assert(vshard.storage.buckets_count() == 10)
 ---
 - true
 ...
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+---
+...
+_ = test_run:switch('storage_1_b')
+---
+...
+assert(vshard.storage.buckets_count() == 10)
+---
+- true
+...
 --
 -- Bucket_generation_wait() registry function.
 --
+_ = test_run:switch('storage_1_a')
+---
+...
 lstorage = require('vshard.registry').storage
 ---
 ...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index 97558f6..494e2e8 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -201,14 +201,33 @@ for bid, _ in pairs(buckets) do vshard.storage.bucket_force_drop(bid) end

 _ = test_run:switch('storage_1_a')
 assert(vshard.storage.buckets_count() == 0)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+--
+-- gh-276: bucket count cache should be properly updated on the replica nodes.
+-- For that the replicas must also install on_replace trigger on _bucket space
+-- to watch for changes.
+--
+assert(vshard.storage.buckets_count() == 0)
+
+_ = test_run:switch('storage_1_a')
 vshard.storage.bucket_force_create(1, 5)
 assert(vshard.storage.buckets_count() == 5)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+assert(vshard.storage.buckets_count() == 5)
+
+_ = test_run:switch('storage_1_a')
 vshard.storage.bucket_force_create(6, 5)
 assert(vshard.storage.buckets_count() == 10)
+test_run:wait_lsn('storage_1_b', 'storage_1_a')
+_ = test_run:switch('storage_1_b')
+assert(vshard.storage.buckets_count() == 10)

 --
 -- Bucket_generation_wait() registry function.
 --
+_ = test_run:switch('storage_1_a')
 lstorage = require('vshard.registry').storage
 ok, err = lstorage.bucket_generation_wait(-1)
 assert(not ok and err.message)
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index c4400f7..b1a20d6 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -115,6 +115,14 @@ if not M then
         -- replace the old function is to keep its reference.
         --
         bucket_on_replace = nil,
+        --
+        -- Reference to the function used as on_replace trigger on
+        -- _schema space. Saved explicitly by the same reason as
+        -- _bucket on_replace.
+        -- It is used by replicas to wait for schema bootstrap
+        -- because they might be configured earlier than the
+        -- master.
+        schema_on_replace = nil,
         -- Fast alternative to box.space._bucket:count(). But may be nil. Reset
         -- on each generation change.
         bucket_count_cache = nil,
@@ -398,6 +406,62 @@ local function schema_version_make(ver)
     return setmetatable(ver, schema_version_mt)
 end

+local function schema_install_triggers()
+    local _bucket = box.space._bucket
+    if M.bucket_on_replace then
+        local ok, err = pcall(_bucket.on_replace, _bucket, nil,
+                              M.bucket_on_replace)
+        if not ok then
+            log.warn('Could not drop old trigger from '..
+                     '_bucket: %s', err)
+        end
+    end
+    _bucket:on_replace(bucket_generation_increment)
+    M.bucket_on_replace = bucket_generation_increment
+end
+
+local function schema_install_on_replace(_, new)
+    -- Wait not just for _bucket to appear, but for the entire
+    -- schema. This might be important if the schema will ever
+    -- consist of more than just _bucket.
+    if new[1] ~= 'vshard_version' then
+        return
+    end
+    schema_install_triggers()
+
+    local _schema = box.space._schema
+    local ok, err = pcall(_schema.on_replace, _schema, nil, M.schema_on_replace)
+    if not ok then
+        log.warn('Could not drop trigger from _schema inside of the '..
+                 'trigger: %s', err)
+    end
+    M.schema_on_replace = nil
+    -- Drop the caches which might have been created while the
+    -- schema was being replicated.
+    bucket_generation_increment()
+end
+
+--
+-- Install the triggers later when there is an actual schema to install them on.
+-- On replicas it might happen that they are vshard-configured earlier than the
+-- master and therefore don't have the schema right away.
+--
+local function schema_install_triggers_delayed()
+    log.info('Could not find _bucket space to install triggers - delayed '..
+             'until the schema is replicated')
+    assert(not box.space._bucket)
+    local _schema = box.space._schema
+    if M.schema_on_replace then
+        local ok, err = pcall(_schema.on_replace, _schema, nil,
+                              M.schema_on_replace)
+        if not ok then
+            log.warn('Could not drop trigger from _schema: %s', err)
+        end
+    end
+    _schema:on_replace(schema_install_on_replace)
+    M.schema_on_replace = schema_install_on_replace
+end
+
 -- VShard versioning works in 4 numbers: major, minor, patch, and
 -- a last helper number incremented on every schema change, if
 -- first 3 numbers stay not changed. That happens when users take
@@ -2612,17 +2676,15 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
     local uri = luri.parse(this_replica.uri)
     schema_upgrade(is_master, uri.login, uri.password)

-    if M.bucket_on_replace then
-        box.space._bucket:on_replace(nil, M.bucket_on_replace)
-        M.bucket_on_replace = nil
-    end
-    lref.cfg()
-    if is_master then
-        box.space._bucket:on_replace(bucket_generation_increment)
-        M.bucket_on_replace = bucket_generation_increment
+    if is_master or box.space._bucket then
+        schema_install_triggers()

+1 to Oleg: isn't the _bucket check alone enough here?
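
To make the question concrete, here is a sketch of the branch with only the `_bucket` check, run against a mocked `box` in plain Lua. Everything in it is an assumption made for illustration: `space_new`, `cfg_triggers` and the tuple contents are hypothetical, and the sketch assumes (without verifying it) that on the master `schema_upgrade()` has already created `_bucket` by the time the branch runs.

```lua
-- Hypothetical mock of a space: it only keeps the trigger list.
local function space_new()
    local space = {triggers = {}}
    -- Mirrors the on_replace(new, old) calling convention from the diff.
    function space:on_replace(new_trigger, old_trigger)
        if old_trigger ~= nil then
            for i, trigger in ipairs(self.triggers) do
                if trigger == old_trigger then
                    table.remove(self.triggers, i)
                    break
                end
            end
        end
        if new_trigger ~= nil then
            table.insert(self.triggers, new_trigger)
        end
        return self.triggers
    end
    function space:replace(tuple)
        -- Iterate over a copy: a trigger may remove itself while running.
        local snapshot = {}
        for i, trigger in ipairs(self.triggers) do
            snapshot[i] = trigger
        end
        for _, trigger in ipairs(snapshot) do
            trigger(nil, tuple)
        end
    end
    return space
end

-- Mocked box: a fresh replica has _schema but no _bucket yet.
local box = {space = {_schema = space_new()}}

local function bucket_generation_increment() end

local function install_triggers()
    box.space._bucket:on_replace(bucket_generation_increment)
end

local function on_schema_replace(_, new)
    if new[1] ~= 'vshard_version' then
        return
    end
    install_triggers()
    box.space._schema:on_replace(nil, on_schema_replace)
end

-- The branch under discussion, with the _bucket check only.
local function cfg_triggers()
    if box.space._bucket then
        install_triggers()
        return 'installed'
    end
    box.space._schema:on_replace(on_schema_replace)
    return 'delayed'
end

print(cfg_triggers()) -- no _bucket yet, so installation is delayed

-- The schema "arrives by replication" (tuple contents are made up).
box.space._bucket = space_new()
box.space._schema:replace({'cluster', 'some-uuid'})
box.space._schema:replace({'vshard_version', 0, 1, 16, 0})
print(#box.space._bucket.triggers, #box.space._schema.triggers)
```

If the assumption about `schema_upgrade()` holds, the master always takes the first branch and `is_master` adds nothing; if it does not hold on some path, the extra condition would matter.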
+    else
+        schema_install_triggers_delayed()
     end

+    lref.cfg()
     lsched.cfg(vshard_cfg)
+
     lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
     lreplicaset.outdate_replicasets(M.replicasets)
     M.replicasets = new_replicasets
--
2.24.3 (Apple Git-128)


P.S. I've tested it on Cartridge. Basic tests (luatest + WebUI inspection) seem to be OK.

Best regards
Yaroslav Dynnikov