[tarantool-patches] [PATCH V4] Prohibit router/storage configuration from different fibers

Evgeniy Zaikin evgen4rave at gmail.com
Mon Sep 24 18:22:04 MSK 2018


Subject: [PATCH V4] Prohibit router/storage configuration from different fibers
Fixes: #140
---
Ticket:https://github.com/tarantool/vshard/issues/140
Branch:https://github.com/tarantool/vshard/tree/rave4life/gh-140-prohibit-multiple-cfg

Hi! Thank you for review. This is my next patch version - hope it is
correct. You'll find patch below.

On Thu, 20 Sep 2018 at 22:59, Vladislav Shpilevoy
<v.shpilevoy at tarantool.org> wrote:
>
> Hi! Sorry for the last response - I was on vacation last
> 2 weeks. See 8 comments below.
>
> On 11/09/2018 12:32, Evgeniy Zaikin wrote:
> >  From 5ce83d5db595455a14a58aedf9e6411f5adf1d7c Mon Sep 17 00:00:00 2001
> > From: rave4life <evgen4rave at gmail.com>
> > Date: Fri, 31 Aug 2018 16:41:11 +0300
> > Subject: [PATCH V3] Prohibit router/storage configuration from different fibers
> >
> > Fixes: #140
> > ---
> > Ticket:https://github.com/tarantool/vshard/issues/140
> > Branch:https://github.com/tarantool/vshard/tree/rave4life/gh-140-prohibit-multiple-cfg
> >
>
> > diff --git a/test/router/router.result b/test/router/router.result
> > index 000307d..ada8309 100644
> > --- a/test/router/router.result
> > +++ b/test/router/router.result
> > @@ -1140,6 +1140,45 @@ test_run:drop_cluster(REPLICASET_2)
> >   while test_run:grep_log('router_1', 'disconnected from ') == nil do
> > fiber.sleep(0.1) end
> >   ---
> >   ...
> > +-- gh-140: prohibit vshard.router/storage.cfg from multiple fibers.
>

Fixed.

> 1. Sorry, still out of 80 symbols. It is a strict rule. Please,
> inspect your patch carefully for such code style things before
> sending to a review.
>
> > +cfg = {sharding = require('localcfg').sharding}
> > +---
> > +...
> > +vshard = require('vshard')
> > +---
> > +...
> > +fiber = require('fiber')
>

OK. Switched to 'storage_1_a' for storage and to 'router_1' for router.

> 2. Please, do not use 'default' server for testing
> storage or router. Router/router.test.lua starts multiple
> Tarantool instances: storage_1_a - storage_2_b and router_1
> (see the very beginning of the test file) and switches
> between them to test router or storage things. 'Default'
> server shall not be used for this. Please, use test_run:switch
> and router_1 server for a test. In such a case you will not
> have require('vshard') and require('fiber') since they are
> already loaded on router_1 instance.
>
> There is a reason why 'default' can not be used - once a
> router is started on an instance, it can not be stopped
> without restart (its internal fibers, I mean). For the storage
> things are worse - it creates new spaces, users, functions
> and we have no way to cleanup those things. So once 'default'
> is used for testing, it will affect next tests.
>
> For this problem we have an issue: https://github.com/tarantool/vshard/issues/121
> You can try it next after this patch is finished if you want.
>
> > +---
> > +...
> > +vshard.router.cfg(cfg)
> > +---
> > +...
> > +vshard.router.internal.errinj.ERRINJ_MULTIPLE_CFG = false
>

Set to 'nil' according to your suggestion (8-th comment)

> 3. It is false by default already (as I see in router/init.lua),
> so what is a point of this line? I removed it and the test
> passed ok. I guess it is a typo. Please, fix.
>
> > +---
> > +...
> > +c = fiber.channel(2)
> > +---
> > +...
> > diff --git a/test/storage/storage.result b/test/storage/storage.result
> > index e3e9e7f..0228445 100644
> > --- a/test/storage/storage.result
> > +++ b/test/storage/storage.result
> > @@ -688,6 +688,48 @@ rs:callro('echo', {'some_data'})
> >   - null
> >   - null
> >   ...
> > +-- gh-140: prohibit vshard.router/storage.cfg from multiple fibers.
> > +cfg = {sharding = {}}
> > +---
> > +...
> > +cfg.sharding[box.info.cluster.uuid] = {replicas = {}}
> > +---
> > +...
> > +cfg.sharding[box.info.cluster.uuid].replicas[box.info.uuid] = {uri =
> > 'storage:storage at 127.0.0.1:3309', name = 'test_storage', master =
> > true}
> > +---
> > +...
> > +vshard = require('vshard')
> > +---
> > +...
> > +fiber = require('fiber')
>

OK, removed.

> 4. For the same reason I explained above, you do not
> need 'require' here. This code is executed on 'storage_1_a' where
> both fiber and vshard are loaded already
> (see line storage.test.lua:137 - this is the latest switch before
> your test).
>
> > +---
> > +...
> > +vshard.storage.internal.errinj.ERRINJ_MULTIPLE_CFG = false
>
> 5. Same as 3.
>
> > +---
> > +...
> > +c = fiber.channel(2)
> > +---
> > +...
> > diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> > index 7573801..446b1a0 100644
> > --- a/vshard/router/init.lua
> > +++ b/vshard/router/init.lua
> > @@ -28,6 +28,7 @@ if not M then
> >           ---------------- Common module attributes ----------------
> >           errinj = {
> >               ERRINJ_CFG = false,
> > +            ERRINJ_MULTIPLE_CFG = false,
> >               ERRINJ_FAILOVER_CHANGE_CFG = false,
> >               ERRINJ_RELOAD = false,
> >               ERRINJ_LONG_DISCOVERY = false,
> > @@ -53,6 +54,10 @@ local ROUTER_TEMPLATE = {
> >           name = nil,
> >           -- The last passed configuration.
> >           current_cfg = nil,
> > +        -- Fiber channel as the locker to prevent multiple config.
> > +        cfg_lock = nil,
> > +        -- Configuration counter to control and verify locker.
> > +        cfg_counter = 0,
>

OK, 'cfg_counter' is also removed.

> 6. Please, remove cfg_counter. Its value > 0 is unreachable
> during normal work of router/storage. If you need a counter for
> testing lock, you can use erring.ERRINJ_MULTIPLE_CFG as a counter.
> It is not necessary to make each errinj as a strict boolean. See
> my proposal below in the comment 8.
>
> >           -- Time to outdate old objects on reload.
> >           connection_outdate_delay = nil,
> >           -- Bucket map cache.
> > @@ -594,6 +599,29 @@ local function router_cfg(router, cfg, is_reload)
> >       end
> >   end
> >
> > +--------------------------------------------------------------------------------
> > +-- Configuration wrapper
> > +--------------------------------------------------------------------------------
>
> 7. Function comment should look like this:
>
>      --
>      -- My function description wrapped by 66
>      -- symbols.
>      -- @param name1 Parameter 1 description.
>      -- @param name2 Parameter 2 description.
>      --
>      -- [if returns anything]
>      -- @retval Returned value description.
>      -- @retval Next possible return value description.
>      --
>

OK, comment was added.

> > +
> > +local function cfg_wrapper(router, ...)
> > +    local enable_lock = not M.errinj.ERRINJ_MULTIPLE_CFG
> > +    if enable_lock then
> > +        router.cfg_lock:put(true)
> > +    end
> > +    if router.cfg_counter > 0 then
> > +        error('Error: cfg_counter > 0')
> > +    end
> > +    router.cfg_counter = router.cfg_counter + 1
> > +    local status, err = pcall(router_cfg, router, ...)
> > +    router.cfg_counter = router.cfg_counter - 1
> > +    if enable_lock then
> > +        router.cfg_lock:get()
> > +    end
> > +    if not status then
> > +        error(err)
> > +    end
> > +end
>
> 8. Too complex. Lets use errinj.ERRINJ_MULTIPLE_CFG as
> an integer and set it to nil by default
>
> M = {
>   ...
>   errinj = {
>      ...
>      ERRINJ_MULTIPLE_CFG = nil
>   }
>   ...
> }
>
> ...
>
> function cfg_wrapper()
>      if ERRINJ_MULTIPLE_CFG then
>           ERRINJ_MULTIPLE_CFG = ERRINJ_MULTIPLE_CFG + 1
>           if ERRINJ_MULTIPLE_CFG > 1 then
>               error('Multiple cfgs at the same time')
>           end
>           status, err = pcall(router_cfg, router, ...)
>           ERRINJ_MULTIPLE_CFG = ERRINJ_MULTIPLE_CFG - 1
>      else
>          lock:put(true)
>          status, err = pcall(router_cfg, router, ...)
>          lock:get()
>      end
>      if not status then
>          error(err)
>      end
>      return
> end
>
>
> ERRINJ_MULTIPLE_CFG is nil by default and to active it
> you can do this in a test:
>
>      ERRINJ_MULTIPLE_CFG = 0
>
> This will trigger counter in my pseudo code above since '0'
> in Lua is true.
>

Yes, placed new code instead.

So, my fixed patch is below.

Changes in v4:
  - Checked code line length for 80 symbols
  - Switched to appropriate Tarantool instances in test
  - Error injection works now as a triggered counter
  - Added comment to wrapper function

---
 test/router/router.result     | 38 +++++++++++++++++++++++++++++++++++
 test/router/router.test.lua   | 16 +++++++++++++++
 test/storage/storage.result   | 35 ++++++++++++++++++++++++++++++++
 test/storage/storage.test.lua | 16 +++++++++++++++
 vshard/router/init.lua        | 38 ++++++++++++++++++++++++++++++++---
 vshard/storage/init.lua       | 34 +++++++++++++++++++++++++++++--
 6 files changed, 172 insertions(+), 5 deletions(-)

diff --git a/test/router/router.result b/test/router/router.result
index 000307d..5ce48b1 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -1130,6 +1130,44 @@ vshard.router.static:route(1):callro('echo',
{'some_data'})
 - null
 - null
 ...
+-- gh-140: prohibit vshard.router/storage.cfg from multiple fibers.
+cfg = {sharding = require('localcfg').sharding}
+---
+...
+vshard.router.cfg(cfg)
+---
+...
+c = fiber.channel(2)
+---
+...
+vshard.router.internal.errinj.ERRINJ_MULTIPLE_CFG = 0
+---
+...
+f = function()
+    fiber.create(function()
+        local status, err = pcall(vshard.router.cfg, cfg)
+        c:put(status and true or false)
+    end)
+end
+---
+...
+f()
+---
+...
+f()
+---
+...
+c:get()
+---
+- true
+...
+c:get()
+---
+- true
+...
+vshard.router.internal.errinj.ERRINJ_MULTIPLE_CFG = nil
+---
+...
 _ = test_run:switch("default")
 ---
 ...
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index 9f32d3e..5590f95 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -434,6 +434,22 @@ vshard.router.route(1):callro('echo', {'some_data'})
 -- object.
 vshard.router.static:route(1):callro('echo', {'some_data'})

+-- gh-140: prohibit vshard.router/storage.cfg from multiple fibers.
+cfg = {sharding = require('localcfg').sharding}
+vshard.router.cfg(cfg)
+vshard.router.internal.errinj.ERRINJ_MULTIPLE_CFG = 0
+c = fiber.channel(2)
+f = function()
+    fiber.create(function()
+        local status, err = pcall(vshard.router.cfg, cfg)
+        c:put(status and true or false)
+    end)
+end
+f()
+f()
+c:get()
+c:get()
+vshard.router.internal.errinj.ERRINJ_MULTIPLE_CFG = nil
 _ = test_run:switch("default")
 test_run:drop_cluster(REPLICASET_2)

diff --git a/test/storage/storage.result b/test/storage/storage.result
index e3e9e7f..b528b36 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -688,6 +688,41 @@ rs:callro('echo', {'some_data'})
 - null
 - null
 ...
+-- gh-140: prohibit storage.cfg from multiple fibers.
+_ = test_run:switch("storage_1_a")
+---
+...
+vshard.storage.internal.errinj.ERRINJ_MULTIPLE_CFG = 0
+---
+...
+c = fiber.channel(2)
+---
+...
+f = function()
+    fiber.create(function()
+        local status, err = pcall(vshard.storage.cfg, cfg, box.info.uuid)
+        c:put(status and true or false)
+    end)
+end
+---
+...
+f()
+---
+...
+f()
+---
+...
+c:get()
+---
+- true
+...
+c:get()
+---
+- true
+...
+vshard.storage.internal.errinj.ERRINJ_MULTIPLE_CFG = nil
+---
+...
 _ = test_run:switch("default")
 ---
 ...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index c07730b..12fff36 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -182,6 +182,22 @@ util.has_same_fields(old_internal, vshard.storage.internal)
 _, rs = next(vshard.storage.internal.replicasets)
 rs:callro('echo', {'some_data'})

+-- gh-140: prohibit vshard.router/storage.cfg from multiple fibers.
+_ = test_run:switch("storage_1_a")
+vshard.storage.internal.errinj.ERRINJ_MULTIPLE_CFG = 0
+c = fiber.channel(2)
+f = function()
+    fiber.create(function()
+        local status, err = pcall(vshard.storage.cfg, cfg, box.info.uuid)
+        c:put(status and true or false)
+    end)
+end
+f()
+f()
+c:get()
+c:get()
+vshard.storage.internal.errinj.ERRINJ_MULTIPLE_CFG = nil
+
 _ = test_run:switch("default")
 test_run:drop_cluster(REPLICASET_2)
 test_run:drop_cluster(REPLICASET_1)
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index 7573801..93bef77 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -28,6 +28,7 @@ if not M then
         ---------------- Common module attributes ----------------
         errinj = {
             ERRINJ_CFG = false,
+            ERRINJ_MULTIPLE_CFG = nil,
             ERRINJ_FAILOVER_CHANGE_CFG = false,
             ERRINJ_RELOAD = false,
             ERRINJ_LONG_DISCOVERY = false,
@@ -53,6 +54,8 @@ local ROUTER_TEMPLATE = {
         name = nil,
         -- The last passed configuration.
         current_cfg = nil,
+        -- Fiber channel as the locker to prevent multiple config.
+        cfg_lock = nil,
         -- Time to outdate old objects on reload.
         connection_outdate_delay = nil,
         -- Bucket map cache.
@@ -594,6 +597,34 @@ local function router_cfg(router, cfg, is_reload)
     end
 end

+--
+-- Wrapper around 'router_cfg' function to provide
+-- locking beafore / unlocking after pcall(router_cfg, router, ...),
+-- so it prevents 'router_cfg' been called from diffenrent fibers.
+-- @param router Router instance to be configured.
+-- @param ... Arguments which would be passed to the `router_cfg`.
+-- @retval true, ... on success
+--
+local function cfg_wrapper(router, ...)
+    local status, err
+    if M.errinj.ERRINJ_MULTIPLE_CFG then
+        M.errinj.ERRINJ_MULTIPLE_CFG = M.errinj.ERRINJ_MULTIPLE_CFG + 1
+        if M.errinj.ERRINJ_MULTIPLE_CFG > 1 then
+            error('Multiple cfgs at the same time')
+        end
+        status, err = pcall(router_cfg, router, ...)
+        M.errinj.ERRINJ_MULTIPLE_CFG = M.errinj.ERRINJ_MULTIPLE_CFG - 1
+    else
+        router.cfg_lock:put(true)
+        status, err = pcall(router_cfg, router, ...)
+        router.cfg_lock:get()
+    end
+    if not status then
+        error(err)
+    end
+    return true
+end
+
 --------------------------------------------------------------------------------
 -- Bootstrap
 --------------------------------------------------------------------------------
@@ -862,7 +893,7 @@ end

 local router_mt = {
     __index = {
-        cfg = function(router, cfg) return router_cfg(router, cfg, false) end,
+        cfg = function(router, cfg) return cfg_wrapper(router, cfg) end,
         info = router_info;
         buckets_info = router_buckets_info;
         call = router_call;
@@ -920,6 +951,7 @@ local function router_new(name, cfg)
     local router = table.deepcopy(ROUTER_TEMPLATE)
     setmetatable(router, router_mt)
     router.name = name
+    router.cfg_lock = lfiber.channel(1)
     M.routers[name] = router
     local ok, err = pcall(router_cfg, router, cfg)
     if not ok then
@@ -936,7 +968,7 @@ end
 local function legacy_cfg(cfg)
     if M.static_router then
         -- Reconfigure.
-        router_cfg(M.static_router, cfg, false)
+        cfg_wrapper(M.static_router, cfg, false)
     else
         -- Create new static instance.
         local router, err = router_new(STATIC_ROUTER_NAME, cfg)
@@ -961,7 +993,7 @@ if not rawget(_G, MODULE_INTERNALS) then
     rawset(_G, MODULE_INTERNALS, M)
 else
     for _, router in pairs(M.routers) do
-        router_cfg(router, router.current_cfg, true)
+        cfg_wrapper(router, router.current_cfg, true)
         setmetatable(router, router_mt)
     end
     if M.static_router then
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index dc059ba..3b1b525 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -48,6 +48,8 @@ if not M then
         ---------------- Common module attributes ----------------
         -- The last passed configuration.
         current_cfg = nil,
+        -- Fiber channel as the locker to prevent multiple config.
+        cfg_lock = lfiber.channel(1),
         --
         -- All known replicasets used for bucket re-balancing.
         -- See format in replicaset.lua.
@@ -66,6 +68,7 @@ if not M then
         total_bucket_count = 0,
         errinj = {
             ERRINJ_CFG = false,
+            ERRINJ_MULTIPLE_CFG = nil,
             ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
             ERRINJ_RELOAD = false,
             ERRINJ_CFG_DELAY = false,
@@ -1843,6 +1846,33 @@ local function storage_cfg(cfg,
this_replica_uuid, is_reload)
     collectgarbage()
 end

+--
+-- Wrapper around 'storage_cfg' function to provide
+-- locking beafore / unlocking after pcall(storage_cfg, ...),
+-- so it prevents 'storage_cfg' been called from diffenrent fibers.
+-- @param ... Arguments which would be passed to the `storage_cfg`.
+-- @retval true, ... on success
+--
+local function storage_cfg_wrapper(...)
+    local status, err
+    if M.errinj.ERRINJ_MULTIPLE_CFG then
+        M.errinj.ERRINJ_MULTIPLE_CFG = M.errinj.ERRINJ_MULTIPLE_CFG + 1
+        if M.errinj.ERRINJ_MULTIPLE_CFG > 1 then
+            error('Multiple cfgs at the same time')
+        end
+        status, err = pcall(storage_cfg, ...)
+        M.errinj.ERRINJ_MULTIPLE_CFG = M.errinj.ERRINJ_MULTIPLE_CFG - 1
+    else
+        M.cfg_lock:put(true)
+        status, err = pcall(storage_cfg, ...)
+        M.cfg_lock:get()
+    end
+    if not status then
+        error(err)
+    end
+    return true
+end
+
 --------------------------------------------------------------------------------
 -- Monitoring
 --------------------------------------------------------------------------------
@@ -2060,7 +2090,7 @@ if not rawget(_G, MODULE_INTERNALS) then
     rawset(_G, MODULE_INTERNALS, M)
 else
     reload_evolution.upgrade(M)
-    storage_cfg(M.current_cfg, M.this_replica.uuid, true)
+    storage_cfg_wrapper(M.current_cfg, M.this_replica.uuid, true)
     M.module_version = M.module_version + 1
 end

@@ -2103,7 +2133,7 @@ return {
     rebalancing_is_in_progress = rebalancing_is_in_progress,
     recovery_wakeup = recovery_wakeup,
     call = storage_call,
-    cfg = function(cfg, uuid) return storage_cfg(cfg, uuid, false) end,
+    cfg = function(cfg, uuid) return storage_cfg_wrapper(cfg, uuid, false) end,
     info = storage_info,
     buckets_info = storage_buckets_info,
     buckets_count = storage_buckets_count,
-- 
2.17.1




More information about the Tarantool-patches mailing list