[tarantool-patches] Re: [PATCH 1/4] Fix races related to object outdating
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Wed Aug 1 15:36:34 MSK 2018
Hi! Thanks for the patch! See 2 comments below.
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 142ddb6..1a0ed2f 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -88,15 +94,18 @@ local function bucket_discovery(bucket_id)
> log.verbose("Discovering bucket %d", bucket_id)
> local last_err = nil
> local unreachable_uuid = nil
> - for uuid, replicaset in pairs(M.replicasets) do
> - local _, err =
> - replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
> - if err == nil then
> - bucket_set(bucket_id, replicaset)
> - return replicaset
> - elseif err.code ~= lerror.code.WRONG_BUCKET then
> - last_err = err
> - unreachable_uuid = uuid
> + for uuid, _ in pairs(M.replicasets) do
> + -- Handle reload/reconfigure.
> + replicaset = M.replicasets[uuid]
1. Please explain how M.replicasets[uuid] can become nil
before this line. You iterate over M.replicasets here, and
on the previous line the loop has already stored the
replicaset in '_'. How can 'replicaset' differ from '_' here?
This has nothing to do with the 'reload/reconfigure' case,
since you always iterate over M.replicasets, the most
up-to-date list of replicasets. Maybe you thought that pairs()
stores its first argument in a temporary variable, but it looks
like it does not. I checked it with a simple test:
    a = {}
    a.objs = {1, 2, 3, 4, 5}
    for k, v in pairs(a.objs) do
        print(a.objs[k])
        if k == 2 then
            a.objs = {6, 7, 8, 9, 10}
        end
    end
Output:
1
2
8
9
10
---
...
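The test above prints a.objs[k], not the loop variable 'v', so it
may help to look at both side by side. Below is a minimal sketch in
plain Lua (no Tarantool needed); the names 'original', 'seen_v' and
'seen_field' are made up for this illustration. In standard Lua the
expression pairs(a.objs) is evaluated once, before the first
iteration, so the loop keeps walking the table it was given even
after a.objs is reassigned:

```lua
-- Variant of the test above: record both the loop value 'v' and the
-- value read through a.objs[k] on every iteration.
local a = {}
local original = {1, 2, 3, 4, 5}
a.objs = original

local seen_v = {}      -- values delivered by the iterator
local seen_field = {}  -- values read through a.objs at that moment

for k, v in pairs(a.objs) do
    seen_v[k] = v
    seen_field[k] = a.objs[k]
    print(k, v, a.objs[k])
    if k == 2 then
        -- Rebind the field. The table being traversed is untouched.
        a.objs = {6, 7, 8, 9, 10}
    end
end
```

Here 'v' always comes from the table that was passed to pairs()
when the loop started, while a.objs[k] follows whatever table
a.objs refers to at that moment. The output 1, 2, 8, 9, 10 above is
consistent with this: the keys still come from the original table
and only the lookups go to the new one.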
> + if replicaset then
> + local _, err =
> + replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
> + if err == nil then
> + return bucket_set(bucket_id, replicaset.uuid)
> + elseif err.code ~= lerror.code.WRONG_BUCKET then
> + last_err = err
> + unreachable_uuid = uuid
> + end
> end
> end
> local err = nil
> @@ -513,27 +522,28 @@ local function router_cfg(cfg)
> end
> box.cfg(box_cfg)
> log.info("Box has been configured")
> - M.connection_outdate_delay = cfg.connection_outdate_delay
> - M.total_bucket_count = total_bucket_count
> - M.collect_lua_garbage = collect_lua_garbage
> - M.current_cfg = new_cfg
> -- Move connections from an old configuration to a new one.
> -- It must be done with no yields to prevent usage both of not
> -- fully moved old replicasets, and not fully built new ones.
> - lreplicaset.rebind_replicasets(new_replicasets, M.replicasets,
> - M.connection_outdate_delay)
> - M.replicasets = new_replicasets
> + lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
> -- Now the new replicasets are fully built. Can establish
> -- connections and yield.
> for _, replicaset in pairs(new_replicasets) do
> replicaset:connect_all()
> end
> + lreplicaset.wait_masters_connect(new_replicasets)
> + lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
> + M.connection_outdate_delay = cfg.connection_outdate_delay
> + M.total_bucket_count = total_bucket_count
> + M.collect_lua_garbage = collect_lua_garbage
> + M.current_cfg = cfg
> + M.replicasets = new_replicasets
> -- Update existing route map in-place.
> - for bucket, rs in pairs(M.route_map) do
> + local old_route_map = M.route_map
> + M.route_map = {}
> + for bucket, rs in pairs(old_route_map) do
> M.route_map[bucket] = M.replicasets[rs.uuid]
2. Why do you need to save old_route_map into a
separate variable? You can update M.route_map in place,
as it was done before, can't you? You fill the new
route_map with exactly the same keys (maybe with different
values, but that does not affect the 'for' iteration). Moreover,
when you create a new route_map instead of resetting the
old one, you double the memory usage. For a huge bucket count
it can be noticeable.
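The in-place variant relies on a rule from the Lua reference manual:
during a traversal with next()/pairs() existing fields may be
modified or cleared, and only assignment to a non-existent field is
undefined. A minimal sketch, with hypothetical tables standing in
for M.route_map and the old and new M.replicasets (the 'generation'
field exists only to tell the objects apart):

```lua
-- Hypothetical stand-ins for the old and new replicaset objects.
local old_replicasets = {
    ['uuid-1'] = {uuid = 'uuid-1', generation = 'old'},
    ['uuid-2'] = {uuid = 'uuid-2', generation = 'old'},
}
local new_replicasets = {
    ['uuid-1'] = {uuid = 'uuid-1', generation = 'new'},
    ['uuid-2'] = {uuid = 'uuid-2', generation = 'new'},
}

-- bucket id -> replicaset object, initially pointing at old objects.
local route_map = {
    [1] = old_replicasets['uuid-1'],
    [2] = old_replicasets['uuid-2'],
    [3] = old_replicasets['uuid-1'],
}

-- Rebind every bucket to the new object with the same uuid. Only
-- existing keys are assigned, so the traversal stays well defined
-- and no second table is allocated.
for bucket, rs in pairs(route_map) do
    route_map[bucket] = new_replicasets[rs.uuid]
end
```

If a replicaset is missing from the new configuration, the
assignment stores nil and the bucket simply drops out of the map,
which is also allowed during traversal.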
> end
> -
> - lreplicaset.wait_masters_connect(new_replicasets)
> if M.failover_fiber == nil then
> lfiber.create(util.reloadable_fiber_f, M, 'failover_f', 'Failover')
> end