[tarantool-patches] Re: [PATCH 1/4] Fix races related to object outdating

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Thu Aug 2 14:51:48 MSK 2018


Thanks for the patch! Pushed into the master.

On 01/08/2018 20:44, Alex Khatskevich wrote:
> 
> 
> On 01.08.2018 15:36, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patch! See 2 comments below.
>>
>>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
>>> index 142ddb6..1a0ed2f 100644
>>> --- a/vshard/router/init.lua
>>> +++ b/vshard/router/init.lua
>>> @@ -88,15 +94,18 @@ local function bucket_discovery(bucket_id)
>>>      log.verbose("Discovering bucket %d", bucket_id)
>>>      local last_err = nil
>>>      local unreachable_uuid = nil
>>> -    for uuid, replicaset in pairs(M.replicasets) do
>>> -        local _, err =
>>> -            replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
>>> -        if err == nil then
>>> -            bucket_set(bucket_id, replicaset)
>>> -            return replicaset
>>> -        elseif err.code ~= lerror.code.WRONG_BUCKET then
>>> -            last_err = err
>>> -            unreachable_uuid = uuid
>>> +    for uuid, _ in pairs(M.replicasets) do
>>> +        -- Handle reload/reconfigure.
>>> +        replicaset = M.replicasets[uuid]
>>
>> 1. Please explain how M.replicasets[uuid] can become nil
>> before this line. You iterate over M.replicasets, and on the
>> previous line the loop variable '_' already holds the
>> replicaset. How can 'replicaset' differ from '_' here?
>>
>> This has nothing to do with the 'reload/reconfigure' case, since
>> you always iterate over M.replicasets, the most up-to-date
>> list of replicasets. Maybe you thought that pairs() stores
>> its first argument in a temporary variable, but it looks like
>> it does not. I checked it with a simple test:
>>
>>     a = {}
>>     a.objs = {1, 2, 3, 4, 5}
>>     for k, v in pairs(a.objs) do
>>         print(a.objs[k])
>>         if k == 2 then
>>             a.objs = {6, 7, 8, 9, 10}
>>         end
>>     end
>>
>>     Output:
>>     1
>>     2
>>     8
>>     9
>>     10
>>     ---
>>     ...
> Thanks. For some reason I was sure that pairs caches the table.
> Fixed.
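
In the test above both tables share the keys 1..5, so it cannot show
on its own which table the loop walks. The sketch below is a variation
that replaces the table with one that has different keys. It relies on
a rule from the Lua reference manual: the generic 'for' evaluates its
expression list once, so pairs() receives whatever table the
expression referred to when the loop started, while an index
expression in the loop body is looked up again on every step. The
table M and its contents are made up for illustration and are not
vshard code.

```lua
-- Hypothetical stand-in for M.replicasets; not the real vshard module.
local M = {replicasets = {a = 'old_a', b = 'old_b', c = 'old_c'}}
local seen = {}
local swapped = false

for uuid, old_value in pairs(M.replicasets) do
    -- Re-read through M on every step.
    local current = M.replicasets[uuid]
    table.insert(seen, {uuid = uuid, old = old_value, current = current})
    if not swapped then
        -- Simulate a reconfiguration that replaces the whole table.
        M.replicasets = {x = 'new_x'}
        swapped = true
    end
end

-- The loop still visits all three keys of the table it started with.
print(#seen)
for _, step in ipairs(seen) do
    print(step.uuid, step.old, step.current)
end
```

After the swap, the loop variable still yields values from the
original table, while the lookup through M returns nil for keys that
the new table does not have.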
>>> +        if replicaset then
>>> +            local _, err =
>>> +                replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
>>> +            if err == nil then
>>> +                return bucket_set(bucket_id, replicaset.uuid)
>>> +            elseif err.code ~= lerror.code.WRONG_BUCKET then
>>> +                last_err = err
>>> +                unreachable_uuid = uuid
>>> +            end
>>>          end
>>>      end
>>>      local err = nil
>>> @@ -513,27 +522,28 @@ local function router_cfg(cfg)
>>>      end
>>>      box.cfg(box_cfg)
>>>      log.info("Box has been configured")
>>> -    M.connection_outdate_delay = cfg.connection_outdate_delay
>>> -    M.total_bucket_count = total_bucket_count
>>> -    M.collect_lua_garbage = collect_lua_garbage
>>> -    M.current_cfg = new_cfg
>>>      -- Move connections from an old configuration to a new one.
>>>      -- It must be done with no yields to prevent usage both of not
>>>      -- fully moved old replicasets, and not fully built new ones.
>>> -    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets,
>>> -                                   M.connection_outdate_delay)
>>> -    M.replicasets = new_replicasets
>>> +    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>>>      -- Now the new replicasets are fully built. Can establish
>>>      -- connections and yield.
>>>      for _, replicaset in pairs(new_replicasets) do
>>>          replicaset:connect_all()
>>>      end
>>> +    lreplicaset.wait_masters_connect(new_replicasets)
>>> +    lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
>>> +    M.connection_outdate_delay = cfg.connection_outdate_delay
>>> +    M.total_bucket_count = total_bucket_count
>>> +    M.collect_lua_garbage = collect_lua_garbage
>>> +    M.current_cfg = cfg
>>> +    M.replicasets = new_replicasets
>>>      -- Update existing route map in-place.
>>> -    for bucket, rs in pairs(M.route_map) do
>>> +    local old_route_map = M.route_map
>>> +    M.route_map = {}
>>> +    for bucket, rs in pairs(old_route_map) do
>>>          M.route_map[bucket] = M.replicasets[rs.uuid]
>>
>> 2. Why do you need to save old_route_map into a
>> separate variable? You can update M.route_map in place,
>> as it was done before, can't you? You fill the new
>> route_map with exactly the same keys (maybe with different
>> values, but that does not affect the 'for' iteration). Moreover,
>> when you create a new route_map instead of updating the
>> old one, you double the memory usage. For a huge bucket count
>> it can be noticeable.
> Yes, there is no point in that change.
> Fixed.
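
A minimal sketch of why the in-place update in comment 2 is safe: the
Lua reference manual allows assigning to existing fields (and clearing
them) while traversing a table with pairs(); only assigning to a
non-existent field during the traversal leads to undefined behavior.
The tables and the 'generation' field below are hypothetical stand-ins
and are not taken from vshard.

```lua
-- Hypothetical data: route_map maps a bucket id to a replicaset object.
local old_rs = {uuid = 'rs1', generation = 'old'}
local new_replicasets = {rs1 = {uuid = 'rs1', generation = 'new'}}
local route_map = {[1] = old_rs, [2] = old_rs, [3] = old_rs}

local visited = 0
for bucket, rs in pairs(route_map) do
    -- Overwriting the value of an existing key is allowed during
    -- traversal; no second table is allocated.
    route_map[bucket] = new_replicasets[rs.uuid]
    visited = visited + 1
end

print(visited, route_map[1].generation)
```

If the new configuration lacked a replicaset with that uuid, the
assignment would store nil and clear the entry, which the manual also
permits during traversal.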
>>>      end
>>> -
>>> -    lreplicaset.wait_masters_connect(new_replicasets)
>>>      if M.failover_fiber == nil then
>>>          lfiber.create(util.reloadable_fiber_f, M, 'failover_f', 'Failover')
>>>      end
> 
> 



