From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [tarantool-patches] Re: [PATCH 1/4] Fix races related to object outdating
References: <18f2ede05fa4a77bf0bd2abb64c25df0e3c574d6.1532940401.git.avkhatskevich@tarantool.org>
 <4a8e9e20-561d-6896-ea8c-8517add2bc50@tarantool.org>
 <1eeb6917-0153-919d-de56-c21052ea42f9@tarantool.org>
 <39dccfbe-8d6a-ffe7-b841-efa0c5352697@tarantool.org>
From: Vladislav Shpilevoy
Date: Thu, 2 Aug 2018 14:51:48 +0300
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches
To: tarantool-patches@freelists.org, Alex Khatskevich

Thanks for the patch! Pushed into the master.

On 01/08/2018 20:44, Alex Khatskevich wrote:
>
>
> On 01.08.2018 15:36, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the patch! See 2 comments below.
>>
>>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
>>> index 142ddb6..1a0ed2f 100644
>>> --- a/vshard/router/init.lua
>>> +++ b/vshard/router/init.lua
>>> @@ -88,15 +94,18 @@ local function bucket_discovery(bucket_id)
>>>      log.verbose("Discovering bucket %d", bucket_id)
>>>      local last_err = nil
>>>      local unreachable_uuid = nil
>>> -    for uuid, replicaset in pairs(M.replicasets) do
>>> -        local _, err =
>>> -            replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
>>> -        if err == nil then
>>> -            bucket_set(bucket_id, replicaset)
>>> -            return replicaset
>>> -        elseif err.code ~= lerror.code.WRONG_BUCKET then
>>> -            last_err = err
>>> -            unreachable_uuid = uuid
>>> +    for uuid, _ in pairs(M.replicasets) do
>>> +        -- Handle reload/reconfigure.
>>> +        replicaset = M.replicasets[uuid]
>>
>> 1. Please, explain, how is it possible, that before this line
>> M.replicasets[uuid] can become nil. You iterate here over
>> M.replicasets and on the previous line in '_' you have
>> stored replicaset. How can here 'replicaset' differ from '_'?
>>
>> It has nothing in common with 'reload/reconfigure' case since
>> you always iterate over M.replicasets - the most actual
>> list of replicasets. Maybe you thought that pairs() stores
>> its first argument into a temporary variable but looks like
>> it is not. I checked it with a simple test:
>>
>>     a = {}
>>     a.objs = {1, 2, 3, 4, 5}
>>     for k, v in pairs(a.objs) do
>>         print(a.objs[k])
>>         if k == 2 then
>>             a.objs = {6, 7, 8, 9, 10}
>>         end
>>     end
>>
>>     Output:
>>     1
>>     2
>>     8
>>     9
>>     10
>>     ---
>>     ...
> Thanks. For some reason I was sure that pairs caches the table.
> Fixed.
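A note on the test in comment 1: it prints 'a.objs[k]', which re-reads the field on every iteration, so the output alone does not show which table pairs() is walking. In stock Lua 5.1 and LuaJIT the generic 'for' evaluates its expression list once, so the iterator keeps traversing the table it was originally given. The sketch below is only an illustration under that assumption (the names 'seen' and 'looked_up' are hypothetical, not from the patch); it records both the value yielded by the iterator and the value looked up through the rebound field:

```lua
-- Sketch assuming stock Lua 5.1 / LuaJIT semantics.
local a = {objs = {1, 2, 3, 4, 5}}
local seen = {}       -- value yielded by the iterator for each key
local looked_up = {}  -- value read through a.objs[k] at that moment

for k, v in pairs(a.objs) do
    seen[k] = v
    looked_up[k] = a.objs[k]
    if k == 2 then
        -- Rebind the field; the table being traversed is untouched.
        a.objs = {6, 7, 8, 9, 10}
    end
end

-- Print key, iterator value, and looked-up value side by side.
for k = 1, 5 do
    print(k, seen[k], looked_up[k])
end
```

The iterator keeps yielding the values of the original table, while lookups made after the rebinding return values from the new one, so inside such a loop the loop variable and a fresh lookup can refer to different objects.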
>>> +        if replicaset then
>>> +            local _, err =
>>> +                replicaset:callrw('vshard.storage.bucket_stat', {bucket_id})
>>> +            if err == nil then
>>> +                return bucket_set(bucket_id, replicaset.uuid)
>>> +            elseif err.code ~= lerror.code.WRONG_BUCKET then
>>> +                last_err = err
>>> +                unreachable_uuid = uuid
>>> +            end
>>>          end
>>>      end
>>>      local err = nil
>>> @@ -513,27 +522,28 @@ local function router_cfg(cfg)
>>>      end
>>>      box.cfg(box_cfg)
>>>      log.info("Box has been configured")
>>> -    M.connection_outdate_delay = cfg.connection_outdate_delay
>>> -    M.total_bucket_count = total_bucket_count
>>> -    M.collect_lua_garbage = collect_lua_garbage
>>> -    M.current_cfg = new_cfg
>>>      -- Move connections from an old configuration to a new one.
>>>      -- It must be done with no yields to prevent usage both of not
>>>      -- fully moved old replicasets, and not fully built new ones.
>>> -    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets,
>>> -                                   M.connection_outdate_delay)
>>> -    M.replicasets = new_replicasets
>>> +    lreplicaset.rebind_replicasets(new_replicasets, M.replicasets)
>>>      -- Now the new replicasets are fully built. Can establish
>>>      -- connections and yield.
>>>      for _, replicaset in pairs(new_replicasets) do
>>>          replicaset:connect_all()
>>>      end
>>> +    lreplicaset.wait_masters_connect(new_replicasets)
>>> +    lreplicaset.outdate_replicasets(M.replicasets, cfg.connection_outdate_delay)
>>> +    M.connection_outdate_delay = cfg.connection_outdate_delay
>>> +    M.total_bucket_count = total_bucket_count
>>> +    M.collect_lua_garbage = collect_lua_garbage
>>> +    M.current_cfg = cfg
>>> +    M.replicasets = new_replicasets
>>>      -- Update existing route map in-place.
>>> -    for bucket, rs in pairs(M.route_map) do
>>> +    local old_route_map = M.route_map
>>> +    M.route_map = {}
>>> +    for bucket, rs in pairs(old_route_map) do
>>>          M.route_map[bucket] = M.replicasets[rs.uuid]
>>
>> 2. Why do you need to save old_route_map into a
>> separate variable? You can update M.route_map in place
>> like it was done before. It is not? You fill the new
>> route_map with exactly the same keys (maybe with different
>> values, but it does not affect 'for' iteration). Moreover,
>> when you create a new route_map instead of resetting the
>> old, you double the memory. For huge bucket count it can
>> be noticeable.
> Yes there is no point in that change.
> Fixed.
>>>      end
>>> -
>>> -    lreplicaset.wait_masters_connect(new_replicasets)
>>>      if M.failover_fiber == nil then
>>>          lfiber.create(util.reloadable_fiber_f, M, 'failover_f', 'Failover')
>>>      end
>
>
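Comment 2 relies on a rule from the Lua reference manual: existing fields of a table may be modified (or cleared) during traversal, and only assigning to a non-existent field makes the behavior of next() undefined. A minimal sketch of the in-place update, using hypothetical stand-in tables (new_replicasets, route_map and the 'generation' field are illustrative and are not vshard structures):

```lua
-- Hypothetical stand-ins for M.replicasets and M.route_map.
local new_replicasets = {
    ['uuid-a'] = {uuid = 'uuid-a', generation = 2},
    ['uuid-b'] = {uuid = 'uuid-b', generation = 2},
}
local route_map = {
    [1] = {uuid = 'uuid-a', generation = 1},
    [2] = {uuid = 'uuid-b', generation = 1},
    [3] = {uuid = 'uuid-a', generation = 1},
}
local map_before = route_map -- to show that no second table is created

for bucket, rs in pairs(route_map) do
    -- Only existing keys are assigned, so the traversal stays valid.
    route_map[bucket] = new_replicasets[rs.uuid]
end

print(route_map == map_before, route_map[1].generation)
```

Because the same table is reused, no second map of the same size is allocated, which is the memory argument made in the comment.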