From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 62E1A6EC40; Mon, 5 Jul 2021 12:24:50 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 62E1A6EC40 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1625477090; bh=txMmgnykxj/YM0fpZEbVacoqWO2+Mlc/iMxnX4VkQAo=; h=To:References:Date:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=GRtqXtcMTVNqBNZnkE7Chwr74sNhT+tpWet2tRvQlJsdq3YOCa4sCUSt/Du81M2j7 juNblFJ1m98RDaEYB6wMmHmXSVkIvKWIfRaa2O01PWYy5BouyYq9zAMzAVaJQm+5sm pLqviLsBvM4E7+895BjYr+NyQ+aBgdGcGHPVDLAc= Received: from smtp54.i.mail.ru (smtp54.i.mail.ru [217.69.128.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 7BC8C6EC40 for ; Mon, 5 Jul 2021 12:24:08 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 7BC8C6EC40 Received: by smtp54.i.mail.ru with esmtpa (envelope-from ) id 1m0Kpj-0001LM-Je; Mon, 05 Jul 2021 12:24:08 +0300 To: Vladislav Shpilevoy , tarantool-patches@dev.tarantool.org, yaroslav.dynnikov@tarantool.org References: <6d8c2a728366edf5b0d208aeed9e027f870aa699.1625177222.git.v.shpilevoy@tarantool.org> <900c73e7-9824-ad3c-88f8-aa07b0382986@tarantool.org> Message-ID: Date: Mon, 5 Jul 2021 12:24:07 +0300 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <900c73e7-9824-ad3c-88f8-aa07b0382986@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB X-4EC0790: 10 X-7564579A: 646B95376F6C166E X-77F55803: 4F1203BC0FB41BD954DFF1DC42D673FB2F1AA0EB8A504C8721532AB396CDCF09182A05F538085040498F9459629653B4ECB2AB569C995C9F8B26A86D915C37CFC2576E2883C1D16C X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE797F4D2EDC29AFAF7EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637E6006D770ADD73CF8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D86B711F32C2D202745A04E5C2361A6AC8117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCAE9A1BBD95851C5BA471835C12D1D9774AD6D5ED66289B52BA9C0B312567BB23117882F44604297287769387670735201E561CDFBCA1751FBDFBBEFFF4125B51D2E47CDBA5A96583BA9C0B312567BB2376E601842F6C81A19E625A9149C048EE4B6963042765DA4B1B780A39BCC1DD35D8FC6C240DEA7642DBF02ECDB25306B2B78CF848AE20165D0A6AB1C7CE11FEE3724336BCC0EE1BA86136E347CC761E07C4224003CC836476EA7A3FFF5B025636E2021AF6380DFAD1A18204E546F3947CB11811A4A51E3B096D1867E19FE1407959CC434672EE6371089D37D7C0E48F6C8AA50765F7900637A7EFCB0EB5ACB161EFF80C71ABB335746BA297DBC24807EABDAD6C7F3747799A X-C1DE0DAB: 0D63561A33F958A52A33DD463EFFCC417F663304FE03053FE16474A598DF91AED59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA7569E77FCA7B33833F410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34D041FB2E16F174C42D28B35448B87E5334587BC3AE172B0142F1056A836B6A8082AA0E74409CF5E81D7E09C32AA3244C8DEE405485A276457ED590DCB42A7CDBA995755A1445935EFACE5A9C96DEB163 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2bioj5fH2RN9TpJmxVBfXsUYDPA== X-Mailru-Sender: 583F1D7ACE8F49BD07526C4546A62CBF9668AFE930FA8EF3A874B1F5B765A1F4D87712F1FCB747DE23E75C7104EB1B885DEE61814008E47C7013064206BFB89F93956FB04BA385BE9437F6177E88F7363CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH vshard 5/6] router: introduce automatic master discovery X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Oleg Babin via Tarantool-patches Reply-To: Oleg Babin Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Thanks for your answers. See my answers below. On 03.07.2021 00:35, Vladislav Shpilevoy wrote: > Thanks for the review! > >>> This is getting especially tricky with built-in automatic master >>> elections which are not supported by vshard yet, but they will be, >>> they won't depend on any config. Master role won't be pinned to >>> one node then. >> Is it possible to implement some simple test that shows if RAFT changes >> >> a leader, vshard catches this changes? Of course, it will be test for new Tarantool versions. > Nope. Because the storages do not support it yet. 'Master' attribute > can only be set via the config now, unfortunately. This will change > in the future, but it is blocked by one of the bugs (about bucket refs > being useless on replicas). Thanks for answer. >>> diff --git a/vshard/error.lua b/vshard/error.lua >>> index bcbcd71..e2d1a31 100644 >>> --- a/vshard/error.lua >>> +++ b/vshard/error.lua >>> @@ -153,6 +153,11 @@ local error_message_template = { >>>           name = 'BUCKET_RECV_DATA_ERROR', >>>           msg = 'Can not receive the bucket %s data in space "%s" at tuple %s: %s', >>>           args = {'bucket_id', 'space', 'tuple', 'reason'}, >>> +    }, >>> +    [31] = { >>> +        name = 'MULTIPLE_MASTERS_FOUND', >>> +        msg = 'Found more than one master in replicaset %s on nodes %s and %s', >>> +        args = {'replicaset_uuid', 'master1', 'master2'}, >>>       } >>>   } >> Is it possible to test this error? > It is tested, yes. See master_discovery.test.lua. Section > "-- Multiple masters logging". Although I couldn't find an > easy way to expose it into vshard.router.info(). So it is > simply logged now. Yes, thanks. I've missed it. >>>   diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua >>> index fa048c9..7747258 100644 >>> --- a/vshard/replicaset.lua >>> +++ b/vshard/replicaset.lua >>> @@ -345,18 +377,30 @@ local function replicaset_master_call(replicaset, func, args, opts) >>>       assert(opts == nil or type(opts) == 'table') >>>       assert(type(func) == 'string', 'function name') >>>       assert(args == nil or type(args) == 'table', 'function arguments') >>> -    local conn, err = replicaset_connect_master(replicaset) >>> -    if not conn then >>> -        return nil, err >>> -    end >>> -    if not opts then >>> -        opts = {timeout = replicaset.master.net_timeout} >>> -    elseif not opts.timeout then >>> -        opts = table.copy(opts) >>> -        opts.timeout = replicaset.master.net_timeout >>> +    local master = replicaset.master >>> +    if not master then >>> +        opts = opts and table.copy(opts) or {} >>> +        if opts.is_async then >>> +            return nil, lerror.vshard(lerror.code.MISSING_MASTER, >>> +                                      replicaset.uuid) >>> +        end >> Could / should we here wakeup master discovery if "auto" is specified? > We can. But I am afraid that under considerable RW requests load we > might wakeup it too often. And if there are many routers, they might > create a storm of `vshard.storage._call('info')` requests putting a > notable load on the storages. So I decided it is not safe until the > subscriptions are implemented. But we can't simply fail async requests when master is unknown. And we even don't have exposed replicaset_wait_connected/replicaset_wait_master to call it before master will be known. As result several seconds we will get failed requests. Also I saw TODO in test that "RO requests should be able to go to replicas" and I expect questions about RTO 0 on read requests from our customers. I don't know will this feature be used by cartridge but it could create small inconvenience for our customers in test because they should inject timeouts until master will be discovered (it's too artifical example but there were some similar questions when discovery started to work by chunks). >>> @@ -425,11 +469,6 @@ local function replicaset_template_multicallro(prefer_replica, balance) >>>                   return r >>>               end >>>           end >>> -        local conn, err = replicaset_connect_master(replicaset) >>> -        if not conn then >>> -            return nil, err >>> -        end >>> -        return master >>>       end >> Why don't we need this part anymore? > Now the caller handles the case when couldn't find any node. See > the next diff hunk. I moved it outside, because I didn't want > pick_next_replica() to take a timeout argument. I wanted it to > be as simple as possible. Ok. Thanks for answer. >>>       return function(replicaset, func, args, opts) >>> @@ -444,9 +483,13 @@ local function replicaset_template_multicallro(prefer_replica, balance) >>>           end >>>           local end_time = fiber_clock() + timeout >>>           while not net_status and timeout > 0 do >>> -            replica, err = pick_next_replica(replicaset) >>> +            replica = pick_next_replica(replicaset) >>>               if not replica then >>> -                return nil, err >>> +                replica, timeout = replicaset_wait_master(replicaset, timeout) >>> +                if not replica then >>> +                    return nil, timeout >>> +                end >>> +                replicaset_connect_to_replica(replicaset, replica) >>>               end >>>               opts.timeout = timeout >>>               net_status, storage_status, retval, err = >>> @@ -508,7 +551,128 @@ local function rebind_replicasets(replicasets, old_replicasets) > <...> > >>> +    local master_uuid >>> +    for replica_uuid, f in pairs(futures) do >>> +        if timeout < 0 then >>> +            -- Netbox uses cond var inside, whose wait throws an error when gets >>> +            -- a negative timeout. Avoid that. >>> +            timeout = 0 >>> +        end >>> +        res, err = f:wait_result(timeout) >>> +        timeout = deadline - fiber_clock() >>> +        if not res then >>> +            f:discard() >>> +            last_err = err >>> +            goto next_wait >> If timeout will be negative we anyway go to next_wait and turn it to 0 at the next iteration. > Yup. This is intentional. With zero timeout the worst what can happen is a > yield. But on the other hand, while we were waiting for one request, the > others could complete and one of them could be the master. This is why I > continue the iteration - don't want to miss any arrived response. Ok. Thanks for answer. > <...> > >>> + >>> +local function locate_masters(replicasets) >>> +    local is_all_done = true >>> +    local is_all_nop = true >>> +    local last_err >>> +    for _, replicaset in pairs(replicasets) do >>> +        local is_done, is_nop, err = replicaset_locate_master(replicaset) >> I think we should log result of master discovery. Especially if error occurred. > It is logged by the caller. See master_search_f(), init.lua:1067. Hmm... Yes, but I think such approach with last_err could lose some errors. E.g. replicaset.lua:616 gets an error, saves it to `last_err` then if some error happened on furure `wait_result` we override `conn.call` error and externally we have a case when there was no any conn.call error. >>> diff --git a/vshard/router/init.lua b/vshard/router/init.lua >>> index 5e2a96b..9407ccd 100644 >>> --- a/vshard/router/init.lua >>> +++ b/vshard/router/init.lua >>> @@ -1030,6 +1032,93 @@ local function failover_f(router) > <...> > >>> +-- >>> +-- Master discovery background function. It is supposed to notice master changes >>> +-- and find new masters in the replicasets, which are configured for that. >>> +-- >>> +-- XXX: due to polling the search might notice master change not right when it >> Is this TODO related tohttps://github.com/tarantool/vshard/issues/284 ? > Yes, that is one of the "solutions", although I would rather call it a crutch. > I prefer to fix it by doing fair subscriptions. And for that I at least need a > way to get netbox call result asynchronously, without having to call wait_result() > anywhere. Does it request some changes in netbox/iproto? I mean https://github.com/tarantool/tarantool/issues/6107 > <...> > >>> +local function master_search_set(router) >>> +    local enable = false >>> +    for _, rs in pairs(router.replicasets) do >>> +        if rs.is_auto_master then >>> +            enable = true >>> +            break >>> +        end >>> +    end >>> +    local search_fiber = router.master_search_fiber >>> +    if enable and search_fiber == nil then >>> +        log.info('Master auto search is enabled') >>> +        router.master_search_fiber = util.reloadable_fiber_create( >>> +            'vshard.master_search.' .. router.name, M, 'master_search_f', >>> +            router) >> On 1.10 fiber name is limited with 32 symbols. >> >> tarantool> #'vshard.master_search.' >> --- >> - 21 >> ... >> >> That mean that router name is limited with 11 chars. >> >> It could be a problem. As I see you don't use {truncate = true} when assign a name >> >> https://github.com/tarantool/vshard/blob/a394e3f4008a7436f011c81028dcaadc270eebf6/vshard/util.lua#L93 > Holy shit, that is a good find. I fixed it in a new commit on top of the > branch. For all the background fibers at once. > >> Also it's possible to wakeup discovery and rebalanser fibers. Probably we should give a way to >> >> wakeup master_search_fiber. > I didn't want people to rely on that. Because when the subscriptions > are implemented, there won't be a fiber. But I could make it NOP later > probably. > > ==================== > diff --git a/test/router/master_discovery.result b/test/router/master_discovery.result > index 2fca1e4..6dcdf45 100644 > --- a/test/router/master_discovery.result > +++ b/test/router/master_discovery.result > @@ -87,7 +87,7 @@ end > function check_all_masters_found() \ > for _, rs in pairs(vshard.router.static.replicasets) do \ > if not rs.master then \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > return false \ > end \ > end \ > @@ -101,7 +101,7 @@ function check_master_for_replicaset(rs_id, master_name) > local master_uuid = util.name_to_uuid[master_name] \ > local master = vshard.router.static.replicasets[rs_uuid].master \ > if not master or master.uuid ~= master_uuid then \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > return false \ > end \ > return true \ > @@ -124,7 +124,7 @@ master_search_helper_f = nil > | ... > function aggressive_master_search_f() \ > while true do \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > fiber.sleep(0.001) \ > end \ > end > diff --git a/test/router/master_discovery.test.lua b/test/router/master_discovery.test.lua > index e087f58..653725a 100644 > --- a/test/router/master_discovery.test.lua > +++ b/test/router/master_discovery.test.lua > @@ -49,7 +49,7 @@ end > function check_all_masters_found() \ > for _, rs in pairs(vshard.router.static.replicasets) do \ > if not rs.master then \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > return false \ > end \ > end \ > @@ -61,7 +61,7 @@ function check_master_for_replicaset(rs_id, master_name) > local master_uuid = util.name_to_uuid[master_name] \ > local master = vshard.router.static.replicasets[rs_uuid].master \ > if not master or master.uuid ~= master_uuid then \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > return false \ > end \ > return true \ > @@ -78,7 +78,7 @@ end > master_search_helper_f = nil > function aggressive_master_search_f() \ > while true do \ > - vshard.router.static.master_search_fiber:wakeup() \ > + vshard.router.master_search_wakeup() \ > fiber.sleep(0.001) \ > end \ > end > diff --git a/vshard/router/init.lua b/vshard/router/init.lua > index 9407ccd..baaeee4 100644 > --- a/vshard/router/init.lua > +++ b/vshard/router/init.lua > @@ -1119,6 +1119,13 @@ local function master_search_set(router) > end > end > > +local function master_search_wakeup(router) > + local f = router.master_search_fiber > + if f then > + f:wakeup() > + end > +end > + > -------------------------------------------------------------------------------- > -- Configuration > -------------------------------------------------------------------------------- > @@ -1545,6 +1552,7 @@ local router_mt = { > bootstrap = cluster_bootstrap; > bucket_discovery = bucket_discovery; > discovery_wakeup = discovery_wakeup; > + master_search_wakeup = master_search_wakeup, > discovery_set = discovery_set, > _route_map_clear = route_map_clear, > _bucket_reset = bucket_reset, > ==================== > > To the commit message I appended the following text: > > Master discovery works in its own fiber on the router, which is > activated only if at least one replicaset is configured to look > for the master. It wakes up with a certain period. But it is > possible to wake it up on demand using > `vshard.router.master_search_wakeup()` function. It does not do > anything if master search is not configured for any replicaset. Thanks for a fix.