From mboxrd@z Thu Jan 1 00:00:00 1970
To: Oleg Babin, tarantool-patches@dev.tarantool.org, yaroslav.dynnikov@tarantool.org
References: <0898352fb536f41625736e7a1a60bc0e83cc3379.1612914070.git.v.shpilevoy@tarantool.org> <638701c4-6e5d-b5e7-5db8-9e7baf7ee7a8@tarantool.org>
Date: Wed, 10 Feb 2021 23:33:14 +0100
MIME-Version: 1.0
In-Reply-To: <638701c4-6e5d-b5e7-5db8-9e7baf7ee7a8@tarantool.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere
List-Id: Tarantool development patches
From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy
Errors-To:
 tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

Hi! Thanks for the review!

On 10.02.2021 09:57, Oleg Babin via Tarantool-patches wrote:
> Thanks for your patch. LGTM except two nits:
>
> - Seems you need to put "Closes #246"

Indeed. I had a feeling that I saw this clock task somewhere.

> - Tarantool has "clock" module. I suggest to use "fiber_clock()" instead of simple "clock" to avoid possible confusing.

Both comments are fixed. The new patch is below. There is no incremental
diff because it is big and obvious: a plain rename.

====================
Use fiber.clock() instead of .time() everywhere

fiber.time() returns real (wall-clock) time. It is affected by time
corrections in the system and may be non-monotonic.

The patch makes everything in vshard use fiber.clock() instead of
fiber.time(). Also, the fiber.clock function is saved as an upvalue
for all functions in all modules using it. This makes the code a
bit shorter and saves one indexing of the 'fiber' table.

The main reason: in the future map-reduce feature the current time
will be used quite often. In some places it will probably be the
slowest action (given how slow FFI can be when not compiled by
JIT).

Needed for #147
Closes #246

diff --git a/test/failover/failover.result b/test/failover/failover.result
index 452694c..bae57fa 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
 ---
 - true
 ...
-ts1 = fiber.time()
+ts1 = fiber.clock()
 ---
 ...
 while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
 ---
 ...
-ts2 = fiber.time()
+ts2 = fiber.clock()
 ---
 ...
 ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 13c517b..a969e0e 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -109,9 +109,9 @@ test_run:switch('router_1')
 -- Revive the best replica. A router must reconnect to it in
 -- FAILOVER_UP_TIMEOUT seconds.
 test_run:cmd('start server box_1_d')
-ts1 = fiber.time()
+ts1 = fiber.clock()
 while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
-ts2 = fiber.time()
+ts2 = fiber.clock()
 ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
 test_run:grep_log('router_1', 'New replica box_1_d%(storage%@')
diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
index b13d05e..9c792b3 100644
--- a/vshard/replicaset.lua
+++ b/vshard/replicaset.lua
@@ -54,6 +54,7 @@ local luri = require('uri')
 local luuid = require('uuid')
 local ffi = require('ffi')
 local util = require('vshard.util')
+local fiber_clock = fiber.clock
 local gsc = util.generate_self_checker
 
 --
@@ -88,7 +89,7 @@ local function netbox_on_connect(conn)
         -- biggest priority. Really, it is not neccessary to
         -- increase replica connection priority, if the current
         -- one already has the biggest priority. (See failover_f).
-        rs.replica_up_ts = fiber.time()
+        rs.replica_up_ts = fiber_clock()
     end
 end
 
@@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn)
     assert(conn.replica)
     -- Replica is down - remember this time to decrease replica
     -- priority after FAILOVER_DOWN_TIMEOUT seconds.
-    conn.replica.down_ts = fiber.time()
+    conn.replica.down_ts = fiber_clock()
 end
 
 --
@@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset)
     local old_replica = replicaset.replica
     if old_replica == replicaset.priority_list[1] and
        old_replica:is_connected() then
-        replicaset.replica_up_ts = fiber.time()
+        replicaset.replica_up_ts = fiber_clock()
         return
     end
     for _, replica in pairs(replicaset.priority_list) do
@@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
             net_status, err = pcall(box.error, box.error.TIMEOUT)
             return nil, lerror.make(err)
         end
-        local end_time = fiber.time() + timeout
+        local end_time = fiber_clock() + timeout
         while not net_status and timeout > 0 do
             replica, err = pick_next_replica(replicaset)
             if not replica then
@@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
             opts.timeout = timeout
             net_status, storage_status, retval, err =
                 replica_call(replica, func, args, opts)
-            timeout = end_time - fiber.time()
+            timeout = end_time - fiber_clock()
             if not net_status and not storage_status and
                not can_retry_after_error(retval) then
                 -- There is no sense to retry LuaJit errors, such as
@@ -680,7 +681,7 @@ local function buildall(sharding_cfg)
     else
         zone_weights = {}
     end
-    local curr_ts = fiber.time()
+    local curr_ts = fiber_clock()
     for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
         local new_replicaset = setmetatable({
             replicas = {},
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index ba1f863..eeb7515 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -1,6 +1,7 @@
 local log = require('log')
 local lfiber = require('fiber')
 local table_new = require('table.new')
+local fiber_clock = lfiber.clock
 
 local MODULE_INTERNALS = '__module_vshard_router'
 -- Reload requirements, in case this module is reloaded manually.
@@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
     end
     local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN
     local replicaset, err
-    local tend = lfiber.time() + timeout
+    local tend = fiber_clock() + timeout
     if bucket_id > router.total_bucket_count or bucket_id <= 0 then
         error('Bucket is unreachable: bucket id is out of range')
     end
@@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
         replicaset, err = bucket_resolve(router, bucket_id)
         if replicaset then
 ::replicaset_is_found::
-            opts.timeout = tend - lfiber.time()
+            opts.timeout = tend - fiber_clock()
             local storage_call_status, call_status, call_error =
                 replicaset[call](replicaset, 'vshard.storage.call',
                                  {bucket_id, mode, func, args}, opts)
@@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
                         -- if reconfiguration had been started,
                         -- and while is not executed on router,
                         -- but already is executed on storages.
-                        while lfiber.time() <= tend do
+                        while fiber_clock() <= tend do
                             lfiber.sleep(0.05)
                             replicaset = router.replicasets[err.destination]
                             if replicaset then
@@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
                         -- case of broken cluster, when a bucket
                         -- is sent on two replicasets to each
                         -- other.
-                        if replicaset and lfiber.time() <= tend then
+                        if replicaset and fiber_clock() <= tend then
                             goto replicaset_is_found
                         end
                     end
@@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
             end
         end
         lfiber.yield()
-    until lfiber.time() > tend
+    until fiber_clock() > tend
     if err then
         return nil, err
     else
@@ -749,7 +750,7 @@ end
 -- connections must be updated.
 --
 local function failover_collect_to_update(router)
-    local ts = lfiber.time()
+    local ts = fiber_clock()
     local uuid_to_update = {}
     for uuid, rs in pairs(router.replicasets) do
         if failover_need_down_priority(rs, ts) or
@@ -772,7 +773,7 @@ local function failover_step(router)
     if #uuid_to_update == 0 then
         return false
     end
-    local curr_ts = lfiber.time()
+    local curr_ts = fiber_clock()
     local replica_is_changed = false
     for _, uuid in pairs(uuid_to_update) do
         local rs = router.replicasets[uuid]
@@ -1230,8 +1231,7 @@ local function router_sync(router, timeout)
         timeout = router.sync_timeout
     end
     local arg = {timeout}
-    local clock = lfiber.clock
-    local deadline = timeout and (clock() + timeout)
+    local deadline = timeout and (fiber_clock() + timeout)
     local opts = {timeout = timeout}
     for rs_uuid, replicaset in pairs(router.replicasets) do
         if timeout < 0 then
@@ -1244,7 +1244,7 @@ local function router_sync(router, timeout)
             err.replicaset = rs_uuid
             return nil, err
         end
-        timeout = deadline - clock()
+        timeout = deadline - fiber_clock()
         arg[1] = timeout
         opts.timeout = timeout
     end
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 1b48bf1..38cdf19 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self()
 local trigger = require('internal.trigger')
 local ffi = require('ffi')
 local yaml_encode = require('yaml').encode
+local fiber_clock = lfiber.clock
 
 local MODULE_INTERNALS = '__module_vshard_storage'
 -- Reload requirements, in case this module is reloaded manually.
@@ -695,7 +696,7 @@ local function sync(timeout)
     log.debug("Synchronizing replicaset...")
     timeout = timeout or M.sync_timeout
     local vclock = box.info.vclock
-    local tstart = lfiber.time()
+    local tstart = fiber_clock()
     repeat
         local done = true
         for _, replica in ipairs(box.info.replication) do
@@ -711,7 +712,7 @@ local function sync(timeout)
             return true
         end
         lfiber.sleep(0.001)
-    until not (lfiber.time() <= tstart + timeout)
+    until fiber_clock() > tstart + timeout
     log.warn("Timed out during synchronizing replicaset")
     local ok, err = pcall(box.error, box.error.TIMEOUT)
     return nil, lerror.make(err)
@@ -1280,10 +1281,11 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard)
     ref.rw_lock = true
     exception_guard.ref = ref
     exception_guard.drop_rw_lock = true
-    local deadline = lfiber.clock() + (opts and opts.timeout or 10)
+    local timeout = opts and opts.timeout or 10
+    local deadline = fiber_clock() + timeout
     while ref.rw ~= 0 do
-        if not M.bucket_rw_lock_is_ready_cond:wait(deadline -
-                                                   lfiber.clock()) then
+        timeout = deadline - fiber_clock()
+        if not M.bucket_rw_lock_is_ready_cond:wait(timeout) then
             status, err = pcall(box.error, box.error.TIMEOUT)
             return nil, lerror.make(err)
         end
@@ -1579,7 +1581,7 @@ function gc_bucket_f()
     -- specified time interval the buckets are deleted both from
     -- this array and from _bucket space.
     local buckets_for_redirect = {}
-    local buckets_for_redirect_ts = lfiber.time()
+    local buckets_for_redirect_ts = fiber_clock()
     -- Empty sent buckets, updated after each step, and when
     -- buckets_for_redirect is deleted, it gets empty_sent_buckets
     -- for next deletion.
@@ -1614,7 +1616,7 @@ function gc_bucket_f()
             end
         end
 
-        if lfiber.time() - buckets_for_redirect_ts >=
+        if fiber_clock() - buckets_for_redirect_ts >=
            consts.BUCKET_SENT_GARBAGE_DELAY then
             status, err = gc_bucket_drop(buckets_for_redirect,
                                          consts.BUCKET.SENT)
@@ -1629,7 +1631,7 @@ function gc_bucket_f()
             else
                 buckets_for_redirect = empty_sent_buckets or {}
                 empty_sent_buckets = nil
-                buckets_for_redirect_ts = lfiber.time()
+                buckets_for_redirect_ts = fiber_clock()
             end
         end
 ::continue::
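
To illustrate the rationale from the commit message, here is a minimal
sketch in plain Lua (not vshard code) of why a deadline should be computed
from a monotonic clock. The helpers make_clock() and retry_until() are
hypothetical and exist only for this illustration. The clocks are simulated
counters, not the real fiber.time() and fiber.clock().

```lua
-- Illustration only: simulated clocks, not the Tarantool fiber API.
-- make_clock() and retry_until() are hypothetical helpers.

-- Returns a clock that advances by `tick` on every call. If `jump_at_call`
-- is set, the clock additionally jumps by `jump` on that call, imitating a
-- system time correction as seen by a real-time clock.
local function make_clock(tick, jump_at_call, jump)
    local now, calls = 0, 0
    return function()
        calls = calls + 1
        now = now + tick
        if calls == jump_at_call then
            now = now + jump
        end
        return now
    end
end

-- Calls step() until it returns true or `timeout` clock units have passed.
-- The clock function is received once and kept in a local, similar in
-- spirit to caching fiber.clock in an upvalue.
local function retry_until(clock, timeout, step)
    local deadline = clock() + timeout
    local attempts = 0
    repeat
        attempts = attempts + 1
        if step() then
            return true, attempts
        end
    until clock() > deadline
    return false, attempts
end

local function always_fail()
    return false
end

-- Monotonic clock: the loop gets its whole time budget.
local _, mono_attempts = retry_until(make_clock(1), 10, always_fail)
-- Clock with a forward correction on its third reading: the deadline
-- appears to have passed almost immediately.
local _, wall_attempts = retry_until(make_clock(1, 3, 3600), 10, always_fail)

print(mono_attempts, wall_attempts)
```

In Tarantool the same helper would presumably receive fiber.clock as its
clock argument. A backward correction is the opposite failure: a loop
driven by real time would overrun its timeout by the size of the
correction.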