The patchset makes some preparations for the upcoming consistent map-reduce feature. Mostly it reworks bucket GC and recovery so that they don't block map-reduce requests for too long. The last commit is the first big part related directly to map-reduce: it introduces a binary heap data structure implementation, which is going to be the core storage for map-reduce artifacts.

A couple of commits were done alongside while working on the code: the extraction of rlist into its own module, and a fix for a bug in bucket_recv(). They were necessary in the first version of the patchset, and after some reworks they are not strictly needed anymore, but I still want to merge them: one fixes a potential bug, the other improves code readability.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-147-map-reduce-part1
Issue: https://github.com/tarantool/tarantool/issues/147

Vladislav Shpilevoy (9):
  rlist: move rlist to a new module
  Use fiber.clock() instead of .time() everywhere
  test: introduce a helper to wait for bucket GC
  storage: bucket_recv() should check rs lock
  util: introduce yielding table functions
  cfg: introduce 'deprecated option' feature
  gc: introduce reactive garbage collector
  recovery: introduce reactive recovery
  util: introduce binary heap data structure

 test/failover/failover.result                 |   4 +-
 test/failover/failover.test.lua               |   4 +-
 test/lua_libs/storage_template.lua            |   9 +
 test/misc/reconfigure.result                  |  10 -
 test/misc/reconfigure.test.lua                |   3 -
 test/rebalancer/bucket_ref.result             |  19 +-
 test/rebalancer/bucket_ref.test.lua           |   8 +-
 test/rebalancer/errinj.result                 |  20 +-
 test/rebalancer/errinj.test.lua               |  12 +-
 test/rebalancer/rebalancer.result             |   5 +-
 test/rebalancer/rebalancer.test.lua           |   3 +-
 .../rebalancer/rebalancer_lock_and_pin.result |  14 +
 .../rebalancer_lock_and_pin.test.lua          |   4 +
 test/rebalancer/receiving_bucket.result       |  10 +-
 test/rebalancer/receiving_bucket.test.lua     |   3 +-
 test/reload_evolution/storage.result          |   7 +-
 test/reload_evolution/storage.test.lua        |   3 +-
test/router/reroute_wrong_bucket.result | 8 +- test/router/reroute_wrong_bucket.test.lua | 4 +- test/router/router.result | 22 +- test/router/router.test.lua | 13 +- test/storage/recovery.result | 11 +- test/storage/recovery.test.lua | 5 + test/storage/recovery_errinj.result | 16 +- test/storage/recovery_errinj.test.lua | 9 +- test/storage/storage.result | 10 +- test/storage/storage.test.lua | 1 + test/unit-tap/heap.test.lua | 310 ++++++++++++++++ test/unit-tap/suite.ini | 4 + test/unit/config.result | 35 +- test/unit/config.test.lua | 16 +- test/unit/garbage.result | 106 +++--- test/unit/garbage.test.lua | 47 ++- test/unit/garbage_errinj.result | 223 ------------ test/unit/garbage_errinj.test.lua | 73 ---- test/unit/rebalancer.result | 99 ----- test/unit/rebalancer.test.lua | 27 -- test/unit/rlist.result | 114 ++++++ test/unit/rlist.test.lua | 33 ++ test/unit/util.result | 113 ++++++ test/unit/util.test.lua | 49 +++ vshard/cfg.lua | 10 +- vshard/consts.lua | 7 +- vshard/heap.lua | 226 ++++++++++++ vshard/replicaset.lua | 13 +- vshard/rlist.lua | 53 +++ vshard/router/init.lua | 16 +- vshard/storage/init.lua | 342 +++++++++--------- vshard/storage/reload_evolution.lua | 8 + vshard/util.lua | 40 ++ 50 files changed, 1368 insertions(+), 833 deletions(-) create mode 100755 test/unit-tap/heap.test.lua create mode 100644 test/unit-tap/suite.ini delete mode 100644 test/unit/garbage_errinj.result delete mode 100644 test/unit/garbage_errinj.test.lua create mode 100644 test/unit/rlist.result create mode 100644 test/unit/rlist.test.lua create mode 100644 vshard/heap.lua create mode 100644 vshard/rlist.lua -- 2.24.3 (Apple Git-128)
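The binary heap from the last commit is only visible here as the new vshard/heap.lua file in the diffstat. As a rough illustration of the data structure it introduces, a minimal Lua min-heap could look like the sketch below; the `push`/`pop` names and the plain-number ordering are assumptions for illustration, not the actual vshard/heap.lua API:

```lua
-- Minimal binary min-heap sketch. Not the vshard/heap.lua API, just
-- an illustration of the data structure the last commit introduces.
local heap_mt = {}
heap_mt.__index = heap_mt

function heap_mt:push(value)
    local data = self.data
    local i = #data + 1
    data[i] = value
    -- Sift the new element up until its parent is not bigger.
    while i > 1 do
        local parent = math.floor(i / 2)
        if data[parent] <= data[i] then break end
        data[parent], data[i] = data[i], data[parent]
        i = parent
    end
end

function heap_mt:pop()
    local data = self.data
    local count = #data
    if count == 0 then return nil end
    local top = data[1]
    data[1] = data[count]
    data[count] = nil
    count = count - 1
    -- Sift the moved element down to restore the heap property.
    local i = 1
    while true do
        local left, right = i * 2, i * 2 + 1
        local smallest = i
        if left <= count and data[left] < data[smallest] then smallest = left end
        if right <= count and data[right] < data[smallest] then smallest = right end
        if smallest == i then break end
        data[i], data[smallest] = data[smallest], data[i]
        i = smallest
    end
    return top
end

local function heap_new()
    return setmetatable({data = {}}, heap_mt)
end
```

Both operations are O(log n), which is what makes a heap suitable as a storage for map-reduce artifacts that need cheap access to a minimal element.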
Rlist in storage/init.lua implemented a container similar to rlist in libsmall in Tarantool core. Doubly-linked list. It does not depend on anything in storage/init.lua, and should have been done in a separate module from the beginning. Now init.lua is going to grow even more in scope of map-reduce feature, beyond 3k lines if nothing would be moved out. It was decided (by me) that it crosses the border of when it is time to split init.lua into separate modules. The patch takes the low hanging fruit by moving rlist into its own module. --- test/unit/rebalancer.result | 99 ----------------------------- test/unit/rebalancer.test.lua | 27 -------- test/unit/rlist.result | 114 ++++++++++++++++++++++++++++++++++ test/unit/rlist.test.lua | 33 ++++++++++ vshard/rlist.lua | 53 ++++++++++++++++ vshard/storage/init.lua | 68 +++----------------- 6 files changed, 208 insertions(+), 186 deletions(-) create mode 100644 test/unit/rlist.result create mode 100644 test/unit/rlist.test.lua create mode 100644 vshard/rlist.lua diff --git a/test/unit/rebalancer.result b/test/unit/rebalancer.result index 2fb30e2..19aa480 100644 --- a/test/unit/rebalancer.result +++ b/test/unit/rebalancer.result @@ -1008,105 +1008,6 @@ build_routes(replicasets) -- the latter is a dispenser. It is a structure which hands out -- destination UUIDs in a round-robin manner to worker fibers. -- -list = rlist.new() ---- -... -list ---- -- count: 0 -... -obj1 = {i = 1} ---- -... -rlist.remove(list, obj1) ---- -... -list ---- -- count: 0 -... -rlist.add_tail(list, obj1) ---- -... -list ---- -- count: 1 - last: &0 - i: 1 - first: *0 -... -rlist.remove(list, obj1) ---- -... -list ---- -- count: 0 -... -obj1 ---- -- i: 1 -... -rlist.add_tail(list, obj1) ---- -... -obj2 = {i = 2} ---- -... -rlist.add_tail(list, obj2) ---- -... -list ---- -- count: 2 - last: &0 - i: 2 - prev: &1 - i: 1 - next: *0 - first: *1 -... -obj3 = {i = 3} ---- -... -rlist.add_tail(list, obj3) ---- -... 
-list ---- -- count: 3 - last: &0 - i: 3 - prev: &1 - i: 2 - next: *0 - prev: &2 - i: 1 - next: *1 - first: *2 -... -rlist.remove(list, obj2) ---- -... -list ---- -- count: 2 - last: &0 - i: 3 - prev: &1 - i: 1 - next: *0 - first: *1 -... -rlist.remove(list, obj1) ---- -... -list ---- -- count: 1 - last: &0 - i: 3 - first: *0 -... d = dispenser.create({uuid = 15}) --- ... diff --git a/test/unit/rebalancer.test.lua b/test/unit/rebalancer.test.lua index a4e18c1..8087d42 100644 --- a/test/unit/rebalancer.test.lua +++ b/test/unit/rebalancer.test.lua @@ -246,33 +246,6 @@ build_routes(replicasets) -- the latter is a dispenser. It is a structure which hands out -- destination UUIDs in a round-robin manner to worker fibers. -- -list = rlist.new() -list - -obj1 = {i = 1} -rlist.remove(list, obj1) -list - -rlist.add_tail(list, obj1) -list - -rlist.remove(list, obj1) -list -obj1 - -rlist.add_tail(list, obj1) -obj2 = {i = 2} -rlist.add_tail(list, obj2) -list -obj3 = {i = 3} -rlist.add_tail(list, obj3) -list - -rlist.remove(list, obj2) -list -rlist.remove(list, obj1) -list - d = dispenser.create({uuid = 15}) dispenser.pop(d) for i = 1, 14 do assert(dispenser.pop(d) == 'uuid', i) end diff --git a/test/unit/rlist.result b/test/unit/rlist.result new file mode 100644 index 0000000..c8aabc0 --- /dev/null +++ b/test/unit/rlist.result @@ -0,0 +1,114 @@ +-- test-run result file version 2 +-- +-- gh-161: parallel rebalancer. One of the most important part of the latter is +-- a dispenser. It is a structure which hands out destination UUIDs in a +-- round-robin manner to worker fibers. It uses rlist data structure. +-- +rlist = require('vshard.rlist') + | --- + | ... + +list = rlist.new() + | --- + | ... +list + | --- + | - count: 0 + | ... + +obj1 = {i = 1} + | --- + | ... +list:remove(obj1) + | --- + | ... +list + | --- + | - count: 0 + | ... + +list:add_tail(obj1) + | --- + | ... +list + | --- + | - count: 1 + | last: &0 + | i: 1 + | first: *0 + | ... 
+ +list:remove(obj1) + | --- + | ... +list + | --- + | - count: 0 + | ... +obj1 + | --- + | - i: 1 + | ... + +list:add_tail(obj1) + | --- + | ... +obj2 = {i = 2} + | --- + | ... +list:add_tail(obj2) + | --- + | ... +list + | --- + | - count: 2 + | last: &0 + | i: 2 + | prev: &1 + | i: 1 + | next: *0 + | first: *1 + | ... +obj3 = {i = 3} + | --- + | ... +list:add_tail(obj3) + | --- + | ... +list + | --- + | - count: 3 + | last: &0 + | i: 3 + | prev: &1 + | i: 2 + | next: *0 + | prev: &2 + | i: 1 + | next: *1 + | first: *2 + | ... + +list:remove(obj2) + | --- + | ... +list + | --- + | - count: 2 + | last: &0 + | i: 3 + | prev: &1 + | i: 1 + | next: *0 + | first: *1 + | ... +list:remove(obj1) + | --- + | ... +list + | --- + | - count: 1 + | last: &0 + | i: 3 + | first: *0 + | ... diff --git a/test/unit/rlist.test.lua b/test/unit/rlist.test.lua new file mode 100644 index 0000000..db52955 --- /dev/null +++ b/test/unit/rlist.test.lua @@ -0,0 +1,33 @@ +-- +-- gh-161: parallel rebalancer. One of the most important part of the latter is +-- a dispenser. It is a structure which hands out destination UUIDs in a +-- round-robin manner to worker fibers. It uses rlist data structure. +-- +rlist = require('vshard.rlist') + +list = rlist.new() +list + +obj1 = {i = 1} +list:remove(obj1) +list + +list:add_tail(obj1) +list + +list:remove(obj1) +list +obj1 + +list:add_tail(obj1) +obj2 = {i = 2} +list:add_tail(obj2) +list +obj3 = {i = 3} +list:add_tail(obj3) +list + +list:remove(obj2) +list +list:remove(obj1) +list diff --git a/vshard/rlist.lua b/vshard/rlist.lua new file mode 100644 index 0000000..4be5382 --- /dev/null +++ b/vshard/rlist.lua @@ -0,0 +1,53 @@ +-- +-- A subset of rlist methods from the main repository. Rlist is a +-- doubly linked list, and is used here to implement a queue of +-- routes in the parallel rebalancer. 
+-- +local rlist_mt = {} + +function rlist_mt.add_tail(rlist, object) + local last = rlist.last + if last then + last.next = object + object.prev = last + else + rlist.first = object + end + rlist.last = object + rlist.count = rlist.count + 1 +end + +function rlist_mt.remove(rlist, object) + local prev = object.prev + local next = object.next + local belongs_to_list = false + if prev then + belongs_to_list = true + prev.next = next + end + if next then + belongs_to_list = true + next.prev = prev + end + object.prev = nil + object.next = nil + if rlist.last == object then + belongs_to_list = true + rlist.last = prev + end + if rlist.first == object then + belongs_to_list = true + rlist.first = next + end + if belongs_to_list then + rlist.count = rlist.count - 1 + end +end + +local function rlist_new() + return setmetatable({count = 0}, {__index = rlist_mt}) +end + +return { + new = rlist_new, +} diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 5464824..1b48bf1 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -13,12 +13,13 @@ if rawget(_G, MODULE_INTERNALS) then 'vshard.consts', 'vshard.error', 'vshard.cfg', 'vshard.replicaset', 'vshard.util', 'vshard.storage.reload_evolution', - 'vshard.lua_gc', + 'vshard.lua_gc', 'vshard.rlist' } for _, module in pairs(vshard_modules) do package.loaded[module] = nil end end +local rlist = require('vshard.rlist') local consts = require('vshard.consts') local lerror = require('vshard.error') local lcfg = require('vshard.cfg') @@ -1786,54 +1787,6 @@ local function rebalancer_build_routes(replicasets) return bucket_routes end --- --- A subset of rlist methods from the main repository. Rlist is a --- doubly linked list, and is used here to implement a queue of --- routes in the parallel rebalancer. 
--- -local function rlist_new() - return {count = 0} -end - -local function rlist_add_tail(rlist, object) - local last = rlist.last - if last then - last.next = object - object.prev = last - else - rlist.first = object - end - rlist.last = object - rlist.count = rlist.count + 1 -end - -local function rlist_remove(rlist, object) - local prev = object.prev - local next = object.next - local belongs_to_list = false - if prev then - belongs_to_list = true - prev.next = next - end - if next then - belongs_to_list = true - next.prev = prev - end - object.prev = nil - object.next = nil - if rlist.last == object then - belongs_to_list = true - rlist.last = prev - end - if rlist.first == object then - belongs_to_list = true - rlist.first = next - end - if belongs_to_list then - rlist.count = rlist.count - 1 - end -end - -- -- Dispenser is a container of routes received from the -- rebalancer. Its task is to hand out the routes to worker fibers @@ -1842,7 +1795,7 @@ end -- receiver nodes. -- local function route_dispenser_create(routes) - local rlist = rlist_new() + local rlist = rlist.new() local map = {} for uuid, bucket_count in pairs(routes) do local new = { @@ -1873,7 +1826,7 @@ local function route_dispenser_create(routes) -- the main applier fiber does some analysis on the -- destinations. 
map[uuid] = new - rlist_add_tail(rlist, new) + rlist:add_tail(new) end return { rlist = rlist, @@ -1892,7 +1845,7 @@ local function route_dispenser_put(dispenser, uuid) local bucket_count = dst.bucket_count + 1 dst.bucket_count = bucket_count if bucket_count == 1 then - rlist_add_tail(dispenser.rlist, dst) + dispenser.rlist:add_tail(dst) end end end @@ -1909,7 +1862,7 @@ local function route_dispenser_skip(dispenser, uuid) local dst = map[uuid] if dst then map[uuid] = nil - rlist_remove(dispenser.rlist, dst) + dispenser.rlist:remove(dst) end end @@ -1952,9 +1905,9 @@ local function route_dispenser_pop(dispenser) if dst then local bucket_count = dst.bucket_count - 1 dst.bucket_count = bucket_count - rlist_remove(rlist, dst) + rlist:remove(dst) if bucket_count > 0 then - rlist_add_tail(rlist, dst) + rlist:add_tail(dst) end return dst.uuid end @@ -2742,11 +2695,6 @@ M.route_dispenser = { pop = route_dispenser_pop, sent = route_dispenser_sent, } -M.rlist = { - new = rlist_new, - add_tail = rlist_add_tail, - remove = rlist_remove, -} M.schema_latest_version = schema_latest_version M.schema_current_version = schema_current_version M.schema_upgrade_master = schema_upgrade_master -- 2.24.3 (Apple Git-128)
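After the move, callers require the module and call the methods through the object, exactly as exercised by the new test/unit/rlist.test.lua above:

```lua
local rlist = require('vshard.rlist')

local list = rlist.new()
local obj1 = {i = 1}
local obj2 = {i = 2}

-- The objects themselves carry the prev/next links, so any table
-- can be put into the list without a wrapper node.
list:add_tail(obj1)
list:add_tail(obj2)
assert(list.count == 2)
assert(list.first == obj1 and list.last == obj2)

-- Removing an object not in the list is a no-op: the
-- belongs_to_list flag in remove() keeps the counter correct.
list:remove(obj1)
list:remove(obj1)
assert(list.count == 1)
assert(list.first == obj2)
```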
Fiber.time() returns real time. It is affected by time corrections in the system, and can be not monotonic. The patch makes everything in vshard use fiber.clock() instead of fiber.time(). Also fiber.clock function is saved as an upvalue for all functions in all modules using it. This makes the code a bit shorter and saves 1 indexing of 'fiber' table. The main reason - in the future map-reduce feature the current time will be used quite often. In some places it probably will be the slowest action (given how slow FFI can be when not compiled by JIT). Needed for #147 --- test/failover/failover.result | 4 ++-- test/failover/failover.test.lua | 4 ++-- vshard/replicaset.lua | 13 +++++++------ vshard/router/init.lua | 16 ++++++++-------- vshard/storage/init.lua | 16 ++++++++-------- 5 files changed, 27 insertions(+), 26 deletions(-) diff --git a/test/failover/failover.result b/test/failover/failover.result index 452694c..bae57fa 100644 --- a/test/failover/failover.result +++ b/test/failover/failover.result @@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d') --- - true ... -ts1 = fiber.time() +ts1 = fiber.clock() --- ... while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end --- ... -ts2 = fiber.time() +ts2 = fiber.clock() --- ... ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua index 13c517b..a969e0e 100644 --- a/test/failover/failover.test.lua +++ b/test/failover/failover.test.lua @@ -109,9 +109,9 @@ test_run:switch('router_1') -- Revive the best replica. A router must reconnect to it in -- FAILOVER_UP_TIMEOUT seconds. 
test_run:cmd('start server box_1_d') -ts1 = fiber.time() +ts1 = fiber.clock() while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end -ts2 = fiber.time() +ts2 = fiber.clock() ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT test_run:grep_log('router_1', 'New replica box_1_d%(storage%@') diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua index b13d05e..a74c0f8 100644 --- a/vshard/replicaset.lua +++ b/vshard/replicaset.lua @@ -54,6 +54,7 @@ local luri = require('uri') local luuid = require('uuid') local ffi = require('ffi') local util = require('vshard.util') +local clock = fiber.clock local gsc = util.generate_self_checker -- @@ -88,7 +89,7 @@ local function netbox_on_connect(conn) -- biggest priority. Really, it is not neccessary to -- increase replica connection priority, if the current -- one already has the biggest priority. (See failover_f). - rs.replica_up_ts = fiber.time() + rs.replica_up_ts = clock() end end @@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn) assert(conn.replica) -- Replica is down - remember this time to decrease replica -- priority after FAILOVER_DOWN_TIMEOUT seconds. 
- conn.replica.down_ts = fiber.time() + conn.replica.down_ts = clock() end -- @@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset) local old_replica = replicaset.replica if old_replica == replicaset.priority_list[1] and old_replica:is_connected() then - replicaset.replica_up_ts = fiber.time() + replicaset.replica_up_ts = clock() return end for _, replica in pairs(replicaset.priority_list) do @@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance) net_status, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) end - local end_time = fiber.time() + timeout + local end_time = clock() + timeout while not net_status and timeout > 0 do replica, err = pick_next_replica(replicaset) if not replica then @@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance) opts.timeout = timeout net_status, storage_status, retval, err = replica_call(replica, func, args, opts) - timeout = end_time - fiber.time() + timeout = end_time - clock() if not net_status and not storage_status and not can_retry_after_error(retval) then -- There is no sense to retry LuaJit errors, such as @@ -680,7 +681,7 @@ local function buildall(sharding_cfg) else zone_weights = {} end - local curr_ts = fiber.time() + local curr_ts = clock() for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do local new_replicaset = setmetatable({ replicas = {}, diff --git a/vshard/router/init.lua b/vshard/router/init.lua index ba1f863..a530c29 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -1,6 +1,7 @@ local log = require('log') local lfiber = require('fiber') local table_new = require('table.new') +local clock = lfiber.clock local MODULE_INTERNALS = '__module_vshard_router' -- Reload requirements, in case this module is reloaded manually. 
@@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, end local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN local replicaset, err - local tend = lfiber.time() + timeout + local tend = clock() + timeout if bucket_id > router.total_bucket_count or bucket_id <= 0 then error('Bucket is unreachable: bucket id is out of range') end @@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, replicaset, err = bucket_resolve(router, bucket_id) if replicaset then ::replicaset_is_found:: - opts.timeout = tend - lfiber.time() + opts.timeout = tend - clock() local storage_call_status, call_status, call_error = replicaset[call](replicaset, 'vshard.storage.call', {bucket_id, mode, func, args}, opts) @@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, -- if reconfiguration had been started, -- and while is not executed on router, -- but already is executed on storages. - while lfiber.time() <= tend do + while clock() <= tend do lfiber.sleep(0.05) replicaset = router.replicasets[err.destination] if replicaset then @@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, -- case of broken cluster, when a bucket -- is sent on two replicasets to each -- other. - if replicaset and lfiber.time() <= tend then + if replicaset and clock() <= tend then goto replicaset_is_found end end @@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, end end lfiber.yield() - until lfiber.time() > tend + until clock() > tend if err then return nil, err else @@ -749,7 +750,7 @@ end -- connections must be updated. 
-- local function failover_collect_to_update(router) - local ts = lfiber.time() + local ts = clock() local uuid_to_update = {} for uuid, rs in pairs(router.replicasets) do if failover_need_down_priority(rs, ts) or @@ -772,7 +773,7 @@ local function failover_step(router) if #uuid_to_update == 0 then return false end - local curr_ts = lfiber.time() + local curr_ts = clock() local replica_is_changed = false for _, uuid in pairs(uuid_to_update) do local rs = router.replicasets[uuid] @@ -1230,7 +1231,6 @@ local function router_sync(router, timeout) timeout = router.sync_timeout end local arg = {timeout} - local clock = lfiber.clock local deadline = timeout and (clock() + timeout) local opts = {timeout = timeout} for rs_uuid, replicaset in pairs(router.replicasets) do diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 1b48bf1..c7335fc 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self() local trigger = require('internal.trigger') local ffi = require('ffi') local yaml_encode = require('yaml').encode +local clock = lfiber.clock local MODULE_INTERNALS = '__module_vshard_storage' -- Reload requirements, in case this module is reloaded manually. 
@@ -695,7 +696,7 @@ local function sync(timeout) log.debug("Synchronizing replicaset...") timeout = timeout or M.sync_timeout local vclock = box.info.vclock - local tstart = lfiber.time() + local tstart = clock() repeat local done = true for _, replica in ipairs(box.info.replication) do @@ -711,7 +712,7 @@ local function sync(timeout) return true end lfiber.sleep(0.001) - until not (lfiber.time() <= tstart + timeout) + until not (clock() <= tstart + timeout) log.warn("Timed out during synchronizing replicaset") local ok, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) @@ -1280,10 +1281,9 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard) ref.rw_lock = true exception_guard.ref = ref exception_guard.drop_rw_lock = true - local deadline = lfiber.clock() + (opts and opts.timeout or 10) + local deadline = clock() + (opts and opts.timeout or 10) while ref.rw ~= 0 do - if not M.bucket_rw_lock_is_ready_cond:wait(deadline - - lfiber.clock()) then + if not M.bucket_rw_lock_is_ready_cond:wait(deadline - clock()) then status, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) end @@ -1579,7 +1579,7 @@ function gc_bucket_f() -- specified time interval the buckets are deleted both from -- this array and from _bucket space. local buckets_for_redirect = {} - local buckets_for_redirect_ts = lfiber.time() + local buckets_for_redirect_ts = clock() -- Empty sent buckets, updated after each step, and when -- buckets_for_redirect is deleted, it gets empty_sent_buckets -- for next deletion. 
@@ -1614,7 +1614,7 @@ function gc_bucket_f() end end - if lfiber.time() - buckets_for_redirect_ts >= + if clock() - buckets_for_redirect_ts >= consts.BUCKET_SENT_GARBAGE_DELAY then status, err = gc_bucket_drop(buckets_for_redirect, consts.BUCKET.SENT) @@ -1629,7 +1629,7 @@ function gc_bucket_f() else buckets_for_redirect = empty_sent_buckets or {} empty_sent_buckets = nil - buckets_for_redirect_ts = lfiber.time() + buckets_for_redirect_ts = clock() end end ::continue:: -- 2.24.3 (Apple Git-128)
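The upvalue-caching pattern applied throughout this patch can be distilled to the following deadline loop, a simplified sketch of the retry loops above (the `call_with_timeout` helper is illustrative, not part of vshard):

```lua
local fiber = require('fiber')
-- Save the function once as an upvalue: each call then avoids
-- re-indexing the 'fiber' table, and clock() is monotonic, unlike
-- fiber.time(), so deadlines survive system time corrections.
local clock = fiber.clock

local function call_with_timeout(func, timeout)
    local deadline = clock() + timeout
    repeat
        local ok, res = pcall(func)
        if ok then
            return res
        end
        fiber.sleep(0.05)
    until clock() > deadline
    return nil, 'timed out'
end
```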
In the tests to wait for bucket deletion by GC it was necessary to have a long loop expression which checks _bucket space and wakes up GC fiber if the bucket is not deleted yet. Soon the GC wakeup won't be necessary as GC algorithm will become reactive instead of proactive. In order not to remove the wakeup from all places in the main patch, and to simplify the waiting the patch introduces a function wait_bucket_is_collected(). The reactive GC will delete GC wakeup from this function and all the tests still will pass in time. --- test/lua_libs/storage_template.lua | 10 ++++++++++ test/rebalancer/bucket_ref.result | 7 ++----- test/rebalancer/bucket_ref.test.lua | 5 ++--- test/rebalancer/errinj.result | 13 +++++-------- test/rebalancer/errinj.test.lua | 7 +++---- test/rebalancer/rebalancer.result | 5 +---- test/rebalancer/rebalancer.test.lua | 3 +-- test/rebalancer/receiving_bucket.result | 2 +- test/rebalancer/receiving_bucket.test.lua | 2 +- test/reload_evolution/storage.result | 5 +---- test/reload_evolution/storage.test.lua | 3 +-- 11 files changed, 28 insertions(+), 34 deletions(-) diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua index 84e4180..21409bd 100644 --- a/test/lua_libs/storage_template.lua +++ b/test/lua_libs/storage_template.lua @@ -165,3 +165,13 @@ function wait_rebalancer_state(state, test_run) vshard.storage.rebalancer_wakeup() end end + +function wait_bucket_is_collected(id) + test_run:wait_cond(function() + if not box.space._bucket:get{id} then + return true + end + vshard.storage.recovery_wakeup() + vshard.storage.garbage_collector_wakeup() + end) +end diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result index b66e449..b8fc7ff 100644 --- a/test/rebalancer/bucket_ref.result +++ b/test/rebalancer/bucket_ref.result @@ -243,7 +243,7 @@ vshard.storage.buckets_info(1) destination: <replicaset_2> id: 1 ... 
-while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +wait_bucket_is_collected(1) --- ... _ = test_run:switch('box_2_a') @@ -292,10 +292,7 @@ vshard.storage.buckets_info(1) finish_refs = true --- ... -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end ---- -... -while box.space._bucket:get{1} do fiber.sleep(0.01) end +wait_bucket_is_collected(1) --- ... _ = test_run:switch('box_1_a') diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua index 49ba583..213ced3 100644 --- a/test/rebalancer/bucket_ref.test.lua +++ b/test/rebalancer/bucket_ref.test.lua @@ -73,7 +73,7 @@ vshard.storage.bucket_refro(1) finish_refs = true while f1:status() ~= 'dead' do fiber.sleep(0.01) end vshard.storage.buckets_info(1) -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +wait_bucket_is_collected(1) _ = test_run:switch('box_2_a') vshard.storage.buckets_info(1) vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = false @@ -89,8 +89,7 @@ while not vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end fiber.sleep(0.2) vshard.storage.buckets_info(1) finish_refs = true -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end -while box.space._bucket:get{1} do fiber.sleep(0.01) end +wait_bucket_is_collected(1) _ = test_run:switch('box_1_a') vshard.storage.buckets_info(1) diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result index 214e7d8..e50eb72 100644 --- a/test/rebalancer/errinj.result +++ b/test/rebalancer/errinj.result @@ -237,7 +237,10 @@ _bucket:get{36} -- Buckets became 'active' on box_2_a, but still are sending on -- box_1_a. Wait until it is marked as garbage on box_1_a by the -- recovery fiber. -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end +wait_bucket_is_collected(35) +--- +... +wait_bucket_is_collected(36) --- ... 
_ = test_run:switch('box_2_a') @@ -278,7 +281,7 @@ while not _bucket:get{36} do fiber.sleep(0.0001) end _ = test_run:switch('box_1_a') --- ... -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +wait_bucket_is_collected(36) --- ... _bucket:get{36} @@ -295,12 +298,6 @@ box.error.injection.set('ERRINJ_WAL_DELAY', false) --- - ok ... -_ = test_run:switch('box_1_a') ---- -... -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end ---- -... test_run:switch('default') --- - true diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua index 66fbe5e..2cc4a69 100644 --- a/test/rebalancer/errinj.test.lua +++ b/test/rebalancer/errinj.test.lua @@ -107,7 +107,8 @@ _bucket:get{36} -- Buckets became 'active' on box_2_a, but still are sending on -- box_1_a. Wait until it is marked as garbage on box_1_a by the -- recovery fiber. -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end +wait_bucket_is_collected(35) +wait_bucket_is_collected(36) _ = test_run:switch('box_2_a') _bucket:get{35} _bucket:get{36} @@ -124,13 +125,11 @@ f1 = fiber.create(function() ret1, err1 = vshard.storage.bucket_send(36, util.re _ = test_run:switch('box_2_a') while not _bucket:get{36} do fiber.sleep(0.0001) end _ = test_run:switch('box_1_a') -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +wait_bucket_is_collected(36) _bucket:get{36} _ = test_run:switch('box_2_a') _bucket:get{36} box.error.injection.set('ERRINJ_WAL_DELAY', false) -_ = test_run:switch('box_1_a') -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end test_run:switch('default') test_run:drop_cluster(REPLICASET_2) diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result index 
3607e93..098b845 100644 --- a/test/rebalancer/rebalancer.result +++ b/test/rebalancer/rebalancer.result @@ -334,10 +334,7 @@ vshard.storage.rebalancer_wakeup() -- Now rebalancer makes a bucket SENT. After it the garbage -- collector cleans it and deletes after a timeout. -- -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end ---- -... -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end +wait_bucket_is_collected(91) --- ... wait_rebalancer_state("The cluster is balanced ok", test_run) diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua index 63e690f..308e66d 100644 --- a/test/rebalancer/rebalancer.test.lua +++ b/test/rebalancer/rebalancer.test.lua @@ -162,8 +162,7 @@ vshard.storage.rebalancer_wakeup() -- Now rebalancer makes a bucket SENT. After it the garbage -- collector cleans it and deletes after a timeout. -- -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end +wait_bucket_is_collected(91) wait_rebalancer_state("The cluster is balanced ok", test_run) _bucket.index.status:count({vshard.consts.BUCKET.ACTIVE}) _bucket.index.status:min({vshard.consts.BUCKET.ACTIVE}) diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result index db6a67f..7d3612b 100644 --- a/test/rebalancer/receiving_bucket.result +++ b/test/rebalancer/receiving_bucket.result @@ -374,7 +374,7 @@ vshard.storage.buckets_info(1) destination: <replicaset_1> id: 1 ... -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +wait_bucket_is_collected(1) --- ... 
vshard.storage.buckets_info(1) diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua index 1819cbb..24534b3 100644 --- a/test/rebalancer/receiving_bucket.test.lua +++ b/test/rebalancer/receiving_bucket.test.lua @@ -137,7 +137,7 @@ box.space.test3:select{100} _ = test_run:switch('box_2_a') vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3}) vshard.storage.buckets_info(1) -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end +wait_bucket_is_collected(1) vshard.storage.buckets_info(1) _ = test_run:switch('box_1_a') box.space._bucket:get{1} diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result index 4652c4f..753687f 100644 --- a/test/reload_evolution/storage.result +++ b/test/reload_evolution/storage.result @@ -129,10 +129,7 @@ vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1]) --- - true ... -vshard.storage.garbage_collector_wakeup() ---- -... -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end +wait_bucket_is_collected(bucket_id_to_move) --- ... 
test_run:switch('storage_1_a') diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua index 06f7117..639553e 100644 --- a/test/reload_evolution/storage.test.lua +++ b/test/reload_evolution/storage.test.lua @@ -51,8 +51,7 @@ vshard.storage.bucket_force_create(2000) vshard.storage.buckets_info()[2000] vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42}) vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1]) -vshard.storage.garbage_collector_wakeup() -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end +wait_bucket_is_collected(bucket_id_to_move) test_run:switch('storage_1_a') while box.space._bucket:get{bucket_id_to_move}.status ~= vshard.consts.BUCKET.ACTIVE do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[2]) -- 2.24.3 (Apple Git-128)
A locked replicaset (via config) must not allow any bucket moves from or to it. But the lock check was only done by bucket_send(); bucket_recv() allowed receiving a bucket even if the replicaset was locked. The patch fixes that. It didn't affect automatic bucket sends, because the rebalancer takes the lock into account from the config. Only manual bucket moves could hit this bug. --- test/rebalancer/rebalancer_lock_and_pin.result | 14 ++++++++++++++ test/rebalancer/rebalancer_lock_and_pin.test.lua | 4 ++++ vshard/storage/init.lua | 3 +++ 3 files changed, 21 insertions(+) diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result index 51dd36e..0bb4f45 100644 --- a/test/rebalancer/rebalancer_lock_and_pin.result +++ b/test/rebalancer/rebalancer_lock_and_pin.result @@ -156,6 +156,20 @@ vshard.storage.bucket_send(1, util.replicasets[2]) message: Replicaset is locked code: 19 ... +test_run:switch('box_2_a') +--- +- true +... +-- Does not allow to receive either. Send from a non-locked replicaset to a +-- locked one fails. +vshard.storage.bucket_send(101, util.replicasets[1]) +--- +- null +- type: ShardingError + code: 19 + name: REPLICASET_IS_LOCKED + message: Replicaset is locked +... -- -- Vshard ensures that if a replicaset is locked, then it will not -- allow to change its bucket set even if a rebalancer does not diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua index c3412c1..7b87004 100644 --- a/test/rebalancer/rebalancer_lock_and_pin.test.lua +++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua @@ -69,6 +69,10 @@ info.lock -- explicitly. -- vshard.storage.bucket_send(1, util.replicasets[2]) +test_run:switch('box_2_a') +-- Does not allow to receive either. Send from a non-locked replicaset to a +-- locked one fails.
+vshard.storage.bucket_send(101, util.replicasets[1]) -- -- Vshard ensures that if a replicaset is locked, then it will not diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index c7335fc..298df71 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -995,6 +995,9 @@ local function bucket_recv_xc(bucket_id, from, data, opts) return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, msg, from) end + if is_this_replicaset_locked() then + return nil, lerror.vshard(lerror.code.REPLICASET_IS_LOCKED) + end if not bucket_receiving_quota_add(-1) then return nil, lerror.vshard(lerror.code.TOO_MANY_RECEIVING) end -- 2.24.3 (Apple Git-128)
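The gist of the fix is a symmetric lock check on both ends of a bucket transfer: the sender already refused to move a bucket out of a locked replicaset, and now the receiver refuses to accept one into a locked replicaset too. A minimal sketch of that idea, in Python with purely illustrative names (Replicaset, bucket_send, bucket_recv here are toy stand-ins, not the vshard API):

```python
class Replicaset:
    """Toy model of the lock check on both ends of a bucket move."""

    def __init__(self, name, is_locked=False):
        self.name = name
        self.is_locked = is_locked
        self.buckets = set()

    def bucket_send(self, bucket_id, dst):
        # The lock was always checked on the sending side...
        if self.is_locked:
            return None, 'REPLICASET_IS_LOCKED'
        return dst.bucket_recv(bucket_id, self)

    def bucket_recv(self, bucket_id, src):
        # ...and the fix adds the same check on the receiving side, so a
        # manual bucket_send() into a locked replicaset fails as well.
        if self.is_locked:
            return None, 'REPLICASET_IS_LOCKED'
        self.buckets.add(bucket_id)
        return True, None

rs1 = Replicaset('rs1', is_locked=True)
rs2 = Replicaset('rs2')
ok, err = rs2.bucket_send(101, rs1)
print(ok, err)  # None REPLICASET_IS_LOCKED
```

Before the patch only the first check existed, which is why the new test sends bucket 101 from a non-locked replicaset into a locked one and expects REPLICASET_IS_LOCKED.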
The patch adds functions table_copy_yield and table_minus_yield. Yielding copy creates a duplicate of a table but yields every specified number of keys copied. Yielding minus removes matching key-value pairs specified in one table from another table. It yields every specified number of keys passed. The functions should help to process huge Lua tables (millions of elements and more). These are going to be used on the storage in the new GC algorithm. The algorithm will need to keep a route table on the storage, just like on the router, but with expiration time for the routes. Since bucket count can be millions, it means GC will potentially operate on a huge Lua table and could use some yields so as not to block TX thread for long. Needed for #147 --- test/unit/util.result | 113 ++++++++++++++++++++++++++++++++++++++++ test/unit/util.test.lua | 49 +++++++++++++++++ vshard/util.lua | 40 ++++++++++++++ 3 files changed, 202 insertions(+) diff --git a/test/unit/util.result b/test/unit/util.result index 096e36f..c4fd84d 100644 --- a/test/unit/util.result +++ b/test/unit/util.result @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000) fib:cancel() --- ... +-- Yielding table minus. +minus_yield = util.table_minus_yield +--- +... +minus_yield({}, {}, 1) +--- +- [] +... +minus_yield({}, {k = 1}, 1) +--- +- [] +... +minus_yield({}, {k = 1}, 0) +--- +- [] +... +minus_yield({k = 1}, {k = 1}, 0) +--- +- [] +... +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10) +--- +- k2: 2 +... +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10) +--- +- [] +... +-- Mismatching values are not deleted. +minus_yield({k1 = 1}, {k1 = 2}, 10) +--- +- k1: 1 +... +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) +--- +- k3: 3 + k2: 2 +... 
+do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + f = fiber.create(function() \ + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + end) \ + yield_count = 0 \ + while f:status() ~= 'dead' do \ + yield_count = yield_count + 1 \ + fiber.yield() \ + end \ +end +--- +... +yield_count +--- +- 2 +... +t +--- +- k4: 4 + k1: 1 +... +-- Yielding table copy. +copy_yield = util.table_copy_yield +--- +... +copy_yield({}, 1) +--- +- [] +... +copy_yield({k = 1}, 1) +--- +- k: 1 +... +copy_yield({k1 = 1, k2 = 2}, 1) +--- +- k1: 1 + k2: 2 +... +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + res = nil \ + f = fiber.create(function() \ + res = copy_yield(t, 2) \ + end) \ + yield_count = 0 \ + while f:status() ~= 'dead' do \ + yield_count = yield_count + 1 \ + fiber.yield() \ + end \ +end +--- +... +yield_count +--- +- 2 +... +t +--- +- k3: 3 + k4: 4 + k1: 1 + k2: 2 +... +res +--- +- k3: 3 + k4: 4 + k1: 1 + k2: 2 +... +t ~= res +--- +- true +... diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua index 5f39e06..4d6cbe9 100644 --- a/test/unit/util.test.lua +++ b/test/unit/util.test.lua @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function') while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end test_run:grep_log('default', 'reloadable_function has been started', 1000) fib:cancel() + +-- Yielding table minus. +minus_yield = util.table_minus_yield +minus_yield({}, {}, 1) +minus_yield({}, {k = 1}, 1) +minus_yield({}, {k = 1}, 0) +minus_yield({k = 1}, {k = 1}, 0) +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10) +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10) +-- Mismatching values are not deleted. 
+minus_yield({k1 = 1}, {k1 = 2}, 10) +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) + +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + f = fiber.create(function() \ + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + end) \ + yield_count = 0 \ + while f:status() ~= 'dead' do \ + yield_count = yield_count + 1 \ + fiber.yield() \ + end \ +end +yield_count +t + +-- Yielding table copy. +copy_yield = util.table_copy_yield +copy_yield({}, 1) +copy_yield({k = 1}, 1) +copy_yield({k1 = 1, k2 = 2}, 1) + +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + res = nil \ + f = fiber.create(function() \ + res = copy_yield(t, 2) \ + end) \ + yield_count = 0 \ + while f:status() ~= 'dead' do \ + yield_count = yield_count + 1 \ + fiber.yield() \ + end \ +end +yield_count +t +res +t ~= res diff --git a/vshard/util.lua b/vshard/util.lua index d3b4e67..2362607 100644 --- a/vshard/util.lua +++ b/vshard/util.lua @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need) return minor >= minor_need end +-- +-- Copy @a src table. Fiber yields every @a interval keys copied. +-- +local function table_copy_yield(src, interval) + local res = {} + -- Time-To-Yield. + local tty = interval + for k, v in pairs(src) do + res[k] = v + tty = tty - 1 + if tty <= 0 then + fiber.yield() + tty = interval + end + end + return res +end + +-- +-- Remove @a src keys from @a dst if their values match. Fiber yields every +-- @a interval iterations. +-- +local function table_minus_yield(dst, src, interval) + -- Time-To-Yield. 
+ local tty = interval + for k, srcv in pairs(src) do + if dst[k] == srcv then + dst[k] = nil + end + tty = tty - 1 + if tty <= 0 then + fiber.yield() + tty = interval + end + end + return dst +end + return { tuple_extract_key = tuple_extract_key, reloadable_fiber_create = reloadable_fiber_create, @@ -160,4 +198,6 @@ return { async_task = async_task, internal = M, version_is_at_least = version_is_at_least, + table_copy_yield = table_copy_yield, + table_minus_yield = table_minus_yield, } -- 2.24.3 (Apple Git-128)
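The idea behind both helpers is a simple countdown: do the work key by key and give up control every `interval` iterations so other fibers can run. A sketch in Python for illustration, where a `yield_fn` callback stands in for `fiber.yield()` (the names mirror the Lua functions but are not the vshard API):

```python
def table_copy_yield(src, interval, yield_fn=lambda: None):
    """Copy a dict, invoking yield_fn() every `interval` keys copied."""
    res = {}
    tty = interval  # time-to-yield countdown
    for k, v in src.items():
        res[k] = v
        tty -= 1
        if tty <= 0:
            yield_fn()
            tty = interval
    return res

def table_minus_yield(dst, src, interval, yield_fn=lambda: None):
    """Delete from dst every key whose value matches src[k], invoking
    yield_fn() every `interval` iterations. Mismatching values stay."""
    tty = interval
    for k, v in src.items():
        if dst.get(k) == v:
            del dst[k]
        tty -= 1
        if tty <= 0:
            yield_fn()
            tty = interval
    return dst

yields = []
t = {'k1': 1, 'k2': 2, 'k3': 3, 'k4': 4}
table_minus_yield(t, {'k2': 2, 'k3': 3, 'k5': 5, 'k4': 444}, 2,
                  lambda: yields.append(1))
print(t)            # {'k1': 1, 'k4': 4}  (k4 kept: values mismatch)
print(len(yields))  # 2
```

This matches the unit test above: four source keys with interval 2 produce exactly two yields, and a key with a mismatching value (k4) survives the minus.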
Some vshard options are eventually going to be deprecated. For instance, 'weights' will be renamed, 'collect_lua_garbage' may be deleted since it appears not to be that useful, and 'sync_timeout' is unnecessary since any 'sync' call can take a timeout per-call. But the patch is motivated by 'collect_bucket_garbage_interval', which is going to become unused in the new GC algorithm. The new GC will be reactive instead of proactive: instead of periodically polling the _bucket space, it will react to the needed events immediately. This makes the 'collect interval' unused. The option is deprecated, and eventually, in some far future release, its usage will lead to an error. Needed for #147 --- vshard/cfg.lua | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/vshard/cfg.lua b/vshard/cfg.lua index 1ef1899..28c3400 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -59,7 +59,11 @@ local function validate_config(config, template, check_arg) local value = config[key] local name = template_value.name local expected_type = template_value.type - if value == nil then + if template_value.is_deprecated then + if value ~= nil then + log.warn('Option "%s" is deprecated', name) + end + elseif value == nil then if not template_value.is_optional then error(string.format('%s must be specified', name)) else -- 2.24.3 (Apple Git-128)
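The pattern the diff implements is: a deprecated template entry short-circuits validation and only warns when the option is actually present, while mandatory options still fail hard when missing. A loose sketch in Python for illustration (the function name and template layout only roughly mirror vshard/cfg.lua):

```python
import logging

def validate_config(config, template):
    """Accept deprecated keys with a warning; require mandatory keys."""
    for key, spec in template.items():
        value = config.get(key)
        if spec.get('is_deprecated'):
            # Deprecated option: never validated, only warned about.
            if value is not None:
                logging.warning('Option "%s" is deprecated', spec['name'])
        elif value is None:
            if not spec.get('is_optional'):
                raise ValueError('%s must be specified' % spec['name'])

template = {
    'bucket_count': {'name': 'Bucket count'},
    'collect_bucket_garbage_interval': {
        'name': 'Garbage bucket collect interval',
        'is_deprecated': True,
    },
}
# A deprecated option does not break anything; it merely logs a warning.
validate_config({'bucket_count': 3000,
                 'collect_bucket_garbage_interval': 100}, template)
```

Note that the old type checks for the deprecated option disappear entirely, which is why the 'Garbage bucket collect interval must be positive number' tests are deleted later in the series.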
The garbage collector is a fiber on a master node which deletes GARBAGE and SENT buckets along with their data. It was proactive: it used to wake up with a constant period to find and delete the needed buckets. But this won't work with the future feature called 'map-reduce'. Map-reduce, as a preparation stage, will need to ensure that all buckets on a storage are readable and writable. With the current GC algorithm, if a bucket is sent, it won't be deleted for the next 5 seconds by default. During this time no new map-reduce request can execute. This is not acceptable. Neither is a too frequent wakeup of the GC fiber, because it would waste TX thread time. The patch makes the GC fiber wake up not on a timeout but on events happening with the _bucket space. The GC fiber sleeps on a condition variable which is signaled when _bucket is changed. Once GC sees work to do, it won't sleep until it is done; it will only yield. This makes GC delete SENT and GARBAGE buckets as soon as possible, reducing the waiting time for incoming map-reduce requests. Needed for #147 @TarantoolBot document Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval' It was used to specify the interval between bucket garbage collection steps. It was needed because garbage collection in vshard was proactive and didn't react to newly appeared garbage buckets immediately. Since 0.1.17 garbage collection is reactive: it starts working on garbage buckets immediately as they appear, and sleeps the rest of the time. The option is not used now and does not affect anything. I suppose it can be deleted from the documentation, or left with a big 'deprecated' label + the explanation above. An attempt to use the option does not cause an error, but logs a warning.
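The wakeup scheme can be sketched with a condition variable: writers of _bucket bump a generation counter and broadcast, and the GC sleeps on the condition instead of polling with a fixed period. A toy Python model for illustration (threads stand in for fibers; all names here are illustrative, not the vshard internals):

```python
import threading

class BucketSpace:
    """Toy model of the reactive GC wakeup on _bucket changes."""

    def __init__(self):
        self.generation = 0
        self.cond = threading.Condition()
        self.garbage = []

    def replace(self, bucket_id, status):
        # On-replace trigger: remember work, bump generation, broadcast,
        # like the bucket_generation_cond:broadcast() in the patch.
        with self.cond:
            if status in ('sent', 'garbage'):
                self.garbage.append(bucket_id)
            self.generation += 1
            self.cond.notify_all()

    def gc_wait_for_work(self, seen_generation, timeout=None):
        # GC fiber: sleep until _bucket changes; no periodic polling.
        with self.cond:
            self.cond.wait_for(
                lambda: self.generation > seen_generation, timeout)
            collected, self.garbage = self.garbage[:], []
            return self.generation, collected

space = BucketSpace()
threading.Timer(0.05, space.replace, args=(35, 'sent')).start()
gen, collected = space.gc_wait_for_work(0, timeout=5)
print(gen, collected)  # 1 [35]
```

The point of the design is latency: a SENT bucket becomes collectable the moment the trigger fires, instead of up to one polling interval later, and an idle GC costs nothing because it sleeps on the condition.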
--- test/lua_libs/storage_template.lua | 1 - test/misc/reconfigure.result | 10 - test/misc/reconfigure.test.lua | 3 - test/rebalancer/bucket_ref.result | 12 -- test/rebalancer/bucket_ref.test.lua | 3 - test/rebalancer/errinj.result | 11 -- test/rebalancer/errinj.test.lua | 5 - test/rebalancer/receiving_bucket.result | 8 - test/rebalancer/receiving_bucket.test.lua | 1 - test/reload_evolution/storage.result | 2 +- test/router/reroute_wrong_bucket.result | 8 +- test/router/reroute_wrong_bucket.test.lua | 4 +- test/storage/recovery.result | 3 +- test/storage/storage.result | 10 +- test/storage/storage.test.lua | 1 + test/unit/config.result | 35 +--- test/unit/config.test.lua | 16 +- test/unit/garbage.result | 106 ++++++---- test/unit/garbage.test.lua | 47 +++-- test/unit/garbage_errinj.result | 223 ---------------------- test/unit/garbage_errinj.test.lua | 73 ------- vshard/cfg.lua | 4 +- vshard/consts.lua | 5 +- vshard/storage/init.lua | 207 ++++++++++---------- vshard/storage/reload_evolution.lua | 8 + 25 files changed, 233 insertions(+), 573 deletions(-) delete mode 100644 test/unit/garbage_errinj.result delete mode 100644 test/unit/garbage_errinj.test.lua diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua index 21409bd..8df89f6 100644 --- a/test/lua_libs/storage_template.lua +++ b/test/lua_libs/storage_template.lua @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id) return true end vshard.storage.recovery_wakeup() - vshard.storage.garbage_collector_wakeup() end) end diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result index 168be5d..3b34841 100644 --- a/test/misc/reconfigure.result +++ b/test/misc/reconfigure.result @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true cfg.rebalancer_max_receiving = 1000 --- ... -cfg.collect_bucket_garbage_interval = 100 ---- -... cfg.invalid_option = 'kek' --- ... @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000 --- - true ... 
-vshard.storage.internal.collect_bucket_garbage_interval ~= 100 ---- -- true -... cfg.sync_timeout = nil --- ... @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil cfg.rebalancer_max_receiving = nil --- ... -cfg.collect_bucket_garbage_interval = nil ---- -... cfg.invalid_option = nil --- ... diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua index e891010..348628c 100644 --- a/test/misc/reconfigure.test.lua +++ b/test/misc/reconfigure.test.lua @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout cfg.sync_timeout = 100 cfg.collect_lua_garbage = true cfg.rebalancer_max_receiving = 1000 -cfg.collect_bucket_garbage_interval = 100 cfg.invalid_option = 'kek' vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) not vshard.storage.internal.collect_lua_garbage vshard.storage.internal.sync_timeout vshard.storage.internal.rebalancer_max_receiving ~= 1000 -vshard.storage.internal.collect_bucket_garbage_interval ~= 100 cfg.sync_timeout = nil cfg.collect_lua_garbage = nil cfg.rebalancer_max_receiving = nil -cfg.collect_bucket_garbage_interval = nil cfg.invalid_option = nil -- diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result index b8fc7ff..9df7480 100644 --- a/test/rebalancer/bucket_ref.result +++ b/test/rebalancer/bucket_ref.result @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read') - true ... -- Force GC to take an RO lock on the bucket now. -vshard.storage.garbage_collector_wakeup() ---- -... vshard.storage.buckets_info(1) --- - 1: @@ -203,7 +200,6 @@ while true do if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then break end - vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end; --- @@ -235,14 +231,6 @@ finish_refs = true while f1:status() ~= 'dead' do fiber.sleep(0.01) end --- ... -vshard.storage.buckets_info(1) ---- -- 1: - status: garbage - ro_lock: true - destination: <replicaset_2> - id: 1 -... wait_bucket_is_collected(1) --- ... 
diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua index 213ced3..1b032ff 100644 --- a/test/rebalancer/bucket_ref.test.lua +++ b/test/rebalancer/bucket_ref.test.lua @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs. vshard.storage.bucket_ref(1, 'read') vshard.storage.bucket_unref(1, 'read') -- Force GC to take an RO lock on the bucket now. -vshard.storage.garbage_collector_wakeup() vshard.storage.buckets_info(1) _ = test_run:cmd("setopt delimiter ';'") while true do @@ -64,7 +63,6 @@ while true do if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then break end - vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end; _ = test_run:cmd("setopt delimiter ''"); @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1) vshard.storage.bucket_refro(1) finish_refs = true while f1:status() ~= 'dead' do fiber.sleep(0.01) end -vshard.storage.buckets_info(1) wait_bucket_is_collected(1) _ = test_run:switch('box_2_a') vshard.storage.buckets_info(1) diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result index e50eb72..0ddb1c9 100644 --- a/test/rebalancer/errinj.result +++ b/test/rebalancer/errinj.result @@ -226,17 +226,6 @@ ret2, err2 - true - null ... -_bucket:get{35} ---- -- [35, 'sent', '<replicaset_2>'] -... -_bucket:get{36} ---- -- [36, 'sent', '<replicaset_2>'] -... --- Buckets became 'active' on box_2_a, but still are sending on --- box_1_a. Wait until it is marked as garbage on box_1_a by the --- recovery fiber. wait_bucket_is_collected(35) --- ... 
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua index 2cc4a69..a60f3d7 100644 --- a/test/rebalancer/errinj.test.lua +++ b/test/rebalancer/errinj.test.lua @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a') while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end ret1, err1 ret2, err2 -_bucket:get{35} -_bucket:get{36} --- Buckets became 'active' on box_2_a, but still are sending on --- box_1_a. Wait until it is marked as garbage on box_1_a by the --- recovery fiber. wait_bucket_is_collected(35) wait_bucket_is_collected(36) _ = test_run:switch('box_2_a') diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result index 7d3612b..ad93445 100644 --- a/test/rebalancer/receiving_bucket.result +++ b/test/rebalancer/receiving_bucket.result @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3}) --- - true ... -vshard.storage.buckets_info(1) ---- -- 1: - status: sent - ro_lock: true - destination: <replicaset_1> - id: 1 -... wait_bucket_is_collected(1) --- ... diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua index 24534b3..2cf6382 100644 --- a/test/rebalancer/receiving_bucket.test.lua +++ b/test/rebalancer/receiving_bucket.test.lua @@ -136,7 +136,6 @@ box.space.test3:select{100} -- Now the bucket is unreferenced and can be transferred. _ = test_run:switch('box_2_a') vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3}) -vshard.storage.buckets_info(1) wait_bucket_is_collected(1) vshard.storage.buckets_info(1) _ = test_run:switch('box_1_a') diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result index 753687f..9d30a04 100644 --- a/test/reload_evolution/storage.result +++ b/test/reload_evolution/storage.result @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ... 
vshard.storage.internal.reload_version --- -- 2 +- 3 ... -- -- gh-237: should be only one trigger. During gh-237 the trigger installation diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 049bdef..ac340eb 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -37,7 +37,7 @@ test_run:switch('storage_1_a') --- - true ... -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 --- ... vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) @@ -53,7 +53,7 @@ test_run:switch('storage_2_a') --- - true ... -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 --- ... vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a) @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration') err --- - bucket_id: 100 - reason: write is prohibited + reason: Not found code: 1 destination: ac522f65-aa94-4134-9f64-51ee384f1a54 type: ShardingError name: WRONG_BUCKET - message: 'Cannot perform action with bucket 100, reason: write is prohibited' + message: 'Cannot perform action with bucket 100, reason: Not found' ... -- -- Now try again, but update configuration during call(). 
It must diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 9e6e804..207aac3 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt test_run:cmd('create server router_1 with script="router/router_1.lua"') test_run:cmd('start server router_1') test_run:switch('storage_1_a') -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) vshard.storage.rebalancer_disable() for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end test_run:switch('storage_2_a') -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a) vshard.storage.rebalancer_disable() for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end diff --git a/test/storage/recovery.result b/test/storage/recovery.result index f833fe7..8ccb0b9 100644 --- a/test/storage/recovery.result +++ b/test/storage/recovery.result @@ -79,8 +79,7 @@ _bucket = box.space._bucket ... _bucket:select{} --- -- - [2, 'garbage', '<replicaset_2>'] - - [3, 'garbage', '<replicaset_2>'] +- [] ... _ = test_run:switch('storage_2_a') --- diff --git a/test/storage/storage.result b/test/storage/storage.result index 424bc4c..0550ad1 100644 --- a/test/storage/storage.result +++ b/test/storage/storage.result @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2]) --- - true ... +wait_bucket_is_collected(1) +--- +... _ = test_run:switch("storage_2_a") --- ... @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a") ... vshard.storage.buckets_info() --- -- 1: - status: sent - ro_lock: true - destination: <replicaset_2> - id: 1 - 2: +- 2: status: active id: 2 ... 
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua index d631b51..d8fbd94 100644 --- a/test/storage/storage.test.lua +++ b/test/storage/storage.test.lua @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1]) -- Successful transfer. vshard.storage.bucket_send(1, util.replicasets[2]) +wait_bucket_is_collected(1) _ = test_run:switch("storage_2_a") vshard.storage.buckets_info() _ = test_run:switch("storage_1_a") diff --git a/test/unit/config.result b/test/unit/config.result index dfd0219..e0b2482 100644 --- a/test/unit/config.result +++ b/test/unit/config.result @@ -428,33 +428,6 @@ _ = lcfg.check(cfg) -- -- gh-77: garbage collection options. -- -cfg.collect_bucket_garbage_interval = 'str' ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = 0 ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = -1 ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = 100.5 ---- -... -_ = lcfg.check(cfg) ---- -... cfg.collect_lua_garbage = 100 --- ... @@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending cfg.rebalancer_max_sending = nil --- ... -cfg.sharding = nil +-- +-- Deprecated option does not break anything. +-- +cfg.collect_bucket_garbage_interval = 100 +--- +... +_ = lcfg.check(cfg) --- ... diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua index ada43db..a1c9f07 100644 --- a/test/unit/config.test.lua +++ b/test/unit/config.test.lua @@ -175,15 +175,6 @@ _ = lcfg.check(cfg) -- -- gh-77: garbage collection options. 
-- -cfg.collect_bucket_garbage_interval = 'str' -check(cfg) -cfg.collect_bucket_garbage_interval = 0 -check(cfg) -cfg.collect_bucket_garbage_interval = -1 -check(cfg) -cfg.collect_bucket_garbage_interval = 100.5 -_ = lcfg.check(cfg) - cfg.collect_lua_garbage = 100 check(cfg) cfg.collect_lua_garbage = true @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg) cfg.rebalancer_max_sending = 15 lcfg.check(cfg).rebalancer_max_sending cfg.rebalancer_max_sending = nil -cfg.sharding = nil + +-- +-- Deprecated option does not break anything. +-- +cfg.collect_bucket_garbage_interval = 100 +_ = lcfg.check(cfg) diff --git a/test/unit/garbage.result b/test/unit/garbage.result index 74d9ccf..a530496 100644 --- a/test/unit/garbage.result +++ b/test/unit/garbage.result @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''"); vshard.storage.internal.shard_index = 'bucket_id' --- ... -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL ---- -... -- -- Find nothing if no bucket_id anywhere, or there is no index -- by it, or bucket_id is not unsigned. @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'} format[2] = {name = 'status', type = 'string'} --- ... +format[3] = {name = 'destination', type = 'string', is_nullable = true} +--- +... _bucket = box.schema.create_space('_bucket', {format = format}) --- ... @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE} --- - [3, 'active'] ... -_bucket:replace{4, vshard.consts.BUCKET.SENT} ---- -- [4, 'sent'] -... -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} ---- -- [5, 'garbage'] -... -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE} ---- -- [6, 'garbage'] -... -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE} ---- -- [200, 'garbage'] -... s = box.schema.create_space('test', {engine = engine}) --- ... @@ -213,7 +197,7 @@ s:replace{4, 2} --- - [4, 2] ... 
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop --- ... s2 = box.schema.create_space('test2', {engine = engine}) @@ -249,6 +233,10 @@ function fill_spaces_with_garbage() s2:replace{6, 4} s2:replace{7, 5} s2:replace{7, 6} + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'} + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE} + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'} + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE} end; --- ... @@ -267,12 +255,22 @@ fill_spaces_with_garbage() --- - 1107 ... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) --- -- - 5 - - 6 - - 200 - true +- null +... +route_map +--- +- - null + - null + - null + - null + - null + - destination2 ... #s2:select{} --- @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) --- - 7 ... -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) --- -- - 4 - true +- null +... +route_map +--- +- - null + - null + - null + - destination1 ... s2:select{} --- @@ -303,17 +311,22 @@ s:select{} - [6, 100] ... -- Nothing deleted - update collected generation. -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) --- -- - 5 - - 6 - - 200 - true +- null ... -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) --- -- - 4 - true +- null +... +route_map +--- +- [] ... #s2:select{} --- @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) fill_spaces_with_garbage() --- ... 
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) +_ = _bucket:on_replace(function() \ + local gen = vshard.storage.internal.bucket_generation \ + vshard.storage.internal.bucket_generation = gen + 1 \ + vshard.storage.internal.bucket_generation_cond:broadcast() \ +end) --- ... f = fiber.create(vshard.storage.internal.gc_bucket_f) --- ... -- Wait until garbage collection is finished. -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end) --- +- true ... s:select{} --- @@ -360,7 +378,6 @@ _bucket:select{} - - [1, 'active'] - [2, 'receiving'] - [3, 'active'] - - [4, 'sent'] ... -- -- Test deletion of 'sent' buckets after a specified timeout. @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT} - [2, 'sent'] ... -- Wait deletion after a while. -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{2} end) --- +- true ... _bucket:select{} --- @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT} --- - [4, 'sent'] ... -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) --- +- true ... -- -- Test WAL errors during deletion from _bucket. @@ -434,11 +453,14 @@ s:replace{6, 4} --- - [6, 4] ... -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_log('default', 'Error during garbage collection step', \ + 65536, 10) --- +- Error during garbage collection step ... 
-while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return #sk:select{4} == 0 end) --- +- true ... s:select{} --- @@ -454,8 +476,9 @@ _bucket:select{} _ = _bucket:on_replace(nil, rollback_on_delete) --- ... -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) --- +- true ... f:cancel() --- @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i, f = fiber.create(vshard.storage.internal.gc_bucket_f) --- ... -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return _bucket:count() == 0 end) --- +- true ... _bucket:select{} --- diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua index 30079fa..250afb0 100644 --- a/test/unit/garbage.test.lua +++ b/test/unit/garbage.test.lua @@ -15,7 +15,6 @@ end; test_run:cmd("setopt delimiter ''"); vshard.storage.internal.shard_index = 'bucket_id' -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL -- -- Find nothing if no bucket_id anywhere, or there is no index @@ -75,16 +74,13 @@ s:drop() format = {} format[1] = {name = 'id', type = 'unsigned'} format[2] = {name = 'status', type = 'string'} +format[3] = {name = 'destination', type = 'string', is_nullable = true} _bucket = box.schema.create_space('_bucket', {format = format}) _ = _bucket:create_index('pk') _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) _bucket:replace{1, vshard.consts.BUCKET.ACTIVE} _bucket:replace{2, vshard.consts.BUCKET.RECEIVING} _bucket:replace{3, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{4, vshard.consts.BUCKET.SENT} -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE} -_bucket:replace{200, 
vshard.consts.BUCKET.GARBAGE} s = box.schema.create_space('test', {engine = engine}) pk = s:create_index('pk') @@ -94,7 +90,7 @@ s:replace{2, 1} s:replace{3, 2} s:replace{4, 2} -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop s2 = box.schema.create_space('test2', {engine = engine}) pk2 = s2:create_index('pk') sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) @@ -114,6 +110,10 @@ function fill_spaces_with_garbage() s2:replace{6, 4} s2:replace{7, 5} s2:replace{7, 6} + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'} + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE} + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'} + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE} end; test_run:cmd("setopt delimiter ''"); @@ -121,15 +121,21 @@ fill_spaces_with_garbage() #s2:select{} #s:select{} -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) +route_map #s2:select{} #s:select{} -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) +route_map s2:select{} s:select{} -- Nothing deleted - update collected generation. -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) +route_map #s2:select{} #s:select{} @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -- Test continuous garbage collection via background fiber. 
-- fill_spaces_with_garbage() -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) +_ = _bucket:on_replace(function() \ + local gen = vshard.storage.internal.bucket_generation \ + vshard.storage.internal.bucket_generation = gen + 1 \ + vshard.storage.internal.bucket_generation_cond:broadcast() \ +end) f = fiber.create(vshard.storage.internal.gc_bucket_f) -- Wait until garbage collection is finished. -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end) s:select{} s2:select{} -- Check garbage bucket is deleted by background fiber. @@ -150,7 +160,7 @@ _bucket:select{} -- _bucket:replace{2, vshard.consts.BUCKET.SENT} -- Wait deletion after a while. -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{2} end) _bucket:select{} s:select{} s2:select{} @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE} s:replace{5, 4} s:replace{6, 4} _bucket:replace{4, vshard.consts.BUCKET.SENT} -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) -- -- Test WAL errors during deletion from _bucket. 
@@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete) _bucket:replace{4, vshard.consts.BUCKET.SENT} s:replace{5, 4} s:replace{6, 4} -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_log('default', 'Error during garbage collection step', \ + 65536, 10) +test_run:wait_cond(function() return #sk:select{4} == 0 end) s:select{} _bucket:select{} _ = _bucket:on_replace(nil, rollback_on_delete) -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) f:cancel() @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i, #s:select{} #s2:select{} f = fiber.create(vshard.storage.internal.gc_bucket_f) -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return _bucket:count() == 0 end) _bucket:select{} s:select{} s2:select{} diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result deleted file mode 100644 index 92c8039..0000000 --- a/test/unit/garbage_errinj.result +++ /dev/null @@ -1,223 +0,0 @@ -test_run = require('test_run').new() ---- -... -vshard = require('vshard') ---- -... -fiber = require('fiber') ---- -... -engine = test_run:get_cfg('engine') ---- -... -vshard.storage.internal.shard_index = 'bucket_id' ---- -... -format = {} ---- -... -format[1] = {name = 'id', type = 'unsigned'} ---- -... -format[2] = {name = 'status', type = 'string', is_nullable = true} ---- -... -_bucket = box.schema.create_space('_bucket', {format = format}) ---- -... -_ = _bucket:create_index('pk') ---- -... -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) ---- -... 
-_bucket:replace{1, vshard.consts.BUCKET.ACTIVE} ---- -- [1, 'active'] -... -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING} ---- -- [2, 'receiving'] -... -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE} ---- -- [3, 'active'] -... -_bucket:replace{4, vshard.consts.BUCKET.SENT} ---- -- [4, 'sent'] -... -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} ---- -- [5, 'garbage'] -... -s = box.schema.create_space('test', {engine = engine}) ---- -... -pk = s:create_index('pk') ---- -... -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) ---- -... -s:replace{1, 1} ---- -- [1, 1] -... -s:replace{2, 1} ---- -- [2, 1] -... -s:replace{3, 2} ---- -- [3, 2] -... -s:replace{4, 2} ---- -- [4, 2] -... -s:replace{5, 100} ---- -- [5, 100] -... -s:replace{6, 100} ---- -- [6, 100] -... -s:replace{7, 4} ---- -- [7, 4] -... -s:replace{8, 5} ---- -- [8, 5] -... -s2 = box.schema.create_space('test2', {engine = engine}) ---- -... -pk2 = s2:create_index('pk') ---- -... -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) ---- -... -s2:replace{1, 1} ---- -- [1, 1] -... -s2:replace{3, 3} ---- -- [3, 3] -... -for i = 7, 1107 do s:replace{i, 200} end ---- -... -s2:replace{4, 200} ---- -- [4, 200] -... -s2:replace{5, 100} ---- -- [5, 100] -... -s2:replace{5, 300} ---- -- [5, 300] -... -s2:replace{6, 4} ---- -- [6, 4] -... -s2:replace{7, 5} ---- -- [7, 5] -... -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type ---- -... -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) ---- -- - 4 -- true -... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) ---- -- - 5 -- true -... --- --- Test _bucket generation change during garbage buckets search. --- -s:truncate() ---- -... -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) ---- -... -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true ---- -... 
-f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end) ---- -... -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE} ---- -- [4, 'garbage'] -... -s:replace{5, 4} ---- -- [5, 4] -... -s:replace{6, 4} ---- -- [6, 4] -... -#s:select{} ---- -- 2 -... -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false ---- -... -while f:status() ~= 'dead' do fiber.sleep(0.1) end ---- -... --- Nothing is deleted - _bucket:replace() has changed _bucket --- generation during search of garbage buckets. -#s:select{} ---- -- 2 -... -_bucket:select{4} ---- -- - [4, 'garbage'] -... --- Next step deletes garbage ok. -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) ---- -- [] -- true -... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) ---- -- - 4 - - 5 -- true -... -#s:select{} ---- -- 0 -... -_bucket:delete{4} ---- -- [4, 'garbage'] -... -s2:drop() ---- -... -s:drop() ---- -... -_bucket:drop() ---- -... diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua deleted file mode 100644 index 31184b9..0000000 --- a/test/unit/garbage_errinj.test.lua +++ /dev/null @@ -1,73 +0,0 @@ -test_run = require('test_run').new() -vshard = require('vshard') -fiber = require('fiber') - -engine = test_run:get_cfg('engine') -vshard.storage.internal.shard_index = 'bucket_id' - -format = {} -format[1] = {name = 'id', type = 'unsigned'} -format[2] = {name = 'status', type = 'string', is_nullable = true} -_bucket = box.schema.create_space('_bucket', {format = format}) -_ = _bucket:create_index('pk') -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING} -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{4, vshard.consts.BUCKET.SENT} -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} - -s = box.schema.create_space('test', {engine = engine}) -pk = 
s:create_index('pk') -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) -s:replace{1, 1} -s:replace{2, 1} -s:replace{3, 2} -s:replace{4, 2} -s:replace{5, 100} -s:replace{6, 100} -s:replace{7, 4} -s:replace{8, 5} - -s2 = box.schema.create_space('test2', {engine = engine}) -pk2 = s2:create_index('pk') -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) -s2:replace{1, 1} -s2:replace{3, 3} -for i = 7, 1107 do s:replace{i, 200} end -s2:replace{4, 200} -s2:replace{5, 100} -s2:replace{5, 300} -s2:replace{6, 4} -s2:replace{7, 5} - -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) - --- --- Test _bucket generation change during garbage buckets search. --- -s:truncate() -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end) -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE} -s:replace{5, 4} -s:replace{6, 4} -#s:select{} -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false -while f:status() ~= 'dead' do fiber.sleep(0.1) end --- Nothing is deleted - _bucket:replace() has changed _bucket --- generation during search of garbage buckets. -#s:select{} -_bucket:select{4} --- Next step deletes garbage ok. 
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) -#s:select{} -_bucket:delete{4} - -s2:drop() -s:drop() -_bucket:drop() diff --git a/vshard/cfg.lua b/vshard/cfg.lua index 28c3400..1345058 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -245,9 +245,7 @@ local cfg_template = { max = consts.REBALANCER_MAX_SENDING_MAX }, collect_bucket_garbage_interval = { - type = 'positive number', name = 'Garbage bucket collect interval', - is_optional = true, - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL + name = 'Garbage bucket collect interval', is_deprecated = true, }, collect_lua_garbage = { type = 'boolean', name = 'Garbage Lua collect necessity', diff --git a/vshard/consts.lua b/vshard/consts.lua index 8c2a8b0..3f1585a 100644 --- a/vshard/consts.lua +++ b/vshard/consts.lua @@ -23,6 +23,7 @@ return { DEFAULT_BUCKET_COUNT = 3000; BUCKET_SENT_GARBAGE_DELAY = 0.5; BUCKET_CHUNK_SIZE = 1000; + LUA_CHUNK_SIZE = 100000, DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1; REBALANCER_IDLE_INTERVAL = 60 * 60; REBALANCER_WORK_INTERVAL = 10; @@ -37,7 +38,7 @@ return { DEFAULT_FAILOVER_PING_TIMEOUT = 5; DEFAULT_SYNC_TIMEOUT = 1; RECONNECT_TIMEOUT = 0.5; - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5; + GC_BACKOFF_INTERVAL = 5, RECOVERY_INTERVAL = 5; COLLECT_LUA_GARBAGE_INTERVAL = 100; @@ -45,4 +46,6 @@ return { DISCOVERY_WORK_INTERVAL = 1, DISCOVERY_WORK_STEP = 0.01, DISCOVERY_TIMEOUT = 10, + + TIMEOUT_INFINITY = 500 * 365 * 86400, } diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 298df71..31a6fc7 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -69,7 +69,6 @@ if not M then total_bucket_count = 0, errinj = { ERRINJ_CFG = false, - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false, ERRINJ_RELOAD = false, ERRINJ_CFG_DELAY = false, ERRINJ_LONG_RECEIVE = false, @@ -96,6 +95,8 @@ if not M then -- detect that _bucket was not changed between yields. 
-- bucket_generation = 0, + -- Condition variable fired on generation update. + bucket_generation_cond = lfiber.cond(), -- -- Reference to the function used as on_replace trigger on -- _bucket space. It is used to replace the trigger with @@ -107,12 +108,14 @@ if not M then -- replace the old function is to keep its reference. -- bucket_on_replace = nil, + -- Redirects for recently sent buckets. They are kept for a while to + -- help routers to find a new location for sent and deleted buckets + -- without whole cluster scan. + route_map = {}, ------------------- Garbage collection ------------------- -- Fiber to remove garbage buckets data. collect_bucket_garbage_fiber = nil, - -- Do buckets garbage collection once per this time. - collect_bucket_garbage_interval = nil, -- Boolean lua_gc state (create periodic gc task). collect_lua_garbage = nil, @@ -173,6 +176,7 @@ end -- local function bucket_generation_increment() M.bucket_generation = M.bucket_generation + 1 + M.bucket_generation_cond:broadcast() end -- @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode) else return bucket end + local dst = bucket and bucket.destination or M.route_map[bucket_id] return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason, - bucket and bucket.destination) + dst) end -- @@ -804,11 +809,23 @@ end -- local function bucket_unrefro(bucket_id) local ref = M.bucket_refs[bucket_id] - if not ref or ref.ro == 0 then + local count = ref and ref.ro or 0 + if count == 0 then return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, "no refs", nil) end - ref.ro = ref.ro - 1 + if count == 1 then + ref.ro = 0 + if ref.ro_lock then + -- Garbage collector is waiting for the bucket if RO + -- is locked. Let it know it has one more bucket to + -- collect. It relies on generation, so its increment + -- is enough.
+ bucket_generation_increment() + end + return true + end + ref.ro = count - 1 return true end @@ -1479,79 +1496,44 @@ local function gc_bucket_in_space(space, bucket_id, status) end -- --- Remove tuples from buckets of a specified type. --- @param type Type of buckets to gc. --- @retval List of ids of empty buckets of the type. +-- Drop buckets with the given status along with their data in all spaces. +-- @param status Status of target buckets. +-- @param route_map Destinations of deleted buckets are saved into this table. -- -local function gc_bucket_step_by_type(type) - local sharded_spaces = find_sharded_spaces() - local empty_buckets = {} +local function gc_bucket_drop_xc(status, route_map) local limit = consts.BUCKET_CHUNK_SIZE - local is_all_collected = true - for _, bucket in box.space._bucket.index.status:pairs(type) do - local bucket_id = bucket.id - local ref = M.bucket_refs[bucket_id] + local _bucket = box.space._bucket + local sharded_spaces = find_sharded_spaces() + for _, b in _bucket.index.status:pairs(status) do + local id = b.id + local ref = M.bucket_refs[id] if ref then assert(ref.rw == 0) if ref.ro ~= 0 then ref.ro_lock = true - is_all_collected = false goto continue end - M.bucket_refs[bucket_id] = nil + M.bucket_refs[id] = nil end for _, space in pairs(sharded_spaces) do - gc_bucket_in_space_xc(space, bucket_id, type) + gc_bucket_in_space_xc(space, id, status) limit = limit - 1 if limit == 0 then lfiber.sleep(0) limit = consts.BUCKET_CHUNK_SIZE end end - table.insert(empty_buckets, bucket.id) -::continue:: + route_map[id] = b.destination + _bucket:delete{id} + ::continue:: end - return empty_buckets, is_all_collected -end - --- --- Drop buckets with ids in the list. --- @param bucket_ids Bucket ids to drop. --- @param status Expected bucket status. 
--- -local function gc_bucket_drop_xc(bucket_ids, status) - if #bucket_ids == 0 then - return - end - local limit = consts.BUCKET_CHUNK_SIZE - box.begin() - local _bucket = box.space._bucket - for _, id in pairs(bucket_ids) do - local bucket_exists = _bucket:get{id} ~= nil - local b = _bucket:get{id} - if b then - if b.status ~= status then - return error(string.format('Bucket %d status is changed. Was '.. - '%s, became %s', id, status, - b.status)) - end - _bucket:delete{id} - end - limit = limit - 1 - if limit == 0 then - box.commit() - box.begin() - limit = consts.BUCKET_CHUNK_SIZE - end - end - box.commit() end -- -- Exception safe version of gc_bucket_drop_xc. -- -local function gc_bucket_drop(bucket_ids, status) - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status) +local function gc_bucket_drop(status, route_map) + local status, err = pcall(gc_bucket_drop_xc, status, route_map) if not status then box.rollback() end @@ -1578,65 +1560,75 @@ function gc_bucket_f() -- generation == bucket generation. In such a case the fiber -- does nothing until next _bucket change. local bucket_generation_collected = -1 - -- Empty sent buckets are collected into an array. After a - -- specified time interval the buckets are deleted both from - -- this array and from _bucket space. - local buckets_for_redirect = {} - local buckets_for_redirect_ts = clock() - -- Empty sent buckets, updated after each step, and when - -- buckets_for_redirect is deleted, it gets empty_sent_buckets - -- for next deletion. - local empty_garbage_buckets, empty_sent_buckets, status, err + local bucket_generation_current = M.bucket_generation + -- Deleted buckets are saved into a route map to redirect routers if they + -- didn't discover new location of the buckets yet. However route map does + -- not grow infinitely. Otherwise it would end up storing redirects for all + -- buckets in the cluster. Which could also be outdated. 
+ -- Garbage collector periodically drops old routes from the map. For that it + -- remembers state of route map in one moment, and after a while clears the + -- remembered routes from the global route map. + local route_map = M.route_map + local route_map_old = {} + local route_map_deadline = 0 + local status, err while M.module_version == module_version do - -- Check if no changes in buckets configuration. - if bucket_generation_collected ~= M.bucket_generation then - local bucket_generation = M.bucket_generation - local is_sent_collected, is_garbage_collected - status, empty_garbage_buckets, is_garbage_collected = - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE) - if not status then - err = empty_garbage_buckets - goto check_error - end - status, empty_sent_buckets, is_sent_collected = - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT) - if not status then - err = empty_sent_buckets - goto check_error + if bucket_generation_collected ~= bucket_generation_current then + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map) + if status then + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map) end - status, err = gc_bucket_drop(empty_garbage_buckets, - consts.BUCKET.GARBAGE) -::check_error:: if not status then box.rollback() log.error('Error during garbage collection step: %s', err) - goto continue + else + -- Don't use global generation. During the collection it could + -- already change. Instead, remember the generation known before + -- the collection has started. + -- Since the collection also changes the generation, it makes + -- the GC happen always at least twice. But typically on the + -- second iteration it should not find any buckets to collect, + -- and then the collected generation matches the global one. 
+ bucket_generation_collected = bucket_generation_current end - if is_sent_collected and is_garbage_collected then - bucket_generation_collected = bucket_generation + else + status = true + end + + local sleep_time = route_map_deadline - clock() + if sleep_time <= 0 then + local chunk = consts.LUA_CHUNK_SIZE + util.table_minus_yield(route_map, route_map_old, chunk) + route_map_old = util.table_copy_yield(route_map, chunk) + if next(route_map_old) then + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY + else + sleep_time = consts.TIMEOUT_INFINITY end + route_map_deadline = clock() + sleep_time end + bucket_generation_current = M.bucket_generation - if clock() - buckets_for_redirect_ts >= - consts.BUCKET_SENT_GARBAGE_DELAY then - status, err = gc_bucket_drop(buckets_for_redirect, - consts.BUCKET.SENT) - if not status then - buckets_for_redirect = {} - empty_sent_buckets = {} - bucket_generation_collected = -1 - log.error('Error during deletion of empty sent buckets: %s', - err) - elseif M.module_version ~= module_version then - return + if bucket_generation_current ~= bucket_generation_collected then + -- Generation was changed during collection. Or *by* collection. + if status then + -- Retry immediately. If the generation was changed by the + -- collection itself, it will notice it next iteration, and go + -- to proper sleep. + sleep_time = 0 else - buckets_for_redirect = empty_sent_buckets or {} - empty_sent_buckets = nil - buckets_for_redirect_ts = clock() + -- An error happened during the collection. Does not make sense + -- to retry on each iteration of the event loop. The most likely + -- errors are either a WAL error or a transaction abort - both + -- look like an issue in the user's code and can't be fixed + -- quickly anyway. Backoff. 
+ sleep_time = consts.GC_BACKOFF_INTERVAL end end -::continue:: - lfiber.sleep(M.collect_bucket_garbage_interval) + + if M.module_version == module_version then + M.bucket_generation_cond:wait(sleep_time) + end end end @@ -2421,8 +2413,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) vshard_cfg.rebalancer_disbalance_threshold M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving M.shard_index = vshard_cfg.shard_index - M.collect_bucket_garbage_interval = - vshard_cfg.collect_bucket_garbage_interval M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending M.current_cfg = cfg @@ -2676,6 +2666,9 @@ else storage_cfg(M.current_cfg, M.this_replica.uuid, true) end M.module_version = M.module_version + 1 + -- Background fibers could sleep waiting for bucket changes. + -- Let them know it is time to reload. + bucket_generation_increment() end M.recovery_f = recovery_f @@ -2686,7 +2679,7 @@ M.gc_bucket_f = gc_bucket_f -- These functions are saved in M not for atomic reload, but for -- unit testing. -- -M.gc_bucket_step_by_type = gc_bucket_step_by_type +M.gc_bucket_drop = gc_bucket_drop M.rebalancer_build_routes = rebalancer_build_routes M.rebalancer_calculate_metrics = rebalancer_calculate_metrics M.cached_find_sharded_spaces = find_sharded_spaces diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua index f38af74..484f499 100644 --- a/vshard/storage/reload_evolution.lua +++ b/vshard/storage/reload_evolution.lua @@ -4,6 +4,7 @@ -- in a commit. -- local log = require('log') +local fiber = require('fiber') -- -- Array of upgrade functions. @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M) end end +migrations[#migrations + 1] = function(M) + if not M.route_map then + M.bucket_generation_cond = fiber.cond() + M.route_map = {} + end +end + -- -- Perform an update based on a version stored in `M` (internals). 
-- @param M Old module internals which should be updated. -- 2.24.3 (Apple Git-128)
Recovery is a fiber on a master node which tries to resolve SENDING/RECEIVING buckets into GARBAGE or ACTIVE in case they are stuck. Usually that happens due to a conflict on the receiving side, or when a restart interrupts a bucket send. Recovery used to be proactive: it woke up with a constant period to find and resolve such buckets. But this won't work with the upcoming feature called 'map-reduce'. As a preparation stage, map-reduce will need to ensure that all buckets on a storage are readable and writable. With the old recovery algorithm, a broken bucket wouldn't be recovered for up to 5 seconds by default, and during that time no new map-reduce request could execute. This is not acceptable. Simply waking the recovery fiber more often is not acceptable either, because it would waste TX thread time. The patch makes the recovery fiber wake up not on a timeout but on events happening with the _bucket space. The recovery fiber sleeps on a condition variable which is signaled whenever _bucket is changed. This is very similar to the reactive GC feature introduced in the previous commit. It is worth mentioning that the backoff happens not only when a bucket couldn't be recovered (for example, its transfer is still in progress), but also when a network error prevented recovery from checking the bucket's state on the other storage. Retrying network errors immediately after they appear would be a useless busy loop, so recovery uses a backoff interval for them as well. 
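The sleep-time selection described above can be sketched as a standalone function. This is a simplified illustration, not the patch code: in the patch the logic is inlined into recovery_f() and the constants come from vshard/consts.lua, while the function name here is hypothetical.

```lua
-- Constants mirror RECOVERY_BACKOFF_INTERVAL and TIMEOUT_INFINITY from
-- vshard/consts.lua in this patchset.
local RECOVERY_BACKOFF_INTERVAL = 5
local TIMEOUT_INFINITY = 500 * 365 * 86400

-- Decide how long the recovery fiber should sleep after one iteration.
-- is_all_recovered - whether the last step resolved all stuck buckets;
-- generation_recovered / generation_current - the _bucket generation the
-- fiber has fully processed vs. the one seen at the start of the step.
local function recovery_sleep_time(is_all_recovered, generation_recovered,
                                   generation_current)
    if not is_all_recovered then
        -- Either some transfers are still in progress or a network error
        -- happened. Retrying immediately would be a busy loop - back off.
        return RECOVERY_BACKOFF_INTERVAL
    elseif generation_recovered ~= generation_current then
        -- _bucket changed while the step was running - retry right away.
        return 0
    end
    -- Everything is recovered - sleep until bucket_generation_cond is
    -- broadcast by the next _bucket change (or a reload).
    return TIMEOUT_INFINITY
end
```

The fiber then passes the result to M.bucket_generation_cond:wait(sleep_time) instead of the old unconditional lfiber.sleep(consts.RECOVERY_INTERVAL).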
Needed for #147 --- test/router/router.result | 22 ++++++++--- test/router/router.test.lua | 13 ++++++- test/storage/recovery.result | 8 ++++ test/storage/recovery.test.lua | 5 +++ test/storage/recovery_errinj.result | 16 +++++++- test/storage/recovery_errinj.test.lua | 9 ++++- vshard/consts.lua | 2 +- vshard/storage/init.lua | 54 +++++++++++++++++++++++---- 8 files changed, 110 insertions(+), 19 deletions(-) diff --git a/test/router/router.result b/test/router/router.result index b2efd6d..3c1d073 100644 --- a/test/router/router.result +++ b/test/router/router.result @@ -312,6 +312,11 @@ replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err _ = test_run:switch('storage_2_a') --- ... +-- Pause recovery. It is too aggressive, and the test needs to see buckets in +-- their intermediate states. +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]}) --- - [1, 'sending', '<replicaset_1>'] @@ -319,6 +324,9 @@ box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]} _ = test_run:switch('storage_1_a') --- ... +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]}) --- - [1, 'receiving', '<replicaset_2>'] @@ -342,19 +350,21 @@ util.check_error(vshard.router.call, 1, 'write', 'echo', {123}) name: TRANSFER_IS_IN_PROGRESS message: Bucket 1 is transferring to replicaset <replicaset_1> ... -_ = test_run:switch('storage_2_a') +_ = test_run:switch('storage_1_a') +--- +... +box.space._bucket:delete({1}) --- +- [1, 'receiving', '<replicaset_2>'] ... -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE}) +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false --- -- [1, 'active'] ... -_ = test_run:switch('storage_1_a') +_ = test_run:switch('storage_2_a') --- ... 
-box.space._bucket:delete({1}) +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false --- -- [1, 'receiving', '<replicaset_2>'] ... _ = test_run:switch('router_1') --- diff --git a/test/router/router.test.lua b/test/router/router.test.lua index 154310b..aa3eb3b 100644 --- a/test/router/router.test.lua +++ b/test/router/router.test.lua @@ -114,19 +114,28 @@ replicaset, err = vshard.router.bucket_discovery(1); return err == nil or err replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err _ = test_run:switch('storage_2_a') +-- Pause recovery. It is too aggressive, and the test needs to see buckets in +-- their intermediate states. +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]}) + _ = test_run:switch('storage_1_a') +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]}) + _ = test_run:switch('router_1') -- Ok to read sending bucket. vshard.router.call(1, 'read', 'echo', {123}) -- Not ok to write sending bucket. util.check_error(vshard.router.call, 1, 'write', 'echo', {123}) -_ = test_run:switch('storage_2_a') -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE}) _ = test_run:switch('storage_1_a') box.space._bucket:delete({1}) +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false + +_ = test_run:switch('storage_2_a') +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false + _ = test_run:switch('router_1') -- Check unavailability of master of a replicaset. diff --git a/test/storage/recovery.result b/test/storage/recovery.result index 8ccb0b9..fa92bca 100644 --- a/test/storage/recovery.result +++ b/test/storage/recovery.result @@ -28,12 +28,20 @@ util.push_rs_filters(test_run) _ = test_run:switch("storage_2_a") --- ... +-- Pause until restart. Otherwise recovery does its job too fast and does not +-- allow to simulate the intermediate state. 
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... vshard.storage.rebalancer_disable() --- ... _ = test_run:switch("storage_1_a") --- ... +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... -- Create buckets sending to rs2 and restart - recovery must -- garbage some of them and activate others. Receiving buckets -- must be garbaged on bootstrap. diff --git a/test/storage/recovery.test.lua b/test/storage/recovery.test.lua index a0651e8..93cec68 100644 --- a/test/storage/recovery.test.lua +++ b/test/storage/recovery.test.lua @@ -10,8 +10,13 @@ util.wait_master(test_run, REPLICASET_2, 'storage_2_a') util.push_rs_filters(test_run) _ = test_run:switch("storage_2_a") +-- Pause until restart. Otherwise recovery does its job too fast and does not +-- allow to simulate the intermediate state. +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true vshard.storage.rebalancer_disable() + _ = test_run:switch("storage_1_a") +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true -- Create buckets sending to rs2 and restart - recovery must -- garbage some of them and activate others. Receiving buckets diff --git a/test/storage/recovery_errinj.result b/test/storage/recovery_errinj.result index 3e9a9bf..8c178d5 100644 --- a/test/storage/recovery_errinj.result +++ b/test/storage/recovery_errinj.result @@ -35,9 +35,17 @@ _ = test_run:switch('storage_2_a') vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true --- ... +-- Pause recovery. Otherwise it does its job too fast and does not allow to +-- simulate the intermediate state. +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... _ = test_run:switch('storage_1_a') --- ... +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true +--- +... _bucket = box.space._bucket --- ... @@ -76,10 +84,16 @@ _bucket:get{1} --- - [1, 'active'] ... +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false +--- +... _ = test_run:switch('storage_1_a') --- ... 
-while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+---
+...
+wait_bucket_is_collected(1)
 ---
 ...
 _ = test_run:switch("default")
diff --git a/test/storage/recovery_errinj.test.lua b/test/storage/recovery_errinj.test.lua
index 8c1a9d2..c730560 100644
--- a/test/storage/recovery_errinj.test.lua
+++ b/test/storage/recovery_errinj.test.lua
@@ -14,7 +14,12 @@ util.push_rs_filters(test_run)
 --
 _ = test_run:switch('storage_2_a')
 vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
+-- Pause recovery. Otherwise it does its job too fast and does not allow to
+-- simulate the intermediate state.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+
 _ = test_run:switch('storage_1_a')
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
 _bucket = box.space._bucket
 _bucket:replace{1, vshard.consts.BUCKET.ACTIVE, util.replicasets[2]}
 ret, err = vshard.storage.bucket_send(1, util.replicasets[2], {timeout = 0.1})
@@ -27,9 +32,11 @@ vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = false
 _bucket = box.space._bucket
 while _bucket:get{1}.status ~= vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.01) end
 _bucket:get{1}
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
 _ = test_run:switch('storage_1_a')
-while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+wait_bucket_is_collected(1)
 _ = test_run:switch("default")
 test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/consts.lua b/vshard/consts.lua
index 3f1585a..cf3f422 100644
--- a/vshard/consts.lua
+++ b/vshard/consts.lua
@@ -39,7 +39,7 @@ return {
     DEFAULT_SYNC_TIMEOUT = 1;
     RECONNECT_TIMEOUT = 0.5;
     GC_BACKOFF_INTERVAL = 5,
-    RECOVERY_INTERVAL = 5;
+    RECOVERY_BACKOFF_INTERVAL = 5,
     COLLECT_LUA_GARBAGE_INTERVAL = 100;
     DISCOVERY_IDLE_INTERVAL = 10,
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 31a6fc7..85f5024 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -634,13 +634,16 @@ end
 -- Infinite function to resolve status of buckets, whose 'sending'
 -- has failed due to tarantool or network problems. Restarts on
 -- reload.
---- @param module_version Module version, on which the current
----     function had been started. If the actual module version
----     appears to be changed, then stop recovery. It is
----     restarted in reloadable_fiber.
 --
 local function recovery_f()
     local module_version = M.module_version
+    -- Changes of _bucket increments bucket generation. Recovery has its own
+    -- bucket generation which is <= actual. Recovery is finished, when its
+    -- generation == bucket generation. In such a case the fiber does nothing
+    -- until next _bucket change.
+    local bucket_generation_recovered = -1
+    local bucket_generation_current = M.bucket_generation
+    local ok, sleep_time, is_all_recovered, total, recovered
     -- Interrupt recovery if a module has been reloaded. Perhaps,
     -- there was found a bug, and reload fixes it.
     while module_version == M.module_version do
@@ -648,22 +651,57 @@ local function recovery_f()
             lfiber.yield()
             goto continue
         end
-        local ok, total, recovered = pcall(recovery_step_by_type,
-                                           consts.BUCKET.SENDING)
+        is_all_recovered = true
+        if bucket_generation_recovered == bucket_generation_current then
+            goto sleep
+        end
+
+        ok, total, recovered = pcall(recovery_step_by_type,
+                                     consts.BUCKET.SENDING)
         if not ok then
+            is_all_recovered = false
             log.error('Error during sending buckets recovery: %s', total)
+        elseif total ~= recovered then
+            is_all_recovered = false
         end
+
         ok, total, recovered = pcall(recovery_step_by_type,
                                      consts.BUCKET.RECEIVING)
         if not ok then
+            is_all_recovered = false
             log.error('Error during receiving buckets recovery: %s', total)
         elseif total == 0 then
             bucket_receiving_quota_reset()
         else
             bucket_receiving_quota_add(recovered)
+            if total ~= recovered then
+                is_all_recovered = false
+            end
+        end
+
+    ::sleep::
+        if not is_all_recovered then
+            bucket_generation_recovered = -1
+        else
+            bucket_generation_recovered = bucket_generation_current
+        end
+        bucket_generation_current = M.bucket_generation
+
+        if not is_all_recovered then
+            -- One option - some buckets are not broken. Their transmission is
+            -- still in progress. Don't need to retry immediately. Another
+            -- option - network errors when tried to repair the buckets. Also no
+            -- need to retry often. It won't help.
+            sleep_time = consts.RECOVERY_BACKOFF_INTERVAL
+        elseif bucket_generation_recovered ~= bucket_generation_current then
+            sleep_time = 0
+        else
+            sleep_time = consts.TIMEOUT_INFINITY
+        end
+        if module_version == M.module_version then
+            M.bucket_generation_cond:wait(sleep_time)
         end
-        lfiber.sleep(consts.RECOVERY_INTERVAL)
-        ::continue::
+    ::continue::
     end
 end
--
2.24.3 (Apple Git-128)
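The heart of the reactive recovery above is the wake-up policy: retry on a backoff interval while something is still broken, recheck immediately if _bucket changed mid-step, and otherwise block on the condition variable until the next _bucket change. A minimal Python sketch of just that decision (names mirror the patch; this is an illustration, not vshard code):

```python
RECOVERY_BACKOFF_INTERVAL = 5      # mirrors consts.RECOVERY_BACKOFF_INTERVAL
TIMEOUT_INFINITY = None            # stand-in for consts.TIMEOUT_INFINITY

def pick_sleep_time(is_all_recovered, generation_recovered, generation_current):
    """Mirrors the sleep_time selection at the bottom of recovery_f()."""
    if not is_all_recovered:
        # Either some transfers are still in progress or a recovery step
        # failed - retry on a backoff interval, not immediately.
        return RECOVERY_BACKOFF_INTERVAL
    if generation_recovered != generation_current:
        # _bucket changed while the step was running - recheck right away.
        return 0
    # Everything is recovered - block on the condition variable
    # (bucket_generation_cond in the patch) until _bucket changes.
    return TIMEOUT_INFINITY
```

The constant-polling `lfiber.sleep(RECOVERY_INTERVAL)` of the old code collapses into the last branch: when there is nothing to do, the fiber costs nothing.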
Lua does not have a built-in standard library for binary heaps (also called priority queues). There is an implementation in Tarantool core in libsalad, but it is in C and not exposed to Lua.

A heap is a perfect storage for the upcoming map-reduce feature. In the map-reduce algorithm it will be necessary to lock an entire storage against any bucket moves for a time <= a specified timeout. The number of map-reduce requests can be big, and they can have different timeouts, so there is a pile of deadlines from different requests. It is necessary to be able to quickly add new ones, delete random ones, and remove expired ones.

One way would be a sorted array of the deadlines. Unfortunately, it is super slow: O(N + log(N)) to add a new element (log(N) to find the place, N to move the following elements) and O(N) to delete a random one (move the following elements one cell left). Another way would be a sorted tree. But trees like RB or a plain binary tree require extra steps to keep them balanced and to get access to the smallest element fast.

The best way is the binary heap. It is perfectly balanced by design, meaning all operations have complexity at most O(log(N)), and the closest deadline is found in constant time as the heap's top.

This patch implements it. The heap is intrusive: it stores the position of each element right inside the element, in a field 'index'. Having the index along with each element allows deleting it from the heap in O(log(N)) without the need to look its place up first.
Part of #147 --- test/unit-tap/heap.test.lua | 310 ++++++++++++++++++++++++++++++++++++ test/unit-tap/suite.ini | 4 + vshard/heap.lua | 226 ++++++++++++++++++++++++++ 3 files changed, 540 insertions(+) create mode 100755 test/unit-tap/heap.test.lua create mode 100644 test/unit-tap/suite.ini create mode 100644 vshard/heap.lua diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua new file mode 100755 index 0000000..8c3819f --- /dev/null +++ b/test/unit-tap/heap.test.lua @@ -0,0 +1,310 @@ +#!/usr/bin/env tarantool + +local tap = require('tap') +local test = tap.test("cfg") +local heap = require('vshard.heap') + +-- +-- Max number of heap to test. Number of iterations in the test +-- grows as a factorial of this value. At 10 the test becomes +-- too long already. +-- +local heap_size = 8 + +-- +-- Type of the object stored in the intrusive heap. +-- +local function min_heap_cmp(l, r) + return l.value < r.value +end + +local function max_heap_cmp(l, r) + return l.value > r.value +end + +local function new_object(value) + return {value = value} +end + +local function heap_check_indexes(heap) + local count = heap:count() + local data = heap.data + for i = 1, count do + assert(data[i].index == i) + end +end + +local function reverse(values, i1, i2) + while i1 < i2 do + values[i1], values[i2] = values[i2], values[i1] + i1 = i1 + 1 + i2 = i2 - 1 + end +end + +-- +-- Implementation of std::next_permutation() from C++. 
+-- +local function next_permutation(values) + local count = #values + if count <= 1 then + return false + end + local i = count + while true do + local j = i + i = i - 1 + if values[i] < values[j] then + local k = count + while values[i] >= values[k] do + k = k - 1 + end + values[i], values[k] = values[k], values[i] + reverse(values, j, count) + return true + end + if i == 1 then + reverse(values, 1, count) + return false + end + end +end + +local function range(count) + local res = {} + for i = 1, count do + res[i] = i + end + return res +end + +-- +-- Min heap fill and empty. +-- +local function test_min_heap_basic(test) + test:plan(1) + + local h = heap.new(min_heap_cmp) + assert(not h:pop()) + assert(h:count() == 0) + local values = {} + for i = 1, heap_size do + values[i] = new_object(i) + end + for counti = 1, heap_size do + local indexes = range(counti) + repeat + for i = 1, counti do + h:push(values[indexes[i]]) + end + heap_check_indexes(h) + assert(h:count() == counti) + for i = 1, counti do + assert(h:top() == values[i]) + assert(h:pop() == values[i]) + heap_check_indexes(h) + end + assert(not h:pop()) + assert(h:count() == 0) + until not next_permutation(indexes) + end + + test:ok(true, "no asserts") +end + +-- +-- Max heap fill and empty. +-- +local function test_max_heap_basic(test) + test:plan(1) + + local h = heap.new(max_heap_cmp) + assert(not h:pop()) + assert(h:count() == 0) + local values = {} + for i = 1, heap_size do + values[i] = new_object(heap_size - i + 1) + end + for counti = 1, heap_size do + local indexes = range(counti) + repeat + for i = 1, counti do + h:push(values[indexes[i]]) + end + heap_check_indexes(h) + assert(h:count() == counti) + for i = 1, counti do + assert(h:top() == values[i]) + assert(h:pop() == values[i]) + heap_check_indexes(h) + end + assert(not h:pop()) + assert(h:count() == 0) + until not next_permutation(indexes) + end + + test:ok(true, "no asserts") +end + +-- +-- Min heap update top element. 
+-- +local function test_min_heap_update_top(test) + test:plan(1) + + local h = heap.new(min_heap_cmp) + for counti = 1, heap_size do + local indexes = range(counti) + repeat + local values = {} + for i = 1, counti do + values[i] = new_object(0) + h:push(values[i]) + end + heap_check_indexes(h) + for i = 1, counti do + h:top().value = indexes[i] + h:update_top() + end + heap_check_indexes(h) + assert(h:count() == counti) + for i = 1, counti do + assert(h:top().value == i) + assert(h:pop().value == i) + heap_check_indexes(h) + end + assert(not h:pop()) + assert(h:count() == 0) + until not next_permutation(indexes) + end + + test:ok(true, "no asserts") +end + +-- +-- Min heap update all elements in all possible positions. +-- +local function test_min_heap_update(test) + test:plan(1) + + local h = heap.new(min_heap_cmp) + for counti = 1, heap_size do + for srci = 1, counti do + local endv = srci * 10 + 5 + for newv = 5, endv, 5 do + local values = {} + for i = 1, counti do + values[i] = new_object(i * 10) + h:push(values[i]) + end + heap_check_indexes(h) + local obj = values[srci] + obj.value = newv + h:update(obj) + assert(obj.index >= 1) + assert(obj.index <= counti) + local prev = -1 + for i = 1, counti do + obj = h:pop() + assert(obj.index == -1) + assert(obj.value >= prev) + assert(obj.value >= 1) + prev = obj.value + obj.value = -1 + heap_check_indexes(h) + end + assert(not h:pop()) + assert(h:count() == 0) + end + end + end + + test:ok(true, "no asserts") +end + +-- +-- Max heap delete all elements from all possible positions. 
+-- +local function test_max_heap_delete(test) + test:plan(1) + + local h = heap.new(max_heap_cmp) + local inf = heap_size + 1 + for counti = 1, heap_size do + for srci = 1, counti do + local values = {} + for i = 1, counti do + values[i] = new_object(i) + h:push(values[i]) + end + heap_check_indexes(h) + local obj = values[srci] + obj.value = inf + h:remove(obj) + assert(obj.index == -1) + local prev = inf + for i = 2, counti do + obj = h:pop() + assert(obj.index == -1) + assert(obj.value < prev) + assert(obj.value >= 1) + prev = obj.value + obj.value = -1 + heap_check_indexes(h) + end + assert(not h:pop()) + assert(h:count() == 0) + end + end + + test:ok(true, "no asserts") +end + +local function test_min_heap_remove_top(test) + test:plan(1) + + local h = heap.new(min_heap_cmp) + for i = 1, heap_size do + h:push(new_object(i)) + end + for i = 1, heap_size do + assert(h:top().value == i) + h:remove_top() + end + assert(h:count() == 0) + + test:ok(true, "no asserts") +end + +local function test_max_heap_remove_try(test) + test:plan(1) + + local h = heap.new(max_heap_cmp) + local obj = new_object(1) + assert(obj.index == nil) + h:remove_try(obj) + assert(h:count() == 0) + + h:push(obj) + h:push(new_object(2)) + assert(obj.index == 2) + h:remove(obj) + assert(obj.index == -1) + h:remove_try(obj) + assert(obj.index == -1) + assert(h:count() == 1) + + test:ok(true, "no asserts") +end + +test:plan(7) + +test:test('min_heap_basic', test_min_heap_basic) +test:test('max_heap_basic', test_max_heap_basic) +test:test('min_heap_update_top', test_min_heap_update_top) +test:test('min heap update', test_min_heap_update) +test:test('max heap delete', test_max_heap_delete) +test:test('min heap remove top', test_min_heap_remove_top) +test:test('max heap remove try', test_max_heap_remove_try) + +os.exit(test:check() and 0 or 1) diff --git a/test/unit-tap/suite.ini b/test/unit-tap/suite.ini new file mode 100644 index 0000000..f365b69 --- /dev/null +++ b/test/unit-tap/suite.ini @@ -0,0 
+1,4 @@ +[default] +core = app +description = Unit tests TAP +is_parallel = True diff --git a/vshard/heap.lua b/vshard/heap.lua new file mode 100644 index 0000000..78c600a --- /dev/null +++ b/vshard/heap.lua @@ -0,0 +1,226 @@ +local math_floor = math.floor + +-- +-- Implementation of a typical algorithm of the binary heap. +-- The heap is intrusive - it stores index of each element inside of it. It +-- allows to update and delete elements in any place in the heap, not only top +-- elements. +-- + +local function heap_parent_index(index) + return math_floor(index / 2) +end + +local function heap_left_child_index(index) + return index * 2 +end + +-- +-- Generate a new heap. +-- +-- The implementation is targeted on as few index accesses as possible. +-- Everything what could be is stored as upvalue variables instead of as indexes +-- in a table. What couldn't be an upvalue and is used in a function more than +-- once is saved on the stack. +-- +local function heap_new(is_left_above) + -- Having it as an upvalue allows not to do 'self.data' lookup in each + -- function. + local data = {} + -- Saves #data calculation. In Lua it is not just reading a number. 
+ local count = 0 + + local function heap_update_index_up(idx) + if idx == 1 then + return false + end + + local orig_idx = idx + local value = data[idx] + local pidx = heap_parent_index(idx) + local parent = data[pidx] + while is_left_above(value, parent) do + data[idx] = parent + parent.index = idx + idx = pidx + if idx == 1 then + break + end + pidx = heap_parent_index(idx) + parent = data[pidx] + end + + if idx == orig_idx then + return false + end + data[idx] = value + value.index = idx + return true + end + + local function heap_update_index_down(idx) + local left_idx = heap_left_child_index(idx) + if left_idx > count then + return false + end + + local orig_idx = idx + local left + local right + local right_idx = left_idx + 1 + local top + local top_idx + local value = data[idx] + repeat + right_idx = left_idx + 1 + if right_idx > count then + top = data[left_idx] + if is_left_above(value, top) then + break + end + top_idx = left_idx + else + left = data[left_idx] + right = data[right_idx] + if is_left_above(left, right) then + if is_left_above(value, left) then + break + end + top_idx = left_idx + top = left + else + if is_left_above(value, right) then + break + end + top_idx = right_idx + top = right + end + end + + data[idx] = top + top.index = idx + idx = top_idx + left_idx = heap_left_child_index(idx) + until left_idx > count + + if idx == orig_idx then + return false + end + data[idx] = value + value.index = idx + return true + end + + local function heap_update_index(idx) + if not heap_update_index_up(idx) then + heap_update_index_down(idx) + end + end + + local function heap_push(self, value) + count = count + 1 + data[count] = value + value.index = count + heap_update_index_up(count) + end + + local function heap_update_top(self) + heap_update_index_down(1) + end + + local function heap_update(self, value) + heap_update_index(value.index) + end + + local function heap_remove_top(self) + if count == 0 then + return + end + data[1].index = -1 + if 
count == 1 then + data[1] = nil + count = 0 + return + end + local value = data[count] + data[count] = nil + data[1] = value + value.index = 1 + count = count - 1 + heap_update_index_down(1) + end + + local function heap_remove(self, value) + local idx = value.index + value.index = -1 + if idx == count then + data[count] = nil + count = count - 1 + return + end + value = data[count] + data[idx] = value + data[count] = nil + value.index = idx + count = count - 1 + heap_update_index(idx) + end + + local function heap_remove_try(self, value) + local idx = value.index + if idx and idx > 0 then + heap_remove(self, value) + end + end + + local function heap_pop(self) + if count == 0 then + return + end + -- Some duplication from remove_top, but allows to save a few + -- condition checks, index accesses, and a function call. + local res = data[1] + res.index = -1 + if count == 1 then + data[1] = nil + count = 0 + return res + end + local value = data[count] + data[count] = nil + data[1] = value + value.index = 1 + count = count - 1 + heap_update_index_down(1) + return res + end + + local function heap_top(self) + return data[1] + end + + local function heap_count(self) + return count + end + + return setmetatable({ + -- Expose the data. For testing. + data = data, + }, { + __index = { + push = heap_push, + update_top = heap_update_top, + remove_top = heap_remove_top, + pop = heap_pop, + update = heap_update, + remove = heap_remove, + remove_try = heap_remove_try, + top = heap_top, + count = heap_count, + } + }) +end + +return { + new = heap_new, +} -- 2.24.3 (Apple Git-128)
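To make the intrusive idea concrete: because every object carries its own position, `remove()` can jump straight to the object's slot and sift from there, instead of scanning for it. A compact Python transliteration of that mechanic (0-based where the Lua module is 1-based; an illustration under those assumptions, not the patch itself):

```python
class IntrusiveHeap:
    """Binary heap that stores each object's position in obj.index."""
    def __init__(self, is_left_above):
        self.data = []
        self.is_left_above = is_left_above

    def _swap(self, i, j):
        self.data[i], self.data[j] = self.data[j], self.data[i]
        self.data[i].index, self.data[j].index = i, j

    def _sift_up(self, i):
        while i > 0:
            p = (i - 1) // 2
            if not self.is_left_above(self.data[i], self.data[p]):
                break
            self._swap(i, p)
            i = p

    def _sift_down(self, i):
        n = len(self.data)
        while True:
            left, right, top = 2 * i + 1, 2 * i + 2, i
            if left < n and self.is_left_above(self.data[left], self.data[top]):
                top = left
            if right < n and self.is_left_above(self.data[right], self.data[top]):
                top = right
            if top == i:
                return
            self._swap(i, top)
            i = top

    def push(self, obj):
        self.data.append(obj)
        obj.index = len(self.data) - 1
        self._sift_up(obj.index)

    def remove(self, obj):
        # O(log(N)): obj.index makes the O(N) search unnecessary.
        i = obj.index
        obj.index = -1
        last = self.data.pop()
        if last is not obj:
            self.data[i] = last
            last.index = i
            self._sift_up(i)
            self._sift_down(i)

    def pop(self):
        if not self.data:
            return None
        top = self.data[0]
        self.remove(top)
        return top

class Deadline:
    """Example payload: the heap manages the 'index' field for us."""
    def __init__(self, value):
        self.value = value
        self.index = None
```

With a comparator like `lambda l, r: l.value < r.value` this behaves as a min-heap of deadlines: the soonest one is always at the top, and an arbitrary request's deadline can be dropped by reference when the request completes early.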
Bad links. Here are the correct ones:

Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-147-map-reduce-part1
Issue: https://github.com/tarantool/vshard/issues/147
Hi! Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Rlist in storage/init.lua implemented a container similar to rlist
> in libsmall in Tarantool core: a doubly-linked list.
>
> It does not depend on anything in storage/init.lua, and should
> have been done in a separate module from the beginning.
>
> Now init.lua is going to grow even more in scope of the map-reduce
> feature, beyond 3k lines if nothing is moved out. It was decided
> (by me) that this crosses the border where it is time to split
> init.lua into separate modules.
>
> The patch takes the low hanging fruit by moving rlist into its
> own module.
> ---
> test/unit/rebalancer.result | 99 -----------------------------
> test/unit/rebalancer.test.lua | 27 --------
> test/unit/rlist.result | 114 ++++++++++++++++++++++++++++++++++
> test/unit/rlist.test.lua | 33 ++++++++++
> vshard/rlist.lua | 53 ++++++++++++++++
> vshard/storage/init.lua | 68 +++-----------------
> 6 files changed, 208 insertions(+), 186 deletions(-)
> create mode 100644 test/unit/rlist.result
> create mode 100644 test/unit/rlist.test.lua
> create mode 100644 vshard/rlist.lua
>
> diff --git a/test/unit/rebalancer.result b/test/unit/rebalancer.result
> index 2fb30e2..19aa480 100644
> --- a/test/unit/rebalancer.result
> +++ b/test/unit/rebalancer.result
> @@ -1008,105 +1008,6 @@ build_routes(replicasets)
> -- the latter is a dispenser. It is a structure which hands out
> -- destination UUIDs in a round-robin manner to worker fibers.
> --
> -list = rlist.new()
> ----
> -...
> -list
> ----
> -- count: 0
> -...
> -obj1 = {i = 1}
> ----
> -...
> -rlist.remove(list, obj1)
> ----
> -...
> -list
> ----
> -- count: 0
> -...
> -rlist.add_tail(list, obj1)
> ----
> -...
> -list
> ----
> -- count: 1
> - last: &0
> - i: 1
> - first: *0
> -...
> -rlist.remove(list, obj1)
> ----
> -...
> -list
> ----
> -- count: 0
> -...
> -obj1
> ----
> -- i: 1
> -...
> -rlist.add_tail(list, obj1)
> ----
> -...
> -obj2 = {i = 2}
> ----
> -...
> -rlist.add_tail(list, obj2)
> ----
> -...
> -list
> ----
> -- count: 2
> - last: &0
> - i: 2
> - prev: &1
> - i: 1
> - next: *0
> - first: *1
> -...
> -obj3 = {i = 3}
> ----
> -...
> -rlist.add_tail(list, obj3)
> ----
> -...
> -list
> ----
> -- count: 3
> - last: &0
> - i: 3
> - prev: &1
> - i: 2
> - next: *0
> - prev: &2
> - i: 1
> - next: *1
> - first: *2
> -...
> -rlist.remove(list, obj2)
> ----
> -...
> -list
> ----
> -- count: 2
> - last: &0
> - i: 3
> - prev: &1
> - i: 1
> - next: *0
> - first: *1
> -...
> -rlist.remove(list, obj1)
> ----
> -...
> -list
> ----
> -- count: 1
> - last: &0
> - i: 3
> - first: *0
> -...
> d = dispenser.create({uuid = 15})
> ---
> ...
> diff --git a/test/unit/rebalancer.test.lua b/test/unit/rebalancer.test.lua
> index a4e18c1..8087d42 100644
> --- a/test/unit/rebalancer.test.lua
> +++ b/test/unit/rebalancer.test.lua
> @@ -246,33 +246,6 @@ build_routes(replicasets)
> -- the latter is a dispenser. It is a structure which hands out
> -- destination UUIDs in a round-robin manner to worker fibers.
> --
> -list = rlist.new()
> -list
> -
> -obj1 = {i = 1}
> -rlist.remove(list, obj1)
> -list
> -
> -rlist.add_tail(list, obj1)
> -list
> -
> -rlist.remove(list, obj1)
> -list
> -obj1
> -
> -rlist.add_tail(list, obj1)
> -obj2 = {i = 2}
> -rlist.add_tail(list, obj2)
> -list
> -obj3 = {i = 3}
> -rlist.add_tail(list, obj3)
> -list
> -
> -rlist.remove(list, obj2)
> -list
> -rlist.remove(list, obj1)
> -list
> -
> d = dispenser.create({uuid = 15})
> dispenser.pop(d)
> for i = 1, 14 do assert(dispenser.pop(d) == 'uuid', i) end
> diff --git a/test/unit/rlist.result b/test/unit/rlist.result
> new file mode 100644
> index 0000000..c8aabc0
> --- /dev/null
> +++ b/test/unit/rlist.result
> @@ -0,0 +1,114 @@
> +-- test-run result file version 2
> +--
> +-- gh-161: parallel rebalancer. One of the most important part of the latter is
> +-- a dispenser. It is a structure which hands out destination UUIDs in a
> +-- round-robin manner to worker fibers. It uses rlist data structure.
> +--
> +rlist = require('vshard.rlist')
> + | ---
> + | ...
> +
> +list = rlist.new()
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 0
> + | ...
> +
> +obj1 = {i = 1}
> + | ---
> + | ...
> +list:remove(obj1)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 0
> + | ...
> +
> +list:add_tail(obj1)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 1
> + | last: &0
> + | i: 1
> + | first: *0
> + | ...
> +
> +list:remove(obj1)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 0
> + | ...
> +obj1
> + | ---
> + | - i: 1
> + | ...
> +
> +list:add_tail(obj1)
> + | ---
> + | ...
> +obj2 = {i = 2}
> + | ---
> + | ...
> +list:add_tail(obj2)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 2
> + | last: &0
> + | i: 2
> + | prev: &1
> + | i: 1
> + | next: *0
> + | first: *1
> + | ...
> +obj3 = {i = 3}
> + | ---
> + | ...
> +list:add_tail(obj3)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 3
> + | last: &0
> + | i: 3
> + | prev: &1
> + | i: 2
> + | next: *0
> + | prev: &2
> + | i: 1
> + | next: *1
> + | first: *2
> + | ...
> +
> +list:remove(obj2)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 2
> + | last: &0
> + | i: 3
> + | prev: &1
> + | i: 1
> + | next: *0
> + | first: *1
> + | ...
> +list:remove(obj1)
> + | ---
> + | ...
> +list
> + | ---
> + | - count: 1
> + | last: &0
> + | i: 3
> + | first: *0
> + | ...
> diff --git a/test/unit/rlist.test.lua b/test/unit/rlist.test.lua
> new file mode 100644
> index 0000000..db52955
> --- /dev/null
> +++ b/test/unit/rlist.test.lua
> @@ -0,0 +1,33 @@
> +--
> +-- gh-161: parallel rebalancer. One of the most important part of the latter is
> +-- a dispenser. It is a structure which hands out destination UUIDs in a
> +-- round-robin manner to worker fibers. It uses rlist data structure.
> +--
> +rlist = require('vshard.rlist')
> +
> +list = rlist.new()
> +list
> +
> +obj1 = {i = 1}
> +list:remove(obj1)
> +list
> +
> +list:add_tail(obj1)
> +list
> +
> +list:remove(obj1)
> +list
> +obj1
> +
> +list:add_tail(obj1)
> +obj2 = {i = 2}
> +list:add_tail(obj2)
> +list
> +obj3 = {i = 3}
> +list:add_tail(obj3)
> +list
> +
> +list:remove(obj2)
> +list
> +list:remove(obj1)
> +list
> diff --git a/vshard/rlist.lua b/vshard/rlist.lua
> new file mode 100644
> index 0000000..4be5382
> --- /dev/null
> +++ b/vshard/rlist.lua
> @@ -0,0 +1,53 @@
> +--
> +-- A subset of rlist methods from the main repository. Rlist is a
> +-- doubly linked list, and is used here to implement a queue of
> +-- routes in the parallel rebalancer.
> +--
> +local rlist_mt = {}
> +
> +function rlist_mt.add_tail(rlist, object)
> + local last = rlist.last
> + if last then
> + last.next = object
> + object.prev = last
> + else
> + rlist.first = object
> + end
> + rlist.last = object
> + rlist.count = rlist.count + 1
> +end
> +
> +function rlist_mt.remove(rlist, object)
> + local prev = object.prev
> + local next = object.next
> + local belongs_to_list = false
> + if prev then
> + belongs_to_list = true
> + prev.next = next
> + end
> + if next then
> + belongs_to_list = true
> + next.prev = prev
> + end
> + object.prev = nil
> + object.next = nil
> + if rlist.last == object then
> + belongs_to_list = true
> + rlist.last = prev
> + end
> + if rlist.first == object then
> + belongs_to_list = true
> + rlist.first = next
> + end
> + if belongs_to_list then
> + rlist.count = rlist.count - 1
> + end
> +end
> +
> +local function rlist_new()
> + return setmetatable({count = 0}, {__index = rlist_mt})
> +end
> +
> +return {
> + new = rlist_new,
> +}
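One subtlety worth keeping in mind about `remove()`: thanks to the `belongs_to_list` flag it is a safe no-op for an object that is not in the list, which the unit test above relies on. A Python transliteration of the same logic makes that property easy to check (illustration only, not the module itself):

```python
class RList:
    """Transliteration of vshard.rlist: intrusive doubly-linked list."""
    def __init__(self):
        self.first = None
        self.last = None
        self.count = 0

    def add_tail(self, obj):
        last = self.last
        if last is not None:
            last.next = obj
            obj.prev = last
        else:
            self.first = obj
        self.last = obj
        self.count += 1

    def remove(self, obj):
        prev = getattr(obj, 'prev', None)
        nxt = getattr(obj, 'next', None)
        belongs_to_list = False
        if prev is not None:
            belongs_to_list = True
            prev.next = nxt
        if nxt is not None:
            belongs_to_list = True
            nxt.prev = prev
        obj.prev = None
        obj.next = None
        if self.last is obj:
            belongs_to_list = True
            self.last = prev
        if self.first is obj:
            belongs_to_list = True
            self.first = nxt
        if belongs_to_list:
            self.count -= 1

class Node:
    """Any object works; the list stores prev/next links right inside it."""
    prev = None
    next = None
```

The list is intrusive like the heap in the later patch: links live in the stored objects themselves, so membership operations need no allocations and no search.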
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 5464824..1b48bf1 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -13,12 +13,13 @@ if rawget(_G, MODULE_INTERNALS) then
> 'vshard.consts', 'vshard.error', 'vshard.cfg',
> 'vshard.replicaset', 'vshard.util',
> 'vshard.storage.reload_evolution',
> - 'vshard.lua_gc',
> + 'vshard.lua_gc', 'vshard.rlist'
> }
> for _, module in pairs(vshard_modules) do
> package.loaded[module] = nil
> end
> end
> +local rlist = require('vshard.rlist')
> local consts = require('vshard.consts')
> local lerror = require('vshard.error')
> local lcfg = require('vshard.cfg')
> @@ -1786,54 +1787,6 @@ local function rebalancer_build_routes(replicasets)
> return bucket_routes
> end
>
> ---
> --- A subset of rlist methods from the main repository. Rlist is a
> --- doubly linked list, and is used here to implement a queue of
> --- routes in the parallel rebalancer.
> ---
> -local function rlist_new()
> - return {count = 0}
> -end
> -
> -local function rlist_add_tail(rlist, object)
> - local last = rlist.last
> - if last then
> - last.next = object
> - object.prev = last
> - else
> - rlist.first = object
> - end
> - rlist.last = object
> - rlist.count = rlist.count + 1
> -end
> -
> -local function rlist_remove(rlist, object)
> - local prev = object.prev
> - local next = object.next
> - local belongs_to_list = false
> - if prev then
> - belongs_to_list = true
> - prev.next = next
> - end
> - if next then
> - belongs_to_list = true
> - next.prev = prev
> - end
> - object.prev = nil
> - object.next = nil
> - if rlist.last == object then
> - belongs_to_list = true
> - rlist.last = prev
> - end
> - if rlist.first == object then
> - belongs_to_list = true
> - rlist.first = next
> - end
> - if belongs_to_list then
> - rlist.count = rlist.count - 1
> - end
> -end
> -
> --
> -- Dispenser is a container of routes received from the
> -- rebalancer. Its task is to hand out the routes to worker fibers
> @@ -1842,7 +1795,7 @@ end
> -- receiver nodes.
> --
> local function route_dispenser_create(routes)
> - local rlist = rlist_new()
> + local rlist = rlist.new()
> local map = {}
> for uuid, bucket_count in pairs(routes) do
> local new = {
> @@ -1873,7 +1826,7 @@ local function route_dispenser_create(routes)
> -- the main applier fiber does some analysis on the
> -- destinations.
> map[uuid] = new
> - rlist_add_tail(rlist, new)
> + rlist:add_tail(new)
> end
> return {
> rlist = rlist,
> @@ -1892,7 +1845,7 @@ local function route_dispenser_put(dispenser, uuid)
> local bucket_count = dst.bucket_count + 1
> dst.bucket_count = bucket_count
> if bucket_count == 1 then
> - rlist_add_tail(dispenser.rlist, dst)
> + dispenser.rlist:add_tail(dst)
> end
> end
> end
> @@ -1909,7 +1862,7 @@ local function route_dispenser_skip(dispenser, uuid)
> local dst = map[uuid]
> if dst then
> map[uuid] = nil
> - rlist_remove(dispenser.rlist, dst)
> + dispenser.rlist:remove(dst)
> end
> end
>
> @@ -1952,9 +1905,9 @@ local function route_dispenser_pop(dispenser)
> if dst then
> local bucket_count = dst.bucket_count - 1
> dst.bucket_count = bucket_count
> - rlist_remove(rlist, dst)
> + rlist:remove(dst)
> if bucket_count > 0 then
> - rlist_add_tail(rlist, dst)
> + rlist:add_tail(dst)
> end
> return dst.uuid
> end
> @@ -2742,11 +2695,6 @@ M.route_dispenser = {
> pop = route_dispenser_pop,
> sent = route_dispenser_sent,
> }
> -M.rlist = {
> - new = rlist_new,
> - add_tail = rlist_add_tail,
> - remove = rlist_remove,
> -}
> M.schema_latest_version = schema_latest_version
> M.schema_current_version = schema_current_version
> M.schema_upgrade_master = schema_upgrade_master
Thanks for your patch. LGTM except two nits:
- It seems you need to put "Closes #246".
- Tarantool has a "clock" module. I suggest using the name "fiber_clock()"
instead of plain "clock" to avoid possible confusion.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Fiber.time() returns real wall-clock time. It is affected by time
> corrections in the system, and may be non-monotonic.
>
> The patch makes everything in vshard use fiber.clock() instead of
> fiber.time(). Also the fiber.clock function is saved as an upvalue
> in all modules using it. This makes the code a bit shorter and
> saves one indexing of the 'fiber' table.
>
> The main reason is that the future map-reduce feature will use the
> current time quite often. In some places it will probably be the
> slowest action (given how slow FFI can be when not compiled by the
> JIT).
>
> Needed for #147
> ---
> test/failover/failover.result | 4 ++--
> test/failover/failover.test.lua | 4 ++--
> vshard/replicaset.lua | 13 +++++++------
> vshard/router/init.lua | 16 ++++++++--------
> vshard/storage/init.lua | 16 ++++++++--------
> 5 files changed, 27 insertions(+), 26 deletions(-)
>
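Side note on the distinction for readers of the diff below: fiber.time() is wall-clock time (subject to NTP and manual adjustments), while fiber.clock() is monotonic. The same pair exists in Python's stdlib, which makes the deadline rule easy to illustrate (a sketch, not vshard code):

```python
import time

def make_deadline(timeout):
    # Deadlines must come from the monotonic clock: the wall clock
    # (time.time / fiber.time) can jump after an NTP correction and
    # make the deadline fire too early or never.
    return time.monotonic() + timeout

def remaining(deadline):
    # Recompute the budget before every retry, as the patch does with
    # 'timeout = end_time - clock()'.
    return deadline - time.monotonic()
```

This is exactly why the patch swaps the clock source everywhere a timeout or deadline is computed.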
> diff --git a/test/failover/failover.result b/test/failover/failover.result
> index 452694c..bae57fa 100644
> --- a/test/failover/failover.result
> +++ b/test/failover/failover.result
> @@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
> ---
> - true
> ...
> -ts1 = fiber.time()
> +ts1 = fiber.clock()
> ---
> ...
> while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
> ---
> ...
> -ts2 = fiber.time()
> +ts2 = fiber.clock()
> ---
> ...
> ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
> diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
> index 13c517b..a969e0e 100644
> --- a/test/failover/failover.test.lua
> +++ b/test/failover/failover.test.lua
> @@ -109,9 +109,9 @@ test_run:switch('router_1')
> -- Revive the best replica. A router must reconnect to it in
> -- FAILOVER_UP_TIMEOUT seconds.
> test_run:cmd('start server box_1_d')
> -ts1 = fiber.time()
> +ts1 = fiber.clock()
> while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
> -ts2 = fiber.time()
> +ts2 = fiber.clock()
> ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
> test_run:grep_log('router_1', 'New replica box_1_d%(storage%@')
>
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index b13d05e..a74c0f8 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -54,6 +54,7 @@ local luri = require('uri')
> local luuid = require('uuid')
> local ffi = require('ffi')
> local util = require('vshard.util')
> +local clock = fiber.clock
> local gsc = util.generate_self_checker
>
> --
> @@ -88,7 +89,7 @@ local function netbox_on_connect(conn)
> -- biggest priority. Really, it is not neccessary to
> -- increase replica connection priority, if the current
> -- one already has the biggest priority. (See failover_f).
> - rs.replica_up_ts = fiber.time()
> + rs.replica_up_ts = clock()
> end
> end
>
> @@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn)
> assert(conn.replica)
> -- Replica is down - remember this time to decrease replica
> -- priority after FAILOVER_DOWN_TIMEOUT seconds.
> - conn.replica.down_ts = fiber.time()
> + conn.replica.down_ts = clock()
> end
>
> --
> @@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset)
> local old_replica = replicaset.replica
> if old_replica == replicaset.priority_list[1] and
> old_replica:is_connected() then
> - replicaset.replica_up_ts = fiber.time()
> + replicaset.replica_up_ts = clock()
> return
> end
> for _, replica in pairs(replicaset.priority_list) do
> @@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
> net_status, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> end
> - local end_time = fiber.time() + timeout
> + local end_time = clock() + timeout
> while not net_status and timeout > 0 do
> replica, err = pick_next_replica(replicaset)
> if not replica then
> @@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
> opts.timeout = timeout
> net_status, storage_status, retval, err =
> replica_call(replica, func, args, opts)
> - timeout = end_time - fiber.time()
> + timeout = end_time - clock()
> if not net_status and not storage_status and
> not can_retry_after_error(retval) then
> -- There is no sense to retry LuaJit errors, such as
> @@ -680,7 +681,7 @@ local function buildall(sharding_cfg)
> else
> zone_weights = {}
> end
> - local curr_ts = fiber.time()
> + local curr_ts = clock()
> for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
> local new_replicaset = setmetatable({
> replicas = {},
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index ba1f863..a530c29 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -1,6 +1,7 @@
> local log = require('log')
> local lfiber = require('fiber')
> local table_new = require('table.new')
> +local clock = lfiber.clock
>
> local MODULE_INTERNALS = '__module_vshard_router'
> -- Reload requirements, in case this module is reloaded manually.
> @@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> end
> local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN
> local replicaset, err
> - local tend = lfiber.time() + timeout
> + local tend = clock() + timeout
> if bucket_id > router.total_bucket_count or bucket_id <= 0 then
> error('Bucket is unreachable: bucket id is out of range')
> end
> @@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> replicaset, err = bucket_resolve(router, bucket_id)
> if replicaset then
> ::replicaset_is_found::
> - opts.timeout = tend - lfiber.time()
> + opts.timeout = tend - clock()
> local storage_call_status, call_status, call_error =
> replicaset[call](replicaset, 'vshard.storage.call',
> {bucket_id, mode, func, args}, opts)
> @@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> -- if reconfiguration had been started,
> -- and while is not executed on router,
> -- but already is executed on storages.
> - while lfiber.time() <= tend do
> + while clock() <= tend do
> lfiber.sleep(0.05)
> replicaset = router.replicasets[err.destination]
> if replicaset then
> @@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> -- case of broken cluster, when a bucket
> -- is sent on two replicasets to each
> -- other.
> - if replicaset and lfiber.time() <= tend then
> + if replicaset and clock() <= tend then
> goto replicaset_is_found
> end
> end
> @@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> end
> end
> lfiber.yield()
> - until lfiber.time() > tend
> + until clock() > tend
> if err then
> return nil, err
> else
> @@ -749,7 +750,7 @@ end
> -- connections must be updated.
> --
> local function failover_collect_to_update(router)
> - local ts = lfiber.time()
> + local ts = clock()
> local uuid_to_update = {}
> for uuid, rs in pairs(router.replicasets) do
> if failover_need_down_priority(rs, ts) or
> @@ -772,7 +773,7 @@ local function failover_step(router)
> if #uuid_to_update == 0 then
> return false
> end
> - local curr_ts = lfiber.time()
> + local curr_ts = clock()
> local replica_is_changed = false
> for _, uuid in pairs(uuid_to_update) do
> local rs = router.replicasets[uuid]
> @@ -1230,7 +1231,6 @@ local function router_sync(router, timeout)
> timeout = router.sync_timeout
> end
> local arg = {timeout}
> - local clock = lfiber.clock
> local deadline = timeout and (clock() + timeout)
> local opts = {timeout = timeout}
> for rs_uuid, replicaset in pairs(router.replicasets) do
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 1b48bf1..c7335fc 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self()
> local trigger = require('internal.trigger')
> local ffi = require('ffi')
> local yaml_encode = require('yaml').encode
> +local clock = lfiber.clock
>
> local MODULE_INTERNALS = '__module_vshard_storage'
> -- Reload requirements, in case this module is reloaded manually.
> @@ -695,7 +696,7 @@ local function sync(timeout)
> log.debug("Synchronizing replicaset...")
> timeout = timeout or M.sync_timeout
> local vclock = box.info.vclock
> - local tstart = lfiber.time()
> + local tstart = clock()
> repeat
> local done = true
> for _, replica in ipairs(box.info.replication) do
> @@ -711,7 +712,7 @@ local function sync(timeout)
> return true
> end
> lfiber.sleep(0.001)
> - until not (lfiber.time() <= tstart + timeout)
> + until not (clock() <= tstart + timeout)
> log.warn("Timed out during synchronizing replicaset")
> local ok, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> @@ -1280,10 +1281,9 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard)
> ref.rw_lock = true
> exception_guard.ref = ref
> exception_guard.drop_rw_lock = true
> - local deadline = lfiber.clock() + (opts and opts.timeout or 10)
> + local deadline = clock() + (opts and opts.timeout or 10)
> while ref.rw ~= 0 do
> - if not M.bucket_rw_lock_is_ready_cond:wait(deadline -
> - lfiber.clock()) then
> + if not M.bucket_rw_lock_is_ready_cond:wait(deadline - clock()) then
> status, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> end
> @@ -1579,7 +1579,7 @@ function gc_bucket_f()
> -- specified time interval the buckets are deleted both from
> -- this array and from _bucket space.
> local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = lfiber.time()
> + local buckets_for_redirect_ts = clock()
> -- Empty sent buckets, updated after each step, and when
> -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> -- for next deletion.
> @@ -1614,7 +1614,7 @@ function gc_bucket_f()
> end
> end
>
> - if lfiber.time() - buckets_for_redirect_ts >=
> + if clock() - buckets_for_redirect_ts >=
> consts.BUCKET_SENT_GARBAGE_DELAY then
> status, err = gc_bucket_drop(buckets_for_redirect,
> consts.BUCKET.SENT)
> @@ -1629,7 +1629,7 @@ function gc_bucket_f()
> else
> buckets_for_redirect = empty_sent_buckets or {}
> empty_sent_buckets = nil
> - buckets_for_redirect_ts = lfiber.time()
> + buckets_for_redirect_ts = clock()
> end
> end
> ::continue::
Hi! Thanks for your patch! LGTM, but I have one question.
Maybe it's reasonable to add a timeout to this function?
AFAIK test-run terminates tests after 120 seconds of inactivity, which
seems too long for such a simple case.
But anyway, it's up to you.
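For reference, a minimal sketch of how such a bound could look, assuming test-run's `wait_cond` accepts a timeout as its second argument. The `WAIT_TIMEOUT` constant is hypothetical, not part of the patch:

```lua
-- Hypothetical variant of wait_bucket_is_collected with a bounded wait.
-- WAIT_TIMEOUT is an assumed constant; wait_cond would then fail the
-- test after 10 seconds instead of test-run's 120-second inactivity cap.
local WAIT_TIMEOUT = 10

function wait_bucket_is_collected(id)
    test_run:wait_cond(function()
        if not box.space._bucket:get{id} then
            return true
        end
        vshard.storage.recovery_wakeup()
        vshard.storage.garbage_collector_wakeup()
    end, WAIT_TIMEOUT)
end
```

This sketch requires the tarantool test-run environment (`test_run`, `vshard`, `box`) and is not runnable standalone.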
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> In the tests to wait for bucket deletion by GC it was necessary
> to have a long loop expression which checks _bucket space and
> wakes up GC fiber if the bucket is not deleted yet.
>
> Soon the GC wakeup won't be necessary as GC algorithm will become
> reactive instead of proactive.
>
> In order not to remove the wakeup from all places in the main
> patch, and to simplify the waiting the patch introduces a function
> wait_bucket_is_collected().
>
> The reactive GC will delete GC wakeup from this function and all
> the tests still will pass in time.
> ---
> test/lua_libs/storage_template.lua | 10 ++++++++++
> test/rebalancer/bucket_ref.result | 7 ++-----
> test/rebalancer/bucket_ref.test.lua | 5 ++---
> test/rebalancer/errinj.result | 13 +++++--------
> test/rebalancer/errinj.test.lua | 7 +++----
> test/rebalancer/rebalancer.result | 5 +----
> test/rebalancer/rebalancer.test.lua | 3 +--
> test/rebalancer/receiving_bucket.result | 2 +-
> test/rebalancer/receiving_bucket.test.lua | 2 +-
> test/reload_evolution/storage.result | 5 +----
> test/reload_evolution/storage.test.lua | 3 +--
> 11 files changed, 28 insertions(+), 34 deletions(-)
>
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 84e4180..21409bd 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -165,3 +165,13 @@ function wait_rebalancer_state(state, test_run)
> vshard.storage.rebalancer_wakeup()
> end
> end
> +
> +function wait_bucket_is_collected(id)
> + test_run:wait_cond(function()
> + if not box.space._bucket:get{id} then
> + return true
> + end
> + vshard.storage.recovery_wakeup()
> + vshard.storage.garbage_collector_wakeup()
> + end)
> +end
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b66e449..b8fc7ff 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -243,7 +243,7 @@ vshard.storage.buckets_info(1)
> destination: <replicaset_2>
> id: 1
> ...
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch('box_2_a')
> @@ -292,10 +292,7 @@ vshard.storage.buckets_info(1)
> finish_refs = true
> ---
> ...
> -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> ----
> -...
> -while box.space._bucket:get{1} do fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch('box_1_a')
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 49ba583..213ced3 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -73,7 +73,7 @@ vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> vshard.storage.buckets_info(1)
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = false
> @@ -89,8 +89,7 @@ while not vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> fiber.sleep(0.2)
> vshard.storage.buckets_info(1)
> finish_refs = true
> -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> -while box.space._bucket:get{1} do fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> _ = test_run:switch('box_1_a')
> vshard.storage.buckets_info(1)
>
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index 214e7d8..e50eb72 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -237,7 +237,10 @@ _bucket:get{36}
> -- Buckets became 'active' on box_2_a, but still are sending on
> -- box_1_a. Wait until it is marked as garbage on box_1_a by the
> -- recovery fiber.
> -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(35)
> +---
> +...
> +wait_bucket_is_collected(36)
> ---
> ...
> _ = test_run:switch('box_2_a')
> @@ -278,7 +281,7 @@ while not _bucket:get{36} do fiber.sleep(0.0001) end
> _ = test_run:switch('box_1_a')
> ---
> ...
> -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(36)
> ---
> ...
> _bucket:get{36}
> @@ -295,12 +298,6 @@ box.error.injection.set('ERRINJ_WAL_DELAY', false)
> ---
> - ok
> ...
> -_ = test_run:switch('box_1_a')
> ----
> -...
> -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
> ----
> -...
> test_run:switch('default')
> ---
> - true
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 66fbe5e..2cc4a69 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -107,7 +107,8 @@ _bucket:get{36}
> -- Buckets became 'active' on box_2_a, but still are sending on
> -- box_1_a. Wait until it is marked as garbage on box_1_a by the
> -- recovery fiber.
> -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(35)
> +wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> _bucket:get{35}
> _bucket:get{36}
> @@ -124,13 +125,11 @@ f1 = fiber.create(function() ret1, err1 = vshard.storage.bucket_send(36, util.re
> _ = test_run:switch('box_2_a')
> while not _bucket:get{36} do fiber.sleep(0.0001) end
> _ = test_run:switch('box_1_a')
> -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(36)
> _bucket:get{36}
> _ = test_run:switch('box_2_a')
> _bucket:get{36}
> box.error.injection.set('ERRINJ_WAL_DELAY', false)
> -_ = test_run:switch('box_1_a')
> -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
>
> test_run:switch('default')
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result
> index 3607e93..098b845 100644
> --- a/test/rebalancer/rebalancer.result
> +++ b/test/rebalancer/rebalancer.result
> @@ -334,10 +334,7 @@ vshard.storage.rebalancer_wakeup()
> -- Now rebalancer makes a bucket SENT. After it the garbage
> -- collector cleans it and deletes after a timeout.
> --
> -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
> ----
> -...
> -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
> +wait_bucket_is_collected(91)
> ---
> ...
> wait_rebalancer_state("The cluster is balanced ok", test_run)
> diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua
> index 63e690f..308e66d 100644
> --- a/test/rebalancer/rebalancer.test.lua
> +++ b/test/rebalancer/rebalancer.test.lua
> @@ -162,8 +162,7 @@ vshard.storage.rebalancer_wakeup()
> -- Now rebalancer makes a bucket SENT. After it the garbage
> -- collector cleans it and deletes after a timeout.
> --
> -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
> -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
> +wait_bucket_is_collected(91)
> wait_rebalancer_state("The cluster is balanced ok", test_run)
> _bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
> _bucket.index.status:min({vshard.consts.BUCKET.ACTIVE})
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index db6a67f..7d3612b 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -374,7 +374,7 @@ vshard.storage.buckets_info(1)
> destination: <replicaset_1>
> id: 1
> ...
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 1819cbb..24534b3 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -137,7 +137,7 @@ box.space.test3:select{100}
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> vshard.storage.buckets_info(1)
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> box.space._bucket:get{1}
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 4652c4f..753687f 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -129,10 +129,7 @@ vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
> ---
> - true
> ...
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
> +wait_bucket_is_collected(bucket_id_to_move)
> ---
> ...
> test_run:switch('storage_1_a')
> diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
> index 06f7117..639553e 100644
> --- a/test/reload_evolution/storage.test.lua
> +++ b/test/reload_evolution/storage.test.lua
> @@ -51,8 +51,7 @@ vshard.storage.bucket_force_create(2000)
> vshard.storage.buckets_info()[2000]
> vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
> vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
> -vshard.storage.garbage_collector_wakeup()
> -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
> +wait_bucket_is_collected(bucket_id_to_move)
> test_run:switch('storage_1_a')
> while box.space._bucket:get{bucket_id_to_move}.status ~= vshard.consts.BUCKET.ACTIVE do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end
> vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[2])
Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Locked replicaset (via config) should not allow any bucket moves
> from or to the replicaset.
>
> But the lock check was only done by bucket_send(). Bucket_recv()
> allowed to receive a bucket even if the replicaset is locked. The
> patch fixes it.
>
> It didn't affect automatic bucket sends, because lock is
> accounted by the rebalancer from the config. Only manual bucket
> moves could have this bug.
> ---
> test/rebalancer/rebalancer_lock_and_pin.result | 14 ++++++++++++++
> test/rebalancer/rebalancer_lock_and_pin.test.lua | 4 ++++
> vshard/storage/init.lua | 3 +++
> 3 files changed, 21 insertions(+)
>
> diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result
> index 51dd36e..0bb4f45 100644
> --- a/test/rebalancer/rebalancer_lock_and_pin.result
> +++ b/test/rebalancer/rebalancer_lock_and_pin.result
> @@ -156,6 +156,20 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> message: Replicaset is locked
> code: 19
> ...
> +test_run:switch('box_2_a')
> +---
> +- true
> +...
> +-- Does not allow to receive either. Send from a non-locked replicaset to a
> +-- locked one fails.
> +vshard.storage.bucket_send(101, util.replicasets[1])
> +---
> +- null
> +- type: ShardingError
> + code: 19
> + name: REPLICASET_IS_LOCKED
> + message: Replicaset is locked
> +...
> --
> -- Vshard ensures that if a replicaset is locked, then it will not
> -- allow to change its bucket set even if a rebalancer does not
> diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua
> index c3412c1..7b87004 100644
> --- a/test/rebalancer/rebalancer_lock_and_pin.test.lua
> +++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua
> @@ -69,6 +69,10 @@ info.lock
> -- explicitly.
> --
> vshard.storage.bucket_send(1, util.replicasets[2])
> +test_run:switch('box_2_a')
> +-- Does not allow to receive either. Send from a non-locked replicaset to a
> +-- locked one fails.
> +vshard.storage.bucket_send(101, util.replicasets[1])
>
> --
> -- Vshard ensures that if a replicaset is locked, then it will not
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index c7335fc..298df71 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -995,6 +995,9 @@ local function bucket_recv_xc(bucket_id, from, data, opts)
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, msg,
> from)
> end
> + if is_this_replicaset_locked() then
> + return nil, lerror.vshard(lerror.code.REPLICASET_IS_LOCKED)
> + end
> if not bucket_receiving_quota_add(-1) then
> return nil, lerror.vshard(lerror.code.TOO_MANY_RECEIVING)
> end
Thanks for your patch! 1 comment below.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> The patch adds functions table_copy_yield and table_minus_yield.
>
> Yielding copy creates a duplicate of a table but yields every
> specified number of keys copied.
>
> Yielding minus removes matching key-value pairs specified in one
> table from another table. It yields every specified number of keys
> passed.
>
> The functions should help to process huge Lua tables (millions of
> elements and more). These are going to be used on the storage in
> the new GC algorithm.
>
> The algorithm will need to keep a route table on the storage, just
> like on the router, but with expiration time for the routes. Since
> bucket count can be millions, it means GC will potentially operate
> on a huge Lua table and could use some yields so as not to block
> TX thread for long.
>
> Needed for #147
> ---
> test/unit/util.result | 113 ++++++++++++++++++++++++++++++++++++++++
> test/unit/util.test.lua | 49 +++++++++++++++++
> vshard/util.lua | 40 ++++++++++++++
> 3 files changed, 202 insertions(+)
>
> diff --git a/test/unit/util.result b/test/unit/util.result
> index 096e36f..c4fd84d 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> ---
> ...
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +---
> +...
> +minus_yield({}, {}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k = 1}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +---
> +- k2: 2
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +---
> +- []
> +...
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +---
> +- k1: 1
> +...
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +---
> +- k3: 3
> +  k2: 2
> +...
> +do \
> +    t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> +    f = fiber.create(function() \
> +        minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> +    end) \
> +    yield_count = 0 \
> +    while f:status() ~= 'dead' do \
> +        yield_count = yield_count + 1 \
> +        fiber.yield() \
> +    end \
> +end
> +---

Why can't you use "csw" of fiber.self() instead? Also, is it reliable
enough to simply count yields? Could the scheduler skip this fiber at
some loop iteration? In other words, won't this test be flaky?

> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k4: 4
> +  k1: 1
> +...
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +---
> +...
> +copy_yield({}, 1)
> +---
> +- []
> +...
> +copy_yield({k = 1}, 1)
> +---
> +- k: 1
> +...
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +---
> +- k1: 1
> +  k2: 2
> +...
> +do \
> +    t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> +    res = nil \
> +    f = fiber.create(function() \
> +        res = copy_yield(t, 2) \
> +    end) \
> +    yield_count = 0 \
> +    while f:status() ~= 'dead' do \
> +        yield_count = yield_count + 1 \
> +        fiber.yield() \
> +    end \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k3: 3
> +  k4: 4
> +  k1: 1
> +  k2: 2
> +...
> +res
> +---
> +- k3: 3
> +  k4: 4
> +  k1: 1
> +  k2: 2
> +...
> +t ~= res
> +---
> +- true
> +...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 5f39e06..4d6cbe9 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
> while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
> test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> +
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +minus_yield({}, {}, 1)
> +minus_yield({}, {k = 1}, 1)
> +minus_yield({}, {k = 1}, 0)
> +minus_yield({k = 1}, {k = 1}, 0)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +
> +do \
> +    t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> +    f = fiber.create(function() \
> +        minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> +    end) \
> +    yield_count = 0 \
> +    while f:status() ~= 'dead' do \
> +        yield_count = yield_count + 1 \
> +        fiber.yield() \
> +    end \
> +end
> +yield_count
> +t
> +
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +copy_yield({}, 1)
> +copy_yield({k = 1}, 1)
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +
> +do \
> +    t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> +    res = nil \
> +    f = fiber.create(function() \
> +        res = copy_yield(t, 2) \
> +    end) \
> +    yield_count = 0 \
> +    while f:status() ~= 'dead' do \
> +        yield_count = yield_count + 1 \
> +        fiber.yield() \
> +    end \
> +end
> +yield_count
> +t
> +res
> +t ~= res
> diff --git a/vshard/util.lua b/vshard/util.lua
> index d3b4e67..2362607 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
> return minor >= minor_need
> end
>
> +--
> +-- Copy @a src table. Fiber yields every @a interval keys copied.
> +--
> +local function table_copy_yield(src, interval)
> +    local res = {}
> +    -- Time-To-Yield.
> +    local tty = interval
> +    for k, v in pairs(src) do
> +        res[k] = v
> +        tty = tty - 1
> +        if tty <= 0 then
> +            fiber.yield()
> +            tty = interval
> +        end
> +    end
> +    return res
> +end
> +
> +--
> +-- Remove @a src keys from @a dst if their values match. Fiber yields every
> +-- @a interval iterations.
> +--
> +local function table_minus_yield(dst, src, interval)
> +    -- Time-To-Yield.
> +    local tty = interval
> +    for k, srcv in pairs(src) do
> +        if dst[k] == srcv then
> +            dst[k] = nil
> +        end
> +        tty = tty - 1
> +        if tty <= 0 then
> +            fiber.yield()
> +            tty = interval
> +        end
> +    end
> +    return dst
> +end
> +
> return {
> tuple_extract_key = tuple_extract_key,
> reloadable_fiber_create = reloadable_fiber_create,
> @@ -160,4 +198,6 @@ return {
> async_task = async_task,
> internal = M,
> version_is_at_least = version_is_at_least,
> +    table_copy_yield = table_copy_yield,
> +    table_minus_yield = table_minus_yield,
> }
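The yield-every-N pattern from the quoted vshard/util.lua hunk can be demonstrated standalone with plain Lua coroutines in place of tarantool fibers. This is an illustration only: `coroutine.yield` stands in for `fiber.yield`, and the counting loop mirrors the test above without depending on the fiber scheduler:

```lua
-- Same minus logic as in the patch, but with coroutine.yield so it runs
-- outside tarantool. Yields once per `interval` keys of src processed.
local function table_minus_yield(dst, src, interval)
    local tty = interval -- Time-To-Yield.
    for k, srcv in pairs(src) do
        if dst[k] == srcv then
            dst[k] = nil
        end
        tty = tty - 1
        if tty <= 0 then
            coroutine.yield()
            tty = interval
        end
    end
    return dst
end

t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4}
local co = coroutine.create(function()
    table_minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2)
end)
yields = 0
while coroutine.resume(co) and coroutine.status(co) ~= 'dead' do
    yields = yields + 1
end
print(yields)     -- 2: src has 4 keys, interval is 2
print(t.k2, t.k3) -- nil nil: matching pairs were removed
print(t.k4)       -- 4: value mismatch (444 ~= 4), so the key is kept
```

Note the yield count depends only on the number of keys in `src`, so it is deterministic even though `pairs()` iteration order is not.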
Thanks for your patch! Is it possible to extend the log message to
"Option is deprecated and has no effect anymore"? Also, for some options
it could be useful to say: "Option is deprecated, use ... instead" (e.g.
for "weights"). It seems the message should be more configurable and give
the user a hint about what to do.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Some options in vshard are going to be eventually deprecated. For
> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be

typo: weigts -> weights

> deleted since it appears not to be so useful, 'sync_timeout' is
> totally unnecessary since any 'sync' can take a timeout per-call.
>
> But the patch is motivated by 'collect_bucket_garbage_interval'
> which is going to become unused in the new GC algorithm.
>
> New GC will be reactive instead of proactive. Instead of periodic
> polling of _bucket space it will react on needed events
> immediately. This will make the 'collect interval' unused.
>
> The option will be deprecated and eventually in some far future
> release its usage will lead to an error.
>
> Needed for #147
> ---
> vshard/cfg.lua | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 1ef1899..28c3400 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -59,7 +59,11 @@ local function validate_config(config, template, check_arg)
> local value = config[key]
> local name = template_value.name
> local expected_type = template_value.type
> - if value == nil then
> + if template_value.is_deprecated then
> + if value ~= nil then
> + log.warn('Option "%s" is deprecated', name)
> + end
> + elseif value == nil then
> if not template_value.is_optional then
> error(string.format('%s must be specified', name))
> else
Thanks for your patch.
As I see, you've introduced some new parameters: "LUA_CHUNK_SIZE" and
"GC_BACKOFF_INTERVAL".
I think it's better to describe them in the commit message to make it
clearer how the new algorithm works.
I also see that you didn't update the comment above the "gc_bucket_f"
function. Is it still relevant?
In general the patch LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Garbage collector is a fiber on a master node which deletes
> GARBAGE and SENT buckets along with their data.
>
> It was proactive. It used to wakeup with a constant period to
> find and delete the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> GC algorithm if a bucket is sent, it won't be deleted for the next
> 5 seconds by default. During this time all new map-reduce requests
> can't execute.
>
> This is not acceptable. As well as too frequent wakeup of GC fiber
> because it would waste TX thread time.
>
> The patch makes GC fiber wakeup not by a timeout but by events
> happening with _bucket space. GC fiber sleeps on a condition
> variable which is signaled when _bucket is changed.
>
> Once GC sees work to do, it won't sleep until it is done. It will
> only yield.
>
> This makes GC delete SENT and GARBAGE buckets as soon as possible
> reducing the waiting time for the incoming map-reduce requests.
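The event-driven scheme described above can be sketched roughly as follows. This is an illustration with hypothetical names, not the actual patch code; it only shows the shape of the condition-variable wakeup:

```lua
-- Sketch: a trigger on _bucket signals a condition variable; the GC
-- fiber sleeps on it instead of waking up on a fixed interval.
local fiber = require('fiber')

local bucket_changed_cond = fiber.cond()

-- Any change to _bucket (bucket sent, became garbage, etc.) wakes GC.
box.space._bucket:on_replace(function(old_tuple, new_tuple)
    bucket_changed_cond:broadcast()
end)

local function gc_bucket_f()
    while true do
        -- Delete all currently visible SENT/GARBAGE buckets here,
        -- yielding periodically so the TX thread is not blocked.
        -- ...
        -- Then sleep until _bucket changes again instead of polling.
        bucket_changed_cond:wait()
    end
end
```

This sketch requires a running tarantool instance (`box`, `fiber`) and is not runnable standalone.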
>
> Needed for #147
>
> @TarantoolBot document
> Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
> It was used to specify the interval between bucket garbage
> collection steps. It was needed because garbage collection in
> vshard was proactive. It didn't react to newly appeared garbage
> buckets immediately.
>
> Since now (0.1.17) garbage collection became reactive. It starts
> working with garbage buckets immediately as they appear. And
> sleeps rest of the time. The option is not used now and does not
> affect behaviour of anything.
>
> I suppose it can be deleted from the documentation. Or left with
> a big label 'deprecated' + the explanation above.
>
> An attempt to use the option does not cause an error, but logs a
> warning.
> ---
> test/lua_libs/storage_template.lua | 1 -
> test/misc/reconfigure.result | 10 -
> test/misc/reconfigure.test.lua | 3 -
> test/rebalancer/bucket_ref.result | 12 --
> test/rebalancer/bucket_ref.test.lua | 3 -
> test/rebalancer/errinj.result | 11 --
> test/rebalancer/errinj.test.lua | 5 -
> test/rebalancer/receiving_bucket.result | 8 -
> test/rebalancer/receiving_bucket.test.lua | 1 -
> test/reload_evolution/storage.result | 2 +-
> test/router/reroute_wrong_bucket.result | 8 +-
> test/router/reroute_wrong_bucket.test.lua | 4 +-
> test/storage/recovery.result | 3 +-
> test/storage/storage.result | 10 +-
> test/storage/storage.test.lua | 1 +
> test/unit/config.result | 35 +---
> test/unit/config.test.lua | 16 +-
> test/unit/garbage.result | 106 ++++++----
> test/unit/garbage.test.lua | 47 +++--
> test/unit/garbage_errinj.result | 223 ----------------------
> test/unit/garbage_errinj.test.lua | 73 -------
> vshard/cfg.lua | 4 +-
> vshard/consts.lua | 5 +-
> vshard/storage/init.lua | 207 ++++++++++----------
> vshard/storage/reload_evolution.lua | 8 +
> 25 files changed, 233 insertions(+), 573 deletions(-)
> delete mode 100644 test/unit/garbage_errinj.result
> delete mode 100644 test/unit/garbage_errinj.test.lua
>
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 21409bd..8df89f6 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
> return true
> end
> vshard.storage.recovery_wakeup()
> - vshard.storage.garbage_collector_wakeup()
> end)
> end
> diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
> index 168be5d..3b34841 100644
> --- a/test/misc/reconfigure.result
> +++ b/test/misc/reconfigure.result
> @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> ---
> ...
> -cfg.collect_bucket_garbage_interval = 100
> ----
> -...
> cfg.invalid_option = 'kek'
> ---
> ...
> @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
> ---
> - true
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> ----
> -- true
> -...
> cfg.sync_timeout = nil
> ---
> ...
> @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> ---
> ...
> -cfg.collect_bucket_garbage_interval = nil
> ----
> -...
> cfg.invalid_option = nil
> ---
> ...
> diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
> index e891010..348628c 100644
> --- a/test/misc/reconfigure.test.lua
> +++ b/test/misc/reconfigure.test.lua
> @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
> cfg.sync_timeout = 100
> cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> -cfg.collect_bucket_garbage_interval = 100
> cfg.invalid_option = 'kek'
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> not vshard.storage.internal.collect_lua_garbage
> vshard.storage.internal.sync_timeout
> vshard.storage.internal.rebalancer_max_receiving ~= 1000
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> cfg.sync_timeout = nil
> cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> -cfg.collect_bucket_garbage_interval = nil
> cfg.invalid_option = nil
>
> --
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b8fc7ff..9df7480 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
> - true
> ...
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> vshard.storage.buckets_info(1)
> ---
> - 1:
> @@ -203,7 +200,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> ---
> @@ -235,14 +231,6 @@ finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> ---
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: garbage
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 213ced3..1b032ff 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
> vshard.storage.bucket_ref(1, 'read')
> vshard.storage.bucket_unref(1, 'read')
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> vshard.storage.buckets_info(1)
> _ = test_run:cmd("setopt delimiter ';'")
> while true do
> @@ -64,7 +63,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> _ = test_run:cmd("setopt delimiter ''");
> @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
> vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index e50eb72..0ddb1c9 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -226,17 +226,6 @@ ret2, err2
> - true
> - null
> ...
> -_bucket:get{35}
> ----
> -- [35, 'sent', '<replicaset_2>']
> -...
> -_bucket:get{36}
> ----
> -- [36, 'sent', '<replicaset_2>']
> -...
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> ---
> ...
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 2cc4a69..a60f3d7 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
> while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
> ret1, err1
> ret2, err2
> -_bucket:get{35}
> -_bucket:get{36}
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index 7d3612b..ad93445 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> ---
> - true
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_1>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 24534b3..2cf6382 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -136,7 +136,6 @@ box.space.test3:select{100}
> -- Now the bucket is unreferenced and can be transferred.
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 753687f..9d30a04 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
> ...
> vshard.storage.internal.reload_version
> ---
> -- 2
> +- 3
> ...
> --
> -- gh-237: should be only one trigger. During gh-237 the trigger installation
> diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
> index 049bdef..ac340eb 100644
> --- a/test/router/reroute_wrong_bucket.result
> +++ b/test/router/reroute_wrong_bucket.result
> @@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> @@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
> err
> ---
> - bucket_id: 100
> - reason: write is prohibited
> + reason: Not found
> code: 1
> destination: ac522f65-aa94-4134-9f64-51ee384f1a54
> type: ShardingError
> name: WRONG_BUCKET
> - message: 'Cannot perform action with bucket 100, reason: write is prohibited'
> + message: 'Cannot perform action with bucket 100, reason: Not found'
> ...
> --
> -- Now try again, but update configuration during call(). It must
> diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
> index 9e6e804..207aac3 100644
> --- a/test/router/reroute_wrong_bucket.test.lua
> +++ b/test/router/reroute_wrong_bucket.test.lua
> @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
> test_run:cmd('create server router_1 with script="router/router_1.lua"')
> test_run:cmd('start server router_1')
> test_run:switch('storage_1_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> vshard.storage.rebalancer_disable()
> for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
>
> test_run:switch('storage_2_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> vshard.storage.rebalancer_disable()
> for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index f833fe7..8ccb0b9 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -79,8 +79,7 @@ _bucket = box.space._bucket
> ...
> _bucket:select{}
> ---
> -- - [2, 'garbage', '<replicaset_2>']
> - - [3, 'garbage', '<replicaset_2>']
> +- []
> ...
> _ = test_run:switch('storage_2_a')
> ---
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 424bc4c..0550ad1 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> ---
> - true
> ...
> +wait_bucket_is_collected(1)
> +---
> +...
> _ = test_run:switch("storage_2_a")
> ---
> ...
> @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
> ...
> vshard.storage.buckets_info()
> ---
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> - 2:
> +- 2:
> status: active
> id: 2
> ...
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index d631b51..d8fbd94 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
>
> -- Successful transfer.
> vshard.storage.bucket_send(1, util.replicasets[2])
> +wait_bucket_is_collected(1)
> _ = test_run:switch("storage_2_a")
> vshard.storage.buckets_info()
> _ = test_run:switch("storage_1_a")
> diff --git a/test/unit/config.result b/test/unit/config.result
> index dfd0219..e0b2482 100644
> --- a/test/unit/config.result
> +++ b/test/unit/config.result
> @@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 0
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = -1
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 100.5
> ----
> -...
> -_ = lcfg.check(cfg)
> ----
> -...
> cfg.collect_lua_garbage = 100
> ---
> ...
> @@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> ---
> ...
> -cfg.sharding = nil
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +---
> +...
> +_ = lcfg.check(cfg)
> ---
> ...
> diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
> index ada43db..a1c9f07 100644
> --- a/test/unit/config.test.lua
> +++ b/test/unit/config.test.lua
> @@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 0
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = -1
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 100.5
> -_ = lcfg.check(cfg)
> -
> cfg.collect_lua_garbage = 100
> check(cfg)
> cfg.collect_lua_garbage = true
> @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
> cfg.rebalancer_max_sending = 15
> lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> -cfg.sharding = nil
> +
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +_ = lcfg.check(cfg)
> diff --git a/test/unit/garbage.result b/test/unit/garbage.result
> index 74d9ccf..a530496 100644
> --- a/test/unit/garbage.result
> +++ b/test/unit/garbage.result
> @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
> vshard.storage.internal.shard_index = 'bucket_id'
> ---
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> ----
> -...
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> -- by it, or bucket_id is not unsigned.
> @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> ---
> ...
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> +---
> +...
> _bucket = box.schema.create_space('_bucket', {format = format})
> ---
> ...
> @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ---
> - [3, 'active']
> ...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [6, 'garbage']
> -...
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [200, 'garbage']
> -...
> s = box.schema.create_space('test', {engine = engine})
> ---
> ...
> @@ -213,7 +197,7 @@ s:replace{4, 2}
> ---
> - [4, 2]
> ...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> ---
> ...
> s2 = box.schema.create_space('test2', {engine = engine})
> @@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> ---
> ...
> @@ -267,12 +255,22 @@ fill_spaces_with_garbage()
> ---
> - 1107
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - null
> + - null
> + - destination2
> ...
> #s2:select{}
> ---
> @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ---
> - 7
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - destination1
> ...
> s2:select{}
> ---
> @@ -303,17 +311,22 @@ s:select{}
> - [6, 100]
> ...
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- []
> ...
> #s2:select{}
> ---
> @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> fill_spaces_with_garbage()
> ---
> ...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> ---
> ...
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -360,7 +378,6 @@ _bucket:select{}
> - - [1, 'active']
> - [2, 'receiving']
> - [3, 'active']
> - - [4, 'sent']
> ...
> --
> -- Test deletion of 'sent' buckets after a specified timeout.
> @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
> - [2, 'sent']
> ...
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
> ---
> - [4, 'sent']
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -434,11 +453,14 @@ s:replace{6, 4}
> ---
> - [6, 4]
> ...
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> ---
> +- Error during garbage collection step
> ...
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -454,8 +476,9 @@ _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> ---
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> f:cancel()
> ---
> @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
> index 30079fa..250afb0 100644
> --- a/test/unit/garbage.test.lua
> +++ b/test/unit/garbage.test.lua
> @@ -15,7 +15,6 @@ end;
> test_run:cmd("setopt delimiter ''");
>
> vshard.storage.internal.shard_index = 'bucket_id'
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
>
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> @@ -75,16 +74,13 @@ s:drop()
> format = {}
> format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> _bucket = box.schema.create_space('_bucket', {format = format})
> _ = _bucket:create_index('pk')
> _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> _bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
>
> s = box.schema.create_space('test', {engine = engine})
> pk = s:create_index('pk')
> @@ -94,7 +90,7 @@ s:replace{2, 1}
> s:replace{3, 2}
> s:replace{4, 2}
>
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> s2 = box.schema.create_space('test2', {engine = engine})
> pk2 = s2:create_index('pk')
> sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> @@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> test_run:cmd("setopt delimiter ''");
>
> @@ -121,15 +121,21 @@ fill_spaces_with_garbage()
>
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +route_map
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> s2:select{}
> s:select{}
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> #s2:select{}
> #s:select{}
>
> @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -- Test continuous garbage collection via background fiber.
> --
> fill_spaces_with_garbage()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> s:select{}
> s2:select{}
> -- Check garbage bucket is deleted by background fiber.
> @@ -150,7 +160,7 @@ _bucket:select{}
> --
> _bucket:replace{2, vshard.consts.BUCKET.SENT}
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> _bucket:select{}
> s:select{}
> s2:select{}
> @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
> s:replace{5, 4}
> s:replace{6, 4}
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> s:replace{5, 4}
> s:replace{6, 4}
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> s:select{}
> _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> f:cancel()
>
> @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> #s:select{}
> #s2:select{}
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> _bucket:select{}
> s:select{}
> s2:select{}
> diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
> deleted file mode 100644
> index 92c8039..0000000
> --- a/test/unit/garbage_errinj.result
> +++ /dev/null
> @@ -1,223 +0,0 @@
> -test_run = require('test_run').new()
> ----
> -...
> -vshard = require('vshard')
> ----
> -...
> -fiber = require('fiber')
> ----
> -...
> -engine = test_run:get_cfg('engine')
> ----
> -...
> -vshard.storage.internal.shard_index = 'bucket_id'
> ----
> -...
> -format = {}
> ----
> -...
> -format[1] = {name = 'id', type = 'unsigned'}
> ----
> -...
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> ----
> -...
> -_bucket = box.schema.create_space('_bucket', {format = format})
> ----
> -...
> -_ = _bucket:create_index('pk')
> ----
> -...
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> ----
> -...
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [1, 'active']
> -...
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> ----
> -- [2, 'receiving']
> -...
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [3, 'active']
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -s = box.schema.create_space('test', {engine = engine})
> ----
> -...
> -pk = s:create_index('pk')
> ----
> -...
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s:replace{2, 1}
> ----
> -- [2, 1]
> -...
> -s:replace{3, 2}
> ----
> -- [3, 2]
> -...
> -s:replace{4, 2}
> ----
> -- [4, 2]
> -...
> -s:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s:replace{6, 100}
> ----
> -- [6, 100]
> -...
> -s:replace{7, 4}
> ----
> -- [7, 4]
> -...
> -s:replace{8, 5}
> ----
> -- [8, 5]
> -...
> -s2 = box.schema.create_space('test2', {engine = engine})
> ----
> -...
> -pk2 = s2:create_index('pk')
> ----
> -...
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s2:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s2:replace{3, 3}
> ----
> -- [3, 3]
> -...
> -for i = 7, 1107 do s:replace{i, 200} end
> ----
> -...
> -s2:replace{4, 200}
> ----
> -- [4, 200]
> -...
> -s2:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s2:replace{5, 300}
> ----
> -- [5, 300]
> -...
> -s2:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -s2:replace{7, 5}
> ----
> -- [7, 5]
> -...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> ----
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- - 4
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 5
> -- true
> -...
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> ----
> -...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> ----
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> ----
> -...
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> ----
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [4, 'garbage']
> -...
> -s:replace{5, 4}
> ----
> -- [5, 4]
> -...
> -s:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -#s:select{}
> ----
> -- 2
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> ----
> -...
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> ----
> -...
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> ----
> -- 2
> -...
> -_bucket:select{4}
> ----
> -- - [4, 'garbage']
> -...
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- []
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 4
> - - 5
> -- true
> -...
> -#s:select{}
> ----
> -- 0
> -...
> -_bucket:delete{4}
> ----
> -- [4, 'garbage']
> -...
> -s2:drop()
> ----
> -...
> -s:drop()
> ----
> -...
> -_bucket:drop()
> ----
> -...
> diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
> deleted file mode 100644
> index 31184b9..0000000
> --- a/test/unit/garbage_errinj.test.lua
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -test_run = require('test_run').new()
> -vshard = require('vshard')
> -fiber = require('fiber')
> -
> -engine = test_run:get_cfg('engine')
> -vshard.storage.internal.shard_index = 'bucket_id'
> -
> -format = {}
> -format[1] = {name = 'id', type = 'unsigned'}
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> -_bucket = box.schema.create_space('_bucket', {format = format})
> -_ = _bucket:create_index('pk')
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -
> -s = box.schema.create_space('test', {engine = engine})
> -pk = s:create_index('pk')
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s:replace{1, 1}
> -s:replace{2, 1}
> -s:replace{3, 2}
> -s:replace{4, 2}
> -s:replace{5, 100}
> -s:replace{6, 100}
> -s:replace{7, 4}
> -s:replace{8, 5}
> -
> -s2 = box.schema.create_space('test2', {engine = engine})
> -pk2 = s2:create_index('pk')
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s2:replace{1, 1}
> -s2:replace{3, 3}
> -for i = 7, 1107 do s:replace{i, 200} end
> -s2:replace{4, 200}
> -s2:replace{5, 100}
> -s2:replace{5, 300}
> -s2:replace{6, 4}
> -s2:replace{7, 5}
> -
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> -s:replace{5, 4}
> -s:replace{6, 4}
> -#s:select{}
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> -_bucket:select{4}
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -#s:select{}
> -_bucket:delete{4}
> -
> -s2:drop()
> -s:drop()
> -_bucket:drop()
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 28c3400..1345058 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -245,9 +245,7 @@ local cfg_template = {
> max = consts.REBALANCER_MAX_SENDING_MAX
> },
> collect_bucket_garbage_interval = {
> - type = 'positive number', name = 'Garbage bucket collect interval',
> - is_optional = true,
> - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> + name = 'Garbage bucket collect interval', is_deprecated = true,
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 8c2a8b0..3f1585a 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -23,6 +23,7 @@ return {
> DEFAULT_BUCKET_COUNT = 3000;
> BUCKET_SENT_GARBAGE_DELAY = 0.5;
> BUCKET_CHUNK_SIZE = 1000;
> + LUA_CHUNK_SIZE = 100000,
> DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
> REBALANCER_IDLE_INTERVAL = 60 * 60;
> REBALANCER_WORK_INTERVAL = 10;
> @@ -37,7 +38,7 @@ return {
> DEFAULT_FAILOVER_PING_TIMEOUT = 5;
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
> + GC_BACKOFF_INTERVAL = 5,
> RECOVERY_INTERVAL = 5;
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> @@ -45,4 +46,6 @@ return {
> DISCOVERY_WORK_INTERVAL = 1,
> DISCOVERY_WORK_STEP = 0.01,
> DISCOVERY_TIMEOUT = 10,
> +
> + TIMEOUT_INFINITY = 500 * 365 * 86400,
> }
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 298df71..31a6fc7 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -69,7 +69,6 @@ if not M then
> total_bucket_count = 0,
> errinj = {
> ERRINJ_CFG = false,
> - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
> ERRINJ_RELOAD = false,
> ERRINJ_CFG_DELAY = false,
> ERRINJ_LONG_RECEIVE = false,
> @@ -96,6 +95,8 @@ if not M then
> -- detect that _bucket was not changed between yields.
> --
> bucket_generation = 0,
> + -- Condition variable fired on generation update.
> + bucket_generation_cond = lfiber.cond(),
> --
> -- Reference to the function used as on_replace trigger on
> -- _bucket space. It is used to replace the trigger with
> @@ -107,12 +108,14 @@ if not M then
> -- replace the old function is to keep its reference.
> --
> bucket_on_replace = nil,
> + -- Redirects for recently sent buckets. They are kept for a while to
> + -- help routers find the new location of sent and deleted buckets
> + -- without a full cluster scan.
> + route_map = {},
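
Side note for readers of the series: the route map is what lets `bucket_check_state()` report a destination even after GC has dropped the `_bucket` row. A minimal sketch of that lookup, as I read the patch (the helper name `bucket_destination` is mine, not from the code):

```lua
-- Sketch: resolve a redirect for a bucket that may already be
-- collected. M.route_map is filled by GC when it drops a 'sent'
-- bucket, keeping its destination for a while.
local function bucket_destination(bucket_id)
    local bucket = box.space._bucket:get{bucket_id}
    if bucket ~= nil then
        -- The row still exists - its own field is authoritative.
        return bucket.destination
    end
    -- The row is gone - fall back to the remembered redirect,
    -- if GC saved one. May be nil if the map entry expired.
    return M.route_map[bucket_id]
end
```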
>
> ------------------- Garbage collection -------------------
> -- Fiber to remove garbage buckets data.
> collect_bucket_garbage_fiber = nil,
> - -- Do buckets garbage collection once per this time.
> - collect_bucket_garbage_interval = nil,
> -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> @@ -173,6 +176,7 @@ end
> --
> local function bucket_generation_increment()
> M.bucket_generation = M.bucket_generation + 1
> + M.bucket_generation_cond:broadcast()
> end
>
> --
> @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
> else
> return bucket
> end
> + local dst = bucket and bucket.destination or M.route_map[bucket_id]
> return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
> - bucket and bucket.destination)
> + dst)
> end
>
> --
> @@ -804,11 +809,23 @@ end
> --
> local function bucket_unrefro(bucket_id)
> local ref = M.bucket_refs[bucket_id]
> - if not ref or ref.ro == 0 then
> + local count = ref and ref.ro or 0
> + if count == 0 then
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
> "no refs", nil)
> end
> - ref.ro = ref.ro - 1
> + if count == 1 then
> + ref.ro = 0
> + if ref.ro_lock then
> + -- Garbage collector is waiting for the bucket if RO
> + -- is locked. Let it know it has one more bucket to
> + -- collect. It relies on generation, so its increment
> + -- is enough.
> + bucket_generation_increment()
> + end
> + return true
> + end
> + ref.ro = count - 1
> return true
> end
>
> @@ -1479,79 +1496,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
> end
>
> --
> --- Remove tuples from buckets of a specified type.
> --- @param type Type of buckets to gc.
> --- @retval List of ids of empty buckets of the type.
> +-- Drop buckets with the given status along with their data in all spaces.
> +-- @param status Status of target buckets.
> +-- @param route_map Destinations of deleted buckets are saved into this table.
> --
> -local function gc_bucket_step_by_type(type)
> - local sharded_spaces = find_sharded_spaces()
> - local empty_buckets = {}
> +local function gc_bucket_drop_xc(status, route_map)
> local limit = consts.BUCKET_CHUNK_SIZE
> - local is_all_collected = true
> - for _, bucket in box.space._bucket.index.status:pairs(type) do
> - local bucket_id = bucket.id
> - local ref = M.bucket_refs[bucket_id]
> + local _bucket = box.space._bucket
> + local sharded_spaces = find_sharded_spaces()
> + for _, b in _bucket.index.status:pairs(status) do
> + local id = b.id
> + local ref = M.bucket_refs[id]
> if ref then
> assert(ref.rw == 0)
> if ref.ro ~= 0 then
> ref.ro_lock = true
> - is_all_collected = false
> goto continue
> end
> - M.bucket_refs[bucket_id] = nil
> + M.bucket_refs[id] = nil
> end
> for _, space in pairs(sharded_spaces) do
> - gc_bucket_in_space_xc(space, bucket_id, type)
> + gc_bucket_in_space_xc(space, id, status)
> limit = limit - 1
> if limit == 0 then
> lfiber.sleep(0)
> limit = consts.BUCKET_CHUNK_SIZE
> end
> end
> - table.insert(empty_buckets, bucket.id)
> -::continue::
> + route_map[id] = b.destination
> + _bucket:delete{id}
> + ::continue::
> end
> - return empty_buckets, is_all_collected
> -end
> -
> ---
> --- Drop buckets with ids in the list.
> --- @param bucket_ids Bucket ids to drop.
> --- @param status Expected bucket status.
> ---
> -local function gc_bucket_drop_xc(bucket_ids, status)
> - if #bucket_ids == 0 then
> - return
> - end
> - local limit = consts.BUCKET_CHUNK_SIZE
> - box.begin()
> - local _bucket = box.space._bucket
> - for _, id in pairs(bucket_ids) do
> - local bucket_exists = _bucket:get{id} ~= nil
> - local b = _bucket:get{id}
> - if b then
> - if b.status ~= status then
> - return error(string.format('Bucket %d status is changed. Was '..
> - '%s, became %s', id, status,
> - b.status))
> - end
> - _bucket:delete{id}
> - end
> - limit = limit - 1
> - if limit == 0 then
> - box.commit()
> - box.begin()
> - limit = consts.BUCKET_CHUNK_SIZE
> - end
> - end
> - box.commit()
> end
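A side note for readers of the hunk above: the deletion follows a generic "work N units, then yield" pattern, keeping the TX thread responsive during a long loop. A minimal pure-Lua model of that pattern (the function names and the injected no-op yield are illustrative, not part of the patch):

```lua
-- Generic form of the chunking used by gc_bucket_drop_xc(): process at
-- most `chunk` items between yields so a long loop cannot monopolize
-- the TX thread. `work` and `yield` are injected to keep the model pure.
local function process_chunked(items, chunk, work, yield)
    local limit = chunk
    for _, item in ipairs(items) do
        work(item)
        limit = limit - 1
        if limit == 0 then
            yield()
            limit = chunk
        end
    end
end

local done, yields = {}, 0
process_chunked({1, 2, 3, 4, 5, 6, 7}, 3,
    function(i) done[#done + 1] = i end,
    function() yields = yields + 1 end)
-- 7 items with chunk size 3: a yield happens after the 3rd and 6th items.
assert(#done == 7 and yields == 2)
```

In the real code the yield is `lfiber.sleep(0)` and the chunk size is `consts.BUCKET_CHUNK_SIZE`.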
>
> --
> -- Exception safe version of gc_bucket_drop_xc.
> --
> -local function gc_bucket_drop(bucket_ids, status)
> - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
> +local function gc_bucket_drop(status, route_map)
> + local status, err = pcall(gc_bucket_drop_xc, status, route_map)
> if not status then
> box.rollback()
> end
> @@ -1578,65 +1560,75 @@ function gc_bucket_f()
> -- generation == bucket generation. In such a case the fiber
> -- does nothing until next _bucket change.
> local bucket_generation_collected = -1
> - -- Empty sent buckets are collected into an array. After a
> - -- specified time interval the buckets are deleted both from
> - -- this array and from _bucket space.
> - local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = clock()
> - -- Empty sent buckets, updated after each step, and when
> - -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> - -- for next deletion.
> - local empty_garbage_buckets, empty_sent_buckets, status, err
> + local bucket_generation_current = M.bucket_generation
> + -- Deleted buckets are saved into a route map to redirect routers if they
> + -- didn't discover new location of the buckets yet. However route map does
> + -- not grow infinitely. Otherwise it would end up storing redirects for all
> + -- buckets in the cluster, which could also become outdated.
> + -- Garbage collector periodically drops old routes from the map. For that it
> + -- remembers state of route map in one moment, and after a while clears the
> + -- remembered routes from the global route map.
> + local route_map = M.route_map
> + local route_map_old = {}
> + local route_map_deadline = 0
> + local status, err
> while M.module_version == module_version do
> - -- Check if no changes in buckets configuration.
> - if bucket_generation_collected ~= M.bucket_generation then
> - local bucket_generation = M.bucket_generation
> - local is_sent_collected, is_garbage_collected
> - status, empty_garbage_buckets, is_garbage_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
> - if not status then
> - err = empty_garbage_buckets
> - goto check_error
> - end
> - status, empty_sent_buckets, is_sent_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
> - if not status then
> - err = empty_sent_buckets
> - goto check_error
> + if bucket_generation_collected ~= bucket_generation_current then
> + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
> + if status then
> + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
> end
> - status, err = gc_bucket_drop(empty_garbage_buckets,
> - consts.BUCKET.GARBAGE)
> -::check_error::
> if not status then
> box.rollback()
> log.error('Error during garbage collection step: %s', err)
> - goto continue
> + else
> + -- Don't use global generation. During the collection it could
> + -- already change. Instead, remember the generation known before
> + -- the collection has started.
> + -- Since the collection also changes the generation, it makes
> + -- the GC happen always at least twice. But typically on the
> + -- second iteration it should not find any buckets to collect,
> + -- and then the collected generation matches the global one.
> + bucket_generation_collected = bucket_generation_current
> end
> - if is_sent_collected and is_garbage_collected then
> - bucket_generation_collected = bucket_generation
> + else
> + status = true
> + end
> +
> + local sleep_time = route_map_deadline - clock()
> + if sleep_time <= 0 then
> + local chunk = consts.LUA_CHUNK_SIZE
> + util.table_minus_yield(route_map, route_map_old, chunk)
> + route_map_old = util.table_copy_yield(route_map, chunk)
> + if next(route_map_old) then
> + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> end
> + route_map_deadline = clock() + sleep_time
> end
> + bucket_generation_current = M.bucket_generation
>
> - if clock() - buckets_for_redirect_ts >=
> - consts.BUCKET_SENT_GARBAGE_DELAY then
> - status, err = gc_bucket_drop(buckets_for_redirect,
> - consts.BUCKET.SENT)
> - if not status then
> - buckets_for_redirect = {}
> - empty_sent_buckets = {}
> - bucket_generation_collected = -1
> - log.error('Error during deletion of empty sent buckets: %s',
> - err)
> - elseif M.module_version ~= module_version then
> - return
> + if bucket_generation_current ~= bucket_generation_collected then
> + -- Generation was changed during collection. Or *by* collection.
> + if status then
> + -- Retry immediately. If the generation was changed by the
> + -- collection itself, it will notice it next iteration, and go
> + -- to proper sleep.
> + sleep_time = 0
> else
> - buckets_for_redirect = empty_sent_buckets or {}
> - empty_sent_buckets = nil
> - buckets_for_redirect_ts = clock()
> + -- An error happened during the collection. It does not make sense
> + -- to retry on each iteration of the event loop. The most likely
> + -- errors are either a WAL error or a transaction abort - both
> + -- look like an issue in the user's code and can't be fixed
> + -- quickly anyway. Backoff.
> + sleep_time = consts.GC_BACKOFF_INTERVAL
> end
> end
> -::continue::
> - lfiber.sleep(M.collect_bucket_garbage_interval)
> +
> + if M.module_version == module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> + end
> end
> end
>
> @@ -2421,8 +2413,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> vshard_cfg.rebalancer_disbalance_threshold
> M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
> M.shard_index = vshard_cfg.shard_index
> - M.collect_bucket_garbage_interval =
> - vshard_cfg.collect_bucket_garbage_interval
> M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
> M.current_cfg = cfg
> @@ -2676,6 +2666,9 @@ else
> storage_cfg(M.current_cfg, M.this_replica.uuid, true)
> end
> M.module_version = M.module_version + 1
> + -- Background fibers could sleep waiting for bucket changes.
> + -- Let them know it is time to reload.
> + bucket_generation_increment()
> end
>
> M.recovery_f = recovery_f
> @@ -2686,7 +2679,7 @@ M.gc_bucket_f = gc_bucket_f
> -- These functions are saved in M not for atomic reload, but for
> -- unit testing.
> --
> -M.gc_bucket_step_by_type = gc_bucket_step_by_type
> +M.gc_bucket_drop = gc_bucket_drop
> M.rebalancer_build_routes = rebalancer_build_routes
> M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
> M.cached_find_sharded_spaces = find_sharded_spaces
> diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
> index f38af74..484f499 100644
> --- a/vshard/storage/reload_evolution.lua
> +++ b/vshard/storage/reload_evolution.lua
> @@ -4,6 +4,7 @@
> -- in a commit.
> --
> local log = require('log')
> +local fiber = require('fiber')
>
> --
> -- Array of upgrade functions.
> @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
> end
> end
>
> +migrations[#migrations + 1] = function(M)
> + if not M.route_map then
> + M.bucket_generation_cond = fiber.cond()
> + M.route_map = {}
> + end
> +end
> +
> --
> -- Perform an update based on a version stored in `M` (internals).
> -- @param M Old module internals which should be updated.
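The route-map expiry in `gc_bucket_f()` above is easiest to see as two phases separated by `BUCKET_SENT_GARBAGE_DELAY`. A condensed pure-Lua model (the yield-aware `util.table_copy_yield`/`table_minus_yield` helpers are replaced with plain, non-yielding stand-ins; the bucket ids and replicaset names are made up):

```lua
-- Non-yielding stand-in for util.table_copy_yield().
local function table_copy(t)
    local res = {}
    for k, v in pairs(t) do
        res[k] = v
    end
    return res
end

-- Non-yielding stand-in for util.table_minus_yield().
local function table_minus(t, minus)
    for k, v in pairs(minus) do
        -- Drop the key only if it was not overwritten meanwhile.
        if t[k] == v then
            t[k] = nil
        end
    end
end

-- bucket id -> destination replicaset of a recently sent bucket.
local route_map = {[1] = 'rs2', [2] = 'rs3'}
-- Phase 1: remember the current routes.
local route_map_old = table_copy(route_map)
-- ...BUCKET_SENT_GARBAGE_DELAY passes, a new redirect appears...
route_map[3] = 'rs2'
-- Phase 2: drop only the remembered routes; the fresh one survives
-- until the next expiry round.
table_minus(route_map, route_map_old)
assert(route_map[1] == nil and route_map[2] == nil)
assert(route_map[3] == 'rs2')
```

This is why the map cannot grow without bound: every route lives for at most two expiry rounds unless it is refreshed.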
Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Recovery is a fiber on a master node which tries to resolve
> SENDING/RECEIVING buckets into GARBAGE or ACTIVE, in case they are
> stuck. Usually it happens due to a conflict on the receiving side,
> or if a restart happens during bucket send.
>
> Recovery was proactive. It used to wake up with a constant period
> to find and resolve the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> recovery algorithm if a bucket is broken, it won't be recovered
> for the next 5 seconds by default. During this time all new
> map-reduce requests can't execute.
>
> This is not acceptable. As well as too frequent wakeup of recovery
> fiber because it would waste TX thread time.
>
> The patch makes the recovery fiber wake up not by a timeout but by
> events happening with _bucket space. Recovery fiber sleeps on a
> condition variable which is signaled when _bucket is changed.
>
> This is very similar to the reactive GC feature in a previous
> commit.
>
> It is worth mentioning that the backoff happens not only when a
> bucket couldn't be recovered (its transfer is still in progress,
> for example), but also when a network error happened and recovery
> couldn't check state of the bucket on the other storage.
>
> It would be a useless busy loop to retry network errors
> immediately after their appearance. Recovery uses a backoff
> interval for them as well.
>
> Needed for #147
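To summarize the policy described above, here is a condensed pure-Lua model of how the fiber picks its next sleep time after one recovery pass (the names mirror the diff below, but this standalone sketch is illustrative, not the patch code):

```lua
-- Model of the recovery fiber's sleep-time choice after one pass. In the
-- real fiber the result is passed to M.bucket_generation_cond:wait(), so
-- 'infinity' effectively means "until the next _bucket change".
local TIMEOUT_INFINITY = 500 * 365 * 86400
local RECOVERY_BACKOFF_INTERVAL = 5

local function next_sleep_time(is_all_recovered, gen_recovered, gen_current)
    if not is_all_recovered then
        -- Transfers still in progress or network errors: back off,
        -- retrying immediately would be a busy loop.
        return RECOVERY_BACKOFF_INTERVAL
    elseif gen_recovered ~= gen_current then
        -- _bucket changed while recovering: retry right away.
        return 0
    end
    -- Everything is recovered: sleep until _bucket changes again.
    return TIMEOUT_INFINITY
end

assert(next_sleep_time(false, 1, 1) == RECOVERY_BACKOFF_INTERVAL)
assert(next_sleep_time(true, 1, 2) == 0)
assert(next_sleep_time(true, 2, 2) == TIMEOUT_INFINITY)
```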
> ---
> test/router/router.result | 22 ++++++++---
> test/router/router.test.lua | 13 ++++++-
> test/storage/recovery.result | 8 ++++
> test/storage/recovery.test.lua | 5 +++
> test/storage/recovery_errinj.result | 16 +++++++-
> test/storage/recovery_errinj.test.lua | 9 ++++-
> vshard/consts.lua | 2 +-
> vshard/storage/init.lua | 54 +++++++++++++++++++++++----
> 8 files changed, 110 insertions(+), 19 deletions(-)
>
> diff --git a/test/router/router.result b/test/router/router.result
> index b2efd6d..3c1d073 100644
> --- a/test/router/router.result
> +++ b/test/router/router.result
> @@ -312,6 +312,11 @@ replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
> _ = test_run:switch('storage_2_a')
> ---
> ...
> +-- Pause recovery. It is too aggressive, and the test needs to see buckets in
> +-- their intermediate states.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
> ---
> - [1, 'sending', '<replicaset_1>']
> @@ -319,6 +324,9 @@ box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]}
> _ = test_run:switch('storage_1_a')
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
> ---
> - [1, 'receiving', '<replicaset_2>']
> @@ -342,19 +350,21 @@ util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
> name: TRANSFER_IS_IN_PROGRESS
> message: Bucket 1 is transferring to replicaset <replicaset_1>
> ...
> -_ = test_run:switch('storage_2_a')
> +_ = test_run:switch('storage_1_a')
> +---
> +...
> +box.space._bucket:delete({1})
> ---
> +- [1, 'receiving', '<replicaset_2>']
> ...
> -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> ---
> -- [1, 'active']
> ...
> -_ = test_run:switch('storage_1_a')
> +_ = test_run:switch('storage_2_a')
> ---
> ...
> -box.space._bucket:delete({1})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> ---
> -- [1, 'receiving', '<replicaset_2>']
> ...
> _ = test_run:switch('router_1')
> ---
> diff --git a/test/router/router.test.lua b/test/router/router.test.lua
> index 154310b..aa3eb3b 100644
> --- a/test/router/router.test.lua
> +++ b/test/router/router.test.lua
> @@ -114,19 +114,28 @@ replicaset, err = vshard.router.bucket_discovery(1); return err == nil or err
> replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
>
> _ = test_run:switch('storage_2_a')
> +-- Pause recovery. It is too aggressive, and the test needs to see buckets in
> +-- their intermediate states.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
> +
> _ = test_run:switch('storage_1_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
> +
> _ = test_run:switch('router_1')
> -- Ok to read sending bucket.
> vshard.router.call(1, 'read', 'echo', {123})
> -- Not ok to write sending bucket.
> util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
>
> -_ = test_run:switch('storage_2_a')
> -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
> _ = test_run:switch('storage_1_a')
> box.space._bucket:delete({1})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +
> +_ = test_run:switch('storage_2_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +
> _ = test_run:switch('router_1')
>
> -- Check unavailability of master of a replicaset.
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index 8ccb0b9..fa92bca 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -28,12 +28,20 @@ util.push_rs_filters(test_run)
> _ = test_run:switch("storage_2_a")
> ---
> ...
> +-- Pause until restart. Otherwise recovery does its job too fast and does not
> +-- allow the test to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> vshard.storage.rebalancer_disable()
> ---
> ...
> _ = test_run:switch("storage_1_a")
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> -- Create buckets sending to rs2 and restart - recovery must
> -- garbage some of them and activate others. Receiving buckets
> -- must be garbaged on bootstrap.
> diff --git a/test/storage/recovery.test.lua b/test/storage/recovery.test.lua
> index a0651e8..93cec68 100644
> --- a/test/storage/recovery.test.lua
> +++ b/test/storage/recovery.test.lua
> @@ -10,8 +10,13 @@ util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
> util.push_rs_filters(test_run)
>
> _ = test_run:switch("storage_2_a")
> +-- Pause until restart. Otherwise recovery does its job too fast and does not
> +-- allow the test to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> vshard.storage.rebalancer_disable()
> +
> _ = test_run:switch("storage_1_a")
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
>
> -- Create buckets sending to rs2 and restart - recovery must
> -- garbage some of them and activate others. Receiving buckets
> diff --git a/test/storage/recovery_errinj.result b/test/storage/recovery_errinj.result
> index 3e9a9bf..8c178d5 100644
> --- a/test/storage/recovery_errinj.result
> +++ b/test/storage/recovery_errinj.result
> @@ -35,9 +35,17 @@ _ = test_run:switch('storage_2_a')
> vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
> ---
> ...
> +-- Pause recovery. Otherwise it does its job too fast and does not allow the
> +-- test to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> _ = test_run:switch('storage_1_a')
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> _bucket = box.space._bucket
> ---
> ...
> @@ -76,10 +84,16 @@ _bucket:get{1}
> ---
> - [1, 'active']
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +---
> +...
> _ = test_run:switch('storage_1_a')
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +---
> +...
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch("default")
> diff --git a/test/storage/recovery_errinj.test.lua b/test/storage/recovery_errinj.test.lua
> index 8c1a9d2..c730560 100644
> --- a/test/storage/recovery_errinj.test.lua
> +++ b/test/storage/recovery_errinj.test.lua
> @@ -14,7 +14,12 @@ util.push_rs_filters(test_run)
> --
> _ = test_run:switch('storage_2_a')
> vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
> +-- Pause recovery. Otherwise it does its job too fast and does not allow the
> +-- test to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +
> _ = test_run:switch('storage_1_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> _bucket = box.space._bucket
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE, util.replicasets[2]}
> ret, err = vshard.storage.bucket_send(1, util.replicasets[2], {timeout = 0.1})
> @@ -27,9 +32,11 @@ vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = false
> _bucket = box.space._bucket
> while _bucket:get{1}.status ~= vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.01) end
> _bucket:get{1}
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
>
> _ = test_run:switch('storage_1_a')
> -while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +wait_bucket_is_collected(1)
>
> _ = test_run:switch("default")
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 3f1585a..cf3f422 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -39,7 +39,7 @@ return {
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> GC_BACKOFF_INTERVAL = 5,
> - RECOVERY_INTERVAL = 5;
> + RECOVERY_BACKOFF_INTERVAL = 5,
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> DISCOVERY_IDLE_INTERVAL = 10,
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 31a6fc7..85f5024 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -634,13 +634,16 @@ end
> -- Infinite function to resolve status of buckets, whose 'sending'
> -- has failed due to tarantool or network problems. Restarts on
> -- reload.
> --- @param module_version Module version, on which the current
> --- function had been started. If the actual module version
> --- appears to be changed, then stop recovery. It is
> --- restarted in reloadable_fiber.
> --
> local function recovery_f()
> local module_version = M.module_version
> + -- Changes of _bucket increment bucket generation. Recovery has its own
> + -- bucket generation which is <= actual. Recovery is finished, when its
> + -- generation == bucket generation. In such a case the fiber does nothing
> + -- until next _bucket change.
> + local bucket_generation_recovered = -1
> + local bucket_generation_current = M.bucket_generation
> + local ok, sleep_time, is_all_recovered, total, recovered
> -- Interrupt recovery if a module has been reloaded. Perhaps,
> -- there was found a bug, and reload fixes it.
> while module_version == M.module_version do
> @@ -648,22 +651,57 @@ local function recovery_f()
> lfiber.yield()
> goto continue
> end
> - local ok, total, recovered = pcall(recovery_step_by_type,
> - consts.BUCKET.SENDING)
> + is_all_recovered = true
> + if bucket_generation_recovered == bucket_generation_current then
> + goto sleep
> + end
> +
> + ok, total, recovered = pcall(recovery_step_by_type,
> + consts.BUCKET.SENDING)
> if not ok then
> + is_all_recovered = false
> log.error('Error during sending buckets recovery: %s', total)
> + elseif total ~= recovered then
> + is_all_recovered = false
> end
> +
> ok, total, recovered = pcall(recovery_step_by_type,
> consts.BUCKET.RECEIVING)
> if not ok then
> + is_all_recovered = false
> log.error('Error during receiving buckets recovery: %s', total)
> elseif total == 0 then
> bucket_receiving_quota_reset()
> else
> bucket_receiving_quota_add(recovered)
> + if total ~= recovered then
> + is_all_recovered = false
> + end
> + end
> +
> + ::sleep::
> + if not is_all_recovered then
> + bucket_generation_recovered = -1
> + else
> + bucket_generation_recovered = bucket_generation_current
> + end
> + bucket_generation_current = M.bucket_generation
> +
> + if not is_all_recovered then
> + -- One option - some buckets are not broken. Their transmission is
> + -- still in progress. No need to retry immediately. Another
> + -- option - network errors happened when trying to repair the buckets. Also no
> + -- need to retry often. It won't help.
> + sleep_time = consts.RECOVERY_BACKOFF_INTERVAL
> + elseif bucket_generation_recovered ~= bucket_generation_current then
> + sleep_time = 0
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> + end
> + if module_version == M.module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> end
> - lfiber.sleep(consts.RECOVERY_INTERVAL)
> - ::continue::
> + ::continue::
> end
> end
>
Thanks for your patch. Shouldn't it be added to storage "MODULE_INTERNALS"?
LGTM. One comment below.

On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Lua does not have a built-in standard library for binary heaps
> (also called priority queues). There is an implementation in
> Tarantool core in libsalad, but it is in C.
>
> Heap is a perfect storage for the soon coming feature map-reduce.
> In the map-reduce algorithm it will be necessary to be able to
> lock an entire storage against any bucket moves for time <=
> specified timeout. Number of map-reduce requests can be big, and
> they can have different timeouts.
>
> So there is a pile of timeouts from different requests. It is
> necessary to be able to quickly add new ones, be able to delete
> random ones, and remove expired ones.
>
> One way would be a sorted array of the deadlines. Unfortunately,
> it is super slow. O(N + log(N)) to add a new element (find place
> for log(N) and move all next elements for N), O(N) to delete a
> random one (move all next elements one cell left/right).
>
> Another way would be a sorted tree. But trees like RB or a dumb
> binary tree require extra steps to keep them balanced and to have
> access to the smallest element ASAP.
>
> The best way is the binary heap. It is perfectly balanced by
> design meaning that all operations there have complexity at most
> O(log(N)). It is possible to find the closest deadline for
> constant time as it is the heap's top.
>
> This patch implements it. The heap is intrusive. It means it
> stores index of each element right inside of the element as a
> field 'index'. Having an index along with each element allows to
> delete it from the heap for O(log(N)) without necessity to look
> its place up first.
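For context, this is how the intrusive API is expected to serve the deadline pile described in the message (a sketch assuming the `vshard.heap` module added by this patch; the request objects and field names besides `index` are hypothetical):

```lua
local heap = require('vshard.heap')  -- the module added by this patch

-- A pile of map-reduce deadlines: the closest one is always on top.
local h = heap.new(function(l, r) return l.deadline < r.deadline end)
local req1 = {deadline = 10}
local req2 = {deadline = 5}
h:push(req1)
h:push(req2)
-- O(1) access to the closest deadline.
assert(h:top() == req2)
-- The intrusive 'index' field lets a random request be dropped in
-- O(log(N)) without searching for its position first.
h:remove(req1)
assert(h:count() == 1)
```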
>
> Part of #147
> ---
>  test/unit-tap/heap.test.lua | 310 ++++++++++++++++++++++++++++++++++++
>  test/unit-tap/suite.ini     |   4 +
>  vshard/heap.lua             | 226 ++++++++++++++++++++
>  3 files changed, 540 insertions(+)
>  create mode 100755 test/unit-tap/heap.test.lua
>  create mode 100644 test/unit-tap/suite.ini
>  create mode 100644 vshard/heap.lua
>
> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
> new file mode 100755
> index 0000000..8c3819f
> --- /dev/null
> +++ b/test/unit-tap/heap.test.lua
> @@ -0,0 +1,310 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +local test = tap.test("cfg")
> +local heap = require('vshard.heap')
> +

Maybe it's better to use single brackets everywhere: test("cfg") ->
test('cfg'). Or does such difference have some sense?

> +--
> +-- Max number of heap to test. Number of iterations in the test
> +-- grows as a factorial of this value. At 10 the test becomes
> +-- too long already.
> +--
> +local heap_size = 8
> +
> +--
> +-- Type of the object stored in the intrusive heap.
> +--
> +local function min_heap_cmp(l, r)
> +    return l.value < r.value
> +end
> +
> +local function max_heap_cmp(l, r)
> +    return l.value > r.value
> +end
> +
> +local function new_object(value)
> +    return {value = value}
> +end
> +
> +local function heap_check_indexes(heap)
> +    local count = heap:count()
> +    local data = heap.data
> +    for i = 1, count do
> +        assert(data[i].index == i)
> +    end
> +end
> +
> +local function reverse(values, i1, i2)
> +    while i1 < i2 do
> +        values[i1], values[i2] = values[i2], values[i1]
> +        i1 = i1 + 1
> +        i2 = i2 - 1
> +    end
> +end
> +
> +--
> +-- Implementation of std::next_permutation() from C++.
> +--
> +local function next_permutation(values)
> +    local count = #values
> +    if count <= 1 then
> +        return false
> +    end
> +    local i = count
> +    while true do
> +        local j = i
> +        i = i - 1
> +        if values[i] < values[j] then
> +            local k = count
> +            while values[i] >= values[k] do
> +                k = k - 1
> +            end
> +            values[i], values[k] = values[k], values[i]
> +            reverse(values, j, count)
> +            return true
> +        end
> +        if i == 1 then
> +            reverse(values, 1, count)
> +            return false
> +        end
> +    end
> +end
> +
> +local function range(count)
> +    local res = {}
> +    for i = 1, count do
> +        res[i] = i
> +    end
> +    return res
> +end
> +
> +--
> +-- Min heap fill and empty.
> +--
> +local function test_min_heap_basic(test)
> +    test:plan(1)
> +
> +    local h = heap.new(min_heap_cmp)
> +    assert(not h:pop())
> +    assert(h:count() == 0)
> +    local values = {}
> +    for i = 1, heap_size do
> +        values[i] = new_object(i)
> +    end
> +    for counti = 1, heap_size do
> +        local indexes = range(counti)
> +        repeat
> +            for i = 1, counti do
> +                h:push(values[indexes[i]])
> +            end
> +            heap_check_indexes(h)
> +            assert(h:count() == counti)
> +            for i = 1, counti do
> +                assert(h:top() == values[i])
> +                assert(h:pop() == values[i])
> +                heap_check_indexes(h)
> +            end
> +            assert(not h:pop())
> +            assert(h:count() == 0)
> +        until not next_permutation(indexes)
> +    end
> +
> +    test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Max heap fill and empty.
> +--
> +local function test_max_heap_basic(test)
> +    test:plan(1)
> +
> +    local h = heap.new(max_heap_cmp)
> +    assert(not h:pop())
> +    assert(h:count() == 0)
> +    local values = {}
> +    for i = 1, heap_size do
> +        values[i] = new_object(heap_size - i + 1)
> +    end
> +    for counti = 1, heap_size do
> +        local indexes = range(counti)
> +        repeat
> +            for i = 1, counti do
> +                h:push(values[indexes[i]])
> +            end
> +            heap_check_indexes(h)
> +            assert(h:count() == counti)
> +            for i = 1, counti do
> +                assert(h:top() == values[i])
> +                assert(h:pop() == values[i])
> +                heap_check_indexes(h)
> +            end
> +            assert(not h:pop())
> +            assert(h:count() == 0)
> +        until not next_permutation(indexes)
> +    end
> +
> +    test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Min heap update top element.
> +--
> +local function test_min_heap_update_top(test)
> +    test:plan(1)
> +
> +    local h = heap.new(min_heap_cmp)
> +    for counti = 1, heap_size do
> +        local indexes = range(counti)
> +        repeat
> +            local values = {}
> +            for i = 1, counti do
> +                values[i] = new_object(0)
> +                h:push(values[i])
> +            end
> +            heap_check_indexes(h)
> +            for i = 1, counti do
> +                h:top().value = indexes[i]
> +                h:update_top()
> +            end
> +            heap_check_indexes(h)
> +            assert(h:count() == counti)
> +            for i = 1, counti do
> +                assert(h:top().value == i)
> +                assert(h:pop().value == i)
> +                heap_check_indexes(h)
> +            end
> +            assert(not h:pop())
> +            assert(h:count() == 0)
> +        until not next_permutation(indexes)
> +    end
> +
> +    test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Min heap update all elements in all possible positions.
> +--
> +local function test_min_heap_update(test)
> +    test:plan(1)
> +
> +    local h = heap.new(min_heap_cmp)
> +    for counti = 1, heap_size do
> +        for srci = 1, counti do
> +            local endv = srci * 10 + 5
> +            for newv = 5, endv, 5 do
> +                local values = {}
> +                for i = 1, counti do
> +                    values[i] = new_object(i * 10)
> +                    h:push(values[i])
> +                end
> +                heap_check_indexes(h)
> +                local obj = values[srci]
> +                obj.value = newv
> +                h:update(obj)
> +                assert(obj.index >= 1)
> +                assert(obj.index <= counti)
> +                local prev = -1
> +                for i = 1, counti do
> +                    obj = h:pop()
> +                    assert(obj.index == -1)
> +                    assert(obj.value >= prev)
> +                    assert(obj.value >= 1)
> +                    prev = obj.value
> +                    obj.value = -1
> +                    heap_check_indexes(h)
> +                end
> +                assert(not h:pop())
> +                assert(h:count() == 0)
> +            end
> +        end
> +    end
> +
> +    test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Max heap delete all elements from all possible positions.
> +--
> +local function test_max_heap_delete(test)
> +    test:plan(1)
> +
> +    local h = heap.new(max_heap_cmp)
> +    local inf = heap_size + 1
> +    for counti = 1, heap_size do
> +        for srci = 1, counti do
> +            local values = {}
> +            for i = 1, counti do
> +                values[i] = new_object(i)
> +                h:push(values[i])
> +            end
> +            heap_check_indexes(h)
> +            local obj = values[srci]
> +            obj.value = inf
> +            h:remove(obj)
> +            assert(obj.index == -1)
> +            local prev = inf
> +            for i = 2, counti do
> +                obj = h:pop()
> +                assert(obj.index == -1)
> +                assert(obj.value < prev)
> +                assert(obj.value >= 1)
> +                prev = obj.value
> +                obj.value = -1
> +                heap_check_indexes(h)
> +            end
> +            assert(not h:pop())
> +            assert(h:count() == 0)
> +        end
> +    end
> +
> +    test:ok(true, "no asserts")
> +end
> +
> +local function test_min_heap_remove_top(test)
> +    test:plan(1)
> +
> +    local h = heap.new(min_heap_cmp)
> +    for i = 1, heap_size do
> +        h:push(new_object(i))
> +    end
> +    for i = 1, heap_size do
> +        assert(h:top().value == i)
> +        h:remove_top()
> +    end
> +    assert(h:count() == 0)
> +
> +    test:ok(true, "no asserts")
> +end
> +
+ > +local function test_max_heap_remove_try(test) > + test:plan(1) > + > + local h = heap.new(max_heap_cmp) > + local obj = new_object(1) > + assert(obj.index == nil) > + h:remove_try(obj) > + assert(h:count() == 0) > + > + h:push(obj) > + h:push(new_object(2)) > + assert(obj.index == 2) > + h:remove(obj) > + assert(obj.index == -1) > + h:remove_try(obj) > + assert(obj.index == -1) > + assert(h:count() == 1) > + > + test:ok(true, "no asserts") > +end > + > +test:plan(7) > + > +test:test('min_heap_basic', test_min_heap_basic) > +test:test('max_heap_basic', test_max_heap_basic) > +test:test('min_heap_update_top', test_min_heap_update_top) > +test:test('min heap update', test_min_heap_update) > +test:test('max heap delete', test_max_heap_delete) > +test:test('min heap remove top', test_min_heap_remove_top) > +test:test('max heap remove try', test_max_heap_remove_try) > + > +os.exit(test:check() and 0 or 1) > diff --git a/test/unit-tap/suite.ini b/test/unit-tap/suite.ini > new file mode 100644 > index 0000000..f365b69 > --- /dev/null > +++ b/test/unit-tap/suite.ini > @@ -0,0 +1,4 @@ > +[default] > +core = app > +description = Unit tests TAP > +is_parallel = True > diff --git a/vshard/heap.lua b/vshard/heap.lua > new file mode 100644 > index 0000000..78c600a > --- /dev/null > +++ b/vshard/heap.lua > @@ -0,0 +1,226 @@ > +local math_floor = math.floor > + > +-- > +-- Implementation of a typical algorithm of the binary heap. > +-- The heap is intrusive - it stores index of each element inside of it. It > +-- allows to update and delete elements in any place in the heap, not only top > +-- elements. > +-- > + > +local function heap_parent_index(index) > + return math_floor(index / 2) > +end > + > +local function heap_left_child_index(index) > + return index * 2 > +end > + > +-- > +-- Generate a new heap. > +-- > +-- The implementation is targeted on as few index accesses as possible. 
> +-- Everything what could be is stored as upvalue variables instead of as indexes > +-- in a table. What couldn't be an upvalue and is used in a function more than > +-- once is saved on the stack. > +-- > +local function heap_new(is_left_above) > + -- Having it as an upvalue allows not to do 'self.data' lookup in each > + -- function. > + local data = {} > + -- Saves #data calculation. In Lua it is not just reading a number. > + local count = 0 > + > + local function heap_update_index_up(idx) > + if idx == 1 then > + return false > + end > + > + local orig_idx = idx > + local value = data[idx] > + local pidx = heap_parent_index(idx) > + local parent = data[pidx] > + while is_left_above(value, parent) do > + data[idx] = parent > + parent.index = idx > + idx = pidx > + if idx == 1 then > + break > + end > + pidx = heap_parent_index(idx) > + parent = data[pidx] > + end > + > + if idx == orig_idx then > + return false > + end > + data[idx] = value > + value.index = idx > + return true > + end > + > + local function heap_update_index_down(idx) > + local left_idx = heap_left_child_index(idx) > + if left_idx > count then > + return false > + end > + > + local orig_idx = idx > + local left > + local right > + local right_idx = left_idx + 1 > + local top > + local top_idx > + local value = data[idx] > + repeat > + right_idx = left_idx + 1 > + if right_idx > count then > + top = data[left_idx] > + if is_left_above(value, top) then > + break > + end > + top_idx = left_idx > + else > + left = data[left_idx] > + right = data[right_idx] > + if is_left_above(left, right) then > + if is_left_above(value, left) then > + break > + end > + top_idx = left_idx > + top = left > + else > + if is_left_above(value, right) then > + break > + end > + top_idx = right_idx > + top = right > + end > + end > + > + data[idx] = top > + top.index = idx > + idx = top_idx > + left_idx = heap_left_child_index(idx) > + until left_idx > count > + > + if idx == orig_idx then > + return false > + end > 
+ data[idx] = value > + value.index = idx > + return true > + end > + > + local function heap_update_index(idx) > + if not heap_update_index_up(idx) then > + heap_update_index_down(idx) > + end > + end > + > + local function heap_push(self, value) > + count = count + 1 > + data[count] = value > + value.index = count > + heap_update_index_up(count) > + end > + > + local function heap_update_top(self) > + heap_update_index_down(1) > + end > + > + local function heap_update(self, value) > + heap_update_index(value.index) > + end > + > + local function heap_remove_top(self) > + if count == 0 then > + return > + end > + data[1].index = -1 > + if count == 1 then > + data[1] = nil > + count = 0 > + return > + end > + local value = data[count] > + data[count] = nil > + data[1] = value > + value.index = 1 > + count = count - 1 > + heap_update_index_down(1) > + end > + > + local function heap_remove(self, value) > + local idx = value.index > + value.index = -1 > + if idx == count then > + data[count] = nil > + count = count - 1 > + return > + end > + value = data[count] > + data[idx] = value > + data[count] = nil > + value.index = idx > + count = count - 1 > + heap_update_index(idx) > + end > + > + local function heap_remove_try(self, value) > + local idx = value.index > + if idx and idx > 0 then > + heap_remove(self, value) > + end > + end > + > + local function heap_pop(self) > + if count == 0 then > + return > + end > + -- Some duplication from remove_top, but allows to save a few > + -- condition checks, index accesses, and a function call. 
> + local res = data[1] > + res.index = -1 > + if count == 1 then > + data[1] = nil > + count = 0 > + return res > + end > + local value = data[count] > + data[count] = nil > + data[1] = value > + value.index = 1 > + count = count - 1 > + heap_update_index_down(1) > + return res > + end > + > + local function heap_top(self) > + return data[1] > + end > + > + local function heap_count(self) > + return count > + end > + > + return setmetatable({ > + -- Expose the data. For testing. > + data = data, > + }, { > + __index = { > + push = heap_push, > + update_top = heap_update_top, > + remove_top = heap_remove_top, > + pop = heap_pop, > + update = heap_update, > + remove = heap_remove, > + remove_try = heap_remove_try, > + top = heap_top, > + count = heap_count, > + } > + }) > +end > + > +return { > + new = heap_new, > +}
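The heap above is intrusive: it keeps each object's position in the object's own 'index' field, which is what makes update()/remove() of arbitrary elements O(log n). The core of it is plain 1-based binary heap arithmetic: parent(i) = floor(i / 2), left child(i) = 2 * i. Below is a stripped-down, non-intrusive sketch of the same sift-up/sift-down logic over plain numbers, just to illustrate the arithmetic; it is not the vshard module itself and omits the 'index' bookkeeping.

```lua
-- Minimal min-heap sketch using the same 1-based index arithmetic as
-- vshard/heap.lua: parent(i) = floor(i / 2), left child(i) = 2 * i.
local function heap_push(data, value)
    local idx = #data + 1
    data[idx] = value
    -- Sift up: swap with the parent while the order is violated.
    while idx > 1 do
        local pidx = math.floor(idx / 2)
        if data[pidx] <= data[idx] then
            break
        end
        data[pidx], data[idx] = data[idx], data[pidx]
        idx = pidx
    end
end

local function heap_pop(data)
    local count = #data
    if count == 0 then
        return nil
    end
    local res = data[1]
    data[1] = data[count]
    data[count] = nil
    count = count - 1
    -- Sift down: swap with the smaller child while needed.
    local idx = 1
    while true do
        local left = idx * 2
        if left > count then
            break
        end
        local top = left
        local right = left + 1
        if right <= count and data[right] < data[left] then
            top = right
        end
        if data[idx] <= data[top] then
            break
        end
        data[idx], data[top] = data[top], data[idx]
        idx = top
    end
    return res
end

local h = {}
for _, v in ipairs({5, 3, 8, 1, 2}) do
    heap_push(h, v)
end
local out = {}
while #h > 0 do
    out[#out + 1] = heap_pop(h)
end
print(table.concat(out, ', ')) -- 1, 2, 3, 5, 8
```

The real module additionally writes value.index on every move, so a caller can later pass the object itself to update()/remove() without searching for it.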
Hi! Thanks for the review!

On 10.02.2021 09:57, Oleg Babin via Tarantool-patches wrote:
> Thanks for your patch. LGTM except two nits:
>
> - Seems you need to put "Closes #246"

Indeed. I had a feeling that I saw this clock task somewhere.

> - Tarantool has "clock" module. I suggest to use "fiber_clock()" instead of simple "clock" to avoid possible confusion.

Both comments are fixed. The new patch is below. There is no diff, because it is big and obvious: a plain rename.

====================

Use fiber.clock() instead of .time() everywhere

Fiber.time() returns real time. It is affected by time corrections
in the system and may not be monotonic. The patch makes everything
in vshard use fiber.clock() instead of fiber.time().

Also, the fiber.clock function is saved as an upvalue for all
functions in all modules using it. This makes the code a bit
shorter and saves one indexing of the 'fiber' table.

The main reason is that the future map-reduce feature will use the
current time quite often. In some places it probably will be the
slowest action (given how slow FFI can be when not compiled by
JIT).

Needed for #147
Closes #246

diff --git a/test/failover/failover.result b/test/failover/failover.result
index 452694c..bae57fa 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
 ---
 - true
 ...
-ts1 = fiber.time()
+ts1 = fiber.clock()
 ---
 ...
 while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
 ---
 ...
-ts2 = fiber.time()
+ts2 = fiber.clock()
 ---
 ...
 ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 13c517b..a969e0e 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -109,9 +109,9 @@ test_run:switch('router_1')
 -- Revive the best replica. A router must reconnect to it in
 -- FAILOVER_UP_TIMEOUT seconds.
test_run:cmd('start server box_1_d') -ts1 = fiber.time() +ts1 = fiber.clock() while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end -ts2 = fiber.time() +ts2 = fiber.clock() ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT test_run:grep_log('router_1', 'New replica box_1_d%(storage%@') diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua index b13d05e..9c792b3 100644 --- a/vshard/replicaset.lua +++ b/vshard/replicaset.lua @@ -54,6 +54,7 @@ local luri = require('uri') local luuid = require('uuid') local ffi = require('ffi') local util = require('vshard.util') +local fiber_clock = fiber.clock local gsc = util.generate_self_checker -- @@ -88,7 +89,7 @@ local function netbox_on_connect(conn) -- biggest priority. Really, it is not neccessary to -- increase replica connection priority, if the current -- one already has the biggest priority. (See failover_f). - rs.replica_up_ts = fiber.time() + rs.replica_up_ts = fiber_clock() end end @@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn) assert(conn.replica) -- Replica is down - remember this time to decrease replica -- priority after FAILOVER_DOWN_TIMEOUT seconds. 
- conn.replica.down_ts = fiber.time() + conn.replica.down_ts = fiber_clock() end -- @@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset) local old_replica = replicaset.replica if old_replica == replicaset.priority_list[1] and old_replica:is_connected() then - replicaset.replica_up_ts = fiber.time() + replicaset.replica_up_ts = fiber_clock() return end for _, replica in pairs(replicaset.priority_list) do @@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance) net_status, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) end - local end_time = fiber.time() + timeout + local end_time = fiber_clock() + timeout while not net_status and timeout > 0 do replica, err = pick_next_replica(replicaset) if not replica then @@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance) opts.timeout = timeout net_status, storage_status, retval, err = replica_call(replica, func, args, opts) - timeout = end_time - fiber.time() + timeout = end_time - fiber_clock() if not net_status and not storage_status and not can_retry_after_error(retval) then -- There is no sense to retry LuaJit errors, such as @@ -680,7 +681,7 @@ local function buildall(sharding_cfg) else zone_weights = {} end - local curr_ts = fiber.time() + local curr_ts = fiber_clock() for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do local new_replicaset = setmetatable({ replicas = {}, diff --git a/vshard/router/init.lua b/vshard/router/init.lua index ba1f863..eeb7515 100644 --- a/vshard/router/init.lua +++ b/vshard/router/init.lua @@ -1,6 +1,7 @@ local log = require('log') local lfiber = require('fiber') local table_new = require('table.new') +local fiber_clock = lfiber.clock local MODULE_INTERNALS = '__module_vshard_router' -- Reload requirements, in case this module is reloaded manually. 
@@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, end local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN local replicaset, err - local tend = lfiber.time() + timeout + local tend = fiber_clock() + timeout if bucket_id > router.total_bucket_count or bucket_id <= 0 then error('Bucket is unreachable: bucket id is out of range') end @@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, replicaset, err = bucket_resolve(router, bucket_id) if replicaset then ::replicaset_is_found:: - opts.timeout = tend - lfiber.time() + opts.timeout = tend - fiber_clock() local storage_call_status, call_status, call_error = replicaset[call](replicaset, 'vshard.storage.call', {bucket_id, mode, func, args}, opts) @@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, -- if reconfiguration had been started, -- and while is not executed on router, -- but already is executed on storages. - while lfiber.time() <= tend do + while fiber_clock() <= tend do lfiber.sleep(0.05) replicaset = router.replicasets[err.destination] if replicaset then @@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, -- case of broken cluster, when a bucket -- is sent on two replicasets to each -- other. - if replicaset and lfiber.time() <= tend then + if replicaset and fiber_clock() <= tend then goto replicaset_is_found end end @@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica, end end lfiber.yield() - until lfiber.time() > tend + until fiber_clock() > tend if err then return nil, err else @@ -749,7 +750,7 @@ end -- connections must be updated. 
-- local function failover_collect_to_update(router) - local ts = lfiber.time() + local ts = fiber_clock() local uuid_to_update = {} for uuid, rs in pairs(router.replicasets) do if failover_need_down_priority(rs, ts) or @@ -772,7 +773,7 @@ local function failover_step(router) if #uuid_to_update == 0 then return false end - local curr_ts = lfiber.time() + local curr_ts = fiber_clock() local replica_is_changed = false for _, uuid in pairs(uuid_to_update) do local rs = router.replicasets[uuid] @@ -1230,8 +1231,7 @@ local function router_sync(router, timeout) timeout = router.sync_timeout end local arg = {timeout} - local clock = lfiber.clock - local deadline = timeout and (clock() + timeout) + local deadline = timeout and (fiber_clock() + timeout) local opts = {timeout = timeout} for rs_uuid, replicaset in pairs(router.replicasets) do if timeout < 0 then @@ -1244,7 +1244,7 @@ local function router_sync(router, timeout) err.replicaset = rs_uuid return nil, err end - timeout = deadline - clock() + timeout = deadline - fiber_clock() arg[1] = timeout opts.timeout = timeout end diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 1b48bf1..38cdf19 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self() local trigger = require('internal.trigger') local ffi = require('ffi') local yaml_encode = require('yaml').encode +local fiber_clock = lfiber.clock local MODULE_INTERNALS = '__module_vshard_storage' -- Reload requirements, in case this module is reloaded manually. 
@@ -695,7 +696,7 @@ local function sync(timeout) log.debug("Synchronizing replicaset...") timeout = timeout or M.sync_timeout local vclock = box.info.vclock - local tstart = lfiber.time() + local tstart = fiber_clock() repeat local done = true for _, replica in ipairs(box.info.replication) do @@ -711,7 +712,7 @@ local function sync(timeout) return true end lfiber.sleep(0.001) - until not (lfiber.time() <= tstart + timeout) + until fiber_clock() > tstart + timeout log.warn("Timed out during synchronizing replicaset") local ok, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) @@ -1280,10 +1281,11 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard) ref.rw_lock = true exception_guard.ref = ref exception_guard.drop_rw_lock = true - local deadline = lfiber.clock() + (opts and opts.timeout or 10) + local timeout = opts and opts.timeout or 10 + local deadline = fiber_clock() + timeout while ref.rw ~= 0 do - if not M.bucket_rw_lock_is_ready_cond:wait(deadline - - lfiber.clock()) then + timeout = deadline - fiber_clock() + if not M.bucket_rw_lock_is_ready_cond:wait(timeout) then status, err = pcall(box.error, box.error.TIMEOUT) return nil, lerror.make(err) end @@ -1579,7 +1581,7 @@ function gc_bucket_f() -- specified time interval the buckets are deleted both from -- this array and from _bucket space. local buckets_for_redirect = {} - local buckets_for_redirect_ts = lfiber.time() + local buckets_for_redirect_ts = fiber_clock() -- Empty sent buckets, updated after each step, and when -- buckets_for_redirect is deleted, it gets empty_sent_buckets -- for next deletion. 
@@ -1614,7 +1616,7 @@ function gc_bucket_f() end end - if lfiber.time() - buckets_for_redirect_ts >= + if fiber_clock() - buckets_for_redirect_ts >= consts.BUCKET_SENT_GARBAGE_DELAY then status, err = gc_bucket_drop(buckets_for_redirect, consts.BUCKET.SENT) @@ -1629,7 +1631,7 @@ function gc_bucket_f() else buckets_for_redirect = empty_sent_buckets or {} empty_sent_buckets = nil - buckets_for_redirect_ts = lfiber.time() + buckets_for_redirect_ts = fiber_clock() end end ::continue::
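The rename above repeatedly applies one idiom: compute a deadline once from a monotonic clock, then shrink the per-call timeout on every retry (see router_call_impl and replicaset_template_multicallro). The sketch below condenses that idiom; it injects a fake monotonic clock in place of Tarantool's fiber.clock(), which is an assumption made only so the sketch runs outside of Tarantool.

```lua
-- The deadline idiom from the patch: take the monotonic clock once to
-- compute the deadline, pass the shrinking remainder to each attempt.
-- 'clock' stands in for fiber.clock(); here it is a fake monotonic
-- clock advanced by hand.
local now = 0
local function clock()
    return now
end

local function call_with_retries(func, timeout)
    local deadline = clock() + timeout
    while true do
        -- Each attempt gets only the time remaining until the deadline.
        local ok = func(deadline - clock())
        if ok then
            return true
        end
        if clock() >= deadline then
            return nil, 'timeout'
        end
    end
end

-- Each attempt consumes 2 "seconds" and fails; with a 5 second budget
-- the caller gets 3 attempts before hitting the deadline.
local attempts = 0
local ok, err = call_with_retries(function(timeout_left)
    attempts = attempts + 1
    now = now + 2
    return false
end, 5)
print(attempts, err) -- 3	timeout
```

With fiber.time() the same code could misbehave: an NTP correction moving real time backwards would extend the deadline instead of respecting it.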
Thanks for the review!
On 10.02.2021 09:57, Oleg Babin wrote:
> Hi! Thanks for your patch! LGTM but I have one question.
>
> Maybe it's reasonable to add some timeout in this function?
>
> AFAIK test-run terminates tests after 120 seconds of inactivity; that seems too long for such a simple case.
>
> But anyway it's up to you.
test_run:wait_cond() has a default timeout of 1 minute. I decided it
is fine.
Thanks for the review! >> diff --git a/test/unit/util.result b/test/unit/util.result >> index 096e36f..c4fd84d 100644 >> --- a/test/unit/util.result >> +++ b/test/unit/util.result >> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000) >> +do \ >> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ >> + f = fiber.create(function() \ >> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ >> + end) \ >> + yield_count = 0 \ >> + while f:status() ~= 'dead' do \ >> + yield_count = yield_count + 1 \ >> + fiber.yield() \ >> + end \ >> +end >> +--- > > > Why can't you use "csw" of fiber.self() instead? Also it's it reliable enough to simply count yields? Yup, will work too. See the diff below. ==================== diff --git a/test/unit/util.result b/test/unit/util.result index c4fd84d..42a361a 100644 --- a/test/unit/util.result +++ b/test/unit/util.result @@ -111,14 +111,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) ... do \ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + yield_count = 0 \ f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ end) \ - yield_count = 0 \ - while f:status() ~= 'dead' do \ - yield_count = yield_count + 1 \ - fiber.yield() \ - end \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ end --- ... @@ -151,14 +151,14 @@ copy_yield({k1 = 1, k2 = 2}, 1) do \ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ res = nil \ + yield_count = 0 \ f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ res = copy_yield(t, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ end) \ - yield_count = 0 \ - while f:status() ~= 'dead' do \ - yield_count = yield_count + 1 \ - fiber.yield() \ - end \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ end --- ... 
diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua index 4d6cbe9..9550a95 100644 --- a/test/unit/util.test.lua +++ b/test/unit/util.test.lua @@ -42,14 +42,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) do \ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + yield_count = 0 \ f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ end) \ - yield_count = 0 \ - while f:status() ~= 'dead' do \ - yield_count = yield_count + 1 \ - fiber.yield() \ - end \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ end yield_count t @@ -63,14 +63,14 @@ copy_yield({k1 = 1, k2 = 2}, 1) do \ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ res = nil \ + yield_count = 0 \ f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ res = copy_yield(t, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ end) \ - yield_count = 0 \ - while f:status() ~= 'dead' do \ - yield_count = yield_count + 1 \ - fiber.yield() \ - end \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ end yield_count t ==================== > Could scheduler skip this fiber at some loop iteration? In other words, won't this test be flaky? Nope. Unless the fiber is sleeping on some condition or for a timeout, a plain sleep(0) also known as fiber.yield() won't skip this fiber on the next iteration of the loop. But does not matter if csw is used to count the yields. Full new patch below. ==================== util: introduce yielding table functions The patch adds functions table_copy_yield and table_minus_yield. Yielding copy creates a duplicate of a table but yields every specified number of keys copied. Yielding minus removes matching key-value pairs specified in one table from another table. It yields every specified number of keys passed. 
The functions should help to process huge Lua tables (millions of elements and more). These are going to be used on the storage in the new GC algorithm. The algorithm will need to keep a route table on the storage, just like on the router, but with expiration time for the routes. Since bucket count can be millions, it means GC will potentially operate on a huge Lua table and could use some yields so as not to block TX thread for long. Needed for #147 diff --git a/test/unit/util.result b/test/unit/util.result index 096e36f..42a361a 100644 --- a/test/unit/util.result +++ b/test/unit/util.result @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000) fib:cancel() --- ... +-- Yielding table minus. +minus_yield = util.table_minus_yield +--- +... +minus_yield({}, {}, 1) +--- +- [] +... +minus_yield({}, {k = 1}, 1) +--- +- [] +... +minus_yield({}, {k = 1}, 0) +--- +- [] +... +minus_yield({k = 1}, {k = 1}, 0) +--- +- [] +... +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10) +--- +- k2: 2 +... +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10) +--- +- [] +... +-- Mismatching values are not deleted. +minus_yield({k1 = 1}, {k1 = 2}, 10) +--- +- k1: 1 +... +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) +--- +- k3: 3 + k2: 2 +... +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + yield_count = 0 \ + f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ + end) \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ +end +--- +... +yield_count +--- +- 2 +... +t +--- +- k4: 4 + k1: 1 +... +-- Yielding table copy. +copy_yield = util.table_copy_yield +--- +... +copy_yield({}, 1) +--- +- [] +... +copy_yield({k = 1}, 1) +--- +- k: 1 +... +copy_yield({k1 = 1, k2 = 2}, 1) +--- +- k1: 1 + k2: 2 +... 
+do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + res = nil \ + yield_count = 0 \ + f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ + res = copy_yield(t, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ + end) \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ +end +--- +... +yield_count +--- +- 2 +... +t +--- +- k3: 3 + k4: 4 + k1: 1 + k2: 2 +... +res +--- +- k3: 3 + k4: 4 + k1: 1 + k2: 2 +... +t ~= res +--- +- true +... diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua index 5f39e06..9550a95 100644 --- a/test/unit/util.test.lua +++ b/test/unit/util.test.lua @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function') while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end test_run:grep_log('default', 'reloadable_function has been started', 1000) fib:cancel() + +-- Yielding table minus. +minus_yield = util.table_minus_yield +minus_yield({}, {}, 1) +minus_yield({}, {k = 1}, 1) +minus_yield({}, {k = 1}, 0) +minus_yield({k = 1}, {k = 1}, 0) +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10) +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10) +-- Mismatching values are not deleted. +minus_yield({k1 = 1}, {k1 = 2}, 10) +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10) + +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + yield_count = 0 \ + f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ + end) \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ +end +yield_count +t + +-- Yielding table copy. 
+copy_yield = util.table_copy_yield +copy_yield({}, 1) +copy_yield({k = 1}, 1) +copy_yield({k1 = 1, k2 = 2}, 1) + +do \ + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \ + res = nil \ + yield_count = 0 \ + f = fiber.create(function() \ + local csw1 = fiber.info()[fiber.id()].csw \ + res = copy_yield(t, 2) \ + local csw2 = fiber.info()[fiber.id()].csw \ + yield_count = csw2 - csw1 \ + end) \ + test_run:wait_cond(function() return f:status() == 'dead' end) \ +end +yield_count +t +res +t ~= res diff --git a/vshard/util.lua b/vshard/util.lua index d3b4e67..2362607 100644 --- a/vshard/util.lua +++ b/vshard/util.lua @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need) return minor >= minor_need end +-- +-- Copy @a src table. Fiber yields every @a interval keys copied. +-- +local function table_copy_yield(src, interval) + local res = {} + -- Time-To-Yield. + local tty = interval + for k, v in pairs(src) do + res[k] = v + tty = tty - 1 + if tty <= 0 then + fiber.yield() + tty = interval + end + end + return res +end + +-- +-- Remove @a src keys from @a dst if their values match. Fiber yields every +-- @a interval iterations. +-- +local function table_minus_yield(dst, src, interval) + -- Time-To-Yield. + local tty = interval + for k, srcv in pairs(src) do + if dst[k] == srcv then + dst[k] = nil + end + tty = tty - 1 + if tty <= 0 then + fiber.yield() + tty = interval + end + end + return dst +end + return { tuple_extract_key = tuple_extract_key, reloadable_fiber_create = reloadable_fiber_create, @@ -160,4 +198,6 @@ return { async_task = async_task, internal = M, version_is_at_least = version_is_at_least, + table_copy_yield = table_copy_yield, + table_minus_yield = table_minus_yield, }
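The csw-based counting in the tests above works because every fiber.yield() is a context switch. The same chunked-iteration idea can be illustrated without a Tarantool fiber at all: the self-contained sketch below mirrors table_minus_yield() from the patch, with coroutine.yield() standing in for fiber.yield() so the yields can be counted by resuming a plain coroutine.

```lua
-- Self-contained illustration of the chunked iteration behind
-- table_minus_yield(): remove matching key-value pairs, yielding
-- every 'interval' keys visited so a huge table does not block.
local function table_minus_yield(dst, src, interval)
    -- Time-To-Yield counter.
    local tty = interval
    for k, srcv in pairs(src) do
        if dst[k] == srcv then
            dst[k] = nil
        end
        tty = tty - 1
        if tty <= 0 then
            coroutine.yield() -- fiber.yield() in the real module.
            tty = interval
        end
    end
    return dst
end

local t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4}
local co = coroutine.create(function()
    table_minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2)
end)
local yield_count = -1
repeat
    yield_count = yield_count + 1
    assert(coroutine.resume(co))
until coroutine.status(co) == 'dead'
-- 4 keys in src, a yield every 2 keys -> 2 yields. k2 and k3 matched
-- and are gone; k4 mismatched (4 ~= 444) and survived.
print(yield_count)
```

This reproduces the unit test's expectations: yield_count is 2 and t keeps only k1 and k4.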
Thanks for the review!

On 10.02.2021 09:59, Oleg Babin wrote:
> Thanks for your patch!
>
> Is it possible to extend the log message to "Option is deprecated and has no effect anymore"?

Good idea. See the diff in this commit.

====================
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 28c3400..f7d5dbc 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -61,7 +61,13 @@ local function validate_config(config, template, check_arg)
         local expected_type = template_value.type
         if template_value.is_deprecated then
             if value ~= nil then
-                log.warn('Option "%s" is deprecated', name)
+                local reason = template_value.reason
+                if reason then
+                    reason = '. '..reason
+                else
+                    reason = ''
+                end
+                log.warn('Option "%s" is deprecated'..reason, name)
             end
         elseif value == nil then
             if not template_value.is_optional then
====================

And in the next commit:

====================
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index f7dd4c1..63d5414 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -252,6 +252,7 @@ local cfg_template = {
     },
     collect_bucket_garbage_interval = {
         name = 'Garbage bucket collect interval', is_deprecated = true,
+        reason = 'Has no effect anymore'
     },
     collect_lua_garbage = {
         type = 'boolean', name = 'Garbage Lua collect necessity',
====================

> Also for some options could be useful: "Option is deprecated, use ... instead" (e.g. for "weights").

With the updated version I can specify any 'reason'. Such as 'has no effect anymore', 'use ... instead', etc.

> Seems it should be more configurable and gives some hint for user to do.
>
>
> On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
>> Some options in vshard are going to be eventually deprecated. For
>> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
>
>
> typo: weigts -> weights

Fixed. See the full new patch below.

====================

cfg: introduce 'deprecated option' feature

Some options in vshard are going to be eventually deprecated.
For instance, 'weights' will be renamed, 'collect_lua_garbage' may be deleted since it appears not to be so useful, 'sync_timeout' is totally unnecessary since any 'sync' can take a timeout per-call. But the patch is motivated by 'collect_bucket_garbage_interval' which is going to become unused in the new GC algorithm. New GC will be reactive instead of proactive. Instead of periodic polling of _bucket space it will react on needed events immediately. This will make the 'collect interval' unused. The option will be deprecated and eventually in some far future release its usage will lead to an error. Needed for #147 diff --git a/vshard/cfg.lua b/vshard/cfg.lua index 1ef1899..f7d5dbc 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -59,7 +59,17 @@ local function validate_config(config, template, check_arg) local value = config[key] local name = template_value.name local expected_type = template_value.type - if value == nil then + if template_value.is_deprecated then + if value ~= nil then + local reason = template_value.reason + if reason then + reason = '. '..reason + else + reason = '' + end + log.warn('Option "%s" is deprecated'..reason, name) + end + elseif value == nil then if not template_value.is_optional then error(string.format('%s must be specified', name)) else ====================
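The deprecation check added to validate_config() can be condensed to a few lines. The sketch below is self-contained: warnings are collected into a table instead of going through log.warn(), which is an assumption made only so the sketch runs outside of vshard; the template fields (name, is_deprecated, reason, is_optional) match the ones in the patch.

```lua
-- Condensed model of the deprecation branch in validate_config().
local warnings = {}

local function validate_config(config, template)
    for key, template_value in pairs(template) do
        local value = config[key]
        local name = template_value.name
        if template_value.is_deprecated then
            if value ~= nil then
                local reason = template_value.reason
                if reason then
                    reason = '. '..reason
                else
                    reason = ''
                end
                -- log.warn() in the real module.
                table.insert(warnings, string.format(
                    'Option "%s" is deprecated'..reason, name))
            end
        elseif value == nil and not template_value.is_optional then
            error(string.format('%s must be specified', name))
        end
    end
end

local template = {
    collect_bucket_garbage_interval = {
        name = 'Garbage bucket collect interval',
        is_deprecated = true,
        reason = 'Has no effect anymore',
    },
    bucket_count = {name = 'Bucket count'},
}
validate_config({collect_bucket_garbage_interval = 100, bucket_count = 3000},
                template)
-- warnings[1] is:
-- Option "Garbage bucket collect interval" is deprecated. Has no effect anymore
print(warnings[1])
```

A deprecated option is never type-checked and never required; it only produces a warning when the user still sets it.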
Thanks for the review! On 10.02.2021 10:00, Oleg Babin wrote: > Thanks for your patch. > > As I see you've introduced some new parameters: "LUA_CHUNK_SIZE" and "GC_BACKOFF_INTERVAL". I decided not to go into too deep details and not describe private constants in the commit message. GC_BACKOFF_INTERVAL is explained in the place where it is used. LUA_CHUNK_SIZE is quite obvious if you look at its usage. > I think it's better to describe them in commit message to understand more clear how new algorithm. These constants are not super relevant to the algorithm's core idea. It does not matter much for the reactive GC concept if I yield in table utility functions, or if I have a backoff timeout. These could be considered 'optimizations', 'amendments'. I would consider them small details not worth mentioning in the commit message. > I see that you didn't update comment above "gc_bucket_f" function. Is it still relevant? No, irrelevant, thanks for noticing. Here is the diff: ==================== diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index 99f92a0..1ea8069 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -1543,14 +1543,16 @@ local function gc_bucket_drop(status, route_map) end -- --- Garbage collector. Works on masters. The garbage collector --- wakes up once per specified time. +-- Garbage collector. Works on masters. The garbage collector wakes up when +-- state of any bucket changes. -- After wakeup it follows the plan: --- 1) Check if _bucket has changed. If not, then sleep again; --- 2) Scan user spaces for sent and garbage buckets, delete --- garbage data in batches of limited size; --- 3) Delete GARBAGE buckets from _bucket immediately, and --- schedule SENT buckets for deletion after a timeout; +-- 1) Check if state of any bucket has really changed. If not, then sleep again; +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of +-- limited size. 
+-- 3) Bucket destinations are saved into a global route_map to reroute incoming
+-- requests from routers in case they didn't notice the buckets being moved.
+-- The saved routes are scheduled for deletion after a timeout, which is
+-- checked on each iteration of this loop.
 -- 4) Sleep, go to (1).
 -- For each step details see comments in the code.
 --
====================

The full new patch below.

====================

gc: introduce reactive garbage collector

The garbage collector is a fiber on a master node which deletes
GARBAGE and SENT buckets along with their data. It was proactive:
it used to wake up with a constant period to find and delete the
needed buckets.

But this won't work with the future feature called 'map-reduce'.
Map-reduce as a preparation stage will need to ensure that all
buckets on a storage are readable and writable. With the current GC
algorithm, if a bucket is sent, it won't be deleted for the next 5
seconds by default. During this time all new map-reduce requests
can't execute. This is not acceptable. Too frequent wakeup of the
GC fiber is not acceptable either, because it would waste TX thread
time.

The patch makes the GC fiber wake up not on a timeout but on events
happening with the _bucket space. The GC fiber sleeps on a
condition variable which is signaled when _bucket is changed. Once
GC sees work to do, it won't sleep until it is done. It will only
yield.

This makes GC delete SENT and GARBAGE buckets as soon as possible,
reducing the waiting time for the incoming map-reduce requests.

Needed for #147

@TarantoolBot document
Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'

It was used to specify the interval between bucket garbage
collection steps. It was needed because garbage collection in
vshard was proactive: it didn't react to newly appeared garbage
buckets immediately.

Since 0.1.17 garbage collection is reactive. It starts working on
garbage buckets immediately as they appear, and sleeps the rest of
the time.
The option is no longer used and has no effect on anything. I suppose it can be deleted from the documentation, or left with a big label 'deprecated' + the explanation above. An attempt to use the option does not cause an error, but logs a warning. diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua index 21409bd..8df89f6 100644 --- a/test/lua_libs/storage_template.lua +++ b/test/lua_libs/storage_template.lua @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id) return true end vshard.storage.recovery_wakeup() - vshard.storage.garbage_collector_wakeup() end) end diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result index 168be5d..3b34841 100644 --- a/test/misc/reconfigure.result +++ b/test/misc/reconfigure.result @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true cfg.rebalancer_max_receiving = 1000 --- ... -cfg.collect_bucket_garbage_interval = 100 ---- -... cfg.invalid_option = 'kek' --- ... @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000 --- - true ... -vshard.storage.internal.collect_bucket_garbage_interval ~= 100 ---- -- true -... cfg.sync_timeout = nil --- ... @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil cfg.rebalancer_max_receiving = nil --- ... -cfg.collect_bucket_garbage_interval = nil ---- -... cfg.invalid_option = nil --- ...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua index e891010..348628c 100644 --- a/test/misc/reconfigure.test.lua +++ b/test/misc/reconfigure.test.lua @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout cfg.sync_timeout = 100 cfg.collect_lua_garbage = true cfg.rebalancer_max_receiving = 1000 -cfg.collect_bucket_garbage_interval = 100 cfg.invalid_option = 'kek' vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) not vshard.storage.internal.collect_lua_garbage vshard.storage.internal.sync_timeout vshard.storage.internal.rebalancer_max_receiving ~= 1000 -vshard.storage.internal.collect_bucket_garbage_interval ~= 100 cfg.sync_timeout = nil cfg.collect_lua_garbage = nil cfg.rebalancer_max_receiving = nil -cfg.collect_bucket_garbage_interval = nil cfg.invalid_option = nil -- diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result index b8fc7ff..9df7480 100644 --- a/test/rebalancer/bucket_ref.result +++ b/test/rebalancer/bucket_ref.result @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read') - true ... -- Force GC to take an RO lock on the bucket now. -vshard.storage.garbage_collector_wakeup() ---- -... vshard.storage.buckets_info(1) --- - 1: @@ -203,7 +200,6 @@ while true do if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then break end - vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end; --- @@ -235,14 +231,6 @@ finish_refs = true while f1:status() ~= 'dead' do fiber.sleep(0.01) end --- ... -vshard.storage.buckets_info(1) ---- -- 1: - status: garbage - ro_lock: true - destination: <replicaset_2> - id: 1 -... wait_bucket_is_collected(1) --- ... diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua index 213ced3..1b032ff 100644 --- a/test/rebalancer/bucket_ref.test.lua +++ b/test/rebalancer/bucket_ref.test.lua @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs. 
vshard.storage.bucket_ref(1, 'read') vshard.storage.bucket_unref(1, 'read') -- Force GC to take an RO lock on the bucket now. -vshard.storage.garbage_collector_wakeup() vshard.storage.buckets_info(1) _ = test_run:cmd("setopt delimiter ';'") while true do @@ -64,7 +63,6 @@ while true do if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then break end - vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end; _ = test_run:cmd("setopt delimiter ''"); @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1) vshard.storage.bucket_refro(1) finish_refs = true while f1:status() ~= 'dead' do fiber.sleep(0.01) end -vshard.storage.buckets_info(1) wait_bucket_is_collected(1) _ = test_run:switch('box_2_a') vshard.storage.buckets_info(1) diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result index e50eb72..0ddb1c9 100644 --- a/test/rebalancer/errinj.result +++ b/test/rebalancer/errinj.result @@ -226,17 +226,6 @@ ret2, err2 - true - null ... -_bucket:get{35} ---- -- [35, 'sent', '<replicaset_2>'] -... -_bucket:get{36} ---- -- [36, 'sent', '<replicaset_2>'] -... --- Buckets became 'active' on box_2_a, but still are sending on --- box_1_a. Wait until it is marked as garbage on box_1_a by the --- recovery fiber. wait_bucket_is_collected(35) --- ... diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua index 2cc4a69..a60f3d7 100644 --- a/test/rebalancer/errinj.test.lua +++ b/test/rebalancer/errinj.test.lua @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a') while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end ret1, err1 ret2, err2 -_bucket:get{35} -_bucket:get{36} --- Buckets became 'active' on box_2_a, but still are sending on --- box_1_a. Wait until it is marked as garbage on box_1_a by the --- recovery fiber. 
wait_bucket_is_collected(35) wait_bucket_is_collected(36) _ = test_run:switch('box_2_a') diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result index 7d3612b..ad93445 100644 --- a/test/rebalancer/receiving_bucket.result +++ b/test/rebalancer/receiving_bucket.result @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3}) --- - true ... -vshard.storage.buckets_info(1) ---- -- 1: - status: sent - ro_lock: true - destination: <replicaset_1> - id: 1 -... wait_bucket_is_collected(1) --- ... diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua index 24534b3..2cf6382 100644 --- a/test/rebalancer/receiving_bucket.test.lua +++ b/test/rebalancer/receiving_bucket.test.lua @@ -136,7 +136,6 @@ box.space.test3:select{100} -- Now the bucket is unreferenced and can be transferred. _ = test_run:switch('box_2_a') vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3}) -vshard.storage.buckets_info(1) wait_bucket_is_collected(1) vshard.storage.buckets_info(1) _ = test_run:switch('box_1_a') diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result index 753687f..9d30a04 100644 --- a/test/reload_evolution/storage.result +++ b/test/reload_evolution/storage.result @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to') ... vshard.storage.internal.reload_version --- -- 2 +- 3 ... -- -- gh-237: should be only one trigger. During gh-237 the trigger installation diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result index 049bdef..ac340eb 100644 --- a/test/router/reroute_wrong_bucket.result +++ b/test/router/reroute_wrong_bucket.result @@ -37,7 +37,7 @@ test_run:switch('storage_1_a') --- - true ... -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 --- ... 
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) @@ -53,7 +53,7 @@ test_run:switch('storage_2_a') --- - true ... -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 --- ... vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a) @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration') err --- - bucket_id: 100 - reason: write is prohibited + reason: Not found code: 1 destination: ac522f65-aa94-4134-9f64-51ee384f1a54 type: ShardingError name: WRONG_BUCKET - message: 'Cannot perform action with bucket 100, reason: write is prohibited' + message: 'Cannot perform action with bucket 100, reason: Not found' ... -- -- Now try again, but update configuration during call(). It must diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua index 9e6e804..207aac3 100644 --- a/test/router/reroute_wrong_bucket.test.lua +++ b/test/router/reroute_wrong_bucket.test.lua @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt test_run:cmd('create server router_1 with script="router/router_1.lua"') test_run:cmd('start server router_1') test_run:switch('storage_1_a') -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a) vshard.storage.rebalancer_disable() for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end test_run:switch('storage_2_a') -cfg.collect_bucket_garbage_interval = 100 +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100 vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a) vshard.storage.rebalancer_disable() for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end diff --git a/test/storage/recovery.result b/test/storage/recovery.result index f833fe7..8ccb0b9 100644 --- a/test/storage/recovery.result +++ b/test/storage/recovery.result @@ -79,8 +79,7 @@ _bucket = box.space._bucket ... 
_bucket:select{} --- -- - [2, 'garbage', '<replicaset_2>'] - - [3, 'garbage', '<replicaset_2>'] +- [] ... _ = test_run:switch('storage_2_a') --- diff --git a/test/storage/storage.result b/test/storage/storage.result index 424bc4c..0550ad1 100644 --- a/test/storage/storage.result +++ b/test/storage/storage.result @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2]) --- - true ... +wait_bucket_is_collected(1) +--- +... _ = test_run:switch("storage_2_a") --- ... @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a") ... vshard.storage.buckets_info() --- -- 1: - status: sent - ro_lock: true - destination: <replicaset_2> - id: 1 - 2: +- 2: status: active id: 2 ... diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua index d631b51..d8fbd94 100644 --- a/test/storage/storage.test.lua +++ b/test/storage/storage.test.lua @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1]) -- Successful transfer. vshard.storage.bucket_send(1, util.replicasets[2]) +wait_bucket_is_collected(1) _ = test_run:switch("storage_2_a") vshard.storage.buckets_info() _ = test_run:switch("storage_1_a") diff --git a/test/unit/config.result b/test/unit/config.result index dfd0219..e0b2482 100644 --- a/test/unit/config.result +++ b/test/unit/config.result @@ -428,33 +428,6 @@ _ = lcfg.check(cfg) -- -- gh-77: garbage collection options. -- -cfg.collect_bucket_garbage_interval = 'str' ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = 0 ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = -1 ---- -... -check(cfg) ---- -- Garbage bucket collect interval must be positive number -... -cfg.collect_bucket_garbage_interval = 100.5 ---- -... -_ = lcfg.check(cfg) ---- -... cfg.collect_lua_garbage = 100 --- ... 
@@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending cfg.rebalancer_max_sending = nil --- ... -cfg.sharding = nil +-- +-- Deprecated option does not break anything. +-- +cfg.collect_bucket_garbage_interval = 100 +--- +... +_ = lcfg.check(cfg) --- ... diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua index ada43db..a1c9f07 100644 --- a/test/unit/config.test.lua +++ b/test/unit/config.test.lua @@ -175,15 +175,6 @@ _ = lcfg.check(cfg) -- -- gh-77: garbage collection options. -- -cfg.collect_bucket_garbage_interval = 'str' -check(cfg) -cfg.collect_bucket_garbage_interval = 0 -check(cfg) -cfg.collect_bucket_garbage_interval = -1 -check(cfg) -cfg.collect_bucket_garbage_interval = 100.5 -_ = lcfg.check(cfg) - cfg.collect_lua_garbage = 100 check(cfg) cfg.collect_lua_garbage = true @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg) cfg.rebalancer_max_sending = 15 lcfg.check(cfg).rebalancer_max_sending cfg.rebalancer_max_sending = nil -cfg.sharding = nil + +-- +-- Deprecated option does not break anything. +-- +cfg.collect_bucket_garbage_interval = 100 +_ = lcfg.check(cfg) diff --git a/test/unit/garbage.result b/test/unit/garbage.result index 74d9ccf..a530496 100644 --- a/test/unit/garbage.result +++ b/test/unit/garbage.result @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''"); vshard.storage.internal.shard_index = 'bucket_id' --- ... -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL ---- -... -- -- Find nothing if no bucket_id anywhere, or there is no index -- by it, or bucket_id is not unsigned. @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'} format[2] = {name = 'status', type = 'string'} --- ... +format[3] = {name = 'destination', type = 'string', is_nullable = true} +--- +... _bucket = box.schema.create_space('_bucket', {format = format}) --- ... @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE} --- - [3, 'active'] ... 
-_bucket:replace{4, vshard.consts.BUCKET.SENT} ---- -- [4, 'sent'] -... -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} ---- -- [5, 'garbage'] -... -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE} ---- -- [6, 'garbage'] -... -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE} ---- -- [200, 'garbage'] -... s = box.schema.create_space('test', {engine = engine}) --- ... @@ -213,7 +197,7 @@ s:replace{4, 2} --- - [4, 2] ... -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop --- ... s2 = box.schema.create_space('test2', {engine = engine}) @@ -249,6 +233,10 @@ function fill_spaces_with_garbage() s2:replace{6, 4} s2:replace{7, 5} s2:replace{7, 6} + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'} + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE} + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'} + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE} end; --- ... @@ -267,12 +255,22 @@ fill_spaces_with_garbage() --- - 1107 ... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) --- -- - 5 - - 6 - - 200 - true +- null +... +route_map +--- +- - null + - null + - null + - null + - null + - destination2 ... #s2:select{} --- @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) --- - 7 ... -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) --- -- - 4 - true +- null +... +route_map +--- +- - null + - null + - null + - destination1 ... s2:select{} --- @@ -303,17 +311,22 @@ s:select{} - [6, 100] ... -- Nothing deleted - update collected generation. -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +--- +... +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) --- -- - 5 - - 6 - - 200 - true +- null ... 
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) --- -- - 4 - true +- null +... +route_map +--- +- [] ... #s2:select{} --- @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) fill_spaces_with_garbage() --- ... -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) +_ = _bucket:on_replace(function() \ + local gen = vshard.storage.internal.bucket_generation \ + vshard.storage.internal.bucket_generation = gen + 1 \ + vshard.storage.internal.bucket_generation_cond:broadcast() \ +end) --- ... f = fiber.create(vshard.storage.internal.gc_bucket_f) --- ... -- Wait until garbage collection is finished. -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end) --- +- true ... s:select{} --- @@ -360,7 +378,6 @@ _bucket:select{} - - [1, 'active'] - [2, 'receiving'] - [3, 'active'] - - [4, 'sent'] ... -- -- Test deletion of 'sent' buckets after a specified timeout. @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT} - [2, 'sent'] ... -- Wait deletion after a while. -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{2} end) --- +- true ... _bucket:select{} --- @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT} --- - [4, 'sent'] ... -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) --- +- true ... -- -- Test WAL errors during deletion from _bucket. @@ -434,11 +453,14 @@ s:replace{6, 4} --- - [6, 4] ... 
-while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_log('default', 'Error during garbage collection step', \ + 65536, 10) --- +- Error during garbage collection step ... -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return #sk:select{4} == 0 end) --- +- true ... s:select{} --- @@ -454,8 +476,9 @@ _bucket:select{} _ = _bucket:on_replace(nil, rollback_on_delete) --- ... -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) --- +- true ... f:cancel() --- @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i, f = fiber.create(vshard.storage.internal.gc_bucket_f) --- ... -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return _bucket:count() == 0 end) --- +- true ... 
_bucket:select{} --- diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua index 30079fa..250afb0 100644 --- a/test/unit/garbage.test.lua +++ b/test/unit/garbage.test.lua @@ -15,7 +15,6 @@ end; test_run:cmd("setopt delimiter ''"); vshard.storage.internal.shard_index = 'bucket_id' -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL -- -- Find nothing if no bucket_id anywhere, or there is no index @@ -75,16 +74,13 @@ s:drop() format = {} format[1] = {name = 'id', type = 'unsigned'} format[2] = {name = 'status', type = 'string'} +format[3] = {name = 'destination', type = 'string', is_nullable = true} _bucket = box.schema.create_space('_bucket', {format = format}) _ = _bucket:create_index('pk') _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) _bucket:replace{1, vshard.consts.BUCKET.ACTIVE} _bucket:replace{2, vshard.consts.BUCKET.RECEIVING} _bucket:replace{3, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{4, vshard.consts.BUCKET.SENT} -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE} -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE} s = box.schema.create_space('test', {engine = engine}) pk = s:create_index('pk') @@ -94,7 +90,7 @@ s:replace{2, 1} s:replace{3, 2} s:replace{4, 2} -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop s2 = box.schema.create_space('test2', {engine = engine}) pk2 = s2:create_index('pk') sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) @@ -114,6 +110,10 @@ function fill_spaces_with_garbage() s2:replace{6, 4} s2:replace{7, 5} s2:replace{7, 6} + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'} + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE} + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'} + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE} end; 
test_run:cmd("setopt delimiter ''"); @@ -121,15 +121,21 @@ fill_spaces_with_garbage() #s2:select{} #s:select{} -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) +route_map #s2:select{} #s:select{} -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) +route_map s2:select{} s:select{} -- Nothing deleted - update collected generation. -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) +route_map = {} +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map) +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map) +route_map #s2:select{} #s:select{} @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -- Test continuous garbage collection via background fiber. -- fill_spaces_with_garbage() -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) +_ = _bucket:on_replace(function() \ + local gen = vshard.storage.internal.bucket_generation \ + vshard.storage.internal.bucket_generation = gen + 1 \ + vshard.storage.internal.bucket_generation_cond:broadcast() \ +end) f = fiber.create(vshard.storage.internal.gc_bucket_f) -- Wait until garbage collection is finished. -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end) s:select{} s2:select{} -- Check garbage bucket is deleted by background fiber. @@ -150,7 +160,7 @@ _bucket:select{} -- _bucket:replace{2, vshard.consts.BUCKET.SENT} -- Wait deletion after a while. 
-while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{2} end) _bucket:select{} s:select{} s2:select{} @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE} s:replace{5, 4} s:replace{6, 4} _bucket:replace{4, vshard.consts.BUCKET.SENT} -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) -- -- Test WAL errors during deletion from _bucket. @@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete) _bucket:replace{4, vshard.consts.BUCKET.SENT} s:replace{5, 4} s:replace{6, 4} -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_log('default', 'Error during garbage collection step', \ + 65536, 10) +test_run:wait_cond(function() return #sk:select{4} == 0 end) s:select{} _bucket:select{} _ = _bucket:on_replace(nil, rollback_on_delete) -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return not _bucket:get{4} end) f:cancel() @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i, #s:select{} #s2:select{} f = fiber.create(vshard.storage.internal.gc_bucket_f) -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end +test_run:wait_cond(function() return _bucket:count() == 0 end) _bucket:select{} s:select{} s2:select{} diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result deleted file mode 100644 index 92c8039..0000000 --- a/test/unit/garbage_errinj.result +++ /dev/null @@ -1,223 +0,0 @@ -test_run = require('test_run').new() ---- -... 
-vshard = require('vshard') ---- -... -fiber = require('fiber') ---- -... -engine = test_run:get_cfg('engine') ---- -... -vshard.storage.internal.shard_index = 'bucket_id' ---- -... -format = {} ---- -... -format[1] = {name = 'id', type = 'unsigned'} ---- -... -format[2] = {name = 'status', type = 'string', is_nullable = true} ---- -... -_bucket = box.schema.create_space('_bucket', {format = format}) ---- -... -_ = _bucket:create_index('pk') ---- -... -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) ---- -... -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE} ---- -- [1, 'active'] -... -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING} ---- -- [2, 'receiving'] -... -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE} ---- -- [3, 'active'] -... -_bucket:replace{4, vshard.consts.BUCKET.SENT} ---- -- [4, 'sent'] -... -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} ---- -- [5, 'garbage'] -... -s = box.schema.create_space('test', {engine = engine}) ---- -... -pk = s:create_index('pk') ---- -... -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) ---- -... -s:replace{1, 1} ---- -- [1, 1] -... -s:replace{2, 1} ---- -- [2, 1] -... -s:replace{3, 2} ---- -- [3, 2] -... -s:replace{4, 2} ---- -- [4, 2] -... -s:replace{5, 100} ---- -- [5, 100] -... -s:replace{6, 100} ---- -- [6, 100] -... -s:replace{7, 4} ---- -- [7, 4] -... -s:replace{8, 5} ---- -- [8, 5] -... -s2 = box.schema.create_space('test2', {engine = engine}) ---- -... -pk2 = s2:create_index('pk') ---- -... -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) ---- -... -s2:replace{1, 1} ---- -- [1, 1] -... -s2:replace{3, 3} ---- -- [3, 3] -... -for i = 7, 1107 do s:replace{i, 200} end ---- -... -s2:replace{4, 200} ---- -- [4, 200] -... -s2:replace{5, 100} ---- -- [5, 100] -... -s2:replace{5, 300} ---- -- [5, 300] -... -s2:replace{6, 4} ---- -- [6, 4] -... -s2:replace{7, 5} ---- -- [7, 5] -... 
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type ---- -... -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) ---- -- - 4 -- true -... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) ---- -- - 5 -- true -... --- --- Test _bucket generation change during garbage buckets search. --- -s:truncate() ---- -... -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) ---- -... -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true ---- -... -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end) ---- -... -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE} ---- -- [4, 'garbage'] -... -s:replace{5, 4} ---- -- [5, 4] -... -s:replace{6, 4} ---- -- [6, 4] -... -#s:select{} ---- -- 2 -... -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false ---- -... -while f:status() ~= 'dead' do fiber.sleep(0.1) end ---- -... --- Nothing is deleted - _bucket:replace() has changed _bucket --- generation during search of garbage buckets. -#s:select{} ---- -- 2 -... -_bucket:select{4} ---- -- - [4, 'garbage'] -... --- Next step deletes garbage ok. -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) ---- -- [] -- true -... -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) ---- -- - 4 - - 5 -- true -... -#s:select{} ---- -- 0 -... -_bucket:delete{4} ---- -- [4, 'garbage'] -... -s2:drop() ---- -... -s:drop() ---- -... -_bucket:drop() ---- -... 
diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua deleted file mode 100644 index 31184b9..0000000 --- a/test/unit/garbage_errinj.test.lua +++ /dev/null @@ -1,73 +0,0 @@ -test_run = require('test_run').new() -vshard = require('vshard') -fiber = require('fiber') - -engine = test_run:get_cfg('engine') -vshard.storage.internal.shard_index = 'bucket_id' - -format = {} -format[1] = {name = 'id', type = 'unsigned'} -format[2] = {name = 'status', type = 'string', is_nullable = true} -_bucket = box.schema.create_space('_bucket', {format = format}) -_ = _bucket:create_index('pk') -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false}) -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING} -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE} -_bucket:replace{4, vshard.consts.BUCKET.SENT} -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE} - -s = box.schema.create_space('test', {engine = engine}) -pk = s:create_index('pk') -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) -s:replace{1, 1} -s:replace{2, 1} -s:replace{3, 2} -s:replace{4, 2} -s:replace{5, 100} -s:replace{6, 100} -s:replace{7, 4} -s:replace{8, 5} - -s2 = box.schema.create_space('test2', {engine = engine}) -pk2 = s2:create_index('pk') -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false}) -s2:replace{1, 1} -s2:replace{3, 3} -for i = 7, 1107 do s:replace{i, 200} end -s2:replace{4, 200} -s2:replace{5, 100} -s2:replace{5, 300} -s2:replace{6, 4} -s2:replace{7, 5} - -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) - --- --- Test _bucket generation change during garbage buckets search. 
--- -s:truncate() -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end) -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end) -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE} -s:replace{5, 4} -s:replace{6, 4} -#s:select{} -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false -while f:status() ~= 'dead' do fiber.sleep(0.1) end --- Nothing is deleted - _bucket:replace() has changed _bucket --- generation during search of garbage buckets. -#s:select{} -_bucket:select{4} --- Next step deletes garbage ok. -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) -#s:select{} -_bucket:delete{4} - -s2:drop() -s:drop() -_bucket:drop() diff --git a/vshard/cfg.lua b/vshard/cfg.lua index f7d5dbc..63d5414 100644 --- a/vshard/cfg.lua +++ b/vshard/cfg.lua @@ -251,9 +251,8 @@ local cfg_template = { max = consts.REBALANCER_MAX_SENDING_MAX }, collect_bucket_garbage_interval = { - type = 'positive number', name = 'Garbage bucket collect interval', - is_optional = true, - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL + name = 'Garbage bucket collect interval', is_deprecated = true, + reason = 'Has no effect anymore' }, collect_lua_garbage = { type = 'boolean', name = 'Garbage Lua collect necessity', diff --git a/vshard/consts.lua b/vshard/consts.lua index 8c2a8b0..3f1585a 100644 --- a/vshard/consts.lua +++ b/vshard/consts.lua @@ -23,6 +23,7 @@ return { DEFAULT_BUCKET_COUNT = 3000; BUCKET_SENT_GARBAGE_DELAY = 0.5; BUCKET_CHUNK_SIZE = 1000; + LUA_CHUNK_SIZE = 100000, DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1; REBALANCER_IDLE_INTERVAL = 60 * 60; REBALANCER_WORK_INTERVAL = 10; @@ -37,7 +38,7 @@ return { DEFAULT_FAILOVER_PING_TIMEOUT = 5; DEFAULT_SYNC_TIMEOUT = 1; RECONNECT_TIMEOUT = 
0.5; - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5; + GC_BACKOFF_INTERVAL = 5, RECOVERY_INTERVAL = 5; COLLECT_LUA_GARBAGE_INTERVAL = 100; @@ -45,4 +46,6 @@ return { DISCOVERY_WORK_INTERVAL = 1, DISCOVERY_WORK_STEP = 0.01, DISCOVERY_TIMEOUT = 10, + + TIMEOUT_INFINITY = 500 * 365 * 86400, } diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua index adf1c20..1ea8069 100644 --- a/vshard/storage/init.lua +++ b/vshard/storage/init.lua @@ -69,7 +69,6 @@ if not M then total_bucket_count = 0, errinj = { ERRINJ_CFG = false, - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false, ERRINJ_RELOAD = false, ERRINJ_CFG_DELAY = false, ERRINJ_LONG_RECEIVE = false, @@ -96,6 +95,8 @@ if not M then -- detect that _bucket was not changed between yields. -- bucket_generation = 0, + -- Condition variable fired on generation update. + bucket_generation_cond = lfiber.cond(), -- -- Reference to the function used as on_replace trigger on -- _bucket space. It is used to replace the trigger with @@ -107,12 +108,14 @@ if not M then -- replace the old function is to keep its reference. -- bucket_on_replace = nil, + -- Redirects for recently sent buckets. They are kept for a while to + -- help routers to find a new location for sent and deleted buckets + -- without whole cluster scan. + route_map = {}, ------------------- Garbage collection ------------------- -- Fiber to remove garbage buckets data. collect_bucket_garbage_fiber = nil, - -- Do buckets garbage collection once per this time. - collect_bucket_garbage_interval = nil, -- Boolean lua_gc state (create periodic gc task). 
collect_lua_garbage = nil, @@ -173,6 +176,7 @@ end -- local function bucket_generation_increment() M.bucket_generation = M.bucket_generation + 1 + M.bucket_generation_cond:broadcast() end -- @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode) else return bucket end + local dst = bucket and bucket.destination or M.route_map[bucket_id] return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason, - bucket and bucket.destination) + dst) end -- @@ -804,11 +809,23 @@ end -- local function bucket_unrefro(bucket_id) local ref = M.bucket_refs[bucket_id] - if not ref or ref.ro == 0 then + local count = ref and ref.ro or 0 + if count == 0 then return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, "no refs", nil) end - ref.ro = ref.ro - 1 + if count == 1 then + ref.ro = 0 + if ref.ro_lock then + -- Garbage collector is waiting for the bucket if RO + -- is locked. Let it know it has one more bucket to + -- collect. It relies on generation, so its increment + -- it enough. + bucket_generation_increment() + end + return true + end + ref.ro = count - 1 return true end @@ -1481,79 +1498,44 @@ local function gc_bucket_in_space(space, bucket_id, status) end -- --- Remove tuples from buckets of a specified type. --- @param type Type of buckets to gc. --- @retval List of ids of empty buckets of the type. +-- Drop buckets with the given status along with their data in all spaces. +-- @param status Status of target buckets. +-- @param route_map Destinations of deleted buckets are saved into this table. 
-- -local function gc_bucket_step_by_type(type) - local sharded_spaces = find_sharded_spaces() - local empty_buckets = {} +local function gc_bucket_drop_xc(status, route_map) local limit = consts.BUCKET_CHUNK_SIZE - local is_all_collected = true - for _, bucket in box.space._bucket.index.status:pairs(type) do - local bucket_id = bucket.id - local ref = M.bucket_refs[bucket_id] + local _bucket = box.space._bucket + local sharded_spaces = find_sharded_spaces() + for _, b in _bucket.index.status:pairs(status) do + local id = b.id + local ref = M.bucket_refs[id] if ref then assert(ref.rw == 0) if ref.ro ~= 0 then ref.ro_lock = true - is_all_collected = false goto continue end - M.bucket_refs[bucket_id] = nil + M.bucket_refs[id] = nil end for _, space in pairs(sharded_spaces) do - gc_bucket_in_space_xc(space, bucket_id, type) + gc_bucket_in_space_xc(space, id, status) limit = limit - 1 if limit == 0 then lfiber.sleep(0) limit = consts.BUCKET_CHUNK_SIZE end end - table.insert(empty_buckets, bucket.id) -::continue:: + route_map[id] = b.destination + _bucket:delete{id} + ::continue:: end - return empty_buckets, is_all_collected -end - --- --- Drop buckets with ids in the list. --- @param bucket_ids Bucket ids to drop. --- @param status Expected bucket status. --- -local function gc_bucket_drop_xc(bucket_ids, status) - if #bucket_ids == 0 then - return - end - local limit = consts.BUCKET_CHUNK_SIZE - box.begin() - local _bucket = box.space._bucket - for _, id in pairs(bucket_ids) do - local bucket_exists = _bucket:get{id} ~= nil - local b = _bucket:get{id} - if b then - if b.status ~= status then - return error(string.format('Bucket %d status is changed. Was '.. - '%s, became %s', id, status, - b.status)) - end - _bucket:delete{id} - end - limit = limit - 1 - if limit == 0 then - box.commit() - box.begin() - limit = consts.BUCKET_CHUNK_SIZE - end - end - box.commit() end -- -- Exception safe version of gc_bucket_drop_xc. 
-- -local function gc_bucket_drop(bucket_ids, status) - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status) +local function gc_bucket_drop(status, route_map) + local status, err = pcall(gc_bucket_drop_xc, status, route_map) if not status then box.rollback() end @@ -1561,14 +1543,16 @@ local function gc_bucket_drop(bucket_ids, status) end -- --- Garbage collector. Works on masters. The garbage collector --- wakes up once per specified time. +-- Garbage collector. Works on masters. The garbage collector wakes up when +-- state of any bucket changes. -- After wakeup it follows the plan: --- 1) Check if _bucket has changed. If not, then sleep again; --- 2) Scan user spaces for sent and garbage buckets, delete --- garbage data in batches of limited size; --- 3) Delete GARBAGE buckets from _bucket immediately, and --- schedule SENT buckets for deletion after a timeout; +-- 1) Check if state of any bucket has really changed. If not, then sleep again; +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of +-- limited size. +-- 3) Bucket destinations are saved into a global route_map to reroute incoming +-- requests from routers in case they didn't notice the buckets being moved. +-- The saved routes are scheduled for deletion after a timeout, which is +-- checked on each iteration of this loop. -- 4) Sleep, go to (1). -- For each step details see comments in the code. -- @@ -1580,65 +1564,75 @@ function gc_bucket_f() -- generation == bucket generation. In such a case the fiber -- does nothing until next _bucket change. local bucket_generation_collected = -1 - -- Empty sent buckets are collected into an array. After a - -- specified time interval the buckets are deleted both from - -- this array and from _bucket space. 
- local buckets_for_redirect = {} - local buckets_for_redirect_ts = fiber_clock() - -- Empty sent buckets, updated after each step, and when - -- buckets_for_redirect is deleted, it gets empty_sent_buckets - -- for next deletion. - local empty_garbage_buckets, empty_sent_buckets, status, err + local bucket_generation_current = M.bucket_generation + -- Deleted buckets are saved into a route map to redirect routers if they + -- didn't discover new location of the buckets yet. However route map does + -- not grow infinitely. Otherwise it would end up storing redirects for all + -- buckets in the cluster. Which could also be outdated. + -- Garbage collector periodically drops old routes from the map. For that it + -- remembers state of route map in one moment, and after a while clears the + -- remembered routes from the global route map. + local route_map = M.route_map + local route_map_old = {} + local route_map_deadline = 0 + local status, err while M.module_version == module_version do - -- Check if no changes in buckets configuration. 
- if bucket_generation_collected ~= M.bucket_generation then - local bucket_generation = M.bucket_generation - local is_sent_collected, is_garbage_collected - status, empty_garbage_buckets, is_garbage_collected = - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE) - if not status then - err = empty_garbage_buckets - goto check_error - end - status, empty_sent_buckets, is_sent_collected = - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT) - if not status then - err = empty_sent_buckets - goto check_error + if bucket_generation_collected ~= bucket_generation_current then + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map) + if status then + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map) end - status, err = gc_bucket_drop(empty_garbage_buckets, - consts.BUCKET.GARBAGE) -::check_error:: if not status then box.rollback() log.error('Error during garbage collection step: %s', err) - goto continue + else + -- Don't use global generation. During the collection it could + -- already change. Instead, remember the generation known before + -- the collection has started. + -- Since the collection also changes the generation, it makes + -- the GC happen always at least twice. But typically on the + -- second iteration it should not find any buckets to collect, + -- and then the collected generation matches the global one. 
+ bucket_generation_collected = bucket_generation_current end - if is_sent_collected and is_garbage_collected then - bucket_generation_collected = bucket_generation + else + status = true + end + + local sleep_time = route_map_deadline - fiber_clock() + if sleep_time <= 0 then + local chunk = consts.LUA_CHUNK_SIZE + util.table_minus_yield(route_map, route_map_old, chunk) + route_map_old = util.table_copy_yield(route_map, chunk) + if next(route_map_old) then + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY + else + sleep_time = consts.TIMEOUT_INFINITY end + route_map_deadline = fiber_clock() + sleep_time end + bucket_generation_current = M.bucket_generation - if fiber_clock() - buckets_for_redirect_ts >= - consts.BUCKET_SENT_GARBAGE_DELAY then - status, err = gc_bucket_drop(buckets_for_redirect, - consts.BUCKET.SENT) - if not status then - buckets_for_redirect = {} - empty_sent_buckets = {} - bucket_generation_collected = -1 - log.error('Error during deletion of empty sent buckets: %s', - err) - elseif M.module_version ~= module_version then - return + if bucket_generation_current ~= bucket_generation_collected then + -- Generation was changed during collection. Or *by* collection. + if status then + -- Retry immediately. If the generation was changed by the + -- collection itself, it will notice it next iteration, and go + -- to proper sleep. + sleep_time = 0 else - buckets_for_redirect = empty_sent_buckets or {} - empty_sent_buckets = nil - buckets_for_redirect_ts = fiber_clock() + -- An error happened during the collection. Does not make sense + -- to retry on each iteration of the event loop. The most likely + -- errors are either a WAL error or a transaction abort - both + -- look like an issue in the user's code and can't be fixed + -- quickly anyway. Backoff. 
+ sleep_time = consts.GC_BACKOFF_INTERVAL end end -::continue:: - lfiber.sleep(M.collect_bucket_garbage_interval) + + if M.module_version == module_version then + M.bucket_generation_cond:wait(sleep_time) + end end end @@ -2423,8 +2417,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload) vshard_cfg.rebalancer_disbalance_threshold M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving M.shard_index = vshard_cfg.shard_index - M.collect_bucket_garbage_interval = - vshard_cfg.collect_bucket_garbage_interval M.collect_lua_garbage = vshard_cfg.collect_lua_garbage M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending M.current_cfg = cfg @@ -2678,6 +2670,9 @@ else storage_cfg(M.current_cfg, M.this_replica.uuid, true) end M.module_version = M.module_version + 1 + -- Background fibers could sleep waiting for bucket changes. + -- Let them know it is time to reload. + bucket_generation_increment() end M.recovery_f = recovery_f @@ -2688,7 +2683,7 @@ M.gc_bucket_f = gc_bucket_f -- These functions are saved in M not for atomic reload, but for -- unit testing. -- -M.gc_bucket_step_by_type = gc_bucket_step_by_type +M.gc_bucket_drop = gc_bucket_drop M.rebalancer_build_routes = rebalancer_build_routes M.rebalancer_calculate_metrics = rebalancer_calculate_metrics M.cached_find_sharded_spaces = find_sharded_spaces diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua index f38af74..484f499 100644 --- a/vshard/storage/reload_evolution.lua +++ b/vshard/storage/reload_evolution.lua @@ -4,6 +4,7 @@ -- in a commit. -- local log = require('log') +local fiber = require('fiber') -- -- Array of upgrade functions. @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M) end end +migrations[#migrations + 1] = function(M) + if not M.route_map then + M.bucket_generation_cond = fiber.cond() + M.route_map = {} + end +end + -- -- Perform an update based on a version stored in `M` (internals). 
-- @param M Old module internals which should be updated.
Thanks for the review! On 10.02.2021 10:01, Oleg Babin wrote: > Thanks for your patch. > > Shouldn't it be added to storage "MODULE_INTERNALS" ? Hm. Not sure I understand. Did you mean 'vshard_modules' variable in storage/init.lua? Why? The heap is not used in storage/init.lua and won't be used there directly in future patches. The next patches will introduce new modules for storage/, which will use the heap, and will reload it. Also it does not have any global objects. So it does not need its own global M, if this is what you meant. >> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua >> new file mode 100755 >> index 0000000..8c3819f >> --- /dev/null >> +++ b/test/unit-tap/heap.test.lua >> @@ -0,0 +1,310 @@ >> +#!/usr/bin/env tarantool >> + >> +local tap = require('tap') >> +local test = tap.test("cfg") >> +local heap = require('vshard.heap') >> + > > > Maybe it's better to use single brackets everywhere: test("cfg") -> test('cfg'). Or does such difference have some sense? Yeah, didn't notice it. 
Here is the diff: ==================== diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua index 8c3819f..9202f62 100755 --- a/test/unit-tap/heap.test.lua +++ b/test/unit-tap/heap.test.lua @@ -1,7 +1,7 @@ #!/usr/bin/env tarantool local tap = require('tap') -local test = tap.test("cfg") +local test = tap.test('cfg') local heap = require('vshard.heap') -- @@ -109,7 +109,7 @@ local function test_min_heap_basic(test) until not next_permutation(indexes) end - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end -- @@ -143,7 +143,7 @@ local function test_max_heap_basic(test) until not next_permutation(indexes) end - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end -- @@ -178,7 +178,7 @@ local function test_min_heap_update_top(test) until not next_permutation(indexes) end - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end -- @@ -219,7 +219,7 @@ local function test_min_heap_update(test) end end - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end -- @@ -257,7 +257,7 @@ local function test_max_heap_delete(test) end end - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end local function test_min_heap_remove_top(test) @@ -273,7 +273,7 @@ local function test_min_heap_remove_top(test) end assert(h:count() == 0) - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end local function test_max_heap_remove_try(test) @@ -294,7 +294,7 @@ local function test_max_heap_remove_try(test) assert(obj.index == -1) assert(h:count() == 1) - test:ok(true, "no asserts") + test:ok(true, 'no asserts') end test:plan(7)
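For readers skimming the thread: the tests above exercise an intrusive binary heap, where each stored object remembers its own position so that update/delete of an arbitrary element costs O(log n), and a removed object gets `index == -1`. The real vshard.heap API may differ in constructor and method names; this is only an illustrative plain-Lua sketch of that core idea, runnable without Tarantool.

```lua
-- Illustrative intrusive min-heap. Names (heap_new, push, pop) are
-- assumptions for the sketch, not the vshard.heap API.
local function heap_new(is_less)
    local h = {data = {}, count = 0, is_less = is_less}

    local function swap(i, j)
        local d = h.data
        d[i], d[j] = d[j], d[i]
        d[i].index, d[j].index = i, j
    end

    local function sift_up(i)
        while i > 1 do
            local parent = math.floor(i / 2)
            if not h.is_less(h.data[i], h.data[parent]) then break end
            swap(i, parent)
            i = parent
        end
    end

    local function sift_down(i)
        while true do
            local smallest, l, r = i, 2 * i, 2 * i + 1
            if l <= h.count and h.is_less(h.data[l], h.data[smallest]) then
                smallest = l
            end
            if r <= h.count and h.is_less(h.data[r], h.data[smallest]) then
                smallest = r
            end
            if smallest == i then break end
            swap(i, smallest)
            i = smallest
        end
    end

    function h:push(obj)
        self.count = self.count + 1
        self.data[self.count] = obj
        obj.index = self.count
        sift_up(self.count)
    end

    function h:pop()
        local top = self.data[1]
        if not top then return nil end
        swap(1, self.count)
        self.data[self.count] = nil
        self.count = self.count - 1
        -- Mirrors the tests above: a detached object gets index -1.
        top.index = -1
        sift_down(1)
        return top
    end

    return h
end

local h = heap_new(function(a, b) return a.v < b.v end)
h:push({v = 3}); h:push({v = 1}); h:push({v = 2})
assert(h:pop().v == 1)
```

Because objects carry their index, the map-reduce code can later update or delete an arbitrary element without a linear scan for it.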
I've noticed that you've missed to add new file to vshard/CMakeList.txt [1] It will break the build. [1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9 On 10/02/2021 11:57, Oleg Babin via Tarantool-patches wrote: > Hi! Thanks for your patch. LGTM. > > On 10/02/2021 02:46, Vladislav Shpilevoy wrote: >> Rlist in storage/init.lua implemented a container similar to rlist >> in libsmall in Tarantool core. Doubly-linked list. >> >> It does not depend on anything in storage/init.lua, and should >> have been done in a separate module from the beginning. >> >> Now init.lua is going to grow even more in scope of map-reduce >> feature, beyond 3k lines if nothing would be moved out. It was >> decided (by me) that it crosses the border of when it is time to >> split init.lua into separate modules. >> >> The patch takes the low hanging fruit by moving rlist into its >> own module. >> --- >> test/unit/rebalancer.result | 99 ----------------------------- >> test/unit/rebalancer.test.lua | 27 -------- >> test/unit/rlist.result | 114 ++++++++++++++++++++++++++++++++++ >> test/unit/rlist.test.lua | 33 ++++++++++ >> vshard/rlist.lua | 53 ++++++++++++++++ >> vshard/storage/init.lua | 68 +++----------------- >> 6 files changed, 208 insertions(+), 186 deletions(-) >> create mode 100644 test/unit/rlist.result >> create mode 100644 test/unit/rlist.test.lua >> create mode 100644 vshard/rlist.lua >> >> diff --git a/test/unit/rebalancer.result b/test/unit/rebalancer.result >> index 2fb30e2..19aa480 100644 >> --- a/test/unit/rebalancer.result >> +++ b/test/unit/rebalancer.result >> @@ -1008,105 +1008,6 @@ build_routes(replicasets) >> -- the latter is a dispenser. It is a structure which hands out >> -- destination UUIDs in a round-robin manner to worker fibers. >> -- >> -list = rlist.new() >> ---- >> -... >> -list >> ---- >> -- count: 0 >> -... >> -obj1 = {i = 1} >> ---- >> -... >> -rlist.remove(list, obj1) >> ---- >> -... >> -list >> ---- >> -- count: 0 >> -... 
>> -rlist.add_tail(list, obj1) >> ---- >> -... >> -list >> ---- >> -- count: 1 >> - last: &0 >> - i: 1 >> - first: *0 >> -... >> -rlist.remove(list, obj1) >> ---- >> -... >> -list >> ---- >> -- count: 0 >> -... >> -obj1 >> ---- >> -- i: 1 >> -... >> -rlist.add_tail(list, obj1) >> ---- >> -... >> -obj2 = {i = 2} >> ---- >> -... >> -rlist.add_tail(list, obj2) >> ---- >> -... >> -list >> ---- >> -- count: 2 >> - last: &0 >> - i: 2 >> - prev: &1 >> - i: 1 >> - next: *0 >> - first: *1 >> -... >> -obj3 = {i = 3} >> ---- >> -... >> -rlist.add_tail(list, obj3) >> ---- >> -... >> -list >> ---- >> -- count: 3 >> - last: &0 >> - i: 3 >> - prev: &1 >> - i: 2 >> - next: *0 >> - prev: &2 >> - i: 1 >> - next: *1 >> - first: *2 >> -... >> -rlist.remove(list, obj2) >> ---- >> -... >> -list >> ---- >> -- count: 2 >> - last: &0 >> - i: 3 >> - prev: &1 >> - i: 1 >> - next: *0 >> - first: *1 >> -... >> -rlist.remove(list, obj1) >> ---- >> -... >> -list >> ---- >> -- count: 1 >> - last: &0 >> - i: 3 >> - first: *0 >> -... >> d = dispenser.create({uuid = 15}) >> --- >> ... >> diff --git a/test/unit/rebalancer.test.lua >> b/test/unit/rebalancer.test.lua >> index a4e18c1..8087d42 100644 >> --- a/test/unit/rebalancer.test.lua >> +++ b/test/unit/rebalancer.test.lua >> @@ -246,33 +246,6 @@ build_routes(replicasets) >> -- the latter is a dispenser. It is a structure which hands out >> -- destination UUIDs in a round-robin manner to worker fibers. 
>> -- >> -list = rlist.new() >> -list >> - >> -obj1 = {i = 1} >> -rlist.remove(list, obj1) >> -list >> - >> -rlist.add_tail(list, obj1) >> -list >> - >> -rlist.remove(list, obj1) >> -list >> -obj1 >> - >> -rlist.add_tail(list, obj1) >> -obj2 = {i = 2} >> -rlist.add_tail(list, obj2) >> -list >> -obj3 = {i = 3} >> -rlist.add_tail(list, obj3) >> -list >> - >> -rlist.remove(list, obj2) >> -list >> -rlist.remove(list, obj1) >> -list >> - >> d = dispenser.create({uuid = 15}) >> dispenser.pop(d) >> for i = 1, 14 do assert(dispenser.pop(d) == 'uuid', i) end >> diff --git a/test/unit/rlist.result b/test/unit/rlist.result >> new file mode 100644 >> index 0000000..c8aabc0 >> --- /dev/null >> +++ b/test/unit/rlist.result >> @@ -0,0 +1,114 @@ >> +-- test-run result file version 2 >> +-- >> +-- gh-161: parallel rebalancer. One of the most important part of >> the latter is >> +-- a dispenser. It is a structure which hands out destination UUIDs >> in a >> +-- round-robin manner to worker fibers. It uses rlist data structure. >> +-- >> +rlist = require('vshard.rlist') >> + | --- >> + | ... >> + >> +list = rlist.new() >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 0 >> + | ... >> + >> +obj1 = {i = 1} >> + | --- >> + | ... >> +list:remove(obj1) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 0 >> + | ... >> + >> +list:add_tail(obj1) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 1 >> + | last: &0 >> + | i: 1 >> + | first: *0 >> + | ... >> + >> +list:remove(obj1) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 0 >> + | ... >> +obj1 >> + | --- >> + | - i: 1 >> + | ... >> + >> +list:add_tail(obj1) >> + | --- >> + | ... >> +obj2 = {i = 2} >> + | --- >> + | ... >> +list:add_tail(obj2) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 2 >> + | last: &0 >> + | i: 2 >> + | prev: &1 >> + | i: 1 >> + | next: *0 >> + | first: *1 >> + | ... >> +obj3 = {i = 3} >> + | --- >> + | ... >> +list:add_tail(obj3) >> + | --- >> + | ... 
>> +list >> + | --- >> + | - count: 3 >> + | last: &0 >> + | i: 3 >> + | prev: &1 >> + | i: 2 >> + | next: *0 >> + | prev: &2 >> + | i: 1 >> + | next: *1 >> + | first: *2 >> + | ... >> + >> +list:remove(obj2) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 2 >> + | last: &0 >> + | i: 3 >> + | prev: &1 >> + | i: 1 >> + | next: *0 >> + | first: *1 >> + | ... >> +list:remove(obj1) >> + | --- >> + | ... >> +list >> + | --- >> + | - count: 1 >> + | last: &0 >> + | i: 3 >> + | first: *0 >> + | ... >> diff --git a/test/unit/rlist.test.lua b/test/unit/rlist.test.lua >> new file mode 100644 >> index 0000000..db52955 >> --- /dev/null >> +++ b/test/unit/rlist.test.lua >> @@ -0,0 +1,33 @@ >> +-- >> +-- gh-161: parallel rebalancer. One of the most important part of >> the latter is >> +-- a dispenser. It is a structure which hands out destination UUIDs >> in a >> +-- round-robin manner to worker fibers. It uses rlist data structure. >> +-- >> +rlist = require('vshard.rlist') >> + >> +list = rlist.new() >> +list >> + >> +obj1 = {i = 1} >> +list:remove(obj1) >> +list >> + >> +list:add_tail(obj1) >> +list >> + >> +list:remove(obj1) >> +list >> +obj1 >> + >> +list:add_tail(obj1) >> +obj2 = {i = 2} >> +list:add_tail(obj2) >> +list >> +obj3 = {i = 3} >> +list:add_tail(obj3) >> +list >> + >> +list:remove(obj2) >> +list >> +list:remove(obj1) >> +list >> diff --git a/vshard/rlist.lua b/vshard/rlist.lua >> new file mode 100644 >> index 0000000..4be5382 >> --- /dev/null >> +++ b/vshard/rlist.lua >> @@ -0,0 +1,53 @@ >> +-- >> +-- A subset of rlist methods from the main repository. Rlist is a >> +-- doubly linked list, and is used here to implement a queue of >> +-- routes in the parallel rebalancer. 
>> +-- >> +local rlist_mt = {} >> + >> +function rlist_mt.add_tail(rlist, object) >> + local last = rlist.last >> + if last then >> + last.next = object >> + object.prev = last >> + else >> + rlist.first = object >> + end >> + rlist.last = object >> + rlist.count = rlist.count + 1 >> +end >> + >> +function rlist_mt.remove(rlist, object) >> + local prev = object.prev >> + local next = object.next >> + local belongs_to_list = false >> + if prev then >> + belongs_to_list = true >> + prev.next = next >> + end >> + if next then >> + belongs_to_list = true >> + next.prev = prev >> + end >> + object.prev = nil >> + object.next = nil >> + if rlist.last == object then >> + belongs_to_list = true >> + rlist.last = prev >> + end >> + if rlist.first == object then >> + belongs_to_list = true >> + rlist.first = next >> + end >> + if belongs_to_list then >> + rlist.count = rlist.count - 1 >> + end >> +end >> + >> +local function rlist_new() >> + return setmetatable({count = 0}, {__index = rlist_mt}) >> +end >> + >> +return { >> + new = rlist_new, >> +} >> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua >> index 5464824..1b48bf1 100644 >> --- a/vshard/storage/init.lua >> +++ b/vshard/storage/init.lua >> @@ -13,12 +13,13 @@ if rawget(_G, MODULE_INTERNALS) then >> 'vshard.consts', 'vshard.error', 'vshard.cfg', >> 'vshard.replicaset', 'vshard.util', >> 'vshard.storage.reload_evolution', >> - 'vshard.lua_gc', >> + 'vshard.lua_gc', 'vshard.rlist' >> } >> for _, module in pairs(vshard_modules) do >> package.loaded[module] = nil >> end >> end >> +local rlist = require('vshard.rlist') >> local consts = require('vshard.consts') >> local lerror = require('vshard.error') >> local lcfg = require('vshard.cfg') >> @@ -1786,54 +1787,6 @@ local function >> rebalancer_build_routes(replicasets) >> return bucket_routes >> end >> --- >> --- A subset of rlist methods from the main repository. 
Rlist is a >> --- doubly linked list, and is used here to implement a queue of >> --- routes in the parallel rebalancer. >> --- >> -local function rlist_new() >> - return {count = 0} >> -end >> - >> -local function rlist_add_tail(rlist, object) >> - local last = rlist.last >> - if last then >> - last.next = object >> - object.prev = last >> - else >> - rlist.first = object >> - end >> - rlist.last = object >> - rlist.count = rlist.count + 1 >> -end >> - >> -local function rlist_remove(rlist, object) >> - local prev = object.prev >> - local next = object.next >> - local belongs_to_list = false >> - if prev then >> - belongs_to_list = true >> - prev.next = next >> - end >> - if next then >> - belongs_to_list = true >> - next.prev = prev >> - end >> - object.prev = nil >> - object.next = nil >> - if rlist.last == object then >> - belongs_to_list = true >> - rlist.last = prev >> - end >> - if rlist.first == object then >> - belongs_to_list = true >> - rlist.first = next >> - end >> - if belongs_to_list then >> - rlist.count = rlist.count - 1 >> - end >> -end >> - >> -- >> -- Dispenser is a container of routes received from the >> -- rebalancer. Its task is to hand out the routes to worker fibers >> @@ -1842,7 +1795,7 @@ end >> -- receiver nodes. >> -- >> local function route_dispenser_create(routes) >> - local rlist = rlist_new() >> + local rlist = rlist.new() >> local map = {} >> for uuid, bucket_count in pairs(routes) do >> local new = { >> @@ -1873,7 +1826,7 @@ local function route_dispenser_create(routes) >> -- the main applier fiber does some analysis on the >> -- destinations. 
>> map[uuid] = new >> - rlist_add_tail(rlist, new) >> + rlist:add_tail(new) >> end >> return { >> rlist = rlist, >> @@ -1892,7 +1845,7 @@ local function route_dispenser_put(dispenser, >> uuid) >> local bucket_count = dst.bucket_count + 1 >> dst.bucket_count = bucket_count >> if bucket_count == 1 then >> - rlist_add_tail(dispenser.rlist, dst) >> + dispenser.rlist:add_tail(dst) >> end >> end >> end >> @@ -1909,7 +1862,7 @@ local function route_dispenser_skip(dispenser, >> uuid) >> local dst = map[uuid] >> if dst then >> map[uuid] = nil >> - rlist_remove(dispenser.rlist, dst) >> + dispenser.rlist:remove(dst) >> end >> end >> @@ -1952,9 +1905,9 @@ local function route_dispenser_pop(dispenser) >> if dst then >> local bucket_count = dst.bucket_count - 1 >> dst.bucket_count = bucket_count >> - rlist_remove(rlist, dst) >> + rlist:remove(dst) >> if bucket_count > 0 then >> - rlist_add_tail(rlist, dst) >> + rlist:add_tail(dst) >> end >> return dst.uuid >> end >> @@ -2742,11 +2695,6 @@ M.route_dispenser = { >> pop = route_dispenser_pop, >> sent = route_dispenser_sent, >> } >> -M.rlist = { >> - new = rlist_new, >> - add_tail = rlist_add_tail, >> - remove = rlist_remove, >> -} >> M.schema_latest_version = schema_latest_version >> M.schema_current_version = schema_current_version >> M.schema_upgrade_master = schema_upgrade_master
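The new module is in action below, mirroring test/unit/rlist.test.lua. The implementation is the same code as vshard/rlist.lua from the patch, inlined here so the snippet runs with plain Lua and no vshard installation.

```lua
-- Doubly-linked list with O(1) add_tail/remove; objects are linked
-- in place, the list stores no copies.
local rlist_mt = {}

function rlist_mt.add_tail(rlist, object)
    local last = rlist.last
    if last then
        last.next = object
        object.prev = last
    else
        rlist.first = object
    end
    rlist.last = object
    rlist.count = rlist.count + 1
end

function rlist_mt.remove(rlist, object)
    local prev = object.prev
    local next = object.next
    local belongs_to_list = false
    if prev then
        belongs_to_list = true
        prev.next = next
    end
    if next then
        belongs_to_list = true
        next.prev = prev
    end
    object.prev = nil
    object.next = nil
    if rlist.last == object then
        belongs_to_list = true
        rlist.last = prev
    end
    if rlist.first == object then
        belongs_to_list = true
        rlist.first = next
    end
    if belongs_to_list then
        rlist.count = rlist.count - 1
    end
end

local function rlist_new()
    return setmetatable({count = 0}, {__index = rlist_mt})
end

local list = rlist_new()
local obj1, obj2 = {i = 1}, {i = 2}
list:add_tail(obj1)
list:add_tail(obj2)
assert(list.count == 2 and list.first == obj1 and list.last == obj2)
list:remove(obj1)
assert(list.count == 1 and list.first == obj2)
-- Removing an object that is not in the list is a no-op.
list:remove(obj1)
assert(list.count == 1)
```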
Thanks for your answer. Yes, it's fine. LGTM.
On 11/02/2021 01:33, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 09:57, Oleg Babin wrote:
>> Hi! Thanks for your patch! LGTM but I have one question.
>>
>> Maybe it's reasonable to add some timeout in this function?
>>
>> AFAIK test-run terminates tests after 120 seconds of inactivity; it seems too long for such a simple case.
>>
>> But anyway it's up to you.
> test_run:wait_cond() has a default timeout of 1 minute. I decided it
> is fine.
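For context, the pattern behind test-run's wait_cond is a simple poll-until-deadline loop. A standalone sketch (the name and the use of os.clock() are illustrative; the real helper yields via fiber.sleep() between checks):

```lua
-- Poll cond() until it returns true or the deadline passes.
local function wait_cond(cond, timeout)
    timeout = timeout or 60 -- test-run's default, per the mail above
    local deadline = os.clock() + timeout
    repeat
        if cond() then
            return true
        end
    until os.clock() >= deadline
    return false
end

assert(wait_cond(function() return true end) == true)
```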
Thanks for your fixes! LGTM.
On 11/02/2021 01:34, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
>>> diff --git a/test/unit/util.result b/test/unit/util.result
>>> index 096e36f..c4fd84d 100644
>>> --- a/test/unit/util.result
>>> +++ b/test/unit/util.result
>>> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
>>> +do \
>>> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
>>> + f = fiber.create(function() \
>>> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
>>> + end) \
>>> + yield_count = 0 \
>>> + while f:status() ~= 'dead' do \
>>> + yield_count = yield_count + 1 \
>>> + fiber.yield() \
>>> + end \
>>> +end
>>> +---
>> Why can't you use "csw" of fiber.self() instead? Also, is it reliable enough to simply count yields?
> Yup, will work too. See the diff below.
>
> ====================
> diff --git a/test/unit/util.result b/test/unit/util.result
> index c4fd84d..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -111,14 +111,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> ...
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> @@ -151,14 +151,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 4d6cbe9..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -42,14 +42,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
>
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> @@ -63,14 +63,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> ====================
>
>> Could the scheduler skip this fiber on some loop iteration? In other words, won't this test be flaky?
> Nope. Unless the fiber is sleeping on some condition or for a timeout, a plain
> sleep(0), also known as fiber.yield(), won't skip this fiber on the next
> iteration of the loop. But it does not matter once csw is used to count the yields.
>
> Full new patch below.
>
> ====================
> util: introduce yielding table functions
>
> The patch adds functions table_copy_yield and table_minus_yield.
>
> Yielding copy creates a duplicate of a table but yields every
> specified number of keys copied.
>
> Yielding minus removes matching key-value pairs specified in one
> table from another table. It yields every specified number of keys
> passed.
>
> The functions should help to process huge Lua tables (millions of
> elements and more). These are going to be used on the storage in
> the new GC algorithm.
>
> The algorithm will need to keep a route table on the storage, just
> like on the router, but with expiration time for the routes. Since
> bucket count can be millions, it means GC will potentially operate
> on a huge Lua table and could use some yields so as not to block
> TX thread for long.
>
> Needed for #147
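[Editor's aside: the semantics described in the commit message can be sketched in plain Lua. The real util.table_minus_yield/table_copy_yield additionally call fiber.yield() every `limit` processed keys; the yielding is omitted here so the snippet runs outside Tarantool, and the helper names are illustrative.]

```lua
-- Remove from dst every key whose value exactly matches src.
-- A key whose value differs in dst stays untouched.
local function table_minus(dst, src)
    for k, v in pairs(src) do
        if dst[k] == v then
            dst[k] = nil
        end
    end
    return dst
end

-- Shallow copy of src into a fresh table.
local function table_copy(src)
    local res = {}
    for k, v in pairs(src) do
        res[k] = v
    end
    return res
end

local t = table_minus({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222})
assert(t.k1 == nil and t.k2 == 2 and t.k3 == 3)
local c = table_copy(t)
assert(c ~= t and c.k2 == 2 and c.k3 == 3)
```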
>
> diff --git a/test/unit/util.result b/test/unit/util.result
> index 096e36f..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> ---
> ...
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +---
> +...
> +minus_yield({}, {}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k = 1}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +---
> +- k2: 2
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +---
> +- []
> +...
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +---
> +- k1: 1
> +...
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +---
> +- k3: 3
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k4: 4
> + k1: 1
> +...
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +---
> +...
> +copy_yield({}, 1)
> +---
> +- []
> +...
> +copy_yield({k = 1}, 1)
> +---
> +- k: 1
> +...
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +---
> +- k1: 1
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +res
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +t ~= res
> +---
> +- true
> +...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 5f39e06..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
> while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
> test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> +
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +minus_yield({}, {}, 1)
> +minus_yield({}, {k = 1}, 1)
> +minus_yield({}, {k = 1}, 0)
> +minus_yield({k = 1}, {k = 1}, 0)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +copy_yield({}, 1)
> +copy_yield({k = 1}, 1)
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +res
> +t ~= res
> diff --git a/vshard/util.lua b/vshard/util.lua
> index d3b4e67..2362607 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
> return minor >= minor_need
> end
>
> +--
> +-- Copy @a src table. Fiber yields every @a interval keys copied.
> +--
> +local function table_copy_yield(src, interval)
> + local res = {}
> + -- Time-To-Yield.
> + local tty = interval
> + for k, v in pairs(src) do
> + res[k] = v
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return res
> +end
> +
> +--
> +-- Remove @a src keys from @a dst if their values match. Fiber yields every
> +-- @a interval iterations.
> +--
> +local function table_minus_yield(dst, src, interval)
> + -- Time-To-Yield.
> + local tty = interval
> + for k, srcv in pairs(src) do
> + if dst[k] == srcv then
> + dst[k] = nil
> + end
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return dst
> +end
> +
> return {
> tuple_extract_key = tuple_extract_key,
> reloadable_fiber_create = reloadable_fiber_create,
> @@ -160,4 +198,6 @@ return {
> async_task = async_task,
> internal = M,
> version_is_at_least = version_is_at_least,
> + table_copy_yield = table_copy_yield,
> + table_minus_yield = table_minus_yield,
> }
>
Thanks for your fixes. LGTM!
On 11/02/2021 01:34, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 09:59, Oleg Babin wrote:
>> Thanks for your patch!
>>
>> Is it possible to extend log message to "Option is deprecated and has no effect anymore"?
> Good idea. See the diff in this commit.
>
> ====================
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 28c3400..f7d5dbc 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -61,7 +61,13 @@ local function validate_config(config, template, check_arg)
> local expected_type = template_value.type
> if template_value.is_deprecated then
> if value ~= nil then
> - log.warn('Option "%s" is deprecated', name)
> + local reason = template_value.reason
> + if reason then
> + reason = '. '..reason
> + else
> + reason = ''
> + end
> + log.warn('Option "%s" is deprecated'..reason, name)
> end
> elseif value == nil then
> if not template_value.is_optional then
> ====================
>
> And in the next commit:
>
> ====================
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index f7dd4c1..63d5414 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -252,6 +252,7 @@ local cfg_template = {
> },
> collect_bucket_garbage_interval = {
> name = 'Garbage bucket collect interval', is_deprecated = true,
> + reason = 'Has no effect anymore'
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
>
> ====================
>
>> Also, for some options a hint could be useful: "Option is deprecated, use ... instead" (e.g. for "weights").
> With the updated version I can specify any 'reason', such as
> 'has no effect', 'use ... instead', etc.
>
>> Seems it should be more configurable and give the user some hint about what to do.
>>
>>
>> On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
>>> Some options in vshard are going to be eventually deprecated. For
>>> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
>> typo: weigts -> weights
> Fixed. See the full new patch below.
>
> ====================
> cfg: introduce 'deprecated option' feature
>
> Some options in vshard are going to be eventually deprecated. For
> instance, 'weights' will be renamed, 'collect_lua_garbage' may be
> deleted since it appears not to be so useful, 'sync_timeout' is
> totally unnecessary since any 'sync' can take a timeout per-call.
>
> But the patch is motivated by 'collect_bucket_garbage_interval'
> which is going to become unused in the new GC algorithm.
>
> New GC will be reactive instead of proactive. Instead of periodic
> polling of _bucket space it will react on needed events
> immediately. This will make the 'collect interval' unused.
>
> The option will be deprecated and eventually in some far future
> release its usage will lead to an error.
>
> Needed for #147
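The warning produced by a deprecated template entry can be sketched standalone. A minimal model, assuming the template fields from the diff below (is_deprecated, reason); log.warn() is replaced with plain string building, since this runs outside Tarantool.

```lua
-- Build the deprecation message the way validate_config() does:
-- append the optional reason after a period, then substitute the name.
local function deprecation_warning(template_value, name)
    local reason = template_value.reason
    if reason then
        reason = '. ' .. reason
    else
        reason = ''
    end
    return string.format('Option "%s" is deprecated' .. reason, name)
end

local entry = {is_deprecated = true, reason = 'Has no effect anymore'}
print(deprecation_warning(entry, 'Garbage bucket collect interval'))
-- Option "Garbage bucket collect interval" is deprecated. Has no effect anymore
print(deprecation_warning({is_deprecated = true}, 'Some option'))
-- Option "Some option" is deprecated
```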
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 1ef1899..f7d5dbc 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -59,7 +59,17 @@ local function validate_config(config, template, check_arg)
> local value = config[key]
> local name = template_value.name
> local expected_type = template_value.type
> - if value == nil then
> + if template_value.is_deprecated then
> + if value ~= nil then
> + local reason = template_value.reason
> + if reason then
> + reason = '. '..reason
> + else
> + reason = ''
> + end
> + log.warn('Option "%s" is deprecated'..reason, name)
> + end
> + elseif value == nil then
> if not template_value.is_optional then
> error(string.format('%s must be specified', name))
> else
>
> ====================
Thanks for your fixes! LGTM.
On 11/02/2021 01:35, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 10:00, Oleg Babin wrote:
>> Thanks for your patch.
>>
>> As I see you've introduced some new parameters: "LUA_CHUNK_SIZE" and "GC_BACKOFF_INTERVAL".
> I decided not to go into too much detail and not describe private
> constants in the commit message. GC_BACKOFF_INTERVAL is explained
> in the place where it is used. LUA_CHUNK_SIZE is quite obvious if
> you look at its usage.
>
>> I think it's better to describe them in the commit message to make it clearer how the new algorithm works.
> These constants are not super relevant to the algorithm's core
> idea. It does not matter much for the reactive GC concept if I
> yield in table utility functions, or if I have a backoff timeout.
> These could be considered 'optimizations', 'amendments'. I would
> consider them small details not worth mentioning in the commit
> message.
>
>> I see that you didn't update the comment above the "gc_bucket_f" function. Is it still relevant?
> No, irrelevant, thanks for noticing. Here is the diff:
>
> ====================
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 99f92a0..1ea8069 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1543,14 +1543,16 @@ local function gc_bucket_drop(status, route_map)
> end
>
> --
> --- Garbage collector. Works on masters. The garbage collector
> --- wakes up once per specified time.
> +-- Garbage collector. Works on masters. The garbage collector wakes up when
> +-- state of any bucket changes.
> -- After wakeup it follows the plan:
> --- 1) Check if _bucket has changed. If not, then sleep again;
> --- 2) Scan user spaces for sent and garbage buckets, delete
> --- garbage data in batches of limited size;
> --- 3) Delete GARBAGE buckets from _bucket immediately, and
> --- schedule SENT buckets for deletion after a timeout;
> +-- 1) Check if state of any bucket has really changed. If not, then sleep again;
> +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
> +-- limited size.
> +-- 3) Bucket destinations are saved into a global route_map to reroute incoming
> +-- requests from routers in case they didn't notice the buckets being moved.
> +-- The saved routes are scheduled for deletion after a timeout, which is
> +-- checked on each iteration of this loop.
> -- 4) Sleep, go to (1).
> -- For each step details see comments in the code.
> --
> ====================
>
> The full new patch below.
>
> ====================
> gc: introduce reactive garbage collector
>
> Garbage collector is a fiber on a master node which deletes
> GARBAGE and SENT buckets along with their data.
>
> It was proactive. It used to wake up with a constant period to
> find and delete the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> GC algorithm if a bucket is sent, it won't be deleted for the next
> 5 seconds by default. During this time all new map-reduce requests
> can't execute.
>
> This is not acceptable. Neither is too frequent wakeup of the GC
> fiber, because it would waste TX thread time.
>
> The patch makes the GC fiber wake up not on a timeout but on events
> happening with the _bucket space. The GC fiber sleeps on a condition
> variable which is signaled when _bucket is changed.
>
> Once GC sees work to do, it won't sleep until it is done. It will
> only yield.
>
> This makes GC delete SENT and GARBAGE buckets as soon as possible,
> reducing the waiting time for the incoming map-reduce requests.
>
> Needed for #147
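The reactive wakeup scheme can be sketched standalone. A minimal model under stated assumptions: Tarantool's fiber.cond() wait/broadcast is reduced to a generation-counter comparison, which is the core of the pattern (the on_replace trigger bumps the generation, and the GC loop only sleeps when it has already seen the current generation).

```lua
-- Generation counter bumped by the (hypothetical) _bucket:on_replace trigger.
local bucket_generation = 0

local function on_bucket_change()
    bucket_generation = bucket_generation + 1
    -- In vshard this is where bucket_generation_cond:broadcast() happens.
end

-- One iteration of the GC loop: do work only if something changed,
-- otherwise go back to sleep (cond:wait() in the real code).
local seen_generation = 0
local function gc_step()
    if bucket_generation == seen_generation then
        return 'sleep'
    end
    seen_generation = bucket_generation
    return 'collect'
end

assert(gc_step() == 'sleep')   -- nothing happened yet
on_bucket_change()
assert(gc_step() == 'collect') -- reacts immediately to the change
assert(gc_step() == 'sleep')   -- and sleeps again once caught up
```

Because the generation is re-checked after each collection pass, GC keeps working without sleeping until it is fully caught up, matching the "won't sleep until it is done" behavior described above.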
>
> @TarantoolBot document
> Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
> It was used to specify the interval between bucket garbage
> collection steps. It was needed because garbage collection in
> vshard was proactive. It didn't react to newly appeared garbage
> buckets immediately.
>
> Since 0.1.17 garbage collection is reactive. It starts working on
> garbage buckets immediately as they appear, and sleeps the rest of
> the time. The option is no longer used and does not affect the
> behaviour of anything.
>
> I suppose it can be deleted from the documentation. Or left with
> a big label 'deprecated' + the explanation above.
>
> An attempt to use the option does not cause an error, but logs a
> warning.
>
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 21409bd..8df89f6 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
> return true
> end
> vshard.storage.recovery_wakeup()
> - vshard.storage.garbage_collector_wakeup()
> end)
> end
> diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
> index 168be5d..3b34841 100644
> --- a/test/misc/reconfigure.result
> +++ b/test/misc/reconfigure.result
> @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> ---
> ...
> -cfg.collect_bucket_garbage_interval = 100
> ----
> -...
> cfg.invalid_option = 'kek'
> ---
> ...
> @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
> ---
> - true
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> ----
> -- true
> -...
> cfg.sync_timeout = nil
> ---
> ...
> @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> ---
> ...
> -cfg.collect_bucket_garbage_interval = nil
> ----
> -...
> cfg.invalid_option = nil
> ---
> ...
> diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
> index e891010..348628c 100644
> --- a/test/misc/reconfigure.test.lua
> +++ b/test/misc/reconfigure.test.lua
> @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
> cfg.sync_timeout = 100
> cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> -cfg.collect_bucket_garbage_interval = 100
> cfg.invalid_option = 'kek'
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> not vshard.storage.internal.collect_lua_garbage
> vshard.storage.internal.sync_timeout
> vshard.storage.internal.rebalancer_max_receiving ~= 1000
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> cfg.sync_timeout = nil
> cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> -cfg.collect_bucket_garbage_interval = nil
> cfg.invalid_option = nil
>
> --
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b8fc7ff..9df7480 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
> - true
> ...
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> vshard.storage.buckets_info(1)
> ---
> - 1:
> @@ -203,7 +200,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> ---
> @@ -235,14 +231,6 @@ finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> ---
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: garbage
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 213ced3..1b032ff 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
> vshard.storage.bucket_ref(1, 'read')
> vshard.storage.bucket_unref(1, 'read')
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> vshard.storage.buckets_info(1)
> _ = test_run:cmd("setopt delimiter ';'")
> while true do
> @@ -64,7 +63,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> _ = test_run:cmd("setopt delimiter ''");
> @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
> vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index e50eb72..0ddb1c9 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -226,17 +226,6 @@ ret2, err2
> - true
> - null
> ...
> -_bucket:get{35}
> ----
> -- [35, 'sent', '<replicaset_2>']
> -...
> -_bucket:get{36}
> ----
> -- [36, 'sent', '<replicaset_2>']
> -...
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> ---
> ...
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 2cc4a69..a60f3d7 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
> while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
> ret1, err1
> ret2, err2
> -_bucket:get{35}
> -_bucket:get{36}
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index 7d3612b..ad93445 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> ---
> - true
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_1>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 24534b3..2cf6382 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -136,7 +136,6 @@ box.space.test3:select{100}
> -- Now the bucket is unreferenced and can be transferred.
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 753687f..9d30a04 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
> ...
> vshard.storage.internal.reload_version
> ---
> -- 2
> +- 3
> ...
> --
> -- gh-237: should be only one trigger. During gh-237 the trigger installation
> diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
> index 049bdef..ac340eb 100644
> --- a/test/router/reroute_wrong_bucket.result
> +++ b/test/router/reroute_wrong_bucket.result
> @@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> @@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
> err
> ---
> - bucket_id: 100
> - reason: write is prohibited
> + reason: Not found
> code: 1
> destination: ac522f65-aa94-4134-9f64-51ee384f1a54
> type: ShardingError
> name: WRONG_BUCKET
> - message: 'Cannot perform action with bucket 100, reason: write is prohibited'
> + message: 'Cannot perform action with bucket 100, reason: Not found'
> ...
> --
> -- Now try again, but update configuration during call(). It must
> diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
> index 9e6e804..207aac3 100644
> --- a/test/router/reroute_wrong_bucket.test.lua
> +++ b/test/router/reroute_wrong_bucket.test.lua
> @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
> test_run:cmd('create server router_1 with script="router/router_1.lua"')
> test_run:cmd('start server router_1')
> test_run:switch('storage_1_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> vshard.storage.rebalancer_disable()
> for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
>
> test_run:switch('storage_2_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> vshard.storage.rebalancer_disable()
> for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index f833fe7..8ccb0b9 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -79,8 +79,7 @@ _bucket = box.space._bucket
> ...
> _bucket:select{}
> ---
> -- - [2, 'garbage', '<replicaset_2>']
> - - [3, 'garbage', '<replicaset_2>']
> +- []
> ...
> _ = test_run:switch('storage_2_a')
> ---
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 424bc4c..0550ad1 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> ---
> - true
> ...
> +wait_bucket_is_collected(1)
> +---
> +...
> _ = test_run:switch("storage_2_a")
> ---
> ...
> @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
> ...
> vshard.storage.buckets_info()
> ---
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> - 2:
> +- 2:
> status: active
> id: 2
> ...
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index d631b51..d8fbd94 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
>
> -- Successful transfer.
> vshard.storage.bucket_send(1, util.replicasets[2])
> +wait_bucket_is_collected(1)
> _ = test_run:switch("storage_2_a")
> vshard.storage.buckets_info()
> _ = test_run:switch("storage_1_a")
> diff --git a/test/unit/config.result b/test/unit/config.result
> index dfd0219..e0b2482 100644
> --- a/test/unit/config.result
> +++ b/test/unit/config.result
> @@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 0
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = -1
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 100.5
> ----
> -...
> -_ = lcfg.check(cfg)
> ----
> -...
> cfg.collect_lua_garbage = 100
> ---
> ...
> @@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> ---
> ...
> -cfg.sharding = nil
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +---
> +...
> +_ = lcfg.check(cfg)
> ---
> ...
> diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
> index ada43db..a1c9f07 100644
> --- a/test/unit/config.test.lua
> +++ b/test/unit/config.test.lua
> @@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 0
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = -1
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 100.5
> -_ = lcfg.check(cfg)
> -
> cfg.collect_lua_garbage = 100
> check(cfg)
> cfg.collect_lua_garbage = true
> @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
> cfg.rebalancer_max_sending = 15
> lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> -cfg.sharding = nil
> +
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +_ = lcfg.check(cfg)
> diff --git a/test/unit/garbage.result b/test/unit/garbage.result
> index 74d9ccf..a530496 100644
> --- a/test/unit/garbage.result
> +++ b/test/unit/garbage.result
> @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
> vshard.storage.internal.shard_index = 'bucket_id'
> ---
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> ----
> -...
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> -- by it, or bucket_id is not unsigned.
> @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> ---
> ...
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> +---
> +...
> _bucket = box.schema.create_space('_bucket', {format = format})
> ---
> ...
> @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ---
> - [3, 'active']
> ...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [6, 'garbage']
> -...
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [200, 'garbage']
> -...
> s = box.schema.create_space('test', {engine = engine})
> ---
> ...
> @@ -213,7 +197,7 @@ s:replace{4, 2}
> ---
> - [4, 2]
> ...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> ---
> ...
> s2 = box.schema.create_space('test2', {engine = engine})
> @@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> ---
> ...
> @@ -267,12 +255,22 @@ fill_spaces_with_garbage()
> ---
> - 1107
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - null
> + - null
> + - destination2
> ...
> #s2:select{}
> ---
> @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ---
> - 7
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - destination1
> ...
> s2:select{}
> ---
> @@ -303,17 +311,22 @@ s:select{}
> - [6, 100]
> ...
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- []
> ...
> #s2:select{}
> ---
> @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> fill_spaces_with_garbage()
> ---
> ...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> ---
> ...
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -360,7 +378,6 @@ _bucket:select{}
> - - [1, 'active']
> - [2, 'receiving']
> - [3, 'active']
> - - [4, 'sent']
> ...
> --
> -- Test deletion of 'sent' buckets after a specified timeout.
> @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
> - [2, 'sent']
> ...
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
> ---
> - [4, 'sent']
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -434,11 +453,14 @@ s:replace{6, 4}
> ---
> - [6, 4]
> ...
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> ---
> +- Error during garbage collection step
> ...
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -454,8 +476,9 @@ _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> ---
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> f:cancel()
> ---
> @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
> index 30079fa..250afb0 100644
> --- a/test/unit/garbage.test.lua
> +++ b/test/unit/garbage.test.lua
> @@ -15,7 +15,6 @@ end;
> test_run:cmd("setopt delimiter ''");
>
> vshard.storage.internal.shard_index = 'bucket_id'
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
>
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> @@ -75,16 +74,13 @@ s:drop()
> format = {}
> format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> _bucket = box.schema.create_space('_bucket', {format = format})
> _ = _bucket:create_index('pk')
> _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> _bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
>
> s = box.schema.create_space('test', {engine = engine})
> pk = s:create_index('pk')
> @@ -94,7 +90,7 @@ s:replace{2, 1}
> s:replace{3, 2}
> s:replace{4, 2}
>
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> s2 = box.schema.create_space('test2', {engine = engine})
> pk2 = s2:create_index('pk')
> sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> @@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> test_run:cmd("setopt delimiter ''");
>
> @@ -121,15 +121,21 @@ fill_spaces_with_garbage()
>
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +route_map
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> s2:select{}
> s:select{}
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> #s2:select{}
> #s:select{}
>
> @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -- Test continuous garbage collection via background fiber.
> --
> fill_spaces_with_garbage()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> s:select{}
> s2:select{}
> -- Check garbage bucket is deleted by background fiber.
> @@ -150,7 +160,7 @@ _bucket:select{}
> --
> _bucket:replace{2, vshard.consts.BUCKET.SENT}
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> _bucket:select{}
> s:select{}
> s2:select{}
> @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
> s:replace{5, 4}
> s:replace{6, 4}
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> s:replace{5, 4}
> s:replace{6, 4}
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> s:select{}
> _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> f:cancel()
>
> @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> #s:select{}
> #s2:select{}
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> _bucket:select{}
> s:select{}
> s2:select{}
> diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
> deleted file mode 100644
> index 92c8039..0000000
> --- a/test/unit/garbage_errinj.result
> +++ /dev/null
> @@ -1,223 +0,0 @@
> -test_run = require('test_run').new()
> ----
> -...
> -vshard = require('vshard')
> ----
> -...
> -fiber = require('fiber')
> ----
> -...
> -engine = test_run:get_cfg('engine')
> ----
> -...
> -vshard.storage.internal.shard_index = 'bucket_id'
> ----
> -...
> -format = {}
> ----
> -...
> -format[1] = {name = 'id', type = 'unsigned'}
> ----
> -...
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> ----
> -...
> -_bucket = box.schema.create_space('_bucket', {format = format})
> ----
> -...
> -_ = _bucket:create_index('pk')
> ----
> -...
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> ----
> -...
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [1, 'active']
> -...
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> ----
> -- [2, 'receiving']
> -...
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [3, 'active']
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -s = box.schema.create_space('test', {engine = engine})
> ----
> -...
> -pk = s:create_index('pk')
> ----
> -...
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s:replace{2, 1}
> ----
> -- [2, 1]
> -...
> -s:replace{3, 2}
> ----
> -- [3, 2]
> -...
> -s:replace{4, 2}
> ----
> -- [4, 2]
> -...
> -s:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s:replace{6, 100}
> ----
> -- [6, 100]
> -...
> -s:replace{7, 4}
> ----
> -- [7, 4]
> -...
> -s:replace{8, 5}
> ----
> -- [8, 5]
> -...
> -s2 = box.schema.create_space('test2', {engine = engine})
> ----
> -...
> -pk2 = s2:create_index('pk')
> ----
> -...
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s2:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s2:replace{3, 3}
> ----
> -- [3, 3]
> -...
> -for i = 7, 1107 do s:replace{i, 200} end
> ----
> -...
> -s2:replace{4, 200}
> ----
> -- [4, 200]
> -...
> -s2:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s2:replace{5, 300}
> ----
> -- [5, 300]
> -...
> -s2:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -s2:replace{7, 5}
> ----
> -- [7, 5]
> -...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> ----
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- - 4
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 5
> -- true
> -...
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> ----
> -...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> ----
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> ----
> -...
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> ----
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [4, 'garbage']
> -...
> -s:replace{5, 4}
> ----
> -- [5, 4]
> -...
> -s:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -#s:select{}
> ----
> -- 2
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> ----
> -...
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> ----
> -...
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> ----
> -- 2
> -...
> -_bucket:select{4}
> ----
> -- - [4, 'garbage']
> -...
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- []
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 4
> - - 5
> -- true
> -...
> -#s:select{}
> ----
> -- 0
> -...
> -_bucket:delete{4}
> ----
> -- [4, 'garbage']
> -...
> -s2:drop()
> ----
> -...
> -s:drop()
> ----
> -...
> -_bucket:drop()
> ----
> -...
> diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
> deleted file mode 100644
> index 31184b9..0000000
> --- a/test/unit/garbage_errinj.test.lua
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -test_run = require('test_run').new()
> -vshard = require('vshard')
> -fiber = require('fiber')
> -
> -engine = test_run:get_cfg('engine')
> -vshard.storage.internal.shard_index = 'bucket_id'
> -
> -format = {}
> -format[1] = {name = 'id', type = 'unsigned'}
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> -_bucket = box.schema.create_space('_bucket', {format = format})
> -_ = _bucket:create_index('pk')
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -
> -s = box.schema.create_space('test', {engine = engine})
> -pk = s:create_index('pk')
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s:replace{1, 1}
> -s:replace{2, 1}
> -s:replace{3, 2}
> -s:replace{4, 2}
> -s:replace{5, 100}
> -s:replace{6, 100}
> -s:replace{7, 4}
> -s:replace{8, 5}
> -
> -s2 = box.schema.create_space('test2', {engine = engine})
> -pk2 = s2:create_index('pk')
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s2:replace{1, 1}
> -s2:replace{3, 3}
> -for i = 7, 1107 do s:replace{i, 200} end
> -s2:replace{4, 200}
> -s2:replace{5, 100}
> -s2:replace{5, 300}
> -s2:replace{6, 4}
> -s2:replace{7, 5}
> -
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> -s:replace{5, 4}
> -s:replace{6, 4}
> -#s:select{}
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> -_bucket:select{4}
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -#s:select{}
> -_bucket:delete{4}
> -
> -s2:drop()
> -s:drop()
> -_bucket:drop()
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index f7d5dbc..63d5414 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -251,9 +251,8 @@ local cfg_template = {
> max = consts.REBALANCER_MAX_SENDING_MAX
> },
> collect_bucket_garbage_interval = {
> - type = 'positive number', name = 'Garbage bucket collect interval',
> - is_optional = true,
> - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> + name = 'Garbage bucket collect interval', is_deprecated = true,
> + reason = 'Has no effect anymore'
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 8c2a8b0..3f1585a 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -23,6 +23,7 @@ return {
> DEFAULT_BUCKET_COUNT = 3000;
> BUCKET_SENT_GARBAGE_DELAY = 0.5;
> BUCKET_CHUNK_SIZE = 1000;
> + LUA_CHUNK_SIZE = 100000,
> DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
> REBALANCER_IDLE_INTERVAL = 60 * 60;
> REBALANCER_WORK_INTERVAL = 10;
> @@ -37,7 +38,7 @@ return {
> DEFAULT_FAILOVER_PING_TIMEOUT = 5;
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
> + GC_BACKOFF_INTERVAL = 5,
> RECOVERY_INTERVAL = 5;
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> @@ -45,4 +46,6 @@ return {
> DISCOVERY_WORK_INTERVAL = 1,
> DISCOVERY_WORK_STEP = 0.01,
> DISCOVERY_TIMEOUT = 10,
> +
> + TIMEOUT_INFINITY = 500 * 365 * 86400,
> }
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index adf1c20..1ea8069 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -69,7 +69,6 @@ if not M then
> total_bucket_count = 0,
> errinj = {
> ERRINJ_CFG = false,
> - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
> ERRINJ_RELOAD = false,
> ERRINJ_CFG_DELAY = false,
> ERRINJ_LONG_RECEIVE = false,
> @@ -96,6 +95,8 @@ if not M then
> -- detect that _bucket was not changed between yields.
> --
> bucket_generation = 0,
> + -- Condition variable fired on generation update.
> + bucket_generation_cond = lfiber.cond(),
> --
> -- Reference to the function used as on_replace trigger on
> -- _bucket space. It is used to replace the trigger with
> @@ -107,12 +108,14 @@ if not M then
> -- replace the old function is to keep its reference.
> --
> bucket_on_replace = nil,
> + -- Redirects for recently sent buckets. They are kept for a while to
> + -- help routers to find a new location for sent and deleted buckets
> + -- without whole cluster scan.
> + route_map = {},
>
> ------------------- Garbage collection -------------------
> -- Fiber to remove garbage buckets data.
> collect_bucket_garbage_fiber = nil,
> - -- Do buckets garbage collection once per this time.
> - collect_bucket_garbage_interval = nil,
> -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> @@ -173,6 +176,7 @@ end
> --
> local function bucket_generation_increment()
> M.bucket_generation = M.bucket_generation + 1
> + M.bucket_generation_cond:broadcast()
> end
>
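A side note on the hunk above: the GC wakeup mechanism is a generation counter paired with a condition variable, so producers bump-and-broadcast while the collector sleeps until a change or a timeout. A rough Python sketch of the same pattern (names are illustrative, this is not vshard's actual API):

```python
import threading

class BucketGC:
    """Generation counter + condition variable, as in the patch above.
    Producers bump the generation and broadcast; the collector sleeps
    until something changes or a timeout expires."""

    def __init__(self):
        self.generation = 0
        self.collected = -1
        self.cond = threading.Condition()

    def bump(self):
        # Analogue of bucket_generation_increment(): any _bucket change
        # (or a module reload) wakes the sleeping GC worker.
        with self.cond:
            self.generation += 1
            self.cond.notify_all()

    def step(self, timeout):
        # One GC iteration: returns True if there was a change to notice.
        with self.cond:
            if self.collected == self.generation:
                self.cond.wait(timeout)
            changed = self.collected != self.generation
            self.collected = self.generation
            return changed
```

The same bump is reused on reload (see the `bucket_generation_increment()` call added in the reload branch further below in the patch), so sleeping fibers notice both data changes and code reloads.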
> --
> @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
> else
> return bucket
> end
> + local dst = bucket and bucket.destination or M.route_map[bucket_id]
> return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
> - bucket and bucket.destination)
> + dst)
> end
>
> --
> @@ -804,11 +809,23 @@ end
> --
> local function bucket_unrefro(bucket_id)
> local ref = M.bucket_refs[bucket_id]
> - if not ref or ref.ro == 0 then
> + local count = ref and ref.ro or 0
> + if count == 0 then
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
> "no refs", nil)
> end
> - ref.ro = ref.ro - 1
> + if count == 1 then
> + ref.ro = 0
> + if ref.ro_lock then
> + -- Garbage collector is waiting for the bucket if RO
> + -- is locked. Let it know it has one more bucket to
> + -- collect. It relies on generation, so its increment
> + -- is enough.
> + bucket_generation_increment()
> + end
> + return true
> + end
> + ref.ro = count - 1
> return true
> end
>
> @@ -1481,79 +1498,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
> end
>
> --
> --- Remove tuples from buckets of a specified type.
> --- @param type Type of buckets to gc.
> --- @retval List of ids of empty buckets of the type.
> +-- Drop buckets with the given status along with their data in all spaces.
> +-- @param status Status of target buckets.
> +-- @param route_map Destinations of deleted buckets are saved into this table.
> --
> -local function gc_bucket_step_by_type(type)
> - local sharded_spaces = find_sharded_spaces()
> - local empty_buckets = {}
> +local function gc_bucket_drop_xc(status, route_map)
> local limit = consts.BUCKET_CHUNK_SIZE
> - local is_all_collected = true
> - for _, bucket in box.space._bucket.index.status:pairs(type) do
> - local bucket_id = bucket.id
> - local ref = M.bucket_refs[bucket_id]
> + local _bucket = box.space._bucket
> + local sharded_spaces = find_sharded_spaces()
> + for _, b in _bucket.index.status:pairs(status) do
> + local id = b.id
> + local ref = M.bucket_refs[id]
> if ref then
> assert(ref.rw == 0)
> if ref.ro ~= 0 then
> ref.ro_lock = true
> - is_all_collected = false
> goto continue
> end
> - M.bucket_refs[bucket_id] = nil
> + M.bucket_refs[id] = nil
> end
> for _, space in pairs(sharded_spaces) do
> - gc_bucket_in_space_xc(space, bucket_id, type)
> + gc_bucket_in_space_xc(space, id, status)
> limit = limit - 1
> if limit == 0 then
> lfiber.sleep(0)
> limit = consts.BUCKET_CHUNK_SIZE
> end
> end
> - table.insert(empty_buckets, bucket.id)
> -::continue::
> + route_map[id] = b.destination
> + _bucket:delete{id}
> + ::continue::
> end
> - return empty_buckets, is_all_collected
> -end
> -
> ---
> --- Drop buckets with ids in the list.
> --- @param bucket_ids Bucket ids to drop.
> --- @param status Expected bucket status.
> ---
> -local function gc_bucket_drop_xc(bucket_ids, status)
> - if #bucket_ids == 0 then
> - return
> - end
> - local limit = consts.BUCKET_CHUNK_SIZE
> - box.begin()
> - local _bucket = box.space._bucket
> - for _, id in pairs(bucket_ids) do
> - local bucket_exists = _bucket:get{id} ~= nil
> - local b = _bucket:get{id}
> - if b then
> - if b.status ~= status then
> - return error(string.format('Bucket %d status is changed. Was '..
> - '%s, became %s', id, status,
> - b.status))
> - end
> - _bucket:delete{id}
> - end
> - limit = limit - 1
> - if limit == 0 then
> - box.commit()
> - box.begin()
> - limit = consts.BUCKET_CHUNK_SIZE
> - end
> - end
> - box.commit()
> end
>
> --
> -- Exception safe version of gc_bucket_drop_xc.
> --
> -local function gc_bucket_drop(bucket_ids, status)
> - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
> +local function gc_bucket_drop(status, route_map)
> + local status, err = pcall(gc_bucket_drop_xc, status, route_map)
> if not status then
> box.rollback()
> end
> @@ -1561,14 +1543,16 @@ local function gc_bucket_drop(bucket_ids, status)
> end
>
> --
> --- Garbage collector. Works on masters. The garbage collector
> --- wakes up once per specified time.
> +-- Garbage collector. Works on masters. The garbage collector wakes up when
> +-- state of any bucket changes.
> -- After wakeup it follows the plan:
> --- 1) Check if _bucket has changed. If not, then sleep again;
> --- 2) Scan user spaces for sent and garbage buckets, delete
> --- garbage data in batches of limited size;
> --- 3) Delete GARBAGE buckets from _bucket immediately, and
> --- schedule SENT buckets for deletion after a timeout;
> +-- 1) Check if state of any bucket has really changed. If not, then sleep again;
> +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
> +-- limited size.
> +-- 3) Bucket destinations are saved into a global route_map to reroute incoming
> +-- requests from routers in case they didn't notice the buckets being moved.
> +-- The saved routes are scheduled for deletion after a timeout, which is
> +-- checked on each iteration of this loop.
> -- 4) Sleep, go to (1).
> -- For each step details see comments in the code.
> --
> @@ -1580,65 +1564,75 @@ function gc_bucket_f()
> -- generation == bucket generation. In such a case the fiber
> -- does nothing until next _bucket change.
> local bucket_generation_collected = -1
> - -- Empty sent buckets are collected into an array. After a
> - -- specified time interval the buckets are deleted both from
> - -- this array and from _bucket space.
> - local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = fiber_clock()
> - -- Empty sent buckets, updated after each step, and when
> - -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> - -- for next deletion.
> - local empty_garbage_buckets, empty_sent_buckets, status, err
> + local bucket_generation_current = M.bucket_generation
> + -- Deleted buckets are saved into a route map to redirect routers if they
> + -- didn't discover new location of the buckets yet. However route map does
> + -- not grow infinitely. Otherwise it would end up storing redirects for all
> + -- buckets in the cluster, which could also become outdated.
> + -- Garbage collector periodically drops old routes from the map. For that it
> + -- remembers state of route map in one moment, and after a while clears the
> + -- remembered routes from the global route map.
> + local route_map = M.route_map
> + local route_map_old = {}
> + local route_map_deadline = 0
> + local status, err
> while M.module_version == module_version do
> - -- Check if no changes in buckets configuration.
> - if bucket_generation_collected ~= M.bucket_generation then
> - local bucket_generation = M.bucket_generation
> - local is_sent_collected, is_garbage_collected
> - status, empty_garbage_buckets, is_garbage_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
> - if not status then
> - err = empty_garbage_buckets
> - goto check_error
> - end
> - status, empty_sent_buckets, is_sent_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
> - if not status then
> - err = empty_sent_buckets
> - goto check_error
> + if bucket_generation_collected ~= bucket_generation_current then
> + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
> + if status then
> + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
> end
> - status, err = gc_bucket_drop(empty_garbage_buckets,
> - consts.BUCKET.GARBAGE)
> -::check_error::
> if not status then
> box.rollback()
> log.error('Error during garbage collection step: %s', err)
> - goto continue
> + else
> + -- Don't use global generation. During the collection it could
> + -- already change. Instead, remember the generation known before
> + -- the collection has started.
> + -- Since the collection also changes the generation, it makes
> + -- the GC happen always at least twice. But typically on the
> + -- second iteration it should not find any buckets to collect,
> + -- and then the collected generation matches the global one.
> + bucket_generation_collected = bucket_generation_current
> end
> - if is_sent_collected and is_garbage_collected then
> - bucket_generation_collected = bucket_generation
> + else
> + status = true
> + end
> +
> + local sleep_time = route_map_deadline - fiber_clock()
> + if sleep_time <= 0 then
> + local chunk = consts.LUA_CHUNK_SIZE
> + util.table_minus_yield(route_map, route_map_old, chunk)
> + route_map_old = util.table_copy_yield(route_map, chunk)
> + if next(route_map_old) then
> + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> end
> + route_map_deadline = fiber_clock() + sleep_time
> end
> + bucket_generation_current = M.bucket_generation
>
> - if fiber_clock() - buckets_for_redirect_ts >=
> - consts.BUCKET_SENT_GARBAGE_DELAY then
> - status, err = gc_bucket_drop(buckets_for_redirect,
> - consts.BUCKET.SENT)
> - if not status then
> - buckets_for_redirect = {}
> - empty_sent_buckets = {}
> - bucket_generation_collected = -1
> - log.error('Error during deletion of empty sent buckets: %s',
> - err)
> - elseif M.module_version ~= module_version then
> - return
> + if bucket_generation_current ~= bucket_generation_collected then
> + -- Generation was changed during collection. Or *by* collection.
> + if status then
> + -- Retry immediately. If the generation was changed by the
> + -- collection itself, it will notice it next iteration, and go
> + -- to proper sleep.
> + sleep_time = 0
> else
> - buckets_for_redirect = empty_sent_buckets or {}
> - empty_sent_buckets = nil
> - buckets_for_redirect_ts = fiber_clock()
> + -- An error happened during the collection. Does not make sense
> + -- to retry on each iteration of the event loop. The most likely
> + -- errors are either a WAL error or a transaction abort - both
> + -- look like an issue in the user's code and can't be fixed
> + -- quickly anyway. Backoff.
> + sleep_time = consts.GC_BACKOFF_INTERVAL
> end
> end
> -::continue::
> - lfiber.sleep(M.collect_bucket_garbage_interval)
> +
> + if M.module_version == module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> + end
> end
> end
>
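To make the route-map expiry in `gc_bucket_f()` above easier to follow: the map is aged with two tables. Each pass first deletes the routes remembered on the previous pass, then snapshots the survivors for the next one, so every route lives for roughly one delay interval. A simplified Python sketch (`table_minus_yield`/`table_copy_yield` collapsed into plain dict operations, constants illustrative):

```python
def expire_routes(route_map, route_map_old):
    """One expiry pass over the global route map. Routes already
    snapshotted on the previous pass are dropped; the rest are
    snapshotted for the next pass. Returns the new snapshot and how
    long the GC should sleep before the next pass."""
    BUCKET_SENT_GARBAGE_DELAY = 0.5
    TIMEOUT_INFINITY = float('inf')
    for bucket_id in route_map_old:
        route_map.pop(bucket_id, None)   # older than one delay: forget
    snapshot = dict(route_map)           # remember the survivors
    sleep = BUCKET_SENT_GARBAGE_DELAY if snapshot else TIMEOUT_INFINITY
    return snapshot, sleep
```

With an empty snapshot the sleep becomes effectively infinite, which is why the fiber only wakes up again via the generation condition variable.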
> @@ -2423,8 +2417,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> vshard_cfg.rebalancer_disbalance_threshold
> M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
> M.shard_index = vshard_cfg.shard_index
> - M.collect_bucket_garbage_interval =
> - vshard_cfg.collect_bucket_garbage_interval
> M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
> M.current_cfg = cfg
> @@ -2678,6 +2670,9 @@ else
> storage_cfg(M.current_cfg, M.this_replica.uuid, true)
> end
> M.module_version = M.module_version + 1
> + -- Background fibers could sleep waiting for bucket changes.
> + -- Let them know it is time to reload.
> + bucket_generation_increment()
> end
>
> M.recovery_f = recovery_f
> @@ -2688,7 +2683,7 @@ M.gc_bucket_f = gc_bucket_f
> -- These functions are saved in M not for atomic reload, but for
> -- unit testing.
> --
> -M.gc_bucket_step_by_type = gc_bucket_step_by_type
> +M.gc_bucket_drop = gc_bucket_drop
> M.rebalancer_build_routes = rebalancer_build_routes
> M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
> M.cached_find_sharded_spaces = find_sharded_spaces
> diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
> index f38af74..484f499 100644
> --- a/vshard/storage/reload_evolution.lua
> +++ b/vshard/storage/reload_evolution.lua
> @@ -4,6 +4,7 @@
> -- in a commit.
> --
> local log = require('log')
> +local fiber = require('fiber')
>
> --
> -- Array of upgrade functions.
> @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
> end
> end
>
> +migrations[#migrations + 1] = function(M)
> + if not M.route_map then
> + M.bucket_generation_cond = fiber.cond()
> + M.route_map = {}
> + end
> +end
> +
> --
> -- Perform an update based on a version stored in `M` (internals).
> -- @param M Old module internals which should be updated.
Thanks for your fixes!

I found you've missed adding the new file to "vshard/CMakeLists.txt" [1].

[1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9

On 11/02/2021 01:36, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 10:01, Oleg Babin wrote:
>> Thanks for your patch.
>>
>> Shouldn't it be added to storage "MODULE_INTERNALS"?
> Hm. Not sure I understand. Did you mean the 'vshard_modules' variable in
> storage/init.lua? Why? The heap is not used in storage/init.lua and
> won't be used there directly in future patches. The next patches
> will introduce new modules for storage/, which will use the heap
> and will reload it.
>
> Also it does not have any global objects. So it does not
> need its own global M, if this is what you meant.

Yes, thanks for your answer. Got it.

>>> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
>>> new file mode 100755
>>> index 0000000..8c3819f
>>> --- /dev/null
>>> +++ b/test/unit-tap/heap.test.lua
>>> @@ -0,0 +1,310 @@
>>> +#!/usr/bin/env tarantool
>>> +
>>> +local tap = require('tap')
>>> +local test = tap.test("cfg")
>>> +local heap = require('vshard.heap')
>>> +
>> Maybe it's better to use single quotes everywhere: test("cfg") -> test('cfg'). Or does such a difference have some sense?
> Yeah, didn't notice it.
Here is the diff:
>
> ====================
> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
> index 8c3819f..9202f62 100755
> --- a/test/unit-tap/heap.test.lua
> +++ b/test/unit-tap/heap.test.lua
> @@ -1,7 +1,7 @@
>  #!/usr/bin/env tarantool
>
>  local tap = require('tap')
> -local test = tap.test("cfg")
> +local test = tap.test('cfg')
>  local heap = require('vshard.heap')
>
>  --
> @@ -109,7 +109,7 @@ local function test_min_heap_basic(test)
>      until not next_permutation(indexes)
>      end
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  --
> @@ -143,7 +143,7 @@ local function test_max_heap_basic(test)
>      until not next_permutation(indexes)
>      end
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  --
> @@ -178,7 +178,7 @@ local function test_min_heap_update_top(test)
>      until not next_permutation(indexes)
>      end
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  --
> @@ -219,7 +219,7 @@ local function test_min_heap_update(test)
>      end
>      end
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  --
> @@ -257,7 +257,7 @@ local function test_max_heap_delete(test)
>      end
>      end
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  local function test_min_heap_remove_top(test)
> @@ -273,7 +273,7 @@ local function test_min_heap_remove_top(test)
>      end
>      assert(h:count() == 0)
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  local function test_max_heap_remove_try(test)
> @@ -294,7 +294,7 @@ local function test_max_heap_remove_try(test)
>      assert(obj.index == -1)
>      assert(h:count() == 1)
>
> -    test:ok(true, "no asserts")
> +    test:ok(true, 'no asserts')
>  end
>
>  test:plan(7)
On 11.02.2021 07:51, Oleg Babin wrote:
> Thanks for your fixes!
>
> I found you've missed adding the new file to "vshard/CMakeLists.txt" [1]
>
> [1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9

Thanks for noticing! Fixed:

====================
diff --git a/vshard/CMakeLists.txt b/vshard/CMakeLists.txt
index 1063da8..78a3f07 100644
--- a/vshard/CMakeLists.txt
+++ b/vshard/CMakeLists.txt
@@ -7,4 +7,4 @@ add_subdirectory(router)
 # Install module
 install(FILES cfg.lua error.lua consts.lua hash.lua init.lua replicaset.lua
-        util.lua lua_gc.lua rlist.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
+        util.lua lua_gc.lua rlist.lua heap.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
====================
On 11.02.2021 07:50, Oleg Babin wrote:
> I've noticed that you've missed adding the new file to vshard/CMakeLists.txt [1]
>
> It will break the build.
>
> [1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9

Thanks for noticing! Fixed:

====================
diff --git a/vshard/CMakeLists.txt b/vshard/CMakeLists.txt
index 607be54..1063da8 100644
--- a/vshard/CMakeLists.txt
+++ b/vshard/CMakeLists.txt
@@ -7,4 +7,4 @@ add_subdirectory(router)
 # Install module
 install(FILES cfg.lua error.lua consts.lua hash.lua init.lua replicaset.lua
-        util.lua lua_gc.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
+        util.lua lua_gc.lua rlist.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
====================
Hi, Vlad!

Thanks for your patchset. LGTM.
On 10/02/2021 02:51, Vladislav Shpilevoy wrote:
> Bad links. Here are the correct ones:
>
> Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-147-map-reduce-part1
> Issue: https://github.com/tarantool/vshard/issues/147
Applied this diff and force-pushed, in order to eliminate the metatable
and __index access. Besides, each heap's metatable is different from the
others because the methods are closures, so there wouldn't be any memory
saving from using a metatable. It couldn't have been shared between the
heaps anyway.

====================
diff --git a/vshard/heap.lua b/vshard/heap.lua
index 78c600a..b125921 100644
--- a/vshard/heap.lua
+++ b/vshard/heap.lua
@@ -203,22 +203,21 @@ local function heap_new(is_left_above)
         return count
     end

-    return setmetatable({
+    return {
         -- Expose the data. For testing.
         data = data,
-    }, {
-        __index = {
-            push = heap_push,
-            update_top = heap_update_top,
-            remove_top = heap_remove_top,
-            pop = heap_pop,
-            update = heap_update,
-            remove = heap_remove,
-            remove_try = heap_remove_try,
-            top = heap_top,
-            count = heap_count,
-        }
-    })
+        -- Methods are exported as members instead of __index so as to save on
+        -- not taking a metatable and going through __index on each method call.
+        push = heap_push,
+        update_top = heap_update_top,
+        remove_top = heap_remove_top,
+        pop = heap_pop,
+        update = heap_update,
+        remove = heap_remove,
+        remove_try = heap_remove_try,
+        top = heap_top,
+        count = heap_count,
+    }
 end

 return {
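For readers skimming the thread: the constructor above builds per-instance closures, which is exactly why a shared metatable would save nothing. A rough Python analogue of the same pattern, trimmed to push/pop and a min/max comparator passed as `is_left_above` (illustrative only, not vshard's API):

```python
def heap_new(is_left_above):
    """Closure-based heap constructor in the spirit of vshard.heap:
    methods close over `data` and the comparator, so every heap gets
    its own functions."""
    data = []

    def push(obj):
        data.append(obj)
        i = len(data) - 1
        while i > 0:                       # sift the new element up
            p = (i - 1) // 2
            if not is_left_above(data[i], data[p]):
                break
            data[i], data[p] = data[p], data[i]
            i = p

    def pop():
        if not data:
            return None
        top = data[0]
        last = data.pop()
        if data:
            data[0] = last                 # move last to root, sift down
            i, n = 0, len(data)
            while True:
                l, r, best = 2 * i + 1, 2 * i + 2, i
                if l < n and is_left_above(data[l], data[best]):
                    best = l
                if r < n and is_left_above(data[r], data[best]):
                    best = r
                if best == i:
                    break
                data[i], data[best] = data[best], data[i]
                i = best
        return top

    # Methods as plain members, mirroring the patch: no metatable,
    # no __index lookup on each call.
    return {'data': data, 'push': push, 'pop': pop,
            'count': lambda: len(data)}
```

The trade-off is the usual one for closure-based objects: a few extra table slots per instance in exchange for skipping one indirection per method call.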