* [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere Vladislav Shpilevoy via Tarantool-patches
` (8 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Rlist in storage/init.lua implemented a container similar to rlist
in libsmall in the Tarantool core: a doubly-linked list.
It does not depend on anything in storage/init.lua and should have
been a separate module from the beginning.
Now init.lua is going to grow even more as part of the map-reduce
feature, beyond 3k lines if nothing is moved out. It was decided
(by me) that this crosses the threshold at which it is time to
split init.lua into separate modules.
The patch takes the low-hanging fruit by moving rlist into its
own module.
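For readers skimming the diff, the structure being moved is small enough to sketch. The following is a rough Python transliteration of the new vshard/rlist.lua (illustration only, not part of the patch; the class and field names merely mirror the Lua code):

```python
# Illustration only: a rough Python transliteration of the Lua
# rlist added in vshard/rlist.lua below, to show the technique.
from types import SimpleNamespace


class RList:
    """Intrusive doubly-linked list: linked objects carry their own
    'prev'/'next' fields, so add/remove are O(1) without wrapper
    nodes, and an object can sit in at most one list at a time."""

    def __init__(self):
        self.first = None
        self.last = None
        self.count = 0

    def add_tail(self, obj):
        last = self.last
        if last is not None:
            last.next = obj
            obj.prev = last
        else:
            self.first = obj
        self.last = obj
        self.count += 1

    def remove(self, obj):
        # Removing an object which is not in the list is a no-op;
        # membership is detected, not assumed, hence the flag.
        prev = getattr(obj, 'prev', None)
        nxt = getattr(obj, 'next', None)
        belongs = False
        if prev is not None:
            belongs = True
            prev.next = nxt
        if nxt is not None:
            belongs = True
            nxt.prev = prev
        obj.prev = None
        obj.next = None
        if self.last is obj:
            belongs = True
            self.last = prev
        if self.first is obj:
            belongs = True
            self.first = nxt
        if belongs:
            self.count -= 1
```

The Lua version below attaches these functions as methods via a metatable, which is what turns the old rlist_add_tail(list, obj) calls in storage/init.lua into list:add_tail(obj).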
---
test/unit/rebalancer.result | 99 -----------------------------
test/unit/rebalancer.test.lua | 27 --------
test/unit/rlist.result | 114 ++++++++++++++++++++++++++++++++++
test/unit/rlist.test.lua | 33 ++++++++++
vshard/rlist.lua | 53 ++++++++++++++++
vshard/storage/init.lua | 68 +++-----------------
6 files changed, 208 insertions(+), 186 deletions(-)
create mode 100644 test/unit/rlist.result
create mode 100644 test/unit/rlist.test.lua
create mode 100644 vshard/rlist.lua
diff --git a/test/unit/rebalancer.result b/test/unit/rebalancer.result
index 2fb30e2..19aa480 100644
--- a/test/unit/rebalancer.result
+++ b/test/unit/rebalancer.result
@@ -1008,105 +1008,6 @@ build_routes(replicasets)
-- the latter is a dispenser. It is a structure which hands out
-- destination UUIDs in a round-robin manner to worker fibers.
--
-list = rlist.new()
----
-...
-list
----
-- count: 0
-...
-obj1 = {i = 1}
----
-...
-rlist.remove(list, obj1)
----
-...
-list
----
-- count: 0
-...
-rlist.add_tail(list, obj1)
----
-...
-list
----
-- count: 1
- last: &0
- i: 1
- first: *0
-...
-rlist.remove(list, obj1)
----
-...
-list
----
-- count: 0
-...
-obj1
----
-- i: 1
-...
-rlist.add_tail(list, obj1)
----
-...
-obj2 = {i = 2}
----
-...
-rlist.add_tail(list, obj2)
----
-...
-list
----
-- count: 2
- last: &0
- i: 2
- prev: &1
- i: 1
- next: *0
- first: *1
-...
-obj3 = {i = 3}
----
-...
-rlist.add_tail(list, obj3)
----
-...
-list
----
-- count: 3
- last: &0
- i: 3
- prev: &1
- i: 2
- next: *0
- prev: &2
- i: 1
- next: *1
- first: *2
-...
-rlist.remove(list, obj2)
----
-...
-list
----
-- count: 2
- last: &0
- i: 3
- prev: &1
- i: 1
- next: *0
- first: *1
-...
-rlist.remove(list, obj1)
----
-...
-list
----
-- count: 1
- last: &0
- i: 3
- first: *0
-...
d = dispenser.create({uuid = 15})
---
...
diff --git a/test/unit/rebalancer.test.lua b/test/unit/rebalancer.test.lua
index a4e18c1..8087d42 100644
--- a/test/unit/rebalancer.test.lua
+++ b/test/unit/rebalancer.test.lua
@@ -246,33 +246,6 @@ build_routes(replicasets)
-- the latter is a dispenser. It is a structure which hands out
-- destination UUIDs in a round-robin manner to worker fibers.
--
-list = rlist.new()
-list
-
-obj1 = {i = 1}
-rlist.remove(list, obj1)
-list
-
-rlist.add_tail(list, obj1)
-list
-
-rlist.remove(list, obj1)
-list
-obj1
-
-rlist.add_tail(list, obj1)
-obj2 = {i = 2}
-rlist.add_tail(list, obj2)
-list
-obj3 = {i = 3}
-rlist.add_tail(list, obj3)
-list
-
-rlist.remove(list, obj2)
-list
-rlist.remove(list, obj1)
-list
-
d = dispenser.create({uuid = 15})
dispenser.pop(d)
for i = 1, 14 do assert(dispenser.pop(d) == 'uuid', i) end
diff --git a/test/unit/rlist.result b/test/unit/rlist.result
new file mode 100644
index 0000000..c8aabc0
--- /dev/null
+++ b/test/unit/rlist.result
@@ -0,0 +1,114 @@
+-- test-run result file version 2
+--
+-- gh-161: parallel rebalancer. One of the most important parts of the latter is
+-- a dispenser. It is a structure which hands out destination UUIDs in a
+-- round-robin manner to worker fibers. It uses the rlist data structure.
+--
+rlist = require('vshard.rlist')
+ | ---
+ | ...
+
+list = rlist.new()
+ | ---
+ | ...
+list
+ | ---
+ | - count: 0
+ | ...
+
+obj1 = {i = 1}
+ | ---
+ | ...
+list:remove(obj1)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 0
+ | ...
+
+list:add_tail(obj1)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 1
+ | last: &0
+ | i: 1
+ | first: *0
+ | ...
+
+list:remove(obj1)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 0
+ | ...
+obj1
+ | ---
+ | - i: 1
+ | ...
+
+list:add_tail(obj1)
+ | ---
+ | ...
+obj2 = {i = 2}
+ | ---
+ | ...
+list:add_tail(obj2)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 2
+ | last: &0
+ | i: 2
+ | prev: &1
+ | i: 1
+ | next: *0
+ | first: *1
+ | ...
+obj3 = {i = 3}
+ | ---
+ | ...
+list:add_tail(obj3)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 3
+ | last: &0
+ | i: 3
+ | prev: &1
+ | i: 2
+ | next: *0
+ | prev: &2
+ | i: 1
+ | next: *1
+ | first: *2
+ | ...
+
+list:remove(obj2)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 2
+ | last: &0
+ | i: 3
+ | prev: &1
+ | i: 1
+ | next: *0
+ | first: *1
+ | ...
+list:remove(obj1)
+ | ---
+ | ...
+list
+ | ---
+ | - count: 1
+ | last: &0
+ | i: 3
+ | first: *0
+ | ...
diff --git a/test/unit/rlist.test.lua b/test/unit/rlist.test.lua
new file mode 100644
index 0000000..db52955
--- /dev/null
+++ b/test/unit/rlist.test.lua
@@ -0,0 +1,33 @@
+--
+-- gh-161: parallel rebalancer. One of the most important parts of the latter is
+-- a dispenser. It is a structure which hands out destination UUIDs in a
+-- round-robin manner to worker fibers. It uses the rlist data structure.
+--
+rlist = require('vshard.rlist')
+
+list = rlist.new()
+list
+
+obj1 = {i = 1}
+list:remove(obj1)
+list
+
+list:add_tail(obj1)
+list
+
+list:remove(obj1)
+list
+obj1
+
+list:add_tail(obj1)
+obj2 = {i = 2}
+list:add_tail(obj2)
+list
+obj3 = {i = 3}
+list:add_tail(obj3)
+list
+
+list:remove(obj2)
+list
+list:remove(obj1)
+list
diff --git a/vshard/rlist.lua b/vshard/rlist.lua
new file mode 100644
index 0000000..4be5382
--- /dev/null
+++ b/vshard/rlist.lua
@@ -0,0 +1,53 @@
+--
+-- A subset of rlist methods from the main repository. Rlist is a
+-- doubly linked list, and is used here to implement a queue of
+-- routes in the parallel rebalancer.
+--
+local rlist_mt = {}
+
+function rlist_mt.add_tail(rlist, object)
+ local last = rlist.last
+ if last then
+ last.next = object
+ object.prev = last
+ else
+ rlist.first = object
+ end
+ rlist.last = object
+ rlist.count = rlist.count + 1
+end
+
+function rlist_mt.remove(rlist, object)
+ local prev = object.prev
+ local next = object.next
+ local belongs_to_list = false
+ if prev then
+ belongs_to_list = true
+ prev.next = next
+ end
+ if next then
+ belongs_to_list = true
+ next.prev = prev
+ end
+ object.prev = nil
+ object.next = nil
+ if rlist.last == object then
+ belongs_to_list = true
+ rlist.last = prev
+ end
+ if rlist.first == object then
+ belongs_to_list = true
+ rlist.first = next
+ end
+ if belongs_to_list then
+ rlist.count = rlist.count - 1
+ end
+end
+
+local function rlist_new()
+ return setmetatable({count = 0}, {__index = rlist_mt})
+end
+
+return {
+ new = rlist_new,
+}
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 5464824..1b48bf1 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -13,12 +13,13 @@ if rawget(_G, MODULE_INTERNALS) then
'vshard.consts', 'vshard.error', 'vshard.cfg',
'vshard.replicaset', 'vshard.util',
'vshard.storage.reload_evolution',
- 'vshard.lua_gc',
+ 'vshard.lua_gc', 'vshard.rlist'
}
for _, module in pairs(vshard_modules) do
package.loaded[module] = nil
end
end
+local rlist = require('vshard.rlist')
local consts = require('vshard.consts')
local lerror = require('vshard.error')
local lcfg = require('vshard.cfg')
@@ -1786,54 +1787,6 @@ local function rebalancer_build_routes(replicasets)
return bucket_routes
end
---
--- A subset of rlist methods from the main repository. Rlist is a
--- doubly linked list, and is used here to implement a queue of
--- routes in the parallel rebalancer.
---
-local function rlist_new()
- return {count = 0}
-end
-
-local function rlist_add_tail(rlist, object)
- local last = rlist.last
- if last then
- last.next = object
- object.prev = last
- else
- rlist.first = object
- end
- rlist.last = object
- rlist.count = rlist.count + 1
-end
-
-local function rlist_remove(rlist, object)
- local prev = object.prev
- local next = object.next
- local belongs_to_list = false
- if prev then
- belongs_to_list = true
- prev.next = next
- end
- if next then
- belongs_to_list = true
- next.prev = prev
- end
- object.prev = nil
- object.next = nil
- if rlist.last == object then
- belongs_to_list = true
- rlist.last = prev
- end
- if rlist.first == object then
- belongs_to_list = true
- rlist.first = next
- end
- if belongs_to_list then
- rlist.count = rlist.count - 1
- end
-end
-
--
-- Dispenser is a container of routes received from the
-- rebalancer. Its task is to hand out the routes to worker fibers
@@ -1842,7 +1795,7 @@ end
-- receiver nodes.
--
local function route_dispenser_create(routes)
- local rlist = rlist_new()
+ local rlist = rlist.new()
local map = {}
for uuid, bucket_count in pairs(routes) do
local new = {
@@ -1873,7 +1826,7 @@ local function route_dispenser_create(routes)
-- the main applier fiber does some analysis on the
-- destinations.
map[uuid] = new
- rlist_add_tail(rlist, new)
+ rlist:add_tail(new)
end
return {
rlist = rlist,
@@ -1892,7 +1845,7 @@ local function route_dispenser_put(dispenser, uuid)
local bucket_count = dst.bucket_count + 1
dst.bucket_count = bucket_count
if bucket_count == 1 then
- rlist_add_tail(dispenser.rlist, dst)
+ dispenser.rlist:add_tail(dst)
end
end
end
@@ -1909,7 +1862,7 @@ local function route_dispenser_skip(dispenser, uuid)
local dst = map[uuid]
if dst then
map[uuid] = nil
- rlist_remove(dispenser.rlist, dst)
+ dispenser.rlist:remove(dst)
end
end
@@ -1952,9 +1905,9 @@ local function route_dispenser_pop(dispenser)
if dst then
local bucket_count = dst.bucket_count - 1
dst.bucket_count = bucket_count
- rlist_remove(rlist, dst)
+ rlist:remove(dst)
if bucket_count > 0 then
- rlist_add_tail(rlist, dst)
+ rlist:add_tail(dst)
end
return dst.uuid
end
@@ -2742,11 +2695,6 @@ M.route_dispenser = {
pop = route_dispenser_pop,
sent = route_dispenser_sent,
}
-M.rlist = {
- new = rlist_new,
- add_tail = rlist_add_tail,
- remove = rlist_remove,
-}
M.schema_latest_version = schema_latest_version
M.schema_current_version = schema_current_version
M.schema_upgrade_master = schema_upgrade_master
--
2.24.3 (Apple Git-128)
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:57 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Hi! Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> [...]
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
@ 2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
I've noticed that you've missed adding the new file to vshard/CMakeLists.txt [1].
It will break the build.
[1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9
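The missing piece is a one-line change: the new source must appear among the installed Lua files, or a deployed vshard will fail on require('vshard.rlist'). Hypothetically, the fix looks something like the fragment below (a sketch only; the real contents of vshard/CMakeLists.txt differ, and every file name here except rlist.lua is an assumption):

```cmake
# Hypothetical sketch, not the actual file: append rlist.lua to the
# list of installed vshard sources so the module ships with the rock.
install(FILES lua_gc.lua rlist.lua  # ...existing entries kept as-is
        DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
```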
On 10/02/2021 11:57, Oleg Babin via Tarantool-patches wrote:
> Hi! Thanks for your patch. LGTM.
>
> On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
>> Rlist in storage/init.lua implemented a container similar to rlist
>> in libsmall in Tarantool core. Doubly-linked list.
>>
>> It does not depend on anything in storage/init.lua, and should
>> have been done in a separate module from the beginning.
>>
>> Now init.lua is going to grow even more in scope of map-reduce
>> feature, beyond 3k lines if nothing would be moved out. It was
>> decided (by me) that it crosses the border of when it is time to
>> split init.lua into separate modules.
>>
>> The patch takes the low hanging fruit by moving rlist into its
>> own module.
>> ---
>> test/unit/rebalancer.result | 99 -----------------------------
>> test/unit/rebalancer.test.lua | 27 --------
>> test/unit/rlist.result | 114 ++++++++++++++++++++++++++++++++++
>> test/unit/rlist.test.lua | 33 ++++++++++
>> vshard/rlist.lua | 53 ++++++++++++++++
>> vshard/storage/init.lua | 68 +++-----------------
>> 6 files changed, 208 insertions(+), 186 deletions(-)
>> create mode 100644 test/unit/rlist.result
>> create mode 100644 test/unit/rlist.test.lua
>> create mode 100644 vshard/rlist.lua
>>
>> diff --git a/test/unit/rebalancer.result b/test/unit/rebalancer.result
>> index 2fb30e2..19aa480 100644
>> --- a/test/unit/rebalancer.result
>> +++ b/test/unit/rebalancer.result
>> @@ -1008,105 +1008,6 @@ build_routes(replicasets)
>> -- the latter is a dispenser. It is a structure which hands out
>> -- destination UUIDs in a round-robin manner to worker fibers.
>> --
>> -list = rlist.new()
>> ----
>> -...
>> -list
>> ----
>> -- count: 0
>> -...
>> -obj1 = {i = 1}
>> ----
>> -...
>> -rlist.remove(list, obj1)
>> ----
>> -...
>> -list
>> ----
>> -- count: 0
>> -...
>> -rlist.add_tail(list, obj1)
>> ----
>> -...
>> -list
>> ----
>> -- count: 1
>> - last: &0
>> - i: 1
>> - first: *0
>> -...
>> -rlist.remove(list, obj1)
>> ----
>> -...
>> -list
>> ----
>> -- count: 0
>> -...
>> -obj1
>> ----
>> -- i: 1
>> -...
>> -rlist.add_tail(list, obj1)
>> ----
>> -...
>> -obj2 = {i = 2}
>> ----
>> -...
>> -rlist.add_tail(list, obj2)
>> ----
>> -...
>> -list
>> ----
>> -- count: 2
>> - last: &0
>> - i: 2
>> - prev: &1
>> - i: 1
>> - next: *0
>> - first: *1
>> -...
>> -obj3 = {i = 3}
>> ----
>> -...
>> -rlist.add_tail(list, obj3)
>> ----
>> -...
>> -list
>> ----
>> -- count: 3
>> - last: &0
>> - i: 3
>> - prev: &1
>> - i: 2
>> - next: *0
>> - prev: &2
>> - i: 1
>> - next: *1
>> - first: *2
>> -...
>> -rlist.remove(list, obj2)
>> ----
>> -...
>> -list
>> ----
>> -- count: 2
>> - last: &0
>> - i: 3
>> - prev: &1
>> - i: 1
>> - next: *0
>> - first: *1
>> -...
>> -rlist.remove(list, obj1)
>> ----
>> -...
>> -list
>> ----
>> -- count: 1
>> - last: &0
>> - i: 3
>> - first: *0
>> -...
>> d = dispenser.create({uuid = 15})
>> ---
>> ...
>> diff --git a/test/unit/rebalancer.test.lua
>> b/test/unit/rebalancer.test.lua
>> index a4e18c1..8087d42 100644
>> --- a/test/unit/rebalancer.test.lua
>> +++ b/test/unit/rebalancer.test.lua
>> @@ -246,33 +246,6 @@ build_routes(replicasets)
>> -- the latter is a dispenser. It is a structure which hands out
>> -- destination UUIDs in a round-robin manner to worker fibers.
>> --
>> -list = rlist.new()
>> -list
>> -
>> -obj1 = {i = 1}
>> -rlist.remove(list, obj1)
>> -list
>> -
>> -rlist.add_tail(list, obj1)
>> -list
>> -
>> -rlist.remove(list, obj1)
>> -list
>> -obj1
>> -
>> -rlist.add_tail(list, obj1)
>> -obj2 = {i = 2}
>> -rlist.add_tail(list, obj2)
>> -list
>> -obj3 = {i = 3}
>> -rlist.add_tail(list, obj3)
>> -list
>> -
>> -rlist.remove(list, obj2)
>> -list
>> -rlist.remove(list, obj1)
>> -list
>> -
>> d = dispenser.create({uuid = 15})
>> dispenser.pop(d)
>> for i = 1, 14 do assert(dispenser.pop(d) == 'uuid', i) end
>> diff --git a/test/unit/rlist.result b/test/unit/rlist.result
>> new file mode 100644
>> index 0000000..c8aabc0
>> --- /dev/null
>> +++ b/test/unit/rlist.result
>> @@ -0,0 +1,114 @@
>> +-- test-run result file version 2
>> +--
>> +-- gh-161: parallel rebalancer. One of the most important part of
>> the latter is
>> +-- a dispenser. It is a structure which hands out destination UUIDs
>> in a
>> +-- round-robin manner to worker fibers. It uses rlist data structure.
>> +--
>> +rlist = require('vshard.rlist')
>> + | ---
>> + | ...
>> +
>> +list = rlist.new()
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 0
>> + | ...
>> +
>> +obj1 = {i = 1}
>> + | ---
>> + | ...
>> +list:remove(obj1)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 0
>> + | ...
>> +
>> +list:add_tail(obj1)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 1
>> + | last: &0
>> + | i: 1
>> + | first: *0
>> + | ...
>> +
>> +list:remove(obj1)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 0
>> + | ...
>> +obj1
>> + | ---
>> + | - i: 1
>> + | ...
>> +
>> +list:add_tail(obj1)
>> + | ---
>> + | ...
>> +obj2 = {i = 2}
>> + | ---
>> + | ...
>> +list:add_tail(obj2)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 2
>> + | last: &0
>> + | i: 2
>> + | prev: &1
>> + | i: 1
>> + | next: *0
>> + | first: *1
>> + | ...
>> +obj3 = {i = 3}
>> + | ---
>> + | ...
>> +list:add_tail(obj3)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 3
>> + | last: &0
>> + | i: 3
>> + | prev: &1
>> + | i: 2
>> + | next: *0
>> + | prev: &2
>> + | i: 1
>> + | next: *1
>> + | first: *2
>> + | ...
>> +
>> +list:remove(obj2)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 2
>> + | last: &0
>> + | i: 3
>> + | prev: &1
>> + | i: 1
>> + | next: *0
>> + | first: *1
>> + | ...
>> +list:remove(obj1)
>> + | ---
>> + | ...
>> +list
>> + | ---
>> + | - count: 1
>> + | last: &0
>> + | i: 3
>> + | first: *0
>> + | ...
>> diff --git a/test/unit/rlist.test.lua b/test/unit/rlist.test.lua
>> new file mode 100644
>> index 0000000..db52955
>> --- /dev/null
>> +++ b/test/unit/rlist.test.lua
>> @@ -0,0 +1,33 @@
>> +--
>> +-- gh-161: parallel rebalancer. One of the most important part of
>> the latter is
>> +-- a dispenser. It is a structure which hands out destination UUIDs
>> in a
>> +-- round-robin manner to worker fibers. It uses rlist data structure.
>> +--
>> +rlist = require('vshard.rlist')
>> +
>> +list = rlist.new()
>> +list
>> +
>> +obj1 = {i = 1}
>> +list:remove(obj1)
>> +list
>> +
>> +list:add_tail(obj1)
>> +list
>> +
>> +list:remove(obj1)
>> +list
>> +obj1
>> +
>> +list:add_tail(obj1)
>> +obj2 = {i = 2}
>> +list:add_tail(obj2)
>> +list
>> +obj3 = {i = 3}
>> +list:add_tail(obj3)
>> +list
>> +
>> +list:remove(obj2)
>> +list
>> +list:remove(obj1)
>> +list
>> diff --git a/vshard/rlist.lua b/vshard/rlist.lua
>> new file mode 100644
>> index 0000000..4be5382
>> --- /dev/null
>> +++ b/vshard/rlist.lua
>> @@ -0,0 +1,53 @@
>> +--
>> +-- A subset of rlist methods from the main repository. Rlist is a
>> +-- doubly linked list, and is used here to implement a queue of
>> +-- routes in the parallel rebalancer.
>> +--
>> +local rlist_mt = {}
>> +
>> +function rlist_mt.add_tail(rlist, object)
>> + local last = rlist.last
>> + if last then
>> + last.next = object
>> + object.prev = last
>> + else
>> + rlist.first = object
>> + end
>> + rlist.last = object
>> + rlist.count = rlist.count + 1
>> +end
>> +
>> +function rlist_mt.remove(rlist, object)
>> + local prev = object.prev
>> + local next = object.next
>> + local belongs_to_list = false
>> + if prev then
>> + belongs_to_list = true
>> + prev.next = next
>> + end
>> + if next then
>> + belongs_to_list = true
>> + next.prev = prev
>> + end
>> + object.prev = nil
>> + object.next = nil
>> + if rlist.last == object then
>> + belongs_to_list = true
>> + rlist.last = prev
>> + end
>> + if rlist.first == object then
>> + belongs_to_list = true
>> + rlist.first = next
>> + end
>> + if belongs_to_list then
>> + rlist.count = rlist.count - 1
>> + end
>> +end
>> +
>> +local function rlist_new()
>> + return setmetatable({count = 0}, {__index = rlist_mt})
>> +end
>> +
>> +return {
>> + new = rlist_new,
>> +}
>> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
>> index 5464824..1b48bf1 100644
>> --- a/vshard/storage/init.lua
>> +++ b/vshard/storage/init.lua
>> @@ -13,12 +13,13 @@ if rawget(_G, MODULE_INTERNALS) then
>> 'vshard.consts', 'vshard.error', 'vshard.cfg',
>> 'vshard.replicaset', 'vshard.util',
>> 'vshard.storage.reload_evolution',
>> - 'vshard.lua_gc',
>> + 'vshard.lua_gc', 'vshard.rlist'
>> }
>> for _, module in pairs(vshard_modules) do
>> package.loaded[module] = nil
>> end
>> end
>> +local rlist = require('vshard.rlist')
>> local consts = require('vshard.consts')
>> local lerror = require('vshard.error')
>> local lcfg = require('vshard.cfg')
>> @@ -1786,54 +1787,6 @@ local function
>> rebalancer_build_routes(replicasets)
>> return bucket_routes
>> end
>> ---
>> --- A subset of rlist methods from the main repository. Rlist is a
>> --- doubly linked list, and is used here to implement a queue of
>> --- routes in the parallel rebalancer.
>> ---
>> -local function rlist_new()
>> - return {count = 0}
>> -end
>> -
>> -local function rlist_add_tail(rlist, object)
>> - local last = rlist.last
>> - if last then
>> - last.next = object
>> - object.prev = last
>> - else
>> - rlist.first = object
>> - end
>> - rlist.last = object
>> - rlist.count = rlist.count + 1
>> -end
>> -
>> -local function rlist_remove(rlist, object)
>> - local prev = object.prev
>> - local next = object.next
>> - local belongs_to_list = false
>> - if prev then
>> - belongs_to_list = true
>> - prev.next = next
>> - end
>> - if next then
>> - belongs_to_list = true
>> - next.prev = prev
>> - end
>> - object.prev = nil
>> - object.next = nil
>> - if rlist.last == object then
>> - belongs_to_list = true
>> - rlist.last = prev
>> - end
>> - if rlist.first == object then
>> - belongs_to_list = true
>> - rlist.first = next
>> - end
>> - if belongs_to_list then
>> - rlist.count = rlist.count - 1
>> - end
>> -end
>> -
>> --
>> -- Dispenser is a container of routes received from the
>> -- rebalancer. Its task is to hand out the routes to worker fibers
>> @@ -1842,7 +1795,7 @@ end
>> -- receiver nodes.
>> --
>> local function route_dispenser_create(routes)
>> - local rlist = rlist_new()
>> + local rlist = rlist.new()
>> local map = {}
>> for uuid, bucket_count in pairs(routes) do
>> local new = {
>> @@ -1873,7 +1826,7 @@ local function route_dispenser_create(routes)
>> -- the main applier fiber does some analysis on the
>> -- destinations.
>> map[uuid] = new
>> - rlist_add_tail(rlist, new)
>> + rlist:add_tail(new)
>> end
>> return {
>> rlist = rlist,
>> @@ -1892,7 +1845,7 @@ local function route_dispenser_put(dispenser, uuid)
>> local bucket_count = dst.bucket_count + 1
>> dst.bucket_count = bucket_count
>> if bucket_count == 1 then
>> - rlist_add_tail(dispenser.rlist, dst)
>> + dispenser.rlist:add_tail(dst)
>> end
>> end
>> end
>> @@ -1909,7 +1862,7 @@ local function route_dispenser_skip(dispenser, uuid)
>> local dst = map[uuid]
>> if dst then
>> map[uuid] = nil
>> - rlist_remove(dispenser.rlist, dst)
>> + dispenser.rlist:remove(dst)
>> end
>> end
>> @@ -1952,9 +1905,9 @@ local function route_dispenser_pop(dispenser)
>> if dst then
>> local bucket_count = dst.bucket_count - 1
>> dst.bucket_count = bucket_count
>> - rlist_remove(rlist, dst)
>> + rlist:remove(dst)
>> if bucket_count > 0 then
>> - rlist_add_tail(rlist, dst)
>> + rlist:add_tail(dst)
>> end
>> return dst.uuid
>> end
>> @@ -2742,11 +2695,6 @@ M.route_dispenser = {
>> pop = route_dispenser_pop,
>> sent = route_dispenser_sent,
>> }
>> -M.rlist = {
>> - new = rlist_new,
>> - add_tail = rlist_add_tail,
>> - remove = rlist_remove,
>> -}
>> M.schema_latest_version = schema_latest_version
>> M.schema_current_version = schema_current_version
>> M.schema_upgrade_master = schema_upgrade_master
^ permalink raw reply [flat|nested] 36+ messages in thread
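For readers following along, the API of the new module can be exercised with the sketch below. The metatable from the diff above is inlined so the snippet runs in plain Lua without vshard installed; the objects `a`, `b`, `c` and the variable names are illustrative, not from the patch.

```lua
-- Minimal inline copy of the rlist module from the patch, so the
-- usage below is runnable standalone ('next_' replaces the original
-- local 'next' purely to avoid shadowing the Lua builtin).
local rlist_mt = {}

function rlist_mt.add_tail(rlist, object)
    local last = rlist.last
    if last then
        last.next = object
        object.prev = last
    else
        rlist.first = object
    end
    rlist.last = object
    rlist.count = rlist.count + 1
end

function rlist_mt.remove(rlist, object)
    local prev, next_ = object.prev, object.next
    local belongs_to_list = false
    if prev then
        belongs_to_list = true
        prev.next = next_
    end
    if next_ then
        belongs_to_list = true
        next_.prev = prev
    end
    object.prev = nil
    object.next = nil
    if rlist.last == object then
        belongs_to_list = true
        rlist.last = prev
    end
    if rlist.first == object then
        belongs_to_list = true
        rlist.first = next_
    end
    if belongs_to_list then
        rlist.count = rlist.count - 1
    end
end

local function rlist_new()
    return setmetatable({count = 0}, {__index = rlist_mt})
end

-- Usage: queue-like behavior, as in the parallel rebalancer.
local list = rlist_new()
local a, b, c = {id = 'a'}, {id = 'b'}, {id = 'c'}
list:add_tail(a)
list:add_tail(b)
list:add_tail(c)
assert(list.count == 3 and list.first == a and list.last == c)

list:remove(b)                  -- unlink from the middle
assert(list.count == 2 and a.next == c and c.prev == a)

list:remove(b)                  -- removing a detached object is a no-op
assert(list.count == 2)
```

Note how `remove()` uses the `belongs_to_list` flag to keep `count` correct when asked to remove an object that is not actually linked.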
* Re: [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
@ 2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-12 0:09 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
On 11.02.2021 07:50, Oleg Babin wrote:
> I've noticed that you forgot to add the new file to vshard/CMakeLists.txt [1]
>
> It will break the build.
>
>
> [1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9
Thanks for noticing! Fixed:
====================
diff --git a/vshard/CMakeLists.txt b/vshard/CMakeLists.txt
index 607be54..1063da8 100644
--- a/vshard/CMakeLists.txt
+++ b/vshard/CMakeLists.txt
@@ -7,4 +7,4 @@ add_subdirectory(router)
# Install module
install(FILES cfg.lua error.lua consts.lua hash.lua init.lua replicaset.lua
- util.lua lua_gc.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
+ util.lua lua_gc.lua rlist.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
====================
* [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC Vladislav Shpilevoy via Tarantool-patches
` (7 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
fiber.time() returns wall-clock time. It is affected by time
corrections in the system and can be non-monotonic.
The patch makes everything in vshard use fiber.clock() instead of
fiber.time(). Also, the fiber.clock function is saved as an upvalue
in every module that uses it. This makes the code a bit shorter and
saves one indexing of the 'fiber' table.
The main reason is that the future map-reduce feature will use the
current time quite often. In some places it will probably be the
slowest action (given how slow FFI can be when not compiled by the
JIT).
Needed for #147
---
test/failover/failover.result | 4 ++--
test/failover/failover.test.lua | 4 ++--
vshard/replicaset.lua | 13 +++++++------
vshard/router/init.lua | 16 ++++++++--------
vshard/storage/init.lua | 16 ++++++++--------
5 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/test/failover/failover.result b/test/failover/failover.result
index 452694c..bae57fa 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
---
- true
...
-ts1 = fiber.time()
+ts1 = fiber.clock()
---
...
while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
---
...
-ts2 = fiber.time()
+ts2 = fiber.clock()
---
...
ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 13c517b..a969e0e 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -109,9 +109,9 @@ test_run:switch('router_1')
-- Revive the best replica. A router must reconnect to it in
-- FAILOVER_UP_TIMEOUT seconds.
test_run:cmd('start server box_1_d')
-ts1 = fiber.time()
+ts1 = fiber.clock()
while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
-ts2 = fiber.time()
+ts2 = fiber.clock()
ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
test_run:grep_log('router_1', 'New replica box_1_d%(storage%@')
diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
index b13d05e..a74c0f8 100644
--- a/vshard/replicaset.lua
+++ b/vshard/replicaset.lua
@@ -54,6 +54,7 @@ local luri = require('uri')
local luuid = require('uuid')
local ffi = require('ffi')
local util = require('vshard.util')
+local clock = fiber.clock
local gsc = util.generate_self_checker
--
@@ -88,7 +89,7 @@ local function netbox_on_connect(conn)
-- biggest priority. Really, it is not neccessary to
-- increase replica connection priority, if the current
-- one already has the biggest priority. (See failover_f).
- rs.replica_up_ts = fiber.time()
+ rs.replica_up_ts = clock()
end
end
@@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn)
assert(conn.replica)
-- Replica is down - remember this time to decrease replica
-- priority after FAILOVER_DOWN_TIMEOUT seconds.
- conn.replica.down_ts = fiber.time()
+ conn.replica.down_ts = clock()
end
--
@@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset)
local old_replica = replicaset.replica
if old_replica == replicaset.priority_list[1] and
old_replica:is_connected() then
- replicaset.replica_up_ts = fiber.time()
+ replicaset.replica_up_ts = clock()
return
end
for _, replica in pairs(replicaset.priority_list) do
@@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
net_status, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
end
- local end_time = fiber.time() + timeout
+ local end_time = clock() + timeout
while not net_status and timeout > 0 do
replica, err = pick_next_replica(replicaset)
if not replica then
@@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
opts.timeout = timeout
net_status, storage_status, retval, err =
replica_call(replica, func, args, opts)
- timeout = end_time - fiber.time()
+ timeout = end_time - clock()
if not net_status and not storage_status and
not can_retry_after_error(retval) then
-- There is no sense to retry LuaJit errors, such as
@@ -680,7 +681,7 @@ local function buildall(sharding_cfg)
else
zone_weights = {}
end
- local curr_ts = fiber.time()
+ local curr_ts = clock()
for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
local new_replicaset = setmetatable({
replicas = {},
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index ba1f863..a530c29 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -1,6 +1,7 @@
local log = require('log')
local lfiber = require('fiber')
local table_new = require('table.new')
+local clock = lfiber.clock
local MODULE_INTERNALS = '__module_vshard_router'
-- Reload requirements, in case this module is reloaded manually.
@@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
end
local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN
local replicaset, err
- local tend = lfiber.time() + timeout
+ local tend = clock() + timeout
if bucket_id > router.total_bucket_count or bucket_id <= 0 then
error('Bucket is unreachable: bucket id is out of range')
end
@@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
replicaset, err = bucket_resolve(router, bucket_id)
if replicaset then
::replicaset_is_found::
- opts.timeout = tend - lfiber.time()
+ opts.timeout = tend - clock()
local storage_call_status, call_status, call_error =
replicaset[call](replicaset, 'vshard.storage.call',
{bucket_id, mode, func, args}, opts)
@@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
-- if reconfiguration had been started,
-- and while is not executed on router,
-- but already is executed on storages.
- while lfiber.time() <= tend do
+ while clock() <= tend do
lfiber.sleep(0.05)
replicaset = router.replicasets[err.destination]
if replicaset then
@@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
-- case of broken cluster, when a bucket
-- is sent on two replicasets to each
-- other.
- if replicaset and lfiber.time() <= tend then
+ if replicaset and clock() <= tend then
goto replicaset_is_found
end
end
@@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
end
end
lfiber.yield()
- until lfiber.time() > tend
+ until clock() > tend
if err then
return nil, err
else
@@ -749,7 +750,7 @@ end
-- connections must be updated.
--
local function failover_collect_to_update(router)
- local ts = lfiber.time()
+ local ts = clock()
local uuid_to_update = {}
for uuid, rs in pairs(router.replicasets) do
if failover_need_down_priority(rs, ts) or
@@ -772,7 +773,7 @@ local function failover_step(router)
if #uuid_to_update == 0 then
return false
end
- local curr_ts = lfiber.time()
+ local curr_ts = clock()
local replica_is_changed = false
for _, uuid in pairs(uuid_to_update) do
local rs = router.replicasets[uuid]
@@ -1230,7 +1231,6 @@ local function router_sync(router, timeout)
timeout = router.sync_timeout
end
local arg = {timeout}
- local clock = lfiber.clock
local deadline = timeout and (clock() + timeout)
local opts = {timeout = timeout}
for rs_uuid, replicaset in pairs(router.replicasets) do
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 1b48bf1..c7335fc 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self()
local trigger = require('internal.trigger')
local ffi = require('ffi')
local yaml_encode = require('yaml').encode
+local clock = lfiber.clock
local MODULE_INTERNALS = '__module_vshard_storage'
-- Reload requirements, in case this module is reloaded manually.
@@ -695,7 +696,7 @@ local function sync(timeout)
log.debug("Synchronizing replicaset...")
timeout = timeout or M.sync_timeout
local vclock = box.info.vclock
- local tstart = lfiber.time()
+ local tstart = clock()
repeat
local done = true
for _, replica in ipairs(box.info.replication) do
@@ -711,7 +712,7 @@ local function sync(timeout)
return true
end
lfiber.sleep(0.001)
- until not (lfiber.time() <= tstart + timeout)
+ until not (clock() <= tstart + timeout)
log.warn("Timed out during synchronizing replicaset")
local ok, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
@@ -1280,10 +1281,9 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard)
ref.rw_lock = true
exception_guard.ref = ref
exception_guard.drop_rw_lock = true
- local deadline = lfiber.clock() + (opts and opts.timeout or 10)
+ local deadline = clock() + (opts and opts.timeout or 10)
while ref.rw ~= 0 do
- if not M.bucket_rw_lock_is_ready_cond:wait(deadline -
- lfiber.clock()) then
+ if not M.bucket_rw_lock_is_ready_cond:wait(deadline - clock()) then
status, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
end
@@ -1579,7 +1579,7 @@ function gc_bucket_f()
-- specified time interval the buckets are deleted both from
-- this array and from _bucket space.
local buckets_for_redirect = {}
- local buckets_for_redirect_ts = lfiber.time()
+ local buckets_for_redirect_ts = clock()
-- Empty sent buckets, updated after each step, and when
-- buckets_for_redirect is deleted, it gets empty_sent_buckets
-- for next deletion.
@@ -1614,7 +1614,7 @@ function gc_bucket_f()
end
end
- if lfiber.time() - buckets_for_redirect_ts >=
+ if clock() - buckets_for_redirect_ts >=
consts.BUCKET_SENT_GARBAGE_DELAY then
status, err = gc_bucket_drop(buckets_for_redirect,
consts.BUCKET.SENT)
@@ -1629,7 +1629,7 @@ function gc_bucket_f()
else
buckets_for_redirect = empty_sent_buckets or {}
empty_sent_buckets = nil
- buckets_for_redirect_ts = lfiber.time()
+ buckets_for_redirect_ts = clock()
end
end
::continue::
--
2.24.3 (Apple Git-128)
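The deadline pattern this patch converts can be illustrated with a self-contained sketch. A monotonic counter stands in for fiber.clock so the snippet runs in plain Lua; under Tarantool the real fiber.clock and fiber.sleep would be used, and the names `retry_until` and `attempt` are illustrative, not from the patch.

```lua
-- Stand-ins for fiber.clock()/fiber.sleep() so the sketch runs in
-- plain Lua. fiber.clock() is monotonic, which is what makes the
-- deadline arithmetic below safe against system time corrections.
local now = 0
local function fiber_clock() return now end
local function sleep(dt) now = now + dt end

-- The retry-loop shape used in router_call_impl: compute an absolute
-- deadline once, then shrink the per-attempt timeout as time passes.
local function retry_until(timeout, attempt)
    local deadline = fiber_clock() + timeout
    local attempts = 0
    repeat
        attempts = attempts + 1
        if attempt(deadline - fiber_clock()) then
            return true, attempts
        end
        sleep(0.05)
    until fiber_clock() > deadline
    return false, attempts
end

local calls = 0
local ok, n = retry_until(0.5, function(remaining)
    calls = calls + 1
    return calls == 3
end)
assert(ok and n == 3)

-- With fiber.time(), a backward time correction could push the clock
-- below the deadline again and extend the loop arbitrarily; with a
-- monotonic clock the loop is bounded by the requested timeout.
```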
* Re: [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-10 22:33 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:57 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch. LGTM except two nits:
- It seems you need to add "Closes #246".
- Tarantool has a "clock" module. I suggest using "fiber_clock()" instead
of plain "clock" to avoid possible confusion.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> fiber.time() returns wall-clock time. It is affected by time
> corrections in the system and can be non-monotonic.
>
> The patch makes everything in vshard use fiber.clock() instead of
> fiber.time(). Also, the fiber.clock function is saved as an upvalue
> in every module that uses it. This makes the code a bit shorter and
> saves one indexing of the 'fiber' table.
>
> The main reason is that the future map-reduce feature will use the
> current time quite often. In some places it will probably be the
> slowest action (given how slow FFI can be when not compiled by the
> JIT).
>
> Needed for #147
> ---
> test/failover/failover.result | 4 ++--
> test/failover/failover.test.lua | 4 ++--
> vshard/replicaset.lua | 13 +++++++------
> vshard/router/init.lua | 16 ++++++++--------
> vshard/storage/init.lua | 16 ++++++++--------
> 5 files changed, 27 insertions(+), 26 deletions(-)
>
> diff --git a/test/failover/failover.result b/test/failover/failover.result
> index 452694c..bae57fa 100644
> --- a/test/failover/failover.result
> +++ b/test/failover/failover.result
> @@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
> ---
> - true
> ...
> -ts1 = fiber.time()
> +ts1 = fiber.clock()
> ---
> ...
> while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
> ---
> ...
> -ts2 = fiber.time()
> +ts2 = fiber.clock()
> ---
> ...
> ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
> diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
> index 13c517b..a969e0e 100644
> --- a/test/failover/failover.test.lua
> +++ b/test/failover/failover.test.lua
> @@ -109,9 +109,9 @@ test_run:switch('router_1')
> -- Revive the best replica. A router must reconnect to it in
> -- FAILOVER_UP_TIMEOUT seconds.
> test_run:cmd('start server box_1_d')
> -ts1 = fiber.time()
> +ts1 = fiber.clock()
> while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
> -ts2 = fiber.time()
> +ts2 = fiber.clock()
> ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
> test_run:grep_log('router_1', 'New replica box_1_d%(storage%@')
>
> diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
> index b13d05e..a74c0f8 100644
> --- a/vshard/replicaset.lua
> +++ b/vshard/replicaset.lua
> @@ -54,6 +54,7 @@ local luri = require('uri')
> local luuid = require('uuid')
> local ffi = require('ffi')
> local util = require('vshard.util')
> +local clock = fiber.clock
> local gsc = util.generate_self_checker
>
> --
> @@ -88,7 +89,7 @@ local function netbox_on_connect(conn)
> -- biggest priority. Really, it is not neccessary to
> -- increase replica connection priority, if the current
> -- one already has the biggest priority. (See failover_f).
> - rs.replica_up_ts = fiber.time()
> + rs.replica_up_ts = clock()
> end
> end
>
> @@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn)
> assert(conn.replica)
> -- Replica is down - remember this time to decrease replica
> -- priority after FAILOVER_DOWN_TIMEOUT seconds.
> - conn.replica.down_ts = fiber.time()
> + conn.replica.down_ts = clock()
> end
>
> --
> @@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset)
> local old_replica = replicaset.replica
> if old_replica == replicaset.priority_list[1] and
> old_replica:is_connected() then
> - replicaset.replica_up_ts = fiber.time()
> + replicaset.replica_up_ts = clock()
> return
> end
> for _, replica in pairs(replicaset.priority_list) do
> @@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
> net_status, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> end
> - local end_time = fiber.time() + timeout
> + local end_time = clock() + timeout
> while not net_status and timeout > 0 do
> replica, err = pick_next_replica(replicaset)
> if not replica then
> @@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
> opts.timeout = timeout
> net_status, storage_status, retval, err =
> replica_call(replica, func, args, opts)
> - timeout = end_time - fiber.time()
> + timeout = end_time - clock()
> if not net_status and not storage_status and
> not can_retry_after_error(retval) then
> -- There is no sense to retry LuaJit errors, such as
> @@ -680,7 +681,7 @@ local function buildall(sharding_cfg)
> else
> zone_weights = {}
> end
> - local curr_ts = fiber.time()
> + local curr_ts = clock()
> for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
> local new_replicaset = setmetatable({
> replicas = {},
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index ba1f863..a530c29 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -1,6 +1,7 @@
> local log = require('log')
> local lfiber = require('fiber')
> local table_new = require('table.new')
> +local clock = lfiber.clock
>
> local MODULE_INTERNALS = '__module_vshard_router'
> -- Reload requirements, in case this module is reloaded manually.
> @@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> end
> local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN
> local replicaset, err
> - local tend = lfiber.time() + timeout
> + local tend = clock() + timeout
> if bucket_id > router.total_bucket_count or bucket_id <= 0 then
> error('Bucket is unreachable: bucket id is out of range')
> end
> @@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> replicaset, err = bucket_resolve(router, bucket_id)
> if replicaset then
> ::replicaset_is_found::
> - opts.timeout = tend - lfiber.time()
> + opts.timeout = tend - clock()
> local storage_call_status, call_status, call_error =
> replicaset[call](replicaset, 'vshard.storage.call',
> {bucket_id, mode, func, args}, opts)
> @@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> -- if reconfiguration had been started,
> -- and while is not executed on router,
> -- but already is executed on storages.
> - while lfiber.time() <= tend do
> + while clock() <= tend do
> lfiber.sleep(0.05)
> replicaset = router.replicasets[err.destination]
> if replicaset then
> @@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> -- case of broken cluster, when a bucket
> -- is sent on two replicasets to each
> -- other.
> - if replicaset and lfiber.time() <= tend then
> + if replicaset and clock() <= tend then
> goto replicaset_is_found
> end
> end
> @@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
> end
> end
> lfiber.yield()
> - until lfiber.time() > tend
> + until clock() > tend
> if err then
> return nil, err
> else
> @@ -749,7 +750,7 @@ end
> -- connections must be updated.
> --
> local function failover_collect_to_update(router)
> - local ts = lfiber.time()
> + local ts = clock()
> local uuid_to_update = {}
> for uuid, rs in pairs(router.replicasets) do
> if failover_need_down_priority(rs, ts) or
> @@ -772,7 +773,7 @@ local function failover_step(router)
> if #uuid_to_update == 0 then
> return false
> end
> - local curr_ts = lfiber.time()
> + local curr_ts = clock()
> local replica_is_changed = false
> for _, uuid in pairs(uuid_to_update) do
> local rs = router.replicasets[uuid]
> @@ -1230,7 +1231,6 @@ local function router_sync(router, timeout)
> timeout = router.sync_timeout
> end
> local arg = {timeout}
> - local clock = lfiber.clock
> local deadline = timeout and (clock() + timeout)
> local opts = {timeout = timeout}
> for rs_uuid, replicaset in pairs(router.replicasets) do
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 1b48bf1..c7335fc 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self()
> local trigger = require('internal.trigger')
> local ffi = require('ffi')
> local yaml_encode = require('yaml').encode
> +local clock = lfiber.clock
>
> local MODULE_INTERNALS = '__module_vshard_storage'
> -- Reload requirements, in case this module is reloaded manually.
> @@ -695,7 +696,7 @@ local function sync(timeout)
> log.debug("Synchronizing replicaset...")
> timeout = timeout or M.sync_timeout
> local vclock = box.info.vclock
> - local tstart = lfiber.time()
> + local tstart = clock()
> repeat
> local done = true
> for _, replica in ipairs(box.info.replication) do
> @@ -711,7 +712,7 @@ local function sync(timeout)
> return true
> end
> lfiber.sleep(0.001)
> - until not (lfiber.time() <= tstart + timeout)
> + until not (clock() <= tstart + timeout)
> log.warn("Timed out during synchronizing replicaset")
> local ok, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> @@ -1280,10 +1281,9 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard)
> ref.rw_lock = true
> exception_guard.ref = ref
> exception_guard.drop_rw_lock = true
> - local deadline = lfiber.clock() + (opts and opts.timeout or 10)
> + local deadline = clock() + (opts and opts.timeout or 10)
> while ref.rw ~= 0 do
> - if not M.bucket_rw_lock_is_ready_cond:wait(deadline -
> - lfiber.clock()) then
> + if not M.bucket_rw_lock_is_ready_cond:wait(deadline - clock()) then
> status, err = pcall(box.error, box.error.TIMEOUT)
> return nil, lerror.make(err)
> end
> @@ -1579,7 +1579,7 @@ function gc_bucket_f()
> -- specified time interval the buckets are deleted both from
> -- this array and from _bucket space.
> local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = lfiber.time()
> + local buckets_for_redirect_ts = clock()
> -- Empty sent buckets, updated after each step, and when
> -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> -- for next deletion.
> @@ -1614,7 +1614,7 @@ function gc_bucket_f()
> end
> end
>
> - if lfiber.time() - buckets_for_redirect_ts >=
> + if clock() - buckets_for_redirect_ts >=
> consts.BUCKET_SENT_GARBAGE_DELAY then
> status, err = gc_bucket_drop(buckets_for_redirect,
> consts.BUCKET.SENT)
> @@ -1629,7 +1629,7 @@ function gc_bucket_f()
> else
> buckets_for_redirect = empty_sent_buckets or {}
> empty_sent_buckets = nil
> - buckets_for_redirect_ts = lfiber.time()
> + buckets_for_redirect_ts = clock()
> end
> end
> ::continue::
* Re: [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
@ 2021-02-10 22:33 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:33 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Hi! Thanks for the review!
On 10.02.2021 09:57, Oleg Babin via Tarantool-patches wrote:
> Thanks for your patch. LGTM except two nits:
>
> - It seems you need to add "Closes #246".
Indeed. I had a feeling that I saw this clock task somewhere.
> - Tarantool has a "clock" module. I suggest using "fiber_clock()" instead of plain "clock" to avoid possible confusion.
Both comments fixed. The new patch below. No diff because it
is big and obvious - a plain rename.
====================
Use fiber.clock() instead of .time() everywhere
fiber.time() returns wall-clock time. It is affected by time
corrections in the system and can be non-monotonic.
The patch makes everything in vshard use fiber.clock() instead of
fiber.time(). Also, the fiber.clock function is saved as an upvalue
in every module that uses it. This makes the code a bit shorter and
saves one indexing of the 'fiber' table.
The main reason is that the future map-reduce feature will use the
current time quite often. In some places it will probably be the
slowest action (given how slow FFI can be when not compiled by the
JIT).
Needed for #147
Closes #246
diff --git a/test/failover/failover.result b/test/failover/failover.result
index 452694c..bae57fa 100644
--- a/test/failover/failover.result
+++ b/test/failover/failover.result
@@ -261,13 +261,13 @@ test_run:cmd('start server box_1_d')
---
- true
...
-ts1 = fiber.time()
+ts1 = fiber.clock()
---
...
while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
---
...
-ts2 = fiber.time()
+ts2 = fiber.clock()
---
...
ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
diff --git a/test/failover/failover.test.lua b/test/failover/failover.test.lua
index 13c517b..a969e0e 100644
--- a/test/failover/failover.test.lua
+++ b/test/failover/failover.test.lua
@@ -109,9 +109,9 @@ test_run:switch('router_1')
-- Revive the best replica. A router must reconnect to it in
-- FAILOVER_UP_TIMEOUT seconds.
test_run:cmd('start server box_1_d')
-ts1 = fiber.time()
+ts1 = fiber.clock()
while rs1.replica.name ~= 'box_1_d' do fiber.sleep(0.1) end
-ts2 = fiber.time()
+ts2 = fiber.clock()
ts2 - ts1 < vshard.consts.FAILOVER_UP_TIMEOUT
test_run:grep_log('router_1', 'New replica box_1_d%(storage%@')
diff --git a/vshard/replicaset.lua b/vshard/replicaset.lua
index b13d05e..9c792b3 100644
--- a/vshard/replicaset.lua
+++ b/vshard/replicaset.lua
@@ -54,6 +54,7 @@ local luri = require('uri')
local luuid = require('uuid')
local ffi = require('ffi')
local util = require('vshard.util')
+local fiber_clock = fiber.clock
local gsc = util.generate_self_checker
--
@@ -88,7 +89,7 @@ local function netbox_on_connect(conn)
-- biggest priority. Really, it is not neccessary to
-- increase replica connection priority, if the current
-- one already has the biggest priority. (See failover_f).
- rs.replica_up_ts = fiber.time()
+ rs.replica_up_ts = fiber_clock()
end
end
@@ -100,7 +101,7 @@ local function netbox_on_disconnect(conn)
assert(conn.replica)
-- Replica is down - remember this time to decrease replica
-- priority after FAILOVER_DOWN_TIMEOUT seconds.
- conn.replica.down_ts = fiber.time()
+ conn.replica.down_ts = fiber_clock()
end
--
@@ -174,7 +175,7 @@ local function replicaset_up_replica_priority(replicaset)
local old_replica = replicaset.replica
if old_replica == replicaset.priority_list[1] and
old_replica:is_connected() then
- replicaset.replica_up_ts = fiber.time()
+ replicaset.replica_up_ts = fiber_clock()
return
end
for _, replica in pairs(replicaset.priority_list) do
@@ -403,7 +404,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
net_status, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
end
- local end_time = fiber.time() + timeout
+ local end_time = fiber_clock() + timeout
while not net_status and timeout > 0 do
replica, err = pick_next_replica(replicaset)
if not replica then
@@ -412,7 +413,7 @@ local function replicaset_template_multicallro(prefer_replica, balance)
opts.timeout = timeout
net_status, storage_status, retval, err =
replica_call(replica, func, args, opts)
- timeout = end_time - fiber.time()
+ timeout = end_time - fiber_clock()
if not net_status and not storage_status and
not can_retry_after_error(retval) then
-- There is no sense to retry LuaJit errors, such as
@@ -680,7 +681,7 @@ local function buildall(sharding_cfg)
else
zone_weights = {}
end
- local curr_ts = fiber.time()
+ local curr_ts = fiber_clock()
for replicaset_uuid, replicaset in pairs(sharding_cfg.sharding) do
local new_replicaset = setmetatable({
replicas = {},
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index ba1f863..eeb7515 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -1,6 +1,7 @@
local log = require('log')
local lfiber = require('fiber')
local table_new = require('table.new')
+local fiber_clock = lfiber.clock
local MODULE_INTERNALS = '__module_vshard_router'
-- Reload requirements, in case this module is reloaded manually.
@@ -527,7 +528,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
end
local timeout = opts.timeout or consts.CALL_TIMEOUT_MIN
local replicaset, err
- local tend = lfiber.time() + timeout
+ local tend = fiber_clock() + timeout
if bucket_id > router.total_bucket_count or bucket_id <= 0 then
error('Bucket is unreachable: bucket id is out of range')
end
@@ -551,7 +552,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
replicaset, err = bucket_resolve(router, bucket_id)
if replicaset then
::replicaset_is_found::
- opts.timeout = tend - lfiber.time()
+ opts.timeout = tend - fiber_clock()
local storage_call_status, call_status, call_error =
replicaset[call](replicaset, 'vshard.storage.call',
{bucket_id, mode, func, args}, opts)
@@ -583,7 +584,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
-- if reconfiguration had been started,
-- and while is not executed on router,
-- but already is executed on storages.
- while lfiber.time() <= tend do
+ while fiber_clock() <= tend do
lfiber.sleep(0.05)
replicaset = router.replicasets[err.destination]
if replicaset then
@@ -598,7 +599,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
-- case of broken cluster, when a bucket
-- is sent on two replicasets to each
-- other.
- if replicaset and lfiber.time() <= tend then
+ if replicaset and fiber_clock() <= tend then
goto replicaset_is_found
end
end
@@ -623,7 +624,7 @@ local function router_call_impl(router, bucket_id, mode, prefer_replica,
end
end
lfiber.yield()
- until lfiber.time() > tend
+ until fiber_clock() > tend
if err then
return nil, err
else
@@ -749,7 +750,7 @@ end
-- connections must be updated.
--
local function failover_collect_to_update(router)
- local ts = lfiber.time()
+ local ts = fiber_clock()
local uuid_to_update = {}
for uuid, rs in pairs(router.replicasets) do
if failover_need_down_priority(rs, ts) or
@@ -772,7 +773,7 @@ local function failover_step(router)
if #uuid_to_update == 0 then
return false
end
- local curr_ts = lfiber.time()
+ local curr_ts = fiber_clock()
local replica_is_changed = false
for _, uuid in pairs(uuid_to_update) do
local rs = router.replicasets[uuid]
@@ -1230,8 +1231,7 @@ local function router_sync(router, timeout)
timeout = router.sync_timeout
end
local arg = {timeout}
- local clock = lfiber.clock
- local deadline = timeout and (clock() + timeout)
+ local deadline = timeout and (fiber_clock() + timeout)
local opts = {timeout = timeout}
for rs_uuid, replicaset in pairs(router.replicasets) do
if timeout < 0 then
@@ -1244,7 +1244,7 @@ local function router_sync(router, timeout)
err.replicaset = rs_uuid
return nil, err
end
- timeout = deadline - clock()
+ timeout = deadline - fiber_clock()
arg[1] = timeout
opts.timeout = timeout
end
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 1b48bf1..38cdf19 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -5,6 +5,7 @@ local netbox = require('net.box') -- for net.box:self()
local trigger = require('internal.trigger')
local ffi = require('ffi')
local yaml_encode = require('yaml').encode
+local fiber_clock = lfiber.clock
local MODULE_INTERNALS = '__module_vshard_storage'
-- Reload requirements, in case this module is reloaded manually.
@@ -695,7 +696,7 @@ local function sync(timeout)
log.debug("Synchronizing replicaset...")
timeout = timeout or M.sync_timeout
local vclock = box.info.vclock
- local tstart = lfiber.time()
+ local tstart = fiber_clock()
repeat
local done = true
for _, replica in ipairs(box.info.replication) do
@@ -711,7 +712,7 @@ local function sync(timeout)
return true
end
lfiber.sleep(0.001)
- until not (lfiber.time() <= tstart + timeout)
+ until fiber_clock() > tstart + timeout
log.warn("Timed out during synchronizing replicaset")
local ok, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
@@ -1280,10 +1281,11 @@ local function bucket_send_xc(bucket_id, destination, opts, exception_guard)
ref.rw_lock = true
exception_guard.ref = ref
exception_guard.drop_rw_lock = true
- local deadline = lfiber.clock() + (opts and opts.timeout or 10)
+ local timeout = opts and opts.timeout or 10
+ local deadline = fiber_clock() + timeout
while ref.rw ~= 0 do
- if not M.bucket_rw_lock_is_ready_cond:wait(deadline -
- lfiber.clock()) then
+ timeout = deadline - fiber_clock()
+ if not M.bucket_rw_lock_is_ready_cond:wait(timeout) then
status, err = pcall(box.error, box.error.TIMEOUT)
return nil, lerror.make(err)
end
@@ -1579,7 +1581,7 @@ function gc_bucket_f()
-- specified time interval the buckets are deleted both from
-- this array and from _bucket space.
local buckets_for_redirect = {}
- local buckets_for_redirect_ts = lfiber.time()
+ local buckets_for_redirect_ts = fiber_clock()
-- Empty sent buckets, updated after each step, and when
-- buckets_for_redirect is deleted, it gets empty_sent_buckets
-- for next deletion.
@@ -1614,7 +1616,7 @@ function gc_bucket_f()
end
end
- if lfiber.time() - buckets_for_redirect_ts >=
+ if fiber_clock() - buckets_for_redirect_ts >=
consts.BUCKET_SENT_GARBAGE_DELAY then
status, err = gc_bucket_drop(buckets_for_redirect,
consts.BUCKET.SENT)
@@ -1629,7 +1631,7 @@ function gc_bucket_f()
else
buckets_for_redirect = empty_sent_buckets or {}
empty_sent_buckets = nil
- buckets_for_redirect_ts = lfiber.time()
+ buckets_for_redirect_ts = fiber_clock()
end
end
::continue::
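The hunks above all follow the same deadline pattern: fiber.clock() is monotonic, while fiber.time() tracks the system clock and can jump on time adjustments, which makes it unsuitable for timeouts. A minimal sketch of the pattern (done(), step(), and the error value are placeholders, not vshard APIs):

```lua
local fiber_clock = require('fiber').clock

-- Compute a monotonic deadline once, then derive the remaining
-- timeout from it on every retry.
local function call_with_timeout(timeout)
    local deadline = fiber_clock() + timeout
    while not done() do            -- done() is a placeholder
        timeout = deadline - fiber_clock()
        if timeout <= 0 then
            return nil, 'timeout'  -- placeholder error handling
        end
        step(timeout)              -- step() is a placeholder
    end
    return true
end
```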
* [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
In the tests, waiting for bucket deletion by GC required a long
loop expression which checks the _bucket space and wakes up the GC
fiber if the bucket is not deleted yet.
Soon the GC wakeup won't be necessary, as the GC algorithm will
become reactive instead of proactive.
In order not to remove the wakeup from all these places in the
main patch, and to simplify the waiting, this patch introduces a
function wait_bucket_is_collected().
The reactive GC will drop the GC wakeup from this function, and
all the tests will still pass in time.
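In code, the simplification looks roughly like this (a sketch distilled from the hunks below; `id` stands for any bucket id):

```lua
-- Before: every test open-coded the polling loop and had to keep
-- nudging the background fibers.
while box.space._bucket:get{id} do
    vshard.storage.garbage_collector_wakeup()
    fiber.sleep(0.01)
end

-- After: one helper hides the polling and the wakeups, so the wakeup
-- calls can later be removed in a single place.
wait_bucket_is_collected(id)
```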
---
test/lua_libs/storage_template.lua | 10 ++++++++++
test/rebalancer/bucket_ref.result | 7 ++-----
test/rebalancer/bucket_ref.test.lua | 5 ++---
test/rebalancer/errinj.result | 13 +++++--------
test/rebalancer/errinj.test.lua | 7 +++----
test/rebalancer/rebalancer.result | 5 +----
test/rebalancer/rebalancer.test.lua | 3 +--
test/rebalancer/receiving_bucket.result | 2 +-
test/rebalancer/receiving_bucket.test.lua | 2 +-
test/reload_evolution/storage.result | 5 +----
test/reload_evolution/storage.test.lua | 3 +--
11 files changed, 28 insertions(+), 34 deletions(-)
diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
index 84e4180..21409bd 100644
--- a/test/lua_libs/storage_template.lua
+++ b/test/lua_libs/storage_template.lua
@@ -165,3 +165,13 @@ function wait_rebalancer_state(state, test_run)
vshard.storage.rebalancer_wakeup()
end
end
+
+function wait_bucket_is_collected(id)
+ test_run:wait_cond(function()
+ if not box.space._bucket:get{id} then
+ return true
+ end
+ vshard.storage.recovery_wakeup()
+ vshard.storage.garbage_collector_wakeup()
+ end)
+end
diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
index b66e449..b8fc7ff 100644
--- a/test/rebalancer/bucket_ref.result
+++ b/test/rebalancer/bucket_ref.result
@@ -243,7 +243,7 @@ vshard.storage.buckets_info(1)
destination: <replicaset_2>
id: 1
...
-while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
---
...
_ = test_run:switch('box_2_a')
@@ -292,10 +292,7 @@ vshard.storage.buckets_info(1)
finish_refs = true
---
...
-while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
----
-...
-while box.space._bucket:get{1} do fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
---
...
_ = test_run:switch('box_1_a')
diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
index 49ba583..213ced3 100644
--- a/test/rebalancer/bucket_ref.test.lua
+++ b/test/rebalancer/bucket_ref.test.lua
@@ -73,7 +73,7 @@ vshard.storage.bucket_refro(1)
finish_refs = true
while f1:status() ~= 'dead' do fiber.sleep(0.01) end
vshard.storage.buckets_info(1)
-while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
_ = test_run:switch('box_2_a')
vshard.storage.buckets_info(1)
vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = false
@@ -89,8 +89,7 @@ while not vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
fiber.sleep(0.2)
vshard.storage.buckets_info(1)
finish_refs = true
-while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
-while box.space._bucket:get{1} do fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
_ = test_run:switch('box_1_a')
vshard.storage.buckets_info(1)
diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
index 214e7d8..e50eb72 100644
--- a/test/rebalancer/errinj.result
+++ b/test/rebalancer/errinj.result
@@ -237,7 +237,10 @@ _bucket:get{36}
-- Buckets became 'active' on box_2_a, but still are sending on
-- box_1_a. Wait until it is marked as garbage on box_1_a by the
-- recovery fiber.
-while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
+wait_bucket_is_collected(35)
+---
+...
+wait_bucket_is_collected(36)
---
...
_ = test_run:switch('box_2_a')
@@ -278,7 +281,7 @@ while not _bucket:get{36} do fiber.sleep(0.0001) end
_ = test_run:switch('box_1_a')
---
...
-while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+wait_bucket_is_collected(36)
---
...
_bucket:get{36}
@@ -295,12 +298,6 @@ box.error.injection.set('ERRINJ_WAL_DELAY', false)
---
- ok
...
-_ = test_run:switch('box_1_a')
----
-...
-while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
----
-...
test_run:switch('default')
---
- true
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
index 66fbe5e..2cc4a69 100644
--- a/test/rebalancer/errinj.test.lua
+++ b/test/rebalancer/errinj.test.lua
@@ -107,7 +107,8 @@ _bucket:get{36}
-- Buckets became 'active' on box_2_a, but still are sending on
-- box_1_a. Wait until it is marked as garbage on box_1_a by the
-- recovery fiber.
-while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
+wait_bucket_is_collected(35)
+wait_bucket_is_collected(36)
_ = test_run:switch('box_2_a')
_bucket:get{35}
_bucket:get{36}
@@ -124,13 +125,11 @@ f1 = fiber.create(function() ret1, err1 = vshard.storage.bucket_send(36, util.re
_ = test_run:switch('box_2_a')
while not _bucket:get{36} do fiber.sleep(0.0001) end
_ = test_run:switch('box_1_a')
-while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+wait_bucket_is_collected(36)
_bucket:get{36}
_ = test_run:switch('box_2_a')
_bucket:get{36}
box.error.injection.set('ERRINJ_WAL_DELAY', false)
-_ = test_run:switch('box_1_a')
-while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
test_run:switch('default')
test_run:drop_cluster(REPLICASET_2)
diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result
index 3607e93..098b845 100644
--- a/test/rebalancer/rebalancer.result
+++ b/test/rebalancer/rebalancer.result
@@ -334,10 +334,7 @@ vshard.storage.rebalancer_wakeup()
-- Now rebalancer makes a bucket SENT. After it the garbage
-- collector cleans it and deletes after a timeout.
--
-while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
----
-...
-while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
+wait_bucket_is_collected(91)
---
...
wait_rebalancer_state("The cluster is balanced ok", test_run)
diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua
index 63e690f..308e66d 100644
--- a/test/rebalancer/rebalancer.test.lua
+++ b/test/rebalancer/rebalancer.test.lua
@@ -162,8 +162,7 @@ vshard.storage.rebalancer_wakeup()
-- Now rebalancer makes a bucket SENT. After it the garbage
-- collector cleans it and deletes after a timeout.
--
-while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
-while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
+wait_bucket_is_collected(91)
wait_rebalancer_state("The cluster is balanced ok", test_run)
_bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
_bucket.index.status:min({vshard.consts.BUCKET.ACTIVE})
diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
index db6a67f..7d3612b 100644
--- a/test/rebalancer/receiving_bucket.result
+++ b/test/rebalancer/receiving_bucket.result
@@ -374,7 +374,7 @@ vshard.storage.buckets_info(1)
destination: <replicaset_1>
id: 1
...
-while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
---
...
vshard.storage.buckets_info(1)
diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
index 1819cbb..24534b3 100644
--- a/test/rebalancer/receiving_bucket.test.lua
+++ b/test/rebalancer/receiving_bucket.test.lua
@@ -137,7 +137,7 @@ box.space.test3:select{100}
_ = test_run:switch('box_2_a')
vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
vshard.storage.buckets_info(1)
-while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
+wait_bucket_is_collected(1)
vshard.storage.buckets_info(1)
_ = test_run:switch('box_1_a')
box.space._bucket:get{1}
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
index 4652c4f..753687f 100644
--- a/test/reload_evolution/storage.result
+++ b/test/reload_evolution/storage.result
@@ -129,10 +129,7 @@ vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
---
- true
...
-vshard.storage.garbage_collector_wakeup()
----
-...
-while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+wait_bucket_is_collected(bucket_id_to_move)
---
...
test_run:switch('storage_1_a')
diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
index 06f7117..639553e 100644
--- a/test/reload_evolution/storage.test.lua
+++ b/test/reload_evolution/storage.test.lua
@@ -51,8 +51,7 @@ vshard.storage.bucket_force_create(2000)
vshard.storage.buckets_info()[2000]
vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
-vshard.storage.garbage_collector_wakeup()
-while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
+wait_bucket_is_collected(bucket_id_to_move)
test_run:switch('storage_1_a')
while box.space._bucket:get{bucket_id_to_move}.status ~= vshard.consts.BUCKET.ACTIVE do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end
vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[2])
--
2.24.3 (Apple Git-128)
* Re: [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:57 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Hi! Thanks for your patch! LGTM but I have one question.
Maybe it's reasonable to add some timeout in this function?
AFAIK test-run terminates tests after 120 seconds of inactivity; that seems
too long for such a simple case.
But anyway it's up to you.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> In the tests to wait for bucket deletion by GC it was necessary
> to have a long loop expression which checks _bucket space and
> wakes up GC fiber if the bucket is not deleted yet.
>
> Soon the GC wakeup won't be necessary as GC algorithm will become
> reactive instead of proactive.
>
> In order not to remove the wakeup from all places in the main
> patch, and to simplify the waiting the patch introduces a function
> wait_bucket_is_collected().
>
> The reactive GC will delete GC wakeup from this function and all
> the tests still will pass in time.
> ---
> test/lua_libs/storage_template.lua | 10 ++++++++++
> test/rebalancer/bucket_ref.result | 7 ++-----
> test/rebalancer/bucket_ref.test.lua | 5 ++---
> test/rebalancer/errinj.result | 13 +++++--------
> test/rebalancer/errinj.test.lua | 7 +++----
> test/rebalancer/rebalancer.result | 5 +----
> test/rebalancer/rebalancer.test.lua | 3 +--
> test/rebalancer/receiving_bucket.result | 2 +-
> test/rebalancer/receiving_bucket.test.lua | 2 +-
> test/reload_evolution/storage.result | 5 +----
> test/reload_evolution/storage.test.lua | 3 +--
> 11 files changed, 28 insertions(+), 34 deletions(-)
>
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 84e4180..21409bd 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -165,3 +165,13 @@ function wait_rebalancer_state(state, test_run)
> vshard.storage.rebalancer_wakeup()
> end
> end
> +
> +function wait_bucket_is_collected(id)
> + test_run:wait_cond(function()
> + if not box.space._bucket:get{id} then
> + return true
> + end
> + vshard.storage.recovery_wakeup()
> + vshard.storage.garbage_collector_wakeup()
> + end)
> +end
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b66e449..b8fc7ff 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -243,7 +243,7 @@ vshard.storage.buckets_info(1)
> destination: <replicaset_2>
> id: 1
> ...
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch('box_2_a')
> @@ -292,10 +292,7 @@ vshard.storage.buckets_info(1)
> finish_refs = true
> ---
> ...
> -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> ----
> -...
> -while box.space._bucket:get{1} do fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch('box_1_a')
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 49ba583..213ced3 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -73,7 +73,7 @@ vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> vshard.storage.buckets_info(1)
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> vshard.storage.internal.errinj.ERRINJ_LONG_RECEIVE = false
> @@ -89,8 +89,7 @@ while not vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> fiber.sleep(0.2)
> vshard.storage.buckets_info(1)
> finish_refs = true
> -while vshard.storage.buckets_info(1)[1].rw_lock do fiber.sleep(0.01) end
> -while box.space._bucket:get{1} do fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> _ = test_run:switch('box_1_a')
> vshard.storage.buckets_info(1)
>
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index 214e7d8..e50eb72 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -237,7 +237,10 @@ _bucket:get{36}
> -- Buckets became 'active' on box_2_a, but still are sending on
> -- box_1_a. Wait until it is marked as garbage on box_1_a by the
> -- recovery fiber.
> -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(35)
> +---
> +...
> +wait_bucket_is_collected(36)
> ---
> ...
> _ = test_run:switch('box_2_a')
> @@ -278,7 +281,7 @@ while not _bucket:get{36} do fiber.sleep(0.0001) end
> _ = test_run:switch('box_1_a')
> ---
> ...
> -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(36)
> ---
> ...
> _bucket:get{36}
> @@ -295,12 +298,6 @@ box.error.injection.set('ERRINJ_WAL_DELAY', false)
> ---
> - ok
> ...
> -_ = test_run:switch('box_1_a')
> ----
> -...
> -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
> ----
> -...
> test_run:switch('default')
> ---
> - true
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 66fbe5e..2cc4a69 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -107,7 +107,8 @@ _bucket:get{36}
> -- Buckets became 'active' on box_2_a, but still are sending on
> -- box_1_a. Wait until it is marked as garbage on box_1_a by the
> -- recovery fiber.
> -while _bucket:get{35} ~= nil or _bucket:get{36} ~= nil do vshard.storage.recovery_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(35)
> +wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> _bucket:get{35}
> _bucket:get{36}
> @@ -124,13 +125,11 @@ f1 = fiber.create(function() ret1, err1 = vshard.storage.bucket_send(36, util.re
> _ = test_run:switch('box_2_a')
> while not _bucket:get{36} do fiber.sleep(0.0001) end
> _ = test_run:switch('box_1_a')
> -while _bucket:get{36} do vshard.storage.recovery_wakeup() vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +wait_bucket_is_collected(36)
> _bucket:get{36}
> _ = test_run:switch('box_2_a')
> _bucket:get{36}
> box.error.injection.set('ERRINJ_WAL_DELAY', false)
> -_ = test_run:switch('box_1_a')
> -while _bucket:get{36} and _bucket:get{36}.status == vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.001) end
>
> test_run:switch('default')
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/test/rebalancer/rebalancer.result b/test/rebalancer/rebalancer.result
> index 3607e93..098b845 100644
> --- a/test/rebalancer/rebalancer.result
> +++ b/test/rebalancer/rebalancer.result
> @@ -334,10 +334,7 @@ vshard.storage.rebalancer_wakeup()
> -- Now rebalancer makes a bucket SENT. After it the garbage
> -- collector cleans it and deletes after a timeout.
> --
> -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
> ----
> -...
> -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
> +wait_bucket_is_collected(91)
> ---
> ...
> wait_rebalancer_state("The cluster is balanced ok", test_run)
> diff --git a/test/rebalancer/rebalancer.test.lua b/test/rebalancer/rebalancer.test.lua
> index 63e690f..308e66d 100644
> --- a/test/rebalancer/rebalancer.test.lua
> +++ b/test/rebalancer/rebalancer.test.lua
> @@ -162,8 +162,7 @@ vshard.storage.rebalancer_wakeup()
> -- Now rebalancer makes a bucket SENT. After it the garbage
> -- collector cleans it and deletes after a timeout.
> --
> -while _bucket:get{91}.status ~= vshard.consts.BUCKET.SENT do fiber.sleep(0.01) end
> -while _bucket:get{91} ~= nil do fiber.sleep(0.1) end
> +wait_bucket_is_collected(91)
> wait_rebalancer_state("The cluster is balanced ok", test_run)
> _bucket.index.status:count({vshard.consts.BUCKET.ACTIVE})
> _bucket.index.status:min({vshard.consts.BUCKET.ACTIVE})
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index db6a67f..7d3612b 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -374,7 +374,7 @@ vshard.storage.buckets_info(1)
> destination: <replicaset_1>
> id: 1
> ...
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> ---
> ...
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 1819cbb..24534b3 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -137,7 +137,7 @@ box.space.test3:select{100}
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> vshard.storage.buckets_info(1)
> -while box.space._bucket:get{1} do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.01) end
> +wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> box.space._bucket:get{1}
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 4652c4f..753687f 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -129,10 +129,7 @@ vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
> ---
> - true
> ...
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
> +wait_bucket_is_collected(bucket_id_to_move)
> ---
> ...
> test_run:switch('storage_1_a')
> diff --git a/test/reload_evolution/storage.test.lua b/test/reload_evolution/storage.test.lua
> index 06f7117..639553e 100644
> --- a/test/reload_evolution/storage.test.lua
> +++ b/test/reload_evolution/storage.test.lua
> @@ -51,8 +51,7 @@ vshard.storage.bucket_force_create(2000)
> vshard.storage.buckets_info()[2000]
> vshard.storage.call(bucket_id_to_move, 'read', 'do_select', {42})
> vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[1])
> -vshard.storage.garbage_collector_wakeup()
> -while box.space._bucket:get({bucket_id_to_move}) do fiber.sleep(0.01) end
> +wait_bucket_is_collected(bucket_id_to_move)
> test_run:switch('storage_1_a')
> while box.space._bucket:get{bucket_id_to_move}.status ~= vshard.consts.BUCKET.ACTIVE do vshard.storage.recovery_wakeup() fiber.sleep(0.01) end
> vshard.storage.bucket_send(bucket_id_to_move, util.replicasets[2])
* Re: [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:33 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Thanks for the review!
On 10.02.2021 09:57, Oleg Babin wrote:
> Hi! Thanks for your patch! LGTM but I have one question.
>
> Maybe it's reasonable to add some timeout in this function?
>
> AFAIK test-run terminates tests after 120 seconds of inactivity it seems too long for such simple case.
>
> But anyway it's up to you.
test_run:wait_cond() has a default timeout of 1 minute. I decided it
is fine.
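For reference, a shorter limit could be set explicitly, since test-run's wait_cond() accepts a timeout as its second argument (a hypothetical variant, not part of the patch):

```lua
function wait_bucket_is_collected(id)
    -- Same helper as in the patch, but with an explicit 10-second
    -- limit instead of wait_cond()'s default one-minute timeout.
    test_run:wait_cond(function()
        if not box.space._bucket:get{id} then
            return true
        end
        vshard.storage.recovery_wakeup()
        vshard.storage.garbage_collector_wakeup()
    end, 10)
end
```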
* Re: [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your answer. Yes, it's fine. LGTM.
On 11/02/2021 01:33, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 09:57, Oleg Babin wrote:
>> Hi! Thanks for your patch! LGTM but I have one question.
>>
>> Maybe it's reasonable to add some timeout in this function?
>>
>> AFAIK test-run terminates tests after 120 seconds of inactivity it seems too long for such simple case.
>>
>> But anyway it's up to you.
> test_run:wait_cond() has default timeout 1 minute. I decided it
> is fine.
* [Tarantool-patches] [PATCH 4/9] storage: bucket_recv() should check rs lock
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
A replicaset locked via config should not allow any bucket moves
from or to it.
But the lock check was only done by bucket_send(). bucket_recv()
allowed receiving a bucket even if the replicaset was locked. The
patch fixes it.
It didn't affect automatic bucket sends, because the lock is
taken into account by the rebalancer from the config. Only manual
bucket moves could have this bug.
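From the caller's perspective the fix behaves like this (a sketch; `destination_uuid` is a placeholder, and the error fields follow the test output below):

```lua
-- Executed on a storage of a non-locked replicaset, sending a bucket
-- to a replicaset that has `lock = true` in its config. Before the
-- patch the receiver accepted the bucket; now it refuses.
local ok, err = vshard.storage.bucket_send(101, destination_uuid)
assert(ok == nil)
assert(err.name == 'REPLICASET_IS_LOCKED')
```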
---
test/rebalancer/rebalancer_lock_and_pin.result | 14 ++++++++++++++
test/rebalancer/rebalancer_lock_and_pin.test.lua | 4 ++++
vshard/storage/init.lua | 3 +++
3 files changed, 21 insertions(+)
diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result
index 51dd36e..0bb4f45 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.result
+++ b/test/rebalancer/rebalancer_lock_and_pin.result
@@ -156,6 +156,20 @@ vshard.storage.bucket_send(1, util.replicasets[2])
message: Replicaset is locked
code: 19
...
+test_run:switch('box_2_a')
+---
+- true
+...
+-- Does not allow to receive either. Send from a non-locked replicaset to a
+-- locked one fails.
+vshard.storage.bucket_send(101, util.replicasets[1])
+---
+- null
+- type: ShardingError
+ code: 19
+ name: REPLICASET_IS_LOCKED
+ message: Replicaset is locked
+...
--
-- Vshard ensures that if a replicaset is locked, then it will not
-- allow to change its bucket set even if a rebalancer does not
diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua
index c3412c1..7b87004 100644
--- a/test/rebalancer/rebalancer_lock_and_pin.test.lua
+++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua
@@ -69,6 +69,10 @@ info.lock
-- explicitly.
--
vshard.storage.bucket_send(1, util.replicasets[2])
+test_run:switch('box_2_a')
+-- Does not allow to receive either. Send from a non-locked replicaset to a
+-- locked one fails.
+vshard.storage.bucket_send(101, util.replicasets[1])
--
-- Vshard ensures that if a replicaset is locked, then it will not
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index c7335fc..298df71 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -995,6 +995,9 @@ local function bucket_recv_xc(bucket_id, from, data, opts)
return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, msg,
from)
end
+ if is_this_replicaset_locked() then
+ return nil, lerror.vshard(lerror.code.REPLICASET_IS_LOCKED)
+ end
if not bucket_receiving_quota_add(-1) then
return nil, lerror.vshard(lerror.code.TOO_MANY_RECEIVING)
end
--
2.24.3 (Apple Git-128)
* Re: [Tarantool-patches] [PATCH 4/9] storage: bucket_recv() should check rs lock
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:59 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Locked replicaset (via config) should not allow any bucket moves
> from or to the replicaset.
>
> But the lock check was only done by bucket_send(). Bucket_recv()
> allowed a bucket to be received even if the replicaset was locked.
> The patch fixes that.
>
> It didn't affect automatic bucket sends, because the lock is
> taken into account by the rebalancer from the config. Only manual
> bucket moves could hit this bug.
> ---
> test/rebalancer/rebalancer_lock_and_pin.result | 14 ++++++++++++++
> test/rebalancer/rebalancer_lock_and_pin.test.lua | 4 ++++
> vshard/storage/init.lua | 3 +++
> 3 files changed, 21 insertions(+)
>
> diff --git a/test/rebalancer/rebalancer_lock_and_pin.result b/test/rebalancer/rebalancer_lock_and_pin.result
> index 51dd36e..0bb4f45 100644
> --- a/test/rebalancer/rebalancer_lock_and_pin.result
> +++ b/test/rebalancer/rebalancer_lock_and_pin.result
> @@ -156,6 +156,20 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> message: Replicaset is locked
> code: 19
> ...
> +test_run:switch('box_2_a')
> +---
> +- true
> +...
> +-- Does not allow to receive either. Send from a non-locked replicaset to a
> +-- locked one fails.
> +vshard.storage.bucket_send(101, util.replicasets[1])
> +---
> +- null
> +- type: ShardingError
> + code: 19
> + name: REPLICASET_IS_LOCKED
> + message: Replicaset is locked
> +...
> --
> -- Vshard ensures that if a replicaset is locked, then it will not
> -- allow to change its bucket set even if a rebalancer does not
> diff --git a/test/rebalancer/rebalancer_lock_and_pin.test.lua b/test/rebalancer/rebalancer_lock_and_pin.test.lua
> index c3412c1..7b87004 100644
> --- a/test/rebalancer/rebalancer_lock_and_pin.test.lua
> +++ b/test/rebalancer/rebalancer_lock_and_pin.test.lua
> @@ -69,6 +69,10 @@ info.lock
> -- explicitly.
> --
> vshard.storage.bucket_send(1, util.replicasets[2])
> +test_run:switch('box_2_a')
> +-- Does not allow to receive either. Send from a non-locked replicaset to a
> +-- locked one fails.
> +vshard.storage.bucket_send(101, util.replicasets[1])
>
> --
> -- Vshard ensures that if a replicaset is locked, then it will not
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index c7335fc..298df71 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -995,6 +995,9 @@ local function bucket_recv_xc(bucket_id, from, data, opts)
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, msg,
> from)
> end
> + if is_this_replicaset_locked() then
> + return nil, lerror.vshard(lerror.code.REPLICASET_IS_LOCKED)
> + end
> if not bucket_receiving_quota_add(-1) then
> return nil, lerror.vshard(lerror.code.TOO_MANY_RECEIVING)
> end
^ permalink raw reply [flat|nested] 36+ messages in thread
* [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (3 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 4/9] storage: bucket_recv() should check rs lock Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature Vladislav Shpilevoy via Tarantool-patches
` (4 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
The patch adds functions table_copy_yield and table_minus_yield.
Yielding copy creates a duplicate of a table but yields every
specified number of keys copied.
Yielding minus removes matching key-value pairs specified in one
table from another table. It yields after every specified number of
keys processed.
The functions should help to process huge Lua tables (millions of
elements and more). These are going to be used on the storage in
the new GC algorithm.
The algorithm will need to keep a route table on the storage, just
like on the router, but with an expiration time for the routes. Since
the bucket count can be millions, GC will potentially operate on a
huge Lua table and could use some yields so as not to block the TX
thread for long.
Needed for #147
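The yield-every-N-keys pattern can be sketched in standalone Lua, with
coroutine.yield() standing in for Tarantool's fiber.yield(). This is only an
illustration of the idea, not the patch code itself (which lives in
vshard/util.lua and runs inside a fiber):

```lua
-- Illustrative sketch of the yield-interval pattern. coroutine.yield()
-- stands in for fiber.yield(); the real function yields the current
-- fiber so the TX thread is not blocked for long.
local function table_copy_yield(src, interval)
    local res = {}
    -- Time-To-Yield counter, reset after every yield.
    local tty = interval
    for k, v in pairs(src) do
        res[k] = v
        tty = tty - 1
        if tty <= 0 then
            coroutine.yield()
            tty = interval
        end
    end
    return res
end

-- Drive the copy inside a coroutine and count how often it yields.
local t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4}
local res
local co = coroutine.create(function() res = table_copy_yield(t, 2) end)
local yields = -1
repeat
    yields = yields + 1
    coroutine.resume(co)
until coroutine.status(co) == 'dead'
print(yields) -- 2: copying 4 keys with interval 2 yields twice
```

The driver loop mirrors the unit test in the patch: a 4-key table copied
with interval 2 produces exactly two yields, and the result is a distinct
table with the same key-value pairs.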
---
test/unit/util.result | 113 ++++++++++++++++++++++++++++++++++++++++
test/unit/util.test.lua | 49 +++++++++++++++++
vshard/util.lua | 40 ++++++++++++++
3 files changed, 202 insertions(+)
diff --git a/test/unit/util.result b/test/unit/util.result
index 096e36f..c4fd84d 100644
--- a/test/unit/util.result
+++ b/test/unit/util.result
@@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
fib:cancel()
---
...
+-- Yielding table minus.
+minus_yield = util.table_minus_yield
+---
+...
+minus_yield({}, {}, 1)
+---
+- []
+...
+minus_yield({}, {k = 1}, 1)
+---
+- []
+...
+minus_yield({}, {k = 1}, 0)
+---
+- []
+...
+minus_yield({k = 1}, {k = 1}, 0)
+---
+- []
+...
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
+---
+- k2: 2
+...
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
+---
+- []
+...
+-- Mismatching values are not deleted.
+minus_yield({k1 = 1}, {k1 = 2}, 10)
+---
+- k1: 1
+...
+minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
+---
+- k3: 3
+ k2: 2
+...
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ f = fiber.create(function() \
+ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ end) \
+ yield_count = 0 \
+ while f:status() ~= 'dead' do \
+ yield_count = yield_count + 1 \
+ fiber.yield() \
+ end \
+end
+---
+...
+yield_count
+---
+- 2
+...
+t
+---
+- k4: 4
+ k1: 1
+...
+-- Yielding table copy.
+copy_yield = util.table_copy_yield
+---
+...
+copy_yield({}, 1)
+---
+- []
+...
+copy_yield({k = 1}, 1)
+---
+- k: 1
+...
+copy_yield({k1 = 1, k2 = 2}, 1)
+---
+- k1: 1
+ k2: 2
+...
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ res = nil \
+ f = fiber.create(function() \
+ res = copy_yield(t, 2) \
+ end) \
+ yield_count = 0 \
+ while f:status() ~= 'dead' do \
+ yield_count = yield_count + 1 \
+ fiber.yield() \
+ end \
+end
+---
+...
+yield_count
+---
+- 2
+...
+t
+---
+- k3: 3
+ k4: 4
+ k1: 1
+ k2: 2
+...
+res
+---
+- k3: 3
+ k4: 4
+ k1: 1
+ k2: 2
+...
+t ~= res
+---
+- true
+...
diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
index 5f39e06..4d6cbe9 100644
--- a/test/unit/util.test.lua
+++ b/test/unit/util.test.lua
@@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
test_run:grep_log('default', 'reloadable_function has been started', 1000)
fib:cancel()
+
+-- Yielding table minus.
+minus_yield = util.table_minus_yield
+minus_yield({}, {}, 1)
+minus_yield({}, {k = 1}, 1)
+minus_yield({}, {k = 1}, 0)
+minus_yield({k = 1}, {k = 1}, 0)
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
+-- Mismatching values are not deleted.
+minus_yield({k1 = 1}, {k1 = 2}, 10)
+minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
+
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ f = fiber.create(function() \
+ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ end) \
+ yield_count = 0 \
+ while f:status() ~= 'dead' do \
+ yield_count = yield_count + 1 \
+ fiber.yield() \
+ end \
+end
+yield_count
+t
+
+-- Yielding table copy.
+copy_yield = util.table_copy_yield
+copy_yield({}, 1)
+copy_yield({k = 1}, 1)
+copy_yield({k1 = 1, k2 = 2}, 1)
+
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ res = nil \
+ f = fiber.create(function() \
+ res = copy_yield(t, 2) \
+ end) \
+ yield_count = 0 \
+ while f:status() ~= 'dead' do \
+ yield_count = yield_count + 1 \
+ fiber.yield() \
+ end \
+end
+yield_count
+t
+res
+t ~= res
diff --git a/vshard/util.lua b/vshard/util.lua
index d3b4e67..2362607 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
return minor >= minor_need
end
+--
+-- Copy @a src table. Fiber yields every @a interval keys copied.
+--
+local function table_copy_yield(src, interval)
+ local res = {}
+ -- Time-To-Yield.
+ local tty = interval
+ for k, v in pairs(src) do
+ res[k] = v
+ tty = tty - 1
+ if tty <= 0 then
+ fiber.yield()
+ tty = interval
+ end
+ end
+ return res
+end
+
+--
+-- Remove @a src keys from @a dst if their values match. Fiber yields every
+-- @a interval iterations.
+--
+local function table_minus_yield(dst, src, interval)
+ -- Time-To-Yield.
+ local tty = interval
+ for k, srcv in pairs(src) do
+ if dst[k] == srcv then
+ dst[k] = nil
+ end
+ tty = tty - 1
+ if tty <= 0 then
+ fiber.yield()
+ tty = interval
+ end
+ end
+ return dst
+end
+
return {
tuple_extract_key = tuple_extract_key,
reloadable_fiber_create = reloadable_fiber_create,
@@ -160,4 +198,6 @@ return {
async_task = async_task,
internal = M,
version_is_at_least = version_is_at_least,
+ table_copy_yield = table_copy_yield,
+ table_minus_yield = table_minus_yield,
}
--
2.24.3 (Apple Git-128)
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:59 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch! One comment below.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> The patch adds functions table_copy_yield and table_minus_yield.
>
> Yielding copy creates a duplicate of a table but yields every
> specified number of keys copied.
>
> Yielding minus removes matching key-value pairs specified in one
> table from another table. It yields after every specified number of
> keys processed.
>
> The functions should help to process huge Lua tables (millions of
> elements and more). These are going to be used on the storage in
> the new GC algorithm.
>
> The algorithm will need to keep a route table on the storage, just
> like on the router, but with an expiration time for the routes. Since
> the bucket count can be millions, GC will potentially operate on a
> huge Lua table and could use some yields so as not to block the TX
> thread for long.
>
> Needed for #147
> ---
> test/unit/util.result | 113 ++++++++++++++++++++++++++++++++++++++++
> test/unit/util.test.lua | 49 +++++++++++++++++
> vshard/util.lua | 40 ++++++++++++++
> 3 files changed, 202 insertions(+)
>
> diff --git a/test/unit/util.result b/test/unit/util.result
> index 096e36f..c4fd84d 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> ---
> ...
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +---
> +...
> +minus_yield({}, {}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k = 1}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +---
> +- k2: 2
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +---
> +- []
> +...
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +---
> +- k1: 1
> +...
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +---
> +- k3: 3
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + f = fiber.create(function() \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + end) \
> + yield_count = 0 \
> + while f:status() ~= 'dead' do \
> + yield_count = yield_count + 1 \
> + fiber.yield() \
> + end \
> +end
> +---
Why can't you use "csw" of fiber.self() instead? Also, is it reliable
enough to simply count yields?
Could the scheduler skip this fiber at some loop iteration? In other words,
won't this test be flaky?
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k4: 4
> + k1: 1
> +...
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +---
> +...
> +copy_yield({}, 1)
> +---
> +- []
> +...
> +copy_yield({k = 1}, 1)
> +---
> +- k: 1
> +...
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +---
> +- k1: 1
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + f = fiber.create(function() \
> + res = copy_yield(t, 2) \
> + end) \
> + yield_count = 0 \
> + while f:status() ~= 'dead' do \
> + yield_count = yield_count + 1 \
> + fiber.yield() \
> + end \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +res
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +t ~= res
> +---
> +- true
> +...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 5f39e06..4d6cbe9 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
> while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
> test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> +
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +minus_yield({}, {}, 1)
> +minus_yield({}, {k = 1}, 1)
> +minus_yield({}, {k = 1}, 0)
> +minus_yield({k = 1}, {k = 1}, 0)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + f = fiber.create(function() \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + end) \
> + yield_count = 0 \
> + while f:status() ~= 'dead' do \
> + yield_count = yield_count + 1 \
> + fiber.yield() \
> + end \
> +end
> +yield_count
> +t
> +
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +copy_yield({}, 1)
> +copy_yield({k = 1}, 1)
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + f = fiber.create(function() \
> + res = copy_yield(t, 2) \
> + end) \
> + yield_count = 0 \
> + while f:status() ~= 'dead' do \
> + yield_count = yield_count + 1 \
> + fiber.yield() \
> + end \
> +end
> +yield_count
> +t
> +res
> +t ~= res
> diff --git a/vshard/util.lua b/vshard/util.lua
> index d3b4e67..2362607 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
> return minor >= minor_need
> end
>
> +--
> +-- Copy @a src table. Fiber yields every @a interval keys copied.
> +--
> +local function table_copy_yield(src, interval)
> + local res = {}
> + -- Time-To-Yield.
> + local tty = interval
> + for k, v in pairs(src) do
> + res[k] = v
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return res
> +end
> +
> +--
> +-- Remove @a src keys from @a dst if their values match. Fiber yields every
> +-- @a interval iterations.
> +--
> +local function table_minus_yield(dst, src, interval)
> + -- Time-To-Yield.
> + local tty = interval
> + for k, srcv in pairs(src) do
> + if dst[k] == srcv then
> + dst[k] = nil
> + end
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return dst
> +end
> +
> return {
> tuple_extract_key = tuple_extract_key,
> reloadable_fiber_create = reloadable_fiber_create,
> @@ -160,4 +198,6 @@ return {
> async_task = async_task,
> internal = M,
> version_is_at_least = version_is_at_least,
> + table_copy_yield = table_copy_yield,
> + table_minus_yield = table_minus_yield,
> }
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
@ 2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:34 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Thanks for the review!
>> diff --git a/test/unit/util.result b/test/unit/util.result
>> index 096e36f..c4fd84d 100644
>> --- a/test/unit/util.result
>> +++ b/test/unit/util.result
>> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
>> +do \
>> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
>> + f = fiber.create(function() \
>> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
>> + end) \
>> + yield_count = 0 \
>> + while f:status() ~= 'dead' do \
>> + yield_count = yield_count + 1 \
>> + fiber.yield() \
>> + end \
>> +end
>> +---
>
>
> Why can't you use "csw" of fiber.self() instead? Also, is it reliable enough to simply count yields?
Yup, will work too. See the diff below.
====================
diff --git a/test/unit/util.result b/test/unit/util.result
index c4fd84d..42a361a 100644
--- a/test/unit/util.result
+++ b/test/unit/util.result
@@ -111,14 +111,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
...
do \
t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ yield_count = 0 \
f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
end) \
- yield_count = 0 \
- while f:status() ~= 'dead' do \
- yield_count = yield_count + 1 \
- fiber.yield() \
- end \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
end
---
...
@@ -151,14 +151,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
do \
t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
res = nil \
+ yield_count = 0 \
f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
res = copy_yield(t, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
end) \
- yield_count = 0 \
- while f:status() ~= 'dead' do \
- yield_count = yield_count + 1 \
- fiber.yield() \
- end \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
end
---
...
diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
index 4d6cbe9..9550a95 100644
--- a/test/unit/util.test.lua
+++ b/test/unit/util.test.lua
@@ -42,14 +42,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
do \
t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ yield_count = 0 \
f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
end) \
- yield_count = 0 \
- while f:status() ~= 'dead' do \
- yield_count = yield_count + 1 \
- fiber.yield() \
- end \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
end
yield_count
t
@@ -63,14 +63,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
do \
t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
res = nil \
+ yield_count = 0 \
f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
res = copy_yield(t, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
end) \
- yield_count = 0 \
- while f:status() ~= 'dead' do \
- yield_count = yield_count + 1 \
- fiber.yield() \
- end \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
end
yield_count
t
====================
> Could the scheduler skip this fiber at some loop iteration? In other words, won't this test be flaky?
Nope. Unless the fiber is sleeping on some condition or for a timeout, a plain
sleep(0), also known as fiber.yield(), won't skip this fiber on the next
iteration of the loop. But it does not matter if csw is used to count the yields.
Full new patch below.
====================
util: introduce yielding table functions
The patch adds functions table_copy_yield and table_minus_yield.
Yielding copy creates a duplicate of a table but yields every
specified number of keys copied.
Yielding minus removes matching key-value pairs specified in one
table from another table. It yields after every specified number of
keys processed.
The functions should help to process huge Lua tables (millions of
elements and more). These are going to be used on the storage in
the new GC algorithm.
The algorithm will need to keep a route table on the storage, just
like on the router, but with an expiration time for the routes. Since
the bucket count can be millions, GC will potentially operate on a
huge Lua table and could use some yields so as not to block the TX
thread for long.
Needed for #147
diff --git a/test/unit/util.result b/test/unit/util.result
index 096e36f..42a361a 100644
--- a/test/unit/util.result
+++ b/test/unit/util.result
@@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
fib:cancel()
---
...
+-- Yielding table minus.
+minus_yield = util.table_minus_yield
+---
+...
+minus_yield({}, {}, 1)
+---
+- []
+...
+minus_yield({}, {k = 1}, 1)
+---
+- []
+...
+minus_yield({}, {k = 1}, 0)
+---
+- []
+...
+minus_yield({k = 1}, {k = 1}, 0)
+---
+- []
+...
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
+---
+- k2: 2
+...
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
+---
+- []
+...
+-- Mismatching values are not deleted.
+minus_yield({k1 = 1}, {k1 = 2}, 10)
+---
+- k1: 1
+...
+minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
+---
+- k3: 3
+ k2: 2
+...
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ yield_count = 0 \
+ f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
+ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
+ end) \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
+end
+---
+...
+yield_count
+---
+- 2
+...
+t
+---
+- k4: 4
+ k1: 1
+...
+-- Yielding table copy.
+copy_yield = util.table_copy_yield
+---
+...
+copy_yield({}, 1)
+---
+- []
+...
+copy_yield({k = 1}, 1)
+---
+- k: 1
+...
+copy_yield({k1 = 1, k2 = 2}, 1)
+---
+- k1: 1
+ k2: 2
+...
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ res = nil \
+ yield_count = 0 \
+ f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
+ res = copy_yield(t, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
+ end) \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
+end
+---
+...
+yield_count
+---
+- 2
+...
+t
+---
+- k3: 3
+ k4: 4
+ k1: 1
+ k2: 2
+...
+res
+---
+- k3: 3
+ k4: 4
+ k1: 1
+ k2: 2
+...
+t ~= res
+---
+- true
+...
diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
index 5f39e06..9550a95 100644
--- a/test/unit/util.test.lua
+++ b/test/unit/util.test.lua
@@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
test_run:grep_log('default', 'reloadable_function has been started', 1000)
fib:cancel()
+
+-- Yielding table minus.
+minus_yield = util.table_minus_yield
+minus_yield({}, {}, 1)
+minus_yield({}, {k = 1}, 1)
+minus_yield({}, {k = 1}, 0)
+minus_yield({k = 1}, {k = 1}, 0)
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
+minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
+-- Mismatching values are not deleted.
+minus_yield({k1 = 1}, {k1 = 2}, 10)
+minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
+
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ yield_count = 0 \
+ f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
+ minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
+ end) \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
+end
+yield_count
+t
+
+-- Yielding table copy.
+copy_yield = util.table_copy_yield
+copy_yield({}, 1)
+copy_yield({k = 1}, 1)
+copy_yield({k1 = 1, k2 = 2}, 1)
+
+do \
+ t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
+ res = nil \
+ yield_count = 0 \
+ f = fiber.create(function() \
+ local csw1 = fiber.info()[fiber.id()].csw \
+ res = copy_yield(t, 2) \
+ local csw2 = fiber.info()[fiber.id()].csw \
+ yield_count = csw2 - csw1 \
+ end) \
+ test_run:wait_cond(function() return f:status() == 'dead' end) \
+end
+yield_count
+t
+res
+t ~= res
diff --git a/vshard/util.lua b/vshard/util.lua
index d3b4e67..2362607 100644
--- a/vshard/util.lua
+++ b/vshard/util.lua
@@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
return minor >= minor_need
end
+--
+-- Copy @a src table. Fiber yields every @a interval keys copied.
+--
+local function table_copy_yield(src, interval)
+ local res = {}
+ -- Time-To-Yield.
+ local tty = interval
+ for k, v in pairs(src) do
+ res[k] = v
+ tty = tty - 1
+ if tty <= 0 then
+ fiber.yield()
+ tty = interval
+ end
+ end
+ return res
+end
+
+--
+-- Remove @a src keys from @a dst if their values match. Fiber yields every
+-- @a interval iterations.
+--
+local function table_minus_yield(dst, src, interval)
+ -- Time-To-Yield.
+ local tty = interval
+ for k, srcv in pairs(src) do
+ if dst[k] == srcv then
+ dst[k] = nil
+ end
+ tty = tty - 1
+ if tty <= 0 then
+ fiber.yield()
+ tty = interval
+ end
+ end
+ return dst
+end
+
return {
tuple_extract_key = tuple_extract_key,
reloadable_fiber_create = reloadable_fiber_create,
@@ -160,4 +198,6 @@ return {
async_task = async_task,
internal = M,
version_is_at_least = version_is_at_least,
+ table_copy_yield = table_copy_yield,
+ table_minus_yield = table_minus_yield,
}
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your fixes! LGTM.
On 11/02/2021 01:34, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
>>> diff --git a/test/unit/util.result b/test/unit/util.result
>>> index 096e36f..c4fd84d 100644
>>> --- a/test/unit/util.result
>>> +++ b/test/unit/util.result
>>> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
>>> +do \
>>> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
>>> + f = fiber.create(function() \
>>> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
>>> + end) \
>>> + yield_count = 0 \
>>> + while f:status() ~= 'dead' do \
>>> + yield_count = yield_count + 1 \
>>> + fiber.yield() \
>>> + end \
>>> +end
>>> +---
>> Why can't you use "csw" of fiber.self() instead? Also, is it reliable enough to simply count yields?
> Yup, will work too. See the diff below.
>
> ====================
> diff --git a/test/unit/util.result b/test/unit/util.result
> index c4fd84d..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -111,14 +111,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> ...
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> @@ -151,14 +151,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 4d6cbe9..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -42,14 +42,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
>
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> @@ -63,14 +63,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> ====================
>
>> Could the scheduler skip this fiber at some loop iteration? In other words, won't this test be flaky?
> Nope. Unless the fiber is sleeping on some condition or for a timeout, a plain
> sleep(0), also known as fiber.yield(), won't skip this fiber on the next
> iteration of the loop. But it does not matter if csw is used to count the yields.
>
> Full new patch below.
>
> ====================
> util: introduce yielding table functions
>
> The patch adds functions table_copy_yield and table_minus_yield.
>
> Yielding copy creates a duplicate of a table but yields every
> specified number of keys copied.
>
> Yielding minus removes matching key-value pairs specified in one
> table from another table. It yields after every specified number of
> keys processed.
>
> The functions should help to process huge Lua tables (millions of
> elements and more). These are going to be used on the storage in
> the new GC algorithm.
>
> The algorithm will need to keep a route table on the storage, just
> like on the router, but with an expiration time for the routes. Since
> the bucket count can be millions, GC will potentially operate on a
> huge Lua table and could use some yields so as not to block the TX
> thread for long.
>
> Needed for #147
>
> diff --git a/test/unit/util.result b/test/unit/util.result
> index 096e36f..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> ---
> ...
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +---
> +...
> +minus_yield({}, {}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k = 1}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +---
> +- k2: 2
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +---
> +- []
> +...
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +---
> +- k1: 1
> +...
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +---
> +- k3: 3
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k4: 4
> + k1: 1
> +...
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +---
> +...
> +copy_yield({}, 1)
> +---
> +- []
> +...
> +copy_yield({k = 1}, 1)
> +---
> +- k: 1
> +...
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +---
> +- k1: 1
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +res
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +t ~= res
> +---
> +- true
> +...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 5f39e06..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
> while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
> test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> +
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +minus_yield({}, {}, 1)
> +minus_yield({}, {k = 1}, 1)
> +minus_yield({}, {k = 1}, 0)
> +minus_yield({k = 1}, {k = 1}, 0)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +copy_yield({}, 1)
> +copy_yield({k = 1}, 1)
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +res
> +t ~= res
> diff --git a/vshard/util.lua b/vshard/util.lua
> index d3b4e67..2362607 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
> return minor >= minor_need
> end
>
> +--
> +-- Copy @a src table. Fiber yields every @a interval keys copied.
> +--
> +local function table_copy_yield(src, interval)
> + local res = {}
> + -- Time-To-Yield.
> + local tty = interval
> + for k, v in pairs(src) do
> + res[k] = v
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return res
> +end
> +
> +--
> +-- Remove @a src keys from @a dst if their values match. Fiber yields every
> +-- @a interval iterations.
> +--
> +local function table_minus_yield(dst, src, interval)
> + -- Time-To-Yield.
> + local tty = interval
> + for k, srcv in pairs(src) do
> + if dst[k] == srcv then
> + dst[k] = nil
> + end
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return dst
> +end
> +
> return {
> tuple_extract_key = tuple_extract_key,
> reloadable_fiber_create = reloadable_fiber_create,
> @@ -160,4 +198,6 @@ return {
> async_task = async_task,
> internal = M,
> version_is_at_least = version_is_at_least,
> + table_copy_yield = table_copy_yield,
> + table_minus_yield = table_minus_yield,
> }
>
^ permalink raw reply [flat|nested] 36+ messages in thread
* [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (4 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector Vladislav Shpilevoy via Tarantool-patches
` (3 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Some options in vshard are going to be eventually deprecated. For
instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
deleted since it appears not to be so useful, 'sync_timeout' is
totally unnecessary since any 'sync' can take a timeout per-call.
But the patch is motivated by 'collect_bucket_garbage_interval'
which is going to become unused in the new GC algorithm.
New GC will be reactive instead of proactive. Instead of periodically
polling the _bucket space, it will react to the relevant events
immediately. This will make the 'collect interval' unused.
The option will be deprecated, and in some distant future release
its usage will become an error.
Needed for #147
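The validation change in this patch is small: a template entry flagged `is_deprecated` short-circuits the normal type/presence checks and only logs a warning when the option is actually set. A hedged Python sketch of that control flow (the template contents are illustrative, not the real cfg.lua template):

```python
import warnings

# Hypothetical template mirroring the cfg.lua structure: deprecated
# options are only warned about; other options are validated as before.
CFG_TEMPLATE = {
    'collect_bucket_garbage_interval': {
        'name': 'Garbage bucket collect interval', 'is_deprecated': True,
    },
    'bucket_count': {'name': 'Bucket count', 'is_optional': False},
}

def validate_config(config):
    for key, tmpl in CFG_TEMPLATE.items():
        value = config.get(key)
        if tmpl.get('is_deprecated'):
            # A deprecated option is never an error, set or not.
            if value is not None:
                warnings.warn('Option "%s" is deprecated' % tmpl['name'])
        elif value is None:
            if not tmpl.get('is_optional'):
                raise ValueError('%s must be specified' % tmpl['name'])
```

The key design point matches the diff below: using a deprecated option does not break reconfiguration, it only produces a warning.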
---
vshard/cfg.lua | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 1ef1899..28c3400 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -59,7 +59,11 @@ local function validate_config(config, template, check_arg)
local value = config[key]
local name = template_value.name
local expected_type = template_value.type
- if value == nil then
+ if template_value.is_deprecated then
+ if value ~= nil then
+ log.warn('Option "%s" is deprecated', name)
+ end
+ elseif value == nil then
if not template_value.is_optional then
error(string.format('%s must be specified', name))
else
--
2.24.3 (Apple Git-128)
* Re: [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 8:59 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch!
Is it possible to extend the log message to "Option is deprecated and has no
effect anymore"?
Also, for some options a message like "Option is deprecated, use ...
instead" could be useful (e.g. for "weights").
It seems this should be more configurable and give the user a hint about what to do.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Some options in vshard are going to be eventually deprecated. For
> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
typo: weigts -> weights
> deleted since it appears not to be so useful, 'sync_timeout' is
> totally unnecessary since any 'sync' can take a timeout per-call.
>
> But the patch is motivated by 'collect_bucket_garbage_interval'
> which is going to become unused in the new GC algorithm.
>
> New GC will be reactive instead of proactive. Instead of periodic
> polling of _bucket space it will react on needed events
> immediately. This will make the 'collect interval' unused.
>
> The option will be deprecated and eventually in some far future
> release its usage will lead to an error.
>
> Needed for #147
> ---
> vshard/cfg.lua | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 1ef1899..28c3400 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -59,7 +59,11 @@ local function validate_config(config, template, check_arg)
> local value = config[key]
> local name = template_value.name
> local expected_type = template_value.type
> - if value == nil then
> + if template_value.is_deprecated then
> + if value ~= nil then
> + log.warn('Option "%s" is deprecated', name)
> + end
> + elseif value == nil then
> if not template_value.is_optional then
> error(string.format('%s must be specified', name))
> else
* Re: [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
@ 2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:34 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Thanks for the review!
On 10.02.2021 09:59, Oleg Babin wrote:
> Thanks for your patch!
>
> Is it possible to extend log message to "Option is deprecated and has no effect anymore"?
Good idea. See the diff in this commit.
====================
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 28c3400..f7d5dbc 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -61,7 +61,13 @@ local function validate_config(config, template, check_arg)
local expected_type = template_value.type
if template_value.is_deprecated then
if value ~= nil then
- log.warn('Option "%s" is deprecated', name)
+ local reason = template_value.reason
+ if reason then
+ reason = '. '..reason
+ else
+ reason = ''
+ end
+ log.warn('Option "%s" is deprecated'..reason, name)
end
elseif value == nil then
if not template_value.is_optional then
====================
And in the next commit:
====================
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index f7dd4c1..63d5414 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -252,6 +252,7 @@ local cfg_template = {
},
collect_bucket_garbage_interval = {
name = 'Garbage bucket collect interval', is_deprecated = true,
+ reason = 'Has no effect anymore'
},
collect_lua_garbage = {
type = 'boolean', name = 'Garbage Lua collect necessity',
====================
> Also for some options could be useful: "Option is deprecated, use ... instead" (e.g. for "weights").
With the updated version I can specify any 'reason', such as
'has no effect', 'use ... instead', etc.
> Seems it should be more configurable and gives some hint for user to do.
>
>
> On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
>> Some options in vshard are going to be eventually deprecated. For
>> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
>
>
> typo: weigts -> weights
Fixed. See the full new patch below.
====================
cfg: introduce 'deprecated option' feature
Some options in vshard are going to be eventually deprecated. For
instance, 'weights' will be renamed, 'collect_lua_garbage' may be
deleted since it appears not to be so useful, 'sync_timeout' is
totally unnecessary since any 'sync' can take a timeout per-call.
But the patch is motivated by 'collect_bucket_garbage_interval'
which is going to become unused in the new GC algorithm.
New GC will be reactive instead of proactive. Instead of periodically
polling the _bucket space, it will react to the relevant events
immediately. This will make the 'collect interval' unused.
The option will be deprecated and eventually in some far future
release its usage will lead to an error.
Needed for #147
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 1ef1899..f7d5dbc 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -59,7 +59,17 @@ local function validate_config(config, template, check_arg)
local value = config[key]
local name = template_value.name
local expected_type = template_value.type
- if value == nil then
+ if template_value.is_deprecated then
+ if value ~= nil then
+ local reason = template_value.reason
+ if reason then
+ reason = '. '..reason
+ else
+ reason = ''
+ end
+ log.warn('Option "%s" is deprecated'..reason, name)
+ end
+ elseif value == nil then
if not template_value.is_optional then
error(string.format('%s must be specified', name))
else
====================
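The reason-suffix handling in the final patch above is just careful string building: the format placeholder must stay before the appended reason so `log.warn` still substitutes the option name. A small Python rendering of that message construction:

```python
def deprecation_message(name, reason=None):
    # Mirrors the string building in the diff: the optional reason is
    # appended after a period, and the option name fills the placeholder.
    msg = 'Option "%s" is deprecated' % name
    if reason:
        msg += '. ' + reason
    return msg
```

So `collect_bucket_garbage_interval` with reason `'Has no effect anymore'` yields the full warning text, while options without a reason keep the short form.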
* Re: [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your fixes. LGTM!
On 11/02/2021 01:34, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 09:59, Oleg Babin wrote:
>> Thanks for your patch!
>>
>> Is it possible to extend log message to "Option is deprecated and has no effect anymore"?
> Good idea. See the diff in this commit.
>
> ====================
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 28c3400..f7d5dbc 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -61,7 +61,13 @@ local function validate_config(config, template, check_arg)
> local expected_type = template_value.type
> if template_value.is_deprecated then
> if value ~= nil then
> - log.warn('Option "%s" is deprecated', name)
> + local reason = template_value.reason
> + if reason then
> + reason = '. '..reason
> + else
> + reason = ''
> + end
> + log.warn('Option "%s" is deprecated'..reason, name)
> end
> elseif value == nil then
> if not template_value.is_optional then
> ====================
>
> And in the next commit:
>
> ====================
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index f7dd4c1..63d5414 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -252,6 +252,7 @@ local cfg_template = {
> },
> collect_bucket_garbage_interval = {
> name = 'Garbage bucket collect interval', is_deprecated = true,
> + reason = 'Has no effect anymore'
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
>
> ====================
>
>> Also for some options could be useful: "Option is deprecated, use ... instead" (e.g. for "weights").
> With the updated version I can specify any 'reason'. Such as
> 'has no affect', 'use ... instead', etc.
>
>> Seems it should be more configurable and gives some hint for user to do.
>>
>>
>> On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
>>> Some options in vshard are going to be eventually deprecated. For
>>> instance, 'weigts' will be renamed, 'collect_lua_garbage' may be
>> typo: weigts -> weights
> Fixed. See the full new patch below.
>
> ====================
> cfg: introduce 'deprecated option' feature
>
> Some options in vshard are going to be eventually deprecated. For
> instance, 'weights' will be renamed, 'collect_lua_garbage' may be
> deleted since it appears not to be so useful, 'sync_timeout' is
> totally unnecessary since any 'sync' can take a timeout per-call.
>
> But the patch is motivated by 'collect_bucket_garbage_interval'
> which is going to become unused in the new GC algorithm.
>
> New GC will be reactive instead of proactive. Instead of periodic
> polling of _bucket space it will react on needed events
> immediately. This will make the 'collect interval' unused.
>
> The option will be deprecated and eventually in some far future
> release its usage will lead to an error.
>
> Needed for #147
>
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 1ef1899..f7d5dbc 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -59,7 +59,17 @@ local function validate_config(config, template, check_arg)
> local value = config[key]
> local name = template_value.name
> local expected_type = template_value.type
> - if value == nil then
> + if template_value.is_deprecated then
> + if value ~= nil then
> + local reason = template_value.reason
> + if reason then
> + reason = '. '..reason
> + else
> + reason = ''
> + end
> + log.warn('Option "%s" is deprecated'..reason, name)
> + end
> + elseif value == nil then
> if not template_value.is_optional then
> error(string.format('%s must be specified', name))
> else
>
> ====================
* [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (5 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery Vladislav Shpilevoy via Tarantool-patches
` (2 subsequent siblings)
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
The garbage collector is a fiber on a master node which deletes
GARBAGE and SENT buckets along with their data.
It was proactive: it used to wake up at a constant interval to
find and delete such buckets.
But this won't work with the future 'map-reduce' feature.
As a preparation stage, map-reduce will need to ensure that all
buckets on a storage are readable and writable. With the current
GC algorithm, if a bucket is sent, it won't be deleted for the next
5 seconds by default. During this time no new map-reduce requests
can execute.
This is not acceptable. Neither is a too frequent wakeup of the GC
fiber, because it would waste TX thread time.
The patch makes the GC fiber wake up not on a timeout but on events
happening in the _bucket space. The GC fiber sleeps on a condition
variable which is signaled when _bucket is changed.
Once GC sees work to do, it won't sleep until the work is done. It
will only yield.
This makes GC delete SENT and GARBAGE buckets as soon as possible,
reducing the waiting time for incoming map-reduce requests.
Needed for #147
@TarantoolBot document
Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
It was used to specify the interval between bucket garbage
collection steps. It was needed because garbage collection in
vshard was proactive. It didn't react to newly appeared garbage
buckets immediately.
Starting from 0.1.17, garbage collection is reactive. It starts
processing garbage buckets immediately as they appear and sleeps
the rest of the time. The option is no longer used and does not
affect anything.
I suppose it can be deleted from the documentation. Or left with
a big label 'deprecated' + the explanation above.
An attempt to use the option does not cause an error, but logs a
warning.
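The reactive scheme described above — sleep on a condition variable, get signaled by a `_bucket` trigger, and drain the whole backlog before sleeping again — can be sketched with a thread standing in for the fiber and `threading.Condition` standing in for `fiber.cond`. Names like `BucketGC` and `on_bucket_change` are illustrative, not the vshard API:

```python
import threading

class BucketGC:
    """Event-driven collector: sleeps on a condition variable instead of polling."""
    def __init__(self):
        self.cond = threading.Condition()
        self.garbage = []    # ids of buckets that became SENT/GARBAGE
        self.collected = []
        self.stopped = False
        self.worker = threading.Thread(target=self._run)
        self.worker.start()

    def on_bucket_change(self, bucket_id):
        # Analogue of the _bucket trigger: record the event and signal the GC.
        with self.cond:
            self.garbage.append(bucket_id)
            self.cond.notify()

    def stop(self):
        with self.cond:
            self.stopped = True
            self.cond.notify()
        self.worker.join()

    def _run(self):
        while True:
            with self.cond:
                # Sleep only while there is nothing to do.
                while not self.garbage and not self.stopped:
                    self.cond.wait()
                if self.stopped and not self.garbage:
                    return
                batch, self.garbage = self.garbage, []
            # Drop the whole backlog before sleeping again (the real GC
            # only yields here, it never sleeps mid-work).
            self.collected.extend(batch)
```

This is why the test helper below no longer needs `garbage_collector_wakeup()`: a sent bucket is collected as soon as the trigger fires, with no fixed-interval delay.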
---
test/lua_libs/storage_template.lua | 1 -
test/misc/reconfigure.result | 10 -
test/misc/reconfigure.test.lua | 3 -
test/rebalancer/bucket_ref.result | 12 --
test/rebalancer/bucket_ref.test.lua | 3 -
test/rebalancer/errinj.result | 11 --
test/rebalancer/errinj.test.lua | 5 -
test/rebalancer/receiving_bucket.result | 8 -
test/rebalancer/receiving_bucket.test.lua | 1 -
test/reload_evolution/storage.result | 2 +-
test/router/reroute_wrong_bucket.result | 8 +-
test/router/reroute_wrong_bucket.test.lua | 4 +-
test/storage/recovery.result | 3 +-
test/storage/storage.result | 10 +-
test/storage/storage.test.lua | 1 +
test/unit/config.result | 35 +---
test/unit/config.test.lua | 16 +-
test/unit/garbage.result | 106 ++++++----
test/unit/garbage.test.lua | 47 +++--
test/unit/garbage_errinj.result | 223 ----------------------
test/unit/garbage_errinj.test.lua | 73 -------
vshard/cfg.lua | 4 +-
vshard/consts.lua | 5 +-
vshard/storage/init.lua | 207 ++++++++++----------
vshard/storage/reload_evolution.lua | 8 +
25 files changed, 233 insertions(+), 573 deletions(-)
delete mode 100644 test/unit/garbage_errinj.result
delete mode 100644 test/unit/garbage_errinj.test.lua
diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
index 21409bd..8df89f6 100644
--- a/test/lua_libs/storage_template.lua
+++ b/test/lua_libs/storage_template.lua
@@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
return true
end
vshard.storage.recovery_wakeup()
- vshard.storage.garbage_collector_wakeup()
end)
end
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index 168be5d..3b34841 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
cfg.rebalancer_max_receiving = 1000
---
...
-cfg.collect_bucket_garbage_interval = 100
----
-...
cfg.invalid_option = 'kek'
---
...
@@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
---
- true
...
-vshard.storage.internal.collect_bucket_garbage_interval ~= 100
----
-- true
-...
cfg.sync_timeout = nil
---
...
@@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
cfg.rebalancer_max_receiving = nil
---
...
-cfg.collect_bucket_garbage_interval = nil
----
-...
cfg.invalid_option = nil
---
...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index e891010..348628c 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
cfg.sync_timeout = 100
cfg.collect_lua_garbage = true
cfg.rebalancer_max_receiving = 1000
-cfg.collect_bucket_garbage_interval = 100
cfg.invalid_option = 'kek'
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
not vshard.storage.internal.collect_lua_garbage
vshard.storage.internal.sync_timeout
vshard.storage.internal.rebalancer_max_receiving ~= 1000
-vshard.storage.internal.collect_bucket_garbage_interval ~= 100
cfg.sync_timeout = nil
cfg.collect_lua_garbage = nil
cfg.rebalancer_max_receiving = nil
-cfg.collect_bucket_garbage_interval = nil
cfg.invalid_option = nil
--
diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
index b8fc7ff..9df7480 100644
--- a/test/rebalancer/bucket_ref.result
+++ b/test/rebalancer/bucket_ref.result
@@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
- true
...
-- Force GC to take an RO lock on the bucket now.
-vshard.storage.garbage_collector_wakeup()
----
-...
vshard.storage.buckets_info(1)
---
- 1:
@@ -203,7 +200,6 @@ while true do
if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
break
end
- vshard.storage.garbage_collector_wakeup()
fiber.sleep(0.01)
end;
---
@@ -235,14 +231,6 @@ finish_refs = true
while f1:status() ~= 'dead' do fiber.sleep(0.01) end
---
...
-vshard.storage.buckets_info(1)
----
-- 1:
- status: garbage
- ro_lock: true
- destination: <replicaset_2>
- id: 1
-...
wait_bucket_is_collected(1)
---
...
diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
index 213ced3..1b032ff 100644
--- a/test/rebalancer/bucket_ref.test.lua
+++ b/test/rebalancer/bucket_ref.test.lua
@@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
vshard.storage.bucket_ref(1, 'read')
vshard.storage.bucket_unref(1, 'read')
-- Force GC to take an RO lock on the bucket now.
-vshard.storage.garbage_collector_wakeup()
vshard.storage.buckets_info(1)
_ = test_run:cmd("setopt delimiter ';'")
while true do
@@ -64,7 +63,6 @@ while true do
if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
break
end
- vshard.storage.garbage_collector_wakeup()
fiber.sleep(0.01)
end;
_ = test_run:cmd("setopt delimiter ''");
@@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
vshard.storage.bucket_refro(1)
finish_refs = true
while f1:status() ~= 'dead' do fiber.sleep(0.01) end
-vshard.storage.buckets_info(1)
wait_bucket_is_collected(1)
_ = test_run:switch('box_2_a')
vshard.storage.buckets_info(1)
diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
index e50eb72..0ddb1c9 100644
--- a/test/rebalancer/errinj.result
+++ b/test/rebalancer/errinj.result
@@ -226,17 +226,6 @@ ret2, err2
- true
- null
...
-_bucket:get{35}
----
-- [35, 'sent', '<replicaset_2>']
-...
-_bucket:get{36}
----
-- [36, 'sent', '<replicaset_2>']
-...
--- Buckets became 'active' on box_2_a, but still are sending on
--- box_1_a. Wait until it is marked as garbage on box_1_a by the
--- recovery fiber.
wait_bucket_is_collected(35)
---
...
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
index 2cc4a69..a60f3d7 100644
--- a/test/rebalancer/errinj.test.lua
+++ b/test/rebalancer/errinj.test.lua
@@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
ret1, err1
ret2, err2
-_bucket:get{35}
-_bucket:get{36}
--- Buckets became 'active' on box_2_a, but still are sending on
--- box_1_a. Wait until it is marked as garbage on box_1_a by the
--- recovery fiber.
wait_bucket_is_collected(35)
wait_bucket_is_collected(36)
_ = test_run:switch('box_2_a')
diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
index 7d3612b..ad93445 100644
--- a/test/rebalancer/receiving_bucket.result
+++ b/test/rebalancer/receiving_bucket.result
@@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
---
- true
...
-vshard.storage.buckets_info(1)
----
-- 1:
- status: sent
- ro_lock: true
- destination: <replicaset_1>
- id: 1
-...
wait_bucket_is_collected(1)
---
...
diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
index 24534b3..2cf6382 100644
--- a/test/rebalancer/receiving_bucket.test.lua
+++ b/test/rebalancer/receiving_bucket.test.lua
@@ -136,7 +136,6 @@ box.space.test3:select{100}
-- Now the bucket is unreferenced and can be transferred.
_ = test_run:switch('box_2_a')
vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
-vshard.storage.buckets_info(1)
wait_bucket_is_collected(1)
vshard.storage.buckets_info(1)
_ = test_run:switch('box_1_a')
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
index 753687f..9d30a04 100644
--- a/test/reload_evolution/storage.result
+++ b/test/reload_evolution/storage.result
@@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
...
vshard.storage.internal.reload_version
---
-- 2
+- 3
...
--
-- gh-237: should be only one trigger. During gh-237 the trigger installation
diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
index 049bdef..ac340eb 100644
--- a/test/router/reroute_wrong_bucket.result
+++ b/test/router/reroute_wrong_bucket.result
@@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
---
- true
...
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
---
...
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
@@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
---
- true
...
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
---
...
vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
@@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
err
---
- bucket_id: 100
- reason: write is prohibited
+ reason: Not found
code: 1
destination: ac522f65-aa94-4134-9f64-51ee384f1a54
type: ShardingError
name: WRONG_BUCKET
- message: 'Cannot perform action with bucket 100, reason: write is prohibited'
+ message: 'Cannot perform action with bucket 100, reason: Not found'
...
--
-- Now try again, but update configuration during call(). It must
diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
index 9e6e804..207aac3 100644
--- a/test/router/reroute_wrong_bucket.test.lua
+++ b/test/router/reroute_wrong_bucket.test.lua
@@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
test_run:cmd('create server router_1 with script="router/router_1.lua"')
test_run:cmd('start server router_1')
test_run:switch('storage_1_a')
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
vshard.storage.rebalancer_disable()
for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
test_run:switch('storage_2_a')
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
vshard.storage.rebalancer_disable()
for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
diff --git a/test/storage/recovery.result b/test/storage/recovery.result
index f833fe7..8ccb0b9 100644
--- a/test/storage/recovery.result
+++ b/test/storage/recovery.result
@@ -79,8 +79,7 @@ _bucket = box.space._bucket
...
_bucket:select{}
---
-- - [2, 'garbage', '<replicaset_2>']
- - [3, 'garbage', '<replicaset_2>']
+- []
...
_ = test_run:switch('storage_2_a')
---
diff --git a/test/storage/storage.result b/test/storage/storage.result
index 424bc4c..0550ad1 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
---
- true
...
+wait_bucket_is_collected(1)
+---
+...
_ = test_run:switch("storage_2_a")
---
...
@@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
...
vshard.storage.buckets_info()
---
-- 1:
- status: sent
- ro_lock: true
- destination: <replicaset_2>
- id: 1
- 2:
+- 2:
status: active
id: 2
...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index d631b51..d8fbd94 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
-- Successful transfer.
vshard.storage.bucket_send(1, util.replicasets[2])
+wait_bucket_is_collected(1)
_ = test_run:switch("storage_2_a")
vshard.storage.buckets_info()
_ = test_run:switch("storage_1_a")
diff --git a/test/unit/config.result b/test/unit/config.result
index dfd0219..e0b2482 100644
--- a/test/unit/config.result
+++ b/test/unit/config.result
@@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
--
-- gh-77: garbage collection options.
--
-cfg.collect_bucket_garbage_interval = 'str'
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = 0
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = -1
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = 100.5
----
-...
-_ = lcfg.check(cfg)
----
-...
cfg.collect_lua_garbage = 100
---
...
@@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
cfg.rebalancer_max_sending = nil
---
...
-cfg.sharding = nil
+--
+-- Deprecated option does not break anything.
+--
+cfg.collect_bucket_garbage_interval = 100
+---
+...
+_ = lcfg.check(cfg)
---
...
diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
index ada43db..a1c9f07 100644
--- a/test/unit/config.test.lua
+++ b/test/unit/config.test.lua
@@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
--
-- gh-77: garbage collection options.
--
-cfg.collect_bucket_garbage_interval = 'str'
-check(cfg)
-cfg.collect_bucket_garbage_interval = 0
-check(cfg)
-cfg.collect_bucket_garbage_interval = -1
-check(cfg)
-cfg.collect_bucket_garbage_interval = 100.5
-_ = lcfg.check(cfg)
-
cfg.collect_lua_garbage = 100
check(cfg)
cfg.collect_lua_garbage = true
@@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
cfg.rebalancer_max_sending = 15
lcfg.check(cfg).rebalancer_max_sending
cfg.rebalancer_max_sending = nil
-cfg.sharding = nil
+
+--
+-- Deprecated option does not break anything.
+--
+cfg.collect_bucket_garbage_interval = 100
+_ = lcfg.check(cfg)
diff --git a/test/unit/garbage.result b/test/unit/garbage.result
index 74d9ccf..a530496 100644
--- a/test/unit/garbage.result
+++ b/test/unit/garbage.result
@@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
vshard.storage.internal.shard_index = 'bucket_id'
---
...
-vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
----
-...
--
-- Find nothing if no bucket_id anywhere, or there is no index
-- by it, or bucket_id is not unsigned.
@@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
format[2] = {name = 'status', type = 'string'}
---
...
+format[3] = {name = 'destination', type = 'string', is_nullable = true}
+---
+...
_bucket = box.schema.create_space('_bucket', {format = format})
---
...
@@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
---
- [3, 'active']
...
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
----
-- [4, 'sent']
-...
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
----
-- [5, 'garbage']
-...
-_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
----
-- [6, 'garbage']
-...
-_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
----
-- [200, 'garbage']
-...
s = box.schema.create_space('test', {engine = engine})
---
...
@@ -213,7 +197,7 @@ s:replace{4, 2}
---
- [4, 2]
...
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
+gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
---
...
s2 = box.schema.create_space('test2', {engine = engine})
@@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
s2:replace{6, 4}
s2:replace{7, 5}
s2:replace{7, 6}
+ _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
+ _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
+ _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
+ _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
end;
---
...
@@ -267,12 +255,22 @@ fill_spaces_with_garbage()
---
- 1107
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
---
-- - 5
- - 6
- - 200
- true
+- null
+...
+route_map
+---
+- - null
+ - null
+ - null
+ - null
+ - null
+ - destination2
...
#s2:select{}
---
@@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
---
- 7
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
---
-- - 4
- true
+- null
+...
+route_map
+---
+- - null
+ - null
+ - null
+ - destination1
...
s2:select{}
---
@@ -303,17 +311,22 @@ s:select{}
- [6, 100]
...
-- Nothing deleted - update collected generation.
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
---
-- - 5
- - 6
- - 200
- true
+- null
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
---
-- - 4
- true
+- null
+...
+route_map
+---
+- []
...
#s2:select{}
---
@@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
fill_spaces_with_garbage()
---
...
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
+_ = _bucket:on_replace(function() \
+ local gen = vshard.storage.internal.bucket_generation \
+ vshard.storage.internal.bucket_generation = gen + 1 \
+ vshard.storage.internal.bucket_generation_cond:broadcast() \
+end)
---
...
f = fiber.create(vshard.storage.internal.gc_bucket_f)
---
...
-- Wait until garbage collection is finished.
-while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
---
+- true
...
s:select{}
---
@@ -360,7 +378,6 @@ _bucket:select{}
- - [1, 'active']
- [2, 'receiving']
- [3, 'active']
- - [4, 'sent']
...
--
-- Test deletion of 'sent' buckets after a specified timeout.
@@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
- [2, 'sent']
...
-- Wait deletion after a while.
-while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{2} end)
---
+- true
...
_bucket:select{}
---
@@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
---
- [4, 'sent']
...
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
---
+- true
...
--
-- Test WAL errors during deletion from _bucket.
@@ -434,11 +453,14 @@ s:replace{6, 4}
---
- [6, 4]
...
-while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_log('default', 'Error during garbage collection step', \
+ 65536, 10)
---
+- Error during garbage collection step
...
-while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return #sk:select{4} == 0 end)
---
+- true
...
s:select{}
---
@@ -454,8 +476,9 @@ _bucket:select{}
_ = _bucket:on_replace(nil, rollback_on_delete)
---
...
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
---
+- true
...
f:cancel()
---
@@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
f = fiber.create(vshard.storage.internal.gc_bucket_f)
---
...
-while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return _bucket:count() == 0 end)
---
+- true
...
_bucket:select{}
---
diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
index 30079fa..250afb0 100644
--- a/test/unit/garbage.test.lua
+++ b/test/unit/garbage.test.lua
@@ -15,7 +15,6 @@ end;
test_run:cmd("setopt delimiter ''");
vshard.storage.internal.shard_index = 'bucket_id'
-vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
--
-- Find nothing if no bucket_id anywhere, or there is no index
@@ -75,16 +74,13 @@ s:drop()
format = {}
format[1] = {name = 'id', type = 'unsigned'}
format[2] = {name = 'status', type = 'string'}
+format[3] = {name = 'destination', type = 'string', is_nullable = true}
_bucket = box.schema.create_space('_bucket', {format = format})
_ = _bucket:create_index('pk')
_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
-_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
-_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
s = box.schema.create_space('test', {engine = engine})
pk = s:create_index('pk')
@@ -94,7 +90,7 @@ s:replace{2, 1}
s:replace{3, 2}
s:replace{4, 2}
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
+gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
s2 = box.schema.create_space('test2', {engine = engine})
pk2 = s2:create_index('pk')
sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
@@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
s2:replace{6, 4}
s2:replace{7, 5}
s2:replace{7, 6}
+ _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
+ _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
+ _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
+ _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
end;
test_run:cmd("setopt delimiter ''");
@@ -121,15 +121,21 @@ fill_spaces_with_garbage()
#s2:select{}
#s:select{}
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
+route_map
#s2:select{}
#s:select{}
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
+route_map
s2:select{}
s:select{}
-- Nothing deleted - update collected generation.
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
+route_map
#s2:select{}
#s:select{}
@@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-- Test continuous garbage collection via background fiber.
--
fill_spaces_with_garbage()
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
+_ = _bucket:on_replace(function() \
+ local gen = vshard.storage.internal.bucket_generation \
+ vshard.storage.internal.bucket_generation = gen + 1 \
+ vshard.storage.internal.bucket_generation_cond:broadcast() \
+end)
f = fiber.create(vshard.storage.internal.gc_bucket_f)
-- Wait until garbage collection is finished.
-while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
s:select{}
s2:select{}
-- Check garbage bucket is deleted by background fiber.
@@ -150,7 +160,7 @@ _bucket:select{}
--
_bucket:replace{2, vshard.consts.BUCKET.SENT}
-- Wait deletion after a while.
-while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{2} end)
_bucket:select{}
s:select{}
s2:select{}
@@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
s:replace{5, 4}
s:replace{6, 4}
_bucket:replace{4, vshard.consts.BUCKET.SENT}
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
--
-- Test WAL errors during deletion from _bucket.
@@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
_bucket:replace{4, vshard.consts.BUCKET.SENT}
s:replace{5, 4}
s:replace{6, 4}
-while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
-while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_log('default', 'Error during garbage collection step', \
+ 65536, 10)
+test_run:wait_cond(function() return #sk:select{4} == 0 end)
s:select{}
_bucket:select{}
_ = _bucket:on_replace(nil, rollback_on_delete)
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
f:cancel()
@@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
#s:select{}
#s2:select{}
f = fiber.create(vshard.storage.internal.gc_bucket_f)
-while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return _bucket:count() == 0 end)
_bucket:select{}
s:select{}
s2:select{}
diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
deleted file mode 100644
index 92c8039..0000000
--- a/test/unit/garbage_errinj.result
+++ /dev/null
@@ -1,223 +0,0 @@
-test_run = require('test_run').new()
----
-...
-vshard = require('vshard')
----
-...
-fiber = require('fiber')
----
-...
-engine = test_run:get_cfg('engine')
----
-...
-vshard.storage.internal.shard_index = 'bucket_id'
----
-...
-format = {}
----
-...
-format[1] = {name = 'id', type = 'unsigned'}
----
-...
-format[2] = {name = 'status', type = 'string', is_nullable = true}
----
-...
-_bucket = box.schema.create_space('_bucket', {format = format})
----
-...
-_ = _bucket:create_index('pk')
----
-...
-_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
----
-...
-_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
----
-- [1, 'active']
-...
-_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
----
-- [2, 'receiving']
-...
-_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
----
-- [3, 'active']
-...
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
----
-- [4, 'sent']
-...
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
----
-- [5, 'garbage']
-...
-s = box.schema.create_space('test', {engine = engine})
----
-...
-pk = s:create_index('pk')
----
-...
-sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
----
-...
-s:replace{1, 1}
----
-- [1, 1]
-...
-s:replace{2, 1}
----
-- [2, 1]
-...
-s:replace{3, 2}
----
-- [3, 2]
-...
-s:replace{4, 2}
----
-- [4, 2]
-...
-s:replace{5, 100}
----
-- [5, 100]
-...
-s:replace{6, 100}
----
-- [6, 100]
-...
-s:replace{7, 4}
----
-- [7, 4]
-...
-s:replace{8, 5}
----
-- [8, 5]
-...
-s2 = box.schema.create_space('test2', {engine = engine})
----
-...
-pk2 = s2:create_index('pk')
----
-...
-sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
----
-...
-s2:replace{1, 1}
----
-- [1, 1]
-...
-s2:replace{3, 3}
----
-- [3, 3]
-...
-for i = 7, 1107 do s:replace{i, 200} end
----
-...
-s2:replace{4, 200}
----
-- [4, 200]
-...
-s2:replace{5, 100}
----
-- [5, 100]
-...
-s2:replace{5, 300}
----
-- [5, 300]
-...
-s2:replace{6, 4}
----
-- [6, 4]
-...
-s2:replace{7, 5}
----
-- [7, 5]
-...
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
----
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
----
-- - 4
-- true
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
----
-- - 5
-- true
-...
---
--- Test _bucket generation change during garbage buckets search.
---
-s:truncate()
----
-...
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
----
-...
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
----
-...
-f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
----
-...
-_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
----
-- [4, 'garbage']
-...
-s:replace{5, 4}
----
-- [5, 4]
-...
-s:replace{6, 4}
----
-- [6, 4]
-...
-#s:select{}
----
-- 2
-...
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
----
-...
-while f:status() ~= 'dead' do fiber.sleep(0.1) end
----
-...
--- Nothing is deleted - _bucket:replace() has changed _bucket
--- generation during search of garbage buckets.
-#s:select{}
----
-- 2
-...
-_bucket:select{4}
----
-- - [4, 'garbage']
-...
--- Next step deletes garbage ok.
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
----
-- []
-- true
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
----
-- - 4
- - 5
-- true
-...
-#s:select{}
----
-- 0
-...
-_bucket:delete{4}
----
-- [4, 'garbage']
-...
-s2:drop()
----
-...
-s:drop()
----
-...
-_bucket:drop()
----
-...
diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
deleted file mode 100644
index 31184b9..0000000
--- a/test/unit/garbage_errinj.test.lua
+++ /dev/null
@@ -1,73 +0,0 @@
-test_run = require('test_run').new()
-vshard = require('vshard')
-fiber = require('fiber')
-
-engine = test_run:get_cfg('engine')
-vshard.storage.internal.shard_index = 'bucket_id'
-
-format = {}
-format[1] = {name = 'id', type = 'unsigned'}
-format[2] = {name = 'status', type = 'string', is_nullable = true}
-_bucket = box.schema.create_space('_bucket', {format = format})
-_ = _bucket:create_index('pk')
-_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
-_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
-_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
-
-s = box.schema.create_space('test', {engine = engine})
-pk = s:create_index('pk')
-sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
-s:replace{1, 1}
-s:replace{2, 1}
-s:replace{3, 2}
-s:replace{4, 2}
-s:replace{5, 100}
-s:replace{6, 100}
-s:replace{7, 4}
-s:replace{8, 5}
-
-s2 = box.schema.create_space('test2', {engine = engine})
-pk2 = s2:create_index('pk')
-sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
-s2:replace{1, 1}
-s2:replace{3, 3}
-for i = 7, 1107 do s:replace{i, 200} end
-s2:replace{4, 200}
-s2:replace{5, 100}
-s2:replace{5, 300}
-s2:replace{6, 4}
-s2:replace{7, 5}
-
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-
---
--- Test _bucket generation change during garbage buckets search.
---
-s:truncate()
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
-f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
-_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
-s:replace{5, 4}
-s:replace{6, 4}
-#s:select{}
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
-while f:status() ~= 'dead' do fiber.sleep(0.1) end
--- Nothing is deleted - _bucket:replace() has changed _bucket
--- generation during search of garbage buckets.
-#s:select{}
-_bucket:select{4}
--- Next step deletes garbage ok.
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-#s:select{}
-_bucket:delete{4}
-
-s2:drop()
-s:drop()
-_bucket:drop()
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index 28c3400..1345058 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -245,9 +245,7 @@ local cfg_template = {
max = consts.REBALANCER_MAX_SENDING_MAX
},
collect_bucket_garbage_interval = {
- type = 'positive number', name = 'Garbage bucket collect interval',
- is_optional = true,
- default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
+ name = 'Garbage bucket collect interval', is_deprecated = true,
},
collect_lua_garbage = {
type = 'boolean', name = 'Garbage Lua collect necessity',
diff --git a/vshard/consts.lua b/vshard/consts.lua
index 8c2a8b0..3f1585a 100644
--- a/vshard/consts.lua
+++ b/vshard/consts.lua
@@ -23,6 +23,7 @@ return {
DEFAULT_BUCKET_COUNT = 3000;
BUCKET_SENT_GARBAGE_DELAY = 0.5;
BUCKET_CHUNK_SIZE = 1000;
+ LUA_CHUNK_SIZE = 100000,
DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
REBALANCER_IDLE_INTERVAL = 60 * 60;
REBALANCER_WORK_INTERVAL = 10;
@@ -37,7 +38,7 @@ return {
DEFAULT_FAILOVER_PING_TIMEOUT = 5;
DEFAULT_SYNC_TIMEOUT = 1;
RECONNECT_TIMEOUT = 0.5;
- DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
+ GC_BACKOFF_INTERVAL = 5,
RECOVERY_INTERVAL = 5;
COLLECT_LUA_GARBAGE_INTERVAL = 100;
@@ -45,4 +46,6 @@ return {
DISCOVERY_WORK_INTERVAL = 1,
DISCOVERY_WORK_STEP = 0.01,
DISCOVERY_TIMEOUT = 10,
+
+ TIMEOUT_INFINITY = 500 * 365 * 86400,
}
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 298df71..31a6fc7 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -69,7 +69,6 @@ if not M then
total_bucket_count = 0,
errinj = {
ERRINJ_CFG = false,
- ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
ERRINJ_RELOAD = false,
ERRINJ_CFG_DELAY = false,
ERRINJ_LONG_RECEIVE = false,
@@ -96,6 +95,8 @@ if not M then
-- detect that _bucket was not changed between yields.
--
bucket_generation = 0,
+ -- Condition variable fired on generation update.
+ bucket_generation_cond = lfiber.cond(),
--
-- Reference to the function used as on_replace trigger on
-- _bucket space. It is used to replace the trigger with
@@ -107,12 +108,14 @@ if not M then
-- replace the old function is to keep its reference.
--
bucket_on_replace = nil,
+ -- Redirects for recently sent buckets. They are kept for a while to
+ -- help routers find the new location of sent and deleted buckets
+ -- without a whole cluster scan.
+ route_map = {},
------------------- Garbage collection -------------------
-- Fiber to remove garbage buckets data.
collect_bucket_garbage_fiber = nil,
- -- Do buckets garbage collection once per this time.
- collect_bucket_garbage_interval = nil,
-- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
@@ -173,6 +176,7 @@ end
--
local function bucket_generation_increment()
M.bucket_generation = M.bucket_generation + 1
+ M.bucket_generation_cond:broadcast()
end
--
@@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
else
return bucket
end
+ local dst = bucket and bucket.destination or M.route_map[bucket_id]
return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
- bucket and bucket.destination)
+ dst)
end
--
@@ -804,11 +809,23 @@ end
--
local function bucket_unrefro(bucket_id)
local ref = M.bucket_refs[bucket_id]
- if not ref or ref.ro == 0 then
+ local count = ref and ref.ro or 0
+ if count == 0 then
return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
"no refs", nil)
end
- ref.ro = ref.ro - 1
+ if count == 1 then
+ ref.ro = 0
+ if ref.ro_lock then
+ -- Garbage collector is waiting for the bucket if RO
+ -- is locked. Let it know it has one more bucket to
+ -- collect. It relies on generation, so its increment
+ -- is enough.
+ bucket_generation_increment()
+ end
+ return true
+ end
+ ref.ro = count - 1
return true
end
@@ -1479,79 +1496,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
end
--
--- Remove tuples from buckets of a specified type.
--- @param type Type of buckets to gc.
--- @retval List of ids of empty buckets of the type.
+-- Drop buckets with the given status along with their data in all spaces.
+-- @param status Status of target buckets.
+-- @param route_map Destinations of deleted buckets are saved into this table.
--
-local function gc_bucket_step_by_type(type)
- local sharded_spaces = find_sharded_spaces()
- local empty_buckets = {}
+local function gc_bucket_drop_xc(status, route_map)
local limit = consts.BUCKET_CHUNK_SIZE
- local is_all_collected = true
- for _, bucket in box.space._bucket.index.status:pairs(type) do
- local bucket_id = bucket.id
- local ref = M.bucket_refs[bucket_id]
+ local _bucket = box.space._bucket
+ local sharded_spaces = find_sharded_spaces()
+ for _, b in _bucket.index.status:pairs(status) do
+ local id = b.id
+ local ref = M.bucket_refs[id]
if ref then
assert(ref.rw == 0)
if ref.ro ~= 0 then
ref.ro_lock = true
- is_all_collected = false
goto continue
end
- M.bucket_refs[bucket_id] = nil
+ M.bucket_refs[id] = nil
end
for _, space in pairs(sharded_spaces) do
- gc_bucket_in_space_xc(space, bucket_id, type)
+ gc_bucket_in_space_xc(space, id, status)
limit = limit - 1
if limit == 0 then
lfiber.sleep(0)
limit = consts.BUCKET_CHUNK_SIZE
end
end
- table.insert(empty_buckets, bucket.id)
-::continue::
+ route_map[id] = b.destination
+ _bucket:delete{id}
+ ::continue::
end
- return empty_buckets, is_all_collected
-end
-
---
--- Drop buckets with ids in the list.
--- @param bucket_ids Bucket ids to drop.
--- @param status Expected bucket status.
---
-local function gc_bucket_drop_xc(bucket_ids, status)
- if #bucket_ids == 0 then
- return
- end
- local limit = consts.BUCKET_CHUNK_SIZE
- box.begin()
- local _bucket = box.space._bucket
- for _, id in pairs(bucket_ids) do
- local bucket_exists = _bucket:get{id} ~= nil
- local b = _bucket:get{id}
- if b then
- if b.status ~= status then
- return error(string.format('Bucket %d status is changed. Was '..
- '%s, became %s', id, status,
- b.status))
- end
- _bucket:delete{id}
- end
- limit = limit - 1
- if limit == 0 then
- box.commit()
- box.begin()
- limit = consts.BUCKET_CHUNK_SIZE
- end
- end
- box.commit()
end
--
-- Exception safe version of gc_bucket_drop_xc.
--
-local function gc_bucket_drop(bucket_ids, status)
- local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
+local function gc_bucket_drop(status, route_map)
+ local status, err = pcall(gc_bucket_drop_xc, status, route_map)
if not status then
box.rollback()
end
@@ -1578,65 +1560,75 @@ function gc_bucket_f()
-- generation == bucket generation. In such a case the fiber
-- does nothing until next _bucket change.
local bucket_generation_collected = -1
- -- Empty sent buckets are collected into an array. After a
- -- specified time interval the buckets are deleted both from
- -- this array and from _bucket space.
- local buckets_for_redirect = {}
- local buckets_for_redirect_ts = clock()
- -- Empty sent buckets, updated after each step, and when
- -- buckets_for_redirect is deleted, it gets empty_sent_buckets
- -- for next deletion.
- local empty_garbage_buckets, empty_sent_buckets, status, err
+ local bucket_generation_current = M.bucket_generation
+ -- Deleted buckets are saved into a route map to redirect routers if they
+ -- didn't discover new location of the buckets yet. However route map does
+ -- not grow infinitely. Otherwise it would end up storing redirects for all
+ -- buckets in the cluster. Which could also be outdated.
+ -- Garbage collector periodically drops old routes from the map. For that it
+ -- remembers state of route map in one moment, and after a while clears the
+ -- remembered routes from the global route map.
+ local route_map = M.route_map
+ local route_map_old = {}
+ local route_map_deadline = 0
+ local status, err
while M.module_version == module_version do
- -- Check if no changes in buckets configuration.
- if bucket_generation_collected ~= M.bucket_generation then
- local bucket_generation = M.bucket_generation
- local is_sent_collected, is_garbage_collected
- status, empty_garbage_buckets, is_garbage_collected =
- pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
- if not status then
- err = empty_garbage_buckets
- goto check_error
- end
- status, empty_sent_buckets, is_sent_collected =
- pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
- if not status then
- err = empty_sent_buckets
- goto check_error
+ if bucket_generation_collected ~= bucket_generation_current then
+ status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
+ if status then
+ status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
end
- status, err = gc_bucket_drop(empty_garbage_buckets,
- consts.BUCKET.GARBAGE)
-::check_error::
if not status then
box.rollback()
log.error('Error during garbage collection step: %s', err)
- goto continue
+ else
+ -- Don't use global generation. During the collection it could
+ -- already change. Instead, remember the generation known before
+ -- the collection has started.
+ -- Since the collection also changes the generation, it makes
+ -- the GC happen always at least twice. But typically on the
+ -- second iteration it should not find any buckets to collect,
+ -- and then the collected generation matches the global one.
+ bucket_generation_collected = bucket_generation_current
end
- if is_sent_collected and is_garbage_collected then
- bucket_generation_collected = bucket_generation
+ else
+ status = true
+ end
+
+ local sleep_time = route_map_deadline - clock()
+ if sleep_time <= 0 then
+ local chunk = consts.LUA_CHUNK_SIZE
+ util.table_minus_yield(route_map, route_map_old, chunk)
+ route_map_old = util.table_copy_yield(route_map, chunk)
+ if next(route_map_old) then
+ sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
+ else
+ sleep_time = consts.TIMEOUT_INFINITY
end
+ route_map_deadline = clock() + sleep_time
end
+ bucket_generation_current = M.bucket_generation
- if clock() - buckets_for_redirect_ts >=
- consts.BUCKET_SENT_GARBAGE_DELAY then
- status, err = gc_bucket_drop(buckets_for_redirect,
- consts.BUCKET.SENT)
- if not status then
- buckets_for_redirect = {}
- empty_sent_buckets = {}
- bucket_generation_collected = -1
- log.error('Error during deletion of empty sent buckets: %s',
- err)
- elseif M.module_version ~= module_version then
- return
+ if bucket_generation_current ~= bucket_generation_collected then
+ -- Generation was changed during collection. Or *by* collection.
+ if status then
+ -- Retry immediately. If the generation was changed by the
+ -- collection itself, it will notice it next iteration, and go
+ -- to proper sleep.
+ sleep_time = 0
else
- buckets_for_redirect = empty_sent_buckets or {}
- empty_sent_buckets = nil
- buckets_for_redirect_ts = clock()
+ -- An error happened during the collection. Does not make sense
+ -- to retry on each iteration of the event loop. The most likely
+ -- errors are either a WAL error or a transaction abort - both
+ -- look like an issue in the user's code and can't be fixed
+ -- quickly anyway. Backoff.
+ sleep_time = consts.GC_BACKOFF_INTERVAL
end
end
-::continue::
- lfiber.sleep(M.collect_bucket_garbage_interval)
+
+ if M.module_version == module_version then
+ M.bucket_generation_cond:wait(sleep_time)
+ end
end
end
@@ -2421,8 +2413,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
vshard_cfg.rebalancer_disbalance_threshold
M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
M.shard_index = vshard_cfg.shard_index
- M.collect_bucket_garbage_interval =
- vshard_cfg.collect_bucket_garbage_interval
M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
M.current_cfg = cfg
@@ -2676,6 +2666,9 @@ else
storage_cfg(M.current_cfg, M.this_replica.uuid, true)
end
M.module_version = M.module_version + 1
+ -- Background fibers could sleep waiting for bucket changes.
+ -- Let them know it is time to reload.
+ bucket_generation_increment()
end
M.recovery_f = recovery_f
@@ -2686,7 +2679,7 @@ M.gc_bucket_f = gc_bucket_f
-- These functions are saved in M not for atomic reload, but for
-- unit testing.
--
-M.gc_bucket_step_by_type = gc_bucket_step_by_type
+M.gc_bucket_drop = gc_bucket_drop
M.rebalancer_build_routes = rebalancer_build_routes
M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
M.cached_find_sharded_spaces = find_sharded_spaces
diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
index f38af74..484f499 100644
--- a/vshard/storage/reload_evolution.lua
+++ b/vshard/storage/reload_evolution.lua
@@ -4,6 +4,7 @@
-- in a commit.
--
local log = require('log')
+local fiber = require('fiber')
--
-- Array of upgrade functions.
@@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
end
end
+migrations[#migrations + 1] = function(M)
+ if not M.route_map then
+ M.bucket_generation_cond = fiber.cond()
+ M.route_map = {}
+ end
+end
+
--
-- Perform an update based on a version stored in `M` (internals).
-- @param M Old module internals which should be updated.
--
2.24.3 (Apple Git-128)
* Re: [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
2021-02-10 22:35 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 9:00 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch.
As I see, you've introduced some new parameters: "LUA_CHUNK_SIZE" and
"GC_BACKOFF_INTERVAL".
I think it's better to describe them in the commit message to make it
clearer how the new algorithm works.
I see that you didn't update the comment above the "gc_bucket_f"
function. Is it still relevant?
In general patch LGTM.
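For other readers of the thread, here is a minimal sketch (not the actual
vshard code, names simplified) of the reactive wait loop as I understand it
from the patch: the GC fiber sleeps on a condition variable, and the
`_bucket` on_replace trigger bumps a generation counter and broadcasts,
which wakes the fiber only when there is real work to do.

```lua
local fiber = require('fiber')

local M = {
    bucket_generation = 0,
    bucket_generation_cond = fiber.cond(),
}

-- Called from the _bucket:on_replace() trigger on any bucket change.
local function bucket_generation_increment()
    M.bucket_generation = M.bucket_generation + 1
    M.bucket_generation_cond:broadcast()
end

-- collect_step is assumed to drop GARBAGE/SENT buckets (one GC pass).
local function gc_loop(collect_step)
    local collected = -1
    while true do
        local current = M.bucket_generation
        if collected ~= current then
            collect_step()
            collected = current
        end
        if M.bucket_generation == collected then
            -- Nothing new appeared during the pass - sleep until
            -- the next _bucket change instead of polling on a timer.
            M.bucket_generation_cond:wait()
        end
    end
end
```

The backoff and route-map trimming from the real `gc_bucket_f` are omitted
here; the point is only the event-driven sleep replacing the old
`collect_bucket_garbage_interval` polling.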
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Garbage collector is a fiber on a master node which deletes
> GARBAGE and SENT buckets along with their data.
>
> It was proactive. It used to wakeup with a constant period to
> find and delete the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> GC algorithm if a bucket is sent, it won't be deleted for the next
> 5 seconds by default. During this time all new map-reduce requests
> can't execute.
>
> This is not acceptable, and neither is too frequent wakeup of the
> GC fiber, because it would waste TX thread time.
>
> The patch makes the GC fiber wake up not on a timeout but on
> events happening with the _bucket space. The GC fiber sleeps on a
> condition variable which is signaled when _bucket is changed.
>
> Once GC sees work to do, it won't sleep until it is done. It will
> only yield.
>
> This makes GC delete SENT and GARBAGE buckets as soon as possible
> reducing the waiting time for the incoming map-reduce requests.
>
> Needed for #147
>
> @TarantoolBot document
> Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
> It was used to specify the interval between bucket garbage
> collection steps. It was needed because garbage collection in
> vshard was proactive. It didn't react to newly appeared garbage
> buckets immediately.
>
> Since 0.1.17 garbage collection is reactive. It starts working on
> garbage buckets immediately as they appear, and sleeps the rest of
> the time. The option is not used anymore and does not affect any
> behaviour.
>
> I suppose it can be deleted from the documentation. Or left with
> a big label 'deprecated' + the explanation above.
>
> An attempt to use the option does not cause an error, but logs a
> warning.
> ---
> test/lua_libs/storage_template.lua | 1 -
> test/misc/reconfigure.result | 10 -
> test/misc/reconfigure.test.lua | 3 -
> test/rebalancer/bucket_ref.result | 12 --
> test/rebalancer/bucket_ref.test.lua | 3 -
> test/rebalancer/errinj.result | 11 --
> test/rebalancer/errinj.test.lua | 5 -
> test/rebalancer/receiving_bucket.result | 8 -
> test/rebalancer/receiving_bucket.test.lua | 1 -
> test/reload_evolution/storage.result | 2 +-
> test/router/reroute_wrong_bucket.result | 8 +-
> test/router/reroute_wrong_bucket.test.lua | 4 +-
> test/storage/recovery.result | 3 +-
> test/storage/storage.result | 10 +-
> test/storage/storage.test.lua | 1 +
> test/unit/config.result | 35 +---
> test/unit/config.test.lua | 16 +-
> test/unit/garbage.result | 106 ++++++----
> test/unit/garbage.test.lua | 47 +++--
> test/unit/garbage_errinj.result | 223 ----------------------
> test/unit/garbage_errinj.test.lua | 73 -------
> vshard/cfg.lua | 4 +-
> vshard/consts.lua | 5 +-
> vshard/storage/init.lua | 207 ++++++++++----------
> vshard/storage/reload_evolution.lua | 8 +
> 25 files changed, 233 insertions(+), 573 deletions(-)
> delete mode 100644 test/unit/garbage_errinj.result
> delete mode 100644 test/unit/garbage_errinj.test.lua
>
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 21409bd..8df89f6 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
> return true
> end
> vshard.storage.recovery_wakeup()
> - vshard.storage.garbage_collector_wakeup()
> end)
> end
> diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
> index 168be5d..3b34841 100644
> --- a/test/misc/reconfigure.result
> +++ b/test/misc/reconfigure.result
> @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> ---
> ...
> -cfg.collect_bucket_garbage_interval = 100
> ----
> -...
> cfg.invalid_option = 'kek'
> ---
> ...
> @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
> ---
> - true
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> ----
> -- true
> -...
> cfg.sync_timeout = nil
> ---
> ...
> @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> ---
> ...
> -cfg.collect_bucket_garbage_interval = nil
> ----
> -...
> cfg.invalid_option = nil
> ---
> ...
> diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
> index e891010..348628c 100644
> --- a/test/misc/reconfigure.test.lua
> +++ b/test/misc/reconfigure.test.lua
> @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
> cfg.sync_timeout = 100
> cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> -cfg.collect_bucket_garbage_interval = 100
> cfg.invalid_option = 'kek'
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> not vshard.storage.internal.collect_lua_garbage
> vshard.storage.internal.sync_timeout
> vshard.storage.internal.rebalancer_max_receiving ~= 1000
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> cfg.sync_timeout = nil
> cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> -cfg.collect_bucket_garbage_interval = nil
> cfg.invalid_option = nil
>
> --
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b8fc7ff..9df7480 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
> - true
> ...
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> vshard.storage.buckets_info(1)
> ---
> - 1:
> @@ -203,7 +200,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> ---
> @@ -235,14 +231,6 @@ finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> ---
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: garbage
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 213ced3..1b032ff 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
> vshard.storage.bucket_ref(1, 'read')
> vshard.storage.bucket_unref(1, 'read')
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> vshard.storage.buckets_info(1)
> _ = test_run:cmd("setopt delimiter ';'")
> while true do
> @@ -64,7 +63,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> _ = test_run:cmd("setopt delimiter ''");
> @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
> vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index e50eb72..0ddb1c9 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -226,17 +226,6 @@ ret2, err2
> - true
> - null
> ...
> -_bucket:get{35}
> ----
> -- [35, 'sent', '<replicaset_2>']
> -...
> -_bucket:get{36}
> ----
> -- [36, 'sent', '<replicaset_2>']
> -...
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> ---
> ...
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 2cc4a69..a60f3d7 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
> while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
> ret1, err1
> ret2, err2
> -_bucket:get{35}
> -_bucket:get{36}
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index 7d3612b..ad93445 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> ---
> - true
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_1>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 24534b3..2cf6382 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -136,7 +136,6 @@ box.space.test3:select{100}
> -- Now the bucket is unreferenced and can be transferred.
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 753687f..9d30a04 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
> ...
> vshard.storage.internal.reload_version
> ---
> -- 2
> +- 3
> ...
> --
> -- gh-237: should be only one trigger. During gh-237 the trigger installation
> diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
> index 049bdef..ac340eb 100644
> --- a/test/router/reroute_wrong_bucket.result
> +++ b/test/router/reroute_wrong_bucket.result
> @@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> @@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
> err
> ---
> - bucket_id: 100
> - reason: write is prohibited
> + reason: Not found
> code: 1
> destination: ac522f65-aa94-4134-9f64-51ee384f1a54
> type: ShardingError
> name: WRONG_BUCKET
> - message: 'Cannot perform action with bucket 100, reason: write is prohibited'
> + message: 'Cannot perform action with bucket 100, reason: Not found'
> ...
> --
> -- Now try again, but update configuration during call(). It must
> diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
> index 9e6e804..207aac3 100644
> --- a/test/router/reroute_wrong_bucket.test.lua
> +++ b/test/router/reroute_wrong_bucket.test.lua
> @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
> test_run:cmd('create server router_1 with script="router/router_1.lua"')
> test_run:cmd('start server router_1')
> test_run:switch('storage_1_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> vshard.storage.rebalancer_disable()
> for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
>
> test_run:switch('storage_2_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> vshard.storage.rebalancer_disable()
> for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index f833fe7..8ccb0b9 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -79,8 +79,7 @@ _bucket = box.space._bucket
> ...
> _bucket:select{}
> ---
> -- - [2, 'garbage', '<replicaset_2>']
> - - [3, 'garbage', '<replicaset_2>']
> +- []
> ...
> _ = test_run:switch('storage_2_a')
> ---
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 424bc4c..0550ad1 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> ---
> - true
> ...
> +wait_bucket_is_collected(1)
> +---
> +...
> _ = test_run:switch("storage_2_a")
> ---
> ...
> @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
> ...
> vshard.storage.buckets_info()
> ---
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> - 2:
> +- 2:
> status: active
> id: 2
> ...
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index d631b51..d8fbd94 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
>
> -- Successful transfer.
> vshard.storage.bucket_send(1, util.replicasets[2])
> +wait_bucket_is_collected(1)
> _ = test_run:switch("storage_2_a")
> vshard.storage.buckets_info()
> _ = test_run:switch("storage_1_a")
> diff --git a/test/unit/config.result b/test/unit/config.result
> index dfd0219..e0b2482 100644
> --- a/test/unit/config.result
> +++ b/test/unit/config.result
> @@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 0
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = -1
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 100.5
> ----
> -...
> -_ = lcfg.check(cfg)
> ----
> -...
> cfg.collect_lua_garbage = 100
> ---
> ...
> @@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> ---
> ...
> -cfg.sharding = nil
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +---
> +...
> +_ = lcfg.check(cfg)
> ---
> ...
> diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
> index ada43db..a1c9f07 100644
> --- a/test/unit/config.test.lua
> +++ b/test/unit/config.test.lua
> @@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 0
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = -1
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 100.5
> -_ = lcfg.check(cfg)
> -
> cfg.collect_lua_garbage = 100
> check(cfg)
> cfg.collect_lua_garbage = true
> @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
> cfg.rebalancer_max_sending = 15
> lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> -cfg.sharding = nil
> +
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +_ = lcfg.check(cfg)
> diff --git a/test/unit/garbage.result b/test/unit/garbage.result
> index 74d9ccf..a530496 100644
> --- a/test/unit/garbage.result
> +++ b/test/unit/garbage.result
> @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
> vshard.storage.internal.shard_index = 'bucket_id'
> ---
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> ----
> -...
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> -- by it, or bucket_id is not unsigned.
> @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> ---
> ...
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> +---
> +...
> _bucket = box.schema.create_space('_bucket', {format = format})
> ---
> ...
> @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ---
> - [3, 'active']
> ...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [6, 'garbage']
> -...
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [200, 'garbage']
> -...
> s = box.schema.create_space('test', {engine = engine})
> ---
> ...
> @@ -213,7 +197,7 @@ s:replace{4, 2}
> ---
> - [4, 2]
> ...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> ---
> ...
> s2 = box.schema.create_space('test2', {engine = engine})
> @@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> ---
> ...
> @@ -267,12 +255,22 @@ fill_spaces_with_garbage()
> ---
> - 1107
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - null
> + - null
> + - destination2
> ...
> #s2:select{}
> ---
> @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ---
> - 7
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - destination1
> ...
> s2:select{}
> ---
> @@ -303,17 +311,22 @@ s:select{}
> - [6, 100]
> ...
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- []
> ...
> #s2:select{}
> ---
> @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> fill_spaces_with_garbage()
> ---
> ...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> ---
> ...
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -360,7 +378,6 @@ _bucket:select{}
> - - [1, 'active']
> - [2, 'receiving']
> - [3, 'active']
> - - [4, 'sent']
> ...
> --
> -- Test deletion of 'sent' buckets after a specified timeout.
> @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
> - [2, 'sent']
> ...
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
> ---
> - [4, 'sent']
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -434,11 +453,14 @@ s:replace{6, 4}
> ---
> - [6, 4]
> ...
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> ---
> +- Error during garbage collection step
> ...
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -454,8 +476,9 @@ _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> ---
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> f:cancel()
> ---
> @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
> index 30079fa..250afb0 100644
> --- a/test/unit/garbage.test.lua
> +++ b/test/unit/garbage.test.lua
> @@ -15,7 +15,6 @@ end;
> test_run:cmd("setopt delimiter ''");
>
> vshard.storage.internal.shard_index = 'bucket_id'
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
>
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> @@ -75,16 +74,13 @@ s:drop()
> format = {}
> format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> _bucket = box.schema.create_space('_bucket', {format = format})
> _ = _bucket:create_index('pk')
> _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> _bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
>
> s = box.schema.create_space('test', {engine = engine})
> pk = s:create_index('pk')
> @@ -94,7 +90,7 @@ s:replace{2, 1}
> s:replace{3, 2}
> s:replace{4, 2}
>
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> s2 = box.schema.create_space('test2', {engine = engine})
> pk2 = s2:create_index('pk')
> sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> @@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> test_run:cmd("setopt delimiter ''");
>
> @@ -121,15 +121,21 @@ fill_spaces_with_garbage()
>
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +route_map
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> s2:select{}
> s:select{}
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> #s2:select{}
> #s:select{}
>
> @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -- Test continuous garbage collection via background fiber.
> --
> fill_spaces_with_garbage()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> s:select{}
> s2:select{}
> -- Check garbage bucket is deleted by background fiber.
> @@ -150,7 +160,7 @@ _bucket:select{}
> --
> _bucket:replace{2, vshard.consts.BUCKET.SENT}
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> _bucket:select{}
> s:select{}
> s2:select{}
> @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
> s:replace{5, 4}
> s:replace{6, 4}
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> s:replace{5, 4}
> s:replace{6, 4}
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> s:select{}
> _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> f:cancel()
>
> @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> #s:select{}
> #s2:select{}
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> _bucket:select{}
> s:select{}
> s2:select{}
> diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
> deleted file mode 100644
> index 92c8039..0000000
> --- a/test/unit/garbage_errinj.result
> +++ /dev/null
> @@ -1,223 +0,0 @@
> -test_run = require('test_run').new()
> ----
> -...
> -vshard = require('vshard')
> ----
> -...
> -fiber = require('fiber')
> ----
> -...
> -engine = test_run:get_cfg('engine')
> ----
> -...
> -vshard.storage.internal.shard_index = 'bucket_id'
> ----
> -...
> -format = {}
> ----
> -...
> -format[1] = {name = 'id', type = 'unsigned'}
> ----
> -...
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> ----
> -...
> -_bucket = box.schema.create_space('_bucket', {format = format})
> ----
> -...
> -_ = _bucket:create_index('pk')
> ----
> -...
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> ----
> -...
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [1, 'active']
> -...
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> ----
> -- [2, 'receiving']
> -...
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [3, 'active']
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -s = box.schema.create_space('test', {engine = engine})
> ----
> -...
> -pk = s:create_index('pk')
> ----
> -...
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s:replace{2, 1}
> ----
> -- [2, 1]
> -...
> -s:replace{3, 2}
> ----
> -- [3, 2]
> -...
> -s:replace{4, 2}
> ----
> -- [4, 2]
> -...
> -s:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s:replace{6, 100}
> ----
> -- [6, 100]
> -...
> -s:replace{7, 4}
> ----
> -- [7, 4]
> -...
> -s:replace{8, 5}
> ----
> -- [8, 5]
> -...
> -s2 = box.schema.create_space('test2', {engine = engine})
> ----
> -...
> -pk2 = s2:create_index('pk')
> ----
> -...
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s2:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s2:replace{3, 3}
> ----
> -- [3, 3]
> -...
> -for i = 7, 1107 do s:replace{i, 200} end
> ----
> -...
> -s2:replace{4, 200}
> ----
> -- [4, 200]
> -...
> -s2:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s2:replace{5, 300}
> ----
> -- [5, 300]
> -...
> -s2:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -s2:replace{7, 5}
> ----
> -- [7, 5]
> -...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> ----
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- - 4
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 5
> -- true
> -...
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> ----
> -...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> ----
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> ----
> -...
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> ----
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [4, 'garbage']
> -...
> -s:replace{5, 4}
> ----
> -- [5, 4]
> -...
> -s:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -#s:select{}
> ----
> -- 2
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> ----
> -...
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> ----
> -...
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> ----
> -- 2
> -...
> -_bucket:select{4}
> ----
> -- - [4, 'garbage']
> -...
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- []
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 4
> - - 5
> -- true
> -...
> -#s:select{}
> ----
> -- 0
> -...
> -_bucket:delete{4}
> ----
> -- [4, 'garbage']
> -...
> -s2:drop()
> ----
> -...
> -s:drop()
> ----
> -...
> -_bucket:drop()
> ----
> -...
> diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
> deleted file mode 100644
> index 31184b9..0000000
> --- a/test/unit/garbage_errinj.test.lua
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -test_run = require('test_run').new()
> -vshard = require('vshard')
> -fiber = require('fiber')
> -
> -engine = test_run:get_cfg('engine')
> -vshard.storage.internal.shard_index = 'bucket_id'
> -
> -format = {}
> -format[1] = {name = 'id', type = 'unsigned'}
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> -_bucket = box.schema.create_space('_bucket', {format = format})
> -_ = _bucket:create_index('pk')
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -
> -s = box.schema.create_space('test', {engine = engine})
> -pk = s:create_index('pk')
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s:replace{1, 1}
> -s:replace{2, 1}
> -s:replace{3, 2}
> -s:replace{4, 2}
> -s:replace{5, 100}
> -s:replace{6, 100}
> -s:replace{7, 4}
> -s:replace{8, 5}
> -
> -s2 = box.schema.create_space('test2', {engine = engine})
> -pk2 = s2:create_index('pk')
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s2:replace{1, 1}
> -s2:replace{3, 3}
> -for i = 7, 1107 do s:replace{i, 200} end
> -s2:replace{4, 200}
> -s2:replace{5, 100}
> -s2:replace{5, 300}
> -s2:replace{6, 4}
> -s2:replace{7, 5}
> -
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> -s:replace{5, 4}
> -s:replace{6, 4}
> -#s:select{}
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> -_bucket:select{4}
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -#s:select{}
> -_bucket:delete{4}
> -
> -s2:drop()
> -s:drop()
> -_bucket:drop()
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index 28c3400..1345058 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -245,9 +245,7 @@ local cfg_template = {
> max = consts.REBALANCER_MAX_SENDING_MAX
> },
> collect_bucket_garbage_interval = {
> - type = 'positive number', name = 'Garbage bucket collect interval',
> - is_optional = true,
> - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> + name = 'Garbage bucket collect interval', is_deprecated = true,
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 8c2a8b0..3f1585a 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -23,6 +23,7 @@ return {
> DEFAULT_BUCKET_COUNT = 3000;
> BUCKET_SENT_GARBAGE_DELAY = 0.5;
> BUCKET_CHUNK_SIZE = 1000;
> + LUA_CHUNK_SIZE = 100000,
> DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
> REBALANCER_IDLE_INTERVAL = 60 * 60;
> REBALANCER_WORK_INTERVAL = 10;
> @@ -37,7 +38,7 @@ return {
> DEFAULT_FAILOVER_PING_TIMEOUT = 5;
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
> + GC_BACKOFF_INTERVAL = 5,
> RECOVERY_INTERVAL = 5;
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> @@ -45,4 +46,6 @@ return {
> DISCOVERY_WORK_INTERVAL = 1,
> DISCOVERY_WORK_STEP = 0.01,
> DISCOVERY_TIMEOUT = 10,
> +
> + TIMEOUT_INFINITY = 500 * 365 * 86400,
> }
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 298df71..31a6fc7 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -69,7 +69,6 @@ if not M then
> total_bucket_count = 0,
> errinj = {
> ERRINJ_CFG = false,
> - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
> ERRINJ_RELOAD = false,
> ERRINJ_CFG_DELAY = false,
> ERRINJ_LONG_RECEIVE = false,
> @@ -96,6 +95,8 @@ if not M then
> -- detect that _bucket was not changed between yields.
> --
> bucket_generation = 0,
> + -- Condition variable fired on generation update.
> + bucket_generation_cond = lfiber.cond(),
> --
> -- Reference to the function used as on_replace trigger on
> -- _bucket space. It is used to replace the trigger with
> @@ -107,12 +108,14 @@ if not M then
> -- replace the old function is to keep its reference.
> --
> bucket_on_replace = nil,
> + -- Redirects for recently sent buckets. They are kept for a while to
> + -- help routers find a new location for sent and deleted buckets
> + -- without a whole cluster scan.
> + route_map = {},
>
> ------------------- Garbage collection -------------------
> -- Fiber to remove garbage buckets data.
> collect_bucket_garbage_fiber = nil,
> - -- Do buckets garbage collection once per this time.
> - collect_bucket_garbage_interval = nil,
> -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> @@ -173,6 +176,7 @@ end
> --
> local function bucket_generation_increment()
> M.bucket_generation = M.bucket_generation + 1
> + M.bucket_generation_cond:broadcast()
> end
>
> --
> @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
> else
> return bucket
> end
> + local dst = bucket and bucket.destination or M.route_map[bucket_id]
> return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
> - bucket and bucket.destination)
> + dst)
> end
>
> --
> @@ -804,11 +809,23 @@ end
> --
> local function bucket_unrefro(bucket_id)
> local ref = M.bucket_refs[bucket_id]
> - if not ref or ref.ro == 0 then
> + local count = ref and ref.ro or 0
> + if count == 0 then
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
> "no refs", nil)
> end
> - ref.ro = ref.ro - 1
> + if count == 1 then
> + ref.ro = 0
> + if ref.ro_lock then
> + -- Garbage collector is waiting for the bucket if RO
> + -- is locked. Let it know it has one more bucket to
> + -- collect. It relies on the generation, so its increment
> + -- is enough.
> + bucket_generation_increment()
> + end
> + return true
> + end
> + ref.ro = count - 1
> return true
> end
>
> @@ -1479,79 +1496,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
> end
>
> --
> --- Remove tuples from buckets of a specified type.
> --- @param type Type of buckets to gc.
> --- @retval List of ids of empty buckets of the type.
> +-- Drop buckets with the given status along with their data in all spaces.
> +-- @param status Status of target buckets.
> +-- @param route_map Destinations of deleted buckets are saved into this table.
> --
> -local function gc_bucket_step_by_type(type)
> - local sharded_spaces = find_sharded_spaces()
> - local empty_buckets = {}
> +local function gc_bucket_drop_xc(status, route_map)
> local limit = consts.BUCKET_CHUNK_SIZE
> - local is_all_collected = true
> - for _, bucket in box.space._bucket.index.status:pairs(type) do
> - local bucket_id = bucket.id
> - local ref = M.bucket_refs[bucket_id]
> + local _bucket = box.space._bucket
> + local sharded_spaces = find_sharded_spaces()
> + for _, b in _bucket.index.status:pairs(status) do
> + local id = b.id
> + local ref = M.bucket_refs[id]
> if ref then
> assert(ref.rw == 0)
> if ref.ro ~= 0 then
> ref.ro_lock = true
> - is_all_collected = false
> goto continue
> end
> - M.bucket_refs[bucket_id] = nil
> + M.bucket_refs[id] = nil
> end
> for _, space in pairs(sharded_spaces) do
> - gc_bucket_in_space_xc(space, bucket_id, type)
> + gc_bucket_in_space_xc(space, id, status)
> limit = limit - 1
> if limit == 0 then
> lfiber.sleep(0)
> limit = consts.BUCKET_CHUNK_SIZE
> end
> end
> - table.insert(empty_buckets, bucket.id)
> -::continue::
> + route_map[id] = b.destination
> + _bucket:delete{id}
> + ::continue::
> end
> - return empty_buckets, is_all_collected
> -end
> -
> ---
> --- Drop buckets with ids in the list.
> --- @param bucket_ids Bucket ids to drop.
> --- @param status Expected bucket status.
> ---
> -local function gc_bucket_drop_xc(bucket_ids, status)
> - if #bucket_ids == 0 then
> - return
> - end
> - local limit = consts.BUCKET_CHUNK_SIZE
> - box.begin()
> - local _bucket = box.space._bucket
> - for _, id in pairs(bucket_ids) do
> - local bucket_exists = _bucket:get{id} ~= nil
> - local b = _bucket:get{id}
> - if b then
> - if b.status ~= status then
> - return error(string.format('Bucket %d status is changed. Was '..
> - '%s, became %s', id, status,
> - b.status))
> - end
> - _bucket:delete{id}
> - end
> - limit = limit - 1
> - if limit == 0 then
> - box.commit()
> - box.begin()
> - limit = consts.BUCKET_CHUNK_SIZE
> - end
> - end
> - box.commit()
> end
>
> --
> -- Exception safe version of gc_bucket_drop_xc.
> --
> -local function gc_bucket_drop(bucket_ids, status)
> - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
> +local function gc_bucket_drop(status, route_map)
> + local status, err = pcall(gc_bucket_drop_xc, status, route_map)
> if not status then
> box.rollback()
> end
> @@ -1578,65 +1560,75 @@ function gc_bucket_f()
> -- generation == bucket generation. In such a case the fiber
> -- does nothing until next _bucket change.
> local bucket_generation_collected = -1
> - -- Empty sent buckets are collected into an array. After a
> - -- specified time interval the buckets are deleted both from
> - -- this array and from _bucket space.
> - local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = clock()
> - -- Empty sent buckets, updated after each step, and when
> - -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> - -- for next deletion.
> - local empty_garbage_buckets, empty_sent_buckets, status, err
> + local bucket_generation_current = M.bucket_generation
> + -- Deleted buckets are saved into a route map to redirect routers if they
> + -- didn't discover the new location of the buckets yet. However, the route
> + -- map does not grow infinitely. Otherwise it would end up storing redirects
> + -- for all buckets in the cluster, which could also be outdated.
> + -- Garbage collector periodically drops old routes from the map. For that it
> + -- remembers state of route map in one moment, and after a while clears the
> + -- remembered routes from the global route map.
> + local route_map = M.route_map
> + local route_map_old = {}
> + local route_map_deadline = 0
> + local status, err
> while M.module_version == module_version do
> - -- Check if no changes in buckets configuration.
> - if bucket_generation_collected ~= M.bucket_generation then
> - local bucket_generation = M.bucket_generation
> - local is_sent_collected, is_garbage_collected
> - status, empty_garbage_buckets, is_garbage_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
> - if not status then
> - err = empty_garbage_buckets
> - goto check_error
> - end
> - status, empty_sent_buckets, is_sent_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
> - if not status then
> - err = empty_sent_buckets
> - goto check_error
> + if bucket_generation_collected ~= bucket_generation_current then
> + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
> + if status then
> + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
> end
> - status, err = gc_bucket_drop(empty_garbage_buckets,
> - consts.BUCKET.GARBAGE)
> -::check_error::
> if not status then
> box.rollback()
> log.error('Error during garbage collection step: %s', err)
> - goto continue
> + else
> + -- Don't use global generation. During the collection it could
> + -- already change. Instead, remember the generation known before
> + -- the collection has started.
> + -- Since the collection also changes the generation, it makes
> + -- the GC happen always at least twice. But typically on the
> + -- second iteration it should not find any buckets to collect,
> + -- and then the collected generation matches the global one.
> + bucket_generation_collected = bucket_generation_current
> end
> - if is_sent_collected and is_garbage_collected then
> - bucket_generation_collected = bucket_generation
> + else
> + status = true
> + end
> +
> + local sleep_time = route_map_deadline - clock()
> + if sleep_time <= 0 then
> + local chunk = consts.LUA_CHUNK_SIZE
> + util.table_minus_yield(route_map, route_map_old, chunk)
> + route_map_old = util.table_copy_yield(route_map, chunk)
> + if next(route_map_old) then
> + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> end
> + route_map_deadline = clock() + sleep_time
> end
> + bucket_generation_current = M.bucket_generation
>
> - if clock() - buckets_for_redirect_ts >=
> - consts.BUCKET_SENT_GARBAGE_DELAY then
> - status, err = gc_bucket_drop(buckets_for_redirect,
> - consts.BUCKET.SENT)
> - if not status then
> - buckets_for_redirect = {}
> - empty_sent_buckets = {}
> - bucket_generation_collected = -1
> - log.error('Error during deletion of empty sent buckets: %s',
> - err)
> - elseif M.module_version ~= module_version then
> - return
> + if bucket_generation_current ~= bucket_generation_collected then
> + -- Generation was changed during collection. Or *by* collection.
> + if status then
> + -- Retry immediately. If the generation was changed by the
> + -- collection itself, it will notice it next iteration, and go
> + -- to proper sleep.
> + sleep_time = 0
> else
> - buckets_for_redirect = empty_sent_buckets or {}
> - empty_sent_buckets = nil
> - buckets_for_redirect_ts = clock()
> + -- An error happened during the collection. It makes no sense
> + -- to retry on each iteration of the event loop. The most likely
> + -- errors are either a WAL error or a transaction abort - both
> + -- look like an issue in the user's code and can't be fixed
> + -- quickly anyway. Backoff.
> + sleep_time = consts.GC_BACKOFF_INTERVAL
> end
> end
> -::continue::
> - lfiber.sleep(M.collect_bucket_garbage_interval)
> +
> + if M.module_version == module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> + end
> end
> end
>
> @@ -2421,8 +2413,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> vshard_cfg.rebalancer_disbalance_threshold
> M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
> M.shard_index = vshard_cfg.shard_index
> - M.collect_bucket_garbage_interval =
> - vshard_cfg.collect_bucket_garbage_interval
> M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
> M.current_cfg = cfg
> @@ -2676,6 +2666,9 @@ else
> storage_cfg(M.current_cfg, M.this_replica.uuid, true)
> end
> M.module_version = M.module_version + 1
> + -- Background fibers could sleep waiting for bucket changes.
> + -- Let them know it is time to reload.
> + bucket_generation_increment()
> end
>
> M.recovery_f = recovery_f
> @@ -2686,7 +2679,7 @@ M.gc_bucket_f = gc_bucket_f
> -- These functions are saved in M not for atomic reload, but for
> -- unit testing.
> --
> -M.gc_bucket_step_by_type = gc_bucket_step_by_type
> +M.gc_bucket_drop = gc_bucket_drop
> M.rebalancer_build_routes = rebalancer_build_routes
> M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
> M.cached_find_sharded_spaces = find_sharded_spaces
> diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
> index f38af74..484f499 100644
> --- a/vshard/storage/reload_evolution.lua
> +++ b/vshard/storage/reload_evolution.lua
> @@ -4,6 +4,7 @@
> -- in a commit.
> --
> local log = require('log')
> +local fiber = require('fiber')
>
> --
> -- Array of upgrade functions.
> @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
> end
> end
>
> +migrations[#migrations + 1] = function(M)
> + if not M.route_map then
> + M.bucket_generation_cond = fiber.cond()
> + M.route_map = {}
> + end
> +end
> +
> --
> -- Perform an update based on a version stored in `M` (internals).
> -- @param M Old module internals which should be updated.
* Re: [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector
2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
@ 2021-02-10 22:35 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:35 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Thanks for the review!
On 10.02.2021 10:00, Oleg Babin wrote:
> Thanks for your patch.
>
> As I see you've introduced some new parameters: "LUA_CHUNK_SIZE" and "GC_BACKOFF_INTERVAL".
I decided not to go into too much detail and not to describe private
constants in the commit message. GC_BACKOFF_INTERVAL is explained
in the place where it is used. LUA_CHUNK_SIZE is quite obvious if
you look at its usage.
> I think it's better to describe them in the commit message to understand more clearly how the new algorithm works.
These constants are not really relevant to the algorithm's core
idea. It does not matter much for the reactive GC concept whether I
yield in the table utility functions, or whether I have a backoff
timeout. These could be considered 'optimizations' or 'amendments'.
I would consider them small details not worth mentioning in the
commit message.
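For context, the table utilities mentioned here could look roughly like the sketch below. This is an assumption about their behaviour (that `util.table_copy_yield` and `util.table_minus_yield` process `chunk` keys between yields); the real vshard.util implementations may differ:

```lua
local fiber = require('fiber')

-- Copy a table, yielding every `chunk` processed keys so the TX
-- thread is not blocked on huge tables.
local function table_copy_yield(src, chunk)
    local res = {}
    local i = 0
    for k, v in pairs(src) do
        res[k] = v
        i = i + 1
        if i % chunk == 0 then
            fiber.yield()
        end
    end
    return res
end

-- Remove from `dst` the keys of `minus` whose values did not change
-- since the copy, yielding every `chunk` processed keys.
local function table_minus_yield(dst, minus, chunk)
    local i = 0
    for k, v in pairs(minus) do
        if dst[k] == v then
            dst[k] = nil
        end
        i = i + 1
        if i % chunk == 0 then
            fiber.yield()
        end
    end
end
```

With LUA_CHUNK_SIZE = 100000 such a copy yields rarely enough to stay cheap, but never monopolizes the event loop on a pathologically large route map.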
> I see that you didn't update the comment above the "gc_bucket_f" function. Is it still relevant?
No, irrelevant, thanks for noticing. Here is the diff:
====================
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 99f92a0..1ea8069 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -1543,14 +1543,16 @@ local function gc_bucket_drop(status, route_map)
end
--
--- Garbage collector. Works on masters. The garbage collector
--- wakes up once per specified time.
+-- Garbage collector. Works on masters. The garbage collector wakes up when
+-- state of any bucket changes.
-- After wakeup it follows the plan:
--- 1) Check if _bucket has changed. If not, then sleep again;
--- 2) Scan user spaces for sent and garbage buckets, delete
--- garbage data in batches of limited size;
--- 3) Delete GARBAGE buckets from _bucket immediately, and
--- schedule SENT buckets for deletion after a timeout;
+-- 1) Check if state of any bucket has really changed. If not, then sleep again;
+-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
+-- limited size.
+-- 3) Bucket destinations are saved into a global route_map to reroute incoming
+-- requests from routers in case they didn't notice the buckets being moved.
+-- The saved routes are scheduled for deletion after a timeout, which is
+-- checked on each iteration of this loop.
-- 4) Sleep, go to (1).
-- For each step details see comments in the code.
--
====================
The full new patch below.
====================
gc: introduce reactive garbage collector
The garbage collector is a fiber on a master node which deletes
GARBAGE and SENT buckets along with their data.
It was proactive: it used to wake up with a constant period to
find and delete the needed buckets.
But this won't work with the future feature called 'map-reduce'.
Map-reduce as a preparation stage will need to ensure that all
buckets on a storage are readable and writable. With the current
GC algorithm if a bucket is sent, it won't be deleted for the next
5 seconds by default. During this time all new map-reduce requests
can't execute.
This is not acceptable, and neither is a too frequent wakeup of
the GC fiber, because it would waste TX thread time.
The patch makes the GC fiber wake up not on a timeout but on
events happening with the _bucket space. The GC fiber sleeps on a
condition variable which is signaled when _bucket is changed.
Once GC sees work to do, it won't sleep until it is done. It will
only yield.
This makes GC delete SENT and GARBAGE buckets as soon as possible
reducing the waiting time for the incoming map-reduce requests.
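The event-driven scheme described above boils down to a condition variable shared between the _bucket trigger and the GC fiber. A minimal sketch, simplified from the actual patch (`gc_step()` and `sleep_time` are placeholders, not real vshard names):

```lua
local fiber = require('fiber')

local M = {
    bucket_generation = 0,
    bucket_generation_cond = fiber.cond(),
}

-- Called from the _bucket on_replace trigger.
local function bucket_generation_increment()
    M.bucket_generation = M.bucket_generation + 1
    M.bucket_generation_cond:broadcast()
end

-- GC fiber body. Collects while there is work, then sleeps on the
-- condition variable until the next _bucket change or a timeout
-- (used for route map expiration and error backoff).
local function gc_f()
    local collected_generation = -1
    while true do
        if collected_generation ~= M.bucket_generation then
            local gen = M.bucket_generation
            gc_step() -- Placeholder: drop GARBAGE and SENT buckets.
            collected_generation = gen
        end
        local sleep_time = 10 -- Placeholder timeout.
        M.bucket_generation_cond:wait(sleep_time)
    end
end
```

Since the collection itself changes _bucket, the fiber compares a remembered generation with the current one to decide whether the wakeup brought new work or was caused by its own deletions.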
Needed for #147
@TarantoolBot document
Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
It was used to specify the interval between bucket garbage
collection steps. It was needed because garbage collection in
vshard was proactive. It didn't react to newly appeared garbage
buckets immediately.
Since 0.1.17 garbage collection is reactive. It starts working
on garbage buckets immediately as they appear, and sleeps the
rest of the time. The option is not used anymore and does not
affect any behaviour.
I suppose it can be deleted from the documentation, or left with
a big 'deprecated' label plus the explanation above.
An attempt to use the option does not cause an error, but logs a
warning.
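The deprecation is backward compatible: a config still containing the option passes validation, as the config unit test in the patch checks. For instance (assuming `cfg` is an otherwise valid vshard configuration table):

```lua
local lcfg = require('vshard.cfg')

cfg.collect_bucket_garbage_interval = 100
-- Does not raise an error; the option is ignored and only a
-- deprecation warning is logged.
lcfg.check(cfg)
```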
diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
index 21409bd..8df89f6 100644
--- a/test/lua_libs/storage_template.lua
+++ b/test/lua_libs/storage_template.lua
@@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
return true
end
vshard.storage.recovery_wakeup()
- vshard.storage.garbage_collector_wakeup()
end)
end
diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
index 168be5d..3b34841 100644
--- a/test/misc/reconfigure.result
+++ b/test/misc/reconfigure.result
@@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
cfg.rebalancer_max_receiving = 1000
---
...
-cfg.collect_bucket_garbage_interval = 100
----
-...
cfg.invalid_option = 'kek'
---
...
@@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
---
- true
...
-vshard.storage.internal.collect_bucket_garbage_interval ~= 100
----
-- true
-...
cfg.sync_timeout = nil
---
...
@@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
cfg.rebalancer_max_receiving = nil
---
...
-cfg.collect_bucket_garbage_interval = nil
----
-...
cfg.invalid_option = nil
---
...
diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
index e891010..348628c 100644
--- a/test/misc/reconfigure.test.lua
+++ b/test/misc/reconfigure.test.lua
@@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
cfg.sync_timeout = 100
cfg.collect_lua_garbage = true
cfg.rebalancer_max_receiving = 1000
-cfg.collect_bucket_garbage_interval = 100
cfg.invalid_option = 'kek'
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
not vshard.storage.internal.collect_lua_garbage
vshard.storage.internal.sync_timeout
vshard.storage.internal.rebalancer_max_receiving ~= 1000
-vshard.storage.internal.collect_bucket_garbage_interval ~= 100
cfg.sync_timeout = nil
cfg.collect_lua_garbage = nil
cfg.rebalancer_max_receiving = nil
-cfg.collect_bucket_garbage_interval = nil
cfg.invalid_option = nil
--
diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
index b8fc7ff..9df7480 100644
--- a/test/rebalancer/bucket_ref.result
+++ b/test/rebalancer/bucket_ref.result
@@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
- true
...
-- Force GC to take an RO lock on the bucket now.
-vshard.storage.garbage_collector_wakeup()
----
-...
vshard.storage.buckets_info(1)
---
- 1:
@@ -203,7 +200,6 @@ while true do
if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
break
end
- vshard.storage.garbage_collector_wakeup()
fiber.sleep(0.01)
end;
---
@@ -235,14 +231,6 @@ finish_refs = true
while f1:status() ~= 'dead' do fiber.sleep(0.01) end
---
...
-vshard.storage.buckets_info(1)
----
-- 1:
- status: garbage
- ro_lock: true
- destination: <replicaset_2>
- id: 1
-...
wait_bucket_is_collected(1)
---
...
diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
index 213ced3..1b032ff 100644
--- a/test/rebalancer/bucket_ref.test.lua
+++ b/test/rebalancer/bucket_ref.test.lua
@@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
vshard.storage.bucket_ref(1, 'read')
vshard.storage.bucket_unref(1, 'read')
-- Force GC to take an RO lock on the bucket now.
-vshard.storage.garbage_collector_wakeup()
vshard.storage.buckets_info(1)
_ = test_run:cmd("setopt delimiter ';'")
while true do
@@ -64,7 +63,6 @@ while true do
if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
break
end
- vshard.storage.garbage_collector_wakeup()
fiber.sleep(0.01)
end;
_ = test_run:cmd("setopt delimiter ''");
@@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
vshard.storage.bucket_refro(1)
finish_refs = true
while f1:status() ~= 'dead' do fiber.sleep(0.01) end
-vshard.storage.buckets_info(1)
wait_bucket_is_collected(1)
_ = test_run:switch('box_2_a')
vshard.storage.buckets_info(1)
diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
index e50eb72..0ddb1c9 100644
--- a/test/rebalancer/errinj.result
+++ b/test/rebalancer/errinj.result
@@ -226,17 +226,6 @@ ret2, err2
- true
- null
...
-_bucket:get{35}
----
-- [35, 'sent', '<replicaset_2>']
-...
-_bucket:get{36}
----
-- [36, 'sent', '<replicaset_2>']
-...
--- Buckets became 'active' on box_2_a, but still are sending on
--- box_1_a. Wait until it is marked as garbage on box_1_a by the
--- recovery fiber.
wait_bucket_is_collected(35)
---
...
diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
index 2cc4a69..a60f3d7 100644
--- a/test/rebalancer/errinj.test.lua
+++ b/test/rebalancer/errinj.test.lua
@@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
ret1, err1
ret2, err2
-_bucket:get{35}
-_bucket:get{36}
--- Buckets became 'active' on box_2_a, but still are sending on
--- box_1_a. Wait until it is marked as garbage on box_1_a by the
--- recovery fiber.
wait_bucket_is_collected(35)
wait_bucket_is_collected(36)
_ = test_run:switch('box_2_a')
diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
index 7d3612b..ad93445 100644
--- a/test/rebalancer/receiving_bucket.result
+++ b/test/rebalancer/receiving_bucket.result
@@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
---
- true
...
-vshard.storage.buckets_info(1)
----
-- 1:
- status: sent
- ro_lock: true
- destination: <replicaset_1>
- id: 1
-...
wait_bucket_is_collected(1)
---
...
diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
index 24534b3..2cf6382 100644
--- a/test/rebalancer/receiving_bucket.test.lua
+++ b/test/rebalancer/receiving_bucket.test.lua
@@ -136,7 +136,6 @@ box.space.test3:select{100}
-- Now the bucket is unreferenced and can be transferred.
_ = test_run:switch('box_2_a')
vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
-vshard.storage.buckets_info(1)
wait_bucket_is_collected(1)
vshard.storage.buckets_info(1)
_ = test_run:switch('box_1_a')
diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
index 753687f..9d30a04 100644
--- a/test/reload_evolution/storage.result
+++ b/test/reload_evolution/storage.result
@@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
...
vshard.storage.internal.reload_version
---
-- 2
+- 3
...
--
-- gh-237: should be only one trigger. During gh-237 the trigger installation
diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
index 049bdef..ac340eb 100644
--- a/test/router/reroute_wrong_bucket.result
+++ b/test/router/reroute_wrong_bucket.result
@@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
---
- true
...
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
---
...
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
@@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
---
- true
...
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
---
...
vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
@@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
err
---
- bucket_id: 100
- reason: write is prohibited
+ reason: Not found
code: 1
destination: ac522f65-aa94-4134-9f64-51ee384f1a54
type: ShardingError
name: WRONG_BUCKET
- message: 'Cannot perform action with bucket 100, reason: write is prohibited'
+ message: 'Cannot perform action with bucket 100, reason: Not found'
...
--
-- Now try again, but update configuration during call(). It must
diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
index 9e6e804..207aac3 100644
--- a/test/router/reroute_wrong_bucket.test.lua
+++ b/test/router/reroute_wrong_bucket.test.lua
@@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
test_run:cmd('create server router_1 with script="router/router_1.lua"')
test_run:cmd('start server router_1')
test_run:switch('storage_1_a')
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
vshard.storage.rebalancer_disable()
for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
test_run:switch('storage_2_a')
-cfg.collect_bucket_garbage_interval = 100
+vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
vshard.storage.rebalancer_disable()
for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
diff --git a/test/storage/recovery.result b/test/storage/recovery.result
index f833fe7..8ccb0b9 100644
--- a/test/storage/recovery.result
+++ b/test/storage/recovery.result
@@ -79,8 +79,7 @@ _bucket = box.space._bucket
...
_bucket:select{}
---
-- - [2, 'garbage', '<replicaset_2>']
- - [3, 'garbage', '<replicaset_2>']
+- []
...
_ = test_run:switch('storage_2_a')
---
diff --git a/test/storage/storage.result b/test/storage/storage.result
index 424bc4c..0550ad1 100644
--- a/test/storage/storage.result
+++ b/test/storage/storage.result
@@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
---
- true
...
+wait_bucket_is_collected(1)
+---
+...
_ = test_run:switch("storage_2_a")
---
...
@@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
...
vshard.storage.buckets_info()
---
-- 1:
- status: sent
- ro_lock: true
- destination: <replicaset_2>
- id: 1
- 2:
+- 2:
status: active
id: 2
...
diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
index d631b51..d8fbd94 100644
--- a/test/storage/storage.test.lua
+++ b/test/storage/storage.test.lua
@@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
-- Successful transfer.
vshard.storage.bucket_send(1, util.replicasets[2])
+wait_bucket_is_collected(1)
_ = test_run:switch("storage_2_a")
vshard.storage.buckets_info()
_ = test_run:switch("storage_1_a")
diff --git a/test/unit/config.result b/test/unit/config.result
index dfd0219..e0b2482 100644
--- a/test/unit/config.result
+++ b/test/unit/config.result
@@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
--
-- gh-77: garbage collection options.
--
-cfg.collect_bucket_garbage_interval = 'str'
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = 0
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = -1
----
-...
-check(cfg)
----
-- Garbage bucket collect interval must be positive number
-...
-cfg.collect_bucket_garbage_interval = 100.5
----
-...
-_ = lcfg.check(cfg)
----
-...
cfg.collect_lua_garbage = 100
---
...
@@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
cfg.rebalancer_max_sending = nil
---
...
-cfg.sharding = nil
+--
+-- Deprecated option does not break anything.
+--
+cfg.collect_bucket_garbage_interval = 100
+---
+...
+_ = lcfg.check(cfg)
---
...
diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
index ada43db..a1c9f07 100644
--- a/test/unit/config.test.lua
+++ b/test/unit/config.test.lua
@@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
--
-- gh-77: garbage collection options.
--
-cfg.collect_bucket_garbage_interval = 'str'
-check(cfg)
-cfg.collect_bucket_garbage_interval = 0
-check(cfg)
-cfg.collect_bucket_garbage_interval = -1
-check(cfg)
-cfg.collect_bucket_garbage_interval = 100.5
-_ = lcfg.check(cfg)
-
cfg.collect_lua_garbage = 100
check(cfg)
cfg.collect_lua_garbage = true
@@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
cfg.rebalancer_max_sending = 15
lcfg.check(cfg).rebalancer_max_sending
cfg.rebalancer_max_sending = nil
-cfg.sharding = nil
+
+--
+-- Deprecated option does not break anything.
+--
+cfg.collect_bucket_garbage_interval = 100
+_ = lcfg.check(cfg)
diff --git a/test/unit/garbage.result b/test/unit/garbage.result
index 74d9ccf..a530496 100644
--- a/test/unit/garbage.result
+++ b/test/unit/garbage.result
@@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
vshard.storage.internal.shard_index = 'bucket_id'
---
...
-vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
----
-...
--
-- Find nothing if no bucket_id anywhere, or there is no index
-- by it, or bucket_id is not unsigned.
@@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
format[2] = {name = 'status', type = 'string'}
---
...
+format[3] = {name = 'destination', type = 'string', is_nullable = true}
+---
+...
_bucket = box.schema.create_space('_bucket', {format = format})
---
...
@@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
---
- [3, 'active']
...
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
----
-- [4, 'sent']
-...
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
----
-- [5, 'garbage']
-...
-_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
----
-- [6, 'garbage']
-...
-_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
----
-- [200, 'garbage']
-...
s = box.schema.create_space('test', {engine = engine})
---
...
@@ -213,7 +197,7 @@ s:replace{4, 2}
---
- [4, 2]
...
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
+gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
---
...
s2 = box.schema.create_space('test2', {engine = engine})
@@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
s2:replace{6, 4}
s2:replace{7, 5}
s2:replace{7, 6}
+ _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
+ _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
+ _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
+ _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
end;
---
...
@@ -267,12 +255,22 @@ fill_spaces_with_garbage()
---
- 1107
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
---
-- - 5
- - 6
- - 200
- true
+- null
+...
+route_map
+---
+- - null
+ - null
+ - null
+ - null
+ - null
+ - destination2
...
#s2:select{}
---
@@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
---
- 7
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
---
-- - 4
- true
+- null
+...
+route_map
+---
+- - null
+ - null
+ - null
+ - destination1
...
s2:select{}
---
@@ -303,17 +311,22 @@ s:select{}
- [6, 100]
...
-- Nothing deleted - update collected generation.
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+---
+...
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
---
-- - 5
- - 6
- - 200
- true
+- null
...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
---
-- - 4
- true
+- null
+...
+route_map
+---
+- []
...
#s2:select{}
---
@@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
fill_spaces_with_garbage()
---
...
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
+_ = _bucket:on_replace(function() \
+ local gen = vshard.storage.internal.bucket_generation \
+ vshard.storage.internal.bucket_generation = gen + 1 \
+ vshard.storage.internal.bucket_generation_cond:broadcast() \
+end)
---
...
f = fiber.create(vshard.storage.internal.gc_bucket_f)
---
...
-- Wait until garbage collection is finished.
-while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
---
+- true
...
s:select{}
---
@@ -360,7 +378,6 @@ _bucket:select{}
- - [1, 'active']
- [2, 'receiving']
- [3, 'active']
- - [4, 'sent']
...
--
-- Test deletion of 'sent' buckets after a specified timeout.
@@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
- [2, 'sent']
...
-- Wait deletion after a while.
-while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{2} end)
---
+- true
...
_bucket:select{}
---
@@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
---
- [4, 'sent']
...
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
---
+- true
...
--
-- Test WAL errors during deletion from _bucket.
@@ -434,11 +453,14 @@ s:replace{6, 4}
---
- [6, 4]
...
-while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_log('default', 'Error during garbage collection step', \
+ 65536, 10)
---
+- Error during garbage collection step
...
-while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return #sk:select{4} == 0 end)
---
+- true
...
s:select{}
---
@@ -454,8 +476,9 @@ _bucket:select{}
_ = _bucket:on_replace(nil, rollback_on_delete)
---
...
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
---
+- true
...
f:cancel()
---
@@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
f = fiber.create(vshard.storage.internal.gc_bucket_f)
---
...
-while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return _bucket:count() == 0 end)
---
+- true
...
_bucket:select{}
---
diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
index 30079fa..250afb0 100644
--- a/test/unit/garbage.test.lua
+++ b/test/unit/garbage.test.lua
@@ -15,7 +15,6 @@ end;
test_run:cmd("setopt delimiter ''");
vshard.storage.internal.shard_index = 'bucket_id'
-vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
--
-- Find nothing if no bucket_id anywhere, or there is no index
@@ -75,16 +74,13 @@ s:drop()
format = {}
format[1] = {name = 'id', type = 'unsigned'}
format[2] = {name = 'status', type = 'string'}
+format[3] = {name = 'destination', type = 'string', is_nullable = true}
_bucket = box.schema.create_space('_bucket', {format = format})
_ = _bucket:create_index('pk')
_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
-_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
-_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
s = box.schema.create_space('test', {engine = engine})
pk = s:create_index('pk')
@@ -94,7 +90,7 @@ s:replace{2, 1}
s:replace{3, 2}
s:replace{4, 2}
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
+gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
s2 = box.schema.create_space('test2', {engine = engine})
pk2 = s2:create_index('pk')
sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
@@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
s2:replace{6, 4}
s2:replace{7, 5}
s2:replace{7, 6}
+ _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
+ _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
+ _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
+ _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
end;
test_run:cmd("setopt delimiter ''");
@@ -121,15 +121,21 @@ fill_spaces_with_garbage()
#s2:select{}
#s:select{}
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
+route_map
#s2:select{}
#s:select{}
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
+route_map
s2:select{}
s:select{}
-- Nothing deleted - update collected generation.
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
+route_map = {}
+gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
+gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
+route_map
#s2:select{}
#s:select{}
@@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-- Test continuous garbage collection via background fiber.
--
fill_spaces_with_garbage()
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
+_ = _bucket:on_replace(function() \
+ local gen = vshard.storage.internal.bucket_generation \
+ vshard.storage.internal.bucket_generation = gen + 1 \
+ vshard.storage.internal.bucket_generation_cond:broadcast() \
+end)
f = fiber.create(vshard.storage.internal.gc_bucket_f)
-- Wait until garbage collection is finished.
-while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
s:select{}
s2:select{}
-- Check garbage bucket is deleted by background fiber.
@@ -150,7 +160,7 @@ _bucket:select{}
--
_bucket:replace{2, vshard.consts.BUCKET.SENT}
-- Wait deletion after a while.
-while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{2} end)
_bucket:select{}
s:select{}
s2:select{}
@@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
s:replace{5, 4}
s:replace{6, 4}
_bucket:replace{4, vshard.consts.BUCKET.SENT}
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
--
-- Test WAL errors during deletion from _bucket.
@@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
_bucket:replace{4, vshard.consts.BUCKET.SENT}
s:replace{5, 4}
s:replace{6, 4}
-while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
-while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_log('default', 'Error during garbage collection step', \
+ 65536, 10)
+test_run:wait_cond(function() return #sk:select{4} == 0 end)
s:select{}
_bucket:select{}
_ = _bucket:on_replace(nil, rollback_on_delete)
-while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return not _bucket:get{4} end)
f:cancel()
@@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
#s:select{}
#s2:select{}
f = fiber.create(vshard.storage.internal.gc_bucket_f)
-while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
+test_run:wait_cond(function() return _bucket:count() == 0 end)
_bucket:select{}
s:select{}
s2:select{}
diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
deleted file mode 100644
index 92c8039..0000000
--- a/test/unit/garbage_errinj.result
+++ /dev/null
@@ -1,223 +0,0 @@
-test_run = require('test_run').new()
----
-...
-vshard = require('vshard')
----
-...
-fiber = require('fiber')
----
-...
-engine = test_run:get_cfg('engine')
----
-...
-vshard.storage.internal.shard_index = 'bucket_id'
----
-...
-format = {}
----
-...
-format[1] = {name = 'id', type = 'unsigned'}
----
-...
-format[2] = {name = 'status', type = 'string', is_nullable = true}
----
-...
-_bucket = box.schema.create_space('_bucket', {format = format})
----
-...
-_ = _bucket:create_index('pk')
----
-...
-_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
----
-...
-_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
----
-- [1, 'active']
-...
-_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
----
-- [2, 'receiving']
-...
-_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
----
-- [3, 'active']
-...
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
----
-- [4, 'sent']
-...
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
----
-- [5, 'garbage']
-...
-s = box.schema.create_space('test', {engine = engine})
----
-...
-pk = s:create_index('pk')
----
-...
-sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
----
-...
-s:replace{1, 1}
----
-- [1, 1]
-...
-s:replace{2, 1}
----
-- [2, 1]
-...
-s:replace{3, 2}
----
-- [3, 2]
-...
-s:replace{4, 2}
----
-- [4, 2]
-...
-s:replace{5, 100}
----
-- [5, 100]
-...
-s:replace{6, 100}
----
-- [6, 100]
-...
-s:replace{7, 4}
----
-- [7, 4]
-...
-s:replace{8, 5}
----
-- [8, 5]
-...
-s2 = box.schema.create_space('test2', {engine = engine})
----
-...
-pk2 = s2:create_index('pk')
----
-...
-sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
----
-...
-s2:replace{1, 1}
----
-- [1, 1]
-...
-s2:replace{3, 3}
----
-- [3, 3]
-...
-for i = 7, 1107 do s:replace{i, 200} end
----
-...
-s2:replace{4, 200}
----
-- [4, 200]
-...
-s2:replace{5, 100}
----
-- [5, 100]
-...
-s2:replace{5, 300}
----
-- [5, 300]
-...
-s2:replace{6, 4}
----
-- [6, 4]
-...
-s2:replace{7, 5}
----
-- [7, 5]
-...
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
----
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
----
-- - 4
-- true
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
----
-- - 5
-- true
-...
---
--- Test _bucket generation change during garbage buckets search.
---
-s:truncate()
----
-...
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
----
-...
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
----
-...
-f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
----
-...
-_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
----
-- [4, 'garbage']
-...
-s:replace{5, 4}
----
-- [5, 4]
-...
-s:replace{6, 4}
----
-- [6, 4]
-...
-#s:select{}
----
-- 2
-...
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
----
-...
-while f:status() ~= 'dead' do fiber.sleep(0.1) end
----
-...
--- Nothing is deleted - _bucket:replace() has changed _bucket
--- generation during search of garbage buckets.
-#s:select{}
----
-- 2
-...
-_bucket:select{4}
----
-- - [4, 'garbage']
-...
--- Next step deletes garbage ok.
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
----
-- []
-- true
-...
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
----
-- - 4
- - 5
-- true
-...
-#s:select{}
----
-- 0
-...
-_bucket:delete{4}
----
-- [4, 'garbage']
-...
-s2:drop()
----
-...
-s:drop()
----
-...
-_bucket:drop()
----
-...
diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
deleted file mode 100644
index 31184b9..0000000
--- a/test/unit/garbage_errinj.test.lua
+++ /dev/null
@@ -1,73 +0,0 @@
-test_run = require('test_run').new()
-vshard = require('vshard')
-fiber = require('fiber')
-
-engine = test_run:get_cfg('engine')
-vshard.storage.internal.shard_index = 'bucket_id'
-
-format = {}
-format[1] = {name = 'id', type = 'unsigned'}
-format[2] = {name = 'status', type = 'string', is_nullable = true}
-_bucket = box.schema.create_space('_bucket', {format = format})
-_ = _bucket:create_index('pk')
-_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
-_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
-_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
-_bucket:replace{4, vshard.consts.BUCKET.SENT}
-_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
-
-s = box.schema.create_space('test', {engine = engine})
-pk = s:create_index('pk')
-sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
-s:replace{1, 1}
-s:replace{2, 1}
-s:replace{3, 2}
-s:replace{4, 2}
-s:replace{5, 100}
-s:replace{6, 100}
-s:replace{7, 4}
-s:replace{8, 5}
-
-s2 = box.schema.create_space('test2', {engine = engine})
-pk2 = s2:create_index('pk')
-sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
-s2:replace{1, 1}
-s2:replace{3, 3}
-for i = 7, 1107 do s:replace{i, 200} end
-s2:replace{4, 200}
-s2:replace{5, 100}
-s2:replace{5, 300}
-s2:replace{6, 4}
-s2:replace{7, 5}
-
-gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-
---
--- Test _bucket generation change during garbage buckets search.
---
-s:truncate()
-_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
-f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
-_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
-s:replace{5, 4}
-s:replace{6, 4}
-#s:select{}
-vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
-while f:status() ~= 'dead' do fiber.sleep(0.1) end
--- Nothing is deleted - _bucket:replace() has changed _bucket
--- generation during search of garbage buckets.
-#s:select{}
-_bucket:select{4}
--- Next step deletes garbage ok.
-gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
-gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
-#s:select{}
-_bucket:delete{4}
-
-s2:drop()
-s:drop()
-_bucket:drop()
diff --git a/vshard/cfg.lua b/vshard/cfg.lua
index f7d5dbc..63d5414 100644
--- a/vshard/cfg.lua
+++ b/vshard/cfg.lua
@@ -251,9 +251,8 @@ local cfg_template = {
max = consts.REBALANCER_MAX_SENDING_MAX
},
collect_bucket_garbage_interval = {
- type = 'positive number', name = 'Garbage bucket collect interval',
- is_optional = true,
- default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
+ name = 'Garbage bucket collect interval', is_deprecated = true,
+ reason = 'Has no effect anymore'
},
collect_lua_garbage = {
type = 'boolean', name = 'Garbage Lua collect necessity',
diff --git a/vshard/consts.lua b/vshard/consts.lua
index 8c2a8b0..3f1585a 100644
--- a/vshard/consts.lua
+++ b/vshard/consts.lua
@@ -23,6 +23,7 @@ return {
DEFAULT_BUCKET_COUNT = 3000;
BUCKET_SENT_GARBAGE_DELAY = 0.5;
BUCKET_CHUNK_SIZE = 1000;
+ LUA_CHUNK_SIZE = 100000,
DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
REBALANCER_IDLE_INTERVAL = 60 * 60;
REBALANCER_WORK_INTERVAL = 10;
@@ -37,7 +38,7 @@ return {
DEFAULT_FAILOVER_PING_TIMEOUT = 5;
DEFAULT_SYNC_TIMEOUT = 1;
RECONNECT_TIMEOUT = 0.5;
- DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
+ GC_BACKOFF_INTERVAL = 5,
RECOVERY_INTERVAL = 5;
COLLECT_LUA_GARBAGE_INTERVAL = 100;
@@ -45,4 +46,6 @@ return {
DISCOVERY_WORK_INTERVAL = 1,
DISCOVERY_WORK_STEP = 0.01,
DISCOVERY_TIMEOUT = 10,
+
+ TIMEOUT_INFINITY = 500 * 365 * 86400,
}
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index adf1c20..1ea8069 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -69,7 +69,6 @@ if not M then
total_bucket_count = 0,
errinj = {
ERRINJ_CFG = false,
- ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
ERRINJ_RELOAD = false,
ERRINJ_CFG_DELAY = false,
ERRINJ_LONG_RECEIVE = false,
@@ -96,6 +95,8 @@ if not M then
-- detect that _bucket was not changed between yields.
--
bucket_generation = 0,
+ -- Condition variable fired on generation update.
+ bucket_generation_cond = lfiber.cond(),
--
-- Reference to the function used as on_replace trigger on
-- _bucket space. It is used to replace the trigger with
@@ -107,12 +108,14 @@ if not M then
-- replace the old function is to keep its reference.
--
bucket_on_replace = nil,
+ -- Redirects for recently sent buckets. They are kept for a while to
+ -- help routers to find a new location for sent and deleted buckets
+ -- without a full cluster scan.
+ route_map = {},
------------------- Garbage collection -------------------
-- Fiber to remove garbage buckets data.
collect_bucket_garbage_fiber = nil,
- -- Do buckets garbage collection once per this time.
- collect_bucket_garbage_interval = nil,
-- Boolean lua_gc state (create periodic gc task).
collect_lua_garbage = nil,
@@ -173,6 +176,7 @@ end
--
local function bucket_generation_increment()
M.bucket_generation = M.bucket_generation + 1
+ M.bucket_generation_cond:broadcast()
end
--
@@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
else
return bucket
end
+ local dst = bucket and bucket.destination or M.route_map[bucket_id]
return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
- bucket and bucket.destination)
+ dst)
end
--
@@ -804,11 +809,23 @@ end
--
local function bucket_unrefro(bucket_id)
local ref = M.bucket_refs[bucket_id]
- if not ref or ref.ro == 0 then
+ local count = ref and ref.ro or 0
+ if count == 0 then
return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
"no refs", nil)
end
- ref.ro = ref.ro - 1
+ if count == 1 then
+ ref.ro = 0
+ if ref.ro_lock then
+ -- Garbage collector is waiting for the bucket if RO
+ -- is locked. Let it know it has one more bucket to
+ -- collect. It relies on generation, so its increment
+ -- is enough.
+ bucket_generation_increment()
+ end
+ return true
+ end
+ ref.ro = count - 1
return true
end
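[Editor's note: the hunk above makes `bucket_unrefro()` bump the bucket generation when the *last* read-only reference of an `ro_lock`-ed bucket is dropped, which wakes the waiting garbage collector. A minimal Python model of that logic — `Gen`, `unrefro`, and the dict-based ref layout are illustrative stand-ins, not vshard's actual API:]

```python
class Gen:
    """Models M.bucket_generation plus its condition variable."""
    def __init__(self):
        self.value = 0
        self.woken = False

    def increment(self):
        self.value += 1
        self.woken = True  # stands in for bucket_generation_cond:broadcast()

def unrefro(refs, gen, bucket_id):
    """Release one read-only bucket reference (sketch of bucket_unrefro)."""
    ref = refs.get(bucket_id)
    count = ref["ro"] if ref else 0
    if count == 0:
        return None, "no refs"
    if count == 1:
        ref["ro"] = 0
        if ref["ro_lock"]:
            # GC waits for ro-locked buckets; waking it only on the last
            # unref avoids useless wakeups while readers remain.
            gen.increment()
        return True, None
    ref["ro"] = count - 1
    return True, None

refs = {7: {"ro": 2, "ro_lock": True}}
gen = Gen()
unrefro(refs, gen, 7)   # 2 -> 1: GC not woken yet
assert not gen.woken
unrefro(refs, gen, 7)   # last reference gone: GC woken via generation bump
assert gen.woken and gen.value == 1
```

The point of the design choice is that readers, not a polling timer, tell the collector exactly when a locked bucket becomes collectable.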
@@ -1481,79 +1498,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
end
--
--- Remove tuples from buckets of a specified type.
--- @param type Type of buckets to gc.
--- @retval List of ids of empty buckets of the type.
+-- Drop buckets with the given status along with their data in all spaces.
+-- @param status Status of target buckets.
+-- @param route_map Destinations of deleted buckets are saved into this table.
--
-local function gc_bucket_step_by_type(type)
- local sharded_spaces = find_sharded_spaces()
- local empty_buckets = {}
+local function gc_bucket_drop_xc(status, route_map)
local limit = consts.BUCKET_CHUNK_SIZE
- local is_all_collected = true
- for _, bucket in box.space._bucket.index.status:pairs(type) do
- local bucket_id = bucket.id
- local ref = M.bucket_refs[bucket_id]
+ local _bucket = box.space._bucket
+ local sharded_spaces = find_sharded_spaces()
+ for _, b in _bucket.index.status:pairs(status) do
+ local id = b.id
+ local ref = M.bucket_refs[id]
if ref then
assert(ref.rw == 0)
if ref.ro ~= 0 then
ref.ro_lock = true
- is_all_collected = false
goto continue
end
- M.bucket_refs[bucket_id] = nil
+ M.bucket_refs[id] = nil
end
for _, space in pairs(sharded_spaces) do
- gc_bucket_in_space_xc(space, bucket_id, type)
+ gc_bucket_in_space_xc(space, id, status)
limit = limit - 1
if limit == 0 then
lfiber.sleep(0)
limit = consts.BUCKET_CHUNK_SIZE
end
end
- table.insert(empty_buckets, bucket.id)
-::continue::
+ route_map[id] = b.destination
+ _bucket:delete{id}
+ ::continue::
end
- return empty_buckets, is_all_collected
-end
-
---
--- Drop buckets with ids in the list.
--- @param bucket_ids Bucket ids to drop.
--- @param status Expected bucket status.
---
-local function gc_bucket_drop_xc(bucket_ids, status)
- if #bucket_ids == 0 then
- return
- end
- local limit = consts.BUCKET_CHUNK_SIZE
- box.begin()
- local _bucket = box.space._bucket
- for _, id in pairs(bucket_ids) do
- local bucket_exists = _bucket:get{id} ~= nil
- local b = _bucket:get{id}
- if b then
- if b.status ~= status then
- return error(string.format('Bucket %d status is changed. Was '..
- '%s, became %s', id, status,
- b.status))
- end
- _bucket:delete{id}
- end
- limit = limit - 1
- if limit == 0 then
- box.commit()
- box.begin()
- limit = consts.BUCKET_CHUNK_SIZE
- end
- end
- box.commit()
end
--
-- Exception safe version of gc_bucket_drop_xc.
--
-local function gc_bucket_drop(bucket_ids, status)
- local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
+local function gc_bucket_drop(status, route_map)
+ local status, err = pcall(gc_bucket_drop_xc, status, route_map)
if not status then
box.rollback()
end
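[Editor's note: the rewritten `gc_bucket_drop_xc()` above merges the old two-phase "collect ids, then delete" scheme into one pass: skip buckets that still have readers (locking them), wipe their data, record the destination into `route_map`, and drop the `_bucket` tuple. A hedged Python sketch of that flow, with dicts standing in for spaces and `_bucket`:]

```python
def gc_bucket_drop(buckets, refs, spaces, status, route_map):
    """Sketch of the reworked GC step: drop all buckets of one status."""
    for b in [b for b in buckets.values() if b["status"] == status]:
        ref = refs.get(b["id"])
        if ref is not None and ref["ro"] != 0:
            # Bucket still has readers: lock new RO refs and revisit later.
            ref["ro_lock"] = True
            continue
        refs.pop(b["id"], None)
        for space in spaces:            # the real code yields every
            space.pop(b["id"], None)    # BUCKET_CHUNK_SIZE deletions
        # Remember where the bucket went so routers can be redirected.
        route_map[b["id"]] = b.get("destination")
        del buckets[b["id"]]

buckets = {
    4: {"id": 4, "status": "sent", "destination": "rs2"},
    5: {"id": 5, "status": "garbage"},
}
route_map = {}
gc_bucket_drop(buckets, {}, [{}], "sent", route_map)
gc_bucket_drop(buckets, {}, [{}], "garbage", route_map)
assert 4 not in buckets and 5 not in buckets
assert route_map == {4: "rs2", 5: None}
```

GARBAGE buckets may have no destination, so their `route_map` entry is empty — matching the `null` entries in the test output above.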
@@ -1561,14 +1543,16 @@ local function gc_bucket_drop(bucket_ids, status)
end
--
--- Garbage collector. Works on masters. The garbage collector
--- wakes up once per specified time.
+-- Garbage collector. Works on masters. The garbage collector wakes up when
+-- state of any bucket changes.
-- After wakeup it follows the plan:
--- 1) Check if _bucket has changed. If not, then sleep again;
--- 2) Scan user spaces for sent and garbage buckets, delete
--- garbage data in batches of limited size;
--- 3) Delete GARBAGE buckets from _bucket immediately, and
--- schedule SENT buckets for deletion after a timeout;
+-- 1) Check if state of any bucket has really changed. If not, then sleep again;
+-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
+-- limited size.
+-- 3) Bucket destinations are saved into a global route_map to reroute incoming
+-- requests from routers in case they didn't notice the buckets being moved.
+-- The saved routes are scheduled for deletion after a timeout, which is
+-- checked on each iteration of this loop.
-- 4) Sleep, go to (1).
-- For each step details see comments in the code.
--
@@ -1580,65 +1564,75 @@ function gc_bucket_f()
-- generation == bucket generation. In such a case the fiber
-- does nothing until next _bucket change.
local bucket_generation_collected = -1
- -- Empty sent buckets are collected into an array. After a
- -- specified time interval the buckets are deleted both from
- -- this array and from _bucket space.
- local buckets_for_redirect = {}
- local buckets_for_redirect_ts = fiber_clock()
- -- Empty sent buckets, updated after each step, and when
- -- buckets_for_redirect is deleted, it gets empty_sent_buckets
- -- for next deletion.
- local empty_garbage_buckets, empty_sent_buckets, status, err
+ local bucket_generation_current = M.bucket_generation
+ -- Deleted buckets are saved into a route map to redirect routers if they
+ -- didn't discover new location of the buckets yet. However route map does
+ -- not grow infinitely. Otherwise it would end up storing redirects for all
+ -- buckets in the cluster, which could also become outdated.
+ -- Garbage collector periodically drops old routes from the map. For that it
+ -- remembers state of route map in one moment, and after a while clears the
+ -- remembered routes from the global route map.
+ local route_map = M.route_map
+ local route_map_old = {}
+ local route_map_deadline = 0
+ local status, err
while M.module_version == module_version do
- -- Check if no changes in buckets configuration.
- if bucket_generation_collected ~= M.bucket_generation then
- local bucket_generation = M.bucket_generation
- local is_sent_collected, is_garbage_collected
- status, empty_garbage_buckets, is_garbage_collected =
- pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
- if not status then
- err = empty_garbage_buckets
- goto check_error
- end
- status, empty_sent_buckets, is_sent_collected =
- pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
- if not status then
- err = empty_sent_buckets
- goto check_error
+ if bucket_generation_collected ~= bucket_generation_current then
+ status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
+ if status then
+ status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
end
- status, err = gc_bucket_drop(empty_garbage_buckets,
- consts.BUCKET.GARBAGE)
-::check_error::
if not status then
box.rollback()
log.error('Error during garbage collection step: %s', err)
- goto continue
+ else
+ -- Don't use global generation. During the collection it could
+ -- already change. Instead, remember the generation known before
+ -- the collection has started.
+ -- Since the collection also changes the generation, it makes
+ -- the GC happen always at least twice. But typically on the
+ -- second iteration it should not find any buckets to collect,
+ -- and then the collected generation matches the global one.
+ bucket_generation_collected = bucket_generation_current
end
- if is_sent_collected and is_garbage_collected then
- bucket_generation_collected = bucket_generation
+ else
+ status = true
+ end
+
+ local sleep_time = route_map_deadline - fiber_clock()
+ if sleep_time <= 0 then
+ local chunk = consts.LUA_CHUNK_SIZE
+ util.table_minus_yield(route_map, route_map_old, chunk)
+ route_map_old = util.table_copy_yield(route_map, chunk)
+ if next(route_map_old) then
+ sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
+ else
+ sleep_time = consts.TIMEOUT_INFINITY
end
+ route_map_deadline = fiber_clock() + sleep_time
end
+ bucket_generation_current = M.bucket_generation
- if fiber_clock() - buckets_for_redirect_ts >=
- consts.BUCKET_SENT_GARBAGE_DELAY then
- status, err = gc_bucket_drop(buckets_for_redirect,
- consts.BUCKET.SENT)
- if not status then
- buckets_for_redirect = {}
- empty_sent_buckets = {}
- bucket_generation_collected = -1
- log.error('Error during deletion of empty sent buckets: %s',
- err)
- elseif M.module_version ~= module_version then
- return
+ if bucket_generation_current ~= bucket_generation_collected then
+ -- Generation was changed during collection. Or *by* collection.
+ if status then
+ -- Retry immediately. If the generation was changed by the
+ -- collection itself, it will notice it next iteration, and go
+ -- to proper sleep.
+ sleep_time = 0
else
- buckets_for_redirect = empty_sent_buckets or {}
- empty_sent_buckets = nil
- buckets_for_redirect_ts = fiber_clock()
+ -- An error happened during the collection. Does not make sense
+ -- to retry on each iteration of the event loop. The most likely
+ -- errors are either a WAL error or a transaction abort - both
+ -- look like an issue in the user's code and can't be fixed
+ -- quickly anyway. Backoff.
+ sleep_time = consts.GC_BACKOFF_INTERVAL
end
end
-::continue::
- lfiber.sleep(M.collect_bucket_garbage_interval)
+
+ if M.module_version == module_version then
+ M.bucket_generation_cond:wait(sleep_time)
+ end
end
end
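[Editor's note: the `gc_bucket_f` loop above ages the redirect map with a two-snapshot scheme: each tick it subtracts the previous snapshot from the live map (`table_minus_yield`) and then takes a new snapshot (`table_copy_yield`), so every route survives at least one full `BUCKET_SENT_GARBAGE_DELAY` interval. A Python model of one aging tick — the helper name and dict representation are illustrative:]

```python
def age_route_map(route_map, route_map_old):
    """One aging tick over the redirect map (sketch of the GC loop's
    table_minus_yield + table_copy_yield sequence)."""
    # Drop routes that already existed, unchanged, at the previous tick...
    for k in [k for k in route_map
              if k in route_map_old and route_map[k] == route_map_old[k]]:
        del route_map[k]
    # ...and snapshot what is left; it will expire on the next tick.
    return dict(route_map)

route_map = {1: "rs2", 2: "rs3"}
old = age_route_map(route_map, {})    # first tick: nothing expires yet
route_map[3] = "rs1"                  # a bucket sent after the snapshot
old = age_route_map(route_map, old)   # routes 1 and 2 expire; 3 survives
assert route_map == {3: "rs1"}
old = age_route_map(route_map, old)   # next tick: route 3 expires too
assert route_map == {}
```

When the snapshot comes out empty, the loop has nothing left to expire and can sleep with `TIMEOUT_INFINITY` until the generation condition fires again.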
@@ -2423,8 +2417,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
vshard_cfg.rebalancer_disbalance_threshold
M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
M.shard_index = vshard_cfg.shard_index
- M.collect_bucket_garbage_interval =
- vshard_cfg.collect_bucket_garbage_interval
M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
M.current_cfg = cfg
@@ -2678,6 +2670,9 @@ else
storage_cfg(M.current_cfg, M.this_replica.uuid, true)
end
M.module_version = M.module_version + 1
+ -- Background fibers could sleep waiting for bucket changes.
+ -- Let them know it is time to reload.
+ bucket_generation_increment()
end
M.recovery_f = recovery_f
@@ -2688,7 +2683,7 @@ M.gc_bucket_f = gc_bucket_f
-- These functions are saved in M not for atomic reload, but for
-- unit testing.
--
-M.gc_bucket_step_by_type = gc_bucket_step_by_type
+M.gc_bucket_drop = gc_bucket_drop
M.rebalancer_build_routes = rebalancer_build_routes
M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
M.cached_find_sharded_spaces = find_sharded_spaces
diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
index f38af74..484f499 100644
--- a/vshard/storage/reload_evolution.lua
+++ b/vshard/storage/reload_evolution.lua
@@ -4,6 +4,7 @@
-- in a commit.
--
local log = require('log')
+local fiber = require('fiber')
--
-- Array of upgrade functions.
@@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
end
end
+migrations[#migrations + 1] = function(M)
+ if not M.route_map then
+ M.bucket_generation_cond = fiber.cond()
+ M.route_map = {}
+ end
+end
+
--
-- Perform an update based on a version stored in `M` (internals).
-- @param M Old module internals which should be updated.
* Re: [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector
2021-02-10 22:35 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:50 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your fixes! LGTM.
On 11/02/2021 01:35, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 10:00, Oleg Babin wrote:
>> Thanks for your patch.
>>
>> As I see you've introduced some new parameters: "LUA_CHUNK_SIZE" and "GC_BACKOFF_INTERVAL".
> I decided not to go into too much detail and not to describe private
> constants in the commit message. GC_BACKOFF_INTERVAL is explained
> in the place where it is used. LUA_CHUNK_SIZE is quite obvious if
> you look at its usage.
>
>> I think it's better to describe them in the commit message to make it clearer how the new algorithm works.
> These constants are not super relevant to the algorithm's core
> idea. It does not matter much for the reactive GC concept if I
> yield in table utility functions, or if I have a backoff timeout.
> These could be considered 'optimizations' or 'amendments'. I would
> consider them small details not worth mentioning in the commit
> message.
>
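[Editorial note: the chunked deletion mentioned above (LUA_CHUNK_SIZE) can be sketched roughly as follows. This is an illustrative model only, not the exact vshard code; the constant value, the `bucket_id` index name, and the assumption that the primary key is the first tuple field are all placeholders.]

```lua
local fiber = require('fiber')

-- Illustrative constant: how many tuples to process before yielding.
local LUA_CHUNK_SIZE = 100000

-- Delete all tuples of a given bucket in chunks, yielding between
-- chunks so one long deletion does not block the TX thread.
local function delete_bucket_data(space, bucket_id)
    local index = space.index.bucket_id
    while true do
        local tuples = index:select(bucket_id, {limit = LUA_CHUNK_SIZE})
        if #tuples == 0 then
            break
        end
        for _, tuple in ipairs(tuples) do
            -- Assumes the primary key is the first tuple field.
            space:delete(tuple[1])
        end
        -- Let other fibers run before taking the next chunk.
        fiber.yield()
    end
end
```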
>> I see that you didn't update the comment above the "gc_bucket_f" function. Is it still relevant?
> No, irrelevant, thanks for noticing. Here is the diff:
>
> ====================
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 99f92a0..1ea8069 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -1543,14 +1543,16 @@ local function gc_bucket_drop(status, route_map)
> end
>
> --
> --- Garbage collector. Works on masters. The garbage collector
> --- wakes up once per specified time.
> +-- Garbage collector. Works on masters. The garbage collector wakes up when
> +-- state of any bucket changes.
> -- After wakeup it follows the plan:
> --- 1) Check if _bucket has changed. If not, then sleep again;
> --- 2) Scan user spaces for sent and garbage buckets, delete
> --- garbage data in batches of limited size;
> --- 3) Delete GARBAGE buckets from _bucket immediately, and
> --- schedule SENT buckets for deletion after a timeout;
> +-- 1) Check if state of any bucket has really changed. If not, then sleep again;
> +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
> +-- limited size.
> +-- 3) Bucket destinations are saved into a global route_map to reroute incoming
> +-- requests from routers in case they didn't notice the buckets being moved.
> +-- The saved routes are scheduled for deletion after a timeout, which is
> +-- checked on each iteration of this loop.
> -- 4) Sleep, go to (1).
> -- For each step details see comments in the code.
> --
> ====================
>
> The full new patch below.
>
> ====================
> gc: introduce reactive garbage collector
>
> Garbage collector is a fiber on a master node which deletes
> GARBAGE and SENT buckets along with their data.
>
> It was proactive. It used to wake up with a constant period to
> find and delete the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> GC algorithm if a bucket is sent, it won't be deleted for the next
> 5 seconds by default. During this time all new map-reduce requests
> can't execute.
>
> This is not acceptable. Neither is too frequent wakeup of the GC
> fiber, because it would waste TX thread time.
>
> The patch makes the GC fiber wake up not on a timeout but on
> events happening with the _bucket space. The GC fiber sleeps on a
> condition variable which is signaled when _bucket is changed.
>
> Once GC sees work to do, it won't sleep until it is done. It will
> only yield.
>
> This makes GC delete SENT and GARBAGE buckets as soon as possible
> reducing the waiting time for the incoming map-reduce requests.
>
> Needed for #147
>
> @TarantoolBot document
> Title: VShard: deprecate cfg option 'collect_bucket_garbage_interval'
> It was used to specify the interval between bucket garbage
> collection steps. It was needed because garbage collection in
> vshard was proactive. It didn't react to newly appeared garbage
> buckets immediately.
>
> Since 0.1.17 garbage collection is reactive. It starts working on
> garbage buckets immediately as they appear and sleeps the rest of
> the time. The option is no longer used and does not affect
> anything.
>
> I suppose it can be deleted from the documentation, or left with
> a big 'deprecated' label plus the explanation above.
>
> An attempt to use the option does not cause an error, but logs a
> warning.
>
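[Editorial note: the event-driven loop described in the commit message can be sketched as a minimal model. This is illustrative only; the real `gc_bucket_f` lives in `vshard/storage/init.lua`, and the names below mirror it loosely rather than reproduce it.]

```lua
local fiber = require('fiber')

-- Simplified model of the reactive GC described above.
local M = {
    bucket_generation = 0,
    bucket_generation_cond = fiber.cond(),
}

-- Any change to _bucket (via an on_replace trigger in the real code)
-- bumps the generation and wakes the GC fiber.
local function bucket_generation_increment()
    M.bucket_generation = M.bucket_generation + 1
    M.bucket_generation_cond:broadcast()
end

local function gc_f()
    local seen_generation = -1
    while true do
        if M.bucket_generation ~= seen_generation then
            seen_generation = M.bucket_generation
            -- ... collect SENT and GARBAGE buckets here, yielding
            -- between chunks but not sleeping until done ...
        end
        -- Sleep until the next _bucket change instead of polling
        -- on a fixed interval.
        M.bucket_generation_cond:wait()
    end
end
```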
> diff --git a/test/lua_libs/storage_template.lua b/test/lua_libs/storage_template.lua
> index 21409bd..8df89f6 100644
> --- a/test/lua_libs/storage_template.lua
> +++ b/test/lua_libs/storage_template.lua
> @@ -172,6 +172,5 @@ function wait_bucket_is_collected(id)
> return true
> end
> vshard.storage.recovery_wakeup()
> - vshard.storage.garbage_collector_wakeup()
> end)
> end
> diff --git a/test/misc/reconfigure.result b/test/misc/reconfigure.result
> index 168be5d..3b34841 100644
> --- a/test/misc/reconfigure.result
> +++ b/test/misc/reconfigure.result
> @@ -83,9 +83,6 @@ cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> ---
> ...
> -cfg.collect_bucket_garbage_interval = 100
> ----
> -...
> cfg.invalid_option = 'kek'
> ---
> ...
> @@ -105,10 +102,6 @@ vshard.storage.internal.rebalancer_max_receiving ~= 1000
> ---
> - true
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> ----
> -- true
> -...
> cfg.sync_timeout = nil
> ---
> ...
> @@ -118,9 +111,6 @@ cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> ---
> ...
> -cfg.collect_bucket_garbage_interval = nil
> ----
> -...
> cfg.invalid_option = nil
> ---
> ...
> diff --git a/test/misc/reconfigure.test.lua b/test/misc/reconfigure.test.lua
> index e891010..348628c 100644
> --- a/test/misc/reconfigure.test.lua
> +++ b/test/misc/reconfigure.test.lua
> @@ -33,17 +33,14 @@ vshard.storage.internal.sync_timeout
> cfg.sync_timeout = 100
> cfg.collect_lua_garbage = true
> cfg.rebalancer_max_receiving = 1000
> -cfg.collect_bucket_garbage_interval = 100
> cfg.invalid_option = 'kek'
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> not vshard.storage.internal.collect_lua_garbage
> vshard.storage.internal.sync_timeout
> vshard.storage.internal.rebalancer_max_receiving ~= 1000
> -vshard.storage.internal.collect_bucket_garbage_interval ~= 100
> cfg.sync_timeout = nil
> cfg.collect_lua_garbage = nil
> cfg.rebalancer_max_receiving = nil
> -cfg.collect_bucket_garbage_interval = nil
> cfg.invalid_option = nil
>
> --
> diff --git a/test/rebalancer/bucket_ref.result b/test/rebalancer/bucket_ref.result
> index b8fc7ff..9df7480 100644
> --- a/test/rebalancer/bucket_ref.result
> +++ b/test/rebalancer/bucket_ref.result
> @@ -184,9 +184,6 @@ vshard.storage.bucket_unref(1, 'read')
> - true
> ...
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> ----
> -...
> vshard.storage.buckets_info(1)
> ---
> - 1:
> @@ -203,7 +200,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> ---
> @@ -235,14 +231,6 @@ finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> ---
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: garbage
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/bucket_ref.test.lua b/test/rebalancer/bucket_ref.test.lua
> index 213ced3..1b032ff 100644
> --- a/test/rebalancer/bucket_ref.test.lua
> +++ b/test/rebalancer/bucket_ref.test.lua
> @@ -56,7 +56,6 @@ vshard.storage.bucket_unref(1, 'write') -- Error, no refs.
> vshard.storage.bucket_ref(1, 'read')
> vshard.storage.bucket_unref(1, 'read')
> -- Force GC to take an RO lock on the bucket now.
> -vshard.storage.garbage_collector_wakeup()
> vshard.storage.buckets_info(1)
> _ = test_run:cmd("setopt delimiter ';'")
> while true do
> @@ -64,7 +63,6 @@ while true do
> if i.status == vshard.consts.BUCKET.GARBAGE and i.ro_lock then
> break
> end
> - vshard.storage.garbage_collector_wakeup()
> fiber.sleep(0.01)
> end;
> _ = test_run:cmd("setopt delimiter ''");
> @@ -72,7 +70,6 @@ vshard.storage.buckets_info(1)
> vshard.storage.bucket_refro(1)
> finish_refs = true
> while f1:status() ~= 'dead' do fiber.sleep(0.01) end
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> _ = test_run:switch('box_2_a')
> vshard.storage.buckets_info(1)
> diff --git a/test/rebalancer/errinj.result b/test/rebalancer/errinj.result
> index e50eb72..0ddb1c9 100644
> --- a/test/rebalancer/errinj.result
> +++ b/test/rebalancer/errinj.result
> @@ -226,17 +226,6 @@ ret2, err2
> - true
> - null
> ...
> -_bucket:get{35}
> ----
> -- [35, 'sent', '<replicaset_2>']
> -...
> -_bucket:get{36}
> ----
> -- [36, 'sent', '<replicaset_2>']
> -...
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> ---
> ...
> diff --git a/test/rebalancer/errinj.test.lua b/test/rebalancer/errinj.test.lua
> index 2cc4a69..a60f3d7 100644
> --- a/test/rebalancer/errinj.test.lua
> +++ b/test/rebalancer/errinj.test.lua
> @@ -102,11 +102,6 @@ _ = test_run:switch('box_1_a')
> while f1:status() ~= 'dead' or f2:status() ~= 'dead' do fiber.sleep(0.001) end
> ret1, err1
> ret2, err2
> -_bucket:get{35}
> -_bucket:get{36}
> --- Buckets became 'active' on box_2_a, but still are sending on
> --- box_1_a. Wait until it is marked as garbage on box_1_a by the
> --- recovery fiber.
> wait_bucket_is_collected(35)
> wait_bucket_is_collected(36)
> _ = test_run:switch('box_2_a')
> diff --git a/test/rebalancer/receiving_bucket.result b/test/rebalancer/receiving_bucket.result
> index 7d3612b..ad93445 100644
> --- a/test/rebalancer/receiving_bucket.result
> +++ b/test/rebalancer/receiving_bucket.result
> @@ -366,14 +366,6 @@ vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> ---
> - true
> ...
> -vshard.storage.buckets_info(1)
> ----
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_1>
> - id: 1
> -...
> wait_bucket_is_collected(1)
> ---
> ...
> diff --git a/test/rebalancer/receiving_bucket.test.lua b/test/rebalancer/receiving_bucket.test.lua
> index 24534b3..2cf6382 100644
> --- a/test/rebalancer/receiving_bucket.test.lua
> +++ b/test/rebalancer/receiving_bucket.test.lua
> @@ -136,7 +136,6 @@ box.space.test3:select{100}
> -- Now the bucket is unreferenced and can be transferred.
> _ = test_run:switch('box_2_a')
> vshard.storage.bucket_send(1, util.replicasets[1], {timeout = 0.3})
> -vshard.storage.buckets_info(1)
> wait_bucket_is_collected(1)
> vshard.storage.buckets_info(1)
> _ = test_run:switch('box_1_a')
> diff --git a/test/reload_evolution/storage.result b/test/reload_evolution/storage.result
> index 753687f..9d30a04 100644
> --- a/test/reload_evolution/storage.result
> +++ b/test/reload_evolution/storage.result
> @@ -92,7 +92,7 @@ test_run:grep_log('storage_2_a', 'vshard.storage.reload_evolution: upgraded to')
> ...
> vshard.storage.internal.reload_version
> ---
> -- 2
> +- 3
> ...
> --
> -- gh-237: should be only one trigger. During gh-237 the trigger installation
> diff --git a/test/router/reroute_wrong_bucket.result b/test/router/reroute_wrong_bucket.result
> index 049bdef..ac340eb 100644
> --- a/test/router/reroute_wrong_bucket.result
> +++ b/test/router/reroute_wrong_bucket.result
> @@ -37,7 +37,7 @@ test_run:switch('storage_1_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> @@ -53,7 +53,7 @@ test_run:switch('storage_2_a')
> ---
> - true
> ...
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> ---
> ...
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> @@ -202,12 +202,12 @@ test_run:grep_log('router_1', 'please update configuration')
> err
> ---
> - bucket_id: 100
> - reason: write is prohibited
> + reason: Not found
> code: 1
> destination: ac522f65-aa94-4134-9f64-51ee384f1a54
> type: ShardingError
> name: WRONG_BUCKET
> - message: 'Cannot perform action with bucket 100, reason: write is prohibited'
> + message: 'Cannot perform action with bucket 100, reason: Not found'
> ...
> --
> -- Now try again, but update configuration during call(). It must
> diff --git a/test/router/reroute_wrong_bucket.test.lua b/test/router/reroute_wrong_bucket.test.lua
> index 9e6e804..207aac3 100644
> --- a/test/router/reroute_wrong_bucket.test.lua
> +++ b/test/router/reroute_wrong_bucket.test.lua
> @@ -11,13 +11,13 @@ util.map_evals(test_run, {REPLICASET_1, REPLICASET_2}, 'bootstrap_storage(\'memt
> test_run:cmd('create server router_1 with script="router/router_1.lua"')
> test_run:cmd('start server router_1')
> test_run:switch('storage_1_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_1_a)
> vshard.storage.rebalancer_disable()
> for i = 1, 100 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
>
> test_run:switch('storage_2_a')
> -cfg.collect_bucket_garbage_interval = 100
> +vshard.consts.BUCKET_SENT_GARBAGE_DELAY = 100
> vshard.storage.cfg(cfg, util.name_to_uuid.storage_2_a)
> vshard.storage.rebalancer_disable()
> for i = 101, 200 do box.space._bucket:replace{i, vshard.consts.BUCKET.ACTIVE} end
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index f833fe7..8ccb0b9 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -79,8 +79,7 @@ _bucket = box.space._bucket
> ...
> _bucket:select{}
> ---
> -- - [2, 'garbage', '<replicaset_2>']
> - - [3, 'garbage', '<replicaset_2>']
> +- []
> ...
> _ = test_run:switch('storage_2_a')
> ---
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index 424bc4c..0550ad1 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -547,6 +547,9 @@ vshard.storage.bucket_send(1, util.replicasets[2])
> ---
> - true
> ...
> +wait_bucket_is_collected(1)
> +---
> +...
> _ = test_run:switch("storage_2_a")
> ---
> ...
> @@ -567,12 +570,7 @@ _ = test_run:switch("storage_1_a")
> ...
> vshard.storage.buckets_info()
> ---
> -- 1:
> - status: sent
> - ro_lock: true
> - destination: <replicaset_2>
> - id: 1
> - 2:
> +- 2:
> status: active
> id: 2
> ...
> diff --git a/test/storage/storage.test.lua b/test/storage/storage.test.lua
> index d631b51..d8fbd94 100644
> --- a/test/storage/storage.test.lua
> +++ b/test/storage/storage.test.lua
> @@ -136,6 +136,7 @@ vshard.storage.bucket_send(1, util.replicasets[1])
>
> -- Successful transfer.
> vshard.storage.bucket_send(1, util.replicasets[2])
> +wait_bucket_is_collected(1)
> _ = test_run:switch("storage_2_a")
> vshard.storage.buckets_info()
> _ = test_run:switch("storage_1_a")
> diff --git a/test/unit/config.result b/test/unit/config.result
> index dfd0219..e0b2482 100644
> --- a/test/unit/config.result
> +++ b/test/unit/config.result
> @@ -428,33 +428,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 0
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = -1
> ----
> -...
> -check(cfg)
> ----
> -- Garbage bucket collect interval must be positive number
> -...
> -cfg.collect_bucket_garbage_interval = 100.5
> ----
> -...
> -_ = lcfg.check(cfg)
> ----
> -...
> cfg.collect_lua_garbage = 100
> ---
> ...
> @@ -615,6 +588,12 @@ lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> ---
> ...
> -cfg.sharding = nil
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +---
> +...
> +_ = lcfg.check(cfg)
> ---
> ...
> diff --git a/test/unit/config.test.lua b/test/unit/config.test.lua
> index ada43db..a1c9f07 100644
> --- a/test/unit/config.test.lua
> +++ b/test/unit/config.test.lua
> @@ -175,15 +175,6 @@ _ = lcfg.check(cfg)
> --
> -- gh-77: garbage collection options.
> --
> -cfg.collect_bucket_garbage_interval = 'str'
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 0
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = -1
> -check(cfg)
> -cfg.collect_bucket_garbage_interval = 100.5
> -_ = lcfg.check(cfg)
> -
> cfg.collect_lua_garbage = 100
> check(cfg)
> cfg.collect_lua_garbage = true
> @@ -244,4 +235,9 @@ util.check_error(lcfg.check, cfg)
> cfg.rebalancer_max_sending = 15
> lcfg.check(cfg).rebalancer_max_sending
> cfg.rebalancer_max_sending = nil
> -cfg.sharding = nil
> +
> +--
> +-- Deprecated option does not break anything.
> +--
> +cfg.collect_bucket_garbage_interval = 100
> +_ = lcfg.check(cfg)
> diff --git a/test/unit/garbage.result b/test/unit/garbage.result
> index 74d9ccf..a530496 100644
> --- a/test/unit/garbage.result
> +++ b/test/unit/garbage.result
> @@ -31,9 +31,6 @@ test_run:cmd("setopt delimiter ''");
> vshard.storage.internal.shard_index = 'bucket_id'
> ---
> ...
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> ----
> -...
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> -- by it, or bucket_id is not unsigned.
> @@ -151,6 +148,9 @@ format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> ---
> ...
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> +---
> +...
> _bucket = box.schema.create_space('_bucket', {format = format})
> ---
> ...
> @@ -172,22 +172,6 @@ _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ---
> - [3, 'active']
> ...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [6, 'garbage']
> -...
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [200, 'garbage']
> -...
> s = box.schema.create_space('test', {engine = engine})
> ---
> ...
> @@ -213,7 +197,7 @@ s:replace{4, 2}
> ---
> - [4, 2]
> ...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> ---
> ...
> s2 = box.schema.create_space('test2', {engine = engine})
> @@ -249,6 +233,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> ---
> ...
> @@ -267,12 +255,22 @@ fill_spaces_with_garbage()
> ---
> - 1107
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - null
> + - null
> + - destination2
> ...
> #s2:select{}
> ---
> @@ -282,10 +280,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ---
> - 7
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- - null
> + - null
> + - null
> + - destination1
> ...
> s2:select{}
> ---
> @@ -303,17 +311,22 @@ s:select{}
> - [6, 100]
> ...
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +---
> +...
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> ---
> -- - 5
> - - 6
> - - 200
> - true
> +- null
> ...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> ---
> -- - 4
> - true
> +- null
> +...
> +route_map
> +---
> +- []
> ...
> #s2:select{}
> ---
> @@ -329,15 +342,20 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> fill_spaces_with_garbage()
> ---
> ...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> ---
> ...
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -360,7 +378,6 @@ _bucket:select{}
> - - [1, 'active']
> - [2, 'receiving']
> - [3, 'active']
> - - [4, 'sent']
> ...
> --
> -- Test deletion of 'sent' buckets after a specified timeout.
> @@ -370,8 +387,9 @@ _bucket:replace{2, vshard.consts.BUCKET.SENT}
> - [2, 'sent']
> ...
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> @@ -410,8 +428,9 @@ _bucket:replace{4, vshard.consts.BUCKET.SENT}
> ---
> - [4, 'sent']
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -434,11 +453,14 @@ s:replace{6, 4}
> ---
> - [6, 4]
> ...
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> ---
> +- Error during garbage collection step
> ...
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> ---
> +- true
> ...
> s:select{}
> ---
> @@ -454,8 +476,9 @@ _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> ---
> ...
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
> ---
> +- true
> ...
> f:cancel()
> ---
> @@ -562,8 +585,9 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> ---
> +- true
> ...
> _bucket:select{}
> ---
> diff --git a/test/unit/garbage.test.lua b/test/unit/garbage.test.lua
> index 30079fa..250afb0 100644
> --- a/test/unit/garbage.test.lua
> +++ b/test/unit/garbage.test.lua
> @@ -15,7 +15,6 @@ end;
> test_run:cmd("setopt delimiter ''");
>
> vshard.storage.internal.shard_index = 'bucket_id'
> -vshard.storage.internal.collect_bucket_garbage_interval = vshard.consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
>
> --
> -- Find nothing if no bucket_id anywhere, or there is no index
> @@ -75,16 +74,13 @@ s:drop()
> format = {}
> format[1] = {name = 'id', type = 'unsigned'}
> format[2] = {name = 'status', type = 'string'}
> +format[3] = {name = 'destination', type = 'string', is_nullable = true}
> _bucket = box.schema.create_space('_bucket', {format = format})
> _ = _bucket:create_index('pk')
> _ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> _bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> _bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{6, vshard.consts.BUCKET.GARBAGE}
> -_bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
>
> s = box.schema.create_space('test', {engine = engine})
> pk = s:create_index('pk')
> @@ -94,7 +90,7 @@ s:replace{2, 1}
> s:replace{3, 2}
> s:replace{4, 2}
>
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> +gc_bucket_drop = vshard.storage.internal.gc_bucket_drop
> s2 = box.schema.create_space('test2', {engine = engine})
> pk2 = s2:create_index('pk')
> sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> @@ -114,6 +110,10 @@ function fill_spaces_with_garbage()
> s2:replace{6, 4}
> s2:replace{7, 5}
> s2:replace{7, 6}
> + _bucket:replace{4, vshard.consts.BUCKET.SENT, 'destination1'}
> + _bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> + _bucket:replace{6, vshard.consts.BUCKET.GARBAGE, 'destination2'}
> + _bucket:replace{200, vshard.consts.BUCKET.GARBAGE}
> end;
> test_run:cmd("setopt delimiter ''");
>
> @@ -121,15 +121,21 @@ fill_spaces_with_garbage()
>
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +route_map
> #s2:select{}
> #s:select{}
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> s2:select{}
> s:select{}
> -- Nothing deleted - update collected generation.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> +route_map = {}
> +gc_bucket_drop(vshard.consts.BUCKET.GARBAGE, route_map)
> +gc_bucket_drop(vshard.consts.BUCKET.SENT, route_map)
> +route_map
> #s2:select{}
> #s:select{}
>
> @@ -137,10 +143,14 @@ gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -- Test continuous garbage collection via background fiber.
> --
> fill_spaces_with_garbage()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> +_ = _bucket:on_replace(function() \
> + local gen = vshard.storage.internal.bucket_generation \
> + vshard.storage.internal.bucket_generation = gen + 1 \
> + vshard.storage.internal.bucket_generation_cond:broadcast() \
> +end)
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -- Wait until garbage collection is finished.
> -while s2:count() ~= 3 or s:count() ~= 6 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return s2:count() == 3 and s:count() == 6 end)
> s:select{}
> s2:select{}
> -- Check garbage bucket is deleted by background fiber.
> @@ -150,7 +160,7 @@ _bucket:select{}
> --
> _bucket:replace{2, vshard.consts.BUCKET.SENT}
> -- Wait deletion after a while.
> -while _bucket:get{2} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{2} end)
> _bucket:select{}
> s:select{}
> s2:select{}
> @@ -162,7 +172,7 @@ _bucket:replace{4, vshard.consts.BUCKET.ACTIVE}
> s:replace{5, 4}
> s:replace{6, 4}
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> --
> -- Test WAL errors during deletion from _bucket.
> @@ -172,12 +182,13 @@ _ = _bucket:on_replace(rollback_on_delete)
> _bucket:replace{4, vshard.consts.BUCKET.SENT}
> s:replace{5, 4}
> s:replace{6, 4}
> -while not test_run:grep_log("default", "Error during deletion of empty sent buckets") do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> -while #sk:select{4} ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_log('default', 'Error during garbage collection step', \
> + 65536, 10)
> +test_run:wait_cond(function() return #sk:select{4} == 0 end)
> s:select{}
> _bucket:select{}
> _ = _bucket:on_replace(nil, rollback_on_delete)
> -while _bucket:get{4} ~= nil do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return not _bucket:get{4} end)
>
> f:cancel()
>
> @@ -220,7 +231,7 @@ for i = 1, 2000 do _bucket:replace{i, vshard.consts.BUCKET.GARBAGE} s:replace{i,
> #s:select{}
> #s2:select{}
> f = fiber.create(vshard.storage.internal.gc_bucket_f)
> -while _bucket:count() ~= 0 do vshard.storage.garbage_collector_wakeup() fiber.sleep(0.001) end
> +test_run:wait_cond(function() return _bucket:count() == 0 end)
> _bucket:select{}
> s:select{}
> s2:select{}
> diff --git a/test/unit/garbage_errinj.result b/test/unit/garbage_errinj.result
> deleted file mode 100644
> index 92c8039..0000000
> --- a/test/unit/garbage_errinj.result
> +++ /dev/null
> @@ -1,223 +0,0 @@
> -test_run = require('test_run').new()
> ----
> -...
> -vshard = require('vshard')
> ----
> -...
> -fiber = require('fiber')
> ----
> -...
> -engine = test_run:get_cfg('engine')
> ----
> -...
> -vshard.storage.internal.shard_index = 'bucket_id'
> ----
> -...
> -format = {}
> ----
> -...
> -format[1] = {name = 'id', type = 'unsigned'}
> ----
> -...
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> ----
> -...
> -_bucket = box.schema.create_space('_bucket', {format = format})
> ----
> -...
> -_ = _bucket:create_index('pk')
> ----
> -...
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> ----
> -...
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [1, 'active']
> -...
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> ----
> -- [2, 'receiving']
> -...
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> ----
> -- [3, 'active']
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> ----
> -- [4, 'sent']
> -...
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [5, 'garbage']
> -...
> -s = box.schema.create_space('test', {engine = engine})
> ----
> -...
> -pk = s:create_index('pk')
> ----
> -...
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s:replace{2, 1}
> ----
> -- [2, 1]
> -...
> -s:replace{3, 2}
> ----
> -- [3, 2]
> -...
> -s:replace{4, 2}
> ----
> -- [4, 2]
> -...
> -s:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s:replace{6, 100}
> ----
> -- [6, 100]
> -...
> -s:replace{7, 4}
> ----
> -- [7, 4]
> -...
> -s:replace{8, 5}
> ----
> -- [8, 5]
> -...
> -s2 = box.schema.create_space('test2', {engine = engine})
> ----
> -...
> -pk2 = s2:create_index('pk')
> ----
> -...
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> ----
> -...
> -s2:replace{1, 1}
> ----
> -- [1, 1]
> -...
> -s2:replace{3, 3}
> ----
> -- [3, 3]
> -...
> -for i = 7, 1107 do s:replace{i, 200} end
> ----
> -...
> -s2:replace{4, 200}
> ----
> -- [4, 200]
> -...
> -s2:replace{5, 100}
> ----
> -- [5, 100]
> -...
> -s2:replace{5, 300}
> ----
> -- [5, 300]
> -...
> -s2:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -s2:replace{7, 5}
> ----
> -- [7, 5]
> -...
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> ----
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- - 4
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 5
> -- true
> -...
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> ----
> -...
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> ----
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> ----
> -...
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> ----
> -...
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> ----
> -- [4, 'garbage']
> -...
> -s:replace{5, 4}
> ----
> -- [5, 4]
> -...
> -s:replace{6, 4}
> ----
> -- [6, 4]
> -...
> -#s:select{}
> ----
> -- 2
> -...
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> ----
> -...
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> ----
> -...
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> ----
> -- 2
> -...
> -_bucket:select{4}
> ----
> -- - [4, 'garbage']
> -...
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> ----
> -- []
> -- true
> -...
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> ----
> -- - 4
> - - 5
> -- true
> -...
> -#s:select{}
> ----
> -- 0
> -...
> -_bucket:delete{4}
> ----
> -- [4, 'garbage']
> -...
> -s2:drop()
> ----
> -...
> -s:drop()
> ----
> -...
> -_bucket:drop()
> ----
> -...
> diff --git a/test/unit/garbage_errinj.test.lua b/test/unit/garbage_errinj.test.lua
> deleted file mode 100644
> index 31184b9..0000000
> --- a/test/unit/garbage_errinj.test.lua
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -test_run = require('test_run').new()
> -vshard = require('vshard')
> -fiber = require('fiber')
> -
> -engine = test_run:get_cfg('engine')
> -vshard.storage.internal.shard_index = 'bucket_id'
> -
> -format = {}
> -format[1] = {name = 'id', type = 'unsigned'}
> -format[2] = {name = 'status', type = 'string', is_nullable = true}
> -_bucket = box.schema.create_space('_bucket', {format = format})
> -_ = _bucket:create_index('pk')
> -_ = _bucket:create_index('status', {parts = {{2, 'string'}}, unique = false})
> -_bucket:replace{1, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{2, vshard.consts.BUCKET.RECEIVING}
> -_bucket:replace{3, vshard.consts.BUCKET.ACTIVE}
> -_bucket:replace{4, vshard.consts.BUCKET.SENT}
> -_bucket:replace{5, vshard.consts.BUCKET.GARBAGE}
> -
> -s = box.schema.create_space('test', {engine = engine})
> -pk = s:create_index('pk')
> -sk = s:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s:replace{1, 1}
> -s:replace{2, 1}
> -s:replace{3, 2}
> -s:replace{4, 2}
> -s:replace{5, 100}
> -s:replace{6, 100}
> -s:replace{7, 4}
> -s:replace{8, 5}
> -
> -s2 = box.schema.create_space('test2', {engine = engine})
> -pk2 = s2:create_index('pk')
> -sk2 = s2:create_index('bucket_id', {parts = {{2, 'unsigned'}}, unique = false})
> -s2:replace{1, 1}
> -s2:replace{3, 3}
> -for i = 7, 1107 do s:replace{i, 200} end
> -s2:replace{4, 200}
> -s2:replace{5, 100}
> -s2:replace{5, 300}
> -s2:replace{6, 4}
> -s2:replace{7, 5}
> -
> -gc_bucket_step_by_type = vshard.storage.internal.gc_bucket_step_by_type
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -
> ---
> --- Test _bucket generation change during garbage buckets search.
> ---
> -s:truncate()
> -_ = _bucket:on_replace(function() vshard.storage.internal.bucket_generation = vshard.storage.internal.bucket_generation + 1 end)
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = true
> -f = fiber.create(function() gc_bucket_step_by_type(vshard.consts.BUCKET.SENT) gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE) end)
> -_bucket:replace{4, vshard.consts.BUCKET.GARBAGE}
> -s:replace{5, 4}
> -s:replace{6, 4}
> -#s:select{}
> -vshard.storage.internal.errinj.ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false
> -while f:status() ~= 'dead' do fiber.sleep(0.1) end
> --- Nothing is deleted - _bucket:replace() has changed _bucket
> --- generation during search of garbage buckets.
> -#s:select{}
> -_bucket:select{4}
> --- Next step deletes garbage ok.
> -gc_bucket_step_by_type(vshard.consts.BUCKET.SENT)
> -gc_bucket_step_by_type(vshard.consts.BUCKET.GARBAGE)
> -#s:select{}
> -_bucket:delete{4}
> -
> -s2:drop()
> -s:drop()
> -_bucket:drop()
> diff --git a/vshard/cfg.lua b/vshard/cfg.lua
> index f7d5dbc..63d5414 100644
> --- a/vshard/cfg.lua
> +++ b/vshard/cfg.lua
> @@ -251,9 +251,8 @@ local cfg_template = {
> max = consts.REBALANCER_MAX_SENDING_MAX
> },
> collect_bucket_garbage_interval = {
> - type = 'positive number', name = 'Garbage bucket collect interval',
> - is_optional = true,
> - default = consts.DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL
> + name = 'Garbage bucket collect interval', is_deprecated = true,
> + reason = 'Has no effect anymore'
> },
> collect_lua_garbage = {
> type = 'boolean', name = 'Garbage Lua collect necessity',
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 8c2a8b0..3f1585a 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -23,6 +23,7 @@ return {
> DEFAULT_BUCKET_COUNT = 3000;
> BUCKET_SENT_GARBAGE_DELAY = 0.5;
> BUCKET_CHUNK_SIZE = 1000;
> + LUA_CHUNK_SIZE = 100000,
> DEFAULT_REBALANCER_DISBALANCE_THRESHOLD = 1;
> REBALANCER_IDLE_INTERVAL = 60 * 60;
> REBALANCER_WORK_INTERVAL = 10;
> @@ -37,7 +38,7 @@ return {
> DEFAULT_FAILOVER_PING_TIMEOUT = 5;
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> - DEFAULT_COLLECT_BUCKET_GARBAGE_INTERVAL = 0.5;
> + GC_BACKOFF_INTERVAL = 5,
> RECOVERY_INTERVAL = 5;
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> @@ -45,4 +46,6 @@ return {
> DISCOVERY_WORK_INTERVAL = 1,
> DISCOVERY_WORK_STEP = 0.01,
> DISCOVERY_TIMEOUT = 10,
> +
> + TIMEOUT_INFINITY = 500 * 365 * 86400,
> }
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index adf1c20..1ea8069 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -69,7 +69,6 @@ if not M then
> total_bucket_count = 0,
> errinj = {
> ERRINJ_CFG = false,
> - ERRINJ_BUCKET_FIND_GARBAGE_DELAY = false,
> ERRINJ_RELOAD = false,
> ERRINJ_CFG_DELAY = false,
> ERRINJ_LONG_RECEIVE = false,
> @@ -96,6 +95,8 @@ if not M then
> -- detect that _bucket was not changed between yields.
> --
> bucket_generation = 0,
> + -- Condition variable fired on generation update.
> + bucket_generation_cond = lfiber.cond(),
> --
> -- Reference to the function used as on_replace trigger on
> -- _bucket space. It is used to replace the trigger with
> @@ -107,12 +108,14 @@ if not M then
> -- replace the old function is to keep its reference.
> --
> bucket_on_replace = nil,
> + -- Redirects for recently sent buckets. They are kept for a while to
> +    -- help routers find a new location for sent and deleted buckets
> +    -- without a whole cluster scan.
> + route_map = {},
>
> ------------------- Garbage collection -------------------
> -- Fiber to remove garbage buckets data.
> collect_bucket_garbage_fiber = nil,
> - -- Do buckets garbage collection once per this time.
> - collect_bucket_garbage_interval = nil,
> -- Boolean lua_gc state (create periodic gc task).
> collect_lua_garbage = nil,
>
> @@ -173,6 +176,7 @@ end
> --
> local function bucket_generation_increment()
> M.bucket_generation = M.bucket_generation + 1
> + M.bucket_generation_cond:broadcast()
> end
>
> --
> @@ -758,8 +762,9 @@ local function bucket_check_state(bucket_id, mode)
> else
> return bucket
> end
> + local dst = bucket and bucket.destination or M.route_map[bucket_id]
> return bucket, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id, reason,
> - bucket and bucket.destination)
> + dst)
> end
>
> --
> @@ -804,11 +809,23 @@ end
> --
> local function bucket_unrefro(bucket_id)
> local ref = M.bucket_refs[bucket_id]
> - if not ref or ref.ro == 0 then
> + local count = ref and ref.ro or 0
> + if count == 0 then
> return nil, lerror.vshard(lerror.code.WRONG_BUCKET, bucket_id,
> "no refs", nil)
> end
> - ref.ro = ref.ro - 1
> + if count == 1 then
> + ref.ro = 0
> + if ref.ro_lock then
> + -- Garbage collector is waiting for the bucket if RO
> + -- is locked. Let it know it has one more bucket to
> + -- collect. It relies on generation, so its increment
> +            -- is enough.
> + bucket_generation_increment()
> + end
> + return true
> + end
> + ref.ro = count - 1
> return true
> end
>
> @@ -1481,79 +1498,44 @@ local function gc_bucket_in_space(space, bucket_id, status)
> end
>
> --
> --- Remove tuples from buckets of a specified type.
> --- @param type Type of buckets to gc.
> --- @retval List of ids of empty buckets of the type.
> +-- Drop buckets with the given status along with their data in all spaces.
> +-- @param status Status of target buckets.
> +-- @param route_map Destinations of deleted buckets are saved into this table.
> --
> -local function gc_bucket_step_by_type(type)
> - local sharded_spaces = find_sharded_spaces()
> - local empty_buckets = {}
> +local function gc_bucket_drop_xc(status, route_map)
> local limit = consts.BUCKET_CHUNK_SIZE
> - local is_all_collected = true
> - for _, bucket in box.space._bucket.index.status:pairs(type) do
> - local bucket_id = bucket.id
> - local ref = M.bucket_refs[bucket_id]
> + local _bucket = box.space._bucket
> + local sharded_spaces = find_sharded_spaces()
> + for _, b in _bucket.index.status:pairs(status) do
> + local id = b.id
> + local ref = M.bucket_refs[id]
> if ref then
> assert(ref.rw == 0)
> if ref.ro ~= 0 then
> ref.ro_lock = true
> - is_all_collected = false
> goto continue
> end
> - M.bucket_refs[bucket_id] = nil
> + M.bucket_refs[id] = nil
> end
> for _, space in pairs(sharded_spaces) do
> - gc_bucket_in_space_xc(space, bucket_id, type)
> + gc_bucket_in_space_xc(space, id, status)
> limit = limit - 1
> if limit == 0 then
> lfiber.sleep(0)
> limit = consts.BUCKET_CHUNK_SIZE
> end
> end
> - table.insert(empty_buckets, bucket.id)
> -::continue::
> + route_map[id] = b.destination
> + _bucket:delete{id}
> + ::continue::
> end
> - return empty_buckets, is_all_collected
> -end
> -
> ---
> --- Drop buckets with ids in the list.
> --- @param bucket_ids Bucket ids to drop.
> --- @param status Expected bucket status.
> ---
> -local function gc_bucket_drop_xc(bucket_ids, status)
> - if #bucket_ids == 0 then
> - return
> - end
> - local limit = consts.BUCKET_CHUNK_SIZE
> - box.begin()
> - local _bucket = box.space._bucket
> - for _, id in pairs(bucket_ids) do
> - local bucket_exists = _bucket:get{id} ~= nil
> - local b = _bucket:get{id}
> - if b then
> - if b.status ~= status then
> - return error(string.format('Bucket %d status is changed. Was '..
> - '%s, became %s', id, status,
> - b.status))
> - end
> - _bucket:delete{id}
> - end
> - limit = limit - 1
> - if limit == 0 then
> - box.commit()
> - box.begin()
> - limit = consts.BUCKET_CHUNK_SIZE
> - end
> - end
> - box.commit()
> end
>
> --
> -- Exception safe version of gc_bucket_drop_xc.
> --
> -local function gc_bucket_drop(bucket_ids, status)
> - local status, err = pcall(gc_bucket_drop_xc, bucket_ids, status)
> +local function gc_bucket_drop(status, route_map)
> + local status, err = pcall(gc_bucket_drop_xc, status, route_map)
> if not status then
> box.rollback()
> end
> @@ -1561,14 +1543,16 @@ local function gc_bucket_drop(bucket_ids, status)
> end
>
> --
> --- Garbage collector. Works on masters. The garbage collector
> --- wakes up once per specified time.
> +-- Garbage collector. Works on masters. The garbage collector wakes up when
> +-- state of any bucket changes.
> -- After wakeup it follows the plan:
> --- 1) Check if _bucket has changed. If not, then sleep again;
> --- 2) Scan user spaces for sent and garbage buckets, delete
> --- garbage data in batches of limited size;
> --- 3) Delete GARBAGE buckets from _bucket immediately, and
> --- schedule SENT buckets for deletion after a timeout;
> +-- 1) Check if state of any bucket has really changed. If not, then sleep again;
> +-- 2) Delete all GARBAGE and SENT buckets along with their data in chunks of
> +-- limited size.
> +-- 3) Bucket destinations are saved into a global route_map to reroute incoming
> +-- requests from routers in case they didn't notice the buckets being moved.
> +-- The saved routes are scheduled for deletion after a timeout, which is
> +-- checked on each iteration of this loop.
> -- 4) Sleep, go to (1).
> -- For each step details see comments in the code.
> --
> @@ -1580,65 +1564,75 @@ function gc_bucket_f()
> -- generation == bucket generation. In such a case the fiber
> -- does nothing until next _bucket change.
> local bucket_generation_collected = -1
> - -- Empty sent buckets are collected into an array. After a
> - -- specified time interval the buckets are deleted both from
> - -- this array and from _bucket space.
> - local buckets_for_redirect = {}
> - local buckets_for_redirect_ts = fiber_clock()
> - -- Empty sent buckets, updated after each step, and when
> - -- buckets_for_redirect is deleted, it gets empty_sent_buckets
> - -- for next deletion.
> - local empty_garbage_buckets, empty_sent_buckets, status, err
> + local bucket_generation_current = M.bucket_generation
> + -- Deleted buckets are saved into a route map to redirect routers if they
> + -- didn't discover new location of the buckets yet. However route map does
> + -- not grow infinitely. Otherwise it would end up storing redirects for all
> +    -- buckets in the cluster, which could also be outdated.
> + -- Garbage collector periodically drops old routes from the map. For that it
> + -- remembers state of route map in one moment, and after a while clears the
> + -- remembered routes from the global route map.
> + local route_map = M.route_map
> + local route_map_old = {}
> + local route_map_deadline = 0
> + local status, err
> while M.module_version == module_version do
> - -- Check if no changes in buckets configuration.
> - if bucket_generation_collected ~= M.bucket_generation then
> - local bucket_generation = M.bucket_generation
> - local is_sent_collected, is_garbage_collected
> - status, empty_garbage_buckets, is_garbage_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.GARBAGE)
> - if not status then
> - err = empty_garbage_buckets
> - goto check_error
> - end
> - status, empty_sent_buckets, is_sent_collected =
> - pcall(gc_bucket_step_by_type, consts.BUCKET.SENT)
> - if not status then
> - err = empty_sent_buckets
> - goto check_error
> + if bucket_generation_collected ~= bucket_generation_current then
> + status, err = gc_bucket_drop(consts.BUCKET.GARBAGE, route_map)
> + if status then
> + status, err = gc_bucket_drop(consts.BUCKET.SENT, route_map)
> end
> - status, err = gc_bucket_drop(empty_garbage_buckets,
> - consts.BUCKET.GARBAGE)
> -::check_error::
> if not status then
> box.rollback()
> log.error('Error during garbage collection step: %s', err)
> - goto continue
> + else
> + -- Don't use global generation. During the collection it could
> + -- already change. Instead, remember the generation known before
> + -- the collection has started.
> + -- Since the collection also changes the generation, it makes
> +                -- the GC always happen at least twice. But typically on the
> + -- second iteration it should not find any buckets to collect,
> + -- and then the collected generation matches the global one.
> + bucket_generation_collected = bucket_generation_current
> end
> - if is_sent_collected and is_garbage_collected then
> - bucket_generation_collected = bucket_generation
> + else
> + status = true
> + end
> +
> + local sleep_time = route_map_deadline - fiber_clock()
> + if sleep_time <= 0 then
> + local chunk = consts.LUA_CHUNK_SIZE
> + util.table_minus_yield(route_map, route_map_old, chunk)
> + route_map_old = util.table_copy_yield(route_map, chunk)
> + if next(route_map_old) then
> + sleep_time = consts.BUCKET_SENT_GARBAGE_DELAY
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> end
> + route_map_deadline = fiber_clock() + sleep_time
> end
> + bucket_generation_current = M.bucket_generation
>
> - if fiber_clock() - buckets_for_redirect_ts >=
> - consts.BUCKET_SENT_GARBAGE_DELAY then
> - status, err = gc_bucket_drop(buckets_for_redirect,
> - consts.BUCKET.SENT)
> - if not status then
> - buckets_for_redirect = {}
> - empty_sent_buckets = {}
> - bucket_generation_collected = -1
> - log.error('Error during deletion of empty sent buckets: %s',
> - err)
> - elseif M.module_version ~= module_version then
> - return
> + if bucket_generation_current ~= bucket_generation_collected then
> + -- Generation was changed during collection. Or *by* collection.
> + if status then
> + -- Retry immediately. If the generation was changed by the
> + -- collection itself, it will notice it next iteration, and go
> + -- to proper sleep.
> + sleep_time = 0
> else
> - buckets_for_redirect = empty_sent_buckets or {}
> - empty_sent_buckets = nil
> - buckets_for_redirect_ts = fiber_clock()
> +                -- An error happened during the collection. It makes no sense
> + -- to retry on each iteration of the event loop. The most likely
> + -- errors are either a WAL error or a transaction abort - both
> + -- look like an issue in the user's code and can't be fixed
> + -- quickly anyway. Backoff.
> + sleep_time = consts.GC_BACKOFF_INTERVAL
> end
> end
> -::continue::
> - lfiber.sleep(M.collect_bucket_garbage_interval)
> +
> + if M.module_version == module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> + end
> end
> end
>
> @@ -2423,8 +2417,6 @@ local function storage_cfg(cfg, this_replica_uuid, is_reload)
> vshard_cfg.rebalancer_disbalance_threshold
> M.rebalancer_receiving_quota = vshard_cfg.rebalancer_max_receiving
> M.shard_index = vshard_cfg.shard_index
> - M.collect_bucket_garbage_interval =
> - vshard_cfg.collect_bucket_garbage_interval
> M.collect_lua_garbage = vshard_cfg.collect_lua_garbage
> M.rebalancer_worker_count = vshard_cfg.rebalancer_max_sending
> M.current_cfg = cfg
> @@ -2678,6 +2670,9 @@ else
> storage_cfg(M.current_cfg, M.this_replica.uuid, true)
> end
> M.module_version = M.module_version + 1
> + -- Background fibers could sleep waiting for bucket changes.
> + -- Let them know it is time to reload.
> + bucket_generation_increment()
> end
>
> M.recovery_f = recovery_f
> @@ -2688,7 +2683,7 @@ M.gc_bucket_f = gc_bucket_f
> -- These functions are saved in M not for atomic reload, but for
> -- unit testing.
> --
> -M.gc_bucket_step_by_type = gc_bucket_step_by_type
> +M.gc_bucket_drop = gc_bucket_drop
> M.rebalancer_build_routes = rebalancer_build_routes
> M.rebalancer_calculate_metrics = rebalancer_calculate_metrics
> M.cached_find_sharded_spaces = find_sharded_spaces
> diff --git a/vshard/storage/reload_evolution.lua b/vshard/storage/reload_evolution.lua
> index f38af74..484f499 100644
> --- a/vshard/storage/reload_evolution.lua
> +++ b/vshard/storage/reload_evolution.lua
> @@ -4,6 +4,7 @@
> -- in a commit.
> --
> local log = require('log')
> +local fiber = require('fiber')
>
> --
> -- Array of upgrade functions.
> @@ -25,6 +26,13 @@ migrations[#migrations + 1] = function(M)
> end
> end
>
> +migrations[#migrations + 1] = function(M)
> + if not M.route_map then
> + M.bucket_generation_cond = fiber.cond()
> + M.route_map = {}
> + end
> +end
> +
> --
> -- Perform an update based on a version stored in `M` (internals).
> -- @param M Old module internals which should be updated.
^ permalink raw reply [flat|nested] 36+ messages in thread
* [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (6 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:51 ` [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Recovery is a fiber on a master node which tries to resolve
SENDING/RECEIVING buckets into GARBAGE or ACTIVE, in case they are
stuck. Usually it happens due to a conflict on the receiving side,
or if a restart happens during bucket send.
Recovery was proactive. It used to wake up with a constant period
to find and resolve the needed buckets.
But this won't work with the future feature called 'map-reduce'.
Map-reduce as a preparation stage will need to ensure that all
buckets on a storage are readable and writable. With the current
recovery algorithm if a bucket is broken, it won't be recovered
for the next 5 seconds by default. During this time all new
map-reduce requests can't execute.
This is not acceptable. Neither is a too frequent wakeup of the
recovery fiber, because it would waste TX thread time.
The patch makes recovery fiber wakeup not by a timeout but by
events happening with _bucket space. Recovery fiber sleeps on a
condition variable which is signaled when _bucket is changed.
This is very similar to the reactive GC feature in a previous
commit.
It is worth mentioning that the backoff happens not only when a
bucket couldn't be recovered (its transfer is still in progress,
for example), but also when a network error happened and recovery
couldn't check the state of the bucket on the other storage.
It would be a useless busy loop to retry network errors
immediately after their appearance. Recovery uses a backoff
interval for them as well.
Needed for #147
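
The reactive wakeup pattern described above (sleep on a condition
variable, with a timeout chosen depending on whether the last step
succeeded) can be sketched roughly as follows. This is a simplified
illustration using Tarantool's fiber API, not the actual vshard code;
`recovery_step()` and the constants are placeholders:

```lua
local fiber = require('fiber')

-- Incremented and signaled by the _bucket:on_replace trigger on
-- every bucket state change.
local bucket_generation = 0
local bucket_generation_cond = fiber.cond()

local BACKOFF_INTERVAL = 5
local TIMEOUT_INFINITY = 500 * 365 * 86400

local function on_bucket_change()
    bucket_generation = bucket_generation + 1
    bucket_generation_cond:broadcast()
end

local function recovery_f()
    local recovered_generation = -1
    while true do
        local current_generation = bucket_generation
        local ok = true
        if recovered_generation ~= current_generation then
            -- recovery_step() stands in for the real work:
            -- resolving stuck SENDING/RECEIVING buckets.
            ok = pcall(recovery_step)
            if ok then
                recovered_generation = current_generation
            end
        end
        local sleep_time
        if not ok then
            -- Network or WAL errors: retrying immediately would be
            -- a useless busy loop, so back off.
            sleep_time = BACKOFF_INTERVAL
        elseif bucket_generation ~= recovered_generation then
            -- _bucket changed during the step. Retry right away.
            sleep_time = 0
        else
            -- Nothing to do until the next _bucket change.
            sleep_time = TIMEOUT_INFINITY
        end
        -- Returns early when on_bucket_change() broadcasts.
        bucket_generation_cond:wait(sleep_time)
    end
end
```

The key point is that the fiber never polls on a fixed interval: it is
woken by `broadcast()` the moment `_bucket` changes, and otherwise
sleeps either "forever" or for a backoff interval after an error.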
---
test/router/router.result | 22 ++++++++---
test/router/router.test.lua | 13 ++++++-
test/storage/recovery.result | 8 ++++
test/storage/recovery.test.lua | 5 +++
test/storage/recovery_errinj.result | 16 +++++++-
test/storage/recovery_errinj.test.lua | 9 ++++-
vshard/consts.lua | 2 +-
vshard/storage/init.lua | 54 +++++++++++++++++++++++----
8 files changed, 110 insertions(+), 19 deletions(-)
diff --git a/test/router/router.result b/test/router/router.result
index b2efd6d..3c1d073 100644
--- a/test/router/router.result
+++ b/test/router/router.result
@@ -312,6 +312,11 @@ replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
_ = test_run:switch('storage_2_a')
---
...
+-- Pause recovery. It is too aggressive, and the test needs to see buckets in
+-- their intermediate states.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
---
- [1, 'sending', '<replicaset_1>']
@@ -319,6 +324,9 @@ box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]}
_ = test_run:switch('storage_1_a')
---
...
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
---
- [1, 'receiving', '<replicaset_2>']
@@ -342,19 +350,21 @@ util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
name: TRANSFER_IS_IN_PROGRESS
message: Bucket 1 is transferring to replicaset <replicaset_1>
...
-_ = test_run:switch('storage_2_a')
+_ = test_run:switch('storage_1_a')
+---
+...
+box.space._bucket:delete({1})
---
+- [1, 'receiving', '<replicaset_2>']
...
-box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
---
-- [1, 'active']
...
-_ = test_run:switch('storage_1_a')
+_ = test_run:switch('storage_2_a')
---
...
-box.space._bucket:delete({1})
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
---
-- [1, 'receiving', '<replicaset_2>']
...
_ = test_run:switch('router_1')
---
diff --git a/test/router/router.test.lua b/test/router/router.test.lua
index 154310b..aa3eb3b 100644
--- a/test/router/router.test.lua
+++ b/test/router/router.test.lua
@@ -114,19 +114,28 @@ replicaset, err = vshard.router.bucket_discovery(1); return err == nil or err
replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
_ = test_run:switch('storage_2_a')
+-- Pause recovery. It is too aggressive, and the test needs to see buckets in
+-- their intermediate states.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
+
_ = test_run:switch('storage_1_a')
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
+
_ = test_run:switch('router_1')
-- Ok to read sending bucket.
vshard.router.call(1, 'read', 'echo', {123})
-- Not ok to write sending bucket.
util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
-_ = test_run:switch('storage_2_a')
-box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
_ = test_run:switch('storage_1_a')
box.space._bucket:delete({1})
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+
+_ = test_run:switch('storage_2_a')
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+
_ = test_run:switch('router_1')
-- Check unavailability of master of a replicaset.
diff --git a/test/storage/recovery.result b/test/storage/recovery.result
index 8ccb0b9..fa92bca 100644
--- a/test/storage/recovery.result
+++ b/test/storage/recovery.result
@@ -28,12 +28,20 @@ util.push_rs_filters(test_run)
_ = test_run:switch("storage_2_a")
---
...
+-- Pause until restart. Otherwise recovery does its job too fast and does not
+-- allow simulating the intermediate state.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
vshard.storage.rebalancer_disable()
---
...
_ = test_run:switch("storage_1_a")
---
...
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
-- Create buckets sending to rs2 and restart - recovery must
-- garbage some of them and activate others. Receiving buckets
-- must be garbaged on bootstrap.
diff --git a/test/storage/recovery.test.lua b/test/storage/recovery.test.lua
index a0651e8..93cec68 100644
--- a/test/storage/recovery.test.lua
+++ b/test/storage/recovery.test.lua
@@ -10,8 +10,13 @@ util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
util.push_rs_filters(test_run)
_ = test_run:switch("storage_2_a")
+-- Pause until restart. Otherwise recovery does its job too fast and does not
+-- allow simulating the intermediate state.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
vshard.storage.rebalancer_disable()
+
_ = test_run:switch("storage_1_a")
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
-- Create buckets sending to rs2 and restart - recovery must
-- garbage some of them and activate others. Receiving buckets
diff --git a/test/storage/recovery_errinj.result b/test/storage/recovery_errinj.result
index 3e9a9bf..8c178d5 100644
--- a/test/storage/recovery_errinj.result
+++ b/test/storage/recovery_errinj.result
@@ -35,9 +35,17 @@ _ = test_run:switch('storage_2_a')
vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
---
...
+-- Pause recovery. Otherwise it does its job too fast and does not allow
+-- simulating the intermediate state.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
_ = test_run:switch('storage_1_a')
---
...
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+---
+...
_bucket = box.space._bucket
---
...
@@ -76,10 +84,16 @@ _bucket:get{1}
---
- [1, 'active']
...
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+---
+...
_ = test_run:switch('storage_1_a')
---
...
-while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+---
+...
+wait_bucket_is_collected(1)
---
...
_ = test_run:switch("default")
diff --git a/test/storage/recovery_errinj.test.lua b/test/storage/recovery_errinj.test.lua
index 8c1a9d2..c730560 100644
--- a/test/storage/recovery_errinj.test.lua
+++ b/test/storage/recovery_errinj.test.lua
@@ -14,7 +14,12 @@ util.push_rs_filters(test_run)
--
_ = test_run:switch('storage_2_a')
vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
+-- Pause recovery. Otherwise it does its job too fast and does not allow
+-- simulating the intermediate state.
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
+
_ = test_run:switch('storage_1_a')
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
_bucket = box.space._bucket
_bucket:replace{1, vshard.consts.BUCKET.ACTIVE, util.replicasets[2]}
ret, err = vshard.storage.bucket_send(1, util.replicasets[2], {timeout = 0.1})
@@ -27,9 +32,11 @@ vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = false
_bucket = box.space._bucket
while _bucket:get{1}.status ~= vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.01) end
_bucket:get{1}
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
_ = test_run:switch('storage_1_a')
-while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
+vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
+wait_bucket_is_collected(1)
_ = test_run:switch("default")
test_run:drop_cluster(REPLICASET_2)
diff --git a/vshard/consts.lua b/vshard/consts.lua
index 3f1585a..cf3f422 100644
--- a/vshard/consts.lua
+++ b/vshard/consts.lua
@@ -39,7 +39,7 @@ return {
DEFAULT_SYNC_TIMEOUT = 1;
RECONNECT_TIMEOUT = 0.5;
GC_BACKOFF_INTERVAL = 5,
- RECOVERY_INTERVAL = 5;
+ RECOVERY_BACKOFF_INTERVAL = 5,
COLLECT_LUA_GARBAGE_INTERVAL = 100;
DISCOVERY_IDLE_INTERVAL = 10,
diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
index 31a6fc7..85f5024 100644
--- a/vshard/storage/init.lua
+++ b/vshard/storage/init.lua
@@ -634,13 +634,16 @@ end
-- Infinite function to resolve status of buckets, whose 'sending'
-- has failed due to tarantool or network problems. Restarts on
-- reload.
--- @param module_version Module version, on which the current
--- function had been started. If the actual module version
--- appears to be changed, then stop recovery. It is
--- restarted in reloadable_fiber.
--
local function recovery_f()
local module_version = M.module_version
+    -- Changes of _bucket increment the bucket generation. Recovery has its own
+    -- bucket generation which is <= the actual one. Recovery is finished when its
+ -- generation == bucket generation. In such a case the fiber does nothing
+ -- until next _bucket change.
+ local bucket_generation_recovered = -1
+ local bucket_generation_current = M.bucket_generation
+ local ok, sleep_time, is_all_recovered, total, recovered
-- Interrupt recovery if a module has been reloaded. Perhaps,
-- there was found a bug, and reload fixes it.
while module_version == M.module_version do
@@ -648,22 +651,57 @@ local function recovery_f()
lfiber.yield()
goto continue
end
- local ok, total, recovered = pcall(recovery_step_by_type,
- consts.BUCKET.SENDING)
+ is_all_recovered = true
+ if bucket_generation_recovered == bucket_generation_current then
+ goto sleep
+ end
+
+ ok, total, recovered = pcall(recovery_step_by_type,
+ consts.BUCKET.SENDING)
if not ok then
+ is_all_recovered = false
log.error('Error during sending buckets recovery: %s', total)
+ elseif total ~= recovered then
+ is_all_recovered = false
end
+
ok, total, recovered = pcall(recovery_step_by_type,
consts.BUCKET.RECEIVING)
if not ok then
+ is_all_recovered = false
log.error('Error during receiving buckets recovery: %s', total)
elseif total == 0 then
bucket_receiving_quota_reset()
else
bucket_receiving_quota_add(recovered)
+ if total ~= recovered then
+ is_all_recovered = false
+ end
+ end
+
+ ::sleep::
+ if not is_all_recovered then
+ bucket_generation_recovered = -1
+ else
+ bucket_generation_recovered = bucket_generation_current
+ end
+ bucket_generation_current = M.bucket_generation
+
+ if not is_all_recovered then
+ -- One option - some buckets are not broken. Their transmission is
+ -- still in progress. Don't need to retry immediately. Another
+ -- option - network errors when tried to repair the buckets. Also no
+ -- need to retry often. It won't help.
+ sleep_time = consts.RECOVERY_BACKOFF_INTERVAL
+ elseif bucket_generation_recovered ~= bucket_generation_current then
+ sleep_time = 0
+ else
+ sleep_time = consts.TIMEOUT_INFINITY
+ end
+ if module_version == M.module_version then
+ M.bucket_generation_cond:wait(sleep_time)
end
- lfiber.sleep(consts.RECOVERY_INTERVAL)
- ::continue::
+ ::continue::
end
end
--
2.24.3 (Apple Git-128)
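The wakeup policy implemented by the hunk above (backoff after errors, immediate retry while the generations differ, infinite wait once everything is recovered) can be sketched in isolation. This is an illustration only, not code from the patch; it assumes a Tarantool runtime and its `fiber.cond()` condition variable, and the `step` callback stands in for `recovery_step_by_type`:

```lua
-- Sketch of the reactive wakeup pattern used by recovery_f().
-- Requires Tarantool: fiber.cond() is its condition variable with
-- a wait(timeout) method, signaled here on every _bucket change.
local fiber = require('fiber')

local BACKOFF = 5
local INFINITY = 500 * 365 * 86400

local state = {
    generation = 0,      -- incremented on every _bucket change
    cond = fiber.cond(), -- broadcast on every _bucket change
}

local function recovery_loop(step)
    local recovered_gen = -1
    local current_gen = state.generation
    while true do
        local is_all_recovered = true
        if recovered_gen ~= current_gen then
            -- step() returns true when all broken buckets are resolved.
            is_all_recovered = step()
        end
        recovered_gen = is_all_recovered and current_gen or -1
        current_gen = state.generation
        local sleep_time
        if not is_all_recovered then
            sleep_time = BACKOFF   -- errors, or transfers still in progress
        elseif recovered_gen ~= current_gen then
            sleep_time = 0         -- _bucket changed while we were working
        else
            sleep_time = INFINITY  -- nothing to do until the next change
        end
        state.cond:wait(sleep_time)
    end
end
```

A writer to `_bucket` would then do `state.generation = state.generation + 1` followed by `state.cond:broadcast()` to wake the loop up immediately, which is the role the on-replace trigger on `_bucket` plays in the patch.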
* Re: [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 9:00 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch. LGTM.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Recovery is a fiber on a master node which tries to resolve
> SENDING/RECEIVING buckets into GARBAGE or ACTIVE, in case they are
> stuck. Usually it happens due to a conflict on the receiving side,
> or if a restart happens during bucket send.
>
> Recovery was proactive. It used to wake up with a constant period
> to find and resolve the needed buckets.
>
> But this won't work with the future feature called 'map-reduce'.
> Map-reduce as a preparation stage will need to ensure that all
> buckets on a storage are readable and writable. With the current
> recovery algorithm if a bucket is broken, it won't be recovered
> for the next 5 seconds by default. During this time all new
> map-reduce requests can't execute.
>
> This is not acceptable. Nor is too frequent wakeup of the recovery
> fiber, because it would waste TX thread time.
>
> The patch makes recovery fiber wakeup not by a timeout but by
> events happening with _bucket space. Recovery fiber sleeps on a
> condition variable which is signaled when _bucket is changed.
>
> This is very similar to the reactive GC feature in a previous
> commit.
>
> It is worth mentioning that the backoff happens not only when a
> bucket couldn't be recovered (its transfer is still in progress,
> for example), but also when a network error happened and recovery
> couldn't check state of the bucket on the other storage.
>
> It would be a useless busy loop to retry network errors
> immediately after their appearance. Recovery uses a backoff
> interval for them as well.
>
> Needed for #147
> ---
> test/router/router.result | 22 ++++++++---
> test/router/router.test.lua | 13 ++++++-
> test/storage/recovery.result | 8 ++++
> test/storage/recovery.test.lua | 5 +++
> test/storage/recovery_errinj.result | 16 +++++++-
> test/storage/recovery_errinj.test.lua | 9 ++++-
> vshard/consts.lua | 2 +-
> vshard/storage/init.lua | 54 +++++++++++++++++++++++----
> 8 files changed, 110 insertions(+), 19 deletions(-)
>
> diff --git a/test/router/router.result b/test/router/router.result
> index b2efd6d..3c1d073 100644
> --- a/test/router/router.result
> +++ b/test/router/router.result
> @@ -312,6 +312,11 @@ replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
> _ = test_run:switch('storage_2_a')
> ---
> ...
> +-- Pause recovery. It is too aggressive, and the test needs to see buckets in
> +-- their intermediate states.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
> ---
> - [1, 'sending', '<replicaset_1>']
> @@ -319,6 +324,9 @@ box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]}
> _ = test_run:switch('storage_1_a')
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
> ---
> - [1, 'receiving', '<replicaset_2>']
> @@ -342,19 +350,21 @@ util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
> name: TRANSFER_IS_IN_PROGRESS
> message: Bucket 1 is transferring to replicaset <replicaset_1>
> ...
> -_ = test_run:switch('storage_2_a')
> +_ = test_run:switch('storage_1_a')
> +---
> +...
> +box.space._bucket:delete({1})
> ---
> +- [1, 'receiving', '<replicaset_2>']
> ...
> -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> ---
> -- [1, 'active']
> ...
> -_ = test_run:switch('storage_1_a')
> +_ = test_run:switch('storage_2_a')
> ---
> ...
> -box.space._bucket:delete({1})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> ---
> -- [1, 'receiving', '<replicaset_2>']
> ...
> _ = test_run:switch('router_1')
> ---
> diff --git a/test/router/router.test.lua b/test/router/router.test.lua
> index 154310b..aa3eb3b 100644
> --- a/test/router/router.test.lua
> +++ b/test/router/router.test.lua
> @@ -114,19 +114,28 @@ replicaset, err = vshard.router.bucket_discovery(1); return err == nil or err
> replicaset, err = vshard.router.bucket_discovery(2); return err == nil or err
>
> _ = test_run:switch('storage_2_a')
> +-- Pause recovery. It is too aggressive, and the test needs to see buckets in
> +-- their intermediate states.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> box.space._bucket:replace({1, vshard.consts.BUCKET.SENDING, util.replicasets[1]})
> +
> _ = test_run:switch('storage_1_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> box.space._bucket:replace({1, vshard.consts.BUCKET.RECEIVING, util.replicasets[2]})
> +
> _ = test_run:switch('router_1')
> -- Ok to read sending bucket.
> vshard.router.call(1, 'read', 'echo', {123})
> -- Not ok to write sending bucket.
> util.check_error(vshard.router.call, 1, 'write', 'echo', {123})
>
> -_ = test_run:switch('storage_2_a')
> -box.space._bucket:replace({1, vshard.consts.BUCKET.ACTIVE})
> _ = test_run:switch('storage_1_a')
> box.space._bucket:delete({1})
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +
> +_ = test_run:switch('storage_2_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +
> _ = test_run:switch('router_1')
>
> -- Check unavailability of master of a replicaset.
> diff --git a/test/storage/recovery.result b/test/storage/recovery.result
> index 8ccb0b9..fa92bca 100644
> --- a/test/storage/recovery.result
> +++ b/test/storage/recovery.result
> @@ -28,12 +28,20 @@ util.push_rs_filters(test_run)
> _ = test_run:switch("storage_2_a")
> ---
> ...
> +-- Pause until restart. Otherwise recovery does its job too fast and does not
> +-- allow to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> vshard.storage.rebalancer_disable()
> ---
> ...
> _ = test_run:switch("storage_1_a")
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> -- Create buckets sending to rs2 and restart - recovery must
> -- garbage some of them and activate others. Receiving buckets
> -- must be garbaged on bootstrap.
> diff --git a/test/storage/recovery.test.lua b/test/storage/recovery.test.lua
> index a0651e8..93cec68 100644
> --- a/test/storage/recovery.test.lua
> +++ b/test/storage/recovery.test.lua
> @@ -10,8 +10,13 @@ util.wait_master(test_run, REPLICASET_2, 'storage_2_a')
> util.push_rs_filters(test_run)
>
> _ = test_run:switch("storage_2_a")
> +-- Pause until restart. Otherwise recovery does its job too fast and does not
> +-- allow to simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> vshard.storage.rebalancer_disable()
> +
> _ = test_run:switch("storage_1_a")
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
>
> -- Create buckets sending to rs2 and restart - recovery must
> -- garbage some of them and activate others. Receiving buckets
> diff --git a/test/storage/recovery_errinj.result b/test/storage/recovery_errinj.result
> index 3e9a9bf..8c178d5 100644
> --- a/test/storage/recovery_errinj.result
> +++ b/test/storage/recovery_errinj.result
> @@ -35,9 +35,17 @@ _ = test_run:switch('storage_2_a')
> vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
> ---
> ...
> +-- Pause recovery. Otherwise it does its job too fast and does not allow to
> +-- simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> _ = test_run:switch('storage_1_a')
> ---
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +---
> +...
> _bucket = box.space._bucket
> ---
> ...
> @@ -76,10 +84,16 @@ _bucket:get{1}
> ---
> - [1, 'active']
> ...
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +---
> +...
> _ = test_run:switch('storage_1_a')
> ---
> ...
> -while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +---
> +...
> +wait_bucket_is_collected(1)
> ---
> ...
> _ = test_run:switch("default")
> diff --git a/test/storage/recovery_errinj.test.lua b/test/storage/recovery_errinj.test.lua
> index 8c1a9d2..c730560 100644
> --- a/test/storage/recovery_errinj.test.lua
> +++ b/test/storage/recovery_errinj.test.lua
> @@ -14,7 +14,12 @@ util.push_rs_filters(test_run)
> --
> _ = test_run:switch('storage_2_a')
> vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = true
> +-- Pause recovery. Otherwise it does its job too fast and does not allow to
> +-- simulate the intermediate state.
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> +
> _ = test_run:switch('storage_1_a')
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = true
> _bucket = box.space._bucket
> _bucket:replace{1, vshard.consts.BUCKET.ACTIVE, util.replicasets[2]}
> ret, err = vshard.storage.bucket_send(1, util.replicasets[2], {timeout = 0.1})
> @@ -27,9 +32,11 @@ vshard.storage.internal.errinj.ERRINJ_LAST_RECEIVE_DELAY = false
> _bucket = box.space._bucket
> while _bucket:get{1}.status ~= vshard.consts.BUCKET.ACTIVE do fiber.sleep(0.01) end
> _bucket:get{1}
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
>
> _ = test_run:switch('storage_1_a')
> -while _bucket:count() ~= 0 do vshard.storage.recovery_wakeup() fiber.sleep(0.1) end
> +vshard.storage.internal.errinj.ERRINJ_NO_RECOVERY = false
> +wait_bucket_is_collected(1)
>
> _ = test_run:switch("default")
> test_run:drop_cluster(REPLICASET_2)
> diff --git a/vshard/consts.lua b/vshard/consts.lua
> index 3f1585a..cf3f422 100644
> --- a/vshard/consts.lua
> +++ b/vshard/consts.lua
> @@ -39,7 +39,7 @@ return {
> DEFAULT_SYNC_TIMEOUT = 1;
> RECONNECT_TIMEOUT = 0.5;
> GC_BACKOFF_INTERVAL = 5,
> - RECOVERY_INTERVAL = 5;
> + RECOVERY_BACKOFF_INTERVAL = 5,
> COLLECT_LUA_GARBAGE_INTERVAL = 100;
>
> DISCOVERY_IDLE_INTERVAL = 10,
> diff --git a/vshard/storage/init.lua b/vshard/storage/init.lua
> index 31a6fc7..85f5024 100644
> --- a/vshard/storage/init.lua
> +++ b/vshard/storage/init.lua
> @@ -634,13 +634,16 @@ end
> -- Infinite function to resolve status of buckets, whose 'sending'
> -- has failed due to tarantool or network problems. Restarts on
> -- reload.
> --- @param module_version Module version, on which the current
> --- function had been started. If the actual module version
> --- appears to be changed, then stop recovery. It is
> --- restarted in reloadable_fiber.
> --
> local function recovery_f()
> local module_version = M.module_version
> + -- Changes of _bucket increments bucket generation. Recovery has its own
> + -- bucket generation which is <= actual. Recovery is finished, when its
> + -- generation == bucket generation. In such a case the fiber does nothing
> + -- until next _bucket change.
> + local bucket_generation_recovered = -1
> + local bucket_generation_current = M.bucket_generation
> + local ok, sleep_time, is_all_recovered, total, recovered
> -- Interrupt recovery if a module has been reloaded. Perhaps,
> -- there was found a bug, and reload fixes it.
> while module_version == M.module_version do
> @@ -648,22 +651,57 @@ local function recovery_f()
> lfiber.yield()
> goto continue
> end
> - local ok, total, recovered = pcall(recovery_step_by_type,
> - consts.BUCKET.SENDING)
> + is_all_recovered = true
> + if bucket_generation_recovered == bucket_generation_current then
> + goto sleep
> + end
> +
> + ok, total, recovered = pcall(recovery_step_by_type,
> + consts.BUCKET.SENDING)
> if not ok then
> + is_all_recovered = false
> log.error('Error during sending buckets recovery: %s', total)
> + elseif total ~= recovered then
> + is_all_recovered = false
> end
> +
> ok, total, recovered = pcall(recovery_step_by_type,
> consts.BUCKET.RECEIVING)
> if not ok then
> + is_all_recovered = false
> log.error('Error during receiving buckets recovery: %s', total)
> elseif total == 0 then
> bucket_receiving_quota_reset()
> else
> bucket_receiving_quota_add(recovered)
> + if total ~= recovered then
> + is_all_recovered = false
> + end
> + end
> +
> + ::sleep::
> + if not is_all_recovered then
> + bucket_generation_recovered = -1
> + else
> + bucket_generation_recovered = bucket_generation_current
> + end
> + bucket_generation_current = M.bucket_generation
> +
> + if not is_all_recovered then
> + -- One option - some buckets are not broken. Their transmission is
> + -- still in progress. Don't need to retry immediately. Another
> + -- option - network errors when tried to repair the buckets. Also no
> + -- need to retry often. It won't help.
> + sleep_time = consts.RECOVERY_BACKOFF_INTERVAL
> + elseif bucket_generation_recovered ~= bucket_generation_current then
> + sleep_time = 0
> + else
> + sleep_time = consts.TIMEOUT_INFINITY
> + end
> + if module_version == M.module_version then
> + M.bucket_generation_cond:wait(sleep_time)
> end
> - lfiber.sleep(consts.RECOVERY_INTERVAL)
> - ::continue::
> + ::continue::
> end
> end
>
* [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (7 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:46 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:01 ` Oleg Babin via Tarantool-patches
2021-03-05 22:03 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:51 ` [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
9 siblings, 2 replies; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:46 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Lua does not have a built-in standard library for binary heaps
(also called priority queues). There is an implementation in
Tarantool core in libsalad, but it is in C.
A heap is a perfect storage for the upcoming map-reduce feature.
In the map-reduce algorithm it will be necessary to be able to
lock an entire storage against any bucket moves for a time <= a
specified timeout. The number of map-reduce requests can be big, and
they can have different timeouts.
So there is a pile of timeouts from different requests. It is
necessary to be able to quickly add new ones, be able to delete
random ones, and remove expired ones.
One way would be a sorted array of the deadlines. Unfortunately,
it is super slow: O(N + log(N)) to add a new element (log(N) to find
the place and N to shift all following elements), and O(N) to delete
a random one (shift all following elements one cell left/right).
Another way would be a sorted tree. But trees like RB or a dumb
binary tree require extra steps to keep them balanced and to have
access to the smallest element ASAP.
The best way is the binary heap. It is perfectly balanced by
design, meaning that all operations there have complexity at most
O(log(N)). It is possible to find the closest deadline in constant
time as it is the heap's top.
This patch implements it. The heap is intrusive: it stores the
index of each element right inside the element, as a field
'index'. Having the index along with each element allows deleting
it from the heap in O(log(N)) without having to look its place up
first.
Part of #147
---
test/unit-tap/heap.test.lua | 310 ++++++++++++++++++++++++++++++++++++
test/unit-tap/suite.ini | 4 +
vshard/heap.lua | 226 ++++++++++++++++++++++++++
3 files changed, 540 insertions(+)
create mode 100755 test/unit-tap/heap.test.lua
create mode 100644 test/unit-tap/suite.ini
create mode 100644 vshard/heap.lua
diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
new file mode 100755
index 0000000..8c3819f
--- /dev/null
+++ b/test/unit-tap/heap.test.lua
@@ -0,0 +1,310 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local test = tap.test("cfg")
+local heap = require('vshard.heap')
+
+--
+-- Max number of heap to test. Number of iterations in the test
+-- grows as a factorial of this value. At 10 the test becomes
+-- too long already.
+--
+local heap_size = 8
+
+--
+-- Type of the object stored in the intrusive heap.
+--
+local function min_heap_cmp(l, r)
+ return l.value < r.value
+end
+
+local function max_heap_cmp(l, r)
+ return l.value > r.value
+end
+
+local function new_object(value)
+ return {value = value}
+end
+
+local function heap_check_indexes(heap)
+ local count = heap:count()
+ local data = heap.data
+ for i = 1, count do
+ assert(data[i].index == i)
+ end
+end
+
+local function reverse(values, i1, i2)
+ while i1 < i2 do
+ values[i1], values[i2] = values[i2], values[i1]
+ i1 = i1 + 1
+ i2 = i2 - 1
+ end
+end
+
+--
+-- Implementation of std::next_permutation() from C++.
+--
+local function next_permutation(values)
+ local count = #values
+ if count <= 1 then
+ return false
+ end
+ local i = count
+ while true do
+ local j = i
+ i = i - 1
+ if values[i] < values[j] then
+ local k = count
+ while values[i] >= values[k] do
+ k = k - 1
+ end
+ values[i], values[k] = values[k], values[i]
+ reverse(values, j, count)
+ return true
+ end
+ if i == 1 then
+ reverse(values, 1, count)
+ return false
+ end
+ end
+end
+
+local function range(count)
+ local res = {}
+ for i = 1, count do
+ res[i] = i
+ end
+ return res
+end
+
+--
+-- Min heap fill and empty.
+--
+local function test_min_heap_basic(test)
+ test:plan(1)
+
+ local h = heap.new(min_heap_cmp)
+ assert(not h:pop())
+ assert(h:count() == 0)
+ local values = {}
+ for i = 1, heap_size do
+ values[i] = new_object(i)
+ end
+ for counti = 1, heap_size do
+ local indexes = range(counti)
+ repeat
+ for i = 1, counti do
+ h:push(values[indexes[i]])
+ end
+ heap_check_indexes(h)
+ assert(h:count() == counti)
+ for i = 1, counti do
+ assert(h:top() == values[i])
+ assert(h:pop() == values[i])
+ heap_check_indexes(h)
+ end
+ assert(not h:pop())
+ assert(h:count() == 0)
+ until not next_permutation(indexes)
+ end
+
+ test:ok(true, "no asserts")
+end
+
+--
+-- Max heap fill and empty.
+--
+local function test_max_heap_basic(test)
+ test:plan(1)
+
+ local h = heap.new(max_heap_cmp)
+ assert(not h:pop())
+ assert(h:count() == 0)
+ local values = {}
+ for i = 1, heap_size do
+ values[i] = new_object(heap_size - i + 1)
+ end
+ for counti = 1, heap_size do
+ local indexes = range(counti)
+ repeat
+ for i = 1, counti do
+ h:push(values[indexes[i]])
+ end
+ heap_check_indexes(h)
+ assert(h:count() == counti)
+ for i = 1, counti do
+ assert(h:top() == values[i])
+ assert(h:pop() == values[i])
+ heap_check_indexes(h)
+ end
+ assert(not h:pop())
+ assert(h:count() == 0)
+ until not next_permutation(indexes)
+ end
+
+ test:ok(true, "no asserts")
+end
+
+--
+-- Min heap update top element.
+--
+local function test_min_heap_update_top(test)
+ test:plan(1)
+
+ local h = heap.new(min_heap_cmp)
+ for counti = 1, heap_size do
+ local indexes = range(counti)
+ repeat
+ local values = {}
+ for i = 1, counti do
+ values[i] = new_object(0)
+ h:push(values[i])
+ end
+ heap_check_indexes(h)
+ for i = 1, counti do
+ h:top().value = indexes[i]
+ h:update_top()
+ end
+ heap_check_indexes(h)
+ assert(h:count() == counti)
+ for i = 1, counti do
+ assert(h:top().value == i)
+ assert(h:pop().value == i)
+ heap_check_indexes(h)
+ end
+ assert(not h:pop())
+ assert(h:count() == 0)
+ until not next_permutation(indexes)
+ end
+
+ test:ok(true, "no asserts")
+end
+
+--
+-- Min heap update all elements in all possible positions.
+--
+local function test_min_heap_update(test)
+ test:plan(1)
+
+ local h = heap.new(min_heap_cmp)
+ for counti = 1, heap_size do
+ for srci = 1, counti do
+ local endv = srci * 10 + 5
+ for newv = 5, endv, 5 do
+ local values = {}
+ for i = 1, counti do
+ values[i] = new_object(i * 10)
+ h:push(values[i])
+ end
+ heap_check_indexes(h)
+ local obj = values[srci]
+ obj.value = newv
+ h:update(obj)
+ assert(obj.index >= 1)
+ assert(obj.index <= counti)
+ local prev = -1
+ for i = 1, counti do
+ obj = h:pop()
+ assert(obj.index == -1)
+ assert(obj.value >= prev)
+ assert(obj.value >= 1)
+ prev = obj.value
+ obj.value = -1
+ heap_check_indexes(h)
+ end
+ assert(not h:pop())
+ assert(h:count() == 0)
+ end
+ end
+ end
+
+ test:ok(true, "no asserts")
+end
+
+--
+-- Max heap delete all elements from all possible positions.
+--
+local function test_max_heap_delete(test)
+ test:plan(1)
+
+ local h = heap.new(max_heap_cmp)
+ local inf = heap_size + 1
+ for counti = 1, heap_size do
+ for srci = 1, counti do
+ local values = {}
+ for i = 1, counti do
+ values[i] = new_object(i)
+ h:push(values[i])
+ end
+ heap_check_indexes(h)
+ local obj = values[srci]
+ obj.value = inf
+ h:remove(obj)
+ assert(obj.index == -1)
+ local prev = inf
+ for i = 2, counti do
+ obj = h:pop()
+ assert(obj.index == -1)
+ assert(obj.value < prev)
+ assert(obj.value >= 1)
+ prev = obj.value
+ obj.value = -1
+ heap_check_indexes(h)
+ end
+ assert(not h:pop())
+ assert(h:count() == 0)
+ end
+ end
+
+ test:ok(true, "no asserts")
+end
+
+local function test_min_heap_remove_top(test)
+ test:plan(1)
+
+ local h = heap.new(min_heap_cmp)
+ for i = 1, heap_size do
+ h:push(new_object(i))
+ end
+ for i = 1, heap_size do
+ assert(h:top().value == i)
+ h:remove_top()
+ end
+ assert(h:count() == 0)
+
+ test:ok(true, "no asserts")
+end
+
+local function test_max_heap_remove_try(test)
+ test:plan(1)
+
+ local h = heap.new(max_heap_cmp)
+ local obj = new_object(1)
+ assert(obj.index == nil)
+ h:remove_try(obj)
+ assert(h:count() == 0)
+
+ h:push(obj)
+ h:push(new_object(2))
+ assert(obj.index == 2)
+ h:remove(obj)
+ assert(obj.index == -1)
+ h:remove_try(obj)
+ assert(obj.index == -1)
+ assert(h:count() == 1)
+
+ test:ok(true, "no asserts")
+end
+
+test:plan(7)
+
+test:test('min_heap_basic', test_min_heap_basic)
+test:test('max_heap_basic', test_max_heap_basic)
+test:test('min_heap_update_top', test_min_heap_update_top)
+test:test('min heap update', test_min_heap_update)
+test:test('max heap delete', test_max_heap_delete)
+test:test('min heap remove top', test_min_heap_remove_top)
+test:test('max heap remove try', test_max_heap_remove_try)
+
+os.exit(test:check() and 0 or 1)
diff --git a/test/unit-tap/suite.ini b/test/unit-tap/suite.ini
new file mode 100644
index 0000000..f365b69
--- /dev/null
+++ b/test/unit-tap/suite.ini
@@ -0,0 +1,4 @@
+[default]
+core = app
+description = Unit tests TAP
+is_parallel = True
diff --git a/vshard/heap.lua b/vshard/heap.lua
new file mode 100644
index 0000000..78c600a
--- /dev/null
+++ b/vshard/heap.lua
@@ -0,0 +1,226 @@
+local math_floor = math.floor
+
+--
+-- Implementation of a typical algorithm of the binary heap.
+-- The heap is intrusive - it stores index of each element inside of it. It
+-- allows to update and delete elements in any place in the heap, not only top
+-- elements.
+--
+
+local function heap_parent_index(index)
+ return math_floor(index / 2)
+end
+
+local function heap_left_child_index(index)
+ return index * 2
+end
+
+--
+-- Generate a new heap.
+--
+-- The implementation is targeted on as few index accesses as possible.
+-- Everything what could be is stored as upvalue variables instead of as indexes
+-- in a table. What couldn't be an upvalue and is used in a function more than
+-- once is saved on the stack.
+--
+local function heap_new(is_left_above)
+ -- Having it as an upvalue allows not to do 'self.data' lookup in each
+ -- function.
+ local data = {}
+ -- Saves #data calculation. In Lua it is not just reading a number.
+ local count = 0
+
+ local function heap_update_index_up(idx)
+ if idx == 1 then
+ return false
+ end
+
+ local orig_idx = idx
+ local value = data[idx]
+ local pidx = heap_parent_index(idx)
+ local parent = data[pidx]
+ while is_left_above(value, parent) do
+ data[idx] = parent
+ parent.index = idx
+ idx = pidx
+ if idx == 1 then
+ break
+ end
+ pidx = heap_parent_index(idx)
+ parent = data[pidx]
+ end
+
+ if idx == orig_idx then
+ return false
+ end
+ data[idx] = value
+ value.index = idx
+ return true
+ end
+
+ local function heap_update_index_down(idx)
+ local left_idx = heap_left_child_index(idx)
+ if left_idx > count then
+ return false
+ end
+
+ local orig_idx = idx
+ local left
+ local right
+ local right_idx = left_idx + 1
+ local top
+ local top_idx
+ local value = data[idx]
+ repeat
+ right_idx = left_idx + 1
+ if right_idx > count then
+ top = data[left_idx]
+ if is_left_above(value, top) then
+ break
+ end
+ top_idx = left_idx
+ else
+ left = data[left_idx]
+ right = data[right_idx]
+ if is_left_above(left, right) then
+ if is_left_above(value, left) then
+ break
+ end
+ top_idx = left_idx
+ top = left
+ else
+ if is_left_above(value, right) then
+ break
+ end
+ top_idx = right_idx
+ top = right
+ end
+ end
+
+ data[idx] = top
+ top.index = idx
+ idx = top_idx
+ left_idx = heap_left_child_index(idx)
+ until left_idx > count
+
+ if idx == orig_idx then
+ return false
+ end
+ data[idx] = value
+ value.index = idx
+ return true
+ end
+
+ local function heap_update_index(idx)
+ if not heap_update_index_up(idx) then
+ heap_update_index_down(idx)
+ end
+ end
+
+ local function heap_push(self, value)
+ count = count + 1
+ data[count] = value
+ value.index = count
+ heap_update_index_up(count)
+ end
+
+ local function heap_update_top(self)
+ heap_update_index_down(1)
+ end
+
+ local function heap_update(self, value)
+ heap_update_index(value.index)
+ end
+
+ local function heap_remove_top(self)
+ if count == 0 then
+ return
+ end
+ data[1].index = -1
+ if count == 1 then
+ data[1] = nil
+ count = 0
+ return
+ end
+ local value = data[count]
+ data[count] = nil
+ data[1] = value
+ value.index = 1
+ count = count - 1
+ heap_update_index_down(1)
+ end
+
+ local function heap_remove(self, value)
+ local idx = value.index
+ value.index = -1
+ if idx == count then
+ data[count] = nil
+ count = count - 1
+ return
+ end
+ value = data[count]
+ data[idx] = value
+ data[count] = nil
+ value.index = idx
+ count = count - 1
+ heap_update_index(idx)
+ end
+
+ local function heap_remove_try(self, value)
+ local idx = value.index
+ if idx and idx > 0 then
+ heap_remove(self, value)
+ end
+ end
+
+ local function heap_pop(self)
+ if count == 0 then
+ return
+ end
+ -- Some duplication from remove_top, but allows to save a few
+ -- condition checks, index accesses, and a function call.
+ local res = data[1]
+ res.index = -1
+ if count == 1 then
+ data[1] = nil
+ count = 0
+ return res
+ end
+ local value = data[count]
+ data[count] = nil
+ data[1] = value
+ value.index = 1
+ count = count - 1
+ heap_update_index_down(1)
+ return res
+ end
+
+ local function heap_top(self)
+ return data[1]
+ end
+
+ local function heap_count(self)
+ return count
+ end
+
+ return setmetatable({
+ -- Expose the data. For testing.
+ data = data,
+ }, {
+ __index = {
+ push = heap_push,
+ update_top = heap_update_top,
+ remove_top = heap_remove_top,
+ pop = heap_pop,
+ update = heap_update,
+ remove = heap_remove,
+ remove_try = heap_remove_try,
+ top = heap_top,
+ count = heap_count,
+ }
+ })
+end
+
+return {
+ new = heap_new,
+}
--
2.24.3 (Apple Git-128)
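As a usage illustration of the API added above (push/pop/top/remove driven by the intrusive 'index' field), here is a minimal sketch of the deadline-pile use case from the commit message. It assumes `vshard.heap` is available on the package path as laid out in the patch; the request objects and field names are made up for the example:

```lua
local heap = require('vshard.heap')

-- A min-heap of request deadlines: the closest deadline is always on top.
local deadlines = heap.new(function(l, r) return l.deadline < r.deadline end)

local r1 = {deadline = 30}
local r2 = {deadline = 10}
local r3 = {deadline = 20}
deadlines:push(r1)
deadlines:push(r2)
deadlines:push(r3)

-- The smallest deadline is accessible in O(1) as the top.
assert(deadlines:top() == r2)
-- The intrusive 'index' field lets an arbitrary element be dropped in
-- O(log(N)) without searching for it first, e.g. a cancelled request.
deadlines:remove(r3)
assert(deadlines:pop() == r2)
assert(deadlines:pop() == r1)
assert(deadlines:count() == 0)
```

Note that `heap.lua` itself only uses `math.floor` and `setmetatable`, so the sketch does not need a Tarantool runtime, only the module on the path.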
* Re: [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-10 9:01 ` Oleg Babin via Tarantool-patches
2021-02-10 22:36 ` Vladislav Shpilevoy via Tarantool-patches
2021-03-05 22:03 ` Vladislav Shpilevoy via Tarantool-patches
1 sibling, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-10 9:01 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your patch.
Shouldn't it be added to the storage "MODULE_INTERNALS"?
LGTM. One comment below.
On 10/02/2021 02:46, Vladislav Shpilevoy wrote:
> Lua does not have a built-in standard library for binary heaps
> (also called priority queues). There is an implementation in
> Tarantool core in libsalad, but it is in C.
>
> Heap is a perfect storage for the soon coming feature map-reduce.
> In the map-reduce algorithm it will be necessary to be able to
> lock an entire storage against any bucket moves for time <=
> specified timeout. Number of map-reduce requests can be big, and
> they can have different timeouts.
>
> So there is a pile of timeouts from different requests. It is
> necessary to be able to quickly add new ones, be able to delete
> random ones, and remove expired ones.
>
> One way would be a sorted array of the deadlines. Unfortunately,
> it is super slow. O(N + log(N)) to add a new element (find place
> for log(N) and move all next elements for N), O(N) to delete a
> random one (move all next elements one cell left/right).
>
> Another way would be a sorted tree. But trees like RB or a dumb
> binary tree require extra steps to keep them balanced and to have
> access to the smallest element ASAP.
>
> The best way is the binary heap. It is perfectly balanced by
> design meaning that all operations there have complexity at most
> O(log(N)). It is possible to find the closest deadline for
> constant time as it is the heap's top.
>
> This patch implements it. The heap is intrusive. It means it
> stores index of each element right inside of the element as a
> field 'index'. Having an index along with each element allows to
> delete it from the heap for O(log(N)) without necessity to look
> its place up first.
>
> Part of #147
> ---
> test/unit-tap/heap.test.lua | 310 ++++++++++++++++++++++++++++++++++++
> test/unit-tap/suite.ini | 4 +
> vshard/heap.lua | 226 ++++++++++++++++++++++++++
> 3 files changed, 540 insertions(+)
> create mode 100755 test/unit-tap/heap.test.lua
> create mode 100644 test/unit-tap/suite.ini
> create mode 100644 vshard/heap.lua
>
> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
> new file mode 100755
> index 0000000..8c3819f
> --- /dev/null
> +++ b/test/unit-tap/heap.test.lua
> @@ -0,0 +1,310 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +local test = tap.test("cfg")
> +local heap = require('vshard.heap')
> +
Maybe it's better to use single quotes everywhere: test("cfg") ->
test('cfg'). Or is there some reason for the difference?
> +--
> +-- Max size of the heap to test. The number of iterations in the
> +-- test grows as a factorial of this value. At 10 the test becomes
> +-- too long already.
> +--
> +local heap_size = 8
> +
> +--
> +-- Comparators and constructor for objects stored in the intrusive heap.
> +local function min_heap_cmp(l, r)
> + return l.value < r.value
> +end
> +
> +local function max_heap_cmp(l, r)
> + return l.value > r.value
> +end
> +
> +local function new_object(value)
> + return {value = value}
> +end
> +
> +local function heap_check_indexes(heap)
> + local count = heap:count()
> + local data = heap.data
> + for i = 1, count do
> + assert(data[i].index == i)
> + end
> +end
> +
> +local function reverse(values, i1, i2)
> + while i1 < i2 do
> + values[i1], values[i2] = values[i2], values[i1]
> + i1 = i1 + 1
> + i2 = i2 - 1
> + end
> +end
> +
> +--
> +-- Implementation of std::next_permutation() from C++.
> +--
> +local function next_permutation(values)
> + local count = #values
> + if count <= 1 then
> + return false
> + end
> + local i = count
> + while true do
> + local j = i
> + i = i - 1
> + if values[i] < values[j] then
> + local k = count
> + while values[i] >= values[k] do
> + k = k - 1
> + end
> + values[i], values[k] = values[k], values[i]
> + reverse(values, j, count)
> + return true
> + end
> + if i == 1 then
> + reverse(values, 1, count)
> + return false
> + end
> + end
> +end
> +
> +local function range(count)
> + local res = {}
> + for i = 1, count do
> + res[i] = i
> + end
> + return res
> +end
> +
> +--
> +-- Min heap fill and empty.
> +--
> +local function test_min_heap_basic(test)
> + test:plan(1)
> +
> + local h = heap.new(min_heap_cmp)
> + assert(not h:pop())
> + assert(h:count() == 0)
> + local values = {}
> + for i = 1, heap_size do
> + values[i] = new_object(i)
> + end
> + for counti = 1, heap_size do
> + local indexes = range(counti)
> + repeat
> + for i = 1, counti do
> + h:push(values[indexes[i]])
> + end
> + heap_check_indexes(h)
> + assert(h:count() == counti)
> + for i = 1, counti do
> + assert(h:top() == values[i])
> + assert(h:pop() == values[i])
> + heap_check_indexes(h)
> + end
> + assert(not h:pop())
> + assert(h:count() == 0)
> + until not next_permutation(indexes)
> + end
> +
> + test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Max heap fill and empty.
> +--
> +local function test_max_heap_basic(test)
> + test:plan(1)
> +
> + local h = heap.new(max_heap_cmp)
> + assert(not h:pop())
> + assert(h:count() == 0)
> + local values = {}
> + for i = 1, heap_size do
> + values[i] = new_object(heap_size - i + 1)
> + end
> + for counti = 1, heap_size do
> + local indexes = range(counti)
> + repeat
> + for i = 1, counti do
> + h:push(values[indexes[i]])
> + end
> + heap_check_indexes(h)
> + assert(h:count() == counti)
> + for i = 1, counti do
> + assert(h:top() == values[i])
> + assert(h:pop() == values[i])
> + heap_check_indexes(h)
> + end
> + assert(not h:pop())
> + assert(h:count() == 0)
> + until not next_permutation(indexes)
> + end
> +
> + test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Min heap update top element.
> +--
> +local function test_min_heap_update_top(test)
> + test:plan(1)
> +
> + local h = heap.new(min_heap_cmp)
> + for counti = 1, heap_size do
> + local indexes = range(counti)
> + repeat
> + local values = {}
> + for i = 1, counti do
> + values[i] = new_object(0)
> + h:push(values[i])
> + end
> + heap_check_indexes(h)
> + for i = 1, counti do
> + h:top().value = indexes[i]
> + h:update_top()
> + end
> + heap_check_indexes(h)
> + assert(h:count() == counti)
> + for i = 1, counti do
> + assert(h:top().value == i)
> + assert(h:pop().value == i)
> + heap_check_indexes(h)
> + end
> + assert(not h:pop())
> + assert(h:count() == 0)
> + until not next_permutation(indexes)
> + end
> +
> + test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Min heap update all elements in all possible positions.
> +--
> +local function test_min_heap_update(test)
> + test:plan(1)
> +
> + local h = heap.new(min_heap_cmp)
> + for counti = 1, heap_size do
> + for srci = 1, counti do
> + local endv = srci * 10 + 5
> + for newv = 5, endv, 5 do
> + local values = {}
> + for i = 1, counti do
> + values[i] = new_object(i * 10)
> + h:push(values[i])
> + end
> + heap_check_indexes(h)
> + local obj = values[srci]
> + obj.value = newv
> + h:update(obj)
> + assert(obj.index >= 1)
> + assert(obj.index <= counti)
> + local prev = -1
> + for i = 1, counti do
> + obj = h:pop()
> + assert(obj.index == -1)
> + assert(obj.value >= prev)
> + assert(obj.value >= 1)
> + prev = obj.value
> + obj.value = -1
> + heap_check_indexes(h)
> + end
> + assert(not h:pop())
> + assert(h:count() == 0)
> + end
> + end
> + end
> +
> + test:ok(true, "no asserts")
> +end
> +
> +--
> +-- Max heap delete all elements from all possible positions.
> +--
> +local function test_max_heap_delete(test)
> + test:plan(1)
> +
> + local h = heap.new(max_heap_cmp)
> + local inf = heap_size + 1
> + for counti = 1, heap_size do
> + for srci = 1, counti do
> + local values = {}
> + for i = 1, counti do
> + values[i] = new_object(i)
> + h:push(values[i])
> + end
> + heap_check_indexes(h)
> + local obj = values[srci]
> + obj.value = inf
> + h:remove(obj)
> + assert(obj.index == -1)
> + local prev = inf
> + for i = 2, counti do
> + obj = h:pop()
> + assert(obj.index == -1)
> + assert(obj.value < prev)
> + assert(obj.value >= 1)
> + prev = obj.value
> + obj.value = -1
> + heap_check_indexes(h)
> + end
> + assert(not h:pop())
> + assert(h:count() == 0)
> + end
> + end
> +
> + test:ok(true, "no asserts")
> +end
> +
> +local function test_min_heap_remove_top(test)
> + test:plan(1)
> +
> + local h = heap.new(min_heap_cmp)
> + for i = 1, heap_size do
> + h:push(new_object(i))
> + end
> + for i = 1, heap_size do
> + assert(h:top().value == i)
> + h:remove_top()
> + end
> + assert(h:count() == 0)
> +
> + test:ok(true, "no asserts")
> +end
> +
> +local function test_max_heap_remove_try(test)
> + test:plan(1)
> +
> + local h = heap.new(max_heap_cmp)
> + local obj = new_object(1)
> + assert(obj.index == nil)
> + h:remove_try(obj)
> + assert(h:count() == 0)
> +
> + h:push(obj)
> + h:push(new_object(2))
> + assert(obj.index == 2)
> + h:remove(obj)
> + assert(obj.index == -1)
> + h:remove_try(obj)
> + assert(obj.index == -1)
> + assert(h:count() == 1)
> +
> + test:ok(true, "no asserts")
> +end
> +
> +test:plan(7)
> +
> +test:test('min_heap_basic', test_min_heap_basic)
> +test:test('max_heap_basic', test_max_heap_basic)
> +test:test('min_heap_update_top', test_min_heap_update_top)
> +test:test('min heap update', test_min_heap_update)
> +test:test('max heap delete', test_max_heap_delete)
> +test:test('min heap remove top', test_min_heap_remove_top)
> +test:test('max heap remove try', test_max_heap_remove_try)
> +
> +os.exit(test:check() and 0 or 1)
> diff --git a/test/unit-tap/suite.ini b/test/unit-tap/suite.ini
> new file mode 100644
> index 0000000..f365b69
> --- /dev/null
> +++ b/test/unit-tap/suite.ini
> @@ -0,0 +1,4 @@
> +[default]
> +core = app
> +description = Unit tests TAP
> +is_parallel = True
> diff --git a/vshard/heap.lua b/vshard/heap.lua
> new file mode 100644
> index 0000000..78c600a
> --- /dev/null
> +++ b/vshard/heap.lua
> @@ -0,0 +1,226 @@
> +local math_floor = math.floor
> +
> +--
> +-- Implementation of a typical algorithm of the binary heap.
> +-- The heap is intrusive - it stores the index of each element inside the
> +-- element itself. This allows updating and deleting elements at any place
> +-- in the heap, not only at the top.
> +--
> +
> +local function heap_parent_index(index)
> + return math_floor(index / 2)
> +end
> +
> +local function heap_left_child_index(index)
> + return index * 2
> +end
> +
> +--
> +-- Generate a new heap.
> +--
> +-- The implementation aims at as few table accesses as possible. Everything
> +-- that can be is stored in upvalues instead of table fields. Whatever
> +-- cannot be an upvalue and is used more than once in a function is saved
> +-- in a local variable.
> +--
> +local function heap_new(is_left_above)
> + -- Having it as an upvalue avoids a 'self.data' lookup in each
> + -- function.
> + local data = {}
> + -- Saves #data calculation. In Lua it is not just reading a number.
> + local count = 0
> +
> + local function heap_update_index_up(idx)
> + if idx == 1 then
> + return false
> + end
> +
> + local orig_idx = idx
> + local value = data[idx]
> + local pidx = heap_parent_index(idx)
> + local parent = data[pidx]
> + while is_left_above(value, parent) do
> + data[idx] = parent
> + parent.index = idx
> + idx = pidx
> + if idx == 1 then
> + break
> + end
> + pidx = heap_parent_index(idx)
> + parent = data[pidx]
> + end
> +
> + if idx == orig_idx then
> + return false
> + end
> + data[idx] = value
> + value.index = idx
> + return true
> + end
> +
> + local function heap_update_index_down(idx)
> + local left_idx = heap_left_child_index(idx)
> + if left_idx > count then
> + return false
> + end
> +
> + local orig_idx = idx
> + local left
> + local right
> + local right_idx = left_idx + 1
> + local top
> + local top_idx
> + local value = data[idx]
> + repeat
> + right_idx = left_idx + 1
> + if right_idx > count then
> + top = data[left_idx]
> + if is_left_above(value, top) then
> + break
> + end
> + top_idx = left_idx
> + else
> + left = data[left_idx]
> + right = data[right_idx]
> + if is_left_above(left, right) then
> + if is_left_above(value, left) then
> + break
> + end
> + top_idx = left_idx
> + top = left
> + else
> + if is_left_above(value, right) then
> + break
> + end
> + top_idx = right_idx
> + top = right
> + end
> + end
> +
> + data[idx] = top
> + top.index = idx
> + idx = top_idx
> + left_idx = heap_left_child_index(idx)
> + until left_idx > count
> +
> + if idx == orig_idx then
> + return false
> + end
> + data[idx] = value
> + value.index = idx
> + return true
> + end
> +
> + local function heap_update_index(idx)
> + if not heap_update_index_up(idx) then
> + heap_update_index_down(idx)
> + end
> + end
> +
> + local function heap_push(self, value)
> + count = count + 1
> + data[count] = value
> + value.index = count
> + heap_update_index_up(count)
> + end
> +
> + local function heap_update_top(self)
> + heap_update_index_down(1)
> + end
> +
> + local function heap_update(self, value)
> + heap_update_index(value.index)
> + end
> +
> + local function heap_remove_top(self)
> + if count == 0 then
> + return
> + end
> + data[1].index = -1
> + if count == 1 then
> + data[1] = nil
> + count = 0
> + return
> + end
> + local value = data[count]
> + data[count] = nil
> + data[1] = value
> + value.index = 1
> + count = count - 1
> + heap_update_index_down(1)
> + end
> +
> + local function heap_remove(self, value)
> + local idx = value.index
> + value.index = -1
> + if idx == count then
> + data[count] = nil
> + count = count - 1
> + return
> + end
> + value = data[count]
> + data[idx] = value
> + data[count] = nil
> + value.index = idx
> + count = count - 1
> + heap_update_index(idx)
> + end
> +
> + local function heap_remove_try(self, value)
> + local idx = value.index
> + if idx and idx > 0 then
> + heap_remove(self, value)
> + end
> + end
> +
> + local function heap_pop(self)
> + if count == 0 then
> + return
> + end
> + -- Some duplication of remove_top, but it saves a few condition
> + -- checks, index accesses, and a function call.
> + local res = data[1]
> + res.index = -1
> + if count == 1 then
> + data[1] = nil
> + count = 0
> + return res
> + end
> + local value = data[count]
> + data[count] = nil
> + data[1] = value
> + value.index = 1
> + count = count - 1
> + heap_update_index_down(1)
> + return res
> + end
> +
> + local function heap_top(self)
> + return data[1]
> + end
> +
> + local function heap_count(self)
> + return count
> + end
> +
> + return setmetatable({
> + -- Expose the data. For testing.
> + data = data,
> + }, {
> + __index = {
> + push = heap_push,
> + update_top = heap_update_top,
> + remove_top = heap_remove_top,
> + pop = heap_pop,
> + update = heap_update,
> + remove = heap_remove,
> + remove_try = heap_remove_try,
> + top = heap_top,
> + count = heap_count,
> + }
> + })
> +end
> +
> +return {
> + new = heap_new,
> +}
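For reference, a short usage sketch of the API exported above; it assumes vshard is installed so that 'vshard.heap' resolves, and exercises push, top, update_top, pop, and count as defined in the patch:

```lua
local heap = require('vshard.heap')

local h = heap.new(function(l, r) return l.value < r.value end)
local a, b = {value = 2}, {value = 1}
h:push(a)
h:push(b)
assert(h:top() == b)
-- Mutate the top element in place, then restore the heap invariant.
h:top().value = 3
h:update_top()
assert(h:top() == a)
assert(h:pop() == a)
assert(h:pop() == b)
assert(h:count() == 0)
```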
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-10 9:01 ` Oleg Babin via Tarantool-patches
@ 2021-02-10 22:36 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:51 ` Oleg Babin via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-10 22:36 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
Thanks for the review!
On 10.02.2021 10:01, Oleg Babin wrote:
> Thanks for your patch.
>
> Shouldn't it be added to storage "MODULE_INTERNALS" ?
Hm. Not sure I understand. Did you mean 'vshard_modules' variable in
storage/init.lua? Why? The heap is not used in storage/init.lua and
won't be used there directly in future patches. The next patches
will introduce new modules for storage/, which will use the heap,
and will reload it.
Also it does not have any global objects. So it does not
need its own global M, if this is what you meant.
>> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
>> new file mode 100755
>> index 0000000..8c3819f
>> --- /dev/null
>> +++ b/test/unit-tap/heap.test.lua
>> @@ -0,0 +1,310 @@
>> +#!/usr/bin/env tarantool
>> +
>> +local tap = require('tap')
>> +local test = tap.test("cfg")
>> +local heap = require('vshard.heap')
>> +
>
>
> Maybe it's better to use single quotes everywhere: test("cfg") -> test('cfg'). Or does the difference carry some meaning?
Yeah, didn't notice it. Here is the diff:
====================
diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
index 8c3819f..9202f62 100755
--- a/test/unit-tap/heap.test.lua
+++ b/test/unit-tap/heap.test.lua
@@ -1,7 +1,7 @@
#!/usr/bin/env tarantool
local tap = require('tap')
-local test = tap.test("cfg")
+local test = tap.test('cfg')
local heap = require('vshard.heap')
--
@@ -109,7 +109,7 @@ local function test_min_heap_basic(test)
until not next_permutation(indexes)
end
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
--
@@ -143,7 +143,7 @@ local function test_max_heap_basic(test)
until not next_permutation(indexes)
end
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
--
@@ -178,7 +178,7 @@ local function test_min_heap_update_top(test)
until not next_permutation(indexes)
end
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
--
@@ -219,7 +219,7 @@ local function test_min_heap_update(test)
end
end
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
--
@@ -257,7 +257,7 @@ local function test_max_heap_delete(test)
end
end
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
local function test_min_heap_remove_top(test)
@@ -273,7 +273,7 @@ local function test_min_heap_remove_top(test)
end
assert(h:count() == 0)
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
local function test_max_heap_remove_try(test)
@@ -294,7 +294,7 @@ local function test_max_heap_remove_try(test)
assert(obj.index == -1)
assert(h:count() == 1)
- test:ok(true, "no asserts")
+ test:ok(true, 'no asserts')
end
test:plan(7)
* Re: [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-10 22:36 ` Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-11 6:51 ` Oleg Babin via Tarantool-patches
2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 1 reply; 36+ messages in thread
From: Oleg Babin via Tarantool-patches @ 2021-02-11 6:51 UTC (permalink / raw)
To: Vladislav Shpilevoy, tarantool-patches, yaroslav.dynnikov
Thanks for your fixes!
I found that you missed adding the new file to "vshard/CMakeLists.txt" [1]
[1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9
On 11/02/2021 01:36, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
> On 10.02.2021 10:01, Oleg Babin wrote:
>> Thanks for your patch.
>>
>> Shouldn't it be added to storage "MODULE_INTERNALS" ?
> Hm. Not sure I understand. Did you mean 'vshard_modules' variable in
> storage/init.lua? Why? The heap is not used in storage/init.lua and
> won't be used there directly in future patches. The next patches
> will introduce new modules for storage/, which will use the heap,
> and will reload it.
>
> Also it does not have any global objects. So it does not
> need its own global M, if this is what you meant.
Yes, thanks for your answer. Got it.
>>> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
>>> new file mode 100755
>>> index 0000000..8c3819f
>>> --- /dev/null
>>> +++ b/test/unit-tap/heap.test.lua
>>> @@ -0,0 +1,310 @@
>>> +#!/usr/bin/env tarantool
>>> +
>>> +local tap = require('tap')
>>> +local test = tap.test("cfg")
>>> +local heap = require('vshard.heap')
>>> +
>> Maybe it's better to use single quotes everywhere: test("cfg") -> test('cfg'). Or does the difference carry some meaning?
> Yeah, didn't notice it. Here is the diff:
>
> ====================
> diff --git a/test/unit-tap/heap.test.lua b/test/unit-tap/heap.test.lua
> index 8c3819f..9202f62 100755
> --- a/test/unit-tap/heap.test.lua
> +++ b/test/unit-tap/heap.test.lua
> @@ -1,7 +1,7 @@
> #!/usr/bin/env tarantool
>
> local tap = require('tap')
> -local test = tap.test("cfg")
> +local test = tap.test('cfg')
> local heap = require('vshard.heap')
>
> --
> @@ -109,7 +109,7 @@ local function test_min_heap_basic(test)
> until not next_permutation(indexes)
> end
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> --
> @@ -143,7 +143,7 @@ local function test_max_heap_basic(test)
> until not next_permutation(indexes)
> end
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> --
> @@ -178,7 +178,7 @@ local function test_min_heap_update_top(test)
> until not next_permutation(indexes)
> end
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> --
> @@ -219,7 +219,7 @@ local function test_min_heap_update(test)
> end
> end
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> --
> @@ -257,7 +257,7 @@ local function test_max_heap_delete(test)
> end
> end
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> local function test_min_heap_remove_top(test)
> @@ -273,7 +273,7 @@ local function test_min_heap_remove_top(test)
> end
> assert(h:count() == 0)
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> local function test_max_heap_remove_try(test)
> @@ -294,7 +294,7 @@ local function test_max_heap_remove_try(test)
> assert(obj.index == -1)
> assert(h:count() == 1)
>
> - test:ok(true, "no asserts")
> + test:ok(true, 'no asserts')
> end
>
> test:plan(7)
* Re: [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-11 6:51 ` Oleg Babin via Tarantool-patches
@ 2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
0 siblings, 0 replies; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-12 0:09 UTC (permalink / raw)
To: Oleg Babin, tarantool-patches, yaroslav.dynnikov
On 11.02.2021 07:51, Oleg Babin wrote:
> Thanks for your fixes!
>
> I found that you missed adding the new file to "vshard/CMakeLists.txt" [1]
>
>
> [1] https://github.com/tarantool/vshard/blob/master/vshard/CMakeLists.txt#L9
Thanks for noticing! Fixed:
====================
diff --git a/vshard/CMakeLists.txt b/vshard/CMakeLists.txt
index 1063da8..78a3f07 100644
--- a/vshard/CMakeLists.txt
+++ b/vshard/CMakeLists.txt
@@ -7,4 +7,4 @@ add_subdirectory(router)
# Install module
install(FILES cfg.lua error.lua consts.lua hash.lua init.lua replicaset.lua
- util.lua lua_gc.lua rlist.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
+ util.lua lua_gc.lua rlist.lua heap.lua DESTINATION ${TARANTOOL_INSTALL_LUADIR}/vshard)
====================
* Re: [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:01 ` Oleg Babin via Tarantool-patches
@ 2021-03-05 22:03 ` Vladislav Shpilevoy via Tarantool-patches
1 sibling, 0 replies; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-03-05 22:03 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Applied this diff and force-pushed it, in order to eliminate the
metatable and the __index access.
Besides, each heap's metatable would differ from the others
because the methods are closures, so a metatable would bring no
memory saving - it could not have been shared between the heaps
anyway.
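A tiny self-contained illustration of that point (a hypothetical counter object, not vshard code): when methods close over per-instance upvalues, every instance gets its own function objects, so a shared metatable could not deduplicate them anyway.

```lua
local function counter_new()
    local count = 0                    -- per-instance upvalue
    return {
        inc = function() count = count + 1 end,
        get = function() return count end,
    }
end

local a, b = counter_new(), counter_new()
a.inc()
assert(a.get() == 1 and b.get() == 0)
-- Each instance carries distinct closures, so a metatable shared
-- between instances could not hold these methods:
assert(a.inc ~= b.inc)
```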
====================
diff --git a/vshard/heap.lua b/vshard/heap.lua
index 78c600a..b125921 100644
--- a/vshard/heap.lua
+++ b/vshard/heap.lua
@@ -203,22 +203,21 @@ local function heap_new(is_left_above)
return count
end
- return setmetatable({
+ return {
-- Expose the data. For testing.
data = data,
- }, {
- __index = {
- push = heap_push,
- update_top = heap_update_top,
- remove_top = heap_remove_top,
- pop = heap_pop,
- update = heap_update,
- remove = heap_remove,
- remove_try = heap_remove_try,
- top = heap_top,
- count = heap_count,
- }
- })
+ -- Methods are exported as plain members instead of via __index so that
+ -- each method call does not fetch a metatable and go through __index.
+ push = heap_push,
+ update_top = heap_update_top,
+ remove_top = heap_remove_top,
+ pop = heap_pop,
+ update = heap_update,
+ remove = heap_remove,
+ remove_try = heap_remove_try,
+ top = heap_top,
+ count = heap_count,
+ }
end
return {
* Re: [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
` (8 preceding siblings ...)
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure Vladislav Shpilevoy via Tarantool-patches
@ 2021-02-09 23:51 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-12 11:02 ` Oleg Babin via Tarantool-patches
9 siblings, 1 reply; 36+ messages in thread
From: Vladislav Shpilevoy via Tarantool-patches @ 2021-02-09 23:51 UTC (permalink / raw)
To: tarantool-patches, olegrok, yaroslav.dynnikov
Bad links. Here are the correct ones:
Branch: http://github.com/tarantool/vshard/tree/gerold103/gh-147-map-reduce-part1
Issue: https://github.com/tarantool/vshard/issues/147