From: Oleg Babin via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>,
tarantool-patches@dev.tarantool.org,
yaroslav.dynnikov@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions
Date: Thu, 11 Feb 2021 09:50:33 +0300 [thread overview]
Message-ID: <22c4aff1-1305-bf7f-e43b-1fe4d3fa5c51@tarantool.org> (raw)
In-Reply-To: <481db25d-5324-0494-e8b9-da9ff1ec39ac@tarantool.org>
Thanks for your fixes! LGTM.
On 11/02/2021 01:34, Vladislav Shpilevoy wrote:
> Thanks for the review!
>
>>> diff --git a/test/unit/util.result b/test/unit/util.result
>>> index 096e36f..c4fd84d 100644
>>> --- a/test/unit/util.result
>>> +++ b/test/unit/util.result
>>> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
>>> +do \
>>> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
>>> + f = fiber.create(function() \
>>> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
>>> + end) \
>>> + yield_count = 0 \
>>> + while f:status() ~= 'dead' do \
>>> + yield_count = yield_count + 1 \
>>> + fiber.yield() \
>>> + end \
>>> +end
>>> +---
>> Why can't you use "csw" of fiber.self() instead? Also it's it reliable enough to simply count yields?
> Yup, will work too. See the diff below.
>
> ====================
> diff --git a/test/unit/util.result b/test/unit/util.result
> index c4fd84d..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -111,14 +111,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> ...
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> @@ -151,14 +151,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> ---
> ...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 4d6cbe9..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -42,14 +42,14 @@ minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
>
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> @@ -63,14 +63,14 @@ copy_yield({k1 = 1, k2 = 2}, 1)
> do \
> t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> res = nil \
> + yield_count = 0 \
> f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> end) \
> - yield_count = 0 \
> - while f:status() ~= 'dead' do \
> - yield_count = yield_count + 1 \
> - fiber.yield() \
> - end \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> end
> yield_count
> t
> ====================
>
>> Could scheduler skip this fiber at some loop iteration? In other words, won't this test be flaky?
> Nope. Unless the fiber is sleeping on some condition or for a timeout, a plain
> sleep(0) also known as fiber.yield() won't skip this fiber on the next
> iteration of the loop. But does not matter if csw is used to count the yields.
>
> Full new patch below.
>
> ====================
> util: introduce yielding table functions
>
> The patch adds functions table_copy_yield and table_minus_yield.
>
> Yielding copy creates a duplicate of a table but yields every
> specified number of keys copied.
>
> Yielding minus removes matching key-value pairs specified in one
> table from another table. It yields every specified number of keys
> passed.
>
> The functions should help to process huge Lua tables (millions of
> elements and more). These are going to be used on the storage in
> the new GC algorithm.
>
> The algorithm will need to keep a route table on the storage, just
> like on the router, but with expiration time for the routes. Since
> bucket count can be millions, it means GC will potentially operate
> on a huge Lua table and could use some yields so as not to block
> TX thread for long.
>
> Needed for #147
>
> diff --git a/test/unit/util.result b/test/unit/util.result
> index 096e36f..42a361a 100644
> --- a/test/unit/util.result
> +++ b/test/unit/util.result
> @@ -71,3 +71,116 @@ test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> ---
> ...
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +---
> +...
> +minus_yield({}, {}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 1)
> +---
> +- []
> +...
> +minus_yield({}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k = 1}, {k = 1}, 0)
> +---
> +- []
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +---
> +- k2: 2
> +...
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +---
> +- []
> +...
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +---
> +- k1: 1
> +...
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +---
> +- k3: 3
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k4: 4
> + k1: 1
> +...
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +---
> +...
> +copy_yield({}, 1)
> +---
> +- []
> +...
> +copy_yield({k = 1}, 1)
> +---
> +- k: 1
> +...
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +---
> +- k1: 1
> + k2: 2
> +...
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +---
> +...
> +yield_count
> +---
> +- 2
> +...
> +t
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +res
> +---
> +- k3: 3
> + k4: 4
> + k1: 1
> + k2: 2
> +...
> +t ~= res
> +---
> +- true
> +...
> diff --git a/test/unit/util.test.lua b/test/unit/util.test.lua
> index 5f39e06..9550a95 100644
> --- a/test/unit/util.test.lua
> +++ b/test/unit/util.test.lua
> @@ -27,3 +27,52 @@ fib = util.reloadable_fiber_create('Worker_name', fake_M, 'reloadable_function')
> while not test_run:grep_log('default', 'module is reloaded, restarting') do fiber.sleep(0.01) end
> test_run:grep_log('default', 'reloadable_function has been started', 1000)
> fib:cancel()
> +
> +-- Yielding table minus.
> +minus_yield = util.table_minus_yield
> +minus_yield({}, {}, 1)
> +minus_yield({}, {k = 1}, 1)
> +minus_yield({}, {k = 1}, 0)
> +minus_yield({k = 1}, {k = 1}, 0)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k3 = 3}, 10)
> +minus_yield({k1 = 1, k2 = 2}, {k1 = 1, k2 = 2}, 10)
> +-- Mismatching values are not deleted.
> +minus_yield({k1 = 1}, {k1 = 2}, 10)
> +minus_yield({k1 = 1, k2 = 2, k3 = 3}, {k1 = 1, k2 = 222}, 10)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + minus_yield(t, {k2 = 2, k3 = 3, k5 = 5, k4 = 444}, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +
> +-- Yielding table copy.
> +copy_yield = util.table_copy_yield
> +copy_yield({}, 1)
> +copy_yield({k = 1}, 1)
> +copy_yield({k1 = 1, k2 = 2}, 1)
> +
> +do \
> + t = {k1 = 1, k2 = 2, k3 = 3, k4 = 4} \
> + res = nil \
> + yield_count = 0 \
> + f = fiber.create(function() \
> + local csw1 = fiber.info()[fiber.id()].csw \
> + res = copy_yield(t, 2) \
> + local csw2 = fiber.info()[fiber.id()].csw \
> + yield_count = csw2 - csw1 \
> + end) \
> + test_run:wait_cond(function() return f:status() == 'dead' end) \
> +end
> +yield_count
> +t
> +res
> +t ~= res
> diff --git a/vshard/util.lua b/vshard/util.lua
> index d3b4e67..2362607 100644
> --- a/vshard/util.lua
> +++ b/vshard/util.lua
> @@ -153,6 +153,44 @@ local function version_is_at_least(major_need, middle_need, minor_need)
> return minor >= minor_need
> end
>
> +--
> +-- Copy @a src table. Fiber yields every @a interval keys copied.
> +--
> +local function table_copy_yield(src, interval)
> + local res = {}
> + -- Time-To-Yield.
> + local tty = interval
> + for k, v in pairs(src) do
> + res[k] = v
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return res
> +end
> +
> +--
> +-- Remove @a src keys from @a dst if their values match. Fiber yields every
> +-- @a interval iterations.
> +--
> +local function table_minus_yield(dst, src, interval)
> + -- Time-To-Yield.
> + local tty = interval
> + for k, srcv in pairs(src) do
> + if dst[k] == srcv then
> + dst[k] = nil
> + end
> + tty = tty - 1
> + if tty <= 0 then
> + fiber.yield()
> + tty = interval
> + end
> + end
> + return dst
> +end
> +
> return {
> tuple_extract_key = tuple_extract_key,
> reloadable_fiber_create = reloadable_fiber_create,
> @@ -160,4 +198,6 @@ return {
> async_task = async_task,
> internal = M,
> version_is_at_least = version_is_at_least,
> + table_copy_yield = table_copy_yield,
> + table_minus_yield = table_minus_yield,
> }
>
next prev parent reply other threads:[~2021-02-11 6:51 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-09 23:46 [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 1/9] rlist: move rlist to a new module Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 2/9] Use fiber.clock() instead of .time() everywhere Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-10 22:33 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 3/9] test: introduce a helper to wait for bucket GC Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:57 ` Oleg Babin via Tarantool-patches
2021-02-10 22:33 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 4/9] storage: bucket_recv() should check rs lock Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches [this message]
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 6/9] cfg: introduce 'deprecated option' feature Vladislav Shpilevoy via Tarantool-patches
2021-02-10 8:59 ` Oleg Babin via Tarantool-patches
2021-02-10 22:34 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 7/9] gc: introduce reactive garbage collector Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
2021-02-10 22:35 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:50 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 8/9] recovery: introduce reactive recovery Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:00 ` Oleg Babin via Tarantool-patches
2021-02-09 23:46 ` [Tarantool-patches] [PATCH 9/9] util: introduce binary heap data structure Vladislav Shpilevoy via Tarantool-patches
2021-02-10 9:01 ` Oleg Babin via Tarantool-patches
2021-02-10 22:36 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-11 6:51 ` Oleg Babin via Tarantool-patches
2021-02-12 0:09 ` Vladislav Shpilevoy via Tarantool-patches
2021-03-05 22:03 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-09 23:51 ` [Tarantool-patches] [PATCH 0/9] VShard Map-Reduce, part 1, preparations Vladislav Shpilevoy via Tarantool-patches
2021-02-12 11:02 ` Oleg Babin via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=22c4aff1-1305-bf7f-e43b-1fe4d3fa5c51@tarantool.org \
--to=tarantool-patches@dev.tarantool.org \
--cc=olegrok@tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--cc=yaroslav.dynnikov@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH 5/9] util: introduce yielding table functions' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox