[tarantool-patches] Re: [PATCH] vshard: added prohibition of multiple router/storage configuration from different fibers

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Aug 15 00:22:30 MSK 2018


Hi! My congrats on your first patch! Let me acquaint you with
the order of our review process. For each your patch I will
send you my enumerated comments that you are needed to fix.
This process is continuing until the patch looks perfect,
passes all the tests and the utmost case: a test is not
passing without the patch (or at least without its core part).

For each my comment in your response email in the same thread
you should provide diff that shows you have fixed it. An example:
https://www.freelists.org/post/tarantool-patches/PATCH-33-sql-implement-point-where-for-DELETE-stmts,2
In the same email at the end you put a new version of the
patch.

If the patch is changed significantly, then you send the new
patch in a next mailing thread and change the header prefix
from [PATCH] to [PATCH v2/v3/v4...].

See below my 11 comments.

On 13/08/2018 12:41, Евгений Заикин (Redacted sender evgeniy19872004 for DMARC) wrote:
> Adds prohibition of multiple router/storage configuration from different fibers
> Fixes: #140
> ---
>   Ticket:https://github.com/tarantool/vshard/issues/140
>   Branch:rave4life/gh-140-prohibit-multiple-cfg

1. Please, make each link be hyper, clickable. Including https://,
github.com etc.

>   
>   test/misc/vshard_multi_fiber_cfg.result   | 53 +++++++++++++++++++++++
>   test/misc/vshard_multi_fiber_cfg.test.lua | 21 +++++++++
>   vshard/router/init.lua                    | 20 +++++++--
>   vshard/storage/init.lua                   | 18 +++++++-
>   4 files changed, 107 insertions(+), 5 deletions(-)
>   create mode 100644 test/misc/vshard_multi_fiber_cfg.result
>   create mode 100644 test/misc/vshard_multi_fiber_cfg.test.lua

2. Look at your commit on github:

https://github.com/tarantool/vshard/commit/427e1fbd027d924b31648fe2af1c889bf95b52b6

2.1. Your login listed right below the commit message should be clickable as well.
See example of my commit:
https://github.com/tarantool/vshard/commit/530a9e17ac289aa2b6cc55f3082fde6c78371043
Here you see my github avatar, can click on it. Please, fix some settings in
your local git. Most likely, you use an email different from currently active on
github.

2.2. As you can see, the commit title is wrapped because it is too long. Please,
avoid it. A commit title should not exceed 50 symbols. If you want to explain
something, please, feel free to do it in the commit message body (whose width
is limited with 72 symbols).

> 
> diff --git a/test/misc/vshard_multi_fiber_cfg.result b/test/misc/vshard_multi_fiber_cfg.result
> new file mode 100644
> index 0000000..21121eb
> --- /dev/null
> +++ b/test/misc/vshard_multi_fiber_cfg.result
> @@ -0,0 +1,53 @@
> +test_run = require('test_run').new()

3. You have tested only storage concurrent configuration, but
the router should be tested as well.

Also, I think that so basic test should be moved to storage/storage.test.lua
and router/router.test.lua, somewhere at the beginning of the files. In misc/
suite you have no an alive cluster so can not test storage/router.cfg properly.
Here you are testing them under pcall and not existing UUID, but I think, we
should test good and the most anticipated case, when cfg()'s, called from
multiple fibers, are serialized and work ok.

> +---
> +...
> +cfg.weights = nil

4. Why do you need weights nullifying?

5. Where did you create cfg? I do not see. And looks like because
'cfg' is not declared here, I got the errors on my laptop:

2018-08-14 23:34:29.955 [34528] main C> entering the event loop
2018-08-14 23:34:30.012 [34528] main/109/lua utils.c:923 E> LuajitError: [string "sf = function() fiber.create(function() pcall..."]:1: variable 'cfg' is not declared
2018-08-14 23:34:30.012 [34528] main/110/lua utils.c:923 E> LuajitError: [string "sf = function() fiber.create(function() pcall..."]:1: variable 'cfg' is not declared
2018-08-14 23:34:56.991 [34528] main C> got signal 2 - Interrupt: 2

The test just hangs. I wonder how it passes the tests on the travis ...

> +---
> +...
> +sample_replica_id = '9519e546-e39b-4db8-a3bb-66a59fdea9fc'
> +---
> +...
> +vshard = require('vshard')
> +---
> +...
> +-- gh-140 - Prohibit vshard.router/storage.cfg from multiple fibers

6. Please, finish the sentence with the dot.

> +fiber = require('fiber')
> +---
> +...
> +c = fiber.channel(2)
> +---
> +...
> +sf = function() fiber.create(function() pcall(vshard.storage.cfg, cfg, sample_replica_id, false) c:put(true) end) 

7. Why do you pass 'false' as a third parameter to vshard.storage.cfg?
It takes only two: cfg and uuid.

8. Even when the comments above are fixed, the test will pass
without your patch. We should think out how to test this bugfix
more univocal.

For example, use a new error injection (see M.errinj in
storage/init.lua) as a counter of concurrently running storage_cfg.
Its should be asserted to be < 2. This test should fail without
other part of your patch.

> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 971d830..de0a143 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -2,6 +2,20 @@ local log = require('log')
>   local lfiber = require('fiber')
>   local table_new = require('table.new')
>   
> +-- Function decorator that is used to prevent router.cfg() from
> +-- being called concurrently by different fibers.
> +local lock = lfiber.channel(1)
> +local function locked(f)
> +    return function(...)
> +        lock:put(true)
> +        local status, err = pcall(f, ...)
> +        lock:get()
> +        if not status then
> +            error(err)
> +        end
> +    end
> +end

9. Please, make the lock be part of M. You should never create any
objects (except results of 'require') in the global scope. M is
used both in the router and storage to do module reload safely.

Module reload is a module code reloading without restarting of
the instance. On reload all the code in the .lua file is parsed
and executed again. So when you created an object right in the
global scope, on reload the same second object will be created.

This lock should be single per module even on reload, so move it
to M. Look at other code creating singleton objects, like
M.buckets_to_recovery, M.collect_bucket_garbage_fiber etc.

Also please consult the end of storage/init.lua file for detailed
explanation about what is reload and how to work with it.



10. Besides, I think, that a special generic wrapper like
locked(f) is overkill for such a simple problem. What is more,
it is not efficient - each time I call locked() a new function
is parsed and created to wrap the argument because you call
'return function(...) ... '. Please, avoid such constructions
on hot execution paths. It is ok only when a function is generated
once, or at least very rare. Here it is not worth the cost.



11. Also, a comment on future. When you want a commonplace function
to be used both in storage and router, please, use special
utilities file: vshard/util.lua. It serves exactly to this goal -
store some functions not depending on vshard things.

Thank you for working on the patch!




More information about the Tarantool-patches mailing list