Subject: [tarantool-patches] Re: [PATCH 1/2] Add test on error during reconfigure
From: Vladislav Shpilevoy
To: tarantool-patches@freelists.org, AKhatskevich
Date: Sun, 17 Jun 2018 22:46:46 +0300
Message-ID: <1fc9755e-0adc-8a44-4016-dedb7ba557b7@tarantool.org>
In-Reply-To: <489ade011c878e28236afe2792e0eddb1ded75b9.1528566184.git.avkhatskevich@tarantool.org>

Thanks for the patch! See 4 comments below.

On 09/06/2018 20:47, AKhatskevich wrote:
> In case reconfigure process fails, the node should continue
> work properly.
> ---
> test/lua_libs/util.lua | 16 ++++++++++++++++
> test/router/router.result | 27 +++++++++++++++++++++++++++
> test/router/router.test.lua | 10 ++++++++++
> test/storage/storage.result | 33 +++++++++++++++++++++++++++++++++
> test/storage/storage.test.lua | 12 ++++++++++++
> vshard/router/init.lua | 7 +++++++
> vshard/storage/init.lua | 9 +++++++++
> 7 files changed, 114 insertions(+)
>
> diff --git a/test/router/router.result b/test/router/router.result
> index 2ee1bff..3ebab5d 100644
> --- a/test/router/router.result
> +++ b/test/router/router.result
> @@ -1057,6 +1057,33 @@ error_messages
> - - Use replica:is_connected(...) instead of replica.is_connected(...)
> - Use replica:safe_uri(...) instead of replica.safe_uri(...)
> ...
> +-- Error during reconfigure process.
> +_ = vshard.router.route(1):callro('echo', {'some_data'})

1. Why do you need this call here? It does not output anything.

> +---
> +...
> +vshard.router.internal.errinj.ERRINJ_CFG = true
> +---
> +...
> +old_internal = table.copy(vshard.router.internal)
> +---
> +...
> +_, err = pcall(vshard.router.cfg, cfg)
> +---
> +...
> +err:match('Error injection:.*')
> +---
> +- 'Error injection: cfg'
> +...
> +vshard.router.internal.errinj.ERRINJ_CFG = false
> +---
> +...
> +util.has_same_fields(old_internal, vshard.router.internal)
> +---
> +- true
> +...
> +_ = vshard.router.route(1):callro('echo', {'some_data'})

2. Same. Maybe this call should output something? This test would not
fail if callro failed. Call/ro/rw do not throw exceptions; they return
a pair: the result and an error, if one occurred.

> +---
> +...
> _ = test_run:cmd("switch default")
> ---
> ...
> diff --git a/test/storage/storage.result b/test/storage/storage.result
> index d0bf792..8d88bf4 100644
> --- a/test/storage/storage.result
> +++ b/test/storage/storage.result
> @@ -720,6 +720,39 @@ test_run:cmd("setopt delimiter ''");
> ---
> - true
> ...
> +-- Error during reconfigure process.
> +_, rs = next(vshard.storage.internal.replicasets)
> +---
> +...
> +_ = rs:callro('echo', {'some_data'})

3. Same. Here and at the end of the test.

> +---
> +...
> +vshard.storage.internal.errinj.ERRINJ_CFG = true
> +---
> +...
> +old_internal = table.copy(vshard.storage.internal)
> +---
> +...
> +_, err = pcall(vshard.storage.cfg, cfg, names.storage_1_a)
> +---
> +...
> +err:match('Error injection:.*')
> +---
> +- 'Error injection: cfg'
> +...
> +vshard.storage.internal.errinj.ERRINJ_CFG = false
> +---
> +...
> +util.has_same_fields(old_internal, vshard.storage.internal)
> +---
> +- true
> +...
> +_, rs = next(vshard.storage.internal.replicasets)
> +---
> +...
> +_ = rs:callro('echo', {'some_data'})
> +---
> +...
> _ = test_run:cmd("switch default")
> ---
> ...
> diff --git a/vshard/router/init.lua b/vshard/router/init.lua
> index 21093e5..1dee80c 100644
> --- a/vshard/router/init.lua
> +++ b/vshard/router/init.lua
> @@ -473,6 +474,12 @@ local function router_cfg(cfg)
> end
> box.cfg(cfg)
> log.info("Box has been configured")
> + -- It is considered that all possible errors during cfg
> + -- process occur only before this place.
> + -- This check should be placed as late as possible.
> + if M.errinj.ERRINJ_CFG then
> + error('Error injection: cfg')
> + end

4. I think you should place this injection before box.cfg. All the
cfg() code relies on the fact that box.cfg is atomic: if box.cfg
succeeds, then storage/router.cfg finishes with no errors. An error is
possible only before box.cfg or inside box.cfg. You may think of
box.cfg as commit, and of the code before box.cfg as prepare.

> M.total_bucket_count = total_bucket_count
> M.collect_lua_garbage = collect_lua_garbage
> -- TODO: update existing route map in-place
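
To illustrate comment 4, here is a minimal sketch in plain Lua of the prepare/commit layout with the injection placed before the commit point. It is not vshard code: fake_box_cfg, box_state, the cfg fields and this simplified router_cfg are hypothetical stand-ins, and the sketch assumes the box.cfg stand-in either applies the whole config or throws.

```lua
-- Hypothetical module state; only the names M, errinj.ERRINJ_CFG and
-- total_bucket_count mirror the quoted patch.
local M = {
    total_bucket_count = 0,
    errinj = { ERRINJ_CFG = false },
}

-- Hypothetical stand-in for the state owned by box.cfg.
local box_state = {}

-- Stand-in for box.cfg, assumed atomic: it throws before changing
-- anything, or it applies the config.
local function fake_box_cfg(cfg)
    if type(cfg.listen) ~= 'string' then
        error('fake_box_cfg: listen must be a string')
    end
    box_state.listen = cfg.listen
end

local function router_cfg(cfg)
    -- Prepare: validate and compute new values in locals only.
    if type(cfg.bucket_count) ~= 'number' then
        error('bucket_count must be a number')
    end
    local total_bucket_count = cfg.bucket_count
    -- Injection sits in the prepare phase, where errors are allowed.
    if M.errinj.ERRINJ_CFG then
        error('Error injection: cfg')
    end
    -- Commit point.
    fake_box_cfg(cfg)
    -- After commit: plain assignments that are not expected to fail.
    M.total_bucket_count = total_bucket_count
end

local cfg = { bucket_count = 3000, listen = '3301' }

-- Failed reconfigure: neither M nor box_state should change.
M.errinj.ERRINJ_CFG = true
local ok_injected, err_injected = pcall(router_cfg, cfg)
local count_after_failure = M.total_bucket_count
local listen_after_failure = box_state.listen

-- Clean reconfigure: both are updated.
M.errinj.ERRINJ_CFG = false
local ok_clean = pcall(router_cfg, cfg)

print(ok_injected, err_injected)
print(count_after_failure, listen_after_failure)
print(ok_clean, M.total_bucket_count, box_state.listen)
```

With the injection before the commit point, the failed call leaves both the module table and the box.cfg stand-in untouched, which is the property the test is meant to check. If the injection came after fake_box_cfg, the failure would leave box_state updated while M stayed old, a state that, per comment 4, should not be reachable.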