Subject: [tarantool-patches] Re: [PATCH 2/2] Fix discovery/reconfigure race
From: Vladislav Shpilevoy
Date: Wed, 27 Jun 2018 14:45:27 +0300
To: Alex Khatskevich, tarantool-patches@freelists.org
In-Reply-To: <7790cb4a-ff5e-99bf-03bc-996fea379185@tarantool.org>

Thanks for the fixes! See 2 comments below.

> commit bc90c3db5f5b1b663c747b8c2829fc8528af70cf
> Author: AKhatskevich
> Date:   Thu Jun 14 16:03:09 2018 +0300
>
>     Fix discovery/reconfigure race
>
>     This commit prevents discovery fiber from discovering old replicasets
>     and spoiling `route_map`.
>
> diff --git a/test/router/router.result b/test/router/router.result
> index 5643f3e..36d54bf 100644
> --- a/test/router/router.result
> +++ b/test/router/router.result
> @@ -1095,6 +1095,55 @@ for bucket, old_rs in pairs(bucket_to_old_rs) do
>  end;
>  ---
>  ...
> +--
> +-- Check route_map is not filled with old replica objects after
> +-- reconfigure.
> +--
> +-- Simulate long `callro`.
> +vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY = true;
> +---
> +...
> +while not vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY == 'waiting' do

1. Why 'not value == needed_value'? Why not 'value ~= needed_value'?
And how does it work? I checked this in Lua and got these results:

-- Before the loop
tarantool> not true == 'waiting'
---
- false
...

-- After 'waiting' is set
tarantool> not 'waiting' == 'waiting'
---
- false
...

So the loop body is never run. I proved it by putting assert(false)
into its first line, and the test passed as well:

[001] +++ router/router.reject Wed Jun 27 14:39:30 2018
[001] @@ -1104,6 +1104,7 @@
[001] ---
[001] ...
[001] while not vshard.router.internal.errinj.ERRINJ_LONG_DISCOVERY == 'waiting' do
[001] + assert(false)
[001] vshard.router.discovery_wakeup()
[001] fiber.sleep(0.02)
[001] end;
[001]

Please investigate again why your test passes even without this loop.

> +    vshard.router.discovery_wakeup()
> +    fiber.sleep(0.02)
> +end;
> +---
> +...
> +vshard.router.cfg(cfg);
> +---
> +...
> +route_map = vshard.router.internal.route_map
> +for bucket_id, _ in pairs(route_map) do
> +    route_map[bucket_id] = nil
> +end;

2. Why not just 'vshard.router.internal.route_map = {}' ?
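
For reference on comment 1: in Lua the unary `not` binds tighter than `==`, so `not x == 'waiting'` is parsed as `(not x) == 'waiting'`. That compares a boolean with a string and is false for any `x`. A minimal sketch in plain Lua, assuming nothing from vshard (the two function names and the `flag` argument are hypothetical stand-ins for the errinj value):

```lua
-- Hypothetical stand-in for the condition written in the test.
-- Parsed as (not flag) == 'waiting': a boolean compared with a string,
-- which is false regardless of flag.
local function condition_as_written(flag)
    return not flag == 'waiting'
end

-- Hypothetical stand-in for the condition the test presumably intended.
local function condition_intended(flag)
    return flag ~= 'waiting'
end

-- The two states the errinj value takes in the test: `true`, then 'waiting'.
for _, flag in ipairs({true, 'waiting'}) do
    print(tostring(flag), condition_as_written(flag), condition_intended(flag))
end
```

With `value ~= 'waiting'` (or `not (value == 'waiting')`) the loop would keep running until the value actually becomes 'waiting'.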