From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: Yan Shtunder, tml
Date: Fri, 29 Oct 2021 11:06:45 +0300
Subject: Re: [Tarantool-patches] [PATCH v3] replication: removing anonymous replicas from synchro quorum
References: <20211025095223.22521-1-ya.shtunder@gmail.com> <665ef444-80de-4848-f9f0-a3ccc6e7c059@tarantool.org>

28.10.2021 18:56, Yan Shtunder wrote:
> Hi! Thank you for the review!
> I have fixed the errors.
>
>     Nit: better say "Transactions should be committed".
>     reaches -> reach.
>
> Transactions should be committed after they reach quorum of "real"
> cluster members.
>
>     Please, find a more informative test name.
>     For example, "gh_5418_qsync_with_anon_test.lua"
>
> gh_5418_test.lua -> gh_5418_qsync_with_anon_test.lua
>
>     Please, use `t.helpers.retrying()` here.
>
> I used the wait_vclock function from the luatest_helpers.lua file.
>
> --
> Yan Shtunder

Good job on the fixes! LGTM.

> Mon, 25 Oct 2021 at 16:32, Serge Petrenko wrote:
>
> > 25.10.2021 12:52, Yan Shtunder via Tarantool-patches wrote:
> >
> > Hi! Good job on porting the test to the current luatest version!
> > Please, find a couple of comments below.
> >
> > > Transactions have to committed after they reaches quorum of "real"
> >
> > Nit: better say "Transactions should be committed".
> > reaches -> reach.
> >
> > > cluster members. Therefore, anonymous replicas don't have to
> > > participate in the quorum.
> > >
> > > Closes #5418
> > > ---
> > > Issue: https://github.com/tarantool/tarantool/issues/5418
> > > Patch: https://github.com/tarantool/tarantool/tree/yshtunder/gh-5418-qsync-with-anon-replicas
> > >
> > >  src/box/relay.cc                          |  3 +-
> > >  test/replication-luatest/gh_5418_test.lua | 82 +++++++++++++++++++++++
> > >  2 files changed, 84 insertions(+), 1 deletion(-)
> > >  create mode 100644 test/replication-luatest/gh_5418_test.lua
> > >
> > > diff --git a/src/box/relay.cc b/src/box/relay.cc
> > > index f5852df7b..cf569e8e2 100644
> > > --- a/src/box/relay.cc
> > > +++ b/src/box/relay.cc
> > > @@ -543,6 +543,7 @@ tx_status_update(struct cmsg *msg)
> > >       struct replication_ack ack;
> > >       ack.source = status->relay->replica->id;
> > >       ack.vclock = &status->vclock;
> > > +     bool anon = status->relay->replica->anon;
> > >       /*
> > >        * Let pending synchronous transactions know, which of
> > >        * them were successfully sent to the replica. Acks are
> > > @@ -550,7 +551,7 @@ tx_status_update(struct cmsg *msg)
> > >        * the single master in 100% so far). Other instances wait
> > >        * for master's CONFIRM message instead.
> > >        */
> > > -     if (txn_limbo.owner_id == instance_id) {
> > > +     if (txn_limbo.owner_id == instance_id && !anon) {
> > >               txn_limbo_ack(&txn_limbo, ack.source,
> > >                             vclock_get(ack.vclock, instance_id));
> > >       }
> >
> > I can't build your patch to test it manually, compilation fails with
> > some ERRINJ-related errors.
> >
> > Seems like the commit "replication: fill replicaset.applier.vclock
> > after local recovery" you have on the branch is extraneous.
> > And it causes the error.
> >
> > Please remove it.
> >
> > > diff --git a/test/replication-luatest/gh_5418_test.lua b/test/replication-luatest/gh_5418_test.lua
> > > new file mode 100644
> > > index 000000000..265d28ccb
> > > --- /dev/null
> > > +++ b/test/replication-luatest/gh_5418_test.lua
> >
> > Please, find a more informative test name.
> > For example, "gh_5418_qsync_with_anon_test.lua"
> >
> > > @@ -0,0 +1,82 @@
> > > +local fio = require('fio')
> > > +local log = require('log')
> > > +local fiber = require('fiber')
> > > +local t = require('luatest')
> > > +local cluster = require('test.luatest_helpers.cluster')
> > > +local helpers = require('test.luatest_helpers.helpers')
> > > +
> > > +local g = t.group('gh-5418')
> > > +
> > > +g.before_test('test_qsync_with_anon', function()
> > > +    g.cluster = cluster:new({})
> > > +
> > > +    local box_cfg = {
> > > +        replication         = {helpers.instance_uri('master')},
> > > +        replication_synchro_quorum = 2,
> > > +        replication_timeout = 0.1
> > > +    }
> > > +
> > > +    g.master = g.cluster:build_server({alias = 'master'}, engine, box_cfg)
> > > +
> > > +    local box_cfg = {
> > > +        replication         = {
> > > +            helpers.instance_uri('master'),
> > > +            helpers.instance_uri('replica')
> > > +        },
> > > +        replication_timeout = 0.1,
> > > +        replication_connect_timeout = 0.5,
> > > +        read_only           = true,
> > > +        replication_anon    = true
> > > +    }
> > > +
> > > +    g.replica = g.cluster:build_server({alias = 'replica'}, engine, box_cfg)
> > > +
> > > +    g.cluster:join_server(g.master)
> > > +    g.cluster:join_server(g.replica)
> > > +    g.cluster:start()
> > > +    log.info('Everything is started')
> > > +end)
> > > +
> > > +g.after_test('test_qsync_with_anon', function()
> > > +    g.cluster:stop()
> > > +    fio.rmtree(g.master.workdir)
> > > +    fio.rmtree(g.replica.workdir)
> > > +end)
> > > +
> > > +local function wait_vclock(timeout)
> > > +    local started_at = fiber.clock()
> > > +    local lsn = g.master:eval("return box.info.vclock[1]")
> > > +
> > > +    local _, tbl = g.master:eval("return next(box.info.replication_anon())")
> > > +    local to_lsn = tbl.downstream.vclock[1]
> > > +
> > > +    while to_lsn == nil or to_lsn < lsn do
> > > +        fiber.sleep(0.001)
> > > +
> > > +        if (fiber.clock() - started_at) > timeout then
> > > +            return false
> > > +        end
> > > +
> > > +        _, tbl = g.master:eval("return next(box.info.replication_anon())")
> > > +        to_lsn = tbl.downstream.vclock[1]
> > > +
> > > +        log.info(string.format("master lsn: %d; replica_anon lsn: %d",
> > > +            lsn, to_lsn))
> > > +    end
> > > +
> > > +    return true
> > > +end
> > > +
> > > +g.test_qsync_with_anon = function()
> > > +    g.master:eval("box.schema.space.create('sync', {is_sync = true})")
> > > +    g.master:eval("box.space.sync:create_index('pk')")
> > > +
> > > +    t.assert_error_msg_content_equals("Quorum collection for a synchronous transaction is timed out",
> > > +        function() g.master:eval("return box.space.sync:insert{1}") end)
> > > +
> > > +    -- Wait until everything is replicated from the master to the replica
> > > +    t.assert(wait_vclock(1))
> >
> > Please, use `t.helpers.retrying()` here.
> > It receives a timeout and a function to call.
> > Like `t.helpers.retrying({timeout=5}, wait_vclock)`
> > And wait_vclock should simply return true or false based on
> > whether the replica has reached master's vclock.
> >
> > Also, please choose a bigger timeout. Like 5 or 10 seconds.
> > Otherwise the test will be flaky on slow testing machines in our CI.
> >
> > > +
> > > +    t.assert_equals(g.master:eval("return box.space.sync:select()"), {})
> > > +    t.assert_equals(g.replica:eval("return box.space.sync:select()"), {})
> > > +end
> > > --
> > > 2.25.1
> >
> > --
> > Serge Petrenko

--
Serge Petrenko
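
For readers following the relay.cc hunk quoted in this thread: the change makes the limbo owner ignore acks that arrive through a relay serving an anonymous replica, so only "real" cluster members can complete the quorum. The snippet below is a minimal toy model of that condition in C++, not Tarantool's actual txn_limbo. The class ToyLimbo, its methods, and the use of id 0 for the anonymous replica are hypothetical names and values chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Toy stand-in for a synchronous-transaction limbo.
// Hypothetical illustration only; this is NOT Tarantool's txn_limbo.
class ToyLimbo {
public:
    ToyLimbo(uint32_t owner_id, std::size_t quorum)
        : owner_id_(owner_id), quorum_(quorum)
    {
        assert(quorum > 0);
    }

    // Same shape as the patched condition:
    //   if (txn_limbo.owner_id == instance_id && !anon)
    // Only the limbo owner collects acks, and acks that come from
    // anonymous replicas are dropped.
    void on_ack(uint32_t instance_id, uint32_t source_id, bool anon)
    {
        if (owner_id_ == instance_id && !anon)
            acked_.insert(source_id);
    }

    // The quorum is reached once enough distinct non-anonymous
    // instances have acknowledged.
    bool quorum_reached() const { return acked_.size() >= quorum_; }

private:
    uint32_t owner_id_;
    std::size_t quorum_;
    std::set<uint32_t> acked_;
};
```

With a quorum of 2, an ack from the master itself plus an ack from an anonymous replica leaves the toy quorum unreached, which corresponds to the "Quorum collection for a synchronous transaction is timed out" expectation in the quoted test. A second non-anonymous ack is what completes it.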