From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: Vladislav Shpilevoy, tarantool-patches@dev.tarantool.org, gorcunov@gmail.com
Date: Fri, 9 Apr 2021 11:26:23 +0300
Message-ID: <1460ae71-9e27-f844-1a97-acf3942700fc@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH 1/1] applier: process synchro rows after WAL write
List-Id: Tarantool development patches
09.04.2021 01:56, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
>> Thanks for the patch!
>>
>> I'm a bit worried about two different synchro rows coming from two
>> appliers. Is everything ok in this case?
>> Or even normal rows coming from other appliers. Say some other replica
>> has already applied this synchro row and even has written some rows on
>> top of it. Its applier won't block on replica_id latch, and may fail to apply
>> some txs following this synchro row, because it's not yet written to WAL
>> and thus not applied (limbo is still not empty or belongs to other instance).
> Should not happen, if I understood you correctly. See my response in the
> other email + a test for test-run + a console test.

Yes, it's fine, thanks for the explanation & the test!

>
>>> diff --git a/src/box/applier.cc b/src/box/applier.cc
>>> index 971b2e64c..f7c82548c 100644
>>> --- a/src/box/applier.cc
>>> +++ b/src/box/applier.cc
>>> @@ -762,26 +762,21 @@ applier_txn_wal_write_cb(struct trigger *trigger, void *event)
>>>   }
>>>
>>>   struct synchro_entry {
>>> -    /** Encoded form of a synchro record. */
>>> -    struct synchro_body_bin    body_bin;
>>> -
>>> -    /** xrow to write, used by the journal engine. */
>>> -    struct xrow_header row;
>>> -
>>> +    /** Request to process when WAL write is done. */
>>> +    struct synchro_request *req;
>>> +    /** Fiber created the entry. To wakeup when WAL write is done. */
>>> +    struct fiber *owner;
>>>       /**
>>> -     * The journal entry itself. Note since
>>> -     * it has unsized array it must be the
>>> -     * last entry in the structure.
>>> +     * The base journal entry. It has unsized array and then must be the
>>> +     * last entry in the structure. But can workaround it via a union
>>> +     * adding the needed tail as char[].
>>>        */
>>> -    struct journal_entry journal_entry;
>>> +    union {
>>> +        struct journal_entry base;
>>> +        char base_buf[sizeof(base) + sizeof(base.rows[0])];
>>> +    };
>>>   };

>> I don't understand this union stuff.
>> The journal_entry is the last entry in synchro_entry anyway.
>> Is this a hack for allowing to allocate synchro_entry on the stack?
> Yes. Otherwise I would need a more ugly hack:
>
> struct synchro_entry *e;
> alignas(alignof(*e)) char e_buf[sizeof(*e) + sizeof(e->base.rows[0])];
> e = (struct synchro_entry *)e_buf;
>
> Then I managed to create the union hack, which I find
> less ugly, and it seems to allow declaring it on the stack.
> The interesting thing though is that I still can't make that
> union anything but the last member.
>
> I also see now there is a problem in one of the CI jobs:
> https://github.com/tarantool/tarantool/runs/2291961909
>
> It complains about out-of-bounds array access. Which has nothing to
> do with the union, just a C++ issue I suppose. I fixed it in the new
> version, see the other email.

Ok, I see now. Thanks!

>
>>> diff --git a/test/replication/gh-5213-qsync-applier-order.test.lua b/test/replication/gh-5213-qsync-applier-order.test.lua
>>> new file mode 100644
>>> index 000000000..3a32626f2
>>> --- /dev/null
>>> +++ b/test/replication/gh-5213-qsync-applier-order.test.lua
>>> @@ -0,0 +1,129 @@
>>> +--
>>> +-- gh-5213: applier used to process CONFIRM/ROLLBACK before writing them to WAL.
>>> +-- As a result it could happen that the transactions became visible on CONFIRM,
>>> +-- then somehow weren't written to WAL, and on restart the data might not be
>>> +-- visible again. Which means rollback of confirmed data and is not acceptable
>>> +-- (on the contrary with commit after rollback).
>>> +--
>>> +test_run = require('test_run').new()
>>> +fiber = require('fiber')
>>> +old_synchro_quorum = box.cfg.replication_synchro_quorum
>>> +old_synchro_timeout = box.cfg.replication_synchro_timeout
>>> +
>>> +box.schema.user.grant('guest', 'super')
>>> +
>>> +s = box.schema.space.create('test', {is_sync = true})
>>> +_ = s:create_index('pk')
>>> +
>>> +test_run:cmd('create server replica with rpl_master=default,\
>>> +              script="replication/gh-5213-replica.lua"')
>>> +test_run:cmd('start server replica')
>>> +
>>> +test_run:switch('replica')
>>> +assert(box.info.id == 2)
>>> +lsn = box.info.vclock[1]
>>> +
>>> +test_run:switch('default')
>>> +fiber = require('fiber')
>>> +box.cfg{                                                                        \
>>> +    replication_synchro_quorum = 3,                                             \
>>> +    replication_synchro_timeout = 1000,                                         \
>>> +}
>>> +f = fiber.new(function() s:replace{1} end)
>>> +
>>> +test_run:switch('replica')
>>> +-- Wait when the transaction is written to WAL.
>>> +test_run:wait_cond(function() return box.info.vclock[1] == lsn + 1 end)

>> This shouldn't go wrong, but I find
>> test_run:wait_lsn('replica', 'default') more durable.
>> And you wouldn't need to save lsn above then.
>>
>> Same for other usages of lsn = ... and wait_cond(box.info.vclock[1] == lsn + ...)
>>
>> Up to you though.
> I totally forgot about wait_lsn(), thanks for noticing.
> I applied it. See the full diff in the second email.

Thanks for the fixes, LGTM.

-- 
Serge Petrenko