From: Vladislav Shpilevoy via Tarantool-patches
Reply-To: Vladislav Shpilevoy
To: Serge Petrenko, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH 2/7] replication: forbid implicit limbo owner transition
Date: Tue, 15 Jun 2021 22:55:21 +0200
Message-ID: <797fdc67-944d-67f2-cd44-ff8ce79ce342@tarantool.org>

Good job on the patch! See 4 comments below.

On 10.06.2021 15:32, Serge Petrenko via Tarantool-patches wrote:
> Forbid limbo ownership transition without an explicit promote.
> Make it so that synchronous transactions may be committed only after it
> is claimed by some instance via a PROMOTE request.
>
> Make everyone but the limbo owner read-only even when the limbo is
> empty.
>
> Part-of #6034
>
> @TarantoolBot document
> Title: synchronous replication changes
>
> `box.info.synchro.queue` receives a new field: `owner`. It's a replica
> id of the instance owning the synchronous transaction limbo.
>
> Once some instance owns the limbo, every other instance becomes
> read-only. When the limbo is unclaimed, e.g.
> `box.info.synchro.queue.owner` is `0`, everyone may be writeable, but
> cannot create synchronous transactions.
>
> In order to claim or re-claim the limbo, you have to issue
> `box.ctl.promote()` on the instance you wish to promote.
>
> When elections are enabled, the instance issues `box.ctl.promote()`
> automatically once it wins the elections.

1. It might not be a good idea to mention the limbo in the public
documentation. It is not described there at all, and it raises the
question "what is limbo?". It may be better to replace "limbo" with
"queue" everywhere in the public texts such as doc requests, changelogs,
and error messages.
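To make the quoted rules concrete, the ownership behaviour described in the doc request can be sketched as a toy C model. This is not Tarantool code: `struct sync_queue`, `queue_promote()`, `queue_is_writable()` and `queue_can_write_sync()` are hypothetical names chosen for illustration, and the model says "queue" rather than "limbo", as suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the rules in the doc request. All names here are
 * hypothetical and do not exist in the Tarantool source tree.
 */
enum { QUEUE_UNCLAIMED = 0 };

struct sync_queue {
	/* Replica id of the owner; 0 while nobody has claimed the queue. */
	uint32_t owner;
};

/* Models an explicit promote issued on the instance with replica id `id`. */
static void
queue_promote(struct sync_queue *q, uint32_t id)
{
	/* Replica id 0 is reserved for "unclaimed" in this model. */
	assert(id != QUEUE_UNCLAIMED);
	q->owner = id;
}

/*
 * Plain writes: everyone may write while the queue is unclaimed;
 * once it is claimed, only the owner stays writable.
 */
static bool
queue_is_writable(const struct sync_queue *q, uint32_t id)
{
	return q->owner == QUEUE_UNCLAIMED || q->owner == id;
}

/* Synchronous transactions: only the owner of a claimed queue may create them. */
static bool
queue_can_write_sync(const struct sync_queue *q, uint32_t id)
{
	return q->owner != QUEUE_UNCLAIMED && q->owner == id;
}
```

Under this reading, a freshly started instance with `owner == 0` accepts plain writes but rejects synchronous ones until some instance is promoted, after which every non-owner becomes read-only.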

> diff --git a/src/box/errcode.h b/src/box/errcode.h
> index d93820e96..e75f54a01 100644
> --- a/src/box/errcode.h
> +++ b/src/box/errcode.h
> @@ -277,6 +277,7 @@ struct errcode_record {
>  	/*222 */_(ER_QUORUM_WAIT,		"Couldn't wait for quorum %d: %s") \
>  	/*223 */_(ER_INTERFERING_PROMOTE,	"Instance with replica id %u was promoted first") \
>  	/*224 */_(ER_RAFT_DISABLED,		"Elections were turned off while running box.ctl.promote()")\
> +	/*225 */_(ER_LIMBO_UNCLAIMED,		"Synchronous transaction limbo doesn't belong to any instance")\

2. The same as above: let's not mention the limbo in the public space.
It might be ER_SYNC_UNCLAIMED, or ER_SYNC_QUEUE_UNCLAIMED, or a similar
option. And in the error message: transaction limbo -> transaction queue.

> diff --git a/test/replication/qsync_basic.test.lua b/test/replication/qsync_basic.test.lua
> index 75c9b222b..6a49e2b01 100644
> --- a/test/replication/qsync_basic.test.lua
> +++ b/test/replication/qsync_basic.test.lua
> @@ -248,29 +249,6 @@ for i = 1, 100 do box.space.sync:delete{i} end
>  test_run:cmd('switch replica')
>  box.space.sync:count()
>
> ---
> --- gh-5445: NOPs bypass the limbo for the sake of vclock bumps from foreign
> --- instances, but also works for local rows.

3. Please try to keep this test somehow. You can try to perform these
nops locally on the default node. Otherwise is_fully_nop in
txn_journal_entry_new() remains untested.

4.
When I tried to run the tests, I got a crash:

[010] replication/gh-5195-qsync-replica-write.test.l> memtx
[010]
[010] [Instance "replica" killed by signal: 6 (SIGABRT)]
[010]
[010] Last 15 lines of Tarantool Log file [Instance "replica"][/Users/gerold/Work/Repositories/tarantool/test/var/010_replication/replica.log]:
[010] 2021-06-15 22:13:04.821 [40345] main/112/applier/unix/:/Users/gerold/Work/Repositories/tarantool/test/var/010_replication/master.socket-iproto I> RAFT: message {term: 1, state: follower} from 1
[010] 2021-06-15 22:13:04.824 [40345] main/113/applierw/unix/:/Users/gerold/Work/Repositories/tarantool/test/var/010_replication/master.socket-iproto C> leaving orphan mode
[010] 2021-06-15 22:13:04.824 [40345] main/103/replica I> replica set sync complete
[010] 2021-06-15 22:13:04.824 [40345] main/103/replica C> leaving orphan mode
[010] 2021-06-15 22:13:04.825 [40345] main/103/replica I> set 'log_level' configuration option to 5
[010] 2021-06-15 22:13:04.825 [40345] main/107/checkpoint_daemon I> scheduled next checkpoint for Tue Jun 15 23:20:29 2021
[010] 2021-06-15 22:13:04.825 [40345] main/103/replica I> set 'replication_timeout' configuration option to 0.1
[010] 2021-06-15 22:13:04.825 [40345] main/103/replica I> set 'memtx_memory' configuration option to 107374182
[010] 2021-06-15 22:13:04.826 [40345] main/103/replica I> set 'replication_sync_timeout' configuration option to 100
[010] 2021-06-15 22:13:04.826 [40345] main/103/replica I> set 'listen' configuration option to "\/Users\/gerold\/Work\/Repositories\/tarantool\/test\/var\/010_replication\/replica.socket-iproto"
[010] 2021-06-15 22:13:04.826 [40345] main/103/replica I> set 'replication' configuration option to ["\/Users\/gerold\/Work\/Repositories\/tarantool\/test\/var\/010_replication\/master.socket-iproto"]
[010] 2021-06-15 22:13:04.826 [40345] main/103/replica I> set 'log_format' configuration option to "plain"
[010] 2021-06-15 22:13:04.827 [40345] main C> entering the event loop
[010] 2021-06-15 22:13:04.903 [40345] main/128/console/unix/: I> set 'replication_synchro_timeout' configuration option to 0.001
[010] Assertion failed: (txn_limbo_is_empty(&txn_limbo)), function box_promote, file /Users/gerold/Work/Repositories/tarantool/src/box/box.cc, line 1678.

And a wrong result:

[005] replication/gh-5298-qsync-recovery-snap.test.l> vinyl [ fail ]
[005]
[005] Test failed! Result content mismatch:
[005] --- replication/gh-5298-qsync-recovery-snap.result	Tue Jun 15 21:48:57 2021
[005] +++ var/rejects/replication/gh-5298-qsync-recovery-snap.reject	Tue Jun 15 22:13:44 2021
[005] @@ -46,7 +46,7 @@
[005]  -- Could hang if the limbo would incorrectly handle the snapshot end.
[005]  box.space.sync:replace{11}
[005]   | ---
[005] - | - [11]
[005] + | - error: Synchronous transaction limbo doesn't belong to any instance
[005]   | ...
[005]
[005]  old_synchro_quorum = box.cfg.replication_synchro_quorum
[005] @@ -64,7 +64,7 @@
[005]   | ...
[005]  box.space.sync:replace{12}
[005]   | ---
[005] - | - error: Quorum collection for a synchronous transaction is timed out
[005] + | - error: Synchronous transaction limbo doesn't belong to any instance
[005]   | ...
[005]
[005]  box.cfg{ \
[005] @@ -75,18 +75,16 @@
[005]   | ...
[005]  box.space.sync:replace{13}
[005]   | ---
[005] - | - [13]
[005] + | - error: Synchronous transaction limbo doesn't belong to any instance
[005]   | ...
[005]  box.space.sync:get({11})
[005]   | ---
[005] - | - [11]
[005]   | ...
[005]  box.space.sync:get({12})
[005]   | ---
[005]   | ...
[005]  box.space.sync:get({13})
[005]   | ---
[005] - | - [13]
[005]   | ...
[005]
[005]  box.cfg{ \
[005]

Then I stopped.