Tarantool development patches archive
* [Tarantool-patches] [PATCH 1/1] [tosquash] test: add a test for sync heartbeats
@ 2020-06-23 22:39 Vladislav Shpilevoy
  2020-06-25  7:59 ` Serge Petrenko
  0 siblings, 1 reply; 4+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-23 22:39 UTC (permalink / raw)
  To: tarantool-patches, sergepetrenko

Should be squashed into the commit closing 5100.
---
Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
Issue: https://github.com/tarantool/tarantool/issues/4842

 .../sync_replication_sanity.result            | 50 +++++++++++++++++++
 .../sync_replication_sanity.test.lua          | 22 ++++++++
 2 files changed, 72 insertions(+)

diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
index 4b9823d77..a0591dcf3 100644
--- a/test/replication/sync_replication_sanity.result
+++ b/test/replication/sync_replication_sanity.result
@@ -178,6 +178,53 @@ box.space.sync:select{}
  |   - [3]
  | ...
 
+--
+-- gh-5100: replica should send ACKs for sync transactions after
+-- WAL write immediately, not waiting for replication timeout or
+-- a CONFIRM.
+--
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
+ | ---
+ | ...
+test_run:switch('default')
+ | ---
+ | - true
+ | ...
+-- Commit something non-sync. So as applier writer fiber would
+-- flush the pending heartbeat and go to sleep with the new huge
+-- replication timeout.
+s = box.schema.create_space('test')
+ | ---
+ | ...
+pk = s:create_index('pk')
+ | ---
+ | ...
+s:replace{1}
+ | ---
+ | - [1]
+ | ...
+-- Now commit something sync. It should return immediately even
+-- though the replication timeout is huge.
+box.space.sync:replace{4}
+ | ---
+ | - [4]
+ | ...
+test_run:switch('replica')
+ | ---
+ | - true
+ | ...
+box.space.sync:select{4}
+ | ---
+ | - - [4]
+ | ...
+
 -- Cleanup.
 test_run:cmd('switch default')
  | ---
@@ -195,6 +242,9 @@ test_run:cmd('delete server replica')
  | ---
  | - true
  | ...
+box.space.test:drop()
+ | ---
+ | ...
 box.space.sync:drop()
  | ---
  | ...
diff --git a/test/replication/sync_replication_sanity.test.lua b/test/replication/sync_replication_sanity.test.lua
index 8715a4600..f769804ca 100644
--- a/test/replication/sync_replication_sanity.test.lua
+++ b/test/replication/sync_replication_sanity.test.lua
@@ -71,11 +71,33 @@ box.space.sync:select{}
 test_run:cmd('restart server replica')
 box.space.sync:select{}
 
+--
+-- gh-5100: replica should send ACKs for sync transactions after
+-- WAL write immediately, not waiting for replication timeout or
+-- a CONFIRM.
+--
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
+test_run:switch('replica')
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
+test_run:switch('default')
+-- Commit something non-sync. So as applier writer fiber would
+-- flush the pending heartbeat and go to sleep with the new huge
+-- replication timeout.
+s = box.schema.create_space('test')
+pk = s:create_index('pk')
+s:replace{1}
+-- Now commit something sync. It should return immediately even
+-- though the replication timeout is huge.
+box.space.sync:replace{4}
+test_run:switch('replica')
+box.space.sync:select{4}
+
 -- Cleanup.
 test_run:cmd('switch default')
 
 box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
 test_run:cmd('stop server replica')
 test_run:cmd('delete server replica')
+box.space.test:drop()
 box.space.sync:drop()
 box.schema.user.revoke('guest', 'replication')
-- 
2.21.1 (Apple Git-122.3)


* Re: [Tarantool-patches] [PATCH 1/1] [tosquash] test: add a test for sync heartbeats
  2020-06-23 22:39 [Tarantool-patches] [PATCH 1/1] [tosquash] test: add a test for sync heartbeats Vladislav Shpilevoy
@ 2020-06-25  7:59 ` Serge Petrenko
  2020-06-25 20:58   ` Vladislav Shpilevoy
  0 siblings, 1 reply; 4+ messages in thread
From: Serge Petrenko @ 2020-06-25  7:59 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches

Hi! Thanks for the patch!

Please see 1 comment below.

24.06.2020 01:39, Vladislav Shpilevoy wrote:
> Should be squashed into the commit closing 5100.
> ---
> Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
> Issue: https://github.com/tarantool/tarantool/issues/4842
>
>   .../sync_replication_sanity.result            | 50 +++++++++++++++++++
>   .../sync_replication_sanity.test.lua          | 22 ++++++++
>   2 files changed, 72 insertions(+)
>
> diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
> index 4b9823d77..a0591dcf3 100644
> --- a/test/replication/sync_replication_sanity.result
> +++ b/test/replication/sync_replication_sanity.result
> @@ -178,6 +178,53 @@ box.space.sync:select{}
>    |   - [3]
>    | ...
>   
> +--
> +-- gh-5100: replica should send ACKs for sync transactions after
> +-- WAL write immediately, not waiting for replication timeout or
> +-- a CONFIRM.
> +--
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> + | ---
> + | ...

You should remember previous replication_timeout here and set it back 
during cleanup.

Other than that, LGTM.
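
The save-and-restore pattern suggested above can be sketched roughly as
follows. This is only an illustration in plain Lua: the `cfg` table and the
`override` helper are hypothetical stand-ins so that the snippet runs without
Tarantool, and the starting values are made up. In the real test the options
would be read from and written to box.cfg.

```lua
-- Hypothetical stand-in for box.cfg so the sketch runs in plain Lua.
-- The starting values below are illustrative only.
local cfg = {
    replication_timeout = 1,
    replication_synchro_timeout = 5,
}

-- Apply new option values and return the values they replaced.
-- Assumption: every overridden option already has a non-nil value.
local function override(opts)
    local old = {}
    for name, value in pairs(opts) do
        old[name] = cfg[name]
        cfg[name] = value
    end
    return old
end

-- Test body: raise both timeouts, remembering the previous ones.
local old = override({
    replication_timeout = 1000,
    replication_synchro_timeout = 1000,
})

-- ... test steps that rely on the huge timeouts would go here ...

-- Cleanup: put every overridden option back.
override(old)
```

Keeping the old values next to the place where they are overridden makes it
harder to forget one of them during cleanup.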

> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> + | ---
> + | ...
> +test_run:switch('default')
> + | ---
> + | - true
> + | ...
> +-- Commit something non-sync. So as applier writer fiber would
> +-- flush the pending heartbeat and go to sleep with the new huge
> +-- replication timeout.
> +s = box.schema.create_space('test')
> + | ---
> + | ...
> +pk = s:create_index('pk')
> + | ---
> + | ...
> +s:replace{1}
> + | ---
> + | - [1]
> + | ...
> +-- Now commit something sync. It should return immediately even
> +-- though the replication timeout is huge.
> +box.space.sync:replace{4}
> + | ---
> + | - [4]
> + | ...
> +test_run:switch('replica')
> + | ---
> + | - true
> + | ...
> +box.space.sync:select{4}
> + | ---
> + | - - [4]
> + | ...
> +
>   -- Cleanup.
>   test_run:cmd('switch default')
>    | ---
> @@ -195,6 +242,9 @@ test_run:cmd('delete server replica')
>    | ---
>    | - true
>    | ...
> +box.space.test:drop()
> + | ---
> + | ...
>   box.space.sync:drop()
>    | ---
>    | ...
> diff --git a/test/replication/sync_replication_sanity.test.lua b/test/replication/sync_replication_sanity.test.lua
> index 8715a4600..f769804ca 100644
> --- a/test/replication/sync_replication_sanity.test.lua
> +++ b/test/replication/sync_replication_sanity.test.lua
> @@ -71,11 +71,33 @@ box.space.sync:select{}
>   test_run:cmd('restart server replica')
>   box.space.sync:select{}
>   
> +--
> +-- gh-5100: replica should send ACKs for sync transactions after
> +-- WAL write immediately, not waiting for replication timeout or
> +-- a CONFIRM.
> +--
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> +test_run:switch('replica')
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> +test_run:switch('default')
> +-- Commit something non-sync. So as applier writer fiber would
> +-- flush the pending heartbeat and go to sleep with the new huge
> +-- replication timeout.
> +s = box.schema.create_space('test')
> +pk = s:create_index('pk')
> +s:replace{1}
> +-- Now commit something sync. It should return immediately even
> +-- though the replication timeout is huge.
> +box.space.sync:replace{4}
> +test_run:switch('replica')
> +box.space.sync:select{4}
> +
>   -- Cleanup.
>   test_run:cmd('switch default')
>   
>   box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
>   test_run:cmd('stop server replica')
>   test_run:cmd('delete server replica')
> +box.space.test:drop()
>   box.space.sync:drop()
>   box.schema.user.revoke('guest', 'replication')

-- 
Serge Petrenko


* Re: [Tarantool-patches] [PATCH 1/1] [tosquash] test: add a test for sync heartbeats
  2020-06-25  7:59 ` Serge Petrenko
@ 2020-06-25 20:58   ` Vladislav Shpilevoy
  2020-06-26 10:45     ` Serge Petrenko
  0 siblings, 1 reply; 4+ messages in thread
From: Vladislav Shpilevoy @ 2020-06-25 20:58 UTC (permalink / raw)
  To: Serge Petrenko, tarantool-patches

Hi! Thanks for the review!

> Please see 1 comment below.
> 
> 24.06.2020 01:39, Vladislav Shpilevoy wrote:
>> Should be squashed into the commit closing 5100.
>> ---
>> Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
>> Issue: https://github.com/tarantool/tarantool/issues/4842
>>
>>   .../sync_replication_sanity.result            | 50 +++++++++++++++++++
>>   .../sync_replication_sanity.test.lua          | 22 ++++++++
>>   2 files changed, 72 insertions(+)
>>
>> diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
>> index 4b9823d77..a0591dcf3 100644
>> --- a/test/replication/sync_replication_sanity.result
>> +++ b/test/replication/sync_replication_sanity.result
>> @@ -178,6 +178,53 @@ box.space.sync:select{}
>>    |   - [3]
>>    | ...
>>
>> +--
>> +-- gh-5100: replica should send ACKs for sync transactions after
>> +-- WAL write immediately, not waiting for replication timeout or
>> +-- a CONFIRM.
>> +--
>> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>> + | ---
>> + | ...
> 
> You should remember previous replication_timeout here and set it back during cleanup.

Oh shit, you are right. I looked at how you restore replication_synchro_timeout
on line 98 and for some reason decided that the normal timeout is also restored.

I also found that instead of configuring the master and the replica I configured
the replica twice, because I thought that the previous test ends in the
'default' instance. Fixed this too.

Force pushed to this commit.

====================
diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
index a0591dcf3..8b37ba6f5 100644
--- a/test/replication/sync_replication_sanity.result
+++ b/test/replication/sync_replication_sanity.result
@@ -90,10 +90,10 @@ box.schema.user.grant('guest', 'replication')
  | ---
  | ...
 -- Set up synchronous replication options.
-quorum = box.cfg.replication_synchro_quorum
+old_synchro_quorum = box.cfg.replication_synchro_quorum
  | ---
  | ...
-timeout = box.cfg.replication_synchro_timeout
+old_synchro_timeout = box.cfg.replication_synchro_timeout
  | ---
  | ...
 box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.1}
@@ -186,16 +186,15 @@ box.space.sync:select{}
 box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
  | ---
  | ...
-test_run:switch('replica')
+test_run:switch('default')
  | ---
  | - true
  | ...
-box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
+old_timeout = box.cfg.replication_timeout
  | ---
  | ...
-test_run:switch('default')
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
  | ---
- | - true
  | ...
 -- Commit something non-sync. So as applier writer fiber would
 -- flush the pending heartbeat and go to sleep with the new huge
@@ -231,7 +230,11 @@ test_run:cmd('switch default')
  | - true
  | ...
 
-box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
+box.cfg{                                                                        \
+    replication_synchro_quorum = old_synchro_quorum,                            \
+    replication_synchro_timeout = old_synchro_timeout,                          \
+    replication_timeout = old_timeout,                                          \
+}
  | ---
  | ...
 test_run:cmd('stop server replica')
diff --git a/test/replication/sync_replication_sanity.test.lua b/test/replication/sync_replication_sanity.test.lua
index f769804ca..b0326fd4b 100644
--- a/test/replication/sync_replication_sanity.test.lua
+++ b/test/replication/sync_replication_sanity.test.lua
@@ -38,8 +38,8 @@ engine = test_run:get_cfg('engine')
 
 box.schema.user.grant('guest', 'replication')
 -- Set up synchronous replication options.
-quorum = box.cfg.replication_synchro_quorum
-timeout = box.cfg.replication_synchro_timeout
+old_synchro_quorum = box.cfg.replication_synchro_quorum
+old_synchro_timeout = box.cfg.replication_synchro_timeout
 box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.1}
 
 test_run:cmd('create server replica with rpl_master=default,\
@@ -77,9 +77,9 @@ box.space.sync:select{}
 -- a CONFIRM.
 --
 box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
-test_run:switch('replica')
-box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
 test_run:switch('default')
+old_timeout = box.cfg.replication_timeout
+box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
 -- Commit something non-sync. So as applier writer fiber would
 -- flush the pending heartbeat and go to sleep with the new huge
 -- replication timeout.
@@ -95,7 +95,11 @@ box.space.sync:select{4}
 -- Cleanup.
 test_run:cmd('switch default')
 
-box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
+box.cfg{                                                                        \
+    replication_synchro_quorum = old_synchro_quorum,                            \
+    replication_synchro_timeout = old_synchro_timeout,                          \
+    replication_timeout = old_timeout,                                          \
+}
 test_run:cmd('stop server replica')
 test_run:cmd('delete server replica')
 box.space.test:drop()


* Re: [Tarantool-patches] [PATCH 1/1] [tosquash] test: add a test for sync heartbeats
  2020-06-25 20:58   ` Vladislav Shpilevoy
@ 2020-06-26 10:45     ` Serge Petrenko
  0 siblings, 0 replies; 4+ messages in thread
From: Serge Petrenko @ 2020-06-26 10:45 UTC (permalink / raw)
  To: Vladislav Shpilevoy, tarantool-patches


25.06.2020 23:58, Vladislav Shpilevoy wrote:
> Hi! Thanks for the review!
>
>> Please see 1 comment below.
>>
>> 24.06.2020 01:39, Vladislav Shpilevoy wrote:
>>> Should be squashed into the commit closing 5100.
>>> ---
>>> Branch: http://github.com/tarantool/tarantool/tree/gh-4842-sync-replication
>>> Issue: https://github.com/tarantool/tarantool/issues/4842
>>>
>>>    .../sync_replication_sanity.result            | 50 +++++++++++++++++++
>>>    .../sync_replication_sanity.test.lua          | 22 ++++++++
>>>    2 files changed, 72 insertions(+)
>>>
>>> diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
>>> index 4b9823d77..a0591dcf3 100644
>>> --- a/test/replication/sync_replication_sanity.result
>>> +++ b/test/replication/sync_replication_sanity.result
>>> @@ -178,6 +178,53 @@ box.space.sync:select{}
>>>     |   - [3]
>>>     | ...
>>>
>>> +--
>>> +-- gh-5100: replica should send ACKs for sync transactions after
>>> +-- WAL write immediately, not waiting for replication timeout or
>>> +-- a CONFIRM.
>>> +--
>>> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>>> + | ---
>>> + | ...
>> You should remember previous replication_timeout here and set it back during cleanup.
> Oh shit, you are right. I looked at how you restore replication_synchro_timeout
> on line 98 and for some reason decided that the normal timeout is also restored.
>
> I also found that instead of configuring the master and the replica I configured
> the replica twice, because I thought that the previous test ends in the
> 'default' instance. Fixed this too.

LGTM

>
> Force pushed to this commit.
>
> ====================
> diff --git a/test/replication/sync_replication_sanity.result b/test/replication/sync_replication_sanity.result
> index a0591dcf3..8b37ba6f5 100644
> --- a/test/replication/sync_replication_sanity.result
> +++ b/test/replication/sync_replication_sanity.result
> @@ -90,10 +90,10 @@ box.schema.user.grant('guest', 'replication')
>    | ---
>    | ...
>   -- Set up synchronous replication options.
> -quorum = box.cfg.replication_synchro_quorum
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
>    | ---
>    | ...
> -timeout = box.cfg.replication_synchro_timeout
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
>    | ---
>    | ...
>   box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.1}
> @@ -186,16 +186,15 @@ box.space.sync:select{}
>   box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>    | ---
>    | ...
> -test_run:switch('replica')
> +test_run:switch('default')
>    | ---
>    | - true
>    | ...
> -box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> +old_timeout = box.cfg.replication_timeout
>    | ---
>    | ...
> -test_run:switch('default')
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>    | ---
> - | - true
>    | ...
>   -- Commit something non-sync. So as applier writer fiber would
>   -- flush the pending heartbeat and go to sleep with the new huge
> @@ -231,7 +230,11 @@ test_run:cmd('switch default')
>    | - true
>    | ...
>   
> -box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
> +box.cfg{                                                                        \
> +    replication_synchro_quorum = old_synchro_quorum,                            \
> +    replication_synchro_timeout = old_synchro_timeout,                          \
> +    replication_timeout = old_timeout,                                          \
> +}
>    | ---
>    | ...
>   test_run:cmd('stop server replica')
> diff --git a/test/replication/sync_replication_sanity.test.lua b/test/replication/sync_replication_sanity.test.lua
> index f769804ca..b0326fd4b 100644
> --- a/test/replication/sync_replication_sanity.test.lua
> +++ b/test/replication/sync_replication_sanity.test.lua
> @@ -38,8 +38,8 @@ engine = test_run:get_cfg('engine')
>   
>   box.schema.user.grant('guest', 'replication')
>   -- Set up synchronous replication options.
> -quorum = box.cfg.replication_synchro_quorum
> -timeout = box.cfg.replication_synchro_timeout
> +old_synchro_quorum = box.cfg.replication_synchro_quorum
> +old_synchro_timeout = box.cfg.replication_synchro_timeout
>   box.cfg{replication_synchro_quorum=2, replication_synchro_timeout=0.1}
>   
>   test_run:cmd('create server replica with rpl_master=default,\
> @@ -77,9 +77,9 @@ box.space.sync:select{}
>   -- a CONFIRM.
>   --
>   box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
> -test_run:switch('replica')
> -box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>   test_run:switch('default')
> +old_timeout = box.cfg.replication_timeout
> +box.cfg{replication_timeout = 1000, replication_synchro_timeout = 1000}
>   -- Commit something non-sync. So as applier writer fiber would
>   -- flush the pending heartbeat and go to sleep with the new huge
>   -- replication timeout.
> @@ -95,7 +95,11 @@ box.space.sync:select{4}
>   -- Cleanup.
>   test_run:cmd('switch default')
>   
> -box.cfg{replication_synchro_quorum=quorum, replication_synchro_timeout=timeout}
> +box.cfg{                                                                        \
> +    replication_synchro_quorum = old_synchro_quorum,                            \
> +    replication_synchro_timeout = old_synchro_timeout,                          \
> +    replication_timeout = old_timeout,                                          \
> +}
>   test_run:cmd('stop server replica')
>   test_run:cmd('delete server replica')
>   box.space.test:drop()

-- 
Serge Petrenko

