Tarantool development patches archive
From: "Alexander V. Tikhonov" <avtikhon@tarantool.org>
To: Kirill Yukhin <kyukhin@tarantool.org>,
	Alexander Turenko <alexander.turenko@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH v1] test: fix issue on first replica in drop_cluster()
Date: Sun, 16 Aug 2020 23:01:34 +0300
Message-ID: <27c2f93a6602f14b882484b4f0c7ec4b8748c371.1597607968.git.avtikhon@tarantool.org>

The test replication/box_set_replication_stress.test.lua failed
intermittently in the drop_cluster() routine, for example:

  --- replication/box_set_replication_stress.result	Fri Aug 14 18:28:41 2020
  +++ var/004_replication/box_set_replication_stress.result	Sat Aug 15 15:19:44 2020
  @@ -34,5 +34,3 @@

   -- Cleanup.
   test_run:drop_cluster(SERVERS)
  - | ---
  - | ...

The drop_cluster() routine from the test-run repository failed in the
stop() routine of the lib/tarantool_server.py:TarantoolServer class.
It could not stop the first replica, which the test uses to switch
replication on and off 1000 times. This happened because stop() sends
SIGTERM by default, and in some situations SIGTERM could not terminate
the first replica. It happened when both replica processes were alive
and trying to read and write data on their sockets, but the sockets of
the first replica were already unreachable while the second replica was
still alive. In this situation SIGTERM was not enough to stop the first
replica, and test-run hung in wait_stop() of the
lib/tarantool_server.py:TarantoolServer class until it aborted the test
on its general timeout of 2 minutes.

The only way found to fix the issue was to use SIGKILL instead of
SIGTERM, so that the process is killed immediately without waiting for
its sockets to close. SIGKILL could be made the default in the
drop_cluster() routine, but it seems that change would make it harder
to detect other issues in other tests. So it was decided to use SIGKILL
only in this test, as an additional option of the "stop server"
test-run command.

Closes #5244
---

Github: https://github.com/tarantool/tarantool/tree/avtikhon/gh-5244-replication-box-stress-drop-replica
Issue: https://github.com/tarantool/tarantool/issues/5244
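
For illustration only, the difference between the two signals can be
sketched outside of test-run. This is a minimal POSIX-only sketch, not
the code from lib/tarantool_server.py: the child process, the
spawn_stubborn_child() and stop() helpers and the timeouts are all
hypothetical. The child simply ignores SIGTERM as a stand-in for a
replica that does not exit on SIGTERM; the real replica got stuck for a
different reason (its sockets), which is not reproduced here.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a replica that does not exit on SIGTERM:
# the child ignores SIGTERM, reports readiness, then sleeps.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def spawn_stubborn_child():
    """Start the child and wait until it has installed its SIGTERM handler."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # blocks until the child prints 'ready'
    return proc


def stop(proc, sig=signal.SIGTERM, timeout=1.0):
    """Send `sig` and wait up to `timeout` seconds.

    Returns True if the process exited within the timeout, False otherwise.
    """
    proc.send_signal(sig)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        return False


def demo():
    """Try SIGTERM first, then SIGKILL; return both results and the exit code."""
    proc = spawn_stubborn_child()
    stopped_by_term = stop(proc, signal.SIGTERM, timeout=0.5)
    stopped_by_kill = stop(proc, signal.SIGKILL, timeout=5.0)
    proc.stdout.close()
    return stopped_by_term, stopped_by_kill, proc.returncode
```

In this sketch the SIGTERM attempt times out while the SIGKILL attempt
succeeds, because SIGKILL cannot be caught or ignored by the target
process. That is also why the patch limits SIGKILL to this one test: a
server killed this way gets no chance to shut down cleanly.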

 .../replication/box_set_replication_stress.result | 15 ++++++++++++++-
 .../box_set_replication_stress.test.lua           |  5 ++++-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/test/replication/box_set_replication_stress.result b/test/replication/box_set_replication_stress.result
index e683c0643..225f33ecb 100644
--- a/test/replication/box_set_replication_stress.result
+++ b/test/replication/box_set_replication_stress.result
@@ -33,6 +33,19 @@ test_run:cmd("switch default")
  | ...
 
 -- Cleanup.
-test_run:drop_cluster(SERVERS)
+test_run:cmd('stop server master_quorum1 with signal=SIGKILL')
  | ---
+ | - true
+ | ...
+test_run:cmd('delete server master_quorum1')
+ | ---
+ | - true
+ | ...
+test_run:cmd('stop server master_quorum2 with signal=SIGKILL')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server master_quorum2')
+ | ---
+ | - true
  | ...
diff --git a/test/replication/box_set_replication_stress.test.lua b/test/replication/box_set_replication_stress.test.lua
index 407e91e0f..88652b0b4 100644
--- a/test/replication/box_set_replication_stress.test.lua
+++ b/test/replication/box_set_replication_stress.test.lua
@@ -14,4 +14,7 @@ end
 test_run:cmd("switch default")
 
 -- Cleanup.
-test_run:drop_cluster(SERVERS)
+test_run:cmd('stop server master_quorum1 with signal=SIGKILL')
+test_run:cmd('delete server master_quorum1')
+test_run:cmd('stop server master_quorum2 with signal=SIGKILL')
+test_run:cmd('delete server master_quorum2')
-- 
2.17.1
