[Tarantool-patches] [PATCH v1] tarantoolctl: fix pid file removal

Alexander V. Tikhonov avtikhon at tarantool.org
Thu Nov 5 17:20:10 MSK 2020


The stop() routine did not remove the pid file when the instance was
killed successfully. This broke restarting an instance in the same
temporary workdir of a test: the restarted instance had the same name,
so start_check() found the old pid file and refused to start it. To
fix this, the pid file is now removed in all situations, just before
the signal is sent to the instance process.

Resolved the issue [1]:

  [096] replication/ddl.test.lua                        memtx
  [096]
  [096] [Instance "ddl2" returns with non-zero exit code: 1]
  [096]
  [096] Last 15 lines of Tarantool Log file [Instance "ddl2"][/tmp/tnt/096_replication/ddl2.log]:
  ...
  [096] 2020-11-05 13:56:59.838 [10538] main/103/ddl2 I> bootstrapping replica from f4f59bcd-54bb-4308-a43c-c8ede1c84701 at unix/:/private/tmp/tnt/096_replication/autobootstrap4.sock
  [096] 2020-11-05 13:56:59.838 [10538] main/115/applier/cluster at unix/:/private/tmp/tnt/096_replication/autobootstrap4.sock I> can't read row
  [096] 2020-11-05 13:56:59.838 [10538] main/115/applier/cluster at unix/:/private/tmp/tnt/096_replication/autobootstrap4.sock box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [096] 2020-11-05 13:56:59.838 [10538] main/103/ddl2 box.cc:183 E> ER_READONLY: Can't modify data because this instance is in read-only mode.
  [096] 2020-11-05 13:56:59.838 [10538] main/103/ddl2 F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [096] 2020-11-05 13:56:59.838 [10538] main/103/ddl2 F> can't initialize storage: Can't modify data because this instance is in read-only mode.
  [096] [ fail ]
  [096] Test "replication/ddl.test.lua", conf: "memtx"
  [096] 	from "fragile" list failed with results file checksum: "a006d40205b9a67ddbbb8206b4e1764c", rerunning with server restart ...
  [096] replication/ddl.test.lua                        memtx           [ fail ]
  [096] Test "replication/ddl.test.lua", conf: "memtx"
  [096] 	from "fragile" list failed with results file checksum: "a3962e843889def7f61d6f1f71461bf1", rerunning with server restart ...
  [096] replication/ddl.test.lua                        memtx           [ fail ]
  ...
  [096] Worker "096_replication" got failed test; restarted the server
  [096] replication/ddl.test.lua                        vinyl
  [096]
  [096] [Instance "ddl1" returns with non-zero exit code: 1]
  [096]
  [096] Last 15 lines of Tarantool Log file [Instance "ddl1"][/tmp/tnt/096_replication/ddl1.log]:
  [096] Stopping instance ddl1...
  [096] Starting instance ddl1...
  [096] The daemon is already running: PID 10536
  [096] Stopping instance ddl1...
  [096] Starting instance ddl1...
  [096] The daemon is already running: PID 10536
  ...

  [1] - https://gitlab.com/tarantool/tarantool/-/jobs/831873727#L4683
---

Github: https://github.com/tarantool/tarantool/tree/avtikhon/tarantoolctl-pid-file
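
For reviewers, here is a minimal sketch of the failure mode described in
the commit message. It is a simplified model in plain Lua, not the
tarantoolctl code: the helper names write_pid() and read_pid(), the
send_signal callback (standing in for ffi.C.kill(pid, 15)) and the
unlink_first flag are hypothetical, and start_check() is assumed to
treat any readable pid file as a running daemon, which may be stricter
than the real routine.

```lua
-- Simplified model of the pid file handling; names are hypothetical.
local pid_file = os.tmpname()

local function write_pid(pid)
    local f = assert(io.open(pid_file, "w"))
    f:write(tostring(pid))
    f:close()
end

-- Returns the pid stored in the pid file, or nil if there is none.
local function read_pid()
    local f = io.open(pid_file, "r")
    if f == nil then
        return nil
    end
    local pid = tonumber(f:read("*l") or "")
    f:close()
    return pid
end

-- Assumption: a leftover pid file is enough to block the start.
local function start_check()
    local pid = read_pid()
    if pid ~= nil then
        return false, "The daemon is already running: PID " .. pid
    end
    return true
end

-- unlink_first = false models the old stop(): the pid file is removed
-- only when the signal could not be delivered.
-- unlink_first = true models the patched stop(): the pid file is
-- removed before the signal is sent, whatever the outcome.
local function stop(send_signal, unlink_first)
    local pid = read_pid()
    if pid == nil then
        return 1
    end
    if unlink_first then
        os.remove(pid_file)
    end
    local ok = send_signal(pid)
    if not ok and not unlink_first then
        os.remove(pid_file)
    end
    return ok and 0 or 1
end

-- Stand-in for a kill() call that succeeds.
local function sigterm_ok(_pid)
    return true
end

-- Old behaviour: the kill succeeds, the pid file stays, restart is refused.
write_pid(10536)
stop(sigterm_ok, false)
local old_ok = start_check()

-- Patched behaviour: the pid file is gone, restart is allowed.
write_pid(10536)
stop(sigterm_ok, true)
local new_ok = start_check()

os.remove(pid_file)
```

In this model old_ok is false and new_ok is true, which mirrors the
"The daemon is already running" lines in the log above.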

 extra/dist/tarantoolctl.in | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/extra/dist/tarantoolctl.in b/extra/dist/tarantoolctl.in
index 0726e7f46..acdb613fa 100755
--- a/extra/dist/tarantoolctl.in
+++ b/extra/dist/tarantoolctl.in
@@ -595,12 +595,11 @@ local function stop()
         return 1
     end
 
+    fio.unlink(pid_file)
     if ffi.C.kill(pid, 15) < 0 then
         log.error("Can't kill process %d: %s", pid, errno.strerror())
-        fio.unlink(pid_file)
         return 1
     end
-
     return 0
 end
 
-- 
2.25.1


