Tarantool development patches archive
* [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart
@ 2020-10-17 17:17 Vladislav Shpilevoy
  2020-10-17 17:17 ` [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked Vladislav Shpilevoy
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Vladislav Shpilevoy @ 2020-10-17 17:17 UTC
  To: tarantool-patches, sergepetrenko

There were 2 issues with how the relay restarts its recovery cursor when the
node is elected as a leader. Both are fixed in the last 2 commits. The first was
that the local LSN was not set, the second that GC was not propagated.
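
To illustrate the first issue, here is a toy sketch in Python. It is not the
Tarantool code. The function name and the dict-based vclock are hypothetical,
and it assumes that component 0 of a vclock holds the local LSN. The idea is
that the position the relay restarts from comes from the replica, which does
not know the master's local LSN, so that component has to be taken from the
master itself.

```python
# Toy model only: a vclock is a dict {replica_id: lsn}.
# Assumption: component 0 holds the LSN of local (non-replicated) changes.
LOCAL_ID = 0


def restart_vclock(replica_vclock, local_vclock):
    """Return the position from which the relay would re-read its own WAL."""
    vclock = dict(replica_vclock)  # do not modify the caller's vclock
    # The replica cannot know the master's local LSN, so take it from
    # the master's own vclock instead of leaving it unset.
    vclock[LOCAL_ID] = local_vclock.get(LOCAL_ID, 0)
    return vclock


master = {0: 7, 1: 120, 2: 45}
replica = {1: 100, 2: 45}
print(restart_vclock(replica, master))
```

Without the assignment to component 0 the restart position in this model would
have no local LSN at all, which is the kind of mismatch the second patch is
about.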

The first patch is not directly related to the bugs above. It was just found
while working on them. In theory, without the first patch the tests changed in
this commit can become flaky, but only if a replication connection breaks
without a reason.

Additionally, the new test, gh-5433-election-restart-recovery, hangs on my
machine when I run tens of instances of it. All workers hang after executing it
several times. But not in anything related to Raft: they hang in the first
box.snapshot(), where the election is not even enabled yet. From some debug
prints I see it hangs somewhere in engine_begin_checkpoint() and consumes
~80% of the CPU. It may be just a consequence of corrupted memory on Mac, due
to libeio being broken. I don't know what to do about that for now.

Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5433-raft-leader-recovery-restart
Issue: https://github.com/tarantool/tarantool/issues/5433

Vladislav Shpilevoy (3):
  raft: send state to new subscribers if Raft worked
  raft: use local LSN in relay recovery restart
  raft: don't drop GC when restart relay recovery

 src/box/box.cc                                |  14 +-
 src/box/raft.h                                |  10 +
 src/box/relay.cc                              |  22 ++-
 .../gh-5426-election-on-off.result            |  59 ++++--
 .../gh-5426-election-on-off.test.lua          |  26 ++-
 .../gh-5433-election-restart-recovery.result  | 174 ++++++++++++++++++
 ...gh-5433-election-restart-recovery.test.lua |  87 +++++++++
 test/replication/suite.cfg                    |   1 +
 8 files changed, 367 insertions(+), 26 deletions(-)
 create mode 100644 test/replication/gh-5433-election-restart-recovery.result
 create mode 100644 test/replication/gh-5433-election-restart-recovery.test.lua

-- 
2.21.1 (Apple Git-122.3)

Thread overview: 12+ messages
2020-10-17 17:17 [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 1/3] raft: send state to new subscribers if Raft worked Vladislav Shpilevoy
2020-10-20 20:43   ` Vladislav Shpilevoy
2020-10-21 11:41     ` Serge Petrenko
2020-10-21 21:41       ` Vladislav Shpilevoy
2020-10-22  8:53         ` Alexander V. Tikhonov
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 2/3] raft: use local LSN in relay recovery restart Vladislav Shpilevoy
2020-10-17 17:17 ` [Tarantool-patches] [PATCH 3/3] raft: don't drop GC when restart relay recovery Vladislav Shpilevoy
2020-10-19  9:36 ` [Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart Serge Petrenko
2020-10-19 20:26   ` Vladislav Shpilevoy
2020-10-20  8:18     ` Serge Petrenko
2020-10-22  8:55 ` Alexander V. Tikhonov
