[Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart
Alexander V. Tikhonov
avtikhon at tarantool.org
Thu Oct 22 11:55:29 MSK 2020
Hi Vlad, thanks for the patchset. After you removed one of the patches
from it, testing began to pass [1]. No new degradations found. Patchset
LGTM.
[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/206144990
On Sat, Oct 17, 2020 at 07:17:54PM +0200, Vladislav Shpilevoy wrote:
> There were 2 issues with the relay restarting the recovery cursor when the
> node is elected as a leader. They are fixed in the last 2 commits. The first
> was about the local LSN not being set, the second about GC not being
> propagated.
>
> The first patch is not directly related to the bugs above. It was just found
> while working on this. In theory, without the first patch we can get flakiness
> in the tests changed in this commit, but only if a replication connection
> breaks without a reason.
>
> Additionally, the new test - gh-5433-election-restart-recovery - hangs on my
> machine when I run tens of instances of it. All workers, after executing it
> several times, hang. But!!! not in anything related to Raft - they hang in the
> first box.snapshot(), where the election is not even enabled yet. From some
> debug prints I see it hangs somewhere in engine_begin_checkpoint(), and
> consumes ~80% of the CPU. But it may be just a consequence of corrupted memory
> on Mac, due to libeio being broken. I don't know what to do with that now.
>
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5433-raft-leader-recovery-restart
> Issue: https://github.com/tarantool/tarantool/issues/5433
>
> Vladislav Shpilevoy (3):
>   raft: send state to new subscribers if Raft worked
>   raft: use local LSN in relay recovery restart
>   raft: don't drop GC when restart relay recovery
>
>  src/box/box.cc                                |  14 +-
>  src/box/raft.h                                |  10 +
>  src/box/relay.cc                              |  22 ++-
>  .../gh-5426-election-on-off.result            |  59 ++++--
>  .../gh-5426-election-on-off.test.lua          |  26 ++-
>  .../gh-5433-election-restart-recovery.result  | 174 ++++++++++++++++++
>  ...gh-5433-election-restart-recovery.test.lua |  87 +++++++++
>  test/replication/suite.cfg                    |   1 +
>  8 files changed, 367 insertions(+), 26 deletions(-)
>  create mode 100644 test/replication/gh-5433-election-restart-recovery.result
>  create mode 100644 test/replication/gh-5433-election-restart-recovery.test.lua
>
> --
> 2.21.1 (Apple Git-122.3)
>