[Tarantool-patches] [PATCH 0/3] Raft on leader election recovery restart

Alexander V. Tikhonov avtikhon at tarantool.org
Thu Oct 22 11:55:29 MSK 2020


Hi Vlad, thanks for the patchset. After you removed one of the patches
from it, the test began to pass [1]. No new degradations found. Patchset
LGTM.

[1] - https://gitlab.com/tarantool/tarantool/-/pipelines/206144990

On Sat, Oct 17, 2020 at 07:17:54PM +0200, Vladislav Shpilevoy wrote:
> There were 2 issues with the relay restarting the recovery cursor when the
> node is elected as a leader. Both are fixed in the last 2 commits. The first
> was about the local LSN not being set, the second about GC not being
> propagated.
> 
> The first patch is not directly related to the bugs above. It was just found
> while working on this. In theory, without the first patch we can get flakiness
> in the tests changed in this commit, but only if a replication connection
> breaks without a reason.
> 
> Additionally, the new test - gh-5433-election-restart-recovery - hangs on my
> machine when I start tens of instances of it. All workers, after executing it
> several times,
> hang. But!!! not in something related to the raft - they hang in the first
> box.snapshot(), where the election is not even enabled yet. From some debug
> prints I see it hangs somewhere in engine_begin_checkpoint(), and consumes
> ~80% of the CPU. But it may be just a consequence of the corrupted memory on
> Mac, due to libeio being broken. Don't know what to do with that now.
> 
> Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5433-raft-leader-recovery-restart
> Issue: https://github.com/tarantool/tarantool/issues/5433
> 
> Vladislav Shpilevoy (3):
>   raft: send state to new subscribers if Raft worked
>   raft: use local LSN in relay recovery restart
>   raft: don't drop GC when restart relay recovery
> 
>  src/box/box.cc                                |  14 +-
>  src/box/raft.h                                |  10 +
>  src/box/relay.cc                              |  22 ++-
>  .../gh-5426-election-on-off.result            |  59 ++++--
>  .../gh-5426-election-on-off.test.lua          |  26 ++-
>  .../gh-5433-election-restart-recovery.result  | 174 ++++++++++++++++++
>  ...gh-5433-election-restart-recovery.test.lua |  87 +++++++++
>  test/replication/suite.cfg                    |   1 +
>  8 files changed, 367 insertions(+), 26 deletions(-)
>  create mode 100644 test/replication/gh-5433-election-restart-recovery.result
>  create mode 100644 test/replication/gh-5433-election-restart-recovery.test.lua
> 
> -- 
> 2.21.1 (Apple Git-122.3)
> 

