[Tarantool-patches] [PATCH 10/12] raft: move box_update_ro_summary to update trigger

Serge Petrenko sergepetrenko at tarantool.org
Tue Nov 17 15:42:27 MSK 2020


17.11.2020 03:02, Vladislav Shpilevoy пишет:
> box_update_ro_summary() was called from the basic Raft library,
> making it depend on box. But it was called every time when Raft
> state was changed and broadcasted. It means the same effect can be
> achieved by updating RO summary from Raft state update trigger.
>
> The patch does it, and now Raft code does not depend on box.h.
>
> Part of #5303
> ---
>   src/box/raft.c    | 2 ++
>   src/box/raftlib.c | 8 --------
>   2 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/src/box/raft.c b/src/box/raft.c
> index f3652bbcb..db1a3f423 100644
> --- a/src/box/raft.c
> +++ b/src/box/raft.c
> @@ -77,6 +77,8 @@ box_raft_on_update_f(struct trigger *trigger, void *event)
>   	(void)trigger;
>   	struct raft *raft = (struct raft *)event;
>   	assert(raft == box_raft());
> +	/* State or enablence could be changed, affecting read-only state. */
> +	box_update_ro_summary();
>   	if (raft->state != RAFT_STATE_LEADER)
>   		return 0;
>   	/*


Raft uses synchronous WAL write, corect?

So there's a yield in raft_worker_handle_io(). Now there's a period of 
time when
an instance is a follower, but it isn't read-only.

When you reconfigure a leader to become voter, everything's fine, since no
writing to disk is involved.

However, if an existing leader receives a message with term greater, 
than its own,
it'll first persist this term, and thus yield, and proceed to broadcast 
and switching
to ro later.

So now it's possible that a follower is writeable for some period of time.

Maybe run on_update triggers before the wal write? Even better, run the 
triggers
on the actual state transition. After each  `raft->state = ...`.

On the bright side, your patch seems to fix
https://github.com/tarantool/tarantool/issues/5440

> diff --git a/src/box/raftlib.c b/src/box/raftlib.c
> index 512dbd51f..2e09d5405 100644
> --- a/src/box/raftlib.c
> +++ b/src/box/raftlib.c
> @@ -33,7 +33,6 @@
>   #include "error.h"
>   #include "fiber.h"
>   #include "small/region.h"
> -#include "box.h"
>   #include "tt_static.h"
>   
>   /**
> @@ -603,8 +602,6 @@ raft_sm_become_leader(struct raft *raft)
>   	raft->state = RAFT_STATE_LEADER;
>   	raft->leader = raft->self;
>   	ev_timer_stop(loop(), &raft->timer);
> -	/* Make read-write (if other subsystems allow that. */
> -	box_update_ro_summary();
>   	/* State is visible and it is changed - broadcast. */
>   	raft_schedule_broadcast(raft);
>   }
> @@ -655,7 +652,6 @@ raft_sm_schedule_new_term(struct raft *raft, uint64_t new_term)
>   	raft->volatile_vote = 0;
>   	raft->leader = 0;
>   	raft->state = RAFT_STATE_FOLLOWER;
> -	box_update_ro_summary();
>   	raft_sm_pause_and_dump(raft);
>   	/*
>   	 * State is visible and it is changed - broadcast. Term is also visible,
> @@ -686,7 +682,6 @@ raft_sm_schedule_new_election(struct raft *raft)
>   	/* Everyone is a follower until its vote for self is persisted. */
>   	raft_sm_schedule_new_term(raft, raft->term + 1);
>   	raft_sm_schedule_new_vote(raft, raft->self);
> -	box_update_ro_summary();
>   }
>   
>   static void
> @@ -771,7 +766,6 @@ raft_sm_start(struct raft *raft)
>   		 */
>   		raft_sm_wait_leader_found(raft);
>   	}
> -	box_update_ro_summary();
>   	/*
>   	 * Nothing changed. But when raft was stopped, its state wasn't sent to
>   	 * replicas. At least this was happening at the moment of this being
> @@ -793,7 +787,6 @@ raft_sm_stop(struct raft *raft)
>   		raft->leader = 0;
>   	raft->state = RAFT_STATE_FOLLOWER;
>   	ev_timer_stop(loop(), &raft->timer);
> -	box_update_ro_summary();
>   	/* State is visible and changed - broadcast. */
>   	raft_schedule_broadcast(raft);
>   }
> @@ -879,7 +872,6 @@ raft_cfg_is_candidate(struct raft *raft, bool is_candidate)
>   			raft_schedule_broadcast(raft);
>   		}
>   	}
> -	box_update_ro_summary();
>   }
>   
>   void

-- 
Serge Petrenko



More information about the Tarantool-patches mailing list