[Tarantool-patches] [PATCH 10/12] raft: move box_update_ro_summary to update trigger

Serge Petrenko sergepetrenko at tarantool.org
Tue Nov 17 18:17:58 MSK 2020


17.11.2020 15:42, Serge Petrenko wrote:
>
> 17.11.2020 03:02, Vladislav Shpilevoy wrote:
>> box_update_ro_summary() was called from the basic Raft library,
>> making it depend on box. But it was called every time when Raft
>> state was changed and broadcasted. It means the same effect can be
>> achieved by updating RO summary from Raft state update trigger.
>>
>> The patch does it, and now Raft code does not depend on box.h.
>>
>> Part of #5303
>> ---
>>   src/box/raft.c    | 2 ++
>>   src/box/raftlib.c | 8 --------
>>   2 files changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/box/raft.c b/src/box/raft.c
>> index f3652bbcb..db1a3f423 100644
>> --- a/src/box/raft.c
>> +++ b/src/box/raft.c
>> @@ -77,6 +77,8 @@ box_raft_on_update_f(struct trigger *trigger, void *event)
>>       (void)trigger;
>>       struct raft *raft = (struct raft *)event;
>>       assert(raft == box_raft());
>> +    /* State or enablence could be changed, affecting read-only state. */
>> +    box_update_ro_summary();
>>       if (raft->state != RAFT_STATE_LEADER)
>>           return 0;
>>       /*
>
>
> Raft uses synchronous WAL write, correct?
>
> So there's a yield in raft_worker_handle_io(). Now there's a period of
> time when an instance is a follower, but it isn't read-only.
>
> When you reconfigure a leader to become a voter, everything's fine,
> since no writing to disk is involved.
>
> However, if an existing leader receives a message with a term greater
> than its own, it'll first persist this term, and thus yield, and only
> later proceed to broadcast and switch to ro.
>
> So now it's possible that a follower is writable for some period of
> time.
>
> Maybe run on_update triggers before the WAL write? Even better, run
> the triggers on the actual state transition, after each
> `raft->state = ...`.
>
> On the bright side, your patch seems to fix
> https://github.com/tarantool/tarantool/issues/5440


I've addressed the issue in a patch based on your branch,
"[PATCH] raft: execute triggers exactly on state change"

Take a look, please.
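
To illustrate the idea, here is a minimal sketch of running the trigger at the moment of the state change instead of after the WAL write. It is a toy model and not the actual patch: every `toy_*` name is made up for illustration, `is_ro` merely stands in for the box RO summary, and the fake WAL write only records whether the instance was still writable at the point where a real fiber would yield.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model. All toy_* names are hypothetical, not Tarantool's API. */
enum toy_state { TOY_FOLLOWER, TOY_CANDIDATE, TOY_LEADER };

struct toy_raft;
typedef void (*toy_on_update_f)(struct toy_raft *raft);

struct toy_raft {
	enum toy_state state;
	unsigned long term;
	/* Stands in for the box read-only summary. */
	bool is_ro;
	/* Stands in for the on_update trigger. */
	toy_on_update_f on_update;
	/* Set by the fake WAL write if the node was writable then. */
	bool writable_during_wal;
};

/* Trigger body: derive the read-only flag from the current state. */
static void
toy_update_ro(struct toy_raft *raft)
{
	raft->is_ro = raft->state != TOY_LEADER;
}

/* The only place where the state changes; the trigger fires at once. */
static void
toy_set_state(struct toy_raft *raft, enum toy_state new_state)
{
	if (raft->state == new_state)
		return;
	raft->state = new_state;
	if (raft->on_update != NULL)
		raft->on_update(raft);
}

/* Fake synchronous WAL write: the point where a fiber would yield. */
static void
toy_wal_write(struct toy_raft *raft)
{
	raft->writable_during_wal = !raft->is_ro;
}

/* A leader sees a greater term: step down first, persist afterwards. */
static void
toy_new_term(struct toy_raft *raft, unsigned long new_term)
{
	raft->term = new_term;
	toy_set_state(raft, TOY_FOLLOWER);
	toy_wal_write(raft);
}

/* Returns true if the node was still writable during the WAL write. */
static bool
toy_demo(void)
{
	struct toy_raft raft = {
		.state = TOY_LEADER,
		.term = 1,
		.is_ro = false,
		.on_update = toy_update_ro,
		.writable_during_wal = false,
	};
	toy_new_term(&raft, 2);
	return raft.writable_during_wal;
}
```

In this model `toy_demo()` returns false: the read-only flag is already set when the fake WAL write starts, so there is no window in which a follower accepts writes. If the trigger ran only after `toy_wal_write()`, the flag would still be unset during the write.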

>
>> diff --git a/src/box/raftlib.c b/src/box/raftlib.c
>> index 512dbd51f..2e09d5405 100644
>> --- a/src/box/raftlib.c
>> +++ b/src/box/raftlib.c
>> @@ -33,7 +33,6 @@
>>   #include "error.h"
>>   #include "fiber.h"
>>   #include "small/region.h"
>> -#include "box.h"
>>   #include "tt_static.h"
>>     /**
>> @@ -603,8 +602,6 @@ raft_sm_become_leader(struct raft *raft)
>>       raft->state = RAFT_STATE_LEADER;
>>       raft->leader = raft->self;
>>       ev_timer_stop(loop(), &raft->timer);
>> -    /* Make read-write (if other subsystems allow that. */
>> -    box_update_ro_summary();
>>       /* State is visible and it is changed - broadcast. */
>>       raft_schedule_broadcast(raft);
>>   }
>> @@ -655,7 +652,6 @@ raft_sm_schedule_new_term(struct raft *raft, uint64_t new_term)
>>       raft->volatile_vote = 0;
>>       raft->leader = 0;
>>       raft->state = RAFT_STATE_FOLLOWER;
>> -    box_update_ro_summary();
>>       raft_sm_pause_and_dump(raft);
>>       /*
>>        * State is visible and it is changed - broadcast. Term is also visible,
>> @@ -686,7 +682,6 @@ raft_sm_schedule_new_election(struct raft *raft)
>>       /* Everyone is a follower until its vote for self is persisted. */
>>       raft_sm_schedule_new_term(raft, raft->term + 1);
>>       raft_sm_schedule_new_vote(raft, raft->self);
>> -    box_update_ro_summary();
>>   }
>>     static void
>> @@ -771,7 +766,6 @@ raft_sm_start(struct raft *raft)
>>            */
>>           raft_sm_wait_leader_found(raft);
>>       }
>> -    box_update_ro_summary();
>>       /*
>>        * Nothing changed. But when raft was stopped, its state wasn't sent to
>>        * replicas. At least this was happening at the moment of this being
>> @@ -793,7 +787,6 @@ raft_sm_stop(struct raft *raft)
>>           raft->leader = 0;
>>       raft->state = RAFT_STATE_FOLLOWER;
>>       ev_timer_stop(loop(), &raft->timer);
>> -    box_update_ro_summary();
>>       /* State is visible and changed - broadcast. */
>>       raft_schedule_broadcast(raft);
>>   }
>> @@ -879,7 +872,6 @@ raft_cfg_is_candidate(struct raft *raft, bool is_candidate)
>>               raft_schedule_broadcast(raft);
>>           }
>>       }
>> -    box_update_ro_summary();
>>   }
>>     void
>
-- 
Serge Petrenko


