[Tarantool-patches] [PATCH v2 12/16] raft: move synchro queue clear to the worker fiber

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Nov 20 02:46:00 MSK 2020


The synchro queue was cleared from the Raft on_update trigger
installed by box. It was fine as long as the trigger is called
from the worker fiber, because it won't block the state machine,
while the synchro queue clearance yields.

But the trigger is going to be called from the Raft state machine
directly soon. Because it will need to call
box_update_ro_summary() right after Raft state is updated, without
a yield to switch to the worker fiber. This will be done in scope
of getting rid of box in the Raft library.

It means, the trigger can't call box_clear_synchro_queue(). But it
can schedule its execution for later, since the worker fiber now
belongs to box. The patch does it.

Part of #5303
---
 src/box/raft.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/box/raft.c b/src/box/raft.c
index 0027230da..5ccfb3449 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -80,6 +80,19 @@ box_raft_request_to_msg(const struct raft_request *req, struct raft_msg *msg)
 	};
 }
 
+static void
+box_raft_update_synchro_queue(struct raft *raft)
+{
+	assert(raft == box_raft());
+	/*
+	 * When the node became a leader, it means it will ignore all records
+	 * from all the other nodes, and won't get late CONFIRM messages anyway.
+	 * Can clear the queue without waiting for confirmations.
+	 */
+	if (raft->state == RAFT_STATE_LEADER)
+		box_clear_synchro_queue(false);
+}
+
 static int
 box_raft_worker_f(va_list args)
 {
@@ -90,6 +103,7 @@ box_raft_worker_f(va_list args)
 		box_raft_has_work = false;
 
 		raft_process_async(raft);
+		box_raft_update_synchro_queue(raft);
 
 		if (!box_raft_has_work)
 			fiber_yield();
@@ -142,11 +156,11 @@ box_raft_on_update_f(struct trigger *trigger, void *event)
 	if (raft->state != RAFT_STATE_LEADER)
 		return 0;
 	/*
-	 * When the node became a leader, it means it will ignore all records
-	 * from all the other nodes, and won't get late CONFIRM messages anyway.
-	 * Can clear the queue without waiting for confirmations.
+	 * If the node became a leader, time to clear the synchro queue. But it
+	 * must be done in the worker fiber so as not to block the state
+	 * machine, which called this trigger.
 	 */
-	box_clear_synchro_queue(false);
+	box_raft_schedule_async(raft);
 	return 0;
 }
 
-- 
2.24.3 (Apple Git-128)



More information about the Tarantool-patches mailing list