From: Serge Petrenko <sergepetrenko@tarantool.org> To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>, tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH 4/8] test: introduce raft unit tests Date: Fri, 18 Dec 2020 11:17:31 +0300 [thread overview] Message-ID: <51c59e82-8351-8383-552b-d0c6056a324c@tarantool.org> (raw) In-Reply-To: <f48a3aa0-55e2-f4d4-02de-db5c60fe3c96@tarantool.org> 18.12.2020 01:44, Vladislav Shpilevoy пишет: > Hi! Thanks for the review! Thanks for the fixes! LGTM. Consider one comment below. >>> diff --git a/test/unit/raft.c b/test/unit/raft.c >>> new file mode 100644 >>> index 000000000..dfb5f8e43 >>> --- /dev/null >>> +++ b/test/unit/raft.c >>> + >>> +static void >>> +raft_test_vote_skip(void) >>> +{ >>> + raft_start_test(33); >>> + struct raft_node node; >>> + raft_node_create(&node); >>> + >>> + /* Everything is skipped if the term is outdated. */ >> >> 1. Let's also test a case when vote response has greater term than the candidate itself. > Good idea. Below is the diff (without .result file diff, to be > short). > > ==================== > @@ -346,7 +346,7 @@ raft_test_vote(void) > static void > raft_test_vote_skip(void) > { > - raft_start_test(33); > + raft_start_test(37); > struct raft_node node; > raft_node_create(&node); > > @@ -507,6 +507,35 @@ raft_test_vote_skip(void) > > raft_node_cfg_is_candidate(&node, true); > > + /* > + * Vote response with a bigger term must be skipped, but it will bump > + * the term. > + */ > + > + /* Re-create the node so as not to write the vclock each time. */ > + raft_node_destroy(&node); > + raft_node_create(&node); > + > + raft_run_next_event(); > + is(node.raft.state, RAFT_STATE_CANDIDATE, "became candidate"); > + is(node.raft.term, 2, "term is bumped"); > + > + is(raft_node_send_vote_response(&node, > + 3 /* Term. */, > + 1 /* Vote. */, > + 2 /* Source. */ > + ), 0, "message is accepted"); > + > + ok(raft_node_check_full_state(&node, > + RAFT_STATE_CANDIDATE /* State. */, > + 0 /* Leader. */, > + 3 /* Term. */, > + 1 /* Vote. */, > + 3 /* Volatile term. */, > + 1 /* Volatile vote. */, > + "{0: 3}" /* Vclock. */ > + ), "term is bumped and became candidate"); > + > raft_node_destroy(&node); > raft_finish_test(); > } > ==================== I'd also set the quorum to 2 to make sure the node doesn't become leader in the old term and that it doesn't count votes from a bigger term, but maybe that's an overkill for this test. So, up to you. >>> + >>> + raft_run_next_event(); >>> + is(node.raft.state, RAFT_STATE_CANDIDATE, "became candidate"); >>> + is(node.raft.term, 2, "term is bumped"); >>> + >>> + is(raft_node_send_vote_response(&node, >>> + 1 /* Term. */, >>> + 1 /* Vote. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote_count, 1, "but ignored - too old term"); >>> + >>> + /* Competing vote requests are skipped. */ >>> + >>> + is(raft_node_send_vote_response(&node, >>> + 2 /* Term. */, >>> + 3 /* Vote. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote_count, 1, "but ignored - vote not for this node"); >>> + is(node.raft.state, RAFT_STATE_CANDIDATE, "this node does not give up"); >>> + >>> + /* Vote requests are ignored when node is disabled. */ >>> + >>> + raft_node_cfg_is_enabled(&node, false); >>> + >>> + is(raft_node_send_follower(&node, >>> + 3 /* Term. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 0 /* Leader. */, >>> + 3 /* Term. */, >>> + 0 /* Vote. */, >>> + 3 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 2}" /* Vclock. */ >>> + ), "term bump to be able to vote again"); >>> + is(raft_node_send_vote_request(&node, >>> + 3 /* Term. */, >>> + "{}" /* Vclock. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - node is disabled"); >>> + >>> + /* Disabled node still takes term from the vote request. */ >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 4 /* Term. */, >>> + "{}" /* Vclock. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 0 /* Leader. */, >>> + 4 /* Term. */, >>> + 0 /* Vote. */, >>> + 4 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 3}" /* Vclock. */ >>> + ), "term is bumped, but vote request is ignored"); >>> + >>> + raft_node_cfg_is_enabled(&node, true); >>> + >>> + /* Not a candidate won't accept vote request for self. */ >>> + >>> + is(raft_node_send_vote_response(&node, >>> + 4 /* Term. */, >>> + 1 /* Vote. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - vote works only on a candidate"); >>> + >>> + /* Ignore vote response for some third node. */ >>> + >>> + is(raft_node_send_vote_response(&node, >>> + 4 /* Term. */, >>> + 3 /* Vote. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - sender != vote, so it is not a " >>> + "request"); >>> + >>> + /* Ignore if leader is already known. */ >>> + >>> + is(raft_node_send_leader(&node, >>> + 4 /* Term. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.leader, 2, "leader is accepted"); >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 4 /* Term. */, >>> + "{}" /* Vclock. */, >>> + 3 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - leader is already known"); >>> + is(node.raft.leader, 2, "leader is not changed"); >>> + >>> + /* Ignore too small vclock. */ >>> + >>> + /* >>> + * Need to turn off the candidate role to bump the term and not become >>> + * a candidate. >>> + */ >>> + raft_node_cfg_is_candidate(&node, false); >>> + >>> + raft_node_journal_follow(&node, 1, 5); >>> + raft_node_journal_follow(&node, 2, 5); >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 2 /* Leader. */, >>> + 4 /* Term. */, >>> + 0 /* Vote. */, >>> + 4 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 3, 1: 5, 2: 5}" /* Vclock. */ >>> + ), "vclock is bumped"); >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 5 /* Term. */, >>> + "{1: 4}" /* Vclock. */, >>> + 3 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - vclock is too small"); >>> + is(node.raft.term, 5, "new term"); >>> + is(node.raft.leader, 0, "leader is dropped in the new term"); >>> + >>> + /* Ignore incomparable vclock. */ >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 5 /* Term. */, >>> + "{1: 4, 2: 6}" /* Vclock. */, >>> + 3 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 0, "but ignored - vclock is incomparable"); >>> + >>> + /* Ignore if voted in the current term. */ >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 6 /* Term. */, >>> + "{1: 5, 2: 5}" /* Vclock. */, >>> + 2 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 2, "voted"); >>> + >>> + is(raft_node_send_vote_request(&node, >>> + 6 /* Term. */, >>> + "{1: 5, 2: 5}" /* Vclock. */, >>> + 3 /* Source. */ >>> + ), 0, "message is accepted"); >>> + is(node.raft.vote, 2, "but ignored - already voted in the term"); >> >> 2. I'd also check the case that the node ignores vote request after restart. > Hm. What do you mean? Why should it? Do you mean if it voted > in the current term and then restarted? Yes. > > I added this: > > ==================== > @@ -346,7 +346,7 @@ raft_test_vote(void) > static void > raft_test_vote_skip(void) > { > - raft_start_test(37); > + raft_start_test(39); > struct raft_node node; > raft_node_create(&node); > > @@ -505,6 +505,16 @@ raft_test_vote_skip(void) > ), 0, "message is accepted"); > is(node.raft.vote, 2, "but ignored - already voted in the term"); > > + /* After restart it still will ignore requests in the current term. */ > + > + raft_node_restart(&node); > + is(raft_node_send_vote_request(&node, > + 6 /* Term. */, > + "{1: 5, 2: 5}" /* Vclock. */, > + 3 /* Source. */ > + ), 0, "message is accepted"); > + is(node.raft.vote, 2, "but ignored - already voted in the term"); > + > raft_node_cfg_is_candidate(&node, true); > > /* > ==================== That's exactly what I meant. >>> + >>> +static void >>> +raft_test_heartbeat(void) >>> +{ >>> + raft_start_test(12); >>> + struct raft_node node; >>> + raft_node_create(&node); >>> + >>> + /* Let the node know there is a leader somewhere. */ >>> + >>> + is(raft_node_send_leader(&node, >>> + 2 /* Term. */, >>> + 2 /* Source. */ >>> + ), 0, "leader notification"); >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 2 /* Leader. */, >>> + 2 /* Term. */, >>> + 0 /* Vote. */, >>> + 2 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 1}" /* Vclock. */ >>> + ), "follow the leader after notification"); >>> + >>> + /* Leader can send the same message many times. */ >>> + >>> + is(raft_node_send_leader(&node, >>> + 2 /* Term. */, >>> + 2 /* Source. */ >>> + ), 0, "leader notification"); >>> + >>> + /* The node won't do anything if it is not a candidate. */ >>> + >>> + raft_node_cfg_is_candidate(&node, false); >>> + raft_run_for(node.cfg_death_timeout * 2); >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 2 /* Leader. */, >>> + 2 /* Term. */, >>> + 0 /* Vote. */, >>> + 2 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 1}" /* Vclock. */ >>> + ), "follow the leader because no candidate"); >> >> 3. Shouldn't the node reset the leader to 0 even if if it isn't a candidate itself? > I am not sure I parsed the sentence correctly. Did you said that > the node should reset the leader if raft_node_cfg_is_candidate(false) > is called? > > But why? If it is not a leader, it is a voter. It still must know > who is the leader to decide whether to accept rows from its peers. > > Or do you mean we should still reset the leader after the timeout, > but don't start new election? Yes, that's what I was trying to say. But we don't do that in raft code now, so no point in testing it here. > >>> + raft_node_cfg_is_candidate(&node, true); >>> + >>> + /* Heartbeats from the leader are accepted. */ >>> + >>> + for (int i = 0; i < 5; ++i) { >>> + raft_run_for(node.cfg_death_timeout / 2); >>> + raft_node_send_heartbeat(&node, 2); >>> + } >>> + ok(raft_node_check_full_state(&node, >>> + RAFT_STATE_FOLLOWER /* State. */, >>> + 2 /* Leader. */, >>> + 2 /* Term. */, >>> + 0 /* Vote. */, >>> + 2 /* Volatile term. */, >>> + 0 /* Volatile vote. */, >>> + "{0: 1}" /* Vclock. */ >>> + ), "follow the leader because had heartbeats"); >>> + >>> + /* Heartbeats not from the leader won't do anything. */ >>> + >>> + double start = raft_time(); >>> + raft_run_for(node.cfg_death_timeout / 3); >>> + raft_node_send_heartbeat(&node, 3); >>> + raft_run_for(node.cfg_death_timeout / 3); >>> + raft_node_send_heartbeat(&node, 0); >>> + raft_run_next_event(); >>> + ok(raft_time() >= start + node.cfg_death_timeout, "death timeout " >>> + "passed"); >> >> 4. Looks like this test would succeed even if heartbeats from other >> instances were counted as leader heartbeats. You simply say >> `raft_run_next_event()` which'll make the death timeout fire sooner >> or later anyway. I think it's better to add another >> `raft_run_for(node.cfg_death_timeout / 2)` to make sure the node >> times out exactly when it should. > I am again not 100% sure I understood correctly, but hopefully > you meant this, and here I agree: > > ==================== > @@ -745,8 +755,13 @@ raft_test_heartbeat(void) > raft_run_for(node.cfg_death_timeout / 3); > raft_node_send_heartbeat(&node, 0); > raft_run_next_event(); > - ok(raft_time() >= start + node.cfg_death_timeout, "death timeout " > - "passed"); > + double deadline = start + node.cfg_death_timeout; > + /* > + * Compare == with 0.1 precision. Because '/ 3' operations above will > + * make the doubles contain some small garbage. > + */ > + ok(raft_time() + 0.1 >= deadline && raft_time() - 0.1 <= deadline, > + "death timeout passed"); > ok(raft_node_check_full_state(&node, > RAFT_STATE_CANDIDATE /* State. */, > 0 /* Leader. */, > ==================== Yes, that's what I wanted. > >>> diff --git a/test/unit/raft_test_utils.c b/test/unit/raft_test_utils.c >>> new file mode 100644 >>> index 000000000..4acd74e8f >>> --- /dev/null >>> +++ b/test/unit/raft_test_utils.c >>> + >>> +static void >>> +raft_method_node_broadcast(struct raft *raft, const struct raft_msg *msg); >>> + >>> +static void >>> +raft_method_node_write(struct raft *raft, const struct raft_msg *msg); >>> + >>> +static void >>> +raft_method_node_schedule_async(struct raft *raft); >>> + >> >> 5. Why such a long prefix? Why not simply raft_node_broadcast? > I wanted to emphasize that these are 'methods' of 'raft'. Not of > raft_node. > >> Ok, raft_node_broadcast may sound too general and referring more >> to a raft test node rather than the raft core itself. >> What about raft_node_broadcast_f then? > I don't mind to change it if my version is confusing. Thanks! > > ==================== > @@ -60,18 +60,18 @@ raft_loop(void) > } > > static void > -raft_method_node_broadcast(struct raft *raft, const struct raft_msg *msg); > +raft_node_broadcast_f(struct raft *raft, const struct raft_msg *msg); > > static void > -raft_method_node_write(struct raft *raft, const struct raft_msg *msg); > +raft_node_write_f(struct raft *raft, const struct raft_msg *msg); > > static void > -raft_method_node_schedule_async(struct raft *raft); > +raft_node_schedule_async_f(struct raft *raft); > > static struct raft_vtab raft_vtab = { > - .broadcast = raft_method_node_broadcast, > - .write = raft_method_node_write, > - .schedule_async = raft_method_node_schedule_async, > + .broadcast = raft_node_broadcast_f, > + .write = raft_node_write_f, > + .schedule_async = raft_node_schedule_async_f, > }; > > static int > @@ -485,21 +485,21 @@ raft_node_destroy(struct raft_node *node) > } > > static void > -raft_method_node_broadcast(struct raft *raft, const struct raft_msg *msg) > +raft_node_broadcast_f(struct raft *raft, const struct raft_msg *msg) > { > struct raft_node *node = container_of(raft, struct raft_node, raft); > raft_net_send(&node->net, msg); > } > > static void > -raft_method_node_write(struct raft *raft, const struct raft_msg *msg) > +raft_node_write_f(struct raft *raft, const struct raft_msg *msg) > { > struct raft_node *node = container_of(raft, struct raft_node, raft); > raft_journal_write(&node->journal, msg); > } > > static void > -raft_method_node_schedule_async(struct raft *raft) > +raft_node_schedule_async_f(struct raft *raft) > { > struct raft_node *node = container_of(raft, struct raft_node, raft); > node->has_work = true; > ==================== > >>> +static struct raft_vtab raft_vtab = { >>> + .broadcast = raft_method_node_broadcast, >>> + .write = raft_method_node_write, >>> + .schedule_async = raft_method_node_schedule_async, >>> +}; >>> + >>> +static int >>> +raft_node_on_update(struct trigger *t, void *event) >>> +{ >>> + struct raft_node *n = t->data; >>> + assert(&n->on_update == t); >>> + assert(&n->raft == event); >> >> 6. `(void) event` for building in release? > Yes, need to add. > > ==================== > @@ -80,6 +80,7 @@ raft_node_on_update(struct trigger *t, void *event) > struct raft_node *n = t->data; > assert(&n->on_update == t); > assert(&n->raft == event); > + (void)event; > ++n->update_count; > return 0; > } > ==================== -- Serge Petrenko
next prev parent reply other threads:[~2020-12-18 8:17 UTC|newest] Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-12-13 17:15 [Tarantool-patches] [PATCH 0/8] Raft module, part 4 - " Vladislav Shpilevoy 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 1/8] fakesys: fix ev_is_active not working on fake timers Vladislav Shpilevoy 2020-12-15 9:42 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 2/8] fakesys: introduce fakeev_timer_remaining() Vladislav Shpilevoy 2020-12-15 9:43 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 3/8] raft: introduce raft_ev Vladislav Shpilevoy 2020-12-15 10:02 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 4/8] test: introduce raft unit tests Vladislav Shpilevoy 2020-12-13 18:10 ` Vladislav Shpilevoy 2020-12-16 13:03 ` Serge Petrenko 2020-12-17 22:44 ` Vladislav Shpilevoy 2020-12-18 8:17 ` Serge Petrenko [this message] 2020-12-20 17:28 ` Vladislav Shpilevoy 2020-12-21 7:36 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 5/8] raft: fix crash when received 0 term message Vladislav Shpilevoy 2020-12-16 13:05 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 6/8] raft: fix ignorance of bad state receipt Vladislav Shpilevoy 2020-12-16 13:06 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 7/8] raft: fix crash on election timeout decrease Vladislav Shpilevoy 2020-12-16 13:08 ` Serge Petrenko 2020-12-13 17:15 ` [Tarantool-patches] [PATCH 8/8] raft: fix crash on death " Vladislav Shpilevoy 2020-12-16 13:10 ` Serge Petrenko 2020-12-21 16:50 ` [Tarantool-patches] [PATCH 0/8] Raft module, part 4 - unit tests Vladislav Shpilevoy 2020-12-21 17:29 ` Vladislav Shpilevoy
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=51c59e82-8351-8383-552b-d0c6056a324c@tarantool.org \ --to=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH 4/8] test: introduce raft unit tests' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox