[tarantool-patches] Re: [PATCH 2/6] test: introduce breakpoints for swim's event loop

Tue Apr 2 15:25:51 MSK 2019

Hi!

On 29/03/2019 21:20, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/03/20 14:11]:
>> Breakpoint as API gives a test writer more control over timing of
>> condition checks. Breakpoint stops the swim's event loop in a
>> certain moment of virtual time.
>>
>> Without breakpoints it is possible, that a condition has failed
>> its deadline, but it can not be checked properly. For example,
>> assume that there is a cluster of two members, and after 1 second
>> they should become fullmesh. It means, that any checks in [0, 1)
>> time have to fail. But without breakpoints it is not so:
> 
> From the description of the event, I would actually expect it to 
> stop the event loop, while you're stopping the event loop from the
> helper macro.

It does stop the event loop. But many other events do the same.
Purpose of that new brk event is to stop the loop in a concrete
point in time, probably between some other, natural, events.

> Why is it a macro btw? Why not make it a function?

There was no any inevitable obstacle on the way of implementing
that as a function. It just looks shorter, especially taking
into account that the next patches introduce new swim_wait_timeout()
usage cases with more complex conditions. I tried to make it a
function firstly, but failed to do it without a pile of wrappers
for each case.

If you want it be a function - ok. I did it.

==============================================================================

diff --git a/test/unit/swim_test_utils.c b/test/unit/swim_test_utils.c
index 0b301333b..896b9dcae 100644
--- a/test/unit/swim_test_utils.c
+++ b/test/unit/swim_test_utils.c
@@ -129,28 +129,49 @@ swim_cluster_is_fullmesh(struct swim_cluster *cluster)
 	return true;
 }
 
+typedef bool (*swim_loop_check_f)(struct swim_cluster *cluster, void *data);
+
 /**
- * A common wrapper for some conditions checking after each event
- * loop step.
+ * Run the event loop until timeout happens or a custom
+ * test-defined condition is met.
+ * @param timeout Maximal number of bogus seconds to run the loop
+ *        for.
+ * @param cluster Cluster to test for a condition.
+ * @param check Function condition-checker. It should return true,
+ *        when the condition is met.
+ * @param data Arbitrary test data passed to @a check without
+ *        changes.
+ *
+ * @retval -1 Timeout, condition is not satisfied.
+ * @retval 0 Success, condition is met before timeout.
  */
-#define swim_wait_timeout(timeout, target_cond) ({			\
-	swim_ev_set_brk(timeout);					\
-	double deadline = swim_time() + timeout;			\
-	int rc = 0;							\
-	while (! (target_cond)) {					\
-		if (swim_time() >= deadline) {				\
-			rc = -1;					\
-			break;						\
-		}							\
-		swim_do_loop_step(loop());				\
-	}								\
-	rc;								\
-})
+static int
+swim_wait_timeout(double timeout, struct swim_cluster *cluster,
+		  swim_loop_check_f check, void *data)
+{
+	swim_ev_set_brk(timeout);
+	double deadline = swim_time() + timeout;
+	while (! check(cluster, data)) {
+		if (swim_time() >= deadline)
+			return -1;
+		swim_do_loop_step(loop());
+	}
+	return 0;
+}
+
+/** Wrapper to check a cluster for fullmesh for timeout. */
+static bool
+swim_loop_check_fullmesh(struct swim_cluster *cluster, void *data)
+{
+	(void) data;
+	return swim_cluster_is_fullmesh(cluster);
+}
 
 int
 swim_cluster_wait_fullmesh(struct swim_cluster *cluster, double timeout)
 {
-	return swim_wait_timeout(timeout, swim_cluster_is_fullmesh(cluster));
+	return swim_wait_timeout(timeout, cluster, swim_loop_check_fullmesh,
+				 NULL);
 }
==============================================================================

Note, that I can not pass just 'cluster' without 'data' - next commits
need more arguments than just a cluster object.

> 
> Anyway, from this description I would expect that
> swim_brk_event_process does something like ev_loop_break(), and
> then the loop has to be resumed.

On the whole, each event here does something like 'ev_loop_break()'.
After each loop step the execution is stopped. This is because I do
not have anything like ev_run() - I just don't need it. The only API
exposed by test event loop is swim_do_loop_step(). It rolls one
loop iteration and stops. Above that method I have wrappers, doing
more loop steps and checking conditions. And *these wrappers* can be
stopped or broken in concrete time points. Watch the hands: the loop
is incremental and it can not be stopped because it can not be run.
But wrappers can be stopped since they roll many steps in search
of a condition satisfaction, or a timeout.

This new event is needed not because I can't stop event loop execution,
but because I can't add to it arbitrary time points. For example,
*between* some events. Also, without that ability I could miss some
errors. For an example you can look at the commit message.

> It's fully up to you to do it
> differently, but please fix the docs so that there is no
> ambiguity.

I added a big descriptive comment to the swim_test_ev.h, if you
did not catch the mechanism from my commit messages:

==============================================================================
diff --git a/test/unit/swim_test_ev.h b/test/unit/swim_test_ev.h
index 01a1b8868..355f83f1b 100644
--- a/test/unit/swim_test_ev.h
+++ b/test/unit/swim_test_ev.h
@@ -37,6 +37,47 @@ struct ev_loop;
  * speed up events processing while keeping SWIM unaware that it
  * works in a simulation. Libev is used a little, just to store
  * some IO events.
+ *
+ * The test event loop works as follows. It has a global watch and
+ * a heap of events sorted by deadlines. An event is either a
+ * libev event like EV_TIMER, or an internal test event.
+ *
+ * On each iteration it takes all the next events with the nearest
+ * and equal deadline, and sets the global watch with the deadline
+ * value. It simulates time flow. All the events with that
+ * deadline are processed. An event processing usually means
+ * calling a libev callback set by a SWIM instance beforehand.
+ *
+ * For example, if event deadlines and the watch are:
+ *
+ *     watch = 0
+ *     queue = [1, 1, 1, 5, 5, 6, 7, 7, 7]
+ *
+ * Then the queue is dispatched as follows:
+ *
+ *     1) watch = 1
+ *        process first 3 events
+ *        queue = [5, 5, 6, 7, 7, 7]
+ *
+ *     2) watch = 5
+ *        process next 2 events
+ *        queue = [6, 7, 7, 7]
+ *
+ *     3) watch = 6
+ *        process a next event
+ *        queue = [7, 7, 7]
+ *
+ *     4) watch = 7
+ *        process next 3 events
+ *        queue = []
+ *
+ * The loop provides an API to make one iteration, do one loop
+ * step. For example, the sequence above is played in 4 loop
+ * steps. The unit tests can either do explicitly step by step,
+ * calling that API method. Or use wrappers with 'timeouts', which
+ * in fact do the same, but until the global watch equals a
+ * certain value. Usually after each loop step a test checks some
+ * conditions.
  */
==============================================================================

> And, once again, does swim_wait_timeout have to be a
> macro?
> 
> As a final nit, I would prefix all testing APIs with a name that 
> would make sure they are never confused with the actual SWIM API.
> 
> E.g. swim_wait_timeout() could be easily confused with some public
> method by a newbie.
Doubtful. It is located in test/swim_test_utils.c file. If someone
thinks, that methods in these file and directory are public, then
not sure if different naming would cure them.

> Perhaps choose an entirely different prefix
> for the testing harness methods like swim_unit_* swim_ut_ or swum?

I thought about it and tried as well, but then names looks either
crazy ('swum', seriously? are you mocking?), or too long and the
code starts looking like Java. I strongly disagree that we need
these prefixes just for the sake of themselves here, because it
aggravates readability. If in future something conflicts, we will
just fix the concrete conflict without padding out all the other
code.

At this moment I use suffixes '_test_' and '_cluster_' only in
some really dubious places. For example, for the methods which
are supposed to be mixed with normal SWIM methods along the tests.


Of course, for each of my objectives above and further you can
insist with your right of veto, and I will do it heavy-heartedly
through the nose and memes.

> 
> -- 
> Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
> http://tarantool.io - www.twitter.com/kostja_osipov
>