[Tarantool-patches] [PATCH 3/4] box: yield after initial box_cfg() is finished

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Apr 12 23:13:29 MSK 2020


box.cfg() works in two stages when called for the first time: it
boots the instance using the box_cfg() C++ function, and then
configures it. All non-dynamic parameters are read during boot.
Dynamic ones are mostly configured afterwards.

Normally there should be a yield between the box_cfg() C++ call
and the configuration of dynamic parameters. The box.ctl.wait_ro()
and box.ctl.wait_rw() Lua calls rely on it to always catch the
instance in the read-only state before the read-write state.

In theory a user should be able to call box.ctl.wait_ro() and
box.ctl.wait_rw() in one fiber, box.cfg() in another, and these
waits would be unblocked one after another.

It works fine now, but only because of, surprisingly, the feedback
daemon. The daemon creates a yield after C++ box_cfg() has
finished, while dynamic parameters are still being applied in
load_cfg.lua. That gives time to catch the box.ctl.wait_ro() event.

The thing is that the dynamic parameters configuration includes
the daemon's options too. When the 'feedback_enabled' option is
set to true, the daemon is started using fiber.create(). That
creates a yield, and gives the box.ctl.wait_ro() fibers time to
handle the event.

When the daemon is disabled or removed, as is going to happen in
#3308, this trick does not work, and a box.ctl.wait_ro() started
before box.cfg() is never triggered.
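
The lost event can be illustrated with a toy model. This is only a
sketch and not Tarantool code: Model, simulate() and all field names
are hypothetical. It assumes a single waiter that re-checks its
condition when it is woken up, and that a broadcast only marks the
waiter as ready without running it.

```cpp
#include <cassert>

// Toy model, not Tarantool code: one waiter blocked in a
// wait_ro()-like loop, and one "cfg" fiber that changes the state
// and broadcasts. All names here are hypothetical.
struct Model {
    bool is_configured = false;    // set once the boot stage is done
    bool is_ro = true;             // the instance starts read-only
    bool waiter_woken = false;     // a broadcast is pending for the waiter
    bool wait_ro_returned = false; // the waiter has seen the ro state

    // Like a condition broadcast: only marks the waiter as ready to run.
    void broadcast() { waiter_woken = true; }

    // The cfg fiber gives up control. A woken waiter re-checks its
    // condition against the state it sees at this moment.
    void yield() {
        if (wait_ro_returned || !waiter_woken)
            return;
        waiter_woken = false;
        if (is_configured && is_ro)
            wait_ro_returned = true;
        // Otherwise the waiter goes back to sleep.
    }
};

// Returns true if the wait_ro()-like waiter observed the ro state.
bool simulate(bool yield_after_boot) {
    Model m;
    // Boot stage finished: configured and still read-only.
    m.is_configured = true;
    m.broadcast();
    if (yield_after_boot)
        m.yield();
    // Dynamic options are applied: the instance becomes read-write.
    m.is_ro = false;
    m.broadcast();
    // Sooner or later the cfg fiber yields anyway.
    m.yield();
    return m.wait_ro_returned;
}
```

In this model simulate(true) reports that the waiter saw the
read-only state, while simulate(false) does not: by the time the
waiter runs, the state is already read-write and it goes back to
sleep.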

It can be reproduced in app-tap/cfg.test.lua with

    box.cfg{}

changed to

    box.cfg{feedback_enabled = false}

Then the test would hang. The test is not patched here, because
the feedback daemon is going to become optionally removable in a
following commit, and the test would become flaky depending on
build options.

Needed for #3308
---
 src/box/box.cc | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 0c15ba5e9..891289bf6 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -2450,8 +2450,21 @@ box_cfg_xc(void)
 	if (!is_bootstrap_leader)
 		replicaset_sync();
 
-	/* If anyone is waiting for ro mode to go away */
+	/* box.cfg.read_only is not read yet. */
+	assert(box_is_ro());
+	/* If anyone is waiting for ro mode. */
 	fiber_cond_broadcast(&ro_cond);
+	/*
+	 * Yield to let ro condition waiters to handle the event.
+	 * Without yield it may happen there won't be a context
+	 * switch until the ro state is changed again, and as a
+	 * result, some ro waiters may sleep forever. For example,
+	 * when Tarantool is just started, it is expected it will
+	 * enter ro=true state, and then ro=false. Without the
+	 * yield the ro=true event may be lost. This affects
+	 * box.ctl.wait_ro() call.
+	 */
+	fiber_sleep(0);
 }
 
 void
-- 
2.21.1 (Apple Git-122.3)


