[PATCH] vinyl: be pessimistic about write rate when setting dump watermark

Vladimir Davydov vdavydov.dev at gmail.com
Fri Apr 19 17:22:49 MSK 2019


We set the dump watermark using the following formula:

    limit - watermark     watermark
    ---------------- = --------------
       write_rate      dump_bandwidth

This ensures that by the time we run out of memory quota, the memory
dump will have completed and we'll be able to proceed. Here
write_rate is the expected rate at which the workload will
write to the database while the dump is in progress. Once the dump
has started, we throttle the workload if it exceeds this rate.

Currently, we estimate the write rate as a moving average observed
over the last 5 seconds. This performs poorly unless the workload
write rate is perfectly stable: if the 5-second average turns out to
be even slightly less than the max rate, the workload may experience
long stalls during memory dump.

To avoid that, let's use the max write rate multiplied by 1.5 instead
of the average when setting the watermark. This means that we will
start the dump earlier than strictly necessary, but in return the
watermark will tolerate write rate fluctuations, reducing the
probability of stalls.

Closes #4166
---
https://github.com/tarantool/tarantool/issues/4166
https://github.com/tarantool/tarantool/commits/dv/gh-4166-vy-throttling-tweak
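
For illustration only (not part of the patch): a minimal standalone
sketch of the watermark computation described above. Solving the
formula for the watermark gives

    watermark = limit * dump_bandwidth / (dump_bandwidth + write_rate)

The function name sketch_dump_watermark and its flat parameter list are
hypothetical; the real code operates on struct vy_regulator and the
quota object. The "+ 1" in the denominator mirrors the patch and guards
against division by zero.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch. All rates are in
 * bytes per second, limit is in bytes.
 */
static size_t
sketch_dump_watermark(size_t limit, size_t dump_bandwidth,
		      size_t write_rate_max)
{
	/* Be pessimistic: assume 1.5x the max observed write rate. */
	size_t write_rate = write_rate_max * 3 / 2;
	/* Solve (limit - w) / write_rate = w / dump_bandwidth for w. */
	size_t watermark = (double)limit * dump_bandwidth /
			   (dump_bandwidth + write_rate + 1);
	/* Never go below 50% of the memory limit. */
	return watermark > limit / 2 ? watermark : limit / 2;
}
```

With a 128 MB limit, a 10 MB/s dump bandwidth and a 4 MB/s max write
rate, the assumed write rate is 6 MB/s and the watermark lands just
under 80 MB. With a 20 MB/s max write rate the formula alone would give
about 32 MB, so the 50% floor applies and the result is 64 MB.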

 src/box/vy_regulator.c | 25 ++++++++++++++++++++++---
 src/box/vy_regulator.h |  5 +++++
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/src/box/vy_regulator.c b/src/box/vy_regulator.c
index 589d3fd9..a6604c9f 100644
--- a/src/box/vy_regulator.c
+++ b/src/box/vy_regulator.c
@@ -115,10 +115,13 @@ vy_regulator_trigger_dump(struct vy_regulator *regulator)
 				max_write_rate);
 
 	say_info("dumping %zu bytes, expected rate %.1f MB/s, "
-		 "ETA %.1f s, recent write rate %.1f MB/s",
+		 "ETA %.1f s, write rate (avg/max) %.1f/%.1f MB/s",
 		 quota->used, (double)regulator->dump_bandwidth / 1024 / 1024,
 		 (double)quota->used / (regulator->dump_bandwidth + 1),
-		 (double)regulator->write_rate / 1024 / 1024);
+		 (double)regulator->write_rate / 1024 / 1024,
+		 (double)regulator->write_rate_max / 1024 / 1024);
+
+	regulator->write_rate_max = regulator->write_rate;
 }
 
 static void
@@ -146,6 +149,8 @@ vy_regulator_update_write_rate(struct vy_regulator *regulator)
 	rate_avg = (1 - weight) * rate_avg + weight * rate_curr;
 
 	regulator->write_rate = rate_avg;
+	if (regulator->write_rate_max < rate_curr)
+		regulator->write_rate_max = rate_curr;
 	regulator->quota_used_last = used_curr;
 }
 
@@ -164,10 +169,24 @@ vy_regulator_update_dump_watermark(struct vy_regulator *regulator)
 	 *   limit - watermark      watermark
 	 *   ----------------- = --------------
 	 *       write_rate      dump_bandwidth
+	 *
+	 * Be pessimistic when predicting the write rate - use the
+	 * max observed write rate multiplied by 1.5 - because it's
+	 * better to start memory dump early than delay it as long
+	 * as possible at the risk of experiencing unpredictably
+	 * long stalls.
 	 */
+	size_t write_rate = regulator->write_rate_max * 3 / 2;
 	regulator->dump_watermark =
 			(double)quota->limit * regulator->dump_bandwidth /
-			(regulator->dump_bandwidth + regulator->write_rate + 1);
+			(regulator->dump_bandwidth + write_rate + 1);
+	/*
+	 * It doesn't make sense to set the watermark below 50%
+	 * of the memory limit because the write rate can never
+	 * exceed the dump bandwidth.
+	 */
+	regulator->dump_watermark = MAX(regulator->dump_watermark,
+					quota->limit / 2);
 }
 
 static void
diff --git a/src/box/vy_regulator.h b/src/box/vy_regulator.h
index 65c1672d..5131ac58 100644
--- a/src/box/vy_regulator.h
+++ b/src/box/vy_regulator.h
@@ -76,6 +76,11 @@ struct vy_regulator {
 	 */
 	size_t write_rate;
 	/**
+	 * Max write rate observed since the last time when
+	 * memory dump was triggered, in bytes per second.
+	 */
+	size_t write_rate_max;
+	/**
 	 * Amount of memory that was used when the timer was
 	 * executed last time. Needed to update @write_rate.
 	 */
-- 
2.11.0