[Tarantool-patches] [PATCH v3] wal: introduce limits on simultaneous writes
Serge Petrenko
sergepetrenko at tarantool.org
Wed Mar 17 15:14:11 MSK 2021
16.03.2021 23:48, Vladislav Shpilevoy wrote:
> Hi! Thanks for the fixes!
Thanks for the review!
> See 3 comments below.
>
>> Here's when the option comes in handy:
>> Imagine such a situation: there are 2 servers, a master and a replica,
>> and the replica is down for some period of time. While the replica is
>> down, the master serves requests at a reasonable pace, possibly close to
>> its WAL throughput limit. Once the replica reconnects, it has to receive
>> all the data master has piled up. Now there's no limit in speed at which
>> master sends the data to replica, and there's no limit at which
>> replica's applier submits corresponding write requests to WAL. This
>> leads to a situation when replica's WAL is never in time to serve the
>> requests and the amount of pending requests is constantly growing.
>> There's no limit for memory WAL write requests take, and this clogging
>> of WAL write queue may even lead to replica using up all the available
>> memory.
>>
>> Now, when `wal_queue_max_size` is set, appliers will stop reading new
>> transactions once the limit is reached. This will let WAL process all the
>> requests that have piled up and free all the excess memory.
>>
>> [tosquash] remove wal_queue_max_len
> 1. You forgot something, the last line. Also, while we are here, it probably
> would be easier for the doc team if the old behaviour was described using a
> past tense, while the new one using the present tense. Currently you use
> 'now' word both for the old and for the new behaviour. For instance, you say
>
> Now there's no limit in speed at which master sends the data to replica,
> and there's no limit at which replica's applier submits corresponding
> write requests to WAL
>
> But >now< there is a limit. Even if 'wal_queue_max_size' is not set, it works
> with the default value.
Thanks! Take a look at the revised paragraph:
=====================================================
Here's when the option comes in handy:
Before this option was introduced, the following situation was possible:
there are 2 servers, a master and a replica, and the replica is down for
some period of time. While the replica is down, the master serves requests
at a reasonable pace, possibly close to its WAL throughput limit. Once the
replica reconnects, it has to receive all the data the master has piled up.
There is no limit on the speed at which the master sends the data to the
replica, and, without the option, there was no limit on the speed at which
the replica submitted the corresponding write requests to WAL.
This led to a situation where the replica's WAL could never keep up with
the requests, and the number of pending requests was constantly growing.
There was no limit on the memory taken by WAL write requests, and this
clogging of the WAL write queue could even lead to the replica using up
all the available memory.
Now, when `wal_queue_max_size` is set, appliers stop reading new
transactions once the limit is reached. This lets WAL process all the
requests that have piled up and free all the excess memory.
=====================================================
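For the doc team, the accounting described above can be sketched roughly
as follows. This is only an illustration and not the code from the patch:
`struct demo_queue`, the `demo_queue_*` helpers, and the `>=` comparison
in `demo_queue_is_full()` are assumptions made for the example. Only the
idea of adding an entry's approximate length on append and subtracting it
on completion comes from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the journal queue state.
 * Not the real Tarantool structure.
 */
struct demo_queue {
	/* Limit in bytes, the role played by wal_queue_max_size. */
	int64_t max_size;
	/* Bytes taken by write requests that are not yet written. */
	int64_t size;
};

/*
 * A reader (an applier in the patch) would stop taking new
 * transactions while this returns true. The exact comparison
 * is an assumption of this sketch.
 */
static inline bool
demo_queue_is_full(const struct demo_queue *q)
{
	return q->size >= q->max_size;
}

/* Account for a new write request of the given approximate length. */
static inline void
demo_queue_on_append(struct demo_queue *q, int64_t approx_len)
{
	q->size += approx_len;
}

/* Release the space once the write request is complete. */
static inline void
demo_queue_on_complete(struct demo_queue *q, int64_t approx_len)
{
	q->size -= approx_len;
	assert(q->size >= 0);
}
```

With a 100-byte limit, two 60-byte requests make the queue full, and
completing one of them makes room again.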
>> diff --git a/changelogs/unreleased/wal-queue-limit.md b/changelogs/unreleased/wal-queue-limit.md
>> new file mode 100644
>> index 000000000..393932456
>> --- /dev/null
>> +++ b/changelogs/unreleased/wal-queue-limit.md
>> @@ -0,0 +1,9 @@
>> +## feature/core
>> +
>> +* Introduce the concept of WAL queue and 2 new configuration options:
>> + `wal_queue_max_len`, measured in transactions, with 100k default and
>> + `wal_queue_max_size`, measured in bytes, with 100 Mb default.
> 2. There is 1 option now, not 2.
Sorry for the inattention, fixed.
>
>> + The options help limit the pace at which replica submits new transactions
>> + to WAL: the limits are checked every time a transaction from master is
>> + submitted to replica's WAL, and the space taken by a transaction is
>> + considered empty once it's successfully written (gh-5536).
>> diff --git a/src/box/journal.h b/src/box/journal.h
>> index 5d8d5a726..437257728 100644
>> --- a/src/box/journal.h
>> +++ b/src/box/journal.h
>> @@ -124,6 +142,62 @@ struct journal {
>> +
>> +/** Set maximal journal queue size in bytes. */
>> +static inline void
>> +journal_queue_set_max_size(int64_t size)
>> +{
>> + journal_queue.max_size = size;
>> + journal_queue_wakeup();
>> +}
>> +
>> +/** Increase queue size on a new write request. */
>> +static inline void
>> +journal_queue_on_append(struct journal_entry *entry)
> 3. Since you will amend the patch anyway, you could also
> make the entry 'const', the same in journal_queue_on_complete().
Sure. The diff's below.
>
>> +{
>> + journal_queue.size += entry->approx_len;
>> +}
>> +
>> +/** Decrease queue size once write request is complete. */
>> +static inline void
>> +journal_queue_on_complete(struct journal_entry *entry)
>> +{
>> + journal_queue.size -= entry->approx_len;
>> + assert(journal_queue.size >= 0);
>> +}
=====================================================
diff --git a/changelogs/unreleased/wal-queue-limit.md b/changelogs/unreleased/wal-queue-limit.md
index 393932456..1708e46e6 100644
--- a/changelogs/unreleased/wal-queue-limit.md
+++ b/changelogs/unreleased/wal-queue-limit.md
@@ -1,9 +1,8 @@
## feature/core
-* Introduce the concept of WAL queue and 2 new configuration options:
- `wal_queue_max_len`, measured in transactions, with 100k default and
+* Introduce the concept of WAL queue and a new configuration option:
`wal_queue_max_size`, measured in bytes, with 100 Mb default.
- The options help limit the pace at which replica submits new transactions
- to WAL: the limits are checked every time a transaction from master is
+ The option helps limit the pace at which replica submits new transactions
+ to WAL: the limit is checked every time a transaction from master is
submitted to replica's WAL, and the space taken by a transaction is
considered empty once it's successfully written (gh-5536).
diff --git a/src/box/journal.h b/src/box/journal.h
index 437257728..76c70c19f 100644
--- a/src/box/journal.h
+++ b/src/box/journal.h
@@ -185,14 +185,14 @@ journal_queue_set_max_size(int64_t size)
/** Increase queue size on a new write request. */
static inline void
-journal_queue_on_append(struct journal_entry *entry)
+journal_queue_on_append(const struct journal_entry *entry)
{
journal_queue.size += entry->approx_len;
}
/** Decrease queue size once write request is complete. */
static inline void
-journal_queue_on_complete(struct journal_entry *entry)
+journal_queue_on_complete(const struct journal_entry *entry)
{
journal_queue.size -= entry->approx_len;
assert(journal_queue.size >= 0);
diff --git a/test/box-tap/cfg.test.lua b/test/box-tap/cfg.test.lua
index 3276ddf64..8f21c5628 100755
--- a/test/box-tap/cfg.test.lua
+++ b/test/box-tap/cfg.test.lua
@@ -6,7 +6,7 @@ local socket = require('socket')
local fio = require('fio')
local uuid = require('uuid')
local msgpack = require('msgpack')
-test:plan(110)
+test:plan(109)
--------------------------------------------------------------------------------
-- Invalid values
@@ -50,7 +50,6 @@ invalid('vinyl_run_size_ratio', 1)
invalid('vinyl_bloom_fpr', 0)
invalid('vinyl_bloom_fpr', 1.1)
invalid('wal_queue_max_size', -1)
-invalid('wal_queue_max_len', -1)
local function invalid_combinations(name, val)
local status, result = pcall(box.cfg, val)
--
Serge Petrenko