[Tarantool-patches] [PATCH v9 0/2] relay: provide downstream lag information
Cyrill Gorcunov
gorcunov at gmail.com
Thu Jun 17 18:48:33 MSK 2021
Guys, take a look once time permits; hopefully I managed to address
all comments. The previous series is at
https://lists.tarantool.org/tarantool-patches/20210607155519.109626-1-gorcunov@gmail.com/
v4 (by Vlad):
- add a test case
- add docbot request
- dropped xrow_encode_vclock_timed; we use an open-coded assignment
for the tm value when sending an ACK
- struct awstat renamed to applier_wal_stat. Vlad, I think this is a
better name than "applier_lag" because these are statistics on the WAL:
we simply track remote WAL propagation here, so a more general name
is better for grep's sake and for future extensions
- instead of passing the applier structure we pass replica_id
- the real keeper of these statistics is now the "replica" structure,
so they are not bound to the applier itself
- for synchro entries we pass a pointer to the applier_wal_stat instead
of using replica_id = 0 as a sign that we don't need to update statistics
for the initial and final join cases
- to write and read statistics we provide the wal_stat_update and wal_stat_ack
helpers to cover the case where a single ACK spans several transactions
v8:
- one branch less in apply_synchro_row()
- keep applier_txn_start_tm inside the replica structure
- rename wal_stat to replica_cb_data since this is more
logical for the case where we have no general stat engine
- make the applier send a timestamp so that the relay computes the
delta upon read; the lag is kept until a new write happens
- extend doc and changelog a bit
- keep reading the relay's lag from the TX thread without any modifications,
because the relay gets deleted from the TX thread and set to a non-RELAY_FOLLOW
state, thus any attempt to read it won't succeed. To be honest, there
is a small race window: writing a double is not an atomic operation,
thus we might read a partially updated timestamp, similar to what we have with
the @idle field already. I think this should be addressed separately, and better
without the heavy cmsg engine involved, using an rw lock or plain atomics instead.
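For illustration only, a minimal sketch of the v8 scheme together with the
"plain atomics" alternative mentioned in the last item. It is not the code
from the patches: struct lag_slot and the lag_slot_* helpers are hypothetical
names, and it assumes that a zero timestamp means "no transaction time in this
ACK", that both timestamps come from the same clock, and that a double fits in
64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <stdatomic.h>

_Static_assert(sizeof(double) == sizeof(uint64_t),
	       "sketch assumes a 64-bit double");

/* Hypothetical holder of the lag; not a structure from the patch. */
struct lag_slot {
	/* Bit pattern of a double holding the lag in seconds. */
	_Atomic uint64_t bits;
};

/* Writer side: publish the value so a reader never sees a torn double. */
static void
lag_slot_store(struct lag_slot *s, double lag)
{
	uint64_t bits;
	memcpy(&bits, &lag, sizeof(bits));
	atomic_store_explicit(&s->bits, bits, memory_order_relaxed);
}

/* Reader side: fetch the whole 64-bit pattern in one atomic load. */
static double
lag_slot_load(struct lag_slot *s)
{
	uint64_t bits = atomic_load_explicit(&s->bits, memory_order_relaxed);
	double lag;
	memcpy(&lag, &bits, sizeof(lag));
	return lag;
}

/*
 * Called when an ACK is read. txn_tm is the timestamp echoed back by
 * the applier, now is the current time; both are in seconds.
 */
static void
lag_slot_on_ack(struct lag_slot *s, double txn_tm, double now)
{
	assert(s != NULL);
	/* Assumption: zero means the ACK carries no transaction time,
	 * so the previously computed lag is kept. */
	if (txn_tm <= 0)
		return;
	double delta = now - txn_tm;
	/* Clamp so that a clock adjustment cannot yield a negative lag. */
	lag_slot_store(s, delta > 0 ? delta : 0);
}
```

Relaxed ordering is enough in this sketch because the value is a standalone
statistic and no other data is published through it.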
v9 (Vlad and Serge):
- the transaction lag read by the TX thread is now updated via a cbus message
- use the last timestamp from the transaction for accounting
- verify that we really need to test for the replica being non-nil in
the applier reader
- update docs
- update a testcase
branch gorcunov/gh-5447-relay-lag-9
issue https://github.com/tarantool/tarantool/issues/5447
Cyrill Gorcunov (2):
applier: send transaction's first row WAL time in the applier_writer_f
relay: provide information about downstream lag
.../unreleased/gh-5447-downstream-lag.md | 6 +
src/box/applier.cc | 97 +++++++++++--
src/box/lua/info.c | 3 +
src/box/relay.cc | 94 ++++++++++++-
src/box/relay.h | 6 +
src/box/replication.cc | 1 +
src/box/replication.h | 5 +
.../replication/gh-5447-downstream-lag.result | 128 ++++++++++++++++++
.../gh-5447-downstream-lag.test.lua | 57 ++++++++
9 files changed, 378 insertions(+), 19 deletions(-)
create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
create mode 100644 test/replication/gh-5447-downstream-lag.result
create mode 100644 test/replication/gh-5447-downstream-lag.test.lua
base-commit: b5f0dc4db9aef9618f56b0bcb4a7b82a59591784
--
2.31.1