[Tarantool-patches] [PATCH v9 0/2] relay: provide downstream lag information

Cyrill Gorcunov gorcunov at gmail.com
Thu Jun 17 18:48:33 MSK 2021


Guys, take a look once time permits; hopefully I have managed to address
all comments. The previous series is at

https://lists.tarantool.org/tarantool-patches/20210607155519.109626-1-gorcunov@gmail.com/

v4 (by Vlad):
 - add a test case
 - add docbot request
 - dropped xrow_encode_vclock_timed; we use an open-coded assignment
   of the tm value when sending an ACK
 - struct awstat is renamed to applier_wal_stat. Vlad, I think this is
   a better name than "applier_lag" because these are statistics on the WAL:
   we simply track remote WAL propagation here, so a more general name
   is better for grep's sake and for future extensions
 - instead of passing the applier structure we pass replica_id
 - the real keeper of these statistics is now the "replica" structure,
   so they are no longer bound to the applier itself
 - for synchro entries we pass a pointer to the applier_wal_stat instead
   of using replica_id = 0 as a sign that we don't need to update statistics
   for the initial and final join cases
 - to write and read the statistics we provide wal_stat_update and wal_stat_ack
   helpers to cover the case where a single ACK spans several transactions
v8:
 - one branch less in apply_synchro_row()
 - keep applier_txn_start_tm inside the replica structure
 - rename wal_stat to replica_cb_data since this is more
   logical for the case where we have no general stat engine
 - make the applier send a timestamp so that the relay computes
   the delta upon reading it; the lag is kept until a new
   write happens
 - extend the doc and changelog a bit
 - keep reading the relay's lag from the TX thread without any modifications:
   the relay is deleted from the TX thread and set to a non-RELAY_FOLLOW
   state, so any attempt to read it afterwards won't succeed. To be honest,
   there is a small race window: writes of doubles are not atomic, so we
   might read a partially updated timestamp, the same as we already have with
   the @idle field. I think this should be addressed separately, preferably
   without the heavy cmsg engine involved, using an rw lock or plain atomics
   instead.
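
For illustration only, the "plain atomics" option mentioned in the last item
could look roughly like the sketch below. It is not code from this series:
struct lag_cell and both helpers are hypothetical names, and the sketch assumes
a 64-bit double and C11 <stdatomic.h>. The value is stored as its bit pattern
so that a reader in another thread never observes a half-written double.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t),
	       "the sketch assumes a 64-bit double");

/* Hypothetical holder for a lag value shared between two threads. */
struct lag_cell {
	/* Bit pattern of the double; loaded and stored as one unit. */
	_Atomic uint64_t bits;
};

/* Writer side: publish a new lag value in a single atomic store. */
static void
lag_cell_store(struct lag_cell *cell, double lag)
{
	uint64_t bits;
	assert(cell != NULL);
	memcpy(&bits, &lag, sizeof(bits));
	atomic_store_explicit(&cell->bits, bits, memory_order_relaxed);
}

/* Reader side: fetch the whole value in a single atomic load. */
static double
lag_cell_load(struct lag_cell *cell)
{
	uint64_t bits;
	double lag;
	assert(cell != NULL);
	bits = atomic_load_explicit(&cell->bits, memory_order_relaxed);
	memcpy(&lag, &bits, sizeof(lag));
	return lag;
}
```

Relaxed ordering is enough here only under the assumption that the reader
needs the value itself and nothing else is published along with it.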

v9 (Vlad and Serge):
 - the transaction lag read by the TX thread is now updated via a cbus message
 - use the last timestamp from the transaction for accounting
 - verify that we really need to test for the replica being non-nil in
   the applier reader
 - update docs
 - update the test case
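
As a rough sketch of the accounting described in the v8 and v9 notes (not the
actual patch code): the applier echoes a WAL timestamp taken from the
transaction back in its ACK, and the relay computes the lag as the difference
between its current time and that timestamp. struct row, txn_ack_tm and
relay_lag_on_ack are hypothetical names, and treating a zero timestamp as
"nothing to account" is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row: only the WAL write timestamp matters here. */
struct row {
	double tm;
};

/*
 * Applier side: pick the timestamp to echo in the ACK. Following the
 * v9 note, the last row of the transaction is used.
 */
static double
txn_ack_tm(const struct row *rows, size_t count)
{
	assert(rows != NULL && count > 0);
	return rows[count - 1].tm;
}

/*
 * Relay side: update the lag when an ACK is read. An ACK without a
 * timestamp (assumed to be 0 here) leaves the previous lag intact,
 * so the value persists until a new write is acknowledged.
 */
static void
relay_lag_on_ack(double *lag, double ack_tm, double now)
{
	assert(lag != NULL);
	if (ack_tm > 0)
		*lag = now - ack_tm;
}
```

Both timestamps in the subtraction come from the same node in this sketch,
since the echoed value originates from that node's WAL, so no clock
synchronization between the instances is assumed.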

branch gorcunov/gh-5447-relay-lag-9
issue https://github.com/tarantool/tarantool/issues/5447

Cyrill Gorcunov (2):
  applier: send transaction's first row WAL time in the applier_writer_f
  relay: provide information about downstream lag

 .../unreleased/gh-5447-downstream-lag.md      |   6 +
 src/box/applier.cc                            |  97 +++++++++++--
 src/box/lua/info.c                            |   3 +
 src/box/relay.cc                              |  94 ++++++++++++-
 src/box/relay.h                               |   6 +
 src/box/replication.cc                        |   1 +
 src/box/replication.h                         |   5 +
 .../replication/gh-5447-downstream-lag.result | 128 ++++++++++++++++++
 .../gh-5447-downstream-lag.test.lua           |  57 ++++++++
 9 files changed, 378 insertions(+), 19 deletions(-)
 create mode 100644 changelogs/unreleased/gh-5447-downstream-lag.md
 create mode 100644 test/replication/gh-5447-downstream-lag.result
 create mode 100644 test/replication/gh-5447-downstream-lag.test.lua


base-commit: b5f0dc4db9aef9618f56b0bcb4a7b82a59591784
-- 
2.31.1


