[Tarantool-patches] [RFC v3 0/3] relay: provide downstream lag information
Cyrill Gorcunov
gorcunov at gmail.com
Fri Apr 30 18:39:37 MSK 2021
Guys, this is *NOT* for merging but rather to gather comments on the
code structure and overall idea.
Here is the code flow, as a memory refresher:
MASTER NODE
===========
TX
==
main.sched
  |
  `- box_process_rw
^      `- txn_commit
|           `- alloc xrow
|           `- journal_write
|                `- wal_assign_lsn
|                `- write to disk
|                `- wal_notify_watchers
|                     |
+---------------+     wakeup relay thread
                |
                v
RELAY THREAD
============
relay_subscribe_f
  `- relay_reader_f
  |    `- coio_read_xrow_timeout_xc <------------------------------+
  |                                                                |
  `- relay_process_wal_event                                       |
       `- recover_remaining_wals                                   |
            `- relay_send                                          |
                          |                                        |
                          | read xrows from disk                   |
                          | and send them to replica's             |
                          | applier                                |
                          |                                        |
                          |                                        |
REPLICA NODE              |                                        |
============              |                                        ^
TX                        |                                        |
==                        |                                        |
main.sched                |                                        |
  `- applier_apply_tx <---+                                        |
  |    `- apply_synchro_row (if CONFIRM | ROLLBACK)                |
  |    |    `- journal_write                                       |
  |    |    `- applier->first_row_wal_time from xrow::tm           |
  |    `- apply_plain_tx                                           |
  |         `- txn_commit_try_async                                |
  |              `- applier_txn_wal_write_cb                       |
  |                   `- applier->first_row_wal_time from xrow::tm |
  |                                                                |
  `- applier_writer_f                                              |
       `- xrow_encode_vclock_timed(applier->first_row_wal_time)    |
            `- coio_write_xrow ------------------------------------+
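To make the round trip above easier to follow, here is a minimal standalone sketch of the idea in C. It is not code from the series: the struct and function names (ack_msg, applier_state, relay_state, applier_on_wal_write, applier_make_ack, relay_on_ack) are hypothetical, and it assumes the relay derives the lag as its current time minus the timestamp echoed back in the ACK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ACK the applier writer sends back. */
struct ack_msg {
	/* Timestamp taken from xrow::tm of the first row of the applied tx. */
	double tm;
};

/* Hypothetical replica-side state, mirroring applier->first_row_wal_time. */
struct applier_state {
	double first_row_wal_time;
};

/* Hypothetical master-side state kept per relay. */
struct relay_state {
	double lag;
};

/* Replica: remember the row timestamp once the tx reached the local WAL. */
static void
applier_on_wal_write(struct applier_state *a, double row_tm)
{
	assert(a != NULL);
	a->first_row_wal_time = row_tm;
}

/* Replica: build an ACK that echoes the remembered timestamp. */
static struct ack_msg
applier_make_ack(const struct applier_state *a)
{
	assert(a != NULL);
	struct ack_msg ack = { .tm = a->first_row_wal_time };
	return ack;
}

/* Master: assumed lag formula, "now" minus the echoed timestamp. */
static void
relay_on_ack(struct relay_state *r, const struct ack_msg *ack, double now)
{
	assert(r != NULL && ack != NULL);
	r->lag = now - ack->tm;
}
```

In this sketch a row stamped at 100.0 on the master and acknowledged when the master clock reads 100.25 yields a lag of 0.25 seconds. Both timestamps come from the master's clock, so no clock synchronization between the nodes is needed.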
Typical output is something like
(freshly started)
| tarantool> box.info.replication
| ---
| - 1:
|     id: 1
|     uuid: f94edca8-71d4-46c9-b9d2-620a6a2bd977
|     lsn: 121
|   2:
|     id: 2
|     uuid: f6ac84e1-a040-48d9-a9c7-f8147b8e2c9e
|     lsn: 0
|     upstream:
|       status: follow
|       idle: 0.56554910800332
|       peer: replicator@127.0.0.1:3302
|       lag: 0.00021719932556152
|     downstream:
|       status: follow
|       idle: 0.52823433600133
|       vclock: {1: 121}
|       lag: 0
| ...
After new data is sent
| tarantool> box.space.sync:insert{55}
| ---
| - [55]
| ...
| tarantool> box.info.replication
| ---
| - 1:
|     id: 1
|     uuid: f94edca8-71d4-46c9-b9d2-620a6a2bd977
|     lsn: 123
|   2:
|     id: 2
|     uuid: f6ac84e1-a040-48d9-a9c7-f8147b8e2c9e
|     lsn: 0
|     upstream:
|       status: follow
|       idle: 0.96756215799542
|       peer: replicator@127.0.0.1:3302
|       lag: 0.0002143383026123
|     downstream:
|       status: follow
|       idle: 0.31903971399879
|       vclock: {1: 123}
|       lag: 0.0010807514190674
| ...
Please take a look at the applier notifications structure and naming. Actually,
I don't really like the `downstream.lag` name either: as far as I understand,
it is not a counterpart of `upstream.lag` but rather the packet traversal time,
so maybe `downstream.wal-lag` would be more suitable? Also, during idle cycles
`downstream.lag` does not change, which might confuse readers because
`upstream.lag` does.
Anyway, any comments on the idea and the code structure would be highly appreciated.
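To illustrate the idle-cycle behaviour mentioned above, here is a small hypothetical sketch in C. It is not taken from the patches: the names downstream_stat and downstream_on_ack are invented, and it assumes that an ACK sent while nothing new was applied either carries a zero timestamp or repeats the previous one, so the master keeps showing the last measured value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical master-side statistics for one downstream. */
struct downstream_stat {
	double lag;     /* last measured lag, in seconds */
	double last_tm; /* timestamp echoed by the ACK that produced it */
};

/*
 * Process one ACK and return true if the lag was refreshed.
 * Assumption: idle ACKs carry tm == 0 or repeat the previous tm,
 * so they leave the stored lag untouched.
 */
static bool
downstream_on_ack(struct downstream_stat *s, double ack_tm, double now)
{
	assert(s != NULL);
	if (ack_tm <= 0 || ack_tm == s->last_tm)
		return false;
	s->last_tm = ack_tm;
	s->lag = now - ack_tm;
	return true;
}
```

With such logic the reported value stays frozen between writes, unlike `upstream.lag`, which is refreshed on every incoming row including heartbeats. A reader would have to look at `downstream.idle` as well to tell a stale value from a fresh one.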
Again, this series is not for merging because there are no docs and no tests yet;
I did manual testing only.
Previous version https://lists.tarantool.org/tarantool-patches/20210201100037.212301-1-gorcunov@gmail.com/
branch: gorcunov/gh-5447-relay-lag-3
issue: https://github.com/tarantool/tarantool/issues/5447
Cyrill Gorcunov (3):
  xrow: allow to pass timestamp via xrow_encode_vclock_timed helper
  applier: send first row's WAL time in the applier_writer_f
  relay: provide information about downstream lag
 src/box/applier.cc | 84 ++++++++++++++++++++++++++++++++++++++--------
 src/box/applier.h  |  5 +++
 src/box/lua/info.c |  3 ++
 src/box/relay.cc   | 46 ++++++++++++++++++++++---
 src/box/relay.h    |  3 ++
 src/box/xrow.c     |  5 ++-
 src/box/xrow.h     | 21 ++++++++++--
 7 files changed, 146 insertions(+), 21 deletions(-)
base-commit: 7fd53b4c5264bdbc8f01858409abe52bc38764c8
--
2.30.2