[tarantool-patches] [PATCH v2 0/2] detect and throw away dead replicas

Olga Arkhangelskaia arkholga at tarantool.org
Fri Oct 12 22:45:55 MSK 2018


Following the previous discussions, the way a replica's bad state is detected
has changed completely. We now maintain two time differences: between now and
the last activity of the applier, and between now and the last activity of the
relay.
These values can be found in box.info.replication.lar/law.
We use hours, but I still have some doubts; maybe we should display days,
hours and minutes.
Lar/law are compared with replication_dead/rw_gap, which have to be configured
beforehand via box.cfg. The open question here: I am no longer sure about
replication_rw_gap. The reason I added this parameter is the idea that if, in
the master case, the difference between applier and relay activity is too big,
there is a good chance that something is wrong with the replica.
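
To make the comparison above concrete, here is a rough sketch in Python. It is
not the code from the patch: the function and parameter names are hypothetical,
and the exact rule (any idle time above the dead gap, or applier/relay drift
above the rw gap) is only my assumption of how the two options could combine.

```python
def is_replica_dead(applier_idle_hours, relay_idle_hours,
                    dead_gap_hours, rw_gap_hours):
    """Hypothetical dead-replica check; all names here are illustrative.

    applier_idle_hours / relay_idle_hours: hours since the last applier
    and relay activity (the values the letter exposes as lar/law).
    dead_gap_hours / rw_gap_hours: thresholds standing in for the
    replication_dead/rw_gap options.
    """
    # Assumption: either side being silent longer than the dead gap
    # is enough to treat the replica as dead.
    if applier_idle_hours > dead_gap_hours or relay_idle_hours > dead_gap_hours:
        return True
    # Assumption: the rw gap bounds how far applier and relay activity
    # may drift apart before the replica is considered suspicious.
    return abs(applier_idle_hours - relay_idle_hours) > rw_gap_hours
```

With thresholds of 24 and 12 hours, a replica whose applier and relay were both
active an hour ago passes, while one with 20 and 2 hours of idle time is
flagged by the rw gap alone.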

The last problem I want to discuss is the test cases: the test takes too much
time, and there is no separate case for the applier. I mean that relay and
rw_gap can be tested separately by turning off replication and tuning the gap
parameters; however, I do not see a case where only lar is lagging seriously.

If you have ideas on how to make this functionality better, please share. I
will be glad to see other opinions.
---
Branch:
https://github.com/tarantool/tarantool/tree/OKriw/gh-3110-prune-dead-replica-from-replicaset-1.10
Issue:
https://github.com/tarantool/tarantool/issues/3110

v1:
https://www.freelists.org/post/tarantool-patches/PATCH-rfc-schema-add-possibility-to-find-and-throw-away-dead-replicas

Changes v2:
- changed the way replica death is detected
- added special box options
- changed test
- now only dead replicas are shown
- added function to throw away any replica

Olga Arkhangelskaia (2):
  box: added replication_dead/rw_gap options
  ctl: added functionality to detect and prune dead replicas

 src/box/CMakeLists.txt         |   1 +
 src/box/box.cc                 |  34 ++++++
 src/box/box.h                  |   2 +
 src/box/lua/cfg.cc             |  24 +++++
 src/box/lua/ctl.lua            |  58 ++++++++++
 src/box/lua/info.c             |  10 ++
 src/box/lua/init.c             |   2 +
 src/box/lua/load_cfg.lua       |   8 ++
 src/box/relay.cc               |   6 ++
 src/box/relay.h                |   4 +
 src/box/replication.cc         |   3 +-
 src/box/replication.h          |  12 +++
 test/box/admin.result          |   4 +
 test/box/cfg.result            |   8 ++
 test/replication/trim.lua      |  66 ++++++++++++
 test/replication/trim.result   | 237 +++++++++++++++++++++++++++++++++++++++++
 test/replication/trim.test.lua |  93 ++++++++++++++++
 test/replication/trim1.lua     |   1 +
 test/replication/trim2.lua     |   1 +
 test/replication/trim3.lua     |   1 +
 test/replication/trim4.lua     |   1 +
 21 files changed, 575 insertions(+), 1 deletion(-)
 create mode 100644 src/box/lua/ctl.lua
 create mode 100644 test/replication/trim.lua
 create mode 100644 test/replication/trim.result
 create mode 100644 test/replication/trim.test.lua
 create mode 120000 test/replication/trim1.lua
 create mode 120000 test/replication/trim2.lua
 create mode 120000 test/replication/trim3.lua
 create mode 120000 test/replication/trim4.lua

-- 
2.14.3 (Apple Git-98)





