[tarantool-patches] Re: [tarantool-patches] Re: [PATCH v2 1/2] box: added replication_dead/rw_gap options
arkholga at tarantool.org
Tue Oct 23 21:32:25 MSK 2018
On 23/10/2018 10:10, Konstantin Osipov wrote:
> * Olga Arkhangelskaia < arkholga at tarantool.org > [18/10/13 08:20]:
>> In scope of gh-3110 we need options that store periods of time,
>> to be compared with time of last activity of relay and applier.
>> This patch introduces replication_dead_gap and replication_rw_gap options.
>> replication_dead_gap is configured in box.cfg, with default 0 value.
>> If time that passed from now till last reader/writer activity of given replica
>> exceeds replication_dead_gap value, replica is suspected to be dead.
>> replication_dead_gap is measured in hours.
>> replication_rw_gap is configured in box.cfg, with default 0 value.
>> If time difference between last reader activity and last writer activity of
>> given replica exceeds replication_rw_gap value, replica is suspected to be dead.
>> replication_rw_gap is measured in hours.
> Why do we need this if we have heartbeats?
I used to think that we need some user-settable parameters
to check that a replica is not active.
For example, if a replica is not active for XXXX seconds, it is dead.
However, I had not thought about passing this parameter as a
function argument: list_dead_replicas(XXXX). So I will drop the options.
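
To make the function-argument idea concrete, here is a minimal sketch in
Python (not Tarantool's Lua/C, and not the actual patch). Only the name
list_dead_replicas comes from this thread; the replicas mapping, the
max_idle and now parameters, and the choice to use the most recent of the
read and write timestamps are all assumptions for illustration.

```python
import time


def list_dead_replicas(replicas, max_idle, now=None):
    """Return ids of replicas idle for more than max_idle seconds.

    replicas is a hypothetical mapping: id -> (last_read, last_write),
    both timestamps in seconds on the same clock as `now`.
    """
    if now is None:
        now = time.monotonic()
    dead = []
    for replica_id, (last_read, last_write) in sorted(replicas.items()):
        # Assumption: a replica counts as active if either side was active.
        last_activity = max(last_read, last_write)
        if now - last_activity > max_idle:
            dead.append(replica_id)
    return dead
```

With the threshold passed per call, different callers can use different
limits without touching box.cfg.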
Another question worth discussing is what kind of statistics to use
for declaring a replica dead.
There are two ways. The first is to save the time of the last write/read
by applier and relay. I implemented it, but as Vova pointed out, maybe we
need to save the period of time that a replica spends in stopped status
instead. So we decided to do the statistics in a separate patch set,
implement both ways, and then decide. However, maybe you have better ideas.
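
A rough Python sketch of the two candidate statistics side by side, only
to show how they differ. The class and method names here are hypothetical
and do not correspond to anything in the Tarantool source.

```python
class ReplicaStats:
    """Hypothetical per-replica bookkeeping for both candidate statistics."""

    def __init__(self):
        self.last_activity = None  # variant 1: time of last read/write
        self.stopped_since = None  # variant 2: start of the current stop

    def on_activity(self, now):
        # Called whenever applier or relay reads or writes.
        self.last_activity = now

    def on_stop(self, now):
        # Remember only the beginning of the current stopped period.
        if self.stopped_since is None:
            self.stopped_since = now

    def on_resume(self):
        self.stopped_since = None

    def idle_time(self, now):
        """Variant 1: seconds since last activity, None if never active."""
        if self.last_activity is None:
            return None
        return now - self.last_activity

    def stopped_time(self, now):
        """Variant 2: seconds spent in the current stopped period."""
        if self.stopped_since is None:
            return 0.0
        return now - self.stopped_since
```

The two values can diverge: a replica may be connected but silent (large
idle time, zero stopped time), which is why it may be worth comparing both.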
> And with swim on board we will have gossip information about entire replica set?
I have read about swim, and as I understand it:
if we have a replica set with some topology other than full-mesh, we can
save the dead replicas mask, numbers, etc. (obtained by calling
list_dead_replicas on some of the replicas), and in the end, after some
querying, we will definitely have information about every replica
in the set.
Is that what you mean?
If not, can you be more specific?