OK, so we can keep the relay state around and read the last error from it.
For example, on a disconnect (broken pipe) we get SocketError,
and on an LSN gap we get XlogGapError.
So for diagnostics, if the state is STOPPED, we just print the last error from the relay.

So the changes to make are:
1. Align the relay states with the applier states (OFF, FOLLOW and STOPPED).
2. Do not destroy the relay on error.
3. To simplify diagnostics, have box.info.replication print the last error if the state is STOPPED.
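
A rough sketch of how the three points above could fit together, just to make the idea concrete. All names here (relay_status, relay_stop_with_error, relay_info) and the output format are made up for illustration and are not the real relay code or the real box.info.replication output:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; the states mirror the applier states listed above. */
enum relay_state { RELAY_OFF, RELAY_FOLLOW, RELAY_STOPPED };

struct relay_status {
	enum relay_state state;
	/* Last error text; stays readable after the relay has stopped. */
	char last_error[128];
};

static const char *
relay_state_str(enum relay_state state)
{
	switch (state) {
	case RELAY_OFF: return "off";
	case RELAY_FOLLOW: return "follow";
	case RELAY_STOPPED: return "stopped";
	}
	return "unknown";
}

/* Record the error and mark the relay STOPPED instead of destroying it. */
static void
relay_stop_with_error(struct relay_status *relay, const char *errmsg)
{
	assert(relay != NULL && errmsg != NULL);
	relay->state = RELAY_STOPPED;
	snprintf(relay->last_error, sizeof(relay->last_error), "%s", errmsg);
}

/* Format a status line; the error message is added only for STOPPED. */
static int
relay_info(const struct relay_status *relay, char *buf, size_t size)
{
	assert(relay != NULL && buf != NULL);
	if (relay->state == RELAY_STOPPED)
		return snprintf(buf, size, "status: %s, message: %s",
				relay_state_str(relay->state),
				relay->last_error);
	return snprintf(buf, size, "status: %s",
			relay_state_str(relay->state));
}
```

In the real code the message would presumably come from the relay's diagnostics area rather than from a fixed-size buffer; the buffer only keeps the sketch self-contained.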

Kostya, Georgy, are you OK with this?


Saturday, 28 April 2018, 23:30 +03:00 from Konstantin Osipov <kostja@tarantool.org>:

* Konstantin Belyavskiy <k.belyavskiy@tarantool.org> [18/04/27 13:43]:
> This fix improves 'box.info.replication' output.
> If downstream fails and thus disconnects from upstream, improve
> logging by printing 'status: disconnected'.
> Add relay_state { NONE, CONNECTED, DISCONNECTED } to track replica
> presence; once connected, it is either CONNECTED or DISCONNECTED until
> the master is reset.
>
> Closes #3365

Hi,

the patch is very good, almost ready to push, but I would like to
request more work.

First, a very minor reason why I'm not pushing it right away is
that I believe relay states should be matched with applier states.

The matching applier states are OFF, FOLLOW and STOPPED.
I think it would be easier for users if we don't invent new states
on relay side.

Second, we allocate the relay object on the stack; this seems to be a historical
artifact, since we had struct relay before we got struct replica.
Relay has a diagnostics area, so by keeping the relay around we
will be able to display the last error in a way similar to
applier.
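
To illustrate the lifetime point, here is a minimal sketch that assumes the relay is embedded in the replica instead of living on the stack of the relay session. The types and functions (diag_sketch, relay_sketch, replica_sketch, relay_session_sketch) are simplified stand-ins, not the actual Tarantool definitions:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified types; not Tarantool's real definitions. */
struct diag_sketch {
	char msg[128];
};

struct relay_sketch {
	struct diag_sketch diag;
};

struct replica_sketch {
	/* Owned by the replica, so it outlives a single relay session. */
	struct relay_sketch relay;
};

/*
 * Stand-in for a relay session that fails. The error is stored in
 * replica->relay.diag, so it remains available after the function
 * returns; a stack-allocated relay would be gone by then.
 */
static int
relay_session_sketch(struct replica_sketch *replica, const char *errmsg)
{
	assert(replica != NULL && errmsg != NULL);
	snprintf(replica->relay.diag.msg, sizeof(replica->relay.diag.msg),
		 "%s", errmsg);
	return -1;
}

/* What a status report could read once the session is over. */
static const char *
replica_last_relay_error(const struct replica_sketch *replica)
{
	assert(replica != NULL);
	return replica->relay.diag.msg;
}
```

The error is then read from the master's own relay object, without any message passed back from the applier.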

I'm not talking about pushing the message back from the applier to
relay - this seems to be an unnecessary hassle and will complicate
things quite a bit.


--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov



Best regards,
Konstantin Belyavskiy
k.belyavskiy@tarantool.org