Date: Sat, 28 Apr 2018 23:45:57 +0300
From: Konstantin Osipov
Subject: [tarantool-patches] Re: [PATCH] replication: display correct status at upstream
Message-ID: <20180428204557.GB4857@atlas>
References: <20180427103938.96974-1-k.belyavskiy@tarantool.org> <20180428203036.GA4857@atlas>
In-Reply-To: <20180428203036.GA4857@atlas>
Reply-To: tarantool-patches@freelists.org
List-Id: tarantool-patches
To: tarantool-patches@freelists.org
Cc: georgy@tarantool.org

* Konstantin Osipov [18/04/28 23:30]:
> * Konstantin Belyavskiy [18/04/27 13:43]:
> > This fix improves 'box.info.replication' output.
> > If downstream fails and thus disconnects from upstream, improve
> > logging by printing 'status: disconnected'.
> > Add relay_state { NONE, CONNECTED, DISCONNECTED } to track replica
> > presence, once connected it either CONNECTED or DISCONNECTED until
> > master is reset.
> >
> > Closes #3365
>
> Hi,
>
> the patch is very good, almost ready to push, but I would like to
> request more work.
>
> First, a very minor reason why I'm not pushing it right away is
> that I believe relay states should be matched with applier states.
>
> The matching applier states are OFF, FOLLOW and STOPPED.
> I think it would be easier for users if we don't invent new states
> on relay side.
>
> Second, we allocate relay object on stack; this seems to be a historical
> artifact, we have had struct relay before we got struct replica.
> Relay has a diagnostics area, so by keeping the relay around we
> will be able to display the last error in a way similar to
> applier.
>
> I'm not talking about pushing the message back from the applier to
> relay - this seems to be an unnecessary hassle and will complicate
> things quite a bit.

One more thing. As I wrote earlier, it is difficult or impossible
to reliably deliver the error message from the applier to the relay.
But here is what we could do on the relay side: if we see that the
state is "stopped" and there is no message in relay->diag, we could
set a helper error ER_RELAY_STOPPED "Please check peer upstream
status in box.info.replication for reason". This would direct users
towards finding the cause of the problem.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
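
P.S. To make the fallback idea concrete, here is a rough, self-contained
sketch in C. It is not Tarantool code and not a proposed patch: struct
relay_diag, relay_state_str(), relay_last_error() and the RELAY_* and
ER_RELAY_STOPPED_MSG names are made up for illustration. Only the state
names (OFF, FOLLOW, STOPPED), the idea of a diagnostics area kept in the
relay, and the ER_RELAY_STOPPED text come from the discussion above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Relay states named after the applier states (OFF, FOLLOW, STOPPED). */
enum relay_state {
	RELAY_OFF,
	RELAY_FOLLOW,
	RELAY_STOPPED,
};

/* Simplified stand-in for the relay diagnostics area (hypothetical). */
struct relay_diag {
	char msg[256]; /* an empty string means "no error recorded" */
};

/* Hypothetical, heavily reduced relay object kept around after stop. */
struct relay {
	enum relay_state state;
	struct relay_diag diag;
};

/* Assumed text for the proposed ER_RELAY_STOPPED helper error. */
static const char *ER_RELAY_STOPPED_MSG =
	"Please check peer upstream status in box.info.replication for reason";

/* Map a state to the string a status report could display. */
const char *
relay_state_str(enum relay_state state)
{
	switch (state) {
	case RELAY_OFF:
		return "off";
	case RELAY_FOLLOW:
		return "follow";
	case RELAY_STOPPED:
		return "stopped";
	}
	return "unknown";
}

/*
 * Return the last error recorded for the relay, or NULL if there is
 * none. If the relay is stopped but nothing was recorded, store the
 * helper message first so the user is pointed at the peer's status.
 */
const char *
relay_last_error(struct relay *relay)
{
	assert(relay != NULL);
	if (relay->state == RELAY_STOPPED && relay->diag.msg[0] == '\0') {
		snprintf(relay->diag.msg, sizeof(relay->diag.msg), "%s",
			 ER_RELAY_STOPPED_MSG);
	}
	return relay->diag.msg[0] != '\0' ? relay->diag.msg : NULL;
}
```

The point of the sketch is only the ordering: a real error already
stored in the diagnostics area always wins, and the helper text is
used solely when the relay is stopped and has nothing better to show.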