From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@dev.tarantool.org, gorcunov@gmail.com,
	sergepetrenko@tarantool.org
Subject: [Tarantool-patches] [PATCH v2 0/4] Boot with anon
Date: Tue, 15 Sep 2020 01:11:26 +0200
Message-ID: <cover.1600124767.git.v.shpilevoy@tarantool.org>
The patchset attempts to address the problem of anonymous replicas being
registered in _cluster if they are present during bootstrap.
The bug was found while working on another issue related to Raft. The problem
is that Raft won't work properly during bootstrap if non-joined replicas are
registered in _cluster.
When their auto-registration by the applier was removed, the anon bug was found.
The auto-registration removal is trivial, but it breaks the cluster bootstrap in
another way, creating false-positive XlogGap errors. See the second commit for
an explanation. To solve the issue, quite a radical solution is applied: gap
errors are not considered critical anymore and can be retried. I am not sure
that is the best option, but I couldn't come up with anything better after a
long struggle with it.
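To illustrate the idea of a retryable gap error, here is a minimal sketch in
Python. It is not the applier code from this series: the class names, the
run_applier() helper, the attempt limit and the 'follow' status for a healthy
upstream are all assumptions made for the example. Only the name XlogGapError
and the 'loading' status come from this letter.

```python
# Illustrative model only: every name below except XlogGapError is
# hypothetical and does not mirror Tarantool's applier implementation.


class XlogGapError(Exception):
    """The rows requested by the replica are missing from the master's xlogs."""


class FatalReplicationError(Exception):
    """Stands in for any error that should still stop the applier."""


def run_applier(subscribe, max_attempts):
    """Call subscribe() until it succeeds, retrying only on XlogGapError.

    Returns (status, attempts). The status is 'follow' on success and
    'loading' if the gap was still present when the attempts ran out.
    Any other exception propagates to the caller unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            subscribe()
            return "follow", attempt
        except XlogGapError:
            # Not critical anymore: another instance may deliver the
            # missing rows, so the upstream is simply restarted.
            continue
    return "loading", max_attempts
```

With this model, a subscribe() that hits a gap twice and then succeeds yields
("follow", 3), while any other error type still aborts on the first attempt.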
This is a bug, so whatever we come up with in the end should be pushed to the
older versions too.
Branch: http://github.com/tarantool/tarantool/tree/gerold103/gh-5287-anon-false-register
Issue: https://github.com/tarantool/tarantool/issues/5287
Changes in v2:
- Anon status is stored as a flag again. In v1 it was stored as an enum, but an
  alternative solution was proposed, where the enum is not needed.
- Ballot now has a new field is_anon. It helps to avoid the enum and to set the
  replica->anon flag to the correct value right when the replica becomes
  connected, whether through relay or applier.
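The role of the new ballot field can be sketched roughly as follows. This is a
simplified Python model, not the C code of the patch: the Ballot and Replica
classes, on_connect() and should_register() are hypothetical names, and only
is_anon and the replica's anon flag are taken from this letter.

```python
from dataclasses import dataclass


@dataclass
class Ballot:
    # Only is_anon is taken from the cover letter; a real ballot
    # carries more fields, which are omitted here.
    is_anon: bool = False


@dataclass
class Replica:
    uuid: str
    anon: bool = False


def on_connect(replica, ballot):
    """Copy the anon status from the peer's ballot as soon as it connects."""
    replica.anon = ballot.is_anon
    return replica


def should_register(replica):
    """Assumed rule: an anonymous replica gets no row in _cluster."""
    return not replica.anon
```

Because the flag is known at connection time, a plain boolean is enough and no
intermediate "unknown" state is needed in this model.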
@ChangeLog
* Anonymous replica could be registered and could prevent WAL files removal (gh-5287).
* XlogGapError is not a critical error anymore. This means box.info.replication will show the upstream status as 'loading' if the error was found. The upstream will be restarted until the error is resolved automatically with the help of another instance, or until the replica is removed from box.cfg.replication (gh-5287).
Vladislav Shpilevoy (4):
  xlog: introduce an error code for XlogGapError
  replication: retry in case of XlogGapError
  replication: add is_anon flag to ballot
  replication: do not register outgoing connections
 src/box/applier.cc                            |  40 ++++
 src/box/box.cc                                |  30 +--
 src/box/errcode.h                             |   2 +
 src/box/error.cc                              |   2 +
 src/box/error.h                               |   1 +
 src/box/iproto_constants.h                    |   1 +
 src/box/recovery.h                            |   2 -
 src/box/replication.cc                        |  14 +-
 src/box/xrow.c                                |  14 +-
 src/box/xrow.h                                |   5 +
 test/box/error.result                         |   2 +
 test/replication/autobootstrap_anon.lua       |  25 +++
 test/replication/autobootstrap_anon1.lua      |   1 +
 test/replication/autobootstrap_anon2.lua      |   1 +
 test/replication/force_recovery.result        | 110 -----------
 test/replication/force_recovery.test.lua      |  43 -----
 test/replication/gh-5287-boot-anon.result     |  81 +++++++++
 test/replication/gh-5287-boot-anon.test.lua   |  30 +++
 test/replication/prune.result                 |  18 +-
 test/replication/prune.test.lua               |   7 +-
 test/replication/replica.lua                  |   2 +
 test/replication/replica_rejoin.result        |   6 +-
 test/replication/replica_rejoin.test.lua      |   4 +-
 .../show_error_on_disconnect.result           |   2 +-
 .../show_error_on_disconnect.test.lua         |   2 +-
 test/xlog/panic_on_wal_error.result           | 171 ------------------
 test/xlog/panic_on_wal_error.test.lua         |  75 --------
 27 files changed, 262 insertions(+), 429 deletions(-)
 create mode 100644 test/replication/autobootstrap_anon.lua
 create mode 120000 test/replication/autobootstrap_anon1.lua
 create mode 120000 test/replication/autobootstrap_anon2.lua
 delete mode 100644 test/replication/force_recovery.result
 delete mode 100644 test/replication/force_recovery.test.lua
 create mode 100644 test/replication/gh-5287-boot-anon.result
 create mode 100644 test/replication/gh-5287-boot-anon.test.lua
 delete mode 100644 test/xlog/panic_on_wal_error.result
 delete mode 100644 test/xlog/panic_on_wal_error.test.lua
-- 
2.21.1 (Apple Git-122.3)
Thread overview: 17+ messages
2020-09-14 23:11 Vladislav Shpilevoy [this message]
2020-09-14 23:11 ` [Tarantool-patches] [PATCH v2 1/4] xlog: introduce an error code for XlogGapError Vladislav Shpilevoy
2020-09-15  7:53   ` Serge Petrenko
2020-09-14 23:11 ` [Tarantool-patches] [PATCH v2 2/4] replication: retry in case of XlogGapError Vladislav Shpilevoy
2020-09-15  7:35   ` Serge Petrenko
2020-09-15 21:23     ` Vladislav Shpilevoy
2020-09-16 10:59       ` Serge Petrenko
2020-09-14 23:11 ` [Tarantool-patches] [PATCH v2 3/4] replication: add is_anon flag to ballot Vladislav Shpilevoy
2020-09-15  7:46   ` Serge Petrenko
2020-09-15 21:22     ` Vladislav Shpilevoy
2020-09-16 10:59       ` Serge Petrenko
2020-09-14 23:11 ` [Tarantool-patches] [PATCH v2 4/4] replication: do not register outgoing connections Vladislav Shpilevoy
2020-09-15  7:50   ` Serge Petrenko
2020-09-17 12:08 ` [Tarantool-patches] [PATCH v2 0/4] Boot with anon Kirill Yukhin
2020-09-17 13:00   ` Vladislav Shpilevoy
2020-09-17 15:04     ` Kirill Yukhin
2020-09-17 16:42       ` Vladislav Shpilevoy