Tarantool development patches archive
From: Konstantin Osipov <kostja.osipov@gmail.com>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v5 3/5] box/applier: fix nil dereference in applier rollback
Date: Wed, 5 Feb 2020 12:55:24 +0300
Message-ID: <20200205095524.GD4624@atlas>
In-Reply-To: <20200205082721.GJ12445@uranus>

* Cyrill Gorcunov <gorcunov@gmail.com> [20/02/05 11:31]:
> > I don't understand this comment. How can it be lost exactly?
> 
> Hmmm, I think you're right. Actually, unweaving all the possible
> call traces by hand (which I had to do) is quite an exhausting task,
> so I might be wrong here.

It was added by the parallel applier patch, so the most likely cause
of the bug is that this way of cancelling parallel appliers on
conflict was not tested well enough. Previously we only had a
single applier per peer, so we did not need to coordinate.

> > Let's begin by explaining why we need to cancel the reader fiber here.
> 
> This fiber_cancel was already here; I only added diag_set(FiberIsCancelled)
> to throw an exception so that the caller would zap this applier fiber.
> Actually, I think we could retry instead of reaping the fiber
> completely, but that requires a deeper understanding of how the applier
> works. So I left it in a comment.

No, we can't and shouldn't retry. Retry handling is done elsewhere already - see replication_skip_conflict.

If replication is stopped by an apply error, it's most likely a
transaction conflict, indicating that the active-active setup is
broken, so it has to be resolved by a DBA (who can set
replication_skip_conflict). This is why it's critical to preserve
the original error.
> 
> Not exactly. If I understand the initial logic of this applier
> try/catch branch, we need to set up replicaset.applier.diag and
> then, on FiberIsCancelled, we should move it from replicaset.applier.diag
> back to the current fiber->diag.

Please dig into what is "current" here. Which fiber is current if
there are many fibers handling a single peer?
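
A minimal sketch of the "preserve the original error" idea, in plain C
rather than Tarantool code. It assumes one shared error slot per peer in
which the first recorded error wins, and it passes each worker's own
diagnostics area explicitly, so nothing depends on which fiber happens
to be "current". All names (toy_diag, toy_shared_diag_set_once,
toy_worker_on_cancel, toy_scenario) and the error strings are
hypothetical; none of this is the actual applier implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a diagnostics area; not Tarantool's struct diag. */
struct toy_diag {
	bool is_set;
	char msg[64];
};

/* Store an error message in a diagnostics area, overwriting any old one. */
static void
toy_diag_set(struct toy_diag *d, const char *msg)
{
	assert(d != NULL && msg != NULL);
	d->is_set = true;
	snprintf(d->msg, sizeof(d->msg), "%s", msg);
}

/*
 * Record an error in the shared slot only if the slot is still empty.
 * Returns true if this call stored the error, false if an earlier
 * error was already there and has been kept.
 */
static bool
toy_shared_diag_set_once(struct toy_diag *shared, const char *msg)
{
	if (shared->is_set)
		return false;
	toy_diag_set(shared, msg);
	return true;
}

/*
 * A cancelled worker reports the shared original error if there is one,
 * and only falls back to a generic cancellation error otherwise.
 */
static void
toy_worker_on_cancel(struct toy_diag *own, const struct toy_diag *shared)
{
	if (shared->is_set)
		*own = *shared;
	else
		toy_diag_set(own, "fiber is cancelled");
}

/*
 * Two workers serve one peer: worker_a hits a conflict, worker_b is
 * cancelled as a consequence. Returns 0 if both end up with the same
 * (original) error message.
 */
static int
toy_scenario(struct toy_diag *worker_a, struct toy_diag *worker_b)
{
	struct toy_diag shared = { false, "" };
	/* Made-up conflict text, for illustration only. */
	toy_diag_set(worker_a, "duplicate key in space");
	toy_shared_diag_set_once(&shared, worker_a->msg);
	/* A later, secondary error must not overwrite the original one. */
	toy_shared_diag_set_once(&shared, "fiber is cancelled");
	toy_worker_on_cancel(worker_b, &shared);
	return strcmp(worker_a->msg, worker_b->msg);
}
```

In this toy model the cancelled worker ends up reporting the conflict
rather than the generic cancellation, which is the property a DBA needs
when deciding whether to set replication_skip_conflict.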

-- 
Konstantin Osipov, Moscow, Russia
https://scylladb.com


Thread overview: 13+ messages
2020-01-27 21:53 [Tarantool-patches] [PATCH v5 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
2020-01-27 21:53 ` [Tarantool-patches] [PATCH v5 1/5] box/request: add missing OutOfMemory diag_set Cyrill Gorcunov
2020-01-27 21:53 ` [Tarantool-patches] [PATCH v5 2/5] box/applier: add missing diag_set on region_alloc failure Cyrill Gorcunov
2020-01-27 21:53 ` [Tarantool-patches] [PATCH v5 3/5] box/applier: fix nil dereference in applier rollback Cyrill Gorcunov
2020-02-04 22:11   ` Konstantin Osipov
2020-02-05  8:27     ` Cyrill Gorcunov
2020-02-05  9:55       ` Konstantin Osipov [this message]
2020-02-05 10:48         ` Cyrill Gorcunov
2020-01-27 21:53 ` [Tarantool-patches] [PATCH v5 4/5] errinj: add ERRINJ_REPLICA_TXN_WRITE Cyrill Gorcunov
2020-02-04 22:11   ` Konstantin Osipov
2020-01-27 21:53 ` [Tarantool-patches] [PATCH v5 5/5] test: add replication/applier-rollback Cyrill Gorcunov
2020-01-28  8:26   ` [Tarantool-patches] [PATCH v6 " Cyrill Gorcunov
2020-01-28 14:23 ` [Tarantool-patches] [PATCH v5 0/5] box/replication: add missing diag set and fix sigsegv Cyrill Gorcunov
