Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Serge Petrenko <sergepetrenko@tarantool.org>, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH] recovery: make it yield when positioning in a WAL
Date: Wed, 28 Apr 2021 22:50:43 +0200
Message-ID: <738d3030-80cf-8079-1b03-55a7d665dbbf@tarantool.org> (raw)
In-Reply-To: <eff3dfea-b394-606d-e342-eb5aa46e5efd@tarantool.org>

>>> We had various places in box.cc and relay.cc which counted processed
>>> rows and yielded every now and then. These yields didn't cover cases,
>>> when recovery has to position inside a long WAL file:
>>>
>>> For example, when tarantool exits without leaving an empty WAL file
>>> which'll be used to recover instance vclock on restart. In this case
>>> the instance freezes while processing the last availabe WAL in order
>>> to recover the vclock.
>>>
>>> Another issue is with replication. If a replica connects and needs data
>>> from the end of a really long WAL, recovery will read up to the needed
>>> position without yields, making relay disconnect by timeout.
>>>
>>> In order to fix the issue, make recovery decide when a yield should
>>> happen. Introduce a new callback: schedule_yield, which is called by
>>> recovery once it processes (no matter how, either simply skips or calls
>>> xstream_write) enough rows (WAL_ROWS_PER_YIELD).
>>>
>>> schedule_yield either yields right away, in case of relay, or saves the
>>> yield for later, in case of local recovery, because it might be in the
>>> middle of a transaction.
>> 1. Did you consider an option to yield explicitly in recovery code when
>> it skips rows? If they are being skipped, it does not matter what are
>> their transaction borders.
> 
> I did consider that. It is possible to do so, but then we'll have yet another
> place (in addition to relay and wal_stream) which counts rows and yields
> every now and then.
> 
> I thought it would be better to unify all these places. Actually, this could be
> done this way from the very beginning.
> I think it's not recovery's business whether to yield or not once
> some rows are processed.
> 
> Anyway, I can make it this way, if you insist.

The current solution is also fine.

>>> +
>>> +/**
>>> + * Plan a yield in recovery stream. Wal stream will execute it as soon as it's
>>> + * ready.
>>> + */
>>> +static void
>>> +wal_stream_schedule_yield(void)
>>> +{
>>> +    wal_stream.has_yield = true;
>>> +    wal_stream_try_yield(&wal_stream);
>>> +}
>>> diff --git a/src/box/recovery.cc b/src/box/recovery.cc
>>> index cd33e7635..5351d8524 100644
>>> --- a/src/box/recovery.cc
>>> +++ b/src/box/recovery.cc
>>> @@ -241,10 +248,16 @@ static void
>>>   recover_xlog(struct recovery *r, struct xstream *stream,
>>>            const struct vclock *stop_vclock)
>>>   {
>>> +    /* Imitate old behaviour. Rows are counted separately for each xlog. */
>>> +    r->row_count = 0;
>> 3. But why do you need to imitate it? Does it mean if the files are
>> too small to yield even once in each, but in total their number is
>> huge, there won't be yields?
> 
> Yes, that's true.

Does not this look wrong to you? The xlog files might not contain enough
rows if wal_max_size is small enough, and then the same issue still
exists - no yields.

>> Also does it mean "1M rows processed" was not ever printed in that
>> case?
> 
> Yes, when WALs are not big enough.
> Recovery starts over with '0.1M rows processed' on every new WAL file.

Does not this look wrong to you too? That at least the number of
rows should not drop to 0 on each next xlog file.

  reply	other threads:[~2021-04-28 20:50 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-26 16:59 Serge Petrenko via Tarantool-patches
2021-04-26 21:20 ` Vladislav Shpilevoy via Tarantool-patches
2021-04-28 15:34   ` Serge Petrenko via Tarantool-patches
2021-04-28 20:50     ` Vladislav Shpilevoy via Tarantool-patches [this message]
2021-04-29  8:55       ` Serge Petrenko via Tarantool-patches
2021-04-29 20:03         ` Vladislav Shpilevoy via Tarantool-patches
2021-05-12 11:30           ` Serge Petrenko via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=738d3030-80cf-8079-1b03-55a7d665dbbf@tarantool.org \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=sergepetrenko@tarantool.org \
    --cc=v.shpilevoy@tarantool.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Tarantool development patches archive

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://lists.tarantool.org/tarantool-patches/0 tarantool-patches/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 tarantool-patches tarantool-patches/ https://lists.tarantool.org/tarantool-patches \
		tarantool-patches@dev.tarantool.org.
	public-inbox-index tarantool-patches

Example config snippet for mirrors.


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git