Re: [Tarantool-patches] [RFC v3 2/3] applier: send first row's WAL time in the applier_writer_f - Vladislav Shpilevoy via Tarantool-patches

From: Vladislav Shpilevoy via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Mons Anderson <v.perepelitsa@corp.mail.ru>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [RFC v3 2/3] applier: send first row's WAL time in the applier_writer_f
Date: Wed, 5 May 2021 22:47:23 +0200
Message-ID: <5c0161b3-0f00-901d-1c94-f40915fe6f6b@tarantool.org>
In-Reply-To: <YJKYQD76i27TXodg@grain>

Hi! Thanks for the response!

>>> +	uint32_t instance_id;
>>> +	double first_row_tm;
>>> +};
>>> +
>>> +static void
>>> +awstat_update(struct awstat *awstat)
>>> +{
>>> +	/* Ignore if not needed */
>>> +	if (awstat->instance_id == 0)
>>> +		return;
>>
>> 2. Why did you even allocate this stat if it is not needed?
>> Maybe it would be better to have it NULL then and check
>> for NULL? AFAIU these are the initial and final join cases.
> 
> They are allocated on the stack together with synchro_entry,
> so there is no penalty.

They are allocated on the region for plain transactions, which
are way more important to optimize.

>> Did you try the way I proposed about waiting for all applier's
>> WAL writes to end in applier_stop()? Does it look worse? After
>> the fiber stop and wal_sync() it would be safe to assume there
>> are no WAL writes in flight from this applier. But I don't know if
>> it would look better.
> 
> I thought about it a lot. And you know, I don't really like what we
> would have to implement
> 
>  - currently applier_stop() doesn't wait for the journal to finish its
>    write. The main applier reader is spinning in a !fiber_is_cancelled()
>    cycle in a polling way while the applier tries to read new data from the
>    remote relay peer. If the peer doesn't reply for some reason then we throw
>    an exception which is caught by the caller code, and the caller tries
>    to iterate a new cycle testing if the fiber is cancelled.

I do not follow here. You needed to wait only for the pending journal
writes. It has nothing to do with reading new data from anywhere, does
it?

>  - in applier_stop() we will have to implement some kind of reference
>    counting, which would be modified on journal completion and i think
>    this makes code even more complex, since we have to add some additional
>    logic when applier is allowed to cancel.

We don't really. You only need to call wal_sync(), after which you can be
sure all the WAL writes started before this call are now finished. At
least AFAIR that is how it works. But these refs might be needed for
assertions, so yes, maybe not a good idea.

>>> +		return;
>>> +
>>> +	r->applier->first_row_wal_time = awstat->first_row_tm;
>>
>> 4. In case there was a batch of transactions written to WAL,
>> the latest one will override the timestamp of the previous ones and
>> this would make the lag incorrect, because you missed the older
>> transactions. Exactly like when you tried to take a timestamp of
>> the last row instead of the first row, but in a bigger scope.
>> Unless I missed something.
> 
> I'm not sure I follow you here. Say we have a batch of transactions.
> The update happens on every journal_entry completion, if several
> entries are flushed then completion is called in ordered way (one
> followed by another).

Exactly. The latest update will override the previous timestamps.

> The update happens in the same tx thread where
> appliers are running, which means the ack sending procedure is ordered
> relative to the update calls. Thus we may have a situation where we
> complete the first entry and then we either send it in an ack message or
> update to a new value and only then send the ack.

In one event loop iteration many transactions might be submitted to
WAL by the applier. And they will end also in one event loop iteration
later. There won't even be any yields between their completions.
Yes, the first of them will wake up the ack-sender fiber, but the next ones
will override the first timestamp. After a yield the ACK will contain
the newest timestamp ignoring the timestamps of the older transactions.
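
To illustrate the override, here is a minimal standalone sketch. It is not
the Tarantool code: the toy_* names are hypothetical, and the callback only
mimics the unconditional assignment to first_row_wal_time from the patch.
The batch is completed in one loop with no yield, which models the "one
event loop iteration" case described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the applier: one timestamp slot for the next ACK. */
struct toy_applier {
	double first_row_wal_time;
};

/* Completion callback modeled on the patch: always overwrite the slot. */
static void
toy_on_wal_write(struct toy_applier *a, double first_row_tm)
{
	a->first_row_wal_time = first_row_tm;
}

/*
 * Complete a batch of transactions back to back, with no yield in
 * between, and return the timestamp the ACK sender would read once
 * it finally gets scheduled.
 */
static double
toy_ack_after_batch(const double *first_row_tm, size_t count)
{
	struct toy_applier a = { 0 };
	assert(count > 0);
	for (size_t i = 0; i < count; i++)
		toy_on_wal_write(&a, first_row_tm[i]);
	return a.first_row_wal_time;
}
```

With first-row timestamps 100.0, 100.5 and 101.0 the ACK carries 101.0. If
the relay's clock reads 102.0 when the ACK arrives, it reports a lag of 1.0,
while the oldest transaction in the batch has actually waited 2.0.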

> As far as I understand
> there might be a gap in fibers scheduling before several journal entries
> get completed. Thus the lag calculated on relay side will be bigger for
> some moment but on the next ack will shrink to the value for the latest
> written journal entry.

On the contrary, the lag seen on the other side will be smaller than it
really is, because you send the ACK with the latest timestamp instead of
the oldest one. A smaller value would be misleading about the real latency.

Unless I don't understand something.

> Because relay uses current time minus time value
> obtained from applier's ack. Hopefully I didn't miss something either.
> 
>> Probably you need to assign a new timestamp only when the old
>> one is not 0, and reset it to 0 on each sent ACK. Don't know.
> 
> Gimme some time to investigate this vague moment.
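
The suggestion quoted above (guard the assignment and reset the value to 0
on each sent ACK) could be sketched as follows. This is only a sketch under
assumptions: it reads the intent as "store a timestamp only while the slot
is empty (0)", so that the oldest not yet acknowledged transaction wins, and
it assumes a real WAL timestamp is never 0. All toy_* names are
hypothetical, not the applier's API.

```c
#include <assert.h>

/* Hypothetical applier state: 0 means "no timestamp pending for an ACK". */
struct toy_applier_fixed {
	double first_row_wal_time;
};

/*
 * Completion callback: keep the oldest timestamp seen since the last ACK.
 * Assumption: real timestamps are positive, so 0 can mark an empty slot.
 */
static void
toy_on_wal_write_keep_oldest(struct toy_applier_fixed *a, double first_row_tm)
{
	assert(first_row_tm > 0);
	if (a->first_row_wal_time == 0)
		a->first_row_wal_time = first_row_tm;
}

/* ACK sender: take the pending timestamp and clear the slot. */
static double
toy_send_ack(struct toy_applier_fixed *a)
{
	double tm = a->first_row_wal_time;
	a->first_row_wal_time = 0;
	return tm;
}
```

For a batch completed with timestamps 100.0, 100.5 and 101.0 before the ACK
sender runs, the ACK now carries 100.0, and the next completion after the
ACK starts a fresh measurement.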
