* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 0/4] In-memory wal replication
[not found] ` <5842734.CUO0naIOha@home.lan>
@ 2019-10-21 23:03 ` Vladislav Shpilevoy
0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:03 UTC (permalink / raw)
To: tarantool-patches, Georgy Kirichenko; +Cc: tarantool-patches
On 20/09/2019 10:28, Georgy Kirichenko wrote:
> On Friday, September 20, 2019 10:52:49 AM MSK Konstantin Osipov wrote:
>> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/09/19 00:44]:
>>>> The last patch makes a relay able to read from wal memory using
>>>> a xrow_buffer cursor. When relay went out of wal memory window it turns
>>>> back to file mode. After last known xlog file is relayed the a relay
>>>> returns to wal memory.
>>>
>>> Please, ask Alexander to bench your branch to check if there are no
>>> performance problems.
>>
>> the whole series should boost the performance quite a bit, after
>> all we are adding an in-memory cache of wal rows and remove file
>> I/O in the most common case.
>>
>> But tx overhead is increased by more eager gc_advance, so it has
>> to be benched independently.
> It is not the latest branch, just after this I plan to move relay into wal,
> which allows us to process the whole xlog collection logic in the wal thread.
>
It will take unknown amount of time, what means, that this
not the latest branch might be on top of the master for quite
a long time, and may become a part of a release. We need it
benched.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer
[not found] ` <68c06ac1-3a3f-bae2-ebf4-79c1ae63cb64@tarantool.org>
@ 2019-10-21 23:04 ` Vladislav Shpilevoy
0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
To: Georgy Kirichenko, tarantool-patches, Konstantin Osipov,
tarantool-patches
>>>>>> + /* Decoded xrow header. */
>>>>>> + struct xrow_header xrow;
>>>>>> + /* Pointer to the xrow encoded raw data. */
>>>>>> + void *data;
>>>>>> + /* xrow raw data size. */
>>>>>> + size_t size;
>>>>>> +};
>>>>>> +
>>>>>> +/*
>>>>>> + * Xrow memory data chunk info contains
>>>>>> + * a vclock just before the first stored row,
>>>>>> + * an ibuf with row descriptors
>>>>>> + * an obuf with encoded data
>>>>>
>>>>> 29. I don't think it really makes sense to just dupicate here
>>>>> the comments of each attribute of the structure. Please,
>>>>> better use that space to explain, why you need vclock, why
>>>>> 'before first row' instead of just 'first row', or 'last row'.
>>>>>
>>>>> What is a 'row descriptor'?
>>>>>
>>>>> Why do you need both initial xrows and encoded? I thought, that
>>>>> xrow_buf stores only encoded rows.
>>>>
>>>> Relay will need initial xrows to perform filtering (replication group, lsn
>>>> and e.t.c.) without xrow encoded body to decode.
>>>
>>> Hm, it looks quite expensive just for filtering. But assume it is ok.
>>> 1) Why can't you store only a header instead of the whole xrow with a
>>> body? As I understand, for any filtering the header is enough. Or not?
>> I should also store xrow bodies and then encode the xrow as many times as
>> count of replicas I have.
>
> Why do you need to encode it multiple times? Why not to send the same
> encoded record?
> Also, I don't see where do you encode more than once.
>
Please, say anything. I still don't understand it.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer
[not found] ` <772fcc2b-bf03-19d1-6431-1b7d2b058865@tarantool.org>
@ 2019-10-21 23:04 ` Vladislav Shpilevoy
0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
To: tarantool-patches, Georgy Kirichenko, tarantool-patches
>> +enum {
>> + /*
>> + * Xrow memory object contains some count of rotating data chunks.
>> + * Every rotation estimated decrease in amount of stored rows is
>> + * about 1/(COUNT OF CHUNKS). However the bigger value makes rotation
>> + * more frequent, the decrease would be smoother and size of
>> + * a wal memory more stable.
>
> 27. Please, limit max length of a comment string to 66 symbols. It
> helps reading, no jokes. In the C file too.
Sorry, still out of 66. In many other places too.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 3/4] wal: xrow buffer cursor
[not found] ` <569ff681-a670-f4ca-3c6f-75c159e21429@tarantool.org>
@ 2019-10-21 23:04 ` Vladislav Shpilevoy
0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
To: Georgy Kirichenko; +Cc: tarantool-patches, tarantool-patches
>>>> + break;
>>>> + /* Buffer was discarded. */
>>>
>>> 4. Ok, perhaps now I understand the idea of why do you discard
>>> chunks without any checks if they are in use. Looks cool.
>>>
>>> But it means, that if a relay sends data too slow, it will fall
>>> from the in-memory replication down to disk. Being fallen to a disk,
>>> how can it return back? Isn't it a problem, that the chunks will
>>> be being written to wal so many and fast, that the relay will never
>>> return back to in-memory replication?
>> If it is a temporary slowdown and a relay will process data faster than a
>> master writes then the relay will return to in-memory replication. In the
>> opposite case, if the relay is lower that the master, then the relay will be
>> discarded because of collected wall file - in this case there is nothing we can
>> do.
>
> Have you asked Alexander Tikh. to bench you branch? I think
> we need to test how realistic is the case, when WAL writes much faster
> than replicates, and the replicas just never can keep master's pace.
> Is it correct, that relay doesn't send a next transaction until a
> previous is ACKed?
Any news here?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows
[not found] ` <0db07768-c555-a3f6-12d0-a33a267deaf7@tarantool.org>
@ 2019-10-21 23:05 ` Vladislav Shpilevoy
[not found] ` <2613444.GBVTE10AvH@home.lan>
1 sibling, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:05 UTC (permalink / raw)
To: tarantool-patches, Georgy Kirichenko, tarantool-patches
See 5 comments below.
>> diff --git a/src/box/relay.cc b/src/box/relay.cc
>> index 21674119d..1e65d6d56 100644
>> --- a/src/box/relay.cc
>> +++ b/src/box/relay.cc
>> @@ -161,9 +163,9 @@ relay_new(struct replica *replica)
>> }
>> relay->replica = replica;
>> relay->last_row_time = ev_monotonic_now(loop());
>> - fiber_cond_create(&relay->reader_cond);
>> diag_create(&relay->diag);
>> relay->state = RELAY_OFF;
>> + relay->r = NULL;
>
> 1. Why? 'relay' is allocated using 'calloc' a few lines
> above, 'r' is NULL already.
>
1. Why did you ignore that comment?
>> + relay->wal_dir = strdup(cfg_gets("wal_dir"));
>> + if (relay->wal_dir == NULL) {
>> + diag_set(OutOfMemory, strlen(cfg_gets("wal_dir")),
>
> 10. I know, I am a fucking bore, but 'strlen() + 1', for terminating
> zero :)
>
2. Why did you ignore that comment?
>> + "runtime", "wal_dir");
>
> 11. There is a strict rule what arguments we pass to diag_raise -
> it is size, allocator function, variable name. Allocator function
> here is 'strdup', not 'runtime'.
>
3. Why did you ignore that comment?
>> + struct fiber_cond xrow_buf_cond;
>> };
>>
>> struct wal_msg {
>> @@ -1131,6 +1134,7 @@ wal_writer_f(va_list ap)
>> (void) ap;
>> struct wal_writer *writer = &wal_writer_singleton;
>> xrow_buf_create(&writer->xrow_buf);
>> + fiber_cond_create(&writer->xrow_buf_cond);
>
> 18. Where do you call fiber_cond_destroy()?
>
4. Why did you ignore that comment?
>> +
>> #if defined(__cplusplus)
>> } /* extern "C" */
>> #endif /* defined(__cplusplus) */
>> diff --git a/src/lib/core/cbus.c b/src/lib/core/cbus.c
>> index b3b1280e7..b7e6d769b 100644
>> --- a/src/lib/core/cbus.c
>> +++ b/src/lib/core/cbus.c
>> @@ -284,6 +284,9 @@ cpipe_flush_cb(ev_loop *loop, struct ev_async *watcher, int events)
>> /* Trigger task processing when the queue becomes non-empty. */
>> bool output_was_empty;
>>
>> + int old_cancel_state;
>> + pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_cancel_state);
>
> 28. Why?
>
5. Why did you ignore that comment?
>> +
>> tt_pthread_mutex_lock(&endpoint->mutex);
>> output_was_empty = stailq_empty(&endpoint->output);
>> /** Flush input */
>> @@ -297,6 +300,7 @@ cpipe_flush_cb(ev_loop *loop, struct ev_async *watcher, int events)
>>
>> ev_async_send(endpoint->consumer, &endpoint->async);
>> }
>> + pthread_setcancelstate(old_cancel_state, NULL);
>> }
>
> 29. I will review the tests later, when I will fully understand the
> code. But so few tests for such a serious feature looks not enough.
> Could you test it more rigorous?
>
6. So have you succeed in that hacking? I am mostly
interested in tests about slow replication with fast
WAL writing.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows
[not found] ` <84edf0e4-a878-64be-d2f3-694cc672ec9a@tarantool.org>
@ 2019-10-21 23:05 ` Vladislav Shpilevoy
0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:05 UTC (permalink / raw)
To: Georgy Kirichenko; +Cc: tarantool-patches, tarantool-patches
On 23/09/2019 23:51, Vladislav Shpilevoy wrote:
> Thanks for the answers.
>
>>>> + struct vclock *send_vclock;
>>>> + if (relay->version_id < version_id(1, 7, 4))
>>>> + send_vclock = &relay->r->vclock;
>>>
>>> 3. I know, this is old code existed before your patch, and it will
>>> be dropped, but why do you send relay->r->vclock and don't decode it
>>> anywhere? Above you decoded a received vclock into relay->recv_vclock.
>>> relay->r->vclock is not updated, and nonetheless you send it.
>> Before 1.7.4 we used recovery vclock as replica vclock. So recovery does the
>> work.
>
> It *did* the work, but now it doesn't. Recovery vclock was being
> updated in relay_process_wal_event(). Now you relay directly
> from memory, and recovery vclock is not changed until you fall
> to disk. Moreover, looks like recovery object does not exist until
> you fall to disk. It means, that to 1.7.4 you would either crash or
> send the same vclock until a next fall to disk, wouldn't you?
>
Any news here?
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-10-21 23:00 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <cover.1568798455.git.georgy@tarantool.org>
[not found] ` <21c3aa57-72c7-a8b1-e9e8-5fbb92f22ed9@tarantool.org>
[not found] ` <20190920075249.GI8859@atlas>
[not found] ` <5842734.CUO0naIOha@home.lan>
2019-10-21 23:03 ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 0/4] In-memory wal replication Vladislav Shpilevoy
[not found] ` <4885469.V32OVRv9ck@home.lan>
[not found] ` <c21b86af-ef18-7040-4380-78b01c48203f@tarantool.org>
[not found] ` <2210179.PDX6z0RLMb@home.lan>
[not found] ` <68c06ac1-3a3f-bae2-ebf4-79c1ae63cb64@tarantool.org>
2019-10-21 23:04 ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer Vladislav Shpilevoy
[not found] ` <68fa00546c99b7b53c1c3024881a19024513104d.1568798455.git.georgy@tarantool.org>
[not found] ` <772fcc2b-bf03-19d1-6431-1b7d2b058865@tarantool.org>
2019-10-21 23:04 ` Vladislav Shpilevoy
[not found] ` <5dc83b86bbeada22342c355648de2daa7766bb73.1568798455.git.georgy@tarantool.org>
[not found] ` <a6cec753-ce56-f91f-1305-383380a2d630@tarantool.org>
[not found] ` <1816701.j1k7ruT3f3@home.lan>
[not found] ` <569ff681-a670-f4ca-3c6f-75c159e21429@tarantool.org>
2019-10-21 23:04 ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 3/4] wal: xrow buffer cursor Vladislav Shpilevoy
[not found] ` <0d9ae1e9b32cb8334bf6007de4cb62ce62f87fcb.1568798455.git.georgy@tarantool.org>
[not found] ` <0db07768-c555-a3f6-12d0-a33a267deaf7@tarantool.org>
2019-10-21 23:05 ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows Vladislav Shpilevoy
[not found] ` <2613444.GBVTE10AvH@home.lan>
[not found] ` <84edf0e4-a878-64be-d2f3-694cc672ec9a@tarantool.org>
2019-10-21 23:05 ` Vladislav Shpilevoy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox