Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 0/4] In-memory wal replication

Tarantool development patches archive
 help / color / mirror / Atom feed

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 0/4] In-memory wal replication
       [not found]     ` <5842734.CUO0naIOha@home.lan>
@ 2019-10-21 23:03       ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:03 UTC (permalink / raw)
  To: tarantool-patches, Georgy Kirichenko; +Cc: tarantool-patches



On 20/09/2019 10:28, Georgy Kirichenko wrote:
> On Friday, September 20, 2019 10:52:49 AM MSK Konstantin Osipov wrote:
>> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/09/19 00:44]:
>>>> The last patch makes a relay able to read from wal memory using
>>>> a xrow_buffer cursor. When relay went out of wal memory window it turns
>>>> back to file mode. After last known xlog file is relayed the a relay
>>>> returns to wal memory.
>>>
>>> Please, ask Alexander to bench your branch to check if there are no
>>> performance problems.
>>
>> the whole series should boost the performance quite a bit, after
>> all we are adding an in-memory cache of wal rows and remove file
>> I/O in the most common case.
>>
>> But tx overhead is increased by more eager gc_advance, so it has
>> to be benched independently.
> It is not the latest branch, just after this I plan to move relay into wal, 
> which allows us to process the whole xlog collection logic in the wal thread. 
> 

It will take unknown amount of time, what means, that this
not the latest branch might be on top of the master for quite
a long time, and may become a part of a release. We need it
benched.

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <4885469.V32OVRv9ck@home.lan>]

[parent not found: <c21b86af-ef18-7040-4380-78b01c48203f@tarantool.org>]

[parent not found: <2210179.PDX6z0RLMb@home.lan>]

[parent not found: <68c06ac1-3a3f-bae2-ebf4-79c1ae63cb64@tarantool.org>]

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer
       [not found]       ` <68c06ac1-3a3f-bae2-ebf4-79c1ae63cb64@tarantool.org>
@ 2019-10-21 23:04         ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
  To: Georgy Kirichenko, tarantool-patches, Konstantin Osipov,
	tarantool-patches

>>>>>> +	/* Decoded xrow header. */
>>>>>> +	struct xrow_header xrow;
>>>>>> +	/* Pointer to the xrow encoded raw data. */
>>>>>> +	void *data;
>>>>>> +	/* xrow raw data size. */
>>>>>> +	size_t size;
>>>>>> +};
>>>>>> +
>>>>>> +/*
>>>>>> + * Xrow memory data chunk info contains
>>>>>> + *  a vclock just before the first stored row,
>>>>>> + *  an ibuf with row descriptors
>>>>>> + *  an obuf with encoded data
>>>>>
>>>>> 29. I don't think it really makes sense to just dupicate here
>>>>> the comments of each attribute of the structure. Please,
>>>>> better use that space to explain, why you need vclock, why
>>>>> 'before first row' instead of just 'first row', or 'last row'.
>>>>>
>>>>> What is a 'row descriptor'?
>>>>>
>>>>> Why do you need both initial xrows and encoded? I thought, that
>>>>> xrow_buf stores only encoded rows.
>>>>
>>>> Relay will need initial xrows to perform filtering (replication group, lsn
>>>> and e.t.c.) without xrow encoded body to decode.
>>>
>>> Hm, it looks quite expensive just for filtering. But assume it is ok.
>>> 1) Why can't you store only a header instead of the whole xrow with a
>>> body? As I understand, for any filtering the header is enough. Or not?
>> I should also store xrow bodies and then encode the xrow as many times as 
>> count of replicas I have.
> 
> Why do you need to encode it multiple times? Why not to send the same
> encoded record?
> Also, I don't see where do you encode more than once.
> 

Please, say anything. I still don't understand it.

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <68fa00546c99b7b53c1c3024881a19024513104d.1568798455.git.georgy@tarantool.org>]

[parent not found: <772fcc2b-bf03-19d1-6431-1b7d2b058865@tarantool.org>]

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer
       [not found]   ` <772fcc2b-bf03-19d1-6431-1b7d2b058865@tarantool.org>
@ 2019-10-21 23:04     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
  To: tarantool-patches, Georgy Kirichenko, tarantool-patches

>> +enum {
>> +	/*
>> +	 * Xrow memory object contains some count of rotating data chunks.
>> +	 * Every rotation estimated decrease in amount of stored rows is
>> +	 * about 1/(COUNT OF CHUNKS). However the bigger value makes rotation
>> +	 * more frequent, the decrease would be smoother and size of
>> +	 * a wal memory more stable.
> 
> 27. Please, limit max length of a comment string to 66 symbols. It
> helps reading, no jokes. In the C file too.

Sorry, still out of 66. In many other places too.

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <5dc83b86bbeada22342c355648de2daa7766bb73.1568798455.git.georgy@tarantool.org>]

[parent not found: <a6cec753-ce56-f91f-1305-383380a2d630@tarantool.org>]

[parent not found: <1816701.j1k7ruT3f3@home.lan>]

[parent not found: <569ff681-a670-f4ca-3c6f-75c159e21429@tarantool.org>]

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 3/4] wal: xrow buffer cursor
       [not found]       ` <569ff681-a670-f4ca-3c6f-75c159e21429@tarantool.org>
@ 2019-10-21 23:04         ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:04 UTC (permalink / raw)
  To: Georgy Kirichenko; +Cc: tarantool-patches, tarantool-patches

>>>> +			break;
>>>> +		/* Buffer was discarded. */
>>>
>>> 4. Ok, perhaps now I understand the idea of why do you discard
>>> chunks without any checks if they are in use. Looks cool.
>>>
>>> But it means, that if a relay sends data too slow, it will fall
>>> from the in-memory replication down to disk. Being fallen to a disk,
>>> how can it return back? Isn't it a problem, that the chunks will
>>> be being written to wal so many and fast, that the relay will never
>>> return back to in-memory replication?
>> If it is a temporary slowdown and a relay will process data faster than a 
>> master writes then the relay will return to in-memory replication. In the 
>> opposite case, if the relay is lower that the master, then the relay will be 
>> discarded because of collected wall file - in this case there is nothing we can 
>> do.
> 
> Have you asked Alexander Tikh. to bench you branch? I think
> we need to test how realistic is the case, when WAL writes much faster
> than replicates, and the replicas just never can keep master's pace.
> Is it correct, that relay doesn't send a next transaction until a
> previous is ACKed?

Any news here?

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <0d9ae1e9b32cb8334bf6007de4cb62ce62f87fcb.1568798455.git.georgy@tarantool.org>]

[parent not found: <0db07768-c555-a3f6-12d0-a33a267deaf7@tarantool.org>]

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows
       [not found]   ` <0db07768-c555-a3f6-12d0-a33a267deaf7@tarantool.org>
@ 2019-10-21 23:05     ` Vladislav Shpilevoy
       [not found]     ` <2613444.GBVTE10AvH@home.lan>
  1 sibling, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:05 UTC (permalink / raw)
  To: tarantool-patches, Georgy Kirichenko, tarantool-patches

See 5 comments below.

>> diff --git a/src/box/relay.cc b/src/box/relay.cc
>> index 21674119d..1e65d6d56 100644
>> --- a/src/box/relay.cc
>> +++ b/src/box/relay.cc
>> @@ -161,9 +163,9 @@ relay_new(struct replica *replica)
>>  	}
>>  	relay->replica = replica;
>>  	relay->last_row_time = ev_monotonic_now(loop());
>> -	fiber_cond_create(&relay->reader_cond);
>>  	diag_create(&relay->diag);
>>  	relay->state = RELAY_OFF;
>> +	relay->r = NULL;
> 
> 1. Why? 'relay' is allocated using 'calloc' a few lines
> above, 'r' is NULL already.
> 

1. Why did you ignore that comment?

>> +	relay->wal_dir = strdup(cfg_gets("wal_dir"));
>> +	if (relay->wal_dir == NULL) {
>> +		diag_set(OutOfMemory, strlen(cfg_gets("wal_dir")),
> 
> 10. I know, I am a fucking bore, but 'strlen() + 1', for terminating
> zero :)
> 

2. Why did you ignore that comment?

>> +			 "runtime", "wal_dir");
> 
> 11. There is a strict rule what arguments we pass to diag_raise -
> it is size, allocator function, variable name. Allocator function
> here is 'strdup', not 'runtime'.
> 

3. Why did you ignore that comment?

>> +	struct fiber_cond xrow_buf_cond;
>>  };
>>  
>>  struct wal_msg {
>> @@ -1131,6 +1134,7 @@ wal_writer_f(va_list ap)
>>  	(void) ap;
>>  	struct wal_writer *writer = &wal_writer_singleton;
>>  	xrow_buf_create(&writer->xrow_buf);
>> +	fiber_cond_create(&writer->xrow_buf_cond);
> 
> 18. Where do you call fiber_cond_destroy()?
> 

4. Why did you ignore that comment?

>> +
>>  #if defined(__cplusplus)
>>  } /* extern "C" */
>>  #endif /* defined(__cplusplus) */
>> diff --git a/src/lib/core/cbus.c b/src/lib/core/cbus.c
>> index b3b1280e7..b7e6d769b 100644
>> --- a/src/lib/core/cbus.c
>> +++ b/src/lib/core/cbus.c
>> @@ -284,6 +284,9 @@ cpipe_flush_cb(ev_loop *loop, struct ev_async *watcher, int events)
>>  	/* Trigger task processing when the queue becomes non-empty. */
>>  	bool output_was_empty;
>>  
>> +	int old_cancel_state;
>> +	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_cancel_state);
> 
> 28. Why?
> 

5. Why did you ignore that comment?

>> +
>>  	tt_pthread_mutex_lock(&endpoint->mutex);
>>  	output_was_empty = stailq_empty(&endpoint->output);
>>  	/** Flush input */
>> @@ -297,6 +300,7 @@ cpipe_flush_cb(ev_loop *loop, struct ev_async *watcher, int events)
>>  
>>  		ev_async_send(endpoint->consumer, &endpoint->async);
>>  	}
>> +	pthread_setcancelstate(old_cancel_state, NULL);
>>  }
> 
> 29. I will review the tests later, when I will fully understand the
> code. But so few tests for such a serious feature looks not enough.
> Could you test it more rigorous?
> 

6. So have you succeed in that hacking? I am mostly
interested in tests about slow replication with fast
WAL writing.

^ permalink raw reply	[flat|nested] 6+ messages in thread

[parent not found: <2613444.GBVTE10AvH@home.lan>]

[parent not found: <84edf0e4-a878-64be-d2f3-694cc672ec9a@tarantool.org>]

* Re: [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows
       [not found]       ` <84edf0e4-a878-64be-d2f3-694cc672ec9a@tarantool.org>
@ 2019-10-21 23:05         ` Vladislav Shpilevoy
  0 siblings, 0 replies; 6+ messages in thread
From: Vladislav Shpilevoy @ 2019-10-21 23:05 UTC (permalink / raw)
  To: Georgy Kirichenko; +Cc: tarantool-patches, tarantool-patches



On 23/09/2019 23:51, Vladislav Shpilevoy wrote:
> Thanks for the answers.
> 
>>>> +			struct vclock *send_vclock;
>>>> +			if (relay->version_id < version_id(1, 7, 4))
>>>> +				send_vclock = &relay->r->vclock;
>>>
>>> 3. I know, this is old code existed before your patch, and it will
>>> be dropped, but why do you send relay->r->vclock and don't decode it
>>> anywhere? Above you decoded a received vclock into relay->recv_vclock.
>>> relay->r->vclock is not updated, and nonetheless you send it.
>> Before 1.7.4 we used recovery vclock as replica vclock. So recovery does the 
>> work.
> 
> It *did* the work, but now it doesn't. Recovery vclock was being
> updated in relay_process_wal_event(). Now you relay directly
> from memory, and recovery vclock is not changed until you fall
> to disk. Moreover, looks like recovery object does not exist until
> you fall to disk. It means, that to 1.7.4 you would either crash or
> send the same vclock until a next fall to disk, wouldn't you?
> 

Any news here?

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-21 23:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1568798455.git.georgy@tarantool.org>
     [not found] ` <21c3aa57-72c7-a8b1-e9e8-5fbb92f22ed9@tarantool.org>
     [not found]   ` <20190920075249.GI8859@atlas>
     [not found]     ` <5842734.CUO0naIOha@home.lan>
2019-10-21 23:03       ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 0/4] In-memory wal replication Vladislav Shpilevoy
     [not found] ` <4885469.V32OVRv9ck@home.lan>
     [not found]   ` <c21b86af-ef18-7040-4380-78b01c48203f@tarantool.org>
     [not found]     ` <2210179.PDX6z0RLMb@home.lan>
     [not found]       ` <68c06ac1-3a3f-bae2-ebf4-79c1ae63cb64@tarantool.org>
2019-10-21 23:04         ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 2/4] wal: wal memory buffer Vladislav Shpilevoy
     [not found] ` <68fa00546c99b7b53c1c3024881a19024513104d.1568798455.git.georgy@tarantool.org>
     [not found]   ` <772fcc2b-bf03-19d1-6431-1b7d2b058865@tarantool.org>
2019-10-21 23:04     ` Vladislav Shpilevoy
     [not found] ` <5dc83b86bbeada22342c355648de2daa7766bb73.1568798455.git.georgy@tarantool.org>
     [not found]   ` <a6cec753-ce56-f91f-1305-383380a2d630@tarantool.org>
     [not found]     ` <1816701.j1k7ruT3f3@home.lan>
     [not found]       ` <569ff681-a670-f4ca-3c6f-75c159e21429@tarantool.org>
2019-10-21 23:04         ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 3/4] wal: xrow buffer cursor Vladislav Shpilevoy
     [not found] ` <0d9ae1e9b32cb8334bf6007de4cb62ce62f87fcb.1568798455.git.georgy@tarantool.org>
     [not found]   ` <0db07768-c555-a3f6-12d0-a33a267deaf7@tarantool.org>
2019-10-21 23:05     ` [Tarantool-patches] [tarantool-patches] Re: [PATCH v2 4/4] replication: use wal memory buffer to fetch rows Vladislav Shpilevoy
     [not found]     ` <2613444.GBVTE10AvH@home.lan>
     [not found]       ` <84edf0e4-a878-64be-d2f3-694cc672ec9a@tarantool.org>
2019-10-21 23:05         ` Vladislav Shpilevoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox