Re: [patches] Re: [PATCH] replication: fix disconnect due to race condition

Konstantin Belyavskiy k.belyavskiy at tarantool.org
Fri Feb 16 15:45:57 MSK 2018


branch: gh-3160-disconnect-race-condition


>Friday, 16 February 2018, 14:44 +03:00 from Vladimir Davydov <vdavydov.dev at gmail.com>:
>
>On Fri, Feb 16, 2018 at 02:39:11PM +0300, Konstantin Belyavskiy wrote:
>> Incoming ACKs lead to a race condition and prevent heartbeat
>> messages. This ends in a disconnect on timeout.
>> This fix is based on @locker's proposal to send the vclock only in
>> reply to the master (since the master itself sends heartbeat messages).
>> 
>> Fix #3160
>> ---
>> branch: gh-3160-disconnect-race-condition
>>  src/box/applier.cc | 8 ++++++--
>
>Please add a test case. 
Done
>
>
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>> 
>> diff --git a/src/box/applier.cc b/src/box/applier.cc
>> index 91769ae00..d9656c870 100644
>> --- a/src/box/applier.cc
>> +++ b/src/box/applier.cc
>> @@ -98,6 +98,11 @@ applier_log_error(struct applier *applier, struct error *e)
>> 
>>  /*
>>   * Fiber function to write vclock to replication master.
>> + * To track connection status, the replica answers the master
>> + * with an encoded vclock. In addition to update requests,
>> + * the master also sends heartbeat messages every
>> + * replication_timeout (introduced in 1.7.7).
>
>What happens if the master is older than 1.7.7 and so doesn't
>send heartbeats? 
Thank you, fixed
>
>
>> + * On such requests the replica also responds with its vclock.
>>   */
>>  static int
>>  applier_writer_f(va_list ap)
>> @@ -106,10 +111,9 @@ applier_writer_f(va_list ap)
>>  	struct ev_io io;
>>  	coio_create(&io, applier->io.fd);
>> 
>> -	/* Re-connect loop */
>>  	while (!fiber_is_cancelled()) {
>>  		fiber_cond_wait_timeout(&applier->writer_cond,
>> -					replication_timeout);
>> +					TIMEOUT_INFINITY);
>>  		/* Send ACKs only when in FOLLOW mode ,*/
>>  		if (applier->state != APPLIER_SYNC &&
>>  		    applier->state != APPLIER_FOLLOW)

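To illustrate the idea behind the TIMEOUT_INFINITY change, here is a minimal standalone sketch, not the Tarantool code. It assumes a writer that sleeps on a condition variable with no timeout and sends an ACK only after the reader has signalled an incoming row or heartbeat. All names (AckWriter, on_message, run_demo) are hypothetical, and std::thread with std::condition_variable stands in for fibers and fiber_cond. Unlike the real writer, the sketch holds the lock while "sending".

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the applier writer; not Tarantool's struct applier.
struct AckWriter {
    std::mutex mtx;
    std::condition_variable cond;   // plays the role of writer_cond
    bool pending = false;           // a message arrived and is not ACKed yet
    bool cancelled = false;
    long vclock = 0;                // simplified: a single LSN counter
    std::vector<long> acks_sent;    // vclock values "sent" to the master

    // Called by the reader for every message from the master.
    // Rows advance the vclock; heartbeats only request an ACK.
    void on_message(bool is_row) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            if (is_row)
                ++vclock;
            pending = true;
        }
        cond.notify_one();
    }

    void cancel() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            cancelled = true;
        }
        cond.notify_one();
    }

    // Writer loop: waits with no timeout, so an ACK is sent only
    // in reply to a message, never on the writer's own timer.
    void run() {
        std::unique_lock<std::mutex> lock(mtx);
        for (;;) {
            cond.wait(lock, [this] { return pending || cancelled; });
            if (pending) {
                pending = false;
                acks_sent.push_back(vclock);  // stands in for the socket write
            }
            if (cancelled)
                break;
        }
    }
};

// Drive the writer with two rows and one heartbeat, then cancel it.
// Returns the ACKs the writer produced.
std::vector<long> run_demo() {
    AckWriter w;
    std::thread t([&w] { w.run(); });
    w.on_message(true);   // row
    w.on_message(false);  // heartbeat
    w.on_message(true);   // row
    w.cancel();
    t.join();
    return w.acks_sent;
}
```

Depending on scheduling, the writer may coalesce several messages into one ACK, so run_demo() returns between one and three entries. The last entry always carries the latest vclock.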

Best regards,
Konstantin Belyavskiy
k.belyavskiy at tarantool.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-replication-fix-disconnect-due-to-race-condition.patch
Type: application/x-patch
Size: 4522 bytes
Desc: not available
URL: <https://lists.tarantool.org/pipermail/tarantool-patches/attachments/20180216/48a39d56/attachment.bin>

