[tarantool-patches] Re: [PATCH 2/2] swim: disseminate event for log(cluster_size) steps

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun Jun 30 19:24:13 MSK 2019



On 30/06/2019 08:55, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/06/30 09:04]:
> 
>> Before the patch there was a problem of event and anti-entropy
>> starvation: a cluster can generate so many events that they
>> fill the whole UDP packet. If something important happens
>> during such an event storm, that event is likely to be lost and
>> not disseminated until the storm is over.
>>
>> Sadly, there is no way to prevent a storm, but it can be made
>> much shorter. To do that, the patch makes the TTD of events
>> logarithmic in the cluster size instead of linear.
>>
>> According to the SWIM paper and to experiments, a logarithmic
>> TTD is enough. A linear TTD was overkill.
>>
>> Shorter-lived events do not solve the problem of event
>> starvation: some of them can still be lost during a storm. But
>> they free some space for anti-entropy, which can finish the
>> dissemination of lost events.
>>
>> Experiments in a simulation of a cluster with 100 nodes showed
>> that, during a storm, dissemination of a failure took ~110
>> steps. Linear dissemination is the worst problem.
>> After the patch it takes ~20 steps. So it is logarithmic, as it
>> should be, although with a bigger constant than without a storm.
> 
> You say nothing in this commit about limbo queue. I have serious
> doubts about your manipulation with it. The patch needs to be
> split into pieces, each addressing its own problem and having a
> test. Now I only see 1 test for so many changes.
> 

The limbo queue makes no sense before logarithmic TTD. I can add it
in a separate commit, but then the first commit, with log TTD, will
fail some tests. They will only be fixed by the second commit, which
adds the limbo queue.

If you don't like the limbo queue idea, then reply to my cover
letter in this thread, where I explain what alternatives to the
limbo queue exist.
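
To make the linear vs. logarithmic TTD difference from the quoted
commit message concrete, here is a minimal sketch. It is not the
code from the patch: the function names, the `factor` parameter and
the exact formula (ceil(log2(cluster_size))) are assumptions made
only for illustration, and the real constant in swim may differ.

```python
def linear_ttd(cluster_size: int) -> int:
    """Linear TTD: an event is kept for about one round step per member."""
    return max(cluster_size, 1)


def log_ttd(cluster_size: int, factor: int = 1) -> int:
    """Hypothetical logarithmic TTD: factor * ceil(log2(cluster_size)).

    (n - 1).bit_length() equals ceil(log2(n)) for n >= 2 and avoids
    floating point rounding. The factor is an assumed tuning knob.
    """
    if cluster_size < 2:
        return 1
    return factor * (cluster_size - 1).bit_length()


if __name__ == "__main__":
    # Compare how long one event occupies packet space in both schemes.
    for n in (10, 100, 1000):
        print(f"cluster_size={n}: linear={linear_ttd(n)}, log={log_ttd(n)}")
```

With 100 members the sketch keeps an event for 7 steps instead of
100, which is the space the commit message says is freed for
anti-entropy.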



