[Tarantool-patches] [PATCH 1/1] replication: prevent boot when rs uuid mismatches

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Jun 5 02:48:59 MSK 2021


Hi! Thanks for the review!

> First of all, 5613 is about 3rd replica bootstrapping a separate cluster,
> even when it sees that the 2 other nodes have already bootstrapped.
> 
> This patch doesn't actually fix 5613. The 3rd node shows a different error now,
> but it still bootstraps its own cluster with a separate uuid.

I see now. I thought that in the issue description the first 2 nodes were
bootstrapped separately.

> I propose to change replicaset_round() somehow, so that it never chooses
> non-bootstrapped instances over bootstrapped ones. Even when bootstrapped
> instances are read-only.

I did it in the new version, see another email thread.

> Looks like you don't even have to change ballot for this purpose. There's
> already the 'is_loading' field. We just have to assign higher priority to
> `is_loading = false` rather than `read_only = false`.
> 
> P.S. I've checked, and looks like is_loading is not that useful now.
> It's equal to instance's is_ro flag (not the one passed in ballot, but actual is_ro).
> Still, it's easier to change is_loading encoding than introduce a whole new field.

Yes, indeed, is_loading has little to do with actual loading.
It is more like "box.cfg() is finished and box.cfg{read_only=true} was set".

I made several changes to the ballot to make it work: renamed the
is_ro field, renamed is_loading and slightly changed its behaviour,
and added a new field.

is_loading alone is not enough, because I still need to know who is really
read-only. Not just who has read_only=false, but who is actually writable.
There can be orphans that have finished bootstrap/recovery, but are not
writable yet.
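
To illustrate the distinction, here is a minimal sketch of how the three
properties could be derived from an instance's state. The names
InstanceState, ballot_flags, is_ro_cfg, is_ro and is_booted are
hypothetical and chosen for illustration only, they are not taken from
the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceState:
    """Hypothetical model of an instance, not Tarantool's real structures."""
    cfg_read_only: bool      # value passed as box.cfg{read_only = ...}
    is_orphan: bool          # orphan: not writable regardless of the config
    recovery_finished: bool  # bootstrap/recovery is done


def ballot_flags(state: InstanceState) -> dict:
    """Derive the three ballot properties discussed above (assumed names)."""
    return {
        # What the user configured.
        "is_ro_cfg": state.cfg_read_only,
        # Whether the instance can actually accept writes right now.
        "is_ro": state.cfg_read_only or state.is_orphan,
        # Whether bootstrap/recovery has completed.
        "is_booted": state.recovery_finished,
    }


# An orphan with read_only=false: booted, configured writable, but not writable.
orphan = InstanceState(cfg_read_only=False, is_orphan=True,
                       recovery_finished=True)
flags = ballot_flags(orphan)
```

In this model the orphan is the case that neither read_only=false nor
"finished recovery" can describe on its own.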

Some replication tests start failing if we only look at read_only=false
and finished bootstrap/recovery. For instance, assume node1 is started
and booted fine, so it is writable. Then 2 other nodes are started: node2
and node3. They connect to node1 first and get its ballot and vclock. Then
node2 registers on node1. Node3 now connects to node2 and gets its ballot.
It sees node2 has a higher vclock than node1 (because node3 connected to
node2 later). If node3 does not take into account that node2 is read-only
(because it is an orphan), it tries to boot from node2 (because its vclock
looks like > node1) and fails because node2 can't write to _cluster yet.

I was getting that error in replication/bootstrap_leader.test.lua until I
decided to keep both properties: being booted and being read-only.
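
The failure can be reproduced with a small model of the choice the joining
node makes. This is only a sketch under assumptions: Ballot, naive_key,
strict_key and choose are hypothetical names, the vclock is reduced to a
single integer, and the ordering is one possible priority scheme rather
than the one implemented in the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ballot:
    """Hypothetical ballot as seen by the joining node."""
    name: str
    is_booted: bool   # finished bootstrap/recovery
    is_ro_cfg: bool   # box.cfg{read_only = true} was set
    is_ro: bool       # actually not writable (for example, an orphan)
    vclock_sum: int   # simplification: the vclock reduced to one number


def naive_key(b: Ballot):
    # Looks only at finished bootstrap/recovery and the configured read_only.
    return (b.is_booted, not b.is_ro_cfg, b.vclock_sum)


def strict_key(b: Ballot):
    # Additionally prefers instances that are actually writable.
    return (b.is_booted, not b.is_ro, not b.is_ro_cfg, b.vclock_sum)


def choose(ballots, key):
    """Pick the candidate with the highest priority under the given key."""
    return max(ballots, key=key)


# Ballots the joining node holds: the writable node's ballot was fetched
# earlier (lower vclock), the orphan's ballot was fetched after it registered.
seen = [
    Ballot("writable", is_booted=True, is_ro_cfg=False, is_ro=False,
           vclock_sum=1),
    Ballot("orphan", is_booted=True, is_ro_cfg=False, is_ro=True,
           vclock_sum=2),
]

naive_choice = choose(seen, naive_key).name
strict_choice = choose(seen, strict_key).name
```

With these inputs naive_choice is the orphan, which cannot write to
_cluster, while strict_choice is the writable node.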

