[tarantool-patches] Re: [PATCH v3 1/6] [RAW] swim: introduce SWIM's anti-entropy component
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Tue Jan 15 17:42:13 MSK 2019
On 09/01/2019 14:45, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [18/12/29 15:07]:
>> + swim_member_bin_create(&member_bin);
>
>> + for (; i < (int) mh_size(swim->members); ++i) {
>> + char *pos = swim_packet_alloc(packet, sizeof(member_bin));
>> + if (pos == NULL)
>> + break;
>> + struct swim_member *member = swim->shuffled_members[i];
>> + swim_member_bin_reset(&member_bin, member);
>
> Why do you need to create() the member if you then reset it?
> Perhaps encode() or fill() is a more suitable verb than reset?
Create initializes constant fields. Reset fills others. But as you
wish. Done here and with similar functions in the next commits.
@@ -269,8 +269,7 @@ struct PACKED swim_member_bin {
};
static inline void
-swim_member_bin_reset(struct swim_member_bin *header,
- struct swim_member *member)
+swim_member_bin_fill(struct swim_member_bin *header, struct swim_member *member)
{
header->v_status = member->status;
header->v_addr = mp_bswap_u32(member->addr.sin_addr.s_addr);
@@ -439,7 +438,7 @@ swim_encode_anti_entropy(struct swim *swim, struct swim_msg *msg)
if (pos == NULL)
break;
struct swim_member *member = swim->shuffled_members[i];
- swim_member_bin_reset(&member_bin, member);
+ swim_member_bin_fill(&member_bin, member);
memcpy(pos, &member_bin, sizeof(member_bin));
}
if (i == 0)
>
>> + memcpy(pos, &member_bin, sizeof(member_bin));
>> + swim_anti_entropy_header_bin_create(&ae_header_bin, i);
>> + memcpy(header, &ae_header_bin, sizeof(ae_header_bin));
>> + swim_packet_flush(packet);
>
> Why flush() and not simply send()?
Because it has nothing to do with network. swim_packet is an
allocator, where backend buffer is a char[UDP_PACKET_SIZE].
Flush() moves current position in that buffer. Please, read
carefully swim_packet and swim_msg API.
As a leading light I've used mpstream API:
mpstream_flush(), reserve(), advance().
>
>> +swim_encode_round_msg(struct swim *swim, struct swim_msg *msg)
>
> Why not simply swim_encode_round()?
Because it takes a message as an argument and encodes it.
>
>> +/** Once per specified timeout trigger a next broadcast step. */
>> +static void
>> +swim_round_step_begin(struct ev_loop *loop, struct ev_periodic *p, int events)
>
> Once again I have a difficulty understanding the name. Is it swim
> step begin or swim round begin? What is swim round step? Sounds
> like each round has many steps and each step has a beginning and an end?
It is begin of a step. What is round step is described at the beginning
of the file. Cite:
"
* Tarantool splits protocol operation into rounds. At the
* beginning of a round all members are randomly reordered and
* linked into a list. At each round step a member is popped from
* the list head, a message is sent to him, and he waits for the
* next round.
"
>
> Then I'm missing swim_round_step_end(), swim_round_step_first(),
> or something like that.
There are no first nor second nor end. swim_round_step_begin schedules
sending of a round message to the next member. It does not send
immediately, so technically it is not round_step(), but round_step_begin().
Instead of round_step_end() I have round_step_complete(), just like
for vy_task.
>
> Looking at the code, swim_round_step_begin() is simply
> swim_round().
It is not round. What is round is described at the beginning of
the file, see cite above.
>
>
>> +static void
>> +swim_process_member_update(struct swim *swim, struct swim_member_def *def)
>> +{
>> + struct swim_member *member = swim_find_member(swim, &def->addr);
>> + /*
>> + * Trivial processing of a new member - just add it to the
>> + * members table.
>> + */
>> + if (member == NULL) {
>> + member = swim_member_new(swim, &def->addr, def->status);
>> + if (member == NULL)
>> + diag_log();
>> + }
>> +}
>
> Why nothing is done for an existing member? This needs a comment, no?
In this commit a member can only be added. Not updated nor deleted. These
actions are introduced in the next commits.
>
>> +
>> +struct swim_transport swim_udp_transport = {
>> + /* .send_round_msg = */ swim_udp_send_msg,
>> + /* .recv_msg = */ swim_udp_recv_msg,
>> +};
>
> Initializing/destroying an endpoint (like calling bind()) should also be
> part of transport api.>
>> +int
>> +swim_scheduler_bind(struct swim_scheduler *scheduler, struct sockaddr_in *addr)
>
> And not part of the scheduler api.
As you wish. Done on the branch. I do not pase the diff here
because it is too big.
>
>> + evio_setsockopt_server(fd, AF_INET, SOCK_DGRAM) != 0) {
>
> The file descriptor itself should also be part of the transport.
In such a case the transport will be not a vtab, but an object, with
its own attributes. As I remember, you asked to make the transport just
a simple vtab. But as you wish. Done on the branch.
>
>> +static void
>> +swim_scheduler_on_input(struct ev_loop *loop, struct ev_io *io, int events)
>> +{
>> + assert((events & EV_READ) != 0);
>> + (void) events;
>> + (void) loop;
>> + struct swim_scheduler *scheduler = (struct swim_scheduler *) io->data;
>> + struct sockaddr_in addr;
>> + socklen_t len = sizeof(addr);
>> + struct swim_packet packet;
>> + struct swim_msg msg;
>> + swim_msg_create(&msg);
>> + swim_packet_create(&packet, &msg);
>> + swim_transport_recv_f recv = scheduler->transport->recv_msg;
>> + ssize_t size = recv(io->fd, packet.body, packet.end - packet.body,
>> + (struct sockaddr *) &addr, &len);
>
> I don't understand why you do it here, if it's part of the
> transport api.
It is. recv here is a variable taken from scheduler->transport->recv_msg.
But never mind, this code is reworked already due to other comments.
More information about the Tarantool-patches
mailing list