Tarantool development patches archive
* [PATCH v3 0/6] SWIM draft
@ 2018-12-29 10:14 Vladislav Shpilevoy
  2018-12-29 10:14 ` [PATCH v3 1/6] [RAW] swim: introduce SWIM's anti-entropy component Vladislav Shpilevoy
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Vladislav Shpilevoy @ 2018-12-29 10:14 UTC
  To: tarantool-patches; +Cc: vdavydov.dev, kostja

The first commit message contains comprehensive information about SWIM, which I
will not duplicate here. This is only a description of the patchset.

SWIM consists of two main components - dissemination and failure detection, and
one additional component - anti-entropy. The patchset introduces them one by one
in the first three commits.

The next two commits are technical improvements.

The last commit allows a SWIM user to carry their own data in dissemination and
anti-entropy component messages.

Note that these commits contain bugs and typos and have no tests. The goal of
this review is a high-level approval of the API, so that I can start writing
tests.

Nonetheless, here I describe some known bugs and open questions:

1. I tried to avoid unnecessary allocations of the most used swim_tasks and
   saved ping_task and ack_task in struct swim_member as attributes to reuse
   them. Now I do not think it is worth the 'perf win', and it complicates task
   destruction. I am planning to always allocate/delete swim_task. This is also
   required for indirect pings/ACKs, where I cannot allocate tasks in advance.

2. ACKs can now be lost. I start waiting for an ACK once a PING task is
   scheduled, but that is not correct. Technically, the ping has not been sent
   yet, and swim_check_acks can mistakenly treat it as a lost ping. I am
   planning to fix it by saving the last received ACK timestamp in struct member
   and starting to wait for an ACK *after* the PING task is finished, *but only
   if* the member has not received an ACK already.
   This problem originally arises from the fact that a PING may not be the only
   packet in a message, so I cannot always safely start waiting for an ACK in
   the swim_task.complete callback. An ACK can arrive after the PING packet is
   sent, but before the whole message is sent. The Gitlab Lua version of SWIM
   does not have this problem since it has no multi-packet support.

3. In swim_round_step_complete() it is unsafe to assume that the member at the
   head of the round queue is still the same as it was when the task was
   scheduled. I am planning to just do the shift in swim_round_step_begin.

4. There is a problem with 'immortal' members. When a member is declared dead,
   its state is disseminated to other members, and it is deleted from the
   table. But other members can bring it back via the anti-entropy component.
   The member will still have 'dead' status, but will never be deleted. I am
   planning to fix it by not adding dead members from the anti-entropy
   component to the local table.
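
The fixes planned in items 2 and 4 could look roughly like the sketch below.
It is only an illustration under assumptions: every name prefixed with sketch_
is hypothetical and does not come from the patchset, and comparing the last ACK
timestamp with the time the PING was scheduled is one possible reading of "did
not receive ACK already".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, not the structures from the patchset. */
enum sketch_status { SKETCH_ALIVE, SKETCH_DEAD };

struct sketch_member {
	enum sketch_status status;
	/* Time when the last ACK from this member was received. */
	double last_ack_ts;
	/* Time when the message carrying the last PING was fully sent. */
	double ping_sent_ts;
	/* True while an ACK for the last PING is awaited. */
	bool is_waiting_ack;
};

/*
 * Item 2: called when the whole message carrying the PING has been
 * sent, not when the PING task is scheduled. Waiting starts only if
 * no ACK has arrived since the PING was scheduled (an assumption).
 */
static void
sketch_ping_complete(struct sketch_member *m, double scheduled_ts, double now)
{
	assert(m != NULL);
	m->ping_sent_ts = now;
	m->is_waiting_ack = m->last_ack_ts < scheduled_ts;
}

/*
 * Item 4: decide whether a record received via anti-entropy should be
 * applied to the local table. A dead member that is not in the table
 * is skipped, so it is not brought back after deletion.
 */
static bool
sketch_accept_anti_entropy_record(bool is_known, enum sketch_status status)
{
	return is_known || status != SKETCH_DEAD;
}
```

Records about already known members are still applied in this sketch, so a
status change to 'dead' can reach the local table through anti-entropy too.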

Open questions:

1. Should a timestamp be added to each PING/ACK in addition to the incarnation?
   It protects against the case when an ACK is duplicated accidentally, or
   arrives with the same incarnation but too late. The Gitlab Lua version does
   it, but the protocol, as I remember, does not specify it.
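
To make the question concrete, here is a minimal sketch of how such a check
might work, assuming the ACK echoes the timestamp of the PING it answers. The
structures, field names and the function are hypothetical; neither the patchset
nor the protocol defines them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire-level fields of a PING and of its ACK. */
struct sketch_ping {
	/* Incarnation of the target known to the sender. */
	uint64_t incarnation;
	/* Sender-local timestamp put into the PING. */
	uint64_t ts;
};

struct sketch_ack {
	/* Incarnation reported by the responding member. */
	uint64_t incarnation;
	/* Timestamp copied from the PING being answered. */
	uint64_t ping_ts;
};

/*
 * Accept an ACK only if it answers the PING that is currently awaited.
 * A duplicated or late ACK echoes an older timestamp and is rejected
 * even when its incarnation is still valid.
 */
static bool
sketch_ack_matches(const struct sketch_ping *ping, const struct sketch_ack *ack)
{
	assert(ping != NULL && ack != NULL);
	return ack->incarnation >= ping->incarnation &&
	       ack->ping_ts == ping->ts;
}
```

With the incarnation alone, the late ACK in this sketch would be
indistinguishable from a fresh one.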

Also, the code is very obfuscated in some places and still needs renaming,
refactoring in most places, and some movement of diffs between commits, sorry.

http://github.com/tarantool/tarantool/tree/gerold103/gh-3234-swim
https://github.com/tarantool/tarantool/issues/3234

Changes in v3:
- packets can carry arbitrary payload;
- socket reading/writing related routines and structures are moved to
  swim_scheduler.

Changes in v2:
- new API with explicit members addition, removal;
- ability to create multiple SWIM instances per one Tarantool process;
- multi-packet sending of one SWIM message.

V1: https://www.freelists.org/post/tarantool-patches/PATCH-05-SWIM
V2: https://www.freelists.org/post/tarantool-patches/PATCH-v2-06-SWIM

Vladislav Shpilevoy (6):
  [RAW] swim: introduce SWIM's anti-entropy component
  [RAW] swim: introduce failure detection component
  [RAW] swim: introduce a dissemination component
  [RAW] swim: keep encoded round message cached
  [RAW] swim: send one UDP packet per EV_WRITE event
  [RAW] swim: introduce payload

 src/CMakeLists.txt            |    3 +-
 src/evio.c                    |    3 +-
 src/evio.h                    |    4 +
 src/lib/CMakeLists.txt        |    1 +
 src/lib/swim/CMakeLists.txt   |    7 +
 src/lib/swim/swim.c           | 1653 +++++++++++++++++++++++++++++++++
 src/lib/swim/swim.h           |   99 ++
 src/lib/swim/swim_io.c        |  180 ++++
 src/lib/swim/swim_io.h        |  316 +++++++
 src/lib/swim/swim_transport.h |   66 ++
 src/lua/init.c                |    2 +
 src/lua/swim.c                |  244 +++++
 src/lua/swim.h                |   47 +
 13 files changed, 2622 insertions(+), 3 deletions(-)
 create mode 100644 src/lib/swim/CMakeLists.txt
 create mode 100644 src/lib/swim/swim.c
 create mode 100644 src/lib/swim/swim.h
 create mode 100644 src/lib/swim/swim_io.c
 create mode 100644 src/lib/swim/swim_io.h
 create mode 100644 src/lib/swim/swim_transport.h
 create mode 100644 src/lua/swim.c
 create mode 100644 src/lua/swim.h

-- 
2.17.2 (Apple Git-113)

Thread overview: 17+ messages
2018-12-29 10:14 [PATCH v3 0/6] SWIM draft Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 1/6] [RAW] swim: introduce SWIM's anti-entropy component Vladislav Shpilevoy
2019-01-09  9:12   ` [tarantool-patches] " Konstantin Osipov
2019-01-15 14:42     ` [tarantool-patches] " Vladislav Shpilevoy
2019-01-09 11:45   ` [tarantool-patches] " Konstantin Osipov
2019-01-15 14:42     ` [tarantool-patches] " Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 2/6] [RAW] swim: introduce failure detection component Vladislav Shpilevoy
2019-01-09 13:48   ` [tarantool-patches] " Konstantin Osipov
2019-01-15 14:42     ` [tarantool-patches] " Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 3/6] [RAW] swim: introduce a dissemination component Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 4/6] [RAW] swim: keep encoded round message cached Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 5/6] [RAW] swim: send one UDP packet per EV_WRITE event Vladislav Shpilevoy
2019-01-09 13:53   ` [tarantool-patches] " Konstantin Osipov
2019-01-15 14:42     ` [tarantool-patches] " Vladislav Shpilevoy
2018-12-29 10:14 ` [PATCH v3 6/6] [RAW] swim: introduce payload Vladislav Shpilevoy
2019-01-09 13:58   ` [tarantool-patches] " Konstantin Osipov
2019-01-15 14:42     ` [tarantool-patches] " Vladislav Shpilevoy