[Tarantool-patches] [PATCH v4 00/11] Replication from memory
Georgy Kirichenko
georgy at tarantool.org
Wed Feb 12 12:39:09 MSK 2020
This is a complete redesign of the previous version of the
feature. The first five patches are refactoring that makes the
corresponding facilities (recovery, coio and xstream)
C-compliant. Two minor changes are split out in order to
facilitate review.
The sixth patch extracts xlog batch writing into a separate
routine, which also helps with further reviews.
The matrix clock is a structure that maintains a set of vclocks and
is used to build an n-majority vclock (a vclock each component of
which is greater than or equal to n corresponding components of
all contained vclocks). This feature is used in order to
determine a vclock read by all replicas (0-majority) or
a vclock which is applied by n replicas in case of synchronous
replication.
The matrix clock allows wal to track relay vclocks and collect
garbage without the tx thread, which is implemented in the next patch.
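To illustrate the idea, here is a minimal sketch of a component-wise
majority vclock. It is not the code from mclock.c: all sketch_* names,
the fixed sizes and the 1-based parameter k ("at least k members have
reached this value") are assumptions made for the example, and the
indexing used in the patch (e.g. 0-majority) may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limits, chosen only to keep the sketch small. */
enum { SKETCH_VCLOCK_MAX = 4, SKETCH_MEMBERS_MAX = 8 };

/* Simplified vclock: one LSN per replica id. */
struct sketch_vclock {
	int64_t lsn[SKETCH_VCLOCK_MAX];
};

/* qsort comparator which orders LSNs from the largest to the smallest. */
static int
sketch_cmp_lsn_desc(const void *a, const void *b)
{
	int64_t x = *(const int64_t *)a;
	int64_t y = *(const int64_t *)b;
	return (x < y) - (x > y);
}

/*
 * Build a vclock each component of which has been reached by at
 * least k of the tracked members: for every component take the
 * k-th largest value among all members. With k == count this is
 * the component-wise minimum, i.e. what every member has reached.
 */
static void
sketch_mclock_get(const struct sketch_vclock *members, int count, int k,
		  struct sketch_vclock *out)
{
	assert(count >= 1 && count <= SKETCH_MEMBERS_MAX);
	assert(k >= 1 && k <= count);
	for (int i = 0; i < SKETCH_VCLOCK_MAX; i++) {
		int64_t column[SKETCH_MEMBERS_MAX];
		for (int m = 0; m < count; m++)
			column[m] = members[m].lsn[i];
		qsort(column, count, sizeof(column[0]), sketch_cmp_lsn_desc);
		out->lsn[i] = column[k - 1];
	}
}
```

With three members, k = 3 gives the vclock which is safe for garbage
collection (every member has passed it), while k = 2 gives the vclock
confirmed by two of them. Sorting on every call is only for brevity;
a real structure would rather keep each column ordered.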
The xrow buffer object is an in-memory data structure that places
encoded xrow data as well as the corresponding xrow headers into
rotating memory buffers. The main purpose is to let a transaction
live in memory for some time even after the transaction
is finalized. Encoded xrow data is stored in obufs whereas
headers are stored in arrays. This approach allows analyzing an
xrow header (replica id, lsn and group) without decoding blob
data as recovery does now. Additionally, it is possible to scan
xrow headers and build a big data range containing already encoded
data in order to send the data with one call (not implemented yet).
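The layout described above can be sketched roughly as follows. This
is a simplified illustration, not the code from xrow_buf.c: a flat
char array stands in for the obufs, rotation is left out, and all
sketch_* names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacities, chosen only to keep the sketch small. */
enum { SKETCH_ROWS_MAX = 16, SKETCH_DATA_MAX = 256 };

/* Header kept apart from the encoded row it describes. */
struct sketch_row_header {
	uint32_t replica_id;
	uint32_t group_id;
	int64_t lsn;
	size_t data_offset;	/* start of the encoded row in data[] */
	size_t data_len;	/* length of the encoded row */
};

struct sketch_xrow_buf {
	struct sketch_row_header headers[SKETCH_ROWS_MAX];
	int row_count;
	char data[SKETCH_DATA_MAX];
	size_t data_used;
};

/* Append an already encoded row. Return 0 on success, -1 if full. */
static int
sketch_xrow_buf_append(struct sketch_xrow_buf *buf, uint32_t replica_id,
		       uint32_t group_id, int64_t lsn,
		       const void *encoded, size_t len)
{
	if (buf->row_count == SKETCH_ROWS_MAX ||
	    len > SKETCH_DATA_MAX - buf->data_used)
		return -1;
	struct sketch_row_header *h = &buf->headers[buf->row_count++];
	h->replica_id = replica_id;
	h->group_id = group_id;
	h->lsn = lsn;
	h->data_offset = buf->data_used;
	h->data_len = len;
	memcpy(buf->data + buf->data_used, encoded, len);
	buf->data_used += len;
	return 0;
}

/*
 * Find the first row of the given replica id with lsn greater than
 * after_lsn. Only headers are inspected, the encoded data is never
 * decoded. Return the row index or -1 if there is no such row.
 */
static int
sketch_xrow_buf_seek(const struct sketch_xrow_buf *buf, uint32_t replica_id,
		     int64_t after_lsn)
{
	for (int i = 0; i < buf->row_count; i++) {
		const struct sketch_row_header *h = &buf->headers[i];
		if (h->replica_id == replica_id && h->lsn > after_lsn)
			return i;
	}
	return -1;
}

/* Return the contiguous encoded range covering rows [first, row_count). */
static const char *
sketch_xrow_buf_range(const struct sketch_xrow_buf *buf, int first,
		      size_t *len)
{
	assert(first >= 0 && first < buf->row_count);
	size_t start = buf->headers[first].data_offset;
	*len = buf->data_used - start;
	return buf->data + start;
}
```

A reader would seek by headers first and then hand the returned range
to a single write call, which is the "send with one call" idea
mentioned above.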
The tenth patch refactors wal so that an xrow buffer is
used before any actual write.
The last patch implements in-memory replication. From now on a relay
lives in the wal thread (which is inevitable in case of synchronous
replication) as a pair of fibers: a writer and a reader. The reader
has the same mission as before: to read and process the replica status
vclock. The writer fetches rows from the wal xrow buffer and then
transmits them to a replica. If wal memory does not contain the
required rows then the writing fiber spawns a cord which reads logs
from files. The relay also provides a special filter function which
is used by the writer in order to implement the previous relaying
logic (skip rows, nops).
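A filter of this kind could look roughly like the sketch below. It is
an illustration only: the sketch_* names, the result codes and the two
rules shown (do not send a replica its own rows back, turn local-group
rows into NOPs) are assumptions about what "skip rows, nops" covers,
not the actual function from relay.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical group ids for the sketch. */
enum { SKETCH_GROUP_DEFAULT = 0, SKETCH_GROUP_LOCAL = 1 };

/* What the writer should do with a row fetched from wal memory. */
enum sketch_filter_result {
	SKETCH_ROW_SEND,	/* transmit the row as is */
	SKETCH_ROW_SKIP,	/* do not transmit the row at all */
	SKETCH_ROW_NOP,		/* transmit a NOP instead of the row */
};

/* Only the header fields the filter needs; no row body is decoded. */
struct sketch_row {
	uint32_t replica_id;
	uint32_t group_id;
	int64_t lsn;
};

/*
 * Decide how to relay a row to the peer with the given id:
 * rows of the local group are replaced with NOPs so that the peer
 * still advances its vclock, rows which originate from the peer
 * itself are skipped, everything else is sent.
 */
static enum sketch_filter_result
sketch_relay_filter(const struct sketch_row *row, uint32_t peer_id)
{
	if (row->group_id == SKETCH_GROUP_LOCAL)
		return SKETCH_ROW_NOP;
	if (row->replica_id == peer_id)
		return SKETCH_ROW_SKIP;
	return SKETCH_ROW_SEND;
}
```

Since the decision depends on header fields only, the writer can apply
it to rows coming from the xrow buffer and to rows read from files in
the same way.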
Branch:
https://github.com/tarantool/tarantool/tree/g.kirichenko/gh-3794-memory-replication
Issue: https://github.com/tarantool/tarantool/issues/3794
Georgy Kirichenko (11):
recovery: do not call recovery_stop_local inside recovery_delete
recovery: do not throw an error
coio: do not allow parallel usage of coio
coio: do not throw an error, minor refactoring
xstream: get rid of an exception
wal: extract log write batch into a separate routine
wal: matrix clock structure
wal: track relay vclock and collect logs in wal thread
wal: xrow memory buffer and cursor
wal: use a xrow buffer object for entry encoding
replication: use wal memory buffer to fetch rows
src/box/CMakeLists.txt | 6 +-
src/box/applier.cc | 49 +-
src/box/box.cc | 81 +-
src/box/gc.c | 216 ++---
src/box/gc.h | 95 +-
src/box/lua/info.c | 33 +-
src/box/mclock.c | 374 ++++++++
src/box/mclock.h | 125 +++
src/box/recovery.cc | 100 ++-
src/box/recovery.h | 14 +-
src/box/relay.cc | 649 ++++----------
src/box/relay.h | 6 +-
src/box/replication.cc | 37 +-
src/box/wal.c | 829 ++++++++++++++++--
src/box/wal.h | 97 +-
src/box/xlog.c | 57 +-
src/box/xlog.h | 14 +
src/box/xrow_buf.c | 374 ++++++++
src/box/xrow_buf.h | 197 +++++
src/box/xrow_io.cc | 59 +-
src/box/xrow_io.h | 11 +-
src/box/xstream.cc | 44 -
src/box/xstream.h | 9 +-
src/lib/core/coio.cc | 534 ++++++-----
src/lib/core/coio.h | 19 +-
src/lib/core/coio_buf.h | 8 +
src/lib/core/errinj.h | 1 +
test/box-py/iproto.test.py | 9 +-
test/box/errinj.result | 134 +--
test/replication/force_recovery.result | 8 +
test/replication/force_recovery.test.lua | 2 +
test/replication/gc_no_space.result | 30 +-
test/replication/gc_no_space.test.lua | 12 +-
test/replication/replica_rejoin.result | 8 +
test/replication/replica_rejoin.test.lua | 2 +
.../show_error_on_disconnect.result | 8 +
.../show_error_on_disconnect.test.lua | 2 +
test/replication/suite.ini | 2 +-
test/unit/CMakeLists.txt | 2 +
test/unit/mclock.result | 18 +
test/unit/mclock.test.c | 160 ++++
test/xlog/panic_on_wal_error.result | 12 +
test/xlog/panic_on_wal_error.test.lua | 3 +
test/xlog/suite.ini | 2 +-
44 files changed, 3063 insertions(+), 1389 deletions(-)
create mode 100644 src/box/mclock.c
create mode 100644 src/box/mclock.h
create mode 100644 src/box/xrow_buf.c
create mode 100644 src/box/xrow_buf.h
delete mode 100644 src/box/xstream.cc
create mode 100644 test/unit/mclock.result
create mode 100644 test/unit/mclock.test.c
--
2.25.0