From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
From: Vladimir Davydov
Subject: [PATCH 00/13] Join replicas off the current read view
Date: Sat, 10 Aug 2019 13:03:27 +0300
Message-Id:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
To: tarantool-patches@freelists.org
List-ID:

Currently, we join replicas off the last checkpoint. As a result, we must
keep all files corresponding to the last checkpoint. This means that we
must always create a memtx snapshot file on initial call to box.cfg()
even though it is virtually the same for all instances. Besides, we must
rotate the vylog file synchronously with snapshot creation, otherwise we
wouldn't be able to pull all vinyl files corresponding to the last
checkpoint. This interconnection between vylog and xlog makes the code
difficult to maintain.

Actually, nothing prevents us from relaying the current read view instead
of the last checkpoint on initial join, as both memtx and vinyl support
a consistent read view. This patch does the trick. This is a step towards
making vylog independent of checkpointing and WAL.

https://github.com/tarantool/tarantool/issues/1271
https://github.com/tarantool/tarantool/commits/dv/gh-1271-rework-replica-join

Vladimir Davydov (13):
  vinyl: embed engine in vy_env
  vinyl: embed index in vy_lsm
  vinyl: move reference counting from vy_lsm to index
  vinyl: don't pin index for iterator lifetime
  vinyl: don't exempt dropped indexes from dump and compaction
  memtx: don't store pointers to index internals in iterator
  memtx: use ref counting to pin indexes for snapshot
  memtx: allow snapshot iterator to fail
  memtx: enter small delayed free mode from snapshot iterator
  wal: make wal_sync fail on write error
  xrow: factor out helper for setting REPLACE request body
  test: disable replication/on_schema_init
  relay: join new replicas off read view

 src/box/blackhole.c                         |   3 +-
 src/box/box.cc                              |  53 +-
 src/box/engine.c                            |  21 -
 src/box/engine.h                            |  27 +-
 src/box/index.cc                            |   2 +
 src/box/index.h                             |  29 +-
 src/box/iproto_constants.h                  |  11 +
 src/box/lua/info.c                          |   3 +-
 src/box/lua/stat.c                          |   3 +-
 src/box/memtx_bitset.c                      |  10 +-
 src/box/memtx_engine.c                      | 152 +----
 src/box/memtx_engine.h                      |  28 +-
 src/box/memtx_hash.c                        |  52 +-
 src/box/memtx_tree.c                        | 100 ++--
 src/box/relay.cc                            | 170 +++++-
 src/box/relay.h                             |   2 +-
 src/box/sequence.c                          |  24 +-
 src/box/space.c                             |   4 +-
 src/box/sysview.c                           |   3 +-
 src/box/vinyl.c                             | 596 ++++++--------------
 src/box/vinyl.h                             |  26 +-
 src/box/vy_lsm.c                            |   6 +-
 src/box/vy_lsm.h                            |  22 +-
 src/box/vy_run.c                            |   6 -
 src/box/vy_scheduler.c                      |  95 ++--
 src/box/vy_scheduler.h                      |  10 +-
 src/box/vy_tx.c                             |  10 +-
 src/box/vy_tx.h                             |   8 +
 src/box/wal.c                               |  29 +-
 src/box/wal.h                               |   5 +-
 src/lib/core/errinj.h                       |   2 +-
 test/box/errinj.result                      |  38 +-
 test/replication-py/cluster.result          |  13 -
 test/replication-py/cluster.test.py         |  25 -
 test/replication/join_without_snap.result   |  88 +++
 test/replication/join_without_snap.test.lua |  32 ++
 test/replication/suite.cfg                  |   1 +
 test/replication/suite.ini                  |   2 +-
 test/unit/vy_point_lookup.c                 |   3 +-
 test/vinyl/errinj.result                    |   4 +-
 test/vinyl/errinj.test.lua                  |   4 +-
 test/xlog/panic_on_broken_lsn.result        |  31 +-
 test/xlog/panic_on_broken_lsn.test.lua      |  20 +-
 43 files changed, 826 insertions(+), 947 deletions(-)
 create mode 100644 test/replication/join_without_snap.result
 create mode 100644 test/replication/join_without_snap.test.lua

-- 
2.20.1