From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org> To: tarantool-patches@dev.tarantool.org, sergepetrenko@tarantool.org Subject: [Tarantool-patches] [PATCH v2 05/19] xrow: introduce CONFIRM and ROLLBACK entries Date: Tue, 30 Jun 2020 01:15:24 +0200 [thread overview] Message-ID: <3153acf5a8f5e34411a44069e8681357abc4b3a7.1593472477.git.v.shpilevoy@tarantool.org> (raw) In-Reply-To: <cover.1593472477.git.v.shpilevoy@tarantool.org> From: Serge Petrenko <sergepetrenko@tarantool.org> Add methods to encode/decode CONFIRM entry. A CONFIRM entry will be written to WAL by synchronous replication master as soon as it finds that the transaction was applied on a quorum of replicas. CONFIRM rows share the same header with other rows in WAL, but their body differs: it's just a map containing replica_id and lsn of the last confirmed transaction. ROLLBACK request contains the same data as CONFIRM request. The only difference is the request semantics. While a CONFIRM request releases all the limbo entries up to the given lsn, the ROLLBACK request rolls back all the entries with lsn greater than given one. Part-of #4847 Part-of #4848 @TarantoolBot document Title: document synchronous replication auxiliary requests Two new iproto request codes are added: * IPROTO_CONFIRM = 0x28 (decimal 40) * IPROTO_ROLLBACK = 0x29 (decimal 41) Both entries share the same request body (it's a map of 2 items): IPROTO_REPLICA_ID : leader_id - id of the synchronous replication leader, IPROTO_LSN : leader_lsn - lsn of the last confirmed transaction. The CONFIRM and ROLLBACK ops are written to WAL, so their header also has IPROTO_REPLICA_ID and IPROTO_LSN fields, which are replica_id : lsn of the instance that wrote these records. leader_id may be different from replica_id, and leader_lsn refers to some past moment in time. When an instance either reads from WAL or receives a CONFIRM entry via replication, it knows that all the leader's synchronous transactions up to the given leader_lsn may be safely committed. When an instance receives or reads a ROLLBACK entry, it knows that all the leader's transactions received up to the given point in time must be rolled back, starting with a transaction, which begins with leader_lsn + 1. --- src/box/iproto_constants.h | 12 +++++ src/box/xrow.c | 106 +++++++++++++++++++++++++++++++++++++ src/box/xrow.h | 46 ++++++++++++++++ 3 files changed, 164 insertions(+) diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h index e38ee4529..6b850f101 100644 --- a/src/box/iproto_constants.h +++ b/src/box/iproto_constants.h @@ -219,6 +219,11 @@ enum iproto_type { /** The maximum typecode used for box.stat() */ IPROTO_TYPE_STAT_MAX, + /** A confirmation message for synchronous transactions. */ + IPROTO_CONFIRM = 40, + /** A rollback message for synchronous transactions. */ + IPROTO_ROLLBACK = 41, + /** PING request */ IPROTO_PING = 64, /** Replication JOIN command */ @@ -316,6 +321,13 @@ dml_request_key_map(uint32_t type) return iproto_body_key_map[type]; } +/** CONFIRM/ROLLBACK entries for synchronous replication. */ +static inline bool +iproto_type_is_synchro_request(uint32_t type) +{ + return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK; +} + /** This is an error. */ static inline bool iproto_type_is_error(uint32_t type) diff --git a/src/box/xrow.c b/src/box/xrow.c index bb64864b2..39d1814c4 100644 --- a/src/box/xrow.c +++ b/src/box/xrow.c @@ -878,6 +878,112 @@ xrow_encode_dml(const struct request *request, struct region *region, return iovcnt; } +static int +xrow_encode_confirm_rollback(struct xrow_header *row, uint32_t replica_id, + int64_t lsn, int type) +{ + size_t len = mp_sizeof_map(2) + mp_sizeof_uint(IPROTO_REPLICA_ID) + + mp_sizeof_uint(replica_id) + mp_sizeof_uint(IPROTO_LSN) + + mp_sizeof_uint(lsn); + char *buf = (char *)region_alloc(&fiber()->gc, len); + if (buf == NULL) { + diag_set(OutOfMemory, len, "region_alloc", "buf"); + return -1; + } + char *pos = buf; + + pos = mp_encode_map(pos, 2); + pos = mp_encode_uint(pos, IPROTO_REPLICA_ID); + pos = mp_encode_uint(pos, replica_id); + pos = mp_encode_uint(pos, IPROTO_LSN); + pos = mp_encode_uint(pos, lsn); + + memset(row, 0, sizeof(*row)); + + row->body[0].iov_base = buf; + row->body[0].iov_len = len; + row->bodycnt = 1; + + row->type = type; + + return 0; +} + +int +xrow_encode_confirm(struct xrow_header *row, uint32_t replica_id, int64_t lsn) +{ + return xrow_encode_confirm_rollback(row, replica_id, lsn, + IPROTO_CONFIRM); +} + +int +xrow_encode_rollback(struct xrow_header *row, uint32_t replica_id, int64_t lsn) +{ + return xrow_encode_confirm_rollback(row, replica_id, lsn, + IPROTO_ROLLBACK); +} + +static int +xrow_decode_confirm_rollback(struct xrow_header *row, uint32_t *replica_id, + int64_t *lsn) +{ + if (row->bodycnt == 0) { + diag_set(ClientError, ER_INVALID_MSGPACK, "request body"); + return -1; + } + + assert(row->bodycnt == 1); + + const char * const data = (const char *)row->body[0].iov_base; + const char * const end = data + row->body[0].iov_len; + const char *d = data; + if (mp_check(&d, end) != 0 || mp_typeof(*data) != MP_MAP) { + xrow_on_decode_err(data, end, ER_INVALID_MSGPACK, + "request body"); + return -1; + } + + d = data; + uint32_t map_size = mp_decode_map(&d); + for (uint32_t i = 0; i < map_size; i++) { + enum mp_type type = mp_typeof(*d); + if (type != MP_UINT) { + mp_next(&d); + mp_next(&d); + continue; + } + uint8_t key = mp_decode_uint(&d); + if (key >= IPROTO_KEY_MAX || iproto_key_type[key] != type) { + xrow_on_decode_err(data, end, ER_INVALID_MSGPACK, + "request body"); + return -1; + } + switch (key) { + case IPROTO_REPLICA_ID: + *replica_id = mp_decode_uint(&d); + break; + case IPROTO_LSN: + *lsn = mp_decode_uint(&d); + break; + default: + mp_next(&d); + } + } + return 0; +} + +int +xrow_decode_confirm(struct xrow_header *row, uint32_t *replica_id, int64_t *lsn) +{ + return xrow_decode_confirm_rollback(row, replica_id, lsn); +} + +int +xrow_decode_rollback(struct xrow_header *row, uint32_t *replica_id, int64_t *lsn) +{ + return xrow_decode_confirm_rollback(row, replica_id, lsn); +} + int xrow_to_iovec(const struct xrow_header *row, struct iovec *out) { diff --git a/src/box/xrow.h b/src/box/xrow.h index 2a0a9c852..1def394e7 100644 --- a/src/box/xrow.h +++ b/src/box/xrow.h @@ -207,6 +207,52 @@ int xrow_encode_dml(const struct request *request, struct region *region, struct iovec *iov); +/** + * Encode the CONFIRM to row body and set row type to + * IPROTO_CONFIRM. + * @param row xrow header. + * @param replica_id master's instance id. + * @param lsn last confirmed lsn. + * @retval -1 on error. + * @retval 0 success. + */ +int +xrow_encode_confirm(struct xrow_header *row, uint32_t replica_id, int64_t lsn); + +/** + * Decode the CONFIRM request body. + * @param row xrow header. + * @param[out] replica_id master's instance id. + * @param[out] lsn last confirmed lsn. + * @retval -1 on error. + * @retval 0 success. + */ +int +xrow_decode_confirm(struct xrow_header *row, uint32_t *replica_id, int64_t *lsn); + +/** + * Encode the ROLLBACK row body and set row type to + * IPROTO_ROLLBACK. + * @param row xrow header. + * @param replica_id master's instance id. + * @param lsn lsn to rollback to. + * @retval -1 on error. + * @retval 0 success. + */ +int +xrow_encode_rollback(struct xrow_header *row, uint32_t replica_id, int64_t lsn); + +/** + * Decode the ROLLBACK row body. + * @param row xrow header. + * @param[out] replica_id master's instance id. + * @param[out] lsn lsn to rollback to. + * @retval -1 on error. + * @retval 0 success. + */ +int +xrow_decode_rollback(struct xrow_header *row, uint32_t *replica_id, int64_t *lsn); + /** * CALL/EVAL request. */ -- 2.21.1 (Apple Git-122.3)
next prev parent reply other threads:[~2020-06-29 23:15 UTC|newest] Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top [not found] <cover.1593723973.git.sergeyb@tarantool.org> 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 00/19] Sync replication Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 01/19] replication: introduce space.is_sync option Vladislav Shpilevoy 2020-06-30 23:00 ` Vladislav Shpilevoy 2020-07-01 15:55 ` Sergey Ostanevich 2020-07-01 23:46 ` Vladislav Shpilevoy 2020-07-02 8:25 ` Serge Petrenko 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 10/19] txn_limbo: add ROLLBACK processing Vladislav Shpilevoy 2020-07-05 15:29 ` Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 11/19] box: rework local_recovery to use async txn_commit Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 12/19] replication: support ROLLBACK and CONFIRM during recovery Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 13/19] replication: add test for synchro CONFIRM/ROLLBACK Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 14/19] applier: remove writer_cond Vladislav Shpilevoy 2020-07-02 9:13 ` Serge Petrenko 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 15/19] applier: send heartbeat not only on commit, but on any write Vladislav Shpilevoy 2020-07-01 23:55 ` Vladislav Shpilevoy 2020-07-03 12:23 ` Serge Petrenko 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 16/19] txn_limbo: add diag_set in txn_limbo_wait_confirm Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 17/19] replication: delay initial join until confirmation Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 18/19] replication: only send confirmed data during final join Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 19/19] replication: block async transactions when not empty limbo Vladislav Shpilevoy 2020-07-01 17:12 ` Sergey Ostanevich 2020-07-01 23:47 ` Vladislav Shpilevoy 2020-07-03 12:28 ` Serge Petrenko 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 02/19] replication: introduce replication_synchro_* cfg options Vladislav Shpilevoy 2020-07-01 16:05 ` Sergey Ostanevich 2020-07-01 23:46 ` Vladislav Shpilevoy 2020-07-02 8:29 ` Serge Petrenko 2020-07-02 23:36 ` Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 03/19] txn: add TXN_WAIT_ACK flag Vladislav Shpilevoy 2020-07-01 17:14 ` Sergey Ostanevich 2020-07-01 23:46 ` Vladislav Shpilevoy 2020-07-02 8:30 ` Serge Petrenko 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 04/19] replication: make sync transactions wait quorum Vladislav Shpilevoy 2020-06-30 23:00 ` Vladislav Shpilevoy 2020-07-02 8:48 ` Serge Petrenko 2020-07-03 21:16 ` Vladislav Shpilevoy 2020-07-05 16:05 ` Vladislav Shpilevoy 2020-06-29 23:15 ` Vladislav Shpilevoy [this message] 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 06/19] txn: introduce various reasons for txn rollback Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 07/19] replication: write and read CONFIRM entries Vladislav Shpilevoy 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 08/19] replication: add support of qsync to the snapshot machinery Vladislav Shpilevoy 2020-07-02 8:52 ` Serge Petrenko 2020-07-08 11:43 ` Leonid Vasiliev 2020-06-29 23:15 ` [Tarantool-patches] [PATCH v2 09/19] txn_limbo: add timeout when waiting for acks Vladislav Shpilevoy 2020-06-29 23:22 ` [Tarantool-patches] [PATCH v2 00/19] Sync replication Vladislav Shpilevoy 2020-06-30 23:00 ` [Tarantool-patches] [PATCH v2 20/19] replication: add test for quorum 1 Vladislav Shpilevoy 2020-07-03 12:32 ` Serge Petrenko 2020-07-02 21:13 ` [Tarantool-patches] [PATCH 1/4] replication: regression test on gh-5119 [not fixed] sergeyb 2020-07-02 21:13 ` [Tarantool-patches] [PATCH 2/4] replication: add advanced tests for sync replication sergeyb 2020-07-02 22:46 ` Sergey Bronnikov 2020-07-02 23:20 ` Vladislav Shpilevoy 2020-07-06 12:30 ` Sergey Bronnikov 2020-07-06 23:31 ` Vladislav Shpilevoy 2020-07-07 12:12 ` Sergey Bronnikov 2020-07-07 20:57 ` Vladislav Shpilevoy 2020-07-08 12:07 ` Sergey Bronnikov 2020-07-08 22:13 ` Vladislav Shpilevoy 2020-07-09 9:39 ` Sergey Bronnikov 2020-07-02 21:13 ` [Tarantool-patches] [PATCH 3/4] replication: add tests for sync replication with anon replica sergeyb 2020-07-06 23:31 ` Vladislav Shpilevoy 2020-07-02 21:13 ` [Tarantool-patches] [PATCH 4/4] replication: add tests for sync replication with snapshots sergeyb 2020-07-02 22:46 ` Sergey Bronnikov 2020-07-02 23:20 ` Vladislav Shpilevoy 2020-07-06 23:31 ` Vladislav Shpilevoy 2020-07-07 16:00 ` Sergey Bronnikov 2020-07-06 23:31 ` [Tarantool-patches] [PATCH] Add new error injection constant ERRINJ_SYNC_TIMEOUT Vladislav Shpilevoy 2020-07-10 0:50 ` [Tarantool-patches] [PATCH v2 00/19] Sync replication Vladislav Shpilevoy 2020-07-10 7:40 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=3153acf5a8f5e34411a44069e8681357abc4b3a7.1593472477.git.v.shpilevoy@tarantool.org \ --to=v.shpilevoy@tarantool.org \ --cc=sergepetrenko@tarantool.org \ --cc=tarantool-patches@dev.tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v2 05/19] xrow: introduce CONFIRM and ROLLBACK entries' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox