From mboxrd@z Thu Jan 1 00:00:00 1970
From: Serge Petrenko via Tarantool-patches
Reply-To: Serge Petrenko
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com
Cc: tarantool-patches@dev.tarantool.org
Date: Fri, 16 Apr 2021 19:25:34 +0300
Message-Id: <7f16f58b3274d2f4e07332b59e4243240a455d2b.1618590211.git.sergepetrenko@tarantool.org>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Tarantool development patches
Subject: [Tarantool-patches] [PATCH v4 03/12] xrow: introduce a PROMOTE entry

A PROMOTE entry combines the effect of CONFIRM, ROLLBACK and RAFT_TERM
entries with some additional semantics on top.

PROMOTE carries the following arguments:

1) former_leader_id - the id of the previous limbo owner whose entries
   we want to confirm.
2) confirm_lsn - the lsn of the last former leader's transaction to be
   confirmed. In this sense PROMOTE(confirm_lsn) replaces
   CONFIRM(confirm_lsn) + ROLLBACK(confirm_lsn + 1).
3) replica_id - the id of the instance issuing
   `box.ctl.clear_synchro_queue()`.
4) term - the new term the instance issuing
   `box.ctl.clear_synchro_queue()` has just entered.

This entry will be written to WAL instead of the usual CONFIRM + ROLLBACK
pair on a successful `box.ctl.clear_synchro_queue()` call.

Note, the usual CONFIRM and ROLLBACK occurrences (after a confirmed or
rolled back synchronous transaction) are here to stay.

Part of #5445
---
 src/box/iproto_constants.h | 26 ++++++++++++++++++++--
 src/box/txn_limbo.c        |  4 ++--
 src/box/xrow.c             | 45 ++++++++++++++++++++++++--------------
 src/box/xrow.h             | 31 ++++++++++++++------------
 4 files changed, 71 insertions(+), 35 deletions(-)

diff --git a/src/box/iproto_constants.h b/src/box/iproto_constants.h
index e9d1ef5d6..99c8ca184 100644
--- a/src/box/iproto_constants.h
+++ b/src/box/iproto_constants.h
@@ -132,6 +132,18 @@ enum iproto_key {
 	IPROTO_REPLICA_ANON = 0x50,
 	IPROTO_ID_FILTER = 0x51,
 	IPROTO_ERROR = 0x52,
+	/**
+	 * Term. Has the same meaning as IPROTO_RAFT_TERM, but is an iproto
+	 * key, rather than a raft key. Used for PROMOTE request, which needs
+	 * both iproto (e.g. REPLICA_ID) and raft (RAFT_TERM) keys.
+	 */
+	IPROTO_TERM = 0x53,
+	/*
+	 * Be careful to not extend iproto_key values over 0x7f.
+	 * iproto_keys are encoded in msgpack as positive fixnum, which ends at
+	 * 0x7f, and we rely on this in some places by allocating a uint8_t to
+	 * hold a msgpack-encoded key value.
+	 */
 	IPROTO_KEY_MAX
 };
 
@@ -226,6 +238,8 @@ enum iproto_type {
 	IPROTO_TYPE_STAT_MAX,
 
 	IPROTO_RAFT = 30,
+	/** PROMOTE request. */
+	IPROTO_PROMOTE = 31,
 
 	/** A confirmation message for synchronous transactions. */
 	IPROTO_CONFIRM = 40,
@@ -340,11 +354,19 @@ dml_request_key_map(uint16_t type)
 	return iproto_body_key_map[type];
 }
 
-/** CONFIRM/ROLLBACK entries for synchronous replication. */
+/** Synchronous replication entries: CONFIRM/ROLLBACK/PROMOTE. */
 static inline bool
 iproto_type_is_synchro_request(uint16_t type)
 {
-	return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK;
+	return type == IPROTO_CONFIRM || type == IPROTO_ROLLBACK ||
+	       type == IPROTO_PROMOTE;
+}
+
+/** PROMOTE entry (synchronous replication and leader elections). */
+static inline bool
+iproto_type_is_promote_request(uint32_t type)
+{
+	return type == IPROTO_PROMOTE;
 }
 
 static inline bool
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index addcb0f97..c96e497c6 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -331,7 +331,7 @@ txn_limbo_write_synchro(struct txn_limbo *limbo, uint16_t type, int64_t lsn)
 	 * This is a synchronous commit so we can
 	 * allocate everything on a stack.
 	 */
-	struct synchro_body_bin body;
+	char body[XROW_SYNCHRO_BODY_LEN_MAX];
 	struct xrow_header row;
 	char buf[sizeof(struct journal_entry) +
 		 sizeof(struct xrow_header *)];
@@ -339,7 +339,7 @@ txn_limbo_write_synchro(struct txn_limbo *limbo, uint16_t type, int64_t lsn)
 	struct journal_entry *entry = (struct journal_entry *)buf;
 	entry->rows[0] = &row;
 
-	xrow_encode_synchro(&row, &body, &req);
+	xrow_encode_synchro(&row, body, &req);
 
 	journal_entry_create(entry, 1, xrow_approx_len(&row),
 			     txn_limbo_write_cb, fiber());
diff --git a/src/box/xrow.c b/src/box/xrow.c
index 35e1d1c20..2e364cea5 100644
--- a/src/box/xrow.c
+++ b/src/box/xrow.c
@@ -885,28 +885,33 @@ xrow_encode_dml(const struct request *request, struct region *region,
 }
 
 void
-xrow_encode_synchro(struct xrow_header *row,
-		    struct synchro_body_bin *body,
+xrow_encode_synchro(struct xrow_header *row, char *body,
 		    const struct synchro_request *req)
 {
-	/*
-	 * A map with two elements. We don't compress
-	 * numbers to have this structure constant in size,
-	 * which allows us to preallocate it on stack.
-	 */
-	body->m_body = 0x80 | 2;
-	body->k_replica_id = IPROTO_REPLICA_ID;
-	body->m_replica_id = 0xce;
-	body->v_replica_id = mp_bswap_u32(req->replica_id);
-	body->k_lsn = IPROTO_LSN;
-	body->m_lsn = 0xcf;
-	body->v_lsn = mp_bswap_u64(req->lsn);
+	assert(iproto_type_is_synchro_request(req->type));
 
-	memset(row, 0, sizeof(*row));
+	char *pos = body;
+
+	pos = mp_encode_map(pos,
+			    iproto_type_is_promote_request(req->type) ? 3 : 2);
+	pos = mp_encode_uint(pos, IPROTO_REPLICA_ID);
+	pos = mp_encode_uint(pos, req->replica_id);
+
+	pos = mp_encode_uint(pos, IPROTO_LSN);
+	pos = mp_encode_uint(pos, req->lsn);
+
+	if (iproto_type_is_promote_request(req->type)) {
+		pos = mp_encode_uint(pos, IPROTO_TERM);
+		pos = mp_encode_uint(pos, req->term);
+	}
+
+	assert(pos - body < XROW_SYNCHRO_BODY_LEN_MAX);
+
+	memset(row, 0, sizeof(*row));
 
 	row->type = req->type;
-	row->body[0].iov_base = (void *)body;
-	row->body[0].iov_len = sizeof(*body);
+	row->body[0].iov_base = body;
+	row->body[0].iov_len = pos - body;
 	row->bodycnt = 1;
 }
 
@@ -952,11 +957,17 @@ xrow_decode_synchro(const struct xrow_header *row, struct synchro_request *req)
 		case IPROTO_LSN:
 			req->lsn = mp_decode_uint(&d);
 			break;
+		case IPROTO_TERM:
+			req->term = mp_decode_uint(&d);
+			break;
 		default:
 			mp_next(&d);
 		}
 	}
 
+	req->type = row->type;
+	req->origin_id = row->replica_id;
+
 	return 0;
 }
 
diff --git a/src/box/xrow.h b/src/box/xrow.h
index 5ea99e792..b3c664be2 100644
--- a/src/box/xrow.h
+++ b/src/box/xrow.h
@@ -49,6 +49,7 @@ enum {
 	XROW_IOVMAX = XROW_HEADER_IOVMAX + XROW_BODY_IOVMAX,
 	XROW_HEADER_LEN_MAX = 52,
 	XROW_BODY_LEN_MAX = 256,
+	XROW_SYNCHRO_BODY_LEN_MAX = 32,
 	IPROTO_HEADER_LEN = 28,
 	/** 7 = sizeof(iproto_body_bin). */
 	IPROTO_SELECT_HEADER_LEN = IPROTO_HEADER_LEN + 7,
@@ -226,7 +227,10 @@ xrow_encode_dml(const struct request *request, struct region *region,
  * pending synchronous transactions.
  */
 struct synchro_request {
-	/** Operation type - IPROTO_ROLLBACK or IPROTO_CONFIRM. */
+	/**
+	 * Operation type - either IPROTO_ROLLBACK or IPROTO_CONFIRM or
+	 * IPROTO_PROMOTE
+	 */
 	uint16_t type;
 	/**
 	 * ID of the instance owning the pending transactions.
@@ -236,25 +240,25 @@ struct synchro_request {
 	 * finish transactions of an old master.
 	 */
 	uint32_t replica_id;
+	/**
+	 * Id of the instance which has issued this request. Only filled on
+	 * decoding, and left blank when encoding a request.
+	 */
+	uint32_t origin_id;
 	/**
 	 * Operation LSN.
 	 * In case of CONFIRM it means 'confirm all
 	 * transactions with lsn <= this value'.
 	 * In case of ROLLBACK it means 'rollback all transactions
 	 * with lsn >= this value'.
+	 * In case of PROMOTE it means CONFIRM(lsn) + ROLLBACK(lsn+1)
 	 */
 	int64_t lsn;
-};
-
-/** Synchro request xrow's body in MsgPack format. */
-struct PACKED synchro_body_bin {
-	uint8_t m_body;
-	uint8_t k_replica_id;
-	uint8_t m_replica_id;
-	uint32_t v_replica_id;
-	uint8_t k_lsn;
-	uint8_t m_lsn;
-	uint64_t v_lsn;
+	/**
+	 * The new term the instance issuing this request is in. Only used for
+	 * PROMOTE request.
+	 */
+	uint64_t term;
 };
 
 /**
@@ -264,8 +268,7 @@ struct PACKED synchro_body_bin {
  * @param req Request parameters.
  */
 void
-xrow_encode_synchro(struct xrow_header *row,
-		    struct synchro_body_bin *body,
+xrow_encode_synchro(struct xrow_header *row, char *body,
 		    const struct synchro_request *req);
 
 /**
-- 
2.24.3 (Apple Git-128)
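
A note on the XROW_SYNCHRO_BODY_LEN_MAX = 32 bound: the patch switches the
synchro body from a fixed-size packed struct to variable-length msgpack, so
the stack buffer has to cover the worst case. The sketch below is not the
patch's code and does not use msgpuck; the sketch_* names are hypothetical,
the uint encoder is a hand-rolled stand-in that follows the MessagePack
unsigned integer formats, and the numeric values for the REPLICA_ID and LSN
keys are placeholders (only IPROTO_TERM = 0x53 comes from the diff). Under
those assumptions it shows that a PROMOTE body with maximal field values
takes 27 bytes, which fits in 32.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the iproto keys. Only the TERM value is taken
 * from the diff; the other two are placeholders. All that matters for the
 * size estimate is that every key is <= 0x7f (a one-byte positive fixint).
 */
enum {
	SKETCH_KEY_REPLICA_ID = 0x02,
	SKETCH_KEY_LSN = 0x03,
	SKETCH_KEY_TERM = 0x53,
	SKETCH_BODY_LEN_MAX = 32,
};

/* Encode an unsigned integer in the shortest MessagePack form. */
static char *
sketch_encode_uint(char *p, uint64_t v)
{
	int nbytes;
	if (v <= 0x7f) {
		*p++ = (char)v;		/* positive fixint, 1 byte total */
		return p;
	} else if (v <= 0xff) {
		*p++ = (char)0xcc;	/* uint 8 */
		nbytes = 1;
	} else if (v <= 0xffff) {
		*p++ = (char)0xcd;	/* uint 16 */
		nbytes = 2;
	} else if (v <= 0xffffffff) {
		*p++ = (char)0xce;	/* uint 32 */
		nbytes = 4;
	} else {
		*p++ = (char)0xcf;	/* uint 64 */
		nbytes = 8;
	}
	/* Payload is big-endian. */
	for (int i = nbytes - 1; i >= 0; i--)
		*p++ = (char)((v >> (8 * i)) & 0xff);
	return p;
}

/*
 * Encode a synchro body the way the patch describes it: a map of 2 pairs
 * (CONFIRM/ROLLBACK) or 3 pairs (PROMOTE). Returns the encoded length.
 * The caller must supply at least SKETCH_BODY_LEN_MAX bytes.
 */
static size_t
sketch_encode_synchro_body(char *body, int is_promote, uint32_t replica_id,
			   int64_t lsn, uint64_t term)
{
	char *pos = body;
	*pos++ = (char)(0x80 | (is_promote ? 3 : 2));	/* fixmap header */
	pos = sketch_encode_uint(pos, SKETCH_KEY_REPLICA_ID);
	pos = sketch_encode_uint(pos, replica_id);
	pos = sketch_encode_uint(pos, SKETCH_KEY_LSN);
	pos = sketch_encode_uint(pos, (uint64_t)lsn);
	if (is_promote) {
		pos = sketch_encode_uint(pos, SKETCH_KEY_TERM);
		pos = sketch_encode_uint(pos, term);
	}
	return (size_t)(pos - body);
}
```

Calling sketch_encode_synchro_body(buf, 1, UINT32_MAX, INT64_MAX, UINT64_MAX)
on a 32-byte buffer yields 1 (map) + 1 + 5 (replica_id) + 1 + 9 (lsn) +
1 + 9 (term) = 27 bytes, so the strict `<` in the patch's assert still holds
in the worst case.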