From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp50.i.mail.ru (smtp50.i.mail.ru [94.100.177.110])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id 67B42469711
	for ; Mon, 25 May 2020 13:59:13 +0300 (MSK)
From: Serge Petrenko
Date: Mon, 25 May 2020 13:58:55 +0300
Message-Id: <89a689f86a2d81e9fcd424375ef6d71432b2d720.1590403792.git.sergepetrenko@tarantool.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [Tarantool-patches] [PATCH v2 1/2] wal: fix tx boundaries
List-Id: Tarantool development patches
To: v.shpilevoy@tarantool.org, gorcunov@gmail.com, kostja.osipov@gmail.com
Cc: tarantool-patches@dev.tarantool.org

In order to preserve transaction boundaries in the replication
protocol, wal assigns each tx row a transaction sequence number (tsn).
The tsn is equal to the lsn of the first transaction row.

Starting with commit 7eb4650eecf1ac382119d0038076c19b2708f4a1, local
space requests are assigned a special replica id, 0, and have their own
lsns. These operations are not replicated.

If a transaction starting with a local space operation ends up in the
WAL, it gets a tsn equal to the lsn of the local space request. Then,
during replication, the local space request is omitted, and the replica
receives only the global part of the transaction, with a seemingly
random tsn. This yields an ER_PROTOCOL error: "Transaction id must be
equal to LSN of the first row in the transaction".

To fix the problem, assign tsn equal to the lsn of the first global row
in the transaction. For fully local transactions, assign tsn as before.

Follow-up #4114
Part-of #4928
Reviewed-by: Cyrill Gorcunov
---
 src/box/wal.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/box/wal.c b/src/box/wal.c
index b979244e3..ef4d84920 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -956,25 +956,37 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
 	       struct xrow_header **end)
 {
 	int64_t tsn = 0;
+	struct xrow_header **start = row;
+	struct xrow_header **first_glob_row = row;
 	/** Assign LSN to all local rows. */
 	for ( ; row < end; row++) {
 		if ((*row)->replica_id == 0) {
 			/*
 			 * All rows representing local space data
-			 * manipulations are signed wth a zero
+			 * manipulations are signed with a zero
 			 * instance id. This is also true for
 			 * anonymous replicas, since they are
 			 * only capable of writing to local and
 			 * temporary spaces.
 			 */
-			if ((*row)->group_id != GROUP_LOCAL)
+			if ((*row)->group_id != GROUP_LOCAL) {
 				(*row)->replica_id = instance_id;
+			}
 
 			(*row)->lsn = vclock_inc(vclock_diff, (*row)->replica_id) +
 				      vclock_get(base, (*row)->replica_id);
-			/* Use lsn of the first local row as transaction id. */
-			tsn = tsn == 0 ? (*row)->lsn : tsn;
-			(*row)->tsn = tsn;
+			/*
+			 * Use lsn of the first global row as
+			 * transaction id.
+			 */
+			if ((*row)->group_id != GROUP_LOCAL && tsn == 0) {
+				tsn = (*row)->lsn;
+				/*
+				 * Remember the tail being processed.
+				 */
+				first_glob_row = row;
+			}
+			(*row)->tsn = tsn == 0 ? (*start)->lsn : tsn;
 			(*row)->is_commit = row == end - 1;
 		} else {
 			int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id);
@@ -993,6 +1005,14 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
 			}
 		}
 	}
+
+	/*
+	 * Fill transaction id for all the local rows preceding
+	 * the first global row. tsn was yet unknown when those
+	 * rows were processed.
+	 */
+	for (row = start; row < first_glob_row; row++)
+		(*row)->tsn = tsn;
 }
 
 static void
-- 
2.24.2 (Apple Git-127)
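
For readers who want to check the rule from the commit message in isolation, here is a minimal standalone sketch. It is not the code from src/box/wal.c: struct sketch_row, the SKETCH_GROUP_* constants and both functions are hypothetical stand-ins, lsns are assumed to be already assigned, and the tsn is computed in two passes for clarity, whereas the patch does a single pass and then backfills the leading local rows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants, standing in for the real group ids. */
enum { SKETCH_GROUP_DEFAULT = 0, SKETCH_GROUP_LOCAL = 1 };

/* Hypothetical, simplified stand-in for struct xrow_header. */
struct sketch_row {
	int group_id;	/* SKETCH_GROUP_LOCAL for local space rows */
	int64_t lsn;	/* assumed to be already assigned */
	int64_t tsn;	/* filled in by sketch_assign_tsn() */
};

/*
 * Give every row of one transaction the same tsn: the lsn of
 * the first global row, or the lsn of the first row if the
 * transaction is fully local.
 */
void
sketch_assign_tsn(struct sketch_row *rows, size_t count)
{
	if (count == 0)
		return;
	/* Fallback for a fully local transaction. */
	int64_t tsn = rows[0].lsn;
	for (size_t i = 0; i < count; i++) {
		if (rows[i].group_id != SKETCH_GROUP_LOCAL) {
			tsn = rows[i].lsn;
			break;
		}
	}
	for (size_t i = 0; i < count; i++)
		rows[i].tsn = tsn;
}

/* Exercise the two cases described in the commit message. */
void
sketch_self_check(void)
{
	/* A local row followed by two global rows. */
	struct sketch_row mixed[] = {
		{ SKETCH_GROUP_LOCAL, 5, 0 },
		{ SKETCH_GROUP_DEFAULT, 10, 0 },
		{ SKETCH_GROUP_DEFAULT, 11, 0 },
	};
	sketch_assign_tsn(mixed, 3);
	for (size_t i = 0; i < 3; i++)
		assert(mixed[i].tsn == 10);

	/* A fully local transaction keeps the old behaviour. */
	struct sketch_row local[] = {
		{ SKETCH_GROUP_LOCAL, 5, 0 },
		{ SKETCH_GROUP_LOCAL, 6, 0 },
	};
	sketch_assign_tsn(local, 2);
	assert(local[0].tsn == 5 && local[1].tsn == 5);
}
```

In the mixed case a replica that never sees the local row still receives a first row whose lsn equals the tsn, which is the property the ER_PROTOCOL check quoted above relies on.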