Tarantool development patches archive
* [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework
@ 2020-07-02 13:39 Serge Petrenko
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 1/2] wal: fix tx boundaries Serge Petrenko
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Serge Petrenko @ 2020-07-02 13:39 UTC (permalink / raw)
  To: gorcunov, v.shpilevoy; +Cc: tarantool-patches

https://github.com/tarantool/tarantool/issues/4928
https://github.com/tarantool/tarantool/tree/sp/gh-4928-tx-boundary-fix-nop

Changes in v3:
  - Instead of reordering rows in WAL, append
    a dummy NOP row to the tx end when needed.

Serge Petrenko (2):
  wal: fix tx boundaries
  replication: append NOP as the last tx row

 src/box/txn.c                                 |  23 ++-
 src/box/wal.c                                 |  27 +++-
 test/replication/gh-4928-tx-boundaries.result | 138 ++++++++++++++++++
 .../gh-4928-tx-boundaries.test.lua            |  61 ++++++++
 test/replication/suite.cfg                    |   1 +
 5 files changed, 245 insertions(+), 5 deletions(-)
 create mode 100644 test/replication/gh-4928-tx-boundaries.result
 create mode 100644 test/replication/gh-4928-tx-boundaries.test.lua

-- 
2.24.3 (Apple Git-128)

* [Tarantool-patches] [PATCH v3 1/2] wal: fix tx boundaries
  2020-07-02 13:39 [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework Serge Petrenko
@ 2020-07-02 13:39 ` Serge Petrenko
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row Serge Petrenko
  2020-07-06  7:47 ` [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework Kirill Yukhin
  2 siblings, 0 replies; 12+ messages in thread
From: Serge Petrenko @ 2020-07-02 13:39 UTC (permalink / raw)
  To: gorcunov, v.shpilevoy; +Cc: tarantool-patches

In order to preserve transaction boundaries in the replication protocol,
WAL assigns each tx row a transaction sequence number (tsn). The tsn is
equal to the lsn of the first transaction row.

Starting with commit 7eb4650eecf1ac382119d0038076c19b2708f4a1, local
space requests are assigned a special replica id, 0, and have their own
lsns. These operations are not replicated.

If a transaction starting with a local space operation ends up in the
WAL, it gets a tsn equal to the lsn of the local space request. Then,
when such a transaction is replicated, the local space request is
omitted, and the replica receives the global part of the transaction
with a seemingly random tsn, yielding an ER_PROTOCOL error:
"Transaction id must be equal to LSN of the first row in the transaction".

To fix the problem, assign the tsn equal to the lsn of the first global
row in the transaction, and assign the tsn as before for fully local
transactions.
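
For illustration, consider a tx with one local and two global rows (the
lsn values here are made up):

    row      replica_id   lsn   tsn before fix   tsn after fix
    local    0            7     7                15
    global   1            15    7                15
    global   1            16    7                15

The replica receives only the two global rows, so with the old tsn the
first received row's lsn (15) does not match the tsn (7), triggering
exactly the ER_PROTOCOL error above.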

Follow-up #4114
Part-of #4928

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/box/wal.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/box/wal.c b/src/box/wal.c
index 74cc74684..ef89733ed 100644
--- a/src/box/wal.c
+++ b/src/box/wal.c
@@ -955,12 +955,14 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
 	       struct xrow_header **end)
 {
 	int64_t tsn = 0;
+	struct xrow_header **start = row;
+	struct xrow_header **first_glob_row = row;
 	/** Assign LSN to all local rows. */
 	for ( ; row < end; row++) {
 		if ((*row)->replica_id == 0) {
 			/*
 			 * All rows representing local space data
-			 * manipulations are signed wth a zero
+			 * manipulations are signed with a zero
 			 * instance id. This is also true for
 			 * anonymous replicas, since they are
 			 * only capable of writing to local and
@@ -971,9 +973,18 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
 
 			(*row)->lsn = vclock_inc(vclock_diff, (*row)->replica_id) +
 				      vclock_get(base, (*row)->replica_id);
-			/* Use lsn of the first local row as transaction id. */
-			tsn = tsn == 0 ? (*row)->lsn : tsn;
-			(*row)->tsn = tsn;
+			/*
+			 * Use lsn of the first global row as
+			 * transaction id.
+			 */
+			if ((*row)->group_id != GROUP_LOCAL && tsn == 0) {
+				tsn = (*row)->lsn;
+				/*
+			 * Remember the first global row.
+				 */
+				first_glob_row = row;
+			}
+			(*row)->tsn = tsn == 0 ? (*start)->lsn : tsn;
 			(*row)->is_commit = row == end - 1;
 		} else {
 			int64_t diff = (*row)->lsn - vclock_get(base, (*row)->replica_id);
@@ -992,6 +1003,14 @@ wal_assign_lsn(struct vclock *vclock_diff, struct vclock *base,
 			}
 		}
 	}
+
+	/*
+	 * Fill transaction id for all the local rows preceding
+	 * the first global row. tsn was yet unknown when those
+	 * rows were processed.
+	 */
+	for (row = start; row < first_glob_row; row++)
+		(*row)->tsn = tsn;
 }
 
 static void
-- 
2.24.3 (Apple Git-128)

* [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 13:39 [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework Serge Petrenko
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 1/2] wal: fix tx boundaries Serge Petrenko
@ 2020-07-02 13:39 ` Serge Petrenko
  2020-07-02 14:35   ` Cyrill Gorcunov
                     ` (2 more replies)
  2020-07-06  7:47 ` [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework Kirill Yukhin
  2 siblings, 3 replies; 12+ messages in thread
From: Serge Petrenko @ 2020-07-02 13:39 UTC (permalink / raw)
  To: gorcunov, v.shpilevoy; +Cc: tarantool-patches

Since we stopped sending local space operations in replication, the last
tx row has to be global in order to preserve tx boundaries on the
replica. If the last row happens to be a local one, the replica will
never receive the tx end marker, yielding the following error:
`ER_UNSUPPORTED: replication does not support interleaving
transactions`.

To fix the problem, append a global NOP row at the end of the tx if it
happens to end with a local row.
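
Schematically (row names here are illustrative), a tx ending with a
local row used to be written as

    [global g1][local l1, is_commit]

and l1, being local, is filtered out on replication, so the replica
never sees the tx end. With this patch the same tx is written as

    [global g1][local l1][NOP, is_commit]

where the NOP belongs to the default replication group, so it reaches
the replica and carries the tx end marker.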

Follow-up #4114
Closes #4928
---
 src/box/txn.c                                 |  23 ++-
 test/replication/gh-4928-tx-boundaries.result | 138 ++++++++++++++++++
 .../gh-4928-tx-boundaries.test.lua            |  61 ++++++++
 test/replication/suite.cfg                    |   1 +
 4 files changed, 222 insertions(+), 1 deletion(-)
 create mode 100644 test/replication/gh-4928-tx-boundaries.result
 create mode 100644 test/replication/gh-4928-tx-boundaries.test.lua

diff --git a/src/box/txn.c b/src/box/txn.c
index 123520166..9793109f3 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -35,6 +35,7 @@
 #include <fiber.h>
 #include "xrow.h"
 #include "errinj.h"
+#include "iproto_constants.h"
 
 double too_long_threshold;
 
@@ -488,7 +489,8 @@ txn_journal_entry_new(struct txn *txn)
 
 	assert(txn->n_new_rows + txn->n_applier_rows > 0);
 
-	req = journal_entry_new(txn->n_new_rows + txn->n_applier_rows,
+	/* Save space for an additional NOP row just in case. */
+	req = journal_entry_new(txn->n_new_rows + txn->n_applier_rows + 1,
 				&txn->region, txn);
 	if (req == NULL)
 		return NULL;
@@ -517,6 +519,25 @@ txn_journal_entry_new(struct txn *txn)
 	assert(remote_row == req->rows + txn->n_applier_rows);
 	assert(local_row == remote_row + txn->n_new_rows);
 
+	/*
+	 * Append a dummy NOP statement to preserve replication tx
+	 * boundaries when the last tx row is a local one.
+	 */
+	if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
+	    (*(local_row - 1))->group_id == GROUP_LOCAL) {
+		size_t size;
+		*local_row = region_alloc_object(&txn->region,
+						 typeof(**local_row), &size);
+		if (*local_row == NULL) {
+			diag_set(OutOfMemory, size, "region_alloc_object", "row");
+			return NULL;
+		}
+		memset(*local_row, 0, sizeof(**local_row));
+		(*local_row)->type = IPROTO_NOP;
+		(*local_row)->group_id = GROUP_DEFAULT;
+	} else
+		--req->n_rows;
+
 	return req;
 }
 
diff --git a/test/replication/gh-4928-tx-boundaries.result b/test/replication/gh-4928-tx-boundaries.result
new file mode 100644
index 000000000..969bd8438
--- /dev/null
+++ b/test/replication/gh-4928-tx-boundaries.result
@@ -0,0 +1,138 @@
+-- test-run result file version 2
+-- gh-4928. Test that transactions mixing local and global
+-- space operations are replicated correctly.
+env = require('test_run')
+ | ---
+ | ...
+test_run = env.new()
+ | ---
+ | ...
+bit = require('bit')
+ | ---
+ | ...
+
+-- Init.
+box.schema.user.grant('guest', 'replication')
+ | ---
+ | ...
+_ = box.schema.space.create('glob')
+ | ---
+ | ...
+_ = box.schema.space.create('loc', {is_local=true})
+ | ---
+ | ...
+_ = box.space.glob:create_index('pk')
+ | ---
+ | ...
+_ = box.space.loc:create_index('pk')
+ | ---
+ | ...
+
+function gen_mixed_tx(i)\
+    box.begin()\
+    if bit.band(i, 1) ~= 0 then\
+        box.space.glob:insert{10 * i + 1}\
+    else\
+        box.space.loc:insert{10 * i + 1}\
+    end\
+    if bit.band(i, 2) ~= 0 then\
+        box.space.glob:insert{10 * i + 2}\
+    else\
+        box.space.loc:insert{10 * i + 2}\
+    end\
+    if bit.band(i, 4) ~= 0 then\
+        box.space.glob:insert{10 * i + 3}\
+    else\
+        box.space.loc:insert{10 * i + 3}\
+    end\
+    box.commit()\
+end
+ | ---
+ | ...
+
+test_run:cmd("create server replica with rpl_master=default,\
+             script='replication/replica.lua'")
+ | ---
+ | - true
+ | ...
+test_run:cmd('start server replica')
+ | ---
+ | - true
+ | ...
+test_run:wait_downstream(2, {status='follow'})
+ | ---
+ | - true
+ | ...
+
+for i = 0, 7 do gen_mixed_tx(i) end
+ | ---
+ | ...
+
+box.info.replication[2].status
+ | ---
+ | - null
+ | ...
+
+vclock = box.info.vclock
+ | ---
+ | ...
+vclock[0] = nil
+ | ---
+ | ...
+test_run:wait_vclock("replica", vclock)
+ | ---
+ | ...
+
+test_run:cmd('switch replica')
+ | ---
+ | - true
+ | ...
+
+box.info.status
+ | ---
+ | - running
+ | ...
+box.info.replication[1].upstream.status
+ | ---
+ | - follow
+ | ...
+
+box.space.glob:select{}
+ | ---
+ | - - [11]
+ |   - [22]
+ |   - [31]
+ |   - [32]
+ |   - [43]
+ |   - [51]
+ |   - [53]
+ |   - [62]
+ |   - [63]
+ |   - [71]
+ |   - [72]
+ |   - [73]
+ | ...
+
+test_run:cmd('switch default')
+ | ---
+ | - true
+ | ...
+
+-- Cleanup.
+test_run:cmd('stop server replica')
+ | ---
+ | - true
+ | ...
+test_run:cmd('delete server replica')
+ | ---
+ | - true
+ | ...
+box.schema.user.revoke('guest', 'replication')
+ | ---
+ | ...
+box.space.loc:drop()
+ | ---
+ | ...
+box.space.glob:drop()
+ | ---
+ | ...
diff --git a/test/replication/gh-4928-tx-boundaries.test.lua b/test/replication/gh-4928-tx-boundaries.test.lua
new file mode 100644
index 000000000..92526fc51
--- /dev/null
+++ b/test/replication/gh-4928-tx-boundaries.test.lua
@@ -0,0 +1,61 @@
+-- gh-4928. Test that transactions mixing local and global
+-- space operations are replicated correctly.
+env = require('test_run')
+test_run = env.new()
+bit = require('bit')
+
+-- Init.
+box.schema.user.grant('guest', 'replication')
+_ = box.schema.space.create('glob')
+_ = box.schema.space.create('loc', {is_local=true})
+_ = box.space.glob:create_index('pk')
+_ = box.space.loc:create_index('pk')
+
+function gen_mixed_tx(i)\
+    box.begin()\
+    if bit.band(i, 1) ~= 0 then\
+        box.space.glob:insert{10 * i + 1}\
+    else\
+        box.space.loc:insert{10 * i + 1}\
+    end\
+    if bit.band(i, 2) ~= 0 then\
+        box.space.glob:insert{10 * i + 2}\
+    else\
+        box.space.loc:insert{10 * i + 2}\
+    end\
+    if bit.band(i, 4) ~= 0 then\
+        box.space.glob:insert{10 * i + 3}\
+    else\
+        box.space.loc:insert{10 * i + 3}\
+    end\
+    box.commit()\
+end
+
+test_run:cmd("create server replica with rpl_master=default,\
+             script='replication/replica.lua'")
+test_run:cmd('start server replica')
+test_run:wait_downstream(2, {status='follow'})
+
+for i = 0, 7 do gen_mixed_tx(i) end
+
+box.info.replication[2].status
+
+vclock = box.info.vclock
+vclock[0] = nil
+test_run:wait_vclock("replica", vclock)
+
+test_run:cmd('switch replica')
+
+box.info.status
+box.info.replication[1].upstream.status
+
+box.space.glob:select{}
+
+test_run:cmd('switch default')
+
+-- Cleanup.
+test_run:cmd('stop server replica')
+test_run:cmd('delete server replica')
+box.schema.user.revoke('guest', 'replication')
+box.space.loc:drop()
+box.space.glob:drop()
diff --git a/test/replication/suite.cfg b/test/replication/suite.cfg
index d2743b5ed..f357b07da 100644
--- a/test/replication/suite.cfg
+++ b/test/replication/suite.cfg
@@ -18,6 +18,7 @@
     "gh-4606-admin-creds.test.lua": {},
     "gh-4739-vclock-assert.test.lua": {},
     "gh-4730-applier-rollback.test.lua": {},
+    "gh-4928-tx-boundaries.test.lua": {},
     "*": {
         "memtx": {"engine": "memtx"},
         "vinyl": {"engine": "vinyl"}
-- 
2.24.3 (Apple Git-128)

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row Serge Petrenko
@ 2020-07-02 14:35   ` Cyrill Gorcunov
  2020-07-02 14:49     ` Cyrill Gorcunov
  2020-07-02 15:37   ` Cyrill Gorcunov
  2020-07-02 20:30   ` Vladislav Shpilevoy
  2 siblings, 1 reply; 12+ messages in thread
From: Cyrill Gorcunov @ 2020-07-02 14:35 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

On Thu, Jul 02, 2020 at 04:39:51PM +0300, Serge Petrenko wrote:
...
> +	/*
> +	 * Append a dummy NOP statement to preserve replication tx
> +	 * boundaries when the last tx row is a local one.
> +	 */
> +	if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
> +	    (*(local_row - 1))->group_id == GROUP_LOCAL) {
> +		size_t size;
> +		*local_row = region_alloc_object(&txn->region,
> +						 typeof(**local_row), &size);
> +		if (*local_row == NULL) {
> +			diag_set(OutOfMemory, size, "region_alloc_object", "row");
> +			return NULL;
> +		}
> +		memset(*local_row, 0, sizeof(**local_row));
> +		(*local_row)->type = IPROTO_NOP;
> +		(*local_row)->group_id = GROUP_DEFAULT;
> +	} else
> +		--req->n_rows;

Serge, do we really need to allocate a _new_ xheader here? Can't we
simply create a static one and reuse it all the time?

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 14:35   ` Cyrill Gorcunov
@ 2020-07-02 14:49     ` Cyrill Gorcunov
  0 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2020-07-02 14:49 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

On Thu, Jul 02, 2020 at 05:35:21PM +0300, Cyrill Gorcunov wrote:
> On Thu, Jul 02, 2020 at 04:39:51PM +0300, Serge Petrenko wrote:
> ...
> Serge, do we really need to allocate _new_ xheader here? Can't we simply
> create static one and reuse it all the time?

Won't work. Drop this mail.

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row Serge Petrenko
  2020-07-02 14:35   ` Cyrill Gorcunov
@ 2020-07-02 15:37   ` Cyrill Gorcunov
  2020-07-02 20:30   ` Vladislav Shpilevoy
  2 siblings, 0 replies; 12+ messages in thread
From: Cyrill Gorcunov @ 2020-07-02 15:37 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

On Thu, Jul 02, 2020 at 04:39:51PM +0300, Serge Petrenko wrote:
> Since we stopped sending local space operations in replication, the last
> tx row has to be global in order to preserve tx boundaries on replica.
> If the last row happens to be a local one, replica will never receive
> the tx end marker, yielding the following errors:
> `ER_UNSUPPORTED: replication does not support interleaving
> transactions`.
> 
> In order to fix the problem append a global NOP row at the tx end if
> it happens to end on a local row.
> 
> Follow-up #4114
> Closes #4928

I've been trying to find a better place for this allocation (or even to
use a preallocated static one), but since we need to assign the lsn
later in the WAL code, none of the ideas work. So let's keep it this
way.

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row Serge Petrenko
  2020-07-02 14:35   ` Cyrill Gorcunov
  2020-07-02 15:37   ` Cyrill Gorcunov
@ 2020-07-02 20:30   ` Vladislav Shpilevoy
  2020-07-03 11:24     ` Serge Petrenko
  2 siblings, 1 reply; 12+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-02 20:30 UTC (permalink / raw)
  To: Serge Petrenko, gorcunov; +Cc: tarantool-patches

Hi! Thanks for the patch!

See 3 comments below.

> diff --git a/src/box/txn.c b/src/box/txn.c
> index 123520166..9793109f3 100644
> --- a/src/box/txn.c
> +++ b/src/box/txn.c
> @@ -517,6 +519,25 @@ txn_journal_entry_new(struct txn *txn)
>  	assert(remote_row == req->rows + txn->n_applier_rows);
>  	assert(local_row == remote_row + txn->n_new_rows);
>  
> +	/*
> +	 * Append a dummy NOP statement to preserve replication tx
> +	 * boundaries when the last tx row is a local one.
> +	 */
> +	if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&

1. Why do you need the 'txn->n_local_rows != txn->n_new_rows' check?
Looks like local rows are always at the end. So if their count > 0,
you can look at *(local_row - 1) without any other checks, no?

Btw, I just noticed that we already reorder rows: regardless of
the order in which they were applied, txn_journal_entry_new()
moves all remote rows first and local rows second. This looks
like the same problem I was talking about when we discussed the
local reorder and gave a FOREIGN KEY example.

It means it is already broken, in theory. And perhaps that should be
changed.

If we consider it not broken, then why can't we put local rows first
and remote rows second? Then we don't need the NOP here.

> +	    (*(local_row - 1))->group_id == GROUP_LOCAL) {
> +		size_t size;
> +		*local_row = region_alloc_object(&txn->region,
> +						 typeof(**local_row), &size);
> +		if (*local_row == NULL) {
> +			diag_set(OutOfMemory, size, "region_alloc_object", "row");

2. Out of 80.

> +			return NULL;
> +		}
> +		memset(*local_row, 0, sizeof(**local_row));
> +		(*local_row)->type = IPROTO_NOP;
> +		(*local_row)->group_id = GROUP_DEFAULT;
> +	} else
> +		--req->n_rows;

3. Let's add {} to the 'else' branch, to be consistent with the 'if'
branch.

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-02 20:30   ` Vladislav Shpilevoy
@ 2020-07-03 11:24     ` Serge Petrenko
  2020-07-03 21:29       ` Vladislav Shpilevoy
  0 siblings, 1 reply; 12+ messages in thread
From: Serge Petrenko @ 2020-07-03 11:24 UTC (permalink / raw)
  To: Vladislav Shpilevoy, gorcunov; +Cc: tarantool-patches


On 02.07.2020 23:30, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> See 3 comments below.

Thanks for the review!

Please find my answers and diff below.

>
>> diff --git a/src/box/txn.c b/src/box/txn.c
>> index 123520166..9793109f3 100644
>> --- a/src/box/txn.c
>> +++ b/src/box/txn.c
>> @@ -517,6 +519,25 @@ txn_journal_entry_new(struct txn *txn)
>>   	assert(remote_row == req->rows + txn->n_applier_rows);
>>   	assert(local_row == remote_row + txn->n_new_rows);
>>   
>> +	/*
>> +	 * Append a dummy NOP statement to preserve replication tx
>> +	 * boundaries when the last tx row is a local one.
>> +	 */
>> +	if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
> 1. Why do you need the 'txn->n_local_rows != txn->n_new_rows' check?
> Looks like local rows are always at the end. So if their count > 0,
> you can look at *(local_row - 1) without any other checks, no?

There are n_applier_rows, n_new_rows, and n_local_rows.
A txn has a total of n_applier_rows + n_new_rows rows.
First come the n_applier_rows of remote rows, if any, then
n_new_rows "local" rows. Here local means that they're not
from a remote instance.
So, there are two things we need to check: that the last row
is related to a local space, and that the transaction isn't fully local.
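
Schematically (a sketch of the layout just described, not a dump of the
actual structure):

    req->rows: [ remote (applier) rows | new rows, in original order ]
                 n_applier_rows          n_new_rows in total, of which
                                         n_local_rows touch local spaces

That's why it's the group_id check on *(local_row - 1) that actually
tells whether the last row is a local-space one.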

I've just found out I made a mistake.

Looks like this is the correct check:

if (txn->n_local_rows > 0 &&
    (txn->n_local_rows != txn->n_new_rows || txn->n_applier_rows > 0) &&
    (*(local_row - 1))->group_id == GROUP_LOCAL) {

>
> Btw, I just noticed that we already reorder rows: regardless of
> the order in which they were applied, txn_journal_entry_new()
> moves all remote rows first and local rows second. This looks
> like the same problem I was talking about when we discussed the
> local reorder and gave a FOREIGN KEY example.
>
> It means it is already broken, in theory. And perhaps that should be
> changed.
>
> If we consider it not broken, then why can't we put local rows first
> and remote rows second? Then we don't need the NOP here.

We need a NOP when there are no remote rows anyway, since we decided
to not reorder local and global rows.

>
>> +	    (*(local_row - 1))->group_id == GROUP_LOCAL) {
>> +		size_t size;
>> +		*local_row = region_alloc_object(&txn->region,
>> +						 typeof(**local_row), &size);
>> +		if (*local_row == NULL) {
>> +			diag_set(OutOfMemory, size, "region_alloc_object", "row");
> 2. Out of 80.
Fixed.
>
>> +			return NULL;
>> +		}
>> +		memset(*local_row, 0, sizeof(**local_row));
>> +		(*local_row)->type = IPROTO_NOP;
>> +		(*local_row)->group_id = GROUP_DEFAULT;
>> +	} else
>> +		--req->n_rows;
> 3. Let's add {} to the 'else' branch, to be consistent with the 'if'
> branch.
Ok.
diff --git a/src/box/txn.c b/src/box/txn.c
index 9793109f3..765dbd2ce 100644
--- a/src/box/txn.c
+++ b/src/box/txn.c
@@ -521,22 +521,26 @@ txn_journal_entry_new(struct txn *txn)

      /*
       * Append a dummy NOP statement to preserve replication tx
-     * boundaries when the last tx row is a local one.
+     * boundaries when the last tx row is a local one, and the
+     * transaction has at least one global row.
       */
-    if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
+    if (txn->n_local_rows > 0 &&
+        (txn->n_local_rows != txn->n_new_rows || txn->n_applier_rows > 0) &&
          (*(local_row - 1))->group_id == GROUP_LOCAL) {
          size_t size;
          *local_row = region_alloc_object(&txn->region,
                           typeof(**local_row), &size);
          if (*local_row == NULL) {
-            diag_set(OutOfMemory, size, "region_alloc_object", "row");
+            diag_set(OutOfMemory, size, "region_alloc_object",
+                 "row");
              return NULL;
          }
          memset(*local_row, 0, sizeof(**local_row));
          (*local_row)->type = IPROTO_NOP;
          (*local_row)->group_id = GROUP_DEFAULT;
-    } else
+    } else {
          --req->n_rows;
+    }

      return req;
  }


-- 
Serge Petrenko

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-03 11:24     ` Serge Petrenko
@ 2020-07-03 21:29       ` Vladislav Shpilevoy
  2020-07-04 17:25         ` Serge Petrenko
  0 siblings, 1 reply; 12+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-03 21:29 UTC (permalink / raw)
  To: Serge Petrenko, gorcunov; +Cc: tarantool-patches

Hi.

>>> diff --git a/src/box/txn.c b/src/box/txn.c
>>> index 123520166..9793109f3 100644
>>> --- a/src/box/txn.c
>>> +++ b/src/box/txn.c
>>> @@ -517,6 +519,25 @@ txn_journal_entry_new(struct txn *txn)
>>>       assert(remote_row == req->rows + txn->n_applier_rows);
>>>       assert(local_row == remote_row + txn->n_new_rows);
>>>   +    /*
>>> +     * Append a dummy NOP statement to preserve replication tx
>>> +     * boundaries when the last tx row is a local one.
>>> +     */
>>> +    if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
>> 1. Why do you need the 'txn->n_local_rows != txn->n_new_rows' check?
>> Looks like local rows are always at the end. So if their count > 0,
>> you can look at *(local_row - 1) without any other checks, no?
> 
> There are n_applier_rows, n_new_rows, and n_local_rows.
> A txn has a total of n_applier_rows + n_new_rows rows.
> First come the n_applier_rows of remote rows, if any, then
> n_new_rows "local" rows. Here local means that they're not
> from a remote instance.
> So, there are two things we need to check: that the last row
> is related to a local space, and that the transaction isn't fully local.
> 
> I've just found out I made a mistake.
> 
> Looks like this is the correct check:
>
> if (txn->n_local_rows > 0 &&
>     (txn->n_local_rows != txn->n_new_rows || txn->n_applier_rows > 0) &&
>     (*(local_row - 1))->group_id == GROUP_LOCAL) {

I still don't understand. Why isn't it enough to simply check the last
local_row? Is 'txn->n_local_rows != txn->n_new_rows' needed only because
the whole transaction may be GROUP_LOCAL, in which case the NOP isn't
needed?

>> Btw, I just noticed that we already reorder rows: regardless of
>> the order in which they were applied, txn_journal_entry_new()
>> moves all remote rows first and local rows second. This looks
>> like the same problem I was talking about when we discussed the
>> local reorder and gave a FOREIGN KEY example.
>>
>> It means it is already broken, in theory. And perhaps that should be
>> changed.
>>
>> If we consider it not broken, then why can't we put local rows first
>> and remote rows second? Then we don't need the NOP here.
> 
> We need a NOP when there are no remote rows anyway, since we decided
> to not reorder local and global rows.

What if a transaction arrives and the applier rewrites it entirely via
before_replace triggers, plus inserts something into local spaces? Then
it turns out that the master won't get any acknowledgement from the
replica that it received the transaction at all? Because in
txn_commit_async() the replica has nowhere to take the master's lsn
from: all the statements will be local.

Regarding "since we decided to not reorder local and global rows":
that's all true, yes. But my point is that, as it turns out, we
**already** have reordering. A transaction's remote and local rows get
reordered. So reordering exists anyway, regardless of whether we
introduce one more now. That means we should either fix that too, so
that nothing gets reordered at all, or there is no need to hold back,
and this patch can reorder as well.

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-03 21:29       ` Vladislav Shpilevoy
@ 2020-07-04 17:25         ` Serge Petrenko
  2020-07-06  6:48           ` Vladislav Shpilevoy
  0 siblings, 1 reply; 12+ messages in thread
From: Serge Petrenko @ 2020-07-04 17:25 UTC (permalink / raw)
  To: Vladislav Shpilevoy, gorcunov; +Cc: tarantool-patches


On 04.07.2020 00:29, Vladislav Shpilevoy wrote:
> Hi.
Hi. We've already discussed this in private; I'll duplicate it here.
>
>>>> diff --git a/src/box/txn.c b/src/box/txn.c
>>>> index 123520166..9793109f3 100644
>>>> --- a/src/box/txn.c
>>>> +++ b/src/box/txn.c
>>>> @@ -517,6 +519,25 @@ txn_journal_entry_new(struct txn *txn)
>>>>        assert(remote_row == req->rows + txn->n_applier_rows);
>>>>        assert(local_row == remote_row + txn->n_new_rows);
>>>>    +    /*
>>>> +     * Append a dummy NOP statement to preserve replication tx
>>>> +     * boundaries when the last tx row is a local one.
>>>> +     */
>>>> +    if (txn->n_local_rows > 0 && txn->n_local_rows != txn->n_new_rows &&
>>> 1. Why do you need the 'txn->n_local_rows != txn->n_new_rows' check?
>>> Looks like local rows are always at the end. So if their count > 0,
>>> you can look at *(local_row - 1) without any other checks, no?
>> There are n_applier_rows, n_new_rows, and n_local_rows.
>> A txn has a total of n_applier_rows + n_new_rows rows.
>> First come the n_applier_rows of remote rows, if any, then
>> n_new_rows "local" rows. Here local means that they're not
>> from a remote instance.
>> So, there are two things we need to check: that the last row
>> is related to a local space, and that the transaction isn't fully local.
>>
>> I've just found out I made a mistake.
>>
>> Looks like this is the correct check:
>>
>> if (txn->n_local_rows > 0 &&
>>     (txn->n_local_rows != txn->n_new_rows || txn->n_applier_rows > 0) &&
>>     (*(local_row - 1))->group_id == GROUP_LOCAL) {
> I still don't understand. Why isn't it enough to simply check the last
> local_row? Is 'txn->n_local_rows != txn->n_new_rows' needed only because
> the whole transaction may be GROUP_LOCAL, in which case the NOP isn't
> needed?

Yes: a check that the last row is GROUP_LOCAL, and that the transaction
is not entirely GROUP_LOCAL.

>
>>> Btw, I just noticed that we already reorder rows: regardless of
>>> the order in which they were applied, txn_journal_entry_new()
>>> moves all remote rows first and local rows second. This looks
>>> like the same problem I was talking about when we discussed the
>>> local reorder and gave a FOREIGN KEY example.
>>>
>>> It means it is already broken, in theory. And perhaps that should be
>>> changed.
>>>
>>> If we consider it not broken, then why can't we put local rows first
>>> and remote rows second? Then we don't need the NOP here.
>> We need a NOP when there are no remote rows anyway, since we decided
>> to not reorder local and global rows.
> What if a transaction arrives and the applier rewrites it entirely via
> before_replace triggers, plus inserts something into local spaces? Then
> it turns out that the master won't get any acknowledgement from the
> replica that it received the transaction at all? Because in
> txn_commit_async() the replica has nowhere to take the master's lsn
> from: all the statements will be local.
It turns out that the master will get an ack, because for every
statement that was replaced with a "NOP" in the applier, i.e. for which
the trigger returned new_tuple = old_tuple, a NOP is indeed written to
the WAL.
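
For reference, a minimal sketch of such a trigger (the space name is
illustrative): returning the old tuple from a replica-side
before_replace trigger keeps the old value, and, as noted above, a NOP
row goes to the WAL in its place.

    box.space.glob:before_replace(function(old, new)
        -- Keep the old value: the replicated statement is
        -- discarded, and a NOP row is written to the WAL instead.
        return old
    end)
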
>
> Regarding "since we decided to not reorder local and global rows":
> that's all true, yes. But my point is that, as it turns out, we
> **already** have reordering. A transaction's remote and local rows get
> reordered. So reordering exists anyway, regardless of whether we
> introduce one more now. That means we should either fix that too, so
> that nothing gets reordered at all, or there is no need to hold back,
> and this patch can reorder as well.
If we do the NOP now instead of reordering, at the very least we won't
break all this even further.
I don't know why remote and local rows get reordered. It probably
appeared together with the transactional applier, but why it did is
unclear.

-- 
Serge Petrenko

* Re: [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row
  2020-07-04 17:25         ` Serge Petrenko
@ 2020-07-06  6:48           ` Vladislav Shpilevoy
  0 siblings, 0 replies; 12+ messages in thread
From: Vladislav Shpilevoy @ 2020-07-06  6:48 UTC (permalink / raw)
  To: Serge Petrenko, gorcunov; +Cc: tarantool-patches

LGTM.

* Re: [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework
  2020-07-02 13:39 [Tarantool-patches] [PATCH v3 0/2] fix replication tx boundaries after local space rework Serge Petrenko
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 1/2] wal: fix tx boundaries Serge Petrenko
  2020-07-02 13:39 ` [Tarantool-patches] [PATCH v3 2/2] replication: append NOP as the last tx row Serge Petrenko
@ 2020-07-06  7:47 ` Kirill Yukhin
  2 siblings, 0 replies; 12+ messages in thread
From: Kirill Yukhin @ 2020-07-06  7:47 UTC (permalink / raw)
  To: Serge Petrenko; +Cc: tarantool-patches, v.shpilevoy

Hello,

On 02 Jul 16:39, Serge Petrenko wrote:
> https://github.com/tarantool/tarantool/issues/4928
> https://github.com/tarantool/tarantool/tree/sp/gh-4928-tx-boundary-fix-nop
> 
> Changes in v3:
>   - Instead of reordering rows in WAL, append
>     a dummy NOP row to the tx end when needed.
> 
> Serge Petrenko (2):
>   wal: fix tx boundaries
>   replication: append NOP as the last tx row

I've checked your patchset into 2.3, 2.4 and master.

--
Regards, Kirill Yukhin
