Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v22 0/3] qsync: implement packet filtering (part 1)
@ 2021-10-11 19:16 Cyrill Gorcunov via Tarantool-patches
  2021-10-11 19:16 ` [Tarantool-patches] [PATCH v22 1/3] latch: add latch_is_locked helper Cyrill Gorcunov via Tarantool-patches
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Cyrill Gorcunov via Tarantool-patches @ 2021-10-11 19:16 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Guys, please take a look once time permit, any comments are highly appreciated!

v19 (by Vlad):
 - do not modify box_issue_promote and demote (while they are
   still simply utter code duplication but whatever)
 - make txn_limbo_process being void
 - make txn_limbo_process_begin/commit/rollback being void
 - the real processing of request under the lock named as
   txn_limbo_process_core
 - testcase completely reworked (kudos to SergeP)
 - note that if we import test to the master branch without
   ordering pass it will fire assertion
 - dropped off debug info from box.info interface

v20 (by SergeP):
 - use guard for ACK processing and parameters change
 - rework test

v21 (by SergeP):
 - drop warning from txn_limbo_ack
 - rework test to use cluster helpers and ERRINJ_WAL_WRITE_COUNT
   error injection, same time drop modification of election_replica
   script
v22 (by SergeP):
 - use limbo emptiness test _after_ owner_id test
 - drop redundant assert in limbo commit/rollback
   since we're unlocking a latch anyway where own
   assertion present
 - in test: drop excessive wait_cond and setup wal
   delay earlier

branch gorcunov/gh-6036-rollback-confirm-22
issue https://github.com/tarantool/tarantool/issues/6036
previous series https://lists.tarantool.org/tarantool-patches/20211008175809.349501-1-gorcunov@gmail.com/

Cyrill Gorcunov (3):
  latch: add latch_is_locked helper
  qsync: order access to the limbo terms
  test: add gh-6036-qsync-order test

 src/box/applier.cc                            |  12 +-
 src/box/box.cc                                |  15 +-
 src/box/relay.cc                              |  11 +-
 src/box/txn.c                                 |   2 +-
 src/box/txn_limbo.c                           |  49 ++++-
 src/box/txn_limbo.h                           |  78 ++++++-
 src/lib/core/latch.h                          |  11 +
 test/replication/gh-6036-qsync-order.result   | 200 ++++++++++++++++++
 test/replication/gh-6036-qsync-order.test.lua |  96 +++++++++
 test/replication/suite.cfg                    |   1 +
 test/replication/suite.ini                    |   2 +-
 test/unit/snap_quorum_delay.cc                |   5 +-
 12 files changed, 451 insertions(+), 31 deletions(-)
 create mode 100644 test/replication/gh-6036-qsync-order.result
 create mode 100644 test/replication/gh-6036-qsync-order.test.lua


base-commit: ce5752ce235324fcefd5a3d0503fd3f8a1800d38
-- 
2.31.1

-- Summary patch against v21 --
diff --git a/src/box/txn_limbo.c b/src/box/txn_limbo.c
index 176a83cb5..8f9bc11c7 100644
--- a/src/box/txn_limbo.c
+++ b/src/box/txn_limbo.c
@@ -546,8 +546,6 @@ void
 txn_limbo_ack(struct txn_limbo *limbo, uint32_t owner_id,
 	      uint32_t replica_id, int64_t lsn)
 {
-	if (rlist_empty(&limbo->queue))
-		return;
 	/*
 	 * ACKs are collected only by the transactions originator
 	 * (which is the single master in 100% so far). Other instances
@@ -560,6 +558,15 @@ txn_limbo_ack(struct txn_limbo *limbo, uint32_t owner_id,
 	 */
 	if (!txn_limbo_is_owner(limbo, owner_id))
 		return;
+
+	/*
+	 * Test for empty queue is done _after_ txn_limbo_is_owner
+	 * call because we need to be sure that limbo is not been
+	 * changed under our feets while we're reading it.
+	 */
+	if (rlist_empty(&limbo->queue))
+		return;
+
 	/*
 	 * If limbo is currently writing a rollback, it means that the whole
 	 * queue will be rolled back. Because rollback is written only for
@@ -815,8 +822,6 @@ txn_limbo_process(struct txn_limbo *limbo,
 void
 txn_limbo_on_parameters_change(struct txn_limbo *limbo)
 {
-	if (rlist_empty(&limbo->queue))
-		return;
 	/*
 	 * In case if we're not current leader (ie not owning the
 	 * limbo) then we should not confirm anything, otherwise
@@ -826,6 +831,9 @@ txn_limbo_on_parameters_change(struct txn_limbo *limbo)
 	if (!txn_limbo_is_owner(limbo, instance_id))
 		return;
 
+	if (rlist_empty(&limbo->queue))
+		return;
+
 	struct txn_limbo_entry *e;
 	int64_t confirm_lsn = -1;
 	rlist_foreach_entry(e, &limbo->queue, in_queue) {
diff --git a/src/box/txn_limbo.h b/src/box/txn_limbo.h
index aaff444e4..33cacef8f 100644
--- a/src/box/txn_limbo.h
+++ b/src/box/txn_limbo.h
@@ -349,7 +349,6 @@ txn_limbo_process_begin(struct txn_limbo *limbo)
 static inline void
 txn_limbo_process_commit(struct txn_limbo *limbo)
 {
-	assert(latch_is_locked(&limbo->promote_latch));
 	latch_unlock(&limbo->promote_latch);
 }
 
@@ -357,7 +356,6 @@ txn_limbo_process_commit(struct txn_limbo *limbo)
 static inline void
 txn_limbo_process_rollback(struct txn_limbo *limbo)
 {
-	assert(latch_is_locked(&limbo->promote_latch));
 	latch_unlock(&limbo->promote_latch);
 }
 
diff --git a/test/replication/gh-6036-qsync-order.result b/test/replication/gh-6036-qsync-order.result
index eb3e808cb..464a131a4 100644
--- a/test/replication/gh-6036-qsync-order.result
+++ b/test/replication/gh-6036-qsync-order.result
@@ -76,10 +76,6 @@ test_run:switch("election_replica2")
  | ---
  | - true
  | ...
-test_run:wait_cond(function() return box.space.test:get{1} ~= nil end)
- | ---
- | - true
- | ...
 box.cfg({                                   \
     replication = {                         \
         "unix/:./election_replica2.sock",   \
@@ -106,6 +102,10 @@ test_run:switch("election_replica3")
 write_cnt = box.error.injection.get("ERRINJ_WAL_WRITE_COUNT")
  | ---
  | ...
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
+ | ---
+ | - ok
+ | ...
 --
 -- Make election_replica2 been a leader and start writting data,
 -- the PROMOTE request get queued on election_replica3 and not
@@ -128,10 +128,6 @@ test_run:wait_cond(function() return box.error.injection.get("ERRINJ_WAL_WRITE_C
  | ---
  | - true
  | ...
-box.error.injection.set("ERRINJ_WAL_DELAY", true)
- | ---
- | - ok
- | ...
 test_run:switch("election_replica2")
  | ---
  | - true
diff --git a/test/replication/gh-6036-qsync-order.test.lua b/test/replication/gh-6036-qsync-order.test.lua
index b8df170b8..6350e9303 100644
--- a/test/replication/gh-6036-qsync-order.test.lua
+++ b/test/replication/gh-6036-qsync-order.test.lua
@@ -37,7 +37,6 @@ box.cfg({                                   \
 --
 -- Drop connection between election_replica2 and election_replica1.
 test_run:switch("election_replica2")
-test_run:wait_cond(function() return box.space.test:get{1} ~= nil end)
 box.cfg({                                   \
     replication = {                         \
         "unix/:./election_replica2.sock",   \
@@ -57,6 +56,7 @@ box.cfg({                                   \
 -- fall into forever sleep.
 test_run:switch("election_replica3")
 write_cnt = box.error.injection.get("ERRINJ_WAL_WRITE_COUNT")
+box.error.injection.set("ERRINJ_WAL_DELAY", true)
 --
 -- Make election_replica2 been a leader and start writting data,
 -- the PROMOTE request get queued on election_replica3 and not
@@ -68,7 +68,6 @@ test_run:switch("election_replica2")
 box.ctl.promote()
 test_run:switch("election_replica3")
 test_run:wait_cond(function() return box.error.injection.get("ERRINJ_WAL_WRITE_COUNT") > write_cnt end)
-box.error.injection.set("ERRINJ_WAL_DELAY", true)
 test_run:switch("election_replica2")
 _ = require('fiber').create(function() box.space.test:insert{2} end)
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-13  8:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-11 19:16 [Tarantool-patches] [PATCH v22 0/3] qsync: implement packet filtering (part 1) Cyrill Gorcunov via Tarantool-patches
2021-10-11 19:16 ` [Tarantool-patches] [PATCH v22 1/3] latch: add latch_is_locked helper Cyrill Gorcunov via Tarantool-patches
2021-10-11 19:16 ` [Tarantool-patches] [PATCH v22 2/3] qsync: order access to the limbo terms Cyrill Gorcunov via Tarantool-patches
2021-10-12  9:40   ` Serge Petrenko via Tarantool-patches
2021-10-12 20:26     ` Cyrill Gorcunov via Tarantool-patches
2021-10-13  7:56       ` Serge Petrenko via Tarantool-patches
2021-10-11 19:16 ` [Tarantool-patches] [PATCH v22 3/3] test: add gh-6036-qsync-order test Cyrill Gorcunov via Tarantool-patches
2021-10-12  9:47   ` Serge Petrenko via Tarantool-patches
2021-10-12 20:28     ` Cyrill Gorcunov via Tarantool-patches
2021-10-13  8:20       ` Serge Petrenko via Tarantool-patches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox