[Tarantool-patches] [PATCH v12 7/8] applier: prevent nil dereference on applier rollback

Cyrill Gorcunov gorcunov at gmail.com
Tue Apr 7 18:15:00 MSK 2020


Currently, when a transaction rollback happens, we drop the existing
error and move a freshly set ClientError into replicaset.applier.diag.
This leaves the current fiber with diag=nil, which in turn leads to a
SIGSEGV once diag_raise() is called right after applier_apply_tx():

 | applier_f
 |   try {
 |   applier_subscribe
 |     applier_apply_tx
 |       // error happens
 |       txn_rollback
 |         diag_set(ClientError, ER_WAL_IO)
 |         diag_move(&fiber()->diag, &replicaset.applier.diag)
 |         // fiber->diag = nil
 |       applier_on_rollback
 |         diag_add_error(&applier->diag, diag_last_error(&replicaset.applier.diag))
 |         fiber_cancel(applier->reader);
 |     diag_raise() -> NULL dereference
 |   } catch { ... }

Thus:
 - use diag_set_error() instead of diag_move() so that the error is
   kept in the current fiber(), preventing the nil dereference;
 - put a FIXME mark into the code: we need to rework it in a
   more sensible way.
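
To illustrate the difference between the two calls, here is a minimal
standalone sketch. It is a simplified model, not Tarantool's actual
implementation: struct error, struct diag and every model_* helper
below are hypothetical stand-ins that only mimic the move vs. share
semantics described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real structures carry more state. */
struct error {
	int refs;
	int code;
};

struct diag {
	struct error *last;
};

static void
error_ref(struct error *e)
{
	e->refs++;
}

static void
error_unref(struct error *e)
{
	assert(e->refs > 0);
	e->refs--;
}

/* Drop whatever the diag currently holds. */
static void
model_diag_clear(struct diag *d)
{
	if (d->last != NULL)
		error_unref(d->last);
	d->last = NULL;
}

/* Move semantics: the source diag is left empty. */
static void
model_diag_move(struct diag *from, struct diag *to)
{
	model_diag_clear(to);
	to->last = from->last;
	from->last = NULL;
}

/* Share semantics: the target takes its own reference. */
static void
model_diag_set_error(struct diag *d, struct error *e)
{
	error_ref(e);
	model_diag_clear(d);
	d->last = e;
}

/* Returns true if the "fiber" diag still holds an error afterwards. */
static bool
fiber_diag_survives(bool use_move)
{
	/* The code is a placeholder, not a real error code. */
	struct error e = { .refs = 1, .code = 1 };
	struct diag fiber_diag = { .last = &e };
	struct diag shared_diag = { .last = NULL };

	if (use_move)
		model_diag_move(&fiber_diag, &shared_diag);
	else
		model_diag_set_error(&shared_diag, fiber_diag.last);

	return fiber_diag.last != NULL;
}
```

In this model fiber_diag_survives(true) reports an empty fiber diag,
which corresponds to the state that diag_raise() then trips over, while
fiber_diag_survives(false) keeps the error visible in both places.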

Fixes #4730

Acked-by: Serge Petrenko <sergepetrenko at tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov at gmail.com>
---
 src/box/applier.cc | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/box/applier.cc b/src/box/applier.cc
index 2f9c9c797..68de3c08c 100644
--- a/src/box/applier.cc
+++ b/src/box/applier.cc
@@ -692,9 +692,22 @@ static int
 applier_txn_rollback_cb(struct trigger *trigger, void *event)
 {
 	(void) trigger;
-	/* Setup shared applier diagnostic area. */
+
+	/*
+	 * Setup shared applier diagnostic area.
+	 *
+	 * FIXME: We should consider redesigning this
+	 * moment and, instead of carrying one shared
+	 * diag, use the per-applier diag all the time
+	 * (which is actually already present in the structure).
+	 *
+	 * But remember that transactions are asynchronous
+	 * and a rollback may happen much later, after the
+	 * transaction is passed to the journal engine.
+	 */
 	diag_set(ClientError, ER_WAL_IO);
-	diag_move(&fiber()->diag, &replicaset.applier.diag);
+	diag_set_error(&replicaset.applier.diag,
+		       diag_last_error(diag_get()));
 
 	/* Broadcast the rollback event across all appliers. */
 	trigger_run(&replicaset.applier.on_rollback, event);
-- 
2.20.1


