Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect
@ 2019-11-18 21:31 Vladislav Shpilevoy
  2019-11-19  7:27 ` Konstantin Osipov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Vladislav Shpilevoy @ 2019-11-18 21:31 UTC (permalink / raw)
  To: tarantool-patches, kostja.osipov

Binary session disconnect trigger yield could lead to use after
free of the session object. That happened because iproto thread
sent two requests to TX thread at disconnect:

    - Close the session and run its on disconnect triggers;

    - If all requests are handled, destroy the session.

When a connection is idle, all requests are handled, so both these
requests are sent. If the first one yielded in TX thread, the
second one arrived and destroyed the session right under the feet
of the first one.

This can be solved in two ways - in TX thread, and in iproto
thread.

Iproto thread solution (which is chosen in the patch): just don't
send destroy request until disconnect returns back to iproto
thread.

TX thread solution (alternative): add a flag which says whether
disconnect is processed by TX. When destroy request arrives, it
checks the flag. If disconnect is not done, the destroy request
waits on a condition variable until it is.

The iproto is a bit tricker to implement, but it looks more
correct.

Closes #4627
---

Changes in v3:
- Flags are replaced with a state enum.

Branch: https://github.com/tarantool/tarantool/tree/gerold103/gh-4627-session-use-after-free
Issue: https://github.com/tarantool/tarantool/issues/4627

 src/box/iproto.cc | 104 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 83 insertions(+), 21 deletions(-)

diff --git a/src/box/iproto.cc b/src/box/iproto.cc
index 34c8f469a..c39b8e7bf 100644
--- a/src/box/iproto.cc
+++ b/src/box/iproto.cc
@@ -296,8 +296,13 @@ static const struct cmsg_hop destroy_route[] = {
 static void
 tx_process_disconnect(struct cmsg *m);
 
+/** Send destroy message to tx thread. */
+static void
+net_finish_disconnect(struct cmsg *m);
+
 static const struct cmsg_hop disconnect_route[] = {
-	{ tx_process_disconnect, NULL }
+	{ tx_process_disconnect, &net_pipe },
+	{ net_finish_disconnect, NULL }
 };
 
 /**
@@ -327,6 +332,32 @@ static const struct cmsg_hop push_route[] = {
 
 /* {{{ iproto_connection - declaration and definition */
 
+/** Connection life cycle stages. */
+enum iproto_connection_state {
+	/**
+	 * A connection is always alive in the beginning because
+	 * takes an already active socket in a constructor.
+	 */
+	IPROTO_CONNECTION_ALIVE,
+	/**
+	 * Socket was closed, a notification is sent to the TX
+	 * thread to close the session.
+	 */
+	IPROTO_CONNECTION_CLOSED,
+	/**
+	 * TX thread was notified about close, but some requests
+	 * are still not finished. That state may be skipped in
+	 * case the connection was already idle (not having
+	 * unfinished requests) at the moment of closing.
+	 */
+	IPROTO_CONNECTION_PENDING_DESTROY,
+	/**
+	 * All requests are finished, a destroy request is sent to
+	 * the TX thread.
+	 */
+	IPROTO_CONNECTION_DESTROYED,
+};
+
 /**
  * Context of a single client connection.
  * Interaction scheme:
@@ -430,8 +461,12 @@ struct iproto_connection
 	 * connection.
 	 */
 	struct cmsg destroy_msg;
-	/** True if destroy message is sent. Debug-only. */
-	bool is_destroy_sent;
+	/**
+	 * Connection state. Mainly it is used to determine when
+	 * the connection can be destroyed, and for debug purposes
+	 * to assert on a double destroy, for example.
+	 */
+	enum iproto_connection_state state;
 	struct rlist in_stop_list;
 	/**
 	 * Kharon is used to implement box.session.push().
@@ -572,6 +607,35 @@ iproto_connection_stop_msg_max_limit(struct iproto_connection *con)
 	rlist_add_tail(&stopped_connections, &con->in_stop_list);
 }
 
+/**
+ * Send a destroy message to TX thread in case all requests are
+ * finished.
+ */
+static inline void
+iproto_connection_try_to_start_destroy(struct iproto_connection *con)
+{
+	assert(con->state == IPROTO_CONNECTION_CLOSED ||
+	       con->state == IPROTO_CONNECTION_PENDING_DESTROY);
+	if (!iproto_connection_is_idle(con)) {
+		/*
+		 * Not all requests are finished. Let the last
+		 * finished request destroy the connection.
+		 */
+		con->state = IPROTO_CONNECTION_PENDING_DESTROY;
+		return;
+	}
+	/*
+	 * If the connection has no outstanding requests in the
+	 * input buffer, then no one (e.g. tx thread) is referring
+	 * to it, so it must be destroyed. Firstly queue a msg to
+	 * destroy the session and other resources owned by TX
+	 * thread. When it is done, iproto thread will destroy
+	 * other parts of the connection.
+	 */
+	con->state = IPROTO_CONNECTION_DESTROYED;
+	cpipe_push(&tx_pipe, &con->destroy_msg);
+}
+
 /**
  * Initiate a connection shutdown. This method may
  * be invoked many times, and does the internal
@@ -597,23 +661,12 @@ iproto_connection_close(struct iproto_connection *con)
 		 */
 		con->p_ibuf->wpos -= con->parse_size;
 		cpipe_push(&tx_pipe, &con->disconnect_msg);
-	}
-	/*
-	 * If the connection has no outstanding requests in the
-	 * input buffer, then no one (e.g. tx thread) is referring
-	 * to it, so it must be destroyed at once. Queue a msg to
-	 * run on_disconnect() trigger and destroy the connection.
-	 *
-	 * Otherwise, it will be destroyed by the last request on
-	 * this connection that has finished processing.
-	 *
-	 * The check is mandatory to not destroy a connection
-	 * twice.
-	 */
-	if (iproto_connection_is_idle(con)) {
-		assert(! con->is_destroy_sent);
-		con->is_destroy_sent = true;
-		cpipe_push(&tx_pipe, &con->destroy_msg);
+		assert(con->state == IPROTO_CONNECTION_ALIVE);
+		con->state = IPROTO_CONNECTION_CLOSED;
+	} else if (con->state == IPROTO_CONNECTION_PENDING_DESTROY) {
+		iproto_connection_try_to_start_destroy(con);
+	} else {
+		assert(con->state == IPROTO_CONNECTION_CLOSED);
 	}
 	rlist_del(&con->in_stop_list);
 }
@@ -1048,7 +1101,7 @@ iproto_connection_new(int fd)
 	/* It may be very awkward to allocate at close. */
 	cmsg_init(&con->destroy_msg, destroy_route);
 	cmsg_init(&con->disconnect_msg, disconnect_route);
-	con->is_destroy_sent = false;
+	con->state = IPROTO_CONNECTION_ALIVE;
 	con->tx.is_push_pending = false;
 	con->tx.is_push_sent = false;
 	rmean_collect(rmean_net, IPROTO_CONNECTIONS, 1);
@@ -1063,6 +1116,7 @@ iproto_connection_delete(struct iproto_connection *con)
 	assert(!evio_has_fd(&con->output));
 	assert(!evio_has_fd(&con->input));
 	assert(con->session == NULL);
+	assert(con->state == IPROTO_CONNECTION_DESTROYED);
 	/*
 	 * The output buffers must have been deleted
 	 * in tx thread.
@@ -1281,6 +1335,14 @@ tx_process_disconnect(struct cmsg *m)
 	}
 }
 
+static void
+net_finish_disconnect(struct cmsg *m)
+{
+	struct iproto_connection *con =
+		container_of(m, struct iproto_connection, disconnect_msg);
+	iproto_connection_try_to_start_destroy(con);
+}
+
 /**
  * Destroy the session object, as well as output buffers of the
  * connection.
-- 
2.21.0 (Apple Git-122.2)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect
  2019-11-18 21:31 [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect Vladislav Shpilevoy
@ 2019-11-19  7:27 ` Konstantin Osipov
  2019-11-21 23:04 ` Vladislav Shpilevoy
  2019-11-26  7:51 ` Kirill Yukhin
  2 siblings, 0 replies; 5+ messages in thread
From: Konstantin Osipov @ 2019-11-19  7:27 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/11/19 00:26]:

lgtm

> Binary session disconnect trigger yield could lead to use after
> free of the session object. That happened because iproto thread
> sent two requests to TX thread at disconnect:
> 
>     - Close the session and run its on disconnect triggers;
> 
>     - If all requests are handled, destroy the session.
> 
> When a connection is idle, all requests are handled, so both these
> requests are sent. If the first one yielded in TX thread, the
> second one arrived and destroyed the session right under the feet
> of the first one.
> 
> This can be solved in two ways - in TX thread, and in iproto
> thread.
> 
> Iproto thread solution (which is chosen in the patch): just don't
> send destroy request until disconnect returns back to iproto
> thread.
> 
> TX thread solution (alternative): add a flag which says whether
> disconnect is processed by TX. When destroy request arrives, it
> checks the flag. If disconnect is not done, the destroy request
> waits on a condition variable until it is.
> 
> The iproto is a bit tricker to implement, but it looks more
> correct.
> 
> Closes #4627
> ---
> 
> Changes in v3:
> - Flags are replaced with a state enum.
> 
> Branch: https://github.com/tarantool/tarantool/tree/gerold103/gh-4627-session-use-after-free
> Issue: https://github.com/tarantool/tarantool/issues/4627
> 
>  src/box/iproto.cc | 104 ++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 83 insertions(+), 21 deletions(-)
> 
> diff --git a/src/box/iproto.cc b/src/box/iproto.cc
> index 34c8f469a..c39b8e7bf 100644
> --- a/src/box/iproto.cc
> +++ b/src/box/iproto.cc
> @@ -296,8 +296,13 @@ static const struct cmsg_hop destroy_route[] = {
>  static void
>  tx_process_disconnect(struct cmsg *m);
>  
> +/** Send destroy message to tx thread. */
> +static void
> +net_finish_disconnect(struct cmsg *m);
> +
>  static const struct cmsg_hop disconnect_route[] = {
> -	{ tx_process_disconnect, NULL }
> +	{ tx_process_disconnect, &net_pipe },
> +	{ net_finish_disconnect, NULL }
>  };
>  
>  /**
> @@ -327,6 +332,32 @@ static const struct cmsg_hop push_route[] = {
>  
>  /* {{{ iproto_connection - declaration and definition */
>  
> +/** Connection life cycle stages. */
> +enum iproto_connection_state {
> +	/**
> +	 * A connection is always alive in the beginning because
> +	 * takes an already active socket in a constructor.
> +	 */
> +	IPROTO_CONNECTION_ALIVE,
> +	/**
> +	 * Socket was closed, a notification is sent to the TX
> +	 * thread to close the session.
> +	 */
> +	IPROTO_CONNECTION_CLOSED,
> +	/**
> +	 * TX thread was notified about close, but some requests
> +	 * are still not finished. That state may be skipped in
> +	 * case the connection was already idle (not having
> +	 * unfinished requests) at the moment of closing.
> +	 */
> +	IPROTO_CONNECTION_PENDING_DESTROY,
> +	/**
> +	 * All requests are finished, a destroy request is sent to
> +	 * the TX thread.
> +	 */
> +	IPROTO_CONNECTION_DESTROYED,
> +};
> +
>  /**
>   * Context of a single client connection.
>   * Interaction scheme:
> @@ -430,8 +461,12 @@ struct iproto_connection
>  	 * connection.
>  	 */
>  	struct cmsg destroy_msg;
> -	/** True if destroy message is sent. Debug-only. */
> -	bool is_destroy_sent;
> +	/**
> +	 * Connection state. Mainly it is used to determine when
> +	 * the connection can be destroyed, and for debug purposes
> +	 * to assert on a double destroy, for example.
> +	 */
> +	enum iproto_connection_state state;
>  	struct rlist in_stop_list;
>  	/**
>  	 * Kharon is used to implement box.session.push().
> @@ -572,6 +607,35 @@ iproto_connection_stop_msg_max_limit(struct iproto_connection *con)
>  	rlist_add_tail(&stopped_connections, &con->in_stop_list);
>  }
>  
> +/**
> + * Send a destroy message to TX thread in case all requests are
> + * finished.
> + */
> +static inline void
> +iproto_connection_try_to_start_destroy(struct iproto_connection *con)
> +{
> +	assert(con->state == IPROTO_CONNECTION_CLOSED ||
> +	       con->state == IPROTO_CONNECTION_PENDING_DESTROY);
> +	if (!iproto_connection_is_idle(con)) {
> +		/*
> +		 * Not all requests are finished. Let the last
> +		 * finished request destroy the connection.
> +		 */
> +		con->state = IPROTO_CONNECTION_PENDING_DESTROY;
> +		return;
> +	}
> +	/*
> +	 * If the connection has no outstanding requests in the
> +	 * input buffer, then no one (e.g. tx thread) is referring
> +	 * to it, so it must be destroyed. Firstly queue a msg to
> +	 * destroy the session and other resources owned by TX
> +	 * thread. When it is done, iproto thread will destroy
> +	 * other parts of the connection.
> +	 */
> +	con->state = IPROTO_CONNECTION_DESTROYED;
> +	cpipe_push(&tx_pipe, &con->destroy_msg);
> +}
> +
>  /**
>   * Initiate a connection shutdown. This method may
>   * be invoked many times, and does the internal
> @@ -597,23 +661,12 @@ iproto_connection_close(struct iproto_connection *con)
>  		 */
>  		con->p_ibuf->wpos -= con->parse_size;
>  		cpipe_push(&tx_pipe, &con->disconnect_msg);
> -	}
> -	/*
> -	 * If the connection has no outstanding requests in the
> -	 * input buffer, then no one (e.g. tx thread) is referring
> -	 * to it, so it must be destroyed at once. Queue a msg to
> -	 * run on_disconnect() trigger and destroy the connection.
> -	 *
> -	 * Otherwise, it will be destroyed by the last request on
> -	 * this connection that has finished processing.
> -	 *
> -	 * The check is mandatory to not destroy a connection
> -	 * twice.
> -	 */
> -	if (iproto_connection_is_idle(con)) {
> -		assert(! con->is_destroy_sent);
> -		con->is_destroy_sent = true;
> -		cpipe_push(&tx_pipe, &con->destroy_msg);
> +		assert(con->state == IPROTO_CONNECTION_ALIVE);
> +		con->state = IPROTO_CONNECTION_CLOSED;
> +	} else if (con->state == IPROTO_CONNECTION_PENDING_DESTROY) {
> +		iproto_connection_try_to_start_destroy(con);
> +	} else {
> +		assert(con->state == IPROTO_CONNECTION_CLOSED);
>  	}
>  	rlist_del(&con->in_stop_list);
>  }
> @@ -1048,7 +1101,7 @@ iproto_connection_new(int fd)
>  	/* It may be very awkward to allocate at close. */
>  	cmsg_init(&con->destroy_msg, destroy_route);
>  	cmsg_init(&con->disconnect_msg, disconnect_route);
> -	con->is_destroy_sent = false;
> +	con->state = IPROTO_CONNECTION_ALIVE;
>  	con->tx.is_push_pending = false;
>  	con->tx.is_push_sent = false;
>  	rmean_collect(rmean_net, IPROTO_CONNECTIONS, 1);
> @@ -1063,6 +1116,7 @@ iproto_connection_delete(struct iproto_connection *con)
>  	assert(!evio_has_fd(&con->output));
>  	assert(!evio_has_fd(&con->input));
>  	assert(con->session == NULL);
> +	assert(con->state == IPROTO_CONNECTION_DESTROYED);
>  	/*
>  	 * The output buffers must have been deleted
>  	 * in tx thread.
> @@ -1281,6 +1335,14 @@ tx_process_disconnect(struct cmsg *m)
>  	}
>  }
>  
> +static void
> +net_finish_disconnect(struct cmsg *m)
> +{
> +	struct iproto_connection *con =
> +		container_of(m, struct iproto_connection, disconnect_msg);
> +	iproto_connection_try_to_start_destroy(con);
> +}
> +
>  /**
>   * Destroy the session object, as well as output buffers of the
>   * connection.
> -- 
> 2.21.0 (Apple Git-122.2)
> 

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect
  2019-11-18 21:31 [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect Vladislav Shpilevoy
  2019-11-19  7:27 ` Konstantin Osipov
@ 2019-11-21 23:04 ` Vladislav Shpilevoy
  2019-11-22  7:55   ` Konstantin Osipov
  2019-11-26  7:51 ` Kirill Yukhin
  2 siblings, 1 reply; 5+ messages in thread
From: Vladislav Shpilevoy @ 2019-11-21 23:04 UTC (permalink / raw)
  To: tarantool-patches, kostja.osipov

I forgot to push a test file. Here it is.

================================================================================

diff --git a/test/box/gh-4627-session-use-after-free.result b/test/box/gh-4627-session-use-after-free.result
new file mode 100644
index 000000000..5e5c154b9
--- /dev/null
+++ b/test/box/gh-4627-session-use-after-free.result
@@ -0,0 +1,60 @@
+-- test-run result file version 2
+--
+-- gh-4627: binary session disconnect trigger yield could lead to
+-- use after free of the session object. That happened because
+-- iproto thread sent two requests to TX thread at disconnect:
+--
+--     - Close the session and run its on disconnect triggers;
+--
+--     - If all requests are handled, destroy the session.
+--
+-- When a connection is idle, all requests are handled, so both
+-- these requests are sent. If the first one yielded in TX thread,
+-- the second one arrived and destroyed the session right under
+-- the feet of the first one.
+--
+net_box = require('net.box')
+ | ---
+ | ...
+fiber = require('fiber')
+ | ---
+ | ...
+
+sid_before_yield = nil
+ | ---
+ | ...
+sid_after_yield = nil
+ | ---
+ | ...
+func = box.session.on_disconnect(function()     \
+    sid_before_yield = box.session.id()         \
+    fiber.yield()                               \
+    sid_after_yield = box.session.id()          \
+end)
+ | ---
+ | ...
+
+connection = net_box.connect(box.cfg.listen)
+ | ---
+ | ...
+connection:ping()
+ | ---
+ | - true
+ | ...
+connection:close()
+ | ---
+ | ...
+
+while not sid_after_yield do fiber.yield() end
+ | ---
+ | ...
+
+sid_after_yield == sid_before_yield and sid_after_yield ~= 0 or \
+    {sid_after_yield, sid_before_yield}
+ | ---
+ | - true
+ | ...
+
+box.session.on_disconnect(nil, func)
+ | ---
+ | ...
diff --git a/test/box/gh-4627-session-use-after-free.test.lua b/test/box/gh-4627-session-use-after-free.test.lua
new file mode 100644
index 000000000..70624a96a
--- /dev/null
+++ b/test/box/gh-4627-session-use-after-free.test.lua
@@ -0,0 +1,35 @@
+--
+-- gh-4627: binary session disconnect trigger yield could lead to
+-- use after free of the session object. That happened because
+-- iproto thread sent two requests to TX thread at disconnect:
+--
+--     - Close the session and run its on disconnect triggers;
+--
+--     - If all requests are handled, destroy the session.
+--
+-- When a connection is idle, all requests are handled, so both
+-- these requests are sent. If the first one yielded in TX thread,
+-- the second one arrived and destroyed the session right under
+-- the feet of the first one.
+--
+net_box = require('net.box')
+fiber = require('fiber')
+
+sid_before_yield = nil
+sid_after_yield = nil
+func = box.session.on_disconnect(function()     \
+    sid_before_yield = box.session.id()         \
+    fiber.yield()                               \
+    sid_after_yield = box.session.id()          \
+end)
+
+connection = net_box.connect(box.cfg.listen)
+connection:ping()
+connection:close()
+
+while not sid_after_yield do fiber.yield() end
+
+sid_after_yield == sid_before_yield and sid_after_yield ~= 0 or \
+    {sid_after_yield, sid_before_yield}
+
+box.session.on_disconnect(nil, func)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect
  2019-11-21 23:04 ` Vladislav Shpilevoy
@ 2019-11-22  7:55   ` Konstantin Osipov
  0 siblings, 0 replies; 5+ messages in thread
From: Konstantin Osipov @ 2019-11-22  7:55 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [19/11/22 09:49]:
> I forgot to push a test file. Here it is.

I assumed it's the same as in the previous patch.


-- 
Konstantin Osipov, Moscow, Russia
https://scylladb.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect
  2019-11-18 21:31 [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect Vladislav Shpilevoy
  2019-11-19  7:27 ` Konstantin Osipov
  2019-11-21 23:04 ` Vladislav Shpilevoy
@ 2019-11-26  7:51 ` Kirill Yukhin
  2 siblings, 0 replies; 5+ messages in thread
From: Kirill Yukhin @ 2019-11-26  7:51 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hello,

On 18 ноя 22:31, Vladislav Shpilevoy wrote:
> Binary session disconnect trigger yield could lead to use after
> free of the session object. That happened because iproto thread
> sent two requests to TX thread at disconnect:
> 
>     - Close the session and run its on disconnect triggers;
> 
>     - If all requests are handled, destroy the session.
> 
> When a connection is idle, all requests are handled, so both these
> requests are sent. If the first one yielded in TX thread, the
> second one arrived and destroyed the session right under the feet
> of the first one.
> 
> This can be solved in two ways - in TX thread, and in iproto
> thread.
> 
> Iproto thread solution (which is chosen in the patch): just don't
> send destroy request until disconnect returns back to iproto
> thread.
> 
> TX thread solution (alternative): add a flag which says whether
> disconnect is processed by TX. When destroy request arrives, it
> checks the flag. If disconnect is not done, the destroy request
> waits on a condition variable until it is.
> 
> The iproto is a bit tricker to implement, but it looks more
> correct.
> 
> Closes #4627

I've checked your patch into 1.10, 2.2 and master.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-11-26  7:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-18 21:31 [Tarantool-patches] [PATCH v3 1/1] iproto: don't destroy a session during disconnect Vladislav Shpilevoy
2019-11-19  7:27 ` Konstantin Osipov
2019-11-21 23:04 ` Vladislav Shpilevoy
2019-11-22  7:55   ` Konstantin Osipov
2019-11-26  7:51 ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox