Tarantool development patches archive
 help / color / mirror / Atom feed
From: Cyrill Gorcunov <gorcunov@gmail.com>
To: tml <tarantool-patches@freelists.org>
Cc: Alexander Turenko <alexander.turenko@tarantool.org>
Subject: [tarantool-patches] [PATCH luajit v2 3/4] bugfix: LuaJIT tables' hash chains might get corrupted leading to infinite loops while fetching, missing keys, and etc.
Date: Wed, 22 May 2019 22:05:10 +0300	[thread overview]
Message-ID: <20190522190510.17201-4-gorcunov@gmail.com> (raw)
In-Reply-To: <20190522190510.17201-1-gorcunov@gmail.com>

From: "Yichun Zhang (agentzh)" <yichun@openresty.com>

Thanks Julien Desgats for the report and Peter Cawley for the patch.

See LuaJIT/LuaJIT#494 for the full discussion on the issues and fixes.

The test covering this bug was submitted to the openresty/luajit2-test-suite
repo as commit ce2c916d55.

Backport of @openrusty/luajit2
commit 9e338fe1cd854065cc9b29b2a863cc3c742a3404

Part-of #4171
---
 src/lj_tab.c | 81 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 53 insertions(+), 28 deletions(-)

diff --git a/src/lj_tab.c b/src/lj_tab.c
index c51666d..ff216f3 100644
--- a/src/lj_tab.c
+++ b/src/lj_tab.c
@@ -474,6 +474,7 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
     lua_assert(freenode != &G(L)->nilnode);
     collide = hashkey(t, &n->key);
     if (collide != n) {  /* Colliding node not the main node? */
+      Node *nn;
       while (noderef(collide->next) != n)  /* Find predecessor. */
 	collide = nextnode(collide);
       setmref(collide->next, freenode);  /* Relink chain. */
@@ -483,39 +484,63 @@ TValue *lj_tab_newkey(lua_State *L, GCtab *t, cTValue *key)
       freenode->next = n->next;
       setmref(n->next, NULL);
       setnilV(&n->val);
-      /* Rechain pseudo-resurrected string keys with colliding hashes. */
-      while (nextnode(freenode)) {
-	Node *nn = nextnode(freenode);
-	if (tvisstr(&nn->key) && !tvisnil(&nn->val) &&
-	    hashstr(t, strV(&nn->key)) == n) {
-	  freenode->next = nn->next;
-	  nn->next = n->next;
-	  setmref(n->next, nn);
-	  /*
-	  ** Rechaining a resurrected string key creates a new dilemma:
-	  ** Another string key may have originally been resurrected via
-	  ** _any_ of the previous nodes as a chain anchor. Including
-	  ** a node that had to be moved, which makes them unreachable.
-	  ** It's not feasible to check for all previous nodes, so rechain
-	  ** any string key that's currently in a non-main positions.
-	  */
-	  while ((nn = nextnode(freenode))) {
-	    if (tvisstr(&nn->key) && !tvisnil(&nn->val)) {
-	      Node *mn = hashstr(t, strV(&nn->key));
-	      if (mn != freenode) {
-		freenode->next = nn->next;
-		nn->next = mn->next;
-		setmref(mn->next, nn);
+      /*
+      ** Nodes after n might have n as their main node, and need rechaining
+      ** back onto n. We make use of the following property of tables: for all
+      ** nodes m, at least one of the following four statements is true:
+      **  1. tvisnil(&m->key)  NB: tvisnil(&m->val) is a stronger statement
+      **  2. tvisstr(&m->key)
+      **  3. tvisstr(&main(m)->key)
+      **  4. main(m) == main(main(m))
+      ** Initially, we need to rechain any nn which has main(nn) == n. As
+      ** main(n) != n (because collide != n earlier), main(nn) == n requires
+      ** either statement 2 or statement 3 to be true about nn.
+      */
+      if (!tvisstr(&n->key)) {
+	/* Statement 3 is not true, so only need to consider string keys. */
+	while ((nn = nextnode(freenode))) {
+	  if (tvisstr(&nn->key) && !tvisnil(&nn->val) &&
+	      hashstr(t, strV(&nn->key)) == n) {
+	    goto rechain;
+	  }
+	  freenode = nn;
+	}
+      } else {
+	/* Statement 3 is true, so need to consider all types of key. */
+	while ((nn = nextnode(freenode))) {
+	  if (!tvisnil(&nn->val) && hashkey(t, &nn->key) == n) {
+	  rechain:
+	    freenode->next = nn->next;
+	    nn->next = n->next;
+	    setmref(n->next, nn);
+	    /*
+	    ** Rechaining one node onto n creates a new dilemma: we now need
+	    ** to rechain any nn which has main(nn) == n OR has main(nn) equal
+	    ** to any node which has already been rechained. Furthermore, at
+	    ** least one of n and n->next will have a string key, so all types
+	    ** of nn key need to be considered. Rather than testing whether
+	    ** main(nn) definitely _is_ in the new chain, we test whether it
+	    ** might _not_ be in the old chain, and if so re-link it into
+	    ** the correct chain.
+	    */
+	    while ((nn = nextnode(freenode))) {
+	      if (!tvisnil(&nn->val)) {
+		Node *mn = hashkey(t, &nn->key);
+		if (mn != freenode && mn != nn) {
+		  freenode->next = nn->next;
+		  nn->next = mn->next;
+		  setmref(mn->next, nn);
+		} else {
+		  freenode = nn;
+		}
 	      } else {
 		freenode = nn;
 	      }
-	    } else {
-	      freenode = nn;
 	    }
+	    break;
+	  } else {
+	    freenode = nn;
 	  }
-	  break;
-	} else {
-	  freenode = nn;
 	}
       }
     } else {  /* Otherwise use free node. */
-- 
2.20.1

  parent reply	other threads:[~2019-05-22 19:06 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-22 19:05 [tarantool-patches] [PATCH v2 0/4] luajit: Bacport patches from openrusty Cyrill Gorcunov
2019-05-22 19:05 ` [tarantool-patches] [PATCH luajit v2 1/4] Fix overflow of snapshot map offset Cyrill Gorcunov
2019-05-22 19:05 ` [tarantool-patches] [PATCH luajit v2 2/4] Fix rechaining of pseudo-resurrected string keys Cyrill Gorcunov
2019-05-22 19:05 ` Cyrill Gorcunov [this message]
2019-05-22 19:05 ` [tarantool-patches] [PATCH tarantool v2 4/4] test/luajit-tap: Add table_chain_bug_LuaJIT_494.test.lua Cyrill Gorcunov
2019-05-23 10:32 ` [tarantool-patches] Re: [PATCH v2 0/4] luajit: Bacport patches from openrusty Kirill Yukhin
2019-05-23 10:37   ` Cyrill Gorcunov
2019-05-23 10:57     ` Kirill Yukhin
2019-05-23 11:53       ` Cyrill Gorcunov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190522190510.17201-4-gorcunov@gmail.com \
    --to=gorcunov@gmail.com \
    --cc=alexander.turenko@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] [PATCH luajit v2 3/4] bugfix: LuaJIT tables'\'' hash chains might get corrupted leading to infinite loops while fetching, missing keys, and etc.' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox