Tarantool development patches archive
From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Sergey Ostanevich <sergos@tarantool.org>,
	Igor Munkin <imun@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: [Tarantool-patches] [PATCH luajit] Give expected results for negative non-base-10 numbers in tonumber().
Date: Mon, 27 Dec 2021 16:42:37 +0300
Message-ID: <20211227134237.2942-1-skaplun@tarantool.org>

From: Mike Pall <mike>

This was undefined in Lua 5.1, but it's defined in 5.2.

(cherry picked from f3cf0d6e15240098147437fed7bd436ff55fdf8c)

`strtoul()` accepts a leading minus sign as valid input and silently
converts the negated value to the equivalent unsigned long value. As a
result, the yielded value is not what the user expects.
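
A minimal standalone sketch of the behaviour described above, using only
the C standard library. The helper names `parse_unsigned_only()` and
`check_wraparound()` are hypothetical and are not part of LuaJIT; they
only mirror the pre-patch call to `strtoul()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the pre-patch call: strtoul() alone. */
static unsigned long parse_unsigned_only(const char *s, int base)
{
  char *end;
  return strtoul(s, &end, base);
}

/* Self-check: "-10" in base 8 is 8, negated in unsigned long arithmetic. */
static void check_wraparound(void)
{
  assert(parse_unsigned_only("-10", 8) == ULONG_MAX - 7);
  assert(parse_unsigned_only("10", 8) == 8);
}
```

On a platform with a 64-bit unsigned long, the first value is
18446744073709551608, which is the kind of huge positive result
`tonumber('-10', 8)` would be built from before the patch.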

This patch reads the sign (if any) from the argument and passes the
remaining part of the string to `strtoul()` as is, provided it starts
with a digit or an alphabetic character, to be consistent with Lua 5.2.
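
The approach can be sketched outside of LuaJIT roughly as follows. This
is an illustration only: `parse_signed()` is a hypothetical name, the
standard `isspace()`/`isalnum()` stand in for `lj_char_isspace()` and
`lj_char_isalnum()`, and the result is always a double (there is no
LJ_DUALNUM integer path here).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical standalone analogue of the patched branch.
 * Returns 1 and stores the value in *out on success, 0 otherwise.
 */
static int parse_signed(const char *p, int base, double *out)
{
  char *ep;
  int neg = 0;
  unsigned long ul;
  while (isspace((unsigned char)*p)) p++;  /* Skip leading whitespace. */
  if (*p == '-') { p++; neg = 1; } else if (*p == '+') { p++; }
  /* Reject a second sign or whitespace that strtoul() would accept. */
  if (!isalnum((unsigned char)*p)) return 0;
  ul = strtoul(p, &ep, base);
  if (p == ep) return 0;  /* No digits valid in this base. */
  while (isspace((unsigned char)*ep)) ep++;  /* Allow trailing whitespace. */
  if (*ep != '\0') return 0;  /* Trailing garbage. */
  *out = neg ? -(double)ul : (double)ul;
  return 1;
}
```

The `isalnum()` check is what makes inputs such as '--1010' or '- 1010'
fail: without it, `strtoul()` would consume the second sign or the
whitespace itself.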

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#6548
---

Issue: https://github.com/tarantool/tarantool/issues/6548
Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-tonumber-expected-results-full-ci
Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-tonumber-expected-results-full-ci

CI is red due to integration test failures (the same as on master) or
due to connection errors.

Side note: I suppose that the undefined behaviour Mike is talking about
comes from the following lines in the Lua 5.1 Reference Manual [1]:

| In base 10 (the default), the number can have a decimal part, as well
| as an optional exponent part (see paragraph 2.1). In other bases, only
| unsigned integers are accepted.

In the Lua 5.2 Reference Manual [2], these lines are removed.

 src/lib_base.c                                | 27 +++++++++++------
 ...onumber-negative-non-decimal-base.test.lua | 29 +++++++++++++++++++
 2 files changed, 47 insertions(+), 9 deletions(-)
 create mode 100644 test/tarantool-tests/tonumber-negative-non-decimal-base.test.lua

diff --git a/src/lib_base.c b/src/lib_base.c
index 3a757870..d61e8762 100644
--- a/src/lib_base.c
+++ b/src/lib_base.c
@@ -287,18 +287,27 @@ LJLIB_ASM(tonumber)		LJLIB_REC(.)
   } else {
     const char *p = strdata(lj_lib_checkstr(L, 1));
     char *ep;
+    unsigned int neg = 0;
     unsigned long ul;
     if (base < 2 || base > 36)
       lj_err_arg(L, 2, LJ_ERR_BASERNG);
-    ul = strtoul(p, &ep, base);
-    if (p != ep) {
-      while (lj_char_isspace((unsigned char)(*ep))) ep++;
-      if (*ep == '\0') {
-	if (LJ_DUALNUM && LJ_LIKELY(ul < 0x80000000u))
-	  setintV(L->base-1-LJ_FR2, (int32_t)ul);
-	else
-	  setnumV(L->base-1-LJ_FR2, (lua_Number)ul);
-	return FFH_RES(1);
+    while (lj_char_isspace((unsigned char)(*p))) p++;
+    if (*p == '-') { p++; neg = 1; } else if (*p == '+') { p++; }
+    if (lj_char_isalnum((unsigned char)(*p))) {
+      ul = strtoul(p, &ep, base);
+      if (p != ep) {
+	while (lj_char_isspace((unsigned char)(*ep))) ep++;
+	if (*ep == '\0') {
+	  if (LJ_DUALNUM && LJ_LIKELY(ul < 0x80000000u+neg)) {
+	    if (neg) ul = -ul;
+	    setintV(L->base-1-LJ_FR2, (int32_t)ul);
+	  } else {
+	    lua_Number n = (lua_Number)ul;
+	    if (neg) n = -n;
+	    setnumV(L->base-1-LJ_FR2, n);
+	  }
+	  return FFH_RES(1);
+	}
       }
     }
   }
diff --git a/test/tarantool-tests/tonumber-negative-non-decimal-base.test.lua b/test/tarantool-tests/tonumber-negative-non-decimal-base.test.lua
new file mode 100644
index 00000000..94df3b1f
--- /dev/null
+++ b/test/tarantool-tests/tonumber-negative-non-decimal-base.test.lua
@@ -0,0 +1,29 @@
+local tap = require('tap')
+
+local test = tap.test('tonumber-negative-non-decimal-base')
+test:plan(18)
+
+-- Test valid tonumber() with +- signs and non-10 base.
+test:ok(tonumber('-010', 2) == -2, 'negative base 2')
+test:ok(tonumber('-10', 8) == -8, 'negative base 8')
+test:ok(tonumber('-0x10', 16) == -16, 'negative base 16')
+test:ok(tonumber('  -1010  ', 2) == -10, 'negative base 2 with spaces')
+test:ok(tonumber('  +1010  ', 2) == 10, 'positive base 2 with spaces')
+test:ok(tonumber('  -012  ', 8) == -10, 'negative base 8 with spaces')
+test:ok(tonumber('  +012  ', 8) == 10, 'positive base 8 with spaces')
+test:ok(tonumber('  -10  ', 16) == -16, 'negative base 16 with spaces')
+test:ok(tonumber('  +10  ', 16) == 16, 'positive base 16 with spaces')
+test:ok(tonumber('  -1Z  ', 36) == -36 - 35, 'negative base 36 with spaces')
+test:ok(tonumber('  +1z  ', 36) == 36 + 35, 'positive base 36 with spaces')
+test:ok(tonumber('-fF', 16) == -(15 + (16 * 15)), 'negative base 16 mixed case')
+test:ok(tonumber('-0ffffffFFFF', 16) - 1 == -2 ^ 40, 'negative base 16 long')
+
+-- Test invalid tonumber() for non-10 base.
+test:ok(tonumber('-z1010  ', 2) == nil, 'incorrect notation in base 2')
+test:ok(tonumber('--1010  ', 2) == nil, 'double minus sign')
+test:ok(tonumber('-+1010  ', 2) == nil, 'minus plus sign')
+test:ok(tonumber('- 1010  ', 2) == nil, 'space between sign and value')
+test:ok(tonumber('-_1010  ', 2) == nil,
+	'invalid character between sign and value')
+
+os.exit(test:check() and 0 or 1)
-- 
2.34.1

[1]: http://www.lua.org/manual/5.1/manual.html#pdf-tonumber
[2]: http://www.lua.org/manual/5.2/manual.html#pdf-tonumber