From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id C4B36469719 for ; Fri, 14 Feb 2020 03:12:23 +0300 (MSK) Date: Fri, 14 Feb 2020 03:07:04 +0300 From: Igor Munkin Message-ID: <20200214000703.GA6018@tarantool.org> References: <62003dc4b5a3672d02c3ec599b5ecb65a557d6b5.1581635592.git.imun@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <62003dc4b5a3672d02c3ec599b5ecb65a557d6b5.1581635592.git.imun@tarantool.org> Subject: Re: [Tarantool-patches] [PATCH luajit] Make string to number conversions fail on NUL char List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Sergey Ostanevich , Alexander Turenko Cc: tarantool-patches@dev.tarantool.org Mistyped the issue link. Fixed, squashed, force-pushed to the branch. On 14.02.20, Igor Munkin wrote: > The routine used for conversion a string representation to number > (lj_strscan_scan) doesn't respect the size of the given string/buffer. > Such behaviour leads to the following results: > > | local a = tonumber("inf\x00imun") -- the result is 'inf' > | local b = tonumber("\x36\x00\x80") -- the result is 6 > > The behaviour described above is similar to the one vanila Lua 5.1 has: > > | $ ./lua -e 'print(_VERSION, tonumber("inf"..string.char(0).."imun"))' > | Lua 5.1 inf > > However, the issue is fixed in Lua 5.2 and the results are the following: > | $ ./lua -e 'print(_VERSION, tonumber("inf"..string.char(0).."imun"))' > | Lua 5.2 nil > > The patch introduces additional parameter to lj_strscan_scan routine to > detect whether there is nothing left after the null character. > > Relates to tarantool#4773 Relates to tarantool/tarantool#4773 > > Reported-by: Alexander Turenko > Signed-off-by: Igor Munkin > --- > > This is a backport of the origin commit from v2.1 branch in LuaJIT > repo extended with tests and commit message. Commit subject is left > unchanged with the respect to the origin commit. > > Upstream commit: https://github.com/LuaJIT/LuaJIT/commit/0ad60ccbc3768fa8e3e726858adf261950edbc22 > Issue: https://github.com/tarantool/tarantool/issues/4773 > Branch: https://github.com/tarantool/luajit/tree/imun/tonumber-fail-on-NUL-char > > src/lj_cparse.c | 3 ++- > src/lj_lex.c | 2 +- > src/lj_strscan.c | 11 +++++---- > src/lj_strscan.h | 3 ++- > ...gh-4773-tonumber-fail-on-NUL-char.test.lua | 24 +++++++++++++++++++ > 5 files changed, 36 insertions(+), 7 deletions(-) > create mode 100755 test/gh-4773-tonumber-fail-on-NUL-char.test.lua > > diff --git a/src/lj_cparse.c b/src/lj_cparse.c > index 24ef034..fb44056 100644 > --- a/src/lj_cparse.c > +++ b/src/lj_cparse.c > @@ -151,7 +151,8 @@ static CPToken cp_number(CPState *cp) > TValue o; > do { cp_save(cp, cp->c); } while (lj_char_isident(cp_get(cp))); > cp_save(cp, '\0'); > - fmt = lj_strscan_scan((const uint8_t *)sbufB(&cp->sb), &o, STRSCAN_OPT_C); > + fmt = lj_strscan_scan((const uint8_t *)sbufB(&cp->sb), sbuflen(&cp->sb)-1, > + &o, STRSCAN_OPT_C); > if (fmt == STRSCAN_INT) cp->val.id = CTID_INT32; > else if (fmt == STRSCAN_U32) cp->val.id = CTID_UINT32; > else if (!(cp->mode & CPARSE_MODE_SKIP)) > diff --git a/src/lj_lex.c b/src/lj_lex.c > index 2d2f819..5285691 100644 > --- a/src/lj_lex.c > +++ b/src/lj_lex.c > @@ -99,7 +99,7 @@ static void lex_number(LexState *ls, TValue *tv) > lex_savenext(ls); > } > lex_save(ls, '\0'); > - fmt = lj_strscan_scan((const uint8_t *)sbufB(&ls->sb), tv, > + fmt = lj_strscan_scan((const uint8_t *)sbufB(&ls->sb), sbuflen(&ls->sb)-1, tv, > (LJ_DUALNUM ? STRSCAN_OPT_TOINT : STRSCAN_OPT_TONUM) | > (LJ_HASFFI ? (STRSCAN_OPT_LL|STRSCAN_OPT_IMAG) : 0)); > if (LJ_DUALNUM && fmt == STRSCAN_INT) { > diff --git a/src/lj_strscan.c b/src/lj_strscan.c > index f5f35c9..08f41f1 100644 > --- a/src/lj_strscan.c > +++ b/src/lj_strscan.c > @@ -370,9 +370,11 @@ static StrScanFmt strscan_bin(const uint8_t *p, TValue *o, > } > > /* Scan string containing a number. Returns format. Returns value in o. */ > -StrScanFmt lj_strscan_scan(const uint8_t *p, TValue *o, uint32_t opt) > +StrScanFmt lj_strscan_scan(const uint8_t *p, MSize len, TValue *o, > + uint32_t opt) > { > int32_t neg = 0; > + const uint8_t *pe = p + len; > > /* Remove leading space, parse sign and non-numbers. */ > if (LJ_UNLIKELY(!lj_char_isdigit(*p))) { > @@ -390,7 +392,7 @@ StrScanFmt lj_strscan_scan(const uint8_t *p, TValue *o, uint32_t opt) > p += 3; > } > while (lj_char_isspace(*p)) p++; > - if (*p) return STRSCAN_ERROR; > + if (*p || p < pe) return STRSCAN_ERROR; > o->u64 = tmp.u64; > return STRSCAN_NUM; > } > @@ -488,6 +490,7 @@ StrScanFmt lj_strscan_scan(const uint8_t *p, TValue *o, uint32_t opt) > while (lj_char_isspace(*p)) p++; > if (*p) return STRSCAN_ERROR; > } > + if (p < pe) return STRSCAN_ERROR; > > /* Fast path for decimal 32 bit integers. */ > if (fmt == STRSCAN_INT && base == 10 && > @@ -524,7 +527,7 @@ StrScanFmt lj_strscan_scan(const uint8_t *p, TValue *o, uint32_t opt) > > int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o) > { > - StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), o, > + StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o, > STRSCAN_OPT_TONUM); > lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM); > return (fmt != STRSCAN_ERROR); > @@ -533,7 +536,7 @@ int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o) > #if LJ_DUALNUM > int LJ_FASTCALL lj_strscan_number(GCstr *str, TValue *o) > { > - StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), o, > + StrScanFmt fmt = lj_strscan_scan((const uint8_t *)strdata(str), str->len, o, > STRSCAN_OPT_TOINT); > lua_assert(fmt == STRSCAN_ERROR || fmt == STRSCAN_NUM || fmt == STRSCAN_INT); > if (fmt == STRSCAN_INT) setitype(o, LJ_TISNUM); > diff --git a/src/lj_strscan.h b/src/lj_strscan.h > index 6fb0dda..fdfe894 100644 > --- a/src/lj_strscan.h > +++ b/src/lj_strscan.h > @@ -22,7 +22,8 @@ typedef enum { > STRSCAN_INT, STRSCAN_U32, STRSCAN_I64, STRSCAN_U64, > } StrScanFmt; > > -LJ_FUNC StrScanFmt lj_strscan_scan(const uint8_t *p, TValue *o, uint32_t opt); > +LJ_FUNC StrScanFmt lj_strscan_scan(const uint8_t *p, MSize len, TValue *o, > + uint32_t opt); > LJ_FUNC int LJ_FASTCALL lj_strscan_num(GCstr *str, TValue *o); > #if LJ_DUALNUM > LJ_FUNC int LJ_FASTCALL lj_strscan_number(GCstr *str, TValue *o); > diff --git a/test/gh-4773-tonumber-fail-on-NUL-char.test.lua b/test/gh-4773-tonumber-fail-on-NUL-char.test.lua > new file mode 100755 > index 0000000..a660979 > --- /dev/null > +++ b/test/gh-4773-tonumber-fail-on-NUL-char.test.lua > @@ -0,0 +1,24 @@ > +#!/usr/bin/env tarantool > + > +local tap = require('tap') > + > +local test = tap.test("Tarantool 4773") > +test:plan(3) > + > +-- Test file to demonstrate LuaJIT tonumber routine fails on NUL char, > +-- details: > +-- https://github.com/tarantool/tarantool/issues/4773 > + > +local t = { > + zero = '0', > + null = '\x00', > + tail = 'imun', > +} > + > +-- Since VM, Lua/C API and compiler use a single routine for conversion numeric > +-- string to a number, test cases are reduced to the following: > +test:is(tonumber(t.zero), 0) > +test:is(tonumber(t.zero .. t.tail), nil) > +test:is(tonumber(t.zero .. t.null .. t.tail), nil) > + > +os.exit(test:check() and 0 or 1) > -- > 2.25.0 > -- Best regards, IM