* [Tarantool-patches] [PATCH luajit 0/2] Fixes for recording of string built-ins
@ 2026-03-06 13:42 Sergey Kaplun via Tarantool-patches
2026-03-06 13:42 ` [Tarantool-patches] [PATCH luajit 1/2] Fix edge cases when generating IR for string.byte/sub/find Sergey Kaplun via Tarantool-patches
2026-03-06 13:42 ` [Tarantool-patches] [PATCH luajit 2/2] Fix edge cases when recording string.byte/sub Sergey Kaplun via Tarantool-patches
0 siblings, 2 replies; 3+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2026-03-06 13:42 UTC
To: Sergey Bronnikov; +Cc: tarantool-patches
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1407-ir-string-builtin
Related issues:
* https://github.com/LuaJIT/LuaJIT/issues/1407
* https://github.com/LuaJIT/LuaJIT/issues/1443
* https://github.com/tarantool/tarantool/issues/12134
Mike Pall (2):
Fix edge cases when generating IR for string.byte/sub/find.
Fix edge cases when recording string.byte/sub.
src/lj_ffrecord.c | 14 ++--
.../lj-1407-ir-string-builtin.test.lua | 70 +++++++++++++++++++
.../lj-1443-stirng-byte-underflow.test.lua | 25 +++++++
3 files changed, 102 insertions(+), 7 deletions(-)
create mode 100644 test/tarantool-tests/lj-1407-ir-string-builtin.test.lua
create mode 100644 test/tarantool-tests/lj-1443-stirng-byte-underflow.test.lua
--
2.53.0
* [Tarantool-patches] [PATCH luajit 1/2] Fix edge cases when generating IR for string.byte/sub/find.
From: Sergey Kaplun via Tarantool-patches @ 2026-03-06 13:42 UTC
To: Sergey Bronnikov; +Cc: tarantool-patches
From: Mike Pall <mike>
Contributed by XmiliaH.
(cherry picked from commit af9763a50da87ff8ba16e828cbd5664135e05a88)
The ADD/SUB IRs generated to calculate string indexes for the
aforementioned built-ins don't check for overflow. This may lead to
incorrect results, incorrect trace semantics, or invalid memory access.
Also, negative values may pass the UGT guard emitted for a positive
`end` position and lead to incorrect results on the trace.
This patch fixes the issue by emitting guarded ADDOV/SUBOV IRs instead.
The UGT IR is replaced with GT.
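The two problems can be reproduced on plain integers outside of the JIT.
The sketch below is not LuaJIT code: `wrap_add`, `wrap_sub`,
`checked_sub`, `ugt`, and `gt` are hypothetical helpers that only model
the 32-bit semantics of the unguarded ADD/SUB IRs, of a guarded SUBOV,
and of the UGT/GT comparisons. It uses the values from the test for
`string.byte('', 0x7FFFFFFF, -0x7FFFFFFF)` and assumes the usual
two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not LuaJIT API) that model the semantics of
 * 32-bit IR instructions on plain integers.
 */

/* Unguarded ADD/SUB: the result silently wraps modulo 2^32. */
int32_t wrap_add(int32_t a, int32_t b)
{
  return (int32_t)((uint32_t)a + (uint32_t)b);
}

int32_t wrap_sub(int32_t a, int32_t b)
{
  return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Guarded SUBOV model: returns 0 (guard fails) on overflow. */
int checked_sub(int32_t a, int32_t b, int32_t *res)
{
  int64_t wide = (int64_t)a - (int64_t)b;
  if (wide < INT32_MIN || wide > INT32_MAX)
    return 0;
  *res = (int32_t)wide;
  return 1;
}

/* UGT compares the operands as unsigned, GT as signed. */
int ugt(int32_t a, int32_t b) { return (uint32_t)a > (uint32_t)b; }
int gt(int32_t a, int32_t b) { return a > b; }

void check_examples(void)
{
  /* Values of string.byte('', 0x7FFFFFFF, -0x7FFFFFFF) from the test. */
  int32_t len = 0, start = 0x7FFFFFFF, end = -0x7FFFFFFF;
  int32_t trend = wrap_add(wrap_add(len, end), 1); /* len + end + 1 */
  int32_t trstart = wrap_add(start, -1);           /* start - 1 */
  int32_t trslen = 0;

  /* The wrapped SUB yields 4, so a guard expecting 4 results passes. */
  assert(wrap_sub(trend, trstart) == 4);
  /* The checked subtraction reports the overflow instead. */
  assert(!checked_sub(trend, trstart, &trslen));

  /* A negative `end` (-2) compared against the string length 3: */
  assert(ugt(-2, 3));  /* The unsigned comparison lets it pass. */
  assert(!gt(-2, 3));  /* The signed comparison rejects it. */
}
```

Calling `check_examples()` from a `main()` runs all the assertions. The
wrapped difference of 4 matches the result count from the test comment,
which is why the unguarded trace accepts the invalid range.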
Sergey Kaplun:
* added the description and the test for the problem
Part of tarantool/tarantool#12134
---
src/lj_ffrecord.c | 8 +--
.../lj-1407-ir-string-builtin.test.lua | 70 +++++++++++++++++++
2 files changed, 74 insertions(+), 4 deletions(-)
create mode 100644 test/tarantool-tests/lj-1407-ir-string-builtin.test.lua
diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index 3b82d044..d888e83e 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -752,7 +752,7 @@ static TRef recff_string_start(jit_State *J, GCstr *s, int32_t *st, TRef tr,
emitir(IRTGI(IR_EQ), tr, tr0);
tr = tr0;
} else {
- tr = emitir(IRTI(IR_ADD), tr, lj_ir_kint(J, -1));
+ tr = emitir(IRTGI(IR_ADDOV), tr, lj_ir_kint(J, -1));
emitir(IRTGI(IR_GE), tr, tr0);
start--;
}
@@ -804,7 +804,7 @@ static void LJ_FASTCALL recff_string_range(jit_State *J, RecordFFData *rd)
} else if ((MSize)end <= str->len) {
emitir(IRTGI(IR_ULE), trend, trlen);
} else {
- emitir(IRTGI(IR_UGT), trend, trlen);
+ emitir(IRTGI(IR_GT), trend, trlen);
end = (int32_t)str->len;
trend = trlen;
}
@@ -812,7 +812,7 @@ static void LJ_FASTCALL recff_string_range(jit_State *J, RecordFFData *rd)
if (rd->data) { /* Return string.sub result. */
if (end - start >= 0) {
/* Also handle empty range here, to avoid extra traces. */
- TRef trptr, trslen = emitir(IRTI(IR_SUB), trend, trstart);
+ TRef trptr, trslen = emitir(IRTGI(IR_SUBOV), trend, trstart);
emitir(IRTGI(IR_GE), trslen, tr0);
trptr = emitir(IRT(IR_STRREF, IRT_PGC), trstr, trstart);
J->base[0] = emitir(IRT(IR_SNEW, IRT_STR), trptr, trslen);
@@ -823,7 +823,7 @@ static void LJ_FASTCALL recff_string_range(jit_State *J, RecordFFData *rd)
} else { /* Return string.byte result(s). */
ptrdiff_t i, len = end - start;
if (len > 0) {
- TRef trslen = emitir(IRTI(IR_SUB), trend, trstart);
+ TRef trslen = emitir(IRTGI(IR_SUBOV), trend, trstart);
emitir(IRTGI(IR_EQ), trslen, lj_ir_kint(J, (int32_t)len));
if (J->baseslot + len > LJ_MAX_JSLOTS)
lj_trace_err_info(J, LJ_TRERR_STACKOV);
diff --git a/test/tarantool-tests/lj-1407-ir-string-builtin.test.lua b/test/tarantool-tests/lj-1407-ir-string-builtin.test.lua
new file mode 100644
index 00000000..f168469a
--- /dev/null
+++ b/test/tarantool-tests/lj-1407-ir-string-builtin.test.lua
@@ -0,0 +1,70 @@
+local tap = require('tap')
+
+-- Test file to demonstrate incorrect LuaJIT recording for the
+-- corner cases of the `string` built-ins. All cases below don't
+-- check the integer overflow/underflow correctly.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/1407.
+
+local test = tap.test('lj-1407-ir-string-builtin'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(4)
+
+local function trace_sub(s, i, e)
+ local r = s:sub(i, e)
+ return r
+end
+
+local function trace_sub_neg(s, i)
+ local r = s:sub(1, i)
+ return r
+end
+
+local function trace_byte(s, i, e)
+ local r = s:byte(i, e)
+ return r
+end
+
+local function trace_find(s, i)
+ local r = s:find('2', i)
+ return r
+end
+
+jit.opt.start('hotloop=1')
+
+-- Compile the trace.
+trace_sub('123', 1, -2)
+trace_sub('123', 1, -2)
+-- Execute the trace with the invalid memory access.
+test:is(trace_sub('123', 0x7FFFFFFF, -0x7FFFFFFF), '',
+ 'string.sub is correct at the trace')
+
+-- The arithmetic for the number of results on the trace is the
+-- following for the negative last argument (`end`):
+-- | str->len + 1 + end - (start - 1)
+-- Trace has the guard to the number of results. We should record
+-- an original trace with the guard passed for the underflowed
+-- case as well:
+-- 0 + 1 + 0x80000001 - 0x7ffffffe = 4.
+-- Compile the trace that fits the needed properties:
+trace_byte('1234', 1, -1)
+trace_byte('1234', 1, -1)
+-- Execute the trace with the invalid memory access.
+test:is(trace_byte('', 0x7FFFFFFF, -0x7FFFFFFF), nil,
+ 'string.byte is correct at the trace')
+
+-- Compile the trace.
+trace_sub_neg('123', 5)
+trace_sub_neg('123', 5)
+-- Execute the trace with negative value.
+test:is(trace_sub_neg('123', -2), '12',
+ 'string.sub negative end is correct at the trace')
+
+-- Compile the trace.
+trace_find('123', 5)
+trace_find('123', 5)
+-- Execute the trace with value to overflow.
+test:is(trace_find('123', -0x80000000), 2, 'string.find with overflow')
+
+test:done(true)
--
2.53.0
* [Tarantool-patches] [PATCH luajit 2/2] Fix edge cases when recording string.byte/sub.
From: Sergey Kaplun via Tarantool-patches @ 2026-03-06 13:42 UTC
To: Sergey Bronnikov; +Cc: tarantool-patches
From: Mike Pall <mike>
Thanks to Sergey Kaplun.
(cherry picked from commit 89f268b3f745dba80da6350d3cbbb0964f3fdbee)
In `recff_string_range()`, `len` (`end - start`) may underflow and
become positive when `end` is negative. For `string.sub()` this is not
critical, since the trace is still valid. For `string.byte()`, however,
it may lead to an assertion failure in `rec_check_slots()`.
This patch fixes the underflow by comparing `start` and `end` directly
instead of testing the sign of their difference.
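The difference between the two checks can be seen on plain integers.
The sketch below is not LuaJIT code: `wrap_sub32`, `has_results_old`,
and `has_results_new` are hypothetical names, and the wrapping
subtraction only models the underflow described above. It uses the
values from the test, `('xxx'):byte(0x7FFFFFFF, -0x7FFFFFFF)`, after the
positions have been adjusted the way the recorder adjusts them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not LuaJIT code. */

/* Wrapping 32-bit subtraction that models the underflow of `end - start`. */
int32_t wrap_sub32(int32_t a, int32_t b)
{
  return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Old check: decide by the sign of the (possibly wrapped) difference. */
int has_results_old(int32_t start, int32_t end)
{
  return wrap_sub32(end, start) > 0;
}

/* New check: compare the positions directly, so nothing can overflow. */
int has_results_new(int32_t start, int32_t end)
{
  return start < end;
}

void check_byte_underflow(void)
{
  /* Values of ('xxx'):byte(0x7FFFFFFF, -0x7FFFFFFF) from the test. */
  int32_t len = 3;
  int32_t start = 0x7FFFFFFF - 1;      /* 1-based to 0-based. */
  int32_t end = -0x7FFFFFFF + len + 1; /* Negative `end` counts from the tail. */

  /* The wrapped difference is positive: 7 results would be recorded. */
  assert(wrap_sub32(end, start) == 7);
  assert(has_results_old(start, end));
  /* The direct comparison correctly reports an empty range. */
  assert(!has_results_new(start, end));
  /* Both checks agree on an ordinary range. */
  assert(has_results_old(0, 3) && has_results_new(0, 3));
}
```

For ordinary ranges both checks agree. They differ only when the
difference does not fit into 32 bits, which is the case the test
exercises.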
Sergey Kaplun:
* added the description and the test for the problem
Part of tarantool/tarantool#12134
---
src/lj_ffrecord.c | 6 ++---
.../lj-1443-stirng-byte-underflow.test.lua | 25 +++++++++++++++++++
2 files changed, 28 insertions(+), 3 deletions(-)
create mode 100644 test/tarantool-tests/lj-1443-stirng-byte-underflow.test.lua
diff --git a/src/lj_ffrecord.c b/src/lj_ffrecord.c
index d888e83e..aad1bd87 100644
--- a/src/lj_ffrecord.c
+++ b/src/lj_ffrecord.c
@@ -810,7 +810,7 @@ static void LJ_FASTCALL recff_string_range(jit_State *J, RecordFFData *rd)
}
trstart = recff_string_start(J, str, &start, trstart, trlen, tr0);
if (rd->data) { /* Return string.sub result. */
- if (end - start >= 0) {
+ if (start <= end) {
/* Also handle empty range here, to avoid extra traces. */
TRef trptr, trslen = emitir(IRTGI(IR_SUBOV), trend, trstart);
emitir(IRTGI(IR_GE), trslen, tr0);
@@ -821,8 +821,8 @@ static void LJ_FASTCALL recff_string_range(jit_State *J, RecordFFData *rd)
J->base[0] = lj_ir_kstr(J, &J2G(J)->strempty);
}
} else { /* Return string.byte result(s). */
- ptrdiff_t i, len = end - start;
- if (len > 0) {
+ if (start < end) {
+ ptrdiff_t i, len = end - start;
TRef trslen = emitir(IRTGI(IR_SUBOV), trend, trstart);
emitir(IRTGI(IR_EQ), trslen, lj_ir_kint(J, (int32_t)len));
if (J->baseslot + len > LJ_MAX_JSLOTS)
diff --git a/test/tarantool-tests/lj-1443-stirng-byte-underflow.test.lua b/test/tarantool-tests/lj-1443-stirng-byte-underflow.test.lua
new file mode 100644
index 00000000..9f91718c
--- /dev/null
+++ b/test/tarantool-tests/lj-1443-stirng-byte-underflow.test.lua
@@ -0,0 +1,25 @@
+local tap = require('tap')
+
+-- The test file to demonstrate integer underflow during recording
+-- for the `string.byte()` built-in.
+-- See also https://github.com/LuaJIT/LuaJIT/issues/1443.
+
+local test = tap.test('lj-1443-stirng-byte-underflow'):skipcond({
+ ['Test requires JIT enabled'] = not jit.status(),
+})
+
+test:plan(1)
+
+jit.opt.start('hotloop=1')
+
+local result
+local str = 'xxx'
+for _ = 1, 4 do
+ -- Failed assertion in `rec_check_slots()` due to incorrect
+ -- number of results after underflow.
+ result = (str):byte(0X7FFFFFFF, -0X7FFFFFFF)
+end
+
+test:is(result, nil, 'correct result on trace')
+
+test:done(true)
--
2.53.0