Hi! The description looks good! Although, I can’t get the point how this > (negative offset from the stack > pointer may appear positive and result is undefined memory access). can’t be applied to a constant? Hence, we’d need a check? Although, I’m trying to get into Mike’s business again. So it LGTM as a backport. Regards, Sergos > On 28 Oct 2021, at 14:36, Sergey Kaplun wrote: > > Hi, Sergos! > > Thanks for the review! > > On 27.10.21, sergos wrote: >> Hi! >> >> Thanks for the patch, see my comments below. >> >> Regards, >> Sergos >> >>> On 18 Oct 2021, at 21:53, Sergey Kaplun wrote: >>> >>> From: Mike Pall >>> >>> Reported by Matthew Burk. >>> >>> (cherry picked from commit 9f0caad0e43f97a4613850b3874b851cb1bc301d) >>> >>> The simplify_conv_sext optimization is used for reduction of widening. >> >> Whether it is part of narrowing itself? > > Not exactly, it is fold optimization that can be used during narrowing. > > lj_opt_narrow_cindex -> lj_opt_fold (in emitir) -> fold_narrow_convert -> > lj_opt_narrow_convert -> narrow_conv_emit -> lj_opt_fold -> > fold_simplify_conv_sext > >> >>> cdata indexing narrow optimization uses it for narrowing of a C array >> >> The sentence is started with lowercase, making one to track back the start >> of the sentence. “A narrow optimization for cdata…” in conjunction of >> first sentence above should help. >> >> >>> index. The optimization eliminates sign extension for corresponding >> the >> >>> integer value. However, this conversion cannot be omitted for non >>> constant values (for example loading stack slots) as far as their sign >>> extension may change. The emitted machine code may be incorrect without >> >> I believe you meant ‘with’ the conversion. >> >>> aforementioned conversion (for example mov instruction instead movsxd is >>> used on x86 architecture). As a result the value in a destination >> >> The example is too much. Just “negative offset from the stack pointer >> may appear positive and result in undefined memory access” > > Fixed all your comments, see the updated commit message below. > Branch is force pushed. > =================================================================== > Fix FOLD rule for strength reduction of widening. > > Reported by Matthew Burk. > > (cherry picked from commit 9f0caad0e43f97a4613850b3874b851cb1bc301d) > > The simplify_conv_sext optimization is used for reduction of widening. > An indexing narrow optimization for cdata uses it for narrowing of a C > array index. The optimization eliminates sign extension for the > corresponding integer value. However, this conversion cannot be omitted > for non constant values (for example loading stack slots) as far as > their sign extension may change. The emitted machine code may be > incorrect with aforementioned conversion (negative offset from the stack > pointer may appear positive and result is undefined memory access). As a > result the value in a destination register during trace execution is > invalid. > > This patch allows this optimization only for constant integer values. > > Sergey Kaplun: > * added the description and the test for the problem > > =================================================================== >> >>> register during trace execution is invalid. >>> >>> This patch allows this optimization only for constant integer values. >> >> Should it check if integer - even a constant one - is positive? > > No, why? In fold rule are declared both types: INT64 and UINT64. > >> >>> >>> Sergey Kaplun: >>> * added the description and the test for the problem >>> --- >>> >>> Tarantool branch: https://github.com/tarantool/tarantool/tree/skaplun/gh-noticket-fix-fold-simplify-conv-sext >>> Branch: https://github.com/tarantool/luajit/tree/skaplun/gh-noticket-fix-fold-simplify-conv-sext >>> >>> src/lj_opt_fold.c | 2 +- >>> .../lj-fix-fold-simplify-conv-sext.test.lua | 35 +++++++++++++++++++ >>> 2 files changed, 36 insertions(+), 1 deletion(-) >>> create mode 100644 test/tarantool-tests/lj-fix-fold-simplify-conv-sext.test.lua >>> >>> diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c >>> index 3c508062..276dc040 100644 >>> --- a/src/lj_opt_fold.c >>> +++ b/src/lj_opt_fold.c >>> @@ -1227,7 +1227,7 @@ LJFOLDF(simplify_conv_sext) >>> if (ref == J->scev.idx) { >>> IRRef lo = J->scev.dir ? J->scev.start : J->scev.stop; >>> lua_assert(irt_isint(J->scev.t)); >>> - if (lo && IR(lo)->i + ofs >= 0) { >>> + if (lo && IR(lo)->o == IR_KINT && IR(lo)->i + ofs >= 0) { >>> ok_reduce: >>> #if LJ_TARGET_X64 >>> /* Eliminate widening. All 32 bit ops do an implicit zero-extension. */ >>> diff --git a/test/tarantool-tests/lj-fix-fold-simplify-conv-sext.test.lua b/test/tarantool-tests/lj-fix-fold-simplify-conv-sext.test.lua >>> new file mode 100644 >>> index 00000000..bd3738c5 >>> --- /dev/null >>> +++ b/test/tarantool-tests/lj-fix-fold-simplify-conv-sext.test.lua >>> @@ -0,0 +1,35 @@ >>> +local tap = require('tap') >>> +local ffi = require('ffi') >>> + >>> +local test = tap.test('lj-fix-fold-simplify-conv-sext') >>> + >>> +local NSAMPLES = 4 >>> +local NTEST = NSAMPLES * 2 + 1 > > Also fixed typo here: s/+/-/ > >>> +test:plan(NTEST) >>> + >>> +local samples = ffi.new('int [?]', NSAMPLES) >>> + >>> +-- Prepare data. >>> +for i = 0, NSAMPLES - 1 do samples[i] = i end >>> + >>> +local expected = {3, 2, 1, 0, 3, 2, 1} >>> + >>> +local START = 3 >>> +local STOP = -START >>> + >>> +local results = {} >>> +jit.opt.start('hotloop=1') >>> +for i = START, STOP, -1 do >>> + -- While recording cdata indexing the fold CONV SEXT >>> + -- optimization eliminate sign extension for the corresponding >>> + -- non constant value (i.e. stack slot). As a result the read >>> + -- out of bounds was occurring. >>> + results[#results + 1] = samples[i % NSAMPLES] >>> +end >>> + >>> +for i = 1, NTEST do >>> + test:ok(results[i] == expected[i], 'correct cdata indexing') >>> +end >>> + >>> +os.exit(test:check() and 0 or 1) >>> + >>> -- >>> 2.31.0 >>> >> > > -- > Best regards, > Sergey Kaplun