From mboxrd@z Thu Jan 1 00:00:00 1970
To: Maxim Kokryashkin, Sergey Bronnikov
Date: Wed, 25 Sep 2024 13:36:56 +0300
Message-ID: <20240925103656.14771-1-skaplun@tarantool.org>
X-Mailer: git-send-email 2.46.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject:
[Tarantool-patches] [PATCH luajit] Limit CSE for IR_CARG to fix loop optimizations.
X-BeenThere: tarantool-patches@dev.tarantool.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id: Tarantool development patches
From: Sergey Kaplun via Tarantool-patches
Reply-To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Errors-To: tarantool-patches-bounces@dev.tarantool.org
Sender: "Tarantool-patches"

From: Mike Pall

Thanks to Peter Cawley.

(cherry picked from commit 3bdc6498c4c012a8fbf9cfa2756a5b07f56f1540)

`IR_CALLXS` for a vararg function contains `IR_CARG(fptr, ctid)` as
its second operand. `loop_emit_phi()` scans only the first operand
of the IR, so the second is not marked as PHI. In this case, when
the IR appears in both the invariant and variant parts of the loop,
CSE may remove it, which leads to incorrectly emitted code. This
patch tweaks the CSE rules to avoid CSE across `IR_LOOP`.

Sergey Kaplun:
* added the description and the test for the problem

Part of tarantool/tarantool#10199
---
Branch: https://github.com/tarantool/luajit/tree/skaplun/lj-1244-missing-phi-carg
Related issues:
* https://github.com/tarantool/tarantool/issues/10199
* https://github.com/LuaJIT/LuaJIT/issues/1244

 src/lj_opt_fold.c                             | 11 ++++
 .../lj-1244-missing-phi-carg.test.lua         | 53 +++++++++++++++++++
 2 files changed, 64 insertions(+)
 create mode 100644 test/tarantool-tests/lj-1244-missing-phi-carg.test.lua

diff --git a/src/lj_opt_fold.c b/src/lj_opt_fold.c
index e2171e1b..33e5f9dd 100644
--- a/src/lj_opt_fold.c
+++ b/src/lj_opt_fold.c
@@ -2406,6 +2406,17 @@ LJFOLD(XSNEW any any)
 LJFOLD(BUFHDR any any)
 LJFOLDX(lj_ir_emit)
 
+/* -- Miscellaneous ------------------------------------------------------- */
+
+LJFOLD(CARG any any)
+LJFOLDF(cse_carg)
+{
+  TRef tr = lj_opt_cse(J);
+  if (tref_ref(tr) < J->chain[IR_LOOP])  /* CSE across loop? */
+    return EMITFOLD;  /* Raw emit. Assumes fins is left intact by CSE. */
+  return tr;
+}
+
 /* ------------------------------------------------------------------------ */
 
 /* Every entry in the generated hash table is a 32 bit pattern:
diff --git a/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua b/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua
new file mode 100644
index 00000000..865cdd26
--- /dev/null
+++ b/test/tarantool-tests/lj-1244-missing-phi-carg.test.lua
@@ -0,0 +1,53 @@
+local ffi = require('ffi')
+local table_new = require('table.new')
+
+-- Test file to demonstrate LuaJIT's incorrect behaviour when
+-- recording an FFI call to a vararg function. See also:
+-- https://github.com/LuaJIT/LuaJIT/issues/1244.
+local tap = require('tap')
+local test = tap.test('lj-1244-missing-phi-carg'):skipcond({
+  ['Test requires JIT enabled'] = not jit.status(),
+})
+
+-- Loop unrolls into 2 iterations. This means that the loop is
+-- executed on trace on the 5th iteration (instead of the usual
+-- 4th). Run it for an even number of iterations to test both, so
+-- the last is the 6th.
+local NTESTS = 6
+
+test:plan(NTESTS)
+
+ffi.cdef[[
+  double sin(double, ...);
+  double cos(double, ...);
+]]
+
+local EXPECTED = {[0] = ffi.C.sin(0), ffi.C.cos(0)}
+
+-- Array of 2 functions.
+local fns = ffi.new('double (*[2])(double, ...)')
+fns[0] = ffi.C.cos
+fns[1] = ffi.C.sin
+
+-- Avoid reallocating the table on the trace.
+local result = table_new(8, 0)
+
+jit.opt.start('hotloop=1')
+
+local fn = fns[0]
+-- The first result is `cos()`.
+for i = 1, NTESTS do
+  result[i] = fn(0)
+  fn = fns[i % 2]
+  -- The call persists in the invariant part of the loop as well.
+  -- Hence, XLOAD (part of the IR_CARG -- function to be called)
+  -- should be marked as PHI, but it isn't due to CSE.
+  fn(0)
+end
+
+for i = 1, NTESTS do
+  test:is(result[i], EXPECTED[i % 2],
+          ('correct result on iteration %d'):format(i))
+end
+
+test:done(true)
--
2.46.0
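
For readers unfamiliar with the fold engine, the guard in `cse_carg`
can be modelled in isolation. The sketch below is a toy model, not
LuaJIT code: the `Ins` struct, the `OP_*` opcodes, `emit()`, `cse()`
and `cse_guarded()` are all hypothetical names invented for
illustration. It only assumes what the patch itself shows: CSE walks a
per-opcode chain of earlier instructions, and a match whose reference
is below the loop marker is rejected in favour of a fresh emit.

```c
#include <assert.h>

/* Hypothetical toy IR; these are NOT LuaJIT's real data structures. */
enum { OP_LOOP, OP_XLOAD, OP_CARG, OP__MAX };
enum { MAX_INS = 64 };

typedef struct { int op, op1, op2, prev; } Ins;

static Ins ir[MAX_INS];
static int nins = 1;        /* Next free ref; ref 0 means "none". */
static int chain[OP__MAX];  /* Most recent ref per opcode, 0 if none. */

/* Append an instruction and link it into its opcode chain. */
static int emit(int op, int op1, int op2)
{
  int ref = nins++;
  assert(ref < MAX_INS);
  ir[ref].op = op;
  ir[ref].op1 = op1;
  ir[ref].op2 = op2;
  ir[ref].prev = chain[op];
  chain[op] = ref;
  return ref;
}

/* Plain CSE: reuse any earlier instruction with equal operands. */
static int cse(int op, int op1, int op2)
{
  int ref;
  for (ref = chain[op]; ref != 0; ref = ir[ref].prev)
    if (ir[ref].op1 == op1 && ir[ref].op2 == op2)
      return ref;
  return emit(op, op1, op2);
}

/* Guarded CSE: a match located before the loop marker is not reused. */
static int cse_guarded(int op, int op1, int op2)
{
  int ref = cse(op, op1, op2);
  if (ref < chain[OP_LOOP])      /* Match lies across the loop? */
    return emit(op, op1, op2);   /* Emit a fresh copy instead. */
  return ref;
}
```

In this model, emitting a CARG before the loop marker and then asking
for the same CARG after it gives different answers: `cse()` returns the
pre-loop reference, while `cse_guarded()` emits a new instruction in
the loop body. A second `cse_guarded()` call inside the loop reuses
that new instruction, so CSE within the loop body is unaffected.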