Date: Fri, 2 Feb 2024 15:21:39 +0300
From: Maxim Kokryashkin via Tarantool-patches
Reply-To: Maxim Kokryashkin
To: Sergey Kaplun
Cc: Maxim Kokryashkin, tarantool-patches@dev.tarantool.org
References: <20240112132643.106145-1-m.kokryashkin@tarantool.org>
List-Id: Tarantool development patches
Subject: Re: [Tarantool-patches] [PATCH luajit] LJ_GC64: Fix HREFK optimization.

Hi, Sergey!
Thanks for the review!
Fixed your comments, branch is force-pushed.
Here is the diff with changes:
===
diff --git a/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua b/test/tarantool-tests/lj-840-gc64-fix-hrefk-optimization.test.lua
similarity index 67%
rename from test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
rename to test/tarantool-tests/lj-840-gc64-fix-hrefk-optimization.test.lua
index a11b91e3..49168bc4 100644
--- a/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
+++ b/test/tarantool-tests/lj-840-gc64-fix-hrefk-optimization.test.lua
@@ -4,7 +4,7 @@ local tap = require('tap')
 -- in LuaJIT.
 
 local ffi = require('ffi')
-local test = tap.test('lj-840-fix-hrefk-optimization'):skipcond({
+local test = tap.test('lj-840-gc64-fix-hrefk-optimization'):skipcond({
   ['Test requires GC64 mode enabled'] = not ffi.abi('gc64'),
   ['Test requires JIT enabled'] = not jit.status(),
 })
@@ -12,10 +12,14 @@ test:plan(1)
 
 local table_new = require('table.new')
 
+local SIZEOF_GCTAB = 64
+-- See `chunk2mem` in lj_alloc.c for details.
+local ALLOC_CHUNK_HDR = 16
+local GCTAB_FOOTPRINT = SIZEOF_GCTAB + ALLOC_CHUNK_HDR
 -- Size of single hash node in bytes.
 local NODE_SIZE = 24
--- Number of hash nodes to allocate on each iteration
--- based on the condition from `rec_idx_key`
+-- The maximum value that can be stored in a 16-bit `op2`
+-- field in HREFK IR.
 local HASH_NODES = 65535
 -- The vector of hash nodes should have a raw size of
 -- `HASH_NODES * NODE_SIZE`, which is allocated in
@@ -23,16 +27,17 @@ local HASH_NODES = 65535
 -- the LuaJIT allocator adds a bunch of small paddings
 -- and aligns the required size to LJ_PAGESIZE, which is
 -- 4096, so the actual allocated size includes alignment.
-local ALIGNMENT = 4096
+local LJ_PAGESIZE = 4096
 -- The vector for hash nodes in the table is allocated based on
 -- `hbits`, so it's actually got a size of 65536 nodes.
-local SINGLE_ITERATION_ALLOC = (HASH_NODES + 1) * NODE_SIZE + ALIGNMENT + 72
+local NODE_VECTOR_SIZE = (HASH_NODES + 1) * NODE_SIZE + LJ_PAGESIZE
+local SINGLE_ITERATION_ALLOC = NODE_VECTOR_SIZE + GCTAB_FOOTPRINT
 -- We need to overflow the 32-bit distance to the global nilnode,
--- so we divide 2^32 by the SINGLE_ITERATION_ALLOC. There are a
--- bunch of non-table.new allocations already performed, so one
--- iteration is subtracted to account for them.
-local N_ITERATIONS = 0x100000000 / SINGLE_ITERATION_ALLOC - 1
--- Prevent anchor table from interfering with target table allocations.
+-- so we divide 2^32 by the SINGLE_ITERATION_ALLOC and ceil
+-- the result.
+local N_ITERATIONS = math.ceil((2 ^ 32) / SINGLE_ITERATION_ALLOC)
+-- Prevent anchor table from interfering with target
+-- table allocations.
 local anchor = table.new(N_ITERATIONS, 0)
 
 -- Construct table.
===

New commit message:
===
LJ_GC64: Fix HREFK optimization.

Contributed by XmiliaH.

(cherry-picked from commit 91bc6b8ad1f373c1ce9003dc024b2e21fad0e444)

In `lj_record_idx` when `ix->oldv` is the global nilnode and the
required key is not present in the table, it is possible to pass
the constant key lookup optimization condition because of the
`uint32_t` (`MSize`) overflow. Because of that, further recording
incorrectly removes the check for the nilnode, which produces
wrong results when trace is called for a different table.

The issue is solved by using `GCSize`, which has a size of 64 bits,
instead of `MSize`.

Maxim Kokryashkin:
* added the description and the test for the problem

Part of tarantool/tarantool#9145
===

On Mon, Jan 15, 2024 at 06:22:29PM +0300, Sergey Kaplun via Tarantool-patches wrote:
> Hi, Maxim!
> Thanks for the patch!
> LGTM with a few comments below.
>
> On 12.01.24, Maxim Kokryashkin wrote:
> > From: Mike Pall
> >
> > Contributed by XmiliaH.
> >
> > (cherry-picked from commit 91bc6b8ad1f373c1ce9003dc024b2e21fad0e444)
> >
> > In `lj_record_idx` when `ix->oldv` is the global nilnode and the
> > required key is not present in the table, it is possible to pass
> > the constant key lookup optimization condition because of the
> > `uint32_t` overflow. Because of that, further recording
>
> I suggest clarifying like the following:
> | `uint32_t` (`MSize`)
> Feel free to ignore.
Fixed.
>
> > incorrectly removes the check for the nilnode, which produces
> > wrong results when trace is called for a different table.
>
> Nit: Please mention also how the problem is fixed.
Fixed.
>
> >
> > Maxim Kokryashkin:
> > * added the description and the test for the problem
> >
> > Part of tarantool/tarantool#9145
> > ---
> > Branch: https://github.com/tarantool/luajit/tree/fckxorg/lj-840-fix-hrefk-optimization
> > PR: https://github.com/tarantool/tarantool/pull/9591
> > Issues: https://github.com/LuaJIT/LuaJIT/issues/840
> > https://github.com/tarantool/tarantool/issues/9145
> >
> > src/lj_record.c | 8 +--
> > .../lj-840-fix-hrefk-optimization.test.lua | 58 +++++++++++++++++++
> > 2 files changed, 62 insertions(+), 4 deletions(-)
> > create mode 100644 test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
> >
> > diff --git a/src/lj_record.c b/src/lj_record.c
> > index a929b8aa..919e7169 100644
> > --- a/src/lj_record.c
> > +++ b/src/lj_record.c
>
> > diff --git a/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua b/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
>
> Nit: We can also add gc64 prefix for this test like:
>
> Feel free to ignore.
Fixed.
>
> > new file mode 100644
> > index 00000000..a11b91e3
> > --- /dev/null
> > +++ b/test/tarantool-tests/lj-840-fix-hrefk-optimization.test.lua
> > @@ -0,0 +1,58 @@
> > +local tap = require('tap')
> > +
> > +-- Test file to demonstrate incorrect HREFK optimization
> > +-- in LuaJIT.
> > +
> > +local ffi = require('ffi')
> > +local test = tap.test('lj-840-fix-hrefk-optimization'):skipcond({
> > +  ['Test requires GC64 mode enabled'] = not ffi.abi('gc64'),
> > +  ['Test requires JIT enabled'] = not jit.status(),
> > +})
> > +test:plan(1)
> > +
> > +local table_new = require('table.new')
> > +
> > +-- Size of single hash node in bytes.
> > +local NODE_SIZE = 24
> > +-- Number of hash nodes to allocate on each iteration
> > +-- based on the condition from `rec_idx_key`
>
> Nit: Missed dot at the end of the sentence.
>
> It is more correct to say that we use this is restricted by the IR
> format:
> `op2` field in the HREFK IR is a slot number and it is 16-bit wide.
> 65535 == 2^16 - 1; i.e., it is the maximum value that can be stored in a
> 16-bit field.
Fixed.
>
> > +local HASH_NODES = 65535
> > +-- The vector of hash nodes should have a raw size of
> > +-- `HASH_NODES * NODE_SIZE`, which is allocated in
> > +-- `lj_alloc_malloc` directly with `mmap`. However,
> > +-- the LuaJIT allocator adds a bunch of small paddings
> > +-- and aligns the required size to LJ_PAGESIZE, which is
> > +-- 4096, so the actual allocated size includes alignment.
> > +local ALIGNMENT = 4096
>
> Minor: So, maybe name it `LJ_PAGESIZE`?
> Feel free to ignore.
>
> > +-- The vector for hash nodes in the table is allocated based on
> > +-- `hbits`, so it's actually got a size of 65536 nodes.
> > +local SINGLE_ITERATION_ALLOC = (HASH_NODES + 1) * NODE_SIZE + ALIGNMENT + 72
>
> What is the magic number 72?
It's the GCtab memory footprint, but I got it wrong: it's actually 80.
Fixed.
>
> > +-- We need to overflow the 32-bit distance to the global nilnode,
> > +-- so we divide 2^32 by the SINGLE_ITERATION_ALLOC. There are a
> > +-- bunch of non-table.new allocations already performed, so one
> > +-- iteration is subtracted to account for them.
>
> Why is it crucial to subtract it? What happens without it?
> I suppose that the new table will still be huge enough, won't it?
>
> > +local N_ITERATIONS = 0x100000000 / SINGLE_ITERATION_ALLOC - 1
I made a mistake here too. There must be no subtraction; moreover,
the result must be rounded up (ceil).
Fixed.
>
> Minor: We can use `(2 ^ 32)` instead of 0x100000000 (it is easier to
> read).
> Feel free to ignore.
Fixed.
>
> > +-- Prevent anchor table from interfering with target table allocations.
>
> Nit: Comment length is more than 66 symbols.
Fixed.
>
> > +local anchor = table.new(N_ITERATIONS, 0)
> > +
> > +-- Construct table.
> > +for _ = 1, N_ITERATIONS do
> > +  table.insert(anchor, table_new(0, HASH_NODES))
> > +end
> > +
> > +jit.opt.start('hotloop=1')
> > +local function get_n(tab)
> > +  local x
> > +  for _ = 1, 4 do
> > +    x = tab.n
> > +  end
> > +  return x
> > +end
> > +
> > +-- Record the trace for the constructed table.
> > +get_n(anchor[#anchor])
> > +
> > +-- Check the result for the table that has the required key.
> > +local result = get_n({n=1})
> > +test:is(result, 1, 'correct value retrieved')
> > +test:done(true)
> > --
> > 2.43.0
> >
>
> --
> Best regards,
> Sergey Kaplun
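
For anyone following the thread, the overflow described in the commit
message and the iteration count used by the test can be sketched in
plain C. This is only an illustration under stated assumptions, not
the code from `src/lj_record.c`: the function names
(`distance_fits_u32`, `distance_fits_u64`, `single_iteration_alloc`,
`iterations_to_overflow`) and the macro names are hypothetical, and
the constants are taken from the comments in the test above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants from the test comments; the names are illustrative only. */
#define NODE_SIZE 24u          /* Size of a single hash node in bytes. */
#define HASH_NODES 65535u      /* Maximum value of a 16-bit op2 field. */
#define LJ_PAGESIZE_BYTES 4096u
#define GCTAB_FOOTPRINT 80u    /* 64-byte GCtab + 16-byte chunk header. */

/*
 * Range check with the byte distance truncated to 32 bits. This mimics
 * the shape of the bug: a distance of 2^32 + k is seen as k.
 */
static bool distance_fits_u32(uint64_t distance, uint32_t hmask)
{
  uint32_t d = (uint32_t)distance; /* Upper 32 bits are lost here. */
  return hmask > 0 && d <= hmask * NODE_SIZE && d <= HASH_NODES * NODE_SIZE;
}

/* The same check with the distance kept in 64 bits. */
static bool distance_fits_u64(uint64_t distance, uint32_t hmask)
{
  return hmask > 0 && distance <= (uint64_t)hmask * NODE_SIZE &&
         distance <= (uint64_t)HASH_NODES * NODE_SIZE;
}

/* Bytes the test assumes one table_new(0, HASH_NODES) call consumes. */
static uint64_t single_iteration_alloc(void)
{
  return (uint64_t)(HASH_NODES + 1u) * NODE_SIZE + LJ_PAGESIZE_BYTES +
         GCTAB_FOOTPRINT;
}

/* ceil(2^32 / alloc), computed in integer arithmetic. */
static uint64_t iterations_to_overflow(uint64_t alloc)
{
  const uint64_t span = UINT64_C(1) << 32;
  return (span + alloc - 1) / alloc;
}
```

With a distance slightly above 2^32, the 32-bit variant accepts the
value while the 64-bit variant rejects it, which is the difference the
switch from `MSize` to `GCSize` is meant to make. Rounding the
iteration count up rather than down ensures the allocated span is at
least 2^32 bytes under the test's size assumptions.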