From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <67655dd2-51ef-4d5c-ad10-31b090c1dd46@tarantool.org>
Date: Fri, 2 Jan 2026 13:03:34 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2 luajit 11/41] perf: adjust k-nucleotide in LuaJIT-benches
References: <27c8ab14bd988b3a60f9204502b7e0edb14c7480.1766738771.git.skaplun@tarantool.org>
In-Reply-To: <27c8ab14bd988b3a60f9204502b7e0edb14c7480.1766738771.git.skaplun@tarantool.org>
List-Id: Tarantool development patches
Content-Type: multipart/alternative; boundary="------------rPyFIDR9MCan9sPcuCvgHlIm"

This is a multi-part message in MIME format.
--------------rPyFIDR9MCan9sPcuCvgHlIm
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hello, Sergey,

please see my comments.

Sergey

On 12/26/25 12:17, Sergey Kaplun wrote:
This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script still
can be provided in the command line run.

The benchmark input is given by redirecting the corresponding
<FASTA_5000000> file generated by the `libs/fasta.lua 5e6`. The output
from the benchmark is redirected to /dev/null. All checks are done by
the comparison with the precomputed values for the aforementioned file.
---
 perf/LuaJIT-benches/k-nucleotide.lua | 96 ++++++++++++++++++++++++----
 1 file changed, 85 insertions(+), 11 deletions(-)

diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
index 0bfb41be..e92429e8 100644
--- a/perf/LuaJIT-benches/k-nucleotide.lua
+++ b/perf/LuaJIT-benches/k-nucleotide.lua
@@ -1,3 +1,10 @@
+-- The benchmark that checks the performance of hash tables.
+-- The program reads the redirected FASTA format file from stdin,
+-- extracts DNA sequence THREE, and counts the specific sequences.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html
+
+local bench = require('bench').new(arg)
 
 local function kfrequency(seq, freq, k, frame)
   local sub = string.sub
@@ -12,7 +19,7 @@ local function count(seq, frag)
   local k = #frag
   local freq = {}
   for frame=1,k do kfrequency(seq, freq, k, frame) end
-  io.write(freq[frag] or 0, "\t", frag, "\n")
+  return freq[frag]
 end
 
 local function frequency(seq, k)
@@ -24,10 +31,11 @@ local function frequency(seq, k)
     local fa, fb = freq[a], freq[b]
     return fa == fb and a > b or fa > fb
   end)
+  local res = {}
   for _,c in ipairs(sfreq) do
-    io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
+    res[c] = freq[c]*100/sum
Please add whitespace around the arithmetic operators here, e.g. `freq[c] * 100 / sum`.
   end
-  io.write("\n")
+  return res
 end
 
 local function readseq()
@@ -48,11 +56,77 @@ local function readseq()
   return string.upper(table.concat(lines, "", 1, ln))
 end
 
-local seq = readseq()
-frequency(seq, 1)
-frequency(seq, 2)
-count(seq, "GGT")
-count(seq, "GGTA")
-count(seq, "GGTATT")
-count(seq, "GGTATTTTAATT")
-count(seq, "GGTATTTTAATTTATAGT")
+local function check_freq(res, expected)
+  for k,v in pairs(expected) do
+    assert(string.format("%0.3f", res[k]) == v,
+           "Incorrect frequency for fragment " .. k)
+  end
+end
+
+-- The input is generated by `fasta.lua 5e6'. The check function
+-- is corresponding.
+local N = 5e6
+-- See <libs/fasta.lua> for the details.
+local items = N * 5
+bench:add({
+  name = "k_nucleotide",
+  payload = function()
+    local seq = readseq()
+    local sfreq1 = frequency(seq, 1)
+    local sfreq2 = frequency(seq, 2)
+    local GGT  = count(seq, "GGT")
+    local GGTA = count(seq, "GGTA")
+    local GGTATT = count(seq, "GGTATT")
+    local GGTATTTTAATT = count(seq, "GGTATTTTAATT")
+    local GGTATTTTAATTTATAGT = count(seq, "GGTATTTTAATTTATAGT")
+
+    local res = {
+      sfreq1 = sfreq1,
+      sfreq2 = sfreq2,
+      GGT  = GGT,
+      GGTA = GGTA,
+      GGTATT = GGTATT,
+      GGTATTTTAATT = GGTATTTTAATT,
+      GGTATTTTAATTTATAGT = GGTATTTTAATTTATAGT,
+    }
+    -- XXX: Reset input for the non-check iteration.
+    io.stdin:seek("set", 0)
+    return res
+  end,
+  checker = function(res)
+    check_freq(res.sfreq1, {
+      A = "30.296",
+      T = "30.149",
+      C = "19.800",
+      G = "19.754",
+    })
+    check_freq(res.sfreq2, {
+      AA = "9.177",
+      TA = "9.132",
+      AT = "9.130",
+      TT = "9.091",
+      CA = "6.002",
+      AC = "6.001",
+      AG = "5.987",
+      GA = "5.984",
+      CT = "5.971",
+      TC = "5.971",
+      GT = "5.957",
+      TG = "5.956",
+      CC = "3.917",
+      GC = "3.911",
+      CG = "3.909",
+      GG = "3.902",
+    })
+
+    assert(res.GGT == 294331)
+    assert(res.GGTA == 89290)
+    assert(res.GGTATT == 9462)
+    assert(res.GGTATTTTAATT == 178)
+    assert(res.GGTATTTTAATTTATAGT == 178)
+    return true
+  end,
+  items = items,
+})
+
+bench:run_and_report()
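
For readers without the full file at hand, here is a minimal, self-contained sketch of what the payload measures and how the checker compares frequencies. The helpers `count_kmers` and `percent` and the toy sequence are hypothetical. The patch's own `kfrequency` walks the sequence frame by frame and is not reproduced here.

```lua
-- Simplified illustration, not the code from the patch:
-- count every overlapping k-mer of `seq` in a hash table.
local function count_kmers(seq, k)
  local freq = {}
  local total = 0
  for i = 1, #seq - k + 1 do
    local frag = seq:sub(i, i + k - 1)
    freq[frag] = (freq[frag] or 0) + 1
    total = total + 1
  end
  return freq, total
end

-- Same comparison style as check_freq() in the patch: format the
-- percentage with three decimals so it can be compared as a string.
local function percent(freq, total, frag)
  return string.format("%0.3f", (freq[frag] or 0) * 100 / total)
end

local freq, total = count_kmers("GGTATTGGT", 3)
print(freq.GGT, total, percent(freq, total, "GGT"))
```

Comparing formatted strings instead of raw floating-point values keeps the check aligned with the output format the original script printed.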

I don't know why, but the microbenchmark cannot finish on my machine. And according to the usage and implementation, it does not expect any input or options.
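
One possible cause, stated as an assumption rather than a diagnosis: the commit message says the input comes from a redirected file, so a run without redirection would block in `readseq()` waiting on stdin. Below is a minimal sketch of a guard that fails fast instead. `is_seekable` and `require_file_input` are hypothetical helpers, not part of the `bench` module, and the check relies on `file:seek` returning nil for terminals and pipes, which is platform-dependent.

```lua
-- Hypothetical helpers, not part of the patch or the bench module.
-- file:seek() returns nil plus an error message when the stream is not
-- seekable (typically a pipe or a terminal).
local function is_seekable(file)
  return file:seek("cur", 0) ~= nil
end

-- Raise a usage error instead of blocking on a stream that cannot be
-- rewound between benchmark iterations.
local function require_file_input(file, script)
  if not is_seekable(file) then
    error(("usage: luajit %s < <FASTA file>"):format(script), 0)
  end
end

-- In the benchmark this could be called once before bench:add(), e.g.
-- require_file_input(io.stdin, arg[0]).
```

The payload already calls `io.stdin:seek("set", 0)` to rewind the input, so a seekable stdin is a precondition of the patch anyway. The guard only makes that precondition explicit.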


    
--------------rPyFIDR9MCan9sPcuCvgHlIm--