From: Sergey Bronnikov via Tarantool-patches
Reply-To: Sergey Bronnikov
To: Sergey Kaplun
Cc: tarantool-patches@dev.tarantool.org
Date: Thu, 15 Jan 2026 11:07:35 +0300
Subject: Re: [Tarantool-patches] [PATCH v2 luajit 10/41] perf: adjust fasta in LuaJIT-benches
Message-ID: <83bfe7da-e645-4abe-aa28-d3634f722b9b@tarantool.org>
In-Reply-To: <9e98c600fdd28246ef80da1988866e5948a6ba62.1766738771.git.skaplun@tarantool.org>

Hi, Sergey,

Thanks for the patch! LGTM

Sergey

On 12/26/25 12:17, Sergey Kaplun wrote:
This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Since the output produced by this benchmark (with a different input
parameter value) is used in other benchmarks (<k-nucleotide.lua> and
<revcomp.lua>), the original script is used as a library (inside the
<libs/> subdirectory) with the updated default input value and returns
the number of items processed. The output of the benchmark itself is
suppressed and not checked, since it is impractical to store such huge
files in the repository for testing.
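
A minimal sketch of the approach described above, assuming a LuaJIT or
Lua 5.1 style `require`: the library body runs again only after its
entry in `package.loaded` is cleared, and the default output is
redirected while it runs. The module name `demo_fasta`, the helper
`run_once`, and the temporary sink file are hypothetical; they stand in
for `libs/fasta.lua`, the benchmark payload, and `/dev/null`.

```lua
-- Hypothetical stand-in for libs/fasta.lua: writes some output and
-- returns the number of items processed.
local runs = 0
package.preload["demo_fasta"] = function()
  runs = runs + 1
  local n = 100
  io.write(string.rep("a", n), "\n")
  return n
end

-- A temporary file is used as the sink to keep the sketch portable;
-- the patch itself redirects to /dev/null.
local sink = os.tmpname()

local function run_once()
  local saved = io.output()  -- Remember the current default output.
  io.output(sink)            -- Redirect everything written via io.write.
  local ok, items = pcall(require, "demo_fasta")
  -- Drop the cached result so the next require() reruns the body.
  package.loaded["demo_fasta"] = nil
  io.output():close()        -- Close the sink file.
  io.output(saved)           -- Restore the original output.
  if not ok then error(items, 0) end
  return items
end

local first = run_once()
local second = run_once()
os.remove(sink)
print(first, second, runs)
```

Without the `package.loaded` reset, the second `require` would return
the cached value and the module body would not execute again.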
---
 perf/LuaJIT-benches/fasta.lua      | 126 ++++++++---------------------
 perf/LuaJIT-benches/libs/fasta.lua | 105 ++++++++++++++++++++++++
 2 files changed, 138 insertions(+), 93 deletions(-)
 create mode 100644 perf/LuaJIT-benches/libs/fasta.lua

diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
index 7ce60804..457623b2 100644
--- a/perf/LuaJIT-benches/fasta.lua
+++ b/perf/LuaJIT-benches/fasta.lua
@@ -1,95 +1,35 @@
-
-local Last = 42
-local function random(max)
-  local y = (Last * 3877 + 29573) % 139968
-  Last = y
-  return (max * y) / 139968
-end
-
-local function make_repeat_fasta(id, desc, s, n)
-  local write, sub = io.write, string.sub
-  write(">", id, " ", desc, "\n")
-  local p, sn, s2 = 1, #s, s..s
-  for i=60,n,60 do
-    write(sub(s2, p, p + 59), "\n")
-    p = p + 60; if p > sn then p = p - sn end
-  end
-  local tail = n % 60
-  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
-end
-
-local function make_random_fasta(id, desc, bs, n)
-  io.write(">", id, " ", desc, "\n")
-  loadstring([=[
-    local write, char, unpack, n, random = io.write, string.char, unpack, ...
-    local buf, p = {}, 1
-    for i=60,n,60 do
-      for j=p,p+59 do ]=]..bs..[=[ end
-      buf[p+60] = 10; p = p + 61
-      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
-    end
-    local tail = n % 60
-    if tail > 0 then
-      for j=p,p+tail-1 do ]=]..bs..[=[ end
-      p = p + tail; buf[p] = 10; p = p + 1
-    end
-    write(char(unpack(buf, 1, p-1)))
-  ]=], desc)(n, random)
-end
-
-local function bisect(c, p, lo, hi)
-  local n = hi - lo
-  if n == 0 then return "buf[j] = "..c[hi].."\n" end
-  local mid = math.floor(n / 2)
-  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
-         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
-end
-
-local function make_bisect(tab)
-  local c, p, sum = {}, {}, 0
-  for i,row in ipairs(tab) do
-    c[i] = string.byte(row[1])
-    sum = sum + row[2]
-    p[i] = sum
-  end
-  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
-end
-
-local alu =
-  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
-  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
-  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
-  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
-  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
-  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
-  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
-
-local iub = make_bisect{
-  { "a", 0.27 },
-  { "c", 0.12 },
-  { "g", 0.12 },
-  { "t", 0.27 },
-  { "B", 0.02 },
-  { "D", 0.02 },
-  { "H", 0.02 },
-  { "K", 0.02 },
-  { "M", 0.02 },
-  { "N", 0.02 },
-  { "R", 0.02 },
-  { "S", 0.02 },
-  { "V", 0.02 },
-  { "W", 0.02 },
-  { "Y", 0.02 },
-}
-
-local homosapiens = make_bisect{
-  { "a", 0.3029549426680 },
-  { "c", 0.1979883004921 },
-  { "g", 0.1975473066391 },
-  { "t", 0.3015094502008 },
+-- Benchmark to check the performance of working with strings and
+-- output to the file. It generates DNA sequences by copying or
+-- weighted random selection.
+-- For details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fasta.html
+
+local bench = require("bench").new(arg)
+
+local stdout = io.output()
+
+local benchmark
+benchmark = {
+  name = "fasta",
+  -- XXX: The result file may take up to 278 MB for the default
+  -- settings. To check the correctness of the script, run it as
+  -- is from the console.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  payload = function()
+    -- Run the benchmark as is from the file.
+    local items = require("fasta")
+    -- Remove it from the cache to be sure the benchmark will run
+    -- at the next iteration.
+    package.loaded["fasta"] = nil
+    benchmark.items = items
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
 }
 
-local N = tonumber(arg and arg[1]) or 1000
-make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
-make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
-make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
+bench:add(benchmark)
+bench:run_and_report()
diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
new file mode 100644
index 00000000..58f59dd5
--- /dev/null
+++ b/perf/LuaJIT-benches/libs/fasta.lua
@@ -0,0 +1,105 @@
+-- Benchmark to check the performance of working with strings and
+-- output to the file. It generates DNA sequences by copying or
+-- weighted random selection.
+-- For details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fasta.html
+-- Also, this file is used as a script to generate inputs for
+-- other benchmarks like <k-nucleotide.lua> and <revcomp.lua>.
+
+local Last = 42
+local function random(max)
+  local y = (Last * 3877 + 29573) % 139968
+  Last = y
+  return (max * y) / 139968
+end
+
+local function make_repeat_fasta(id, desc, s, n)
+  local write, sub = io.write, string.sub
+  write(">", id, " ", desc, "\n")
+  local p, sn, s2 = 1, #s, s..s
+  for i = 60, n, 60 do
+    write(sub(s2, p, p + 59), "\n")
+    p = p + 60; if p > sn then p = p - sn end
+  end
+  local tail = n % 60
+  if tail > 0 then write(sub(s2, p, p + tail - 1), "\n") end
+end
+
+local function make_random_fasta(id, desc, bs, n)
+  io.write(">", id, " ", desc, "\n")
+  loadstring([=[
+    local write, char, unpack, n, random = io.write, string.char, unpack, ...
+    local buf, p = {}, 1
+    for i = 60, n, 60 do
+      for j = p, p + 59 do ]=]..bs..[=[ end
+      buf[p + 60] = 10; p = p + 61
+      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
+    end
+    local tail = n % 60
+    if tail > 0 then
+      for j = p, p + tail - 1 do ]=]..bs..[=[ end
+      p = p + tail; buf[p] = 10; p = p + 1
+    end
+    write(char(unpack(buf, 1, p - 1)))
+  ]=], desc)(n, random)
+end
+
+local function bisect(c, p, lo, hi)
+  local n = hi - lo
+  if n == 0 then return "buf[j] = "..c[hi].."\n" end
+  local mid = math.floor(n / 2)
+  return "if r < "..p[lo + mid].." then\n"..bisect(c, p, lo, lo + mid)..
+         "else\n"..bisect(c, p, lo + mid + 1, hi).."end\n"
+end
+
+local function make_bisect(tab)
+  local c, p, sum = {}, {}, 0
+  for i, row in ipairs(tab) do
+    c[i] = string.byte(row[1])
+    sum = sum + row[2]
+    p[i] = sum
+  end
+  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
+end
+
+local alu =
+  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
+  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
+  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
+  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
+  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
+  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
+  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
+
+local iub = make_bisect{
+  { "a", 0.27 },
+  { "c", 0.12 },
+  { "g", 0.12 },
+  { "t", 0.27 },
+  { "B", 0.02 },
+  { "D", 0.02 },
+  { "H", 0.02 },
+  { "K", 0.02 },
+  { "M", 0.02 },
+  { "N", 0.02 },
+  { "R", 0.02 },
+  { "S", 0.02 },
+  { "V", 0.02 },
+  { "W", 0.02 },
+  { "Y", 0.02 },
+}
+
+local homosapiens = make_bisect{
+  { "a", 0.3029549426680 },
+  { "c", 0.1979883004921 },
+  { "g", 0.1975473066391 },
+  { "t", 0.3015094502008 },
+}
+
+local N = tonumber(arg and arg[1]) or 25e6
+
+make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
+make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
+make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
+
+return N*2 + N*3 + N*5
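
For reference, the generator step and the returned item count can be
checked by hand. The sketch below reimplements the linear congruential
generator from the patch in a hypothetical `make_random` factory (so
the state is not a module-level upvalue) and adds a hypothetical `pick`
helper that does the weighted selection with a plain linear scan
rather than the generated bisection code. It illustrates the idea and
is not the patch's exact method; the shared generator state in the
real script also means this is not a reproduction of its output.

```lua
-- Same constants as in the patch: multiplier 3877, increment 29573,
-- modulus 139968, seed 42.
local function make_random(seed)
  local last = seed
  return function(max)
    last = (last * 3877 + 29573) % 139968
    return (max * last) / 139968
  end
end

-- Hypothetical helper: weighted selection by a linear scan over the
-- cumulative probabilities (the patch generates a bisection instead).
local function pick(tab, r)
  local sum = 0
  for _, row in ipairs(tab) do
    sum = sum + row[2]
    if r < sum then return row[1] end
  end
  -- Fall back to the last symbol if rounding leaves r uncovered.
  return tab[#tab][1]
end

local random = make_random(42)
-- First step: (42 * 3877 + 29573) % 139968 = 52439.
local r = random(1)

local homosapiens = {
  { "a", 0.3029549426680 },
  { "c", 0.1979883004921 },
  { "g", 0.1975473066391 },
  { "t", 0.3015094502008 },
}
local first = pick(homosapiens, r)

-- The three sequences have N*2, N*3 and N*5 items, so 10*N in total.
local N = 25e6
local items = N * 2 + N * 3 + N * 5
print(first, items)
```

With the default `N` of 25e6 the library therefore reports 250e6 items
per run, which is the value the benchmark stores in `benchmark.items`.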