[Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing

Tarantool development patches archive

* [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing
@ 2025-10-24 10:50 Sergey Kaplun via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite Sergey Kaplun via Tarantool-patches
                   ` (40 more replies)
  0 siblings, 41 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patchset introduces a performance testing suite for LuaJIT. It
takes the benchmarks from the LuaJIT-test-cleanup repository [1] and
adapts them to use a custom benchmark module that reports results in a
Google Benchmark-like format. All results are collected and reported to
InfluxDB, as is done for Tarantool's tests.
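
To illustrate the general idea of wrapping a benchmark payload in a
harness that reports a name, a per-iteration time, and an iteration
count, here is a minimal sketch. It is not the API of
<perf/utils/bench.lua>: the names `run_bench` and `report`, the use of
`os.clock()`, and the output layout are all assumptions made for
illustration only.

```lua
-- Hypothetical sketch; NOT the interface of perf/utils/bench.lua.
-- Runs `payload` `iterations` times and measures CPU time with os.clock().
local function run_bench(name, payload, iterations)
  local start = os.clock()
  for _ = 1, iterations do
    payload()
  end
  local elapsed = os.clock() - start
  return {
    name = name,
    iterations = iterations,
    cpu_time = elapsed / iterations, -- CPU seconds per iteration.
  }
end

-- Formats one result as a single line: name, time, iteration count.
-- The column layout is an assumption, loosely modelled on tabular
-- benchmark reports.
local function report(result)
  return string.format("%-20s %12.9f s %10d",
    result.name, result.cpu_time, result.iterations)
end

-- Example payload: naive recursive Fibonacci.
local function fib(n)
  if n < 2 then return n end
  return fib(n - 1) + fib(n - 2)
end

local result = run_bench("recursive-fib", function() fib(20) end, 10)
print(report(result))
```

A real harness would likely also need a monotonic wall clock and
warm-up runs; those are left out to keep the sketch short.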

The results of the following benchmarks are not very stable. This should
be investigated later (I would appreciate any help with this):

* array3d
* binary-trees
* euler14-bit
* k-nucleotide
* nsieve (most unstable)
* nsieve-bit
* spectral-norm
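
One possible way to quantify how unstable a benchmark is: compute the
relative standard deviation (standard deviation divided by the mean) of
its timings over several runs. The sketch below assumes that approach;
the function name and the sample timings are made up for illustration
and are not measurements from this patchset.

```lua
-- Hypothetical helper: relative standard deviation of a list of timings.
-- Uses the sample standard deviation (n - 1 in the denominator).
local function relative_stddev(samples)
  local n = #samples
  assert(n > 1, "need at least two samples")
  local sum = 0
  for i = 1, n do
    sum = sum + samples[i]
  end
  local mean = sum / n
  local sq = 0
  for i = 1, n do
    local d = samples[i] - mean
    sq = sq + d * d
  end
  return math.sqrt(sq / (n - 1)) / mean
end

-- Made-up timings in seconds, for illustration only.
local stable = { 1.00, 1.01, 0.99, 1.00 }
local noisy = { 1.00, 1.40, 0.70, 1.20 }

print(string.format("stable: %.2f%%", relative_stddev(stable) * 100))
print(string.format("noisy:  %.2f%%", relative_stddev(noisy) * 100))
```

A benchmark whose value stays high across repeated runs on an otherwise
idle machine would be a candidate for the investigation mentioned above.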

Also, I've added notes to some commits where I'm not sure that the
implementation/solution is very good. Any insights are welcome :).

[1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/014708b/bench

Sergey Kaplun (41):
  perf: add LuaJIT-test-cleanup perf suite
  perf: introduce clock module
  perf: introduce bench module
  perf: adjust array3d in LuaJIT-benches
  perf: adjust binary-trees in LuaJIT-benches
  perf: adjust chameneos in LuaJIT-benches
  perf: adjust coroutine-ring in LuaJIT-benches
  perf: adjust euler14-bit in LuaJIT-benches
  perf: adjust fannkuch in LuaJIT-benches
  perf: adjust fasta in LuaJIT-benches
  perf: adjust k-nucleotide in LuaJIT-benches
  perf: adjust life in LuaJIT-benches
  perf: adjust mandelbrot-bit in LuaJIT-benches
  perf: adjust mandelbrot in LuaJIT-benches
  perf: adjust md5 in LuaJIT-benches
  perf: adjust meteor in LuaJIT-benches
  perf: adjust nbody in LuaJIT-benches
  perf: adjust nsieve-bit-fp in LuaJIT-benches
  perf: adjust nsieve-bit in LuaJIT-benches
  perf: adjust nsieve in LuaJIT-benches
  perf: adjust partialsums in LuaJIT-benches
  perf: adjust pidigits-nogmp in LuaJIT-benches
  perf: adjust ray in LuaJIT-benches
  perf: adjust recursive-ack in LuaJIT-benches
  perf: adjust recursive-fib in LuaJIT-benches
  perf: adjust revcomp in LuaJIT-benches
  perf: adjust scimark-2010-12-20 in LuaJIT-benches
  perf: move <scimark_lib.lua> to <libs/> directory
  perf: adjust scimark-fft in LuaJIT-benches
  perf: adjust scimark-lu in LuaJIT-benches
  perf: add scimark-mc in LuaJIT-benches
  perf: adjust scimark-sor in LuaJIT-benches
  perf: adjust scimark-sparse in LuaJIT-benches
  perf: adjust series in LuaJIT-benches
  perf: adjust spectral-norm in LuaJIT-benches
  perf: adjust sum-file in LuaJIT-benches
  perf: add CMake infrastructure
  perf: add aggregator helper for bench statistics
  perf: add a script for the environment setup
  perf: provide CMake option to setup the benchmark
  ci: introduce the performance workflow

 .github/actions/setup-performance/README.md  |   10 +
 .github/actions/setup-performance/action.yml |   18 +
 .github/workflows/performance.yml            |  110 ++
 .gitignore                                   |    5 +
 .luacheckrc                                  |    1 +
 CMakeLists.txt                               |   11 +
 perf/CMakeLists.txt                          |  119 +++
 perf/LuaJIT-benches/CMakeLists.txt           |   52 +
 perf/LuaJIT-benches/PARAM_arm.txt            |   29 +
 perf/LuaJIT-benches/PARAM_mips.txt           |   29 +
 perf/LuaJIT-benches/PARAM_ppc.txt            |   29 +
 perf/LuaJIT-benches/PARAM_x86.txt            |   29 +
 perf/LuaJIT-benches/SUMCOL_1.txt             | 1000 ++++++++++++++++++
 perf/LuaJIT-benches/TEST_md5sum.txt          |   20 +
 perf/LuaJIT-benches/array3d.lua              |   74 ++
 perf/LuaJIT-benches/binary-trees.lua         |  105 ++
 perf/LuaJIT-benches/chameneos.lua            |   82 ++
 perf/LuaJIT-benches/coroutine-ring.lua       |   53 +
 perf/LuaJIT-benches/euler14-bit.lua          |   42 +
 perf/LuaJIT-benches/fannkuch.lua             |   81 ++
 perf/LuaJIT-benches/fasta.lua                |   29 +
 perf/LuaJIT-benches/k-nucleotide.lua         |  129 +++
 perf/LuaJIT-benches/libs/fasta.lua           |   98 ++
 perf/LuaJIT-benches/libs/scimark_lib.lua     |  297 ++++++
 perf/LuaJIT-benches/life.lua                 |  188 ++++
 perf/LuaJIT-benches/mandelbrot-bit.lua       |   61 ++
 perf/LuaJIT-benches/mandelbrot.lua           |   49 +
 perf/LuaJIT-benches/md5.lua                  |  196 ++++
 perf/LuaJIT-benches/meteor.lua               |  246 +++++
 perf/LuaJIT-benches/nbody.lua                |  140 +++
 perf/LuaJIT-benches/nsieve-bit-fp.lua        |   62 ++
 perf/LuaJIT-benches/nsieve-bit.lua           |   52 +
 perf/LuaJIT-benches/nsieve.lua               |   46 +
 perf/LuaJIT-benches/partialsums.lua          |   44 +
 perf/LuaJIT-benches/pidigits-nogmp.lua       |  121 +++
 perf/LuaJIT-benches/ray.lua                  |  159 +++
 perf/LuaJIT-benches/recursive-ack.lua        |   23 +
 perf/LuaJIT-benches/recursive-fib.lua        |   31 +
 perf/LuaJIT-benches/revcomp.lua              |   59 ++
 perf/LuaJIT-benches/scimark-2010-12-20.lua   |  415 ++++++++
 perf/LuaJIT-benches/scimark-fft.lua          |   18 +
 perf/LuaJIT-benches/scimark-lu.lua           |   19 +
 perf/LuaJIT-benches/scimark-mc.lua           |   19 +
 perf/LuaJIT-benches/scimark-sor.lua          |   19 +
 perf/LuaJIT-benches/scimark-sparse.lua       |   19 +
 perf/LuaJIT-benches/series.lua               |   42 +
 perf/LuaJIT-benches/spectral-norm.lua        |   58 +
 perf/LuaJIT-benches/sum-file.lua             |   25 +
 perf/helpers/aggregate.lua                   |  124 +++
 perf/helpers/setup_env.sh                    |  135 +++
 perf/utils/bench.lua                         |  509 +++++++++
 perf/utils/clock.lua                         |   35 +
 52 files changed, 5366 insertions(+)
 create mode 100644 .github/actions/setup-performance/README.md
 create mode 100644 .github/actions/setup-performance/action.yml
 create mode 100644 .github/workflows/performance.yml
 create mode 100644 perf/CMakeLists.txt
 create mode 100644 perf/LuaJIT-benches/CMakeLists.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_arm.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_mips.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_ppc.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_x86.txt
 create mode 100644 perf/LuaJIT-benches/SUMCOL_1.txt
 create mode 100644 perf/LuaJIT-benches/TEST_md5sum.txt
 create mode 100644 perf/LuaJIT-benches/array3d.lua
 create mode 100644 perf/LuaJIT-benches/binary-trees.lua
 create mode 100644 perf/LuaJIT-benches/chameneos.lua
 create mode 100644 perf/LuaJIT-benches/coroutine-ring.lua
 create mode 100644 perf/LuaJIT-benches/euler14-bit.lua
 create mode 100644 perf/LuaJIT-benches/fannkuch.lua
 create mode 100644 perf/LuaJIT-benches/fasta.lua
 create mode 100644 perf/LuaJIT-benches/k-nucleotide.lua
 create mode 100644 perf/LuaJIT-benches/libs/fasta.lua
 create mode 100644 perf/LuaJIT-benches/libs/scimark_lib.lua
 create mode 100644 perf/LuaJIT-benches/life.lua
 create mode 100644 perf/LuaJIT-benches/mandelbrot-bit.lua
 create mode 100644 perf/LuaJIT-benches/mandelbrot.lua
 create mode 100644 perf/LuaJIT-benches/md5.lua
 create mode 100644 perf/LuaJIT-benches/meteor.lua
 create mode 100644 perf/LuaJIT-benches/nbody.lua
 create mode 100644 perf/LuaJIT-benches/nsieve-bit-fp.lua
 create mode 100644 perf/LuaJIT-benches/nsieve-bit.lua
 create mode 100644 perf/LuaJIT-benches/nsieve.lua
 create mode 100644 perf/LuaJIT-benches/partialsums.lua
 create mode 100644 perf/LuaJIT-benches/pidigits-nogmp.lua
 create mode 100644 perf/LuaJIT-benches/ray.lua
 create mode 100644 perf/LuaJIT-benches/recursive-ack.lua
 create mode 100644 perf/LuaJIT-benches/recursive-fib.lua
 create mode 100644 perf/LuaJIT-benches/revcomp.lua
 create mode 100644 perf/LuaJIT-benches/scimark-2010-12-20.lua
 create mode 100644 perf/LuaJIT-benches/scimark-fft.lua
 create mode 100644 perf/LuaJIT-benches/scimark-lu.lua
 create mode 100644 perf/LuaJIT-benches/scimark-mc.lua
 create mode 100644 perf/LuaJIT-benches/scimark-sor.lua
 create mode 100644 perf/LuaJIT-benches/scimark-sparse.lua
 create mode 100644 perf/LuaJIT-benches/series.lua
 create mode 100644 perf/LuaJIT-benches/spectral-norm.lua
 create mode 100644 perf/LuaJIT-benches/sum-file.lua
 create mode 100644 perf/helpers/aggregate.lua
 create mode 100755 perf/helpers/setup_env.sh
 create mode 100644 perf/utils/bench.lua
 create mode 100644 perf/utils/clock.lua

-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module Sergey Kaplun via Tarantool-patches
                   ` (39 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch introduces the LuaJIT-test-cleanup bench suite [1] into our
LuaJIT fork source tree. To provide reliable, reproducible results,
several benchmarks need to be adjusted. However, to be sure that we
start from the valid suite, everything in the <perf/LuaJIT-benches>
directory is copied intact.

[1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/014708b/bench
---
 .luacheckrc                                |    1 +
 perf/LuaJIT-benches/PARAM_arm.txt          |   29 +
 perf/LuaJIT-benches/PARAM_mips.txt         |   29 +
 perf/LuaJIT-benches/PARAM_ppc.txt          |   29 +
 perf/LuaJIT-benches/PARAM_x86.txt          |   29 +
 perf/LuaJIT-benches/SUMCOL_1.txt           | 1000 ++++++++++++++++++++
 perf/LuaJIT-benches/TEST_md5sum.txt        |   20 +
 perf/LuaJIT-benches/array3d.lua            |   59 ++
 perf/LuaJIT-benches/binary-trees.lua       |   47 +
 perf/LuaJIT-benches/chameneos.lua          |   68 ++
 perf/LuaJIT-benches/coroutine-ring.lua     |   42 +
 perf/LuaJIT-benches/euler14-bit.lua        |   22 +
 perf/LuaJIT-benches/fannkuch.lua           |   50 +
 perf/LuaJIT-benches/fasta.lua              |   95 ++
 perf/LuaJIT-benches/k-nucleotide.lua       |   58 ++
 perf/LuaJIT-benches/life.lua               |  111 +++
 perf/LuaJIT-benches/mandelbrot-bit.lua     |   33 +
 perf/LuaJIT-benches/mandelbrot.lua         |   23 +
 perf/LuaJIT-benches/md5.lua                |  183 ++++
 perf/LuaJIT-benches/meteor.lua             |  220 +++++
 perf/LuaJIT-benches/nbody.lua              |  119 +++
 perf/LuaJIT-benches/nsieve-bit-fp.lua      |   37 +
 perf/LuaJIT-benches/nsieve-bit.lua         |   27 +
 perf/LuaJIT-benches/nsieve.lua             |   21 +
 perf/LuaJIT-benches/partialsums.lua        |   29 +
 perf/LuaJIT-benches/pidigits-nogmp.lua     |  100 ++
 perf/LuaJIT-benches/ray.lua                |  135 +++
 perf/LuaJIT-benches/recursive-ack.lua      |    8 +
 perf/LuaJIT-benches/recursive-fib.lua      |    7 +
 perf/LuaJIT-benches/revcomp.lua            |   37 +
 perf/LuaJIT-benches/scimark-2010-12-20.lua |  400 ++++++++
 perf/LuaJIT-benches/scimark-fft.lua        |    1 +
 perf/LuaJIT-benches/scimark-lu.lua         |    1 +
 perf/LuaJIT-benches/scimark-sor.lua        |    1 +
 perf/LuaJIT-benches/scimark-sparse.lua     |    1 +
 perf/LuaJIT-benches/scimark_lib.lua        |  297 ++++++
 perf/LuaJIT-benches/series.lua             |   34 +
 perf/LuaJIT-benches/spectral-norm.lua      |   40 +
 perf/LuaJIT-benches/sum-file.lua           |    6 +
 39 files changed, 3449 insertions(+)
 create mode 100644 perf/LuaJIT-benches/PARAM_arm.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_mips.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_ppc.txt
 create mode 100644 perf/LuaJIT-benches/PARAM_x86.txt
 create mode 100644 perf/LuaJIT-benches/SUMCOL_1.txt
 create mode 100644 perf/LuaJIT-benches/TEST_md5sum.txt
 create mode 100644 perf/LuaJIT-benches/array3d.lua
 create mode 100644 perf/LuaJIT-benches/binary-trees.lua
 create mode 100644 perf/LuaJIT-benches/chameneos.lua
 create mode 100644 perf/LuaJIT-benches/coroutine-ring.lua
 create mode 100644 perf/LuaJIT-benches/euler14-bit.lua
 create mode 100644 perf/LuaJIT-benches/fannkuch.lua
 create mode 100644 perf/LuaJIT-benches/fasta.lua
 create mode 100644 perf/LuaJIT-benches/k-nucleotide.lua
 create mode 100644 perf/LuaJIT-benches/life.lua
 create mode 100644 perf/LuaJIT-benches/mandelbrot-bit.lua
 create mode 100644 perf/LuaJIT-benches/mandelbrot.lua
 create mode 100644 perf/LuaJIT-benches/md5.lua
 create mode 100644 perf/LuaJIT-benches/meteor.lua
 create mode 100644 perf/LuaJIT-benches/nbody.lua
 create mode 100644 perf/LuaJIT-benches/nsieve-bit-fp.lua
 create mode 100644 perf/LuaJIT-benches/nsieve-bit.lua
 create mode 100644 perf/LuaJIT-benches/nsieve.lua
 create mode 100644 perf/LuaJIT-benches/partialsums.lua
 create mode 100644 perf/LuaJIT-benches/pidigits-nogmp.lua
 create mode 100644 perf/LuaJIT-benches/ray.lua
 create mode 100644 perf/LuaJIT-benches/recursive-ack.lua
 create mode 100644 perf/LuaJIT-benches/recursive-fib.lua
 create mode 100644 perf/LuaJIT-benches/revcomp.lua
 create mode 100644 perf/LuaJIT-benches/scimark-2010-12-20.lua
 create mode 100644 perf/LuaJIT-benches/scimark-fft.lua
 create mode 100644 perf/LuaJIT-benches/scimark-lu.lua
 create mode 100644 perf/LuaJIT-benches/scimark-sor.lua
 create mode 100644 perf/LuaJIT-benches/scimark-sparse.lua
 create mode 100644 perf/LuaJIT-benches/scimark_lib.lua
 create mode 100644 perf/LuaJIT-benches/series.lua
 create mode 100644 perf/LuaJIT-benches/spectral-norm.lua
 create mode 100644 perf/LuaJIT-benches/sum-file.lua

diff --git a/.luacheckrc b/.luacheckrc
index 19098dd9..35824875 100644
--- a/.luacheckrc
+++ b/.luacheckrc
@@ -16,6 +16,7 @@ files['test/tarantool-tests/'] = {
 -- test suites and need to be coherent with the upstream.
 exclude_files = {
   'dynasm/',
+  'perf/LuaJIT-benches/',
   'src/',
   'test/LuaJIT-tests/',
   'test/PUC-Rio-Lua-5.1-tests/',
diff --git a/perf/LuaJIT-benches/PARAM_arm.txt b/perf/LuaJIT-benches/PARAM_arm.txt
new file mode 100644
index 00000000..a07fd010
--- /dev/null
+++ b/perf/LuaJIT-benches/PARAM_arm.txt
@@ -0,0 +1,29 @@
+array3d 200
+binary-trees 13
+chameneos 1e6
+coroutine-ring 3e6
+euler14-bit 5e6
+fannkuch 10
+fasta 2e6
+k-nucleotide 5e5 FASTA_500000
+life
+mandelbrot 2000
+mandelbrot-bit 2000
+md5 5000
+nbody 1e6
+nsieve 9
+nsieve-bit 9
+nsieve-bit-fp 9
+partialsums 2e6
+pidigits-nogmp 2000
+ray 4
+recursive-ack 9
+recursive-fib 37
+revcomp 1e6 FASTA_1000000
+scimark-fft 2000
+scimark-lu 300
+scimark-sor 5000
+scimark-sparse 5e3
+series 1500
+spectral-norm 1000
+sum-file 1000 SUMCOL_1000
diff --git a/perf/LuaJIT-benches/PARAM_mips.txt b/perf/LuaJIT-benches/PARAM_mips.txt
new file mode 100644
index 00000000..e6bcadba
--- /dev/null
+++ b/perf/LuaJIT-benches/PARAM_mips.txt
@@ -0,0 +1,29 @@
+array3d 50
+binary-trees 10
+chameneos 5e4
+coroutine-ring 2e5
+euler14-bit 2e4
+fannkuch 8
+fasta 2e4
+k-nucleotide 1e4 FASTA_10000
+life
+mandelbrot 150
+mandelbrot-bit 150
+md5 10
+nbody 1e4
+nsieve 4
+nsieve-bit 4
+nsieve-bit-fp 2
+partialsums 5e4
+pidigits-nogmp 150
+ray 2
+recursive-ack 7
+recursive-fib 29
+revcomp 5e4 FASTA_50000
+scimark-fft 20
+scimark-lu 3
+scimark-sor 40
+scimark-sparse 100
+series 50
+spectral-norm 100
+sum-file 100 SUMCOL_100
diff --git a/perf/LuaJIT-benches/PARAM_ppc.txt b/perf/LuaJIT-benches/PARAM_ppc.txt
new file mode 100644
index 00000000..c8319a15
--- /dev/null
+++ b/perf/LuaJIT-benches/PARAM_ppc.txt
@@ -0,0 +1,29 @@
+array3d 200
+binary-trees 13
+chameneos 1e6
+coroutine-ring 4e6
+euler14-bit 1e6
+fannkuch 9
+fasta 5e5
+k-nucleotide 1e5 FASTA_100000
+life
+mandelbrot 800
+mandelbrot-bit 800
+md5 500
+nbody 1e5
+nsieve 8
+nsieve-bit 8
+nsieve-bit-fp 8
+partialsums 5e5
+pidigits-nogmp 800
+ray 5
+recursive-ack 9
+recursive-fib 34
+revcomp 1e6 FASTA_1000000
+scimark-fft 500
+scimark-lu 100
+scimark-sor 1000
+scimark-sparse 3000
+series 1000
+spectral-norm 200
+sum-file 1000 SUMCOL_1000
diff --git a/perf/LuaJIT-benches/PARAM_x86.txt b/perf/LuaJIT-benches/PARAM_x86.txt
new file mode 100644
index 00000000..87088d7b
--- /dev/null
+++ b/perf/LuaJIT-benches/PARAM_x86.txt
@@ -0,0 +1,29 @@
+array3d 300
+binary-trees 16
+chameneos 1e7
+coroutine-ring 2e7
+euler14-bit 2e7
+fannkuch 11
+fasta 25e6
+k-nucleotide 5e6 FASTA_5000000
+life
+mandelbrot 5000
+mandelbrot-bit 5000
+md5 20000
+nbody 5e6
+nsieve 12
+nsieve-bit 12
+nsieve-bit-fp 12
+partialsums 1e7
+pidigits-nogmp 5000
+ray 9
+recursive-ack 10
+recursive-fib 40
+revcomp 5e6 FASTA_5000000
+scimark-fft 50000
+scimark-lu 5000
+scimark-sor 50000
+scimark-sparse 15e4
+series 10000
+spectral-norm 3000
+sum-file 5000 SUMCOL_5000
diff --git a/perf/LuaJIT-benches/SUMCOL_1.txt b/perf/LuaJIT-benches/SUMCOL_1.txt
new file mode 100644
index 00000000..956aba14
--- /dev/null
+++ b/perf/LuaJIT-benches/SUMCOL_1.txt
@@ -0,0 +1,1000 @@
+276
+498
+-981
+770
+-401
+702
+966
+950
+-853
+-53
+-293
+604
+288
+892
+-697
+204
+96
+408
+880
+-7
+-817
+422
+-261
+-485
+-77
+826
+184
+864
+-751
+626
+812
+-369
+-353
+-371
+488
+-83
+-659
+24
+524
+-21
+840
+-757
+-17
+-973
+-843
+260
+858
+-389
+-521
+-99
+482
+-561
+-213
+630
+766
+932
+112
+-419
+-877
+762
+266
+-837
+170
+834
+746
+764
+922
+-89
+576
+-63
+90
+684
+316
+506
+-959
+708
+70
+252
+-747
+342
+-593
+-895
+-937
+-707
+350
+588
+-201
+-683
+-113
+-511
+-867
+322
+202
+472
+150
+-9
+-643
+28
+336
+86
+-925
+836
+-473
+-451
+-971
+-805
+-619
+84
+-67
+806
+270
+366
+334
+-555
+-557
+-331
+-409
+-553
+-145
+-71
+528
+490
+492
+828
+628
+-961
+536
+-859
+-271
+974
+-671
+-749
+414
+-257
+778
+56
+598
+-437
+-899
+-785
+-987
+32
+-999
+132
+-821
+-209
+402
+-543
+194
+-967
+294
+-943
+-285
+-483
+-97
+660
+-481
+-829
+-309
+-597
+-855
+80
+-355
+192
+-823
+436
+916
+282
+-629
+612
+-329
+-535
+780
+-47
+706
+110
+756
+-857
+-933
+-345
+-523
+718
+-31
+902
+678
+540
+698
+456
+-399
+126
+412
+-563
+-321
+-487
+-641
+-195
+-199
+-955
+772
+570
+18
+-217
+886
+984
+-721
+-995
+46
+-989
+946
+64
+716
+-719
+-869
+-579
+776
+450
+936
+980
+-439
+-977
+-455
+-997
+6
+268
+-269
+-421
+328
+352
+578
+-575
+476
+976
+-57
+-469
+544
+582
+-43
+510
+-939
+-581
+-337
+-203
+-737
+-827
+852
+-279
+-803
+-911
+-865
+548
+48
+-75
+416
+-275
+688
+-255
+-687
+-461
+-233
+420
+912
+-901
+-299
+12
+568
+694
+-411
+-883
+-327
+-361
+-339
+646
+-137
+-905
+670
+686
+-131
+-849
+-825
+256
+228
+-841
+68
+368
+-909
+242
+298
+118
+10
+222
+954
+-493
+-459
+-445
+608
+-765
+34
+468
+-715
+690
+-185
+-551
+-571
+-241
+292
+92
+768
+-923
+956
+614
+8
+730
+208
+-417
+300
+136
+-59
+-251
+-539
+166
+798
+866
+454
+-391
+-317
+668
+502
+-15
+994
+854
+-189
+666
+446
+-565
+-5
+42
+-227
+-87
+-779
+26
+312
+354
+754
+396
+-515
+220
+872
+654
+88
+-667
+250
+572
+952
+72
+982
+972
+-529
+-471
+-533
+-427
+538
+154
+-457
+-819
+750
+152
+452
+-41
+838
+-489
+418
+-649
+-637
+-197
+74
+394
+-653
+-727
+-435
+-23
+348
+638
+-611
+914
+-357
+-743
+-685
+580
+-247
+-577
+54
+-931
+-3
+558
+-793
+-443
+-759
+162
+-811
+384
+720
+-117
+900
+-519
+-39
+744
+432
+286
+-873
+380
+-167
+-283
+430
+-155
+-755
+206
+100
+364
+-677
+332
+-567
+382
+-605
+-181
+676
+-475
+-845
+910
+546
+14
+398
+616
+-769
+424
+992
+-235
+-239
+774
+478
+-919
+168
+-771
+-773
+-69
+-509
+930
+550
+-463
+178
+-861
+-761
+-795
+234
+-831
+-61
+-979
+-851
+-665
+-709
+896
+742
+-123
+590
+-693
+-887
+-379
+144
+-717
+20
+174
+82
+464
+30
+-969
+-349
+-531
+-799
+-661
+-647
+-623
+878
+148
+-545
+238
+-259
+554
+726
+-37
+-797
+98
+78
+-591
+-975
+962
+120
+906
+-207
+656
+-171
+652
+188
+672
+-133
+-91
+224
+818
+-333
+-839
+-499
+22
+-739
+142
+378
+-403
+-315
+370
+284
+122
+230
+-527
+-127
+442
+534
+160
+722
+262
+-657
+304
+258
+-103
+960
+-495
+-265
+634
+-101
+480
+-363
+308
+76
+-949
+-585
+904
+146
+-703
+164
+850
+246
+732
+-725
+566
+274
+-163
+-935
+-681
+-229
+254
+-733
+-547
+-273
+-903
+736
+-711
+794
+392
+-655
+-549
+808
+-429
+484
+-701
+-617
+804
+36
+-775
+-335
+-927
+714
+-177
+-325
+-413
+-963
+114
+-253
+-789
+-645
+40
+434
+898
+924
+-19
+738
+788
+280
+-121
+594
+-913
+426
+816
+-373
+-45
+340
+-109
+-323
+58
+-249
+940
+-297
+988
+998
+-607
+-745
+-633
+-115
+996
+-893
+696
+400
+848
+500
+-263
+562
+-807
+-105
+-603
+658
+-73
+-863
+448
+680
+-157
+-161
+728
+814
+-477
+-375
+1000
+-631
+-991
+362
+156
+-187
+-705
+-917
+-449
+-741
+556
+440
+-589
+-11
+-359
+-891
+-801
+-153
+-381
+938
+-173
+-243
+618
+-599
+-497
+486
+128
+790
+460
+-27
+-305
+-205
+-215
+324
+-341
+50
+458
+52
+-621
+874
+386
+560
+-569
+-51
+802
+786
+920
+-425
+466
+444
+-507
+-915
+346
+622
+-679
+784
+-689
+388
+508
+-613
+-313
+-447
+564
+-897
+-211
+-225
+-615
+-367
+186
+894
+-65
+-453
+-245
+602
+496
+-651
+-601
+820
+226
+-695
+-119
+372
+180
+94
+214
+542
+648
+-871
+592
+584
+824
+796
+374
+-945
+-311
+516
+942
+-221
+-433
+200
+-465
+-953
+870
+868
+-879
+518
+356
+-223
+682
+990
+-191
+-541
+-951
+-921
+-319
+-169
+-291
+-289
+792
+876
+306
+-491
+326
+-885
+62
+514
+-929
+318
+-231
+632
+44
+-107
+644
+-267
+-343
+-847
+934
+734
+-505
+-351
+574
+-627
+636
+-93
+-431
+-835
+428
+-183
+-151
+2
+-813
+-595
+958
+-141
+692
+-385
+610
+-179
+376
+948
+198
+-675
+964
+-907
+918
+-165
+-1
+406
+748
+-111
+532
+-55
+-281
+740
+504
+236
+-29
+662
+-713
+-537
+196
+-587
+822
+-135
+700
+-35
+674
+-407
+240
+-673
+-669
+-393
+470
+-525
+-875
+-383
+-625
+296
+-85
+-147
+-277
+800
+-691
+-143
+16
+-983
+-303
+290
+-139
+172
+320
+512
+596
+640
+664
+-791
+-783
+-387
+-735
+-467
+-301
+810
+134
+216
+278
+176
+606
+140
+-787
+978
+586
+890
+882
+-753
+-13
+970
+-941
+-175
+-777
+-809
+-441
+-347
+-377
+390
+-423
+842
+642
+190
+302
+438
+704
+310
+-49
+124
+-781
+-287
+724
+-767
+830
+620
+-295
+244
+-159
+-307
+-397
+66
+-237
+314
+-79
+624
+710
+272
+-365
+928
+856
+138
+-479
+520
+832
+862
+760
+846
+-81
+106
+-513
+-193
+650
+782
+-517
+944
+218
+712
+-663
+-559
+462
+-635
+-25
+182
+530
+844
+330
+-833
+102
+-881
+108
+-947
+-763
+-405
+232
+410
+104
+-729
+-149
+-889
+888
+360
+968
+908
+116
+-815
+-129
+522
+-723
+-993
+860
+-503
+926
+-219
+-415
+60
+158
+-609
+-501
+986
+-699
+-583
+884
+212
+210
+-957
+526
+-985
+552
+344
+-395
+-95
+338
+248
+494
+130
+404
+358
+600
+-639
+-125
+-33
+-965
+752
+474
+-731
+758
+-573
+4
+38
+264
diff --git a/perf/LuaJIT-benches/TEST_md5sum.txt b/perf/LuaJIT-benches/TEST_md5sum.txt
new file mode 100644
index 00000000..15aa8a1c
--- /dev/null
+++ b/perf/LuaJIT-benches/TEST_md5sum.txt
@@ -0,0 +1,20 @@
+binarytrees	10	7202f4e13df7abc5ad8c07f05fe9d644
+chameneos	1e5	a629ce12f63050c6656bce175258cf8f
+cheapconcr	1000	d29799d1e263810a4db7bbf43ca66499
+cheapconcw	1000	d29799d1e263810a4db7bbf43ca66499
+fannkuch	8	51e5e372cbc5471ea8940b20ad782319
+fasta	1e5	78cd327de6f0a5667da0aa9349888279
+knucleotide	x	88efb24c1fed533959ed84bb32c88142 <FASTA_10000
+mandelbrot	200	cc65e64bd553ed18896de1dfe7fae3e5
+meteor	3000	9a65bb4b0a735ace1eaa4f2628f01026
+nbody	1e4	e0361c898ba747117ec177f7b3b3359c
+nsieve	4	767e02c93624995732e151932fa5f304
+nsievebits	4	767e02c93624995732e151932fa5f304
+partialsums	1e5	33efb41c72f8ecfb5b36c99e32189a3f
+pidigits	200	173a11a77bb1e72dd31254a760317428
+recursive	4	07a47c2d2cf50503b16efda789f84916
+regexdna	x	fdf3e6e9c599754e1eec3e524ea13fed <FASTA_10000
+revcomp	x	47de276e2f72519b57b82da39f4c7592 <FASTA_10000
+spectralnorm 200	25f44bd552ccd9faa0ee2ae5617947e2
+sumfile	x	2ebd3caa45b31a2e74e436b645eab4b0 <SUMCOL_100
+
diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
new file mode 100644
index 00000000..c10b09b1
--- /dev/null
+++ b/perf/LuaJIT-benches/array3d.lua
@@ -0,0 +1,59 @@
+
+local function array_set(self, x, y, z, p)
+  assert(x >= 0 and x < self.nx, "x outside PA")
+  assert(y >= 0 and y < self.ny, "y outside PA")
+  assert(z >= 0 and z < self.nz, "z outside PA")
+  local pos = (z*self.ny + y)*self.nx + x
+  local image = self.image
+  if self.packed then
+    local maxv = self.max_voltage
+    if p > maxv then self.max_voltage = p*2.0 end
+    local oldp = image[pos] or 0.0 -- Works with uninitialized table, too
+    if oldp > maxv then p = p + maxv*2.0 end
+    image[pos] = p
+  else
+    image[pos] = p
+  end
+  self.changed = true
+  self.changed_recently = true
+end
+
+local function array_points(self)
+  local y, z = 0, 0
+  return function(self, x)
+    x = x + 1
+    if x >= self.nx then
+      x = 0
+      y = y + 1
+      if y >= self.ny then
+	y = 0
+	z = z + 1
+	if z >= self.nz then
+	  return nil, nil, nil
+	end
+      end
+    end
+    return x, y, z
+  end, self, 0
+end
+
+local function array_new(nx, ny, nz, packed)
+  return {
+    nx = nx, ny = ny, nz = nz,
+    packed = packed, max_voltage = 0.0,
+    changed = false, changed_recently = false,
+    image = {}, -- Preferably use a fixed-type, pre-sized array here.
+    set = array_set,
+    points = array_points,
+  }
+end
+
+local dim = tonumber(arg and arg[1]) or 300 -- Array dimension dim^3
+local packed = arg and arg[2] == "packed"   -- Packed image or flat
+local arr = array_new(dim, dim, dim, packed)
+
+for x,y,z in arr:points() do
+  arr:set(x, y, z, x*x)
+end
+assert(arr.image[dim^3-1] == (dim-1)^2)
+
diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
new file mode 100644
index 00000000..bf040466
--- /dev/null
+++ b/perf/LuaJIT-benches/binary-trees.lua
@@ -0,0 +1,47 @@
+
+local function BottomUpTree(item, depth)
+  if depth > 0 then
+    local i = item + item
+    depth = depth - 1
+    local left, right = BottomUpTree(i-1, depth), BottomUpTree(i, depth)
+    return { item, left, right }
+  else
+    return { item }
+  end
+end
+
+local function ItemCheck(tree)
+  if tree[2] then
+    return tree[1] + ItemCheck(tree[2]) - ItemCheck(tree[3])
+  else
+    return tree[1]
+  end
+end
+
+local N = tonumber(arg and arg[1]) or 0
+local mindepth = 4
+local maxdepth = mindepth + 2
+if maxdepth < N then maxdepth = N end
+
+do
+  local stretchdepth = maxdepth + 1
+  local stretchtree = BottomUpTree(0, stretchdepth)
+  io.write(string.format("stretch tree of depth %d\t check: %d\n",
+    stretchdepth, ItemCheck(stretchtree)))
+end
+
+local longlivedtree = BottomUpTree(0, maxdepth)
+
+for depth=mindepth,maxdepth,2 do
+  local iterations = 2 ^ (maxdepth - depth + mindepth)
+  local check = 0
+  for i=1,iterations do
+    check = check + ItemCheck(BottomUpTree(1, depth)) +
+            ItemCheck(BottomUpTree(-1, depth))
+  end
+  io.write(string.format("%d\t trees of depth %d\t check: %d\n",
+    iterations*2, depth, check))
+end
+
+io.write(string.format("long lived tree of depth %d\t check: %d\n",
+  maxdepth, ItemCheck(longlivedtree)))
diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
new file mode 100644
index 00000000..78b64c3f
--- /dev/null
+++ b/perf/LuaJIT-benches/chameneos.lua
@@ -0,0 +1,68 @@
+
+local co = coroutine
+local create, resume, yield = co.create, co.resume, co.yield
+
+local N = tonumber(arg and arg[1]) or 10
+local first, second
+
+-- Meet another creature.
+local function meet(me)
+  while second do yield() end -- Wait until meeting place clears.
+  local other = first
+  if other then -- Hey, I found a new friend!
+    first = nil
+    second = me
+  else -- Sniff, nobody here (yet).
+    local n = N - 1
+    if n < 0 then return end -- Uh oh, the mall is closed.
+    N = n
+    first = me
+    repeat yield(); other = second until other -- Wait for another creature.
+    second = nil
+    yield() -- Be nice and let others meet up.
+  end
+  return other
+end
+
+-- Create a very social creature.
+local function creature(color)
+  return create(function()
+    local me = color
+    for met=0,1000000000 do
+      local other = meet(me)
+      if not other then return met end
+      if me ~= other then
+        if me == "blue" then me = other == "red" and "yellow" or "red"
+        elseif me == "red" then me = other == "blue" and "yellow" or "blue"
+        else me = other == "blue" and "red" or "blue" end
+      end
+    end
+  end)
+end
+
+-- Trivial round-robin scheduler.
+local function schedule(threads)
+  local resume = resume
+  local nthreads, meetings = #threads, 0
+  repeat
+    for i=1,nthreads do
+      local thr = threads[i]
+      if not thr then return meetings end
+      local ok, met = resume(thr)
+      if met then
+        meetings = meetings + met
+        threads[i] = nil
+      end
+    end
+  until false
+end
+
+-- A bunch of colorful creatures.
+local threads = {
+  creature("blue"),
+  creature("red"),
+  creature("yellow"),
+  creature("blue"),
+}
+
+io.write(schedule(threads), "\n")
diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
new file mode 100644
index 00000000..1e8c5ef6
--- /dev/null
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -0,0 +1,42 @@
+-- The Computer Language Benchmarks Game
+-- http://shootout.alioth.debian.org/
+-- contributed by Sam Roberts
+-- reviewed by Bruno Massa
+
+local n         = tonumber(arg and arg[1]) or 2e7
+
+-- fixed size pool
+local poolsize  = 503
+local threads   = {}
+
+-- cache these to avoid global environment lookups
+local create    = coroutine.create
+local resume    = coroutine.resume
+local yield     = coroutine.yield
+
+local id        = 1
+local token     = 0
+local ok
+
+local body = function(token)
+  while true do
+    token = yield(token + 1)
+  end
+end
+
+-- create all threads
+for id = 1, poolsize do
+  threads[id] = create(body)
+end
+
+-- send the token
+repeat
+  if id == poolsize then
+    id = 1
+  else
+    id = id + 1
+  end
+  ok, token = resume(threads[id], token)
+until token == n
+
+io.write(id, "\n")
diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
new file mode 100644
index 00000000..537f2bf3
--- /dev/null
+++ b/perf/LuaJIT-benches/euler14-bit.lua
@@ -0,0 +1,22 @@
+
+local bit = require("bit")
+local bnot, bor, band = bit.bnot, bit.bor, bit.band
+local shl, shr = bit.lshift, bit.rshift
+
+local N = tonumber(arg and arg[1]) or 10000000
+local cache, m, n = { 1 }, 1, 1
+if arg and arg[2] then cache = nil end
+for i=2,N do
+  local j = i
+  for len=1,1000000000 do
+    j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
+    if cache then
+      local x = cache[j]; if x then j = x+len; break end
+    elseif j == 1 then
+      j = len+1; break
+    end
+  end
+  if cache then cache[i] = j end
+  if j > m then m, n = j, i end
+end
+io.write("Found ", n, " (chain length: ", m, ")\n")
diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
new file mode 100644
index 00000000..2a4cd426
--- /dev/null
+++ b/perf/LuaJIT-benches/fannkuch.lua
@@ -0,0 +1,50 @@
+
+local function fannkuch(n)
+  local p, q, s, odd, check, maxflips = {}, {}, {}, true, 0, 0
+  for i=1,n do p[i] = i; q[i] = i; s[i] = i end
+  repeat
+    -- Print max. 30 permutations.
+    if check < 30 then
+      if not p[n] then return maxflips end	-- Catch n = 0, 1, 2.
+      io.write(unpack(p)); io.write("\n")
+      check = check + 1
+    end
+    -- Copy and flip.
+    local q1 = p[1]				-- Cache 1st element.
+    if p[n] ~= n and q1 ~= 1 then		-- Avoid useless work.
+      for i=2,n do q[i] = p[i] end		-- Work on a copy.
+      local flips = 1			-- Flip ...
+      while true do
+	local qq = q[q1]
+	if qq == 1 then				-- ... until 1st element is 1.
+	  if flips > maxflips then maxflips = flips end -- New maximum?
+	  break
+	end
+	q[q1] = q1
+	if q1 >= 4 then
+	  local i, j = 2, q1 - 1
+	  repeat q[i], q[j] = q[j], q[i]; i = i + 1; j = j - 1; until i >= j
+	end
+	q1 = qq
+	flips=flips+1
+      end
+    end
+    -- Permute.
+    if odd then
+      p[2], p[1] = p[1], p[2]; odd = false	-- Rotate 1<-2.
+    else
+      p[2], p[3] = p[3], p[2]; odd = true	-- Rotate 1<-2 and 1<-2<-3.
+      for i=3,n do
+	local sx = s[i]
+	if sx ~= 1 then s[i] = sx-1; break end
+	if i == n then return maxflips end	-- Out of permutations.
+	s[i] = i
+	-- Rotate 1<-...<-i+1.
+	local t=p[1]; for j=i+1,1,-1 do p[j],t=t,p[j] end
+      end
+    end
+  until false
+end
+
+local n = tonumber(arg and arg[1]) or 1
+io.write("Pfannkuchen(", n, ") = ", fannkuch(n), "\n")
diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
new file mode 100644
index 00000000..7ce60804
--- /dev/null
+++ b/perf/LuaJIT-benches/fasta.lua
@@ -0,0 +1,95 @@
+
+local Last = 42
+local function random(max)
+  local y = (Last * 3877 + 29573) % 139968
+  Last = y
+  return (max * y) / 139968
+end
+
+local function make_repeat_fasta(id, desc, s, n)
+  local write, sub = io.write, string.sub
+  write(">", id, " ", desc, "\n")
+  local p, sn, s2 = 1, #s, s..s
+  for i=60,n,60 do
+    write(sub(s2, p, p + 59), "\n")
+    p = p + 60; if p > sn then p = p - sn end
+  end
+  local tail = n % 60
+  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
+end
+
+local function make_random_fasta(id, desc, bs, n)
+  io.write(">", id, " ", desc, "\n")
+  loadstring([=[
+    local write, char, unpack, n, random = io.write, string.char, unpack, ...
+    local buf, p = {}, 1
+    for i=60,n,60 do
+      for j=p,p+59 do ]=]..bs..[=[ end
+      buf[p+60] = 10; p = p + 61
+      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
+    end
+    local tail = n % 60
+    if tail > 0 then
+      for j=p,p+tail-1 do ]=]..bs..[=[ end
+      p = p + tail; buf[p] = 10; p = p + 1
+    end
+    write(char(unpack(buf, 1, p-1)))
+  ]=], desc)(n, random)
+end
+
+local function bisect(c, p, lo, hi)
+  local n = hi - lo
+  if n == 0 then return "buf[j] = "..c[hi].."\n" end
+  local mid = math.floor(n / 2)
+  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
+         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
+end
+
+local function make_bisect(tab)
+  local c, p, sum = {}, {}, 0
+  for i,row in ipairs(tab) do
+    c[i] = string.byte(row[1])
+    sum = sum + row[2]
+    p[i] = sum
+  end
+  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
+end
+
+local alu =
+  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
+  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
+  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
+  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
+  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
+  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
+  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
+
+local iub = make_bisect{
+  { "a", 0.27 },
+  { "c", 0.12 },
+  { "g", 0.12 },
+  { "t", 0.27 },
+  { "B", 0.02 },
+  { "D", 0.02 },
+  { "H", 0.02 },
+  { "K", 0.02 },
+  { "M", 0.02 },
+  { "N", 0.02 },
+  { "R", 0.02 },
+  { "S", 0.02 },
+  { "V", 0.02 },
+  { "W", 0.02 },
+  { "Y", 0.02 },
+}
+
+local homosapiens = make_bisect{
+  { "a", 0.3029549426680 },
+  { "c", 0.1979883004921 },
+  { "g", 0.1975473066391 },
+  { "t", 0.3015094502008 },
+}
+
+local N = tonumber(arg and arg[1]) or 1000
+make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
+make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
+make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
new file mode 100644
index 00000000..0bfb41be
--- /dev/null
+++ b/perf/LuaJIT-benches/k-nucleotide.lua
@@ -0,0 +1,58 @@
+
+local function kfrequency(seq, freq, k, frame)
+  local sub = string.sub
+  local k1 = k - 1
+  for i=frame,#seq-k1,k do
+    local c = sub(seq, i, i+k1)
+    freq[c] = (freq[c] or 0) + 1
+  end
+end
+
+local function count(seq, frag)
+  local k = #frag
+  local freq = {}
+  for frame=1,k do kfrequency(seq, freq, k, frame) end
+  io.write(freq[frag] or 0, "\t", frag, "\n")
+end
+
+local function frequency(seq, k)
+  local freq = {}
+  for frame=1,k do kfrequency(seq, freq, k, frame) end
+  local sfreq, sn, sum = {}, 1, 0
+  for c,v in pairs(freq) do sfreq[sn] = c; sn = sn + 1; sum = sum + v end
+  table.sort(sfreq, function(a, b)
+    local fa, fb = freq[a], freq[b]
+    return fa == fb and a > b or fa > fb
+  end)
+  for _,c in ipairs(sfreq) do
+    io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
+  end
+  io.write("\n")
+end
+
+local function readseq()
+  local sub = string.sub
+  for line in io.lines() do
+    if sub(line, 1, 1) == ">" and sub(line, 2, 6) == "THREE" then break end
+  end
+  local lines, ln = {}, 0
+  for line in io.lines() do
+    local c = sub(line, 1, 1)
+    if c == ">" then
+      break
+    elseif c ~= ";" then
+      ln = ln + 1
+      lines[ln] = line
+    end
+  end
+  return string.upper(table.concat(lines, "", 1, ln))
+end
+
+local seq = readseq()
+frequency(seq, 1)
+frequency(seq, 2)
+count(seq, "GGT")
+count(seq, "GGTA")
+count(seq, "GGTATT")
+count(seq, "GGTATTTTAATT")
+count(seq, "GGTATTTTAATTTATAGT")
diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
new file mode 100644
index 00000000..911d9fe1
--- /dev/null
+++ b/perf/LuaJIT-benches/life.lua
@@ -0,0 +1,111 @@
+-- life.lua
+-- original by Dave Bollinger <DBollinger@compuserve.com> posted to lua-l
+-- modified to use ANSI terminal escape sequences
+-- modified to use for instead of while
+
+local write=io.write
+
+ALIVE="¥"	DEAD="þ"
+ALIVE="O"	DEAD="-"
+
+function delay() -- NOTE: SYSTEM-DEPENDENT, adjust as necessary
+  for i=1,10000 do end
+  -- local i=os.clock()+1 while(os.clock()<i) do end
+end
+
+function ARRAY2D(w,h)
+  local t = {w=w,h=h}
+  for y=1,h do
+    t[y] = {}
+    for x=1,w do
+      t[y][x]=0
+    end
+  end
+  return t
+end
+
+_CELLS = {}
+
+-- give birth to a "shape" within the cell array
+function _CELLS:spawn(shape,left,top)
+  for y=0,shape.h-1 do
+    for x=0,shape.w-1 do
+      self[top+y][left+x] = shape[y*shape.w+x+1]
+    end
+  end
+end
+
+-- run the CA and produce the next generation
+function _CELLS:evolve(next)
+  local ym1,y,yp1,yi=self.h-1,self.h,1,self.h
+  while yi > 0 do
+    local xm1,x,xp1,xi=self.w-1,self.w,1,self.w
+    while xi > 0 do
+      local sum = self[ym1][xm1] + self[ym1][x] + self[ym1][xp1] +
+                  self[y][xm1] + self[y][xp1] +
+                  self[yp1][xm1] + self[yp1][x] + self[yp1][xp1]
+      next[y][x] = ((sum==2) and self[y][x]) or ((sum==3) and 1) or 0
+      xm1,x,xp1,xi = x,xp1,xp1+1,xi-1
+    end
+    ym1,y,yp1,yi = y,yp1,yp1+1,yi-1
+  end
+end
+
+-- output the array to screen
+function _CELLS:draw()
+  local out="" -- accumulate to reduce flicker
+  for y=1,self.h do
+    for x=1,self.w do
+      out=out..(((self[y][x]>0) and ALIVE) or DEAD)
+    end
+    out=out.."\n"
+  end
+  write(out)
+end
+
+-- constructor
+function CELLS(w,h)
+  local c = ARRAY2D(w,h)
+  c.spawn = _CELLS.spawn
+  c.evolve = _CELLS.evolve
+  c.draw = _CELLS.draw
+  return c
+end
+
+--
+-- shapes suitable for use with spawn() above
+--
+HEART = { 1,0,1,1,0,1,1,1,1; w=3,h=3 }
+GLIDER = { 0,0,1,1,0,1,0,1,1; w=3,h=3 }
+EXPLODE = { 0,1,0,1,1,1,1,0,1,0,1,0; w=3,h=4 }
+FISH = { 0,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0; w=5,h=4 }
+BUTTERFLY = { 1,0,0,0,1,0,1,1,1,0,1,0,0,0,1,1,0,1,0,1,1,0,0,0,1; w=5,h=5 }
+
+-- the main routine
+function LIFE(w,h)
+  -- create two arrays
+  local thisgen = CELLS(w,h)
+  local nextgen = CELLS(w,h)
+
+  -- create some life
+  -- about 1000 generations of fun, then a glider steady-state
+  thisgen:spawn(GLIDER,5,4)
+  thisgen:spawn(EXPLODE,25,10)
+  thisgen:spawn(FISH,4,12)
+
+  -- run until break
+  local gen=1
+  write("\027[2J")	-- ANSI clear screen
+  while 1 do
+    thisgen:evolve(nextgen)
+    thisgen,nextgen = nextgen,thisgen
+    write("\027[H")	-- ANSI home cursor
+    thisgen:draw()
+    write("Life - generation ",gen,"\n")
+    gen=gen+1
+    if gen>2000 then break end
+    --delay()		-- no delay
+  end
+end
+
+LIFE(40,20)
diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
new file mode 100644
index 00000000..91d96975
--- /dev/null
+++ b/perf/LuaJIT-benches/mandelbrot-bit.lua
@@ -0,0 +1,33 @@
+
+local bit = require("bit")
+local bor, band = bit.bor, bit.band
+local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
+local write, char, unpack = io.write, string.char, unpack
+local N = tonumber(arg and arg[1]) or 100
+local M, buf = 2/N, {}
+write("P4\n", N, " ", N, "\n")
+for y=0,N-1 do
+  local Ci, b, p = y*M-1, -16777216, 0
+  local Ciq = Ci*Ci
+  for x=0,N-1,2 do
+    local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
+    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
+    local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
+    b = rol(b, 2)
+    for i=1,49 do
+      Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
+      Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
+      Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
+      Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
+      if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
+      if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
+      if band(b, 3) == 0 then break end
+    end
+    if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
+  end
+  if b ~= -16777216 then
+    if band(N, 1) ~= 0 then b = shr(b, 1) end
+    p = p + 1; buf[p] = shl(b, 8-band(N, 7))
+  end
+  write(char(unpack(buf, 1, p)))
+end
diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
new file mode 100644
index 00000000..0ef595a2
--- /dev/null
+++ b/perf/LuaJIT-benches/mandelbrot.lua
@@ -0,0 +1,23 @@
+
+local write, char, unpack = io.write, string.char, unpack
+local N = tonumber(arg and arg[1]) or 100
+local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
+write("P4\n", N, " ", N, "\n")
+for y=0,N-1 do
+  local Ci, b, p = y*M-1, 1, 0
+  for x=0,N-1 do
+    local Cr = x*M-1.5
+    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
+    b = b + b
+    for i=1,49 do
+      Zi = Zr*Zi*2 + Ci
+      Zr = Zrq-Ziq + Cr
+      Ziq = Zi*Zi
+      Zrq = Zr*Zr
+      if Zrq+Ziq > 4.0 then b = b + 1; break; end
+    end
+    if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
+  end
+  if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
+  write(char(unpack(buf, 1, p)))
+end
diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
new file mode 100644
index 00000000..fdf6b4a7
--- /dev/null
+++ b/perf/LuaJIT-benches/md5.lua
@@ -0,0 +1,183 @@
+
+local bit = require("bit")
+local tobit, tohex, bnot = bit.tobit or bit.cast, bit.tohex, bit.bnot
+local bor, band, bxor = bit.bor, bit.band, bit.bxor
+local lshift, rshift, rol, bswap = bit.lshift, bit.rshift, bit.rol, bit.bswap
+local byte, char, sub, rep = string.byte, string.char, string.sub, string.rep
+
+if not rol then -- Replacement function if rotates are missing.
+  local bor, shl, shr = bit.bor, bit.lshift, bit.rshift
+  function rol(a, b) return bor(shl(a, b), shr(a, 32-b)) end
+end
+
+if not bswap then -- Replacement function if bswap is missing.
+  local bor, band, shl, shr = bit.bor, bit.band, bit.lshift, bit.rshift
+  function bswap(a)
+    return bor(shr(a, 24), band(shr(a, 8), 0xff00),
+	       shl(band(a, 0xff00), 8), shl(a, 24));
+  end
+end
+
+if not tohex then -- (Unreliable) replacement function if tohex is missing.
+  function tohex(a)
+    return string.sub(string.format("%08x", a), -8)
+  end
+end
+
+local function tr_f(a, b, c, d, x, s)
+  return rol(bxor(d, band(b, bxor(c, d))) + a + x, s) + b
+end
+
+local function tr_g(a, b, c, d, x, s)
+  return rol(bxor(c, band(d, bxor(b, c))) + a + x, s) + b
+end
+
+local function tr_h(a, b, c, d, x, s)
+  return rol(bxor(b, c, d) + a + x, s) + b
+end
+
+local function tr_i(a, b, c, d, x, s)
+  return rol(bxor(c, bor(b, bnot(d))) + a + x, s) + b
+end
+
+local function transform(x, a1, b1, c1, d1)
+  local a, b, c, d = a1, b1, c1, d1
+
+  a = tr_f(a, b, c, d, x[ 1] + 0xd76aa478,  7)
+  d = tr_f(d, a, b, c, x[ 2] + 0xe8c7b756, 12)
+  c = tr_f(c, d, a, b, x[ 3] + 0x242070db, 17)
+  b = tr_f(b, c, d, a, x[ 4] + 0xc1bdceee, 22)
+  a = tr_f(a, b, c, d, x[ 5] + 0xf57c0faf,  7)
+  d = tr_f(d, a, b, c, x[ 6] + 0x4787c62a, 12)
+  c = tr_f(c, d, a, b, x[ 7] + 0xa8304613, 17)
+  b = tr_f(b, c, d, a, x[ 8] + 0xfd469501, 22)
+  a = tr_f(a, b, c, d, x[ 9] + 0x698098d8,  7)
+  d = tr_f(d, a, b, c, x[10] + 0x8b44f7af, 12)
+  c = tr_f(c, d, a, b, x[11] + 0xffff5bb1, 17)
+  b = tr_f(b, c, d, a, x[12] + 0x895cd7be, 22)
+  a = tr_f(a, b, c, d, x[13] + 0x6b901122,  7)
+  d = tr_f(d, a, b, c, x[14] + 0xfd987193, 12)
+  c = tr_f(c, d, a, b, x[15] + 0xa679438e, 17)
+  b = tr_f(b, c, d, a, x[16] + 0x49b40821, 22)
+
+  a = tr_g(a, b, c, d, x[ 2] + 0xf61e2562,  5)
+  d = tr_g(d, a, b, c, x[ 7] + 0xc040b340,  9)
+  c = tr_g(c, d, a, b, x[12] + 0x265e5a51, 14)
+  b = tr_g(b, c, d, a, x[ 1] + 0xe9b6c7aa, 20)
+  a = tr_g(a, b, c, d, x[ 6] + 0xd62f105d,  5)
+  d = tr_g(d, a, b, c, x[11] + 0x02441453,  9)
+  c = tr_g(c, d, a, b, x[16] + 0xd8a1e681, 14)
+  b = tr_g(b, c, d, a, x[ 5] + 0xe7d3fbc8, 20)
+  a = tr_g(a, b, c, d, x[10] + 0x21e1cde6,  5)
+  d = tr_g(d, a, b, c, x[15] + 0xc33707d6,  9)
+  c = tr_g(c, d, a, b, x[ 4] + 0xf4d50d87, 14)
+  b = tr_g(b, c, d, a, x[ 9] + 0x455a14ed, 20)
+  a = tr_g(a, b, c, d, x[14] + 0xa9e3e905,  5)
+  d = tr_g(d, a, b, c, x[ 3] + 0xfcefa3f8,  9)
+  c = tr_g(c, d, a, b, x[ 8] + 0x676f02d9, 14)
+  b = tr_g(b, c, d, a, x[13] + 0x8d2a4c8a, 20)
+
+  a = tr_h(a, b, c, d, x[ 6] + 0xfffa3942,  4)
+  d = tr_h(d, a, b, c, x[ 9] + 0x8771f681, 11)
+  c = tr_h(c, d, a, b, x[12] + 0x6d9d6122, 16)
+  b = tr_h(b, c, d, a, x[15] + 0xfde5380c, 23)
+  a = tr_h(a, b, c, d, x[ 2] + 0xa4beea44,  4)
+  d = tr_h(d, a, b, c, x[ 5] + 0x4bdecfa9, 11)
+  c = tr_h(c, d, a, b, x[ 8] + 0xf6bb4b60, 16)
+  b = tr_h(b, c, d, a, x[11] + 0xbebfbc70, 23)
+  a = tr_h(a, b, c, d, x[14] + 0x289b7ec6,  4)
+  d = tr_h(d, a, b, c, x[ 1] + 0xeaa127fa, 11)
+  c = tr_h(c, d, a, b, x[ 4] + 0xd4ef3085, 16)
+  b = tr_h(b, c, d, a, x[ 7] + 0x04881d05, 23)
+  a = tr_h(a, b, c, d, x[10] + 0xd9d4d039,  4)
+  d = tr_h(d, a, b, c, x[13] + 0xe6db99e5, 11)
+  c = tr_h(c, d, a, b, x[16] + 0x1fa27cf8, 16)
+  b = tr_h(b, c, d, a, x[ 3] + 0xc4ac5665, 23)
+
+  a = tr_i(a, b, c, d, x[ 1] + 0xf4292244,  6)
+  d = tr_i(d, a, b, c, x[ 8] + 0x432aff97, 10)
+  c = tr_i(c, d, a, b, x[15] + 0xab9423a7, 15)
+  b = tr_i(b, c, d, a, x[ 6] + 0xfc93a039, 21)
+  a = tr_i(a, b, c, d, x[13] + 0x655b59c3,  6)
+  d = tr_i(d, a, b, c, x[ 4] + 0x8f0ccc92, 10)
+  c = tr_i(c, d, a, b, x[11] + 0xffeff47d, 15)
+  b = tr_i(b, c, d, a, x[ 2] + 0x85845dd1, 21)
+  a = tr_i(a, b, c, d, x[ 9] + 0x6fa87e4f,  6)
+  d = tr_i(d, a, b, c, x[16] + 0xfe2ce6e0, 10)
+  c = tr_i(c, d, a, b, x[ 7] + 0xa3014314, 15)
+  b = tr_i(b, c, d, a, x[14] + 0x4e0811a1, 21)
+  a = tr_i(a, b, c, d, x[ 5] + 0xf7537e82,  6)
+  d = tr_i(d, a, b, c, x[12] + 0xbd3af235, 10)
+  c = tr_i(c, d, a, b, x[ 3] + 0x2ad7d2bb, 15)
+  b = tr_i(b, c, d, a, x[10] + 0xeb86d391, 21)
+
+  return tobit(a+a1), tobit(b+b1), tobit(c+c1), tobit(d+d1)
+end
+
+-- Note: this is copying the original string and NOT particularly fast.
+-- A library for struct unpacking would make this task much easier.
+local function md5(msg)
+  local len = #msg
+  msg = msg.."\128"..rep("\0", 63 - band(len + 8, 63))
+	   ..char(band(lshift(len, 3), 255), band(rshift(len, 5), 255),
+		  band(rshift(len, 13), 255), band(rshift(len, 21), 255))
+	   .."\0\0\0\0"
+  local a, b, c, d = 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
+  local x, k = {}, 1
+  for i=1,#msg,4 do
+    local m0, m1, m2, m3 = byte(msg, i, i+3)
+    x[k] = bor(m0, lshift(m1, 8), lshift(m2, 16), lshift(m3, 24))
+    if k == 16 then
+      a, b, c, d = transform(x, a, b, c, d)
+      k = 1
+    else
+      k = k + 1
+    end
+  end
+  return tohex(bswap(a))..tohex(bswap(b))..tohex(bswap(c))..tohex(bswap(d))
+end
+
+assert(md5('') == 'd41d8cd98f00b204e9800998ecf8427e')
+assert(md5('a') == '0cc175b9c0f1b6a831c399e269772661')
+assert(md5('abc') == '900150983cd24fb0d6963f7d28e17f72')
+assert(md5('message digest') == 'f96b697d7cb7938d525a2f31aaf161d0')
+assert(md5('abcdefghijklmnopqrstuvwxyz') == 'c3fcd3d76192e4007dfb496cca67e13b')
+assert(md5('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789') ==
+       'd174ab98d277d9f5a5611c2c9f419d9f')
+assert(md5('12345678901234567890123456789012345678901234567890123456789012345678901234567890') ==
+       '57edf4a22be3c955ac49da2e2107b67a')
+
+local N = tonumber(arg and arg[1]) or 10000
+
+  -- Credits: William Shakespeare, Romeo and Juliet
+local txt = [[Rebellious subjects, enemies to peace,
+Profaners of this neighbour-stained steel,--
+Will they not hear? What, ho! you men, you beasts,
+That quench the fire of your pernicious rage
+With purple fountains issuing from your veins,
+On pain of torture, from those bloody hands
+Throw your mistemper'd weapons to the ground,
+And hear the sentence of your moved prince.
+Three civil brawls, bred of an airy word,
+By thee, old Capulet, and Montague,
+Have thrice disturb'd the quiet of our streets,
+And made Verona's ancient citizens
+Cast by their grave beseeming ornaments,
+To wield old partisans, in hands as old,
+Canker'd with peace, to part your canker'd hate:
+If ever you disturb our streets again,
+Your lives shall pay the forfeit of the peace.
+For this time, all the rest depart away:
+You Capulet; shall go along with me:
+And, Montague, come you this afternoon,
+To know our further pleasure in this case,
+To old Free-town, our common judgment-place.
+Once more, on pain of death, all men depart.]]
+  txt = txt..txt..txt..txt
+  txt = txt..txt..txt..txt
+
+for i=1,N do
+  res = md5(txt)
+end
+assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
+
diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
new file mode 100644
index 00000000..80588ab5
--- /dev/null
+++ b/perf/LuaJIT-benches/meteor.lua
@@ -0,0 +1,220 @@
+
+-- Generate a decision tree based solver for the meteor puzzle.
+local function generatesolver(countinit)
+  local pairs, ipairs, format = pairs, ipairs, string.format
+  local byte, min, sort = string.byte, math.min, table.sort
+
+  -- Cached position to distance lookup.
+  local dist = setmetatable({}, { __index = function(t, xy)
+    local x = xy%10; local y = (xy-x)/10
+    if (x+y)%2 == 1 then y = y + 1; x = 10 - x end
+    local d = xy + 256*x*x + 1024*y*y; t[xy] = d; return d
+  end})
+
+  -- Lookup table to validate a cell and to find its successor.
+  local ok = {}
+  for i=0,150 do ok[i] = false end
+  for i=99,0,-1 do
+    local x = i%10
+    if ((i-x)/10+x)%2 == 0 then
+      ok[i] = i + (ok[i+1] and 1 or (ok[i+2] and 2 or 3))
+    end
+  end
+
+  -- Temporary board state for the island checks.
+  local islands, slide = {}, {20,22,24,26,28,31,33,35,37,39}
+  local bbc, bb = 0, {}
+  for i=0,19 do bb[i] = false; bb[i+80] = false end
+  for i=20,79 do bb[i] = ok[i] end
+
+  -- Recursive flood fill algorithm.
+  local function fill(bb, p)
+    bbc = bbc + 1
+    local n = p+2; if bb[n] then bb[n] = false; fill(bb, n) end
+    n = p-2; if bb[n] then bb[n] = false; fill(bb, n) end
+    n = p-9; if bb[n] then bb[n] = false; fill(bb, n) end
+    n = p-11; if bb[n] then bb[n] = false; fill(bb, n) end
+    n = p+9; if bb[n] then bb[n] = false; fill(bb, n) end
+    n = p+11; if bb[n] then bb[n] = false; fill(bb, n) end
+  end
+
+  -- Generate pruned, sliding decision trees.
+  local dtrees = {{}, {}, {}, {}, {}, {}, {}, {}, {}, {}}
+  local rot = { nil, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {} }
+  for k=0,9 do
+    -- Generate 10 initial pieces from line noise. :-)
+    local t = { 60, 62, byte("@BMBIK@KT@GPIKR@IKIKT@GK@KM@BG", k*3+1, k*3+3) }
+    rot[1] = t
+    for i,xy in ipairs(t) do
+      local x = xy%10; local y = (xy-x-60)/10
+      -- Add 11 more variations by rotating and flipping.
+      for j=2,12 do
+	if j == 7 then y = -y else x,y = (x+3*y)/2, (y-x)/2 end
+	rot[j][i] = x+10*y
+      end
+    end
+    for r,v in ipairs(rot) do
+      -- Exploit symmetry and leave out half of the orientations of one piece.
+      -- The selected piece gives the best reduction of the solution space.
+      if k ~= 3 or r%2 == 0 then
+	-- Normalize to origin, add distance, sort by distance from origin.
+	local m = min(v[1], v[2], v[3], v[4], v[5])
+	for i=1,5 do v[i] = dist[v[i]-m] end
+	sort(v)
+	local v2, v3, v4, v5 = v[2]%256, v[3]%256, v[4]%256, v[5]%256
+	-- Slide the piece across 2 rows, prune the tree, check for islands.
+	for j,p in ipairs(slide) do
+	  bb[p] = false
+	  if ok[p+v2] and ok[p+v3] and ok[p+v4] and ok[p+v5] then -- Prune.
+	    for i=p+1,79 do bb[i] = ok[i] end -- Clear remaining board.
+	    bb[p+v2] = false; bb[p+v3] = false -- Add piece.
+	    bb[p+v4] = false; bb[p+v5] = false
+	    bbc = j -- Flood fill and count the filled positions.
+	    if bb[71] then bb[71] = false; fill(bb, 71) end -- Lower left.
+	    if bb[79] then bb[79] = false; fill(bb, 79) end -- Lower right.
+	    local di = 0
+	    if bbc < 22 then bbc = 26
+	    elseif bbc < 26 then -- Island found, locate it, fill from above.
+	      for i=p+2,79 do if bb[i] then di = i-p; break end end
+	      for i=p-9,p-1 do if ok[i] then fill(bb, i) bbc = bbc - 1 end end
+	    end
+	    if bbc == 26 then -- Prune boards with static islands.
+	      local tb = dtrees[j] -- Build decision tree in distance order.
+	      local ta = tb[v2]; if not ta then ta = {}; tb[v2] = ta end
+	      tb = ta[v3]; if not tb then tb = {}; ta[v3] = tb end
+	      ta = tb[v4]; if not ta then ta = {}; tb[v4] = ta; islands[ta] = di
+	      elseif islands[ta] ~= di then islands[ta] = 0 end
+	      ta[v5] = di*10+k -- Leaves hold island check and piece number.
+	    end
+	  end
+	end
+      end
+    end
+  end
+
+  local s = "local u0,u1,u2,u3,u4,u5,u6,u7,u8,u9" -- Piece use flags.
+  for p=0,99 do if ok[p] then s = s..",b"..p end end -- Board cells.
+  s = s.."\n"..[[
+local countinit = ...
+local count = countinit
+local bmin, bmax, pcs = 9, 0, {}
+local smin, smax
+local write, reverse = io.write, string.reverse
+
+-- Print min/max boards.
+local function printboard(s)
+  local flip = true
+  for x in string.gmatch(string.gsub(s, ".", "%1 "), "..........") do
+    write(x, flip and "\n " or "\n")
+    flip = not flip
+  end
+  write("\n")
+end
+
+-- Print result.
+local function printresult()
+  write(countinit-count, " solutions found\n\n")
+  printboard(smin)
+  printboard(smax)
+end
+
+-- Generate piece lookup array from the order of use.
+local function genp()
+  local p = pcs
+  p[u0] = "0" p[u1] = "1" p[u2] = "2" p[u3] = "3" p[u4] = "4"
+  p[u5] = "5" p[u6] = "6" p[u7] = "7" p[u8] = "8" p[u9] = "9"
+  return p
+end
+
+-- Goal function.
+local function f91(k)
+  if k ~= 10 then return end
+  count = count - 2 -- Need to count the symmetric solution, too.
+  repeat
+    -- Quick precheck before constructing the string.
+    local b0, b99 = b0, b99
+    if b0 <= bmin then bmin = b0 elseif b0 >= bmax then bmax = b0
+    elseif b99 <= bmin then bmin = b99 elseif b99 >= bmax then bmax = b99
+    else break end
+    -- Translate the filled board to a string.
+    local p = genp()
+    local s = p[b0] ]]
+  for p=2,99 do if ok[p] then s = s.."..p[b"..p.."]" end end
+  s = s..[[
+    -- Remember min/max boards, ditto for the symmetric board.
+    if not smin then smin = s; smax = s
+    elseif s < smin then smin = s elseif s > smax then smax = s end
+    s = reverse(s)
+    if s < smin then smin = s elseif s > smax then smax = s end
+  until true
+  if count <= 0 then error() end -- Early abort if max count given.
+end
+local f93 = f91
+]]
+
+  -- Recursively convert the decision tree to Lua code.
+  local function codetree(tree, d, p, pn)
+    local found, s = false, ""
+    d = d + 1
+    for a,t in pairs(tree) do
+      local b = p+a
+      if b < 100 then -- Prune the tree at the lower border.
+	local pp = b ~= pn and pn or ok[b] -- Find maximum successor function.
+	if d >= 5 then -- Try to place the last cell of a piece and advance.
+	  found = true
+	  local u = t%10
+	  local di = (t-u)/10
+	  if di ~= 0 and d == 5 then
+	    di = di + p; if pp == di then pp = ok[di] end
+	    s = format("%sif b%d and not u%d and not b%d then b%d=k u%d=k f%d(k) u%d=N b%d=N end\n",
+		       s, di, u, b, b, u, pp, u, b)
+	  else
+	    s = format("%sif not u%d and not b%d then b%d=k u%d=k f%d(k) u%d=N b%d=N end\n",
+		       s, u, b, b, u, pp, u, b)
+	  end
+	else -- Try to place an intermediate cell.
+	  local di = d ~= 4 and 0 or islands[t]
+	  if di == 0 then
+	    local st = codetree(t, d, p, pp)
+	    if st then
+	      found = true
+	      s = format("%sif not b%d then b%d=k\n%sb%d=N end\n", s, b, b, st, b)
+	    end
+	  else -- Combine island checks.
+	    di = di + p; if pp == di then pp = ok[di] end
+	    local st = codetree(t, 6, p, pp)
+	    if st then
+	      found = true
+	      s = format("%sif b%d and not b%d then b%d=k\n%sb%d=N end\n", s, di, b, b, st, b)
+	    end
+	  end
+	end
+      end
+    end
+    return found and s
+  end
+
+  -- Embed the decision tree into a function hierarchy.
+  local j = 5
+  for p=88,0,-1 do
+    local pn = ok[p]
+    if pn then
+      s = format("%slocal function f%d(k)\nlocal N if b%d then return f%d(k) end k=k+1 b%d=k\n%sb%d=N end\n",
+	    s, p, p, pn, p, codetree(dtrees[j], 1, p, pn), p)
+      j = j - 1; if j == 0 then j = 10 end
+    end
+  end
+
+  -- Compile and return solver function and result getter.
+  return loadstring(s.."return f0, printresult\n", "solver")(countinit)
+end
+
+-- Generate the solver function hierarchy.
+local solver, printresult = generatesolver(tonumber(arg and arg[1]) or 10000)
+
+-- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
+if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
+
+-- Run the solver protected to get partial results (max count or ctrl-c).
+pcall(solver, 0)
+printresult()
diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
new file mode 100644
index 00000000..e0ff8f77
--- /dev/null
+++ b/perf/LuaJIT-benches/nbody.lua
@@ -0,0 +1,119 @@
+
+local sqrt = math.sqrt
+
+local PI = 3.141592653589793
+local SOLAR_MASS = 4 * PI * PI
+local DAYS_PER_YEAR = 365.24
+local bodies = {
+  { -- Sun
+    x = 0,
+    y = 0,
+    z = 0,
+    vx = 0,
+    vy = 0,
+    vz = 0,
+    mass = SOLAR_MASS
+  },
+  { -- Jupiter
+    x = 4.84143144246472090e+00,
+    y = -1.16032004402742839e+00,
+    z = -1.03622044471123109e-01,
+    vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
+    vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
+    vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
+    mass = 9.54791938424326609e-04 * SOLAR_MASS
+  },
+  { -- Saturn
+    x = 8.34336671824457987e+00,
+    y = 4.12479856412430479e+00,
+    z = -4.03523417114321381e-01,
+    vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
+    vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
+    vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
+    mass = 2.85885980666130812e-04 * SOLAR_MASS
+  },
+  { -- Uranus
+    x = 1.28943695621391310e+01,
+    y = -1.51111514016986312e+01,
+    z = -2.23307578892655734e-01,
+    vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
+    vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
+    vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
+    mass = 4.36624404335156298e-05 * SOLAR_MASS
+  },
+  { -- Neptune
+    x = 1.53796971148509165e+01,
+    y = -2.59193146099879641e+01,
+    z = 1.79258772950371181e-01,
+    vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
+    vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
+    vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
+    mass = 5.15138902046611451e-05 * SOLAR_MASS
+  }
+}
+
+local function advance(bodies, nbody, dt)
+  for i=1,nbody do
+    local bi = bodies[i]
+    local bix, biy, biz, bimass = bi.x, bi.y, bi.z, bi.mass
+    local bivx, bivy, bivz = bi.vx, bi.vy, bi.vz
+    for j=i+1,nbody do
+      local bj = bodies[j]
+      local dx, dy, dz = bix-bj.x, biy-bj.y, biz-bj.z
+      local mag = sqrt(dx*dx + dy*dy + dz*dz)
+      mag = dt / (mag * mag * mag)
+      local bm = bj.mass*mag
+      bivx = bivx - (dx * bm)
+      bivy = bivy - (dy * bm)
+      bivz = bivz - (dz * bm)
+      bm = bimass*mag
+      bj.vx = bj.vx + (dx * bm)
+      bj.vy = bj.vy + (dy * bm)
+      bj.vz = bj.vz + (dz * bm)
+    end
+    bi.vx = bivx
+    bi.vy = bivy
+    bi.vz = bivz
+    bi.x = bix + dt * bivx
+    bi.y = biy + dt * bivy
+    bi.z = biz + dt * bivz
+  end
+end
+
+local function energy(bodies, nbody)
+  local e = 0
+  for i=1,nbody do
+    local bi = bodies[i]
+    local vx, vy, vz, bim = bi.vx, bi.vy, bi.vz, bi.mass
+    e = e + (0.5 * bim * (vx*vx + vy*vy + vz*vz))
+    for j=i+1,nbody do
+      local bj = bodies[j]
+      local dx, dy, dz = bi.x-bj.x, bi.y-bj.y, bi.z-bj.z
+      local distance = sqrt(dx*dx + dy*dy + dz*dz)
+      e = e - ((bim * bj.mass) / distance)
+    end
+  end
+  return e
+end
+
+local function offsetMomentum(b, nbody)
+  local px, py, pz = 0, 0, 0
+  for i=1,nbody do
+    local bi = b[i]
+    local bim = bi.mass
+    px = px + (bi.vx * bim)
+    py = py + (bi.vy * bim)
+    pz = pz + (bi.vz * bim)
+  end
+  b[1].vx = -px / SOLAR_MASS
+  b[1].vy = -py / SOLAR_MASS
+  b[1].vz = -pz / SOLAR_MASS
+end
+
+local N = tonumber(arg and arg[1]) or 1000
+local nbody = #bodies
+
+offsetMomentum(bodies, nbody)
+io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
+for i=1,N do advance(bodies, nbody, 0.01) end
+io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
new file mode 100644
index 00000000..3971ec1f
--- /dev/null
+++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
@@ -0,0 +1,37 @@
+
+local floor, ceil = math.floor, math.ceil
+
+local precision = 50 -- Maximum precision of lua_Number (minus safety margin).
+local onebits = (2^precision)-1
+
+local function nsieve(p, m)
+  local cm = ceil(m/precision)
+  do local onebits = onebits; for i=0,cm do p[i] = onebits end end
+  local count, idx, bit = 0, 2, 2
+  for i=2,m do
+    local r = p[idx] / bit
+    if r - floor(r) >= 0.5 then -- Bit set?
+      local kidx, kbit = idx, bit
+      for k=i+i,m,i do
+        kidx = kidx + i
+        while kidx >= cm do kidx = kidx - cm; kbit = kbit + kbit end
+        local x = p[kidx]
+        local r = x / kbit
+        if r - floor(r) >= 0.5 then p[kidx] = x - kbit*0.5 end -- Clear bit.
+      end
+      count = count + 1
+    end
+    idx = idx + 1
+    if idx >= cm then idx = 0; bit = bit + bit end
+  end
+  return count
+end
+
+local N = tonumber(arg and arg[1]) or 1
+if N < 2 then N = 2 end
+local primes = {}
+
+for i=0,2 do
+  local m = (2^(N-i))*10000
+  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
+end
diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
new file mode 100644
index 00000000..820a3726
--- /dev/null
+++ b/perf/LuaJIT-benches/nsieve-bit.lua
@@ -0,0 +1,27 @@
+
+local bit = require("bit")
+local band, bxor, rshift, rol = bit.band, bit.bxor, bit.rshift, bit.rol
+
+local function nsieve(p, m)
+  local count = 0
+  for i=0,rshift(m, 5) do p[i] = -1 end
+  for i=2,m do
+    if band(rshift(p[rshift(i, 5)], i), 1) ~= 0 then
+      count = count + 1
+      for j=i+i,m,i do
+	local jx = rshift(j, 5)
+	p[jx] = band(p[jx], rol(-2, j))
+      end
+    end
+  end
+  return count
+end
+
+local N = tonumber(arg and arg[1]) or 1
+if N < 2 then N = 2 end
+local primes = {}
+
+for i=0,2 do
+  local m = (2^(N-i))*10000
+  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
+end
diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
new file mode 100644
index 00000000..6de0524f
--- /dev/null
+++ b/perf/LuaJIT-benches/nsieve.lua
@@ -0,0 +1,21 @@
+
+local function nsieve(p, m)
+  for i=2,m do p[i] = true end
+  local count = 0
+  for i=2,m do
+    if p[i] then
+      for k=i+i,m,i do p[k] = false end
+      count = count + 1
+    end
+  end
+  return count
+end
+
+local N = tonumber(arg and arg[1]) or 1
+if N < 2 then N = 2 end
+local primes = {}
+
+for i=0,2 do
+  local m = (2^(N-i))*10000
+  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
+end
diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
new file mode 100644
index 00000000..46bb9da3
--- /dev/null
+++ b/perf/LuaJIT-benches/partialsums.lua
@@ -0,0 +1,29 @@
+
+local n = tonumber(arg[1])
+local function pr(fmt, x) io.write(string.format(fmt, x)) end
+
+local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
+local sqrt, sin, cos = math.sqrt, math.sin, math.cos
+for k=1,n do
+  local k2, sk, ck = k*k, sin(k), cos(k)
+  local k3 = k2*k
+  a1 = a1 + (2/3)^k
+  a2 = a2 + 1/sqrt(k)
+  a3 = a3 + 1/(k2+k)
+  a4 = a4 + 1/(k3*sk*sk)
+  a5 = a5 + 1/(k3*ck*ck)
+  a6 = a6 + 1/k
+  a7 = a7 + 1/k2
+  a8 = a8 + alt/k
+  a9 = a9 + alt/(k+k-1)
+  alt = -alt
+end
+pr("%.9f\t(2/3)^k\n", a1)
+pr("%.9f\tk^-0.5\n", a2)
+pr("%.9f\t1/k(k+1)\n", a3)
+pr("%.9f\tFlint Hills\n", a4)
+pr("%.9f\tCookson Hills\n", a5)
+pr("%.9f\tHarmonic\n", a6)
+pr("%.9f\tRiemann Zeta\n", a7)
+pr("%.9f\tAlternating Harmonic\n", a8)
+pr("%.9f\tGregory\n", a9)
diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
new file mode 100644
index 00000000..63a1cb0e
--- /dev/null
+++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
@@ -0,0 +1,100 @@
+
+-- Start of dynamically compiled chunk.
+local chunk = [=[
+
+-- Factory function for multi-precision number (mpn) operations.
+local function fmm(fa, fb)
+  return loadstring([[
+    return function(y, a, ka, b, kb)
+      local carry, n = 0, #a ]]..(fb == 0 and "" or [[
+      local na, nb = n, #b -- Need to adjust lengths. 1 element suffices here.
+      if na > nb then b[na] = 0 elseif na < nb then a[nb] = 0; n = nb end
+    ]])..[[
+      for i=1,n do -- Sum up all elements and propagate carry.
+        local x = a[i] ]]..(fa == 2 and "*ka" or "")..
+          (fb == 2 and "+b[i]*kb" or (fb == 1 and "+b[i]" or ""))..[[ + carry
+        if x < RADIX and x >= 0 then carry = 0; y[i] = x -- Check for overflow.
+        else local d = x % RADIX; carry = (x-d) / RADIX; y[i] = d end
+      end
+      y[n+1] = nil -- Truncate target. 1 element suffices here.
+      if carry == 0 then while n > 0 and y[n] == 0 do y[n] = nil end
+      elseif carry == -1 then y[n] = y[n] - RADIX else y[n+1] = carry end
+    ]]..(fb == 0 and "" or [[ -- Undo length adjustment.
+      if na > nb then b[na] = nil elseif na < nb and y ~= a then a[nb] = nil end
+    ]])..[[
+      return y
+    end]])()
+end
+
+-- Generate needed mpn functions.
+local mm_kk, mm_k1, mm_k0, mm_11 = fmm(2, 2), fmm(2, 1), fmm(2, 0), fmm(1, 1)
+
+-- Choose the most efficient mpn function for y = a*ka + b*kb at run-time.
+local function mm(y, a, ka, b, kb)
+  local f = mm_kk
+  if kb == 0 or #b == 0 then if ka == 1 then return a else f = mm_k0 end
+  elseif kb == 1 then if ka == 1 then f = mm_11 else f = mm_k1 end end
+  return f(y, a, ka, b, kb)
+end
+
+-- Compose matrix with numbers on the right.
+local function compose_r(aq,ar,as,at, bq,br,bs,bt)
+  mm(ar, ar,bq, at,br) mm(at, at,bt, ar,bs)
+  mm(as, as,bt, aq,bs) mm(aq, aq,bq, nil,0)
+end
+
+-- Compose matrix with numbers on the left.
+local function compose_l(aq,ar,as,at, bq,br,bs,bt)
+  mm(ar, ar,bt, aq,br) mm(at, at,bt, as,br)
+  mm(as, as,bq, at,bs) mm(aq, aq,bq, nil,0)
+end
+
+-- Extract one digit.
+local u, v, jj = {}, {}, 0
+local function extract(q,r,s,t, j)
+  local u = j == jj + 1 and mm(u, u,1, q,1) or mm(u, q,j, r,1); jj = j
+  local v = mm(v, t,1, s,j)
+  local nu, nv, y = #u, #v
+  if nu == nv then
+    if nu == 1 then y = u[1] / v[1]
+    else y = (u[nu]*RADIX + u[nu-1]) / (v[nv]*RADIX + v[nv-1]) end
+  elseif nu == nv+1 then y = (u[nu]*RADIX + u[nv]) / v[nv]
+  else return 0 end
+  return math.floor(y)
+end
+
+-- Coroutine which yields successive digits of PI.
+return coroutine.wrap(function()
+  local q, r, s, t, k = {1}, {}, {}, {1}, 1
+  repeat
+    local y = extract(q,r,s,t, 3)
+    if y == extract(q,r,s,t, 4) then
+      coroutine.yield(y)
+      compose_r(q,r,s,t,  10, -10*y, 0, 1)
+    else
+      compose_l(q,r,s,t,   k, 4*k+2, 0, 2*k+1)
+      k = k + 1
+    end
+  until false
+end)
+
+]=] -- End of dynamically compiled chunk.
+
+local N = tonumber(arg and arg[1]) or 27
+local RADIX = N < 6500 and 2^36 or 2^32 -- Avoid overflow.
+
+-- Substitute radix and compile chunk.
+local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
+
+-- Print lines with 10 digits.
+for i=10,N,10 do
+  for j=1,10 do io.write(pidigit()) end
+  io.write("\t:", i, "\n")
+end
+
+-- Print remaining digits (if any).
+local n10 = N % 10
+if n10 ~= 0 then
+  for i=1,n10 do io.write(pidigit()) end
+  io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
+end
diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
new file mode 100644
index 00000000..2acc24c0
--- /dev/null
+++ b/perf/LuaJIT-benches/ray.lua
@@ -0,0 +1,135 @@
+local sqrt = math.sqrt
+local huge = math.huge
+
+local delta = 1
+while delta * delta + 1 ~= 1 do
+  delta = delta * 0.5
+end
+
+local function length(x, y, z)  return sqrt(x*x + y*y + z*z) end
+local function vlen(v)          return length(v[1], v[2], v[3]) end
+local function mul(c, x, y, z)  return c*x, c*y, c*z end
+local function unitise(x, y, z) return mul(1/length(x, y, z), x, y, z) end
+local function dot(x1, y1, z1, x2, y2, z2)
+  return x1*x2 + y1*y2 + z1*z2
+end
+
+local function vsub(a, b)        return a[1] - b[1], a[2] - b[2], a[3] - b[3] end
+local function vdot(a, b)        return dot(a[1], a[2], a[3], b[1], b[2], b[3]) end
+
+
+local sphere = {}
+function sphere:new(centre, radius)
+  self.__index = self
+  return setmetatable({centre=centre, radius=radius}, self)
+end
+
+local function sphere_distance(self, origin, dir)
+  local vx, vy, vz = vsub(self.centre, origin)
+  local b = dot(vx, vy, vz, dir[1], dir[2], dir[3])
+  local r = self.radius
+  local disc = r*r + b*b - vx*vx-vy*vy-vz*vz
+  if disc < 0 then return huge end
+  local d = sqrt(disc)
+  local t2 = b + d
+  if t2 < 0 then return huge end
+  local t1 = b - d
+  return t1 > 0 and t1 or t2
+end
+
+function sphere:intersect(origin, dir, best)
+  local lambda = sphere_distance(self, origin, dir)
+  if lambda < best[1] then
+    local c = self.centre
+    best[1] = lambda
+    local b2 = best[2]
+    b2[1], b2[2], b2[3] =
+      unitise(
+        origin[1] - c[1] + lambda * dir[1],
+        origin[2] - c[2] + lambda * dir[2],
+        origin[3] - c[3] + lambda * dir[3])
+  end
+end
+
+local group = {}
+function group:new(bound)
+  self.__index = self
+  return setmetatable({bound=bound, children={}}, self)
+end
+
+function group:add(s)
+  self.children[#self.children+1] = s
+end
+
+function group:intersect(origin, dir, best)
+  local lambda = sphere_distance(self.bound, origin, dir)
+  if lambda < best[1] then
+    for _, c in ipairs(self.children) do
+      c:intersect(origin, dir, best)
+    end
+  end
+end
+
+local hit = { 0, 0, 0 }
+local ilight
+local best = { huge, { 0, 0, 0 } }
+
+local function ray_trace(light, camera, dir, scene)
+  best[1] = huge
+  scene:intersect(camera, dir, best)
+  local b1 = best[1]
+  if b1 == huge then return 0 end
+  local b2 = best[2]
+  local g = vdot(b2, light)
+  if g >= 0 then return 0 end
+  hit[1] = camera[1] + b1*dir[1] + delta*b2[1]
+  hit[2] = camera[2] + b1*dir[2] + delta*b2[2]
+  hit[3] = camera[3] + b1*dir[3] + delta*b2[3]
+  best[1] = huge
+  scene:intersect(hit, ilight, best)
+  if best[1] == huge then
+    return -g
+  else
+    return 0
+  end
+end
+
+local function create(level, centre, radius)
+  local s = sphere:new(centre, radius)
+  if level == 1 then return s end
+  local gr = group:new(sphere:new(centre, 3*radius))
+  gr:add(s)
+  local rn = 3*radius/sqrt(12)
+  for dz = -1,1,2 do
+    for dx = -1,1,2 do
+      gr:add(create(level-1, { centre[1] + rn*dx, centre[2] + rn, centre[3] + rn*dz }, radius*0.5))
+    end
+  end
+  return gr
+end
+
+
+local level, n, ss = tonumber(arg[1]) or 9, tonumber(arg[2]) or 256, 4
+local iss = 1/ss
+local gf = 255/(ss*ss)
+
+io.write(("P5\n%d %d\n255\n"):format(n, n))
+local light = { unitise(-1, -3, 2) }
+ilight = { -light[1], -light[2], -light[3] }
+local camera = { 0, 0, -4 }
+local dir = { 0, 0, 0 }
+
+local scene = create(level, {0, -1, 0}, 1)
+
+for y = n/2-1, -n/2, -1 do
+  for x = -n/2, n/2-1 do
+    local g = 0
+    for d = y, y+.99, iss do
+      for e = x, x+.99, iss do
+        dir[1], dir[2], dir[3] = unitise(e, d, n)
+        g = g + ray_trace(light, camera, dir, scene)
+      end
+    end
+    io.write(string.char(math.floor(0.5 + g*gf)))
+  end
+end
diff --git a/perf/LuaJIT-benches/recursive-ack.lua b/perf/LuaJIT-benches/recursive-ack.lua
new file mode 100644
index 00000000..fad30589
--- /dev/null
+++ b/perf/LuaJIT-benches/recursive-ack.lua
@@ -0,0 +1,8 @@
+local function Ack(m, n)
+  if m == 0 then return n+1 end
+  if n == 0 then return Ack(m-1, 1) end
+  return Ack(m-1, (Ack(m, n-1))) -- The parentheses are deliberate.
+end
+
+local N = tonumber(arg and arg[1]) or 10
+io.write("Ack(3,", N ,"): ", Ack(3,N), "\n")
diff --git a/perf/LuaJIT-benches/recursive-fib.lua b/perf/LuaJIT-benches/recursive-fib.lua
new file mode 100644
index 00000000..ef9950de
--- /dev/null
+++ b/perf/LuaJIT-benches/recursive-fib.lua
@@ -0,0 +1,7 @@
+local function fib(n)
+  if n < 2 then return 1 end
+  return fib(n-2) + fib(n-1)
+end
+
+local n = tonumber(arg[1]) or 10
+io.write(string.format("Fib(%d): %d\n", n, fib(n)))
diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
new file mode 100644
index 00000000..34fe347b
--- /dev/null
+++ b/perf/LuaJIT-benches/revcomp.lua
@@ -0,0 +1,37 @@
+
+local sub = string.sub
+iubc = setmetatable({
+  A="T", C="G", B="V", D="H", K="M", R="Y",
+  a="T", c="G", b="V", d="H", k="M", r="Y",
+  T="A", G="C", V="B", H="D", M="K", Y="R", U="A",
+  t="A", g="C", v="B", h="D", m="K", y="R", u="A",
+  N="N", S="S", W="W", n="N", s="S", w="W",
+}, { __index = function(t, s)
+  local r = t[sub(s, 2)]..t[sub(s, 1, 1)]; t[s] = r; return r end })
+
+local wcode = [=[
+return function(t, n)
+  if n == 1 then return end
+  local iubc, sub, write = iubc, string.sub, io.write
+  local s = table.concat(t, "", 1, n-1)
+  for i=#s-59,1,-60 do
+    write(]=]
+for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
+wcode = wcode..[=["\n")
+  end
+  local r = #s % 60
+  if r ~= 0 then
+    for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
+    write("\n")
+  end
+end
+]=]
+local writerev = loadstring(wcode)()
+
+local t, n = {}, 1
+for line in io.lines() do
+  local c = sub(line, 1, 1)
+  if c == ">" then writerev(t, n); io.write(line, "\n"); n = 1
+  elseif c ~= ";" then t[n] = line; n = n + 1 end
+end
+writerev(t, n)
diff --git a/perf/LuaJIT-benches/scimark-2010-12-20.lua b/perf/LuaJIT-benches/scimark-2010-12-20.lua
new file mode 100644
index 00000000..353acb7c
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-2010-12-20.lua
@@ -0,0 +1,400 @@
+------------------------------------------------------------------------------
+-- Lua SciMark (2010-12-20).
+--
+-- A literal translation of SciMark 2.0a, written in Java and C.
+-- Credits go to the original authors Roldan Pozo and Bruce Miller.
+-- See: http://math.nist.gov/scimark2/
+------------------------------------------------------------------------------
+
+local SCIMARK_VERSION = "2010-12-20"
+local SCIMARK_COPYRIGHT = "Copyright (C) 2006-2010 Mike Pall"
+
+local MIN_TIME = 2.0
+local RANDOM_SEED = 101009 -- Must be odd.
+local SIZE_SELECT = "small"
+
+local benchmarks = {
+  "FFT", "SOR", "MC", "SPARSE", "LU",
+  small = {
+    FFT		= { 1024 },
+    SOR		= { 100 },
+    MC		= { },
+    SPARSE	= { 1000, 5000 },
+    LU		= { 100 },
+  },
+  large = {
+    FFT		= { 1048576 },
+    SOR		= { 1000 },
+    MC		= { },
+    SPARSE	= { 100000, 1000000 },
+    LU		= { 1000 },
+  },
+}
+
+local abs, log, sin, floor = math.abs, math.log, math.sin, math.floor
+local pi, clock = math.pi, os.clock
+local format = string.format
+
+------------------------------------------------------------------------------
+-- Select array type: Lua tables or native (FFI) arrays
+------------------------------------------------------------------------------
+
+local darray, iarray
+
+local function array_init()
+  if jit and jit.status and jit.status() then
+    local ok, ffi = pcall(require, "ffi")
+    if ok then
+      darray = ffi.typeof("double[?]")
+      iarray = ffi.typeof("int[?]")
+      return
+    end
+  end
+  function darray(n) return {} end
+  iarray = darray
+end
+
+------------------------------------------------------------------------------
+-- This is a Lagged Fibonacci Pseudo-random Number Generator with
+-- j, k, M = 5, 17, 31. Pretty weak, but same as C/Java SciMark.
+------------------------------------------------------------------------------
+
+local rand, rand_init
+
+if jit and jit.status and jit.status() then
+  -- LJ2 has bit operations and zero-based arrays (internally).
+  local bit = require("bit")
+  local band, sar = bit.band, bit.arshift
+  function rand_init(seed)
+    local Rm, Rj, Ri = iarray(17), 16, 11
+    for i=0,16 do Rm[i] = 0 end
+    for i=16,0,-1 do
+      seed = band(seed*9069, 0x7fffffff)
+      Rm[i] = seed
+    end
+    function rand()
+      local i = band(Ri+1, sar(Ri-16, 31))
+      local j = band(Rj+1, sar(Rj-16, 31))
+      Ri, Rj = i, j
+      local k = band(Rm[i] - Rm[j], 0x7fffffff)
+      Rm[j] = k
+      return k * (1.0/2147483647.0)
+    end
+  end
+else
+  -- Better for standard Lua with one-based arrays and without bit operations.
+  function rand_init(seed)
+    local Rm, Rj = {}, 1
+    for i=1,17 do Rm[i] = 0 end
+    for i=17,1,-1 do
+      seed = (seed*9069) % (2^31)
+      Rm[i] = seed
+    end
+    function rand()
+      local j, m = Rj, Rm
+      local h = j - 5
+      if h < 1 then h = h + 17 end
+      local k = m[h] - m[j]
+      if k < 0 then k = k + 2147483647 end
+      m[j] = k
+      if j < 17 then Rj = j + 1 else Rj = 1 end
+      return k * (1.0/2147483647.0)
+    end
+  end
+end
+
+local function random_vector(n)
+  local v = darray(n+1)
+  for x=1,n do v[x] = rand() end
+  return v
+end
+
+local function random_matrix(m, n)
+  local a = {}
+  for y=1,m do
+    local v = darray(n+1)
+    a[y] = v
+    for x=1,n do v[x] = rand() end
+  end
+  return a
+end
+
+------------------------------------------------------------------------------
+-- FFT: Fast Fourier Transform.
+------------------------------------------------------------------------------
+
+local function fft_bitreverse(v, n)
+  local j = 0
+  for i=0,2*n-4,2 do
+    if i < j then
+      v[i+1], v[i+2], v[j+1], v[j+2] = v[j+1], v[j+2], v[i+1], v[i+2]
+    end
+    local k = n
+    while k <= j do j = j - k; k = k / 2 end
+    j = j + k
+  end
+end
+
+local function fft_transform(v, n, dir)
+  if n <= 1 then return end
+  fft_bitreverse(v, n)
+  local dual = 1
+  repeat
+    local dual2 = 2*dual
+    for i=1,2*n-1,2*dual2 do
+      local j = i+dual2
+      local ir, ii = v[i], v[i+1]
+      local jr, ji = v[j], v[j+1]
+      v[j], v[j+1] = ir - jr, ii - ji
+      v[i], v[i+1] = ir + jr, ii + ji
+    end
+    local theta = dir * pi / dual
+    local s, s2 = sin(theta), 2.0 * sin(theta * 0.5)^2
+    local wr, wi = 1.0, 0.0
+    for a=3,dual2-1,2 do
+      wr, wi = wr - s*wi - s2*wr, wi + s*wr - s2*wi
+      for i=a,a+2*(n-dual2),2*dual2 do
+	local j = i+dual2
+	local jr, ji = v[j], v[j+1]
+	local dr, di = wr*jr - wi*ji, wr*ji + wi*jr
+	local ir, ii = v[i], v[i+1]
+	v[j], v[j+1] = ir - dr, ii - di
+	v[i], v[i+1] = ir + dr, ii + di
+      end
+    end
+    dual = dual2
+  until dual >= n
+end
+
+function benchmarks.FFT(n)
+  local l2n = log(n)/log(2)
+  if l2n % 1 ~= 0 then
+    io.stderr:write("Error: FFT data length is not a power of 2\n")
+    os.exit(1)
+  end
+  local v = random_vector(n*2)
+  return function(cycles)
+    local norm = 1.0 / n
+    for p=1,cycles do
+      fft_transform(v, n, -1)
+      fft_transform(v, n, 1)
+      for i=1,n*2 do v[i] = v[i] * norm end
+    end
+    return ((5*n-2)*l2n + 2*(n+1)) * cycles
+  end
+end
+
+------------------------------------------------------------------------------
+-- SOR: Jacobi Successive Over-Relaxation.
+------------------------------------------------------------------------------
+
+local function sor_run(mat, m, n, cycles, omega)
+  local om4, om1 = omega*0.25, 1.0-omega
+  m = m - 1
+  n = n - 1
+  for i=1,cycles do
+    for y=2,m do
+      local v, vp, vn = mat[y], mat[y-1], mat[y+1]
+      for x=2,n do
+	v[x] = om4*((vp[x]+vn[x])+(v[x-1]+v[x+1])) + om1*v[x]
+      end
+    end
+  end
+end
+
+function benchmarks.SOR(n)
+  local mat = random_matrix(n, n)
+  return function(cycles)
+    sor_run(mat, n, n, cycles, 1.25)
+    return (n-1)*(n-1)*cycles*6
+  end
+end
+
+------------------------------------------------------------------------------
+-- MC: Monte Carlo Integration.
+------------------------------------------------------------------------------
+
+local function mc_integrate(cycles)
+  local under_curve = 0
+  local rand = rand
+  for i=1,cycles do
+    local x = rand()
+    local y = rand()
+    if x*x + y*y <= 1.0 then under_curve = under_curve + 1 end
+  end
+  return (under_curve/cycles) * 4
+end
+
+function benchmarks.MC()
+  return function(cycles)
+    local res = mc_integrate(cycles)
+    assert(math.sqrt(cycles)*math.abs(res-math.pi) < 5.0, "bad MC result")
+    return cycles * 4 -- Way off, but same as SciMark in C/Java.
+  end
+end
+
+------------------------------------------------------------------------------
+-- Sparse Matrix Multiplication.
+------------------------------------------------------------------------------
+
+local function sparse_mult(n, cycles, vy, val, row, col, vx)
+  for p=1,cycles do
+    for r=1,n do
+      local sum = 0
+      for i=row[r],row[r+1]-1 do sum = sum + vx[col[i]] * val[i] end
+      vy[r] = sum
+    end
+  end
+end
+
+function benchmarks.SPARSE(n, nz)
+  local nr = floor(nz/n)
+  local anz = nr*n
+  local vx = random_vector(n)
+  local val = random_vector(anz)
+  local vy, col, row = darray(n+1), iarray(nz+1), iarray(n+2)
+  row[1] = 1
+  for r=1,n do
+    local step = floor(r/nr)
+    if step < 1 then step = 1 end
+    local rr = row[r]
+    row[r+1] = rr+nr
+    for i=0,nr-1 do col[rr+i] = 1+i*step end
+  end
+  return function(cycles)
+    sparse_mult(n, cycles, vy, val, row, col, vx)
+    return anz*cycles*2
+  end
+end
+
+------------------------------------------------------------------------------
+-- LU: Dense Matrix Factorization.
+------------------------------------------------------------------------------
+
+local function lu_factor(a, pivot, m, n)
+  local min_m_n = m < n and m or n
+  for j=1,min_m_n do
+    local jp, t = j, abs(a[j][j])
+    for i=j+1,m do
+      local ab = abs(a[i][j])
+      if ab > t then
+	jp = i
+	t = ab
+      end
+    end
+    pivot[j] = jp
+    if a[jp][j] == 0 then error("zero pivot") end
+    if jp ~= j then a[j], a[jp] = a[jp], a[j] end
+    if j < m then
+      local recp = 1.0 / a[j][j]
+      for k=j+1,m do
+	local v = a[k]
+	v[j] = v[j] * recp
+      end
+    end
+    if j < min_m_n then
+      for i=j+1,m do
+	local vi, vj = a[i], a[j]
+	local eij = vi[j]
+	for k=j+1,n do vi[k] = vi[k] - eij * vj[k] end
+      end
+    end
+  end
+end
+
+local function matrix_alloc(m, n)
+  local a = {}
+  for y=1,m do a[y] = darray(n+1) end
+  return a
+end
+
+local function matrix_copy(dst, src, m, n)
+  for y=1,m do
+    local vd, vs = dst[y], src[y]
+    for x=1,n do vd[x] = vs[x] end
+  end
+end
+
+function benchmarks.LU(n)
+  local mat = random_matrix(n, n)
+  local tmp = matrix_alloc(n, n)
+  local pivot = iarray(n+1)
+  return function(cycles)
+    for i=1,cycles do
+      matrix_copy(tmp, mat, n, n)
+      lu_factor(tmp, pivot, n, n)
+    end
+    return 2.0/3.0*n*n*n*cycles
+  end
+end
+
+------------------------------------------------------------------------------
+-- Main program.
+------------------------------------------------------------------------------
+
+local function printf(...)
+  io.write(format(...))
+end
+
+local function fmtparams(p1, p2)
+  if p2 then return format("[%d, %d]", p1, p2)
+  elseif p1 then return format("[%d]", p1) end
+  return ""
+end
+
+local function measure(min_time, name, ...)
+  array_init()
+  rand_init(RANDOM_SEED)
+  local run = benchmarks[name](...)
+  local cycles = 1
+  repeat
+    local tm = clock()
+    local flops = run(cycles, ...)
+    tm = clock() - tm
+    if tm >= min_time then
+      local res = flops / tm * 1.0e-6
+      local p1, p2 = ...
+      printf("%-7s %8.2f  %s\n", name, res, fmtparams(...))
+      return res
+    end
+    cycles = cycles * 2
+  until false
+end
+
+printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
+       SCIMARK_VERSION, SCIMARK_COPYRIGHT)
+
+while arg and arg[1] do
+  local a = table.remove(arg, 1)
+  if a == "-noffi" then
+    package.preload.ffi = nil
+  elseif a == "-small" then
+    SIZE_SELECT = "small"
+  elseif a == "-large" then
+    SIZE_SELECT = "large"
+  elseif benchmarks[a] then
+    local p = benchmarks[SIZE_SELECT][a]
+    measure(MIN_TIME, a, tonumber(arg[1]) or p[1], tonumber(arg[2]) or p[2])
+    return
+  else
+    printf("Usage: scimark [-noffi] [-small|-large] [BENCH params...]\n\n")
+    printf("BENCH   -small         -large\n")
+    printf("---------------------------------------\n")
+    for _,name in ipairs(benchmarks) do
+      printf("%-7s %-13s %s\n", name,
+	     fmtparams(unpack(benchmarks.small[name])),
+	     fmtparams(unpack(benchmarks.large[name])))
+    end
+    printf("\n")
+    os.exit(1)
+  end
+end
+
+local params = benchmarks[SIZE_SELECT]
+local sum = 0
+for _,name in ipairs(benchmarks) do
+  sum = sum + measure(MIN_TIME, name, unpack(params[name]))
+end
+printf("\nSciMark %8.2f  [%s problem sizes]\n", sum / #benchmarks, SIZE_SELECT)
+io.flush()
+
diff --git a/perf/LuaJIT-benches/scimark-fft.lua b/perf/LuaJIT-benches/scimark-fft.lua
new file mode 100644
index 00000000..c05bb69a
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-fft.lua
@@ -0,0 +1 @@
+require("scimark_lib").FFT(1024)(tonumber(arg and arg[1]) or 50000)
diff --git a/perf/LuaJIT-benches/scimark-lu.lua b/perf/LuaJIT-benches/scimark-lu.lua
new file mode 100644
index 00000000..7636d994
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-lu.lua
@@ -0,0 +1 @@
+require("scimark_lib").LU(100)(tonumber(arg and arg[1]) or 5000)
diff --git a/perf/LuaJIT-benches/scimark-sor.lua b/perf/LuaJIT-benches/scimark-sor.lua
new file mode 100644
index 00000000..e537e986
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-sor.lua
@@ -0,0 +1 @@
+require("scimark_lib").SOR(100)(tonumber(arg and arg[1]) or 50000)
diff --git a/perf/LuaJIT-benches/scimark-sparse.lua b/perf/LuaJIT-benches/scimark-sparse.lua
new file mode 100644
index 00000000..01a2258d
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-sparse.lua
@@ -0,0 +1 @@
+require("scimark_lib").SPARSE(1000, 5000)(tonumber(arg and arg[1]) or 150000)
diff --git a/perf/LuaJIT-benches/scimark_lib.lua b/perf/LuaJIT-benches/scimark_lib.lua
new file mode 100644
index 00000000..aeffd75a
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark_lib.lua
@@ -0,0 +1,297 @@
+------------------------------------------------------------------------------
+-- Lua SciMark (2010-03-15).
+--
+-- A literal translation of SciMark 2.0a, written in Java and C.
+-- Credits go to the original authors Roldan Pozo and Bruce Miller.
+-- See: http://math.nist.gov/scimark2/
+------------------------------------------------------------------------------
+
+
+local SCIMARK_VERSION = "2010-03-15"
+
+local RANDOM_SEED = 101009 -- Must be odd.
+
+local abs, log, sin, floor = math.abs, math.log, math.sin, math.floor
+local pi, clock = math.pi, os.clock
+
+local benchmarks = {}
+
+------------------------------------------------------------------------------
+-- This is a Lagged Fibonacci Pseudo-random Number Generator with
+-- j, k, M = 5, 17, 31. Pretty weak, but same as C/Java SciMark.
+------------------------------------------------------------------------------
+
+local rand, rand_init
+
+if jit and jit.status and jit.status() then
+  -- LJ2 has bit operations and zero-based arrays (internally).
+  local bit = require("bit")
+  local band, sar = bit.band, bit.arshift
+  local Rm, Rj, Ri = {}, 0, 0
+  for i=0,16 do Rm[i] = 0 end
+  function rand_init(seed)
+    Rj, Ri = 16, 11
+    for i=16,0,-1 do
+      seed = band(seed*9069, 0x7fffffff)
+      Rm[i] = seed
+    end
+  end
+  function rand()
+    local i = band(Ri+1, sar(Ri-16, 31))
+    local j = band(Rj+1, sar(Rj-16, 31))
+    Ri, Rj = i, j
+    local k = band(Rm[i] - Rm[j], 0x7fffffff)
+    Rm[j] = k
+    return k * (1.0/2147483647.0)
+  end
+else
+  -- Better for standard Lua with one-based arrays and without bit operations.
+  local Rm, Rj = {}, 1
+  for i=1,17 do Rm[i] = 0 end
+  function rand_init(seed)
+    Rj = 1
+    for i=17,1,-1 do
+      seed = (seed*9069) % (2^31)
+      Rm[i] = seed
+    end
+  end
+  function rand()
+    local j, m = Rj, Rm
+    local h = j - 5
+    if h < 1 then h = h + 17 end
+    local k = m[h] - m[j]
+    if k < 0 then k = k + 2147483647 end
+    m[j] = k
+    if j < 17 then Rj = j + 1 else Rj = 1 end
+    return k * (1.0/2147483647.0)
+  end
+end
+
+local function random_vector(n)
+  local v = {}
+  for x=1,n do v[x] = rand() end
+  return v
+end
+
+local function random_matrix(m, n)
+  local a = {}
+  for y=1,m do
+    local v = {}
+    a[y] = v
+    for x=1,n do v[x] = rand() end
+  end
+  return a
+end
+
+------------------------------------------------------------------------------
+-- FFT: Fast Fourier Transform.
+------------------------------------------------------------------------------
+
+local function fft_bitreverse(v, n)
+  local j = 0
+  for i=0,2*n-4,2 do
+    if i < j then
+      v[i+1], v[i+2], v[j+1], v[j+2] = v[j+1], v[j+2], v[i+1], v[i+2]
+    end
+    local k = n
+    while k <= j do j = j - k; k = k / 2 end
+    j = j + k
+  end
+end
+
+local function fft_transform(v, n, dir)
+  if n <= 1 then return end
+  fft_bitreverse(v, n)
+  local dual = 1
+  repeat
+    local dual2 = 2*dual
+    for i=1,2*n-1,2*dual2 do
+      local j = i+dual2
+      local ir, ii = v[i], v[i+1]
+      local jr, ji = v[j], v[j+1]
+      v[j], v[j+1] = ir - jr, ii - ji
+      v[i], v[i+1] = ir + jr, ii + ji
+    end
+    local theta = dir * pi / dual
+    local s, s2 = sin(theta), 2.0 * sin(theta * 0.5)^2
+    local wr, wi = 1.0, 0.0
+    for a=3,dual2-1,2 do
+      wr, wi = wr - s*wi - s2*wr, wi + s*wr - s2*wi
+      for i=a,a+2*(n-dual2),2*dual2 do
+	local j = i+dual2
+	local jr, ji = v[j], v[j+1]
+	local dr, di = wr*jr - wi*ji, wr*ji + wi*jr
+	local ir, ii = v[i], v[i+1]
+	v[j], v[j+1] = ir - dr, ii - di
+	v[i], v[i+1] = ir + dr, ii + di
+      end
+    end
+    dual = dual2
+  until dual >= n
+end
+
+function benchmarks.FFT(n)
+  local l2n = log(n)/log(2)
+  if l2n % 1 ~= 0 then
+    io.stderr:write("Error: FFT data length is not a power of 2\n")
+    os.exit(1)
+  end
+  local v = random_vector(n*2)
+  return function(cycles)
+    local norm = 1.0 / n
+    for p=1,cycles do
+      fft_transform(v, n, -1)
+      fft_transform(v, n, 1)
+      for i=1,n*2 do v[i] = v[i] * norm end
+    end
+    return ((5*n-2)*l2n + 2*(n+1)) * cycles
+  end
+end
+
+------------------------------------------------------------------------------
+-- SOR: Jacobi Successive Over-Relaxation.
+------------------------------------------------------------------------------
+
+local function sor_run(mat, m, n, cycles, omega)
+  local om4, om1 = omega*0.25, 1.0-omega
+  m = m - 1
+  n = n - 1
+  for i=1,cycles do
+    for y=2,m do
+      local v, vp, vn = mat[y], mat[y-1], mat[y+1]
+      for x=2,n do
+	v[x] = om4*((vp[x]+vn[x])+(v[x-1]+v[x+1])) + om1*v[x]
+      end
+    end
+  end
+end
+
+function benchmarks.SOR(n)
+  local mat = random_matrix(n, n)
+  return function(cycles)
+    sor_run(mat, n, n, cycles, 1.25)
+    return (n-1)*(n-1)*cycles*6
+  end
+end
+
+------------------------------------------------------------------------------
+-- MC: Monte Carlo Integration.
+------------------------------------------------------------------------------
+
+local function mc_integrate(cycles)
+  local under_curve = 0
+  local rand = rand
+  for i=1,cycles do
+    local x = rand()
+    local y = rand()
+    if x*x + y*y <= 1.0 then under_curve = under_curve + 1 end
+  end
+  return (under_curve/cycles) * 4
+end
+
+function benchmarks.MC()
+  return function(cycles)
+    local res = mc_integrate(cycles)
+    assert(math.sqrt(cycles)*math.abs(res-math.pi) < 5.0, "bad MC result")
+    return cycles * 4 -- Way off, but same as SciMark in C/Java.
+  end
+end
+
+------------------------------------------------------------------------------
+-- Sparse Matrix Multiplication.
+------------------------------------------------------------------------------
+
+local function sparse_mult(n, cycles, vy, val, row, col, vx)
+  for p=1,cycles do
+    for r=1,n do
+      local sum = 0
+      for i=row[r],row[r+1]-1 do sum = sum + vx[col[i]] * val[i] end
+      vy[r] = sum
+    end
+  end
+end
+
+function benchmarks.SPARSE(n, nz)
+  local nr = floor(nz/n)
+  local anz = nr*n
+  local vx = random_vector(n)
+  local val = random_vector(anz)
+  local vy, col, row = {}, {}, {}
+  row[1] = 1
+  for r=1,n do
+    local step = floor(r/nr)
+    if step < 1 then step = 1 end
+    local rr = row[r]
+    row[r+1] = rr+nr
+    for i=0,nr-1 do col[rr+i] = 1+i*step end
+  end
+  return function(cycles)
+    sparse_mult(n, cycles, vy, val, row, col, vx)
+    return anz*cycles*2
+  end
+end
+
+------------------------------------------------------------------------------
+-- LU: Dense Matrix Factorization.
+------------------------------------------------------------------------------
+
+local function lu_factor(a, pivot, m, n)
+  local min_m_n = m < n and m or n
+  for j=1,min_m_n do
+    local jp, t = j, abs(a[j][j])
+    for i=j+1,m do
+      local ab = abs(a[i][j])
+      if ab > t then
+	jp = i
+	t = ab
+      end
+    end
+    pivot[j] = jp
+    if a[jp][j] == 0 then error("zero pivot") end
+    if jp ~= j then a[j], a[jp] = a[jp], a[j] end
+    if j < m then
+      local recp = 1.0 / a[j][j]
+      for k=j+1,m do
+	local v = a[k]
+	v[j] = v[j] * recp
+      end
+    end
+    if j < min_m_n then
+      for i=j+1,m do
+	local vi, vj = a[i], a[j]
+	local eij = vi[j]
+	for k=j+1,n do vi[k] = vi[k] - eij * vj[k] end
+      end
+    end
+  end
+end
+
+local function matrix_alloc(m, n)
+  local a = {}
+  for y=1,m do a[y] = {} end
+  return a
+end
+
+local function matrix_copy(dst, src, m, n)
+  for y=1,m do
+    local vd, vs = dst[y], src[y]
+    for x=1,n do vd[x] = vs[x] end
+  end
+end
+
+function benchmarks.LU(n)
+  local mat = random_matrix(n, n)
+  local tmp = matrix_alloc(n, n)
+  local pivot = {}
+  return function(cycles)
+    for i=1,cycles do
+      matrix_copy(tmp, mat, n, n)
+      lu_factor(tmp, pivot, n, n)
+    end
+    return 2.0/3.0*n*n*n*cycles
+  end
+end
+
+rand_init(RANDOM_SEED)
+
+return benchmarks
diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
new file mode 100644
index 00000000..f766cb32
--- /dev/null
+++ b/perf/LuaJIT-benches/series.lua
@@ -0,0 +1,34 @@
+
+local function integrate(x0, x1, nsteps, omegan, f)
+  local x, dx = x0, (x1-x0)/nsteps
+  local rvalue = ((x0+1)^x0 * f(omegan*x0)) / 2
+  for i=3,nsteps do
+    x = x + dx
+    rvalue = rvalue + (x+1)^x * f(omegan*x)
+  end
+  return (rvalue + ((x1+1)^x1 * f(omegan*x1)) / 2) * dx
+end
+
+local function series(n)
+  local sin, cos = math.sin, math.cos
+  local omega = math.pi
+  local t = {}
+
+  t[1] = integrate(0, 2, 1000, 0, function() return 1 end) / 2
+  t[2] = 0
+
+  for i=2,n do
+    t[2*i-1] = integrate(0, 2, 1000, omega*i, cos)
+    t[2*i] = integrate(0, 2, 1000, omega*i, sin)
+  end
+
+  return t
+end
+
+local n = tonumber(arg and arg[1]) or 10000
+local tm = os.clock()
+local t = series(n)
+tm = os.clock() - tm
+assert(math.abs(t[1]-2.87295) < 0.00001)
+io.write(string.format("size %d, %.2f s, %.1f iterations/s\n",
+                       n, tm, (2*n-1)/tm))
diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
new file mode 100644
index 00000000..ecc80112
--- /dev/null
+++ b/perf/LuaJIT-benches/spectral-norm.lua
@@ -0,0 +1,40 @@
+
+local function A(i, j)
+  local ij = i+j-1
+  return 1.0 / (ij * (ij-1) * 0.5 + i)
+end
+
+local function Av(x, y, N)
+  for i=1,N do
+    local a = 0
+    for j=1,N do a = a + x[j] * A(i, j) end
+    y[i] = a
+  end
+end
+
+local function Atv(x, y, N)
+  for i=1,N do
+    local a = 0
+    for j=1,N do a = a + x[j] * A(j, i) end
+    y[i] = a
+  end
+end
+
+local function AtAv(x, y, t, N)
+  Av(x, t, N)
+  Atv(t, y, N)
+end
+
+local N = tonumber(arg and arg[1]) or 100
+local u, v, t = {}, {}, {}
+for i=1,N do u[i] = 1 end
+
+for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
+
+local vBv, vv = 0, 0
+for i=1,N do
+  local ui, vi = u[i], v[i]
+  vBv = vBv + ui*vi
+  vv = vv + vi*vi
+end
+io.write(string.format("%0.9f\n", math.sqrt(vBv / vv)))
diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
new file mode 100644
index 00000000..c9e618fd
--- /dev/null
+++ b/perf/LuaJIT-benches/sum-file.lua
@@ -0,0 +1,6 @@
+
+local sum = 0
+for line in io.lines() do
+  sum = sum + line
+end
+io.write(sum, "\n")
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module Sergey Kaplun via Tarantool-patches
                   ` (38 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This module contains 2 functions:
- `realtime()` -- returns the time represented by the wall clock.
- `process_cputime()` -- returns the CPU time consumed by all threads
  of the process.

Both functions are implemented via an FFI call to `clock_gettime()` and
return the time in seconds.
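
A minimal usage sketch, assuming the module is reachable as `clock` via
`package.path`. The `measure()` helper and the `os.time()`/`os.clock()`
fallback are illustrative additions and are not part of the patch:

```lua
-- Use the module from this patch when it is available. The fallback
-- is only an assumption to keep the sketch runnable without it.
local ok, clock = pcall(require, 'clock')
if not ok or type(clock) ~= 'table' or not clock.realtime then
  clock = {realtime = os.time, process_cputime = os.clock}
end

-- Hypothetical helper: returns the wall-clock time and the CPU time
-- (both in seconds) spent in f().
local function measure(f)
  local start_real = clock.realtime()
  local start_cpu = clock.process_cputime()
  f()
  return clock.realtime() - start_real,
         clock.process_cputime() - start_cpu
end

local real, cpu = measure(function()
  local sum = 0
  for i = 1, 1e6 do sum = sum + i end
  return sum
end)
print(string.format('real: %.6f s, cpu: %.6f s', real, cpu))
```

Taking both timestamps around the same call is the pattern the bench
module in the next patch relies on.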
---
 perf/utils/clock.lua | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100644 perf/utils/clock.lua

diff --git a/perf/utils/clock.lua b/perf/utils/clock.lua
new file mode 100644
index 00000000..57385967
--- /dev/null
+++ b/perf/utils/clock.lua
@@ -0,0 +1,35 @@
+local ffi = require('ffi')
+
+ffi.cdef[[
+struct timespec {
+  long tv_sec; /* Seconds. */
+  long tv_nsec; /* Nanoseconds. */
+};
+
+int clock_gettime(int clockid, struct timespec *tp);
+]]
+
+local C = ffi.C
+
+-- Wall clock.
+local CLOCK_REALTIME = 0
+-- CPU time consumed by the process.
+local CLOCK_PROCESS_CPUTIME_ID = 2
+
+-- All functions below return the corresponding `clock_gettime()`
+-- value in seconds.
+local M = {}
+
+local timespec = ffi.new('struct timespec[1]')
+
+function M.realtime()
+  C.clock_gettime(CLOCK_REALTIME, timespec)
+  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
+end
+
+function M.process_cputime()
+  C.clock_gettime(CLOCK_PROCESS_CPUTIME_ID, timespec)
+  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
+end
+
+return M
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite Sergey Kaplun via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-11 15:41   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches Sergey Kaplun via Tarantool-patches
                   ` (37 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This module provides functionality to run custom benchmark workloads
defined by the following syntax:

| local bench = require('bench').new(arg)
|
| -- f_* are functions, n_* are numbers.
| bench:add({
|   setup = f_setup,
|   payload = f_payload,
|   teardown = f_teardown,
|   items = n_items_processed,
|
|   checker = f_checker,
|   -- Or instead:
|   skip_check = true,
|
|   iterations = n_iterations,
|   -- Or instead:
|   min_time = n_seconds,
| })
|
| bench:run_and_report()

The checker function receives the single value returned by the payload
function and performs all checks related to the test. If it returns a
true value, the check is considered passed. The checked payload run
happens before the main workload and also serves as a warm-up.
Generally, you should always provide a checker function to be sure that
your benchmark is still correct after optimizations. When that is
impossible (for some reason), you may set the `skip_check` flag
instead. In that case the warm-up run is skipped as well.
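
The check/warm-up rule can be sketched as follows. This is a simplified
stand-in for the first run, not the module's code: the `warm_up()`
helper and both example benches are hypothetical.

```lua
-- Hypothetical stand-in for the first (checked) run of a benchmark.
-- Returns true if the warm-up run was performed.
local function warm_up(bench)
  if bench.skip_check then
    return false
  end
  local result = bench.payload()
  assert(bench.checker(result), 'checker failed for ' .. bench.name)
  return true
end

local sum_bench = {
  name = 'sum_1_to_100',
  payload = function()
    local sum = 0
    for i = 1, 100 do sum = sum + i end
    return sum
  end,
  -- Receives the single value returned by the payload.
  checker = function(sum) return sum == 5050 end,
}

local unchecked = {
  name = 'unchecked',
  payload = function() end,
  skip_check = true,
}

print(warm_up(sum_bench), warm_up(unchecked))
```

A failing checker raises an error before any timing starts, so a broken
benchmark never produces numbers.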

Tests are run in the order they were added. For each test, the module
measures the real time and the CPU time needed either to run
`iterations` repetitions of the test or, if `iterations` is not set, to
run as many repetitions as it takes to exceed `min_time` seconds (4 by
default). From that it calculates the items-per-second metric (more is
better). The total number of items equals
`n_items_processed * n_iterations`. The `items` field may also be set in
the description table from inside the payload function. The results
(real time, CPU time, iterations, items/s) are reported in a format
similar to the Google Benchmark suite [1].
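
A worked example of the metric, using the same arithmetic as the module
(total items divided by the real time, rounded down). The helper name
and the numbers are hypothetical:

```lua
-- Items are counted over all iterations and divided by the
-- wall-clock time.
local function items_per_second(items, iterations, real_time)
  return math.floor(items * iterations / real_time)
end

-- Hypothetical run: 300^3 items per iteration, 5 iterations,
-- 2.5 seconds of wall-clock time.
local ips = items_per_second(300 ^ 3, 5, 2.5)
print(string.format('items_per_second=%.3fM/s', ips / 1e6))
```

Here 27e6 items times 5 iterations over 2.5 seconds gives 54e6 items
per second.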

Each test may be run from the command line as follows:
| LUA_PATH="..." luajit test_name.lua [flags] arguments

The supported flags are:
| -j{off|on}                 Disable/Enable JIT for the benchmarks.
| --benchmark_color={true|false|auto}
|                            Enables the colorized output for the
|                            terminal (not the file).
| --benchmark_min_time={number}
|                            Minimum seconds to run the benchmark
|                            tests. 4.0 by default.
| --benchmark_out=<file>     Places the output into <file>.
| --benchmark_out_format={console|json}
|                            The format used when saving the results
|                            to the file. JSON by default.
| -h, --help                 Display help message and exit.

These options are similar to the Google Benchmark command line options,
but with a few changes:
1) If an output file is given, there is no output in the terminal.
2) The min_time option supports only number values. There is no support
   for the iterations number (by the 'x' suffix).
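
Long flags are accepted both as `--option=value` and as
`--option value`. A minimal sketch of the split, reusing the pattern
from the patch (the function name is hypothetical, and taking the value
from the next argument is left out):

```lua
-- Splits a long option into its name and value. The value is nil
-- when it is expected in the next command-line argument.
local function split_long_option(a)
  -- Remove dashes.
  local opt = a:sub(3)
  if opt:find('=', 1, true) then
    return opt:match('^([^=]+)=(.*)$')
  end
  return opt, nil
end

local name, value = split_long_option('--benchmark_min_time=2.5')
local help_name, help_value = split_long_option('--help')
print(name, tonumber(value), help_name, help_value)
```

The value stays a string at this point; a handler such as the one for
`--benchmark_min_time` converts it with `tonumber()`.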

[1]: https://github.com/google/benchmark
---
 perf/utils/bench.lua | 509 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 509 insertions(+)
 create mode 100644 perf/utils/bench.lua

diff --git a/perf/utils/bench.lua b/perf/utils/bench.lua
new file mode 100644
index 00000000..68473215
--- /dev/null
+++ b/perf/utils/bench.lua
@@ -0,0 +1,509 @@
+local clock = require('clock')
+local ffi = require('ffi')
+-- Require 'cjson' only on demand for formatted output to file.
+local json
+
+local M = {}
+
+local type, assert, error = type, assert, error
+local format, rep = string.format, string.rep
+local floor, max, min = math.floor, math.max, math.min
+local table_remove = table.remove
+
+local LJ_HASJIT = jit and jit.opt
+
+-- Argument parsing. ---------------------------------------------
+
+-- XXX: Make options compatible with Google Benchmark, since most
+-- probably it will be used for the C benchmarks as well.
+-- Compatibility isn't full: there is no support for environment
+-- variables (since they are not so useful) and the output to the
+-- terminal is suppressed if the --benchmark_out flag is
+-- specified.
+
+local HELP_MSG = [[
+ Options:
+   -j{off|on}                 Disable/Enable JIT for the benchmarks.
+   --benchmark_color={true|false|auto}
+                              Enables the colorized output for the terminal (not
+                              the file). 'auto' means to use colors if the
+                              output is being sent to a terminal and the TERM
+                              environment variable is set to a terminal type
+                              that supports colors.
+   --benchmark_min_time={number}
+                              Minimum seconds to run the benchmark tests.
+                              4.0 by default.
+   --benchmark_out=<file>     Places the output into <file>.
+   --benchmark_out_format={console|json}
+                              The format used when saving the results to the
+                              file. JSON by default.
+   -h, --help                 Display this message and exit.
+
+ There are a bunch of suggestions on how to achieve the most
+ stable benchmark results:
+ https://github.com/tarantool/tarantool/wiki/Benchmarking
+]]
+
+local function usage(ctx)
+  local header = format('USAGE: luajit %s [options]\n', ctx.name)
+  io.stderr:write(header, HELP_MSG)
+  os.exit(1)
+end
+
+local function check_param(check, strfmt, ...)
+  if not check then
+    io.stderr:write(format(strfmt, ...))
+    os.exit(1)
+  end
+end
+
+-- The values 'false'/'no'/'0' disable the colors.
+-- Any other value (including an invalid one) is treated as 'auto'.
+local function set_color(ctx, value)
+  if value == 'false' or value == 'no' or value == '0' then
+    ctx.color = false
+  else
+    -- In case of an invalid value, the Google Benchmark uses
+    -- 'auto', which is true for the stdout output (the only
+    -- colorizable output). So just set it to true by default.
+    ctx.color = true
+  end
+end
+
+local DEFAULT_MIN_TIME = 4.0
+local function set_min_time(ctx, value)
+  local time = tonumber(value)
+  check_param(time, 'Invalid min time: "%s"\n', value)
+  ctx.min_time = time
+end
+
+local function set_output(ctx, filename)
+  check_param(type(filename) == "string", 'Invalid output value: "%s"\n',
+              filename)
+  ctx.output = filename
+end
+
+-- Determine the output format for the benchmark.
+-- Supports only 'console' and 'json' for now.
+local function set_output_format(ctx, value)
+  local output_format = tostring(value)
+  check_param(output_format, 'Invalid output format: "%s"\n', value)
+  output_format = output_format:lower()
+  check_param(output_format == 'json' or output_format == 'console',
+              'Unsupported output format: "%s"\n', output_format)
+  ctx.output_format = output_format
+end
+
+local function set_jit(ctx, value)
+  check_param(value == 'on' or value == 'off',
+             'Invalid jit value: "%s"\n', value)
+  if value == 'off' then
+    ctx.jit = false
+  elseif value == 'on' then
+    ctx.jit = true
+  end
+end
+
+local function unrecognized_option(optname, dashes)
+  local fullname = dashes .. (optname or '=')
+  io.stderr:write(format('unrecognized command-line flag: %s\n', fullname))
+  io.stderr:write(HELP_MSG)
+  os.exit(1)
+end
+
+local function unrecognized_long_option(_, optname)
+  unrecognized_option(optname, '--')
+end
+
+local function unrecognized_short_option(_, optname)
+  unrecognized_option(optname, '-')
+end
+
+local SHORT_OPTS = setmetatable({
+  ['h'] = usage,
+  ['j'] = set_jit,
+}, {__index = unrecognized_short_option})
+
+local LONG_OPTS = setmetatable({
+  ['benchmark_color'] = set_color,
+  ['benchmark_min_time'] = set_min_time,
+  ['benchmark_out'] = set_output,
+  -- XXX: For now support only JSON encoded and raw output.
+  ['benchmark_out_format'] = set_output_format,
+  ['help'] = usage,
+}, {__index = unrecognized_long_option})
+
+local function is_option(str)
+  return type(str) == 'string' and str:sub(1, 1) == '-' and str ~= '-'
+end
+
+local function next_arg_value(arg, n)
+  local opt_value = nil
+  if arg[n] and not is_option(arg[n]) then
+    opt_value = arg[n]
+    table_remove(arg, n)
+  end
+  return opt_value
+end
+
+local function parse_long_option(arg, a, n)
+  local opt_name, opt_value
+  -- Remove dashes.
+  local opt = a:sub(3)
+  -- --option=value
+  if opt:find('=', 1, true) then
+    -- May match empty option name and/or value.
+    opt_name, opt_value = opt:match('^([^=]+)=(.*)$')
+  else
+    -- --option value
+    opt_name = opt
+    opt_value = next_arg_value(arg, n)
+  end
+  return opt_name, opt_value
+end
+
+local function parse_short_option(arg, a, n)
+  local opt_name, opt_value
+  -- Remove the dash.
+  local opt = a:sub(2)
+  if #opt == 1 then
+    -- -o value
+    opt_name = opt
+    opt_value = next_arg_value(arg, n)
+  else
+    -- -ovalue.
+    opt_name = opt:sub(1, 1)
+    opt_value = opt:sub(2)
+  end
+  return opt_name, opt_value
+end
+
+local function parse_opt(ctx, arg, a, n)
+  if a:sub(1, 2) == '--' then
+    local opt_name, opt_value = parse_long_option(arg, a, n)
+    LONG_OPTS[opt_name](ctx, opt_value)
+  else
+    local opt_name, opt_value = parse_short_option(arg, a, n)
+    SHORT_OPTS[opt_name](ctx, opt_value)
+  end
+end
+
+-- Process the options and update the benchmark context.
+local function argparse(arg, name)
+  local ctx = {name = name}
+  local n = 1
+  while n <= #arg do
+    local a = arg[n]
+    if is_option(a) then
+      table_remove(arg, n)
+      parse_opt(ctx, arg, a, n)
+    else
+      -- Just ignore it.
+      n = n + 1
+    end
+  end
+  return ctx
+end
+
+-- Formatting. ---------------------------------------------------
+
+local function format_console_header()
+  -- Use a similar format to the Google Benchmark, except for the
+  -- fixed benchmark name length.
+  local header = format('%-37s %12s %15s %13s %-28s\n',
+    'Benchmark', 'Time', 'CPU', 'Iterations', 'UserCounters...'
+  )
+  local border = rep('-', #header - 1) .. '\n'
+  return border .. header .. border
+end
+
+local COLORS = {
+  GREEN = '\027[32m%s\027[m',
+  YELLOW = '\027[33m%s\027[m',
+  CYAN = '\027[36m%s\027[m',
+}
+
+local function format_name(ctx, name)
+  name = format('%-37s ', name)
+  if ctx.color then
+     name = format(COLORS.GREEN, name)
+  end
+  return name
+end
+
+local function format_time(ctx, real_time, cpu_time, time_unit)
+  local timestr = format('%10.2f %-4s %10.2f %-4s ', real_time, time_unit,
+                         cpu_time, time_unit)
+  if ctx.color then
+     timestr = format(COLORS.YELLOW, timestr)
+  end
+  return timestr
+end
+
+local function format_iterations(ctx, iterations)
+  iterations = format('%10d ', iterations)
+  if ctx.color then
+     iterations = format(COLORS.CYAN, iterations)
+  end
+  return iterations
+end
+
+local function format_ips(ips)
+  local ips_str
+  if ips / 1e6 > 1 then
+    ips_str = format('items_per_second=%.3fM/s', ips / 1e6)
+  elseif ips / 1e3 > 1 then
+    ips_str = format('items_per_second=%.3fk/s', ips / 1e3)
+  else
+    ips_str = format('items_per_second=%d/s', ips)
+  end
+  return ips_str
+end
+
+local function format_result_console(ctx, r)
+  return format('%s%s%s%s\n',
+    format_name(ctx, r.name),
+    format_time(ctx, r.real_time, r.cpu_time, r.time_unit),
+    format_iterations(ctx, r.iterations),
+    format_ips(r.items_per_second)
+  )
+end
+
+local function format_results(ctx)
+  local output_format = ctx.output_format
+  local res = ''
+  if output_format == 'json' then
+    res = json.encode({
+      benchmarks = ctx.results,
+      context = ctx.context,
+    })
+  else
+    assert(output_format == 'console', 'Unknown format: ' .. output_format)
+    res = res .. format_console_header()
+    for _, r in ipairs(ctx.results) do
+      res = res .. format_result_console(ctx, r)
+    end
+  end
+  return res
+end
+
+local function report_results(ctx)
+  ctx.fh:write(format_results(ctx))
+end
+
+-- Tests setup and run. ------------------------------------------
+
+local function term_is_color()
+  local term = os.getenv('TERM')
+  return (term and term:match('color') or os.getenv('COLORTERM'))
+end
+
+local function benchmark_context(ctx)
+  return {
+    arch = jit.arch,
+    -- Google Benchmark reports a date in ISO 8601 format.
+    date = os.date('%Y-%m-%dT%H:%M:%S%z'),
+    gc64 = ffi.abi('gc64'),
+    host_name = io.popen('hostname'):read(),
+    jit = ctx.jit,
+  }
+end
+
+local function init(ctx)
+  -- Array of benches to proceed with.
+  ctx.benches = {}
+  -- Array of the corresponding results.
+  ctx.results = {}
+
+  if ctx.jit == nil then
+    if LJ_HASJIT then
+      ctx.jit = jit.status()
+    else
+      ctx.jit = false
+    end
+  end
+  ctx.color = ctx.color == nil and true or ctx.color
+  if ctx.output then
+    -- Don't bother with manual file closing. It will be closed
+    -- automatically when the corresponding object is
+    -- garbage-collected.
+    ctx.fh = assert(io.open(ctx.output, 'w+'))
+    ctx.output_format = ctx.output_format or 'json'
+    -- Always without color.
+    ctx.color = false
+  else
+    ctx.fh = io.stdout
+    -- Always console output to the terminal.
+    ctx.output_format = 'console'
+    if ctx.color and term_is_color() then
+      ctx.color = true
+    else
+      ctx.color = false
+    end
+  end
+  ctx.min_time = ctx.min_time or DEFAULT_MIN_TIME
+
+  if ctx.output_format == 'json' then
+    json = require('cjson')
+  end
+
+  -- Google Benchmark's context, plus benchmark info.
+  ctx.context = benchmark_context(ctx)
+
+  return ctx
+end
+
+local function test_name()
+  return debug.getinfo(3, 'S').short_src:match('([^/\\]+)$')
+end
+
+local function add_bench(ctx, bench)
+  if bench.checker == nil and not bench.skip_check then
+    error('Bench requires a checker to prove the results', 2)
+  end
+  table.insert(ctx.benches, bench)
+end
+
+local MAX_ITERATIONS = 1e9
+-- Determine the number of iterations for the next benchmark run.
+local function iterations_multiplier(min_time, get_time, iterations)
+  -- When the last run is at least 10% of the required time, the
+  -- maximum expansion should be 14x.
+  local multiplier = min_time * 1.4 / max(get_time, 1e-9)
+  local is_significant = get_time / min_time > 0.1
+  multiplier = is_significant and multiplier or 10
+  local new_iterations = max(floor(multiplier * iterations), iterations + 1)
+  return min(new_iterations, MAX_ITERATIONS)
+end
+
+-- https://luajit.org/running.html#foot.
+local JIT_DEFAULTS = {
+  maxtrace = 1000,
+  maxrecord = 4000,
+  maxirconst = 500,
+  maxside = 100,
+  maxsnap = 500,
+  hotloop = 56,
+  hotexit = 10,
+  tryside = 4,
+  instunroll = 4,
+  loopunroll = 15,
+  callunroll = 3,
+  recunroll = 2,
+  sizemcode = 32,
+  maxmcode = 512,
+}
+
+-- Basic setup for all tests to clean up after a previous
+-- executor.
+local function luajit_tests_setup(ctx)
+  -- Reset the JIT to the defaults.
+  if ctx.jit == false then
+    jit.off()
+  elseif LJ_HASJIT then
+    jit.on()
+    jit.flush()
+    jit.opt.start(3)
+    for k, v in pairs(JIT_DEFAULTS) do
+      jit.opt.start(k .. '=' .. v)
+    end
+  end
+
+  -- Reset the GC to the defaults.
+  collectgarbage('setstepmul', 200)
+  collectgarbage('setpause', 200)
+
+  -- Collect all garbage at the end. Twice to be sure that all
+  -- finalizers are run.
+  collectgarbage()
+  collectgarbage()
+end
+
+local function run_benches(ctx)
+  -- Process the tests in the predefined order with ipairs.
+  for _, bench in ipairs(ctx.benches) do
+    luajit_tests_setup(ctx)
+    if bench.setup then bench.setup() end
+
+    -- The first run is used as a warm-up, plus results checks.
+    local payload = bench.payload
+    -- Generally you should never skip any checks. But sometimes
+    -- a bench may generate so much output in one run that it is
+    -- overkill to save the result in the file and test it.
+    -- So to avoid double time for the test run, just skip the
+    -- check.
+    if not bench.skip_check then
+      local result = payload()
+      assert(bench.checker(result))
+    end
+    local N
+    local delta_real, delta_cpu
+    -- Iterations are specified manually.
+    if bench.iterations then
+      N = bench.iterations
+
+      local start_real = clock.realtime()
+      local start_cpu  = clock.process_cputime()
+      for _ = 1, N do
+        payload()
+      end
+      delta_real = clock.realtime() - start_real
+      delta_cpu  = clock.process_cputime() - start_cpu
+    else
+      -- Iterations are determined dynamically, adjusting to fit
+      -- the minimum time to run for the benchmark.
+      local min_time = bench.min_time or ctx.min_time
+      local next_iterations = 1
+      repeat
+        N = next_iterations
+        local start_real = clock.realtime()
+        local start_cpu  = clock.process_cputime()
+        for _ = 1, N do
+          payload()
+        end
+        delta_real = clock.realtime() - start_real
+        delta_cpu  = clock.process_cputime() - start_cpu
+        next_iterations = iterations_multiplier(min_time, delta_real, N)
+      until delta_real > min_time or N == next_iterations
+    end
+
+    if bench.teardown then bench.teardown() end
+
+    local items = N * bench.items
+    local items_per_second = math.floor(items / delta_real)
+    table.insert(ctx.results, {
+      cpu_time = delta_cpu,
+      real_time = delta_real,
+      items_per_second = items_per_second,
+      iterations = N,
+      name = bench.name,
+      time_unit = 's',
+      -- Fields below are used only for the Google Benchmark
+      -- compatibility. We don't use them really.
+      run_name = bench.name,
+      run_type = 'iteration',
+      repetitions = 1,
+      repetition_index = 1,
+      threads = 1,
+    })
+  end
+end
+
+local function run_and_report(ctx)
+  run_benches(ctx)
+  report_results(ctx)
+end
+
+function M.new(arg)
+  assert(type(arg) == 'table', 'given argument should be a table')
+  local name = test_name()
+  local ctx = init(argparse(arg, name))
+  return setmetatable(ctx, {__index = {
+    add = add_bench,
+    run = run_benches,
+    report = report_results,
+    run_and_report = run_and_report,
+  }})
+end
+
+return M
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (2 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees " Sergey Kaplun via Tarantool-patches
                   ` (36 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. Arguments can still be passed to
the script on the command line.

The number of iterations is fixed for this test to avoid OOM errors
for the non-GC64 builds.
---
 perf/LuaJIT-benches/array3d.lua | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
index c10b09b1..75ab5b01 100644
--- a/perf/LuaJIT-benches/array3d.lua
+++ b/perf/LuaJIT-benches/array3d.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function array_set(self, x, y, z, p)
   assert(x >= 0 and x < self.nx, "x outside PA")
@@ -50,10 +51,24 @@ end
 
 local dim = tonumber(arg and arg[1]) or 300 -- Array dimension dim^3
 local packed = arg and arg[2] == "packed"   -- Packed image or flat
-local arr = array_new(dim, dim, dim, packed)
 
-for x,y,z in arr:points() do
-  arr:set(x, y, z, x*x)
-end
-assert(arr.image[dim^3-1] == (dim-1)^2)
+bench:add({
+  name = "array3d",
+  checker = function(arr)
+    assert(arr.image[dim^3-1] == (dim-1)^2)
+    return true
+  end,
+  payload = function()
+    local arr = array_new(dim, dim, dim, packed)
+    for x,y,z in arr:points() do
+      arr:set(x, y, z, x*x)
+    end
+    return arr
+  end,
+  items = dim * dim * dim,
+  -- Limit the number of iterations to avoid OOM errors for
+  -- non-GC64 builds.
+  iterations = 5,
+})
 
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (3 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos " Sergey Kaplun via Tarantool-patches
                   ` (35 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. Arguments can still be passed to
the script on the command line.

The test cases are split by the different types of trees:
1) stretched tree,
2) long-lived tree,
3) several sets of trees with depths from `mindepth` to `maxdepth` in
   steps of 2; the number of trees in each set is a power of 2,
4) all sets from the third test case in a single run.

The number of items is the number of `ItemCheck()` first-level calls
performed in the payload.
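
A worked sketch of that count for the `tree_depth_*` cases, mirroring
the loop bounds used in the patch. The `count_items()` helper is
hypothetical. With the default `N = 16` (so `mindepth = 4` and
`maxdepth = 16`) the total agrees with the closed-form `items`
expression of the `all_in_once` case; other values of `N` are not
checked here.

```lua
local function count_items(mindepth, maxdepth)
  local total = 0
  for depth = mindepth, maxdepth, 2 do
    local iterations = 2 ^ (maxdepth - depth + mindepth)
    -- Two first-level ItemCheck() calls per iteration.
    total = total + iterations * 2
  end
  return total
end

local mindepth, maxdepth = 4, 16
local total = count_items(mindepth, maxdepth)
-- Closed-form expression as written in the patch.
local closed =
  (2 * maxdepth) * (4 ^ ((maxdepth - mindepth) / 2 + 1) - 1) / 3
print(total, closed)
```

Both values are 174752 for these depths: the seven sets contribute
2 * (2^16 + 2^14 + ... + 2^4) calls.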
---

I'm not sure that we should distinguish different subtests here.
OTOH, how else can the number of items be calculated correctly for the
whole test?

 perf/LuaJIT-benches/binary-trees.lua | 94 ++++++++++++++++++++++------
 1 file changed, 76 insertions(+), 18 deletions(-)

diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
index bf040466..9d4dc7b4 100644
--- a/perf/LuaJIT-benches/binary-trees.lua
+++ b/perf/LuaJIT-benches/binary-trees.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function BottomUpTree(item, depth)
   if depth > 0 then
@@ -18,30 +19,87 @@ local function ItemCheck(tree)
   end
 end
 
-local N = tonumber(arg and arg[1]) or 0
+local N = tonumber(arg and arg[1]) or 16
 local mindepth = 4
 local maxdepth = mindepth + 2
 if maxdepth < N then maxdepth = N end
 
-do
-  local stretchdepth = maxdepth + 1
-  local stretchtree = BottomUpTree(0, stretchdepth)
-  io.write(string.format("stretch tree of depth %d\t check: %d\n",
-    stretchdepth, ItemCheck(stretchtree)))
-end
+local stretchdepth = maxdepth + 1
+
+bench:add({
+  name = "stretch_depth_" .. tostring(stretchdepth),
+  payload = function()
+    local stretchtree = BottomUpTree(0, stretchdepth)
+    local check = ItemCheck(stretchtree)
+    return check
+  end,
+  items = 1,
+  checker = function(check)
+    return check == -1
+  end,
+})
 
-local longlivedtree = BottomUpTree(0, maxdepth)
+-- This tree created once on the setup for the first test.
+local longlivedtree
 
-for depth=mindepth,maxdepth,2 do
+for depth = mindepth, maxdepth, 2 do
   local iterations = 2 ^ (maxdepth - depth + mindepth)
-  local check = 0
-  for i=1,iterations do
-    check = check + ItemCheck(BottomUpTree(1, depth)) +
-            ItemCheck(BottomUpTree(-1, depth))
-  end
-  io.write(string.format("%d\t trees of depth %d\t check: %d\n",
-    iterations*2, depth, check))
+  local tree_bench
+  tree_bench = {
+    name = "tree_depth_" .. tostring(depth),
+    setup = function()
+      if not longlivedtree then
+        longlivedtree = BottomUpTree(0, maxdepth)
+      end
+      tree_bench.items = iterations * 2
+    end,
+    checker = function(check)
+      return check == -iterations * 2
+    end,
+    payload = function()
+      local check = 0
+      for i = 1, iterations do
+        check = check + ItemCheck(BottomUpTree(1, depth)) +
+                ItemCheck(BottomUpTree(-1, depth))
+      end
+      return check
+    end,
+  }
+
+  bench:add(tree_bench)
 end
 
-io.write(string.format("long lived tree of depth %d\t check: %d\n",
-  maxdepth, ItemCheck(longlivedtree)))
+bench:add({
+  name = "longlived_depth_" .. tostring(maxdepth),
+  payload = function()
+    local check = ItemCheck(longlivedtree)
+    return check
+  end,
+  items = 1,
+  checker = function(check)
+    return check == -1
+  end,
+})
+
+bench:add({
+  name = "all_in_once",
+  payload = function()
+    for depth = mindepth, maxdepth, 2 do
+      local iterations = 2 ^ (maxdepth - depth + mindepth)
+      local tree_bench
+      local check = 0
+      for i = 1, iterations do
+        check = check + ItemCheck(BottomUpTree(1, depth)) +
+                ItemCheck(BottomUpTree(-1, depth))
+      end
+      assert(check == -iterations * 2)
+    end
+  end,
+  -- Geometric progression, starting at maxdepth trees with the
+  -- corresponding step.
+  items = (2 * maxdepth) * (4 ^ ((maxdepth - mindepth) / 2 + 1) - 1) / 3,
+  -- Correctness is checked in the payload function.
+  skip_check = true,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (4 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-13 11:11   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring " Sergey Kaplun via Tarantool-patches
                   ` (34 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. Arguments can still be passed to
the script on the command line.
---
 perf/LuaJIT-benches/chameneos.lua | 32 ++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
index 78b64c3f..c1002041 100644
--- a/perf/LuaJIT-benches/chameneos.lua
+++ b/perf/LuaJIT-benches/chameneos.lua
@@ -1,8 +1,10 @@
+local bench = require("bench").new(arg)
 
 local co = coroutine
 local create, resume, yield = co.create, co.resume, co.yield
 
-local N = tonumber(arg and arg[1]) or 10
+local N = tonumber(arg and arg[1]) or 1e7
+local N_ATTEMPTS = N
 local first, second
 
 -- Meet another creature.
@@ -57,12 +59,24 @@ local function schedule(threads)
   until false
 end
 
--- A bunch of colorful creatures.
-local threads = {
-  creature("blue"),
-  creature("red"),
-  creature("yellow"),
-  creature("blue"),
-}
+bench:add({
+  name = "chameneos",
+  items = N_ATTEMPTS,
+  checker = function(meetings) return meetings == N_ATTEMPTS * 2 end,
+  payload = function()
+    -- A bunch of colorful creatures.
+    local threads = {
+      creature("blue"),
+      creature("red"),
+      creature("yellow"),
+      creature("blue"),
+    }
 
-io.write(schedule(threads), "\n")
+    local meetings = schedule(threads)
+    -- XXX: Restore meetings for the next iteration.
+    N = N_ATTEMPTS
+    return meetings
+  end,
+})
+
+bench:run_and_report()
-- 
2.51.0


* [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (5 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-13 11:17   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit " Sergey Kaplun via Tarantool-patches
                   ` (33 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.
---
 perf/LuaJIT-benches/coroutine-ring.lua | 45 ++++++++++++++++----------
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
index 1e8c5ef6..1b86a5ba 100644
--- a/perf/LuaJIT-benches/coroutine-ring.lua
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -1,3 +1,5 @@
+local bench = require("bench").new(arg)
+
 -- The Computer Language Benchmarks Game
 -- http://shootout.alioth.debian.org/
 -- contributed by Sam Roberts
@@ -7,7 +9,6 @@ local n         = tonumber(arg and arg[1]) or 2e7
 
 -- fixed size pool
 local poolsize  = 503
-local threads   = {}
 
 -- cache these to avoid global environment lookups
 local create    = coroutine.create
@@ -15,7 +16,6 @@ local resume    = coroutine.resume
 local yield     = coroutine.yield
 
 local id        = 1
-local token     = 0
 local ok
 
 local body = function(token)
@@ -24,19 +24,30 @@ local body = function(token)
   end
 end
 
--- create all threads
-for id = 1, poolsize do
-  threads[id] = create(body)
-end
-
--- send the token
-repeat
-  if id == poolsize then
-    id = 1
-  else
-    id = id + 1
-  end
-  ok, token = resume(threads[id], token)
-until token == n
+bench:add({
+  name = "coroutine_ring",
+  payload = function()
+    local token     = 0
+    -- create all threads
+    local threads   = {}
+    for id = 1, poolsize do
+      threads[id] = create(body)
+    end
+
+    -- send the token
+    repeat
+      if id == poolsize then
+        id = 1
+      else
+        id = id + 1
+      end
+      ok, token = resume(threads[id], token)
+    until token == n
+    return id
+  end,
+  checker = function(id) return id == (n % poolsize + 1) end,
+  items = n,
+})
+
+bench:run_and_report()
 
-io.write(id, "\n")
-- 
2.51.0
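
The checker above expects the final `id` to equal `n % poolsize + 1`. A minimal sketch of the same ring with small numbers shows where that expression comes from. The function name `ring` is hypothetical, and the coroutine body is assumed to add one to the token on every resume, as the visible context suggests.

```lua
local create, resume, yield =
  coroutine.create, coroutine.resume, coroutine.yield

-- Pass a token around a ring of `poolsize` coroutines until it
-- reaches `n`, and return the index of the last resumed thread.
local function ring(n, poolsize)
  -- Assumed body: each resume increments the token by one.
  local body = function(token)
    while true do
      token = yield(token + 1)
    end
  end

  local threads = {}
  for i = 1, poolsize do
    threads[i] = create(body)
  end

  -- All state is local, so repeated calls start from scratch.
  local id, token, ok = 1, 0, nil
  repeat
    id = (id == poolsize) and 1 or id + 1
    ok, token = resume(threads[id], token)
  until token == n
  return id
end

print(ring(10, 3), 10 % 3 + 1)
```

Starting from `id = 1`, every resume advances `id` by one position modulo `poolsize`, so after `n` resumes it lands on `n % poolsize + 1`. In the patch `id` remains a file-level upvalue, so a second payload run would start from the previous final value. Whether that matters depends on how the bench module invokes the payload and the checker, which is not shown here.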


* [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (6 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-13 11:44   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch " Sergey Kaplun via Tarantool-patches
                   ` (32 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.
---
 perf/LuaJIT-benches/euler14-bit.lua | 52 ++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
index 537f2bf3..7c521deb 100644
--- a/perf/LuaJIT-benches/euler14-bit.lua
+++ b/perf/LuaJIT-benches/euler14-bit.lua
@@ -1,22 +1,42 @@
+local bench = require("bench").new(arg)
 
 local bit = require("bit")
 local bnot, bor, band = bit.bnot, bit.bor, bit.band
 local shl, shr = bit.lshift, bit.rshift
 
-local N = tonumber(arg and arg[1]) or 10000000
-local cache, m, n = { 1 }, 1, 1
-if arg and arg[2] then cache = nil end
-for i=2,N do
-  local j = i
-  for len=1,1000000000 do
-    j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
-    if cache then
-      local x = cache[j]; if x then j = x+len; break end
-    elseif j == 1 then
-      j = len+1; break
+local DEFAULT_N = 2e7
+local N = tonumber(arg and arg[1]) or DEFAULT_N
+local drop_cache = arg and arg[2]
+
+bench:add({
+  name = "euler14_bit",
+  payload = function()
+    local cache, m, n = { 1 }, 1, 1
+    if drop_cache then cache = nil end
+    for i=2,N do
+      local j = i
+      for len=1,1000000000 do
+        j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
+        if cache then
+          local x = cache[j]; if x then j = x+len; break end
+        elseif j == 1 then
+          j = len+1; break
+        end
+      end
+      if cache then cache[i] = j end
+      if j > m then m, n = j, i end
+    end
+    return {n = n, m = m}
+  end,
+  checker = function(res)
+    if N ~= DEFAULT_N then
+      -- Test only for the default.
+      return true
+    else
+      return res.n == 18064027 and res.m == 623
     end
-  end
-  if cache then cache[i] = j end
-  if j > m then m, n = j, i end
-end
-io.write("Found ", n, " (chain length: ", m, ")\n")
+  end,
+  items = N,
+})
+
+bench:run_and_report()
-- 
2.51.0
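
The bit expression in the payload is a branchless Collatz step: for even `j` the first mask is all ones and the result is `shr(j,1)`, for odd `j` the second mask is all ones and the result is `shl(j,1)+j+1`. A minimal sketch of the same cached search in plain arithmetic follows. The names `collatz_step` and `longest_chain` are hypothetical, and unlike the benchmark this version has no 32-bit wraparound, so its results for large N need not match the checker's constants.

```lua
-- One Collatz step: n / 2 for even n, 3n + 1 for odd n.
local function collatz_step(j)
  if j % 2 == 0 then
    return math.floor(j / 2)
  end
  return 3 * j + 1
end

-- Return the start value in 2..N with the longest chain and the
-- chain length in terms, caching lengths like the payload does.
local function longest_chain(N)
  local cache, m, n = { 1 }, 1, 1
  for i = 2, N do
    local j, len = i, 0
    while true do
      len = len + 1
      j = collatz_step(j)
      local x = cache[j]
      -- A cached value ends the walk: total = cached + steps so far.
      if x then j = x + len; break end
    end
    cache[i] = j
    if j > m then m, n = j, i end
  end
  return n, m
end

print(longest_chain(30))
```

For N = 30 the search reports the start value 27, whose chain has 112 terms.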


* [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (7 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta " Sergey Kaplun via Tarantool-patches
                   ` (31 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.
---

I'm not sure that the number of permutations is the correct items
count. Do you have any other suggestions?

 perf/LuaJIT-benches/fannkuch.lua | 37 +++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
index 2a4cd426..c963c66f 100644
--- a/perf/LuaJIT-benches/fannkuch.lua
+++ b/perf/LuaJIT-benches/fannkuch.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function fannkuch(n)
   local p, q, s, odd, check, maxflips = {}, {}, {}, true, 0, 0
@@ -6,7 +7,7 @@ local function fannkuch(n)
     -- Print max. 30 permutations.
     if check < 30 then
       if not p[n] then return maxflips end	-- Catch n = 0, 1, 2.
-      io.write(unpack(p)); io.write("\n")
+      -- io.write(unpack(p)); io.write("\n")
       check = check + 1
     end
     -- Copy and flip.
@@ -46,5 +47,35 @@ local function fannkuch(n)
   until false
 end
 
-local n = tonumber(arg and arg[1]) or 1
-io.write("Pfannkuchen(", n, ") = ", fannkuch(n), "\n")
+local n = tonumber(arg and arg[1]) or 11
+
+-- Precomputed numbers taken from:
+-- https://dl.acm.org/doi/pdf/10.1145/382109.382124
+local FANNKUCH = { 0, 1, 2, 4, 7, 10, 16, 22, 30, 38, 51, 65, 80 }
+
+local function factorial(n)
+  local fact = 1
+  for i = 2, n do
+    fact = fact * i
+  end
+  return fact
+end
+
+bench:add({
+  name = "fannkuch",
+  payload = function()
+    return fannkuch(n)
+  end,
+  checker = function(res)
+    if n > #FANNKUCH then
+      -- Not precomputed, so can't check.
+      return true
+    else
+      return res == FANNKUCH[n]
+    end
+  end,
+  -- Assume that we count permutations here.
+  items = factorial(n),
+})
+
+bench:run_and_report()
-- 
2.51.0
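
The checker compares the result with a table of precomputed maximum flip counts, and `items` is taken as n! permutations. A minimal brute-force sketch that reproduces both numbers for small n is given below. The names `count_flips` and `max_flips` are hypothetical, and this is not the algorithm used by the benchmark itself.

```lua
-- Count pancake flips for one permutation of 1..n (input is copied).
local function count_flips(perm)
  local p = {}
  for i = 1, #perm do p[i] = perm[i] end
  local flips = 0
  local k = p[1]
  while k ~= 1 do
    -- Reverse the first k elements.
    local i, j = 1, k
    while i < j do
      p[i], p[j] = p[j], p[i]
      i, j = i + 1, j - 1
    end
    flips = flips + 1
    k = p[1]
  end
  return flips
end

-- Visit every permutation of 1..n by recursive swapping; return the
-- maximum flip count and the number of permutations visited.
local function max_flips(n)
  local perm, best, visited = {}, 0, 0
  for i = 1, n do perm[i] = i end
  local function permute(k)
    if k > n then
      visited = visited + 1
      local f = count_flips(perm)
      if f > best then best = f end
      return
    end
    for i = k, n do
      perm[k], perm[i] = perm[i], perm[k]
      permute(k + 1)
      perm[k], perm[i] = perm[i], perm[k]
    end
  end
  permute(1)
  return best, visited
end

print(max_flips(7))
```

For n = 7 this visits 5040 permutations and finds a maximum of 16 flips, which matches the seventh entry of the FANNKUCH table.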


* [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (8 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide " Sergey Kaplun via Tarantool-patches
                   ` (30 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.

Since the output of this benchmark (with a different input parameter
value) is used by other benchmarks (<k-nucleotide.lua> and
<revcomp.lua>), the original script is kept as a library (inside the
<libs/> subdirectory) with the updated default input value, and it now
returns the number of items processed. The output of the benchmark
itself is suppressed and not checked, since it is impractical to store
such huge files in the repository for testing.
---
 perf/LuaJIT-benches/fasta.lua      | 120 +++++++----------------------
 perf/LuaJIT-benches/libs/fasta.lua |  98 +++++++++++++++++++++++
 2 files changed, 125 insertions(+), 93 deletions(-)
 create mode 100644 perf/LuaJIT-benches/libs/fasta.lua

diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
index 7ce60804..d0dc005d 100644
--- a/perf/LuaJIT-benches/fasta.lua
+++ b/perf/LuaJIT-benches/fasta.lua
@@ -1,95 +1,29 @@
-
-local Last = 42
-local function random(max)
-  local y = (Last * 3877 + 29573) % 139968
-  Last = y
-  return (max * y) / 139968
-end
-
-local function make_repeat_fasta(id, desc, s, n)
-  local write, sub = io.write, string.sub
-  write(">", id, " ", desc, "\n")
-  local p, sn, s2 = 1, #s, s..s
-  for i=60,n,60 do
-    write(sub(s2, p, p + 59), "\n")
-    p = p + 60; if p > sn then p = p - sn end
-  end
-  local tail = n % 60
-  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
-end
-
-local function make_random_fasta(id, desc, bs, n)
-  io.write(">", id, " ", desc, "\n")
-  loadstring([=[
-    local write, char, unpack, n, random = io.write, string.char, unpack, ...
-    local buf, p = {}, 1
-    for i=60,n,60 do
-      for j=p,p+59 do ]=]..bs..[=[ end
-      buf[p+60] = 10; p = p + 61
-      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
-    end
-    local tail = n % 60
-    if tail > 0 then
-      for j=p,p+tail-1 do ]=]..bs..[=[ end
-      p = p + tail; buf[p] = 10; p = p + 1
-    end
-    write(char(unpack(buf, 1, p-1)))
-  ]=], desc)(n, random)
-end
-
-local function bisect(c, p, lo, hi)
-  local n = hi - lo
-  if n == 0 then return "buf[j] = "..c[hi].."\n" end
-  local mid = math.floor(n / 2)
-  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
-         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
-end
-
-local function make_bisect(tab)
-  local c, p, sum = {}, {}, 0
-  for i,row in ipairs(tab) do
-    c[i] = string.byte(row[1])
-    sum = sum + row[2]
-    p[i] = sum
-  end
-  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
-end
-
-local alu =
-  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
-  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
-  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
-  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
-  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
-  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
-  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
-
-local iub = make_bisect{
-  { "a", 0.27 },
-  { "c", 0.12 },
-  { "g", 0.12 },
-  { "t", 0.27 },
-  { "B", 0.02 },
-  { "D", 0.02 },
-  { "H", 0.02 },
-  { "K", 0.02 },
-  { "M", 0.02 },
-  { "N", 0.02 },
-  { "R", 0.02 },
-  { "S", 0.02 },
-  { "V", 0.02 },
-  { "W", 0.02 },
-  { "Y", 0.02 },
-}
-
-local homosapiens = make_bisect{
-  { "a", 0.3029549426680 },
-  { "c", 0.1979883004921 },
-  { "g", 0.1975473066391 },
-  { "t", 0.3015094502008 },
+local bench = require("bench").new(arg)
+
+local stdout = io.output()
+
+local benchmark
+benchmark = {
+  name = "fasta",
+  -- XXX: The result file may take up to 278 Mb for the default
+  -- settings. To check the correctness of the script, run it as
+  -- is from the console.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  payload = function()
+    -- Run the benchmark as is from the file.
+    local items = require("fasta")
+    -- Remove it from the cache to be sure the benchmark will run
+    -- at the next iteration.
+    package.loaded["fasta"] = nil
+    benchmark.items = items
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
 }
 
-local N = tonumber(arg and arg[1]) or 1000
-make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
-make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
-make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
+bench:add(benchmark)
+bench:run_and_report()
diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
new file mode 100644
index 00000000..9c72c244
--- /dev/null
+++ b/perf/LuaJIT-benches/libs/fasta.lua
@@ -0,0 +1,98 @@
+
+local Last = 42
+local function random(max)
+  local y = (Last * 3877 + 29573) % 139968
+  Last = y
+  return (max * y) / 139968
+end
+
+local function make_repeat_fasta(id, desc, s, n)
+  local write, sub = io.write, string.sub
+  write(">", id, " ", desc, "\n")
+  local p, sn, s2 = 1, #s, s..s
+  for i=60,n,60 do
+    write(sub(s2, p, p + 59), "\n")
+    p = p + 60; if p > sn then p = p - sn end
+  end
+  local tail = n % 60
+  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
+end
+
+local function make_random_fasta(id, desc, bs, n)
+  io.write(">", id, " ", desc, "\n")
+  loadstring([=[
+    local write, char, unpack, n, random = io.write, string.char, unpack, ...
+    local buf, p = {}, 1
+    for i=60,n,60 do
+      for j=p,p+59 do ]=]..bs..[=[ end
+      buf[p+60] = 10; p = p + 61
+      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
+    end
+    local tail = n % 60
+    if tail > 0 then
+      for j=p,p+tail-1 do ]=]..bs..[=[ end
+      p = p + tail; buf[p] = 10; p = p + 1
+    end
+    write(char(unpack(buf, 1, p-1)))
+  ]=], desc)(n, random)
+end
+
+local function bisect(c, p, lo, hi)
+  local n = hi - lo
+  if n == 0 then return "buf[j] = "..c[hi].."\n" end
+  local mid = math.floor(n / 2)
+  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
+         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
+end
+
+local function make_bisect(tab)
+  local c, p, sum = {}, {}, 0
+  for i,row in ipairs(tab) do
+    c[i] = string.byte(row[1])
+    sum = sum + row[2]
+    p[i] = sum
+  end
+  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
+end
+
+local alu =
+  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
+  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
+  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
+  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
+  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
+  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
+  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
+
+local iub = make_bisect{
+  { "a", 0.27 },
+  { "c", 0.12 },
+  { "g", 0.12 },
+  { "t", 0.27 },
+  { "B", 0.02 },
+  { "D", 0.02 },
+  { "H", 0.02 },
+  { "K", 0.02 },
+  { "M", 0.02 },
+  { "N", 0.02 },
+  { "R", 0.02 },
+  { "S", 0.02 },
+  { "V", 0.02 },
+  { "W", 0.02 },
+  { "Y", 0.02 },
+}
+
+local homosapiens = make_bisect{
+  { "a", 0.3029549426680 },
+  { "c", 0.1979883004921 },
+  { "g", 0.1975473066391 },
+  { "t", 0.3015094502008 },
+}
+
+local N = tonumber(arg and arg[1]) or 25e6
+
+make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
+make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
+make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
+
+return N*2 + N*3 + N*5
-- 
2.51.0


* [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (9 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life " Sergey Kaplun via Tarantool-patches
                   ` (29 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.

The benchmark input is provided by redirecting the <FASTA_5000000>
file, generated by `libs/fasta.lua 5e6`, to the standard input. The
benchmark output is redirected to /dev/null. All checks compare the
results with the values precomputed for that file.
---
 perf/LuaJIT-benches/k-nucleotide.lua | 93 ++++++++++++++++++++++++----
 1 file changed, 82 insertions(+), 11 deletions(-)

diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
index 0bfb41be..ae51dae9 100644
--- a/perf/LuaJIT-benches/k-nucleotide.lua
+++ b/perf/LuaJIT-benches/k-nucleotide.lua
@@ -1,3 +1,4 @@
+local bench = require('bench').new(arg)
 
 local function kfrequency(seq, freq, k, frame)
   local sub = string.sub
@@ -12,7 +13,8 @@ local function count(seq, frag)
   local k = #frag
   local freq = {}
   for frame=1,k do kfrequency(seq, freq, k, frame) end
-  io.write(freq[frag] or 0, "\t", frag, "\n")
+  return freq[frag]
+  -- io.write(freq[frag] or 0, "\t", frag, "\n")
 end
 
 local function frequency(seq, k)
@@ -24,10 +26,13 @@ local function frequency(seq, k)
     local fa, fb = freq[a], freq[b]
     return fa == fb and a > b or fa > fb
   end)
+  local res = {}
   for _,c in ipairs(sfreq) do
-    io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
+    -- io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
+    res[c] = freq[c]*100/sum
   end
-  io.write("\n")
+  -- io.write("\n")
+  return res
 end
 
 local function readseq()
@@ -48,11 +53,77 @@ local function readseq()
   return string.upper(table.concat(lines, "", 1, ln))
 end
 
-local seq = readseq()
-frequency(seq, 1)
-frequency(seq, 2)
-count(seq, "GGT")
-count(seq, "GGTA")
-count(seq, "GGTATT")
-count(seq, "GGTATTTTAATT")
-count(seq, "GGTATTTTAATTTATAGT")
+local function check_freq(res, expected)
+  for k,v in pairs(expected) do
+    assert(string.format("%0.3f", res[k]) == v,
+           "Incorrect frequency for fragment " .. k)
+  end
+end
+
+-- The input is generated by `fasta.lua 5e6'. The checker below
+-- expects the values precomputed for that input.
+local N = 5e6
+-- See <libs/fasta.lua> for the details.
+local items = N * 5
+bench:add({
+  name = "k_nucleotide",
+  payload = function()
+    local seq = readseq()
+    local sfreq1 = frequency(seq, 1)
+    local sfreq2 = frequency(seq, 2)
+    local GGT  = count(seq, "GGT")
+    local GGTA = count(seq, "GGTA")
+    local GGTATT = count(seq, "GGTATT")
+    local GGTATTTTAATT = count(seq, "GGTATTTTAATT")
+    local GGTATTTTAATTTATAGT = count(seq, "GGTATTTTAATTTATAGT")
+
+    local res = {
+      sfreq1 = sfreq1,
+      sfreq2 = sfreq2,
+      GGT  = GGT,
+      GGTA = GGTA,
+      GGTATT = GGTATT,
+      GGTATTTTAATT = GGTATTTTAATT,
+      GGTATTTTAATTTATAGT = GGTATTTTAATTTATAGT,
+    }
+    -- XXX: Reset input for the non-check iteration.
+    io.stdin:seek("set", 0)
+    return res
+  end,
+  checker = function(res)
+    check_freq(res.sfreq1, {
+      A = "30.296",
+      T = "30.149",
+      C = "19.800",
+      G = "19.754",
+    })
+    check_freq(res.sfreq2, {
+      AA = "9.177",
+      TA = "9.132",
+      AT = "9.130",
+      TT = "9.091",
+      CA = "6.002",
+      AC = "6.001",
+      AG = "5.987",
+      GA = "5.984",
+      CT = "5.971",
+      TC = "5.971",
+      GT = "5.957",
+      TG = "5.956",
+      CC = "3.917",
+      GC = "3.911",
+      CG = "3.909",
+      GG = "3.902",
+    })
+
+    assert(res.GGT == 294331)
+    assert(res.GGTA == 89290)
+    assert(res.GGTATT == 9462)
+    assert(res.GGTATTTTAATT == 178)
+    assert(res.GGTATTTTAATTTATAGT == 178)
+    return true
+  end,
+  items = items,
+})
+
+bench:run_and_report()
-- 
2.51.0


* [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (10 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17  8:35   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit " Sergey Kaplun via Tarantool-patches
                   ` (28 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file.

The output is redirected to /dev/null. The checker tests the result
after the exact number of generations for the fixed field size (as
declared in the original benchmark).
---
 perf/LuaJIT-benches/life.lua | 79 +++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
index 911d9fe1..d0e4dc98 100644
--- a/perf/LuaJIT-benches/life.lua
+++ b/perf/LuaJIT-benches/life.lua
@@ -3,6 +3,8 @@
 -- modified to use ANSI terminal escape sequences
 -- modified to use for instead of while
 
+local bench = require('bench').new(arg)
+
 local write=io.write
 
 ALIVE="¥"	DEAD="þ"
@@ -106,6 +108,81 @@ function LIFE(w,h)
     if gen>2000 then break end
     --delay()		-- no delay
   end
+  return thisgen
 end
 
-LIFE(40,20)
+-- Result of the LIFE(40, 20) after 2000 generations.
+--[[
+----------------------------------------
+----------------------------------------
+--OO--------------------------O---------
+-OO--------------------------O-O--------
+---O--------------------------O---------
+----------------------------------------
+----------------------------------------
+----------------------------------------
+----------------------------------------
+----------------------------------------
+----------------------------------------
+----------------------------------------
+---O------------------------------------
+--O-O-----------------------------------
+--O-O-----------------------------------
+---O------------------------------------
+----------------------------------------
+-------OO-------------------------------
+-------OO-------------------------------
+----------------------------------------
+]]
+
+local function check_life(thisgen, w, h)
+  local expected_cells = ARRAY2D(w, h)
+  for y = 1, h do
+    for x = 1, w do
+      expected_cells[y][x] = false
+    end
+  end
+  local alive_cells = {
+    {3, 3}, {3, 4}, {3, 31},
+    {4, 2}, {4, 3}, {4, 30}, {4, 32},
+    {5, 4}, {5, 31},
+    {13, 4},
+    {14, 3}, {14, 5},
+    {15, 3}, {15, 5},
+    {16, 4},
+    {18, 8}, {18, 9},
+    {19, 8}, {19, 9},
+  }
+  for _, cell in ipairs(alive_cells) do
+    local y, x = cell[1], cell[2]
+    expected_cells[y][x] = true
+  end
+  for y = 1, h do
+    for x = 1, w do
+      assert(thisgen[y][x] > 0 == expected_cells[y][x],
+             ('Incorrect value for cell (%d, %d)'):format(x, y))
+    end
+  end
+  return true
+end
+
+local stdout = io.output()
+
+bench:add({
+  name = 'life',
+  setup = function()
+    io.output('/dev/null')
+  end,
+  payload = function()
+    return LIFE(40, 20)
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  checker = function(res)
+    return check_life(res, 40, 20)
+  end,
+  items = 2000 * 40 * 20,
+})
+
+bench:run_and_report()
-- 
2.51.0


* [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (11 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot " Sergey Kaplun via Tarantool-patches
                   ` (27 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier in this series. The default arguments are
adjusted according to the <PARAM_x86.txt> file. Arguments can still be
passed to the script on the command line.

The output is redirected to /dev/null. The check is skipped since
checking the binary output is very inconvenient, especially because the
output depends on the parameter.
---
 perf/LuaJIT-benches/mandelbrot-bit.lua | 86 +++++++++++++++++---------
 1 file changed, 57 insertions(+), 29 deletions(-)

diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
index 91d96975..a6b5e1f8 100644
--- a/perf/LuaJIT-benches/mandelbrot-bit.lua
+++ b/perf/LuaJIT-benches/mandelbrot-bit.lua
@@ -1,33 +1,61 @@
-
 local bit = require("bit")
-local bor, band = bit.bor, bit.band
-local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
-local write, char, unpack = io.write, string.char, unpack
-local N = tonumber(arg and arg[1]) or 100
-local M, buf = 2/N, {}
-write("P4\n", N, " ", N, "\n")
-for y=0,N-1 do
-  local Ci, b, p = y*M-1, -16777216, 0
-  local Ciq = Ci*Ci
-  for x=0,N-1,2 do
-    local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
-    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
-    local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
-    b = rol(b, 2)
-    for i=1,49 do
-      Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
-      Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
-      Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
-      Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
-      if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
-      if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
-      if band(b, 3) == 0 then break end
+
+local bench = require("bench").new(arg)
+
+local N = tonumber(arg and arg[1]) or 5000
+
+local function payload()
+  -- These must be stack slots (locals), not upvalues.
+  local N = N
+  local bor, band = bit.bor, bit.band
+  local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
+  local write, char, unpack = io.write, string.char, unpack
+
+  local M, buf = 2/N, {}
+  write("P4\n", N, " ", N, "\n")
+  for y=0,N-1 do
+    local Ci, b, p = y*M-1, -16777216, 0
+    local Ciq = Ci*Ci
+    for x=0,N-1,2 do
+      local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
+      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
+      local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
+      b = rol(b, 2)
+      for i=1,49 do
+        Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
+        Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
+        Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
+        Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
+        if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
+        if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
+        if band(b, 3) == 0 then break end
+      end
+      if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
     end
-    if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
-  end
-  if b ~= -16777216 then
-    if band(N, 1) ~= 0 then b = shr(b, 1) end
-    p = p + 1; buf[p] = shl(b, 8-band(N, 7))
+    if b ~= -16777216 then
+      if band(N, 1) ~= 0 then b = shr(b, 1) end
+      p = p + 1; buf[p] = shl(b, 8-band(N, 7))
+    end
+    write(char(unpack(buf, 1, p)))
   end
-  write(char(unpack(buf, 1, p)))
 end
+
+local stdout = io.output()
+
+bench:add({
+  name = "mandelbrot_bit",
+  items = N,
+  -- XXX: It is inconvenient to keep the binary file in the
+  -- repository for comparison. If the check is needed, run
+  -- the payload manually.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  payload = payload,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (12 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 " Sergey Kaplun via Tarantool-patches
                   ` (26 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The output is redirected to /dev/null. The check is skipped since it is
very inconvenient to check the binary output, especially since it may be
configured by the parameter.
---
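A note for reviewers: the redirect-and-restore pattern used by the setup
and teardown callbacks can be tried in isolation. The sketch below is a
minimal illustration that does not depend on the bench module.
with_output_to is a hypothetical helper, not part of the patch, and the
"/dev/null" path assumes a POSIX-like system.

```lua
-- Hypothetical helper (not part of the patch): run fn with the default
-- output file redirected to path, then restore the previous one.
local function with_output_to(path, fn, ...)
  local saved = io.output()          -- remember the current default output
  io.output(path)                    -- io.write() now goes to path
  local ok, err = pcall(fn, ...)
  io.output():close()                -- flush and close the redirected file
  io.output(saved)                   -- restore the original default output
  if not ok then error(err, 0) end
end

local before = io.output()
local written = 0
with_output_to("/dev/null", function()
  for _ = 1, 3 do
    io.write("P4\n")                 -- discarded
    written = written + 1
  end
end)
```

Unlike the callbacks in the patch, the helper also closes the redirected
file. That is a design choice of this sketch, not a requirement of the
framework.
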
 perf/LuaJIT-benches/mandelbrot.lua | 64 +++++++++++++++++++++---------
 1 file changed, 45 insertions(+), 19 deletions(-)

diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
index 0ef595a2..51e0dd4f 100644
--- a/perf/LuaJIT-benches/mandelbrot.lua
+++ b/perf/LuaJIT-benches/mandelbrot.lua
@@ -1,23 +1,49 @@
+local bench = require("bench").new(arg)
 
-local write, char, unpack = io.write, string.char, unpack
-local N = tonumber(arg and arg[1]) or 100
-local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
-write("P4\n", N, " ", N, "\n")
-for y=0,N-1 do
-  local Ci, b, p = y*M-1, 1, 0
-  for x=0,N-1 do
-    local Cr = x*M-1.5
-    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
-    b = b + b
-    for i=1,49 do
-      Zi = Zr*Zi*2 + Ci
-      Zr = Zrq-Ziq + Cr
-      Ziq = Zi*Zi
-      Zrq = Zr*Zr
-      if Zrq+Ziq > 4.0 then b = b + 1; break; end
+local N = tonumber(arg and arg[1]) or 5000
+
+local function payload()
+  -- These functions must be stack slots, not upvalues.
+  local N = N
+  local write, char, unpack = io.write, string.char, unpack
+  local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
+  write("P4\n", N, " ", N, "\n")
+  for y=0,N-1 do
+    local Ci, b, p = y*M-1, 1, 0
+    for x=0,N-1 do
+      local Cr = x*M-1.5
+      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
+      b = b + b
+      for i=1,49 do
+        Zi = Zr*Zi*2 + Ci
+        Zr = Zrq-Ziq + Cr
+        Ziq = Zi*Zi
+        Zrq = Zr*Zr
+        if Zrq+Ziq > 4.0 then b = b + 1; break; end
+      end
+      if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
     end
-    if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
+    if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
+    write(char(unpack(buf, 1, p)))
   end
-  if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
-  write(char(unpack(buf, 1, p)))
 end
+
+local stdout = io.output()
+
+bench:add({
+  name = "mandelbrot",
+  items = N,
+  -- XXX: It is inconvenient to keep the binary file in the
+  -- repository for the comparison. If the check is needed, run
+  -- the payload manually.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  payload = payload,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (13 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor " Sergey Kaplun via Tarantool-patches
                   ` (25 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/md5.lua | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
index fdf6b4a7..5ec67527 100644
--- a/perf/LuaJIT-benches/md5.lua
+++ b/perf/LuaJIT-benches/md5.lua
@@ -1,5 +1,6 @@
-
 local bit = require("bit")
+local bench = require("bench").new(arg)
+
 local tobit, tohex, bnot = bit.tobit or bit.cast, bit.tohex, bit.bnot
 local bor, band, bxor = bit.bor, bit.band, bit.bxor
 local lshift, rshift, rol, bswap = bit.lshift, bit.rshift, bit.rol, bit.bswap
@@ -147,7 +148,7 @@ assert(md5('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789') ==
 assert(md5('12345678901234567890123456789012345678901234567890123456789012345678901234567890') ==
        '57edf4a22be3c955ac49da2e2107b67a')
 
-local N = tonumber(arg and arg[1]) or 10000
+local N = tonumber(arg and arg[1]) or 20000
 
   -- Credits: William Shakespeare, Romeo and Juliet
 local txt = [[Rebellious subjects, enemies to peace,
@@ -176,8 +177,20 @@ Once more, on pain of death, all men depart.]]
   txt = txt..txt..txt..txt
   txt = txt..txt..txt..txt
 
-for i=1,N do
-  res = md5(txt)
-end
-assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
-
+bench:add({
+  name = 'md5',
+  payload = function()
+    local res
+    for i=1,N do
+      res = md5(txt)
+    end
+    return res
+  end,
+  checker = function(res)
+    assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
+    return true
+  end,
+  items = N,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (14 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody " Sergey Kaplun via Tarantool-patches
                   ` (24 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The arguments to the script can still be
provided on the command line. However, values greater than the maximum
number of solutions found do not affect the execution time of this
benchmark. Hence, the number of items to process is treated as a
constant equal to the maximum possible number of solutions.
---
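To illustrate why the item count is treated as a constant: once the
requested limit exceeds the number of solutions that exist, a bounded
search does the same amount of work regardless of the limit. The toy
search below is only an analogy under that assumption. The search
function and its problem (multiples of 7 below 1000) are hypothetical
and unrelated to the meteor solver.

```lua
-- Hypothetical bounded search: stop after `limit` solutions are found.
local function search(limit)
  local found, steps = 0, 0
  for i = 1, 999 do
    steps = steps + 1
    if i % 7 == 0 then
      found = found + 1
      if found >= limit then break end
    end
  end
  return found, steps
end

local TOTAL = 142                    -- multiples of 7 in 1..999
local f1, s1 = search(TOTAL + 1)     -- limit just above the total
local f2, s2 = search(10000)         -- much larger limit, same work
local f3, s3 = search(10)            -- small limit stops early
```

Only a limit below the total shortens the run, which is why reporting
the total as the item count keeps the throughput comparable across
larger arguments.
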
 perf/LuaJIT-benches/meteor.lua | 46 ++++++++++++++++++++++++++--------
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
index 80588ab5..f3962820 100644
--- a/perf/LuaJIT-benches/meteor.lua
+++ b/perf/LuaJIT-benches/meteor.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 -- Generate a decision tree based solver for the meteor puzzle.
 local function generatesolver(countinit)
@@ -118,6 +119,10 @@ local function printresult()
   printboard(smax)
 end
 
+local function getresult()
+  return countinit-count, smin, smax
+end
+
 -- Generate piece lookup array from the order of use.
 local function genp()
   local p = pcs
@@ -141,7 +146,7 @@ local function f91(k)
     local s = p[b0] ]]
   for p=2,99 do if ok[p] then s = s.."..p[b"..p.."]" end end
   s = s..[[
-    -- Remember min/max boards, dito for the symmetric board.
+    -- Remember min/max boards, ditto for the symmetric board.
     if not smin then smin = s; smax = s
     elseif s < smin then smin = s elseif s > smax then smax = s end
     s = reverse(s)
@@ -206,15 +211,36 @@ local f93 = f91
   end
 
   -- Compile and return solver function and result getter.
-  return loadstring(s.."return f0, printresult\n", "solver")(countinit)
+  return loadstring(s.."return f0, printresult, getresult\n", "solver")(countinit)
 end
 
--- Generate the solver function hierarchy.
-local solver, printresult = generatesolver(tonumber(arg and arg[1]) or 10000)
-
--- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
-if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
+local N = tonumber(arg and arg[1]) or 10000
+
+bench:add({
+  name = "meteor",
+  setup = function()
+    -- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
+    if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
+  end,
+  payload = function()
+    -- Generate the solver function hierarchy.
+    local solver, printresult, getresult = generatesolver(N)
+
+    -- Run the solver protected to get partial results (max count or ctrl-c).
+    pcall(solver, 0)
+
+    local n, smin, smax = getresult()
+    return {n = n, smin = smin, smax = smax}
+  end,
+  checker = function(res)
+    if N >= 2097 then
+      assert(res.n == 2098, "Incorrect solutions number")
+      assert(res.smin == "00001222012661126155865558633348893448934747977799")
+      assert(res.smax == "99998966856688568255777257472014220144031400311333")
+    end
+    return true
+  end,
+  items = 2098,
+})
 
--- Run the solver protected to get partial results (max count or ctrl-c).
-pcall(solver, 0)
-printresult()
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (15 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp " Sergey Kaplun via Tarantool-patches
                   ` (23 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/nbody.lua | 127 ++++++++++++++++++++--------------
 1 file changed, 74 insertions(+), 53 deletions(-)

diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
index e0ff8f77..f01c20a3 100644
--- a/perf/LuaJIT-benches/nbody.lua
+++ b/perf/LuaJIT-benches/nbody.lua
@@ -1,56 +1,12 @@
+local bench = require("bench").new(arg)
 
 local sqrt = math.sqrt
 
 local PI = 3.141592653589793
 local SOLAR_MASS = 4 * PI * PI
 local DAYS_PER_YEAR = 365.24
-local bodies = {
-  { -- Sun
-    x = 0,
-    y = 0,
-    z = 0,
-    vx = 0,
-    vy = 0,
-    vz = 0,
-    mass = SOLAR_MASS
-  },
-  { -- Jupiter
-    x = 4.84143144246472090e+00,
-    y = -1.16032004402742839e+00,
-    z = -1.03622044471123109e-01,
-    vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
-    vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
-    vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
-    mass = 9.54791938424326609e-04 * SOLAR_MASS
-  },
-  { -- Saturn
-    x = 8.34336671824457987e+00,
-    y = 4.12479856412430479e+00,
-    z = -4.03523417114321381e-01,
-    vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
-    vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
-    vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
-    mass = 2.85885980666130812e-04 * SOLAR_MASS
-  },
-  { -- Uranus
-    x = 1.28943695621391310e+01,
-    y = -1.51111514016986312e+01,
-    z = -2.23307578892655734e-01,
-    vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
-    vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
-    vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
-    mass = 4.36624404335156298e-05 * SOLAR_MASS
-  },
-  { -- Neptune
-    x = 1.53796971148509165e+01,
-    y = -2.59193146099879641e+01,
-    z = 1.79258772950371181e-01,
-    vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
-    vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
-    vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
-    mass = 5.15138902046611451e-05 * SOLAR_MASS
-  }
-}
+local bodies
+local nbody
 
 local function advance(bodies, nbody, dt)
   for i=1,nbody do
@@ -110,10 +66,75 @@ local function offsetMomentum(b, nbody)
   b[1].vz = -pz / SOLAR_MASS
 end
 
-local N = tonumber(arg and arg[1]) or 1000
-local nbody = #bodies
+local DEFAULT_N = 5e6
+local N = tonumber(arg and arg[1]) or DEFAULT_N
 
-offsetMomentum(bodies, nbody)
-io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
-for i=1,N do advance(bodies, nbody, 0.01) end
-io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
+bench:add({
+  name = "nbody",
+  payload = function()
+    bodies = {
+      { -- Sun
+        x = 0,
+        y = 0,
+        z = 0,
+        vx = 0,
+        vy = 0,
+        vz = 0,
+        mass = SOLAR_MASS
+      },
+      { -- Jupiter
+        x = 4.84143144246472090e+00,
+        y = -1.16032004402742839e+00,
+        z = -1.03622044471123109e-01,
+        vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
+        vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
+        vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
+        mass = 9.54791938424326609e-04 * SOLAR_MASS
+      },
+      { -- Saturn
+        x = 8.34336671824457987e+00,
+        y = 4.12479856412430479e+00,
+        z = -4.03523417114321381e-01,
+        vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
+        vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
+        vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
+        mass = 2.85885980666130812e-04 * SOLAR_MASS
+      },
+      { -- Uranus
+        x = 1.28943695621391310e+01,
+        y = -1.51111514016986312e+01,
+        z = -2.23307578892655734e-01,
+        vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
+        vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
+        vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
+        mass = 4.36624404335156298e-05 * SOLAR_MASS
+      },
+      { -- Neptune
+        x = 1.53796971148509165e+01,
+        y = -2.59193146099879641e+01,
+        z = 1.79258772950371181e-01,
+        vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
+        vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
+        vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
+        mass = 5.15138902046611451e-05 * SOLAR_MASS
+      }
+    }
+    nbody = #bodies
+
+    offsetMomentum(bodies, nbody)
+
+    assert(energy(bodies, nbody) == -0.16907516382852447179,
+           "Incorrect start energy")
+    for i=1,N do advance(bodies, nbody, 0.01) end
+  end,
+  checker = function()
+    if N == DEFAULT_N then
+      assert(energy(bodies, nbody) == -0.16908313397890917251,
+             "Incorrect result energy")
+    end
+    return true
+  end,
+  items = N,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (16 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit " Sergey Kaplun via Tarantool-patches
                   ` (22 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
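For reference, the item count that the payload in this patch accumulates
can be worked out ahead of time. The sketch below repeats the same
arithmetic outside the benchmark. total_items is a hypothetical name
used only here.

```lua
-- Sum of the three sieve sizes processed for a given N:
-- (2^N + 2^(N-1) + 2^(N-2)) * 10000.
local function total_items(N)
  local items = 0
  for i = 0, 2 do
    items = items + (2^(N - i)) * 10000
  end
  return items
end

local items = total_items(12)  -- 40960000 + 20480000 + 10240000
```

For the default N of 12 this gives 71680000 items per run.
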
 perf/LuaJIT-benches/nsieve-bit-fp.lua | 35 +++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
index 3971ec1f..d0ab23d2 100644
--- a/perf/LuaJIT-benches/nsieve-bit-fp.lua
+++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local floor, ceil = math.floor, math.ceil
 
@@ -27,11 +28,35 @@ local function nsieve(p, m)
   return count
 end
 
-local N = tonumber(arg and arg[1]) or 1
+local DEFAULT_N = 12
+local N = tonumber(arg and arg[1]) or DEFAULT_N
 if N < 2 then N = 2 end
 local primes = {}
 
-for i=0,2 do
-  local m = (2^(N-i))*10000
-  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
-end
+local benchmark
+benchmark = {
+  name = "nsieve_bit_fp",
+  payload = function()
+    local res = {}
+    local items = 0
+    for i=0,2 do
+      local m = (2^(N-i))*10000
+      items = items + m
+      res[i] = nsieve(primes, m)
+    end
+    benchmark.items = items
+
+    return res
+  end,
+  checker = function(res)
+    if N == DEFAULT_N then
+      assert(res[0] == 2488465)
+      assert(res[1] == 1299069)
+      assert(res[2] == 679461)
+    end
+    return true
+  end,
+}
+
+bench:add(benchmark)
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (17 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve " Sergey Kaplun via Tarantool-patches
                   ` (21 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/nsieve-bit.lua | 35 +++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
index 820a3726..4858e9e2 100644
--- a/perf/LuaJIT-benches/nsieve-bit.lua
+++ b/perf/LuaJIT-benches/nsieve-bit.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local bit = require("bit")
 local band, bxor, rshift, rol = bit.band, bit.bxor, bit.rshift, bit.rol
@@ -17,11 +18,35 @@ local function nsieve(p, m)
   return count
 end
 
-local N = tonumber(arg and arg[1]) or 1
+local DEFAULT_N = 12
+local N = tonumber(arg and arg[1]) or DEFAULT_N
 if N < 2 then N = 2 end
 local primes = {}
 
-for i=0,2 do
-  local m = (2^(N-i))*10000
-  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
-end
+local benchmark
+benchmark = {
+  name = "nsieve_bit",
+  payload = function()
+    local res = {}
+    local items = 0
+    for i=0,2 do
+      local m = (2^(N-i))*10000
+      items = items + m
+      res[i] = nsieve(primes, m)
+    end
+    benchmark.items = items
+
+    return res
+  end,
+  checker = function(res)
+    if N == DEFAULT_N then
+      assert(res[0] == 2488465)
+      assert(res[1] == 1299069)
+      assert(res[2] == 679461)
+    end
+    return true
+  end,
+}
+
+bench:add(benchmark)
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (18 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums " Sergey Kaplun via Tarantool-patches
                   ` (20 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/nsieve.lua | 35 +++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
index 6de0524f..2d1b66c8 100644
--- a/perf/LuaJIT-benches/nsieve.lua
+++ b/perf/LuaJIT-benches/nsieve.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function nsieve(p, m)
   for i=2,m do p[i] = true end
@@ -11,11 +12,35 @@ local function nsieve(p, m)
   return count
 end
 
-local N = tonumber(arg and arg[1]) or 1
+local DEFAULT_N = 12
+local N = tonumber(arg and arg[1]) or DEFAULT_N
 if N < 2 then N = 2 end
 local primes = {}
 
-for i=0,2 do
-  local m = (2^(N-i))*10000
-  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
-end
+local benchmark
+benchmark = {
+  name = "nsieve",
+  payload = function()
+    local res = {}
+    local items = 0
+    for i=0,2 do
+      local m = (2^(N-i))*10000
+      items = items + m
+      res[i] = nsieve(primes, m)
+    end
+    benchmark.items = items
+
+    return res
+  end,
+  checker = function(res)
+    if N == DEFAULT_N then
+      assert(res[0] == 2488465)
+      assert(res[1] == 1299069)
+      assert(res[2] == 679461)
+    end
+    return true
+  end,
+}
+
+bench:add(benchmark)
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (19 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp " Sergey Kaplun via Tarantool-patches
                   ` (19 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/partialsums.lua | 69 ++++++++++++++++++-----------
 1 file changed, 42 insertions(+), 27 deletions(-)

diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
index 46bb9da3..ab24b30a 100644
--- a/perf/LuaJIT-benches/partialsums.lua
+++ b/perf/LuaJIT-benches/partialsums.lua
@@ -1,29 +1,44 @@
+local bench = require("bench").new(arg)
 
-local n = tonumber(arg[1])
-local function pr(fmt, x) io.write(string.format(fmt, x)) end
+local DEFAULT_N = 1e7
+local n = tonumber(arg[1]) or DEFAULT_N
 
-local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
-local sqrt, sin, cos = math.sqrt, math.sin, math.cos
-for k=1,n do
-  local k2, sk, ck = k*k, sin(k), cos(k)
-  local k3 = k2*k
-  a1 = a1 + (2/3)^k
-  a2 = a2 + 1/sqrt(k)
-  a3 = a3 + 1/(k2+k)
-  a4 = a4 + 1/(k3*sk*sk)
-  a5 = a5 + 1/(k3*ck*ck)
-  a6 = a6 + 1/k
-  a7 = a7 + 1/k2
-  a8 = a8 + alt/k
-  a9 = a9 + alt/(k+k-1)
-  alt = -alt
-end
-pr("%.9f\t(2/3)^k\n", a1)
-pr("%.9f\tk^-0.5\n", a2)
-pr("%.9f\t1/k(k+1)\n", a3)
-pr("%.9f\tFlint Hills\n", a4)
-pr("%.9f\tCookson Hills\n", a5)
-pr("%.9f\tHarmonic\n", a6)
-pr("%.9f\tRiemann Zeta\n", a7)
-pr("%.9f\tAlternating Harmonic\n", a8)
-pr("%.9f\tGregory\n", a9)
+bench:add({
+  name = "partialsums",
+  payload = function()
+    local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
+    local sqrt, sin, cos = math.sqrt, math.sin, math.cos
+    for k=1,n do
+      local k2, sk, ck = k*k, sin(k), cos(k)
+      local k3 = k2*k
+      a1 = a1 + (2/3)^k
+      a2 = a2 + 1/sqrt(k)
+      a3 = a3 + 1/(k2+k)
+      a4 = a4 + 1/(k3*sk*sk)
+      a5 = a5 + 1/(k3*ck*ck)
+      a6 = a6 + 1/k
+      a7 = a7 + 1/k2
+      a8 = a8 + alt/k
+      a9 = a9 + alt/(k+k-1)
+      alt = -alt
+    end
+    return {a1, a2, a3, a4, a5, a6, a7, a8, a9}
+  end,
+  checker = function(a)
+    if n == DEFAULT_N then
+      assert(a[1] == 2.99999999999999866773)
+      assert(a[2] == 6323.09512394020111969439)
+      assert(a[3] == 0.99999989999981531152)
+      assert(a[4] == 30.31454593111029183206)
+      assert(a[5] == 42.99523427973661426904)
+      assert(a[6] == 16.69531136585727182364)
+      assert(a[7] == 1.64493396684725956547)
+      assert(a[8] == 0.69314713056010635039)
+      assert(a[9] == 0.78539813839744787582)
+    end
+    return true
+  end,
+  items = n,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (20 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray " Sergey Kaplun via Tarantool-patches
                   ` (18 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The output is redirected to /dev/null. The check is skipped since it is
very inconvenient to store a huge file with the reference output in the
repository.
---
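The "substitute radix and compile chunk" step that this patch moves into
the payload can be sketched on its own. The template and the names
compile, source and reduce below are hypothetical and much smaller than
the real chunk. The loadstring-or-load fallback is an assumption added
so that the sketch also runs on newer Lua versions.

```lua
-- loadstring exists in Lua 5.1/LuaJIT; newer Lua versions provide load.
local compile = loadstring or load

-- Hypothetical template; RADIX is a placeholder replaced before compiling.
local template = [[
return function(x)
  return x % RADIX
end
]]

local RADIX = 2^32
-- Parentheses drop gsub's second result (the substitution count).
local source = (string.gsub(template, "RADIX", tostring(RADIX)))
local reduce = assert(compile(source, "radix_sketch"))()
```

In the benchmark itself both results of string.gsub are passed on to
loadstring, so the substitution count appears to end up as the chunk
name. The parentheses in the sketch avoid that and allow an explicit
name.
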
 perf/LuaJIT-benches/pidigits-nogmp.lua | 49 ++++++++++++++++++--------
 1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
index 63a1cb0e..e96b3e45 100644
--- a/perf/LuaJIT-benches/pidigits-nogmp.lua
+++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 -- Start of dynamically compiled chunk.
 local chunk = [=[
@@ -80,21 +81,41 @@ end)
 
 ]=] -- End of dynamically compiled chunk.
 
-local N = tonumber(arg and arg[1]) or 27
+local N = tonumber(arg and arg[1]) or 5000
 local RADIX = N < 6500 and 2^36 or 2^32 -- Avoid overflow.
 
--- Substitute radix and compile chunk.
-local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
+local stdout = io.output()
 
--- Print lines with 10 digits.
-for i=10,N,10 do
-  for j=1,10 do io.write(pidigit()) end
-  io.write("\t:", i, "\n")
-end
+bench:add({
+  name = "pidigit_nogmp",
+  -- Skip the check here, since checking the output is not very
+  -- convenient. If you want to check the behaviour, drop the setup
+  -- function.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  payload = function()
+    -- Substitute radix and compile chunk.
+    local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
 
--- Print remaining digits (if any).
-local n10 = N % 10
-if n10 ~= 0 then
-  for i=1,n10 do io.write(pidigit()) end
-  io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
-end
+    -- Print lines with 10 digits.
+    for i=10,N,10 do
+      for j=1,10 do io.write(pidigit()) end
+      io.write("\t:", i, "\n")
+    end
+
+    -- Print remaining digits (if any).
+    local n10 = N % 10
+    if n10 ~= 0 then
+      for i=1,n10 do io.write(pidigit()) end
+      io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
+    end
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  items = N,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (21 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack " Sergey Kaplun via Tarantool-patches
                   ` (17 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The output is redirected to /dev/null. The check is skipped since it is
very inconvenient to check the binary output, especially since it may be
configured by the parameter.
---
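A note on the redirection described above: a minimal sketch of swapping
the default output file for the duration of one call and restoring it
afterwards. The helper name `with_null_output` is hypothetical and is
not part of the "bench" module; the sketch assumes a POSIX system where
/dev/null exists.

```lua
-- Hypothetical helper, not part of the "bench" module: run fn with the
-- default output file pointed at /dev/null, then restore the old one.
local function with_null_output(fn, ...)
  local saved = io.output()   -- Current default output file.
  io.output("/dev/null")      -- Opens the file in write mode.
  local null = io.output()
  local ok, res = pcall(fn, ...)
  io.output(saved)            -- Restore before reporting anything.
  null:close()
  if not ok then error(res, 0) end
  return res
end

-- Write a tiny PGM-like payload; nothing reaches the terminal.
local written = with_null_output(function()
  io.write("P5\n2 2\n255\n", string.char(0, 64, 128, 255))
  return 4
end)
```

In the patch the same two steps are split between the setup and
teardown callbacks, so the payload itself stays unchanged.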
 perf/LuaJIT-benches/ray.lua | 76 ++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 26 deletions(-)

diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
index 2acc24c0..f7b76d0a 100644
--- a/perf/LuaJIT-benches/ray.lua
+++ b/perf/LuaJIT-benches/ray.lua
@@ -1,10 +1,8 @@
+local bench = require("bench").new(arg)
+
 local sqrt = math.sqrt
 local huge = math.huge
-
-local delta = 1
-while delta * delta + 1 ~= 1 do
-  delta = delta * 0.5
-end
+local delta
 
 local function length(x, y, z)  return sqrt(x*x + y*y + z*z) end
 local function vlen(v)          return length(v[1], v[2], v[3]) end
@@ -110,26 +108,52 @@ end
 
 
 local level, n, ss = tonumber(arg[1]) or 9, tonumber(arg[2]) or 256, 4
-local iss = 1/ss
-local gf = 255/(ss*ss)
-
-io.write(("P5\n%d %d\n255\n"):format(n, n))
-local light = { unitise(-1, -3, 2) }
-ilight = { -light[1], -light[2], -light[3] }
-local camera = { 0, 0, -4 }
-local dir = { 0, 0, 0 }
-
-local scene = create(level, {0, -1, 0}, 1)
-
-for y = n/2-1, -n/2, -1 do
-  for x = -n/2, n/2-1 do
-    local g = 0
-    for d = y, y+.99, iss do
-      for e = x, x+.99, iss do
-        dir[1], dir[2], dir[3] = unitise(e, d, n)
-        g = g + ray_trace(light, camera, dir, scene) 
+
+local stdout = io.output()
+
+bench:add({
+  name = "ray",
+  -- Skip the check here, since verifying the output is not very
+  -- convenient. If you want to inspect the output, drop the setup
+  -- function.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  payload = function()
+    local iss = 1/ss
+    local gf = 255/(ss*ss)
+
+    delta = 1
+    while delta * delta + 1 ~= 1 do
+      delta = delta * 0.5
+    end
+
+    io.write(("P5\n%d %d\n255\n"):format(n, n))
+    local light = { unitise(-1, -3, 2) }
+    ilight = { -light[1], -light[2], -light[3] }
+    local camera = { 0, 0, -4 }
+    local dir = { 0, 0, 0 }
+
+    local scene = create(level, {0, -1, 0}, 1)
+
+    for y = n/2-1, -n/2, -1 do
+      for x = -n/2, n/2-1 do
+        local g = 0
+        for d = y, y+.99, iss do
+          for e = x, x+.99, iss do
+            dir[1], dir[2], dir[3] = unitise(e, d, n)
+            g = g + ray_trace(light, camera, dir, scene)
+          end
+        end
+        io.write(string.char(math.floor(0.5 + g*gf)))
       end
     end
-    io.write(string.char(math.floor(0.5 + g*gf)))
-  end
-end
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  items = n * n * level,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (22 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib " Sergey Kaplun via Tarantool-patches
                   ` (16 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
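A note on the `items` formula and the checker used below: a small
sketch that counts the calls with an instrumented copy of the function
and compares them with the closed forms for small N. The counter and
the `expected_calls` helper are introduced here for illustration only,
and the last branch of `Ack` is assumed to be the standard Ackermann
recursion, since the hunk does not show it.

```lua
local calls = 0

-- Instrumented copy of the benchmark function (standard definition).
local function Ack(m, n)
  calls = calls + 1
  if m == 0 then return n+1 end
  if n == 0 then return Ack(m-1, 1) end
  return Ack(m-1, Ack(m, n-1))
end

-- Closed form used for `items` in the patch.
local function expected_calls(N)
  return 128 * ((4 ^ N - 1) / 3) - 40 * (2 ^ N - 1) + 3 * N + 15
end

local results = {}
for N = 0, 4 do
  calls = 0
  local res = Ack(3, N)
  results[N] = { res = res, calls = calls }
end
```

For these values of N both the call count and the result
(2 ^ (N + 3) - 3) agree with the expressions in the patch.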
 perf/LuaJIT-benches/recursive-ack.lua | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/recursive-ack.lua b/perf/LuaJIT-benches/recursive-ack.lua
index fad30589..1172d4b3 100644
--- a/perf/LuaJIT-benches/recursive-ack.lua
+++ b/perf/LuaJIT-benches/recursive-ack.lua
@@ -1,3 +1,5 @@
+local bench = require("bench").new(arg)
+
 local function Ack(m, n)
   if m == 0 then return n+1 end
   if n == 0 then return Ack(m-1, 1) end
@@ -5,4 +7,17 @@ local function Ack(m, n)
 end
 
 local N = tonumber(arg and arg[1]) or 10
-io.write("Ack(3,", N ,"): ", Ack(3,N), "\n")
+
+bench:add({
+  name = "recursive_ack",
+  -- Total number of calls for the function Ack(3, N).
+  items = 128 * ((4 ^ N - 1) / 3) - 40 * (2 ^ N - 1) + 3 * N + 15,
+  payload = function()
+    return Ack(3, N)
+  end,
+  checker = function(res)
+    return res == 2 ^ (N + 3) - 3
+  end,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (23 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp " Sergey Kaplun via Tarantool-patches
                   ` (15 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
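A note on the `items` value computed in the payload below: with
fib(0) = fib(1) = 1, the number of calls C(n) satisfies
C(n) = 1 + C(n-1) + C(n-2) with C(0) = C(1) = 1, which gives
C(n) = 2 * fib(n) - 1. The sketch checks this with an instrumented
copy of the function; the counter and `fib_iter` are illustration-only
names.

```lua
local calls = 0

-- Instrumented copy of the benchmark function.
local function fib(n)
  calls = calls + 1
  if n < 2 then return 1 end
  return fib(n-2) + fib(n-1)
end

-- Iterative reference, mirroring the checker in the patch.
local function fib_iter(n)
  local km1, k = 1, 1
  for _ = 2, n do
    km1, k = k, k + km1
  end
  return k
end

local n = 20
local res = fib(n)
```

A smaller n than the benchmark default is used here to keep the run
short.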
 perf/LuaJIT-benches/recursive-fib.lua | 28 +++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/perf/LuaJIT-benches/recursive-fib.lua b/perf/LuaJIT-benches/recursive-fib.lua
index ef9950de..99af3f9e 100644
--- a/perf/LuaJIT-benches/recursive-fib.lua
+++ b/perf/LuaJIT-benches/recursive-fib.lua
@@ -1,7 +1,31 @@
+local bench = require("bench").new(arg)
+
 local function fib(n)
   if n < 2 then return 1 end
   return fib(n-2) + fib(n-1)
 end
 
-local n = tonumber(arg[1]) or 10
-io.write(string.format("Fib(%d): %d\n", n, fib(n)))
+local n = tonumber(arg[1]) or 40
+
+local benchmark
+benchmark = {
+  name = "recursive_fib",
+  checker = function(res)
+    local km1, k = 1, 1
+    for i = 2, n do
+      local tmp = k + km1
+      km1 = k
+      k = tmp
+    end
+    return k == res
+  end,
+  payload = function()
+    local res = fib(n)
+    -- Number of calls.
+    benchmark.items = res * 2 - 1
+    return res
+  end,
+}
+
+bench:add(benchmark)
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (24 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 " Sergey Kaplun via Tarantool-patches
                   ` (14 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The benchmark input is provided by redirecting the corresponding
<FASTA_5000000> file, generated by `libs/fasta.lua 5e6`, to stdin. The
output of the benchmark is redirected to /dev/null. Checks are skipped
since the output is huge, and storing it in the repository would be
overkill.
---
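A note on repeating the payload: the input has to be rewound between
runs, which only works when it is a regular file (a pipe or a terminal
cannot be seeked). The sketch below shows the same idea on a temporary
file instead of stdin; the file contents and the `count_lines` helper
are made up for illustration.

```lua
-- Create a small stand-in for the FASTA input.
local path = os.tmpname()
local out = assert(io.open(path, "w"))
out:write(">ONE\nACGT\n")
out:close()

local input = assert(io.open(path, "r"))

-- Consume the whole file, as the payload does with io.lines().
local function count_lines(file)
  local n = 0
  for _ in file:lines() do n = n + 1 end
  return n
end

local first = count_lines(input)
-- Without this rewind the second pass would see no lines at all.
input:seek("set", 0)
local second = count_lines(input)

input:close()
os.remove(path)
```

The patch applies the same rewind to io.stdin at the end of the
payload.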
 perf/LuaJIT-benches/revcomp.lua | 72 +++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 25 deletions(-)

diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
index 34fe347b..2b1ffa5c 100644
--- a/perf/LuaJIT-benches/revcomp.lua
+++ b/perf/LuaJIT-benches/revcomp.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local sub = string.sub
 iubc = setmetatable({
@@ -9,29 +10,50 @@ iubc = setmetatable({
 }, { __index = function(t, s)
   local r = t[sub(s, 2)]..t[sub(s, 1, 1)]; t[s] = r; return r end })
 
-local wcode = [=[
-return function(t, n)
-  if n == 1 then return end
-  local iubc, sub, write = iubc, string.sub, io.write
-  local s = table.concat(t, "", 1, n-1)
-  for i=#s-59,1,-60 do
-    write(]=]
-for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
-wcode = wcode..[=["\n")
-  end
-  local r = #s % 60
-  if r ~= 0 then
-    for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
-    write("\n")
-  end
-end
-]=]
-local writerev = loadstring(wcode)()
+local stdout = io.output()
 
-local t, n = {}, 1
-for line in io.lines() do
-  local c = sub(line, 1, 1)
-  if c == ">" then writerev(t, n); io.write(line, "\n"); n = 1
-  elseif c ~= ";" then t[n] = line; n = n + 1 end
-end
-writerev(t, n)
+bench:add({
+  name = "revcomp",
+  -- Comparing with the expected output file is inconvenient.
+  skip_check = true,
+  setup = function()
+    io.output("/dev/null")
+  end,
+  payload = function()
+    local wcode = [=[
+    return function(t, n)
+      if n == 1 then return end
+      local iubc, sub, write = iubc, string.sub, io.write
+      local s = table.concat(t, "", 1, n-1)
+      for i=#s-59,1,-60 do
+        write(]=]
+    for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
+    wcode = wcode..[=["\n")
+      end
+      local r = #s % 60
+      if r ~= 0 then
+        for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
+        write("\n")
+      end
+    end
+    ]=]
+    local writerev = loadstring(wcode)()
+
+    local t, n = {}, 1
+    for line in io.lines() do
+      local c = sub(line, 1, 1)
+      if c == ">" then writerev(t, n); io.write(line, "\n"); n = 1
+      elseif c ~= ";" then t[n] = line; n = n + 1 end
+    end
+    writerev(t, n)
+    -- Rewind stdin so that the payload can be run several times.
+    io.stdin:seek("set", 0)
+  end,
+  teardown = function()
+    io.output(stdout)
+  end,
+  -- Number of symbols in the input file.
+  items = 5e6,
+})
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (25 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:56   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory Sergey Kaplun via Tarantool-patches
                   ` (13 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The time for each subsequent benchmark is increased up to 4 seconds,
according to the defaults in the "bench" framework. The main difference
between this test and the others that will be added in the next commits
is the use of FFI arrays instead of plain Lua tables.
---
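A note on the `local b` / `b = { ... }` pattern used below: the payload
stores the measured flops into its own benchmark table, so the table
must be visible inside the closure. In Lua the scope of a local starts
only after its declaration statement, hence the separate declaration.
The `run` function is a hypothetical stand-in that reads `items` after
calling the payload; the real "bench" module may behave differently.

```lua
-- Hypothetical stand-in for a runner: calls the payload first and
-- reads `items` only afterwards.
local function run(b)
  b.payload()
  return b.items
end

local benchmark
benchmark = {
  name = "example",
  payload = function()
    -- `benchmark` is an upvalue here, because the local was declared
    -- before the table constructor was evaluated.
    benchmark.items = 42
  end,
}

-- Counter-example: inside this constructor `broken` is not yet in
-- scope, so the closure refers to a global with the same name.
local broken = {
  payload = function()
    return broken
  end,
}

local items = run(benchmark)
local seen = broken.payload()
```

With no global named `broken` defined, the second payload returns nil
rather than the table.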
 perf/LuaJIT-benches/scimark-2010-12-20.lua | 93 +++++++++++++---------
 1 file changed, 54 insertions(+), 39 deletions(-)

diff --git a/perf/LuaJIT-benches/scimark-2010-12-20.lua b/perf/LuaJIT-benches/scimark-2010-12-20.lua
index 353acb7c..3fb627fa 100644
--- a/perf/LuaJIT-benches/scimark-2010-12-20.lua
+++ b/perf/LuaJIT-benches/scimark-2010-12-20.lua
@@ -9,25 +9,26 @@
 local SCIMARK_VERSION = "2010-12-10"
 local SCIMARK_COPYRIGHT = "Copyright (C) 2006-2010 Mike Pall"
 
-local MIN_TIME = 2.0
+local bench = require("bench").new(arg)
+
 local RANDOM_SEED = 101009 -- Must be odd.
 local SIZE_SELECT = "small"
 
 local benchmarks = {
   "FFT", "SOR", "MC", "SPARSE", "LU",
   small = {
-    FFT		= { 1024 },
-    SOR		= { 100 },
-    MC		= { },
-    SPARSE	= { 1000, 5000 },
-    LU		= { 100 },
+    FFT		= { params = { 1024 }, cycles = 50000, },
+    SOR		= { params = { 100 }, cycles = 50000, },
+    MC		= { params = { }, cycles = 15e7, },
+    SPARSE	= { params = { 1000, 5000 }, cycles = 15e4, },
+    LU		= { params = { 100 }, cycles = 5000, },
   },
   large = {
-    FFT		= { 1048576 },
-    SOR		= { 1000 },
-    MC		= { },
-    SPARSE	= { 100000, 1000000 },
-    LU		= { 1000 },
+    FFT		= { params = { 1048576 }, cycles = 25, },
+    SOR		= { params = { 1000 }, cycles = 500, },
+    MC		= { params = { }, cycles = 15e7, },
+    SPARSE	= { params = { 100000, 1000000 }, cycles = 1500, },
+    LU		= { params = { 1000 }, cycles = 50, },
   },
 }
 
@@ -342,48 +343,51 @@ local function fmtparams(p1, p2)
   return ""
 end
 
-local function measure(min_time, name, ...)
+local function measure(name, cycles, ...)
   array_init()
   rand_init(RANDOM_SEED)
   local run = benchmarks[name](...)
-  local cycles = 1
-  repeat
-    local tm = clock()
-    local flops = run(cycles, ...)
-    tm = clock() - tm
-    if tm >= min_time then
-      local res = flops / tm * 1.0e-6
-      local p1, p2 = ...
-      printf("%-7s %8.2f  %s\n", name, res, fmtparams(...))
-      return res
-    end
-    cycles = cycles * 2
-  until false
+  local flops = run(cycles, ...)
+  return flops
 end
 
-printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
-       SCIMARK_VERSION, SCIMARK_COPYRIGHT)
+-- printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
+--        SCIMARK_VERSION, SCIMARK_COPYRIGHT)
 
 while arg and arg[1] do
   local a = table.remove(arg, 1)
-  if a == "-noffi" then
+  if a == "noffi" then
     package.preload.ffi = nil
-  elseif a == "-small" then
+  elseif a == "small" then
     SIZE_SELECT = "small"
-  elseif a == "-large" then
+  elseif a == "large" then
     SIZE_SELECT = "large"
   elseif benchmarks[a] then
-    local p = benchmarks[SIZE_SELECT][a]
-    measure(MIN_TIME, a, tonumber(arg[1]) or p[1], tonumber(arg[2]) or p[2])
+    local cycles = benchmarks[SIZE_SELECT][a].cycles
+    local p = benchmarks[SIZE_SELECT][a].params
+    local b
+    b = {
+      name = a,
+      -- XXX: The description of tests for each function is too
+      -- inconvenient.
+      skip_check = true,
+      payload = function()
+        local flops = measure(a, cycles, tonumber(arg[1]) or p[1],
+                              tonumber(arg[2]) or p[2])
+        b.items = flops
+      end,
+    }
+    bench:add(b)
+    bench:run_and_report()
     return
   else
-    printf("Usage: scimark [-noffi] [-small|-large] [BENCH params...]\n\n")
-    printf("BENCH   -small         -large\n")
+    printf("Usage: scimark [noffi] [small|large] [BENCH params...]\n\n")
+    printf("BENCH   small         large\n")
     printf("---------------------------------------\n")
     for _,name in ipairs(benchmarks) do
       printf("%-7s %-13s %s\n", name,
-	     fmtparams(unpack(benchmarks.small[name])),
-	     fmtparams(unpack(benchmarks.large[name])))
+	     fmtparams(unpack(benchmarks.small[name].params)),
+	     fmtparams(unpack(benchmarks.large[name].params)))
     end
     printf("\n")
     os.exit(1)
@@ -393,8 +397,19 @@ end
 local params = benchmarks[SIZE_SELECT]
 local sum = 0
 for _,name in ipairs(benchmarks) do
-  sum = sum + measure(MIN_TIME, name, unpack(params[name]))
+  local cycles = params[name].cycles
+  local b
+  b = {
+    name = name,
+    -- XXX: The description of tests for each function is too
+    -- inconvenient.
+    skip_check = true,
+    payload = function()
+      local flops = measure(name, cycles, unpack(params[name].params))
+      b.items = flops
+    end,
+  }
+  bench:add(b)
 end
-printf("\nSciMark %8.2f  [%s problem sizes]\n", sum / #benchmarks, SIZE_SELECT)
-io.flush()
 
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (26 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 13:58   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches Sergey Kaplun via Tarantool-patches
                   ` (12 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This keeps the library from being picked up when the suite scans for
test files.
---
 perf/LuaJIT-benches/{ => libs}/scimark_lib.lua | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename perf/LuaJIT-benches/{ => libs}/scimark_lib.lua (100%)

diff --git a/perf/LuaJIT-benches/scimark_lib.lua b/perf/LuaJIT-benches/libs/scimark_lib.lua
similarity index 100%
rename from perf/LuaJIT-benches/scimark_lib.lua
rename to perf/LuaJIT-benches/libs/scimark_lib.lua
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (27 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:00   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu " Sergey Kaplun via Tarantool-patches
                   ` (11 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-fft.lua | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-fft.lua b/perf/LuaJIT-benches/scimark-fft.lua
index c05bb69a..96535774 100644
--- a/perf/LuaJIT-benches/scimark-fft.lua
+++ b/perf/LuaJIT-benches/scimark-fft.lua
@@ -1 +1,18 @@
-require("scimark_lib").FFT(1024)(tonumber(arg and arg[1]) or 50000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 50000
+local benchmark
+benchmark = {
+  name = "scimark_fft",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").FFT(1024)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (28 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
                     ` (2 more replies)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc " Sergey Kaplun via Tarantool-patches
                   ` (10 subsequent siblings)
  40 siblings, 3 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-lu.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-lu.lua b/perf/LuaJIT-benches/scimark-lu.lua
index 7636d994..4f521e0b 100644
--- a/perf/LuaJIT-benches/scimark-lu.lua
+++ b/perf/LuaJIT-benches/scimark-lu.lua
@@ -1 +1,19 @@
-require("scimark_lib").LU(100)(tonumber(arg and arg[1]) or 5000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 5000
+
+local benchmark
+benchmark = {
+  name = "scimark_lu",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").LU(100)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (29 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
                     ` (2 more replies)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor " Sergey Kaplun via Tarantool-patches
                   ` (9 subsequent siblings)
  40 siblings, 3 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adds the aforementioned test with the use of the benchmark
framework introduced before. The default arguments are adjusted
according to the number of cycles in the <scimark-2010-12-20.lua> file.
The arguments to the script can be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-mc.lua | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 perf/LuaJIT-benches/scimark-mc.lua

diff --git a/perf/LuaJIT-benches/scimark-mc.lua b/perf/LuaJIT-benches/scimark-mc.lua
new file mode 100644
index 00000000..d26b6e48
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-mc.lua
@@ -0,0 +1,19 @@
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 15e7
+
+local benchmark
+benchmark = {
+  name = "scimark_mc",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").MC()(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (30 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
                     ` (2 more replies)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse " Sergey Kaplun via Tarantool-patches
                   ` (8 subsequent siblings)
  40 siblings, 3 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-sor.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-sor.lua b/perf/LuaJIT-benches/scimark-sor.lua
index e537e986..9bcdb0ad 100644
--- a/perf/LuaJIT-benches/scimark-sor.lua
+++ b/perf/LuaJIT-benches/scimark-sor.lua
@@ -1 +1,19 @@
-require("scimark_lib").SOR(100)(tonumber(arg and arg[1]) or 50000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 50000
+
+local benchmark
+benchmark = {
+  name = "scimark_sor",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").SOR(100)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (31 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 10:50 ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
                     ` (2 more replies)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series " Sergey Kaplun via Tarantool-patches
                   ` (7 subsequent siblings)
  40 siblings, 3 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 10:50 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-sparse.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-sparse.lua b/perf/LuaJIT-benches/scimark-sparse.lua
index 01a2258d..a855cc22 100644
--- a/perf/LuaJIT-benches/scimark-sparse.lua
+++ b/perf/LuaJIT-benches/scimark-sparse.lua
@@ -1 +1,19 @@
-require("scimark_lib").SPARSE(1000, 5000)(tonumber(arg and arg[1]) or 150000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 150000
+
+local benchmark
+benchmark = {
+  name = "scimark_sparse",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").SPARSE(1000, 5000)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:01   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:07   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced before. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-lu.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-lu.lua b/perf/LuaJIT-benches/scimark-lu.lua
index 7636d994..4f521e0b 100644
--- a/perf/LuaJIT-benches/scimark-lu.lua
+++ b/perf/LuaJIT-benches/scimark-lu.lua
@@ -1 +1,19 @@
-require("scimark_lib").LU(100)(tonumber(arg and arg[1]) or 5000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 5000
+
+local benchmark
+benchmark = {
+  name = "scimark_lu",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").LU(100)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0



* [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:09   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adds the aforementioned test with the use of the benchmark
framework introduced before. The default arguments are adjusted
according to the number of cycles in the <scimark-2010-12-20.lua> file.
The arguments to the script can be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-mc.lua | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 perf/LuaJIT-benches/scimark-mc.lua

diff --git a/perf/LuaJIT-benches/scimark-mc.lua b/perf/LuaJIT-benches/scimark-mc.lua
new file mode 100644
index 00000000..d26b6e48
--- /dev/null
+++ b/perf/LuaJIT-benches/scimark-mc.lua
@@ -0,0 +1,19 @@
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 15e7
+
+local benchmark
+benchmark = {
+  name = "scimark_mc",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").MC()(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:11   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-sor.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-sor.lua b/perf/LuaJIT-benches/scimark-sor.lua
index e537e986..9bcdb0ad 100644
--- a/perf/LuaJIT-benches/scimark-sor.lua
+++ b/perf/LuaJIT-benches/scimark-sor.lua
@@ -1 +1,19 @@
-require("scimark_lib").SOR(100)(tonumber(arg and arg[1]) or 50000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 50000
+
+local benchmark
+benchmark = {
+  name = "scimark_sor",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").SOR(100)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:03   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:15   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

Checks are omitted since they were not present in the original suite,
plus the precise result value depends on the input parameter.
---
 perf/LuaJIT-benches/scimark-sparse.lua | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/perf/LuaJIT-benches/scimark-sparse.lua b/perf/LuaJIT-benches/scimark-sparse.lua
index 01a2258d..a855cc22 100644
--- a/perf/LuaJIT-benches/scimark-sparse.lua
+++ b/perf/LuaJIT-benches/scimark-sparse.lua
@@ -1 +1,19 @@
-require("scimark_lib").SPARSE(1000, 5000)(tonumber(arg and arg[1]) or 150000)
+local bench = require("bench").new(arg)
+
+local cycles = tonumber(arg and arg[1]) or 150000
+
+local benchmark
+benchmark = {
+  name = "scimark_sparse",
+  -- XXX: The description of tests for the function is too
+  -- inconvenient.
+  skip_check = true,
+  payload = function()
+    local flops = require("scimark_lib").SPARSE(1000, 5000)(cycles)
+    benchmark.items = flops
+  end,
+}
+
+bench:add(benchmark)
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (32 preceding siblings ...)
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:19   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm " Sergey Kaplun via Tarantool-patches
                   ` (6 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/series.lua | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
index f766cb32..3dc970c5 100644
--- a/perf/LuaJIT-benches/series.lua
+++ b/perf/LuaJIT-benches/series.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function integrate(x0, x1, nsteps, omegan, f)
   local x, dx = x0, (x1-x0)/nsteps
@@ -26,9 +27,16 @@ local function series(n)
 end
 
 local n = tonumber(arg and arg[1]) or 10000
-local tm = os.clock()
-local t = series(n)
-tm = os.clock() - tm
-assert(math.abs(t[1]-2.87295) < 0.00001)
-io.write(string.format("size %d, %.2f s, %.1f iterations/s\n",
-                       n, tm, (2*n-1)/tm))
+
+bench:add({
+  name = "series",
+  checker = function(res)
+    return math.abs(res[1]-2.87295) < 0.00001
+  end,
+  payload = function()
+    return series(n)
+  end,
+  items = 2 * n - 1,
+})
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (33 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:23   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file " Sergey Kaplun via Tarantool-patches
                   ` (5 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.
---
 perf/LuaJIT-benches/spectral-norm.lua | 40 +++++++++++++++++++--------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
index ecc80112..6e63cd47 100644
--- a/perf/LuaJIT-benches/spectral-norm.lua
+++ b/perf/LuaJIT-benches/spectral-norm.lua
@@ -1,3 +1,4 @@
+local bench = require("bench").new(arg)
 
 local function A(i, j)
   local ij = i+j-1
@@ -25,16 +26,33 @@ local function AtAv(x, y, t, N)
   Atv(t, y, N)
 end
 
-local N = tonumber(arg and arg[1]) or 100
-local u, v, t = {}, {}, {}
-for i=1,N do u[i] = 1 end
+local N = tonumber(arg and arg[1]) or 3000
 
-for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
+bench:add({
+  name = "spectral_norm",
+  checker = function(res)
+    -- XXX: Empirical value.
+    if N > 66 then
+      assert(math.abs(res - 1.27422) < 0.00001)
+    end
+    return true
+  end,
+  payload = function()
+    local u, v, t = {}, {}, {}
+    for i=1,N do u[i] = 1 end
 
-local vBv, vv = 0, 0
-for i=1,N do
-  local ui, vi = u[i], v[i]
-  vBv = vBv + ui*vi
-  vv = vv + vi*vi
-end
-io.write(string.format("%0.9f\n", math.sqrt(vBv / vv)))
+    for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
+
+    local vBv, vv = 0, 0
+    for i=1,N do
+      local ui, vi = u[i], v[i]
+      vBv = vBv + ui*vi
+      vv = vv + vi*vi
+    end
+    return math.sqrt(vBv / vv)
+  end,
+  -- Operations inside `for i=1,10` loop.
+  items = 40 * N * N,
+})
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file in LuaJIT-benches
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (34 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
  2025-12-23 10:44   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure Sergey Kaplun via Tarantool-patches
                   ` (4 subsequent siblings)
  40 siblings, 2 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adjusts the aforementioned test to use the benchmark
framework introduced earlier. The default arguments are adjusted
according to the <PARAM_x86.txt> file. The arguments to the script can
still be provided on the command line.

The input for the test is redirected from the generated file
<SUMCOL_5000.txt>. This file is the result of concatenating
<SUMCOL_1.txt> 5000 times.
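
The arithmetic behind the generated input can be sketched in shell. This
is only an illustration: the helpers `repeat_file` and `sum_lines` and
the four-line sample file are hypothetical stand-ins, not part of the
patch.

```sh
#!/bin/sh
# Hypothetical helpers illustrating how the benchmark input is built.

# repeat_file SRC COUNT DST: write COUNT copies of SRC into DST.
repeat_file() {
  : > "$3"
  i=0
  while [ "$i" -lt "$2" ]; do
    cat "$1" >> "$3"
    i=$((i + 1))
  done
}

# sum_lines FILE: add up one number per line, as the payload does.
sum_lines() {
  awk '{ s += $1 } END { print s + 0 }' "$1"
}

tmp=$(mktemp -d)
# A tiny stand-in for <SUMCOL_1.txt>; its lines sum to 5.
printf '1\n2\n-3\n5\n' > "$tmp/one.txt"
repeat_file "$tmp/one.txt" 3 "$tmp/three.txt"
# Three copies of a file summing to 5 give 15.
sum_lines "$tmp/three.txt"
rm -rf "$tmp"
```

With the real input, 5000 copies of a 1000-line file whose sum is 500
give 5e6 lines and a total of 2500000, which matches the `items` value
and the checker in the patch below.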
---
 perf/LuaJIT-benches/sum-file.lua | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
index c9e618fd..270c1865 100644
--- a/perf/LuaJIT-benches/sum-file.lua
+++ b/perf/LuaJIT-benches/sum-file.lua
@@ -1,6 +1,25 @@
+local bench = require("bench").new(arg)
 
-local sum = 0
-for line in io.lines() do
-  sum = sum + line
-end
-io.write(sum, "\n")
+-- XXX: The input file is generated from <SUMCOL_1.txt> by
+-- repeating it 5000 times. The <SUMCOL_1.txt> contains 1000 lines
+-- with the total sum of 500.
+bench:add({
+  name = "sum_file",
+  payload = function()
+    local sum = 0
+    for line in io.lines() do
+      sum = sum + line
+    end
+    -- Allow several iterations.
+    io.stdin:seek("set", 0)
+    return sum
+  end,
+  checker = function(res)
+    -- Precomputed result.
+    return res == 2500000
+  end,
+  -- Fixed size of the file.
+  items = 5e6,
+})
+
+bench:run_and_report()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (35 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file " Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-18 12:21   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics Sergey Kaplun via Tarantool-patches
                   ` (3 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This commit introduces CMake build scripts for the benches introduced
earlier. The benchmarks are enabled only if the `LUAJIT_ENABLE_PERF`
option is set. For each suite (LuaJIT-benches in this patch set), the
`AddBenchTarget()` macro generates 2 targets:
* Target to run all benches and store results in the
  perf/output/<suite_name> directory.
* Target to run all benches via CTest and inspect results in the
  console.

For the LuaJIT-benches there are 2 generated files:
* FASTA_5000000 -- is used as an input for <k-nucleotide.lua> and
                   <revcomp.lua>.
* SUMCOL_5000.txt -- is used as an input for <sum-file.lua>.

These files and the <perf/output> directory are added to the .gitignore
file.
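
A possible way to drive these targets from the command line is sketched
below. It assumes an out-of-source build directory named `build`
(hypothetical), LuaJIT being the top-level project, and CMake 3.13 or
newer for the `-S`/`-B` options. The `perf_*` function names are
illustrative only; the option and target names are the ones introduced
by this patch.

```sh
#!/bin/sh
# Illustrative wrappers; nothing runs until a function is called.

# Configure the build with the performance suites enabled.
perf_configure() {
  cmake -S . -B build -DLUAJIT_ENABLE_PERF=ON
}

# Run all LuaJIT-benches and store the JSON results under
# build/perf/output/LuaJIT-benches/.
perf_run() {
  cmake --build build --target LuaJIT-benches
}

# Run the same benches via CTest and print results to the console.
perf_run_console() {
  cmake --build build --target LuaJIT-benches-console
}
```

Call `perf_configure` once, then either `perf_run` or
`perf_run_console`.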
---
 .gitignore                         |  5 ++
 CMakeLists.txt                     | 11 ++++
 perf/CMakeLists.txt                | 99 ++++++++++++++++++++++++++++++
 perf/LuaJIT-benches/CMakeLists.txt | 52 ++++++++++++++++
 4 files changed, 167 insertions(+)
 create mode 100644 perf/CMakeLists.txt
 create mode 100644 perf/LuaJIT-benches/CMakeLists.txt

diff --git a/.gitignore b/.gitignore
index c26a7eb8..bfc7d401 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,3 +28,8 @@ luajit-parse-memprof
 luajit-parse-sysprof
 luajit.pc
 *.c_test
+
+# Generated by the performance tests.
+FASTA_5000000
+SUMCOL_5000.txt
+perf/output/
diff --git a/CMakeLists.txt b/CMakeLists.txt
index c0da4362..73f46835 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -464,6 +464,17 @@ if(LUAJIT_USE_TEST)
 endif()
 add_subdirectory(test)
 
+# --- Benchmarks source tree ---------------------------------------------------
+
+# The option to enable performance tests for the LuaJIT.
+# Disabled by default, since commonly it is used only by LuaJIT
+# developers and run in the CI with the specially set-up machine.
+option(LUAJIT_ENABLE_PERF "Generate <perf> target" OFF)
+
+if(LUAJIT_ENABLE_PERF)
+  add_subdirectory(perf)
+endif()
+
 # --- Misc rules ---------------------------------------------------------------
 
 # XXX: Implement <uninstall> target using the following recipe:
diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
new file mode 100644
index 00000000..cc3c312f
--- /dev/null
+++ b/perf/CMakeLists.txt
@@ -0,0 +1,99 @@
+# Running various bench suites against LuaJIT.
+
+include(MakeLuaPath)
+
+if(CMAKE_BUILD_TYPE STREQUAL "Debug")
+  message(WARNING "LuaJIT and perf tests are built in the Debug mode. "
+                  "Timings may be affected.")
+endif()
+
+set(PERF_OUTPUT_DIR ${PROJECT_BINARY_DIR}/perf/output)
+file(MAKE_DIRECTORY ${PERF_OUTPUT_DIR})
+
+# List of paths that will be used for each suite.
+make_lua_path(LUA_PATH_BENCH_BASE
+  PATHS
+    # Use of the bench module.
+    ${CMAKE_CURRENT_SOURCE_DIR}/utils/?.lua
+    # Simple usage with `jit.dump()`, etc.
+    ${LUAJIT_SOURCE_DIR}/?.lua
+    ${LUAJIT_BINARY_DIR}/?.lua
+)
+
+make_lua_path(LUA_CPATH
+  PATHS
+    # XXX: Some arches may have installed the cjson module here.
+    /usr/lib64/lua/5.1/?.so
+)
+
+# Produce the pair:
+# Target to run for reporting and target to inspect from the
+# console, runnable by the CTest.
+macro(AddBenchTarget perf_suite)
+  file(MAKE_DIRECTORY "${PERF_OUTPUT_DIR}/${perf_suite}/")
+  message(STATUS "Add perf suite ${perf_suite}")
+  add_custom_target(${perf_suite})
+  add_custom_target(${perf_suite}-console
+    COMMAND ${CMAKE_CTEST_COMMAND}
+      -L ${perf_suite}
+      --parallel 1
+      --verbose
+      --output-on-failure
+      --no-tests=error
+  )
+  add_dependencies(${perf_suite}-console luajit-main)
+endmacro()
+
+# Add the bench to the pair of targets created by the call above.
+macro(AddBench bench_name bench_path perf_suite LUA_PATH)
+  set(bench_title "perf/${perf_suite}/${bench_name}")
+  get_filename_component(bench_name_stripped  ${bench_name} NAME_WE)
+  set(bench_out_file
+    ${PERF_OUTPUT_DIR}/${perf_suite}/${bench_name_stripped}.json
+  )
+  set(bench_command "${LUAJIT_BINARY} ${bench_path}")
+  if(${ARGC} GREATER 4)
+    set(input_file ${ARGV4})
+    set(bench_command "${bench_command} < ${input_file}")
+  endif()
+  set(BENCH_FLAGS
+    "--benchmark_out_format=json --benchmark_out=${bench_out_file}"
+  )
+  set(bench_command_flags ${bench_command} ${BENCH_FLAGS})
+  separate_arguments(bench_command_separated UNIX_COMMAND ${bench_command})
+  add_custom_command(
+    COMMAND ${CMAKE_COMMAND} -E env
+      LUA_PATH="${LUA_PATH}"
+      LUA_CPATH="${LUA_CPATH}"
+        ${bench_command_separated}
+          --benchmark_out_format=json
+          --benchmark_out="${bench_out_file}"
+    OUTPUT ${bench_out_file}
+    DEPENDS luajit-main
+    COMMENT
+      "Running benchmark ${bench_title} saving results in ${bench_out_file}."
+  )
+  add_custom_target(${bench_name} DEPENDS ${bench_out_file})
+  add_dependencies(${perf_suite} ${bench_name})
+
+  # Report in the console.
+  add_test(NAME ${bench_title}
+    COMMAND sh -c "${bench_command}"
+  )
+  set_tests_properties(${bench_title} PROPERTIES
+    ENVIRONMENT "LUA_PATH=${LUA_PATH}"
+    LABELS ${perf_suite}
+    DEPENDS luajit-main
+  )
+  unset(input_file)
+endmacro()
+
+add_subdirectory(LuaJIT-benches)
+
+add_custom_target(${PROJECT_NAME}-perf
+  DEPENDS LuaJIT-benches
+)
+
+add_custom_target(${PROJECT_NAME}-perf-console
+  DEPENDS LuaJIT-benches-console
+)
diff --git a/perf/LuaJIT-benches/CMakeLists.txt b/perf/LuaJIT-benches/CMakeLists.txt
new file mode 100644
index 00000000..d9909f36
--- /dev/null
+++ b/perf/LuaJIT-benches/CMakeLists.txt
@@ -0,0 +1,52 @@
+set(PERF_SUITE_NAME LuaJIT-benches)
+set(LUA_BENCH_SUFFIX .lua)
+
+AddBenchTarget(${PERF_SUITE_NAME})
+
+# Input for the k-nucleotide and revcomp benchmarks.
+set(FASTA_NAME ${CMAKE_CURRENT_BINARY_DIR}/FASTA_5000000)
+add_custom_target(FASTA_5000000
+  COMMAND ${LUAJIT_BINARY}
+    ${CMAKE_CURRENT_SOURCE_DIR}/libs/fasta.lua 5000000 > ${FASTA_NAME}
+  BYPRODUCTS ${FASTA_NAME}
+  DEPENDS luajit-main
+  COMMENT "Generate ${FASTA_NAME}."
+)
+
+make_lua_path(LUA_PATH
+  PATHS
+    ${LUA_PATH_BENCH_BASE}
+    ${CMAKE_CURRENT_SOURCE_DIR}/libs/?.lua
+)
+
+# Input for the <sum-file.lua> benchmark.
+set(SUM_NAME ${CMAKE_CURRENT_BINARY_DIR}/SUMCOL_5000.txt)
+# Remove possibly existing file.
+file(REMOVE ${SUM_NAME})
+
+set(SUMCOL_FILE ${CMAKE_CURRENT_SOURCE_DIR}/SUMCOL_1.txt)
+file(READ ${SUMCOL_FILE} SUMCOL_CONTENT)
+foreach(_unused RANGE 4999)
+  file(APPEND ${SUM_NAME} "${SUMCOL_CONTENT}")
+endforeach()
+
+file(GLOB benches "${CMAKE_CURRENT_SOURCE_DIR}/*${LUA_BENCH_SUFFIX}")
+foreach(bench_path ${benches})
+  file(RELATIVE_PATH bench_name ${CMAKE_CURRENT_SOURCE_DIR} ${bench_path})
+  set(bench_title "perf/${PERF_SUITE_NAME}/${bench_name}")
+  if(bench_name MATCHES "k-nucleotide" OR bench_name MATCHES "revcomp")
+    AddBench(${bench_name}
+      ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}" ${FASTA_NAME}
+    )
+    add_dependencies(${bench_name} FASTA_5000000)
+  elseif(bench_name MATCHES "sum-file")
+    AddBench(${bench_name}
+      ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}" ${SUM_NAME}
+    )
+  else()
+    AddBench(${bench_name} ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}")
+  endif()
+endforeach()
+
+# We need to generate the file before we run tests.
+add_dependencies(${PERF_SUITE_NAME}-console FASTA_5000000)
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (36 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-18 12:31   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup Sergey Kaplun via Tarantool-patches
                   ` (2 subsequent siblings)
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adds a helper script that aggregates the benchmark results
from JSON files into the InfluxDB line protocol format [1].

All JSON files from each suite in the <perf/output> directory are
treated as benchmark results and aggregated into the
<perf/output/summary.txt> file that can be posted to InfluxDB. The
results are aggregated via the new target LuaJIT-perf-aggregate.

[1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
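
For illustration, the shape of one aggregated record can be sketched in
shell. The helpers `influx_kv` and `influx_line` below are hypothetical
and only mirror the line protocol layout
`<measurement>,<tag_set> <field_set> <timestamp>`; the measurement, tag,
and field values are made up, and the real aggregation is done by the
Lua script in this patch.

```sh
#!/bin/sh
# Hypothetical helpers; all values below are made up.

# influx_kv KEY=VALUE...: join the arguments with commas.
influx_kv() {
  out=""
  for kv in "$@"; do
    out="${out:+$out,}$kv"
  done
  printf '%s' "$out"
}

# influx_line MEASUREMENT TAG_SET FIELD_SET TIMESTAMP
influx_line() {
  printf '%s,%s %s %d\n' "$1" "$2" "$3" "$4"
}

tags=$(influx_kv branch=master suite=LuaJIT-benches name=fasta)
fields=$(influx_kv 'commit="abc1234"' real_time=1.5 iterations=10)
influx_line fasta "$tags" "$fields" 1761303000
```

The timestamp here is in seconds, which is what `os.time()` yields in
the script. InfluxDB assumes nanoseconds unless a precision is given on
write, so the upload step presumably has to request second precision.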
---
 perf/CMakeLists.txt        |  13 ++++
 perf/helpers/aggregate.lua | 124 +++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)
 create mode 100644 perf/helpers/aggregate.lua

diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
index cc3c312f..68e561fd 100644
--- a/perf/CMakeLists.txt
+++ b/perf/CMakeLists.txt
@@ -97,3 +97,16 @@ add_custom_target(${PROJECT_NAME}-perf
 add_custom_target(${PROJECT_NAME}-perf-console
   DEPENDS LuaJIT-benches-console
 )
+
+set(PERF_SUMMARY ${PERF_OUTPUT_DIR}/summary.txt)
+add_custom_target(${PROJECT_NAME}-perf-aggregate
+  BYPRODUCTS ${PERF_SUMMARY}
+  COMMENT "Aggregate performance test results into ${PERF_SUMMARY}"
+  COMMAND ${CMAKE_COMMAND} -E env
+    LUA_CPATH="${LUA_CPATH}"
+      ${LUAJIT_BINARY} ${CMAKE_CURRENT_SOURCE_DIR}/helpers/aggregate.lua
+        ${PERF_SUMMARY}
+        ${PERF_OUTPUT_DIR}
+  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
+  DEPENDS luajit-main
+)
diff --git a/perf/helpers/aggregate.lua b/perf/helpers/aggregate.lua
new file mode 100644
index 00000000..12a8ab89
--- /dev/null
+++ b/perf/helpers/aggregate.lua
@@ -0,0 +1,124 @@
+local json = require('cjson')
+
+-- File to aggregate the benchmark results from JSON files to the
+-- format parsable by the InfluxDB line protocol [1]:
+-- <measurement>,<tag_set> <field_set> <timestamp>
+--
+-- <tag_set> and <field_set> have the following format:
+-- <key1>=<value1>,<key2>=<value2>
+--
+-- The reported tag set is a set of values that can be used for
+-- filtering data (i.e., branch or benchmark name).
+--
+-- luacheck: push no max comment line length
+--
+-- [1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
+--
+-- luacheck: pop
+
+local output = assert(arg[1], 'Output file is required as the first argument')
+local input_dir = arg[2] or '.'
+
+local out_fh = assert(io.open(output, 'w+'))
+
+local function exec(cmd)
+  return io.popen(cmd):read('*all'):gsub('%s+$', '')
+end
+
+local commit = os.getenv('PERF_COMMIT') or exec('git rev-parse --short HEAD')
+assert(commit, 'can not determine the commit')
+
+local branch = os.getenv('PERF_BRANCH') or
+  exec('git rev-parse --abbrev-ref HEAD')
+assert(branch, 'can not determine the branch')
+
+-- Not very robust, but OK for our needs.
+local function listdir(path)
+  local handle = io.popen('ls -1 ' .. path)
+
+  local files = {}
+  for file in handle:lines() do
+    table.insert(files, file)
+  end
+
+  return files
+end
+
+local tag_set = {branch = branch}
+
+local function table_plain_copy(src)
+  local dst = {}
+  for k, v in pairs(src) do
+    dst[k] = v
+  end
+  return dst
+end
+
+local function read_all(file)
+  local fh = assert(io.open(file, 'rb'))
+  local content = fh:read('*all')
+  fh:close()
+  return content
+end
+
+local REPORTED_FIELDS = {
+  'cpu_time',
+  'items_per_second',
+  'iterations',
+  'real_time',
+}
+
+local function influx_kv(tab)
+  local kv_string = {}
+  for k, v in pairs(tab) do
+    table.insert(kv_string, ('%s=%s'):format(k, v))
+  end
+  return table.concat(kv_string, ',')
+end
+
+local time = os.time()
+local function influx_line(measurement, tags, fields)
+  return ('%s,%s %s %d\n'):format(measurement, influx_kv(tags),
+          influx_kv(fields), time)
+end
+
+for _, suite_name in pairs(listdir(input_dir)) do
+  -- May list the report file, but will be ignored by the
+  -- condition below.
+  local suite_dir = ('%s/%s'):format(input_dir, suite_name)
+  for _, file in pairs(listdir(suite_dir)) do
+    -- Skip files in which we are not interested.
+    if not file:match('%.json$') then goto continue end
+
+    local data = read_all(('%s/%s'):format(suite_dir, file))
+    local bench_name = file:match('([^/]+)%.json')
+    local bench_data = json.decode(data)
+    local benchmarks = bench_data.benchmarks
+    local arch = bench_data.context.arch
+    local gc64 = bench_data.context.gc64
+    local jit = bench_data.context.jit
+
+    for _, bench in ipairs(benchmarks) do
+      local full_tag_set = table_plain_copy(tag_set)
+      full_tag_set.name = bench.name
+      full_tag_set.suite = suite_name
+      full_tag_set.arch = arch
+      full_tag_set.gc64 = gc64
+      full_tag_set.jit = jit
+
+      -- Save the commit as a field, since we don't want to filter
+      -- benchmarks by the commit (one point of data).
+      local field_set = {commit = ('"%s"'):format(commit)}
+
+      for _, field in ipairs(REPORTED_FIELDS) do
+        field_set[field] = bench[field]
+      end
+
+      local line = influx_line(bench_name, full_tag_set, field_set)
+      out_fh:write(line)
+    end
+    ::continue::
+  end
+end
+
+out_fh:close()
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (37 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-18 12:36   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow Sergey Kaplun via Tarantool-patches
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

The patch adds a script for setting up the environment before running
performance tests. Most of the settings are taken from Tarantool's
wiki page dedicated to benchmarking [1].

[1]: https://github.com/tarantool/tarantool/wiki/Benchmarking
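
Before running the script as root, the current state of a few of these
settings can be inspected without changing anything. The sketch below
is a hypothetical read-only helper (`check_setting` is not part of the
patch). It reuses the script's CHECKED/WARNING/SKIPPED vocabulary and
only handles files that contain a single plain value.

```sh
#!/bin/sh
# check_setting DESC FILE EXPECTED: compare FILE's content with
# EXPECTED and report the status without modifying anything.
check_setting() {
  if [ -f "$2" ]; then
    if [ "$(cat "$2")" = "$3" ]; then
      status="CHECKED"
    else
      status="WARNING"
    fi
  else
    status="SKIPPED"
  fi
  echo "$1 $status"
}

check_setting "ASLR is disabled" /proc/sys/kernel/randomize_va_space 0
check_setting "SMT is inactive" /sys/devices/system/cpu/smt/active 0
check_setting "Intel turbo is off" \
  /sys/devices/system/cpu/intel_pstate/no_turbo 1
```

Files such as transparent_hugepage/enabled list all modes and bracket
the active one, so they would need a pattern match rather than an
equality test.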
---
 perf/helpers/setup_env.sh | 135 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 135 insertions(+)
 create mode 100755 perf/helpers/setup_env.sh

diff --git a/perf/helpers/setup_env.sh b/perf/helpers/setup_env.sh
new file mode 100755
index 00000000..043d3c88
--- /dev/null
+++ b/perf/helpers/setup_env.sh
@@ -0,0 +1,135 @@
+#!/bin/sh
+
+# The script sets up a Linux operating system before running
+# LuaJIT benchmarks. See more details in [1].
+#
+# [1]: https://github.com/tarantool/tarantool/wiki/Benchmarking
+
+set -eu
+
+uid=$(id -u)
+if [ "$uid" -ne 0 ]
+  then echo "Please run as root."
+  exit 1
+fi
+
+###
+# Helpers.
+###
+
+cpu_vendor="unknown"
+cpuinfo_vendor=$(awk '/vendor_id/{ print $3; exit }' < /proc/cpuinfo)
+if [ "$cpuinfo_vendor" = "GenuineIntel" ]; then
+  cpu_vendor="intel"
+elif [ "$cpuinfo_vendor" = "AuthenticAMD" ]; then
+  cpu_vendor="amd"
+else
+  echo "Unknown CPU vendor '$cpuinfo_vendor'"
+  exit 1
+fi
+
+FAILURE_MSG="WARNING"
+SUCCESS_MSG="CHECKED"
+SKIPPED_MSG="SKIPPED"
+
+set_kernel_setting() {
+  desc_msg="$1"
+  file_path="$2"
+  value="$3"
+
+  if [ -f "$file_path" ]; then
+    sh -c "echo $value > $file_path" && status="$SUCCESS_MSG" || status="$FAILURE_MSG"
+  else
+    status="$SKIPPED_MSG"
+  fi
+  echo "$desc_msg $status"
+}
+
+kernel_setting_is_nonzero() {
+  desc_msg="$1"
+  file_path="$2"
+  hint_msg="$3"
+
+  if [ -f "$file_path" ]; then
+    value=$(cat "$file_path")
+    if [ -n "$value" ]; then
+      status="$SUCCESS_MSG"
+    else
+      status="$FAILURE_MSG (hint: $hint_msg)"
+    fi
+  else
+    status="$SKIPPED_MSG"
+  fi
+  echo "$desc_msg $status"
+}
+
+###
+# Kernel command line parameters.
+###
+
+desc_msg="Disable AMD SMT or Intel Hyperthreading "
+sysfs_path="/sys/devices/system/cpu/smt/active"
+if [ -f "$sysfs_path" ]; then
+  is_set=$(cat $sysfs_path)
+  err_msg="$FAILURE_MSG (hint: set 'nosmt' kernel parameter)"
+  [ "$is_set" = 0 ] && status="$SUCCESS_MSG" || status="$err_msg"
+else
+  status="$SKIPPED_MSG"
+fi
+echo "$desc_msg $status"
+
+kernel_setting_is_nonzero \
+  "Isolate CPUs for benchmarking" \
+  "/sys/devices/system/cpu/isolated" \
+  "set 'isolcpus' kernel parameter"
+
+kernel_setting_is_nonzero \
+  "Offload interrupts from the isolated CPUs" \
+  "/proc/irq/default_smp_affinity" \
+  "set 'irqaffinity' kernel parameter"
+
+kernel_setting_is_nonzero \
+  "Disable scheduler ticks on single-task isolated CPUs" \
+  "/sys/devices/system/cpu/nohz_full" \
+  "set 'nohz_full' kernel parameter"
+
+set_kernel_setting \
+  "Disable transparent huge pages" \
+  "/sys/kernel/mm/transparent_hugepage/enabled" \
+  "never"
+
+set_kernel_setting \
+  "Disable direct compaction of transparent huge pages" \
+  "/sys/kernel/mm/transparent_hugepage/defrag" \
+  "never"
+
+# Disable ASLR for the repeatable LuaJIT behaviour.
+set_kernel_setting \
+  "Disable ASLR" \
+  "/proc/sys/kernel/randomize_va_space" \
+  "0"
+
+###
+# System tuning.
+###
+
+if [ "$cpu_vendor" = "amd" ]; then
+  sysfs_path="/sys/devices/system/cpu/cpufreq/boost"
+  value=0
+elif [ "$cpu_vendor" = "intel" ]; then
+  sysfs_path="/sys/devices/system/cpu/intel_pstate/no_turbo"
+  value=1
+fi
+set_kernel_setting \
+  "Disable TurboBoost" \
+  "$sysfs_path" \
+  "$value"
+
+ncpu=$(getconf _NPROCESSORS_ONLN)
+for cpu_id in $(seq 0 1 $((ncpu-1))); do
+  sysfs_path_cpu="/sys/devices/system/cpu/cpu$cpu_id/cpufreq/scaling_governor"
+  set_kernel_setting \
+    "Stabilize the frequency of CPU $cpu_id" \
+    "$sysfs_path_cpu" \
+    "performance"
+done
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (38 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-18 12:51   ` Sergey Bronnikov via Tarantool-patches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow Sergey Kaplun via Tarantool-patches
  40 siblings, 1 reply; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch introduces the `LUAJIT_BENCH_INIT` option, which specifies
a shell command to be prepended to each benchmark invocation. It may be
useful for pinning the benchmark to a CPU with taskset, etc.
---
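To illustrate how the option ends up in the benchmark command line, here is a small shell sketch that mirrors the string concatenation in the `AddBench` macro. `compose_bench_command` is a hypothetical helper, and `./luajit`, `fasta.lua`, and `sum-file.lua` are placeholder arguments; only the concatenation order comes from the diff below.

```shell
# Hypothetical helper (not part of the patch) that mirrors the
# string concatenation done in the AddBench macro:
# "<init> <luajit binary> <bench path>[ < <input file>]".
compose_bench_command() {
  bench_init=$1
  luajit_binary=$2
  bench_path=$3
  input_file=${4:-}
  bench_command="$bench_init $luajit_binary $bench_path"
  if [ -n "$input_file" ]; then
    bench_command="$bench_command < $input_file"
  fi
  printf '%s\n' "$bench_command"
}

# With the init command used by the CI workflow.
compose_bench_command "taskset 0xfe chrt 50" ./luajit fasta.lua
# With the option unset and an input file redirected to stdin.
compose_bench_command "" ./luajit sum-file.lua SUMCOL_1.txt
```

When the option is empty, the composed command starts with a single space, which a shell ignores when it runs the command.
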
 perf/CMakeLists.txt | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
index 68e561fd..c315597f 100644
--- a/perf/CMakeLists.txt
+++ b/perf/CMakeLists.txt
@@ -7,6 +7,13 @@ if(CMAKE_BUILD_TYPE STREQUAL "Debug")
                   "Timings may be affected.")
 endif()
 
+# The shell command needs to be run before benchmarks are started.
+if(LUAJIT_BENCH_INIT)
+  message(STATUS
+    "The following command will run before benchmarks: '${LUAJIT_BENCH_INIT}'."
+  )
+endif()
+
 set(PERF_OUTPUT_DIR ${PROJECT_BINARY_DIR}/perf/output)
 file(MAKE_DIRECTORY ${PERF_OUTPUT_DIR})
 
@@ -51,7 +58,7 @@ macro(AddBench bench_name bench_path perf_suite LUA_PATH)
   set(bench_out_file
     ${PERF_OUTPUT_DIR}/${perf_suite}/${bench_name_stripped}.json
   )
-  set(bench_command "${LUAJIT_BINARY} ${bench_path}")
+  set(bench_command "${LUAJIT_BENCH_INIT} ${LUAJIT_BINARY} ${bench_path}")
   if(${ARGC} GREATER 4)
     set(input_file ${ARGV4})
     set(bench_command "${bench_command} < ${input_file}")
-- 
2.51.0


^ permalink raw reply	[flat|nested] 134+ messages in thread

* [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow
  2025-10-24 10:50 [Tarantool-patches] [PATCH v1 luajit 00/41] LuaJIT performance testing Sergey Kaplun via Tarantool-patches
                   ` (39 preceding siblings ...)
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:00 ` Sergey Kaplun via Tarantool-patches
  2025-11-18 13:08   ` Sergey Bronnikov via Tarantool-patches
  2025-11-18 13:13   ` Sergey Bronnikov via Tarantool-patches
  40 siblings, 2 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:00 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

This patch adds the workflow to run benchmarks from various suites,
aggregate their results, and send statistics to the InfluxDB to be
processed later.

The workflow contains a matrix to measure GC64 and non-GC64 modes with
enabled/disabled JIT for x64 architecture.
---
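For reference, the matrix described above expands to four sequential jobs. The sketch below enumerates them with the CMake flags that the workflow passes; `print_matrix_flags` is a hypothetical name used only for this illustration. Note that `JOFF=ON` means the JIT is disabled.

```shell
# Hypothetical helper: print the CMake flags for every
# GC64 x JOFF combination of the matrix, one job per line.
print_matrix_flags() {
  for gc64 in ON OFF; do
    for joff in ON OFF; do
      printf '%s %s\n' \
        "-DLUAJIT_ENABLE_GC64=$gc64" \
        "-DLUAJIT_DISABLE_JIT=$joff"
    done
  done
}

print_matrix_flags
```
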
 .github/actions/setup-performance/README.md  |  10 ++
 .github/actions/setup-performance/action.yml |  18 +++
 .github/workflows/performance.yml            | 110 +++++++++++++++++++
 3 files changed, 138 insertions(+)
 create mode 100644 .github/actions/setup-performance/README.md
 create mode 100644 .github/actions/setup-performance/action.yml
 create mode 100644 .github/workflows/performance.yml

diff --git a/.github/actions/setup-performance/README.md b/.github/actions/setup-performance/README.md
new file mode 100644
index 00000000..4c4bbdab
--- /dev/null
+++ b/.github/actions/setup-performance/README.md
@@ -0,0 +1,10 @@
+# Setup performance
+
+This action sets up the performance environment on Linux runners.
+
+## How to use the GitHub Action from a GitHub workflow
+
+Add the following code to the running steps before LuaJIT configuration:
+```
+- uses: ./.github/actions/setup-performance
+```
diff --git a/.github/actions/setup-performance/action.yml b/.github/actions/setup-performance/action.yml
new file mode 100644
index 00000000..24d07440
--- /dev/null
+++ b/.github/actions/setup-performance/action.yml
@@ -0,0 +1,18 @@
+name: Setup performance
+description: The Linux machine setup for running LuaJIT benchmarks
+runs:
+  using: composite
+  steps:
+    - name: Setup CI environment (Linux)
+      uses: ./.github/actions/setup-linux
+    - name: Install dependencies for the LuaJIT benchmarks
+      run: |
+        apt -y update
+        apt install -y luarocks curl
+      shell: bash
+    - name: Install Lua modules
+      run: luarocks install lua-cjson
+      shell: bash
+    - name: Run script to setup Linux environment
+      run: sh ./perf/helpers/setup_env.sh
+      shell: bash
diff --git a/.github/workflows/performance.yml b/.github/workflows/performance.yml
new file mode 100644
index 00000000..bfb6be97
--- /dev/null
+++ b/.github/workflows/performance.yml
@@ -0,0 +1,110 @@
+name: Performance
+
+on:
+  push:
+    branches-ignore:
+      - '**-noperf'
+      - 'tarantool/release/**'
+      - 'upstream-**'
+    tags-ignore:
+      - '**'
+  schedule:
+    # Once a day at 03:00 to avoid clashing with runs for the
+    # Tarantool benchmarks at midnight.
+    - cron: '0 3 * * *'
+
+concurrency:
+  # An update of a developer branch cancels the previously
+  # scheduled workflow run for this branch. However, the default
+  # branch, and long-term branch (tarantool/release/2.11,
+  # tarantool/release/2.10, etc) workflow runs are never canceled.
+  #
+  # We use a trick here: define the concurrency group as 'workflow
+  # run ID' + 'workflow run attempt' because it is a unique
+  # combination for any run. So it effectively discards grouping.
+  #
+  # XXX: we cannot use `github.sha` as a unique identifier because
+  # pushing a tag may cancel a run that works on a branch push
+  # event.
+  group: ${{ startsWith(github.ref, 'refs/heads/tarantool/')
+    && format('{0}-{1}', github.run_id, github.run_attempt)
+    || format('{0}-{1}', github.workflow, github.ref) }}
+  cancel-in-progress: true
+
+jobs:
+  performance-luajit:
+    # The 'performance' label _must_ be set only for the single
+    # runner to guarantee that results are not dependent on the
+    # machine.
+    runs-on:
+      - self-hosted
+      - Linux
+      - x86_64
+      - 'performance'
+
+    env:
+      PERF_BRANCH: ${{ github.ref_name }}
+      PERF_COMMIT: ${{ github.sha }}
+
+    strategy:
+      fail-fast: false
+      matrix:
+        GC64: [ON, OFF]
+        JOFF: [ON, OFF]
+      # Run each job sequentially.
+      max-parallel: 1
+    name: >
+      LuaJIT
+      GC64:${{ matrix.GC64 }}
+      JOFF:${{ matrix.JOFF }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          submodules: recursive
+      - name: setup performance environment
+        uses: ./.github/actions/setup-performance
+      - name: configure
+        # The taskset alone will pin all the process threads
+        # into a single (random) isolated CPU, see
+        # https://bugzilla.kernel.org/show_bug.cgi?id=116701.
+        # The workaround is using realtime scheduler for the
+        # isolated task using chrt, e. g.:
+        # sudo taskset 0xef chrt 50.
+        # But this makes the process use the non-standard,
+        # real-time round-robin scheduling mechanism.
+        run: >
+          cmake -S . -B ${{ env.BUILDDIR }}
+          -DCMAKE_BUILD_TYPE=RelWithDebInfo
+          -DLUAJIT_ENABLE_PERF=ON
+          -DLUAJIT_BENCH_INIT="taskset 0xfe chrt 50"
+          -DLUAJIT_DISABLE_JIT=${{ matrix.JOFF }}
+          -DLUAJIT_ENABLE_GC64=${{ matrix.GC64 }}
+      - name: build
+        run: cmake --build . --parallel
+        working-directory: ${{ env.BUILDDIR }}
+      - name: perf
+        run: make LuaJIT-perf
+        working-directory: ${{ env.BUILDDIR }}
+      - name: aggregate benchmark results
+        run: make LuaJIT-perf-aggregate
+        working-directory: ${{ env.BUILDDIR }}
+      - name: send statistics to InfluxDB
+        # --silent -o /dev/null: Prevent dumping any reply part
+        # in the output in case of an error.
+        # --fail: Exit with the 22 error code if status >= 400.
+        # --write-out: See the reason for the failure, if any.
+        # --retry, --retry-delay: To avoid losing the results
+        # after such a long job, retry sending them if the
+        # first attempt fails.
+        run: >
+          curl --request POST
+          "${{ secrets.INFLUXDB_URL }}/api/v2/write?org=tarantool&bucket=luajit-performance&precision=s"
+          --write-out "%{http_code}"
+          --retry 5
+          --retry-delay 5
+          --connect-timeout 120
+          --fail --silent -o /dev/null
+          --header "Authorization: Token ${{ secrets.INFLUXDB_TOKEN }}"
+          --data-binary @./perf/output/summary.txt
+        working-directory: ${{ env.BUILDDIR }}
-- 
2.51.0
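
A note on the affinity masks: the comment in the `configure` step mentions `0xef`, while the command itself uses `0xfe`, and the two select different CPU sets. The sketch below decodes a `taskset` mask into CPU ids (bit N set means CPU N is allowed). `mask_to_cpus` is a hypothetical helper written for this illustration, not something from the patch.

```shell
# Hypothetical helper: list the CPU ids selected by a taskset
# affinity mask. The lowest bit corresponds to CPU 0.
mask_to_cpus() {
  mask=$(($1))
  cpu=0
  cpus=""
  while [ "$mask" -gt 0 ]; do
    if [ $((mask & 1)) -eq 1 ]; then
      # Append the id, separated by a space after the first one.
      cpus="${cpus:+$cpus }$cpu"
    fi
    mask=$((mask >> 1))
    cpu=$((cpu + 1))
  done
  printf '%s\n' "$cpus"
}

mask_to_cpus 0xfe  # prints "1 2 3 4 5 6 7"
mask_to_cpus 0xef  # prints "0 1 2 3 5 6 7"
```

So `0xfe` excludes only CPU 0, whereas `0xef` excludes only CPU 4.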


^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:01   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:07   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:01 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

The second copy of the letter was sent by mistake. Feel free to ignore
it.

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:09   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:02 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

The second copy of the letter was sent by mistake. Feel free to ignore
it.

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:11   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:02 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

The second copy of the letter was sent by mistake. Feel free to ignore
it.

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
@ 2025-10-24 11:03   ` Sergey Kaplun via Tarantool-patches
  2025-11-17 14:15   ` Sergey Bronnikov via Tarantool-patches
  2 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-10-24 11:03 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

The second copy of the letter was sent by mistake. Feel free to ignore
it.

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite Sergey Kaplun via Tarantool-patches
@ 2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:04     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-11 14:28 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 22688 bytes --]

Hi, Sergey,

thanks for the patch!

This is a big step forward for LuaJIT performance testing.

Please take a look at the comments below.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch introduces the LuaJIT-test-cleanup bench suite [1] into our
s/bench/benchmark/
> LuaJIT fork source tree. To provide relatable reprodusible results

did not get it: "relatable"

s/reprodusible/reproducible/

> several benchmarks need to be adjusted. However, to be sure we initially use
> the valid suite, everything in the <perf/LuaJIT-benches> directory is
> moved intact.
>
> [1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/014708b/bench
> ---
>   .luacheckrc                                |    1 +
>   perf/LuaJIT-benches/PARAM_arm.txt          |   29 +
>   perf/LuaJIT-benches/PARAM_mips.txt         |   29 +
>   perf/LuaJIT-benches/PARAM_ppc.txt          |   29 +
>   perf/LuaJIT-benches/PARAM_x86.txt          |   29 +
>   perf/LuaJIT-benches/SUMCOL_1.txt           | 1000 ++++++++++++++++++++
>   perf/LuaJIT-benches/TEST_md5sum.txt        |   20 +
>   perf/LuaJIT-benches/array3d.lua            |   59 ++
>   perf/LuaJIT-benches/binary-trees.lua       |   47 +
>   perf/LuaJIT-benches/chameneos.lua          |   68 ++
>   perf/LuaJIT-benches/coroutine-ring.lua     |   42 +
>   perf/LuaJIT-benches/euler14-bit.lua        |   22 +
>   perf/LuaJIT-benches/fannkuch.lua           |   50 +
>   perf/LuaJIT-benches/fasta.lua              |   95 ++
>   perf/LuaJIT-benches/k-nucleotide.lua       |   58 ++
>   perf/LuaJIT-benches/life.lua               |  111 +++
>   perf/LuaJIT-benches/mandelbrot-bit.lua     |   33 +
>   perf/LuaJIT-benches/mandelbrot.lua         |   23 +
>   perf/LuaJIT-benches/md5.lua                |  183 ++++
>   perf/LuaJIT-benches/meteor.lua             |  220 +++++
>   perf/LuaJIT-benches/nbody.lua              |  119 +++
>   perf/LuaJIT-benches/nsieve-bit-fp.lua      |   37 +
>   perf/LuaJIT-benches/nsieve-bit.lua         |   27 +
>   perf/LuaJIT-benches/nsieve.lua             |   21 +
>   perf/LuaJIT-benches/partialsums.lua        |   29 +
>   perf/LuaJIT-benches/pidigits-nogmp.lua     |  100 ++
>   perf/LuaJIT-benches/ray.lua                |  135 +++
>   perf/LuaJIT-benches/recursive-ack.lua      |    8 +
>   perf/LuaJIT-benches/recursive-fib.lua      |    7 +
>   perf/LuaJIT-benches/revcomp.lua            |   37 +
>   perf/LuaJIT-benches/scimark-2010-12-20.lua |  400 ++++++++
>   perf/LuaJIT-benches/scimark-fft.lua        |    1 +
>   perf/LuaJIT-benches/scimark-lu.lua         |    1 +
>   perf/LuaJIT-benches/scimark-sor.lua        |    1 +
>   perf/LuaJIT-benches/scimark-sparse.lua     |    1 +
>   perf/LuaJIT-benches/scimark_lib.lua        |  297 ++++++
>   perf/LuaJIT-benches/series.lua             |   34 +
>   perf/LuaJIT-benches/spectral-norm.lua      |   40 +
>   perf/LuaJIT-benches/sum-file.lua           |    6 +
>   39 files changed, 3449 insertions(+)
>   create mode 100644 perf/LuaJIT-benches/PARAM_arm.txt
>   create mode 100644 perf/LuaJIT-benches/PARAM_mips.txt
>   create mode 100644 perf/LuaJIT-benches/PARAM_ppc.txt
>   create mode 100644 perf/LuaJIT-benches/PARAM_x86.txt
>   create mode 100644 perf/LuaJIT-benches/SUMCOL_1.txt
>   create mode 100644 perf/LuaJIT-benches/TEST_md5sum.txt
>   create mode 100644 perf/LuaJIT-benches/array3d.lua
>   create mode 100644 perf/LuaJIT-benches/binary-trees.lua
>   create mode 100644 perf/LuaJIT-benches/chameneos.lua
>   create mode 100644 perf/LuaJIT-benches/coroutine-ring.lua
>   create mode 100644 perf/LuaJIT-benches/euler14-bit.lua
>   create mode 100644 perf/LuaJIT-benches/fannkuch.lua
>   create mode 100644 perf/LuaJIT-benches/fasta.lua
>   create mode 100644 perf/LuaJIT-benches/k-nucleotide.lua
>   create mode 100644 perf/LuaJIT-benches/life.lua
>   create mode 100644 perf/LuaJIT-benches/mandelbrot-bit.lua
>   create mode 100644 perf/LuaJIT-benches/mandelbrot.lua
>   create mode 100644 perf/LuaJIT-benches/md5.lua
>   create mode 100644 perf/LuaJIT-benches/meteor.lua
>   create mode 100644 perf/LuaJIT-benches/nbody.lua
>   create mode 100644 perf/LuaJIT-benches/nsieve-bit-fp.lua
>   create mode 100644 perf/LuaJIT-benches/nsieve-bit.lua
>   create mode 100644 perf/LuaJIT-benches/nsieve.lua
>   create mode 100644 perf/LuaJIT-benches/partialsums.lua
>   create mode 100644 perf/LuaJIT-benches/pidigits-nogmp.lua
>   create mode 100644 perf/LuaJIT-benches/ray.lua
>   create mode 100644 perf/LuaJIT-benches/recursive-ack.lua
>   create mode 100644 perf/LuaJIT-benches/recursive-fib.lua
>   create mode 100644 perf/LuaJIT-benches/revcomp.lua
>   create mode 100644 perf/LuaJIT-benches/scimark-2010-12-20.lua
>   create mode 100644 perf/LuaJIT-benches/scimark-fft.lua
>   create mode 100644 perf/LuaJIT-benches/scimark-lu.lua
>   create mode 100644 perf/LuaJIT-benches/scimark-sor.lua
>   create mode 100644 perf/LuaJIT-benches/scimark-sparse.lua
>   create mode 100644 perf/LuaJIT-benches/scimark_lib.lua
>   create mode 100644 perf/LuaJIT-benches/series.lua
>   create mode 100644 perf/LuaJIT-benches/spectral-norm.lua
>   create mode 100644 perf/LuaJIT-benches/sum-file.lua
>
> diff --git a/.luacheckrc b/.luacheckrc
> index 19098dd9..35824875 100644
> --- a/.luacheckrc
> +++ b/.luacheckrc
> @@ -16,6 +16,7 @@ files['test/tarantool-tests/'] = {
>   -- test suites and need to be coherent with the upstream.
>   exclude_files = {
>     'dynasm/',
> +  'perf/LuaJIT-benches/',

Please don't do this. It is better to ignore by code number and at least
some groups of warnings in the code.

--- a/.luacheckrc
+++ b/.luacheckrc
@@ -12,11 +12,29 @@ files['test/tarantool-tests/'] = {
    read_globals = {'_TARANTOOL'},
  }

+files["perf/LuaJIT-benches/*.lua"] = {
+  ignore = {
+    "111",
+    "112",
+    "113",
+    "211",
+    "212",
+    "213",
+    "231",
+    "413",
+    "432",
+    "421",
+    "431",
+    "612",
+    "631",
+  }
+}
+
  -- These files are inherited from the vanilla LuaJIT or different
  -- test suites and need to be coherent with the upstream.

>     'src/',
>     'test/LuaJIT-tests/',
>     'test/PUC-Rio-Lua-5.1-tests/',
> diff --git a/perf/LuaJIT-benches/PARAM_arm.txt b/perf/LuaJIT-benches/PARAM_arm.txt
> new file mode 100644
> index 00000000..a07fd010
> --- /dev/null
> +++ b/perf/LuaJIT-benches/PARAM_arm.txt
> @@ -0,0 +1,29 @@
> +array3d 200

It is not clear why exactly these parameters are used.

Should we change them?

It deserves a comment in the commit message.

> +binary-trees 13
> +chameneos 1e6
> +coroutine-ring 3e6
> +euler14-bit 5e6
> +fannkuch 10
> +fasta 2e6
> +k-nucleotide 5e5 FASTA_500000
> +life
> +mandelbrot 2000
> +mandelbrot-bit 2000
> +md5 5000
> +nbody 1e6
> +nsieve 9
> +nsieve-bit 9
> +nsieve-bit-fp 9
> +partialsums 2e6
> +pidigits-nogmp 2000
> +ray 4
> +recursive-ack 9
> +recursive-fib 37
> +revcomp 1e6 FASTA_1000000
> +scimark-fft 2000
> +scimark-lu 300
> +scimark-sor 5000
> +scimark-sparse 5e3
> +series 1500
> +spectral-norm 1000
> +sum-file 1000 SUMCOL_1000
> diff --git a/perf/LuaJIT-benches/PARAM_mips.txt b/perf/LuaJIT-benches/PARAM_mips.txt
> new file mode 100644
> index 00000000..e6bcadba
> --- /dev/null
> +++ b/perf/LuaJIT-benches/PARAM_mips.txt

Do we really need parameters for unsupported platforms (MIPS, x86, ppc)?

It deserves a comment in the commit message.

> @@ -0,0 +1,29 @@
> +array3d 50
> +binary-trees 10
> +chameneos 5e4
> +coroutine-ring 2e5
> +euler14-bit 2e4
> +fannkuch 8
> +fasta 2e4
> +k-nucleotide 1e4 FASTA_10000
> +life
> +mandelbrot 150
> +mandelbrot-bit 150
> +md5 10
> +nbody 1e4
> +nsieve 4
> +nsieve-bit 4
> +nsieve-bit-fp 2
> +partialsums 5e4
> +pidigits-nogmp 150
> +ray 2
> +recursive-ack 7
> +recursive-fib 29
> +revcomp 5e4 FASTA_50000
> +scimark-fft 20
> +scimark-lu 3
> +scimark-sor 40
> +scimark-sparse 100
> +series 50
> +spectral-norm 100
> +sum-file 100 SUMCOL_100
> diff --git a/perf/LuaJIT-benches/PARAM_ppc.txt b/perf/LuaJIT-benches/PARAM_ppc.txt
> new file mode 100644
> index 00000000..c8319a15
> --- /dev/null
> +++ b/perf/LuaJIT-benches/PARAM_ppc.txt
> @@ -0,0 +1,29 @@
> +array3d 200
> +binary-trees 13
> +chameneos 1e6
> +coroutine-ring 4e6
> +euler14-bit 1e6
> +fannkuch 9
> +fasta 5e5
> +k-nucleotide 1e5 FASTA_100000
> +life
> +mandelbrot 800
> +mandelbrot-bit 800
> +md5 500
> +nbody 1e5
> +nsieve 8
> +nsieve-bit 8
> +nsieve-bit-fp 8
> +partialsums 5e5
> +pidigits-nogmp 800
> +ray 5
> +recursive-ack 9
> +recursive-fib 34
> +revcomp 1e6 FASTA_1000000
> +scimark-fft 500
> +scimark-lu 100
> +scimark-sor 1000
> +scimark-sparse 3000
> +series 1000
> +spectral-norm 200
> +sum-file 1000 SUMCOL_1000
> diff --git a/perf/LuaJIT-benches/PARAM_x86.txt b/perf/LuaJIT-benches/PARAM_x86.txt
> new file mode 100644
> index 00000000..87088d7b
> --- /dev/null
> +++ b/perf/LuaJIT-benches/PARAM_x86.txt
> @@ -0,0 +1,29 @@
> +array3d 300
> +binary-trees 16
> +chameneos 1e7
> +coroutine-ring 2e7
> +euler14-bit 2e7
> +fannkuch 11
> +fasta 25e6
> +k-nucleotide 5e6 FASTA_5000000
> +life
> +mandelbrot 5000
> +mandelbrot-bit 5000
> +md5 20000
> +nbody 5e6
> +nsieve 12
> +nsieve-bit 12
> +nsieve-bit-fp 12
> +partialsums 1e7
> +pidigits-nogmp 5000
> +ray 9
> +recursive-ack 10
> +recursive-fib 40
> +revcomp 5e6 FASTA_5000000
> +scimark-fft 50000
> +scimark-lu 5000
> +scimark-sor 50000
> +scimark-sparse 15e4
> +series 10000
> +spectral-norm 3000
> +sum-file 5000 SUMCOL_5000
> diff --git a/perf/LuaJIT-benches/SUMCOL_1.txt b/perf/LuaJIT-benches/SUMCOL_1.txt
> new file mode 100644
> index 00000000..956aba14
> --- /dev/null
> +++ b/perf/LuaJIT-benches/SUMCOL_1.txt
> @@ -0,0 +1,1000 @@
<snipped>
> diff --git a/perf/LuaJIT-benches/TEST_md5sum.txt b/perf/LuaJIT-benches/TEST_md5sum.txt
> new file mode 100644
> index 00000000..15aa8a1c
> --- /dev/null
> +++ b/perf/LuaJIT-benches/TEST_md5sum.txt
> @@ -0,0 +1,20 @@
> +binarytrees	10	7202f4e13df7abc5ad8c07f05fe9d644
> +chameneos	1e5	a629ce12f63050c6656bce175258cf8f
> +cheapconcr	1000	d29799d1e263810a4db7bbf43ca66499
> +cheapconcw	1000	d29799d1e263810a4db7bbf43ca66499
> +fannkuch	8	51e5e372cbc5471ea8940b20ad782319
> +fasta	1e5	78cd327de6f0a5667da0aa9349888279
> +knucleotide	x	88efb24c1fed533959ed84bb32c88142 <FASTA_10000
> +mandelbrot	200	cc65e64bd553ed18896de1dfe7fae3e5
> +meteor	3000	9a65bb4b0a735ace1eaa4f2628f01026
> +nbody	1e4	e0361c898ba747117ec177f7b3b3359c
> +nsieve	4	767e02c93624995732e151932fa5f304
> +nsievebits	4	767e02c93624995732e151932fa5f304
> +partialsums	1e5	33efb41c72f8ecfb5b36c99e32189a3f
> +pidigits	200	173a11a77bb1e72dd31254a760317428
> +recursive	4	07a47c2d2cf50503b16efda789f84916
> +regexdna	x	fdf3e6e9c599754e1eec3e524ea13fed <FASTA_10000
> +revcomp	x	47de276e2f72519b57b82da39f4c7592 <FASTA_10000
> +spectralnorm 200	25f44bd552ccd9faa0ee2ae5617947e2
> +sumfile	x	2ebd3caa45b31a2e74e436b645eab4b0 <SUMCOL_100
> +
> diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
> new file mode 100644
> index 00000000..c10b09b1
> --- /dev/null
> +++ b/perf/LuaJIT-benches/array3d.lua
> @@ -0,0 +1,59 @@
> +
please remove a newline
> +local function array_set(self, x, y, z, p)
> +  assert(x >= 0 and x < self.nx, "x outside PA")
> +  assert(y >= 0 and y < self.ny, "y outside PA")
> +  assert(z >= 0 and z < self.nz, "z outside PA")
> +  local pos = (z*self.ny + y)*self.nx + x
> +  local image = self.image
> +  if self.packed then
> +    local maxv = self.max_voltage
> +    if p > maxv then self.max_voltage = p*2.0 end
> +    local oldp = image[pos] or 0.0 -- Works with uninitialized table, too
> +    if oldp > maxv then p = p + maxv*2.0 end
> +    image[pos] = p
> +  else
> +    image[pos] = p
> +  end
> +  self.changed = true
> +  self.changed_recently = true
> +end
> +
> +local function array_points(self)
> +  local y, z = 0, 0
> +  return function(self, x)
> +    x = x + 1
> +    if x >= self.nx then
> +      x = 0
> +      y = y + 1
> +      if y >= self.ny then
> +	y = 0
> +	z = z + 1
> +	if z >= self.nz then
> +	  return nil, nil, nil
> +	end
> +      end
> +    end
> +    return x, y, z
> +  end, self, 0
> +end
> +
> +local function array_new(nx, ny, nz, packed)
> +  return {
> +    nx = nx, ny = ny, nz = nz,
> +    packed = packed, max_voltage = 0.0,
> +    changed = false, changed_recently = false,
> +    image = {}, -- Preferably use a fixed-type, pre-sized array here.
> +    set = array_set,
> +    points = array_points,
> +  }
> +end
> +
> +local dim = tonumber(arg and arg[1]) or 300 -- Array dimension dim^3
> +local packed = arg and arg[2] == "packed"   -- Packed image or flat
> +local arr = array_new(dim, dim, dim, packed)
> +
> +for x,y,z in arr:points() do
> +  arr:set(x, y, z, x*x)
> +end
> +assert(arr.image[dim^3-1] == (dim-1)^2)
> +
trailing newline
> diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
> new file mode 100644
> index 00000000..bf040466
> --- /dev/null
> +++ b/perf/LuaJIT-benches/binary-trees.lua
> @@ -0,0 +1,47 @@
> +
unnecessary newline
> +local function BottomUpTree(item, depth)
> +  if depth > 0 then
> +    local i = item + item
> +    depth = depth - 1
> +    local left, right = BottomUpTree(i-1, depth), BottomUpTree(i, depth)
> +    return { item, left, right }
> +  else
> +    return { item }
> +  end
> +end
> +
> +local function ItemCheck(tree)
> +  if tree[2] then
> +    return tree[1] + ItemCheck(tree[2]) - ItemCheck(tree[3])
> +  else
> +    return tree[1]
> +  end
> +end
> +
> +local N = tonumber(arg and arg[1]) or 0
> +local mindepth = 4
> +local maxdepth = mindepth + 2
> +if maxdepth < N then maxdepth = N end
> +
> +do
> +  local stretchdepth = maxdepth + 1
> +  local stretchtree = BottomUpTree(0, stretchdepth)
> +  io.write(string.format("stretch tree of depth %d\t check: %d\n",
> +    stretchdepth, ItemCheck(stretchtree)))
> +end
> +
> +local longlivedtree = BottomUpTree(0, maxdepth)
> +
> +for depth=mindepth,maxdepth,2 do
> +  local iterations = 2 ^ (maxdepth - depth + mindepth)
> +  local check = 0
> +  for i=1,iterations do
> +    check = check + ItemCheck(BottomUpTree(1, depth)) +
> +            ItemCheck(BottomUpTree(-1, depth))
> +  end
> +  io.write(string.format("%d\t trees of depth %d\t check: %d\n",
> +    iterations*2, depth, check))
> +end
> +
> +io.write(string.format("long lived tree of depth %d\t check: %d\n",
> +  maxdepth, ItemCheck(longlivedtree)))
> diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
> new file mode 100644
> index 00000000..78b64c3f
> --- /dev/null
> +++ b/perf/LuaJIT-benches/chameneos.lua
> @@ -0,0 +1,68 @@
> +
unnecessary newline
> +local co = coroutine
> +local create, resume, yield = co.create, co.resume, co.yield
> +
> +local N = tonumber(arg and arg[1]) or 10
> +local first, second
> +
> +-- Meet another creature.
> +local function meet(me)
> +  while second do yield() end -- Wait until meeting place clears.
> +  local other = first
> +  if other then -- Hey, I found a new friend!
> +    first = nil
> +    second = me
> +  else -- Sniff, nobody here (yet).
> +    local n = N - 1
> +    if n < 0 then return end -- Uh oh, the mall is closed.
> +    N = n
> +    first = me
> +    repeat yield(); other = second until other -- Wait for another creature.
> +    second = nil
> +    yield() -- Be nice and let others meet up.
> +  end
> +  return other
> +end
> +
> +-- Create a very social creature.
> +local function creature(color)
> +  return create(function()
> +    local me = color
> +    for met=0,1000000000 do
> +      local other = meet(me)
> +      if not other then return met end
> +      if me ~= other then
> +        if me == "blue" then me = other == "red" and "yellow" or "red"
> +        elseif me == "red" then me = other == "blue" and "yellow" or "blue"
> +        else me = other == "blue" and "red" or "blue" end
> +      end
> +    end
> +  end)
> +end
> +
> +-- Trivial round-robin scheduler.
> +local function schedule(threads)
> +  local resume = resume
> +  local nthreads, meetings = #threads, 0
> +  repeat
> +    for i=1,nthreads do
> +      local thr = threads[i]
> +      if not thr then return meetings end
> +      local ok, met = resume(thr)
> +      if met then
> +        meetings = meetings + met
> +        threads[i] = nil
> +      end
> +    end
> +  until false
> +end
> +
> +-- A bunch of colorful creatures.
> +local threads = {
> +  creature("blue"),
> +  creature("red"),
> +  creature("yellow"),
> +  creature("blue"),
> +}
> +
> +io.write(schedule(threads), "\n")
> diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
> new file mode 100644
> index 00000000..1e8c5ef6
> --- /dev/null
> +++ b/perf/LuaJIT-benches/coroutine-ring.lua
> @@ -0,0 +1,42 @@
> +-- The Computer Language Benchmarks Game
> +-- http://shootout.alioth.debian.org/
> +-- contributed by Sam Roberts
> +-- reviewed by Bruno Massa
> +
> +local n         = tonumber(arg and arg[1]) or 2e7
> +
> +-- fixed size pool
> +local poolsize  = 503
> +local threads   = {}
> +
> +-- cache these to avoid global environment lookups
> +local create    = coroutine.create
> +local resume    = coroutine.resume
> +local yield     = coroutine.yield
> +
> +local id        = 1
> +local token     = 0
> +local ok
> +
> +local body = function(token)
> +  while true do
> +    token = yield(token + 1)
> +  end
> +end
> +
> +-- create all threads
> +for id = 1, poolsize do
> +  threads[id] = create(body)
> +end
> +
> +-- send the token
> +repeat
> +  if id == poolsize then
> +    id = 1
> +  else
> +    id = id + 1
> +  end
> +  ok, token = resume(threads[id], token)
> +until token == n
> +
> +io.write(id, "\n")
> diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
> new file mode 100644
> index 00000000..537f2bf3
> --- /dev/null
> +++ b/perf/LuaJIT-benches/euler14-bit.lua
> @@ -0,0 +1,22 @@
> +
unnecessary newline. here and below
> +local bit = require("bit")
> +local bnot, bor, band = bit.bnot, bit.bor, bit.band
> +local shl, shr = bit.lshift, bit.rshift
> +
> +local N = tonumber(arg and arg[1]) or 10000000
> +local cache, m, n = { 1 }, 1, 1
> +if arg and arg[2] then cache = nil end
> +for i=2,N do
> +  local j = i
> +  for len=1,1000000000 do
> +    j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
> +    if cache then
> +      local x = cache[j]; if x then j = x+len; break end
> +    elseif j == 1 then
> +      j = len+1; break
> +    end
> +  end
> +  if cache then cache[i] = j end
> +  if j > m then m, n = j, i end
> +end
> +io.write("Found ", n, " (chain length: ", m, ")\n")
<snipped>
> diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
> new file mode 100644
> index 00000000..2acc24c0
> --- /dev/null
> +++ b/perf/LuaJIT-benches/ray.lua
> @@ -0,0 +1,135 @@
> +local sqrt = math.sqrt
> +local huge = math.huge
> +
> +local delta = 1
> +while delta * delta + 1 ~= 1 do
> +  delta = delta * 0.5
> +end
> +
> +local function length(x, y, z)  return sqrt(x*x + y*y + z*z) end
> +local function vlen(v)          return length(v[1], v[2], v[3]) end
> +local function mul(c, x, y, z)  return c*x, c*y, c*z end
> +local function unitise(x, y, z) return mul(1/length(x, y, z), x, y, z) end
> +local function dot(x1, y1, z1, x2, y2, z2)
> +  return x1*x2 + y1*y2 + z1*z2
> +end
> +
> +local function vsub(a, b)        return a[1] - b[1], a[2] - b[2], a[3] - b[3] end
> +local function vdot(a, b)        return dot(a[1], a[2], a[3], b[1], b[2], b[3]) end
> +
> +
> +local sphere = {}
> +function sphere:new(centre, radius)
> +  self.__index = self
> +  return setmetatable({centre=centre, radius=radius}, self)
> +end
> +
> +local function sphere_distance(self, origin, dir)
> +  local vx, vy, vz = vsub(self.centre, origin)
> +  local b = dot(vx, vy, vz, dir[1], dir[2], dir[3])
> +  local r = self.radius
> +  local disc = r*r + b*b - vx*vx-vy*vy-vz*vz
> +  if disc < 0 then return huge end
> +  local d = sqrt(disc)
> +  local t2 = b + d
> +  if t2 < 0 then return huge end
> +  local t1 = b - d
> +  return t1 > 0 and t1 or t2
> +end
> +
> +function sphere:intersect(origin, dir, best)
> +  local lambda = sphere_distance(self, origin, dir)
> +  if lambda < best[1] then
> +    local c = self.centre
> +    best[1] = lambda
> +    local b2 = best[2]
> +    b2[1], b2[2], b2[3] =
> +      unitise(
> +        origin[1] - c[1] + lambda * dir[1],
> +        origin[2] - c[2] + lambda * dir[2],
> +        origin[3] - c[3] + lambda * dir[3])
> +  end
> +end
> +
> +local group = {}
> +function group:new(bound)
> +  self.__index = self
> +  return setmetatable({bound=bound, children={}}, self)
> +end
> +
> +function group:add(s)
> +  self.children[#self.children+1] = s
> +end
> +
> +function group:intersect(origin, dir, best)
> +  local lambda = sphere_distance(self.bound, origin, dir)
> +  if lambda < best[1] then
> +    for _, c in ipairs(self.children) do
> +      c:intersect(origin, dir, best)
> +    end
> +  end
> +end
> +
> +local hit = { 0, 0, 0 }
> +local ilight
> +local best = { huge, { 0, 0, 0 } }
> +
> +local function ray_trace(light, camera, dir, scene)
> +  best[1] = huge
> +  scene:intersect(camera, dir, best)
> +  local b1 = best[1]
> +  if b1 == huge then return 0 end
> +  local b2 = best[2]
> +  local g = vdot(b2, light)
> +  if g >= 0 then return 0 end
> +  hit[1] = camera[1] + b1*dir[1] + delta*b2[1]
> +  hit[2] = camera[2] + b1*dir[2] + delta*b2[2]
> +  hit[3] = camera[3] + b1*dir[3] + delta*b2[3]
> +  best[1] = huge
> +  scene:intersect(hit, ilight, best)
> +  if best[1] == huge then
> +    return -g
> +  else
> +    return 0
> +  end
> +end
> +
> +local function create(level, centre, radius)
> +  local s = sphere:new(centre, radius)
> +  if level == 1 then return s end
> +  local gr = group:new(sphere:new(centre, 3*radius))
> +  gr:add(s)
> +  local rn = 3*radius/sqrt(12)
> +  for dz = -1,1,2 do
> +    for dx = -1,1,2 do
> +      gr:add(create(level-1, { centre[1] + rn*dx, centre[2] + rn, centre[3] + rn*dz }, radius*0.5))
> +    end
> +  end
> +  return gr
> +end
> +
> +
> +local level, n, ss = tonumber(arg[1]) or 9, tonumber(arg[2]) or 256, 4
> +local iss = 1/ss
> +local gf = 255/(ss*ss)
> +
> +io.write(("P5\n%d %d\n255\n"):format(n, n))
> +local light = { unitise(-1, -3, 2) }
> +ilight = { -light[1], -light[2], -light[3] }
> +local camera = { 0, 0, -4 }
> +local dir = { 0, 0, 0 }
> +
> +local scene = create(level, {0, -1, 0}, 1)
> +
> +for y = n/2-1, -n/2, -1 do
> +  for x = -n/2, n/2-1 do
> +    local g = 0
> +    for d = y, y+.99, iss do
> +      for e = x, x+.99, iss do
> +        dir[1], dir[2], dir[3] = unitise(e, d, n)
> +        g = g + ray_trace(light, camera, dir, scene)
trailing space
> +      end
> +    end
> +    io.write(string.char(math.floor(0.5 + g*gf)))
> +  end
> +end
<snipped>

[-- Attachment #2: Type: text/html, Size: 26606 bytes --]


* Re: [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module Sergey Kaplun via Tarantool-patches
@ 2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:05     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-11 14:28 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2401 bytes --]

Hi, Sergey!

thanks for the patch! Please see my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This module contains 2 functions:
> - `realtime()` -- returns the time represented by the wall clock.
> - `process_cputime()` -- returns the time consumed by all threads of
>    the process.
I would rephrase second bullet: "to measure CPU time instead of elapsed 
time"
Also, I would add this description to the Lua module as well.
>
> Both functions are implemented via FFI call to the `clock_gettime()`.
> ---
>   perf/utils/clock.lua | 35 +++++++++++++++++++++++++++++++++++
>   1 file changed, 35 insertions(+)
>   create mode 100644 perf/utils/clock.lua
>
> diff --git a/perf/utils/clock.lua b/perf/utils/clock.lua
> new file mode 100644
> index 00000000..57385967
> --- /dev/null
> +++ b/perf/utils/clock.lua
> @@ -0,0 +1,35 @@
> +local ffi = require('ffi')
> +
> +ffi.cdef[[
> +struct timespec {
> +  long tv_sec; /* Seconds. */
> +  long tv_nsec; /* Nanoseconds. */
> +};
> +
> +int clock_gettime(int clockid, struct timespec *tp);
> +]]
> +
> +local C = ffi.C
> +
> +-- Wall clock.
> +local CLOCK_REALTIME = 0

This clock is not a reliable time source for measuring intervals: it can
be adjusted by NTP or set manually. It is better to use CLOCK_MONOTONIC or
even CLOCK_MONOTONIC_RAW (not portable, Linux-specific); they are more
reliable and do not depend on the things listed above.
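
To illustrate the point about clock sources, here is a minimal sketch (not
part of the patch) of a monotonic variant of the helper. It assumes LuaJIT
on Linux, where CLOCK_MONOTONIC is 1; the constant differs on other
systems, and the `monotonic` name is hypothetical.

```lua
-- A sketch only: needs LuaJIT's 'ffi' module and a POSIX libc.
local has_ffi, ffi = pcall(require, 'ffi')

local monotonic
if has_ffi then
  ffi.cdef[[
  struct timespec {
    long tv_sec;  /* Seconds. */
    long tv_nsec; /* Nanoseconds. */
  };
  int clock_gettime(int clockid, struct timespec *tp);
  ]]

  -- Assumption: the Linux value. Other systems use other numbers.
  local CLOCK_MONOTONIC = 1
  local ts = ffi.new('struct timespec[1]')

  -- Hypothetical helper: seconds since an unspecified starting point.
  monotonic = function()
    local rc = ffi.C.clock_gettime(CLOCK_MONOTONIC, ts)
    assert(rc == 0, 'clock_gettime() failed')
    return tonumber(ts[0].tv_sec) + tonumber(ts[0].tv_nsec) / 1e9
  end

  local start = monotonic()
  local elapsed = monotonic() - start
  print(('elapsed: %.9f s'):format(elapsed))
end
```

Only differences between two readings are meaningful for such a clock,
which is all a benchmark timer needs.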

> +-- CPU time consumed by the process.
> +local CLOCK_PROCESS_CPUTIME_ID = 2
> +
> +-- All functions below returns the corresponding `clock_gettime()`
s/`clock_gettime()`/elapsed time/
> +-- in seconds.
> +local M = {}
> +
> +local timespec = ffi.new('struct timespec[1]')
> +
> +function M.realtime()
> +  C.clock_gettime(CLOCK_REALTIME, timespec)
> +  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
> +end
> +

Maybe it is better to do the conversion only once?

@@ -24,7 +24,7 @@ local timespec = ffi.new('struct timespec[1]')

  function M.realtime()
    C.clock_gettime(CLOCK_REALTIME, timespec)
-  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
+  return tonumber(timespec[0].tv_sec + timespec[0].tv_nsec / 1e9)
  end

the same below
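
A note of caution on the single-conversion variant, as a sketch rather
than a definitive statement: where `long` is 64-bit, LuaJIT reads `tv_sec`
and `tv_nsec` as boxed 64-bit integer cdata, and arithmetic between such
cdata and a Lua number is performed in `int64_t`. The snippet below uses
hypothetical stand-in values instead of a real `timespec` to compare both
ways of converting.

```lua
-- Requires LuaJIT (or another runtime providing the 'ffi' module).
local has_ffi, ffi = pcall(require, 'ffi')

local split, combined
if has_ffi then
  -- Hypothetical stand-ins for timespec[0].tv_sec and tv_nsec.
  local sec = ffi.new('int64_t', 1)
  local nsec = ffi.new('int64_t', 500000000)

  -- Convert each field first, then do the arithmetic on Lua numbers.
  split = tonumber(sec) + tonumber(nsec) / 1e9

  -- Do the arithmetic on cdata first: the division is an int64_t
  -- division, so the fractional part is dropped before tonumber().
  combined = tonumber(sec + nsec / 1e9)

  print(split, combined)
end
```

If this reading is right, converting each field separately keeps the
sub-second part, while the combined form returns whole seconds only.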

> +function M.process_cputime()
> +  C.clock_gettime(CLOCK_PROCESS_CPUTIME_ID, timespec)
> +  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
> +end
> +
> +return M

[-- Attachment #2: Type: text/html, Size: 3701 bytes --]


* Re: [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module Sergey Kaplun via Tarantool-patches
@ 2025-11-11 15:41   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:06     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-11 15:41 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 20783 bytes --]

Hi, Sergey, again!

thanks for the patch!

Please see comments below.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This module provides functionality to run custom benchmark workloads
> defined by the following syntax:
>
> | local bench = require('bench').new(arg)

There are many LuaJIT-specific functions below (ffi, jit, etc.).
I propose to check that luabin is LuaJIT-compatible and exit with an
appropriate message if it is not.
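
One possible shape for such a check, as a rough sketch: the helper name
and the message text are hypothetical, and the real module may prefer a
different test.

```lua
-- Hypothetical helper: true when the running interpreter exposes the
-- LuaJIT-specific pieces the bench module relies on.
local function is_luajit_compatible()
  local jit_lib = rawget(_G, 'jit')
  if type(jit_lib) ~= 'table' then
    return false
  end
  local has_ffi = pcall(require, 'ffi')
  return has_ffi
end

if not is_luajit_compatible() then
  io.stderr:write('This benchmark requires a LuaJIT-compatible interpreter\n')
  -- A real module would likely call os.exit(1) here.
end
```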

> |
> | -- f_* are functions, n_* are numbers.

s/functions/user-defined functions/

s/numbers/user-defined numbers/

> | bench:add({
> |   setup = f_setup,
> |   payload = f_payload,
> |   teardown = f_teardown,
> |   items = n_items_processed,
> |
> |   checker = f_checker,
> |   -- Or instead:
> |   skip_check = true,
> |
> |   iterations = n_iterations,
> |   -- Or instead:
> |   min_time = n_seconds,
> | })
> |
> | bench:run_and_report()
>
> The checker function received the single value returned by the payload
> function and completed all checks related to the test. If it returns a
> true value, it is considered a successful check pass. The checker
> function is called before the main workload as a warm-up. Generally, you
> should always provide the checker function to be sure that your
> benchmark is still correct after optimizations. In cases when it is
> impossible (for some reason), you may specify the `skip_check` flag. In
> that case the warm-up part will be skipped as well.
>
> Each test is run in the order it was added. The module measures the
> real-time and CPU time necessary to run `iterations` repetitions of the
please consider using monotonic time, not a realtime (see a previous patch)
> test or amount of iterations `min_time` in seconds (4 by default) and
> calculates the metric items per second (more is better). The total
> amount of items equals `n_items_processed * n_iterations`. The items may
> be added in the table with the description inside the payload function
> as well. The results (real-time, CPU time, iterations, items/s) are
> reported in a format similar to the Google Benchmark suite [1].
s/similar/compatible/
>
> Each test may be run from the command line as follows:
> | LUA_PATH="..." luajit test_name.lua [flags] arguments
>
> The supported flags are:
> | -j{off|on}                 Disable/Enable JIT for the benchmarks.
Why do you implement this flag for a Lua module? It can be passed to 
luajit directly.
> | --benchmark_color={true|false|auto}
> |                            Enables the colorized output for the
> |                            terminal (not the file).
> | --benchmark_min_time={number} Minimum seconds to run the benchmark
> |                            tests.
> | --benchmark_out=<file>     Places the output into <file>.
> | --benchmark_out_format={console|json}
> |                            The format is used when saving the results in the
> |                            file. The default format is the JSON format.
> | -h, --help                 Display help message and exit.
>
> These options are similar to the Google Benchmark command line options,
> but with a few changes:
> 1) If an output file is given, there is no output in the terminal.
> 2) The min_time option supports only number values. There is no support
>     for the iterations number (by the 'x' suffix).
>
> [1]: https://github.com/google/benchmark
> ---
>   perf/utils/bench.lua | 509 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 509 insertions(+)
>   create mode 100644 perf/utils/bench.lua
>
> diff --git a/perf/utils/bench.lua b/perf/utils/bench.lua
> new file mode 100644
> index 00000000..68473215
> --- /dev/null
> +++ b/perf/utils/bench.lua
> @@ -0,0 +1,509 @@
> +local clock = require('clock')
> +local ffi = require('ffi')
> +-- Require 'cjson' only on demand for formatted output to file.
> +local json
> +
> +local M = {}
> +
> +local type, assert, error = type, assert, error
> +local format, rep = string.format, string.rep

s/, rep/string_rep/

s/local format/string_format/

for consistency with shortcuts below


> +local floor, max, min = math.floor, math.max, math.min
> +local table_remove = table.remove
> +
> +local LJ_HASJIT = jit and jit.opt
> +
> +-- Argument parsing. ---------------------------------------------
> +
> +-- XXX: Make options compatible with Google Benchmark, since most
> +-- probably it will be used for the C benchmarks as well.
> +-- Compatibility isn't full: there is no support for environment
> +-- variables (since they are not so useful) and the output to the
> +-- terminal is suppressed if the --benchmark_out flag is
> +-- specified.
> +
> +local HELP_MSG = [[
> + Options:
> +   -j{off|on}                 Disable/Enable JIT for the benchmarks.
Add a default value to the description. Here and below.
> +   --benchmark_color={true|false|auto}
> +                              Enables the colorized output for the terminal (not
> +                              the file). 'auto' means to use colors if the
> +                              output is being sent to a terminal and the TERM
> +                              environment variable is set to a terminal type
> +                              that supports colors.
> +   --benchmark_min_time={number}
> +                              Minimum seconds to run the benchmark tests.
> +                              4.0 by default.
Why 4.0?
> +   --benchmark_out=<file>     Places the output into <file>.
> +   --benchmark_out_format={console|json}
> +                              The format is used when saving the results in the
> +                              file. The default format is the JSON format.

 >  The default format is the JSON format.

By default the JSON module is not available; in the source code it is
required only "on demand".

I'm not sure the JSON format should be the default.

> +   -h, --help                 Display this message and exit.
> +
> + There are a bunch of suggestions on how to achieve the most
> + stable benchmark results:
> + https://github.com/tarantool/tarantool/wiki/Benchmarking
> +]]
> +
> +local function usage(ctx)
> +  local header = format('USAGE: luajit %s [options]\n', ctx.name)

I would not hardcode "luajit" and would use luabin instead, see `M.luabin`
in test/tarantool-tests/utils/exec.lua. The binary can at least be
tarantool instead of luajit.

> +  io.stderr:write(header, HELP_MSG)
> +  os.exit(1)
> +end
> +
> +local function check_param(check, strfmt, ...)
> +  if not check then
> +    io.stderr:write(format(strfmt, ...))
> +    os.exit(1)

Please define the possible exit codes as variables and use them in
os.exit(), like the well-known EXIT_SUCCESS and EXIT_FAILURE in C.
Feel free to ignore.

> +  end
> +end
> +
> +-- Valid values: 'false'/'no'/'0'.
> +-- In case of an invalid value the 'auto' is used.
> +local function set_color(ctx, value)
> +  if value == 'false' or value == 'no' or value == '0' then
> +    ctx.color = false
> +  else
> +    -- In case of an invalid value, the Google Benchmark uses
> +    -- 'auto', which is true for the stdout output (the only
> +    -- colorizable output). So just set it to true by default.
> +    ctx.color = true
> +  end
> +end
> +
> +local DEFAULT_MIN_TIME = 4.0
> +local function set_min_time(ctx, value)
> +  local time = tonumber(value)

 > Tries to convert its argument to a number. If the argument is already
 > a number or a string convertible to a number, then tonumber returns
 > this number; otherwise, it returns *nil*.

https://www.lua.org/manual/5.1/manual.html#pdf-tonumber

Please check the result for nil.
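
A minimal sketch of the nil handling, using a hypothetical stand-alone
helper rather than the patch's `set_min_time()` (the `check_param()` call
on the next quoted line appears to cover the same case):

```lua
-- Hypothetical helper, not the patch's set_min_time().
local function parse_min_time(value)
  local time = tonumber(value)
  if time == nil then
    -- tonumber() gave up: report the offending input to the caller.
    return nil, ('Invalid min time: "%s"'):format(tostring(value))
  end
  return time
end

print(parse_min_time('2.5'))
print(parse_min_time('fast'))
```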

> +  check_param(time, 'Invalid min time: "%s"\n', value)
> +  ctx.min_time = time
> +end
> +
> +local function set_output(ctx, filename)
> +  check_param(type(filename) == "string", 'Invalid output value: "%s"\n',
> +              filename)
> +  ctx.output = filename
> +end
> +
> +-- Determine the output format for the benchmark.
> +-- Supports only 'console' and 'json' for now.
> +local function set_output_format(ctx, value)
> +  local output_format = tostring(value)
> +  check_param(output_format, 'Invalid output format: "%s"\n', value)
> +  output_format = output_format:lower()
> +  check_param(output_format == 'json' or output_format == 'console',
> +              'Unsupported output format: "%s"\n', output_format)
> +  ctx.output_format = output_format
> +end
> +
> +local function set_jit(ctx, value)
> +  check_param(value == 'on' or value == 'off',
> +             'Invalid jit value: "%s"\n', value)
> +  if value == 'off' then
> +    ctx.jit = false
> +  elseif value == 'on' then
> +    ctx.jit = true
> +  end
> +end
> +
> +local function unrecognized_option(optname, dashes)
> +  local fullname = dashes .. (optname or '=')
> +  io.stderr:write(format('unrecognized command-line flag: %s\n', fullname))
> +  io.stderr:write(HELP_MSG)
> +  os.exit(1)
> +end
> +
> +local function unrecognized_long_option(_, optname)
> +  unrecognized_option(optname, '--')
> +end
> +
> +local function unrecognized_short_option(_, optname)
> +  unrecognized_option(optname, '-')
> +end
> +
> +local SHORT_OPTS = setmetatable({
> +  ['h'] = usage,
> +  ['j'] = set_jit,
> +}, {__index = unrecognized_short_option})
> +
> +local LONG_OPTS = setmetatable({
> +  ['benchmark_color'] = set_color,
> +  ['benchmark_min_time'] = set_min_time,
> +  ['benchmark_out'] = set_output,
> +  -- XXX: For now support only JSON encoded and raw output.
> +  ['benchmark_out_format'] = set_output_format,
> +  ['help'] = usage,
> +}, {__index = unrecognized_long_option})
> +
Is taking argparse from the tarantool repository (src/lua/argparse.lua)
not an option?
> +local function is_option(str)
> +  return type(str) == 'string' and str:sub(1, 1) == '-' and str ~= '-'
> +end
> +
> +local function next_arg_value(arg, n)
> +  local opt_value = nil
> +  if arg[n] and not is_option(arg[n]) then
> +    opt_value = arg[n]
> +    table_remove(arg, n)
> +  end
> +  return opt_value
> +end
> +
> +local function parse_long_option(arg, a, n)
> +  local opt_name, opt_value
> +  -- Remove dashes.
> +  local opt = a:sub(3)
> +  -- --option=value
> +  if opt:find('=', 1, true) then
> +    -- May match empty option name and/or value.
> +    opt_name, opt_value = opt:match('^([^=]+)=(.*)$')
> +  else
> +    -- --option value
> +    opt_name = opt
> +    opt_value = next_arg_value(arg, n)
> +  end
> +  return opt_name, opt_value
> +end
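
A small stand-alone sketch of the `--name=value` split above, to show what
the pattern yields for a few inputs. The function name is hypothetical. As
far as I can tell, the pattern accepts an empty value but not an empty
name, in which case both results are nil and the lookup falls through to
the "unrecognized" handler.

```lua
-- Hypothetical stand-alone version of the "--name=value" split.
local function split_long_option(a)
  local opt = a:sub(3) -- Drop the leading dashes.
  if opt:find('=', 1, true) then
    -- Returns nil for both results when the name part is empty.
    return opt:match('^([^=]+)=(.*)$')
  end
  -- The value, if any, is in the next argument.
  return opt, nil
end

for _, a in ipairs({'--benchmark_min_time=2', '--benchmark_out=', '--=x'}) do
  print(a, split_long_option(a))
end
```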
> +
> +local function parse_short_option(arg, a, n)
> +  local opt_name, opt_value
> +  -- Remove the dash.
> +  local opt = a:sub(2)
> +  if #opt == 1 then
> +    -- -o value
> +    opt_name = opt
> +    opt_value = next_arg_value(arg, n)
> +  else
> +    -- -ovalue.
> +    opt_name = opt:sub(1, 1)
> +    opt_value = opt:sub(2)
> +  end
> +  return opt_name, opt_value
> +end
> +
> +local function parse_opt(ctx, arg, a, n)
> +  if a:sub(1, 2) == '--' then
> +    local opt_name, opt_value = parse_long_option(arg, a, n)
> +    LONG_OPTS[opt_name](ctx, opt_value)
> +  else
> +    local opt_name, opt_value = parse_short_option(arg, a, n)
> +    SHORT_OPTS[opt_name](ctx, opt_value)
> +  end
> +end
> +
> +-- Process the options and update the benchmark context.
> +local function argparse(arg, name)
> +  local ctx = {name = name}
> +  local n = 1
> +  while n <= #arg do
> +    local a = arg[n]
> +    if is_option(a) then
> +      table_remove(arg, n)
> +      parse_opt(ctx, arg, a, n)
> +    else
> +      -- Just ignore it.
> +      n = n + 1
> +    end
> +  end
> +  return ctx
> +end
> +
> +-- Formatting. ---------------------------------------------------
> +
> +local function format_console_header()
> +  -- Use a similar format to the Google Benchmark, except for the
> +  -- fixed benchmark name length.
> +  local header = format('%-37s %12s %15s %13s %-28s\n',
> +    'Benchmark', 'Time', 'CPU', 'Iterations', 'UserCounters...'
> +  )
> +  local border = rep('-', #header - 1) .. '\n'
> +  return border .. header .. border
> +end
> +
> +local COLORS = {
> +  GREEN = '\027[32m%s\027[m',
> +  YELLOW = '\027[33m%s\027[m',
> +  CYAN = '\027[36m%s\027[m',
> +}
> +
> +local function format_name(ctx, name)
> +  name = format('%-37s ', name)
> +  if ctx.color then
> +     name = format(COLORS.GREEN, name)
> +  end
> +  return name
> +end
> +
> +local function format_time(ctx, real_time, cpu_time, time_unit)
> +  local timestr = format('%10.2f %-4s %10.2f %-4s ', real_time, time_unit,
> +                         cpu_time, time_unit)
> +  if ctx.color then
> +     timestr = format(COLORS.YELLOW, timestr)
> +  end
> +  return timestr
> +end
> +
> +local function format_iterations(ctx, iterations)
> +  iterations = format('%10d ', iterations)
> +  if ctx.color then
> +     iterations = format(COLORS.CYAN, iterations)
> +  end
> +  return iterations
> +end
> +
> +local function format_ips(ips)
> +  local ips_str
> +  if ips / 1e6 > 1 then
> +    ips_str = format('items_per_second=%.3fM/s', ips / 1e6)
> +  elseif ips / 1e3 > 1 then
> +    ips_str = format('items_per_second=%.3fk/s', ips / 1e3)
> +  else
> +    ips_str = format('items_per_second=%d/s', ips)
> +  end
> +  return ips_str
> +end
> +
> +local function format_result_console(ctx, r)
> +  return format('%s%s%s%s\n',
> +    format_name(ctx, r.name),
> +    format_time(ctx, r.real_time, r.cpu_time, r.time_unit),
> +    format_iterations(ctx, r.iterations),
> +    format_ips(r.items_per_second)
> +  )
> +end
> +
> +local function format_results(ctx)
> +  local output_format = ctx.output_format
> +  local res = ''
> +  if output_format == 'json' then
> +    res = json.encode({
> +      benchmarks = ctx.results,
> +      context = ctx.context,
> +    })
> +  else
> +    assert(output_format == 'console', 'Unknown format: ' .. output_format)
> +    res = res .. format_console_header()
> +    for _, r in ipairs(ctx.results) do
> +      res = res .. format_result_console(ctx, r)
> +    end
> +  end
> +  return res
> +end
> +
> +local function report_results(ctx)
> +  ctx.fh:write(format_results(ctx))
> +end
> +
> +-- Tests setup and run. ------------------------------------------
> +
> +local function term_is_color()
> +  local term = os.getenv('TERM')
> +  return (term and term:match('color') or os.getenv('COLORTERM'))
> +end
> +
> +local function benchmark_context(ctx)
> +  return {
> +    arch = jit.arch,
> +    -- Google Benchmark reports a date in ISO 8601 format.
> +    date = os.date('%Y-%m-%dT%H:%M:%S%z'),
> +    gc64 = ffi.abi('gc64'),
> +    host_name = io.popen('hostname'):read(),
> +    jit = ctx.jit,
> +  }
> +end
> +
> +local function init(ctx)
> +  -- Array of benches to proceed with.
> +  ctx.benches = {}
> +  -- Array of the corresponding results.
> +  ctx.results = {}
> +
> +  if ctx.jit == nil then
> +    if LJ_HASJIT then
> +      ctx.jit = jit.status()
> +    else
> +      ctx.jit = false
> +    end
> +  end
> +  ctx.color = ctx.color == nil and true or ctx.color
> +  if ctx.output then
> +    -- Don't bother with manual file closing. It will be closed
> +    -- automatically when the corresponding object is
> +    -- garbage-collected.
> +    ctx.fh = assert(io.open(ctx.output, 'w+'))
> +    ctx.output_format = ctx.output_format or 'json'
> +    -- Always without color.
> +    ctx.color = false
> +  else
> +    ctx.fh = io.stdout
> +    -- Always console output to the terminal.
> +    ctx.output_format = 'console'
> +    if ctx.color and term_is_color() then
> +      ctx.color = true
> +    else
> +      ctx.color = false
> +    end
> +  end
> +  ctx.min_time = ctx.min_time or DEFAULT_MIN_TIME
> +
> +  if ctx.output_format == 'json' then
> +    json = require('cjson')
should we make it compatible with tarantool? (module is named 'json' there)
> +  end
> +
> +  -- Google Benchmark's context, plus benchmark info.
> +  ctx.context = benchmark_context(ctx)
> +
> +  return ctx
> +end
> +
> +local function test_name()
> +  return debug.getinfo(3, 'S').short_src:match('([^/\\]+)$')
> +end
> +
> +local function add_bench(ctx, bench)
> +  if bench.checker == nil and not bench.skip_check then
> +    error('Bench requires a checker to proof the results', 2)
> +  end
> +  table.insert(ctx.benches, bench)
> +end
> +
> +local MAX_ITERATIONS = 1e9
> +-- Determine the number of iterations for the next benchmark run.
> +local function iterations_multiplier(min_time, get_time, iterations)
> +  -- When the last run is at least 10% of the required time, the
> +  -- maximum expansion should be 14x.
> +  local multiplier = min_time * 1.4 / max(get_time, 1e-9)
> +  local is_significant = get_time / min_time > 0.1
> +  multiplier = is_significant and multiplier or 10
> +  local new_iterations = max(floor(multiplier * iterations), iterations + 1)
> +  return min(new_iterations, MAX_ITERATIONS)
> +end
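
To make the growth rule above easier to follow, here is the function
copied out of the patch and run on a few hypothetical timings (the numbers
are made up for illustration, with `min_time` set to 4 seconds):

```lua
local floor, max, min = math.floor, math.max, math.min
local MAX_ITERATIONS = 1e9

-- Copied from the patch above for illustration.
local function iterations_multiplier(min_time, get_time, iterations)
  local multiplier = min_time * 1.4 / max(get_time, 1e-9)
  local is_significant = get_time / min_time > 0.1
  multiplier = is_significant and multiplier or 10
  local new_iterations = max(floor(multiplier * iterations), iterations + 1)
  return min(new_iterations, MAX_ITERATIONS)
end

-- Short run (0.25% of min_time): plain 10x growth.
print(iterations_multiplier(4, 0.01, 10))
-- Significant run (50% of min_time): about 2.8x growth.
print(iterations_multiplier(4, 2, 100))
-- Already past min_time: still at least one more iteration.
print(iterations_multiplier(4, 8, 100))
-- The result never exceeds MAX_ITERATIONS.
print(iterations_multiplier(4, 0.01, 1e9))
```

Since a run counts as significant only above 10% of `min_time`, the factor
`1.4 * min_time / get_time` stays below 14, which matches the comment in
the patch.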
> +
> +-- https://luajit.org/running.html#foot.
> +local JIT_DEFAULTS = {
> +  maxtrace = 1000,
> +  maxrecord = 4000,
> +  maxirconst = 500,
> +  maxside = 100,
> +  maxsnap = 500,
> +  hotloop = 56,
> +  hotexit = 10,
> +  tryside = 4,
> +  instunroll = 4,
> +  loopunroll = 15,
> +  callunroll = 3,
> +  recunroll = 2,
> +  sizemcode = 32,
> +  maxmcode = 512,
> +}
please sort alphabetically
> +
> +-- Basic setup for all tests to clean up after a previous
> +-- executor.
> +local function luajit_tests_setup(ctx)
> +  -- Reset the JIT to the defaults.
> +  if ctx.jit == false then
> +    jit.off()
> +  elseif LJ_HASJIT then
> +    jit.on()
> +    jit.flush()
> +    jit.opt.start(3)
> +    for k, v in pairs(JIT_DEFAULTS) do
> +      jit.opt.start(k .. '=' .. v)
> +    end
> +  end
> +
> +  -- Reset the GC to the defaults.
> +  collectgarbage('setstepmul', 200)
> +  collectgarbage('setpause', 200)
should we define 200 as a named constant?
> +
> +  -- Collect all garbage at the end. Twice to be sure that all
> +  -- finalizers are run.
> +  collectgarbage()
> +  collectgarbage()
> +end
> +
> +local function run_benches(ctx)
> +  -- Process the tests in the predefined order with ipairs.
> +  for _, bench in ipairs(ctx.benches) do
> +    luajit_tests_setup(ctx)
> +    if bench.setup then bench.setup() end
> +
> +    -- The first run is used as a warm-up, plus results checks.
> +    local payload = bench.payload
> +    -- Generally you should never skip any checks. But sometimes
> +    -- a bench may generate so much output in one run that it is
> +    -- overkill to save the result in the file and test it.
> +    -- So to avoid double time for the test run, just skip the
> +    -- check.
> +    if not bench.skip_check then
> +      local result = payload()
> +      assert(bench.checker(result))
> +    end
> +    local N
> +    local delta_real, delta_cpu
> +    -- Iterations are specified manually.
> +    if bench.iterations then
> +      N = bench.iterations
> +
> +      local start_real = clock.realtime()
> +      local start_cpu  = clock.process_cputime()
> +      for _ = 1, N do
> +        payload()
> +      end
> +      delta_real = clock.realtime() - start_real
> +      delta_cpu  = clock.process_cputime() - start_cpu
> +    else
> +      -- Iterations are determined dynamically, adjusting to fit
> +      -- the minimum time to run for the benchmark.
> +      local min_time = bench.min_time or ctx.min_time
> +      local next_iterations = 1
> +      repeat
> +        N = next_iterations
> +        local start_real = clock.realtime()
> +        local start_cpu  = clock.process_cputime()
> +        for _ = 1, N do
> +          payload()
> +        end
> +        delta_real = clock.realtime() - start_real
> +        delta_cpu  = clock.process_cputime() - start_cpu
> +        next_iterations = iterations_multiplier(min_time, delta_real, N)
> +      until delta_real > min_time or N == next_iterations
> +    end
> +
> +    if bench.teardown then bench.teardown() end
> +
> +    local items = N * bench.items
> +    local items_per_second = math.floor(items / delta_real)
> +    table.insert(ctx.results, {
> +      cpu_time = delta_cpu,
> +      real_time = delta_real,
> +      items_per_second = items_per_second,
> +      iterations = N,
> +      name = bench.name,
> +      time_unit = 's',
> +      -- Fields below are used only for the Google Benchmark
> +      -- compatibility. We don't use them really.
> +      run_name = bench.name,
> +      run_type = 'iteration',
> +      repetitions = 1,
> +      repetition_index = 1,
> +      threads = 1,
> +    })
> +  end
> +end
> +
> +local function run_and_report(ctx)
> +  run_benches(ctx)
> +  report_results(ctx)
> +end
> +
> +function M.new(arg)
> +  assert(type(arg) == 'table', 'given argument should be a table')
> +  local name = test_name()
> +  local ctx = init(argparse(arg, name))
> +  return setmetatable(ctx, {__index = {
> +    add = add_bench,
> +    run = run_benches,
> +    report = report_results,
> +    run_and_report = run_and_report,
> +  }})
> +end
> +
> +return M
>

[-- Attachment #2: Type: text/html, Size: 24967 bytes --]


* Re: [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees " Sergey Kaplun via Tarantool-patches
@ 2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:08     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-13 11:06 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 4716 bytes --]

Hi, Sergey!

thanks for the patch!

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The test cases are split by the different types of trees:
> 1) stretched tree,
> 2) long-lived tree,
> 3) several trees with a depth of the power of 2,
> 4) iteration over all trees in the third test case.
>
> The number of items is the number of `ItemCheck()` first-level calls
> performed in the payload.
> ---
>
> I'm not sure that we should distinguish different subtests here.
> OTOH, how to calculate the amount of items correctly for the whole test
> instead?
>
>   perf/LuaJIT-benches/binary-trees.lua | 94 ++++++++++++++++++++++------
>   1 file changed, 76 insertions(+), 18 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
> index bf040466..9d4dc7b4 100644
> --- a/perf/LuaJIT-benches/binary-trees.lua
> +++ b/perf/LuaJIT-benches/binary-trees.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function BottomUpTree(item, depth)
>     if depth > 0 then
> @@ -18,30 +19,87 @@ local function ItemCheck(tree)
>     end
>   end
>   
> -local N = tonumber(arg and arg[1]) or 0
> +local N = tonumber(arg and arg[1]) or 16
Why 16?
>   local mindepth = 4
>   local maxdepth = mindepth + 2
>   if maxdepth < N then maxdepth = N end
>   
> -do
> -  local stretchdepth = maxdepth + 1
> -  local stretchtree = BottomUpTree(0, stretchdepth)
> -  io.write(string.format("stretch tree of depth %d\t check: %d\n",
> -    stretchdepth, ItemCheck(stretchtree)))
> -end
> +local stretchdepth = maxdepth + 1
> +
> +bench:add({
> +  name = "stretch_depth_" .. tostring(stretchdepth),
> +  payload = function()
> +    local stretchtree = BottomUpTree(0, stretchdepth)
> +    local check = ItemCheck(stretchtree)
> +    return check
> +  end,
> +  items = 1,
> +  checker = function(check)
> +    return check == -1
it deserves a comment
> +  end,
> +})
>   
> -local longlivedtree = BottomUpTree(0, maxdepth)
> +-- This tree created once on the setup for the first test.
> +local longlivedtree

I don't like that we have to save the benchmark state in global variables.
What if we allow setting a user-defined object that holds the state and
is passed to the checker/payload functions?
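
A rough sketch of what such an interface could look like. Everything here
is hypothetical (the `run_one` runner, the hook signatures, the example
workload); it is not the bench module from this series, only an
illustration of passing a state object between the hooks.

```lua
-- Hypothetical mini-runner, not the bench module from this series:
-- setup() returns a state table that is handed to the other hooks.
local function run_one(bench)
  local state = bench.setup and bench.setup() or {}
  local result = bench.payload(state)
  assert(bench.checker(result, state), 'check failed: ' .. bench.name)
  if bench.teardown then bench.teardown(state) end
  return result
end

local result = run_one({
  name = 'sum_of_squares',
  setup = function()
    -- The state replaces a file-level variable shared between hooks.
    return {n = 10}
  end,
  payload = function(state)
    local sum = 0
    for i = 1, state.n do sum = sum + i * i end
    return sum
  end,
  checker = function(res, state)
    local n = state.n
    return res == n * (n + 1) * (2 * n + 1) / 6
  end,
})
print(result)
```

With this shape a long-lived tree could be built in `setup` and read from
the state in the payload, without a shared upvalue.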

>   
> -for depth=mindepth,maxdepth,2 do
> +for depth = mindepth, maxdepth, 2 do
>     local iterations = 2 ^ (maxdepth - depth + mindepth)
> -  local check = 0
> -  for i=1,iterations do
> -    check = check + ItemCheck(BottomUpTree(1, depth)) +
> -            ItemCheck(BottomUpTree(-1, depth))
> -  end
> -  io.write(string.format("%d\t trees of depth %d\t check: %d\n",
> -    iterations*2, depth, check))
> +  local tree_bench
> +  tree_bench = {
> +    name = "tree_depth_" .. tostring(depth),
> +    setup = function()
> +      if not longlivedtree then
> +        longlivedtree = BottomUpTree(0, maxdepth)
> +      end
> +      tree_bench.items = iterations * 2
> +    end,
> +    checker = function(check)
> +      return check == -iterations * 2
> +    end,
> +    payload = function()
> +      local check = 0
> +      for i = 1, iterations do
> +        check = check + ItemCheck(BottomUpTree(1, depth)) +
> +                ItemCheck(BottomUpTree(-1, depth))
> +      end
> +      return check
> +    end,
> +  }
> +
> +  bench:add(tree_bench)
>   end
>   
> -io.write(string.format("long lived tree of depth %d\t check: %d\n",
> -  maxdepth, ItemCheck(longlivedtree)))
> +bench:add({
> +  name = "longlived_depth_" .. tostring(maxdepth),
> +  payload = function()
> +    local check = ItemCheck(longlivedtree)
> +    return check
> +  end,
> +  items = 1,
> +  checker = function(check)
> +    return check == -1
> +  end,
> +})
> +
> +bench:add({
> +  name = "all_in_once",
s/all_in_once/all_in_one/?
> +  payload = function()
> +    for depth = mindepth, maxdepth, 2 do
> +      local iterations = 2 ^ (maxdepth - depth + mindepth)
> +      local tree_bench
> +      local check = 0
> +      for i = 1, iterations do
> +        check = check + ItemCheck(BottomUpTree(1, depth)) +
> +                ItemCheck(BottomUpTree(-1, depth))
> +      end
> +      assert(check == -iterations * 2)
> +    end
> +  end,
> +  -- Geometric progression, starting at maxdepth trees with the
> +  -- corresponding step.
> +  items = (2 * maxdepth) * (4 ^ ((maxdepth - mindepth) / 2 + 1) - 1) / 3,
> +  -- Correctness is checked in the payload function.
> +  skip_check = true,
> +})
> +
> +bench:run_and_report()
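
The `items` value of the "all_in_once" benchmark can be cross-checked by
summing the per-depth item counts directly. A small sketch, assuming
mindepth = 4 and maxdepth = 16 (these values are assumptions for the
illustration, they are not visible in the quoted hunk):

```lua
-- Assumed values, for illustration only.
local mindepth, maxdepth = 4, 16

-- Direct summation: every depth step builds `iterations * 2` trees.
local items = 0
for depth = mindepth, maxdepth, 2 do
  local iterations = 2 ^ (maxdepth - depth + mindepth)
  items = items + iterations * 2
end

-- Closed form of the same geometric series, written from its
-- smallest term; valid when (maxdepth - mindepth) is even.
local steps = (maxdepth - mindepth) / 2 + 1
local closed = 2 * 2 ^ mindepth * (4 ^ steps - 1) / 3

-- The expression used for `items` in the patch.
local patch_items = (2 * maxdepth) *
  (4 ^ ((maxdepth - mindepth) / 2 + 1) - 1) / 3

assert(items == closed)
assert(items == patch_items)
```

For the assumed values all three expressions evaluate to 174752.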


* Re: [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches Sergey Kaplun via Tarantool-patches
@ 2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:07     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-13 11:06 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!

thanks for the patch!

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The number of iterations is fixed for this test to avoid OOM errors
> for the non-GC64 builds.
> ---
>   perf/LuaJIT-benches/array3d.lua | 25 ++++++++++++++++++++-----
>   1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
> index c10b09b1..75ab5b01 100644
> --- a/perf/LuaJIT-benches/array3d.lua
> +++ b/perf/LuaJIT-benches/array3d.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function array_set(self, x, y, z, p)
>     assert(x >= 0 and x < self.nx, "x outside PA")
> @@ -50,10 +51,24 @@ end
>   
>   local dim = tonumber(arg and arg[1]) or 300 -- Array dimension dim^3
>   local packed = arg and arg[2] == "packed"   -- Packed image or flat
> -local arr = array_new(dim, dim, dim, packed)
>   
> -for x,y,z in arr:points() do
> -  arr:set(x, y, z, x*x)
> -end
> -assert(arr.image[dim^3-1] == (dim-1)^2)
> +bench:add({
> +  name = "array3d",
> +  checker = function(arr)
> +    assert(arr.image[dim^3-1] == (dim-1)^2)
> +    return true
> +  end,
> +  payload = function()
> +    local arr = array_new(dim, dim, dim, packed)
> +    for x,y,z in arr:points() do
Please add whitespace after the commas.
> +      arr:set(x, y, z, x*x)
> +    end
> +    return arr
> +  end,
> +  items = dim * dim * dim,
> +  -- Limit the number of iterations to avoid OOM errors for
> +  -- non-GC64 builds.
> +  iterations = 5,
> +})
>   
> +bench:run_and_report()

Looks like benchmark min time does not work as expected:

[1] ~/sources/MRG/tarantool/third_party/luajit $ time ./build/src/luajit perf/LuaJIT-benches/array3d.lua --benchmark_min_time=10
-------------------------------------------------------------------------------------------------------------
Benchmark                                     Time             CPU   Iterations UserCounters...
-------------------------------------------------------------------------------------------------------------
array3d                                     2.10 s          2.13 s            5 items_per_second=64.370M/s

real    0m2.333s
user    0m1.869s
sys     0m0.461s
[1] ~/sources/MRG/tarantool/third_party/luajit $

--benchmark_min_time is set to 10 sec, but benchmark.lua reports "2.10 s",
and the time reported by the `time` utility is also less than 10 sec.
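
One possible reading, sketched below under the assumption that a fixed
`iterations` value takes precedence over the minimum time. The
plan_iterations() helper is hypothetical and is not taken from the bench
module; the 0.42 s per iteration figure assumes the reported 2.10 s is the
total for 5 iterations:

```lua
-- Hypothetical iteration planning: NOT the code of the bench module.
-- Returns how many times the payload would be run.
local function plan_iterations(desc, min_time, time_per_iter)
  if desc.iterations then
    -- Assumption: a fixed count wins, so min_time is ignored.
    return desc.iterations
  end
  -- Otherwise run long enough to cover at least min_time seconds.
  return math.max(1, math.ceil(min_time / time_per_iter))
end

-- array3d fixes the count to 5, so 10 seconds are never reached.
local fixed = plan_iterations({ iterations = 5 }, 10, 0.42)
-- Without the fixed count, 24 runs would be needed to cover 10 seconds.
local timed = plan_iterations({}, 10, 0.42)
assert(fixed == 5)
assert(timed == 24)
```

If the module behaves like this, the observed run time is expected, and
the open question is only whether the flag should be reported as ignored.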


* Re: [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos " Sergey Kaplun via Tarantool-patches
@ 2025-11-13 11:11   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:10     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-13 11:11 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!

thanks for the patch! LGTM

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/chameneos.lua | 32 ++++++++++++++++++++++---------
>   1 file changed, 23 insertions(+), 9 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
> index 78b64c3f..c1002041 100644
> --- a/perf/LuaJIT-benches/chameneos.lua
> +++ b/perf/LuaJIT-benches/chameneos.lua
> @@ -1,8 +1,10 @@
> +local bench = require("bench").new(arg)
>   
>   local co = coroutine
>   local create, resume, yield = co.create, co.resume, co.yield
>   
> -local N = tonumber(arg and arg[1]) or 10
> +local N = tonumber(arg and arg[1]) or 1e7
Why 1e7?
> +local N_ATTEMPTS = N
>   local first, second
>   
>   -- Meet another creature.
> @@ -57,12 +59,24 @@ local function schedule(threads)
>     until false
>   end
>   
> --- A bunch of colorful creatures.
> -local threads = {
> -  creature("blue"),
> -  creature("red"),
> -  creature("yellow"),
> -  creature("blue"),
> -}
> +bench:add({
> +  name = "chameneos",
> +  items = N_ATTEMPTS,
> +  checker = function(meetings) return meetings == N_ATTEMPTS * 2 end,
> +  payload = function()
> +    -- A bunch of colorful creatures.
> +    local threads = {
> +      creature("blue"),
> +      creature("red"),
> +      creature("yellow"),
> +      creature("blue"),
> +    }
>   
> -io.write(schedule(threads), "\n")
> +    local meetings = schedule(threads)
> +    -- XXX: Restore meetings for the next iteration.
> +    N = N_ATTEMPTS
> +    return meetings
> +  end,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring " Sergey Kaplun via Tarantool-patches
@ 2025-11-13 11:17   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:11     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-13 11:17 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/coroutine-ring.lua | 45 ++++++++++++++++----------
>   1 file changed, 28 insertions(+), 17 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
> index 1e8c5ef6..1b86a5ba 100644
> --- a/perf/LuaJIT-benches/coroutine-ring.lua
> +++ b/perf/LuaJIT-benches/coroutine-ring.lua
> @@ -1,3 +1,5 @@
> +local bench = require("bench").new(arg)
> +
>   -- The Computer Language Benchmarks Game
>   -- http://shootout.alioth.debian.org/
>   -- contributed by Sam Roberts
> @@ -7,7 +9,6 @@ local n         = tonumber(arg and arg[1]) or 2e7
>   
>   -- fixed size pool
>   local poolsize  = 503
> -local threads   = {}
>   
>   -- cache these to avoid global environment lookups
>   local create    = coroutine.create
> @@ -15,7 +16,6 @@ local resume    = coroutine.resume
>   local yield     = coroutine.yield
>   
>   local id        = 1
> -local token     = 0
>   local ok
>   
>   local body = function(token)
> @@ -24,19 +24,30 @@ local body = function(token)
>     end
>   end
>   
> --- create all threads
> -for id = 1, poolsize do
> -  threads[id] = create(body)
> -end
> -
> --- send the token
> -repeat
> -  if id == poolsize then
> -    id = 1
> -  else
> -    id = id + 1
> -  end
> -  ok, token = resume(threads[id], token)
> -until token == n
> +bench:add({
> +  name = "coroutine_ring",
> +  payload = function()
> +    local token     = 0
Please use a single space before "=".
> +    -- create all threads
Please start the comment with an uppercase letter and end it with a dot.
> +    local threads   = {}
Please use a single space before "=".
> +    for id = 1, poolsize do
> +      threads[id] = create(body)
> +    end
> +
> +    -- send the token
Please start the comment with an uppercase letter and end it with a dot.
> +    repeat
> +      if id == poolsize then
> +        id = 1
> +      else
> +        id = id + 1
> +      end
> +      ok, token = resume(threads[id], token)
> +    until token == n
> +    return id
> +  end,
> +  checker = function(id) return id == (n % poolsize + 1) end,
> +  items = n,
> +})
> +
> +bench:run_and_report()
>   
> -io.write(id, "\n")
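
The checker relies on the ring position after n token passes being
n % poolsize + 1 when the ring starts at id = 1. A scaled-down sketch of
the same loop (the ring() wrapper and the small values of n are
assumptions made for the illustration):

```lua
-- Scaled-down model of the token ring from the patch. The wrapper
-- function is hypothetical; the loop body mirrors the benchmark.
local function ring(n, poolsize)
  local create = coroutine.create
  local resume = coroutine.resume
  local yield = coroutine.yield

  -- Every resume increments the token by one.
  local body = function(token)
    while true do
      token = yield(token + 1)
    end
  end

  local threads = {}
  for i = 1, poolsize do
    threads[i] = create(body)
  end

  local id, token, ok = 1, 0, nil
  repeat
    if id == poolsize then
      id = 1
    else
      id = id + 1
    end
    ok, token = resume(threads[id], token)
    assert(ok, token)
  until token == n
  return id
end

for _, n in ipairs({ 1, 7, 502, 503, 504, 2000 }) do
  assert(ring(n, 503) == n % 503 + 1)
end
```

Note that the sketch starts every run from id = 1.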


* Re: [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit " Sergey Kaplun via Tarantool-patches
@ 2025-11-13 11:44   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:12     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-13 11:44 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch!

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/euler14-bit.lua | 52 ++++++++++++++++++++---------
>   1 file changed, 36 insertions(+), 16 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
> index 537f2bf3..7c521deb 100644
> --- a/perf/LuaJIT-benches/euler14-bit.lua
> +++ b/perf/LuaJIT-benches/euler14-bit.lua
> @@ -1,22 +1,42 @@
> +local bench = require("bench").new(arg)
>   
>   local bit = require("bit")
>   local bnot, bor, band = bit.bnot, bit.bor, bit.band
>   local shl, shr = bit.lshift, bit.rshift
>   
> -local N = tonumber(arg and arg[1]) or 10000000
> -local cache, m, n = { 1 }, 1, 1
> -if arg and arg[2] then cache = nil end
> -for i=2,N do
> -  local j = i
> -  for len=1,1000000000 do
> -    j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
> -    if cache then
> -      local x = cache[j]; if x then j = x+len; break end
> -    elseif j == 1 then
> -      j = len+1; break
> +local DEFAULT_N = 2e7
> +local N = tonumber(arg and arg[1]) or DEFAULT_N
> +local drop_cache = arg and arg[2]
> +
> +bench:add({
> +  name = "euler14_bit",
> +  payload = function()
> +    local cache, m, n = { 1 }, 1, 1
> +    if drop_cache then cache = nil end
> +    for i=2,N do
s/2,/2, /
> +      local j = i
> +      for len=1,1000000000 do
s/1,/1, /
> +        j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
Please add whitespace around the operators and after the commas, here and below.
> +        if cache then
> +          local x = cache[j]; if x then j = x+len; break end
Whitespace here as well.
> +        elseif j == 1 then
> +          j = len+1; break
s/+/ + /
> +        end
> +      end
> +      if cache then cache[i] = j end
> +      if j > m then m, n = j, i end
> +    end
> +    return {n = n, m = m}
> +  end,
> +  checker = function(res)
> +    if N ~= DEFAULT_N then
> +      -- Test only for the default.
> +      return true
> +    else
> +      return res.n == 18064027 and res.m == 623
>       end
> -  end
> -  if cache then cache[i] = j end
> -  if j > m then m, n = j, i end
> -end
> -io.write("Found ", n, " (chain length: ", m, ")\n")
> +  end,
> +  items = N,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life " Sergey Kaplun via Tarantool-patches
@ 2025-11-17  8:35   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:18     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17  8:35 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file.
>
> The output is redirected to /dev/null. The checker tests the result
> after the exact amount of iterations for the fixed field (as it is
> declared in the original benchmark).
> ---
>   perf/LuaJIT-benches/life.lua | 79 +++++++++++++++++++++++++++++++++++-
>   1 file changed, 78 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
> index 911d9fe1..d0e4dc98 100644
> --- a/perf/LuaJIT-benches/life.lua
> +++ b/perf/LuaJIT-benches/life.lua
> @@ -3,6 +3,8 @@
>   -- modified to use ANSI terminal escape sequences
>   -- modified to use for instead of while
>   
> +local bench = require('bench').new(arg)
> +
>   local write=io.write
>   
>   ALIVE="¥"	DEAD="þ"
We usually use ASCII-only symbols. Should we replace these with an
ASCII-only alternative?
> @@ -106,6 +108,81 @@ function LIFE(w,h)
>       if gen>2000 then break end
>       --delay()		-- no delay
dead code
>     end
> +  return thisgen
>   end
>   
> -LIFE(40,20)
> +-- Result of the LIFE(40, 20) after 2000 generations.
> +--[[
> +----------------------------------------
> +----------------------------------------
> +--OO--------------------------O---------
> +-OO--------------------------O-O--------
> +---O--------------------------O---------
> +----------------------------------------
> +----------------------------------------
> +----------------------------------------
> +----------------------------------------
> +----------------------------------------
> +----------------------------------------
> +----------------------------------------
> +---O------------------------------------
> +--O-O-----------------------------------
> +--O-O-----------------------------------
> +---O------------------------------------
> +----------------------------------------
> +-------OO-------------------------------
> +-------OO-------------------------------
> +----------------------------------------
> +]]
> +
> +local function check_life(thisgen, w, h)
> +  local expected_cells = ARRAY2D(w, h)
> +  for y = 1, h do
> +    for x = 1, w do
> +      expected_cells[y][x] = false
> +    end
> +  end
> +  local alive_cells = {
> +    {3, 3}, {3, 4}, {3, 31},
> +    {4, 2}, {4, 3}, {4, 30}, {4, 32},
> +    {5, 4}, {5, 31},
> +    {13, 4},
> +    {14, 3}, {14, 5},
> +    {15, 3}, {15, 5},
> +    {16, 4},
> +    {18, 8}, {18, 9},
> +    {19, 8}, {19, 9},
> +  }
> +  for _, cell in ipairs(alive_cells) do
> +    local y, x = cell[1], cell[2]
> +    expected_cells[y][x] = true
> +  end
> +  for y = 1, h do
> +    for x = 1, w do
> +      assert(thisgen[y][x] > 0 == expected_cells[y][x],
> +             ('Incorrect value for cell (%d, %d)'):format(x, y))
> +    end
> +  end
> +  return true
> +end
> +
> +local stdout = io.output()
> +
> +bench:add({
> +  name = 'life',
> +  setup = function()
> +    io.output('/dev/null')
> +  end,
> +  payload = function()
> +    return LIFE(40, 20)
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  checker = function(res)
> +    return check_life(res, 40, 20)
> +  end,
> +  items = 2000 * 40 * 20,
> +})
> +
> +bench:run_and_report()
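
The setup/teardown pair above redirects the default output and restores
it afterwards. The same pattern as a self-contained sketch; the
with_output_to() helper is hypothetical (it is not part of the bench
module), it assumes a POSIX system with /dev/null, and unlike the patch
it also closes the sink file:

```lua
-- Hypothetical helper: run `fn` with the default output redirected
-- to `path`, then restore the previous default output.
local function with_output_to(path, fn, ...)
  local saved = io.output()
  local sink = assert(io.open(path, "w"))
  io.output(sink)
  local ok, res = pcall(fn, ...)
  -- Restore the output and close the sink even if `fn` failed.
  io.output(saved)
  sink:close()
  if not ok then error(res, 0) end
  return res
end

local before = io.output()
local res = with_output_to("/dev/null", function()
  io.write("this text is discarded\n")
  return 42
end)
assert(res == 42)
assert(io.output() == before)
```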


* Re: [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch " Sergey Kaplun via Tarantool-patches
@ 2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:13     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17  8:36 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch!

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>
> I'm not sure that amount of permutations is the correct items count.
> Have you any other suggestions?
>
>   perf/LuaJIT-benches/fannkuch.lua | 37 +++++++++++++++++++++++++++++---
>   1 file changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
> index 2a4cd426..c963c66f 100644
> --- a/perf/LuaJIT-benches/fannkuch.lua
> +++ b/perf/LuaJIT-benches/fannkuch.lua

I highly recommend adding descriptions to the benchmarks, at least to the
tests from the "benchmarks game" suite.

You can use the descriptions from [1] and [2].

1. https://benchmarksgame-team.pages.debian.net/benchmarksgame/

2. https://en.wikipedia.org/wiki/The_Computer_Language_Benchmarks_Game#Benchmark_programs


> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function fannkuch(n)
>     local p, q, s, odd, check, maxflips = {}, {}, {}, true, 0, 0
> @@ -6,7 +7,7 @@ local function fannkuch(n)
>       -- Print max. 30 permutations.
>       if check < 30 then
>         if not p[n] then return maxflips end	-- Catch n = 0, 1, 2.
> -      io.write(unpack(p)); io.write("\n")
> +      -- io.write(unpack(p)); io.write("\n")
Isn't it better to remove this line entirely?
>         check = check + 1
>       end
>       -- Copy and flip.
> @@ -46,5 +47,35 @@ local function fannkuch(n)
>     until false
>   end
>   
> -local n = tonumber(arg and arg[1]) or 1
> -io.write("Pfannkuchen(", n, ") = ", fannkuch(n), "\n")
> +local n = tonumber(arg and arg[1]) or 11
> +
> +-- Precomputed numbers taken from:
Please add the title of the paper as well: "Performing Lisp Analysis
of the FANNKUCH Benchmark".
> +-- https://dl.acm.org/doi/pdf/10.1145/382109.382124
> +local FANNKUCH = { 0, 1, 2, 4, 7, 10, 16, 22, 30, 38, 51, 65, 80 }
> +
> +local function factorial(n)
> +  local fact = 1
> +  for i = 2, n do
> +    fact = fact * i
> +  end
> +  return fact
> +end
> +
> +bench:add({
> +  name = "fannkuch",
> +  payload = function()
> +    return fannkuch(n)
> +  end,
> +  checker = function(res)
> +    if n > #FANNKUCH then
> +      -- Not precomputed, so can't check.
> +      return true
> +    else
> +      return res == FANNKUCH[n]
> +    end
> +  end,
> +  -- Assume that we count permutations here.
> +  items = factorial(n),
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide " Sergey Kaplun via Tarantool-patches
@ 2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:17     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17  8:36 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch!

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The benchmark input is given by redirecting the corresponding
> <FASTA_5000000> file generated by the `libs/fasta.lua 5e6`. The output
> from the benchmark is redirected to /dev/null. All checks are done by
> the comparison with the precomputed values for the aforementioned file.
> ---
>   perf/LuaJIT-benches/k-nucleotide.lua | 93 ++++++++++++++++++++++++----
>   1 file changed, 82 insertions(+), 11 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
> index 0bfb41be..ae51dae9 100644
> --- a/perf/LuaJIT-benches/k-nucleotide.lua
> +++ b/perf/LuaJIT-benches/k-nucleotide.lua
> @@ -1,3 +1,4 @@
> +local bench = require('bench').new(arg)
>   
>   local function kfrequency(seq, freq, k, frame)
>     local sub = string.sub
> @@ -12,7 +13,8 @@ local function count(seq, frag)
>     local k = #frag
>     local freq = {}
>     for frame=1,k do kfrequency(seq, freq, k, frame) end
> -  io.write(freq[frag] or 0, "\t", frag, "\n")
> +  return freq[frag]
> +  -- io.write(freq[frag] or 0, "\t", frag, "\n")
Remove this entirely?
>   end
>   
>   local function frequency(seq, k)
> @@ -24,10 +26,13 @@ local function frequency(seq, k)
>       local fa, fb = freq[a], freq[b]
>       return fa == fb and a > b or fa > fb
>     end)
> +  local res = {}
>     for _,c in ipairs(sfreq) do
> -    io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
> +    -- io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
remove?
> +    res[c] = freq[c]*100/sum
>     end
> -  io.write("\n")
> +  -- io.write("\n")
> +  return res
>   end
>   
>   local function readseq()
> @@ -48,11 +53,77 @@ local function readseq()
>     return string.upper(table.concat(lines, "", 1, ln))
>   end
>   
> -local seq = readseq()
> -frequency(seq, 1)
> -frequency(seq, 2)
> -count(seq, "GGT")
> -count(seq, "GGTA")
> -count(seq, "GGTATT")
> -count(seq, "GGTATTTTAATT")
> -count(seq, "GGTATTTTAATTTATAGT")
> +local function check_freq(res, expected)
> +  for k,v in pairs(expected) do
> +    assert(string.format("%0.3f", res[k]) == v,
> +           "Incorrect frequency for fragment " .. k)
> +  end
> +end
> +
> +-- The input is generated by `fasta.lua 5e6'. The check function
> +-- is corresponding.
> +local N = 5e6
> +-- See <libs/fasta.lua> for the details.
> +local items = N * 5
> +bench:add({
> +  name = "k_nucleotide",
> +  payload = function()
> +    local seq = readseq()
> +    local sfreq1 = frequency(seq, 1)
> +    local sfreq2 = frequency(seq, 2)
> +    local GGT  = count(seq, "GGT")
> +    local GGTA = count(seq, "GGTA")
> +    local GGTATT = count(seq, "GGTATT")
> +    local GGTATTTTAATT = count(seq, "GGTATTTTAATT")
> +    local GGTATTTTAATTTATAGT = count(seq, "GGTATTTTAATTTATAGT")
> +
> +    local res = {
> +      sfreq1 = sfreq1,
> +      sfreq2 = sfreq2,
> +      GGT  = GGT,
> +      GGTA = GGTA,
> +      GGTATT = GGTATT,
> +      GGTATTTTAATT = GGTATTTTAATT,
> +      GGTATTTTAATTTATAGT = GGTATTTTAATTTATAGT,
> +    }
> +    -- XXX: Reset input for the non-check iteration.
> +    io.stdin:seek("set", 0)
> +    return res
> +  end,
> +  checker = function(res)
> +    check_freq(res.sfreq1, {
> +      A = "30.296",
> +      T = "30.149",
> +      C = "19.800",
> +      G = "19.754",
> +    })
> +    check_freq(res.sfreq2, {
> +      AA = "9.177",
> +      TA = "9.132",
> +      AT = "9.130",
> +      TT = "9.091",
> +      CA = "6.002",
> +      AC = "6.001",
> +      AG = "5.987",
> +      GA = "5.984",
> +      CT = "5.971",
> +      TC = "5.971",
> +      GT = "5.957",
> +      TG = "5.956",
> +      CC = "3.917",
> +      GC = "3.911",
> +      CG = "3.909",
> +      GG = "3.902",
> +    })
> +
> +    assert(res.GGT == 294331)
> +    assert(res.GGTA == 89290)
> +    assert(res.GGTATT == 9462)
> +    assert(res.GGTATTTTAATT == 178)
> +    assert(res.GGTATTTTAATTTATAGT == 178)
> +    return true
> +  end,
> +  items = items,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:30     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch!  LGTM

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/recursive-ack.lua | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/recursive-ack.lua b/perf/LuaJIT-benches/recursive-ack.lua
> index fad30589..1172d4b3 100644
> --- a/perf/LuaJIT-benches/recursive-ack.lua
> +++ b/perf/LuaJIT-benches/recursive-ack.lua
> @@ -1,3 +1,5 @@
> +local bench = require("bench").new(arg)
> +
>   local function Ack(m, n)
>     if m == 0 then return n+1 end
>     if n == 0 then return Ack(m-1, 1) end
> @@ -5,4 +7,17 @@ local function Ack(m, n)
>   end
>   
>   local N = tonumber(arg and arg[1]) or 10
> -io.write("Ack(3,", N ,"): ", Ack(3,N), "\n")
> +
> +bench:add({
> +  name = "recursive_ack",
> +  -- Sum of calls for the function RA(3, N).
> +  items = 128 * ((4 ^ N - 1) / 3) - 40 * (2 ^ N - 1) + 3 * N + 15,
> +  payload = function()
> +    return Ack(3, N)
> +  end,
> +  checker = function(res)
> +    return res == 2 ^ (N + 3) - 3
> +  end,
> +})
> +
> +bench:run_and_report()
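
Both the `items` formula and the checker value can be cross-checked for
small N by counting the calls directly. A sketch, assuming the usual
third branch of Ack() (it is outside the quoted hunk) and using a
hypothetical `calls` counter and expected_calls() helper:

```lua
-- Instrumented copy of the benchmark function; the counter is an
-- addition made for this sketch only.
local calls = 0
local function Ack(m, n)
  calls = calls + 1
  if m == 0 then return n + 1 end
  if n == 0 then return Ack(m - 1, 1) end
  return Ack(m - 1, Ack(m, n - 1))
end

-- The `items` expression from the patch.
local function expected_calls(N)
  return 128 * ((4 ^ N - 1) / 3) - 40 * (2 ^ N - 1) + 3 * N + 15
end

for N = 0, 4 do
  calls = 0
  local res = Ack(3, N)
  -- The checker value from the patch.
  assert(res == 2 ^ (N + 3) - 3)
  assert(calls == expected_calls(N))
end
```

The range stops at N = 4 only to keep the check fast.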


* Re: [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:29     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The output is redirected to /dev/null. The check is skipped since it is
> very inconvenient to check the binary output, especially since it may be
> configured by the parameter.
> ---
>   perf/LuaJIT-benches/ray.lua | 76 ++++++++++++++++++++++++-------------
>   1 file changed, 50 insertions(+), 26 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
> index 2acc24c0..f7b76d0a 100644
> --- a/perf/LuaJIT-benches/ray.lua
> +++ b/perf/LuaJIT-benches/ray.lua
> @@ -1,10 +1,8 @@
> +local bench = require("bench").new(arg)
> +
>   local sqrt = math.sqrt
>   local huge = math.huge
> -
> -local delta = 1
> -while delta * delta + 1 ~= 1 do
> -  delta = delta * 0.5
> -end
> +local delta
>   
>   local function length(x, y, z)  return sqrt(x*x + y*y + z*z) end
>   local function vlen(v)          return length(v[1], v[2], v[3]) end
> @@ -110,26 +108,52 @@ end
>   
>   
>   local level, n, ss = tonumber(arg[1]) or 9, tonumber(arg[2]) or 256, 4
> -local iss = 1/ss
> -local gf = 255/(ss*ss)
> -
> -io.write(("P5\n%d %d\n255\n"):format(n, n))
> -local light = { unitise(-1, -3, 2) }
> -ilight = { -light[1], -light[2], -light[3] }
> -local camera = { 0, 0, -4 }
> -local dir = { 0, 0, 0 }
> -
> -local scene = create(level, {0, -1, 0}, 1)
> -
> -for y = n/2-1, -n/2, -1 do
> -  for x = -n/2, n/2-1 do
> -    local g = 0
> -    for d = y, y+.99, iss do
> -      for e = x, x+.99, iss do
> -        dir[1], dir[2], dir[3] = unitise(e, d, n)
> -        g = g + ray_trace(light, camera, dir, scene)
> +
> +local stdout = io.output()
> +
> +bench:add({
> +  name = "ray",
> +  -- Avoid skip checking here, since it is not very convenient.
> +  -- If you want to check the behaviour -- drop the setup
> +  -- function.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  payload = function()
> +    local iss = 1/ss
Please add more whitespace around the operators, here and below.
> +    local gf = 255/(ss*ss)
> +
> +    delta = 1
> +    while delta * delta + 1 ~= 1 do
> +      delta = delta * 0.5
> +    end
> +
> +    io.write(("P5\n%d %d\n255\n"):format(n, n))
> +    local light = { unitise(-1, -3, 2) }
> +    ilight = { -light[1], -light[2], -light[3] }
> +    local camera = { 0, 0, -4 }
> +    local dir = { 0, 0, 0 }
> +
> +    local scene = create(level, {0, -1, 0}, 1)
> +
> +    for y = n/2-1, -n/2, -1 do
> +      for x = -n/2, n/2-1 do
> +        local g = 0
> +        for d = y, y+.99, iss do
> +          for e = x, x+.99, iss do
> +            dir[1], dir[2], dir[3] = unitise(e, d, n)
> +            g = g + ray_trace(light, camera, dir, scene)
> +          end
> +        end
> +        io.write(string.char(math.floor(0.5 + g*gf)))
>         end
>       end
> -    io.write(string.char(math.floor(0.5 + g*gf)))
> -  end
> -end
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  items = n * n * level,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:27     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The output is redirected to /dev/null. The check is skipped since it is
> very inconvenient to store the huge file in the repository with the
> reference value.
> ---
>   perf/LuaJIT-benches/pidigits-nogmp.lua | 49 ++++++++++++++++++--------
>   1 file changed, 35 insertions(+), 14 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
> index 63a1cb0e..e96b3e45 100644
> --- a/perf/LuaJIT-benches/pidigits-nogmp.lua
> +++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   -- Start of dynamically compiled chunk.
>   local chunk = [=[
> @@ -80,21 +81,41 @@ end)
>   
>   ]=] -- End of dynamically compiled chunk.
>   
> -local N = tonumber(arg and arg[1]) or 27
> +local N = tonumber(arg and arg[1]) or 5000
Why 5000 by default?
>   local RADIX = N < 6500 and 2^36 or 2^32 -- Avoid overflow.
>   
> --- Substitute radix and compile chunk.
> -local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
> +local stdout = io.output()
>   
> --- Print lines with 10 digits.
> -for i=10,N,10 do
> -  for j=1,10 do io.write(pidigit()) end
> -  io.write("\t:", i, "\n")
> -end
> +bench:add({
> +  name = "pidigit_nogmp",
> +  -- Avoid skip checking here, since it is not very convenient.
> +  -- If you want to check the behaviour -- drop the setup
> +  -- function.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  payload = function()
> +    -- Substitute radix and compile chunk.
> +    local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
>   
> --- Print remaining digits (if any).
> -local n10 = N % 10
> -if n10 ~= 0 then
> -  for i=1,n10 do io.write(pidigit()) end
Please add spaces after the commas and around the operators, here and below.
> -  io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
> -end
> +    -- Print lines with 10 digits.
> +    for i=10,N,10 do
> +      for j=1,10 do io.write(pidigit()) end
> +      io.write("\t:", i, "\n")
> +    end
> +
> +    -- Print remaining digits (if any).
> +    local n10 = N % 10
> +    if n10 ~= 0 then
> +      for i=1,n10 do io.write(pidigit()) end
> +      io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
> +    end
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  items = N,
> +})
> +
> +bench:run_and_report()
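
For reference, a minimal sketch of the redirect-and-restore pattern that the
setup and teardown hooks above rely on. It is an illustration only, not code
from the bench module: `redirect` and `restore` are hypothetical names, and
"/dev/null" assumes a POSIX system. Unlike the patch, the sketch also closes
the sink handle before restoring the saved output.

```lua
-- Remember the current default output so it can be restored later.
local saved = io.output()

-- Hypothetical helper: send everything written via io.write() to a sink.
local function redirect(path)
  io.output(path)
end

-- Hypothetical helper: close the sink (if any) and restore the saved output.
local function restore()
  local sink = io.output()
  if sink ~= saved then
    sink:close()
  end
  io.output(saved)
end

redirect("/dev/null")
io.write("this line is discarded\n")
restore()
```

Whether the handle should be closed in teardown is a design choice; the
sketch only shows one way to avoid leaving it open between runs.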


* Re: [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:27     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/partialsums.lua | 69 ++++++++++++++++++-----------
>   1 file changed, 42 insertions(+), 27 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
> index 46bb9da3..ab24b30a 100644
> --- a/perf/LuaJIT-benches/partialsums.lua
> +++ b/perf/LuaJIT-benches/partialsums.lua
> @@ -1,29 +1,44 @@
> +local bench = require("bench").new(arg)
>   
> -local n = tonumber(arg[1])
> -local function pr(fmt, x) io.write(string.format(fmt, x)) end
> +local DEFAULT_N = 1e7
> +local n = tonumber(arg[1]) or DEFAULT_N
Why is 1e7 the default?
>   
> -local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
> -local sqrt, sin, cos = math.sqrt, math.sin, math.cos
> -for k=1,n do
> -  local k2, sk, ck = k*k, sin(k), cos(k)
> -  local k3 = k2*k
> -  a1 = a1 + (2/3)^k
> -  a2 = a2 + 1/sqrt(k)
> -  a3 = a3 + 1/(k2+k)
> -  a4 = a4 + 1/(k3*sk*sk)
> -  a5 = a5 + 1/(k3*ck*ck)
> -  a6 = a6 + 1/k
> -  a7 = a7 + 1/k2
> -  a8 = a8 + alt/k
> -  a9 = a9 + alt/(k+k-1)
> -  alt = -alt
> -end
> -pr("%.9f\t(2/3)^k\n", a1)
> -pr("%.9f\tk^-0.5\n", a2)
> -pr("%.9f\t1/k(k+1)\n", a3)
> -pr("%.9f\tFlint Hills\n", a4)
> -pr("%.9f\tCookson Hills\n", a5)
> -pr("%.9f\tHarmonic\n", a6)
> -pr("%.9f\tRiemann Zeta\n", a7)
> -pr("%.9f\tAlternating Harmonic\n", a8)
> -pr("%.9f\tGregory\n", a9)

The debug prints were lost; is that intentional?

In the previous benches the debug prints were kept, but suppressed.

Also, I propose using the same printf function in all benches for
consistency.

> +bench:add({
> +  name = "partialsums",
> +  payload = function()
> +    local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
> +    local sqrt, sin, cos = math.sqrt, math.sin, math.cos
> +    for k=1,n do
Please add spaces after the commas and around the operators, here and below.
> +      local k2, sk, ck = k*k, sin(k), cos(k)
> +      local k3 = k2*k
> +      a1 = a1 + (2/3)^k
> +      a2 = a2 + 1/sqrt(k)
> +      a3 = a3 + 1/(k2+k)
> +      a4 = a4 + 1/(k3*sk*sk)
> +      a5 = a5 + 1/(k3*ck*ck)
> +      a6 = a6 + 1/k
> +      a7 = a7 + 1/k2
> +      a8 = a8 + alt/k
> +      a9 = a9 + alt/(k+k-1)
> +      alt = -alt
> +    end
> +    return {a1, a2, a3, a4, a5, a6, a7, a8, a9}
> +  end,
> +  checker = function(a)
> +    if n == DEFAULT_N then
> +      assert(a[1] == 2.99999999999999866773)
> +      assert(a[2] == 6323.09512394020111969439)
> +      assert(a[3] == 0.99999989999981531152)
> +      assert(a[4] == 30.31454593111029183206)
> +      assert(a[5] == 42.99523427973661426904)
> +      assert(a[6] == 16.69531136585727182364)
> +      assert(a[7] == 1.64493396684725956547)
> +      assert(a[8] == 0.69314713056010635039)
> +      assert(a[9] == 0.78539813839744787582)
> +    end
> +    return true
> +  end,
> +  items = n,
> +})
> +
> +bench:run_and_report()
>
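
To illustrate the printf suggestion above, a minimal sketch of a helper that
could be shared between benches. Both `printf` and `format_sum` are
hypothetical names used only for this illustration; the format string is the
one from the removed `pr()` calls.

```lua
-- Hypothetical shared helper: format a line and write it to the default
-- output.
local function printf(fmt, ...)
  io.write(string.format(fmt, ...))
end

-- Formatting is kept separate so that it can be checked without capturing
-- the output.
local function format_sum(value, label)
  return string.format("%.9f\t%s\n", value, label)
end

printf("%s", format_sum(1.5, "Harmonic"))
```

With the output redirected in a setup hook, such prints would stay in the
payload but remain suppressed, as in the earlier benches.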


* Re: [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:26     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:25 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/nsieve.lua | 35 +++++++++++++++++++++++++++++-----
>   1 file changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
> index 6de0524f..2d1b66c8 100644
> --- a/perf/LuaJIT-benches/nsieve.lua
> +++ b/perf/LuaJIT-benches/nsieve.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function nsieve(p, m)
>     for i=2,m do p[i] = true end
> @@ -11,11 +12,35 @@ local function nsieve(p, m)
>     return count
>   end
>   
> -local N = tonumber(arg and arg[1]) or 1
> +local DEFAULT_N = 12
Why 12?
> +local N = tonumber(arg and arg[1]) or DEFAULT_N
>   if N < 2 then N = 2 end
>   local primes = {}
>   
> -for i=0,2 do
> -  local m = (2^(N-i))*10000
> -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
> -end
> +local benchmark
> +benchmark = {
> +  name = "nsieve",
> +  payload = function()
> +    local res = {}
> +    local items = 0
> +    for i=0,2 do
Please add spaces after the commas and around the operators, here and below.
> +      local m = (2^(N-i))*10000
> +      items = items + m
> +      res[i] = nsieve(primes, m)
> +    end
> +    benchmark.items = items
> +
> +    return res
> +  end,
> +  checker = function(res)
> +    if N == DEFAULT_N then
> +      assert(res[0] == 2488465)
> +      assert(res[1] == 1299069)
> +      assert(res[2] == 679461)
> +    end
> +    return true
> +  end,
> +}
> +
> +bench:add(benchmark)
> +bench:run_and_report()
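
A sketch of the spacing meant by the whitespace remarks above, together with
the arithmetic behind the `items` counter. The sieve body is a plain Sieve of
Eratosthenes written for this illustration and may differ from the one in
nsieve.lua; `total_items` is a hypothetical helper, not part of the patch.

```lua
-- Sieve of Eratosthenes over p[2..m]; returns the number of primes found.
local function nsieve(p, m)
  for i = 2, m do p[i] = true end
  local count = 0
  for i = 2, m do
    if p[i] then
      for k = i + i, m, i do p[k] = false end
      count = count + 1
    end
  end
  return count
end

-- Total number of sieved items for a given N: the sum of the three limits
-- used by the payload loop.
local function total_items(N)
  local items = 0
  for i = 0, 2 do
    items = items + (2 ^ (N - i)) * 10000
  end
  return items
end
```

Since the total depends only on N, it could presumably be computed once
before the payload runs instead of being assigned from inside it.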


* Re: [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:25     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch!

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/nsieve-bit.lua | 35 +++++++++++++++++++++++++-----
>   1 file changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
> index 820a3726..4858e9e2 100644
> --- a/perf/LuaJIT-benches/nsieve-bit.lua
> +++ b/perf/LuaJIT-benches/nsieve-bit.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local bit = require("bit")
>   local band, bxor, rshift, rol = bit.band, bit.bxor, bit.rshift, bit.rol
> @@ -17,11 +18,35 @@ local function nsieve(p, m)
>     return count
>   end
>   
> -local N = tonumber(arg and arg[1]) or 1
> +local DEFAULT_N = 12
Why 12?
> +local N = tonumber(arg and arg[1]) or DEFAULT_N
>   if N < 2 then N = 2 end
>   local primes = {}
>   
> -for i=0,2 do
Please add spaces after the commas and around the operators, here and below.
> -  local m = (2^(N-i))*10000
> -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
The io.write call is lost; is that intentional?
> -end
> +local benchmark
> +benchmark = {
> +  name = "nsieve_bit",
> +  payload = function()
> +    local res = {}
> +    local items = 0
> +    for i=0,2 do
> +      local m = (2^(N-i))*10000
Please add spaces around the operators.
> +      items = items + m
> +      res[i] = nsieve(primes, m)
> +    end
> +    benchmark.items = items
> +
> +    return res
> +  end,
> +  checker = function(res)
> +    if N == DEFAULT_N then
> +      assert(res[0] == 2488465)
> +      assert(res[1] == 1299069)
> +      assert(res[2] == 679461)
> +    end
> +    return true
> +  end,
> +}
> +
> +bench:add(benchmark)
> +bench:run_and_report()
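
If the report line is worth keeping, one possible shape is to build it from
the values the payload returns. This is only a sketch: `report_line` is a
hypothetical helper, and the format string is taken from the removed
io.write call.

```lua
-- Hypothetical helper: rebuild the original report line from a sieve limit
-- and the prime count returned for it.
local function report_line(m, count)
  return string.format("Primes up to %8d %8d\n", m, count)
end

-- Example with the well-known count of primes below 10000.
io.write(report_line(10000, 1229))
```

The write itself could then go through the redirected output, as is done in
the benches that keep their prints.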


* Re: [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:25     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/nsieve-bit-fp.lua | 35 +++++++++++++++++++++++----
>   1 file changed, 30 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
> index 3971ec1f..d0ab23d2 100644
> --- a/perf/LuaJIT-benches/nsieve-bit-fp.lua
> +++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local floor, ceil = math.floor, math.ceil
>   
> @@ -27,11 +28,35 @@ local function nsieve(p, m)
>     return count
>   end
>   
> -local N = tonumber(arg and arg[1]) or 1
> +local DEFAULT_N = 12
Why 12 instead of 1?
> +local N = tonumber(arg and arg[1]) or DEFAULT_N
>   if N < 2 then N = 2 end
>   local primes = {}
>   
> -for i=0,2 do
> -  local m = (2^(N-i))*10000
> -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
> -end
> +local benchmark
> +benchmark = {
> +  name = "nsieve_bit_fp",
> +  payload = function()
> +    local res = {}
> +    local items = 0
> +    for i=0,2 do
Please add spaces after the commas.
> +      local m = (2^(N-i))*10000
Please add spaces around the operators.
> +      items = items + m
> +      res[i] = nsieve(primes, m)
> +    end
> +    benchmark.items = items
> +
> +    return res
> +  end,
> +  checker = function(res)
> +    if N == DEFAULT_N then
> +      assert(res[0] == 2488465)
> +      assert(res[1] == 1299069)
> +      assert(res[2] == 679461)
> +    end
> +    return true
> +  end,
> +}
> +
> +bench:add(benchmark)
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:20     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey!

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The output is redirected to /dev/null. The check is skipped since it is
> very inconvenient to check the binary output, especially since it may be
> configured by the parameter.
> ---
>   perf/LuaJIT-benches/mandelbrot-bit.lua | 86 +++++++++++++++++---------
>   1 file changed, 57 insertions(+), 29 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
> index 91d96975..a6b5e1f8 100644
> --- a/perf/LuaJIT-benches/mandelbrot-bit.lua
> +++ b/perf/LuaJIT-benches/mandelbrot-bit.lua
> @@ -1,33 +1,61 @@
> -
>   local bit = require("bit")
> -local bor, band = bit.bor, bit.band
> -local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
> -local write, char, unpack = io.write, string.char, unpack
> -local N = tonumber(arg and arg[1]) or 100
> -local M, buf = 2/N, {}
> -write("P4\n", N, " ", N, "\n")
> -for y=0,N-1 do
> -  local Ci, b, p = y*M-1, -16777216, 0
> -  local Ciq = Ci*Ci
> -  for x=0,N-1,2 do
> -    local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
> -    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
> -    local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
> -    b = rol(b, 2)
> -    for i=1,49 do
> -      Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
> -      Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
> -      Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
> -      Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
> -      if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
> -      if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
> -      if band(b, 3) == 0 then break end
> +
> +local bench = require("bench").new(arg)
> +
> +local N = tonumber(arg and arg[1]) or 5000
> +
> +local function payload()
> +  -- These functions must not be an upvalue but the stack slot.
Please add details about the performance impact here.
> +  local N = N
> +  local bor, band = bit.bor, bit.band
> +  local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
> +  local write, char, unpack = io.write, string.char, unpack
> +
> +  local M, buf = 2/N, {}
> +  write("P4\n", N, " ", N, "\n")
> +  for y=0,N-1 do
please add spaces here and below
> +    local Ci, b, p = y*M-1, -16777216, 0
> +    local Ciq = Ci*Ci
> +    for x=0,N-1,2 do
> +      local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
> +      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
> +      local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
> +      b = rol(b, 2)
> +      for i=1,49 do
> +        Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
> +        Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
> +        Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
> +        Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
> +        if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
> +        if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
> +        if band(b, 3) == 0 then break end
> +      end
> +      if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
>       end
> -    if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
> -  end
> -  if b ~= -16777216 then
> -    if band(N, 1) ~= 0 then b = shr(b, 1) end
> -    p = p + 1; buf[p] = shl(b, 8-band(N, 7))
> +    if b ~= -16777216 then
> +      if band(N, 1) ~= 0 then b = shr(b, 1) end
> +      p = p + 1; buf[p] = shl(b, 8-band(N, 7))
> +    end
> +    write(char(unpack(buf, 1, p)))
>     end
> -  write(char(unpack(buf, 1, p)))
>   end
> +
> +local stdout = io.output()
> +
> +bench:add({
> +  name = "mandelbrot_bit",
> +  items = N,
> +  -- XXX: This is inconvenient to have the binary file in the
> +  -- repository for the comparison. If the check is needed, run
> +  -- the payload manually.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  payload = payload,
> +})
> +
> +bench:run_and_report()
>
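
To make the upvalue remark above concrete, a small sketch of the two access
patterns. It uses `math.sqrt` instead of the `bit` functions so that it runs
on any Lua, and both function names are made up for the illustration. It
makes no claim about which variant is faster; that is the detail requested in
the comment above.

```lua
-- File-level local: inside the functions below it is an upvalue.
local sqrt = math.sqrt

-- Variant 1: the loop calls `sqrt` through the upvalue.
local function sum_via_upvalue(n)
  local acc = 0
  for i = 1, n do
    acc = acc + sqrt(i * i)
  end
  return acc
end

-- Variant 2: copy the function into a function-local first, so the loop
-- reads it from a stack slot (the pattern used in the patched payload).
local function sum_via_local(n)
  local sqrt = sqrt
  local acc = 0
  for i = 1, n do
    acc = acc + sqrt(i * i)
  end
  return acc
end
```

Both variants compute the same result; only the way the callee is looked up
differs.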


* Re: [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:22     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/md5.lua | 27 ++++++++++++++++++++-------
>   1 file changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
> index fdf6b4a7..5ec67527 100644
> --- a/perf/LuaJIT-benches/md5.lua
> +++ b/perf/LuaJIT-benches/md5.lua
> @@ -1,5 +1,6 @@
> -
>   local bit = require("bit")
> +local bench = require("bench").new(arg)
> +
>   local tobit, tohex, bnot = bit.tobit or bit.cast, bit.tohex, bit.bnot
>   local bor, band, bxor = bit.bor, bit.band, bit.bxor
>   local lshift, rshift, rol, bswap = bit.lshift, bit.rshift, bit.rol, bit.bswap
> @@ -147,7 +148,7 @@ assert(md5('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789') ==
>   assert(md5('12345678901234567890123456789012345678901234567890123456789012345678901234567890') ==
>          '57edf4a22be3c955ac49da2e2107b67a')
>   
> -local N = tonumber(arg and arg[1]) or 10000
> +local N = tonumber(arg and arg[1]) or 20000
This deserves a comment: why 20000 instead of 10000?
>   
>     -- Credits: William Shakespeare, Romeo and Juliet
>   local txt = [[Rebellious subjects, enemies to peace,
> @@ -176,8 +177,20 @@ Once more, on pain of death, all men depart.]]
>     txt = txt..txt..txt..txt
>     txt = txt..txt..txt..txt
>   
> -for i=1,N do
> -  res = md5(txt)
> -end
> -assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
> -
> +bench:add({
> +  name = 'md5',
> +  payload = function()
> +    local res
> +    for i=1,N do
s/1,/1, /
> +      res = md5(txt)
> +    end
> +    return res
> +  end,
> +  checker = function(res)
> +    assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
> +    return true
> +  end,
> +  items = N,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:24     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:26 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/nbody.lua | 127 ++++++++++++++++++++--------------
>   1 file changed, 74 insertions(+), 53 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
> index e0ff8f77..f01c20a3 100644
> --- a/perf/LuaJIT-benches/nbody.lua
> +++ b/perf/LuaJIT-benches/nbody.lua
> @@ -1,56 +1,12 @@
> +local bench = require("bench").new(arg)
>   
>   local sqrt = math.sqrt
>   
>   local PI = 3.141592653589793
>   local SOLAR_MASS = 4 * PI * PI
>   local DAYS_PER_YEAR = 365.24
> -local bodies = {
> -  { -- Sun
> -    x = 0,
> -    y = 0,
> -    z = 0,
> -    vx = 0,
> -    vy = 0,
> -    vz = 0,
> -    mass = SOLAR_MASS
> -  },
> -  { -- Jupiter
> -    x = 4.84143144246472090e+00,
> -    y = -1.16032004402742839e+00,
> -    z = -1.03622044471123109e-01,
> -    vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
> -    vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
> -    vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
> -    mass = 9.54791938424326609e-04 * SOLAR_MASS
> -  },
> -  { -- Saturn
> -    x = 8.34336671824457987e+00,
> -    y = 4.12479856412430479e+00,
> -    z = -4.03523417114321381e-01,
> -    vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
> -    vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
> -    vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
> -    mass = 2.85885980666130812e-04 * SOLAR_MASS
> -  },
> -  { -- Uranus
> -    x = 1.28943695621391310e+01,
> -    y = -1.51111514016986312e+01,
> -    z = -2.23307578892655734e-01,
> -    vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
> -    vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
> -    vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
> -    mass = 4.36624404335156298e-05 * SOLAR_MASS
> -  },
> -  { -- Neptune
> -    x = 1.53796971148509165e+01,
> -    y = -2.59193146099879641e+01,
> -    z = 1.79258772950371181e-01,
> -    vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
> -    vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
> -    vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
> -    mass = 5.15138902046611451e-05 * SOLAR_MASS
> -  }
> -}
> +local bodies
> +local nbody
>   
>   local function advance(bodies, nbody, dt)
>     for i=1,nbody do
> @@ -110,10 +66,75 @@ local function offsetMomentum(b, nbody)
>     b[1].vz = -pz / SOLAR_MASS
>   end
>   
> -local N = tonumber(arg and arg[1]) or 1000
> -local nbody = #bodies
> +local DEFAULT_N = 5e6
> +local N = tonumber(arg and arg[1]) or DEFAULT_N
>   
> -offsetMomentum(bodies, nbody)
> -io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
> -for i=1,N do advance(bodies, nbody, 0.01) end
> -io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
> +bench:add({
> +  name = "nbody",
> +  payload = function()
> +    bodies = {
> +      { -- Sun
> +        x = 0,
> +        y = 0,
> +        z = 0,
> +        vx = 0,
> +        vy = 0,
> +        vz = 0,
> +        mass = SOLAR_MASS
> +      },
> +      { -- Jupiter
> +        x = 4.84143144246472090e+00,
> +        y = -1.16032004402742839e+00,
> +        z = -1.03622044471123109e-01,
> +        vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
> +        vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
> +        vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
> +        mass = 9.54791938424326609e-04 * SOLAR_MASS
> +      },
> +      { -- Saturn
> +        x = 8.34336671824457987e+00,
> +        y = 4.12479856412430479e+00,
> +        z = -4.03523417114321381e-01,
> +        vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
> +        vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
> +        vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
> +        mass = 2.85885980666130812e-04 * SOLAR_MASS
> +      },
> +      { -- Uranus
> +        x = 1.28943695621391310e+01,
> +        y = -1.51111514016986312e+01,
> +        z = -2.23307578892655734e-01,
> +        vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
> +        vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
> +        vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
> +        mass = 4.36624404335156298e-05 * SOLAR_MASS
> +      },
> +      { -- Neptune
> +        x = 1.53796971148509165e+01,
> +        y = -2.59193146099879641e+01,
> +        z = 1.79258772950371181e-01,
> +        vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
> +        vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
> +        vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
> +        mass = 5.15138902046611451e-05 * SOLAR_MASS
> +      }
> +    }
> +    nbody = #bodies
> +
> +    offsetMomentum(bodies, nbody)
Two `io.write()` calls were lost. Is it intentional?
> +
> +    assert(energy(bodies, nbody) == -0.16907516382852447179,
> +             "Correct start energy")
> +    for i=1,N do advance(bodies, nbody, 0.01) end
s/1,/1, /
> +  end,
> +  checker = function()
> +    if N == DEFAULT_N then
> +      assert(energy(bodies, nbody) == -0.16908313397890917251,
> +             "Correct result energy")
> +    end
> +    return true
> +  end,
> +  items = N,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:56   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:32     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:56 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The time for each subsequent benchmark is increased up to 4 seconds,
> according to the defaults in the "bench" framework. The main difference
> between this test and others that will be added in next commits is
> the usage of FFI arrays instead of plain Lua tables.
> ---
>   perf/LuaJIT-benches/scimark-2010-12-20.lua | 93 +++++++++++++---------
>   1 file changed, 54 insertions(+), 39 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/scimark-2010-12-20.lua b/perf/LuaJIT-benches/scimark-2010-12-20.lua
> index 353acb7c..3fb627fa 100644
> --- a/perf/LuaJIT-benches/scimark-2010-12-20.lua
> +++ b/perf/LuaJIT-benches/scimark-2010-12-20.lua
> @@ -9,25 +9,26 @@
>   local SCIMARK_VERSION = "2010-12-10"
>   local SCIMARK_COPYRIGHT = "Copyright (C) 2006-2010 Mike Pall"
>   
> -local MIN_TIME = 2.0
> +local bench = require("bench").new(arg)
> +
>   local RANDOM_SEED = 101009 -- Must be odd.
>   local SIZE_SELECT = "small"
>   
>   local benchmarks = {
>     "FFT", "SOR", "MC", "SPARSE", "LU",
>     small = {
> -    FFT		= { 1024 },
> -    SOR		= { 100 },
> -    MC		= { },
> -    SPARSE	= { 1000, 5000 },
> -    LU		= { 100 },
> +    FFT		= { params = { 1024 }, cycles = 50000, },
> +    SOR		= { params = { 100 }, cycles = 50000, },
> +    MC		= { params = { }, cycles = 15e7, },
> +    SPARSE	= { params = { 1000, 5000 }, cycles = 15e4, },
> +    LU		= { params = { 100 }, cycles = 5000, },
>     },
>     large = {
> -    FFT		= { 1048576 },
> -    SOR		= { 1000 },
> -    MC		= { },
> -    SPARSE	= { 100000, 1000000 },
> -    LU		= { 1000 },
> +    FFT		= { params = { 1048576 }, cycles = 25, },
> +    SOR		= { params = { 1000 }, cycles = 500, },
> +    MC		= { params = { }, cycles = 15e7, },
> +    SPARSE	= { params = { 100000, 1000000 }, cycles = 1500, },
> +    LU		= { params = { 1000 }, cycles = 50, },
>     },
>   }
Please add a comment about the chosen parameters.
>   
> @@ -342,48 +343,51 @@ local function fmtparams(p1, p2)
>     return ""
>   end
>   
> -local function measure(min_time, name, ...)
> +local function measure(name, cycles, ...)
>     array_init()
>     rand_init(RANDOM_SEED)
>     local run = benchmarks[name](...)
> -  local cycles = 1
> -  repeat
> -    local tm = clock()
> -    local flops = run(cycles, ...)
> -    tm = clock() - tm
> -    if tm >= min_time then
> -      local res = flops / tm * 1.0e-6
> -      local p1, p2 = ...
> -      printf("%-7s %8.2f  %s\n", name, res, fmtparams(...))
> -      return res
> -    end
> -    cycles = cycles * 2
> -  until false
> +  local flops = run(cycles, ...)
> +  return flops
>   end
>   
> -printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
> -       SCIMARK_VERSION, SCIMARK_COPYRIGHT)
> +-- printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
> +--        SCIMARK_VERSION, SCIMARK_COPYRIGHT)
>   

I propose to move this to a comment with the test description.

Something like:

The test runs the Lua version of SciMark 2.0a, which is a benchmark for
scientific and numerical computing developed by programmers at NIST
(National Institute of Standards and Technology). The test is made up of
Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo,
Sparse Matrix Multiply, and dense LU matrix factorization benchmarks.

plus a description of the available test-specific options (noffi, small,
etc.), or just a command line that shows the usage:
./scimark-2010-12-20.lua help

>   while arg and arg[1] do
>     local a = table.remove(arg, 1)
> -  if a == "-noffi" then
> +  if a == "noffi" then
>       package.preload.ffi = nil
> -  elseif a == "-small" then
> +  elseif a == "small" then
>       SIZE_SELECT = "small"
> -  elseif a == "-large" then
> +  elseif a == "large" then
>       SIZE_SELECT = "large"
>     elseif benchmarks[a] then
> -    local p = benchmarks[SIZE_SELECT][a]
> -    measure(MIN_TIME, a, tonumber(arg[1]) or p[1], tonumber(arg[2]) or p[2])
> +    local cycles = benchmarks[SIZE_SELECT][a].cycles
> +    local p = benchmarks[SIZE_SELECT][a].params
> +    local b
> +    b = {
> +      name = a,
> +      -- XXX: The description of tests for each function is too
> +      -- inconvenient.
> +      skip_check = true,
> +      payload = function()
> +        local flops = measure(a, cycles, tonumber(arg[1]) or p[1],
> +                              tonumber(arg[2]) or p[2])
> +        b.items = flops
> +      end,
> +    }
> +    bench:add(b)
> +    bench:run_and_report()
>       return
>     else
> -    printf("Usage: scimark [-noffi] [-small|-large] [BENCH params...]\n\n")
> -    printf("BENCH   -small         -large\n")
> +    printf("Usage: scimark [noffi] [small|large] [BENCH params...]\n\n")
> +    printf("BENCH   small         large\n")
>       printf("---------------------------------------\n")
>       for _,name in ipairs(benchmarks) do
>         printf("%-7s %-13s %s\n", name,
> -	     fmtparams(unpack(benchmarks.small[name])),
> -	     fmtparams(unpack(benchmarks.large[name])))
> +	     fmtparams(unpack(benchmarks.small[name].params)),
> +	     fmtparams(unpack(benchmarks.large[name].params)))
>       end
>       printf("\n")
>       os.exit(1)
> @@ -393,8 +397,19 @@ end
>   local params = benchmarks[SIZE_SELECT]
>   local sum = 0
>   for _,name in ipairs(benchmarks) do
> -  sum = sum + measure(MIN_TIME, name, unpack(params[name]))
> +  local cycles = params[name].cycles
> +  local b
> +  b = {
> +    name = name,
> +    -- XXX: The description of tests for each function is too
> +    -- inconvenient.
> +    skip_check = true,
> +    payload = function()
> +      local flops = measure(name, cycles, unpack(params[name].params))
> +      b.items = flops
> +    end,
> +  }
> +  bench:add(b)
>   end
> -printf("\nSciMark %8.2f  [%s problem sizes]\n", sum / #benchmarks, SIZE_SELECT)
> -io.flush()
>   
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:58   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:32     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:58 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 721 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with a minor comment:

I propose squashing this patch into "perf: add CMake infrastructure",
or adding "Needed for ..." to the commit message.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This helps to avoid this library in the scanning of the test files
> for the suite.
> ---
>   perf/LuaJIT-benches/{ => libs}/scimark_lib.lua | 0
>   1 file changed, 0 insertions(+), 0 deletions(-)
>   rename perf/LuaJIT-benches/{ => libs}/scimark_lib.lua (100%)
>
> diff --git a/perf/LuaJIT-benches/scimark_lib.lua b/perf/LuaJIT-benches/libs/scimark_lib.lua
> similarity index 100%
> rename from perf/LuaJIT-benches/scimark_lib.lua
> rename to perf/LuaJIT-benches/libs/scimark_lib.lua


* Re: [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:30     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:59 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1510 bytes --]

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/recursive-fib.lua | 28 +++++++++++++++++++++++++--
>   1 file changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/recursive-fib.lua b/perf/LuaJIT-benches/recursive-fib.lua
> index ef9950de..99af3f9e 100644
> --- a/perf/LuaJIT-benches/recursive-fib.lua
> +++ b/perf/LuaJIT-benches/recursive-fib.lua
> @@ -1,7 +1,31 @@
> +local bench = require("bench").new(arg)
> +
>   local function fib(n)
>     if n < 2 then return 1 end
>     return fib(n-2) + fib(n-1)
>   end
>   
> -local n = tonumber(arg[1]) or 10
> -io.write(string.format("Fib(%d): %d\n", n, fib(n)))
The debug print was lost. Is that intentional?
> +local n = tonumber(arg[1]) or 40
Why 40?
> +
> +local benchmark
> +benchmark = {
> +  name = "recursive_fib",
> +  checker = function(res)
> +    local km1, k = 1, 1
> +    for i = 2, n do
> +      local tmp = k + km1
> +      km1 = k
> +      k = tmp
> +    end
> +    return k == res
> +  end,
> +  payload = function()
> +    local res = fib(n)
> +    -- Number of calls.
> +    benchmark.items = res * 2 - 1
> +    return res
> +  end,
> +}
> +
> +bench:add(benchmark)
> +bench:run_and_report()
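
A side note on `benchmark.items = res * 2 - 1`: with fib(0) = fib(1) = 1,
the number of calls C(n) satisfies C(n) = C(n-1) + C(n-2) + 1, which gives
C(n) = 2 * fib(n) - 1. A minimal sketch to cross-check this (the
`counted_fib` helper is hypothetical and not part of the patch):

```lua
-- Hypothetical helper: same recursion as fib() in the patch, but it
-- also counts how many times the function is entered.
local calls = 0
local function counted_fib(n)
  calls = calls + 1
  if n < 2 then return 1 end
  return counted_fib(n - 2) + counted_fib(n - 1)
end

for n = 0, 20 do
  calls = 0
  local res = counted_fib(n)
  -- Must match `benchmark.items = res * 2 - 1` from the payload.
  assert(calls == res * 2 - 1)
end
```

So the formula looks right, as long as the base case keeps returning 1
for both n = 0 and n = 1.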


* Re: [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:31     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 13:59 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 3366 bytes --]

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The benchmark input is given by redirecting the corresponding
> <FASTA_5000000> file generated by the `libs/fasta.lua 5e6`. The output
> from the benchmark is redirected to /dev/null. Checks are skipped since
> the output is very huge, and it is overkill to store it in the
> repository.
> ---
>   perf/LuaJIT-benches/revcomp.lua | 72 +++++++++++++++++++++------------
>   1 file changed, 47 insertions(+), 25 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
> index 34fe347b..2b1ffa5c 100644
> --- a/perf/LuaJIT-benches/revcomp.lua
> +++ b/perf/LuaJIT-benches/revcomp.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local sub = string.sub
>   iubc = setmetatable({
> @@ -9,29 +10,50 @@ iubc = setmetatable({
>   }, { __index = function(t, s)
>     local r = t[sub(s, 2)]..t[sub(s, 1, 1)]; t[s] = r; return r end })
>   
> -local wcode = [=[
> -return function(t, n)
> -  if n == 1 then return end
> -  local iubc, sub, write = iubc, string.sub, io.write
> -  local s = table.concat(t, "", 1, n-1)
> -  for i=#s-59,1,-60 do
> -    write(]=]
> -for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
> -wcode = wcode..[=["\n")
> -  end
> -  local r = #s % 60
> -  if r ~= 0 then
> -    for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
> -    write("\n")
> -  end
> -end
> -]=]
> -local writerev = loadstring(wcode)()
> +local stdout = io.output()
>   
> -local t, n = {}, 1
> -for line in io.lines() do
> -  local c = sub(line, 1, 1)
> -  if c == ">" then writerev(t, n); io.write(line, "\n"); n = 1
> -  elseif c ~= ";" then t[n] = line; n = n + 1 end
> -end
> -writerev(t, n)
> +bench:add({
> +  name = "revcomp",
> +  -- The compare with the result output file is inconvenient.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  payload = function()
> +    local wcode = [=[
> +    return function(t, n)
> +      if n == 1 then return end
> +      local iubc, sub, write = iubc, string.sub, io.write
> +      local s = table.concat(t, "", 1, n-1)
> +      for i=#s-59,1,-60 do
> +        write(]=]
> +    for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
> +    wcode = wcode..[=["\n")
> +      end
> +      local r = #s % 60
> +      if r ~= 0 then
> +        for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
> +        write("\n")
> +      end
> +    end
> +    ]=]
> +    local writerev = loadstring(wcode)()
> +
> +    local t, n = {}, 1
> +    for line in io.lines() do
> +      local c = sub(line, 1, 1)
> +      if c == ">" then writerev(t, n); io.write(line, "\n"); n = 1
> +      elseif c ~= ";" then t[n] = line; n = n + 1 end
> +    end
> +    writerev(t, n)
> +    -- Repeat operation several times.
> +    io.stdin:seek("set", 0)
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  -- Amount of symbols in the input file.
> +  items = 5e6,
> +})
> +
> +bench:run_and_report()
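
A note on the `io.stdin:seek("set", 0)` rewind: it only works when stdin
is a seekable file (as with the `< FASTA_5000000` redirect), not a pipe.
A minimal sketch of why the rewind is needed between payload runs; it
uses a temporary file instead of stdin, and `count_lines` is a
hypothetical helper, not part of the patch:

```lua
-- Stand-in for the redirected input file.
local f = assert(io.tmpfile())
f:write(">ONE\nACGT\n>TWO\nTTGA\n")
f:seek("set", 0)

-- Hypothetical helper: consume the handle line by line until EOF.
local function count_lines(fh)
  local n = 0
  for _ in fh:lines() do n = n + 1 end
  return n
end

local first = count_lines(f)
-- Without a rewind the next pass starts at EOF and sees nothing.
assert(count_lines(f) == 0)
-- After the rewind the whole input is readable again.
f:seek("set", 0)
local second = count_lines(f)
assert(first == 4 and second == 4)
f:close()
```

It may be worth mentioning the seekable-input requirement in the comment
next to the rewind.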


* Re: [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:00   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:33     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:00 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1402 bytes --]

Hi, Sergey,

thanks for the patch! Please see my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> Checks are omitted since they were not present in the original suite,
> plus the precise result value depends on the input parameter.
> ---
>   perf/LuaJIT-benches/scimark-fft.lua | 19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/scimark-fft.lua b/perf/LuaJIT-benches/scimark-fft.lua
> index c05bb69a..96535774 100644
> --- a/perf/LuaJIT-benches/scimark-fft.lua
> +++ b/perf/LuaJIT-benches/scimark-fft.lua
> @@ -1 +1,18 @@
> -require("scimark_lib").FFT(1024)(tonumber(arg and arg[1]) or 50000)
> +local bench = require("bench").new(arg)
> +
> +local cycles = tonumber(arg and arg[1]) or 50000
Why 50000?
> +local benchmark
> +benchmark = {
> +  name = "scimark_fft",
> +  -- XXX: The description of tests for the function is too
> +  -- inconvenient.
> +  skip_check = true,
> +  payload = function()
> +    local flops = require("scimark_lib").FFT(1024)(cycles)
Why 1024?
> +    benchmark.items = flops
> +  end,
> +}
> +
> +bench:add(benchmark)
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:01   ` Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:07   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:34     ` Sergey Kaplun via Tarantool-patches
  2 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:07 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1729 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with a minor comment.

I propose adding a short description as a comment. Something like this:

The test runs a part of the Lua version of SciMark 2.0a, a benchmark
for scientific and numerical computing developed at NIST (National
Institute of Standards and Technology). This test consists of the dense
LU matrix factorization benchmark.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> Checks are omitted since they were not present in the original suite,
> plus the precise result value depends on the input parameter.
> ---
>   perf/LuaJIT-benches/scimark-lu.lua | 20 +++++++++++++++++++-
>   1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/scimark-lu.lua b/perf/LuaJIT-benches/scimark-lu.lua
> index 7636d994..4f521e0b 100644
> --- a/perf/LuaJIT-benches/scimark-lu.lua
> +++ b/perf/LuaJIT-benches/scimark-lu.lua
> @@ -1 +1,19 @@
> -require("scimark_lib").LU(100)(tonumber(arg and arg[1]) or 5000)
> +local bench = require("bench").new(arg)
> +
> +local cycles = tonumber(arg and arg[1]) or 5000
> +
> +local benchmark
> +benchmark = {
> +  name = "scimark_lu",
> +  -- XXX: The description of tests for the function is too
> +  -- inconvenient.
> +  skip_check = true,
> +  payload = function()
> +    local flops = require("scimark_lib").LU(100)(cycles)
> +    benchmark.items = flops
> +  end,
> +}
> +
> +bench:add(benchmark)
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:09   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:35     ` Sergey Kaplun via Tarantool-patches
  2 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:09 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1501 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with a minor comment below.

I propose adding a short test description as a comment:

SciMark is a popular benchmark; MC is its Monte Carlo integration test.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adds the aforementioned test with the use of the benchmark
> framework introduced before. The default arguments are adjusted
> according to the amount of cycles in the <scimark-2010-12-20.lua> file.
> The arguments to the script can be provided in the command line run.
>
> Checks are omitted since they were not present in the original suite,
> plus the precise result value depends on the input parameter.
> ---
>   perf/LuaJIT-benches/scimark-mc.lua | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
>   create mode 100644 perf/LuaJIT-benches/scimark-mc.lua
>
> diff --git a/perf/LuaJIT-benches/scimark-mc.lua b/perf/LuaJIT-benches/scimark-mc.lua
> new file mode 100644
> index 00000000..d26b6e48
> --- /dev/null
> +++ b/perf/LuaJIT-benches/scimark-mc.lua
> @@ -0,0 +1,19 @@
> +local bench = require("bench").new(arg)
> +
> +local cycles = tonumber(arg and arg[1]) or 15e7
> +
> +local benchmark
> +benchmark = {
> +  name = "scimark_mc",
> +  -- XXX: The description of tests for the function is too
> +  -- inconvenient.
> +  skip_check = true,
> +  payload = function()
> +    local flops = require("scimark_lib").MC()(cycles)
> +    benchmark.items = flops
> +  end,
> +}
> +
> +bench:add(benchmark)
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:02   ` Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:11   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:35     ` Sergey Kaplun via Tarantool-patches
  2 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:11 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1440 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with a minor comment.

Please add a small test description to the comment.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> Checks are omitted since they were not present in the original suite,
> plus the precise result value depends on the input parameter.
> ---
>   perf/LuaJIT-benches/scimark-sor.lua | 20 +++++++++++++++++++-
>   1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/scimark-sor.lua b/perf/LuaJIT-benches/scimark-sor.lua
> index e537e986..9bcdb0ad 100644
> --- a/perf/LuaJIT-benches/scimark-sor.lua
> +++ b/perf/LuaJIT-benches/scimark-sor.lua
> @@ -1 +1,19 @@
> -require("scimark_lib").SOR(100)(tonumber(arg and arg[1]) or 50000)
> +local bench = require("bench").new(arg)
> +
> +local cycles = tonumber(arg and arg[1]) or 50000
> +
> +local benchmark
> +benchmark = {
> +  name = "scimark_sor",
> +  -- XXX: The description of tests for the function is too
> +  -- inconvenient.
> +  skip_check = true,
> +  payload = function()
> +    local flops = require("scimark_lib").SOR(100)(cycles)
> +    benchmark.items = flops
> +  end,
> +}
> +
> +bench:add(benchmark)
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse " Sergey Kaplun via Tarantool-patches
  2025-10-24 11:00   ` Sergey Kaplun via Tarantool-patches
  2025-10-24 11:03   ` Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:15   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:36     ` Sergey Kaplun via Tarantool-patches
  2 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:15 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

Hi, Sergey,

thanks for the patch! LGTM with a minor comment.

Please add a short test description as a comment.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> Checks are omitted since they were not present in the original suite,
> plus the precise result value depends on the input parameter.
> ---
>   perf/LuaJIT-benches/scimark-sparse.lua | 20 +++++++++++++++++++-
>   1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/perf/LuaJIT-benches/scimark-sparse.lua b/perf/LuaJIT-benches/scimark-sparse.lua
> index 01a2258d..a855cc22 100644
> --- a/perf/LuaJIT-benches/scimark-sparse.lua
> +++ b/perf/LuaJIT-benches/scimark-sparse.lua
> @@ -1 +1,19 @@
> -require("scimark_lib").SPARSE(1000, 5000)(tonumber(arg and arg[1]) or 150000)
> +local bench = require("bench").new(arg)
> +
> +local cycles = tonumber(arg and arg[1]) or 150000
> +
> +local benchmark
> +benchmark = {
> +  name = "scimark_sparse",
> +  -- XXX: The description of tests for the function is too
> +  -- inconvenient.
> +  skip_check = true,
> +  payload = function()
> +    local flops = require("scimark_lib").SPARSE(1000, 5000)(cycles)
> +    benchmark.items = flops
> +  end,
> +}
> +
> +bench:add(benchmark)
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series in LuaJIT-benches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:19   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:37     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:19 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]

Hi, Sergey,

thanks for the patch! See comments below.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/series.lua | 20 ++++++++++++++------
>   1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
> index f766cb32..3dc970c5 100644
> --- a/perf/LuaJIT-benches/series.lua
> +++ b/perf/LuaJIT-benches/series.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function integrate(x0, x1, nsteps, omegan, f)
>     local x, dx = x0, (x1-x0)/nsteps
> @@ -26,9 +27,16 @@ local function series(n)
>   end
>   
>   local n = tonumber(arg and arg[1]) or 10000
> -local tm = os.clock()
> -local t = series(n)
> -tm = os.clock() - tm
> -assert(math.abs(t[1]-2.87295) < 0.00001)
> -io.write(string.format("size %d, %.2f s, %.1f iterations/s\n",
> -                       n, tm, (2*n-1)/tm))
The debug print was lost. Is that intentional?
> +
> +bench:add({
> +  name = "series",
> +  checker = function(res)
> +    return math.abs(res[1]-2.87295) < 0.00001
Please add spaces around the operators.
> +  end,
> +  payload = function()
> +    return series(n)
> +  end,
> +  items = 2 * n - 1,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm in LuaJIT-benches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm " Sergey Kaplun via Tarantool-patches
@ 2025-11-17 14:23   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:37     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-17 14:23 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2048 bytes --]

Hi, Sergey,

thanks for the patch! See my comments below.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
> ---
>   perf/LuaJIT-benches/spectral-norm.lua | 40 +++++++++++++++++++--------
>   1 file changed, 29 insertions(+), 11 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
> index ecc80112..6e63cd47 100644
> --- a/perf/LuaJIT-benches/spectral-norm.lua
> +++ b/perf/LuaJIT-benches/spectral-norm.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   local function A(i, j)
>     local ij = i+j-1
> @@ -25,16 +26,33 @@ local function AtAv(x, y, t, N)
>     Atv(t, y, N)
>   end
>   
> -local N = tonumber(arg and arg[1]) or 100
> -local u, v, t = {}, {}, {}
> -for i=1,N do u[i] = 1 end
> +local N = tonumber(arg and arg[1]) or 3000
Why was it changed to 3000?
>   
> -for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
> +bench:add({
> +  name = "spectral_norm",
> +  checker = function(res)
> +    -- XXX: Empirical value.
> +    if N > 66 then
> +      assert(math.abs(res - 1.27422) < 0.00001)
> +    end
> +    return true
> +  end,
> +  payload = function()
> +    local u, v, t = {}, {}, {}
> +    for i=1,N do u[i] = 1 end
Please add spaces around the operators, here and below.
>   
> -local vBv, vv = 0, 0
> -for i=1,N do
> -  local ui, vi = u[i], v[i]
> -  vBv = vBv + ui*vi
> -  vv = vv + vi*vi
> -end
> -io.write(string.format("%0.9f\n", math.sqrt(vBv / vv)))
> +    for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
> +
> +    local vBv, vv = 0, 0
> +    for i=1,N do
> +      local ui, vi = u[i], v[i]
> +      vBv = vBv + ui*vi
> +      vv = vv + vi*vi
> +    end
> +    return math.sqrt(vBv / vv)
> +  end,
> +  -- Operations inside `for i=1,10` loop.
> +  items = 40 * N * N,
> +})
> +
> +bench:run_and_report()
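
A side note on `items = 40 * N * N`: each AtAv() call does two N x N
passes, and the loop makes 2 AtAv() calls per iteration for 10
iterations, so 10 * 2 * 2 * N * N evaluations of A(). A minimal sketch
that counts them; Av() and Atv() are not shown in this hunk, so their
bodies below are an assumption (one N x N double loop each):

```lua
-- Count A(i, j) evaluations to cross-check `items`.
local evals = 0
local function A(i, j)
  evals = evals + 1
  local ij = i + j - 1
  return 1.0 / (ij * (ij - 1) * 0.5 + i)
end

-- Assumed shape of the helpers: y = A * x and y = A^T * x.
local function Av(x, y, N)
  for i = 1, N do
    local a = 0
    for j = 1, N do a = a + x[j] * A(i, j) end
    y[i] = a
  end
end

local function Atv(x, y, N)
  for i = 1, N do
    local a = 0
    for j = 1, N do a = a + x[j] * A(j, i) end
    y[i] = a
  end
end

local function AtAv(x, y, t, N)
  Av(x, t, N)
  Atv(t, y, N)
end

local N = 8
local u, v, t = {}, {}, {}
for i = 1, N do u[i] = 1 end
for _ = 1, 10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end

-- 10 iterations * 2 AtAv calls * 2 passes * N * N evaluations.
assert(evals == 40 * N * N)
```

The final reduction loop over `vBv` and `vv` adds only O(N) work, so
leaving it out of `items` seems fine.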


* Re: [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure Sergey Kaplun via Tarantool-patches
@ 2025-11-18 12:21   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:40     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 12:21 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 8541 bytes --]

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This commit introduces CMake building scripts for the benches introduced
> before. The benchmarks are enabled only if `LUAJIT_ENABLE_PERF` option
> is set. For each suite (LuaJIT-benches in this patch set)
> `AddBenchTarget()` macro generates 2 targets:
> * Target to run all benches and store results in the
>    perf/output/<suite_name> directory.
> * Target to run all benches via CTest and inspect results in the
>    console.
>
> For the LuaJIT-benches there are 2 generated files:
> * FASTA_5000000 -- is used as an input for <k-nucleotide.lua> and
>                     <revcomp.lua>.
> * SUMCOL_5000.txt -- is used as an input for <sum-file.lua>.
>
> These files and <perf/output> directory are added to the .gitignore files.
> ---
>   .gitignore                         |  5 ++
>   CMakeLists.txt                     | 11 ++++
>   perf/CMakeLists.txt                | 99 ++++++++++++++++++++++++++++++
>   perf/LuaJIT-benches/CMakeLists.txt | 52 ++++++++++++++++
>   4 files changed, 167 insertions(+)
>   create mode 100644 perf/CMakeLists.txt
>   create mode 100644 perf/LuaJIT-benches/CMakeLists.txt
>
> diff --git a/.gitignore b/.gitignore
> index c26a7eb8..bfc7d401 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -28,3 +28,8 @@ luajit-parse-memprof
>   luajit-parse-sysprof
>   luajit.pc
>   *.c_test
> +
> +# Generated by the performance tests.
> +FASTA_5000000
> +SUMCOL_5000.txt
> +perf/output/
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index c0da4362..73f46835 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -464,6 +464,17 @@ if(LUAJIT_USE_TEST)
>   endif()
>   add_subdirectory(test)
>   
> +# --- Benchmarks source tree ---------------------------------------------------
> +
> +# The option to enable performance tests for the LuaJIT.
> +# Disabled by default, since commonly it is used only by LuaJIT
> +# developers and run in the CI with the specially set-up machine.
> +option(LUAJIT_ENABLE_PERF "Generate <perf> target" OFF)
> +
> +if(LUAJIT_ENABLE_PERF)

The option name is a bit confusing because of the `perf` utility.

I would rename it to something like "LUAJIT_ENABLE_PERF_TESTS".

Feel free to ignore.

> +  add_subdirectory(perf)
> +endif()
> +
>   # --- Misc rules ---------------------------------------------------------------
>   
>   # XXX: Implement <uninstall> target using the following recipe:
> diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> new file mode 100644
> index 00000000..cc3c312f
> --- /dev/null
> +++ b/perf/CMakeLists.txt
> @@ -0,0 +1,99 @@
> +# Running various bench suites against LuaJIT.
> +
> +include(MakeLuaPath)
> +
> +if(CMAKE_BUILD_TYPE STREQUAL "Debug")
> +  message(WARNING "LuaJIT and perf tests are built in the Debug mode."

s/./. /

A space is missing after the period: message() joins its arguments
without a separator, so this prints "mode.Timings".

> +                  "Timings may be affected.")
> +endif()
> +
> +set(PERF_OUTPUT_DIR ${PROJECT_BINARY_DIR}/perf/output)
> +file(MAKE_DIRECTORY ${PERF_OUTPUT_DIR})
> +
> +# List of paths that will be used for each suite.
> +make_lua_path(LUA_PATH_BENCH_BASE
> +  PATHS
> +    # Use of the bench module.
> +    ${CMAKE_CURRENT_SOURCE_DIR}/utils/?.lua
> +    # Simple usage with `jit.dump()`, etc.
> +    ${LUAJIT_SOURCE_DIR}/?.lua
> +    ${LUAJIT_BINARY_DIR}/?.lua
> +)
> +
> +make_lua_path(LUA_CPATH
> +  PATHS
> +    # XXX: Some arches may have installed the cjson module here.
> +    /usr/lib64/lua/5.1/?.so
> +)
> +
> +# Produce the pair:
> +# Target to run for reporting and target to inspect from the
> +# console, runnable by the CTest.
> +macro(AddBenchTarget perf_suite)
> +  file(MAKE_DIRECTORY "${PERF_OUTPUT_DIR}/${perf_suite}/")
> +  message(STATUS "Add perf suite ${perf_suite}")
> +  add_custom_target(${perf_suite})
> +  add_custom_target(${perf_suite}-console
> +    COMMAND ${CMAKE_CTEST_COMMAND}
> +      -L ${perf_suite}
> +      --parallel 1
> +      --verbose
> +      --output-on-failure
> +      --no-tests=error
Maybe add --schedule-random and --timeout XXX (the default timeout is 10000000)?
> +  )
> +  add_dependencies(${perf_suite}-console luajit-main)
> +endmacro()
> +
> +# Add the bench to the pair of targets created by the call above.
> +macro(AddBench bench_name bench_path perf_suite LUA_PATH)
> +  set(bench_title "perf/${perf_suite}/${bench_name}")
> +  get_filename_component(bench_name_stripped  ${bench_name} NAME_WE)
> +  set(bench_out_file
> +    ${PERF_OUTPUT_DIR}/${perf_suite}/${bench_name_stripped}.json
> +  )
> +  set(bench_command "${LUAJIT_BINARY} ${bench_path}")
> +  if(${ARGC} GREATER 4)
> +    set(input_file ${ARGV4})
> +    set(bench_command "${bench_command} < ${input_file}")
> +  endif()
> +  set(BENCH_FLAGS
> +    "--benchmark_out_format=json --benchmark_out=${bench_out_file}"
> +  )
> +  set(bench_command_flags ${bench_command} ${BENCH_FLAGS})
> +  separate_arguments(bench_command_separated UNIX_COMMAND ${bench_command})
> +  add_custom_command(
> +    COMMAND ${CMAKE_COMMAND} -E env
> +      LUA_PATH="${LUA_PATH}"
> +      LUA_CPATH="${LUA_CPATH}"
> +        ${bench_command_separated}
> +          --benchmark_out_format=json
> +          --benchmark_out="${bench_out_file}"
The previous two lines can be replaced with ${BENCH_FLAGS}, right?
> +    OUTPUT ${bench_out_file}
> +    DEPENDS luajit-main
> +    COMMENT
> +      "Running benchmark ${bench_title} saving results in ${bench_out_file}."
> +  )
> +  add_custom_target(${bench_name} DEPENDS ${bench_out_file})
> +  add_dependencies(${perf_suite} ${bench_name})
> +
> +  # Report in the console.
> +  add_test(NAME ${bench_title}
> +    COMMAND sh -c "${bench_command}"
> +  )
> +  set_tests_properties(${bench_title} PROPERTIES
> +    ENVIRONMENT "LUA_PATH=${LUA_PATH}"
> +    LABELS ${perf_suite}
> +    DEPENDS luajit-main
> +  )
> +  unset(input_file)
> +endmacro()
> +
> +add_subdirectory(LuaJIT-benches)
> +
> +add_custom_target(${PROJECT_NAME}-perf
> +  DEPENDS LuaJIT-benches
A COMMENT field is missing.
> +)
> +
> +add_custom_target(${PROJECT_NAME}-perf-console
> +  DEPENDS LuaJIT-benches-console
A COMMENT field is missing.
> +)
> diff --git a/perf/LuaJIT-benches/CMakeLists.txt b/perf/LuaJIT-benches/CMakeLists.txt
> new file mode 100644
> index 00000000..d9909f36
> --- /dev/null
> +++ b/perf/LuaJIT-benches/CMakeLists.txt
> @@ -0,0 +1,52 @@
> +set(PERF_SUITE_NAME LuaJIT-benches)
> +set(LUA_BENCH_SUFFIX .lua)
It is not a bench-specific suffix. Maybe LUA_SUFFIX?
> +
> +AddBenchTarget(${PERF_SUITE_NAME})
> +
> +# Input for the k-nucleotide and revcomp benchmarks.
> +set(FASTA_NAME ${CMAKE_CURRENT_BINARY_DIR}/FASTA_5000000)
> +add_custom_target(FASTA_5000000
> +  COMMAND ${LUAJIT_BINARY}
> +    ${CMAKE_CURRENT_SOURCE_DIR}/libs/fasta.lua 5000000 > ${FASTA_NAME}

FASTA_5000000 is a plain text file. I propose adding the .txt extension to
its full name and probably the "_autogenerated" postfix. We already do this
for SUMCOL_5000 and SUMCOL_1.

> +  OUTPUT ${FASTA_NAME}
> +  DEPENDS luajit-main
> +  COMMENT "Generate ${FASTA_NAME}."
> +)
> +
> +make_lua_path(LUA_PATH
> +  PATHS
> +    ${LUA_PATH_BENCH_BASE}
> +    ${CMAKE_CURRENT_SOURCE_DIR}/libs/?.lua
> +)
> +
> +# Input for the <sum-file.lua> benchmark.
> +set(SUM_NAME ${CMAKE_CURRENT_BINARY_DIR}/SUMCOL_5000.txt)
> +# Remove possibly existing file.
> +file(REMOVE ${SUM_NAME})

Why do we need to generate the file after every CMake configuration?

I propose skipping the generation if the file already exists and
regenerating it only if its SHA256 differs.
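
A minimal sketch of the skip-or-regenerate idea, written in shell purely for illustration (in the patch itself the logic would live in CMake). The helper name `regen_if_stale`, the `.sha256` sidecar file, and the availability of coreutils `sha256sum` are assumptions of this sketch, not part of the patch:

```shell
#!/bin/sh
# Hypothetical helper: rebuild <out> as <count> copies of <src> only when
# <out> is missing or its SHA256 no longer matches the recorded checksum.
regen_if_stale() {
  src="$1"; out="$2"; count="$3"
  sum_file="$out.sha256"

  # Fast path: the file exists and still matches the checksum we stored.
  if [ -f "$out" ] && [ -f "$sum_file" ] &&
     [ "$(sha256sum < "$out")" = "$(cat "$sum_file")" ]; then
    echo "up to date"
    return 0
  fi

  # Slow path: regenerate from scratch and record the new checksum.
  : > "$out"
  i=0
  while [ "$i" -lt "$count" ]; do
    cat "$src" >> "$out"
    i=$((i + 1))
  done
  sha256sum < "$out" > "$sum_file"
  echo "regenerated"
}
```

Note that this only detects a missing or modified output file; a change in the source file would need its own check.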

> +
> +set(SUMCOL_FILE ${CMAKE_CURRENT_SOURCE_DIR}/SUMCOL_1.txt)
> +file(READ ${SUMCOL_FILE} SUMCOL_CONTENT)
> +foreach(_unused RANGE 4999)
> +  file(APPEND ${SUM_NAME} "${SUMCOL_CONTENT}")
> +endforeach()
> +
> +file(GLOB benches "${CMAKE_CURRENT_SOURCE_DIR}/*${LUA_BENCH_SUFFIX}")
> +foreach(bench_path ${benches})
> +  file(RELATIVE_PATH bench_name ${CMAKE_CURRENT_SOURCE_DIR} ${bench_path})
> +  set(bench_title "perf/${PERF_SUITE_NAME}/${bench_name}")
> +  if(bench_name MATCHES "k-nucleotide" OR bench_name MATCHES "revcomp")
> +    AddBench(${bench_name}
> +      ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}" ${FASTA_NAME}
> +    )
> +    add_dependencies(${bench_name} FASTA_5000000)
> +  elseif(bench_name MATCHES "sum-file")
> +    AddBench(${bench_name}
> +      ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}" ${SUM_NAME}
> +    )
> +  else()
> +    AddBench(${bench_name} ${bench_path} ${PERF_SUITE_NAME} "${LUA_PATH}")
> +  endif()
> +endforeach()
> +
> +# We need to generate the file before we run tests.
> +add_dependencies(${PERF_SUITE_NAME}-console FASTA_5000000)


* Re: [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics Sergey Kaplun via Tarantool-patches
@ 2025-11-18 12:31   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:41     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 12:31 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adds a helper script to aggregate the benchmark results from
> JSON files to the format parsable by the InfluxDB line protocol [1].

A format cannot be parsed by a protocol, please rephrase.

Something like "the format compatible with the InfluxDB line protocol".
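
For reference, a small shell sketch of what one line in that format looks like, following the `<measurement>,<tag_set> <field_set> <timestamp>` layout described in the script's header comment. The helper name `influx_line` is reused from the patch only as a label; the tag and field values are made up for illustration:

```shell
#!/bin/sh
# influx_line MEASUREMENT TAG_SET FIELD_SET TIMESTAMP
# Joins the four parts the way the line protocol expects: a comma between
# the measurement and the tags, single spaces before the fields and before
# the timestamp.
influx_line() {
  printf '%s,%s %s %d\n' "$1" "$2" "$3" "$4"
}

# Illustrative values only.
influx_line fasta 'branch=master,jit=on' 'cpu_time=1.5,iterations=10' 1700000000
# prints: fasta,branch=master,jit=on cpu_time=1.5,iterations=10 1700000000
```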

>
> All JSON files from each suite in the <perf/output> directory are
> considered as the benchmark results and aggregated into the
> <perf/output/summary.txt> file that can be posted to the InfluxDB. The
> results are aggregated via the new target LuaJIT-perf-aggregate.
Maybe mention that cjson is required?
>
> [1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
> ---
>   perf/CMakeLists.txt        |  13 ++++
>   perf/helpers/aggregate.lua | 124 +++++++++++++++++++++++++++++++++++++
>   2 files changed, 137 insertions(+)
>   create mode 100644 perf/helpers/aggregate.lua
>
> diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> index cc3c312f..68e561fd 100644
> --- a/perf/CMakeLists.txt
> +++ b/perf/CMakeLists.txt
> @@ -97,3 +97,16 @@ add_custom_target(${PROJECT_NAME}-perf
>   add_custom_target(${PROJECT_NAME}-perf-console
>     DEPENDS LuaJIT-benches-console
>   )
> +
> +set(PERF_SUMMARY ${PERF_OUTPUT_DIR}/summary.txt)
> +add_custom_target(${PROJECT_NAME}-perf-aggregate
> +  BYPRODUCTS ${PERF_SUMMARY}
> +  COMMENT "Aggregate performance test results into ${PERF_SUMMARY}"
> +  COMMAND ${CMAKE_COMMAND} -E env
> +    LUA_CPATH="${LUA_CPATH}"
> +      ${LUAJIT_BINARY} ${CMAKE_CURRENT_SOURCE_DIR}/helpers/aggregate.lua
> +        ${PERF_SUMMARY}
> +        ${PERF_OUTPUT_DIR}
> +  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
> +  DEPENDS luajit-main
> +)
> diff --git a/perf/helpers/aggregate.lua b/perf/helpers/aggregate.lua
> new file mode 100644
> index 00000000..12a8ab89
> --- /dev/null
> +++ b/perf/helpers/aggregate.lua
> @@ -0,0 +1,124 @@
> +local json = require('cjson')
What if cjson is absent? Do we want to handle the error?
> +
> +-- File to aggregate the benchmark results from JSON files to the
> +-- format parsable by the InfluxDB line protocol [1]:
> +-- <measurement>,<tag_set> <field_set> <timestamp>
> +--
> +-- <tag_set> and <field_set> have the following format:
> +-- <key1>=<value1>,<key2>=<value2>
> +--
> +-- The reported tag set is a set of values that can be used for
> +-- filtering data (i.e., branch or benchmark name).
> +--
> +-- luacheck: push no max comment line length
> +--
> +-- [1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
> +--
> +-- luacheck: pop

I propose documenting the command-line options (the 1st argument is the
output file, the 2nd is a directory, the current directory by default),
the environment variables (PERF_COMMIT, PERF_BRANCH), and the requirements
(git is optional, the cjson Lua module is mandatory).
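
To make the documented behaviour concrete, here is a rough shell mirror of how the script resolves the commit and the branch: the environment variable wins, and git is consulted only as a fallback. The function names `resolve_commit` and `resolve_branch` are hypothetical; the git commands are the ones used in the patch:

```shell
#!/bin/sh
# Usage of the helper itself, as read from the patch:
#   luajit perf/helpers/aggregate.lua <output-file> [<input-dir>]
# <input-dir> defaults to the current directory.

# Prefer PERF_COMMIT; fall back to git only when it is unset or empty.
resolve_commit() {
  if [ -n "${PERF_COMMIT:-}" ]; then
    printf '%s\n' "$PERF_COMMIT"
  else
    git rev-parse --short HEAD
  fi
}

# Prefer PERF_BRANCH; fall back to git only when it is unset or empty.
resolve_branch() {
  if [ -n "${PERF_BRANCH:-}" ]; then
    printf '%s\n' "$PERF_BRANCH"
  else
    git rev-parse --abbrev-ref HEAD
  fi
}
```

With both variables set (as the CI workflow does), git is never invoked, which is why it can be listed as optional.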

> +
> +local output = assert(arg[1], 'Output file is required as the first argument')
> +local input_dir = arg[2] or '.'
> +
> +local out_fh = assert(io.open(output, 'w+'))
> +
> +local function exec(cmd)
> +  return io.popen(cmd):read('*all'):gsub('%s+$', '')
> +end
> +
> +local commit = os.getenv('PERF_COMMIT') or exec('git rev-parse --short HEAD')
> +assert(commit, 'can not determine the commit')
> +
> +local branch = os.getenv('PERF_BRANCH') or
> +  exec('git rev-parse --abbrev-ref HEAD')
> +assert(branch, 'can not determine the branch')
> +
> +-- Not very robust, but OK for our needs.
> +local function listdir(path)
> +  local handle = io.popen('ls -1 ' .. path)
> +
> +  local files = {}
> +  for file in handle:lines() do
> +    table.insert(files, file)
> +  end
> +
> +  return files
> +end
> +
> +local tag_set = {branch = branch}
> +
> +local function table_plain_copy(src)
> +  local dst = {}
> +  for k, v in pairs(src) do
> +    dst[k] = v
> +  end
> +  return dst
> +end
> +
> +local function read_all(file)
> +  local fh = assert(io.open(file, 'rb'))
> +  local content = fh:read('*all')
> +  fh:close()
> +  return content
> +end
> +
> +local REPORTED_FIELDS = {
> +  'cpu_time',
> +  'items_per_second',
> +  'iterations',
> +  'real_time',
> +}
> +
> +local function influx_kv(tab)
> +  local kv_string = {}
> +  for k, v in pairs(tab) do
> +    table.insert(kv_string, ('%s=%s'):format(k, v))
> +  end
> +  return table.concat(kv_string, ',')
> +end
> +
> +local time = os.time()
> +local function influx_line(measurement, tags, fields)
> +  return ('%s,%s %s %d\n'):format(measurement, influx_kv(tags),
> +          influx_kv(fields), time)
> +end
> +
> +for _, suite_name in pairs(listdir(input_dir)) do
> +  -- May list the report file, but will be ignored by the
> +  -- condition below.
> +  local suite_dir = ('%s/%s'):format(input_dir, suite_name)
> +  for _, file in pairs(listdir(suite_dir)) do
> +    -- Skip files in which we are not interested.
> +    if not file:match('%.json$') then goto continue end
> +
> +    local data = read_all(('%s/%s'):format(suite_dir, file))
> +    local bench_name = file:match('([^/]+)%.json')
> +    local bench_data = json.decode(data)
> +    local benchmarks = bench_data.benchmarks
> +    local arch = bench_data.context.arch
> +    local gc64 = bench_data.context.gc64
> +    local jit = bench_data.context.jit
> +
> +    for _, bench in ipairs(benchmarks) do
> +      local full_tag_set = table_plain_copy(tag_set)
> +      full_tag_set.name = bench.name
> +      full_tag_set.suite = suite_name
> +      full_tag_set.arch = arch
> +      full_tag_set.gc64 = gc64
> +      full_tag_set.jit = jit
> +
> +      -- Save the commit as a field, since we don't want to filter
> +      -- benchmarks by the commit (one point of data).
> +      local field_set = {commit = ('"%s"'):format(commit)}
> +
> +      for _, field in ipairs(REPORTED_FIELDS) do
> +          field_set[field] = bench[field]
> +      end
> +
> +      local line = influx_line(bench_name, full_tag_set, field_set)
> +      out_fh:write(line)
> +    end
> +    ::continue::
> +  end
> +end
> +
> +out_fh:close()


* Re: [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup Sergey Kaplun via Tarantool-patches
@ 2025-11-18 12:36   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:41     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 12:36 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> The patch adds a script for setting the environment before running
> performance tests. Most of the settings are taken from the Tarantool's
> wiki page dedicated to the benchmarking [1].

Honestly, I don't like that we have similar files in two repositories
(aggregate, setup-env.sh). This makes maintenance more complicated. I would
put these files in a shared repository and reuse them for the tarantool and
luajit repos.

The original files are not even mentioned in the commit messages for
setup-env.sh and aggregate.lua; I think they are worth mentioning. There is
no sense in reviewing these files if that was already done previously.

>
> [1]: https://github.com/tarantool/tarantool/wiki/Benchmarking
> ---
>   perf/helpers/setup_env.sh | 135 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 135 insertions(+)
>   create mode 100755 perf/helpers/setup_env.sh
>
> diff --git a/perf/helpers/setup_env.sh b/perf/helpers/setup_env.sh
> new file mode 100755
> index 00000000..043d3c88
> --- /dev/null
> +++ b/perf/helpers/setup_env.sh
> @@ -0,0 +1,135 @@
> +#!/bin/sh
> +
> +# The script sets up a Linux operating system before running
> +# LuaJIT benchmarks. See more details in [1].
> +#
> +# [1]: https://github.com/tarantool/tarantool/wiki/Benchmarking
> +
> +set -eu
> +
> +uid=$(id -u)
> +if [ "$uid" -ne 0 ]
> +  then echo "Please run as root."
> +  exit 1
> +fi
> +
> +###
> +# Helpers.
> +###
> +
> +cpu_vendor="unknown"
> +cpuinfo_vendor=$(awk '/vendor_id/{ print $3; exit }' < /proc/cpuinfo)
> +if [ "$cpuinfo_vendor" = "GenuineIntel" ]; then
> +  cpu_vendor="intel"
> +elif [ "$cpuinfo_vendor" = "AuthenticAMD" ]; then
> +  cpu_vendor="amd"
> +else
> +  echo "Unknown CPU vendor '$cpuinfo_vendor'"
> +  exit 1
> +fi
> +
> +FAILURE_MSG="WARNING"
> +SUCCESS_MSG="CHECKED"
> +SKIPPED_MSG="SKIPPED"
> +
> +set_kernel_setting() {
> +  desc_msg="$1"
> +  file_path="$2"
> +  value="$3"
> +
> +  if [ -f "$file_path" ]; then
> +    sh -c "echo $value > $file_path" && status="$SUCCESS_MSG" || status="$FAILURE_MSG"
> +  else
> +    status="$SKIPPED_MSG"
> +  fi
> +  echo "$desc_msg $status"
> +}
> +
> +kernel_setting_is_nonzero() {
> +  desc_msg="$1"
> +  file_path="$2"
> +  hint_msg="$3"
> +
> +  if [ -f "$file_path" ]; then
> +    value=$(cat "$file_path")
> +    if [ -n "$value" ]; then
> +      status="$SUCCESS_MSG"
> +    else
> +      status="$FAILURE_MSG (hint: $hint_msg)"
> +    fi
> +  else
> +    status="$SKIPPED_MSG"
> +  fi
> +  echo "$desc_msg $status"
> +}
> +
> +###
> +# Kernel command line parameters.
> +###
> +
> +desc_msg="Disable AMD SMT or Intel Hyperthreading "
> +sysfs_path="/sys/devices/system/cpu/smt/active"
> +if [ -f "$sysfs_path" ]; then
> +  is_set=$(cat $sysfs_path)
> +  err_msg="$FAILURE_MSG (hint: set 'nosmt' kernel parameter)"
> +  [ "$is_set" = 1 ] && status="$SUCCESS_MSG" || status="$err_msg"
> +else
> +  status="$SKIPPED_MSG"
> +fi
> +echo "$desc_msg $status"
> +
> +kernel_setting_is_nonzero \
> +  "Isolate CPUs for benchmarking" \
> +  "/sys/devices/system/cpu/isolated" \
> +  "set 'isolcpus' kernel parameter"
> +
> +kernel_setting_is_nonzero \
> +  "Offload interrupts from the isolated CPUs" \
> +  "/proc/irq/default_smp_affinity" \
> +  "set 'irqaffinity' kernel parameter"
> +
> +kernel_setting_is_nonzero \
> +  "Disable scheduling on single-task isolated CPUs" \
> +  "/sys/devices/system/cpu/nohz_full" \
> +  "set 'nohz_full' kernel parameter"
> +
> +set_kernel_setting \
> +  "Disable transparent huge pages" \
> +  "/sys/kernel/mm/transparent_hugepage/enabled" \
> +  "never"
> +
> +set_kernel_setting \
> +  "Disable direct compaction of transparent huge pages" \
> +  "/sys/kernel/mm/transparent_hugepage/defrag" \
> +  "never"
> +
> +# Disable ASLR for the repeatable LuaJIT behaviour.
> +set_kernel_setting \
> +  "Disable ASLR" \
> +  "/proc/sys/kernel/randomize_va_space" \
> +  "0"
> +
> +###
> +# System tuning.
> +###
> +
> +if [ "$cpu_vendor" = "amd" ]; then
> +  sysfs_path="/sys/devices/system/cpu/cpufreq/boost"
> +  value=0
> +elif [ "$cpu_vendor" = "intel" ]; then
> +  sysfs_path="/sys/devices/system/cpu/intel_pstate/no_turbo"
> +  value=1
> +fi
> +set_kernel_setting \
> +  "Disable TurboBoost" \
> +  "$sysfs_path" \
> +  "$value"
> +
> +ncpu=$(getconf _NPROCESSORS_ONLN)
> +for cpu_id in $(seq 0 1 $((ncpu-1))); do
> +  sysfs_path_cpu="/sys/devices/system/cpu/cpu$cpu_id/cpufreq/scaling_governor"
> +  set_kernel_setting \
> +    "Stabilize the frequency of CPU $cpu_id" \
> +    "$sysfs_path_cpu" \
> +    "performance"
> +done


* Re: [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark Sergey Kaplun via Tarantool-patches
@ 2025-11-18 12:51   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:42     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 12:51 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch introduces the `LUAJIT_BENCH_INIT` option to determine the

It is actually a runner for the benchmark, not a command that runs before
the benchmark itself. Please rephrase.

I would rename the CMake option accordingly: LUAJIT_BENCH_EXEC or
LUAJIT_BENCH_RUNNER?

Feel free to keep it as is, I don't insist.
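
To illustrate the "runner" semantics: the option value is prepended to the benchmark command, so it wraps the benchmark process instead of running beforehand. A minimal shell sketch of that composition follows; the function name `run_bench` and the `BENCH_DEMO` variable are hypothetical, and `env` stands in for a real runner such as taskset:

```shell
#!/bin/sh
# run_bench RUNNER COMMAND...
# RUNNER is deliberately left unquoted so that it splits into words, like
# a command prefix. An empty RUNNER runs COMMAND directly.
run_bench() {
  runner="$1"
  shift
  # shellcheck disable=SC2086
  $runner "$@"
}

# The runner wraps the command, so the command sees what the runner set up.
run_bench "env BENCH_DEMO=1" sh -c 'echo "BENCH_DEMO=$BENCH_DEMO"'
# prints: BENCH_DEMO=1
```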

> shell command to be run before the benchmark itself. It may be useful to
> set taskset, etc.
> ---
>   perf/CMakeLists.txt | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> index 68e561fd..c315597f 100644
> --- a/perf/CMakeLists.txt
> +++ b/perf/CMakeLists.txt
> @@ -7,6 +7,13 @@ if(CMAKE_BUILD_TYPE STREQUAL "Debug")
>                     "Timings may be affected.")
>   endif()
>   
> +# The shell command needs to be run before benchmarks are started.
> +if(LUAJIT_BENCH_INIT)
> +  message(STATUS
> +    "The following command will run before benchmarks: '${LUAJIT_BENCH_INIT}'."
> +  )
> +endif()
This message is not necessary; one can see the command in the "ctest -V" output.
> +
>   set(PERF_OUTPUT_DIR ${PROJECT_BINARY_DIR}/perf/output)
>   file(MAKE_DIRECTORY ${PERF_OUTPUT_DIR})
>   
> @@ -51,7 +58,7 @@ macro(AddBench bench_name bench_path perf_suite LUA_PATH)
>     set(bench_out_file
>       ${PERF_OUTPUT_DIR}/${perf_suite}/${bench_name_stripped}.json
>     )
> -  set(bench_command "${LUAJIT_BINARY} ${bench_path}")
> +  set(bench_command "${LUAJIT_BENCH_INIT} ${LUAJIT_BINARY} ${bench_path}")
>     if(${ARGC} GREATER 4)
>       set(input_file ${ARGV4})
>       set(bench_command "${bench_command} < ${input_file}")


* Re: [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow Sergey Kaplun via Tarantool-patches
@ 2025-11-18 13:08   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:43     ` Sergey Kaplun via Tarantool-patches
  2025-11-18 13:13   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 13:08 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adds the workflow to run benchmarks from various suites,
> aggregate their results, and send statistics to the InfluxDB to be
> processed later.
>
> The workflow contains a matrix to measure GC64 and non-GC64 modes with
> enabled/disabled JIT for x64 architecture.
> ---
>   .github/actions/setup-performance/README.md  |  10 ++
>   .github/actions/setup-performance/action.yml |  18 +++
>   .github/workflows/performance.yml            | 110 +++++++++++++++++++
>   3 files changed, 138 insertions(+)
>   create mode 100644 .github/actions/setup-performance/README.md
>   create mode 100644 .github/actions/setup-performance/action.yml
>   create mode 100644 .github/workflows/performance.yml
>
> diff --git a/.github/actions/setup-performance/README.md b/.github/actions/setup-performance/README.md
> new file mode 100644
> index 00000000..4c4bbdab
> --- /dev/null
> +++ b/.github/actions/setup-performance/README.md
> @@ -0,0 +1,10 @@
> +# Setup performance
> +
> +Action setups the performance on Linux runners.
> +
> +## How to use Github Action from Github workflow
> +
> +Add the following code to the running steps before LuaJIT configuration:
> +```
> +- uses: ./.github/actions/setup-performance
> +```
> diff --git a/.github/actions/setup-performance/action.yml b/.github/actions/setup-performance/action.yml
> new file mode 100644
> index 00000000..24d07440
> --- /dev/null
> +++ b/.github/actions/setup-performance/action.yml
> @@ -0,0 +1,18 @@
> +name: Setup performance
> +description: The Linux machine setup for running LuaJIT benchmarks
> +runs:
> +  using: composite
> +  steps:
> +    - name: Setup CI environment (Linux)
> +      uses: ./.github/actions/setup-linux
> +    - name: Install dependencies for the LuaJIT benchmarks
> +      run: |
> +        apt -y update
> +        apt install -y luarocks curl
> +      shell: bash
> +    - name: Install Lua modules
> +      run: luarocks install lua-cjson
> +      shell: bash
> +    - name: Run script to setup Linux environment
> +      run: sh ./perf/helpers/setup_env.sh
> +      shell: bash
Is bash or sh used in the last step? (The shebang in setup_env.sh is /bin/sh.)
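
As far as I understand, when a script is passed to an interpreter explicitly (`sh ./script`), its shebang line is not consulted at all; the shebang matters only when the file is executed directly. A small self-contained demonstration, using a throwaway script with a deliberately useless `#!/bin/false` shebang (the file and variable names are made up for this sketch):

```shell
#!/bin/sh
demo_dir=$(mktemp -d)

# A script whose shebang points at an interpreter that never runs scripts.
cat > "$demo_dir/script.sh" <<'EOF'
#!/bin/false
echo "body executed"
EOF
chmod +x "$demo_dir/script.sh"

# Explicit interpreter: for sh the shebang line is just a comment.
via_sh=$(sh "$demo_dir/script.sh")

# Direct execution: the shebang is honoured, so the body is not run.
if "$demo_dir/script.sh" > /dev/null 2>&1; then
  direct="ran"
else
  direct="failed"
fi

echo "via sh: $via_sh; direct execution: $direct"
rm -rf "$demo_dir"
```

So in the step above, bash only interprets the `run:` line, and the script body is then interpreted by whatever `sh` is on the runner.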
> diff --git a/.github/workflows/performance.yml b/.github/workflows/performance.yml
> new file mode 100644
> index 00000000..bfb6be97
> --- /dev/null
> +++ b/.github/workflows/performance.yml
> @@ -0,0 +1,110 @@
> +name: Performance
> +
> +on:
> +  push:
> +    branches-ignore:
> +      - '**-noperf'
> +      - 'tarantool/release/**'
> +      - 'upstream-**'
> +    tags-ignore:
> +      - '**'
> +  schedule:
> +    # Once a day at 03:00 to avoid clashing with runs for the
> +    # Tarantool benchmarks at midnight.
> +    - cron: '0 3 * * *'
> +
> +concurrency:
> +  # An update of a developer branch cancels the previously
> +  # scheduled workflow run for this branch. However, the default
> +  # branch, and long-term branch (tarantool/release/2.11,
> +  # tarantool/release/2.10, etc) workflow runs are never canceled.
> +  #
This is not relevant, right?
> +  # We use a trick here: define the concurrency group as 'workflow
> +  # run ID' + # 'workflow run attempt' because it is a unique
> +  # combination for any run. So it effectively discards grouping.
> +  #
> +  # XXX: we cannot use `github.sha` as a unique identifier because
> +  # pushing a tag may cancel a run that works on a branch push
> +  # event.
> +  group: ${{ startsWith(github.ref, 'refs/heads/tarantool/')
> +    && format('{0}-{1}', github.run_id, github.run_attempt)
> +    || format('{0}-{1}', github.workflow, github.ref) }}
> +  cancel-in-progress: true
> +
> +jobs:
> +  performance-luajit:
> +    # The 'performance' label _must_ be set only for the single
> +    # runner to guarantee that results are not dependent on the
> +    # machine.
> +    runs-on:
> +      - self-hosted
> +      - Linux
> +      - x86_64
> +      - 'performance'
> +
> +    env:
> +      PERF_BRANCH: ${{ github.ref_name }}
> +      PERF_COMMIT: ${{ github.sha }}
> +
> +    strategy:
> +      fail-fast: false
> +      matrix:
> +        GC64: [ON, OFF]
> +        JOFF: [ON, OFF]
> +      # Run each job sequentially.
> +      max-parallel: 1
> +    name: >
> +      LuaJIT
> +      GC64:${{ matrix.GC64 }}
> +      JOFF:${{ matrix.GC64 }}
> +    steps:
> +      - uses: actions/checkout@v4
> +        with:
> +          fetch-depth: 0
> +          submodules: recursive
> +      - name: setup performance environment
> +        uses: ./.github/actions/setup-performance
> +      - name: configure
> +        # The taskset alone will pin all the process threads
> +        # into a single (random) isolated CPU, see
> +        # https://bugzilla.kernel.org/show_bug.cgi?id=116701.
> +        # The workaround is using realtime scheduler for the
> +        # isolated task using chrt, e. g.:
> +        # sudo taskset 0xef chrt 50.
> +        # But this makes the process use non-standard, real-time
> +        # round-robin scheduling mechanism.
> +        run: >
> +          cmake -S . -B ${{ env.BUILDDIR }}
> +          -DCMAKE_BUILD_TYPE=RelWithDebInfo

RelWithDebInfo is -O2 (moderate optimization), Release is -O3 (high
optimization).

Do we really need RelWithDebInfo? I think it deserves a comment.

> +          -DLUAJIT_ENABLE_PERF=ON
> +          -DLUAJIT_BENCH_INIT="taskset 0xfe chrt 50"
> +          -DLUAJIT_DISABLE_JIT=${{ matrix.JOFF }}
> +          -DLUAJIT_ENABLE_GC64=${{ matrix.GC64 }}
> +      - name: build
> +        run: cmake --build . --parallel
> +        working-directory: ${{ env.BUILDDIR }}
> +      - name: perf
> +        run: make LuaJIT-perf
> +        working-directory: ${{ env.BUILDDIR }}
> +      - name: aggregate benchmark results
> +        run: make LuaJIT-perf-aggregate
> +        working-directory: ${{ env.BUILDDIR }}
> +      - name: send statistics to InfluxDB
> +        # --silent -o /dev/null: Prevent dumping any reply part
> +        # in the output in case of an error.
> +        # --fail: Exit with the 22 error code is status >= 400.
> +        # --write-out: See the reason for the failure, if any.
> +        # --retry, --retry-delay: To avoid losing the results of
> +        # running after such a long job, try to retry sending the
> +        # results.
> +        run: >
> +          curl --request POST
> +          "${{ secrets.INFLUXDB_URL }}/api/v2/write?org=tarantool&bucket=luajit-performance&precision=s"
> +          --write-out "%{http_code}"
> +          --retry 5
> +          --retry-delay 5
> +          --connect-timeout 120
> +          --fail --silent -o /dev/null
> +          --header "Authorization: Token ${{ secrets.INFLUXDB_TOKEN }}"
> +          --data-binary @./perf/output/summary.txt
> +        working-directory: ${{ env.BUILDDIR }}


* Re: [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow Sergey Kaplun via Tarantool-patches
  2025-11-18 13:08   ` Sergey Bronnikov via Tarantool-patches
@ 2025-11-18 13:13   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 0 replies; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-11-18 13:13 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, again,


On 10/24/25 14:00, Sergey Kaplun wrote:

> <snipped>

> --- /dev/null
> +++ b/.github/workflows/performance.yml
> @@ -0,0 +1,110 @@
<snipped>
> +      - name: build
> +        run: cmake --build . --parallel
> +        working-directory: ${{ env.BUILDDIR }}
> +      - name: perf
This name is visible in the Web UI. I would make it more descriptive
("execute performance benchmarks"?).
<snipped>


* Re: [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file in LuaJIT-benches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file " Sergey Kaplun via Tarantool-patches
@ 2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
  2025-12-23 10:44   ` Sergey Bronnikov via Tarantool-patches
  1 sibling, 0 replies; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-12-23 10:37 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! LGTM

Sergey

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The input for the test is redirected from the generated file
> <SUMCOL_5000.txt>. This file is the result of concatenation of the
> <SUMCOL_1.txt> 5000 times.
> ---
>   perf/LuaJIT-benches/sum-file.lua | 29 ++++++++++++++++++++++++-----
>   1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
> index c9e618fd..270c1865 100644
> --- a/perf/LuaJIT-benches/sum-file.lua
> +++ b/perf/LuaJIT-benches/sum-file.lua
> @@ -1,6 +1,25 @@
> +local bench = require("bench").new(arg)
>   
> -local sum = 0
> -for line in io.lines() do
> -  sum = sum + line
> -end
> -io.write(sum, "\n")
> +-- XXX: The input file is generated from <SUMCOL_1.txt> by
> +-- repeating it 5000 times. The <SUMCOL_1.txt> contains 1000 lines
> +-- with the total sum of 500.
> +bench:add({
> +  name = "sum_file",
> +  payload = function()
> +    local sum = 0
> +    for line in io.lines() do
> +      sum = sum + line
> +    end
> +    -- Allow several iterations.
> +    io.stdin:seek("set", 0)
> +    return sum
> +  end,
> +  checker = function(res)
> +    -- Precomputed result.
> +    return res == 2500000
> +  end,
> +  -- Fixed size of the file.
> +  items = 5e6,
> +})
> +
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta " Sergey Kaplun via Tarantool-patches
@ 2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:15     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-12-23 10:37 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hello,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> Since the result output (with the different input parameter value)
> produced by this benchmark is used in other benchmarks
> (<k-nucleotide.lua> and <revcomp.lua>), the original script is used as a
> library (inside the <libs/> subdirectory) with the updated default input
> value and returns the number of items processed. The output for the
> benchmark itself is suppressed and not checked since it is irrational to
> store in the repository such huge files for testing.
> ---
>   perf/LuaJIT-benches/fasta.lua      | 120 +++++++----------------------
>   perf/LuaJIT-benches/libs/fasta.lua |  98 +++++++++++++++++++++++
>   2 files changed, 125 insertions(+), 93 deletions(-)
>   create mode 100644 perf/LuaJIT-benches/libs/fasta.lua
>
> diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
> index 7ce60804..d0dc005d 100644
> --- a/perf/LuaJIT-benches/fasta.lua
> +++ b/perf/LuaJIT-benches/fasta.lua
> @@ -1,95 +1,29 @@
> -
> -local Last = 42
> -local function random(max)
> -  local y = (Last * 3877 + 29573) % 139968
> -  Last = y
> -  return (max * y) / 139968
> -end
> -
> -local function make_repeat_fasta(id, desc, s, n)
> -  local write, sub = io.write, string.sub
> -  write(">", id, " ", desc, "\n")
> -  local p, sn, s2 = 1, #s, s..s
> -  for i=60,n,60 do
> -    write(sub(s2, p, p + 59), "\n")
> -    p = p + 60; if p > sn then p = p - sn end
> -  end
> -  local tail = n % 60
> -  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
> -end
> -
> -local function make_random_fasta(id, desc, bs, n)
> -  io.write(">", id, " ", desc, "\n")
> -  loadstring([=[
> -    local write, char, unpack, n, random = io.write, string.char, unpack, ...
> -    local buf, p = {}, 1
> -    for i=60,n,60 do
> -      for j=p,p+59 do ]=]..bs..[=[ end
> -      buf[p+60] = 10; p = p + 61
> -      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
> -    end
> -    local tail = n % 60
> -    if tail > 0 then
> -      for j=p,p+tail-1 do ]=]..bs..[=[ end
> -      p = p + tail; buf[p] = 10; p = p + 1
> -    end
> -    write(char(unpack(buf, 1, p-1)))
> -  ]=], desc)(n, random)
> -end
> -
> -local function bisect(c, p, lo, hi)
> -  local n = hi - lo
> -  if n == 0 then return "buf[j] = "..c[hi].."\n" end
> -  local mid = math.floor(n / 2)
> -  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
> -         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
> -end
> -
> -local function make_bisect(tab)
> -  local c, p, sum = {}, {}, 0
> -  for i,row in ipairs(tab) do
> -    c[i] = string.byte(row[1])
> -    sum = sum + row[2]
> -    p[i] = sum
> -  end
> -  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
> -end
> -
> -local alu =
> -  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
> -  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
> -  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
> -  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
> -  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
> -  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
> -  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
> -
> -local iub = make_bisect{
> -  { "a", 0.27 },
> -  { "c", 0.12 },
> -  { "g", 0.12 },
> -  { "t", 0.27 },
> -  { "B", 0.02 },
> -  { "D", 0.02 },
> -  { "H", 0.02 },
> -  { "K", 0.02 },
> -  { "M", 0.02 },
> -  { "N", 0.02 },
> -  { "R", 0.02 },
> -  { "S", 0.02 },
> -  { "V", 0.02 },
> -  { "W", 0.02 },
> -  { "Y", 0.02 },
> -}
> -
> -local homosapiens = make_bisect{
> -  { "a", 0.3029549426680 },
> -  { "c", 0.1979883004921 },
> -  { "g", 0.1975473066391 },
> -  { "t", 0.3015094502008 },
> +local bench = require("bench").new(arg)
> +
> +local stdout = io.output()
> +
> +local benchmark
> +benchmark = {
> +  name = "fasta",
> +  -- XXX: The result file may take up to 278 Mb for the default
> +  -- settings. To check the correctness of the script, run it as
> +  -- is from the console.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  payload = function()
> +    -- Run the benchmark as is from the file.
> +    local items = require("fasta")
> +    -- Remove it from the cache to be sure the benchmark will run
> +    -- at the next iteration.
> +    package.loaded["fasta"] = nil
> +    benchmark.items = items
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
>   }
>   
> -local N = tonumber(arg and arg[1]) or 1000
> -make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
> -make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
> -make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
> +bench:add(benchmark)
> +bench:run_and_report()
> diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
> new file mode 100644
> index 00000000..9c72c244
> --- /dev/null
> +++ b/perf/LuaJIT-benches/libs/fasta.lua
> @@ -0,0 +1,98 @@
> +
> +local Last = 42
> +local function random(max)
> +  local y = (Last * 3877 + 29573) % 139968
> +  Last = y
> +  return (max * y) / 139968
> +end
> +
> +local function make_repeat_fasta(id, desc, s, n)
> +  local write, sub = io.write, string.sub
> +  write(">", id, " ", desc, "\n")
> +  local p, sn, s2 = 1, #s, s..s
> +  for i=60,n,60 do
Please add more whitespace.
> +    write(sub(s2, p, p + 59), "\n")
> +    p = p + 60; if p > sn then p = p - sn end
> +  end
> +  local tail = n % 60
> +  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
Please add more whitespace, here and below.
> +end
> +
> +local function make_random_fasta(id, desc, bs, n)
> +  io.write(">", id, " ", desc, "\n")
> +  loadstring([=[
> +    local write, char, unpack, n, random = io.write, string.char, unpack, ...
> +    local buf, p = {}, 1
> +    for i=60,n,60 do
> +      for j=p,p+59 do ]=]..bs..[=[ end
> +      buf[p+60] = 10; p = p + 61
> +      if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
> +    end
> +    local tail = n % 60
> +    if tail > 0 then
> +      for j=p,p+tail-1 do ]=]..bs..[=[ end
> +      p = p + tail; buf[p] = 10; p = p + 1
> +    end
> +    write(char(unpack(buf, 1, p-1)))
> +  ]=], desc)(n, random)
> +end
> +
> +local function bisect(c, p, lo, hi)
> +  local n = hi - lo
> +  if n == 0 then return "buf[j] = "..c[hi].."\n" end
> +  local mid = math.floor(n / 2)
> +  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
> +         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
> +end
> +
> +local function make_bisect(tab)
> +  local c, p, sum = {}, {}, 0
> +  for i,row in ipairs(tab) do
> +    c[i] = string.byte(row[1])
> +    sum = sum + row[2]
> +    p[i] = sum
> +  end
> +  return "local r = random(1)\n"..bisect(c, p, 1, #tab)
> +end
> +
> +local alu =
> +  "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"..
> +  "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"..
> +  "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"..
> +  "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"..
> +  "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"..
> +  "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"..
> +  "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
> +
> +local iub = make_bisect{
> +  { "a", 0.27 },
> +  { "c", 0.12 },
> +  { "g", 0.12 },
> +  { "t", 0.27 },
> +  { "B", 0.02 },
> +  { "D", 0.02 },
> +  { "H", 0.02 },
> +  { "K", 0.02 },
> +  { "M", 0.02 },
> +  { "N", 0.02 },
> +  { "R", 0.02 },
> +  { "S", 0.02 },
> +  { "V", 0.02 },
> +  { "W", 0.02 },
> +  { "Y", 0.02 },
> +}
> +
> +local homosapiens = make_bisect{
> +  { "a", 0.3029549426680 },
> +  { "c", 0.1979883004921 },
> +  { "g", 0.1975473066391 },
> +  { "t", 0.3015094502008 },
> +}
> +
> +local N = tonumber(arg and arg[1]) or 25e6
> +
> +make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
> +make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
> +make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
> +
> +return N*2 + N*3 + N*5
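
For reference, the `package.loaded` reset used in the payload above can be shown in isolation. This is only a sketch: the module name `demo_counter` and the `package.preload` loader are made up for the example and are not part of the patch, which loads <libs/fasta.lua> from disk instead.

```lua
-- Hypothetical module registered via package.preload so that the
-- example is self-contained.
local runs = 0
package.preload["demo_counter"] = function()
  runs = runs + 1
  return runs
end

-- The first require() executes the loader and caches its result.
local first = require("demo_counter")
-- The second require() returns the cached value; the loader is not run.
local cached = require("demo_counter")

-- Dropping the cache entry forces the loader to run again.
package.loaded["demo_counter"] = nil
local second = require("demo_counter")

print(first, cached, second, runs)
```

Without the reset, every iteration after the first would only fetch the cached return value and the benchmark body would not run again.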


* Re: [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor " Sergey Kaplun via Tarantool-patches
@ 2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:23     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-12-23 10:38 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hello,

thanks for the patch! See my comments.

Sergey


On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The arguments to the script still can be
> provided in the command line run. However, the values greater than the
> maximum possible solutions found do not affect the time of execution for
> this benchmark. Hence, the number of items to proceed is considered
> constant as the maximum possible number of solutions.
> ---
>   perf/LuaJIT-benches/meteor.lua | 46 ++++++++++++++++++++++++++--------
>   1 file changed, 36 insertions(+), 10 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
> index 80588ab5..f3962820 100644
> --- a/perf/LuaJIT-benches/meteor.lua
> +++ b/perf/LuaJIT-benches/meteor.lua
> @@ -1,3 +1,4 @@
> +local bench = require("bench").new(arg)
>   
>   -- Generate a decision tree based solver for the meteor puzzle.
>   local function generatesolver(countinit)
> @@ -118,6 +119,10 @@ local function printresult()
>     printboard(smax)
>   end
>   
> +local function getresult()
> +  return countinit-count, smin, smax
> +end
> +
>   -- Generate piece lookup array from the order of use.
>   local function genp()
>     local p = pcs
> @@ -141,7 +146,7 @@ local function f91(k)
>       local s = p[b0] ]]
>     for p=2,99 do if ok[p] then s = s.."..p[b"..p.."]" end end
Please add more whitespace, here and below.
>     s = s..[[
> -    -- Remember min/max boards, dito for the symmetric board.
> +    -- Remember min/max boards, ditto for the symmetric board.
>       if not smin then smin = s; smax = s
>       elseif s < smin then smin = s elseif s > smax then smax = s end
>       s = reverse(s)
> @@ -206,15 +211,36 @@ local f93 = f91
>     end
>   
>     -- Compile and return solver function and result getter.
> -  return loadstring(s.."return f0, printresult\n", "solver")(countinit)
> +  return loadstring(s.."return f0, printresult, getresult\n", "solver")(countinit)
>   end
>   
> --- Generate the solver function hierarchy.
> -local solver, printresult = generatesolver(tonumber(arg and arg[1]) or 10000)
> -
> --- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
> -if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
> +local N = tonumber(arg and arg[1]) or 10000
> +
> +bench:add({
> +  name = "meteror",
typo: s/meteror/meteor/
> +  setup = function()
> +    -- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
> +    if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
> +  end,
> +  payload = function()
> +    -- Generate the solver function hierarchy.
> +    local solver, printresult, getresult = generatesolver(N)
> +
> +    -- Run the solver protected to get partial results (max count or ctrl-c).
> +    pcall(solver, 0)
> +
> +    local n, smin, smax = getresult()
> +    return {n = n, smin = smin, smax = smax}
> +  end,
> +  checker = function(res)
> +    if N >= 2097 then
> +      assert(res.n == 2098, "Incorrect solutions number")
> +      assert(res.smin == "00001222012661126155865558633348893448934747977799")
> +      assert(res.smax == "99998966856688568255777257472014220144031400311333")
> +    end
> +    return true
> +  end,
> +  items = 2098,
> +})
>   
> --- Run the solver protected to get partial results (max count or ctrl-c).
> -pcall(solver, 0)
> -printresult()
> +bench:run_and_report()


* Re: [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot in LuaJIT-benches
  2025-10-24 10:50 ` [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot " Sergey Kaplun via Tarantool-patches
@ 2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:20     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-12-23 10:38 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

Hi, Sergey,

thanks for the patch! See my comments.

Sergey

On 10/24/25 13:50, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The output is redirected to /dev/null. The check is skipped since it is
> very inconvenient to check the binary output, especially since it may be
> configured by the parameter.
> ---
>   perf/LuaJIT-benches/mandelbrot.lua | 64 +++++++++++++++++++++---------
>   1 file changed, 45 insertions(+), 19 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
> index 0ef595a2..51e0dd4f 100644
> --- a/perf/LuaJIT-benches/mandelbrot.lua
> +++ b/perf/LuaJIT-benches/mandelbrot.lua
> @@ -1,23 +1,49 @@
> +local bench = require("bench").new(arg)
>   
> -local write, char, unpack = io.write, string.char, unpack
> -local N = tonumber(arg and arg[1]) or 100
> -local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
> -write("P4\n", N, " ", N, "\n")
> -for y=0,N-1 do
> -  local Ci, b, p = y*M-1, 1, 0
> -  for x=0,N-1 do
> -    local Cr = x*M-1.5
> -    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
> -    b = b + b
> -    for i=1,49 do
> -      Zi = Zr*Zi*2 + Ci
> -      Zr = Zrq-Ziq + Cr
> -      Ziq = Zi*Zi
> -      Zrq = Zr*Zr
> -      if Zrq+Ziq > 4.0 then b = b + 1; break; end
> +local N = tonumber(arg and arg[1]) or 5000
> +
> +local function payload()
> +  -- These functions must not be an upvalue but the stack slot.
> +  local N = N
> +  local write, char, unpack = io.write, string.char, unpack
> +  local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
Please add more whitespace, here and below.
> +  write("P4\n", N, " ", N, "\n")
> +  for y=0,N-1 do
> +    local Ci, b, p = y*M-1, 1, 0
> +    for x=0,N-1 do
> +      local Cr = x*M-1.5
> +      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
> +      b = b + b
> +      for i=1,49 do
> +        Zi = Zr*Zi*2 + Ci
> +        Zr = Zrq-Ziq + Cr
> +        Ziq = Zi*Zi
> +        Zrq = Zr*Zr
> +        if Zrq+Ziq > 4.0 then b = b + 1; break; end
> +      end
> +      if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
>       end
> -    if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
> +    if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
> +    write(char(unpack(buf, 1, p)))
>     end
> -  if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
> -  write(char(unpack(buf, 1, p)))
>   end
> +
> +local stdout = io.output()
> +
> +bench:add({
> +  name = "mandelbrot",
> +  items = N,
> +  -- XXX: This is inconvenient to have the binary file in the
> +  -- repository for the comparison. If the check is needed run,
> +  -- the payload manually.
> +  skip_check = true,
> +  setup = function()
> +    io.output("/dev/null")
> +  end,
> +  teardown = function()
> +    io.output(stdout)
> +  end,
> +  payload = payload,
> +})
> +
> +bench:run_and_report()
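
The setup/teardown pair above follows a save, redirect, restore pattern for the default output file. A minimal sketch of the same idea as a single helper is given below. The helper name `with_output_to` is hypothetical, the `pcall` wrapper and the explicit `close()` are additions that the patch does not have, and "/dev/null" assumes a POSIX system.

```lua
-- Hypothetical helper, not part of the patch: run `fn` with the default
-- output redirected to `path`, then restore the previous output file.
local function with_output_to(path, fn, ...)
  local saved = io.output()
  io.output(path) -- Opens `path` for writing; raises an error on failure.
  local ok, res = pcall(fn, ...)
  io.output():close() -- Close the temporary default output file.
  io.output(saved)
  if not ok then error(res, 0) end
  return res
end

local result = with_output_to("/dev/null", function()
  io.write("this text is discarded\n")
  return 42
end)
print(result)
```

Restoring the output inside the helper means a failing payload cannot leave later benchmarks writing to the wrong file.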


* Re: [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file in LuaJIT-benches
  2025-10-24 11:00 ` [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file " Sergey Kaplun via Tarantool-patches
  2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-23 10:44   ` Sergey Bronnikov via Tarantool-patches
  2025-12-26  8:38     ` Sergey Kaplun via Tarantool-patches
  1 sibling, 1 reply; 134+ messages in thread
From: Sergey Bronnikov via Tarantool-patches @ 2025-12-23 10:44 UTC (permalink / raw)
  To: Sergey Kaplun; +Cc: tarantool-patches

On execution I got an error:

./build/src/luajit: perf/LuaJIT-benches/sum-file.lua:11: attempt to perform arithmetic on local 'line' (a string value)

On 10/24/25 14:00, Sergey Kaplun wrote:
> This patch adjusts the aforementioned test to use the benchmark
> framework introduced before. The default arguments are adjusted
> according to the <PARAM_x86.txt> file. The arguments to the script still
> can be provided in the command line run.
>
> The input for the test is redirected from the generated file
> <SUMCOL_5000.txt>. This file is the result of concatenation of the
> <SUMCOL_1.txt> 5000 times.
> ---
>   perf/LuaJIT-benches/sum-file.lua | 29 ++++++++++++++++++++++++-----
>   1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
> index c9e618fd..270c1865 100644
> --- a/perf/LuaJIT-benches/sum-file.lua
> +++ b/perf/LuaJIT-benches/sum-file.lua
> @@ -1,6 +1,25 @@
> +local bench = require("bench").new(arg)
>   
> -local sum = 0
> -for line in io.lines() do
> -  sum = sum + line
> -end
> -io.write(sum, "\n")
> +-- XXX: The input file is generated from <SUMCOL_1.txt> by
> +-- repeating it 5000 times. The <SUMCOL_1.txt> contains 1000 lines
> +-- with the total sum of 500.
> +bench:add({
> +  name = "sum_file",
> +  payload = function()
> +    local sum = 0
> +    for line in io.lines() do
> +      sum = sum + line
This fails when a line cannot be converted to a number (see the error above).
> +    end
> +    -- Allow several iterations.
> +    io.stdin:seek("set", 0)
> +    return sum
> +  end,
> +  checker = function(res)
> +    -- Precomputed result.
> +    return res == 2500000
> +  end,
> +  -- Fixed size of the file.
> +  items = 5e6,
> +})
> +
> +bench:run_and_report()
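
A note on the error: in Lua 5.1 and LuaJIT, `sum + line` works as long as the string in `line` is convertible to a number, and raises exactly this error when it is not. So one possible cause is a non-numeric or empty line in the input. A sketch with an explicit conversion that reports the offending line is below; the function name `sum_lines` and the sample inputs are made up for illustration.

```lua
-- Hypothetical variant of the payload loop: `lines` is any line
-- iterator, for example the result of io.lines().
local function sum_lines(lines)
  local sum, lineno = 0, 0
  for line in lines do
    lineno = lineno + 1
    local value = tonumber(line)
    if value == nil then
      error(string.format("line %d is not a number: %q", lineno, line), 0)
    end
    sum = sum + value
  end
  return sum
end

-- Numeric lines are summed as before.
local total = sum_lines(("1\n2\n3\n"):gmatch("[^\n]+"))
print(total)

-- A non-numeric line is reported together with its position.
local ok, err = pcall(sum_lines, ("1\nabc\n"):gmatch("[^\n]+"))
print(ok, err)
```

In the benchmark itself the call would be `sum_lines(io.lines())`.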


* Re: [Tarantool-patches] [PATCH v1 luajit 01/41] perf: add LuaJIT-test-cleanup perf suite
  2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:04     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:04 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!

Please consider my answers below.

On 11.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!
> 
> This is a big step forward for LuaJIT performance testing.
> 
> Please take a look at the comments below.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch introduces the LuaJIT-test-cleanup bench suite [1] into our
> s/bench/benchmark/

Fixed.

> > LuaJIT fork source tree. To provide relatable reprodusible results
> 
> did not get it: "relatable"

I meant "reliable". Fixed.

> 
> s/reprodusible/reproducible/

Fixed, thanks!

> 
> > several benchmarks need to be adjusted. However, to be sure we initially use
> > the valid suite, everything in the <perf/LuaJIT-benches> directory is
> > moved intact.
> >
> > [1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/014708b/bench

The new commit message is the following:

| perf: add LuaJIT-test-cleanup perf suite
|
| This patch introduces the LuaJIT-test-cleanup benchmark suite [1] into
| our LuaJIT fork source tree. To provide reliable, reproducible results,
| several benchmarks need to be adjusted. However, to be sure we initially
| use the valid suite, everything in the <perf/LuaJIT-benches> directory
| is moved intact.
|
| [1]: https://github.com/LuaJIT/LuaJIT-test-cleanup/tree/014708b/bench

<snipped>

> > +  'perf/LuaJIT-benches/',
> 
> Please don't do this. It is better to ignore by code number and at least
> 
> some groups of warnings in the code.
> 

It is not clear anyway what these magic numbers mean.
For now it is just more convenient to disable the full set of the added
suite, since we need to ignore the enormous amount of warnings.
We may refactor it later if we have such a need.

I'm not sure that this particular suite will be updated somehow in the
future, so this format looks acceptable. We have the same approach for
LuaJIT tests and PUC-Rio-Lua-5.1 tests.

> 
> It is not clear why exactly these parameters are used.

I suppose this is the empirical setup that Mike thinks is optimal.

> 
> Should we change them?
> 

This commit brings the original code as is. So, no, we don't need to
change them here.

> 
> Do we really need parameters for unsupported platforms (MIPS, x86, ppc)?
> 
> it deserves a comment in commit message
> 

This commit brings the original code as is. So, no, we don't need to
change it here.

> please remove a newline
> ...
> trailing newline
> ...
> unnecessary newline
> ...
> unnecessary newline
> ...
> unnecessary newline
> ...
> trailing space

This commit brings the original code as is. So, no, we don't need to
change it here. It should be removed in the corresponding refactoring
commit if you insist.

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 02/41] perf: introduce clock module
  2025-11-11 14:28   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:05     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:05 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my answers below.

On 11.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch! Please see my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This module contains 2 functions:
> > - `realtime()` -- returns the time represented by the wall clock.
> > - `process_cputime()` -- returns the time consumed by all threads of
> >    the process.
> I would rephrase second bullet: "to measure CPU time instead of elapsed 
> time"

Rephrased.

> Also, I would add this description to the Lua module as well.

It is mentioned in the corresponding comment. Or I don't get what you
mean.

> >
> > Both functions are implemented via FFI call to the `clock_gettime()`.
> > ---
> >   perf/utils/clock.lua | 35 +++++++++++++++++++++++++++++++++++
> >   1 file changed, 35 insertions(+)
> >   create mode 100644 perf/utils/clock.lua
> >
> > diff --git a/perf/utils/clock.lua b/perf/utils/clock.lua
> > new file mode 100644
> > index 00000000..57385967
> > --- /dev/null
> > +++ b/perf/utils/clock.lua
> > @@ -0,0 +1,35 @@
> > +local ffi = require('ffi')
> > +
> > +ffi.cdef[[
> > +struct timespec {
> > +  long tv_sec; /* Seconds. */
> > +  long tv_nsec; /* Nanoseconds. */
> > +};
> > +
> > +int clock_gettime(int clockid, struct timespec *tp);
> > +]]
> > +
> > +local C = ffi.C
> > +
> > +-- Wall clock.
> > +local CLOCK_REALTIME = 0
> 
> This clock is not a reliable source of time. It can be adjusted by NTP,
> manually, or by timezones. It is better to use CLOCK_MONOTONIC or even
> CLOCK_MONOTONIC_RAW (not portable, Linux-specific); it is more reliable
> and does not depend on the things listed above.

I have no strong opinion on this. On the one hand, monotonic time may
be more reliable. OTOH, Google Benchmark uses realtime. AFAIU, the main
point of using "real-time" is to reflect overall execution duration and
adjust it for the minimum benchmark duration.

Do we need to compare results with MONOTONIC time?
I suppose that there is no way that wall clock will be adjusted during
benchmarks somehow.
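
If we decide to compare, a monotonic counterpart of `realtime()` could look roughly like the sketch below. It is not part of the patch: the function name `monotonic` is hypothetical, and the clock id value 1 is the Linux one (other systems use different values).

```lua
local ffi = require('ffi')

ffi.cdef[[
struct timespec {
  long tv_sec;  /* Seconds. */
  long tv_nsec; /* Nanoseconds. */
};

int clock_gettime(int clockid, struct timespec *tp);
]]

-- Assumption: the Linux value of CLOCK_MONOTONIC.
local CLOCK_MONOTONIC = 1

local timespec = ffi.new('struct timespec[1]')

-- Hypothetical addition, named by analogy with `realtime()`.
local function monotonic()
  local rc = ffi.C.clock_gettime(CLOCK_MONOTONIC, timespec)
  assert(rc == 0, 'clock_gettime() failed')
  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
end

-- Two consecutive readings never go backwards.
local t0 = monotonic()
local t1 = monotonic()
print(t1 - t0)
```

Only the difference between two readings is meaningful here, since the starting point of the monotonic clock is unspecified.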

> 
> > +-- CPU time consumed by the process.
> > +local CLOCK_PROCESS_CPUTIME_ID = 2
> > +
> > +-- All functions below returns the corresponding `clock_gettime()`
> s/`clock_gettime()`/elapsed time/

Fixed. See the iterative patch below:

===================================================================
diff --git a/perf/utils/clock.lua b/perf/utils/clock.lua
index 57385967..cf708194 100644
--- a/perf/utils/clock.lua
+++ b/perf/utils/clock.lua
@@ -16,8 +16,8 @@ local CLOCK_REALTIME = 0
 -- CPU time consumed by the process.
 local CLOCK_PROCESS_CPUTIME_ID = 2

--- All functions below returns the corresponding `clock_gettime()`
--- in seconds.
+-- All functions below returns the corresponding elapsed time in
+-- seconds.
 local M = {}

 local timespec = ffi.new('struct timespec[1]')
===================================================================

> > +-- in seconds.
> > +local M = {}
> > +
> > +local timespec = ffi.new('struct timespec[1]')
> > +
> > +function M.realtime()
> > +  C.clock_gettime(CLOCK_REALTIME, timespec)
> > +  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
> > +end
> > +
> 
> may be it is better to make conversion only once?
> 
> @@ -24,7 +24,7 @@ local timespec = ffi.new('struct timespec[1]')
> 
>   function M.realtime()
>     C.clock_gettime(CLOCK_REALTIME, timespec)
> -  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
> +  return tonumber(timespec[0].tv_sec + timespec[0].tv_nsec / 1e9)

In that way we lose the precision of `tv_nsec`. Ignoring.
| luajit -e 'print(400000LL / 1e9)'
| 0LL
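
To spell the example out: when one operand is a 64-bit integer cdata, LuaJIT converts the other operand to a 64-bit integer too and does integer arithmetic. A small sketch comparing both variants follows; the anonymous struct is only a stand-in for `struct timespec` with 64-bit fields, and the values are arbitrary.

```lua
local ffi = require('ffi')

-- Stand-in for `struct timespec` with 64-bit fields (as `long` is on
-- x86-64 Linux); the values are arbitrary.
local ts = ffi.new('struct { int64_t tv_sec; int64_t tv_nsec; }')
ts.tv_sec, ts.tv_nsec = 3, 400000

-- int64_t / number is done as 64-bit integer division, so the
-- nanoseconds are truncated to 0 before the final tonumber().
local truncated = tonumber(ts.tv_sec + ts.tv_nsec / 1e9)

-- Converting each field first keeps the fractional part.
local precise = tonumber(ts.tv_sec) + tonumber(ts.tv_nsec) / 1e9

print(truncated, precise)
```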

>   end
> 
> the same below
> 
> > +function M.process_cputime()
> > +  C.clock_gettime(CLOCK_PROCESS_CPUTIME_ID, timespec)
> > +  return tonumber(timespec[0].tv_sec) + tonumber(timespec[0].tv_nsec) / 1e9
> > +end
> > +
> > +return M

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 03/41] perf: introduce bench module
  2025-11-11 15:41   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:06     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:06 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi again, Sergey!
Thanks for the review!
Please consider my answers below.

On 11.11.25, Sergey Bronnikov wrote:
> Hi, Sergey, again!
> 
> thanks for the patch!
> 
> Please see comments below.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This module provides functionality to run custom benchmark workloads
> > defined by the following syntax:
> >
> > | local bench = require('bench').new(arg)
> 
> there are many LuaJIT-specific functions below (ffi, jit etc.)
> 
> I propose to check that luabin is LuaJIT-compatible and exit with 
> appropriate message
> 
> if it is not.

For now I have no goal of making this module LuaJIT-agnostic. The same
approach is used for our tap module for tests -- it is LuaJIT-only. If
we want to make it LuaJIT-agnostic, it should be done out of scope of
this patch set.

> 
> > |
> > | -- f_* are functions, n_* are numbers.
> 
> s/functions/user-defined functions/
> 
> s/numbers/user-defined numbers/

Fixed.

> 
> > | bench:add({
> > |   setup = f_setup,
> > |   payload = f_payload,
> > |   teardown = f_teardown,
> > |   items = n_items_processed,
> > |
> > |   checker = f_checker,
> > |   -- Or instead:
> > |   skip_check = true,
> > |
> > |   iterations = n_iterations,
> > |   -- Or instead:
> > |   min_time = n_seconds,
> > | })
> > |
> > | bench:run_and_report()
> >
> > The checker function received the single value returned by the payload
> > function and completed all checks related to the test. If it returns a
> > true value, it is considered a successful check pass. The checker
> > function is called before the main workload as a warm-up. Generally, you
> > should always provide the checker function to be sure that your
> > benchmark is still correct after optimizations. In cases when it is
> > impossible (for some reason), you may specify the `skip_check` flag. In
> > that case the warm-up part will be skipped as well.
> >
> > Each test is run in the order it was added. The module measures the
> > real-time and CPU time necessary to run `iterations` repetitions of the
> please consider using monotonic time, not a realtime (see a previous patch)

Yes, discussed in the corresponding ML reply.

> > test or amount of iterations `min_time` in seconds (4 by default) and
> > calculates the metric items per second (more is better). The total
> > amount of items equals `n_items_processed * n_iterations`. The items may
> > be added in the table with the description inside the payload function
> > as well. The results (real-time, CPU time, iterations, items/s) are
> > reported in a format similar to the Google Benchmark suite [1].
> s/similar/compatible/

Fixed.
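
For clarity, the metric from the paragraph above can be written out as a tiny sketch. The helper name `items_per_second` is hypothetical and is not taken from bench.lua. Which of the two measured times is used as the divisor is not stated in the description, so the sketch takes the elapsed time as a parameter.

```lua
-- Hypothetical helper: total items are `items * iterations`, and the
-- metric is that total divided by the elapsed time in seconds.
local function items_per_second(items, iterations, elapsed)
  assert(elapsed > 0, "elapsed time must be positive")
  return items * iterations / elapsed
end

-- Made-up numbers: 2098 items per iteration, 10 iterations, 2 seconds.
local rate = items_per_second(2098, 10, 2)
print(rate)
```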

> >
> > Each test may be run from the command line as follows:
> > | LUA_PATH="..." luajit test_name.lua [flags] arguments
> >
> > The supported flags are:
> > | -j{off|on}                 Disable/Enable JIT for the benchmarks.
> Why do you implement this flag for a Lua module? It can be passed to 
> luajit directly.

It's just a little more convenient for me if I
misplace the flag :).

> > | --benchmark_color={true|false|auto}
> > |                            Enables the colorized output for the
> > |                            terminal (not the file).
> > | --benchmark_min_time={number} Minimum seconds to run the benchmark
> > |                            tests.
> > | --benchmark_out=<file>     Places the output into <file>.
> > | --benchmark_out_format={console|json}
> > |                            The format is used when saving the results in the
> > |                            file. The default format is the JSON format.
> > | -h, --help                 Display help message and exit.
> >
> > These options are similar to the Google Benchmark command line options,
> > but with a few changes:
> > 1) If an output file is given, there is no output in the terminal.
> > 2) The min_time option supports only number values. There is no support
> >     for the iterations number (by the 'x' suffix).
> >
> > [1]: https://github.com/google/benchmark
> > ---
> >   perf/utils/bench.lua | 509 +++++++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 509 insertions(+)
> >   create mode 100644 perf/utils/bench.lua
> >
> > diff --git a/perf/utils/bench.lua b/perf/utils/bench.lua
> > new file mode 100644
> > index 00000000..68473215
> > --- /dev/null
> > +++ b/perf/utils/bench.lua
> > @@ -0,0 +1,509 @@
> > +local clock = require('clock')
> > +local ffi = require('ffi')
> > +-- Require 'cjson' only on demand for formatted output to file.
> > +local json
> > +
> > +local M = {}
> > +
> > +local type, assert, error = type, assert, error
> > +local format, rep = string.format, string.rep
> 
> s/, rep/string_rep/
> 
> s/local format/string_format/
> 
> for consistency with shortcuts below

AFAICS, it's kind of common practice to use the shortcuts as mentioned
above and below. `table_remove` is the only exception from this naming.

> 
> 
> > +local floor, max, min = math.floor, math.max, math.min
> > +local table_remove = table.remove
> > +
> > +local LJ_HASJIT = jit and jit.opt
> > +
> > +-- Argument parsing. ---------------------------------------------
> > +
> > +-- XXX: Make options compatible with Google Benchmark, since most
> > +-- probably it will be used for the C benchmarks as well.
> > +-- Compatibility isn't full: there is no support for environment
> > +-- variables (since they are not so useful) and the output to the
> > +-- terminal is suppressed if the --benchmark_out flag is
> > +-- specified.
> > +
> > +local HELP_MSG = [[
> > + Options:
> > +   -j{off|on}                 Disable/Enable JIT for the benchmarks.
> Add a default value to the description. Here and below.

There is no default value for this option. Behaviour is affected only if
JIT is available and the -j option contradicts the current
`jit.status()`.

I thought that the default for benchmark_color was obvious, but I've
made the help more verbose:

===================================================================
diff --git a/perf/utils/bench.lua b/perf/utils/bench.lua
index 7b7bb5b3..09a5c41a 100644
--- a/perf/utils/bench.lua
+++ b/perf/utils/bench.lua
@@ -29,7 +29,7 @@ local HELP_MSG = [[
                               the file). 'auto' means to use colors if the
                               output is being sent to a terminal and the TERM
                               environment variable is set to a terminal type
-                              that supports colors.
+                              that supports colors. Default is 'auto'.
    --benchmark_min_time={number}
                               Minimum seconds to run the benchmark tests.
                               4.0 by default.
===================================================================

> > +   --benchmark_color={true|false|auto}
> > +                              Enables the colorized output for the terminal (not
> > +                              the file). 'auto' means to use colors if the
> > +                              output is being sent to a terminal and the TERM
> > +                              environment variable is set to a terminal type
> > +                              that supports colors.
> > +   --benchmark_min_time={number}
> > +                              Minimum seconds to run the benchmark tests.
> > +                              4.0 by default.
> Why 4.0?

Just some empirical value that is good enough. 2.0 seconds (the default
in the Google benchmark) is sometimes (in a few rare cases) not enough
for the LuaJIT runtime to be stable.

> > +   --benchmark_out=<file>     Places the output into <file>.
> > +   --benchmark_out_format={console|json}
> > +                              The format is used when saving the results in the
> > +                              file. The default format is the JSON format.
> 
>  >  The default format is the JSON format.
> 
> by default JSON module is not available, in source code it is marked as 
> "on demand".
> 
> I'm not sure JSON format should be by default.

This flag is meaningful only if the `--benchmark_out` flag is given as
well. So, when we have output to the file, JSON format is the default.
The behaviour is the same as for Google Benchmark.
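
Roughly, the rule can be sketched like this. `resolve_output_format` is a
hypothetical name; only the `ctx.output` and `ctx.output_format` fields
come from the patch:

```lua
-- Hypothetical resolver: the format matters only when an output
-- file is requested; 'json' is the fallback in that case.
local function resolve_output_format(ctx)
  if not ctx.output then
    return nil -- Console report only; the format flag is ignored.
  end
  return ctx.output_format or 'json'
end
```

So the cjson module would be required only when a file output is actually
requested and the format is left at JSON.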

> 
> > +   -h, --help                 Display this message and exit.
> > +
> > + There are a bunch of suggestions on how to achieve the most
> > + stable benchmark results:
> > + https://github.com/tarantool/tarantool/wiki/Benchmarking
> > +]]
> > +
> > +local function usage(ctx)
> > +  local header = format('USAGE: luajit %s [options]\n', ctx.name)
> 
> I would not hardcode "luajit" and use luabin instead, see `M.luabin` in
> 
> test/tarantool-tests/utils/exec.lua. This can be at least tarantool 

No, it can't. I see no reason to run this benchmark (and any
LuaJIT-related) suites anywhere except LuaJIT binary (and maybe Lua
binary, but this is out of scope of this patch set). Hence, the "luajit"
is hardcoded. Also, I see no reason to mention some custom path to the
binary.

> instead luajit.
> 
> > +  io.stderr:write(header, HELP_MSG)
> > +  os.exit(1)
> > +end
> > +
> > +local function check_param(check, strfmt, ...)
> > +  if not check then
> > +    io.stderr:write(format(strfmt, ...))
> > +    os.exit(1)
> 
> please define possible exit codes as variables and use them in os.exit().
> 
> Like the well-known EXIT_SUCCESS and EXIT_FAILURE in C. Feel free to ignore.

Added:

===================================================================
diff --git a/perf/utils/bench.lua b/perf/utils/bench.lua
index 68473215..7b7bb5b3 100644
--- a/perf/utils/bench.lua
+++ b/perf/utils/bench.lua
@@ -44,16 +44,18 @@ local HELP_MSG = [[
  https://github.com/tarantool/tarantool/wiki/Benchmarking
 ]]
 
+local EXIT_FAILURE = 1
+
 local function usage(ctx)
   local header = format('USAGE: luajit %s [options]\n', ctx.name)
   io.stderr:write(header, HELP_MSG)
-  os.exit(1)
+  os.exit(EXIT_FAILURE)
 end
 
 local function check_param(check, strfmt, ...)
   if not check then
     io.stderr:write(format(strfmt, ...))
-    os.exit(1)
+    os.exit(EXIT_FAILURE)
   end
 end
 
@@ -108,7 +110,7 @@ local function unrecognized_option(optname, dashes)
   local fullname = dashes .. (optname or '=')
   io.stderr:write(format('unrecognized command-line flag: %s\n', fullname))
   io.stderr:write(HELP_MSG)
-  os.exit(1)
+  os.exit(EXIT_FAILURE)
 end
 
 local function unrecognized_long_option(_, optname)
===================================================================

> 
> > +  end
> > +end
> > +
> > +-- Valid values: 'false'/'no'/'0'.
> > +-- In case of an invalid value the 'auto' is used.
> > +local function set_color(ctx, value)
> > +  if value == 'false' or value == 'no' or value == '0' then
> > +    ctx.color = false
> > +  else
> > +    -- In case of an invalid value, the Google Benchmark uses
> > +    -- 'auto', which is true for the stdout output (the only
> > +    -- colorizable output). So just set it to true by default.
> > +    ctx.color = true
> > +  end
> > +end
> > +
> > +local DEFAULT_MIN_TIME = 4.0
> > +local function set_min_time(ctx, value)
> > +  local time = tonumber(value)
> 
>  >  Tries to convert its argument to a number. If the argument is already
> 
>  > a number or a string convertible to a number, then tonumber returns 
> this number;
> 
>  > otherwise, it returns *nil*.
> 
> https://www.lua.org/manual/5.1/manual.html#pdf-tonumber
> 
> please check result for nil

It is checked via the line below.

> 
> > +  check_param(time, 'Invalid min time: "%s"\n', value)
> > +  ctx.min_time = time
> > +end
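
For illustration, a minimal sketch of how the nil result is caught. Note
that 0 is truthy in Lua, so nil is the only result of `tonumber()` that
fails the check. The `check_param()` below raises an error instead of
calling `os.exit()`; that is an assumption made for the sketch so the
failure can be observed, and `parse_min_time` is a hypothetical name:

```lua
-- Variant of check_param() that raises instead of exiting, so the
-- failure can be observed with pcall(). The patch itself writes to
-- stderr and exits with a non-zero code.
local function check_param(check, strfmt, ...)
  if not check then
    error(string.format(strfmt, ...), 0)
  end
end

local function parse_min_time(value)
  local time = tonumber(value) -- nil for non-numeric input.
  check_param(time, 'Invalid min time: "%s"\n', value)
  return time
end
```

Usage: `parse_min_time('2.5')` returns the number, while
`pcall(parse_min_time, 'abc')` fails with the "Invalid min time" message.
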
> > +
> > +local function set_output(ctx, filename)
> > +  check_param(type(filename) == "string", 'Invalid output value: "%s"\n',
> > +              filename)
> > +  ctx.output = filename
> > +end
> > +

<snipped>

> > +local SHORT_OPTS = setmetatable({
> > +  ['h'] = usage,
> > +  ['j'] = set_jit,
> > +}, {__index = unrecognized_short_option})
> > +
> > +local LONG_OPTS = setmetatable({
> > +  ['benchmark_color'] = set_color,
> > +  ['benchmark_min_time'] = set_min_time,
> > +  ['benchmark_out'] = set_output,
> > +  -- XXX: For now support only JSON encoded and raw output.
> > +  ['benchmark_out_format'] = set_output_format,
> > +  ['help'] = usage,
> > +}, {__index = unrecognized_long_option})
> > +
> Is taking argparse from the tarantool repository (src/lua/argparse.lua) 
> not the option?

I don't like this particular module very much and I don't want to add
any third-party dependencies. Also, the arg parser is rather primitive,
so I prefer to write it from scratch.
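
To show how little is needed, here is a simplified sketch of the approach.
It is not the parser from the patch: it handles only `--name=value`
options, the handlers are stand-ins, and the `__index` fallback returns an
error-raising handler instead of printing the help and exiting:

```lua
-- Stand-in handlers; the real ones validate their input.
local function set_min_time(ctx, value) ctx.min_time = tonumber(value) end
local function set_output(ctx, value) ctx.output = value end

-- Unknown option names fall through to __index, which returns a
-- handler that reports the error.
local LONG_OPTS = setmetatable({
  benchmark_min_time = set_min_time,
  benchmark_out = set_output,
}, {__index = function(_, optname)
  return function()
    error('unrecognized command-line flag: --' .. optname, 0)
  end
end})

local function parse_args(args)
  local ctx = {}
  for _, a in ipairs(args) do
    -- Matches "--name=value"; anything else is left untouched.
    local name, value = a:match('^%-%-([%w_]+)=(.*)$')
    if name then
      LONG_OPTS[name](ctx, value)
    end
  end
  return ctx
end
```

The table lookup does the dispatch, and the metatable covers the
"unrecognized option" path, so no external module is required.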

> > +local function is_option(str)
> > +  return type(str) == 'string' and str:sub(1, 1) == '-' and str ~= '-'
> > +end
> > +

<snipped>

> > +
> > +  if ctx.output_format == 'json' then
> > +    json = require('cjson')
> should we make it compatible with tarantool? (module is named 'json' there)

I suppose not, since it wouldn't be launched by the Tarantool.

> > +  end
> > +

<snipped>

> > +
> > +-- https://luajit.org/running.html#foot.
> > +local JIT_DEFAULTS = {
> > +  maxtrace = 1000,
> > +  maxrecord = 4000,
> > +  maxirconst = 500,
> > +  maxside = 100,
> > +  maxsnap = 500,
> > +  hotloop = 56,
> > +  hotexit = 10,
> > +  tryside = 4,
> > +  instunroll = 4,
> > +  loopunroll = 15,
> > +  callunroll = 3,
> > +  recunroll = 2,
> > +  sizemcode = 32,
> > +  maxmcode = 512,
> > +}
> please sort alphabetically

I've used the same sorting order as in the source, see the link.
It may be easier to find the option this way (at least it is for me).

> > +
> > +-- Basic setup for all tests to clean up after a previous
> > +-- executor.
> > +local function luajit_tests_setup(ctx)

<snipped>

> > +  -- Reset the GC to the defaults.
> > +  collectgarbage('setstepmul', 200)
> > +  collectgarbage('setpause', 200)
> should we define 200 as a named constant?

It is the default value from the Lua Reference Manual. These values are
used only once here, so I don't see the necessity in it.

> > +
> > +  -- Collect all garbage at the end. Twice to be sure that all
> > +  -- finalizers are run.
> > +  collectgarbage()
> > +  collectgarbage()

<snipped>

-- 
Best regards,
Sergey Kaplun

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [Tarantool-patches] [PATCH v1 luajit 04/41] perf: adjust array3d in LuaJIT-benches
  2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:07     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:07 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my comments below.

On 13.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch!
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

Also, added the comment with the benchmark description as we discussed
offline.

===================================================================
diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
index 1062fff0..4d88432d 100644
--- a/perf/LuaJIT-benches/array3d.lua
+++ b/perf/LuaJIT-benches/array3d.lua
@@ -1,3 +1,7 @@
+-- The benchmark to measure simple operations on array-like data
+-- structures.
+-- Here, a 3D array is represented as a contiguous 1D array.
+
 local bench = require("bench").new(arg)
 
 local function array_set(self, x, y, z, p)
===================================================================
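
As a side note, the "contiguous 1D array" idea can be sketched as follows.
The axis order (x fastest, then y, then z) and the helper name
`flat_index` are assumptions for the sketch; the benchmark's own
`array_set()` may lay the data out differently:

```lua
-- Assumed layout: x varies fastest, then y, then z; all
-- coordinates are 0-based.
local function flat_index(nx, ny, x, y, z)
  return (z * ny + y) * nx + x
end

-- Fill a small cube the same way the payload does (value = x*x).
local dim = 4
local image = {}
for z = 0, dim - 1 do
  for y = 0, dim - 1 do
    for x = 0, dim - 1 do
      image[flat_index(dim, dim, x, y, z)] = x * x
    end
  end
end
```

With this layout the last cell (dim-1, dim-1, dim-1) lands at index
dim^3 - 1 and holds (dim-1)^2, which is exactly what the checker in the
benchmark asserts.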

> > -local arr = array_new(dim, dim, dim, packed)
> >   
> > -for x,y,z in arr:points() do
> > -  arr:set(x, y, z, x*x)
> > -end
> > -assert(arr.image[dim^3-1] == (dim-1)^2)
> > +bench:add({
> > +  name = "array3d",
> > +  checker = function(arr)
> > +    assert(arr.image[dim^3-1] == (dim-1)^2)
> > +    return true
> > +  end,
> > +  payload = function()
> > +    local arr = array_new(dim, dim, dim, packed)
> > +    for x,y,z in arr:points() do
> please add whitespaces after commas

Added:

===================================================================
diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
index 75ab5b01..1062fff0 100644
--- a/perf/LuaJIT-benches/array3d.lua
+++ b/perf/LuaJIT-benches/array3d.lua
@@ -60,7 +60,7 @@ bench:add({
   end,
   payload = function()
     local arr = array_new(dim, dim, dim, packed)
-    for x,y,z in arr:points() do
+    for x, y, z in arr:points() do
       arr:set(x, y, z, x*x)
     end
     return arr
===================================================================

But I'm not sure that we need to refactor each part of the code this
way. Anyway, I'm OK with reformatting this since we already touched this
chunk.

> > +      arr:set(x, y, z, x*x)
> > +    end
> > +    return arr
> > +  end,
> > +  items = dim * dim * dim,
> > +  -- Limit the number of iterations to avoid OOM errors for
> > +  -- non-GC64 builds.
> > +  iterations = 5,

Reduced this to 4 iterations, since 5 sometimes hits the OOM on the
runner.

===================================================================
diff --git a/perf/LuaJIT-benches/array3d.lua b/perf/LuaJIT-benches/array3d.lua
index 4d88432d..80562706 100644
--- a/perf/LuaJIT-benches/array3d.lua
+++ b/perf/LuaJIT-benches/array3d.lua
@@ -72,7 +72,7 @@ bench:add({
   items = dim * dim * dim,
   -- Limit the number of iterations to avoid OOM errors for
   -- non-GC64 builds.
-  iterations = 5,
+  iterations = 4,
 })
 
 bench:run_and_report()
===================================================================

> > +})
> >   
> > +bench:run_and_report()
> 
> Looks like benchmark min time does not work as expected:
> 
> [1] ~/sources/MRG/tarantool/third_party/luajit $ time ./build/src/luajit 
> perf/LuaJIT-benches/array3d.lua --benchmark_min_time=10
> -------------------------------------------------------------------------------------------------------------
> Benchmark                                     Time             CPU   Iterations UserCounters...
> -------------------------------------------------------------------------------------------------------------
> array3d                                     2.10 s          2.13 s            5 items_per_second=64.370M/s
> 
> real    0m2.333s
> user    0m1.869s
> sys     0m0.461s
> [1] ~/sources/MRG/tarantool/third_party/luajit $
> 
> --benchmark_min_time set to 10 sec, but benchmark.lua reports "2.10 s" 
> and time reported by `time` utility
> 
> is less than 10 sec.

It works exactly as expected, since this option is superseded by the
`iterations` value in the benchmark configuration, see the corresponding
comment in the code for the rationale in this particular benchmark.
Also, it is mentioned in the commit message.
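
The precedence can be sketched as below. This is a simplified model, not
the loop from the bench module: `run` is a hypothetical name and `clock`
is any function that returns seconds.

```lua
-- Hypothetical runner: a fixed `iterations` count wins over the
-- minimum-time budget. Returns the number of payload calls and the
-- elapsed time.
local function run(bench, min_time, clock)
  local start = clock()
  local count = 0
  if bench.iterations then
    -- Fixed count: min_time is not consulted at all.
    for _ = 1, bench.iterations do
      bench.payload()
      count = count + 1
    end
  else
    -- No fixed count: repeat until the time budget is spent.
    repeat
      bench.payload()
      count = count + 1
    until clock() - start >= min_time
  end
  return count, clock() - start
end
```

So with `iterations` set, `--benchmark_min_time=10` changes nothing: the
payload runs exactly that many times, however long that takes.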

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 05/41] perf: adjust binary-trees in LuaJIT-benches
  2025-11-13 11:06   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:08     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:08 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!

Thanks for the review!
Please consider my answers below.

On 13.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch!
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The test cases are split by the different types of trees:
> > 1) stretched tree,
> > 2) long-lived tree,
> > 3) several trees with a depth of the power of 2,
> > 4) iteration over all trees in the third test case.
> >
> > The number of items is the number of `ItemCheck()` first-level calls
> > performed in the payload.
> > ---
> >
> > I'm not sure that we should distinguish different subtests here.
> > OTOH, how to calculate the amount of items correctly for the whole test
> > instead?

Discussed offline to leave different subtests for now.

> >
> >   perf/LuaJIT-benches/binary-trees.lua | 94 ++++++++++++++++++++++------
> >   1 file changed, 76 insertions(+), 18 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
> > index bf040466..9d4dc7b4 100644
> > --- a/perf/LuaJIT-benches/binary-trees.lua
> > +++ b/perf/LuaJIT-benches/binary-trees.lua
> > @@ -1,3 +1,4 @@


Also, added the comment with the benchmark description as we discussed
offline.

===================================================================
diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
index ae02a1ab..df288032 100644
--- a/perf/LuaJIT-benches/binary-trees.lua
+++ b/perf/LuaJIT-benches/binary-trees.lua
@@ -1,3 +1,9 @@
+-- The benchmark to check the performance of the GC and memory
+-- allocator. Allocate, walk, and deallocate many bottom-up binary
+-- trees.
+-- For the details, see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/binarytrees.html
+
 local bench = require("bench").new(arg)
 
 local function BottomUpTree(item, depth)
@@ -28,6 +34,8 @@ if maxdepth < N then maxdepth = N end
 
 local stretchdepth = maxdepth + 1
 
+-- Allocate a binary tree to "stretch" memory, check it exists,
+-- and "deallocate" it.
 bench:add({
   name = "stretch_depth_" .. tostring(stretchdepth),
   payload = function()
@@ -41,9 +49,12 @@ bench:add({
   end,
 })
 
+-- Allocate a long-lived binary tree that will live on while
+-- other trees are allocated and "deallocated".
 -- This tree created once on the setup for the first test.
 local longlivedtree
 
+-- Allocate, walk, and "deallocate" many bottom-up binary trees.
 for depth = mindepth, maxdepth, 2 do
   local iterations = 2 ^ (maxdepth - depth + mindepth)
   local tree_bench
@@ -71,6 +82,7 @@ for depth = mindepth, maxdepth, 2 do
   bench:add(tree_bench)
 end
 
+-- Check that the long-lived binary tree still exists.
 bench:add({
   name = "longlived_depth_" .. tostring(maxdepth),
   payload = function()
@@ -83,6 +95,7 @@ bench:add({
   end,
 })
 
+-- All in one benchmark for the various trees.
 bench:add({
   name = "all_in_one",
   payload = function()
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local function BottomUpTree(item, depth)
> >     if depth > 0 then
> > @@ -18,30 +19,87 @@ local function ItemCheck(tree)
> >     end
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 0
> > +local N = tonumber(arg and arg[1]) or 16
> Why 16?

It is the default for the x86 arch. I've taken the values from
<PARAM_x86.txt>, since this is the most important architecture for
Tarantool; see the commit message.

> >   local mindepth = 4
> >   local maxdepth = mindepth + 2
> >   if maxdepth < N then maxdepth = N end
> >   
> > -do
> > -  local stretchdepth = maxdepth + 1
> > -  local stretchtree = BottomUpTree(0, stretchdepth)
> > -  io.write(string.format("stretch tree of depth %d\t check: %d\n",
> > -    stretchdepth, ItemCheck(stretchtree)))
> > -end
> > +local stretchdepth = maxdepth + 1
> > +
> > +bench:add({
> > +  name = "stretch_depth_" .. tostring(stretchdepth),
> > +  payload = function()
> > +    local stretchtree = BottomUpTree(0, stretchdepth)
> > +    local check = ItemCheck(stretchtree)
> > +    return check
> > +  end,
> > +  items = 1,
> > +  checker = function(check)
> > +    return check == -1
> it deserves a comment

Added the corresponding comment to the `ItemCheck()`.
===================================================================
diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
index 9d4dc7b4..75c60332 100644
--- a/perf/LuaJIT-benches/binary-trees.lua
+++ b/perf/LuaJIT-benches/binary-trees.lua
@@ -11,6 +11,8 @@ local function BottomUpTree(item, depth)
   end
 end
 
+-- The checker function. For the tree created with the given
+-- `item` returns `item` - 1 (by induction).
 local function ItemCheck(tree)
   if tree[2] then
     return tree[1] + ItemCheck(tree[2]) - ItemCheck(tree[3])
===================================================================
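
The induction is easy to check. A leaf returns its own `item`. For depth 1
the result is item + (2*item - 1) - 2*item = item - 1. For deeper trees
the children return (2*item - 1) - 1 and 2*item - 1, so the sum is again
item - 1. A small sketch, assuming `BottomUpTree()` builds the children
with the items 2*item - 1 and 2*item, as in the Benchmarks Game version
(its body is not fully quoted in this thread):

```lua
-- Assumed tree builder: children get the items 2*item - 1 and 2*item.
local function BottomUpTree(item, depth)
  if depth > 0 then
    local i = item + item
    depth = depth - 1
    return { item, BottomUpTree(i - 1, depth), BottomUpTree(i, depth) }
  else
    return { item }
  end
end

-- Returns `item` for a leaf and `item - 1` for any deeper tree.
local function ItemCheck(tree)
  if tree[2] then
    return tree[1] + ItemCheck(tree[2]) - ItemCheck(tree[3])
  else
    return tree[1]
  end
end
```

For the stretch tree the item is 0 and the depth is positive, hence the
expected -1 in the checker.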

> > +  end,
> > +})
> >   
> > -local longlivedtree = BottomUpTree(0, maxdepth)
> > +-- This tree created once on the setup for the first test.
> > +local longlivedtree
> 
> I don't like that we should save a benchmark state in global variables.
> 
> What if we allow setting a user-defined object that will have a state
> 
> and this state will be passed to checker/payload functions?

This changes the semantics of the benchmark.
It is important that the state is created in the first benchmark and
used later as an upvalue, for example.
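
A minimal sketch of what I mean, without the bench module. The table
`{ value = 42 }` stands in for the real tree, and running the payloads in
registration order is an assumption about the runner:

```lua
-- Shared state: created by the first payload, reused by later ones.
local longlived

local first = {
  name = 'create',
  payload = function()
    longlived = { value = 42 } -- Stand-in for the long-lived tree.
    return longlived
  end,
}

local last = {
  name = 'check_longlived',
  payload = function()
    -- The object allocated earlier must still be alive here.
    return longlived ~= nil and longlived.value
  end,
}

-- Run in registration order, as a benchmark runner would.
local results = {}
for _, b in ipairs({ first, last }) do
  results[b.name] = b.payload()
end
```

The object stays reachable between the payloads only because both
closures capture the same upvalue. Passing a fresh state object to each
payload would let the GC behave differently, which is the part being
measured.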

> 
> >   

<snipped>

> > +bench:add({
> > +  name = "all_in_once",
> s/all_in_once/all_in_one/?

Renamed:

===================================================================
diff --git a/perf/LuaJIT-benches/binary-trees.lua b/perf/LuaJIT-benches/binary-trees.lua
index 75c60332..ae02a1ab 100644
--- a/perf/LuaJIT-benches/binary-trees.lua
+++ b/perf/LuaJIT-benches/binary-trees.lua
@@ -84,7 +84,7 @@ bench:add({
 })
 
 bench:add({
-  name = "all_in_once",
+  name = "all_in_one",
   payload = function()
     for depth = mindepth, maxdepth, 2 do
       local iterations = 2 ^ (maxdepth - depth + mindepth)
===================================================================

> > +  payload = function()

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 06/41] perf: adjust chameneos in LuaJIT-benches
  2025-11-13 11:11   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:10     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:10 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
See my answers below.

On 13.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch! LGTM
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/chameneos.lua | 32 ++++++++++++++++++++++---------
> >   1 file changed, 23 insertions(+), 9 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
> > index 78b64c3f..c1002041 100644
> > --- a/perf/LuaJIT-benches/chameneos.lua
> > +++ b/perf/LuaJIT-benches/chameneos.lua
> > @@ -1,8 +1,10 @@

Also, added the comment with the benchmark description as we discussed
offline.

===================================================================
diff --git a/perf/LuaJIT-benches/chameneos.lua b/perf/LuaJIT-benches/chameneos.lua
index c1002041..9bd83081 100644
--- a/perf/LuaJIT-benches/chameneos.lua
+++ b/perf/LuaJIT-benches/chameneos.lua
@@ -1,3 +1,9 @@
+-- The benchmark to check the performance of coroutine interaction
+-- using symmetrical rendezvous requests.
+-- For the details see:
+-- https://pybenchmarks.org/u64q/performance.php?test=chameneosredux
+-- https://cedric.cnam.fr/PUBLIS/RC474.pdf
+
 local bench = require("bench").new(arg)
 
 local co = coroutine
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local co = coroutine
> >   local create, resume, yield = co.create, co.resume, co.yield
> >   
> > -local N = tonumber(arg and arg[1]) or 10
> > +local N = tonumber(arg and arg[1]) or 1e7
> Why 1e7?

It is the default for the x86 arch. I've taken the values from
<PARAM_x86.txt>, since this is the most important architecture for
Tarantool; see the commit message.

> > +local N_ATTEMPTS = N

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 07/41] perf: adjust coroutine-ring in LuaJIT-benches
  2025-11-13 11:17   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:11     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:11 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and added the description to the benchmark.

On 13.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/coroutine-ring.lua | 45 ++++++++++++++++----------
> >   1 file changed, 28 insertions(+), 17 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
> > index 1e8c5ef6..1b86a5ba 100644
> > --- a/perf/LuaJIT-benches/coroutine-ring.lua
> > +++ b/perf/LuaJIT-benches/coroutine-ring.lua
> > @@ -1,3 +1,5 @@

Also, added the comment with the benchmark description as we discussed
offline.

===================================================================
diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
index 1b86a5ba..8efe7b2a 100644
--- a/perf/LuaJIT-benches/coroutine-ring.lua
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -1,3 +1,9 @@
+-- The benchmark to check the performance of coroutine interaction
+-- to test possible "death by concurrency," when one coroutine
+-- is active and others are waiting their turn.
+-- For the details see:
+-- https://pybenchmarks.org/u64q/performance.php?test=threadring
+
 local bench = require("bench").new(arg)

 -- The Computer Language Benchmarks Game
===================================================================

> > +local bench = require("bench").new(arg)
> > +
> >   -- The Computer Language Benchmarks Game
> >   -- http://shootout.alioth.debian.org/
> >   -- contributed by Sam Roberts

<snipped>

> > +bench:add({
> > +  name = "coroutine_ring",
> > +  payload = function()
> > +    local token     = 0
> a single whitespace before "="

Fixed. See the iterative patch below.

> > +    -- create all threads
> First letter is in uppercase and a dot at the end.

Fixed. See the iterative patch below.

> > +    local threads   = {}
> a single whitespace before "="

Fixed. See the iterative patch below.

> > +    for id = 1, poolsize do
> > +      threads[id] = create(body)
> > +    end
> > +
> > +    -- send the token
> First letter is in uppercase and a dot at the end.

Fixed:

===================================================================
diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
index 1b86a5ba..e1455205 100644
--- a/perf/LuaJIT-benches/coroutine-ring.lua
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -27,14 +27,14 @@ end
 bench:add({
   name = "coroutine_ring",
   payload = function()
-    local token     = 0
-    -- create all threads
-    local threads   = {}
+    local token = 0
+    -- Create all threads.
+    local threads = {}
     for id = 1, poolsize do
       threads[id] = create(body)
     end
 
-    -- send the token
+    -- Send the token.
     repeat
       if id == poolsize then
         id = 1
===================================================================

> > +    repeat
> > +      if id == poolsize then
> > +        id = 1
> > +      else
> > +        id = id + 1
> > +      end
> > +      ok, token = resume(threads[id], token)
> > +    until token == n
> > +    return id
> > +  end,
> > +  checker = function(id) return id == (n % poolsize + 1) end,
> > +  items = n,
> > +})
> > +
> > +bench:run_and_report()
> >   
> > -io.write(id, "\n")

I've also refactored the benchmark to make its performance match the
original version:

===================================================================
diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
index 8efe7b2a..747a2ecc 100644
--- a/perf/LuaJIT-benches/coroutine-ring.lua
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -17,13 +17,8 @@ local n         = tonumber(arg and arg[1]) or 2e7
 local poolsize  = 503
 
 -- cache these to avoid global environment lookups
-local create    = coroutine.create
-local resume    = coroutine.resume
 local yield     = coroutine.yield
 
-local id        = 1
-local ok
-
 local body = function(token)
   while true do
     token = yield(token + 1)
@@ -33,7 +28,18 @@ end
 bench:add({
   name = "coroutine_ring",
   payload = function()
+    -- Cache to avoid upvalue lookups.
     local token = 0
+    local n = n
+    local poolsize = poolsize
+
+    -- Cache these to avoid global environment lookups.
+    local create = coroutine.create
+    local resume = coroutine.resume
+
+    local id = 1
+    local ok
+
     -- Create all threads.
     local threads = {}
     for id = 1, poolsize do
===================================================================
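
For reference, the whole ring fits into a small self-contained function.
This is a sketch rather than the benchmark itself: `ring` is a
hypothetical wrapper, and the `assert()` on the resume result is added
only for clarity.

```lua
local function ring(n, poolsize)
  -- Cache library functions in locals, as in the patch above.
  local create, resume = coroutine.create, coroutine.resume
  local yield = coroutine.yield

  -- Each coroutine hands the token back incremented by one.
  local body = function(token)
    while true do
      token = yield(token + 1)
    end
  end

  -- Create all threads.
  local threads = {}
  for i = 1, poolsize do
    threads[i] = create(body)
  end

  -- Send the token around the ring until it reaches n.
  local id, token, ok = 1, 0, nil
  repeat
    id = (id == poolsize) and 1 or (id + 1)
    ok, token = resume(threads[id], token)
    assert(ok, token)
  until token == n
  return id
end
```

Each resume increments the token by one, so after n resumes the last
active coroutine is n % poolsize + 1. That is the value the checker
compares against.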

Also, removed excess empty line at the end of the file:
===================================================================
diff --git a/perf/LuaJIT-benches/coroutine-ring.lua b/perf/LuaJIT-benches/coroutine-ring.lua
index 8efe7b2a..747a2ecc 100644
--- a/perf/LuaJIT-benches/coroutine-ring.lua
+++ b/perf/LuaJIT-benches/coroutine-ring.lua
@@ -56,4 +62,3 @@ bench:add({
 })

 bench:run_and_report()
-
===================================================================

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 08/41] perf: adjust euler14-bit in LuaJIT-benches
  2025-11-13 11:44   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:12     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:12 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Fixed your comments and added the description for the benchmark.

On 13.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/euler14-bit.lua | 52 ++++++++++++++++++++---------
> >   1 file changed, 36 insertions(+), 16 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
> > index 537f2bf3..7c521deb 100644
> > --- a/perf/LuaJIT-benches/euler14-bit.lua
> > +++ b/perf/LuaJIT-benches/euler14-bit.lua
> > @@ -1,22 +1,42 @@

Also, added the comment with the benchmark description as we discussed
offline.

===================================================================
diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
index 54311abe..c4ae3713 100644
--- a/perf/LuaJIT-benches/euler14-bit.lua
+++ b/perf/LuaJIT-benches/euler14-bit.lua
@@ -1,3 +1,8 @@
+-- The benchmark to check the performance of bitwise operations.
+-- It finds the longest Collatz sequence using bitwise arithmetic.
+-- For the details see:
+-- https://projecteuler.net/problem=14
+
 local bench = require("bench").new(arg)
 
 local bit = require("bit")
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local bit = require("bit")
> >   local bnot, bor, band = bit.bnot, bit.bor, bit.band
> >   local shl, shr = bit.lshift, bit.rshift
> >   

<snipped>

> > +bench:add({
> > +  name = "euler14_bit",
> > +  payload = function()
> > +    local cache, m, n = { 1 }, 1, 1
> > +    if drop_cache then cache = nil end
> > +    for i=2,N do
> s/2,/2, /
> > +      local j = i
> > +      for len=1,1000000000 do
> s/1,/1, /
> > +        j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
> please add whitespaces, here and below
> > +        if cache then
> > +          local x = cache[j]; if x then j = x+len; break end
> whitespaces
> > +        elseif j == 1 then
> > +          j = len+1; break
> s/+/ + /

Refactored this part of the code to make it clearer. See the iterative
patch below:

===================================================================
diff --git a/perf/LuaJIT-benches/euler14-bit.lua b/perf/LuaJIT-benches/euler14-bit.lua
index 7c521deb..54311abe 100644
--- a/perf/LuaJIT-benches/euler14-bit.lua
+++ b/perf/LuaJIT-benches/euler14-bit.lua
@@ -13,14 +13,22 @@ bench:add({
   payload = function()
     local cache, m, n = { 1 }, 1, 1
     if drop_cache then cache = nil end
-    for i=2,N do
+    for i = 2, N do
       local j = i
-      for len=1,1000000000 do
-        j = bor(band(shr(j,1), band(j,1)-1), band(shl(j,1)+j+1, bnot(band(j,1)-1)))
+      for len = 1, 1000000000 do
+        j = bor(
+          band(shr(j, 1), band(j, 1) - 1),
+          band(shl(j, 1) + j + 1, bnot(band(j, 1) - 1))
+        )
         if cache then
-          local x = cache[j]; if x then j = x+len; break end
+          local x = cache[j]
+          if x then
+            j = x + len
+            break
+          end
         elseif j == 1 then
-          j = len+1; break
+          j = len + 1
+          break
         end
       end
       if cache then cache[i] = j end
===================================================================
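
In case the expression still looks cryptic: it is a branchless Collatz
step. `band(j, 1) - 1` is -1 (all bits set) for even j and 0 for odd j,
so the mask selects either j/2 or 3*j + 1. A sketch comparing it with the
plain version. The function names are hypothetical, and the bitwise part
is defined only when the `bit` module is available (it is built into
LuaJIT but may be missing under PUC-Rio Lua):

```lua
-- The bit module is built into LuaJIT; guard the require so the
-- sketch still loads elsewhere.
local has_bit, bit = pcall(require, 'bit')

-- Reference step with an explicit branch.
local function collatz_plain(j)
  if j % 2 == 0 then
    return j / 2
  else
    return 3 * j + 1
  end
end

-- Branchless step, the same expression as in the benchmark.
local collatz_bit
if has_bit then
  local bnot, bor, band = bit.bnot, bit.bor, bit.band
  local shl, shr = bit.lshift, bit.rshift
  collatz_bit = function(j)
    -- mask is -1 (all ones) for even j and 0 for odd j.
    local mask = band(j, 1) - 1
    return bor(
      band(shr(j, 1), mask),
      band(shl(j, 1) + j + 1, bnot(mask))
    )
  end
end
```

The two agree as long as 3*j + 1 fits into the 32-bit range the bit
operations work on.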

> > +        end
> > +      end
> > +      if cache then cache[i] = j end
> > +      if j > m then m, n = j, i end
> > +    end
> > +    return {n = n, m = m}
> > +  end,
> > +  checker = function(res)

<snipped>

> > +bench:run_and_report()


-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 09/41] perf: adjust fannkuch in LuaJIT-benches
  2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:13     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:13 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and updated the branch.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >
> > I'm not sure that amount of permutations is the correct items count.
> > Have you any other suggestions?
> >
> >   perf/LuaJIT-benches/fannkuch.lua | 37 +++++++++++++++++++++++++++++---
> >   1 file changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
> > index 2a4cd426..c963c66f 100644
> > --- a/perf/LuaJIT-benches/fannkuch.lua
> > +++ b/perf/LuaJIT-benches/fannkuch.lua
> 
> I highly recommend adding a description to the benchmarks.
> 
> At least to the tests from "benchmarks game" suite.
> 
> You can use descriptions from the [1] and [2].
> 
> 1. https://benchmarksgame-team.pages.debian.net/benchmarksgame/
> 
> 2. 
> https://en.wikipedia.org/wiki/The_Computer_Language_Benchmarks_Game#Benchmark_programs

Added the short description, as you suggested:

===================================================================
diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
index 2328e058..f51e0eaf 100644
--- a/perf/LuaJIT-benches/fannkuch.lua
+++ b/perf/LuaJIT-benches/fannkuch.lua
@@ -1,3 +1,10 @@
+-- The benchmark that checks the performance of operations on
+-- small integers and vectors of integers and the performance of
+-- inner loops of the benchmark. The benchmark finds the maximum
+-- number of flips in the table needed for any permutation.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html
+
 local bench = require("bench").new(arg)
 
 local function fannkuch(n)
===================================================================
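
To make the description above concrete, here is a brute-force sketch of
the "maximum number of flips" definition. It is not the benchmark's
algorithm (which enumerates permutations far more efficiently); the
helper names `count_flips` and `max_flips` are hypothetical and only
meant for small `n >= 1`.

```lua
-- Count the flips until the first element becomes 1. One flip
-- reverses the first k elements, where k is the first element.
local function count_flips(perm)
  local p = {}
  for i = 1, #perm do p[i] = perm[i] end
  local flips = 0
  local k = p[1]
  while k ~= 1 do
    local i, j = 1, k
    while i < j do
      p[i], p[j] = p[j], p[i]
      i, j = i + 1, j - 1
    end
    flips = flips + 1
    k = p[1]
  end
  return flips
end

-- Try every permutation of 1..n (n >= 1) and keep the maximum.
local function max_flips(n)
  local p, best = {}, 0
  for i = 1, n do p[i] = i end
  local function permute(k)
    if k > n then
      local f = count_flips(p)
      if f > best then best = f end
      return
    end
    for i = k, n do
      p[k], p[i] = p[i], p[k]
      permute(k + 1)
      p[k], p[i] = p[i], p[k]
    end
  end
  permute(1)
  return best
end
```

For small `n` the result can be compared with the leading entries of
the precomputed `FANNKUCH` table quoted later in this message.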

> 
> 
> > @@ -1,3 +1,4 @@
> > +local bench = require("bench").new(arg)
> >   
> >   local function fannkuch(n)
> >     local p, q, s, odd, check, maxflips = {}, {}, {}, true, 0, 0
> > @@ -6,7 +7,7 @@ local function fannkuch(n)
> >       -- Print max. 30 permutations.
> >       if check < 30 then
> >         if not p[n] then return maxflips end	-- Catch n = 0, 1, 2.
> > -      io.write(unpack(p)); io.write("\n")
> > +      -- io.write(unpack(p)); io.write("\n")
> > isn't it better to remove it entirely?

OK, removed:

===================================================================
diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
index c963c66f..ceb13ca6 100644
--- a/perf/LuaJIT-benches/fannkuch.lua
+++ b/perf/LuaJIT-benches/fannkuch.lua
@@ -7,7 +7,6 @@ local function fannkuch(n)
     -- Print max. 30 permutations.
     if check < 30 then
       if not p[n] then return maxflips end      -- Catch n = 0, 1, 2.
-      -- io.write(unpack(p)); io.write("\n")
       check = check + 1
     end
     -- Copy and flip.
===================================================================

> >         check = check + 1
> >       end
> >       -- Copy and flip.
> > @@ -46,5 +47,35 @@ local function fannkuch(n)
> >     until false
> >   end
> >   
> > -local n = tonumber(arg and arg[1]) or 1
> > -io.write("Pfannkuchen(", n, ") = ", fannkuch(n), "\n")
> > +local n = tonumber(arg and arg[1]) or 11
> > +
> > +-- Precomputed numbers taken from:
> please add description of the paper as well: "Performing Lisp Analysis 
> of the FANNKUCH Benchmark"

Added:

===================================================================
diff --git a/perf/LuaJIT-benches/fannkuch.lua b/perf/LuaJIT-benches/fannkuch.lua
index ceb13ca6..2328e058 100644
--- a/perf/LuaJIT-benches/fannkuch.lua
+++ b/perf/LuaJIT-benches/fannkuch.lua
@@ -48,7 +48,8 @@ end
 
 local n = tonumber(arg and arg[1]) or 11
 
--- Precomputed numbers taken from:
+-- Precomputed numbers taken from "Performing Lisp Analysis of the
+-- FANNKUCH Benchmark":
 -- https://dl.acm.org/doi/pdf/10.1145/382109.382124
 local FANNKUCH = { 0, 1, 2, 4, 7, 10, 16, 22, 30, 38, 51, 65, 80 }
 
===================================================================

> > +--https://dl.acm.org/doi/pdf/10.1145/382109.382124
> > +local FANNKUCH = { 0, 1, 2, 4, 7, 10, 16, 22, 30, 38, 51, 65, 80 }
> > +

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 10/41] perf: adjust fasta in LuaJIT-benches
  2025-12-23 10:37   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:15     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:15 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and added the description for the benchmark.

On 23.12.25, Sergey Bronnikov wrote:
> Hello,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > Since the result output (with the different input parameter value)
> > produced by this benchmark is used in other benchmarks
> > (<k-nucleotide.lua> and <revcomp.lua>), the original script is used as a
> > library (inside the <libs/> subdirectory) with the updated default input
> > value and returns the number of items processed. The output for the
> > benchmark itself is suppressed and not checked since it is irrational to
> > store in the repository such huge files for testing.
> > ---
> >   perf/LuaJIT-benches/fasta.lua      | 120 +++++++----------------------
> >   perf/LuaJIT-benches/libs/fasta.lua |  98 +++++++++++++++++++++++
> >   2 files changed, 125 insertions(+), 93 deletions(-)
> >   create mode 100644 perf/LuaJIT-benches/libs/fasta.lua
> >
> > diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
> > index 7ce60804..d0dc005d 100644
> > --- a/perf/LuaJIT-benches/fasta.lua
> > +++ b/perf/LuaJIT-benches/fasta.lua

<snipped>
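
The "script as a library" approach from the commit message relies on
rerunning a module chunk on every benchmark iteration. A minimal sketch
of that reload pattern follows. The module name `demo_lib` and its
return value are hypothetical stand-ins for <libs/fasta.lua>, and the
output redirection is left out to keep the example portable.

```lua
-- Hypothetical module standing in for <libs/fasta.lua>: its chunk
-- does some work and returns the number of processed items.
local runs = 0
package.preload["demo_lib"] = function()
  runs = runs + 1
  return 1000 * runs
end

local function payload()
  -- The first require() runs the chunk and caches its result.
  local items = require("demo_lib")
  -- Drop the cached result so the next require() reruns the chunk.
  package.loaded["demo_lib"] = nil
  return items
end
```

Without resetting `package.loaded`, the second `require()` would return
the cached value and the payload would measure nothing.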

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/fasta.lua b/perf/LuaJIT-benches/fasta.lua
index d0dc005d..457623b2 100644
--- a/perf/LuaJIT-benches/fasta.lua
+++ b/perf/LuaJIT-benches/fasta.lua
@@ -1,3 +1,9 @@
+-- Benchmark to check the performance of working with strings and
+-- output to the file. It generates DNA sequences by copying or
+-- weighted random selection.
+-- For details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fasta.html
+
 local bench = require("bench").new(arg)
 
 local stdout = io.output()
diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
index e1592e77..58f59dd5 100644
--- a/perf/LuaJIT-benches/libs/fasta.lua
+++ b/perf/LuaJIT-benches/libs/fasta.lua
@@ -1,3 +1,10 @@
+-- Benchmark to check the performance of working with strings and
+-- output to the file. It generates DNA sequences by copying or
+-- weighted random selection.
+-- For details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fasta.html
+-- Also, this file is used as a script to generate inputs for
+-- other benchmarks like <k-nucleotide.lua> and <revcomp.lua>.
 
 local Last = 42
 local function random(max)
===================================================================
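
As an illustration of the "weighted random selection" mentioned in the
description, here is a simplified sketch. It reuses the linear
congruential generator from <libs/fasta.lua>, but it scans the
cumulative probabilities linearly instead of generating bisection code
via `loadstring` as the benchmark does. The `weights` table and the
`make_picker` helper are hypothetical.

```lua
-- The same generator as in the benchmark: returns a value in [0, max).
local last = 42
local function random(max)
  last = (last * 3877 + 29573) % 139968
  return (max * last) / 139968
end

-- Build a picker over rows of { symbol, probability }.
local function make_picker(tab)
  local cum, sum = {}, 0
  for i, row in ipairs(tab) do
    sum = sum + row[2]
    cum[i] = sum
  end
  return function()
    local r = random(1)
    for i = 1, #tab do
      if r < cum[i] then return tab[i][1] end
    end
    -- Guard against rounding in the cumulative sums.
    return tab[#tab][1]
  end
end

-- Hypothetical weights, for illustration only.
local weights = { { "a", 0.5 }, { "c", 0.3 }, { "g", 0.2 } }
local pick = make_picker(weights)
```

Each call to `pick()` advances the generator state, so the produced
sequence is deterministic for a fixed seed.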

> > +local bench = require("bench").new(arg)
> > +
> > +local stdout = io.output()
> > +
> > +local benchmark
> > +benchmark = {
> > +  name = "fasta",
> > +  -- XXX: The result file may take up to 278 Mb for the default
> > +  -- settings. To check the correctness of the script, run it as
> > +  -- is from the console.
> > +  skip_check = true,
> > +  setup = function()
> > +    io.output("/dev/null")
> > +  end,
> > +  payload = function()
> > +    -- Run the benchmark as is from the file.
> > +    local items = require("fasta")
> > +    -- Remove it from the cache to be sure the benchmark will run
> > +    -- at the next iteration.
> > +    package.loaded["fasta"] = nil
> > +    benchmark.items = items
> > +  end,
> > +  teardown = function()
> > +    io.output(stdout)
> > +  end,
> >   }
> >   
> > -local N = tonumber(arg and arg[1]) or 1000
> > -make_repeat_fasta('ONE', 'Homo sapiens alu', alu, N*2)
> > -make_random_fasta('TWO', 'IUB ambiguity codes', iub, N*3)
> > -make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, N*5)
> > +bench:add(benchmark)
> > +bench:run_and_report()
> > diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
> > new file mode 100644
> > index 00000000..9c72c244
> > --- /dev/null
> > +++ b/perf/LuaJIT-benches/libs/fasta.lua
> > @@ -0,0 +1,98 @@
> > +
> > +local Last = 42
> > +local function random(max)
> > +  local y = (Last * 3877 + 29573) % 139968
> > +  Last = y
> > +  return (max * y) / 139968
> > +end
> > +
> > +local function make_repeat_fasta(id, desc, s, n)
> > +  local write, sub = io.write, string.sub
> > +  write(">", id, " ", desc, "\n")
> > +  local p, sn, s2 = 1, #s, s..s
> > +  for i=60,n,60 do
> more whitespaces please
> > +    write(sub(s2, p, p + 59), "\n")
> > +    p = p + 60; if p > sn then p = p - sn end
> > +  end
> > +  local tail = n % 60
> > +  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
> more whitespaces please. Here and below.

Reformatted, as you suggested:

===================================================================
diff --git a/perf/LuaJIT-benches/libs/fasta.lua b/perf/LuaJIT-benches/libs/fasta.lua
index 9c72c244..e1592e77 100644
--- a/perf/LuaJIT-benches/libs/fasta.lua
+++ b/perf/LuaJIT-benches/libs/fasta.lua
@@ -10,12 +10,12 @@ local function make_repeat_fasta(id, desc, s, n)
   local write, sub = io.write, string.sub
   write(">", id, " ", desc, "\n")
   local p, sn, s2 = 1, #s, s..s
-  for i=60,n,60 do
+  for i = 60, n, 60 do
     write(sub(s2, p, p + 59), "\n")
     p = p + 60; if p > sn then p = p - sn end
   end
   local tail = n % 60
-  if tail > 0 then write(sub(s2, p, p + tail-1), "\n") end
+  if tail > 0 then write(sub(s2, p, p + tail - 1), "\n") end
 end
 
 local function make_random_fasta(id, desc, bs, n)
@@ -23,17 +23,17 @@ local function make_random_fasta(id, desc, bs, n)
   loadstring([=[
     local write, char, unpack, n, random = io.write, string.char, unpack, ...
     local buf, p = {}, 1
-    for i=60,n,60 do
-      for j=p,p+59 do ]=]..bs..[=[ end
-      buf[p+60] = 10; p = p + 61
+    for i = 60, n, 60 do
+      for j = p, p + 59 do ]=]..bs..[=[ end
+      buf[p + 60] = 10; p = p + 61
       if p >= 2048 then write(char(unpack(buf, 1, p-1))); p = 1 end
     end
     local tail = n % 60
     if tail > 0 then
-      for j=p,p+tail-1 do ]=]..bs..[=[ end
+      for j = p, p + tail - 1 do ]=]..bs..[=[ end
       p = p + tail; buf[p] = 10; p = p + 1
     end
-    write(char(unpack(buf, 1, p-1)))
+    write(char(unpack(buf, 1, p - 1)))
   ]=], desc)(n, random)
 end
 
@@ -41,13 +41,13 @@ local function bisect(c, p, lo, hi)
   local n = hi - lo
   if n == 0 then return "buf[j] = "..c[hi].."\n" end
   local mid = math.floor(n / 2)
-  return "if r < "..p[lo+mid].." then\n"..bisect(c, p, lo, lo+mid)..
-         "else\n"..bisect(c, p, lo+mid+1, hi).."end\n"
+  return "if r < "..p[lo + mid].." then\n"..bisect(c, p, lo, lo + mid)..
+         "else\n"..bisect(c, p, lo + mid + 1, hi).."end\n"
 end
 
 local function make_bisect(tab)
   local c, p, sum = {}, {}, 0
-  for i,row in ipairs(tab) do
+  for i, row in ipairs(tab) do
     c[i] = string.byte(row[1])
     sum = sum + row[2]
     p[i] = sum
===================================================================

> > +end
> > +

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 11/41] perf: adjust k-nucleotide in LuaJIT-benches
  2025-11-17  8:36   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:17     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:17 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and added the benchmark description.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The benchmark input is given by redirecting the corresponding
> > <FASTA_5000000> file generated by the `libs/fasta.lua 5e6`. The output
> > from the benchmark is redirected to /dev/null. All checks are done by
> > the comparison with the precomputed values for the aforementioned file.
> > ---
> >   perf/LuaJIT-benches/k-nucleotide.lua | 93 ++++++++++++++++++++++++----
> >   1 file changed, 82 insertions(+), 11 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
> > index 0bfb41be..ae51dae9 100644
> > --- a/perf/LuaJIT-benches/k-nucleotide.lua
> > +++ b/perf/LuaJIT-benches/k-nucleotide.lua
> > @@ -1,3 +1,4 @@

Also, added the comment with the benchmark description as we discussed
offline:

===================================================================
diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
index 2a6cbb67..e92429e8 100644
--- a/perf/LuaJIT-benches/k-nucleotide.lua
+++ b/perf/LuaJIT-benches/k-nucleotide.lua
@@ -1,3 +1,9 @@
+-- The benchmark that checks the performance of hash tables.
+-- The program reads the redirected FASTA format file from stdin,
+-- extracts DNA sequence THREE, and counts the specific sequences.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html
+
 local bench = require('bench').new(arg)
 
 local function kfrequency(seq, freq, k, frame)
===================================================================
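
A minimal sketch of the hash-table workload described above: counting
fragments of length `k` in a sequence and converting the counts to
percentages. It uses a plain sliding window rather than the frame-based
loop of the benchmark, and the helper names `kmer_counts` and
`percentages` are hypothetical.

```lua
-- Count every fragment of length k using a table as a hash map.
local function kmer_counts(seq, k)
  local freq = {}
  for i = 1, #seq - k + 1 do
    local frag = string.sub(seq, i, i + k - 1)
    freq[frag] = (freq[frag] or 0) + 1
  end
  return freq
end

-- Convert absolute counts to percentages of the total.
local function percentages(freq)
  local sum = 0
  for _, v in pairs(freq) do sum = sum + v end
  local res = {}
  for frag, v in pairs(freq) do
    res[frag] = v * 100 / sum
  end
  return res
end
```

Note that a fragment missing from the sequence has no entry in the
table, so callers that need a number should use `freq[frag] or 0`.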

> > +local bench = require('bench').new(arg)
> >   
> >   local function kfrequency(seq, freq, k, frame)
> >     local sub = string.sub
> > @@ -12,7 +13,8 @@ local function count(seq, frag)
> >     local k = #frag
> >     local freq = {}
> >     for frame=1,k do kfrequency(seq, freq, k, frame) end
> > -  io.write(freq[frag] or 0, "\t", frag, "\n")
> > +  return freq[frag]
> > +  -- io.write(freq[frag] or 0, "\t", frag, "\n")
> > remove this entirely?
> >   end
> >   
> >   local function frequency(seq, k)
> > @@ -24,10 +26,13 @@ local function frequency(seq, k)
> >       local fa, fb = freq[a], freq[b]
> >       return fa == fb and a > b or fa > fb
> >     end)
> > +  local res = {}
> >     for _,c in ipairs(sfreq) do
> > -    io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
> > +    -- io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
> remove?
> > +    res[c] = freq[c]*100/sum
> >     end
> > -  io.write("\n")
> > +  -- io.write("\n")
> > +  return res
> >   end
> >   
> >   local function readseq()
> > @@ -48,11 +53,77 @@ local function readseq()

Removed all output, as you suggested:

===================================================================
diff --git a/perf/LuaJIT-benches/k-nucleotide.lua b/perf/LuaJIT-benches/k-nucleotide.lua
index ae51dae9..2a6cbb67 100644
--- a/perf/LuaJIT-benches/k-nucleotide.lua
+++ b/perf/LuaJIT-benches/k-nucleotide.lua
@@ -14,7 +14,6 @@ local function count(seq, frag)
   local freq = {}
   for frame=1,k do kfrequency(seq, freq, k, frame) end
   return freq[frag]
-  -- io.write(freq[frag] or 0, "\t", frag, "\n")
 end
 
 local function frequency(seq, k)
@@ -28,10 +27,8 @@ local function frequency(seq, k)
   end)
   local res = {}
   for _,c in ipairs(sfreq) do
-    -- io.write(string.format("%s %0.3f\n", c, (freq[c]*100)/sum))
     res[c] = freq[c]*100/sum
   end
-  -- io.write("\n")
   return res
 end
 
===================================================================

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 12/41] perf: adjust life in LuaJIT-benches
  2025-11-17  8:35   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:18     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:18 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file.
> >
> > The output is redirected to /dev/null. The checker tests the result
> > after the exact amount of iterations for the fixed field (as it is
> > declared in the original benchmark).
> > ---
> >   perf/LuaJIT-benches/life.lua | 79 +++++++++++++++++++++++++++++++++++-
> >   1 file changed, 78 insertions(+), 1 deletion(-)
> >
> > diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
> > index 911d9fe1..d0e4dc98 100644
> > --- a/perf/LuaJIT-benches/life.lua
> > +++ b/perf/LuaJIT-benches/life.lua

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
index 5a316364..dbf26fac 100644
--- a/perf/LuaJIT-benches/life.lua
+++ b/perf/LuaJIT-benches/life.lua
@@ -1,5 +1,12 @@
 -- life.lua
--- original by Dave Bollinger <DBollinger@compuserve.com> posted to lua-l
+-- The benchmark to check the performance of array-like data
+-- structures with RW access. John Horton Conway's "Game of Life"
+-- cellular automaton.
+-- For the details see:
+-- https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life
+-- Original by Dave Bollinger <DBollinger@compuserve.com> posted
+-- to lua-l:
+-- http://lua-users.org/lists/lua-l/1999-12/msg00003.html
 -- modified to use ANSI terminal escape sequences
 -- modified to use for instead of while
 
===================================================================
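
For reference, one generation of the automaton can be sketched as
follows. This is a simplified illustration and not the code from
<life.lua>: the `step` helper is hypothetical, cells are stored as 0/1
in nested tables, and the field is assumed to wrap around at the edges.

```lua
-- Compute the next generation of a w x h field of 0/1 cells.
local function step(grid, w, h)
  local next_grid = {}
  for y = 1, h do
    next_grid[y] = {}
    for x = 1, w do
      -- Count the eight neighbours with wraparound.
      local n = 0
      for dy = -1, 1 do
        for dx = -1, 1 do
          if dx ~= 0 or dy ~= 0 then
            local yy = (y - 1 + dy) % h + 1
            local xx = (x - 1 + dx) % w + 1
            n = n + grid[yy][xx]
          end
        end
      end
      local alive = grid[y][x] == 1
      -- Birth on 3 neighbours, survival on 2 or 3.
      next_grid[y][x] = (n == 3 or (alive and n == 2)) and 1 or 0
    end
  end
  return next_grid
end
```

Every cell is read and a fresh row table is written per generation,
which is the array-like read/write access pattern the description
refers to.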

> > @@ -3,6 +3,8 @@
> >   -- modified to use ANSI terminal escape sequences
> >   -- modified to use for instead of while
> >   
> > +local bench = require('bench').new(arg)
> > +
> >   local write=io.write
> >   
> >   ALIVE="¥"	DEAD="þ"
> We usually use ASCII-only symbols. Should we replace these with an
> ASCII-only alternative?

OK, removed, since the modification is already mentioned in the test
header and it can't affect the performance in any way.

===================================================================
diff --git a/perf/LuaJIT-benches/life.lua b/perf/LuaJIT-benches/life.lua
index d0e4dc98..5a316364 100644
--- a/perf/LuaJIT-benches/life.lua
+++ b/perf/LuaJIT-benches/life.lua
@@ -7,7 +7,6 @@ local bench = require('bench').new(arg)
 
 local write=io.write
 
-ALIVE="¥"	DEAD="þ"
 ALIVE="O"	DEAD="-"
 
 function delay() -- NOTE: SYSTEM-DEPENDENT, adjust as necessary
===================================================================


> > @@ -106,6 +108,81 @@ function LIFE(w,h)
> >       if gen>2000 then break end
> >       --delay()		-- no delay
> dead code

What do you mean by dead code?
If you are suggesting deleting the `delay()` function, I prefer to keep
it in case we want to debug the behaviour. Also, it adds GC objects
related to the `delay()` function and its proto (not much, but still),
and the comment helps to understand where the delay function should be
called if necessary. I prefer to avoid modifying code when it is not
vital, and this part isn't touched by the patch, so I prefer to leave
it as is.

> >     end
> > +  return thisgen
> >   end
> >   
> > -LIFE(40,20)

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 13/41] perf: adjust mandelbrot-bit in LuaJIT-benches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:20     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:20 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comments and answered your questions below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey!
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The output is redirected to /dev/null. The check is skipped since it is
> > very inconvenient to check the binary output, especially since it may be
> > configured by the parameter.
> > ---
> >   perf/LuaJIT-benches/mandelbrot-bit.lua | 86 +++++++++++++++++---------
> >   1 file changed, 57 insertions(+), 29 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
> > index 91d96975..a6b5e1f8 100644
> > --- a/perf/LuaJIT-benches/mandelbrot-bit.lua
> > +++ b/perf/LuaJIT-benches/mandelbrot-bit.lua

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
index 88df480e..53c3ad4e 100644
--- a/perf/LuaJIT-benches/mandelbrot-bit.lua
+++ b/perf/LuaJIT-benches/mandelbrot-bit.lua
@@ -1,3 +1,10 @@
+-- The benchmark to check the performance of multiple inner loops
+-- with arithmetic operations. Bit variation. Calculates the
+-- Mandelbrot Set on a bitmap and dumps output in the portable
+-- bitmap format.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html
+
 local bit = require("bit")
 
 local bench = require("bench").new(arg)
===================================================================

> > @@ -1,33 +1,61 @@
> > -
> >   local bit = require("bit")

<snipped>

> > +local bench = require("bench").new(arg)
> > +
> > +local N = tonumber(arg and arg[1]) or 5000
> > +
> > +local function payload()
> > +  -- These functions must not be an upvalue but the stack slot.
> please add here details about performance impact

I suppose this is related to the open upvalue fetching overhead, but I'm
not sure. I suggest postponing this research as it is out of the scope
of this issue.
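
To show what "an upvalue" versus "the stack slot" means here, below is
a small sketch of the two access patterns. It makes no performance
claim, since the impact was not measured in this thread. The function
names `sum_upvalue` and `sum_local` are hypothetical.

```lua
-- Captured by the functions below as an upvalue.
local floor = math.floor

local function sum_upvalue(n)
  local s = 0
  for i = 1, n do
    -- `floor` is fetched through the closure's upvalue.
    s = s + floor(i / 2)
  end
  return s
end

local function sum_local(n)
  -- Copy the upvalue into a local, i.e. a stack slot of this frame,
  -- in the same way as `local N = N` in the benchmark payload.
  local floor = floor
  local s = 0
  for i = 1, n do
    s = s + floor(i / 2)
  end
  return s
end
```

Both functions compute the same value; only the way `floor` is
resolved inside the loop differs.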

> > +  local N = N
> > +  local bor, band = bit.bor, bit.band
> > +  local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
> > +  local write, char, unpack = io.write, string.char, unpack
> > +
> > +  local M, buf = 2/N, {}
> > +  write("P4\n", N, " ", N, "\n")
> > +  for y=0,N-1 do
> please add spaces here and below

Fixed the formatting as you suggested:

===================================================================
diff --git a/perf/LuaJIT-benches/mandelbrot-bit.lua b/perf/LuaJIT-benches/mandelbrot-bit.lua
index a6b5e1f8..88df480e 100644
--- a/perf/LuaJIT-benches/mandelbrot-bit.lua
+++ b/perf/LuaJIT-benches/mandelbrot-bit.lua
@@ -11,30 +11,30 @@ local function payload()
   local shl, shr, rol = bit.lshift, bit.rshift, bit.rol
   local write, char, unpack = io.write, string.char, unpack
 
-  local M, buf = 2/N, {}
+  local M, buf = 2 / N, {}
   write("P4\n", N, " ", N, "\n")
-  for y=0,N-1 do
-    local Ci, b, p = y*M-1, -16777216, 0
-    local Ciq = Ci*Ci
-    for x=0,N-1,2 do
-      local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
-      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
-      local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
+  for y = 0, N - 1 do
+    local Ci, b, p = y * M - 1, -16777216, 0
+    local Ciq = Ci * Ci
+    for x = 0, N - 1, 2 do
+      local Cr, Cr2 = x * M - 1.5, (x + 1) * M - 1.5
+      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr * Cr, Ciq
+      local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2 * Cr2, Ciq
       b = rol(b, 2)
-      for i=1,49 do
-        Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
-        Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
-        Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
-        Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
-        if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
-        if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
+      for i = 1, 49 do
+        Zi = Zr * Zi * 2 + Ci; Zi2 = Zr2 * Zi2 * 2 + Ci
+        Zr = Zrq - Ziq + Cr; Zr2 = Zrq2 - Ziq2 + Cr2
+        Ziq = Zi * Zi; Ziq2 = Zi2 * Zi2
+        Zrq = Zr * Zr; Zrq2 = Zr2 * Zr2
+        if band(b, 2) ~= 0 and Zrq + Ziq > 4.0 then b = band(b, -3) end
+        if band(b, 1) ~= 0 and Zrq2 + Ziq2 > 4.0 then b = band(b, -2) end
         if band(b, 3) == 0 then break end
       end
       if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
     end
     if b ~= -16777216 then
       if band(N, 1) ~= 0 then b = shr(b, 1) end
-      p = p + 1; buf[p] = shl(b, 8-band(N, 7))
+      p = p + 1; buf[p] = shl(b, 8 - band(N, 7))
     end
     write(char(unpack(buf, 1, p)))
   end
===================================================================

> > +    local Ci, b, p = y*M-1, -16777216, 0
> > +    local Ciq = Ci*Ci
> > +    for x=0,N-1,2 do
> > +      local Cr, Cr2 = x*M-1.5, (x+1)*M-1.5
> > +      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ciq
> > +      local Zr2, Zi2, Zrq2, Ziq2 = Cr2, Ci, Cr2*Cr2, Ciq
> > +      b = rol(b, 2)
> > +      for i=1,49 do
> > +        Zi = Zr*Zi*2 + Ci; Zi2 = Zr2*Zi2*2 + Ci
> > +        Zr = Zrq-Ziq + Cr; Zr2 = Zrq2-Ziq2 + Cr2
> > +        Ziq = Zi*Zi; Ziq2 = Zi2*Zi2
> > +        Zrq = Zr*Zr; Zrq2 = Zr2*Zr2
> > +        if band(b, 2) ~= 0 and Zrq+Ziq > 4.0 then b = band(b, -3) end
> > +        if band(b, 1) ~= 0 and Zrq2+Ziq2 > 4.0 then b = band(b, -2) end
> > +        if band(b, 3) == 0 then break end
> > +      end
> > +      if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
> >       end
> > -    if b >= 0 then p = p + 1; buf[p] = b; b = -16777216; end
> > -  end
> > -  if b ~= -16777216 then
> > -    if band(N, 1) ~= 0 then b = shr(b, 1) end
> > -    p = p + 1; buf[p] = shl(b, 8-band(N, 7))
> > +    if b ~= -16777216 then
> > +      if band(N, 1) ~= 0 then b = shr(b, 1) end
> > +      p = p + 1; buf[p] = shl(b, 8-band(N, 7))
> > +    end
> > +    write(char(unpack(buf, 1, p)))
> >     end
> > -  write(char(unpack(buf, 1, p)))
> >   end

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 14/41] perf: adjust mandelbrot in LuaJIT-benches
  2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:20     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:20 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Fixed your comments and added the description for the benchmark.

On 23.12.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The output is redirected to /dev/null. The check is skipped since it is
> > very inconvenient to check the binary output, especially since it may be
> > configured by the parameter.
> > ---
> >   perf/LuaJIT-benches/mandelbrot.lua | 64 +++++++++++++++++++++---------
> >   1 file changed, 45 insertions(+), 19 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
> > index 0ef595a2..51e0dd4f 100644
> > --- a/perf/LuaJIT-benches/mandelbrot.lua
> > +++ b/perf/LuaJIT-benches/mandelbrot.lua
> > @@ -1,23 +1,49 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
index 8b7310b8..999ff200 100644
--- a/perf/LuaJIT-benches/mandelbrot.lua
+++ b/perf/LuaJIT-benches/mandelbrot.lua
@@ -1,3 +1,9 @@
+-- The benchmark to check the performance of multiple inner loops
+-- with arithmetic operations. Calculates the Mandelbrot Set on a
+-- bitmap and dumps output in the portable bitmap format.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html
+
 local bench = require("bench").new(arg)
 
 local N = tonumber(arg and arg[1]) or 5000
===================================================================
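
The inner loop of the benchmark is the usual escape-time test. A
minimal sketch of it for a single point follows, using the same
recurrence and the 4.0 threshold as the payload. The helper names
`pixel_to_c` and `escapes` are hypothetical, and the bitmap output is
omitted.

```lua
-- Map pixel (x, y) of an n x n image to the point C = Cr + i*Ci,
-- with the same scaling as in the benchmark (M = 2 / n).
local function pixel_to_c(x, y, n)
  local m = 2 / n
  return x * m - 1.5, y * m - 1
end

-- Return true if the orbit of C leaves the radius-2 disk within
-- max_iter iterations of Z = Z^2 + C, starting from Z = C.
local function escapes(cr, ci, max_iter)
  local zr, zi = cr, ci
  local zrq, ziq = zr * zr, zi * zi
  for _ = 1, max_iter do
    zi = zr * zi * 2 + ci
    zr = zrq - ziq + cr
    ziq = zi * zi
    zrq = zr * zr
    if zrq + ziq > 4.0 then return true end
  end
  return false
end
```

Keeping the squares `zrq` and `ziq` between iterations avoids
recomputing them for both the update and the threshold check.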

> > +local bench = require("bench").new(arg)
> >   
> > -local write, char, unpack = io.write, string.char, unpack
> > -local N = tonumber(arg and arg[1]) or 100
> > -local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
> > -write("P4\n", N, " ", N, "\n")
> > -for y=0,N-1 do
> > -  local Ci, b, p = y*M-1, 1, 0
> > -  for x=0,N-1 do
> > -    local Cr = x*M-1.5
> > -    local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
> > -    b = b + b
> > -    for i=1,49 do
> > -      Zi = Zr*Zi*2 + Ci
> > -      Zr = Zrq-Ziq + Cr
> > -      Ziq = Zi*Zi
> > -      Zrq = Zr*Zr
> > -      if Zrq+Ziq > 4.0 then b = b + 1; break; end
> > +local N = tonumber(arg and arg[1]) or 5000
> > +
> > +local function payload()
> > +  -- These functions must not be an upvalue but the stack slot.
> > +  local N = N
> > +  local write, char, unpack = io.write, string.char, unpack
> > +  local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
> please add more whitespaces. Here and below.

Rewritten as you suggested:

===================================================================
diff --git a/perf/LuaJIT-benches/mandelbrot.lua b/perf/LuaJIT-benches/mandelbrot.lua
index 51e0dd4f..8b7310b8 100644
--- a/perf/LuaJIT-benches/mandelbrot.lua
+++ b/perf/LuaJIT-benches/mandelbrot.lua
@@ -6,24 +6,24 @@ local function payload()
   -- These functions must not be an upvalue but the stack slot.
   local N = N
   local write, char, unpack = io.write, string.char, unpack
-  local M, ba, bb, buf = 2/N, 2^(N%8+1)-1, 2^(8-N%8), {}
+  local M, ba, bb, buf = 2 / N, 2 ^ (N % 8 + 1) - 1, 2 ^ (8 - N % 8), {}
   write("P4\n", N, " ", N, "\n")
-  for y=0,N-1 do
-    local Ci, b, p = y*M-1, 1, 0
-    for x=0,N-1 do
-      local Cr = x*M-1.5
-      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr*Cr, Ci*Ci
+  for y = 0, N - 1 do
+    local Ci, b, p = y * M - 1, 1, 0
+    for x = 0, N - 1 do
+      local Cr = x * M - 1.5
+      local Zr, Zi, Zrq, Ziq = Cr, Ci, Cr * Cr, Ci * Ci
       b = b + b
-      for i=1,49 do
-        Zi = Zr*Zi*2 + Ci
-        Zr = Zrq-Ziq + Cr
-        Ziq = Zi*Zi
-        Zrq = Zr*Zr
-        if Zrq+Ziq > 4.0 then b = b + 1; break; end
+      for i = 1, 49 do
+        Zi = Zr * Zi * 2 + Ci
+        Zr = Zrq - Ziq + Cr
+        Ziq = Zi * Zi
+        Zrq = Zr * Zr
+        if Zrq + Ziq > 4.0 then b = b + 1; break; end
       end
       if b >= 256 then p = p + 1; buf[p] = 511 - b; b = 1; end
     end
-    if b ~= 1 then p = p + 1; buf[p] = (ba-b)*bb; end
+    if b ~= 1 then p = p + 1; buf[p] = (ba - b) * bb; end
     write(char(unpack(buf, 1, p)))
   end
 end
===================================================================
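
The byte packing in the hunk above is compact, so here is a sketch of
it in isolation. `b` starts at 1 as a sentinel bit; after eight
doublings it reaches 256 plus the "escaped" mask, and `511 - b` inverts
that mask so that points inside the set become 1 bits. The helper name
`pack_bits` and its input format (an array of booleans for one row) are
assumptions made for illustration.

```lua
-- Pack one row of "escaped" flags into bytes, most significant bit
-- first; a set bit marks a point that did NOT escape.
local function pack_bits(escaped)
  local n = #escaped
  local r = n % 8
  -- Same constants as `ba` and `bb` in the benchmark.
  local ba, bb = 2 ^ (r + 1) - 1, 2 ^ (8 - r)
  local bytes, b = {}, 1
  for i = 1, n do
    b = b + b
    if escaped[i] then b = b + 1 end
    if b >= 256 then
      bytes[#bytes + 1] = 511 - b
      b = 1
    end
  end
  -- Flush a partial byte: invert the r collected bits and shift
  -- them to the high end of the byte.
  if b ~= 1 then bytes[#bytes + 1] = (ba - b) * bb end
  return bytes
end
```

For example, a row of three pixels where only the middle one escaped
yields the single byte 160 (binary 10100000).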

> > +  write("P4\n", N, " ", N, "\n")

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 15/41] perf: adjust md5 in LuaJIT-benches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:22     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:22 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my comments below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See  my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/md5.lua | 27 ++++++++++++++++++++-------
> >   1 file changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
> > index fdf6b4a7..5ec67527 100644
> > --- a/perf/LuaJIT-benches/md5.lua
> > +++ b/perf/LuaJIT-benches/md5.lua

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
index 2a96de95..556a0cd5 100644
--- a/perf/LuaJIT-benches/md5.lua
+++ b/perf/LuaJIT-benches/md5.lua
@@ -1,3 +1,8 @@
+-- Benchmark to check the performance of the bit operations.
+-- Calculates the MD5 hash sum from the fixed string.
+-- For more details about the MD5 algorithm see:
+-- https://en.wikipedia.org/wiki/MD5
+
 local bit = require("bit")
 local bench = require("bench").new(arg)
 
===================================================================

> > @@ -1,5 +1,6 @@
> > -
> >   local bit = require("bit")
> > +local bench = require("bench").new(arg)
> > +
> >   local tobit, tohex, bnot = bit.tobit or bit.cast, bit.tohex, bit.bnot
> >   local bor, band, bxor = bit.bor, bit.band, bit.bxor
> >   local lshift, rshift, rol, bswap = bit.lshift, bit.rshift, bit.rol, bit.bswap
> > @@ -147,7 +148,7 @@ assert(md5('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789') ==
> >   assert(md5('12345678901234567890123456789012345678901234567890123456789012345678901234567890') ==
> >          '57edf4a22be3c955ac49da2e2107b67a')
> >   
> > -local N = tonumber(arg and arg[1]) or 10000
> > +local N = tonumber(arg and arg[1]) or 20000
> this deserves a comment, why 20000 instead 10000?

It is the default for the x86 architecture. I've taken the values from
PARAM_x86.txt, since x86 is the most important architecture for
Tarantool; see the commit message.
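
For illustration, a minimal sketch of how the default interacts with the
command-line argument. The helper name `pick_n` is hypothetical and not
part of the patch; it only wraps the `tonumber(arg and arg[1]) or ...`
idiom from the diff:

```lua
-- Hypothetical helper (not in the patch): take the benchmark
-- parameter from the command line, else fall back to the default.
local function pick_n(argv, default)
  return tonumber(argv and argv[1]) or default
end

-- The md5 default after the patch.
local DEFAULT_N = 20000

local from_default = pick_n({}, DEFAULT_N)      -- No argument given.
local from_cli = pick_n({"100"}, DEFAULT_N)     -- Explicit override.
local from_garbage = pick_n({"abc"}, DEFAULT_N) -- Not a number.
```

A non-numeric argument silently falls back to the default, since
`tonumber()` returns nil for it.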

> >   
> >     -- Credits: William Shakespeare, Romeo and Juliet
> >   local txt = [[Rebellious subjects, enemies to peace,
> > @@ -176,8 +177,20 @@ Once more, on pain of death, all men depart.]]
> >     txt = txt..txt..txt..txt
> >     txt = txt..txt..txt..txt
> >   
> > -for i=1,N do
> > -  res = md5(txt)
> > -end
> > -assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
> > -
> > +bench:add({
> > +  name = 'md5',
> > +  payload = function()
> > +    local res
> > +    for i=1,N do
> s/1,/1, /

Fixed:

===================================================================
diff --git a/perf/LuaJIT-benches/md5.lua b/perf/LuaJIT-benches/md5.lua
index 5ec67527..2a96de95 100644
--- a/perf/LuaJIT-benches/md5.lua
+++ b/perf/LuaJIT-benches/md5.lua
@@ -181,7 +181,7 @@ bench:add({
   name = 'md5',
   payload = function()
     local res
-    for i=1,N do
+    for _ = 1, N do
       res = md5(txt)
     end
     return res
===================================================================

> > +      res = md5(txt)
> > +    end
> > +    return res
> > +  end,
> > +  checker = function(res)
> > +    assert(res == 'a831e91e0f70eddcb70dc61c6f82f6cd')
> > +    return true
> > +  end,
> > +  items = N,
> > +})
> > +
> > +bench:run_and_report()

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 16/41] perf: adjust meteor in LuaJIT-benches
  2025-12-23 10:38   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:23     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:23 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Please consider my answers below.

On 23.12.25, Sergey Bronnikov wrote:
> Hello,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The arguments to the script still can be
> > provided in the command line run. However, the values greater than the
> > maximum possible solutions found do not affect the time of execution for
> > this benchmark. Hence, the number of items to proceed is considered
> > constant as the maximum possible number of solutions.
> > ---
> >   perf/LuaJIT-benches/meteor.lua | 46 ++++++++++++++++++++++++++--------
> >   1 file changed, 36 insertions(+), 10 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
> > index 80588ab5..f3962820 100644
> > --- a/perf/LuaJIT-benches/meteor.lua
> > +++ b/perf/LuaJIT-benches/meteor.lua
> > @@ -1,3 +1,4 @@

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
index 7acb86af..8cda0190 100644
--- a/perf/LuaJIT-benches/meteor.lua
+++ b/perf/LuaJIT-benches/meteor.lua
@@ -1,3 +1,8 @@
+-- Benchmark to check various operations via the Meteor puzzle
+-- solver.
+-- For the details see:
+-- https://pybenchmarks.org/u64q/performance.php?test=meteor
+
 local bench = require("bench").new(arg)
 
 -- Generate a decision tree based solver for the meteor puzzle.
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   -- Generate a decision tree based solver for the meteor puzzle.
> >   local function generatesolver(countinit)
> > @@ -118,6 +119,10 @@ local function printresult()
> >     printboard(smax)
> >   end
> >   
> > +local function getresult()
> > +  return countinit-count, smin, smax
> > +end
> > +
> >   -- Generate piece lookup array from the order of use.
> >   local function genp()
> >     local p = pcs
> > @@ -141,7 +146,7 @@ local function f91(k)
> >       local s = p[b0] ]]
> >     for p=2,99 do if ok[p] then s = s.."..p[b"..p.."]" end end
> please add more whitespaces. Here and below.

This part wasn't touched by the patch. Let's leave this chunk as is, in
its original state. If we want to refactor it, this should be done
carefully once we have stable performance testing, so we can verify that
our changes don't affect the measurements. Stylistic fixes should be
made only in code that a patch already touches. Otherwise, these
changes make no sense to me. Most probably, we will never refactor
this code, but we will add new tests to our own perf suite.

> >     s = s..[[
> > -    -- Remember min/max boards, dito for the symmetric board.
> > +    -- Remember min/max boards, ditto for the symmetric board.
> >       if not smin then smin = s; smax = s
> >       elseif s < smin then smin = s elseif s > smax then smax = s end
> >       s = reverse(s)
> > @@ -206,15 +211,36 @@ local f93 = f91
> >     end
> >   
> >     -- Compile and return solver function and result getter.
> > -  return loadstring(s.."return f0, printresult\n", "solver")(countinit)
> > +  return loadstring(s.."return f0, printresult, getresult\n", "solver")(countinit)
> >   end
> >   
> > --- Generate the solver function hierarchy.
> > -local solver, printresult = generatesolver(tonumber(arg and arg[1]) or 10000)
> > -
> > --- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
> > -if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
> > +local N = tonumber(arg and arg[1]) or 10000
> > +
> > +bench:add({
> > +  name = "meteror",
> typo: s/meteror/meteor/

Fixed, thanks!

===================================================================
diff --git a/perf/LuaJIT-benches/meteor.lua b/perf/LuaJIT-benches/meteor.lua
index f3962820..7acb86af 100644
--- a/perf/LuaJIT-benches/meteor.lua
+++ b/perf/LuaJIT-benches/meteor.lua
@@ -217,7 +217,7 @@ end
 local N = tonumber(arg and arg[1]) or 10000
 
 bench:add({
-  name = "meteror",
+  name = "meteor",
   setup = function()
     -- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
     if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
===================================================================

> > +  setup = function()
> > +    -- The optimizer for LuaJIT 1.1.x is not helpful here, so turn it off.
> > +    if jit and jit.opt and jit.version_num < 10200 then jit.opt.start(0) end
> > +  end,
> > +  payload = function()
> > +    -- Generate the solver function hierarchy.
> > +    local solver, printresult, getresult = generatesolver(N)
> > +
> > +    -- Run the solver protected to get partial results (max count or ctrl-c).
> > +    pcall(solver, 0)
> > +
> > +    local n, smin, smax = getresult()
> > +    return {n = n, smin = smin, smax = smax}
> > +  end,
> > +  checker = function(res)
> > +    if N >= 2097 then
> > +      assert(res.n == 2098, "Incorrect solutions number")
> > +      assert(res.smin == "00001222012661126155865558633348893448934747977799")
> > +      assert(res.smax == "99998966856688568255777257472014220144031400311333")
> > +    end
> > +    return true
> > +  end,
> > +  items = 2098,
> > +})
> >   
> > --- Run the solver protected to get partial results (max count or ctrl-c).
> > -pcall(solver, 0)
> > -printresult()
> > +bench:run_and_report()

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 17/41] perf: adjust nbody in LuaJIT-benches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:24     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:24 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
See my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/nbody.lua | 127 ++++++++++++++++++++--------------
> >   1 file changed, 74 insertions(+), 53 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
> > index e0ff8f77..f01c20a3 100644
> > --- a/perf/LuaJIT-benches/nbody.lua
> > +++ b/perf/LuaJIT-benches/nbody.lua
> > @@ -1,56 +1,12 @@

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
index 335a43a5..4370b4d7 100644
--- a/perf/LuaJIT-benches/nbody.lua
+++ b/perf/LuaJIT-benches/nbody.lua
@@ -1,3 +1,9 @@
+-- Benchmark to check the performance of FP arithmetics.
+-- It models the orbits of Jovian planets, using the simple
+-- symplectic-integrator.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/nbody.html
+
 local bench = require("bench").new(arg)
 
 local sqrt = math.sqrt
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local sqrt = math.sqrt
> >   
> >   local PI = 3.141592653589793
> >   local SOLAR_MASS = 4 * PI * PI
> >   local DAYS_PER_YEAR = 365.24
> > -local bodies = {
> > -  { -- Sun
> > -    x = 0,
> > -    y = 0,
> > -    z = 0,
> > -    vx = 0,
> > -    vy = 0,
> > -    vz = 0,
> > -    mass = SOLAR_MASS
> > -  },
> > -  { -- Jupiter
> > -    x = 4.84143144246472090e+00,
> > -    y = -1.16032004402742839e+00,
> > -    z = -1.03622044471123109e-01,
> > -    vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
> > -    vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
> > -    vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
> > -    mass = 9.54791938424326609e-04 * SOLAR_MASS
> > -  },
> > -  { -- Saturn
> > -    x = 8.34336671824457987e+00,
> > -    y = 4.12479856412430479e+00,
> > -    z = -4.03523417114321381e-01,
> > -    vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
> > -    vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
> > -    vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
> > -    mass = 2.85885980666130812e-04 * SOLAR_MASS
> > -  },
> > -  { -- Uranus
> > -    x = 1.28943695621391310e+01,
> > -    y = -1.51111514016986312e+01,
> > -    z = -2.23307578892655734e-01,
> > -    vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
> > -    vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
> > -    vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
> > -    mass = 4.36624404335156298e-05 * SOLAR_MASS
> > -  },
> > -  { -- Neptune
> > -    x = 1.53796971148509165e+01,
> > -    y = -2.59193146099879641e+01,
> > -    z = 1.79258772950371181e-01,
> > -    vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
> > -    vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
> > -    vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
> > -    mass = 5.15138902046611451e-05 * SOLAR_MASS
> > -  }
> > -}
> > +local bodies
> > +local nbody
> >   
> >   local function advance(bodies, nbody, dt)
> >     for i=1,nbody do
> > @@ -110,10 +66,75 @@ local function offsetMomentum(b, nbody)
> >     b[1].vz = -pz / SOLAR_MASS
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 1000
> > -local nbody = #bodies
> > +local DEFAULT_N = 5e6
> > +local N = tonumber(arg and arg[1]) or DEFAULT_N
> >   
> > -offsetMomentum(bodies, nbody)
> > -io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
> > -for i=1,N do advance(bodies, nbody, 0.01) end
> > -io.write( string.format("%0.9f",energy(bodies, nbody)), "\n")
> > +bench:add({
> > +  name = "nbody",
> > +  payload = function()
> > +    bodies = {
> > +      { -- Sun
> > +        x = 0,
> > +        y = 0,
> > +        z = 0,
> > +        vx = 0,
> > +        vy = 0,
> > +        vz = 0,
> > +        mass = SOLAR_MASS
> > +      },
> > +      { -- Jupiter
> > +        x = 4.84143144246472090e+00,
> > +        y = -1.16032004402742839e+00,
> > +        z = -1.03622044471123109e-01,
> > +        vx = 1.66007664274403694e-03 * DAYS_PER_YEAR,
> > +        vy = 7.69901118419740425e-03 * DAYS_PER_YEAR,
> > +        vz = -6.90460016972063023e-05 * DAYS_PER_YEAR,
> > +        mass = 9.54791938424326609e-04 * SOLAR_MASS
> > +      },
> > +      { -- Saturn
> > +        x = 8.34336671824457987e+00,
> > +        y = 4.12479856412430479e+00,
> > +        z = -4.03523417114321381e-01,
> > +        vx = -2.76742510726862411e-03 * DAYS_PER_YEAR,
> > +        vy = 4.99852801234917238e-03 * DAYS_PER_YEAR,
> > +        vz = 2.30417297573763929e-05 * DAYS_PER_YEAR,
> > +        mass = 2.85885980666130812e-04 * SOLAR_MASS
> > +      },
> > +      { -- Uranus
> > +        x = 1.28943695621391310e+01,
> > +        y = -1.51111514016986312e+01,
> > +        z = -2.23307578892655734e-01,
> > +        vx = 2.96460137564761618e-03 * DAYS_PER_YEAR,
> > +        vy = 2.37847173959480950e-03 * DAYS_PER_YEAR,
> > +        vz = -2.96589568540237556e-05 * DAYS_PER_YEAR,
> > +        mass = 4.36624404335156298e-05 * SOLAR_MASS
> > +      },
> > +      { -- Neptune
> > +        x = 1.53796971148509165e+01,
> > +        y = -2.59193146099879641e+01,
> > +        z = 1.79258772950371181e-01,
> > +        vx = 2.68067772490389322e-03 * DAYS_PER_YEAR,
> > +        vy = 1.62824170038242295e-03 * DAYS_PER_YEAR,
> > +        vz = -9.51592254519715870e-05 * DAYS_PER_YEAR,
> > +        mass = 5.15138902046611451e-05 * SOLAR_MASS
> > +      }
> > +    }
> > +    nbody = #bodies
> > +
> > +    offsetMomentum(bodies, nbody)
> Two `io.write()` were lost. It is intentional?

Yes, it is checked via the corresponding assertions.
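
To illustrate the idea, here is a minimal sketch of a result being
verified by a checker instead of being printed. The `run()` function is
a hypothetical stand-in that models only the payload/checker contract;
it is not the real bench module, and the "sum" benchmark is made up:

```lua
-- Hypothetical stand-in for the bench module: it models only the
-- payload/checker contract, without any timing or reporting.
local function run(benchmark)
  local res = benchmark.payload()
  assert(benchmark.checker(res), benchmark.name .. ": check failed")
  return res
end

local result = run({
  name = "sum",
  payload = function()
    local acc = 0
    for i = 1, 100 do acc = acc + i end
    return acc
  end,
  -- Takes the place of the former io.write() of the result.
  checker = function(res)
    assert(res == 5050, "Incorrect sum")
    return true
  end,
})
```

The result is still consumed, so the correctness check is kept, but
nothing is written to the output during the measured run.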

> > +
> > +    assert(energy(bodies, nbody) == -0.16907516382852447179,
> > +             "Correct start energy")
> > +    for i=1,N do advance(bodies, nbody, 0.01) end
> s/1,/1, /

Fixed:
===================================================================
diff --git a/perf/LuaJIT-benches/nbody.lua b/perf/LuaJIT-benches/nbody.lua
index f01c20a3..335a43a5 100644
--- a/perf/LuaJIT-benches/nbody.lua
+++ b/perf/LuaJIT-benches/nbody.lua
@@ -125,7 +125,7 @@ bench:add({
 
     assert(energy(bodies, nbody) == -0.16907516382852447179,
              "Correct start energy")
-    for i=1,N do advance(bodies, nbody, 0.01) end
+    for _ = 1, N do advance(bodies, nbody, 0.01) end
   end,
   checker = function()
     if N == DEFAULT_N then
===================================================================

> > +  end,
> > +  checker = function()
> > +    if N == DEFAULT_N then
> > +      assert(energy(bodies, nbody) == -0.16908313397890917251,
> > +             "Correct result energy")
> > +    end
> > +    return true
> > +  end,
> > +  items = N,
> > +})
> > +
> > +bench:run_and_report()

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 18/41] perf: adjust nsieve-bit-fp in LuaJIT-benches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:25     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:25 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
See my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/nsieve-bit-fp.lua | 35 +++++++++++++++++++++++----
> >   1 file changed, 30 insertions(+), 5 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
> > index 3971ec1f..d0ab23d2 100644
> > --- a/perf/LuaJIT-benches/nsieve-bit-fp.lua
> > +++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
> > @@ -1,3 +1,4 @@

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
index 9d958950..8ad85dc9 100644
--- a/perf/LuaJIT-benches/nsieve-bit-fp.lua
+++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
@@ -1,3 +1,10 @@
+-- Benchmark to check the performance of FP arithmetics and
+-- access to the array structure. This benchmark finds all prime
+-- numbers in a given segment. This is the FP benchmark that
+-- models the bit variation behaviour.
+-- For the details see:
+-- https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
+
 local bench = require("bench").new(arg)

 local floor, ceil = math.floor, math.ceil
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local floor, ceil = math.floor, math.ceil
> >   
> > @@ -27,11 +28,35 @@ local function nsieve(p, m)
> >     return count
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 1
> > +local DEFAULT_N = 12
> Why 12 instead of 1?

It is the default for the x86 architecture. I've taken the values from
PARAM_x86.txt, since x86 is the most important architecture for
Tarantool; see the commit message.

> > +local N = tonumber(arg and arg[1]) or DEFAULT_N
> >   if N < 2 then N = 2 end
> >   local primes = {}
> >   
> > -for i=0,2 do
> > -  local m = (2^(N-i))*10000
> > -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
> > -end
> > +local benchmark
> > +benchmark = {
> > +  name = "nsieve_bit_fp",
> > +  payload = function()
> > +    local res = {}
> > +    local items = 0
> > +    for i=0,2 do
> add more whitespaces
> > +      local m = (2^(N-i))*10000
> add more whitespaces

Fixed:
===================================================================
diff --git a/perf/LuaJIT-benches/nsieve-bit-fp.lua b/perf/LuaJIT-benches/nsieve-bit-fp.lua
index d0ab23d2..9d958950 100644
--- a/perf/LuaJIT-benches/nsieve-bit-fp.lua
+++ b/perf/LuaJIT-benches/nsieve-bit-fp.lua
@@ -39,8 +39,8 @@ benchmark = {
   payload = function()
     local res = {}
     local items = 0
-    for i=0,2 do
-      local m = (2^(N-i))*10000
+    for i = 0, 2 do
+      local m = (2 ^ (N - i)) * 10000
       items = items + m
       res[i] = nsieve(primes, m)
     end
===================================================================

> > +      items = items + m

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 19/41] perf: adjust nsieve-bit in LuaJIT-benches
  2025-11-17 13:26   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:25     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:25 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
See my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/nsieve-bit.lua | 35 +++++++++++++++++++++++++-----
> >   1 file changed, 30 insertions(+), 5 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
> > index 820a3726..4858e9e2 100644
> > --- a/perf/LuaJIT-benches/nsieve-bit.lua
> > +++ b/perf/LuaJIT-benches/nsieve-bit.lua
> > @@ -1,3 +1,4 @@

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
index e2c6f52c..dd85dad5 100644
--- a/perf/LuaJIT-benches/nsieve-bit.lua
+++ b/perf/LuaJIT-benches/nsieve-bit.lua
@@ -1,3 +1,9 @@
+-- Benchmark to check the performance of bitwise arithmetics and
+-- access to the array structure. This benchmark finds all prime
+-- numbers in a given segment. This is the bit variation.
+-- For the details see:
+-- https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
+
 local bench = require("bench").new(arg)
 
 local bit = require("bit")
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local bit = require("bit")
> >   local band, bxor, rshift, rol = bit.band, bit.bxor, bit.rshift, bit.rol
> > @@ -17,11 +18,35 @@ local function nsieve(p, m)
> >     return count
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 1
> > +local DEFAULT_N = 12
> Why 12?

It is the default for the x86 architecture. I've taken the values from
PARAM_x86.txt, since x86 is the most important architecture for
Tarantool; see the commit message.

> > +local N = tonumber(arg and arg[1]) or DEFAULT_N
> >   if N < 2 then N = 2 end
> >   local primes = {}
> >   
> > -for i=0,2 do
> add more whitespaces, here and below

This is a removed line; it is fixed in the corresponding added line.

> > -  local m = (2^(N-i))*10000
> > -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
> io.write is lost, is it intentional?

Yes, it is checked via the checker function.

> > -end
> > +local benchmark
> > +benchmark = {
> > +  name = "nsieve_bit",
> > +  payload = function()
> > +    local res = {}
> > +    local items = 0
> > +    for i=0,2 do
> > +      local m = (2^(N-i))*10000
> add more whitespaces

Fixed.

===================================================================
diff --git a/perf/LuaJIT-benches/nsieve-bit.lua b/perf/LuaJIT-benches/nsieve-bit.lua
index 4858e9e2..e2c6f52c 100644
--- a/perf/LuaJIT-benches/nsieve-bit.lua
+++ b/perf/LuaJIT-benches/nsieve-bit.lua
@@ -29,8 +29,8 @@ benchmark = {
   payload = function()
     local res = {}
     local items = 0
-    for i=0,2 do
-      local m = (2^(N-i))*10000
+    for i = 0, 2 do
+      local m = (2 ^ (N - i)) * 10000
       items = items + m
       res[i] = nsieve(primes, m)
     end
===================================================================

> > +      items = items + m

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 20/41] perf: adjust nsieve in LuaJIT-benches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:26     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:26 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
See my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/nsieve.lua | 35 +++++++++++++++++++++++++++++-----
> >   1 file changed, 30 insertions(+), 5 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
> > index 6de0524f..2d1b66c8 100644
> > --- a/perf/LuaJIT-benches/nsieve.lua
> > +++ b/perf/LuaJIT-benches/nsieve.lua
> > @@ -1,3 +1,4 @@

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
index 1eb4efe8..dd59c71c 100644
--- a/perf/LuaJIT-benches/nsieve.lua
+++ b/perf/LuaJIT-benches/nsieve.lua
@@ -1,3 +1,10 @@
+-- Benchmark to check the performance of access to the array
+-- structure in the tiny inner loops. This benchmark finds all the
+-- prime numbers in a given segment. This is the most
+-- straightforward implementation.
+-- For the details see:
+-- https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
+
 local bench = require("bench").new(arg)
 
 local function nsieve(p, m)
===================================================================
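
For reference, the straightforward sieve described in the comment above
can be sketched as follows. This is an illustrative version under the
assumption that it mirrors the benchmark's `nsieve(p, m)` signature; the
body of the function in the suite is not shown in this mail and may
differ in details:

```lua
-- Sketch of the straightforward sieve; p is a reusable scratch table.
local function nsieve(p, m)
  -- Start by assuming that every number in [2, m] is prime.
  for i = 2, m do p[i] = true end
  local count = 0
  for i = 2, m do
    if p[i] then
      -- Cross out all the multiples of the prime i.
      for k = i + i, m, i do p[k] = false end
      count = count + 1
    end
  end
  return count
end

-- The smallest allowed N, see `if N < 2 then N = 2 end`.
local N = 2
local primes = {}
local res = {}
for i = 0, 2 do
  local m = (2 ^ (N - i)) * 10000
  res[i] = nsieve(primes, m)
end
```

The tiny inner loops over the array part of the table are exactly what
this benchmark stresses.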

> > +local bench = require("bench").new(arg)
> >   
> >   local function nsieve(p, m)
> >     for i=2,m do p[i] = true end
> > @@ -11,11 +12,35 @@ local function nsieve(p, m)
> >     return count
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 1
> > +local DEFAULT_N = 12
> Why 12?

It is the default for the x86 architecture. I've taken the values from
PARAM_x86.txt, since x86 is the most important architecture for
Tarantool; see the commit message.

> > +local N = tonumber(arg and arg[1]) or DEFAULT_N
> >   if N < 2 then N = 2 end
> >   local primes = {}
> >   
> > -for i=0,2 do
> > -  local m = (2^(N-i))*10000
> > -  io.write(string.format("Primes up to %8d %8d\n", m, nsieve(primes, m)))
> > -end
> > +local benchmark
> > +benchmark = {
> > +  name = "nsieve",
> > +  payload = function()
> > +    local res = {}
> > +    local items = 0
> > +    for i=0,2 do
> add more whitespaces, here and below

Fixed:

===================================================================
diff --git a/perf/LuaJIT-benches/nsieve.lua b/perf/LuaJIT-benches/nsieve.lua
index 2d1b66c8..1eb4efe8 100644
--- a/perf/LuaJIT-benches/nsieve.lua
+++ b/perf/LuaJIT-benches/nsieve.lua
@@ -23,8 +23,8 @@ benchmark = {
   payload = function()
     local res = {}
     local items = 0
-    for i=0,2 do
-      local m = (2^(N-i))*10000
+    for i = 0, 2 do
+      local m = (2 ^ (N - i)) * 10000
       items = items + m
       res[i] = nsieve(primes, m)
     end
===================================================================

> > +      local m = (2^(N-i))*10000

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 21/41] perf: adjust partialsums in LuaJIT-benches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:27     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:27 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/partialsums.lua | 69 ++++++++++++++++++-----------
> >   1 file changed, 42 insertions(+), 27 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
> > index 46bb9da3..ab24b30a 100644
> > --- a/perf/LuaJIT-benches/partialsums.lua
> > +++ b/perf/LuaJIT-benches/partialsums.lua

Added a comment with a short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
index c9d48a2d..2e5967d2 100644
--- a/perf/LuaJIT-benches/partialsums.lua
+++ b/perf/LuaJIT-benches/partialsums.lua
@@ -1,3 +1,7 @@
+-- The benchmark to check the performance of FP arithmetic and
+-- math functions. Calculates the partial sums of several series
+-- in the single loop.
+
 local bench = require("bench").new(arg)
 
 local DEFAULT_N = 1e7
===================================================================

> > @@ -1,29 +1,44 @@
> > +local bench = require("bench").new(arg)
> >   
> > -local n = tonumber(arg[1])
> > -local function pr(fmt, x) io.write(string.format(fmt, x)) end
> > +local DEFAULT_N = 1e7
> > +local n = tonumber(arg[1]) or DEFAULT_N
> Why 1e7 is default?

It is the default for the x86 architecture. I've taken the values from
PARAM_x86.txt, since x86 is the most important architecture for
Tarantool; see the commit message.

> >   
> > -local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
> > -local sqrt, sin, cos = math.sqrt, math.sin, math.cos
> > -for k=1,n do
> > -  local k2, sk, ck = k*k, sin(k), cos(k)
> > -  local k3 = k2*k
> > -  a1 = a1 + (2/3)^k
> > -  a2 = a2 + 1/sqrt(k)
> > -  a3 = a3 + 1/(k2+k)
> > -  a4 = a4 + 1/(k3*sk*sk)
> > -  a5 = a5 + 1/(k3*ck*ck)
> > -  a6 = a6 + 1/k
> > -  a7 = a7 + 1/k2
> > -  a8 = a8 + alt/k
> > -  a9 = a9 + alt/(k+k-1)
> > -  alt = -alt
> > -end
> > -pr("%.9f\t(2/3)^k\n", a1)
> > -pr("%.9f\tk^-0.5\n", a2)
> > -pr("%.9f\t1/k(k+1)\n", a3)
> > -pr("%.9f\tFlint Hills\n", a4)
> > -pr("%.9f\tCookson Hills\n", a5)
> > -pr("%.9f\tHarmonic\n", a6)
> > -pr("%.9f\tRiemann Zeta\n", a7)
> > -pr("%.9f\tAlternating Harmonic\n", a8)
> > -pr("%.9f\tGregory\n", a9)
> 
> debug prints were lost, is it intentional?

It is not a debug print but a verification of program correctness; it
also prevents DCE (dead code elimination) by compilers in compiled
languages.

Here it is tested by the checker function.
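
A minimal sketch of how returning the result from the payload and
consuming it in a checker replaces the printed output. The `run`
function below is a hypothetical stand-in for the suite's bench module
(its real implementation is not shown in this message); only the
`payload`, `checker`, and `skip_check` field names are taken from the
patches.

```lua
-- Hypothetical stand-in for the bench module; the real API may differ.
local function run(benchmark)
  local result = benchmark.payload()
  if not benchmark.skip_check then
    assert(benchmark.checker(result), benchmark.name .. ": check failed")
  end
  return result
end

local n = 3
local result = run({
  name = "telescoping-sum",
  payload = function()
    local sum = 0
    for k = 1, n do
      sum = sum + 1 / (k * k + k)
    end
    -- Returning the value keeps the computation observable.
    return sum
  end,
  checker = function(sum)
    -- The series 1/(k(k+1)) telescopes to n/(n+1).
    return math.abs(sum - n / (n + 1)) < 1e-12
  end,
})
print(result)
```

The sketch uses a tolerance, while the patch compares against exact
reference values for the default parameter.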

> 
> In a previous benches debug prints were left, but suppressed.

No, there are no debug prints. The suppressed output is either the
program's behaviour (as in the life benchmark) or its result (as in
k-nucleotide). Since the output is rather large, it is better to send it
to /dev/null than to save it to a file; this also avoids a dependence on
the file system.
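
The redirection can be sketched with the standard io library alone. It
assumes a POSIX system where /dev/null exists; the restore step is an
assumption about what a teardown could do, not a copy of the patch.

```lua
-- Sketch only: assumes a POSIX system with /dev/null.
local saved = io.output()  -- Remember the current default output.
io.output("/dev/null")     -- setup: discard everything written.
io.write("huge benchmark output\n")
io.output():close()        -- teardown: close the /dev/null handle
io.output(saved)           -- and restore the previous output.
```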

> 
> Also I propose to use the same printf function in all benches for 
> consistency.

There is no need for it since this is not the main payload of the
benchmark.

> 
> > +bench:add({
> > +  name = "partialsums",
> > +  payload = function()
> > +    local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
> > +    local sqrt, sin, cos = math.sqrt, math.sin, math.cos
> > +    for k=1,n do
> please add more whitespaces, here and below

Added:

===================================================================
diff --git a/perf/LuaJIT-benches/partialsums.lua b/perf/LuaJIT-benches/partialsums.lua
index ab24b30a..c9d48a2d 100644
--- a/perf/LuaJIT-benches/partialsums.lua
+++ b/perf/LuaJIT-benches/partialsums.lua
@@ -8,18 +8,18 @@ bench:add({
   payload = function()
     local a1, a2, a3, a4, a5, a6, a7, a8, a9, alt = 1, 0, 0, 0, 0, 0, 0, 0, 0, 1
     local sqrt, sin, cos = math.sqrt, math.sin, math.cos
-    for k=1,n do
-      local k2, sk, ck = k*k, sin(k), cos(k)
-      local k3 = k2*k
-      a1 = a1 + (2/3)^k
-      a2 = a2 + 1/sqrt(k)
-      a3 = a3 + 1/(k2+k)
-      a4 = a4 + 1/(k3*sk*sk)
-      a5 = a5 + 1/(k3*ck*ck)
-      a6 = a6 + 1/k
-      a7 = a7 + 1/k2
-      a8 = a8 + alt/k
-      a9 = a9 + alt/(k+k-1)
+    for k = 1, n do
+      local k2, sk, ck = k * k, sin(k), cos(k)
+      local k3 = k2 * k
+      a1 = a1 + (2 / 3) ^ k
+      a2 = a2 + 1 / sqrt(k)
+      a3 = a3 + 1 / (k2 + k)
+      a4 = a4 + 1 / (k3 * sk * sk)
+      a5 = a5 + 1 / (k3 * ck * ck)
+      a6 = a6 + 1 / k
+      a7 = a7 + 1 / k2
+      a8 = a8 + alt / k
+      a9 = a9 + alt / (k + k - 1)
       alt = -alt
     end
     return {a1, a2, a3, a4, a5, a6, a7, a8, a9}
===================================================================

> > +      local k2, sk, ck = k*k, sin(k), cos(k)
> > +      local k3 = k2*k
> > +      a1 = a1 + (2/3)^k
> > +      a2 = a2 + 1/sqrt(k)
> > +      a3 = a3 + 1/(k2+k)
> > +      a4 = a4 + 1/(k3*sk*sk)
> > +      a5 = a5 + 1/(k3*ck*ck)
> > +      a6 = a6 + 1/k
> > +      a7 = a7 + 1/k2
> > +      a8 = a8 + alt/k
> > +      a9 = a9 + alt/(k+k-1)
> > +      alt = -alt
> > +    end
> > +    return {a1, a2, a3, a4, a5, a6, a7, a8, a9}
> > +  end,
> > +  checker = function(a)
> > +    if n == DEFAULT_N then
> > +      assert(a[1] == 2.99999999999999866773)
> > +      assert(a[2] == 6323.09512394020111969439)
> > +      assert(a[3] == 0.99999989999981531152)
> > +      assert(a[4] == 30.31454593111029183206)
> > +      assert(a[5] == 42.99523427973661426904)
> > +      assert(a[6] == 16.69531136585727182364)
> > +      assert(a[7] == 1.64493396684725956547)
> > +      assert(a[8] == 0.69314713056010635039)
> > +      assert(a[9] == 0.78539813839744787582)
> > +    end
> > +    return true
> > +  end,
> > +  items = n,
> > +})
> > +
> > +bench:run_and_report()
> >

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 22/41] perf: adjust pidigits-nogmp in LuaJIT-benches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:27     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:27 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The output is redirected to /dev/null. The check is skipped since it is
> > very inconvenient to store the huge file in the repository with the
> > reference value.
> > ---
> >   perf/LuaJIT-benches/pidigits-nogmp.lua | 49 ++++++++++++++++++--------
> >   1 file changed, 35 insertions(+), 14 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
> > index 63a1cb0e..e96b3e45 100644
> > --- a/perf/LuaJIT-benches/pidigits-nogmp.lua
> > +++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
> > @@ -1,3 +1,4 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
index 3581a55a..cdfa7fcb 100644
--- a/perf/LuaJIT-benches/pidigits-nogmp.lua
+++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
@@ -1,3 +1,10 @@
+-- The benchmark to check the performance of arithmetic operations
+-- array accesses simulating the multi-precision number
+-- operations and the streaming output to the stdout. This
+-- benchmark calculates the first numbers of Pi.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/pidigits.html
+
 local bench = require("bench").new(arg)
 
 -- Start of dynamically compiled chunk.
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   -- Start of dynamically compiled chunk.
> >   local chunk = [=[
> > @@ -80,21 +81,41 @@ end)
> >   
> >   ]=] -- End of dynamically compiled chunk.
> >   
> > -local N = tonumber(arg and arg[1]) or 27
> > +local N = tonumber(arg and arg[1]) or 5000
> Why 5000 by default?

It is the default for the x86 architecture. I've taken the values from
PARAMS_x86, since x86 is the most important architecture for Tarantool;
see the commit message.

> >   local RADIX = N < 6500 and 2^36 or 2^32 -- Avoid overflow.
> >   

<snipped>

> > --- Print remaining digits (if any).
> > -local n10 = N % 10
> > -if n10 ~= 0 then
> > -  for i=1,n10 do io.write(pidigit()) end
> add more whitespaces, here and below

Fixed:

===================================================================
diff --git a/perf/LuaJIT-benches/pidigits-nogmp.lua b/perf/LuaJIT-benches/pidigits-nogmp.lua
index e96b3e45..3581a55a 100644
--- a/perf/LuaJIT-benches/pidigits-nogmp.lua
+++ b/perf/LuaJIT-benches/pidigits-nogmp.lua
@@ -100,16 +100,16 @@ bench:add({
     local pidigit = loadstring(string.gsub(chunk, "RADIX", tostring(RADIX)))()
 
     -- Print lines with 10 digits.
-    for i=10,N,10 do
-      for j=1,10 do io.write(pidigit()) end
+    for i = 10, N, 10 do
+      for _ = 1, 10 do io.write(pidigit()) end
       io.write("\t:", i, "\n")
     end
 
     -- Print remaining digits (if any).
     local n10 = N % 10
     if n10 ~= 0 then
-      for i=1,n10 do io.write(pidigit()) end
-      io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
+      for _ = 1, n10 do io.write(pidigit()) end
+      io.write(string.rep(" ", 10 - n10), "\t:", N, "\n")
     end
   end,
   teardown = function()
===================================================================
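
The output layout produced by the two loops above (10 digits per line,
then a padded tail) can be checked in isolation. In this sketch
`format_digits` and `fake_pidigit` are hypothetical names; the fake digit
source stands in for `pidigit()`, and the text is collected in a table
instead of being written to stdout.

```lua
-- Hypothetical helper mirroring the output loops of the diff.
local function format_digits(next_digit, N)
  local out = {}
  -- Full lines with 10 digits each.
  for i = 10, N, 10 do
    for _ = 1, 10 do out[#out + 1] = next_digit() end
    out[#out + 1] = "\t:" .. i .. "\n"
  end
  -- Remaining digits (if any), padded to a width of 10.
  local n10 = N % 10
  if n10 ~= 0 then
    for _ = 1, n10 do out[#out + 1] = next_digit() end
    out[#out + 1] = string.rep(" ", 10 - n10) .. "\t:" .. N .. "\n"
  end
  return table.concat(out)
end

-- A fake digit source standing in for pidigit().
local digits = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8}
local pos = 0
local function fake_pidigit()
  pos = pos + 1
  return digits[pos]
end

local text = format_digits(fake_pidigit, 12)
io.write(text)
```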

> > -  io.write(string.rep(" ", 10-n10), "\t:", N, "\n")
> > -end

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 23/41] perf: adjust ray in LuaJIT-benches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:29     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:29 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Fixed your comment and added the short benchmark description as
you suggested.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The output is redirected to /dev/null. The check is skipped since it is
> > very inconvenient to check the binary output, especially since it may be
> > configured by the parameter.
> > ---
> >   perf/LuaJIT-benches/ray.lua | 76 ++++++++++++++++++++++++-------------
> >   1 file changed, 50 insertions(+), 26 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
> > index 2acc24c0..f7b76d0a 100644
> > --- a/perf/LuaJIT-benches/ray.lua
> > +++ b/perf/LuaJIT-benches/ray.lua
> > @@ -1,10 +1,8 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
index e9719941..fe31481b 100644
--- a/perf/LuaJIT-benches/ray.lua
+++ b/perf/LuaJIT-benches/ray.lua
@@ -1,3 +1,7 @@
+-- Benchmark to check the performance of FP arithmetics and
+-- structure accesses via modeling of ray tracing. The benchmark
+-- generates the fractal image of spheres in the PGM format.
+
 local bench = require("bench").new(arg)
 
 local sqrt = math.sqrt
===================================================================

> > +local bench = require("bench").new(arg)
> > +

<snipped>

> > +bench:add({
> > +  name = "ray",
> > +  -- Avoid skip checking here, since it is not very convenient.
> > +  -- If you want to check the behaviour -- drop the setup
> > +  -- function.
> > +  skip_check = true,
> > +  setup = function()
> > +    io.output("/dev/null")
> > +  end,
> > +  payload = function()
> > +    local iss = 1/ss
> please add more whitespaces, here and below

Fixed:

===================================================================
diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
index f7b76d0a..e9719941 100644
--- a/perf/LuaJIT-benches/ray.lua
+++ b/perf/LuaJIT-benches/ray.lua
@@ -121,8 +121,8 @@ bench:add({
     io.output("/dev/null")
   end,
   payload = function()
-    local iss = 1/ss
-    local gf = 255/(ss*ss)
+    local iss = 1 / ss
+    local gf = 255 / (ss * ss)
 
     delta = 1
     while delta * delta + 1 ~= 1 do
@@ -137,16 +137,16 @@ bench:add({
 
     local scene = create(level, {0, -1, 0}, 1)
 
-    for y = n/2-1, -n/2, -1 do
-      for x = -n/2, n/2-1 do
+    for y = n / 2 - 1, -n / 2, -1 do
+      for x = -n / 2, n / 2 - 1 do
         local g = 0
-        for d = y, y+.99, iss do
-          for e = x, x+.99, iss do
+        for d = y, y + .99, iss do
+          for e = x, x + .99, iss do
             dir[1], dir[2], dir[3] = unitise(e, d, n)
             g = g + ray_trace(light, camera, dir, scene)
           end
         end
-        io.write(string.char(math.floor(0.5 + g*gf)))
+        io.write(string.char(math.floor(0.5 + g * gf)))
       end
     end
   end,
===================================================================

> > +    local gf = 255/(ss*ss)

<snipped>

I've also refactored the benchmark a bit to make its performance match
the original version:

===================================================================
diff --git a/perf/LuaJIT-benches/ray.lua b/perf/LuaJIT-benches/ray.lua
index fe31481b..f322b023 100644
--- a/perf/LuaJIT-benches/ray.lua
+++ b/perf/LuaJIT-benches/ray.lua
@@ -125,6 +125,9 @@ bench:add({
     io.output("/dev/null")
   end,
   payload = function()
+    -- Cache to avoid upvalue lookups.
+    local level, n, ss = level, n, ss
+
     local iss = 1 / ss
     local gf = 255 / (ss * ss)
 
===================================================================
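
The caching idiom from the diff above, shown on a toy payload. The values
of `n` and `ss` here are made up; the sketch only demonstrates that the
inner declaration shadows the upvalues with locals of the same name, so
the rest of the function reads the locals. Whether this changes the
timing depends on the VM and the JIT.

```lua
-- Stand-ins for the file-level parameters of the benchmark.
local n, ss = 4, 2

local function payload()
  -- Shadow the upvalues with locals of the same name, as in the diff:
  -- later uses in this function refer to the locals.
  local n, ss = n, ss
  local iss = 1 / ss
  local samples = 0
  for _ = 1, n do
    for _ = 0, .99, iss do
      samples = samples + 1
    end
  end
  return samples
end

print(payload())
```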

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 24/41] perf: adjust recursive-ack in LuaJIT-benches
  2025-11-17 13:25   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:30     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:30 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Added the short benchmark description as you suggested.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch!  LGTM
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

> > diff --git a/perf/LuaJIT-benches/recursive-ack.lua b/perf/LuaJIT-benches/recursive-ack.lua
> > index fad30589..1172d4b3 100644
> > --- a/perf/LuaJIT-benches/recursive-ack.lua
> > +++ b/perf/LuaJIT-benches/recursive-ack.lua
> > @@ -1,3 +1,5 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/recursive-ack.lua b/perf/LuaJIT-benches/recursive-ack.lua
index 1172d4b3..26b4e5c9 100644
--- a/perf/LuaJIT-benches/recursive-ack.lua
+++ b/perf/LuaJIT-benches/recursive-ack.lua
@@ -1,3 +1,8 @@
+-- The benchmark to check the performance of recursive calls.
+-- Calculates the Ackermann function.
+-- For the details see:
+-- https://mathworld.wolfram.com/AckermannFunction.html
+
 local bench = require("bench").new(arg)
 
 local function Ack(m, n)
===================================================================
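
For reference, the standard two-argument Ackermann function can be
written as below. The body of `Ack` in recursive-ack.lua is not shown in
this message, so this is the textbook definition rather than a copy of
the benchmark.

```lua
local function Ack(m, n)
  if m == 0 then return n + 1 end
  if n == 0 then return Ack(m - 1, 1) end
  return Ack(m - 1, Ack(m, n - 1))
end

print(Ack(2, 3)) -- Ack(2, n) = 2 * n + 3, so this prints 9.
```

The recursion depth grows very quickly with `m`, which is what makes the
function a stress test for recursive calls.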

> > +local bench = require("bench").new(arg)

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 25/41] perf: adjust recursive-fib in LuaJIT-benches
  2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:30     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:30 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/recursive-fib.lua | 28 +++++++++++++++++++++++++--
> >   1 file changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/recursive-fib.lua b/perf/LuaJIT-benches/recursive-fib.lua
> > index ef9950de..99af3f9e 100644
> > --- a/perf/LuaJIT-benches/recursive-fib.lua
> > +++ b/perf/LuaJIT-benches/recursive-fib.lua
> > @@ -1,7 +1,31 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/recursive-fib.lua b/perf/LuaJIT-benches/recursive-fib.lua
index 99af3f9e..8e96934a 100644
--- a/perf/LuaJIT-benches/recursive-fib.lua
+++ b/perf/LuaJIT-benches/recursive-fib.lua
@@ -1,3 +1,8 @@
+-- The benchmark to check the performance of recursive calls.
+-- Calculates the Fibonacci values recursively.
+-- For the details see:
+-- http://mathworld.wolfram.com/FibonacciNumber.html
+
 local bench = require("bench").new(arg)
 
 local function fib(n)
===================================================================

> > +local bench = require("bench").new(arg)
> > +
> >   local function fib(n)
> >     if n < 2 then return 1 end
> >     return fib(n-2) + fib(n-1)
> >   end
> >   
> > -local n = tonumber(arg[1]) or 10
> > -io.write(string.format("Fib(%d): %d\n", n, fib(n)))
> debug print was lost, is it intentional?

It is not a debug print but a verification of program correctness; it
also prevents DCE (dead code elimination) by compilers in compiled
languages.

Here it is tested by the checker function.
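
A sketch of a checker that verifies the result only for the default
parameter, where the reference value is known. `fib` is the function from
the quoted patch; the small `DEFAULT_N` is a stand-in chosen so that the
example runs instantly (the patch uses 40), and the table is called
directly here instead of going through the bench module.

```lua
local function fib(n)
  if n < 2 then return 1 end
  return fib(n - 2) + fib(n - 1)
end

local DEFAULT_N = 10 -- Small stand-in; the patch uses 40.
local n = DEFAULT_N

local benchmark = {
  name = "recursive-fib",
  payload = function() return fib(n) end,
  checker = function(res)
    -- The reference value is known only for the default parameter.
    if n == DEFAULT_N then
      assert(res == 89)
    end
    return true
  end,
}

assert(benchmark.checker(benchmark.payload()))
```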

> > +local n = tonumber(arg[1]) or 40
> Why 40?

It is the default for the x86 architecture. I've taken the values from
PARAMS_x86, since x86 is the most important architecture for Tarantool;
see the commit message.

> > +

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 26/41] perf: adjust revcomp in LuaJIT-benches
  2025-11-17 13:59   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:31     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:31 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review!
Fixed your comment and added the short benchmark description as
you suggested.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.

Discussed offline and fixed your comments.

> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

> > diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
> > index 34fe347b..2b1ffa5c 100644
> > --- a/perf/LuaJIT-benches/revcomp.lua
> > +++ b/perf/LuaJIT-benches/revcomp.lua
> > @@ -1,3 +1,4 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
index cc944f39..6e8a7049 100644
--- a/perf/LuaJIT-benches/revcomp.lua
+++ b/perf/LuaJIT-benches/revcomp.lua
@@ -1,3 +1,13 @@
+-- The benchmark to check the performance of hash table lookups by
+-- constant keys, string manipulations, and read/write
+-- operations.
+-- This benchmark reads the line-by-line a redirected FASTA format
+-- file from stdin, which is generated by the <fasta.lua>
+-- benchmark, and writes the id, description, and the
+-- reverse-complement sequence in FASTA format to stdout.
+-- For the details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/revcomp.html
+
 local bench = require("bench").new(arg)
 
 local sub = string.sub
===================================================================

> > +local bench = require("bench").new(arg)
> >   
> >   local sub = string.sub
> >   iubc = setmetatable({
> > @@ -9,29 +10,50 @@ iubc = setmetatable({

<snipped>

> > +bench:add({
> > +  name = "revcomp",
> > +  -- The compare with the result output file is inconvenient.
> > +  skip_check = true,
> > +  setup = function()
> > +    io.output("/dev/null")
> > +  end,
> > +  payload = function()
> > +    local wcode = [=[
> > +    return function(t, n)
> > +      if n == 1 then return end
> > +      local iubc, sub, write = iubc, string.sub, io.write
> > +      local s = table.concat(t, "", 1, n-1)
> > +      for i=#s-59,1,-60 do
> > +        write(]=]
> > +    for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
> > +    wcode = wcode..[=["\n")
> > +      end
> > +      local r = #s % 60
> > +      if r ~= 0 then
> > +        for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
> > +        write("\n")
> > +      end
> > +    end
> > +    ]=]

Discussed offline and reformatted as the following:

===================================================================
diff --git a/perf/LuaJIT-benches/revcomp.lua b/perf/LuaJIT-benches/revcomp.lua
index 2b1ffa5c..cc944f39 100644
--- a/perf/LuaJIT-benches/revcomp.lua
+++ b/perf/LuaJIT-benches/revcomp.lua
@@ -24,15 +24,17 @@ bench:add({
     return function(t, n)
       if n == 1 then return end
       local iubc, sub, write = iubc, string.sub, io.write
-      local s = table.concat(t, "", 1, n-1)
-      for i=#s-59,1,-60 do
+      local s = table.concat(t, "", 1, n - 1)
+      for i = #s - 59, 1, -60 do
         write(]=]
-    for i=59,3,-4 do wcode = wcode.."iubc[sub(s, i+"..(i-3)..", i+"..i..")], " end
+    for i = 59, 3, -4 do
+      wcode = wcode.."iubc[sub(s, i + "..(i - 3)..", i + "..i..")], "
+    end
     wcode = wcode..[=["\n")
       end
       local r = #s % 60
       if r ~= 0 then
-        for i=r,1,-4 do write(iubc[sub(s, i-3 < 1 and 1 or i-3, i)]) end
+        for i = r, 1, -4 do write(iubc[sub(s, i - 3 < 1 and 1 or i - 3, i)]) end
         write("\n")
       end
     end
===================================================================
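
The `wcode` string above is Lua source assembled at run time: the outer
`for` loop runs during generation and emits an unrolled argument list.
The same technique on a much smaller, made-up example (`reverse4` is a
hypothetical name, and the generated function is not the one from
revcomp.lua):

```lua
-- loadstring() exists in Lua 5.1/LuaJIT; load() accepts strings in 5.2+.
local compile = loadstring or load

-- Generate an unrolled reversal of a 4-character string: the loop runs
-- at generation time, and the resulting chunk contains no loop.
local code = "return function(s, sub)\n  return "
for i = 4, 1, -1 do
  code = code .. "sub(s, " .. i .. ", " .. i .. ")"
  if i > 1 then code = code .. " .. " end
end
code = code .. "\nend"

local reverse4 = compile(code)()
print(reverse4("ACGT", string.sub))
```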

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 27/41] perf: adjust scimark-2010-12-20 in LuaJIT-benches
  2025-11-17 13:56   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:32     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:32 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The time for each subsequent benchmark is increased up to 4 seconds,
> > according to the defaults in the "bench" framework. The main difference
> > between this test and others that will be added in next commits is
> > the usage of FFI arrays instead of plain Lua tables.
> > ---
> >   perf/LuaJIT-benches/scimark-2010-12-20.lua | 93 +++++++++++++---------
> >   1 file changed, 54 insertions(+), 39 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/scimark-2010-12-20.lua b/perf/LuaJIT-benches/scimark-2010-12-20.lua
> > index 353acb7c..3fb627fa 100644
> > --- a/perf/LuaJIT-benches/scimark-2010-12-20.lua
> > +++ b/perf/LuaJIT-benches/scimark-2010-12-20.lua
> > @@ -9,25 +9,26 @@
> >   local SCIMARK_VERSION = "2010-12-10"
> >   local SCIMARK_COPYRIGHT = "Copyright (C) 2006-2010 Mike Pall"
> >   
> > -local MIN_TIME = 2.0
> > +local bench = require("bench").new(arg)
> > +
> >   local RANDOM_SEED = 101009 -- Must be odd.
> >   local SIZE_SELECT = "small"
> >   
> >   local benchmarks = {
> >     "FFT", "SOR", "MC", "SPARSE", "LU",
> >     small = {
> > -    FFT		= { 1024 },
> > -    SOR		= { 100 },
> > -    MC		= { },
> > -    SPARSE	= { 1000, 5000 },
> > -    LU		= { 100 },
> > +    FFT		= { params = { 1024 }, cycles = 50000, },
> > +    SOR		= { params = { 100 }, cycles = 50000, },
> > +    MC		= { params = { }, cycles = 15e7, },
> > +    SPARSE	= { params = { 1000, 5000 }, cycles = 15e4, },
> > +    LU		= { params = { 100 }, cycles = 5000, },
> >     },
> >     large = {
> > -    FFT		= { 1048576 },
> > -    SOR		= { 1000 },
> > -    MC		= { },
> > -    SPARSE	= { 100000, 1000000 },
> > -    LU		= { 1000 },
> > +    FFT		= { params = { 1048576 }, cycles = 25, },
> > +    SOR		= { params = { 1000 }, cycles = 500, },
> > +    MC		= { params = { }, cycles = 15e7, },
> > +    SPARSE	= { params = { 100000, 1000000 }, cycles = 1500, },
> > +    LU		= { params = { 1000 }, cycles = 50, },
> >     },
> >   }
> please add a comment about chosen parameters

It is the default for the x86 architecture. I've taken the values from
PARAMS_x86, since x86 is the most important architecture for Tarantool;
see the commit message.

The other parameters were adjusted by comparing with the non-FFI version
of the benchmark.
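
The change to `measure` in the quoted hunk replaces a time-driven loop
with a fixed cycle count. The difference can be sketched on a toy kernel.
Everything here (`run`, `measure_adaptive`, `measure_fixed`) is
hypothetical and uses `os.clock` in place of the suite's clock module.

```lua
-- Toy kernel: returns the number of cycles it performed and a value
-- that depends on the loop, so the work is not trivially dead.
local function run(cycles)
  local acc = 0
  for i = 1, cycles do acc = acc + i % 7 end
  return cycles, acc
end

-- Old scheme: double `cycles` until the run lasts at least min_time.
local function measure_adaptive(min_time)
  local cycles = 1
  repeat
    local tm = os.clock()
    local ops = run(cycles)
    tm = os.clock() - tm
    if tm >= min_time then return ops, cycles end
    cycles = cycles * 2
  until false
end

-- New scheme: the cycle count is fixed, so every run does equal work.
local function measure_fixed(cycles)
  return run(cycles)
end

print(measure_fixed(50000))
```

With the adaptive scheme the amount of work depends on the speed of the
machine; with a fixed count the timing is left to the caller.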

> >   
> > @@ -342,48 +343,51 @@ local function fmtparams(p1, p2)
> >     return ""
> >   end
> >   
> > -local function measure(min_time, name, ...)
> > +local function measure(name, cycles, ...)
> >     array_init()
> >     rand_init(RANDOM_SEED)
> >     local run = benchmarks[name](...)
> > -  local cycles = 1
> > -  repeat
> > -    local tm = clock()
> > -    local flops = run(cycles, ...)
> > -    tm = clock() - tm
> > -    if tm >= min_time then
> > -      local res = flops / tm * 1.0e-6
> > -      local p1, p2 = ...
> > -      printf("%-7s %8.2f  %s\n", name, res, fmtparams(...))
> > -      return res
> > -    end
> > -    cycles = cycles * 2
> > -  until false
> > +  local flops = run(cycles, ...)
> > +  return flops
> >   end
> >   
> > -printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
> > -       SCIMARK_VERSION, SCIMARK_COPYRIGHT)
> > +-- printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
> > +--        SCIMARK_VERSION, SCIMARK_COPYRIGHT)
> >   
> 
> I propose to move this to a comment with test description.

Removed this line. The top of the file already contains the
description:

===================================================================
diff --git a/perf/LuaJIT-benches/scimark-2010-12-20.lua b/perf/LuaJIT-benches/scimark-2010-12-20.lua
index 3fb627fa..4b80ffe2 100644
--- a/perf/LuaJIT-benches/scimark-2010-12-20.lua
+++ b/perf/LuaJIT-benches/scimark-2010-12-20.lua
@@ -351,9 +351,6 @@ local function measure(name, cycles, ...)
   return flops
 end
 
--- printf("Lua SciMark %s based on SciMark 2.0a. %s.\n\n",
---        SCIMARK_VERSION, SCIMARK_COPYRIGHT)
-
 while arg and arg[1] do
   local a = table.remove(arg, 1)
   if a == "noffi" then
===================================================================

> 
> Something like:
> 
> The test runs the Lua version of SciMark 2.0a, which is a benchmark for 
> scientific and numerical computing developed by programmers at the NIST 
> (National Institute of Standards and Technology). This test is made up 
> of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte 
> Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks.
> 
> plus description of available test-specific options (noffi, small, etc) 
> or just

These options are fairly self-explanatory. I suppose there is no need to
document them, since the CI runs the benchmark only in the single
default configuration.

> 
> a command-line that will show usage: ./scimark-2010-12-20.lua help

It already has the following output:

| Usage: scimark [noffi] [small|large] [BENCH params...]
|
| BENCH   small         large
| ---------------------------------------
| FFT     [1024]        [1048576]
| SOR     [100]         [1000]
| MC
| SPARSE  [1000, 5000]  [100000, 1000000]
| LU      [100]         [1000]

> 
> >   while arg and arg[1] do

<snipped>

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 28/41] perf: move <scimark_lib.lua> to <libs/> directory
  2025-11-17 13:58   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:32     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:32 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with a minor comment:
> 
> I propose to squash this patch with patch "perf: add CMake infrastructure"

I prefer to leave it as is to keep the diff clear.

> 
> or add to the commit message "Needed for ...".

"Needed for" is used to reference a ticket. There is no ticket for
this, so ignoring. The commit message contains the rationale. The patch
itself was made in this order, before refactoring the scimark-*.lua
benchmarks, so I prefer to leave it as is.

> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This helps to avoid this library in the scanning of the test files
> > for the suite.
> > ---
> >   perf/LuaJIT-benches/{ => libs}/scimark_lib.lua | 0
> >   1 file changed, 0 insertions(+), 0 deletions(-)
> >   rename perf/LuaJIT-benches/{ => libs}/scimark_lib.lua (100%)
> >
> > diff --git a/perf/LuaJIT-benches/scimark_lib.lua b/perf/LuaJIT-benches/libs/scimark_lib.lua
> > similarity index 100%
> > rename from perf/LuaJIT-benches/scimark_lib.lua
> > rename to perf/LuaJIT-benches/libs/scimark_lib.lua

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 29/41] perf: adjust scimark-fft in LuaJIT-benches
  2025-11-17 14:00   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:33     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:33 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! Please see my comments.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > Checks are omitted since they were not present in the original suite,
> > plus the precise result value depends on the input parameter.
> > ---
> >   perf/LuaJIT-benches/scimark-fft.lua | 19 ++++++++++++++++++-
> >   1 file changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/perf/LuaJIT-benches/scimark-fft.lua b/perf/LuaJIT-benches/scimark-fft.lua
> > index c05bb69a..96535774 100644
> > --- a/perf/LuaJIT-benches/scimark-fft.lua
> > +++ b/perf/LuaJIT-benches/scimark-fft.lua
> > @@ -1 +1,18 @@
> > -require("scimark_lib").FFT(1024)(tonumber(arg and arg[1]) or 50000)
> > +local bench = require("bench").new(arg)
> > +
> > +local cycles = tonumber(arg and arg[1]) or 50000
> Why 50000?

It is the default for the x86 architecture. I've taken the values from
PARAMS_x86, since x86 is the most important architecture for Tarantool;
see the commit message.

Also, it is the same as before the patch.

> > +local benchmark
> > +benchmark = {
> > +  name = "scimark_fft",
> > +  -- XXX: The description of tests for the function is too
> > +  -- inconvenient.
> > +  skip_check = true,
> > +  payload = function()
> > +    local flops = require("scimark_lib").FFT(1024)(cycles)
> Why 1024?

It is the same as before the patch. This constant is the same as in the
original SciMark v2 benchmark.

> > +    benchmark.items = flops
> > +  end,
> > +}
> > +
> > +bench:add(benchmark)
> > +
> > +bench:run_and_report()
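
The `local benchmark` forward declaration in the quoted patch lets the
payload update the `items` field of its own table. A self-contained
sketch of that pattern, with a hypothetical `fake_kernel` standing in for
scimark_lib's FFT and a made-up operation count:

```lua
-- Hypothetical kernel; it returns a made-up operation count
-- proportional to the number of cycles.
local function fake_kernel(cycles)
  return cycles * 10
end

local cycles = 50000
local benchmark
benchmark = {
  name = "fake_fft",
  payload = function()
    -- The local is declared before the function is created, so the
    -- closure can refer to the table and publish the item count.
    benchmark.items = fake_kernel(cycles)
  end,
}

benchmark.payload()
print(benchmark.name, benchmark.items)
```

Writing `local benchmark = { ... }` in one statement would not work here:
inside the constructor the name would still resolve to a global.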

-- 
Best regards,
Sergey Kaplun

* Re: [Tarantool-patches] [PATCH v1 luajit 30/41] perf: adjust scimark-lu in LuaJIT-benches
  2025-11-17 14:07   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:34     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:34 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answer below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with a minor comment.
> 
> I propose to add a small description to the comment. Something like this:
> 
> The test runs a part of the Lua version of SciMark 2.0a, which is a 
> benchmark for scientific and numerical computing developed by 
> programmers at the NIST (National Institute of Standards and 
> Technology). This test is made up of  dense LU matrix factorization 
> benchmarks.

This description is already present in <scimark_lib.lua>.
I see no reason to duplicate it.
Ignoring.

> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 31/41] perf: add scimark-mc in LuaJIT-benches
  2025-11-17 14:09   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:35     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:35 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answer below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with a minor comment below.
> 
> I propose to add a small test description to a comment:

This description is already present in <scimark_lib.lua>, which also
links to the detailed description of the benchmarks and the original C
sources.
I see no reason to duplicate it.
Ignoring.

> 
> SciMark is a popular benchmark, MC is a Monte Carlo Integration.
> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 32/41] perf: adjust scimark-sor in LuaJIT-benches
  2025-11-17 14:11   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:35     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:35 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answer below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with a minor comment.
> 
> Please add a small test description to the comment.

This description is already present in <scimark_lib.lua>.
I see no reason to duplicate it.
Ignoring.

> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 33/41] perf: adjust scimark-sparse in LuaJIT-benches
  2025-11-17 14:15   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:36     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:36 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answer below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! LGTM with a minor comment.
> 
> please add a small test description to the comment.

This description is already present in <scimark_lib.lua>.
I see no reason to duplicate it.
Ignoring.

> 
> Sergey
> 
> On 10/24/25 13:50, Sergey Kaplun wrote:

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 34/41] perf: adjust series in LuaJIT-benches
  2025-11-17 14:19   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:37     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:37 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See comments below.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/series.lua | 20 ++++++++++++++------
> >   1 file changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
> > index f766cb32..3dc970c5 100644
> > --- a/perf/LuaJIT-benches/series.lua
> > +++ b/perf/LuaJIT-benches/series.lua

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
index 3dc970c5..56f012b7 100644
--- a/perf/LuaJIT-benches/series.lua
+++ b/perf/LuaJIT-benches/series.lua
@@ -1,3 +1,7 @@
+-- The benchmark to check the performance of FP arithmetics, power
+-- operation, and trigonometrical functions. Calculates the
+-- integrals of sin/cos functions.
+
 local bench = require("bench").new(arg)
 
 local function integrate(x0, x1, nsteps, omegan, f)
===================================================================
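
For readers unfamiliar with the workload, here is a minimal sketch of
numerical integration by the trapezoidal rule. It is not the benchmark's
own `integrate` function; the `trapezoid` and `integrand` names are made up
for the example, and the integrand only resembles the kind of expression
the benchmark evaluates.

```lua
-- Trapezoidal rule: integrates f over [x0, x1] using nsteps equal
-- subintervals.
local function trapezoid(f, x0, x1, nsteps)
  local dx = (x1 - x0) / nsteps
  local sum = (f(x0) + f(x1)) / 2
  for i = 1, nsteps - 1 do
    sum = sum + f(x0 + i * dx)
  end
  return sum * dx
end

-- Integrand with the power operation and a trigonometric factor,
-- similar in spirit to what the benchmark exercises.
local function integrand(x)
  return (x + 1) ^ x * math.cos(math.pi * x)
end

print(trapezoid(integrand, 0, 2, 1000))
```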

> > @@ -1,3 +1,4 @@
> > +local bench = require("bench").new(arg)
> >   
> >   local function integrate(x0, x1, nsteps, omegan, f)
> >     local x, dx = x0, (x1-x0)/nsteps
> > @@ -26,9 +27,16 @@ local function series(n)
> >   end
> >   
> >   local n = tonumber(arg and arg[1]) or 10000
> > -local tm = os.clock()
> > -local t = series(n)
> > -tm = os.clock() - tm
> > -assert(math.abs(t[1]-2.87295) < 0.00001)
> > -io.write(string.format("size %d, %.2f s, %.1f iterations/s\n",
> > -                       n, tm, (2*n-1)/tm))
> debug print was lost, is it intentional?

It is not a debug print, but rather information about the benchmark
performance. The bench module collects this statistic automatically.
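
To show the idea, here is a rough sketch of how such a statistic can be
derived from the `items` field. This is NOT the real bench module; the
`run` function and the report fields are assumptions made for the example.

```lua
-- Hypothetical mini-harness, NOT the real bench module: runs the
-- payload once, measures CPU time with os.clock(), and derives the
-- throughput from the declared number of items.
local function run(benchmark)
  local start = os.clock()
  local result = benchmark.payload()
  local elapsed = os.clock() - start
  return {
    name = benchmark.name,
    cpu_time = elapsed,
    -- Guard against a zero reading on very fast payloads.
    items_per_second = elapsed > 0 and benchmark.items / elapsed or nil,
    passed = benchmark.checker == nil or benchmark.checker(result),
  }
end

local n = 1000
local report = run({
  name = "sum_to_n",
  items = n,
  payload = function()
    local sum = 0
    for i = 1, n do sum = sum + i end
    return sum
  end,
  checker = function(res) return res == n * (n + 1) / 2 end,
})
print(report.name, report.passed)
```

With this split the benchmark file only declares what to run and how to
check it, and the timing and reporting stay in one place.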

> > +
> > +bench:add({
> > +  name = "series",
> > +  checker = function(res)
> > +    return math.abs(res[1]-2.87295) < 0.00001
> add more whitespaces

Added:

===================================================================
diff --git a/perf/LuaJIT-benches/series.lua b/perf/LuaJIT-benches/series.lua
index 3dc970c5..16d34fdf 100644
--- a/perf/LuaJIT-benches/series.lua
+++ b/perf/LuaJIT-benches/series.lua
@@ -31,7 +31,7 @@ local n = tonumber(arg and arg[1]) or 10000
 bench:add({
   name = "series",
   checker = function(res)
-    return math.abs(res[1]-2.87295) < 0.00001
+    return math.abs(res[1] - 2.87295) < 0.00001
   end,
   payload = function()
     return series(n)
===================================================================

> > +  end,
> > +  payload = function()
> > +    return series(n)
> > +  end,
> > +  items = 2 * n - 1,
> > +})
> > +
> > +bench:run_and_report()

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 35/41] perf: adjust spectral-norm in LuaJIT-benches
  2025-11-17 14:23   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:37     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:37 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 17.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
>   thanks for the patch! See my comments below.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> > ---
> >   perf/LuaJIT-benches/spectral-norm.lua | 40 +++++++++++++++++++--------
> >   1 file changed, 29 insertions(+), 11 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
> > index ecc80112..6e63cd47 100644
> > --- a/perf/LuaJIT-benches/spectral-norm.lua
> > +++ b/perf/LuaJIT-benches/spectral-norm.lua

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
index ae0a6b6d..7a7712a0 100644
--- a/perf/LuaJIT-benches/spectral-norm.lua
+++ b/perf/LuaJIT-benches/spectral-norm.lua
@@ -1,3 +1,10 @@
+-- The benchmark to check the performance of FP arithmetics and
+-- function call inlining in the inner loops.
+-- The benchmark calculates the spectral norm of an infinite
+-- matrix.
+-- For more details see:
+-- https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/spectralnorm.html
+
 local bench = require("bench").new(arg)
 
 local function A(i, j)
===================================================================
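
As a self-contained illustration of what the benchmark computes, here is a
compact sketch of the power iteration. The body of `A` follows the usual
benchmarks game definition of the matrix and is an assumption here, since
it is not visible in the quoted hunks; `spectral_norm` is a made-up wrapper
name.

```lua
-- Element (i, j) of the infinite matrix, 1-based indices.
local function A(i, j)
  local ij = i + j - 1
  return 1.0 / (ij * (ij - 1) * 0.5 + i)
end

-- y = A * x
local function Av(x, y, N)
  for i = 1, N do
    local a = 0
    for j = 1, N do a = a + x[j] * A(i, j) end
    y[i] = a
  end
end

-- y = A^T * x
local function Atv(x, y, N)
  for i = 1, N do
    local a = 0
    for j = 1, N do a = a + x[j] * A(j, i) end
    y[i] = a
  end
end

-- y = A^T * (A * x), with t as scratch space.
local function AtAv(x, y, t, N)
  Av(x, t, N)
  Atv(t, y, N)
end

local function spectral_norm(N)
  local u, v, t = {}, {}, {}
  for i = 1, N do u[i] = 1 end
  -- Power iteration: 10 rounds.
  for _ = 1, 10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
  local vBv, vv = 0, 0
  for i = 1, N do
    vBv = vBv + u[i] * v[i]
    vv = vv + v[i] * v[i]
  end
  return math.sqrt(vBv / vv)
end

print(string.format("%.5f", spectral_norm(100)))
```

The work grows quadratically with N, which is why the default value of N
matters so much for the run time.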

> > @@ -1,3 +1,4 @@
> > +local bench = require("bench").new(arg)
> >   
> >   local function A(i, j)
> >     local ij = i+j-1
> > @@ -25,16 +26,33 @@ local function AtAv(x, y, t, N)
> >     Atv(t, y, N)
> >   end
> >   
> > -local N = tonumber(arg and arg[1]) or 100
> > -local u, v, t = {}, {}, {}
> > -for i=1,N do u[i] = 1 end
> > +local N = tonumber(arg and arg[1]) or 3000
> Why it was changed to 3000?

It is the default for the x86 arch. I've taken the values from
<PARAM_x86.txt>, since this is the most important architecture for
Tarantool; see the commit message.

> >   
> > -for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
> > +bench:add({
> > +  name = "spectral_norm",
> > +  checker = function(res)
> > +    -- XXX: Empirical value.
> > +    if N > 66 then
> > +      assert(math.abs(res - 1.27422) < 0.00001)
> > +    end
> > +    return true
> > +  end,
> > +  payload = function()
> > +    local u, v, t = {}, {}, {}
> > +    for i=1,N do u[i] = 1 end
> add more whitespaces, here and below

Added:

===================================================================
diff --git a/perf/LuaJIT-benches/spectral-norm.lua b/perf/LuaJIT-benches/spectral-norm.lua
index 6e63cd47..ae0a6b6d 100644
--- a/perf/LuaJIT-benches/spectral-norm.lua
+++ b/perf/LuaJIT-benches/spectral-norm.lua
@@ -39,15 +39,15 @@ bench:add({
   end,
   payload = function()
     local u, v, t = {}, {}, {}
-    for i=1,N do u[i] = 1 end
+    for i = 1, N do u[i] = 1 end
 
-    for i=1,10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
+    for _ = 1, 10 do AtAv(u, v, t, N) AtAv(v, u, t, N) end
 
     local vBv, vv = 0, 0
-    for i=1,N do
+    for i = 1, N do
       local ui, vi = u[i], v[i]
-      vBv = vBv + ui*vi
-      vv = vv + vi*vi
+      vBv = vBv + ui * vi
+      vv = vv + vi * vi
     end
     return math.sqrt(vBv / vv)
   end,
===================================================================

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 36/41] perf: adjust sum-file in LuaJIT-benches
  2025-12-23 10:44   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:38     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:38 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 23.12.25, Sergey Bronnikov wrote:
> on execution I got an error:
> 
> ./build/src/luajit: perf/LuaJIT-benches/sum-file.lua:11: attempt to 
> perform arithmetic on local 'line' (a string value)

How do you run the benchmark?

| LUA_PATH="perf/utils/?.lua;;" src/luajit perf/LuaJIT-benches/sum-file.lua < perf/LuaJIT-benches/SUMCOL_5000.txt
| -------------------------------------------------------------------------------------------------------------
| Benchmark                                     Time             CPU    Iterations UserCounters...
| -------------------------------------------------------------------------------------------------------------
| sum_file                                    6.21 s          6.20 s             4 items_per_second=3.219M/s

The input file contains only numbers. See the comment in the test.

> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch adjusts the aforementioned test to use the benchmark
> > framework introduced before. The default arguments are adjusted
> > according to the <PARAM_x86.txt> file. The arguments to the script still
> > can be provided in the command line run.
> >
> > The input for the test is redirected from the generated file
> > <SUMCOL_5000.txt>. This file is the result of concatenation of the
> > <SUMCOL_1.txt> 5000 times.
> > ---
> >   perf/LuaJIT-benches/sum-file.lua | 29 ++++++++++++++++++++++++-----
> >   1 file changed, 24 insertions(+), 5 deletions(-)
> >
> > diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
> > index c9e618fd..270c1865 100644
> > --- a/perf/LuaJIT-benches/sum-file.lua
> > +++ b/perf/LuaJIT-benches/sum-file.lua
> > @@ -1,6 +1,25 @@

Added the comment with the short benchmark description, as we
discussed offline:

===================================================================
diff --git a/perf/LuaJIT-benches/sum-file.lua b/perf/LuaJIT-benches/sum-file.lua
index 270c1865..6af2b4a5 100644
--- a/perf/LuaJIT-benches/sum-file.lua
+++ b/perf/LuaJIT-benches/sum-file.lua
@@ -1,3 +1,7 @@
+-- The benchmark to check the performance of reading lines from
+-- stdin and sum the given numbers (the strings converted to
+-- numbers by the VM automatically).
+
 local bench = require("bench").new(arg)
 
 -- XXX: The input file is generated from <SUMCOL_1.txt> by
===================================================================

> > +local bench = require("bench").new(arg)

<snipped>

> > +-- XXX: The input file is generated from <SUMCOL_1.txt> by
> > +-- repeating it 5000 times. The <SUMCOL_1.txt> contains 1000 lines
> > +-- with the total sum of 500.
> > +bench:add({
> > +  name = "sum_file",
> > +  payload = function()
> > +    local sum = 0
> > +    for line in io.lines() do
> > +      sum = sum + line
> You obviously cannot sum a string and a number.

I can:
| luajit -e 'print(1 + "1")'
| 2

I haven't changed this line.
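
A slightly longer sketch of the same coercion, assuming the lines are
given as a table instead of being read from stdin (`sum_lines` is a
made-up name for the example):

```lua
-- Sums an array of strings, relying on Lua's automatic
-- string-to-number coercion in arithmetic.
local function sum_lines(lines)
  local sum = 0
  for _, line in ipairs(lines) do
    sum = sum + line
  end
  return sum
end

print(sum_lines({"1", "2", "3"}))

-- A line that does not look like a number raises an error instead.
local ok, err = pcall(sum_lines, {"1", "not a number"})
print(ok, err)
```

So the error above presumably means that some input line was not numeric,
e.g. stdin was not redirected from the generated file.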

> > +    end

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 37/41] perf: add CMake infrastructure
  2025-11-18 12:21   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:40     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:40 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 18.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:

<snipped>

> > diff --git a/CMakeLists.txt b/CMakeLists.txt
> > index c0da4362..73f46835 100644
> > --- a/CMakeLists.txt
> > +++ b/CMakeLists.txt
> > @@ -464,6 +464,17 @@ if(LUAJIT_USE_TEST)
> >   endif()
> >   add_subdirectory(test)
> >   
> > +# --- Benchmarks source tree ---------------------------------------------------
> > +
> > +# The option to enable performance tests for the LuaJIT.
> > +# Disabled by default, since commonly it is used only by LuaJIT
> > +# developers and run in the CI with the specially set-up machine.
> > +option(LUAJIT_ENABLE_PERF "Generate <perf> target" OFF)
> > +
> > +if(LUAJIT_ENABLE_PERF)
> 
> option name confuses a bit due to `perf` utility.
> 
> I would rename to something like "LUAJIT_ENABLE_PERF_TESTS".

OTOH, it matches the directory name, which is the same as in
Tarantool.

LUAJIT_USE_PERFTOOLS is used for the perftools support in the JIT engine.

> 
> Feel free to ignore.

Ignoring.

> 
> > +  add_subdirectory(perf)
> > +endif()
> > +
> >   # --- Misc rules ---------------------------------------------------------------
> >   
> >   # XXX: Implement <uninstall> target using the following recipe:
> > diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> > new file mode 100644
> > index 00000000..cc3c312f
> > --- /dev/null
> > +++ b/perf/CMakeLists.txt
> > @@ -0,0 +1,99 @@
> > +# Running various bench suites against LuaJIT.
> > +
> > +include(MakeLuaPath)
> > +
> > +if(CMAKE_BUILD_TYPE STREQUAL "Debug")
> > +  message(WARNING "LuaJIT and perf tests are built in the Debug mode."
> 
> s/./. /
> 
> missed whitespace after dot

Fixed, thanks!
===================================================================
diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
index c315597f..5a5cf777 100644
--- a/perf/CMakeLists.txt
+++ b/perf/CMakeLists.txt
@@ -3,7 +3,7 @@
 include(MakeLuaPath)
 
 if(CMAKE_BUILD_TYPE STREQUAL "Debug")
-  message(WARNING "LuaJIT and perf tests are built in the Debug mode."
+  message(WARNING "LuaJIT and perf tests are built in the Debug mode. "
                   "Timings may be affected.")
 endif()
 
===================================================================

> 
> > +                  "Timings may be affected.")
> > +endif()
> > +

<snipped>

> > +macro(AddBenchTarget perf_suite)
> > +  file(MAKE_DIRECTORY "${PERF_OUTPUT_DIR}/${perf_suite}/")
> > +  message(STATUS "Add perf suite ${perf_suite}")
> > +  add_custom_target(${perf_suite})
> > +  add_custom_target(${perf_suite}-console
> > +    COMMAND ${CMAKE_CTEST_COMMAND}
> > +      -L ${perf_suite}
> > +      --parallel 1
> > +      --verbose
> > +      --output-on-failure
> > +      --no-tests=error
> may be --schedule-random, --timeout XXX (default timeout is 10000000)?

I don't want to add `--schedule-random`. It is better to have a
deterministic execution order.
There is no need for a strict timeout anyway. We can't predict the
behaviour of the benchmark. The default timeout on CI runners is OK.
The user may set the timeout locally via the `CTEST_TEST_TIMEOUT`
environment variable.

> > +  )
> > +  add_dependencies(${perf_suite}-console luajit-main)
> > +endmacro()
> > +

<snipped>

> > +  set(BENCH_FLAGS
> > +    "--benchmark_out_format=json --benchmark_out=${bench_out_file}"
> > +  )
> > +  set(bench_command_flags ${bench_command} ${BENCH_FLAGS})
> > +  separate_arguments(bench_command_separated UNIX_COMMAND ${bench_command})
> > +  add_custom_command(
> > +    COMMAND ${CMAKE_COMMAND} -E env
> > +      LUA_PATH="${LUA_PATH}"
> > +      LUA_CPATH="${LUA_CPATH}"
> > +        ${bench_command_separated}
> > +          --benchmark_out_format=json
> > +          --benchmark_out="${bench_out_file}"
> previous two lines can be replaced with ${BENCH_FLAGS}, right?

No, this breaks the CMake generation, IINM.

> > +    OUTPUT ${bench_out_file}
> > +    DEPENDS luajit-main
> > +    COMMENT
> > +      "Running benchmark ${bench_title} saving results in ${bench_out_file}."
> > +  )
> > +  add_custom_target(${bench_name} DEPENDS ${bench_out_file})
> > +  add_dependencies(${perf_suite} ${bench_name})
> > +
> > +  # Report in the console.
> > +  add_test(NAME ${bench_title}
> > +    COMMAND sh -c "${bench_command}"
> > +  )
> > +  set_tests_properties(${bench_title} PROPERTIES
> > +    ENVIRONMENT "LUA_PATH=${LUA_PATH}"
> > +    LABELS ${perf_suite}
> > +    DEPENDS luajit-main
> > +  )
> > +  unset(input_file)
> > +endmacro()
> > +
> > +add_subdirectory(LuaJIT-benches)
> > +
> > +add_custom_target(${PROJECT_NAME}-perf
> > +  DEPENDS LuaJIT-benches
> missed a COMMENT field

Why do we need it?
There is no such field in the tests target, for example.

> > +)
> > +
> > +add_custom_target(${PROJECT_NAME}-perf-console
> > +  DEPENDS LuaJIT-benches-console
> missed a COMMENT field

Why do we need it?
The output is so huge anyway that this field will not be visible.

> > +)
> > diff --git a/perf/LuaJIT-benches/CMakeLists.txt b/perf/LuaJIT-benches/CMakeLists.txt
> > new file mode 100644
> > index 00000000..d9909f36
> > --- /dev/null
> > +++ b/perf/LuaJIT-benches/CMakeLists.txt
> > @@ -0,0 +1,52 @@
> > +set(PERF_SUITE_NAME LuaJIT-benches)
> > +set(LUA_BENCH_SUFFIX .lua)
> it is not a bench-specific suffix. May be LUA_SUFFIX?

This is just for unified naming like it is done in our tests.

> > +
> > +AddBenchTarget(${PERF_SUITE_NAME})
> > +
> > +# Input for the k-nucleotide and revcomp benchmarks.
> > +set(FASTA_NAME ${CMAKE_CURRENT_BINARY_DIR}/FASTA_5000000)
> > +add_custom_target(FASTA_5000000
> > +  COMMAND ${LUAJIT_BINARY}
> > +    ${CMAKE_CURRENT_SOURCE_DIR}/libs/fasta.lua 5000000 > ${FASTA_NAME}
> 
> FASTA_5000000 is a plain text file. I propose to add extension .txt for 
> its full name and
> 
> probably postfix "_autogenerated". Like we do this for SUMCOL_5000 and 
> SUMCOL_1.
> 

The FASTA_\d* is the name used in the PARAMS file and
<TEST_md5sum.txt>. I prefer not to change it to avoid confusion.

> > +  OUTPUT ${FASTA_NAME}
> > +  DEPENDS luajit-main
> > +  COMMENT "Generate ${FASTA_NAME}."
> > +)
> > +
> > +make_lua_path(LUA_PATH
> > +  PATHS
> > +    ${LUA_PATH_BENCH_BASE}
> > +    ${CMAKE_CURRENT_SOURCE_DIR}/libs/?.lua
> > +)
> > +
> > +# Input for the <sum-file.lua> benchmark.
> > +set(SUM_NAME ${CMAKE_CURRENT_BINARY_DIR}/SUMCOL_5000.txt)
> > +# Remove possibly existing file.
> > +file(REMOVE ${SUM_NAME})
> 
> Why do we need generate file after every cmake configuration?
> 
> I propose to skip generation if file already exist or regenerate if 
> SHA256 is not the same.

I'm not sure that this is a good idea, since the file may be changed
accidentally or replaced by mistake. The generation takes too little
time to matter.

> 
> > +

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 38/41] perf: add aggregator helper for bench statistics
  2025-11-18 12:31   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:41     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:41 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Fixed your comments.

On 18.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch adds a helper script to aggregate the benchmark results from
> > JSON files to the format parsable by the InfluxDB line protocol [1].
> 
> format cannot be parsed by protocol, please rephrase.
> 
> Something like "the format compatible with the InfluxDB line protocol"

Rephrased as you suggested.

> 
> >
> > All JSON files from each suite in the <perf/output> directory are
> > considered as the benchmark results and aggregated into the
> > <perf/output/summary.txt> file that can be posted to the InfluxDB. The
> > results are aggregated via the new target LuaJIT-perf-aggregate.
> may be say that cjson is required?

Mentioned it, and git as well.

> >
> > [1]:https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/

The updated commit message is the following:

| perf: add aggregator helper for bench statistics
|
| This patch adds a helper script to aggregate the benchmark results from
| JSON files to the format compatible with the InfluxDB line protocol [1].
|
| The script takes 2 command line arguments:
| | luajit aggregate.lua output_file [input_dir]
|
| If `input_dir` isn't given, it uses the current directory by default.
|
| The script requires either the `git` command or the `PERF_COMMIT` and
| `PERF_BRANCH` environment variables to be set. Also, it requires the
| `cjson` module.
|
| All JSON files from each suite in the <perf/output> directory are
| considered as the benchmark results and aggregated into the
| <perf/output/summary.txt> file that can be posted to the InfluxDB. The
| results are aggregated via the new target LuaJIT-perf-aggregate.
|
| [1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
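
To make the target format concrete, here is a minimal sketch of building
one line of the line protocol. It is not the code from <aggregate.lua>;
`join_pairs` and `format_line` are made-up names, and the sketch assumes
numeric field values and keys/values without characters that need
escaping (commas, spaces, equals signs).

```lua
-- Joins a key/value table as "k1=v1,k2=v2" with keys sorted, so
-- that the output is deterministic.
local function join_pairs(set)
  local keys = {}
  for key in pairs(set) do keys[#keys + 1] = key end
  table.sort(keys)
  local parts = {}
  for _, key in ipairs(keys) do
    parts[#parts + 1] = key .. "=" .. tostring(set[key])
  end
  return table.concat(parts, ",")
end

-- Hypothetical formatter: <measurement>,<tag_set> <field_set> <timestamp>.
-- Assumes numeric field values and no characters that need escaping.
local function format_line(measurement, tags, fields, timestamp)
  return string.format("%s,%s %s %d",
                       measurement, join_pairs(tags), join_pairs(fields),
                       timestamp)
end

print(format_line("series", {branch = "master", commit = "abc1234"},
                  {items_per_second = 3.5}, 1766736000))
```

The tag values here (branch, commit) are the ones useful for filtering,
and the field carries the measured value.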

> > ---
> >   perf/CMakeLists.txt        |  13 ++++
> >   perf/helpers/aggregate.lua | 124 +++++++++++++++++++++++++++++++++++++
> >   2 files changed, 137 insertions(+)
> >   create mode 100644 perf/helpers/aggregate.lua
> >
> > diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> > index cc3c312f..68e561fd 100644
> > --- a/perf/CMakeLists.txt
> > +++ b/perf/CMakeLists.txt

<snipped>

> > diff --git a/perf/helpers/aggregate.lua b/perf/helpers/aggregate.lua
> > new file mode 100644
> > index 00000000..12a8ab89
> > --- /dev/null
> > +++ b/perf/helpers/aggregate.lua
> > @@ -0,0 +1,124 @@
> > +local json = require('cjson')
> What if cjson is absent? Do we want to handle error?

No, it is required. Mentioned it in the commit message and in the
comment below as you suggested.

> > +
> > +-- File to aggregate the benchmark results from JSON files to the
> > +-- format parsable by the InfluxDB line protocol [1]:
> > +-- <measurement>,<tag_set> <field_set> <timestamp>
> > +--
> > +-- <tag_set> and <field_set> have the following format:
> > +-- <key1>=<value1>,<key2>=<value2>
> > +--
> > +-- The reported tag set is a set of values that can be used for
> > +-- filtering data (i.e., branch or benchmark name).
> > +--
> > +-- luacheck: push no max comment line length
> > +--
> > +-- [1]:https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
> > +--
> > +-- luacheck: pop
> 
> I propose to document command-line options
> 
> (1st arg is output file, 2nd arg is a dir, "current dir by default"),
> 
> env variables (PERF_COMMIT, PERF_BRANCH) and requirements
> 
> (git is an optional requirement, cjson Lua module is mandatory).

Added the comment:

===================================================================
diff --git a/perf/helpers/aggregate.lua b/perf/helpers/aggregate.lua
index 12a8ab89..01410a47 100644
--- a/perf/helpers/aggregate.lua
+++ b/perf/helpers/aggregate.lua
@@ -15,6 +15,14 @@ local json = require('cjson')
 -- [1]: https://docs.influxdata.com/influxdb/v2/reference/syntax/line-protocol/
 --
 -- luacheck: pop
+--
+-- The script takes 2 command line arguments:
+-- | luajit aggregate.lua output_file [input_dir]
+-- If `input_dir` isn't given, it uses the current directory by
+-- default.
+-- The script requires the `git` command or specified
+-- `PERF_COMMIT`, `PERF_BRANCH` environment variables. Also, it
+-- requires the `cjson` module.
 
 local output = assert(arg[1], 'Output file is required as the first argument')
 local input_dir = arg[2] or '.'
===================================================================
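
The "environment variable or `git`" rule from the comment above can be
sketched like this. It is an illustration under assumptions, not the
script's code: `resolve` is a made-up helper, and its `getenv` and `run`
parameters exist only so that the example can be exercised with stubs.

```lua
-- Runs a shell command and returns its output without trailing
-- whitespace, or nil when the command cannot be started or prints
-- nothing.
local function exec(cmd)
  local pipe = io.popen(cmd)
  if not pipe then return nil end
  local out = pipe:read("*a")
  pipe:close()
  -- Parentheses drop the second value (match count) of gsub.
  out = (out:gsub("%s+$", ""))
  if out == "" then return nil end
  return out
end

-- Hypothetical resolver: prefers the environment variable and falls
-- back to the command. `getenv` and `run` are injectable for tests.
local function resolve(var, cmd, getenv, run)
  getenv = getenv or os.getenv
  run = run or exec
  local value = getenv(var)
  if value and value ~= "" then return value end
  return run(cmd)
end

-- Demonstration with stubs instead of the real environment and git.
local fake_env = { PERF_BRANCH = "master" }
local function getenv(name) return fake_env[name] end
local function run() return "abc1234" end

print(resolve("PERF_BRANCH", "git rev-parse --abbrev-ref HEAD", getenv, run))
print(resolve("PERF_COMMIT", "git rev-parse --short HEAD", getenv, run))
```

In this sketch an empty command output is mapped to nil, so a following
`assert` can report that the value could not be determined.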

> 
> > +
> > +local output = assert(arg[1], 'Output file is required as the first argument')
> > +local input_dir = arg[2] or '.'
> > +
> > +local out_fh = assert(io.open(output, 'w+'))
> > +
> > +local function exec(cmd)
> > +  return io.popen(cmd):read('*all'):gsub('%s+$', '')
> > +end
> > +
> > +local commit = os.getenv('PERF_COMMIT') or exec('git rev-parse --short HEAD')
> > +assert(commit, 'can not determine the commit')
> > +
> > +local branch = os.getenv('PERF_BRANCH') or
> > +  exec('git rev-parse --abbrev-ref HEAD')
> > +assert(branch, 'can not determine the branch')
> > +

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 39/41] perf: add a script for the environment setup
  2025-11-18 12:36   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:41     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:41 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 18.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > The patch adds a script for setting the environment before running
> > performance tests. Most of the settings are taken from the Tarantool's
> > wiki page dedicated to the benchmarking [1].
> 
> Honestly, I don't like that we have similar files in two repositories 
> (aggregate, setup-env.sh).

The <aggregate.lua> file is different because different libraries are
available. Also, it has its own logic for scanning directories.

<setup-env.sh> may also diverge between the repositories over time,
since Tarantool and LuaJIT may require different setups.

> 
> This makes maintenance more complicated. I would put these files to a 
> shared repository

No. It isn't a good approach. As we discussed a thousand times with Igor
before, there is no need for a third-party dependency in this
repository; it should be self-sufficient. Trust me, you don't want to
investigate some regression in the performance testing results because
somebody updated a third-party repository. This is why we have our
own tap module for the <tarantool-tests> suite. As you can see, we have
zero problems with it. Also, consider my comment above.

> 
> and reuse it for tarantool and luajit repos.
> 
> Original files are even not specified in commit messages for 
> setup-env.sh and aggregate.lua,

Mentioned that this file was taken from the Tarantool repository in the
commit message.

> 
> I think it is worth it. No sense to review these files if it was already 
> done previously.

It is good to check that I didn't mess up the content by accident,
anyway.

> 
> >
> > [1]:https://github.com/tarantool/tarantool/wiki/Benchmarking
> > ---

The updated commit message is the following:

| perf: add a script for the environment setup
|
| The patch adds a script for setting the environment before running
| performance tests. The script originated from the Tarantool repository
| [1]. Most of the settings are taken from the Tarantool wiki page
| dedicated to benchmarking [2].
|
| [1]: https://github.com/tarantool/tarantool/blob/dcdb3ee83b3d6324011e704b5a3f4ee3e19bbf47/perf/tools/setup_env.sh
| [2]: https://github.com/tarantool/tarantool/wiki/Benchmarking

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 40/41] perf: provide CMake option to setup the benchmark
  2025-11-18 12:51   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:42     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:42 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 18.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch introduces the `LUAJIT_BENCH_INIT` option to determine the
> 
> it is actually a runner for benchmark, not a command that runs before 
> the benchmark itself.
> 
> Please rephrase.

Rephrased as discussed offline.

> 
> 
> I would rename a cmake option appropriately. 
> LUAJIT_BENCH_EXEC/LUAJIT_BENCH_RUNNER?

I prefer to leave it as is to be consistent with the naming in tests.

> 
> Feel free to keep as is, I don't insist.
> 
> > shell command to be run before the benchmark itself. It may be useful to
> > set taskset, etc.
> > ---

The updated commit message is the following:

| perf: provide CMake option to setup the benchmark
|
| This patch introduces the `LUAJIT_BENCH_INIT` option to determine the
| shell command prefix to be run before the benchmark itself. It may be
| useful to set taskset, etc.
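
A minimal sketch of the "command prefix" idea, not the code from the
patch: the helper name `run_with_prefix` is hypothetical, and the real
CMake wiring may differ. The prefix is a plain string that is split
into words and placed in front of the benchmark command, so an empty
prefix leaves the command untouched.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an optional prefix,
# similar in spirit to what LUAJIT_BENCH_INIT is described to do.
run_with_prefix() {
  prefix=$1
  shift
  # Intentionally unquoted: the prefix is split into separate words,
  # and an empty prefix expands to nothing.
  $prefix "$@"
}

# No prefix: the command runs as is.
run_with_prefix "" echo bench
# With a prefix: `env` stands in for something like taskset here.
run_with_prefix "env BENCH_DEMO=1" sh -c 'echo "$BENCH_DEMO"'
```

Word splitting keeps the sketch short, but it also means that a prefix
argument containing spaces cannot be passed this way.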

> >   perf/CMakeLists.txt | 9 ++++++++-
> >   1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/perf/CMakeLists.txt b/perf/CMakeLists.txt
> > index 68e561fd..c315597f 100644
> > --- a/perf/CMakeLists.txt
> > +++ b/perf/CMakeLists.txt
> > @@ -7,6 +7,13 @@ if(CMAKE_BUILD_TYPE STREQUAL "Debug")
> >                     "Timings may be affected.")
> >   endif()
> >   
> > +# The shell command needs to be run before benchmarks are started.
> > +if(LUAJIT_BENCH_INIT)
> > +  message(STATUS
> > +    "The following command will run before benchmarks: '${LUAJIT_BENCH_INIT}'."
> > +  )
> > +endif()
> this message is not necessary, one can see it in "ctest -V" output

I suppose it is good to mention it in the configuration phase as well.

<snipped>

-- 
Best regards,
Sergey Kaplun


* Re: [Tarantool-patches] [PATCH v1 luajit 41/41] ci: introduce the performance workflow
  2025-11-18 13:08   ` Sergey Bronnikov via Tarantool-patches
@ 2025-12-26  8:43     ` Sergey Kaplun via Tarantool-patches
  0 siblings, 0 replies; 134+ messages in thread
From: Sergey Kaplun via Tarantool-patches @ 2025-12-26  8:43 UTC (permalink / raw)
  To: Sergey Bronnikov; +Cc: tarantool-patches

Hi, Sergey!
Thanks for the review.
Please, consider my answers below.

On 18.11.25, Sergey Bronnikov wrote:
> Hi, Sergey,
> 
> thanks for the patch! See my comments.
> 
> Sergey
> 
> On 10/24/25 14:00, Sergey Kaplun wrote:
> > This patch adds the workflow to run benchmarks from various suites,
> > aggregate their results, and send statistics to the InfluxDB to be
> > processed later.
> >
> > The workflow contains a matrix to measure GC64 and non-GC64 modes with
> > enabled/disabled JIT for x64 architecture.
> > ---

<snipped>

> > +      run: |
> > +        apt -y update
> > +        apt install -y luarocks curl
> > +      shell: bash
> > +    - name: Install Lua modules
> > +      run: luarocks install lua-cjson
> > +      shell: bash
> > +    - name: Run script to setup Linux environment
> > +      run: sh ./perf/helpers/setup_env.sh
> > +      shell: bash
> bash or shell is used in the last step? (shebang in setup_env.sh is 
> /bin/sh)

It uses the `sh` command anyway, but I prefer to leave this line as is
for consistency.

> > diff --git a/.github/workflows/performance.yml b/.github/workflows/performance.yml
> > new file mode 100644
> > index 00000000..bfb6be97
> > --- /dev/null
> > +++ b/.github/workflows/performance.yml

<snipped>

> > +concurrency:
> > +  # An update of a developer branch cancels the previously
> > +  # scheduled workflow run for this branch. However, the default
> > +  # branch, and long-term branch (tarantool/release/2.11,
> > +  # tarantool/release/2.10, etc) workflow runs are never canceled.
> > +  #
> it is not relevant, right?

Why not? Until the results are sent to InfluxDB, the job isn't
finished, so a run on the default branch could be aborted that way if
we need to push dummy commits.

> > +  # We use a trick here: define the concurrency group as 'workflow
> > +  # run ID' + # 'workflow run attempt' because it is a unique
> > +  # combination for any run. So it effectively discards grouping.
> > +  #
> > +  # XXX: we cannot use `github.sha` as a unique identifier because
> > +  # pushing a tag may cancel a run that works on a branch push
> > +  # event.
> > +  group: ${{ startsWith(github.ref, 'refs/heads/tarantool/')
> > +    && format('{0}-{1}', github.run_id, github.run_attempt)
> > +    || format('{0}-{1}', github.workflow, github.ref) }}
> > +  cancel-in-progress: true
> > +
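
The selection logic of the `group` expression above can be mirrored in
a few lines of shell. This is only an illustration of the quoted
expression: the function name and its arguments are hypothetical
stand-ins for the `github.*` context values.

```shell
#!/bin/sh
# Mirrors: startsWith(ref, 'refs/heads/tarantool/')
#            && '<run_id>-<run_attempt>' || '<workflow>-<ref>'
concurrency_group() {
  ref=$1 workflow=$2 run_id=$3 run_attempt=$4
  case $ref in
    refs/heads/tarantool/*)
      # Unique per run, so nothing ever shares the group.
      printf '%s-%s\n' "$run_id" "$run_attempt" ;;
    *)
      # Shared per branch, so a new push cancels the previous run.
      printf '%s-%s\n' "$workflow" "$ref" ;;
  esac
}

concurrency_group refs/heads/tarantool/master perf 42 1  # prints 42-1
concurrency_group refs/heads/my-feature perf 42 1        # prints perf-refs/heads/my-feature
```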

<snipped>

> > +      - name: configure
> > +        # The taskset alone will pin all the process threads
> > +        # into a single (random) isolated CPU, see
> > +        #https://bugzilla.kernel.org/show_bug.cgi?id=116701.
> > +        # The workaround is using realtime scheduler for the
> > +        # isolated task using chrt, e. g.:
> > +        # sudo taskset 0xef chrt 50.
> > +        # But this makes the process use non-standard, real-time
> > +        # round-robin scheduling mechanism.
> > +        run: >
> > +          cmake -S . -B ${{ env.BUILDDIR }}
> > +          -DCMAKE_BUILD_TYPE=RelWithDebInfo
> 
> RelWithDebInfo is -O2 (moderate optimization), Release is -O3 (high 
> optimization).
> 
> Do we really need  RelWithDebInfo? I think it deserves a comment.

It is the default build type for Tarantool as well. We have no Release
builds, IINM. Also, the hot code is mostly executed either in the VM or
on trace, where these flags are irrelevant (excluding table lookups).

> 
> > +          -DLUAJIT_ENABLE_PERF=ON
> > +          -DLUAJIT_BENCH_INIT="taskset 0xfe chrt 50"
> > +          -DLUAJIT_DISABLE_JIT=${{ matrix.JOFF }}
> > +          -DLUAJIT_ENABLE_GC64=${{ matrix.GC64 }}

<snipped>

> > +      - name: perf
> this name is visible in Web UI. I would make it more descriptive (execute
> performance benchmarks?)

I prefer to leave it as is to be consistent with naming in tests. All
description is given in the workflow name anyway.

<snipped>

Also, fixed the workflow name:

===================================================================
diff --git a/.github/workflows/performance.yml b/.github/workflows/performance.yml
index bfb6be97..fe22eb4b 100644
--- a/.github/workflows/performance.yml
+++ b/.github/workflows/performance.yml
@@ -56,7 +56,7 @@ jobs:
     name: >
       LuaJIT
       GC64:${{ matrix.GC64 }}
-      JOFF:${{ matrix.GC64 }}
+      JOFF:${{ matrix.JOFF }}
     steps:
       - uses: actions/checkout@v4
         with:
===================================================================
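
A small sketch of why the typo mattered: with `matrix.GC64` used for
both fields, different matrix cells end up with the same job name.
It assumes that both axes take the values ON and OFF (the matrix
itself is snipped above), and the function names are hypothetical.

```shell
#!/bin/sh
# Job name as built before and after the fix.
job_name_buggy() { printf 'LuaJIT GC64:%s JOFF:%s\n' "$1" "$1"; }
job_name_fixed() { printf 'LuaJIT GC64:%s JOFF:%s\n' "$1" "$2"; }

# $1 is the name of the function used to build the job name.
count_unique_names() {
  for gc64 in ON OFF; do
    for joff in ON OFF; do
      "$1" "$gc64" "$joff"
    done
  done | sort -u | wc -l | tr -d ' '
}

count_unique_names job_name_buggy  # prints 2
count_unique_names job_name_fixed  # prints 4
```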

-- 
Best regards,
Sergey Kaplun

