From: Sergey Kaplun via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Sergey Bronnikov <sergeyb@tarantool.org> Cc: tarantool-patches@dev.tarantool.org Subject: Re: [Tarantool-patches] [PATCH] serializer: serialize recursive structures Date: Wed, 21 Apr 2021 21:20:31 +0300 [thread overview] Message-ID: <YIBs7112r1thKf7f@root> (raw) In-Reply-To: <092bc107-40cd-2c58-7a90-236663551249@tarantool.org> Hi! Thanks for benchmarks! I like them! But I am tormented by doubts whether it is worth keeping it in a semi-working state in our repository. Shouldn't we first introduce the architecture of performance tests, and then add some new benchmarks? I'm totally for the presence of such benchmarks in our repo, but I'm afraid to make a real mess of things. Also, I suppose that it should be a collegeal descicion (not only 2 reviewers). On 21.04.21, Sergey Bronnikov wrote: > > On 21.04.2021 12:27, Sergey Kaplun via Tarantool-patches wrote: > > Hi! > > > > Thanks for the fixes and the benchmarks! > > I propose an updated version of script and propose to add benchmark as a > part of patch series. > > --[[ > -- measure serialization time > -- run: taskset -c 1 tarantool perf.lua > ]] > > local clock = require('clock') > > local function elapsed(f, n) > local t0 = clock.monotonic() I've read discussions that we should use CLOCK_PROCESS_CPUTIME_ID instead of CLOCK_MONOTONIC. Is it one preferable than another? > for i = 1, n do > f() > end > local t1 = clock.monotonic() > return t1 - t0 > end Shouldn't we still take into account the time to call the function `clock.monotonic()`? > > -- Get the mean of a table > function calculate_mean(t) > local sum = 0 > local count= 0 > for k, v in pairs(t) do > if type(v) == 'number' then 2 general points: 1) use `for i = 1, #t do` instead pairs (as far as we use #t later) 2) use `assert(type(v) == 'number')` instead of if statement. Here and below. > sum = sum + v > end > end > > return (sum/#t) > end > > -- Get the median of a table. > function calculate_median(t) > local temp = {} > -- deep copy table so that when we sort it, the original is unchanged Are we interested in unchanged original table? May be it will be better to sort values inside before and work with them? > -- also weed out any non numbers > for k, v in pairs(t) do > if type(v) == 'number' then > table.insert(temp, v) > end > end > table.sort(temp) > -- If we have an even number of table elements or odd. > if math.fmod(#temp,2) == 0 then > -- return mean value of middle two elements > return (temp[#temp/2] + temp[(#temp/2)+1]) / 2 > else > -- return middle element > return temp[math.ceil(#temp/2)] > end > end > > -- Get the standard deviation of a table > function calculate_stddev(t) > local vm Sorry, don't get it: why do you choose this scope for the variable? > local sum = 0 > local mean = calculate_mean(t) > for k, v in pairs(t) do > if type(v) == 'number' then > vm = v - mean > sum = sum + (vm * vm) > end > end > > return math.sqrt(sum/(#t - 1)) > end > > local function timeit(f, name) > print('======================================') > print(name) > print('======================================') > local res = {} > local iterations = 10 > local elapsed_time = 0 > local repetitions = 150000 It will be nice to have an ability to configure this value (via config or environment variables). > for j = 1, iterations do > -- warming > for i = 1, 100 do f() end > -- measurement > elapsed_time = elapsed(f, repetitions) > table.insert(res, elapsed_time) > print(string.format("%-2d - %f sec / %d repetitions", j, > elapsed_time, repetitions)) > end > print(string.format("time mean %f", calculate_mean(res))) > print(string.format("time median %f", calculate_median(res))) > print(string.format("time stddev %f", calculate_stddev(res))) > end > > local t = {{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}} > timeit(function() > local console = require('console') > console.set_default_output('yaml') > return console.eval(tostring(t)) > end, 'serializer console yaml') It is better to move out all this actions and create a local copy of called functions. For example: =================================================================== local tstr = tostring(t) local console = require('console') console.set_default_output('yaml') local eval = console.eval timeit(function() return eval(tstr) end, 'serializer console yaml') =================================================================== It reduces time by 5% (stderr is 0.5%). > timeit(function() > local console = require('console') > console.set_default_output('lua') > return console.eval(tostring(t)) > end, 'serializer console lua') > timeit(function() > local serializer = require('json') > serializer.cfg({encode_max_depth = 64}) > return serializer.encode(t) > end, 'serializer json') > timeit(function() > local serializer = require('yaml') > serializer.cfg({encode_max_depth = 64}) > return serializer.encode(t) > end, 'serializer yaml') > timeit(function() > local serializer = require('msgpack') > serializer.cfg({encode_max_depth = 64}) > return serializer.encode(t) > end, 'serializer msgpack') > timeit(function() > local serializer = require('msgpackffi') > return serializer.encode(t) > end, 'serializer msgpackffi') > -- Best regards, Sergey Kaplun
next prev parent reply other threads:[~2021-04-21 18:21 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-03-11 5:49 Roman Khabibov via Tarantool-patches 2021-03-14 12:42 ` Sergey Kaplun via Tarantool-patches 2021-04-05 22:05 ` Roman Khabibov via Tarantool-patches 2021-04-21 9:27 ` Sergey Kaplun via Tarantool-patches 2021-04-21 13:12 ` Sergey Bronnikov via Tarantool-patches 2021-04-21 18:20 ` Sergey Kaplun via Tarantool-patches [this message] 2021-04-13 15:54 ` Sergey Bronnikov via Tarantool-patches 2021-04-15 20:39 ` Roman Khabibov via Tarantool-patches 2021-04-21 12:34 ` Sergey Bronnikov via Tarantool-patches 2021-07-07 22:42 ` Alexander Turenko via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=YIBs7112r1thKf7f@root \ --to=tarantool-patches@dev.tarantool.org \ --cc=sergeyb@tarantool.org \ --cc=skaplun@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH] serializer: serialize recursive structures' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox