[Tarantool-patches] [PATCH resend v2 02/11] lua: built-in module datetime

Timur Safin tsafin at tarantool.org
Sat Jul 31 19:51:44 MSK 2021


: From: Oleg Babin <olegrok at tarantool.org>
: Subject: Re: [Tarantool-patches] [PATCH resend v2 02/11] lua: built-in
: module datetime
: 
: Thanks for your answers.
: 
: On 30.07.2021 22:00, Timur Safin wrote:
: > Ok, here is a selection list which I've seen:
: > - decimal is in LuaC;
: > - uuid is in ffi;
: >
: > It looked equally acceptable and I've selected the one which is
: > more convenient - via ffi. Now, you've pointed me to the probable
: > performance issues we have with ffi, which are not there with LuaC.
: > I believe they have been fixed with that Mike fix for NYI in
: > structure allocations (after Sergos dirty fix) and once we will land
: > this patch to our local fork those performance issues for structures
: > allocations will gone.
: >
: > In any case, 1st version of module is not yet a proper moment to start
: > to worry about performance, more rather about clarity and quality of API.
: > We need to return to benchmarking of this code soon after code freeze
: > lifted.
: 
: I don't know why uuid is implemented via FFI. Uuid as module was
: introduced at least 8 yeard ago.
: 
: Probably there were supporters of the theory that FFI/LuaJIT
: implementation could be fast.
: 
: But currently we can see something like
: https://github.com/luafun/luafun/issues/31#issuecomment-277508076
: 
: and rewrite netbox in C.

"I feel your pain, bro!" :)

[here follows seemingly irrelevant speech, but actually it's on topic]

I knew you have a plenty of LuaJIT experience, and sometimes, 
you might want to throw it out altogether. All that pain of debugging JIT
performance issues. Woodoo and black magics of LuaJIT debugging. General 
unpredictability of Mike responses. And even lack of convenient tooling
here and there. I do understand how all this could exhaust eventually, 
and make to loss of believe in LuaJIT usefulness.

But we should not. We have skilled LuaJIT team here, which could handle
all sorts of problems. We have huge amount of Lua code, and have no any
reason to start the process of getting rid of it. Yes, there are cases 
which might hit some performance degradations, but usually there are 
reasonable workarounds. (Whom am I teaching here? You are much more aware
of all these tricks than I am!)

Having our background and grownup skillset, there are actually no much 
reasons to try to drastically reduce amount of LuaJIT usage, because 
we continue to be used as LuaJIT-based application server for all 
foreseeable future.
[/end of seemingly irrelevant speech]

To make this story short, I believe there is no much point to start 
rewrite of this module in LuaC before we have it published and landed to 
the Tarantool code-base. We need to make it feature complete first, then 
"blazingly fast". In that order.

: 
: And it's not about patches in LuaJIT about structure initialization. You
: can easily cherry-pick it into
: 
: our fork and show that FFI is really better than Lua C API. It's about
: an approach. We saw benchmarks
: 
: when decimal was introduced but currently we didn't see anything.
: 
: > : > +    local y, M, d, ymd
: > : > +    y, M, d, ymd = 0, 0, 0, false
: > : > +
: > : > +    local h, m, s, frac, hms
: > : > +    h, m, s, frac, hms = 0, 0, 0, 0, false
: > : > +
: > : > +    local dt = 0
: > : > +
: > : > +    for key, value in pairs(o) do
: > : > +        local handlers = {
: > :
: > : Here you recreate handlers for each iteration of the loop. I think it
: > : should be reworked.
: >
: > Yes, let me check how better will be if this handlers array will be
: invariant
: > in the loop, and not created each iteration.
: >
: > I still prefer this tag to function approach because of a better clarity,
: > but I need to verify in benchmark whether it provides acceptable
: performance.
: >
: > I do not expect thought that this attribute object creation approach would
: become
: > a bootleneck in any case and might be on a critical path anywhere (I'd
: rather expect
: > it for datetime objects created after parsing of massive bunch of log
: text),
: > but need to look into timings we would have with invariant table.
: >
: > :
: > : Currenlty it's quite slow I think even if-elseif-else branches will work
: > : faster without
: > :
: > : creating redundant GC objects.
: >
: > Will see ...
: 
: I'll simplify your task a bit:
: 
: local clock = require('clock')
: 
: local mt_fun = function(_, key)
:      local mt = {
:          ['key'] = function()
:              return 'value'
:          end
:      }
:      return mt[key] ~= nil and mt[key]()
: end
: 
: local mt_table = {
:      ['key'] = function()
:          return 'value'
:      end
: }
: 
: local t1 = setmetatable({}, {__index = mt_fun})
: local t2 = setmetatable({}, {__index = mt_table})
: local _
: 
: local start = clock.time()
: for _ = 1, 1e6 do
:      _ = t1['key']
: end
: print('function', clock.time() - start)
: 
: collectgarbage()
: collectgarbage()
: 
: local start = clock.time()
: for _ = 1, 1e6 do
:      _ = t2['key']
: end
: print('table', clock.time() - start)
: 
: 
: tarantool test.lua
: function    0.14332509040833
: table    0.0002901554107666
: 
: 
: The difference about x500 times on my Mac.

Thanks! That's convincing! Thanks for this experiment,
much appreciated!

Interesting observation, though, that once we remove local 
mt table creation from inside of mt_fun to the module scope, making
it invariant for the cycle. The difference is not that much
huge. 

Debug build on WSL::
- your original version:

```
19:22 $ ../build/src/tarantool bench-index.lua 
function        0.16711115837097
table   0.00081968307495117
```

- modified version with invariant mt table:

```
19:25 $ ../build/src/tarantool bench-index.lua 
function        0.00028538703918457
table   0.00028634071350098
```

```lua
local clock = require('clock')

local mt = {
    ['key'] = function()
        return 'value'
    end
}
local mt_fun = function(_, key)
    -- return mt[key]()
    return mt[key] ~= nil and mt[key]()
end

local mt_table = {
    ['key'] = function()
        return 'value'
    end
}

local t1 = setmetatable({}, {__index = mt_fun})
local t2 = setmetatable({}, {__index = mt_table})
local _

local start = clock.time()
for _ = 1, 1e6 do
     _ = t1['key']
end
print('function', clock.time() - start)

collectgarbage()
collectgarbage()

local start = clock.time()
for _ = 1, 1e6 do
     _ = t2['key']
end
print('table', clock.time() - start)
```

So once again we have learned that it's not a good idea
to create invariant table inside of loop, and simple 
loop invariant removal might change things dramatically.

Thanks for the lesson!
: 
: > : > +end
: > : > +
: > : > +local function strftime(fmt, o)
: > : > +    assert(ffi.typeof(o) == datetime_t)
: > : > +    local sz = 50
: > : Why 50?
: >
: > Good question, for ISO-8601 formats, for normal date range
: > (from 0000 till 9999 years) it's more like closer to 40, but
: > well, there will be various timezones and all crazy kinds of
: > input format combinations, so we need to either be precautious
: > with larger sizes allocations.
: 
: On the one side you are right. But if user want to format string with
: length more than 50
: 
: it will be silently truncated.

No, no, no, please see my 2 pass variant suggested below. We will ask
`strftime` first to calculate the length for us (passing NULL as buffer
pointer), then allocate, and fill the allocated buffer with content 
the 2nd pass. 

: 
: 
: ```
: 
: tarantool> datetime.strftime('%d' .. string.rep('content', 50),
: datetime.new())
: ---
: - 01contentcontentcontentcontentcontentcontentconten
: ...
: 
: ```
: 
: I'm not sure that anybody will use strftime in such way but it's strange.
: 
: Maybe you could use preallocated buffer for short strings and allocate
: huge if
: 
: it's not enough. Some kind of optimistic approach.
: 
: 
: > Or probably rely on glibc convention (which is apparently not in
: > POSIX/SUSV) with 2 passes approach:
: > 1. call with NULL so it will return size of required buffer;
: > 2. then, after allocation, with the adjusted allocated buffer;
: >
: > https://www.gnu.org/software/libc/manual/html_node/Formatting-Calendar-
: Time.html
: > https://pubs.opengroup.org/onlinepubs/000095399/functions/strftime.html
: >
: > Something like that (beware of inline patch):
: >
: > ----------------------------------------------
: > local function strftime(fmt, o)
: >       check_date(o, "datetime.strftime(fmt, date)")
: > -    local sz = 50
: > -    local buff = ffi.new('char[?]', sz)
: >       local p_tm = datetime_to_tm_ptr(o)
: > -    native.strftime(buff, sz, fmt, p_tm)
: > +    local sz = builtin.strftime(nil, 1024, fmt, p_tm)
: > +    local buff = ffi.new('char[?]', sz + 1)
: > +    builtin.strftime(buff, sz, fmt, p_tm)
: >       return ffi.string(buff)
: >   end
: > ----------------------------------------------

Thanks,
Timur



More information about the Tarantool-patches mailing list