[Tarantool-patches] [PATCH resend v2 02/11] lua: built-in module datetime

Oleg Babin olegrok at tarantool.org
Sat Jul 31 09:29:30 MSK 2021


Thanks for your answers.

On 30.07.2021 22:00, Timur Safin wrote:
> Ok, here is a selection list which I've seen:
> - decimal is in LuaC;
> - uuid is in ffi;
>
> It looked equally acceptable and I've selected the one which is
> more convenient - via ffi. Now, you've pointed me to the probable
> performance issues we have with ffi, which are not there with LuaC.
> I believe they have been fixed with that Mike fix for NYI in
> structure allocations (after Sergos dirty fix) and once we will land
> this patch to our local fork those performance issues for structures
> allocations will gone.
>
> In any case, 1st version of module is not yet a proper moment to start
> to worry about performance, more rather about clarity and quality of API.
> We need to return to benchmarking of this code soon after code freeze
> lifted.

I don't know why uuid is implemented via FFI. Uuid as module was 
introduced at least 8 yeard ago.

Probably there were supporters of the theory that FFI/LuaJIT 
implementation could be fast.

But currently we can see something like 
https://github.com/luafun/luafun/issues/31#issuecomment-277508076

and rewrite netbox in C.

And it's not about patches in LuaJIT about structure initialization. You 
can easily cherry-pick it into

our fork and show that FFI is really better than Lua C API. It's about 
an approach. We saw benchmarks

when decimal was introduced but currently we didn't see anything.

> : > +    local y, M, d, ymd
> : > +    y, M, d, ymd = 0, 0, 0, false
> : > +
> : > +    local h, m, s, frac, hms
> : > +    h, m, s, frac, hms = 0, 0, 0, 0, false
> : > +
> : > +    local dt = 0
> : > +
> : > +    for key, value in pairs(o) do
> : > +        local handlers = {
> :
> : Here you recreate handlers for each iteration of the loop. I think it
> : should be reworked.
>
> Yes, let me check how better will be if this handlers array will be invariant
> in the loop, and not created each iteration.
>
> I still prefer this tag to function approach because of a better clarity,
> but I need to verify in benchmark whether it provides acceptable performance.
>
> I do not expect thought that this attribute object creation approach would become
> a bootleneck in any case and might be on a critical path anywhere (I'd rather expect
> it for datetime objects created after parsing of massive bunch of log text),
> but need to look into timings we would have with invariant table.
>
> :
> : Currenlty it's quite slow I think even if-elseif-else branches will work
> : faster without
> :
> : creating redundant GC objects.
>
> Will see ...

I'll simplify your task a bit:

local clock = require('clock')

local mt_fun = function(_, key)
     local mt = {
         ['key'] = function()
             return 'value'
         end
     }
     return mt[key] ~= nil and mt[key]()
end

local mt_table = {
     ['key'] = function()
         return 'value'
     end
}

local t1 = setmetatable({}, {__index = mt_fun})
local t2 = setmetatable({}, {__index = mt_table})
local _

local start = clock.time()
for _ = 1, 1e6 do
     _ = t1['key']
end
print('function', clock.time() - start)

collectgarbage()
collectgarbage()

local start = clock.time()
for _ = 1, 1e6 do
     _ = t2['key']
end
print('table', clock.time() - start)


tarantool test.lua
function    0.14332509040833
table    0.0002901554107666


The difference about x500 times on my Mac.

> : > +end
> : > +
> : > +local function strftime(fmt, o)
> : > +    assert(ffi.typeof(o) == datetime_t)
> : > +    local sz = 50
> : Why 50?
>
> Good question, for ISO-8601 formats, for normal date range
> (from 0000 till 9999 years) it's more like closer to 40, but
> well, there will be various timezones and all crazy kinds of
> input format combinations, so we need to either be precautious
> with larger sizes allocations.

On the one side you are right. But if user want to format string with 
length more than 50

it will be silently truncated.


```

tarantool> datetime.strftime('%d' .. string.rep('content', 50), 
datetime.new())
---
- 01contentcontentcontentcontentcontentcontentconten
...

```

I'm not sure that anybody will use strftime in such way but it's strange.

Maybe you could use preallocated buffer for short strings and allocate 
huge if

it's not enough. Some kind of optimistic approach.


> Or probably rely on glibc convention (which is apparently not in
> POSIX/SUSV) with 2 passes approach:
> 1. call with NULL so it will return size of required buffer;
> 2. then, after allocation, with the adjusted allocated buffer;
>
> https://www.gnu.org/software/libc/manual/html_node/Formatting-Calendar-Time.html
> https://pubs.opengroup.org/onlinepubs/000095399/functions/strftime.html
>
> Something like that (beware of inline patch):
>
> ----------------------------------------------
> local function strftime(fmt, o)
>       check_date(o, "datetime.strftime(fmt, date)")
> -    local sz = 50
> -    local buff = ffi.new('char[?]', sz)
>       local p_tm = datetime_to_tm_ptr(o)
> -    native.strftime(buff, sz, fmt, p_tm)
> +    local sz = builtin.strftime(nil, 1024, fmt, p_tm)
> +    local buff = ffi.new('char[?]', sz + 1)
> +    builtin.strftime(buff, sz, fmt, p_tm)
>       return ffi.string(buff)
>   end
> ----------------------------------------------
>
> : > +    local buff = ffi.new('char[?]', sz)
> : > +    local p_tm = datetime_to_tm_ptr(o)
> : > +    native.strftime(buff, sz, fmt, p_tm)
> : > +    return ffi.string(buff)
> : > +end
> : > +
> : > +-- strftime may be redirected to datetime:fmt("format")
> : > +local function datetime_fmt()
> : > +end
> : > +
> : > +
> : > +ffi.metatype(interval_t, interval_mt)
> : > +ffi.metatype(datetime_t, datetime_mt)
> : > +
> : > +return setmetatable(
> : > +    {
> : > +        datetime = datetime_new,
> : > +        interval = interval_new,
> : > +
> : > +        parse = parse_str,
> : > +        parse_date = parse_date,
> : > +        parse_time = parse_time,
> : > +        parse_zone = parse_zone,
> : > +        fmt = datetime_fmt,
> : > +
> : > +        now = local_now,
> : > +    -- strptime = strptime;
> : It should be dropped if you don't need it.
> : > +        strftime = strftime,
> : > +        asctime = asctime,
> : > +        ctime = ctime,
> : > +    }, {
> : > +        __call = function(self, ...) return datetime_from(...) end
> : > +    }
> : > +)
>
> Thanks,
> Timur
>


More information about the Tarantool-patches mailing list