[Tarantool-patches] [PATCH v3 2/9] lua: built-in module datetime

Safin Timur tsafin at tarantool.org
Fri Aug 6 03:23:33 MSK 2021


Hello Vlad,

On 05.08.2021 2:58, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> See 28 comments below.
> 
> 
>> diff --git a/src/lib/core/datetime.h b/src/lib/core/datetime.h
>> new file mode 100644
>> index 000000000..84d6642ab
>> --- /dev/null
>> +++ b/src/lib/core/datetime.h

>> +/**
>> + * Full datetime structure representing moments
>> + * since Unix Epoch (1970-01-01).
>> + * Time is kept normalized to UTC, time-zone offset
>> + * is informative only.
>> + */
>> +struct datetime_t {
> 
> 1. Well. How can we proceed beyond the second commit, if you keep
> ignoring my comments? Please, drop _t suffix. Please, comply with
> our code style for comments (they should not be on the same line as
> the members, and should not use any signs like '<').
> 
> If you did it in some later commits - it is not atomic anymore. The
> change must be self-sufficient. This commit is not. It depends on
> one of the last commits which renames something. Why couldn't you
> give the proper names from the beginning, from this commit?

Yes, at that moment renames were in different commit as later 
refactoring. Now, I've squashed those changes to the original commit 
introducing header file, and module in general.

> 
> And what is worse - this entire file is not needed in this commit,
> I already said it before. It is simply not used here at all. Please,
> move it to the relevant commit.
> 
> What is even further worse - you added here some functions which
> do not have any definition. You only added .h file. All the functions
> here are simply undefined. Wtf?
> 
>> +	int64_t secs;	/**< seconds since epoch */
>> +	int32_t nsec;	/**< nanoseconds if any */
>> +	int32_t offset; /**< offset in minutes from UTC */
>> +};
>> +
>> +/**
>> + * Date/time interval structure
>> + */
>> +struct datetime_interval_t {
>> +	int64_t secs; /**< relative seconds delta */
>> +	int32_t nsec; /**< nanoseconds delta */
>> +};
>> +
>> +int
>> +datetime_compare(const struct datetime_t * lhs,
>> +		 const struct datetime_t * rhs);
>> +
>> +
>> +struct datetime_t *
>> +datetime_unpack(const char **data, uint32_t len, struct datetime_t *date);
>> +
>> +/**
>> + * Pack datetime_t data to the MessagePack buffer.
>> + */
>> +char *
>> +datetime_pack(char *data, const struct datetime_t *date);
>> +
>> +/**
>> + * Calculate size of MessagePack buffer for datetime_t data.
>> + */
>> +uint32_t
>> +mp_sizeof_datetime(const struct datetime_t *date);
>> +
>> +/**
>> + * Decode data from MessagePack buffer to datetime_t structure.
>> + */
>> +struct datetime_t *
>> +mp_decode_datetime(const char **data, struct datetime_t *date);
>> +
>> +/**
>> + * Encode datetime_t structure to the MessagePack buffer.
>> + */
>> +char *
>> +mp_encode_datetime(char *data, const struct datetime_t *date);
>> +
>> +/**
>> + * Convert datetime to string using default format
>> + * @param date source datetime value
>> + * @param buf output character buffer
>> + * @param len size ofoutput buffer
>> + */
>> +int
>> +datetime_to_string(const struct datetime_t * date, char *buf, uint32_t len);
>> +
>> +int
>> +mp_snprint_datetime(char *buf, int size, const char **data, uint32_t len);
>> +
>> +int
>> +mp_fprint_datetime(FILE *file, const char **data, uint32_t len);
> 
> 2. Please, try to follow decimal and uuid here. They have separate files
> for working with MessagePack: tt_uuid.h/mp_uuid.h, decimal.h/mp_decimal.h.
> You should have datetime.h and mp_datetime.h.
> 
> Also once you add these to_string, mp_fprint, mp_snprint, pack/unpack,
> you need to add unitests in C for all these functions. See how UUID does
> it in test/unit/uuid.c and decimal in test/unit/decimal.c.


> 
>> diff --git a/src/lua/datetime.lua b/src/lua/datetime.lua
>> new file mode 100644
>> index 000000000..870edc541
>> --- /dev/null
>> +++ b/src/lua/datetime.lua
> 
> <...>
> 
>> +    // /usr/include/x86_64-linux-gnu/bits/types/struct_tm.h
>> +    /* ISO C `broken-down time' structure.  */
>> +    struct tm
>> +    {
>> +        int tm_sec;	        /* Seconds.	[0-60] (1 leap second) */
>> +        int tm_min;	        /* Minutes.	[0-59] */
>> +        int tm_hour;        /* Hours.	[0-23] */
>> +        int tm_mday;        /* Day.		[1-31] */
>> +        int tm_mon;	        /* Month.	[0-11] */
>> +        int tm_year;        /* Year	- 1900.  */
>> +        int tm_wday;        /* Day of week.	[0-6] */
>> +        int tm_yday;        /* Days in year.[0-365]	*/
>> +        int tm_isdst;       /* DST.		[-1/0/1]*/
>> +
>> +        long int tm_gmtoff; /* Seconds east of UTC.  */
>> +        const char *tm_zone;/* Timezone abbreviation.  */
> 
> 3. Didn't we already discuss it? The structure is platform-specific.
> You can't use it safely via FFI. And you even agreed with that, but
> still the struct is here. Why?
> 

At that moment C implementation of datetime_strftime, datetime_ctime, 
datetime_asctime, and datetime_now was in different patch in patchset. 
Now it's squashed to the 2nd commit in branch - where module introduced.

>> +    };
> 
> <...>
> 
>> +local datetime_t = ffi.typeof('struct datetime_t')
>> +local interval_t = ffi.typeof('struct datetime_interval_t')
>> +
>> +local function is_interval(o)
>> +    return type(o) == 'cdata' and ffi.istype(interval_t, o)
>> +end
>> +
>> +local function is_datetime(o)
>> +    return type(o) == 'cdata' and ffi.istype(datetime_t, o)
>> +end
>> +
>> +local function is_date_interval(o)
>> +    return is_datetime(o) or is_interval(o)
> 
> 4. You make type(o) == 'cdata' check 2 times here.

Yes. It's not a problem, IMHO. LuaJIT after inlining of those 2 
functions should take care of that. (I believe)

> 
>> +end
>> +
>> +local function interval_new()
>> +    local interval = ffi.new(interval_t)
>> +    return interval
>> +end
>> +
>> +local function check_date(o, message, lvl)
> 
> 5. 'lvl' is never used by any of the invocations. You waste time on
> the 'if', so it can be dropped. Or start using 'lvl' but never pass it
> as nil, please. The same for all the other functions below taking
> 'lvl'.

Cleaned up.

> 
>> +    if lvl == nil then
>> +        lvl = 2
>> +    end
>> +    if not is_datetime(o) then
>> +        return error(("%s: expected datetime, but received %s"):
>> +                     format(message, o), lvl)
>> +    end
>> +end
> <...>
> 
>> +
>> +local function datetime_cmp(lhs, rhs)
> 
> 6. This does not look correct:
> 
> tarantool> datetime = require('datetime')
> ---
> ...
> 
> tarantool> d = datetime()
> ---
> ...
> 
> tarantool> d < 'bad string'
> ---
> - false
> ...
> 
> tarantool> d > 'bad string'
> ---
> - false
> ...

These results in 1st commit were actually side-effect of incorrect 
comparison implementation. But even today, after all interval extensions 
added and all extra checks introduced, it still shows same results for 
bogus data of unsupported types.
==========================================
local function datetime_cmp(lhs, rhs)
     if not is_date_interval(lhs) or
        not is_date_interval(rhs) then
        return nil
     end
     local sdiff = lhs.secs - rhs.secs
     return sdiff ~= 0 and sdiff or (lhs.nsec - rhs.nsec)
end

local function datetime_eq(lhs, rhs)
     local rc = datetime_cmp(lhs, rhs)
     return rc ~= nil and rc == 0 or false
end

local function datetime_lt(lhs, rhs)
     local rc = datetime_cmp(lhs, rhs)
     return rc ~= nil and rc < 0 or false
end

local function datetime_le(lhs, rhs)
     local rc = datetime_cmp(lhs, rhs)
     return rc ~= nil and rc <= 0 or false
end

==========================================
i.e. whenever we see not datetime or interva value datetime_cmp return 
nil, and all comparisons always return false. Please give me know if 
that's incorrect scenario and we should raise exception. This was the 
way Oleg Bain has requested it to behave. (return false for bogus data)

> 
> Do you have any tests? In this commit I see zero tests for
> your code. Only very few tests for dt library. Please, add
> tests for everything you have here.

A lot. It started to be added with the next commit, and added with any 
other commit adding functionality to datetime.lua. It's pity you didn't 
have a chance to see next commits.

> 
>> +    if not is_date_interval(lhs) or
>> +       not is_date_interval(rhs) then
>> +       return nil
>> +    end
>> +    local sdiff = lhs.secs - rhs.secs
>> +    return sdiff ~= 0 and sdiff or (lhs.nsec - rhs.nsec)
>> +end
>> +
>> +local function datetime_eq(lhs, rhs)
>> +    local rc = datetime_cmp(lhs, rhs)
>> +    return rc ~= nil and rc == 0 or false
>> +end
>> +
>> +local function datetime_lt(lhs, rhs)
>> +    local rc = datetime_cmp(lhs, rhs)
>> +    return rc ~= nil and rc < 0 or false
>> +end
>> +
>> +local function datetime_le(lhs, rhs)
>> +    local rc = datetime_cmp(lhs, rhs)
>> +    return rc ~= nil and rc <= 0 or false
>> +end
>> +
>> +local function datetime_serialize(self)
>> +    -- Allow YAML, MsgPack and JSON to dump objects with sockets
>> +    return { secs = self.secs, nsec = self.nsec, offset = self.offset }
>> +end
>> +
>> +local function interval_serialize(self)
>> +    -- Allow YAML and JSON to dump objects with sockets
> 
> 7. What do these comments mean? Why is it specific to these
> formats? And why doesn this comment include MsgPack while the
> previous one does?

Removed confusing text.

> 
>> +    return { secs = self.secs, nsec = self.nsec }
>> +end
>> +
>> +local function local_rd(o)
>> +    return math.floor(tonumber(o.secs / SECS_PER_DAY)) + DT_EPOCH_1970_OFFSET
> 
> 8. I suggest to cache math.floor into a global variable. The same
> for the other math.* usages in this file. To save one index operation.

Cached math.floor and math.modf.

> 
>> +end
>> +
>> +local function local_dt(o)
>> +    return builtin.dt_from_rdn(local_rd(o))
>> +end
>> +
>> +local function _normalize_nsec(secs, nsec)
> 
> 9. Why do you have '_' prefix? The function is not global, you
> don't need any protection against name conflicts.

Removed _ from the name.

> 
>> +    if nsec < 0 then
>> +        secs = secs - 1
>> +        nsec = nsec + NANOS_PER_SEC
>> +    elseif nsec >= NANOS_PER_SEC then
>> +        secs = secs + 1
>> +        nsec = nsec - NANOS_PER_SEC
>> +    end
>> +    return secs, nsec
>> +end
>> +
>> +local function check_range(v, range, txt)
>> +    assert(#range == 2)
>> +    if not (v >= range[1] and v <= range[2]) then
> 
> 10. Wouldn't it be shorter to make
> 
> 	if v < range[1] or v > range[2] then
 >
 > ?
 >

Applied.

>> +        error(('value %d of %s is out of allowed range [%d, %d]'):
>> +              format(v, txt, range[1], range[2]))
>> +    end
>> +end
>> +
>> +local datetime_index_handlers = {
>> +    unixtime = function(self)
>> +        return self.secs
>> +    end,
>> +
>> +    timestamp = function(self)
>> +        return tonumber(self.secs) + self.nsec / 1e9
> 
> 11. If you are saying the Lua number value range is enough,
> why do you store them as 64bit integers in 'self' and
> convert to a number only for the external API?

May be I misunderstood your sentence, but let me elaborate here.
I need seconds and nanoseconds kept separately for their efficient and 
precise handling, and for passing to c-dt.

If we would keep 32-bit integers in seconds then we would be able to 
handle only dates upto 2038 year, thus we need 64-bit seconds for 
extended range.

OTOH, not whole 64-bit range is practical, and required for expected in 
real-life datetime range values. It's not a problem that Lua number and 
int64_t have very different range for precise integer values. Lua number 
could handle integer values upto 9e15, which is corresponding to ...

	tarantool> T.secs = 9e15
	---
	...
	
	tarantool> T
	---
	- 2979311-04-03T16:00Z
	...

Which is more than enough for us.

> 
>> +    end,
>> +
>> +    nanoseconds = function(self)
>> +        return self.secs * 1e9 + self.nsec
>> +    end,
>> +
>> +    microseconds = function(self)
>> +        return self.secs * 1e6 + self.nsec / 1e3
>> +    end,
>> +
>> +    milliseconds = function(self)
>> +        return self.secs * 1e3 + self.nsec / 1e6
>> +    end,
>> +
>> +    seconds = function(self)
>> +        return tonumber(self.secs) + self.nsec / 1e9
>> +    end,
>> +
>> +    minutes = function(self)
>> +        return (tonumber(self.secs) + self.nsec / 1e9) / 60
>> +    end,
>> +
>> +    hours = function(self)
>> +        return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60)
>> +    end,
>> +
>> +    days = function(self)
>> +        return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60) / 24
> 
> 12. Is Lua parser able to merge the last constants into one
> number to make only one division at runtime? If not, I suggest
> to do it manually.

Good question. Apparently Lua parser is able to fold constant expression 
if it's inside of parenthesis, and is unable to cross pairs boundaries. 
Hope LuaJIT in -jO2 mode is better here, but at least we have a 
workaround for constant expressions - put them in the same pairs.

i.e.
```
local days = function(self)
        return (tonumber(self.secs) + self.nsec / 1e9) / (24 * 60 * 60)
end
```

calculates constant expression value (see 86400)
```
-- BYTECODE -- days_ffi.lua:9-11
0001    GGET     1   0      ; "tonumber"
0002    TGETS    2   0   1  ; "secs"
0003    CALL     1   2   2
0004    TGETS    2   0   2  ; "nsec"
0005    DIVVN    2   2   0  ; 1000000000
0006    ADDVV    1   1   2
0007    DIVVN    1   1   1  ; 86400
0008    RET1     1   2
```

but original version:
```
local days = function(self)
        return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60) / 24
end
```
had separate expression leafs, without folded constant:
```
-- BYTECODE -- days_ffi.lua:9-11
0001    GGET     1   0      ; "tonumber"
0002    TGETS    2   0   1  ; "secs"
0003    CALL     1   2   2
0004    TGETS    2   0   2  ; "nsec"
0005    DIVVN    2   2   0  ; 1000000000
0006    ADDVV    1   1   2
0007    DIVVN    1   1   1  ; 3600
0008    DIVVN    1   1   2  ; 24
0009    RET1     1   2
```

Changed to use parenthesis accordingly.

> 
>> +    end,
>> +}
>> +
>> +local datetime_index = function(self, key)
>> +    return datetime_index_handlers[key] ~= nil and
>> +           datetime_index_handlers[key](self) or nil
> 
> 13. To save one hash lookup operation you can do this:
> 
> 	local h = datetime_index_handlers[key]
> 	return h and h(self)
> 
> The same for datetime_newindex.

Applied.

> 
>> +end
>> +
>> +local datetime_newindex_handlers = {
>> +    unixtime = function(self, value)
>> +        self.secs = value
>> +        self.nsec, self.offset = 0, 0
>> +    end,
>> +
>> +    timestamp = function(self, value)
>> +        local secs, frac = math.modf(value)
>> +        self.secs = secs
>> +        self.nsec = frac * 1e9
>> +        self.offset = 0
>> +    end,
>> +}
>> +
>> +local function datetime_newindex(self, key, value)
>> +    if datetime_newindex_handlers[key] ~= nil then
>> +        datetime_newindex_handlers[key](self, value)
>> +    end
>> +end
>> +
>> +local function datetime_new_raw(secs, nsec, offset)
>> +    local dt_obj = ffi.new(datetime_t)
>> +    dt_obj.secs = secs
>> +    dt_obj.nsec = nsec
>> +    dt_obj.offset = offset
>> +    return dt_obj
>> +end
>> +
>> +local function mk_timestamp(dt, sp, fp, offset)
> 
> 14. What are 'sp' and 'fp'?

sp - seconds part
fp - fractional part

Agreed that those names were confusing, so renamed to simply `secs` and 
`frac`.

> 
> 15. The function is a datetime constructor, because it creates and
> returns a new datetime object. Please, call it correspondingly
> then. For instance, `datetime_new_dt()`.

Renamed to datetime_new_dt

> 
>> +    local epochV = dt ~= nil and (builtin.dt_rdn(dt) - DT_EPOCH_1970_OFFSET) *
>> +                   SECS_PER_DAY or 0
>> +    local spV = sp ~= nil and sp or 0
>> +    local fpV = fp ~= nil and fp or 0
>> +    local ofsV = offset ~= nil and offset or 0
>> +    return datetime_new_raw (epochV + spV - ofsV * 60, fpV, ofsV)
>> +end
>> +
>> +-- create @datetime_t given object @o fields
> 
> 16. Didn't we agree in the previous review that you will stop using
> name 'o'? At least in the context where there are zeros. It is very
> confusing.
> 

Renamed this confusing case to `obj`.

>> +local function datetime_new(o)
>> +    if o == nil then
>> +        return datetime_new_raw(0, 0, 0)
>> +    end
>> +    local secs = 0
>> +    local nsec = 0
>> +    local offset = 0
>> +    local easy_way = false
>> +    local y = 0
>> +    local M = 0
>> +    local d = 0
>> +    local ymd = false
>> +
>> +    local h = 0
>> +    local m = 0
>> +    local s = 0
>> +    local frac = 0
>> +    local hms = false
>> +
>> +    local dt = 0
>> +
>> +    local handlers = {
> 
> 17. Please, stop using closures in the constructor of a
> potentionally very frequently created/deleted object >

I still disagreed that datetime_new constructor will be on critical 
path, the most frequent case expected to be parse*() -> 
datetime_new_dt() -> datetime_new_raw(). i.e. when we parse millions of 
timestamps in logs and then proceed them.

datetime_new(builder_object) expected to be called in rare cases when 
object got created from Lua code, not on performance critical paths, 
thus convenience here is more important than performance.

And BTW I've created simple benchmark for comparing datetime_new_dt vs 
datetime_new(from builder object) vs datetime_new_2 which would be not 
using hash object with handlers, but rather sequence of ifs.

https://gist.github.com/tsafin/31cc9b0872b6015904fcc90d97740770
```
  ~/datetime/tarantoolt [tsafin/gh-5941-datetime-v3]
00:39 $ ./build/src/tarantool ../bench-hash-dt.lua
datetime_new_dt 0.002718448638916
datetime_new    5.1318607330322
datetime_new(precreated)        4.4448664188385
datetime_new_2(ifs)     4.4977836608887
```

Note, that creation from builder object is much, much slower than raw 
cdata object creation (that's expected). If we move object out of loop, 
and precreate it then we save some time (as expected) 5.1 vs 4.4.
But if we change implementation from hash with closure handlers to 
sequence of ifs - we actually degrade performance (a bit) from 4.44 to 4.49.

Biggest performance impact here - traversing of object attributes, but 
whether it's using closures or not using them, it makes no much difference.

But using closures makes this code look much more elegant. So keeping it 
here for today.


>> +        secs = function(_, v)
>> +            secs = v
>> +            easy_way = true
>> +        end,
>> +
>> +        nsec = function(_, v)
>> +            nsec = v
>> +            easy_way = true
>> +        end,
>> +
>> +        offset = function (_, v)
>> +            offset = v
>> +            easy_way = true
>> +        end,
>> +
>> +        year = function(k, v)
>> +            check_range(v, {1, 9999}, k)
>> +            y = v
>> +            ymd = true
>> +        end,
>> +
>> +        month = function(k, v)
>> +            check_range(v, {1, 12}, k)
>> +            M = v
>> +            ymd = true
>> +        end,
>> +
>> +        day = function(k, v)
>> +            check_range(v, {1, 31}, k)
>> +            d = v
>> +            ymd = true
>> +        end,
>> +
>> +        hour = function(k, v)
>> +            check_range(v, {0, 23}, k)
>> +            h = v
>> +            hms = true
>> +        end,
>> +
>> +        minute = function(k, v)
>> +            check_range(v, {0, 59}, k)
>> +            m = v
>> +            hms = true
>> +        end,
>> +
>> +        second = function(k, v)
>> +            check_range(v, {0, 60}, k)
>> +            s, frac = math.modf(v)
>> +            frac = frac * 1e9 -- convert fraction to nanoseconds
>> +            hms = true
>> +        end,
>> +
>> +        -- tz offset in minutes
>> +        tz = function(k, v)
>> +            check_range(v, {0, 720}, k)
>> +            offset = v
>> +        end
>> +    }
>> +    for key, value in pairs(o) do
>> +        handlers[key](key, value)
>> +    end
>> +
>> +    -- .sec, .nsec, .offset
>> +    if easy_way then
>> +        return datetime_new_raw(secs, nsec, offset)
>> +    end
>> +
>> +    -- .year, .month, .day
>> +    if ymd then
>> +        dt = dt + builtin.dt_from_ymd(y, M, d)
>> +    end
>> +
>> +    -- .hour, .minute, .second
>> +    if hms then
>> +        secs = h * 3600 + m * 60 + s
>> +    end
>> +
>> +    return mk_timestamp(dt, secs, frac, offset)
>> +end
> 
> 18. It fails with a bad error:
> 
> tarantool> d = datetime({garbage = true})
> ---
> - error: 'builtin/datetime.lua:449: attempt to call a nil value'
> ...
> 

Added default handler for an unknown attribute.

```
tarantool> T = date{years = -123}
---
- error: '[string "T = date{years = -123}"]:1: unknown attribute years'
...
```

> 
> 19. It can ignore some fields (here it ignores 'year'):
> 
> tarantool> d1 = datetime({secs = 100, year = 1})
> ---
> ...
> 
> tarantool> d2 = datetime({secs = 100, year = 2})
> ---
> ...
> 
> tarantool> d1 == d2
> ---
> - true
> ...
> 

Yes, and that's by design. .secs defines 'easy_way' of initializing 
object via secs/nsec/offset, while .year selects 'ymd' mode. And 
`easy_way` has precedence. It looks reasonable for me, please, suggest 
your idea how to handle it even better without much redesign?

> 
> 20. Serialization does not seem to be working in the console
> despite you tried to implement it above:
> 
> tarantool> d = datetime({secs = 100})
> ---
> ...
> 
> tarantool> d
> ---
> - 'cdata<struct datetime_t>: 0x0103ad9c50'
> ...

Yup, redefining of __serialize is not enough for cdata object by 
whatever reason (I didn't look deep). But after the next patch(es) it's 
start to stringize properly.

> 
>> +
>> +local function date_first(lhs, rhs)
>> +    if is_datetime(lhs) then
>> +        return lhs, rhs
>> +    else
>> +        return rhs, lhs
>> +    end
>> +end
>> +
>> +local function error_incompatible(name)
>> +    error(("datetime:%s() - incompatible type of arguments"):
>> +          format(name), 3)
>> +end
>> +
>> +local function datetime_sub(lhs, rhs)
>> +    check_date_interval(lhs, "operator -")
>> +    local d, s = lhs, rhs
>> +    local left_t = ffi.typeof(d)
>> +    local right_t = ffi.typeof(s)
>> +    local o
>> +
>> +    if left_t == datetime_t then
>> +        -- left is date, right is date or generic interval
>> +        if (right_t == datetime_t or right_t == interval_t) then
> 
> 21. You do not need () around the condition in Lua.

Fixed.

> 
>> +            o = right_t == datetime_t and interval_new() or datetime_new()
>> +            o.secs, o.nsec = _normalize_nsec(lhs.secs - rhs.secs,
>> +                                            lhs.nsec - rhs.nsec)
>> +            return o
>> +        else
>> +            error_incompatible("operator -")
>> +        end
>> +    -- both left and right are generic intervals
>> +    elseif left_t == interval_t and right_t == interval_t then
>> +        o = interval_new()
>> +        o.secs, o.nsec = _normalize_nsec(lhs.secs - rhs.secs,
>> +                                        lhs.nsec - rhs.nsec)
>> +        return o
>> +    else
>> +        error_incompatible("operator -")
>> +    end
>> +end
> 
> 22. Did you really test this function?
> 
> tarantool> dt = require('datetime')
> ---
> ...
> 
> tarantool> d1 = dt()
> ---
> ...
> 
> tarantool> d2 - 1
> ---
> - error: 'builtin/datetime.lua:487: bad argument #1 to ''typeof'' (C type expected,
>      got number)'
> ...

All kinds of arguments checking implemented in the separate patch about 
interval arithmetic operations, please find a moment to look into them.

> 
>> +
>> +local function datetime_add(lhs, rhs)
>> +    local d, s = date_first(lhs, rhs)
>> +
>> +    check_date_interval(d, "operator +")
>> +    check_interval(s, "operator +")
>> +    local left_t = ffi.typeof(d)
>> +    local right_t = ffi.typeof(s)
>> +    local o
>> +
>> +    -- left is date, right is date or interval
>> +    if left_t == datetime_t and right_t == interval_t then
>> +        o = datetime_new()
>> +        o.secs, o.nsec = _normalize_nsec(d.secs + s.secs, d.nsec + s.nsec)
>> +        return o
>> +    -- both left and right are generic intervals
>> +    elseif left_t == interval_t and right_t == interval_t then
>> +        o = interval_new()
>> +        o.secs, o.nsec = _normalize_nsec(d.secs + s.secs, d.nsec + s.nsec)
>> +        return o
>> +    else
>> +        error_incompatible("operator +")
>> +    end
>> +end
>> +
>> +-- simple parse functions:
>> +-- parse_date/parse_time/parse_zone
> 
> 23. As I said in the previous review, the comment is useless.
> You just listed 3 function names following the comment. What
> is the purpose of that? What should I learn from it? Please,
> drop it.

Dropped.

> 
> <...>
> 
>> +
>> +local function datetime_from(o)
>> +    if o == nil or type(o) == 'table' then
>> +        return datetime_new(o)
>> +    elseif type(o) == 'string' then
>> +        return parse(o)
>> +    end
>> +end
>> +
>> +local function local_now()
>> +    local p_tv = ffi.new('struct timeval [1]')
>> +    local rc = builtin.gettimeofday(p_tv, nil)
> 
> 24. Why can't you use fiber time? It should be enough in
> nearly all usecases and is much faster. For real-real time
> talking directly to the kernel we have 'clock' Lua module
> already. You could check if results of clock functions can
> be passed into your module to create 'real' timestamps not
> depending on the eventloop.

Fiber time, which actually return cached ev_rt_now, looks imprecise and 
depends on event loop circumstances, though eventually it calls the same 
clock_gettime(). So fiber-time is looking like a cached timestamp value 
from [hopefully, non-distant] past.

I'd rather prefer to have more realtime values, and 
clock_realtime*/clock_monotonic* look relevant, they call 
clock_gettime() for their purposes, which in turn calls the same 
gettimeofday(). So in this case we call same api, but with 2 extra call 
layers above.

IMHO, not worth extra complexity if we could just simply call 
gettimeofday().

[BTW, in current branch this code lives in C - I've moved 
ctime/asctime/strftime/now handlers to datetime.c, to hide Linux vs BSD 
differences there]

> 
>> +    assert(rc == 0)
>> +
>> +    local secs = p_tv[0].tv_sec
>> +    local nsec = p_tv[0].tv_usec * 1000
>> +
>> +    local p_time = ffi.new('time_t[1]')
>> +    local p_tm = ffi.new('struct tm[1]')
>> +    builtin.time(p_time)
>> +    builtin.localtime_r(p_time, p_tm)
>> +    local ofs = p_tm[0].tm_gmtoff / 60 -- convert seconds to minutes
>> +
>> +    return datetime_new_raw(secs, nsec, ofs)
>> +end
>> +
>> +local function datetime_to_tm_ptr(o)
>> +    assert(is_datetime(o))
>> +    local p_tm = ffi.new('struct tm[1]')
>> +    -- dt_to_struct_tm() fills only date data
>> +    builtin.dt_to_struct_tm(local_dt(o), p_tm)
>> +
>> +    -- calculate the smaller data (hour, minute,
>> +    -- seconds) using datetime seconds value
>> +    local seconds_of_day = o.secs % 86400
>> +    local hour = (seconds_of_day / 3600) % 24
>> +    local minute = (seconds_of_day / 60) % 60
>> +    p_tm[0].tm_sec = seconds_of_day % 60
>> +    p_tm[0].tm_min = minute
>> +    p_tm[0].tm_hour = hour
>> +
>> +    p_tm[0].tm_gmtoff = o.offset * 60
>> +
>> +    return p_tm
>> +end

There is no such code in Lua anymore.

>> +
>> +local function asctime(o)
>> +    check_date(o, "datetime:asctime()")
>> +
>> +    local p_tm = datetime_to_tm_ptr(o)
>> +    return ffi.string(builtin.asctime(p_tm))
>> +end
>> +
>> +local function ctime(o)
>> +    check_date(o, "datetime:ctime()")
>> +    local p_time = ffi.new('time_t[1]')
>> +    p_time[0] = o.secs
>> +    return ffi.string(builtin.ctime(p_time))
> 
> 25. Functions like ctime can't be used in Lua, because according
> to their Linux man:
> 
> 	The return value points to a statically allocated string
> 	which might be overwritten by subsequent calls to any of the
> 	date and time functions
> 
> Which means they use a global buffer. If you will
> use datetime module in any of __gc handlers, it will
> return garbage sometimes or just override results
> somethere. That is exactly the same problem as existed
> with tarantool's static buffer.

I do not believe it's the same problem - in this case content of static 
buffer is immediately interned in Lua via ffi.string()

https://luajit.org/ext_ffi_api.html
"The Lua string is an (interned) copy of the data and bears no relation 
to the original data area anymore."

> 
>> diff --git a/test/unit/datetime.c b/test/unit/datetime.c
>> new file mode 100644
>> index 000000000..b6f568c03
>> --- /dev/null
>> +++ b/test/unit/datetime.c
> 
> 26. This file tests dt library. It does not test your code. Please, add
> tests for everything you implemented in this commit. For c-dt library
> the tests must be added in the previous commit, and the test file
> must be called test/unit/cdt.c then.

Unit-test has been moved to the commit introducing c-dt. And in turn, 
initial datetime test (checking parser and ctime/asctime/now) squashed 
to the same commit where we introduce module with parser.

> 
>> @@ -0,0 +1,221 @@
>> +#include "dt.h"
>> +#include <assert.h>
>> +#include <stdint.h>
>> +#include <string.h>
>> +#include <time.h>
>> +
>> +#include "unit.h"
>> +
>> +const char sample[] = "2012-12-24T15:30Z";
> 
> 27. Please, make it static.

Done

> 
>> +
>> +#define S(s) {s, sizeof(s) - 1}
>> +struct {
>> +	const char * sz;
>> +	size_t len;
>> +} tests[] = {
>> +	S("2012-12-24 15:30Z"),
>> +	S("2012-12-24 15:30z"),
>> +	S("2012-12-24 15:30"),
>> +	S("2012-12-24 16:30+01:00"),
>> +	S("2012-12-24 16:30+0100"),
>> +	S("2012-12-24 16:30+01"),
>> +	S("2012-12-24 14:30-01:00"),
>> +	S("2012-12-24 14:30-0100"),
>> +	S("2012-12-24 14:30-01"),
>> +	S("2012-12-24 15:30:00Z"),
>> +	S("2012-12-24 15:30:00z"),
>> +	S("2012-12-24 15:30:00"),
>> +	S("2012-12-24 16:30:00+01:00"),
>> +	S("2012-12-24 16:30:00+0100"),
>> +	S("2012-12-24 14:30:00-01:00"),
>> +	S("2012-12-24 14:30:00-0100"),
>> +	S("2012-12-24 15:30:00.123456Z"),
>> +	S("2012-12-24 15:30:00.123456z"),
>> +	S("2012-12-24 15:30:00.123456"),
>> +	S("2012-12-24 16:30:00.123456+01:00"),
>> +	S("2012-12-24 16:30:00.123456+01"),
>> +	S("2012-12-24 14:30:00.123456-01:00"),
>> +	S("2012-12-24 14:30:00.123456-01"),
>> +	S("2012-12-24t15:30Z"),
>> +	S("2012-12-24t15:30z"),
>> +	S("2012-12-24t15:30"),
>> +	S("2012-12-24t16:30+01:00"),
>> +	S("2012-12-24t16:30+0100"),
>> +	S("2012-12-24t14:30-01:00"),
>> +	S("2012-12-24t14:30-0100"),
>> +	S("2012-12-24t15:30:00Z"),
>> +	S("2012-12-24t15:30:00z"),
>> +	S("2012-12-24t15:30:00"),
>> +	S("2012-12-24t16:30:00+01:00"),
>> +	S("2012-12-24t16:30:00+0100"),
>> +	S("2012-12-24t14:30:00-01:00"),
>> +	S("2012-12-24t14:30:00-0100"),
>> +	S("2012-12-24t15:30:00.123456Z"),
>> +	S("2012-12-24t15:30:00.123456z"),
>> +	S("2012-12-24t16:30:00.123456+01:00"),
>> +	S("2012-12-24t14:30:00.123456-01:00"),
>> +	S("2012-12-24 16:30 +01:00"),
>> +	S("2012-12-24 14:30 -01:00"),
>> +	S("2012-12-24 15:30 UTC"),
>> +	S("2012-12-24 16:30 UTC+1"),
>> +	S("2012-12-24 16:30 UTC+01"),
>> +	S("2012-12-24 16:30 UTC+0100"),
>> +	S("2012-12-24 16:30 UTC+01:00"),
>> +	S("2012-12-24 14:30 UTC-1"),
>> +	S("2012-12-24 14:30 UTC-01"),
>> +	S("2012-12-24 14:30 UTC-01:00"),
>> +	S("2012-12-24 14:30 UTC-0100"),
>> +	S("2012-12-24 15:30 GMT"),
>> +	S("2012-12-24 16:30 GMT+1"),
>> +	S("2012-12-24 16:30 GMT+01"),
>> +	S("2012-12-24 16:30 GMT+0100"),
>> +	S("2012-12-24 16:30 GMT+01:00"),
>> +	S("2012-12-24 14:30 GMT-1"),
>> +	S("2012-12-24 14:30 GMT-01"),
>> +	S("2012-12-24 14:30 GMT-01:00"),
>> +	S("2012-12-24 14:30 GMT-0100"),
>> +	S("2012-12-24 14:30 -01:00"),
>> +	S("2012-12-24 16:30:00 +01:00"),
>> +	S("2012-12-24 14:30:00 -01:00"),
>> +	S("2012-12-24 16:30:00.123456 +01:00"),
>> +	S("2012-12-24 14:30:00.123456 -01:00"),
>> +	S("2012-12-24 15:30:00.123456 -00:00"),
>> +	S("20121224T1630+01:00"),
>> +	S("2012-12-24T1630+01:00"),
>> +	S("20121224T16:30+01"),
>> +	S("20121224T16:30 +01"),
>> +};
>> +#undef S
>> +
>> +#define DIM(a) (sizeof(a) / sizeof(a[0]))
>> +
>> +// p5-time-moment/src/moment_parse.c: parse_string_lenient()
> 
> 28. Please, stop using // comment opening.
> 

Uh, ok.

All, with exception of separation of mp_datetime and introduction of 
test (test requires some time for plumbing) is alreday in the branch 
updated - tsafin/gh-5941-datetime-v3

Vlad, if you still have some time this Friday, before goin to vacation, 
then please look into the rest of patches I have so far. Now they 7, and 
look reasonable enough:

* a1905f727 build: add Christian Hansen c-dt to the build
* f1dcde7ec lua: built-in module datetime
* 88e25eaa0 lua, datetime: display datetime
* cfa40d93e box, datetime: messagepack support for datetime
* 9e718e5c5 box, datetime: datetime comparison for indices
* 8d26d2f76 lua, datetime: time intervals support
* 9b338fc57 datetime: changelog for datetime module

Thanks,
Timur


More information about the Tarantool-patches mailing list