From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id 96105741EA; Fri, 6 Aug 2021 03:23:36 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 96105741EA DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1628209416; bh=XXhisu2nDjZ+TV2nZ8LWn+EMZ/zchhTwuXjGsMryRfo=; h=To:References:Date:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=xCbBaHCjtC1e0wmHJxVZD7vA/llrexXwMZgM0LVD7514l41CizRmOAUBkFcsPFXtR cofCULnJ5YLySDrbfocmkJsOnorg5LmboZefIgP5r/5cEtpDZo9hQfWcz9mTsec2Ob MopQtjReFBy/6bXW/KHxe6aFXioUwVGIPAOO4+70= Received: from smtp30.i.mail.ru (smtp30.i.mail.ru [94.100.177.90]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id F1AE2741EA for ; Fri, 6 Aug 2021 03:23:34 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org F1AE2741EA Received: by smtp30.i.mail.ru with esmtpa (envelope-from ) id 1mBne9-0003ZU-9r; Fri, 06 Aug 2021 03:23:33 +0300 To: Vladislav Shpilevoy , olegrok@tarantool.org References: <449cb4c7-c738-ed07-8517-61b663ad9f79@tarantool.org> Message-ID: <0931c9db-bd8e-0687-2c96-73fac1aeff7e@tarantool.org> Date: Fri, 6 Aug 2021 03:23:33 +0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.12.0 MIME-Version: 1.0 In-Reply-To: <449cb4c7-c738-ed07-8517-61b663ad9f79@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD92087353F0EC44DD9ECFD080E047A606F6525B29142351271182A05F5380850409C48B49B5A89F851260F9B065B605F5EDB96A7D90E2F4902576582B0A85F1355 X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7FCFCB92DA8654BB0EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F7900637B23888C9749F3CAC8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D8F8B1BB9BAFB5D1A5FAACED752783009A117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCAA867293B0326636D2E47CDBA5A96583BD4B6F7A4D31EC0BC014FD901B82EE079FA2833FD35BB23D27C277FBC8AE2E8BAA867293B0326636D2E47CDBA5A96583BA9C0B312567BB2376E601842F6C81A19E625A9149C048EE7B96B19DC40933219935A1E27F592749D8FC6C240DEA7642DBF02ECDB25306B2B78CF848AE20165D0A6AB1C7CE11FEE3A6C7FFFE744CA7FB03F1AB874ED89028C4224003CC836476EA7A3FFF5B025636E2021AF6380DFAD1A18204E546F3947CB11811A4A51E3B096D1867E19FE1407959CC434672EE6371089D37D7C0E48F6C8AA50765F7900637A7EFCB0EB5ACB161EFF80C71ABB335746BA297DBC24807EABDAD6C7F3747799A X-B7AD71C0: AC4F5C86D027EB782CDD5689AFBDA7A213B5FB47DCBC3458F0AFF96BAACF4158235E5A14AD4A4A4625E192CAD1D9E79D94463893BF8742D0F671E0E5005C9022 X-C1DE0DAB: 0D63561A33F958A5E9D9ABE0014B624D8D3E2A8AA1A5E48A496A7998A632D719D59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA7501A9DF589746230F410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34F8A2DF183797A6A85C673002F08FDB85FCD49577C7139DAEE4607C6016B4527CF029BF7E4F3978BD1D7E09C32AA3244CD0D3666D358D6C693487E7E65BAD28BD3A76366E8A9DE7CA83B48618A63566E0 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojFhlvmGwdUwRCu8osdganLQ== X-Mailru-Sender: B5B6A6EBBD94DAD81091F6801AA03C01EE8A8DBDF32FCF8B84DCA6544A68615D9255D0F8401B32E05C2808D6142752370A8ED71B308007E3DC85537438B7E1A423D748DE48713E689437F6177E88F7363CDA0F3B3F5B9367 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH v3 2/9] lua: built-in module datetime X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Safin Timur via Tarantool-patches Reply-To: Safin Timur Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" Hello Vlad, On 05.08.2021 2:58, Vladislav Shpilevoy wrote: > Thanks for the patch! > > See 28 comments below. > > >> diff --git a/src/lib/core/datetime.h b/src/lib/core/datetime.h >> new file mode 100644 >> index 000000000..84d6642ab >> --- /dev/null >> +++ b/src/lib/core/datetime.h >> +/** >> + * Full datetime structure representing moments >> + * since Unix Epoch (1970-01-01). >> + * Time is kept normalized to UTC, time-zone offset >> + * is informative only. >> + */ >> +struct datetime_t { > > 1. Well. How can we proceed beyond the second commit, if you keep > ignoring my comments? Please, drop _t suffix. Please, comply with > our code style for comments (they should not be on the same line as > the members, and should not use any signs like '<'). > > If you did it in some later commits - it is not atomic anymore. The > change must be self-sufficient. This commit is not. It depends on > one of the last commits which renames something. Why couldn't you > give the proper names from the beginning, from this commit? Yes, at that moment renames were in different commit as later refactoring. Now, I've squashed those changes to the original commit introducing header file, and module in general. > > And what is worse - this entire file is not needed in this commit, > I already said it before. It is simply not used here at all. Please, > move it to the relevant commit. > > What is even further worse - you added here some functions which > do not have any definition. You only added .h file. All the functions > here are simply undefined. Wtf? > >> + int64_t secs; /**< seconds since epoch */ >> + int32_t nsec; /**< nanoseconds if any */ >> + int32_t offset; /**< offset in minutes from UTC */ >> +}; >> + >> +/** >> + * Date/time interval structure >> + */ >> +struct datetime_interval_t { >> + int64_t secs; /**< relative seconds delta */ >> + int32_t nsec; /**< nanoseconds delta */ >> +}; >> + >> +int >> +datetime_compare(const struct datetime_t * lhs, >> + const struct datetime_t * rhs); >> + >> + >> +struct datetime_t * >> +datetime_unpack(const char **data, uint32_t len, struct datetime_t *date); >> + >> +/** >> + * Pack datetime_t data to the MessagePack buffer. >> + */ >> +char * >> +datetime_pack(char *data, const struct datetime_t *date); >> + >> +/** >> + * Calculate size of MessagePack buffer for datetime_t data. >> + */ >> +uint32_t >> +mp_sizeof_datetime(const struct datetime_t *date); >> + >> +/** >> + * Decode data from MessagePack buffer to datetime_t structure. >> + */ >> +struct datetime_t * >> +mp_decode_datetime(const char **data, struct datetime_t *date); >> + >> +/** >> + * Encode datetime_t structure to the MessagePack buffer. >> + */ >> +char * >> +mp_encode_datetime(char *data, const struct datetime_t *date); >> + >> +/** >> + * Convert datetime to string using default format >> + * @param date source datetime value >> + * @param buf output character buffer >> + * @param len size ofoutput buffer >> + */ >> +int >> +datetime_to_string(const struct datetime_t * date, char *buf, uint32_t len); >> + >> +int >> +mp_snprint_datetime(char *buf, int size, const char **data, uint32_t len); >> + >> +int >> +mp_fprint_datetime(FILE *file, const char **data, uint32_t len); > > 2. Please, try to follow decimal and uuid here. They have separate files > for working with MessagePack: tt_uuid.h/mp_uuid.h, decimal.h/mp_decimal.h. > You should have datetime.h and mp_datetime.h. > > Also once you add these to_string, mp_fprint, mp_snprint, pack/unpack, > you need to add unitests in C for all these functions. See how UUID does > it in test/unit/uuid.c and decimal in test/unit/decimal.c. > >> diff --git a/src/lua/datetime.lua b/src/lua/datetime.lua >> new file mode 100644 >> index 000000000..870edc541 >> --- /dev/null >> +++ b/src/lua/datetime.lua > > <...> > >> + // /usr/include/x86_64-linux-gnu/bits/types/struct_tm.h >> + /* ISO C `broken-down time' structure. */ >> + struct tm >> + { >> + int tm_sec; /* Seconds. [0-60] (1 leap second) */ >> + int tm_min; /* Minutes. [0-59] */ >> + int tm_hour; /* Hours. [0-23] */ >> + int tm_mday; /* Day. [1-31] */ >> + int tm_mon; /* Month. [0-11] */ >> + int tm_year; /* Year - 1900. */ >> + int tm_wday; /* Day of week. [0-6] */ >> + int tm_yday; /* Days in year.[0-365] */ >> + int tm_isdst; /* DST. [-1/0/1]*/ >> + >> + long int tm_gmtoff; /* Seconds east of UTC. */ >> + const char *tm_zone;/* Timezone abbreviation. */ > > 3. Didn't we already discuss it? The structure is platform-specific. > You can't use it safely via FFI. And you even agreed with that, but > still the struct is here. Why? > At that moment C implementation of datetime_strftime, datetime_ctime, datetime_asctime, and datetime_now was in different patch in patchset. Now it's squashed to the 2nd commit in branch - where module introduced. >> + }; > > <...> > >> +local datetime_t = ffi.typeof('struct datetime_t') >> +local interval_t = ffi.typeof('struct datetime_interval_t') >> + >> +local function is_interval(o) >> + return type(o) == 'cdata' and ffi.istype(interval_t, o) >> +end >> + >> +local function is_datetime(o) >> + return type(o) == 'cdata' and ffi.istype(datetime_t, o) >> +end >> + >> +local function is_date_interval(o) >> + return is_datetime(o) or is_interval(o) > > 4. You make type(o) == 'cdata' check 2 times here. Yes. It's not a problem, IMHO. LuaJIT after inlining of those 2 functions should take care of that. (I believe) > >> +end >> + >> +local function interval_new() >> + local interval = ffi.new(interval_t) >> + return interval >> +end >> + >> +local function check_date(o, message, lvl) > > 5. 'lvl' is never used by any of the invocations. You waste time on > the 'if', so it can be dropped. Or start using 'lvl' but never pass it > as nil, please. The same for all the other functions below taking > 'lvl'. Cleaned up. > >> + if lvl == nil then >> + lvl = 2 >> + end >> + if not is_datetime(o) then >> + return error(("%s: expected datetime, but received %s"): >> + format(message, o), lvl) >> + end >> +end > <...> > >> + >> +local function datetime_cmp(lhs, rhs) > > 6. This does not look correct: > > tarantool> datetime = require('datetime') > --- > ... > > tarantool> d = datetime() > --- > ... > > tarantool> d < 'bad string' > --- > - false > ... > > tarantool> d > 'bad string' > --- > - false > ... These results in 1st commit were actually side-effect of incorrect comparison implementation. But even today, after all interval extensions added and all extra checks introduced, it still shows same results for bogus data of unsupported types. ========================================== local function datetime_cmp(lhs, rhs) if not is_date_interval(lhs) or not is_date_interval(rhs) then return nil end local sdiff = lhs.secs - rhs.secs return sdiff ~= 0 and sdiff or (lhs.nsec - rhs.nsec) end local function datetime_eq(lhs, rhs) local rc = datetime_cmp(lhs, rhs) return rc ~= nil and rc == 0 or false end local function datetime_lt(lhs, rhs) local rc = datetime_cmp(lhs, rhs) return rc ~= nil and rc < 0 or false end local function datetime_le(lhs, rhs) local rc = datetime_cmp(lhs, rhs) return rc ~= nil and rc <= 0 or false end ========================================== i.e. whenever we see not datetime or interva value datetime_cmp return nil, and all comparisons always return false. Please give me know if that's incorrect scenario and we should raise exception. This was the way Oleg Bain has requested it to behave. (return false for bogus data) > > Do you have any tests? In this commit I see zero tests for > your code. Only very few tests for dt library. Please, add > tests for everything you have here. A lot. It started to be added with the next commit, and added with any other commit adding functionality to datetime.lua. It's pity you didn't have a chance to see next commits. > >> + if not is_date_interval(lhs) or >> + not is_date_interval(rhs) then >> + return nil >> + end >> + local sdiff = lhs.secs - rhs.secs >> + return sdiff ~= 0 and sdiff or (lhs.nsec - rhs.nsec) >> +end >> + >> +local function datetime_eq(lhs, rhs) >> + local rc = datetime_cmp(lhs, rhs) >> + return rc ~= nil and rc == 0 or false >> +end >> + >> +local function datetime_lt(lhs, rhs) >> + local rc = datetime_cmp(lhs, rhs) >> + return rc ~= nil and rc < 0 or false >> +end >> + >> +local function datetime_le(lhs, rhs) >> + local rc = datetime_cmp(lhs, rhs) >> + return rc ~= nil and rc <= 0 or false >> +end >> + >> +local function datetime_serialize(self) >> + -- Allow YAML, MsgPack and JSON to dump objects with sockets >> + return { secs = self.secs, nsec = self.nsec, offset = self.offset } >> +end >> + >> +local function interval_serialize(self) >> + -- Allow YAML and JSON to dump objects with sockets > > 7. What do these comments mean? Why is it specific to these > formats? And why doesn this comment include MsgPack while the > previous one does? Removed confusing text. > >> + return { secs = self.secs, nsec = self.nsec } >> +end >> + >> +local function local_rd(o) >> + return math.floor(tonumber(o.secs / SECS_PER_DAY)) + DT_EPOCH_1970_OFFSET > > 8. I suggest to cache math.floor into a global variable. The same > for the other math.* usages in this file. To save one index operation. Cached math.floor and math.modf. > >> +end >> + >> +local function local_dt(o) >> + return builtin.dt_from_rdn(local_rd(o)) >> +end >> + >> +local function _normalize_nsec(secs, nsec) > > 9. Why do you have '_' prefix? The function is not global, you > don't need any protection against name conflicts. Removed _ from the name. > >> + if nsec < 0 then >> + secs = secs - 1 >> + nsec = nsec + NANOS_PER_SEC >> + elseif nsec >= NANOS_PER_SEC then >> + secs = secs + 1 >> + nsec = nsec - NANOS_PER_SEC >> + end >> + return secs, nsec >> +end >> + >> +local function check_range(v, range, txt) >> + assert(#range == 2) >> + if not (v >= range[1] and v <= range[2]) then > > 10. Wouldn't it be shorter to make > > if v < range[1] or v > range[2] then > > ? > Applied. >> + error(('value %d of %s is out of allowed range [%d, %d]'): >> + format(v, txt, range[1], range[2])) >> + end >> +end >> + >> +local datetime_index_handlers = { >> + unixtime = function(self) >> + return self.secs >> + end, >> + >> + timestamp = function(self) >> + return tonumber(self.secs) + self.nsec / 1e9 > > 11. If you are saying the Lua number value range is enough, > why do you store them as 64bit integers in 'self' and > convert to a number only for the external API? May be I misunderstood your sentence, but let me elaborate here. I need seconds and nanoseconds kept separately for their efficient and precise handling, and for passing to c-dt. If we would keep 32-bit integers in seconds then we would be able to handle only dates upto 2038 year, thus we need 64-bit seconds for extended range. OTOH, not whole 64-bit range is practical, and required for expected in real-life datetime range values. It's not a problem that Lua number and int64_t have very different range for precise integer values. Lua number could handle integer values upto 9e15, which is corresponding to ... tarantool> T.secs = 9e15 --- ... tarantool> T --- - 2979311-04-03T16:00Z ... Which is more than enough for us. > >> + end, >> + >> + nanoseconds = function(self) >> + return self.secs * 1e9 + self.nsec >> + end, >> + >> + microseconds = function(self) >> + return self.secs * 1e6 + self.nsec / 1e3 >> + end, >> + >> + milliseconds = function(self) >> + return self.secs * 1e3 + self.nsec / 1e6 >> + end, >> + >> + seconds = function(self) >> + return tonumber(self.secs) + self.nsec / 1e9 >> + end, >> + >> + minutes = function(self) >> + return (tonumber(self.secs) + self.nsec / 1e9) / 60 >> + end, >> + >> + hours = function(self) >> + return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60) >> + end, >> + >> + days = function(self) >> + return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60) / 24 > > 12. Is Lua parser able to merge the last constants into one > number to make only one division at runtime? If not, I suggest > to do it manually. Good question. Apparently Lua parser is able to fold constant expression if it's inside of parenthesis, and is unable to cross pairs boundaries. Hope LuaJIT in -jO2 mode is better here, but at least we have a workaround for constant expressions - put them in the same pairs. i.e. ``` local days = function(self) return (tonumber(self.secs) + self.nsec / 1e9) / (24 * 60 * 60) end ``` calculates constant expression value (see 86400) ``` -- BYTECODE -- days_ffi.lua:9-11 0001 GGET 1 0 ; "tonumber" 0002 TGETS 2 0 1 ; "secs" 0003 CALL 1 2 2 0004 TGETS 2 0 2 ; "nsec" 0005 DIVVN 2 2 0 ; 1000000000 0006 ADDVV 1 1 2 0007 DIVVN 1 1 1 ; 86400 0008 RET1 1 2 ``` but original version: ``` local days = function(self) return (tonumber(self.secs) + self.nsec / 1e9) / (60 * 60) / 24 end ``` had separate expression leafs, without folded constant: ``` -- BYTECODE -- days_ffi.lua:9-11 0001 GGET 1 0 ; "tonumber" 0002 TGETS 2 0 1 ; "secs" 0003 CALL 1 2 2 0004 TGETS 2 0 2 ; "nsec" 0005 DIVVN 2 2 0 ; 1000000000 0006 ADDVV 1 1 2 0007 DIVVN 1 1 1 ; 3600 0008 DIVVN 1 1 2 ; 24 0009 RET1 1 2 ``` Changed to use parenthesis accordingly. > >> + end, >> +} >> + >> +local datetime_index = function(self, key) >> + return datetime_index_handlers[key] ~= nil and >> + datetime_index_handlers[key](self) or nil > > 13. To save one hash lookup operation you can do this: > > local h = datetime_index_handlers[key] > return h and h(self) > > The same for datetime_newindex. Applied. > >> +end >> + >> +local datetime_newindex_handlers = { >> + unixtime = function(self, value) >> + self.secs = value >> + self.nsec, self.offset = 0, 0 >> + end, >> + >> + timestamp = function(self, value) >> + local secs, frac = math.modf(value) >> + self.secs = secs >> + self.nsec = frac * 1e9 >> + self.offset = 0 >> + end, >> +} >> + >> +local function datetime_newindex(self, key, value) >> + if datetime_newindex_handlers[key] ~= nil then >> + datetime_newindex_handlers[key](self, value) >> + end >> +end >> + >> +local function datetime_new_raw(secs, nsec, offset) >> + local dt_obj = ffi.new(datetime_t) >> + dt_obj.secs = secs >> + dt_obj.nsec = nsec >> + dt_obj.offset = offset >> + return dt_obj >> +end >> + >> +local function mk_timestamp(dt, sp, fp, offset) > > 14. What are 'sp' and 'fp'? sp - seconds part fp - fractional part Agreed that those names were confusing, so renamed to simply `secs` and `frac`. > > 15. The function is a datetime constructor, because it creates and > returns a new datetime object. Please, call it correspondingly > then. For instance, `datetime_new_dt()`. Renamed to datetime_new_dt > >> + local epochV = dt ~= nil and (builtin.dt_rdn(dt) - DT_EPOCH_1970_OFFSET) * >> + SECS_PER_DAY or 0 >> + local spV = sp ~= nil and sp or 0 >> + local fpV = fp ~= nil and fp or 0 >> + local ofsV = offset ~= nil and offset or 0 >> + return datetime_new_raw (epochV + spV - ofsV * 60, fpV, ofsV) >> +end >> + >> +-- create @datetime_t given object @o fields > > 16. Didn't we agree in the previous review that you will stop using > name 'o'? At least in the context where there are zeros. It is very > confusing. > Renamed this confusing case to `obj`. >> +local function datetime_new(o) >> + if o == nil then >> + return datetime_new_raw(0, 0, 0) >> + end >> + local secs = 0 >> + local nsec = 0 >> + local offset = 0 >> + local easy_way = false >> + local y = 0 >> + local M = 0 >> + local d = 0 >> + local ymd = false >> + >> + local h = 0 >> + local m = 0 >> + local s = 0 >> + local frac = 0 >> + local hms = false >> + >> + local dt = 0 >> + >> + local handlers = { > > 17. Please, stop using closures in the constructor of a > potentionally very frequently created/deleted object > I still disagreed that datetime_new constructor will be on critical path, the most frequent case expected to be parse*() -> datetime_new_dt() -> datetime_new_raw(). i.e. when we parse millions of timestamps in logs and then proceed them. datetime_new(builder_object) expected to be called in rare cases when object got created from Lua code, not on performance critical paths, thus convenience here is more important than performance. And BTW I've created simple benchmark for comparing datetime_new_dt vs datetime_new(from builder object) vs datetime_new_2 which would be not using hash object with handlers, but rather sequence of ifs. https://gist.github.com/tsafin/31cc9b0872b6015904fcc90d97740770 ``` ~/datetime/tarantoolt [tsafin/gh-5941-datetime-v3] 00:39 $ ./build/src/tarantool ../bench-hash-dt.lua datetime_new_dt 0.002718448638916 datetime_new 5.1318607330322 datetime_new(precreated) 4.4448664188385 datetime_new_2(ifs) 4.4977836608887 ``` Note, that creation from builder object is much, much slower than raw cdata object creation (that's expected). If we move object out of loop, and precreate it then we save some time (as expected) 5.1 vs 4.4. But if we change implementation from hash with closure handlers to sequence of ifs - we actually degrade performance (a bit) from 4.44 to 4.49. Biggest performance impact here - traversing of object attributes, but whether it's using closures or not using them, it makes no much difference. But using closures makes this code look much more elegant. So keeping it here for today. >> + secs = function(_, v) >> + secs = v >> + easy_way = true >> + end, >> + >> + nsec = function(_, v) >> + nsec = v >> + easy_way = true >> + end, >> + >> + offset = function (_, v) >> + offset = v >> + easy_way = true >> + end, >> + >> + year = function(k, v) >> + check_range(v, {1, 9999}, k) >> + y = v >> + ymd = true >> + end, >> + >> + month = function(k, v) >> + check_range(v, {1, 12}, k) >> + M = v >> + ymd = true >> + end, >> + >> + day = function(k, v) >> + check_range(v, {1, 31}, k) >> + d = v >> + ymd = true >> + end, >> + >> + hour = function(k, v) >> + check_range(v, {0, 23}, k) >> + h = v >> + hms = true >> + end, >> + >> + minute = function(k, v) >> + check_range(v, {0, 59}, k) >> + m = v >> + hms = true >> + end, >> + >> + second = function(k, v) >> + check_range(v, {0, 60}, k) >> + s, frac = math.modf(v) >> + frac = frac * 1e9 -- convert fraction to nanoseconds >> + hms = true >> + end, >> + >> + -- tz offset in minutes >> + tz = function(k, v) >> + check_range(v, {0, 720}, k) >> + offset = v >> + end >> + } >> + for key, value in pairs(o) do >> + handlers[key](key, value) >> + end >> + >> + -- .sec, .nsec, .offset >> + if easy_way then >> + return datetime_new_raw(secs, nsec, offset) >> + end >> + >> + -- .year, .month, .day >> + if ymd then >> + dt = dt + builtin.dt_from_ymd(y, M, d) >> + end >> + >> + -- .hour, .minute, .second >> + if hms then >> + secs = h * 3600 + m * 60 + s >> + end >> + >> + return mk_timestamp(dt, secs, frac, offset) >> +end > > 18. It fails with a bad error: > > tarantool> d = datetime({garbage = true}) > --- > - error: 'builtin/datetime.lua:449: attempt to call a nil value' > ... > Added default handler for an unknown attribute. ``` tarantool> T = date{years = -123} --- - error: '[string "T = date{years = -123}"]:1: unknown attribute years' ... ``` > > 19. It can ignore some fields (here it ignores 'year'): > > tarantool> d1 = datetime({secs = 100, year = 1}) > --- > ... > > tarantool> d2 = datetime({secs = 100, year = 2}) > --- > ... > > tarantool> d1 == d2 > --- > - true > ... > Yes, and that's by design. .secs defines 'easy_way' of initializing object via secs/nsec/offset, while .year selects 'ymd' mode. And `easy_way` has precedence. It looks reasonable for me, please, suggest your idea how to handle it even better without much redesign? > > 20. Serialization does not seem to be working in the console > despite you tried to implement it above: > > tarantool> d = datetime({secs = 100}) > --- > ... > > tarantool> d > --- > - 'cdata: 0x0103ad9c50' > ... Yup, redefining of __serialize is not enough for cdata object by whatever reason (I didn't look deep). But after the next patch(es) it's start to stringize properly. > >> + >> +local function date_first(lhs, rhs) >> + if is_datetime(lhs) then >> + return lhs, rhs >> + else >> + return rhs, lhs >> + end >> +end >> + >> +local function error_incompatible(name) >> + error(("datetime:%s() - incompatible type of arguments"): >> + format(name), 3) >> +end >> + >> +local function datetime_sub(lhs, rhs) >> + check_date_interval(lhs, "operator -") >> + local d, s = lhs, rhs >> + local left_t = ffi.typeof(d) >> + local right_t = ffi.typeof(s) >> + local o >> + >> + if left_t == datetime_t then >> + -- left is date, right is date or generic interval >> + if (right_t == datetime_t or right_t == interval_t) then > > 21. You do not need () around the condition in Lua. Fixed. > >> + o = right_t == datetime_t and interval_new() or datetime_new() >> + o.secs, o.nsec = _normalize_nsec(lhs.secs - rhs.secs, >> + lhs.nsec - rhs.nsec) >> + return o >> + else >> + error_incompatible("operator -") >> + end >> + -- both left and right are generic intervals >> + elseif left_t == interval_t and right_t == interval_t then >> + o = interval_new() >> + o.secs, o.nsec = _normalize_nsec(lhs.secs - rhs.secs, >> + lhs.nsec - rhs.nsec) >> + return o >> + else >> + error_incompatible("operator -") >> + end >> +end > > 22. Did you really test this function? > > tarantool> dt = require('datetime') > --- > ... > > tarantool> d1 = dt() > --- > ... > > tarantool> d2 - 1 > --- > - error: 'builtin/datetime.lua:487: bad argument #1 to ''typeof'' (C type expected, > got number)' > ... All kinds of arguments checking implemented in the separate patch about interval arithmetic operations, please find a moment to look into them. > >> + >> +local function datetime_add(lhs, rhs) >> + local d, s = date_first(lhs, rhs) >> + >> + check_date_interval(d, "operator +") >> + check_interval(s, "operator +") >> + local left_t = ffi.typeof(d) >> + local right_t = ffi.typeof(s) >> + local o >> + >> + -- left is date, right is date or interval >> + if left_t == datetime_t and right_t == interval_t then >> + o = datetime_new() >> + o.secs, o.nsec = _normalize_nsec(d.secs + s.secs, d.nsec + s.nsec) >> + return o >> + -- both left and right are generic intervals >> + elseif left_t == interval_t and right_t == interval_t then >> + o = interval_new() >> + o.secs, o.nsec = _normalize_nsec(d.secs + s.secs, d.nsec + s.nsec) >> + return o >> + else >> + error_incompatible("operator +") >> + end >> +end >> + >> +-- simple parse functions: >> +-- parse_date/parse_time/parse_zone > > 23. As I said in the previous review, the comment is useless. > You just listed 3 function names following the comment. What > is the purpose of that? What should I learn from it? Please, > drop it. Dropped. > > <...> > >> + >> +local function datetime_from(o) >> + if o == nil or type(o) == 'table' then >> + return datetime_new(o) >> + elseif type(o) == 'string' then >> + return parse(o) >> + end >> +end >> + >> +local function local_now() >> + local p_tv = ffi.new('struct timeval [1]') >> + local rc = builtin.gettimeofday(p_tv, nil) > > 24. Why can't you use fiber time? It should be enough in > nearly all usecases and is much faster. For real-real time > talking directly to the kernel we have 'clock' Lua module > already. You could check if results of clock functions can > be passed into your module to create 'real' timestamps not > depending on the eventloop. Fiber time, which actually return cached ev_rt_now, looks imprecise and depends on event loop circumstances, though eventually it calls the same clock_gettime(). So fiber-time is looking like a cached timestamp value from [hopefully, non-distant] past. I'd rather prefer to have more realtime values, and clock_realtime*/clock_monotonic* look relevant, they call clock_gettime() for their purposes, which in turn calls the same gettimeofday(). So in this case we call same api, but with 2 extra call layers above. IMHO, not worth extra complexity if we could just simply call gettimeofday(). [BTW, in current branch this code lives in C - I've moved ctime/asctime/strftime/now handlers to datetime.c, to hide Linux vs BSD differences there] > >> + assert(rc == 0) >> + >> + local secs = p_tv[0].tv_sec >> + local nsec = p_tv[0].tv_usec * 1000 >> + >> + local p_time = ffi.new('time_t[1]') >> + local p_tm = ffi.new('struct tm[1]') >> + builtin.time(p_time) >> + builtin.localtime_r(p_time, p_tm) >> + local ofs = p_tm[0].tm_gmtoff / 60 -- convert seconds to minutes >> + >> + return datetime_new_raw(secs, nsec, ofs) >> +end >> + >> +local function datetime_to_tm_ptr(o) >> + assert(is_datetime(o)) >> + local p_tm = ffi.new('struct tm[1]') >> + -- dt_to_struct_tm() fills only date data >> + builtin.dt_to_struct_tm(local_dt(o), p_tm) >> + >> + -- calculate the smaller data (hour, minute, >> + -- seconds) using datetime seconds value >> + local seconds_of_day = o.secs % 86400 >> + local hour = (seconds_of_day / 3600) % 24 >> + local minute = (seconds_of_day / 60) % 60 >> + p_tm[0].tm_sec = seconds_of_day % 60 >> + p_tm[0].tm_min = minute >> + p_tm[0].tm_hour = hour >> + >> + p_tm[0].tm_gmtoff = o.offset * 60 >> + >> + return p_tm >> +end There is no such code in Lua anymore. >> + >> +local function asctime(o) >> + check_date(o, "datetime:asctime()") >> + >> + local p_tm = datetime_to_tm_ptr(o) >> + return ffi.string(builtin.asctime(p_tm)) >> +end >> + >> +local function ctime(o) >> + check_date(o, "datetime:ctime()") >> + local p_time = ffi.new('time_t[1]') >> + p_time[0] = o.secs >> + return ffi.string(builtin.ctime(p_time)) > > 25. Functions like ctime can't be used in Lua, because according > to their Linux man: > > The return value points to a statically allocated string > which might be overwritten by subsequent calls to any of the > date and time functions > > Which means they use a global buffer. If you will > use datetime module in any of __gc handlers, it will > return garbage sometimes or just override results > somethere. That is exactly the same problem as existed > with tarantool's static buffer. I do not believe it's the same problem - in this case content of static buffer is immediately interned in Lua via ffi.string() https://luajit.org/ext_ffi_api.html "The Lua string is an (interned) copy of the data and bears no relation to the original data area anymore." > >> diff --git a/test/unit/datetime.c b/test/unit/datetime.c >> new file mode 100644 >> index 000000000..b6f568c03 >> --- /dev/null >> +++ b/test/unit/datetime.c > > 26. This file tests dt library. It does not test your code. Please, add > tests for everything you implemented in this commit. For c-dt library > the tests must be added in the previous commit, and the test file > must be called test/unit/cdt.c then. Unit-test has been moved to the commit introducing c-dt. And in turn, initial datetime test (checking parser and ctime/asctime/now) squashed to the same commit where we introduce module with parser. > >> @@ -0,0 +1,221 @@ >> +#include "dt.h" >> +#include >> +#include >> +#include >> +#include >> + >> +#include "unit.h" >> + >> +const char sample[] = "2012-12-24T15:30Z"; > > 27. Please, make it static. Done > >> + >> +#define S(s) {s, sizeof(s) - 1} >> +struct { >> + const char * sz; >> + size_t len; >> +} tests[] = { >> + S("2012-12-24 15:30Z"), >> + S("2012-12-24 15:30z"), >> + S("2012-12-24 15:30"), >> + S("2012-12-24 16:30+01:00"), >> + S("2012-12-24 16:30+0100"), >> + S("2012-12-24 16:30+01"), >> + S("2012-12-24 14:30-01:00"), >> + S("2012-12-24 14:30-0100"), >> + S("2012-12-24 14:30-01"), >> + S("2012-12-24 15:30:00Z"), >> + S("2012-12-24 15:30:00z"), >> + S("2012-12-24 15:30:00"), >> + S("2012-12-24 16:30:00+01:00"), >> + S("2012-12-24 16:30:00+0100"), >> + S("2012-12-24 14:30:00-01:00"), >> + S("2012-12-24 14:30:00-0100"), >> + S("2012-12-24 15:30:00.123456Z"), >> + S("2012-12-24 15:30:00.123456z"), >> + S("2012-12-24 15:30:00.123456"), >> + S("2012-12-24 16:30:00.123456+01:00"), >> + S("2012-12-24 16:30:00.123456+01"), >> + S("2012-12-24 14:30:00.123456-01:00"), >> + S("2012-12-24 14:30:00.123456-01"), >> + S("2012-12-24t15:30Z"), >> + S("2012-12-24t15:30z"), >> + S("2012-12-24t15:30"), >> + S("2012-12-24t16:30+01:00"), >> + S("2012-12-24t16:30+0100"), >> + S("2012-12-24t14:30-01:00"), >> + S("2012-12-24t14:30-0100"), >> + S("2012-12-24t15:30:00Z"), >> + S("2012-12-24t15:30:00z"), >> + S("2012-12-24t15:30:00"), >> + S("2012-12-24t16:30:00+01:00"), >> + S("2012-12-24t16:30:00+0100"), >> + S("2012-12-24t14:30:00-01:00"), >> + S("2012-12-24t14:30:00-0100"), >> + S("2012-12-24t15:30:00.123456Z"), >> + S("2012-12-24t15:30:00.123456z"), >> + S("2012-12-24t16:30:00.123456+01:00"), >> + S("2012-12-24t14:30:00.123456-01:00"), >> + S("2012-12-24 16:30 +01:00"), >> + S("2012-12-24 14:30 -01:00"), >> + S("2012-12-24 15:30 UTC"), >> + S("2012-12-24 16:30 UTC+1"), >> + S("2012-12-24 16:30 UTC+01"), >> + S("2012-12-24 16:30 UTC+0100"), >> + S("2012-12-24 16:30 UTC+01:00"), >> + S("2012-12-24 14:30 UTC-1"), >> + S("2012-12-24 14:30 UTC-01"), >> + S("2012-12-24 14:30 UTC-01:00"), >> + S("2012-12-24 14:30 UTC-0100"), >> + S("2012-12-24 15:30 GMT"), >> + S("2012-12-24 16:30 GMT+1"), >> + S("2012-12-24 16:30 GMT+01"), >> + S("2012-12-24 16:30 GMT+0100"), >> + S("2012-12-24 16:30 GMT+01:00"), >> + S("2012-12-24 14:30 GMT-1"), >> + S("2012-12-24 14:30 GMT-01"), >> + S("2012-12-24 14:30 GMT-01:00"), >> + S("2012-12-24 14:30 GMT-0100"), >> + S("2012-12-24 14:30 -01:00"), >> + S("2012-12-24 16:30:00 +01:00"), >> + S("2012-12-24 14:30:00 -01:00"), >> + S("2012-12-24 16:30:00.123456 +01:00"), >> + S("2012-12-24 14:30:00.123456 -01:00"), >> + S("2012-12-24 15:30:00.123456 -00:00"), >> + S("20121224T1630+01:00"), >> + S("2012-12-24T1630+01:00"), >> + S("20121224T16:30+01"), >> + S("20121224T16:30 +01"), >> +}; >> +#undef S >> + >> +#define DIM(a) (sizeof(a) / sizeof(a[0])) >> + >> +// p5-time-moment/src/moment_parse.c: parse_string_lenient() > > 28. Please, stop using // comment opening. > Uh, ok. All, with exception of separation of mp_datetime and introduction of test (test requires some time for plumbing) is alreday in the branch updated - tsafin/gh-5941-datetime-v3 Vlad, if you still have some time this Friday, before goin to vacation, then please look into the rest of patches I have so far. Now they 7, and look reasonable enough: * a1905f727 build: add Christian Hansen c-dt to the build * f1dcde7ec lua: built-in module datetime * 88e25eaa0 lua, datetime: display datetime * cfa40d93e box, datetime: messagepack support for datetime * 9e718e5c5 box, datetime: datetime comparison for indices * 8d26d2f76 lua, datetime: time intervals support * 9b338fc57 datetime: changelog for datetime module Thanks, Timur