From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id EA5936EC55; Fri, 30 Jul 2021 18:39:27 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org EA5936EC55 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1627659568; bh=dhrcUJnA/LsjEvgWXbTYJ+Noqd7TtVOkC7aJ4/KRSbA=; h=To:References:In-Reply-To:Date:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=x0H/F+SMBHe7rfN4ijUUglWwgN/uVfVembba+lLsS2ItuJj1FwjToijVnZc/uAWg3 XJyUH3eWrXC6WA3kMi8kMO4wdOCvppPDqsFA2MLswi9l4MGzT1S4yNdu/0SpiIfHiG uhXJ/l722BEaYQN1nO14xPlpLv6HNEdnnqAOouj8= Received: from smtp48.i.mail.ru (smtp48.i.mail.ru [94.100.177.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 4A8D76EC55 for ; Fri, 30 Jul 2021 18:39:27 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 4A8D76EC55 Received: by smtp48.i.mail.ru with esmtpa (envelope-from ) id 1m9Ubd-00007m-Ut; Fri, 30 Jul 2021 18:39:26 +0300 To: "'Vladislav Shpilevoy'" References: <50175cbd00c57e3bc8eed222951a551f4bb7effb.1627468002.git.tsafin@tarantool.org> <75730675-e162-887c-3ab3-1c8a8600b27e@tarantool.org> In-Reply-To: <75730675-e162-887c-3ab3-1c8a8600b27e@tarantool.org> Date: Fri, 30 Jul 2021 18:39:25 +0300 Message-ID: <031c01d78559$13ae90f0$3b0bb2d0$@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Mailer: Microsoft Outlook 16.0 Thread-Index: AQNJI0ot4YBfKbZyj2tYNprdOv6c+gGi40uBAZLeCUOoXiyKQA== Content-Language: ru X-4EC0790: 10 X-7564579A: B8F34718100C35BD X-77F55803: 4F1203BC0FB41BD941C43E597735A9C351B198F4576AC7B21928AAE70459C21B182A05F538085040F2E2D1B8E34EF48FB6635FBC7181706C3489232B21CE758B3BE35DF8CE00071B X-7FA49CB5: FF5795518A3D127A4AD6D5ED66289B5278DA827A17800CE7F1942E6D70B4A2F0EA1F7E6F0F101C67BD4B6F7A4D31EC0BCC500DACC3FED6E28638F802B75D45FF8AA50765F790063700B087299B9BE63E8638F802B75D45FF36EB9D2243A4F8B5A6FCA7DBDB1FC311F39EFFDF887939037866D6147AF826D889D6CD22BA2798C82B83348FEAA00206117882F4460429724CE54428C33FAD305F5C1EE8F4F765FCAA867293B0326636D2E47CDBA5A96583BD4B6F7A4D31EC0BC014FD901B82EE079FA2833FD35BB23D27C277FBC8AE2E8BAA867293B0326636D2E47CDBA5A96583BA9C0B312567BB231DD303D21008E29813377AFFFEAFD269A417C69337E82CC2E827F84554CEF50127C277FBC8AE2E8BA83251EDC214901ED5E8D9A59859A8B6A45692FFBBD75A6A089D37D7C0E48F6C5571747095F342E88FB05168BE4CE3AF X-B7AD71C0: AC4F5C86D027EB782CDD5689AFBDA7A213B5FB47DCBC3458834459D11680B5055CADDCBDF01DD427D77AB57EC845BC15 X-C1DE0DAB: 0D63561A33F958A505DB97CC5C8BC5CBFEFDE6618BBD99F71D6AD2B540831A87D59269BC5F550898D99A6476B3ADF6B47008B74DF8BB9EF7333BD3B22AA88B938A852937E12ACA751B940EDA0DFB0535410CA545F18667F91A7EA1CDA0B5A7A0 X-C8649E89: 4E36BF7865823D7055A7F0CF078B5EC49A30900B95165D34951738D4D62E58A47143F85D2AB4E2E4617FB364BD496082A73A97AA98EF7ADEB67A59399499ADBA1D7E09C32AA3244C9FEAEDCB9D49D73873E451E0E90A06E2BBA718C7E6A9E042927AC6DF5659F194 X-D57D3AED: 3ZO7eAau8CL7WIMRKs4sN3D3tLDjz0dLbV79QFUyzQ2Ujvy7cMT6pYYqY16iZVKkSc3dCLJ7zSJH7+u4VD18S7Vl4ZUrpaVfd2+vE6kuoey4m4VkSEu530nj6fImhcD4MUrOEAnl0W826KZ9Q+tr5ycPtXkTV4k65bRjmOUUP8cvGozZ33TWg5HZplvhhXbhDGzqmQDTd6OAevLeAnq3Ra9uf7zvY2zzsIhlcp/Y7m53TZgf2aB4JOg4gkr2biojWBddABnKmoKW9yBA4z0NwQ== X-Mailru-Sender: 3B9A0136629DC91279414E1E1306C6036AD3BD07E3294FD2B6635FBC7181706C4AF810104A35EE5EFF1C70CE0B45DD4C1458020726E2BC9FA2617F3CD613626F01CEC67AEC592A747402F9BA4338D657ED14614B50AE0675 X-Mras: Ok Subject: Re: [Tarantool-patches] [PATCH resend v2 02/11] lua: built-in module datetime X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Timur Safin via Tarantool-patches Reply-To: Timur Safin Cc: tarantool-patches@dev.tarantool.org Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" : From: Vladislav Shpilevoy : Subject: Re: [PATCH resend v2 02/11] lua: built-in module datetime :=20 : Thanks for the patch! :=20 : This is a bad question to ask now, but why did you choose a dead = library : from 2015 with just 25 stars, without any tests, and without any = README as : a basic block for our datetime types? Is this the correct link I am : using - https://github.com/chansen/c-dt? Good you asked :) That was data-driven decision and you could see = details here https://github.com/tsafin/bench-timestamps In short c-dt is as almost as feature-rich as ICU or CCTZ, but much, = much=20 faster (and we know how to make it even faster if we have to). Originally I've learned about c-dt from perl' perl5-time-moment = mentioned by Mons (which module was the fastest date/time implementation in Perl = ecosystem). :=20 : See 31 comment below. :=20 : On 28.07.2021 12:34, Timur Safin wrote: : > * created a new Tarantool built-in module `datetime`; : > * register cdef types for this module; : > * export some `dt_*` functions from `c-dt` library; : > * lua implementationis of `asctime` and `strftime`; :=20 : 1. implementationis -> implementations :=20 : > * datetime parsing unit tests, with and withput timezones; :=20 : 2. withput -> without Thanks! :=20 : > diff --git a/src/exports.h b/src/exports.h : > index 5bb3e6a2b..db40c03a4 100644 : > --- a/src/exports.h : > +++ b/src/exports.h : > @@ -531,3 +531,24 @@ EXPORT(uri_format) : > EXPORT(uri_parse) : > EXPORT(uuid_nil) : > EXPORT(uuid_unpack) : > +EXPORT(dt_from_rdn) : > +EXPORT(dt_from_yd) : > +EXPORT(dt_from_ymd) : > +EXPORT(dt_from_yqd) : > +EXPORT(dt_from_ywd) : > +EXPORT(dt_to_yd) : > +EXPORT(dt_to_ymd) : > +EXPORT(dt_to_yqd) : > +EXPORT(dt_to_ywd) : > +EXPORT(dt_rdn) : > +EXPORT(dt_dow) : > +EXPORT(dt_parse_iso_date) : > +EXPORT(dt_parse_iso_time) : > +EXPORT(dt_parse_iso_time_basic) : > +EXPORT(dt_parse_iso_time_extended) : > +EXPORT(dt_parse_iso_zone) : > +EXPORT(dt_parse_iso_zone_basic) : > +EXPORT(dt_parse_iso_zone_extended) : > +EXPORT(dt_parse_iso_zone_lenient) : > +EXPORT(dt_from_struct_tm) : > +EXPORT(dt_to_struct_tm) :=20 : 3. Please, read the comment in the beginning of the file. The : first sentence would be: :=20 : Keep the symbols sorted by name for search and addition : simplicity, to avoid duplicates. Yes, I knew it, but forgot.=20 BTW, when I tried to sort the whole block I have discovered that there are multiple of symbols, which are already in incorrect order but I decided to not touch irrelevant chunks, and only sorted d* See in the next iteration... :=20 : > diff --git a/src/lib/core/datetime.h b/src/lib/core/datetime.h : > new file mode 100644 : > index 000000000..403bf1c64 : > --- /dev/null : > +++ b/src/lib/core/datetime.h : > @@ -0,0 +1,61 @@ :=20 : <...> :=20 : > +#include :=20 : 4. Please, use "" instead of <> for non-system headers. Ok. :=20 : > +#include : > +#include : > + : > +#if defined(__cplusplus) : > +extern "C" { : > +#endif /* defined(__cplusplus) */ : > + : > +/** : > + * datetime structure consisting of: :=20 : 5. Well, this comment isn't really helpful. Also we do not : use _t suffix ever except for typedefs. Please, drop both : these things. :=20 : The same for datetime_interval_t. Will rename to `struct datetime`, and `struct datetime_interval`=20 Accordingly. :=20 : > + */ : > +struct datetime_t { : > + int64_t secs; ///< seconds since epoch : > + int32_t nsec; ///< nanoseconds if any : > + int32_t offset; ///< offset in minutes from GMT :=20 : 6. We never use // nor /// nor ///< nor we write comments : on the same line as the code (except for old legacy code we : inherited from sqlite and some super old tarantool code. BTW, why? Single line comment // is the legal C construction=20 since C99 (and always have been acceptable by Gnu C and major=20 other compilers).=20 Also did you know that ///< is doxygen construction for=20 documentation of a struct member? (Asking, just to make sure) :=20 : The same for datetime_interval_t. :=20 : > +}; : > + : > +/** : > + * Date/time delta structure : > + */ : > +struct datetime_interval_t { : > + int64_t secs; ///< relative seconds delta : > + int32_t nsec; ///< nanoseconds delta : > +}; : > + :=20 : 7. Why do you need this file? It is not included anywhere. : And you don't need to define the structs in C if you are : using them in Lua only. You can just define them in Lua : using ffi.cdef like it is done in some other places. I'll need to use it in C very soon, once I get to SQL parser INTERVAL 'xx' support. :=20 : > diff --git a/src/lua/datetime.lua b/src/lua/datetime.lua : > new file mode 100644 : > index 000000000..0996ca5a2 : > --- /dev/null : > +++ b/src/lua/datetime.lua : > @@ -0,0 +1,581 @@ : > +local ffi =3D require('ffi') : > +local cdt =3D ffi.C : > + : > +ffi.cdef [[ : > + : > + typedef int dt_t; : > + : > + // dt_core.h : > + typedef enum { : > + DT_MON =3D 1, : > + DT_MONDAY =3D 1, : > + DT_TUE =3D 2, : > + DT_TUESDAY =3D 2, : > + DT_WED =3D 3, : > + DT_WEDNESDAY =3D 3, : > + DT_THU =3D 4, : > + DT_THURSDAY =3D 4, : > + DT_FRI =3D 5, : > + DT_FRIDAY =3D 5, : > + DT_SAT =3D 6, : > + DT_SATURDAY =3D 6, : > + DT_SUN =3D 7, : > + DT_SUNDAY =3D 7, : > + } dt_dow_t; : > + : > + dt_t dt_from_rdn (int n); : > + dt_t dt_from_yd (int y, int d); : > + dt_t dt_from_ymd (int y, int m, int d); : > + dt_t dt_from_yqd (int y, int q, int d); : > + dt_t dt_from_ywd (int y, int w, int d); : > + : > + void dt_to_yd (dt_t dt, int *y, int *d); : > + void dt_to_ymd (dt_t dt, int *y, int *m, int *d); : > + void dt_to_yqd (dt_t dt, int *y, int *q, int *d); : > + void dt_to_ywd (dt_t dt, int *y, int *w, int *d); : > + : > + int dt_rdn (dt_t dt); : > + dt_dow_t dt_dow (dt_t dt); : > + : > + // dt_parse_iso.h : > + size_t dt_parse_iso_date (const char *str, size_t len, = dt_t : *dt); : > + : > + size_t dt_parse_iso_time (const char *str, size_t len, = int : *sod, int *nsec); : > + size_t dt_parse_iso_time_basic (const char *str, size_t len, = int : *sod, int *nsec); : > + size_t dt_parse_iso_time_extended (const char *str, size_t len, = int : *sod, int *nsec); : > + : > + size_t dt_parse_iso_zone (const char *str, size_t len, = int : *offset); : > + size_t dt_parse_iso_zone_basic (const char *str, size_t len, = int : *offset); : > + size_t dt_parse_iso_zone_extended (const char *str, size_t len, = int : *offset); : > + size_t dt_parse_iso_zone_lenient (const char *str, size_t len, = int : *offset); : > + : > + // dt_tm.h : > + dt_t dt_from_struct_tm (const struct tm *tm); : > + void dt_to_struct_tm (dt_t dt, struct tm *tm); : > + : > + // : > + typedef long __kernel_long_t; : > + typedef unsigned long __kernel_ulong_t; : > + // /usr/include/x86_64-linux-gnu/bits/types/time_t.h : > + typedef long time_t; : > + : > + : > + // : > + typedef __kernel_long_t __kernel_time_t; : > + typedef __kernel_long_t __kernel_suseconds_t; : > + : > + struct timespec { : > + __kernel_time_t tv_sec; /* seconds */ : > + long tv_nsec; /* nanoseconds */ : > + }; : > + : > + struct timeval { : > + __kernel_time_t tv_sec; /* seconds */ : > + __kernel_suseconds_t tv_usec; /* microseconds */ : > + }; : > + : > + struct timezone { : > + int tz_minuteswest; /* minutes west of Greenwich */ : > + int tz_dsttime; /* type of dst correction */ : > + }; : > + : > + // /usr/include/x86_64-linux-gnu/sys/time.h : > + typedef struct timezone * __timezone_ptr_t; : > + : > + /* Get the current time of day and timezone information, : > + putting it into *TV and *TZ. If TZ is NULL, *TZ is not = filled. : > + Returns 0 on success, -1 on errors. : > + : > + NOTE: This form of timezone information is obsolete. : > + Use the functions and variables declared in = instead. */ : > + int gettimeofday (struct timeval *__tv, struct timezone * = __tz); : > + : > + // /usr/include/x86_64-linux-gnu/bits/types/struct_tm.h : > + /* ISO C `broken-down time' structure. */ : > + struct tm : > + { : > + int tm_sec; /* Seconds. [0-60] (1 leap second) */ : > + int tm_min; /* Minutes. [0-59] */ : > + int tm_hour; /* Hours. [0-23] */ : > + int tm_mday; /* Day. [1-31] */ : > + int tm_mon; /* Month. [0-11] */ : > + int tm_year; /* Year - 1900. */ : > + int tm_wday; /* Day of week. [0-6] */ : > + int tm_yday; /* Days in year.[0-365] */ : > + int tm_isdst; /* DST. [-1/0/1]*/ : > + : > + long int tm_gmtoff; /* Seconds east of UTC. */ : > + const char *tm_zone;/* Timezone abbreviation. */ : > + }; : > + : > + // : > + /* Return the current time and put it in *TIMER if TIMER is not = NULL. : */ : > + time_t time (time_t *__timer); : > + : > + /* Format TP into S according to FORMAT. : > + Write no more than MAXSIZE characters and return the number : > + of characters written, or 0 if it would exceed MAXSIZE. */ : > + size_t strftime (char * __s, size_t __maxsize, const char * = __format, : > + const struct tm * __tp); : > + : > + /* Parse S according to FORMAT and store binary time = information in : TP. : > + The return value is a pointer to the first unparsed character = in S. : */ : > + char *strptime (const char * __s, const char * __fmt, struct tm : *__tp); : > + : > + /* Return the `struct tm' representation of *TIMER in UTC, : > + using *TP to store the result. */ : > + struct tm *gmtime_r (const time_t * __timer, struct tm * __tp); : > + : > + /* Return the `struct tm' representation of *TIMER in local = time, : > + using *TP to store the result. */ : > + struct tm *localtime_r (const time_t * __timer, struct tm * = __tp); : > + : > + /* Return a string of the form "Day Mon dd hh:mm:ss yyyy\n" : > + that is the representation of TP in this format. */ : > + char *asctime (const struct tm *__tp); : > + : > + /* Equivalent to `asctime (localtime (timer))'. */ : > + char *ctime (const time_t *__timer); : > + :=20 : 8. The amount of platform-specific stuff you got here really bothers = me. : For instance, struct tm from /usr/include/x86_64-linux- : gnu/bits/types/struct_tm.h. :=20 : Is it said anywhere that the struct has only these members and only in = this : order? The standard says "includes at least the following members". = Not : only them and no specific order. Agreed that despite the fact that `struct tm` is defined pretty = similarly=20 in POSIX/SUSV = https://pubs.opengroup.org/onlinepubs/009696699/basedefs/time.h.html actual layout of (at least last fields tm_gmtoff and tm_zone) might vary significantly. And we better hide this layout specific code (i.e. = allocation and manipulations with fields) into accompanying C code.=20 :=20 : I would probably go with Lua C implementation if I would start this : module. Too strong dependecy on C. Also I wouldn't be so sure FFI is = faster : than : Lua C: https://github.com/tarantool/tarantool/issues/5896. As you know - who is faster is complicated question. There are some subtle bug in LuaJIT with structure allocations which I believe was the major contributor to slowness reported in Decimal branch=20 https://github.com/tarantool/tarantool/issues/692#issuecomment-505043929 = but which has been narrowed down by Sergos and Mike has already = committed patch to upstream in = https://github.com/LuaJIT/LuaJIT/commit/ad65934fa08a65bfb0eb9528731a43948= 42cc173 but, I believe, it's not yet in our LuaJIT fork.=20 I'd probably postpone total redesign from ffi to LuaC of this 1st = version of module (because performance is not yet problem, functionality is), = but rather wait and see how things would change once this struct-of-NYI = would be landed to Tarantool code.=20 :=20 : > +]] : > + : > +local native =3D ffi.C :=20 : 9. You already have cdt in the beginning of the file. Yes, thanks :=20 : > + : > +local SECS_PER_DAY =3D 86400 : > +local NANOS_PER_SEC =3D 1000000000LL : > + : > +-- c-dt/dt_config.h : > + : > +-- Unix, January 1, 1970, Thursday : > +local DT_EPOCH_1970_OFFSET =3D 719163LL :=20 : 10. I remember there was a good question somewhere - how are these : datetimes supposed to store dates before 1970? For example, a : database of historical events or documents. :=20 : Can this library store dates before Christ? Yes, please see relevant discussion with Peter = https://github.com/tarantool/tarantool/discussions/6244#discussioncomment= -1055996 c-dt is properly handling positive and negative dt values,=20 where dt is number of dates since Rata Die, January 1, 0001. For better MessagePack compactness of our typical data we shift = beginning of our 0 to the Unix Epoch (1970-01-01), and our 0 is actually=20 dt=3D719163. So our epoch-based second 0 would become dt value (719163) via this = simple formula (secs / 86400) + 719163. Where 719163 is an offset of Unix Epoch (1970-01-01) since Rata Die (0001-01-01) in dates. We might consider to further shift, though, our relative 0=20 to any date in time like 2000-01-01, whatever, if we would decide that that different base would provide us more compact storage. But I believe unix-epoch date, from statistical point of view, is more convenient for=20 storing and using at runtime. :=20 : > + : > + : > +local datetime_t =3D ffi.typeof('struct datetime_t') : > +local interval_t =3D ffi.typeof('struct datetime_interval_t') : > + : > +local function interval_new() : > + local interval =3D ffi.new(interval_t) : > + return interval : > +end : > + : > +local function adjusted_secs(dt) : > + return dt.secs - dt.offset * 60 : > +end : > + : > +local function datetime_sub(lhs, rhs) : > + local s1 =3D adjusted_secs(lhs) : > + local s2 =3D adjusted_secs(rhs) :=20 : 11. Did you think about storing the time in a unified : format with offset already applied and add the offset only : when you print the time or convert to something? It seems : these `dt.offset * 60` might be expensive on each arith : operation done again and again. Exactly! That was bug in the early patches, which has been fixed=20 in later version of this code. Sorry for the confusion created.=20 Yes, secs are UC normalized, and offset if nominal (used only For output). :=20 : > + local d =3D interval_new() : > + d.secs =3D s2 - s1 : > + d.nsec =3D rhs.nsec - lhs.nsec : > + if d.nsec < 0 then :=20 : 12. AFAIK, in Lua index operator '.' is costly. : You might want to cache its values like this: :=20 : local secs =3D s2 - s1 : local nsec =3D rhs.nsec - lhs.nsec : if nsec < 0 then : d.secs =3D secs - 1 : d.nsec =3D nsec + NANOS_PER_SEC : else : d.secs =3D secs : d.nsec =3D nsec : end :=20 : The same in the other places where index operation : is repeated more than once: datetime_lt, datetime_le, : etc. Thanks, I'll review all relevant places and extract read references to locals. :=20 : > + d.secs =3D d.secs - 1 : > + d.nsec =3D d.nsec + NANOS_PER_SEC : > + end : > + return d : > +end : > + : > +local function datetime_add(lhs, rhs) : > + local s1 =3D adjusted_secs(lhs) : > + local s2 =3D adjusted_secs(rhs) : > + local d =3D interval_new() : > + d.secs =3D s2 + s1 : > + d.nsec =3D rhs.nsec + lhs.nsec : > + if d.nsec >=3D NANOS_PER_SEC then : > + d.secs =3D d.secs + 1 : > + d.nsec =3D d.nsec - NANOS_PER_SEC : > + end : > + return d : > +end : > + : > +local function datetime_eq(lhs, rhs) : > + -- we usually don't need to check nullness : > + -- but older tarantool console will call us checking for = equality to : nil : > + if rhs =3D=3D nil then : > + return false : > + end : > + return (lhs.secs =3D=3D rhs.secs) and (lhs.nsec =3D=3D = rhs.nsec) :=20 : 13. You do not need () around the comparison operators. The same in = all : places : below. Also you could make it shorter: :=20 : return rhs ~=3D nil and lhs.secs =3D=3D rhs.secs and lhs.nsec =3D=3D = rhs.nsec Nice! Thanks! :=20 : 14. Why don't you take 'offset' into account? Because it's not needed due to UTC-normalization. Sorry for the = confusion=20 with adjusted_sescs function.=20 :=20 : > +end : > + : > + : > +local function datetime_lt(lhs, rhs) : > + return (lhs.secs < rhs.secs) or : > + (lhs.secs =3D=3D rhs.secs and lhs.nsec < rhs.nsec) : > +end : > + : > +local function datetime_le(lhs, rhs) : > + return (lhs.secs <=3D rhs.secs) or : > + (lhs.secs =3D=3D rhs.secs and lhs.nsec <=3D rhs.nsec) : > +end : > + : > +local function datetime_serialize(self) : > + -- Allow YAML, MsgPack and JSON to dump objects with sockets : > + return { secs =3D self.secs, nsec =3D self.nsec, tz =3D = self.offset } :=20 : 15. Why is 'offset' serialized as 'tz' and not just 'offset'? Yes, now you've asked it and it's looking confusing after rename. :=20 : > +end : > + : > +local function interval_serialize(self) : > + -- Allow YAML and JSON to dump objects with sockets : > + return { secs =3D self.secs, nsec =3D self.nsec } : > +end : > + : > +local datetime_mt =3D { : > + -- __tostring =3D datetime_tostring, :=20 : 16. Why is it commented out? Is it even tested then? It's implemented in the later patch in patchset series. :=20 : > + __serialize =3D datetime_serialize, : > + __eq =3D datetime_eq, : > + __lt =3D datetime_lt, : > + __le =3D datetime_le, : > + __sub =3D datetime_sub, : > + __add =3D datetime_add, : > + : > + nanoseconds =3D function(self) : > + return tonumber(self.secs*NANOS_PER_SEC + self.nsec) : > + end, :=20 : 17. Please, separate function definitions with an empty line : between them. Done! (Will be pushed later)=20 :=20 : > + microseconds =3D function(self) : > + return tonumber(self.secs*1e6 + self.nsec*1e3) :=20 : 18. Please, add whitespaces around binary operators. Done! It was already such in the later commits which were part of in patchset. [Yes, apparently I need to squash few more commits, as Oleg Babin has asked elsewhere] :=20 : 19. Why do you do tonumber() for all results? You are loosing : precision for big values. The same in all the places where you : use tonumber(). To be able to use it in places which expect numbers like %d in string.format. Though at the end of day I've came up to the version where I convert to Lua numbers not all (potentially fraction point)=20 values, but with exception of nanoseconds / microseconds / milliseconds (which return integer values).=20 :=20 : > + end, : > + seconds =3D function(self) : > + return tonumber(self.secs + self.nsec*1e3) : > + end, : > + minutes =3D function(self) : > + return tonumber((self._ticks/(1e6*60))%60) : > + end, : > + hours =3D function(self) : > + return tonumber(self._ticks/(1e6*60*60)) :=20 : 20. I don't see _ticks defined anywhere in our source code in : any of the files. What is it? Is it tested? Yes, that was miscopy from elsewhere. Fixed in newer commits.=20 Sorry for the confusion. :=20 : > + end, : > + :=20 : 21. Extra empty line. Yup, but later, well, you know ... :=20 : > +} : > + : > +local interval_mt =3D { : > + -- __tostring =3D interval_tostring, :=20 : 22. Ditto. Why is it commented out and not tested? Tostring conversion was implemented later in a few=20 commits. One for datetime stringization, and another=20 one for interval. :=20 : > + __serialize =3D interval_serialize, : > + __eq =3D datetime_eq, : > + __lt =3D datetime_lt, : > + __le =3D datetime_le, : > +} : > + : > +local function datetime_new_raw(secs, nsec, offset) : > + local dt_obj =3D ffi.new(datetime_t) : > + dt_obj.secs =3D secs : > + dt_obj.nsec =3D nsec : > + dt_obj.offset =3D offset : > + return dt_obj : > +end : > + : > +local function local_rd(o) :=20 : 23. What is 'rd' and 'dt'? `dt` is type - data type from c-dt (alias to signed integer). Which shows a number of dates since Rata Die date (0001-01-01). It's signed 32-bit value, thus could represent huge range before=20 0001-01-01 as well.=20 :=20 : > + return math.floor(tonumber(o.secs / SECS_PER_DAY)) + : DT_EPOCH_1970_OFFSET : > +end : > + : > +local function local_dt(o) : > + return cdt.dt_from_rdn(local_rd(o)) : > +end : > + : > +local function mk_timestamp(dt, sp, fp, offset) : > + local epochV =3D dt ~=3D nil and (cdt.dt_rdn(dt) - = DT_EPOCH_1970_OFFSET) : * SECS_PER_DAY or 0 : > + local spV =3D sp ~=3D nil and sp or 0 : > + local fpV =3D fp ~=3D nil and fp or 0 : > + local ofsV =3D offset ~=3D nil and offset or 0 : > + return datetime_new_raw (epochV + spV - ofsV * 60, fpV, ofsV) : > +end : > + : > +-- create @datetime_t given object @o fields : > +local function datetime_new(o) :=20 : 24. Why such a wierd name 'o'? It looks like 0, super : confusing. o means object (with attributes) here. Will make it look=20 longer, i.e. obj. :=20 : > + if o =3D=3D nil then : > + return datetime_new_raw(0, 0, 0) : > + end : > + local secs =3D 0 : > + local nsec =3D 0 : > + local offset =3D 0 : > + local easy_way =3D false : > + local y, M, d, ymd : > + y, M, d, ymd =3D 0, 0, 0, false : > + : > + local h, m, s, frac, hms : > + h, m, s, frac, hms =3D 0, 0, 0, 0, false :=20 : 25. Please, try to define the variables on their own lines. : Like you did a few lines above. Otherwise it is practically : unreadable. Ok :=20 : > + : > + local dt =3D 0 : > + : > + for key, value in pairs(o) do : > + local handlers =3D { : > + secs =3D function(v) : > + secs =3D v : > + easy_way =3D true :=20 : 26. I don't understand what 'easy way' means. Please, elaborate : what you are trying to achieve. We could create datetime object 2 ways: - using raw attributes, without any conversion - i.e. secs, nsec, = offset; - or using human-friendly year/month/day/hour/minutes/seconds; The easy_way is using raw mode - defining .secs, .nsec, .offset, but = using The same object constructor interface. i.e. date =3D require 'datetime' T =3D date{ secs =3D 0, nsec =3D 0, offset =3D 0} Will create datetime object representing 1970-01-01T00:00Z moment. :=20 : > + end, :=20 : 27. The usage of closures here might render all your FFI efforts : useless, killing the performance. Please, try to define all : methods of all objects only once in the root namespace of the : file. Closure usage might be justified only for rarely created : long living heavy objects like netbox. Agreed that creating closures in each iteration of a loop was bad idea (as Oleg Babin has already pin-pointed it elsewhere). And we have to ate least to remove loop-invariant outside of cycle. But at the moment I do not consider the code creating object=20 as very performance critical path, because I do not expect this path used frequently, more like after parsing of ido-8601 literals, which uses raw object construction approach. [I'll bench though the difference between using invariant closures=20 and multiple ifs to see the size of a problem] :=20 : > + : > + nsec =3D function(v) : > + nsec =3D v : > + easy_way =3D true : > + end, : > + : > + offset =3D function (v) : > + offset =3D v : > + easy_way =3D true : > + end, : > + : > + year =3D function(v) : > + assert(v > 0 and v < 10000) : > + y =3D v : > + ymd =3D true : > + end, : > + : > + month =3D function(v) : > + assert(v > 0 and v < 12 ) : > + M =3D v : > + ymd =3D true : > + end, : > + : > + day =3D function(v) : > + assert(v > 0 and v < 32) : > + d =3D v : > + ymd =3D true : > + end, : > + : > + hour =3D function(v) : > + assert(v >=3D 0 and v < 24) : > + h =3D v : > + hms =3D true : > + end, : > + : > + minute =3D function(v) : > + assert(v >=3D 0 and v < 60) : > + m =3D v : > + hms =3D true : > + end, : > + : > + second =3D function(v) : > + assert(v >=3D 0 and v < 61) : > + frac =3D v % 1 : > + if frac then : > + s =3D v - (v % 1) : > + else : > + s =3D v : > + end : > + hms =3D true : > + end, : > + : > + -- tz offset in minutes : > + tz =3D function(v) : > + assert(v >=3D 0 and v <=3D 720) : > + offset =3D v : > + end : > + } : > + handlers[key](value) : > + end : > + : > + -- .sec, .nsec, .offset : > + if easy_way then : > + return datetime_new_raw(secs, nsec, offset) : > + end : > + : > + -- .year, .month, .day : > + if ymd then : > + dt =3D dt + cdt.dt_from_ymd(y, M, d) : > + end : > + : > + -- .hour, .minute, .second : > + if hms then : > + secs =3D h * 3600 + m * 60 + s : > + end : > + : > + return mk_timestamp(dt, secs, frac, offset) : > +end : > + : > + : > +-- simple parse functions: : > +-- parse_date/parse_time/parse_zone :=20 : 28. I wouldn't be so sure they are 'simple' and I can read : their names below anyway. This makes this comment impractical. :) ok :=20 : > + : > +--[[ : > + Basic Extended : > + 20121224 2012-12-24 Calendar date (ISO 8601) : > + 2012359 2012-359 Ordinal date (ISO 8601) : > + 2012W521 2012-W52-1 Week date (ISO 8601) : > + 2012Q485 2012-Q4-85 Quarter date : > +]] : > + : > +local function parse_date(str) : > + local dt =3D ffi.new('dt_t[1]') : > + local rc =3D cdt.dt_parse_iso_date(str, #str, dt) : > + assert(rc > 0) :=20 : 29. Er ... And what if I pass a string in a wrong format? Users : should not get assertions. :=20 : tarantool> datetime.parse_date('121r1 r1r 13 13 13') : --- : - error: 'builtin/datetime.lua:397: assertion failed!' : ... :=20 : This shall not ever happen. Would you mind implementing proper : error handling. The same in all the places where the errors are : ignored. Now, I've reworked approach how they handle errors (you could see It in later commits in the patch-set), and now it returns tuple -=20 - parsed value - and the number of accepted input characters. tarantool> date.parse_date('121r1 r1r 13 13 13') --- - null - 0 ... Sometimes, while parsing composed strings using these partial methods (parse_date / parse_time / parse_time_zone) we need to know the number of symbols we have successfully accepted. That's why there is seconds value in returned tuple. Please give me know if there is better/more idiomatic approach used for such scenarios? :=20 : > + return mk_timestamp(dt[0]) : > +end : > + : > +--[[ : > + Basic Extended : > + T12 N/A : > + T1230 T12:30 : > + T123045 T12:30:45 : > + T123045.123456789 T12:30:45.123456789 : > + T123045,123456789 T12:30:45,123456789 : > + : > + The time designator [T] may be omitted. : > +]] : > +local function parse_time(str) : > + local sp =3D ffi.new('int[1]') : > + local fp =3D ffi.new('int[1]') : > + local rc =3D cdt.dt_parse_iso_time(str, #str, sp, fp) : > + assert(rc > 0) : > + return mk_timestamp(nil, sp[0], fp[0]) : > +end : > + : > +--[[ : > + Basic Extended : > + Z N/A : > + =C2=B1hh N/A : > + =C2=B1hhmm =C2=B1hh:mm :=20 : 30. I would recommend to avoid non-ASCII symbls in the code. I : remember I used unicode table borders to build nice schemas : but people were complaining it ruins their terminal's output. Even if it's properly utf-8 encoded? That's odd.=20 Will replace with + and - separately. :=20 : > +]] : > +local function parse_zone(str) : > + local offset =3D ffi.new('int[1]') : > + local len =3D cdt.dt_parse_iso_zone_lenient(str, #str, offset) : > + return len > 0 and mk_timestamp(nil, nil, nil, offset[0]) or = nil, : tonumber(len) :=20 : 31. Please, keep 80 symbols border. Yup :=20 : I am going to stop here, will return later. Thanks! Timur