From: Vladimir Davydov via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Safin Timur <tsafin@tarantool.org> Cc: Timur Safin via Tarantool-patches <tarantool-patches@dev.tarantool.org>, v.shpilevoy@tarantool.org Subject: Re: [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime Date: Wed, 18 Aug 2021 14:45:29 +0300 [thread overview] Message-ID: <20210818114529.wx7ekgpc2ccqfxmq@esperanza> (raw) In-Reply-To: <512da66c-1b3f-c671-b653-96726293a5f2@tarantool.org> On Wed, Aug 18, 2021 at 01:03:27PM +0300, Safin Timur wrote: > > > +void > > > +datetime_now(struct datetime *now) > > > +{ > > > + struct timeval tv; > > > + gettimeofday(&tv, NULL); > > > + now->secs = tv.tv_sec; > > > + now->nsec = tv.tv_usec * 1000; > > > + > > > + time_t now_seconds; > > > + time(&now_seconds); > > > > Can't you use tv.tv_sec here? > > Nope. :) > > Actually I don't want to know value returned from time() - all I need is to > eventually call localtime[_r]() to get local timezone. If you know simpler > way to determine local timezone without time, then please give me know. AFAIU time() returns exactly the same value as tv.tv_sec, where tv is the value returned by gettimeofday(), no? Come to think of it, I'm not so sure it's a good idea to call localtime() on each call to get the timezone. In comparison to gettimeofday() it 10x slower. Maybe, it's worth caching the current timezone? vlad@esperanza:~/tmp$ cat bench.lua clock = require('clock') datetime = require('datetime') LOOPS = 10000000 t1 = clock.monotonic() for i = 1, LOOPS do datetime.now() end t2 = clock.monotonic() print(string.format('datetime.now: %f', t2 - t1)) t1 = clock.monotonic() for i = 1, LOOPS do clock.realtime() end t2 = clock.monotonic() print(string.format('clock.realtime: %f', t2 - t1)) os.exit(0) vlad@esperanza:~/tmp$ tarantool bench.lua datetime.now: 2.190355 clock.realtime: 0.201188 > > > +#define DT_EPOCH_1970_OFFSET 719163 > > > > I don't understand what this constant stores. Please add a comment. > > I've documented them after Serge request this way: > > ------------------------------------------------------ > /** > * We count dates since so called "Rata Die" date > * January 1, 0001, Monday (as Day 1). > * But datetime structure keeps seconds since > * Unix "Epoch" date: > * Unix, January 1, 1970, Thursday > * > * The difference between Epoch (1970-01-01) > * and Rata Die (0001-01-01) is 719163 days. > */ > > #ifndef SECS_PER_DAY > #define SECS_PER_DAY 86400 > #define DT_EPOCH_1970_OFFSET 719163 > #endif > > /** > * c-dt library uses int as type for dt value, which > * represents the number of days since Rata Die date. > * This implies limits to the number of seconds we > * could safely store in our structures and then safely > * pass to c-dt functions. > * > * So supported ranges will be > * - for seconds [-185604722870400 .. 185480451417600] > * - for dates [-5879610-06-22T00:00Z .. 5879611-07-11T00:00Z] > */ > #define MAX_DT_DAY_VALUE (int64_t)INT_MAX > #define MIN_DT_DAY_VALUE (int64_t)INT_MIN > #define SECS_EPOCH_1970_OFFSET \ > ((int64_t)DT_EPOCH_1970_OFFSET * SECS_PER_DAY) > #define MAX_EPOCH_SECS_VALUE \ > (MAX_DT_DAY_VALUE * SECS_PER_DAY - SECS_EPOCH_1970_OFFSET) > #define MIN_EPOCH_SECS_VALUE \ > (MIN_DT_DAY_VALUE * SECS_PER_DAY - SECS_EPOCH_1970_OFFSET) I'd name them DT_MAX_DAY_VALUE, etc, with DT_ prefix. For consistency with DT_EPOCH_1970_OFFSET and so as to emphasize that these constants are related to datetime objects. SECS_PER_DAY, MINS_PER_DAY, etc are fine without prefix, because they are universal. > > > +local function datetime_new_dt(dt, secs, frac, offset) > > > > What's 'frac'? Please either add a comment or give it a better name. > > fraction part. Renamed to full fraction. It's unclear what fraction is of. I'd rename it to 'nsecs', because the function it is passed to (datetime_new_raw) actually takes nanoseconds. > > > +-- create datetime given attribute values from obj > > > +-- in the "easy mode", providing builder with > > > +-- .secs, .nsec, .offset > > > +local function datetime_new_obj(obj, ...) > > > + if obj == nil or type(obj) ~= 'table' then > > > > type(obj) ~= 'table' implies 'obj' == nil > > I meant that obj may be simple number, not builder object. I understand, but if type(obj) ~= 'table' then will work for obj == nil (because type(obj) is nil ~= 'table). That is, 'obj == nil' check is redundant. > > > > > > + return datetime_new_raw(obj, ...) > > > + end > > > + local secs = 0 > > > + local nsec = 0 > > > + local offset = 0 > > > + > > > + for key, value in pairs(obj) do > > > + if key == 'secs' then > > > + secs = value > > > + elseif key == 'nsec' then > > > + nsec = value > > > + elseif key == 'offset' then > > > + offset = value > > > + else > > > + error(('unknown attribute %s'):format(key), 2) > > > + end > > > > Hmm, do we really need this check? IMO this is less clear and probably > > slower than accessing the table directly: > > > > local secs = obj.secs > > local nsecs = obj.nsecs > > ... > > So here is dilema: > - we could directly pass initialization object to the ffi.new. But allow all > kinds of unknown attributes; > - or we could check all attributes in a slightly slower loop, but provide > nice user visible diagnostics for an error. > > What would you recommend? (I prefer implementation with checks and > human-understandable errors) I understand the dilemma. I don't know what we should do. Probably ask someone who writes a lot of Lua code. I, personally, haven't seen checks like this in our Lua code - we never check if there's bogus positional arguments AFAICS. E.g. if you pass {tmieout = 123} to net.box.call() in options, there will be no error (and apparently no timeout because of the typo). I think that this is the Lua way, but I can't say for sure. > > > +local function datetime_new(obj) > > > + if obj == nil or type(obj) ~= 'table' then > > > + return datetime_new_raw(0, 0, 0) > > > + end > > > + local y = 0 > > > + local M = 0 > > > + local d = 0 > > > + local ymd = false > > > + > > > + local h = 0 > > > + local m = 0 > > > + local s = 0 > > > + local frac = 0 > > > + local hms = false > > > + local offset = 0 > > > + > > > + local dt = 0 > > > + > > > + for key, value in pairs(obj) do > > > + if key == 'year' then > > > + check_range(value, {1, 9999}, key) > > > + y = value > > > + ymd = true > > > + elseif key == 'month' then > > > + check_range(value, {1, 12}, key) > > > + M = value > > > + ymd = true > > > + elseif key == 'day' then > > > + check_range(value, {1, 31}, key) > > > + d = value > > > + ymd = true > > > + elseif key == 'hour' then > > > + check_range(value, {0, 23}, key) > > > + h = value > > > + hms = true > > > + elseif key == 'min' or key == 'minute' then > > > + check_range(value, {0, 59}, key) > > > + m = value > > > + hms = true > > > + elseif key == 'sec' or key == 'second' then > > > > I don't think we should support both 'sec'/'min' and 'second'/'minute' > > here. Since you want to be consistent with os.date() output, I would > > leave 'sec'/'min' only. > > Originally, there were only human-readable names like 'second' or 'minute', > os.date compatibility added only lately. Because, it costs nothing > (comparing to all other overhead). > > I'd prefer to have ability to use full readable names, not only those from > set of os.date. It costs nothing, but make it's friendlier. 'min' and 'sec' are common abbreviations so I wouldn't say they are not human-readable. > > > +local function datetime_index(self, key) > > > + if key == 'epoch' or key == 'unixtime' then > > > + return self.secs > > > + elseif key == 'ts' or key == 'timestamp' then > > > + return self.secs + self.nsec / 1e9 > > > + elseif key == 'ns' or key == 'nanoseconds' then > > > + return self.secs * 1e9 + self.nsec > > > + elseif key == 'us' or key == 'microseconds' then > > > + return self.secs * 1e6 + self.nsec / 1e3 > > > + elseif key == 'ms' or key == 'milliseconds' then > > > + return self.secs * 1e3 + self.nsec / 1e6 > > > + elseif key == 's' or key == 'seconds' then > > > + return self.secs + self.nsec / 1e9 > > > + elseif key == 'm' or key == 'min' or key == 'minutes' then > > > + return (self.secs + self.nsec / 1e9) / 60 > > > + elseif key == 'hr' or key == 'hours' then > > > + return (self.secs + self.nsec / 1e9) / (60 * 60) > > > + elseif key == 'd' or key == 'days' then > > > + return (self.secs + self.nsec / 1e9) / (24 * 60 * 60) > > > > 1. There's so many ways to get the same information, for example m, > > min, minutes. There should be exactly one way. > > Disagreed. I prefer friendlier approach. I wouldn't call it 'friendly'. What exactly, as a user, should I use, 'minutes', 'min', or 'm'? Freedom of choice is a pain. > > > > > 2. I'd expect datetime.hour return the number of hours in the current > > day, like os.date(), but it returns the number of hours since the > > Unix epoch instead. This is confusing and useless. > > Well, at least this is consistent with seconds and their fractions of > different units. If there would be demand for os.date compatible table > format, we might extend interface with corresponding accessor. But I do not > foresee it, because os.date is totally broken if we need correct handling of > timezone information. AFAIK it's always using local time context. It's just that I don't see any point in knowing the number of hours that passed since 1970. Better remove this method altogether then. To be discussed. > > > > 3. Please document the behavior somewhere in the comments to the code. > > Documented. > > > > > 4. These if-else's are inefficient AFAIU. > > Quite the contrary (if we compare to the idiomatic closure handlers case). > If/ifelse is the fastest way to handle such cases in Lua. > Here is earlier bench we have discussed with Vlad and Oleg Babin, and after > which I've changed from using hash based handlers to the series of ifs. > > https://gist.github.com/tsafin/31cc9b0872b6015904fcc90d97740770 The test creates a new hashtable on each function invocation - no surprise it's slow. The hashtable should be global. > > > > > > > + elseif key == 'to_utc' then > > > + return function(self) > > > + return datetime_to_tz(self, 0) > > > + end > > > + elseif key == 'to_tz' then > > > + return function(self, offset) > > > + return datetime_to_tz(self, offset) > > > + end > > > + else > > > + error(('unknown attribute %s'):format(key), 2) > > > + end > > > +end > > > + > > > +local function datetime_newindex(self, key, value) > > > + if key == 'epoch' or key == 'unixtime' then > > > + self.secs = value > > > + self.nsec, self.offset = 0, 0 > > > + elseif key == 'ts' or key == 'timestamp' then > > > + local secs, frac = math_modf(value) > > > + self.secs = secs > > > + self.nsec = frac * 1e9 > > > + self.offset = 0 > > > > Do we really want the datetime object to be mutable? If so, allowing to > > set its value only to the time since the unix epoch doesn't look > > particularly user-friendly. > > I do want. That was request from customer - to have ability to modify by > timestamp from Epoch. Why can't the user create a new datetime object instead? To be discussed. > > > +local datetime_mt = { > > > + __serialize = datetime_serialize, > > > + __eq = datetime_eq, > > > + __lt = datetime_lt, > > > + __le = datetime_le, > > > + __index = datetime_index, > > > + __newindex = datetime_newindex, > > > +} > > > + > > > +local interval_mt = { > > > + __serialize = interval_serialize, > > > + __eq = datetime_eq, > > > + __lt = datetime_lt, > > > + __le = datetime_le, > > > + __index = datetime_index, > > > > Why does datetime_mt has __newindex while interval_mt doesn't? > > I did not foresee the need. Should we? > > And semantically datetime has persistence and life-time, but intervals have > not. [And besides all we already provide a plenty of helpers for creation of > intervals] You're right - there's no point in __newindex for interval. Resolved. > > > > > > +} > > > + > > > +ffi.metatype(interval_t, interval_mt) > > > +ffi.metatype(datetime_t, datetime_mt) > > > + > > > +return setmetatable( > > > + { > > > + new = datetime_new, > > > + new_raw = datetime_new_obj, > > > > I'm not sure, we need to make the 'raw' function public. > > That's for second kind of constructors using initialization object but with > low level attributes {secs = 0, nsec = 0, offset}. It's not for everybody, > but still provides some ergonomics. You're probably right. Still, to be discussed. > > > + > > > + parse = parse, > > > + parse_date = parse_date, > > > > > + parse_time = parse_time, > > > + parse_zone = parse_zone, > > > > Creating a datetime object without a date sounds confusing. > > :) But I do foresee the need to parse parts of datetime string, and I did > not want to establish special type for that. > > Now you have asked it I realized we may better create interval objects of > approapriate time. But in this case we need to add timezone to interval > record. So it will be able to represent timezone shifts. > > Ok, heer is the deal: > - at the moment those partial parsed objects store their data to the partial > datetime object created. That's confusing but do not require modifications > in interval arithmetic. > - we may start to store partial time and timezone information into generic > intervals. But it would require extension of interval arithmetic to be ready > to shift by timezone delta. [Does it even make any sense?] > > What do you think? No, I don't like the idea of storing a timezone shift in time interval. IMO better introduce separate 'time' and 'date' objects, which would store the time in the datetime struct under the hood. This needs to be discussed. > > > > > > + > > > + now = local_now, > > > + strftime = strftime, > > > + asctime = asctime, > > > + ctime = ctime, > > > + > > > + is_datetime = is_datetime, > > > + is_interval = is_interval, > > > > I don't see any point in making these functions public. > > I was asked by customer to introduce such. Okay, please add tests for them.
next prev parent reply other threads:[~2021-08-18 11:45 UTC|newest] Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-08-15 23:59 [Tarantool-patches] [PATCH v5 0/8] Initial datetime implementation Timur Safin via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 1/8] build: add Christian Hansen c-dt to the build Timur Safin via Tarantool-patches 2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:24 ` Safin Timur via Tarantool-patches 2021-08-18 8:56 ` Serge Petrenko via Tarantool-patches 2021-08-17 15:50 ` Vladimir Davydov via Tarantool-patches 2021-08-18 10:04 ` Safin Timur via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime Timur Safin via Tarantool-patches 2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:30 ` Safin Timur via Tarantool-patches 2021-08-18 8:56 ` Serge Petrenko via Tarantool-patches 2021-08-17 16:52 ` Vladimir Davydov via Tarantool-patches 2021-08-17 19:16 ` Vladimir Davydov via Tarantool-patches 2021-08-18 13:38 ` Safin Timur via Tarantool-patches 2021-08-18 10:03 ` Safin Timur via Tarantool-patches 2021-08-18 10:06 ` Safin Timur via Tarantool-patches 2021-08-18 11:45 ` Vladimir Davydov via Tarantool-patches [this message] 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 3/8] lua, datetime: display datetime Timur Safin via Tarantool-patches 2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:32 ` Safin Timur via Tarantool-patches 2021-08-17 17:06 ` Vladimir Davydov via Tarantool-patches 2021-08-18 14:10 ` Safin Timur via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 4/8] box, datetime: messagepack support for datetime Timur Safin via Tarantool-patches 2021-08-16 0:20 ` Safin Timur via Tarantool-patches 2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches 2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:42 ` Safin Timur via Tarantool-patches 2021-08-18 9:01 ` Serge Petrenko via Tarantool-patches 2021-08-17 18:36 ` Vladimir Davydov via Tarantool-patches 2021-08-18 14:27 ` Safin Timur via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 5/8] box, datetime: datetime comparison for indices Timur Safin via Tarantool-patches 2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:43 ` Safin Timur via Tarantool-patches 2021-08-18 9:03 ` Serge Petrenko via Tarantool-patches 2021-08-17 19:05 ` Vladimir Davydov via Tarantool-patches 2021-08-18 17:18 ` Safin Timur via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 6/8] lua, datetime: time intervals support Timur Safin via Tarantool-patches 2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:44 ` Safin Timur via Tarantool-patches 2021-08-17 18:52 ` Vladimir Davydov via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 7/8] datetime: perf test for datetime parser Timur Safin via Tarantool-patches 2021-08-17 19:13 ` Vladimir Davydov via Tarantool-patches 2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 8/8] datetime: changelog for datetime module Timur Safin via Tarantool-patches 2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches 2021-08-17 23:44 ` Safin Timur via Tarantool-patches 2021-08-18 9:04 ` Serge Petrenko via Tarantool-patches 2021-08-17 12:15 ` [Tarantool-patches] [PATCH v5 0/8] Initial datetime implementation Serge Petrenko via Tarantool-patches [not found] ` <20210818082222.mofgheciutpipelz@esperanza> 2021-08-18 8:25 ` Vladimir Davydov via Tarantool-patches 2021-08-18 13:24 ` Safin Timur via Tarantool-patches 2021-08-18 14:22 ` Vladimir Davydov via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210818114529.wx7ekgpc2ccqfxmq@esperanza \ --to=tarantool-patches@dev.tarantool.org \ --cc=tsafin@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --cc=vdavydov@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox