From: Vladimir Davydov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Safin Timur <tsafin@tarantool.org>
Cc: Timur Safin via Tarantool-patches
<tarantool-patches@dev.tarantool.org>,
v.shpilevoy@tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime
Date: Wed, 18 Aug 2021 14:45:29 +0300 [thread overview]
Message-ID: <20210818114529.wx7ekgpc2ccqfxmq@esperanza> (raw)
In-Reply-To: <512da66c-1b3f-c671-b653-96726293a5f2@tarantool.org>
On Wed, Aug 18, 2021 at 01:03:27PM +0300, Safin Timur wrote:
> > > +void
> > > +datetime_now(struct datetime *now)
> > > +{
> > > + struct timeval tv;
> > > + gettimeofday(&tv, NULL);
> > > + now->secs = tv.tv_sec;
> > > + now->nsec = tv.tv_usec * 1000;
> > > +
> > > + time_t now_seconds;
> > > + time(&now_seconds);
> >
> > Can't you use tv.tv_sec here?
>
> Nope. :)
>
> Actually I don't want to know value returned from time() - all I need is to
> eventually call localtime[_r]() to get local timezone. If you know simpler
> way to determine local timezone without time, then please give me know.
AFAIU time() returns exactly the same value as tv.tv_sec, where tv is
the value returned by gettimeofday(), no?
Come to think of it, I'm not so sure it's a good idea to call
localtime() on each call to get the timezone. In comparison to
gettimeofday() it 10x slower. Maybe, it's worth caching the current
timezone?
vlad@esperanza:~/tmp$ cat bench.lua
clock = require('clock')
datetime = require('datetime')
LOOPS = 10000000
t1 = clock.monotonic()
for i = 1, LOOPS do
datetime.now()
end
t2 = clock.monotonic()
print(string.format('datetime.now: %f', t2 - t1))
t1 = clock.monotonic()
for i = 1, LOOPS do
clock.realtime()
end
t2 = clock.monotonic()
print(string.format('clock.realtime: %f', t2 - t1))
os.exit(0)
vlad@esperanza:~/tmp$ tarantool bench.lua
datetime.now: 2.190355
clock.realtime: 0.201188
> > > +#define DT_EPOCH_1970_OFFSET 719163
> >
> > I don't understand what this constant stores. Please add a comment.
>
> I've documented them after Serge request this way:
>
> ------------------------------------------------------
> /**
> * We count dates since so called "Rata Die" date
> * January 1, 0001, Monday (as Day 1).
> * But datetime structure keeps seconds since
> * Unix "Epoch" date:
> * Unix, January 1, 1970, Thursday
> *
> * The difference between Epoch (1970-01-01)
> * and Rata Die (0001-01-01) is 719163 days.
> */
>
> #ifndef SECS_PER_DAY
> #define SECS_PER_DAY 86400
> #define DT_EPOCH_1970_OFFSET 719163
> #endif
>
> /**
> * c-dt library uses int as type for dt value, which
> * represents the number of days since Rata Die date.
> * This implies limits to the number of seconds we
> * could safely store in our structures and then safely
> * pass to c-dt functions.
> *
> * So supported ranges will be
> * - for seconds [-185604722870400 .. 185480451417600]
> * - for dates [-5879610-06-22T00:00Z .. 5879611-07-11T00:00Z]
> */
> #define MAX_DT_DAY_VALUE (int64_t)INT_MAX
> #define MIN_DT_DAY_VALUE (int64_t)INT_MIN
> #define SECS_EPOCH_1970_OFFSET \
> ((int64_t)DT_EPOCH_1970_OFFSET * SECS_PER_DAY)
> #define MAX_EPOCH_SECS_VALUE \
> (MAX_DT_DAY_VALUE * SECS_PER_DAY - SECS_EPOCH_1970_OFFSET)
> #define MIN_EPOCH_SECS_VALUE \
> (MIN_DT_DAY_VALUE * SECS_PER_DAY - SECS_EPOCH_1970_OFFSET)
I'd name them DT_MAX_DAY_VALUE, etc, with DT_ prefix. For consistency
with DT_EPOCH_1970_OFFSET and so as to emphasize that these constants
are related to datetime objects. SECS_PER_DAY, MINS_PER_DAY, etc are
fine without prefix, because they are universal.
> > > +local function datetime_new_dt(dt, secs, frac, offset)
> >
> > What's 'frac'? Please either add a comment or give it a better name.
>
> fraction part. Renamed to full fraction.
It's unclear what fraction is of. I'd rename it to 'nsecs', because
the function it is passed to (datetime_new_raw) actually takes
nanoseconds.
> > > +-- create datetime given attribute values from obj
> > > +-- in the "easy mode", providing builder with
> > > +-- .secs, .nsec, .offset
> > > +local function datetime_new_obj(obj, ...)
> > > + if obj == nil or type(obj) ~= 'table' then
> >
> > type(obj) ~= 'table' implies 'obj' == nil
>
> I meant that obj may be simple number, not builder object.
I understand, but
if type(obj) ~= 'table' then
will work for obj == nil (because type(obj) is nil ~= 'table).
That is, 'obj == nil' check is redundant.
>
> >
> > > + return datetime_new_raw(obj, ...)
> > > + end
> > > + local secs = 0
> > > + local nsec = 0
> > > + local offset = 0
> > > +
> > > + for key, value in pairs(obj) do
> > > + if key == 'secs' then
> > > + secs = value
> > > + elseif key == 'nsec' then
> > > + nsec = value
> > > + elseif key == 'offset' then
> > > + offset = value
> > > + else
> > > + error(('unknown attribute %s'):format(key), 2)
> > > + end
> >
> > Hmm, do we really need this check? IMO this is less clear and probably
> > slower than accessing the table directly:
> >
> > local secs = obj.secs
> > local nsecs = obj.nsecs
> > ...
>
> So here is dilema:
> - we could directly pass initialization object to the ffi.new. But allow all
> kinds of unknown attributes;
> - or we could check all attributes in a slightly slower loop, but provide
> nice user visible diagnostics for an error.
>
> What would you recommend? (I prefer implementation with checks and
> human-understandable errors)
I understand the dilemma. I don't know what we should do. Probably ask
someone who writes a lot of Lua code. I, personally, haven't seen checks
like this in our Lua code - we never check if there's bogus positional
arguments AFAICS. E.g. if you pass {tmieout = 123} to net.box.call() in
options, there will be no error (and apparently no timeout because of
the typo). I think that this is the Lua way, but I can't say for sure.
> > > +local function datetime_new(obj)
> > > + if obj == nil or type(obj) ~= 'table' then
> > > + return datetime_new_raw(0, 0, 0)
> > > + end
> > > + local y = 0
> > > + local M = 0
> > > + local d = 0
> > > + local ymd = false
> > > +
> > > + local h = 0
> > > + local m = 0
> > > + local s = 0
> > > + local frac = 0
> > > + local hms = false
> > > + local offset = 0
> > > +
> > > + local dt = 0
> > > +
> > > + for key, value in pairs(obj) do
> > > + if key == 'year' then
> > > + check_range(value, {1, 9999}, key)
> > > + y = value
> > > + ymd = true
> > > + elseif key == 'month' then
> > > + check_range(value, {1, 12}, key)
> > > + M = value
> > > + ymd = true
> > > + elseif key == 'day' then
> > > + check_range(value, {1, 31}, key)
> > > + d = value
> > > + ymd = true
> > > + elseif key == 'hour' then
> > > + check_range(value, {0, 23}, key)
> > > + h = value
> > > + hms = true
> > > + elseif key == 'min' or key == 'minute' then
> > > + check_range(value, {0, 59}, key)
> > > + m = value
> > > + hms = true
> > > + elseif key == 'sec' or key == 'second' then
> >
> > I don't think we should support both 'sec'/'min' and 'second'/'minute'
> > here. Since you want to be consistent with os.date() output, I would
> > leave 'sec'/'min' only.
>
> Originally, there were only human-readable names like 'second' or 'minute',
> os.date compatibility added only lately. Because, it costs nothing
> (comparing to all other overhead).
>
> I'd prefer to have ability to use full readable names, not only those from
> set of os.date. It costs nothing, but make it's friendlier.
'min' and 'sec' are common abbreviations so I wouldn't say they are not
human-readable.
> > > +local function datetime_index(self, key)
> > > + if key == 'epoch' or key == 'unixtime' then
> > > + return self.secs
> > > + elseif key == 'ts' or key == 'timestamp' then
> > > + return self.secs + self.nsec / 1e9
> > > + elseif key == 'ns' or key == 'nanoseconds' then
> > > + return self.secs * 1e9 + self.nsec
> > > + elseif key == 'us' or key == 'microseconds' then
> > > + return self.secs * 1e6 + self.nsec / 1e3
> > > + elseif key == 'ms' or key == 'milliseconds' then
> > > + return self.secs * 1e3 + self.nsec / 1e6
> > > + elseif key == 's' or key == 'seconds' then
> > > + return self.secs + self.nsec / 1e9
> > > + elseif key == 'm' or key == 'min' or key == 'minutes' then
> > > + return (self.secs + self.nsec / 1e9) / 60
> > > + elseif key == 'hr' or key == 'hours' then
> > > + return (self.secs + self.nsec / 1e9) / (60 * 60)
> > > + elseif key == 'd' or key == 'days' then
> > > + return (self.secs + self.nsec / 1e9) / (24 * 60 * 60)
> >
> > 1. There's so many ways to get the same information, for example m,
> > min, minutes. There should be exactly one way.
>
> Disagreed. I prefer friendlier approach.
I wouldn't call it 'friendly'. What exactly, as a user, should I use,
'minutes', 'min', or 'm'? Freedom of choice is a pain.
>
> >
> > 2. I'd expect datetime.hour return the number of hours in the current
> > day, like os.date(), but it returns the number of hours since the
> > Unix epoch instead. This is confusing and useless.
>
> Well, at least this is consistent with seconds and their fractions of
> different units. If there would be demand for os.date compatible table
> format, we might extend interface with corresponding accessor. But I do not
> foresee it, because os.date is totally broken if we need correct handling of
> timezone information. AFAIK it's always using local time context.
It's just that I don't see any point in knowing the number of hours that
passed since 1970. Better remove this method altogether then. To be
discussed.
> >
> > 3. Please document the behavior somewhere in the comments to the code.
>
> Documented.
>
> >
> > 4. These if-else's are inefficient AFAIU.
>
> Quite the contrary (if we compare to the idiomatic closure handlers case).
> If/ifelse is the fastest way to handle such cases in Lua.
> Here is earlier bench we have discussed with Vlad and Oleg Babin, and after
> which I've changed from using hash based handlers to the series of ifs.
>
> https://gist.github.com/tsafin/31cc9b0872b6015904fcc90d97740770
The test creates a new hashtable on each function invocation - no
surprise it's slow. The hashtable should be global.
>
>
> >
> > > + elseif key == 'to_utc' then
> > > + return function(self)
> > > + return datetime_to_tz(self, 0)
> > > + end
> > > + elseif key == 'to_tz' then
> > > + return function(self, offset)
> > > + return datetime_to_tz(self, offset)
> > > + end
> > > + else
> > > + error(('unknown attribute %s'):format(key), 2)
> > > + end
> > > +end
> > > +
> > > +local function datetime_newindex(self, key, value)
> > > + if key == 'epoch' or key == 'unixtime' then
> > > + self.secs = value
> > > + self.nsec, self.offset = 0, 0
> > > + elseif key == 'ts' or key == 'timestamp' then
> > > + local secs, frac = math_modf(value)
> > > + self.secs = secs
> > > + self.nsec = frac * 1e9
> > > + self.offset = 0
> >
> > Do we really want the datetime object to be mutable? If so, allowing to
> > set its value only to the time since the unix epoch doesn't look
> > particularly user-friendly.
>
> I do want. That was request from customer - to have ability to modify by
> timestamp from Epoch.
Why can't the user create a new datetime object instead?
To be discussed.
> > > +local datetime_mt = {
> > > + __serialize = datetime_serialize,
> > > + __eq = datetime_eq,
> > > + __lt = datetime_lt,
> > > + __le = datetime_le,
> > > + __index = datetime_index,
> > > + __newindex = datetime_newindex,
> > > +}
> > > +
> > > +local interval_mt = {
> > > + __serialize = interval_serialize,
> > > + __eq = datetime_eq,
> > > + __lt = datetime_lt,
> > > + __le = datetime_le,
> > > + __index = datetime_index,
> >
> > Why does datetime_mt has __newindex while interval_mt doesn't?
>
> I did not foresee the need. Should we?
>
> And semantically datetime has persistence and life-time, but intervals have
> not. [And besides all we already provide a plenty of helpers for creation of
> intervals]
You're right - there's no point in __newindex for interval. Resolved.
>
> >
> > > +}
> > > +
> > > +ffi.metatype(interval_t, interval_mt)
> > > +ffi.metatype(datetime_t, datetime_mt)
> > > +
> > > +return setmetatable(
> > > + {
> > > + new = datetime_new,
> > > + new_raw = datetime_new_obj,
> >
> > I'm not sure, we need to make the 'raw' function public.
>
> That's for second kind of constructors using initialization object but with
> low level attributes {secs = 0, nsec = 0, offset}. It's not for everybody,
> but still provides some ergonomics.
You're probably right. Still, to be discussed.
> > > +
> > > + parse = parse,
> > > + parse_date = parse_date,
> >
> > > + parse_time = parse_time,
> > > + parse_zone = parse_zone,
> >
> > Creating a datetime object without a date sounds confusing.
>
> :) But I do foresee the need to parse parts of datetime string, and I did
> not want to establish special type for that.
>
> Now you have asked it I realized we may better create interval objects of
> approapriate time. But in this case we need to add timezone to interval
> record. So it will be able to represent timezone shifts.
>
> Ok, heer is the deal:
> - at the moment those partial parsed objects store their data to the partial
> datetime object created. That's confusing but do not require modifications
> in interval arithmetic.
> - we may start to store partial time and timezone information into generic
> intervals. But it would require extension of interval arithmetic to be ready
> to shift by timezone delta. [Does it even make any sense?]
>
> What do you think?
No, I don't like the idea of storing a timezone shift in time interval.
IMO better introduce separate 'time' and 'date' objects, which would
store the time in the datetime struct under the hood. This needs to be
discussed.
>
> >
> > > +
> > > + now = local_now,
> > > + strftime = strftime,
> > > + asctime = asctime,
> > > + ctime = ctime,
> > > +
> > > + is_datetime = is_datetime,
> > > + is_interval = is_interval,
> >
> > I don't see any point in making these functions public.
>
> I was asked by customer to introduce such.
Okay, please add tests for them.
next prev parent reply other threads:[~2021-08-18 11:45 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-15 23:59 [Tarantool-patches] [PATCH v5 0/8] Initial datetime implementation Timur Safin via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 1/8] build: add Christian Hansen c-dt to the build Timur Safin via Tarantool-patches
2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:24 ` Safin Timur via Tarantool-patches
2021-08-18 8:56 ` Serge Petrenko via Tarantool-patches
2021-08-17 15:50 ` Vladimir Davydov via Tarantool-patches
2021-08-18 10:04 ` Safin Timur via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime Timur Safin via Tarantool-patches
2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:30 ` Safin Timur via Tarantool-patches
2021-08-18 8:56 ` Serge Petrenko via Tarantool-patches
2021-08-17 16:52 ` Vladimir Davydov via Tarantool-patches
2021-08-17 19:16 ` Vladimir Davydov via Tarantool-patches
2021-08-18 13:38 ` Safin Timur via Tarantool-patches
2021-08-18 10:03 ` Safin Timur via Tarantool-patches
2021-08-18 10:06 ` Safin Timur via Tarantool-patches
2021-08-18 11:45 ` Vladimir Davydov via Tarantool-patches [this message]
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 3/8] lua, datetime: display datetime Timur Safin via Tarantool-patches
2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:32 ` Safin Timur via Tarantool-patches
2021-08-17 17:06 ` Vladimir Davydov via Tarantool-patches
2021-08-18 14:10 ` Safin Timur via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 4/8] box, datetime: messagepack support for datetime Timur Safin via Tarantool-patches
2021-08-16 0:20 ` Safin Timur via Tarantool-patches
2021-08-17 12:15 ` Serge Petrenko via Tarantool-patches
2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:42 ` Safin Timur via Tarantool-patches
2021-08-18 9:01 ` Serge Petrenko via Tarantool-patches
2021-08-17 18:36 ` Vladimir Davydov via Tarantool-patches
2021-08-18 14:27 ` Safin Timur via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 5/8] box, datetime: datetime comparison for indices Timur Safin via Tarantool-patches
2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:43 ` Safin Timur via Tarantool-patches
2021-08-18 9:03 ` Serge Petrenko via Tarantool-patches
2021-08-17 19:05 ` Vladimir Davydov via Tarantool-patches
2021-08-18 17:18 ` Safin Timur via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 6/8] lua, datetime: time intervals support Timur Safin via Tarantool-patches
2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:44 ` Safin Timur via Tarantool-patches
2021-08-17 18:52 ` Vladimir Davydov via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 7/8] datetime: perf test for datetime parser Timur Safin via Tarantool-patches
2021-08-17 19:13 ` Vladimir Davydov via Tarantool-patches
2021-08-15 23:59 ` [Tarantool-patches] [PATCH v5 8/8] datetime: changelog for datetime module Timur Safin via Tarantool-patches
2021-08-17 12:16 ` Serge Petrenko via Tarantool-patches
2021-08-17 23:44 ` Safin Timur via Tarantool-patches
2021-08-18 9:04 ` Serge Petrenko via Tarantool-patches
2021-08-17 12:15 ` [Tarantool-patches] [PATCH v5 0/8] Initial datetime implementation Serge Petrenko via Tarantool-patches
[not found] ` <20210818082222.mofgheciutpipelz@esperanza>
2021-08-18 8:25 ` Vladimir Davydov via Tarantool-patches
2021-08-18 13:24 ` Safin Timur via Tarantool-patches
2021-08-18 14:22 ` Vladimir Davydov via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210818114529.wx7ekgpc2ccqfxmq@esperanza \
--to=tarantool-patches@dev.tarantool.org \
--cc=tsafin@tarantool.org \
--cc=v.shpilevoy@tarantool.org \
--cc=vdavydov@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH v5 2/8] lua: built-in module datetime' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox