From: Vladimir Davydov via Tarantool-patches <tarantool-patches@dev.tarantool.org> To: Timur Safin <tsafin@tarantool.org> Cc: tarantool-patches@dev.tarantool.org, v.shpilevoy@tarantool.org Subject: Re: [Tarantool-patches] [PATCH v6 1/5] build, lua: built-in module datetime Date: Thu, 19 Aug 2021 18:26:56 +0300 [thread overview] Message-ID: <20210819152656.viku6crt3uezsrdd@esperanza> (raw) In-Reply-To: <dd59371cb49eab89cf229a8c0b6cbbeb002de920.1629341071.git.tsafin@tarantool.org> On Thu, Aug 19, 2021 at 05:56:29AM +0300, Timur Safin wrote: > New third_party module - c-dt > ----------------------------- > > * Integrated chansen/c-dt parser as 3rd party module to the > Tarantool cmake build process; > * Points to tarantool/c-dt instead of original chansen/c-dt to > have easier build integration, because there are additional > commits which have integrated cmake support and have established > symbols renaming facilities (similar to those we see in xxhash > or icu). > * took special care that generated build artifacts not override > in-source files, but use different build/ directory. > > * added datetime parsing unit tests: > - for literals - with and without trailing timezones; > - we check that strftime is reversible and produce consistent > results after roundtrip from/to strings; > > New built-in module `datetime` > ------------------------------ > > * created a new Tarantool built-in module `datetime`; > * register new cdef types for this module; > * reexported some `dt_*` functions from `c-dt` library. > We have to rename externally public symbols to avoid > name clashes possible usage of this same party module but > from inside of dynamic libraries. We prefix c-dt symbols > in the Tarantool build with `tnt_` prefix; > > * `strftime` implemented as a simple ffi wrappers in C > kernel code: > > We used to use heavy ffi-related Lua code, but it appears > to be quite fragile, if we run this code on different OS > targets. To make this Lua code less platfrom dependent we > have moved all platform specifics to the C level. > Specifically we want to avoid dependence on `struct tm {}` > layout and GLIBC-specific `strftime` behaviour (which > differs to BSD/Mac LIBC behaviour). > > * display datetime > > - introduced output routine for converting datetime > to their default output format. > > - use this routine for tostring() in datetime.lua > > - verify that we parse expected, and fail with unexpected: > - for that, in the datetime.lua, we have extended api of `parse_date()`, > `parse_time()`, and `parse_time_zone()` so they return > not only parsed object, but also a length of parsed substring; > - which allows us to parse even _partially_ valid strings like > "20121224 Foo bar". > > * simplified interfaces > - totable() export table values similar to os.date('*t') > - set() provide unified interface to set values using > the same set of attributes as in totable() > > Part of #5941 > > @TarantoolBot document > Title: Introduced new built-in `datetime` module > > `datetime` module has been introduced, which allows to parse > ISO-8601 literals representing timestamps of various formats, > and then manipulate with date objects. > > Please refer to https://github.com/tarantool/tarantool/discussions/6244 > for more detailed description of module API. > > Part of #5941 > --- > .gitmodules | 3 + > CMakeLists.txt | 8 + > cmake/BuildCDT.cmake | 10 + > extra/exports | 32 ++ > src/CMakeLists.txt | 5 +- > src/lib/core/CMakeLists.txt | 1 + > src/lib/core/datetime.c | 125 ++++++++ > src/lib/core/datetime.h | 86 +++++ > src/lua/datetime.lua | 567 +++++++++++++++++++++++++++++++++ > src/lua/init.c | 4 +- > src/lua/utils.c | 27 ++ > src/lua/utils.h | 12 + > test/app-tap/datetime.test.lua | 213 +++++++++++++ There must be many more Lua tests covering all possible cases. Currently, the module doesn't seem to work quite as expected: tarantool> datetime.new({year = 2000}) --- - 1999-11-30T00:00Z ... > test/unit/CMakeLists.txt | 3 +- > test/unit/datetime.c | 260 +++++++++++++++ > test/unit/datetime.result | 358 +++++++++++++++++++++ > third_party/c-dt | 1 + > 17 files changed, 1712 insertions(+), 3 deletions(-) > create mode 100644 cmake/BuildCDT.cmake > create mode 100644 src/lib/core/datetime.c > create mode 100644 src/lib/core/datetime.h > create mode 100644 src/lua/datetime.lua > create mode 100755 test/app-tap/datetime.test.lua > create mode 100644 test/unit/datetime.c > create mode 100644 test/unit/datetime.result > create mode 160000 third_party/c-dt > > diff --git a/.gitmodules b/.gitmodules > index f2f91ee72..aa3fbae4e 100644 > --- a/.gitmodules > +++ b/.gitmodules > @@ -43,3 +43,6 @@ > [submodule "third_party/xxHash"] > path = third_party/xxHash > url = https://github.com/tarantool/xxHash > +[submodule "third_party/c-dt"] > + path = third_party/c-dt > + url = https://github.com/tarantool/c-dt.git > diff --git a/CMakeLists.txt b/CMakeLists.txt > index e25b81eac..8037c30a7 100644 > --- a/CMakeLists.txt > +++ b/CMakeLists.txt > @@ -571,6 +571,14 @@ endif() > # zstd > # > > +# > +# Christian Hansen c-dt > +# > + > +include(BuildCDT) > +libccdt_build() > +add_dependencies(build_bundled_libs cdt) > + > # > # Third-Party misc > # > diff --git a/cmake/BuildCDT.cmake b/cmake/BuildCDT.cmake > new file mode 100644 > index 000000000..80b26c64a > --- /dev/null > +++ b/cmake/BuildCDT.cmake > @@ -0,0 +1,10 @@ > +macro(libccdt_build) > + set(LIBCDT_INCLUDE_DIRS ${PROJECT_SOURCE_DIR}/third_party/c-dt/) > + set(LIBCDT_LIBRARIES cdt) > + > + file(MAKE_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/third_party/c-dt/build/) > + add_subdirectory(${PROJECT_SOURCE_DIR}/third_party/c-dt > + ${CMAKE_CURRENT_BINARY_DIR}/third_party/c-dt/build/) > + set_target_properties(cdt PROPERTIES COMPILE_FLAGS "-DDT_NAMESPACE=tnt_") > + add_definitions("-DDT_NAMESPACE=tnt_") > +endmacro() > diff --git a/extra/exports b/extra/exports > index 9eaba1282..0e7392d61 100644 > --- a/extra/exports > +++ b/extra/exports > @@ -148,8 +148,16 @@ csv_feed > csv_iterator_create > csv_next > csv_setopt > +datetime_now > +datetime_strftime > +datetime_to_string > +datetime_unpack > decimal_from_string > decimal_unpack > +dt_year > +dt_month > +dt_doy > +dt_dom Why no tnt_ prefix? > error_ref > error_set_prev > error_unref > @@ -447,6 +455,30 @@ title_set_status > title_update > tnt_default_cert_dir_paths > tnt_default_cert_file_paths > +tnt_dt_add_years > +tnt_dt_add_quarters > +tnt_dt_add_months > +tnt_dt_dow > +tnt_dt_from_rdn > +tnt_dt_from_struct_tm This function is unused, and so are a few others below. Should remove them from this list? > +tnt_dt_from_yd > +tnt_dt_from_ymd > +tnt_dt_from_yqd > +tnt_dt_from_ywd > +tnt_dt_parse_iso_date > +tnt_dt_parse_iso_time_basic > +tnt_dt_parse_iso_time_extended > +tnt_dt_parse_iso_time > +tnt_dt_parse_iso_zone_basic > +tnt_dt_parse_iso_zone_extended > +tnt_dt_parse_iso_zone_lenient > +tnt_dt_parse_iso_zone > +tnt_dt_rdn > +tnt_dt_to_struct_tm > +tnt_dt_to_yd > +tnt_dt_to_ymd > +tnt_dt_to_yqd > +tnt_dt_to_ywd > tnt_iconv > tnt_iconv_close > tnt_iconv_open > diff --git a/src/lib/core/datetime.c b/src/lib/core/datetime.c > new file mode 100644 > index 000000000..7125090e6 > --- /dev/null > +++ b/src/lib/core/datetime.c > @@ -0,0 +1,125 @@ > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#include <assert.h> > +#include <limits.h> > +#include <string.h> > +#include <time.h> > + > +#include "trivia/util.h" > +#include "datetime.h" > + > +/* Should be /**. It's not about doxygen, it's in accordance with our coding style: https://www.tarantool.io/en/doc/latest/dev_guide/c_style_guide/#chapter-8-commenting (BTW parenthesis after sizeof are also mandatory) Please fix here and everywhere else. > + * Given the seconds from Epoch (1970-01-01) we calculate date > + * since Rata Die (0001-01-01). > + * DT_EPOCH_1970_OFFSET is the distance in days from Rata Die to Epoch. > + */ > +static int > +local_dt(int64_t secs) > +{ > + return dt_from_rdn((int)(secs / SECS_PER_DAY) + DT_EPOCH_1970_OFFSET); > +} > + > +static struct tm * > +datetime_to_tm(const struct datetime *date) > +{ > + static struct tm tm; There's no reason to use a static variable here. Please pass tm by pointer. > + > + memset(&tm, 0, sizeof(tm)); > + int64_t secs = date->secs; > + dt_to_struct_tm(local_dt(secs), &tm); > + > + int seconds_of_day = (int64_t)date->secs % SECS_PER_DAY; > + tm.tm_hour = (seconds_of_day / 3600) % 24; > + tm.tm_min = (seconds_of_day / 60) % 60; > + tm.tm_sec = seconds_of_day % 60; Please use constants here as well: HOURS_PER_DAY, SECS_PER_MINUTE, etc. Please replace here and everywhere else. > + > + return &tm; > +} > + > +void > +datetime_now(struct datetime *now) > +{ > + struct timeval tv; > + gettimeofday(&tv, NULL); > + now->secs = tv.tv_sec; > + now->nsec = tv.tv_usec * 1000; > + > + time_t now_seconds; > + time(&now_seconds); > + struct tm tm; > + localtime_r(&now_seconds, &tm); Please either implement local timezone caching (preferrably) or add a todo: /* * TODO(gh-XXXX): ... */ gh-XXXX is the number of GitHub ticket to track the work. > + now->offset = tm.tm_gmtoff / 60; > +} > + > +size_t > +datetime_strftime(const struct datetime *date, const char *fmt, char *buf, > + uint32_t len) I would reorder the arguments for consistency with strftime and compatibility with SNPRINT: datetime_strftime(buf, len, fmt, date); > +{ > + struct tm *p_tm = datetime_to_tm(date); > + return strftime(buf, len, fmt, p_tm); > +} > + > + > +/* NB! buf may be NULL, and we should handle it gracefully, returning > + * calculated length of output string > + */ > +int > +datetime_to_string(const struct datetime *date, char *buf, int len) datetime_snprint(buf, len, date); for compatibility with SNPRINT and consistency with our other snprint-like functions. > +{ > + int offset = date->offset; > + /* for negative offsets around Epoch date we could get > + * negative secs value, which should be attributed to > + * 1969-12-31, not 1970-01-01, thus we first shift > + * epoch to Rata Die then divide by seconds per day, > + * not in reverse > + */ > + int64_t rd_seconds = (int64_t)date->secs + offset * 60 + > + SECS_EPOCH_1970_OFFSET; > + int rd_number = rd_seconds / SECS_PER_DAY; > + assert(rd_number <= INT_MAX); > + assert(rd_number >= INT_MIN); > + dt_t dt = dt_from_rdn(rd_number); > + > + int year, month, day, second, nanosec, sign; > + dt_to_ymd(dt, &year, &month, &day); > + > + int hour = (rd_seconds / 3600) % 24; > + int minute = (rd_seconds / 60) % 60; > + second = rd_seconds % 60; > + nanosec = date->nsec; > + > + int sz = 0; > + SNPRINT(sz, snprintf, buf, len, "%04d-%02d-%02dT%02d:%02d", > + year, month, day, hour, minute); > + if (second || nanosec) { > + SNPRINT(sz, snprintf, buf, len, ":%02d", second); > + if (nanosec) { > + if ((nanosec % 1000000) == 0) > + SNPRINT(sz, snprintf, buf, len, ".%03d", > + nanosec / 1000000); > + else if ((nanosec % 1000) == 0) > + SNPRINT(sz, snprintf, buf, len, ".%06d", > + nanosec / 1000); > + else > + SNPRINT(sz, snprintf, buf, len, ".%09d", nanosec); > + } > + } > + if (offset == 0) { > + SNPRINT(sz, snprintf, buf, len, "Z"); > + } else { > + if (offset < 0) { > + sign = '-'; > + offset = -offset; > + } else { > + sign = '+'; > + } > + SNPRINT(sz, snprintf, buf, len, "%c%02d:%02d", sign, > + offset / 60, offset % 60); > + } > + return sz; > +} > + > diff --git a/src/lib/core/datetime.h b/src/lib/core/datetime.h > new file mode 100644 > index 000000000..71feefded > --- /dev/null > +++ b/src/lib/core/datetime.h > @@ -0,0 +1,86 @@ > +#pragma once > +/* > + * SPDX-License-Identifier: BSD-2-Clause > + * > + * Copyright 2021, Tarantool AUTHORS, please see AUTHORS file. > + */ > + > +#include <stdint.h> > +#include <stdbool.h> > +#include "c-dt/dt.h" > + > +#if defined(__cplusplus) > +extern "C" > +{ > +#endif /* defined(__cplusplus) */ > + > +/** > + * We count dates since so called "Rata Die" date > + * January 1, 0001, Monday (as Day 1). > + * But datetime structure keeps seconds since > + * Unix "Epoch" date: > + * Unix, January 1, 1970, Thursday > + * > + * The difference between Epoch (1970-01-01) > + * and Rata Die (0001-01-01) is 719163 days. > + */ > + > +#ifndef SECS_PER_DAY > +#define SECS_PER_DAY 86400 > +#define DT_EPOCH_1970_OFFSET 719163 > +#endif > + > +#define SECS_EPOCH_1970_OFFSET \ > + ((int64_t)DT_EPOCH_1970_OFFSET * SECS_PER_DAY) DT_EPOCH_1970_OFFSET SECS_EPOCH_1970_OFFSET Inconsistent naming. Looks like DT_ is a namespace-like prefix. I suggest EPOCH_1970_OFFSET_DAYS EPOCH_1970_OFFSET_SECS Maybe with DT_ prefix, although I don't think it's necessary, because these seem like universal constants. > +/** > + * datetime structure keeps number of seconds since > + * Unix Epoch. > + * Time is normalized by UTC, so time-zone offset > + * is informative only. > + */ > +struct datetime { > + /** Seconds since Epoch. */ > + double secs; epoch Please add a comment explaining why you use double here. > + /** Nanoseconds, if any. */ > + uint32_t nsec; AFAIK unsigned types are typically less effecient than signed, because the compiler has to correctly handle overflow for them. That's why it is often discouraged to use an unsigned type even if the variable never stores a value < 0. > + /** Offset in minutes from UTC. */ > + int32_t offset; tzoffset > +}; > + > +/** > + * Date/time interval structure > + */ > +struct datetime_interval { > + /** Relative seconds delta. */ > + double secs; > + /** Nanoseconds delta, if any. */ > + uint32_t nsec; > +}; Not used. Please remove. > + > +/** > + * Convert datetime to string using default format > + * @param date source datetime value > + * @param buf output character buffer > + * @param len size ofoutput buffer > + */ > +int > +datetime_to_string(const struct datetime *date, char *buf, int len); > + > +/** > + * Convert datetime to string using default format provided > + * Wrapper around standard strftime() function > + * @param date source datetime value > + * @param fmt format > + * @param buf output buffer > + * @param len size of output buffer > + */ > +size_t > +datetime_strftime(const struct datetime *date, const char *fmt, char *buf, > + uint32_t len); > + > +void > +datetime_now(struct datetime *now); Missing comment. Please add. > + > +#if defined(__cplusplus) > +} /* extern "C" */ > +#endif /* defined(__cplusplus) */ > diff --git a/src/lua/datetime.lua b/src/lua/datetime.lua > new file mode 100644 > index 000000000..88aad0744 > --- /dev/null > +++ b/src/lua/datetime.lua > @@ -0,0 +1,567 @@ > +local ffi = require('ffi') > + > +--[[ > + `c-dt` library functions handles properly both positive and negative `dt` > + values, where `dt` is a number of dates since Rata Die date (0001-01-01). > + > + For better compactness of our typical data in MessagePack stream we shift > + root of our time to the Unix Epoch date (1970-01-01), thus our 0 is > + actually dt = 719163. > + > + So here is a simple formula how convert our epoch-based seconds to dt values > + dt = (secs / 86400) + 719163 > + Where 719163 is an offset of Unix Epoch (1970-01-01) since Rata Die > + (0001-01-01) in dates. > +]] > + > +-- dt_core.h definitions > +ffi.cdef [[ > + > +typedef int dt_t; > + > +typedef enum { > + DT_MON = 1, > + DT_MONDAY = 1, > + DT_TUE = 2, > + DT_TUESDAY = 2, > + DT_WED = 3, > + DT_WEDNESDAY = 3, > + DT_THU = 4, > + DT_THURSDAY = 4, > + DT_FRI = 5, > + DT_FRIDAY = 5, > + DT_SAT = 6, > + DT_SATURDAY = 6, > + DT_SUN = 7, > + DT_SUNDAY = 7, > +} dt_dow_t; > + > +dt_t tnt_dt_from_rdn (int n); > +dt_t tnt_dt_from_ymd (int y, int m, int d); > +int tnt_dt_rdn (dt_t dt); > + > +dt_dow_t tnt_dt_dow (dt_t dt); > + > +]] > + > +-- dt_accessor.h > +ffi.cdef [[ > + > +int dt_year (dt_t dt); > +int dt_month (dt_t dt); > + > +int dt_doy (dt_t dt); > +int dt_dom (dt_t dt); > + > +]] > + > +-- dt_arithmetic.h definitions > +ffi.cdef [[ > + > +typedef enum { > + DT_EXCESS, > + DT_LIMIT, > + DT_SNAP > +} dt_adjust_t; > + > +dt_t tnt_dt_add_years (dt_t dt, int delta, dt_adjust_t adjust); > +dt_t tnt_dt_add_quarters (dt_t dt, int delta, dt_adjust_t adjust); > +dt_t tnt_dt_add_months (dt_t dt, int delta, dt_adjust_t adjust); > + > +]] > + > +-- dt_parse_iso.h definitions > +ffi.cdef [[ > + > +size_t tnt_dt_parse_iso_date (const char *str, size_t len, dt_t *dt); > +size_t tnt_dt_parse_iso_time (const char *str, size_t len, int *sod, int *nsec); > +size_t tnt_dt_parse_iso_zone_lenient(const char *str, size_t len, int *offset); Why not hide all dt internals behind C wrappers defined in the datetime module? This would make the Lua code easier for understanding, because e.g. 'parse' would call just one C functions instead of three. Also, this would expose useful datetime helpers (like adding years to a date or parsing from string to C code), which could be useful. We wouldn't have to append tnt_ prefix or export any dt functions then. > + > +]] > + > +-- Tarantool functions - datetime.c > +ffi.cdef [[ > + > +int datetime_to_string(const struct datetime * date, char *buf, int len); Extra space after '*'. Please self-check coding style. > +size_t datetime_strftime(const struct datetime *date, const char *fmt, char *buf, > + uint32_t len); > +void datetime_now(struct datetime *now); > + > +]] I don't see any point in splitting cdef. I'd put everything in one cdef, like it was before. > + > +local builtin = ffi.C > +local math_modf = math.modf > +local math_floor = math.floor > + > +local SECS_PER_DAY = 86400 > +local NANOS_PER_SEC = 1000000000 NSECS_PER_SEC seems to be a more conventional name. > + > +-- c-dt/dt_config.h > + > +-- Unix, January 1, 1970, Thursday > +local DT_EPOCH_1970_OFFSET = 719163 > + > + > +local datetime_t = ffi.typeof('struct datetime') > + > +local function is_datetime(o) > + return ffi.istype(datetime_t, o) > +end > + > +local function check_date(o, message) > + if not is_datetime(o) then > + return error(("%s: expected datetime, but received %s"): > + format(message, o), 2) > + end > +end > + > +local function check_str(s, message) > + if not type(s) == 'string' then > + return error(("%s: expected string, but received %s"): > + format(message, s), 2) > + end > +end > + > +local function check_range(v, range, txt) > + assert(#range == 2) > + if v < range[1] or v > range[2] then > + error(('value %d of %s is out of allowed range [%d, %d]'): > + format(v, txt, range[1], range[2]), 4) I think it would be more effecient to pass range[1] and range[2] as arguments: check_range(val, min, max, name), because you wouldn't create a table per each call then. > + end > +end > + > +local SECS_EPOCH_OFFSET = (DT_EPOCH_1970_OFFSET * SECS_PER_DAY) > + > +local function local_rd(secs) > + return math_floor((secs + SECS_EPOCH_OFFSET) / SECS_PER_DAY) > +end > + > +local function local_dt(secs) > + return builtin.tnt_dt_from_rdn(local_rd(secs)) > +end The names of these functions are really unfortunate - I keep forgetting what they do. There should be comments. This is another reason to hide everything behind the datetime library - then we wouldn't need to expose them outside datetime.c. > + > +local function normalize_nsec(secs, nsec) secs, but nsec - inconsistent :-( Would be nice to use the same names throughout the code. > + if nsec < 0 then > + secs = secs - 1 > + nsec = nsec + NANOS_PER_SEC > + elseif nsec >= NANOS_PER_SEC then > + secs = secs + 1 > + nsec = nsec - NANOS_PER_SEC > + end > + return secs, nsec > +end > + > +local function datetime_cmp(lhs, rhs) > + if not is_datetime(lhs) or not is_datetime(rhs) then > + return nil > + end > + local sdiff = lhs.secs - rhs.secs > + return sdiff ~= 0 and sdiff or (lhs.nsec - rhs.nsec) > +end > + > +local function datetime_eq(lhs, rhs) > + local rc = datetime_cmp(lhs, rhs) > + return rc ~= nil and rc == 0 > +end > + > +local function datetime_lt(lhs, rhs) > + local rc = datetime_cmp(lhs, rhs) > + return rc == nil and error('incompatible types for comparison', 2) or > + rc < 0 > +end > + > +local function datetime_le(lhs, rhs) > + local rc = datetime_cmp(lhs, rhs) > + return rc == nil and error('incompatible types for comparison', 2) or > + rc <= 0 > +end > + > +local function datetime_serialize(self) > + return { secs = self.secs, nsec = self.nsec, offset = self.offset } > +end I don't think this should be visibile to the user, even via __serialize. I think __serialize should return the same value as __tostring. > + > +local parse_zone Please avoid forward declarations unless absolutely necessary. Better move the function definition upper. > + > +local function datetime_new_raw(secs, nsec, offset) > + local dt_obj = ffi.new(datetime_t) > + dt_obj.secs = secs > + dt_obj.nsec = nsec > + dt_obj.offset = offset > + return dt_obj > +end > + > +local function datetime_new_dt(dt, secs, fraction, offset) > + local epochV = dt ~= nil and (builtin.tnt_dt_rdn(dt) - DT_EPOCH_1970_OFFSET) * > + SECS_PER_DAY or 0 > + local secsV = secs or 0 > + local fracV = fraction or 0 > + local ofsV = offset or 0 Bad variable naming - violates our coding style (we use underscores for local variable names, not camel-case). I wouldn't even bother defining new local variables - we can overwrite the argument values: secs = secs or 0 fraction = fraction or 0 offset = offset or 0 > + return datetime_new_raw(epochV + secsV - ofsV * 60, fracV, ofsV) > +end > + > +-- create datetime given attribute values from obj > +-- { secs = N, nsec = M, offset = O} > +local function datetime_new_obj(obj, ...) > + if obj == nil or type(obj) == 'table' then > + return ffi.new(datetime_t, obj) > + else > + return datetime_new_raw(obj, ...) > + end > +end > + > +-- create datetime given attribute values from obj Please add a comment what obj is. If it's a table, what fields are expected and what happens if a field is missing. > +local function datetime_new(obj) > + if obj == nil or type(obj) ~= 'table' then > + return datetime_new_raw(0, 0, 0) > + end There should be three variants of datetime.new: datetime.new() datetime.new(str) datetime.new(table) everything else should raise an error. > + local ymd = false > + > + local nsec = 0 > + local hms = false > + > + local dt = 0 > + The way you place empty lines within functions seems random to me. Better not use them at all. > + local y = obj.year I don't see much point in all these local variables (y, M, h, s, etc) - they only clutter the code. Should be as short and simple as: local function datetime_new(o) ... check_range(o.year, 1, 9999, 'year') check_range(o.month, 1, 12, 'month') ... if o.year or o.month or o.day then ... end if o.hour or o.min or o.sec then ... end ... end > + if y ~= nil then > + check_range(y, {1, 9999}, 'year') > + ymd = true datetime.new{year = 0} doesn't work, but datetime.new{} works and creates a date with year equal to 0... > + end > + local M = obj.month > + if M ~= nil then > + check_range(M, {1, 12}, 'month') > + ymd = true > + end > + local d = obj.day > + if d ~= nil then > + check_range(d, {1, 31}, 'day') What happens if month = 2 and day = 31? This should raise an error, I think. > + ymd = true > + end > + local h = obj.hour > + if h ~= nil then > + check_range(h, {0, 23}, 'hour') > + hms = true > + end > + local m = obj.min > + if m ~= nil then > + check_range(m, {0, 59}, 'min') > + hms = true > + end > + local ts = obj.sec ts is typically short for 'timestamp', but here you deal with seconds in a minute. > + local s = 0 > + if ts ~= nil then > + check_range(ts, {0, 60}, 'sec') Should be 59? There must be tests for every corner case. > + s, nsec = math_modf(ts) > + nsec = nsec * 1e9 -- convert fraction to nanoseconds > + hms = true > + end > + local offset = obj.tz > + if offset ~= nil then > + if type(offset) == 'number' then > + -- tz offset in minutes > + check_range(offset, {0, 720}, offset) > + elseif type(offset) == 'string' then > + local zone = parse_zone(offset) > + if zone == nil then > + error(('invalid time-zone format %s'):format(offset), 2) > + else > + offset = zone.offset > + end > + end > + end > + > + -- .year, .month, .day > + if ymd then > + dt = builtin.tnt_dt_from_ymd(y or 0, M or 0, d or 0) > + end > + > + -- .hour, .minute, .second > + local secs = 0 > + if hms then > + secs = (h or 0) * 3600 + (m or 0) * 60 + (s or 0) > + end > + > + return datetime_new_dt(dt, secs, nsec, offset) > +end > + > +--[[ > + Convert to text datetime values > + > + - datetime will use ISO-8601 forat: > + 1970-01-01T00:00Z > + 2021-08-18T16:57:08.981725+03:00 > +]] It looks like public functions should use different comment style: https://www.tarantool.io/en/doc/latest/dev_guide/lua_style_guide/#commenting > +local function datetime_tostring(o) > + if ffi.typeof(o) == datetime_t then > + local sz = 48 > + local buff = ffi.new('char[?]', sz) I think we've agreed that the buffer used for formatting should be global. > + local len = builtin.datetime_to_string(o, buff, sz) > + assert(len < sz) > + return ffi.string(buff) > + end > +end > + > +--[[ > + Parse partial ISO-8601 date string > + > + Accepetd formats are: > + > + Basic Extended > + 20121224 2012-12-24 Calendar date (ISO 8601) > + 2012359 2012-359 Ordinal date (ISO 8601) > + 2012W521 2012-W52-1 Week date (ISO 8601) > + 2012Q485 2012-Q4-85 Quarter date > + > + Returns pair of constructed datetime object, and length of string > + which has been accepted by parser. > +]] > + > +local function parse_date(str) > + check_str("datetime.parse_date()") > + local dt = ffi.new('dt_t[1]') > + local len = builtin.tnt_dt_parse_iso_date(str, #str, dt) > + return len > 0 and datetime_new_dt(dt[0]) or nil, tonumber(len) > +end > + > +--[[ > + Basic Extended > + Z N/A > + +hh N/A > + -hh N/A > + +hhmm +hh:mm > + -hhmm -hh:mm > + > + Returns pair of constructed datetime object, and length of string > + which has been accepted by parser. > +]] > +parse_zone = function(str) local function datetime_parse_zone(str) All datetime functions should be prefixed with datetime_ for consistency and to avoid name clashes in future. > + check_str("datetime.parse_zone()") > + local offset = ffi.new('int[1]') > + local len = builtin.tnt_dt_parse_iso_zone_lenient(str, #str, offset) > + return len > 0 and offset[0], tonumber(len) > +end > + > +--[[ > + aggregated parse functions > + assumes to deal with date T time time_zone > + at once > + > + date [T] time [ ] time_zone > + > + Returns constructed datetime object. > +]] > +local function parse(str) > + check_str("datetime.parse()") > + local dt = ffi.new('dt_t[1]') > + local len = #str > + local n = builtin.tnt_dt_parse_iso_date(str, len, dt) > + local dt_ = dt[0] > + if n == 0 or len == n then > + return datetime_new_dt(dt_) > + end > + > + str = str:sub(tonumber(n) + 1) > + > + local ch = str:sub(1, 1) > + if ch:match('[Tt ]') == nil then > + return datetime_new_dt(dt_) > + end > + > + str = str:sub(2) > + len = #str > + > + local sp = ffi.new('int[1]') > + local fp = ffi.new('int[1]') > + local n = builtin.tnt_dt_parse_iso_time(str, len, sp, fp) > + if n == 0 then > + return datetime_new_dt(dt_) > + end > + local sp_ = sp[0] > + local fp_ = fp[0] > + if len == n then > + return datetime_new_dt(dt_, sp_, fp_) > + end > + > + str = str:sub(tonumber(n) + 1) > + > + if str:sub(1, 1) == ' ' then > + str = str:sub(2) > + end > + > + len = #str > + > + local offset = ffi.new('int[1]') > + n = builtin.tnt_dt_parse_iso_zone_lenient(str, len, offset) > + if n == 0 then > + return datetime_new_dt(dt_, sp_, fp_) > + end > + return datetime_new_dt(dt_, sp_, fp_, offset[0]) > +end > + > +--[[ > + Dispatch function to create datetime from string or table. > + Creates default timeobject (pointing to Epoch date) if > + called without arguments. > +]] > +local function datetime_from(o) > + if o == nil or type(o) == 'table' then > + return datetime_new(o) > + elseif type(o) == 'string' then > + return parse(o) > + end > +end > + > +--[[ > + Create datetime object representing current time using microseconds > + platform timer and local timezone information. > +]] > +local function local_now() datetime_now (all datetime methods should be prefixed with datetime_) > + local d = datetime_new_raw(0, 0, 0) > + builtin.datetime_now(d) > + return d > +end > + > +-- addition or subtraction from date/time of a given interval > +-- described via table direction should be +1 or -1 Please write a better comment: what expected in the argument; in what order addition/subtraction proceeds. > +local function datetime_increment(self, o, direction) Confusing name: it's not increment. It's add or sub. > + assert(direction == -1 or direction == 1) > + local title = direction > 0 and "datetime.add" or "datetime.sub" > + check_date(self, title) > + if type(o) ~= 'table' then > + error(('%s - object expected'):format(title), 2) Object? Or table? > + end > + > + local secs, nsec = self.secs, self.nsec > + local offset = self.offset > + > + -- operations with intervals should be done using human dates > + -- not UTC dates, thus we normalize to UTC > + local dt = local_dt(secs) > + > + local ym_updated = false > + local years, months, weeks = o.years, o.months, o.weeks Local variables are not really needed here. > + > + if years ~= nil then > + check_range(years, {0, 9999}, 'years') I think we should use the same names here (singular that is) as in datetime.new(), to avoid confusion. We can't create a year > 9999, but we can create year 9999 and add 1. We should either remove the limitation or enforce it everywhere. > + dt = builtin.tnt_dt_add_years(dt, direction * years, builtin.DT_LIMIT) > + ym_updated = true > + end > + if months ~= nil then > + check_range(months, {0, 12}, 'months') There must not be any limitations here. You can add 16 months or 48 hours to a date, it's perfrectly fine. > + dt = builtin.tnt_dt_add_months(dt, direction * months, builtin.DT_LIMIT) I'd really prefer to hide this dt thing behind datetime lib so that here we would only operate on datetime object. > + ym_updated = true > + end > + if ym_updated then > + secs = (builtin.tnt_dt_rdn(dt) - DT_EPOCH_1970_OFFSET) * SECS_PER_DAY + > + secs % SECS_PER_DAY > + end > + > + if weeks ~= nil then > + check_range(weeks, {0, 52}, 'weeks') > + secs = secs + direction * 7 * weeks * SECS_PER_DAY > + end > + > + local days, hours, minutes, seconds = o.days, o.hours, o.minutes, o.seconds > + if days ~= nil then > + check_range(days, {0, 31}, 'days') > + secs = secs + direction * days * SECS_PER_DAY > + end > + if hours ~= nil then > + check_range(hours, {0, 23}, 'hours') > + secs = secs + direction * 60 * 60 * hours > + end > + if minutes ~= nil then > + check_range(minutes, {0, 59}, 'minutes') > + secs = secs + direction * 60 * minutes > + end > + if seconds ~= nil then > + check_range(seconds, {0, 60}, 'seconds') > + local s, frac = math.modf(seconds) > + secs = secs + direction * s > + nsec = nsec + direction * frac * 1e9 > + end > + > + secs, nsec = normalize_nsec(secs, nsec) > + > + return datetime_new_raw(secs, nsec, offset) > +end > + > +--[[ > + Return table in os.date('*t') format, but with timezone > + and nanoseconds > +]] > +local function datetime_totable(self) > + local secs = self.secs > + local dt = local_dt(secs) > + local year = builtin.dt_year(dt) > + local month = builtin.dt_month(dt) > + local yday = builtin.dt_doy(dt) > + local wday = ffi.cast('int32_t', builtin.tnt_dt_dow(dt)) > + local day_of_month = builtin.dt_dom(dt) > + local hour = math_floor((secs / 3600) % 24) > + local minute = math_floor((secs / 60) % 60) > + local second = secs % 60 All these local variables only clutter the code - please initialize the table directly. > + > + return { > + sec = second, > + min = minute, > + day = day_of_month, > + isdst = false, > + wday = wday, > + yday = yday, > + year = year, > + month = month, > + hour = hour, > + nsec = self.nsec, You return 'nsec' here, but neither datetime_new nor add/sub take 'nsec' AFAICS. Should we add 'nsec' to new/add/sub and ignore fractional part of 'sec' in those methods? > + tz = self.offset, > + } > +end > + > +local function datetime_set(self, obj) > + -- FIXME Please fix. > + return datetime_new(obj) > +end > + > +local function strftime(fmt, o) > + check_date(o, "datetime.strftime()") > + local sz = 128 > + local buff = ffi.new('char[?]', sz) > + builtin.datetime_strftime(o, fmt, buff, sz) > + return ffi.string(buff) > +end > + > +ffi.metatype(datetime_t, { > + __tostring = datetime_tostring, > + __serialize = datetime_serialize, > + __eq = datetime_eq, > + __lt = datetime_lt, > + __le = datetime_le, > + __index = { > + epoch = function(self) return self.secs end, > + timestamp = function(self) return self.secs + self.nsec / 1e9 end, Please let's not mix lambdas and functions here - let's define function wrappers for consistency. > + nanoseconds = function(self) return self.secs * 1e9 + self.nsec end, > + microseconds = function(self) return self.secs * 1e6 + self.nsec / 1e3 end, > + milliseconds = function(self) return self.secs * 1e3 + self.nsec / 1e6 end, > + seconds = function(self) return self.secs + self.nsec / 1e9 end, Please remove these four methods. There should only be epoch() and timestamp() - the user can convert them to nanoseconds, microseconds, whatever by themselves. > + add = function(self, obj) return datetime_increment(self, obj, 1) end, > + sub = function(self, obj) return datetime_increment(self, obj, -1) end, Please implement the 'set' method requested by @unera. > + totable = datetime_totable, > + set = datetime_set, > + } Please don't use __index. Just put the methods in the metatable. > +}) > + > +return setmetatable( > + { > + new = datetime_new, > + new_raw = datetime_new_obj, new_raw is a bad name. I think datetime.new() should handle this: datetime.new{epoch = ..., tz = ... } or datetime.new{timestamp = ..., tz = ... } > + > + parse = parse, > + parse_date = parse_date, I think we've agreed to remove this fuctions from the public API. Parsing should be handled by datetime.new(). > + > + now = local_now, > + strftime = strftime, strftime should be a method of datetime object. > + > + is_datetime = is_datetime, > + }, { > + __call = function(self, ...) return datetime_from(...) end > + } I'd expect datetime() to be a synonym for datetime.new(), but hey seem to be different. Anyway, I don't think we really need to support datetime() - datetime.new() should be enough. > +) > diff --git a/src/lua/init.c b/src/lua/init.c > index f9738025d..127e935d7 100644 > --- a/src/lua/init.c > +++ b/src/lua/init.c > @@ -129,7 +129,8 @@ extern char strict_lua[], > parse_lua[], > process_lua[], > humanize_lua[], > - memprof_lua[] > + memprof_lua[], > + datetime_lua[] > ; > > static const char *lua_modules[] = { > @@ -184,6 +185,7 @@ static const char *lua_modules[] = { > "memprof.process", process_lua, > "memprof.humanize", humanize_lua, > "memprof", memprof_lua, > + "datetime", datetime_lua, > NULL > }; > > diff --git a/src/lua/utils.c b/src/lua/utils.c > index c71cd4857..2c89326f3 100644 > --- a/src/lua/utils.c > +++ b/src/lua/utils.c > @@ -48,6 +48,9 @@ static uint32_t CTID_STRUCT_IBUF_PTR; > uint32_t CTID_CHAR_PTR; > uint32_t CTID_CONST_CHAR_PTR; > uint32_t CTID_UUID; > +uint32_t CTID_DATETIME = 0; This belongs to the next patch - it isn't used in this patch. > +uint32_t CTID_INTERVAL = 0; No such thing anymore. > + Extra new line. > > void * > luaL_pushcdata(struct lua_State *L, uint32_t ctypeid) > @@ -120,6 +123,12 @@ luaL_pushuuidstr(struct lua_State *L, const struct tt_uuid *uuid) > lua_pushlstring(L, str, UUID_STR_LEN); > } > > +struct datetime * > +luaL_pushdatetime(struct lua_State *L) > +{ > + return luaL_pushcdata(L, CTID_DATETIME); > +} > + > int > luaL_iscdata(struct lua_State *L, int idx) > { > @@ -725,6 +734,24 @@ tarantool_lua_utils_init(struct lua_State *L) > CTID_UUID = luaL_ctypeid(L, "struct tt_uuid"); > assert(CTID_UUID != 0); > > + rc = luaL_cdef(L, "struct datetime {" > + "double secs;" > + "int32_t nsec;" It is uint32_t in struct datetime now. > + "int32_t offset;" > + "};"); > + assert(rc == 0); > + (void) rc; > + CTID_DATETIME = luaL_ctypeid(L, "struct datetime"); > + assert(CTID_DATETIME != 0); > + rc = luaL_cdef(L, "struct datetime_interval {" > + "double secs;" > + "int32_t nsec;" > + "};"); > + assert(rc == 0); > + (void) rc; > + CTID_INTERVAL = luaL_ctypeid(L, "struct datetime_interval"); > + assert(CTID_INTERVAL != 0); > + > lua_pushcfunction(L, luaT_newthread_wrapper); > luaT_newthread_ref = luaL_ref(L, LUA_REGISTRYINDEX); > return 0; > diff --git a/src/lua/utils.h b/src/lua/utils.h > index 45070b778..73495b607 100644 > --- a/src/lua/utils.h > +++ b/src/lua/utils.h > @@ -59,6 +59,7 @@ struct lua_State; > struct ibuf; > typedef struct ibuf box_ibuf_t; > struct tt_uuid; > +struct datetime; > > /** > * Single global lua_State shared by core and modules. > @@ -71,6 +72,8 @@ extern struct lua_State *tarantool_L; > extern uint32_t CTID_CHAR_PTR; > extern uint32_t CTID_CONST_CHAR_PTR; > extern uint32_t CTID_UUID; > +extern uint32_t CTID_DATETIME; > +extern uint32_t CTID_INTERVAL; > > struct tt_uuid * > luaL_pushuuid(struct lua_State *L); > @@ -78,6 +81,15 @@ luaL_pushuuid(struct lua_State *L); > void > luaL_pushuuidstr(struct lua_State *L, const struct tt_uuid *uuid); > > +/** > + * @brief Push cdata of a datetime type onto the stack. > + * @param L Lua State > + * @sa luaL_pushcdata > + * @return memory associated with this datetime data > + */ > +struct datetime * > +luaL_pushdatetime(struct lua_State *L); > + > /** \cond public */ > > /** > diff --git a/test/app-tap/datetime.test.lua b/test/app-tap/datetime.test.lua > new file mode 100755 > index 000000000..5c39feee8 > --- /dev/null > +++ b/test/app-tap/datetime.test.lua > @@ -0,0 +1,213 @@ > +#!/usr/bin/env tarantool > + > +local tap = require('tap') > +local test = tap.test("errno") > +local date = require('datetime') > +local ffi = require('ffi') > + > +test:plan(7) > + > +test:test("Simple tests for parser", function(test) > + test:plan(2) > + test:ok(date("1970-01-01T01:00:00Z") == > + date {year=1970, month=1, day=1, hour=1, minute=0, second=0}) > + test:ok(date("1970-01-01T02:00:00+02:00") == > + date {year=1970, month=1, day=1, hour=2, minute=0, second=0, tz=120}) > +end) > + > +test:test("Multiple tests for parser (with nanoseconds)", function(test) > + test:plan(193) > + -- borrowed from p5-time-moments/t/180_from_string.t > + local tests = > + { > + {'1970-01-01T00:00Z', 0, 0, 0, 1}, > + {'1970-01-01T02:00+02:00', 0, 0, 120, 1}, > + {'1970-01-01T01:30+01:30', 0, 0, 90, 1}, > + {'1970-01-01T01:00+01:00', 0, 0, 60, 1}, > + {'1970-01-01T00:01+00:01', 0, 0, 1, 1}, > + {'1970-01-01T00:00Z', 0, 0, 0, 1}, > + {'1969-12-31T23:59-00:01', 0, 0, -1, 1}, > + {'1969-12-31T23:00-01:00', 0, 0, -60, 1}, > + {'1969-12-31T22:30-01:30', 0, 0, -90, 1}, > + {'1969-12-31T22:00-02:00', 0, 0, -120, 1}, > + {'1970-01-01T00:00:00.123456789Z', 0, 123456789, 0, 1}, > + {'1970-01-01T00:00:00.12345678Z', 0, 123456780, 0, 0}, > + {'1970-01-01T00:00:00.1234567Z', 0, 123456700, 0, 0}, > + {'1970-01-01T00:00:00.123456Z', 0, 123456000, 0, 1}, > + {'1970-01-01T00:00:00.12345Z', 0, 123450000, 0, 0}, > + {'1970-01-01T00:00:00.1234Z', 0, 123400000, 0, 0}, > + {'1970-01-01T00:00:00.123Z', 0, 123000000, 0, 1}, > + {'1970-01-01T00:00:00.12Z', 0, 120000000, 0, 0}, > + {'1970-01-01T00:00:00.1Z', 0, 100000000, 0, 0}, > + {'1970-01-01T00:00:00.01Z', 0, 10000000, 0, 0}, > + {'1970-01-01T00:00:00.001Z', 0, 1000000, 0, 1}, > + {'1970-01-01T00:00:00.0001Z', 0, 100000, 0, 0}, > + {'1970-01-01T00:00:00.00001Z', 0, 10000, 0, 0}, > + {'1970-01-01T00:00:00.000001Z', 0, 1000, 0, 1}, > + {'1970-01-01T00:00:00.0000001Z', 0, 100, 0, 0}, > + {'1970-01-01T00:00:00.00000001Z', 0, 10, 0, 0}, > + {'1970-01-01T00:00:00.000000001Z', 0, 1, 0, 1}, > + {'1970-01-01T00:00:00.000000009Z', 0, 9, 0, 1}, > + {'1970-01-01T00:00:00.00000009Z', 0, 90, 0, 0}, > + {'1970-01-01T00:00:00.0000009Z', 0, 900, 0, 0}, > + {'1970-01-01T00:00:00.000009Z', 0, 9000, 0, 1}, > + {'1970-01-01T00:00:00.00009Z', 0, 90000, 0, 0}, > + {'1970-01-01T00:00:00.0009Z', 0, 900000, 0, 0}, > + {'1970-01-01T00:00:00.009Z', 0, 9000000, 0, 1}, > + {'1970-01-01T00:00:00.09Z', 0, 90000000, 0, 0}, > + {'1970-01-01T00:00:00.9Z', 0, 900000000, 0, 0}, > + {'1970-01-01T00:00:00.99Z', 0, 990000000, 0, 0}, > + {'1970-01-01T00:00:00.999Z', 0, 999000000, 0, 1}, > + {'1970-01-01T00:00:00.9999Z', 0, 999900000, 0, 0}, > + {'1970-01-01T00:00:00.99999Z', 0, 999990000, 0, 0}, > + {'1970-01-01T00:00:00.999999Z', 0, 999999000, 0, 1}, > + {'1970-01-01T00:00:00.9999999Z', 0, 999999900, 0, 0}, > + {'1970-01-01T00:00:00.99999999Z', 0, 999999990, 0, 0}, > + {'1970-01-01T00:00:00.999999999Z', 0, 999999999, 0, 1}, > + {'1970-01-01T00:00:00.0Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.00Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.0000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.00000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.000000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.0000000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.00000000Z', 0, 0, 0, 0}, > + {'1970-01-01T00:00:00.000000000Z', 0, 0, 0, 0}, > + {'1973-11-29T21:33:09Z', 123456789, 0, 0, 1}, > + {'2013-10-28T17:51:56Z', 1382982716, 0, 0, 1}, > + {'9999-12-31T23:59:59Z', 253402300799, 0, 0, 1}, > + } > + for _, value in ipairs(tests) do > + local str, epoch, nsec, offset, check > + str, epoch, nsec, offset, check = unpack(value) > + local dt = date(str) > + test:ok(dt.secs == epoch, ('%s: dt.secs == %d'):format(str, epoch)) Shouldn't be that easily accessible from the code. Prepend with an underscore to emphasize that they are private members? > + test:ok(dt.nsec == nsec, ('%s: dt.nsec == %d'):format(str, nsec)) > + test:ok(dt.offset == offset, ('%s: dt.offset == %d'):format(str, offset)) > + if check > 0 then > + test:ok(str == tostring(dt), ('%s == tostring(%s)'): > + format(str, tostring(dt))) > + end > + end > +end) > + > +ffi.cdef [[ > + void tzset(void); > +]] Unused? > + > +test:test("Datetime string formatting", function(test) > + test:plan(5) > + local str = "1970-01-01" > + local t = date(str) > + test:ok(t.secs == 0, ('%s: t.secs == %d'):format(str, tonumber(t.secs))) > + test:ok(t.nsec == 0, ('%s: t.nsec == %d'):format(str, t.nsec)) > + test:ok(t.offset == 0, ('%s: t.offset == %d'):format(str, t.offset)) > + test:ok(date.strftime('%d/%m/%Y', t) == '01/01/1970', ('%s: strftime #1'):format(str)) > + test:ok(date.strftime('%A %d. %B %Y', t) == 'Thursday 01. January 1970', ('%s: strftime #2'):format(str)) > +end) > + > +test:test("Parse iso date - valid strings", function(test) > + test:plan(32) > + local good = { > + {2012, 12, 24, "20121224", 8 }, > + {2012, 12, 24, "20121224 Foo bar", 8 }, > + {2012, 12, 24, "2012-12-24", 10 }, > + {2012, 12, 24, "2012-12-24 23:59:59", 10 }, > + {2012, 12, 24, "2012-12-24T00:00:00+00:00", 10 }, > + {2012, 12, 24, "2012359", 7 }, > + {2012, 12, 24, "2012359T235959+0130", 7 }, > + {2012, 12, 24, "2012-359", 8 }, > + {2012, 12, 24, "2012W521", 8 }, > + {2012, 12, 24, "2012-W52-1", 10 }, > + {2012, 12, 24, "2012Q485", 8 }, > + {2012, 12, 24, "2012-Q4-85", 10 }, > + { 1, 1, 1, "0001-Q1-01", 10 }, > + { 1, 1, 1, "0001-W01-1", 10 }, > + { 1, 1, 1, "0001-01-01", 10 }, > + { 1, 1, 1, "0001-001", 8 }, > + } > + > + for _, value in ipairs(good) do > + local year, month, day, str, date_part_len > + year, month, day, str, date_part_len = unpack(value) > + local expected_date = date{year = year, month = month, day = day} > + local date_part, len > + date_part, len = date.parse_date(str) > + test:ok(len == date_part_len, ('%s: length check %d'):format(str, len)) > + test:ok(expected_date == date_part, ('%s: expected date'):format(str)) > + end > +end) > + > +test:test("Parse iso date - invalid strings", function(test) > + test:plan(62) > + local bad = { > + "20121232" , -- Invalid day of month > + "2012-12-310", -- Invalid day of month > + "2012-13-24" , -- Invalid month > + "2012367" , -- Invalid day of year > + "2012-000" , -- Invalid day of year > + "2012W533" , -- Invalid week of year > + "2012-W52-8" , -- Invalid day of week > + "2012Q495" , -- Invalid day of quarter > + "2012-Q5-85" , -- Invalid quarter > + "20123670" , -- Trailing digit > + "201212320" , -- Trailing digit > + "2012-12" , -- Reduced accuracy > + "2012-Q4" , -- Reduced accuracy > + "2012-Q42" , -- Invalid > + "2012-Q1-1" , -- Invalid day of quarter > + "2012Q--420" , -- Invalid > + "2012-Q-420" , -- Invalid > + "2012Q11" , -- Incomplete > + "2012Q1234" , -- Trailing digit > + "2012W12" , -- Incomplete > + "2012W1234" , -- Trailing digit > + "2012W-123" , -- Invalid > + "2012-W12" , -- Incomplete > + "2012-W12-12", -- Trailing digit > + "2012U1234" , -- Invalid > + "2012-1234" , -- Invalid > + "2012-X1234" , -- Invalid > + "0000-Q1-01" , -- Year less than 0001 > + "0000-W01-1" , -- Year less than 0001 > + "0000-01-01" , -- Year less than 0001 > + "0000-001" , -- Year less than 0001 > + } > + > + for _, str in ipairs(bad) do > + local date_part, len > + date_part, len = date.parse_date(str) > + test:ok(len == 0, ('%s: length check %d'):format(str, len)) > + test:ok(date_part == nil, ('%s: empty date check %s'):format(str, date_part)) > + end > +end) > + > +test:test("Parse tiny date into seconds and other parts", function(test) > + test:plan(7) > + local str = '19700101 00:00:30.528' > + local tiny = date(str) > + test:ok(tiny.secs == 30, ("secs of '%s'"):format(str)) > + test:ok(tiny.nsec == 528000000, ("nsec of '%s'"):format(str)) > + test:ok(tiny:nanoseconds() == 30528000000, "nanoseconds") > + test:ok(tiny:microseconds() == 30528000, "microseconds") > + test:ok(tiny:milliseconds() == 30528, "milliseconds") > + test:ok(tiny:seconds() == 30.528, "seconds") > + test:ok(tiny:timestamp() == 30.528, "timestamp") > +end) > + > +test:test("Time interval operations", function(test) > + test:plan(2) > + > + -- check arithmetic with leap dates > + local T = date('1972-02-29') > + test:ok(tostring(T:add{years = 1, months = 2}) == '1973-05-01T00:00Z', > + ('T:add{years=1,months=2}(%s)'):format(T)) > + > + -- check average, not leap dates > + T = date('1970-01-08') > + test:ok(tostring(T:add{years = 1, months = 2}) == '1971-03-08T00:00Z', > + ('T:add{years=1,months=2}(%s)'):format(T)) > + > +end) > + > +os.exit(test:check() and 0 or 1) > diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt > index 5bb7cd6e7..8194a133f 100644 > --- a/test/unit/CMakeLists.txt > +++ b/test/unit/CMakeLists.txt > @@ -56,7 +56,8 @@ add_executable(uuid.test uuid.c core_test_utils.c) > target_link_libraries(uuid.test uuid unit) > add_executable(random.test random.c core_test_utils.c) > target_link_libraries(random.test core unit) > - > +add_executable(datetime.test datetime.c) > +target_link_libraries(datetime.test cdt core unit) > add_executable(bps_tree.test bps_tree.cc) > target_link_libraries(bps_tree.test small misc) > add_executable(bps_tree_iterator.test bps_tree_iterator.cc) > diff --git a/test/unit/datetime.c b/test/unit/datetime.c > new file mode 100644 > index 000000000..931636172 > --- /dev/null > +++ b/test/unit/datetime.c > @@ -0,0 +1,260 @@ > +#include "dt.h" > +#include <assert.h> > +#include <stdint.h> > +#include <string.h> > +#include <time.h> > + > +#include "unit.h" > +#include "datetime.h" > +#include "trivia/util.h" > + > +static const char sample[] = "2012-12-24T15:30Z"; > + > +#define S(s) {s, sizeof(s) - 1} > +struct { > + const char *str; > + size_t len; > +} tests[] = { > + S("2012-12-24 15:30Z"), > + S("2012-12-24 15:30z"), > + S("2012-12-24 15:30"), > + S("2012-12-24 16:30+01:00"), > + S("2012-12-24 16:30+0100"), > + S("2012-12-24 16:30+01"), > + S("2012-12-24 14:30-01:00"), > + S("2012-12-24 14:30-0100"), > + S("2012-12-24 14:30-01"), > + S("2012-12-24 15:30:00Z"), > + S("2012-12-24 15:30:00z"), > + S("2012-12-24 15:30:00"), > + S("2012-12-24 16:30:00+01:00"), > + S("2012-12-24 16:30:00+0100"), > + S("2012-12-24 14:30:00-01:00"), > + S("2012-12-24 14:30:00-0100"), > + S("2012-12-24 15:30:00.123456Z"), > + S("2012-12-24 15:30:00.123456z"), > + S("2012-12-24 15:30:00.123456"), > + S("2012-12-24 16:30:00.123456+01:00"), > + S("2012-12-24 16:30:00.123456+01"), > + S("2012-12-24 14:30:00.123456-01:00"), > + S("2012-12-24 14:30:00.123456-01"), > + S("2012-12-24t15:30Z"), > + S("2012-12-24t15:30z"), > + S("2012-12-24t15:30"), > + S("2012-12-24t16:30+01:00"), > + S("2012-12-24t16:30+0100"), > + S("2012-12-24t14:30-01:00"), > + S("2012-12-24t14:30-0100"), > + S("2012-12-24t15:30:00Z"), > + S("2012-12-24t15:30:00z"), > + S("2012-12-24t15:30:00"), > + S("2012-12-24t16:30:00+01:00"), > + S("2012-12-24t16:30:00+0100"), > + S("2012-12-24t14:30:00-01:00"), > + S("2012-12-24t14:30:00-0100"), > + S("2012-12-24t15:30:00.123456Z"), > + S("2012-12-24t15:30:00.123456z"), > + S("2012-12-24t16:30:00.123456+01:00"), > + S("2012-12-24t14:30:00.123456-01:00"), > + S("2012-12-24 16:30 +01:00"), > + S("2012-12-24 14:30 -01:00"), > + S("2012-12-24 15:30 UTC"), > + S("2012-12-24 16:30 UTC+1"), > + S("2012-12-24 16:30 UTC+01"), > + S("2012-12-24 16:30 UTC+0100"), > + S("2012-12-24 16:30 UTC+01:00"), > + S("2012-12-24 14:30 UTC-1"), > + S("2012-12-24 14:30 UTC-01"), > + S("2012-12-24 14:30 UTC-01:00"), > + S("2012-12-24 14:30 UTC-0100"), > + S("2012-12-24 15:30 GMT"), > + S("2012-12-24 16:30 GMT+1"), > + S("2012-12-24 16:30 GMT+01"), > + S("2012-12-24 16:30 GMT+0100"), > + S("2012-12-24 16:30 GMT+01:00"), > + S("2012-12-24 14:30 GMT-1"), > + S("2012-12-24 14:30 GMT-01"), > + S("2012-12-24 14:30 GMT-01:00"), > + S("2012-12-24 14:30 GMT-0100"), > + S("2012-12-24 14:30 -01:00"), > + S("2012-12-24 16:30:00 +01:00"), > + S("2012-12-24 14:30:00 -01:00"), > + S("2012-12-24 16:30:00.123456 +01:00"), > + S("2012-12-24 14:30:00.123456 -01:00"), > + S("2012-12-24 15:30:00.123456 -00:00"), > + S("20121224T1630+01:00"), > + S("2012-12-24T1630+01:00"), > + S("20121224T16:30+01"), > + S("20121224T16:30 +01"), > +}; > +#undef S > + > +static int > +parse_datetime(const char *str, size_t len, int64_t *secs_p, > + int32_t *nanosecs_p, int32_t *offset_p) > +{ > + size_t n; > + dt_t dt; > + char c; > + int sec_of_day = 0, nanosecond = 0, offset = 0; > + > + n = dt_parse_iso_date(str, len, &dt); > + if (!n) > + return 1; > + if (n == len) > + goto exit; > + > + c = str[n++]; > + if (!(c == 'T' || c == 't' || c == ' ')) > + return 1; > + > + str += n; > + len -= n; > + > + n = dt_parse_iso_time(str, len, &sec_of_day, &nanosecond); > + if (!n) > + return 1; > + if (n == len) > + goto exit; > + > + if (str[n] == ' ') > + n++; > + > + str += n; > + len -= n; > + > + n = dt_parse_iso_zone_lenient(str, len, &offset); > + if (!n || n != len) > + return 1; > + > +exit: > + *secs_p = ((int64_t)dt_rdn(dt) - DT_EPOCH_1970_OFFSET) * SECS_PER_DAY + > + sec_of_day - offset * 60; > + *nanosecs_p = nanosecond; > + *offset_p = offset; > + > + return 0; > +} > + > +static int > +local_rd(const struct datetime *dt) > +{ > + return (int)((int64_t)dt->secs / SECS_PER_DAY) + DT_EPOCH_1970_OFFSET; > +} > + > +static int > +local_dt(const struct datetime *dt) > +{ > + return dt_from_rdn(local_rd(dt)); > +} > + > +struct tm * > +datetime_to_tm(struct datetime *dt) > +{ > + static struct tm tm; > + > + memset(&tm, 0, sizeof(tm)); > + dt_to_struct_tm(local_dt(dt), &tm); > + > + int seconds_of_day = (int64_t)dt->secs % 86400; > + tm.tm_hour = (seconds_of_day / 3600) % 24; > + tm.tm_min = (seconds_of_day / 60) % 60; > + tm.tm_sec = seconds_of_day % 60; > + > + return &tm; > +} Code duplication with the code from the datetime module. I think that all datetime helper functions should be defined only in the datetime module and used both by the code and the tests. > + > +static void datetime_test(void) > +{ > + size_t index; > + int64_t secs_expected; > + int32_t nanosecs; > + int32_t offset; > + > + plan(355); > + parse_datetime(sample, sizeof(sample) - 1, > + &secs_expected, &nanosecs, &offset); > + > + for (index = 0; index < lengthof(tests); index++) { > + int64_t secs; > + int rc = parse_datetime(tests[index].str, tests[index].len, > + &secs, &nanosecs, &offset); > + is(rc, 0, "correct parse_datetime return value for '%s'", > + tests[index].str); > + is(secs, secs_expected, "correct parse_datetime output " > + "seconds for '%s", > + tests[index].str); > + > + /* > + * check that stringized literal produces the same date > + * time fields > + */ > + static char buff[40]; > + struct datetime dt = {secs, nanosecs, offset}; > + /* datetime_to_tm returns time in GMT zone */ > + struct tm *p_tm = datetime_to_tm(&dt); > + size_t len = strftime(buff, sizeof(buff), "%F %T", p_tm); > + ok(len > 0, "strftime"); > + int64_t parsed_secs; > + int32_t parsed_nsecs, parsed_ofs; > + rc = parse_datetime(buff, len, &parsed_secs, &parsed_nsecs, &parsed_ofs); > + is(rc, 0, "correct parse_datetime return value for '%s'", buff); > + is(secs, parsed_secs, > + "reversible seconds via strftime for '%s", buff); > + } > + check_plan(); > +} > + > + > +static void > +tostring_datetime_test(void) > +{ > + static struct { > + const char *string; > + int64_t secs; > + uint32_t nsec; > + uint32_t offset; > + } tests[] = { > + {"1970-01-01T02:00+02:00", 0, 0, 120}, > + {"1970-01-01T01:30+01:30", 0, 0, 90}, > + {"1970-01-01T01:00+01:00", 0, 0, 60}, > + {"1970-01-01T00:01+00:01", 0, 0, 1}, > + {"1970-01-01T00:00Z", 0, 0, 0}, > + {"1969-12-31T23:59-00:01", 0, 0, -1}, > + {"1969-12-31T23:00-01:00", 0, 0, -60}, > + {"1969-12-31T22:30-01:30", 0, 0, -90}, > + {"1969-12-31T22:00-02:00", 0, 0, -120}, > + {"1970-01-01T00:00:00.123456789Z", 0, 123456789, 0}, > + {"1970-01-01T00:00:00.123456Z", 0, 123456000, 0}, > + {"1970-01-01T00:00:00.123Z", 0, 123000000, 0}, > + {"1973-11-29T21:33:09Z", 123456789, 0, 0}, > + {"2013-10-28T17:51:56Z", 1382982716, 0, 0}, > + {"9999-12-31T23:59:59Z", 253402300799, 0, 0}, > + }; > + size_t index; > + > + plan(15); > + for (index = 0; index < lengthof(tests); index++) { > + struct datetime date = { > + tests[index].secs, > + tests[index].nsec, > + tests[index].offset > + }; > + char buf[48]; > + datetime_to_string(&date, buf, sizeof(buf)); > + is(strcmp(buf, tests[index].string), 0, > + "string '%s' expected, received '%s'", > + tests[index].string, buf); > + } > + check_plan(); > +} > + > +int > +main(void) > +{ > + plan(2); > + datetime_test(); > + tostring_datetime_test(); > + > + return check_plan(); > +}
next prev parent reply other threads:[~2021-08-19 15:27 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-08-19 2:56 [Tarantool-patches] [PATCH v6 0/5] Initial datetime implementation Timur Safin via Tarantool-patches 2021-08-19 2:56 ` [Tarantool-patches] [PATCH v6 1/5] build, lua: built-in module datetime Timur Safin via Tarantool-patches 2021-08-19 9:43 ` Serge Petrenko via Tarantool-patches 2021-08-19 9:47 ` Safin Timur via Tarantool-patches 2021-08-19 15:26 ` Vladimir Davydov via Tarantool-patches [this message] 2021-08-24 21:13 ` Vladislav Shpilevoy via Tarantool-patches 2021-08-19 2:56 ` [Tarantool-patches] [PATCH v6 2/5] box, datetime: messagepack support for datetime Timur Safin via Tarantool-patches 2021-08-19 9:58 ` Serge Petrenko via Tarantool-patches 2021-08-19 2:56 ` [Tarantool-patches] [PATCH v6 3/5] box, datetime: datetime comparison for indices Timur Safin via Tarantool-patches 2021-08-19 10:16 ` Serge Petrenko via Tarantool-patches 2021-08-19 11:18 ` UNera via Tarantool-patches 2021-08-19 11:53 ` Safin Timur via Tarantool-patches 2021-08-19 14:47 ` Dmitry E. Oboukhov via Tarantool-patches 2021-08-19 2:56 ` [Tarantool-patches] [PATCH v6 4/5] datetime: perf test for datetime parser Timur Safin via Tarantool-patches 2021-08-19 10:19 ` Serge Petrenko via Tarantool-patches 2021-08-19 10:29 ` Safin Timur via Tarantool-patches 2021-08-19 11:11 ` Serge Petrenko via Tarantool-patches 2021-08-19 15:58 ` Vladimir Davydov via Tarantool-patches 2021-08-19 2:56 ` [Tarantool-patches] [PATCH v6 5/5] datetime: changelog for datetime module Timur Safin via Tarantool-patches 2021-08-19 10:20 ` Serge Petrenko via Tarantool-patches
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20210819152656.viku6crt3uezsrdd@esperanza \ --to=tarantool-patches@dev.tarantool.org \ --cc=tsafin@tarantool.org \ --cc=v.shpilevoy@tarantool.org \ --cc=vdavydov@tarantool.org \ --subject='Re: [Tarantool-patches] [PATCH v6 1/5] build, lua: built-in module datetime' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox