[Tarantool-patches] [RFC PATCH 00/13] Initial datetime support

Igor Munkin imun at tarantool.org
Thu Jul 22 13:01:33 MSK 2021


Timur,

Thanks for the series! I wonder, why you decided to reinvent several
approaches being already used for quite similar Tarantool types: decimal
and uuid. Frankly speaking, I expect you introduce datetime the very
*very* similar way. I didn't make the precise review for this series
(this is waste of time), so please consider my major comments below.

Let's start with top 3 comments to the eye catching problems I've found
while looking at the patches.

1. Tests have to be committed in the same scope the corresponding
   changes in Tarantool are introduced.
2. Docbot request is required for the changes you're introducing: this
   is the place where you can leave the description you've added below.
3. Please consider approaches used for decimal and uuid: almost every
   Oleg's comment came from the bugs fixed for these types.

As I write above, this is not a precise review, so I don't even start
nitpicking the patches and simply address you to this document[1] and
our dev guide[2]. Please, check whether your next series respects these
guidelines.

On 15.07.21, Timur Safin wrote:
> Here is the preliminary RFC version of a patchset, which is implementing
> datetime lua support in box, their messagepack, yaml, json serialization
> support, and a set of c and lua tests.
> 
> * It's heavily influenced by Sci-Lua lua-time module implementation
>   https://github.com/stepelu/lua-time 
>   e.g. you could find very similar approach for handling of operations
>   with year and months periods (which should be handled differently than
>   usual seconds, days periods).
> 
> * But internally it actually uses Christina Hanson' c-dt module
>   https://github.com/chansen/c-dt 
>   (though it has been modified slightly for cleaner integration
>   into cmake build process)

BTW, submodule points to the very strange repo: please use either
vanilla one, or create a fork into Tarantool repositories.

> 
> This is preliminary version of patchset, with unoptimal patches divisions
> (like you could see patches with rename refactorings), but, due to an amount
> of code in patchset, and limited time have, I send patchset as RFC today,
> expecting to receive initial feedback for various parts of a series.

Well, I guess I can stop here for the initial feedback.

I have no idea, why you decided to introduce a separate tiny module
src/lua/datetime.c just for initializing several constants and a
oneliner for pushing datetime GCcdata on Lua stack. This module is
excess. Please consider the way this is done for uuid and move
everything to src/lua/utils.c that encloses all illegitimate tricks with
Lua C API.

> 
> Prior plan was to split series into one, which introduces datetime module api,
> and another one with datetime messagepack serialization implementation.
> But here they left as part of a same series. Sorry, for invonvenience - I'll
> reshuffle order according to feedback provided.

[I personally don't know what kind of the feedback you need to reshuffle
the series...]

> 
> The problem, which was stopping such simple division - that `datetime.lua`
> module uses `datetime_to_string()` ffi function for stringization of a
> datetime object. And this function has been introduced as part of
> messagepack/yaml/json serialization support. Yes, intra-dependendencies
> between parts looks suboptimal, and I'm very open to suggestions how to
> divide it properly.

Again, vse uje ukradeno^W pridumano do nas: consider the way this is
solved for uuid (within src/lib/uuid/tt_uuid.[ch]), since you've chosen
FFI approach for this type (we'll return to this question later).

> 
> Module API
> ==========
> 
> To simplify review of a code in patchset below you could find preliminary
> documentation of a datetime api module. Eventually we may put it
> as .rst file in documentation repository (though, personally, I'd prefer
> to have all modules api committed to the same repo as code, but I digress)

I guess, I'll leave general comments right here.

1. <asctime>, <strftime>, <ctime> can be implemented via FFI by user,
   and you simply provide a wrapped. Anyway, it's already implemented,
   so OK.
2. IMHO, you test too many exotic formats and corner cases of creating
   datetime object. I doubt somebody will use at least 10% of the
   provided functionality (especially in beta).
3. Datetime and interval arithmetics -- this is what bother me a lot.
   I believe every average Lua user can parse date into the table and
   build the string from this table back. But nobody wants to mess with
   all that hell with leap years arithmetics, leap seconds, etc.
4. Furthermore, I don't understand the following behaviour:
   | tarantool> d = require('datetime')
   | ---
   | ...
   | 
   | tarantool> ft = require('ffi').typeof
   | ---
   | - +2 days, 0 hours, 0 minutes, 0 seconds
   | ...
   | 
   | tarantool> d.days(2) + d.days(1)
   | ---
   | - 1970-01-04T00:00Z
   | ...
   | 
   | tarantool> ft(d.days(2) + d.days(1))
   | ---
   | - ctype<struct datetime_t>
   | ...
   | 
   | tarantool> d.days(2) - d.days(1)
   | ---
   | - error: 'builtin/datetime.lua:554: date/time expected'
   | ...
   | 
   I believe it's quite natural to do arithmetics with intervals to
   create a new one, isn't it?

> 
> Datetime module
> ===============
> 
> This module allows to parse date/time stamps, manipulate with them, and
> generate textual representation. Parsing of date constants will handle
> all ISO-8601 formats, and deal with some extensions.
> 
> For internal date representation it uses cdata structures of a form:

BTW, I agree w/ Oleg and see no motivation why FFI is chosen for
implementation: decimal is implemented via Lua C API, uuid -- via FFI.
Are there any proofs and benchmarks for datetime? Does it violate any
limit mentioned here[1]?

> 
> ```c++
> struct t_datetime {
> 	int secs;
> 	int nsecs;
> 	int offset;
> };
> ```
> 
> Where:
> 
> -   secs is the (signed) number of seconds since epoch 1970-01-01T00:00Z;
> -   nsecs is number of nanoseconds since beginning of day;
> -   offset is the timezone offset (in minutes);
> 
> `datetime()` ? create datetime object
> -------------------------------------
> 

Both decimal and uuid modules provides <new> function, but you decided
to name the datetime constructor <datetime>. I see no reason for such
inconsistency.

> Create date/time object using either string literals, or initialization
> object. When given string literals it behaves like wrapper around
> parse() method.
> 
> When there is initialization object it could create date/time object
> using these attributes below:
> 
> | **Easy way** |                                                                                                 |
> |--------------|-------------------------------------------------------------------------------------------------|
> | secs         | Seconds since epoch                                                                             |
> | nsec         | Nanoseconds since midnight                                                                      |
> | offset       | Time-zone offset in minutes                                                                     |
> | **YMD part** |                                                                                                 |
> | year         | Year in range \[1..9999\]                                                                       |
> | month        | Month in range \[1..12\]                                                                        |
> | day          | Day in month in range \[1..31\]                                                                 |
> | **HMS part** |                                                                                                 |
> | hour         | Hour in range \[0..23\]                                                                         |
> | minute       | Minute in range \[0..59\]                                                                       |
> | second       | Seconds in range \[0..60\]. It allows to have fraction part in which case it goes to nsec field |
> | tz           | Timezone offset (in minutes) for HMS part                                                       |
> 
> Example:
> 
> ```lua
> 
> datetime = require `datetime`
> 
> d = datetime{secs = 0, nsec = 0, offset = 0}
> d = datetime(?1970-01-01?)
> d = datetime{year = 1970, month = 1, day = 1}
> ```
> 
> `delta()` ? create time duration object
> ---------------------------------------
> 
> TBD

Why delta? Sci-Lua uses period, that is quite accurate term. SQL
provides INTERVAL type. I guess you can choose something from this.

> 
> `parse()` ? parse full ISO-8601 string
> --------------------------------------
> 
> Parse full length date/time literal, which may be in ISO-8601 format of
> any of extended formats supported by `parse_date()`, `parse_time()` or
> `parse_timezone()`
> 
> It deals with date/time string in format
> 
> `date ([Tt ] time ([ ] time_zone)? )?`
> 
> Where time or `time_zone` parts may be omitted.
> 
> Example:
> 
> ```lua
> 
> datetime = require `datetime`
> 
> d = datetime.parse(`1970-01-01`)
> d = datetime.parse(`1970-01-01T00:00:00Z`)
> d = datetime.parse(`1970-01-01T02:00:00+02:00`)
> ```
> 
> `parse_date()` ? parse ISO-8601 date literal
> --------------------------------------------
> 
> Parse date string literal, return partial date object which has
> precision of up-to date period of time.
> 
> A number of standard ISO-8601 formats supported, plus there are some
> relaxed formats which are of frequently use:
> 
> | Basic    | Extended   |                          |
> |----------|------------|--------------------------|
> | 20121224 | 2012-12-24 | Calendar date (ISO 8601) |
> | 2012359  | 2012-359   | Ordinal date (ISO 8601)  |
> | 2012W521 | 2012-W52-1 | Week date (ISO 8601)     |
> | 2012Q485 | 2012-Q4-85 | Quarter date             |
> 
> `parse_time()` ? parse ISO-8601 time literal
> --------------------------------------------
> 
> Parse time string literal, return partial date/time object, which
> defines time period inside of single date.
> 
> A number of standard ISO-8601 formats supported, plus there are some
> relaxed formats which are of frequently use:
> 
> | Basic             | Extended            |
> |-------------------|---------------------|
> | T12               | N/A                 |
> | T1230             | T12:30              |
> | T123045           | T12:30:45           |
> | T123045.123456789 | T12:30:45.123456789 |
> | T123045,123456789 | T12:30:45,123456789 |
> 
> The time designator T may be omitted.
> 
> `parse_zone()` ? parse ISO-8601 time zone
> -----------------------------------------
> 
> Parse time-zone string literal, return partial date/time object, which
> defines timezone offset in minutes sing GMT.
> 
> A number of standard ISO-8601 formats supported, plus there are some
> relaxed formats which are of frequently use:
> 
> | Basic | Extended |
> |-------|----------|
> | Z     | N/A      |
> | ?hh   | N/A      |
> | ?hhmm | ?hh:mm   |

AFAIU, <parse_{date,time,zone}> are the parts of <parse> routine, right?
Then why do you provide a separate interfaces for these purposes?

> 
> `tostring()` ? convert datetime object to string
> ------------------------------------------------
> 
> Return string representation (probably compact if there are some parts
> missing) of a date-time objects passed
> 
> `now()` ? return current date/time
> ----------------------------------
> 
> `now()` returns local date and time object. It will use local time-zone
> and nanosecond precision.
> 
> `strftime()` ? convert date object to string using format
> ---------------------------------------------------------
> 
> `strftime()` is the FFI wrapper around strftime() function in LIBC. It
> supports all the same flags which supports strftime() from host OS.
> 
> See
> <https://pubs.opengroup.org/onlinepubs/000095399/functions/strftime.html>
> for more details.
> 
> `asctime()` ? convert date object to string using asctime predefined format
> ---------------------------------------------------------------------------
> 
> `asctime()` is the FFI wrapper over `asctime_r()` from a host libc. asctime
> returns string in the form `"Sun Sep 16 01:03:52 1973\\n\\0"`
> 
> <https://pubs.opengroup.org/onlinepubs/009695399/functions/asctime.html>
> 
> `ctime()` ? convert local time to string using ctime() predefined format
> ------------------------------------------------------------------------
> 
> `ctime()` is the FFI wrapper over `ctime_r()` in the host libc. ctime
> returns string in the form `"Sun Sep 16 01:03:52 1973\\n\\0"`

This is not quite right, considering your code, so here is the question:
do you mean ctime(3) or ctime_r(3).

> 
> <https://pubs.opengroup.org/onlinepubs/009695399/functions/ctime.html>
> 
> The difference of `ctime()` and `asctime()` is that former is returning
> local time zone formatted, while the latter will deal with GMT.
> 
> Examples:
> 
> ```
> tarantool> date = require 'datetime'
> ---
> ...
> tarantool> T = date.now()
> ---
> ...
> tarantool> T
> ---
> - 2021-07-14T01:36:48.554105+03:00
> ...
> tarantool> date.asctime(T)
> ---
> - 'Tue Jul 13 22:36:48 2021
> '
> ...
> tarantool> date.ctime(T)
> ---
> - 'Wed Jul 14 01:36:48 2021
> '
> ...
> ```
> 
> Date attribute accessors
> ------------------------
> 
> |                |                                                                 |
> |----------------|-----------------------------------------------------------------|
> | `timestamp`    | Calculate timestamp with seconds and nanoseconds parts combined |
> | `nanoseconds`  | Number of nanoseconds in time part                              |
> | `microseconds` | Number of microseconds in time part                             |
> | `milliseconds` | Number of milliseconds in time part                             |
> | `seconds`      | Alias to timestamp                                              |
> | `minutes`      | Number of minutes in time part                                  |
> | `hours`        | Number of hours in time part                                    |
> | `days`         | Number of days in time part                                     |
> 
> ```
> tarantool> d = date{year = 1970, month = 1, day = 1, hour = 0, minute = 10, second=10}
> tarantool> d.secs
> ---
> - 610
> ...
> tarantool> d.nsec
> ---
> - 0
> ...
> tarantool> d.offset
> ---
> - 0
> ...
> tarantool> d.nanoseconds
> ---
> - 610000000000
> ...
> tarantool> d.milliseconds
> ---
> - 610000
> ...
> tarantool> d.hours
> ---
> - 0.16944444444444
> ...
> tarantool> d.minutes
> ---
> - 10.166666666667
> ...
> ```
> 
> Date arithmetic
> ---------------
> 

The most interesting part of the doc by the way.

> TBD
> 
> 
> https://github.com/tarantool/tarantool/issues/5941
> https://github.com/tarantool/tarantool/issues/5946
> https://github.com/tarantool/tarantool/tree/tsafin/gh-5941-datetime-V2
> 
> 
> 
> Timur Safin (13):
>   build: add Christian Hansen c-dt to the build
>   lua: built-in module datetime
>   test: datetime test
>   test: datetime string formatting
>   box: add messagepack support for datetime
>   lua: positive/negative cases in datetime test
>   lua: asctime and strfime fixed
>   box, lua: renamed t_datetime_tz structure to datetime_t
>   lua: calculated attributes for date
>   lua: tostring formatization in datetime.lua
>   test: allow relaxed date format without tz
>   lua: initial time duration support
>   lua: complete time duration support
> 
>  .gitmodules                       |   3 +
>  CMakeLists.txt                    |   8 +
>  cmake/BuildCDT.cmake              |   6 +
>  src/CMakeLists.txt                |   4 +
>  src/box/field_def.c               |  34 +-
>  src/box/field_def.h               |   1 +
>  src/box/msgpack.c                 |   7 +-
>  src/box/tuple_compare.cc          |  24 +
>  src/exports.h                     |  26 +
>  src/lib/core/CMakeLists.txt       |   4 +-
>  src/lib/core/datetime.h           |  96 ++++
>  src/lib/core/mp_datetime.c        | 232 ++++++++
>  src/lib/core/mp_extension_types.h |   1 +
>  src/lib/mpstream/mpstream.c       |  11 +
>  src/lib/mpstream/mpstream.h       |   4 +
>  src/lua/datetime.c                |  70 +++
>  src/lua/datetime.h                |  53 ++
>  src/lua/datetime.lua              | 868 ++++++++++++++++++++++++++++++
>  src/lua/init.c                    |   6 +-
>  src/lua/msgpack.c                 |  12 +
>  src/lua/msgpackffi.lua            |   8 +
>  src/lua/serializer.c              |   4 +
>  src/lua/serializer.h              |   2 +
>  src/lua/utils.c                   |   1 -
>  test/app-tap/datetime.test.lua    | 266 +++++++++
>  test/unit/CMakeLists.txt          |   2 +
>  test/unit/datetime.c              | 220 ++++++++
>  test/unit/datetime.result         | 358 ++++++++++++
>  third_party/c-dt                  |   1 +
>  third_party/lua-cjson/lua_cjson.c |   8 +
>  third_party/lua-yaml/lyaml.cc     |   6 +-
>  31 files changed, 2326 insertions(+), 20 deletions(-)
>  create mode 100644 cmake/BuildCDT.cmake
>  create mode 100644 src/lib/core/datetime.h
>  create mode 100644 src/lib/core/mp_datetime.c
>  create mode 100644 src/lua/datetime.c
>  create mode 100644 src/lua/datetime.h
>  create mode 100644 src/lua/datetime.lua
>  create mode 100755 test/app-tap/datetime.test.lua
>  create mode 100644 test/unit/datetime.c
>  create mode 100644 test/unit/datetime.result
>  create mode 160000 third_party/c-dt
> 
> -- 
> 2.29.2
> 

[1]: https://github.com/tarantool/tarantool/wiki/Code-review-procedure
[2]: https://www.tarantool.io/en/doc/latest/dev_guide/developer_guidelines/
[1]: http://luajit.org/ext_ffi_semantics.html#status

-- 
Best regards,
IM


More information about the Tarantool-patches mailing list