[Tarantool-patches] [PATCH v5 1/8] build: add Christian Hansen c-dt to the build

Safin Timur tsafin at tarantool.org
Wed Aug 18 13:04:16 MSK 2021


Thanks for your feedback, it's massive. Will take some time to process 
all changes...

On 17.08.2021 18:50, Vladimir Davydov via Tarantool-patches wrote:
> On Mon, Aug 16, 2021 at 02:59:35AM +0300, Timur Safin via
> Tarantool-patches wrote:
>> * Integrated chansen/c-dt parser as 3rd party module to the
>>    Tarantool cmake build process;
>> * Points to tarantool/c-dt instead of original chansen/c-dt to
>>    have easier build integration, because there are additional
>>    commits which have integrated cmake support and have established
>>    symbols renaming facilities (similar to those we see in xxhash
>>    or icu).
>>    We have to be able to rename externally public symbols to avoid
>>    name clashes with 3rd party modules. We prefix c-dt symbols
>>    in the Tarantool build with `tnt_` prefix;
> 
> I don't see this done in this patch. Looks like this is done by patch 2.

Indeed, that was extracted for better observability of a patch.

> 
>> * took special care that generated build artifacts not override
>>    in-source files, but use different build/ directory.
>>
>> * added datetime parsing unit tests, for literals - with and
>>    without trailing timezones;
>> * also we check that strftime is reversible and produce consistent
>>    results after roundtrip from/to strings;
>> * discovered the harder way that on *BSD/MacOSX `strftime()` format
>>    `%z` outputs local time-zone if passed `tm_gmtoff` is 0.
>>    This behaviour is different to that we observed on Linux, thus we
>>    might have different execution results. Made test to not use `%z`
>>    and only operate with normalized date time formats `%F` and `%T`
>>
>> Part of #5941
>> ---
>>   .gitmodules               |   3 +
>>   CMakeLists.txt            |   8 +
>>   cmake/BuildCDT.cmake      |   8 +
>>   src/CMakeLists.txt        |   3 +-
>>   test/unit/CMakeLists.txt  |   3 +-
>>   test/unit/datetime.c      | 223 ++++++++++++++++++++++++
>>   test/unit/datetime.result | 358 ++++++++++++++++++++++++++++++++++++++
>>   third_party/c-dt          |   1 +
>>   8 files changed, 605 insertions(+), 2 deletions(-)
>>   create mode 100644 cmake/BuildCDT.cmake
>>   create mode 100644 test/unit/datetime.c
>>   create mode 100644 test/unit/datetime.result
>>   create mode 160000 third_party/c-dt
>>
>> diff --git a/test/unit/datetime.c b/test/unit/datetime.c
>> new file mode 100644
>> index 000000000..64c19dac4
>> --- /dev/null
>> +++ b/test/unit/datetime.c
>> @@ -0,0 +1,223 @@
>> +#include "dt.h"
>> +#include <assert.h>
>> +#include <stdint.h>
>> +#include <string.h>
>> +#include <time.h>
>> +
>> +#include "unit.h"
>> +
>> +static const char sample[] = "2012-12-24T15:30Z";
>> +
>> +#define S(s) {s, sizeof(s) - 1}
>> +struct {
>> +	const char * sz;
> 
> Extra space after '*'.
> 
> We usually name a variable that stores a zero-terminated 's' or 'str',
> never 'sz'.
> 
>> +#define DIM(a) (sizeof(a) / sizeof(a[0]))
> 
> There's 'lengthof' helper already defined for this.

Thanks! Updated.

> 
>> +
>> +/* p5-time-moment/src/moment_parse.c: parse_string_lenient() */
> 
> I don't understand the purpose of this comment.

That was referring to the original implementation of similar code in 
Christian Hansen time-moment module for perl5. Agreed that it has not 
much value right now, so removed it.

> 
>> +static int
>> +parse_datetime(const char *str, size_t len, int64_t *sp, int32_t *np,
>> +	       int32_t *op)
> 
> What's 'sp', 'np', and 'op'? Please refrain from using confusing
> abbreviations.
> 
>> +{
>> +	size_t n;
>> +	dt_t dt;
>> +	char c;
>> +	int sod = 0, nanosecond = 0, offset = 0;
>> +
>> +	n = dt_parse_iso_date(str, len, &dt);
>> +	if (!n)
>> +		return 1;
>> +	if (n == len)
>> +		goto exit;
>> +
>> +	c = str[n++];
>> +	if (!(c == 'T' || c == 't' || c == ' '))
>> +		return 1;
>> +
>> +	str += n;
>> +	len -= n;
>> +
>> +	n = dt_parse_iso_time(str, len, &sod, &nanosecond);
>> +	if (!n)
>> +		return 1;
>> +	if (n == len)
>> +		goto exit;
>> +
>> +	if (str[n] == ' ')
>> +	n++;
> 
> Bad indentation.

Fixed.

> 
>> +
>> +	str += n;
>> +	len -= n;
>> +
>> +	n = dt_parse_iso_zone_lenient(str, len, &offset);
>> +	if (!n || n != len)
>> +		return 1;
>> +
>> +exit:
>> +	*sp = ((int64_t)dt_rdn(dt) - 719163) * 86400 + sod - offset * 60;
> 
> Please define and use appropriate constants to make the code easier for
> understanding: DAYS_PER_YEAR, MINUTES_PER_HOUR, etc.

Changed to use DT_EPOCH_1970_OFFSET and SECS_PER_DAY.

> 
>> +	*np = nanosecond;
>> +	*op = offset;
>> +
>> +	return 0;
>> +}
>> +
>> +/* avoid introducing external datetime.h dependency -
>> +   just copy paste it for today
>> +*/
> 
> Bad comment formatting.

This comment gone. Thanks!

> 
>> +#define SECS_PER_DAY      86400
>> +#define DT_EPOCH_1970_OFFSET 719163
>> +
>> +struct datetime {
>> +	double secs;
>> +	int32_t nsec;
>> +	int32_t offset;
>> +};
> 
> I see that this struct as well as some functions in this module are
> defined in src/lib/core/datetime in patch 2, and then you remove struct
> datetime from this test in patch 3, but leave the helper functions
> datetime_to_tm, parse_datetime.
> 
> This makes me think that:
>   - You should squash patches 1-3.
>   - Both datetime and datetime manipulating functions (including
>     parse_datetime) should be defined only in src/lib/core/datetime
>     while test/unit should depend on src/lib/core/datetime.

Thats's quite reasonable suggestion (and was my original intention), but 
for simplicity of review I tried split these huge patches into smaller 
ones (look s not very successfully, and introduced some 
inter-dependencies between patches). I'll squash.


> 
>> +
>> +static int
>> +local_rd(const struct datetime *dt)
>> +{
>> +	return (int)((int64_t)dt->secs / SECS_PER_DAY) +
> DT_EPOCH_1970_OFFSET;
>> +}
>> +
>> +static int
>> +local_dt(const struct datetime *dt)
>> +{
>> +	return dt_from_rdn(local_rd(dt));
>> +}
>> +
>> +struct tm *
>> +datetime_to_tm(struct datetime *dt)
>> +{
>> +	static struct tm tm;
>> +
>> +	memset(&tm, 0, sizeof(tm));
>> +	dt_to_struct_tm(local_dt(dt), &tm);
>> +
>> +	int seconds_of_day = (int64_t)dt->secs % 86400;
>> +	tm.tm_hour = (seconds_of_day / 3600) % 24;
>> +	tm.tm_min = (seconds_of_day / 60) % 60;
>> +	tm.tm_sec = seconds_of_day % 60;
>> +
>> +	return &tm;
>> +}
>> +
>> +static void datetime_test(void)
>> +{
>> +	size_t index;
>> +	int64_t secs_expected;
>> +	int32_t nanosecs;
>> +	int32_t ofs;
> 
> offset?

Yup, renamed.

> 
>> +
>> +	plan(355);
>> +	parse_datetime(sample, sizeof(sample) - 1,
>> +		       &secs_expected, &nanosecs, &ofs);
>> +
>> +	for (index = 0; index < DIM(tests); index++) {
>> +		int64_t secs;
>> +		int rc = parse_datetime(tests[index].sz, tests[index].len,
>> +						&secs, &nanosecs, &ofs);
>> +		is(rc, 0, "correct parse_datetime return value for '%s'",
>> +		   tests[index].sz);
>> +		is(secs, secs_expected, "correct parse_datetime output "
>> +		   "seconds for '%s", tests[index].sz);
>> +
>> +		/* check that stringized literal produces the same date */
>> +		/* time fields */
>> +		static char buff[40];
>> +		struct datetime dt = {secs, nanosecs, ofs};
>> +		/* datetime_to_tm returns time in GMT zone */
>> +		struct tm * p_tm = datetime_to_tm(&dt);
> 
> Extra space after '*'.

Removed

> 
>> +		size_t len = strftime(buff, sizeof buff, "%F %T", p_tm);
>> +		ok(len > 0, "strftime");
>> +		int64_t parsed_secs;
>> +		int32_t parsed_nsecs, parsed_ofs;
>> +		rc = parse_datetime(buff, len, &parsed_secs,
> &parsed_nsecs, &parsed_ofs);
>> +		is(rc, 0, "correct parse_datetime return value for '%s'",
> buff);
>> +		is(secs, parsed_secs,
>> +		   "reversible seconds via strftime for '%s", buff);
>> +	}
>> +}
>> +
>> +int
>> +main(void)
>> +{
>> +	plan(1);
>> +	datetime_test();
>> +
>> +	return check_plan();
>> +}


Here are incremental changes so far:
--------------------------------------------------------
diff --git a/test/unit/datetime.c b/test/unit/datetime.c
index 18b24927e..bfc752c59 100644
--- a/test/unit/datetime.c
+++ b/test/unit/datetime.c
@@ -9,12 +9,13 @@
  #include "mp_datetime.h"
  #include "msgpuck.h"
  #include "mp_extension_types.h"
+#include "trivia/util.h"

  static const char sample[] = "2012-12-24T15:30Z";

  #define S(s) {s, sizeof(s) - 1}
  struct {
-	const char * sz;
+	const char *str;
  	size_t len;
  } tests[] = {
  	S("2012-12-24 15:30Z"),
@@ -91,17 +92,14 @@ struct {
  };
  #undef S

-#define DIM(a) (sizeof(a) / sizeof(a[0]))
-
-/* p5-time-moment/src/moment_parse.c: parse_string_lenient() */
  static int
-parse_datetime(const char *str, size_t len, int64_t *sp, int32_t *np,
-	       int32_t *op)
+parse_datetime(const char *str, size_t len, int64_t *secs_p,
+	       int32_t *nanosecs_p, int32_t *offset_p)
  {
  	size_t n;
  	dt_t dt;
  	char c;
-	int sod = 0, nanosecond = 0, offset = 0;
+	int sec_of_day = 0, nanosecond = 0, offset = 0;

  	n = dt_parse_iso_date(str, len, &dt);
  	if (!n)
@@ -116,14 +114,14 @@ parse_datetime(const char *str, size_t len, 
int64_t *sp, int32_t *np,
  	str += n;
  	len -= n;

-	n = dt_parse_iso_time(str, len, &sod, &nanosecond);
+	n = dt_parse_iso_time(str, len, &sec_of_day, &nanosecond);
  	if (!n)
  		return 1;
  	if (n == len)
  		goto exit;

  	if (str[n] == ' ')
-	n++;
+		n++;

  	str += n;
  	len -= n;
@@ -133,9 +131,10 @@ parse_datetime(const char *str, size_t len, int64_t 
*sp, int32_t *np,
  		return 1;

  exit:
-	*sp = ((int64_t)dt_rdn(dt) - 719163) * 86400 + sod - offset * 60;
-	*np = nanosecond;
-	*op = offset;
+	*secs_p = ((int64_t)dt_rdn(dt) - DT_EPOCH_1970_OFFSET) * SECS_PER_DAY +
+		  sec_of_day - offset * 60;
+	*nanosecs_p = nanosecond;
+	*offset_p = offset;

  	return 0;
  }
@@ -173,29 +172,30 @@ static void datetime_test(void)
  	size_t index;
  	int64_t secs_expected;
  	int32_t nanosecs;
-	int32_t ofs;
+	int32_t offset;

  	plan(355);
  	parse_datetime(sample, sizeof(sample) - 1,
-		       &secs_expected, &nanosecs, &ofs);
+		       &secs_expected, &nanosecs, &offset);

-	for (index = 0; index < DIM(tests); index++) {
+	for (index = 0; index < lengthof(tests); index++) {
  		int64_t secs;
-		int rc = parse_datetime(tests[index].sz, tests[index].len,
-					&secs, &nanosecs, &ofs);
+		int rc = parse_datetime(tests[index].str, tests[index].len,
+					&secs, &nanosecs, &offset);
  		is(rc, 0, "correct parse_datetime return value for '%s'",
-		   tests[index].sz);
+		   tests[index].str);
  		is(secs, secs_expected, "correct parse_datetime output "
-		   "seconds for '%s", tests[index].sz);
+					"seconds for '%s",
+		   tests[index].str);

  		/*
  		 * check that stringized literal produces the same date
  		 * time fields
  		 */
  		static char buff[40];
-		struct datetime dt = {secs, nanosecs, ofs};
+		struct datetime dt = {secs, nanosecs, offset};
  		/* datetime_to_tm returns time in GMT zone */
-		struct tm * p_tm = datetime_to_tm(&dt);
+		struct tm *p_tm = datetime_to_tm(&dt);
  		size_t len = strftime(buff, sizeof buff, "%F %T", p_tm);
  		ok(len > 0, "strftime");
  		int64_t parsed_secs;
@@ -237,7 +237,7 @@ tostring_datetime_test(void)
  	size_t index;

  	plan(15);
-	for (index = 0; index < DIM(tests); index++) {
+	for (index = 0; index < lengthof(tests); index++) {
  		struct datetime date = {
  			tests[index].secs,
  			tests[index].nsec,
@@ -282,7 +282,7 @@ mp_datetime_test()
  	size_t index;

  	plan(85);
-	for (index = 0; index < DIM(tests); index++) {
+	for (index = 0; index < lengthof(tests); index++) {
  		struct datetime date = {
  			tests[index].secs,
  			tests[index].nsec,
--------------------------------------------------------

Thanks,
Timur


More information about the Tarantool-patches mailing list