[tarantool-patches] Re: [PATCH v3 2/4] lib/core: introduce decimal type to tarantool

Tue Jun 11 19:14:05 MSK 2019

Hi! Thank you for review.
I addressed your comments and sent v4.

> 5 июня 2019 г., в 15:19, Vladimir Davydov <vdavydov.dev at gmail.com> написал(а):
> 
> On Tue, Jun 04, 2019 at 07:04:17PM +0300, Serge Petrenko wrote:
>> Add fixed-point decimal type to tarantool core.
>> Adapt decNumber floating-point decimal library for the purpose, write a
>> small wrapper and add unit tests.
>> 
>> A new decimal type is an alias for decNumber numbers from the decNumber
>> library.
>> Arithmetic operations (+, -, *, /) and some mathematic functions
>> (ln, log10, exp, pow, sqrt) are available.
>> 
>> We introduce a single context for all the arithmetic operations
>> on decimals, which enforces both number precision and scale to be
>> in range [0, 38].
>> 
>> Every mathematic function has precision as one of its parameters, and
>> the results are rounded to fit the desired precision.
>> 
>> Part of #692
> 
> Are NaN and inf supported? I assume not. Please mention it in
> the commit log.
Done.

> 
>> diff --git a/src/lib/core/decimal.c b/src/lib/core/decimal.c
>> new file mode 100644
>> index 000000000..12250da95
>> --- /dev/null
>> +++ b/src/lib/core/decimal.c
>> @@ -0,0 +1,293 @@
> 
>> +#include "decimal.h"
>> +#include "third_party/decNumber/decContext.h"
>> +#include <stdlib.h>
> 
> Do you really need stdlib.h? Isn't stddef.h enough?
> 
> Also, you use intN_t hence you should include stdint.h.
> I'm not sure you really need to use fixed size ints in
> this module though.
I need fixed size ints since decNumber library uses them
in conversions: decNumberToInt32 (decNumberToInt64)

> 
>> +#include <assert.h>
>> +
>> +#define MAX(a, b) ((a) > (b) ? (a) : (b))
>> +#define MIN(a, b) ((a) > (b) ? (b) : (a))
> 
> Why not include trivial/util.h instead?
Indeed, done.

> 
>> +
>> +/** A single context for all the arithmetic operations. */
>> +static decContext decimal_context = {
>> +	/* Maximum precision during operations. */
>> +	TARANTOOL_MAX_DECIMAL_DIGITS,
>> +	/*
>> +	 * Maximum decimal lagarithm of the number.
>> +	 * Allows for precision = TARANTOOL_MAX_DECIMAL_DIGITS
>> +	 */
>> +	TARANTOOL_MAX_DECIMAL_DIGITS - 1,
>> +	/*
>> +	 * Minimal adjusted exponent. The smallest absolute value will be
>> +	 * exp((1 - TARANTOOL_MAX_DECIMAL_DIGITS) - 1) =
>> +	 * exp(-TARANTOOL_MAX_DECIMAL_DIGITS) allowing for scale =
>> +	 * TARANTOOL_MAX_DECIMAL_DIGITS
>> +	 */
>> +	-1,
>> +	/* Rounding mode: .5 rounds away from 0. */
>> +	DEC_ROUND_HALF_UP,
>> +	/* Turn off signalling for failed operations. */
>> +	0,
>> +	/* Status holding occured events. Initially empty. */
>> +	0,
>> +	/* Turn off exponent clamping. */
>> +	0
>> +};
>> +
>> +/**
>> + * Return operation status and clear it for future checks.
>> + *
>> + * @return 0 if ok, bitwise or of decNumber errors, if any.
>> + */
>> +static inline uint32_t
>> +decimal_get_op_status(decContext *context)
>> +{
>> +	uint32_t status = decContextGetStatus(context);
>> +	decContextZeroStatus(&decimal_context);
> 
> Should be 'context', not decimal_context?

You’re right, fixed.

> 
>> +	/*
>> +	 * Clear warnings. Every value less than 0.1 is
>> +	 * subnormal, DEC_Inexact and DEC_Rounded result
>> +	 * from rounding. DEC_Inexact with DEC_Subnormal
>> +	 * together result in DEC_Underflow. DEC_Clamped
>> +	 * happens after underflow if rounding to zero.
>> +	 */
>> +	return status & ~(uint32_t)(DEC_Inexact | DEC_Rounded | DEC_Underflow |
>> +				    DEC_Subnormal | DEC_Clamped);
> 
> I'd fold this function in decimal_finalize as it's short and you don't
> use it anywhere else.
> 
Done.

>> +}
>> +
>> +/**
>> + * A finalizer for all the operations.
>> + * Check the operation context status and empty it.
>> + *
>> + * @return NULL if finalization failed.
>> + *         result pointer otherwise.
>> + */
>> +static inline decimal *
>> +decimal_finalize(decimal *res, decContext *context)
>> +{
>> +	uint32_t status = decimal_get_op_status(context);
>> +	if (status || ! decNumberIsFinite(res)) {
> 
> Do we really need decNumberIsFinite here? According to the context,
> you don't allow inf.

Fixed.

> 
>> +		return NULL;
>> +	}
>> +	return res;
>> +}
>> +
>> +uint8_t decimal_precision(const decimal *dec) {
>> +	return dec->exponent <= 0 ? MAX(dec->digits, -dec->exponent) :
>> +				    dec->digits + dec->exponent;
>> +}
>> +
>> +uint8_t decimal_scale(const decimal *dec) {
>> +	return dec->exponent < 0 ? -dec->exponent : 0;
>> +}
>> +
>> +decimal *
>> +decimal_zero(decimal *dec)
>> +{
>> +	decNumberZero(dec);
>> +	return dec;
>> +}
>> +
>> +decimal *
>> +decimal_from_string(decimal *dec, const char *str)
>> +{
>> +	decNumberFromString(dec, str, &decimal_context);
>> +	return decimal_finalize(dec, &decimal_context);
>> +}
> 
> It's unsafe to use a global context in all these functions as they might
> be used from different threads while the context is used for propagating
> errors. I think you should copy it to stack.

Added `__thread` specifier.

> 
>> +int
>> +decimal_compare(const decimal *lhs, const decimal *rhs)
>> +{
>> +	decNumber res;
>> +	decNumberCompare(&res, lhs, rhs, &decimal_context);
>> +	return decNumberToInt32(&res, &decimal_context);
>> +}
> 
> I'd add an assertion here checking the context status if this function
> aren't supposed to fail. The same is fair to all other functions that
> must not fail.

Ok, done.

> 
>> diff --git a/src/lib/core/decimal.h b/src/lib/core/decimal.h
>> new file mode 100644
>> index 000000000..f2812f500
>> --- /dev/null
>> +++ b/src/lib/core/decimal.h
>> @@ -0,0 +1,159 @@
> 
>> +#define TARANTOOL_MAX_DECIMAL_DIGITS 38
> 
> This macro definition must be accompanied by a comment. Also, I don't
> see any point in adding TARANTOOL_ prefix - you don't use the prefix
> with the object type so using it with one constant looks inconsistent.
> Let's rename it to DECIMAL_MAX_DIGITS?

Done.

> 
>> +#define DECNUMDIGITS TARANTOOL_MAX_DECIMAL_DIGITS
> 
>> +#define DECSTRING_NO_EXPONENT
> 
> AFAICS this macro is defined by default so there's no need to define it
> here.
> 
>> +#include "third_party/decNumber/decNumber.h"
>> +
>> +typedef decNumber decimal;
> 
> We typically add _t suffix to opaque types.

Done.

> 
>> +
>> +/**
>> + * @return decimal precision,
>> + * i.e. the amount of decimal digits in
>> + * its representation.
>> + */
>> +uint8_t
>> +decimal_precision(const decimal *dec);
> 
> We prefer using int unless we really need fixed size ints
> for some reason.

decimal_precision() and decimal_scale() return int from now on.
I left fixed size ints in some other places where they seem appropriate.

> 
>> +
>> +/**
>> + * @return decimal scale,
>> + * i.e. the number of decimal digits after
>> + * the decimal separator.
>> + */
>> +uint8_t
>> +decimal_scale(const decimal *dec);
>> +
>> +/**
>> + * Initialize a zero decimal number.
>> + */
>> +decimal *
>> +decimal_zero(decimal *dec);
>> +
>> +/**
>> + * Initialize a decimal with a value from the string.
>> + *
>> + * If the number is less, than 10^TARANTOOL_MAX_DECIMAL_DIGITS,
>> + * but has excess digits in fractional part, it will be rounded.
>> + *
>> + * @return NULL if string is invalid or
>> + * the number is too big (>= 10^TARANTOOL_MAX_DECIMAL_DIGITS)
>> + */
>> +decimal *
>> +decimal_from_string(decimal *dec, const char *str);
> 
> AFAIK in SQL one can set column type to DECIMAL(precision,scale), i.e.
> specify the precision/scale explicitly. An attempt to insert a value
> that doesn't fit results in an error. That said, we might need to allow
> specifying precision/scale here. Not sure about this. Please consult
> SQL guys.

I guess this check can be done after a decimal is constructed.
Precision and scale meet the ones of the string for now. 

> 
>> +
>> +/**
>> + * Initialize a decimal with an integer value.
>> + *
>> + * @return NULL if precicion is insufficient to hold
>> + * the value or precision/scale are out of bounds.
>> +*/
>> +decimal *
>> +decimal_from_int(decimal *dec, int32_t num);
>> +
>> +/** @copydoc decimal_from_int */
>> +decimal *
>> +decimal_from_uint(decimal *dec, uint32_t num);
> 
> Why int32_t? Why not int64_t? Anyway, why do you need this operations
> at all? Are you going to try to compare decimals with numbers when
> the indexed field is scalar? That is, are you going to make decimal
> a part of MP_CLASS_NUMBER? If so, you need _from_float() helpers, too.
> Or are planning to arrange decimals before any simple number?

Added to/from_double and from int64_4 helpers. Yep, we’re going to compare decimals
with doubles, and ints.
> 
>> +
>> +/**
>> + * Write the decimal to a string.
>> + * A string has to be at least dec->digits + 3 bytes in size.
>> + */
>> +char *
>> +decimal_to_string(const decimal *dec, char *buf);
> 
> This is kinda inconsistent with others formatting routines, e.g.
> take a look at vclock_to_string:
> 
> - there should be decimal_snprint which is an snprintf like function
>   that dumps decimal to the provided buffer.
> - there should also be decimal_to_string wrapper which formats a
>   decimal on a static buffer (tt_static_buf).

Fixed according to a verbal discussion.

> 
>> +
>> +/**
>> + * Cast decimal to an integer value. The number will be rounded
>> + * if it has a fractional part.
>> + */
>> +int32_t
>> +decimal_to_int(const decimal *dec);
>> +
>> +/** @copydoc decimal_to_int */
>> +uint32_t
>> +decimal_to_uint(const decimal *dec);
> 
> These functions can easily fail, but the comment says nothing about it.
> Looks like there's no way to figure that out. Anyway, why do we need
> these functions at all?

Removed the to_uint and to_int helpers.

> 
>> +
>> +/**
>> + * Compare 2 decimal values.
>> + * @return -1, lhs < rhs,
>> + *	    0, lhs = rhs,
>> + *	    1, lhs > rhs
>> + */
>> +int
>> +decimal_compare(const decimal *lhs, const decimal *rhs);
>> +
>> +/**
>> + * res is set to the absolute value of dec
>> + * decimal_abs(&a, &a) is allowed.
>> + */
>> +decimal *
>> +decimal_abs(decimal *res, const decimal *dec);
> 
> What about unary minus?

Implemented.

> 
>> +
>> +/*
>> + * Arithmetic ops: add, subtract, multiply and divide.
>> + * Return result pointer on success, NULL on an error (overflow).
>> + */
>> +
>> +decimal *
>> +decimal_add(decimal *res, const decimal *lhs, const decimal *rhs);
>> +
>> +decimal *
>> +decimal_sub(decimal *res, const decimal *lhs, const decimal *rhs);
>> +
>> +decimal *
>> +decimal_mul(decimal *res, const decimal *lhs, const decimal *rhs);
>> +
>> +decimal *
>> +decimal_div(decimal *res, const decimal *lhs, const decimal *rhs);
>> +
>> +/*
>> + * log10, ln, pow, exp, sqrt.
>> + * Calculate the appropriate function and then round the result
>> + * to the desired precision.
>> + * Return result pointer on success, NULL on an error (overflow).
>> + */
>> +decimal *
>> +decimal_log10(decimal *res, const decimal *lhs, uint32_t precision);
>> +
>> +decimal *
>> +decimal_ln(decimal *res, const decimal *lhs, uint32_t precision);
> 
> I'd expect to see decimal_log. Yeah, I know that
> 
>  log(a, b) = log(c, b) / log(c, a)
> 
> but a helper function would come in handy.

The helper will perform exactly the same division.
This can be done in a lua wrapper, I guess.

> 
>> +
>> +decimal *
>> +decimal_pow(decimal *res, const decimal *lhs, const decimal *rhs, uint32_t precision);
>> +
>> +decimal *
>> +decimal_exp(decimal *res, const decimal *lhs, uint32_t precision);
>> +
>> +decimal *
>> +decimal_sqrt(decimal *res, const decimal *lhs, uint32_t precision);
> 
> What's the point in specifying precision for these functions? Wouldn't
> having decimal_round be better? Why doesn't decimal_div have this option
> then?

Added decimal_round(), removed precision from math functions.

> 
>> diff --git a/test/unit/decimal.c b/test/unit/decimal.c
>> new file mode 100644
>> index 000000000..adfbead91
>> --- /dev/null
>> +++ b/test/unit/decimal.c
>> @@ -0,0 +1,168 @@
> 
>> +	decimal_from_int(&s, 4);
>> +	ret = decimal_mul(&s, &s, &d);
>> +	isnt(ret, NULL, "Simple multiplication");
> 
> Please also test all possible overflow/underflow cases.

Reworked the test.

--
Serge Petrenko
sergepetrenko at tarantool.org