[PATCH v3 2/4] lib/core: introduce decimal type to tarantool

Vladimir Davydov vdavydov.dev at gmail.com
Wed Jun 5 15:19:29 MSK 2019


On Tue, Jun 04, 2019 at 07:04:17PM +0300, Serge Petrenko wrote:
> Add fixed-point decimal type to tarantool core.
> Adapt decNumber floating-point decimal library for the purpose, write a
> small wrapper and add unit tests.
> 
> A new decimal type is an alias for decNumber numbers from the decNumber
> library.
> Arithmetic operations (+, -, *, /) and some mathematic functions
> (ln, log10, exp, pow, sqrt) are available.
> 
> We introduce a single context for all the arithmetic operations
> on decimals, which enforces both number precision and scale to be
> in range [0, 38].
> 
> Every mathematic function has precision as one of its parameters, and
> the results are rounded to fit the desired precision.
> 
> Part of #692

Are NaN and inf supported? I assume not. Please mention it in
the commit log.

> diff --git a/src/lib/core/decimal.c b/src/lib/core/decimal.c
> new file mode 100644
> index 000000000..12250da95
> --- /dev/null
> +++ b/src/lib/core/decimal.c
> @@ -0,0 +1,293 @@

> +#include "decimal.h"
> +#include "third_party/decNumber/decContext.h"
> +#include <stdlib.h>

Do you really need stdlib.h? Isn't stddef.h enough?

Also, you use intN_t hence you should include stdint.h.
I'm not sure you really need to use fixed size ints in
this module though.

> +#include <assert.h>
> +
> +#define MAX(a, b) ((a) > (b) ? (a) : (b))
> +#define MIN(a, b) ((a) > (b) ? (b) : (a))

Why not include trivial/util.h instead?

> +
> +/** A single context for all the arithmetic operations. */
> +static decContext decimal_context = {
> +	/* Maximum precision during operations. */
> +	TARANTOOL_MAX_DECIMAL_DIGITS,
> +	/*
> +	 * Maximum decimal lagarithm of the number.
> +	 * Allows for precision = TARANTOOL_MAX_DECIMAL_DIGITS
> +	 */
> +	TARANTOOL_MAX_DECIMAL_DIGITS - 1,
> +	/*
> +	 * Minimal adjusted exponent. The smallest absolute value will be
> +	 * exp((1 - TARANTOOL_MAX_DECIMAL_DIGITS) - 1) =
> +	 * exp(-TARANTOOL_MAX_DECIMAL_DIGITS) allowing for scale =
> +	 * TARANTOOL_MAX_DECIMAL_DIGITS
> +	 */
> +	-1,
> +	/* Rounding mode: .5 rounds away from 0. */
> +	DEC_ROUND_HALF_UP,
> +	/* Turn off signalling for failed operations. */
> +	0,
> +	/* Status holding occured events. Initially empty. */
> +	0,
> +	/* Turn off exponent clamping. */
> +	0
> +};
> +
> +/**
> + * Return operation status and clear it for future checks.
> + *
> + * @return 0 if ok, bitwise or of decNumber errors, if any.
> + */
> +static inline uint32_t
> +decimal_get_op_status(decContext *context)
> +{
> +	uint32_t status = decContextGetStatus(context);
> +	decContextZeroStatus(&decimal_context);

Should be 'context', not decimal_context?

> +	/*
> +	 * Clear warnings. Every value less than 0.1 is
> +	 * subnormal, DEC_Inexact and DEC_Rounded result
> +	 * from rounding. DEC_Inexact with DEC_Subnormal
> +	 * together result in DEC_Underflow. DEC_Clamped
> +	 * happens after underflow if rounding to zero.
> +	 */
> +	return status & ~(uint32_t)(DEC_Inexact | DEC_Rounded | DEC_Underflow |
> +				    DEC_Subnormal | DEC_Clamped);

I'd fold this function in decimal_finalize as it's short and you don't
use it anywhere else.

> +}
> +
> +/**
> + * A finalizer for all the operations.
> + * Check the operation context status and empty it.
> + *
> + * @return NULL if finalization failed.
> + *         result pointer otherwise.
> + */
> +static inline decimal *
> +decimal_finalize(decimal *res, decContext *context)
> +{
> +	uint32_t status = decimal_get_op_status(context);
> +	if (status || ! decNumberIsFinite(res)) {

Do we really need decNumberIsFinite here? According to the context,
you don't allow inf.

> +		return NULL;
> +	}
> +	return res;
> +}
> +
> +uint8_t decimal_precision(const decimal *dec) {
> +	return dec->exponent <= 0 ? MAX(dec->digits, -dec->exponent) :
> +				    dec->digits + dec->exponent;
> +}
> +
> +uint8_t decimal_scale(const decimal *dec) {
> +	return dec->exponent < 0 ? -dec->exponent : 0;
> +}
> +
> +decimal *
> +decimal_zero(decimal *dec)
> +{
> +	decNumberZero(dec);
> +	return dec;
> +}
> +
> +decimal *
> +decimal_from_string(decimal *dec, const char *str)
> +{
> +	decNumberFromString(dec, str, &decimal_context);
> +	return decimal_finalize(dec, &decimal_context);
> +}

It's unsafe to use a global context in all these functions as they might
be used from different threads while the context is used for propagating
errors. I think you should copy it to stack.

> +int
> +decimal_compare(const decimal *lhs, const decimal *rhs)
> +{
> +	decNumber res;
> +	decNumberCompare(&res, lhs, rhs, &decimal_context);
> +	return decNumberToInt32(&res, &decimal_context);
> +}

I'd add an assertion here checking the context status if this function
aren't supposed to fail. The same is fair to all other functions that
must not fail.

> diff --git a/src/lib/core/decimal.h b/src/lib/core/decimal.h
> new file mode 100644
> index 000000000..f2812f500
> --- /dev/null
> +++ b/src/lib/core/decimal.h
> @@ -0,0 +1,159 @@

> +#define TARANTOOL_MAX_DECIMAL_DIGITS 38

This macro definition must be accompanied by a comment. Also, I don't
see any point in adding TARANTOOL_ prefix - you don't use the prefix
with the object type so using it with one constant looks inconsistent.
Let's rename it to DECIMAL_MAX_DIGITS?

> +#define DECNUMDIGITS TARANTOOL_MAX_DECIMAL_DIGITS

> +#define DECSTRING_NO_EXPONENT

AFAICS this macro is defined by default so there's no need to define it
here.

> +#include "third_party/decNumber/decNumber.h"
> +
> +typedef decNumber decimal;

We typically add _t suffix to opaque types.

> +
> +/**
> + * @return decimal precision,
> + * i.e. the amount of decimal digits in
> + * its representation.
> + */
> +uint8_t
> +decimal_precision(const decimal *dec);

We prefer using int unless we really need fixed size ints
for some reason.

> +
> +/**
> + * @return decimal scale,
> + * i.e. the number of decimal digits after
> + * the decimal separator.
> + */
> +uint8_t
> +decimal_scale(const decimal *dec);
> +
> +/**
> + * Initialize a zero decimal number.
> + */
> +decimal *
> +decimal_zero(decimal *dec);
> +
> +/**
> + * Initialize a decimal with a value from the string.
> + *
> + * If the number is less, than 10^TARANTOOL_MAX_DECIMAL_DIGITS,
> + * but has excess digits in fractional part, it will be rounded.
> + *
> + * @return NULL if string is invalid or
> + * the number is too big (>= 10^TARANTOOL_MAX_DECIMAL_DIGITS)
> + */
> +decimal *
> +decimal_from_string(decimal *dec, const char *str);

AFAIK in SQL one can set column type to DECIMAL(precision,scale), i.e.
specify the precision/scale explicitly. An attempt to insert a value
that doesn't fit results in an error. That said, we might need to allow
specifying precision/scale here. Not sure about this. Please consult
SQL guys.

> +
> +/**
> + * Initialize a decimal with an integer value.
> + *
> + * @return NULL if precicion is insufficient to hold
> + * the value or precision/scale are out of bounds.
> +*/
> +decimal *
> +decimal_from_int(decimal *dec, int32_t num);
> +
> +/** @copydoc decimal_from_int */
> +decimal *
> +decimal_from_uint(decimal *dec, uint32_t num);

Why int32_t? Why not int64_t? Anyway, why do you need this operations
at all? Are you going to try to compare decimals with numbers when
the indexed field is scalar? That is, are you going to make decimal
a part of MP_CLASS_NUMBER? If so, you need _from_float() helpers, too.
Or are planning to arrange decimals before any simple number?

> +
> +/**
> + * Write the decimal to a string.
> + * A string has to be at least dec->digits + 3 bytes in size.
> + */
> +char *
> +decimal_to_string(const decimal *dec, char *buf);

This is kinda inconsistent with others formatting routines, e.g.
take a look at vclock_to_string:

 - there should be decimal_snprint which is an snprintf like function
   that dumps decimal to the provided buffer.
 - there should also be decimal_to_string wrapper which formats a
   decimal on a static buffer (tt_static_buf).

> +
> +/**
> + * Cast decimal to an integer value. The number will be rounded
> + * if it has a fractional part.
> + */
> +int32_t
> +decimal_to_int(const decimal *dec);
> +
> +/** @copydoc decimal_to_int */
> +uint32_t
> +decimal_to_uint(const decimal *dec);

These functions can easily fail, but the comment says nothing about it.
Looks like there's no way to figure that out. Anyway, why do we need
these functions at all?

> +
> +/**
> + * Compare 2 decimal values.
> + * @return -1, lhs < rhs,
> + *	    0, lhs = rhs,
> + *	    1, lhs > rhs
> + */
> +int
> +decimal_compare(const decimal *lhs, const decimal *rhs);
> +
> +/**
> + * res is set to the absolute value of dec
> + * decimal_abs(&a, &a) is allowed.
> + */
> +decimal *
> +decimal_abs(decimal *res, const decimal *dec);

What about unary minus?

> +
> +/*
> + * Arithmetic ops: add, subtract, multiply and divide.
> + * Return result pointer on success, NULL on an error (overflow).
> + */
> +
> +decimal *
> +decimal_add(decimal *res, const decimal *lhs, const decimal *rhs);
> +
> +decimal *
> +decimal_sub(decimal *res, const decimal *lhs, const decimal *rhs);
> +
> +decimal *
> +decimal_mul(decimal *res, const decimal *lhs, const decimal *rhs);
> +
> +decimal *
> +decimal_div(decimal *res, const decimal *lhs, const decimal *rhs);
> +
> +/*
> + * log10, ln, pow, exp, sqrt.
> + * Calculate the appropriate function and then round the result
> + * to the desired precision.
> + * Return result pointer on success, NULL on an error (overflow).
> + */
> +decimal *
> +decimal_log10(decimal *res, const decimal *lhs, uint32_t precision);
> +
> +decimal *
> +decimal_ln(decimal *res, const decimal *lhs, uint32_t precision);

I'd expect to see decimal_log. Yeah, I know that

  log(a, b) = log(c, b) / log(c, a)

but a helper function would come in handy.

> +
> +decimal *
> +decimal_pow(decimal *res, const decimal *lhs, const decimal *rhs, uint32_t precision);
> +
> +decimal *
> +decimal_exp(decimal *res, const decimal *lhs, uint32_t precision);
> +
> +decimal *
> +decimal_sqrt(decimal *res, const decimal *lhs, uint32_t precision);

What's the point in specifying precision for these functions? Wouldn't
having decimal_round be better? Why doesn't decimal_div have this option
then?

> diff --git a/test/unit/decimal.c b/test/unit/decimal.c
> new file mode 100644
> index 000000000..adfbead91
> --- /dev/null
> +++ b/test/unit/decimal.c
> @@ -0,0 +1,168 @@

> +	decimal_from_int(&s, 4);
> +	ret = decimal_mul(&s, &s, &d);
> +	isnt(ret, NULL, "Simple multiplication");

Please also test all possible overflow/underflow cases.



More information about the Tarantool-patches mailing list