[Tarantool-patches] [PATCH v1 2/8] sql: refactor CHAR_LENGTH() function
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Sat Oct 9 00:56:59 MSK 2021
Thanks for the patch!
On 01.10.2021 18:29, imeevma at tarantool.org wrote:
> Part of #4145
> ---
> src/box/sql/func.c | 38 +++++++++++++++++++++++++++++++++++---
> 1 file changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 54b03f359..2e53b32d8 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -263,6 +263,38 @@ func_abs_double(struct sql_context *ctx, int argc, struct Mem *argv)
> mem_set_double(ctx->pOut, arg->u.r < 0 ? -arg->u.r : arg->u.r);
> }
>
> +/** Implementation of the CHAR_LENGTH() function. */
> +static inline uint8_t
> +utf8_len_char(char c)
> +{
> + uint8_t u = (uint8_t)c;
> + return 1 + (u >= 0xc2) + (u >= 0xe0) + (u >= 0xf0);
It is not really that simple: this looks only at the lead byte and
never checks the bytes that follow it. Consider either using the old
lengthFunc() and the other sqlite utf8 helpers, or an approach
similar to utf8_len() in utf8.c. It uses the ICU macro U8_NEXT()
and handles special values like U_SENTINEL.
Otherwise you are adding what is already a third set of functions
for working with utf8.
I would even prefer to refactor lengthFunc() to stop using the sqlite
legacy and drop sqlite utf8 entirely, but I suspect that is not so
trivial and should be done later.
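To make the concern concrete, here is a minimal self-contained sketch. It is not the code from utf8.c and does not use ICU. The helper names naive_char_count() and checked_char_count() are hypothetical, and counting every invalid byte as one character is an assumption made for the sketch, not necessarily what utf8_len() does. It only shows how a lead-byte-only length can swallow bytes that do not belong to the sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lead-byte-only length, copied from the patch for comparison. */
static inline uint8_t
utf8_len_char(char c)
{
	uint8_t u = (uint8_t)c;
	return 1 + (u >= 0xc2) + (u >= 0xe0) + (u >= 0xf0);
}

/* Hypothetical helper: count characters by trusting the lead byte. */
static size_t
naive_char_count(const char *s, size_t len)
{
	assert(s != NULL);
	size_t count = 0;
	for (size_t i = 0; i < len; count++)
		i += utf8_len_char(s[i]);
	return count;
}

/*
 * Hypothetical helper: the same count, but a multi-byte sequence is
 * consumed only if all of its continuation bytes (10xxxxxx) are
 * present. Otherwise only the offending byte is consumed and counted
 * as one character (an assumption of this sketch). Overlong forms and
 * surrogates are deliberately not checked here; U8_NEXT() does more.
 */
static size_t
checked_char_count(const char *s, size_t len)
{
	assert(s != NULL);
	size_t count = 0;
	size_t i = 0;
	while (i < len) {
		uint8_t u = (uint8_t)s[i];
		size_t need = 0;
		if (u >= 0xc2 && u <= 0xdf)
			need = 1;
		else if (u >= 0xe0 && u <= 0xef)
			need = 2;
		else if (u >= 0xf0 && u <= 0xf4)
			need = 3;
		size_t k = 0;
		while (k < need && i + 1 + k < len &&
		       ((uint8_t)s[i + 1 + k] & 0xc0) == 0x80)
			k++;
		i += (k == need) ? need + 1 : 1;
		count++;
	}
	return count;
}
```

For the four bytes 0xe2 'a' 'b' 'c', naive_char_count() returns 2 because the lead byte swallows 'a' and 'b', while checked_char_count() returns 4. A stray 0xff byte is likewise treated as a 4-byte lead by the lead-byte-only version.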