[tarantool-patches] Re: [PATCH v1 1/1] sql: fix perf degradation on name normalization
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Thu Apr 4 20:31:27 MSK 2019
>> 5. Why do you declare 'rc' and on the next line set it inside 'if'?
>> It makes the code neither shorter nor more readable. Just write
>> rc = sql_normalize...(); and check its result on the next line.
> I like this more. Ok. Reworked here and everywhere.
I could understand that, if not for the fact that throughout the patch
you used both styles, sometimes in one function on neighbouring lines.
>> 11. The third argument of OutOfMemory is the name of the function
>> that failed to allocate memory. Here you call region_alloc, not region.
> Vova prefers "region" in such cases, if I am not mistaken. I don't care. "region_alloc"
I do not remember Vova ever telling me something like this.
But I clearly remember that Kostja said to use the function name,
and we do that throughout the code.
See 3 comments below.
>
> Because sql_normalize_name used to be called twice (to estimate
> the size of the name buffer and to process the data), querying the
> UCaseMap object each time, SQL performance fell by 15%.
>
> This patch should eliminate some of the negative effects of using
> ICU for name normalization.
>
> Thanks @avtikhon for a benchmark
>
> Follow up e7558062d3559e6bcc18f91eacb88269428321dc
> ---
> src/box/sql/expr.c | 29 ++++++++--------
> src/box/sql/parse.y | 21 ++++++------
> src/box/sql/sqlInt.h | 9 ++---
> src/box/sql/trigger.c | 22 ++++++++-----
> src/box/sql/util.c | 71 +++++++++++++++++++++-------------------
> src/lib/coll/coll.c | 8 ++++-
> src/lib/coll/coll.h | 3 ++
> src/lua/utf8.c | 11 +------
> test/sql/errinj.result | 9 ++++-
> test/sql/errinj.test.lua | 2 ++
> 10 files changed, 98 insertions(+), 87 deletions(-)
>
> diff --git a/src/box/sql/util.c b/src/box/sql/util.c
> index a13efa682..2f3c17c9a 100644
> --- a/src/box/sql/util.c
> +++ b/src/box/sql/util.c
> @@ -259,66 +260,68 @@ int
> char *
> sql_normalized_name_region_new(struct region *r, const char *name, int len)
> {
> - int size = sql_normalize_name(NULL, 0, name, len);
> - if (size < 0)
> - return NULL;
> - char *res = (char *) region_alloc(r, size);
> - if (res == NULL) {
> + int size = len + 1;
> + ERROR_INJECT(ERRINJ_SQL_NAME_NORMALIZATION, {
> diag_set(OutOfMemory, size, "region_alloc", "res");
> return NULL;
> - }
> - if (sql_normalize_name(res, size, name, len) < 0)
> + });
> + size_t region_svp = region_used(r);
> + char *res = region_alloc(r, size);
> + if (res == NULL)
> + return NULL;
1. Missed diag_set.
> + int rc = sql_normalize_name(res, size, name, len);
> + if (rc <= size)
> + return res;
> +
> + size = rc;
> + region_truncate(r, region_svp);
> + res = region_alloc(r, size);
> + if (res == NULL)
> return NULL;
2. Again, missed diag_set.
> + if (sql_normalize_name(res, size, name, len) > size)
> + unreachable();
> return res;
> }
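
A minimal, self-contained sketch of the pattern that comments 1 and 2
ask for: guess the buffer size optimistically, retry once with the exact
size, and set a diagnostic on every failed allocation, passing the name
of the allocating function. Everything here is a hypothetical stand-in,
not Tarantool code: malloc() replaces region_alloc(), set_oom_error()
replaces diag_set(OutOfMemory, ...), and toy_normalize() replaces
sql_normalize_name() (it uppercases ASCII and expands '#' into "SS" only
to make the output longer than the input).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the diagnostics area. */
static char last_error[128];

/* Hypothetical stand-in for diag_set(OutOfMemory, size, func, var). */
static void
set_oom_error(size_t size, const char *func, const char *var)
{
	snprintf(last_error, sizeof(last_error),
		 "Failed to allocate %zu bytes in %s for %s",
		 size, func, var);
}

/*
 * Toy normalizer (an assumption, not the real one). Returns the
 * number of bytes needed, including the terminating zero, and
 * writes to dst only when the result fits into dst_size.
 */
static int
toy_normalize(char *dst, int dst_size, const char *src, int len)
{
	int need = 1;
	for (int i = 0; i < len; i++)
		need += src[i] == '#' ? 2 : 1;
	if (need > dst_size)
		return need;
	char *p = dst;
	for (int i = 0; i < len; i++) {
		if (src[i] == '#') {
			*p++ = 'S';
			*p++ = 'S';
		} else {
			*p++ = (char) toupper((unsigned char) src[i]);
		}
	}
	*p = '\0';
	return need;
}

static char *
normalized_name_new(const char *name, int len)
{
	/* Optimistic guess: normalization keeps the length. */
	int size = len + 1;
	char *res = malloc((size_t) size);
	if (res == NULL) {
		/* Third argument names the function that failed. */
		set_oom_error((size_t) size, "malloc", "res");
		return NULL;
	}
	int rc = toy_normalize(res, size, name, len);
	if (rc <= size)
		return res;
	/* The guess was too small: retry once with the exact size. */
	free(res);
	size = rc;
	res = malloc((size_t) size);
	if (res == NULL) {
		set_oom_error((size_t) size, "malloc", "res");
		return NULL;
	}
	rc = toy_normalize(res, size, name, len);
	assert(rc <= size);
	(void) rc;
	return res;
}
```

With these toy rules, "abc" fits on the first attempt, while "stra#e"
needs the second allocation because its normalized form is one byte
longer than the input.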
>
> diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
> index b83f0fdc7..21f2489d4 100644
> --- a/src/lib/coll/coll.c
> +++ b/src/lib/coll/coll.c
> @@ -34,8 +34,11 @@
> #include "diag.h"
> #include "assoc.h"
> #include <unicode/ucol.h>
> +#include <unicode/ucasemap.h>
> #include <trivia/config.h>
>
> +struct UCaseMap *root_map = NULL;
3. That name was ok while it was local to utf8.c, but now
it is global, and 'root_map' in the whole scope of
tarantool looks ambiguous. What is it? MessagePack map?
Lua table map? RB-tree map? What is 'root'? I propose
icu_ucase_default_map