Tarantool development patches archive
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>,
	tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH v1 1/1] sql: fix perf degradation on name normalization
Date: Thu, 4 Apr 2019 20:31:27 +0300
Message-ID: <bf8e6724-9e6f-68af-1db3-2a23390c133a@tarantool.org>
In-Reply-To: <7cdb5d69-8ace-3899-872e-c97c477866e6@tarantool.org>


>> 5. Why do you declare 'rc' and on the next line set it inside 'if'?
>> It makes the code neither shorter nor more readable. Just write
>> rc = sql_normalize...(); and check its result on the next line.
> I like this more. Ok. Reworked here and everywhere.

I could understand that, if not for the fact that throughout the
patch you used both styles, sometimes in one function on
neighbouring lines.
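
To make comment 5 concrete, here is a minimal sketch of the two styles side by side. The helper name_len and both wrapper functions are hypothetical and exist only for illustration; they are not from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the name length, or -1 for a NULL name. */
static int
name_len(const char *name)
{
	return name == NULL ? -1 : (int) strlen(name);
}

/* Style the review objects to: declare 'rc', then assign it inside 'if'. */
static int
check_assign_in_if(const char *name)
{
	int rc;
	if ((rc = name_len(name)) < 0)
		return -1;
	return rc;
}

/* Style the review asks for: assign on one line, check on the next. */
static int
check_assign_then_test(const char *name)
{
	int rc = name_len(name);
	if (rc < 0)
		return -1;
	return rc;
}
```

Both functions behave the same; the point of the comment is only to pick one form and use it consistently.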

>> 11. Third argument of OutOfMemory is a name of the function
>> failed to allocate memory. Here you call region_alloc, not region.
> Vova prefers "region" in such cases, if I am not mistaken. I don't care. "region_alloc"

I do not remember Vova ever telling me anything like this.
But I clearly remember that Kostja said to use the function name,
and we do that throughout the code.

See 3 comments below.

> 
> Because sql_normalize_name used to be called twice (once to
> estimate the size of the name buffer and once to process the data),
> querying the UCaseMap object each time, SQL performance fell by 15%.
> 
> This patch should eliminate some of the negative effects of using
> ICU for name normalization.
> 
> Thanks @avtikhon for a benchmark
> 
> Follow up e7558062d3559e6bcc18f91eacb88269428321dc
> ---
>  src/box/sql/expr.c       | 29 ++++++++--------
>  src/box/sql/parse.y      | 21 ++++++------
>  src/box/sql/sqlInt.h     |  9 ++---
>  src/box/sql/trigger.c    | 22 ++++++++-----
>  src/box/sql/util.c       | 71 +++++++++++++++++++++-------------------
>  src/lib/coll/coll.c      |  8 ++++-
>  src/lib/coll/coll.h      |  3 ++
>  src/lua/utf8.c           | 11 +------
>  test/sql/errinj.result   |  9 ++++-
>  test/sql/errinj.test.lua |  2 ++
>  10 files changed, 98 insertions(+), 87 deletions(-)
> 
> diff --git a/src/box/sql/util.c b/src/box/sql/util.c
> index a13efa682..2f3c17c9a 100644
> --- a/src/box/sql/util.c
> +++ b/src/box/sql/util.c
> @@ -259,66 +260,68 @@ int
>  char *
>  sql_normalized_name_region_new(struct region *r, const char *name, int len)
>  {
> -	int size = sql_normalize_name(NULL, 0, name, len);
> -	if (size < 0)
> -		return NULL;
> -	char *res = (char *) region_alloc(r, size);
> -	if (res == NULL) {
> +	int size = len + 1;
> +	ERROR_INJECT(ERRINJ_SQL_NAME_NORMALIZATION, {
>  		diag_set(OutOfMemory, size, "region_alloc", "res");
>  		return NULL;
> -	}
> -	if (sql_normalize_name(res, size, name, len) < 0)
> +	});
> +	size_t region_svp = region_used(r);
> +	char *res = region_alloc(r, size);
> +	if (res == NULL)
> +		return NULL;

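1. Missed diag_set. The removed code set OutOfMemory when
region_alloc returned NULL; the new code returns NULL without
setting diag.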

> +	int rc = sql_normalize_name(res, size, name, len);
> +	if (rc <= size)
> +		return res;
> +
> +	size = rc;
> +	region_truncate(r, region_svp);
> +	res = region_alloc(r, size);
> +	if (res == NULL)
>  		return NULL;

2. Again, diag_set is missing.

> +	if (sql_normalize_name(res, size, name, len) > size)
> +		unreachable();
>  	return res;
>  }
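
For comments 1 and 2, the flow with an error set on both allocation failures could look roughly like the sketch below. It is a standalone illustration, not the Tarantool code: toy_alloc stands in for region_alloc (with a failure hook in the spirit of ERROR_INJECT), set_oom_error stands in for diag_set(OutOfMemory, ...), and toy_normalize is a made-up normalizer that, like sql_normalize_name in this diff, is assumed to return the buffer size it needs.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the diag area: keeps the last error text. */
static char last_error[128];

/* Hypothetical stand-in for diag_set(OutOfMemory, size, func, var). */
static void
set_oom_error(int size, const char *func, const char *var)
{
	snprintf(last_error, sizeof(last_error),
		 "Failed to allocate %d bytes in %s for %s", size, func, var);
}

/* Test hook: make the Nth call of toy_alloc fail (0 = never). */
static int fail_alloc_at = 0;
static int alloc_count = 0;

/* Hypothetical stand-in for region_alloc, backed by malloc. */
static void *
toy_alloc(size_t size)
{
	if (++alloc_count == fail_alloc_at)
		return NULL;
	return malloc(size);
}

/*
 * Toy normalizer (NOT ICU): upper-cases ASCII and expands '&' to "AND",
 * so the output may be longer than the input. Writes at most dst_size
 * bytes including the terminating zero and returns the size needed.
 */
static int
toy_normalize(char *dst, int dst_size, const char *src, int len)
{
	int need = 0;
	for (int i = 0; i < len; i++) {
		char c = (char) toupper((unsigned char) src[i]);
		const char *piece = src[i] == '&' ? "AND" : &c;
		int piece_len = src[i] == '&' ? 3 : 1;
		for (int j = 0; j < piece_len; j++, need++) {
			if (need < dst_size - 1)
				dst[need] = piece[j];
		}
	}
	if (dst_size > 0)
		dst[need < dst_size ? need : dst_size - 1] = '\0';
	return need + 1;
}

/* Optimistic first pass with len + 1 bytes, one retry with the exact size. */
static char *
normalized_name_new(const char *name, int len)
{
	int size = len + 1;
	char *res = toy_alloc(size);
	if (res == NULL) {
		/* Comment 1: report the failure before returning NULL. */
		set_oom_error(size, "toy_alloc", "res");
		return NULL;
	}
	int rc = toy_normalize(res, size, name, len);
	if (rc <= size)
		return res;
	/* The guess was too small: drop the buffer and retry once. */
	free(res);
	size = rc;
	res = toy_alloc(size);
	if (res == NULL) {
		/* Comment 2: the same report on the second allocation. */
		set_oom_error(size, "toy_alloc", "res");
		return NULL;
	}
	rc = toy_normalize(res, size, name, len);
	assert(rc <= size);
	(void) rc;
	return res;
}
```

In this sketch the caller owns the result and releases it with free(); with a real region the buffer would instead be released by truncating the region, as the patch does with region_truncate.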
>  
> diff --git a/src/lib/coll/coll.c b/src/lib/coll/coll.c
> index b83f0fdc7..21f2489d4 100644
> --- a/src/lib/coll/coll.c
> +++ b/src/lib/coll/coll.c
> @@ -34,8 +34,11 @@
>  #include "diag.h"
>  #include "assoc.h"
>  #include <unicode/ucol.h>
> +#include <unicode/ucasemap.h>
>  #include <trivia/config.h>
>  
> +struct UCaseMap *root_map = NULL;

3. That name was ok while it was local to utf8.c, but now
it is global, and 'root_map' in the whole scope of
tarantool looks ambiguous. What is it? A MessagePack map?
A Lua table map? An RB-tree map? What is 'root'? I propose

    icu_ucase_default_map
