Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Chris Sosnin <k.sosnin@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v2] box: add binary search for _session_settings space
Date: Sat, 25 Jan 2020 01:19:06 +0100	[thread overview]
Message-ID: <77348412-c9ac-54e7-2997-f9b2a2a9e0cb@tarantool.org> (raw)
In-Reply-To: <20200119201815.71286-1-k.sosnin@tarantool.org>

Hi! Thanks for the patch!

I've pushed my review fixes on top of this commit. See it below
and on the branch. If you agree, then squash. Otherwise lets
discuss.

See 6 comments below.

On 19/01/2020 21:18, Chris Sosnin wrote:
> Vlad, thank you for your suggestions!
> Instead of using module_id as a flag, I added an explicit 'is_set' variable.
> It is more convenient to use if you take into consideration adding more
> modules. The rest is according to your advice: we set an interator with
> binary search (finding lower/upper bound to be precise) and then do linear
> lookup.
> 
> See the whole patch below:

1. I forget to say that every time. Please, write your comments below
'---' symbols. In the same place where you mention branch and issue.
Because otherwise I don't see where your comment ends, and where a
commit message begins.

2. Please, put me into CC, when you want a review from me. IMO, this
is a bad practice to send a patch into nowhere without To/CC. Because
such a patch tends to hang until you would ask someone explicitly for
a review, or your manager accidentally by luck sees that patch and
decides to do a review by himself.

> As it is mentioned in implementation, it is important
> that _session_settings options are sorted by name, so
> there is no need in linear lookup and we can replace it
> with binary search.
> 
> Closes #4712
> ---
> branch: https://github.com/tarantool/tarantool/tree/ksosnin/gh-4712-search-settings
> issue: https://github.com/tarantool/tarantool/issues/4712
> 
>  src/box/session_settings.c | 92 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 88 insertions(+), 4 deletions(-)
> 
> diff --git a/src/box/session_settings.c b/src/box/session_settings.c
> index a969d3d10..441dd29ff 100644
> --- a/src/box/session_settings.c
> +++ b/src/box/session_settings.c
> @@ -74,6 +74,8 @@ struct session_settings_iterator {
>  	bool is_eq;
>  	/** True if the iterator should include equal keys. */
>  	bool is_including;
> +	/** True if the iterator is pointing to the existing setting */

3. I would rather say to '*an* existing'. Because there is no a one
certain setting to which all the iterators point after setting that
flag.

> +	bool is_set;
>  };
>  
>  static void
> @@ -85,6 +87,69 @@ session_settings_iterator_free(struct iterator *ptr)
>  	free(it);
>  }
>  
> +static int
> +session_settings_set_forward(const struct session_setting_module *module,
> +			     int *sid, const char *key, bool is_eq,
> +			     bool is_including)
> +{
> +	int count = module->setting_count;
> +	int low = 0, high = count;
> +	if (key == NULL)
> +		return 0;
> +	const char **name = module->settings;
> +	while (low < high) {
> +		int index = low + (high - low) / 2;
> +		int cmp = strcmp(name[index], key);
> +		if ((cmp < 0 && is_including) ||
> +		    (cmp <= 0 && !is_including))
> +			low = index + 1;
> +		else
> +			high = index;
> +	}

4. Even though that bsearch passes the tests, I failed to understand
it. Please, consider a more canonical bsearch on the branch in a
separate commit. Feel free to comment, if someone catches your attention
in my version and you don't agree.

> +	if (low < count) {
> +		int cmp = strcmp(name[low], key);
> +		if ((cmp == 0 && is_including) ||
> +		    (cmp > 0 && !is_eq)) {
> +			*sid = low;
> +			return 0;
> +		}
> +	}

5. This last strcmp can be avoided. Or at least
moved into the cycle. See my commit.

All the same about 'reverse' version.

> +	*sid = count;
> +	return -1;
> +}
> +
> +static int
> +session_settings_set_reverse(const struct session_setting_module *module,
> +			     int *sid, const char *key, bool is_eq,
> +			     bool is_including)
> +{
> +	int count = module->setting_count;
> +	int low = -1, high = count - 1;
> +	if (key == NULL)
> +		return 0;
> +	const char **name = module->settings;
> +	while (low < high) {
> +		int index = high - (high - low) / 2;
> +		int cmp = strcmp(name[index], key);
> +		if ((cmp > 0 && is_including) ||
> +		    (cmp >= 0 && !is_including))
> +			high = index - 1;
> +		else
> +			low = index;
> +	}
> +	if (high >= 0) {
> +		int cmp = strcmp(name[high], key);
> +		if ((cmp == 0 && is_including) ||
> +		    (cmp < 0 && !is_eq)) {
> +			*sid = high;
> +			return 0;
> +		}
> +	}
> +	*sid = count;
> +	return -1;
> +}
> +
> +
>  static int
>  session_settings_next_in_module(const struct session_setting_module *module,
>  				int *sid, const char *key, bool is_eq,
> @@ -148,6 +213,15 @@ session_settings_iterator_next(struct iterator *iterator, struct tuple **result)
>  	bool is_found = false;
>  	for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
>  		module = &session_setting_modules[mid];
> +		if (!it->is_set) {
> +			if (session_settings_set_forward(module, &sid, key,
> +							 is_eq,
> +							 is_including) == 0) {
> +				it->is_set = true;
> +				is_found = true;
> +				break;
> +			}
> +		}

6. What I don't like here is that we check is_set as many times,
as many modules we have. Moreover, if set_forward didn't find
anything, you fallback to a fullscan in that module, even though
it wont find anything with 100% guarantee.

I fixed it, see my commit.

The same about iterator_prev.

>  		if (session_settings_next_in_module(module, &sid, key, is_eq,
>  						    is_including) == 0) {
>  			is_found = true;
> @@ -178,6 +252,15 @@ session_settings_iterator_prev(struct iterator *iterator, struct tuple **result)
>  	bool is_found = false;
>  	for (; mid >= 0; --mid, sid = INT_MAX) {
>  		module = &session_setting_modules[mid];
> +		if (!it->is_set) {
> +			if (session_settings_set_reverse(module, &sid, key,
> +							 is_eq,
> +							 is_including) == 0) {
> +				it->is_set = true;
> +				is_found = true;
> +				break;
> +			}
> +		}
>  		if (session_settings_prev_in_module(module, &sid, key, is_eq,
>  						    is_including) == 0) {
>  			is_found = true;
My commit is here. I fixed settings tests also - there was no a check that
real settings space, and a user defined one return the same tuple count. It
could happen, that one of them returns 0 tuples, and it would be considered
success.

Additionally, I added a test, which tries all existing iterators on every
setting.

================================================================================

commit 8c57ca1747abf446c45c5acc1b4f5bc19ffead56
Author: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Date:   Sat Jan 25 01:18:35 2020 +0100

    Review fixes

diff --git a/src/box/session_settings.c b/src/box/session_settings.c
index 441dd29ff..f09392343 100644
--- a/src/box/session_settings.c
+++ b/src/box/session_settings.c
@@ -93,29 +93,33 @@ session_settings_set_forward(const struct session_setting_module *module,
 			     bool is_including)
 {
 	int count = module->setting_count;
-	int low = 0, high = count;
+	int low = 0, high = count - 1;
 	if (key == NULL)
 		return 0;
 	const char **name = module->settings;
-	while (low < high) {
-		int index = low + (high - low) / 2;
+	while (low <= high) {
+		int index = (high + low) / 2;
 		int cmp = strcmp(name[index], key);
-		if ((cmp < 0 && is_including) ||
-		    (cmp <= 0 && !is_including))
+		if (cmp == 0) {
+			if (is_including) {
+				*sid = index;
+				return 0;
+			}
+			*sid = ++index;
+			return index < count ? 0 : -1;
+		}
+		if (cmp < 0)
 			low = index + 1;
 		else
-			high = index;
+			high = index - 1;
 	}
-	if (low < count) {
-		int cmp = strcmp(name[low], key);
-		if ((cmp == 0 && is_including) ||
-		    (cmp > 0 && !is_eq)) {
-			*sid = low;
-			return 0;
-		}
+	if (is_eq) {
+		*sid = count;
+		return -1;
 	}
-	*sid = count;
-	return -1;
+	assert(low > high);
+	*sid = low;
+	return low < count ? 0 : -1;
 }
 
 static int
@@ -124,29 +128,33 @@ session_settings_set_reverse(const struct session_setting_module *module,
 			     bool is_including)
 {
 	int count = module->setting_count;
-	int low = -1, high = count - 1;
+	int low = 0, high = count - 1;
 	if (key == NULL)
 		return 0;
 	const char **name = module->settings;
-	while (low < high) {
-		int index = high - (high - low) / 2;
+	while (low <= high) {
+		int index = (high + low) / 2;
 		int cmp = strcmp(name[index], key);
-		if ((cmp > 0 && is_including) ||
-		    (cmp >= 0 && !is_including))
-			high = index - 1;
+		if (cmp == 0) {
+			if (is_including) {
+				*sid = index;
+				return 0;
+			}
+			*sid = --index;
+			return index >= 0 ? 0 : -1;
+		}
+		if (cmp < 0)
+			low = index + 1;
 		else
-			low = index;
+			high = index - 1;
 	}
-	if (high >= 0) {
-		int cmp = strcmp(name[high], key);
-		if ((cmp == 0 && is_including) ||
-		    (cmp < 0 && !is_eq)) {
-			*sid = high;
-			return 0;
-		}
+	if (is_eq) {
+		*sid = count;
+		return -1;
 	}
-	*sid = count;
-	return -1;
+	assert(low > high);
+	*sid = high;
+	return high >= 0 ? 0 : -1;
 }
 
 
@@ -211,21 +219,26 @@ session_settings_iterator_next(struct iterator *iterator, struct tuple **result)
 	const char *key = it->key;
 	bool is_including = it->is_including, is_eq = it->is_eq;
 	bool is_found = false;
-	for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
-		module = &session_setting_modules[mid];
-		if (!it->is_set) {
-			if (session_settings_set_forward(module, &sid, key,
-							 is_eq,
-							 is_including) == 0) {
-				it->is_set = true;
+	if (!it->is_set) {
+		it->is_set = true;
+		for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
+			module = &session_setting_modules[mid];
+			if (session_settings_set_forward(
+				module, &sid, key, is_eq, is_including) == 0) {
+
 				is_found = true;
 				break;
 			}
 		}
-		if (session_settings_next_in_module(module, &sid, key, is_eq,
-						    is_including) == 0) {
-			is_found = true;
-			break;
+	} else {
+		for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
+			module = &session_setting_modules[mid];
+			if (session_settings_next_in_module(
+				module, &sid, key, is_eq, is_including) == 0) {
+
+				is_found = true;
+				break;
+			}
 		}
 	}
 	it->module_id = mid;
@@ -250,21 +263,26 @@ session_settings_iterator_prev(struct iterator *iterator, struct tuple **result)
 	const char *key = it->key;
 	bool is_including = it->is_including, is_eq = it->is_eq;
 	bool is_found = false;
-	for (; mid >= 0; --mid, sid = INT_MAX) {
-		module = &session_setting_modules[mid];
-		if (!it->is_set) {
-			if (session_settings_set_reverse(module, &sid, key,
-							 is_eq,
-							 is_including) == 0) {
-				it->is_set = true;
+	if (!it->is_set) {
+		it->is_set = true;
+		for (; mid >= 0; --mid, sid = INT_MAX) {
+			module = &session_setting_modules[mid];
+			if (session_settings_set_reverse(
+				module, &sid, key, is_eq, is_including) == 0) {
+
 				is_found = true;
 				break;
 			}
 		}
-		if (session_settings_prev_in_module(module, &sid, key, is_eq,
-						    is_including) == 0) {
-			is_found = true;
-			break;
+	} else {
+		for (; mid >= 0; --mid, sid = INT_MAX) {
+			module = &session_setting_modules[mid];
+			if (session_settings_prev_in_module(
+				module, &sid, key, is_eq, is_including) == 0) {
+
+				is_found = true;
+				break;
+			}
 		}
 	}
 	it->module_id = mid;
diff --git a/test/box/gh-4511-access-settings-from-any-frontend.result b/test/box/gh-4511-access-settings-from-any-frontend.result
index ed340d053..bae77192e 100644
--- a/test/box/gh-4511-access-settings-from-any-frontend.result
+++ b/test/box/gh-4511-access-settings-from-any-frontend.result
@@ -82,6 +82,12 @@ function check_sorting(ss, ts, key)
     for _, it in pairs(iterators_list) do
         local view_space = ss:select({key}, {iterator = it})
         local test_space = ts:select({key}, {iterator = it})
+        if #view_space ~= #test_space then
+            return {
+                err = 'bad sorting', type = it, exp = test_space,
+                got = view_space
+            }
+        end
         for key, value in pairs(view_space) do
             if test_space[key].name ~= value.name then
                 return {
@@ -111,32 +117,37 @@ check_sorting(s, t, 'sql_d')
 check_sorting(s, t, 'sql_v')
  | ---
  | ...
-check_sorting(s, t, 'sql_defer_foreign_keys')
+check_sorting(s, t, 'z')
  | ---
  | ...
-
-t:drop()
+check_sorting(s, t, 'sql_parser_debuf')
+ | ---
+ | ...
+check_sorting(s, t, 'sql_parser_debuh')
  | ---
  | ...
 
--- Check get() method of session_settings space.
-s:get({'sql_defer_foreign_keys'})
+name = nil
  | ---
- | - ['sql_defer_foreign_keys', false]
  | ...
-s:get({'sql_recursive_triggers'})
+err = nil
  | ---
- | - ['sql_recursive_triggers', true]
  | ...
-s:get({'sql_reverse_unordered_selects'})
+for _, tuple in t:pairs() do                    \
+    name = tuple.name                           \
+    err = check_sorting(s, t, name)             \
+    if err then                                 \
+        break                                   \
+    end                                         \
+end
  | ---
- | - ['sql_reverse_unordered_selects', false]
  | ...
-s:get({'sql_default_engine'})
+err and {err, name}
  | ---
- | - ['sql_default_engine', 'memtx']
+ | - null
  | ...
-s:get({'abcd'})
+
+t:drop()
  | ---
  | ...
 
diff --git a/test/box/gh-4511-access-settings-from-any-frontend.test.lua b/test/box/gh-4511-access-settings-from-any-frontend.test.lua
index 366874998..b243be15e 100644
--- a/test/box/gh-4511-access-settings-from-any-frontend.test.lua
+++ b/test/box/gh-4511-access-settings-from-any-frontend.test.lua
@@ -36,6 +36,12 @@ function check_sorting(ss, ts, key)
     for _, it in pairs(iterators_list) do
         local view_space = ss:select({key}, {iterator = it})
         local test_space = ts:select({key}, {iterator = it})
+        if #view_space ~= #test_space then
+            return {
+                err = 'bad sorting', type = it, exp = test_space,
+                got = view_space
+            }
+        end
         for key, value in pairs(view_space) do
             if test_space[key].name ~= value.name then
                 return {
@@ -52,17 +58,23 @@ check_sorting(s, t)
 check_sorting(s, t, 'abcde')
 check_sorting(s, t, 'sql_d')
 check_sorting(s, t, 'sql_v')
-check_sorting(s, t, 'sql_defer_foreign_keys')
+check_sorting(s, t, 'z')
+check_sorting(s, t, 'sql_parser_debuf')
+check_sorting(s, t, 'sql_parser_debuh')
+
+name = nil
+err = nil
+for _, tuple in t:pairs() do                    \
+    name = tuple.name                           \
+    err = check_sorting(s, t, name)             \
+    if err then                                 \
+        break                                   \
+    end                                         \
+end
+err and {err, name}
 
 t:drop()
 
--- Check get() method of session_settings space.
-s:get({'sql_defer_foreign_keys'})
-s:get({'sql_recursive_triggers'})
-s:get({'sql_reverse_unordered_selects'})
-s:get({'sql_default_engine'})
-s:get({'abcd'})
-
 -- Check pairs() method of session_settings space.
 t = {}
 for key, value in s:pairs() do table.insert(t, {key, value}) end

  reply	other threads:[~2020-01-25  0:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-19 20:18 Chris Sosnin
2020-01-25  0:19 ` Vladislav Shpilevoy [this message]
2020-01-25  7:04   ` Chris Sosnin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=77348412-c9ac-54e7-2997-f9b2a2a9e0cb@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=k.sosnin@tarantool.org \
    --cc=tarantool-patches@dev.tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH v2] box: add binary search for _session_settings space' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox