[Tarantool-patches] [PATCH v2] box: add binary search for _session_settings space

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sat Jan 25 03:19:06 MSK 2020


Hi! Thanks for the patch!

I've pushed my review fixes on top of this commit. See it below
and on the branch. If you agree, then squash. Otherwise lets
discuss.

See 6 comments below.

On 19/01/2020 21:18, Chris Sosnin wrote:
> Vlad, thank you for your suggestions!
> Instead of using module_id as a flag, I added an explicit 'is_set' variable.
> It is more convenient to use if you take into consideration adding more
> modules. The rest is according to your advice: we set an interator with
> binary search (finding lower/upper bound to be precise) and then do linear
> lookup.
> 
> See the whole patch below:

1. I forget to say that every time. Please, write your comments below
'---' symbols. In the same place where you mention branch and issue.
Because otherwise I don't see where your comment ends, and where a
commit message begins.

2. Please, put me into CC, when you want a review from me. IMO, this
is a bad practice to send a patch into nowhere without To/CC. Because
such a patch tends to hang until you would ask someone explicitly for
a review, or your manager accidentally by luck sees that patch and
decides to do a review by himself.

> As it is mentioned in implementation, it is important
> that _session_settings options are sorted by name, so
> there is no need in linear lookup and we can replace it
> with binary search.
> 
> Closes #4712
> ---
> branch: https://github.com/tarantool/tarantool/tree/ksosnin/gh-4712-search-settings
> issue: https://github.com/tarantool/tarantool/issues/4712
> 
>  src/box/session_settings.c | 92 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 88 insertions(+), 4 deletions(-)
> 
> diff --git a/src/box/session_settings.c b/src/box/session_settings.c
> index a969d3d10..441dd29ff 100644
> --- a/src/box/session_settings.c
> +++ b/src/box/session_settings.c
> @@ -74,6 +74,8 @@ struct session_settings_iterator {
>  	bool is_eq;
>  	/** True if the iterator should include equal keys. */
>  	bool is_including;
> +	/** True if the iterator is pointing to the existing setting */

3. I would rather say to '*an* existing'. Because there is no a one
certain setting to which all the iterators point after setting that
flag.

> +	bool is_set;
>  };
>  
>  static void
> @@ -85,6 +87,69 @@ session_settings_iterator_free(struct iterator *ptr)
>  	free(it);
>  }
>  
> +static int
> +session_settings_set_forward(const struct session_setting_module *module,
> +			     int *sid, const char *key, bool is_eq,
> +			     bool is_including)
> +{
> +	int count = module->setting_count;
> +	int low = 0, high = count;
> +	if (key == NULL)
> +		return 0;
> +	const char **name = module->settings;
> +	while (low < high) {
> +		int index = low + (high - low) / 2;
> +		int cmp = strcmp(name[index], key);
> +		if ((cmp < 0 && is_including) ||
> +		    (cmp <= 0 && !is_including))
> +			low = index + 1;
> +		else
> +			high = index;
> +	}

4. Even though that bsearch passes the tests, I failed to understand
it. Please, consider a more canonical bsearch on the branch in a
separate commit. Feel free to comment, if someone catches your attention
in my version and you don't agree.

> +	if (low < count) {
> +		int cmp = strcmp(name[low], key);
> +		if ((cmp == 0 && is_including) ||
> +		    (cmp > 0 && !is_eq)) {
> +			*sid = low;
> +			return 0;
> +		}
> +	}

5. This last strcmp can be avoided. Or at least
moved into the cycle. See my commit.

All the same about 'reverse' version.

> +	*sid = count;
> +	return -1;
> +}
> +
> +static int
> +session_settings_set_reverse(const struct session_setting_module *module,
> +			     int *sid, const char *key, bool is_eq,
> +			     bool is_including)
> +{
> +	int count = module->setting_count;
> +	int low = -1, high = count - 1;
> +	if (key == NULL)
> +		return 0;
> +	const char **name = module->settings;
> +	while (low < high) {
> +		int index = high - (high - low) / 2;
> +		int cmp = strcmp(name[index], key);
> +		if ((cmp > 0 && is_including) ||
> +		    (cmp >= 0 && !is_including))
> +			high = index - 1;
> +		else
> +			low = index;
> +	}
> +	if (high >= 0) {
> +		int cmp = strcmp(name[high], key);
> +		if ((cmp == 0 && is_including) ||
> +		    (cmp < 0 && !is_eq)) {
> +			*sid = high;
> +			return 0;
> +		}
> +	}
> +	*sid = count;
> +	return -1;
> +}
> +
> +
>  static int
>  session_settings_next_in_module(const struct session_setting_module *module,
>  				int *sid, const char *key, bool is_eq,
> @@ -148,6 +213,15 @@ session_settings_iterator_next(struct iterator *iterator, struct tuple **result)
>  	bool is_found = false;
>  	for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
>  		module = &session_setting_modules[mid];
> +		if (!it->is_set) {
> +			if (session_settings_set_forward(module, &sid, key,
> +							 is_eq,
> +							 is_including) == 0) {
> +				it->is_set = true;
> +				is_found = true;
> +				break;
> +			}
> +		}

6. What I don't like here is that we check is_set as many times,
as many modules we have. Moreover, if set_forward didn't find
anything, you fallback to a fullscan in that module, even though
it wont find anything with 100% guarantee.

I fixed it, see my commit.

The same about iterator_prev.

>  		if (session_settings_next_in_module(module, &sid, key, is_eq,
>  						    is_including) == 0) {
>  			is_found = true;
> @@ -178,6 +252,15 @@ session_settings_iterator_prev(struct iterator *iterator, struct tuple **result)
>  	bool is_found = false;
>  	for (; mid >= 0; --mid, sid = INT_MAX) {
>  		module = &session_setting_modules[mid];
> +		if (!it->is_set) {
> +			if (session_settings_set_reverse(module, &sid, key,
> +							 is_eq,
> +							 is_including) == 0) {
> +				it->is_set = true;
> +				is_found = true;
> +				break;
> +			}
> +		}
>  		if (session_settings_prev_in_module(module, &sid, key, is_eq,
>  						    is_including) == 0) {
>  			is_found = true;
My commit is here. I fixed settings tests also - there was no a check that
real settings space, and a user defined one return the same tuple count. It
could happen, that one of them returns 0 tuples, and it would be considered
success.

Additionally, I added a test, which tries all existing iterators on every
setting.

================================================================================

commit 8c57ca1747abf446c45c5acc1b4f5bc19ffead56
Author: Vladislav Shpilevoy <v.shpilevoy at tarantool.org>
Date:   Sat Jan 25 01:18:35 2020 +0100

    Review fixes

diff --git a/src/box/session_settings.c b/src/box/session_settings.c
index 441dd29ff..f09392343 100644
--- a/src/box/session_settings.c
+++ b/src/box/session_settings.c
@@ -93,29 +93,33 @@ session_settings_set_forward(const struct session_setting_module *module,
 			     bool is_including)
 {
 	int count = module->setting_count;
-	int low = 0, high = count;
+	int low = 0, high = count - 1;
 	if (key == NULL)
 		return 0;
 	const char **name = module->settings;
-	while (low < high) {
-		int index = low + (high - low) / 2;
+	while (low <= high) {
+		int index = (high + low) / 2;
 		int cmp = strcmp(name[index], key);
-		if ((cmp < 0 && is_including) ||
-		    (cmp <= 0 && !is_including))
+		if (cmp == 0) {
+			if (is_including) {
+				*sid = index;
+				return 0;
+			}
+			*sid = ++index;
+			return index < count ? 0 : -1;
+		}
+		if (cmp < 0)
 			low = index + 1;
 		else
-			high = index;
+			high = index - 1;
 	}
-	if (low < count) {
-		int cmp = strcmp(name[low], key);
-		if ((cmp == 0 && is_including) ||
-		    (cmp > 0 && !is_eq)) {
-			*sid = low;
-			return 0;
-		}
+	if (is_eq) {
+		*sid = count;
+		return -1;
 	}
-	*sid = count;
-	return -1;
+	assert(low > high);
+	*sid = low;
+	return low < count ? 0 : -1;
 }
 
 static int
@@ -124,29 +128,33 @@ session_settings_set_reverse(const struct session_setting_module *module,
 			     bool is_including)
 {
 	int count = module->setting_count;
-	int low = -1, high = count - 1;
+	int low = 0, high = count - 1;
 	if (key == NULL)
 		return 0;
 	const char **name = module->settings;
-	while (low < high) {
-		int index = high - (high - low) / 2;
+	while (low <= high) {
+		int index = (high + low) / 2;
 		int cmp = strcmp(name[index], key);
-		if ((cmp > 0 && is_including) ||
-		    (cmp >= 0 && !is_including))
-			high = index - 1;
+		if (cmp == 0) {
+			if (is_including) {
+				*sid = index;
+				return 0;
+			}
+			*sid = --index;
+			return index >= 0 ? 0 : -1;
+		}
+		if (cmp < 0)
+			low = index + 1;
 		else
-			low = index;
+			high = index - 1;
 	}
-	if (high >= 0) {
-		int cmp = strcmp(name[high], key);
-		if ((cmp == 0 && is_including) ||
-		    (cmp < 0 && !is_eq)) {
-			*sid = high;
-			return 0;
-		}
+	if (is_eq) {
+		*sid = count;
+		return -1;
 	}
-	*sid = count;
-	return -1;
+	assert(low > high);
+	*sid = high;
+	return high >= 0 ? 0 : -1;
 }
 
 
@@ -211,21 +219,26 @@ session_settings_iterator_next(struct iterator *iterator, struct tuple **result)
 	const char *key = it->key;
 	bool is_including = it->is_including, is_eq = it->is_eq;
 	bool is_found = false;
-	for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
-		module = &session_setting_modules[mid];
-		if (!it->is_set) {
-			if (session_settings_set_forward(module, &sid, key,
-							 is_eq,
-							 is_including) == 0) {
-				it->is_set = true;
+	if (!it->is_set) {
+		it->is_set = true;
+		for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
+			module = &session_setting_modules[mid];
+			if (session_settings_set_forward(
+				module, &sid, key, is_eq, is_including) == 0) {
+
 				is_found = true;
 				break;
 			}
 		}
-		if (session_settings_next_in_module(module, &sid, key, is_eq,
-						    is_including) == 0) {
-			is_found = true;
-			break;
+	} else {
+		for (; mid < session_setting_type_MAX; ++mid, sid = 0) {
+			module = &session_setting_modules[mid];
+			if (session_settings_next_in_module(
+				module, &sid, key, is_eq, is_including) == 0) {
+
+				is_found = true;
+				break;
+			}
 		}
 	}
 	it->module_id = mid;
@@ -250,21 +263,26 @@ session_settings_iterator_prev(struct iterator *iterator, struct tuple **result)
 	const char *key = it->key;
 	bool is_including = it->is_including, is_eq = it->is_eq;
 	bool is_found = false;
-	for (; mid >= 0; --mid, sid = INT_MAX) {
-		module = &session_setting_modules[mid];
-		if (!it->is_set) {
-			if (session_settings_set_reverse(module, &sid, key,
-							 is_eq,
-							 is_including) == 0) {
-				it->is_set = true;
+	if (!it->is_set) {
+		it->is_set = true;
+		for (; mid >= 0; --mid, sid = INT_MAX) {
+			module = &session_setting_modules[mid];
+			if (session_settings_set_reverse(
+				module, &sid, key, is_eq, is_including) == 0) {
+
 				is_found = true;
 				break;
 			}
 		}
-		if (session_settings_prev_in_module(module, &sid, key, is_eq,
-						    is_including) == 0) {
-			is_found = true;
-			break;
+	} else {
+		for (; mid >= 0; --mid, sid = INT_MAX) {
+			module = &session_setting_modules[mid];
+			if (session_settings_prev_in_module(
+				module, &sid, key, is_eq, is_including) == 0) {
+
+				is_found = true;
+				break;
+			}
 		}
 	}
 	it->module_id = mid;
diff --git a/test/box/gh-4511-access-settings-from-any-frontend.result b/test/box/gh-4511-access-settings-from-any-frontend.result
index ed340d053..bae77192e 100644
--- a/test/box/gh-4511-access-settings-from-any-frontend.result
+++ b/test/box/gh-4511-access-settings-from-any-frontend.result
@@ -82,6 +82,12 @@ function check_sorting(ss, ts, key)
     for _, it in pairs(iterators_list) do
         local view_space = ss:select({key}, {iterator = it})
         local test_space = ts:select({key}, {iterator = it})
+        if #view_space ~= #test_space then
+            return {
+                err = 'bad sorting', type = it, exp = test_space,
+                got = view_space
+            }
+        end
         for key, value in pairs(view_space) do
             if test_space[key].name ~= value.name then
                 return {
@@ -111,32 +117,37 @@ check_sorting(s, t, 'sql_d')
 check_sorting(s, t, 'sql_v')
  | ---
  | ...
-check_sorting(s, t, 'sql_defer_foreign_keys')
+check_sorting(s, t, 'z')
  | ---
  | ...
-
-t:drop()
+check_sorting(s, t, 'sql_parser_debuf')
+ | ---
+ | ...
+check_sorting(s, t, 'sql_parser_debuh')
  | ---
  | ...
 
--- Check get() method of session_settings space.
-s:get({'sql_defer_foreign_keys'})
+name = nil
  | ---
- | - ['sql_defer_foreign_keys', false]
  | ...
-s:get({'sql_recursive_triggers'})
+err = nil
  | ---
- | - ['sql_recursive_triggers', true]
  | ...
-s:get({'sql_reverse_unordered_selects'})
+for _, tuple in t:pairs() do                    \
+    name = tuple.name                           \
+    err = check_sorting(s, t, name)             \
+    if err then                                 \
+        break                                   \
+    end                                         \
+end
  | ---
- | - ['sql_reverse_unordered_selects', false]
  | ...
-s:get({'sql_default_engine'})
+err and {err, name}
  | ---
- | - ['sql_default_engine', 'memtx']
+ | - null
  | ...
-s:get({'abcd'})
+
+t:drop()
  | ---
  | ...
 
diff --git a/test/box/gh-4511-access-settings-from-any-frontend.test.lua b/test/box/gh-4511-access-settings-from-any-frontend.test.lua
index 366874998..b243be15e 100644
--- a/test/box/gh-4511-access-settings-from-any-frontend.test.lua
+++ b/test/box/gh-4511-access-settings-from-any-frontend.test.lua
@@ -36,6 +36,12 @@ function check_sorting(ss, ts, key)
     for _, it in pairs(iterators_list) do
         local view_space = ss:select({key}, {iterator = it})
         local test_space = ts:select({key}, {iterator = it})
+        if #view_space ~= #test_space then
+            return {
+                err = 'bad sorting', type = it, exp = test_space,
+                got = view_space
+            }
+        end
         for key, value in pairs(view_space) do
             if test_space[key].name ~= value.name then
                 return {
@@ -52,17 +58,23 @@ check_sorting(s, t)
 check_sorting(s, t, 'abcde')
 check_sorting(s, t, 'sql_d')
 check_sorting(s, t, 'sql_v')
-check_sorting(s, t, 'sql_defer_foreign_keys')
+check_sorting(s, t, 'z')
+check_sorting(s, t, 'sql_parser_debuf')
+check_sorting(s, t, 'sql_parser_debuh')
+
+name = nil
+err = nil
+for _, tuple in t:pairs() do                    \
+    name = tuple.name                           \
+    err = check_sorting(s, t, name)             \
+    if err then                                 \
+        break                                   \
+    end                                         \
+end
+err and {err, name}
 
 t:drop()
 
--- Check get() method of session_settings space.
-s:get({'sql_defer_foreign_keys'})
-s:get({'sql_recursive_triggers'})
-s:get({'sql_reverse_unordered_selects'})
-s:get({'sql_default_engine'})
-s:get({'abcd'})
-
 -- Check pairs() method of session_settings space.
 t = {}
 for key, value in s:pairs() do table.insert(t, {key, value}) end


More information about the Tarantool-patches mailing list