Tarantool development patches archive
 help / color / mirror / Atom feed
* [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func.
@ 2019-03-20 11:11 Ivan Koptelov
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Ivan Koptelov @ 2019-03-20 11:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: korablev, Ivan Koptelov

There are two patches in this series. First one enhances
collation determination for function arguments (e.g. for cases,
when function arguments have different collations). Second one
renames instr() to position() and swaps arguments order. Also
after the second patch both arguments must have the same type,
which should be either TEXT or BLOB.

Branch https://github.com/tarantool/tarantool/tree/sudobobo/gh-3933-add-position-func
Issue https://github.com/tarantool/tarantool/issues/3933

Ivan Koptelov (2):
  sql: add better collation determination in functions
  sql: rename instr to position & add collation usage

 src/box/sql/expr.c             |  65 ++-
 src/box/sql/func.c             | 157 ++++--
 src/box/sql/sqlInt.h           |   8 +-
 src/box/sql/vdbeInt.h          |   9 +
 test/sql-tap/func.test.lua     |  10 +-
 test/sql-tap/func5.test.lua    |   4 +-
 test/sql-tap/instr.test.lua    | 706 -------------------------
 test/sql-tap/position.test.lua | 905 +++++++++++++++++++++++++++++++++
 test/sql-tap/with1.test.lua    |   4 +-
 9 files changed, 1106 insertions(+), 762 deletions(-)
 delete mode 100755 test/sql-tap/instr.test.lua
 create mode 100755 test/sql-tap/position.test.lua

-- 
2.20.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions
  2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov
@ 2019-03-20 11:11 ` Ivan Koptelov
  2019-03-25 19:26   ` [tarantool-patches] " n.pettik
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
  2019-04-01 14:15 ` [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func Kirill Yukhin
  2 siblings, 1 reply; 13+ messages in thread
From: Ivan Koptelov @ 2019-03-20 11:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: korablev, Ivan Koptelov

Before the patch determination of collation in SQL functions
(used only in instr) was too narrow: the arguments were scanned
from left to right, till the argument with collation was
encountered, then its collation was used.
Now every arguments collation is considered. The right collation
which would be used in function is determined using ANSI
compatibility rules ("type combination" rules).

Note: currently only instr() a.k.a position() uses mechanism
described above, other functions (except aggregate) simply
ignores collations.
---
 src/box/sql/expr.c   | 69 ++++++++++++++++++++++++++++++++++++++++----
 src/box/sql/sqlInt.h |  8 ++++-
 2 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index a2c70935e..2f48d90c6 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -4093,16 +4093,73 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 					testcase(i == 31);
 					constMask |= MASKBIT32(i);
 				}
-				if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) !=
-				    0 && coll == NULL) {
-					bool unused;
-					uint32_t id;
+			}
+			/*
+			 * Function arguments may have different
+			 * collations. The following code
+			 * checks if they are compatible and
+			 * finds the collation to be used. This
+			 * is done using ANSI rules from
+			 * collations_check_compatibility().
+			 */
+			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
+			     coll == NULL) {
+				struct coll *unused = NULL;
+				uint32_t curr_id = COLL_NONE;
+				bool is_curr_forced = false;
+
+				uint32_t temp_id = COLL_NONE;
+				bool is_temp_forced = false;
+
+				uint32_t lhs_id = COLL_NONE;
+				bool is_lhs_forced = false;
+
+				uint32_t rhs_id = COLL_NONE;
+				bool is_rhs_forced = false;
+
+				for (int i = 0; i < nFarg; i++) {
 					if (sql_expr_coll(pParse,
 							  pFarg->a[i].pExpr,
-							  &unused, &id,
-							  &coll) != 0)
+							  &is_lhs_forced,
+							  &lhs_id,
+							  &unused) != 0)
 						return 0;
+
+					for (int j = i + 1; j < nFarg; j++) {
+						if (sql_expr_coll(pParse,
+								  pFarg->a[j].pExpr,
+								  &is_rhs_forced,
+								  &rhs_id,
+								  &unused) != 0)
+							return 0;
+
+						if (collations_check_compatibility(
+							lhs_id, is_lhs_forced,
+							rhs_id, is_rhs_forced,
+							&temp_id) != 0) {
+							pParse->rc =
+								SQL_TARANTOOL_ERROR;
+							pParse->nErr++;
+							return 0;
+						}
+
+						is_temp_forced = (temp_id ==
+								  lhs_id) ?
+								 is_lhs_forced :
+								 is_rhs_forced;
+
+						if (collations_check_compatibility(
+							curr_id, is_curr_forced,
+							temp_id, is_temp_forced,
+							&curr_id) != 0) {
+							pParse->rc =
+								SQL_TARANTOOL_ERROR;
+							pParse->nErr++;
+							return 0;
+						}
+					}
 				}
+				coll = coll_by_id(curr_id)->coll;
 			}
 			if (pFarg) {
 				if (constMask) {
diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
index 8967ea3e0..47ee474bb 100644
--- a/src/box/sql/sqlInt.h
+++ b/src/box/sql/sqlInt.h
@@ -1660,7 +1660,13 @@ struct FuncDestructor {
 #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
 #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
 #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
-#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
+#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
+					 * The flag is set when the collation
+					 * of function arguments should be
+					 * determined, using rules in
+					 * collations_check_compatibility()
+					 * function.
+					 */
 #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
 #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
 #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
-- 
2.20.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage
  2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
@ 2019-03-20 11:11 ` Ivan Koptelov
  2019-03-20 12:59   ` [tarantool-patches] Re: [PATCH 1/2] " i.koptelov
  2019-03-26 12:32   ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik
  2019-04-01 14:15 ` [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func Kirill Yukhin
  2 siblings, 2 replies; 13+ messages in thread
From: Ivan Koptelov @ 2019-03-20 11:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: korablev, Ivan Koptelov

Before this patch we had instr() SQL function.
After the patch it is renamed to position() for a better ANSI
compatibility. Also a few things have been changed: arguments
order, allowed arguments types and usage of collations.

Note: after the patch position() syntax is still different
from ANSI. The desirable syntax is POSITION(substring IN string).
It is not possible to implement right now, because in our SQL
grammar we lack expr types. We have only basic 'expr' type and
it is not possible to write unambiguous rule for POSITION IN.
To solve this we have to refactor grammar and add something
like 'string_expr' (as it is done in other DBs grammars)

Workaround for #3933

@TarantoolBot document
Title: instr() is replaced with position()
Name and order of the arguments has changed for a better ANSI
compatibility:
Before: instr(haystack, needle).
After: position(needle, haystack).

Type checking became more strict: before it was
possible to call the function with INTEGER arguments
or with arguments of different types. Now both arguments
must have the same type and be either text or binary
strings.

Before the patch collations were not taken into
consideration during the search. Now it is fixed, and
both implicit (column) collations and explicit
(using COLLATE expression) are used. Single collation which
would be used in function is determined using ANSI
“Type combination” rules.

---
Before I changed postion() functionality, I refactored it:

---
 src/box/sql/func.c | 68 ++++++++++++++++++++++------------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 2de6ef5ce..0100bb22f 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -223,49 +223,46 @@ absFunc(sql_context * context, int argc, sql_value ** argv)
  * or 0 if needle never occurs in haystack.
  */
 static void
-instrFunc(sql_context * context, int argc, sql_value ** argv)
+position_func(sql_context * context, int argc, sql_value ** argv)
 {
-	const unsigned char *zHaystack;
-	const unsigned char *zNeedle;
-	int nHaystack;
-	int nNeedle;
-	int typeHaystack, typeNeedle;
-	int N = 1;
-	int isText;
-
 	UNUSED_PARAMETER(argc);
-	typeHaystack = sql_value_type(argv[0]);
-	typeNeedle = sql_value_type(argv[1]);
-	if (typeHaystack == SQL_NULL || typeNeedle == SQL_NULL)
+	int type_haystack = sql_value_type(argv[0]);
+	int type_needle = sql_value_type(argv[1]);
+	if (type_haystack == SQL_NULL || type_needle == SQL_NULL)
 		return;
-	nHaystack = sql_value_bytes(argv[0]);
-	nNeedle = sql_value_bytes(argv[1]);
-	if (nNeedle > 0) {
-		if (typeHaystack == SQL_BLOB && typeNeedle == SQL_BLOB) {
-			zHaystack = sql_value_blob(argv[0]);
-			zNeedle = sql_value_blob(argv[1]);
-			assert(zNeedle != 0);
-			assert(zHaystack != 0 || nHaystack == 0);
-			isText = 0;
+
+	int n_haystack = sql_value_bytes(argv[0]);
+	int n_needle = sql_value_bytes(argv[1]);
+	const unsigned char *haystack;
+	const unsigned char *needle;
+	int position = 1;
+	int is_text;
+	if (n_needle > 0) {
+		if (type_haystack == SQL_BLOB && type_needle == SQL_BLOB) {
+			haystack = sql_value_blob(argv[0]);
+			needle = sql_value_blob(argv[1]);
+			assert(needle != 0);
+			assert(haystack != 0 || n_haystack == 0);
+			is_text = 0;
 		} else {
-			zHaystack = sql_value_text(argv[0]);
-			zNeedle = sql_value_text(argv[1]);
-			isText = 1;
-			if (zHaystack == 0 || zNeedle == 0)
+			haystack = sql_value_text(argv[0]);
+			needle = sql_value_text(argv[1]);
+			is_text = 1;
+			if (haystack == 0 || needle == 0)
 				return;
 		}
-		while (nNeedle <= nHaystack
-		       && memcmp(zHaystack, zNeedle, nNeedle) != 0) {
-			N++;
+		while (n_needle <= n_haystack
+		       && memcmp(haystack, needle, n_needle) != 0) {
+			position++;
 			do {
-				nHaystack--;
-				zHaystack++;
-			} while (isText && (zHaystack[0] & 0xc0) == 0x80);
+				n_haystack--;
+				haystack++;
+			} while (is_text && (haystack[0] & 0xc0) == 0x80);
 		}
-		if (nNeedle > nHaystack)
-			N = 0;
+		if (n_needle > n_haystack)
+			position = 0;
 	}
-	sql_result_int(context, N);
+	sql_result_int(context, position);
 }

 /*
@@ -1760,7 +1757,8 @@ sqlRegisterBuiltinFunctions(void)
 			  FIELD_TYPE_STRING),
 		FUNCTION2(length, 1, 0, 0, lengthFunc, SQL_FUNC_LENGTH,
 			  FIELD_TYPE_INTEGER),
-		FUNCTION(instr, 2, 0, 0, instrFunc, FIELD_TYPE_INTEGER),
+		FUNCTION2(position, 2, 0, 1, position_func, SQL_FUNC_NEEDCOLL,
+			  FIELD_TYPE_INTEGER),
 		FUNCTION(printf, -1, 0, 0, printfFunc, FIELD_TYPE_STRING),
 		FUNCTION(unicode, 1, 0, 0, unicodeFunc, FIELD_TYPE_STRING),
 		FUNCTION(char, -1, 0, 0, charFunc, FIELD_TYPE_STRING),
--

 src/box/sql/expr.c             |   8 +-
 src/box/sql/func.c             | 157 ++++--
 src/box/sql/vdbeInt.h          |   9 +
 test/sql-tap/func.test.lua     |  10 +-
 test/sql-tap/func5.test.lua    |   4 +-
 test/sql-tap/instr.test.lua    | 706 -------------------------
 test/sql-tap/position.test.lua | 905 +++++++++++++++++++++++++++++++++
 test/sql-tap/with1.test.lua    |   4 +-
 8 files changed, 1042 insertions(+), 761 deletions(-)
 delete mode 100755 test/sql-tap/instr.test.lua
 create mode 100755 test/sql-tap/position.test.lua

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 2f48d90c6..53926e3f2 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -4137,9 +4137,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 							lhs_id, is_lhs_forced,
 							rhs_id, is_rhs_forced,
 							&temp_id) != 0) {
-							pParse->rc =
-								SQL_TARANTOOL_ERROR;
-							pParse->nErr++;
+							pParse->is_aborted = true;
 							return 0;
 						}
 
@@ -4152,9 +4150,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 							curr_id, is_curr_forced,
 							temp_id, is_temp_forced,
 							&curr_id) != 0) {
-							pParse->rc =
-								SQL_TARANTOOL_ERROR;
-							pParse->nErr++;
+							pParse->is_aborted = true;
 							return 0;
 						}
 					}
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index a750e52a1..6654d4998 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -38,6 +38,7 @@
 #include "vdbeInt.h"
 #include "version.h"
 #include "coll/coll.h"
+#include "tarantoolInt.h"
 #include <unicode/ustring.h>
 #include <unicode/ucasemap.h>
 #include <unicode/ucnv.h>
@@ -212,9 +213,9 @@ absFunc(sql_context * context, int argc, sql_value ** argv)
 }
 
 /*
- * Implementation of the instr() function.
+ * Implementation of the position() function.
  *
- * instr(haystack,needle) finds the first occurrence of needle
+ * position(needle, haystack) finds the first occurrence of needle
  * in haystack and returns the number of previous characters plus 1,
  * or 0 if needle does not occur within haystack.
  *
@@ -223,49 +224,124 @@ absFunc(sql_context * context, int argc, sql_value ** argv)
  * or 0 if needle never occurs in haystack.
  */
 static void
-instrFunc(sql_context * context, int argc, sql_value ** argv)
+position_func(sql_context * context, int argc, sql_value ** argv)
 {
-	const unsigned char *zHaystack;
-	const unsigned char *zNeedle;
-	int nHaystack;
-	int nNeedle;
-	int typeHaystack, typeNeedle;
-	int N = 1;
-	int isText;
-
 	UNUSED_PARAMETER(argc);
-	typeHaystack = sql_value_type(argv[0]);
-	typeNeedle = sql_value_type(argv[1]);
-	if (typeHaystack == SQL_NULL || typeNeedle == SQL_NULL)
+	sql_value *needle = argv[0];
+	sql_value *haystack = argv[1];
+	int type_needle = sql_value_type(needle);
+	int type_haystack = sql_value_type(haystack);
+
+	if (type_haystack == SQL_NULL || type_needle == SQL_NULL)
+		return;
+	/*
+	 * Position function can be called only with string
+	 * or blob params.
+	 */
+	sql_value *inconsistent_type_arg = NULL;
+	if (type_needle != SQL_TEXT && type_needle != SQL_BLOB)
+		inconsistent_type_arg = needle;
+	if (type_haystack != SQL_TEXT && type_haystack != SQL_BLOB)
+		inconsistent_type_arg = haystack;
+	if (inconsistent_type_arg != NULL) {
+		diag_set(ClientError, ER_INCONSISTENT_TYPES, "TEXT or BLOB",
+			 mem_type_to_str(inconsistent_type_arg));
+		context->pVdbe->pParse->is_aborted = true;
+		sql_result_error(context, tarantoolErrorMessage(), -1);
+		return;
+	}
+	/*
+	 * Both params of Position function must be of the same
+	 * type.
+	 */
+	if (type_haystack != type_needle) {
+		diag_set(ClientError, ER_INCONSISTENT_TYPES,
+			 mem_type_to_str(needle),
+			 mem_type_to_str(haystack));
+		context->pVdbe->pParse->is_aborted = true;
+		sql_result_error(context, tarantoolErrorMessage(), -1);
 		return;
-	nHaystack = sql_value_bytes(argv[0]);
-	nNeedle = sql_value_bytes(argv[1]);
-	if (nNeedle > 0) {
-		if (typeHaystack == SQL_BLOB && typeNeedle == SQL_BLOB) {
-			zHaystack = sql_value_blob(argv[0]);
-			zNeedle = sql_value_blob(argv[1]);
-			assert(zNeedle != 0);
-			assert(zHaystack != 0 || nHaystack == 0);
-			isText = 0;
+	}
+
+	int n_needle_bytes = sql_value_bytes(needle);
+	int n_haystack_bytes = sql_value_bytes(haystack);
+	int position = 1;
+	if (n_needle_bytes > 0) {
+		const unsigned char *haystack_str;
+		const unsigned char *needle_str;
+		if (type_haystack == SQL_BLOB) {
+			needle_str = sql_value_blob(needle);
+			haystack_str = sql_value_blob(haystack);
+			assert(needle_str != NULL);
+			assert(haystack_str != NULL || n_haystack_bytes == 0);
+
+			while (n_needle_bytes <= n_haystack_bytes
+			       && memcmp(haystack_str, needle_str, n_needle_bytes) != 0) {
+				position++;
+				n_haystack_bytes--;
+				haystack_str++;
+			}
+			if (n_needle_bytes > n_haystack_bytes)
+				position = 0;
 		} else {
-			zHaystack = sql_value_text(argv[0]);
-			zNeedle = sql_value_text(argv[1]);
-			isText = 1;
-			if (zHaystack == 0 || zNeedle == 0)
-				return;
-		}
-		while (nNeedle <= nHaystack
-		       && memcmp(zHaystack, zNeedle, nNeedle) != 0) {
-			N++;
-			do {
-				nHaystack--;
-				zHaystack++;
-			} while (isText && (zHaystack[0] & 0xc0) == 0x80);
+			/*
+			 * Code below handles not only simple
+			 * cases like position('a', 'bca'), but
+			 * also more complex ones:
+			 * position('a', 'bcá' COLLATE "unicode_ci")
+			 * To do so, we need to use comparison
+			 * window, which has constant character
+			 * size, but variable byte size.
+			 * Character size is equal to
+			 * needle char size.
+			 */
+			haystack_str = sql_value_text(haystack);
+			needle_str = sql_value_text(needle);
+
+			int n_needle_chars =
+				sql_utf8_char_count(needle_str, n_needle_bytes);
+			int n_haystack_chars =
+				sql_utf8_char_count(haystack_str, n_haystack_bytes);
+
+			if (n_haystack_chars < n_needle_chars) {
+				position = 0;
+				goto finish;
+			}
+			/*
+			 * Comparison window is determined by
+			 * beg_offset and end_offset. beg_offset
+			 * is offset in bytes from haystack
+			 * beginning to window beginning.
+			 * end_offset is offset in bytes from
+			 * haystack beginning to window end.
+			 */
+			int end_offset = 0;
+			for (int c = 0; c < n_needle_chars; c++) {
+				SQL_UTF8_FWD_1(haystack_str, end_offset,
+					       n_haystack_bytes);
+			}
+			int beg_offset = 0;
+			struct coll *coll = sqlGetFuncCollSeq(context);
+			int c;
+			for (c = 0; c + n_needle_chars <= n_haystack_chars; c++) {
+				if (coll->cmp((const char *) haystack_str + beg_offset,
+					      end_offset - beg_offset,
+					      (const char *) needle_str,
+					      n_needle_bytes, coll) == 0)
+					goto finish;
+				position++;
+				/* Update offsets. */
+				SQL_UTF8_FWD_1(haystack_str, beg_offset,
+					       n_haystack_bytes);
+				SQL_UTF8_FWD_1(haystack_str, end_offset,
+					       n_haystack_bytes);
+			}
+			/* Needle was not found in the haystack. */
+			position = 0;
 		}
-		if (nNeedle > nHaystack)
-			N = 0;
 	}
-	sql_result_int(context, N);
+finish:
+	sql_result_int(context, position);
 }
 
 /*
@@ -1756,7 +1832,8 @@ sqlRegisterBuiltinFunctions(void)
 			  FIELD_TYPE_STRING),
 		FUNCTION2(length, 1, 0, 0, lengthFunc, SQL_FUNC_LENGTH,
 			  FIELD_TYPE_INTEGER),
-		FUNCTION(instr, 2, 0, 0, instrFunc, FIELD_TYPE_INTEGER),
+		FUNCTION2(position, 2, 0, 1, position_func, SQL_FUNC_NEEDCOLL,
+			  FIELD_TYPE_INTEGER),
 		FUNCTION(printf, -1, 0, 0, printfFunc, FIELD_TYPE_STRING),
 		FUNCTION(unicode, 1, 0, 0, unicodeFunc, FIELD_TYPE_STRING),
 		FUNCTION(char, -1, 0, 0, charFunc, FIELD_TYPE_STRING),
diff --git a/src/box/sql/vdbeInt.h b/src/box/sql/vdbeInt.h
index c84f22caf..8af4fa359 100644
--- a/src/box/sql/vdbeInt.h
+++ b/src/box/sql/vdbeInt.h
@@ -607,4 +607,13 @@ char *
 sql_vdbe_mem_encode_tuple(struct Mem *fields, uint32_t field_count,
 			  uint32_t *tuple_size, struct region *region);
 
+/**
+ * Simple type to str convertor. It is used to simplify
+ * error reporting.
+ * @param mem Memory field which type we want to get.
+ * @retval String representing name of the type.
+ */
+char *
+mem_type_to_str(const struct Mem *p);
+
 #endif				/* !defined(SQL_VDBEINT_H) */
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 889fc5867..ddfd79163 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -2860,25 +2860,25 @@ test:do_execsql_test(
      SELECT GROUP_CONCAT(b, '') FROM t100;",
     {string.char(00,65,00,65,00)})
 
--- INSTR
+-- POSITION
 test:do_execsql_test(
     "func-73",
-    "SELECT INSTR(CHAR(00,65,00,66,00), CHAR(65));",
+    "SELECT POSITION(CHAR(65), CHAR(00,65,00,66,00));",
     {2})
 
 test:do_execsql_test(
     "func-74",
-    "SELECT INSTR(CHAR(00,65,00,66,00), CHAR(66));",
+    "SELECT POSITION(CHAR(66), CHAR(00,65,00,66,00));",
     {4})
 
 test:do_execsql_test(
     "func-75",
-    "SELECT INSTR(CHAR(00,65,00,66,00), CHAR(00));",
+    "SELECT POSITION(CHAR(00), CHAR(00,65,00,66,00));",
     {1})
 
 test:do_execsql_test(
     "func-76",
-    "SELECT INSTR(CHAR(65,66), CHAR(00));",
+    "SELECT POSITION(CHAR(00), CHAR(65,66));",
     {0})
 
 test:finish_test()
diff --git a/test/sql-tap/func5.test.lua b/test/sql-tap/func5.test.lua
index bb9e11a04..6605a2ba1 100755
--- a/test/sql-tap/func5.test.lua
+++ b/test/sql-tap/func5.test.lua
@@ -30,7 +30,7 @@ test:do_execsql_test(
         INSERT INTO t1 VALUES(3,'pqr','fuzzy',99);
         INSERT INTO t1 VALUES(4,'abcdefg','xy',22);
         INSERT INTO t1 VALUES(5,'shoe','mayer',2953);
-        SELECT x FROM t1 WHERE c=instr('abcdefg',b) OR a='abcdefg' ORDER BY +x;       
+        SELECT x FROM t1 WHERE c=position(b, 'abcdefg') OR a='abcdefg' ORDER BY +x;
     ]], {
         -- <func5-1.1>
         2, 4
@@ -40,7 +40,7 @@ test:do_execsql_test(
 test:do_execsql_test(
     "func5-1.2",
     [[
-        SELECT x FROM t1 WHERE a='abcdefg' OR c=instr('abcdefg',b) ORDER BY +x; 
+        SELECT x FROM t1 WHERE a='abcdefg' OR c=position(b, 'abcdefg') ORDER BY +x;
     ]], {
         -- <func5-1.1>
         2, 4
diff --git a/test/sql-tap/instr.test.lua b/test/sql-tap/instr.test.lua
deleted file mode 100755
index d42da5ccd..000000000
--- a/test/sql-tap/instr.test.lua
+++ /dev/null
@@ -1,706 +0,0 @@
-#!/usr/bin/env tarantool
-test = require("sqltester")
-test:plan(65)
-
---!./tcltestrunner.lua
--- 2012 October 24
---
--- The author disclaims copyright to this source code.  In place of
--- a legal notice, here is a blessing:
---
---    May you do good and not evil.
---    May you find forgiveness for yourself and forgive others.
---    May you share freely, never taking more than you give.
---
--------------------------------------------------------------------------
--- This file implements regression tests for sql library.  The
--- focus of this file is testing the built-in INSTR() functions.
---
--- EVIDENCE-OF: R-27549-59611 The instr(X,Y) function finds the first
--- occurrence of string Y within string X and returns the number of prior
--- characters plus 1, or 0 if Y is nowhere found within X.
---
--- ["set","testdir",[["file","dirname",["argv0"]]]]
--- ["source",[["testdir"],"\/tester.tcl"]]
--- Create a table to work with.
---
-test:do_test(
-    "instr-1.1",
-    function()
-        return test:execsql "SELECT instr('abcdefg','a');"
-    end, {
-        -- <instr-1.1>
-        1
-        -- </instr-1.1>
-    })
-
-test:do_test(
-    "instr-1.2",
-    function()
-        return test:execsql "SELECT instr('abcdefg','b');"
-    end, {
-        -- <instr-1.2>
-        2
-        -- </instr-1.2>
-    })
-
-test:do_test(
-    "instr-1.3",
-    function()
-        return test:execsql "SELECT instr('abcdefg','c');"
-    end, {
-        -- <instr-1.3>
-        3
-        -- </instr-1.3>
-    })
-
-test:do_test(
-    "instr-1.4",
-    function()
-        return test:execsql "SELECT instr('abcdefg','d');"
-    end, {
-        -- <instr-1.4>
-        4
-        -- </instr-1.4>
-    })
-
-test:do_test(
-    "instr-1.5",
-    function()
-        return test:execsql "SELECT instr('abcdefg','e');"
-    end, {
-        -- <instr-1.5>
-        5
-        -- </instr-1.5>
-    })
-
-test:do_test(
-    "instr-1.6",
-    function()
-        return test:execsql "SELECT instr('abcdefg','f');"
-    end, {
-        -- <instr-1.6>
-        6
-        -- </instr-1.6>
-    })
-
-test:do_test(
-    "instr-1.7",
-    function()
-        return test:execsql "SELECT instr('abcdefg','g');"
-    end, {
-        -- <instr-1.7>
-        7
-        -- </instr-1.7>
-    })
-
-test:do_test(
-    "instr-1.8",
-    function()
-        return test:execsql "SELECT instr('abcdefg','h');"
-    end, {
-        -- <instr-1.8>
-        0
-        -- </instr-1.8>
-    })
-
-test:do_test(
-    "instr-1.9",
-    function()
-        return test:execsql "SELECT instr('abcdefg','abcdefg');"
-    end, {
-        -- <instr-1.9>
-        1
-        -- </instr-1.9>
-    })
-
-test:do_test(
-    "instr-1.10",
-    function()
-        return test:execsql "SELECT instr('abcdefg','abcdefgh');"
-    end, {
-        -- <instr-1.10>
-        0
-        -- </instr-1.10>
-    })
-
-test:do_test(
-    "instr-1.11",
-    function()
-        return test:execsql "SELECT instr('abcdefg','bcdefg');"
-    end, {
-        -- <instr-1.11>
-        2
-        -- </instr-1.11>
-    })
-
-test:do_test(
-    "instr-1.12",
-    function()
-        return test:execsql "SELECT instr('abcdefg','bcdefgh');"
-    end, {
-        -- <instr-1.12>
-        0
-        -- </instr-1.12>
-    })
-
-test:do_test(
-    "instr-1.13",
-    function()
-        return test:execsql "SELECT instr('abcdefg','cdefg');"
-    end, {
-        -- <instr-1.13>
-        3
-        -- </instr-1.13>
-    })
-
-test:do_test(
-    "instr-1.14",
-    function()
-        return test:execsql "SELECT instr('abcdefg','cdefgh');"
-    end, {
-        -- <instr-1.14>
-        0
-        -- </instr-1.14>
-    })
-
-test:do_test(
-    "instr-1.15",
-    function()
-        return test:execsql "SELECT instr('abcdefg','defg');"
-    end, {
-        -- <instr-1.15>
-        4
-        -- </instr-1.15>
-    })
-
-test:do_test(
-    "instr-1.16",
-    function()
-        return test:execsql "SELECT instr('abcdefg','defgh');"
-    end, {
-        -- <instr-1.16>
-        0
-        -- </instr-1.16>
-    })
-
-test:do_test(
-    "instr-1.17",
-    function()
-        return test:execsql "SELECT instr('abcdefg','efg');"
-    end, {
-        -- <instr-1.17>
-        5
-        -- </instr-1.17>
-    })
-
-test:do_test(
-    "instr-1.18",
-    function()
-        return test:execsql "SELECT instr('abcdefg','efgh');"
-    end, {
-        -- <instr-1.18>
-        0
-        -- </instr-1.18>
-    })
-
-test:do_test(
-    "instr-1.19",
-    function()
-        return test:execsql "SELECT instr('abcdefg','fg');"
-    end, {
-        -- <instr-1.19>
-        6
-        -- </instr-1.19>
-    })
-
-test:do_test(
-    "instr-1.20",
-    function()
-        return test:execsql "SELECT instr('abcdefg','fgh');"
-    end, {
-        -- <instr-1.20>
-        0
-        -- </instr-1.20>
-    })
-
-test:do_test(
-    "instr-1.21",
-    function()
-        return test:execsql "SELECT coalesce(instr('abcdefg',NULL),'nil');"
-    end, {
-        -- <instr-1.21>
-        "nil"
-        -- </instr-1.21>
-    })
-
-test:do_test(
-    "instr-1.22",
-    function()
-        return test:execsql "SELECT coalesce(instr(NULL,'x'),'nil');"
-    end, {
-        -- <instr-1.22>
-        "nil"
-        -- </instr-1.22>
-    })
-
-test:do_test(
-    "instr-1.23",
-    function()
-        return test:execsql "SELECT instr(12345,34);"
-    end, {
-        -- <instr-1.23>
-        3
-        -- </instr-1.23>
-    })
-
-test:do_test(
-    "instr-1.24",
-    function()
-        return test:execsql "SELECT instr(123456.78,34);"
-    end, {
-        -- <instr-1.24>
-        3
-        -- </instr-1.24>
-    })
-
-test:do_test(
-    "instr-1.25",
-    function()
-        return test:execsql "SELECT instr(123456.78,x'3334');"
-    end, {
-        -- <instr-1.25>
-        3
-        -- </instr-1.25>
-    })
-
-test:do_test(
-    "instr-1.26",
-    function()
-        return test:execsql "SELECT instr('äbcdefg','efg');"
-    end, {
-        -- <instr-1.26>
-        5
-        -- </instr-1.26>
-    })
-
-test:do_test(
-    "instr-1.27",
-    function()
-        return test:execsql "SELECT instr('€xyzzy','xyz');"
-    end, {
-        -- <instr-1.27>
-        2
-        -- </instr-1.27>
-    })
-
-test:do_test(
-    "instr-1.28",
-    function()
-        return test:execsql "SELECT instr('abc€xyzzy','xyz');"
-    end, {
-        -- <instr-1.28>
-        5
-        -- </instr-1.28>
-    })
-
-test:do_test(
-    "instr-1.29",
-    function()
-        return test:execsql "SELECT instr('abc€xyzzy','€xyz');"
-    end, {
-        -- <instr-1.29>
-        4
-        -- </instr-1.29>
-    })
-
-test:do_test(
-    "instr-1.30",
-    function()
-        return test:execsql "SELECT instr('abc€xyzzy','c€xyz');"
-    end, {
-        -- <instr-1.30>
-        3
-        -- </instr-1.30>
-    })
-
-test:do_test(
-    "instr-1.31",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'01');"
-    end, {
-        -- <instr-1.31>
-        1
-        -- </instr-1.31>
-    })
-
-test:do_test(
-    "instr-1.32",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'02');"
-    end, {
-        -- <instr-1.32>
-        2
-        -- </instr-1.32>
-    })
-
-test:do_test(
-    "instr-1.33",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'03');"
-    end, {
-        -- <instr-1.33>
-        3
-        -- </instr-1.33>
-    })
-
-test:do_test(
-    "instr-1.34",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'04');"
-    end, {
-        -- <instr-1.34>
-        4
-        -- </instr-1.34>
-    })
-
-test:do_test(
-    "instr-1.35",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'05');"
-    end, {
-        -- <instr-1.35>
-        5
-        -- </instr-1.35>
-    })
-
-test:do_test(
-    "instr-1.36",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'06');"
-    end, {
-        -- <instr-1.36>
-        0
-        -- </instr-1.36>
-    })
-
-test:do_test(
-    "instr-1.37",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'0102030405');"
-    end, {
-        -- <instr-1.37>
-        1
-        -- </instr-1.37>
-    })
-
-test:do_test(
-    "instr-1.38",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'02030405');"
-    end, {
-        -- <instr-1.38>
-        2
-        -- </instr-1.38>
-    })
-
-test:do_test(
-    "instr-1.39",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'030405');"
-    end, {
-        -- <instr-1.39>
-        3
-        -- </instr-1.39>
-    })
-
-test:do_test(
-    "instr-1.40",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'0405');"
-    end, {
-        -- <instr-1.40>
-        4
-        -- </instr-1.40>
-    })
-
-test:do_test(
-    "instr-1.41",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'0506');"
-    end, {
-        -- <instr-1.41>
-        0
-        -- </instr-1.41>
-    })
-
-test:do_test(
-    "instr-1.42",
-    function()
-        return test:execsql "SELECT instr(x'0102030405',x'');"
-    end, {
-        -- <instr-1.42>
-        1
-        -- </instr-1.42>
-    })
-
-test:do_test(
-    "instr-1.43",
-    function()
-        return test:execsql "SELECT instr(x'',x'');"
-    end, {
-        -- <instr-1.43>
-        1
-        -- </instr-1.43>
-    })
-
-test:do_test(
-    "instr-1.44",
-    function()
-        return test:execsql "SELECT instr('','');"
-    end, {
-        -- <instr-1.44>
-        1
-        -- </instr-1.44>
-    })
-
-test:do_test(
-    "instr-1.45",
-    function()
-        return test:execsql "SELECT instr('abcdefg','');"
-    end, {
-        -- <instr-1.45>
-        1
-        -- </instr-1.45>
-    })
-
--- ["unset","-nocomplain","longstr"]
-local longstr = "abcdefghijklmonpqrstuvwxyz"
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
-longstr = longstr .. longstr
--- puts [string length '"..longstr.."']
-longstr = longstr .. "Xabcde"
-test:do_test(
-    "instr-1.46",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','X');")
-    end, {
-        -- <instr-1.46>
-        106497
-        -- </instr-1.46>
-    })
-
-test:do_test(
-    "instr-1.47",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','Y');")
-    end, {
-        -- <instr-1.47>
-        0
-        -- </instr-1.47>
-    })
-
-test:do_test(
-    "instr-1.48",
-    function()
-        return test:execsql( "SELECT instr('"..longstr.."','Xa');")
-    end, {
-        -- <instr-1.48>
-        106497
-        -- </instr-1.48>
-    })
-
-test:do_test(
-    "instr-1.49",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','zXa');")
-    end, {
-        -- <instr-1.49>
-        106496
-        -- </instr-1.49>
-    })
-
-longstr = string.gsub(longstr, "a", "ä")
-test:do_test(
-    "instr-1.50",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','X');")
-    end, {
-        -- <instr-1.50>
-        106497
-        -- </instr-1.50>
-    })
-
-test:do_test(
-    "instr-1.51",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','Y');")
-    end, {
-        -- <instr-1.51>
-        0
-        -- </instr-1.51>
-    })
-
-test:do_test(
-    "instr-1.52",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','Xä');")
-    end, {
-        -- <instr-1.52>
-        106497
-        -- </instr-1.52>
-    })
-
-test:do_test(
-    "instr-1.53",
-    function()
-        return test:execsql("SELECT instr('"..longstr.."','zXä');")
-    end, {
-        -- <instr-1.53>
-        106496
-        -- </instr-1.53>
-    })
-
-test:do_test(
-    "instr-1.54",
-    function()
-        return test:execsql("SELECT instr(x'78c3a4e282ac79','x');")
-    end, {
-        -- <instr-1.54>
-        1
-        -- </instr-1.54>
-    })
-
-test:do_test(
-    "instr-1.55",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79','y');"
-    end, {
-        -- <instr-1.55>
-        4
-        -- </instr-1.55>
-    })
-
--- EVIDENCE-OF: R-46421-32541 Or, if X and Y are both BLOBs, then
--- instr(X,Y) returns one more than the number bytes prior to the first
--- occurrence of Y, or 0 if Y does not occur anywhere within X.
---
-test:do_test(
-    "instr-1.56.1",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79',x'79');"
-    end, {
-        -- <instr-1.56.1>
-        7
-        -- </instr-1.56.1>
-    })
-
-test:do_test(
-    "instr-1.56.2",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79',x'7a');"
-    end, {
-        -- <instr-1.56.2>
-        0
-        -- </instr-1.56.2>
-    })
-
-test:do_test(
-    "instr-1.56.3",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79',x'78');"
-    end, {
-        -- <instr-1.56.3>
-        1
-        -- </instr-1.56.3>
-    })
-
-test:do_test(
-    "instr-1.56.3",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79',x'a4');"
-    end, {
-        -- <instr-1.56.3>
-        3
-        -- </instr-1.56.3>
-    })
-
--- EVIDENCE-OF: R-17329-35644 If both arguments X and Y to instr(X,Y) are
--- non-NULL and are not BLOBs then both are interpreted as strings.
---
-test:do_test(
-    "instr-1.57.1",
-    function()
-        return test:execsql "SELECT instr('xä€y',x'79');"
-    end, {
-        -- <instr-1.57.1>
-        4
-        -- </instr-1.57.1>
-    })
-
-test:do_test(
-    "instr-1.57.2",
-    function()
-        return test:execsql "SELECT instr('xä€y',x'a4');"
-    end, {
-        -- <instr-1.57.2>
-        0
-        -- </instr-1.57.2>
-    })
-
-test:do_test(
-    "instr-1.57.3",
-    function()
-        return test:execsql "SELECT instr(x'78c3a4e282ac79','y');"
-    end, {
-        -- <instr-1.57.3>
-        4
-        -- </instr-1.57.3>
-    })
-
--- EVIDENCE-OF: R-14708-27487 If either X or Y are NULL in instr(X,Y)
--- then the result is NULL.
---
-test:do_execsql_test(
-    "instr-1.60",
-    [[
-        SELECT coalesce(instr(NULL,'abc'), 999);
-    ]], {
-        -- <instr-1.60>
-        999
-        -- </instr-1.60>
-    })
-
-test:do_execsql_test(
-    "instr-1.61",
-    [[
-        SELECT coalesce(instr('abc',NULL), 999);
-    ]], {
-        -- <instr-1.61>
-        999
-        -- </instr-1.61>
-    })
-
-test:do_execsql_test(
-    "instr-1.62",
-    [[
-        SELECT coalesce(instr(NULL,NULL), 999);
-    ]], {
-        -- <instr-1.62>
-        999
-        -- </instr-1.62>
-    })
-
-
-
-test:finish_test()
diff --git a/test/sql-tap/position.test.lua b/test/sql-tap/position.test.lua
new file mode 100755
index 000000000..70d11444c
--- /dev/null
+++ b/test/sql-tap/position.test.lua
@@ -0,0 +1,905 @@
+#!/usr/bin/env tarantool
+test = require("sqltester")
+test:plan(80)
+
+--!./tcltestrunner.lua
+-- 2012 October 24
+--
+-- The author disclaims copyright to this source code.  In place of
+-- a legal notice, here is a blessing:
+--
+--    May you do good and not evil.
+--    May you find forgiveness for yourself and forgive others.
+--    May you share freely, never taking more than you give.
+--
+-------------------------------------------------------------------------
+-- This file implements regression tests for sql library.  The
+-- focus of this file is testing the built-in POSITION() functions.
+--
+-- EVIDENCE-OF: R-27549-59611 The position(X,Y) function finds the first
+-- occurrence of string X within string Y and returns the number of prior
+-- characters plus 1, or 0 if X is nowhere found within Y.
+--
+-- ["set","testdir",[["file","dirname",["argv0"]]]]
+-- ["source",[["testdir"],"\/tester.tcl"]]
+-- Create a table to work with.
+--
+test:do_test(
+    "position-1.1",
+    function()
+        return test:execsql "SELECT position('a', 'abcdefg');"
+    end, {
+        -- <position-1.1>
+        1
+        -- </position-1.1>
+    })
+
+test:do_test(
+    "position-1.2",
+    function()
+        return test:execsql "SELECT position('b', 'abcdefg');"
+    end, {
+        -- <position-1.2>
+        2
+        -- </position-1.2>
+    })
+
+test:do_test(
+    "position-1.3",
+    function()
+        return test:execsql "SELECT position('c', 'abcdefg');"
+    end, {
+        -- <position-1.3>
+        3
+        -- </position-1.3>
+    })
+
+test:do_test(
+    "position-1.4",
+    function()
+        return test:execsql "SELECT position('d', 'abcdefg');"
+    end, {
+        -- <position-1.4>
+        4
+        -- </position-1.4>
+    })
+
+test:do_test(
+    "position-1.5",
+    function()
+        return test:execsql "SELECT position('e', 'abcdefg');"
+    end, {
+        -- <position-1.5>
+        5
+        -- </position-1.5>
+    })
+
+test:do_test(
+    "position-1.6",
+    function()
+        return test:execsql "SELECT position('f', 'abcdefg');"
+    end, {
+        -- <position-1.6>
+        6
+        -- </position-1.6>
+    })
+
+test:do_test(
+    "position-1.7",
+    function()
+        return test:execsql "SELECT position('g', 'abcdefg');"
+    end, {
+        -- <position-1.7>
+        7
+        -- </position-1.7>
+    })
+
+test:do_test(
+    "position-1.8",
+    function()
+        return test:execsql "SELECT position('h', 'abcdefg');"
+    end, {
+        -- <position-1.8>
+        0
+        -- </position-1.8>
+    })
+
+test:do_test(
+    "position-1.9",
+    function()
+        return test:execsql "SELECT position('abcdefg', 'abcdefg');"
+    end, {
+        -- <position-1.9>
+        1
+        -- </position-1.9>
+    })
+
+test:do_test(
+    "position-1.10",
+    function()
+        return test:execsql "SELECT position('abcdefgh', 'abcdefg');"
+    end, {
+        -- <position-1.10>
+        0
+        -- </position-1.10>
+    })
+
+test:do_test(
+    "position-1.11",
+    function()
+        return test:execsql "SELECT position('bcdefg', 'abcdefg');"
+    end, {
+        -- <position-1.11>
+        2
+        -- </position-1.11>
+    })
+
+test:do_test(
+    "position-1.12",
+    function()
+        return test:execsql "SELECT position('bcdefgh', 'abcdefg');"
+    end, {
+        -- <position-1.12>
+        0
+        -- </position-1.12>
+    })
+
+test:do_test(
+    "position-1.13",
+    function()
+        return test:execsql "SELECT position('cdefg', 'abcdefg');"
+    end, {
+        -- <position-1.13>
+        3
+        -- </position-1.13>
+    })
+
+test:do_test(
+    "position-1.14",
+    function()
+        return test:execsql "SELECT position('cdefgh', 'abcdefg');"
+    end, {
+        -- <position-1.14>
+        0
+        -- </position-1.14>
+    })
+
+test:do_test(
+    "position-1.15",
+    function()
+        return test:execsql "SELECT position('defg', 'abcdefg');"
+    end, {
+        -- <position-1.15>
+        4
+        -- </position-1.15>
+    })
+
+test:do_test(
+    "position-1.16",
+    function()
+        return test:execsql "SELECT position('defgh', 'abcdefg');"
+    end, {
+        -- <position-1.16>
+        0
+        -- </position-1.16>
+    })
+
+test:do_test(
+    "position-1.17",
+    function()
+        return test:execsql "SELECT position('efg', 'abcdefg');"
+    end, {
+        -- <position-1.17>
+        5
+        -- </position-1.17>
+    })
+
+test:do_test(
+    "position-1.18",
+    function()
+        return test:execsql "SELECT position('efgh', 'abcdefg');"
+    end, {
+        -- <position-1.18>
+        0
+        -- </position-1.18>
+    })
+
+test:do_test(
+    "position-1.19",
+    function()
+        return test:execsql "SELECT position('fg', 'abcdefg');"
+    end, {
+        -- <position-1.19>
+        6
+        -- </position-1.19>
+    })
+
+test:do_test(
+    "position-1.20",
+    function()
+        return test:execsql "SELECT position('fgh', 'abcdefg');"
+    end, {
+        -- <position-1.20>
+        0
+        -- </position-1.20>
+    })
+
+test:do_test(
+    "position-1.21",
+    function()
+        return test:execsql "SELECT coalesce(position(NULL, 'abcdefg'), 'nil');"
+    end, {
+        -- <position-1.21>
+        "nil"
+        -- </position-1.21>
+    })
+
+test:do_test(
+    "position-1.22",
+    function()
+        return test:execsql "SELECT coalesce(position('x', NULL), 'nil');"
+    end, {
+        -- <position-1.22>
+        "nil"
+        -- </position-1.22>
+    })
+
+test:do_test(
+    "position-1.23",
+    function()
+        return test:catchsql "SELECT position(34, 12345);"
+    end, {
+        -- <position-1.23>
+        1, "Inconsistent types: expected TEXT or BLOB got INTEGER"
+        -- </position-1.23>
+    })
+
+test:do_test(
+    "position-1.24",
+    function()
+        return test:catchsql "SELECT position(34, 123456.78);"
+    end, {
+        -- <position-1.24>
+        1, "Inconsistent types: expected TEXT or BLOB got REAL"
+        -- </position-1.24>
+    })
+
+test:do_test(
+    "position-1.25",
+    function()
+        return test:catchsql "SELECT position(x'3334', 123456.78);"
+    end, {
+        -- <position-1.25>
+        1, "Inconsistent types: expected TEXT or BLOB got REAL"
+        -- </position-1.25>
+    })
+
+test:do_test(
+    "position-1.26",
+    function()
+        return test:execsql "SELECT position('efg', 'äbcdefg');"
+    end, {
+        -- <position-1.26>
+        5
+        -- </position-1.26>
+    })
+
+test:do_test(
+    "position-1.27",
+    function()
+        return test:execsql "SELECT position('xyz', '€xyzzy');"
+    end, {
+        -- <position-1.27>
+        2
+        -- </position-1.27>
+    })
+
+test:do_test(
+    "position-1.28",
+    function()
+        return test:execsql "SELECT position('xyz', 'abc€xyzzy');"
+    end, {
+        -- <position-1.28>
+        5
+        -- </position-1.28>
+    })
+
+test:do_test(
+    "position-1.29",
+    function()
+        return test:execsql "SELECT position('€xyz', 'abc€xyzzy');"
+    end, {
+        -- <position-1.29>
+        4
+        -- </position-1.29>
+    })
+
+test:do_test(
+    "position-1.30",
+    function()
+        return test:execsql "SELECT position('c€xyz', 'abc€xyzzy');"
+    end, {
+        -- <position-1.30>
+        3
+        -- </position-1.30>
+    })
+
+test:do_test(
+    "position-1.31",
+    function()
+        return test:execsql "SELECT position(x'01', x'0102030405');"
+    end, {
+        -- <position-1.31>
+        1
+        -- </position-1.31>
+    })
+
+test:do_test(
+    "position-1.32",
+    function()
+        return test:execsql "SELECT position(x'02', x'0102030405');"
+    end, {
+        -- <position-1.32>
+        2
+        -- </position-1.32>
+    })
+
+test:do_test(
+    "position-1.33",
+    function()
+        return test:execsql "SELECT position(x'03', x'0102030405');"
+    end, {
+        -- <position-1.33>
+        3
+        -- </position-1.33>
+    })
+
+test:do_test(
+    "position-1.34",
+    function()
+        return test:execsql "SELECT position(x'04', x'0102030405');"
+    end, {
+        -- <position-1.34>
+        4
+        -- </position-1.34>
+    })
+
+test:do_test(
+    "position-1.35",
+    function()
+        return test:execsql "SELECT position(x'05', x'0102030405');"
+    end, {
+        -- <position-1.35>
+        5
+        -- </position-1.35>
+    })
+
+test:do_test(
+    "position-1.36",
+    function()
+        return test:execsql "SELECT position(x'06', x'0102030405');"
+    end, {
+        -- <position-1.36>
+        0
+        -- </position-1.36>
+    })
+
+test:do_test(
+    "position-1.37",
+    function()
+        return test:execsql "SELECT position(x'0102030405', x'0102030405');"
+    end, {
+        -- <position-1.37>
+        1
+        -- </position-1.37>
+    })
+
+test:do_test(
+    "position-1.38",
+    function()
+        return test:execsql "SELECT position(x'02030405', x'0102030405');"
+    end, {
+        -- <position-1.38>
+        2
+        -- </position-1.38>
+    })
+
+test:do_test(
+    "position-1.39",
+    function()
+        return test:execsql "SELECT position(x'030405', x'0102030405');"
+    end, {
+        -- <position-1.39>
+        3
+        -- </position-1.39>
+    })
+
+test:do_test(
+    "position-1.40",
+    function()
+        return test:execsql "SELECT position(x'0405', x'0102030405');"
+    end, {
+        -- <position-1.40>
+        4
+        -- </position-1.40>
+    })
+
+test:do_test(
+    "position-1.41",
+    function()
+        return test:execsql "SELECT position(x'0506', x'0102030405');"
+    end, {
+        -- <position-1.41>
+        0
+        -- </position-1.41>
+    })
+
+test:do_test(
+    "position-1.42",
+    function()
+        return test:execsql "SELECT position(x'', x'0102030405');"
+    end, {
+        -- <position-1.42>
+        1
+        -- </position-1.42>
+    })
+
+test:do_test(
+    "position-1.43",
+    function()
+        return test:execsql "SELECT position(x'', x'');"
+    end, {
+        -- <position-1.43>
+        1
+        -- </position-1.43>
+    })
+
+test:do_test(
+    "position-1.44",
+    function()
+        return test:execsql "SELECT position('', '');"
+    end, {
+        -- <position-1.44>
+        1
+        -- </position-1.44>
+    })
+
+test:do_test(
+    "position-1.45",
+    function()
+        return test:execsql "SELECT position('', 'abcdefg');"
+    end, {
+        -- <position-1.45>
+        1
+        -- </position-1.45>
+    })
+
+-- ["unset","-nocomplain","longstr"]
+local longstr = "abcdefghijklmonpqrstuvwxyz"
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+longstr = longstr .. longstr
+-- puts [string length '"..longstr.."']
+longstr = longstr .. "Xabcde"
+test:do_test(
+    "position-1.46",
+    function()
+        return test:execsql("SELECT position('X', '"..longstr.."');")
+    end, {
+        -- <position-1.46>
+        106497
+        -- </position-1.46>
+    })
+
+test:do_test(
+    "position-1.47",
+    function()
+        return test:execsql("SELECT position('Y', '"..longstr.."');")
+    end, {
+        -- <position-1.47>
+        0
+        -- </position-1.47>
+    })
+
+test:do_test(
+    "position-1.48",
+    function()
+        return test:execsql( "SELECT position('Xa', '"..longstr.."');")
+    end, {
+        -- <position-1.48>
+        106497
+        -- </position-1.48>
+    })
+
+test:do_test(
+    "position-1.49",
+    function()
+        return test:execsql("SELECT position('zXa', '"..longstr.."');")
+    end, {
+        -- <position-1.49>
+        106496
+        -- </position-1.49>
+    })
+
+longstr = string.gsub(longstr, "a", "ä")
+test:do_test(
+    "position-1.50",
+    function()
+        return test:execsql("SELECT position('X', '"..longstr.."');")
+    end, {
+        -- <position-1.50>
+        106497
+        -- </position-1.50>
+    })
+
+test:do_test(
+    "position-1.51",
+    function()
+        return test:execsql("SELECT position('Y', '"..longstr.."');")
+    end, {
+        -- <position-1.51>
+        0
+        -- </position-1.51>
+    })
+
+test:do_test(
+    "position-1.52",
+    function()
+        return test:execsql("SELECT position('Xä', '"..longstr.."');")
+    end, {
+        -- <position-1.52>
+        106497
+        -- </position-1.52>
+    })
+
+test:do_test(
+    "position-1.53",
+    function()
+        return test:execsql("SELECT position('zXä', '"..longstr.."');")
+    end, {
+        -- <position-1.53>
+        106496
+        -- </position-1.53>
+    })
+
+test:do_test(
+    "position-1.54",
+    function()
+        return test:catchsql("SELECT position('x', x'78c3a4e282ac79');")
+    end, {
+        -- <position-1.54>
+        1, "Inconsistent types: expected TEXT got BLOB"
+        -- </position-1.54>
+    })
+
+test:do_test(
+    "position-1.55",
+    function()
+        return test:catchsql "SELECT position('y', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.55>
+        1, "Inconsistent types: expected TEXT got BLOB"
+        -- </position-1.55>
+    })
+
+-- EVIDENCE-OF: R-46421-32541 Or, if X and Y are both BLOBs, then
+-- position(X,Y) returns one more than the number bytes prior to the first
+-- occurrence of X, or 0 if X does not occur anywhere within Y.
+--
+test:do_test(
+    "position-1.56.1",
+    function()
+        return test:execsql "SELECT position(x'79', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.56.1>
+        7
+        -- </position-1.56.1>
+    })
+
+test:do_test(
+    "position-1.56.2",
+    function()
+        return test:execsql "SELECT position(x'7a', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.56.2>
+        0
+        -- </position-1.56.2>
+    })
+
+test:do_test(
+    "position-1.56.3",
+    function()
+        return test:execsql "SELECT position(x'78', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.56.3>
+        1
+        -- </position-1.56.3>
+    })
+
+test:do_test(
+    "position-1.56.3",
+    function()
+        return test:execsql "SELECT position(x'a4', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.56.3>
+        3
+        -- </position-1.56.3>
+    })
+
+-- EVIDENCE-OF: R-17329-35644 If both arguments X and Y to position(X,Y) are
+-- non-NULL and are not BLOBs then both are interpreted as strings.
+--
+test:do_test(
+    "position-1.57.1",
+    function()
+        return test:catchsql "SELECT position(x'79', 'xä€y');"
+    end, {
+        -- <position-1.57.1>
+        1, "Inconsistent types: expected BLOB got TEXT"
+        -- </position-1.57.1>
+    })
+
+test:do_test(
+    "position-1.57.2",
+    function()
+        return test:catchsql "SELECT position(x'a4', 'xä€y');"
+    end, {
+        -- <position-1.57.2>
+        1, "Inconsistent types: expected BLOB got TEXT"
+        -- </position-1.57.2>
+    })
+
+test:do_test(
+    "position-1.57.3",
+    function()
+        return test:catchsql "SELECT position('y', x'78c3a4e282ac79');"
+    end, {
+        -- <position-1.57.3>
+        1, "Inconsistent types: expected TEXT got BLOB"
+        -- </position-1.57.3>
+    })
+
+-- EVIDENCE-OF: R-14708-27487 If either X or Y are NULL in position(X,Y)
+-- then the result is NULL.
+--
+test:do_execsql_test(
+    "position-1.60",
+    [[
+        SELECT coalesce(position(NULL, 'abc'), 999);
+    ]], {
+        -- <position-1.60>
+        999
+        -- </position-1.60>
+    })
+
+test:do_execsql_test(
+    "position-1.61",
+    [[
+        SELECT coalesce(position('abc', NULL), 999);
+    ]], {
+        -- <position-1.61>
+        999
+        -- </position-1.61>
+    })
+
+test:do_execsql_test(
+    "position-1.62",
+    [[
+        SELECT coalesce(position(NULL, NULL), 999);
+    ]], {
+        -- <position-1.62>
+        999
+        -- </position-1.62>
+    })
+
+-- Tests that make use of collations.
+-- A short remainder of how "unicode" and "unicode_ci" collations
+-- works:
+-- unicode_ci: „a“ = „A“ = „á“ = „Á“.
+-- unicode: all symbols above are pairwise unequal.
+
+-- Collation is set in space format.
+
+test:do_execsql_test(
+    "position-1.63",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('à');
+        SELECT POSITION('a', s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        1
+    }
+)
+
+test:do_execsql_test(
+    "position-1.64",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrty');
+        SELECT POSITION('er', s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        3
+    }
+)
+
+test:do_execsql_test(
+    "position-1.65",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrtÿ');
+        SELECT POSITION('y', s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        6
+    }
+)
+
+-- Collation is set in space format and also in position() -
+-- for haystack (string) only.
+
+test:do_execsql_test(
+    "position-1.66",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('à');
+        SELECT POSITION('a', s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.67",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrty');
+        SELECT POSITION('er', s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.68",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrtÿ');
+        SELECT POSITION('Y', s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+-- Collation is set in space format and also in position () -
+-- for needle (string) only.
+
+test:do_execsql_test(
+    "position-1.69",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('à');
+        SELECT POSITION('a' COLLATE "unicode", s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.70",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrty');
+        SELECT POSITION('er' COLLATE "unicode", s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.71",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrtÿ');
+        SELECT POSITION('Y' COLLATE "unicode", s1) FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+-- Collation is set in space format and also in position() -
+-- for both arguments. Arguments have the same collations.
+
+test:do_execsql_test(
+    "position-1.72",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('à');
+        SELECT POSITION('a' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.73",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrty');
+        SELECT POSITION('er' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+test:do_execsql_test(
+    "position-1.74",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrtÿ');
+        SELECT POSITION('Y'COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        0
+    }
+)
+
+-- Collation is set in space format and also in position() -
+-- for both arguments. Arguments have different explicit
+-- collations thus an error is expected.
+
+test:do_catchsql_test(
+    "position-1.75",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('à');
+        SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
+    ]], {
+        1, "Illegal mix of collations"
+    }
+)
+
+test:do_catchsql_test(
+    "position-1.76",
+    [[
+        DROP TABLE test1;
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrty');
+        SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
+        DROP TABLE test1;
+    ]], {
+        1, "Illegal mix of collations"
+    }
+)
+
+test:do_catchsql_test(
+    "position-1.77",
+    [[
+        DROP TABLE test1;
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES('qwèrtÿ');
+        SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
+    ]], {
+        1, "Illegal mix of collations"
+    }
+)
+
+test:finish_test()
diff --git a/test/sql-tap/with1.test.lua b/test/sql-tap/with1.test.lua
index 16c9b1207..ecc16755e 100755
--- a/test/sql-tap/with1.test.lua
+++ b/test/sql-tap/with1.test.lua
@@ -595,11 +595,11 @@ test:do_execsql_test("8.2-soduko", [[
 
     /* The tricky bit. */
     x(s, ind) AS (
-      SELECT sud, instr(sud, '.') FROM input
+      SELECT sud, position('.', sud) FROM input
       UNION ALL
       SELECT
         substr(s, 1, ind-1) || z || substr(s, ind+1),
-        instr( substr(s, 1, ind-1) || z || substr(s, ind+1), '.' )
+        position('.', substr(s, 1, ind-1) || z || substr(s, ind+1))
        FROM x, digits AS z
       WHERE ind>0
         AND NOT EXISTS (
-- 
2.20.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: rename instr to position & add collation usage
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
@ 2019-03-20 12:59   ` i.koptelov
  2019-03-26 12:32   ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik
  1 sibling, 0 replies; 13+ messages in thread
From: i.koptelov @ 2019-03-20 12:59 UTC (permalink / raw)
  To: tarantool-patches; +Cc: n.pettik

Sorry, I have missed one case. (already squashed):
---
 src/box/sql/expr.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 53926e3f2..8c1889d8a 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -4102,8 +4102,16 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 			 * is done using ANSI rules from
 			 * collations_check_compatibility().
 			 */
+			if (nFarg == 1) {
+				bool unused;
+				uint32_t id;
+				if (sql_expr_coll(pParse,
+						  pFarg->a[0].pExpr, &unused,
+						  &id, &coll) != 0)
+					return 0;
+			}
 			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
-			     coll == NULL) {
+			     coll == NULL && nFarg > 1) {
 				struct coll *unused = NULL;
 				uint32_t curr_id = COLL_NONE;
 				bool is_curr_forced = false;
-- 
2.20.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
@ 2019-03-25 19:26   ` n.pettik
  2019-03-27 13:38     ` i.koptelov
  0 siblings, 1 reply; 13+ messages in thread
From: n.pettik @ 2019-03-25 19:26 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Ivan Koptelov



> On 20 Mar 2019, at 14:11, Ivan Koptelov <ivan.koptelov@tarantool.org> wrote:
> 
> Before the patch determination of collation in SQL functions
> (used only in instr) was too narrow: the arguments were scanned
> from left to right, till the argument with collation was
> encountered, then its collation was used.
> Now every arguments collation is considered. The right collation
> which would be used in function is determined using ANSI
> compatibility rules ("type combination" rules).
> 
> Note: currently only instr() a.k.a position() uses mechanism
> described above, other functions (except aggregate) simply
> ignores collations.

That’s not true: I see that min-max aggregates also feature this flag:

FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR),

Fourth param indicates whether SQL_FUNC_NEEDCOLL is set or not.

> ---
> src/box/sql/expr.c   | 69 ++++++++++++++++++++++++++++++++++++++++----
> src/box/sql/sqlInt.h |  8 ++++-
> 2 files changed, 70 insertions(+), 7 deletions(-)
> 
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index a2c70935e..2f48d90c6 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -4093,16 +4093,73 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 					testcase(i == 31);
> 					constMask |= MASKBIT32(i);
> 				}
> -				if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) !=
> -				    0 && coll == NULL) {
> -					bool unused;
> -					uint32_t id;
> +			}
> +			/*
> +			 * Function arguments may have different
> +			 * collations. The following code
> +			 * checks if they are compatible and
> +			 * finds the collation to be used. This
> +			 * is done using ANSI rules from
> +			 * collations_check_compatibility().
> +			 */
> +			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
> +			     coll == NULL) {
> +				struct coll *unused = NULL;
> +				uint32_t curr_id = COLL_NONE;
> +				bool is_curr_forced = false;
> +
> +				uint32_t temp_id = COLL_NONE;
> +				bool is_temp_forced = false;
> +
> +				uint32_t lhs_id = COLL_NONE;
> +				bool is_lhs_forced = false;
> +
> +				uint32_t rhs_id = COLL_NONE;
> +				bool is_rhs_forced = false;
> +
> +				for (int i = 0; i < nFarg; i++) {
> 					if (sql_expr_coll(pParse,
> 							  pFarg->a[i].pExpr,
> -							  &unused, &id,
> -							  &coll) != 0)
> +							  &is_lhs_forced,
> +							  &lhs_id,
> +							  &unused) != 0)
> 						return 0;
> +
> +					for (int j = i + 1; j < nFarg; j++) {
> +						if (sql_expr_coll(pParse,
> +								  pFarg->a[j].pExpr,
> +								  &is_rhs_forced,
> +								  &rhs_id,
> +								  &unused) != 0)
> +							return 0;

Seems like you need only one pass saving resulting collation.
Resulting collation shouldn’t depend on way of passing through
arguments. And second call of collations_check_copatiility() is
redundant as well. Now you are using 2n^2 calls of this function,
but n is enough: you compare collation of first argument with
second one and save result in tmp. Then, compare tmp with
third argument etc.

> +
> +						if (collations_check_compatibility(
> +							lhs_id, is_lhs_forced,
> +							rhs_id, is_rhs_forced,
> +							&temp_id) != 0) {
> +							pParse->rc =
> +								SQL_TARANTOOL_ERROR;
> +							pParse->nErr++;
> +							return 0;
> +						}
> +
> +						is_temp_forced = (temp_id ==
> +								  lhs_id) ?
> +								 is_lhs_forced :
> +								 is_rhs_forced;
> +
> +						if (collations_check_compatibility(
> +							curr_id, is_curr_forced,
> +							temp_id, is_temp_forced,
> +							&curr_id) != 0) {
> +							pParse->rc =
> +								SQL_TARANTOOL_ERROR;
> +							pParse->nErr++;
> +							return 0;
> +						}
> +					}
> 				}
> +				coll = coll_by_id(curr_id)->coll;
> 			}
> 			if (pFarg) {
> 				if (constMask) {
> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
> index 8967ea3e0..47ee474bb 100644
> --- a/src/box/sql/sqlInt.h
> +++ b/src/box/sql/sqlInt.h
> @@ -1660,7 +1660,13 @@ struct FuncDestructor {
> #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
> #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
> #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
> -#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
> +#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
> +					 * The flag is set when the collation
> +					 * of function arguments should be
> +					 * determined, using rules in
> +					 * collations_check_compatibility()
> +					 * function.
> +					 */
> #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
> #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
> #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
> -- 

Please, provide basic test cases involving one or more built-in
functions and incompatible arguments (at least min-max funcs use it).
Moreover, this flag can’t be set for user-defined functions, which is pretty sad.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 2/2] sql: rename instr to position & add collation usage
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
  2019-03-20 12:59   ` [tarantool-patches] Re: [PATCH 1/2] " i.koptelov
@ 2019-03-26 12:32   ` n.pettik
  2019-03-27 13:39     ` i.koptelov
  1 sibling, 1 reply; 13+ messages in thread
From: n.pettik @ 2019-03-26 12:32 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Ivan Koptelov

I’ve pushed a couple of code-style fixes. Diff:

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 6654d4998..f9c0be97e 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -212,54 +212,53 @@ absFunc(sql_context * context, int argc, sql_value ** argv)
        }
 }
 
-/*
+/**
  * Implementation of the position() function.
  *
  * position(needle, haystack) finds the first occurrence of needle
- * in haystack and returns the number of previous characters plus 1,
- * or 0 if needle does not occur within haystack.
+ * in haystack and returns the number of previous characters
+ * plus 1, or 0 if needle does not occur within haystack.
  *
- * If both haystack and needle are BLOBs, then the result is one more than
- * the number of bytes in haystack prior to the first occurrence of needle,
- * or 0 if needle never occurs in haystack.
+ * If both haystack and needle are BLOBs, then the result is one
+ * more than the number of bytes in haystack prior to the first
+ * occurrence of needle, or 0 if needle never occurs in haystack.
  */
 static void
-position_func(sql_context * context, int argc, sql_value ** argv)
+position_func(struct sql_context *context, int argc, struct Mem **argv)
 {
        UNUSED_PARAMETER(argc);
-       sql_value *needle = argv[0];
-       sql_value *haystack = argv[1];
-       int type_needle = sql_value_type(needle);
-       int type_haystack = sql_value_type(haystack);
+       struct Mem *needle = argv[0];
+       struct Mem *haystack = argv[1];
+       int needle_type = sql_value_type(needle);
+       int haystack_type = sql_value_type(haystack);
 
-       if (type_haystack == SQL_NULL || type_needle == SQL_NULL)
+       if (haystack_type == SQL_NULL || needle_type == SQL_NULL)
                return;
        /*
         * Position function can be called only with string
         * or blob params.
         */
-       sql_value *inconsistent_type_arg = NULL;
-       if (type_needle != SQL_TEXT && type_needle != SQL_BLOB)
+       struct Mem *inconsistent_type_arg = NULL;
+       if (needle_type != SQL_TEXT && needle_type != SQL_BLOB)
                inconsistent_type_arg = needle;
-       if (type_haystack != SQL_TEXT && type_haystack != SQL_BLOB)
+       if (haystack_type != SQL_TEXT && haystack_type != SQL_BLOB)
                inconsistent_type_arg = haystack;
        if (inconsistent_type_arg != NULL) {
                diag_set(ClientError, ER_INCONSISTENT_TYPES, "TEXT or BLOB",
                         mem_type_to_str(inconsistent_type_arg));
-               context->pVdbe->pParse->is_aborted = true;
-               sql_result_error(context, tarantoolErrorMessage(), -1);
+               context->isError = SQL_TARANTOOL_ERROR;
+               context->fErrorOrAux = 1;
                return;
        }
        /*
         * Both params of Position function must be of the same
         * type.
         */
-       if (type_haystack != type_needle) {
+       if (haystack_type != needle_type) {
                diag_set(ClientError, ER_INCONSISTENT_TYPES,
-                        mem_type_to_str(needle),
-                        mem_type_to_str(haystack));
-               context->pVdbe->pParse->is_aborted = true;
-               sql_result_error(context, tarantoolErrorMessage(), -1);
+                        mem_type_to_str(needle), mem_type_to_str(haystack));
+               context->isError = SQL_TARANTOOL_ERROR;
+               context->fErrorOrAux = 1;
                return;
        }
 
@@ -269,14 +268,18 @@ position_func(sql_context * context, int argc, sql_value ** argv)
        if (n_needle_bytes > 0) {
                const unsigned char *haystack_str;
                const unsigned char *needle_str;
-               if (type_haystack == SQL_BLOB) {
+               if (haystack_type == SQL_BLOB) {
                        needle_str = sql_value_blob(needle);
                        haystack_str = sql_value_blob(haystack);
                        assert(needle_str != NULL);
                        assert(haystack_str != NULL || n_haystack_bytes == 0);
-
-                       while (n_needle_bytes <= n_haystack_bytes
-                              && memcmp(haystack_str, needle_str, n_needle_bytes) != 0) {
+                       /*
+                        * Naive implementation of substring
+                        * searching: matching time O(n * m).
+                        * Can be improved.
+                        */
+                       while (n_needle_bytes <= n_haystack_bytes &&
+                              memcmp(haystack_str, needle_str, n_needle_bytes) != 0) {
                                position++;
                                n_haystack_bytes--;
                                haystack_str++;
@@ -301,7 +304,8 @@ position_func(sql_context * context, int argc, sql_value ** argv)
                        int n_needle_chars =
                                sql_utf8_char_count(needle_str, n_needle_bytes);
                        int n_haystack_chars =
-                               sql_utf8_char_count(haystack_str, n_haystack_bytes);
+                               sql_utf8_char_count(haystack_str,
+                                                   n_haystack_bytes);
 
                        if (n_haystack_chars < n_needle_chars) {
                                position = 0;


> ---
> Before I changed postion() functionality, I refactored it:

I appreciate that you attached this diff, but it would be better if
you placed it in a separate commit (do it next time, now it is ok).

> 
> ---
> src/box/sql/func.c | 68 ++++++++++++++++++++++------------------------
> 1 file changed, 33 insertions(+), 35 deletions(-)
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 2de6ef5ce..0100bb22f 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> /*
> @@ -1760,7 +1757,8 @@ sqlRegisterBuiltinFunctions(void)
> 			  FIELD_TYPE_STRING),
> 		FUNCTION2(length, 1, 0, 0, lengthFunc, SQL_FUNC_LENGTH,
> 			  FIELD_TYPE_INTEGER),
> -		FUNCTION(instr, 2, 0, 0, instrFunc, FIELD_TYPE_INTEGER),
> +		FUNCTION2(position, 2, 0, 1, position_func, SQL_FUNC_NEEDCOLL,

You don’t need FUNCTION2: SQL_FUNC_NEEDCOL is set by fourth param of FUNCTION.

> +			  FIELD_TYPE_INTEGER),
> 		FUNCTION(printf, -1, 0, 0, printfFunc, FIELD_TYPE_STRING),
> 		FUNCTION(unicode, 1, 0, 0, unicodeFunc, FIELD_TYPE_STRING),
> 		FUNCTION(char, -1, 0, 0, charFunc, FIELD_TYPE_STRING),
> --
> 
> 
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index 2f48d90c6..53926e3f2 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -4137,9 +4137,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 							lhs_id, is_lhs_forced,
> 							rhs_id, is_rhs_forced,
> 							&temp_id) != 0) {
> -							pParse->rc =
> -								SQL_TARANTOOL_ERROR;
> -							pParse->nErr++;
> +							pParse->is_aborted = true;
> 							return 0;
> 						}
> 
> @@ -4152,9 +4150,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 							curr_id, is_curr_forced,
> 							temp_id, is_temp_forced,
> 							&curr_id) != 0) {
> -							pParse->rc =
> -								SQL_TARANTOOL_ERROR;
> -							pParse->nErr++;
> +							pParse->is_aborted = true;
> 							return 0;
> 						}
> 					}

Remains after rebase, I guess.

> diff --git a/src/box/sql/vdbeInt.h b/src/box/sql/vdbeInt.h
> index c84f22caf..8af4fa359 100644
> --- a/src/box/sql/vdbeInt.h
> +++ b/src/box/sql/vdbeInt.h
> @@ -607,4 +607,13 @@ char *
> sql_vdbe_mem_encode_tuple(struct Mem *fields, uint32_t field_count,
> 			  uint32_t *tuple_size, struct region *region);
> 
> +/**
> + * Simple type to str convertor. It is used to simplify
> + * error reporting.
> + * @param mem Memory field which type we want to get.
> + * @retval String representing name of the type.
> + */
> +char *
> +mem_type_to_str(const struct Mem *p);

Rebase to fresh 2.1: I’ve already moved it to vdbeInt.h

> diff --git a/test/sql-tap/position.test.lua b/test/sql-tap/position.test.lua
> new file mode 100755
> index 000000000..70d11444c
> --- /dev/null
> +++ b/test/sql-tap/position.test.lua
> @@ -0,0 +1,905 @@
> +#!/usr/bin/env tarantool
> +test = require("sqltester")
> +test:plan(80)
> +
> +--!./tcltestrunner.lua
> +-- 2012 October 24
> +--
> +-- The author disclaims copyright to this source code.  In place of
> +-- a legal notice, here is a blessing:
> +--
> +--    May you do good and not evil.
> +--    May you find forgiveness for yourself and forgive others.
> +--    May you share freely, never taking more than you give.
> +--
> +-------------------------------------------------------------------------
> +-- This file implements regression tests for sql library.  The
> +-- focus of this file is testing the built-in POSITION() functions.
> +--
> +-- EVIDENCE-OF: R-27549-59611 The position(X,Y) function finds the first
> +-- occurrence of string X within string Y and returns the number of prior
> +-- characters plus 1, or 0 if X is nowhere found within Y.
> +--
> +-- ["set","testdir",[["file","dirname",["argv0"]]]]
> +-- ["source",[["testdir"],"\/tester.tcl"]]
> +-- Create a table to work with.

Please, remove this annoying header. It’s SQLite’s artefact.

> +test:do_test(
> +    "position-1.45",
> +    function()
> +        return test:execsql "SELECT position('', 'abcdefg');"
> +    end, {
> +        -- <position-1.45>
> +        1
> +        -- </position-1.45>
> +    })
> +
> +-- ["unset","-nocomplain","longstr”]

Remove this comment - it’s an artefact of auto-converting tool.

> +test:do_test(
> +    "position-1.56.3",
> +    function()
> +        return test:execsql "SELECT position(x'a4', x'78c3a4e282ac79');"
> +    end, {
> +        -- <position-1.56.3>
> +        3
> +        -- </position-1.56.3>
> +    })
> +
> +-- EVIDENCE-OF: R-17329-35644 If both arguments X and Y to position(X,Y) are
> +-- non-NULL and are not BLOBs then both are interpreted as strings.

Please, clean-up file from non-related to out SQL implementation comments.

> +--
> +test:do_test(
> +    "position-1.57.1",
> +    function()
> +        return test:catchsql "SELECT position(x'79', 'xä€y');"
> +    end, {
> +        -- <position-1.57.1>
> +        1, "Inconsistent types: expected BLOB got TEXT"
> +        -- </position-1.57.1>
> +    })
> +
> +
> +-- Tests that make use of collations.
> +-- A short remainder of how "unicode" and "unicode_ci" collations
> +-- works:
> +-- unicode_ci: „a“ = „A“ = „á“ = „Á“.
> +-- unicode: all symbols above are pairwise unequal.
> +
> +-- Collation is set in space format.
> +
> +test:do_execsql_test(
> +    "position-1.63",
> +    [[
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
> +        INSERT INTO test1 VALUES('à');
> +        SELECT POSITION('a', s1) FROM test1;
> +        DROP TABLE test1;
> +    ]], {
> +        1
> +    }
> +)
> +
> +test:do_execsql_test(
> +    "position-1.64",
> +    [[
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci”);

Why do you re-create table in each test? It features the same collation.
Also, I’d rather use “unicode” collation: you don’t test case insensitive option.

> +
> +-- Collation is set in space format and also in position() -
> +-- for both arguments. Arguments have different explicit
> +-- collations thus an error is expected.
> +
> +test:do_catchsql_test(
> +    "position-1.75",
> +    [[
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
> +        INSERT INTO test1 VALUES('à');
> +        SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
> +    ]], {
> +        1, "Illegal mix of collations"
> +    }
> +)
> +
> +test:do_catchsql_test(
> +    "position-1.76",
> +    [[
> +        DROP TABLE test1;
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
> +        INSERT INTO test1 VALUES('qwèrty');
> +        SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
> +        DROP TABLE test1;
> +    ]], {
> +        1, "Illegal mix of collations"
> +    }
> +)
> +
> +test:do_catchsql_test(
> +    "position-1.77",
> +    [[
> +        DROP TABLE test1;
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
> +        INSERT INTO test1 VALUES('qwèrtÿ');
> +        SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
> +    ]], {
> +        1, "Illegal mix of collations"
> +    }
> +)

Add test cases on: explicitly set incompatible collations,
explicitly set compatible collations.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
  2019-03-25 19:26   ` [tarantool-patches] " n.pettik
@ 2019-03-27 13:38     ` i.koptelov
  2019-03-28 12:50       ` n.pettik
  0 siblings, 1 reply; 13+ messages in thread
From: i.koptelov @ 2019-03-27 13:38 UTC (permalink / raw)
  To: n.pettik; +Cc: tarantool-patches


> On 25 Mar 2019, at 22:26, n.pettik <korablev@tarantool.org> wrote:
> 
> 
> 
>> On 20 Mar 2019, at 14:11, Ivan Koptelov <ivan.koptelov@tarantool.org> wrote:
>> 
>> Before the patch determination of collation in SQL functions
>> (used only in instr) was too narrow: the arguments were scanned
>> from left to right, till the argument with collation was
>> encountered, then its collation was used.
>> Now every arguments collation is considered. The right collation
>> which would be used in function is determined using ANSI
>> compatibility rules ("type combination" rules).
>> 
>> Note: currently only instr() a.k.a position() uses mechanism
>> described above, other functions (except aggregate) simply
>> ignores collations.
> 
> That’s not true: I see that min-max aggregates also feature this flag:
> 
> FUNCTION(min, -1, 0, 1, minmaxFunc, FIELD_TYPE_SCALAR),
> 
> Fourth param indicates whether SQL_FUNC_NEEDCOLL is set or not.
Oh, sorry for that. Fixed commit message.

>> ---
>> src/box/sql/expr.c   | 69 ++++++++++++++++++++++++++++++++++++++++----
>> src/box/sql/sqlInt.h |  8 ++++-
>> 2 files changed, 70 insertions(+), 7 deletions(-)
>> 
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index a2c70935e..2f48d90c6 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -4093,16 +4093,73 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>> 					testcase(i == 31);
>> 					constMask |= MASKBIT32(i);
>> 				}
>> -				if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) !=
>> -				    0 && coll == NULL) {
>> -					bool unused;
>> -					uint32_t id;
>> +			}
>> +			/*
>> +			 * Function arguments may have different
>> +			 * collations. The following code
>> +			 * checks if they are compatible and
>> +			 * finds the collation to be used. This
>> +			 * is done using ANSI rules from
>> +			 * collations_check_compatibility().
>> +			 */
>> +			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
>> +			     coll == NULL) {
>> +				struct coll *unused = NULL;
>> +				uint32_t curr_id = COLL_NONE;
>> +				bool is_curr_forced = false;
>> +
>> +				uint32_t temp_id = COLL_NONE;
>> +				bool is_temp_forced = false;
>> +
>> +				uint32_t lhs_id = COLL_NONE;
>> +				bool is_lhs_forced = false;
>> +
>> +				uint32_t rhs_id = COLL_NONE;
>> +				bool is_rhs_forced = false;
>> +
>> +				for (int i = 0; i < nFarg; i++) {
>> 					if (sql_expr_coll(pParse,
>> 							  pFarg->a[i].pExpr,
>> -							  &unused, &id,
>> -							  &coll) != 0)
>> +							  &is_lhs_forced,
>> +							  &lhs_id,
>> +							  &unused) != 0)
>> 						return 0;
>> +
>> +					for (int j = i + 1; j < nFarg; j++) {
>> +						if (sql_expr_coll(pParse,
>> +								  pFarg->a[j].pExpr,
>> +								  &is_rhs_forced,
>> +								  &rhs_id,
>> +								  &unused) != 0)
>> +							return 0;
> 
> Seems like you need only one pass saving resulting collation.
> Resulting collation shouldn’t depend on way of passing through
> arguments. And second call of collations_check_copatiility() is
> redundant as well. Now you are using 2n^2 calls of this function,
> but n is enough: you compare collation of first argument with
> second one and save result in tmp. Then, compare tmp with
> third argument etc.
Thank you for noticing, fixed now:
diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 8c1889d8a..34abb9665 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -4102,65 +4102,34 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 			 * is done using ANSI rules from
 			 * collations_check_compatibility().
 			 */
-			if (nFarg == 1) {
-				bool unused;
-				uint32_t id;
-				if (sql_expr_coll(pParse,
-						  pFarg->a[0].pExpr, &unused,
-						  &id, &coll) != 0)
-					return 0;
-			}
 			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
-			     coll == NULL && nFarg > 1) {
+			     coll == NULL) {
 				struct coll *unused = NULL;
 				uint32_t curr_id = COLL_NONE;
 				bool is_curr_forced = false;
 
-				uint32_t temp_id = COLL_NONE;
-				bool is_temp_forced = false;
-
-				uint32_t lhs_id = COLL_NONE;
-				bool is_lhs_forced = false;
+				uint32_t next_id = COLL_NONE;
+				bool is_next_forced = false;
 
-				uint32_t rhs_id = COLL_NONE;
-				bool is_rhs_forced = false;
+				if (sql_expr_coll(pParse, pFarg->a[0].pExpr,
+						  &is_curr_forced, &curr_id,
+						  &unused) != 0)
+					return 0;
 
-				for (int i = 0; i < nFarg; i++) {
+				for (int j = 1; j < nFarg; j++) {
 					if (sql_expr_coll(pParse,
-							  pFarg->a[i].pExpr,
-							  &is_lhs_forced,
-							  &lhs_id,
+							  pFarg->a[j].pExpr,
+							  &is_next_forced,
+							  &next_id,
 							  &unused) != 0)
 						return 0;
 
-					for (int j = i + 1; j < nFarg; j++) {
-						if (sql_expr_coll(pParse,
-								  pFarg->a[j].pExpr,
-								  &is_rhs_forced,
-								  &rhs_id,
-								  &unused) != 0)
-							return 0;
-
-						if (collations_check_compatibility(
-							lhs_id, is_lhs_forced,
-							rhs_id, is_rhs_forced,
-							&temp_id) != 0) {
-							pParse->is_aborted = true;
-							return 0;
-						}
-
-						is_temp_forced = (temp_id ==
-								  lhs_id) ?
-								 is_lhs_forced :
-								 is_rhs_forced;
-
-						if (collations_check_compatibility(
-							curr_id, is_curr_forced,
-							temp_id, is_temp_forced,
-							&curr_id) != 0) {
-							pParse->is_aborted = true;
-							return 0;
-						}
+					if (collations_check_compatibility(
+						curr_id, is_curr_forced,
+						next_id, is_next_forced,
+						&curr_id) != 0) {
+						pParse->is_aborted = true;
+						return 0;
 					}
 				}
 				coll = coll_by_id(curr_id)->coll;
> 
>> +
>> +						if (collations_check_compatibility(
>> +							lhs_id, is_lhs_forced,
>> +							rhs_id, is_rhs_forced,
>> +							&temp_id) != 0) {
>> +							pParse->rc =
>> +								SQL_TARANTOOL_ERROR;
>> +							pParse->nErr++;
>> +							return 0;
>> +						}
>> +
>> +						is_temp_forced = (temp_id ==
>> +								  lhs_id) ?
>> +								 is_lhs_forced :
>> +								 is_rhs_forced;
>> +
>> +						if (collations_check_compatibility(
>> +							curr_id, is_curr_forced,
>> +							temp_id, is_temp_forced,
>> +							&curr_id) != 0) {
>> +							pParse->rc =
>> +								SQL_TARANTOOL_ERROR;
>> +							pParse->nErr++;
>> +							return 0;
>> +						}
>> +					}
>> 				}
>> +				coll = coll_by_id(curr_id)->coll;
>> 			}
>> 			if (pFarg) {
>> 				if (constMask) {
>> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
>> index 8967ea3e0..47ee474bb 100644
>> --- a/src/box/sql/sqlInt.h
>> +++ b/src/box/sql/sqlInt.h
>> @@ -1660,7 +1660,13 @@ struct FuncDestructor {
>> #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
>> #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
>> #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
>> -#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
>> +#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
>> +					 * The flag is set when the collation
>> +					 * of function arguments should be
>> +					 * determined, using rules in
>> +					 * collations_check_compatibility()
>> +					 * function.
>> +					 */
>> #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
>> #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
>> #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
>> -- 
> 
> Please, provide basic test cases involving one or more built-in
> functions and incompatible arguments (at least min-max funcs use it).
> Moreover, this flag can’t be set for user-defined functions, which is pretty sad.
Add a few tests:
diff --git a/test/sql-tap/func5.test.lua b/test/sql-tap/func5.test.lua
index 6605a2ba1..4282fdac8 100755
--- a/test/sql-tap/func5.test.lua
+++ b/test/sql-tap/func5.test.lua
@@ -1,6 +1,6 @@
 #!/usr/bin/env tarantool
 test = require("sqltester")
-test:plan(5)
+test:plan(9)
 
 --!./tcltestrunner.lua
 -- 2010 August 27
@@ -98,5 +98,59 @@ test:do_execsql_test(
         -- </func5-2.2>
     })
 
+-- The following tests ensures that max() and min() functions
+-- raise error if argument's collations are incompatible.
+
+test:do_catchsql_test(
+    "func-5-3.1",
+    [[
+        SELECT max('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
+    ]],
+    {
+        -- <func5-3.1>
+        1, "Illegal mix of collations"
+        -- </func5-3.1>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.2",
+    [[
+        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode");
+        CREATE TABLE test2 (s2 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        INSERT INTO test1 VALUES ('a');
+        INSERT INTO test2 VALUES ('a');
+        SELECT max(s1, s2) FROM test1 JOIN test2;
+    ]],
+    {
+        -- <func5-3.2>
+        1, "Illegal mix of collations"
+        -- </func5-3.2>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.3",
+    [[
+        SELECT min('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
+    ]],
+    {
+        -- <func5-3.3>
+        1, "Illegal mix of collations"
+        -- </func5-3.3>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.4",
+    [[
+        SELECT min(s1, s2) FROM test1 JOIN test2;
+    ]],
+    {
+        -- <func5-3.4>
+        1, "Illegal mix of collations"
+        -- </func5-3.4>
+    }
+)
 
 test:finish_test()

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 2/2] sql: rename instr to position & add collation usage
  2019-03-26 12:32   ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik
@ 2019-03-27 13:39     ` i.koptelov
  2019-03-28 12:57       ` n.pettik
  0 siblings, 1 reply; 13+ messages in thread
From: i.koptelov @ 2019-03-27 13:39 UTC (permalink / raw)
  To: tarantool-patches; +Cc: n.pettik



> On 26 Mar 2019, at 15:32, n.pettik <korablev@tarantool.org> wrote:
> 
> I’ve pushed a couple of code-style fixes. Diff:
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 6654d4998..f9c0be97e 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -212,54 +212,53 @@ absFunc(sql_context * context, int argc, sql_value ** argv)
>        }
> }
> 
> -/*
> +/**
>  * Implementation of the position() function.
>  *
>  * position(needle, haystack) finds the first occurrence of needle
> - * in haystack and returns the number of previous characters plus 1,
> - * or 0 if needle does not occur within haystack.
> + * in haystack and returns the number of previous characters
> + * plus 1, or 0 if needle does not occur within haystack.
>  *
> - * If both haystack and needle are BLOBs, then the result is one more than
> - * the number of bytes in haystack prior to the first occurrence of needle,
> - * or 0 if needle never occurs in haystack.
> + * If both haystack and needle are BLOBs, then the result is one
> + * more than the number of bytes in haystack prior to the first
> + * occurrence of needle, or 0 if needle never occurs in haystack.
>  */
> static void
> -position_func(sql_context * context, int argc, sql_value ** argv)
> +position_func(struct sql_context *context, int argc, struct Mem **argv)
> {
>        UNUSED_PARAMETER(argc);
> -       sql_value *needle = argv[0];
> -       sql_value *haystack = argv[1];
> -       int type_needle = sql_value_type(needle);
> -       int type_haystack = sql_value_type(haystack);
> +       struct Mem *needle = argv[0];
> +       struct Mem *haystack = argv[1];
> +       int needle_type = sql_value_type(needle);
> +       int haystack_type = sql_value_type(haystack);
> 
> -       if (type_haystack == SQL_NULL || type_needle == SQL_NULL)
> +       if (haystack_type == SQL_NULL || needle_type == SQL_NULL)
>                return;
>        /*
>         * Position function can be called only with string
>         * or blob params.
>         */
> -       sql_value *inconsistent_type_arg = NULL;
> -       if (type_needle != SQL_TEXT && type_needle != SQL_BLOB)
> +       struct Mem *inconsistent_type_arg = NULL;
> +       if (needle_type != SQL_TEXT && needle_type != SQL_BLOB)
>                inconsistent_type_arg = needle;
> -       if (type_haystack != SQL_TEXT && type_haystack != SQL_BLOB)
> +       if (haystack_type != SQL_TEXT && haystack_type != SQL_BLOB)
>                inconsistent_type_arg = haystack;
>        if (inconsistent_type_arg != NULL) {
>                diag_set(ClientError, ER_INCONSISTENT_TYPES, "TEXT or BLOB",
>                         mem_type_to_str(inconsistent_type_arg));
> -               context->pVdbe->pParse->is_aborted = true;
> -               sql_result_error(context, tarantoolErrorMessage(), -1);
> +               context->isError = SQL_TARANTOOL_ERROR;
> +               context->fErrorOrAux = 1;
>                return;
>        }
>        /*
>         * Both params of Position function must be of the same
>         * type.
>         */
> -       if (type_haystack != type_needle) {
> +       if (haystack_type != needle_type) {
>                diag_set(ClientError, ER_INCONSISTENT_TYPES,
> -                        mem_type_to_str(needle),
> -                        mem_type_to_str(haystack));
> -               context->pVdbe->pParse->is_aborted = true;
> -               sql_result_error(context, tarantoolErrorMessage(), -1);
> +                        mem_type_to_str(needle), mem_type_to_str(haystack));
> +               context->isError = SQL_TARANTOOL_ERROR;
> +               context->fErrorOrAux = 1;
>                return;
>        }
> 
> @@ -269,14 +268,18 @@ position_func(sql_context * context, int argc, sql_value ** argv)
>        if (n_needle_bytes > 0) {
>                const unsigned char *haystack_str;
>                const unsigned char *needle_str;
> -               if (type_haystack == SQL_BLOB) {
> +               if (haystack_type == SQL_BLOB) {
>                        needle_str = sql_value_blob(needle);
>                        haystack_str = sql_value_blob(haystack);
>                        assert(needle_str != NULL);
>                        assert(haystack_str != NULL || n_haystack_bytes == 0);
> -
> -                       while (n_needle_bytes <= n_haystack_bytes
> -                              && memcmp(haystack_str, needle_str, n_needle_bytes) != 0) {
> +                       /*
> +                        * Naive implementation of substring
> +                        * searching: matching time O(n * m).
> +                        * Can be improved.
> +                        */
> +                       while (n_needle_bytes <= n_haystack_bytes &&
> +                              memcmp(haystack_str, needle_str, n_needle_bytes) != 0) {
>                                position++;
>                                n_haystack_bytes--;
>                                haystack_str++;
> @@ -301,7 +304,8 @@ position_func(sql_context * context, int argc, sql_value ** argv)
>                        int n_needle_chars =
>                                sql_utf8_char_count(needle_str, n_needle_bytes);
>                        int n_haystack_chars =
> -                               sql_utf8_char_count(haystack_str, n_haystack_bytes);
> +                               sql_utf8_char_count(haystack_str,
> +                                                   n_haystack_bytes);
> 
>                        if (n_haystack_chars < n_needle_chars) {
>                                position = 0;
> 
> 
Thank you. I have reviewed them and squashed.
>> ---
>> Before I changed postion() functionality, I refactored it:
> 
> I appreciate that you attached this diff, but it would be better if
> you placed it in a separate commit (do it next time, now it is ok).
Ok.
> 
>> 
>> ---
>> src/box/sql/func.c | 68 ++++++++++++++++++++++------------------------
>> 1 file changed, 33 insertions(+), 35 deletions(-)
>> 
>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
>> index 2de6ef5ce..0100bb22f 100644
>> --- a/src/box/sql/func.c
>> +++ b/src/box/sql/func.c
>> /*
>> @@ -1760,7 +1757,8 @@ sqlRegisterBuiltinFunctions(void)
>> 			  FIELD_TYPE_STRING),
>> 		FUNCTION2(length, 1, 0, 0, lengthFunc, SQL_FUNC_LENGTH,
>> 			  FIELD_TYPE_INTEGER),
>> -		FUNCTION(instr, 2, 0, 0, instrFunc, FIELD_TYPE_INTEGER),
>> +		FUNCTION2(position, 2, 0, 1, position_func, SQL_FUNC_NEEDCOLL,
> 
> You don’t need FUNCTION2: SQL_FUNC_NEEDCOL is set by fourth param of FUNCTION.
You are right, fixed:

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index f9c0be97e..b86a95d9a 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1836,8 +1836,7 @@ sqlRegisterBuiltinFunctions(void)
 			  FIELD_TYPE_STRING),
 		FUNCTION2(length, 1, 0, 0, lengthFunc, SQL_FUNC_LENGTH,
 			  FIELD_TYPE_INTEGER),
-		FUNCTION2(position, 2, 0, 1, position_func, SQL_FUNC_NEEDCOLL,
-			  FIELD_TYPE_INTEGER),
+		FUNCTION(position, 2, 0, 1, position_func, FIELD_TYPE_INTEGER),
 		FUNCTION(printf, -1, 0, 0, printfFunc, FIELD_TYPE_STRING),
 		FUNCTION(unicode, 1, 0, 0, unicodeFunc, FIELD_TYPE_STRING),
 		FUNCTION(char, -1, 0, 0, charFunc, FIELD_TYPE_STRING),
> 
>> +			  FIELD_TYPE_INTEGER),
>> 		FUNCTION(printf, -1, 0, 0, printfFunc, FIELD_TYPE_STRING),
>> 		FUNCTION(unicode, 1, 0, 0, unicodeFunc, FIELD_TYPE_STRING),
>> 		FUNCTION(char, -1, 0, 0, charFunc, FIELD_TYPE_STRING),
>> --
>> 
>> 
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index 2f48d90c6..53926e3f2 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -4137,9 +4137,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>> 							lhs_id, is_lhs_forced,
>> 							rhs_id, is_rhs_forced,
>> 							&temp_id) != 0) {
>> -							pParse->rc =
>> -								SQL_TARANTOOL_ERROR;
>> -							pParse->nErr++;
>> +							pParse->is_aborted = true;
>> 							return 0;
>> 						}
>> 
>> @@ -4152,9 +4150,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>> 							curr_id, is_curr_forced,
>> 							temp_id, is_temp_forced,
>> 							&curr_id) != 0) {
>> -							pParse->rc =
>> -								SQL_TARANTOOL_ERROR;
>> -							pParse->nErr++;
>> +							pParse->is_aborted = true;
>> 							return 0;
>> 						}
>> 					}
> 
> Remains after rebase, I guess.
> 
>> diff --git a/src/box/sql/vdbeInt.h b/src/box/sql/vdbeInt.h
>> index c84f22caf..8af4fa359 100644
>> --- a/src/box/sql/vdbeInt.h
>> +++ b/src/box/sql/vdbeInt.h
>> @@ -607,4 +607,13 @@ char *
>> sql_vdbe_mem_encode_tuple(struct Mem *fields, uint32_t field_count,
>> 			  uint32_t *tuple_size, struct region *region);
>> 
>> +/**
>> + * Simple type to str convertor. It is used to simplify
>> + * error reporting.
>> + * @param mem Memory field which type we want to get.
>> + * @retval String representing name of the type.
>> + */
>> +char *
>> +mem_type_to_str(const struct Mem *p);
> 
> Rebase to fresh 2.1: I’ve already moved it to vdbeInt.h
Ok, done.
> 
>> diff --git a/test/sql-tap/position.test.lua b/test/sql-tap/position.test.lua
>> new file mode 100755
>> index 000000000..70d11444c
>> --- /dev/null
>> +++ b/test/sql-tap/position.test.lua
>> @@ -0,0 +1,905 @@
>> +#!/usr/bin/env tarantool
>> +test = require("sqltester")
>> +test:plan(80)
>> +
>> +--!./tcltestrunner.lua
>> +-- 2012 October 24
>> +--
>> +-- The author disclaims copyright to this source code.  In place of
>> +-- a legal notice, here is a blessing:
>> +--
>> +--    May you do good and not evil.
>> +--    May you find forgiveness for yourself and forgive others.
>> +--    May you share freely, never taking more than you give.
>> +--
>> +-------------------------------------------------------------------------
>> +-- This file implements regression tests for sql library.  The
>> +-- focus of this file is testing the built-in POSITION() functions.
>> +--
>> +-- EVIDENCE-OF: R-27549-59611 The position(X,Y) function finds the first
>> +-- occurrence of string X within string Y and returns the number of prior
>> +-- characters plus 1, or 0 if X is nowhere found within Y.
>> +--
>> +-- ["set","testdir",[["file","dirname",["argv0"]]]]
>> +-- ["source",[["testdir"],"\/tester.tcl"]]
>> +-- Create a table to work with.
> 
> Please, remove this annoying header. It’s SQLite’s artefact.
Ok.

diff --git a/test/sql-tap/position.test.lua b/test/sql-tap/position.test.lua
index 70d11444c..8c46d7b9e 100755
--- a/test/sql-tap/position.test.lua
+++ b/test/sql-tap/position.test.lua
@@ -2,28 +2,6 @@
 test = require("sqltester")
 test:plan(80)
 
---!./tcltestrunner.lua
--- 2012 October 24
---
--- The author disclaims copyright to this source code.  In place of
--- a legal notice, here is a blessing:
---
---    May you do good and not evil.
---    May you find forgiveness for yourself and forgive others.
---    May you share freely, never taking more than you give.
---
--------------------------------------------------------------------------
--- This file implements regression tests for sql library.  The
--- focus of this file is testing the built-in POSITION() functions.
---
--- EVIDENCE-OF: R-27549-59611 The position(X,Y) function finds the first
--- occurrence of string X within string Y and returns the number of prior
--- characters plus 1, or 0 if X is nowhere found within Y.
---
--- ["set","testdir",[["file","dirname",["argv0"]]]]
--- ["source",[["testdir"],"\/tester.tcl"]]
--- Create a table to work with.
---

> 
>> +test:do_test(
>> +    "position-1.45",
>> +    function()
>> +        return test:execsql "SELECT position('', 'abcdefg');"
>> +    end, {
>> +        -- <position-1.45>
>> +        1
>> +        -- </position-1.45>
>> +    })
>> +
>> +-- ["unset","-nocomplain","longstr”]
> 
> Remove this comment - it’s an artefact of auto-converting tool.
Ok.

@@ -474,7 +452,6 @@ test:do_test(
         -- </position-1.45>
     })
 
--- ["unset","-nocomplain","longstr"]
 local longstr = "abcdefghijklmonpqrstuvwxyz"
 longstr = longstr .. longstr
 longstr = longstr .. longstr

> 
>> +test:do_test(
>> +    "position-1.56.3",
>> +    function()
>> +        return test:execsql "SELECT position(x'a4', x'78c3a4e282ac79');"
>> +    end, {
>> +        -- <position-1.56.3>
>> +        3
>> +        -- </position-1.56.3>
>> +    })
>> +
>> +-- EVIDENCE-OF: R-17329-35644 If both arguments X and Y to position(X,Y) are
>> +-- non-NULL and are not BLOBs then both are interpreted as strings.
> 
> Please, clean-up file from non-related to out SQL implementation comments.
Ok, done.

@@ -591,10 +568,6 @@ test:do_test(
         -- </position-1.55>
     })
 
--- EVIDENCE-OF: R-46421-32541 Or, if X and Y are both BLOBs, then
--- position(X,Y) returns one more than the number bytes prior to the first
--- occurrence of X, or 0 if X does not occur anywhere within Y.
---

@@ -635,9 +608,6 @@ test:do_test(
         -- </position-1.56.3>
     })
 
--- EVIDENCE-OF: R-17329-35644 If both arguments X and Y to position(X,Y) are
--- non-NULL and are not BLOBs then both are interpreted as strings.
---

@@ -668,7 +638,7 @@ test:do_test(
         -- </position-1.57.3>
     })
 
--- EVIDENCE-OF: R-14708-27487 If either X or Y are NULL in position(X,Y)
+-- If either X or Y are NULL in position(X,Y)
 -- then the result is NULL.
 --

> 
>> +--
>> +test:do_test(
>> +    "position-1.57.1",
>> +    function()
>> +        return test:catchsql "SELECT position(x'79', 'xä€y');"
>> +    end, {
>> +        -- <position-1.57.1>
>> +        1, "Inconsistent types: expected BLOB got TEXT"
>> +        -- </position-1.57.1>
>> +    })
>> +
>> +
>> +-- Tests that make use of collations.
>> +-- A short remainder of how "unicode" and "unicode_ci" collations
>> +-- works:
>> +-- unicode_ci: „a“ = „A“ = „á“ = „Á“.
>> +-- unicode: all symbols above are pairwise unequal.
>> +
>> +-- Collation is set in space format.
>> +
>> +test:do_execsql_test(
>> +    "position-1.63",
>> +    [[
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>> +        INSERT INTO test1 VALUES('à');
>> +        SELECT POSITION('a', s1) FROM test1;
>> +        DROP TABLE test1;
>> +    ]], {
>> +        1
>> +    }
>> +)
>> +
>> +test:do_execsql_test(
>> +    "position-1.64",
>> +    [[
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci”);
> 
> Why do you re-create table in each test? It features the same collation.
> Also, I’d rather use “unicode” collation: you don’t test case insensitive option.
Probably before I had an intention to use tables with different collations.
Anyway removed this redundant create/drop.

@@ -715,7 +685,7 @@ test:do_execsql_test(
         CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('à');
         SELECT POSITION('a', s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         1
     }
@@ -724,10 +694,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.64",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrty');
         SELECT POSITION('er', s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         3
     }
@@ -736,10 +705,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.65",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrtÿ');
         SELECT POSITION('y', s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         6
     }
@@ -751,10 +719,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.66",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('à');
         SELECT POSITION('a', s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -763,10 +730,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.67",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrty');
         SELECT POSITION('er', s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -775,10 +741,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.68",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrtÿ');
         SELECT POSITION('Y', s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -790,10 +755,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.69",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('à');
         SELECT POSITION('a' COLLATE "unicode", s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -802,10 +766,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.70",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrty');
         SELECT POSITION('er' COLLATE "unicode", s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -814,10 +777,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.71",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrtÿ');
         SELECT POSITION('Y' COLLATE "unicode", s1) FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -829,10 +791,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.72",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('à');
         SELECT POSITION('a' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -841,10 +802,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.73",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrty');
         SELECT POSITION('er' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -853,10 +813,9 @@ test:do_execsql_test(
 test:do_execsql_test(
     "position-1.74",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
         INSERT INTO test1 VALUES('qwèrtÿ');
         SELECT POSITION('Y'COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
+        DELETE FROM test1;
     ]], {
         0
     }
@@ -869,7 +828,7 @@ test:do_execsql_test(
 test:do_catchsql_test(
     "position-1.75",
     [[
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        DELETE FROM test1;
         INSERT INTO test1 VALUES('à');
         SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
     ]], {
@@ -880,11 +839,9 @@ test:do_catchsql_test(
 test:do_catchsql_test(
     "position-1.76",
     [[
-        DROP TABLE test1;
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        DELETE FROM test1;
         INSERT INTO test1 VALUES('qwèrty');
         SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
-        DROP TABLE test1;
     ]], {
         1, "Illegal mix of collations"
     }
@@ -893,8 +850,7 @@ test:do_catchsql_test(
 test:do_catchsql_test(
     "position-1.77",
     [[
-        DROP TABLE test1;
-        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
+        DELETE FROM test1;
         INSERT INTO test1 VALUES('qwèrtÿ');
         SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
     ]], {
> 
>> +
>> +-- Collation is set in space format and also in position() -
>> +-- for both arguments. Arguments have different explicit
>> +-- collations thus an error is expected.
>> +
>> +test:do_catchsql_test(
>> +    "position-1.75",
>> +    [[
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>> +        INSERT INTO test1 VALUES('à');
>> +        SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>> +    ]], {
>> +        1, "Illegal mix of collations"
>> +    }
>> +)
>> +
>> +test:do_catchsql_test(
>> +    "position-1.76",
>> +    [[
>> +        DROP TABLE test1;
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>> +        INSERT INTO test1 VALUES('qwèrty');
>> +        SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>> +        DROP TABLE test1;
>> +    ]], {
>> +        1, "Illegal mix of collations"
>> +    }
>> +)
>> +
>> +test:do_catchsql_test(
>> +    "position-1.77",
>> +    [[
>> +        DROP TABLE test1;
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>> +        INSERT INTO test1 VALUES('qwèrtÿ');
>> +        SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>> +    ]], {
>> +        1, "Illegal mix of collations"
>> +    }
>> +)
> 
> Add test cases on: explicitly set incompatible collations,
> explicitly set compatible collations.
> 
> 

Don’t we already have such tests? From the end of position.test.lua:

-- Collation is set in space format and also in position() -
-- for both arguments. Arguments have the same collations.

test:do_execsql_test(
    "position-1.72",
    [[
        INSERT INTO test1 VALUES('à');
        SELECT POSITION('a' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
        DELETE FROM test1;
    ]], {
        0
    }
)

test:do_execsql_test(
    "position-1.73",
    [[
        INSERT INTO test1 VALUES('qwèrty');
        SELECT POSITION('er' COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
        DELETE FROM test1;
    ]], {
        0
    }
)

test:do_execsql_test(
    "position-1.74",
    [[
        INSERT INTO test1 VALUES('qwèrtÿ');
        SELECT POSITION('Y'COLLATE "unicode", s1 COLLATE "unicode") FROM test1;
        DELETE FROM test1;
    ]], {
        0
    }
)

-- Collation is set in space format and also in position() -
-- for both arguments. Arguments have different explicit
-- collations thus an error is expected.

test:do_catchsql_test(
    "position-1.75",
    [[
        DELETE FROM test1;
        INSERT INTO test1 VALUES('à');
        SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
    ]], {
        1, "Illegal mix of collations"
    }
)

test:do_catchsql_test(
    "position-1.76",
    [[
        DELETE FROM test1;
        INSERT INTO test1 VALUES('qwèrty');
        SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
    ]], {
        1, "Illegal mix of collations"
    }
)

test:do_catchsql_test(
    "position-1.77",
    [[
        DELETE FROM test1;
        INSERT INTO test1 VALUES('qwèrtÿ');
        SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
    ]], {
        1, "Illegal mix of collations"
    }
)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
  2019-03-27 13:38     ` i.koptelov
@ 2019-03-28 12:50       ` n.pettik
  2019-03-28 14:08         ` i.koptelov
  0 siblings, 1 reply; 13+ messages in thread
From: n.pettik @ 2019-03-28 12:50 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Ivan Koptelov


> Thank you for noticing, fixed now:
> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
> index 8c1889d8a..34abb9665 100644
> --- a/src/box/sql/expr.c
> +++ b/src/box/sql/expr.c
> @@ -4102,65 +4102,34 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 			 * is done using ANSI rules from
> 			 * collations_check_compatibility().
> 			 */
> -			if (nFarg == 1) {
> -				bool unused;
> -				uint32_t id;
> -				if (sql_expr_coll(pParse,
> -						  pFarg->a[0].pExpr, &unused,
> -						  &id, &coll) != 0)
> -					return 0;
> -			}
> 			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
> -			     coll == NULL && nFarg > 1) {

coll == NULL is redundant check: before this branch this variable
is assigned only once with NULL value.

> +			     coll == NULL) {
> 				struct coll *unused = NULL;
> 				uint32_t curr_id = COLL_NONE;
> 				bool is_curr_forced = false;
> 
> -				uint32_t temp_id = COLL_NONE;
> -				bool is_temp_forced = false;
> -
> -				uint32_t lhs_id = COLL_NONE;
> -				bool is_lhs_forced = false;
> +				uint32_t next_id = COLL_NONE;
> +				bool is_next_forced = false;
> 
> -				uint32_t rhs_id = COLL_NONE;
> -				bool is_rhs_forced = false;
> +				if (sql_expr_coll(pParse, pFarg->a[0].pExpr,
> +						  &is_curr_forced, &curr_id,
> +						  &unused) != 0)
> +					return 0;
> 
> -				for (int i = 0; i < nFarg; i++) {
> +				for (int j = 1; j < nFarg; j++) {
> 					if (sql_expr_coll(pParse,
> -							  pFarg->a[i].pExpr,
> -							  &is_lhs_forced,
> -							  &lhs_id,
> +							  pFarg->a[j].pExpr,
> +							  &is_next_forced,
> +							  &next_id,
> 							  &unused) != 0)
> 						return 0;
> 
> -					for (int j = i + 1; j < nFarg; j++) {
> -						if (sql_expr_coll(pParse,
> -								  pFarg->a[j].pExpr,
> -								  &is_rhs_forced,
> -								  &rhs_id,
> -								  &unused) != 0)
> -							return 0;
> -
> -						if (collations_check_compatibility(
> -							lhs_id, is_lhs_forced,
> -							rhs_id, is_rhs_forced,
> -							&temp_id) != 0) {
> -							pParse->is_aborted = true;
> -							return 0;
> -						}
> -
> -						is_temp_forced = (temp_id ==
> -								  lhs_id) ?
> -								 is_lhs_forced :
> -								 is_rhs_forced;
> -
> -						if (collations_check_compatibility(
> -							curr_id, is_curr_forced,
> -							temp_id, is_temp_forced,
> -							&curr_id) != 0) {
> -							pParse->is_aborted = true;
> -							return 0;
> -						}
> +					if (collations_check_compatibility(
> +						curr_id, is_curr_forced,
> +						next_id, is_next_forced,
> +						&curr_id) != 0) {
> +						pParse->is_aborted = true;
> +						return 0;
> 					}
> 				}
> 				coll = coll_by_id(curr_id)->coll;
>> 
>>> +
>>> +						if (collations_check_compatibility(
>>> +							lhs_id, is_lhs_forced,
>>> +							rhs_id, is_rhs_forced,
>>> +							&temp_id) != 0) {
>>> +							pParse->rc =
>>> +								SQL_TARANTOOL_ERROR;
>>> +							pParse->nErr++;
>>> +							return 0;
>>> +						}
>>> +
>>> +						is_temp_forced = (temp_id ==
>>> +								  lhs_id) ?
>>> +								 is_lhs_forced :
>>> +								 is_rhs_forced;
>>> +
>>> +						if (collations_check_compatibility(
>>> +							curr_id, is_curr_forced,
>>> +							temp_id, is_temp_forced,
>>> +							&curr_id) != 0) {
>>> +							pParse->rc =
>>> +								SQL_TARANTOOL_ERROR;
>>> +							pParse->nErr++;
>>> +							return 0;
>>> +						}

You don’t update is_curr_forced, and it allows to use invalid
collation mix:

tarantool> select min ('abc', 'asd' collate "binary", 'abc' COLLATE "unicode")
---
- - ['abc']
...

>>> +					}
>>> 				}
>>> +				coll = coll_by_id(curr_id)->coll;
>>> 			}
>>> 			if (pFarg) {
>>> 				if (constMask) {
>>> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
>>> index 8967ea3e0..47ee474bb 100644
>>> --- a/src/box/sql/sqlInt.h
>>> +++ b/src/box/sql/sqlInt.h
>>> @@ -1660,7 +1660,13 @@ struct FuncDestructor {
>>> #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
>>> #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
>>> #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
>>> -#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
>>> +#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
>>> +					 * The flag is set when the collation
>>> +					 * of function arguments should be
>>> +					 * determined, using rules in
>>> +					 * collations_check_compatibility()
>>> +					 * function.
>>> +					 */
>>> #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
>>> #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
>>> #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
>>> -- 
>> 
>> Please, provide basic test cases involving one or more built-in
>> functions and incompatible arguments (at least min-max funcs use it).
>> Moreover, this flag can’t be set for user-defined functions, which is pretty sad.
> Add a few tests:
> diff --git a/test/sql-tap/func5.test.lua b/test/sql-tap/func5.test.lua
> index 6605a2ba1..4282fdac8 100755
> --- a/test/sql-tap/func5.test.lua
> +++ b/test/sql-tap/func5.test.lua
> @@ -1,6 +1,6 @@
> #!/usr/bin/env tarantool
> test = require("sqltester")
> -test:plan(5)
> +test:plan(9)
> 
> --!./tcltestrunner.lua
> -- 2010 August 27
> @@ -98,5 +98,59 @@ test:do_execsql_test(
>         -- </func5-2.2>
>     })
> 
> +-- The following tests ensures that max() and min() functions
> +-- raise error if argument's collations are incompatible.
> +
> +test:do_catchsql_test(
> +    "func-5-3.1",
> +    [[
> +        SELECT max('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
> +    ]],
> +    {
> +        -- <func5-3.1>
> +        1, "Illegal mix of collations"
> +        -- </func5-3.1>
> +    }
> +)
> +
> +test:do_catchsql_test(
> +    "func-5-3.2",
> +    [[
> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode");
> +        CREATE TABLE test2 (s2 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
> +        INSERT INTO test1 VALUES ('a');
> +        INSERT INTO test2 VALUES ('a');
> +        SELECT max(s1, s2) FROM test1 JOIN test2;
> +    ]],
> +    {
> +        -- <func5-3.2>
> +        1, "Illegal mix of collations"
> +        -- </func5-3.2>
> +    }
> +)
> +
> +test:do_catchsql_test(
> +    "func-5-3.3",
> +    [[
> +        SELECT min('a' COLLATE "unicode", 'A' COLLATE "unicode_ci”);

Please add tests involving more than two params.
Example of illegal collation mix passing your tests
I pointed out above.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 2/2] sql: rename instr to position & add collation usage
  2019-03-27 13:39     ` i.koptelov
@ 2019-03-28 12:57       ` n.pettik
  0 siblings, 0 replies; 13+ messages in thread
From: n.pettik @ 2019-03-28 12:57 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Ivan Koptelov

[-- Attachment #1: Type: text/plain, Size: 1820 bytes --]


>>> +
>>> +-- Collation is set in space format and also in position() -
>>> +-- for both arguments. Arguments have different explicit
>>> +-- collations thus an error is expected.
>>> +
>>> +test:do_catchsql_test(
>>> +    "position-1.75",
>>> +    [[
>>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>>> +        INSERT INTO test1 VALUES('à');
>>> +        SELECT POSITION('a' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>>> +    ]], {
>>> +        1, "Illegal mix of collations"
>>> +    }
>>> +)
>>> +
>>> +test:do_catchsql_test(
>>> +    "position-1.76",
>>> +    [[
>>> +        DROP TABLE test1;
>>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>>> +        INSERT INTO test1 VALUES('qwèrty');
>>> +        SELECT POSITION('er' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>>> +        DROP TABLE test1;
>>> +    ]], {
>>> +        1, "Illegal mix of collations"
>>> +    }
>>> +)
>>> +
>>> +test:do_catchsql_test(
>>> +    "position-1.77",
>>> +    [[
>>> +        DROP TABLE test1;
>>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>>> +        INSERT INTO test1 VALUES('qwèrtÿ');
>>> +        SELECT POSITION('Y' COLLATE "unicode_ci", s1 COLLATE "unicode") FROM test1;
>>> +    ]], {
>>> +        1, "Illegal mix of collations"
>>> +    }
>>> +)
>> 
>> Add test cases on: explicitly set incompatible collations,
>> explicitly set compatible collations.
>> 
>> 
> 
> Don’t we already have such tests? From the end of position.test.lua:
> 
> -- Collation is set in space format and also in position() -
> -- for both arguments. Arguments have the same collations.

Oh, I meant ‘implicitly set’. Still can’t find such tests.


[-- Attachment #2: Type: text/html, Size: 6103 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
  2019-03-28 12:50       ` n.pettik
@ 2019-03-28 14:08         ` i.koptelov
  2019-03-29  9:57           ` n.pettik
  0 siblings, 1 reply; 13+ messages in thread
From: i.koptelov @ 2019-03-28 14:08 UTC (permalink / raw)
  To: tarantool-patches; +Cc: n.pettik



> On 28 Mar 2019, at 15:50, n.pettik <korablev@tarantool.org> wrote:
> 
>> 
>> Thank you for noticing, fixed now:
>> diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
>> index 8c1889d8a..34abb9665 100644
>> --- a/src/box/sql/expr.c
>> +++ b/src/box/sql/expr.c
>> @@ -4102,65 +4102,34 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
>> 			 * is done using ANSI rules from
>> 			 * collations_check_compatibility().
>> 			 */
>> -			if (nFarg == 1) {
>> -				bool unused;
>> -				uint32_t id;
>> -				if (sql_expr_coll(pParse,
>> -						  pFarg->a[0].pExpr, &unused,
>> -						  &id, &coll) != 0)
>> -					return 0;
>> -			}
>> 			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
>> -			     coll == NULL && nFarg > 1) {
> 
> coll == NULL is redundant check: before this branch this variable
> is assigned only once with NULL value.
Ok, removed.
diff --git a/src/box/sql/expr.c b/src/box/sql/expr.c
index 4eaec8e42..444b262a3 100644
--- a/src/box/sql/expr.c
+++ b/src/box/sql/expr.c
@@ -4078,8 +4078,7 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 			 * is done using ANSI rules from
 			 * collations_check_compatibility().
 			 */
-			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0 &&
-			     coll == NULL) {
+			if ((pDef->funcFlags & SQL_FUNC_NEEDCOLL) != 0) {
 				struct coll *unused = NULL;
 				uint32_t curr_id = COLL_NONE;
 				bool is_curr_forced = false;

> 
>> +			     coll == NULL) {
>> 				struct coll *unused = NULL;
>> 				uint32_t curr_id = COLL_NONE;
>> 				bool is_curr_forced = false;
>> 
>> -				uint32_t temp_id = COLL_NONE;
>> -				bool is_temp_forced = false;
>> -
>> -				uint32_t lhs_id = COLL_NONE;
>> -				bool is_lhs_forced = false;
>> +				uint32_t next_id = COLL_NONE;
>> +				bool is_next_forced = false;
>> 
>> -				uint32_t rhs_id = COLL_NONE;
>> -				bool is_rhs_forced = false;
>> +				if (sql_expr_coll(pParse, pFarg->a[0].pExpr,
>> +						  &is_curr_forced, &curr_id,
>> +						  &unused) != 0)
>> +					return 0;
>> 
>> -				for (int i = 0; i < nFarg; i++) {
>> +				for (int j = 1; j < nFarg; j++) {
>> 					if (sql_expr_coll(pParse,
>> -							  pFarg->a[i].pExpr,
>> -							  &is_lhs_forced,
>> -							  &lhs_id,
>> +							  pFarg->a[j].pExpr,
>> +							  &is_next_forced,
>> +							  &next_id,
>> 							  &unused) != 0)
>> 						return 0;
>> 
>> -					for (int j = i + 1; j < nFarg; j++) {
>> -						if (sql_expr_coll(pParse,
>> -								  pFarg->a[j].pExpr,
>> -								  &is_rhs_forced,
>> -								  &rhs_id,
>> -								  &unused) != 0)
>> -							return 0;
>> -
>> -						if (collations_check_compatibility(
>> -							lhs_id, is_lhs_forced,
>> -							rhs_id, is_rhs_forced,
>> -							&temp_id) != 0) {
>> -							pParse->is_aborted = true;
>> -							return 0;
>> -						}
>> -
>> -						is_temp_forced = (temp_id ==
>> -								  lhs_id) ?
>> -								 is_lhs_forced :
>> -								 is_rhs_forced;
>> -
>> -						if (collations_check_compatibility(
>> -							curr_id, is_curr_forced,
>> -							temp_id, is_temp_forced,
>> -							&curr_id) != 0) {
>> -							pParse->is_aborted = true;
>> -							return 0;
>> -						}
>> +					if (collations_check_compatibility(
>> +						curr_id, is_curr_forced,
>> +						next_id, is_next_forced,
>> +						&curr_id) != 0) {
>> +						pParse->is_aborted = true;
>> +						return 0;
>> 					}
>> 				}
>> 				coll = coll_by_id(curr_id)->coll;
>>> 
>>>> +
>>>> +						if (collations_check_compatibility(
>>>> +							lhs_id, is_lhs_forced,
>>>> +							rhs_id, is_rhs_forced,
>>>> +							&temp_id) != 0) {
>>>> +							pParse->rc =
>>>> +								SQL_TARANTOOL_ERROR;
>>>> +							pParse->nErr++;
>>>> +							return 0;
>>>> +						}
>>>> +
>>>> +						is_temp_forced = (temp_id ==
>>>> +								  lhs_id) ?
>>>> +								 is_lhs_forced :
>>>> +								 is_rhs_forced;
>>>> +
>>>> +						if (collations_check_compatibility(
>>>> +							curr_id, is_curr_forced,
>>>> +							temp_id, is_temp_forced,
>>>> +							&curr_id) != 0) {
>>>> +							pParse->rc =
>>>> +								SQL_TARANTOOL_ERROR;
>>>> +							pParse->nErr++;
>>>> +							return 0;
>>>> +						}
> 
> You don’t update is_curr_forced, and it allows to use invalid
> collation mix:
> 
> tarantool> select min ('abc', 'asd' collate "binary", 'abc' COLLATE "unicode")
> ---
> - - ['abc']
> ...
> 
Oh, sorry, I was too quick to send fixes.

@@ -4107,6 +4106,9 @@ sqlExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
 						pParse->is_aborted = true;
 						return 0;
 					}
+					is_curr_forced = curr_id == next_id ?
+							 is_next_forced :
+							 is_curr_forced;
 				}
 				coll = coll_by_id(curr_id)->coll;
 			}

>>>> +					}
>>>> 				}
>>>> +				coll = coll_by_id(curr_id)->coll;
>>>> 			}
>>>> 			if (pFarg) {
>>>> 				if (constMask) {
>>>> diff --git a/src/box/sql/sqlInt.h b/src/box/sql/sqlInt.h
>>>> index 8967ea3e0..47ee474bb 100644
>>>> --- a/src/box/sql/sqlInt.h
>>>> +++ b/src/box/sql/sqlInt.h
>>>> @@ -1660,7 +1660,13 @@ struct FuncDestructor {
>>>> #define SQL_FUNC_LIKE     0x0004	/* Candidate for the LIKE optimization */
>>>> #define SQL_FUNC_CASE     0x0008	/* Case-sensitive LIKE-type function */
>>>> #define SQL_FUNC_EPHEM    0x0010	/* Ephemeral.  Delete with VDBE */
>>>> -#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called */
>>>> +#define SQL_FUNC_NEEDCOLL 0x0020	/* sqlGetFuncCollSeq() might be called.
>>>> +					 * The flag is set when the collation
>>>> +					 * of function arguments should be
>>>> +					 * determined, using rules in
>>>> +					 * collations_check_compatibility()
>>>> +					 * function.
>>>> +					 */
>>>> #define SQL_FUNC_LENGTH   0x0040	/* Built-in length() function */
>>>> #define SQL_FUNC_TYPEOF   0x0080	/* Built-in typeof() function */
>>>> #define SQL_FUNC_COUNT    0x0100	/* Built-in count(*) aggregate */
>>>> -- 
>>> 
>>> Please, provide basic test cases involving one or more built-in
>>> functions and incompatible arguments (at least min-max funcs use it).
>>> Moreover, this flag can’t be set for user-defined functions, which is pretty sad.
>> Add a few tests:
>> diff --git a/test/sql-tap/func5.test.lua b/test/sql-tap/func5.test.lua
>> index 6605a2ba1..4282fdac8 100755
>> --- a/test/sql-tap/func5.test.lua
>> +++ b/test/sql-tap/func5.test.lua
>> @@ -1,6 +1,6 @@
>> #!/usr/bin/env tarantool
>> test = require("sqltester")
>> -test:plan(5)
>> +test:plan(9)
>> 
>> --!./tcltestrunner.lua
>> -- 2010 August 27
>> @@ -98,5 +98,59 @@ test:do_execsql_test(
>>        -- </func5-2.2>
>>    })
>> 
>> +-- The following tests ensures that max() and min() functions
>> +-- raise error if argument's collations are incompatible.
>> +
>> +test:do_catchsql_test(
>> +    "func-5-3.1",
>> +    [[
>> +        SELECT max('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
>> +    ]],
>> +    {
>> +        -- <func5-3.1>
>> +        1, "Illegal mix of collations"
>> +        -- </func5-3.1>
>> +    }
>> +)
>> +
>> +test:do_catchsql_test(
>> +    "func-5-3.2",
>> +    [[
>> +        CREATE TABLE test1 (s1 VARCHAR(5) PRIMARY KEY COLLATE "unicode");
>> +        CREATE TABLE test2 (s2 VARCHAR(5) PRIMARY KEY COLLATE "unicode_ci");
>> +        INSERT INTO test1 VALUES ('a');
>> +        INSERT INTO test2 VALUES ('a');
>> +        SELECT max(s1, s2) FROM test1 JOIN test2;
>> +    ]],
>> +    {
>> +        -- <func5-3.2>
>> +        1, "Illegal mix of collations"
>> +        -- </func5-3.2>
>> +    }
>> +)
>> +
>> +test:do_catchsql_test(
>> +    "func-5-3.3",
>> +    [[
>> +        SELECT min('a' COLLATE "unicode", 'A' COLLATE "unicode_ci”);
> 
> Please add tests involving more than two params.
> Example of illegal collation mix passing your tests
> I pointed out above.

> Oh, I meant ‘implicitly set’. Still can’t find such tests.


Added additional tests:

iff --git a/test/sql-tap/func5.test.lua b/test/sql-tap/func5.test.lua
index 4282fdac8..4120d5fb5 100755
--- a/test/sql-tap/func5.test.lua
+++ b/test/sql-tap/func5.test.lua
@@ -1,6 +1,6 @@
 #!/usr/bin/env tarantool
 test = require("sqltester")
-test:plan(9)
+test:plan(15)
 
 --!./tcltestrunner.lua
 -- 2010 August 27
@@ -132,7 +132,7 @@ test:do_catchsql_test(
 test:do_catchsql_test(
     "func-5-3.3",
     [[
-        SELECT min('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
+        SELECT max ('abc', 'asd' COLLATE "binary", 'abc' COLLATE "unicode")
     ]],
     {
         -- <func5-3.3>
@@ -141,15 +141,91 @@ test:do_catchsql_test(
     }
 )
 
-test:do_catchsql_test(
+test:do_execsql_test(
     "func-5-3.4",
+    [[
+        SELECT max (s1, 'asd' COLLATE "binary", s2) FROM test1 JOIN test2;
+    ]], {
+        -- <func5-3.4>
+        "asd"
+        -- </func5-3.4>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5.3.5",
+    [[
+        CREATE TABLE test3 (s3 VARCHAR(5) PRIMARY KEY COLLATE "unicode");
+        CREATE TABLE test4 (s4 VARCHAR(5) PRIMARY KEY COLLATE "unicode");
+        CREATE TABLE test5 (s5 VARCHAR(5) PRIMARY KEY COLLATE "binary");
+        INSERT INTO test3 VALUES ('a');
+        INSERT INTO test4 VALUES ('a');
+        INSERT INTO test5 VALUES ('a');
+        SELECT max(s3, s4, s5) FROM test3 JOIN test4 JOIN test5;
+    ]],
+    {
+        -- <func5-3.5>
+        1, "Illegal mix of collations"
+        -- </func5-3.5>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.6",
+    [[
+        SELECT min('a' COLLATE "unicode", 'A' COLLATE "unicode_ci");
+    ]],
+    {
+        -- <func5-3.6>
+        1, "Illegal mix of collations"
+        -- </func5-3.6>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.7",
     [[
         SELECT min(s1, s2) FROM test1 JOIN test2;
     ]],
     {
-        -- <func5-3.4>
+        -- <func5-3.7>
         1, "Illegal mix of collations"
-        -- </func5-3.4>
+        -- </func5-3.7>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5-3.8",
+    [[
+        SELECT min ('abc', 'asd' COLLATE "binary", 'abc' COLLATE "unicode")
+    ]],
+    {
+        -- <func5-3.8>
+        1, "Illegal mix of collations"
+        -- </func5-3.8>
+    }
+)
+
+test:do_execsql_test(
+    "func-5-3.9",
+    [[
+        SELECT min (s1, 'asd' COLLATE "binary", s2) FROM test1 JOIN test2;
+    ]], {
+        -- <func5-3.9>
+        "a"
+        -- </func5-3.9>
+    }
+)
+
+test:do_catchsql_test(
+    "func-5.3.10",
+    [[
+        SELECT min(s3, s4, s5) FROM test3 JOIN test4 JOIN test5;
+    ]],
+    {
+        -- <func5-3.10>
+        1, "Illegal mix of collations"
+        -- <func5-3.10>
     }
 )

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 1/2] sql: add better collation determination in functions
  2019-03-28 14:08         ` i.koptelov
@ 2019-03-29  9:57           ` n.pettik
  0 siblings, 0 replies; 13+ messages in thread
From: n.pettik @ 2019-03-29  9:57 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Ivan Koptelov

Both patches LGTM.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func.
  2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
  2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
@ 2019-04-01 14:15 ` Kirill Yukhin
  2 siblings, 0 replies; 13+ messages in thread
From: Kirill Yukhin @ 2019-04-01 14:15 UTC (permalink / raw)
  To: tarantool-patches; +Cc: korablev, Ivan Koptelov

Hello,

On 20 Mar 14:11, Ivan Koptelov wrote:
> There are two patches in this series. First one enhances
> collation determination for function arguments (e.g. for cases,
> when function arguments have different collations). Second one
> renames instr() to position() and swaps arguments order. Also
> after the second patch both arguments must have the same type,
> which should be either TEXT or BLOB.
> 
> Branch https://github.com/tarantool/tarantool/tree/sudobobo/gh-3933-add-position-func
> Issue https://github.com/tarantool/tarantool/issues/3933

I've checked your patches into master & 2.1 branch.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-04-01 14:15 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-20 11:11 [tarantool-patches] [PATCH 0/2] sql: add better coll. determination & position func Ivan Koptelov
2019-03-20 11:11 ` [tarantool-patches] [PATCH 1/2] sql: add better collation determination in functions Ivan Koptelov
2019-03-25 19:26   ` [tarantool-patches] " n.pettik
2019-03-27 13:38     ` i.koptelov
2019-03-28 12:50       ` n.pettik
2019-03-28 14:08         ` i.koptelov
2019-03-29  9:57           ` n.pettik
2019-03-20 11:11 ` [tarantool-patches] [PATCH 2/2] sql: rename instr to position & add collation usage Ivan Koptelov
2019-03-20 12:59   ` [tarantool-patches] Re: [PATCH 1/2] " i.koptelov
2019-03-26 12:32   ` [tarantool-patches] Re: [PATCH 2/2] " n.pettik
2019-03-27 13:39     ` i.koptelov
2019-03-28 12:57       ` n.pettik
2019-04-01 14:15 ` [tarantool-patches] Re: [PATCH 0/2] sql: add better coll. determination & position func Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox