Tarantool development patches archive
 help / color / mirror / Atom feed
* [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
@ 2018-12-15 10:57 Roman Khabibov
  2018-12-18  8:40 ` [tarantool-patches] " n.pettik
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Roman Khabibov @ 2018-12-15 10:57 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

The reason for the bug was that X'00' is a terminal symbol. If the char set
contained X'00', all other characters after its (inculding itself) are then
ignored.

Closes #3543

Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
Issue: https://github.com/tarantool/tarantool/issues/3543
---
 src/box/sql/func.c         |  18 +++++-
 test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 140 insertions(+), 4 deletions(-)

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..5beba7bd2 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
 		return;
 	} else {
 		const unsigned char *z;
-		for (z = zCharSet, nChar = 0; *z; nChar++) {
+		int sizeInChar = sqlite3_value_bytes(argv[1]);
+		int length = 0;
+		z = zCharSet;
+		nChar = 0;
+		const unsigned char *zStepBack;
+		while(sizeInChar - length) {
+			zStepBack = z;
 			SQLITE_SKIP_UTF8(z);
+			length += z - zStepBack;
+			nChar++;
 		}
 		if (nChar > 0) {
 			azChar =
@@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
 				return;
 			}
 			aLen = (unsigned char *)&azChar[nChar];
-			for (z = zCharSet, nChar = 0; *z; nChar++) {
+			z = zCharSet;
+			nChar = 0;
+			length = 0;
+			while(sizeInChar - length) {
 				azChar[nChar] = (unsigned char *)z;
+				zStepBack = z;
 				SQLITE_SKIP_UTF8(z);
+				length += z - zStepBack;
 				aLen[nChar] = (u8) (z - azChar[nChar]);
+				nChar++;
 			}
 		}
 	}
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7b8e7c4c 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
 #!/usr/bin/env tarantool
 test = require("sqltester")
-test:plan(14535)
+test:plan(14547)
 
 --!./tcltestrunner.lua
 -- 2001 September 15
@@ -2100,11 +2100,133 @@ test:do_execsql_test(
         -- </func-22.22>
     })
 
+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+    "func-22.23",
+    [[
+        SELECT TRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.23>
+        "A"
+        -- </func-22.23>
+    })
+
+test:do_execsql_test(
+    "func-22.24",
+    [[
+        SELECT TRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.24>
+        "A"
+        -- </func-22.24>
+    })
+
+test:do_execsql_test(
+    "func-22.25",
+    [[
+        SELECT TRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.25>
+        "A"
+        -- </func-22.25>
+    })
+
+test:do_execsql_test(
+    "func-22.26",
+    [[
+        SELECT TRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.26>
+        "A\0B"
+        -- </func-22.26>
+    })
+
+test:do_execsql_test(
+    "func-22.27",
+    [[
+        SELECT LTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.27>
+        "A\0"
+        -- </func-22.27>
+    })
+
+test:do_execsql_test(
+    "func-22.28",
+    [[
+        SELECT LTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.28>
+        "A\0"
+        -- </func-22.28>
+    })
+
+test:do_execsql_test(
+    "func-22.29",
+    [[
+        SELECT LTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.29>
+        "A\0"
+        -- </func-22.29>
+    })
+
+test:do_execsql_test(
+    "func-22.30",
+    [[
+        SELECT LTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.30>
+        "A\0B\0\0"
+        -- </func-22.30>
+    })
+
+test:do_execsql_test(
+    "func-22.31",
+    [[
+        SELECT RTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.31>
+        "\0A"
+        -- </func-22.31>
+    })
+
+test:do_execsql_test(
+    "func-22.32",
+    [[
+        SELECT RTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.32>
+        "\0A"
+        -- </func-22.32>
+    })
+
+test:do_execsql_test(
+    "func-22.33",
+    [[
+        SELECT RTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.33>
+        "\0A"
+        -- </func-22.33>
+    })
+
+test:do_execsql_test(
+    "func-22.34",
+    [[
+        SELECT RTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.34>
+        "\0\0A\0B"
+        -- </func-22.34>
+    })
+
 -- This is to test the deprecated sqlite3_aggregate_count() API.
 --
 --test:do_test(
 --    "func-23.1",
---    function()
+--    function()S
 --        sqlite3_create_aggregate("db")
 --        return test:execsql([[
 --            SELECT legacy_count() FROM t6;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
@ 2018-12-18  8:40 ` n.pettik
  2018-12-18 14:30   ` Roman
  2018-12-20 20:41 ` Vladislav Shpilevoy
  2018-12-27 12:28 ` Kirill Yukhin
  2 siblings, 1 reply; 8+ messages in thread
From: n.pettik @ 2018-12-18  8:40 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Roman Khabibov



> On 15 Dec 2018, at 13:57, Roman Khabibov <roman.habibov@tarantool.org> wrote:
> 
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.

Nit: ‘… after its’ -> ‘… after it’.

Nit: inculding -> including.

Nit: ‘are then ignored’ -> ‘are ignored’.

Nit: I wouldn’t mix first and second conditions:
‘If the char set contains …, all characters are ignored...'

> 
> Closes #3543
> 
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543

You should put links to the branch and issue below delimiter ---, not above.

> ---
> src/box/sql/func.c         |  18 +++++-
> test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 140 insertions(+), 4 deletions(-)
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..5beba7bd2 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> 		return;
> 	} else {
> 		const unsigned char *z;
> -		for (z = zCharSet, nChar = 0; *z; nChar++) {

Lets start from putting here comments which explain
what is going on here (how you algorithm/approach works).
Then, lets choose more suitable names for variables:
length, sizeInChar and nChar sound very similar
(if you can’t come up with better names, then at least
mention in comment what they mean).

> +		int sizeInChar = sqlite3_value_bytes(argv[1]);
> +		int length = 0;

What is the difference between sizeInChar, length and nChar?

> +		z = zCharSet;
> +		nChar = 0;
> +		const unsigned char *zStepBack;
> +		while(sizeInChar - length) {

We use explicit conditions as a predicate value. In other words:

while (sizeInChar - length > 0)

Also, it would be better if you declare and initialise var at once:

const unsigned char *z = zCharSet;


> +			zStepBack = z;
> 			SQLITE_SKIP_UTF8(z);
> +			length += z - zStepBack;
> +			nChar++;
> 		}
> 		if (nChar > 0) {
> 			azChar =
> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> 				return;
> 			}
> 			aLen = (unsigned char *)&azChar[nChar];
> -			for (z = zCharSet, nChar = 0; *z; nChar++) {
> +			z = zCharSet;
> +			nChar = 0;
> +			length = 0;
> +			while(sizeInChar - length) {
> 				azChar[nChar] = (unsigned char *)z;
> +				zStepBack = z;
> 				SQLITE_SKIP_UTF8(z);
> +				length += z - zStepBack;
> 				aLen[nChar] = (u8) (z - azChar[nChar]);
> +				nChar++;
> 			}
> 		}
> 	}
> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
> index 393212968..b7b8e7c4c 100755
> --- a/test/sql-tap/func.test.lua
> +++ b/test/sql-tap/func.test.lua
> @@ -1,6 +1,6 @@
> #!/usr/bin/env tarantool
> test = require("sqltester")
> -test:plan(14535)
> +test:plan(14547)
> 
> --!./tcltestrunner.lua
> -- 2001 September 15
> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
>         -- </func-22.22>
>     })
> 
> -- This is to test the deprecated sqlite3_aggregate_count() API.
> --
> --test:do_test(
> --    "func-23.1",
> ---    function()
> +--    function()S

Garbage diff.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-18  8:40 ` [tarantool-patches] " n.pettik
@ 2018-12-18 14:30   ` Roman
  2018-12-25 11:40     ` n.pettik
  0 siblings, 1 reply; 8+ messages in thread
From: Roman @ 2018-12-18 14:30 UTC (permalink / raw)
  To: tarantool-patches, n.pettik

Hi! Thanks for review.

> Nit: ‘… after its’ -> ‘… after it’.
>
> Nit: inculding -> including.
>
> Nit: ‘are then ignored’ -> ‘are ignored’.
>
> Nit: I wouldn’t mix first and second conditions:
> ‘If the char set contains …, all characters are ignored...'

Fixed.


> You should put links to the branch and issue below delimiter ---, not above.

Sorry.


> Lets start from putting here comments which explain
> what is going on here (how you algorithm/approach works).
> Then, lets choose more suitable names for variables:
> length, sizeInChar and nChar sound very similar
> (if you can’t come up with better names, then at least
> mention in comment what they mean).
>
>> +		int sizeInChar = sqlite3_value_bytes(argv[1]);
>> +		int length = 0;
> What is the difference between sizeInChar, length and nChar?

I add comments and renamed vars. Hope, now it's clearer.


> We use explicit conditions as a predicate value. In other words:
>
> while (sizeInChar - length > 0)
>
> Also, it would be better if you declare and initialise var at once:
>
> const unsigned char *z = zCharSet;

Fixed.


>> +			zStepBack = z;
>> 			SQLITE_SKIP_UTF8(z);
>> +			length += z - zStepBack;
>> +			nChar++;
>> 		}
>> 		if (nChar > 0) {
>> 			azChar =
>> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>> 				return;
>> 			}
>> 			aLen = (unsigned char *)&azChar[nChar];
>> -			for (z = zCharSet, nChar = 0; *z; nChar++) {
>> +			z = zCharSet;
>> +			nChar = 0;
>> +			length = 0;
>> +			while(sizeInChar - length) {
>> 				azChar[nChar] = (unsigned char *)z;
>> +				zStepBack = z;
>> 				SQLITE_SKIP_UTF8(z);
>> +				length += z - zStepBack;
>> 				aLen[nChar] = (u8) (z - azChar[nChar]);
>> +				nChar++;
>> 			}
>> 		}
>> 	}
>> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
>> index 393212968..b7b8e7c4c 100755
>> --- a/test/sql-tap/func.test.lua
>> +++ b/test/sql-tap/func.test.lua
>> @@ -1,6 +1,6 @@
>> #!/usr/bin/env tarantool
>> test = require("sqltester")
>> -test:plan(14535)
>> +test:plan(14547)
>>
>> --!./tcltestrunner.lua
>> -- 2001 September 15
>> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
>>          -- </func-22.22>
>>      })
>>
>> -- This is to test the deprecated sqlite3_aggregate_count() API.
>> --
>> --test:do_test(
>> --    "func-23.1",
>> ---    function()
>> +--    function()S
> Garbage diff.

Sorry. I's an accident.


commit 39b2481341cd2e71d67a04bec9aeed0e1189740c
Author: Roman Khabibov <roman.habibov@tarantool.org>
Date:   Sat Dec 15 13:21:59 2018 +0300

     sql: fix bug with BLOB TRIM() when X'00' in char set

     The reason for the bug was that X'00' is a terminal symbol. If the 
char set
     contained X'00', all characters are ignored after it (including 
itself).

     Closes #3543

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..f397e23c1 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc, 
sqlite3_value ** argv)
      } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
          return;
      } else {
-        const unsigned char *z;
-        for (z = zCharSet, nChar = 0; *z; nChar++) {
+        const unsigned char *z = zCharSet;
+        int sizeOfCharSet = \
+        sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
+        int nProcessedBytes = 0;
+        nChar = 0;
+        const unsigned char *zStepBack;
+        /* Count the number of UTF-8 characters passing through the
+        * entire char set, but not up to the '\0' or X'00' character. */
+        while(sizeOfCharSet - nProcessedBytes > 0) {
+            zStepBack = z;
              SQLITE_SKIP_UTF8(z);
+            nProcessedBytes += z - zStepBack;
+            nChar++;
          }
          if (nChar > 0) {
              azChar =
@@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc, 
sqlite3_value ** argv)
                  return;
              }
              aLen = (unsigned char *)&azChar[nChar];
-            for (z = zCharSet, nChar = 0; *z; nChar++) {
+            z = zCharSet;
+            nChar = 0;
+            nProcessedBytes = 0;
+            /* Similar to the previous cycle. But
+            * now write into "azCharSet". */
+            while(sizeOfCharSet - nProcessedBytes > 0) {
                  azChar[nChar] = (unsigned char *)z;
+                zStepBack = z;
                  SQLITE_SKIP_UTF8(z);
+                nProcessedBytes += z - zStepBack;
                  aLen[nChar] = (u8) (z - azChar[nChar]);
+                nChar++;
              }
          }
      }
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7de1d955 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
  #!/usr/bin/env tarantool
  test = require("sqltester")
-test:plan(14535)
+test:plan(14547)

  --!./tcltestrunner.lua
  -- 2001 September 15
@@ -2100,6 +2100,128 @@ test:do_execsql_test(
          -- </func-22.22>
      })

+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+    "func-22.23",
+    [[
+        SELECT TRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.23>
+        "A"
+        -- </func-22.23>
+    })
+
+test:do_execsql_test(
+    "func-22.24",
+    [[
+        SELECT TRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.24>
+        "A"
+        -- </func-22.24>
+    })
+
+test:do_execsql_test(
+    "func-22.25",
+    [[
+        SELECT TRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.25>
+        "A"
+        -- </func-22.25>
+    })
+
+test:do_execsql_test(
+    "func-22.26",
+    [[
+        SELECT TRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.26>
+        "A\0B"
+        -- </func-22.26>
+    })
+
+test:do_execsql_test(
+    "func-22.27",
+    [[
+        SELECT LTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.27>
+        "A\0"
+        -- </func-22.27>
+    })
+
+test:do_execsql_test(
+    "func-22.28",
+    [[
+        SELECT LTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.28>
+        "A\0"
+        -- </func-22.28>
+    })
+
+test:do_execsql_test(
+    "func-22.29",
+    [[
+        SELECT LTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.29>
+        "A\0"
+        -- </func-22.29>
+    })
+
+test:do_execsql_test(
+    "func-22.30",
+    [[
+        SELECT LTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.30>
+        "A\0B\0\0"
+        -- </func-22.30>
+    })
+
+test:do_execsql_test(
+    "func-22.31",
+    [[
+        SELECT RTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.31>
+        "\0A"
+        -- </func-22.31>
+    })
+
+test:do_execsql_test(
+    "func-22.32",
+    [[
+        SELECT RTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.32>
+        "\0A"
+        -- </func-22.32>
+    })
+
+test:do_execsql_test(
+    "func-22.33",
+    [[
+        SELECT RTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.33>
+        "\0A"
+        -- </func-22.33>
+    })
+
+test:do_execsql_test(
+    "func-22.34",
+    [[
+        SELECT RTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.34>
+        "\0\0A\0B"
+        -- </func-22.34>
+    })
+
  -- This is to test the deprecated sqlite3_aggregate_count() API.
  --
  --test:do_test(

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
  2018-12-18  8:40 ` [tarantool-patches] " n.pettik
@ 2018-12-20 20:41 ` Vladislav Shpilevoy
  2018-12-27 12:28 ` Kirill Yukhin
  2 siblings, 0 replies; 8+ messages in thread
From: Vladislav Shpilevoy @ 2018-12-20 20:41 UTC (permalink / raw)
  To: Roman Khabibov, tarantool-patches, Nikita Pettik

Nikita, please, do first review.

On 15/12/2018 13:57, Roman Khabibov wrote:
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.
> 
> Closes #3543
> 
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543
> ---
>   src/box/sql/func.c         |  18 +++++-
>   test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
>   2 files changed, 140 insertions(+), 4 deletions(-)
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..5beba7bd2 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>   		return;
>   	} else {
>   		const unsigned char *z;
> -		for (z = zCharSet, nChar = 0; *z; nChar++) {
> +		int sizeInChar = sqlite3_value_bytes(argv[1]);
> +		int length = 0;
> +		z = zCharSet;
> +		nChar = 0;
> +		const unsigned char *zStepBack;
> +		while(sizeInChar - length) {
> +			zStepBack = z;
>   			SQLITE_SKIP_UTF8(z);
> +			length += z - zStepBack;
> +			nChar++;
>   		}
>   		if (nChar > 0) {
>   			azChar =
> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>   				return;
>   			}
>   			aLen = (unsigned char *)&azChar[nChar];
> -			for (z = zCharSet, nChar = 0; *z; nChar++) {
> +			z = zCharSet;
> +			nChar = 0;
> +			length = 0;
> +			while(sizeInChar - length) {
>   				azChar[nChar] = (unsigned char *)z;
> +				zStepBack = z;
>   				SQLITE_SKIP_UTF8(z);
> +				length += z - zStepBack;
>   				aLen[nChar] = (u8) (z - azChar[nChar]);
> +				nChar++;
>   			}
>   		}
>   	}
> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
> index 393212968..b7b8e7c4c 100755
> --- a/test/sql-tap/func.test.lua
> +++ b/test/sql-tap/func.test.lua
> @@ -1,6 +1,6 @@
>   #!/usr/bin/env tarantool
>   test = require("sqltester")
> -test:plan(14535)
> +test:plan(14547)
>   
>   --!./tcltestrunner.lua
>   -- 2001 September 15
> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
>           -- </func-22.22>
>       })
>   
> +-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
> +
> +test:do_execsql_test(
> +    "func-22.23",
> +    [[
> +        SELECT TRIM(X'004100', X'00');
> +    ]], {
> +        -- <func-22.23>
> +        "A"
> +        -- </func-22.23>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.24",
> +    [[
> +        SELECT TRIM(X'004100', X'0000');
> +    ]], {
> +        -- <func-22.24>
> +        "A"
> +        -- </func-22.24>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.25",
> +    [[
> +        SELECT TRIM(X'004100', X'0042');
> +    ]], {
> +        -- <func-22.25>
> +        "A"
> +        -- </func-22.25>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.26",
> +    [[
> +        SELECT TRIM(X'00004100420000', X'00');
> +    ]], {
> +        -- <func-22.26>
> +        "A\0B"
> +        -- </func-22.26>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.27",
> +    [[
> +        SELECT LTRIM(X'004100', X'00');
> +    ]], {
> +        -- <func-22.27>
> +        "A\0"
> +        -- </func-22.27>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.28",
> +    [[
> +        SELECT LTRIM(X'004100', X'0000');
> +    ]], {
> +        -- <func-22.28>
> +        "A\0"
> +        -- </func-22.28>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.29",
> +    [[
> +        SELECT LTRIM(X'004100', X'0042');
> +    ]], {
> +        -- <func-22.29>
> +        "A\0"
> +        -- </func-22.29>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.30",
> +    [[
> +        SELECT LTRIM(X'00004100420000', X'00');
> +    ]], {
> +        -- <func-22.30>
> +        "A\0B\0\0"
> +        -- </func-22.30>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.31",
> +    [[
> +        SELECT RTRIM(X'004100', X'00');
> +    ]], {
> +        -- <func-22.31>
> +        "\0A"
> +        -- </func-22.31>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.32",
> +    [[
> +        SELECT RTRIM(X'004100', X'0000');
> +    ]], {
> +        -- <func-22.32>
> +        "\0A"
> +        -- </func-22.32>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.33",
> +    [[
> +        SELECT RTRIM(X'004100', X'0042');
> +    ]], {
> +        -- <func-22.33>
> +        "\0A"
> +        -- </func-22.33>
> +    })
> +
> +test:do_execsql_test(
> +    "func-22.34",
> +    [[
> +        SELECT RTRIM(X'00004100420000', X'00');
> +    ]], {
> +        -- <func-22.34>
> +        "\0\0A\0B"
> +        -- </func-22.34>
> +    })
> +
>   -- This is to test the deprecated sqlite3_aggregate_count() API.
>   --
>   --test:do_test(
>   --    "func-23.1",
> ---    function()
> +--    function()S
>   --        sqlite3_create_aggregate("db")
>   --        return test:execsql([[
>   --            SELECT legacy_count() FROM t6;
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-18 14:30   ` Roman
@ 2018-12-25 11:40     ` n.pettik
  2018-12-26 13:56       ` Roman
  0 siblings, 1 reply; 8+ messages in thread
From: n.pettik @ 2018-12-25 11:40 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Roman Khabibov


> commit 39b2481341cd2e71d67a04bec9aeed0e1189740c
> Author: Roman Khabibov <roman.habibov@tarantool.org>
> Date:   Sat Dec 15 13:21:59 2018 +0300
> 
>     sql: fix bug with BLOB TRIM() when X'00' in char set
> 
>     The reason for the bug was that X'00' is a terminal symbol. If the char set
>     contained X'00', all characters are ignored after it (including itself).
> 
>     Closes #3543
> 
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..f397e23c1 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>      } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
>          return;
>      } else {
> -        const unsigned char *z;
> -        for (z = zCharSet, nChar = 0; *z; nChar++) {
> +        const unsigned char *z = zCharSet;
> +        int sizeOfCharSet = \

We don’t use backslashes to carry code (it is used for macroses and
sometimes for comments).

> +        sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */

We put comments on the top of code to be commented:

 /* Size of char set in bytes. */
 int sizeOfCharSet = sqlite3_value_bytes(argv[1]);

I guess this comment is completely useless:
name of var and function say exactly the same as comment does.

> +        int nProcessedBytes = 0;
> +        nChar = 0;
> +        const unsigned char *zStepBack;
> +        /* Count the number of UTF-8 characters passing through the
> +        * entire char set, but not up to the '\0' or X'00' character. */

Use tnt-style comments.

> +        while(sizeOfCharSet - nProcessedBytes > 0) {
> +            zStepBack = z;
>              SQLITE_SKIP_UTF8(z);
> +            nProcessedBytes += z - zStepBack;
> +            nChar++;
>          }
>          if (nChar > 0) {
>              azChar =
> @@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>                  return;
>              }
>              aLen = (unsigned char *)&azChar[nChar];
> -            for (z = zCharSet, nChar = 0; *z; nChar++) {
> +            z = zCharSet;
> +            nChar = 0;
> +            nProcessedBytes = 0;

This comment is again useless.

> +            /* Similar to the previous cycle. But

I see trailing space. Use git diff to spot such places.

> +            * now write into "azCharSet". */

Use tnt-style comments.

> +            while(sizeOfCharSet - nProcessedBytes > 0) {
>                  azChar[nChar] = (unsigned char *)z;
> +                zStepBack = z;

You don’t need here ‘zStepBack’, you already saved current str
position to azChar.

>                  SQLITE_SKIP_UTF8(z);
> +                nProcessedBytes += z - zStepBack;
>                  aLen[nChar] = (u8) (z - azChar[nChar]);
> +                nChar++;
>              }
>          }
>      }

All points considered, I suggest diff like this:

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index f397e23c1..9b5773321 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1203,7 +1203,8 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
        int i;                  /* Loop counter */
        unsigned char *aLen = 0;        /* Length of each character in zCharSet */
        unsigned char **azChar = 0;     /* Individual characters in zCharSet */
-       int nChar;              /* Number of characters in zCharSet */
+       /* Number of UTF-8 characters in zCharSet. */
+       int nChar;
 
        if (sqlite3_value_type(argv[0]) == SQLITE_NULL) {
                return;
@@ -1224,17 +1225,20 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
                return;
        } else {
                const unsigned char *z = zCharSet;
-               int sizeOfCharSet = \
-               sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
-               int nProcessedBytes = 0;
+               int trim_set_sz = sqlite3_value_bytes(argv[1]);
+               int handled_bytes_cnt = trim_set_sz;
                nChar = 0;
-               const unsigned char *zStepBack;
-               /* Count the number of UTF-8 characters passing through the
-               * entire char set, but not up to the '\0' or X'00' character. */
-               while(sizeOfCharSet - nProcessedBytes > 0) {
-                       zStepBack = z;
+               /*
+                * Count the number of UTF-8 characters passing
+                * through the entire char set, but not up
+                * to the '\0' or X'00' character. This allows
+                * to handle trimming set containing such
+                * characters.
+                */
+               while(handled_bytes_cnt > 0) {
+                       const unsigned char *prev_byte = z;
                        SQLITE_SKIP_UTF8(z);
-                       nProcessedBytes += z - zStepBack;
+                       handled_bytes_cnt -= (z - prev_byte);
                        nChar++;
                }
                if (nChar > 0) {
@@ -1247,15 +1251,12 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
                        aLen = (unsigned char *)&azChar[nChar];
                        z = zCharSet;
                        nChar = 0;
-                       nProcessedBytes = 0;
-                       /* Similar to the previous cycle. But 
-                       * now write into "azCharSet". */
-                       while(sizeOfCharSet - nProcessedBytes > 0) {
+                       handled_bytes_cnt = trim_set_sz;
+                       while(handled_bytes_cnt > 0) {
                                azChar[nChar] = (unsigned char *)z;
-                               zStepBack = z;
                                SQLITE_SKIP_UTF8(z);
-                               nProcessedBytes += z - zStepBack;
                                aLen[nChar] = (u8) (z - azChar[nChar]);
+                               handled_bytes_cnt -= aLen[nChar];
                                nChar++;

Check it out. If you are ok with it, you can apply it (partially or fully).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-25 11:40     ` n.pettik
@ 2018-12-26 13:56       ` Roman
  2018-12-28 11:09         ` n.pettik
  0 siblings, 1 reply; 8+ messages in thread
From: Roman @ 2018-12-26 13:56 UTC (permalink / raw)
  To: n.pettik, tarantool-patches


On 25.12.2018 14:40, n.pettik wrote:
> All points considered, I suggest diff like this:
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index f397e23c1..9b5773321 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1203,7 +1203,8 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>          int i;                  /* Loop counter */
>          unsigned char *aLen = 0;        /* Length of each character in zCharSet */
>          unsigned char **azChar = 0;     /* Individual characters in zCharSet */
> -       int nChar;              /* Number of characters in zCharSet */
> +       /* Number of UTF-8 characters in zCharSet. */
> +       int nChar;
>   
>          if (sqlite3_value_type(argv[0]) == SQLITE_NULL) {
>                  return;
> @@ -1224,17 +1225,20 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>                  return;
>          } else {
>                  const unsigned char *z = zCharSet;
> -               int sizeOfCharSet = \
> -               sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
> -               int nProcessedBytes = 0;
> +               int trim_set_sz = sqlite3_value_bytes(argv[1]);
> +               int handled_bytes_cnt = trim_set_sz;
>                  nChar = 0;
> -               const unsigned char *zStepBack;
> -               /* Count the number of UTF-8 characters passing through the
> -               * entire char set, but not up to the '\0' or X'00' character. */
> -               while(sizeOfCharSet - nProcessedBytes > 0) {
> -                       zStepBack = z;
> +               /*
> +                * Count the number of UTF-8 characters passing
> +                * through the entire char set, but not up
> +                * to the '\0' or X'00' character. This allows
> +                * to handle trimming set containing such
> +                * characters.
> +                */
> +               while(handled_bytes_cnt > 0) {
> +                       const unsigned char *prev_byte = z;
>                          SQLITE_SKIP_UTF8(z);
> -                       nProcessedBytes += z - zStepBack;
> +                       handled_bytes_cnt -= (z - prev_byte);
>                          nChar++;
>                  }
>                  if (nChar > 0) {
> @@ -1247,15 +1251,12 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>                          aLen = (unsigned char *)&azChar[nChar];
>                          z = zCharSet;
>                          nChar = 0;
> -                       nProcessedBytes = 0;
> -                       /* Similar to the previous cycle. But
> -                       * now write into "azCharSet". */
> -                       while(sizeOfCharSet - nProcessedBytes > 0) {
> +                       handled_bytes_cnt = trim_set_sz;
> +                       while(handled_bytes_cnt > 0) {
>                                  azChar[nChar] = (unsigned char *)z;
> -                               zStepBack = z;
>                                  SQLITE_SKIP_UTF8(z);
> -                               nProcessedBytes += z - zStepBack;
>                                  aLen[nChar] = (u8) (z - azChar[nChar]);
> +                               handled_bytes_cnt -= aLen[nChar];
>                                  nChar++;
>
> Check it out. If you are ok with it, you can apply it (partially or fully).
>
Applied your diff.


commit 844d438852be6e3bc06a7020ec0aeb96d3d5ee4e
Author: Roman Khabibov <roman.habibov@tarantool.org>
Date:   Sat Dec 15 13:21:59 2018 +0300

     sql: fix bug with BLOB TRIM() when X'00' in char set

     The reason for the bug was that X'00' is a terminal symbol. If the 
char set
     contained X'00', all characters are ignored after it (including 
itself).

     Closes #3543

diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..e46b162d9 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1223,9 +1223,22 @@ trimFunc(sqlite3_context * context, int argc, 
sqlite3_value ** argv)
      } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
          return;
      } else {
-        const unsigned char *z;
-        for (z = zCharSet, nChar = 0; *z; nChar++) {
+        const unsigned char *z = zCharSet;
+        int trim_set_sz = sqlite3_value_bytes(argv[1]);
+        int handled_bytes_cnt = trim_set_sz;
+        nChar = 0;
+        /*
+        * Count the number of UTF-8 characters passing
+        * through the entire char set, but not up
+        * to the '\0' or X'00' character. This allows
+        * to handle trimming set containing such
+        * characters.
+        */
+        while(handled_bytes_cnt > 0) {
+        const unsigned char *prev_byte = z;
              SQLITE_SKIP_UTF8(z);
+            handled_bytes_cnt -= (z - prev_byte);
+            nChar++;
          }
          if (nChar > 0) {
              azChar =
@@ -1235,10 +1248,15 @@ trimFunc(sqlite3_context * context, int argc, 
sqlite3_value ** argv)
                  return;
              }
              aLen = (unsigned char *)&azChar[nChar];
-            for (z = zCharSet, nChar = 0; *z; nChar++) {
+            z = zCharSet;
+            nChar = 0;
+            handled_bytes_cnt = trim_set_sz;
+            while(handled_bytes_cnt > 0) {
                  azChar[nChar] = (unsigned char *)z;
                  SQLITE_SKIP_UTF8(z);
                  aLen[nChar] = (u8) (z - azChar[nChar]);
+                handled_bytes_cnt -= aLen[nChar];
+                nChar++;
              }
          }
      }
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7de1d955 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
  #!/usr/bin/env tarantool
  test = require("sqltester")
-test:plan(14535)
+test:plan(14547)

  --!./tcltestrunner.lua
  -- 2001 September 15
@@ -2100,6 +2100,128 @@ test:do_execsql_test(
          -- </func-22.22>
      })

+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+    "func-22.23",
+    [[
+        SELECT TRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.23>
+        "A"
+        -- </func-22.23>
+    })
+
+test:do_execsql_test(
+    "func-22.24",
+    [[
+        SELECT TRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.24>
+        "A"
+        -- </func-22.24>
+    })
+
+test:do_execsql_test(
+    "func-22.25",
+    [[
+        SELECT TRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.25>
+        "A"
+        -- </func-22.25>
+    })
+
+test:do_execsql_test(
+    "func-22.26",
+    [[
+        SELECT TRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.26>
+        "A\0B"
+        -- </func-22.26>
+    })
+
+test:do_execsql_test(
+    "func-22.27",
+    [[
+        SELECT LTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.27>
+        "A\0"
+        -- </func-22.27>
+    })
+
+test:do_execsql_test(
+    "func-22.28",
+    [[
+        SELECT LTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.28>
+        "A\0"
+        -- </func-22.28>
+    })
+
+test:do_execsql_test(
+    "func-22.29",
+    [[
+        SELECT LTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.29>
+        "A\0"
+        -- </func-22.29>
+    })
+
+test:do_execsql_test(
+    "func-22.30",
+    [[
+        SELECT LTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.30>
+        "A\0B\0\0"
+        -- </func-22.30>
+    })
+
+test:do_execsql_test(
+    "func-22.31",
+    [[
+        SELECT RTRIM(X'004100', X'00');
+    ]], {
+        -- <func-22.31>
+        "\0A"
+        -- </func-22.31>
+    })
+
+test:do_execsql_test(
+    "func-22.32",
+    [[
+        SELECT RTRIM(X'004100', X'0000');
+    ]], {
+        -- <func-22.32>
+        "\0A"
+        -- </func-22.32>
+    })
+
+test:do_execsql_test(
+    "func-22.33",
+    [[
+        SELECT RTRIM(X'004100', X'0042');
+    ]], {
+        -- <func-22.33>
+        "\0A"
+        -- </func-22.33>
+    })
+
+test:do_execsql_test(
+    "func-22.34",
+    [[
+        SELECT RTRIM(X'00004100420000', X'00');
+    ]], {
+        -- <func-22.34>
+        "\0\0A\0B"
+        -- </func-22.34>
+    })
+
  -- This is to test the deprecated sqlite3_aggregate_count() API.
  --
  --test:do_test(

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
  2018-12-18  8:40 ` [tarantool-patches] " n.pettik
  2018-12-20 20:41 ` Vladislav Shpilevoy
@ 2018-12-27 12:28 ` Kirill Yukhin
  2 siblings, 0 replies; 8+ messages in thread
From: Kirill Yukhin @ 2018-12-27 12:28 UTC (permalink / raw)
  To: tarantool-patches; +Cc: v.shpilevoy

Hello,

On 15 Dec 13:57, Roman Khabibov wrote:
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.
> 
> Closes #3543
> 
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543

I've checked your patch into 2.1 branch.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
  2018-12-26 13:56       ` Roman
@ 2018-12-28 11:09         ` n.pettik
  0 siblings, 0 replies; 8+ messages in thread
From: n.pettik @ 2018-12-28 11:09 UTC (permalink / raw)
  To: tarantool-patches; +Cc: Roman Khabibov

LGTM.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-12-28 11:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
2018-12-18  8:40 ` [tarantool-patches] " n.pettik
2018-12-18 14:30   ` Roman
2018-12-25 11:40     ` n.pettik
2018-12-26 13:56       ` Roman
2018-12-28 11:09         ` n.pettik
2018-12-20 20:41 ` Vladislav Shpilevoy
2018-12-27 12:28 ` Kirill Yukhin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox