* [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
@ 2018-12-15 10:57 Roman Khabibov
2018-12-18 8:40 ` [tarantool-patches] " n.pettik
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Roman Khabibov @ 2018-12-15 10:57 UTC (permalink / raw)
To: tarantool-patches; +Cc: v.shpilevoy
The reason for the bug was that X'00' is a terminal symbol. If the char set
contained X'00', all other characters after its (inculding itself) are then
ignored.
Closes #3543
Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
Issue: https://github.com/tarantool/tarantool/issues/3543
---
src/box/sql/func.c | 18 +++++-
test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
2 files changed, 140 insertions(+), 4 deletions(-)
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..5beba7bd2 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
return;
} else {
const unsigned char *z;
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ int sizeInChar = sqlite3_value_bytes(argv[1]);
+ int length = 0;
+ z = zCharSet;
+ nChar = 0;
+ const unsigned char *zStepBack;
+ while(sizeInChar - length) {
+ zStepBack = z;
SQLITE_SKIP_UTF8(z);
+ length += z - zStepBack;
+ nChar++;
}
if (nChar > 0) {
azChar =
@@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
return;
}
aLen = (unsigned char *)&azChar[nChar];
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ z = zCharSet;
+ nChar = 0;
+ length = 0;
+ while(sizeInChar - length) {
azChar[nChar] = (unsigned char *)z;
+ zStepBack = z;
SQLITE_SKIP_UTF8(z);
+ length += z - zStepBack;
aLen[nChar] = (u8) (z - azChar[nChar]);
+ nChar++;
}
}
}
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7b8e7c4c 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
#!/usr/bin/env tarantool
test = require("sqltester")
-test:plan(14535)
+test:plan(14547)
--!./tcltestrunner.lua
-- 2001 September 15
@@ -2100,11 +2100,133 @@ test:do_execsql_test(
-- </func-22.22>
})
+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+ "func-22.23",
+ [[
+ SELECT TRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.23>
+ "A"
+ -- </func-22.23>
+ })
+
+test:do_execsql_test(
+ "func-22.24",
+ [[
+ SELECT TRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.24>
+ "A"
+ -- </func-22.24>
+ })
+
+test:do_execsql_test(
+ "func-22.25",
+ [[
+ SELECT TRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.25>
+ "A"
+ -- </func-22.25>
+ })
+
+test:do_execsql_test(
+ "func-22.26",
+ [[
+ SELECT TRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.26>
+ "A\0B"
+ -- </func-22.26>
+ })
+
+test:do_execsql_test(
+ "func-22.27",
+ [[
+ SELECT LTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.27>
+ "A\0"
+ -- </func-22.27>
+ })
+
+test:do_execsql_test(
+ "func-22.28",
+ [[
+ SELECT LTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.28>
+ "A\0"
+ -- </func-22.28>
+ })
+
+test:do_execsql_test(
+ "func-22.29",
+ [[
+ SELECT LTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.29>
+ "A\0"
+ -- </func-22.29>
+ })
+
+test:do_execsql_test(
+ "func-22.30",
+ [[
+ SELECT LTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.30>
+ "A\0B\0\0"
+ -- </func-22.30>
+ })
+
+test:do_execsql_test(
+ "func-22.31",
+ [[
+ SELECT RTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.31>
+ "\0A"
+ -- </func-22.31>
+ })
+
+test:do_execsql_test(
+ "func-22.32",
+ [[
+ SELECT RTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.32>
+ "\0A"
+ -- </func-22.32>
+ })
+
+test:do_execsql_test(
+ "func-22.33",
+ [[
+ SELECT RTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.33>
+ "\0A"
+ -- </func-22.33>
+ })
+
+test:do_execsql_test(
+ "func-22.34",
+ [[
+ SELECT RTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.34>
+ "\0\0A\0B"
+ -- </func-22.34>
+ })
+
-- This is to test the deprecated sqlite3_aggregate_count() API.
--
--test:do_test(
-- "func-23.1",
--- function()
+-- function()S
-- sqlite3_create_aggregate("db")
-- return test:execsql([[
-- SELECT legacy_count() FROM t6;
--
2.17.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
@ 2018-12-18 8:40 ` n.pettik
2018-12-18 14:30 ` Roman
2018-12-20 20:41 ` Vladislav Shpilevoy
2018-12-27 12:28 ` Kirill Yukhin
2 siblings, 1 reply; 8+ messages in thread
From: n.pettik @ 2018-12-18 8:40 UTC (permalink / raw)
To: tarantool-patches; +Cc: Roman Khabibov
> On 15 Dec 2018, at 13:57, Roman Khabibov <roman.habibov@tarantool.org> wrote:
>
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.
Nit: ‘… after its’ -> ‘… after it’.
Nit: inculding -> including.
Nit: ‘are then ignored’ -> ‘are ignored’.
Nit: I wouldn’t mix first and second conditions:
‘If the char set contains …, all characters are ignored...'
>
> Closes #3543
>
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543
You should put links to the branch and issue below delimiter ---, not above.
> ---
> src/box/sql/func.c | 18 +++++-
> test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 140 insertions(+), 4 deletions(-)
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..5beba7bd2 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> } else {
> const unsigned char *z;
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
Lets start from putting here comments which explain
what is going on here (how you algorithm/approach works).
Then, lets choose more suitable names for variables:
length, sizeInChar and nChar sound very similar
(if you can’t come up with better names, then at least
mention in comment what they mean).
> + int sizeInChar = sqlite3_value_bytes(argv[1]);
> + int length = 0;
What is the difference between sizeInChar, length and nChar?
> + z = zCharSet;
> + nChar = 0;
> + const unsigned char *zStepBack;
> + while(sizeInChar - length) {
We use explicit conditions as a predicate value. In other words:
while (sizeInChar - length > 0)
Also, it would be better if you declare and initialise var at once:
const unsigned char *z = zCharSet;
> + zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> + length += z - zStepBack;
> + nChar++;
> }
> if (nChar > 0) {
> azChar =
> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> }
> aLen = (unsigned char *)&azChar[nChar];
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
> + z = zCharSet;
> + nChar = 0;
> + length = 0;
> + while(sizeInChar - length) {
> azChar[nChar] = (unsigned char *)z;
> + zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> + length += z - zStepBack;
> aLen[nChar] = (u8) (z - azChar[nChar]);
> + nChar++;
> }
> }
> }
> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
> index 393212968..b7b8e7c4c 100755
> --- a/test/sql-tap/func.test.lua
> +++ b/test/sql-tap/func.test.lua
> @@ -1,6 +1,6 @@
> #!/usr/bin/env tarantool
> test = require("sqltester")
> -test:plan(14535)
> +test:plan(14547)
>
> --!./tcltestrunner.lua
> -- 2001 September 15
> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
> -- </func-22.22>
> })
>
> -- This is to test the deprecated sqlite3_aggregate_count() API.
> --
> --test:do_test(
> -- "func-23.1",
> --- function()
> +-- function()S
Garbage diff.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-18 8:40 ` [tarantool-patches] " n.pettik
@ 2018-12-18 14:30 ` Roman
2018-12-25 11:40 ` n.pettik
0 siblings, 1 reply; 8+ messages in thread
From: Roman @ 2018-12-18 14:30 UTC (permalink / raw)
To: tarantool-patches, n.pettik
Hi! Thanks for review.
> Nit: ‘… after its’ -> ‘… after it’.
>
> Nit: inculding -> including.
>
> Nit: ‘are then ignored’ -> ‘are ignored’.
>
> Nit: I wouldn’t mix first and second conditions:
> ‘If the char set contains …, all characters are ignored...'
Fixed.
> You should put links to the branch and issue below delimiter ---, not above.
Sorry.
> Lets start from putting here comments which explain
> what is going on here (how you algorithm/approach works).
> Then, lets choose more suitable names for variables:
> length, sizeInChar and nChar sound very similar
> (if you can’t come up with better names, then at least
> mention in comment what they mean).
>
>> + int sizeInChar = sqlite3_value_bytes(argv[1]);
>> + int length = 0;
> What is the difference between sizeInChar, length and nChar?
I add comments and renamed vars. Hope, now it's clearer.
> We use explicit conditions as a predicate value. In other words:
>
> while (sizeInChar - length > 0)
>
> Also, it would be better if you declare and initialise var at once:
>
> const unsigned char *z = zCharSet;
Fixed.
>> + zStepBack = z;
>> SQLITE_SKIP_UTF8(z);
>> + length += z - zStepBack;
>> + nChar++;
>> }
>> if (nChar > 0) {
>> azChar =
>> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
>> return;
>> }
>> aLen = (unsigned char *)&azChar[nChar];
>> - for (z = zCharSet, nChar = 0; *z; nChar++) {
>> + z = zCharSet;
>> + nChar = 0;
>> + length = 0;
>> + while(sizeInChar - length) {
>> azChar[nChar] = (unsigned char *)z;
>> + zStepBack = z;
>> SQLITE_SKIP_UTF8(z);
>> + length += z - zStepBack;
>> aLen[nChar] = (u8) (z - azChar[nChar]);
>> + nChar++;
>> }
>> }
>> }
>> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
>> index 393212968..b7b8e7c4c 100755
>> --- a/test/sql-tap/func.test.lua
>> +++ b/test/sql-tap/func.test.lua
>> @@ -1,6 +1,6 @@
>> #!/usr/bin/env tarantool
>> test = require("sqltester")
>> -test:plan(14535)
>> +test:plan(14547)
>>
>> --!./tcltestrunner.lua
>> -- 2001 September 15
>> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
>> -- </func-22.22>
>> })
>>
>> -- This is to test the deprecated sqlite3_aggregate_count() API.
>> --
>> --test:do_test(
>> -- "func-23.1",
>> --- function()
>> +-- function()S
> Garbage diff.
Sorry. I's an accident.
commit 39b2481341cd2e71d67a04bec9aeed0e1189740c
Author: Roman Khabibov <roman.habibov@tarantool.org>
Date: Sat Dec 15 13:21:59 2018 +0300
sql: fix bug with BLOB TRIM() when X'00' in char set
The reason for the bug was that X'00' is a terminal symbol. If the
char set
contained X'00', all characters are ignored after it (including
itself).
Closes #3543
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..f397e23c1 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc,
sqlite3_value ** argv)
} else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
return;
} else {
- const unsigned char *z;
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ const unsigned char *z = zCharSet;
+ int sizeOfCharSet = \
+ sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
+ int nProcessedBytes = 0;
+ nChar = 0;
+ const unsigned char *zStepBack;
+ /* Count the number of UTF-8 characters passing through the
+ * entire char set, but not up to the '\0' or X'00' character. */
+ while(sizeOfCharSet - nProcessedBytes > 0) {
+ zStepBack = z;
SQLITE_SKIP_UTF8(z);
+ nProcessedBytes += z - zStepBack;
+ nChar++;
}
if (nChar > 0) {
azChar =
@@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc,
sqlite3_value ** argv)
return;
}
aLen = (unsigned char *)&azChar[nChar];
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ z = zCharSet;
+ nChar = 0;
+ nProcessedBytes = 0;
+ /* Similar to the previous cycle. But
+ * now write into "azCharSet". */
+ while(sizeOfCharSet - nProcessedBytes > 0) {
azChar[nChar] = (unsigned char *)z;
+ zStepBack = z;
SQLITE_SKIP_UTF8(z);
+ nProcessedBytes += z - zStepBack;
aLen[nChar] = (u8) (z - azChar[nChar]);
+ nChar++;
}
}
}
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7de1d955 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
#!/usr/bin/env tarantool
test = require("sqltester")
-test:plan(14535)
+test:plan(14547)
--!./tcltestrunner.lua
-- 2001 September 15
@@ -2100,6 +2100,128 @@ test:do_execsql_test(
-- </func-22.22>
})
+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+ "func-22.23",
+ [[
+ SELECT TRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.23>
+ "A"
+ -- </func-22.23>
+ })
+
+test:do_execsql_test(
+ "func-22.24",
+ [[
+ SELECT TRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.24>
+ "A"
+ -- </func-22.24>
+ })
+
+test:do_execsql_test(
+ "func-22.25",
+ [[
+ SELECT TRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.25>
+ "A"
+ -- </func-22.25>
+ })
+
+test:do_execsql_test(
+ "func-22.26",
+ [[
+ SELECT TRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.26>
+ "A\0B"
+ -- </func-22.26>
+ })
+
+test:do_execsql_test(
+ "func-22.27",
+ [[
+ SELECT LTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.27>
+ "A\0"
+ -- </func-22.27>
+ })
+
+test:do_execsql_test(
+ "func-22.28",
+ [[
+ SELECT LTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.28>
+ "A\0"
+ -- </func-22.28>
+ })
+
+test:do_execsql_test(
+ "func-22.29",
+ [[
+ SELECT LTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.29>
+ "A\0"
+ -- </func-22.29>
+ })
+
+test:do_execsql_test(
+ "func-22.30",
+ [[
+ SELECT LTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.30>
+ "A\0B\0\0"
+ -- </func-22.30>
+ })
+
+test:do_execsql_test(
+ "func-22.31",
+ [[
+ SELECT RTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.31>
+ "\0A"
+ -- </func-22.31>
+ })
+
+test:do_execsql_test(
+ "func-22.32",
+ [[
+ SELECT RTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.32>
+ "\0A"
+ -- </func-22.32>
+ })
+
+test:do_execsql_test(
+ "func-22.33",
+ [[
+ SELECT RTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.33>
+ "\0A"
+ -- </func-22.33>
+ })
+
+test:do_execsql_test(
+ "func-22.34",
+ [[
+ SELECT RTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.34>
+ "\0\0A\0B"
+ -- </func-22.34>
+ })
+
-- This is to test the deprecated sqlite3_aggregate_count() API.
--
--test:do_test(
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
2018-12-18 8:40 ` [tarantool-patches] " n.pettik
@ 2018-12-20 20:41 ` Vladislav Shpilevoy
2018-12-27 12:28 ` Kirill Yukhin
2 siblings, 0 replies; 8+ messages in thread
From: Vladislav Shpilevoy @ 2018-12-20 20:41 UTC (permalink / raw)
To: Roman Khabibov, tarantool-patches, Nikita Pettik
Nikita, please, do first review.
On 15/12/2018 13:57, Roman Khabibov wrote:
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.
>
> Closes #3543
>
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543
> ---
> src/box/sql/func.c | 18 +++++-
> test/sql-tap/func.test.lua | 126 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 140 insertions(+), 4 deletions(-)
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..5beba7bd2 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1224,8 +1224,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> } else {
> const unsigned char *z;
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
> + int sizeInChar = sqlite3_value_bytes(argv[1]);
> + int length = 0;
> + z = zCharSet;
> + nChar = 0;
> + const unsigned char *zStepBack;
> + while(sizeInChar - length) {
> + zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> + length += z - zStepBack;
> + nChar++;
> }
> if (nChar > 0) {
> azChar =
> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> }
> aLen = (unsigned char *)&azChar[nChar];
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
> + z = zCharSet;
> + nChar = 0;
> + length = 0;
> + while(sizeInChar - length) {
> azChar[nChar] = (unsigned char *)z;
> + zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> + length += z - zStepBack;
> aLen[nChar] = (u8) (z - azChar[nChar]);
> + nChar++;
> }
> }
> }
> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
> index 393212968..b7b8e7c4c 100755
> --- a/test/sql-tap/func.test.lua
> +++ b/test/sql-tap/func.test.lua
> @@ -1,6 +1,6 @@
> #!/usr/bin/env tarantool
> test = require("sqltester")
> -test:plan(14535)
> +test:plan(14547)
>
> --!./tcltestrunner.lua
> -- 2001 September 15
> @@ -2100,11 +2100,133 @@ test:do_execsql_test(
> -- </func-22.22>
> })
>
> +-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
> +
> +test:do_execsql_test(
> + "func-22.23",
> + [[
> + SELECT TRIM(X'004100', X'00');
> + ]], {
> + -- <func-22.23>
> + "A"
> + -- </func-22.23>
> + })
> +
> +test:do_execsql_test(
> + "func-22.24",
> + [[
> + SELECT TRIM(X'004100', X'0000');
> + ]], {
> + -- <func-22.24>
> + "A"
> + -- </func-22.24>
> + })
> +
> +test:do_execsql_test(
> + "func-22.25",
> + [[
> + SELECT TRIM(X'004100', X'0042');
> + ]], {
> + -- <func-22.25>
> + "A"
> + -- </func-22.25>
> + })
> +
> +test:do_execsql_test(
> + "func-22.26",
> + [[
> + SELECT TRIM(X'00004100420000', X'00');
> + ]], {
> + -- <func-22.26>
> + "A\0B"
> + -- </func-22.26>
> + })
> +
> +test:do_execsql_test(
> + "func-22.27",
> + [[
> + SELECT LTRIM(X'004100', X'00');
> + ]], {
> + -- <func-22.27>
> + "A\0"
> + -- </func-22.27>
> + })
> +
> +test:do_execsql_test(
> + "func-22.28",
> + [[
> + SELECT LTRIM(X'004100', X'0000');
> + ]], {
> + -- <func-22.28>
> + "A\0"
> + -- </func-22.28>
> + })
> +
> +test:do_execsql_test(
> + "func-22.29",
> + [[
> + SELECT LTRIM(X'004100', X'0042');
> + ]], {
> + -- <func-22.29>
> + "A\0"
> + -- </func-22.29>
> + })
> +
> +test:do_execsql_test(
> + "func-22.30",
> + [[
> + SELECT LTRIM(X'00004100420000', X'00');
> + ]], {
> + -- <func-22.30>
> + "A\0B\0\0"
> + -- </func-22.30>
> + })
> +
> +test:do_execsql_test(
> + "func-22.31",
> + [[
> + SELECT RTRIM(X'004100', X'00');
> + ]], {
> + -- <func-22.31>
> + "\0A"
> + -- </func-22.31>
> + })
> +
> +test:do_execsql_test(
> + "func-22.32",
> + [[
> + SELECT RTRIM(X'004100', X'0000');
> + ]], {
> + -- <func-22.32>
> + "\0A"
> + -- </func-22.32>
> + })
> +
> +test:do_execsql_test(
> + "func-22.33",
> + [[
> + SELECT RTRIM(X'004100', X'0042');
> + ]], {
> + -- <func-22.33>
> + "\0A"
> + -- </func-22.33>
> + })
> +
> +test:do_execsql_test(
> + "func-22.34",
> + [[
> + SELECT RTRIM(X'00004100420000', X'00');
> + ]], {
> + -- <func-22.34>
> + "\0\0A\0B"
> + -- </func-22.34>
> + })
> +
> -- This is to test the deprecated sqlite3_aggregate_count() API.
> --
> --test:do_test(
> -- "func-23.1",
> --- function()
> +-- function()S
> -- sqlite3_create_aggregate("db")
> -- return test:execsql([[
> -- SELECT legacy_count() FROM t6;
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-18 14:30 ` Roman
@ 2018-12-25 11:40 ` n.pettik
2018-12-26 13:56 ` Roman
0 siblings, 1 reply; 8+ messages in thread
From: n.pettik @ 2018-12-25 11:40 UTC (permalink / raw)
To: tarantool-patches; +Cc: Roman Khabibov
> commit 39b2481341cd2e71d67a04bec9aeed0e1189740c
> Author: Roman Khabibov <roman.habibov@tarantool.org>
> Date: Sat Dec 15 13:21:59 2018 +0300
>
> sql: fix bug with BLOB TRIM() when X'00' in char set
>
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all characters are ignored after it (including itself).
>
> Closes #3543
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index 9667aead5..f397e23c1 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
> return;
> } else {
> - const unsigned char *z;
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
> + const unsigned char *z = zCharSet;
> + int sizeOfCharSet = \
We don’t use backslashes to carry code (it is used for macroses and
sometimes for comments).
> + sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
We put comments on the top of code to be commented:
/* Size of char set in bytes. */
int sizeOfCharSet = sqlite3_value_bytes(argv[1]);
I guess this comment is completely useless:
name of var and function say exactly the same as comment does.
> + int nProcessedBytes = 0;
> + nChar = 0;
> + const unsigned char *zStepBack;
> + /* Count the number of UTF-8 characters passing through the
> + * entire char set, but not up to the '\0' or X'00' character. */
Use tnt-style comments.
> + while(sizeOfCharSet - nProcessedBytes > 0) {
> + zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> + nProcessedBytes += z - zStepBack;
> + nChar++;
> }
> if (nChar > 0) {
> azChar =
> @@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> }
> aLen = (unsigned char *)&azChar[nChar];
> - for (z = zCharSet, nChar = 0; *z; nChar++) {
> + z = zCharSet;
> + nChar = 0;
> + nProcessedBytes = 0;
This comment is again useless.
> + /* Similar to the previous cycle. But
I see trailing space. Use git diff to spot such places.
> + * now write into "azCharSet". */
Use tnt-style comments.
> + while(sizeOfCharSet - nProcessedBytes > 0) {
> azChar[nChar] = (unsigned char *)z;
> + zStepBack = z;
You don’t need here ‘zStepBack’, you already saved current str
position to azChar.
> SQLITE_SKIP_UTF8(z);
> + nProcessedBytes += z - zStepBack;
> aLen[nChar] = (u8) (z - azChar[nChar]);
> + nChar++;
> }
> }
> }
All points considered, I suggest diff like this:
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index f397e23c1..9b5773321 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1203,7 +1203,8 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
int i; /* Loop counter */
unsigned char *aLen = 0; /* Length of each character in zCharSet */
unsigned char **azChar = 0; /* Individual characters in zCharSet */
- int nChar; /* Number of characters in zCharSet */
+ /* Number of UTF-8 characters in zCharSet. */
+ int nChar;
if (sqlite3_value_type(argv[0]) == SQLITE_NULL) {
return;
@@ -1224,17 +1225,20 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
return;
} else {
const unsigned char *z = zCharSet;
- int sizeOfCharSet = \
- sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
- int nProcessedBytes = 0;
+ int trim_set_sz = sqlite3_value_bytes(argv[1]);
+ int handled_bytes_cnt = trim_set_sz;
nChar = 0;
- const unsigned char *zStepBack;
- /* Count the number of UTF-8 characters passing through the
- * entire char set, but not up to the '\0' or X'00' character. */
- while(sizeOfCharSet - nProcessedBytes > 0) {
- zStepBack = z;
+ /*
+ * Count the number of UTF-8 characters passing
+ * through the entire char set, but not up
+ * to the '\0' or X'00' character. This allows
+ * to handle trimming set containing such
+ * characters.
+ */
+ while(handled_bytes_cnt > 0) {
+ const unsigned char *prev_byte = z;
SQLITE_SKIP_UTF8(z);
- nProcessedBytes += z - zStepBack;
+ handled_bytes_cnt -= (z - prev_byte);
nChar++;
}
if (nChar > 0) {
@@ -1247,15 +1251,12 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
aLen = (unsigned char *)&azChar[nChar];
z = zCharSet;
nChar = 0;
- nProcessedBytes = 0;
- /* Similar to the previous cycle. But
- * now write into "azCharSet". */
- while(sizeOfCharSet - nProcessedBytes > 0) {
+ handled_bytes_cnt = trim_set_sz;
+ while(handled_bytes_cnt > 0) {
azChar[nChar] = (unsigned char *)z;
- zStepBack = z;
SQLITE_SKIP_UTF8(z);
- nProcessedBytes += z - zStepBack;
aLen[nChar] = (u8) (z - azChar[nChar]);
+ handled_bytes_cnt -= aLen[nChar];
nChar++;
Check it out. If you are ok with it, you can apply it (partially or fully).
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-25 11:40 ` n.pettik
@ 2018-12-26 13:56 ` Roman
2018-12-28 11:09 ` n.pettik
0 siblings, 1 reply; 8+ messages in thread
From: Roman @ 2018-12-26 13:56 UTC (permalink / raw)
To: n.pettik, tarantool-patches
On 25.12.2018 14:40, n.pettik wrote:
> All points considered, I suggest diff like this:
>
> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> index f397e23c1..9b5773321 100644
> --- a/src/box/sql/func.c
> +++ b/src/box/sql/func.c
> @@ -1203,7 +1203,8 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> int i; /* Loop counter */
> unsigned char *aLen = 0; /* Length of each character in zCharSet */
> unsigned char **azChar = 0; /* Individual characters in zCharSet */
> - int nChar; /* Number of characters in zCharSet */
> + /* Number of UTF-8 characters in zCharSet. */
> + int nChar;
>
> if (sqlite3_value_type(argv[0]) == SQLITE_NULL) {
> return;
> @@ -1224,17 +1225,20 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> return;
> } else {
> const unsigned char *z = zCharSet;
> - int sizeOfCharSet = \
> - sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */
> - int nProcessedBytes = 0;
> + int trim_set_sz = sqlite3_value_bytes(argv[1]);
> + int handled_bytes_cnt = trim_set_sz;
> nChar = 0;
> - const unsigned char *zStepBack;
> - /* Count the number of UTF-8 characters passing through the
> - * entire char set, but not up to the '\0' or X'00' character. */
> - while(sizeOfCharSet - nProcessedBytes > 0) {
> - zStepBack = z;
> + /*
> + * Count the number of UTF-8 characters passing
> + * through the entire char set, but not up
> + * to the '\0' or X'00' character. This allows
> + * to handle trimming set containing such
> + * characters.
> + */
> + while(handled_bytes_cnt > 0) {
> + const unsigned char *prev_byte = z;
> SQLITE_SKIP_UTF8(z);
> - nProcessedBytes += z - zStepBack;
> + handled_bytes_cnt -= (z - prev_byte);
> nChar++;
> }
> if (nChar > 0) {
> @@ -1247,15 +1251,12 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv)
> aLen = (unsigned char *)&azChar[nChar];
> z = zCharSet;
> nChar = 0;
> - nProcessedBytes = 0;
> - /* Similar to the previous cycle. But
> - * now write into "azCharSet". */
> - while(sizeOfCharSet - nProcessedBytes > 0) {
> + handled_bytes_cnt = trim_set_sz;
> + while(handled_bytes_cnt > 0) {
> azChar[nChar] = (unsigned char *)z;
> - zStepBack = z;
> SQLITE_SKIP_UTF8(z);
> - nProcessedBytes += z - zStepBack;
> aLen[nChar] = (u8) (z - azChar[nChar]);
> + handled_bytes_cnt -= aLen[nChar];
> nChar++;
>
> Check it out. If you are ok with it, you can apply it (partially or fully).
>
Applied your diff.
commit 844d438852be6e3bc06a7020ec0aeb96d3d5ee4e
Author: Roman Khabibov <roman.habibov@tarantool.org>
Date: Sat Dec 15 13:21:59 2018 +0300
sql: fix bug with BLOB TRIM() when X'00' in char set
The reason for the bug was that X'00' is a terminal symbol. If the
char set
contained X'00', all characters are ignored after it (including
itself).
Closes #3543
diff --git a/src/box/sql/func.c b/src/box/sql/func.c
index 9667aead5..e46b162d9 100644
--- a/src/box/sql/func.c
+++ b/src/box/sql/func.c
@@ -1223,9 +1223,22 @@ trimFunc(sqlite3_context * context, int argc,
sqlite3_value ** argv)
} else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) {
return;
} else {
- const unsigned char *z;
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ const unsigned char *z = zCharSet;
+ int trim_set_sz = sqlite3_value_bytes(argv[1]);
+ int handled_bytes_cnt = trim_set_sz;
+ nChar = 0;
+ /*
+ * Count the number of UTF-8 characters passing
+ * through the entire char set, but not up
+ * to the '\0' or X'00' character. This allows
+ * to handle trimming set containing such
+ * characters.
+ */
+ while(handled_bytes_cnt > 0) {
+ const unsigned char *prev_byte = z;
SQLITE_SKIP_UTF8(z);
+ handled_bytes_cnt -= (z - prev_byte);
+ nChar++;
}
if (nChar > 0) {
azChar =
@@ -1235,10 +1248,15 @@ trimFunc(sqlite3_context * context, int argc,
sqlite3_value ** argv)
return;
}
aLen = (unsigned char *)&azChar[nChar];
- for (z = zCharSet, nChar = 0; *z; nChar++) {
+ z = zCharSet;
+ nChar = 0;
+ handled_bytes_cnt = trim_set_sz;
+ while(handled_bytes_cnt > 0) {
azChar[nChar] = (unsigned char *)z;
SQLITE_SKIP_UTF8(z);
aLen[nChar] = (u8) (z - azChar[nChar]);
+ handled_bytes_cnt -= aLen[nChar];
+ nChar++;
}
}
}
diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua
index 393212968..b7de1d955 100755
--- a/test/sql-tap/func.test.lua
+++ b/test/sql-tap/func.test.lua
@@ -1,6 +1,6 @@
#!/usr/bin/env tarantool
test = require("sqltester")
-test:plan(14535)
+test:plan(14547)
--!./tcltestrunner.lua
-- 2001 September 15
@@ -2100,6 +2100,128 @@ test:do_execsql_test(
-- </func-22.22>
})
+-- gh-3543 Check trimming of binary string when X'00' in trimming char set.
+
+test:do_execsql_test(
+ "func-22.23",
+ [[
+ SELECT TRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.23>
+ "A"
+ -- </func-22.23>
+ })
+
+test:do_execsql_test(
+ "func-22.24",
+ [[
+ SELECT TRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.24>
+ "A"
+ -- </func-22.24>
+ })
+
+test:do_execsql_test(
+ "func-22.25",
+ [[
+ SELECT TRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.25>
+ "A"
+ -- </func-22.25>
+ })
+
+test:do_execsql_test(
+ "func-22.26",
+ [[
+ SELECT TRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.26>
+ "A\0B"
+ -- </func-22.26>
+ })
+
+test:do_execsql_test(
+ "func-22.27",
+ [[
+ SELECT LTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.27>
+ "A\0"
+ -- </func-22.27>
+ })
+
+test:do_execsql_test(
+ "func-22.28",
+ [[
+ SELECT LTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.28>
+ "A\0"
+ -- </func-22.28>
+ })
+
+test:do_execsql_test(
+ "func-22.29",
+ [[
+ SELECT LTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.29>
+ "A\0"
+ -- </func-22.29>
+ })
+
+test:do_execsql_test(
+ "func-22.30",
+ [[
+ SELECT LTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.30>
+ "A\0B\0\0"
+ -- </func-22.30>
+ })
+
+test:do_execsql_test(
+ "func-22.31",
+ [[
+ SELECT RTRIM(X'004100', X'00');
+ ]], {
+ -- <func-22.31>
+ "\0A"
+ -- </func-22.31>
+ })
+
+test:do_execsql_test(
+ "func-22.32",
+ [[
+ SELECT RTRIM(X'004100', X'0000');
+ ]], {
+ -- <func-22.32>
+ "\0A"
+ -- </func-22.32>
+ })
+
+test:do_execsql_test(
+ "func-22.33",
+ [[
+ SELECT RTRIM(X'004100', X'0042');
+ ]], {
+ -- <func-22.33>
+ "\0A"
+ -- </func-22.33>
+ })
+
+test:do_execsql_test(
+ "func-22.34",
+ [[
+ SELECT RTRIM(X'00004100420000', X'00');
+ ]], {
+ -- <func-22.34>
+ "\0\0A\0B"
+ -- </func-22.34>
+ })
+
-- This is to test the deprecated sqlite3_aggregate_count() API.
--
--test:do_test(
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
2018-12-18 8:40 ` [tarantool-patches] " n.pettik
2018-12-20 20:41 ` Vladislav Shpilevoy
@ 2018-12-27 12:28 ` Kirill Yukhin
2 siblings, 0 replies; 8+ messages in thread
From: Kirill Yukhin @ 2018-12-27 12:28 UTC (permalink / raw)
To: tarantool-patches; +Cc: v.shpilevoy
Hello,
On 15 Dec 13:57, Roman Khabibov wrote:
> The reason for the bug was that X'00' is a terminal symbol. If the char set
> contained X'00', all other characters after its (inculding itself) are then
> ignored.
>
> Closes #3543
>
> Branch: https://github.com/tarantool/tarantool/tree/romankhabibov/gh-3543-trim-terminal
> Issue: https://github.com/tarantool/tarantool/issues/3543
I've checked your patch into 2.1 branch.
--
Regards, Kirill Yukhin
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set
2018-12-26 13:56 ` Roman
@ 2018-12-28 11:09 ` n.pettik
0 siblings, 0 replies; 8+ messages in thread
From: n.pettik @ 2018-12-28 11:09 UTC (permalink / raw)
To: tarantool-patches; +Cc: Roman Khabibov
LGTM.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-12-28 11:10 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-15 10:57 [tarantool-patches] [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Roman Khabibov
2018-12-18 8:40 ` [tarantool-patches] " n.pettik
2018-12-18 14:30 ` Roman
2018-12-25 11:40 ` n.pettik
2018-12-26 13:56 ` Roman
2018-12-28 11:09 ` n.pettik
2018-12-20 20:41 ` Vladislav Shpilevoy
2018-12-27 12:28 ` Kirill Yukhin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox