From: Roman <roman.habibov@tarantool.org> To: tarantool-patches@freelists.org, "n.pettik" <korablev@tarantool.org> Subject: [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Date: Tue, 18 Dec 2018 17:30:36 +0300 [thread overview] Message-ID: <3b2ab896-d36f-3d19-e740-34d9db5d91f0@tarantool.org> (raw) In-Reply-To: <D9B9528A-5E3E-4AF1-B2CE-90BA417B969A@tarantool.org> Hi! Thanks for review. > Nit: ‘… after its’ -> ‘… after it’. > > Nit: inculding -> including. > > Nit: ‘are then ignored’ -> ‘are ignored’. > > Nit: I wouldn’t mix first and second conditions: > ‘If the char set contains …, all characters are ignored...' Fixed. > You should put links to the branch and issue below delimiter ---, not above. Sorry. > Lets start from putting here comments which explain > what is going on here (how you algorithm/approach works). > Then, lets choose more suitable names for variables: > length, sizeInChar and nChar sound very similar > (if you can’t come up with better names, then at least > mention in comment what they mean). > >> + int sizeInChar = sqlite3_value_bytes(argv[1]); >> + int length = 0; > What is the difference between sizeInChar, length and nChar? I add comments and renamed vars. Hope, now it's clearer. > We use explicit conditions as a predicate value. In other words: > > while (sizeInChar - length > 0) > > Also, it would be better if you declare and initialise var at once: > > const unsigned char *z = zCharSet; Fixed. >> + zStepBack = z; >> SQLITE_SKIP_UTF8(z); >> + length += z - zStepBack; >> + nChar++; >> } >> if (nChar > 0) { >> azChar = >> @@ -1235,10 +1243,16 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) >> return; >> } >> aLen = (unsigned char *)&azChar[nChar]; >> - for (z = zCharSet, nChar = 0; *z; nChar++) { >> + z = zCharSet; >> + nChar = 0; >> + length = 0; >> + while(sizeInChar - length) { >> azChar[nChar] = (unsigned char *)z; >> + zStepBack = z; >> SQLITE_SKIP_UTF8(z); >> + length += z - zStepBack; >> aLen[nChar] = (u8) (z - azChar[nChar]); >> + nChar++; >> } >> } >> } >> diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua >> index 393212968..b7b8e7c4c 100755 >> --- a/test/sql-tap/func.test.lua >> +++ b/test/sql-tap/func.test.lua >> @@ -1,6 +1,6 @@ >> #!/usr/bin/env tarantool >> test = require("sqltester") >> -test:plan(14535) >> +test:plan(14547) >> >> --!./tcltestrunner.lua >> -- 2001 September 15 >> @@ -2100,11 +2100,133 @@ test:do_execsql_test( >> -- </func-22.22> >> }) >> >> -- This is to test the deprecated sqlite3_aggregate_count() API. >> -- >> --test:do_test( >> -- "func-23.1", >> --- function() >> +-- function()S > Garbage diff. Sorry. I's an accident. commit 39b2481341cd2e71d67a04bec9aeed0e1189740c Author: Roman Khabibov <roman.habibov@tarantool.org> Date: Sat Dec 15 13:21:59 2018 +0300 sql: fix bug with BLOB TRIM() when X'00' in char set The reason for the bug was that X'00' is a terminal symbol. If the char set contained X'00', all characters are ignored after it (including itself). Closes #3543 diff --git a/src/box/sql/func.c b/src/box/sql/func.c index 9667aead5..f397e23c1 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) { return; } else { - const unsigned char *z; - for (z = zCharSet, nChar = 0; *z; nChar++) { + const unsigned char *z = zCharSet; + int sizeOfCharSet = \ + sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */ + int nProcessedBytes = 0; + nChar = 0; + const unsigned char *zStepBack; + /* Count the number of UTF-8 characters passing through the + * entire char set, but not up to the '\0' or X'00' character. */ + while(sizeOfCharSet - nProcessedBytes > 0) { + zStepBack = z; SQLITE_SKIP_UTF8(z); + nProcessedBytes += z - zStepBack; + nChar++; } if (nChar > 0) { azChar = @@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) return; } aLen = (unsigned char *)&azChar[nChar]; - for (z = zCharSet, nChar = 0; *z; nChar++) { + z = zCharSet; + nChar = 0; + nProcessedBytes = 0; + /* Similar to the previous cycle. But + * now write into "azCharSet". */ + while(sizeOfCharSet - nProcessedBytes > 0) { azChar[nChar] = (unsigned char *)z; + zStepBack = z; SQLITE_SKIP_UTF8(z); + nProcessedBytes += z - zStepBack; aLen[nChar] = (u8) (z - azChar[nChar]); + nChar++; } } } diff --git a/test/sql-tap/func.test.lua b/test/sql-tap/func.test.lua index 393212968..b7de1d955 100755 --- a/test/sql-tap/func.test.lua +++ b/test/sql-tap/func.test.lua @@ -1,6 +1,6 @@ #!/usr/bin/env tarantool test = require("sqltester") -test:plan(14535) +test:plan(14547) --!./tcltestrunner.lua -- 2001 September 15 @@ -2100,6 +2100,128 @@ test:do_execsql_test( -- </func-22.22> }) +-- gh-3543 Check trimming of binary string when X'00' in trimming char set. + +test:do_execsql_test( + "func-22.23", + [[ + SELECT TRIM(X'004100', X'00'); + ]], { + -- <func-22.23> + "A" + -- </func-22.23> + }) + +test:do_execsql_test( + "func-22.24", + [[ + SELECT TRIM(X'004100', X'0000'); + ]], { + -- <func-22.24> + "A" + -- </func-22.24> + }) + +test:do_execsql_test( + "func-22.25", + [[ + SELECT TRIM(X'004100', X'0042'); + ]], { + -- <func-22.25> + "A" + -- </func-22.25> + }) + +test:do_execsql_test( + "func-22.26", + [[ + SELECT TRIM(X'00004100420000', X'00'); + ]], { + -- <func-22.26> + "A\0B" + -- </func-22.26> + }) + +test:do_execsql_test( + "func-22.27", + [[ + SELECT LTRIM(X'004100', X'00'); + ]], { + -- <func-22.27> + "A\0" + -- </func-22.27> + }) + +test:do_execsql_test( + "func-22.28", + [[ + SELECT LTRIM(X'004100', X'0000'); + ]], { + -- <func-22.28> + "A\0" + -- </func-22.28> + }) + +test:do_execsql_test( + "func-22.29", + [[ + SELECT LTRIM(X'004100', X'0042'); + ]], { + -- <func-22.29> + "A\0" + -- </func-22.29> + }) + +test:do_execsql_test( + "func-22.30", + [[ + SELECT LTRIM(X'00004100420000', X'00'); + ]], { + -- <func-22.30> + "A\0B\0\0" + -- </func-22.30> + }) + +test:do_execsql_test( + "func-22.31", + [[ + SELECT RTRIM(X'004100', X'00'); + ]], { + -- <func-22.31> + "\0A" + -- </func-22.31> + }) + +test:do_execsql_test( + "func-22.32", + [[ + SELECT RTRIM(X'004100', X'0000'); + ]], { + -- <func-22.32> + "\0A" + -- </func-22.32> + }) + +test:do_execsql_test( + "func-22.33", + [[ + SELECT RTRIM(X'004100', X'0042'); + ]], { + -- <func-22.33> + "\0A" + -- </func-22.33> + }) + +test:do_execsql_test( + "func-22.34", + [[ + SELECT RTRIM(X'00004100420000', X'00'); + ]], { + -- <func-22.34> + "\0\0A\0B" + -- </func-22.34> + }) + -- This is to test the deprecated sqlite3_aggregate_count() API. -- --test:do_test(
next prev parent reply other threads:[~2018-12-18 14:30 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-12-15 10:57 [tarantool-patches] " Roman Khabibov 2018-12-18 8:40 ` [tarantool-patches] " n.pettik 2018-12-18 14:30 ` Roman [this message] 2018-12-25 11:40 ` n.pettik 2018-12-26 13:56 ` Roman 2018-12-28 11:09 ` n.pettik 2018-12-20 20:41 ` Vladislav Shpilevoy 2018-12-27 12:28 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=3b2ab896-d36f-3d19-e740-34d9db5d91f0@tarantool.org \ --to=roman.habibov@tarantool.org \ --cc=korablev@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'\''00'\'' in char set' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox