From: "n.pettik" <korablev@tarantool.org> To: tarantool-patches@freelists.org Cc: Roman Khabibov <roman.habibov@tarantool.org> Subject: [tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'00' in char set Date: Tue, 25 Dec 2018 13:40:50 +0200 [thread overview] Message-ID: <EA1374B8-8DD8-42DD-9CAA-5FE136939959@tarantool.org> (raw) In-Reply-To: <3b2ab896-d36f-3d19-e740-34d9db5d91f0@tarantool.org> > commit 39b2481341cd2e71d67a04bec9aeed0e1189740c > Author: Roman Khabibov <roman.habibov@tarantool.org> > Date: Sat Dec 15 13:21:59 2018 +0300 > > sql: fix bug with BLOB TRIM() when X'00' in char set > > The reason for the bug was that X'00' is a terminal symbol. If the char set > contained X'00', all characters are ignored after it (including itself). > > Closes #3543 > > diff --git a/src/box/sql/func.c b/src/box/sql/func.c > index 9667aead5..f397e23c1 100644 > --- a/src/box/sql/func.c > +++ b/src/box/sql/func.c > @@ -1223,9 +1223,19 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) > } else if ((zCharSet = sqlite3_value_text(argv[1])) == 0) { > return; > } else { > - const unsigned char *z; > - for (z = zCharSet, nChar = 0; *z; nChar++) { > + const unsigned char *z = zCharSet; > + int sizeOfCharSet = \ We don’t use backslashes to carry code (it is used for macroses and sometimes for comments). > + sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */ We put comments on the top of code to be commented: /* Size of char set in bytes. */ int sizeOfCharSet = sqlite3_value_bytes(argv[1]); I guess this comment is completely useless: name of var and function say exactly the same as comment does. > + int nProcessedBytes = 0; > + nChar = 0; > + const unsigned char *zStepBack; > + /* Count the number of UTF-8 characters passing through the > + * entire char set, but not up to the '\0' or X'00' character. */ Use tnt-style comments. > + while(sizeOfCharSet - nProcessedBytes > 0) { > + zStepBack = z; > SQLITE_SKIP_UTF8(z); > + nProcessedBytes += z - zStepBack; > + nChar++; > } > if (nChar > 0) { > azChar = > @@ -1235,10 +1245,18 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) > return; > } > aLen = (unsigned char *)&azChar[nChar]; > - for (z = zCharSet, nChar = 0; *z; nChar++) { > + z = zCharSet; > + nChar = 0; > + nProcessedBytes = 0; This comment is again useless. > + /* Similar to the previous cycle. But I see trailing space. Use git diff to spot such places. > + * now write into "azCharSet". */ Use tnt-style comments. > + while(sizeOfCharSet - nProcessedBytes > 0) { > azChar[nChar] = (unsigned char *)z; > + zStepBack = z; You don’t need here ‘zStepBack’, you already saved current str position to azChar. > SQLITE_SKIP_UTF8(z); > + nProcessedBytes += z - zStepBack; > aLen[nChar] = (u8) (z - azChar[nChar]); > + nChar++; > } > } > } All points considered, I suggest diff like this: diff --git a/src/box/sql/func.c b/src/box/sql/func.c index f397e23c1..9b5773321 100644 --- a/src/box/sql/func.c +++ b/src/box/sql/func.c @@ -1203,7 +1203,8 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) int i; /* Loop counter */ unsigned char *aLen = 0; /* Length of each character in zCharSet */ unsigned char **azChar = 0; /* Individual characters in zCharSet */ - int nChar; /* Number of characters in zCharSet */ + /* Number of UTF-8 characters in zCharSet. */ + int nChar; if (sqlite3_value_type(argv[0]) == SQLITE_NULL) { return; @@ -1224,17 +1225,20 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) return; } else { const unsigned char *z = zCharSet; - int sizeOfCharSet = \ - sqlite3_value_bytes(argv[1]); /* Size of char set in bytes. */ - int nProcessedBytes = 0; + int trim_set_sz = sqlite3_value_bytes(argv[1]); + int handled_bytes_cnt = trim_set_sz; nChar = 0; - const unsigned char *zStepBack; - /* Count the number of UTF-8 characters passing through the - * entire char set, but not up to the '\0' or X'00' character. */ - while(sizeOfCharSet - nProcessedBytes > 0) { - zStepBack = z; + /* + * Count the number of UTF-8 characters passing + * through the entire char set, but not up + * to the '\0' or X'00' character. This allows + * to handle trimming set containing such + * characters. + */ + while(handled_bytes_cnt > 0) { + const unsigned char *prev_byte = z; SQLITE_SKIP_UTF8(z); - nProcessedBytes += z - zStepBack; + handled_bytes_cnt -= (z - prev_byte); nChar++; } if (nChar > 0) { @@ -1247,15 +1251,12 @@ trimFunc(sqlite3_context * context, int argc, sqlite3_value ** argv) aLen = (unsigned char *)&azChar[nChar]; z = zCharSet; nChar = 0; - nProcessedBytes = 0; - /* Similar to the previous cycle. But - * now write into "azCharSet". */ - while(sizeOfCharSet - nProcessedBytes > 0) { + handled_bytes_cnt = trim_set_sz; + while(handled_bytes_cnt > 0) { azChar[nChar] = (unsigned char *)z; - zStepBack = z; SQLITE_SKIP_UTF8(z); - nProcessedBytes += z - zStepBack; aLen[nChar] = (u8) (z - azChar[nChar]); + handled_bytes_cnt -= aLen[nChar]; nChar++; Check it out. If you are ok with it, you can apply it (partially or fully).
next prev parent reply other threads:[~2018-12-25 11:40 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-12-15 10:57 [tarantool-patches] " Roman Khabibov 2018-12-18 8:40 ` [tarantool-patches] " n.pettik 2018-12-18 14:30 ` Roman 2018-12-25 11:40 ` n.pettik [this message] 2018-12-26 13:56 ` Roman 2018-12-28 11:09 ` n.pettik 2018-12-20 20:41 ` Vladislav Shpilevoy 2018-12-27 12:28 ` Kirill Yukhin
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=EA1374B8-8DD8-42DD-9CAA-5FE136939959@tarantool.org \ --to=korablev@tarantool.org \ --cc=roman.habibov@tarantool.org \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH] sql: fix bug with BLOB TRIM() when X'\''00'\'' in char set' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox