From: Mergen Imeev via Tarantool-patches
Reply-To: Mergen Imeev
Date: Tue, 7 Sep 2021 12:16:24 +0300
To: Safin Timur
Cc: tarantool-patches@dev.tarantool.org
Subject: Re: [Tarantool-patches] [PATCH v1 1/1] sql: fix a segfault in hex() on receiving zeroblob
Message-ID: <20210907091624.GA4768@tarantool.org>
In-Reply-To: <7ac838fc-44cf-ccf0-2ca2-c126be437cb2@tarantool.org>
List-Id: Tarantool development patches

Hi! Thank you for the review! My answers below. Also, I think there is no point
in continuing with this review since we cannot get through such basic questions.
Thanks for your time.

On Mon, Sep 06, 2021 at 11:32:06PM +0300, Safin Timur wrote:
>
> On 06.09.2021 12:45, Mergen Imeev wrote:
> > Hi! Thank you for the review! My answer below.
> >
> > On Fri, Sep 03, 2021 at 10:19:56PM +0300, Safin Timur wrote:
> > >
> > >
> > > On 01.09.2021 11:44, Mergen Imeev wrote:
> > > > Hi! Thank you for the review. My answers below.
> > > >
> > > > On Tue, Aug 31, 2021 at 10:32:46PM +0300, Timur Safin wrote:
> > > > > I may be missing something obvious, but the prior version of the code
> > > > > with pBlob and n was much shorter, more compact and more readable.
> > > > > I'm curious, why do you prefer to always use argv[0]->n and
> > > > > argv[0]->z instead?
> > > > >
> > > > If we talk about the old function, then it really looks simpler. However, it did
> > > > not work correctly and also made some unnecessary changes to the arguments. You
> > > > can compare to the fixed version of the old function on this branch:
> > > > imeevma/gh-6113-fix-hex-segfault-2.8 (which I also sent you for review). You will
> > > > see much less difference there.
> > >
> > > I meant that the newer code was a bit of a mouthful, with unnecessary code
> > > substitution and visual noise which harmed readability. Here is an example
> > > of a version which does not use argv[0]->.. wherever we refer to fields.
> > >
> > > ----------------------------------------------------
> > > /** Implementation of the HEX() SQL built-in function.
> > >  */
> > > static void
> > > func_hex(struct sql_context *ctx, int argc, struct Mem **argv)
> > > {
> > > 	assert(argc == 1);
> > > 	(void)argc;
> > > 	if (argv[0]->type == MEM_TYPE_NULL)
> > > 		return mem_set_null(ctx->pOut);
> > >
> > > 	int n = argv[0]->n;
> > > 	int zero_len = argv[0]->u.nZero;
>
> > I believe you cannot use undefined value.
>
> That's a good question, and the wording in standard C is (was) confusing here, and
> one could get the impression that it's UB to access a field of a union other
> than the one that was initialized last. It's called type punning, and it has
> always been implemented as a trickier way to do a "reinterpret cast" which is
> compatible with aliasing analysis in the compiler. There is a corrigendum in a C99
> defect report which clarifies that:
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
> ```
> 78a If the member used to access the contents of a union object is not the
> same as the member last used to store a value in the object, the appropriate
> part of the object representation of the value is reinterpreted as an object
> representation in the new type as described in 6.2.6 (a process sometimes
> called "type punning"). This might be a trap representation.
> ```
>
> i.e. it's either a cast to the requested type, or a trap if such a cast is
> impossible (like if the target type is double, the value in memory forms an
> invalid floating-point value, and the host hardware is programmed to raise in
> such cases).
>
> In this case we always get an integer value; moreover, we use this value below
> __only__ if argv[0]->flags & MEM_Zero is non-zero, so we use this value
> only in correct circumstances.
>
> So, in short, that's OK in the code below.
>

You want to declare a variable with a value that will be properly defined only if
MEM has the MEM_Zero flag. I think that it is not a good idea.
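
For illustration only, a minimal sketch of a middle ground between the two positions above: a local variable that reads the union member only when the flag says it was stored, so it never holds a value of unclear meaning. It assumes a simplified stand-in for struct Mem; demo_mem, DEMO_MEM_ZERO and demo_hex_size are hypothetical names, not Tarantool code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag, standing in for MEM_Zero. */
enum { DEMO_MEM_ZERO = 0x1 };

/* Hypothetical, simplified stand-in for struct Mem (not the real layout). */
struct demo_mem {
	union {
		int n_zero;	/* Meaningful only when DEMO_MEM_ZERO is set. */
		double r;
	} u;
	int n;			/* Number of explicit bytes. */
	uint32_t flags;
};

/* Size of the hex rendering: two characters per explicit or implicit byte. */
static uint32_t
demo_hex_size(const struct demo_mem *mem)
{
	/* Read the union member only when the flag says it was stored. */
	int zero_len = 0;
	if ((mem->flags & DEMO_MEM_ZERO) != 0)
		zero_len = mem->u.n_zero;
	assert(mem->n >= 0 && zero_len >= 0);
	return 2 * (uint32_t)mem->n + 2 * (uint32_t)zero_len;
}
```

With this shape the short local name is kept, and the zero-tail length is simply 0 whenever the flag is not set.
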
> >
> > > 	assert(argv[0]->type == MEM_TYPE_BIN && n >= 0);
> > > 	assert((argv[0]->flags & MEM_Zero) == 0 || zero_len >= 0);
> > >
> > > 	uint32_t size = 2 * n;
> > > 	if ((argv[0]->flags & MEM_Zero) != 0)
> > > 		size += 2 * zero_len;
> > > 	if (size == 0)
> > > 		return mem_set_str0_static(ctx->pOut, "");
> > >
> > > 	char *str = sqlDbMallocRawNN(sql_get(), size);
> > > 	if (str == NULL) {
> > > 		ctx->is_aborted = true;
> > > 		return;
> > > 	}
> > > 	for (int i = 0; i < n; ++i) {
> > > 		char c = argv[0]->z[i];
> > > 		str[2 * i] = hexdigits[(c >> 4) & 0xf];
> > > 		str[2 * i + 1] = hexdigits[c & 0xf];
> > > 	}
> > > 	if ((argv[0]->flags & MEM_Zero) != 0)
> > > 		memset(&str[2 * n], '0', 2 * zero_len);
> > > 	mem_set_str_allocated(ctx->pOut, str, size);
> > > }
> > >
> > > ----------------------------------------------------
> > >
> > > It more closely resembles the original code (and that was done intentionally).
> > >
> > I don't like that you define a variable with an undefined value in some cases.
> > I would introduce some new variables if there was some complicated logic,
> > however I don't see the need to do this here since I don't see complex
> > expressions.
>
> I still insist that excessive usage of argv[0]->xxx in every line makes this
> code uglier and less readable. Please make the code less verbose.
>
> > >
> > > Also (and I didn't change it in the sample) there is an apparently missing check
> > > for the SQL_LIMIT_LENGTH limit which used to be done in contextMalloc() before,
> > > but is now missing once we use sqlDbMallocRawNN(). I assume we had better bring
> > > this check back (once again as a proper wrapper, which contextMalloc() essentially
> > > was).
> > >
> > This will be verified in VDBE. I think it is better to have such a check
> > centralized for all functions.
>
> Is it already verified at the moment? Or did you mean it __will be__ verified
> eventually in future code?
>

See vdbe.c line 1337.
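
To picture what a centralized check means here, a rough sketch with every name hypothetical. This is not the code at vdbe.c line 1337, which is not shown in this thread; it only assumes that the caller of a built-in compares the result size against a limit and reports an error, while the built-in itself performs no limit check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit, standing in for SQL_LIMIT_LENGTH. */
enum { DEMO_LIMIT_LENGTH = 1000 };

enum demo_status { DEMO_OK, DEMO_TOO_BIG };

/* Hypothetical built-in: computes the result size, checks no limit. */
static size_t
demo_builtin_hex_size(size_t n, size_t zero_len)
{
	return 2 * (n + zero_len);
}

/*
 * Hypothetical caller playing the role of the opcode that runs built-ins.
 * The limit is checked in one place, for every function, and an oversized
 * result becomes a regular error rather than an assertion failure.
 */
static enum demo_status
demo_call_builtin(size_t n, size_t zero_len, size_t *out_size)
{
	assert(out_size != NULL);
	size_t size = demo_builtin_hex_size(n, zero_len);
	if (size > DEMO_LIMIT_LENGTH)
		return DEMO_TOO_BIG;
	*out_size = size;
	return DEMO_OK;
}
```

In this sketch an assert on the size inside demo_builtin_hex_size() would fire before the caller could return DEMO_TOO_BIG, so a debug build would abort on input that a release build reports as an ordinary error.
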
> In general, though, we may not assume that the code will always be called in
> the correct context where all bounds checks have been processed. So assertions
> should be local, placed where the conditions are assumed to hold.
>
> Remember the recent discussion elsewhere: there cannot be too many assertions.
> Please put them here. (And in any case, they will not hurt performance in
> release mode, even if there happen to be some duplicates here and
> there).
>

That would be plainly wrong. If some value takes too much memory, the error
will be thrown in VDBE (see vdbe.c line 1337). However, if we insert an assert
before this check, we will get an assertion failure instead of the error, which
doesn't look good, I believe.

> >
> > > >
> > > > > Also, it seems to me we had better limit the number of bytes a customer
> > > > > may request to allocate from HEX()? What about checking against SQL_LIMIT_LENGTH?
> > > > >
> > > > This check is performed in the OP_BuiltinFunction opcode.
> > >
> > > That's nice, so it's not a problem then.
>
> Though, assertions help, as I said above...
>
> > >
> > > >
> > > > > Thanks,
> > > > > Timur
> > >
>
> Thanks,
> Timur
>
> > > > > > -----Original Message-----
> > > > > > From: imeevma@tarantool.org
> > > > > > Sent: Monday, August 30, 2021 9:31 AM
> > > > > > To: tsafin@tarantool.org
> > > > > > Cc: tarantool-patches@dev.tarantool.org
> > > > > > Subject: [PATCH v1 1/1] sql: fix a segfault in hex() on receiving zeroblob
> > > > > >
> > > > > > This patch fixes a segmentation fault when zeroblob is received by the
> > > > > > SQL built-in HEX() function.
> > > > > >
> > > > > > Closes #6113
> > > > > > ---
> > > > > > https://github.com/tarantool/tarantool/issues/6113
> > > > > > https://github.com/tarantool/tarantool/tree/imeevma/gh-6113-fix-hex-segfault-2.10
> > > > > >
> > > > > >  .../gh-6113-fix-segfault-in-hex-func.md       |  5 ++
> > > > > >  src/box/sql/func.c                            | 75 ++++++++++---------
> > > > > >  test/sql-tap/engine.cfg                       |  1 +
> > > > > >  ...gh-6113-assert-in-hex-on-zeroblob.test.lua | 13 ++++
> > > > > >  4 files changed, 58 insertions(+), 36 deletions(-)
> > > > > >  create mode 100644 changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > > > >  create mode 100755 test/sql-tap/gh-6113-assert-in-hex-on-zeroblob.test.lua
> > > > > >
> > > > > > diff --git a/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md b/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > > > > new file mode 100644
> > > > > > index 000000000..c59be4d96
> > > > > > --- /dev/null
> > > > > > +++ b/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > > > > @@ -0,0 +1,5 @@
> > > > > > +## bugfix/sql
> > > > > > +
> > > > > > +* The HEX() SQL built-in function now does not throw an assert on receiving
> > > > > > +  varbinary values that consist of zero-bytes (gh-6113).
> > > > > > +
> > > > > > diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> > > > > > index c063552d6..fa2a2c245 100644
> > > > > > --- a/src/box/sql/func.c
> > > > > > +++ b/src/box/sql/func.c
> > > > > > @@ -53,6 +53,44 @@
> > > > > >  static struct mh_strnptr_t *built_in_functions = NULL;
> > > > > >  static struct func_sql_builtin **functions;
> > > > > >
> > > > > > +/** Array for converting from half-bytes into ASCII hex digits. */
> > > > > > +static const char hexdigits[] = {
> > > > > > +	'0', '1', '2', '3', '4', '5', '6', '7',
> > > > > > +	'8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
> > > > > > +};
> > > > > > +
> > > > > > +/** Implementation of the HEX() SQL built-in function. */
> > > > > > +static void
> > > > > > +func_hex(struct sql_context *ctx, int argc, struct Mem **argv)
> > > > > > +{
> > > > > > +	assert(argc == 1);
> > > > > > +	(void)argc;
> > > > > > +	if (argv[0]->type == MEM_TYPE_NULL)
> > > > > > +		return mem_set_null(ctx->pOut);
> > > > > > +
> > > > > > +	assert(argv[0]->type == MEM_TYPE_BIN && argv[0]->n >= 0);
> > > > > > +	assert((argv[0]->flags & MEM_Zero) == 0 || argv[0]->u.nZero >= 0);
> > > > > > +	uint32_t size = 2 * argv[0]->n;
> > > > > > +	if ((argv[0]->flags & MEM_Zero) != 0)
> > > > > > +		size += 2 * argv[0]->u.nZero;
> > > > > > +	if (size == 0)
> > > > > > +		return mem_set_str0_static(ctx->pOut, "");
> > > > > > +
> > > > > > +	char *str = sqlDbMallocRawNN(sql_get(), size);
> > > > > > +	if (str == NULL) {
> > > > > > +		ctx->is_aborted = true;
> > > > > > +		return;
> > > > > > +	}
> > > > > > +	for (int i = 0; i < argv[0]->n; ++i) {
> > > > > > +		char c = argv[0]->z[i];
> > > > > > +		str[2 * i] = hexdigits[(c >> 4) & 0xf];
> > > > > > +		str[2 * i + 1] = hexdigits[c & 0xf];
> > > > > > +	}
> > > > > > +	if ((argv[0]->flags & MEM_Zero) != 0)
> > > > > > +		memset(&str[2 * argv[0]->n], '0', 2 * argv[0]->u.nZero);
> > > > > > +	mem_set_str_allocated(ctx->pOut, str, size);
> > > > > > +}
> > > > > > +
> > > > > >  static const unsigned char *
> > > > > >  mem_as_ustr(struct Mem *mem)
> > > > > >  {
> > > > > > @@ -1072,14 +1110,6 @@ sql_func_version(struct sql_context *context,
> > > > > >  	sql_result_text(context, tarantool_version(), -1, SQL_STATIC);
> > > > > >  }
> > > > > >
> > > > > > -/* Array for converting from half-bytes (nybbles) into ASCII hex
> > > > > > - * digits.
> > > > > > - */
> > > > > > -static const char hexdigits[] = {
> > > > > > -	'0', '1', '2', '3', '4', '5', '6', '7',
> > > > > > -	'8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
> > > > > > -};
> > > > > > -
> > > > > >  /*
> > > > > >   * Implementation of the QUOTE() function. This function takes a single
> > > > > >   * argument. If the argument is numeric, the return value is the same as
> > > > > > @@ -1233,33 +1263,6 @@ charFunc(sql_context * context, int argc, sql_value ** argv)
> > > > > >  	sql_result_text64(context, (char *)z, zOut - z, sql_free);
> > > > > >  }
> > > > > >
> > > > > > -/*
> > > > > > - * The hex() function. Interpret the argument as a blob. Return
> > > > > > - * a hexadecimal rendering as text.
> > > > > > - */
> > > > > > -static void
> > > > > > -hexFunc(sql_context * context, int argc, sql_value ** argv)
> > > > > > -{
> > > > > > -	int i, n;
> > > > > > -	const unsigned char *pBlob;
> > > > > > -	char *zHex, *z;
> > > > > > -	assert(argc == 1);
> > > > > > -	UNUSED_PARAMETER(argc);
> > > > > > -	pBlob = mem_as_bin(argv[0]);
> > > > > > -	n = mem_len_unsafe(argv[0]);
> > > > > > -	assert(pBlob == mem_as_bin(argv[0]));	/* No encoding change */
> > > > > > -	z = zHex = contextMalloc(context, ((i64) n) * 2 + 1);
> > > > > > -	if (zHex) {
> > > > > > -		for (i = 0; i < n; i++, pBlob++) {
> > > > > > -			unsigned char c = *pBlob;
> > > > > > -			*(z++) = hexdigits[(c >> 4) & 0xf];
> > > > > > -			*(z++) = hexdigits[c & 0xf];
> > > > > > -		}
> > > > > > -		*z = 0;
> > > > > > -		sql_result_text(context, zHex, n * 2, sql_free);
> > > > > > -	}
> > > > > > -}
> > > > > > -
> > > > > >  /*
> > > > > >   * The zeroblob(N) function returns a zero-filled blob of size N bytes.
> > > > > >   */
> > > > > > @@ -2034,7 +2037,7 @@ static struct sql_func_definition definitions[] = {
> > > > > >  	{"GROUP_CONCAT", 2, {FIELD_TYPE_VARBINARY, FIELD_TYPE_VARBINARY},
> > > > > >  	 FIELD_TYPE_VARBINARY, groupConcatStep, groupConcatFinalize},
> > > > > >
> > > > > > -	{"HEX", 1, {FIELD_TYPE_VARBINARY}, FIELD_TYPE_STRING, hexFunc, NULL},
> > > > > > +	{"HEX", 1, {FIELD_TYPE_VARBINARY}, FIELD_TYPE_STRING, func_hex, NULL},
> > > > > >  	{"IFNULL", 2, {FIELD_TYPE_ANY, FIELD_TYPE_ANY}, FIELD_TYPE_SCALAR,
> > > > > >  	 sql_builtin_stub, NULL},
> > > > > >
> > >
> > > Regards,
> > > Timur
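
For readers who want to try the conversion discussed in this thread outside of Tarantool, here is a standalone sketch of the same idea: hex-encode the explicit bytes, then fill in '0' characters for the implicit zero tail of a zeroblob. The names demo_hexdigits and demo_hex_encode are hypothetical, and unlike func_hex() this helper writes into a caller-provided buffer and adds a terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lookup table from a half-byte to its ASCII hex digit. */
static const char demo_hexdigits[] = "0123456789ABCDEF";

/*
 * Hypothetical helper: write the hex form of `n` explicit bytes followed by
 * `zero_len` implicit zero bytes into `out`, which must hold at least
 * 2 * (n + zero_len) + 1 bytes. Returns the number of hex characters written,
 * not counting the terminating NUL.
 */
static size_t
demo_hex_encode(const unsigned char *data, size_t n, size_t zero_len,
		char *out)
{
	assert(out != NULL && (data != NULL || n == 0));
	for (size_t i = 0; i < n; ++i) {
		out[2 * i] = demo_hexdigits[(data[i] >> 4) & 0xf];
		out[2 * i + 1] = demo_hexdigits[data[i] & 0xf];
	}
	/* Each implicit zero byte renders as the two characters "00". */
	memset(&out[2 * n], '0', 2 * zero_len);
	size_t size = 2 * (n + zero_len);
	out[size] = '\0';
	return size;
}
```

For example, two explicit bytes 0x12 0xAB followed by a zero tail of three bytes should come out as the ten characters 12AB000000.
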