[Tarantool-patches] [PATCH v1 1/1] sql: fix a segfault in hex() on receiving zeroblob

Mergen Imeev imeevma at tarantool.org
Mon Sep 6 12:45:28 MSK 2021


Hi! Thank you for the review! My answers below.

On Fri, Sep 03, 2021 at 10:19:56PM +0300, Safin Timur wrote:
> 
> 
> On 01.09.2021 11:44, Mergen Imeev wrote:
> > Hi! Thank you for the review. My answers below.
> > 
> > On Tue, Aug 31, 2021 at 10:32:46PM +0300, Timur Safin wrote:
> > > I may be missing something obvious, but the prior version of the code
> > > with pBlob and n was much shorter, more compact, and more readable.
> > > I'm curious, why do you prefer to always use argv[0]->n and
> > > argv[0]->z instead?
> > > 
> > If we talk about the old function, then it really does look simpler. However,
> > it did not work correctly and also made some unnecessary changes to the
> > arguments. You can compare with the fixed version of the old function on this
> > branch: imeevma/gh-6113-fix-hex-segfault-2.8 (which I also sent you for
> > review). You will see a much smaller difference there.
> 
> I meant that the newer code was a bit of a mouthful, with unnecessary code
> substitution and visual noise that harmed readability. Here is an example
> of a version that does not use argv[0]->... wherever we refer to fields.
> 
> ----------------------------------------------------
> /** Implementation of the HEX() SQL built-in function. */
> static void
> func_hex(struct sql_context *ctx, int argc, struct Mem **argv)
> {
> 	assert(argc == 1);
> 	(void)argc;
> 	if (argv[0]->type == MEM_TYPE_NULL)
> 		return mem_set_null(ctx->pOut);
> 
> 	int n = argv[0]->n;
> 	int zero_len = argv[0]->u.nZero;
I believe you cannot read an undefined value here: u.nZero is only meaningful
when the MEM_Zero flag is set.

> 	assert(argv[0]->type == MEM_TYPE_BIN && n >= 0);
> 	assert((argv[0]->flags & MEM_Zero) == 0 || zero_len >= 0);
> 
> 	uint32_t size = 2 * n;
> 	if ((argv[0]->flags & MEM_Zero) != 0)
> 		size += 2 * zero_len;
> 	if (size == 0)
> 		return mem_set_str0_static(ctx->pOut, "");
> 
> 	char *str = sqlDbMallocRawNN(sql_get(), size);
> 	if (str == NULL) {
> 		ctx->is_aborted = true;
> 		return;
> 	}
> 	for (int i = 0; i < n; ++i) {
> 		char c = argv[0]->z[i];
> 		str[2 * i] = hexdigits[(c >> 4) & 0xf];
> 		str[2 * i + 1] = hexdigits[c & 0xf];
> 	}
> 	if ((argv[0]->flags & MEM_Zero) != 0)
> 		memset(&str[2 * n], '0', 2 * zero_len);
> 	mem_set_str_allocated(ctx->pOut, str, size);
> }
> 
> ----------------------------------------------------
> 
> It more closely resembles the original code (and that was done intentionally).
> 
I don't like that you define a variable whose value is undefined in some cases.
I would introduce new variables if the logic were complicated, but I don't see
the need here, since there are no complex expressions.
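
For illustration only, here is a minimal standalone sketch of the conversion
that keeps a local for the zero-tail length but reads it only when the flag
says it is valid. It is not the patch: struct blob, has_zero, n_zero and
blob_to_hex() are hypothetical stand-ins for struct Mem, MEM_Zero, u.nZero and
func_hex(), and it uses plain malloc() and a NUL terminator for convenience.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct Mem fields used by func_hex(). */
struct blob {
	const char *z;	/* Explicit bytes. */
	int n;		/* Number of explicit bytes. */
	int has_zero;	/* Plays the role of the MEM_Zero flag. */
	int n_zero;	/* Zero-byte tail length; valid only if has_zero. */
};

static const char hexdigits[] = "0123456789ABCDEF";

/*
 * Return a malloc()ed, NUL-terminated hex rendering of the blob,
 * or NULL on allocation failure. The caller frees the result.
 */
static char *
blob_to_hex(const struct blob *b)
{
	assert(b->n >= 0);
	/* Read n_zero only when the flag says it is meaningful. */
	int zero_len = b->has_zero ? b->n_zero : 0;
	assert(zero_len >= 0);

	size_t size = 2 * ((size_t)b->n + (size_t)zero_len);
	char *str = malloc(size + 1);
	if (str == NULL)
		return NULL;
	for (int i = 0; i < b->n; ++i) {
		unsigned char c = (unsigned char)b->z[i];
		str[2 * i] = hexdigits[c >> 4];
		str[2 * i + 1] = hexdigits[c & 0xf];
	}
	/* Each implicit zero byte is rendered as the two characters "00". */
	memset(&str[2 * b->n], '0', 2 * (size_t)zero_len);
	str[size] = '\0';
	return str;
}
```

With two explicit bytes 0x01 0xAB and a zero tail of three bytes, this sketch
produces "01AB000000".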

> Also (and I didn't change it in the sample) there is an apparently missing
> check for the SQL_LIMIT_LENGTH limit, which used to be done in contextMalloc()
> but is gone now that we use sqlDbMallocRawNN(). I assume we had better bring
> this check back (once again as a proper wrapper, which contextMalloc()
> essentially was).
> 
This will be verified in VDBE. I think it is better to have such a check
centralized for all functions.
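
To show what I mean by centralized, here is a rough sketch of the idea only.
It is not the actual VDBE code: every name in it (SKETCH_LIMIT_LENGTH,
struct sketch_result, sketch_hex(), sketch_call_builtin()) is hypothetical.
The built-in itself does no limit checking, and the single caller validates
the result length for every function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit; stands in for SQL_LIMIT_LENGTH. */
enum { SKETCH_LIMIT_LENGTH = 8 };

/* Hypothetical result holder; stands in for the output Mem. */
struct sketch_result {
	size_t len;
	bool is_aborted;
};

typedef void (*sketch_func_f)(struct sketch_result *out, size_t arg);

/* A built-in that does no limit checking of its own. */
static void
sketch_hex(struct sketch_result *out, size_t blob_len)
{
	out->len = 2 * blob_len;
}

/* The single place where the result of any built-in is checked. */
static bool
sketch_call_builtin(sketch_func_f func, struct sketch_result *out, size_t arg)
{
	out->len = 0;
	out->is_aborted = false;
	func(out, arg);
	if (out->is_aborted)
		return false;
	if (out->len > SKETCH_LIMIT_LENGTH) {
		/* Result is too long: fail here, not inside the built-in. */
		out->is_aborted = true;
		return false;
	}
	return true;
}
```

A new built-in added later gets the same check for free, which is why I would
rather not repeat it inside each function.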

> > 
> > > Also, it seems to me we had better limit the number of bytes a customer
> > > may request to allocate from HEX(). What about checking against SQL_LIMIT_LENGTH?
> > > 
> > This check is performed in the OP_BuiltinFunction opcode.
> 
> That's nice, so it's not a problem then.
> 
> > 
> > > Thanks,
> > > Timur
> > > 
> > > > -----Original Message-----
> > > > From: imeevma at tarantool.org <imeevma at tarantool.org>
> > > > Sent: Monday, August 30, 2021 9:31 AM
> > > > To: tsafin at tarantool.org
> > > > Cc: tarantool-patches at dev.tarantool.org
> > > > Subject: [PATCH v1 1/1] sql: fix a segfault in hex() on receiving zeroblob
> > > > 
> > > > This patch fixes a segmentation fault when zeroblob is received by the
> > > > SQL built-in HEX() function.
> > > > 
> > > > Closes #6113
> > > > ---
> > > > https://github.com/tarantool/tarantool/issues/6113
> > > > https://github.com/tarantool/tarantool/tree/imeevma/gh-6113-fix-hex-segfault-2.10
> > > > 
> > > >   .../gh-6113-fix-segfault-in-hex-func.md       |  5 ++
> > > >   src/box/sql/func.c                            | 75 ++++++++++---------
> > > >   test/sql-tap/engine.cfg                       |  1 +
> > > >   ...gh-6113-assert-in-hex-on-zeroblob.test.lua | 13 ++++
> > > >   4 files changed, 58 insertions(+), 36 deletions(-)
> > > >   create mode 100644 changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > >   create mode 100755 test/sql-tap/gh-6113-assert-in-hex-on-zeroblob.test.lua
> > > > 
> > > > diff --git a/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md b/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > > new file mode 100644
> > > > index 000000000..c59be4d96
> > > > --- /dev/null
> > > > +++ b/changelogs/unreleased/gh-6113-fix-segfault-in-hex-func.md
> > > > @@ -0,0 +1,5 @@
> > > > +## bugfix/sql
> > > > +
> > > > +* The HEX() SQL built-in function now does not throw an assert on receiving
> > > > +  varbinary values that consist of zero-bytes (gh-6113).
> > > > +
> > > > diff --git a/src/box/sql/func.c b/src/box/sql/func.c
> > > > index c063552d6..fa2a2c245 100644
> > > > --- a/src/box/sql/func.c
> > > > +++ b/src/box/sql/func.c
> > > > @@ -53,6 +53,44 @@
> > > >   static struct mh_strnptr_t *built_in_functions = NULL;
> > > >   static struct func_sql_builtin **functions;
> > > > 
> > > > +/** Array for converting from half-bytes into ASCII hex digits. */
> > > > +static const char hexdigits[] = {
> > > > +	'0', '1', '2', '3', '4', '5', '6', '7',
> > > > +	'8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
> > > > +};
> > > > +
> > > > +/** Implementation of the HEX() SQL built-in function. */
> > > > +static void
> > > > +func_hex(struct sql_context *ctx, int argc, struct Mem **argv)
> > > > +{
> > > > +	assert(argc == 1);
> > > > +	(void)argc;
> > > > +	if (argv[0]->type == MEM_TYPE_NULL)
> > > > +		return mem_set_null(ctx->pOut);
> > > > +
> > > > +	assert(argv[0]->type == MEM_TYPE_BIN && argv[0]->n >= 0);
> > > > +	assert((argv[0]->flags & MEM_Zero) == 0 || argv[0]->u.nZero >= 0);
> > > > +	uint32_t size = 2 * argv[0]->n;
> > > > +	if ((argv[0]->flags & MEM_Zero) != 0)
> > > > +		size += 2 * argv[0]->u.nZero;
> > > > +	if (size == 0)
> > > > +		return mem_set_str0_static(ctx->pOut, "");
> > > > +
> > > > +	char *str = sqlDbMallocRawNN(sql_get(), size);
> > > > +	if (str == NULL) {
> > > > +		ctx->is_aborted = true;
> > > > +		return;
> > > > +	}
> > > > +	for (int i = 0; i < argv[0]->n; ++i) {
> > > > +		char c = argv[0]->z[i];
> > > > +		str[2 * i] = hexdigits[(c >> 4) & 0xf];
> > > > +		str[2 * i + 1] = hexdigits[c & 0xf];
> > > > +	}
> > > > +	if ((argv[0]->flags & MEM_Zero) != 0)
> > > > +		memset(&str[2 * argv[0]->n], '0', 2 * argv[0]->u.nZero);
> > > > +	mem_set_str_allocated(ctx->pOut, str, size);
> > > > +}
> > > > +
> > > >   static const unsigned char *
> > > >   mem_as_ustr(struct Mem *mem)
> > > >   {
> > > > @@ -1072,14 +1110,6 @@ sql_func_version(struct sql_context *context,
> > > >   	sql_result_text(context, tarantool_version(), -1, SQL_STATIC);
> > > >   }
> > > > 
> > > > -/* Array for converting from half-bytes (nybbles) into ASCII hex
> > > > - * digits.
> > > > - */
> > > > -static const char hexdigits[] = {
> > > > -	'0', '1', '2', '3', '4', '5', '6', '7',
> > > > -	'8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
> > > > -};
> > > > -
> > > >   /*
> > > >    * Implementation of the QUOTE() function.  This function takes a single
> > > >    * argument.  If the argument is numeric, the return value is the same as
> > > > @@ -1233,33 +1263,6 @@ charFunc(sql_context * context, int argc, sql_value ** argv)
> > > >   	sql_result_text64(context, (char *)z, zOut - z, sql_free);
> > > >   }
> > > > 
> > > > -/*
> > > > - * The hex() function.  Interpret the argument as a blob.  Return
> > > > - * a hexadecimal rendering as text.
> > > > - */
> > > > -static void
> > > > -hexFunc(sql_context * context, int argc, sql_value ** argv)
> > > > -{
> > > > -	int i, n;
> > > > -	const unsigned char *pBlob;
> > > > -	char *zHex, *z;
> > > > -	assert(argc == 1);
> > > > -	UNUSED_PARAMETER(argc);
> > > > -	pBlob = mem_as_bin(argv[0]);
> > > > -	n = mem_len_unsafe(argv[0]);
> > > > -	assert(pBlob == mem_as_bin(argv[0]));	/* No encoding change */
> > > > -	z = zHex = contextMalloc(context, ((i64) n) * 2 + 1);
> > > > -	if (zHex) {
> > > > -		for (i = 0; i < n; i++, pBlob++) {
> > > > -			unsigned char c = *pBlob;
> > > > -			*(z++) = hexdigits[(c >> 4) & 0xf];
> > > > -			*(z++) = hexdigits[c & 0xf];
> > > > -		}
> > > > -		*z = 0;
> > > > -		sql_result_text(context, zHex, n * 2, sql_free);
> > > > -	}
> > > > -}
> > > > -
> > > >   /*
> > > >    * The zeroblob(N) function returns a zero-filled blob of size N bytes.
> > > >    */
> > > > @@ -2034,7 +2037,7 @@ static struct sql_func_definition definitions[] = {
> > > >   	{"GROUP_CONCAT", 2, {FIELD_TYPE_VARBINARY, FIELD_TYPE_VARBINARY},
> > > >   	 FIELD_TYPE_VARBINARY, groupConcatStep, groupConcatFinalize},
> > > > 
> > > > -	{"HEX", 1, {FIELD_TYPE_VARBINARY}, FIELD_TYPE_STRING, hexFunc, NULL},
> > > > +	{"HEX", 1, {FIELD_TYPE_VARBINARY}, FIELD_TYPE_STRING, func_hex, NULL},
> > > >   	{"IFNULL", 2, {FIELD_TYPE_ANY, FIELD_TYPE_ANY}, FIELD_TYPE_SCALAR,
> > > >   	 sql_builtin_stub, NULL},
> > > > 
> 
> Regards,
> Timur

