From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 961F625A5D for ; Wed, 31 Jul 2019 12:57:43 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PmtRmOJ1foHO for ; Wed, 31 Jul 2019 12:57:43 -0400 (EDT) Received: from smtp18.mail.ru (smtp18.mail.ru [94.100.176.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 25B6D22E8D for ; Wed, 31 Jul 2019 12:57:42 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member References: <8be21965-b8ef-4f96-8ce3-99c2efcecfc0@tarantool.org> From: Vladislav Shpilevoy Message-ID: <351df50f-d096-45de-0d04-c9cee83d1289@tarantool.org> Date: Wed, 31 Jul 2019 18:59:52 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: "n.pettik" , tarantool-patches@freelists.org Hi! Thanks for the fixes! See 5 comments below. On 31/07/2019 11:14, n.pettik wrote: > > >>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c >>> index fcf147c96..f50df105d 100644 >>> --- a/src/box/sql/func.c >>> +++ b/src/box/sql/func.c >>> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv) >>> { >>> const char *z = 0; >>> UNUSED_PARAMETER(NotUsed); >>> + enum field_type f_t = argv[0]->field_type; >> >> 1. Perhaps, field_type is not the best name, because Mem is not always >> is a field. It can be just a value, not from a space. It is rather >> just 'type’. > > When memory represents smth different from value fetched from > tuple, it is set to field_type_MAX. So I’d rather not rename this member. 1. Ok, now I got it. It works only for tuple fields, and is totally disabled for other values. Or not? See other comments. >> 7. All these manual assignments of field_type are broken by design. >> When you have to update field_type after each operation manually, >> you will miss some places for sure. For example: >> >> tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)') >> --- >> - row_count: 1 >> ... >> >> tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)') >> --- >> - row_count: 1 >> ... >> >> tarantool> box.execute('SELECT typeof(a & b) FROM test') >> --- >> - metadata: >> - name: typeof(a & b) >> type: string >> rows: >> - ['null'] >> ... >> >> tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test') >> --- >> - metadata: >> - name: typeof(a) >> type: string >> - name: typeof(b) >> type: string >> - name: typeof(a & b) >> type: string >> rows: >> - ['integer', 'integer', 'integer'] >> ... >> >> >> Here in the last two queries 'typeof(a & b)' result *depends* on >> other columns in result set. But it should be integer always, shouldn't >> it? > > Yep, it should, I’ve fixed that: > > diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c > index b3b230ad0..f6eab8a0e 100644 > --- a/src/box/sql/vdbe.c > +++ b/src/box/sql/vdbe.c > @@ -1842,6 +1842,8 @@ case OP_ShiftRight: { /* same as TK_RSHIFT, in1, in2, out3 */ > pOut = &aMem[pOp->p3]; > if ((pIn1->flags | pIn2->flags) & MEM_Null) { > sqlVdbeMemSetNull(pOut); > + /* Force NULL be of type INTEGER. */ > + pOut->field_type = FIELD_TYPE_INTEGER; > break; > } > bool unused; > @@ -2477,6 +2479,8 @@ case OP_BitNot: { /* same as TK_BITNOT, in1, out2 */ > pIn1 = &aMem[pOp->p1]; > pOut = &aMem[pOp->p2]; > sqlVdbeMemSetNull(pOut); > + /* Force NULL be of type INTEGER. */ > + pOut->field_type = FIELD_TYPE_INTEGER; 2. tarantool> box.execute('SELECT typeof(~NULL)') --- - metadata: - name: typeof(~NULL) type: string rows: - ['integer'] ... But it is not integer, as I think. I didn't select NULL from an integer column. Here NULL was of type 'boolean'. And the result should be 'boolean' too, because it is just NULL, not a column NULL. The same about other bit operations. And the same about strings: tarantool> box.execute("SELECT typeof(NULL || NULL)") --- - metadata: - name: typeof(NULL || NULL) type: string rows: - ['string'] ... Here NULLs were of type 'boolean', but output shows 'string'. > if ((pIn1->flags & MEM_Null)==0) { > int64_t i; > bool is_neg; > diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua > index 489a5a91c..8a509da96 100644 > --- a/test/sql/types.test.lua > +++ b/test/sql/types.test.lua > @@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);") > box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);") > box.execute("SELECT typeof(a), typeof(s) FROM t;") > > +box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)') > +box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);') > +box.execute('SELECT typeof(a & b) FROM t1;') > +box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1') > + > box.execute("SELECT typeof(CAST(0 AS UNSIGNED));") > > box.space.T:drop() > +box.space.T1:drop() > >> The same about all other places, where you call mem_set_* and don't >> update field_type. > > It’s okay, since updating current value in memory cell doesn’t > imply that field type should be updated as well. In most scenarios > before setting new value to memory cell, we provide clean-up > via memAboutToChange(), which in turn invalidates field type. 3. memAboutToChange is no-op in Release build, so it does not invalidate anything in that mode. It means, that field_type now can be a garbage from a previous value, if the memory cell was reused to assign something new. >> I propose you to redesign how to update field_type, so as it would >> be hard to forget to update it. >> >> For instance, add a field_type argument to sqlVdbeMemSetNull() to never >> miss to set a real type for a NULL value. Make mem_set_int update field_type >> depending on the positivity flag. Make mem_set_u64 always set field_type >> UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type. > > I’ve tried to refactor patch this way, but in most cases sqlVdbeMemSetNull() > would set default (field_type_MAX). As for mem_set_* family functions - > I can move assignment of field_type_MAX to the body, but does it make > much sense? > 4. Currently, as I said above, I don't see where you invalidate type of a memory cell before it takes a new value. Yes, field_type is initialized with field_type_MAX when the register array is just created, but where the cell type is reset, when it takes a new value? And one another possible bug: box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER)') box.execute('INSERT INTO test VALUES (1, NULL)') tarantool> box.execute('SELECT typeof(NOT a) FROM test') --- - metadata: - name: typeof(NOT a) type: string rows: - ['boolean'] ... As you can see, 'NOT' lost the original type - it was integer NULL, not boolean NULL. Another place is vdbe.c:2627 - here you jump to op_column_out bypassing field_type assignment. Why? >>> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: { /* same as TK_RSHIFT, in1, in2, out3 */ >>> } >>> } >>> mem_set_i64(pOut, iA); >>> + pOut->field_type = field_type_MAX; >>> break; >>> } >>> >>> @@ -1988,6 +2004,13 @@ case OP_Cast: { /* in1 */ >>> if (ExpandBlob(pIn1) != 0) >>> goto abort_due_to_error; >>> rc = sqlVdbeMemCast(pIn1, pOp->p2); >>> + /* >>> + * SCALAR is not type itself, but rather an aggregation >>> + * of types. Hence, cast to this type shouldn't change >>> + * original type of argument. >>> + */ >>> + if (pOp->p2 != FIELD_TYPE_SCALAR) >>> + pIn1->field_type = pOp->p2; >> >> 8. Why sqlVdbeMemCast does not update the type inside? Now >> this function works on half - it updates flags with a new type, >> sets a new value, but doesn't touch field_type. > > Because it has several exit points and otherwise we have > to add to each branch: > > diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c > index e5941de8e..7892af116 100644 > --- a/src/box/sql/vdbemem.c > +++ b/src/box/sql/vdbemem.c > @@ -673,10 +673,12 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type) > case FIELD_TYPE_BOOLEAN: > if ((pMem->flags & MEM_Int) != 0) { > mem_set_bool(pMem, pMem->u.i); > + pMem->field_type = type; > return 0; > } > if ((pMem->flags & MEM_UInt) != 0) { > mem_set_bool(pMem, pMem->u.u); > + pMem->field_type = type; > return 0; > } > if ((pMem->flags & MEM_Real) != 0) { > > Etc. > > What is more, this is the only path (OP_Cast) which calls > sqlVdbeMemCast() function. So IMHO it is pretty OK. > > If you want, I can attempt at refactoring sqlVdbeMemCast() > so that make it feature only one exit point and move filed_type > assignment to it. > 5. Never mind. Too many changes.