Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: "n.pettik" <korablev@tarantool.org>, tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member
Date: Wed, 31 Jul 2019 18:59:52 +0200	[thread overview]
Message-ID: <351df50f-d096-45de-0d04-c9cee83d1289@tarantool.org> (raw)
In-Reply-To: <C1A27F18-8395-4C22-AE07-C30C009C4CA6@tarantool.org>

Hi! Thanks for the fixes! See 5 comments below.

On 31/07/2019 11:14, n.pettik wrote:
> 
> 
>>> diff --git a/src/box/sql/func.c b/src/box/sql/func.c
>>> index fcf147c96..f50df105d 100644
>>> --- a/src/box/sql/func.c
>>> +++ b/src/box/sql/func.c
>>> @@ -107,6 +107,17 @@ typeofFunc(sql_context * context, int NotUsed, sql_value ** argv)
>>> {
>>> 	const char *z = 0;
>>> 	UNUSED_PARAMETER(NotUsed);
>>> +	enum field_type f_t = argv[0]->field_type;
>>
>> 1. Perhaps, field_type is not the best name, because Mem is not always
>> is a field. It can be just a value, not from a space. It is rather
>> just 'type’.
> 
> When memory represents smth different from value fetched from
> tuple, it is set to field_type_MAX. So I’d rather not rename this member.

1. Ok, now I got it. It works only for tuple fields, and is totally disabled
for other values. Or not? See other comments.

>> 7. All these manual assignments of field_type are broken by design.
>> When you have to update field_type after each operation manually,
>> you will miss some places for sure. For example:
>>
>> tarantool> box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
>> ---
>> - row_count: 1
>> ...
>>
>> tarantool> box.execute('INSERT INTO test VALUES (1, NULL, NULL)')
>> ---
>> - row_count: 1
>> ...
>>
>> tarantool> box.execute('SELECT typeof(a & b) FROM test')
>> ---
>> - metadata:
>>  - name: typeof(a & b)
>>    type: string
>>  rows:
>>  - ['null']
>> ...
>>
>> tarantool> box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM test')
>> ---
>> - metadata:
>>  - name: typeof(a)
>>    type: string
>>  - name: typeof(b)
>>    type: string
>>  - name: typeof(a & b)
>>    type: string
>>  rows:
>>  - ['integer', 'integer', 'integer']
>> ...
>>
>>
>> Here in the last two queries 'typeof(a & b)' result *depends* on
>> other columns in result set. But it should be integer always, shouldn't
>> it?
> 
> Yep, it should, I’ve fixed that:
> 
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index b3b230ad0..f6eab8a0e 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -1842,6 +1842,8 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>         pOut = &aMem[pOp->p3];
>         if ((pIn1->flags | pIn2->flags) & MEM_Null) {
>                 sqlVdbeMemSetNull(pOut);
> +               /* Force NULL be of type INTEGER. */
> +               pOut->field_type = FIELD_TYPE_INTEGER;
>                 break;
>         }
>         bool unused;
> @@ -2477,6 +2479,8 @@ case OP_BitNot: {             /* same as TK_BITNOT, in1, out2 */
>         pIn1 = &aMem[pOp->p1];
>         pOut = &aMem[pOp->p2];
>         sqlVdbeMemSetNull(pOut);
> +       /* Force NULL be of type INTEGER. */
> +       pOut->field_type = FIELD_TYPE_INTEGER;
2.

tarantool> box.execute('SELECT typeof(~NULL)')
---
- metadata:
  - name: typeof(~NULL)
    type: string
  rows:
  - ['integer']
...

But it is not integer, as I think. I didn't select NULL from
an integer column. Here NULL was of type 'boolean'. And the
result should be 'boolean' too, because it is just NULL, not
a column NULL. The same about other bit operations.

And the same about strings:

tarantool> box.execute("SELECT typeof(NULL || NULL)")
---
- metadata:
  - name: typeof(NULL || NULL)
    type: string
  rows:
  - ['string']
...

Here NULLs were of type 'boolean', but output shows 'string'.

>         if ((pIn1->flags & MEM_Null)==0) {
>                 int64_t i;
>                 bool is_neg;
> diff --git a/test/sql/types.test.lua b/test/sql/types.test.lua
> index 489a5a91c..8a509da96 100644
> --- a/test/sql/types.test.lua
> +++ b/test/sql/types.test.lua
> @@ -407,6 +407,12 @@ box.execute("CREATE TABLE t (id INT PRIMARY KEY, a INT, s SCALAR);")
>  box.execute("INSERT INTO t VALUES (1, 1, 1), (2, NULL, NULL);")
>  box.execute("SELECT typeof(a), typeof(s) FROM t;")
>  
> +box.execute('CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)')
> +box.execute('INSERT INTO t1 VALUES (1, NULL, NULL);')
> +box.execute('SELECT typeof(a & b) FROM t1;')
> +box.execute('SELECT typeof(a), typeof(b), typeof(a & b) FROM t1')
> +
>  box.execute("SELECT typeof(CAST(0 AS UNSIGNED));")
>  
>  box.space.T:drop()
> +box.space.T1:drop()
> 
>> The same about all other places, where you call mem_set_* and don't
>> update field_type.
> 
> It’s okay, since updating current value in memory cell doesn’t
> imply that field type should be updated as well. In most scenarios
> before setting new value to memory cell, we provide clean-up
> via memAboutToChange(), which in turn invalidates field type.

3. memAboutToChange is no-op in Release build, so it does not invalidate
anything in that mode. It means, that field_type now can be a garbage
from a previous value, if the memory cell was reused to assign something
new.

>> I propose you to redesign how to update field_type, so as it would
>> be hard to forget to update it.
>>
>> For instance, add a field_type argument to sqlVdbeMemSetNull() to never
>> miss to set a real type for a NULL value. Make mem_set_int update field_type
>> depending on the positivity flag. Make mem_set_u64 always set field_type
>> UNSIGNED. Make sqlVdbeMemCast always set field_type to the target type.
> 
> I’ve tried to refactor patch this way, but in most cases sqlVdbeMemSetNull()
> would set default (field_type_MAX). As for mem_set_* family functions -
> I can move assignment of field_type_MAX to the body, but does it make
> much sense?
> 

4. Currently, as I said above, I don't see where you invalidate type of a
memory cell before it takes a new value. Yes, field_type is initialized
with field_type_MAX when the register array is just created, but where
the cell type is reset, when it takes a new value?

And one another possible bug:

box.execute('CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER)')
box.execute('INSERT INTO test VALUES (1, NULL)')
tarantool> box.execute('SELECT typeof(NOT a) FROM test')
---
- metadata:
  - name: typeof(NOT a)
    type: string
  rows:
  - ['boolean']
...

As you can see, 'NOT' lost the original type - it was integer NULL,
not boolean NULL.

Another place is vdbe.c:2627 - here you jump to op_column_out
bypassing field_type assignment. Why?

>>> @@ -1909,6 +1924,7 @@ case OP_ShiftRight: {           /* same as TK_RSHIFT, in1, in2, out3 */
>>> 		}
>>> 	}
>>> 	mem_set_i64(pOut, iA);
>>> +	pOut->field_type = field_type_MAX;
>>> 	break;
>>> }
>>>
>>> @@ -1988,6 +2004,13 @@ case OP_Cast: {                  /* in1 */
>>> 	if (ExpandBlob(pIn1) != 0)
>>> 		goto abort_due_to_error;
>>> 	rc = sqlVdbeMemCast(pIn1, pOp->p2);
>>> +	/*
>>> +	 * SCALAR is not type itself, but rather an aggregation
>>> +	 * of types. Hence, cast to this type shouldn't change
>>> +	 * original type of argument.
>>> +	 */
>>> +	if (pOp->p2 != FIELD_TYPE_SCALAR)
>>> +		pIn1->field_type = pOp->p2;
>>
>> 8. Why sqlVdbeMemCast does not update the type inside? Now
>> this function works on half - it updates flags with a new type,
>> sets a new value, but doesn't touch field_type.
> 
> Because it has several exit points and otherwise we have
> to add to each branch:
> 
> diff --git a/src/box/sql/vdbemem.c b/src/box/sql/vdbemem.c
> index e5941de8e..7892af116 100644
> --- a/src/box/sql/vdbemem.c
> +++ b/src/box/sql/vdbemem.c
> @@ -673,10 +673,12 @@ sqlVdbeMemCast(Mem * pMem, enum field_type type)
>         case FIELD_TYPE_BOOLEAN:
>                 if ((pMem->flags & MEM_Int) != 0) {
>                         mem_set_bool(pMem, pMem->u.i);
> +                       pMem->field_type = type;
>                         return 0;
>                 }
>                 if ((pMem->flags & MEM_UInt) != 0) {
>                         mem_set_bool(pMem, pMem->u.u);
> +                       pMem->field_type = type;
>                         return 0;
>                 }
>                 if ((pMem->flags & MEM_Real) != 0) {
> 
> Etc.
> 
> What is more, this is the only path (OP_Cast) which calls
> sqlVdbeMemCast() function. So IMHO it is pretty OK.
> 
> If you want, I can attempt at refactoring sqlVdbeMemCast()
> so that make it feature only one exit point and move filed_type
> assignment to it.
> 

5. Never mind. Too many changes.

  reply	other threads:[~2019-07-31 16:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-27 18:45 [tarantool-patches] [PATCH 0/2] Improve operability of typeof() function Nikita Pettik
2019-07-27 18:45 ` [tarantool-patches] [PATCH 1/2] sql: extend struct Mem with field_type member Nikita Pettik
2019-07-28 15:22   ` [tarantool-patches] " Vladislav Shpilevoy
2019-07-28 21:26     ` Konstantin Osipov
2019-07-31  9:14     ` n.pettik
2019-07-31 16:59       ` Vladislav Shpilevoy [this message]
2019-08-01 15:56         ` n.pettik
2019-08-01 23:11           ` Vladislav Shpilevoy
2019-08-02 10:02             ` n.pettik
2019-07-27 18:45 ` [tarantool-patches] [PATCH 2/2] sql: make default type of NULL be boolean Nikita Pettik
2019-08-02 10:52 ` [tarantool-patches] Re: [PATCH 0/2] Improve operability of typeof() function Kirill Yukhin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=351df50f-d096-45de-0d04-c9cee83d1289@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=korablev@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH 1/2] sql: extend struct Mem with field_type member' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox