[Tarantool-patches] [PATCH v1 03/13] sql: move collation to struct sql_context

Mergen Imeev imeevma at tarantool.org
Tue Sep 21 13:40:54 MSK 2021


Hi! Thank you for the review! My answers below.

On Tue, Sep 14, 2021 at 11:22:39PM +0200, Vladislav Shpilevoy wrote:
> Thanks for working on this!
> 
> On 10.09.2021 18:01, imeevma at tarantool.org wrote:
> > This patch makes it easier to get a collation by a function.
> 
> You could also store the opcode pointer instead of iOp. But I
> suspect your main reason is to get rid of the vdbe pointer? If
> yes, then why?
> 
I wanted to remove the unnecessary connection between VDBE and functions. Also,
this way it will be easier for me to remove OP_CollSeq.

> > diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> > index 2ff7ce8f4..12dc9126b 100644
> > --- a/src/box/sql/vdbe.c
> > +++ b/src/box/sql/vdbe.c
> > @@ -1159,23 +1159,13 @@ case OP_Remainder: {           /* same as TK_REM, in1, in2, out3 */
> >  	break;
> >  }
> >  
> > -/* Opcode: CollSeq P1 * * P4
> > - *
> > - * P4 is a pointer to a CollSeq struct. If the next call to a user function
> > - * or aggregate calls sqlGetFuncCollSeq(), this collation sequence will
> > - * be returned. This is used by the built-in min(), max() and nullif()
> > - * functions.
> > +/* Opcode: SkipLoad P1 * * * *
> >   *
> >   * If P1 is not zero, then it is a register that a subsequent min() or
> >   * max() aggregate will set to true if the current row is not the minimum or
> >   * maximum.  The P1 register is initialized to false by this instruction.
> > - *
> > - * The interface used by the implementation of the aforementioned functions
> > - * to retrieve the collation sequence set by this opcode is not available
> > - * publicly.  Only built-in functions have access to this feature.
> >   */
> > -case OP_CollSeq: {
> > -	assert(pOp->p4type==P4_COLLSEQ || pOp->p4.pColl == NULL);
> > +case OP_SkipLoad: {
> 
> That is a very strange name. Couldn't OP_Bool somehow be reused?
> Why is this R[p1] = false even needed?
>
It is possible, for example like this:

diff --git a/src/box/sql/select.c b/src/box/sql/select.c
index 2880f8ea0..c3f36d74f 100644
--- a/src/box/sql/select.c
+++ b/src/box/sql/select.c
@@ -5636,7 +5636,8 @@ updateAccumulator(Parse * pParse, AggInfo * pAggInfo)
 			}
 			if (regHit == 0 && pAggInfo->nAccumulator)
 				regHit = ++pParse->nMem;
-			sqlVdbeAddOp1(v, OP_SkipLoad, regHit);
+			if (regHit != 0)
+				sqlVdbeAddOp2(v, OP_Bool, false, regHit);
 		}
 		struct sql_context *ctx = sql_context_new(pF->func, nArg, coll);
 		if (ctx == NULL) {
diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
index 12dc9126b..b17179a57 100644
--- a/src/box/sql/vdbe.c
+++ b/src/box/sql/vdbe.c
@@ -4182,9 +4182,8 @@ case OP_AggStep: {
 		goto abort_due_to_error;
 	}
 	assert(mem_is_null(&t));
-	if (pCtx->skipFlag) {
-		assert(pOp[-1].opcode == OP_SkipLoad);
-		i = pOp[-1].p1;
+	if (pCtx->skipFlag && pOp[-1].opcode == OP_Bool) {
+		i = pOp[-1].p2;
 		if (i) mem_set_bool(&aMem[i], true);
 	}
 	break;

However, I'm not sure if this is the correct way to use OP_Bool.

This opcode is used in rather strange queries, such as the following:
SELECT a, min(b) FROM t;

As you can see, several values of "a" may correspond to min(b). This behavior is
discussed here:
https://github.com/tarantool/tarantool/discussions/6416
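
To make the role of the register concrete, here is a rough standalone model of
the per-row logic. It is only a sketch, not the VDBE code: struct row,
struct min_result and select_a_min_b() are hypothetical names, and keeping the
first row on ties is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical row of table t(a, b); not a Tarantool type. */
struct row {
	int a;
	int b;
};

/* Hypothetical result of SELECT a, min(b) FROM t. */
struct min_result {
	bool is_set;
	int a;
	int min_b;
};

/*
 * Toy model of the aggregate loop. For every row the "hit" register is
 * reset to false (what OP_SkipLoad or OP_Bool does), the min() step sets
 * it to true when the row is not a new minimum (the skipFlag case), and
 * the bare column "a" is loaded only if the register is still false.
 */
static struct min_result
select_a_min_b(const struct row *rows, size_t count)
{
	assert(rows != NULL || count == 0);
	struct min_result res = {false, 0, 0};
	for (size_t i = 0; i < count; i++) {
		bool reg_hit = false;
		if (res.is_set && rows[i].b >= res.min_b) {
			/* Not a new minimum: ask to skip the load of "a". */
			reg_hit = true;
		} else {
			res.min_b = rows[i].b;
			res.is_set = true;
		}
		if (!reg_hit)
			res.a = rows[i].a;
	}
	return res;
}
```

For rows (1, 5), (2, 3), (3, 3), (4, 7) this model returns a = 2 and
min(b) = 3, although a = 3 has the same minimal "b". That is the ambiguity
mentioned above.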

Most likely, we will follow the same path as PostgreSQL and MS SQL Server,
namely to prohibit such expressions. This will naturally lead to removal of
OP_SkipLoad/OP_CollSeq.

As for the name: I thought of it when I read the description of the skipFlag
field in struct sql_context.

> >  	if (pOp->p1) {
> >  		mem_set_bool(&aMem[pOp->p1], false);
> >  	}


