[tarantool-patches] Re: [PATCH] sql: refactor SQL cursor to remove write ones

n.pettik korablev at tarantool.org
Thu Oct 4 05:21:43 MSK 2018


>> In general patch is OK, I have spotted only several nits.
>> Also, I support your intention (which we discussed outside mailing list) to make
>> system spaces be ‘pre-loaded’. It means that all required system spaces will be
>> stored in predefined (at the start of program) registers. Moreover, I suggest to
>> attempt at involving not only system spaces. Idea is following:
>> each OP_CursorOpen during compilation stage skips register where space
>> should be stored. At the end of compilation (sql_finish_coding) we know that
>> totally program uses n registers. Then, we can allocate n + m (where m is number
>> of distinct spaces which are used in out query) registers to hold pointers to required
>> spaces. Finally, we patch each OP_CursorOpen and substitute void arg with
>> number of register where required space is stored. This approach for instance allows
>> to update all space pointers after schema changes with ease.
> Great! I'll do that in follow up patches.

Ok, then register it as an (set of) issues(es)
(in case you are going to implement them separately).

>> Finally, now I think about renaming CursorOpen to IteratorOpen: in fact after your
>> patch cursor has turned almost in wrapper around index iterator. Such naming
>> would underline the fact that ‘cursor’ is used only for read iterations over the space.
> Yes, sure. Renamed.

You forgot to update name of opcode in commit message.
Also, you deleted not all mentions of OP_Cursor(Re)Open - just grep it

>>> diff --git a/src/box/sql/delete.c b/src/box/sql/delete.c
>>> index 9aa7058..ebde698 100644
>>> --- a/src/box/sql/delete.c
>>> +++ b/src/box/sql/delete.c
>>> @@ -69,7 +69,7 @@ sql_lookup_table(struct Parse *parse, struct SrcList_item *tbl_name)
>>> 
>>> void
>>> sql_materialize_view(struct Parse *parse, const char *name, struct Expr *where,
>>> -		     int cursor)
>>> +		     int cursor, int reg_eph)
>> 
>> This function is used twice: to materialize view during firing INSTEAD OF DELETE
>> and INSTEAD OF UPDATE triggers. In both cases, you pass reg_eph which is
>> not used after this function invocation. So I guess you can remove reg_eph arg.
> IMHO, allocating a register w/o assigning it to local variable is less
> straightforward, but OK, done.

It still can be simplified - see comments below.

>>> @@ -5261,7 +5303,7 @@ resetAccumulator(Parse * pParse, AggInfo * pAggInfo)
>>> 	/* Verify that all AggInfo registers are within the range specified by
>>> 	 * AggInfo.mnReg..AggInfo.mxReg
>>> 	 */
>>> -	assert(nReg == pAggInfo->mxReg - pAggInfo->mnReg + 1);
>>> +	assert(nReg <= pAggInfo->mxReg - pAggInfo->mnReg + 1);
>> 
>> Hm? Explain pls this change.
> I'd avoid explaining it in the code. So, will do it here.
> Issue is that aggregate function may or may not contain distinct clause.
> If so - one extra reg per such func will be used. Right hand size of the
> assert doesn't take that into account and I see no reason to do that.
> That's why, new assertion is that number of allocated registers for
> aggregate is greater or even than sum of number of aggregate columns and
> functions.
> 
>>> +	cur = allocateCursor(p, pOp->p1,
>>> +			     space->def->exact_field_count == 0 ?
>>> +			     space->def->field_count :
>>> +			     space->def->exact_field_count,
>>> +			     CURTYPE_TARANTOOL);
>>> +	if (cur == NULL)
>>> +		goto no_mem;
>>> +	struct BtCursor *bt_cur = cur->uc.pCursor;
>>> +	bt_cur->curFlags |= space->def->id == 0 ? BTCF_TEphemCursor :
>>> +				BTCF_TaCursor;
>>> +	bt_cur->space = space;
>>> +	bt_cur->index = index;
>>> +	bt_cur->eState = CURSOR_INVALID;
>> 
>> Grep CURSOR_INVALID - it is not used anywhere else.
>> Mb it is worth to remove it?
> eState flag is heavily used by cursors machinery. I don't
> think we can evict it right now.

I meant CURSOR_INVALID is not used (you can grep its usages).
So basically we can make eState to be boolean value:
cursor is whether broken (invalid) or not. I don’t insist on this change tho.

> --
> Regards, Kirill Yukhin
> 
> commit 5bd0cf10ac94cfbaf5d076a3f5a347811fcf4c11
> Author: Kirill Yukhin <kyukhin at tarantool.org>
> Date:   Tue Sep 25 19:01:10 2018 +0300
> 
>    sql: refactor SQL cursor to remove write ones
> 
>    In Tarantool opening and positioning cursor for writing
>    have no sense. So refactor SQL code, to perform:
>      - Creation of ephemeral table w/o any cursors machinery.
>        No op-code returns register which contains plain pointer
>        to the ephemeral space.
>      - OpenRead/OpenWrite opcodes were replaced with single
>        CursorOpen op-code, which establishes new cursor w/

IteratorOpen

>        intention to subsequent read from the space. This opcode
>        accepts both plain pointer (in P4 operand) and register
>        which contains pointer (inn P3) to the ephemeral space.

Typo: in P3

> diff --git a/src/box/sql.c b/src/box/sql.c
> index ab4a587..f3836f2 100644
> --- a/src/box/sql.c
> +++ b/src/box/sql.c
> @@ -355,31 +355,15 @@ int tarantoolSqlite3Count(BtCursor *pCur, i64 *pnEntry)
> 	return SQLITE_OK;
> }
> 
> -/**
> - * Create ephemeral space and set cursor to the first entry. Features of
> - * ephemeral spaces: id == 0, name == "ephemeral", memtx engine (in future it
> - * can be changed, but now only memtx engine is supported), primary index
> - * which covers all fields and no secondary indexes, given field number and
> - * collation sequence. All fields are scalar and nullable.
> - *
> - * @param pCur Cursor which will point to the new ephemeral space.
> - * @param field_count Number of fields in ephemeral space.
> - * @param key_info Keys description for new ephemeral space.
> - *
> - * @retval SQLITE_OK on success, SQLITE_TARANTOOL_ERROR otherwise.
> - */
> -int
> -tarantoolSqlite3EphemeralCreate(BtCursor *pCur, uint32_t field_count,
> -				struct sql_key_info *key_info)
> +struct space *
> +sql_ephemeral_space_create(uint32_t field_count,
> +			   struct sql_key_info *key_info)

Nit: you can fit now args in one line.

> diff --git a/src/box/sql/build.c b/src/box/sql/build.c
> index 43be777..7670d53 100644
> --- a/src/box/sql/build.c
> +++ b/src/box/sql/build.c
> @@ -883,7 +883,7 @@ vdbe_emit_open_cursor(struct Parse *parse_context, int cursor, int index_id,
> 		      struct space *space)
> {
> 	assert(space != NULL);
> -	return sqlite3VdbeAddOp4(parse_context->pVdbe, OP_OpenWrite, cursor,
> +	return sqlite3VdbeAddOp4(parse_context->pVdbe, OP_IteratorOpen, cursor,
> 				 index_id, 0, (void *) space, P4_SPACEPTR);
> }
> 
> @@ -2693,7 +2693,7 @@ sql_create_index(struct Parse *parse, struct Token *token,
> 			goto exit_create_index;
> 
> 		sql_set_multi_write(parse, true);
> -		sqlite3VdbeAddOp4(vdbe, OP_OpenWrite, cursor, 0, 0,
> +		sqlite3VdbeAddOp4(vdbe, OP_IteratorOpen, cursor, 0, 0,
> 				  (void *)space_by_id(BOX_INDEX_ID),
> 				  P4_SPACEPTR);
> 		sqlite3VdbeChangeP5(vdbe, OPFLAG_SEEKEQ);
> diff --git a/src/box/sql/delete.c b/src/box/sql/delete.c
> index 9aa7058..17e1c32 100644
> --- a/src/box/sql/delete.c
> +++ b/src/box/sql/delete.c
> @@ -69,7 +69,7 @@ sql_lookup_table(struct Parse *parse, struct SrcList_item *tbl_name)
> 
> void
> sql_materialize_view(struct Parse *parse, const char *name, struct Expr *where,
> -		     int cursor)
> +		     int cursor, int reg_eph)

You slightly misunderstood me: you don’t need to pass arg reg_eph:
you already have struct Parse *parse, so can fetch value for register
inside function.

> @@ -4061,7 +4064,8 @@ sqlite3ExprCodeTarget(Parse * pParse, Expr * pExpr, int target)
> 			int n;
> 			if (pExpr->pLeft->iTable == 0) {
> 				pExpr->pLeft->iTable =
> -				    sqlite3CodeSubselect(pParse, pExpr->pLeft, 0);
> +				    sqlite3CodeSubselect(pParse, pExpr->pLeft,
> +							 0);

Noise diff.

> diff --git a/src/box/sql/select.c b/src/box/sql/select.c
> index 02c0a5d..2776044 100644
> --- a/src/box/sql/select.c
> +++ b/src/box/sql/select.c
> @@ -64,8 +64,15 @@ typedef struct DistinctCtx DistinctCtx;
> struct DistinctCtx {
> 	u8 isTnct;		/* True if the DISTINCT keyword is present */
> 	u8 eTnctType;		/* One of the WHERE_DISTINCT_* operators */
> -	int tabTnct;		/* Ephemeral table used for DISTINCT processing */
> -	int addrTnct;		/* Address of OP_OpenEphemeral opcode for tabTnct */
> +	/* Ephemeral table's cursor used for DISTINCT processing. It is
> +	 * used for readin from ephemeral space.
> +	 */
> +	int cur_eph;
> +	/* Register, containing a pointer to ephemeral space. It
> +	 * is used for insertions while procesing DISTINCT.
> +	 */
> +	int reg_eph;
> +	int addrTnct;		/* Address of OP_OpenEphemeral opcode for cur_eph */
> };

I fixed comment a little bit:


@@ -64,12 +64,14 @@ typedef struct DistinctCtx DistinctCtx;
 struct DistinctCtx {
        u8 isTnct;              /* True if the DISTINCT keyword is present */
        u8 eTnctType;           /* One of the WHERE_DISTINCT_* operators */
-       /* Ephemeral table's cursor used for DISTINCT processing. It is
-        * used for readin from ephemeral space.
+       /**
+        * Ephemeral table's cursor used for DISTINCT processing.
+        * It is used for reading from ephemeral space.
         */
        int cur_eph;
-       /* Register, containing a pointer to ephemeral space. It
-        * is used for insertions while procesing DISTINCT.
+       /**
+        * Register, containing a pointer to ephemeral space.
+        * It is used for insertions while processing DISTINCT.
         */

> @@ -1039,17 +1059,25 @@ selectInnerLoop(Parse * pParse,		/* The parser context */
> 			}
> 
> 		case WHERE_DISTINCT_UNIQUE:{
> -				sqlite3VdbeChangeToNoop(v, pDistinct->addrTnct);
> -				break;
> +			/* To handle DISTINCT two op-codes are
> +			 * emitted: OpenTEphemeral & IteratorOpen.
> +			 * addrTnct is address off first insn in
> +			 * a couple. Two evict ephemral space,
> +			 * need to noop both op-codes.
> +			 */

I’ve fixed comment:

-                       /* To handle DISTINCT two op-codes are
+                       /*
+                        * To handle DISTINCT two op-codes are
                         * emitted: OpenTEphemeral & IteratorOpen.
-                        * addrTnct is address off first insn in
-                        * a couple. Two evict ephemral space,
-                        * need to noop both op-codes.
+                        * addrTnct is address of first opcode in
+                        * a couple. To evict ephemeral space,
+                        * we need to noop both op-codes.
                         */

> +			sqlite3VdbeChangeToNoop(v, pDistinct->addrTnct);
> +			sqlite3VdbeChangeToNoop(v, pDistinct->addrTnct + 1);
> +			break;
> 			}
> 
> 
> @@ -5679,7 +5725,7 @@ sqlite3Select(Parse * pParse,		/* The parser context */
> 			VdbeComment((v, "%s", pItem->pTab->def->name));
> 			pItem->addrFillSub = addrTop;
> 			sqlite3SelectDestInit(&dest, SRT_Coroutine,
> -					      pItem->regReturn);
> +					      pItem->regReturn, -1);
> 			pItem->iSelectId = pParse->iNextSelectId;
> 			sqlite3Select(pParse, pSub, &dest);
> 			pItem->pTab->tuple_log_count = pSub->nSelectRow;
> @@ -5699,6 +5745,7 @@ sqlite3Select(Parse * pParse,		/* The parser context */
> 			int retAddr;
> 			assert(pItem->addrFillSub == 0);
> 			pItem->regReturn = ++pParse->nMem;
> +

Noise diff.

> @@ -5885,7 +5941,15 @@ sqlite3Select(Parse * pParse,		/* The parser context */
> 		 * into an OP_Noop.
> 		 */
> 		if (sSort.addrSortIndex >= 0 && sSort.pOrderBy == 0) {
> +			/* To handle ordering two op-codes are
> +			 * emitted: OpenTEphemeral & IteratorOpen.
> +			 * sSort.addrSortIndex is address off
> +			 * first insn in a couple. Two evict
> +			 * ephemral space, need to noop both
> +			 * op-codes.
> +			 */

I’ve fixed comment:

-                       /* To handle ordering two op-codes are
+                       /*
+                        * To handle ordering two op-codes are
                         * emitted: OpenTEphemeral & IteratorOpen.
-                        * sSort.addrSortIndex is address off
-                        * first insn in a couple. Two evict
-                        * ephemral space, need to noop both
+                        * sSort.addrSortIndex is address of
+                        * first opcode in a couple. To evict
+                        * ephemeral space, we need to noop both
                         * op-codes.
                         */

> diff --git a/src/box/sql/sqliteInt.h b/src/box/sql/sqliteInt.h
> index 53188e7..29636a0 100644
> --- a/src/box/sql/sqliteInt.h
> +++ b/src/box/sql/sqliteInt.h
> @@ -2000,6 +2000,8 @@ struct AggInfo {
> 		FuncDef *pFunc;	/* The aggregate function implementation */
> 		int iMem;	/* Memory location that acts as accumulator */
> 		int iDistinct;	/* Ephemeral table used to enforce DISTINCT */
> +		/* Register, holding ephemeral's spacepointer. */

Nit: missed space between ’space’ and ‘pointer’. And start comment from /**.

> +		int reg_eph;
> 	} *aFunc;
> 	int nFunc;		/* Number of entries in aFunc[] */
> };
> @@ -2591,6 +2593,8 @@ struct SelectDest {
> 	u8 eDest;		/* How to dispose of the results.  On of SRT_* above. */
> 	char *zAffSdst;		/* Affinity used when eDest==SRT_Set */
> 	int iSDParm;		/* A parameter used by the eDest disposal method */
> +	/* Register containing ephemeral's space pointer. */

Nit: start comment from /**.

> diff --git a/src/box/sql/tarantoolInt.h b/src/box/sql/tarantoolInt.h
> index 8017742..497d989 100644
> --- a/src/box/sql/tarantoolInt.h
> +++ b/src/box/sql/tarantoolInt.h
> @@ -97,9 +97,22 @@ sql_index_update_table_name(struct index_def *idef, const char *new_tbl_name,
> int tarantoolSqlite3RenameTrigger(const char *zTriggerName,
> 				  const char *zOldName, const char *zNewName);
> 
> -/* Interface for ephemeral tables. */
> -int tarantoolSqlite3EphemeralCreate(BtCursor * pCur, uint32_t filed_count,
> -				    struct sql_key_info *key_info);
> +/**
> + * Create ephemeral space. Features of ephemeral spaces: id == 0,
> + * name == "ephemeral", memtx engine (in future it can be changed,
> + * but now only memtx engine is supported), primary index which
> + * covers all fields and no secondary indexes, given field number
> + * and collation sequence. All fields are scalar and nullable.
> + *
> + * @param field_count Number of fields in ephemeral space.
> + * @param key_info Keys description for new ephemeral space.
> + *
> + * @retval Pointer to created space, NULL if error.
> + */
> +struct space *
> +sql_ephemeral_space_create(uint32_t filed_count,
> +			   struct sql_key_info *key_info);

Nit: you can fit it in one line.

> @@ -276,9 +280,7 @@ sqlite3Update(Parse * pParse,		/* The parser context */
> 								pPk->def);
> 		sqlite3VdbeAddOp4(v, OP_MakeRecord, iPk, pk_part_count,
> 				  regKey, zAff, pk_part_count);
> -		sqlite3VdbeAddOp2(v, OP_IdxInsert, iEph, regKey);
> -		/* Set flag to save memory allocating one by malloc. */
> -		sqlite3VdbeChangeP5(v, 1);

You misunderstood me: you should change P5 to 1 for OP_MakeRecord opcode.
To be more precise see patch:
https://github.com/tarantool/tarantool/commit/a5576aa77bf900d0ceae126ba46a80c56078df6c

> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 00ffb0c..aa602df 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -3054,51 +3054,37 @@ case OP_TTransaction: {
> 	break;
> }
> 
> -/* Opcode: OpenRead P1 P2 P3 P4 P5
> +/* Opcode: CursorReopen P1 P2 P3 P4 P5

IteratorReopen?

>  * Synopsis: index id = P2, space ptr = P4
>  *
> - * Open a cursor for a space specified by pointer in P4 and index
> - * id in P2. Give the new cursor an identifier of P1. The P1
> - * values need not be contiguous but all P1 values should be
> - * small integers. It is an error for P1 to be negative.
> - */
> -/* Opcode: ReopenIdx P1 P2 P3 P4 P5
> - * Synopsis: index id = P2, space ptr = P4
> - *
> - * The ReopenIdx opcode works exactly like OpenRead except that
> - * it first checks to see if the cursor on P1 is already open
> + * The CursorReopen opcode works exactly like CursorOpen except
> + * that it first checks to see if the cursor on P1 is already open
>  * with the same index and if it is this opcode becomes a no-op.
> - * In other words, if the cursor is already open, do not reopen it.
> + * In other words, if the cursor is already open, do not reopen
> + * it.
>  *
> - * The ReopenIdx opcode may only be used with P5 == 0.
> + * The CursorReopen opcode may only be used with P5 == 0.
>  */
> -/* Opcode: OpenWrite P1 P2 P3 P4 P5
> - * Synopsis: index id = P2, space ptr = P4
> +/* Opcode: CursorOpen P1 P2 P3 P4 P5

Don’t forget to rename opcode even in comments.

> diff --git a/src/box/sql/whereInt.h b/src/box/sql/whereInt.h
> index 8a3f2ac..1aeb360 100644
> --- a/src/box/sql/whereInt.h
> +++ b/src/box/sql/whereInt.h
> @@ -416,7 +416,7 @@ struct WhereInfo {
> 	ExprList *pOrderBy;	/* The ORDER BY clause or NULL */
> 	ExprList *pDistinctSet;	/* DISTINCT over all these values */
> 	LogEst iLimit;		/* LIMIT if wctrlFlags has WHERE_USE_LIMIT */
> -	int aiCurOnePass[2];	/* OP_OpenWrite cursors for the ONEPASS opt */
> +	int aiCurOnePass[2];	/* OP_CursorOpen cursors for the ONEPASS opt */

IteratorOpen?






More information about the Tarantool-patches mailing list