Tarantool development patches archive
 help / color / mirror / Atom feed
From: "v.shpilevoy@tarantool.org" <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 2/4] sql: introduce opcodes to operate on system spaces
Date: Thu, 22 Mar 2018 16:09:27 +0300	[thread overview]
Message-ID: <B09FDAB3-F048-4706-B29C-7EE796D10AB1@tarantool.org> (raw)
In-Reply-To: <430BF95D-84E2-4A63-915D-A079C5E97285@tarantool.org>

[-- Attachment #1: Type: text/plain, Size: 8443 bytes --]



> 22 марта 2018 г., в 15:23, n.pettik <korablev@tarantool.org> написал(а):
> 
> 
>> On 22 Mar 2018, at 14:42, v.shpilevoy@tarantool.org <mailto:v.shpilevoy@tarantool.org> wrote:
>> 
>> See below 6 comments.
>> 
>>> 21 марта 2018 г., в 2:48, Nikita Pettik <korablev@tarantool.org <mailto:korablev@tarantool.org>> написал(а):
>>> 
>>> Since it is impossible to use space pointers during execution of DDL
>>> (after any DDL schema may change and pointers fetched at compile time
>>> can become expired), special opcodes to operate on system spaces have
>>> been introduced: OP_SInsert and OP_SDelete. They take space id as
>>> an argument and make space lookup each time before insertion or deletion.
>>> However, sometimes it is still required to iterate through space (e.g.
>>> to satisfy WHERE clause) during DDL. As far as now cursors rely on
>>> pointers to space, it is required to convert space id to space pointer
>>> during VDBE execution. Hence, another opcode is added: OP_SIDtoPtr.
>>> Finally, existing opcodes, which are invoked only during DDL, have been
>>> also refactored.
>>> 
>>> Part of #3252
>>> ---
>>> src/box/sql.c              |  51 ++++++++++++-------
>>> src/box/sql/opcodes.c      |  45 +++++++++--------
>>> src/box/sql/opcodes.h      |  53 ++++++++++----------
>>> src/box/sql/tarantoolInt.h |  11 ++---
>>> src/box/sql/vdbe.c         | 121 ++++++++++++++++++++++++++++++++++-----------
>>> 5 files changed, 182 insertions(+), 99 deletions(-)
>>> 
>>> diff --git a/src/box/sql.c b/src/box/sql.c
>>> index e943131ae..db4c7e7ea 100644
>>> --- a/src/box/sql.c
>>> +++ b/src/box/sql.c
>>> +int
>>> +sql_delete_by_key(struct space *space, char *key, uint32_t key_size)
>>> +{
>>> 	struct request request;
>>> 	memset(&request, 0, sizeof(request));
>>> 	request.type = IPROTO_DELETE;
>>> 	request.key = key;
>>> 	request.key_end = key + key_size;
>>> -	request.space_id = pCur->space->def->id;
>>> -	rc = sql_execute_dml(&request, pCur->space);
>>> +	request.space_id = space->def->id;
>>> +	int rc = sql_execute_dml(&request, space);
>> 
>> 1. Why do you need sql_execute_dml? It does exactly the same as process_rw, but process_rw can also return a tuple. And this process_rw feature is needed for the patch on last inserted tuple returning.
> 
> I need it as an analogue of process_rw(), since process_rw() is unavailable 
> via public BOX interface (declared as static). If you approve to make it public,
> where should I place it and should I rename it?
> Or, for instance, make sql_execute_dml be complete copy of process_rw()?
> It is unlikely to be useful somewhere except for SQL (and box internals surely).
> Initial indent was to reduce overhead while calling box functions from SQL bindings (sql.c):
> ordinary call stack looks like: box_insert() -> box_process1() -> space lookup + process_rw().
> In such scheme we also process redundant space lookup since we already have struct space * in cursor.
> (Patch where I introduced this function isn’t in trunk, so I can easily fix it).

Yes, I very like the removal of box_process1. I just do not like code duplicate. But also I can not see a place for declaration of process_rw in any header. So lets make it extern "C" non-static in box.cc, and declare it as extern in sql.c.

> 
>>> 
>>> @@ -1088,20 +1107,16 @@ int tarantoolSqlite3IncrementMaxid(BtCursor *pCur)
>>> 	request.ops = ops;
>>> 	request.ops_end = ops + sizeof(ops);
>>> 	request.type = IPROTO_UPSERT;
>>> -	request.space_id = pCur->space->def->id;
>>> -	rc = sql_execute_dml(&request, pCur->space);
>>> +	request.space_id = BOX_SCHEMA_ID;
>>> +	rc = sql_execute_dml(&request, space_schema);
>> 
>> 2. If you use process_rw, then you do not need iterator to get the previous max_id and increment it - you just do process_rw, get result and do tuple_field_u64(result, 1, space_max_id);

Suppose, this will be fixed, if you will use process_rw, as described above.

>> 
>>> 	if (rc != 0) {
>>> 		iterator_delete(it);
>>> 		return SQL_TARANTOOL_ERROR;
>>> 	}
>>> -	if (pCur->last_tuple != NULL) {
>>> -		box_tuple_unref(pCur->last_tuple);
>>> -	}
>>> -	box_tuple_ref(res);
>>> -	pCur->last_tuple = res;
>>> -	pCur->eState = CURSOR_VALID;
>>> +	rc = tuple_field_u64(res, 1, space_max_id);
>>> +	(*space_max_id)++;
>>> 	iterator_delete(it);
>>> -	return SQLITE_OK;
>>> +	return rc  == 0 ? SQLITE_OK : SQL_TARANTOOL_ERROR;
>>> }
>>> 
>>> /*
>>> @@ -1687,14 +1702,12 @@ int tarantoolSqlite3EphemeralGetMaxId(BtCursor *pCur, uint32_t fieldno,
>>> * If index is empty - return 0 in max_id and success status
>>> */
>>> int
>>> -tarantoolSqlGetMaxId(BtCursor *cur, uint32_t fieldno,
>>> -		     uint64_t *max_id)
>>> +tarantoolSqlGetMaxId(uint32_t space_id, uint64_t *max_id)
>> 
>> 3. After changing a function signature, please, fix a comments too. And why this function is sure, that the max id stored in the first column? And why it can not be used for secondary indexes?
> 
> It might, but it is only applied for _sequence space during DDL routine
> (and this is also answer for the first question: format of system space doesn’t change frequently),
> so I decided to remove cursor/index overhead (as a part of overall DDL refactoring).
> Probably, I should also rename this opcode to make it more clear.

Yes, lets rename it, and remove space_id argument, if it is always used for _sequence space.
>>> 
>>> +/* Opcode: SInsert P1 P2 * * P5
>>> + * Synopsis: space id = P1, key = r[P2]
>>> + *
>>> + * This opcode is used only during DML routine.
>> 
>> 4. Do you mean DDL instead of DML?

Hm?

>> 
>>> + * In contrast to ordinary insertion, insertion to system spaces
>>> + * such as _space or _index will lead to schema changes.
>>> + * Thus, usage of space pointers is going to be impossible,
>>> + * as far as pointers can be expired since compilation time.
>>> + *
>>> + * If P5 is set to OPFLAG_NCHANGE, account overall changes
>>> + * made to database.
>>> + */
>>> +case OP_SInsert: {
>>> +	assert(pOp->p1 > 0);
>>> +
>>> +/* Opcode: SIDtoPtr P1 P2 * * *
>>> + * Synopsis: space id = P1, space[out] = r[P2]
>>> + *
>>> + * This opcode makes look up by space id and save found space
>>> + * into register, specified by the content of register P2.
>>> + * Such trick is needed during DML routine, since schema may
>>> + * change and pointers become expired.
>>> + */
>>> +case OP_SIDtoPtr: {
>> 
>> 6. It seems to be unused in the patch. Can you use it for all these things, introduced by the previous patch?
>> struct space *space = space_by_id(SQLITE_PAGENO_TO_SPACEID(pIdx->tnum));
>> assert(space != NULL);
>> sqlite3VdbeAddOp4Ptr(v, OP_LoadPtr, 0, space_ptr_reg, 0, (void *) space);
>> sqlite3VdbeAddOp3(v, OP_OpenRead, iIdxCur, pIdx->tnum, space_ptr_reg);
>> 
>> Using OP_SIDtoPtr you already in this patch can defer space_by_id lookup.
> 
> It is used in the next patch. I don’t want to make a lot of diffs, since introducing of any opcodes
> leads to auto-generating of opcodes.c/h and as a result to vast amount of redundant diff.

Ok, then lets leave it in this patch.

> As for your example, during SQL DDL we need to execute queries such as:
> 'DELETE FROM BOX_SEQUENCE WHERE name = zName;’
> And ‘name’ is not a key. Hence, we have to open cursor and iterate through the space.
> But if we used pointers to spaces which were fetched at compile state,
> they would become expired at VDBE runtime:
> after first stage of DDL (e.g. deletion from _trigger), schema may change.
> Moreover, we are going to support syntax like: ‘CREATE TABLE AS SELECT …’,
> where DDL and DML are mixing. Thus we need facility to make space lookup at VDBE runtime.

I agree, that we must have this opcode, but why it can not be used for DML too? In my example you at compile time get space *, store it by OP_LoadPtr, and then use to open iterator (but I can not see where - I commented this in a previous patch review), but for DML you can replace compile lookups on runtime lookups using OP_SIDtoPtr. And it also solves the problem, that a prepared VDBE statement (which in a future we will store for a long time) must store space * in opcodes - we can use OP_SIDtoPtr and do not store it until execution. Sorry, can not explain more clear. We can also discuss it face-to-face.

> 


[-- Attachment #2: Type: text/html, Size: 26092 bytes --]

  reply	other threads:[~2018-03-22 13:09 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20 23:48 [tarantool-patches] [PATCH 0/4] Replace space id with space ptrs in VDBE runtime Nikita Pettik
2018-03-20 23:48 ` [tarantool-patches] [PATCH 1/4] sql: pass space pointer to OP_OpenRead/OpenWrite Nikita Pettik
2018-03-21 13:14   ` [tarantool-patches] " Kirill Yukhin
2018-03-22 10:07     ` n.pettik
2018-03-22 11:04       ` v.shpilevoy
2018-03-23 16:01         ` n.pettik
2018-03-20 23:48 ` [tarantool-patches] [PATCH 2/4] sql: introduce opcodes to operate on system spaces Nikita Pettik
2018-03-22 11:42   ` [tarantool-patches] " v.shpilevoy
2018-03-22 12:23     ` n.pettik
2018-03-22 13:09       ` v.shpilevoy [this message]
2018-03-23 16:20         ` n.pettik
2018-03-20 23:48 ` [tarantool-patches] [PATCH 3/4] sql: rework code generation for DDL routine Nikita Pettik
2018-03-22 13:57   ` [tarantool-patches] " v.shpilevoy
2018-03-23 16:33     ` n.pettik
2018-03-20 23:48 ` [tarantool-patches] [PATCH 4/4] sql: rework OP_OpenWrite/OpenRead Nikita Pettik
2018-03-22 14:11   ` [tarantool-patches] " v.shpilevoy
2018-03-23 16:39     ` n.pettik
2018-03-24 12:37 ` [tarantool-patches] Re: [PATCH 0/4] Replace space id with space ptrs in VDBE runtime v.shpilevoy
2018-03-27 16:28   ` n.pettik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B09FDAB3-F048-4706-B29C-7EE796D10AB1@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH 2/4] sql: introduce opcodes to operate on system spaces' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox