22 марта 2018 г., в 15:23, n.pettik <korablev@tarantool.org> написал(а):


On 22 Mar 2018, at 14:42, v.shpilevoy@tarantool.org wrote:

See below 6 comments.

21 марта 2018 г., в 2:48, Nikita Pettik <korablev@tarantool.org> написал(а):

Since it is impossible to use space pointers during execution of DDL
(after any DDL schema may change and pointers fetched at compile time
can become expired), special opcodes to operate on system spaces have
been introduced: OP_SInsert and OP_SDelete. They take space id as
an argument and make space lookup each time before insertion or deletion.
However, sometimes it is still required to iterate through space (e.g.
to satisfy WHERE clause) during DDL. As far as now cursors rely on
pointers to space, it is required to convert space id to space pointer
during VDBE execution. Hence, another opcode is added: OP_SIDtoPtr.
Finally, existing opcodes, which are invoked only during DDL, have been
also refactored.

Part of #3252
---
src/box/sql.c              |  51 ++++++++++++-------
src/box/sql/opcodes.c      |  45 +++++++++--------
src/box/sql/opcodes.h      |  53 ++++++++++----------
src/box/sql/tarantoolInt.h |  11 ++---
src/box/sql/vdbe.c         | 121 ++++++++++++++++++++++++++++++++++-----------
5 files changed, 182 insertions(+), 99 deletions(-)

diff --git a/src/box/sql.c b/src/box/sql.c
index e943131ae..db4c7e7ea 100644
--- a/src/box/sql.c
+++ b/src/box/sql.c
+int
+sql_delete_by_key(struct space *space, char *key, uint32_t key_size)
+{
struct request request;
memset(&request, 0, sizeof(request));
request.type = IPROTO_DELETE;
request.key = key;
request.key_end = key + key_size;
- request.space_id = pCur->space->def->id;
- rc = sql_execute_dml(&request, pCur->space);
+ request.space_id = space->def->id;
+ int rc = sql_execute_dml(&request, space);

1. Why do you need sql_execute_dml? It does exactly the same as process_rw, but process_rw can also return a tuple. And this process_rw feature is needed for the patch on last inserted tuple returning.

I need it as an analogue of process_rw(), since process_rw() is unavailable 
via public BOX interface (declared as static). If you approve to make it public,
where should I place it and should I rename it?
Or, for instance, make sql_execute_dml be complete copy of process_rw()?
It is unlikely to be useful somewhere except for SQL (and box internals surely).
Initial indent was to reduce overhead while calling box functions from SQL bindings (sql.c):
ordinary call stack looks like: box_insert() -> box_process1() -> space lookup + process_rw().
In such scheme we also process redundant space lookup since we already have struct space * in cursor.
(Patch where I introduced this function isn’t in trunk, so I can easily fix it).

Yes, I very like the removal of box_process1. I just do not like code duplicate. But also I can not see a place for declaration of process_rw in any header. So lets make it extern "C" non-static in box.cc, and declare it as extern in sql.c.



@@ -1088,20 +1107,16 @@ int tarantoolSqlite3IncrementMaxid(BtCursor *pCur)
request.ops = ops;
request.ops_end = ops + sizeof(ops);
request.type = IPROTO_UPSERT;
- request.space_id = pCur->space->def->id;
- rc = sql_execute_dml(&request, pCur->space);
+ request.space_id = BOX_SCHEMA_ID;
+ rc = sql_execute_dml(&request, space_schema);

2. If you use process_rw, then you do not need iterator to get the previous max_id and increment it - you just do process_rw, get result and do tuple_field_u64(result, 1, space_max_id);

Suppose, this will be fixed, if you will use process_rw, as described above.


if (rc != 0) {
iterator_delete(it);
return SQL_TARANTOOL_ERROR;
}
- if (pCur->last_tuple != NULL) {
- box_tuple_unref(pCur->last_tuple);
- }
- box_tuple_ref(res);
- pCur->last_tuple = res;
- pCur->eState = CURSOR_VALID;
+ rc = tuple_field_u64(res, 1, space_max_id);
+ (*space_max_id)++;
iterator_delete(it);
- return SQLITE_OK;
+ return rc  == 0 ? SQLITE_OK : SQL_TARANTOOL_ERROR;
}

/*
@@ -1687,14 +1702,12 @@ int tarantoolSqlite3EphemeralGetMaxId(BtCursor *pCur, uint32_t fieldno,
* If index is empty - return 0 in max_id and success status
*/
int
-tarantoolSqlGetMaxId(BtCursor *cur, uint32_t fieldno,
-      uint64_t *max_id)
+tarantoolSqlGetMaxId(uint32_t space_id, uint64_t *max_id)

3. After changing a function signature, please, fix a comments too. And why this function is sure, that the max id stored in the first column? And why it can not be used for secondary indexes?

It might, but it is only applied for _sequence space during DDL routine
(and this is also answer for the first question: format of system space doesn’t change frequently),
so I decided to remove cursor/index overhead (as a part of overall DDL refactoring).
Probably, I should also rename this opcode to make it more clear.

Yes, lets rename it, and remove space_id argument, if it is always used for _sequence space.

+/* Opcode: SInsert P1 P2 * * P5
+ * Synopsis: space id = P1, key = r[P2]
+ *
+ * This opcode is used only during DML routine.

4. Do you mean DDL instead of DML?

Hm?


+ * In contrast to ordinary insertion, insertion to system spaces
+ * such as _space or _index will lead to schema changes.
+ * Thus, usage of space pointers is going to be impossible,
+ * as far as pointers can be expired since compilation time.
+ *
+ * If P5 is set to OPFLAG_NCHANGE, account overall changes
+ * made to database.
+ */
+case OP_SInsert: {
+ assert(pOp->p1 > 0);
+
+/* Opcode: SIDtoPtr P1 P2 * * *
+ * Synopsis: space id = P1, space[out] = r[P2]
+ *
+ * This opcode makes look up by space id and save found space
+ * into register, specified by the content of register P2.
+ * Such trick is needed during DML routine, since schema may
+ * change and pointers become expired.
+ */
+case OP_SIDtoPtr: {

6. It seems to be unused in the patch. Can you use it for all these things, introduced by the previous patch?
struct space *space = space_by_id(SQLITE_PAGENO_TO_SPACEID(pIdx->tnum));
assert(space != NULL);
sqlite3VdbeAddOp4Ptr(v, OP_LoadPtr, 0, space_ptr_reg, 0, (void *) space);
sqlite3VdbeAddOp3(v, OP_OpenRead, iIdxCur, pIdx->tnum, space_ptr_reg);

Using OP_SIDtoPtr you already in this patch can defer space_by_id lookup.

It is used in the next patch. I don’t want to make a lot of diffs, since introducing of any opcodes
leads to auto-generating of opcodes.c/h and as a result to vast amount of redundant diff.

Ok, then lets leave it in this patch.

As for your example, during SQL DDL we need to execute queries such as:
'DELETE FROM BOX_SEQUENCE WHERE name = zName;’
And ‘name’ is not a key. Hence, we have to open cursor and iterate through the space.
But if we used pointers to spaces which were fetched at compile state,
they would become expired at VDBE runtime:
after first stage of DDL (e.g. deletion from _trigger), schema may change.
Moreover, we are going to support syntax like: ‘CREATE TABLE AS SELECT …’,
where DDL and DML are mixing. Thus we need facility to make space lookup at VDBE runtime.

I agree, that we must have this opcode, but why it can not be used for DML too? In my example you at compile time get space *, store it by OP_LoadPtr, and then use to open iterator (but I can not see where - I commented this in a previous patch review), but for DML you can replace compile lookups on runtime lookups using OP_SIDtoPtr. And it also solves the problem, that a prepared VDBE statement (which in a future we will store for a long time) must store space * in opcodes - we can use OP_SIDtoPtr and do not store it until execution. Sorry, can not explain more clear. We can also discuss it face-to-face.