[tarantool-patches] Re: [PATCH v2] sql: fix ephemeral table next rowid generation

n.pettik korablev at tarantool.org
Wed Jul 25 18:06:22 MSK 2018


> diff --git a/src/box/memtx_space.c b/src/box/memtx_space.c
> index f2d512f21..5a80d48ab 100644
> --- a/src/box/memtx_space.c
> +++ b/src/box/memtx_space.c
> @@ -843,6 +843,8 @@ static void
> memtx_init_ephemeral_space(struct space *space)
> {
> 	memtx_space_add_primary_key(space);
> +	struct memtx_space * ephem_space = (struct memtx_space *) space;
> +	ephem_space->next_rowid = 0;
> }
> 
> static int
> diff --git a/src/box/memtx_space.h b/src/box/memtx_space.h
> index 7dc341079..26181166a 100644
> --- a/src/box/memtx_space.h
> +++ b/src/box/memtx_space.h
> @@ -48,6 +48,11 @@ struct memtx_space {
> 	 */
> 	int (*replace)(struct space *, struct tuple *, struct tuple *,
> 		       enum dup_replace_mode, struct tuple **);
> +	/**
> +	 * Next rowid. This number is added to the end of pk to

In fact, to the end of tuple to be inserted. Now ofc PK of ephemeral
spaces consists of all fields, but it may change someday, I guess.

Moreover, ephemeral spaces aren’t supposed to be feature exclusively
of memtx engine. Vinyl ephemeral spaces also have right to exist. 

> +	 * allow to store non-unique rows in ephemeral tables.
> +	 */
> +	uint64_t next_rowid;
> };
> 
> /**
> diff --git a/src/box/sql.c b/src/box/sql.c
> index 6104a6d0f..fdf0313d3 100644
> --- a/src/box/sql.c
> +++ b/src/box/sql.c
> @@ -47,6 +47,7 @@
> #include "box.h"
> #include "txn.h"
> #include "space.h"
> +#include "memtx_space.h"
> #include "space_def.h"
> #include "index_def.h"
> #include "tuple.h"
> @@ -448,6 +449,18 @@ int tarantoolSqlite3EphemeralDrop(BtCursor *pCur)
> 	return SQLITE_OK;
> }
> 
> +/**
> + * Get next rowid for an ephemeral table.
> + */
> +uint64_t
> +tarantool_ephemeral_next_rowid(BtCursor *bt_cur)

Firstly, we don’t use ’tarantool_’ prefix anymore. Also we use ’struct’ prefix
for compound types: struct BtCursor *bt_cur. Moreover, I don’t see comment
for this function.

Secondly, I see your message within issue discussion:
"
Discussed verbally with
@kyukhin , @Korablev77 , @kostja and decided, that the unique rowid should be encapsulated into
vtab method of ephemeral space.
“

But this function doesn’t seem to be method of vtab…

> +{
> +	assert(bt_cur);
> +	assert(bt_cur->curFlags & BTCF_TEphemCursor);
> +	struct memtx_space * ephem_space = (struct memtx_space *) bt_cur->space;
> +	return ephem_space->next_rowid++;
> +}
> +
> static inline int
> insertOrReplace(struct space *space, const char *tuple, const char *tuple_end,
> 		enum iproto_type type)
> @@ -1080,7 +1093,6 @@ key_alloc(BtCursor *cur, size_t key_size)
> 		}
> 		cur->key = new_key;
> 	}
> -	cur->nKey = key_size;
> 	return 0;
> }
> 
> @@ -1621,37 +1633,6 @@ sql_debug_info(struct info_handler *h)
> 	info_end(h);
> }
> 
> 
> diff --git a/src/box/sql/tarantoolInt.h b/src/box/sql/tarantoolInt.h
> index 94f94cb44..369af77ff 100644
> --- a/src/box/sql/tarantoolInt.h
> +++ b/src/box/sql/tarantoolInt.h
> @@ -116,8 +116,8 @@ int tarantoolSqlite3EphemeralDelete(BtCursor * pCur);
> int tarantoolSqlite3EphemeralCount(BtCursor * pCur, i64 * pnEntry);
> int tarantoolSqlite3EphemeralDrop(BtCursor * pCur);
> int tarantoolSqlite3EphemeralClearTable(BtCursor * pCur);
> -int tarantoolSqlite3EphemeralGetMaxId(BtCursor * pCur, uint32_t fieldno,
> -				       uint64_t * max_id);
> +uint64_t
> +tarantool_ephemeral_next_rowid(BtCursor *bt_cur);
> 
> /**
>  * Performs exactly as extract_key + sqlite3VdbeCompareMsgpack,
> diff --git a/src/box/sql/vdbe.c b/src/box/sql/vdbe.c
> index 89c35d218..532c98e54 100644
> --- a/src/box/sql/vdbe.c
> +++ b/src/box/sql/vdbe.c
> @@ -3790,27 +3790,19 @@ case OP_NextSequenceId: {
> 	break;
> }
> 
> -/* Opcode: NextIdEphemeral P1 P2 P3 * *
> - * Synopsis: r[P3]=get_max(space_index[P1]{Column[P2]})
> +/* Opcode: NextIdEphemeral P1 P2 * * *
> + * Synopsis: r[P2]=get_next_rowid(space_index[P1]})
>  *
> - * This opcode works in the same way as OP_NextId does, except it is
> - * only applied for ephemeral tables. The difference is in the fact that
> - * all ephemeral tables don't have space_id (to be more precise it equals to zero).
> + * This opcode stores next `rowid` for the ephemeral space to P2 register.
> + * `rowid` is necessary, because inserted to ephemeral space tuples may be not
> + * unique, while Tarantool`s spaces can contain only unique tuples.

Nitpicking: generally speaking, ephemeral spaces can contain duplicates
when it comes for NULLs. You may investigate how this request works:

SELECT (SELECT NULL UNION SELECT NULL) EXCEPT SELECT NULL;

:)

>  */
> case OP_NextIdEphemeral: {
> 	VdbeCursor *pC;
> -	int p2;
> 	pC = p->apCsr[pOp->p1];
> -	p2 = pOp->p2;
> -	pOut = &aMem[pOp->p3];
> -
> -	assert(pC->uc.pCursor->curFlags & BTCF_TEphemCursor);
> -
> -	rc = tarantoolSqlite3EphemeralGetMaxId(pC->uc.pCursor, p2,
> -					       (uint64_t *) &pOut->u.i);
> -	if (rc) goto abort_due_to_error;
> -
> -	pOut->u.i += 1;
> +	pOut = &aMem[pOp->p2];
> +	struct BtCursor * bt_cur = pC->uc.pCursor;
> +	pOut->u.i = tarantool_ephemeral_next_rowid(bt_cur);

Should we add some checks on rowid overflow? Ephemeral spaces can be extensively
used for JOIN’s where number of inserted to ephemeral space tuples is O(n^2) in
worst case (for two tables). On the other hand, even simple check can slow down execution
especially when it concerns millions of tuples...

> 	pOut->flags = MEM_Int;
> 	break;
> }
> diff --git a/test/sql-tap/gh-3297_ephemeral_rowid.test.lua b/test/sql-tap/gh-3297_ephemeral_rowid.test.lua
> new file mode 100755
> index 000000000..ed596e7ad
> --- /dev/null
> +++ b/test/sql-tap/gh-3297_ephemeral_rowid.test.lua
> @@ -0,0 +1,30 @@
> +#!/usr/bin/env tarantool
> +test = require("sqltester")
> +test:plan(1)
> +
> +-- Check that OP_NextIdEphemeral generates unique ids.
> +
> +test:execsql [[
> +    CREATE TABLE t1(a primary key);
> +    CREATE TABLE t2(a primary key, b);
> +    insert into t1 values(12);
> +    insert into t2 values(1, 5);
> +    insert into t2 values(2, 2);
> +    insert into t2 values(3, 2);
> +    insert into t2 values(4, 2);
> +]]
> +
> +test:do_execsql_test(
> +    "gh-3297-1",
> +    [[
> +        select * from ( select a from t1 limit 1), (select b from t2 limit 10);

Use upper-case for all SQL statements pls.






More information about the Tarantool-patches mailing list