[tarantool-patches] Re: [PATCH v1 1/1] sql: prevent executing crossengine sql

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Mon Jul 23 19:39:42 MSK 2018

On 23/07/2018 19:20, n.pettik wrote:
>>>   	struct iterator *it = index_create_iterator(*pCur->space->index,
>>>   						    ITER_ALL, nil_key,
>>>   						    0 /* part_count */);
>>> @@ -1134,12 +1135,26 @@ cursor_seek(BtCursor *pCur, int *pRes)
>>>   	}
>>>   +	struct space *space = pCur->space;
>>> +	struct txn *txn = NULL;> +
>>> +	assert(space->def->id != 0 || space_is_memtx(space));
>>> +	if (space->def->id != 0 &&> +	    space->def->id != BOX_SQL_STAT4_ID &&
>>> +	    space->def->id != BOX_SQL_STAT1_ID) {
>> 2. I strongly do not like these 3 checks.
>> * space->id == 0 can be skipped, since begin_ro_stmt needs
>> only space engine.
>> * id != stat4/1 is a huge crutch that should be removed. The
>> later will be dilapidated by any stat or tx manager change.
>> I guess it is because of ANALYZE that scans spaces and then
>> updates _stat1/4. But I can not understand, why it can not
>> firstly read the samples from user spaces, commit ro, and
>> secondly insert the samples into stats in another transaction.
>> Besides, I believe, that user spaces and analyze ones should
>> not be in a single transaction ever as former are actually
>> system ones and should not be mixed with user spaces.
>> Nikita, I appeal to you for onlooking. Is it possible to
>> split ANALYZE in two transactions? How could Kirill do it?
>> Besides there is another option, but IMHO it could seem even
>> more flawed.
> Why do you think so?

I am afraid of additional 'if's for such internal thing.

> Actually, I can’t come up with better solution:
> when we start executing OP_AnalysisLoad we always start transaction
> (member that issue with gathering statistics on region to make it consistent?
> See sql_analysis_load()).

I think that on insertion into analyze spaces we correctly start a
transaction, and it is not for region usage only. As I remember,
there were problems with stat spaces consistency if we update stat
not in a transaction. Region is just a convenient benefit.

> Mb we always can avoid calling txn_begin_ro_stmt() on system spaces?

It is okay, but for ephemeral spaces too, it is not? On the parsing
stage we know is the space system one or not. Besides, I think that if
a user calls explicit "SELECT * FROM _space", we should begin ro,
but not when we open an internal cursor (ephemeral, or to insert into
stats or something). It looks like a good question for the server
team chat - should be start read-only statements on explicit SELECT
from a system space? Box API now does it (inside box_index_iterator()).

> Quite similar to this issue might be CREATE TABLE AS SELECT
> when it will be implemented. However, in this case we still can avoid
> mixing user and system spaces within one transaction:
> SELECT FROM source INSERT TO ephemeral
> SELECT FROM ephemeral INSERT TO destination

Then I do not understand why can not we do the same for ANALYZE:

SELECT samples from different engines with no ro stmts;
INSERT tuples into stat1/4;

>> We could introduce a new opcode or add an option
>> for OP_Seek like 'is_atomic' that will trigger ro stmt start.
>> For spaces like analyze and ephemeral ones we could set this
>> option in True thereby "allowing" cross-engine transactions
>> for internal usage when we do not need read-view and other tx
>> manager facilities.
>>> +		if (txn_begin_ro_stmt(space, &txn) != 0)
>>> +			return SQL_TARANTOOL_ERROR;
>>> +	}
>>>   	struct iterator *it = index_create_iterator(pCur->index, pCur->iter_type,
>>>   						    key, part_count);
>>>   	if (it == NULL) {
>>> +		if (txn != NULL)
>>> +			txn_rollback_stmt();
>>>   		pCur->eState = CURSOR_INVALID;
>>>   	}
>>> +	if (txn != NULL)
>>> +		txn_commit_ro_stmt(txn);
>>>   	pCur->iter = it;
>>>   	pCur->eState = CURSOR_VALID;

More information about the Tarantool-patches mailing list