[tarantool-patches] Re: [PATCH v1 1/1] sql: prevent executing crossengine sql
Vladislav Shpilevoy
v.shpilevoy at tarantool.org
Mon Jul 23 19:39:42 MSK 2018
On 23/07/2018 19:20, n.pettik wrote:
>
>>> struct iterator *it = index_create_iterator(*pCur->space->index,
>>> ITER_ALL, nil_key,
>>> 0 /* part_count */);
>>> @@ -1134,12 +1135,26 @@ cursor_seek(BtCursor *pCur, int *pRes)
>>> return SQL_TARANTOOL_ITERATOR_FAIL;
>>> }
>>> + struct space *space = pCur->space;
>>> + struct txn *txn = NULL;> +
>>> + assert(space->def->id != 0 || space_is_memtx(space));
>>> + if (space->def->id != 0 &&> + space->def->id != BOX_SQL_STAT4_ID &&
>>> + space->def->id != BOX_SQL_STAT1_ID) {
>>
>> 2. I strongly do not like these 3 checks.
>>
>> * space->id == 0 can be skipped, since begin_ro_stmt needs
>> only space engine.
>>
>> * id != stat4/1 is a huge crutch that should be removed. The
>> later will be dilapidated by any stat or tx manager change.
>> I guess it is because of ANALYZE that scans spaces and then
>> updates _stat1/4. But I can not understand, why it can not
>> firstly read the samples from user spaces, commit ro, and
>> secondly insert the samples into stats in another transaction.
>> Besides, I believe, that user spaces and analyze ones should
>> not be in a single transaction ever as former are actually
>> system ones and should not be mixed with user spaces.
>> Nikita, I appeal to you for onlooking. Is it possible to
>> split ANALYZE in two transactions? How could Kirill do it?
>> Besides there is another option, but IMHO it could seem even
>> more flawed.
>
> Why do you think so?
I am afraid of additional 'if's for such internal thing.
> Actually, I can’t come up with better solution:
> when we start executing OP_AnalysisLoad we always start transaction
> (member that issue with gathering statistics on region to make it consistent?
> See sql_analysis_load()).
I think that on insertion into analyze spaces we correctly start a
transaction, and it is not for region usage only. As I remember,
there were problems with stat spaces consistency if we update stat
not in a transaction. Region is just a convenient benefit.
> Mb we always can avoid calling txn_begin_ro_stmt() on system spaces?
It is okay, but for ephemeral spaces too, it is not? On the parsing
stage we know is the space system one or not. Besides, I think that if
a user calls explicit "SELECT * FROM _space", we should begin ro,
but not when we open an internal cursor (ephemeral, or to insert into
stats or something). It looks like a good question for the server
team chat - should be start read-only statements on explicit SELECT
from a system space? Box API now does it (inside box_index_iterator()).
>
> Quite similar to this issue might be CREATE TABLE AS SELECT
> when it will be implemented. However, in this case we still can avoid
> mixing user and system spaces within one transaction:
>
> INSERT TO _SPACE/_INDEX etc
> SELECT FROM source INSERT TO ephemeral
> START TRANSACTION
> SELECT FROM ephemeral INSERT TO destination
> COMMIT TRANSACTION
Then I do not understand why can not we do the same for ANALYZE:
SELECT samples from different engines with no ro stmts;
START TX;
INSERT tuples into stat1/4;
COMMIT;
>
>> We could introduce a new opcode or add an option
>> for OP_Seek like 'is_atomic' that will trigger ro stmt start.
>> For spaces like analyze and ephemeral ones we could set this
>> option in True thereby "allowing" cross-engine transactions
>> for internal usage when we do not need read-view and other tx
>> manager facilities.
>>
>>
>>> + if (txn_begin_ro_stmt(space, &txn) != 0)
>>> + return SQL_TARANTOOL_ERROR;
>>> + }
>>> struct iterator *it = index_create_iterator(pCur->index, pCur->iter_type,
>>> key, part_count);
>>> if (it == NULL) {
>>> + if (txn != NULL)
>>> + txn_rollback_stmt();
>>> pCur->eState = CURSOR_INVALID;
>>> return SQL_TARANTOOL_ITERATOR_FAIL;
>>> }
>>> + if (txn != NULL)
>>> + txn_commit_ro_stmt(txn);
>>> pCur->iter = it;
>>> pCur->eState = CURSOR_VALID;
>>>
>>
>
More information about the Tarantool-patches
mailing list