[tarantool-patches] Re: [PATCH v3 0/3] box: run checks on insertions in LUA spaces

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Sun May 19 19:02:05 MSK 2019



On 14/05/2019 18:02, Kirill Shcherbatov wrote:
> @v.shpilevoy
>> Yes, I will. Kirill, please, send it again in a new thread. You can keep
>> version 3 and omit change list.
> 
> @kostya
>> It's better to fetch the bound field upon first access.
>> Most paths of the CHECK constraint may not touch most of the
>> fields.
> I have no idea, how, to fit it in our architecture.
> OP_Column has no intersections with binding machinery.

Talking "I have no idea how to implement something" is always
a bad decision. As I said earlier, you are not a text editor,
who just writes code fetched from review comments. What did you
do to investigate possibility of reusing OP_Column?

OP_Column has nothing to do with bind, but it contains all
the necessary tuple decoding, with caching of decoded fields,
and which you implemented again in ck_constraint.c with all these
column masks etc. IMO this is the reason, why SQL part of checks
still is huge. I expected, that this patchset should have almost
completely eliminated all check-hooks from SQL.

OP_Column in a nutshell takes raw tuple data, decodes to the
requested field number, remembers offsets to the decoded ones,
and returns the field. Next accesses of the same field are O(1).
Unused fields beyond max used field number are not even decoded.
This is the exact functionality we want in checks, and which you
are trying to do with the column mask.

The only problem is that OP_Column strongly depends on cursor.
You can either somehow refactor it, or split into 2 opcodes -
first fetches a tuple from a cursor, second fetches a field.

So as not to slow down OP_Column, we can make something similar
to OP_Function0 and OP_Function - OP_Function0 automatically
falls through to OP_Function. Here we can do something like:

	case OP_Column:
		/* take tuple data ... */
		/* prepare to OP_Fetch call, fill needed registers ... */
		FALLTHROUGH;
	case OP_Fetch:
		/* extract a column from tuple data */
		break;

Column fetch has a context: array of offsets, flag if header is
parsed, raw tuple data etc. Since OP_Fetch should be able to work
without OP_Column, we need to extract that context into a struct
tuple_fetcher or something. Check compiler will call OP_Fetch for
referenced columns.

Now we need somehow to bind a tuple to the fetcher at runtime. For
that we could either

- create a fetcher on region with an existing tuple pointer and bind
  it as some kind of cdata. For that you need to add a new sql_bind_*
  function, which binds a raw pointer;

- or add an opcode OP_FetcherNew at compilation time, which will
  create a fetcher, and take tuple data from a register, to which
  we will bind tuple cdata at runtime. For that you again need to be
  able to bind a pointer so as not to copy tuple data.

I think, the first way is better.

This solution solves another problem of your solution -
sql_bind_column copies data. If a CHECK constraint checks strings
or blobs, then it is expensive to copy them.

One another problem is that your solution is not scalable. You
implemented lots of machinery usable for checks only, but we will
have another place, where we need to access tuple fields without
a cursor - functional indexes. I am 99% sure, they will need the
same kind of 'fetcher' to extract fields from an existing tuple
data. It is impossible to implement all these column masks again
for each place, where we need to fetch tuple fields not from a
cursor.




More information about the Tarantool-patches mailing list