Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: tarantool-patches@freelists.org,
	Kirill Shcherbatov <kshcherbatov@tarantool.org>,
	Konstantin Osipov <kostja@tarantool.org>
Subject: [tarantool-patches] Re: [PATCH v3 0/3] box: run checks on insertions in LUA spaces
Date: Sun, 19 May 2019 19:02:05 +0300	[thread overview]
Message-ID: <58d2ea98-343c-e460-60e2-634c187abcc1@tarantool.org> (raw)
In-Reply-To: <cover.1557845850.git.kshcherbatov@tarantool.org>



On 14/05/2019 18:02, Kirill Shcherbatov wrote:
> @v.shpilevoy
>> Yes, I will. Kirill, please, send it again in a new thread. You can keep
>> version 3 and omit change list.
> 
> @kostya
>> It's better to fetch the bound field upon first access.
>> Most paths of the CHECK constraint may not touch most of the
>> fields.
> I have no idea, how, to fit it in our architecture.
> OP_Column has no intersections with binding machinery.

Talking "I have no idea how to implement something" is always
a bad decision. As I said earlier, you are not a text editor,
who just writes code fetched from review comments. What did you
do to investigate possibility of reusing OP_Column?

OP_Column has nothing to do with bind, but it contains all
the necessary tuple decoding, with caching of decoded fields,
and which you implemented again in ck_constraint.c with all these
column masks etc. IMO this is the reason, why SQL part of checks
still is huge. I expected, that this patchset should have almost
completely eliminated all check-hooks from SQL.

OP_Column in a nutshell takes raw tuple data, decodes to the
requested field number, remembers offsets to the decoded ones,
and returns the field. Next accesses of the same field are O(1).
Unused fields beyond max used field number are not even decoded.
This is the exact functionality we want in checks, and which you
are trying to do with the column mask.

The only problem is that OP_Column strongly depends on cursor.
You can either somehow refactor it, or split into 2 opcodes -
first fetches a tuple from a cursor, second fetches a field.

So as not to slow down OP_Column, we can make something similar
to OP_Function0 and OP_Function - OP_Function0 automatically
falls through to OP_Function. Here we can do something like:

	case OP_Column:
		/* take tuple data ... */
		/* prepare to OP_Fetch call, fill needed registers ... */
		FALLTHROUGH;
	case OP_Fetch:
		/* extract a column from tuple data */
		break;

Column fetch has a context: array of offsets, flag if header is
parsed, raw tuple data etc. Since OP_Fetch should be able to work
without OP_Column, we need to extract that context into a struct
tuple_fetcher or something. Check compiler will call OP_Fetch for
referenced columns.

Now we need somehow to bind a tuple to the fetcher at runtime. For
that we could either

- create a fetcher on region with an existing tuple pointer and bind
  it as some kind of cdata. For that you need to add a new sql_bind_*
  function, which binds a raw pointer;

- or add an opcode OP_FetcherNew at compilation time, which will
  create a fetcher, and take tuple data from a register, to which
  we will bind tuple cdata at runtime. For that you again need to be
  able to bind a pointer so as not to copy tuple data.

I think, the first way is better.

This solution solves another problem of your solution -
sql_bind_column copies data. If a CHECK constraint checks strings
or blobs, then it is expensive to copy them.

One another problem is that your solution is not scalable. You
implemented lots of machinery usable for checks only, but we will
have another place, where we need to access tuple fields without
a cursor - functional indexes. I am 99% sure, they will need the
same kind of 'fetcher' to extract fields from an existing tuple
data. It is impossible to implement all these column masks again
for each place, where we need to fetch tuple fields not from a
cursor.

      parent reply	other threads:[~2019-05-19 16:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-14 15:02 [tarantool-patches] " Kirill Shcherbatov
2019-05-14 15:02 ` [tarantool-patches] [PATCH v3 1/3] schema: add new system space for CHECK constraints Kirill Shcherbatov
2019-05-14 18:29   ` [tarantool-patches] " Konstantin Osipov
2019-05-14 15:02 ` [tarantool-patches] [PATCH v3 2/3] box: run check constraint tests on space alter Kirill Shcherbatov
2019-05-14 15:02 ` [tarantool-patches] [PATCH v3 3/3] box: user-friendly interface to manage ck constraints Kirill Shcherbatov
2019-05-14 20:09   ` [tarantool-patches] " Konstantin Osipov
2019-05-14 17:00 ` [tarantool-patches] Re: [PATCH v3 0/3] box: run checks on insertions in LUA spaces Konstantin Osipov
     [not found]   ` <FB4C0C9D-E56E-4FEF-A2C5-87AD8BF634F9@gmail.com>
2019-05-14 18:22     ` n.pettik
2019-05-14 18:41       ` Konstantin Osipov
2019-05-14 18:49         ` n.pettik
2019-05-15  8:37           ` Kirill Shcherbatov
2019-05-16 13:51 ` Kirill Shcherbatov
2019-05-19 16:02 ` Vladislav Shpilevoy [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58d2ea98-343c-e460-60e2-634c187abcc1@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=kostja@tarantool.org \
    --cc=kshcherbatov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='[tarantool-patches] Re: [PATCH v3 0/3] box: run checks on insertions in LUA spaces' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox