From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B4C162E554 for ; Sun, 19 May 2019 12:02:09 -0400 (EDT) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id THtY6H4dbm4C for ; Sun, 19 May 2019 12:02:09 -0400 (EDT) Received: from smtp44.i.mail.ru (smtp44.i.mail.ru [94.100.177.104]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 758992E52A for ; Sun, 19 May 2019 12:02:09 -0400 (EDT) Subject: [tarantool-patches] Re: [PATCH v3 0/3] box: run checks on insertions in LUA spaces References: From: Vladislav Shpilevoy Message-ID: <58d2ea98-343c-e460-60e2-634c187abcc1@tarantool.org> Date: Sun, 19 May 2019 19:02:05 +0300 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-Help: List-Unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-Subscribe: List-Owner: List-post: List-Archive: To: tarantool-patches@freelists.org, Kirill Shcherbatov , Konstantin Osipov On 14/05/2019 18:02, Kirill Shcherbatov wrote: > @v.shpilevoy >> Yes, I will. Kirill, please, send it again in a new thread. You can keep >> version 3 and omit change list. > > @kostya >> It's better to fetch the bound field upon first access. >> Most paths of the CHECK constraint may not touch most of the >> fields. > I have no idea, how, to fit it in our architecture. > OP_Column has no intersections with binding machinery. Talking "I have no idea how to implement something" is always a bad decision. As I said earlier, you are not a text editor, who just writes code fetched from review comments. What did you do to investigate possibility of reusing OP_Column? OP_Column has nothing to do with bind, but it contains all the necessary tuple decoding, with caching of decoded fields, and which you implemented again in ck_constraint.c with all these column masks etc. IMO this is the reason, why SQL part of checks still is huge. I expected, that this patchset should have almost completely eliminated all check-hooks from SQL. OP_Column in a nutshell takes raw tuple data, decodes to the requested field number, remembers offsets to the decoded ones, and returns the field. Next accesses of the same field are O(1). Unused fields beyond max used field number are not even decoded. This is the exact functionality we want in checks, and which you are trying to do with the column mask. The only problem is that OP_Column strongly depends on cursor. You can either somehow refactor it, or split into 2 opcodes - first fetches a tuple from a cursor, second fetches a field. So as not to slow down OP_Column, we can make something similar to OP_Function0 and OP_Function - OP_Function0 automatically falls through to OP_Function. Here we can do something like: case OP_Column: /* take tuple data ... */ /* prepare to OP_Fetch call, fill needed registers ... */ FALLTHROUGH; case OP_Fetch: /* extract a column from tuple data */ break; Column fetch has a context: array of offsets, flag if header is parsed, raw tuple data etc. Since OP_Fetch should be able to work without OP_Column, we need to extract that context into a struct tuple_fetcher or something. Check compiler will call OP_Fetch for referenced columns. Now we need somehow to bind a tuple to the fetcher at runtime. For that we could either - create a fetcher on region with an existing tuple pointer and bind it as some kind of cdata. For that you need to add a new sql_bind_* function, which binds a raw pointer; - or add an opcode OP_FetcherNew at compilation time, which will create a fetcher, and take tuple data from a register, to which we will bind tuple cdata at runtime. For that you again need to be able to bind a pointer so as not to copy tuple data. I think, the first way is better. This solution solves another problem of your solution - sql_bind_column copies data. If a CHECK constraint checks strings or blobs, then it is expensive to copy them. One another problem is that your solution is not scalable. You implemented lots of machinery usable for checks only, but we will have another place, where we need to access tuple fields without a cursor - functional indexes. I am 99% sure, they will need the same kind of 'fetcher' to extract fields from an existing tuple data. It is impossible to implement all these column masks again for each place, where we need to fetch tuple fields not from a cursor.