[Tarantool-patches] [PATCH v2 6/6] sql: extend result set with alias

Nikita Pettik korablev at tarantool.org
Thu Dec 26 14:24:11 MSK 2019


On 24 Dec 16:34, Vladislav Shpilevoy wrote:
> Thanks for the discussion!
> 
> >>> diff --git a/test/sql/full_metadata.result b/test/sql/full_metadata.result
> >>> index 7c2982682..8f0bced36 100644
> >>> --- a/test/sql/full_metadata.result
> >>> +++ b/test/sql/full_metadata.result
> >>> @@ -118,6 +121,35 @@ execute("SELECT * FROM t;")
> >>>   |   - [1, 1, 'aSd']
> >>>   | ...
> >>>  
> >>> +-- Alias is always set in extended metadata. If column label in
> >>> +-- form of AS clause is set, then this alias is presented in
> >>> +-- metadata. Otherwise, alias is just the same as name.
> >>
> >> 4. But it is not true. The alias is not set always:
> >>
> >> =========================================================
> >>
> >> tarantool> box.execute('PRAGMA full_metadata = true;')
> >> ---
> >> - row_count: 0
> >> ...
> >>
> >> tarantool> box.execute('SELECT 1')
> >> ---
> >> - metadata:
> >>   - name: '1'
> >>     type: integer
> >>   rows:
> >>   - [1]
> >> ...
> >>
> >> =========================================================
> >>
> >> As you can see, no AS = no alias.
> > 
> > Always displaying alias (or label - call it whatever you want)
> > even in full_metadata mode doesn't seem to be reasonable for me.
> > Users can easily handle it in their code. Sending one map field
> > with string value duplicating the name and decoding it may affect
> > performance much more significant than one NULL check (IMHO).
> 
> As I said already, I don't propose to send a duplicate name. I
> propose to handle it on the client side. In Lua you can push one
> string 2 times, and it will cost nothing - they will reference
> one string object.
> 
> In IProto you still can send one name, when possible, and on the
> netbox side push both names.

>From performance point of view this is ok. But still some
users may wonder why alias/label is displayed while there's
no explicit AS clause (even in full_metadata mode).

Moreover, now client only decodes received metadata, but in your
suggestion it is assumed that full_metadata mode is the same on
client and server. Otherwise, it would look inconsistent:

server (full_metadata = true), client (full_metadata = off):
server sends name, is_autoincrement, but client ignores extended
metadata and displays only name;

server (full_metadata = false), clien (full_metadata = true):
server sends name (but in full_metadata mode there's also is_autoincrement
for instance), but client displays only half of extended metadata (alias,
but not is_autoincrement or other fields).

Surely, we can send server's metadata status, but is it worth it?

> >> This is what I don't like
> >> in this patch the most. A user does not have a reliable always
> >> present column with a displayable name: either the column
> >> span or its alias after AS. He can't use name, because it
> >> changes its behavour depending on AS. And he can't use alias,
> >> because sometimes it is not present.
> > 
> >> That leads to confusion,
> >> and necessity to branch in user code a lot.
> > 
> > Yep, if user wants (for some reason; I don't have statistics how
> > often alias value can be useful or/and is used in production code)
> > to get alias value, one should make a small effort and add one
> > 'if' condition:
> > 
> > if (is_full_metadata && alias == NULL)
> >     alias = name;
> > 
> > That's it.
> > 
> 
> Why, if we can remove that effort easily? I still don't see any
> real arguments why can't we always push a label. The argument
> about sending duplicate names is really weak - you can avoid
> sending two names, and push them on the client side, as I said
> above. No extra payload, not extra MessagePack decoding.
> 
> Even the standard driver, to which you appeal so much, always
> returns a label.
> > Alternatively (I don't take into consideration option of swapping
> > name and alias as you suggest) we can change current default behaviour
> > and in 'casual metadata mode' return name as name (not alias):
> > 
> > SELECT 1 AS x; -- would return '1' as name (now it returns 'x' as name). 
> > 
> > But I guess some users may turn out to be confused (in case they
> > haven't red about full_metadata pragma in docs).
> > 
> >> The same example shows, that alias is a bad name. Here you need
> >> to return it, but it is not alias like AS. 'Label' would fit
> >> better here.
> > 
> > I don't care much how to call it. If you prefer 'label', I will rename it.
> 
> Yes, I prefer label. Not because this is my personal taste. I provided
> reasons, why it can't be called 'alias'. If you think alias is better,
> they say why. That is a discussion.

There are no strict definitions under these terms, so it is just
matter of taste. Let's call it label, I am not against (I guess
you assume that label = (name ∪ alias), meanwhile alias is supposed
only to be idenfitier after AS clause). 
 
> > TBO I also don't care much which approach we should follow: always return
> > alias/label, fix current name/alias resolution, or swap name and alias
> > at all. If you insist on certain implementation, let's do it. If you
> > don't care much as well, let's ask smb else to take a final decision.
> > 
> 
> I insist on not breaking the current behaviour. The patch breaks it -
> 'name' behaves differently whether full_metadata is specified or not.
> 
> Current clients may rely on the fact, that 'name' is always present,
> and that it behaves like now: returns either alias, or the original
> expression.
> 
> Here are solutions with which I am good:
> 
> - Keep the name as is, and add a new column meaning the original
>   expression: 'orig_name', 'oname', 'span', 'column', 'expr', etc.
>   In that case we don't break anything, we satisfy the driver, and
>   'full_metadata' only adds new fields. It does not change the
>   behaviour of always present columns.
> 
> - Or we could break the 'name' column a little, as you proposed -
>   always return it as if AS is not specified, even in short metadata.
>   In the full metadata it still will return the original expression,
>   but we add a 'label' column, which returns either alias or name.

I am going to implement last option.

> In both solutions I want to rely on that both columns are always
> present on the client side, when full metadata is specified.
> Regardless of how are they sent in the binary protocol.
> 
> Talking of asking somebody - I tried to get a final decision of what we
> should return from Alexander, because others just don't care. But the
> only thing he provides is samples of how other DBs behave. AFAIU,
> whatever we return and with whatever names, the driver will handle that.
> 
> But how the driver will parse our protocol is not a public thing. I worry
> about public API, in Lua. Here the names and their behaviour matter.


More information about the Tarantool-patches mailing list