[Tarantool-patches] [PATCH] box: remove unicode_ci for functions

Konstantin Osipov kostja.osipov at gmail.com
Mon Dec 2 17:49:27 MSK 2019


* Nikita Pettik <korablev at tarantool.org> [19/12/02 17:39]:
> On 02 Dec 10:07, Konstantin Osipov wrote:
> > * Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [19/12/01 19:29]:
> > > >> Unicode_ci collation breaks the general
> > > >> rule for objects naming, so we remove it
> > > >> in version 2.3.1
> > > > 
> > > > The code works according to RFC.
> > > > 
> > > > There is a justification for this behaviour in RFC.
> > 
> > Please see my reply with an explanation. The RFC was written
> > presuming https://github.com/tarantool/tarantool/issues/4467
> > will be fixed.
> 
> According to the milestone (which is 'feature'), it is not going to be
> implemented soon. What is more, there isn't even a clearly stated
> proposal or RFC without contradictions.

Uhm, you could of course shift the burden of writing an RFC onto me
and do nothing on that premise. But come on, ask users how much
pain the uppercasing causes. It is not about me having my way,
it is about fixing a broken implementation.

As for contradictions in the RFC, well, there was some ambiguity in
what I initially suggested, but it was later resolved in the
comments to the ticket.

> 
> '''
> To avoid name clash, we will reserve these names by adding entries for them in _func system space.
> '''
> 
> That's all.
> 
> I can't figure out what the author really meant by 'name clash'.
> We are able to create two different objects (of any kind: space,
> trigger etc) with the same name in terms of a case-insensitive
> collation (e.g. "t1" and "T1"). Why should this rule be violated
> for functions?

The idea of _ci is that SQL function lookup never returns a
non-built-in function when the name of a built-in function is used.

I guess as long as SQL uppercases the name before lookup, this
won't happen even if the collation is case-sensitive - all the
uppercase names are reserved already.

But imagine we stop uppercasing, as 4467 suggests. Then, unless
the collation is _ci, a lowercase use of a built-in's name will
return another function, if one exists under that lowercase name.

This is why _ci is in there - to prevent two functions with the
same name (but different casing) from ever existing.
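
To make the argument concrete, here is a toy Python model of it. It
is not Tarantool code: the dict stands in for _func, "ABS" stands in
for any reserved built-in name, and str.casefold() is only a rough
stand-in for unicode_ci (the real collation is ICU-based). All
function names below are made up for the illustration.

```python
# Toy model of a function registry; not Tarantool code.
BUILTIN = "builtin"
USER = "user"


def make_registry():
    # Built-in names are reserved in uppercase, as the RFC proposes.
    return {"ABS": BUILTIN}


def lookup(registry, name, uppercase_before_lookup):
    # SQL either uppercases the name first (current behaviour)
    # or looks it up as written (what 4467 suggests).
    key = name.upper() if uppercase_before_lookup else name
    return registry.get(key)


def create_case_sensitive(registry, name):
    # Uniqueness is checked on the exact spelling of the name.
    if name in registry:
        raise ValueError(f"function {name!r} already exists")
    registry[name] = USER


def create_case_insensitive(registry, name):
    # casefold() approximates a case-insensitive collation here.
    folded = name.casefold()
    if any(existing.casefold() == folded for existing in registry):
        raise ValueError(f"function {name!r} clashes with an existing name")
    registry[name] = USER


# Case-sensitive uniqueness: a user function "abs" can be created.
cs = make_registry()
create_case_sensitive(cs, "abs")
with_upper = lookup(cs, "abs", uppercase_before_lookup=True)
without_upper = lookup(cs, "abs", uppercase_before_lookup=False)

# Case-insensitive uniqueness: the same creation is refused.
ci = make_registry()
try:
    create_case_insensitive(ci, "abs")
    ci_rejected = False
except ValueError:
    ci_rejected = True
```

In this model, with uppercasing the lookup of "abs" still finds the
built-in, without uppercasing it finds the user function, and with
case-insensitive uniqueness the user function cannot be created in
the first place.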

This will also help with SQL/PSM name lookup, not just built-ins, once
SQL/PSM is in. Since SQL/PSM function names are also case-insensitive,
when I invoke a UDF from SQL/PSM I want to avoid ambiguity as well - and
I can't prevent it by "reserving" all UDFs used for SQL/PSM.

E.g. imagine I have box.schema.func.create("foo", ...) called from
Lua. Later I do CREATE FUNCTION foo ... using SQL. Without _ci, the
name will be uppercased to FOO and a second function will be created
alongside "foo". I would like to avoid this.
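
A small Python sketch of that scenario, again only a model and not
Tarantool code. It assumes unquoted SQL identifiers are uppercased
before use, and it uses str.casefold() as a rough stand-in for a
case-insensitive collation. The helper names are hypothetical.

```python
def sql_normalize(identifier):
    # Assumption: SQL uppercases an unquoted identifier before use.
    return identifier.upper()


def create(names, name, case_insensitive):
    # Returns True if the name was added, False if uniqueness refused it.
    key = (lambda s: s.casefold()) if case_insensitive else (lambda s: s)
    if any(key(existing) == key(name) for existing in names):
        return False
    names.append(name)
    return True


def scenario(case_insensitive):
    names = []
    # Step 1: the function is created from Lua under the name "foo".
    created_from_lua = create(names, "foo", case_insensitive)
    # Step 2: CREATE FUNCTION foo from SQL, which uppercases the name.
    created_from_sql = create(names, sql_normalize("foo"), case_insensitive)
    return created_from_lua, created_from_sql, names


binary_result = scenario(case_insensitive=False)
ci_result = scenario(case_insensitive=True)
```

With case-sensitive uniqueness both "foo" and "FOO" end up in the
list; with case-insensitive uniqueness the second creation is
refused, which is the behaviour I am after.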

-- 
Konstantin Osipov, Moscow, Russia

