[tarantool-patches] Re: [PATCH 2/3] Add surrogate ID for BINARY collation

n.pettik korablev at tarantool.org
Wed Oct 31 18:47:45 MSK 2018



> On 31 Oct 2018, at 15:34, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> wrote:
> 
> Hi! Thanks for the patch! See 3 comments below.
> 
>> diff --git a/src/box/key_def.h b/src/box/key_def.h
>> index 20e79f9fe..ecdc199d9 100644
>> --- a/src/box/key_def.h
>> +++ b/src/box/key_def.h
>> @@ -78,6 +78,23 @@ extern const struct key_part_def key_part_def_default;
>>   */
>>  #define COLL_NONE UINT32_MAX
>>  +/**
>> + * In SQL explicitly specified binary collation and absence of
>> + * any collation are different in behaviour: according to ANSI
>> + * it is prohibited to compare strings with different explicitly
>> + * indicated collations. However, if one of collation is default,
>> + * (i.e. absent) the second one will be forced.
>> + * So, lets introduce another id to indicate explicitly specified
>> + * binary collation.
> 
> 1. Sorry, I am not sure that we can use imperative while
> describing not functions.

OK, I’ll re-phrase it (still strange rule).

> 
> 2. I see, that you actually created a 'phantom' collation.
> A collation, that has no a record in the collation cache,
> but is visible to a user via space format. I think for
> externally visible changes you should consult Kostja.
> Alternatively, it is possible to create binary collation
> in the same way as unicode and unicode_ci - via insertion
> into _collation in upgrade script.
> 
> Also, I see a bug that we can create a collation in
> _collation with id = COLL_NONE and COLL_BINARY, but which
> actually are not NONE nor BINARY. Storing such identifiers
> in _collation should be prohibited (if we will leave current
> 'phantom' binary collation as is). Furthermore COLL_NONE
> for unknown reason is declared in key_def.h instead of
> coll_id.h. It should be moved out. It is worth to create a
> separate commit with refactoring right before this one.

I asked Vladimir before implementing this patch, and we
decided to avoid adding real collation struct to cache,
since in this case we would get *tiny but still* overhead
in the form of calling additional collations functions. However,
I agree that creating collations with ids COLL_NONE and
COLL_BINARY should be banned. I am going to re-ask
Vladimir and Konstantin and in case they don’t mind,
I will expose patch-set with additional commit containing
these checks.

> 
>> + */
>> +#define COLL_BINARY (UINT32_MAX - 1)
>> +
>> +static inline bool
>> +coll_is_missing(uint32_t coll_id)
>> +{
>> +	return coll_id == COLL_NONE || coll_id == COLL_BINARY;
>> +}
> 
> Hence, this should be moved to coll_id.h as well and
> renamed to coll_id_is_missing.

Ok, I am going to put this refactoring in a separate commit as well
and resend patch-set as v2.

> 
>> diff --git a/test/sql/collation.test.lua b/test/sql/collation.test.lua
>> index 935dea824..f9d653717 100644
>> --- a/test/sql/collation.test.lua
>> +++ b/test/sql/collation.test.lua
>> @@ -42,4 +42,12 @@ cn = remote.connect(box.cfg.listen)
>>  cn:execute('select 1 limit ? collate not_exist', {1})
>>    cn:close()
>> +
>> +-- Explicitly set BINARY collation has ID.
> 
> 3. Please, add more tests, especially for box when you set id to
> 4294967294.
> 
>> +--
>> +box.sql.execute("CREATE TABLE t (id INT PRIMARY KEY, a TEXT, b TEXT COLLATE BINARY);")
>> +box.space.T:format()[2]['collation']
>> +box.space.T:format()[3]['collation']
>> +box.sql.execute("DROP TABLE t;")
>> +
>>  box.schema.user.revoke('guest', 'read,write,execute', 'universe')





More information about the Tarantool-patches mailing list