[tarantool-patches] [PATCH] sql: set explicit default collation's strength

i.koptelov ivan.koptelov at tarantool.org
Wed Mar 27 17:08:47 MSK 2019


Thank you for the review.

> On 27 Mar 2019, at 14:59, Vladislav Shpilevoy <v.shpilevoy at tarantool.org> wrote:
> 
> Hi! Thanks for the patch! See 5 comments below.
> 
> On 26/03/2019 23:34, Ivan Koptelov wrote:
>> Before the patch, collations with no strength set used
>> tertiary strength. But this was not obvious, because
>> box.space._collation:select{} would return
>> ... [1, 'unicode', 1, 'ICU', '', {}] ... for such collations.
>> After the patch the default value is set explicitly, so the
>> user would observe: ... [1, 'unicode', 1, 'ICU', '',
>> {strength='tertiary'}] ...
>> 
>> Note that box/stat.test.lua is temporarily disabled with this
>> patch. This is because the patch is meant for the 2.1.2
>> release. The current tarantool version is 2.1.2, so the upgrade
>> is run (using upgrade.lua), and this breaks box/stat (it
>> does not expect changes in upgrade). After the release is
>> made, box/stat will work again, because no further changes will
>> be made in upgrade.lua. To sum up, after we set the tarantool
>> version to >= 2.1.2, box/stat should be enabled again.
> 
> 1. Why so complex? Just update box stat test output. We have
> upgrade_to_2_1_2 on some other branches in review, but they do not
> have these problems.
> 
> What is more, I re-enabled that test, and it passes. So what
> is the problem?
It seems this test works fine if bootstrap.snap is updated, so I
have enabled the test and fixed the commit message.
> 
>> 
>> Closes #3573
>> 
>> @TarantoolBot document
>> Title: default collation strength is explicit tertiary now
>> Before the patch, tertiary strength was already the default
>> strength for collations, but it was explicit:
> 
> 2. 'It was explicit', 'it's just become explicit'. My guess is
> the first one should be 'implicit'.
You are right. Sorry, fixed now.
> 
>> [1, 'unicode', 1, 'ICU', '', {}]
>> After the patch it's just become explicit:
>> [1, 'unicode', 1, 'ICU', '', {'strength' = 'tertiary'}]
>> 
>> Also please fix this: https://tarantool.io/en/doc/2.1/book/box/data_model/#collations
>> There is a line saying: "unicode collation observes all weights,
>> from L1 to Ln (identical)". This was not true, and now that fact
>> becomes obvious.
>> ---
>> Branch https://github.com/tarantool/tarantool/tree/sudobobo/gh-3573-add-explicit-default-coll-strength
>> Issue https://github.com/tarantool/tarantool/issues/3573
>> 
>> src/box/bootstrap.snap             | Bin 1834 -> 1840 bytes
>> src/box/lua/schema.lua             |   4 ++++
>> src/box/lua/upgrade.lua            |  26 ++++++++++++++++++++++++--
>> src/lua/utf8.c                     |   1 +
>> test/app-tap/tarantoolctl.test.lua |  10 +++++-----
>> test/box-py/bootstrap.result       |   2 +-
>> test/box/ddl.result                |   6 +++---
>> test/box/suite.ini                 |   2 +-
>> test/sql/collation.result          |  14 ++++++++++++++
>> test/sql/collation.test.lua        |   8 ++++++++
>> test/unit/coll.cpp                 |   2 ++
>> 11 files changed, 63 insertions(+), 12 deletions(-)
>> 
>> diff --git a/src/box/lua/upgrade.lua b/src/box/lua/upgrade.lua
>> index dc7328714..440061558 100644
>> --- a/src/box/lua/upgrade.lua
>> +++ b/src/box/lua/upgrade.lua
>> @@ -400,7 +400,7 @@ local function create_collation_space()
>>     box.space._index:insert{_collation.id, 1, 'name', 'tree', {unique = true}, {{1, 'string'}}}
>> 
>>     log.info("create predefined collations")
>> -    box.space._collation:replace{1, "unicode", ADMIN, "ICU", "", setmap{}}
>> +    box.space._collation:replace{1, "unicode", ADMIN, "ICU", "", {strength='tertiary'}}
> 
> 3. Please do not touch old upgrade scripts. They are 'read-only'.
Moreover, this change does not really do anything. Removed.
> 
>>     box.space._collation:replace{2, "unicode_ci", ADMIN, "ICU", "", {strength='primary'}}
>> 
>>     local _priv = box.space[box.schema.PRIV_ID]
>> @@ -632,6 +632,27 @@ local function upgrade_to_2_1_1()
>>     end
>> end
>> 
>> +--------------------------------------------------------------------------------
>> +-- Tarantool 2.1.2
>> +--------------------------------------------------------------------------------
>> +
>> +local function update_collation_strength_field()
>> +    local _collation = box.space[box.schema.COLLATION_ID]
>> +    for _, collation in ipairs(_collation:select()) do
>> +        if collation.opts.strength == nil and collation.name ~= 'none' and
>> +            collation.name ~= 'binary' then
>> +            local new_collation = _collation:get{collation.id}:totable()
>> +            new_collation[6].strength = 'tertiary'
>> +            _collation:delete{collation.id}
>> +            _collation:insert(new_collation)
> 
> 4. 'replace'?
Replaces are prohibited for the _collation space.
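
For illustration, the delete-then-insert pattern can be modelled in
plain Lua, outside Tarantool. This is only a sketch: the `space` table
and its `delete`/`insert` helpers are hypothetical stand-ins for the
_collation space, not the box API, and the sample rows are made up.

```lua
-- Hypothetical in-memory stand-in for the _collation space, keyed by id.
-- Tuple layout follows the patch: {id, name, owner, type, locale, opts}.
-- The rows below are illustrative sample data only.
local space = {
    rows = {
        [0] = {0, 'none', 1, 'BINARY', '', {}},
        [1] = {1, 'unicode', 1, 'ICU', '', {}},
        [2] = {2, 'unicode_ci', 1, 'ICU', '', {strength = 'primary'}},
        [3] = {3, 'binary', 1, 'BINARY', '', {}},
    },
}

-- No 'replace' helper on purpose: only delete and insert are available.
function space:delete(id)
    self.rows[id] = nil
end

function space:insert(tuple)
    assert(self.rows[tuple[1]] == nil, 'duplicate id')
    self.rows[tuple[1]] = tuple
end

local function update_collation_strength_field(s)
    -- Collect matching ids first, so the table is not modified
    -- while it is being traversed.
    local ids = {}
    for id, row in pairs(s.rows) do
        local name, opts = row[2], row[6]
        if opts.strength == nil and name ~= 'none' and
           name ~= 'binary' then
            table.insert(ids, id)
        end
    end
    for _, id in ipairs(ids) do
        local old = s.rows[id]
        -- Copy the options and set the explicit default strength.
        local opts = {}
        for k, v in pairs(old[6]) do
            opts[k] = v
        end
        opts.strength = 'tertiary'
        local new = {old[1], old[2], old[3], old[4], old[5], opts}
        s:delete(id)
        s:insert(new)
    end
end

update_collation_strength_field(space)
print(space.rows[1][6].strength) -- prints "tertiary"
```

In this model only 'unicode' is rewritten: 'unicode_ci' already has a
strength, and 'none' and 'binary' are skipped by name.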
>> +        end
>> +    end
>> +end
>> +
>> +local function upgrade_to_2_1_2()
>> +    update_collation_strength_field()
>> +end
>> +
>> local function get_version()
>>     local version = box.space._schema:get{'version'}
>>     if version == nil then
>> diff --git a/test/sql/collation.result b/test/sql/collation.result
>> index 3794990dc..9994baca9 100644
>> --- a/test/sql/collation.result
>> +++ b/test/sql/collation.result
>> @@ -785,3 +785,17 @@ box.sql.execute("SELECT DISTINCT substr(s2, 1, 1) FROM jj;")
>> box.space.JJ:drop()
>> ---
>> ...
>> +-- gh-3573: Strength in the _collation space
>> +-- Collation without 'strength' option set now has explicit
>> +-- 'strength' = 'tertiary'.
>> +--
>> +box.internal.collation.create('c', 'ICU', 'unicode')
>> +---
>> +...
>> +id =  box.internal.collation.id_by_name('c')
>> +---
>> +...
>> +box.space._collation:select(id)
> 
> 5. id_by_name + select can be replaced with one
> box.space._collation.index.name:get({'c'}).
Ok, done.
> 
>> +---
>> +- - [4, 'c', 1, 'ICU', 'unicode', {'strength': 'tertiary'}]
>> +...




