[tarantool-patches] Re: [PATCH] sql: set explicit default collation's strength

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Wed Mar 27 14:59:47 MSK 2019


Hi! Thanks for the patch! See 5 comments below.

On 26/03/2019 23:34, Ivan Koptelov wrote:
> Before the patch, collations with no strength set used
> tertiary strength. But it was not easy to understand it,
> because box.space._collation:select{} would return
> ... [1, 'unicode', 1, 'ICU', '', {}] ... for such collations.
> After the patch default value is set explicitly, so
> user would observe : ... [1, 'unicode', 1, 'ICU', '',
> {strength='tertiary'}] ...
> 
> Note that box/stat.test.lua is temporary disabled with this
> patch. It is done so because the patch is meant for the 2.1.2
> release. Current tarantool version is 2.1.2, so upgrade is done
> (using upgrade.lua) and because of it box/stat is broken (it
> does not expect changes in upgrade) But after the release would
> be made, box/stat would work again, because no changing would be
> done in upgrade.lua. To resume, after we set tarantool
> version to => 2.1.2 box/stat should be enabled again.

1. Why so complex? Just update the box/stat test output. We have
upgrade_to_2_1_2 on some other branches in review, but they do not
have these problems.

What is more, I re-enabled that test, and it passes. So what is
the problem?

> 
> Closes #3573
> 
> @TarantoolBot document
> Title: default collation strength is explicit tertiary now
> Before the patch we already have tertiary strength is default
> strength for collations, but it was explicit:

2. 'It was explicit', 'it's just become explicit'. I guess
the first one was meant to be 'implicit'.

> [1, 'unicode', 1, 'ICU', '', {}]
> After the patch it's just become explicit:
> 1, 'unicode', 1, 'ICU', '', {'strength' = 'tertiary'}]
> 
> Also please fix this https://tarantool.io/en/doc/2.1/book/box/data_model/#collations
> There is line saying: "unicode collation observes all weights,
> from L1 to Ln (identical)" It was not true and now this fact
> would just become obvious.
> ---
> Branch https://github.com/tarantool/tarantool/tree/sudobobo/gh-3573-add-explicit-default-coll-strength
> Issue https://github.com/tarantool/tarantool/issues/3573
> 
>  src/box/bootstrap.snap             | Bin 1834 -> 1840 bytes
>  src/box/lua/schema.lua             |   4 ++++
>  src/box/lua/upgrade.lua            |  26 ++++++++++++++++++++++++--
>  src/lua/utf8.c                     |   1 +
>  test/app-tap/tarantoolctl.test.lua |  10 +++++-----
>  test/box-py/bootstrap.result       |   2 +-
>  test/box/ddl.result                |   6 +++---
>  test/box/suite.ini                 |   2 +-
>  test/sql/collation.result          |  14 ++++++++++++++
>  test/sql/collation.test.lua        |   8 ++++++++
>  test/unit/coll.cpp                 |   2 ++
>  11 files changed, 63 insertions(+), 12 deletions(-)
> 
> diff --git a/src/box/lua/upgrade.lua b/src/box/lua/upgrade.lua
> index dc7328714..440061558 100644
> --- a/src/box/lua/upgrade.lua
> +++ b/src/box/lua/upgrade.lua
> @@ -400,7 +400,7 @@ local function create_collation_space()
>      box.space._index:insert{_collation.id, 1, 'name', 'tree', {unique = true}, {{1, 'string'}}}
>  
>      log.info("create predefined collations")
> -    box.space._collation:replace{1, "unicode", ADMIN, "ICU", "", setmap{}}
> +    box.space._collation:replace{1, "unicode", ADMIN, "ICU", "", {strength='tertiary'}}

3. Please, do not touch old upgrade scripts. They are 'read-only'.

>      box.space._collation:replace{2, "unicode_ci", ADMIN, "ICU", "", {strength='primary'}}
>  
>      local _priv = box.space[box.schema.PRIV_ID]
> @@ -632,6 +632,27 @@ local function upgrade_to_2_1_1()
>      end
>  end
>  
> +--------------------------------------------------------------------------------
> +-- Tarantool 2.1.2
> +--------------------------------------------------------------------------------
> +
> +local function update_collation_strength_field()
> +    local _collation = box.space[box.schema.COLLATION_ID]
> +    for _, collation in ipairs(_collation:select()) do
> +        if collation.opts.strength == nil and collation.name ~= 'none' and
> +            collation.name ~= 'binary' then
> +            local new_collation = _collation:get{collation.id}:totable()
> +            new_collation[6].strength = 'tertiary'
> +            _collation:delete{collation.id}
> +            _collation:insert(new_collation)

4. Why not a single 'replace' instead of delete + insert?
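
Something along these lines, as an untested sketch. It assumes that
_collation accepts a replace of an already existing tuple, which I have
not verified. The 'space' parameter is hypothetical and is only there so
that the loop can be tried against a stub; in upgrade.lua it would be
box.space[box.schema.COLLATION_ID].

```lua
-- Sketch: the same loop as in the patch, but with one replace()
-- instead of delete() + insert().
-- 'space' must behave like box.space._collation: select() returns
-- tuples with named fields 'name' and 'opts' and a totable() method.
local function update_collation_strength_field(space)
    for _, collation in ipairs(space:select()) do
        if collation.opts.strength == nil and collation.name ~= 'none' and
           collation.name ~= 'binary' then
            -- Field 6 is the options map; set the default explicitly.
            local new_collation = collation:totable()
            new_collation[6].strength = 'tertiary'
            space:replace(new_collation)
        end
    end
end
```

If replace is rejected for existing collations, then delete + insert
has to stay, but it deserves a comment explaining why.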

> +        end
> +    end
> +end
> +
> +local function upgrade_to_2_1_2()
> +    update_collation_strength_field()
> +end
> +
>  local function get_version()
>      local version = box.space._schema:get{'version'}
>      if version == nil then
> diff --git a/test/sql/collation.result b/test/sql/collation.result
> index 3794990dc..9994baca9 100644
> --- a/test/sql/collation.result
> +++ b/test/sql/collation.result
> @@ -785,3 +785,17 @@ box.sql.execute("SELECT DISTINCT substr(s2, 1, 1) FROM jj;")
>  box.space.JJ:drop()
>  ---
>  ...
> +-- gh-3573: Strength in the _collation space
> +-- Collation without 'strength' option set now has explicit
> +-- 'strength' = 'tertiary'.
> +--
> +box.internal.collation.create('c', 'ICU', 'unicode')
> +---
> +...
> +id =  box.internal.collation.id_by_name('c')
> +---
> +...
> +box.space._collation:select(id)

5. id_by_name + select can be replaced with a single call:
box.space._collation.index.name:get({'c'}).

> +---
> +- - [4, 'c', 1, 'ICU', 'unicode', {'strength': 'tertiary'}]
> +...



