I accidentally sent this mail from the wrong address, so
you might have missed it. My apologies; I am resending it from the right one.

On 1 Nov 2018, at 19:31, Никита Петтик <kitnerh@gmail.com> wrote:

On 1 Nov 2018, at 18:39, Konstantin Osipov <kostja@tarantool.org> wrote:

* n.pettik <korablev@tarantool.org> [18/11/01 16:11]:
I guess, because

1) It is not a real collation and is not present in
_collation. So it would be strange for a user to see
a gap between 2 and 4 in _collation that cannot be
filled.

Let's insert it there.

So, you insist on id == 3, right? Again, if a user selects
from the _collation space, they won’t see an entry with id == 3.
On the other hand, if a user attempts to insert id == 3,
they will get an error.

No, I don't insist yet. Why not insert a special row in there?

Because insertion into _collation would result in the creation
of collation objects. Meanwhile, in fact we need only an ID
to distinguish BINARY from no-collation. The rest is the
same for them. So it makes sense to store only the ID within
the space format. That is my point.

is consistent to have its ID near COLL_NONE, in a "special
range" of collation identifiers.

Uhm, AFAIU we have two binary collations. One is "collation is not
set" and the other is "collation binary". Which one did you mean
now?

The first one is not a collation at all. It is rather the “absence” of any collation.
The second one is a sort of “surrogate” and, in terms of functionality,
means the same. However, its id will be stored in the space format in
order to indicate that BINARY collation should be forced during
comparisons.
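
To make the distinction concrete, here is a minimal Python sketch (not
Tarantool code). The names compare_strings and is_binary_forced and both
numeric id values are assumptions made up for illustration; only the name
COLL_NONE comes from this thread. It shows that both ids compare raw bytes
in the same way, and only the stored id tells them apart.

```python
# Hypothetical reserved identifiers; the real values are not fixed in this thread.
COLL_NONE = 2**32 - 1    # "absence" of any collation
COLL_BINARY = 2**32 - 2  # surrogate id: BINARY was explicitly requested


def compare_strings(a: str, b: str, coll_id: int) -> int:
    """Return -1, 0 or 1. Both reserved ids compare raw UTF-8 bytes."""
    if coll_id not in (COLL_NONE, COLL_BINARY):
        raise NotImplementedError("real collations (e.g. ICU) are out of scope")
    left, right = a.encode("utf-8"), b.encode("utf-8")
    return (left > right) - (left < right)


def is_binary_forced(coll_id: int) -> bool:
    """Only the surrogate id marks BINARY as explicitly forced."""
    return coll_id == COLL_BINARY
```

No collation object is needed for either id here; the id stored in the
format is the only thing that differs.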

I think we could use internal ids to reference both cases. For
both of these ids we could have surrogate rows in _coll system space;
they won't harm. This will make things easier in the future.

Ok, what do you suggest calling the “absence” of collation? Like this:

box.space._collation:select()

---
- - [1, 'unicode', 1, 'ICU', '', {}]
  - [2, 'unicode_ci', 1, 'ICU', '', {'strength': 'primary'}]
  - [3, 'none', 1, 'ICU', '', {}]
...

It is nonsense, IMHO. No collation means “no collation at all”:
nothing should represent it, least of all something visible to the user. With BINARY
collation it would look even more suspicious:

- - [1, 'unicode', 1, 'ICU', '', {}]
  - [2, 'unicode_ci', 1, 'ICU', '', {'strength': 'primary'}]
  - [3, 'none', 1, 'ICU', '', {}]
  - [4, 'binary', 1, 'ICU', '', {}]

It would confuse users who don’t use SQL: in Tarantool NoSQL
there is no difference between “binary” and “no-collation”.
Moreover, to keep things consistent, we would have to make
the default collation 'none' instead of the absence of a collation.
It means that a field def without an explicitly set collation would
have the 'none' collation in its format. For instance:

*before*

- [{'affinity': 66, 'type': 'string', 'nullable_action': 'abort', 'name': 'ID', 'is_nullable': false}]

*after*

- [{'collation': 3, 'affinity': 66, 'type': 'string', 'nullable_action': 'abort',
   'name': 'ID', 'is_nullable': false}]

This is going to be the same mess as with NO ACTION and DEFAULT,
which are mostly the same, but not quite, so we'd better prepare.

It is considered a mess because of the SQLite legacy. On the other hand, all
these manipulations with collations follow ANSI SQL.

All points considered, I would prefer to introduce only one more ID
(alongside the COLL_NONE ID) and prohibit creating collations with
these ids. OR, add a surrogate “binary collation” to _collation with id == 3,
but not both “binary” and “none”.
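
A rough sketch of the first option, in Python rather than Tarantool code:
two reserved ids exist only as constants, and an attempt to create a
collation with either of them is rejected. The CollationSpace class, the
RESERVED_COLL_IDS set and both numeric values are hypothetical and serve
only to illustrate the idea.

```python
# Hypothetical reserved identifiers; the actual values are an assumption.
COLL_NONE = 2**32 - 1    # "absence" of any collation
COLL_BINARY = 2**32 - 2  # surrogate id for explicitly forced BINARY
RESERVED_COLL_IDS = {COLL_NONE, COLL_BINARY}


class CollationSpace:
    """Toy stand-in for _collation: maps a collation id to its name."""

    def __init__(self):
        self.rows = {}

    def insert(self, coll_id: int, name: str) -> None:
        # Reserved ids never get a row, so no collation object is created.
        if coll_id in RESERVED_COLL_IDS:
            raise ValueError(f"collation id {coll_id} is reserved")
        if coll_id in self.rows:
            raise ValueError(f"duplicate collation id {coll_id}")
        self.rows[coll_id] = name
```

With this layout a select from the space shows only real collations, and
there is no visible gap in the ids a user can see.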