Tarantool development patches archive
 help / color / mirror / Atom feed
From: Alexander Turenko <alexander.turenko@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: kostja@tarantool.org, tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH 0/7] Expose ICU into Lua
Date: Sat, 28 Apr 2018 04:55:47 +0300	[thread overview]
Message-ID: <20180428015547.stjjxx5nm67do5lb@tkn_work_nb> (raw)
In-Reply-To: <cover.1524698920.git.v.shpilevoy@tarantool.org>

Hi Vlad!

Some thoughts / questions re API are below.

1. u_upper/u_lower have support of user-provided locales and follow
   system-default locale by default (that decision is debatable IMHO,
   see the email re 1st patch). But u_compare/u_icompare uses built-in
   collations unconditionally. Should all these functions have some
   unified approach to handle locales?

2. Should we expose these functions into the 'string' module? The module
   seems to be very basic for the language and maybe it worth to be
   conservative in its extending. Lua 5.3 have separate 'utf8' module,
   for example.

3. Should we stick to some existing API to be more friendly for existing
   users?

The examples I found:

- lua 5.3 utf8: https://www.lua.org/manual/5.3/manual.html#6.5 
- lua-utf8: https://github.com/starwing/luautf8
- icu-lua: http://files.luaforge.net/releases/icu-lua/icu-lua/0.1A

From the other side, they seems to don't have character properties
exposing like in our u_count and don't provide ability to set specific
locale. So trying to provide transparent replacement for some parts of
these APIs seems not being a good idea. Just note this possible point
here.

WBR, Alexander Turenko.

On Thu, Apr 26, 2018 at 02:29:00AM +0300, Vladislav Shpilevoy wrote:
> Branch: http://github.com/tarantool/tarantool/tree/gh-3290-lua-icu-ucasemap
> Issue: https://github.com/tarantool/tarantool/issues/3290
> Issue: https://github.com/tarantool/tarantool/issues/3081
> 
> Lua can not work with unicode - in Lua it is enterpreted as a binary. On such
> string built-in upper/lower functions, '#' (length) and comparison operators do
> not work. But Tarantool links with ICU and has comparators with collations, that
> can solve the problems.
> 
> But there is another issue - string methods must be available before box.cfg,
> so the ICU and collations must be built out of main 'box' static library. To do
> this the collation related files are moved from 'box' into 'core' library.
> 
> A second issue is that when box.cfg is called, it inserts built-in collations
> into _collation space, and these insertions can conflict with built-in
> collations, created before box.cfg. Delete from _collation can break the
> collations cache as well. The patchset solves this by checking collations
> deletions and insertions, and if they tries to operate on built-in collations,
> then they are ignored - a user sees changes in _collation, but the cache is
> unchanged.
> 
> Vladislav Shpilevoy (7):
>   lua: expose ICU upper/lower functions to Lua
>   lua: implement string.u_count
>   alter: fix assertion in collations alter
>   Move struct on_access_denied_ctx into error.h
>   Merge box_error, stat and collations into core library
>   Always store built-in collations in the cache
>   lua: expose u_compare/u_icompare into Lua
> 
> <...>

      parent reply	other threads:[~2018-04-28  1:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-25 23:29 [tarantool-patches] " Vladislav Shpilevoy
2018-04-25 23:29 ` [tarantool-patches] [PATCH 1/7] lua: expose ICU upper/lower functions to Lua Vladislav Shpilevoy
2018-04-28  0:56   ` [tarantool-patches] " Alexander Turenko
2018-04-25 23:29 ` [tarantool-patches] [PATCH 2/7] lua: implement string.u_count Vladislav Shpilevoy
2018-04-26 10:36   ` [tarantool-patches] " Vladislav Shpilevoy
2018-04-26 16:07   ` Vladislav Shpilevoy
2018-04-26 23:57   ` Vladislav Shpilevoy
2018-04-28  1:10   ` Alexander Turenko
2018-04-25 23:29 ` [tarantool-patches] [PATCH 3/7] alter: fix assertion in collations alter Vladislav Shpilevoy
2018-04-25 23:29 ` [tarantool-patches] [PATCH 4/7] Move struct on_access_denied_ctx into error.h Vladislav Shpilevoy
2018-04-25 23:29 ` [tarantool-patches] [PATCH 5/7] Merge box_error, stat and collations into core library Vladislav Shpilevoy
2018-04-25 23:29 ` [tarantool-patches] [PATCH 6/7] Always store built-in collations in the cache Vladislav Shpilevoy
2018-04-25 23:29 ` [tarantool-patches] [PATCH 7/7] lua: expose u_compare/u_icompare into Lua Vladislav Shpilevoy
2018-04-28  1:55 ` Alexander Turenko [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180428015547.stjjxx5nm67do5lb@tkn_work_nb \
    --to=alexander.turenko@tarantool.org \
    --cc=kostja@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --cc=v.shpilevoy@tarantool.org \
    --subject='[tarantool-patches] Re: [PATCH 0/7] Expose ICU into Lua' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox