From: Stanislav Zudin <szudin@tarantool.org> To: tarantool-patches@freelists.org, Konstantin Osipov <kostja.osipov@gmail.com> Subject: [tarantool-patches] Re: [PATCH v4] Feature request for a new collation Date: Wed, 20 Mar 2019 18:53:05 +0300 [thread overview] Message-ID: <eb02c5b9-f3a4-1ca2-c621-51bf4c751b59@tarantool.org> (raw) In-Reply-To: <20190306094802.GH17432@chai> The recent patch defines COLL_LOCALE_LEN_MAX equal to 30 and introduces test for the following collations: - Afrikaans - Amharic - Assamese - Azerbaijani - Belarusian - Kyrgyz - German (phonebook) - Hebrew - Japanese The source of test sequences is the following: https://www.unicode.org/cldr/charts/34/collation/index.html Issue: https://github.com/tarantool/tarantool/issues/4007 Branch: https://github.com/tarantool/tarantool/tree/stanztt/gh-4007-new-default-collation-2.1 On 06.03.2019 12:48, Konstantin Osipov wrote: > * Stanislav Zudin <szudin@tarantool.org> [19/03/06 11:26]: >> >> >> On 06.03.2019 11:11, Konstantin Osipov wrote: >>> * Stanislav Zudin <szudin@tarantool.org> [19/03/06 10:24]: >>>> >>>> >>>> On 05.03.2019 18:41, Konstantin Osipov wrote: >>>>> * Stanislav Zudin <szudin@tarantool.org> [19/03/05 14:45]: >>>>>> /** Maximal length of locale name. */ >>>>>> enum { >>>>>> - COLL_LOCALE_LEN_MAX = 16, >>>>>> + COLL_LOCALE_LEN_MAX = 1024, >>>>>> }; >>>>> >>>>> Why? >>>>> >>>> >>>> Because locale may include numerous options such as collation, numbers, >>>> calendar etc. e.g.: "fr@collation=phonebook;calendar=islamic-civil". >>>> The max length is not specified. >>> >>> *Nobody* except us can add a new collation. The names of the >>> collations you added don't exceed 16 bytes. Please keep the old >>> value. >>> >> strlen("de_DE_u_co_phonebk") == 18. > > + "unicode_", then, increase to a relevant value please. > Done. --- src/coll_def.h | 2 +- test/sql-tap/collation-2.test.lua | 143 ++++++++++++++++++++++++++++++ 2 files changed, 144 insertions(+), 1 deletion(-) create mode 100755 test/sql-tap/collation-2.test.lua diff --git a/src/coll_def.h b/src/coll_def.h index 18ed606ad..fd91f942a 100644 --- a/src/coll_def.h +++ b/src/coll_def.h @@ -44,7 +44,7 @@ extern const char *coll_type_strs[]; /** Maximal length of locale name. */ enum { - COLL_LOCALE_LEN_MAX = 1024, + COLL_LOCALE_LEN_MAX = 30, }; /* diff --git a/test/sql-tap/collation-2.test.lua b/test/sql-tap/collation-2.test.lua new file mode 100755 index 000000000..58b6a1ef1 --- /dev/null +++ b/test/sql-tap/collation-2.test.lua @@ -0,0 +1,143 @@ +#!/usr/bin/env tarantool +test = require("sqltester") +test:plan(10 * 4) + +local prefix = "unicode-collation-" + +local function insert_into_table(tbl_name, data) + local sql = string.format("INSERT INTO %s VALUES ", tbl_name) + --local values = {} + for _, item in ipairs(data) do + local value = "('"..item.."')" + local e = sql .. value + box.sql.execute(e) + end +end + + +local collation_entries = +{ + { -- Afrikaans case sensitive + "unicode_af_s3", + {"a","A","á","Á","â","Â","b","B","c","C","d","D","e","E","é","É", + "è","È","ê","Ê","ë","Ë","f","F","g","G","h","H","i","I","î","Î", + "ï","Ï","j","J","k","K","l","L","m","M","n","N","ʼn","o","O", + "ô","Ô","ö","Ö","p","P","q","Q","r","R","s","S","t","T","u","U", + "û","Û","v","V","w","W","x","X","y","Y","z","Z"}}, + { + -- Amharic + "unicode_am_s3", + {"ሀ","ሁ","ሂ","ሃ","ሄ","ህ","ሆ","ለ","ሉ","ሊ","ላ","ሌ","ል","ሎ","ሏ","ሐ", + "ሑ","ሒ","ሓ","ሔ","ሕ","ሖ","ሗ","መ","ሙ","ሚ","ማ","ሜ","ም","ሞ", + "ሟ","ሠ","ሡ","ሢ","ሣ","ሤ","ሥ","ሦ","ሧ","ረ","ሩ","ሪ","ራ","ሬ","ር", + "ሮ","ሯ","ሰ","ሱ","ሲ","ሳ","ሴ","ስ","ሶ","ሷ","ሸ","ሹ","ሺ","ሻ","ሼ","ሽ", + "ሾ","ሿ","ቀ","ቁ","ቂ","ቃ","ቄ","ቅ","ቆ","ቈ","ቊ","ቋ","ቌ","ቍ","በ", + "ቡ","ቢ","ባ","ቤ","ብ","ቦ","ቧ","ቨ","ቩ","ቪ","ቫ","ቬ","ቭ","ቮ","ቯ", + "ተ","ቱ","ቲ","ታ","ቴ","ት","ቶ","ቷ","ቸ","ቹ","ቺ","ቻ","ቼ","ች","ቾ", + "ቿ","ኀ","ኁ","ኂ","ኃ","ኄ","ኅ","ኆ","ኈ","ኊ","ኋ","ኌ","ኍ","ነ","ኑ", + "ኒ","ና","ኔ","ን","ኖ","ኗ","ኘ","ኙ","ኚ","ኛ","ኜ","ኝ","ኞ","ኟ","አ","ኡ", + "ኢ","ኣ","ኤ","እ","ኦ","ኧ","ከ","ኩ","ኪ","ካ","ኬ","ክ","ኮ","ኰ","ኲ", + "ኳ","ኴ","ኵ","ኸ","ኹ","ኺ","ኻ","ኼ","ኽ","ኾ","ወ","ዉ","ዊ","ዋ","ዌ", + "ው","ዎ","ዐ","ዑ","ዒ","ዓ","ዔ","ዕ","ዖ","ዘ","ዙ","ዚ","ዛ","ዜ","ዝ", + "ዞ","ዟ","ዠ","ዡ","ዢ","ዣ","ዤ","ዥ","ዦ","ዧ","የ","ዩ","ዪ","ያ", + "ዬ","ይ","ዮ","ደ","ዱ","ዲ","ዳ","ዴ","ድ","ዶ","ዷ","ጀ","ጁ","ጂ","ጃ", + "ጄ","ጅ","ጆ","ጇ","ገ","ጉ","ጊ","ጋ","ጌ","ግ","ጎ","ጐ","ጒ","ጓ","ጔ", + "ጕ","ጠ","ጡ","ጢ","ጣ","ጤ","ጥ","ጦ","ጧ","ጨ","ጩ","ጪ","ጫ","ጬ", + "ጭ","ጮ","ጯ","ጰ","ጱ","ጲ","ጳ","ጴ","ጵ","ጶ","ጷ","ጸ","ጹ","ጺ","ጻ", + "ጼ","ጽ","ጾ","ጿ","ፀ","ፁ","ፂ","ፃ","ፄ","ፅ","ፆ","ፈ","ፉ","ፊ","ፋ","ፌ", + "ፍ","ፎ","ፏ","ፐ","ፑ","ፒ","ፓ","ፔ","ፕ","ፖ","ፗ"}}, + { + -- Assamese + "unicode_as_s3", + {"়","অ","আ","ই","ঈ","উ","ঊ","ঋ","এ","ঐ","ও","ঔ","ং ","ঁ ","ঃ ", + "ক","খ","গ","ঘ","ঙ","চ","ছ","জ","ঝ","ঞ","ট","ঠ","ড","ড়","ঢ","ঢ়", + "ণ","ৎ ","ত","থ","দ","ধ","ন","প","ফ","ব","ভ","ম","য","য়","ৰ", + "ল","ৱ","শ","ষ","স","হ","ক্ষ ","া","ি","ী","ু","ূ","ৃ","ে","ৈ", + "ো","ৌ","্"}}, + + { + -- Azerbaijani + "unicode_az_s3", + {"a ","A ","b ","B ","c ","C ","ç ","Ç ","ḉ ","Ḉ ","d ","D ","e ", + "E ","ə ","Ə ","f ","F ","g ","G ","ğ ","Ğ ","ģ̆ ","Ģ̆ ","h ", + "H ","x ","X ","ẍ ","Ẍ ","ẋ ","Ẋ ","ı ","I ","Í ","Ì ","Ĭ ", + "Î ","Ǐ ","Ï ","Ḯ ","Ĩ ","Į ","Ī ","Ỉ ","Ȉ ","Ȋ ","Ị ","Ḭ ", + "i ","İ ","Į̇ ","Ị̇ ","Ḭ̇ ","j ","J ","k ","K ","q ","Q ","l ", + "L ","m ","M ","n ","N ","o ","O ","ö ","Ö ","ǫ̈ ","Ǫ̈ ","ȫ ", + "Ȫ ","ơ̈ ","Ơ̈ ","ợ̈ ","Ợ̈ ","ọ̈ ","Ọ̈ ","p ","P ","r ","R ","s ", + "S ","ş ","Ş ","t ","T ","u ","U ","ü ","Ü ","ǘ ","Ǘ ","ǜ ", + "Ǜ ","ǚ ","Ǚ ","ų̈ ","Ų̈ ","ǖ ","Ǖ ","ư̈ ","Ư̈ ","ự̈ ","Ự̈ ","ụ̈ ", + "Ụ̈ ","ṳ̈ ","Ṳ̈ ","ṷ̈ ","Ṷ̈ ","ṵ̈ ","Ṵ̈ ","v ","V ","y ","Y ","z ", + "Z ","Ẉ","w ","W ","ẃ ","Ẃ ","ẁ ","Ẁ ","ŵ ","Ŵ ","ẘ ","ẅ ","Ẅ ", + "ẇ ","Ẇ ","ẉ "}}, + { + -- Belarusian + "unicode_be_s3", + {"а","А","б","Б","в","ᲀ","В","г","Г","д","ᲁ","Д","дж","дз","е", + "Е","ё","Ё","ж","Ж","з","З","і","І","й","Й","к","К","л", + "Л","м","М","н","Н","о","ᲂ","О","п","П","р","Р","с","ᲃ", + "С","т","Т","у","У","ў","Ў","ф","Ф","х","Х","ц", + "Ц","ч","Ч","ш","Ш","ы","Ы","ь","Ь","э","Э","ю","Ю","я","Я"}}, + { + -- Kyrgyz + "unicode_ky_s3", + {"а","А","б","Б","г","Г","д","ᲁ","Д","е","Е","ё","Ё","ж","Ж", + "з","З","и","И","й","Й","к","К","л","Л","м","М","н","Н","ң","Ң", + "о","ᲂ","О","ө","Ө","п","П","р","Р","с","ᲃ","С","т","ᲄ", + "Т","у","У","ү","Ү","х","Х","ч","Ч","ш","Ш","ъ","ᲆ","Ъ","ы","Ы", + "э","Э","ю","Ю","я","Я"}}, + { + -- Kyrgyz (russian codepage) + "unicode_ky_s3", + {"а","А","б","Б","в","В","г","Г","д","Д","е","Е","ё","Ё","ж","Ж", + "з","З","и","И","й","Й","к","К","л","Л","м","М","н","Н", + "о","О","п","П","р","Р","с","С","т","Т","у","У","ф","Ф", + "х","Х","ц","Ц","ч","Ч","ш","Ш","щ","Щ","ъ","Ъ","ы","Ы", + "ь","Ь","э","Э","ю","Ю","я","Я"}}, + { + -- German (umlaut as 'ae', 'oe', 'ue') + "unicode_de__phonebook_s3", + {"a","A","ä","ǟ","Ǟ","ą̈","Ą̈","ạ̈","Ạ̈","ḁ̈","Ḁ̈","Ä ","b","B","c","C", + "d","D","e","E","f","F","g","G","h","H","i","I","j","J", + "k","K","l","L","m","M","n","N","o","O","ȫ","Ȫ","ǫ̈","Ǫ̈", + "ơ̈","Ơ̈","ợ̈","Ợ̈","ọ̈","Ọ̈","ö ","Ö ","p","P","q","Q","r","R", + "s","S","ss","ß","t","T","u","U","ǘ","Ǘ","ǜ","Ǜ","ǚ","Ǚ", + "ǖ","Ǖ","ų̈","Ų̈","ư̈","Ư̈","ự̈","Ự̈","ụ̈","Ụ̈","ṳ̈","Ṳ̈","ṷ̈","Ṷ̈", + "ṵ̈","Ṵ̈","ü ","Ü ","v","V","w","W","x","X","y","Y","z","Z"}}, + { + -- Hebrew + "unicode_he_s3", + {"׳","״","א","ב","ג","ד","ה","ו","ז","ח","ט","י","כ", + "ך","ל","מ","ם","נ","ן","ס","ע","פ","ף","צ","ץ", + "ק","ר","ש","ת"} }, + { + -- Japanese + "unicode_ja_s3", + {"幸","広","庚","康","弘","恒","慌","抗","拘","控","攻","港", + "溝","甲","皇","硬","稿"}} +} + +for _, test_entry in ipairs(collation_entries) do + -- create title + local extendex_prefix = string.format("%s1.%s.", prefix, test_entry[1]) + + test:do_execsql_test( + extendex_prefix.."create_table", + string.format("create table t1(a char(5) collate \"%s\" primary key);", test_entry[1]), + {}) + test:do_test( + extendex_prefix.."insert_values", + function() + return insert_into_table("t1", test_entry[2]) + end, {}) + test:do_execsql_test( + extendex_prefix.."select", + string.format("select a from t1 order by a"), + test_entry[2]) + test:do_execsql_test( + extendex_prefix.."drop_table", + "drop table t1", + {}) +end + +test:finish_test() -- 2.17.1
prev parent reply other threads:[~2019-03-20 15:53 UTC|newest] Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-03-05 11:44 [tarantool-patches] " Stanislav Zudin 2019-03-05 14:40 ` Vladimir Davydov 2019-03-06 6:32 ` Stanislav Zudin 2019-03-06 8:10 ` Konstantin Osipov 2019-03-05 15:41 ` [tarantool-patches] " Konstantin Osipov 2019-03-06 6:36 ` Stanislav Zudin 2019-03-06 8:11 ` Konstantin Osipov 2019-03-06 8:23 ` Stanislav Zudin 2019-03-06 9:48 ` Konstantin Osipov 2019-03-20 15:53 ` Stanislav Zudin [this message]
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=eb02c5b9-f3a4-1ca2-c621-51bf4c751b59@tarantool.org \ --to=szudin@tarantool.org \ --cc=kostja.osipov@gmail.com \ --cc=tarantool-patches@freelists.org \ --subject='[tarantool-patches] Re: [PATCH v4] Feature request for a new collation' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox