* [tarantool-patches] [PATCH v3 1/4] error: introduce error rebulding API
2018-05-15 19:54 [tarantool-patches] [PATCH v3 0/4] Lua utf8 module Vladislav Shpilevoy
@ 2018-05-15 19:54 ` Vladislav Shpilevoy
2018-05-16 17:06 ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 2/4] collation: split collation into core and box objects Vladislav Shpilevoy
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-15 19:54 UTC
To: tarantool-patches; +Cc: kostja
Some modules live outside the box library and return their own
custom or common errors, such as SocketError, IllegalParams, etc.
But when core code is used from the box library, some of these
errors must be converted into other errors. For example, it would
be useful to be able to transform IllegalParams into ClientError.

The patch introduces diag_reset, which takes a new error type and
custom arguments that depend on that type. An error type that
should be available for rebuilding must provide a
Rebuild<error_name> function that takes the old error object and
the custom arguments.
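
The dispatch described above can be illustrated with a small standalone sketch. It is not the Tarantool code: the one-slot last_error variable, the static rebuilt storage, ERRMSG_MAX, diag_reset_sketch and convert_example are all hypothetical stand-ins chosen so the example compiles on its own. It only shows how token pasting picks both the type object and the matching Rebuild<error_name> function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real error machinery. */
struct type_info { const char *name; };

enum { ERRMSG_MAX = 64 };

struct error {
	const struct type_info *type;
	int code;
	char errmsg[ERRMSG_MAX];
};

static const struct type_info type_IllegalParams = { "IllegalParams" };
static const struct type_info type_ClientError = { "ClientError" };

/* One-slot "diagnostics area": just the last error (assumption). */
static struct error *last_error;

/* Static storage for the rebuilt error; real code would allocate. */
static struct error rebuilt;

/* Rebuild<error_name>: takes the old error plus type-specific arguments. */
static struct error *
RebuildClientError(struct error *last_e, int errcode)
{
	size_t len = strlen(last_e->errmsg);
	assert(len < ERRMSG_MAX);
	rebuilt.type = &type_ClientError;
	rebuilt.code = errcode;
	/* Keep the original message; only the type and code change. */
	memcpy(rebuilt.errmsg, last_e->errmsg, len);
	rebuilt.errmsg[len] = '\0';
	return &rebuilt;
}

/*
 * Token pasting selects type_<new_class> and Rebuild<new_class>.
 * An error that already has the requested type is left untouched.
 */
#define diag_reset_sketch(new_class, ...) do {				\
	struct error *e_ = last_error;					\
	if (e_->type != &type_##new_class)				\
		last_error = Rebuild##new_class(e_, __VA_ARGS__);	\
} while (0)

/* Usage: turn an IllegalParams error into a ClientError with code 42. */
const struct error *
convert_example(void)
{
	static struct error e = { &type_IllegalParams, 0, "bad locale" };
	last_error = &e;
	diag_reset_sketch(ClientError, 42);
	return last_error;
}
```

The sketch passes at least one extra argument to the macro, so it does not need the GNU `##__VA_ARGS__` extension that the patch itself uses.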
---
src/box/error.cc | 27 +++++++++++++++++++++++++++
src/box/error.h | 5 +++++
src/diag.h | 9 +++++++++
3 files changed, 41 insertions(+)
diff --git a/src/box/error.cc b/src/box/error.cc
index 99f519537..bbe3b5236 100644
--- a/src/box/error.cc
+++ b/src/box/error.cc
@@ -108,6 +108,20 @@ ClientError::ClientError(const type_info *type, const char *file, unsigned line,
rmean_collect(rmean_error, RMEAN_ERROR, 1);
}
+ClientError::ClientError(struct error *last_e, uint32_t errcode)
+ :Exception(&type_ClientError, last_e->file, last_e->line)
+{
+ m_errcode = errcode;
+ /*
+ * Do not collect error - it was collected already by the
+ * original error.
+ */
+ int len = strlen(last_e->errmsg);
+ assert(len < DIAG_ERRMSG_MAX);
+ memcpy(this->errmsg, last_e->errmsg, len);
+ this->errmsg[len] = 0;
+}
+
ClientError::ClientError(const char *file, unsigned line,
uint32_t errcode, ...)
:Exception(&type_ClientError, file, line)
@@ -137,6 +151,19 @@ BuildClientError(const char *file, unsigned line, uint32_t errcode, ...)
}
}
+struct error *
+RebuildClientError(struct error *last_e, uint32_t errcode)
+{
+ /* Can not convert OOM. */
+ if (last_e->type == &type_OutOfMemory)
+ return last_e;
+ try {
+ return new ClientError(last_e, errcode);
+ } catch (OutOfMemory *e) {
+ return e;
+ }
+}
+
void
ClientError::log() const
{
diff --git a/src/box/error.h b/src/box/error.h
index c791e6c6a..5bad1cdc3 100644
--- a/src/box/error.h
+++ b/src/box/error.h
@@ -44,6 +44,9 @@ BuildAccessDeniedError(const char *file, unsigned int line,
const char *access_type, const char *object_type,
const char *object_name, const char *user_name);
+struct error *
+RebuildClientError(struct error *last_e, uint32_t errcode);
+
/** \cond public */
@@ -164,6 +167,8 @@ public:
ClientError(const char *file, unsigned line, uint32_t errcode, ...);
+ ClientError(struct error *last_e, uint32_t errcode);
+
static uint32_t get_errcode(const struct error *e);
/* client errno code */
int m_errcode;
diff --git a/src/diag.h b/src/diag.h
index dc6c132d5..85fc1ab21 100644
--- a/src/diag.h
+++ b/src/diag.h
@@ -263,6 +263,15 @@ BuildUnsupportedIndexFeature(const char *file, unsigned line,
diag_add_error(diag_get(), e); \
} while (0)
+#define diag_reset(new_class, ...) do { \
+ struct diag *d = diag_get(); \
+ struct error *last_e = diag_last_error(d); \
+ if (last_e->type != &type_##new_class) { \
+ last_e = Rebuild##new_class(last_e, ##__VA_ARGS__); \
+ diag_add_error(d, last_e); \
+ } \
+} while (0)
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
--
2.15.1 (Apple Git-101)
* [tarantool-patches] Re: [PATCH v3 1/4] error: introduce error rebulding API
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 1/4] error: introduce error rebulding API Vladislav Shpilevoy
@ 2018-05-16 17:06 ` Vladislav Shpilevoy
0 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-16 17:06 UTC
To: tarantool-patches; +Cc: kostja
The patch is removed from the series.
On 15/05/2018 22:54, Vladislav Shpilevoy wrote:
> Some modules live outside the box library and return their own
> custom or common errors, such as SocketError, IllegalParams, etc.
> But when core code is used from the box library, some of these
> errors must be converted into other errors. For example, it would
> be useful to be able to transform IllegalParams into ClientError.
>
> The patch introduces diag_reset, which takes a new error type and
> custom arguments that depend on that type. An error type that
> should be available for rebuilding must provide a
> Rebuild<error_name> function that takes the old error object and
> the custom arguments.
> ---
> src/box/error.cc | 27 +++++++++++++++++++++++++++
> src/box/error.h | 5 +++++
> src/diag.h | 9 +++++++++
> 3 files changed, 41 insertions(+)
>
* [tarantool-patches] [PATCH v3 2/4] collation: split collation into core and box objects
2018-05-15 19:54 [tarantool-patches] [PATCH v3 0/4] Lua utf8 module Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 1/4] error: introduce error rebulding API Vladislav Shpilevoy
@ 2018-05-15 19:54 ` Vladislav Shpilevoy
2018-05-16 17:07 ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-17 19:23 ` Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 3/4] collation: introduce collation fingerprint Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 4/4] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
3 siblings, 2 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-15 19:54 UTC
To: tarantool-patches; +Cc: kostja
Issue #3290 exposed an important problem: Tarantool cannot
create completely internal collations with no ID, name, or
owner, intended just for internal usage.
The original struct coll cannot be used for this because
* it has fields that are not needed internally;
* the collation name is public and the collation cache uses it,
  so users would have to be forbidden from using some system
  names;
* when multiple collations have the same comparator and differ
  only in name/owner/ID, separate UCollator objects are
  created, although it would be better to reference a single
  one.
This patch renames coll to box_coll and coll_def to box_coll_def,
and introduces coll: a pure collation object without any
user-defined attributes.
Needed for #3290.
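
The split between a user-visible wrapper and a shared core object can be sketched as follows. This is only an illustration under stated assumptions: core_coll, box_coll_sketch and all functions below are hypothetical names, the strength field stands in for the real comparator and UCollator, and the way matching core objects are looked up is not part of this sketch. The wrapper here takes its own reference on the core object it is given.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Core collation: only the functional part, shared by reference. */
struct core_coll {
	/* Reference counter. */
	int refs;
	/* Placeholder for the comparator and ICU state (assumption). */
	int strength;
};

/* Box collation: user-visible metadata plus a pointer to the core. */
struct box_coll_sketch {
	uint32_t id;
	uint32_t owner_id;
	struct core_coll *base;
	size_t name_len;
	char name[];
};

/* Create a core collation with one reference owned by the caller. */
struct core_coll *
core_coll_new(int strength)
{
	struct core_coll *c = (struct core_coll *) malloc(sizeof(*c));
	if (c == NULL)
		return NULL;
	c->refs = 1;
	c->strength = strength;
	return c;
}

void
core_coll_ref(struct core_coll *c)
{
	++c->refs;
}

/* Drop one reference, free on zero. Returns the remaining count. */
int
core_coll_unref(struct core_coll *c)
{
	assert(c->refs > 0);
	int left = --c->refs;
	if (left == 0)
		free(c);
	return left;
}

/* Create a wrapper that holds its own reference to the core object. */
struct box_coll_sketch *
box_coll_sketch_new(uint32_t id, uint32_t owner_id, const char *name,
		    struct core_coll *base)
{
	size_t name_len = strlen(name);
	struct box_coll_sketch *coll = (struct box_coll_sketch *)
		malloc(sizeof(*coll) + name_len + 1);
	if (coll == NULL)
		return NULL;
	core_coll_ref(base);
	coll->id = id;
	coll->owner_id = owner_id;
	coll->base = base;
	coll->name_len = name_len;
	/* Copy the name together with its terminating zero. */
	memcpy(coll->name, name, name_len + 1);
	return coll;
}

/* The wrapper is not reference counted: deleting it releases the core. */
void
box_coll_sketch_delete(struct box_coll_sketch *coll)
{
	core_coll_unref(coll->base);
	free(coll);
}
```

With this layout, two wrappers that differ only in ID and name can point at the same core object, and the core is freed only when the last holder releases it.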
---
src/CMakeLists.txt | 2 +
src/box/alter.cc | 72 +++++++-------
src/box/coll.c | 247 ++++-------------------------------------------
src/box/coll.h | 59 +++--------
src/box/coll_cache.c | 44 +++++----
src/box/coll_cache.h | 17 ++--
src/box/coll_def.c | 32 ------
src/box/coll_def.h | 86 +----------------
src/box/key_def.cc | 22 +++--
src/box/key_def.h | 5 +-
src/box/lua/space.cc | 8 +-
src/box/schema.cc | 8 +-
src/box/tuple.c | 4 +-
src/box/tuple_compare.cc | 5 +-
src/box/tuple_hash.cc | 4 +-
src/coll.c | 234 ++++++++++++++++++++++++++++++++++++++++++++
src/coll.h | 98 +++++++++++++++++++
src/coll_def.c | 63 ++++++++++++
src/coll_def.h | 115 ++++++++++++++++++++++
test/unit/coll.cpp | 8 +-
20 files changed, 653 insertions(+), 480 deletions(-)
create mode 100644 src/coll.c
create mode 100644 src/coll.h
create mode 100644 src/coll_def.c
create mode 100644 src/coll_def.h
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 8ab09e968..5bf17614b 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -94,6 +94,8 @@ set (core_sources
random.c
trigger.cc
http_parser.c
+ coll.c
+ coll_def.c
)
if (TARGET_OS_NETBSD)
diff --git a/src/box/alter.cc b/src/box/alter.cc
index 8766c8171..d72b9a3bb 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -35,6 +35,7 @@
#include "index.h"
#include "func.h"
#include "coll_cache.h"
+#include "coll_def.h"
#include "txn.h"
#include "tuple.h"
#include "fiber.h" /* for gc_pool */
@@ -2286,7 +2287,7 @@ on_replace_dd_func(struct trigger * /* trigger */, void *event)
/** Create a collation definition from tuple. */
void
-coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
+box_coll_def_new_from_tuple(const struct tuple *tuple, struct box_coll_def *def)
{
memset(def, 0, sizeof(*def));
uint32_t name_len, locale_len, type_len;
@@ -2294,15 +2295,16 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
def->name = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_NAME, &name_len);
def->name_len = name_len;
def->owner_id = tuple_field_u32_xc(tuple, BOX_COLLATION_FIELD_UID);
+ struct coll_def *base = &def->base;
const char *type = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_TYPE,
&type_len);
- def->type = STRN2ENUM(coll_type, type, type_len);
- if (def->type == coll_type_MAX)
+ base->type = STRN2ENUM(coll_type, type, type_len);
+ if (base->type == coll_type_MAX)
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"unknown collation type");
- def->locale = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_LOCALE,
- &locale_len);
- def->locale_len = locale_len;
+ base->locale = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_LOCALE,
+ &locale_len);
+ base->locale_len = locale_len;
const char *options =
tuple_field_with_type_xc(tuple, BOX_COLLATION_FIELD_OPTIONS,
MP_MAP);
@@ -2315,53 +2317,53 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
"collation locale is too long");
/* Locale is an optional argument and can be NULL. */
if (locale_len > 0)
- identifier_check_xc(def->locale, locale_len);
+ identifier_check_xc(base->locale, locale_len);
identifier_check_xc(def->name, name_len);
- assert(def->type == COLL_TYPE_ICU); /* no more defined now */
- if (opts_decode(&def->icu, coll_icu_opts_reg, &options,
+ assert(base->type == COLL_TYPE_ICU);
+ if (opts_decode(&base->icu, coll_icu_opts_reg, &options,
ER_WRONG_COLLATION_OPTIONS,
BOX_COLLATION_FIELD_OPTIONS, NULL) != 0)
diag_raise();
- if (def->icu.french_collation == coll_icu_on_off_MAX) {
+ if (base->icu.french_collation == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong french_collation option setting, "
"expected ON | OFF");
}
- if (def->icu.alternate_handling == coll_icu_alternate_handling_MAX) {
+ if (base->icu.alternate_handling == coll_icu_alternate_handling_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong alternate_handling option setting, "
"expected NON_IGNORABLE | SHIFTED");
}
- if (def->icu.case_first == coll_icu_case_first_MAX) {
+ if (base->icu.case_first == coll_icu_case_first_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong case_first option setting, "
"expected OFF | UPPER_FIRST | LOWER_FIRST");
}
- if (def->icu.case_level == coll_icu_on_off_MAX) {
+ if (base->icu.case_level == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong case_level option setting, "
"expected ON | OFF");
}
- if (def->icu.normalization_mode == coll_icu_on_off_MAX) {
+ if (base->icu.normalization_mode == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong normalization_mode option setting, "
"expected ON | OFF");
}
- if (def->icu.strength == coll_icu_strength_MAX) {
+ if (base->icu.strength == coll_icu_strength_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong strength option setting, "
"expected PRIMARY | SECONDARY | "
"TERTIARY | QUATERNARY | IDENTICAL");
}
- if (def->icu.numeric_collation == coll_icu_on_off_MAX) {
+ if (base->icu.numeric_collation == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong numeric_collation option setting, "
"expected ON | OFF");
@@ -2373,16 +2375,16 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
* A change is only INSERT or DELETE, UPDATE is not supported.
*/
static void
-coll_cache_rollback(struct trigger *trigger, void *event)
+box_coll_cache_rollback(struct trigger *trigger, void *event)
{
- struct coll *coll = (struct coll *) trigger->data;
+ struct box_coll *coll = (struct box_coll *) trigger->data;
struct txn_stmt *stmt = txn_last_stmt((struct txn*) event);
if (stmt->new_tuple == NULL) {
/* Rollback DELETE: put the collation back. */
assert(stmt->old_tuple != NULL);
- struct coll *replaced;
- if (coll_cache_replace(coll, &replaced) != 0) {
+ struct box_coll *replaced;
+ if (box_coll_cache_replace(coll, &replaced) != 0) {
panic("Out of memory on insertion into collation "\
"cache");
}
@@ -2390,19 +2392,19 @@ coll_cache_rollback(struct trigger *trigger, void *event)
} else {
/* INSERT: remove and free the new collation */
assert(stmt->old_tuple == NULL);
- coll_cache_delete(coll);
- coll_unref(coll);
+ box_coll_cache_delete(coll);
+ box_coll_delete(coll);
}
}
/** Dereference a deleted collation on commit. */
static void
-coll_cache_commit(struct trigger *trigger, void *event)
+box_coll_cache_commit(struct trigger *trigger, void *event)
{
(void) event;
- struct coll *coll = (struct coll *) trigger->data;
- coll_unref(coll);
+ struct box_coll *coll = (struct box_coll *) trigger->data;
+ box_coll_delete(coll);
}
/**
@@ -2418,15 +2420,15 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
struct tuple *new_tuple = stmt->new_tuple;
txn_check_singlestatement_xc(txn, "Space _collation");
struct trigger *on_rollback =
- txn_alter_trigger_new(coll_cache_rollback, NULL);
+ txn_alter_trigger_new(box_coll_cache_rollback, NULL);
struct trigger *on_commit =
- txn_alter_trigger_new(coll_cache_commit, NULL);
+ txn_alter_trigger_new(box_coll_cache_commit, NULL);
if (new_tuple == NULL && old_tuple != NULL) {
/* DELETE */
/* TODO: Check that no index uses the collation */
int32_t old_id = tuple_field_u32_xc(old_tuple,
BOX_COLLATION_FIELD_ID);
- struct coll *old_coll = coll_by_id(old_id);
+ struct box_coll *old_coll = box_coll_by_id(old_id);
assert(old_coll != NULL);
access_check_ddl(old_coll->name, old_coll->owner_id,
SC_COLLATION, PRIV_D, false);
@@ -2435,23 +2437,23 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
* deletion from the cache to make trigger logic
* simple..
*/
- coll_cache_delete(old_coll);
+ box_coll_cache_delete(old_coll);
on_rollback->data = old_coll;
on_commit->data = old_coll;
txn_on_rollback(txn, on_rollback);
txn_on_commit(txn, on_commit);
} else if (new_tuple != NULL && old_tuple == NULL) {
/* INSERT */
- struct coll_def new_def;
- coll_def_new_from_tuple(new_tuple, &new_def);
+ struct box_coll_def new_def;
+ box_coll_def_new_from_tuple(new_tuple, &new_def);
access_check_ddl(new_def.name, new_def.owner_id, SC_COLLATION,
PRIV_C, false);
- struct coll *new_coll = coll_new(&new_def);
+ struct box_coll *new_coll = box_coll_new(&new_def);
if (new_coll == NULL)
diag_raise();
- struct coll *replaced;
- if (coll_cache_replace(new_coll, &replaced) != 0) {
- coll_unref(new_coll);
+ struct box_coll *replaced;
+ if (box_coll_cache_replace(new_coll, &replaced) != 0) {
+ box_coll_delete(new_coll);
diag_raise();
}
assert(replaced == NULL);
diff --git a/src/box/coll.c b/src/box/coll.c
index 436d8d127..3bf3aff3c 100644
--- a/src/box/coll.c
+++ b/src/box/coll.c
@@ -28,252 +28,39 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
#include "coll.h"
-#include "third_party/PMurHash.h"
+#include <coll.h>
+#include "coll_def.h"
#include "error.h"
#include "diag.h"
-#include <unicode/ucol.h>
-#include <trivia/config.h>
-
-enum {
- MAX_HASH_BUFFER = 1024,
- MAX_LOCALE = 1024,
-};
-
-/**
- * Compare two string using ICU collation.
- */
-static int
-coll_icu_cmp(const char *s, size_t slen, const char *t, size_t tlen,
- const struct coll *coll)
-{
- assert(coll->icu.collator != NULL);
-
- UErrorCode status = U_ZERO_ERROR;
-
-#ifdef HAVE_ICU_STRCOLLUTF8
- UCollationResult result = ucol_strcollUTF8(coll->icu.collator,
- s, slen, t, tlen, &status);
-#else
- UCharIterator s_iter, t_iter;
- uiter_setUTF8(&s_iter, s, slen);
- uiter_setUTF8(&t_iter, t, tlen);
- UCollationResult result = ucol_strcollIter(coll->icu.collator,
- &s_iter, &t_iter, &status);
-#endif
- assert(!U_FAILURE(status));
- return (int)result;
-}
-
-/**
- * Get a hash of a string using ICU collation.
- */
-static uint32_t
-coll_icu_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
- struct coll *coll)
-{
- uint32_t total_size = 0;
- UCharIterator itr;
- uiter_setUTF8(&itr, s, s_len);
- uint8_t buf[MAX_HASH_BUFFER];
- uint32_t state[2] = {0, 0};
- UErrorCode status = U_ZERO_ERROR;
- while (true) {
- int32_t got = ucol_nextSortKeyPart(coll->icu.collator,
- &itr, state, buf,
- MAX_HASH_BUFFER, &status);
- PMurHash32_Process(ph, pcarry, buf, got);
- total_size += got;
- if (got < MAX_HASH_BUFFER)
- break;
- }
- return total_size;
-}
-/**
- * Set up ICU collator and init cmp and hash members of collation.
- * @param coll - collation to set up.
- * @param def - collation definition.
- * @return 0 on success, -1 on error.
- */
-static int
-coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
+struct box_coll *
+box_coll_new(const struct box_coll_def *def)
{
- if (coll->icu.collator != NULL) {
- ucol_close(coll->icu.collator);
- coll->icu.collator = NULL;
- }
-
- if (def->locale_len >= MAX_LOCALE) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "too long locale");
- return -1;
- }
- char locale[MAX_LOCALE];
- memcpy(locale, def->locale, def->locale_len);
- locale[def->locale_len] = '\0';
- UErrorCode status = U_ZERO_ERROR;
- struct UCollator *collator = ucol_open(locale, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- u_errorName(status));
- return -1;
- }
- coll->icu.collator = collator;
-
- if (def->icu.french_collation != COLL_ICU_DEFAULT) {
- enum coll_icu_on_off w = def->icu.french_collation;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_FRENCH_COLLATION, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set french_collation");
- return -1;
- }
- }
- if (def->icu.alternate_handling != COLL_ICU_AH_DEFAULT) {
- enum coll_icu_alternate_handling w = def->icu.alternate_handling;
- UColAttributeValue v =
- w == COLL_ICU_AH_NON_IGNORABLE ? UCOL_NON_IGNORABLE :
- w == COLL_ICU_AH_SHIFTED ? UCOL_SHIFTED :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set alternate_handling");
- return -1;
- }
- }
- if (def->icu.case_first != COLL_ICU_CF_DEFAULT) {
- enum coll_icu_case_first w = def->icu.case_first;
- UColAttributeValue v =
- w == COLL_ICU_CF_OFF ? UCOL_OFF :
- w == COLL_ICU_CF_UPPER_FIRST ? UCOL_UPPER_FIRST :
- w == COLL_ICU_CF_LOWER_FIRST ? UCOL_LOWER_FIRST :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_CASE_FIRST, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set case_first");
- return -1;
- }
- }
- if (def->icu.case_level != COLL_ICU_DEFAULT) {
- enum coll_icu_on_off w = def->icu.case_level;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_CASE_LEVEL , v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set case_level");
- return -1;
- }
- }
- if (def->icu.normalization_mode != COLL_ICU_DEFAULT) {
- enum coll_icu_on_off w = def->icu.normalization_mode;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set normalization_mode");
- return -1;
- }
- }
- if (def->icu.strength != COLL_ICU_STRENGTH_DEFAULT) {
- enum coll_icu_strength w = def->icu.strength;
- UColAttributeValue v =
- w == COLL_ICU_STRENGTH_PRIMARY ? UCOL_PRIMARY :
- w == COLL_ICU_STRENGTH_SECONDARY ? UCOL_SECONDARY :
- w == COLL_ICU_STRENGTH_TERTIARY ? UCOL_TERTIARY :
- w == COLL_ICU_STRENGTH_QUATERNARY ? UCOL_QUATERNARY :
- w == COLL_ICU_STRENGTH_IDENTICAL ? UCOL_IDENTICAL :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_STRENGTH, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set strength");
- return -1;
- }
- }
- if (def->icu.numeric_collation != COLL_ICU_DEFAULT) {
- enum coll_icu_on_off w = def->icu.numeric_collation;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_NUMERIC_COLLATION, v, &status);
- if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set numeric_collation");
- return -1;
- }
- }
-
- coll->cmp = coll_icu_cmp;
- coll->hash = coll_icu_hash;
- return 0;
-}
-
-/**
- * Destroy ICU collation.
- */
-static void
-coll_icu_destroy(struct coll *coll)
-{
- if (coll->icu.collator != NULL)
- ucol_close(coll->icu.collator);
-}
-
-/**
- * Create a collation by definition.
- * @param def - collation definition.
- * @return - the collation OR NULL on memory error (diag is set).
- */
-struct coll *
-coll_new(const struct coll_def *def)
-{
- assert(def->type == COLL_TYPE_ICU); /* no more types are implemented yet */
-
- size_t total_len = sizeof(struct coll) + def->name_len + 1;
- struct coll *coll = (struct coll *)calloc(1, total_len);
+ assert(def->base.type == COLL_TYPE_ICU);
+ size_t total_len = sizeof(struct box_coll) + def->name_len + 1;
+ struct box_coll *coll = (struct box_coll *) malloc(total_len);
if (coll == NULL) {
- diag_set(OutOfMemory, total_len, "malloc", "struct coll");
+ diag_set(OutOfMemory, total_len, "malloc", "coll");
+ return NULL;
+ }
+ coll->base = coll_new(&def->base);
+ if (coll->base == NULL) {
+ diag_reset(ClientError, ER_CANT_CREATE_COLLATION);
+ free(coll);
return NULL;
}
-
- coll->refs = 1;
coll->id = def->id;
coll->owner_id = def->owner_id;
- coll->type = def->type;
coll->name_len = def->name_len;
memcpy(coll->name, def->name, def->name_len);
coll->name[coll->name_len] = 0;
-
- if (coll_icu_init_cmp(coll, def) != 0) {
- free(coll);
- return NULL;
- }
-
return coll;
}
void
-coll_unref(struct coll *coll)
+box_coll_delete(struct box_coll *coll)
{
- /* No more types are implemented yet. */
- assert(coll->type == COLL_TYPE_ICU);
- assert(coll->refs > 0);
- if (--coll->refs == 0) {
- coll_icu_destroy(coll);
- free(coll);
- }
+ coll_unref(coll->base);
+ free(coll);
}
diff --git a/src/box/coll.h b/src/box/coll.h
index 248500ab4..dd91f2c4c 100644
--- a/src/box/coll.h
+++ b/src/box/coll.h
@@ -30,8 +30,6 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
-#include "coll_def.h"
#include <stddef.h>
#include <stdint.h>
@@ -39,65 +37,40 @@
extern "C" {
#endif /* defined(__cplusplus) */
+struct box_coll_def;
struct coll;
-typedef int (*coll_cmp_f)(const char *s, size_t s_len,
- const char *t, size_t t_len,
- const struct coll *coll);
-
-typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len,
- uint32_t *ph, uint32_t *pcarry,
- struct coll *coll);
-
/**
- * ICU collation specific data.
+ * A box collation. Box collation is not the same as core one. Box
+ * collation has name, owner and identifier, and each user defined
+ * collation has its own box_coll object. Multiple box_coll can
+ * reference the same core collation if their functional parts
+ * match.
*/
-struct UCollator;
-
-struct coll_icu {
- struct UCollator *collator;
-};
-
-/**
- * A collation.
- */
-struct coll {
+struct box_coll {
/** Personal ID */
uint32_t id;
/** Owner ID */
uint32_t owner_id;
- /** Collation type. */
- enum coll_type type;
- /** Type specific data. */
- struct coll_icu icu;
- /** String comparator. */
- coll_cmp_f cmp;
- coll_hash_f hash;
- /** Reference counter. */
- int refs;
+ /** Core collation. */
+ struct coll *base;
/** Collation name. */
size_t name_len;
char name[0];
};
/**
- * Create a collation by definition.
- * @param def - collation definition.
- * @return - the collation OR NULL on memory error (diag is set).
+ * Create a box collation by definition.
+ * @param def Collation definition.
+ * @retval NULL Illegal parameters or memory error.
+ * @retval not NULL Collation.
*/
-struct coll *
-coll_new(const struct coll_def *def);
+struct box_coll *
+box_coll_new(const struct box_coll_def *def);
/** Increment reference counter. */
-static inline void
-coll_ref(struct coll *coll)
-{
- ++coll->refs;
-}
-
-/** Decrement reference counter. Delete when 0. */
void
-coll_unref(struct coll *coll);
+box_coll_delete(struct box_coll *coll);
#if defined(__cplusplus)
} /* extern "C" */
diff --git a/src/box/coll_cache.c b/src/box/coll_cache.c
index b7eb3edb9..6695dad22 100644
--- a/src/box/coll_cache.c
+++ b/src/box/coll_cache.c
@@ -29,20 +29,21 @@
* SUCH DAMAGE.
*/
#include "coll_cache.h"
+#include "coll.h"
#include "diag.h"
#include "assoc.h"
/** mhash table (id -> collation) */
-static struct mh_i32ptr_t *coll_cache_id = NULL;
+static struct mh_i32ptr_t *box_coll_cache_id = NULL;
/** Create global hash tables if necessary. */
int
-coll_cache_init()
+box_coll_cache_init()
{
- coll_cache_id = mh_i32ptr_new();
- if (coll_cache_id == NULL) {
- diag_set(OutOfMemory, sizeof(*coll_cache_id), "malloc",
- "coll_cache_id");
+ box_coll_cache_id = mh_i32ptr_new();
+ if (box_coll_cache_id == NULL) {
+ diag_set(OutOfMemory, sizeof(*box_coll_cache_id), "malloc",
+ "box_coll_cache_id");
return -1;
}
return 0;
@@ -50,9 +51,9 @@ coll_cache_init()
/** Delete global hash tables. */
void
-coll_cache_destroy()
+box_coll_cache_destroy()
{
- mh_i32ptr_delete(coll_cache_id);
+ mh_i32ptr_delete(box_coll_cache_id);
}
/**
@@ -61,14 +62,15 @@ coll_cache_destroy()
* @return - NULL if inserted, replaced collation if replaced.
*/
int
-coll_cache_replace(struct coll *coll, struct coll **replaced)
+box_coll_cache_replace(struct box_coll *coll, struct box_coll **replaced)
{
const struct mh_i32ptr_node_t id_node = {coll->id, coll};
struct mh_i32ptr_node_t repl_id_node = {0, NULL};
struct mh_i32ptr_node_t *prepl_id_node = &repl_id_node;
- if (mh_i32ptr_put(coll_cache_id, &id_node, &prepl_id_node, NULL) ==
- mh_end(coll_cache_id)) {
- diag_set(OutOfMemory, sizeof(id_node), "malloc", "coll_cache_id");
+ if (mh_i32ptr_put(box_coll_cache_id, &id_node, &prepl_id_node, NULL) ==
+ mh_end(box_coll_cache_id)) {
+ diag_set(OutOfMemory, sizeof(id_node), "malloc",
+ "box_coll_cache_id");
return -1;
}
assert(repl_id_node.val == NULL);
@@ -81,22 +83,22 @@ coll_cache_replace(struct coll *coll, struct coll **replaced)
* @param coll - collation to delete.
*/
void
-coll_cache_delete(const struct coll *coll)
+box_coll_cache_delete(const struct box_coll *coll)
{
- mh_int_t i = mh_i32ptr_find(coll_cache_id, coll->id, NULL);
- if (i == mh_end(coll_cache_id))
+ mh_int_t i = mh_i32ptr_find(box_coll_cache_id, coll->id, NULL);
+ if (i == mh_end(box_coll_cache_id))
return;
- mh_i32ptr_del(coll_cache_id, i, NULL);
+ mh_i32ptr_del(box_coll_cache_id, i, NULL);
}
/**
* Find a collation object by its id.
*/
-struct coll *
-coll_by_id(uint32_t id)
+struct box_coll *
+box_coll_by_id(uint32_t id)
{
- mh_int_t pos = mh_i32ptr_find(coll_cache_id, id, NULL);
- if (pos == mh_end(coll_cache_id))
+ mh_int_t pos = mh_i32ptr_find(box_coll_cache_id, id, NULL);
+ if (pos == mh_end(box_coll_cache_id))
return NULL;
- return mh_i32ptr_node(coll_cache_id, pos)->val;
+ return mh_i32ptr_node(box_coll_cache_id, pos)->val;
}
diff --git a/src/box/coll_cache.h b/src/box/coll_cache.h
index 418de4e35..21bf22701 100644
--- a/src/box/coll_cache.h
+++ b/src/box/coll_cache.h
@@ -30,23 +30,24 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
-#include "coll.h"
+#include <stdint.h>
#if defined(__cplusplus)
extern "C" {
#endif /* defined(__cplusplus) */
+struct box_coll;
+
/**
* Create global hash tables.
* @return - 0 on success, -1 on memory error.
*/
int
-coll_cache_init();
+box_coll_cache_init();
/** Delete global hash tables. */
void
-coll_cache_destroy();
+box_coll_cache_destroy();
/**
* Insert or replace a collation into collation cache.
@@ -55,20 +56,20 @@ coll_cache_destroy();
* @return - 0 on success, -1 on memory error.
*/
int
-coll_cache_replace(struct coll *coll, struct coll **replaced);
+box_coll_cache_replace(struct box_coll *coll, struct box_coll **replaced);
/**
* Delete a collation from collation cache.
* @param coll - collation to delete.
*/
void
-coll_cache_delete(const struct coll *coll);
+box_coll_cache_delete(const struct box_coll *coll);
/**
* Find a collation object by its id.
*/
-struct coll *
-coll_by_id(uint32_t id);
+struct box_coll *
+box_coll_by_id(uint32_t id);
#if defined(__cplusplus)
} /* extern "C" */
diff --git a/src/box/coll_def.c b/src/box/coll_def.c
index f849845b3..fa003bc63 100644
--- a/src/box/coll_def.c
+++ b/src/box/coll_def.c
@@ -31,38 +31,6 @@
#include "coll_def.h"
-const char *coll_type_strs[] = {
- "ICU"
-};
-
-const char *coll_icu_on_off_strs[] = {
- "DEFAULT",
- "ON",
- "OFF"
-};
-
-const char *coll_icu_alternate_handling_strs[] = {
- "DEFAULT",
- "NON_IGNORABLE",
- "SHIFTED"
-};
-
-const char *coll_icu_case_first_strs[] = {
- "DEFAULT",
- "OFF",
- "UPPER_FIRST",
- "LOWER_FIRST"
-};
-
-const char *coll_icu_strength_strs[] = {
- "DEFAULT",
- "PRIMARY",
- "SECONDARY",
- "TERTIARY",
- "QUATERNARY",
- "IDENTICAL"
-};
-
static int64_t
icu_on_off_from_str(const char *str, uint32_t len)
{
diff --git a/src/box/coll_def.h b/src/box/coll_def.h
index 7a1027a1e..4d475fab5 100644
--- a/src/box/coll_def.h
+++ b/src/box/coll_def.h
@@ -33,86 +33,15 @@
#include <stddef.h>
#include <stdint.h>
+#include <coll_def.h>
#include "opt_def.h"
#if defined(__cplusplus)
extern "C" {
#endif /* defined(__cplusplus) */
-/**
- * The supported collation types
- */
-enum coll_type {
- COLL_TYPE_ICU = 0,
- coll_type_MAX,
-};
-
-extern const char *coll_type_strs[];
-
-/*
- * ICU collation options. See
- * http://icu-project.org/apiref/icu4c/ucol_8h.html#a583fbe7fc4a850e2fcc692e766d2826c
- */
-
-/** Settings for simple ICU on/off options */
-enum coll_icu_on_off {
- COLL_ICU_DEFAULT = 0,
- COLL_ICU_ON,
- COLL_ICU_OFF,
- coll_icu_on_off_MAX
-};
-
-extern const char *coll_icu_on_off_strs[];
-
-/** Alternate handling ICU settings */
-enum coll_icu_alternate_handling {
- COLL_ICU_AH_DEFAULT = 0,
- COLL_ICU_AH_NON_IGNORABLE,
- COLL_ICU_AH_SHIFTED,
- coll_icu_alternate_handling_MAX
-};
-
-extern const char *coll_icu_alternate_handling_strs[];
-
-/** Case first ICU settings */
-enum coll_icu_case_first {
- COLL_ICU_CF_DEFAULT = 0,
- COLL_ICU_CF_OFF,
- COLL_ICU_CF_UPPER_FIRST,
- COLL_ICU_CF_LOWER_FIRST,
- coll_icu_case_first_MAX
-};
-
-extern const char *coll_icu_case_first_strs[];
-
-/** Strength ICU settings */
-enum coll_icu_strength {
- COLL_ICU_STRENGTH_DEFAULT = 0,
- COLL_ICU_STRENGTH_PRIMARY,
- COLL_ICU_STRENGTH_SECONDARY,
- COLL_ICU_STRENGTH_TERTIARY,
- COLL_ICU_STRENGTH_QUATERNARY,
- COLL_ICU_STRENGTH_IDENTICAL,
- coll_icu_strength_MAX
-};
-
-extern const char *coll_icu_strength_strs[];
-
-/** Collection of ICU settings */
-struct coll_icu_def {
- enum coll_icu_on_off french_collation;
- enum coll_icu_alternate_handling alternate_handling;
- enum coll_icu_case_first case_first;
- enum coll_icu_on_off case_level;
- enum coll_icu_on_off normalization_mode;
- enum coll_icu_strength strength;
- enum coll_icu_on_off numeric_collation;
-};
-
-/**
- * Definition of a collation.
- */
-struct coll_def {
+/** Box collation definition. */
+struct box_coll_def {
/** Perconal ID */
uint32_t id;
/** Owner ID */
@@ -120,13 +49,8 @@ struct coll_def {
/** Collation name. */
size_t name_len;
const char *name;
- /** Locale. */
- size_t locale_len;
- const char *locale;
- /** Collation type. */
- enum coll_type type;
- /** Type specific options. */
- struct coll_icu_def icu;
+ /** Core collation definition. */
+ struct coll_def base;
};
extern const struct opt_def coll_icu_opts_reg[];
diff --git a/src/box/key_def.cc b/src/box/key_def.cc
index 45997ae83..8f08cfd22 100644
--- a/src/box/key_def.cc
+++ b/src/box/key_def.cc
@@ -156,16 +156,18 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
struct key_part_def *part = &parts[i];
struct coll *coll = NULL;
if (part->coll_id != COLL_NONE) {
- coll = coll_by_id(part->coll_id);
- if (coll == NULL) {
+ struct box_coll *box_coll =
+ box_coll_by_id(part->coll_id);
+ if (box_coll == NULL) {
diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
i + 1, "collation was not found by ID");
key_def_delete(def);
return NULL;
}
+ coll = box_coll->base;
}
key_def_set_part(def, i, part->fieldno, part->type,
- part->is_nullable, coll);
+ part->is_nullable, coll, part->coll_id);
}
return def;
}
@@ -179,8 +181,7 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
part_def->fieldno = part->fieldno;
part_def->type = part->type;
part_def->is_nullable = part->is_nullable;
- part_def->coll_id = (part->coll != NULL ?
- part->coll->id : COLL_NONE);
+ part_def->coll_id = part->coll_id;
}
}
@@ -194,7 +195,8 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
for (uint32_t item = 0; item < part_count; ++item) {
key_def_set_part(key_def, item, fields[item],
(enum field_type)types[item],
- key_part_def_default.is_nullable, NULL);
+ key_part_def_default.is_nullable, NULL,
+ COLL_NONE);
}
return key_def;
}
@@ -246,7 +248,8 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
- enum field_type type, bool is_nullable, struct coll *coll)
+ enum field_type type, bool is_nullable, struct coll *coll,
+ uint32_t coll_id)
{
assert(part_no < def->part_count);
assert(type < field_type_MAX);
@@ -255,6 +258,7 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
def->parts[part_no].fieldno = fieldno;
def->parts[part_no].type = type;
def->parts[part_no].coll = coll;
+ def->parts[part_no].coll_id = coll_id;
column_mask_set_fieldno(&def->column_mask, fieldno);
/**
* When all parts are set, initialize the tuple
@@ -554,7 +558,7 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
end = part + first->part_count;
for (; part != end; part++) {
key_def_set_part(new_def, pos++, part->fieldno, part->type,
- part->is_nullable, part->coll);
+ part->is_nullable, part->coll, part->coll_id);
}
/* Set-append second key def's part to the new key def. */
@@ -564,7 +568,7 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
if (key_def_find(first, part->fieldno))
continue;
key_def_set_part(new_def, pos++, part->fieldno, part->type,
- part->is_nullable, part->coll);
+ part->is_nullable, part->coll, part->coll_id);
}
return new_def;
}
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 12016a51a..0e9b5f5f3 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -68,6 +68,8 @@ struct key_part {
uint32_t fieldno;
/** Type of the tuple field */
enum field_type type;
+ /** Collation ID for string comparison. */
+ uint32_t coll_id;
/** Collation definition for string comparison */
struct coll *coll;
/** True if a part can store NULLs. */
@@ -249,7 +251,8 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
*/
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
- enum field_type type, bool is_nullable, struct coll *coll);
+ enum field_type type, bool is_nullable, struct coll *coll,
+ uint32_t coll_id);
/**
* Update 'has_optional_parts' of @a key_def with correspondence
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 333b6370f..385c2374a 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -46,6 +46,7 @@ extern "C" {
#include "box/txn.h"
#include "box/vclock.h" /* VCLOCK_MAX */
#include "box/sequence.h"
+#include "box/coll_cache.h"
/**
* Trigger function for all spaces
@@ -291,8 +292,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
lua_pushboolean(L, part->is_nullable);
lua_setfield(L, -2, "is_nullable");
- if (part->coll != NULL) {
- lua_pushstring(L, part->coll->name);
+ if (part->coll_id != COLL_NONE) {
+ struct box_coll *coll =
+ box_coll_by_id(part->coll_id);
+ assert(coll != NULL);
+ lua_pushstring(L, coll->name);
lua_setfield(L, -2, "collation");
}
diff --git a/src/box/schema.cc b/src/box/schema.cc
index 1b96f978c..8df4aa73b 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -281,13 +281,13 @@ schema_init()
auto key_def_guard = make_scoped_guard([&] { key_def_delete(key_def); });
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_STRING, false, NULL);
+ FIELD_TYPE_STRING, false, NULL, COLL_NONE);
sc_space_new(BOX_SCHEMA_ID, "_schema", key_def, &on_replace_schema,
NULL);
/* _space - home for all spaces. */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
/* _collation - collation description. */
sc_space_new(BOX_COLLATION_ID, "_collation", key_def,
@@ -335,10 +335,10 @@ schema_init()
diag_raise();
/* space no */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
/* index no */
key_def_set_part(key_def, 1 /* part no */, 1 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
sc_space_new(BOX_INDEX_ID, "_index", key_def,
&alter_space_on_replace_index, &on_stmt_begin_index);
}
diff --git a/src/box/tuple.c b/src/box/tuple.c
index d4760f3b1..665af2ba9 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -207,7 +207,7 @@ tuple_init(field_name_hash_f hash)
box_tuple_last = NULL;
- if (coll_cache_init() != 0)
+ if (box_coll_cache_init() != 0)
return -1;
return 0;
@@ -260,7 +260,7 @@ tuple_free(void)
tuple_format_free();
- coll_cache_destroy();
+ box_coll_cache_destroy();
}
box_tuple_format_t *
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index cfee00496..c82995d1a 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -32,7 +32,7 @@
#include "tuple.h"
#include "trivia/util.h" /* NOINLINE */
#include <math.h>
-#include "coll_def.h"
+#include <coll.h>
/* {{{ tuple_compare */
@@ -295,8 +295,7 @@ mp_compare_str(const char *field_a, const char *field_b)
}
static inline int
-mp_compare_str_coll(const char *field_a, const char *field_b,
- struct coll *coll)
+mp_compare_str_coll(const char *field_a, const char *field_b, struct coll *coll)
{
uint32_t size_a = mp_decode_strl(&field_a);
uint32_t size_b = mp_decode_strl(&field_b);
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 0fa8ea561..a2a237b4a 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -28,11 +28,9 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
#include "tuple_hash.h"
-
+#include <coll.h>
#include "third_party/PMurHash.h"
-#include "coll.h"
/* Tuple and key hasher */
namespace {
diff --git a/src/coll.c b/src/coll.c
new file mode 100644
index 000000000..eacb643f2
--- /dev/null
+++ b/src/coll.c
@@ -0,0 +1,234 @@
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include "coll.h"
+#include "third_party/PMurHash.h"
+#include "diag.h"
+#include <unicode/ucol.h>
+#include <trivia/config.h>
+
+enum {
+ MAX_HASH_BUFFER = 1024,
+ MAX_LOCALE = 1024,
+};
+
+/** Compare two strings using ICU collation. */
+static int
+coll_icu_cmp(const char *s, size_t slen, const char *t, size_t tlen,
+ const struct coll *coll)
+{
+ assert(coll->icu.collator != NULL);
+
+ UErrorCode status = U_ZERO_ERROR;
+
+#ifdef HAVE_ICU_STRCOLLUTF8
+ UCollationResult result = ucol_strcollUTF8(coll->icu.collator,
+ s, slen, t, tlen, &status);
+#else
+ UCharIterator s_iter, t_iter;
+ uiter_setUTF8(&s_iter, s, slen);
+ uiter_setUTF8(&t_iter, t, tlen);
+ UCollationResult result = ucol_strcollIter(coll->icu.collator,
+ &s_iter, &t_iter, &status);
+#endif
+ assert(!U_FAILURE(status));
+ return (int)result;
+}
+
+/** Get a hash of a string using ICU collation. */
+static uint32_t
+coll_icu_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
+ struct coll *coll)
+{
+ uint32_t total_size = 0;
+ UCharIterator itr;
+ uiter_setUTF8(&itr, s, s_len);
+ uint8_t buf[MAX_HASH_BUFFER];
+ uint32_t state[2] = {0, 0};
+ UErrorCode status = U_ZERO_ERROR;
+ int32_t got;
+ do {
+ got = ucol_nextSortKeyPart(coll->icu.collator, &itr, state, buf,
+ MAX_HASH_BUFFER, &status);
+ PMurHash32_Process(ph, pcarry, buf, got);
+ total_size += got;
+ } while (got == MAX_HASH_BUFFER);
+ return total_size;
+}
+
+/**
+ * Set up ICU collator and init cmp and hash members of collation.
+ * @param coll Collation to set up.
+ * @param def Collation definition.
+ * @retval 0 Success.
+ * @retval -1 Illegal parameters or memory error.
+ */
+static int
+coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
+{
+ if (def->locale_len >= MAX_LOCALE) {
+ diag_set(IllegalParams, "too long locale");
+ return -1;
+ }
+ char locale[MAX_LOCALE];
+ memcpy(locale, def->locale, def->locale_len);
+ locale[def->locale_len] = '\0';
+ UErrorCode status = U_ZERO_ERROR;
+ struct UCollator *collator = ucol_open(locale, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, u_errorName(status));
+ return -1;
+ }
+ coll->icu.collator = collator;
+
+ if (def->icu.french_collation != COLL_ICU_DEFAULT) {
+ enum coll_icu_on_off w = def->icu.french_collation;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF :
+ UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_FRENCH_COLLATION, v, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "french_collation: %s", u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.alternate_handling != COLL_ICU_AH_DEFAULT) {
+ enum coll_icu_alternate_handling w =
+ def->icu.alternate_handling;
+ UColAttributeValue v =
+ w == COLL_ICU_AH_NON_IGNORABLE ? UCOL_NON_IGNORABLE :
+ w == COLL_ICU_AH_SHIFTED ? UCOL_SHIFTED : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, v,
+ &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "alternate_handling: %s",
+ u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.case_first != COLL_ICU_CF_DEFAULT) {
+ enum coll_icu_case_first w = def->icu.case_first;
+ UColAttributeValue v = w == COLL_ICU_CF_OFF ? UCOL_OFF :
+ w == COLL_ICU_CF_UPPER_FIRST ? UCOL_UPPER_FIRST :
+ w == COLL_ICU_CF_LOWER_FIRST ? UCOL_LOWER_FIRST :
+ UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_CASE_FIRST, v, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "case_first: %s", u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.case_level != COLL_ICU_DEFAULT) {
+ enum coll_icu_on_off w = def->icu.case_level;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_CASE_LEVEL , v, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "case_level: %s", u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.normalization_mode != COLL_ICU_DEFAULT) {
+ enum coll_icu_on_off w = def->icu.normalization_mode;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, v,
+ &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "normalization_mode: %s",
+ u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.strength != COLL_ICU_STRENGTH_DEFAULT) {
+ enum coll_icu_strength w = def->icu.strength;
+ UColAttributeValue v =
+ w == COLL_ICU_STRENGTH_PRIMARY ? UCOL_PRIMARY :
+ w == COLL_ICU_STRENGTH_SECONDARY ? UCOL_SECONDARY :
+ w == COLL_ICU_STRENGTH_TERTIARY ? UCOL_TERTIARY :
+ w == COLL_ICU_STRENGTH_QUATERNARY ? UCOL_QUATERNARY :
+ w == COLL_ICU_STRENGTH_IDENTICAL ? UCOL_IDENTICAL :
+ UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_STRENGTH, v, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "strength: %s", u_errorName(status)));
+ return -1;
+ }
+ }
+ if (def->icu.numeric_collation != COLL_ICU_DEFAULT) {
+ enum coll_icu_on_off w = def->icu.numeric_collation;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_NUMERIC_COLLATION, v, &status);
+ if (U_FAILURE(status)) {
+ diag_set(IllegalParams, tt_sprintf("failed to set "\
+ "numeric_collation: %s", u_errorName(status)));
+ return -1;
+ }
+ }
+ coll->cmp = coll_icu_cmp;
+ coll->hash = coll_icu_hash;
+ return 0;
+}
+
+struct coll *
+coll_new(const struct coll_def *def)
+{
+ assert(def->type == COLL_TYPE_ICU);
+ struct coll *coll = (struct coll *) malloc(sizeof(*coll));
+ if (coll == NULL) {
+ diag_set(OutOfMemory, sizeof(*coll), "malloc", "coll");
+ return NULL;
+ }
+ coll->refs = 1;
+ coll->type = def->type;
+ if (coll_icu_init_cmp(coll, def) != 0) {
+ free(coll);
+ return NULL;
+ }
+ return coll;
+}
+
+void
+coll_unref(struct coll *coll)
+{
+ assert(coll->refs > 0);
+ if (--coll->refs == 0) {
+ ucol_close(coll->icu.collator);
+ free(coll);
+ }
+}
diff --git a/src/coll.h b/src/coll.h
new file mode 100644
index 000000000..8798d9491
--- /dev/null
+++ b/src/coll.h
@@ -0,0 +1,98 @@
+#ifndef TARANTOOL_COLL_H_INCLUDED
+#define TARANTOOL_COLL_H_INCLUDED
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include "coll_def.h"
+#include <stddef.h>
+#include <stdint.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+struct coll;
+
+typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
+ size_t t_len, const struct coll *coll);
+
+typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
+ uint32_t *pcarry, struct coll *coll);
+
+/** ICU collation specific data. */
+struct UCollator;
+
+struct coll_icu {
+ struct UCollator *collator;
+};
+
+/**
+ * A core collation. It has no unique features such as a name, ID
+ * or owner, only the functional part: comparator, locale and ICU
+ * settings.
+ */
+struct coll {
+ /** Collation type. */
+ enum coll_type type;
+ /** Type specific data. */
+ struct coll_icu icu;
+ /** String comparator. */
+ coll_cmp_f cmp;
+ coll_hash_f hash;
+ /** Reference counter. */
+ int refs;
+};
+
+/**
+ * Create a core collation by definition.
+ * @param def Core collation definition.
+ * @retval NULL Illegal parameters or memory error.
+ * @retval not NULL Collation.
+ */
+struct coll *
+coll_new(const struct coll_def *def);
+
+/** Increment reference counter. */
+static inline void
+coll_ref(struct coll *coll)
+{
+ ++coll->refs;
+}
+
+/** Decrement reference counter. Delete when 0. */
+void
+coll_unref(struct coll *coll);
+
+#if defined(__cplusplus)
+} /* extern "C" */
+#endif /* defined(__cplusplus) */
+
+#endif /* TARANTOOL_COLL_H_INCLUDED */
diff --git a/src/coll_def.c b/src/coll_def.c
new file mode 100644
index 000000000..df58caca8
--- /dev/null
+++ b/src/coll_def.c
@@ -0,0 +1,63 @@
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "coll_def.h"
+
+const char *coll_type_strs[] = {
+ "ICU"
+};
+
+const char *coll_icu_on_off_strs[] = {
+ "DEFAULT",
+ "ON",
+ "OFF"
+};
+
+const char *coll_icu_alternate_handling_strs[] = {
+ "DEFAULT",
+ "NON_IGNORABLE",
+ "SHIFTED"
+};
+
+const char *coll_icu_case_first_strs[] = {
+ "DEFAULT",
+ "OFF",
+ "UPPER_FIRST",
+ "LOWER_FIRST"
+};
+
+const char *coll_icu_strength_strs[] = {
+ "DEFAULT",
+ "PRIMARY",
+ "SECONDARY",
+ "TERTIARY",
+ "QUATERNARY",
+ "IDENTICAL"
+};
diff --git a/src/coll_def.h b/src/coll_def.h
new file mode 100644
index 000000000..c8921b41a
--- /dev/null
+++ b/src/coll_def.h
@@ -0,0 +1,115 @@
+#ifndef TARANTOOL_COLL_DEF_H_INCLUDED
+#define TARANTOOL_COLL_DEF_H_INCLUDED
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include <stddef.h>
+#include <stdint.h>
+
+/** The supported collation types */
+enum coll_type {
+ COLL_TYPE_ICU = 0,
+ coll_type_MAX,
+};
+
+extern const char *coll_type_strs[];
+
+/*
+ * ICU collation options. See
+ * http://icu-project.org/apiref/icu4c/ucol_8h.html#a583fbe7fc4a850e2fcc692e766d2826c
+ */
+
+/** Settings for simple ICU on/off options */
+enum coll_icu_on_off {
+ COLL_ICU_DEFAULT = 0,
+ COLL_ICU_ON,
+ COLL_ICU_OFF,
+ coll_icu_on_off_MAX
+};
+
+extern const char *coll_icu_on_off_strs[];
+
+/** Alternate handling ICU settings */
+enum coll_icu_alternate_handling {
+ COLL_ICU_AH_DEFAULT = 0,
+ COLL_ICU_AH_NON_IGNORABLE,
+ COLL_ICU_AH_SHIFTED,
+ coll_icu_alternate_handling_MAX
+};
+
+extern const char *coll_icu_alternate_handling_strs[];
+
+/** Case first ICU settings */
+enum coll_icu_case_first {
+ COLL_ICU_CF_DEFAULT = 0,
+ COLL_ICU_CF_OFF,
+ COLL_ICU_CF_UPPER_FIRST,
+ COLL_ICU_CF_LOWER_FIRST,
+ coll_icu_case_first_MAX
+};
+
+extern const char *coll_icu_case_first_strs[];
+
+/** Strength ICU settings */
+enum coll_icu_strength {
+ COLL_ICU_STRENGTH_DEFAULT = 0,
+ COLL_ICU_STRENGTH_PRIMARY,
+ COLL_ICU_STRENGTH_SECONDARY,
+ COLL_ICU_STRENGTH_TERTIARY,
+ COLL_ICU_STRENGTH_QUATERNARY,
+ COLL_ICU_STRENGTH_IDENTICAL,
+ coll_icu_strength_MAX
+};
+
+extern const char *coll_icu_strength_strs[];
+
+/** Collection of ICU settings */
+struct coll_icu_def {
+ enum coll_icu_on_off french_collation;
+ enum coll_icu_alternate_handling alternate_handling;
+ enum coll_icu_case_first case_first;
+ enum coll_icu_on_off case_level;
+ enum coll_icu_on_off normalization_mode;
+ enum coll_icu_strength strength;
+ enum coll_icu_on_off numeric_collation;
+};
+
+/** Core collation definition. */
+struct coll_def {
+ /** Locale. */
+ size_t locale_len;
+ const char *locale;
+ /** Collation type. */
+ enum coll_type type;
+ /** Type specific options. */
+ struct coll_icu_def icu;
+};
+
+#endif /* TARANTOOL_COLL_DEF_H_INCLUDED */
diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
index d77959606..17f26ea07 100644
--- a/test/unit/coll.cpp
+++ b/test/unit/coll.cpp
@@ -1,9 +1,9 @@
-#include "box/coll.h"
#include <iostream>
#include <vector>
#include <algorithm>
#include <string.h>
-#include <box/coll_def.h>
+#include <coll_def.h>
+#include <coll.h>
#include <assert.h>
#include <msgpuck.h>
#include <diag.h>
@@ -51,8 +51,6 @@ manual_test()
def.locale = "ru_RU";
def.locale_len = strlen(def.locale);
def.type = COLL_TYPE_ICU;
- def.name = "test";
- def.name_len = strlen(def.name);
struct coll *coll;
cout << " -- default ru_RU -- " << endl;
@@ -136,8 +134,6 @@ hash_test()
def.locale = "ru_RU";
def.locale_len = strlen(def.locale);
def.type = COLL_TYPE_ICU;
- def.name = "test";
- def.name_len = strlen(def.name);
struct coll *coll;
/* Case sensitive */
--
2.15.1 (Apple Git-101)
* [tarantool-patches] Re: [PATCH v3 2/4] collation: split collation into core and box objects
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 2/4] collation: split collation into core and box objects Vladislav Shpilevoy
@ 2018-05-16 17:07 ` Vladislav Shpilevoy
2018-05-16 17:17 ` Konstantin Osipov
2018-05-17 19:23 ` Vladislav Shpilevoy
1 sibling, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-16 17:07 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
I added CollationError and dropped diag_reset() together with
RebuildClientError(); see the diff below:
diff --git a/src/box/coll.c b/src/box/coll.c
index 3bf3aff3c..a8e54727b 100644
--- a/src/box/coll.c
+++ b/src/box/coll.c
@@ -46,7 +46,6 @@ box_coll_new(const struct box_coll_def *def)
}
coll->base = coll_new(&def->base);
if (coll->base == NULL) {
- diag_reset(ClientError, ER_CANT_CREATE_COLLATION);
free(coll);
return NULL;
}
diff --git a/src/box/error.cc b/src/box/error.cc
index bbe3b5236..6b14dff05 100644
--- a/src/box/error.cc
+++ b/src/box/error.cc
@@ -108,20 +108,6 @@ ClientError::ClientError(const type_info *type, const char *file, unsigned line,
rmean_collect(rmean_error, RMEAN_ERROR, 1);
}
-ClientError::ClientError(struct error *last_e, uint32_t errcode)
- :Exception(&type_ClientError, last_e->file, last_e->line)
-{
- m_errcode = errcode;
- /*
- * Do not collect error - it was collected already by the
- * original error.
- */
- int len = strlen(last_e->errmsg);
- assert(len < DIAG_ERRMSG_MAX);
- memcpy(this->errmsg, last_e->errmsg, len);
- this->errmsg[len] = 0;
-}
-
ClientError::ClientError(const char *file, unsigned line,
uint32_t errcode, ...)
:Exception(&type_ClientError, file, line)
@@ -151,19 +137,6 @@ BuildClientError(const char *file, unsigned line, uint32_t errcode, ...)
}
}
-struct error *
-RebuildClientError(struct error *last_e, uint32_t errcode)
-{
- /* Can not convert OOM. */
- if (last_e->type == &type_OutOfMemory)
- return last_e;
- try {
- return new ClientError(last_e, errcode);
- } catch (OutOfMemory *e) {
- return e;
- }
-}
-
void
ClientError::log() const
{
@@ -182,6 +155,8 @@ ClientError::get_errcode(const struct error *e)
return ER_MEMORY_ISSUE;
if (type_cast(SystemError, e))
return ER_SYSTEM;
+ if (type_cast(CollationError, e))
+ return ER_CANT_CREATE_COLLATION;
return ER_PROC_LUA;
}
diff --git a/src/box/error.h b/src/box/error.h
index 5bad1cdc3..c791e6c6a 100644
--- a/src/box/error.h
+++ b/src/box/error.h
@@ -44,9 +44,6 @@ BuildAccessDeniedError(const char *file, unsigned int line,
const char *access_type, const char *object_type,
const char *object_name, const char *user_name);
-struct error *
-RebuildClientError(struct error *last_e, uint32_t errcode);
-
/** \cond public */
@@ -167,8 +164,6 @@ public:
ClientError(const char *file, unsigned line, uint32_t errcode, ...);
- ClientError(struct error *last_e, uint32_t errcode);
-
static uint32_t get_errcode(const struct error *e);
/* client errno code */
int m_errcode;
diff --git a/src/coll.c b/src/coll.c
index 398bff49e..2794d5f3c 100644
--- a/src/coll.c
+++ b/src/coll.c
@@ -127,7 +127,7 @@ static int
coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
{
if (def->locale_len >= MAX_LOCALE) {
- diag_set(IllegalParams, "too long locale");
+ diag_set(CollationError, "too long locale");
return -1;
}
char locale[MAX_LOCALE];
@@ -136,7 +136,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
UErrorCode status = U_ZERO_ERROR;
struct UCollator *collator = ucol_open(locale, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, u_errorName(status));
+ diag_set(CollationError, u_errorName(status));
return -1;
}
coll->icu.collator = collator;
@@ -148,7 +148,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_FRENCH_COLLATION, v, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"french_collation: %s", u_errorName(status)));
return -1;
}
@@ -162,7 +162,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, v,
&status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"alternate_handling: %s",
u_errorName(status)));
return -1;
@@ -176,7 +176,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_CASE_FIRST, v, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"case_first: %s", u_errorName(status)));
return -1;
}
@@ -187,7 +187,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_CASE_LEVEL , v, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"case_level: %s", u_errorName(status)));
return -1;
}
@@ -199,7 +199,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, v,
&status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"normalization_mode: %s",
u_errorName(status)));
return -1;
@@ -216,7 +216,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_STRENGTH, v, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"strength: %s", u_errorName(status)));
return -1;
}
@@ -227,7 +227,7 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_NUMERIC_COLLATION, v, &status);
if (U_FAILURE(status)) {
- diag_set(IllegalParams, tt_sprintf("failed to set "\
+ diag_set(CollationError, tt_sprintf("failed to set "\
"numeric_collation: %s", u_errorName(status)));
return -1;
}
diff --git a/src/diag.h b/src/diag.h
index 85fc1ab21..bd5a539b0 100644
--- a/src/diag.h
+++ b/src/diag.h
@@ -249,6 +249,8 @@ struct error *
BuildSystemError(const char *file, unsigned line, const char *format, ...);
struct error *
BuildXlogError(const char *file, unsigned line, const char *format, ...);
+struct error *
+BuildCollationError(const char *file, unsigned line, const char *format, ...);
struct index_def;
@@ -263,15 +265,6 @@ BuildUnsupportedIndexFeature(const char *file, unsigned line,
diag_add_error(diag_get(), e); \
} while (0)
-#define diag_reset(new_class, ...) do { \
- struct diag *d = diag_get(); \
- struct error *last_e = diag_last_error(d); \
- if (last_e->type != &type_##new_class) { \
- last_e = Rebuild##new_class(last_e, ##__VA_ARGS__); \
- diag_add_error(d, last_e); \
- } \
-} while (0)
-
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/exception.cc b/src/exception.cc
index 56077f76d..1cbf8852f 100644
--- a/src/exception.cc
+++ b/src/exception.cc
@@ -235,6 +235,18 @@ IllegalParams::IllegalParams(const char *file, unsigned line,
va_end(ap);
}
+const struct type_info type_CollationError =
+ make_type("CollationError", &type_Exception);
+
+CollationError::CollationError(const char *file, unsigned line,
+ const char *format, ...)
+ : Exception(&type_CollationError, file, line)
+{
+ va_list ap;
+ va_start(ap, format);
+ error_vformat_msg(this, format, ap);
+ va_end(ap);
+}
#define BuildAlloc(type) \
void *p = malloc(sizeof(type)); \
@@ -303,6 +315,18 @@ BuildSystemError(const char *file, unsigned line, const char *format, ...)
return e;
}
+struct error *
+BuildCollationError(const char *file, unsigned line, const char *format, ...)
+{
+ BuildAlloc(CollationError);
+ CollationError *e = new (p) CollationError(file, line, "");
+ va_list ap;
+ va_start(ap, format);
+ error_vformat_msg(e, format, ap);
+ va_end(ap);
+ return e;
+}
+
void
exception_init()
{
diff --git a/src/exception.h b/src/exception.h
index fe7ab84f0..f56616b68 100644
--- a/src/exception.h
+++ b/src/exception.h
@@ -49,6 +49,7 @@ extern const struct type_info type_ChannelIsClosed;
extern const struct type_info type_LuajitError;
extern const struct type_info type_IllegalParams;
extern const struct type_info type_SystemError;
+extern const struct type_info type_CollationError;
const char *
exception_get_string(struct error *e, const struct method_info *method);
@@ -139,6 +140,14 @@ public:
IllegalParams(const char *file, unsigned line, const char *format, ...);
virtual void raise() { throw this; }
};
+
+class CollationError: public Exception {
+public:
+ CollationError(const char *file, unsigned line, const char *format,
+ ...);
+ virtual void raise() { throw this; }
+};
+
/**
* Initialize the exception subsystem.
*/
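The BuildCollationError() helper above follows the same shape as the other Build*Error() functions: allocate the error object, then format a printf-style message into it. A minimal standalone sketch of that pattern is shown here. The names sketch_built_error, sketch_build_error and SKETCH_ERRMSG_MAX are hypothetical and are not part of Tarantool, and returning NULL on allocation failure is a simplification of what BuildAlloc does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message limit; the real constant is DIAG_ERRMSG_MAX. */
enum { SKETCH_ERRMSG_MAX = 512 };

/* Hypothetical, reduced error object: source location plus message. */
struct sketch_built_error {
	const char *file;
	unsigned line;
	char errmsg[SKETCH_ERRMSG_MAX];
};

/*
 * Allocate an error and format its message from variadic
 * arguments. vsnprintf() truncates and always terminates the
 * buffer, so an overlong message cannot overflow errmsg.
 */
struct sketch_built_error *
sketch_build_error(const char *file, unsigned line, const char *format, ...)
{
	assert(format != NULL);
	struct sketch_built_error *e = malloc(sizeof(*e));
	if (e == NULL)
		return NULL; /* Simplification: no preallocated OOM error. */
	e->file = file;
	e->line = line;
	va_list ap;
	va_start(ap, format);
	vsnprintf(e->errmsg, sizeof(e->errmsg), format, ap);
	va_end(ap);
	return e;
}
```

A caller would pass __FILE__ and __LINE__ together with the format string and release the object with free() when done.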
On 15/05/2018 22:54, Vladislav Shpilevoy wrote:
> Issue #3290 exposed an important problem: Tarantool cannot
> create purely internal collations that have no ID, name, or
> owner and exist only for internal use.
>
> The original struct coll cannot be used for this because
> * it has fields that are not needed internally;
> * the collation name is public and the collation cache uses
> it, so users would have to be forbidden from using some
> system names;
> * when multiple collations have the same comparator and differ
> only in name, owner, or ID, separate UCollator objects are
> created, although it would be better to reference a single
> one.
>
> This patch renames coll to box_coll and coll_def to
> box_coll_def, and introduces coll: a pure collation object
> with no user-defined attributes.
>
> Needed for #3290.
> ---
> src/CMakeLists.txt | 2 +
> src/box/alter.cc | 72 +++++++-------
> src/box/coll.c | 247 ++++-------------------------------------------
> src/box/coll.h | 59 +++--------
> src/box/coll_cache.c | 44 +++++----
> src/box/coll_cache.h | 17 ++--
> src/box/coll_def.c | 32 ------
> src/box/coll_def.h | 86 +----------------
> src/box/key_def.cc | 22 +++--
> src/box/key_def.h | 5 +-
> src/box/lua/space.cc | 8 +-
> src/box/schema.cc | 8 +-
> src/box/tuple.c | 4 +-
> src/box/tuple_compare.cc | 5 +-
> src/box/tuple_hash.cc | 4 +-
> src/coll.c | 234 ++++++++++++++++++++++++++++++++++++++++++++
> src/coll.h | 98 +++++++++++++++++++
> src/coll_def.c | 63 ++++++++++++
> src/coll_def.h | 115 ++++++++++++++++++++++
> test/unit/coll.cpp | 8 +-
> 20 files changed, 653 insertions(+), 480 deletions(-)
> create mode 100644 src/coll.c
> create mode 100644 src/coll.h
> create mode 100644 src/coll_def.c
> create mode 100644 src/coll_def.h
>
* [tarantool-patches] Re: [PATCH v3 2/4] collation: split collation into core and box objects
2018-05-16 17:07 ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-05-16 17:17 ` Konstantin Osipov
2018-05-16 17:19 ` Vladislav Shpilevoy
0 siblings, 1 reply; 11+ messages in thread
From: Konstantin Osipov @ 2018-05-16 17:17 UTC (permalink / raw)
To: tarantool-patches
* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/05/16 20:10]:
> I added CollationError:
Are there any other errors besides the failure to create a
collation?
> return ER_SYSTEM;
> + if (type_cast(CollationError, e))
> + return ER_CANT_CREATE_COLLATION;
> return ER_PROC_LUA;
> }
> diff --git a/src/box/error.h b/src/box/error.h
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
* [tarantool-patches] Re: [PATCH v3 2/4] collation: split collation into core and box objects
2018-05-16 17:17 ` Konstantin Osipov
@ 2018-05-16 17:19 ` Vladislav Shpilevoy
0 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-16 17:19 UTC (permalink / raw)
To: tarantool-patches, Konstantin Osipov
On 16/05/2018 20:17, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/05/16 20:10]:
>> I added CollationError:
>
> Are there any other errors besides the failure to create a
> collation?
>
No. Only OOM.
>
>> return ER_SYSTEM;
>> + if (type_cast(CollationError, e))
>> + return ER_CANT_CREATE_COLLATION;
>> return ER_PROC_LUA;
>> }
>> diff --git a/src/box/error.h b/src/box/error.h
>
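The exchange above concerns how an error type is translated into a box error code: ClientError::get_errcode() checks the error's type and, with this patch, maps CollationError to ER_CANT_CREATE_COLLATION. The following is only a sketch of that kind of lookup in plain C, assuming a type descriptor with a parent pointer. All sketch_* and SKETCH_* names and the numeric code values are hypothetical and do not match Tarantool's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified runtime type descriptor. */
struct sketch_type {
	const char *name;
	const struct sketch_type *parent;
};

const struct sketch_type sketch_type_Exception = {"Exception", NULL};
const struct sketch_type sketch_type_SystemError =
	{"SystemError", &sketch_type_Exception};
const struct sketch_type sketch_type_CollationError =
	{"CollationError", &sketch_type_Exception};

/* Hypothetical codes; the values are not the real ER_* values. */
enum sketch_errcode {
	SKETCH_ER_PROC_LUA,
	SKETCH_ER_SYSTEM,
	SKETCH_ER_CANT_CREATE_COLLATION,
};

struct sketch_typed_error {
	const struct sketch_type *type;
};

/* Return 1 if type is base or is derived from it, 0 otherwise. */
int
sketch_type_is(const struct sketch_type *type, const struct sketch_type *base)
{
	for (; type != NULL; type = type->parent) {
		if (type == base)
			return 1;
	}
	return 0;
}

/* Map an error to a code by its type; unknown types fall through. */
enum sketch_errcode
sketch_get_errcode(const struct sketch_typed_error *e)
{
	assert(e != NULL && e->type != NULL);
	if (sketch_type_is(e->type, &sketch_type_SystemError))
		return SKETCH_ER_SYSTEM;
	if (sketch_type_is(e->type, &sketch_type_CollationError))
		return SKETCH_ER_CANT_CREATE_COLLATION;
	return SKETCH_ER_PROC_LUA;
}
```

Because the only non-OOM failure in the core collation code is a failure to create a collation, a single dedicated type is enough for the box layer to pick the right code without inspecting the message.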
* [tarantool-patches] Re: [PATCH v3 2/4] collation: split collation into core and box objects
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 2/4] collation: split collation into core and box objects Vladislav Shpilevoy
2018-05-16 17:07 ` [tarantool-patches] " Vladislav Shpilevoy
@ 2018-05-17 19:23 ` Vladislav Shpilevoy
1 sibling, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-17 19:23 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
Hello. The new version of the patch is below. In it, box_coll is
renamed to coll_id, and the stack buffers are removed from coll.c.
---
collation: split collation into coll and id objects
Issue #3290 exposed an important problem: Tarantool cannot
create purely internal collations that have no ID, name, or
owner and exist only for internal use.
The original struct coll cannot be used for this because
* it has fields that are not needed internally;
* the collation name is public and the collation cache uses
it, so users would have to be forbidden from using some
system names;
* when multiple collations have the same comparator and differ
only in name, owner, or ID, separate UCollator objects are
created, although it would be better to reference a single
one.
This patch renames coll to coll_id and coll_def to coll_id_def,
and introduces coll: a pure collation object with no
user-defined attributes.
Needed for #3290.
---
src/CMakeLists.txt | 2 +
src/box/CMakeLists.txt | 6 +-
src/box/alter.cc | 105 ++++++++++----------
src/box/coll_id.c | 65 +++++++++++++
src/box/coll_id.h | 77 +++++++++++++++
src/box/{coll_cache.c => coll_id_cache.c} | 62 +++++-------
src/box/{coll_cache.h => coll_id_cache.h} | 27 +++---
src/box/{coll_def.c => coll_id_def.c} | 34 +------
src/box/coll_id_def.h | 54 +++++++++++
src/box/error.cc | 2 +
src/box/key_def.cc | 23 +++--
src/box/key_def.h | 7 +-
src/box/lua/space.cc | 8 +-
src/box/schema.cc | 8 +-
src/box/tuple.c | 6 +-
src/box/tuple_compare.cc | 5 +-
src/box/tuple_hash.cc | 1 -
src/{box => }/coll.c | 153 +++++++++++-------------------
src/{box => }/coll.h | 37 +++-----
src/coll_def.c | 63 ++++++++++++
src/{box => }/coll_def.h | 35 ++-----
src/diag.h | 2 +
src/exception.cc | 24 +++++
src/exception.h | 9 ++
test/unit/coll.cpp | 8 +-
25 files changed, 505 insertions(+), 318 deletions(-)
create mode 100644 src/box/coll_id.c
create mode 100644 src/box/coll_id.h
rename src/box/{coll_cache.c => coll_id_cache.c} (56%)
rename src/box/{coll_cache.h => coll_id_cache.h} (78%)
rename src/box/{coll_def.c => coll_id_def.c} (86%)
create mode 100644 src/box/coll_id_def.h
rename src/{box => }/coll.c (63%)
rename src/{box => }/coll.h (74%)
create mode 100644 src/coll_def.c
rename src/{box => }/coll_def.h (82%)
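The split described in the commit message can be pictured as a reference-counted core object plus a thin identifier object that carries the user-visible metadata. The sketch below is an illustration under stated assumptions, not the code of this patch: all sketch_* names are hypothetical, and unlike coll_id_new() in the patch (which calls coll_new() itself), the constructor here takes an existing core object so that the sharing of one object by several identifiers is visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical core collation: functional part only, ref-counted. */
struct sketch_coll {
	int refs;
};

/* Hypothetical identifier: ID, owner and name around a core object. */
struct sketch_coll_id {
	uint32_t id;
	uint32_t owner_id;
	struct sketch_coll *coll;
	size_t name_len;
	char name[];
};

struct sketch_coll *
sketch_coll_new(void)
{
	struct sketch_coll *coll = calloc(1, sizeof(*coll));
	if (coll != NULL)
		coll->refs = 1;
	return coll;
}

void
sketch_coll_ref(struct sketch_coll *coll)
{
	coll->refs++;
}

/* Drop one reference; free the object when none remain. */
int
sketch_coll_unref(struct sketch_coll *coll)
{
	assert(coll->refs > 0);
	int refs = --coll->refs;
	if (refs == 0)
		free(coll);
	return refs;
}

/* Create an identifier that holds its own reference to coll. */
struct sketch_coll_id *
sketch_coll_id_new(uint32_t id, uint32_t owner_id, const char *name,
		   struct sketch_coll *coll)
{
	size_t name_len = strlen(name);
	struct sketch_coll_id *coll_id =
		malloc(sizeof(*coll_id) + name_len + 1);
	if (coll_id == NULL)
		return NULL;
	coll_id->id = id;
	coll_id->owner_id = owner_id;
	coll_id->name_len = name_len;
	memcpy(coll_id->name, name, name_len + 1);
	sketch_coll_ref(coll);
	coll_id->coll = coll;
	return coll_id;
}

/* Delete the identifier and release its reference to the core. */
void
sketch_coll_id_delete(struct sketch_coll_id *coll_id)
{
	sketch_coll_unref(coll_id->coll);
	free(coll_id);
}
```

With this layout, internal code can hold a bare core object without inventing a name or owner for it, while two identifiers that differ only in name, owner, or ID can point at the same core object.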
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 8ab09e968..5bf17614b 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -94,6 +94,8 @@ set (core_sources
random.c
trigger.cc
http_parser.c
+ coll.c
+ coll_def.c
)
if (TARGET_OS_NETBSD)
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index 866b7b75c..e84e791df 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -41,9 +41,9 @@ add_library(tuple STATIC
tuple_bloom.c
tuple_dictionary.c
key_def.cc
- coll_def.c
- coll.c
- coll_cache.c
+ coll_id_def.c
+ coll_id.c
+ coll_id_cache.c
field_def.c
opt_def.c
)
diff --git a/src/box/alter.cc b/src/box/alter.cc
index 8766c8171..7858af989 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -34,7 +34,8 @@
#include "space.h"
#include "index.h"
#include "func.h"
-#include "coll_cache.h"
+#include "coll_id_cache.h"
+#include "coll_id_def.h"
#include "txn.h"
#include "tuple.h"
#include "fiber.h" /* for gc_pool */
@@ -2284,9 +2285,9 @@ on_replace_dd_func(struct trigger * /* trigger */, void *event)
}
}
-/** Create a collation definition from tuple. */
+/** Create a collation identifier definition from tuple. */
void
-coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
+coll_id_def_new_from_tuple(const struct tuple *tuple, struct coll_id_def *def)
{
memset(def, 0, sizeof(*def));
uint32_t name_len, locale_len, type_len;
@@ -2294,15 +2295,16 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
def->name = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_NAME, &name_len);
def->name_len = name_len;
def->owner_id = tuple_field_u32_xc(tuple, BOX_COLLATION_FIELD_UID);
+ struct coll_def *base = &def->base;
const char *type = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_TYPE,
&type_len);
- def->type = STRN2ENUM(coll_type, type, type_len);
- if (def->type == coll_type_MAX)
+ base->type = STRN2ENUM(coll_type, type, type_len);
+ if (base->type == coll_type_MAX)
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"unknown collation type");
- def->locale = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_LOCALE,
- &locale_len);
- def->locale_len = locale_len;
+ base->locale = tuple_field_str_xc(tuple, BOX_COLLATION_FIELD_LOCALE,
+ &locale_len);
+ base->locale_len = locale_len;
const char *options =
tuple_field_with_type_xc(tuple, BOX_COLLATION_FIELD_OPTIONS,
MP_MAP);
@@ -2315,53 +2317,53 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
"collation locale is too long");
/* Locale is an optional argument and can be NULL. */
if (locale_len > 0)
- identifier_check_xc(def->locale, locale_len);
+ identifier_check_xc(base->locale, locale_len);
identifier_check_xc(def->name, name_len);
- assert(def->type == COLL_TYPE_ICU); /* no more defined now */
- if (opts_decode(&def->icu, coll_icu_opts_reg, &options,
+ assert(base->type == COLL_TYPE_ICU);
+ if (opts_decode(&base->icu, coll_icu_opts_reg, &options,
ER_WRONG_COLLATION_OPTIONS,
BOX_COLLATION_FIELD_OPTIONS, NULL) != 0)
diag_raise();
- if (def->icu.french_collation == coll_icu_on_off_MAX) {
+ if (base->icu.french_collation == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong french_collation option setting, "
"expected ON | OFF");
}
- if (def->icu.alternate_handling == coll_icu_alternate_handling_MAX) {
+ if (base->icu.alternate_handling == coll_icu_alternate_handling_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong alternate_handling option setting, "
"expected NON_IGNORABLE | SHIFTED");
}
- if (def->icu.case_first == coll_icu_case_first_MAX) {
+ if (base->icu.case_first == coll_icu_case_first_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong case_first option setting, "
"expected OFF | UPPER_FIRST | LOWER_FIRST");
}
- if (def->icu.case_level == coll_icu_on_off_MAX) {
+ if (base->icu.case_level == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong case_level option setting, "
"expected ON | OFF");
}
- if (def->icu.normalization_mode == coll_icu_on_off_MAX) {
+ if (base->icu.normalization_mode == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong normalization_mode option setting, "
"expected ON | OFF");
}
- if (def->icu.strength == coll_icu_strength_MAX) {
+ if (base->icu.strength == coll_icu_strength_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong strength option setting, "
"expected PRIMARY | SECONDARY | "
"TERTIARY | QUATERNARY | IDENTICAL");
}
- if (def->icu.numeric_collation == coll_icu_on_off_MAX) {
+ if (base->icu.numeric_collation == coll_icu_on_off_MAX) {
tnt_raise(ClientError, ER_CANT_CREATE_COLLATION,
"ICU wrong numeric_collation option setting, "
"expected ON | OFF");
@@ -2373,36 +2375,36 @@ coll_def_new_from_tuple(const struct tuple *tuple, struct coll_def *def)
* A change is only INSERT or DELETE, UPDATE is not supported.
*/
static void
-coll_cache_rollback(struct trigger *trigger, void *event)
+coll_id_cache_rollback(struct trigger *trigger, void *event)
{
- struct coll *coll = (struct coll *) trigger->data;
+ struct coll_id *coll_id = (struct coll_id *) trigger->data;
struct txn_stmt *stmt = txn_last_stmt((struct txn*) event);
if (stmt->new_tuple == NULL) {
- /* Rollback DELETE: put the collation back. */
+ /* DELETE: put the collation identifier back. */
assert(stmt->old_tuple != NULL);
- struct coll *replaced;
- if (coll_cache_replace(coll, &replaced) != 0) {
+ struct coll_id *replaced_id;
+ if (coll_id_cache_replace(coll_id, &replaced_id) != 0) {
panic("Out of memory on insertion into collation "\
"cache");
}
- assert(replaced == NULL);
+ assert(replaced_id == NULL);
} else {
- /* INSERT: remove and free the new collation */
+ /* INSERT: delete the new collation identifier. */
assert(stmt->old_tuple == NULL);
- coll_cache_delete(coll);
- coll_unref(coll);
+ coll_id_cache_delete(coll_id);
+ coll_id_delete(coll_id);
}
}
-/** Dereference a deleted collation on commit. */
+/** Free a deleted collation identifier on commit. */
static void
-coll_cache_commit(struct trigger *trigger, void *event)
+coll_id_cache_commit(struct trigger *trigger, void *event)
{
(void) event;
- struct coll *coll = (struct coll *) trigger->data;
- coll_unref(coll);
+ struct coll_id *coll_id = (struct coll_id *) trigger->data;
+ coll_id_delete(coll_id);
}
/**
@@ -2418,44 +2420,47 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
struct tuple *new_tuple = stmt->new_tuple;
txn_check_singlestatement_xc(txn, "Space _collation");
struct trigger *on_rollback =
- txn_alter_trigger_new(coll_cache_rollback, NULL);
+ txn_alter_trigger_new(coll_id_cache_rollback, NULL);
struct trigger *on_commit =
- txn_alter_trigger_new(coll_cache_commit, NULL);
+ txn_alter_trigger_new(coll_id_cache_commit, NULL);
if (new_tuple == NULL && old_tuple != NULL) {
/* DELETE */
- /* TODO: Check that no index uses the collation */
+ /*
+ * TODO: Check that no index uses the collation
+ * identifier.
+ */
int32_t old_id = tuple_field_u32_xc(old_tuple,
BOX_COLLATION_FIELD_ID);
- struct coll *old_coll = coll_by_id(old_id);
- assert(old_coll != NULL);
- access_check_ddl(old_coll->name, old_coll->owner_id,
+ struct coll_id *old_coll_id = coll_by_id(old_id);
+ assert(old_coll_id != NULL);
+ access_check_ddl(old_coll_id->name, old_coll_id->owner_id,
SC_COLLATION, PRIV_D, false);
/*
* Set on_commit/on_rollback triggers after
* deletion from the cache to make trigger logic
- * simple..
+ * simple.
*/
- coll_cache_delete(old_coll);
- on_rollback->data = old_coll;
- on_commit->data = old_coll;
+ coll_id_cache_delete(old_coll_id);
+ on_rollback->data = old_coll_id;
+ on_commit->data = old_coll_id;
txn_on_rollback(txn, on_rollback);
txn_on_commit(txn, on_commit);
} else if (new_tuple != NULL && old_tuple == NULL) {
/* INSERT */
- struct coll_def new_def;
- coll_def_new_from_tuple(new_tuple, &new_def);
+ struct coll_id_def new_def;
+ coll_id_def_new_from_tuple(new_tuple, &new_def);
access_check_ddl(new_def.name, new_def.owner_id, SC_COLLATION,
PRIV_C, false);
- struct coll *new_coll = coll_new(&new_def);
- if (new_coll == NULL)
+ struct coll_id *new_coll_id = coll_id_new(&new_def);
+ if (new_coll_id == NULL)
diag_raise();
- struct coll *replaced;
- if (coll_cache_replace(new_coll, &replaced) != 0) {
- coll_unref(new_coll);
+ struct coll_id *replaced_id;
+ if (coll_id_cache_replace(new_coll_id, &replaced_id) != 0) {
+ coll_id_delete(new_coll_id);
diag_raise();
}
- assert(replaced == NULL);
- on_rollback->data = new_coll;
+ assert(replaced_id == NULL);
+ on_rollback->data = new_coll_id;
txn_on_rollback(txn, on_rollback);
} else {
/* UPDATE */
diff --git a/src/box/coll_id.c b/src/box/coll_id.c
new file mode 100644
index 000000000..2d5f8a09a
--- /dev/null
+++ b/src/box/coll_id.c
@@ -0,0 +1,65 @@
+/*
+ * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "coll_id.h"
+#include "coll_id_def.h"
+#include "coll.h"
+#include "error.h"
+#include "diag.h"
+
+struct coll_id *
+coll_id_new(const struct coll_id_def *def)
+{
+ assert(def->base.type == COLL_TYPE_ICU);
+ size_t total_len = sizeof(struct coll_id) + def->name_len + 1;
+ struct coll_id *coll_id = (struct coll_id *) malloc(total_len);
+ if (coll_id == NULL) {
+ diag_set(OutOfMemory, total_len, "malloc", "coll_id");
+ return NULL;
+ }
+ coll_id->coll = coll_new(&def->base);
+ if (coll_id->coll == NULL) {
+ free(coll_id);
+ return NULL;
+ }
+ coll_id->id = def->id;
+ coll_id->owner_id = def->owner_id;
+ coll_id->name_len = def->name_len;
+ memcpy(coll_id->name, def->name, def->name_len);
+ coll_id->name[coll_id->name_len] = 0;
+ return coll_id;
+}
+
+void
+coll_id_delete(struct coll_id *coll_id)
+{
+ coll_unref(coll_id->coll);
+ free(coll_id);
+}
diff --git a/src/box/coll_id.h b/src/box/coll_id.h
new file mode 100644
index 000000000..1b67a3f86
--- /dev/null
+++ b/src/box/coll_id.h
@@ -0,0 +1,77 @@
+#ifndef TARANTOOL_BOX_COLL_ID_H_INCLUDED
+#define TARANTOOL_BOX_COLL_ID_H_INCLUDED
+/*
+ * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include <stddef.h>
+#include <stdint.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+struct coll_id_def;
+struct coll;
+
+/**
+ * A collation identifier. It gives a name, owner and unique
+ * identifier to a base collation. Multiple coll_id can reference
+ * the same collation if their functional parts match.
+ */
+struct coll_id {
+ /** Personal ID */
+ uint32_t id;
+ /** Owner ID */
+ uint32_t owner_id;
+ /** Collation object. */
+ struct coll *coll;
+ /** Collation name. */
+ size_t name_len;
+ char name[0];
+};
+
+/**
+ * Create a collation identifier by definition.
+ * @param def Collation identifier definition.
+ * @retval NULL Illegal parameters or memory error.
+ * @retval not NULL Collation identifier.
+ */
+struct coll_id *
+coll_id_new(const struct coll_id_def *def);
+
+/** Delete collation identifier, unref the basic collation. */
+void
+coll_id_delete(struct coll_id *coll_id);
+
+#if defined(__cplusplus)
+} /* extern "C" */
+#endif /* defined(__cplusplus) */
+
+#endif /* TARANTOOL_BOX_COLL_ID_H_INCLUDED */
diff --git a/src/box/coll_cache.c b/src/box/coll_id_cache.c
similarity index 56%
rename from src/box/coll_cache.c
rename to src/box/coll_id_cache.c
index b7eb3edb9..122863937 100644
--- a/src/box/coll_cache.c
+++ b/src/box/coll_id_cache.c
@@ -28,75 +28,63 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-#include "coll_cache.h"
+#include "coll_id_cache.h"
+#include "coll_id.h"
#include "diag.h"
#include "assoc.h"
/** mhash table (id -> collation) */
-static struct mh_i32ptr_t *coll_cache_id = NULL;
+static struct mh_i32ptr_t *coll_id_cache = NULL;
-/** Create global hash tables if necessary. */
int
-coll_cache_init()
+coll_id_cache_init()
{
- coll_cache_id = mh_i32ptr_new();
- if (coll_cache_id == NULL) {
- diag_set(OutOfMemory, sizeof(*coll_cache_id), "malloc",
- "coll_cache_id");
+ coll_id_cache = mh_i32ptr_new();
+ if (coll_id_cache == NULL) {
+ diag_set(OutOfMemory, sizeof(*coll_id_cache), "malloc",
+ "coll_id_cache");
return -1;
}
return 0;
}
-/** Delete global hash tables. */
void
-coll_cache_destroy()
+coll_id_cache_destroy()
{
- mh_i32ptr_delete(coll_cache_id);
+ mh_i32ptr_delete(coll_id_cache);
}
-/**
- * Insert or replace a collation into collation cache.
- * @param coll - collation to insert/replace.
- * @return - NULL if inserted, replaced collation if replaced.
- */
int
-coll_cache_replace(struct coll *coll, struct coll **replaced)
+coll_id_cache_replace(struct coll_id *coll_id, struct coll_id **replaced_id)
{
- const struct mh_i32ptr_node_t id_node = {coll->id, coll};
+ const struct mh_i32ptr_node_t id_node = {coll_id->id, coll_id};
struct mh_i32ptr_node_t repl_id_node = {0, NULL};
struct mh_i32ptr_node_t *prepl_id_node = &repl_id_node;
- if (mh_i32ptr_put(coll_cache_id, &id_node, &prepl_id_node, NULL) ==
- mh_end(coll_cache_id)) {
- diag_set(OutOfMemory, sizeof(id_node), "malloc", "coll_cache_id");
+ if (mh_i32ptr_put(coll_id_cache, &id_node, &prepl_id_node, NULL) ==
+ mh_end(coll_id_cache)) {
+ diag_set(OutOfMemory, sizeof(id_node), "malloc",
+ "coll_id_cache");
return -1;
}
assert(repl_id_node.val == NULL);
- *replaced = repl_id_node.val;
+ *replaced_id = repl_id_node.val;
return 0;
}
-/**
- * Delete a collation from collation cache.
- * @param coll - collation to delete.
- */
void
-coll_cache_delete(const struct coll *coll)
+coll_id_cache_delete(const struct coll_id *coll_id)
{
- mh_int_t i = mh_i32ptr_find(coll_cache_id, coll->id, NULL);
- if (i == mh_end(coll_cache_id))
+ mh_int_t i = mh_i32ptr_find(coll_id_cache, coll_id->id, NULL);
+ if (i == mh_end(coll_id_cache))
return;
- mh_i32ptr_del(coll_cache_id, i, NULL);
+ mh_i32ptr_del(coll_id_cache, i, NULL);
}
-/**
- * Find a collation object by its id.
- */
-struct coll *
+struct coll_id *
coll_by_id(uint32_t id)
{
- mh_int_t pos = mh_i32ptr_find(coll_cache_id, id, NULL);
- if (pos == mh_end(coll_cache_id))
+ mh_int_t pos = mh_i32ptr_find(coll_id_cache, id, NULL);
+ if (pos == mh_end(coll_id_cache))
return NULL;
- return mh_i32ptr_node(coll_cache_id, pos)->val;
+ return mh_i32ptr_node(coll_id_cache, pos)->val;
}
diff --git a/src/box/coll_cache.h b/src/box/coll_id_cache.h
similarity index 78%
rename from src/box/coll_cache.h
rename to src/box/coll_id_cache.h
index 418de4e35..4bbbc85de 100644
--- a/src/box/coll_cache.h
+++ b/src/box/coll_id_cache.h
@@ -1,5 +1,5 @@
-#ifndef TARANTOOL_BOX_COLL_CACHE_H_INCLUDED
-#define TARANTOOL_BOX_COLL_CACHE_H_INCLUDED
+#ifndef TARANTOOL_BOX_COLL_ID_CACHE_H_INCLUDED
+#define TARANTOOL_BOX_COLL_ID_CACHE_H_INCLUDED
/*
* Copyright 2010-2016, Tarantool AUTHORS, please see AUTHORS file.
*
@@ -30,48 +30,49 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
-#include "coll.h"
+#include <stdint.h>
#if defined(__cplusplus)
extern "C" {
#endif /* defined(__cplusplus) */
+struct coll_id;
+
/**
* Create global hash tables.
* @return - 0 on success, -1 on memory error.
*/
int
-coll_cache_init();
+coll_id_cache_init();
/** Delete global hash tables. */
void
-coll_cache_destroy();
+coll_id_cache_destroy();
/**
* Insert or replace a collation into collation cache.
- * @param coll - collation to insert/replace.
- * @param replaced - collation that was replaced.
+ * @param coll_id Collation to insert/replace.
+ * @param replaced_id Collation that was replaced.
* @return - 0 on success, -1 on memory error.
*/
int
-coll_cache_replace(struct coll *coll, struct coll **replaced);
+coll_id_cache_replace(struct coll_id *coll_id, struct coll_id **replaced_id);
/**
* Delete a collation from collation cache.
- * @param coll - collation to delete.
+ * @param coll_id Collation to delete.
*/
void
-coll_cache_delete(const struct coll *coll);
+coll_id_cache_delete(const struct coll_id *coll_id);
/**
* Find a collation object by its id.
*/
-struct coll *
+struct coll_id *
coll_by_id(uint32_t id);
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
-#endif /* TARANTOOL_BOX_COLL_CACHE_H_INCLUDED */
+#endif /* TARANTOOL_BOX_COLL_ID_CACHE_H_INCLUDED */
diff --git a/src/box/coll_def.c b/src/box/coll_id_def.c
similarity index 86%
rename from src/box/coll_def.c
rename to src/box/coll_id_def.c
index f849845b3..9fe0cda8c 100644
--- a/src/box/coll_def.c
+++ b/src/box/coll_id_def.c
@@ -29,39 +29,7 @@
* SUCH DAMAGE.
*/
-#include "coll_def.h"
-
-const char *coll_type_strs[] = {
- "ICU"
-};
-
-const char *coll_icu_on_off_strs[] = {
- "DEFAULT",
- "ON",
- "OFF"
-};
-
-const char *coll_icu_alternate_handling_strs[] = {
- "DEFAULT",
- "NON_IGNORABLE",
- "SHIFTED"
-};
-
-const char *coll_icu_case_first_strs[] = {
- "DEFAULT",
- "OFF",
- "UPPER_FIRST",
- "LOWER_FIRST"
-};
-
-const char *coll_icu_strength_strs[] = {
- "DEFAULT",
- "PRIMARY",
- "SECONDARY",
- "TERTIARY",
- "QUATERNARY",
- "IDENTICAL"
-};
+#include "coll_id_def.h"
static int64_t
icu_on_off_from_str(const char *str, uint32_t len)
diff --git a/src/box/coll_id_def.h b/src/box/coll_id_def.h
new file mode 100644
index 000000000..489280c00
--- /dev/null
+++ b/src/box/coll_id_def.h
@@ -0,0 +1,54 @@
+#ifndef TARANTOOL_BOX_COLL_ID_DEF_H_INCLUDED
+#define TARANTOOL_BOX_COLL_ID_DEF_H_INCLUDED
+/*
+ * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stddef.h>
+#include <stdint.h>
+#include <coll_def.h>
+#include "opt_def.h"
+
+/** Collation identifier definition. */
+struct coll_id_def {
+ /** Personal ID */
+ uint32_t id;
+ /** Owner ID */
+ uint32_t owner_id;
+ /** Collation name. */
+ size_t name_len;
+ const char *name;
+ /** Core collation definition. */
+ struct coll_def base;
+};
+
+extern const struct opt_def coll_icu_opts_reg[];
+
+#endif /* TARANTOOL_BOX_COLL_ID_DEF_H_INCLUDED */
diff --git a/src/box/error.cc b/src/box/error.cc
index 99f519537..6b14dff05 100644
--- a/src/box/error.cc
+++ b/src/box/error.cc
@@ -155,6 +155,8 @@ ClientError::get_errcode(const struct error *e)
return ER_MEMORY_ISSUE;
if (type_cast(SystemError, e))
return ER_SYSTEM;
+ if (type_cast(CollationError, e))
+ return ER_CANT_CREATE_COLLATION;
return ER_PROC_LUA;
}
diff --git a/src/box/key_def.cc b/src/box/key_def.cc
index 45997ae83..ee09dc99d 100644
--- a/src/box/key_def.cc
+++ b/src/box/key_def.cc
@@ -34,7 +34,7 @@
#include "tuple_hash.h"
#include "column_mask.h"
#include "schema_def.h"
-#include "coll_cache.h"
+#include "coll_id_cache.h"
static const struct key_part_def key_part_def_default = {
0,
@@ -156,16 +156,17 @@ key_def_new_with_parts(struct key_part_def *parts, uint32_t part_count)
struct key_part_def *part = &parts[i];
struct coll *coll = NULL;
if (part->coll_id != COLL_NONE) {
- coll = coll_by_id(part->coll_id);
- if (coll == NULL) {
+ struct coll_id *coll_id = coll_by_id(part->coll_id);
+ if (coll_id == NULL) {
diag_set(ClientError, ER_WRONG_INDEX_OPTIONS,
i + 1, "collation was not found by ID");
key_def_delete(def);
return NULL;
}
+ coll = coll_id->coll;
}
key_def_set_part(def, i, part->fieldno, part->type,
- part->is_nullable, coll);
+ part->is_nullable, coll, part->coll_id);
}
return def;
}
@@ -179,8 +180,7 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts)
part_def->fieldno = part->fieldno;
part_def->type = part->type;
part_def->is_nullable = part->is_nullable;
- part_def->coll_id = (part->coll != NULL ?
- part->coll->id : COLL_NONE);
+ part_def->coll_id = part->coll_id;
}
}
@@ -194,7 +194,8 @@ box_key_def_new(uint32_t *fields, uint32_t *types, uint32_t part_count)
for (uint32_t item = 0; item < part_count; ++item) {
key_def_set_part(key_def, item, fields[item],
(enum field_type)types[item],
- key_part_def_default.is_nullable, NULL);
+ key_part_def_default.is_nullable, NULL,
+ COLL_NONE);
}
return key_def;
}
@@ -246,7 +247,8 @@ key_part_cmp(const struct key_part *parts1, uint32_t part_count1,
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
- enum field_type type, bool is_nullable, struct coll *coll)
+ enum field_type type, bool is_nullable, struct coll *coll,
+ uint32_t coll_id)
{
assert(part_no < def->part_count);
assert(type < field_type_MAX);
@@ -255,6 +257,7 @@ key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
def->parts[part_no].fieldno = fieldno;
def->parts[part_no].type = type;
def->parts[part_no].coll = coll;
+ def->parts[part_no].coll_id = coll_id;
column_mask_set_fieldno(&def->column_mask, fieldno);
/**
* When all parts are set, initialize the tuple
@@ -554,7 +557,7 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
end = part + first->part_count;
for (; part != end; part++) {
key_def_set_part(new_def, pos++, part->fieldno, part->type,
- part->is_nullable, part->coll);
+ part->is_nullable, part->coll, part->coll_id);
}
/* Set-append second key def's part to the new key def. */
@@ -564,7 +567,7 @@ key_def_merge(const struct key_def *first, const struct key_def *second)
if (key_def_find(first, part->fieldno))
continue;
key_def_set_part(new_def, pos++, part->fieldno, part->type,
- part->is_nullable, part->coll);
+ part->is_nullable, part->coll, part->coll_id);
}
return new_def;
}
diff --git a/src/box/key_def.h b/src/box/key_def.h
index 12016a51a..aecbe0345 100644
--- a/src/box/key_def.h
+++ b/src/box/key_def.h
@@ -36,7 +36,7 @@
#include <msgpuck.h>
#include <limits.h>
#include "field_def.h"
-#include "coll.h"
+#include "coll_id.h"
#if defined(__cplusplus)
extern "C" {
@@ -68,6 +68,8 @@ struct key_part {
uint32_t fieldno;
/** Type of the tuple field */
enum field_type type;
+ /** Collation ID for string comparison. */
+ uint32_t coll_id;
/** Collation definition for string comparison */
struct coll *coll;
/** True if a part can store NULLs. */
@@ -249,7 +251,8 @@ key_def_dump_parts(const struct key_def *def, struct key_part_def *parts);
*/
void
key_def_set_part(struct key_def *def, uint32_t part_no, uint32_t fieldno,
- enum field_type type, bool is_nullable, struct coll *coll);
+ enum field_type type, bool is_nullable, struct coll *coll,
+ uint32_t coll_id);
/**
* Update 'has_optional_parts' of @a key_def with correspondence
diff --git a/src/box/lua/space.cc b/src/box/lua/space.cc
index 333b6370f..524382750 100644
--- a/src/box/lua/space.cc
+++ b/src/box/lua/space.cc
@@ -46,6 +46,7 @@ extern "C" {
#include "box/txn.h"
#include "box/vclock.h" /* VCLOCK_MAX */
#include "box/sequence.h"
+#include "box/coll_id_cache.h"
/**
* Trigger function for all spaces
@@ -291,8 +292,11 @@ lbox_fillspace(struct lua_State *L, struct space *space, int i)
lua_pushboolean(L, part->is_nullable);
lua_setfield(L, -2, "is_nullable");
- if (part->coll != NULL) {
- lua_pushstring(L, part->coll->name);
+ if (part->coll_id != COLL_NONE) {
+ struct coll_id *coll_id =
+ coll_by_id(part->coll_id);
+ assert(coll_id != NULL);
+ lua_pushstring(L, coll_id->name);
lua_setfield(L, -2, "collation");
}
diff --git a/src/box/schema.cc b/src/box/schema.cc
index 1b96f978c..8df4aa73b 100644
--- a/src/box/schema.cc
+++ b/src/box/schema.cc
@@ -281,13 +281,13 @@ schema_init()
auto key_def_guard = make_scoped_guard([&] { key_def_delete(key_def); });
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_STRING, false, NULL);
+ FIELD_TYPE_STRING, false, NULL, COLL_NONE);
sc_space_new(BOX_SCHEMA_ID, "_schema", key_def, &on_replace_schema,
NULL);
/* _space - home for all spaces. */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
/* _collation - collation description. */
sc_space_new(BOX_COLLATION_ID, "_collation", key_def,
@@ -335,10 +335,10 @@ schema_init()
diag_raise();
/* space no */
key_def_set_part(key_def, 0 /* part no */, 0 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
/* index no */
key_def_set_part(key_def, 1 /* part no */, 1 /* field no */,
- FIELD_TYPE_UNSIGNED, false, NULL);
+ FIELD_TYPE_UNSIGNED, false, NULL, COLL_NONE);
sc_space_new(BOX_INDEX_ID, "_index", key_def,
&alter_space_on_replace_index, &on_stmt_begin_index);
}
diff --git a/src/box/tuple.c b/src/box/tuple.c
index d4760f3b1..cf3929734 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -38,7 +38,7 @@
#include "small/small.h"
#include "tuple_update.h"
-#include "coll_cache.h"
+#include "coll_id_cache.h"
static struct mempool tuple_iterator_pool;
static struct small_alloc runtime_alloc;
@@ -207,7 +207,7 @@ tuple_init(field_name_hash_f hash)
box_tuple_last = NULL;
- if (coll_cache_init() != 0)
+ if (coll_id_cache_init() != 0)
return -1;
return 0;
@@ -260,7 +260,7 @@ tuple_free(void)
tuple_format_free();
- coll_cache_destroy();
+ coll_id_cache_destroy();
}
box_tuple_format_t *
diff --git a/src/box/tuple_compare.cc b/src/box/tuple_compare.cc
index cfee00496..e53afba42 100644
--- a/src/box/tuple_compare.cc
+++ b/src/box/tuple_compare.cc
@@ -30,9 +30,9 @@
*/
#include "tuple_compare.h"
#include "tuple.h"
+#include "coll.h"
#include "trivia/util.h" /* NOINLINE */
#include <math.h>
-#include "coll_def.h"
/* {{{ tuple_compare */
@@ -295,8 +295,7 @@ mp_compare_str(const char *field_a, const char *field_b)
}
static inline int
-mp_compare_str_coll(const char *field_a, const char *field_b,
- struct coll *coll)
+mp_compare_str_coll(const char *field_a, const char *field_b, struct coll *coll)
{
uint32_t size_a = mp_decode_strl(&field_a);
uint32_t size_b = mp_decode_strl(&field_b);
diff --git a/src/box/tuple_hash.cc b/src/box/tuple_hash.cc
index 0fa8ea561..dee9be328 100644
--- a/src/box/tuple_hash.cc
+++ b/src/box/tuple_hash.cc
@@ -30,7 +30,6 @@
*/
#include "tuple_hash.h"
-
#include "third_party/PMurHash.h"
#include "coll.h"
diff --git a/src/box/coll.c b/src/coll.c
similarity index 63%
rename from src/box/coll.c
rename to src/coll.c
index 436d8d127..66afa6c4f 100644
--- a/src/box/coll.c
+++ b/src/coll.c
@@ -1,5 +1,5 @@
/*
- * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
@@ -31,19 +31,18 @@
#include "coll.h"
#include "third_party/PMurHash.h"
-#include "error.h"
#include "diag.h"
#include <unicode/ucol.h>
#include <trivia/config.h>
enum {
- MAX_HASH_BUFFER = 1024,
MAX_LOCALE = 1024,
};
-/**
- * Compare two string using ICU collation.
- */
+static_assert(MAX_LOCALE <= TT_STATIC_BUF_LEN,
+ "static buf is used to 0-terminate locale name");
+
+/** Compare two string using ICU collation. */
static int
coll_icu_cmp(const char *s, size_t slen, const char *t, size_t tlen,
const struct coll *coll)
@@ -66,9 +65,7 @@ coll_icu_cmp(const char *s, size_t slen, const char *t, size_t tlen,
return (int)result;
}
-/**
- * Get a hash of a string using ICU collation.
- */
+/** Get a hash of a string using ICU collation. */
static uint32_t
coll_icu_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
struct coll *coll)
@@ -76,115 +73,103 @@ coll_icu_hash(const char *s, size_t s_len, uint32_t *ph, uint32_t *pcarry,
uint32_t total_size = 0;
UCharIterator itr;
uiter_setUTF8(&itr, s, s_len);
- uint8_t buf[MAX_HASH_BUFFER];
+ uint8_t *buf = (uint8_t *) tt_static_buf();
uint32_t state[2] = {0, 0};
UErrorCode status = U_ZERO_ERROR;
- while (true) {
- int32_t got = ucol_nextSortKeyPart(coll->icu.collator,
- &itr, state, buf,
- MAX_HASH_BUFFER, &status);
+ int32_t got;
+ do {
+ got = ucol_nextSortKeyPart(coll->icu.collator, &itr, state, buf,
+ TT_STATIC_BUF_LEN, &status);
PMurHash32_Process(ph, pcarry, buf, got);
total_size += got;
- if (got < MAX_HASH_BUFFER)
- break;
- }
+ } while (got == TT_STATIC_BUF_LEN);
return total_size;
}
/**
* Set up ICU collator and init cmp and hash members of collation.
- * @param coll - collation to set up.
- * @param def - collation definition.
- * @return 0 on success, -1 on error.
+ * @param coll Collation to set up.
+ * @param def Collation definition.
+ * @retval 0 Success.
+ * @retval -1 Collation error.
*/
static int
coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
{
- if (coll->icu.collator != NULL) {
- ucol_close(coll->icu.collator);
- coll->icu.collator = NULL;
- }
-
if (def->locale_len >= MAX_LOCALE) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "too long locale");
+ diag_set(CollationError, "too long locale");
return -1;
}
- char locale[MAX_LOCALE];
+ char *locale = tt_static_buf();
memcpy(locale, def->locale, def->locale_len);
locale[def->locale_len] = '\0';
UErrorCode status = U_ZERO_ERROR;
struct UCollator *collator = ucol_open(locale, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- u_errorName(status));
+ diag_set(CollationError, u_errorName(status));
return -1;
}
coll->icu.collator = collator;
if (def->icu.french_collation != COLL_ICU_DEFAULT) {
enum coll_icu_on_off w = def->icu.french_collation;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF :
+ UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_FRENCH_COLLATION, v, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set french_collation");
+ diag_set(CollationError, "failed to set "\
+ "french_collation: %s", u_errorName(status));
return -1;
}
}
if (def->icu.alternate_handling != COLL_ICU_AH_DEFAULT) {
- enum coll_icu_alternate_handling w = def->icu.alternate_handling;
+ enum coll_icu_alternate_handling w =
+ def->icu.alternate_handling;
UColAttributeValue v =
w == COLL_ICU_AH_NON_IGNORABLE ? UCOL_NON_IGNORABLE :
- w == COLL_ICU_AH_SHIFTED ? UCOL_SHIFTED :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, v, &status);
+ w == COLL_ICU_AH_SHIFTED ? UCOL_SHIFTED : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_ALTERNATE_HANDLING, v,
+ &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set alternate_handling");
+ diag_set(CollationError, "failed to set "\
+ "alternate_handling: %s", u_errorName(status));
return -1;
}
}
if (def->icu.case_first != COLL_ICU_CF_DEFAULT) {
enum coll_icu_case_first w = def->icu.case_first;
- UColAttributeValue v =
- w == COLL_ICU_CF_OFF ? UCOL_OFF :
+ UColAttributeValue v = w == COLL_ICU_CF_OFF ? UCOL_OFF :
w == COLL_ICU_CF_UPPER_FIRST ? UCOL_UPPER_FIRST :
w == COLL_ICU_CF_LOWER_FIRST ? UCOL_LOWER_FIRST :
UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_CASE_FIRST, v, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set case_first");
+ diag_set(CollationError, "failed to set case_first: "\
+ "%s", u_errorName(status));
return -1;
}
}
if (def->icu.case_level != COLL_ICU_DEFAULT) {
enum coll_icu_on_off w = def->icu.case_level;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_CASE_LEVEL , v, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set case_level");
+ diag_set(CollationError, "failed to set case_level: "\
+ "%s", u_errorName(status));
return -1;
}
}
if (def->icu.normalization_mode != COLL_ICU_DEFAULT) {
enum coll_icu_on_off w = def->icu.normalization_mode;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
- ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, v, &status);
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
+ ucol_setAttribute(collator, UCOL_NORMALIZATION_MODE, v,
+ &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set normalization_mode");
+ diag_set(CollationError, "failed to set "\
+ "normalization_mode: %s", u_errorName(status));
return -1;
}
}
@@ -199,81 +184,51 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_STRENGTH, v, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set strength");
+ diag_set(CollationError, "failed to set strength: %s",
+ u_errorName(status));
return -1;
}
}
if (def->icu.numeric_collation != COLL_ICU_DEFAULT) {
enum coll_icu_on_off w = def->icu.numeric_collation;
- UColAttributeValue v =
- w == COLL_ICU_ON ? UCOL_ON :
- w == COLL_ICU_OFF ? UCOL_OFF :
- UCOL_DEFAULT;
+ UColAttributeValue v = w == COLL_ICU_ON ? UCOL_ON :
+ w == COLL_ICU_OFF ? UCOL_OFF : UCOL_DEFAULT;
ucol_setAttribute(collator, UCOL_NUMERIC_COLLATION, v, &status);
if (U_FAILURE(status)) {
- diag_set(ClientError, ER_CANT_CREATE_COLLATION,
- "failed to set numeric_collation");
+ diag_set(CollationError, "failed to set "\
+ "numeric_collation: %s", u_errorName(status));
return -1;
}
}
-
coll->cmp = coll_icu_cmp;
coll->hash = coll_icu_hash;
return 0;
}
-/**
- * Destroy ICU collation.
- */
-static void
-coll_icu_destroy(struct coll *coll)
-{
- if (coll->icu.collator != NULL)
- ucol_close(coll->icu.collator);
-}
-
-/**
- * Create a collation by definition.
- * @param def - collation definition.
- * @return - the collation OR NULL on memory error (diag is set).
- */
struct coll *
coll_new(const struct coll_def *def)
{
- assert(def->type == COLL_TYPE_ICU); /* no more types are implemented yet */
-
- size_t total_len = sizeof(struct coll) + def->name_len + 1;
- struct coll *coll = (struct coll *)calloc(1, total_len);
+ assert(def->type == COLL_TYPE_ICU);
+ struct coll *coll = (struct coll *) malloc(sizeof(*coll));
if (coll == NULL) {
- diag_set(OutOfMemory, total_len, "malloc", "struct coll");
+ diag_set(OutOfMemory, sizeof(*coll), "malloc", "coll");
return NULL;
}
-
coll->refs = 1;
- coll->id = def->id;
- coll->owner_id = def->owner_id;
coll->type = def->type;
- coll->name_len = def->name_len;
- memcpy(coll->name, def->name, def->name_len);
- coll->name[coll->name_len] = 0;
-
if (coll_icu_init_cmp(coll, def) != 0) {
free(coll);
return NULL;
}
-
return coll;
}
void
coll_unref(struct coll *coll)
{
- /* No more types are implemented yet. */
- assert(coll->type == COLL_TYPE_ICU);
assert(coll->refs > 0);
if (--coll->refs == 0) {
- coll_icu_destroy(coll);
+ ucol_close(coll->icu.collator);
free(coll);
}
}
diff --git a/src/box/coll.h b/src/coll.h
similarity index 74%
rename from src/box/coll.h
rename to src/coll.h
index 248500ab4..cc834f446 100644
--- a/src/box/coll.h
+++ b/src/coll.h
@@ -1,7 +1,7 @@
-#ifndef TARANTOOL_BOX_COLL_H_INCLUDED
-#define TARANTOOL_BOX_COLL_H_INCLUDED
+#ifndef TARANTOOL_COLL_H_INCLUDED
+#define TARANTOOL_COLL_H_INCLUDED
/*
- * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
@@ -41,17 +41,13 @@ extern "C" {
struct coll;
-typedef int (*coll_cmp_f)(const char *s, size_t s_len,
- const char *t, size_t t_len,
- const struct coll *coll);
+typedef int (*coll_cmp_f)(const char *s, size_t s_len, const char *t,
+ size_t t_len, const struct coll *coll);
-typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len,
- uint32_t *ph, uint32_t *pcarry,
- struct coll *coll);
+typedef uint32_t (*coll_hash_f)(const char *s, size_t s_len, uint32_t *ph,
+ uint32_t *pcarry, struct coll *coll);
-/**
- * ICU collation specific data.
- */
+/** ICU collation specific data. */
struct UCollator;
struct coll_icu {
@@ -59,13 +55,10 @@ struct coll_icu {
};
/**
- * A collation.
+ * Collation. It has no unique features like name, id or owner.
+ * Only functional part - comparator, locale, ICU settings.
*/
struct coll {
- /** Personal ID */
- uint32_t id;
- /** Owner ID */
- uint32_t owner_id;
/** Collation type. */
enum coll_type type;
/** Type specific data. */
@@ -75,15 +68,13 @@ struct coll {
coll_hash_f hash;
/** Reference counter. */
int refs;
- /** Collation name. */
- size_t name_len;
- char name[0];
};
/**
* Create a collation by definition.
- * @param def - collation definition.
- * @return - the collation OR NULL on memory error (diag is set).
+ * @param def Collation definition.
+ * @retval NULL Collation or memory error.
+ * @retval not NULL Collation.
*/
struct coll *
coll_new(const struct coll_def *def);
@@ -103,4 +94,4 @@ coll_unref(struct coll *coll);
} /* extern "C" */
#endif /* defined(__cplusplus) */
-#endif /* TARANTOOL_BOX_COLL_H_INCLUDED */
+#endif /* TARANTOOL_COLL_H_INCLUDED */
diff --git a/src/coll_def.c b/src/coll_def.c
new file mode 100644
index 000000000..df58caca8
--- /dev/null
+++ b/src/coll_def.c
@@ -0,0 +1,63 @@
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include "coll_def.h"
+
+const char *coll_type_strs[] = {
+ "ICU"
+};
+
+const char *coll_icu_on_off_strs[] = {
+ "DEFAULT",
+ "ON",
+ "OFF"
+};
+
+const char *coll_icu_alternate_handling_strs[] = {
+ "DEFAULT",
+ "NON_IGNORABLE",
+ "SHIFTED"
+};
+
+const char *coll_icu_case_first_strs[] = {
+ "DEFAULT",
+ "OFF",
+ "UPPER_FIRST",
+ "LOWER_FIRST"
+};
+
+const char *coll_icu_strength_strs[] = {
+ "DEFAULT",
+ "PRIMARY",
+ "SECONDARY",
+ "TERTIARY",
+ "QUATERNARY",
+ "IDENTICAL"
+};
diff --git a/src/box/coll_def.h b/src/coll_def.h
similarity index 82%
rename from src/box/coll_def.h
rename to src/coll_def.h
index 7a1027a1e..10dbc860e 100644
--- a/src/box/coll_def.h
+++ b/src/coll_def.h
@@ -1,7 +1,7 @@
-#ifndef TARANTOOL_BOX_COLL_DEF_H_INCLUDED
-#define TARANTOOL_BOX_COLL_DEF_H_INCLUDED
+#ifndef TARANTOOL_COLL_DEF_H_INCLUDED
+#define TARANTOOL_COLL_DEF_H_INCLUDED
/*
- * Copyright 2010-2017, Tarantool AUTHORS, please see AUTHORS file.
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
*
* Redistribution and use in source and binary forms, with or
* without modification, are permitted provided that the following
@@ -30,18 +30,10 @@
* THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*/
-
#include <stddef.h>
#include <stdint.h>
-#include "opt_def.h"
-
-#if defined(__cplusplus)
-extern "C" {
-#endif /* defined(__cplusplus) */
-/**
- * The supported collation types
- */
+/** The supported collation types */
enum coll_type {
COLL_TYPE_ICU = 0,
coll_type_MAX,
@@ -109,17 +101,8 @@ struct coll_icu_def {
enum coll_icu_on_off numeric_collation;
};
-/**
- * Definition of a collation.
- */
+/** Collation definition. */
struct coll_def {
- /** Perconal ID */
- uint32_t id;
- /** Owner ID */
- uint32_t owner_id;
- /** Collation name. */
- size_t name_len;
- const char *name;
/** Locale. */
size_t locale_len;
const char *locale;
@@ -129,10 +112,4 @@ struct coll_def {
struct coll_icu_def icu;
};
-extern const struct opt_def coll_icu_opts_reg[];
-
-#if defined(__cplusplus)
-} /* extern "C" */
-#endif /* defined(__cplusplus) */
-
-#endif /* TARANTOOL_BOX_COLL_DEF_H_INCLUDED */
+#endif /* TARANTOOL_COLL_DEF_H_INCLUDED */
diff --git a/src/diag.h b/src/diag.h
index dc6c132d5..bd5a539b0 100644
--- a/src/diag.h
+++ b/src/diag.h
@@ -249,6 +249,8 @@ struct error *
BuildSystemError(const char *file, unsigned line, const char *format, ...);
struct error *
BuildXlogError(const char *file, unsigned line, const char *format, ...);
+struct error *
+BuildCollationError(const char *file, unsigned line, const char *format, ...);
struct index_def;
diff --git a/src/exception.cc b/src/exception.cc
index 56077f76d..1cbf8852f 100644
--- a/src/exception.cc
+++ b/src/exception.cc
@@ -235,6 +235,18 @@ IllegalParams::IllegalParams(const char *file, unsigned line,
va_end(ap);
}
+const struct type_info type_CollationError =
+ make_type("CollationError", &type_Exception);
+
+CollationError::CollationError(const char *file, unsigned line,
+ const char *format, ...)
+ : Exception(&type_CollationError, file, line)
+{
+ va_list ap;
+ va_start(ap, format);
+ error_vformat_msg(this, format, ap);
+ va_end(ap);
+}
#define BuildAlloc(type) \
void *p = malloc(sizeof(type)); \
@@ -303,6 +315,18 @@ BuildSystemError(const char *file, unsigned line, const char *format, ...)
return e;
}
+struct error *
+BuildCollationError(const char *file, unsigned line, const char *format, ...)
+{
+ BuildAlloc(CollationError);
+ CollationError *e = new (p) CollationError(file, line, "");
+ va_list ap;
+ va_start(ap, format);
+ error_vformat_msg(e, format, ap);
+ va_end(ap);
+ return e;
+}
+
void
exception_init()
{
diff --git a/src/exception.h b/src/exception.h
index fe7ab84f0..f56616b68 100644
--- a/src/exception.h
+++ b/src/exception.h
@@ -49,6 +49,7 @@ extern const struct type_info type_ChannelIsClosed;
extern const struct type_info type_LuajitError;
extern const struct type_info type_IllegalParams;
extern const struct type_info type_SystemError;
+extern const struct type_info type_CollationError;
const char *
exception_get_string(struct error *e, const struct method_info *method);
@@ -139,6 +140,14 @@ public:
IllegalParams(const char *file, unsigned line, const char *format, ...);
virtual void raise() { throw this; }
};
+
+class CollationError: public Exception {
+public:
+ CollationError(const char *file, unsigned line, const char *format,
+ ...);
+ virtual void raise() { throw this; }
+};
+
/**
* Initialize the exception subsystem.
*/
diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
index d77959606..53e06f2ce 100644
--- a/test/unit/coll.cpp
+++ b/test/unit/coll.cpp
@@ -1,14 +1,14 @@
-#include "box/coll.h"
#include <iostream>
#include <vector>
#include <algorithm>
#include <string.h>
-#include <box/coll_def.h>
#include <assert.h>
#include <msgpuck.h>
#include <diag.h>
#include <fiber.h>
#include <memory.h>
+#include "coll_def.h"
+#include "coll.h"
#include "third_party/PMurHash.h"
using namespace std;
@@ -51,8 +51,6 @@ manual_test()
def.locale = "ru_RU";
def.locale_len = strlen(def.locale);
def.type = COLL_TYPE_ICU;
- def.name = "test";
- def.name_len = strlen(def.name);
struct coll *coll;
cout << " -- default ru_RU -- " << endl;
@@ -136,8 +134,6 @@ hash_test()
def.locale = "ru_RU";
def.locale_len = strlen(def.locale);
def.type = COLL_TYPE_ICU;
- def.name = "test";
- def.name_len = strlen(def.name);
struct coll *coll;
/* Case sensitive */
--
2.15.1 (Apple Git-101)
* [tarantool-patches] [PATCH v3 3/4] collation: introduce collation fingerprint
2018-05-15 19:54 [tarantool-patches] [PATCH v3 0/4] Lua utf8 module Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 1/4] error: introduce error rebulding API Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 2/4] collation: split collation into core and box objects Vladislav Shpilevoy
@ 2018-05-15 19:54 ` Vladislav Shpilevoy
2018-05-17 19:24 ` [tarantool-patches] " Vladislav Shpilevoy
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 4/4] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
3 siblings, 1 reply; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-15 19:54 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
A collation fingerprint is a formatted string that is unique for
a set of collation properties. Equal collations with different
names have the same fingerprint.
This new property is used to build a collation fingerprint cache
for use in Tarantool internals, where the collation name does not
matter.
Insertion into the fingerprint cache can never conflict with or
replace an existing entry. This means that, for example, the utf8
module created in this patchset can fill the collation cache with
its own collations without affecting users or other modules.
---
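A minimal sketch of the idea, for illustration only. It is not the
code from this patch: the demo_* names, the three-field definition
and the fixed-size array are assumptions made to keep the example
short, while the patch itself uses struct coll_def, an mhash table
and ICU collators. It shows how equal definitions map to the same
fingerprint and therefore to the same reference-counted object.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct coll_def. */
struct demo_def {
	const char *locale;
	int strength;
	int numeric;
};

/* Hypothetical cached object: a refcount and its fingerprint. */
struct demo_coll {
	int refs;
	char fingerprint[128];
};

enum { DEMO_CACHE_SIZE = 8 };
/* Toy cache: a linear array instead of a hash table. */
static struct demo_coll *demo_cache[DEMO_CACHE_SIZE];

/* Format the properties. The result follows snprintf semantics. */
static int
demo_fingerprint(char *buf, size_t size, const struct demo_def *def)
{
	return snprintf(buf, size, "{locale: %s, strength: %d, numeric: %d}",
			def->locale, def->strength, def->numeric);
}

/* Return the cached object with this fingerprint or create one. */
static struct demo_coll *
demo_coll_new(const struct demo_def *def)
{
	char fp[128];
	int len = demo_fingerprint(fp, sizeof(fp), def);
	if (len < 0 || (size_t) len >= sizeof(fp))
		return NULL;
	int free_slot = -1;
	for (int i = 0; i < DEMO_CACHE_SIZE; i++) {
		if (demo_cache[i] == NULL) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (strcmp(demo_cache[i]->fingerprint, fp) == 0) {
			/* Same properties: share the existing object. */
			demo_cache[i]->refs++;
			return demo_cache[i];
		}
	}
	if (free_slot < 0)
		return NULL;
	struct demo_coll *coll =
		(struct demo_coll *) malloc(sizeof(*coll));
	if (coll == NULL)
		return NULL;
	coll->refs = 1;
	memcpy(coll->fingerprint, fp, (size_t) len + 1);
	demo_cache[free_slot] = coll;
	return coll;
}

/* Drop a reference. The last one removes the object from the cache. */
static void
demo_coll_unref(struct demo_coll *coll)
{
	assert(coll->refs > 0);
	if (--coll->refs > 0)
		return;
	for (int i = 0; i < DEMO_CACHE_SIZE; i++) {
		if (demo_cache[i] == coll)
			demo_cache[i] = NULL;
	}
	free(coll);
}
```

Calling demo_coll_new() twice with equal definitions returns the
same pointer with refs == 2, and a definition with another locale
yields a separate object. Lookup never replaces an existing entry,
which is the property the commit message relies on.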
src/coll.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++-
src/coll.h | 17 ++++++-
src/main.cc | 3 ++
test/unit/CMakeLists.txt | 2 +-
test/unit/coll.cpp | 39 +++++++++++++--
test/unit/coll.result | 5 ++
6 files changed, 180 insertions(+), 8 deletions(-)
diff --git a/src/coll.c b/src/coll.c
index eacb643f2..398bff49e 100644
--- a/src/coll.c
+++ b/src/coll.c
@@ -32,12 +32,44 @@
#include "coll.h"
#include "third_party/PMurHash.h"
#include "diag.h"
+#include "assoc.h"
#include <unicode/ucol.h>
#include <trivia/config.h>
+#define mh_name _coll
+struct mh_coll_key_t {
+ const char *str;
+ size_t len;
+ uint32_t hash;
+};
+#define mh_key_t struct mh_coll_key_t *
+
+struct mh_coll_node_t {
+ size_t len;
+ uint32_t hash;
+ struct coll *coll;
+};
+#define mh_node_t struct mh_coll_node_t
+
+#define mh_arg_t void *
+#define mh_hash(a, arg) ((a)->hash)
+#define mh_hash_key(a, arg) ((a)->hash)
+#define mh_cmp(a, b, arg) ((a)->len != (b)->len || \
+ strncmp((a)->coll->fingerprint, \
+ (b)->coll->fingerprint, (a)->len))
+#define mh_cmp_key(a, b, arg) ((a)->len != (b)->len || \
+ strncmp((a)->str, (b)->coll->fingerprint, \
+ (a)->len))
+#define MH_SOURCE
+#include "salad/mhash.h"
+
+/** Table fingerprint -> collation. */
+static struct mh_coll_t *coll_cache = NULL;
+
enum {
MAX_HASH_BUFFER = 1024,
MAX_LOCALE = 1024,
+ MAX_FINGERPRINT_SIZE = 1024,
};
/** Compare two string using ICU collation. */
@@ -205,21 +237,88 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
return 0;
}
+/**
+ * Print ICU definition into @a buffer limited with @a size bytes.
+ * If @a size bytes is not enough, then total needed byte count is
+ * returned.
+ * @param buffer Buffer to write to.
+ * @param size Size of @a buffer.
+ * @param def ICU definition.
+ *
+ * @retval Written or needed byte count.
+ */
+static int
+coll_icu_def_snfingerprint(char *buffer, int size,
+ const struct coll_icu_def *def)
+{
+ return snprintf(buffer, size, "{french_coll: %d, alt_handling: %d, "\
+ "case_first: %d, case_level: %d, norm_mode: %d, "\
+ "strength: %d, numeric_coll: %d}",
+ (int) def->french_collation,
+ (int) def->alternate_handling, (int) def->case_first,
+ (int) def->case_level, (int) def->normalization_mode,
+ (int) def->strength, (int) def->numeric_collation);
+}
+
+/**
+ * Print collation definition into @a buffer limited with @a size
+ * bytes. If @a size bytes is not enough, then total needed byte
+ * count is returned.
+ * @param buffer Buffer to write to.
+ * @param size Size of @a buffer.
+ * @param def Collation definition.
+ *
+ * @retval Written or needed byte count.
+ */
+static int
+coll_def_snfingerprint(char *buffer, int size, const struct coll_def *def)
+{
+ int total = 0;
+ SNPRINT(total, snprintf, buffer, size, "{locale: %.*s, type = %d, "\
+ "icu: ", (int) def->locale_len, def->locale, (int) def->type);
+ SNPRINT(total, coll_icu_def_snfingerprint, buffer, size, &def->icu);
+ SNPRINT(total, snprintf, buffer, size, "}");
+ return total;
+}
+
struct coll *
coll_new(const struct coll_def *def)
{
assert(def->type == COLL_TYPE_ICU);
- struct coll *coll = (struct coll *) malloc(sizeof(*coll));
+ int fingerprint_len = coll_def_snfingerprint(NULL, 0, def);
+ assert(fingerprint_len <= MAX_FINGERPRINT_SIZE);
+ char fingerprint[MAX_FINGERPRINT_SIZE];
+ coll_def_snfingerprint(fingerprint, MAX_FINGERPRINT_SIZE, def);
+
+ uint32_t hash = mh_strn_hash(fingerprint, fingerprint_len);
+ struct mh_coll_key_t key = { fingerprint, fingerprint_len, hash };
+ mh_int_t i = mh_coll_find(coll_cache, &key, NULL);
+ if (i != mh_end(coll_cache)) {
+ struct coll *coll = mh_coll_node(coll_cache, i)->coll;
+ coll_ref(coll);
+ return coll;
+ }
+
+ int total_size = sizeof(struct coll) + fingerprint_len + 1;
+ struct coll *coll = (struct coll *) malloc(total_size);
if (coll == NULL) {
- diag_set(OutOfMemory, sizeof(*coll), "malloc", "coll");
+ diag_set(OutOfMemory, total_size, "malloc", "coll");
return NULL;
}
+ memcpy((char *) coll->fingerprint, fingerprint, fingerprint_len + 1);
coll->refs = 1;
coll->type = def->type;
if (coll_icu_init_cmp(coll, def) != 0) {
free(coll);
return NULL;
}
+
+ struct mh_coll_node_t node = { fingerprint_len, hash, coll };
+ if (mh_coll_put(coll_cache, &node, NULL, NULL) == mh_end(coll_cache)) {
+ diag_set(OutOfMemory, sizeof(node), "malloc", "coll_cache");
+ coll_unref(coll);
+ return NULL;
+ }
return coll;
}
@@ -228,7 +327,26 @@ coll_unref(struct coll *coll)
{
assert(coll->refs > 0);
if (--coll->refs == 0) {
+ int len = strlen(coll->fingerprint);
+ struct mh_coll_node_t node = {
+ len, mh_strn_hash(coll->fingerprint, len), coll
+ };
+ mh_coll_remove(coll_cache, &node, NULL);
ucol_close(coll->icu.collator);
free(coll);
}
}
+
+void
+coll_init()
+{
+ coll_cache = mh_coll_new();
+ if (coll_cache == NULL)
+ panic("Can not create system collations cache");
+}
+
+void
+coll_free()
+{
+ mh_coll_delete(coll_cache);
+}
diff --git a/src/coll.h b/src/coll.h
index 8798d9491..348f073eb 100644
--- a/src/coll.h
+++ b/src/coll.h
@@ -69,10 +69,17 @@ struct coll {
coll_hash_f hash;
/** Reference counter. */
int refs;
+ /**
+ * Formatted string with collation properties, that
+ * completely describes how the collation works.
+ */
+ const char fingerprint[0];
};
/**
- * Create a core collation by definition.
+ * Create a core collation by definition. Can return an existing
+ * collation object, if a one with the same fingerprint was
+ * created before.
* @param def Core collation definition.
* @retval NULL Illegal parameters or memory error.
* @retval not NULL Collation.
@@ -91,6 +98,14 @@ coll_ref(struct coll *coll)
void
coll_unref(struct coll *coll);
+/** Initialize collations subsystem. */
+void
+coll_init();
+
+/** Destroy collations subsystem. */
+void
+coll_free();
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/main.cc b/src/main.cc
index 1682baea0..3caa59ffa 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -58,6 +58,7 @@
#include <say.h>
#include <rmean.h>
#include <limits.h>
+#include <coll.h>
#include "trivia/util.h"
#include "backtrace.h"
#include "tt_pthread.h"
@@ -581,6 +582,7 @@ tarantool_free(void)
memory_free();
random_free();
#endif
+ coll_free();
systemd_free();
say_logger_free();
}
@@ -732,6 +734,7 @@ main(int argc, char **argv)
coio_enable();
signal_init();
cbus_init();
+ coll_init();
tarantool_lua_init(tarantool_bin, main_argc, main_argv);
start_time = ev_monotonic_time();
diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
index 5d83f53b0..dbc02cdf0 100644
--- a/test/unit/CMakeLists.txt
+++ b/test/unit/CMakeLists.txt
@@ -191,4 +191,4 @@ add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES})
target_link_libraries(vy_cache.test ${ITERATOR_TEST_LIBS})
add_executable(coll.test coll.cpp)
-target_link_libraries(coll.test box)
+target_link_libraries(coll.test core unit ${ICU_LIBRARIES} misc)
diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
index 17f26ea07..2e2d4a54a 100644
--- a/test/unit/coll.cpp
+++ b/test/unit/coll.cpp
@@ -9,6 +9,7 @@
#include <diag.h>
#include <fiber.h>
#include <memory.h>
+#include "unit.h"
#include "third_party/PMurHash.h"
using namespace std;
@@ -43,7 +44,7 @@ test_sort_strings(vector<const char *> &strings, struct coll *coll)
void
manual_test()
{
- cout << "\t*** " << __func__ << " ***" << endl;
+ header();
vector<const char *> strings;
struct coll_def def;
@@ -111,7 +112,7 @@ manual_test()
test_sort_strings(strings, coll);
coll_unref(coll);
- cout << "\t*** " << __func__ << ": done ***" << endl;
+ footer();
}
unsigned calc_hash(const char *str, struct coll *coll)
@@ -127,7 +128,7 @@ unsigned calc_hash(const char *str, struct coll *coll)
void
hash_test()
{
- cout << "\t*** " << __func__ << " ***" << endl;
+ header();
struct coll_def def;
memset(&def, 0, sizeof(def));
@@ -155,17 +156,47 @@ hash_test()
cout << (calc_hash("аЕ", coll) != calc_hash("аё", coll) ? "OK" : "Fail") << endl;
coll_unref(coll);
- cout << "\t*** " << __func__ << ": done ***" << endl;
+ footer();
}
+void
+cache_test()
+{
+ header();
+ plan(2);
+
+ struct coll_def def;
+ memset(&def, 0, sizeof(def));
+ def.locale = "ru_RU";
+ def.locale_len = strlen(def.locale);
+ def.type = COLL_TYPE_ICU;
+
+ struct coll *coll1 = coll_new(&def);
+ struct coll *coll2 = coll_new(&def);
+ is(coll1, coll2,
+ "collations with the same definition are not duplicated");
+ coll_unref(coll2);
+ def.locale = "en_EN";
+ coll2 = coll_new(&def);
+ isnt(coll1, coll2,
+ "collations with different definitions are different objects");
+ coll_unref(coll2);
+ coll_unref(coll1);
+
+ check_plan();
+ footer();
+}
int
main(int, const char**)
{
+ coll_init();
memory_init();
fiber_init(fiber_c_invoke);
manual_test();
hash_test();
+ cache_test();
fiber_free();
memory_free();
+ coll_free();
}
\ No newline at end of file
diff --git a/test/unit/coll.result b/test/unit/coll.result
index 218dca8f4..269764246 100644
--- a/test/unit/coll.result
+++ b/test/unit/coll.result
@@ -83,3 +83,8 @@ OK
OK
OK
*** hash_test: done ***
+ *** cache_test ***
+1..2
+ok 1 - collations with the same definition are not duplicated
+ok 2 - collations with different definitions are different objects
+ *** cache_test: done ***
--
2.15.1 (Apple Git-101)
* [tarantool-patches] Re: [PATCH v3 3/4] collation: introduce collation fingerprint
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 3/4] collation: introduce collation fingerprint Vladislav Shpilevoy
@ 2018-05-17 19:24 ` Vladislav Shpilevoy
0 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-17 19:24 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
New patch version. It changed because the previous patch was
reworked.
---
collation: introduce collation fingerprint
A collation fingerprint is a formatted string that is unique for
a set of collation properties. Equal collations with different
names have the same fingerprint.
This new property is used to build a collation fingerprint cache
for Tarantool internals, where the collation name does not
matter.
An insertion into the fingerprint cache can never conflict with
or replace an existing entry. It means that, for example, the
utf8 module created in this patchset can fill the collation cache
with its own collations, and it will affect neither users nor
other modules.
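The caching idea can be sketched in plain C, outside of Tarantool.
This is only an illustration under assumptions: demo_def,
demo_coll, demo_cache and the demo_* functions are hypothetical
names, the property set is reduced to two fields, and a linked
list stands in for the mhash table used by the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced set of collation properties. */
struct demo_def {
	const char *locale;
	int strength;
};

/* Hypothetical cached object: the fingerprint is stored inline. */
struct demo_coll {
	int refs;
	struct demo_coll *next;
	char fingerprint[];
};

/* Toy cache: a linked list stands in for the hash table. */
static struct demo_coll *demo_cache = NULL;

/* Format the properties; like snprintf, return the needed length. */
static int
demo_fingerprint(char *buf, size_t size, const struct demo_def *def)
{
	return snprintf(buf, size, "{locale: %s, strength: %d}",
			def->locale, def->strength);
}

/* Return the cached object with an equal fingerprint or create one. */
static struct demo_coll *
demo_coll_new(const struct demo_def *def)
{
	int len = demo_fingerprint(NULL, 0, def);
	if (len < 0)
		return NULL;
	struct demo_coll *coll =
		(struct demo_coll *) malloc(sizeof(*coll) + len + 1);
	if (coll == NULL)
		return NULL;
	demo_fingerprint(coll->fingerprint, len + 1, def);
	for (struct demo_coll *c = demo_cache; c != NULL; c = c->next) {
		if (strcmp(c->fingerprint, coll->fingerprint) == 0) {
			/* Cache hit: drop the candidate, share the old one. */
			free(coll);
			++c->refs;
			return c;
		}
	}
	coll->refs = 1;
	coll->next = demo_cache;
	demo_cache = coll;
	return coll;
}

/* Drop a reference; the last one removes the object from the cache. */
static void
demo_coll_unref(struct demo_coll *coll)
{
	assert(coll->refs > 0);
	if (--coll->refs > 0)
		return;
	struct demo_coll **link = &demo_cache;
	while (*link != coll)
		link = &(*link)->next;
	*link = coll->next;
	free(coll);
}
```

Two calls to demo_coll_new() with equal properties return the
same pointer with refs == 2, while a different locale yields a
separate object, so an insertion never replaces an entry.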
---
src/coll.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++-
src/coll.h | 17 ++++++-
src/main.cc | 3 ++
test/unit/CMakeLists.txt | 2 +-
test/unit/coll.cpp | 39 +++++++++++++--
test/unit/coll.result | 5 ++
6 files changed, 179 insertions(+), 8 deletions(-)
diff --git a/src/coll.c b/src/coll.c
index 66afa6c4f..d0e36827d 100644
--- a/src/coll.c
+++ b/src/coll.c
@@ -32,9 +32,40 @@
#include "coll.h"
#include "third_party/PMurHash.h"
#include "diag.h"
+#include "assoc.h"
#include <unicode/ucol.h>
#include <trivia/config.h>
+#define mh_name _coll
+struct mh_coll_key_t {
+ const char *str;
+ size_t len;
+ uint32_t hash;
+};
+#define mh_key_t struct mh_coll_key_t *
+
+struct mh_coll_node_t {
+ size_t len;
+ uint32_t hash;
+ struct coll *coll;
+};
+#define mh_node_t struct mh_coll_node_t
+
+#define mh_arg_t void *
+#define mh_hash(a, arg) ((a)->hash)
+#define mh_hash_key(a, arg) ((a)->hash)
+#define mh_cmp(a, b, arg) ((a)->len != (b)->len || \
+ strncmp((a)->coll->fingerprint, \
+ (b)->coll->fingerprint, (a)->len))
+#define mh_cmp_key(a, b, arg) ((a)->len != (b)->len || \
+ strncmp((a)->str, (b)->coll->fingerprint, \
+ (a)->len))
+#define MH_SOURCE
+#include "salad/mhash.h"
+
+/** Table fingerprint -> collation. */
+static struct mh_coll_t *coll_cache = NULL;
+
enum {
MAX_LOCALE = 1024,
};
@@ -205,21 +236,88 @@ coll_icu_init_cmp(struct coll *coll, const struct coll_def *def)
return 0;
}
+/**
+ * Print ICU definition into @a buffer limited with @a size bytes.
+ * If @a size bytes is not enough, then total needed byte count is
+ * returned.
+ * @param buffer Buffer to write to.
+ * @param size Size of @a buffer.
+ * @param def ICU definition.
+ *
+ * @retval Written or needed byte count.
+ */
+static int
+coll_icu_def_snfingerprint(char *buffer, int size,
+ const struct coll_icu_def *def)
+{
+ return snprintf(buffer, size, "{french_coll: %d, alt_handling: %d, "\
+ "case_first: %d, case_level: %d, norm_mode: %d, "\
+ "strength: %d, numeric_coll: %d}",
+ (int) def->french_collation,
+ (int) def->alternate_handling, (int) def->case_first,
+ (int) def->case_level, (int) def->normalization_mode,
+ (int) def->strength, (int) def->numeric_collation);
+}
+
+/**
+ * Print collation definition into @a buffer limited with @a size
+ * bytes. If @a size bytes is not enough, then total needed byte
+ * count is returned.
+ * @param buffer Buffer to write to.
+ * @param size Size of @a buffer.
+ * @param def Collation definition.
+ *
+ * @retval Written or needed byte count.
+ */
+static int
+coll_def_snfingerprint(char *buffer, int size, const struct coll_def *def)
+{
+ int total = 0;
+ SNPRINT(total, snprintf, buffer, size, "{locale: %.*s, type = %d, "\
+ "icu: ", (int) def->locale_len, def->locale, (int) def->type);
+ SNPRINT(total, coll_icu_def_snfingerprint, buffer, size, &def->icu);
+ SNPRINT(total, snprintf, buffer, size, "}");
+ return total;
+}
+
struct coll *
coll_new(const struct coll_def *def)
{
assert(def->type == COLL_TYPE_ICU);
- struct coll *coll = (struct coll *) malloc(sizeof(*coll));
+ int fingerprint_len = coll_def_snfingerprint(NULL, 0, def);
+ assert(fingerprint_len <= TT_STATIC_BUF_LEN);
+ char *fingerprint = tt_static_buf();
+ coll_def_snfingerprint(fingerprint, TT_STATIC_BUF_LEN, def);
+
+ uint32_t hash = mh_strn_hash(fingerprint, fingerprint_len);
+ struct mh_coll_key_t key = { fingerprint, fingerprint_len, hash };
+ mh_int_t i = mh_coll_find(coll_cache, &key, NULL);
+ if (i != mh_end(coll_cache)) {
+ struct coll *coll = mh_coll_node(coll_cache, i)->coll;
+ coll_ref(coll);
+ return coll;
+ }
+
+ int total_size = sizeof(struct coll) + fingerprint_len + 1;
+ struct coll *coll = (struct coll *) malloc(total_size);
if (coll == NULL) {
- diag_set(OutOfMemory, sizeof(*coll), "malloc", "coll");
+ diag_set(OutOfMemory, total_size, "malloc", "coll");
return NULL;
}
+ memcpy((char *) coll->fingerprint, fingerprint, fingerprint_len + 1);
coll->refs = 1;
coll->type = def->type;
if (coll_icu_init_cmp(coll, def) != 0) {
free(coll);
return NULL;
}
+
+ struct mh_coll_node_t node = { fingerprint_len, hash, coll };
+ if (mh_coll_put(coll_cache, &node, NULL, NULL) == mh_end(coll_cache)) {
+ diag_set(OutOfMemory, sizeof(node), "malloc", "coll_cache");
+ coll_unref(coll);
+ return NULL;
+ }
return coll;
}
@@ -228,7 +326,26 @@ coll_unref(struct coll *coll)
{
assert(coll->refs > 0);
if (--coll->refs == 0) {
+ int len = strlen(coll->fingerprint);
+ struct mh_coll_node_t node = {
+ len, mh_strn_hash(coll->fingerprint, len), coll
+ };
+ mh_coll_remove(coll_cache, &node, NULL);
ucol_close(coll->icu.collator);
free(coll);
}
}
+
+void
+coll_init()
+{
+ coll_cache = mh_coll_new();
+ if (coll_cache == NULL)
+ panic("Can not create system collations cache");
+}
+
+void
+coll_free()
+{
+ mh_coll_delete(coll_cache);
+}
diff --git a/src/coll.h b/src/coll.h
index cc834f446..7e950d164 100644
--- a/src/coll.h
+++ b/src/coll.h
@@ -68,10 +68,17 @@ struct coll {
coll_hash_f hash;
/** Reference counter. */
int refs;
+ /**
+ * Formatted string with collation properties, that
+ * completely describes how the collation works.
+ */
+ const char fingerprint[0];
};
/**
- * Create a collation by definition.
+ * Create a collation by definition. Can return an existing
+ * collation object if one with the same fingerprint was
+ * created before.
* @param def Collation definition.
* @retval NULL Collation or memory error.
* @retval not NULL Collation.
@@ -90,6 +97,14 @@ coll_ref(struct coll *coll)
void
coll_unref(struct coll *coll);
+/** Initialize collations subsystem. */
+void
+coll_init();
+
+/** Destroy collations subsystem. */
+void
+coll_free();
+
#if defined(__cplusplus)
} /* extern "C" */
#endif /* defined(__cplusplus) */
diff --git a/src/main.cc b/src/main.cc
index 1682baea0..a36a2b0d0 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -58,6 +58,7 @@
#include <say.h>
#include <rmean.h>
#include <limits.h>
+#include "coll.h"
#include "trivia/util.h"
#include "backtrace.h"
#include "tt_pthread.h"
@@ -581,6 +582,7 @@ tarantool_free(void)
memory_free();
random_free();
#endif
+ coll_free();
systemd_free();
say_logger_free();
}
@@ -732,6 +734,7 @@ main(int argc, char **argv)
coio_enable();
signal_init();
cbus_init();
+ coll_init();
tarantool_lua_init(tarantool_bin, main_argc, main_argv);
start_time = ev_monotonic_time();
diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
index 5d83f53b0..dbc02cdf0 100644
--- a/test/unit/CMakeLists.txt
+++ b/test/unit/CMakeLists.txt
@@ -191,4 +191,4 @@ add_executable(vy_cache.test vy_cache.c ${ITERATOR_TEST_SOURCES})
target_link_libraries(vy_cache.test ${ITERATOR_TEST_LIBS})
add_executable(coll.test coll.cpp)
-target_link_libraries(coll.test box)
+target_link_libraries(coll.test core unit ${ICU_LIBRARIES} misc)
diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
index 53e06f2ce..eeee739b7 100644
--- a/test/unit/coll.cpp
+++ b/test/unit/coll.cpp
@@ -9,6 +9,7 @@
#include <memory.h>
#include "coll_def.h"
#include "coll.h"
+#include "unit.h"
#include "third_party/PMurHash.h"
using namespace std;
@@ -43,7 +44,7 @@ test_sort_strings(vector<const char *> &strings, struct coll *coll)
void
manual_test()
{
- cout << "\t*** " << __func__ << " ***" << endl;
+ header();
vector<const char *> strings;
struct coll_def def;
@@ -111,7 +112,7 @@ manual_test()
test_sort_strings(strings, coll);
coll_unref(coll);
- cout << "\t*** " << __func__ << ": done ***" << endl;
+ footer();
}
unsigned calc_hash(const char *str, struct coll *coll)
@@ -127,7 +128,7 @@ unsigned calc_hash(const char *str, struct coll *coll)
void
hash_test()
{
- cout << "\t*** " << __func__ << " ***" << endl;
+ header();
struct coll_def def;
memset(&def, 0, sizeof(def));
@@ -155,17 +156,47 @@ hash_test()
cout << (calc_hash("аЕ", coll) != calc_hash("аё", coll) ? "OK" : "Fail") << endl;
coll_unref(coll);
- cout << "\t*** " << __func__ << ": done ***" << endl;
+ footer();
}
+void
+cache_test()
+{
+ header();
+ plan(2);
+
+ struct coll_def def;
+ memset(&def, 0, sizeof(def));
+ def.locale = "ru_RU";
+ def.locale_len = strlen(def.locale);
+ def.type = COLL_TYPE_ICU;
+
+ struct coll *coll1 = coll_new(&def);
+ struct coll *coll2 = coll_new(&def);
+ is(coll1, coll2,
+ "collations with the same definition are not duplicated");
+ coll_unref(coll2);
+ def.locale = "en_EN";
+ coll2 = coll_new(&def);
+ isnt(coll1, coll2,
+ "collations with different definitions are different objects");
+ coll_unref(coll2);
+ coll_unref(coll1);
+
+ check_plan();
+ footer();
+}
int
main(int, const char**)
{
+ coll_init();
memory_init();
fiber_init(fiber_c_invoke);
manual_test();
hash_test();
+ cache_test();
fiber_free();
memory_free();
+ coll_free();
}
\ No newline at end of file
diff --git a/test/unit/coll.result b/test/unit/coll.result
index 218dca8f4..269764246 100644
--- a/test/unit/coll.result
+++ b/test/unit/coll.result
@@ -83,3 +83,8 @@ OK
OK
OK
*** hash_test: done ***
+ *** cache_test ***
+1..2
+ok 1 - collations with the same definition are not duplicated
+ok 2 - collations with different definitions are different objects
+ *** cache_test: done ***
--
2.15.1 (Apple Git-101)
* [tarantool-patches] [PATCH v3 4/4] lua: introduce utf8 built-in globally visible module
2018-05-15 19:54 [tarantool-patches] [PATCH v3 0/4] Lua utf8 module Vladislav Shpilevoy
` (2 preceding siblings ...)
2018-05-15 19:54 ` [tarantool-patches] [PATCH v3 3/4] collation: introduce collation fingerprint Vladislav Shpilevoy
@ 2018-05-15 19:54 ` Vladislav Shpilevoy
3 siblings, 0 replies; 11+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-15 19:54 UTC (permalink / raw)
To: tarantool-patches; +Cc: kostja
utf8 is a module partially compatible with the Lua 5.3 utf8
module and the lua-utf8 third-party module.
Partially means that not all functions are implemented.
The patch introduces these ones:
upper, lower, len, char, sub, next.
Len and char work exactly like in Lua 5.3. The other functions
work like in lua-utf8, because they are not present in Lua 5.3.
Tarantool utf8 has extensions:
* isupper/lower/alpha/digit, which check a property of a symbol
given either as a string or as its code;
* cmp/casecmp, which compare two UTF8 strings.
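To illustrate the Lua 5.3 style len semantics (1-based byte
offsets, negative offsets counted from the end, a symbol counted
when its first byte is inside the range), here is a minimal sketch
in plain C. The names demo_convert_offset and demo_utf8_len are
hypothetical. Unlike the patch, which decodes with ICU and reports
invalid sequences, the sketch assumes valid UTF-8 and a start
offset on a symbol boundary, and it only counts lead bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Map a 1-based, possibly negative offset to a non-negative one. */
static int
demo_convert_offset(int offset, size_t len)
{
	if (offset >= 0)
		return offset;
	if ((size_t)-offset > len)
		return 0;
	return (int)len + offset + 1;
}

/*
 * Count symbols whose first byte lies in [start, end], both
 * 1-based byte offsets. Simplified: no validation of the input.
 */
static int
demo_utf8_len(const char *str, size_t len, int start, int end)
{
	int from = demo_convert_offset(start, len);
	int to = demo_convert_offset(end, len);
	assert(from >= 1 && to <= (int)len);
	int count = 0;
	for (int i = from - 1; i < to; ++i) {
		/* Continuation bytes have the form 10xxxxxx. */
		if (((unsigned char)str[i] & 0xC0) != 0x80)
			++count;
	}
	return count;
}
```

For a 12-byte string made of two ASCII symbols, one 2-byte symbol,
two 3-byte symbols and two more ASCII symbols, the range [1, -1]
contains 7 symbols and [1, -4], which ends in the middle of a
3-byte symbol, contains 5, because the partial symbol is counted
too.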
Closes #3290
Closes #3385
Closes #3081
---
src/CMakeLists.txt | 3 +-
src/lua/init.c | 3 +
src/lua/utf8.c | 479 +++++++++++++++++++++++++++++++++++++++++++
src/lua/utf8.h | 42 ++++
test/app-tap/string.test.lua | 163 ++++++++++++++-
test/box/ddl.result | 15 ++
test/box/ddl.test.lua | 8 +
7 files changed, 711 insertions(+), 2 deletions(-)
create mode 100644 src/lua/utf8.c
create mode 100644 src/lua/utf8.h
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 5bf17614b..2a952923e 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -166,6 +166,7 @@ set (server_sources
lua/fio.c
lua/crypto.c
lua/httpc.c
+ lua/utf8.c
${lua_sources}
${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
@@ -210,7 +211,7 @@ endif()
set_source_files_compile_flags(${server_sources})
add_library(server STATIC ${server_sources})
-target_link_libraries(server core bit uri uuid)
+target_link_libraries(server core bit uri uuid ${ICU_LIBRARIES})
# Rule of thumb: if exporting a symbol from a static library, list the
# library here.
diff --git a/src/lua/init.c b/src/lua/init.c
index a0a7f63f6..58af1d121 100644
--- a/src/lua/init.c
+++ b/src/lua/init.c
@@ -57,6 +57,7 @@
#include "lua/pickle.h"
#include "lua/fio.h"
#include "lua/httpc.h"
+#include "lua/utf8.h"
#include "digest.h"
#include <small/ibuf.h>
@@ -399,6 +400,7 @@ tarantool_lua_init(const char *tarantool_bin, int argc, char **argv)
lua_call(L, 0, 0);
lua_register(L, "tonumber64", lbox_tonumber64);
+ tarantool_lua_utf8_init(L);
tarantool_lua_utils_init(L);
tarantool_lua_fiber_init(L);
tarantool_lua_fiber_cond_init(L);
@@ -629,6 +631,7 @@ tarantool_lua_run_script(char *path, bool interactive,
void
tarantool_lua_free()
{
+ tarantool_lua_utf8_free();
/*
* Some part of the start script panicked, and called
* exit(). The call stack in this case leads us back to
diff --git a/src/lua/utf8.c b/src/lua/utf8.c
new file mode 100644
index 000000000..e3b2b0a7f
--- /dev/null
+++ b/src/lua/utf8.c
@@ -0,0 +1,479 @@
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include <unicode/ucasemap.h>
+#include <unicode/uchar.h>
+#include <coll.h>
+#include "lua/utils.h"
+#include "lua/utf8.h"
+#include "diag.h"
+#include "small/ibuf.h"
+
+extern struct ibuf *tarantool_lua_ibuf;
+
+/** Default universal casemap for case transformations. */
+static UCaseMap *root_map = NULL;
+
+/** Collations for cmp/casecmp functions. */
+static struct coll *unicode_coll = NULL;
+static struct coll *unicode_ci_coll = NULL;
+
+static int
+utf8_str_to_case(struct lua_State *L, const char *src, int src_bsize,
+ bool is_to_upper)
+{
+ int i = 0;
+ int dst_bsize = src_bsize;
+ (void) i;
+ do {
+ UErrorCode err = U_ZERO_ERROR;
+ ibuf_reset(tarantool_lua_ibuf);
+ char *dst = ibuf_alloc(tarantool_lua_ibuf, dst_bsize);
+ if (dst == NULL) {
+ diag_set(OutOfMemory, dst_bsize, "ibuf_alloc", "dst");
+ return luaT_error(L);
+ }
+ int real_bsize;
+ if (is_to_upper) {
+ real_bsize = ucasemap_utf8ToUpper(root_map, dst,
+ dst_bsize, src,
+ src_bsize, &err);
+ } else {
+ real_bsize = ucasemap_utf8ToLower(root_map, dst,
+ dst_bsize, src,
+ src_bsize, &err);
+ }
+ if (err == U_ZERO_ERROR ||
+ err == U_STRING_NOT_TERMINATED_WARNING) {
+ lua_pushlstring(L, dst, real_bsize);
+ return 1;
+ } else if (err == U_BUFFER_OVERFLOW_ERROR) {
+ assert(real_bsize > dst_bsize);
+ dst_bsize = real_bsize;
+ } else {
+ lua_pushnil(L);
+ lua_pushstring(L, tt_sprintf("error during ICU case "\
+ "transform: %s",
+ u_errorName(err)));
+ return 2;
+ }
+ /*
+ * On a first run either all is ok, or
+ * toLower/Upper returned needed bsize, that is
+ * allocated on a second iteration. Third
+ * iteration is not possible.
+ */
+ assert(++i < 2);
+ } while (true);
+ unreachable();
+ return 0;
+}
+
+/**
+ * Convert a UTF8 string into upper case.
+ * @param String to convert.
+ * @retval not nil String consisting of upper letters.
+ * @retval nil, error Error.
+ */
+static int
+utf8_upper(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1 || !lua_isstring(L, 1))
+ return luaL_error(L, "Usage: utf8.upper(<string>)");
+ size_t len;
+ const char *str = lua_tolstring(L, 1, &len);
+ return utf8_str_to_case(L, str, len, true);
+}
+
+/**
+ * Convert a UTF8 string into lower case.
+ * @param String to convert.
+ * @retval not nil String consisting of lower letters.
+ * @retval nil, error Error.
+ */
+static int
+utf8_lower(struct lua_State *L)
+{
+ if (lua_gettop(L) != 1 || !lua_isstring(L, 1))
+ return luaL_error(L, "Usage: utf8.lower(<string>)");
+ size_t len;
+ const char *str = lua_tolstring(L, 1, &len);
+ return utf8_str_to_case(L, str, len, false);
+}
+
+/**
+ * Calculate a 1-based positive byte offset in a string by any
+ * 1-based offset (possibly negative).
+ * @param offset Original 1-based offset with any sign.
+ * @param len A string byte length.
+ * @retval 1-based positive offset.
+ */
+static inline int
+utf8_convert_offset(int offset, size_t len)
+{
+ if (offset >= 0)
+ return offset;
+ else if ((size_t)-offset > len)
+ return 0;
+ return len + offset + 1;
+}
+
+/**
+ * Calculate length of a UTF8 string. Length here is symbol count.
+ * Works like utf8.len in Lua 5.3. Can take negative offsets. A
+ * negative offset is an offset from the end of string.
+ * Positive position must be inside the string.
+ * @param String to get length.
+ * @param Start byte offset in [1, #str + 1]. Must point to the
+ * start of symbol. On invalid symbol an error is returned.
+ * @param End byte offset in [0, #str]. Can point to the middle of
+ * symbol. Partial symbol is counted too.
+ * @retval not nil Symbol count.
+ * @retval nil, number Error. Byte position of the error is
+ * returned in the second value.
+ * @retval nil, string Error. Reason is returned in the second
+ * value.
+ */
+static int
+utf8_len(struct lua_State *L)
+{
+ if (lua_gettop(L) > 3 || !lua_isstring(L, 1))
+ return luaL_error(L, "Usage: utf8.len(<string>, [i, [j]])");
+ size_t slen;
+ const char *str = lua_tolstring(L, 1, &slen);
+ int len = (int) slen;
+ int start_pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
+ int end_pos = utf8_convert_offset(luaL_optinteger(L, 3, -1), len);
+ if (start_pos < 1 || --start_pos > len || end_pos > len) {
+ lua_pushnil(L);
+ lua_pushstring(L, "position is out of string");
+ return 2;
+ }
+ int result = 0;
+ if (end_pos > start_pos) {
+ UChar32 c;
+ while (start_pos < end_pos) {
+ ++result;
+ U8_NEXT(str, start_pos, len, c);
+ if (c == U_SENTINEL) {
+ lua_pushnil(L);
+ lua_pushinteger(L, start_pos);
+ return 2;
+ }
+ }
+ }
+ lua_pushinteger(L, result);
+ return 1;
+}
+
+/**
+ * Get next symbol code by @an offset.
+ * @param String to get symbol code.
+ * @param Byte offset from which get.
+ *
+ * @retval - No more symbols.
+ * @retval not nil, not nil Byte offset and symbol code.
+ */
+static int
+utf8_next(struct lua_State *L)
+{
+ if (lua_gettop(L) > 2 || !lua_isstring(L, 1))
+ return luaL_error(L, "Usage: utf8.next(<string>, "\
+ "[<byte offset>])");
+ size_t slen;
+ const char *str = lua_tolstring(L, 1, &slen);
+ int len = (int) slen;
+ int pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
+ if (pos > 0)
+ --pos;
+ if (pos >= len)
+ return 0;
+ UChar32 c;
+ U8_NEXT(str, pos, len, c);
+ if (c == U_SENTINEL)
+ return 0;
+ lua_pushinteger(L, pos + 1);
+ lua_pushinteger(L, c);
+ return 2;
+}
+
+/**
+ * Convert a UTF8 char code (or codes) into Lua string. When
+ * multiple codes are provided, they are concatenated into a
+ * single string.
+ * @param Char codes.
+ * @retval Result UTF8 string.
+ */
+static int
+utf8_char(struct lua_State *L)
+{
+ int top = lua_gettop(L);
+ if (top < 1)
+ return luaL_error(L, "Usage: utf8.char(<char code>)");
+ int len = 0;
+ UChar32 c;
+ /* Fast way - convert one symbol. */
+ if (top == 1) {
+ char buf[U8_MAX_LENGTH];
+ c = luaL_checkinteger(L, 1);
+ U8_APPEND_UNSAFE(buf, len, c);
+ assert(len <= (int)sizeof(buf));
+ lua_pushlstring(L, buf, len);
+ return 1;
+ }
+ /* Slow way - use dynamic buffer. */
+ ibuf_reset(tarantool_lua_ibuf);
+ char *str = ibuf_alloc(tarantool_lua_ibuf, top * U8_MAX_LENGTH);
+ if (str == NULL) {
+ diag_set(OutOfMemory, top * U8_MAX_LENGTH, "ibuf_alloc",
+ "str");
+ return luaT_error(L);
+ }
+ for (int i = 1; i <= top; ++i) {
+ c = luaL_checkinteger(L, i);
+ U8_APPEND_UNSAFE(str, len, c);
+ }
+ lua_pushlstring(L, str, len);
+ return 1;
+}
+
+/**
+ * Get byte offsets by symbol positions in a string. Positions can
+ * be negative.
+ * @param s Original string.
+ * @param len Length of @an s.
+ * @param start_pos Start position (symbol offset).
+ * @param end_pos End position (symbol offset).
+ * @param[out] start_offset_ Start position (byte offset).
+ * @param[out] end_offset_ End position (byte offset).
+ */
+static void
+utf8_sub(const uint8_t *s, int len, int start_pos, int end_pos,
+ int *start_offset_, int *end_offset_)
+{
+ int start_offset = 0, end_offset = len;
+ if (start_pos >= 0) {
+ U8_FWD_N(s, start_offset, len, start_pos);
+ if (end_pos >= 0) {
+ /* --[-------]---- ... */
+ int n = end_pos - start_pos;
+ end_offset = start_offset;
+ U8_FWD_N(s, end_offset, len, n);
+ } else {
+ /* --[---- ... ----]--- */
+ int n = -(end_pos + 1);
+ U8_BACK_N(s, 0, end_offset, n);
+ }
+ } else {
+ int n;
+ if (end_pos < 0) {
+ /* ... -----[-----]--- */
+ n = -(end_pos + 1);
+ U8_BACK_N(s, 0, end_offset, n);
+ start_offset = end_offset;
+ n = end_pos - start_pos + 1;
+ } else {
+ /* ---]-- ... --[---- */
+ end_offset = 0;
+ U8_FWD_N(s, end_offset, len, end_pos);
+ n = -start_pos;
+ start_offset = len;
+ }
+ U8_BACK_N(s, 0, start_offset, n);
+ }
+ *start_offset_ = start_offset;
+ if (start_offset <= end_offset)
+ *end_offset_ = end_offset;
+ else
+ *end_offset_ = start_offset;
+}
+
+/**
+ * Get a substring from a UTF8 string.
+ * @param String to get a substring.
+ * @param Start position in symbol count. Required, can be
+ * negative.
+ * @param End position in symbol count. Optional, can be negative.
+ *
+ * @retval Substring.
+ */
+static int
+utf8_lua_sub(struct lua_State *L)
+{
+ if (lua_gettop(L) < 2 || !lua_isstring(L, 1))
+ return luaL_error(L, "Usage: utf8.sub(<string>, i, [j])");
+ int start_pos = luaL_checkinteger(L, 2);
+ if (start_pos > 0)
+ --start_pos;
+ int end_pos = luaL_optinteger(L, 3, -1);
+ size_t slen;
+ const char *str = lua_tolstring(L, 1, &slen);
+ int len = (int) slen;
+ int start_offset, end_offset;
+ utf8_sub((const uint8_t *) str, len, start_pos, end_pos, &start_offset,
+ &end_offset);
+ assert(end_offset >= start_offset);
+ lua_pushlstring(L, str + start_offset, end_offset - start_offset);
+ return 1;
+}
+
+/**
+ * Macro to easily create Lua wrappers for ICU symbol checkers.
+ * @param One symbol code or string.
+ * @retval True, if the symbol has a requested property. Else
+ * false.
+ */
+#define UCHAR32_CHECKER(name) \
+static int \
+utf8_##name(struct lua_State *L) \
+{ \
+ if (lua_gettop(L) != 1) \
+ return luaL_error(L, "Usage: utf8."#name"(<string> or "\
+ "<one symbol code>)"); \
+ UChar32 c; \
+ bool result = false; \
+ if (lua_type(L, 1) == LUA_TSTRING) { \
+ size_t slen; \
+ const char *str = lua_tolstring(L, 1, &slen); \
+ int len = (int) slen; \
+ if (len > 0) { \
+ int offset = 0; \
+ U8_NEXT(str, offset, len, c); \
+ result = c != U_SENTINEL && offset == len && \
+ u_##name(c); \
+ } \
+ } else { \
+ result = u_##name(luaL_checkinteger(L, 1)); \
+ } \
+ lua_pushboolean(L, result); \
+ return 1; \
+}\
+
+UCHAR32_CHECKER(islower)
+UCHAR32_CHECKER(isupper)
+UCHAR32_CHECKER(isdigit)
+UCHAR32_CHECKER(isalpha)
+
+static inline int
+utf8_cmp_impl(struct lua_State *L, const char *usage, struct coll *coll)
+{
+ assert(coll != NULL);
+ if (lua_gettop(L) != 2 || !lua_isstring(L, 1) || !lua_isstring(L, 2))
+ luaL_error(L, usage);
+ size_t l1, l2;
+ const char *s1 = lua_tolstring(L, 1, &l1);
+ const char *s2 = lua_tolstring(L, 2, &l2);
+ lua_pushinteger(L, coll->cmp(s1, l1, s2, l2, coll));
+ return 1;
+}
+
+/**
+ * Compare two UTF8 strings.
+ * @param s1 First string.
+ * @param s2 Second string.
+ *
+ * @retval <0 s1 < s2.
+ * @retval >0 s1 > s2.
+ * @retval =0 s1 = s2.
+ */
+static int
+utf8_cmp(struct lua_State *L)
+{
+ return utf8_cmp_impl(L, "Usage: utf8.cmp(<string1>, <string2>)",
+ unicode_coll);
+}
+
+/**
+ * Compare two UTF8 strings ignoring case.
+ * @param s1 First string.
+ * @param s2 Second string.
+ *
+ * @retval <0 s1 < s2.
+ * @retval >0 s1 > s2.
+ * @retval =0 s1 = s2.
+ */
+static int
+utf8_casecmp(struct lua_State *L)
+{
+ return utf8_cmp_impl(L, "Usage: utf8.casecmp(<string1>, <string2>)",
+ unicode_ci_coll);
+}
+
+static const struct luaL_Reg utf8_lib[] = {
+ {"upper", utf8_upper},
+ {"lower", utf8_lower},
+ {"len", utf8_len},
+ {"next", utf8_next},
+ {"char", utf8_char},
+ {"sub", utf8_lua_sub},
+ {"islower", utf8_islower},
+ {"isupper", utf8_isupper},
+ {"isdigit", utf8_isdigit},
+ {"isalpha", utf8_isalpha},
+ {"cmp", utf8_cmp},
+ {"casecmp", utf8_casecmp},
+ {NULL, NULL}
+};
+
+void
+tarantool_lua_utf8_init(struct lua_State *L)
+{
+ UErrorCode err = U_ZERO_ERROR;
+ root_map = ucasemap_open("", 0, &err);
+ if (root_map == NULL) {
+ luaL_error(L, tt_sprintf("error in ICU ucasemap_open: %s",
+ u_errorName(err)));
+ }
+ struct coll_def def;
+ memset(&def, 0, sizeof(def));
+ unicode_coll = coll_new(&def);
+ if (unicode_coll == NULL)
+ goto error_coll;
+ def.icu.strength = COLL_ICU_STRENGTH_PRIMARY;
+ unicode_ci_coll = coll_new(&def);
+ if (unicode_ci_coll == NULL)
+ goto error_coll;
+ luaL_register(L, "utf8", utf8_lib);
+ lua_pop(L, 1);
+ return;
+error_coll:
+ tarantool_lua_utf8_free();
+ luaT_error(L);
+}
+
+void
+tarantool_lua_utf8_free()
+{
+ ucasemap_close(root_map);
+ if (unicode_coll != NULL)
+ coll_unref(unicode_coll);
+ if (unicode_ci_coll != NULL)
+ coll_unref(unicode_ci_coll);
+}
diff --git a/src/lua/utf8.h b/src/lua/utf8.h
new file mode 100644
index 000000000..567ad51f7
--- /dev/null
+++ b/src/lua/utf8.h
@@ -0,0 +1,42 @@
+#ifndef TARANTOOL_LUA_UTF8_H_INCLUDED
+#define TARANTOOL_LUA_UTF8_H_INCLUDED
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the
+ * following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+struct lua_State;
+
+void
+tarantool_lua_utf8_init(struct lua_State *L);
+
+void
+tarantool_lua_utf8_free();
+
+#endif /* TARANTOOL_LUA_UTF8_H_INCLUDED */
diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
index 852a7923c..1d10dcfc9 100755
--- a/test/app-tap/string.test.lua
+++ b/test/app-tap/string.test.lua
@@ -3,7 +3,7 @@
local tap = require('tap')
local test = tap.test("string extensions")
-test:plan(5)
+test:plan(6)
test:test("split", function(test)
test:plan(10)
@@ -128,4 +128,165 @@ test:test("strip", function(test)
test:ok(err and err:match("%(string expected, got number%)"))
end )
+test:test("unicode", function(test)
+ test:plan(102)
+ local str = 'хеЛлоу вОрЛд ё Ё я Я э Э ъ Ъ hElLo WorLd 1234 i I İ 勺#☢༺'
+ local upper_res = 'ХЕЛЛОУ ВОРЛД Ё Ё Я Я Э Э Ъ Ъ HELLO WORLD 1234 I I İ 勺#☢༺'
+ local lower_res = 'хеллоу ворлд ё ё я я э э ъ ъ hello world 1234 i i i̇ 勺#☢༺'
+ local s = utf8.upper(str)
+ test:is(s, upper_res, 'default locale upper')
+ s = utf8.lower(str)
+ test:is(s, lower_res, 'default locale lower')
+ test:is(utf8.upper(''), '', 'empty string upper')
+ test:is(utf8.lower(''), '', 'empty string lower')
+ local err
+ s, err = pcall(utf8.upper, true)
+ test:isnt(err:find('Usage'), nil, 'upper usage is checked')
+ s, err = pcall(utf8.lower, true)
+ test:isnt(err:find('Usage'), nil, 'lower usage is checked')
+
+ test:is(utf8.isupper('a'), false, 'isupper("a")')
+ test:is(utf8.isupper('A'), true, 'isupper("A")')
+ test:is(utf8.islower('a'), true, 'islower("a")')
+ test:is(utf8.islower('A'), false, 'islower("A")')
+ test:is(utf8.isalpha('a'), true, 'isalpha("a")')
+ test:is(utf8.isalpha('A'), true, 'isalpha("A")')
+ test:is(utf8.isalpha('aa'), false, 'isalpha("aa")')
+ test:is(utf8.isalpha('勺'), true, 'isalpha("勺")')
+ test:is(utf8.isupper('Ё'), true, 'isupper("Ё")')
+ test:is(utf8.islower('ё'), true, 'islower("ё")')
+ test:is(utf8.isdigit('a'), false, 'isdigit("a")')
+ test:is(utf8.isdigit('1'), true, 'isdigit("1")')
+ test:is(utf8.isdigit('9'), true, 'isdigit("9")')
+
+ test:is(utf8.len(str), 56, 'len works on complex string')
+ s = '12İ☢勺34'
+ test:is(utf8.len(s), 7, 'len works no options')
+ test:is(utf8.len(s, 1), 7, 'default start is 1')
+ test:is(utf8.len(s, 2), 6, 'start 2')
+ test:is(utf8.len(s, 3), 5, 'start 3')
+ local c
+ c, err = utf8.len(s, 4)
+ test:isnil(c, 'middle of symbol offset is error')
+ test:is(err, 4, 'error on 4 byte')
+ test:is(utf8.len(s, 5), 4, 'start 5')
+ c, err = utf8.len(s, 6)
+ test:is(err, 6, 'error on 6 byte')
+ c, err = utf8.len(s, 0)
+ test:is(err, 'position is out of string', 'range is out of string')
+ test:is(utf8.len(s, #s), 1, 'start from the end')
+ test:is(utf8.len(s, #s + 1), 0, 'position is out of string')
+ test:is(utf8.len(s, 1, -1), 7, 'default end is -1')
+ test:is(utf8.len(s, 1, -2), 6, 'end -2')
+ test:is(utf8.len(s, 1, -3), 5, 'end -3')
+ test:is(utf8.len(s, 1, -4), 5, 'end in the middle of symbol')
+ test:is(utf8.len(s, 1, -5), 5, 'end in the middle of symbol')
+ test:is(utf8.len(s, 1, -6), 5, 'end in the middle of symbol')
+ test:is(utf8.len(s, 1, -7), 4, 'end -7')
+ test:is(utf8.len(s, 2, -7), 3, '[2, -7]')
+ test:is(utf8.len(s, 3, -7), 2, '[3, -7]')
+ c, err = utf8.len(s, 4, -7)
+ test:is(err, 4, '[4, -7] is error - start from the middle of symbol')
+ test:is(utf8.len(s, 10, -100), 0, 'it is ok to be out of str by end pos')
+ test:is(utf8.len(s, 10, -10), 0, 'it is ok to swap end and start pos')
+ test:is(utf8.len(''), 0, 'empty len')
+ test:is(utf8.len(s, -6, -1), 3, 'pass both negative offsets')
+ test:is(utf8.len(s, 3, 3), 1, "end in the middle of the same symbol as start")
+ c, err = utf8.len('a\xF4')
+ test:is(err, 2, "invalid unicode in the middle of the string")
+
+ local chars = {}
+ local codes = {}
+ for _, code in utf8.next, s do
+ table.insert(chars, utf8.char(code))
+ table.insert(codes, code)
+ end
+ test:is(table.concat(chars), s, "next and char work")
+ c, err = pcall(utf8.char, 'kek')
+ test:isnt(err:find('bad argument'), nil, 'char usage is checked')
+ c, err = pcall(utf8.next, true)
+ test:isnt(err:find('Usage'), nil, 'next usage is checked')
+ c, err = pcall(utf8.next, '1234', true)
+ test:isnt(err:find('bad argument'), nil, 'next usage is checked')
+ local offset
+ offset, c = utf8.next('')
+ test:isnil(offset, 'next on empty - nil offset')
+ test:isnil(c, 'next on empty - nil code')
+ offset, c = utf8.next('123', 100)
+ test:isnil(offset, 'out of string - nil offset')
+ test:isnil(c, 'out of string - nil code')
+ test:is(utf8.char(unpack(codes)), s, 'char with multiple values')
+
+ local uppers = 0
+ local lowers = 0
+ local digits = 0
+ local letters = 0
+ for _, code in utf8.next, str do
+ if utf8.isupper(code) then uppers = uppers + 1 end
+ if utf8.islower(code) then lowers = lowers + 1 end
+ if utf8.isalpha(code) then letters = letters + 1 end
+ if utf8.isdigit(code) then digits = digits + 1 end
+ end
+ test:is(uppers, 13, 'uppers by code')
+ test:is(lowers, 19, 'lowers by code')
+ test:is(letters, 33, 'letters by code')
+ test:is(digits, 4, 'digits by code')
+
+ s = '12345678'
+ test:is(utf8.sub(s, 1, 1), '1', 'sub [1]')
+ test:is(utf8.sub(s, 1, 2), '12', 'sub [1:2]')
+ test:is(utf8.sub(s, 2, 2), '2', 'sub [2:2]')
+ test:is(utf8.sub(s, 0, 2), '12', 'sub [0:2]')
+ test:is(utf8.sub(s, 3, 7), '34567', 'sub [3:7]')
+ test:is(utf8.sub(s, 7, 3), '', 'sub [7:3]')
+ test:is(utf8.sub(s, 3, 100), '345678', 'sub [3:100]')
+ test:is(utf8.sub(s, 100, 3), '', 'sub [100:3]')
+
+ test:is(utf8.sub(s, 5), '5678', 'sub [5:]')
+ test:is(utf8.sub(s, 1, -1), s, 'sub [1:-1]')
+ test:is(utf8.sub(s, 1, -2), '1234567', 'sub [1:-2]')
+ test:is(utf8.sub(s, 2, -2), '234567', 'sub [2:-2]')
+ test:is(utf8.sub(s, 3, -3), '3456', 'sub [3:-3]')
+ test:is(utf8.sub(s, 5, -4), '5', 'sub [5:-4]')
+ test:is(utf8.sub(s, 7, -7), '', 'sub [7:-7]')
+
+ test:is(utf8.sub(s, -2, -1), '78', 'sub [-2:-1]')
+ test:is(utf8.sub(s, -1, -1), '8', 'sub [-1:-1]')
+ test:is(utf8.sub(s, -4, -2), '567', 'sub [-4:-2]')
+ test:is(utf8.sub(s, -400, -2), '1234567', 'sub [-400:-2]')
+ test:is(utf8.sub(s, -3, -5), '', 'sub [-3:-5]')
+
+ test:is(utf8.sub(s, -6, 5), '345', 'sub [-6:5]')
+ test:is(utf8.sub(s, -5, 4), '4', 'sub [-5:4]')
+ test:is(utf8.sub(s, -2, 2), '', 'sub [-2:2]')
+ test:is(utf8.sub(s, -1, 8), '8', 'sub [-1:8]')
+
+ c, err = pcall(utf8.sub)
+ test:isnt(err:find('Usage'), nil, 'usage is checked')
+ c, err = pcall(utf8.sub, true)
+ test:isnt(err:find('Usage'), nil, 'usage is checked')
+ c, err = pcall(utf8.sub, '123')
+ test:isnt(err:find('Usage'), nil, 'usage is checked')
+ c, err = pcall(utf8.sub, '123', true)
+ test:isnt(err:find('bad argument'), nil, 'usage is checked')
+ c, err = pcall(utf8.sub, '123', 1, true)
+ test:isnt(err:find('bad argument'), nil, 'usage is checked')
+
+ local s1 = '☢'
+ local s2 = 'İ'
+ test:is(s1 < s2, false, 'test binary cmp')
+ test:is(utf8.cmp(s1, s2) < 0, true, 'test unicode <')
+ test:is(utf8.cmp(s1, s1) == 0, true, 'test unicode eq')
+ test:is(utf8.cmp(s2, s1) > 0, true, 'test unicode >')
+ test:is(utf8.casecmp('a', 'A') == 0, true, 'test icase ==')
+ test:is(utf8.casecmp('b', 'A') > 0, true, 'test icase >, first')
+ test:is(utf8.casecmp('B', 'a') > 0, true, 'test icase >, second >')
+ test:is(utf8.cmp('', '') == 0, true, 'test empty compare')
+ test:is(utf8.cmp('', 'a') < 0, true, 'test left empty compare')
+ test:is(utf8.cmp('a', '') > 0, true, 'test right empty compare')
+ test:is(utf8.casecmp('', '') == 0, true, 'test empty icompare')
+ test:is(utf8.casecmp('', 'a') < 0, true, 'test left empty icompare')
+ test:is(utf8.casecmp('a', '') > 0, true, 'test right empty icompare')
+end)
+
os.exit(test:check() == true and 0 or -1)
diff --git a/test/box/ddl.result b/test/box/ddl.result
index f249f8fe3..30f0cf7ec 100644
--- a/test/box/ddl.result
+++ b/test/box/ddl.result
@@ -500,6 +500,21 @@ box.space._collation.index.name:delete{'test'}
- [3, 'test', 0, 'ICU', 'ru_RU', {}]
...
--
+-- gh-3290: expose ICU into Lua. It uses built-in collations, that
+-- must work even if a collation is deleted from _collation.
+--
+t = box.space._collation:delete{1}
+---
+...
+utf8.cmp('abc', 'def')
+---
+- -1
+...
+box.space._collation:replace(t)
+---
+- [1, 'unicode', 1, 'ICU', '', {}]
+...
+--
-- gh-2839: allow to store custom fields in field definition.
--
format = {}
diff --git a/test/box/ddl.test.lua b/test/box/ddl.test.lua
index 6029c6eb6..ebbefe77b 100644
--- a/test/box/ddl.test.lua
+++ b/test/box/ddl.test.lua
@@ -191,6 +191,14 @@ test_run:cmd('restart server default')
box.space._collation:select{}
box.space._collation.index.name:delete{'test'}
+--
+-- gh-3290: expose ICU into Lua. It uses built-in collations, that
+-- must work even if a collation is deleted from _collation.
+--
+t = box.space._collation:delete{1}
+utf8.cmp('abc', 'def')
+box.space._collation:replace(t)
+
--
-- gh-2839: allow to store custom fields in field definition.
--
--
2.15.1 (Apple Git-101)