Tarantool development patches archive
 help / color / mirror / Atom feed
* [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua
@ 2018-04-28 22:45 Vladislav Shpilevoy
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 1/5] alter: fix assertion in collations alter Vladislav Shpilevoy
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Branch: http://github.com/tarantool/tarantool/tree/gh-3290-lua-icu
Issue: https://github.com/tarantool/tarantool/issues/3290
Issue: https://github.com/tarantool/tarantool/issues/3385
Issue: https://github.com/tarantool/tarantool/issues/3081

Lua can not work with unicode - in Lua it is enterpreted as a binary. On such
string built-in upper/lower functions, '#' (length) and comparison operators do
not work. But Tarantool links with ICU and has comparators with collations, that
can solve the problems.

But there is another issue - string methods must be available before box.cfg,
so the ICU and collations must be built out of main 'box' static library. To do
this the collation related files are moved from 'box' into 'core' library.

A second issue is that when box.cfg is called, it inserts built-in collations
into _collation space, and these insertions can conflict with built-in
collations, created before box.cfg. Delete from _collation can break the
collations cache as well. The patchset solves this by checking collations
deletions and insertions, and if they tries to operate on built-in collations,
then they are ignored - a user sees changes in _collation, but the cache is
unchanged.

Vladislav Shpilevoy (5):
  alter: fix assertion in collations alter
  Move struct on_access_denied_ctx into error.h
  Merge box_error, stat and collations into core library
  Always store built-in collations in the cache
  lua: introduce utf8 built-in globaly visible module

 cmake/module.cmake           |   4 +-
 src/CMakeLists.txt           |  17 +-
 src/box/CMakeLists.txt       |  18 +-
 src/box/alter.cc             | 129 +++++++------
 src/box/lua/call.c           |   2 +-
 src/box/lua/error.cc         |   2 +-
 src/box/lua/net_box.c        |   2 +-
 src/box/lua/tuple.c          |   2 +-
 src/box/lua/xlog.c           |   2 +-
 src/box/schema.h             |  12 --
 src/{box => }/coll.c         |   0
 src/{box => }/coll.h         |   0
 src/{box => }/coll_cache.c   |  34 ++++
 src/{box => }/coll_cache.h   |   0
 src/{box => }/coll_def.c     |   7 +
 src/{box => }/coll_def.h     |  13 ++
 src/{box => }/errcode.c      |   0
 src/{box => }/errcode.h      |   0
 src/{box => }/error.cc       |   1 -
 src/{box => }/error.h        |  12 ++
 src/lua/init.c               |   3 +
 src/lua/utf8.c               | 451 +++++++++++++++++++++++++++++++++++++++++++
 src/lua/utf8.h               |  42 ++++
 src/main.cc                  |   2 +-
 src/{box => }/opt_def.c      |   0
 src/{box => }/opt_def.h      |   0
 test/app-tap/string.test.lua | 160 ++++++++++++++-
 test/box/ddl.result          |  25 +++
 test/box/ddl.test.lua        |  11 ++
 test/unit/CMakeLists.txt     |   8 +-
 test/unit/coll.cpp           |   3 +-
 31 files changed, 862 insertions(+), 100 deletions(-)
 rename src/{box => }/coll.c (100%)
 rename src/{box => }/coll.h (100%)
 rename src/{box => }/coll_cache.c (75%)
 rename src/{box => }/coll_cache.h (100%)
 rename src/{box => }/coll_def.c (96%)
 rename src/{box => }/coll_def.h (94%)
 rename src/{box => }/errcode.c (100%)
 rename src/{box => }/errcode.h (100%)
 rename src/{box => }/error.cc (99%)
 rename src/{box => }/error.h (95%)
 create mode 100644 src/lua/utf8.c
 create mode 100644 src/lua/utf8.h
 rename src/{box => }/opt_def.c (100%)
 rename src/{box => }/opt_def.h (100%)

-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] [PATCH v2 1/5] alter: fix assertion in collations alter
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
@ 2018-04-28 22:45 ` Vladislav Shpilevoy
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h Vladislav Shpilevoy
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

Forbid collations update - it leads to an assertion in collation
cache about unavailability to correctly replace collations.
Refactor collations deletion and insertion.

Besides, collation update potentially changes indexing order with
no actual indexes rebuilding.
---
 src/box/alter.cc      | 115 ++++++++++++++++++++++++++------------------------
 test/box/ddl.result   |  10 +++++
 test/box/ddl.test.lua |   3 ++
 3 files changed, 74 insertions(+), 54 deletions(-)

diff --git a/src/box/alter.cc b/src/box/alter.cc
index 77b03e680..be2ccc061 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -2376,19 +2376,31 @@ coll_cache_rollback(struct trigger *trigger, void *event)
 	struct txn_stmt *stmt = txn_last_stmt((struct txn*) event);
 	struct tuple *new_tuple = stmt->new_tuple;
 
-	if (new_tuple != NULL) {
+	if (new_tuple == NULL && old_coll != NULL) {
+		/* Rollback DELETE. */
+		if (coll_by_id(old_coll->id) == old_coll) {
+			/*
+			 * Nothing to do - the tx had been failed
+			 * before old collation deletion.
+			 */
+			return;
+		}
+		struct coll *replaced;
+		if (coll_cache_replace(old_coll, &replaced) != 0) {
+			panic("Out of memory on insertion into collation "\
+			      "cache");
+		}
+		assert(replaced == NULL);
+	} else {
+		assert(new_tuple != NULL && old_coll == NULL);
 		uint32_t new_id = tuple_field_u32_xc(new_tuple,
 						     BOX_COLLATION_FIELD_ID);
 		struct coll *new_coll = coll_by_id(new_id);
-		coll_cache_delete(new_coll);
-		coll_delete(new_coll);
-	}
-
-	if (old_coll != NULL) {
-		struct coll *replaced;
-		int rc = coll_cache_replace(old_coll, &replaced);
-		assert(rc == 0 && replaced == NULL);
-		(void)rc;
+		/* Can be NULL, if failed before cache update. */
+		if (new_coll != NULL) {
+			coll_cache_delete(new_coll);
+			coll_delete(new_coll);
+		}
 	}
 }
 
@@ -2412,59 +2424,54 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
 	struct tuple *old_tuple = stmt->old_tuple;
 	struct tuple *new_tuple = stmt->new_tuple;
 	txn_check_singlestatement_xc(txn, "Space _collation");
-	struct coll *old_coll = NULL;
-	if (old_tuple != NULL) {
+	if (new_tuple == NULL && old_tuple != NULL) {
+		/* DELETE */
 		/* TODO: Check that no index uses the collation */
-		uint32_t old_id = tuple_field_u32_xc(old_tuple,
-						     BOX_COLLATION_FIELD_ID);
-		old_coll = coll_by_id(old_id);
+		int32_t old_id = tuple_field_u32_xc(old_tuple,
+						    BOX_COLLATION_FIELD_ID);
+		struct coll *old_coll = coll_by_id(old_id);
 		assert(old_coll != NULL);
 		access_check_ddl(old_coll->name, old_coll->owner_id,
-				 SC_COLLATION,
-				 new_tuple == NULL ? PRIV_D: PRIV_A,
-				 false);
-
-		struct trigger *on_commit =
-			txn_alter_trigger_new(coll_cache_delete_coll, old_coll);
-		txn_on_commit(txn, on_commit);
-	}
-
-	if (new_tuple == NULL) {
-		/* Simple DELETE */
-		assert(old_tuple != NULL);
-		coll_cache_delete(old_coll);
-
+				 SC_COLLATION, PRIV_D, false);
+		/*
+		 * Set on_rollback before deletion from the cache.
+		 * Else rollback setting can fail and the
+		 * collation will be lost.
+		 */
 		struct trigger *on_rollback =
 			txn_alter_trigger_new(coll_cache_rollback, old_coll);
 		txn_on_rollback(txn, on_rollback);
-		return;
-	}
-
-	struct coll_def new_def;
-	coll_def_new_from_tuple(new_tuple, &new_def);
-	access_check_ddl(new_def.name, new_def.owner_id, SC_COLLATION,
-			 old_tuple == NULL ? PRIV_C : PRIV_A, false);
-	struct coll *new_coll = coll_new(&new_def);
-	if (new_coll == NULL)
-		diag_raise();
-	auto def_guard = make_scoped_guard([=] { coll_delete(new_coll); });
-
-	struct coll *replaced;
-	if (coll_cache_replace(new_coll, &replaced) != 0)
-		diag_raise();
-	if (replaced == NULL && old_coll != NULL) {
+		coll_cache_delete(old_coll);
+		struct trigger *on_commit =
+			txn_alter_trigger_new(coll_cache_delete_coll, old_coll);
+		txn_on_commit(txn, on_commit);
+	} else if (new_tuple != NULL && old_tuple == NULL) {
+		/* INSERT */
+		struct coll_def new_def;
+		coll_def_new_from_tuple(new_tuple, &new_def);
+		access_check_ddl(new_def.name, new_def.owner_id, SC_COLLATION,
+				 PRIV_C, false);
 		/*
-		 * ID of a collation was changed.
-		 * Remove collation by old ID.
+		 * Set on_rollback before the collation is
+		 * inserted into the cache. Else if the trigger
+		 * creation fails the collation will not be
+		 * deleted on abort.
 		 */
-		coll_cache_delete(old_coll);
+		struct trigger *on_rollback =
+			txn_alter_trigger_new(coll_cache_rollback, NULL);
+		txn_on_rollback(txn, on_rollback);
+		struct coll *new_coll = coll_new(&new_def);
+		if (new_coll == NULL)
+			diag_raise();
+		struct coll *replaced;
+		if (coll_cache_replace(new_coll, &replaced) != 0)
+			diag_raise();
+		assert(replaced == NULL);
+	} else {
+		/* UPDATE */
+		assert(new_tuple != NULL && old_tuple != NULL);
+		tnt_raise(ClientError, ER_UNSUPPORTED, "collation", "alter");
 	}
-
-	struct trigger *on_rollback =
-		txn_alter_trigger_new(coll_cache_rollback, old_coll);
-	txn_on_rollback(txn, on_rollback);
-
-	def_guard.is_active = false;
 }
 
 /**
diff --git a/test/box/ddl.result b/test/box/ddl.result
index 0eef37992..87b9581c6 100644
--- a/test/box/ddl.result
+++ b/test/box/ddl.result
@@ -289,6 +289,16 @@ box.internal.collation.create('test', 'ICU', 'ru_RU', {strength='primary'}) --ok
 box.internal.collation.drop('test') --ok
 ---
 ...
+c = box.space._collation:get{1}:totable()
+---
+...
+c[2] = 'unicode_test'
+---
+...
+box.space._collation:replace(c)
+---
+- error: collation does not support alter
+...
 box.begin() box.internal.collation.create('test2', 'ICU', 'ru_RU')
 ---
 - error: Space _collation does not support multi-statement transactions
diff --git a/test/box/ddl.test.lua b/test/box/ddl.test.lua
index 820fe7d4d..a1502ae13 100644
--- a/test/box/ddl.test.lua
+++ b/test/box/ddl.test.lua
@@ -127,6 +127,9 @@ box.internal.collation.create('test', 'ICU', 'ru_RU', {strength=2}) --ok
 box.internal.collation.drop('test') --ok
 box.internal.collation.create('test', 'ICU', 'ru_RU', {strength='primary'}) --ok
 box.internal.collation.drop('test') --ok
+c = box.space._collation:get{1}:totable()
+c[2] = 'unicode_test'
+box.space._collation:replace(c)
 
 box.begin() box.internal.collation.create('test2', 'ICU', 'ru_RU')
 box.rollback()
-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 1/5] alter: fix assertion in collations alter Vladislav Shpilevoy
@ 2018-04-28 22:45 ` Vladislav Shpilevoy
  2018-05-04 11:06   ` [tarantool-patches] " Alexander Turenko
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library Vladislav Shpilevoy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

The issue #3290 was not only about upper/lower Lua functions, but
about unicode comparison functions too. Actually, the issue
requests upper/lower exactly to do string comparison, that can be
done more quick with no garbage strings creation. For this
Tarantool collations can be used.

To be able to expose collations into Lua, the coll.h/.c,
coll_def.h/.c and coll_cache.h/.c must be moved from 'box' static
library into 'core' static library so that they will be built
together with string utils. But they require 'stat' and
'box_error' libraries. The patch prepares the files going to be
moved, so in the next patch they are just moved, with no changes.
It saves commit history.
---
 src/box/error.cc |  1 -
 src/box/error.h  | 12 ++++++++++++
 src/box/schema.h | 12 ------------
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/src/box/error.cc b/src/box/error.cc
index 99f519537..4bab5d720 100644
--- a/src/box/error.cc
+++ b/src/box/error.cc
@@ -174,7 +174,6 @@ BuildXlogError(const char *file, unsigned line, const char *format, ...)
 	}
 }
 
-#include "schema.h"
 #include "trigger.h"
 
 struct rlist on_access_denied = RLIST_HEAD_INITIALIZER(on_access_denied);
diff --git a/src/box/error.h b/src/box/error.h
index c791e6c6a..69a1022ad 100644
--- a/src/box/error.h
+++ b/src/box/error.h
@@ -133,6 +133,18 @@ extern const struct type_info type_ClientError;
 extern const struct type_info type_XlogError;
 extern const struct type_info type_AccessDeniedError;
 
+/**
+ * Context passed to on_access_denied trigger.
+ */
+struct on_access_denied_ctx {
+	/** Type of declined access */
+	const char *access_type;
+	/** Type of object the required access was denied to */
+	const char *object_type;
+	/** Name of object the required access was denied to */
+	const char *object_name;
+};
+
 #if defined(__cplusplus)
 } /* extern "C" */
 #include "exception.h"
diff --git a/src/box/schema.h b/src/box/schema.h
index 56f39b3fe..4b2a3da4e 100644
--- a/src/box/schema.h
+++ b/src/box/schema.h
@@ -224,16 +224,4 @@ extern struct rlist on_alter_sequence;
  */
 extern struct rlist on_access_denied;
 
-/**
- * Context passed to on_access_denied trigger.
- */
-struct on_access_denied_ctx {
-	/** Type of declined access */
-	const char *access_type;
-	/** Type of object the required access was denied to */
-	const char *object_type;
-	/** Name of object the required access was denied to */
-	const char *object_name;
-};
-
 #endif /* INCLUDES_TARANTOOL_BOX_SCHEMA_H */
-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 1/5] alter: fix assertion in collations alter Vladislav Shpilevoy
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h Vladislav Shpilevoy
@ 2018-04-28 22:45 ` Vladislav Shpilevoy
  2018-05-04 11:36   ` [tarantool-patches] " Alexander Turenko
  2018-05-08 13:18   ` Konstantin Osipov
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 4/5] Always store built-in collations in the cache Vladislav Shpilevoy
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

The goal is to expose collations into Lua with no dependencies on
box library. But collations merge into core requires box_error
and stat libraries too.
---
 cmake/module.cmake         |  4 ++--
 src/CMakeLists.txt         | 14 ++++++++++----
 src/box/CMakeLists.txt     | 18 +++++-------------
 src/box/lua/call.c         |  2 +-
 src/box/lua/error.cc       |  2 +-
 src/box/lua/net_box.c      |  2 +-
 src/box/lua/tuple.c        |  2 +-
 src/box/lua/xlog.c         |  2 +-
 src/{box => }/coll.c       |  0
 src/{box => }/coll.h       |  0
 src/{box => }/coll_cache.c |  0
 src/{box => }/coll_cache.h |  0
 src/{box => }/coll_def.c   |  0
 src/{box => }/coll_def.h   |  0
 src/{box => }/errcode.c    |  0
 src/{box => }/errcode.h    |  0
 src/{box => }/error.cc     |  0
 src/{box => }/error.h      |  0
 src/main.cc                |  2 +-
 src/{box => }/opt_def.c    |  0
 src/{box => }/opt_def.h    |  0
 test/unit/CMakeLists.txt   |  8 ++++----
 test/unit/coll.cpp         |  3 +--
 23 files changed, 28 insertions(+), 31 deletions(-)
 rename src/{box => }/coll.c (100%)
 rename src/{box => }/coll.h (100%)
 rename src/{box => }/coll_cache.c (100%)
 rename src/{box => }/coll_cache.h (100%)
 rename src/{box => }/coll_def.c (100%)
 rename src/{box => }/coll_def.h (100%)
 rename src/{box => }/errcode.c (100%)
 rename src/{box => }/errcode.h (100%)
 rename src/{box => }/error.cc (100%)
 rename src/{box => }/error.h (100%)
 rename src/{box => }/opt_def.c (100%)
 rename src/{box => }/opt_def.h (100%)

diff --git a/cmake/module.cmake b/cmake/module.cmake
index 988dcb94d..434938cdf 100644
--- a/cmake/module.cmake
+++ b/cmake/module.cmake
@@ -25,7 +25,7 @@ function(rebuild_module_api)
         COMMAND ${CMAKE_C_COMPILER}
             ${cflags}
             -I ${CMAKE_SOURCE_DIR}/src -I ${CMAKE_BINARY_DIR}/src
-            -E ${CMAKE_SOURCE_DIR}/src/box/errcode.h > ${errcodefile}
+            -E ${CMAKE_SOURCE_DIR}/src/errcode.h > ${errcodefile}
         COMMAND
             grep "enum box_error_code" ${errcodefile} >> ${tmpfile}
         COMMAND cat ${CMAKE_CURRENT_SOURCE_DIR}/module_footer.h >> ${tmpfile}
@@ -34,7 +34,7 @@ function(rebuild_module_api)
         DEPENDS ${CMAKE_SOURCE_DIR}/extra/apigen
                 ${CMAKE_CURRENT_SOURCE_DIR}/module_header.h
                 ${CMAKE_CURRENT_SOURCE_DIR}/module_footer.h
-                ${CMAKE_SOURCE_DIR}/src/box/errcode.h
+                ${CMAKE_SOURCE_DIR}/src/errcode.h
                 ${headers}
         )
 
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 8ab09e968..45eb7a431 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -94,6 +94,15 @@ set (core_sources
      random.c
      trigger.cc
      http_parser.c
+     error.cc
+     errcode.c
+     rmean.c
+     latency.c
+     histogram.c
+     coll_def.c
+     coll.c
+     coll_cache.c
+     opt_def.c
  )
 
 if (TARGET_OS_NETBSD)
@@ -112,9 +121,6 @@ target_link_libraries(core
     ${MSGPUCK_LIBRARIES}
 )
 
-add_library(stat STATIC rmean.c latency.c histogram.c)
-target_link_libraries(stat core)
-
 if (ENABLE_BACKTRACE AND NOT TARGET_OS_DARWIN)
 	target_link_libraries(core gcc_s ${UNWIND_LIBRARIES})
 endif()
@@ -190,7 +196,7 @@ set(api_headers
     ${CMAKE_SOURCE_DIR}/src/box/box.h
     ${CMAKE_SOURCE_DIR}/src/box/index.h
     ${CMAKE_SOURCE_DIR}/src/box/iterator_type.h
-    ${CMAKE_SOURCE_DIR}/src/box/error.h
+    ${CMAKE_SOURCE_DIR}/src/error.h
     ${CMAKE_SOURCE_DIR}/src/box/lua/call.h
     ${CMAKE_SOURCE_DIR}/src/box/lua/tuple.h
     ${CMAKE_SOURCE_DIR}/src/latch.h
diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
index ae156a2f5..30d636e47 100644
--- a/src/box/CMakeLists.txt
+++ b/src/box/CMakeLists.txt
@@ -21,15 +21,12 @@ set_property(DIRECTORY PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${lua_sources})
 
 include_directories(${ZSTD_INCLUDE_DIRS})
 
-add_library(box_error STATIC error.cc errcode.c)
-target_link_libraries(box_error core stat)
-
 add_library(vclock STATIC vclock.c)
 target_link_libraries(vclock core)
 
 add_library(xrow STATIC xrow.c iproto_constants.c)
-target_link_libraries(xrow server core small vclock misc box_error
-                      scramble ${MSGPUCK_LIBRARIES})
+target_link_libraries(xrow server core small vclock misc scramble
+                      ${MSGPUCK_LIBRARIES})
 
 add_library(tuple STATIC
     tuple.c
@@ -41,20 +38,15 @@ add_library(tuple STATIC
     tuple_bloom.c
     tuple_dictionary.c
     key_def.cc
-    coll_def.c
-    coll.c
-    coll_cache.c
     field_def.c
-    opt_def.c
 )
-target_link_libraries(tuple json_path box_error core ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES} misc bit)
+target_link_libraries(tuple json_path core ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES} misc bit)
 
 add_library(xlog STATIC xlog.c)
-target_link_libraries(xlog core box_error crc32 ${ZSTD_LIBRARIES})
+target_link_libraries(xlog core crc32 ${ZSTD_LIBRARIES})
 
 add_library(box STATIC
     iproto.cc
-    error.cc
     xrow_io.cc
     tuple_convert.c
     identifier.c
@@ -132,6 +124,6 @@ add_library(box STATIC
     lua/xlog.c
     ${bin_sources})
 
-target_link_libraries(box box_error tuple stat xrow xlog vclock crc32 scramble
+target_link_libraries(box tuple xrow xlog vclock crc32 scramble
                       ${common_libraries})
 add_dependencies(box build_bundled_libs)
diff --git a/src/box/lua/call.c b/src/box/lua/call.c
index be13812aa..b60c6c397 100644
--- a/src/box/lua/call.c
+++ b/src/box/lua/call.c
@@ -30,7 +30,7 @@
  */
 #include "box/lua/call.h"
 #include "box/call.h"
-#include "box/error.h"
+#include <error.h>
 #include "fiber.h"
 
 #include "lua/utils.h"
diff --git a/src/box/lua/error.cc b/src/box/lua/error.cc
index 314907421..960ea2aa9 100644
--- a/src/box/lua/error.cc
+++ b/src/box/lua/error.cc
@@ -40,7 +40,7 @@ extern "C" {
 #include <errinj.h>
 
 #include "lua/utils.h"
-#include "box/error.h"
+#include "src/error.h"
 
 static int
 luaT_error_raise(lua_State *L)
diff --git a/src/box/lua/net_box.c b/src/box/lua/net_box.c
index db2d2dbb4..f5cc8d674 100644
--- a/src/box/lua/net_box.c
+++ b/src/box/lua/net_box.c
@@ -43,7 +43,7 @@
 #include "third_party/base64.h"
 
 #include "coio.h"
-#include "box/errcode.h"
+#include "errcode.h"
 #include "lua/fiber.h"
 
 #define cfg luaL_msgpack_default
diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
index 67abb32bd..889f0569b 100644
--- a/src/box/lua/tuple.c
+++ b/src/box/lua/tuple.c
@@ -39,7 +39,7 @@
 
 #include "box/tuple.h"
 #include "box/tuple_convert.h"
-#include "box/errcode.h"
+#include "errcode.h"
 #include "box/memtx_tuple.h"
 #include "json/path.h"
 
diff --git a/src/box/lua/xlog.c b/src/box/lua/xlog.c
index 030f5c2d4..c53b5a21a 100644
--- a/src/box/lua/xlog.c
+++ b/src/box/lua/xlog.c
@@ -35,7 +35,7 @@
 #include <diag.h>
 #include <msgpuck/msgpuck.h>
 
-#include <box/error.h>
+#include "error.h"
 #include <box/xlog.h>
 #include <box/xrow.h>
 #include <box/iproto_constants.h>
diff --git a/src/box/coll.c b/src/coll.c
similarity index 100%
rename from src/box/coll.c
rename to src/coll.c
diff --git a/src/box/coll.h b/src/coll.h
similarity index 100%
rename from src/box/coll.h
rename to src/coll.h
diff --git a/src/box/coll_cache.c b/src/coll_cache.c
similarity index 100%
rename from src/box/coll_cache.c
rename to src/coll_cache.c
diff --git a/src/box/coll_cache.h b/src/coll_cache.h
similarity index 100%
rename from src/box/coll_cache.h
rename to src/coll_cache.h
diff --git a/src/box/coll_def.c b/src/coll_def.c
similarity index 100%
rename from src/box/coll_def.c
rename to src/coll_def.c
diff --git a/src/box/coll_def.h b/src/coll_def.h
similarity index 100%
rename from src/box/coll_def.h
rename to src/coll_def.h
diff --git a/src/box/errcode.c b/src/errcode.c
similarity index 100%
rename from src/box/errcode.c
rename to src/errcode.c
diff --git a/src/box/errcode.h b/src/errcode.h
similarity index 100%
rename from src/box/errcode.h
rename to src/errcode.h
diff --git a/src/box/error.cc b/src/error.cc
similarity index 100%
rename from src/box/error.cc
rename to src/error.cc
diff --git a/src/box/error.h b/src/error.h
similarity index 100%
rename from src/box/error.h
rename to src/error.h
diff --git a/src/main.cc b/src/main.cc
index 1682baea0..3d61e3c51 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -63,7 +63,7 @@
 #include "tt_pthread.h"
 #include "lua/init.h"
 #include "box/box.h"
-#include "box/error.h"
+#include "error.h"
 #include "scoped_guard.h"
 #include "random.h"
 #include "tt_uuid.h"
diff --git a/src/box/opt_def.c b/src/opt_def.c
similarity index 100%
rename from src/box/opt_def.c
rename to src/opt_def.c
diff --git a/src/box/opt_def.h b/src/opt_def.h
similarity index 100%
rename from src/box/opt_def.h
rename to src/opt_def.h
diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
index 049edf641..8792a3ed2 100644
--- a/test/unit/CMakeLists.txt
+++ b/test/unit/CMakeLists.txt
@@ -90,10 +90,10 @@ add_executable(fiber_channel_stress.test fiber_channel_stress.cc)
 target_link_libraries(fiber_channel_stress.test core)
 
 add_executable(cbus_stress.test cbus_stress.c)
-target_link_libraries(cbus_stress.test core stat)
+target_link_libraries(cbus_stress.test core)
 
 add_executable(cbus.test cbus.c)
-target_link_libraries(cbus.test core unit stat)
+target_link_libraries(cbus.test core unit)
 
 add_executable(coio.test coio.cc)
 target_link_libraries(coio.test core eio bit uri unit)
@@ -133,9 +133,9 @@ add_executable(json_path.test json_path.c)
 target_link_libraries(json_path.test json_path unit ${ICU_LIBRARIES})
 
 add_executable(rmean.test rmean.cc)
-target_link_libraries(rmean.test stat unit)
+target_link_libraries(rmean.test core unit)
 add_executable(histogram.test histogram.c)
-target_link_libraries(histogram.test stat unit)
+target_link_libraries(histogram.test core unit)
 
 add_executable(say.test say.c)
 target_link_libraries(say.test core unit)
diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
index e1643c9c0..acf69b030 100644
--- a/test/unit/coll.cpp
+++ b/test/unit/coll.cpp
@@ -1,9 +1,8 @@
-#include "box/coll.h"
+#include "coll.h"
 #include <iostream>
 #include <vector>
 #include <algorithm>
 #include <string.h>
-#include <box/coll_def.h>
 #include <assert.h>
 #include <msgpuck.h>
 #include <diag.h>
-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] [PATCH v2 4/5] Always store built-in collations in the cache
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
                   ` (2 preceding siblings ...)
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library Vladislav Shpilevoy
@ 2018-04-28 22:45 ` Vladislav Shpilevoy
  2018-05-08 13:20   ` [tarantool-patches] " Konstantin Osipov
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
  2018-05-05  0:19 ` [tarantool-patches] Re: [PATCH v2 0/5] Expose ICU into Lua Alexander Turenko
  5 siblings, 1 reply; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

To be able to use collations out of box.cfg it is needed to be
sure, that they are available always. Else it can break non-box
collation usage. So when a built-in collation is deleted from
_collation (for example, on box.internal.bootstrap()), it is not
deleted from the cache. When a built-in collation is inserted
into _collation, it does not touch the cache (it already contains
all built-in collations).
---
 src/box/alter.cc | 14 ++++++++++++++
 src/coll_cache.c | 34 ++++++++++++++++++++++++++++++++++
 src/coll_def.c   |  7 +++++++
 src/coll_def.h   | 13 +++++++++++++
 4 files changed, 68 insertions(+)

diff --git a/src/box/alter.cc b/src/box/alter.cc
index be2ccc061..39fcc8de4 100644
--- a/src/box/alter.cc
+++ b/src/box/alter.cc
@@ -2429,6 +2429,13 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
 		/* TODO: Check that no index uses the collation */
 		int32_t old_id = tuple_field_u32_xc(old_tuple,
 						    BOX_COLLATION_FIELD_ID);
+		if (coll_is_system(old_id)) {
+			/*
+			 * Built-in collations can be deleted from
+			 * _collation, but not from the cache.
+			 */
+			return;
+		}
 		struct coll *old_coll = coll_by_id(old_id);
 		assert(old_coll != NULL);
 		access_check_ddl(old_coll->name, old_coll->owner_id,
@@ -2451,6 +2458,13 @@ on_replace_dd_collation(struct trigger * /* trigger */, void *event)
 		coll_def_new_from_tuple(new_tuple, &new_def);
 		access_check_ddl(new_def.name, new_def.owner_id, SC_COLLATION,
 				 PRIV_C, false);
+		if (coll_is_system(new_def.id)) {
+			/*
+			 * Built-in collation is in the cache
+			 * already.
+			 */
+			return;
+		}
 		/*
 		 * Set on_rollback before the collation is
 		 * inserted into the cache. Else if the trigger
diff --git a/src/coll_cache.c b/src/coll_cache.c
index b7eb3edb9..b7fce5aba 100644
--- a/src/coll_cache.c
+++ b/src/coll_cache.c
@@ -45,7 +45,41 @@ coll_cache_init()
 			 "coll_cache_id");
 		return -1;
 	}
+	struct coll *replaced, *unicode_coll, *unicode_ci_coll;
+	struct coll_def def;
+	memset(&def, 0, sizeof(def));
+	def.id = COLLATION_ID_UNICODE;
+	def.owner_id = 1;
+	def.name = "unicode";
+	def.name_len = strlen(def.name);
+	def.locale = "";
+	def.type = COLL_TYPE_ICU;
+	unicode_coll = coll_new(&def);
+	if (unicode_coll == NULL)
+		goto err_unicode;
+	if (coll_cache_replace(unicode_coll, &replaced) != 0)
+		goto err_unicode_replace;
+	assert(replaced == NULL);
+
+	def.id = COLLATION_ID_UNICODE_CI;
+	def.name = "unicode_ci";
+	def.name_len = strlen(def.name);
+	def.icu.strength = COLL_ICU_STRENGTH_PRIMARY;
+	unicode_ci_coll = coll_new(&def);
+	if (unicode_ci_coll == NULL)
+		goto err_unicode_ci;
+	if (coll_cache_replace(unicode_ci_coll, &replaced) != 0)
+		goto err_unicode_ci_replace;
 	return 0;
+err_unicode_ci_replace:
+	coll_delete(unicode_ci_coll);
+err_unicode_ci:
+	coll_cache_delete(unicode_coll);
+err_unicode_replace:
+	coll_delete(unicode_coll);
+err_unicode:
+	mh_i32ptr_delete(coll_cache_id);
+	return -1;
 }
 
 /** Delete global hash tables. */
diff --git a/src/coll_def.c b/src/coll_def.c
index f849845b3..0a0ead9a1 100644
--- a/src/coll_def.c
+++ b/src/coll_def.c
@@ -63,6 +63,13 @@ const char *coll_icu_strength_strs[] = {
 	"IDENTICAL"
 };
 
+bool
+coll_is_system(int coll_id)
+{
+	return coll_id == COLLATION_ID_UNICODE ||
+	       coll_id == COLLATION_ID_UNICODE_CI;
+}
+
 static int64_t
 icu_on_off_from_str(const char *str, uint32_t len)
 {
diff --git a/src/coll_def.h b/src/coll_def.h
index 7a1027a1e..e86c546e5 100644
--- a/src/coll_def.h
+++ b/src/coll_def.h
@@ -49,6 +49,19 @@ enum coll_type {
 
 extern const char *coll_type_strs[];
 
+/** Built-in collations. */
+enum {
+	COLLATION_ID_UNICODE = 1,
+	COLLATION_ID_UNICODE_CI = 2,
+};
+
+/**
+ * Check if a collation is system by its ID. System collations can
+ * not be deleted.
+ */
+bool
+coll_is_system(int coll_id);
+
 /*
  * ICU collation options. See
  * http://icu-project.org/apiref/icu4c/ucol_8h.html#a583fbe7fc4a850e2fcc692e766d2826c
-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
                   ` (3 preceding siblings ...)
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 4/5] Always store built-in collations in the cache Vladislav Shpilevoy
@ 2018-04-28 22:45 ` Vladislav Shpilevoy
  2018-05-04 22:33   ` [tarantool-patches] " Alexander Turenko
  2018-05-08 13:21   ` Konstantin Osipov
  2018-05-05  0:19 ` [tarantool-patches] Re: [PATCH v2 0/5] Expose ICU into Lua Alexander Turenko
  5 siblings, 2 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-04-28 22:45 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

utf8 is a module partially compatible with Lua 5.3 utf8 and
lua-utf8 third party module.
Partially means, that not all functions are implemented.

The patch introduces these ones:
upper, lower, len, char, sub.

All of them works exactly like in Lua 5.3, or if Lua 5.3 has no
a function, then works like in lua-utf8.

Tarantool utf8 has extensions:

* isupper/lower/alpha/digit, that check some property by a symbol
  or by its code;

* cmp/casecmp, that compare two UTF8 strings.

Closes #3290
Closes #3385
Closes #3081
---
 src/CMakeLists.txt           |   3 +-
 src/lua/init.c               |   3 +
 src/lua/utf8.c               | 451 +++++++++++++++++++++++++++++++++++++++++++
 src/lua/utf8.h               |  42 ++++
 test/app-tap/string.test.lua | 160 ++++++++++++++-
 test/box/ddl.result          |  15 ++
 test/box/ddl.test.lua        |   8 +
 7 files changed, 680 insertions(+), 2 deletions(-)
 create mode 100644 src/lua/utf8.c
 create mode 100644 src/lua/utf8.h

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 45eb7a431..e4826ddaa 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -170,6 +170,7 @@ set (server_sources
      lua/fio.c
      lua/crypto.c
      lua/httpc.c
+     lua/utf8.c
      ${lua_sources}
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/lyaml.cc
      ${PROJECT_SOURCE_DIR}/third_party/lua-yaml/b64.c
@@ -214,7 +215,7 @@ endif()
 
 set_source_files_compile_flags(${server_sources})
 add_library(server STATIC ${server_sources})
-target_link_libraries(server core bit uri uuid)
+target_link_libraries(server core bit uri uuid ${ICU_LIBRARIES})
 
 # Rule of thumb: if exporting a symbol from a static library, list the
 # library here.
diff --git a/src/lua/init.c b/src/lua/init.c
index a0a7f63f6..58af1d121 100644
--- a/src/lua/init.c
+++ b/src/lua/init.c
@@ -57,6 +57,7 @@
 #include "lua/pickle.h"
 #include "lua/fio.h"
 #include "lua/httpc.h"
+#include "lua/utf8.h"
 #include "digest.h"
 #include <small/ibuf.h>
 
@@ -399,6 +400,7 @@ tarantool_lua_init(const char *tarantool_bin, int argc, char **argv)
 	lua_call(L, 0, 0);
 	lua_register(L, "tonumber64", lbox_tonumber64);
 
+	tarantool_lua_utf8_init(L);
 	tarantool_lua_utils_init(L);
 	tarantool_lua_fiber_init(L);
 	tarantool_lua_fiber_cond_init(L);
@@ -629,6 +631,7 @@ tarantool_lua_run_script(char *path, bool interactive,
 void
 tarantool_lua_free()
 {
+	tarantool_lua_utf8_free();
 	/*
 	 * Some part of the start script panicked, and called
 	 * exit().  The call stack in this case leads us back to
diff --git a/src/lua/utf8.c b/src/lua/utf8.c
new file mode 100644
index 000000000..d1d3d72c2
--- /dev/null
+++ b/src/lua/utf8.c
@@ -0,0 +1,451 @@
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include <unicode/ucasemap.h>
+#include <unicode/uchar.h>
+#include "lua/utils.h"
+#include "diag.h"
+#include "small/ibuf.h"
+#include "coll_cache.h"
+
+extern struct ibuf *tarantool_lua_ibuf;
+
+/** Default universal casemap for case transformations. */
+static UCaseMap *root_map = NULL;
+
+static int
+utf8_str_to_case(struct lua_State *L, const char *src, int src_bsize,
+		 bool is_to_upper)
+{
+	int i = 0;
+	int dst_bsize = src_bsize;
+	(void) i;
+	do {
+		UErrorCode err = U_ZERO_ERROR;
+		ibuf_reset(tarantool_lua_ibuf);
+		char *dst = ibuf_alloc(tarantool_lua_ibuf, dst_bsize);
+		if (dst == NULL) {
+			diag_set(OutOfMemory, dst_bsize, "ibuf_alloc", "dst");
+			return luaT_error(L);
+		}
+		int real_bsize;
+		if (is_to_upper) {
+			real_bsize = ucasemap_utf8ToUpper(root_map, dst,
+							  dst_bsize, src,
+							  src_bsize, &err);
+		} else {
+			real_bsize = ucasemap_utf8ToLower(root_map, dst,
+							  dst_bsize, src,
+							  src_bsize, &err);
+		}
+		if (err == U_ZERO_ERROR ||
+		    err == U_STRING_NOT_TERMINATED_WARNING) {
+			lua_pushlstring(L, dst, real_bsize);
+			return 1;
+		} else if (err == U_BUFFER_OVERFLOW_ERROR) {
+			assert(real_bsize > dst_bsize);
+			dst_bsize = real_bsize;
+		} else {
+			lua_pushnil(L);
+			lua_pushstring(L, tt_sprintf("error during ICU case "\
+						     "transform: %s",
+						     u_errorName(err)));
+			return 2;
+		}
+		/*
+		 * On a first run either all is ok, or
+		 * toLower/Upper returned needed bsize, that is
+		 * allocated on a second iteration. Third
+		 * iteration is not possible.
+		 */
+		assert(++i < 2);
+	} while (true);
+	unreachable();
+	return 0;
+}
+
+/**
+ * Convert a UTF8 string into upper case.
+ * @param String to convert.
+ * @retval not nil String consisting of upper letters.
+ * @retval nil, error Error.
+ */
+static int
+utf8_upper(struct lua_State *L)
+{
+	if (lua_gettop(L) != 1 || !lua_isstring(L, 1))
+		return luaL_error(L, "Usage: utf8.upper(<string>)");
+	size_t len;
+	const char *str = lua_tolstring(L, 1, &len);
+	return utf8_str_to_case(L, str, len, true);
+}
+
+/**
+ * Convert a UTF8 string into lower case.
+ * @param String to convert.
+ * @retval not nil String consisting of lower letters.
+ * @retval nil, error Error.
+ */
+static int
+utf8_lower(struct lua_State *L)
+{
+	if (lua_gettop(L) != 1 || !lua_isstring(L, 1))
+		return luaL_error(L, "Usage: utf8.lower(<string>)");
+	size_t len;
+	const char *str = lua_tolstring(L, 1, &len);
+	return utf8_str_to_case(L, str, len, false);
+}
+
+/**
+ * Calculate a 0-based positive byte offset in a string by any
+ * 0-based offset (possibly negative).
+ * @param offset Original 0-based offset with any sign.
+ * @param len A string byte length.
+ * @retval 0-based positive offset.
+ */
+static inline int
+utf8_convert_offset(int offset, size_t len)
+{
+	if (offset >= 0)
+		return offset;
+	else if ((size_t)-offset > len)
+		return 0;
+	return len + offset + 1;
+}
+
+/**
+ * Calculate length of a UTF8 string. Length here is symbol count.
+ * Works like utf8.len in Lua 5.3.
+ * @param String to get length.
+ * @param Start byte offset. Must point to the start of symbol. On
+ *        invalid symbol an error is returned. Can be negative.
+ * @param End byte offset, can be negative. Can point to the
+ *        middle of symbol.
+ * @retval not nil Symbol count.
+ * @retval nil, error Error. Byte position of the error is
+ *         returned in the second value.
+ */
+static int
+utf8_len(struct lua_State *L)
+{
+	if (lua_gettop(L) > 3 || !lua_isstring(L, 1))
+		return luaL_error(L, "Usage: utf8.len(<string>, [i, [j]])");
+	size_t slen;
+	const char *str = lua_tolstring(L, 1, &slen);
+	int len = (int)slen;
+	int start_pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
+	int end_pos = utf8_convert_offset(luaL_optinteger(L, 3, -1), len);
+	if (start_pos < 1 || --start_pos > len || end_pos > len) {
+		lua_pushnil(L);
+		lua_pushstring(L, "position is out of string");
+		return 2;
+	}
+	int result = 0;
+	if (end_pos > start_pos) {
+		UChar32 c;
+		++result;
+		/*
+		 * In Lua 5.3 it is not ok to start from bad
+		 * position. But next symbols can be any.
+		 */
+		U8_NEXT(str, start_pos, end_pos, c);
+		if (c == U_SENTINEL) {
+			lua_pushnil(L);
+			lua_pushinteger(L, start_pos);
+			return 2;
+		}
+		while (start_pos < end_pos) {
+			U8_NEXT_UNSAFE(str, start_pos, c);
+			++result;
+		}
+	}
+	lua_pushinteger(L, result);
+	return 1;
+}
+
+/**
+ * Get next symbol code by @an offset.
+ * @param String to get symbol code.
+ * @param Byte offset from which get.
+ *
+ * @retval - No more symbols.
+ * @retval not nil, not nil Byte offset and symbol code.
+ */
+static int
+utf8_next(struct lua_State *L)
+{
+	if (lua_gettop(L) > 2 || !lua_isstring(L, 1))
+		return luaL_error(L, "Usage: utf8.next(<string>, "\
+				     "[<byte offset>])");
+	size_t len;
+	const char *str = lua_tolstring(L, 1, &len);
+	size_t pos =
+		(size_t) utf8_convert_offset(luaL_optinteger(L, 2, 0), len);
+	if (pos >= len)
+		return 0;
+	UChar32 c;
+	U8_NEXT(str, pos, len, c);
+	if (c == U_SENTINEL)
+		return 0;
+	lua_pushinteger(L, pos);
+	lua_pushinteger(L, c);
+	return 2;
+}
+
+/**
+ * Convert a UTF8 char code (or codes) into Lua string. When
+ * multiple codes are provided, they are concatenated into a
+ * monolite string.
+ * @param Char codes.
+ * @retval Result UTF8 string.
+ */
+static int
+utf8_char(struct lua_State *L)
+{
+	int top = lua_gettop(L);
+	if (top < 1)
+		return luaL_error(L, "Usage: utf8.char(<char code>");
+	int len = 0;
+	UChar32 c;
+	/* Fast way - convert one symbol. */
+	if (top == 1) {
+		char buf[U8_MAX_LENGTH];
+		c = luaL_checkinteger(L, 1);
+		U8_APPEND_UNSAFE(buf, len, c);
+		assert(len <= (int)sizeof(buf));
+		lua_pushlstring(L, buf, len);
+		return 1;
+	}
+	/* Slow way - use dynamic buffer. */
+	ibuf_reset(tarantool_lua_ibuf);
+	char *str = ibuf_alloc(tarantool_lua_ibuf, top * U8_MAX_LENGTH);
+	if (str == NULL) {
+		diag_set(OutOfMemory, top * U8_MAX_LENGTH, "ibuf_alloc",
+			 "str");
+		return luaT_error(L);
+	}
+	for (int i = 1; i <= top; ++i) {
+		c = luaL_checkinteger(L, i);
+		U8_APPEND_UNSAFE(str, len, c);
+	}
+	lua_pushlstring(L, str, len);
+	return 1;
+}
+
+/**
+ * Get byte offsets by symbol positions in a string. Positions can
+ * be negative.
+ * @param s Original string.
+ * @param len Length of @an s.
+ * @param start_pos Start position (symbol offset).
+ * @param end_pos End position (symbol offset).
+ * @param[out] start_offset_ Start position (byte offset).
+ * @param[out] end_offset_ End position (byte offset).
+ */
+static void
+utf8_sub(const uint8_t *s, int len, int start_pos, int end_pos,
+	 int *start_offset_, int *end_offset_)
+{
+	int start_offset = 0, end_offset = len;
+	if (start_pos >= 0) {
+		U8_FWD_N(s, start_offset, len, start_pos);
+		if (end_pos >= 0) {
+			/* --[-------]---- ...  */
+			int n = end_pos - start_pos;
+			end_offset = start_offset;
+			U8_FWD_N(s, end_offset, len, n);
+		} else {
+			/* --[---- ... ----]--- */
+			int n = -(end_pos + 1);
+			U8_BACK_N(s, 0, end_offset, n);
+		}
+	} else {
+		int n;
+		if (end_pos < 0) {
+			/* ... -----[-----]--- */
+			n = -(end_pos + 1);
+			U8_BACK_N(s, 0, end_offset, n);
+			start_offset = end_offset;
+			n = end_pos - start_pos + 1;
+		} else {
+			/* ---]-- ... --[---- */
+			end_offset = 0;
+			U8_FWD_N(s, end_offset, len, end_pos);
+			n = -start_pos;
+			start_offset = len;
+		}
+		U8_BACK_N(s, 0, start_offset, n);
+	}
+	*start_offset_ = start_offset;
+	if (start_offset <= end_offset)
+		*end_offset_ = end_offset;
+	else
+		*end_offset_ = start_offset;
+}
+
+/**
+ * Get a substring from a UTF8 string.
+ * @param String to get a substring.
+ * @param Start position in symbol count. Optional, can be
+ *        negative.
+ * @param End position in symbol count. Optional, can be negative.
+ *
+ * @retval Substring.
+ */
+static int
+utf8_lua_sub(struct lua_State *L)
+{
+	if (lua_gettop(L) < 2 || !lua_isstring(L, 1))
+		return luaL_error(L, "Usage: utf8.sub(<string>, [i, [j]])");
+	int start_pos = luaL_checkinteger(L, 2);
+	if (start_pos > 0)
+		--start_pos;
+	int end_pos = luaL_optinteger(L, 3, -1);
+	size_t len;
+	const char *str = lua_tolstring(L, 1, &len);
+	int start_offset, end_offset;
+	utf8_sub((const uint8_t *) str, len, start_pos, end_pos, &start_offset,
+		 &end_offset);
+	assert(end_offset >= start_offset);
+	lua_pushlstring(L, str + start_offset, end_offset - start_offset);
+	return 1;
+}
+
+/** Macro to easy create lua wrappers for ICU symbol checkers. */
+#define UCHAR32_CHECKER(name) \
+static int \
+utf8_##name(struct lua_State *L) \
+{ \
+	if (lua_gettop(L) != 1) \
+		return luaL_error(L, "Usage: utf8."#name"(<string> or "\
+				     "<one symbol code>)"); \
+	UChar32 c; \
+	bool result = false; \
+	if (lua_type(L, 1) == LUA_TSTRING) { \
+		size_t len; \
+		const char *str = lua_tolstring(L, 1, &len); \
+		if (len > 0) { \
+			size_t offset = 0; \
+			U8_NEXT(str, offset, len, c); \
+			result = c != U_SENTINEL && offset == len && \
+				 u_##name(c); \
+		} \
+	} else { \
+		result = u_##name(luaL_checkinteger(L, 1)); \
+	} \
+	lua_pushboolean(L, result); \
+	return 1; \
+}\
+
+UCHAR32_CHECKER(islower)
+UCHAR32_CHECKER(isupper)
+UCHAR32_CHECKER(isdigit)
+UCHAR32_CHECKER(isalpha)
+
+static inline int
+utf8_cmp_impl(struct lua_State *L, const char *usage, uint32_t coll_id)
+{
+	if (lua_gettop(L) != 2 || !lua_isstring(L, 1) || !lua_isstring(L, 2))
+		luaL_error(L, usage);
+	size_t l1, l2;
+	const char *s1 = lua_tolstring(L, 1, &l1);
+	const char *s2 = lua_tolstring(L, 2, &l2);
+	struct coll *coll = coll_by_id(coll_id);
+	assert(coll != NULL);
+	lua_pushinteger(L, coll->cmp(s1, l1, s2, l2, coll));
+	return 1;
+}
+
+/**
+ * Compare two UTF8 strings.
+ * @param s1 First string.
+ * @param s1 Second string.
+ *
+ * @retval <0 s1 < s2.
+ * @retval >0 s1 > s2.
+ * @retval =0 s1 = s2.
+ */
+static int
+utf8_cmp(struct lua_State *L)
+{
+	return utf8_cmp_impl(L, "Usage: utf8.cmp(<string1>, <string2>)",
+			     COLLATION_ID_UNICODE);
+}
+
+/**
+ * Compare two UTF8 strings ignoring case.
+ * @param s1 First string.
+ * @param s1 Second string.
+ *
+ * @retval <0 s1 < s2.
+ * @retval >0 s1 > s2.
+ * @retval =0 s1 = s2.
+ */
+static int
+utf8_casecmp(struct lua_State *L)
+{
+	return utf8_cmp_impl(L, "Usage: utf8.casecmp(<string1>, <string2>)",
+			     COLLATION_ID_UNICODE_CI);
+}
+
+static const struct luaL_Reg utf8_lib[] = {
+	{"upper", utf8_upper},
+	{"lower", utf8_lower},
+	{"len", utf8_len},
+	{"next", utf8_next},
+	{"char", utf8_char},
+	{"sub", utf8_lua_sub},
+	{"islower", utf8_islower},
+	{"isupper", utf8_isupper},
+	{"isdigit", utf8_isdigit},
+	{"isalpha", utf8_isalpha},
+	{"cmp", utf8_cmp},
+	{"casecmp", utf8_casecmp},
+	{NULL, NULL}
+};
+
+void
+tarantool_lua_utf8_init(struct lua_State *L)
+{
+	UErrorCode err = U_ZERO_ERROR;
+	root_map = ucasemap_open("", 0, &err);
+	if (root_map == NULL) {
+		luaL_error(L, tt_sprintf("error in ICU ucasemap_open: %s",
+					 u_errorName(err)));
+	}
+	luaL_register(L, "utf8", utf8_lib);
+	lua_pop(L, 1);
+}
+
+void
+tarantool_lua_utf8_free()
+{
+	ucasemap_close(root_map);
+}
diff --git a/src/lua/utf8.h b/src/lua/utf8.h
new file mode 100644
index 000000000..567ad51f7
--- /dev/null
+++ b/src/lua/utf8.h
@@ -0,0 +1,42 @@
+#ifndef TARANTOOL_LUA_UTF8_H_INCLUDED
+#define TARANTOOL_LUA_UTF8_H_INCLUDED
+/*
+ * Copyright 2010-2018, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+struct lua_State;
+
+void
+tarantool_lua_utf8_init(struct lua_State *L);
+
+void
+tarantool_lua_utf8_free();
+
+#endif /* TARANTOOL_LUA_UTF8_H_INCLUDED */
diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
index 852a7923c..dc28b87e1 100755
--- a/test/app-tap/string.test.lua
+++ b/test/app-tap/string.test.lua
@@ -3,7 +3,7 @@
 local tap = require('tap')
 local test = tap.test("string extensions")
 
-test:plan(5)
+test:plan(6)
 
 test:test("split", function(test)
     test:plan(10)
@@ -128,4 +128,162 @@ test:test("strip", function(test)
     test:ok(err and err:match("%(string expected, got number%)"))
 end )
 
+test:test("unicode", function(test)
+    test:plan(100)
+    local str = 'хеЛлоу вОрЛд ё Ё я Я э Э ъ Ъ hElLo WorLd 1234 i I İ 勺#☢༺'
+    local upper_res = 'ХЕЛЛОУ ВОРЛД Ё Ё Я Я Э Э Ъ Ъ HELLO WORLD 1234 I I İ 勺#☢༺'
+    local lower_res = 'хеллоу ворлд ё ё я я э э ъ ъ hello world 1234 i i i̇ 勺#☢༺'
+    local s = utf8.upper(str)
+    test:is(s, upper_res, 'default locale upper')
+    s = utf8.lower(str)
+    test:is(s, lower_res, 'default locale lower')
+    test:is(utf8.upper(''), '', 'empty string upper')
+    test:is(utf8.lower(''), '', 'empty string lower')
+    local err
+    s, err = pcall(utf8.upper, true)
+    test:isnt(err:find('Usage'), nil, 'upper usage is checked')
+    s, err = pcall(utf8.lower, true)
+    test:isnt(err:find('Usage'), nil, 'lower usage is checked')
+
+    test:is(utf8.isupper('a'), false, 'isupper("a")')
+    test:is(utf8.isupper('A'), true, 'isupper("A")')
+    test:is(utf8.islower('a'), true, 'islower("a")')
+    test:is(utf8.islower('A'), false, 'islower("A")')
+    test:is(utf8.isalpha('a'), true, 'isalpha("a")')
+    test:is(utf8.isalpha('A'), true, 'isalpha("A")')
+    test:is(utf8.isalpha('aa'), false, 'isalpha("aa")')
+    test:is(utf8.isalpha('勺'), true, 'isalpha("勺")')
+    test:is(utf8.isupper('Ё'), true, 'isupper("Ё")')
+    test:is(utf8.islower('ё'), true, 'islower("ё")')
+    test:is(utf8.isdigit('a'), false, 'isdigit("a")')
+    test:is(utf8.isdigit('1'), true, 'isdigit("1")')
+    test:is(utf8.isdigit('9'), true, 'isdigit("9")')
+
+    test:is(utf8.len(str), 56, 'len works on complex string')
+    s = '12İ☢勺34'
+    test:is(utf8.len(s), 7, 'len works no options')
+    test:is(utf8.len(s, 1), 7, 'default start is 1')
+    test:is(utf8.len(s, 2), 6, 'start 2')
+    test:is(utf8.len(s, 3), 5, 'start 3')
+    local c
+    c, err = utf8.len(s, 4)
+    test:isnil(c, 'middle of symbol offset is error')
+    test:is(err, 4, 'error on 4 byte')
+    test:is(utf8.len(s, 5), 4, 'start 5')
+    c, err = utf8.len(s, 6)
+    test:is(err, 6, 'error on 6 byte')
+    c, err = utf8.len(s, 0)
+    test:is(err, 'position is out of string', 'range is out of string')
+    test:is(utf8.len(s, #s), 1, 'start from the end')
+    test:is(utf8.len(s, #s + 1), 0, 'position is out of string')
+    test:is(utf8.len(s, 1, -1), 7, 'default end is -1')
+    test:is(utf8.len(s, 1, -2), 6, 'end -2')
+    test:is(utf8.len(s, 1, -3), 5, 'end -3')
+    test:is(utf8.len(s, 1, -4), 5, 'end in the middle of symbol')
+    test:is(utf8.len(s, 1, -5), 5, 'end in the middle of symbol')
+    test:is(utf8.len(s, 1, -6), 5, 'end in the middle of symbol')
+    test:is(utf8.len(s, 1, -7), 4, 'end -7')
+    test:is(utf8.len(s, 2, -7), 3, '[2, -7]')
+    test:is(utf8.len(s, 3, -7), 2, '[3, -7]')
+    c, err = utf8.len(s, 4, -7)
+    test:is(err, 4, '[4, -7] is error - start from the middle of symbol')
+    test:is(utf8.len(s, 10, -100), 0, 'it is ok to be out of str by end pos')
+    test:is(utf8.len(s, 10, -10), 0, 'it is ok to swap end and start pos')
+    test:is(utf8.len(''), 0, 'empty len')
+    test:is(utf8.len(s, -6, -1), 3, 'pass both negative offsets')
+
+    local chars = {}
+    local codes = {}
+    for _, code in utf8.next, s do
+        table.insert(chars, utf8.char(code))
+        table.insert(codes, code)
+    end
+    test:is(table.concat(chars), s, "next and char works")
+    c, err = pcall(utf8.char, 'kek')
+    test:isnt(err:find('bad argument'), nil, 'char usage is checked')
+    c, err = pcall(utf8.next, true)
+    test:isnt(err:find('Usage'), nil, 'next usage is checked')
+    c, err = pcall(utf8.next, '1234', true)
+    test:isnt(err:find('bad argument'), nil, 'next usage is checked')
+    local offset
+    offset, c = utf8.next('')
+    test:isnil(offset, 'next on empty - nil offset')
+    test:isnil(c, 'next on empty - nil code')
+    offset, c = utf8.next('123', 100)
+    test:isnil(offset, 'out of string - nil offset')
+    test:isnil(c, 'out of string - nil code')
+    test:is(utf8.char(unpack(codes)), s, 'char with multiple values')
+
+    local uppers = 0
+    local lowers = 0
+    local digits = 0
+    local letters = 0
+    for _, code in utf8.next, str do
+        if utf8.isupper(code) then uppers = uppers + 1 end
+        if utf8.islower(code) then lowers = lowers + 1 end
+        if utf8.isalpha(code) then letters = letters + 1 end
+        if utf8.isdigit(code) then digits = digits + 1 end
+    end
+    test:is(uppers, 13, 'uppers by code')
+    test:is(lowers, 19, 'lowers by code')
+    test:is(letters, 33, 'letters by code')
+    test:is(digits, 4, 'digits by code')
+
+    s = '12345678'
+    test:is(utf8.sub(s, 1, 1), '1', 'sub [1]')
+    test:is(utf8.sub(s, 1, 2), '12', 'sub [1:2]')
+    test:is(utf8.sub(s, 2, 2), '2', 'sub [2:2]')
+    test:is(utf8.sub(s, 0, 2), '12', 'sub [0:2]')
+    test:is(utf8.sub(s, 3, 7), '34567', 'sub [3:7]')
+    test:is(utf8.sub(s, 7, 3), '', 'sub [7:3]')
+    test:is(utf8.sub(s, 3, 100), '345678', 'sub [3:100]')
+    test:is(utf8.sub(s, 100, 3), '', 'sub [100:3]')
+
+    test:is(utf8.sub(s, 5), '5678', 'sub [5:]')
+    test:is(utf8.sub(s, 1, -1), s, 'sub [1:-1]')
+    test:is(utf8.sub(s, 1, -2), '1234567', 'sub [1:-2]')
+    test:is(utf8.sub(s, 2, -2), '234567', 'sub [2:-2]')
+    test:is(utf8.sub(s, 3, -3), '3456', 'sub [3:-3]')
+    test:is(utf8.sub(s, 5, -4), '5', 'sub [5:-4]')
+    test:is(utf8.sub(s, 7, -7), '', 'sub[7:-7]')
+
+    test:is(utf8.sub(s, -2, -1), '78', 'sub [-2:-1]')
+    test:is(utf8.sub(s, -1, -1), '8', 'sub [-1:-1]')
+    test:is(utf8.sub(s, -4, -2), '567', 'sub [-4:-2]')
+    test:is(utf8.sub(s, -400, -2), '1234567', 'sub [-400:-2]')
+    test:is(utf8.sub(s, -3, -5), '', 'sub [-3:-5]')
+
+    test:is(utf8.sub(s, -6, 5), '345', 'sub [-6:5]')
+    test:is(utf8.sub(s, -5, 4), '4', 'sub [-5:4]')
+    test:is(utf8.sub(s, -2, 2), '', 'sub [-2:2]')
+    test:is(utf8.sub(s, -1, 8), '8', 'sub [-1:8]')
+
+    c, err = pcall(utf8.sub)
+    test:isnt(err:find('Usage'), nil, 'usage is checked')
+    c, err = pcall(utf8.sub, true)
+    test:isnt(err:find('Usage'), nil, 'usage is checked')
+    c, err = pcall(utf8.sub, '123')
+    test:isnt(err:find('Usage'), nil, 'usage is checked')
+    c, err = pcall(utf8.sub, '123', true)
+    test:isnt(err:find('bad argument'), nil, 'usage is checked')
+    c, err = pcall(utf8.sub, '123', 1, true)
+    test:isnt(err:find('bad argument'), nil, 'usage is checked')
+
+    local s1 = '☢'
+    local s2 = 'İ'
+    test:is(s1 < s2, false, 'test binary cmp')
+    test:is(utf8.cmp(s1, s2) < 0, true, 'test unicode <')
+    test:is(utf8.cmp(s1, s1) == 0, true, 'test unicode eq')
+    test:is(utf8.cmp(s2, s1) > 0, true, 'test unicode >')
+    test:is(utf8.casecmp('a', 'A') == 0, true, 'test icase ==')
+    test:is(utf8.casecmp('b', 'A') > 0, true, 'test icase >, first')
+    test:is(utf8.casecmp('B', 'a') > 0, true, 'test icase >, second >')
+    test:is(utf8.cmp('', '') == 0, true, 'test empty compare')
+    test:is(utf8.cmp('', 'a') < 0, true, 'test left empty compare')
+    test:is(utf8.cmp('a', '') > 0, true, 'test right empty compare')
+    test:is(utf8.casecmp('', '') == 0, true, 'test empty icompare')
+    test:is(utf8.casecmp('', 'a') < 0, true, 'test left empty icompare')
+    test:is(utf8.casecmp('a', '') > 0, true, 'test right empty icompare')
+end)
+
 os.exit(test:check() == true and 0 or -1)
diff --git a/test/box/ddl.result b/test/box/ddl.result
index 87b9581c6..03ebe0e58 100644
--- a/test/box/ddl.result
+++ b/test/box/ddl.result
@@ -500,6 +500,21 @@ box.space._collation.index.name:delete{'test'}
 - [3, 'test', 0, 'ICU', 'ru_RU', {}]
 ...
 --
+-- gh-3290: expose ICU into Lua. It uses built-in collations, that
+-- must work even if a collation is deleted from _collation.
+--
+t = box.space._collation:delete{1}
+---
+...
+utf8.cmp('abc', 'def')
+---
+- -1
+...
+box.space._collation:replace(t)
+---
+- [1, 'unicode', 1, 'ICU', '', {}]
+...
+--
 -- gh-2839: allow to store custom fields in field definition.
 --
 format = {}
diff --git a/test/box/ddl.test.lua b/test/box/ddl.test.lua
index a1502ae13..1bfd50b59 100644
--- a/test/box/ddl.test.lua
+++ b/test/box/ddl.test.lua
@@ -191,6 +191,14 @@ test_run:cmd('restart server default')
 box.space._collation:select{}
 box.space._collation.index.name:delete{'test'}
 
+--
+-- gh-3290: expose ICU into Lua. It uses built-in collations, that
+-- must work even if a collation is deleted from _collation.
+--
+t = box.space._collation:delete{1}
+utf8.cmp('abc', 'def')
+box.space._collation:replace(t)
+
 --
 -- gh-2839: allow to store custom fields in field definition.
 --
-- 
2.15.1 (Apple Git-101)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h Vladislav Shpilevoy
@ 2018-05-04 11:06   ` Alexander Turenko
  2018-05-04 12:05     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 21+ messages in thread
From: Alexander Turenko @ 2018-05-04 11:06 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Hi, Vlad!

One tiny comment here.

WBR, Alexander Turenko.

On Sun, Apr 29, 2018 at 01:45:10AM +0300, Vladislav Shpilevoy wrote:
> The issue #3290 was not only about upper/lower Lua functions, but
> about unicode comparison functions too. Actually, the issue
> requests upper/lower exactly to do string comparison, that can be
> done more quick with no garbage strings creation. For this
> Tarantool collations can be used.
> 
> To be able to expose collations into Lua, the coll.h/.c,
> coll_def.h/.c and coll_cache.h/.c must be moved from 'box' static
> library into 'core' static library so that they will be built
> together with string utils. But they require 'stat' and
> 'box_error' libraries. The patch prepares the files going to be
> moved, so in the next patch they are just moved, with no changes.
> It saves commit history.

'together with string utils' -- no more relevant to the patchset part
above this commit?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 3/5] Merge box_error, stat and collations into core library
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library Vladislav Shpilevoy
@ 2018-05-04 11:36   ` Alexander Turenko
  2018-05-04 12:05     ` Vladislav Shpilevoy
  2018-05-08 13:18   ` Konstantin Osipov
  1 sibling, 1 reply; 21+ messages in thread
From: Alexander Turenko @ 2018-05-04 11:36 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

One comment is below.

WBR, Alexander Turenko.

On Sun, Apr 29, 2018 at 01:45:11AM +0300, Vladislav Shpilevoy wrote:
> The goal is to expose collations into Lua with no dependencies on
> box library. But collations merge into core requires box_error
> and stat libraries too.

> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index be13812aa..b60c6c397 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -30,7 +30,7 @@
>   */
>  #include "box/lua/call.h"
>  #include "box/call.h"
> -#include "box/error.h"
> +#include <error.h>
>  #include "fiber.h"
>  
>  #include "lua/utils.h"
> diff --git a/src/box/lua/error.cc b/src/box/lua/error.cc
> index 314907421..960ea2aa9 100644
> --- a/src/box/lua/error.cc
> +++ b/src/box/lua/error.cc
> @@ -40,7 +40,7 @@ extern "C" {
>  #include <errinj.h>
>  
>  #include "lua/utils.h"
> -#include "box/error.h"
> +#include "src/error.h"
>  
>  static int
>  luaT_error_raise(lua_State *L)

Don't know what policy is in use in Tarantool. The diff above shows two
different approaches to include the same header within the one commit.

It seems that we don't use relative includes (with ..) and prefer either
<> or "src/..." w/o any particular system. Right?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h
  2018-05-04 11:06   ` [tarantool-patches] " Alexander Turenko
@ 2018-05-04 12:05     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-04 12:05 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Hello. Thanks for review!

On 04/05/2018 14:06, Alexander Turenko wrote:
> Hi, Vlad!
> 
> One tiny comment here.
> 
> WBR, Alexander Turenko.
> 
> On Sun, Apr 29, 2018 at 01:45:10AM +0300, Vladislav Shpilevoy wrote:
>> The issue #3290 was not only about upper/lower Lua functions, but
>> about unicode comparison functions too. Actually, the issue
>> requests upper/lower exactly to do string comparison, that can be
>> done more quick with no garbage strings creation. For this
>> Tarantool collations can be used.
>>
>> To be able to expose collations into Lua, the coll.h/.c,
>> coll_def.h/.c and coll_cache.h/.c must be moved from 'box' static
>> library into 'core' static library so that they will be built
>> together with string utils. But they require 'stat' and
>> 'box_error' libraries. The patch prepares the files going to be
>> moved, so in the next patch they are just moved, with no changes.
>> It saves commit history.
> 
> 'together with string utils' -- no more relevant to the patchset part
> above this commit?
> 

Fixed.

"together with common utils"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 3/5] Merge box_error, stat and collations into core library
  2018-05-04 11:36   ` [tarantool-patches] " Alexander Turenko
@ 2018-05-04 12:05     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-04 12:05 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Hello. Thanks for review!

On 04/05/2018 14:36, Alexander Turenko wrote:
> Vlad,
> 
> One comment is below.
> 
> WBR, Alexander Turenko.
> 
> On Sun, Apr 29, 2018 at 01:45:11AM +0300, Vladislav Shpilevoy wrote:
>> The goal is to expose collations into Lua with no dependencies on
>> box library. But collations merge into core requires box_error
>> and stat libraries too.
> 
>> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
>> index be13812aa..b60c6c397 100644
>> --- a/src/box/lua/call.c
>> +++ b/src/box/lua/call.c
>> @@ -30,7 +30,7 @@
>>    */
>>   #include "box/lua/call.h"
>>   #include "box/call.h"
>> -#include "box/error.h"
>> +#include <error.h>
>>   #include "fiber.h"
>>   
>>   #include "lua/utils.h"
>> diff --git a/src/box/lua/error.cc b/src/box/lua/error.cc
>> index 314907421..960ea2aa9 100644
>> --- a/src/box/lua/error.cc
>> +++ b/src/box/lua/error.cc
>> @@ -40,7 +40,7 @@ extern "C" {
>>   #include <errinj.h>
>>   
>>   #include "lua/utils.h"
>> -#include "box/error.h"
>> +#include "src/error.h"
>>   
>>   static int
>>   luaT_error_raise(lua_State *L)
> 
> Don't know what policy is in use in Tarantool. The diff above shows two
> different approaches to include the same header within the one commit.
> 
> It seems that we don't use relative includes (with ..) and prefer either
> <> or "src/..." w/o any particular system. Right?

Right. We do not use relative ways. But "src/error.h" looks ugly, I replaced it
with <error.h>:
diff --git a/src/box/lua/error.cc b/src/box/lua/error.cc
index 960ea2aa9..bf9e5bac4 100644
--- a/src/box/lua/error.cc
+++ b/src/box/lua/error.cc
@@ -38,9 +38,9 @@ extern "C" {
  
  #include <fiber.h>
  #include <errinj.h>
+#include <error.h>
  
  #include "lua/utils.h"
-#include "src/error.h"
  
  static int
  luaT_error_raise(lua_State *L)

For more examples about <> and "" difference you can look at the same file src/box/lua/error.cc:

#include <fiber.h>
#include <errinj.h>

these files are in <> in lua/ dir, but in "" in box/ dir.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
@ 2018-05-04 22:33   ` Alexander Turenko
  2018-05-04 23:32     ` Vladislav Shpilevoy
  2018-05-08 13:21   ` Konstantin Osipov
  1 sibling, 1 reply; 21+ messages in thread
From: Alexander Turenko @ 2018-05-04 22:33 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Are you try to run tests from utf8.lua from [1]?

[1]: https://www.lua.org/tests/lua-5.3.4-tests.tar.gz

Debug build:

```
/home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c: In function ‘utf8_next’:
/home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c:214:2: error: passing argument 2 of ‘utf8_nextCharSafeBody’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  U8_NEXT(str, pos, len, c);
  ^~~~~~~
In file included from /usr/include/unicode/utf.h:217:0,
                 from /usr/include/unicode/utypes.h:44,
                 from /usr/include/unicode/ucasemap.h:24,
                 from /home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c:31:
/usr/include/unicode/utf8.h:120:1: note: expected ‘int32_t * {aka int *}’ but argument is of type ‘size_t * {aka long unsigned int *}’
 utf8_nextCharSafeBody(const uint8_t *s, int32_t *pi, int32_t length, UChar32 c, UBool strict);
 ^~~~~~~~~~~~~~~~~~~~~
```

The following cases works differently in Lua 5.3 and in tarantool's
implementation:

print(utf8.len('12İ☢勺34', 3, 3))
print(utf8.len('\xF4\x9F\xBF\xBF'))

Lua 5.3:

1
nil	1

Tarantool:

nil     3
nil     4

See also comments below.

Thanks!

WBR, Alexander Turenko.

On Sun, Apr 29, 2018 at 01:45:13AM +0300, Vladislav Shpilevoy wrote:
> utf8 is a module partially compatible with Lua 5.3 utf8 and
> lua-utf8 third party module.
> Partially means, that not all functions are implemented.
> 
> The patch introduces these ones:
> upper, lower, len, char, sub.

And next.

> 
> All of them works exactly like in Lua 5.3, or if Lua 5.3 has no
> a function, then works like in lua-utf8.
> 

Only len and char are from lua 5.3, we can state it explicitly.

> Tarantool utf8 has extensions:
> 
> * isupper/lower/alpha/digit, that check some property by a symbol
>   or by its code;
> 
> * cmp/casecmp, that compare two UTF8 strings.
> 
> Closes #3290
> Closes #3385
> Closes #3081
> ---
>  src/CMakeLists.txt           |   3 +-
>  src/lua/init.c               |   3 +
>  src/lua/utf8.c               | 451 +++++++++++++++++++++++++++++++++++++++++++
>  src/lua/utf8.h               |  42 ++++
>  test/app-tap/string.test.lua | 160 ++++++++++++++-
>  test/box/ddl.result          |  15 ++
>  test/box/ddl.test.lua        |   8 +
>  7 files changed, 680 insertions(+), 2 deletions(-)
>  create mode 100644 src/lua/utf8.c
>  create mode 100644 src/lua/utf8.h
> <...>
> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
> new file mode 100644
> index 000000000..d1d3d72c2
> --- /dev/null
> +++ b/src/lua/utf8.c
> <...>
> +/**
> + * Calculate a 0-based positive byte offset in a string by any
> + * 0-based offset (possibly negative).

But you pass 1-based offsets from utf8_len function to that one.

> + * @param offset Original 0-based offset with any sign.
> + * @param len A string byte length.
> + * @retval 0-based positive offset.
> + */
> +static inline int
> +utf8_convert_offset(int offset, size_t len)
> +{
> +	if (offset >= 0)
> +		return offset;
> +	else if ((size_t)-offset > len)
> +		return 0;
> +	return len + offset + 1;

It seems that the function is about 1-based offsets (at some cases at
least).

> +}
> +
> +/**
> + * Calculate length of a UTF8 string. Length here is symbol count.
> + * Works like utf8.len in Lua 5.3.
> + * @param String to get length.
> + * @param Start byte offset. Must point to the start of symbol. On
> + *        invalid symbol an error is returned. Can be negative.

Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
offset equilibristics is not very intuitive (at least for me).

> + * @param End byte offset, can be negative. Can point to the
> + *        middle of symbol.

We need to clarify that a symbol under the end offset is subject to
include into the resulting count (inclusive range).

I would also explicitly stated that -1 is the end byte.

Worth to document allowed range (0 <= |end| <= #str, right?)?

> + * @retval not nil Symbol count.
> + * @retval nil, error Error. Byte position of the error is
> + *         returned in the second value.
> + */
> +static int
> +utf8_len(struct lua_State *L)
> +{
> +	if (lua_gettop(L) > 3 || !lua_isstring(L, 1))
> +		return luaL_error(L, "Usage: utf8.len(<string>, [i, [j]])");
> +	size_t slen;
> +	const char *str = lua_tolstring(L, 1, &slen);
> +	int len = (int)slen;
> +	int start_pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
> +	int end_pos = utf8_convert_offset(luaL_optinteger(L, 3, -1), len);
> +	if (start_pos < 1 || --start_pos > len || end_pos > len) {
> +		lua_pushnil(L);
> +		lua_pushstring(L, "position is out of string");
> +		return 2;
> +	}
> +	int result = 0;
> +	if (end_pos > start_pos) {
> +		UChar32 c;
> +		++result;
> +		/*
> +		 * In Lua 5.3 it is not ok to start from bad
> +		 * position. But next symbols can be any.
> +		 */

I would be not so sure. Consider:

print(utf8.len('a\xF4'))

Lua 5.3:

nil	2

Tarantool:

2

> +		U8_NEXT(str, start_pos, end_pos, c);
> +		if (c == U_SENTINEL) {
> +			lua_pushnil(L);
> +			lua_pushinteger(L, start_pos);
> +			return 2;
> +		}
> +		while (start_pos < end_pos) {
> +			U8_NEXT_UNSAFE(str, start_pos, c);

It is controversial decision. Why you are sure Lua 5.3 works in the same
manner?

> +			++result;
> +		}
> +	}
> +	lua_pushinteger(L, result);
> +	return 1;
> +}
> +
> +/**
> + * Get next symbol code by @an offset.
> + * @param String to get symbol code.
> + * @param Byte offset from which get.
> + *
> + * @retval - No more symbols.

Do you mean nil?

> + * @retval not nil, not nil Byte offset and symbol code.
> + */
> +static int
> +utf8_next(struct lua_State *L)
> <...>
> +UCHAR32_CHECKER(islower)
> +UCHAR32_CHECKER(isupper)
> +UCHAR32_CHECKER(isdigit)
> +UCHAR32_CHECKER(isalpha)

It would be good to have doxygen-style comment in the source code
likewise you posted in #3081.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-05-04 22:33   ` [tarantool-patches] " Alexander Turenko
@ 2018-05-04 23:32     ` Vladislav Shpilevoy
  2018-05-05  0:18       ` Alexander Turenko
  0 siblings, 1 reply; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-04 23:32 UTC (permalink / raw)
  To: tarantool-patches, Alexander Turenko

Hello. Thanks for review.

On 05/05/2018 01:33, Alexander Turenko wrote:
> Vlad,
> 
> Are you try to run tests from utf8.lua from [1]?
> 
> [1]: https://www.lua.org/tests/lua-5.3.4-tests.tar.gz
> 
> Debug build:
> 
> ```
> /home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c: In function ‘utf8_next’:
> /home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c:214:2: error: passing argument 2 of ‘utf8_nextCharSafeBody’ from incompatible pointer type [-Werror=incompatible-pointer-types]
>    U8_NEXT(str, pos, len, c);

Looks like you did not fetch the newest version. Travis is ok:
https://travis-ci.org/tarantool/tarantool/builds/374952902?utm_source=github_status&utm_medium=notification

>    ^~~~~~~
> In file included from /usr/include/unicode/utf.h:217:0,
>                   from /usr/include/unicode/utypes.h:44,
>                   from /usr/include/unicode/ucasemap.h:24,
>                   from /home/alex/p/tarantool-meta/review/tarantool/src/lua/utf8.c:31:
> /usr/include/unicode/utf8.h:120:1: note: expected ‘int32_t * {aka int *}’ but argument is of type ‘size_t * {aka long unsigned int *}’
>   utf8_nextCharSafeBody(const uint8_t *s, int32_t *pi, int32_t length, UChar32 c, UBool strict);
>   ^~~~~~~~~~~~~~~~~~~~~
> ```
> 
> The following cases works differently in Lua 5.3 and in tarantool's
> implementation:
> 
> print(utf8.len('12İ☢勺34', 3, 3))
> print(utf8.len('\xF4\x9F\xBF\xBF'))
> 
> Lua 5.3:
> 
> 1
> nil	1
> 
> Tarantool:
> 
> nil     3
> nil     4

Second case works ok already:

tarantool> utf8.len('\xF4\x9F\xBF\xBF')
---
- null
- 1
...

It matches Lua 5.3.

First case fixed:

diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 1d0776ab1..02ecdede4 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -176,7 +176,7 @@ utf8_len(struct lua_State *L)
                  * In Lua 5.3 it is not ok to start from bad
                  * position. But next symbols can be any.
                  */
-               U8_NEXT(str, start_pos, end_pos, c);
+               U8_NEXT(str, start_pos, len, c);
                 if (c == U_SENTINEL) {
                         lua_pushnil(L);
                         lua_pushinteger(L, start_pos);
diff --git a/test/app-tap/string.test.lua b/test/app-tap/string.test.lua
index dc28b87e1..fb96104ed 100755
--- a/test/app-tap/string.test.lua
+++ b/test/app-tap/string.test.lua
@@ -129,7 +129,7 @@ test:test("strip", function(test)
  end )
  
  test:test("unicode", function(test)
-    test:plan(100)
+    test:plan(101)
      local str = 'хеЛлоу вОрЛд ё Ё я Я э Э ъ Ъ hElLo WorLd 1234 i I İ 勺#☢༺'
      local upper_res = 'ХЕЛЛОУ ВОРЛД Ё Ё Я Я Э Э Ъ Ъ HELLO WORLD 1234 I I İ 勺#☢༺'
      local lower_res = 'хеллоу ворлд ё ё я я э э ъ ъ hello world 1234 i i i̇ 勺#☢༺'
@@ -191,6 +191,7 @@ test:test("unicode", function(test)
      test:is(utf8.len(s, 10, -10), 0, 'it is ok to swap end and start pos')
      test:is(utf8.len(''), 0, 'empty len')
      test:is(utf8.len(s, -6, -1), 3, 'pass both negative offsets')
+    test:is(utf8.len(s, 3, 3), 1, "end in the middle on the same symbol as start")
  
      local chars = {}
      local codes = {}

> 
> See also comments below.
> 
> Thanks!
> 
> WBR, Alexander Turenko.
> 
> On Sun, Apr 29, 2018 at 01:45:13AM +0300, Vladislav Shpilevoy wrote:
>> utf8 is a module partially compatible with Lua 5.3 utf8 and
>> lua-utf8 third party module.
>> Partially means, that not all functions are implemented.
>>
>> The patch introduces these ones:
>> upper, lower, len, char, sub.
> 
> And next.
> 
>>
>> All of them works exactly like in Lua 5.3, or if Lua 5.3 has no
>> a function, then works like in lua-utf8.
>>
> 
> Only len and char are from lua 5.3, we can state it explicitly.

Fixed.

     lua: introduce utf8 built-in globaly visible module
     
     utf8 is a module partially compatible with Lua 5.3 utf8 and
     lua-utf8 third party module.
     Partially means, that not all functions are implemented.
     
     The patch introduces these ones:
     upper, lower, len, char, sub, next.
     
     Len and char works exactly like in Lua 5.3. Other functions work
     like in lua-utf8.
     
     Tarantool utf8 has extensions:
     
     * isupper/lower/alpha/digit, that check some property by a symbol
       or by its code;
     
     * cmp/casecmp, that compare two UTF8 strings.
     
     Closes #3290
     Closes #3385
     Closes #3081
>> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
>> new file mode 100644
>> index 000000000..d1d3d72c2
>> --- /dev/null
>> +++ b/src/lua/utf8.c
>> <...>
>> +/**
>> + * Calculate a 0-based positive byte offset in a string by any
>> + * 0-based offset (possibly negative).
> 
> But you pass 1-based offsets from utf8_len function to that one.

diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 0f4a84c98..d4f57a5f4 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -124,11 +124,11 @@ utf8_lower(struct lua_State *L)
  }
  
  /**
- * Calculate a 0-based positive byte offset in a string by any
- * 0-based offset (possibly negative).
- * @param offset Original 0-based offset with any sign.
+ * Calculate a 1-based positive byte offset in a string by any
+ * 1-based offset (possibly negative).
+ * @param offset Original 1-based offset with any sign.
   * @param len A string byte length.
- * @retval 0-based positive offset.
+ * @retval 1-based positive offset.
   */
  static inline int
  utf8_convert_offset(int offset, size_t len)
@@ -207,14 +207,16 @@ utf8_next(struct lua_State *L)
         size_t slen;
         const char *str = lua_tolstring(L, 1, &slen);
         int len = (int) slen;
-       int pos = utf8_convert_offset(luaL_optinteger(L, 2, 0), len);
+       int pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
+       if (pos > 0)
+               --pos;
         if (pos >= len)
                 return 0;
         UChar32 c;
         U8_NEXT(str, pos, len, c);
         if (c == U_SENTINEL)
                 return 0;
-       lua_pushinteger(L, pos);
+       lua_pushinteger(L, pos + 1);
         lua_pushinteger(L, c);
         return 2;
  }
>> +
>> +/**
>> + * Calculate length of a UTF8 string. Length here is symbol count.
>> + * Works like utf8.len in Lua 5.3.
>> + * @param String to get length.
>> + * @param Start byte offset. Must point to the start of symbol. On
>> + *        invalid symbol an error is returned. Can be negative.
> 
> Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
> offset equilibristics is not very intuitive (at least for me).

No, start can be any, as well as end.

> 
>> + * @param End byte offset, can be negative. Can point to the
>> + *        middle of symbol.
> 
> We need to clarify that a symbol under the end offset is subject to
> include into the resulting count (inclusive range).
> 
> I would also explicitly stated that -1 is the end byte.
> 
> Worth to document allowed range (0 <= |end| <= #str, right?)?

diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index d4f57a5f4..0507da134 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -142,12 +142,13 @@ utf8_convert_offset(int offset, size_t len)
  
  /**
   * Calculate length of a UTF8 string. Length here is symbol count.
- * Works like utf8.len in Lua 5.3.
+ * Works like utf8.len in Lua 5.3. Can take negative offsets. A
+ * negative offset is an offset from the end of string.
   * @param String to get length.
   * @param Start byte offset. Must point to the start of symbol. On
- *        invalid symbol an error is returned. Can be negative.
- * @param End byte offset, can be negative. Can point to the
- *        middle of symbol.
+ *        invalid symbol an error is returned.
+ * @param End byte offset. Can point to the middle of symbol.
+ *        Partial symbol is counted too.
   * @retval not nil Symbol count.
   * @retval nil, error Error. Byte position of the error is
   *         returned in the second value.

> 
>> + * @retval not nil Symbol count.
>> + * @retval nil, error Error. Byte position of the error is
>> + *         returned in the second value.
>> + */
>> +static int
>> +utf8_len(struct lua_State *L)
>> +{
>> +	if (lua_gettop(L) > 3 || !lua_isstring(L, 1))
>> +		return luaL_error(L, "Usage: utf8.len(<string>, [i, [j]])");
>> +	size_t slen;
>> +	const char *str = lua_tolstring(L, 1, &slen);
>> +	int len = (int)slen;
>> +	int start_pos = utf8_convert_offset(luaL_optinteger(L, 2, 1), len);
>> +	int end_pos = utf8_convert_offset(luaL_optinteger(L, 3, -1), len);
>> +	if (start_pos < 1 || --start_pos > len || end_pos > len) {
>> +		lua_pushnil(L);
>> +		lua_pushstring(L, "position is out of string");
>> +		return 2;
>> +	}
>> +	int result = 0;
>> +	if (end_pos > start_pos) {
>> +		UChar32 c;
>> +		++result;
>> +		/*
>> +		 * In Lua 5.3 it is not ok to start from bad
>> +		 * position. But next symbols can be any.
>> +		 */
> 
> I would be not so sure. Consider:
> 
> print(utf8.len('a\xF4'))
> 
> Lua 5.3:
> 
> nil	2
> 
> Tarantool:
> 
> 2

diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 1d0776ab1..c31171b2c 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -171,20 +171,14 @@ utf8_len(struct lua_State *L)
         int result = 0;
         if (end_pos > start_pos) {
                 UChar32 c;
-               ++result;
-               /*
-                * In Lua 5.3 it is not ok to start from bad
-                * position. But next symbols can be any.
-                */
-               U8_NEXT(str, start_pos, end_pos, c);
-               if (c == U_SENTINEL) {
-                       lua_pushnil(L);
-                       lua_pushinteger(L, start_pos);
-                       return 2;
-               }
                 while (start_pos < end_pos) {
-                       U8_NEXT_UNSAFE(str, start_pos, c);
                         ++result;
+                       U8_NEXT(str, start_pos, len, c);
+                       if (c == U_SENTINEL) {
+                               lua_pushnil(L);
+                               lua_pushinteger(L, start_pos);
+                               return 2;
+                       }
                 }
         }
         lua_pushinteger(L, result);
>> +/**
>> + * Get next symbol code by @an offset.
>> + * @param String to get symbol code.
>> + * @param Byte offset from which get.
>> + *
>> + * @retval - No more symbols.
> 
> Do you mean nil?

No. It returns literally nothing.

> 
>> + * @retval not nil, not nil Byte offset and symbol code.
>> + */
>> +static int
>> +utf8_next(struct lua_State *L)
>> <...>
>> +UCHAR32_CHECKER(islower)
>> +UCHAR32_CHECKER(isupper)
>> +UCHAR32_CHECKER(isdigit)
>> +UCHAR32_CHECKER(isalpha)
> 
> It would be good to have doxygen-style comment in the source code
> likewise you posted in #3081.
> 

-/** Macro to easy create lua wrappers for ICU symbol checkers. */
+/**
+ * Macro to easy create lua wrappers for ICU symbol checkers.
+ * @param One stmbol code or string.
+ * @retval True, if the symbol has a requested property. Else
+ *         false.
+ */
  #define UCHAR32_CHECKER(name) \

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-05-04 23:32     ` Vladislav Shpilevoy
@ 2018-05-05  0:18       ` Alexander Turenko
  2018-05-05  0:24         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 21+ messages in thread
From: Alexander Turenko @ 2018-05-05  0:18 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

Thanks for the fixes. You are rock!

I want to clarify two things, please see below.

WBR, Alexander Turenko.

On Sat, May 05, 2018 at 02:32:27AM +0300, Vladislav Shpilevoy wrote:
> Hello. Thanks for review.
> 
> On 05/05/2018 01:33, Alexander Turenko wrote:
> > Vlad,
> > 
> > Are you try to run tests from utf8.lua from [1]?
> > 
> > [1]: https://www.lua.org/tests/lua-5.3.4-tests.tar.gz
> > 

Are you think such testing would be redundant? I don't insist, just want
to know explicit position.

> > > +
> > > +/**
> > > + * Calculate length of a UTF8 string. Length here is symbol count.
> > > + * Works like utf8.len in Lua 5.3.
> > > + * @param String to get length.
> > > + * @param Start byte offset. Must point to the start of symbol. On
> > > + *        invalid symbol an error is returned. Can be negative.
> > 
> > Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
> > offset equilibristics is not very intuitive (at least for me).
> 
> No, start can be any, as well as end.
> 

It does not look like so:

tarantool> print(utf8.len('abc', 0))
nil     position is out of string

tarantool> print(utf8.len('abc', 5))
nil     position is out of string

tarantool> print(utf8.len('abc', 1, 4))
nil     position is out of string

That matches lua 5.3 behaviour, but contradicts with your words above.
So the question is about proper doxygen-style comment.

> > 
> > > + * @param End byte offset, can be negative. Can point to the
> > > + *        middle of symbol.
> > 
> > We need to clarify that a symbol under the end offset is subject to
> > include into the resulting count (inclusive range).
> > 
> > I would also explicitly stated that -1 is the end byte.
> > 
> > Worth to document allowed range (0 <= |end| <= #str, right?)?
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 0/5] Expose ICU into Lua
  2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
                   ` (4 preceding siblings ...)
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
@ 2018-05-05  0:19 ` Alexander Turenko
  5 siblings, 0 replies; 21+ messages in thread
From: Alexander Turenko @ 2018-05-05  0:19 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

Vlad,

I cannot say more than I did already about your patchset. Consider this
as LGTM. Please, proceed with a next reviewer.

WBR, Alexander Turenko.

On Sun, Apr 29, 2018 at 01:45:08AM +0300, Vladislav Shpilevoy wrote:
> Branch: http://github.com/tarantool/tarantool/tree/gh-3290-lua-icu
> Issue: https://github.com/tarantool/tarantool/issues/3290
> Issue: https://github.com/tarantool/tarantool/issues/3385
> Issue: https://github.com/tarantool/tarantool/issues/3081

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-05-05  0:18       ` Alexander Turenko
@ 2018-05-05  0:24         ` Vladislav Shpilevoy
  2018-05-05  0:38           ` Alexander Turenko
  0 siblings, 1 reply; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-05  0:24 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches



On 05/05/2018 03:18, Alexander Turenko wrote:
> Vlad,
> 
> Thanks for the fixes. You are rock!
> 
> I want to clarify two things, please see below.
> 
> WBR, Alexander Turenko.
> 
> On Sat, May 05, 2018 at 02:32:27AM +0300, Vladislav Shpilevoy wrote:
>> Hello. Thanks for review.
>>
>> On 05/05/2018 01:33, Alexander Turenko wrote:
>>> Vlad,
>>>
>>> Are you try to run tests from utf8.lua from [1]?
>>>
>>> [1]: https://www.lua.org/tests/lua-5.3.4-tests.tar.gz
>>>
> 
> Are you think such testing would be redundant? I don't insist, just want
> to know explicit position.

I think, our existing tests are enough. There is no necessity to duplicate checks.

> 
>>>> +
>>>> +/**
>>>> + * Calculate length of a UTF8 string. Length here is symbol count.
>>>> + * Works like utf8.len in Lua 5.3.
>>>> + * @param String to get length.
>>>> + * @param Start byte offset. Must point to the start of symbol. On
>>>> + *        invalid symbol an error is returned. Can be negative.
>>>
>>> Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
>>> offset equilibristics is not very intuitive (at least for me).
>>
>> No, start can be any, as well as end.

diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index c31171b2c..c84e6ff72 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -144,6 +144,7 @@ utf8_convert_offset(int offset, size_t len)
   * Calculate length of a UTF8 string. Length here is symbol count.
   * Works like utf8.len in Lua 5.3. Can take negative offsets. A
   * negative offset is an offset from the end of string.
+ * Positive position must be inside [1, #str + 1].
   * @param String to get length.
   * @param Start byte offset. Must point to the start of symbol. On
   *        invalid symbol an error is returned.
v:tarantool v.shpilevoy$ git diff
diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index c31171b2c..8f0ca65e5 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -144,14 +144,17 @@ utf8_convert_offset(int offset, size_t len)
   * Calculate length of a UTF8 string. Length here is symbol count.
   * Works like utf8.len in Lua 5.3. Can take negative offsets. A
   * negative offset is an offset from the end of string.
+ * Positive position must be inside [1, #str + 1].
   * @param String to get length.
   * @param Start byte offset. Must point to the start of symbol. On
   *        invalid symbol an error is returned.
   * @param End byte offset. Can point to the middle of symbol.
   *        Partial symbol is counted too.
   * @retval not nil Symbol count.
- * @retval nil, error Error. Byte position of the error is
+ * @retval nil, number Error. Byte position of the error is
   *         returned in the second value.
+ * @retval nil, string Error. Reason is returned in the second
+ *         value.
   */
  static int
  utf8_len(struct lua_State *L)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-05-05  0:24         ` Vladislav Shpilevoy
@ 2018-05-05  0:38           ` Alexander Turenko
  2018-05-05  0:45             ` Vladislav Shpilevoy
  0 siblings, 1 reply; 21+ messages in thread
From: Alexander Turenko @ 2018-05-05  0:38 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches

On Sat, May 05, 2018 at 03:24:41AM +0300, Vladislav Shpilevoy wrote:
> 
> 
> > 
> > > > > +
> > > > > +/**
> > > > > + * Calculate length of a UTF8 string. Length here is symbol count.
> > > > > + * Works like utf8.len in Lua 5.3.
> > > > > + * @param String to get length.
> > > > > + * @param Start byte offset. Must point to the start of symbol. On
> > > > > + *        invalid symbol an error is returned. Can be negative.
> > > > 
> > > > Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
> > > > offset equilibristics is not very intuitive (at least for me).
> > > 
> > > No, start can be any, as well as end.
> 
> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
> index c31171b2c..c84e6ff72 100644
> --- a/src/lua/utf8.c
> +++ b/src/lua/utf8.c
> @@ -144,6 +144,7 @@ utf8_convert_offset(int offset, size_t len)
>   * Calculate length of a UTF8 string. Length here is symbol count.
>   * Works like utf8.len in Lua 5.3. Can take negative offsets. A
>   * negative offset is an offset from the end of string.
> + * Positive position must be inside [1, #str + 1].
>   * @param String to get length.
>   * @param Start byte offset. Must point to the start of symbol. On
>   *        invalid symbol an error is returned.
> v:tarantool v.shpilevoy$ git diff
> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
> index c31171b2c..8f0ca65e5 100644
> --- a/src/lua/utf8.c
> +++ b/src/lua/utf8.c
> @@ -144,14 +144,17 @@ utf8_convert_offset(int offset, size_t len)
>   * Calculate length of a UTF8 string. Length here is symbol count.
>   * Works like utf8.len in Lua 5.3. Can take negative offsets. A
>   * negative offset is an offset from the end of string.
> + * Positive position must be inside [1, #str + 1].
>   * @param String to get length.
>   * @param Start byte offset. Must point to the start of symbol. On
>   *        invalid symbol an error is returned.
>   * @param End byte offset. Can point to the middle of symbol.
>   *        Partial symbol is counted too.
>   * @retval not nil Symbol count.
> - * @retval nil, error Error. Byte position of the error is
> + * @retval nil, number Error. Byte position of the error is
>   *         returned in the second value.
> + * @retval nil, string Error. Reason is returned in the second
> + *         value.
>   */
>  static int
>  utf8_len(struct lua_State *L)
> 
> 

[0, #str] for the end position.

Excuse me for nitpicking.

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-05-05  0:38           ` Alexander Turenko
@ 2018-05-05  0:45             ` Vladislav Shpilevoy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-05  0:45 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches



On 05/05/2018 03:38, Alexander Turenko wrote:
> On Sat, May 05, 2018 at 03:24:41AM +0300, Vladislav Shpilevoy wrote:
>>
>>
>>>
>>>>>> +
>>>>>> +/**
>>>>>> + * Calculate length of a UTF8 string. Length here is symbol count.
>>>>>> + * Works like utf8.len in Lua 5.3.
>>>>>> + * @param String to get length.
>>>>>> + * @param Start byte offset. Must point to the start of symbol. On
>>>>>> + *        invalid symbol an error is returned. Can be negative.
>>>>>
>>>>> Can be 1 <= |start| <= #str + 1, right? Is it worth to document? Such
>>>>> offset equilibristics is not very intuitive (at least for me).
>>>>
>>>> No, start can be any, as well as end.
>>
>> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
>> index c31171b2c..c84e6ff72 100644
>> --- a/src/lua/utf8.c
>> +++ b/src/lua/utf8.c
>> @@ -144,6 +144,7 @@ utf8_convert_offset(int offset, size_t len)
>>    * Calculate length of a UTF8 string. Length here is symbol count.
>>    * Works like utf8.len in Lua 5.3. Can take negative offsets. A
>>    * negative offset is an offset from the end of string.
>> + * Positive position must be inside [1, #str + 1].
>>    * @param String to get length.
>>    * @param Start byte offset. Must point to the start of symbol. On
>>    *        invalid symbol an error is returned.
>> v:tarantool v.shpilevoy$ git diff
>> diff --git a/src/lua/utf8.c b/src/lua/utf8.c
>> index c31171b2c..8f0ca65e5 100644
>> --- a/src/lua/utf8.c
>> +++ b/src/lua/utf8.c
>> @@ -144,14 +144,17 @@ utf8_convert_offset(int offset, size_t len)
>>    * Calculate length of a UTF8 string. Length here is symbol count.
>>    * Works like utf8.len in Lua 5.3. Can take negative offsets. A
>>    * negative offset is an offset from the end of string.
>> + * Positive position must be inside [1, #str + 1].
>>    * @param String to get length.
>>    * @param Start byte offset. Must point to the start of symbol. On
>>    *        invalid symbol an error is returned.
>>    * @param End byte offset. Can point to the middle of symbol.
>>    *        Partial symbol is counted too.
>>    * @retval not nil Symbol count.
>> - * @retval nil, error Error. Byte position of the error is
>> + * @retval nil, number Error. Byte position of the error is
>>    *         returned in the second value.
>> + * @retval nil, string Error. Reason is returned in the second
>> + *         value.
>>    */
>>   static int
>>   utf8_len(struct lua_State *L)
>>
>>
> 
> [0, #str] for the end position.
> 
> Excuse me for nitpicking.


diff --git a/src/lua/utf8.c b/src/lua/utf8.c
index 8f0ca65e5..d124e5321 100644
--- a/src/lua/utf8.c
+++ b/src/lua/utf8.c
@@ -144,12 +144,12 @@ utf8_convert_offset(int offset, size_t len)
   * Calculate length of a UTF8 string. Length here is symbol count.
   * Works like utf8.len in Lua 5.3. Can take negative offsets. A
   * negative offset is an offset from the end of string.
- * Positive position must be inside [1, #str + 1].
+ * Positive position must be inside .
   * @param String to get length.
- * @param Start byte offset. Must point to the start of symbol. On
- *        invalid symbol an error is returned.
- * @param End byte offset. Can point to the middle of symbol.
- *        Partial symbol is counted too.
+ * @param Start byte offset in [1, #str + 1]. Must point to the
+ *        start of symbol. On invalid symbol an error is returned.
+ * @param End byte offset in [0, #str]. Can point to the middle of
+ *        symbol. Partial symbol is counted too.
   * @retval not nil Symbol count.
   * @retval nil, number Error. Byte position of the error is
   *         returned in the second value.


> 
> WBR, Alexander Turenko.
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 3/5] Merge box_error, stat and collations into core library
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library Vladislav Shpilevoy
  2018-05-04 11:36   ` [tarantool-patches] " Alexander Turenko
@ 2018-05-08 13:18   ` Konstantin Osipov
  2018-05-10 21:06     ` Vladislav Shpilevoy
  1 sibling, 1 reply; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-08 13:18 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/04/29 01:47]:
> The goal is to expose collations into Lua with no dependencies on
> box library. But collations merge into core requires box_error
> and stat libraries too.

Please remove coll_cache dependency on box_error and stat, rather
than move box_error/box_stat over. If you need nice errors in box
for operations with collation cache, you can add wrappers in the
box (or you can raise errors defined in src/ as is already the
case for socket and I/O errors).

> ---
>  cmake/module.cmake         |  4 ++--
>  src/CMakeLists.txt         | 14 ++++++++++----
>  src/box/CMakeLists.txt     | 18 +++++-------------
>  src/box/lua/call.c         |  2 +-
>  src/box/lua/error.cc       |  2 +-
>  src/box/lua/net_box.c      |  2 +-
>  src/box/lua/tuple.c        |  2 +-
>  src/box/lua/xlog.c         |  2 +-
>  src/{box => }/coll.c       |  0
>  src/{box => }/coll.h       |  0
>  src/{box => }/coll_cache.c |  0
>  src/{box => }/coll_cache.h |  0
>  src/{box => }/coll_def.c   |  0
>  src/{box => }/coll_def.h   |  0
>  src/{box => }/errcode.c    |  0
>  src/{box => }/errcode.h    |  0
>  src/{box => }/error.cc     |  0
>  src/{box => }/error.h      |  0
>  src/main.cc                |  2 +-
>  src/{box => }/opt_def.c    |  0
>  src/{box => }/opt_def.h    |  0
>  test/unit/CMakeLists.txt   |  8 ++++----
>  test/unit/coll.cpp         |  3 +--
>  23 files changed, 28 insertions(+), 31 deletions(-)
>  rename src/{box => }/coll.c (100%)
>  rename src/{box => }/coll.h (100%)
>  rename src/{box => }/coll_cache.c (100%)
>  rename src/{box => }/coll_cache.h (100%)
>  rename src/{box => }/coll_def.c (100%)
>  rename src/{box => }/coll_def.h (100%)
>  rename src/{box => }/errcode.c (100%)
>  rename src/{box => }/errcode.h (100%)
>  rename src/{box => }/error.cc (100%)
>  rename src/{box => }/error.h (100%)
>  rename src/{box => }/opt_def.c (100%)
>  rename src/{box => }/opt_def.h (100%)
> 
> diff --git a/cmake/module.cmake b/cmake/module.cmake
> index 988dcb94d..434938cdf 100644
> --- a/cmake/module.cmake
> +++ b/cmake/module.cmake
> @@ -25,7 +25,7 @@ function(rebuild_module_api)
>          COMMAND ${CMAKE_C_COMPILER}
>              ${cflags}
>              -I ${CMAKE_SOURCE_DIR}/src -I ${CMAKE_BINARY_DIR}/src
> -            -E ${CMAKE_SOURCE_DIR}/src/box/errcode.h > ${errcodefile}
> +            -E ${CMAKE_SOURCE_DIR}/src/errcode.h > ${errcodefile}
>          COMMAND
>              grep "enum box_error_code" ${errcodefile} >> ${tmpfile}
>          COMMAND cat ${CMAKE_CURRENT_SOURCE_DIR}/module_footer.h >> ${tmpfile}
> @@ -34,7 +34,7 @@ function(rebuild_module_api)
>          DEPENDS ${CMAKE_SOURCE_DIR}/extra/apigen
>                  ${CMAKE_CURRENT_SOURCE_DIR}/module_header.h
>                  ${CMAKE_CURRENT_SOURCE_DIR}/module_footer.h
> -                ${CMAKE_SOURCE_DIR}/src/box/errcode.h
> +                ${CMAKE_SOURCE_DIR}/src/errcode.h
>                  ${headers}
>          )
>  
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 8ab09e968..45eb7a431 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -94,6 +94,15 @@ set (core_sources
>       random.c
>       trigger.cc
>       http_parser.c
> +     error.cc
> +     errcode.c
> +     rmean.c
> +     latency.c
> +     histogram.c
> +     coll_def.c
> +     coll.c
> +     coll_cache.c
> +     opt_def.c
>   )
>  
>  if (TARGET_OS_NETBSD)
> @@ -112,9 +121,6 @@ target_link_libraries(core
>      ${MSGPUCK_LIBRARIES}
>  )
>  
> -add_library(stat STATIC rmean.c latency.c histogram.c)
> -target_link_libraries(stat core)
> -
>  if (ENABLE_BACKTRACE AND NOT TARGET_OS_DARWIN)
>  	target_link_libraries(core gcc_s ${UNWIND_LIBRARIES})
>  endif()
> @@ -190,7 +196,7 @@ set(api_headers
>      ${CMAKE_SOURCE_DIR}/src/box/box.h
>      ${CMAKE_SOURCE_DIR}/src/box/index.h
>      ${CMAKE_SOURCE_DIR}/src/box/iterator_type.h
> -    ${CMAKE_SOURCE_DIR}/src/box/error.h
> +    ${CMAKE_SOURCE_DIR}/src/error.h
>      ${CMAKE_SOURCE_DIR}/src/box/lua/call.h
>      ${CMAKE_SOURCE_DIR}/src/box/lua/tuple.h
>      ${CMAKE_SOURCE_DIR}/src/latch.h
> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt
> index ae156a2f5..30d636e47 100644
> --- a/src/box/CMakeLists.txt
> +++ b/src/box/CMakeLists.txt
> @@ -21,15 +21,12 @@ set_property(DIRECTORY PROPERTY ADDITIONAL_MAKE_CLEAN_FILES ${lua_sources})
>  
>  include_directories(${ZSTD_INCLUDE_DIRS})
>  
> -add_library(box_error STATIC error.cc errcode.c)
> -target_link_libraries(box_error core stat)
> -
>  add_library(vclock STATIC vclock.c)
>  target_link_libraries(vclock core)
>  
>  add_library(xrow STATIC xrow.c iproto_constants.c)
> -target_link_libraries(xrow server core small vclock misc box_error
> -                      scramble ${MSGPUCK_LIBRARIES})
> +target_link_libraries(xrow server core small vclock misc scramble
> +                      ${MSGPUCK_LIBRARIES})
>  
>  add_library(tuple STATIC
>      tuple.c
> @@ -41,20 +38,15 @@ add_library(tuple STATIC
>      tuple_bloom.c
>      tuple_dictionary.c
>      key_def.cc
> -    coll_def.c
> -    coll.c
> -    coll_cache.c
>      field_def.c
> -    opt_def.c
>  )
> -target_link_libraries(tuple json_path box_error core ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES} misc bit)
> +target_link_libraries(tuple json_path core ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES} misc bit)
>  
>  add_library(xlog STATIC xlog.c)
> -target_link_libraries(xlog core box_error crc32 ${ZSTD_LIBRARIES})
> +target_link_libraries(xlog core crc32 ${ZSTD_LIBRARIES})
>  
>  add_library(box STATIC
>      iproto.cc
> -    error.cc
>      xrow_io.cc
>      tuple_convert.c
>      identifier.c
> @@ -132,6 +124,6 @@ add_library(box STATIC
>      lua/xlog.c
>      ${bin_sources})
>  
> -target_link_libraries(box box_error tuple stat xrow xlog vclock crc32 scramble
> +target_link_libraries(box tuple xrow xlog vclock crc32 scramble
>                        ${common_libraries})
>  add_dependencies(box build_bundled_libs)
> diff --git a/src/box/lua/call.c b/src/box/lua/call.c
> index be13812aa..b60c6c397 100644
> --- a/src/box/lua/call.c
> +++ b/src/box/lua/call.c
> @@ -30,7 +30,7 @@
>   */
>  #include "box/lua/call.h"
>  #include "box/call.h"
> -#include "box/error.h"
> +#include <error.h>
>  #include "fiber.h"
>  
>  #include "lua/utils.h"
> diff --git a/src/box/lua/error.cc b/src/box/lua/error.cc
> index 314907421..960ea2aa9 100644
> --- a/src/box/lua/error.cc
> +++ b/src/box/lua/error.cc
> @@ -40,7 +40,7 @@ extern "C" {
>  #include <errinj.h>
>  
>  #include "lua/utils.h"
> -#include "box/error.h"
> +#include "src/error.h"
>  
>  static int
>  luaT_error_raise(lua_State *L)
> diff --git a/src/box/lua/net_box.c b/src/box/lua/net_box.c
> index db2d2dbb4..f5cc8d674 100644
> --- a/src/box/lua/net_box.c
> +++ b/src/box/lua/net_box.c
> @@ -43,7 +43,7 @@
>  #include "third_party/base64.h"
>  
>  #include "coio.h"
> -#include "box/errcode.h"
> +#include "errcode.h"
>  #include "lua/fiber.h"
>  
>  #define cfg luaL_msgpack_default
> diff --git a/src/box/lua/tuple.c b/src/box/lua/tuple.c
> index 67abb32bd..889f0569b 100644
> --- a/src/box/lua/tuple.c
> +++ b/src/box/lua/tuple.c
> @@ -39,7 +39,7 @@
>  
>  #include "box/tuple.h"
>  #include "box/tuple_convert.h"
> -#include "box/errcode.h"
> +#include "errcode.h"
>  #include "box/memtx_tuple.h"
>  #include "json/path.h"
>  
> diff --git a/src/box/lua/xlog.c b/src/box/lua/xlog.c
> index 030f5c2d4..c53b5a21a 100644
> --- a/src/box/lua/xlog.c
> +++ b/src/box/lua/xlog.c
> @@ -35,7 +35,7 @@
>  #include <diag.h>
>  #include <msgpuck/msgpuck.h>
>  
> -#include <box/error.h>
> +#include "error.h"
>  #include <box/xlog.h>
>  #include <box/xrow.h>
>  #include <box/iproto_constants.h>
> diff --git a/src/box/coll.c b/src/coll.c
> similarity index 100%
> rename from src/box/coll.c
> rename to src/coll.c
> diff --git a/src/box/coll.h b/src/coll.h
> similarity index 100%
> rename from src/box/coll.h
> rename to src/coll.h
> diff --git a/src/box/coll_cache.c b/src/coll_cache.c
> similarity index 100%
> rename from src/box/coll_cache.c
> rename to src/coll_cache.c
> diff --git a/src/box/coll_cache.h b/src/coll_cache.h
> similarity index 100%
> rename from src/box/coll_cache.h
> rename to src/coll_cache.h
> diff --git a/src/box/coll_def.c b/src/coll_def.c
> similarity index 100%
> rename from src/box/coll_def.c
> rename to src/coll_def.c
> diff --git a/src/box/coll_def.h b/src/coll_def.h
> similarity index 100%
> rename from src/box/coll_def.h
> rename to src/coll_def.h
> diff --git a/src/box/errcode.c b/src/errcode.c
> similarity index 100%
> rename from src/box/errcode.c
> rename to src/errcode.c
> diff --git a/src/box/errcode.h b/src/errcode.h
> similarity index 100%
> rename from src/box/errcode.h
> rename to src/errcode.h
> diff --git a/src/box/error.cc b/src/error.cc
> similarity index 100%
> rename from src/box/error.cc
> rename to src/error.cc
> diff --git a/src/box/error.h b/src/error.h
> similarity index 100%
> rename from src/box/error.h
> rename to src/error.h
> diff --git a/src/main.cc b/src/main.cc
> index 1682baea0..3d61e3c51 100644
> --- a/src/main.cc
> +++ b/src/main.cc
> @@ -63,7 +63,7 @@
>  #include "tt_pthread.h"
>  #include "lua/init.h"
>  #include "box/box.h"
> -#include "box/error.h"
> +#include "error.h"
>  #include "scoped_guard.h"
>  #include "random.h"
>  #include "tt_uuid.h"
> diff --git a/src/box/opt_def.c b/src/opt_def.c
> similarity index 100%
> rename from src/box/opt_def.c
> rename to src/opt_def.c
> diff --git a/src/box/opt_def.h b/src/opt_def.h
> similarity index 100%
> rename from src/box/opt_def.h
> rename to src/opt_def.h
> diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt
> index 049edf641..8792a3ed2 100644
> --- a/test/unit/CMakeLists.txt
> +++ b/test/unit/CMakeLists.txt
> @@ -90,10 +90,10 @@ add_executable(fiber_channel_stress.test fiber_channel_stress.cc)
>  target_link_libraries(fiber_channel_stress.test core)
>  
>  add_executable(cbus_stress.test cbus_stress.c)
> -target_link_libraries(cbus_stress.test core stat)
> +target_link_libraries(cbus_stress.test core)
>  
>  add_executable(cbus.test cbus.c)
> -target_link_libraries(cbus.test core unit stat)
> +target_link_libraries(cbus.test core unit)
>  
>  add_executable(coio.test coio.cc)
>  target_link_libraries(coio.test core eio bit uri unit)
> @@ -133,9 +133,9 @@ add_executable(json_path.test json_path.c)
>  target_link_libraries(json_path.test json_path unit ${ICU_LIBRARIES})
>  
>  add_executable(rmean.test rmean.cc)
> -target_link_libraries(rmean.test stat unit)
> +target_link_libraries(rmean.test core unit)
>  add_executable(histogram.test histogram.c)
> -target_link_libraries(histogram.test stat unit)
> +target_link_libraries(histogram.test core unit)
>  
>  add_executable(say.test say.c)
>  target_link_libraries(say.test core unit)
> diff --git a/test/unit/coll.cpp b/test/unit/coll.cpp
> index e1643c9c0..acf69b030 100644
> --- a/test/unit/coll.cpp
> +++ b/test/unit/coll.cpp
> @@ -1,9 +1,8 @@
> -#include "box/coll.h"
> +#include "coll.h"
>  #include <iostream>
>  #include <vector>
>  #include <algorithm>
>  #include <string.h>
> -#include <box/coll_def.h>
>  #include <assert.h>
>  #include <msgpuck.h>
>  #include <diag.h>
> -- 
> 2.15.1 (Apple Git-101)
> 
> 

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 4/5] Always store built-in collations in the cache
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 4/5] Always store built-in collations in the cache Vladislav Shpilevoy
@ 2018-05-08 13:20   ` Konstantin Osipov
  0 siblings, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-08 13:20 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/04/29 01:47]:

> To be able to use collations out of box.cfg it is needed to be
> sure, that they are available always. Else it can break non-box
> collation usage. So when a built-in collation is deleted from
> _collation (for example, on box.internal.bootstrap()), it is not
> deleted from the cache. When a built-in collation is inserted
> into _collation, it does not touch the cache (it already contains
> all built-in collations).

Looks good.


-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module
  2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
  2018-05-04 22:33   ` [tarantool-patches] " Alexander Turenko
@ 2018-05-08 13:21   ` Konstantin Osipov
  1 sibling, 0 replies; 21+ messages in thread
From: Konstantin Osipov @ 2018-05-08 13:21 UTC (permalink / raw)
  To: tarantool-patches; +Cc: alexander.turenko

* Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/04/29 01:47]:
> utf8 is a module partially compatible with Lua 5.3 utf8 and
> lua-utf8 third party module.
> Partially means, that not all functions are implemented.
> 
> The patch introduces these ones:
> upper, lower, len, char, sub.
> 
> All of them works exactly like in Lua 5.3, or if Lua 5.3 has no
> a function, then works like in lua-utf8.
> 
> Tarantool utf8 has extensions:
> 
> * isupper/lower/alpha/digit, that check some property by a symbol
>   or by its code;
> 
> * cmp/casecmp, that compare two UTF8 strings.

This is OK as well.

 

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tarantool-patches] Re: [PATCH v2 3/5] Merge box_error, stat and collations into core library
  2018-05-08 13:18   ` Konstantin Osipov
@ 2018-05-10 21:06     ` Vladislav Shpilevoy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladislav Shpilevoy @ 2018-05-10 21:06 UTC (permalink / raw)
  To: tarantool-patches, Konstantin Osipov; +Cc: alexander.turenko

Hello. Thanks for review!

On 08/05/2018 16:18, Konstantin Osipov wrote:
> * Vladislav Shpilevoy <v.shpilevoy@tarantool.org> [18/04/29 01:47]:
>> The goal is to expose collations into Lua with no dependencies on
>> box library. But collations merge into core requires box_error
>> and stat libraries too.
> 
> Please remove coll_cache dependency on box_error and stat, rather
> than move box_error/box_stat over. If you need nice errors in box
> for operations with collation cache, you can add wrappers in the
> box (or you can raise errors defined in src/ as is already the
> case for socket and I/O errors).

Fixed on the branch.

> 
>> ---
>>   cmake/module.cmake         |  4 ++--
>>   src/CMakeLists.txt         | 14 ++++++++++----
>>   src/box/CMakeLists.txt     | 18 +++++-------------
>>   src/box/lua/call.c         |  2 +-
>>   src/box/lua/error.cc       |  2 +-
>>   src/box/lua/net_box.c      |  2 +-
>>   src/box/lua/tuple.c        |  2 +-
>>   src/box/lua/xlog.c         |  2 +-
>>   src/{box => }/coll.c       |  0
>>   src/{box => }/coll.h       |  0
>>   src/{box => }/coll_cache.c |  0
>>   src/{box => }/coll_cache.h |  0
>>   src/{box => }/coll_def.c   |  0
>>   src/{box => }/coll_def.h   |  0
>>   src/{box => }/errcode.c    |  0
>>   src/{box => }/errcode.h    |  0
>>   src/{box => }/error.cc     |  0
>>   src/{box => }/error.h      |  0
>>   src/main.cc                |  2 +-
>>   src/{box => }/opt_def.c    |  0
>>   src/{box => }/opt_def.h    |  0
>>   test/unit/CMakeLists.txt   |  8 ++++----
>>   test/unit/coll.cpp         |  3 +--
>>   23 files changed, 28 insertions(+), 31 deletions(-)
>>   rename src/{box => }/coll.c (100%)
>>   rename src/{box => }/coll.h (100%)
>>   rename src/{box => }/coll_cache.c (100%)
>>   rename src/{box => }/coll_cache.h (100%)
>>   rename src/{box => }/coll_def.c (100%)
>>   rename src/{box => }/coll_def.h (100%)
>>   rename src/{box => }/errcode.c (100%)
>>   rename src/{box => }/errcode.h (100%)
>>   rename src/{box => }/error.cc (100%)
>>   rename src/{box => }/error.h (100%)
>>   rename src/{box => }/opt_def.c (100%)
>>   rename src/{box => }/opt_def.h (100%)
>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2018-05-10 21:06 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-28 22:45 [tarantool-patches] [PATCH v2 0/5] Expose ICU into Lua Vladislav Shpilevoy
2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 1/5] alter: fix assertion in collations alter Vladislav Shpilevoy
2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 2/5] Move struct on_access_denied_ctx into error.h Vladislav Shpilevoy
2018-05-04 11:06   ` [tarantool-patches] " Alexander Turenko
2018-05-04 12:05     ` Vladislav Shpilevoy
2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 3/5] Merge box_error, stat and collations into core library Vladislav Shpilevoy
2018-05-04 11:36   ` [tarantool-patches] " Alexander Turenko
2018-05-04 12:05     ` Vladislav Shpilevoy
2018-05-08 13:18   ` Konstantin Osipov
2018-05-10 21:06     ` Vladislav Shpilevoy
2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 4/5] Always store built-in collations in the cache Vladislav Shpilevoy
2018-05-08 13:20   ` [tarantool-patches] " Konstantin Osipov
2018-04-28 22:45 ` [tarantool-patches] [PATCH v2 5/5] lua: introduce utf8 built-in globaly visible module Vladislav Shpilevoy
2018-05-04 22:33   ` [tarantool-patches] " Alexander Turenko
2018-05-04 23:32     ` Vladislav Shpilevoy
2018-05-05  0:18       ` Alexander Turenko
2018-05-05  0:24         ` Vladislav Shpilevoy
2018-05-05  0:38           ` Alexander Turenko
2018-05-05  0:45             ` Vladislav Shpilevoy
2018-05-08 13:21   ` Konstantin Osipov
2018-05-05  0:19 ` [tarantool-patches] Re: [PATCH v2 0/5] Expose ICU into Lua Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox