Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH] Add missed icu symbols
@ 2020-09-21 19:51 HustonMmmavr
  2020-09-21 20:29 ` Vladislav Shpilevoy
  0 siblings, 1 reply; 4+ messages in thread
From: HustonMmmavr @ 2020-09-21 19:51 UTC (permalink / raw)
  To: tarantool-patches, yaroslav.dynnikov, alexander.turenko,
	v.shpilevoy, tsafin

After patch 03790ac55 symbols exports was changed (now we have to
export required symbols manually). Icu symbols, required by icu-date
rock (as ffi calls) are unused at linkage stage of tarantool binary
and thrown away from it so icu-date won't work (in case of tarantool
static build). This patch fixes this behaviour by adding symbols
required by icu-date rock to symbols export list.

Close #5266
---

Branch: https://github.com/tarantool/tarantool/tree/HustonMmmavr/gh-5266-icu-symbols
Issue: https://github.com/tarantool/tarantool/issues/5266

 src/exports.c                             |  2 +
 src/exports_icu.h                         | 58 ++++++++++++++
 test/box-tap/gh-5266-icu-exports.test.lua | 97 +++++++++++++++++++++++
 3 files changed, 157 insertions(+)
 create mode 100644 src/exports_icu.h
 create mode 100755 test/box-tap/gh-5266-icu-exports.test.lua

diff --git a/src/exports.c b/src/exports.c
index a3c27143e..3dfef62f0 100644
--- a/src/exports.c
+++ b/src/exports.c
@@ -95,6 +95,7 @@
  */
 #define EXPORT(symbol) extern void symbol(void);
 #include "exports.h"
+#include "exports_icu.h"
 #undef EXPORT
 
 void **
@@ -108,6 +109,7 @@ export_syms(void)
 	#define EXPORT(symbol) ((void *)symbol),
 	static void *symbols[] = {
 		#include "exports.h"
+		#include "exports_icu.h"
 	};
 	#undef EXPORT
 	return symbols;
diff --git a/src/exports_icu.h b/src/exports_icu.h
new file mode 100644
index 000000000..1899a6749
--- /dev/null
+++ b/src/exports_icu.h
@@ -0,0 +1,58 @@
+#include <unicode/urename.h>
+
+/**
+* ICU libraries contains symbols with appended lib version suffix.
+* (This behaviour conifgures by --enable-renaming/disable flag when
+* building libicu; by default --enable-renaming is used)
+*
+* To solve symbols suffix problem <unicode/urename.h> file was included.
+* At this file used #define macros to rename symbols.
+* For example: symobl `u_strlen` converts to `ustrlen_VER`
+*
+* Next symbols used at icu-date rock (also here added u_getVersion symbol)
+* Also we assume that icu version is >= 55, because two symbols used by
+* icu-date rock appeared at following library versions:
+*  - ucal_getTimeZoneID at icu 51
+*  - udat_formatCalendar at icu 55
+*
+* To add this symbols to export list properly (to keep backward compatibility
+* with an older versions of icu) was used define from icu library:
+* U_ICU_VERSION_MAJOR_NUM.
+*
+* Older versions of libicu, for example 4.8 (or 48) has following define:
+*   #define U_ICU_VERSION_MAJOR_NUM 4
+* From version 49 of libicu numbering changed, for example libicu 50 has
+* following define:
+*   #define U_ICU_VERSION_MAJOR_NUM 50
+* Here are links for version numbering of libicu:
+*   - Changes at icu 49: http://site.icu-project.org/download/49
+*   - Version numbering: http://userguide.icu-project.org/design#TOC-Version-Numbers-in-ICU
+*/
+EXPORT(u_strlen)
+EXPORT(u_uastrcpy)
+EXPORT(u_austrcpy)
+EXPORT(u_errorName)
+EXPORT(u_getVersion)
+EXPORT(udat_open)
+EXPORT(udat_setLenient)
+EXPORT(udat_close)
+EXPORT(udat_parseCalendar)
+#if U_ICU_VERSION_MAJOR_NUM >= 55
+    EXPORT(udat_formatCalendar)
+#endif
+EXPORT(ucal_open)
+EXPORT(ucal_close)
+EXPORT(ucal_get)
+EXPORT(ucal_set)
+EXPORT(ucal_add)
+EXPORT(ucal_clear)
+EXPORT(ucal_clearField)
+EXPORT(ucal_getMillis)
+EXPORT(ucal_setMillis)
+EXPORT(ucal_getAttribute)
+EXPORT(ucal_setAttribute)
+#if U_ICU_VERSION_MAJOR_NUM >= 51
+    EXPORT(ucal_getTimeZoneID)
+#endif
+EXPORT(ucal_setTimeZone)
+EXPORT(ucal_getNow)
diff --git a/test/box-tap/gh-5266-icu-exports.test.lua b/test/box-tap/gh-5266-icu-exports.test.lua
new file mode 100755
index 000000000..49e012fac
--- /dev/null
+++ b/test/box-tap/gh-5266-icu-exports.test.lua
@@ -0,0 +1,97 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local ffi = require('ffi')
+
+local test = tap.test('icu_exports')
+test:plan(2)
+
+-- Find `u_getVersion` symbol (by nm) from tarantool binary
+-- for further detection of icu version. There are two approaches:
+-- * some of icu packages appends icu version to it's symbols as suffix,
+--   so we need to simple parse it.
+-- * other packages don't append version as suffix, so in this case
+--   we have to call u_getVersion through ffi to grab icu version.
+-- On Freebsd nm requires -D option, while on Linux it's not necessary
+-- and macOS nm doesn't have this option.
+local tnt_path = arg[-1]
+local pipe = io.popen(string.format(
+    'nm %s %s | grep u_getVersion',
+    jit.os == "BSD" and '-D' or '',
+    tnt_path
+))
+local u_strlen_info = pipe:read('*all')
+pipe:close()
+
+test:ok(u_strlen_info ~= '', "Finded u_getVersion symbol (to get icu version)")
+local icu_version_suffix = u_strlen_info:gsub('.*u_getVersion', ''):gsub('\n', '')
+
+local icu_version_major = tonumber(icu_version_suffix:gsub('_', ''), 10)
+if icu_version_suffix == '' then
+    ffi.cdef([[
+        typedef uint8_t UVersionInfo[4];
+        void u_getVersion(UVersionInfo versionArray);
+    ]])
+    local version = ffi.new('UVersionInfo')
+    ffi.C.u_getVersion(version)
+    icu_version_major = version[0]
+end
+
+ffi.cdef([[
+    void *dlsym(void *handle, const char *symbol);
+]])
+
+-- See `man 3 dlsym`:
+-- RTLD_DEFAULT
+--   Find the first occurrence of the desired symbol using the default
+--   shared object search order. The search will include global symbols
+--   in the executable and its dependencies, as well as symbols in shared
+--   objects that were dynamically loaded with the RTLD_GLOBAL flag.
+local RTLD_DEFAULT = ffi.cast("void *",
+    (jit.os == "OSX" or jit.os == "BSD") and -2LL or 0LL
+)
+
+local icu_symbols = {
+    'u_strlen',
+    'u_uastrcpy',
+    'u_austrcpy',
+    'u_errorName',
+    'u_getVersion',
+    'udat_open',
+    'udat_setLenient',
+    'udat_close',
+    'udat_parseCalendar',
+    'ucal_open',
+    'ucal_close',
+    'ucal_get',
+    'ucal_set',
+    'ucal_add',
+    'ucal_clear',
+    'ucal_clearField',
+    'ucal_getMillis',
+    'ucal_setMillis',
+    'ucal_getAttribute',
+    'ucal_setAttribute',
+    'ucal_setTimeZone',
+    'ucal_getNow',
+}
+
+if icu_version_major >= 51 then
+    table.insert(icu_symbols, 'ucal_getTimeZoneID')
+end
+if icu_version_major >= 55 then
+    table.insert(icu_symbols, 'udat_formatCalendar')
+end
+
+test:test('icu_symbols', function(t)
+    t:plan(#icu_symbols)
+    for _, sym in ipairs(icu_symbols) do
+        local version_sym = sym .. icu_version_suffix
+        t:ok(
+            ffi.C.dlsym(RTLD_DEFAULT, version_sym) ~= nil,
+            ('Symbol %q found'):format(version_sym)
+        )
+    end
+end)
+
+os.exit(test:check() and 0 or 1)
-- 
2.26.2

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH] Add missed icu symbols
  2020-09-21 19:51 [Tarantool-patches] [PATCH] Add missed icu symbols HustonMmmavr
@ 2020-09-21 20:29 ` Vladislav Shpilevoy
  2020-09-21 21:05   ` Timur Safin
  0 siblings, 1 reply; 4+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-21 20:29 UTC (permalink / raw)
  To: HustonMmmavr, tarantool-patches, yaroslav.dynnikov,
	alexander.turenko, tsafin

Thanks for the patch!

On 21.09.2020 21:51, HustonMmmavr wrote:
> After patch 03790ac55 symbols exports was changed (now we have to

When we mention a commit, we use a full hash (usually), and we include
the commit's title in ("...").

    ... your text 03790ac5510648d1d9648bb2281857a7992d0593 ("cmake:
    remove dynamic-list linker option") your next text ...

See 5 comments below.

> export required symbols manually). Icu symbols, required by icu-date
> rock (as ffi calls) are unused at linkage stage of tarantool binary
> and thrown away from it so icu-date won't work (in case of tarantool
> static build). This patch fixes this behaviour by adding symbols
> required by icu-date rock to symbols export list.
> 
> Close #5266
> ---
> 
> Branch: https://github.com/tarantool/tarantool/tree/HustonMmmavr/gh-5266-icu-symbols
> Issue: https://github.com/tarantool/tarantool/issues/5266
> 
>  src/exports.c                             |  2 +
>  src/exports_icu.h                         | 58 ++++++++++++++
>  test/box-tap/gh-5266-icu-exports.test.lua | 97 +++++++++++++++++++++++
>  3 files changed, 157 insertions(+)
>  create mode 100644 src/exports_icu.h
>  create mode 100755 test/box-tap/gh-5266-icu-exports.test.lua
> 
> diff --git a/src/exports.c b/src/exports.c
> index a3c27143e..3dfef62f0 100644
> --- a/src/exports.c
> +++ b/src/exports.c
> @@ -95,6 +95,7 @@
>   */
>  #define EXPORT(symbol) extern void symbol(void);
>  #include "exports.h"
> +#include "exports_icu.h"
>  #undef EXPORT
>  
>  void **
> @@ -108,6 +109,7 @@ export_syms(void)
>  	#define EXPORT(symbol) ((void *)symbol),
>  	static void *symbols[] = {
>  		#include "exports.h"
> +		#include "exports_icu.h"

1. Lets better not touch exports.c. Better leave it subsystem independent if
possible, and not duplicate the "include".

>  	};
>  	#undef EXPORT
>  	return symbols;
> diff --git a/src/exports_icu.h b/src/exports_icu.h
> new file mode 100644
> index 000000000..1899a6749
> --- /dev/null
> +++ b/src/exports_icu.h
> @@ -0,0 +1,58 @@
> +#include <unicode/urename.h>
> +
> +/**
> +* ICU libraries contains symbols with appended lib version suffix.
> +* (This behaviour conifgures by --enable-renaming/disable flag when

2. conifgures -> is configured.

> +* building libicu; by default --enable-renaming is used)
> +*
> +* To solve symbols suffix problem <unicode/urename.h> file was included.
> +* At this file used #define macros to rename symbols.
> +* For example: symobl `u_strlen` converts to `ustrlen_VER`
> +*
> +* Next symbols used at icu-date rock (also here added u_getVersion symbol)
> +* Also we assume that icu version is >= 55, because two symbols used by
> +* icu-date rock appeared at following library versions:
> +*  - ucal_getTimeZoneID at icu 51
> +*  - udat_formatCalendar at icu 55
> +*
> +* To add this symbols to export list properly (to keep backward compatibility
> +* with an older versions of icu) was used define from icu library:
> +* U_ICU_VERSION_MAJOR_NUM.
> +*
> +* Older versions of libicu, for example 4.8 (or 48) has following define:
> +*   #define U_ICU_VERSION_MAJOR_NUM 4
> +* From version 49 of libicu numbering changed, for example libicu 50 has
> +* following define:
> +*   #define U_ICU_VERSION_MAJOR_NUM 50
> +* Here are links for version numbering of libicu:
> +*   - Changes at icu 49: http://site.icu-project.org/download/49
> +*   - Version numbering: http://userguide.icu-project.org/design#TOC-Version-Numbers-in-ICU
> +*/
> +EXPORT(u_strlen)
> +EXPORT(u_uastrcpy)
> +EXPORT(u_austrcpy)
> +EXPORT(u_errorName)
> +EXPORT(u_getVersion)
> +EXPORT(udat_open)
> +EXPORT(udat_setLenient)
> +EXPORT(udat_close)
> +EXPORT(udat_parseCalendar)
> +#if U_ICU_VERSION_MAJOR_NUM >= 55
> +    EXPORT(udat_formatCalendar)
> +#endif
> +EXPORT(ucal_open)
> +EXPORT(ucal_close)
> +EXPORT(ucal_get)
> +EXPORT(ucal_set)
> +EXPORT(ucal_add)
> +EXPORT(ucal_clear)
> +EXPORT(ucal_clearField)
> +EXPORT(ucal_getMillis)
> +EXPORT(ucal_setMillis)
> +EXPORT(ucal_getAttribute)
> +EXPORT(ucal_setAttribute)
> +#if U_ICU_VERSION_MAJOR_NUM >= 51
> +    EXPORT(ucal_getTimeZoneID)
> +#endif
> +EXPORT(ucal_setTimeZone)
> +EXPORT(ucal_getNow)

3. Lets better store everything in exports.h. You can add a new
section in this file separated by a comment, that these symbols are
from ICU, to not scatter them among other exports. Also please sort
the exports in the alphabetical order, as it is asked in exports.h
comment.

> diff --git a/test/box-tap/gh-5266-icu-exports.test.lua b/test/box-tap/gh-5266-icu-exports.test.lua
> new file mode 100755
> index 000000000..49e012fac
> --- /dev/null
> +++ b/test/box-tap/gh-5266-icu-exports.test.lua

4. Since nothing depends on box here, I would move it to app-tap.

> @@ -0,0 +1,97 @@
> +#!/usr/bin/env tarantool
> +
> +local tap = require('tap')
> +local ffi = require('ffi')
> +
> +local test = tap.test('icu_exports')
> +test:plan(2)
> +
> +-- Find `u_getVersion` symbol (by nm) from tarantool binary
> +-- for further detection of icu version. There are two approaches:
> +-- * some of icu packages appends icu version to it's symbols as suffix,
> +--   so we need to simple parse it.
> +-- * other packages don't append version as suffix, so in this case
> +--   we have to call u_getVersion through ffi to grab icu version.
> +-- On Freebsd nm requires -D option, while on Linux it's not necessary
> +-- and macOS nm doesn't have this option.
> +local tnt_path = arg[-1]
> +local pipe = io.popen(string.format(
> +    'nm %s %s | grep u_getVersion',
> +    jit.os == "BSD" and '-D' or '',
> +    tnt_path
> +))
> +local u_strlen_info = pipe:read('*all')
> +pipe:close()
> +
> +test:ok(u_strlen_info ~= '', "Finded u_getVersion symbol (to get icu version)")

5. Finded -> Found.

> +local icu_version_suffix = u_strlen_info:gsub('.*u_getVersion', ''):gsub('\n', '')

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH] Add missed icu symbols
  2020-09-21 20:29 ` Vladislav Shpilevoy
@ 2020-09-21 21:05   ` Timur Safin
  2020-09-22 22:20     ` Alexandr Barulev
  0 siblings, 1 reply; 4+ messages in thread
From: Timur Safin @ 2020-09-21 21:05 UTC (permalink / raw)
  To: 'Vladislav Shpilevoy', 'HustonMmmavr',
	tarantool-patches, yaroslav.dynnikov, alexander.turenko

: From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
: Subject: Re: [PATCH] Add missed icu symbols
: 
...
: > +EXPORT(u_strlen)
: > +EXPORT(u_uastrcpy)
: > +EXPORT(u_austrcpy)
: > +EXPORT(u_errorName)
: > +EXPORT(u_getVersion)
: > +EXPORT(udat_open)
: > +EXPORT(udat_setLenient)
: > +EXPORT(udat_close)
: > +EXPORT(udat_parseCalendar)
: > +#if U_ICU_VERSION_MAJOR_NUM >= 55
: > +    EXPORT(udat_formatCalendar)
: > +#endif
: > +EXPORT(ucal_open)
: > +EXPORT(ucal_close)
: > +EXPORT(ucal_get)
: > +EXPORT(ucal_set)
: > +EXPORT(ucal_add)
: > +EXPORT(ucal_clear)
: > +EXPORT(ucal_clearField)
: > +EXPORT(ucal_getMillis)
: > +EXPORT(ucal_setMillis)
: > +EXPORT(ucal_getAttribute)
: > +EXPORT(ucal_setAttribute)
: > +#if U_ICU_VERSION_MAJOR_NUM >= 51
: > +    EXPORT(ucal_getTimeZoneID)
: > +#endif
: > +EXPORT(ucal_setTimeZone)
: > +EXPORT(ucal_getNow)
: 
: 3. Lets better store everything in exports.h. You can add a new
: section in this file separated by a comment, that these symbols are
: from ICU, to not scatter them among other exports. Also please sort
: the exports in the alphabetical order, as it is asked in exports.h
: comment.

Agreed with Vlad here - keeping all inside of single exports.h would make 
this code much, much easier to maintain. I'd recommend you to return back 
to the original idea with using special EXPORT_ICU(name) macros which would 
expand to EXPORT(U_ICU_ENTRY_POINT_RENAME(name)) _iff_ not U_DISABLE_RENAMING
and to EXPORT(name) otherwise.

And for +#if U_ICU_VERSION_MAJOR_NUM >= 51 I'd suggest (but not insist) to 
create yet another wrapper EXPORT_ICU_51UP(name). This would make it looks 
very consistent with others. Though with 1 usage case it might be overkill.

Regards,
Timur

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Tarantool-patches] [PATCH] Add missed icu symbols
  2020-09-21 21:05   ` Timur Safin
@ 2020-09-22 22:20     ` Alexandr Barulev
  0 siblings, 0 replies; 4+ messages in thread
From: Alexandr Barulev @ 2020-09-22 22:20 UTC (permalink / raw)
  To: Timur Safin
  Cc: Alexander Turenko, tarantool-patches, Vladislav Shpilevoy,
	Yaroslav Dynnikov

[-- Attachment #1: Type: text/plain, Size: 2666 bytes --]

Hello, thank you for the review!

I took into account the comments and soon i will send new version
of the patch.

About additional macroses like EXPORT_ICU and others, as Timur
mentioned, seems there is no need to create them, because as I
noted in commentary at src/exports_icu.h - unicode/urename.h
file do this work (maps exported symbols to symbols with version
suffix, if library is build with option --enable-renaming). Also we
can't rely on U_ICU_ENTRY_POINT_RENAME macros, because old versions
of libicu don't contain this macros and unicode/urename.h has a
simple symbols mapping:
#define symbol symbol_suffix
So, I think, include <unicode/urename.h> is enough.

Regards,
Alexandr

вт, 22 сент. 2020 г. в 00:05, Timur Safin <tsafin@tarantool.org>:

> : From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
> : Subject: Re: [PATCH] Add missed icu symbols
> :
> ...
> : > +EXPORT(u_strlen)
> : > +EXPORT(u_uastrcpy)
> : > +EXPORT(u_austrcpy)
> : > +EXPORT(u_errorName)
> : > +EXPORT(u_getVersion)
> : > +EXPORT(udat_open)
> : > +EXPORT(udat_setLenient)
> : > +EXPORT(udat_close)
> : > +EXPORT(udat_parseCalendar)
> : > +#if U_ICU_VERSION_MAJOR_NUM >= 55
> : > +    EXPORT(udat_formatCalendar)
> : > +#endif
> : > +EXPORT(ucal_open)
> : > +EXPORT(ucal_close)
> : > +EXPORT(ucal_get)
> : > +EXPORT(ucal_set)
> : > +EXPORT(ucal_add)
> : > +EXPORT(ucal_clear)
> : > +EXPORT(ucal_clearField)
> : > +EXPORT(ucal_getMillis)
> : > +EXPORT(ucal_setMillis)
> : > +EXPORT(ucal_getAttribute)
> : > +EXPORT(ucal_setAttribute)
> : > +#if U_ICU_VERSION_MAJOR_NUM >= 51
> : > +    EXPORT(ucal_getTimeZoneID)
> : > +#endif
> : > +EXPORT(ucal_setTimeZone)
> : > +EXPORT(ucal_getNow)
> :
> : 3. Lets better store everything in exports.h. You can add a new
> : section in this file separated by a comment, that these symbols are
> : from ICU, to not scatter them among other exports. Also please sort
> : the exports in the alphabetical order, as it is asked in exports.h
> : comment.
>
> Agreed with Vlad here - keeping all inside of single exports.h would make
> this code much, much easier to maintain. I'd recommend you to return back
> to the original idea with using special EXPORT_ICU(name) macros which
> would
> expand to EXPORT(U_ICU_ENTRY_POINT_RENAME(name)) _iff_ not
> U_DISABLE_RENAMING
> and to EXPORT(name) otherwise.
>
> And for +#if U_ICU_VERSION_MAJOR_NUM >= 51 I'd suggest (but not insist) to
> create yet another wrapper EXPORT_ICU_51UP(name). This would make it looks
> very consistent with others. Though with 1 usage case it might be overkill.
>
> Regards,
> Timur
>
>

[-- Attachment #2: Type: text/html, Size: 3270 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-09-22 22:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-21 19:51 [Tarantool-patches] [PATCH] Add missed icu symbols HustonMmmavr
2020-09-21 20:29 ` Vladislav Shpilevoy
2020-09-21 21:05   ` Timur Safin
2020-09-22 22:20     ` Alexandr Barulev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox