Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v2] Add missed icu symbols
@ 2020-09-23 11:02 HustonMmmavr
  2020-09-24 21:10 ` Vladislav Shpilevoy
  0 siblings, 1 reply; 5+ messages in thread
From: HustonMmmavr @ 2020-09-23 11:02 UTC (permalink / raw)
  To: tarantool-patches, yaroslav.dynnikov, alexander.turenko,
	v.shpilevoy, tsafin

After patch 03790ac5510648d1d9648bb2281857a7992d0593 ("cmake: remove
dynamic-list linker option") symbols exports was changed (now we have
to export required symbols manually). Icu symbols, required by icu-date
rock (as ffi calls) are unused at linkage stage of tarantool binary
and thrown away from it so icu-date won't work (in case of tarantool
static build). This patch fixes this behaviour by adding symbols
required by icu-date rock to symbols export list.

Close #5266
---

Branch: https://github.com/tarantool/tarantool/tree/HustonMmmavr/gh-5266-icu-symbols
Issue: https://github.com/tarantool/tarantool/issues/5266

Changes in v2:
* Fixed commit message
* Fixed typos in tests and commentaries
* Moved test from box-tap to app-tap
* Moved icu symbol list to exports.h
* Icu export symbol list was sorted

 src/exports.h                             | 60 ++++++++++++++
 test/app-tap/gh-5266-icu-exports.test.lua | 97 +++++++++++++++++++++++
 2 files changed, 157 insertions(+)
 create mode 100755 test/app-tap/gh-5266-icu-exports.test.lua

diff --git a/src/exports.h b/src/exports.h
index 6d8303180..bcf0e369f 100644
--- a/src/exports.h
+++ b/src/exports.h
@@ -493,3 +493,63 @@ EXPORT(uri_format)
 EXPORT(uri_parse)
 EXPORT(uuid_nil)
 EXPORT(uuid_unpack)
+
+/**
+* ICU symbols
+*
+* ICU libraries contains symbols with appended lib version suffix.
+* (This behaviour is configured by --enable-renaming/disable flag when
+* building libicu; by default --enable-renaming is used)
+*
+* To solve symbols suffix problem <unicode/urename.h> file was included.
+* At this file used #define macros to rename symbols.
+* For example: symobl `u_strlen` converts to `ustrlen_VER`
+*
+* Next symbols used at icu-date rock (also here added u_getVersion symbol)
+* Also we assume that icu version is >= 55, because two symbols used by
+* icu-date rock appeared at following library versions:
+*  - ucal_getTimeZoneID at icu 51
+*  - udat_formatCalendar at icu 55
+*
+* To add this symbols to export list properly (to keep backward compatibility
+* with an older versions of icu) was used define from icu library:
+* U_ICU_VERSION_MAJOR_NUM.
+*
+* Older versions of libicu, for example 4.8 (or 48) has following define:
+*   #define U_ICU_VERSION_MAJOR_NUM 4
+* From version 49 of libicu numbering changed, for example libicu 50 has
+* following define:
+*   #define U_ICU_VERSION_MAJOR_NUM 50
+* Here are links for version numbering of libicu:
+*   - Changes at icu 49: http://site.icu-project.org/download/49
+*   - Version numbering: http://userguide.icu-project.org/design#TOC-Version-Numbers-in-ICU
+*/
+#include <unicode/urename.h>
+EXPORT(u_austrcpy)
+EXPORT(u_errorName)
+EXPORT(u_getVersion)
+EXPORT(u_strlen)
+EXPORT(u_uastrcpy)
+EXPORT(ucal_add)
+EXPORT(ucal_clear)
+EXPORT(ucal_clearField)
+EXPORT(ucal_close)
+EXPORT(ucal_get)
+EXPORT(ucal_getAttribute)
+EXPORT(ucal_getMillis)
+EXPORT(ucal_getNow)
+#if U_ICU_VERSION_MAJOR_NUM >= 51
+    EXPORT(ucal_getTimeZoneID)
+#endif
+EXPORT(ucal_open)
+EXPORT(ucal_set)
+EXPORT(ucal_setAttribute)
+EXPORT(ucal_setMillis)
+EXPORT(ucal_setTimeZone)
+EXPORT(udat_close)
+#if U_ICU_VERSION_MAJOR_NUM >= 55
+    EXPORT(udat_formatCalendar)
+#endif
+EXPORT(udat_open)
+EXPORT(udat_parseCalendar)
+EXPORT(udat_setLenient)
diff --git a/test/app-tap/gh-5266-icu-exports.test.lua b/test/app-tap/gh-5266-icu-exports.test.lua
new file mode 100755
index 000000000..bcf6271a5
--- /dev/null
+++ b/test/app-tap/gh-5266-icu-exports.test.lua
@@ -0,0 +1,97 @@
+#!/usr/bin/env tarantool
+
+local tap = require('tap')
+local ffi = require('ffi')
+
+local test = tap.test('icu_exports')
+test:plan(2)
+
+-- Find `u_getVersion` symbol (by nm) from tarantool binary
+-- for further detection of icu version. There are two approaches:
+-- * some of icu packages appends icu version to it's symbols as suffix,
+--   so we need to simple parse it.
+-- * other packages don't append version as suffix, so in this case
+--   we have to call u_getVersion through ffi to grab icu version.
+-- On Freebsd nm requires -D option, while on Linux it's not necessary
+-- and macOS nm doesn't have this option.
+local tnt_path = arg[-1]
+local pipe = io.popen(string.format(
+    'nm %s %s | grep u_getVersion',
+    jit.os == "BSD" and '-D' or '',
+    tnt_path
+))
+local u_strlen_info = pipe:read('*all')
+pipe:close()
+
+test:ok(u_strlen_info ~= '', "Found u_getVersion symbol (to get icu version)")
+local icu_version_suffix = u_strlen_info:gsub('.*u_getVersion', ''):gsub('\n', '')
+
+local icu_version_major = tonumber(icu_version_suffix:gsub('_', ''), 10)
+if icu_version_suffix == '' then
+    ffi.cdef([[
+        typedef uint8_t UVersionInfo[4];
+        void u_getVersion(UVersionInfo versionArray);
+    ]])
+    local version = ffi.new('UVersionInfo')
+    ffi.C.u_getVersion(version)
+    icu_version_major = version[0]
+end
+
+ffi.cdef([[
+    void *dlsym(void *handle, const char *symbol);
+]])
+
+-- See `man 3 dlsym`:
+-- RTLD_DEFAULT
+--   Find the first occurrence of the desired symbol using the default
+--   shared object search order. The search will include global symbols
+--   in the executable and its dependencies, as well as symbols in shared
+--   objects that were dynamically loaded with the RTLD_GLOBAL flag.
+local RTLD_DEFAULT = ffi.cast("void *",
+    (jit.os == "OSX" or jit.os == "BSD") and -2LL or 0LL
+)
+
+local icu_symbols = {
+    'u_strlen',
+    'u_uastrcpy',
+    'u_austrcpy',
+    'u_errorName',
+    'u_getVersion',
+    'udat_open',
+    'udat_setLenient',
+    'udat_close',
+    'udat_parseCalendar',
+    'ucal_open',
+    'ucal_close',
+    'ucal_get',
+    'ucal_set',
+    'ucal_add',
+    'ucal_clear',
+    'ucal_clearField',
+    'ucal_getMillis',
+    'ucal_setMillis',
+    'ucal_getAttribute',
+    'ucal_setAttribute',
+    'ucal_setTimeZone',
+    'ucal_getNow',
+}
+
+if icu_version_major >= 51 then
+    table.insert(icu_symbols, 'ucal_getTimeZoneID')
+end
+if icu_version_major >= 55 then
+    table.insert(icu_symbols, 'udat_formatCalendar')
+end
+
+test:test('icu_symbols', function(t)
+    t:plan(#icu_symbols)
+    for _, sym in ipairs(icu_symbols) do
+        local version_sym = sym .. icu_version_suffix
+        t:ok(
+            ffi.C.dlsym(RTLD_DEFAULT, version_sym) ~= nil,
+            ('Symbol %q found'):format(version_sym)
+        )
+    end
+end)
+
+os.exit(test:check() and 0 or 1)
-- 
2.26.2

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] Add missed icu symbols
  2020-09-23 11:02 [Tarantool-patches] [PATCH v2] Add missed icu symbols HustonMmmavr
@ 2020-09-24 21:10 ` Vladislav Shpilevoy
  2020-09-27 22:39   ` Alexandr Barulev
  0 siblings, 1 reply; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-24 21:10 UTC (permalink / raw)
  To: HustonMmmavr, tarantool-patches, yaroslav.dynnikov,
	alexander.turenko, tsafin

Hi! Thanks for the patch!

I would add 'build: ' prefix to the commit message.

On 23.09.2020 13:02, HustonMmmavr wrote:
> After patch 03790ac5510648d1d9648bb2281857a7992d0593 ("cmake: remove
> dynamic-list linker option") symbols exports was changed (now we have
> to export required symbols manually).

Actually after some thinking I am not sure it is because of that commit.
Symbols exports were always done manually. That commit changed which
symbols were hidden. So essentially it extended the exported symbols
set, not shrunk it.

And that makes me wonder how could it lead to #5266?

> Icu symbols, required by icu-date
> rock (as ffi calls) are unused at linkage stage of tarantool binary
> and thrown away from it so icu-date won't work (in case of tarantool
> static build).

They would have been thrown anyway. It is not related to the dynamic-list
option removed in the mentioned commit.

> This patch fixes this behaviour by adding symbols
> required by icu-date rock to symbols export list.
> 
> Close #5266

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] Add missed icu symbols
  2020-09-24 21:10 ` Vladislav Shpilevoy
@ 2020-09-27 22:39   ` Alexandr Barulev
  2020-09-28  6:26     ` Vladislav Shpilevoy
  0 siblings, 1 reply; 5+ messages in thread
From: Alexandr Barulev @ 2020-09-27 22:39 UTC (permalink / raw)
  To: Vladislav Shpilevoy
  Cc: Alexander Turenko, tarantool-patches, Yaroslav Dynnikov

[-- Attachment #1: Type: text/plain, Size: 2024 bytes --]

Hello, thanks for the review of new patch version!

I've fixed commit message (added `build` prefix) and rebased
from master.

I think commit, mentionied at patch message ("cmake: remove
dynamic-list linker option") is related to #5266 issue.
It's because symbols exports from required libs into tarantool binary
earlier were configured with two options:
by `--whole-archive` option;
and `--dynamic-list,${exports_file}` option, where exports_file was created
with use of mkexports script (symbols was grabbed by nm -D from shared
libraries).

Also, I've checkouted "cmake: remove dynamic-list linker option"
commit, built static tarantool and ran icu-date tests. In result
tests failed with `undefined symbol` errors.
After that, I checkouted previous commit, built tarantool again
and icu-date tests succeed

пт, 25 сент. 2020 г. в 00:10, Vladislav Shpilevoy <v.shpilevoy@tarantool.org
>:

> Hi! Thanks for the patch!
>
> I would add 'build: ' prefix to the commit message.
>
> On 23.09.2020 13:02, HustonMmmavr wrote:
> > After patch 03790ac5510648d1d9648bb2281857a7992d0593 ("cmake: remove
> > dynamic-list linker option") symbols exports was changed (now we have
> > to export required symbols manually).
>
> Actually after some thinking I am not sure it is because of that commit.
> Symbols exports were always done manually. That commit changed which
> symbols were hidden. So essentially it extended the exported symbols
> set, not shrunk it.
>
> And that makes me wonder how could it lead to #5266?
>
> > Icu symbols, required by icu-date
> > rock (as ffi calls) are unused at linkage stage of tarantool binary
> > and thrown away from it so icu-date won't work (in case of tarantool
> > static build).
>
> They would have been thrown anyway. It is not related to the dynamic-list
> option removed in the mentioned commit.
>
> > This patch fixes this behaviour by adding symbols
> > required by icu-date rock to symbols export list.
> >
> > Close #5266
>

[-- Attachment #2: Type: text/html, Size: 2487 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] Add missed icu symbols
  2020-09-27 22:39   ` Alexandr Barulev
@ 2020-09-28  6:26     ` Vladislav Shpilevoy
  2020-10-01  3:19       ` Alexander Turenko
  0 siblings, 1 reply; 5+ messages in thread
From: Vladislav Shpilevoy @ 2020-09-28  6:26 UTC (permalink / raw)
  To: Alexandr Barulev; +Cc: Alexander Turenko, tarantool-patches, Yaroslav Dynnikov

On 28.09.2020 00:39, Alexandr Barulev wrote:
> Hello, thanks for the review of new patch version!
> 
> I've fixed commit message (added `build` prefix) and rebased
> from master.
> 
> I think commit, mentionied at patch message ("cmake: remove
> dynamic-list linker option") is related to #5266 issue.
> It's because symbols exports from required libs into tarantool binary
> earlier were configured with two options:
> by `--whole-archive` option;
> and `--dynamic-list,${exports_file}` option, where exports_file was created
> with use of mkexports script (symbols was grabbed by nm -D from shared
> libraries).
> 
> Also, I've checkouted "cmake: remove dynamic-list linker option"
> commit, built static tarantool and ran icu-date tests. In result
> tests failed with `undefined symbol` errors.
> After that, I checkouted previous commit, built tarantool again
> and icu-date tests succeed

Ok, so this is because of whole-archive removal. Can it be fixed by
adding linking with the static ICU with whole-archive option instead of
manual exporting of each symbol, like it was before?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Tarantool-patches] [PATCH v2] Add missed icu symbols
  2020-09-28  6:26     ` Vladislav Shpilevoy
@ 2020-10-01  3:19       ` Alexander Turenko
  0 siblings, 0 replies; 5+ messages in thread
From: Alexander Turenko @ 2020-10-01  3:19 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tarantool-patches, Yaroslav Dynnikov

On Mon, Sep 28, 2020 at 08:26:05AM +0200, Vladislav Shpilevoy wrote:
> On 28.09.2020 00:39, Alexandr Barulev wrote:
> > Hello, thanks for the review of new patch version!
> > 
> > I've fixed commit message (added `build` prefix) and rebased
> > from master.
> > 
> > I think commit, mentionied at patch message ("cmake: remove
> > dynamic-list linker option") is related to #5266 issue.
> > It's because symbols exports from required libs into tarantool binary
> > earlier were configured with two options:
> > by `--whole-archive` option;
> > and `--dynamic-list,${exports_file}` option, where exports_file was created
> > with use of mkexports script (symbols was grabbed by nm -D from shared
> > libraries).
> > 
> > Also, I've checkouted "cmake: remove dynamic-list linker option"
> > commit, built static tarantool and ran icu-date tests. In result
> > tests failed with `undefined symbol` errors.
> > After that, I checkouted previous commit, built tarantool again
> > and icu-date tests succeed
> 
> Ok, so this is because of whole-archive removal. Can it be fixed by
> adding linking with the static ICU with whole-archive option instead of
> manual exporting of each symbol, like it was before?

(I don't know full context of the discussion.)

Decision to link a third-party library statically should be last resort
option and should be considered only when we have a real business need.
Not just to save some lines of code.

Almost all Linux distributions link libraries dynamically and we should
generally follow this way. On the distro level it allows to provide
security updates of libraries (without repackaging of half of a system),
saves some disk and some memory.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-10-01  3:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-23 11:02 [Tarantool-patches] [PATCH v2] Add missed icu symbols HustonMmmavr
2020-09-24 21:10 ` Vladislav Shpilevoy
2020-09-27 22:39   ` Alexandr Barulev
2020-09-28  6:26     ` Vladislav Shpilevoy
2020-10-01  3:19       ` Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox