Date: Mon, 7 Sep 2020 01:42:15 +0300
From: Alexander Turenko
Message-ID: <20200906224215.plpdhgav64zpeyop@tkn_work_nb>
Subject: [Tarantool-discussions] Consider exporting symbols from libraries: small, msgpuck
List-Id: Tarantool development process
To: tarantool-discussions@dev.tarantool.org

I have been accumulating thoughts on ABI compatibility for some time and
want to share them.

The main question I want to raise here: is it worth exposing the APIs of
msgpuck, small and other libraries through tarantool's module API?

Problem
-------

A tarantool module (say, memcached) uses a library that is also used in
tarantool (say, small). Let's assume that tarantool and the module use
different versions of the library and that the layout of some structure
differs between them: a non-last field was removed or a field was added in
the middle.

 | tarantool executable
 | --------------------
 |
 | /* foo.h */
 |
 | struct foo {
 |     uint64_t bar;
 |     uint64_t baz;
 |     struct foo *next;
 | };
 |
 | void
 | foo_create(struct foo *foo, struct foo *next);
 |
 | /* foo.c */
 |
 | void
 | foo_create(struct foo *foo, struct foo *next)
 | {
 |     foo->bar = 0;
 |     foo->baz = 0;
 |     foo->next = next;
 | }

 | module dynamic library
 | ----------------------
 |
 | /* foo.h */
 |
 | struct foo {
 |     /* !! no bar !! */
 |     uint64_t baz;
 |     struct foo *next;
 | };
 |
 | void
 | foo_create(struct foo *foo, struct foo *next);
 |
 | /* foo.c */
 |
 | void
 | foo_create(struct foo *foo, struct foo *next)
 | {
 |     /* !! no foo->bar = 0 !! */
 |     foo->baz = 0;
 |     foo->next = next;
 | }

Let's look at how a breakage may occur. After unhiding internal symbols in
the tarantool executable (see [1]), a call to foo_create() from the module
actually calls the function from the tarantool executable. That function
sets the module's foo->next to NULL (via `foo->baz = 0;`, which lands on the
offset where the module keeps `next`) and writes out of the structure bounds
(via `foo->next = next;`).

Note for myself: I would take extra care with inline functions in public
headers, although I have no example of a possible breakage in mind. Noted
here to think about it later.

Note: Some msgpuck symbols were exposed even before [1]. I guess it was done
to use them via LuaJIT FFI.

[1]: https://github.com/tarantool/tarantool/issues/2971

Background
----------

- Default on Linux: use a symbol from the executable file.
- macOS behaves as if RTLD_DEEPBIND were used (from Vlad Sh.)
- See dlopen(3): RTLD_DEEPBIND (place a symbol from a library before the
  global one in the lookup order).

Known cases
-----------

- LTO and ASAN complain about this.
  https://github.com/tarantool/tarantool/issues/5001
  LTO fix: https://github.com/tarantool/tarantool/commit/36927e540549fbdfd156ac3518616dbf4642711f
  ASAN fix: https://github.com/tarantool/tarantool/commit/e8c72d4fe66ea94e357af2e527cb5cc4727f09da
- memcached fails on some tarantool versions. This case is almost the same
  as the abstract one described above: the symbol unhiding patch leads to
  the breakage.
  https://github.com/tarantool/memcached/issues/59
- box_txn_alloc() changed its behaviour. Not strictly related to the problem
  described above, but it is another breakage of the tarantool public C API.
  So it is related to the question below: how to test the API to prevent
  this kind of breakage.
  https://github.com/tarantool/memcached/issues/53

My questions
------------

- Shouldn't we expose the symbols of the small and msgpuck libraries from
  the tarantool executable and ship the corresponding header files?
- How to ensure that the exposed API / ABI is stable? One may compile
  against old headers, but get symbols from a newer executable at runtime.
  - Of course, we should test tarantool changes against external modules.
    But this is not a general ABI compatibility verification: some cases may
    not be covered by a module's tests, and there may be closed-source
    modules.
  - How good are the existing ABI compatibility checkers? Say, [2].
    - Looks promising: at least the description suggests that the case above
      would be caught ('renamed fields').
  - We should define rules for changing public API structures and functions.
    Having such checklists makes life easier.
    - Many points belong here, but I'll highlight one that comes to my mind
      (just so as not to forget it): we will possibly need padding at the
      end of public structures to be able to extend them. Or we should
      explicitly state that the size of a structure is not known at build
      time, so it may not be used in arrays or allocated on the stack. If
      there is no need to provide direct access to the first N fields (say,
      for performance reasons), we can just make the structure opaque.
- Can we just ship the small / msgpuck header files and expose their symbols
  from tarantool? Or do we need a separate public API layer?
  - The former would oblige us to keep those libraries ABI compatible.
  - The latter doesn't: this way the library should only be used as a static
    one.
- How about performance? May building a module with a library (like small or
  msgpuck) directly, not using tarantool's copy, give better performance
  because of inline functions and macros?
- Can we make bundling a library into a module safe using symbol renaming
  (say, some macro magic)?
- For the particular case of using fiber()->gc: can we expose some reduced
  API from tarantool and be happy?

[2]: https://lvc.github.io/abi-compliance-checker/

Why I started the discussion?
-----------------------------

I want to implement the Lua API for key_def as an external module, which
should be based on a public C API (which in turn should be extended for this
purpose). The built-in key_def Lua module uses the fiber->gc region; region
functions are part of the small library.

Considering the version mismatch problems we have already met in the past, I
would prefer to expose the small library symbols from the tarantool
executable and use them in the module.

I found that just exposing the relevant symbols does not shield us from ABI
breakage problems, so the questions above should be resolved (sooner or
later).

Mea culpa
---------

Well, I should google for 'how to write abi compatible libraries' and read
some articles; I guess most of my questions will then be gone. I wrote the
letter above just to formalize things for myself, but then found that it may
be used as the base for further discussions.

Forward ABI compatibility guidelines
------------------------------------

This section was added later, so it may contradict something written above.

Excerpts of useful info from different sources.

- https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html
  - symbol versioning
    - when exactly is it needed? what problem does it solve?
  - policy: don't change anything, only add
  - separation of interface and implementation
    - how about macros that wrap sizeof() / alignof() calls?
  - testing
    - `make check-abi`
      - It just checks all symbols using sizeof(), alignof() and so on. I
        guess it also checks the list of symbols and each structure field.
      - Files (from gcc tarball):
        - libstdc++-v3/testsuite/Makefile.in
        - libstdc++-v3/testsuite/util/testsuite_abi_check.cc
        - libstdc++-v3/testsuite/util/testsuite_abi.{h,cc}
        - libstdc++-v3/libsupc++/cxxabi.h
        - <...>
    - `make check-c++` just runs the C++ standard library test suite. The
      idea of the ABI compatibility check is to run a test suite from one
      version against another one.
  - http://abicheck.sourceforge.net/
    - It is linked from the page. Why? Is it used in GCC?
      Is it related to `make check-abi`? Is it just a recommendation?
    - It just verifies the list of symbols used by an executable file
      against private / unstable lists. It is not a ready-to-use tool for
      comparing one ABI against another.

I went through several documents and, in brief, the best description is the
KDE project guidelines (they are often linked from other good sources):

https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B
https://community.kde.org/Policies/Binary_Compatibility_Examples

The sources are (looked at briefly):

https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html
https://www.akkadia.org/drepper/dsohowto.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1976.html
http://syrcose.ispras.ru/2009/files/02_paper.pdf
https://accu.org/content/conf2015/JonathanWakely-What%20Is%20An%20ABI%20And%20Why%20Is%20It%20So%20Complicated.pdf

What I need to think about: is it okay to use padding for a structure
instead of a d-pointer? How much padding is okay performance-wise? Is it
ever okay to have non-opaque structures (we have none now in module.h)?

Future updates
--------------

I have now investigated the area a bit and want to share certain
recommendations:

- How to expose a non-opaque structure while keeping it ABI compatible
  across different tarantool versions (padding and so on).
- How to write a Lua/C module that is able to use a feature from a new
  tarantool version, but works with reduced functionality on an old
  tarantool version (using dlsym()).

I'll do it when time permits.

WBR, Alexander Turenko.
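
The breakage from the "Problem" section can be reproduced within a single
file. This is only a minimal sketch under assumptions: the two layouts of
struct foo are renamed to foo_host and foo_module so that they can coexist
in one translation unit, and a union stands in for the module's object being
handed to the executable's foo_create(). All of these names are hypothetical
and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout compiled into the tarantool executable (hypothetical name). */
struct foo_host {
    uint64_t bar;
    uint64_t baz;
    struct foo_host *next;
};

/* Layout the module was compiled against: 'bar' was removed. */
struct foo_module {
    uint64_t baz;
    struct foo_module *next;
};

/* The same memory as seen by the executable and by the module. */
union foo_view {
    struct foo_host host;
    struct foo_module module;
};

/* foo_create() as built into the executable. */
void
foo_create_host(struct foo_host *foo, struct foo_host *next)
{
    foo->bar = 0;
    foo->baz = 0;
    foo->next = next;
}

/*
 * The module passes a non-NULL 'next', but the executable's code stores
 * 'baz = 0' at the offset where the module keeps 'next'.
 * Returns 1 if the module ends up observing next == NULL.
 */
int
module_sees_null_next(void)
{
    union foo_view view;
    struct foo_host other;

    memset(&view, 0xAB, sizeof(view));
    foo_create_host(&view.host, &other);
    return view.module.next == NULL;
}

/*
 * Returns 1 if the executable's store to 'next' ends beyond the end of
 * the object the module actually allocated.
 */
int
host_writes_out_of_bounds(void)
{
    size_t end = offsetof(struct foo_host, next) + sizeof(struct foo_host *);
    return end > sizeof(struct foo_module);
}
```

Both functions are expected to return 1 on common ABIs where a null pointer
is all-zero bits; in a real module there is no union, so the second store
corrupts whatever memory follows the module's object.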
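
The idea of padding at the end of a public structure, mentioned among the
questions and in "Future updates", could look roughly like the sketch below.
It is an assumption-laden illustration, not a proposal for module.h: the
names pub_opts_v1, pub_opts_v2 and PUB_DEFAULT_MAX_SIZE are hypothetical,
and the convention "reserved fields must be zeroed by the caller, zero means
use the default" is one possible rule among others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PUB_DEFAULT_MAX_SIZE UINT64_C(4096)

/* Version 1 of a hypothetical public structure with a reserved tail. */
struct pub_opts_v1 {
    uint32_t flags;
    uint32_t timeout_ms;
    uint64_t reserved[6]; /* must be zeroed by the caller */
};

/*
 * Version 2 carves a new field out of the reserved area. The size and
 * the offsets of the old fields stay the same.
 */
struct pub_opts_v2 {
    uint32_t flags;
    uint32_t timeout_ms;
    uint64_t max_size;    /* new in v2; 0 means "use the default" */
    uint64_t reserved[5];
};

/* Compile-time checks of the rules the padding is meant to guarantee. */
_Static_assert(sizeof(struct pub_opts_v1) == sizeof(struct pub_opts_v2),
               "size of a public structure must not change");
_Static_assert(offsetof(struct pub_opts_v1, timeout_ms) ==
               offsetof(struct pub_opts_v2, timeout_ms),
               "offsets of existing fields must not change");
_Static_assert(offsetof(struct pub_opts_v1, reserved) ==
               offsetof(struct pub_opts_v2, max_size),
               "a new field must be placed inside the reserved area");

/* Code built against v2 treats a zeroed field as "not set". */
uint64_t
effective_max_size(const struct pub_opts_v2 *opts)
{
    return opts->max_size != 0 ? opts->max_size : PUB_DEFAULT_MAX_SIZE;
}

/*
 * Simulate a caller compiled against v1 that passes its structure to
 * code compiled against v2. Returns the max_size the new code sees.
 */
uint64_t
max_size_seen_for_v1_caller(void)
{
    struct pub_opts_v1 old_opts;
    struct pub_opts_v2 new_view;

    memset(&old_opts, 0, sizeof(old_opts));
    old_opts.flags = 1;
    old_opts.timeout_ms = 500;
    memcpy(&new_view, &old_opts, sizeof(new_view));
    return effective_max_size(&new_view);
}
```

The cost is visible in the sketch: the reserved area is paid for on every
instance, and the scheme works only if old callers really zero it, which is
a point in favour of opaque structures where direct field access is not
needed.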