From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtpng2.m.smailru.net (smtpng2.m.smailru.net [94.100.179.3])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dev.tarantool.org (Postfix) with ESMTPS id B522A42EF5C;
	Sat, 20 Jun 2020 02:39:29 +0300 (MSK)
References: <20200611002510.35349-1-huston.mavr@gmail.com>
 <7855a532-9877-3fef-4a52-c480b4509e4a@tarantool.org>
 <2e9c5d4a-4af1-ccb5-ef1d-4e245e62b8b7@tarantool.org>
From: Vladislav Shpilevoy
Message-ID: <676d7330-579f-7950-896e-ae2b3bf7df9a@tarantool.org>
Date: Sat, 20 Jun 2020 01:39:27 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [Tarantool-patches] [PATCH] cmake: cleanup src/CMakeLists.txt
List-Id: Tarantool development patches
To: Yaroslav Dynnikov
Cc: tarantool-patches@dev.tarantool.org, Alexander Turenko

Hi! Thanks for the investigation!

On 19/06/2020 15:02, Yaroslav Dynnikov wrote:
> I've researched the linking process and here is what I've found.
>
> First of all, let's talk about symbols removal. Here is an interesting note on
> how it works: https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols
> One of the basic important concepts is an ELF section. It's a chunk of data
> (assembly code in our case) that linker operates on. A section may contain code
> of one or more functions (depending on build arguments), but it can't be split
> during linking.
>
> Usually, when we build an object file with `gcc libx.c -c -o libx.o`, all
> functions from libx.c go into the single ".text" section. This single object
> file is usually archived with others into single libx.a, which is later
> processed during main executable linking.
>
> By default (but I can't provide the proof), when we call `gcc -lx` the linker
> operates on object files from the archive - if at least one symbol is used, the
> whole .o (not .a) is included in the resulting binary. There are also two linker
> options that influence this behavior. First, there is `-Wl,--whole-archive`,
> which makes it include the whole `.a` instead of working at `.o` granularity.
> Secondly, there is `-Wl,--gc-sections`, which could remove unused sections, but
> in the basic example it's useless since all symbols from .o belong to the same
> .text section. To make `--gc-sections` have an effect one should compile object
> files with the `-ffunction-sections` flag. It'll generate a separate section for
> every function so the linker could gc unused ones.
>
> See:
> ```console
> $ cat libx.c
> #include <stdio.h>
>
> void fA() {
>         printf("fA is here\n");
> }
> void fB() {
>         printf("fB is here\n");
> }
> $ gcc libx.c -c -o libx.o -ffunction-sections
> $ readelf -S libx.o | grep .text
>   [ 1] .text             PROGBITS         0000000000000000  00000040
>   [ 5] .text.fA          PROGBITS         0000000000000000  00000056
>   [ 6] .rela.text.fA     RELA             0000000000000000  000002d8
>   [ 7] .text.fB          PROGBITS         0000000000000000  0000006d
>   [ 8] .rela.text.fB     RELA             0000000000000000  00000308
> ```
>
> Now let's move to the `libbit` which Vlad mentioned.
> I've investigated how compiler options influence the resulting binary. Unused
> functions from bit.c are really removed, but only with Release flags, and here
> is why:
>
> There are only 2 functions implemented in bit.c, and both are unused. All the
> others are inlines in bit.h and externs from luajit. When tarantool is built in
> debug mode, the inlining is off, so other modules truly link to the bit.o and
> all symbols remain including unused functions.
> But if we specify the -O2 flag, inlining takes place, and all the symbols from
> bit.o become unused, so the linker drops the whole object file.

So do you mean that if a library consists of more than 1 C file (and builds
more than one .o file), and functions from some of them are not used, these
.o files and their code will be removed?

If it is true, it does not look like a green light for the patch and requires
more experiments. For example, try to add a new library with several C files,
use only some of them in the Tarantool executable (just add to exports.h), and
see if code from unused .o files is removed even when they are built into
.a file.

However, as I said in private - I am ok with pushing this patch now. I just
don't see why it is necessary. Why is it so important to push it? Push just
for push? Just to close an issue? It does not improve anything; EXPORT_LIST
may still turn out to be useful, along with some other things I removed in
2971, when I didn't think of the static build.

> Finally, speaking about this patch, my proposal is to merge this PR as is.
> And since we know how to manage linking, other problems can be solved separately
> (if they ever occur).
>
> Best regards
> Yaroslav Dynnikov