Hi, Vlad.

Sorry for the late answer. I was going to recheck the ideas you proposed, but didn't have time for that.
So my answer is based on my knowledge only.

On Sat, 20 Jun 2020, 02:39 Vladislav Shpilevoy, <v.shpilevoy@tarantool.org> wrote:
Hi! Thanks for the investigation!

On 19/06/2020 15:02, Yaroslav Dynnikov wrote:
> I've researched the linking process and here is what I've found.
>
> First of all, let's talk about symbols removal. Here is an interesting note on
> how it works: https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols
> One of the basic important concepts is an ELF section. It's a chunk of data
> (assembly code in our case) that linker operates on. A section may contain code
> of one or more functions (depending on build arguments), but it can't be split
> during linking.
>
> Usually, when we build an object file with `gcc libx.c -c -o libx.o`, all
> functions from libx.c go into the single ".text" section. This single object
> file is usually archived with others into single libx.a, which is later
> processed during main executable linking.
>
> By default (though I can't provide proof), when we call `gcc -lx` the linker
> operates on object files from the archive: if at least one symbol is used, the
> whole .o (not the whole .a) is included in the resulting binary. There are also
> two linker options that influence this behavior. First, there is
> `-Wl,--whole-archive`, which makes the linker include the whole `.a` instead of
> working at `.o` granularity. Second, there is `-Wl,--gc-sections`, which can
> remove unused sections, but in the basic example it's useless since all symbols
> from the .o belong to the same .text section. For `--gc-sections` to have an
> effect, one should compile object files with the `-ffunction-sections` flag. It
> generates a separate section for every function so the linker can gc unused
> ones.
>
> See:
> ```console
> $ cat libx.c
> #include <stdio.h>
>
> void fA() {
>         printf("fA is here\n");
> }
> void fB() {
>         printf("fB is here\n");
> }
> $ gcc libx.c -c -o libx.o -ffunction-sections
> $ readelf -S libx.o | grep .text
>   [ 1] .text             PROGBITS         0000000000000000  00000040
>   [ 5] .text.fA          PROGBITS         0000000000000000  00000056
>   [ 6] .rela.text.fA     RELA             0000000000000000  000002d8
>   [ 7] .text.fB          PROGBITS         0000000000000000  0000006d
>   [ 8] .rela.text.fB     RELA             0000000000000000  00000308
> ```
>
> Now let's move to the `libbit` which Vlad mentioned.
> I've investigated how compiler options influence the resulting binary. Unused
> functions from bit.c really are removed, but only with Release flags, and here
> is why:
>
> There are only 2 functions implemented in bit.c, and both are unused. All the
> others are inlines in bit.h and externs from luajit. When tarantool is built in
> debug mode, the inlining is off, so other modules truly link to the bit.o and
> all symbols remain including unused functions. But if we specify -O2 flag,
> inlining takes place, and all the symbols from bit.o become unused, so the
> linker drops the whole object file.

So do you mean, that if a library consists of more than 1 C file (and builds
more than one .o file), and functions from some of them are not used, these
.o files and their code will be removed?

Exactly. Moreover, it depends on the gcc command syntax.
Suppose we have libx.a with two object files: fA.o and fB.o. The function from fA.o is used, and the one from fB.o isn't.
Then `gcc main.c libx.a` will produce a binary with both fA and fB,
while `gcc main.c -lx` will include fA only, and the unused fB will be dropped.
It's also mentioned in the SO question I linked in the previous message.


If it is true, it does not look like a green light for the patch and requires
more experiments. For example, try to add a new library with several C files,
use only some of them in the Tarantool executable (just add to exports.h),
and see if code from unused .o files is removed even when they are built into
.a file.

I guess the solution will be adding the --whole-archive flag. But I'm not sure yet.
 

However, as I said in private - I am ok with pushing this patch now. I just
don't see why it is necessary. Why is it so important to push it? Push just
for push? Just to close an issue? It does not improve anything, EXPORT_LIST
still may appear to be useful along with some other things I removed in 2971,
when I didn't think of the static build.

You're right, it's not that important. I've already told Alex Barulev that we'll postpone this
cleanup until we finish with static build refactoring.
 

> Finally, speaking about this patch, my proposal is to merge this PR as is.
> And since we know how to manage linking, other problems can be solved separately
> (if they ever occur).
>
> Best regards
> Yaroslav Dynnikov