[Tarantool-patches] [PATCH] cmake: cleanup src/CMakeLists.txt

Yaroslav Dynnikov yaroslav.dynnikov at tarantool.org
Tue Jun 23 23:31:59 MSK 2020


Hi, Vlad.

Sorry for the late answer. I was going to recheck the ideas you've proposed,
but didn't have time for that, so my answer is based on my knowledge only.

On Sat, 20 Jun 2020, 02:39 Vladislav Shpilevoy, <v.shpilevoy at tarantool.org>
wrote:

> Hi! Thanks for the investigation!
>
> On 19/06/2020 15:02, Yaroslav Dynnikov wrote:
> > I've researched the linking process and here is what I've found.
> >
> > First of all, let's talk about symbols removal. Here is an interesting note
> > on how it works:
> > <https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
> > One of the basic important concepts is an ELF section. It's a chunk of data
> > (machine code in our case) that the linker operates on. A section may
> > contain the code of one or more functions (depending on build arguments),
> > but it can't be split during linking.
> >
> > Usually, when we build an object file with `gcc libx.c -c -o libx.o`, all
> > functions from libx.c go into a single ".text" section. This single object
> > file is usually archived with others into a single libx.a, which is later
> > processed during main executable linking.
> >
> > By default (but I can't provide the proof), when we call `gcc -lx` the
> > linker operates on object files from the archive: if at least one symbol is
> > used, the whole .o (not .a) is included in the resulting binary. There are
> > also two linker options that influence this behavior. First, there is
> > `-Wl,--whole-archive`, which makes it include the whole `.a` instead of
> > working at `.o` granularity. Second, there is `-Wl,--gc-sections`, which
> > can remove unused sections, but in the basic example it's useless since all
> > symbols from a .o belong to the same .text section. To make `--gc-sections`
> > have an effect, one should compile object files with the
> > `-ffunction-sections` flag. It'll generate a separate section for every
> > function so the linker can gc unused ones.
> >
> > See:
> > ```console
> > $ cat libx.c
> > #include <stdio.h>
> >
> > void fA() {
> >         printf("fA is here\n");
> > }
> > void fB() {
> >         printf("fB is here\n");
> > }
> > $ gcc libx.c -c -o libx.o -ffunction-sections
> > $ readelf -S libx.o | grep .text
> >   [ 1] .text             PROGBITS         0000000000000000  00000040
> >   [ 5] .text.fA          PROGBITS         0000000000000000  00000056
> >   [ 6] .rela.text.fA     RELA             0000000000000000  000002d8
> >   [ 7] .text.fB          PROGBITS         0000000000000000  0000006d
> >   [ 8] .rela.text.fB     RELA             0000000000000000  00000308
> > ```
> >
> > Now let's move to the `libbit` which Vlad mentioned.
> > I've investigated how compiler options influence the resulting binary.
> > Unused functions from bit.c are really removed, but only with Release
> > flags, and here is why:
> >
> > There are only 2 functions implemented in bit.c, and both are unused. All
> > the others are inlines in bit.h and externs from luajit. When tarantool is
> > built in debug mode, inlining is off, so other modules truly link to
> > bit.o and all symbols remain, including the unused functions. But if we
> > specify the -O2 flag, inlining takes place and all the symbols from bit.o
> > become unused, so the linker drops the whole object file.
>
> So do you mean, that if a library consists of more than 1 C file (and builds
> more than one .o file), and functions from some of them are not used, these
> .o files and their code will be removed?
>

Exactly. Moreover, it depends on how the object files are passed to gcc.
Suppose we have two object files: fA.o and fB.o. The function from fA is
used, and the one from fB isn't.
Then `gcc main.c fA.o fB.o` will produce a binary with both fA and fB,
while archiving them into libx.a and running `gcc main.c -L. -lx` (or
`gcc main.c libx.a`) will include fA only, and the unused fB will be dropped.
It's also mentioned in the SO question
<https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
I linked in the previous message.


> If it is true, it does not look like green light for the patch and requires
> more experiments. For example, try to add a new library with several C files,
> use only some of them in the Tarantool executable (just add to exports.h),
> and see if code from unused .o files is removed even when they are built into
> a .a file.
>

I guess the solution will be adding the --whole-archive flag. But I'm not sure
yet.


>
> However, as I said in private - I am ok with pushing this patch now. I just
> don't see why it is necessary. Why is it so important to push it? Push just
> for push? Just to close an issue? It does not improve anything, EXPORT_LIST
> still may appear to be useful along with some other things I removed in 2971,
> when I didn't think of the static build.
>

You're right, it's not that important. I've already told Alex Barulev that
we'll postpone this cleanup until we finish with static build refactoring.


>
> > Finally, speaking about this patch, my proposal is to merge this PR as is.
> > And since we know how to manage linking, other problems can be solved
> > separately (if they ever occur).
> >
> > Best regards
> > Yaroslav Dynnikov
>