Tarantool development patches archive
From: Yaroslav Dynnikov <yaroslav.dynnikov@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org,
	Alexander Turenko <alexander.turenko@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] cmake: cleanup src/CMakeLists.txt
Date: Tue, 23 Jun 2020 23:31:59 +0300
Message-ID: <CAK0MaD3-d9D1UP-MZfD-Hb9eUSU4vgcg59jMup6tJOTzcdydPQ@mail.gmail.com>
In-Reply-To: <676d7330-579f-7950-896e-ae2b3bf7df9a@tarantool.org>

Hi, Vlad.

Sorry for the late answer. I was going to recheck ideas you've proposed,
but didn't have time for that.
So my answer is based on my knowledge only.

On Sat, 20 Jun 2020, 02:39 Vladislav Shpilevoy, <v.shpilevoy@tarantool.org>
wrote:

> Hi! Thanks for the investigation!
>
> On 19/06/2020 15:02, Yaroslav Dynnikov wrote:
> > I've researched the linking process and here is what I've found.
> >
> > First of all, let's talk about symbol removal. Here is an interesting note on
> > how it works:
> > <https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
> > One of the basic important concepts is an ELF section. It's a chunk of data
> > (machine code in our case) that the linker operates on. A section may contain code
> > of one or more functions (depending on build arguments), but it can't be split
> > during linking.
> >
> > Usually, when we build an object file with `gcc libx.c -c -o libx.o`, all
> > functions from libx.c go into the single ".text" section. This single object
> > file is usually archived with others into a single libx.a, which is later
> > processed during main executable linking.
> >
> > By default (but I can't provide the proof), when we call `gcc -lx` the linker
> > operates on object files from the archive - if at least one symbol is used,
> > the whole .o (not .a) is included in the resulting binary. There are also two
> > linker options that influence this behavior. First, there is
> > `-Wl,--whole-archive`, which makes it include the whole `.a` instead of working
> > at `.o` granularity. Second, there is `-Wl,--gc-sections`, which could remove
> > unused sections, but in the basic example it's useless since all symbols from
> > the .o belong to the same .text section. To make `--gc-sections` have an effect
> > one should compile object files with the `-ffunction-sections` flag. It'll
> > generate a separate section for every function so the linker could gc unused
> > ones.
> >
> > See:
> > ```console
> > $ cat libx.c
> > #include <stdio.h>
> >
> > void fA() {
> >         printf("fA is here\n");
> > }
> > void fB() {
> >         printf("fB is here\n");
> > }
> > $ gcc libx.c -c -o libx.o -ffunction-sections
> > $ readelf -S libx.o | grep .text
> >   [ 1] .text             PROGBITS         0000000000000000  00000040
> >   [ 5] .text.fA          PROGBITS         0000000000000000  00000056
> >   [ 6] .rela.text.fA     RELA             0000000000000000  000002d8
> >   [ 7] .text.fB          PROGBITS         0000000000000000  0000006d
> >   [ 8] .rela.text.fB     RELA             0000000000000000  00000308
> > ```
> >
> > Now let's move to the `libbit` which Vlad mentioned.
> > I've investigated how compiler options influence the resulting binary. Unused
> > functions from bit.c are really removed, but only with Release flags, and here
> > is why:
> >
> > There are only 2 functions implemented in bit.c, and both are unused. All the
> > others are inlines in bit.h and externs from luajit. When tarantool is built in
> > debug mode, the inlining is off, so other modules truly link to the bit.o and
> > all symbols remain including unused functions. But if we specify the -O2 flag,
> > inlining takes place, and all the symbols from bit.o become unused, so the
> > linker drops the whole object file.
>
> So do you mean, that if a library consists of more than 1 C file (and builds
> more than one .o file), and functions from some of them are not used, these
> .o files and their code will be removed?
>

Exactly. Moreover, it depends on how the code is passed to the linker.
Suppose we have libx.a with two object files: fA.o and fB.o. The function from
fA is used, and the one from fB isn't.
Then `gcc main.c fA.o fB.o` will produce a binary with both fA and fB, because
object files named explicitly are always linked in whole.
While `gcc main.c -lx` (or `gcc main.c libx.a`) will include fA only, and the
unused fB.o will be dropped.
It's also mentioned in the SO question
<https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
I linked in the previous message.


> If it is true, it does not look like green light for the patch and requires
> more experiments. For example, try to add a new library with several C files,
> use only some of them in the Tarantool executable (just add to exports.h),
> and see if code from unused .o files is removed even when they are built into
> .a file.
>

I guess the solution will be adding the --whole-archive flag. But I'm not sure
yet.


>
> However, as I said in private - I am ok with pushing this patch now. I just
> don't see why it is necessary. Why is it so important to push it? Push just
> for push? Just to close an issue? It does not improve anything, EXPORT_LIST
> still may appear to be useful along with some other things I removed in 2971,
> when I didn't think of the static build.
>

You're right, it's not that important. I've already told Alex Barulev that
we'll postpone this cleanup until we finish with static build refactoring.


>
> > Finally, speaking about this patch, my proposal is to merge this PR as is.
> > And since we know how to manage linking, other problems can be solved
> > separately (if they ever occur).
> >
> > Best regards
> > Yaroslav Dynnikov
>
