From: Yaroslav Dynnikov <yaroslav.dynnikov@tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tarantool-patches@dev.tarantool.org, Alexander Turenko <alexander.turenko@tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] cmake: cleanup src/CMakeLists.txt
Date: Tue, 23 Jun 2020 23:31:59 +0300
Message-ID: <CAK0MaD3-d9D1UP-MZfD-Hb9eUSU4vgcg59jMup6tJOTzcdydPQ@mail.gmail.com>
In-Reply-To: <676d7330-579f-7950-896e-ae2b3bf7df9a@tarantool.org>

Hi, Vlad.

Sorry for the late answer. I was going to recheck ideas you've proposed,
but didn't have time for that. So my answer is based on my knowledge only.

On Sat, 20 Jun 2020, 02:39 Vladislav Shpilevoy, <v.shpilevoy@tarantool.org> wrote:

> Hi! Thanks for the investigation!
>
> On 19/06/2020 15:02, Yaroslav Dynnikov wrote:
> > I've researched the linking process and here is what I've found.
> >
> > First of all, let's talk about symbols removal. Here is an interesting note on
> > how it works:
> > <https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
> >
> > One of the basic important concepts is an ELF section. It's a chunk of data
> > (assembly code in our case) that linker operates on. A section may contain code
> > of one or more functions (depending on build arguments), but it can't be split
> > during linking.
> >
> > Usually, when we build an object file with `gcc libx.c -c -o libx.o`, all
> > functions from libx.c go into the single ".text" section. This single object
> > file is usually archived with others into single libx.a, which is later
> > processed during main executable linking.
> >
> > By default (but I can't provide the proof), when we call `gcc -lx` the linker
> > operates on object files from the archive - if at least one symbol is used,
> > the whole .o (not .a) is included in the resulting binary. There are also two
> > linker options that influence this behavior. First, there is
> > `-Wl,--whole-archive`, which makes it include the whole `.a` instead of working
> > at `.o` granularity. Secondly, there is `-Wl,--gc-sections`, which could remove
> > unused sections, but in a basic example it's useless since all symbols from
> > the .o belong to the same .text section. To make `--gc-sections` have an effect
> > one should compile object files with the `-ffunction-sections` flag. It'll
> > generate a separate section for every function so the linker could gc unused
> > ones.
> >
> > See:
> > ```console
> > $ cat libx.c
> > #include <stdio.h>
> >
> > void fA() {
> >     printf("fA is here\n");
> > }
> > void fB() {
> >     printf("fB is here\n");
> > }
> > $ gcc libx.c -c -o libx.o -ffunction-sections
> > $ readelf -S libx.o | grep .text
> >   [ 1] .text             PROGBITS  0000000000000000  00000040
> >   [ 5] .text.fA          PROGBITS  0000000000000000  00000056
> >   [ 6] .rela.text.fA     RELA      0000000000000000  000002d8
> >   [ 7] .text.fB          PROGBITS  0000000000000000  0000006d
> >   [ 8] .rela.text.fB     RELA      0000000000000000  00000308
> > ```
> >
> > Now let's move to the `libbit` which Vlad mentioned.
> > I've investigated how compiler options influence the resulting binary. Unused
> > functions from bit.c are really removed, but only with Release flags, and here
> > is why:
> >
> > There are only 2 functions implemented in bit.c, and both are unused. All the
> > others are inlines in bit.h and externs from luajit. When tarantool is built in
> > debug mode, the inlining is off, so other modules truly link to the bit.o and
> > all symbols remain including unused functions.
> > But if we specify the -O2 flag, inlining takes place, and all the symbols from
> > bit.o become unused, so the linker drops the whole object file.
>
> So do you mean, that if a library consists of more than 1 C file (and builds
> more than one .o file), and functions from some of them are not used, these
> .o files and their code will be removed?

Exactly. Moreover, it depends on the gcc command syntax. Suppose we have
libx.a with two object files: fA.o and fB.o. A function from fA is used,
and fB isn't. Then `gcc main.c libx.a` will produce a binary with both fA
and fB, while `gcc main.c -lx` will include fA only, and the unused fB will
be dropped. It's also mentioned in the SO question
<https://stackoverflow.com/questions/55130965/when-and-why-would-the-c-linker-exclude-unused-symbols>
I linked in the previous message.

> If it is true, it does not look like green light for the patch and requires
> more experiments. For example, try to add a new library with several C files,
> use only some of them in the Tarantool executable (just add to exports.h),
> and see if code from unused .o files is removed even when they are built into
> .a file.

I guess the solution will be adding the --whole-archive flag. But I'm not
sure yet.

> However, as I said in private - I am ok with pushing this patch now. I just
> don't see why is it necessary. Why is it so important to push it? Push just
> for push? Just to close an issue? It does not improve anything, EXPORT_LIST
> still may appear to be useful along with some other things I removed in 2971,
> when I didn't think of the static build.

You're right, it's not that important. I've already told Alex Barulev that
we'll postpone this cleanup until we finish with static build refactoring.

> > Finally, speaking about this patch, my proposal is to merge this PR as is.
> > And since we know how to manage linking, other problems can be solved
> > separately (if they ever occur).
> >
> > Best regards
> > Yaroslav Dynnikov