From: Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] say: fix format for fiber()->fid
Date: Wed, 24 Feb 2021 11:28:55 +0300 [thread overview]
Message-ID: <YDYOR2UYsVki0Nfh@grain> (raw)
In-Reply-To: <0482394d-1600-45ca-8b0e-d230d4dfa8fe@tarantool.org>
On Tue, Feb 23, 2021 at 11:47:44PM +0100, Vladislav Shpilevoy wrote:
> >>> diff --git a/src/lib/core/say.c b/src/lib/core/say.c
> >>> index cbd10e107..6138752a7 100644
> >>> --- a/src/lib/core/say.c
> >>> +++ b/src/lib/core/say.c
> >>> @@ -792,7 +792,7 @@ say_format_plain_tail(char *buf, int len, int level, const char *filename,
> >>> if (cord) {
> >>> SNPRINT(total, snprintf, buf, len, " %s", cord->name);
> >>> if (fiber() && fiber()->fid != FIBER_ID_SCHED) {
> >>> - SNPRINT(total, snprintf, buf, len, "/%i/%s",
> >>> + SNPRINT(total, snprintf, buf, len, "/%u/%s",
> >>> fiber()->fid, fiber_name(fiber()));
> >>
> >> I remember we had some issues about %u not being compatible with uint32_t so
> >> we did manual casts to 'unsigned'. Not sure though what was the commit/ticket,
> >> or if it really happened.
> >
> > I suspect you rather mean uint64_t, which indeed better convert to
> > excplicit "long long".
>
> No, I mean exactly uint32_t. You don't have a guarantee that it is
> the same as 'unsigned' AFAIK.
It is, on x86. And I really doubt if we ever meet an arch where int
is not 4 bytes long. Though in embedded world this might be a common
situation (but if we aim ourself to run tarantool here the output
issue gonna be a least problem).
>
> > Strictly speaking if we really need to cover
> > compatibility we would _had_ to use PRIx macros which are so ugly :/
>
> This is why I want us to use casts to 'unsigned' when we use %u. They
> are less ugly.
We will have to cover a bunch of already existing code for nothing :(
But OK, I'll add explicit cast.
>
> > So no, uint32_t is known to be safe with %u format.
>
> Is it written in the standard somewhere? I honestly don't know.
Standart for builtin types doesn't cover the size of machine words
as far as I remember because this would tie a hands for compiler developers
which have to deal with custom archs. It specifies the integer classes
covering and on x86 we know that we're safe. But as I said I'll add
explicit cast.
> > Side note: I found that we still use %u for some of uint64_t,
> > in particuar
> >
> > if (req->term < raft->volatile_term) {
> > say_info("RAFT: the message is ignored due to outdated term - "
> > "current term is %u", raft->volatile_term);
> > return 0;
> > }
> >
> > probably should use %llu and explicit "long long" for long standing
> > servers where raft term would overflow 32 bit value.
>
> Yes, looks like a bug.
Should not be critical, though. It is logging for debug mostly but we
should fix it.
> But a more appropriate fix would be to investigate why don't we have
> compile time checks for this. The bug in raft's say_info() is clearly
> an error and should have been alarmed at compilation.
We have kind of check
/** Internal function used to implement say() macros */
CFORMAT(printf, 5, 0) extern sayfunc_t _say;
still I must confess I don't know yet why it didn't trigger, need
to investigate.
> At least clang supports an attribute which you can set for a function
> and get printf-like checks and warnings in case of format issues.
>
> We should adopt this: https://clang.llvm.org/docs/AttributeReference.html#id278
>
> Somewhy I thought we already use it, but now I don't see any usages.
> It is easy to add, and helps to catch some weird stuff.
>
> Lets do it in scope of this patch series to fix the issue entirely. Not
> just in one exploded place.
OK
>
> >> I am fine with not using the cast as long as all platforms and compilers
> >> build flawlessly. Also it is ok for me if you will add an explicit cast in
> >> the next version of the patch. Up to you. The same below.
> >
> > No, Vlad, I think we don't need explicit casts for 32bit values.
>
> Do you have a proof for this? That it is safe. Where in the standard do they
> say that '%u' is exactly 4 bytes?
Already explained. I'll make it use explicit cast though I still
think it is redundant.
> > I pushed
> > an update to the branch gorcunov/gh-5846-fid-name and here is a new
> > changelog. Does it look better?
>
> It seems you didn't really read my comment, did you? I said:
>
> "Please, add a changelog file"
>
> I didn't ask for more info in the commit message. This particular
> bug in this particular line was obvious.
>
> But you didn't add a changelog **file**. It should be added here:
> https://github.com/tarantool/tarantool/tree/master/changelogs/unreleased
Ah, you mean this new tecique for changelogs. OK, misread your comment,
at first I thought to update commit message for better export into
changelog. Will do.
>
> After more consideration I also realized we should not use uint32_t
> for fiber ids. It is monotonically growing. Which means it will overflow
> in the foreseeable future on a long-running instance.
- it is explected to be overlowable from the beginning
- the IDs are recycled
I've been talking to Kostya about a year ago once I discovered that
we're walking in cycle when reusing IDs. He assured me that this is
fine and to get a problem we have to allocate 4294967195 fibers at
once which will require ~1.5T of memory just to allocate fiber structure.
Hardy to happen. Still I think we at least should add a painc in case of
hash collision for future.
> For example, assume one fiber is started and recycled each millisecond
> (which is quite a small rate). Then uint32 will overflow in less than
> 50 days. Considering we have instances running for years (at least from
> what I heard from Mons), this is clearly a bug.
>
> On the summary, it seems we need to
>
> - Add compile-time format checks using clang __format__ attribute;
> - Fix all the wrong formats;
+1
> - Use 64bit ids for fibers. Unless I am wrong somewhere in my thoughts
> above.
This is dubious. Personally I would prefer to have them as uint64_t too
now and for all, to simply forget about this potential issue. +4 bytes
to fiber structure seems to be acceptable trade off.
Cyrill
next prev parent reply other threads:[~2021-02-24 8:29 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-22 18:20 Cyrill Gorcunov via Tarantool-patches
2021-02-22 18:23 ` Cyrill Gorcunov via Tarantool-patches
2021-02-23 20:41 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-23 21:49 ` Cyrill Gorcunov via Tarantool-patches
2021-02-23 22:47 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24 8:28 ` Cyrill Gorcunov via Tarantool-patches [this message]
2021-02-24 22:39 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 00/10] say: fix format strings Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 01/10] say: use unsigned format for fiber()->fid Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 02/10] popen: fix say_x format arguments Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:39 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25 7:27 ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 03/10] raft: fix say_x arguments Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:39 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25 7:50 ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 04/10] box/error: fix argument for CustomError Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 05/10] xlog: fix say_x format Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:40 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 06/10] box/vynil: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:43 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-26 8:13 ` Nikita Pettik via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 07/10] txn: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 08/10] limbo: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 09/10] wal: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:41 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25 7:52 ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 10/10] say: fix CFORMAT specification Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:41 ` Vladislav Shpilevoy via Tarantool-patches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YDYOR2UYsVki0Nfh@grain \
--to=tarantool-patches@dev.tarantool.org \
--cc=gorcunov@gmail.com \
--cc=v.shpilevoy@tarantool.org \
--subject='Re: [Tarantool-patches] [PATCH] say: fix format for fiber()->fid' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox