Tarantool development patches archive
 help / color / mirror / Atom feed
From: Cyrill Gorcunov via Tarantool-patches <tarantool-patches@dev.tarantool.org>
To: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
Cc: tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH] say: fix format for fiber()->fid
Date: Wed, 24 Feb 2021 11:28:55 +0300	[thread overview]
Message-ID: <YDYOR2UYsVki0Nfh@grain> (raw)
In-Reply-To: <0482394d-1600-45ca-8b0e-d230d4dfa8fe@tarantool.org>

On Tue, Feb 23, 2021 at 11:47:44PM +0100, Vladislav Shpilevoy wrote:
> >>> diff --git a/src/lib/core/say.c b/src/lib/core/say.c
> >>> index cbd10e107..6138752a7 100644
> >>> --- a/src/lib/core/say.c
> >>> +++ b/src/lib/core/say.c
> >>> @@ -792,7 +792,7 @@ say_format_plain_tail(char *buf, int len, int level, const char *filename,
> >>>  	if (cord) {
> >>>  		SNPRINT(total, snprintf, buf, len, " %s", cord->name);
> >>>  		if (fiber() && fiber()->fid != FIBER_ID_SCHED) {
> >>> -			SNPRINT(total, snprintf, buf, len, "/%i/%s",
> >>> +			SNPRINT(total, snprintf, buf, len, "/%u/%s",
> >>>  				fiber()->fid, fiber_name(fiber()));
> >>
> >> I remember we had some issues about %u not being compatible with uint32_t so
> >> we did manual casts to 'unsigned'. Not sure though what was the commit/ticket,
> >> or if it really happened.
> > 
> > I suspect you rather mean uint64_t, which indeed better convert to
> > excplicit "long long".
> 
> No, I mean exactly uint32_t. You don't have a guarantee that it is
> the same as 'unsigned' AFAIK.

It is, on x86. And I really doubt if we ever meet an arch where int
is not 4 bytes long. Though in embedded world this might be a common
situation (but if we aim ourself to run tarantool here the output
issue gonna be a least problem).

> 
> > Strictly speaking if we really need to cover
> > compatibility we would _had_ to use PRIx macros which are so ugly :/
> 
> This is why I want us to use casts to 'unsigned' when we use %u. They
> are less ugly.

We will have to cover a bunch of already existing code for nothing :(
But OK, I'll add explicit cast.

> 
> > So no, uint32_t is known to be safe with %u format.
> 
> Is it written in the standard somewhere? I honestly don't know.

Standart for builtin types doesn't cover the size of machine words
as far as I remember because this would tie a hands for compiler developers
which have to deal with custom archs. It specifies the integer classes
covering and on x86 we know that we're safe. But as I said I'll add
explicit cast.

> > Side note: I found that we still use %u for some of uint64_t,
> > in particuar
> > 
> > 	if (req->term < raft->volatile_term) {
> > 		say_info("RAFT: the message is ignored due to outdated term - "
> > 			 "current term is %u", raft->volatile_term);
> > 		return 0;
> > 	}
> > 
> > probably should use %llu and explicit "long long" for long standing
> > servers where raft term would overflow 32 bit value.
> 
> Yes, looks like a bug.

Should not be critical, though. It is logging for debug mostly but we
should fix it.

> But a more appropriate fix would be to investigate why don't we have
> compile time checks for this. The bug in raft's say_info() is clearly
> an error and should have been alarmed at compilation.

We have kind of check

/** Internal function used to implement say() macros */
CFORMAT(printf, 5, 0) extern sayfunc_t _say;

still I must confess I don't know yet why it didn't trigger, need
to investigate.

> At least clang supports an attribute which you can set for a function
> and get printf-like checks and warnings in case of format issues.
> 
> We should adopt this: https://clang.llvm.org/docs/AttributeReference.html#id278
> 
> Somewhy I thought we already use it, but now I don't see any usages.
> It is easy to add, and helps to catch some weird stuff.
> 
> Lets do it in scope of this patch series to fix the issue entirely. Not
> just in one exploded place.

OK

> 
> >> I am fine with not using the cast as long as all platforms and compilers
> >> build flawlessly. Also it is ok for me if you will add an explicit cast in
> >> the next version of the patch. Up to you. The same below.
> > 
> > No, Vlad, I think we don't need explicit casts for 32bit values.
> 
> Do you have a proof for this? That it is safe. Where in the standard do they
> say that '%u' is exactly 4 bytes?

Already explained. I'll make it use explicit cast though I still
think it is redundant.

> > I pushed
> > an update to the branch gorcunov/gh-5846-fid-name and here is a new
> > changelog. Does it look better?
> 
> It seems you didn't really read my comment, did you? I said:
> 
> 	"Please, add a changelog file"
> 
> I didn't ask for more info in the commit message. This particular
> bug in this particular line was obvious.
> 
> But you didn't add a changelog **file**. It should be added here:
> https://github.com/tarantool/tarantool/tree/master/changelogs/unreleased

Ah, you mean this new tecique for changelogs. OK, misread your comment,
at first I thought to update commit message for better export into
changelog. Will do.

> 
> After more consideration I also realized we should not use uint32_t
> for fiber ids. It is monotonically growing. Which means it will overflow
> in the foreseeable future on a long-running instance.

 - it is explected to be overlowable from the beginning
 - the IDs are recycled

I've been talking to Kostya about a year ago once I discovered that
we're walking in cycle when reusing IDs. He assured me that this is
fine and to get a problem we have to allocate 4294967195 fibers at
once which will require ~1.5T of memory just to allocate fiber structure.
Hardy to happen. Still I think we at least should add a painc in case of
hash collision for future.

> For example, assume one fiber is started and recycled each millisecond
> (which is quite a small rate). Then uint32 will overflow in less than
> 50 days. Considering we have instances running for years (at least from
> what I heard from Mons), this is clearly a bug.
> 
> On the summary, it seems we need to
> 
> - Add compile-time format checks using clang __format__ attribute;
> - Fix all the wrong formats;

+1

> - Use 64bit ids for fibers. Unless I am wrong somewhere in my thoughts
>   above.

This is dubious. Personally I would prefer to have them as uint64_t too
now and for all, to simply forget about this potential issue. +4 bytes
to fiber structure seems to be acceptable trade off.

	Cyrill

  reply	other threads:[~2021-02-24  8:29 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 18:20 Cyrill Gorcunov via Tarantool-patches
2021-02-22 18:23 ` Cyrill Gorcunov via Tarantool-patches
2021-02-23 20:41 ` Vladislav Shpilevoy via Tarantool-patches
2021-02-23 21:49   ` Cyrill Gorcunov via Tarantool-patches
2021-02-23 22:47     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24  8:28       ` Cyrill Gorcunov via Tarantool-patches [this message]
2021-02-24 22:39         ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24 15:36 ` [Tarantool-patches] [PATCH v2 00/10] say: fix format strings Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 01/10] say: use unsigned format for fiber()->fid Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 02/10] popen: fix say_x format arguments Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:39     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25  7:27       ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 03/10] raft: fix say_x arguments Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:39     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25  7:50       ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 04/10] box/error: fix argument for CustomError Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 05/10] xlog: fix say_x format Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:40     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 06/10] box/vynil: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:43     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-26  8:13       ` Nikita Pettik via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 07/10] txn: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 08/10] limbo: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 09/10] wal: " Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:41     ` Vladislav Shpilevoy via Tarantool-patches
2021-02-25  7:52       ` Cyrill Gorcunov via Tarantool-patches
2021-02-24 15:36   ` [Tarantool-patches] [PATCH v2 10/10] say: fix CFORMAT specification Cyrill Gorcunov via Tarantool-patches
2021-02-24 22:41     ` Vladislav Shpilevoy via Tarantool-patches

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YDYOR2UYsVki0Nfh@grain \
    --to=tarantool-patches@dev.tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=v.shpilevoy@tarantool.org \
    --subject='Re: [Tarantool-patches] [PATCH] say: fix format for fiber()->fid' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox