From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from [87.239.111.99] (localhost [127.0.0.1]) by dev.tarantool.org (Postfix) with ESMTP id DFB1D70202; Wed, 24 Feb 2021 11:29:00 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org DFB1D70202 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tarantool.org; s=dev; t=1614155341; bh=QK1+VYFpgwYGn41u/vzgpjwNJwoWvu/zJ2krkv6qTPg=; h=Date:To:References:In-Reply-To:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=NvOCq7kALFslWWwlUo3/e8cpfjUk3b1Y/M5yov/e0DkXjtXSvBanVdEW8XPtD9vNw L44CFkt4qx3HjDRwrqoh+RXWZunorEux78NJm4F1h3vxKFBsQNeYifcVJeOYUc5VTq viU0a9N5XRmu4b2oSWYfJG9x5TxX6dt19QDeHSAg= Received: from mail-lj1-f179.google.com (mail-lj1-f179.google.com [209.85.208.179]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by dev.tarantool.org (Postfix) with ESMTPS id 26F9B70202 for ; Wed, 24 Feb 2021 11:28:59 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 dev.tarantool.org 26F9B70202 Received: by mail-lj1-f179.google.com with SMTP id q14so1439407ljp.4 for ; Wed, 24 Feb 2021 00:28:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=yfjEKls4VFcUsn9n0tZ95GgZNijfa6AZXKsAuuzllJA=; b=bP/ukTHBesugYD0mssYYMvvPHEPZabmY+ulHvM5LQ9eVgpPgentsc4AqBLGEnbnDEG 5xINlhkbZOUA0218++XJN+r1HKzzJi4KwgN1OfH8u1l2oRe1r3jY8D6kEgku5LP8GNjO rx/yHw4dsjXhGn8+uYNKd/YgEeg/zD5fMADWE1s6Ri3MFqAkV1FhwBm98nDze71mLTb6 WFltVOj94qht+R9dknla77BOGaDJ0oy8jW5FJePgy7VkXrZzZM1a9XlfpNa/YkRNZXNy daFABkVcgWXnSorulSnUzBaLETRb2WK5znJRHo49iYTXQme8lSP/yLDzZFj3M3ZUBfDa cKkg== X-Gm-Message-State: AOAM532hokuwzCt744mYbCKc3wSS5pNWomNkWciI8ygHfzDUyKAdbHDP 1DUxuis6USav+qt3fZnE2Su+zEXoFoExTA== X-Google-Smtp-Source: ABdhPJwTVNHjJ7oAuau6geJXjGKD51sHXxmn+DHjYpuXXOmFriZI43vJYzoeembzk0XFI8Gyt0QWgA== X-Received: by 2002:a2e:a550:: with SMTP id e16mr19799967ljn.197.1614155337782; Wed, 24 Feb 2021 00:28:57 -0800 (PST) Received: from grain.localdomain ([5.18.171.94]) by smtp.gmail.com with ESMTPSA id f26sm330095lfc.195.2021.02.24.00.28.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Feb 2021 00:28:56 -0800 (PST) Received: by grain.localdomain (Postfix, from userid 1000) id 81FAA56009C; Wed, 24 Feb 2021 11:28:55 +0300 (MSK) Date: Wed, 24 Feb 2021 11:28:55 +0300 To: Vladislav Shpilevoy Message-ID: References: <20210222182030.76232-1-gorcunov@gmail.com> <41811c65-448c-0de6-68cd-6a64779ca283@tarantool.org> <0482394d-1600-45ca-8b0e-d230d4dfa8fe@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0482394d-1600-45ca-8b0e-d230d4dfa8fe@tarantool.org> User-Agent: Mutt/2.0.5 (2021-01-21) Subject: Re: [Tarantool-patches] [PATCH] say: fix format for fiber()->fid X-BeenThere: tarantool-patches@dev.tarantool.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Tarantool development patches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Cyrill Gorcunov via Tarantool-patches Reply-To: Cyrill Gorcunov Cc: tml Errors-To: tarantool-patches-bounces@dev.tarantool.org Sender: "Tarantool-patches" On Tue, Feb 23, 2021 at 11:47:44PM +0100, Vladislav Shpilevoy wrote: > >>> diff --git a/src/lib/core/say.c b/src/lib/core/say.c > >>> index cbd10e107..6138752a7 100644 > >>> --- a/src/lib/core/say.c > >>> +++ b/src/lib/core/say.c > >>> @@ -792,7 +792,7 @@ say_format_plain_tail(char *buf, int len, int level, const char *filename, > >>> if (cord) { > >>> SNPRINT(total, snprintf, buf, len, " %s", cord->name); > >>> if (fiber() && fiber()->fid != FIBER_ID_SCHED) { > >>> - SNPRINT(total, snprintf, buf, len, "/%i/%s", > >>> + SNPRINT(total, snprintf, buf, len, "/%u/%s", > >>> fiber()->fid, fiber_name(fiber())); > >> > >> I remember we had some issues about %u not being compatible with uint32_t so > >> we did manual casts to 'unsigned'. Not sure though what was the commit/ticket, > >> or if it really happened. > > > > I suspect you rather mean uint64_t, which indeed better convert to > > excplicit "long long". > > No, I mean exactly uint32_t. You don't have a guarantee that it is > the same as 'unsigned' AFAIK. It is, on x86. And I really doubt if we ever meet an arch where int is not 4 bytes long. Though in embedded world this might be a common situation (but if we aim ourself to run tarantool here the output issue gonna be a least problem). > > > Strictly speaking if we really need to cover > > compatibility we would _had_ to use PRIx macros which are so ugly :/ > > This is why I want us to use casts to 'unsigned' when we use %u. They > are less ugly. We will have to cover a bunch of already existing code for nothing :( But OK, I'll add explicit cast. > > > So no, uint32_t is known to be safe with %u format. > > Is it written in the standard somewhere? I honestly don't know. Standart for builtin types doesn't cover the size of machine words as far as I remember because this would tie a hands for compiler developers which have to deal with custom archs. It specifies the integer classes covering and on x86 we know that we're safe. But as I said I'll add explicit cast. > > Side note: I found that we still use %u for some of uint64_t, > > in particuar > > > > if (req->term < raft->volatile_term) { > > say_info("RAFT: the message is ignored due to outdated term - " > > "current term is %u", raft->volatile_term); > > return 0; > > } > > > > probably should use %llu and explicit "long long" for long standing > > servers where raft term would overflow 32 bit value. > > Yes, looks like a bug. Should not be critical, though. It is logging for debug mostly but we should fix it. > But a more appropriate fix would be to investigate why don't we have > compile time checks for this. The bug in raft's say_info() is clearly > an error and should have been alarmed at compilation. We have kind of check /** Internal function used to implement say() macros */ CFORMAT(printf, 5, 0) extern sayfunc_t _say; still I must confess I don't know yet why it didn't trigger, need to investigate. > At least clang supports an attribute which you can set for a function > and get printf-like checks and warnings in case of format issues. > > We should adopt this: https://clang.llvm.org/docs/AttributeReference.html#id278 > > Somewhy I thought we already use it, but now I don't see any usages. > It is easy to add, and helps to catch some weird stuff. > > Lets do it in scope of this patch series to fix the issue entirely. Not > just in one exploded place. OK > > >> I am fine with not using the cast as long as all platforms and compilers > >> build flawlessly. Also it is ok for me if you will add an explicit cast in > >> the next version of the patch. Up to you. The same below. > > > > No, Vlad, I think we don't need explicit casts for 32bit values. > > Do you have a proof for this? That it is safe. Where in the standard do they > say that '%u' is exactly 4 bytes? Already explained. I'll make it use explicit cast though I still think it is redundant. > > I pushed > > an update to the branch gorcunov/gh-5846-fid-name and here is a new > > changelog. Does it look better? > > It seems you didn't really read my comment, did you? I said: > > "Please, add a changelog file" > > I didn't ask for more info in the commit message. This particular > bug in this particular line was obvious. > > But you didn't add a changelog **file**. It should be added here: > https://github.com/tarantool/tarantool/tree/master/changelogs/unreleased Ah, you mean this new tecique for changelogs. OK, misread your comment, at first I thought to update commit message for better export into changelog. Will do. > > After more consideration I also realized we should not use uint32_t > for fiber ids. It is monotonically growing. Which means it will overflow > in the foreseeable future on a long-running instance. - it is explected to be overlowable from the beginning - the IDs are recycled I've been talking to Kostya about a year ago once I discovered that we're walking in cycle when reusing IDs. He assured me that this is fine and to get a problem we have to allocate 4294967195 fibers at once which will require ~1.5T of memory just to allocate fiber structure. Hardy to happen. Still I think we at least should add a painc in case of hash collision for future. > For example, assume one fiber is started and recycled each millisecond > (which is quite a small rate). Then uint32 will overflow in less than > 50 days. Considering we have instances running for years (at least from > what I heard from Mons), this is clearly a bug. > > On the summary, it seems we need to > > - Add compile-time format checks using clang __format__ attribute; > - Fix all the wrong formats; +1 > - Use 64bit ids for fibers. Unless I am wrong somewhere in my thoughts > above. This is dubious. Personally I would prefer to have them as uint64_t too now and for all, to simply forget about this potential issue. +4 bytes to fiber structure seems to be acceptable trade off. Cyrill