Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladislav Shpilevoy <v.shpilevoy@tarantool.org>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Mons Anderson <v.perepelitsa@corp.mail.ru>,
	tml <tarantool-patches@dev.tarantool.org>
Subject: Re: [Tarantool-patches] [PATCH v3 2/4] errstat: add crash report base code
Date: Sun, 6 Dec 2020 16:50:19 +0100	[thread overview]
Message-ID: <7179689d-f27f-3665-70a8-1e5cd60757b1@tarantool.org> (raw)
In-Reply-To: <20201205195827.GC2303@grain>

On 05.12.2020 20:58, Cyrill Gorcunov wrote:
> On Sat, Dec 05, 2020 at 07:33:05PM +0100, Vladislav Shpilevoy wrote:
>>>  |   "build": {
>>>  |     "major": 2,
>>>  |     "minor": 7,
>>>  |     "patch": 0,
>>
>> 1. Why do you need the version numbers and the 'version'
>> string both? All can be extracted from 'version' string.
> 
> Sure we can extract it but I thought this gonna be more
> convenient 'cause it won't require the reader to split
> string, extract the major/minor. This is not a problem
> to drop these fields and leave "version" only if you prefer.
> I don't mind.

Currently the reader must concatenate these 3 lines into 1,
which is already present in a next field.

Also this will be handled by the feedback server, not by people.
The people will see some digested form of all of this.

>>>  |   },
>>>  |   "signal": {
>>>  |     "signo": 11,
>>>  |     "si_code": 0,
>>>  |     "si_addr": "0x3e800009727",
>>>  |     "backtrace": "IzAgIDB4NjMyODhiIGlu..",
>>>  |     "timestamp": "0x164ceceac0610ae9",
>>
>> 2. Why is timestamp in hex? And why do you need both
>> hex and string?
> 
> hex is simply more short form for timestamp, I can put
> plain integer number if you prefer.

I don't understand why do we need both hex and string. The
string already contains the timestamp + zone, which can be
parsed on the server side into a number if necessary.

> as to time string -- the prblem is in local timezone,
> the timestamp itself doesn't show which tz it is. thus
> we either have to send tz explicitly or a time string.
> 
> if you mean to simply drop timestamp and use timestring
> only I'm ok with this..

Better drop the hex, I think. The number can be calculated
from the string, and it does not improve the readability
anyhow. Which is not necessary anyway, because it will be
parsed by a server, and can be converted to any form there.

>>>  |     "timestamp_str": "2020-12-02 17:34:20 MSK"
>>>  |   }
>>>  | }
>>>
>>> +static struct errstat glob_errstat;
>>> +static bool cfg_send_crash = false;
>>
>> 5. Why do you have a global object for this module, and yet
>> this part of it is separated into another global variable?
> 
> Good point. Actually I introduced this cfg_send_crash a way
> later than glob_errstat. Indeed we can keep this settings inside
> errstat.
> 
> Also let me explain why this structure called errstat - it
> keeps not just report to send but also registers (we use it
> to print to a console), backtrace, report.

But it is also not errors and not statistics.

> And we're planning
> to gather diag_set() calls here as well and provide some
> box.info interface for last say 10 errors.

Why here? Normal errors and their statistics has nothing to
do with crashes. Only common property of them is that they
are not normal, usually.

>>> +static unsigned char *
>>> +base64_encode(unsigned char *dst, unsigned char *src, size_t src_len)
>>> +{
>>
>> 6. Why can't you use the existing base64_encode() function from
>> base64.h? It is not different from your version. You both use global
>> variables (static also is global, just not visible everywhere). So
>> your version is not 'safer' in terms of being called from a signal
>> handler.
> 
> This library is not linked into src/lib/core by default I can add it
> actually but since it is third_party library I can't be sure that
> it won't be changed someday leading to unpredicted results.

It is in third_party folder, but it is not a submodule. It is our code,
which we can decide when and how to change.

Talking of third_party, I don't know why some of the code is stored
here. I think it is supposed to store submodules which we can only update
from the upstream sometimes, but we also store files here which should
belong to src/lib, with our own code.

With that said, treat base64 as our own library. You can use it anywhere
and not be afraid that it will be changed or something.

> As you
> might note I try to use local functions as much as I can and I think
> we need to implement own strlen/snprintf and etc here as well because
> when we're inside sigsegv some state of libc/third_party components
> might be broken.
> 
> If you think it is safe to reuse base64 here I'll link this library in.

Yes, it is safe. At least not less safe than the inlined version here.

>>> +static void
>>> +prepare_report_script(void)
>>> +{
>>> +	struct errstat_crash *cinfo = &errstat_get()->crash_info;
>>> +	struct errstat_build *binfo = &errstat_get()->build_info;
>>> +	struct errstat_uname *uinfo = &errstat_get()->uname_info;
>>> +
>>> +	char *p, *e;
>>> +
>>> +#ifdef ENABLE_BACKTRACE
>>> +	/*
>>> +	 * We can't use arbitrary data which can be misinterpreted
>>> +	 * by Lua script when we pass it to a script.
>>> +	 *
>>> +	 * WARNING: We use report_encoded as a temp buffer.
>>
>> 16. Static buffer is much bigger, and is already there. Why
>> can't you use it? Instead you somewhy add several kilobytes of
>> another static buffer padding out the process size.
> 
> static buffer is three pages only while we might need more
> size i suspect. let me reconsider this moment maybe indeed
> using tt_static would suite better. thanks!

Isn't your backtrace buffer smaller than 3 pages?

>>> +
>>> +	snprintf_safe("require(\'http.client\').post(\'%s\',"
>>> +		      "'{\"crashdump\":{\"version\":\"%d\","
>>> +		      "\"data\":\"%s\"}}',{timeout=1}); os.exit(1);",
>>
>> 17. Why so small timeout?
> 
> this is the same timeout the feedback daemon uses. I can increase it
> or make it configurable if you prefer.

No, should be fine then.

>>> +	abort();
>>> +}
>>> +
>>> +void
>>> +errstat_exec_send_crash(void)
>>
>> 19. Why is it called 'exec'?
> 
> Because it runs exec inside. I can drop this exec_ if you prefer.

It is not a good idea to expose what this function does inside
if it does not affect the caller. It calls exec in the child
process, so it has nothing to do with the caller. For example,
why popen_new() is not called popen_new_exec() then?

  reply	other threads:[~2020-12-06 15:50 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04 15:29 [Tarantool-patches] [PATCH v3 0/4] crash dump: implement sending feedback Cyrill Gorcunov
2020-12-04 15:30 ` [Tarantool-patches] [PATCH v3 1/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
2020-12-05 18:30   ` Vladislav Shpilevoy
2020-12-05 18:52     ` Cyrill Gorcunov
2020-12-04 15:30 ` [Tarantool-patches] [PATCH v3 2/4] errstat: add crash report base code Cyrill Gorcunov
2020-12-05 18:33   ` Vladislav Shpilevoy
2020-12-05 19:58     ` Cyrill Gorcunov
2020-12-06 15:50       ` Vladislav Shpilevoy [this message]
2020-12-04 15:30 ` [Tarantool-patches] [PATCH v3 3/4] crash: use errstat code in fatal signals Cyrill Gorcunov
2020-12-05 18:33   ` Vladislav Shpilevoy
2020-12-05 20:04     ` Cyrill Gorcunov
2020-12-04 15:30 ` [Tarantool-patches] [PATCH v3 4/4] cfg: configure crash report sending Cyrill Gorcunov
2020-12-05 18:34   ` Vladislav Shpilevoy
2020-12-05 18:30 ` [Tarantool-patches] [PATCH v3 0/4] crash dump: implement sending feedback Vladislav Shpilevoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7179689d-f27f-3665-70a8-1e5cd60757b1@tarantool.org \
    --to=v.shpilevoy@tarantool.org \
    --cc=gorcunov@gmail.com \
    --cc=tarantool-patches@dev.tarantool.org \
    --cc=v.perepelitsa@corp.mail.ru \
    --subject='Re: [Tarantool-patches] [PATCH v3 2/4] errstat: add crash report base code' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox