* [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper
2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
Very convenient to have this string extension.
We will use it in crash handling.
Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
CMakeLists.txt | 1 +
src/lib/core/util.c | 14 ++++++++++++++
src/trivia/config.h.cmake | 5 +++++
src/trivia/util.h | 15 +++++++++++++++
4 files changed, 35 insertions(+)
diff --git a/CMakeLists.txt b/CMakeLists.txt
index fa6818f8e..aa23113b3 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -126,6 +126,7 @@ set(CMAKE_REQUIRED_LIBRARIES "")
check_function_exists(clock_gettime HAVE_CLOCK_GETTIME_WITHOUT_RT)
check_symbol_exists(__get_cpuid cpuid.h HAVE_CPUID)
+check_symbol_exists(strlcpy string.h HAVE_STRLCPY)
# Checks for libev
include(CheckStructHasMember)
diff --git a/src/lib/core/util.c b/src/lib/core/util.c
index dfce317f0..f29886105 100644
--- a/src/lib/core/util.c
+++ b/src/lib/core/util.c
@@ -250,6 +250,20 @@ int2str(long long int val)
return buf;
}
+#ifndef HAVE_STRLCPY
+size_t
+strlcpy(char *dst, const char *src, size_t size)
+{
+ size_t src_len = strlen(src);
+ if (size != 0) {
+ size_t len = (src_len >= size) ? size - 1 : src_len;
+ memcpy(dst, src, len);
+ dst[len] = '\0';
+ }
+ return src_len;
+}
+#endif
+
int
utf8_check_printable(const char *start, size_t length)
{
diff --git a/src/trivia/config.h.cmake b/src/trivia/config.h.cmake
index 89e0d39c6..21efb978e 100644
--- a/src/trivia/config.h.cmake
+++ b/src/trivia/config.h.cmake
@@ -47,6 +47,11 @@
*/
#cmakedefine HAVE_CPUID 1
+/**
+ * Defined if strlcpy() string extension helper present.
+ */
+ #cmakedefine HAVE_STRLCPY 1
+
/*
* Defined if gcov instrumentation should be enabled.
*/
diff --git a/src/trivia/util.h b/src/trivia/util.h
index da5a3705e..15f9881db 100644
--- a/src/trivia/util.h
+++ b/src/trivia/util.h
@@ -450,6 +450,21 @@ fpconv_strtod(const char *nptr, char **endptr)
return strtod(nptr, endptr);
}
+#ifndef HAVE_STRLCPY
+/**
+ * Copy string. Unlike @a strncpy the result string
+ * is always null-terminated.
+ *
+ * @param dst destination buffer.
+ * @param src source string.
+ * @param size destination buffer size.
+ *
+ * @return size of @a src string.
+ */
+size_t
+strlcpy(char *dst, const char *src, size_t size);
+#endif
+
/**
* Check that @a str is valid utf-8 sequence and can be printed
* unescaped.
--
2.26.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer
2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
This will allow to reuse this routine in crash reports.
Part-of #5261
Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/lib/core/backtrace.cc | 12 ++++++------
src/lib/core/backtrace.h | 3 +++
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/src/lib/core/backtrace.cc b/src/lib/core/backtrace.cc
index 456ce9a4d..b4048089f 100644
--- a/src/lib/core/backtrace.cc
+++ b/src/lib/core/backtrace.cc
@@ -131,7 +131,7 @@ get_proc_name(unw_cursor_t *unw_cur, unw_word_t *offset, bool skip_cache)
}
char *
-backtrace(void)
+backtrace(char *start, size_t size)
{
int frame_no = 0;
unw_word_t sp = 0, old_sp = 0, ip, offset;
@@ -139,10 +139,9 @@ backtrace(void)
unw_getcontext(&unw_context);
unw_cursor_t unw_cur;
unw_init_local(&unw_cur, &unw_context);
- char *backtrace_buf = (char *)static_alloc(SMALL_STATIC_SIZE);
- char *p = backtrace_buf;
- char *end = p + SMALL_STATIC_SIZE - 1;
int unw_status;
+ char *p = start;
+ char *end = start + size - 1;
*p = '\0';
while ((unw_status = unw_step(&unw_cur)) > 0) {
const char *proc;
@@ -174,7 +173,7 @@ backtrace(void)
say_debug("unwinding error: %i", unw_status);
#endif
out:
- return backtrace_buf;
+ return start;
}
/*
@@ -436,7 +435,8 @@ backtrace_foreach(backtrace_cb cb, coro_context *coro_ctx, void *cb_ctx)
void
print_backtrace(void)
{
- fdprintf(STDERR_FILENO, "%s", backtrace());
+ char *start = (char *)static_alloc(SMALL_STATIC_SIZE);
+ fdprintf(STDERR_FILENO, "%s", backtrace(start, SMALL_STATIC_SIZE));
}
#endif /* ENABLE_BACKTRACE */
diff --git a/src/lib/core/backtrace.h b/src/lib/core/backtrace.h
index c119d5402..e0ae56be4 100644
--- a/src/lib/core/backtrace.h
+++ b/src/lib/core/backtrace.h
@@ -40,6 +40,9 @@ extern "C" {
#ifdef ENABLE_BACKTRACE
#include <coro.h>
+char *
+backtrace(char *start, size_t size);
+
void print_backtrace(void);
typedef int (backtrace_cb)(int frameno, void *frameret,
--
2.26.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in
2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
When SIGSEGV or SIGFPE reaches the tarantool we try to gather
all information related to the crash and print it out to the
console (well, stderr actually). Still there is a request
to not just show this info locally but send it out to the
feedback server.
Thus to keep gathering crash related information in one module,
we move fatal signal handling into the separate crash.c file.
This allows us to collect the data we need in one place and
reuse it when we need to send reports to stderr (and to the
feedback server, which will be implemented in next patch).
Part-of #5261
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
src/lib/core/CMakeLists.txt | 1 +
src/lib/core/crash.c | 295 ++++++++++++++++++++++++++++++++++++
src/lib/core/crash.h | 26 ++++
src/main.cc | 138 +----------------
4 files changed, 327 insertions(+), 133 deletions(-)
create mode 100644 src/lib/core/crash.c
create mode 100644 src/lib/core/crash.h
diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 7c62fc5ce..30cf0dd15 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -1,5 +1,6 @@
set(core_sources
diag.c
+ crash.c
say.c
memory.c
clock.c
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
new file mode 100644
index 000000000..3929463f3
--- /dev/null
+++ b/src/lib/core/crash.c
@@ -0,0 +1,295 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#include <string.h>
+#include <signal.h>
+#include <stdint.h>
+#include <time.h>
+
+#include "trivia/util.h"
+
+#include "backtrace.h"
+#include "crash.h"
+#include "say.h"
+
+#define pr_fmt(fmt) "crash: " fmt
+#define pr_syserr(fmt, ...) say_syserror(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_panic(fmt, ...) panic(pr_fmt(fmt), ##__VA_ARGS__)
+
+#ifdef TARGET_OS_LINUX
+#ifndef __x86_64__
+# error "Non x86-64 architectures are not supported"
+#endif
+struct crash_greg {
+ uint64_t r8;
+ uint64_t r9;
+ uint64_t r10;
+ uint64_t r11;
+ uint64_t r12;
+ uint64_t r13;
+ uint64_t r14;
+ uint64_t r15;
+ uint64_t di;
+ uint64_t si;
+ uint64_t bp;
+ uint64_t bx;
+ uint64_t dx;
+ uint64_t ax;
+ uint64_t cx;
+ uint64_t sp;
+ uint64_t ip;
+ uint64_t flags;
+ uint16_t cs;
+ uint16_t gs;
+ uint16_t fs;
+ uint16_t ss;
+ uint64_t err;
+ uint64_t trapno;
+ uint64_t oldmask;
+ uint64_t cr2;
+ uint64_t fpstate;
+ uint64_t reserved1[8];
+};
+#endif /* TARGET_OS_LINUX */
+
+static struct crash_info {
+ /**
+ * These two are mostly useless as being
+ * plain addresses and without real binary
+ * crash dump file we can't use them for
+ * anything suitable (in terms of analysis sake)
+ * but keep for backward compatibility.
+ */
+ void *context_addr;
+ void *siginfo_addr;
+#ifdef TARGET_OS_LINUX
+ /**
+ * Registers contents.
+ */
+ struct crash_greg greg;
+#endif
+ /**
+ * Faulting address.
+ */
+ void *siaddr;
+ /**
+ * Crash signal number.
+ */
+ int signo;
+ /**
+ * Crash signal code.
+ */
+ int sicode;
+#ifdef ENABLE_BACKTRACE
+ /**
+ * 1K of memory should be enough to keep the backtrace.
+ * In worst case it gonna be simply trimmed.
+ */
+ char backtrace_buf[1024];
+#endif
+} crash_info;
+
+/**
+ * The routine is called inside crash signal handler so
+ * be careful to not cause additional signals inside.
+ */
+static struct crash_info *
+crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
+{
+ struct crash_info *cinfo = &crash_info;
+
+ cinfo->signo = signo;
+ cinfo->sicode = siginfo->si_code;
+ cinfo->siaddr = siginfo->si_addr;
+ cinfo->context_addr = ucontext;
+ cinfo->siginfo_addr = siginfo;
+
+#ifdef ENABLE_BACKTRACE
+ char *start = cinfo->backtrace_buf;
+ backtrace(start, sizeof(cinfo->backtrace_buf));
+#endif
+
+#ifdef TARGET_OS_LINUX
+ /*
+ * uc_mcontext on libc level looks somehow strange,
+ * they define an array of uint64_t where each register
+ * defined by REG_x macro.
+ *
+ * In turn the kernel is quite explicit about the context.
+ * Moreover it is a part of user ABI, thus won't be changed.
+ *
+ * Lets use memcpy here to make a copy in a fast way.
+ */
+ ucontext_t *uc = ucontext;
+ memcpy(&cinfo->greg, &uc->uc_mcontext, sizeof(cinfo->greg));
+#endif
+
+ return cinfo;
+}
+
+/**
+ * Report crash information to the stderr
+ * (usually a current console).
+ */
+static void
+crash_report_stderr(struct crash_info *cinfo)
+{
+ if (cinfo->signo == SIGSEGV) {
+ fprintf(stderr, "Segmentation fault\n");
+ const char *signal_code_repr = NULL;
+
+ switch (cinfo->sicode) {
+ case SEGV_MAPERR:
+ signal_code_repr = "SEGV_MAPERR";
+ break;
+ case SEGV_ACCERR:
+ signal_code_repr = "SEGV_ACCERR";
+ break;
+ }
+
+ if (signal_code_repr != NULL)
+ fprintf(stderr, " code: %s\n", signal_code_repr);
+ else
+ fprintf(stderr, " code: %d\n", cinfo->sicode);
+ /*
+ * fprintf is used instead of fdprintf, because
+ * fdprintf does not understand %p
+ */
+ fprintf(stderr, " addr: %p\n", cinfo->siaddr);
+ } else {
+ fprintf(stderr, "Got a fatal signal %d\n", cinfo->signo);
+ }
+
+ fprintf(stderr, " context: %p\n", cinfo->context_addr);
+ fprintf(stderr, " siginfo: %p\n", cinfo->siginfo_addr);
+
+#ifdef TARGET_OS_LINUX
+# define fprintf_reg(__n, __v) \
+ fprintf(stderr, " %-9s0x%-17llx%lld\n", \
+ __n, (long long)__v, (long long)__v)
+ fprintf_reg("rax", cinfo->greg.ax);
+ fprintf_reg("rbx", cinfo->greg.bx);
+ fprintf_reg("rcx", cinfo->greg.cx);
+ fprintf_reg("rdx", cinfo->greg.dx);
+ fprintf_reg("rsi", cinfo->greg.si);
+ fprintf_reg("rdi", cinfo->greg.di);
+ fprintf_reg("rsp", cinfo->greg.sp);
+ fprintf_reg("rbp", cinfo->greg.bp);
+ fprintf_reg("r8", cinfo->greg.r8);
+ fprintf_reg("r9", cinfo->greg.r9);
+ fprintf_reg("r10", cinfo->greg.r10);
+ fprintf_reg("r11", cinfo->greg.r11);
+ fprintf_reg("r12", cinfo->greg.r12);
+ fprintf_reg("r13", cinfo->greg.r13);
+ fprintf_reg("r14", cinfo->greg.r14);
+ fprintf_reg("r15", cinfo->greg.r15);
+ fprintf_reg("rip", cinfo->greg.ip);
+ fprintf_reg("eflags", cinfo->greg.flags);
+ fprintf_reg("cs", cinfo->greg.cs);
+ fprintf_reg("gs", cinfo->greg.gs);
+ fprintf_reg("fs", cinfo->greg.fs);
+ fprintf_reg("cr2", cinfo->greg.cr2);
+ fprintf_reg("err", cinfo->greg.err);
+ fprintf_reg("oldmask", cinfo->greg.oldmask);
+ fprintf_reg("trapno", cinfo->greg.trapno);
+# undef fprintf_reg
+#endif /* TARGET_OS_LINUX */
+
+ fprintf(stderr, "Current time: %u\n", (unsigned)time(0));
+ fprintf(stderr, "Please file a bug at "
+ "http://github.com/tarantool/tarantool/issues\n");
+
+#ifdef ENABLE_BACKTRACE
+ fprintf(stderr, "Attempting backtrace... Note: since the server has "
+ "already crashed, \nthis may fail as well\n");
+ fprintf(stderr, "%s", cinfo->backtrace_buf);
+#endif
+}
+
+/**
+ * Handle fatal (crashing) signal.
+ *
+ * Try to log as much as possible before dumping a core.
+ *
+ * Core files are not always allowed and it takes an effort to
+ * extract useful information from them.
+ *
+ * *Recursive invocation*
+ *
+ * Unless SIGSEGV is sent by kill(), Linux resets the signal
+ * a default value before invoking the handler.
+ *
+ * Despite that, as an extra precaution to avoid infinite
+ * recursion, we count invocations of the handler, and
+ * quietly _exit() when called for a second time.
+ */
+static void
+crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
+{
+ static volatile sig_atomic_t in_cb = 0;
+ struct crash_info *cinfo;
+
+ if (in_cb == 0) {
+ in_cb = 1;
+ cinfo = crash_collect(signo, siginfo, context);
+ crash_report_stderr(cinfo);
+ } else {
+ /* Got a signal while running the handler. */
+ fprintf(stderr, "Fatal %d while backtracing", signo);
+ }
+
+ /* Try to dump a core */
+ struct sigaction sa = {
+ .sa_handler = SIG_DFL,
+ };
+ sigemptyset(&sa.sa_mask);
+ sigaction(SIGABRT, &sa, NULL);
+ abort();
+}
+
+/**
+ * Fatal signals we generate crash on.
+ */
+static const int crash_signals[] = { SIGSEGV, SIGFPE };
+
+void
+crash_signal_reset(void)
+{
+ struct sigaction sa = {
+ .sa_handler = SIG_DFL,
+ };
+ sigemptyset(&sa.sa_mask);
+
+ for (size_t i = 0; i < lengthof(crash_signals); i++) {
+ if (sigaction(crash_signals[i], &sa, NULL) == 0)
+ continue;
+ pr_syserr("reset sigaction %d", crash_signals[i]);
+ }
+}
+
+void
+crash_signal_init(void)
+{
+ /*
+ * SA_RESETHAND resets handler action to the default
+ * one when entering handler.
+ *
+ * SA_NODEFER allows receiving the same signal
+ * during handler.
+ */
+ struct sigaction sa = {
+ .sa_flags = SA_RESETHAND | SA_NODEFER | SA_SIGINFO,
+ .sa_sigaction = crash_signal_cb,
+ };
+ sigemptyset(&sa.sa_mask);
+
+ for (size_t i = 0; i < lengthof(crash_signals); i++) {
+ if (sigaction(crash_signals[i], &sa, NULL) == 0)
+ continue;
+ pr_panic("sigaction %d (%s)", crash_signals[i],
+ strerror(errno));
+ }
+}
diff --git a/src/lib/core/crash.h b/src/lib/core/crash.h
new file mode 100644
index 000000000..cd1db585e
--- /dev/null
+++ b/src/lib/core/crash.h
@@ -0,0 +1,26 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
+ */
+#pragma once
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+/**
+ * Initialize crash signal handlers.
+ */
+void
+crash_signal_init(void);
+
+/**
+ * Reset crash signal handlers.
+ */
+void
+crash_signal_reset(void);
+
+#if defined(__cplusplus)
+}
+#endif /* defined(__cplusplus) */
diff --git a/src/main.cc b/src/main.cc
index 2f48f474c..391e0f878 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -79,6 +79,7 @@
#include "systemd.h"
#include "crypto/crypto.h"
#include "core/popen.h"
+#include "core/crash.h"
static pid_t master_pid = getpid();
static struct pidfh *pid_file_handle;
@@ -184,124 +185,6 @@ signal_sigwinch_cb(ev_loop *loop, struct ev_signal *w, int revents)
rl_resize_terminal();
}
-#if defined(__linux__) && defined(__amd64)
-
-inline void
-dump_x86_64_register(const char *reg_name, unsigned long long val)
-{
- fprintf(stderr, " %-9s0x%-17llx%lld\n", reg_name, val, val);
-}
-
-void
-dump_x86_64_registers(ucontext_t *uc)
-{
- dump_x86_64_register("rax", uc->uc_mcontext.gregs[REG_RAX]);
- dump_x86_64_register("rbx", uc->uc_mcontext.gregs[REG_RBX]);
- dump_x86_64_register("rcx", uc->uc_mcontext.gregs[REG_RCX]);
- dump_x86_64_register("rdx", uc->uc_mcontext.gregs[REG_RDX]);
- dump_x86_64_register("rsi", uc->uc_mcontext.gregs[REG_RSI]);
- dump_x86_64_register("rdi", uc->uc_mcontext.gregs[REG_RDI]);
- dump_x86_64_register("rsp", uc->uc_mcontext.gregs[REG_RSP]);
- dump_x86_64_register("rbp", uc->uc_mcontext.gregs[REG_RBP]);
- dump_x86_64_register("r8", uc->uc_mcontext.gregs[REG_R8]);
- dump_x86_64_register("r9", uc->uc_mcontext.gregs[REG_R9]);
- dump_x86_64_register("r10", uc->uc_mcontext.gregs[REG_R10]);
- dump_x86_64_register("r11", uc->uc_mcontext.gregs[REG_R11]);
- dump_x86_64_register("r12", uc->uc_mcontext.gregs[REG_R12]);
- dump_x86_64_register("r13", uc->uc_mcontext.gregs[REG_R13]);
- dump_x86_64_register("r14", uc->uc_mcontext.gregs[REG_R14]);
- dump_x86_64_register("r15", uc->uc_mcontext.gregs[REG_R15]);
- dump_x86_64_register("rip", uc->uc_mcontext.gregs[REG_RIP]);
- dump_x86_64_register("eflags", uc->uc_mcontext.gregs[REG_EFL]);
- dump_x86_64_register("cs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 0) & 0xffff);
- dump_x86_64_register("gs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 16) & 0xffff);
- dump_x86_64_register("fs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 32) & 0xffff);
- dump_x86_64_register("cr2", uc->uc_mcontext.gregs[REG_CR2]);
- dump_x86_64_register("err", uc->uc_mcontext.gregs[REG_ERR]);
- dump_x86_64_register("oldmask", uc->uc_mcontext.gregs[REG_OLDMASK]);
- dump_x86_64_register("trapno", uc->uc_mcontext.gregs[REG_TRAPNO]);
-}
-
-#endif /* defined(__linux__) && defined(__amd64) */
-
-/** Try to log as much as possible before dumping a core.
- *
- * Core files are not aways allowed and it takes an effort to
- * extract useful information from them.
- *
- * *Recursive invocation*
- *
- * Unless SIGSEGV is sent by kill(), Linux
- * resets the signal a default value before invoking
- * the handler.
- *
- * Despite that, as an extra precaution to avoid infinite
- * recursion, we count invocations of the handler, and
- * quietly _exit() when called for a second time.
- */
-static void
-sig_fatal_cb(int signo, siginfo_t *siginfo, void *context)
-{
- static volatile sig_atomic_t in_cb = 0;
- int fd = STDERR_FILENO;
- struct sigaction sa;
-
- /* Got a signal while running the handler. */
- if (in_cb) {
- fdprintf(fd, "Fatal %d while backtracing", signo);
- goto end;
- }
-
- in_cb = 1;
-
- if (signo == SIGSEGV) {
- fdprintf(fd, "Segmentation fault\n");
- const char *signal_code_repr = 0;
- switch (siginfo->si_code) {
- case SEGV_MAPERR:
- signal_code_repr = "SEGV_MAPERR";
- break;
- case SEGV_ACCERR:
- signal_code_repr = "SEGV_ACCERR";
- break;
- }
- if (signal_code_repr)
- fdprintf(fd, " code: %s\n", signal_code_repr);
- else
- fdprintf(fd, " code: %d\n", siginfo->si_code);
- /*
- * fprintf is used insted of fdprintf, because
- * fdprintf does not understand %p
- */
- fprintf(stderr, " addr: %p\n", siginfo->si_addr);
- } else
- fdprintf(fd, "Got a fatal signal %d\n", signo);
- fprintf(stderr, " context: %p\n", context);
- fprintf(stderr, " siginfo: %p\n", siginfo);
-
-#if defined(__linux__) && defined(__amd64)
- dump_x86_64_registers((ucontext_t *)context);
-#endif
-
- fdprintf(fd, "Current time: %u\n", (unsigned) time(0));
- fdprintf(fd,
- "Please file a bug at http://github.com/tarantool/tarantool/issues\n");
-
-#ifdef ENABLE_BACKTRACE
- fdprintf(fd, "Attempting backtrace... Note: since the server has "
- "already crashed, \nthis may fail as well\n");
- print_backtrace();
-#endif
-end:
- /* Try to dump core. */
- memset(&sa, 0, sizeof(sa));
- sigemptyset(&sa.sa_mask);
- sa.sa_handler = SIG_DFL;
- sigaction(SIGABRT, &sa, NULL);
-
- abort();
-}
-
static void
signal_free(void)
{
@@ -328,11 +211,11 @@ signal_reset(void)
sigaction(SIGINT, &sa, NULL) == -1 ||
sigaction(SIGTERM, &sa, NULL) == -1 ||
sigaction(SIGHUP, &sa, NULL) == -1 ||
- sigaction(SIGWINCH, &sa, NULL) == -1 ||
- sigaction(SIGSEGV, &sa, NULL) == -1 ||
- sigaction(SIGFPE, &sa, NULL) == -1)
+ sigaction(SIGWINCH, &sa, NULL) == -1)
say_syserror("sigaction");
+ crash_signal_reset();
+
/* Unblock any signals blocked by libev. */
sigset_t sigset;
sigfillset(&sigset);
@@ -362,18 +245,7 @@ signal_init(void)
if (sigaction(SIGPIPE, &sa, 0) == -1)
panic_syserror("sigaction");
- /*
- * SA_RESETHAND resets handler action to the default
- * one when entering handler.
- * SA_NODEFER allows receiving the same signal during handler.
- */
- sa.sa_flags = SA_RESETHAND | SA_NODEFER | SA_SIGINFO;
- sa.sa_sigaction = sig_fatal_cb;
-
- if (sigaction(SIGSEGV, &sa, 0) == -1 ||
- sigaction(SIGFPE, &sa, 0) == -1) {
- panic_syserror("sigaction");
- }
+ crash_signal_init();
ev_signal_init(&ev_sigs[0], sig_checkpoint, SIGUSR1);
ev_signal_init(&ev_sigs[1], signal_cb, SIGINT);
--
2.26.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
` (2 preceding siblings ...)
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
2020-12-23 18:47 ` Vladislav Shpilevoy
3 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
To: tml; +Cc: Vladislav Shpilevoy
We have a feedback server which gathers information about a running instance.
While general info is enough for now we may loose a precious information about
crashes (such as call backtrace which caused the issue, type of build and etc).
In the commit we add support of sending this kind of information to the feedback
server. Internally we gather the reason of failure, pack it into base64 form
and then run another Tarantool instance which sends it out.
A typical report might look like
| {
| "crashdump": {
| "version": "1",
| "data": {
| "uname": {
| "sysname": "Linux",
| "release": "5.9.14-100.fc32.x86_64",
| "version": "#1 SMP Fri Dec 11 14:30:38 UTC 2020",
| "machine": "x86_64"
| },
| "build": {
| "version": "2.7.0-115-g360565efb",
| "cmake_type": "Linux-x86_64-Debug"
| },
| "signal": {
| "signo": 11,
| "si_code": 0,
| "si_addr": "0x3e800004838",
| "backtrace": "IzAgIDB4NjMwNzM0IGluIGNyYXNoX2NvbGxlY3...",
| "timestamp": "2020-12-23 14:42:10 MSK"
| }
| }
| }
| }
The `backtrace` itself is encoded as base64 because of newline symbols
(and may comprise some nonascii symbols as well).
There is no simple way to test this so I did it manually:
1) Run instance with
box.cfg{log_level = 8, feedback_host="127.0.0.1:1500"}
2) Run listener shell as
while true ; do nc -l -p 1500 -c 'echo -e "HTTP/1.1 200 OK\n\n $(date)"'; done
3) Send SIGSEGV
kill -11 `pidof tarantool`
Once SIGSEGV is delivered the crashinfo data is generated and sent out. For
debug purpose this data is also printed to the terminal on debug log level.
Closes #5261
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
@TarantoolBot document
Title: Configuration update, allow to disable sending crash information
For better analysis of program crashes the information associated with
the crash such as
- utsname (similar to `uname -a` output except the network name)
- build information
- reason for a crash
- call backtrace
is sent to the feedback server. To disable it set `feedback_crashinfo`
to `false`.
---
src/box/box.cc | 18 ++
src/box/box.h | 1 +
src/box/lua/cfg.cc | 9 +
src/box/lua/load_cfg.lua | 6 +-
src/lib/core/CMakeLists.txt | 2 +-
src/lib/core/crash.c | 308 ++++++++++++++++++++++++++++++++
src/lib/core/crash.h | 18 ++
src/main.cc | 1 +
test/app-tap/init_script.result | 1 +
test/box/admin.result | 2 +
test/box/cfg.result | 4 +
11 files changed, 368 insertions(+), 2 deletions(-)
diff --git a/src/box/box.cc b/src/box/box.cc
index a8bc3471d..440bfa305 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -74,6 +74,7 @@
#include "sql.h"
#include "systemd.h"
#include "call.h"
+#include "crash.h"
#include "func.h"
#include "sequence.h"
#include "sql_stmt_cache.h"
@@ -1213,6 +1214,23 @@ box_set_prepared_stmt_cache_size(void)
return 0;
}
+int
+box_set_crash(void)
+{
+ const char *host = cfg_gets("feedback_host");
+ bool is_enabled_1 = cfg_getb("feedback_enabled");
+ bool is_enabled_2 = cfg_getb("feedback_crashinfo");
+
+ if (host != NULL && strlen(host) >= CRASH_FEEDBACK_HOST_MAX) {
+ diag_set(ClientError, ER_CFG, "feedback_host",
+ "the address is too long");
+ return -1;
+ }
+
+ crash_cfg(host, is_enabled_1 && is_enabled_2);
+ return 0;
+}
+
/* }}} configuration bindings */
/**
diff --git a/src/box/box.h b/src/box/box.h
index b47a220b7..69fa096a1 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -257,6 +257,7 @@ void box_set_replication_sync_timeout(void);
void box_set_replication_skip_conflict(void);
void box_set_replication_anon(void);
void box_set_net_msg_max(void);
+int box_set_crash(void);
int
box_set_prepared_stmt_cache_size(void);
diff --git a/src/box/lua/cfg.cc b/src/box/lua/cfg.cc
index 42805e602..2d3ccbf0e 100644
--- a/src/box/lua/cfg.cc
+++ b/src/box/lua/cfg.cc
@@ -375,6 +375,14 @@ lbox_cfg_set_replication_skip_conflict(struct lua_State *L)
return 0;
}
+static int
+lbox_cfg_set_crash(struct lua_State *L)
+{
+ if (box_set_crash() != 0)
+ luaT_error(L);
+ return 0;
+}
+
void
box_lua_cfg_init(struct lua_State *L)
{
@@ -411,6 +419,7 @@ box_lua_cfg_init(struct lua_State *L)
{"cfg_set_replication_anon", lbox_cfg_set_replication_anon},
{"cfg_set_net_msg_max", lbox_cfg_set_net_msg_max},
{"cfg_set_sql_cache_size", lbox_set_prepared_stmt_cache_size},
+ {"cfg_set_crash", lbox_cfg_set_crash},
{NULL, NULL}
};
diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
index 770442052..7e41e0999 100644
--- a/src/box/lua/load_cfg.lua
+++ b/src/box/lua/load_cfg.lua
@@ -33,7 +33,8 @@ end
local ifdef_feedback_set_params =
private.feedback_daemon ~= nil and
- private.feedback_daemon.set_feedback_params or nil
+ private.feedback_daemon.set_feedback_params and
+ private.cfg_set_crash or nil
-- all available options
local default_cfg = {
@@ -99,6 +100,7 @@ local default_cfg = {
replication_skip_conflict = false,
replication_anon = false,
feedback_enabled = true,
+ feedback_crashinfo = true,
feedback_host = "https://feedback.tarantool.io",
feedback_interval = 3600,
net_msg_max = 768,
@@ -179,6 +181,7 @@ local template_cfg = {
replication_skip_conflict = 'boolean',
replication_anon = 'boolean',
feedback_enabled = ifdef_feedback('boolean'),
+ feedback_crashinfo = ifdef_feedback('boolean'),
feedback_host = ifdef_feedback('string'),
feedback_interval = ifdef_feedback('number'),
net_msg_max = 'number',
@@ -277,6 +280,7 @@ local dynamic_cfg = {
checkpoint_wal_threshold = private.cfg_set_checkpoint_wal_threshold,
worker_pool_threads = private.cfg_set_worker_pool_threads,
feedback_enabled = ifdef_feedback_set_params,
+ feedback_crashinfo = ifdef_feedback_set_params,
feedback_host = ifdef_feedback_set_params,
feedback_interval = ifdef_feedback_set_params,
-- do nothing, affects new replicas, which query this value on start
diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 30cf0dd15..358e98ea7 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
add_library(core STATIC ${core_sources})
-target_link_libraries(core salad small uri decNumber bit ${LIBEV_LIBRARIES}
+target_link_libraries(core salad small uri decNumber bit misc ${LIBEV_LIBRARIES}
${LIBEIO_LIBRARIES} ${LIBCORO_LIBRARIES}
${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES})
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 3929463f3..19d3d49ea 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -7,8 +7,13 @@
#include <string.h>
#include <signal.h>
#include <stdint.h>
+#include <limits.h>
#include <time.h>
+#include <sys/types.h>
+#include <sys/utsname.h>
+#include "third_party/base64.h"
+#include "small/static.h"
#include "trivia/util.h"
#include "backtrace.h"
@@ -16,9 +21,16 @@
#include "say.h"
#define pr_fmt(fmt) "crash: " fmt
+#define pr_debug(fmt, ...) say_debug(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_info(fmt, ...) say_info(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_err(fmt, ...) say_error(pr_fmt(fmt), ##__VA_ARGS__)
#define pr_syserr(fmt, ...) say_syserror(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_crit(fmt, ...) fprintf(stderr, pr_fmt(fmt) "\n", ##__VA_ARGS__)
#define pr_panic(fmt, ...) panic(pr_fmt(fmt), ##__VA_ARGS__)
+/** Use strlcpy with destination as an array */
+#define strlcpy_a(dst, src) strlcpy(dst, src, sizeof(dst))
+
#ifdef TARGET_OS_LINUX
#ifndef __x86_64__
# error "Non x86-64 architectures are not supported"
@@ -71,6 +83,10 @@ static struct crash_info {
*/
struct crash_greg greg;
#endif
+ /**
+ * Timestamp in nanoseconds (realtime).
+ */
+ uint64_t timestamp_rt;
/**
* Faulting address.
*/
@@ -92,6 +108,72 @@ static struct crash_info {
#endif
} crash_info;
+static char tarantool_path[PATH_MAX];
+static char feedback_host[CRASH_FEEDBACK_HOST_MAX];
+static bool send_crashinfo = false;
+
+static inline uint64_t
+timespec_to_ns(struct timespec *ts)
+{
+ return (uint64_t)ts->tv_sec * 1000000000 + (uint64_t)ts->tv_nsec;
+}
+
+static char *
+ns_to_localtime(uint64_t timestamp, char *buf, ssize_t len)
+{
+ time_t sec = timestamp / 1000000000;
+ char *start = buf;
+ struct tm tm;
+
+ /*
+ * Use similar format as say_x logger. Except plain
+ * seconds should be enough.
+ */
+ localtime_r(&sec, &tm);
+ ssize_t total = strftime(start, len, "%F %T %Z", &tm);
+ start += total;
+ if (total < len)
+ return buf;
+ buf[len - 1] = '\0';
+ return buf;
+}
+
+void
+crash_init(const char *tarantool_bin)
+{
+ strlcpy_a(tarantool_path, tarantool_bin);
+ if (strlen(tarantool_path) < strlen(tarantool_bin))
+ pr_panic("executable path is trimmed");
+}
+
+void
+crash_cfg(const char *host, bool is_enabled)
+{
+ if (host == NULL || !is_enabled) {
+ if (send_crashinfo) {
+ pr_debug("disable sending crashinfo feedback");
+ send_crashinfo = false;
+ feedback_host[0] = '\0';
+ }
+ return;
+ }
+
+ if (strcmp(feedback_host, host) != 0) {
+ strlcpy_a(feedback_host, host);
+ /*
+ * The caller should have tested already
+ * that there is enough space to keep
+ * the host address.
+ */
+ assert(strlen(feedback_host) == strlen(host));
+ }
+
+ if (!send_crashinfo) {
+ pr_debug("enable sending crashinfo feedback");
+ send_crashinfo = true;
+ }
+}
+
/**
* The routine is called inside crash signal handler so
* be careful to not cause additional signals inside.
@@ -100,6 +182,12 @@ static struct crash_info *
crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
{
struct crash_info *cinfo = &crash_info;
+ struct timespec ts;
+
+ if (clock_gettime(CLOCK_REALTIME, &ts) == 0)
+ cinfo->timestamp_rt = timespec_to_ns(&ts);
+ else
+ cinfo->timestamp_rt = 0;
cinfo->signo = signo;
cinfo->sicode = siginfo->si_code;
@@ -130,6 +218,224 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
return cinfo;
}
+/**
+ * Mark an environment that we're in crashinfo handling, this
+ * allows us to escape recursive attempts to send report,
+ * if the action of sending report is failing itself.
+ */
+static int
+crash_mark_env_mode(void)
+{
+ const char *env_name = "TT_CRASHINFO_MODE";
+ if (getenv(env_name) != NULL) {
+ pr_crit("recursive failure detected");
+ return -1;
+ }
+
+ if (setenv(env_name, "y", 0) != 0) {
+ pr_crit("unable to setup %s", env_name);
+ return -1;
+ }
+
+ return 0;
+}
+
+/**
+ * Zap reserved symbols.
+ */
+static char *
+json_zap(char *s)
+{
+ if (s == NULL)
+ return NULL;
+
+ /*
+ * Actually we should escape them but for
+ * now lets just zap without changing source
+ * size.
+ */
+ for (size_t i = 0; i < strlen(s); i++) {
+ if (s[i] == '\"' || s[i] == '\b' ||
+ s[i] == '\f' || s[i] == '\n' ||
+ s[i] == '\r' || s[i] == '\t' ||
+ s[i] == '\\') {
+ s[i] = ' ';
+ }
+ }
+ return s;
+}
+
+/**
+ * Report crash information to the feedback daemon
+ * (ie send it to feedback daemon).
+ */
+static void
+crash_report_feedback_daemon(struct crash_info *cinfo)
+{
+ if (crash_mark_env_mode() != 0)
+ return;
+
+ /*
+ * Update to a new number if the format get changed.
+ */
+ const int crashinfo_version = 1;
+
+ char *p = static_alloc(SMALL_STATIC_SIZE);
+ char *tail = &p[SMALL_STATIC_SIZE];
+ char *e = &p[SMALL_STATIC_SIZE];
+ char *head = p;
+
+ /*
+ * Note that while we encode the report we
+ * intensively use a tail of the allocated
+ * buffer as a temporary store.
+ */
+
+#define snprintf_safe(__end, __fmt, ...) \
+ do { \
+ size_t size = (char *)__end - p; \
+ p += snprintf(p, size, __fmt, ##__VA_ARGS__); \
+ if (p >= (char *)__end) \
+ goto out; \
+ } while (0)
+
+ /*
+ * Lets reuse tail of the buffer as a temp space.
+ */
+ struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
+ if (p >= (char *)uname_ptr)
+ goto out;
+
+ if (uname(uname_ptr) != 0) {
+ pr_syserr("uname call failed, ignore");
+ memset(uname_ptr, 0, sizeof(struct utsname));
+ }
+
+ /*
+ * Start filling the script. The "data" key value is
+ * filled as a separate code block for easier
+ * modifications in future.
+ */
+ snprintf_safe(uname_ptr,
+ "require(\'http.client\').post(\'%s\',"
+ "'{\"crashdump\":{\"version\":\"%d\","
+ "\"data\":", feedback_host,
+ crashinfo_version);
+
+ /* The "data" key value */
+ snprintf_safe(uname_ptr, "{");
+ snprintf_safe(uname_ptr, "\"uname\":{");
+ snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
+ json_zap(uname_ptr->sysname));
+ /*
+ * nodename might contain a sensitive information, skip.
+ */
+ snprintf_safe(uname_ptr, "\"release\":\"%s\",",
+ json_zap(uname_ptr->release));
+ snprintf_safe(uname_ptr, "\"version\":\"%s\",",
+ json_zap(uname_ptr->version));
+ snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
+ json_zap(uname_ptr->machine));
+ snprintf_safe(uname_ptr, "},");
+
+ snprintf_safe(e, "\"build\":{");
+ snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
+ snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
+ snprintf_safe(e, "},");
+
+ snprintf_safe(e, "\"signal\":{");
+ snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
+ snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+ if (cinfo->signo == SIGSEGV) {
+ if (cinfo->sicode == SEGV_MAPERR) {
+ snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ "SEGV_MAPERR");
+ } else if (cinfo->sicode == SEGV_ACCERR) {
+ snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ "SEGV_ACCERR");
+ }
+ snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+ (long long)cinfo->siaddr);
+ }
+
+#ifdef ENABLE_BACKTRACE
+ /*
+ * The backtrace itself is encoded into base64 form
+ * since it might have arbitrary symbols not suitable
+ * for json encoding (newlines and etc).
+ */
+ size_t bt_len = strlen(cinfo->backtrace_buf);
+ size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
+ char *bt_base64 = &tail[-bt_elen];
+ if (p >= bt_base64)
+ goto out;
+ base64_encode(cinfo->backtrace_buf, bt_len,
+ bt_base64, bt_elen, BASE64_NOWRAP);
+ bt_base64[bt_elen] = '\0';
+ snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+#endif
+
+ /* 64 bytes should be enough for longest localtime */
+ const int ts_size = 64;
+ char *timestamp_rt_str = &tail[-ts_size];
+ if (p >= timestamp_rt_str)
+ goto out;
+ ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
+ snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
+ json_zap(timestamp_rt_str));
+ snprintf_safe(timestamp_rt_str, "}");
+ snprintf_safe(timestamp_rt_str, "}");
+
+ /*
+ * Finalize the "data" key and the script.
+ *
+ * The timeout is choosen to be 1 second as
+ * main feedback daemon uses.
+ */
+ snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+
+ pr_debug("crashinfo script: %s", head);
+
+ char *exec_argv[4] = {
+ [0] = tarantool_path,
+ [1] = "-e",
+ [2] = head,
+ [3] = NULL,
+ };
+
+ /*
+ * Can't use fork here because libev has own
+ * at_fork helpers with mutex where we might
+ * stuck (see popen code).
+ */
+ pid_t pid = vfork();
+ if (pid == 0) {
+ /*
+ * Environment is needed for recursive
+ * crash protection. See crash_mark_env_mode
+ * above.
+ */
+ extern char **environ;
+ /*
+ * The script must exit at the end but there
+ * is no simple way to make sure from inside
+ * of a signal crash handler. So just hope it
+ * is running fine.
+ */
+ execve(exec_argv[0], exec_argv, environ);
+ pr_crit("exec(%s,[%s,%s,%s]) failed",
+ exec_argv[0], exec_argv[0],
+ exec_argv[1], exec_argv[2]);
+ _exit(1);
+ } else if (pid < 0) {
+ pr_crit("unable to vfork (errno %d)", errno);
+ }
+
+ return;
+out:
+ pr_crit("unable to prepare a crash report");
+}
+
/**
* Report crash information to the stderr
* (usually a current console).
@@ -236,6 +542,8 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
in_cb = 1;
cinfo = crash_collect(signo, siginfo, context);
crash_report_stderr(cinfo);
+ if (send_crashinfo)
+ crash_report_feedback_daemon(cinfo);
} else {
/* Got a signal while running the handler. */
fprintf(stderr, "Fatal %d while backtracing", signo);
diff --git a/src/lib/core/crash.h b/src/lib/core/crash.h
index cd1db585e..195aef10b 100644
--- a/src/lib/core/crash.h
+++ b/src/lib/core/crash.h
@@ -9,6 +9,24 @@
extern "C" {
#endif /* defined(__cplusplus) */
+/**
+ * PATH_MAX is too big and 2K is recommended
+ * limit for web address.
+ */
+#define CRASH_FEEDBACK_HOST_MAX 2048
+
+/**
+ * Initialize crash subsystem.
+ */
+void
+crash_init(const char *tarantool_bin);
+
+/**
+ * Configure crash parameters.
+ */
+void
+crash_cfg(const char *host, bool is_enabled);
+
/**
* Initialize crash signal handlers.
*/
diff --git a/src/main.cc b/src/main.cc
index 391e0f878..2fce81bb3 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -687,6 +687,7 @@ main(int argc, char **argv)
title_set_script_name(argv[0]);
}
+ crash_init(tarantool_bin);
export_syms();
random_init();
diff --git a/test/app-tap/init_script.result b/test/app-tap/init_script.result
index 72aa67db2..16c5b01d2 100644
--- a/test/app-tap/init_script.result
+++ b/test/app-tap/init_script.result
@@ -10,6 +10,7 @@ checkpoint_wal_threshold:1e+18
coredump:false
election_mode:off
election_timeout:5
+feedback_crashinfo:true
feedback_enabled:true
feedback_host:https://feedback.tarantool.io
feedback_interval:3600
diff --git a/test/box/admin.result b/test/box/admin.result
index e05440f66..05debe673 100644
--- a/test/box/admin.result
+++ b/test/box/admin.result
@@ -41,6 +41,8 @@ cfg_filter(box.cfg)
- off
- - election_timeout
- 5
+ - - feedback_crashinfo
+ - true
- - feedback_enabled
- true
- - feedback_host
diff --git a/test/box/cfg.result b/test/box/cfg.result
index 10fef006c..22a720c2c 100644
--- a/test/box/cfg.result
+++ b/test/box/cfg.result
@@ -29,6 +29,8 @@ cfg_filter(box.cfg)
| - off
| - - election_timeout
| - 5
+ | - - feedback_crashinfo
+ | - true
| - - feedback_enabled
| - true
| - - feedback_host
@@ -142,6 +144,8 @@ cfg_filter(box.cfg)
| - off
| - - election_timeout
| - 5
+ | - - feedback_crashinfo
+ | - true
| - - feedback_enabled
| - true
| - - feedback_host
--
2.26.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
@ 2020-12-23 18:47 ` Vladislav Shpilevoy
2020-12-23 18:57 ` Cyrill Gorcunov
2020-12-23 21:22 ` Cyrill Gorcunov
0 siblings, 2 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-23 18:47 UTC (permalink / raw)
To: Cyrill Gorcunov, tml
Hi! Thanks for the patch!
See 3 comments and fixes below, and top of the branch in a
separate commit.
> diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
> index 3929463f3..19d3d49ea 100644
> --- a/src/lib/core/crash.c
> +++ b/src/lib/core/crash.c
> @@ -130,6 +218,224 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
> return cinfo;
> }
>
> +/**
> + * Mark an environment that we're in crashinfo handling, this
> + * allows us to escape recursive attempts to send report,
> + * if the action of sending report is failing itself.
> + */
> +static int
> +crash_mark_env_mode(void)
1. You don't need to change 'env'. Because in the new process
send_crashinfo will be false. Therefore it won't send the crash
anyway.
> +{
> + const char *env_name = "TT_CRASHINFO_MODE";
> + if (getenv(env_name) != NULL) {
> + pr_crit("recursive failure detected");
> + return -1;
> + }
> +
> + if (setenv(env_name, "y", 0) != 0) {
> + pr_crit("unable to setup %s", env_name);
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * Zap reserved symbols.
> + */
> +static char *
> +json_zap(char *s)
2. Why can't you simply call json_escape(), which we
already have? Yes, it may add a few more bytes, but it
is not something super bad. json_escape() returns number
of written bytes, so you can use it quite easy.
====================
@@ -325,8 +325,11 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
/* The "data" key value */
snprintf_safe(uname_ptr, "{");
snprintf_safe(uname_ptr, "\"uname\":{");
- snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
- json_zap(uname_ptr->sysname));
+ snprintf_safe(uname_ptr, "\"sysname\":\"");
+ p += json_escape(p, (char *)uname_ptr - p, uname_ptr->sysname);
+ if (p >= (char *)uname_ptr)
+ goto out;
+ snprintf_safe(uname_ptr, "\",");
====================
It would look even simpler, if you would use the existing
SNPRINT() macros. It returns -1 on fail, but you could patch
it or introduce SNPRINT_GO which would goto to a given label.
Or keep return -1, and do it like in sio_socketname_to_buffer() -
it has a separate function to do all the prints.
I tried to do that in my diff.
> +{
> + if (s == NULL)
> + return NULL;
> +
> + /*
> + * Actually we should escape them but for
> + * now lets just zap without changing source
> + * size.
> + */
> + for (size_t i = 0; i < strlen(s); i++) {
3. Strlen() is going to be called on each iteration. Please, save
the result into a variable. Or just use json_escape().
> + if (s[i] == '\"' || s[i] == '\b' ||
> + s[i] == '\f' || s[i] == '\n' ||
> + s[i] == '\r' || s[i] == '\t' ||
> + s[i] == '\\') {
> + s[i] = ' ';
> + }
> + }
Consider my fixes below and on top of the branch in a separate
commit. I didn't test them though. So there are likely to be a
few typos.
====================
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 914c84a1b..4b821d44a 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -218,63 +218,13 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
return cinfo;
}
-/**
- * Mark an environment that we're in crashinfo handling, this
- * allows us to escape recursive attempts to send report,
- * if the action of sending report is failing itself.
- */
-static int
-crash_mark_env_mode(void)
-{
- const char *env_name = "TT_CRASHINFO_MODE";
- if (getenv(env_name) != NULL) {
- pr_crit("recursive failure detected");
- return -1;
- }
-
- if (setenv(env_name, "y", 0) != 0) {
- pr_crit("unable to setup %s", env_name);
- return -1;
- }
-
- return 0;
-}
-
-/**
- * Zap reserved symbols.
- */
-static char *
-json_zap(char *s)
-{
- if (s == NULL)
- return NULL;
-
- /*
- * Actually we should escape them but for
- * now lets just zap without changing source
- * size.
- */
- for (size_t i = 0; i < strlen(s); i++) {
- if (s[i] == '\"' || s[i] == '\b' ||
- s[i] == '\f' || s[i] == '\n' ||
- s[i] == '\r' || s[i] == '\t' ||
- s[i] == '\\') {
- s[i] = ' ';
- }
- }
- return s;
-}
-
/**
* Report crash information to the feedback daemon
* (ie send it to feedback daemon).
*/
-static void
+static int
crash_report_feedback_daemon(struct crash_info *cinfo)
{
- if (crash_mark_env_mode() != 0)
- return;
-
/*
* Update to a new number if the format get changed.
*/
@@ -291,20 +241,17 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* buffer as a temporary store.
*/
-#define snprintf_safe(__end, __fmt, ...) \
- do { \
- size_t size = (char *)__end - p; \
- p += snprintf(p, size, __fmt, ##__VA_ARGS__); \
- if (p >= (char *)__end) \
- goto out; \
- } while (0)
+ int total = 0;
+ int size = 0;
+#define snprintf_safe(...) SNPRINT(total, snprintf, p, size, __VA_ARGS__)
+#define jnprintf_safe(str) SNPRINT(total, json_escape, p, size, str)
/*
* Lets reuse tail of the buffer as a temp space.
*/
struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
if (p >= (char *)uname_ptr)
- goto out;
+ return -1;
if (uname(uname_ptr) != 0) {
pr_syserr("uname call failed, ignore");
@@ -316,48 +263,53 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* filled as a separate code block for easier
* modifications in future.
*/
- snprintf_safe(uname_ptr,
- "require(\'http.client\').post(\'%s\',"
+ size = (char *)uname_ptr - p;
+ snprintf_safe("require(\'http.client\').post(\'%s\',"
"'{\"crashdump\":{\"version\":\"%d\","
"\"data\":", feedback_host,
crashinfo_version);
/* The "data" key value */
- snprintf_safe(uname_ptr, "{");
- snprintf_safe(uname_ptr, "\"uname\":{");
- snprintf_safe(uname_ptr, "\"sysname\":\"");
- p += json_escape(p, (char *)uname_ptr - p, uname_ptr->sysname);
- if (p >= (char *)uname_ptr)
- goto out;
- snprintf_safe(uname_ptr, "\",");
+ snprintf_safe("{");
+ snprintf_safe("\"uname\":{");
+ snprintf_safe("\"sysname\":\"");
+ jnprintf_safe(uname_ptr->sysname);
+ snprintf_safe("\",");
/*
* nodename might contain a sensitive information, skip.
*/
- snprintf_safe(uname_ptr, "\"release\":\"%s\",",
- json_zap(uname_ptr->release));
- snprintf_safe(uname_ptr, "\"version\":\"%s\",",
- json_zap(uname_ptr->version));
- snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
- json_zap(uname_ptr->machine));
- snprintf_safe(uname_ptr, "},");
-
- snprintf_safe(e, "\"build\":{");
- snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
- snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
- snprintf_safe(e, "},");
-
- snprintf_safe(e, "\"signal\":{");
- snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
- snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+ snprintf_safe("\"release\":\"");
+ jnprintf_safe(uname_ptr->release);
+ snprintf_safe("\",");
+
+ snprintf_safe("\"version\":\"");
+ jnprintf_safe(uname_ptr->version);
+ snprintf_safe("\",");
+
+ snprintf_safe("\"machine\":\"");
+ jnprintf_safe(uname_ptr->machine);
+ snprintf_safe("\"");
+ snprintf_safe("},");
+
+ /* Extend size, because now uname_ptr is not needed. */
+ size = e - p;
+ snprintf_safe("\"build\":{");
+ snprintf_safe("\"version\":\"%s\",", PACKAGE_VERSION);
+ snprintf_safe("\"cmake_type\":\"%s\"", BUILD_INFO);
+ snprintf_safe("},");
+
+ snprintf_safe("\"signal\":{");
+ snprintf_safe("\"signo\":%d,", cinfo->signo);
+ snprintf_safe("\"si_code\":%d,", cinfo->sicode);
if (cinfo->signo == SIGSEGV) {
if (cinfo->sicode == SEGV_MAPERR) {
- snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ snprintf_safe("\"si_code_str\":\"%s\",",
"SEGV_MAPERR");
} else if (cinfo->sicode == SEGV_ACCERR) {
- snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ snprintf_safe("\"si_code_str\":\"%s\",",
"SEGV_ACCERR");
}
- snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+ snprintf_safe("\"si_addr\":\"0x%llx\",",
(long long)cinfo->siaddr);
}
@@ -371,23 +323,28 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
char *bt_base64 = &tail[-bt_elen];
if (p >= bt_base64)
- goto out;
+ return -1;
base64_encode(cinfo->backtrace_buf, bt_len,
bt_base64, bt_elen, BASE64_NOWRAP);
bt_base64[bt_elen] = '\0';
- snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+
+ size = bt_base64 - p;
+ snprintf_safe("\"backtrace\":\"%s\",", bt_base64);
#endif
/* 64 bytes should be enough for longest localtime */
const int ts_size = 64;
char *timestamp_rt_str = &tail[-ts_size];
if (p >= timestamp_rt_str)
- goto out;
+ return -1;
ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
- snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
- json_zap(timestamp_rt_str));
- snprintf_safe(timestamp_rt_str, "}");
- snprintf_safe(timestamp_rt_str, "}");
+
+ size = timestamp_rt_str - p;
+ snprintf_safe("\"timestamp\":\"");
+ jnprintf_safe(timestamp_rt_str);
+ snprintf_safe("\"");
+ snprintf_safe("}");
+ snprintf_safe("}");
/*
* Finalize the "data" key and the script.
@@ -395,7 +352,8 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* The timeout is choosen to be 1 second as
* main feedback daemon uses.
*/
- snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+ size = e - p;
+ snprintf_safe("}}',{timeout=1});os.exit(1);");
pr_debug("crashinfo script: %s", head);
@@ -413,30 +371,22 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
*/
pid_t pid = vfork();
if (pid == 0) {
- /*
- * Environment is needed for recursive
- * crash protection. See crash_mark_env_mode
- * above.
- */
- extern char **environ;
/*
* The script must exit at the end but there
* is no simple way to make sure from inside
* of a signal crash handler. So just hope it
* is running fine.
*/
- execve(exec_argv[0], exec_argv, environ);
+ execve(exec_argv[0], exec_argv, NULL);
pr_crit("exec(%s,[%s,%s,%s]) failed",
exec_argv[0], exec_argv[0],
exec_argv[1], exec_argv[2]);
_exit(1);
} else if (pid < 0) {
pr_crit("unable to vfork (errno %d)", errno);
+ return -1;
}
-
- return;
-out:
- pr_crit("unable to prepare a crash report");
+ return 0;
}
/**
@@ -545,8 +495,8 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
in_cb = 1;
cinfo = crash_collect(signo, siginfo, context);
crash_report_stderr(cinfo);
- if (send_crashinfo)
- crash_report_feedback_daemon(cinfo);
+ if (send_crashinfo && crash_report_feedback_daemon(cinfo) != 0)
+ pr_crit("unable to send a crash report");
} else {
/* Got a signal while running the handler. */
fprintf(stderr, "Fatal %d while backtracing", signo);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-23 18:47 ` Vladislav Shpilevoy
@ 2020-12-23 18:57 ` Cyrill Gorcunov
2020-12-23 21:22 ` Cyrill Gorcunov
1 sibling, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 18:57 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> >
> > +/**
> > + * Mark an environment that we're in crashinfo handling, this
> > + * allows us to escape recursive attempts to send report,
> > + * if the action of sending report is failing itself.
> > + */
> > +static int
> > +crash_mark_env_mode(void)
>
> 1. You don't need to change 'env'. Because in the new process
> send_crashinfo will be false. Therefore it won't send the crash
> anyway.
Yeah, good catch! Will drop this.
> > +/**
> > + * Zap reserved symbols.
> > + */
> > +static char *
> > +json_zap(char *s)
>
> 2. Why can't you simply call json_escape(), which we
> already have? Yes, it may add a few more bytes, but it
> is not something super bad. json_escape() returns number
> of written bytes, so you can use it quite easy.
Ah, I didn't know that we have this helper. Thanks, Vlad!
>
> It would look even simpler, if you would use the existing
> SNPRINT() macros. It returns -1 on fail, but you could patch
> it or introduce SNPRINT_GO which would goto to a given label.
> Or keep return -1, and do it like in sio_socketname_to_buffer() -
> it has a separate function to do all the prints.
>
> I tried to do that in my diff.
>
> > +{
> > + if (s == NULL)
> > + return NULL;
> > +
> > + /*
> > + * Actually we should escape them but for
> > + * now lets just zap without changing source
> > + * size.
> > + */
> > + for (size_t i = 0; i < strlen(s); i++) {
>
> 3. Strlen() is going to be called on each iteration. Please, save
> the result into a variable. Or just use json_escape().
Nope, it won't. gcc is smart enough to move this bb
out of cycle. This is known for decades already. Still
if I gonna use json helper you pointed above I'll not
need this function completely.
> Consider my fixes below and on top of the branch in a separate
> commit. I didn't test them though. So there are likely to be a
> few typos.
Yeah, thanks a lot. Will update and report you.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-23 18:47 ` Vladislav Shpilevoy
2020-12-23 18:57 ` Cyrill Gorcunov
@ 2020-12-23 21:22 ` Cyrill Gorcunov
2020-12-24 13:16 ` Cyrill Gorcunov
1 sibling, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 21:22 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
>
> See 3 comments and fixes below, and top of the branch in a
> separate commit.
>
Vlad, since we can use json_escape I dropped usage of base64
encoder for backtrace. The final interdiff I pushed into
gorcunov/gh-5261-crash-report-6 is the following
--
diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 358e98ea7..30cf0dd15 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
add_library(core STATIC ${core_sources})
-target_link_libraries(core salad small uri decNumber bit misc ${LIBEV_LIBRARIES}
+target_link_libraries(core salad small uri decNumber bit ${LIBEV_LIBRARIES}
${LIBEIO_LIBRARIES} ${LIBCORO_LIBRARIES}
${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES})
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 19d3d49ea..e62af6836 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -12,7 +12,6 @@
#include <sys/types.h>
#include <sys/utsname.h>
-#include "third_party/base64.h"
#include "small/static.h"
#include "trivia/util.h"
@@ -218,63 +217,13 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
return cinfo;
}
-/**
- * Mark an environment that we're in crashinfo handling, this
- * allows us to escape recursive attempts to send report,
- * if the action of sending report is failing itself.
- */
-static int
-crash_mark_env_mode(void)
-{
- const char *env_name = "TT_CRASHINFO_MODE";
- if (getenv(env_name) != NULL) {
- pr_crit("recursive failure detected");
- return -1;
- }
-
- if (setenv(env_name, "y", 0) != 0) {
- pr_crit("unable to setup %s", env_name);
- return -1;
- }
-
- return 0;
-}
-
-/**
- * Zap reserved symbols.
- */
-static char *
-json_zap(char *s)
-{
- if (s == NULL)
- return NULL;
-
- /*
- * Actually we should escape them but for
- * now lets just zap without changing source
- * size.
- */
- for (size_t i = 0; i < strlen(s); i++) {
- if (s[i] == '\"' || s[i] == '\b' ||
- s[i] == '\f' || s[i] == '\n' ||
- s[i] == '\r' || s[i] == '\t' ||
- s[i] == '\\') {
- s[i] = ' ';
- }
- }
- return s;
-}
-
/**
* Report crash information to the feedback daemon
* (ie send it to feedback daemon).
*/
-static void
+static int
crash_report_feedback_daemon(struct crash_info *cinfo)
{
- if (crash_mark_env_mode() != 0)
- return;
-
/*
* Update to a new number if the format get changed.
*/
@@ -285,26 +234,18 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
char *e = &p[SMALL_STATIC_SIZE];
char *head = p;
- /*
- * Note that while we encode the report we
- * intensively use a tail of the allocated
- * buffer as a temporary store.
- */
+ int total = 0;
+ int size = 0;
-#define snprintf_safe(__end, __fmt, ...) \
- do { \
- size_t size = (char *)__end - p; \
- p += snprintf(p, size, __fmt, ##__VA_ARGS__); \
- if (p >= (char *)__end) \
- goto out; \
- } while (0)
+#define snprintf_safe(...) SNPRINT(total, snprintf, p, size, __VA_ARGS__)
+#define jnprintf_safe(str) SNPRINT(total, json_escape, p, size, str)
/*
* Lets reuse tail of the buffer as a temp space.
*/
struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
if (p >= (char *)uname_ptr)
- goto out;
+ return -1;
if (uname(uname_ptr) != 0) {
pr_syserr("uname call failed, ignore");
@@ -316,75 +257,75 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* filled as a separate code block for easier
* modifications in future.
*/
- snprintf_safe(uname_ptr,
- "require(\'http.client\').post(\'%s\',"
+ size = (char *)uname_ptr - p;
+ snprintf_safe("require(\'http.client\').post(\'%s\',"
"'{\"crashdump\":{\"version\":\"%d\","
"\"data\":", feedback_host,
crashinfo_version);
/* The "data" key value */
- snprintf_safe(uname_ptr, "{");
- snprintf_safe(uname_ptr, "\"uname\":{");
- snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
- json_zap(uname_ptr->sysname));
+ snprintf_safe("{");
+ snprintf_safe("\"uname\":{");
+ snprintf_safe("\"sysname\":\"");
+ jnprintf_safe(uname_ptr->sysname);
+ snprintf_safe("\",");
/*
* nodename might contain a sensitive information, skip.
*/
- snprintf_safe(uname_ptr, "\"release\":\"%s\",",
- json_zap(uname_ptr->release));
- snprintf_safe(uname_ptr, "\"version\":\"%s\",",
- json_zap(uname_ptr->version));
- snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
- json_zap(uname_ptr->machine));
- snprintf_safe(uname_ptr, "},");
-
- snprintf_safe(e, "\"build\":{");
- snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
- snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
- snprintf_safe(e, "},");
-
- snprintf_safe(e, "\"signal\":{");
- snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
- snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+ snprintf_safe("\"release\":\"");
+ jnprintf_safe(uname_ptr->release);
+ snprintf_safe("\",");
+
+ snprintf_safe("\"version\":\"");
+ jnprintf_safe(uname_ptr->version);
+ snprintf_safe("\",");
+
+ snprintf_safe("\"machine\":\"");
+ jnprintf_safe(uname_ptr->machine);
+ snprintf_safe("\"");
+ snprintf_safe("},");
+
+ /* Extend size, because now uname_ptr is not needed. */
+ size = e - p;
+ snprintf_safe("\"build\":{");
+ snprintf_safe("\"version\":\"%s\",", PACKAGE_VERSION);
+ snprintf_safe("\"cmake_type\":\"%s\"", BUILD_INFO);
+ snprintf_safe("},");
+
+ snprintf_safe("\"signal\":{");
+ snprintf_safe("\"signo\":%d,", cinfo->signo);
+ snprintf_safe("\"si_code\":%d,", cinfo->sicode);
if (cinfo->signo == SIGSEGV) {
if (cinfo->sicode == SEGV_MAPERR) {
- snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ snprintf_safe("\"si_code_str\":\"%s\",",
"SEGV_MAPERR");
} else if (cinfo->sicode == SEGV_ACCERR) {
- snprintf_safe(e, "\"si_code_str\":\"%s\",",
+ snprintf_safe("\"si_code_str\":\"%s\",",
"SEGV_ACCERR");
}
- snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+ snprintf_safe("\"si_addr\":\"0x%llx\",",
(long long)cinfo->siaddr);
}
#ifdef ENABLE_BACKTRACE
- /*
- * The backtrace itself is encoded into base64 form
- * since it might have arbitrary symbols not suitable
- * for json encoding (newlines and etc).
- */
- size_t bt_len = strlen(cinfo->backtrace_buf);
- size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
- char *bt_base64 = &tail[-bt_elen];
- if (p >= bt_base64)
- goto out;
- base64_encode(cinfo->backtrace_buf, bt_len,
- bt_base64, bt_elen, BASE64_NOWRAP);
- bt_base64[bt_elen] = '\0';
- snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+ snprintf_safe("\"backtrace\":\"");
+ jnprintf_safe(cinfo->backtrace_buf);
+ snprintf_safe("\",");
#endif
/* 64 bytes should be enough for longest localtime */
const int ts_size = 64;
char *timestamp_rt_str = &tail[-ts_size];
if (p >= timestamp_rt_str)
- goto out;
+ return -1;
ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
- snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
- json_zap(timestamp_rt_str));
- snprintf_safe(timestamp_rt_str, "}");
- snprintf_safe(timestamp_rt_str, "}");
+
+ size = timestamp_rt_str - p;
+ snprintf_safe("\"timestamp\":\"");
+ jnprintf_safe(timestamp_rt_str);
+ snprintf_safe("\"");
+ snprintf_safe("}");
+ snprintf_safe("}");
/*
* Finalize the "data" key and the script.
@@ -392,7 +333,8 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* The timeout is choosen to be 1 second as
* main feedback daemon uses.
*/
- snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+ size = e - p;
+ snprintf_safe("}}',{timeout=1});os.exit(1);");
pr_debug("crashinfo script: %s", head);
@@ -410,11 +352,6 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
*/
pid_t pid = vfork();
if (pid == 0) {
- /*
- * Environment is needed for recursive
- * crash protection. See crash_mark_env_mode
- * above.
- */
extern char **environ;
/*
* The script must exit at the end but there
@@ -429,11 +366,10 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
_exit(1);
} else if (pid < 0) {
pr_crit("unable to vfork (errno %d)", errno);
+ return -1;
}
- return;
-out:
- pr_crit("unable to prepare a crash report");
+ return 0;
}
/**
@@ -542,8 +478,10 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
in_cb = 1;
cinfo = crash_collect(signo, siginfo, context);
crash_report_stderr(cinfo);
- if (send_crashinfo)
- crash_report_feedback_daemon(cinfo);
+ if (send_crashinfo &&
+ crash_report_feedback_daemon(cinfo) != 0) {
+ pr_crit("unable to send a crash report");
+ }
} else {
/* Got a signal while running the handler. */
fprintf(stderr, "Fatal %d while backtracing", signo);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-23 21:22 ` Cyrill Gorcunov
@ 2020-12-24 13:16 ` Cyrill Gorcunov
2020-12-24 17:15 ` Vladislav Shpilevoy
0 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-24 13:16 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> > Hi! Thanks for the patch!
> >
> > See 3 comments and fixes below, and top of the branch in a
> > separate commit.
> >
>
> Vlad, since we can use json_escape I dropped usage of base64
> encoder for backtrace. The final interdiff I pushed into
> gorcunov/gh-5261-crash-report-6 is the following
FWIW gorcunov/gh-5261-crash-report-6 passes all tests
https://gitlab.com/tarantool/tarantool/-/pipelines/234153923
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-24 13:16 ` Cyrill Gorcunov
@ 2020-12-24 17:15 ` Vladislav Shpilevoy
2020-12-24 17:33 ` Cyrill Gorcunov
0 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-24 17:15 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
On 24.12.2020 14:16, Cyrill Gorcunov wrote:
> On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
>> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
>>> Hi! Thanks for the patch!
>>>
>>> See 3 comments and fixes below, and top of the branch in a
>>> separate commit.
>>>
>>
>> Vlad, since we can use json_escape I dropped usage of base64
>> encoder for backtrace. The final interdiff I pushed into
>> gorcunov/gh-5261-crash-report-6 is the following
>
> FWIW gorcunov/gh-5261-crash-report-6 passes all tests
> https://gitlab.com/tarantool/tarantool/-/pipelines/234153923
The branch didn't work on my machine when I tried to crash Tarantool.
Because uname.version even being json-escaped contained a substring
which is not a valid Lua string. This was the broken string:
"Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2\/RELEASE_X86_64"
Lua didn't understand symbol sequence `\/`.
To workaround this I made the child process accept the dump data and
the feedback host via command line arguments. Now seems to be working.
I pushed my fixes on top of the branch, and pasted them below.
====================
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index e62af6836..ee4991b52 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -258,10 +258,10 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* modifications in future.
*/
size = (char *)uname_ptr - p;
- snprintf_safe("require(\'http.client\').post(\'%s\',"
- "'{\"crashdump\":{\"version\":\"%d\","
- "\"data\":", feedback_host,
- crashinfo_version);
+ snprintf_safe("{");
+ snprintf_safe("\"crashdump\":{");
+ snprintf_safe("\"version\":\"%d\",", crashinfo_version);
+ snprintf_safe("\"data\":");
/* The "data" key value */
snprintf_safe("{");
@@ -334,15 +334,20 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* main feedback daemon uses.
*/
size = e - p;
- snprintf_safe("}}',{timeout=1});os.exit(1);");
+ snprintf_safe("}");
+ snprintf_safe("}");
pr_debug("crashinfo script: %s", head);
- char *exec_argv[4] = {
+ char *exec_argv[7] = {
[0] = tarantool_path,
[1] = "-e",
- [2] = head,
- [3] = NULL,
+ [2] = "require('http.client').post(arg[1],arg[2],{timeout=1});"
+ "os.exit(1);",
+ [3] = "-",
+ [4] = feedback_host,
+ [5] = head,
+ [6] = NULL,
};
/*
@@ -360,9 +365,11 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
* is running fine.
*/
execve(exec_argv[0], exec_argv, environ);
- pr_crit("exec(%s,[%s,%s,%s]) failed",
+ pr_crit("exec(%s,[%s,%s,%s,%s,%s,%s,%s]) failed",
exec_argv[0], exec_argv[0],
- exec_argv[1], exec_argv[2]);
+ exec_argv[1], exec_argv[2],
+ exec_argv[3], exec_argv[4],
+ exec_argv[5], exec_argv[6]);
_exit(1);
} else if (pid < 0) {
pr_crit("unable to vfork (errno %d)", errno);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-24 17:15 ` Vladislav Shpilevoy
@ 2020-12-24 17:33 ` Cyrill Gorcunov
2020-12-24 18:22 ` Vladislav Shpilevoy
0 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-24 17:33 UTC (permalink / raw)
To: Vladislav Shpilevoy; +Cc: tml
On Thu, Dec 24, 2020 at 06:15:02PM +0100, Vladislav Shpilevoy wrote:
> On 24.12.2020 14:16, Cyrill Gorcunov wrote:
> > On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
> >> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> >>> Hi! Thanks for the patch!
> >>>
> >>> See 3 comments and fixes below, and top of the branch in a
> >>> separate commit.
> >>>
> >>
> >> Vlad, since we can use json_escape I dropped usage of base64
> >> encoder for backtrace. The final interdiff I pushed into
> >> gorcunov/gh-5261-crash-report-6 is the following
> >
> > FWIW gorcunov/gh-5261-crash-report-6 passes all tests
> > https://gitlab.com/tarantool/tarantool/-/pipelines/234153923
>
> The branch didn't work on my machine when I tried to crash Tarantool.
> Because uname.version even being json-escaped contained a substring
> which is not a valid Lua string. This was the broken string:
>
> "Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2\/RELEASE_X86_64"
>
> Lua didn't understand symbol sequence `\/`.
>
> To workaround this I made the child process accept the dump data and
> the feedback host via command line arguments. Now seems to be working.
>
> I pushed my fixes on top of the branch, and pasted them below.
Thanks a huge! Squashed into gorcunov/gh-5261-crash-report-7
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
2020-12-24 17:33 ` Cyrill Gorcunov
@ 2020-12-24 18:22 ` Vladislav Shpilevoy
2020-12-24 18:33 ` Cyrill Gorcunov
0 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-24 18:22 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: tml
I applied this diff and pushed to master.
====================
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -327,21 +327,17 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
snprintf_safe("}");
snprintf_safe("}");
- /*
- * Finalize the "data" key and the script.
- *
- * The timeout is choosen to be 1 second as
- * main feedback daemon uses.
- */
+ /* Finalize the "data" key and the whole dump. */
size = e - p;
snprintf_safe("}");
snprintf_safe("}");
- pr_debug("crashinfo script: %s", head);
+ pr_debug("crash dump: %s", head);
char *exec_argv[7] = {
[0] = tarantool_path,
[1] = "-e",
+ /* Timeout 1 sec is taken from the feedback daemon. */
[2] = "require('http.client').post(arg[1],arg[2],{timeout=1});"
"os.exit(1);",
[3] = "-",
^ permalink raw reply [flat|nested] 13+ messages in thread