Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage
@ 2020-12-23 15:41 Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

For this sake, if crash happens we can send it to our
feedback server.

In this series we implement only base functionality and
may extend it later.

I didn't find yet a simple way to test this code anything
but manually.

Any comments are highly appreciated.

v2:
 - left for internal use

v3 (by Mons):
 - enable sending crash report by default but give
   an ability to disable it
 - use vfork when running subsequent exeutable of
   tarantool to send report, since plain execve won't
   produce local crashdump

v4 (by Vlad):
 - move all fatal's signal handling to crash.c
 - export only those helpers which are really needed
 - update backtrace to use size as an arument
 - provide strlcpy if not present on the system
 - use static allocated buffer for report
 - use base64 library instead of own implmentation

v5:
 - by Vlad:
  - use crash_cfg name for crash engine configuration
  - configure crash engine via box helpers
  - drop unneded gaps from static allocations
  - encode in base64 form the backtrace only no need
    for second round
  - drop unused headers
  - prevent from chained crash storm
  - make sure no attempts to send feedback is done
    if feedback host was not properly configured
    (iow, if box.cfg{} was not even called)
  - do not use tnt_raise() but diag_set instead
 - by me:
  - zap special json characters

Vlad, take a look please, I put interdiff with
gorcunov/gh-5261-crash-report-4-rev2 below (note
it is interdiff with previous version not master
branch).

issue https://github.com/tarantool/tarantool/issues/5261
branch gorcunov/gh-5261-crash-report-5

Cyrill Gorcunov (4):
  util: introduce strlcpy helper
  backtrace: allow to specify destination buffer
  crash: move fatal signal handling in
  crash: report crash data to the feedback server

 CMakeLists.txt                  |   1 +
 src/box/box.cc                  |  18 +
 src/box/box.h                   |   1 +
 src/box/lua/cfg.cc              |   9 +
 src/box/lua/load_cfg.lua        |   6 +-
 src/lib/core/CMakeLists.txt     |   3 +-
 src/lib/core/backtrace.cc       |  12 +-
 src/lib/core/backtrace.h        |   3 +
 src/lib/core/crash.c            | 603 ++++++++++++++++++++++++++++++++
 src/lib/core/crash.h            |  44 +++
 src/lib/core/util.c             |  14 +
 src/main.cc                     | 139 +-------
 src/trivia/config.h.cmake       |   5 +
 src/trivia/util.h               |  15 +
 test/app-tap/init_script.result |   1 +
 test/box/admin.result           |   2 +
 test/box/cfg.result             |   4 +
 17 files changed, 739 insertions(+), 141 deletions(-)
 create mode 100644 src/lib/core/crash.c
 create mode 100644 src/lib/core/crash.h


base-commit: 28f3b2f1e845aff49048d92f9062a4dfa365bf57
-- 
diff --git a/src/box/box.cc b/src/box/box.cc
index 27079fd46..440bfa305 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1214,19 +1214,21 @@ box_set_prepared_stmt_cache_size(void)
 	return 0;
 }
 
-void
-box_set_crash_params(void)
+int
+box_set_crash(void)
 {
 	const char *host = cfg_gets("feedback_host");
 	bool is_enabled_1 = cfg_getb("feedback_enabled");
 	bool is_enabled_2 = cfg_getb("feedback_crashinfo");
 
 	if (host != NULL && strlen(host) >= CRASH_FEEDBACK_HOST_MAX) {
-		tnt_raise(ClientError, ER_CFG, "feedback_host",
+		diag_set(ClientError, ER_CFG, "feedback_host",
 			  "the address is too long");
+		return -1;
 	}
 
-	crash_cfg_set_params(host, is_enabled_1 && is_enabled_2);
+	crash_cfg(host, is_enabled_1 && is_enabled_2);
+	return 0;
 }
 
 /* }}} configuration bindings */
diff --git a/src/box/box.h b/src/box/box.h
index 6792ade95..69fa096a1 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -257,7 +257,7 @@ void box_set_replication_sync_timeout(void);
 void box_set_replication_skip_conflict(void);
 void box_set_replication_anon(void);
 void box_set_net_msg_max(void);
-void box_set_crash_params(void);
+int box_set_crash(void);
 
 int
 box_set_prepared_stmt_cache_size(void);
diff --git a/src/box/lua/cfg.cc b/src/box/lua/cfg.cc
index 089c770e8..2d3ccbf0e 100644
--- a/src/box/lua/cfg.cc
+++ b/src/box/lua/cfg.cc
@@ -376,13 +376,10 @@ lbox_cfg_set_replication_skip_conflict(struct lua_State *L)
 }
 
 static int
-lbox_cfg_set_crash_params(struct lua_State *L)
+lbox_cfg_set_crash(struct lua_State *L)
 {
-	try {
-		box_set_crash_params();
-	} catch (Exception *) {
+	if (box_set_crash() != 0)
 		luaT_error(L);
-	}
 	return 0;
 }
 
@@ -422,7 +419,7 @@ box_lua_cfg_init(struct lua_State *L)
 		{"cfg_set_replication_anon", lbox_cfg_set_replication_anon},
 		{"cfg_set_net_msg_max", lbox_cfg_set_net_msg_max},
 		{"cfg_set_sql_cache_size", lbox_set_prepared_stmt_cache_size},
-		{"cfg_set_crash_params", lbox_cfg_set_crash_params},
+		{"cfg_set_crash", lbox_cfg_set_crash},
 		{NULL, NULL}
 	};
 
diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
index 80009c7af..7e41e0999 100644
--- a/src/box/lua/load_cfg.lua
+++ b/src/box/lua/load_cfg.lua
@@ -34,7 +34,7 @@ end
 local ifdef_feedback_set_params =
     private.feedback_daemon ~= nil and
     private.feedback_daemon.set_feedback_params and
-    private.cfg_set_crash_params or nil
+    private.cfg_set_crash or nil
 
 -- all available options
 local default_cfg = {
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index f6dd91987..19d3d49ea 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -5,7 +5,9 @@
  */
 
 #include <string.h>
-#include <unistd.h>
+#include <signal.h>
+#include <stdint.h>
+#include <limits.h>
 #include <time.h>
 #include <sys/types.h>
 #include <sys/utsname.h>
@@ -13,6 +15,7 @@
 #include "third_party/base64.h"
 #include "small/static.h"
 #include "trivia/util.h"
+
 #include "backtrace.h"
 #include "crash.h"
 #include "say.h"
@@ -25,7 +28,7 @@
 #define pr_crit(fmt, ...)	fprintf(stderr, pr_fmt(fmt) "\n", ##__VA_ARGS__)
 #define pr_panic(fmt, ...)	panic(pr_fmt(fmt), ##__VA_ARGS__)
 
-/* Use strlcpy with destination as an array */
+/** Use strlcpy with destination as an array */
 #define strlcpy_a(dst, src) strlcpy(dst, src, sizeof(dst))
 
 #ifdef TARGET_OS_LINUX
@@ -67,8 +70,10 @@ struct crash_greg {
 static struct crash_info {
 	/**
 	 * These two are mostly useless as being
-	 * plain addresses but keep for backward
-	 * compatibility.
+	 * plain addresses and without real binary
+	 * crash dump file we can't use them for
+	 * anything suitable (in terms of analysis sake)
+	 * but keep for backward compatibility.
 	 */
 	void *context_addr;
 	void *siginfo_addr;
@@ -95,7 +100,7 @@ static struct crash_info {
 	 */
 	int sicode;
 #ifdef ENABLE_BACKTRACE
-	/*
+	/**
 	 * 1K of memory should be enough to keep the backtrace.
 	 * In worst case it gonna be simply trimmed.
 	 */
@@ -142,7 +147,7 @@ crash_init(const char *tarantool_bin)
 }
 
 void
-crash_cfg_set_params(const char *host, bool is_enabled)
+crash_cfg(const char *host, bool is_enabled)
 {
 	if (host == NULL || !is_enabled) {
 		if (send_crashinfo) {
@@ -155,8 +160,12 @@ crash_cfg_set_params(const char *host, bool is_enabled)
 
 	if (strcmp(feedback_host, host) != 0) {
 		strlcpy_a(feedback_host, host);
-		if (strlen(feedback_host) < strlen(host))
-			pr_panic("feedback_host is too long");
+		/*
+		 * The caller should have tested already
+		 * that there is enough space to keep
+		 * the host address.
+		 */
+		assert(strlen(feedback_host) == strlen(host));
 	}
 
 	if (!send_crashinfo) {
@@ -231,6 +240,31 @@ crash_mark_env_mode(void)
 	return 0;
 }
 
+/**
+ * Zap reserved symbols.
+ */
+static char *
+json_zap(char *s)
+{
+	if (s == NULL)
+		return NULL;
+
+	/*
+	 * Actually we should escape them but for
+	 * now lets just zap without changing source
+	 * size.
+	 */
+	for (size_t i = 0; i < strlen(s); i++) {
+		if (s[i] == '\"' || s[i] == '\b' ||
+		    s[i] == '\f' || s[i] == '\n' ||
+		    s[i] == '\r' || s[i] == '\t' ||
+		    s[i] == '\\') {
+			s[i] = ' ';
+		}
+	}
+	return s;
+}
+
 /**
  * Report crash information to the feedback daemon
  * (ie send it to feedback daemon).
@@ -242,14 +276,14 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 		return;
 
 	/*
-	 * Update to a new number if format get changed.
+	 * Update to a new number if the format get changed.
 	 */
 	const int crashinfo_version = 1;
 
 	char *p = static_alloc(SMALL_STATIC_SIZE);
-	char *e = &p[SMALL_STATIC_SIZE - 1];
+	char *tail = &p[SMALL_STATIC_SIZE];
+	char *e = &p[SMALL_STATIC_SIZE];
 	char *head = p;
-	char *tail = &p[SMALL_STATIC_SIZE - 8];
 
 	/*
 	 * Note that while we encode the report we
@@ -257,19 +291,19 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * buffer as a temporary store.
 	 */
 
-#define snprintf_safe(__end, __fmt, ...)				\
-	do {								\
-		size_t size = (char *)(void *)__end - p;		\
-		p += snprintf(p, size, __fmt, ##__VA_ARGS__);		\
-		if (p >= (char *)(void *)__end)				\
-			goto out;					\
+#define snprintf_safe(__end, __fmt, ...)			\
+	do {							\
+		size_t size = (char *)__end - p;		\
+		p += snprintf(p, size, __fmt, ##__VA_ARGS__);	\
+		if (p >= (char *)__end)				\
+			goto out;				\
 	} while (0)
 
 	/*
 	 * Lets reuse tail of the buffer as a temp space.
 	 */
 	struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
-	if (p >= (char *)(void *)uname_ptr)
+	if (p >= (char *)uname_ptr)
 		goto out;
 
 	if (uname(uname_ptr) != 0) {
@@ -277,15 +311,31 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 		memset(uname_ptr, 0, sizeof(struct utsname));
 	}
 
+	/*
+	 * Start filling the script. The "data" key value is
+	 * filled as a separate code block for easier
+	 * modifications in future.
+	 */
+	snprintf_safe(uname_ptr,
+		      "require(\'http.client\').post(\'%s\',"
+		      "'{\"crashdump\":{\"version\":\"%d\","
+		      "\"data\":", feedback_host,
+		      crashinfo_version);
+
+	/* The "data" key value */
 	snprintf_safe(uname_ptr, "{");
 	snprintf_safe(uname_ptr, "\"uname\":{");
-	snprintf_safe(uname_ptr, "\"sysname\":\"%s\",", uname_ptr->sysname);
+	snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
+		      json_zap(uname_ptr->sysname));
 	/*
-	 * nodename might a sensitive information, skip.
+	 * nodename might contain a sensitive information, skip.
 	 */
-	snprintf_safe(uname_ptr, "\"release\":\"%s\",", uname_ptr->release);
-	snprintf_safe(uname_ptr, "\"version\":\"%s\",", uname_ptr->version);
-	snprintf_safe(uname_ptr, "\"machine\":\"%s\"", uname_ptr->machine);
+	snprintf_safe(uname_ptr, "\"release\":\"%s\",",
+		      json_zap(uname_ptr->release));
+	snprintf_safe(uname_ptr, "\"version\":\"%s\",",
+		      json_zap(uname_ptr->version));
+	snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
+		      json_zap(uname_ptr->machine));
 	snprintf_safe(uname_ptr, "},");
 
 	snprintf_safe(e, "\"build\":{");
@@ -326,36 +376,23 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 #endif
 
 	/* 64 bytes should be enough for longest localtime */
-	char *timestamp_rt_str = &tail[-64];
+	const int ts_size = 64;
+	char *timestamp_rt_str = &tail[-ts_size];
 	if (p >= timestamp_rt_str)
 		goto out;
-	ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, 64);
-	snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"", timestamp_rt_str);
+	ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
+	snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
+		      json_zap(timestamp_rt_str));
 	snprintf_safe(timestamp_rt_str, "}");
 	snprintf_safe(timestamp_rt_str, "}");
 
-	size_t report_len = p - head;
-	size_t report_elen = base64_bufsize(report_len, BASE64_NOWRAP);
-
-	char *report_base64 = &tail[-report_elen];
-	if (p >= report_base64)
-		goto out;
-	base64_encode(head, report_len, report_base64,
-		      report_elen, BASE64_NOWRAP);
-	report_base64[report_elen] = '\0';
-
 	/*
-	 * Encoded report now sits at report_base64 position,
-	 * at the tail of 'small' static buffer. Lets prepare
-	 * the script to run.
+	 * Finalize the "data" key and the script.
+	 *
+	 * The timeout is choosen to be 1 second as
+	 * main feedback daemon uses.
 	 */
-	p = head;
-	snprintf_safe(report_base64,
-		      "require(\'http.client\').post(\'%s\',"
-		      "'{\"crashdump\":{\"version\":\"%d\","
-		      "\"data\": \"%s\"}}',{timeout=1});"
-		      "os.exit(1);", feedback_host,
-		      crashinfo_version, report_base64);
+	snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
 
 	pr_debug("crashinfo script: %s", head);
 
diff --git a/src/lib/core/crash.h b/src/lib/core/crash.h
index 800d525e5..195aef10b 100644
--- a/src/lib/core/crash.h
+++ b/src/lib/core/crash.h
@@ -5,12 +5,6 @@
  */
 #pragma once
 
-#include <stdint.h>
-#include <signal.h>
-#include <limits.h>
-
-#include "trivia/config.h"
-
 #if defined(__cplusplus)
 extern "C" {
 #endif /* defined(__cplusplus) */
@@ -31,7 +25,7 @@ crash_init(const char *tarantool_bin);
  * Configure crash parameters.
  */
 void
-crash_cfg_set_params(const char *host, bool is_enabled);
+crash_cfg(const char *host, bool is_enabled);
 
 /**
  * Initialize crash signal handlers.
diff --git a/src/trivia/util.h b/src/trivia/util.h
index b8367cc77..15f9881db 100644
--- a/src/trivia/util.h
+++ b/src/trivia/util.h
@@ -452,7 +452,7 @@ fpconv_strtod(const char *nptr, char **endptr)
 
 #ifndef HAVE_STRLCPY
 /**
- * Copy string. Unlike \a strncpy the result string
+ * Copy string. Unlike @a strncpy the result string
  * is always null-terminated.
  *
  * @param dst destination buffer.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper
  2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

Very convenient to have this string extension.
We will use it in crash handling.

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 CMakeLists.txt            |  1 +
 src/lib/core/util.c       | 14 ++++++++++++++
 src/trivia/config.h.cmake |  5 +++++
 src/trivia/util.h         | 15 +++++++++++++++
 4 files changed, 35 insertions(+)

diff --git a/CMakeLists.txt b/CMakeLists.txt
index fa6818f8e..aa23113b3 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -126,6 +126,7 @@ set(CMAKE_REQUIRED_LIBRARIES "")
 check_function_exists(clock_gettime HAVE_CLOCK_GETTIME_WITHOUT_RT)
 
 check_symbol_exists(__get_cpuid cpuid.h HAVE_CPUID)
+check_symbol_exists(strlcpy string.h HAVE_STRLCPY)
 
 # Checks for libev
 include(CheckStructHasMember)
diff --git a/src/lib/core/util.c b/src/lib/core/util.c
index dfce317f0..f29886105 100644
--- a/src/lib/core/util.c
+++ b/src/lib/core/util.c
@@ -250,6 +250,20 @@ int2str(long long int val)
 	return buf;
 }
 
+#ifndef HAVE_STRLCPY
+size_t
+strlcpy(char *dst, const char *src, size_t size)
+{
+	size_t src_len = strlen(src);
+	if (size != 0) {
+		size_t len = (src_len >= size) ? size - 1 : src_len;
+		memcpy(dst, src, len);
+		dst[len] = '\0';
+	}
+	return src_len;
+}
+#endif
+
 int
 utf8_check_printable(const char *start, size_t length)
 {
diff --git a/src/trivia/config.h.cmake b/src/trivia/config.h.cmake
index 89e0d39c6..21efb978e 100644
--- a/src/trivia/config.h.cmake
+++ b/src/trivia/config.h.cmake
@@ -47,6 +47,11 @@
  */
 #cmakedefine HAVE_CPUID 1
 
+/**
+ * Defined if strlcpy() string extension helper present.
+ */
+ #cmakedefine HAVE_STRLCPY 1
+
 /*
  * Defined if gcov instrumentation should be enabled.
  */
diff --git a/src/trivia/util.h b/src/trivia/util.h
index da5a3705e..15f9881db 100644
--- a/src/trivia/util.h
+++ b/src/trivia/util.h
@@ -450,6 +450,21 @@ fpconv_strtod(const char *nptr, char **endptr)
 	return strtod(nptr, endptr);
 }
 
+#ifndef HAVE_STRLCPY
+/**
+ * Copy string. Unlike @a strncpy the result string
+ * is always null-terminated.
+ *
+ * @param dst destination buffer.
+ * @param src source string.
+ * @param size destination buffer size.
+ *
+ * @return size of @a src string.
+ */
+size_t
+strlcpy(char *dst, const char *src, size_t size);
+#endif
+
 /**
  * Check that @a str is valid utf-8 sequence and can be printed
  * unescaped.
-- 
2.26.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer
  2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
  3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

This will allow to reuse this routine in crash reports.

Part-of #5261

Acked-by: Serge Petrenko <sergepetrenko@tarantool.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/lib/core/backtrace.cc | 12 ++++++------
 src/lib/core/backtrace.h  |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/lib/core/backtrace.cc b/src/lib/core/backtrace.cc
index 456ce9a4d..b4048089f 100644
--- a/src/lib/core/backtrace.cc
+++ b/src/lib/core/backtrace.cc
@@ -131,7 +131,7 @@ get_proc_name(unw_cursor_t *unw_cur, unw_word_t *offset, bool skip_cache)
 }
 
 char *
-backtrace(void)
+backtrace(char *start, size_t size)
 {
 	int frame_no = 0;
 	unw_word_t sp = 0, old_sp = 0, ip, offset;
@@ -139,10 +139,9 @@ backtrace(void)
 	unw_getcontext(&unw_context);
 	unw_cursor_t unw_cur;
 	unw_init_local(&unw_cur, &unw_context);
-	char *backtrace_buf = (char *)static_alloc(SMALL_STATIC_SIZE);
-	char *p = backtrace_buf;
-	char *end = p + SMALL_STATIC_SIZE - 1;
 	int unw_status;
+	char *p = start;
+	char *end = start + size - 1;
 	*p = '\0';
 	while ((unw_status = unw_step(&unw_cur)) > 0) {
 		const char *proc;
@@ -174,7 +173,7 @@ backtrace(void)
 		say_debug("unwinding error: %i", unw_status);
 #endif
 out:
-	return backtrace_buf;
+	return start;
 }
 
 /*
@@ -436,7 +435,8 @@ backtrace_foreach(backtrace_cb cb, coro_context *coro_ctx, void *cb_ctx)
 void
 print_backtrace(void)
 {
-	fdprintf(STDERR_FILENO, "%s", backtrace());
+	char *start = (char *)static_alloc(SMALL_STATIC_SIZE);
+	fdprintf(STDERR_FILENO, "%s", backtrace(start, SMALL_STATIC_SIZE));
 }
 #endif /* ENABLE_BACKTRACE */
 
diff --git a/src/lib/core/backtrace.h b/src/lib/core/backtrace.h
index c119d5402..e0ae56be4 100644
--- a/src/lib/core/backtrace.h
+++ b/src/lib/core/backtrace.h
@@ -40,6 +40,9 @@ extern "C" {
 #ifdef ENABLE_BACKTRACE
 #include <coro.h>
 
+char *
+backtrace(char *start, size_t size);
+
 void print_backtrace(void);
 
 typedef int (backtrace_cb)(int frameno, void *frameret,
-- 
2.26.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in
  2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
  3 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

When SIGSEGV or SIGFPE reaches the tarantool we try to gather
all information related to the crash and print it out to the
console (well, stderr actually). Still there is a request
to not just show this info locally but send it out to the
feedback server.

Thus to keep gathering crash related information in one module,
we move fatal signal handling into the separate crash.c file.
This allows us to collect the data we need in one place and
reuse it when we need to send reports to stderr (and to the
feedback server, which will be implemented in next patch).

Part-of #5261

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/lib/core/CMakeLists.txt |   1 +
 src/lib/core/crash.c        | 295 ++++++++++++++++++++++++++++++++++++
 src/lib/core/crash.h        |  26 ++++
 src/main.cc                 | 138 +----------------
 4 files changed, 327 insertions(+), 133 deletions(-)
 create mode 100644 src/lib/core/crash.c
 create mode 100644 src/lib/core/crash.h

diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 7c62fc5ce..30cf0dd15 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -1,5 +1,6 @@
 set(core_sources
     diag.c
+    crash.c
     say.c
     memory.c
     clock.c
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
new file mode 100644
index 000000000..3929463f3
--- /dev/null
+++ b/src/lib/core/crash.c
@@ -0,0 +1,295 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
+ */
+
+#include <string.h>
+#include <signal.h>
+#include <stdint.h>
+#include <time.h>
+
+#include "trivia/util.h"
+
+#include "backtrace.h"
+#include "crash.h"
+#include "say.h"
+
+#define pr_fmt(fmt)		"crash: " fmt
+#define pr_syserr(fmt, ...)	say_syserror(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_panic(fmt, ...)	panic(pr_fmt(fmt), ##__VA_ARGS__)
+
+#ifdef TARGET_OS_LINUX
+#ifndef __x86_64__
+# error "Non x86-64 architectures are not supported"
+#endif
+struct crash_greg {
+	uint64_t	r8;
+	uint64_t	r9;
+	uint64_t	r10;
+	uint64_t	r11;
+	uint64_t	r12;
+	uint64_t	r13;
+	uint64_t	r14;
+	uint64_t	r15;
+	uint64_t	di;
+	uint64_t	si;
+	uint64_t	bp;
+	uint64_t	bx;
+	uint64_t	dx;
+	uint64_t	ax;
+	uint64_t	cx;
+	uint64_t	sp;
+	uint64_t	ip;
+	uint64_t	flags;
+	uint16_t	cs;
+	uint16_t	gs;
+	uint16_t	fs;
+	uint16_t	ss;
+	uint64_t	err;
+	uint64_t	trapno;
+	uint64_t	oldmask;
+	uint64_t	cr2;
+	uint64_t	fpstate;
+	uint64_t	reserved1[8];
+};
+#endif /* TARGET_OS_LINUX */
+
+static struct crash_info {
+	/**
+	 * These two are mostly useless as being
+	 * plain addresses and without real binary
+	 * crash dump file we can't use them for
+	 * anything suitable (in terms of analysis sake)
+	 * but keep for backward compatibility.
+	 */
+	void *context_addr;
+	void *siginfo_addr;
+#ifdef TARGET_OS_LINUX
+	/**
+	 * Registers contents.
+	 */
+	struct crash_greg greg;
+#endif
+	/**
+	 * Faulting address.
+	 */
+	void *siaddr;
+	/**
+	 * Crash signal number.
+	 */
+	int signo;
+	/**
+	 * Crash signal code.
+	 */
+	int sicode;
+#ifdef ENABLE_BACKTRACE
+	/**
+	 * 1K of memory should be enough to keep the backtrace.
+	 * In worst case it gonna be simply trimmed.
+	 */
+	char backtrace_buf[1024];
+#endif
+} crash_info;
+
+/**
+ * The routine is called inside crash signal handler so
+ * be careful to not cause additional signals inside.
+ */
+static struct crash_info *
+crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
+{
+	struct crash_info *cinfo = &crash_info;
+
+	cinfo->signo = signo;
+	cinfo->sicode = siginfo->si_code;
+	cinfo->siaddr = siginfo->si_addr;
+	cinfo->context_addr = ucontext;
+	cinfo->siginfo_addr = siginfo;
+
+#ifdef ENABLE_BACKTRACE
+	char *start = cinfo->backtrace_buf;
+	backtrace(start, sizeof(cinfo->backtrace_buf));
+#endif
+
+#ifdef TARGET_OS_LINUX
+	/*
+	 * uc_mcontext on libc level looks somehow strange,
+	 * they define an array of uint64_t where each register
+	 * defined by REG_x macro.
+	 *
+	 * In turn the kernel is quite explicit about the context.
+	 * Moreover it is a part of user ABI, thus won't be changed.
+	 *
+	 * Lets use memcpy here to make a copy in a fast way.
+	 */
+	ucontext_t *uc = ucontext;
+	memcpy(&cinfo->greg, &uc->uc_mcontext, sizeof(cinfo->greg));
+#endif
+
+	return cinfo;
+}
+
+/**
+ * Report crash information to the stderr
+ * (usually a current console).
+ */
+static void
+crash_report_stderr(struct crash_info *cinfo)
+{
+	if (cinfo->signo == SIGSEGV) {
+		fprintf(stderr, "Segmentation fault\n");
+		const char *signal_code_repr = NULL;
+
+		switch (cinfo->sicode) {
+		case SEGV_MAPERR:
+			signal_code_repr = "SEGV_MAPERR";
+			break;
+		case SEGV_ACCERR:
+			signal_code_repr = "SEGV_ACCERR";
+			break;
+		}
+
+		if (signal_code_repr != NULL)
+			fprintf(stderr, "  code: %s\n", signal_code_repr);
+		else
+			fprintf(stderr, "  code: %d\n", cinfo->sicode);
+		/*
+		 * fprintf is used instead of fdprintf, because
+		 * fdprintf does not understand %p
+		 */
+		fprintf(stderr, "  addr: %p\n", cinfo->siaddr);
+	} else {
+		fprintf(stderr, "Got a fatal signal %d\n", cinfo->signo);
+	}
+
+	fprintf(stderr, "  context: %p\n", cinfo->context_addr);
+	fprintf(stderr, "  siginfo: %p\n", cinfo->siginfo_addr);
+
+#ifdef TARGET_OS_LINUX
+# define fprintf_reg(__n, __v)				\
+	fprintf(stderr, "  %-9s0x%-17llx%lld\n",	\
+		__n, (long long)__v, (long long)__v)
+	fprintf_reg("rax", cinfo->greg.ax);
+	fprintf_reg("rbx", cinfo->greg.bx);
+	fprintf_reg("rcx", cinfo->greg.cx);
+	fprintf_reg("rdx", cinfo->greg.dx);
+	fprintf_reg("rsi", cinfo->greg.si);
+	fprintf_reg("rdi", cinfo->greg.di);
+	fprintf_reg("rsp", cinfo->greg.sp);
+	fprintf_reg("rbp", cinfo->greg.bp);
+	fprintf_reg("r8", cinfo->greg.r8);
+	fprintf_reg("r9", cinfo->greg.r9);
+	fprintf_reg("r10", cinfo->greg.r10);
+	fprintf_reg("r11", cinfo->greg.r11);
+	fprintf_reg("r12", cinfo->greg.r12);
+	fprintf_reg("r13", cinfo->greg.r13);
+	fprintf_reg("r14", cinfo->greg.r14);
+	fprintf_reg("r15", cinfo->greg.r15);
+	fprintf_reg("rip", cinfo->greg.ip);
+	fprintf_reg("eflags", cinfo->greg.flags);
+	fprintf_reg("cs", cinfo->greg.cs);
+	fprintf_reg("gs", cinfo->greg.gs);
+	fprintf_reg("fs", cinfo->greg.fs);
+	fprintf_reg("cr2", cinfo->greg.cr2);
+	fprintf_reg("err", cinfo->greg.err);
+	fprintf_reg("oldmask", cinfo->greg.oldmask);
+	fprintf_reg("trapno", cinfo->greg.trapno);
+# undef fprintf_reg
+#endif /* TARGET_OS_LINUX */
+
+	fprintf(stderr, "Current time: %u\n", (unsigned)time(0));
+	fprintf(stderr, "Please file a bug at "
+		"http://github.com/tarantool/tarantool/issues\n");
+
+#ifdef ENABLE_BACKTRACE
+	fprintf(stderr, "Attempting backtrace... Note: since the server has "
+		"already crashed, \nthis may fail as well\n");
+	fprintf(stderr, "%s", cinfo->backtrace_buf);
+#endif
+}
+
+/**
+ * Handle fatal (crashing) signal.
+ *
+ * Try to log as much as possible before dumping a core.
+ *
+ * Core files are not always allowed and it takes an effort to
+ * extract useful information from them.
+ *
+ * *Recursive invocation*
+ *
+ * Unless SIGSEGV is sent by kill(), Linux resets the signal
+ * a default value before invoking the handler.
+ *
+ * Despite that, as an extra precaution to avoid infinite
+ * recursion, we count invocations of the handler, and
+ * quietly _exit() when called for a second time.
+ */
+static void
+crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
+{
+	static volatile sig_atomic_t in_cb = 0;
+	struct crash_info *cinfo;
+
+	if (in_cb == 0) {
+		in_cb = 1;
+		cinfo = crash_collect(signo, siginfo, context);
+		crash_report_stderr(cinfo);
+	} else {
+		/* Got a signal while running the handler. */
+		fprintf(stderr, "Fatal %d while backtracing", signo);
+	}
+
+	/* Try to dump a core */
+	struct sigaction sa = {
+		.sa_handler = SIG_DFL,
+	};
+	sigemptyset(&sa.sa_mask);
+	sigaction(SIGABRT, &sa, NULL);
+	abort();
+}
+
+/**
+ * Fatal signals we generate crash on.
+ */
+static const int crash_signals[] = { SIGSEGV, SIGFPE };
+
+void
+crash_signal_reset(void)
+{
+	struct sigaction sa = {
+		.sa_handler = SIG_DFL,
+	};
+	sigemptyset(&sa.sa_mask);
+
+	for (size_t i = 0; i < lengthof(crash_signals); i++) {
+		if (sigaction(crash_signals[i], &sa, NULL) == 0)
+			continue;
+		pr_syserr("reset sigaction %d", crash_signals[i]);
+	}
+}
+
+void
+crash_signal_init(void)
+{
+	/*
+	 * SA_RESETHAND resets handler action to the default
+	 * one when entering handler.
+	 *
+	 * SA_NODEFER allows receiving the same signal
+	 * during handler.
+	 */
+	struct sigaction sa = {
+		.sa_flags = SA_RESETHAND | SA_NODEFER | SA_SIGINFO,
+		.sa_sigaction = crash_signal_cb,
+	};
+	sigemptyset(&sa.sa_mask);
+
+	for (size_t i = 0; i < lengthof(crash_signals); i++) {
+		if (sigaction(crash_signals[i], &sa, NULL) == 0)
+			continue;
+		pr_panic("sigaction %d (%s)", crash_signals[i],
+			 strerror(errno));
+	}
+}
diff --git a/src/lib/core/crash.h b/src/lib/core/crash.h
new file mode 100644
index 000000000..cd1db585e
--- /dev/null
+++ b/src/lib/core/crash.h
@@ -0,0 +1,26 @@
+/*
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
+ */
+#pragma once
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+/**
+ * Initialize crash signal handlers.
+ */
+void
+crash_signal_init(void);
+
+/**
+ * Reset crash signal handlers.
+ */
+void
+crash_signal_reset(void);
+
+#if defined(__cplusplus)
+}
+#endif /* defined(__cplusplus) */
diff --git a/src/main.cc b/src/main.cc
index 2f48f474c..391e0f878 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -79,6 +79,7 @@
 #include "systemd.h"
 #include "crypto/crypto.h"
 #include "core/popen.h"
+#include "core/crash.h"
 
 static pid_t master_pid = getpid();
 static struct pidfh *pid_file_handle;
@@ -184,124 +185,6 @@ signal_sigwinch_cb(ev_loop *loop, struct ev_signal *w, int revents)
 		rl_resize_terminal();
 }
 
-#if defined(__linux__) && defined(__amd64)
-
-inline void
-dump_x86_64_register(const char *reg_name, unsigned long long val)
-{
-	fprintf(stderr, "  %-9s0x%-17llx%lld\n", reg_name, val, val);
-}
-
-void
-dump_x86_64_registers(ucontext_t *uc)
-{
-	dump_x86_64_register("rax", uc->uc_mcontext.gregs[REG_RAX]);
-	dump_x86_64_register("rbx", uc->uc_mcontext.gregs[REG_RBX]);
-	dump_x86_64_register("rcx", uc->uc_mcontext.gregs[REG_RCX]);
-	dump_x86_64_register("rdx", uc->uc_mcontext.gregs[REG_RDX]);
-	dump_x86_64_register("rsi", uc->uc_mcontext.gregs[REG_RSI]);
-	dump_x86_64_register("rdi", uc->uc_mcontext.gregs[REG_RDI]);
-	dump_x86_64_register("rsp", uc->uc_mcontext.gregs[REG_RSP]);
-	dump_x86_64_register("rbp", uc->uc_mcontext.gregs[REG_RBP]);
-	dump_x86_64_register("r8", uc->uc_mcontext.gregs[REG_R8]);
-	dump_x86_64_register("r9", uc->uc_mcontext.gregs[REG_R9]);
-	dump_x86_64_register("r10", uc->uc_mcontext.gregs[REG_R10]);
-	dump_x86_64_register("r11", uc->uc_mcontext.gregs[REG_R11]);
-	dump_x86_64_register("r12", uc->uc_mcontext.gregs[REG_R12]);
-	dump_x86_64_register("r13", uc->uc_mcontext.gregs[REG_R13]);
-	dump_x86_64_register("r14", uc->uc_mcontext.gregs[REG_R14]);
-	dump_x86_64_register("r15", uc->uc_mcontext.gregs[REG_R15]);
-	dump_x86_64_register("rip", uc->uc_mcontext.gregs[REG_RIP]);
-	dump_x86_64_register("eflags", uc->uc_mcontext.gregs[REG_EFL]);
-	dump_x86_64_register("cs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 0) & 0xffff);
-	dump_x86_64_register("gs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 16) & 0xffff);
-	dump_x86_64_register("fs", (uc->uc_mcontext.gregs[REG_CSGSFS] >> 32) & 0xffff);
-	dump_x86_64_register("cr2", uc->uc_mcontext.gregs[REG_CR2]);
-	dump_x86_64_register("err", uc->uc_mcontext.gregs[REG_ERR]);
-	dump_x86_64_register("oldmask", uc->uc_mcontext.gregs[REG_OLDMASK]);
-	dump_x86_64_register("trapno", uc->uc_mcontext.gregs[REG_TRAPNO]);
-}
-
-#endif /* defined(__linux__) && defined(__amd64) */
-
-/** Try to log as much as possible before dumping a core.
- *
- * Core files are not aways allowed and it takes an effort to
- * extract useful information from them.
- *
- * *Recursive invocation*
- *
- * Unless SIGSEGV is sent by kill(), Linux
- * resets the signal a default value before invoking
- * the handler.
- *
- * Despite that, as an extra precaution to avoid infinite
- * recursion, we count invocations of the handler, and
- * quietly _exit() when called for a second time.
- */
-static void
-sig_fatal_cb(int signo, siginfo_t *siginfo, void *context)
-{
-	static volatile sig_atomic_t in_cb = 0;
-	int fd = STDERR_FILENO;
-	struct sigaction sa;
-
-	/* Got a signal while running the handler. */
-	if (in_cb) {
-		fdprintf(fd, "Fatal %d while backtracing", signo);
-		goto end;
-	}
-
-	in_cb = 1;
-
-	if (signo == SIGSEGV) {
-		fdprintf(fd, "Segmentation fault\n");
-		const char *signal_code_repr = 0;
-		switch (siginfo->si_code) {
-		case SEGV_MAPERR:
-			signal_code_repr = "SEGV_MAPERR";
-			break;
-		case SEGV_ACCERR:
-			signal_code_repr = "SEGV_ACCERR";
-			break;
-		}
-		if (signal_code_repr)
-			fdprintf(fd, "  code: %s\n", signal_code_repr);
-		else
-			fdprintf(fd, "  code: %d\n", siginfo->si_code);
-		/*
-		 * fprintf is used insted of fdprintf, because
-		 * fdprintf does not understand %p
-		 */
-		fprintf(stderr, "  addr: %p\n", siginfo->si_addr);
-	} else
-		fdprintf(fd, "Got a fatal signal %d\n", signo);
-	fprintf(stderr, "  context: %p\n", context);
-	fprintf(stderr, "  siginfo: %p\n", siginfo);
-
-#if defined(__linux__) && defined(__amd64)
-	dump_x86_64_registers((ucontext_t *)context);
-#endif
-
-	fdprintf(fd, "Current time: %u\n", (unsigned) time(0));
-	fdprintf(fd,
-		 "Please file a bug at http://github.com/tarantool/tarantool/issues\n");
-
-#ifdef ENABLE_BACKTRACE
-	fdprintf(fd, "Attempting backtrace... Note: since the server has "
-		 "already crashed, \nthis may fail as well\n");
-	print_backtrace();
-#endif
-end:
-	/* Try to dump core. */
-	memset(&sa, 0, sizeof(sa));
-	sigemptyset(&sa.sa_mask);
-	sa.sa_handler = SIG_DFL;
-	sigaction(SIGABRT, &sa, NULL);
-
-	abort();
-}
-
 static void
 signal_free(void)
 {
@@ -328,11 +211,11 @@ signal_reset(void)
 	    sigaction(SIGINT, &sa, NULL) == -1 ||
 	    sigaction(SIGTERM, &sa, NULL) == -1 ||
 	    sigaction(SIGHUP, &sa, NULL) == -1 ||
-	    sigaction(SIGWINCH, &sa, NULL) == -1 ||
-	    sigaction(SIGSEGV, &sa, NULL) == -1 ||
-	    sigaction(SIGFPE, &sa, NULL) == -1)
+	    sigaction(SIGWINCH, &sa, NULL) == -1)
 		say_syserror("sigaction");
 
+	crash_signal_reset();
+
 	/* Unblock any signals blocked by libev. */
 	sigset_t sigset;
 	sigfillset(&sigset);
@@ -362,18 +245,7 @@ signal_init(void)
 	if (sigaction(SIGPIPE, &sa, 0) == -1)
 		panic_syserror("sigaction");
 
-	/*
-	 * SA_RESETHAND resets handler action to the default
-	 * one when entering handler.
-	 * SA_NODEFER allows receiving the same signal during handler.
-	 */
-	sa.sa_flags = SA_RESETHAND | SA_NODEFER | SA_SIGINFO;
-	sa.sa_sigaction = sig_fatal_cb;
-
-	if (sigaction(SIGSEGV, &sa, 0) == -1 ||
-	    sigaction(SIGFPE, &sa, 0) == -1) {
-		panic_syserror("sigaction");
-	}
+	crash_signal_init();
 
 	ev_signal_init(&ev_sigs[0], sig_checkpoint, SIGUSR1);
 	ev_signal_init(&ev_sigs[1], signal_cb, SIGINT);
-- 
2.26.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
                   ` (2 preceding siblings ...)
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in Cyrill Gorcunov
@ 2020-12-23 15:41 ` Cyrill Gorcunov
  2020-12-23 18:47   ` Vladislav Shpilevoy
  3 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 15:41 UTC (permalink / raw)
  To: tml; +Cc: Vladislav Shpilevoy

We have a feedback server which gathers information about a running instance.
While general info is enough for now we may loose a precious information about
crashes (such as call backtrace which caused the issue, type of build and etc).

In the commit we add support of sending this kind of information to the feedback
server. Internally we gather the reason of failure, pack it into base64 form
and then run another Tarantool instance which sends it out.

A typical report might look like

 | {
 |   "crashdump": {
 |     "version": "1",
 |     "data": {
 |       "uname": {
 |         "sysname": "Linux",
 |         "release": "5.9.14-100.fc32.x86_64",
 |         "version": "#1 SMP Fri Dec 11 14:30:38 UTC 2020",
 |         "machine": "x86_64"
 |       },
 |       "build": {
 |         "version": "2.7.0-115-g360565efb",
 |         "cmake_type": "Linux-x86_64-Debug"
 |       },
 |       "signal": {
 |         "signo": 11,
 |         "si_code": 0,
 |         "si_addr": "0x3e800004838",
 |         "backtrace": "IzAgIDB4NjMwNzM0IGluIGNyYXNoX2NvbGxlY3...",
 |         "timestamp": "2020-12-23 14:42:10 MSK"
 |       }
 |     }
 |   }
 | }

The `backtrace` itself is encoded as base64 because of newline symbols
(and may comprise some nonascii symbols as well).

There is no simple way to test this so I did it manually:
1) Run instance with

	box.cfg{log_level = 8, feedback_host="127.0.0.1:1500"}

2) Run listener shell as

	while true ; do nc -l -p 1500 -c 'echo -e "HTTP/1.1 200 OK\n\n $(date)"'; done

3) Send SIGSEGV

	kill -11 `pidof tarantool`

Once SIGSEGV is delivered the crashinfo data is generated and sent out. For
debug purpose this data is also printed to the terminal on debug log level.

Closes #5261

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>

@TarantoolBot document
Title: Configuration update, allow to disable sending crash information

For better analysis of program crashes the information associated with
the crash such as

 - utsname (similar to `uname -a` output except the network name)
 - build information
 - reason for a crash
 - call backtrace

is sent to the feedback server. To disable it set `feedback_crashinfo`
to `false`.
---
 src/box/box.cc                  |  18 ++
 src/box/box.h                   |   1 +
 src/box/lua/cfg.cc              |   9 +
 src/box/lua/load_cfg.lua        |   6 +-
 src/lib/core/CMakeLists.txt     |   2 +-
 src/lib/core/crash.c            | 308 ++++++++++++++++++++++++++++++++
 src/lib/core/crash.h            |  18 ++
 src/main.cc                     |   1 +
 test/app-tap/init_script.result |   1 +
 test/box/admin.result           |   2 +
 test/box/cfg.result             |   4 +
 11 files changed, 368 insertions(+), 2 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index a8bc3471d..440bfa305 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -74,6 +74,7 @@
 #include "sql.h"
 #include "systemd.h"
 #include "call.h"
+#include "crash.h"
 #include "func.h"
 #include "sequence.h"
 #include "sql_stmt_cache.h"
@@ -1213,6 +1214,23 @@ box_set_prepared_stmt_cache_size(void)
 	return 0;
 }
 
+int
+box_set_crash(void)
+{
+	const char *host = cfg_gets("feedback_host");
+	bool is_enabled_1 = cfg_getb("feedback_enabled");
+	bool is_enabled_2 = cfg_getb("feedback_crashinfo");
+
+	if (host != NULL && strlen(host) >= CRASH_FEEDBACK_HOST_MAX) {
+		diag_set(ClientError, ER_CFG, "feedback_host",
+			  "the address is too long");
+		return -1;
+	}
+
+	crash_cfg(host, is_enabled_1 && is_enabled_2);
+	return 0;
+}
+
 /* }}} configuration bindings */
 
 /**
diff --git a/src/box/box.h b/src/box/box.h
index b47a220b7..69fa096a1 100644
--- a/src/box/box.h
+++ b/src/box/box.h
@@ -257,6 +257,7 @@ void box_set_replication_sync_timeout(void);
 void box_set_replication_skip_conflict(void);
 void box_set_replication_anon(void);
 void box_set_net_msg_max(void);
+int box_set_crash(void);
 
 int
 box_set_prepared_stmt_cache_size(void);
diff --git a/src/box/lua/cfg.cc b/src/box/lua/cfg.cc
index 42805e602..2d3ccbf0e 100644
--- a/src/box/lua/cfg.cc
+++ b/src/box/lua/cfg.cc
@@ -375,6 +375,14 @@ lbox_cfg_set_replication_skip_conflict(struct lua_State *L)
 	return 0;
 }
 
+static int
+lbox_cfg_set_crash(struct lua_State *L)
+{
+	if (box_set_crash() != 0)
+		luaT_error(L);
+	return 0;
+}
+
 void
 box_lua_cfg_init(struct lua_State *L)
 {
@@ -411,6 +419,7 @@ box_lua_cfg_init(struct lua_State *L)
 		{"cfg_set_replication_anon", lbox_cfg_set_replication_anon},
 		{"cfg_set_net_msg_max", lbox_cfg_set_net_msg_max},
 		{"cfg_set_sql_cache_size", lbox_set_prepared_stmt_cache_size},
+		{"cfg_set_crash", lbox_cfg_set_crash},
 		{NULL, NULL}
 	};
 
diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
index 770442052..7e41e0999 100644
--- a/src/box/lua/load_cfg.lua
+++ b/src/box/lua/load_cfg.lua
@@ -33,7 +33,8 @@ end
 
 local ifdef_feedback_set_params =
     private.feedback_daemon ~= nil and
-    private.feedback_daemon.set_feedback_params or nil
+    private.feedback_daemon.set_feedback_params and
+    private.cfg_set_crash or nil
 
 -- all available options
 local default_cfg = {
@@ -99,6 +100,7 @@ local default_cfg = {
     replication_skip_conflict = false,
     replication_anon      = false,
     feedback_enabled      = true,
+    feedback_crashinfo    = true,
     feedback_host         = "https://feedback.tarantool.io",
     feedback_interval     = 3600,
     net_msg_max           = 768,
@@ -179,6 +181,7 @@ local template_cfg = {
     replication_skip_conflict = 'boolean',
     replication_anon      = 'boolean',
     feedback_enabled      = ifdef_feedback('boolean'),
+    feedback_crashinfo    = ifdef_feedback('boolean'),
     feedback_host         = ifdef_feedback('string'),
     feedback_interval     = ifdef_feedback('number'),
     net_msg_max           = 'number',
@@ -277,6 +280,7 @@ local dynamic_cfg = {
     checkpoint_wal_threshold = private.cfg_set_checkpoint_wal_threshold,
     worker_pool_threads     = private.cfg_set_worker_pool_threads,
     feedback_enabled        = ifdef_feedback_set_params,
+    feedback_crashinfo      = ifdef_feedback_set_params,
     feedback_host           = ifdef_feedback_set_params,
     feedback_interval       = ifdef_feedback_set_params,
     -- do nothing, affects new replicas, which query this value on start
diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 30cf0dd15..358e98ea7 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
 
 add_library(core STATIC ${core_sources})
 
-target_link_libraries(core salad small uri decNumber bit ${LIBEV_LIBRARIES}
+target_link_libraries(core salad small uri decNumber bit misc ${LIBEV_LIBRARIES}
                       ${LIBEIO_LIBRARIES} ${LIBCORO_LIBRARIES}
                       ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES})
 
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 3929463f3..19d3d49ea 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -7,8 +7,13 @@
 #include <string.h>
 #include <signal.h>
 #include <stdint.h>
+#include <limits.h>
 #include <time.h>
+#include <sys/types.h>
+#include <sys/utsname.h>
 
+#include "third_party/base64.h"
+#include "small/static.h"
 #include "trivia/util.h"
 
 #include "backtrace.h"
@@ -16,9 +21,16 @@
 #include "say.h"
 
 #define pr_fmt(fmt)		"crash: " fmt
+#define pr_debug(fmt, ...)	say_debug(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_info(fmt, ...)	say_info(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_err(fmt, ...)	say_error(pr_fmt(fmt), ##__VA_ARGS__)
 #define pr_syserr(fmt, ...)	say_syserror(pr_fmt(fmt), ##__VA_ARGS__)
+#define pr_crit(fmt, ...)	fprintf(stderr, pr_fmt(fmt) "\n", ##__VA_ARGS__)
 #define pr_panic(fmt, ...)	panic(pr_fmt(fmt), ##__VA_ARGS__)
 
+/** Use strlcpy with destination as an array */
+#define strlcpy_a(dst, src) strlcpy(dst, src, sizeof(dst))
+
 #ifdef TARGET_OS_LINUX
 #ifndef __x86_64__
 # error "Non x86-64 architectures are not supported"
@@ -71,6 +83,10 @@ static struct crash_info {
 	 */
 	struct crash_greg greg;
 #endif
+	/**
+	 * Timestamp in nanoseconds (realtime).
+	 */
+	uint64_t timestamp_rt;
 	/**
 	 * Faulting address.
 	 */
@@ -92,6 +108,72 @@ static struct crash_info {
 #endif
 } crash_info;
 
+static char tarantool_path[PATH_MAX];
+static char feedback_host[CRASH_FEEDBACK_HOST_MAX];
+static bool send_crashinfo = false;
+
+static inline uint64_t
+timespec_to_ns(struct timespec *ts)
+{
+	return (uint64_t)ts->tv_sec * 1000000000 + (uint64_t)ts->tv_nsec;
+}
+
+static char *
+ns_to_localtime(uint64_t timestamp, char *buf, ssize_t len)
+{
+	time_t sec = timestamp / 1000000000;
+	char *start = buf;
+	struct tm tm;
+
+	/*
+	 * Use similar format as say_x logger. Except plain
+	 * seconds should be enough.
+	 */
+	localtime_r(&sec, &tm);
+	ssize_t total = strftime(start, len, "%F %T %Z", &tm);
+	start += total;
+	if (total < len)
+		return buf;
+	buf[len - 1] = '\0';
+	return buf;
+}
+
+void
+crash_init(const char *tarantool_bin)
+{
+	strlcpy_a(tarantool_path, tarantool_bin);
+	if (strlen(tarantool_path) < strlen(tarantool_bin))
+		pr_panic("executable path is trimmed");
+}
+
+void
+crash_cfg(const char *host, bool is_enabled)
+{
+	if (host == NULL || !is_enabled) {
+		if (send_crashinfo) {
+			pr_debug("disable sending crashinfo feedback");
+			send_crashinfo = false;
+			feedback_host[0] = '\0';
+		}
+		return;
+	}
+
+	if (strcmp(feedback_host, host) != 0) {
+		strlcpy_a(feedback_host, host);
+		/*
+		 * The caller should have tested already
+		 * that there is enough space to keep
+		 * the host address.
+		 */
+		assert(strlen(feedback_host) == strlen(host));
+	}
+
+	if (!send_crashinfo) {
+		pr_debug("enable sending crashinfo feedback");
+		send_crashinfo = true;
+	}
+}
+
 /**
  * The routine is called inside crash signal handler so
  * be careful to not cause additional signals inside.
@@ -100,6 +182,12 @@ static struct crash_info *
 crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
 {
 	struct crash_info *cinfo = &crash_info;
+	struct timespec ts;
+
+	if (clock_gettime(CLOCK_REALTIME, &ts) == 0)
+		cinfo->timestamp_rt = timespec_to_ns(&ts);
+	else
+		cinfo->timestamp_rt = 0;
 
 	cinfo->signo = signo;
 	cinfo->sicode = siginfo->si_code;
@@ -130,6 +218,224 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
 	return cinfo;
 }
 
+/**
+ * Mark an environment that we're in crashinfo handling, this
+ * allows us to escape recursive attempts to send report,
+ * if the action of sending report is failing itself.
+ */
+static int
+crash_mark_env_mode(void)
+{
+	const char *env_name = "TT_CRASHINFO_MODE";
+	if (getenv(env_name) != NULL) {
+		pr_crit("recursive failure detected");
+		return -1;
+	}
+
+	if (setenv(env_name, "y", 0) != 0) {
+		pr_crit("unable to setup %s", env_name);
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * Zap reserved symbols.
+ */
+static char *
+json_zap(char *s)
+{
+	if (s == NULL)
+		return NULL;
+
+	/*
+	 * Actually we should escape them but for
+	 * now lets just zap without changing source
+	 * size.
+	 */
+	for (size_t i = 0; i < strlen(s); i++) {
+		if (s[i] == '\"' || s[i] == '\b' ||
+		    s[i] == '\f' || s[i] == '\n' ||
+		    s[i] == '\r' || s[i] == '\t' ||
+		    s[i] == '\\') {
+			s[i] = ' ';
+		}
+	}
+	return s;
+}
+
+/**
+ * Report crash information to the feedback daemon
+ * (ie send it to feedback daemon).
+ */
+static void
+crash_report_feedback_daemon(struct crash_info *cinfo)
+{
+	if (crash_mark_env_mode() != 0)
+		return;
+
+	/*
+	 * Update to a new number if the format get changed.
+	 */
+	const int crashinfo_version = 1;
+
+	char *p = static_alloc(SMALL_STATIC_SIZE);
+	char *tail = &p[SMALL_STATIC_SIZE];
+	char *e = &p[SMALL_STATIC_SIZE];
+	char *head = p;
+
+	/*
+	 * Note that while we encode the report we
+	 * intensively use a tail of the allocated
+	 * buffer as a temporary store.
+	 */
+
+#define snprintf_safe(__end, __fmt, ...)			\
+	do {							\
+		size_t size = (char *)__end - p;		\
+		p += snprintf(p, size, __fmt, ##__VA_ARGS__);	\
+		if (p >= (char *)__end)				\
+			goto out;				\
+	} while (0)
+
+	/*
+	 * Lets reuse tail of the buffer as a temp space.
+	 */
+	struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
+	if (p >= (char *)uname_ptr)
+		goto out;
+
+	if (uname(uname_ptr) != 0) {
+		pr_syserr("uname call failed, ignore");
+		memset(uname_ptr, 0, sizeof(struct utsname));
+	}
+
+	/*
+	 * Start filling the script. The "data" key value is
+	 * filled as a separate code block for easier
+	 * modifications in future.
+	 */
+	snprintf_safe(uname_ptr,
+		      "require(\'http.client\').post(\'%s\',"
+		      "'{\"crashdump\":{\"version\":\"%d\","
+		      "\"data\":", feedback_host,
+		      crashinfo_version);
+
+	/* The "data" key value */
+	snprintf_safe(uname_ptr, "{");
+	snprintf_safe(uname_ptr, "\"uname\":{");
+	snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
+		      json_zap(uname_ptr->sysname));
+	/*
+	 * nodename might contain a sensitive information, skip.
+	 */
+	snprintf_safe(uname_ptr, "\"release\":\"%s\",",
+		      json_zap(uname_ptr->release));
+	snprintf_safe(uname_ptr, "\"version\":\"%s\",",
+		      json_zap(uname_ptr->version));
+	snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
+		      json_zap(uname_ptr->machine));
+	snprintf_safe(uname_ptr, "},");
+
+	snprintf_safe(e, "\"build\":{");
+	snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
+	snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
+	snprintf_safe(e, "},");
+
+	snprintf_safe(e, "\"signal\":{");
+	snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
+	snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+	if (cinfo->signo == SIGSEGV) {
+		if (cinfo->sicode == SEGV_MAPERR) {
+			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+				      "SEGV_MAPERR");
+		} else if (cinfo->sicode == SEGV_ACCERR) {
+			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+				      "SEGV_ACCERR");
+		}
+		snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+			      (long long)cinfo->siaddr);
+	}
+
+#ifdef ENABLE_BACKTRACE
+	/*
+	 * The backtrace itself is encoded into base64 form
+	 * since it might have arbitrary symbols not suitable
+	 * for json encoding (newlines and etc).
+	 */
+	size_t bt_len = strlen(cinfo->backtrace_buf);
+	size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
+	char *bt_base64 = &tail[-bt_elen];
+	if (p >= bt_base64)
+		goto out;
+	base64_encode(cinfo->backtrace_buf, bt_len,
+		      bt_base64, bt_elen, BASE64_NOWRAP);
+	bt_base64[bt_elen] = '\0';
+	snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+#endif
+
+	/* 64 bytes should be enough for longest localtime */
+	const int ts_size = 64;
+	char *timestamp_rt_str = &tail[-ts_size];
+	if (p >= timestamp_rt_str)
+		goto out;
+	ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
+	snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
+		      json_zap(timestamp_rt_str));
+	snprintf_safe(timestamp_rt_str, "}");
+	snprintf_safe(timestamp_rt_str, "}");
+
+	/*
+	 * Finalize the "data" key and the script.
+	 *
+	 * The timeout is choosen to be 1 second as
+	 * main feedback daemon uses.
+	 */
+	snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+
+	pr_debug("crashinfo script: %s", head);
+
+	char *exec_argv[4] = {
+		[0] = tarantool_path,
+		[1] = "-e",
+		[2] = head,
+		[3] = NULL,
+	};
+
+	/*
+	 * Can't use fork here because libev has own
+	 * at_fork helpers with mutex where we might
+	 * stuck (see popen code).
+	 */
+	pid_t pid = vfork();
+	if (pid == 0) {
+		/*
+		 * Environment is needed for recursive
+		 * crash protection. See crash_mark_env_mode
+		 * above.
+		 */
+		extern char **environ;
+		/*
+		 * The script must exit at the end but there
+		 * is no simple way to make sure from inside
+		 * of a signal crash handler. So just hope it
+		 * is running fine.
+		 */
+		execve(exec_argv[0], exec_argv, environ);
+		pr_crit("exec(%s,[%s,%s,%s]) failed",
+			exec_argv[0], exec_argv[0],
+			exec_argv[1], exec_argv[2]);
+		_exit(1);
+	} else if (pid < 0) {
+		pr_crit("unable to vfork (errno %d)", errno);
+	}
+
+	return;
+out:
+	pr_crit("unable to prepare a crash report");
+}
+
 /**
  * Report crash information to the stderr
  * (usually a current console).
@@ -236,6 +542,8 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
 		in_cb = 1;
 		cinfo = crash_collect(signo, siginfo, context);
 		crash_report_stderr(cinfo);
+		if (send_crashinfo)
+			crash_report_feedback_daemon(cinfo);
 	} else {
 		/* Got a signal while running the handler. */
 		fprintf(stderr, "Fatal %d while backtracing", signo);
diff --git a/src/lib/core/crash.h b/src/lib/core/crash.h
index cd1db585e..195aef10b 100644
--- a/src/lib/core/crash.h
+++ b/src/lib/core/crash.h
@@ -9,6 +9,24 @@
 extern "C" {
 #endif /* defined(__cplusplus) */
 
+/**
+ * PATH_MAX is too big and 2K is recommended
+ * limit for web address.
+ */
+#define CRASH_FEEDBACK_HOST_MAX 2048
+
+/**
+ * Initialize crash subsystem.
+ */
+void
+crash_init(const char *tarantool_bin);
+
+/**
+ * Configure crash parameters.
+ */
+void
+crash_cfg(const char *host, bool is_enabled);
+
 /**
  * Initialize crash signal handlers.
  */
diff --git a/src/main.cc b/src/main.cc
index 391e0f878..2fce81bb3 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -687,6 +687,7 @@ main(int argc, char **argv)
 		title_set_script_name(argv[0]);
 	}
 
+	crash_init(tarantool_bin);
 	export_syms();
 
 	random_init();
diff --git a/test/app-tap/init_script.result b/test/app-tap/init_script.result
index 72aa67db2..16c5b01d2 100644
--- a/test/app-tap/init_script.result
+++ b/test/app-tap/init_script.result
@@ -10,6 +10,7 @@ checkpoint_wal_threshold:1e+18
 coredump:false
 election_mode:off
 election_timeout:5
+feedback_crashinfo:true
 feedback_enabled:true
 feedback_host:https://feedback.tarantool.io
 feedback_interval:3600
diff --git a/test/box/admin.result b/test/box/admin.result
index e05440f66..05debe673 100644
--- a/test/box/admin.result
+++ b/test/box/admin.result
@@ -41,6 +41,8 @@ cfg_filter(box.cfg)
     - off
   - - election_timeout
     - 5
+  - - feedback_crashinfo
+    - true
   - - feedback_enabled
     - true
   - - feedback_host
diff --git a/test/box/cfg.result b/test/box/cfg.result
index 10fef006c..22a720c2c 100644
--- a/test/box/cfg.result
+++ b/test/box/cfg.result
@@ -29,6 +29,8 @@ cfg_filter(box.cfg)
  |     - off
  |   - - election_timeout
  |     - 5
+ |   - - feedback_crashinfo
+ |     - true
  |   - - feedback_enabled
  |     - true
  |   - - feedback_host
@@ -142,6 +144,8 @@ cfg_filter(box.cfg)
  |     - off
  |   - - election_timeout
  |     - 5
+ |   - - feedback_crashinfo
+ |     - true
  |   - - feedback_enabled
  |     - true
  |   - - feedback_host
-- 
2.26.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
@ 2020-12-23 18:47   ` Vladislav Shpilevoy
  2020-12-23 18:57     ` Cyrill Gorcunov
  2020-12-23 21:22     ` Cyrill Gorcunov
  0 siblings, 2 replies; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-23 18:47 UTC (permalink / raw)
  To: Cyrill Gorcunov, tml

Hi! Thanks for the patch!

See 3 comments and fixes below, and top of the branch in a
separate commit.

> diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
> index 3929463f3..19d3d49ea 100644
> --- a/src/lib/core/crash.c
> +++ b/src/lib/core/crash.c
> @@ -130,6 +218,224 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
>  	return cinfo;
>  }
>  
> +/**
> + * Mark an environment that we're in crashinfo handling, this
> + * allows us to escape recursive attempts to send report,
> + * if the action of sending report is failing itself.
> + */
> +static int
> +crash_mark_env_mode(void)

1. You don't need to change 'env'. Because in the new process
send_crashinfo will be false. Therefore it won't send the crash
anyway.

> +{
> +	const char *env_name = "TT_CRASHINFO_MODE";
> +	if (getenv(env_name) != NULL) {
> +		pr_crit("recursive failure detected");
> +		return -1;
> +	}
> +
> +	if (setenv(env_name, "y", 0) != 0) {
> +		pr_crit("unable to setup %s", env_name);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * Zap reserved symbols.
> + */
> +static char *
> +json_zap(char *s)

2. Why can't you simply call json_escape(), which we
already have? Yes, it may add a few more bytes, but it
is not something super bad. json_escape() returns number
of written bytes, so you can use it quite easy.

====================
@@ -325,8 +325,11 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	/* The "data" key value */
 	snprintf_safe(uname_ptr, "{");
 	snprintf_safe(uname_ptr, "\"uname\":{");
-	snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
-		      json_zap(uname_ptr->sysname));
+	snprintf_safe(uname_ptr, "\"sysname\":\"");
+	p += json_escape(p, (char *)uname_ptr - p, uname_ptr->sysname);
+	if (p >= (char *)uname_ptr)
+		goto out;
+	snprintf_safe(uname_ptr, "\",");
====================

It would look even simpler, if you would use the existing
SNPRINT() macros. It returns -1 on fail, but you could patch
it or introduce SNPRINT_GO which would goto to a given label.
Or keep return -1, and do it like in sio_socketname_to_buffer() -
it has a separate function to do all the prints.

I tried to do that in my diff.

> +{
> +	if (s == NULL)
> +		return NULL;
> +
> +	/*
> +	 * Actually we should escape them but for
> +	 * now lets just zap without changing source
> +	 * size.
> +	 */
> +	for (size_t i = 0; i < strlen(s); i++) {

3. Strlen() is going to be called on each iteration. Please, save
the result into a variable. Or just use json_escape().

> +		if (s[i] == '\"' || s[i] == '\b' ||
> +		    s[i] == '\f' || s[i] == '\n' ||
> +		    s[i] == '\r' || s[i] == '\t' ||
> +		    s[i] == '\\') {
> +			s[i] = ' ';
> +		}
> +	}
Consider my fixes below and on top of the branch in a separate
commit. I didn't test them though. So there are likely to be a
few typos.

====================
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 914c84a1b..4b821d44a 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -218,63 +218,13 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
 	return cinfo;
 }
 
-/**
- * Mark an environment that we're in crashinfo handling, this
- * allows us to escape recursive attempts to send report,
- * if the action of sending report is failing itself.
- */
-static int
-crash_mark_env_mode(void)
-{
-	const char *env_name = "TT_CRASHINFO_MODE";
-	if (getenv(env_name) != NULL) {
-		pr_crit("recursive failure detected");
-		return -1;
-	}
-
-	if (setenv(env_name, "y", 0) != 0) {
-		pr_crit("unable to setup %s", env_name);
-		return -1;
-	}
-
-	return 0;
-}
-
-/**
- * Zap reserved symbols.
- */
-static char *
-json_zap(char *s)
-{
-	if (s == NULL)
-		return NULL;
-
-	/*
-	 * Actually we should escape them but for
-	 * now lets just zap without changing source
-	 * size.
-	 */
-	for (size_t i = 0; i < strlen(s); i++) {
-		if (s[i] == '\"' || s[i] == '\b' ||
-		    s[i] == '\f' || s[i] == '\n' ||
-		    s[i] == '\r' || s[i] == '\t' ||
-		    s[i] == '\\') {
-			s[i] = ' ';
-		}
-	}
-	return s;
-}
-
 /**
  * Report crash information to the feedback daemon
  * (ie send it to feedback daemon).
  */
-static void
+static int
 crash_report_feedback_daemon(struct crash_info *cinfo)
 {
-	if (crash_mark_env_mode() != 0)
-		return;
-
 	/*
 	 * Update to a new number if the format get changed.
 	 */
@@ -291,20 +241,17 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * buffer as a temporary store.
 	 */
 
-#define snprintf_safe(__end, __fmt, ...)			\
-	do {							\
-		size_t size = (char *)__end - p;		\
-		p += snprintf(p, size, __fmt, ##__VA_ARGS__);	\
-		if (p >= (char *)__end)				\
-			goto out;				\
-	} while (0)
+	int total = 0;
+	int size = 0;
+#define snprintf_safe(...) SNPRINT(total, snprintf, p, size, __VA_ARGS__)
+#define jnprintf_safe(str) SNPRINT(total, json_escape, p, size, str)
 
 	/*
 	 * Lets reuse tail of the buffer as a temp space.
 	 */
 	struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
 	if (p >= (char *)uname_ptr)
-		goto out;
+		return -1;
 
 	if (uname(uname_ptr) != 0) {
 		pr_syserr("uname call failed, ignore");
@@ -316,48 +263,53 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * filled as a separate code block for easier
 	 * modifications in future.
 	 */
-	snprintf_safe(uname_ptr,
-		      "require(\'http.client\').post(\'%s\',"
+	size = (char *)uname_ptr - p;
+	snprintf_safe("require(\'http.client\').post(\'%s\',"
 		      "'{\"crashdump\":{\"version\":\"%d\","
 		      "\"data\":", feedback_host,
 		      crashinfo_version);
 
 	/* The "data" key value */
-	snprintf_safe(uname_ptr, "{");
-	snprintf_safe(uname_ptr, "\"uname\":{");
-	snprintf_safe(uname_ptr, "\"sysname\":\"");
-	p += json_escape(p, (char *)uname_ptr - p, uname_ptr->sysname);
-	if (p >= (char *)uname_ptr)
-		goto out;
-	snprintf_safe(uname_ptr, "\",");
+	snprintf_safe("{");
+	snprintf_safe("\"uname\":{");
+	snprintf_safe("\"sysname\":\"");
+	jnprintf_safe(uname_ptr->sysname);
+	snprintf_safe("\",");
 	/*
 	 * nodename might contain a sensitive information, skip.
 	 */
-	snprintf_safe(uname_ptr, "\"release\":\"%s\",",
-		      json_zap(uname_ptr->release));
-	snprintf_safe(uname_ptr, "\"version\":\"%s\",",
-		      json_zap(uname_ptr->version));
-	snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
-		      json_zap(uname_ptr->machine));
-	snprintf_safe(uname_ptr, "},");
-
-	snprintf_safe(e, "\"build\":{");
-	snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
-	snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
-	snprintf_safe(e, "},");
-
-	snprintf_safe(e, "\"signal\":{");
-	snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
-	snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+	snprintf_safe("\"release\":\"");
+	jnprintf_safe(uname_ptr->release);
+	snprintf_safe("\",");
+
+	snprintf_safe("\"version\":\"");
+	jnprintf_safe(uname_ptr->version);
+	snprintf_safe("\",");
+
+	snprintf_safe("\"machine\":\"");
+	jnprintf_safe(uname_ptr->machine);
+	snprintf_safe("\"");
+	snprintf_safe("},");
+
+	/* Extend size, because now uname_ptr is not needed. */
+	size = e - p;
+	snprintf_safe("\"build\":{");
+	snprintf_safe("\"version\":\"%s\",", PACKAGE_VERSION);
+	snprintf_safe("\"cmake_type\":\"%s\"", BUILD_INFO);
+	snprintf_safe("},");
+
+	snprintf_safe("\"signal\":{");
+	snprintf_safe("\"signo\":%d,", cinfo->signo);
+	snprintf_safe("\"si_code\":%d,", cinfo->sicode);
 	if (cinfo->signo == SIGSEGV) {
 		if (cinfo->sicode == SEGV_MAPERR) {
-			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+			snprintf_safe("\"si_code_str\":\"%s\",",
 				      "SEGV_MAPERR");
 		} else if (cinfo->sicode == SEGV_ACCERR) {
-			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+			snprintf_safe("\"si_code_str\":\"%s\",",
 				      "SEGV_ACCERR");
 		}
-		snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+		snprintf_safe("\"si_addr\":\"0x%llx\",",
 			      (long long)cinfo->siaddr);
 	}
 
@@ -371,23 +323,28 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
 	char *bt_base64 = &tail[-bt_elen];
 	if (p >= bt_base64)
-		goto out;
+		return -1;
 	base64_encode(cinfo->backtrace_buf, bt_len,
 		      bt_base64, bt_elen, BASE64_NOWRAP);
 	bt_base64[bt_elen] = '\0';
-	snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+
+	size = bt_base64 - p;
+	snprintf_safe("\"backtrace\":\"%s\",", bt_base64);
 #endif
 
 	/* 64 bytes should be enough for longest localtime */
 	const int ts_size = 64;
 	char *timestamp_rt_str = &tail[-ts_size];
 	if (p >= timestamp_rt_str)
-		goto out;
+		return -1;
 	ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
-	snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
-		      json_zap(timestamp_rt_str));
-	snprintf_safe(timestamp_rt_str, "}");
-	snprintf_safe(timestamp_rt_str, "}");
+
+	size = timestamp_rt_str - p;
+	snprintf_safe("\"timestamp\":\"");
+	jnprintf_safe(timestamp_rt_str);
+	snprintf_safe("\"");
+	snprintf_safe("}");
+	snprintf_safe("}");
 
 	/*
 	 * Finalize the "data" key and the script.
@@ -395,7 +352,8 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * The timeout is choosen to be 1 second as
 	 * main feedback daemon uses.
 	 */
-	snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+	size = e - p;
+	snprintf_safe("}}',{timeout=1});os.exit(1);");
 
 	pr_debug("crashinfo script: %s", head);
 
@@ -413,30 +371,22 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 */
 	pid_t pid = vfork();
 	if (pid == 0) {
-		/*
-		 * Environment is needed for recursive
-		 * crash protection. See crash_mark_env_mode
-		 * above.
-		 */
-		extern char **environ;
 		/*
 		 * The script must exit at the end but there
 		 * is no simple way to make sure from inside
 		 * of a signal crash handler. So just hope it
 		 * is running fine.
 		 */
-		execve(exec_argv[0], exec_argv, environ);
+		execve(exec_argv[0], exec_argv, NULL);
 		pr_crit("exec(%s,[%s,%s,%s]) failed",
 			exec_argv[0], exec_argv[0],
 			exec_argv[1], exec_argv[2]);
 		_exit(1);
 	} else if (pid < 0) {
 		pr_crit("unable to vfork (errno %d)", errno);
+		return -1;
 	}
-
-	return;
-out:
-	pr_crit("unable to prepare a crash report");
+	return 0;
 }
 
 /**
@@ -545,8 +495,8 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
 		in_cb = 1;
 		cinfo = crash_collect(signo, siginfo, context);
 		crash_report_stderr(cinfo);
-		if (send_crashinfo)
-			crash_report_feedback_daemon(cinfo);
+		if (send_crashinfo && crash_report_feedback_daemon(cinfo) != 0)
+			pr_crit("unable to send a crash report");
 	} else {
 		/* Got a signal while running the handler. */
 		fprintf(stderr, "Fatal %d while backtracing", signo);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-23 18:47   ` Vladislav Shpilevoy
@ 2020-12-23 18:57     ` Cyrill Gorcunov
  2020-12-23 21:22     ` Cyrill Gorcunov
  1 sibling, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 18:57 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> >  
> > +/**
> > + * Mark an environment that we're in crashinfo handling, this
> > + * allows us to escape recursive attempts to send report,
> > + * if the action of sending report is failing itself.
> > + */
> > +static int
> > +crash_mark_env_mode(void)
> 
> 1. You don't need to change 'env'. Because in the new process
> send_crashinfo will be false. Therefore it won't send the crash
> anyway.

Yeah, good catch! Will drop this.

> > +/**
> > + * Zap reserved symbols.
> > + */
> > +static char *
> > +json_zap(char *s)
> 
> 2. Why can't you simply call json_escape(), which we
> already have? Yes, it may add a few more bytes, but it
> is not something super bad. json_escape() returns number
> of written bytes, so you can use it quite easy.

Ah, I didn't know that we have this helper. Thanks, Vlad!

> 
> It would look even simpler, if you would use the existing
> SNPRINT() macros. It returns -1 on fail, but you could patch
> it or introduce SNPRINT_GO which would goto to a given label.
> Or keep return -1, and do it like in sio_socketname_to_buffer() -
> it has a separate function to do all the prints.
> 
> I tried to do that in my diff.
> 
> > +{
> > +	if (s == NULL)
> > +		return NULL;
> > +
> > +	/*
> > +	 * Actually we should escape them but for
> > +	 * now lets just zap without changing source
> > +	 * size.
> > +	 */
> > +	for (size_t i = 0; i < strlen(s); i++) {
> 
> 3. Strlen() is going to be called on each iteration. Please, save
> the result into a variable. Or just use json_escape().

Nope, it won't. gcc is smart enough to move this bb
out of cycle. This is known for decades already. Still
if I gonna use json helper you pointed above I'll not
need this function completely.

> Consider my fixes below and on top of the branch in a separate
> commit. I didn't test them though. So there are likely to be a
> few typos.

Yeah, thanks a lot. Will update and report you.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-23 18:47   ` Vladislav Shpilevoy
  2020-12-23 18:57     ` Cyrill Gorcunov
@ 2020-12-23 21:22     ` Cyrill Gorcunov
  2020-12-24 13:16       ` Cyrill Gorcunov
  1 sibling, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-23 21:22 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> Hi! Thanks for the patch!
> 
> See 3 comments and fixes below, and top of the branch in a
> separate commit.
> 

Vlad, since we can use json_escape I dropped usage of base64
encoder for backtrace. The final interdiff I pushed into
gorcunov/gh-5261-crash-report-6 is the following
--
diff --git a/src/lib/core/CMakeLists.txt b/src/lib/core/CMakeLists.txt
index 358e98ea7..30cf0dd15 100644
--- a/src/lib/core/CMakeLists.txt
+++ b/src/lib/core/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
 
 add_library(core STATIC ${core_sources})
 
-target_link_libraries(core salad small uri decNumber bit misc ${LIBEV_LIBRARIES}
+target_link_libraries(core salad small uri decNumber bit ${LIBEV_LIBRARIES}
                       ${LIBEIO_LIBRARIES} ${LIBCORO_LIBRARIES}
                       ${MSGPUCK_LIBRARIES} ${ICU_LIBRARIES})
 
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index 19d3d49ea..e62af6836 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -12,7 +12,6 @@
 #include <sys/types.h>
 #include <sys/utsname.h>
 
-#include "third_party/base64.h"
 #include "small/static.h"
 #include "trivia/util.h"
 
@@ -218,63 +217,13 @@ crash_collect(int signo, siginfo_t *siginfo, void *ucontext)
 	return cinfo;
 }
 
-/**
- * Mark an environment that we're in crashinfo handling, this
- * allows us to escape recursive attempts to send report,
- * if the action of sending report is failing itself.
- */
-static int
-crash_mark_env_mode(void)
-{
-	const char *env_name = "TT_CRASHINFO_MODE";
-	if (getenv(env_name) != NULL) {
-		pr_crit("recursive failure detected");
-		return -1;
-	}
-
-	if (setenv(env_name, "y", 0) != 0) {
-		pr_crit("unable to setup %s", env_name);
-		return -1;
-	}
-
-	return 0;
-}
-
-/**
- * Zap reserved symbols.
- */
-static char *
-json_zap(char *s)
-{
-	if (s == NULL)
-		return NULL;
-
-	/*
-	 * Actually we should escape them but for
-	 * now lets just zap without changing source
-	 * size.
-	 */
-	for (size_t i = 0; i < strlen(s); i++) {
-		if (s[i] == '\"' || s[i] == '\b' ||
-		    s[i] == '\f' || s[i] == '\n' ||
-		    s[i] == '\r' || s[i] == '\t' ||
-		    s[i] == '\\') {
-			s[i] = ' ';
-		}
-	}
-	return s;
-}
-
 /**
  * Report crash information to the feedback daemon
  * (ie send it to feedback daemon).
  */
-static void
+static int
 crash_report_feedback_daemon(struct crash_info *cinfo)
 {
-	if (crash_mark_env_mode() != 0)
-		return;
-
 	/*
 	 * Update to a new number if the format get changed.
 	 */
@@ -285,26 +234,18 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	char *e = &p[SMALL_STATIC_SIZE];
 	char *head = p;
 
-	/*
-	 * Note that while we encode the report we
-	 * intensively use a tail of the allocated
-	 * buffer as a temporary store.
-	 */
+	int total = 0;
+	int size = 0;
 
-#define snprintf_safe(__end, __fmt, ...)			\
-	do {							\
-		size_t size = (char *)__end - p;		\
-		p += snprintf(p, size, __fmt, ##__VA_ARGS__);	\
-		if (p >= (char *)__end)				\
-			goto out;				\
-	} while (0)
+#define snprintf_safe(...) SNPRINT(total, snprintf, p, size, __VA_ARGS__)
+#define jnprintf_safe(str) SNPRINT(total, json_escape, p, size, str)
 
 	/*
 	 * Lets reuse tail of the buffer as a temp space.
 	 */
 	struct utsname *uname_ptr = (void *)&tail[-sizeof(struct utsname)];
 	if (p >= (char *)uname_ptr)
-		goto out;
+		return -1;
 
 	if (uname(uname_ptr) != 0) {
 		pr_syserr("uname call failed, ignore");
@@ -316,75 +257,75 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * filled as a separate code block for easier
 	 * modifications in future.
 	 */
-	snprintf_safe(uname_ptr,
-		      "require(\'http.client\').post(\'%s\',"
+	size = (char *)uname_ptr - p;
+	snprintf_safe("require(\'http.client\').post(\'%s\',"
 		      "'{\"crashdump\":{\"version\":\"%d\","
 		      "\"data\":", feedback_host,
 		      crashinfo_version);
 
 	/* The "data" key value */
-	snprintf_safe(uname_ptr, "{");
-	snprintf_safe(uname_ptr, "\"uname\":{");
-	snprintf_safe(uname_ptr, "\"sysname\":\"%s\",",
-		      json_zap(uname_ptr->sysname));
+	snprintf_safe("{");
+	snprintf_safe("\"uname\":{");
+	snprintf_safe("\"sysname\":\"");
+	jnprintf_safe(uname_ptr->sysname);
+	snprintf_safe("\",");
 	/*
 	 * nodename might contain a sensitive information, skip.
 	 */
-	snprintf_safe(uname_ptr, "\"release\":\"%s\",",
-		      json_zap(uname_ptr->release));
-	snprintf_safe(uname_ptr, "\"version\":\"%s\",",
-		      json_zap(uname_ptr->version));
-	snprintf_safe(uname_ptr, "\"machine\":\"%s\"",
-		      json_zap(uname_ptr->machine));
-	snprintf_safe(uname_ptr, "},");
-
-	snprintf_safe(e, "\"build\":{");
-	snprintf_safe(e, "\"version\":\"%s\",", PACKAGE_VERSION);
-	snprintf_safe(e, "\"cmake_type\":\"%s\"", BUILD_INFO);
-	snprintf_safe(e, "},");
-
-	snprintf_safe(e, "\"signal\":{");
-	snprintf_safe(e, "\"signo\":%d,", cinfo->signo);
-	snprintf_safe(e, "\"si_code\":%d,", cinfo->sicode);
+	snprintf_safe("\"release\":\"");
+	jnprintf_safe(uname_ptr->release);
+	snprintf_safe("\",");
+
+	snprintf_safe("\"version\":\"");
+	jnprintf_safe(uname_ptr->version);
+	snprintf_safe("\",");
+
+	snprintf_safe("\"machine\":\"");
+	jnprintf_safe(uname_ptr->machine);
+	snprintf_safe("\"");
+	snprintf_safe("},");
+
+	/* Extend size, because now uname_ptr is not needed. */
+	size = e - p;
+	snprintf_safe("\"build\":{");
+	snprintf_safe("\"version\":\"%s\",", PACKAGE_VERSION);
+	snprintf_safe("\"cmake_type\":\"%s\"", BUILD_INFO);
+	snprintf_safe("},");
+
+	snprintf_safe("\"signal\":{");
+	snprintf_safe("\"signo\":%d,", cinfo->signo);
+	snprintf_safe("\"si_code\":%d,", cinfo->sicode);
 	if (cinfo->signo == SIGSEGV) {
 		if (cinfo->sicode == SEGV_MAPERR) {
-			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+			snprintf_safe("\"si_code_str\":\"%s\",",
 				      "SEGV_MAPERR");
 		} else if (cinfo->sicode == SEGV_ACCERR) {
-			snprintf_safe(e, "\"si_code_str\":\"%s\",",
+			snprintf_safe("\"si_code_str\":\"%s\",",
 				      "SEGV_ACCERR");
 		}
-		snprintf_safe(e, "\"si_addr\":\"0x%llx\",",
+		snprintf_safe("\"si_addr\":\"0x%llx\",",
 			      (long long)cinfo->siaddr);
 	}
 
 #ifdef ENABLE_BACKTRACE
-	/*
-	 * The backtrace itself is encoded into base64 form
-	 * since it might have arbitrary symbols not suitable
-	 * for json encoding (newlines and etc).
-	 */
-	size_t bt_len = strlen(cinfo->backtrace_buf);
-	size_t bt_elen = base64_bufsize(bt_len, BASE64_NOWRAP);
-	char *bt_base64 = &tail[-bt_elen];
-	if (p >= bt_base64)
-		goto out;
-	base64_encode(cinfo->backtrace_buf, bt_len,
-		      bt_base64, bt_elen, BASE64_NOWRAP);
-	bt_base64[bt_elen] = '\0';
-	snprintf_safe(bt_base64, "\"backtrace\":\"%s\",", bt_base64);
+	snprintf_safe("\"backtrace\":\"");
+	jnprintf_safe(cinfo->backtrace_buf);
+	snprintf_safe("\",");
 #endif
 
 	/* 64 bytes should be enough for longest localtime */
 	const int ts_size = 64;
 	char *timestamp_rt_str = &tail[-ts_size];
 	if (p >= timestamp_rt_str)
-		goto out;
+		return -1;
 	ns_to_localtime(cinfo->timestamp_rt, timestamp_rt_str, ts_size);
-	snprintf_safe(timestamp_rt_str, "\"timestamp\":\"%s\"",
-		      json_zap(timestamp_rt_str));
-	snprintf_safe(timestamp_rt_str, "}");
-	snprintf_safe(timestamp_rt_str, "}");
+
+	size = timestamp_rt_str - p;
+	snprintf_safe("\"timestamp\":\"");
+	jnprintf_safe(timestamp_rt_str);
+	snprintf_safe("\"");
+	snprintf_safe("}");
+	snprintf_safe("}");
 
 	/*
 	 * Finalize the "data" key and the script.
@@ -392,7 +333,8 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * The timeout is choosen to be 1 second as
 	 * main feedback daemon uses.
 	 */
-	snprintf_safe(e, "}}',{timeout=1});os.exit(1);");
+	size = e - p;
+	snprintf_safe("}}',{timeout=1});os.exit(1);");
 
 	pr_debug("crashinfo script: %s", head);
 
@@ -410,11 +352,6 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 */
 	pid_t pid = vfork();
 	if (pid == 0) {
-		/*
-		 * Environment is needed for recursive
-		 * crash protection. See crash_mark_env_mode
-		 * above.
-		 */
 		extern char **environ;
 		/*
 		 * The script must exit at the end but there
@@ -429,11 +366,10 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 		_exit(1);
 	} else if (pid < 0) {
 		pr_crit("unable to vfork (errno %d)", errno);
+		return -1;
 	}
 
-	return;
-out:
-	pr_crit("unable to prepare a crash report");
+	return 0;
 }
 
 /**
@@ -542,8 +478,10 @@ crash_signal_cb(int signo, siginfo_t *siginfo, void *context)
 		in_cb = 1;
 		cinfo = crash_collect(signo, siginfo, context);
 		crash_report_stderr(cinfo);
-		if (send_crashinfo)
-			crash_report_feedback_daemon(cinfo);
+		if (send_crashinfo &&
+		    crash_report_feedback_daemon(cinfo) != 0) {
+			pr_crit("unable to send a crash report");
+		}
 	} else {
 		/* Got a signal while running the handler. */
 		fprintf(stderr, "Fatal %d while backtracing", signo);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-23 21:22     ` Cyrill Gorcunov
@ 2020-12-24 13:16       ` Cyrill Gorcunov
  2020-12-24 17:15         ` Vladislav Shpilevoy
  0 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-24 13:16 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> > Hi! Thanks for the patch!
> > 
> > See 3 comments and fixes below, and top of the branch in a
> > separate commit.
> > 
> 
> Vlad, since we can use json_escape I dropped usage of base64
> encoder for backtrace. The final interdiff I pushed into
> gorcunov/gh-5261-crash-report-6 is the following

FWIW gorcunov/gh-5261-crash-report-6 passes all tests
https://gitlab.com/tarantool/tarantool/-/pipelines/234153923

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-24 13:16       ` Cyrill Gorcunov
@ 2020-12-24 17:15         ` Vladislav Shpilevoy
  2020-12-24 17:33           ` Cyrill Gorcunov
  0 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-24 17:15 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On 24.12.2020 14:16, Cyrill Gorcunov wrote:
> On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
>> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
>>> Hi! Thanks for the patch!
>>>
>>> See 3 comments and fixes below, and top of the branch in a
>>> separate commit.
>>>
>>
>> Vlad, since we can use json_escape I dropped usage of base64
>> encoder for backtrace. The final interdiff I pushed into
>> gorcunov/gh-5261-crash-report-6 is the following
> 
> FWIW gorcunov/gh-5261-crash-report-6 passes all tests
> https://gitlab.com/tarantool/tarantool/-/pipelines/234153923

The branch didn't work on my machine when I tried to crash Tarantool.
Because uname.version even being json-escaped contained a substring
which is not a valid Lua string. This was the broken string:

	"Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2\/RELEASE_X86_64"

Lua didn't understand symbol sequence `\/`.

To workaround this I made the child process accept the dump data and
the feedback host via command line arguments. Now seems to be working.

I pushed my fixes on top of the branch, and pasted them below.

====================
diff --git a/src/lib/core/crash.c b/src/lib/core/crash.c
index e62af6836..ee4991b52 100644
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -258,10 +258,10 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * modifications in future.
 	 */
 	size = (char *)uname_ptr - p;
-	snprintf_safe("require(\'http.client\').post(\'%s\',"
-		      "'{\"crashdump\":{\"version\":\"%d\","
-		      "\"data\":", feedback_host,
-		      crashinfo_version);
+	snprintf_safe("{");
+	snprintf_safe("\"crashdump\":{");
+	snprintf_safe("\"version\":\"%d\",", crashinfo_version);
+	snprintf_safe("\"data\":");
 
 	/* The "data" key value */
 	snprintf_safe("{");
@@ -334,15 +334,20 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	 * main feedback daemon uses.
 	 */
 	size = e - p;
-	snprintf_safe("}}',{timeout=1});os.exit(1);");
+	snprintf_safe("}");
+	snprintf_safe("}");
 
 	pr_debug("crashinfo script: %s", head);
 
-	char *exec_argv[4] = {
+	char *exec_argv[7] = {
 		[0] = tarantool_path,
 		[1] = "-e",
-		[2] = head,
-		[3] = NULL,
+		[2] = "require('http.client').post(arg[1],arg[2],{timeout=1});"
+		      "os.exit(1);",
+		[3] = "-",
+		[4] = feedback_host,
+		[5] = head,
+		[6] = NULL,
 	};
 
 	/*
@@ -360,9 +365,11 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 		 * is running fine.
 		 */
 		execve(exec_argv[0], exec_argv, environ);
-		pr_crit("exec(%s,[%s,%s,%s]) failed",
+		pr_crit("exec(%s,[%s,%s,%s,%s,%s,%s,%s]) failed",
 			exec_argv[0], exec_argv[0],
-			exec_argv[1], exec_argv[2]);
+			exec_argv[1], exec_argv[2],
+			exec_argv[3], exec_argv[4],
+			exec_argv[5], exec_argv[6]);
 		_exit(1);
 	} else if (pid < 0) {
 		pr_crit("unable to vfork (errno %d)", errno);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-24 17:15         ` Vladislav Shpilevoy
@ 2020-12-24 17:33           ` Cyrill Gorcunov
  2020-12-24 18:22             ` Vladislav Shpilevoy
  0 siblings, 1 reply; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-24 17:33 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Thu, Dec 24, 2020 at 06:15:02PM +0100, Vladislav Shpilevoy wrote:
> On 24.12.2020 14:16, Cyrill Gorcunov wrote:
> > On Thu, Dec 24, 2020 at 12:22:53AM +0300, Cyrill Gorcunov wrote:
> >> On Wed, Dec 23, 2020 at 07:47:44PM +0100, Vladislav Shpilevoy wrote:
> >>> Hi! Thanks for the patch!
> >>>
> >>> See 3 comments and fixes below, and top of the branch in a
> >>> separate commit.
> >>>
> >>
> >> Vlad, since we can use json_escape I dropped usage of base64
> >> encoder for backtrace. The final interdiff I pushed into
> >> gorcunov/gh-5261-crash-report-6 is the following
> > 
> > FWIW gorcunov/gh-5261-crash-report-6 passes all tests
> > https://gitlab.com/tarantool/tarantool/-/pipelines/234153923
> 
> The branch didn't work on my machine when I tried to crash Tarantool.
> Because uname.version even being json-escaped contained a substring
> which is not a valid Lua string. This was the broken string:
> 
> 	"Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2\/RELEASE_X86_64"
> 
> Lua didn't understand symbol sequence `\/`.
> 
> To workaround this I made the child process accept the dump data and
> the feedback host via command line arguments. Now seems to be working.
> 
> I pushed my fixes on top of the branch, and pasted them below.

Thanks a huge! Squashed into gorcunov/gh-5261-crash-report-7

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-24 17:33           ` Cyrill Gorcunov
@ 2020-12-24 18:22             ` Vladislav Shpilevoy
  2020-12-24 18:33               ` Cyrill Gorcunov
  0 siblings, 1 reply; 13+ messages in thread
From: Vladislav Shpilevoy @ 2020-12-24 18:22 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

I applied this diff and pushed to master.

====================
--- a/src/lib/core/crash.c
+++ b/src/lib/core/crash.c
@@ -327,21 +327,17 @@ crash_report_feedback_daemon(struct crash_info *cinfo)
 	snprintf_safe("}");
 	snprintf_safe("}");
 
-	/*
-	 * Finalize the "data" key and the script.
-	 *
-	 * The timeout is choosen to be 1 second as
-	 * main feedback daemon uses.
-	 */
+	/* Finalize the "data" key and the whole dump. */
 	size = e - p;
 	snprintf_safe("}");
 	snprintf_safe("}");
 
-	pr_debug("crashinfo script: %s", head);
+	pr_debug("crash dump: %s", head);
 
 	char *exec_argv[7] = {
 		[0] = tarantool_path,
 		[1] = "-e",
+		/* Timeout 1 sec is taken from the feedback daemon. */
 		[2] = "require('http.client').post(arg[1],arg[2],{timeout=1});"
 		      "os.exit(1);",
 		[3] = "-",

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server
  2020-12-24 18:22             ` Vladislav Shpilevoy
@ 2020-12-24 18:33               ` Cyrill Gorcunov
  0 siblings, 0 replies; 13+ messages in thread
From: Cyrill Gorcunov @ 2020-12-24 18:33 UTC (permalink / raw)
  To: Vladislav Shpilevoy; +Cc: tml

On Thu, Dec 24, 2020 at 07:22:10PM +0100, Vladislav Shpilevoy wrote:
> I applied this diff and pushed to master.
> 

Great! Thanks a huge, Vlad!

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-12-24 18:33 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-23 15:41 [Tarantool-patches] [PATCH v5 0/4] Our feedback daemon sends only a few portions of usage Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 1/4] util: introduce strlcpy helper Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 2/4] backtrace: allow to specify destination buffer Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 3/4] crash: move fatal signal handling in Cyrill Gorcunov
2020-12-23 15:41 ` [Tarantool-patches] [PATCH v5 4/4] crash: report crash data to the feedback server Cyrill Gorcunov
2020-12-23 18:47   ` Vladislav Shpilevoy
2020-12-23 18:57     ` Cyrill Gorcunov
2020-12-23 21:22     ` Cyrill Gorcunov
2020-12-24 13:16       ` Cyrill Gorcunov
2020-12-24 17:15         ` Vladislav Shpilevoy
2020-12-24 17:33           ` Cyrill Gorcunov
2020-12-24 18:22             ` Vladislav Shpilevoy
2020-12-24 18:33               ` Cyrill Gorcunov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox