Tarantool development patches archive
* [PATCH v5 0/2] Implement support of strip_core functionality
@ 2019-05-16 21:45 Cyrill Gorcunov
  2019-05-16 21:45 ` [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints Cyrill Gorcunov
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Cyrill Gorcunov @ 2019-05-16 21:45 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Alexander Turenko, gorcunov

The series patches both the "small" memory library and the tarantool code
itself; moreover, the tarantool patch depends on the new "small" API.

Please review carefully. To verify that the memory does not end up in the
coredump I have been using the gcore and readelf utilities.

When strip_core is set to false (the default):

2019-05-16 23:53:28.755 [29583] main/102/interactive D> tuple arena memtx: addr 0x7f728c000000 size 268435456 flags 0x80000001 dontdump 0
2019-05-16 23:53:28.755 [29583] main/102/interactive I> mapping 134217728 bytes for vinyl tuple arena...

7f7283800000-7f729c081000 rw-p 00000000 00:00 0
VmFlags: rd wr mr mw me ac sd

  [34] load              PROGBITS         00007f7283800000  0df6e284
       0000000018881000  0000000000000000  WA       0     0     1

The segment with memtx memory is present in the core file (because the VMA
flags are the same for both memtx and vinyl, the OS merges the memory into
a single VMA).

In turn, when strip_core is set to true we have:

2019-05-16 23:58:26.796 [29852] main/102/interactive D> tuple arena memtx: addr 0x7f6978000000 size 268435456 flags 0x80000005 dontdump 1
2019-05-16 23:58:26.796 [29852] main/102/interactive I> mapping 134217728 bytes for vinyl tuple arena...

7f6978000000-7f6979000000 rw-p 00000000 00:00 0
VmFlags: rd wr mr mw me ac dd sd

  [36] load              PROGBITS         00007f6979000000  15f7d364
       000000000f000000  0000000000000000  WA       0     0     1

The memtx arena is not present in the core file; only the vinyl arena ends
up there.

Volodya, could you please help with more intensive testing since I'm not sure
how to fill memtx from console or some script yet.

Cyrill Gorcunov (2):
  slab_arena: Enhance slab_arena_create to support madvise hints
  box/memtx: Allow to skip tuple memory from coredump

-- 
2.20.1

* [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints
  2019-05-16 21:45 [PATCH v5 0/2] Implement support of strip_core functionality Cyrill Gorcunov
@ 2019-05-16 21:45 ` Cyrill Gorcunov
  2019-05-20 12:55   ` Vladimir Davydov
  2019-05-16 21:45 ` [PATCH v5 2/2] box/memtx: Allow to skip tuple memory from coredump Cyrill Gorcunov
  2019-05-20 10:03 ` [PATCH v5 0/2] Implement support of strip_core functionality Vladimir Davydov
  2 siblings, 1 reply; 7+ messages in thread
From: Cyrill Gorcunov @ 2019-05-16 21:45 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Alexander Turenko, gorcunov

The purpose of this extension is to make it possible to
exclude some slab arenas from dumping, especially when a
huge amount of memory is involved.

To keep backward compatibility we map the flags passed in
onto the new SLAB_ARENA_x flags. To be able to distinguish
the old and new interfaces we reserve the high bit of the
integer value and remap the flags internally to a
syscall-compatible form.
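
For illustration only (not part of the patch), an old-style call and the
corresponding new-style call, optionally with the madvise hint:

	/* Old interface: raw mmap() flags. */
	slab_arena_create(&arena, &quota, prealloc, slab_size, MAP_PRIVATE);

	/* New interface: marked SLAB_ARENA_ flags. */
	slab_arena_create(&arena, &quota, prealloc, slab_size,
			  SLAB_ARENA_PRIVATE | SLAB_ARENA_DONTDUMP);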

At the same time I have to modify the build script to
define the _GNU_SOURCE symbol, otherwise the madvise
bits won't be found.

Another aspect of the patch is a simple feature engine:
we need to test at runtime whether dontdump is supported;
for this sake one can call

	small_test_feature(SMALL_FEATURE_DONTDUMP);

to check whether a particular slab arena can be excluded
from dumping.
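
A minimal caller-side sketch of the intended usage (illustrative only,
not code from this patch): gate the dontdump hint on the runtime test.

	#include "small/features.h"
	#include "small/slab_arena.h"

	int flags = SLAB_ARENA_PRIVATE;
	if (small_test_feature(SMALL_FEATURE_DONTDUMP))
		flags |= SLAB_ARENA_DONTDUMP;
	/* ... then pass flags to slab_arena_create() as usual. */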

Part-of https://github.com/tarantool/tarantool/issues/3509
---
v2:
  - Encode madvise hint inside slab_arena_create flags, for backward compatibility
  - Enhance test to address dynamic slabs
  - Enhance build procedure to test all flags and functions we're interested in
v3:
  - Keep SLAB_ARENA_ flags inside struct slab_arena
  - Use HAVE keyword in cmake
  - Move madvise into madvise_checked helper
  - Extend comment for SLAB_ARENA_ flags
v4:
  - Implement the small_test_feature helper to test supported
    features at runtime
v5:
  - Fix typos in flags testing: the flags are multi-bit, so a complete
    match via the IS_SLAB_ARENA_FLAG helper is needed (see the sketch
    right after this list).
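
A hypothetical illustration of why a complete match is needed (it assumes
the SLAB_ARENA_ definitions introduced by the diff below and is not part
of the patch itself):

	int flags = SLAB_ARENA_SHARED | SLAB_ARENA_DONTDUMP;

	/*
	 * A naive bit test gives a false positive, because every
	 * SLAB_ARENA_ value carries the common SLAB_ARENA_FLAG_MARK bit.
	 */
	assert((flags & SLAB_ARENA_PRIVATE) != 0);

	/* IS_SLAB_ARENA_FLAG() requires all bits of the tested flag. */
	assert(!IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_PRIVATE));
	assert(IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_DONTDUMP));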

 .gitignore           |   1 +
 CMakeLists.txt       |  20 +++++++
 small/config.h.cmake |  31 ++++++++++
 small/features.c     | 134 +++++++++++++++++++++++++++++++++++++++++++
 small/features.h     |  60 +++++++++++++++++++
 small/slab_arena.c   |  71 ++++++++++++++++++++---
 small/slab_arena.h   |  27 ++++++++-
 test/slab_arena.c    | 130 +++++++++++++++++++++++++++++++++++++++++
 8 files changed, 465 insertions(+), 9 deletions(-)
 create mode 100644 small/config.h.cmake
 create mode 100644 small/features.c
 create mode 100644 small/features.h

diff --git a/.gitignore b/.gitignore
index e2339c3..c4e4ade 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,4 @@ build/
 CTestTestfile.cmake
 Testing
 CTestTestfile.cmake
+small/config.h
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 2b7d0dc..0835865 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,6 +1,9 @@
 project(small C CXX)
 cmake_minimum_required(VERSION 2.8 FATAL_ERROR)
 
+include(CheckFunctionExists)
+include(CheckSymbolExists)
+
 if(NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Debug)
 endif()
@@ -15,10 +18,26 @@ endif()
 # Enable GNU glibc extentions.
 add_definitions("-D_GNU_SOURCE")
 
+set(CMAKE_REQUIRED_DEFINITIONS "-D_GNU_SOURCE")
+
+check_symbol_exists(MAP_ANON sys/mman.h TARANTOOL_SMALL_HAVE_MAP_ANON)
+check_symbol_exists(MAP_ANONYMOUS sys/mman.h TARANTOOL_SMALL_HAVE_MAP_ANONYMOUS)
+
+check_function_exists(madvise TARANTOOL_SMALL_HAVE_MADVISE)
+check_symbol_exists(MADV_DONTDUMP sys/mman.h TARANTOOL_SMALL_HAVE_MADV_DONTDUMP)
+
+configure_file(
+    "small/config.h.cmake"
+    "small/config.h"
+    )
+message (STATUS "")
+
 # Valgrind
 include_directories(third_party)
 
 set(lib_headers
+    small/config.h
+    small/features.h
     small/ibuf.h
     small/lf_lifo.h
     small/lifo.h
@@ -36,6 +55,7 @@ set(lib_headers
     small/static.h)
 
 set(lib_sources
+    small/features.c
     small/slab_cache.c
     small/region.c
     small/mempool.c
diff --git a/small/config.h.cmake b/small/config.h.cmake
new file mode 100644
index 0000000..989ffd1
--- /dev/null
+++ b/small/config.h.cmake
@@ -0,0 +1,31 @@
+#ifndef TARANTOOL_SMALL_CONFIG_H_INCLUDED
+#define TARANTOOL_SMALL_CONFIG_H_INCLUDED
+
+/*
+ * Check for deprecated MAP_ANON.
+ */
+#cmakedefine TARANTOOL_SMALL_HAVE_MAP_ANON 1
+#cmakedefine TARANTOOL_SMALL_HAVE_MAP_ANONYMOUS 1
+
+#if !defined(TARANTOOL_SMALL_HAVE_MAP_ANONYMOUS) && defined(TARANTOOL_SMALL_HAVE_MAP_ANON)
+/*
+ * MAP_ANON is deprecated, MAP_ANONYMOUS should be used instead.
+ * Unfortunately, it's not universally present (e.g. not present
+ * on FreeBSD.
+ */
+# define MAP_ANONYMOUS MAP_ANON
+#endif
+
+/*
+ * Defined if this platform has madvise(..)
+ * and flags we're interested in.
+ */
+#cmakedefine TARANTOOL_SMALL_HAVE_MADVISE 1
+#cmakedefine TARANTOOL_SMALL_HAVE_MADV_DONTDUMP 1
+
+#if defined(TARANTOOL_SMALL_HAVE_MADVISE)	&& \
+    defined(TARANTOOL_SMALL_HAVE_MADV_DONTDUMP)
+# define TARANTOOL_SMALL_USE_MADVISE 1
+#endif
+
+#endif /* TARANTOOL_SMALL_CONFIG_H_INCLUDED */
diff --git a/small/features.c b/small/features.c
new file mode 100644
index 0000000..ad17d5c
--- /dev/null
+++ b/small/features.c
@@ -0,0 +1,134 @@
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdio.h>
+#include <errno.h>
+
+#include <sys/mman.h>
+
+#include "features.h"
+#include "config.h"
+
+typedef bool (*rt_helper_t)(void);
+
+#define SMALL_FEATURE_MASK(v)	((uint64_t)1 << (v))
+
+/* The mask carries bits for compiled in features */
+static uint64_t builtin_mask =
+#ifdef TARANTOOL_SMALL_USE_MADVISE
+	SMALL_FEATURE_MASK(SMALL_FEATURE_DONTDUMP)	|
+#endif
+	0;
+
+#ifdef TARANTOOL_SMALL_USE_MADVISE
+static bool
+test_dontdump(void)
+{
+	size_t size = sysconf(_SC_PAGESIZE);
+	intptr_t ignore_it;
+	bool ret = false;
+	char buf[64];
+	void *ptr;
+
+	(void)ignore_it;
+
+	/*
+	 * We need to try madvise() on real data;
+	 * a plain madvise() failing with ENOSYS is
+	 * not enough: we need the specific flag
+	 * to be supported. Thus allocate a page
+	 * and work on it.
+	 */
+
+	ptr = mmap(NULL, size, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	if (ptr == MAP_FAILED) {
+		/*
+		 * We're out of memory, and can't guarantee anything.
+		 */
+		ignore_it = (intptr_t)strerror_r(errno, buf, sizeof(buf));
+		fprintf(stderr, "Error in mmap(NULL, %zu, ...): %s\n", size, buf);
+		goto out;
+	}
+
+	if (madvise(ptr, size, MADV_DONTDUMP) == 0)
+		ret = true;
+
+	if (munmap(ptr, size)) {
+		ignore_it = (intptr_t)strerror_r(errno, buf, sizeof(buf));
+		fprintf(stderr, "Error in munmap(%p, %zu): %s\n",
+			ptr, size, buf);
+	}
+out:
+	return ret;
+}
+#else
+static bool test_dontdump(void) { return false; }
+#endif
+
+/*
+ * Runtime testers: put features here if they are dynamic.
+ */
+static rt_helper_t rt_helpers[FEATURE_MAX] = {
+	[SMALL_FEATURE_DONTDUMP]	= test_dontdump,
+};
+
+/**
+ * small_test_feature -- test if particular feature is supported
+ * @feature: A feature to test.
+ *
+ * Returns true if the feature is available, false otherwise.
+ */
+bool
+small_test_feature(unsigned int feature)
+{
+	if (feature >= FEATURE_MAX)
+		return false;
+
+	uint64_t mask = SMALL_FEATURE_MASK(feature);
+
+	/* Make sure it is compiled in. */
+	if (!(builtin_mask & mask))
+		return false;
+
+	/*
+	 * Some feature may require runtime
+	 * testing (say we're compiled with
+	 * particular feature support but
+	 * it is unavailable on the system
+	 * we're running on, or get filtered
+	 * out).
+	 */
+	return rt_helpers[feature] ? rt_helpers[feature]() : true;
+}
diff --git a/small/features.h b/small/features.h
new file mode 100644
index 0000000..0a994cc
--- /dev/null
+++ b/small/features.h
@@ -0,0 +1,60 @@
+#ifndef TARANTOOL_SMALL_FEATURES_H_INCLUDED
+#define TARANTOOL_SMALL_FEATURES_H_INCLUDED
+/*
+ * Copyright 2010-2019, Tarantool AUTHORS, please see AUTHORS file.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the
+ *    following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
+ * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+ * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
+ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdbool.h>
+
+#if defined(__cplusplus)
+extern "C" {
+#endif /* defined(__cplusplus) */
+
+/**
+ * A list of features to test.
+ */
+enum {
+	/* To check if SLAB_ARENA_DONTDUMP is supported */
+	SMALL_FEATURE_DONTDUMP		= 0,
+
+	FEATURE_MAX
+};
+
+/**
+ * Test if a feature is supported.
+ */
+bool
+small_test_feature(unsigned int feature);
+
+#if defined(__cplusplus)
+} /* extern "C" */
+#endif
+
+#endif /* TARANTOOL_SMALL_FEATURES_H_INCLUDED */
diff --git a/small/slab_arena.c b/small/slab_arena.c
index 26e2620..2d3fc55 100644
--- a/small/slab_arena.c
+++ b/small/slab_arena.c
@@ -41,9 +41,32 @@
 #include <valgrind/valgrind.h>
 #include <valgrind/memcheck.h>
 
-#if !defined(MAP_ANONYMOUS)
-#define MAP_ANONYMOUS MAP_ANON
+static void
+madvise_checked(void *ptr, size_t size, int flags)
+{
+#ifdef TARANTOOL_SMALL_USE_MADVISE
+	if (!ptr)
+		return;
+
+	if (!IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_DONTDUMP))
+		return;
+
+	if (madvise(ptr, size, MADV_DONTDUMP)) {
+		intptr_t ignore_it;
+		char buf[64];
+
+		ignore_it = (intptr_t)strerror_r(errno, buf, sizeof(buf));
+		(void)ignore_it;
+
+		fprintf(stderr, "Error in madvise(%p, %zu, 0x%x): %s\n",
+			ptr, size, MADV_DONTDUMP, buf);
+	}
+#else
+	(void)ptr;
+	(void)size;
+	(void)flags;
 #endif
+}
 
 static void
 munmap_checked(void *addr, size_t size)
@@ -66,13 +89,18 @@ mmap_checked(size_t size, size_t align, int flags)
 	assert((align & (align - 1)) == 0);
 	/* The size must be a multiple of alignment */
 	assert((size & (align - 1)) == 0);
+
+	if (IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_PRIVATE))
+		flags = MAP_PRIVATE | MAP_ANONYMOUS;
+	else
+		flags = MAP_SHARED | MAP_ANONYMOUS;
+
 	/*
 	 * All mappings except the first are likely to
 	 * be aligned already.  Be optimistic by trying
 	 * to map exactly the requested amount.
 	 */
-	void *map = mmap(NULL, size, PROT_READ | PROT_WRITE,
-			 flags | MAP_ANONYMOUS, -1, 0);
+	void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
 	if (map == MAP_FAILED)
 		return NULL;
 	if (((intptr_t) map & (align - 1)) == 0)
@@ -85,8 +113,7 @@ mmap_checked(size_t size, size_t align, int flags)
 	 * fragmentation depending on the kernels allocation
 	 * strategy.
 	 */
-	map = mmap(NULL, size + align, PROT_READ | PROT_WRITE,
-		   flags | MAP_ANONYMOUS, -1, 0);
+	map = mmap(NULL, size + align, PROT_READ | PROT_WRITE, flags, -1, 0);
 	if (map == MAP_FAILED)
 		return NULL;
 
@@ -123,11 +150,33 @@ pow2round(size_t size)
 #define MAX(a, b) ((a) > (b) ? (a) : (b))
 #define MIN(a, b) ((a) < (b) ? (a) : (b))
 
+static void
+slab_arena_flags_init(struct slab_arena *arena, int flags)
+{
+	/*
+	 * Old interface for backward compatibility, MAP_
+	 * flags are passed directly without SLAB_ARENA_FLAG_MARK,
+	 * map them to internal ones.
+	 */
+	if (!(flags & SLAB_ARENA_FLAG_MARK)) {
+		assert(flags & (MAP_PRIVATE | MAP_SHARED));
+		if (flags == MAP_PRIVATE)
+			arena->flags = SLAB_ARENA_PRIVATE;
+		else
+			arena->flags = SLAB_ARENA_SHARED;
+		return;
+	}
+
+	assert(IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_PRIVATE) ||
+	       IS_SLAB_ARENA_FLAG(flags, SLAB_ARENA_SHARED));
+
+	arena->flags = flags;
+}
+
 int
 slab_arena_create(struct slab_arena *arena, struct quota *quota,
 		  size_t prealloc, uint32_t slab_size, int flags)
 {
-	assert(flags & (MAP_PRIVATE | MAP_SHARED));
 	lf_lifo_init(&arena->cache);
 	VALGRIND_MAKE_MEM_DEFINED(&arena->cache, sizeof(struct lf_lifo));
 
@@ -148,7 +197,7 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
 
 	arena->used = 0;
 
-	arena->flags = flags;
+	slab_arena_flags_init(arena, flags);
 
 	if (arena->prealloc) {
 		arena->arena = mmap_checked(arena->prealloc,
@@ -157,6 +206,9 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
 	} else {
 		arena->arena = NULL;
 	}
+
+	madvise_checked(arena->arena, arena->slab_size, arena->flags);
+
 	return arena->prealloc && !arena->arena ? -1 : 0;
 }
 
@@ -205,6 +257,9 @@ slab_map(struct slab_arena *arena)
 		__sync_sub_and_fetch(&arena->used, arena->slab_size);
 		quota_release(arena->quota, arena->slab_size);
 	}
+
+	madvise_checked(ptr, arena->slab_size, arena->flags);
+
 	VALGRIND_MAKE_MEM_UNDEFINED(ptr, arena->slab_size);
 	return ptr;
 }
diff --git a/small/slab_arena.h b/small/slab_arena.h
index 364252f..f8b0845 100644
--- a/small/slab_arena.h
+++ b/small/slab_arena.h
@@ -33,6 +33,7 @@
 #include "lf_lifo.h"
 #include <sys/mman.h>
 #include <limits.h>
+#include "config.h"
 
 #if defined(__cplusplus)
 extern "C" {
@@ -45,6 +46,30 @@ enum {
 	SMALL_UNLIMITED = SIZE_MAX/2 + 1
 };
 
+/**
+ * Backward compatible flags to be used with slab_arena_create().
+ * Initially we have been passing MAP_SHARED or MAP_PRIVATE flags
+ * only, thus to continue supporting them we need to mark the new API
+ * with some predefined value. For this sake we reserve the high bit
+ * as a mark which allows us to distinguish system independent
+ * SLAB_ARENA_ flags from anything else.
+ *
+ * Note that SLAB_ARENA_FLAG_MARK makes every flag value carry more
+ * than one bit, so use the IS_SLAB_ARENA_FLAG helper for testing.
+ */
+#define SLAB_ARENA_FLAG_MARK	(0x80000000)
+#define SLAB_ARENA_FLAG(x)	((x) | SLAB_ARENA_FLAG_MARK)
+#define IS_SLAB_ARENA_FLAG(f,x)	(((f) & (x)) == (x))
+
+enum {
+	/* mmap() flags */
+	SLAB_ARENA_PRIVATE	= SLAB_ARENA_FLAG(1 << 0),
+	SLAB_ARENA_SHARED	= SLAB_ARENA_FLAG(1 << 1),
+
+	/* madvise() flags */
+	SLAB_ARENA_DONTDUMP	= SLAB_ARENA_FLAG(1 << 2)
+};
+
 /**
  * slab_arena -- a source of large aligned blocks of memory.
  * MT-safe.
@@ -94,7 +119,7 @@ struct slab_arena {
 	 */
 	uint32_t slab_size;
 	/**
-	 * mmap() flags: MAP_SHARED or MAP_PRIVATE
+	 * SLAB_ARENA_ flags for mmap() and madvise() calls.
 	 */
 	int flags;
 };
diff --git a/test/slab_arena.c b/test/slab_arena.c
index d0a78d1..7a7c243 100644
--- a/test/slab_arena.c
+++ b/test/slab_arena.c
@@ -2,7 +2,9 @@
 #include <small/quota.h>
 #include <stdio.h>
 #include <limits.h>
+#include <unistd.h>
 #include <stdlib.h>
+#include <string.h>
 #include <time.h>
 #include "unit.h"
 
@@ -15,6 +17,132 @@ slab_arena_print(struct slab_arena *arena)
 	       arena->used, arena->slab_size);
 }
 
+#define is_hex_digit(c)				\
+	(((c) >= '0' && (c) <= '9')	||	\
+	 ((c) >= 'a' && (c) <= 'f')	||	\
+	 ((c) >= 'A' && (c) <= 'F'))
+
+static bool
+is_vma_range_fmt(char *line, unsigned long *start, unsigned long *end)
+{
+	char *p = line;
+	while (*line && is_hex_digit(*line))
+		line++;
+
+	if (*line++ != '-')
+		return false;
+
+	while (*line && is_hex_digit(*line))
+		line++;
+
+	if (*line++ != ' ')
+		return false;
+
+	sscanf(p, "%lx-%lx", start, end);
+	return true;
+}
+
+static bool
+vma_has_flag(unsigned long vm_start, const char *flag)
+{
+	unsigned long start = 0, end = 0;
+	char buf[1024], *tok = NULL;
+	bool found = false;
+	FILE *f = NULL;
+
+	f = fopen("/proc/self/smaps", "r");
+	if (!f) {
+		printf("ERROR: Can't open smaps for %lx\n", vm_start);
+		return false;
+	}
+
+	while (fgets(buf, sizeof(buf), f)) {
+		if (is_vma_range_fmt(buf, &start, &end))
+			continue;
+		if (strncmp(buf, "VmFlags: ", 9) || start != vm_start)
+			continue;
+		tok = buf;
+		break;
+	}
+
+	if (tok) {
+		for (tok = strtok(tok, " \n"); tok;
+		     tok = strtok(NULL, " \n")) {
+			if (strcmp(tok, flag))
+				continue;
+			found = true;
+			break;
+		}
+	}
+
+	fclose(f);
+	return found;
+}
+
+static void
+slab_test_madvise(void)
+{
+	struct slab_arena arena;
+	struct quota quota;
+	void *ptr;
+
+	/*
+	 * The system doesn't support flags fetching.
+	 */
+	if (access("/proc/self/smaps", F_OK))
+		return;
+
+	/*
+	 * If system supports madvise call, test that
+	 * preallocated area has been madvised.
+	 */
+	quota_init(&quota, 2000000);
+	slab_arena_create(&arena, &quota, 3000000, 1,
+			  SLAB_ARENA_PRIVATE | SLAB_ARENA_DONTDUMP);
+
+	/*
+	 * Will fetch from preallocated area.
+	 */
+	ptr = slab_map(&arena);
+	if (!ptr) {
+		printf("ERROR: can't obtain preallocated slab\n");
+		goto out;
+	}
+
+	if (!vma_has_flag((unsigned long)ptr, "dd"))
+		goto no_dd;
+
+	slab_unmap(&arena, ptr);
+	slab_arena_destroy(&arena);
+
+	/*
+	 * A new slab for dynamic allocation.
+	 */
+	quota_init(&quota, 2000000);
+	slab_arena_create(&arena, &quota, 0, 0x10000,
+			  SLAB_ARENA_PRIVATE | SLAB_ARENA_DONTDUMP);
+
+	/*
+	 * Will fetch newly allocated area.
+	 */
+	ptr = slab_map(&arena);
+	if (!ptr) {
+		printf("ERROR: can't obtain dynamic slab\n");
+		goto out;
+	}
+
+	if (!vma_has_flag((unsigned long)ptr, "dd"))
+		goto no_dd;
+
+out:
+	slab_unmap(&arena, ptr);
+	slab_arena_destroy(&arena);
+	return;
+no_dd:
+	printf("ERROR: Expected dd flag on VMA address %p\n", ptr);
+	goto out;
+}
+
 int main()
 {
 	struct quota quota;
@@ -42,4 +170,6 @@ int main()
 	slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE);
 	slab_arena_print(&arena);
 	slab_arena_destroy(&arena);
+
+	slab_test_madvise();
 }
-- 
2.20.1

* [PATCH v5 2/2] box/memtx: Allow to skip tuple memory from coredump
  2019-05-16 21:45 [PATCH v5 0/2] Implement support of strip_core functionality Cyrill Gorcunov
  2019-05-16 21:45 ` [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints Cyrill Gorcunov
@ 2019-05-16 21:45 ` Cyrill Gorcunov
  2019-05-20 10:03 ` [PATCH v5 0/2] Implement support of strip_core functionality Vladimir Davydov
  2 siblings, 0 replies; 7+ messages in thread
From: Cyrill Gorcunov @ 2019-05-16 21:45 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Alexander Turenko, gorcunov

If there is a huge amount of tuples, the whole tuple
memory goes into the coredump file even when we don't
need it for problem investigation. As a result, the
coredump may blow up to gigabytes in size.

Let's allow this memory to be excluded from dumping via
the box.cfg::strip_core boolean parameter.

Note that the tuple arena is used not only for the tuples
themselves but also for memtx->index_extent_pool and
memtx->iterator_pool, so they are affected too.
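
In essence (condensed from the tuple_arena_create() hunk below), the new
parameter is turned into an arena flag:

	int flags = SLAB_ARENA_PRIVATE;
	if (dontdump)
		flags |= SLAB_ARENA_DONTDUMP;
	slab_arena_create(arena, quota, prealloc, slab_size, flags);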

Fixes #3509

@TarantoolBot document
Title: Document box.cfg.strip_core

When Tarantool runs under heavy load the memory allocated
for tuples may grow huge, and to keep this memory out of the
`coredump` file the `box.cfg.strip_core` parameter should be
set to `true` (i.e. `box.cfg{strip_core = true}`).

The default value is `false`.
---
v2:
 - Use strip_core name for box parameter
 - Pass cfg_geti directly to functions
 - vy_mem_env_create is left as it was for now,
   simply because we can't use cfg_geti there
   (linking with the library fails); I think we
   can address it on top later
v5:
 - Cover all code with new SLAB_ARENA_FLAG_ entries
   (actually there was one place only: memory_init)
 - In load_cfg, test at runtime whether SMALL_FEATURE_DONTDUMP
   is supported and print a warning if it is not.

 src/box/box.cc           |  1 +
 src/box/lua/load_cfg.lua |  2 ++
 src/box/memtx_engine.c   |  4 ++--
 src/box/memtx_engine.h   |  9 ++++++---
 src/box/tuple.c          | 15 ++++++++++++---
 src/box/tuple.h          |  2 +-
 src/box/vy_mem.c         |  2 +-
 src/lib/core/memory.c    |  2 +-
 src/main.cc              | 12 ++++++++++++
 test/box/admin.result    |  2 ++
 test/box/cfg.result      |  4 ++++
 11 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/src/box/box.cc b/src/box/box.cc
index 7828f575b..57419ee01 100644
--- a/src/box/box.cc
+++ b/src/box/box.cc
@@ -1681,6 +1681,7 @@ engine_init()
 				    cfg_geti("force_recovery"),
 				    cfg_getd("memtx_memory"),
 				    cfg_geti("memtx_min_tuple_size"),
+				    cfg_geti("strip_core"),
 				    cfg_getd("slab_alloc_factor"));
 	engine_register((struct engine *)memtx);
 	box_set_memtx_max_tuple_size();
diff --git a/src/box/lua/load_cfg.lua b/src/box/lua/load_cfg.lua
index 5530b2caa..9f3344da3 100644
--- a/src/box/lua/load_cfg.lua
+++ b/src/box/lua/load_cfg.lua
@@ -25,6 +25,7 @@ end
 local default_cfg = {
     listen              = nil,
     memtx_memory        = 256 * 1024 *1024,
+    strip_core          = false,
     memtx_min_tuple_size = 16,
     memtx_max_tuple_size = 1024 * 1024,
     slab_alloc_factor   = 1.05,
@@ -88,6 +89,7 @@ local default_cfg = {
 local template_cfg = {
     listen              = 'string, number',
     memtx_memory        = 'number',
+    strip_core          = 'boolean',
     memtx_min_tuple_size  = 'number',
     memtx_max_tuple_size  = 'number',
     slab_alloc_factor   = 'number',
diff --git a/src/box/memtx_engine.c b/src/box/memtx_engine.c
index 25e800341..f4312484a 100644
--- a/src/box/memtx_engine.c
+++ b/src/box/memtx_engine.c
@@ -1011,7 +1011,7 @@ memtx_engine_gc_f(va_list va)
 struct memtx_engine *
 memtx_engine_new(const char *snap_dirname, bool force_recovery,
 		 uint64_t tuple_arena_max_size, uint32_t objsize_min,
-		 float alloc_factor)
+		 bool dontdump, float alloc_factor)
 {
 	struct memtx_engine *memtx = calloc(1, sizeof(*memtx));
 	if (memtx == NULL) {
@@ -1066,7 +1066,7 @@ memtx_engine_new(const char *snap_dirname, bool force_recovery,
 	/* Initialize tuple allocator. */
 	quota_init(&memtx->quota, tuple_arena_max_size);
 	tuple_arena_create(&memtx->arena, &memtx->quota, tuple_arena_max_size,
-			   SLAB_SIZE, "memtx");
+			   SLAB_SIZE, dontdump, "memtx");
 	slab_cache_create(&memtx->slab_cache, &memtx->arena);
 	small_alloc_create(&memtx->alloc, &memtx->slab_cache,
 			   objsize_min, alloc_factor);
diff --git a/src/box/memtx_engine.h b/src/box/memtx_engine.h
index 8f4ce7cdd..ccb51678d 100644
--- a/src/box/memtx_engine.h
+++ b/src/box/memtx_engine.h
@@ -189,7 +189,8 @@ memtx_engine_schedule_gc(struct memtx_engine *memtx,
 struct memtx_engine *
 memtx_engine_new(const char *snap_dirname, bool force_recovery,
 		 uint64_t tuple_arena_max_size,
-		 uint32_t objsize_min, float alloc_factor);
+		 uint32_t objsize_min, bool dontdump,
+		 float alloc_factor);
 
 int
 memtx_engine_recover_snapshot(struct memtx_engine *memtx,
@@ -257,12 +258,14 @@ memtx_index_def_change_requires_rebuild(struct index *index,
 static inline struct memtx_engine *
 memtx_engine_new_xc(const char *snap_dirname, bool force_recovery,
 		    uint64_t tuple_arena_max_size,
-		    uint32_t objsize_min, float alloc_factor)
+		    uint32_t objsize_min, bool dontdump,
+		    float alloc_factor)
 {
 	struct memtx_engine *memtx;
 	memtx = memtx_engine_new(snap_dirname, force_recovery,
 				 tuple_arena_max_size,
-				 objsize_min, alloc_factor);
+				 objsize_min, dontdump,
+				 alloc_factor);
 	if (memtx == NULL)
 		diag_raise();
 	return memtx;
diff --git a/src/box/tuple.c b/src/box/tuple.c
index 45c6727a4..a7ef332b2 100644
--- a/src/box/tuple.c
+++ b/src/box/tuple.c
@@ -323,7 +323,7 @@ tuple_init(field_name_hash_f hash)
 void
 tuple_arena_create(struct slab_arena *arena, struct quota *quota,
 		   uint64_t arena_max_size, uint32_t slab_size,
-		   const char *arena_name)
+		   bool dontdump, const char *arena_name)
 {
 	/*
 	 * Ensure that quota is a multiple of slab_size, to
@@ -331,11 +331,17 @@ tuple_arena_create(struct slab_arena *arena, struct quota *quota,
 	 */
 	size_t prealloc = small_align(arena_max_size, slab_size);
 
+	/*
+	 * Skip from coredump if requested.
+	 */
+	int flags = SLAB_ARENA_PRIVATE;
+	if (dontdump)
+		flags |= SLAB_ARENA_DONTDUMP;
+
 	say_info("mapping %zu bytes for %s tuple arena...", prealloc,
 		 arena_name);
 
-	if (slab_arena_create(arena, quota, prealloc, slab_size,
-			      MAP_PRIVATE) != 0) {
+	if (slab_arena_create(arena, quota, prealloc, slab_size, flags) != 0) {
 		if (errno == ENOMEM) {
 			panic("failed to preallocate %zu bytes: Cannot "\
 			      "allocate memory, check option '%s_memory' in box.cfg(..)", prealloc,
@@ -345,6 +351,9 @@ tuple_arena_create(struct slab_arena *arena, struct quota *quota,
 				       " tuple arena", prealloc, arena_name);
 		}
 	}
+
+	say_debug("tuple arena %s: addr %p size %zu flags %#x dontdump %d",
+		  arena_name, arena->arena, prealloc, flags, dontdump);
 }
 
 void
diff --git a/src/box/tuple.h b/src/box/tuple.h
index 0d950940d..99dfeb82d 100644
--- a/src/box/tuple.h
+++ b/src/box/tuple.h
@@ -70,7 +70,7 @@ tuple_free(void);
 void
 tuple_arena_create(struct slab_arena *arena, struct quota *quota,
 		   uint64_t arena_max_size, uint32_t slab_size,
-		   const char *arena_name);
+		   bool dontdump, const char *arena_name);
 
 void
 tuple_arena_destroy(struct slab_arena *arena);
diff --git a/src/box/vy_mem.c b/src/box/vy_mem.c
index a4fae26e2..b4d016a68 100644
--- a/src/box/vy_mem.c
+++ b/src/box/vy_mem.c
@@ -54,7 +54,7 @@ vy_mem_env_create(struct vy_mem_env *env, size_t memory)
 	/* Vinyl memory is limited by vy_quota. */
 	quota_init(&env->quota, QUOTA_MAX);
 	tuple_arena_create(&env->arena, &env->quota, memory,
-			   SLAB_SIZE, "vinyl");
+			   SLAB_SIZE, false, "vinyl");
 	lsregion_create(&env->allocator, &env->arena);
 	env->tree_extent_size = 0;
 }
diff --git a/src/lib/core/memory.c b/src/lib/core/memory.c
index 059304e34..f236c1887 100644
--- a/src/lib/core/memory.c
+++ b/src/lib/core/memory.c
@@ -43,7 +43,7 @@ memory_init()
 
 	/* No limit on the runtime memory. */
 	slab_arena_create(&runtime, &runtime_quota, 0,
-			  SLAB_SIZE, MAP_PRIVATE);
+			  SLAB_SIZE, SLAB_ARENA_PRIVATE);
 }
 
 void
diff --git a/src/main.cc b/src/main.cc
index 606c64c13..569ff4b5f 100644
--- a/src/main.cc
+++ b/src/main.cc
@@ -65,6 +65,7 @@
 #include "lua/init.h"
 #include "box/box.h"
 #include "box/error.h"
+#include "small/features.h"
 #include "scoped_guard.h"
 #include "random.h"
 #include "cfg.h"
@@ -488,6 +489,17 @@ load_cfg()
 #endif
 	}
 
+	/*
+	 * If we're requested to strip the coredump,
+	 * make sure we can actually do that and
+	 * warn the user if we can not.
+	 */
+	if (cfg_geti("strip_core")) {
+		if (!small_test_feature(SMALL_FEATURE_DONTDUMP)) {
+			say_warn("'strip_core' is set but unsupported");
+		}
+	}
+
 	int background = cfg_geti("background");
 	const char *log = cfg_gets("log");
 	const char *log_format = cfg_gets("log_format");
diff --git a/test/box/admin.result b/test/box/admin.result
index 53ced2fcc..bbebbd224 100644
--- a/test/box/admin.result
+++ b/test/box/admin.result
@@ -84,6 +84,8 @@ cfg_filter(box.cfg)
     - 500000
   - - slab_alloc_factor
     - 1.05
+  - - strip_core
+    - false
   - - too_long_threshold
     - 0.5
   - - vinyl_bloom_fpr
diff --git a/test/box/cfg.result b/test/box/cfg.result
index 66b02f591..81f4afac8 100644
--- a/test/box/cfg.result
+++ b/test/box/cfg.result
@@ -72,6 +72,8 @@ cfg_filter(box.cfg)
     - 500000
   - - slab_alloc_factor
     - 1.05
+  - - strip_core
+    - false
   - - too_long_threshold
     - 0.5
   - - vinyl_bloom_fpr
@@ -171,6 +173,8 @@ cfg_filter(box.cfg)
     - 500000
   - - slab_alloc_factor
     - 1.05
+  - - strip_core
+    - false
   - - too_long_threshold
     - 0.5
   - - vinyl_bloom_fpr
-- 
2.20.1

* Re: [PATCH v5 0/2] Implement support of strip_core functionality
  2019-05-16 21:45 [PATCH v5 0/2] Implement support of strip_core functionality Cyrill Gorcunov
  2019-05-16 21:45 ` [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints Cyrill Gorcunov
  2019-05-16 21:45 ` [PATCH v5 2/2] box/memtx: Allow to skip tuple memory from coredump Cyrill Gorcunov
@ 2019-05-20 10:03 ` Vladimir Davydov
  2019-05-20 10:09   ` Cyrill Gorcunov
  2 siblings, 1 reply; 7+ messages in thread
From: Vladimir Davydov @ 2019-05-20 10:03 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Alexander Turenko

On Fri, May 17, 2019 at 12:45:57AM +0300, Cyrill Gorcunov wrote:
> Volodya, could you please help with more intensive testing since I'm not sure
> how to fill memtx from console or some script yet.

Just type in the console or a script:

  -- alloc 1 GB for memtx tuple arena
  box.cfg{memtx_memory = 1024 * 1024 * 1024}

  -- create a test space
  s = box.schema.space.create('test')
  s:create_index('primary')

  -- fill it until we run out of memory
  for i = 1, math.huge do s:insert{i, string.rep('x', 100 * 1024)} end

  -- double the size of memtx arena by adding more slabs
  box.cfg{memtx_memory = 2 * 1024 * 1024 * 1024}

  -- fill them up too
  for i = 1, math.huge do s:insert{i, string.rep('x', 100 * 1024)} end

Then kill the tarantool with SIGSEGV to generate a core. The core size
should be considerably less than 2 GB with strip_core enabled.

* Re: [PATCH v5 0/2] Implement support of strip_core functionality
  2019-05-20 10:03 ` [PATCH v5 0/2] Implement support of strip_core functionality Vladimir Davydov
@ 2019-05-20 10:09   ` Cyrill Gorcunov
  0 siblings, 0 replies; 7+ messages in thread
From: Cyrill Gorcunov @ 2019-05-20 10:09 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml, Alexander Turenko

On Mon, May 20, 2019 at 01:03:53PM +0300, Vladimir Davydov wrote:
> On Fri, May 17, 2019 at 12:45:57AM +0300, Cyrill Gorcunov wrote:
> > Volodya, could you please help with more intensive testing since I'm not sure
> > how to fill memtx from console or some script yet.
> 

Thank you! Will do.

* Re: [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints
  2019-05-16 21:45 ` [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints Cyrill Gorcunov
@ 2019-05-20 12:55   ` Vladimir Davydov
  2019-05-20 13:03     ` Cyrill Gorcunov
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir Davydov @ 2019-05-20 12:55 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Alexander Turenko

On Fri, May 17, 2019 at 12:45:58AM +0300, Cyrill Gorcunov wrote:
> @@ -157,6 +206,9 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
>  	} else {
>  		arena->arena = NULL;
>  	}
> +
> +	madvise_checked(arena->arena, arena->slab_size, arena->flags);
> +

Apparently, you should use arena->prealloc here. Fixed and pushed the
patch set to master (both tarantool and tarantool-small).
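
For clarity, the presumed corrected call (not quoted from the pushed
commit) would be:

	madvise_checked(arena->arena, arena->prealloc, arena->flags);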

* Re: [PATCH v5 1/2] slab_arena: Enhance slab_arena_create to support madvise hints
  2019-05-20 12:55   ` Vladimir Davydov
@ 2019-05-20 13:03     ` Cyrill Gorcunov
  0 siblings, 0 replies; 7+ messages in thread
From: Cyrill Gorcunov @ 2019-05-20 13:03 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml, Alexander Turenko

On Mon, May 20, 2019 at 03:55:39PM +0300, Vladimir Davydov wrote:
> On Fri, May 17, 2019 at 12:45:58AM +0300, Cyrill Gorcunov wrote:
> > @@ -157,6 +206,9 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
> >  	} else {
> >  		arena->arena = NULL;
> >  	}
> > +
> > +	madvise_checked(arena->arena, arena->slab_size, arena->flags);
> > +
> 
> Apparently, you should use arena->prealloc here. Fixed and pushed the
> patch set to master (both tarantool and tarantool-small).

Indeed, thanks!
