Tarantool development patches archive
 help / color / mirror / Atom feed
* [PATCH 0/3] small: Prepare ground for madvise
@ 2019-05-01 15:50 Cyrill Gorcunov
  2019-05-01 15:50 ` [PATCH 1/3] build: Check for madvise syscall Cyrill Gorcunov
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-01 15:50 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

We need to eliminate some slab areas from coredump, for
this sake lets provide a way to small allocator to advise
memory system about our wish.

Guys, this is early RFC, comments and ideas on API are welcome.

Cyrill Gorcunov (3):
  build: Check for madvise syscall
  slab_arena: Provide slab_arena_madvise_create to madvice slabs
  test: slab_arena -- Verify madvise

 CMakeLists.txt       | 11 ++++++
 small/config.h.cmake |  9 +++++
 small/slab_arena.c   | 49 ++++++++++++++++++++++++
 small/slab_arena.h   | 27 +++++++++++++
 test/slab_arena.c    | 90 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 186 insertions(+)
 create mode 100644 small/config.h.cmake

-- 
2.20.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] build: Check for madvise syscall
  2019-05-01 15:50 [PATCH 0/3] small: Prepare ground for madvise Cyrill Gorcunov
@ 2019-05-01 15:50 ` Cyrill Gorcunov
  2019-05-06 10:25   ` Vladimir Davydov
  2019-05-01 15:50 ` [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs Cyrill Gorcunov
  2019-05-01 15:50 ` [PATCH 3/3] test: slab_arena -- Verify madvise Cyrill Gorcunov
  2 siblings, 1 reply; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-01 15:50 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

Need to check if madvise is present in the system

Part of #3509
---
https://github.com/tarantool/tarantool/issues/3509

 CMakeLists.txt       | 11 +++++++++++
 small/config.h.cmake |  9 +++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 small/config.h.cmake

diff --git a/CMakeLists.txt b/CMakeLists.txt
index ad27423..97d25bc 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,6 +1,8 @@
 project(small C CXX)
 cmake_minimum_required(VERSION 2.8 FATAL_ERROR)
 
+include(CheckFunctionExists)
+
 if(NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Debug)
 endif()
@@ -15,10 +17,19 @@ endif()
 # Enable GNU glibc extentions.
 add_definitions("-D_GNU_SOURCE")
 
+check_function_exists(madvise TARANTOOL_SMALL_HAS_MADVISE)
+
+configure_file(
+    "small/config.h.cmake"
+    "small/config.h"
+    )
+message (STATUS "")
+
 # Valgrind
 include_directories(third_party)
 
 set(lib_headers
+    small/config.h
     small/ibuf.h
     small/lf_lifo.h
     small/lifo.h
diff --git a/small/config.h.cmake b/small/config.h.cmake
new file mode 100644
index 0000000..1c9db10
--- /dev/null
+++ b/small/config.h.cmake
@@ -0,0 +1,9 @@
+#ifndef TARANTOOL_SMALL_CONFIG_H_INCLUDED
+#define TARANTOOL_SMALL_CONFIG_H_INCLUDED
+
+/*
+ * Defined if this platform has madvise(..).
+ */
+ #cmakedefine TARANTOOL_SMALL_HAS_MADVISE 1
+
+#endif /* TARANTOOL_SMALL_CONFIG_H_INCLUDED */
-- 
2.20.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs
  2019-05-01 15:50 [PATCH 0/3] small: Prepare ground for madvise Cyrill Gorcunov
  2019-05-01 15:50 ` [PATCH 1/3] build: Check for madvise syscall Cyrill Gorcunov
@ 2019-05-01 15:50 ` Cyrill Gorcunov
  2019-05-06 10:38   ` Vladimir Davydov
  2019-05-01 15:50 ` [PATCH 3/3] test: slab_arena -- Verify madvise Cyrill Gorcunov
  2 siblings, 1 reply; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-01 15:50 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

The purpose of this helper is to be able to eliminate
some of slab arenas from dumping, especially when huge
amount of memory involves.

For this sake one should call slab_arena_madvise_create
right after area's creation. Strictly speaking this helper
may be called in any time but then some of already created
slabs which are not inside preallocated area won't be
madvised.

Part of #3509
---
https://github.com/tarantool/tarantool/issues/3509

 small/slab_arena.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 small/slab_arena.h | 27 +++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/small/slab_arena.c b/small/slab_arena.c
index 26e2620..12f6198 100644
--- a/small/slab_arena.c
+++ b/small/slab_arena.c
@@ -149,6 +149,9 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
 	arena->used = 0;
 
 	arena->flags = flags;
+#ifdef TARANTOOL_SMALL_HAS_MADVISE
+	pm_atomic_store(&arena->advice, MADV_NORMAL);
+#endif
 
 	if (arena->prealloc) {
 		arena->arena = mmap_checked(arena->prealloc,
@@ -205,6 +208,12 @@ slab_map(struct slab_arena *arena)
 		__sync_sub_and_fetch(&arena->used, arena->slab_size);
 		quota_release(arena->quota, arena->slab_size);
 	}
+
+#ifdef TARANTOOL_SMALL_HAS_MADVISE
+	if (ptr && pm_atomic_load(&arena->advice) != MADV_NORMAL)
+		madvise(ptr, arena->slab_size, arena->advice);
+#endif
+
 	VALGRIND_MAKE_MEM_UNDEFINED(ptr, arena->slab_size);
 	return ptr;
 }
@@ -226,3 +235,43 @@ slab_arena_mprotect(struct slab_arena *arena)
 	if (arena->arena)
 		mprotect(arena->arena, arena->prealloc, PROT_READ);
 }
+
+#ifdef TARANTOOL_SMALL_HAS_MADVISE
+/**
+ * slab_arena_madvise_create - Initialize madvise permanent strategy
+ * @arena:	slab arena to madvise
+ * @advice:	strategy (MADV_DONTDUMP, ...)
+ *
+ * This function setups @slab_arena strategy so that new memory
+ * allocation with mmap() call gets madvise() call with @advice.
+ *
+ * Note that we don't allow to change strategy once choosen because
+ * otherwise we need to walk over all possible slab cache entry.
+ *
+ * This function should be called right after the slab arena creation,
+ * otherwise if called later some of the non-preallocated slabs won't be
+ * madvise'd.
+ */
+int
+slab_arena_madvise_create(struct slab_arena *arena, int advice)
+{
+	int expected = MADV_NORMAL;
+
+	if (!pm_atomic_compare_exchange_strong(&arena->advice,
+					       &expected, advice)) {
+		return -EBUSY;
+	}
+
+	if (arena->arena)
+		return madvise(arena->arena, arena->prealloc, advice);
+	return 0;
+}
+#else
+int
+slab_arena_madvise_create(struct slab_arena *arena, int advice)
+{
+	(void)arena;
+	(void)advice;
+	return -EINVAL;
+}
+#endif /* TARANTOOL_SMALL_HAS_MADVISE */
diff --git a/small/slab_arena.h b/small/slab_arena.h
index 364252f..4d86545 100644
--- a/small/slab_arena.h
+++ b/small/slab_arena.h
@@ -31,8 +31,10 @@
  * SUCH DAMAGE.
  */
 #include "lf_lifo.h"
+#include "config.h"
 #include <sys/mman.h>
 #include <limits.h>
+#include <errno.h>
 
 #if defined(__cplusplus)
 extern "C" {
@@ -97,6 +99,12 @@ struct slab_arena {
 	 * mmap() flags: MAP_SHARED or MAP_PRIVATE
 	 */
 	int flags;
+#ifdef TARANTOOL_SMALL_HAS_MADVISE
+	/**
+	 * madvise() flag
+	 */
+	int advice;
+#endif
 };
 
 /** Initialize an arena.  */
@@ -120,6 +128,25 @@ slab_unmap(struct slab_arena *arena, void *ptr);
 void
 slab_arena_mprotect(struct slab_arena *arena);
 
+/** madvise() all slabs. */
+int
+slab_arena_madvise_create(struct slab_arena *arena, int advice);
+
+/** handy helpers for slab_arena_madvise_create. */
+#ifdef TARANTOOL_SMALL_HAS_MADVISE
+static inline int
+slab_arena_madvise_dontdump(struct slab_arena *arena)
+{
+	return slab_arena_madvise_create(arena, MADV_DONTDUMP);
+}
+#else
+static inline int
+slab_arena_madvise_dontdump(struct slab_arena *arena)
+{
+	return -EINVAL;
+}
+#endif
+
 /**
  * Align a size - round up to nearest divisible by the given alignment.
  * Alignment must be a power of 2
-- 
2.20.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-01 15:50 [PATCH 0/3] small: Prepare ground for madvise Cyrill Gorcunov
  2019-05-01 15:50 ` [PATCH 1/3] build: Check for madvise syscall Cyrill Gorcunov
  2019-05-01 15:50 ` [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs Cyrill Gorcunov
@ 2019-05-01 15:50 ` Cyrill Gorcunov
  2019-05-06 10:45   ` Vladimir Davydov
  2 siblings, 1 reply; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-01 15:50 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

Since madvise support depends on sys libraries
and kernel version we print error if only small
supports it and /proc/self/smaps provide more
less decent VmFlags.

Part of #3509
---
https://github.com/tarantool/tarantool/issues/3509

 test/slab_arena.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)

diff --git a/test/slab_arena.c b/test/slab_arena.c
index d0a78d1..5c5085c 100644
--- a/test/slab_arena.c
+++ b/test/slab_arena.c
@@ -3,6 +3,7 @@
 #include <stdio.h>
 #include <limits.h>
 #include <stdlib.h>
+#include <string.h>
 #include <time.h>
 #include "unit.h"
 
@@ -15,6 +16,93 @@ slab_arena_print(struct slab_arena *arena)
 	       arena->used, arena->slab_size);
 }
 
+#define is_hex_digit(c)				\
+	(((c) >= '0' && (c) <= '9')	||	\
+	 ((c) >= 'a' && (c) <= 'f')	||	\
+	 ((c) >= 'A' && (c) <= 'F'))
+
+static int is_vma_range_fmt(char *line, unsigned long *start, unsigned long *end)
+{
+	char *p = line;
+	while (*line && is_hex_digit(*line))
+		line++;
+
+	if (*line++ != '-')
+		return 0;
+
+	while (*line && is_hex_digit(*line))
+		line++;
+
+	if (*line++ != ' ')
+		return 0;
+
+	sscanf(p, "%lx-%lx", start, end);
+	return 1;
+}
+
+static void
+slab_test_madvise(void)
+{
+	struct slab_arena arena;
+	struct quota quota;
+	FILE *f = NULL;
+	void *ptr;
+
+	quota_init(&quota, 2000000);
+	slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE);
+
+	/*
+	 * Will fetch from preallocated area.
+	 */
+	ptr = slab_map(&arena);
+	if (!ptr) {
+		printf("can't obtain slab\n");
+		goto out;
+	}
+
+	/*
+	 * If system supports madvise call, we should try
+	 * fetch the precise VMA state from smaps file,
+	 * but if only it exists.
+	 */
+	if (!slab_arena_madvise_dontdump(&arena)) {
+		unsigned long start = 0, end = 0;
+		char buf[1024], *tok = NULL;
+		bool found = false;
+
+		f = fopen("/proc/self/smaps", "r");
+		if (!f)
+			goto out;
+
+		while (fgets(buf, sizeof(buf), f)) {
+			is_vma_range_fmt(buf, &start, &end);
+			if (strncmp(buf, "VmFlags: ", 9) || (void *)start != ptr)
+				continue;
+			tok = buf;
+			break;
+		}
+
+		if (tok) {
+			for (tok = strtok(tok, " \n"); tok;
+			     tok = strtok(NULL, " \n")) {
+				if (strcmp(tok, "dd"))
+					continue;
+				found = true;
+				break;
+			}
+		}
+
+		if (!found)
+			printf("ERROR: Expected dd flag on VMA address %p\n", ptr);
+
+		fclose(f);
+	}
+
+out:
+	slab_unmap(&arena, ptr);
+	slab_arena_destroy(&arena);
+}
+
 int main()
 {
 	struct quota quota;
@@ -42,4 +130,6 @@ int main()
 	slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE);
 	slab_arena_print(&arena);
 	slab_arena_destroy(&arena);
+
+	slab_test_madvise();
 }
-- 
2.20.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] build: Check for madvise syscall
  2019-05-01 15:50 ` [PATCH 1/3] build: Check for madvise syscall Cyrill Gorcunov
@ 2019-05-06 10:25   ` Vladimir Davydov
  2019-05-06 10:39     ` Cyrill Gorcunov
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Davydov @ 2019-05-06 10:25 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On Wed, May 01, 2019 at 06:50:04PM +0300, Cyrill Gorcunov wrote:
> Need to check if madvise is present in the system

I don't think there's much point in splitting this change in three
patches. Please squash. OTOH it'd be nice to see how this API is
intended to be used within Tarantool. Could you please submit the
patch utilizing this feature in the same series?

> 
> Part of #3509

There's no issue 3509 in tarantool/small repository. Please paste the
full link instead - GitHub web interface knows how to show them.

> ---
> https://github.com/tarantool/tarantool/issues/3509
> 
>  CMakeLists.txt       | 11 +++++++++++
>  small/config.h.cmake |  9 +++++++++
>  2 files changed, 20 insertions(+)
>  create mode 100644 small/config.h.cmake
> 
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index ad27423..97d25bc 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -1,6 +1,8 @@
>  project(small C CXX)
>  cmake_minimum_required(VERSION 2.8 FATAL_ERROR)
>  
> +include(CheckFunctionExists)
> +
>  if(NOT CMAKE_BUILD_TYPE)
>      set(CMAKE_BUILD_TYPE Debug)
>  endif()
> @@ -15,10 +17,19 @@ endif()
>  # Enable GNU glibc extentions.
>  add_definitions("-D_GNU_SOURCE")
>  
> +check_function_exists(madvise TARANTOOL_SMALL_HAS_MADVISE)

I think we'd better check that MADV_DONTDUMP is available with the aid
of check_symbol_exists.

> +
> +configure_file(
> +    "small/config.h.cmake"
> +    "small/config.h"
> +    )
> +message (STATUS "")
> +
>  # Valgrind
>  include_directories(third_party)
>  
>  set(lib_headers
> +    small/config.h
>      small/ibuf.h
>      small/lf_lifo.h
>      small/lifo.h
> diff --git a/small/config.h.cmake b/small/config.h.cmake
> new file mode 100644
> index 0000000..1c9db10
> --- /dev/null
> +++ b/small/config.h.cmake
> @@ -0,0 +1,9 @@
> +#ifndef TARANTOOL_SMALL_CONFIG_H_INCLUDED
> +#define TARANTOOL_SMALL_CONFIG_H_INCLUDED
> +
> +/*
> + * Defined if this platform has madvise(..).
> + */
> + #cmakedefine TARANTOOL_SMALL_HAS_MADVISE 1
> +
> +#endif /* TARANTOOL_SMALL_CONFIG_H_INCLUDED */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs
  2019-05-01 15:50 ` [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs Cyrill Gorcunov
@ 2019-05-06 10:38   ` Vladimir Davydov
  2019-05-06 11:03     ` Cyrill Gorcunov
  2019-05-06 13:46     ` Alexander Turenko
  0 siblings, 2 replies; 14+ messages in thread
From: Vladimir Davydov @ 2019-05-06 10:38 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Alexander Turenko

[Cc += Alexander re tarantool/small backward compatibility]

On Wed, May 01, 2019 at 06:50:05PM +0300, Cyrill Gorcunov wrote:
> The purpose of this helper is to be able to eliminate
> some of slab arenas from dumping, especially when huge
> amount of memory involves.
> 
> For this sake one should call slab_arena_madvise_create
> right after area's creation. Strictly speaking this helper
> may be called in any time but then some of already created
> slabs which are not inside preallocated area won't be
> madvised.
> 
> Part of #3509

Again, please fix the link.

> ---
> https://github.com/tarantool/tarantool/issues/3509
> 
>  small/slab_arena.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++
>  small/slab_arena.h | 27 +++++++++++++++++++++++++
>  2 files changed, 76 insertions(+)
> 
> diff --git a/small/slab_arena.c b/small/slab_arena.c
> index 26e2620..12f6198 100644
> --- a/small/slab_arena.c
> +++ b/small/slab_arena.c
> @@ -149,6 +149,9 @@ slab_arena_create(struct slab_arena *arena, struct quota *quota,
>  	arena->used = 0;
>  
>  	arena->flags = flags;
> +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> +	pm_atomic_store(&arena->advice, MADV_NORMAL);
> +#endif
>  
>  	if (arena->prealloc) {
>  		arena->arena = mmap_checked(arena->prealloc,
> @@ -205,6 +208,12 @@ slab_map(struct slab_arena *arena)
>  		__sync_sub_and_fetch(&arena->used, arena->slab_size);
>  		quota_release(arena->quota, arena->slab_size);
>  	}
> +
> +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> +	if (ptr && pm_atomic_load(&arena->advice) != MADV_NORMAL)
> +		madvise(ptr, arena->slab_size, arena->advice);
> +#endif
> +

Why use atomic? The arena cannot be used until it's fully initialized.

>  	VALGRIND_MAKE_MEM_UNDEFINED(ptr, arena->slab_size);
>  	return ptr;
>  }
> @@ -226,3 +235,43 @@ slab_arena_mprotect(struct slab_arena *arena)
>  	if (arena->arena)
>  		mprotect(arena->arena, arena->prealloc, PROT_READ);
>  }
> +
> +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> +/**
> + * slab_arena_madvise_create - Initialize madvise permanent strategy
> + * @arena:	slab arena to madvise
> + * @advice:	strategy (MADV_DONTDUMP, ...)
> + *
> + * This function setups @slab_arena strategy so that new memory
> + * allocation with mmap() call gets madvise() call with @advice.
> + *
> + * Note that we don't allow to change strategy once choosen because
> + * otherwise we need to walk over all possible slab cache entry.
> + *
> + * This function should be called right after the slab arena creation,
> + * otherwise if called later some of the non-preallocated slabs won't be
> + * madvise'd.
> + */
> +int
> +slab_arena_madvise_create(struct slab_arena *arena, int advice)

I don't see much point in changing madvise settings on the fly.
Let's instead allow to pass DONTDUMP flag at creation time.
This means we have to combine MAP_ and MADV_ flags. I guess it
was a bad idea to pass MAP_ flags directly to small_create() in
the first place. What about introducing small-specific flags like

  SLAB_ARENA_SHARED
  SLAB_ARENA_DONTDUMP

Come to think of it, I assume we don't even need to introduce
SLAB_ARENA_SHARED - we always pass MAP_PRIVATE to small_create().
Let's drop it, perhaps?

It will break backward compatibility, but I don't think it really
matters, as tarantool/small is only used by tarantool - it isn't
a big deal if we change its API. We only need to bump the version
number, I guess.

Alexander, please amend me if I'm wrong.

> +{
> +	int expected = MADV_NORMAL;
> +
> +	if (!pm_atomic_compare_exchange_strong(&arena->advice,
> +					       &expected, advice)) {
> +		return -EBUSY;
> +	}
> +
> +	if (arena->arena)
> +		return madvise(arena->arena, arena->prealloc, advice);
> +	return 0;
> +}
> +#else
> +int
> +slab_arena_madvise_create(struct slab_arena *arena, int advice)
> +{
> +	(void)arena;
> +	(void)advice;
> +	return -EINVAL;
> +}
> +#endif /* TARANTOOL_SMALL_HAS_MADVISE */
> diff --git a/small/slab_arena.h b/small/slab_arena.h
> index 364252f..4d86545 100644
> --- a/small/slab_arena.h
> +++ b/small/slab_arena.h
> @@ -31,8 +31,10 @@
>   * SUCH DAMAGE.
>   */
>  #include "lf_lifo.h"
> +#include "config.h"
>  #include <sys/mman.h>
>  #include <limits.h>
> +#include <errno.h>
>  
>  #if defined(__cplusplus)
>  extern "C" {
> @@ -97,6 +99,12 @@ struct slab_arena {
>  	 * mmap() flags: MAP_SHARED or MAP_PRIVATE
>  	 */
>  	int flags;
> +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> +	/**
> +	 * madvise() flag
> +	 */
> +	int advice;
> +#endif
>  };
>  
>  /** Initialize an arena.  */
> @@ -120,6 +128,25 @@ slab_unmap(struct slab_arena *arena, void *ptr);
>  void
>  slab_arena_mprotect(struct slab_arena *arena);
>  
> +/** madvise() all slabs. */
> +int
> +slab_arena_madvise_create(struct slab_arena *arena, int advice);
> +
> +/** handy helpers for slab_arena_madvise_create. */
> +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> +static inline int
> +slab_arena_madvise_dontdump(struct slab_arena *arena)
> +{
> +	return slab_arena_madvise_create(arena, MADV_DONTDUMP);
> +}
> +#else
> +static inline int
> +slab_arena_madvise_dontdump(struct slab_arena *arena)
> +{
> +	return -EINVAL;
> +}
> +#endif
> +
>  /**
>   * Align a size - round up to nearest divisible by the given alignment.
>   * Alignment must be a power of 2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] build: Check for madvise syscall
  2019-05-06 10:25   ` Vladimir Davydov
@ 2019-05-06 10:39     ` Cyrill Gorcunov
  0 siblings, 0 replies; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-06 10:39 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml

On Mon, May 06, 2019 at 01:25:14PM +0300, Vladimir Davydov wrote:
> On Wed, May 01, 2019 at 06:50:04PM +0300, Cyrill Gorcunov wrote:
> > Need to check if madvise is present in the system
> 
> I don't think there's much point in splitting this change in three
> patches. Please squash. OTOH it'd be nice to see how this API is
> intended to be used within Tarantool. Could you please submit the
> patch utilizing this feature in the same series?

OK, will do.

> > 
> > Part of #3509
> 
> There's no issue 3509 in tarantool/small repository. Please paste the
> full link instead - GitHub web interface knows how to show them.

OK

> >  
> > +check_function_exists(madvise TARANTOOL_SMALL_HAS_MADVISE)
> 
> I think we'd better check that MADV_DONTDUMP is available with the aid
> of check_symbol_exists.

OK

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-01 15:50 ` [PATCH 3/3] test: slab_arena -- Verify madvise Cyrill Gorcunov
@ 2019-05-06 10:45   ` Vladimir Davydov
  2019-05-06 10:48     ` Cyrill Gorcunov
  2019-05-06 11:04     ` Alexander Turenko
  0 siblings, 2 replies; 14+ messages in thread
From: Vladimir Davydov @ 2019-05-06 10:45 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Alexander Turenko

[Cc += Alexander re tests]

On Wed, May 01, 2019 at 06:50:06PM +0300, Cyrill Gorcunov wrote:
> Since madvise support depends on sys libraries
> and kernel version we print error if only small
> supports it and /proc/self/smaps provide more
> less decent VmFlags.

TBO I wouldn't bother testing this feature at all, because it's way too
platform dependent.

Alternatively, we could test it by crashing a process and checking the
size of the generated core file, but then again it doesn't look like a
portable way.

Alexander, any ideas?

> 
> Part of #3509
> ---
> https://github.com/tarantool/tarantool/issues/3509
> 
>  test/slab_arena.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 90 insertions(+)
> 
> diff --git a/test/slab_arena.c b/test/slab_arena.c
> index d0a78d1..5c5085c 100644
> --- a/test/slab_arena.c
> +++ b/test/slab_arena.c
> @@ -3,6 +3,7 @@
>  #include <stdio.h>
>  #include <limits.h>
>  #include <stdlib.h>
> +#include <string.h>
>  #include <time.h>
>  #include "unit.h"
>  
> @@ -15,6 +16,93 @@ slab_arena_print(struct slab_arena *arena)
>  	       arena->used, arena->slab_size);
>  }
>  
> +#define is_hex_digit(c)				\
> +	(((c) >= '0' && (c) <= '9')	||	\
> +	 ((c) >= 'a' && (c) <= 'f')	||	\
> +	 ((c) >= 'A' && (c) <= 'F'))
> +
> +static int is_vma_range_fmt(char *line, unsigned long *start, unsigned long *end)
> +{
> +	char *p = line;
> +	while (*line && is_hex_digit(*line))
> +		line++;
> +
> +	if (*line++ != '-')
> +		return 0;
> +
> +	while (*line && is_hex_digit(*line))
> +		line++;
> +
> +	if (*line++ != ' ')
> +		return 0;
> +
> +	sscanf(p, "%lx-%lx", start, end);
> +	return 1;
> +}
> +
> +static void
> +slab_test_madvise(void)
> +{
> +	struct slab_arena arena;
> +	struct quota quota;
> +	FILE *f = NULL;
> +	void *ptr;
> +
> +	quota_init(&quota, 2000000);
> +	slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE);
> +
> +	/*
> +	 * Will fetch from preallocated area.
> +	 */
> +	ptr = slab_map(&arena);
> +	if (!ptr) {
> +		printf("can't obtain slab\n");
> +		goto out;
> +	}
> +
> +	/*
> +	 * If system supports madvise call, we should try
> +	 * fetch the precise VMA state from smaps file,
> +	 * but if only it exists.
> +	 */
> +	if (!slab_arena_madvise_dontdump(&arena)) {
> +		unsigned long start = 0, end = 0;
> +		char buf[1024], *tok = NULL;
> +		bool found = false;
> +
> +		f = fopen("/proc/self/smaps", "r");
> +		if (!f)
> +			goto out;
> +
> +		while (fgets(buf, sizeof(buf), f)) {
> +			is_vma_range_fmt(buf, &start, &end);
> +			if (strncmp(buf, "VmFlags: ", 9) || (void *)start != ptr)
> +				continue;
> +			tok = buf;
> +			break;
> +		}
> +
> +		if (tok) {
> +			for (tok = strtok(tok, " \n"); tok;
> +			     tok = strtok(NULL, " \n")) {
> +				if (strcmp(tok, "dd"))
> +					continue;
> +				found = true;
> +				break;
> +			}
> +		}
> +
> +		if (!found)
> +			printf("ERROR: Expected dd flag on VMA address %p\n", ptr);
> +
> +		fclose(f);
> +	}
> +
> +out:
> +	slab_unmap(&arena, ptr);
> +	slab_arena_destroy(&arena);
> +}
> +
>  int main()
>  {
>  	struct quota quota;
> @@ -42,4 +130,6 @@ int main()
>  	slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE);
>  	slab_arena_print(&arena);
>  	slab_arena_destroy(&arena);
> +
> +	slab_test_madvise();
>  }

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-06 10:45   ` Vladimir Davydov
@ 2019-05-06 10:48     ` Cyrill Gorcunov
  2019-05-06 10:55       ` Vladimir Davydov
  2019-05-06 11:04     ` Alexander Turenko
  1 sibling, 1 reply; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-06 10:48 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml, Alexander Turenko

On Mon, May 06, 2019 at 01:45:20PM +0300, Vladimir Davydov wrote:
> [Cc += Alexander re tests]
> 
> On Wed, May 01, 2019 at 06:50:06PM +0300, Cyrill Gorcunov wrote:
> > Since madvise support depends on sys libraries
> > and kernel version we print error if only small
> > supports it and /proc/self/smaps provide more
> > less decent VmFlags.
> 
> TBO I wouldn't bother testing this feature at all, because it's way too
> platform dependent.

Test does check for /proc/self/smaps file present and for VmFlags entry
present, if neither of it found we just exit silently, without errors.

> 
> Alternatively, we could test it by crashing a process and checking the
> size of the generated core file, but then again it doesn't look like a
> portable way.

Yes, not portable.

> 
> Alexander, any ideas?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-06 10:48     ` Cyrill Gorcunov
@ 2019-05-06 10:55       ` Vladimir Davydov
  2019-05-06 10:57         ` Cyrill Gorcunov
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Davydov @ 2019-05-06 10:55 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml, Alexander Turenko

On Mon, May 06, 2019 at 01:48:04PM +0300, Cyrill Gorcunov wrote:
> On Mon, May 06, 2019 at 01:45:20PM +0300, Vladimir Davydov wrote:
> > [Cc += Alexander re tests]
> > 
> > On Wed, May 01, 2019 at 06:50:06PM +0300, Cyrill Gorcunov wrote:
> > > Since madvise support depends on sys libraries
> > > and kernel version we print error if only small
> > > supports it and /proc/self/smaps provide more
> > > less decent VmFlags.
> > 
> > TBO I wouldn't bother testing this feature at all, because it's way too
> > platform dependent.
> 
> Test does check for /proc/self/smaps file present and for VmFlags entry
> present, if neither of it found we just exit silently, without errors.

Okay, I'm fine with it.

To be prudent though, we'd have to check that the flag is set not only
for the preallocated memory region, but also for slabs allocated
afterwards (see slab_map -> mmap_checked). Not sure if it's worth it
though.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-06 10:55       ` Vladimir Davydov
@ 2019-05-06 10:57         ` Cyrill Gorcunov
  0 siblings, 0 replies; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-06 10:57 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml, Alexander Turenko

On Mon, May 06, 2019 at 01:55:52PM +0300, Vladimir Davydov wrote:
> 
> Okay, I'm fine with it.
> 
> To be prudent though, we'd have to check that the flag is set not only
> for the preallocated memory region, but also for slabs allocated
> afterwards (see slab_map -> mmap_checked). Not sure if it's worth it
> though.

Yes, and I already extended the test, simply didn't sent it out yet,
because I expected we gonna rework the api :)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs
  2019-05-06 10:38   ` Vladimir Davydov
@ 2019-05-06 11:03     ` Cyrill Gorcunov
  2019-05-06 13:46     ` Alexander Turenko
  1 sibling, 0 replies; 14+ messages in thread
From: Cyrill Gorcunov @ 2019-05-06 11:03 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml, Alexander Turenko

On Mon, May 06, 2019 at 01:38:36PM +0300, Vladimir Davydov wrote:
> >  
> >  	if (arena->prealloc) {
> >  		arena->arena = mmap_checked(arena->prealloc,
> > @@ -205,6 +208,12 @@ slab_map(struct slab_arena *arena)
> >  		__sync_sub_and_fetch(&arena->used, arena->slab_size);
> >  		quota_release(arena->quota, arena->slab_size);
> >  	}
> > +
> > +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> > +	if (ptr && pm_atomic_load(&arena->advice) != MADV_NORMAL)
> > +		madvise(ptr, arena->slab_size, arena->advice);
> > +#endif
> > +
> 
> Why use atomic? The arena cannot be used until it's fully initialized.

Because otherwise we have a race: with current api madvise may
be set up in any moment, say one thread setting up madvise and
second thread calls for slab_map. Not a big deal probably but
to pair with atomic write into @advice variable.
...
> > +int
> > +slab_arena_madvise_create(struct slab_arena *arena, int advice)
> 
> I don't see much point in changing madvise settings on the fly.

All this to not break backward compatibility. If we're allowed to
change it, this won't be a problem to adjust the code.

> Let's instead allow to pass DONTDUMP flag at creation time.
> This means we have to combine MAP_ and MADV_ flags. I guess it
> was a bad idea to pass MAP_ flags directly to small_create() in
> the first place. What about introducing small-specific flags like
> 
>   SLAB_ARENA_SHARED
>   SLAB_ARENA_DONTDUMP
> 
> Come to think of it, I assume we don't even need to introduce
> SLAB_ARENA_SHARED - we always pass MAP_PRIVATE to small_create().
> Let's drop it, perhaps?
> 
> It will break backward compatibility, but I don't think it really
> matters, as tarantool/small is only used by tarantool - it isn't
> a big deal if we change its API. We only need to bump the version
> number, I guess.
> 
> Alexander, please amend me if I'm wrong.

OK, I'll rework in a separate branch, so we will jump back if needed.

	Cyrill

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] test: slab_arena -- Verify madvise
  2019-05-06 10:45   ` Vladimir Davydov
  2019-05-06 10:48     ` Cyrill Gorcunov
@ 2019-05-06 11:04     ` Alexander Turenko
  1 sibling, 0 replies; 14+ messages in thread
From: Alexander Turenko @ 2019-05-06 11:04 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: Cyrill Gorcunov, tml

On Mon, May 06, 2019 at 01:45:20PM +0300, Vladimir Davydov wrote:
> [Cc += Alexander re tests]
> 
> On Wed, May 01, 2019 at 06:50:06PM +0300, Cyrill Gorcunov wrote:
> > Since madvise support depends on sys libraries
> > and kernel version we print error if only small
> > supports it and /proc/self/smaps provide more
> > less decent VmFlags.
> 
> TBO I wouldn't bother testing this feature at all, because it's way too
> platform dependent.
> 
> Alternatively, we could test it by crashing a process and checking the
> size of the generated core file, but then again it doesn't look like a
> portable way.
> 
> Alexander, any ideas?

I think it is okay to test a platform-specific feature only on a
platform where it is supported, but skip the test otherwise.

We only need to ensure our check whether the feature is supported is
reliable. Travis-CI for small checks all (or almost all) supported
distros, but lack of Mac OS target. It worth to check it on Mac OS (and
maybe on FreeBSD) via CI or at least manually.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs
  2019-05-06 10:38   ` Vladimir Davydov
  2019-05-06 11:03     ` Cyrill Gorcunov
@ 2019-05-06 13:46     ` Alexander Turenko
  1 sibling, 0 replies; 14+ messages in thread
From: Alexander Turenko @ 2019-05-06 13:46 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: Cyrill Gorcunov, tml

> > +#ifdef TARANTOOL_SMALL_HAS_MADVISE
> > +/**
> > + * slab_arena_madvise_create - Initialize madvise permanent strategy
> > + * @arena:	slab arena to madvise
> > + * @advice:	strategy (MADV_DONTDUMP, ...)
> > + *
> > + * This function setups @slab_arena strategy so that new memory
> > + * allocation with mmap() call gets madvise() call with @advice.
> > + *
> > + * Note that we don't allow to change strategy once choosen because
> > + * otherwise we need to walk over all possible slab cache entry.
> > + *
> > + * This function should be called right after the slab arena creation,
> > + * otherwise if called later some of the non-preallocated slabs won't be
> > + * madvise'd.
> > + */
> > +int
> > +slab_arena_madvise_create(struct slab_arena *arena, int advice)
> 
> I don't see much point in changing madvise settings on the fly.
> Let's instead allow to pass DONTDUMP flag at creation time.
> This means we have to combine MAP_ and MADV_ flags. I guess it
> was a bad idea to pass MAP_ flags directly to small_create() in
> the first place. What about introducing small-specific flags like
> 
>   SLAB_ARENA_SHARED
>   SLAB_ARENA_DONTDUMP
> 
> Come to think of it, I assume we don't even need to introduce
> SLAB_ARENA_SHARED - we always pass MAP_PRIVATE to small_create().
> Let's drop it, perhaps?
> 
> It will break backward compatibility, but I don't think it really
> matters, as tarantool/small is only used by tarantool - it isn't
> a big deal if we change its API. We only need to bump the version
> number, I guess.
> 
> Alexander, please amend me if I'm wrong.

I don't know other users of the library, so incompatiable changes looks
okay.

I would release a new version (an annotated tag + a github release) and
mention incompatible changes to let potential users know about this.
Maybe this is too formal, don't know :)

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-05-06 13:46 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-01 15:50 [PATCH 0/3] small: Prepare ground for madvise Cyrill Gorcunov
2019-05-01 15:50 ` [PATCH 1/3] build: Check for madvise syscall Cyrill Gorcunov
2019-05-06 10:25   ` Vladimir Davydov
2019-05-06 10:39     ` Cyrill Gorcunov
2019-05-01 15:50 ` [PATCH 2/3] slab_arena: Provide slab_arena_madvise_create to madvice slabs Cyrill Gorcunov
2019-05-06 10:38   ` Vladimir Davydov
2019-05-06 11:03     ` Cyrill Gorcunov
2019-05-06 13:46     ` Alexander Turenko
2019-05-01 15:50 ` [PATCH 3/3] test: slab_arena -- Verify madvise Cyrill Gorcunov
2019-05-06 10:45   ` Vladimir Davydov
2019-05-06 10:48     ` Cyrill Gorcunov
2019-05-06 10:55       ` Vladimir Davydov
2019-05-06 10:57         ` Cyrill Gorcunov
2019-05-06 11:04     ` Alexander Turenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox