Tarantool development patches archive
 help / color / mirror / Atom feed
* [PATCH v5 0/2] lib/core/fiber: Increase default stack size
@ 2019-03-12 22:47 Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 1/3] " Cyrill Gorcunov
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-12 22:47 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

v4:
  We split the stack management task into two: first one to
  increase stack size asap, to not block users and testers,
  and second one to rework stack management to relax rss
  pressue.

  This patchset covers the first task only. The second part
  is still under rfc and hopefully won't take too long.

v5:
  Use random offsets for each fiber instead of global one.
  Shrink stack to 64K on recycle.

Volodya, take a look please, once time permit.

Cyrill Gorcunov (3):
  lib/core/fiber: Increase default stack size
  lib/core/fiber: Mark stack as unneeded on creation
  lib/core/fiber: Put watermarks into stack to track its usage

 src/lib/core/fiber.c | 179 ++++++++++++++++++++++++++++++++++++++++++-
 src/lib/core/fiber.h |  15 ++++
 2 files changed, 192 insertions(+), 2 deletions(-)

-- 
2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] lib/core/fiber: Increase default stack size
  2019-03-12 22:47 [PATCH v5 0/2] lib/core/fiber: Increase default stack size Cyrill Gorcunov
@ 2019-03-12 22:47 ` Cyrill Gorcunov
  2019-03-13 13:42   ` Vladimir Davydov
  2019-03-12 22:47 ` [PATCH 2/3] lib/core/fiber: Mark stack as unneeded on creation Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage Cyrill Gorcunov
  2 siblings, 1 reply; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-12 22:47 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

The default 64K stack size used for years become too small
for modern distors (Fedora 29 and etc) where third party libraries
(such as ncurses) started to use 64K for own buffers and we get
SIGSGV early without reaching interactive console phase.

Thus we increase default size up to 512K which should fit
for common case. Later we will make this value configurable
to address arbitrary stack sizes without a need to rebuild
the whole code.

Note the values are switched to 4K page granularity for sake
of future modifications -- we gonna manipulate pages to
relax rss usage if OS allows.

Closes #3418
---
 src/lib/core/fiber.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index abd6c6b11..f16ac873f 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -93,9 +93,9 @@ static int stack_direction;
 
 enum {
 	/* The minimum allowable fiber stack size in bytes */
-	FIBER_STACK_SIZE_MINIMAL = 16384,
+	FIBER_STACK_SIZE_MINIMAL = 4 << 12,
 	/* Default fiber stack size in bytes */
-	FIBER_STACK_SIZE_DEFAULT = 65536
+	FIBER_STACK_SIZE_DEFAULT = 128 << 12
 };
 
 /** Default fiber attributes */
-- 
2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/3] lib/core/fiber: Mark stack as unneeded on creation
  2019-03-12 22:47 [PATCH v5 0/2] lib/core/fiber: Increase default stack size Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 1/3] " Cyrill Gorcunov
@ 2019-03-12 22:47 ` Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage Cyrill Gorcunov
  2 siblings, 0 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-12 22:47 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

Since we've increased the default stack size we hope
the whole 512K won't be used for regular loads thus
we mark the stack area as unneeded to minimize rss
pressure.

Note we do this on fiber creation at the moment, more
detailed stack usage analisys will be implemented in
another patch series later.

Closes #3418
---
 src/lib/core/fiber.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index f16ac873f..a1b261ad4 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -749,6 +749,14 @@ fiber_stack_create(struct fiber *fiber, size_t stack_size)
 	fiber->stack_id = VALGRIND_STACK_REGISTER(fiber->stack,
 						  (char *)fiber->stack +
 						  fiber->stack_size);
+#ifndef TARGET_OS_DARWIN
+	/*
+	 * We don't expect the whole stack usage in regular
+	 * loads, lets try to minimize rss pressure if OS
+	 * allows us.
+	 */
+	madvise(fiber->stack, fiber->stack_size, MADV_DONTNEED);
+#endif
 
 	mprotect(guard, page_size, PROT_NONE);
 	return 0;
-- 
2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage
  2019-03-12 22:47 [PATCH v5 0/2] lib/core/fiber: Increase default stack size Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 1/3] " Cyrill Gorcunov
  2019-03-12 22:47 ` [PATCH 2/3] lib/core/fiber: Mark stack as unneeded on creation Cyrill Gorcunov
@ 2019-03-12 22:47 ` Cyrill Gorcunov
  2019-03-13 13:52   ` Vladimir Davydov
  2 siblings, 1 reply; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-12 22:47 UTC (permalink / raw)
  To: tml; +Cc: Vladimir Davydov, Cyrill Gorcunov

We want to detect a situation where task in fiber is too eager for
stack and close to its exhausting. For this sake upon stack creation
we put 8 marks on last stack page with step of 128 bytes. Such params
allows us to fill ~1/4 of a page, which does seem reasonable but
we might change this params with time.

Since the watermark position is permanent and some task is close
to stack limit we print about the situation once to not spam
a user much and stop putting the mark on recycling.

Still this techique doesn't guarantee to handle all possible
situations so to increas probability of catching eager fibers
we put marks *not* at page start address but withing some offset,
randomly generated.

To minimize a pressue on memory system we try to relax stack
usage with that named "shink" watermark. Basically it is the
same mark as for overflow detection but placed at 64K bound
and on every recycle we try to shrink stack usage dropping
pages if the watermark has been erased by a task.

Closes #3418

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 src/lib/core/fiber.c | 167 +++++++++++++++++++++++++++++++++++++++++++
 src/lib/core/fiber.h |  15 ++++
 2 files changed, 182 insertions(+)

diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
index a1b261ad4..cae291d1a 100644
--- a/src/lib/core/fiber.c
+++ b/src/lib/core/fiber.c
@@ -104,6 +104,27 @@ static const struct fiber_attr fiber_attr_default = {
        .flags = FIBER_DEFAULT_FLAGS
 };
 
+/*
+ * Random values generated with uuid.
+ */
+static const uint64_t poison_pool[] = {
+	0x74f31d37285c4c37, 0xb10269a05bf10c29,
+	0x0994d845bd284e0f, 0x9ffd4f7129c184df,
+	0x357151e6711c4415, 0x8c5e5f41aafe6f28,
+	0x6917dd79e78049d5, 0xba61957c65ca2465,
+};
+
+/*
+ * We poison by 8 bytes as it natural for stack
+ * step on x86-64. Also 128 byte gap between
+ * poison values should cover a common cases.
+ */
+#define POISON_SIZE	(sizeof(poison_pool) / sizeof(poison_pool[0]))
+#define POISON_GAP	(128 + sizeof(poison_pool[0]))
+#define POISON_OFF	(POISON_GAP / sizeof(poison_pool[0]))
+
+static void fiber_wmark_recycle(struct fiber *fiber);
+
 void
 fiber_attr_create(struct fiber_attr *fiber_attr)
 {
@@ -624,6 +645,7 @@ fiber_recycle(struct fiber *fiber)
 	/* no pending wakeup */
 	assert(rlist_empty(&fiber->state));
 	bool has_custom_stack = fiber->flags & FIBER_CUSTOM_STACK;
+	fiber_wmark_recycle(fiber);
 	fiber_reset(fiber);
 	fiber->name[0] = '\0';
 	fiber->f = NULL;
@@ -710,6 +732,146 @@ page_align_up(void *ptr)
 	return page_align_down(ptr + page_size - 1);
 }
 
+static bool
+stack_has_wmark(void *addr)
+{
+	const uint64_t *src = poison_pool;
+	const uint64_t *dst = addr;
+	size_t i;
+
+	for (i = 0; i < POISON_SIZE; i++) {
+		if (*dst != src[i])
+			return false;
+		dst += POISON_OFF;
+	}
+
+	return true;
+}
+
+static void
+stack_put_wmark(void *addr)
+{
+	const uint64_t *src = poison_pool;
+	uint64_t *dst = addr;
+	size_t i;
+
+	for (i = 0; i < POISON_SIZE; i++) {
+		*dst = src[i];
+		dst += POISON_OFF;
+	}
+}
+
+#ifndef TARGET_OS_DARWIN
+static void
+stack_shrink(struct fiber *fiber)
+{
+	void *start, *end;
+
+	/*
+	 * When dropping pages make sure the page
+	 * containing overflow mark is untouched.
+	 * Same time no need to unmap the page which
+	 * carries "shrink" wmark, since we're updating
+	 * this page anyway.
+	 */
+	if (stack_direction < 0) {
+		end = page_align_down(fiber->stack_shrink_wmark);
+		start = page_align_up(fiber->stack_overflow_wmark);
+		madvise(start, end - start, MADV_DONTNEED);
+	} else {
+		start = page_align_up(fiber->stack_shrink_wmark);
+		end = page_align_down(fiber->stack_overflow_wmark);
+		madvise(start, end - start, MADV_DONTNEED);
+	}
+	stack_put_wmark(fiber->stack_shrink_wmark);
+}
+#endif
+
+static void
+fiber_wmark_recycle(struct fiber *fiber)
+{
+	static bool overflow_warned = false;
+
+	if (fiber->stack == NULL || fiber->flags & FIBER_CUSTOM_STACK)
+		return;
+
+#ifndef TARGET_OS_DARWIN
+	/*
+	 * On recycle we're trying to shrink stack
+	 * to release memory pressure but if only
+	 * a fiber has been using too much memory.
+	 */
+	if (!stack_has_wmark(fiber->stack_shrink_wmark))
+		stack_shrink(fiber);
+#endif
+
+	/*
+	 * We are watching for stack overflow in one shot way:
+	 * simply to not spam a user with messages, if someone
+	 * triggered the problem it is highly likely that
+	 * an another fiber hit the same soon.
+	 */
+	if (overflow_warned)
+		return;
+
+	if (!stack_has_wmark(fiber->stack_overflow_wmark)) {
+		say_warn("stack usage is close to the limit of %zu bytes",
+			 (size_t)FIBER_STACK_SIZE_DEFAULT);
+		overflow_warned = true;
+	}
+}
+
+static void
+fiber_wmark_init(struct fiber *fiber)
+{
+	/*
+	 * No tracking on custom stacks
+	 * in a sake of simplicity.
+	 */
+	if (fiber->flags & FIBER_CUSTOM_STACK) {
+		fiber->stack_overflow_wmark = NULL;
+		fiber->stack_shrink_wmark = NULL;
+		fiber->wmark_inpage_offset = 0;
+		return;
+	}
+
+	/*
+	 * To increase probability of the stack overflow
+	 * detection we put first mark at random position
+	 * of the first 128 bytes range. The rest of the marks
+	 * are put with constant step simply to not carry
+	 * offsets in memory.
+	 */
+	fiber->wmark_inpage_offset = ((rand() % 128) + 8) & ~7;
+
+	/*
+	 * Initially we arm the last page of the stack
+	 * to catch if we're getting close to its exhausting.
+	 *
+	 * The shrink watermark is put at 64K limit which is
+	 * known value to not cause much memory pressue even
+	 * with large number of fibers.
+	 */
+	if (stack_direction < 0) {
+		fiber->stack_overflow_wmark  = fiber->stack;
+		fiber->stack_overflow_wmark += fiber->wmark_inpage_offset;
+
+		fiber->stack_shrink_wmark  = fiber->stack + fiber->stack_size;
+		fiber->stack_shrink_wmark -= 16 << 12;
+		fiber->stack_shrink_wmark += fiber->wmark_inpage_offset;
+	} else {
+		fiber->stack_overflow_wmark  = fiber->stack + fiber->stack_size;
+		fiber->stack_overflow_wmark -= page_size;
+		fiber->stack_overflow_wmark += fiber->wmark_inpage_offset;
+
+		fiber->stack_shrink_wmark  = fiber->stack;
+		fiber->stack_shrink_wmark += (16 << 12) - page_size;
+		fiber->stack_shrink_wmark += fiber->wmark_inpage_offset;
+	}
+	stack_put_wmark(fiber->stack_overflow_wmark);
+	stack_put_wmark(fiber->stack_shrink_wmark);
+}
+
 static int
 fiber_stack_create(struct fiber *fiber, size_t stack_size)
 {
@@ -758,6 +920,7 @@ fiber_stack_create(struct fiber *fiber, size_t stack_size)
 	madvise(fiber->stack, fiber->stack_size, MADV_DONTNEED);
 #endif
 
+	fiber_wmark_init(fiber);
 	mprotect(guard, page_size, PROT_NONE);
 	return 0;
 }
@@ -927,8 +1090,12 @@ cord_create(struct cord *cord, const char *name)
 	/* Record stack extents */
 	tt_pthread_attr_getstack(cord->id, &cord->sched.stack,
 				 &cord->sched.stack_size);
+	cord->sched.stack_overflow_wmark = cord->sched.stack;
+	cord->sched.stack_shrink_wmark = cord->sched.stack;
 #else
 	cord->sched.stack = NULL;
+	cord->sched.stack_overflow_wmark = NULL;
+	cord->sched.stack_shrink_wmark = NULL;
 	cord->sched.stack_size = 0;
 #endif
 }
diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
index f1f5a0555..431e3da09 100644
--- a/src/lib/core/fiber.h
+++ b/src/lib/core/fiber.h
@@ -348,6 +348,21 @@ struct fiber {
 	struct slab *stack_slab;
 	/** Coro stack addr. */
 	void *stack;
+	/**
+	 * Stack watermark addr to detect
+	 * if we need shrink stack on reuse.
+	 */
+	void *stack_shrink_wmark;
+	/**
+	 * Stack watermark addr for overflow detection. To warn
+	 * a user about stack eager fibers.
+	 */
+	void *stack_overflow_wmark;
+	/**
+	 * An offset to watermark position in stack
+	 * since page bound address.
+	 */
+	unsigned int wmark_inpage_offset;
 	/** Coro stack size. */
 	size_t stack_size;
 	/** Valgrind stack id. */
-- 
2.20.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] lib/core/fiber: Increase default stack size
  2019-03-12 22:47 ` [PATCH 1/3] " Cyrill Gorcunov
@ 2019-03-13 13:42   ` Vladimir Davydov
  2019-03-13 13:53     ` Cyrill Gorcunov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-13 13:42 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On Wed, Mar 13, 2019 at 01:47:19AM +0300, Cyrill Gorcunov wrote:
> The default 64K stack size used for years become too small
> for modern distors (Fedora 29 and etc) where third party libraries
> (such as ncurses) started to use 64K for own buffers and we get
> SIGSGV early without reaching interactive console phase.
> 
> Thus we increase default size up to 512K which should fit

May be, better use 1 MB, just to be sure?

> for common case. Later we will make this value configurable
> to address arbitrary stack sizes without a need to rebuild
> the whole code.
> 
> Note the values are switched to 4K page granularity for sake
> of future modifications -- we gonna manipulate pages to
> relax rss usage if OS allows.
> 
> Closes #3418

Should be "Part of #3418". We use "Closes" only with the last patch in
the series (it makes GitHub close the issue).

> ---
>  src/lib/core/fiber.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index abd6c6b11..f16ac873f 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -93,9 +93,9 @@ static int stack_direction;
>  
>  enum {
>  	/* The minimum allowable fiber stack size in bytes */
> -	FIBER_STACK_SIZE_MINIMAL = 16384,
> +	FIBER_STACK_SIZE_MINIMAL = 4 << 12,
>  	/* Default fiber stack size in bytes */
> -	FIBER_STACK_SIZE_DEFAULT = 65536
> +	FIBER_STACK_SIZE_DEFAULT = 128 << 12

I don't like using '<< 12' for stack sizes. Why 12? Because the page
size is typically 4K. I wouldn't count on that - it's userspace so it
looks confusing.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage
  2019-03-12 22:47 ` [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage Cyrill Gorcunov
@ 2019-03-13 13:52   ` Vladimir Davydov
  2019-03-13 14:00     ` Cyrill Gorcunov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-13 13:52 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On Wed, Mar 13, 2019 at 01:47:21AM +0300, Cyrill Gorcunov wrote:
> We want to detect a situation where task in fiber is too eager for
> stack and close to its exhausting. For this sake upon stack creation
> we put 8 marks on last stack page with step of 128 bytes. Such params
> allows us to fill ~1/4 of a page, which does seem reasonable but
> we might change this params with time.
> 
> Since the watermark position is permanent and some task is close
> to stack limit we print about the situation once to not spam
> a user much and stop putting the mark on recycling.
> 
> Still this techique doesn't guarantee to handle all possible
> situations so to increas probability of catching eager fibers
> we put marks *not* at page start address but withing some offset,
> randomly generated.
> 
> To minimize a pressue on memory system we try to relax stack
> usage with that named "shink" watermark. Basically it is the
> same mark as for overflow detection but placed at 64K bound
> and on every recycle we try to shrink stack usage dropping
> pages if the watermark has been erased by a task.
> 
> Closes #3418
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
>  src/lib/core/fiber.c | 167 +++++++++++++++++++++++++++++++++++++++++++
>  src/lib/core/fiber.h |  15 ++++
>  2 files changed, 182 insertions(+)
> 
> diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> index a1b261ad4..cae291d1a 100644
> --- a/src/lib/core/fiber.c
> +++ b/src/lib/core/fiber.c
> @@ -104,6 +104,27 @@ static const struct fiber_attr fiber_attr_default = {
>         .flags = FIBER_DEFAULT_FLAGS
>  };
>  
> +/*
> + * Random values generated with uuid.
> + */
> +static const uint64_t poison_pool[] = {
> +	0x74f31d37285c4c37, 0xb10269a05bf10c29,
> +	0x0994d845bd284e0f, 0x9ffd4f7129c184df,
> +	0x357151e6711c4415, 0x8c5e5f41aafe6f28,
> +	0x6917dd79e78049d5, 0xba61957c65ca2465,
> +};
> +
> +/*
> + * We poison by 8 bytes as it natural for stack
> + * step on x86-64. Also 128 byte gap between
> + * poison values should cover a common cases.
> + */
> +#define POISON_SIZE	(sizeof(poison_pool) / sizeof(poison_pool[0]))
> +#define POISON_GAP	(128 + sizeof(poison_pool[0]))
> +#define POISON_OFF	(POISON_GAP / sizeof(poison_pool[0]))
> +
> +static void fiber_wmark_recycle(struct fiber *fiber);
> +
>  void
>  fiber_attr_create(struct fiber_attr *fiber_attr)
>  {
> @@ -624,6 +645,7 @@ fiber_recycle(struct fiber *fiber)
>  	/* no pending wakeup */
>  	assert(rlist_empty(&fiber->state));
>  	bool has_custom_stack = fiber->flags & FIBER_CUSTOM_STACK;
> +	fiber_wmark_recycle(fiber);
>  	fiber_reset(fiber);
>  	fiber->name[0] = '\0';
>  	fiber->f = NULL;
> @@ -710,6 +732,146 @@ page_align_up(void *ptr)
>  	return page_align_down(ptr + page_size - 1);
>  }
>  
> +static bool
> +stack_has_wmark(void *addr)

Please add brief comments to all the functions you add.

> +{
> +	const uint64_t *src = poison_pool;
> +	const uint64_t *dst = addr;
> +	size_t i;
> +
> +	for (i = 0; i < POISON_SIZE; i++) {
> +		if (*dst != src[i])
> +			return false;
> +		dst += POISON_OFF;
> +	}
> +
> +	return true;
> +}
> +
> +static void
> +stack_put_wmark(void *addr)
> +{
> +	const uint64_t *src = poison_pool;
> +	uint64_t *dst = addr;
> +	size_t i;
> +
> +	for (i = 0; i < POISON_SIZE; i++) {
> +		*dst = src[i];
> +		dst += POISON_OFF;
> +	}
> +}
> +
> +#ifndef TARGET_OS_DARWIN
> +static void
> +stack_shrink(struct fiber *fiber)
> +{
> +	void *start, *end;
> +
> +	/*
> +	 * When dropping pages make sure the page
> +	 * containing overflow mark is untouched.
> +	 * Same time no need to unmap the page which
> +	 * carries "shrink" wmark, since we're updating
> +	 * this page anyway.
> +	 */
> +	if (stack_direction < 0) {
> +		end = page_align_down(fiber->stack_shrink_wmark);
> +		start = page_align_up(fiber->stack_overflow_wmark);
> +		madvise(start, end - start, MADV_DONTNEED);
> +	} else {
> +		start = page_align_up(fiber->stack_shrink_wmark);
> +		end = page_align_down(fiber->stack_overflow_wmark);
> +		madvise(start, end - start, MADV_DONTNEED);
> +	}

Nit: you could move madvise() out of the if-else block.

> +	stack_put_wmark(fiber->stack_shrink_wmark);
> +}
> +#endif
> +
> +static void
> +fiber_wmark_recycle(struct fiber *fiber)
> +{
> +	static bool overflow_warned = false;
> +
> +	if (fiber->stack == NULL || fiber->flags & FIBER_CUSTOM_STACK)
> +		return;
> +
> +#ifndef TARGET_OS_DARWIN
> +	/*
> +	 * On recycle we're trying to shrink stack
> +	 * to release memory pressure but if only
> +	 * a fiber has been using too much memory.
> +	 */
> +	if (!stack_has_wmark(fiber->stack_shrink_wmark))
> +		stack_shrink(fiber);
> +#endif
> +
> +	/*
> +	 * We are watching for stack overflow in one shot way:
> +	 * simply to not spam a user with messages, if someone
> +	 * triggered the problem it is highly likely that
> +	 * an another fiber hit the same soon.
> +	 */
> +	if (overflow_warned)
> +		return;
> +
> +	if (!stack_has_wmark(fiber->stack_overflow_wmark)) {
> +		say_warn("stack usage is close to the limit of %zu bytes",
> +			 (size_t)FIBER_STACK_SIZE_DEFAULT);

IMO warning only when we are close to the limit isn't very useful.
Let's keep track of the max stack size instead and print it whenever
it's increased.

> +		overflow_warned = true;
> +	}
> +}
> +
> +static void
> +fiber_wmark_init(struct fiber *fiber)
> +{
> +	/*
> +	 * No tracking on custom stacks
> +	 * in a sake of simplicity.
> +	 */
> +	if (fiber->flags & FIBER_CUSTOM_STACK) {
> +		fiber->stack_overflow_wmark = NULL;
> +		fiber->stack_shrink_wmark = NULL;
> +		fiber->wmark_inpage_offset = 0;
> +		return;
> +	}
> +
> +	/*
> +	 * To increase probability of the stack overflow
> +	 * detection we put first mark at random position
> +	 * of the first 128 bytes range. The rest of the marks
> +	 * are put with constant step simply to not carry
> +	 * offsets in memory.
> +	 */
> +	fiber->wmark_inpage_offset = ((rand() % 128) + 8) & ~7;

Please define a constant for this.

> +
> +	/*
> +	 * Initially we arm the last page of the stack
> +	 * to catch if we're getting close to its exhausting.
> +	 *
> +	 * The shrink watermark is put at 64K limit which is
> +	 * known value to not cause much memory pressue even
> +	 * with large number of fibers.
> +	 */
> +	if (stack_direction < 0) {
> +		fiber->stack_overflow_wmark  = fiber->stack;
> +		fiber->stack_overflow_wmark += fiber->wmark_inpage_offset;
> +
> +		fiber->stack_shrink_wmark  = fiber->stack + fiber->stack_size;
> +		fiber->stack_shrink_wmark -= 16 << 12;

Ditto.

> +		fiber->stack_shrink_wmark += fiber->wmark_inpage_offset;
> +	} else {
> +		fiber->stack_overflow_wmark  = fiber->stack + fiber->stack_size;
> +		fiber->stack_overflow_wmark -= page_size;
> +		fiber->stack_overflow_wmark += fiber->wmark_inpage_offset;
> +
> +		fiber->stack_shrink_wmark  = fiber->stack;
> +		fiber->stack_shrink_wmark += (16 << 12) - page_size;
> +		fiber->stack_shrink_wmark += fiber->wmark_inpage_offset;
> +	}
> +	stack_put_wmark(fiber->stack_overflow_wmark);
> +	stack_put_wmark(fiber->stack_shrink_wmark);
> +}
> +
>  static int
>  fiber_stack_create(struct fiber *fiber, size_t stack_size)
>  {
> @@ -758,6 +920,7 @@ fiber_stack_create(struct fiber *fiber, size_t stack_size)
>  	madvise(fiber->stack, fiber->stack_size, MADV_DONTNEED);
>  #endif
>  
> +	fiber_wmark_init(fiber);
>  	mprotect(guard, page_size, PROT_NONE);
>  	return 0;
>  }
> @@ -927,8 +1090,12 @@ cord_create(struct cord *cord, const char *name)
>  	/* Record stack extents */
>  	tt_pthread_attr_getstack(cord->id, &cord->sched.stack,
>  				 &cord->sched.stack_size);
> +	cord->sched.stack_overflow_wmark = cord->sched.stack;
> +	cord->sched.stack_shrink_wmark = cord->sched.stack;

I don't think we need to set the watermarks for the sched fiber,
even if ASAN is on.

>  #else
>  	cord->sched.stack = NULL;
> +	cord->sched.stack_overflow_wmark = NULL;
> +	cord->sched.stack_shrink_wmark = NULL;
>  	cord->sched.stack_size = 0;
>  #endif
>  }
> diff --git a/src/lib/core/fiber.h b/src/lib/core/fiber.h
> index f1f5a0555..431e3da09 100644
> --- a/src/lib/core/fiber.h
> +++ b/src/lib/core/fiber.h
> @@ -348,6 +348,21 @@ struct fiber {
>  	struct slab *stack_slab;
>  	/** Coro stack addr. */
>  	void *stack;
> +	/**
> +	 * Stack watermark addr to detect
> +	 * if we need shrink stack on reuse.
> +	 */
> +	void *stack_shrink_wmark;
> +	/**
> +	 * Stack watermark addr for overflow detection. To warn
> +	 * a user about stack eager fibers.
> +	 */
> +	void *stack_overflow_wmark;
> +	/**
> +	 * An offset to watermark position in stack
> +	 * since page bound address.
> +	 */
> +	unsigned int wmark_inpage_offset;
>  	/** Coro stack size. */
>  	size_t stack_size;
>  	/** Valgrind stack id. */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] lib/core/fiber: Increase default stack size
  2019-03-13 13:42   ` Vladimir Davydov
@ 2019-03-13 13:53     ` Cyrill Gorcunov
  2019-03-13 14:13       ` Vladimir Davydov
  0 siblings, 1 reply; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-13 13:53 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml

On Wed, Mar 13, 2019 at 04:42:53PM +0300, Vladimir Davydov wrote:
> On Wed, Mar 13, 2019 at 01:47:19AM +0300, Cyrill Gorcunov wrote:
> > The default 64K stack size used for years become too small
> > for modern distors (Fedora 29 and etc) where third party libraries
> > (such as ncurses) started to use 64K for own buffers and we get
> > SIGSGV early without reaching interactive console phase.
> > 
> > Thus we increase default size up to 512K which should fit
> 
> May be, better use 1 MB, just to be sure?

Well, you know iirc 512 should fit even on current MacOS.
Also 1M is not guarantee us anything we have to make
this value configurable on early init. And I'm gonna
make it so.

> > for common case. Later we will make this value configurable
> > to address arbitrary stack sizes without a need to rebuild
> > the whole code.
> > 
> > Note the values are switched to 4K page granularity for sake
> > of future modifications -- we gonna manipulate pages to
> > relax rss usage if OS allows.
> > 
> > Closes #3418
> 
> Should be "Part of #3418". We use "Closes" only with the last patch in
> the series (it makes GitHub close the issue).

Strictly speaking it closes this particular issue. The things
we gonna implement on top is rather an optimization on memory
usage. So i would rather open another issue to address it.

> > ---
> >  src/lib/core/fiber.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> > index abd6c6b11..f16ac873f 100644
> > --- a/src/lib/core/fiber.c
> > +++ b/src/lib/core/fiber.c
> > @@ -93,9 +93,9 @@ static int stack_direction;
> >  
> >  enum {
> >  	/* The minimum allowable fiber stack size in bytes */
> > -	FIBER_STACK_SIZE_MINIMAL = 16384,
> > +	FIBER_STACK_SIZE_MINIMAL = 4 << 12,
> >  	/* Default fiber stack size in bytes */
> > -	FIBER_STACK_SIZE_DEFAULT = 65536
> > +	FIBER_STACK_SIZE_DEFAULT = 128 << 12
> 
> I don't like using '<< 12' for stack sizes. Why 12? Because the page
> size is typically 4K. I wouldn't count on that - it's userspace so it
> looks confusing.

Because madvise works with 4K pages and such record force
a code reader to pay attention that we're counting by
pages. Actually once I make this values configurable
I'll use "nr_page * page_size" record.

Volodya, if these are only nits you worries then fine
I can easily fix them :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage
  2019-03-13 13:52   ` Vladimir Davydov
@ 2019-03-13 14:00     ` Cyrill Gorcunov
  2019-03-13 14:17       ` Vladimir Davydov
  0 siblings, 1 reply; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-13 14:00 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml

On Wed, Mar 13, 2019 at 04:52:04PM +0300, Vladimir Davydov wrote:
> >  
> > +static bool
> > +stack_has_wmark(void *addr)
> 
> Please add brief comments to all the functions you add.

ok

> > +
> > +	/*
> > +	 * When dropping pages make sure the page
> > +	 * containing overflow mark is untouched.
> > +	 * Same time no need to unmap the page which
> > +	 * carries "shrink" wmark, since we're updating
> > +	 * this page anyway.
> > +	 */
> > +	if (stack_direction < 0) {
> > +		end = page_align_down(fiber->stack_shrink_wmark);
> > +		start = page_align_up(fiber->stack_overflow_wmark);
> > +		madvise(start, end - start, MADV_DONTNEED);
> > +	} else {
> > +		start = page_align_up(fiber->stack_shrink_wmark);
> > +		end = page_align_down(fiber->stack_overflow_wmark);
> > +		madvise(start, end - start, MADV_DONTNEED);
> > +	}
> 
> Nit: you could move madvise() out of the if-else block.

indeed, thanks!

> > +
> > +	if (!stack_has_wmark(fiber->stack_overflow_wmark)) {
> > +		say_warn("stack usage is close to the limit of %zu bytes",
> > +			 (size_t)FIBER_STACK_SIZE_DEFAULT);
> 
> IMO warning only when we are close to the limit isn't very useful.
> Let's keep track of the max stack size instead and print it whenever
> it's increased.

Could you please elaborate. I don't understand what you mean by
keep tracking of max stack size. You propose to remember somewhere
if this mark has been vanished?

> > +	fiber->wmark_inpage_offset = ((rand() % 128) + 8) & ~7;
> 
> Please define a constant for this.

ok

> > +		fiber->stack_shrink_wmark -= 16 << 12;
> 
> Ditto.

ok

> > @@ -927,8 +1090,12 @@ cord_create(struct cord *cord, const char *name)
> >  	/* Record stack extents */
> >  	tt_pthread_attr_getstack(cord->id, &cord->sched.stack,
> >  				 &cord->sched.stack_size);
> > +	cord->sched.stack_overflow_wmark = cord->sched.stack;
> > +	cord->sched.stack_shrink_wmark = cord->sched.stack;
> 
> I don't think we need to set the watermarks for the sched fiber,
> even if ASAN is on.

Won't they left unitialized then? The only reason I put nils here
is to force them have known values even if they are unused.

	Cyrill

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] lib/core/fiber: Increase default stack size
  2019-03-13 13:53     ` Cyrill Gorcunov
@ 2019-03-13 14:13       ` Vladimir Davydov
  0 siblings, 0 replies; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-13 14:13 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On Wed, Mar 13, 2019 at 04:53:07PM +0300, Cyrill Gorcunov wrote:
> On Wed, Mar 13, 2019 at 04:42:53PM +0300, Vladimir Davydov wrote:
> > On Wed, Mar 13, 2019 at 01:47:19AM +0300, Cyrill Gorcunov wrote:
> > > The default 64K stack size used for years become too small
> > > for modern distors (Fedora 29 and etc) where third party libraries
> > > (such as ncurses) started to use 64K for own buffers and we get
> > > SIGSGV early without reaching interactive console phase.
> > > 
> > > Thus we increase default size up to 512K which should fit
> > 
> > May be, better use 1 MB, just to be sure?
> 
> Well, you know iirc 512 should fit even on current MacOS.

OK I see. Let's leave it as is then.

> Also 1M is not guarantee us anything we have to make
> this value configurable on early init. And I'm gonna
> make it so.
> 
> > > for common case. Later we will make this value configurable
> > > to address arbitrary stack sizes without a need to rebuild
> > > the whole code.
> > > 
> > > Note the values are switched to 4K page granularity for sake
> > > of future modifications -- we gonna manipulate pages to
> > > relax rss usage if OS allows.
> > > 
> > > Closes #3418
> > 
> > Should be "Part of #3418". We use "Closes" only with the last patch in
> > the series (it makes GitHub close the issue).
> 
> Strictly speaking it closes this particular issue. The things
> we gonna implement on top is rather an optimization on memory
> usage. So i would rather open another issue to address it.

Seeing "Closes #3418" a few times in the same series looks weird.
Without the third patch, this one shouldn't be committed so I'd use
"Part of".

> 
> > > ---
> > >  src/lib/core/fiber.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/src/lib/core/fiber.c b/src/lib/core/fiber.c
> > > index abd6c6b11..f16ac873f 100644
> > > --- a/src/lib/core/fiber.c
> > > +++ b/src/lib/core/fiber.c
> > > @@ -93,9 +93,9 @@ static int stack_direction;
> > >  
> > >  enum {
> > >  	/* The minimum allowable fiber stack size in bytes */
> > > -	FIBER_STACK_SIZE_MINIMAL = 16384,
> > > +	FIBER_STACK_SIZE_MINIMAL = 4 << 12,
> > >  	/* Default fiber stack size in bytes */
> > > -	FIBER_STACK_SIZE_DEFAULT = 65536
> > > +	FIBER_STACK_SIZE_DEFAULT = 128 << 12
> > 
> > I don't like using '<< 12' for stack sizes. Why 12? Because the page
> > size is typically 4K. I wouldn't count on that - it's userspace so it
> > looks confusing.
> 
> Because madvise works with 4K pages and such record force
> a code reader to pay attention that we're counting by

But we align pointers passed to madvise by page size anyway.
Seeing <<12 in userspace is weird.

> pages. Actually once I make this values configurable
> I'll use "nr_page * page_size" record.
> 
> Volodya, if these are only nits you worries then fine
> I can easily fix them :)

With this particular patch, yes only nit picks :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage
  2019-03-13 14:00     ` Cyrill Gorcunov
@ 2019-03-13 14:17       ` Vladimir Davydov
  2019-03-13 16:15         ` Cyrill Gorcunov
  0 siblings, 1 reply; 11+ messages in thread
From: Vladimir Davydov @ 2019-03-13 14:17 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: tml

On Wed, Mar 13, 2019 at 05:00:54PM +0300, Cyrill Gorcunov wrote:
> > > +
> > > +	if (!stack_has_wmark(fiber->stack_overflow_wmark)) {
> > > +		say_warn("stack usage is close to the limit of %zu bytes",
> > > +			 (size_t)FIBER_STACK_SIZE_DEFAULT);
> > 
> > IMO warning only when we are close to the limit isn't very useful.
> > Let's keep track of the max stack size instead and print it whenever
> > it's increased.
> 
> Could you please elaborate. I don't understand what you mean by
> keep tracking of max stack size. You propose to remember somewhere
> if this mark has been vanished?

Yeah, keep track of the max recorded stack size in a static variable
and keep advancing the higher watermark position whenever it gets
overwritten so that we can report max stack size used:

  I> max stack size is 73728 bytes
  ...
  I> max stack size is 112640 bytes

(the wording might need polishing)

> > > @@ -927,8 +1090,12 @@ cord_create(struct cord *cord, const char *name)
> > >  	/* Record stack extents */
> > >  	tt_pthread_attr_getstack(cord->id, &cord->sched.stack,
> > >  				 &cord->sched.stack_size);
> > > +	cord->sched.stack_overflow_wmark = cord->sched.stack;
> > > +	cord->sched.stack_shrink_wmark = cord->sched.stack;
> > 
> > I don't think we need to set the watermarks for the sched fiber,
> > even if ASAN is on.
> 
> Won't they left unitialized then? The only reason I put nils here
> is to force them have known values even if they are unused.

I'd set them to NULLs for both cases, i.e. outside ifdef.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage
  2019-03-13 14:17       ` Vladimir Davydov
@ 2019-03-13 16:15         ` Cyrill Gorcunov
  0 siblings, 0 replies; 11+ messages in thread
From: Cyrill Gorcunov @ 2019-03-13 16:15 UTC (permalink / raw)
  To: Vladimir Davydov; +Cc: tml

On Wed, Mar 13, 2019 at 05:17:24PM +0300, Vladimir Davydov wrote:
> 
> Yeah, keep track of the max recorded stack size in a static variable
> and keep advancing the higher watermark position whenever it gets
> overwritten so that we can report max stack size used:
> 
>   I> max stack size is 73728 bytes
>   ...
>   I> max stack size is 112640 bytes
> 
> (the wording might need polishing)

I see what you mean. Will try.

> > 
> > Won't they left unitialized then? The only reason I put nils here
> > is to force them have known values even if they are unused.
> 
> I'd set them to NULLs for both cases, i.e. outside ifdef.

OK

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-03-13 16:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-12 22:47 [PATCH v5 0/2] lib/core/fiber: Increase default stack size Cyrill Gorcunov
2019-03-12 22:47 ` [PATCH 1/3] " Cyrill Gorcunov
2019-03-13 13:42   ` Vladimir Davydov
2019-03-13 13:53     ` Cyrill Gorcunov
2019-03-13 14:13       ` Vladimir Davydov
2019-03-12 22:47 ` [PATCH 2/3] lib/core/fiber: Mark stack as unneeded on creation Cyrill Gorcunov
2019-03-12 22:47 ` [PATCH 3/3] lib/core/fiber: Put watermarks into stack to track its usage Cyrill Gorcunov
2019-03-13 13:52   ` Vladimir Davydov
2019-03-13 14:00     ` Cyrill Gorcunov
2019-03-13 14:17       ` Vladimir Davydov
2019-03-13 16:15         ` Cyrill Gorcunov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox