Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [small] Revert "Free all slabs on region reset"
@ 2020-01-29  8:06 Alexander Turenko
  2020-01-29  9:59 ` Kirill Yukhin
  2020-01-29 21:41 ` Konstantin Osipov
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Turenko @ 2020-01-29  8:06 UTC (permalink / raw)
  To: Kirill Yukhin; +Cc: tarantool-patches

This reverts commit 67d7ab44ab09df3356929e3692a03321b31f3ebb.

The goal of the reverted commit was to fix flaky fails of tarantool
tests that checks amount of memory used by a fiber:

 | fiber.info()[fiber.self().id()].memory.used

It also attempts to overcome the situation when a fiber holds some
amount of memory, which is not used in any way. The high limit of such
memory is controlled by a threshold in fiber_gc() tarantool's function
(128 KiB at the moment):

 | void
 | fiber_gc(void)
 | {
 |         if (region_used(&fiber()->gc) < 128 * 1024) {
 |                 region_reset(&fiber()->gc);
 |                 return;
 |         }
 |
 |         region_free(&fiber()->gc);
 | }

The reverted commit, however, leads to significant performance
degradation on certain workloads (see #4736). So the revertion fixes the
performance degradation and opens the problem with tests, which is
tracked in #4750.

Related to #12
Related to https://github.com/tarantool/tarantool/issues/4750
Fixes https://github.com/tarantool/tarantool/issues/4736
---

https://github.com/tarantool/small/tree/Totktonada/gh-4736-revert-region-reset

 small/region.h | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/small/region.h b/small/region.h
index d9be176..bea88c6 100644
--- a/small/region.h
+++ b/small/region.h
@@ -156,16 +156,6 @@ region_reserve(struct region *region, size_t size)
 						       slab.next_in_list);
 		if (size <= rslab_unused(slab))
 			return (char *) rslab_data(slab) + slab->used;
-		/* Try to get a slab from the region cache. */
-		slab = rlist_last_entry(&region->slabs.slabs,
-					struct rslab,
-					slab.next_in_list);
-		if (slab->used == 0 && size <= rslab_unused(slab)) {
-			/* Move this slab to the head. */
-			slab_list_del(&region->slabs, &slab->slab, next_in_list);
-			slab_list_add(&region->slabs, &slab->slab, next_in_list);
-			return (char *) rslab_data(slab);
-		}
 	}
 	return region_reserve_slow(region, size);
 }
@@ -222,14 +212,14 @@ region_aligned_alloc(struct region *region, size_t size, size_t alignment)
 
 /**
  * Mark region as empty, but keep the blocks.
- * Do not change the first slab and use previous slabs as a cache to
- * use for future allocations.
  */
 static inline void
 region_reset(struct region *region)
 {
-	struct rslab *slab;
-	rlist_foreach_entry(slab, &region->slabs.slabs, slab.next_in_list) {
+	if (! rlist_empty(&region->slabs.slabs)) {
+		struct rslab *slab = rlist_first_entry(&region->slabs.slabs,
+						       struct rslab,
+						       slab.next_in_list);
 		region->slabs.stats.used -= slab->used;
 		slab->used = 0;
 	}
-- 
2.22.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-29  8:06 [Tarantool-patches] [small] Revert "Free all slabs on region reset" Alexander Turenko
@ 2020-01-29  9:59 ` Kirill Yukhin
  2020-01-29 21:41 ` Konstantin Osipov
  1 sibling, 0 replies; 7+ messages in thread
From: Kirill Yukhin @ 2020-01-29  9:59 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

Hello,

On 29 янв 11:06, Alexander Turenko wrote:
> This reverts commit 67d7ab44ab09df3356929e3692a03321b31f3ebb.
> 
> The goal of the reverted commit was to fix flaky fails of tarantool
> tests that checks amount of memory used by a fiber:
> 
>  | fiber.info()[fiber.self().id()].memory.used
> 
> It also attempts to overcome the situation when a fiber holds some
> amount of memory, which is not used in any way. The high limit of such
> memory is controlled by a threshold in fiber_gc() tarantool's function
> (128 KiB at the moment):
> 
>  | void
>  | fiber_gc(void)
>  | {
>  |         if (region_used(&fiber()->gc) < 128 * 1024) {
>  |                 region_reset(&fiber()->gc);
>  |                 return;
>  |         }
>  |
>  |         region_free(&fiber()->gc);
>  | }
> 
> The reverted commit, however, leads to significant performance
> degradation on certain workloads (see #4736). So the revertion fixes the
> performance degradation and opens the problem with tests, which is
> tracked in #4750.
> 
> Related to #12
> Related to https://github.com/tarantool/tarantool/issues/4750
> Fixes https://github.com/tarantool/tarantool/issues/4736

I've checked your patch into tarantool/small and bumped new version
in master, 2.3, 2.2 and, as exception 2.1.

--
Regards, Kirill Yukhin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-29  8:06 [Tarantool-patches] [small] Revert "Free all slabs on region reset" Alexander Turenko
  2020-01-29  9:59 ` Kirill Yukhin
@ 2020-01-29 21:41 ` Konstantin Osipov
  2020-01-30  7:23   ` Alexander Turenko
  1 sibling, 1 reply; 7+ messages in thread
From: Konstantin Osipov @ 2020-01-29 21:41 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

* Alexander Turenko <alexander.turenko@tarantool.org> [20/01/29 11:11]:
> This reverts commit 67d7ab44ab09df3356929e3692a03321b31f3ebb.
> 
> The goal of the reverted commit was to fix flaky fails of tarantool
> tests that checks amount of memory used by a fiber:
> 
>  | fiber.info()[fiber.self().id()].memory.used
> 
> It also attempts to overcome the situation when a fiber holds some
> amount of memory, which is not used in any way. The high limit of such
> memory is controlled by a threshold in fiber_gc() tarantool's function
> (128 KiB at the moment):

The commit by Georgy was doing the right thing, it just had a bug
in it. I saw Georgy posted a fix for his commit - but I haven't
seen it on the list.

Why revert this commit rather than fix it? It should be pretty
strightforward to fix, no?

The fix is to avoid iterating over all the free slabs when we
failed to find the right slab size several times, and just get a
new slab at once. I'd even say that Georgy's commit was entirely
correct, it just didn't account for the "land mine" it is setting
for alloc case.

Thoughts?


-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-29 21:41 ` Konstantin Osipov
@ 2020-01-30  7:23   ` Alexander Turenko
  2020-01-30  8:21     ` Konstantin Osipov
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Turenko @ 2020-01-30  7:23 UTC (permalink / raw)
  To: Konstantin Osipov; +Cc: tarantool-patches

On Thu, Jan 30, 2020 at 12:41:22AM +0300, Konstantin Osipov wrote:
> * Alexander Turenko <alexander.turenko@tarantool.org> [20/01/29 11:11]:
> > This reverts commit 67d7ab44ab09df3356929e3692a03321b31f3ebb.
> > 
> > The goal of the reverted commit was to fix flaky fails of tarantool
> > tests that checks amount of memory used by a fiber:
> > 
> >  | fiber.info()[fiber.self().id()].memory.used
> > 
> > It also attempts to overcome the situation when a fiber holds some
> > amount of memory, which is not used in any way. The high limit of such
> > memory is controlled by a threshold in fiber_gc() tarantool's function
> > (128 KiB at the moment):
> 
> The commit by Georgy was doing the right thing, it just had a bug
> in it. I saw Georgy posted a fix for his commit - but I haven't
> seen it on the list.
> 
> Why revert this commit rather than fix it? It should be pretty
> strightforward to fix, no?
> 
> The fix is to avoid iterating over all the free slabs when we
> failed to find the right slab size several times, and just get a
> new slab at once. I'd even say that Georgy's commit was entirely
> correct, it just didn't account for the "land mine" it is setting
> for alloc case.
> 
> Thoughts?

For the record, the discussed fix in small is [1], the change in
tarantool is [2].

[1]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013798.html
[2]: https://lists.tarantool.org/pipermail/tarantool-patches/2020-January/013795.html

The Georgy's fix for small keeps linear traversal over allocated slabs
on region_reset(). Fixed region_reserve() releases one slab at max per
allocation, so this linear traversal will persist for several
region_reset() calls in row. This causes degradation of first ~2M
requests that are 4096+ bytes size on the workload #4736 just because we
traverse the same slabs again and again. I suspect that even calling of
region_free() would give better performance on the workload (at least
without big local slowdowns).

W/o the fix I saw 15-20k allocated slabs during workload from #4736 (and
it was increasing) that explains 2M number of degraded requests (I
assume that ~1000 fibers work on them).

I think that reserve/reset should give predictable performance and so
should be constant until region_free() will be called. Ideally it should
doing something like Georgy made: put one or several unnecessary slabs
to a slab_cache. Even better would be spread deallocations across
allocation calls like LuaJIT does: perform more deallocations if there
are more garbage.

Anyway, I need more time to propose something certain. There are some
thoughts in #4750, but I don't ready to defend them:

> Save a pointer to unused tail of slabs (or store them in a separate
> list) and pass one or several slabs back to slab_cache on allocations
> (region_reserve()) -- this is in the spirit of Georgy's commit
> b4c11319e1ca637394d87cde7514fb43aec6b02f. It does not require linear
> slab list traversal in region_reset() that was added in the reverted
> commit. This way we'll spread deallocations across time and so
> performance will be more predictable comparing a solution with linear
> traversal in region_reset() (that is vital when you have an SLA).

The core of this idea is to create 'garbage' list to distinguish such
slabs from used ones without the full traversal in region_reset().

It should decrease average cost of region_free(), but increase cost of
region_reset() and region_reserve() at a constant. I need to test it on
different workloads to understand whether it is worthful.

BTW, #4750 is only about tests and will be closed soon. If you have
thoughts about gains we can get from changing of region_*() algorithms,
please, file a separate issue and describe them.

WBR, Alexander Turenko.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-30  7:23   ` Alexander Turenko
@ 2020-01-30  8:21     ` Konstantin Osipov
  2020-01-30 11:03       ` Alexander Turenko
  0 siblings, 1 reply; 7+ messages in thread
From: Konstantin Osipov @ 2020-01-30  8:21 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

* Alexander Turenko <alexander.turenko@tarantool.org> [20/01/30 10:27]:
> The Georgy's fix for small keeps linear traversal over allocated slabs
> on region_reset(). Fixed region_reserve() releases one slab at max per
> allocation, so this linear traversal will persist for several
> region_reset() calls in row.

Why do you think region_reset() should traverse over empty slabs?
Empty slabs should always remain at the end of the list, let's
stick to this invariant.

> I think that reserve/reset should give predictable performance and so
> should be constant until region_free() will be called. Ideally it should
> doing something like Georgy made: put one or several unnecessary slabs
> to a slab_cache. Even better would be spread deallocations across
> allocation calls like LuaJIT does: perform more deallocations if there
> are more garbage.

region_free()/region_reset() should be (and is in most cases)
bounded, since it is called as soon as region_used() reaches 128KB.
So yes, it's a list traversal, but it is done only once in every
dozen or so requests and is bounded by a list which is a few
dozens of slabs.

There is no need to complicate it beyond that. It
worked fine for 10 years, after all.


> 
> BTW, #4750 is only about tests and will be closed soon. If you have
> thoughts about gains we can get from changing of region_*() algorithms,
> please, file a separate issue and describe them.
> 

The fix that Georgy made in May was not only about tests. It was
improving region memory usage (and thus indirectly performance),
too.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-30  8:21     ` Konstantin Osipov
@ 2020-01-30 11:03       ` Alexander Turenko
  2020-01-30 12:16         ` Konstantin Osipov
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Turenko @ 2020-01-30 11:03 UTC (permalink / raw)
  To: Konstantin Osipov; +Cc: tarantool-patches

On Thu, Jan 30, 2020 at 11:21:15AM +0300, Konstantin Osipov wrote:
> * Alexander Turenko <alexander.turenko@tarantool.org> [20/01/30 10:27]:
> > The Georgy's fix for small keeps linear traversal over allocated slabs
> > on region_reset(). Fixed region_reserve() releases one slab at max per
> > allocation, so this linear traversal will persist for several
> > region_reset() calls in row.
> 
> Why do you think region_reset() should traverse over empty slabs?
> Empty slabs should always remain at the end of the list, let's
> stick to this invariant.

Don't get the point. Do you propose to stop this traverse on a first
unused slabs? AFAIU, there may be slabs with zero slab->used in a
middle. We cannot stop based on `slab->used` value.

> > I think that reserve/reset should give predictable performance and so
> > should be constant until region_free() will be called. Ideally it should
> > doing something like Georgy made: put one or several unnecessary slabs
> > to a slab_cache. Even better would be spread deallocations across
> > allocation calls like LuaJIT does: perform more deallocations if there
> > are more garbage.
> 
> region_free()/region_reset() should be (and is in most cases)
> bounded, since it is called as soon as region_used() reaches 128KB.
> So yes, it's a list traversal, but it is done only once in every
> dozen or so requests and is bounded by a list which is a few
> dozens of slabs.

A list traversal on **each** region_reset(), not once. 20k, not few
dozens. It does not look okay. It is even seen on the benchmark in
#4736 (if uncomment per-batch RPS prints and run it on the patched
version).

> 
> There is no need to complicate it beyond that. It
> worked fine for 10 years, after all.

I agree that the code should not be much more complex than now. Maybe
even should not be changed (however I can say something for sure only
after proper benchmarking of different variants on different workloads).

> 
> 
> > 
> > BTW, #4750 is only about tests and will be closed soon. If you have
> > thoughts about gains we can get from changing of region_*() algorithms,
> > please, file a separate issue and describe them.
> > 
> 
> The fix that Georgy made in May was not only about tests. It was
> improving region memory usage (and thus indirectly performance),
> too.

1. This need to be verified.
2. Even if it increases performance on some workloads, it should not
   significantly degrade others.

Small deallocations on during allocation is the good idea. Let's file an
issue against this.

Linear traversal in region_reset() is the bad one, IMHO. Especially
considering memory access pattern: it is list traversal, not an array.
Especially if we need to traverse a slab N/2 times in average (N is the
list size). At least until the upper limit of the list size is not
proven to be small on any workload.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Tarantool-patches] [small] Revert "Free all slabs on region reset"
  2020-01-30 11:03       ` Alexander Turenko
@ 2020-01-30 12:16         ` Konstantin Osipov
  0 siblings, 0 replies; 7+ messages in thread
From: Konstantin Osipov @ 2020-01-30 12:16 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches

* Alexander Turenko <alexander.turenko@tarantool.org> [20/01/30 14:06]:
> > > The Georgy's fix for small keeps linear traversal over allocated slabs
> > > on region_reset(). Fixed region_reserve() releases one slab at max per
> > > allocation, so this linear traversal will persist for several
> > > region_reset() calls in row.
> > 
> > Why do you think region_reset() should traverse over empty slabs?
> > Empty slabs should always remain at the end of the list, let's
> > stick to this invariant.
> 
> Don't get the point. Do you propose to stop this traverse on a first
> unused slabs? AFAIU, there may be slabs with zero slab->used in a
> middle. We cannot stop based on `slab->used` value.

Let's make sure it's impossible. This is basically what Georgy's
patch is doing already.

> > region_free()/region_reset() should be (and is in most cases)
> > bounded, since it is called as soon as region_used() reaches 128KB.
> > So yes, it's a list traversal, but it is done only once in every
> > dozen or so requests and is bounded by a list which is a few
> > dozens of slabs.
> 
> A list traversal on **each** region_reset(), not once. 20k, not few
> dozens. It does not look okay. It is even seen on the benchmark in
> #4736 (if uncomment per-batch RPS prints and run it on the patched
> version).

20k is caused by the bug, it's not worth considering.
Typically it's 128/4k = 32 elements.

> > The fix that Georgy made in May was not only about tests. It was
> > improving region memory usage (and thus indirectly performance),
> > too.
> 
> 1. This need to be verified.
> 2. Even if it increases performance on some workloads, it should not
>    significantly degrade others.
> 
> Small deallocations on during allocation is the good idea. Let's file an
> issue against this.
> 
> Linear traversal in region_reset() is the bad one, IMHO. Especially
> considering memory access pattern: it is list traversal, not an array.
> Especially if we need to traverse a slab N/2 times in average (N is the
> list size). At least until the upper limit of the list size is not
> proven to be small on any workload.

Well, OK. I guess you need time to sort this out.

-- 
Konstantin Osipov, Moscow, Russia

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-01-30 12:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-29  8:06 [Tarantool-patches] [small] Revert "Free all slabs on region reset" Alexander Turenko
2020-01-29  9:59 ` Kirill Yukhin
2020-01-29 21:41 ` Konstantin Osipov
2020-01-30  7:23   ` Alexander Turenko
2020-01-30  8:21     ` Konstantin Osipov
2020-01-30 11:03       ` Alexander Turenko
2020-01-30 12:16         ` Konstantin Osipov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox