[RFC v3] fiber: Increase default stack size
vdavydov.dev at gmail.com
Tue Feb 26 13:26:56 MSK 2019
On Tue, Feb 26, 2019 at 12:12:54PM +0300, Cyrill Gorcunov wrote:
> > > I'm not yet familiar with slab engine, does it allocates
> > > pages on lazy fashion or we need to pass 'dontneed' on
> > > first fiber creation too?
> > Oops, you're right, good catch! The allocator may poison slab if NDEBUG
> > is unset. So we can either
> > - Madvise slab on fiber creation, at least in NDEBUG mode. Simple, but
> > depends on the allocator internals.
> > - Patch the 'small' library to make the allocator do madvise for us.
> > IMO it would look better, but would clutter the allocator API.
> > - Don't use 'small' allocator at all for default slab allocations, and
> > simply mmap stack and link them in a free list (is it OK to mmap a
> > few MB chunk per each fiber?).
> > I'm inclined to choose the last option. I'll discuss the options with
> > others today and follow-up.
> Will continue at the evening. Also note that stack is special and
> we better should not pass arbitrary sizes to allocate even for
> custom stacks, they all _must_ be PAGE_SIZE aligned (or call it
> PAGE_SIZE orders if you prefer :) Anything else make madvise
> idea useless since kernel requires madvise args to be page
> aligned which is understandable due to hardware.
Talked to Kostja and Georgy. We agreed on the following points:
- Regarding slab poisoning. We don't want to implement ad hoc allocator
for fiber stacks, neither do we really want to patch the small lib
for now. Let's use madvise(MADV_DONTNEED) for all fiber stacks
unconditionally on fiber creation. This should be okay from
performance point of view, because once a fiber is created, it's
never destroyed - it stays on the dead list until recycled.
- 1 MB for max stack size seems to be a bit of an overkill for now.
The default value should be set to 256 KB, but we do need a
configuration option for it. Let's add it to the fiber Lua module.
May be done in a separate patch, but should be submitted together in
the same patch set.
- 16 byte unique identifier for detecting stack overflow doesn't seem
to be enough. Imagine a PATH_MAX buffer allocated on stack that uses
only a hundred bytes for path formatting. It can easily jump over the
watermark. We should probably use random poisoning: say, 4 unique
identifiers 8 bytes each scattered a few hundred bytes apart. Some
math/reasoning behind this would be nice to see in the comments.
- Since madvise() is somewhat expensive to be called on each fiber
recycle, we need to make the watermark dynamic. That is, keep track
of the number of fibers that have exceeded the watermark and when
there are too many of those, increase the watermark value. We could
probably use a histogram for this. This would allow us to decrease
the default stack allocation size from 64 KB down to 16 KB, which
would be really nice. This should be done in a separate patch, but
again in the scope of this issue.
- We definitely need this patch (or patches) to be covered with unit
tests. Please add corresponding cases to test/unit/fiber.cc.
More information about the Tarantool-patches