[RFC v3] fiber: Increase default stack size

Vladimir Davydov vdavydov.dev at gmail.com
Tue Feb 26 13:26:56 MSK 2019

On Tue, Feb 26, 2019 at 12:12:54PM +0300, Cyrill Gorcunov wrote:
> > > I'm not yet familiar with slab engine, does it allocates
> > > pages on lazy fashion or we need to pass 'dontneed' on
> > > first fiber creation too?
> > 
> > Oops, you're right, good catch! The allocator may poison slab if NDEBUG
> > is unset. So we can either
> > 
> >  - Madvise slab on fiber creation, at least in NDEBUG mode. Simple, but
> >    depends on the allocator internals.
> >  - Patch the 'small' library to make the allocator do madvise for us.
> >    IMO it would look better, but would clutter the allocator API.
> >  - Don't use 'small' allocator at all for default slab allocations, and
> >    simply mmap stack and link them in a free list (is it OK to mmap a
> >    few MB chunk per each fiber?).
> > 
> > I'm inclined to choose the last option. I'll discuss the options with
> > others today and follow-up.
> Will continue at the evening. Also note that stack is special and
> we better should not pass arbitrary sizes to allocate even for
> custom stacks, they all _must_ be PAGE_SIZE aligned (or call it
> PAGE_SIZE orders if you prefer :) Anything else make madvise
> idea useless since kernel requires madvise args to be page
> aligned which is understandable due to hardware.

Talked to Kostja and Georgy. We agreed on the following points:

 - Regarding slab poisoning. We don't want to implement ad hoc allocator
   for fiber stacks, neither do we really want to patch the small lib
   for now. Let's use madvise(MADV_DONTNEED) for all fiber stacks
   unconditionally on fiber creation. This should be okay from
   performance point of view, because once a fiber is created, it's
   never destroyed - it stays on the dead list until recycled.

 - 1 MB for max stack size seems to be a bit of an overkill for now.
   The default value should be set to 256 KB, but we do need a
   configuration option for it. Let's add it to the fiber Lua module.
   May be done in a separate patch, but should be submitted together in
   the same patch set.

 - 16 byte unique identifier for detecting stack overflow doesn't seem
   to be enough. Imagine a PATH_MAX buffer allocated on stack that uses
   only a hundred bytes for path formatting. It can easily jump over the
   watermark. We should probably use random poisoning: say, 4 unique
   identifiers 8 bytes each scattered a few hundred bytes apart. Some
   math/reasoning behind this would be nice to see in the comments.

 - Since madvise() is somewhat expensive to be called on each fiber
   recycle, we need to make the watermark dynamic. That is, keep track
   of the number of fibers that have exceeded the watermark and when
   there are too many of those, increase the watermark value. We could
   probably use a histogram for this. This would allow us to decrease
   the default stack allocation size from 64 KB down to 16 KB, which
   would be really nice. This should be done in a separate patch, but
   again in the scope of this issue.

 - We definitely need this patch (or patches) to be covered with unit
   tests. Please add corresponding cases to test/unit/fiber.cc.

More information about the Tarantool-patches mailing list