From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Return-Path: Date: Wed, 6 Mar 2019 22:03:49 +0300 From: Cyrill Gorcunov Subject: Re: [PATCH 3/3] core/fiber: Put watermarks into stack to track its usage Message-ID: <20190306190349.GD10420@uranus.lan> References: <20190305223854.14660-1-gorcunov@gmail.com> <20190305223854.14660-4-gorcunov@gmail.com> <20190306180504.dsa4xq6kbdbymgqe@esperanza> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190306180504.dsa4xq6kbdbymgqe@esperanza> To: Vladimir Davydov Cc: tarantool-patches@freelists.org, kostja@tarantool.org List-ID: On Wed, Mar 06, 2019 at 09:05:04PM +0300, Vladimir Davydov wrote: ... > > + > > +static void > > +fiber_wmark_recycle(struct fiber *fiber) > > +{ > > + static bool overflow_warned = false; > > + > > + if (fiber->stack == NULL || fiber->flags & FIBER_CUSTOM_STACK) > > + return; > > + > > +#ifndef TARGET_OS_DARWIN > > + /* > > + * On recycle we're trying to shrink stack > > + * as much as we can until first mark overwrite > > + * detected, then we simply zap watermark and > > + * assume the stack is balanced and won't change > > + * much in future. > > + */ > > I don't understand. If the watermark has been exceeded once for a > particular fiber, we'll never ever shrink the stack back to normal? I guess you meant "extend", not "shrink". But true -- every recycle we're trying to shrink stack, ie to make it less. And once we found that wmark is vanished we stop further analisys. > That is if a task eager for stack is rescheduled on different fibers > they will all end up with huge stacks. That's not what we want. > We want the stack size to be set dynamically so that the *majority* of > fibers don't need to shrink/grow stacks back and forth. Or am I reading > the code wrong? No, you're right. But it is unclean for me yet how we should reach this goal? Track stack usage in context switch? ... > > + > > + if (!stack_has_wmark(fiber->stack_overflow_wmark)) { > > + say_warn("fiber %d seems to overflow the stack soon", > > + fiber->name, fiber->fid); > > No point in printing fiber->name and id - they are printed by the say > module, anyway. At the same time I'd prefer to keep max stack size here > and print its value whenever we have to grow stack, similarly to how > the Linux kernel handles it. ok > > + if (stack_direction < 0) { > > + fiber->stack_overflow_wmark = fiber->stack; > > + fiber->stack_overflow_wmark += wmark_inpage_offset; > > + > > + fiber->stack_dynamic_wmark = fiber->stack_overflow_wmark + page_size; > > + } else { > > + fiber->stack_overflow_wmark = fiber->stack + fiber->stack_size; > > + fiber->stack_overflow_wmark -= page_size; > > + fiber->stack_overflow_wmark -= wmark_inpage_offset; > > + > > + fiber->stack_dynamic_wmark = fiber->stack_overflow_wmark - page_size; > > I'd rather start with the minimal allowed stack size and grow it on > demand, if there are too many fibers that exceed the watermark now and > then, not vice versa (imagine we start from 16 MB). This won't work. Imagine we put wmark in _first_ page, and then we find that the wmark has been overwriten, where to put next mark then? Or you propose to catch signals? > > + > > + /* > > + * To increase probability of the stack overflow > > + * detection we put _first_ mark at random position > > + * in first 128 bytes range. The rest of the marks > > + * are put with constant step (because we should not > > + * pressue random generator much in case of hight > > + * number of fibers). > > + */ > > + random_bytes((void *)&v, sizeof(v)); > > Why not rand? Because we already have a helper for random numbers :) > > Anyway, I think we should recalculate the offset for each fiber > individually each time it's recycled. Setting it once on system > initialization is no better than not using randomization at all. > As for pressure on the random generator - I don't understand why > it can possibly be bad. After all, it's a pseudo random generator, > not /dev/random. ok > > @@ -348,6 +348,10 @@ struct fiber { > > struct slab *stack_slab; > > /** Coro stack addr. */ > > void *stack; > > + /** Stack dynamic watermark addr for usage optimization. */ > > + void *stack_dynamic_wmark; > > + /** Stack watermark addr for overflow detection. */ > > + void *stack_overflow_wmark; > > Here I'd like to see a more elaborate comment, explaining why we need > these members, what exactly the stack usage optimization is about. ok