[PATCH v2 11/11] vinyl: introduce quota consumer priorities

Mon Oct 8 14:10:10 MSK 2018

On Sat, Oct 06, 2018 at 04:24:05PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev at gmail.com> [18/09/28 21:00]:
> 
> Since we agreed to go with the implemented approach, please see a
> few implementation comments below.
> 
> > Currently, we only limit quota consumption rate so that writers won't
> > hit the hard limit before memory dump is complete. However, it isn't
> > enough, because we also need to consider compaction: if it doesn't keep
> > up with dumps, read and space amplification will grow uncontrollably.
> > 
> > The problem is compaction may be a quota consumer by itself, as it may
> > generate deferred DELETE statements for secondary indexes. We can't
> > ignore quota completely there, because if we do, we may hit the memory
> > limit and stall all writers, which is unacceptable, but we do want to
> > ignore the rate limit imposed to make sure that compaction keeps up with
> > dumps, otherwise compaction won't benefit from such a throttling.
> > 
> > To tackle this problem, this patch introduces quota consumer priorities.
> > Now a quota object maintains one rate limit and one wait queue per each
> > transaction priority. Rate limit i is used by a consumer with priority
> > prio if and only if prio <= i. Whenever a consumer has to be throttled,
> > it is put to sleep to the wait queue corresponding to its priority.
> > When quota is replenished, we pick the oldest consumer among all that
> > may be woken up. This ensures fairness.
> 
> As discusses, these are not priorities, these are resource types.
> Please change the patch to pass a bit mask of resource types for
> which we request the quota.

Actually, there are resource types and there are consumer types.
I admit the fact that I mixed them may look confusing at the first
glance. We may introduce a seprate enum for resource types with a
mapping between them.

enum vy_quota_consumer_type {
	VY_QUOTA_CONSUMER_TX = 0,
	VY_QUOTA_CONSUMER_COMPACTION = 1,
	vy_quota_consumer_max,
};

enum vy_quota_resource_type {
	VY_QUOTA_RESOURCE_DISK = 0,
	VY_QUOTA_RESOURCE_MEMORY = 1,
	vy_quota_resource_max,
};

static unsigned vy_quota_consumer_mask[] = {
	[VY_QUOTA_CONSUMER_TX] = (1 << VY_QUOTA_RESOURCE_DISK) |
				 (1 << VY_QUOTA_RESOURCE_MEMORY),
	[VY_QUOTA_CONSUMER_COMPACTION] = (1 << VY_QUOTA_RESOURCE_MEMORY),
};

struct vy_quota {
	...
	struct rlist wait_queue[vy_quota_consumer_max];
	struct vy_rate_limit rate_limit[vy_quota_resource_max];
};

This would make the code more bulky though. Do we really want to
complicate?

Also, quite frankly I don't quite dig the concept of 'resources' and
the corresponding constant names, because 'memory' rate limit may be
confused with memory usage, which is what vy_quota is about in the first
place.

> 
> Quota refill interval should be varying - please schedule this
> work in a separate patch.
> 
> 
> >  static void
> >  vy_quota_signal(struct vy_quota *q)
> >  {
> > -	if (!rlist_empty(&q->wait_queue)) {
> > +	/*
> > +	 * Wake up a consumer that has waited most no matter
> > +	 * whether it's high or low priority. This assures that
> > +	 * high priority consumers don't uncontrollably throttle
> > +	 * low priority ones.
> > +	 */
> > +	struct vy_quota_wait_node *oldest = NULL;
> > +	for (int i = 0; i < vy_quota_consumer_prio_MAX; i++) {
> > +		struct rlist *wq = &q->wait_queue[i];
> > +		if (rlist_empty(wq))
> > +			continue;
> 
> I still do not understand why you need a second queue and
> timestapms. If you need to ensure total fairness, a single queue
> should be enough. 

I need to maintain one queue per consumer type, because it's possible
that we may wake up consumers of one type, but not of the other. If we
used a single queue, it could occur that consumers that could be woken
up landed in the middle of the queue so that without scanning the queue
we wouldn't be able to find them. Scanning a queue looks ugly.

Think of the multi-queue design like this: there's one queue per each
consumer type. When a resource is replenished, we check only those
queues that store consumers that may proceed and choose the one that has
waited most.

> > +
> >  		struct vy_quota_wait_node *n;
> > -		n = rlist_first_entry(&q->wait_queue,
> > -				      struct vy_quota_wait_node, in_wait_queue);
> > +		n = rlist_first_entry(wq, struct vy_quota_wait_node,
> > +				      in_wait_queue);
> >  		/*
> >  		 * No need in waking up a consumer if it will have
> >  		 * to go back to sleep immediately.
> >  		 */
> > -		if (vy_quota_may_use(q, n->size))
> > -			fiber_wakeup(n->fiber);
> > +		if (!vy_quota_may_use(q, i, n->size))
> > +			continue;
> > +
> > +		if (oldest == NULL || oldest->timestamp > n->timestamp)
> > +			oldest = n;
> >  	}
> > +	if (oldest != NULL)
> > +		fiber_wakeup(oldest->fiber);
> 
> This is some kind of black magic to me. Imagine you have another
> resource type, say, CPU? Will you add the third queue? How this
> will piece of code look then?

Exactly the same. The only thing that would change is vy_quota_may_use:
we would have to add the new resource type there then.

We would have to add a new queue only if a new consumer type had to be
introduced, e.g. a consumer that only consumes CPU and nothing else, but
even then this code would stay the same.