Re: [tarantool-patches] [PATCH v2 1/2] box: create bigrefs for tuples

Tarantool development patches archive
 help / color / mirror / Atom feed

From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: imeevma@tarantool.org
Cc: tarantool-patches@freelists.org
Subject: Re: [tarantool-patches] [PATCH v2 1/2] box: create bigrefs for tuples
Date: Sat, 9 Jun 2018 15:36:02 +0300	[thread overview]
Message-ID: <20180609123602.3wfqvq5abodoeyqg@esperanza> (raw)
In-Reply-To: <6601bc3bb229282c7489ad6391c93319c5ff0366.1528544705.git.imeevma@gmail.com>

On Sat, Jun 09, 2018 at 02:49:06PM +0300, imeevma@tarantool.org wrote:
> Due to limitation of reference counters for tuple being only
> 65535 it was possible to reach this limitation. This patch
> increases capacity of reference counters to 4 billions.
> 
> Closes #3224
> ---

> Branch: https://github.com/tarantool/tarantool/tree/imeevma/gh-3224-tuple-bigrefs
> Issue: https://github.com/tarantool/tarantool/issues/3224

If there's a cover letter, links should be added only there, not to
individual patches.

> @@ -174,20 +174,22 @@ process_rw(struct request *request, struct space *space, struct tuple **result)
>  		txn_rollback_stmt();
>  		return -1;
>  	}
> +	if (result == NULL)
> +		return txn_commit_stmt(txn, request);
> +	*result = tuple;
> +	if (tuple == NULL)
> +		return txn_commit_stmt(txn, request);
>  	/*
>  	 * Pin the tuple locally before the commit,
>  	 * otherwise it may go away during yield in
>  	 * when WAL is written in autocommit mode.
>  	 */
> -	TupleRefNil ref(tuple);
> -	if (txn_commit_stmt(txn, request) != 0)
> -		return -1;
> -	if (result != NULL) {
> -		if (tuple != NULL && tuple_bless(tuple) == NULL)
> -			return -1;
> -		*result = tuple;
> -	}
> -	return 0;
> +	tuple_ref(tuple);
> +	int rc = txn_commit_stmt(txn, request);
> +	if (rc == 0)
> +		tuple_bless(tuple);
> +	tuple_unref(tuple);
> +	return rc;

I don't like they you rework process_rw. IMO calling txn_commit_stmt
conditionally obfuscates the function work flow, which should be as
simple as

    begin_stmt
    execute_request
    commit_stmt

> diff --git a/src/box/tuple.c b/src/box/tuple.c
> index 014f374..68540f4 100644
> --- a/src/box/tuple.c
> +++ b/src/box/tuple.c
> @@ -48,6 +48,26 @@ enum {
>  	OBJSIZE_MIN = 16,
>  };
>  
> +/**
> + * Container for big reference counters. Contains array of big
> + * reference counters, size of this array and number of non-zero
> + * big reference counters. When reference counter of tuple becomes
> + * more than 32767, field refs of this tuple becomes index of big
> + * reference counter in big reference counter array and field
> + * is_bigref is set true. The moment big reference becomes equal
> + * 32767 it is set to 0, refs of the tuple becomes 32767 and
> + * is_bigref becomes false. Big reference counter can be equal to
> + * 0 or be more than 32767.
> + */
> +static struct bigref_list {
> +	/** Array of big reference counters. */
> +	uint32_t *refs;
> +	/** Number of non-zero elements in the array. */
> +	uint16_t size;
> +	/** Capacity of the array. */
> +	uint16_t capacity;
> +} bigref_list;

Technically, it isn't a list, but rather an array. Anyway, I think I'd
call this structure simply 'bigref', and functions for allocating and
freeing big references - 'bigref_alloc' and 'bigref_free'.

> +/**
> + * Return index for new big reference counter and allocate memory
> + * if needed.
> + * @retval index for new big reference counter.
> + */
> +static inline uint16_t
> +bigref_list_new_index()
> +{
> +	if (bigref_list.size < bigref_list.capacity) {
> +		uint16_t vacant_index = 0;
> +		while (bigref_list.refs[vacant_index] != 0)
> +			++vacant_index;
> +		++bigref_list.size;
> +		return vacant_index;
> +	}

Why don't you use a free list for this? This would improve the
worst case complexity of a bigref allocation from O(N) to O(1).

> +	/* Extend the array. */
> +	uint16_t capacity = bigref_list.capacity;
> +	if (capacity == 0)
> +		capacity = BIGREF_MIN_CAPACITY;
> +	else if (capacity < BIGREF_MAX_CAPACITY)
> +		capacity = MIN(capacity * BIGREF_FACTOR, BIGREF_MAX_CAPACITY);
> +	else
> +		panic("Too many big references");
> +	uint32_t *refs = (uint32_t *) realloc(bigref_list.refs, capacity *
> +					      sizeof(*bigref_list.refs));
> +	if (refs == NULL) {
> +		panic("failed to reallocate %zu bytes: Cannot allocate "\
> +		      "memory.", capacity * sizeof(*bigref_list.refs));
> +	}

The max size of a bigref array is 32767 ints, i.e. just 128 KB, and
there's the only object of this type. Is there any point in growing it
dynamically rather than allocating the max size statically on startup?

> +	bigref_list.refs = refs;
> +	memset(bigref_list.refs + bigref_list.capacity, 0, (capacity -
> +	       bigref_list.capacity) * sizeof(*bigref_list.refs));
> +	bigref_list.capacity = capacity;
> +	return bigref_list.size++;
> +}
> +
> +void
> +tuple_ref_slow(struct tuple *tuple)
> +{
> +	assert(tuple->is_bigref || tuple->refs == TUPLE_REF_MAX);
> +	if (! tuple->is_bigref) {
> +		tuple->ref_index = bigref_list_new_index();
> +		tuple->is_bigref = true;
> +		bigref_list.refs[tuple->ref_index] = TUPLE_REF_MAX;
> +	} else if (bigref_list.refs[tuple->ref_index] == BIGREF_MAX) {
> +		panic("Tuple big reference counter overflow");
> +	}
> +	bigref_list.refs[tuple->ref_index]++;
> +}
> +
> +/**
> + * Try to decrease allocated memory if it is possible. Free memory
> + * when size == 0.
> + */
> +static inline void
> +bigref_list_delete_index(uint16_t index)
> +{
> +	bigref_list.refs[index] = 0;
> +	if (--bigref_list.size == 0) {
> +		bigref_list_reset();
> +		return;
> +	}
> +	if (bigref_list.capacity == BIGREF_MIN_CAPACITY ||
> +	    bigref_list.size > bigref_list.capacity / BIGREF_FACTOR)
> +		return;
> +
> +	uint16_t top_index = bigref_list.capacity - 1;
> +	while (bigref_list.refs[top_index] == 0)
> +		top_index--;
> +
> +	uint16_t needed_capacity = top_index + 1;
> +	if (needed_capacity < BIGREF_MIN_CAPACITY)
> +		needed_capacity = BIGREF_MIN_CAPACITY;
> +	if (needed_capacity > bigref_list.capacity / BIGREF_FACTOR)
> +		return;
> +	/* Round up capacity to the next highest power of 2. */
> +	assert(sizeof(needed_capacity) == sizeof(uint16_t));
> +	needed_capacity--;
> +	needed_capacity |= needed_capacity >> 1;
> +	needed_capacity |= needed_capacity >> 2;
> +	needed_capacity |= needed_capacity >> 4;
> +	needed_capacity |= needed_capacity >> 8;
> +	assert(needed_capacity < UINT16_MAX);
> +	needed_capacity++;
> +	uint32_t *refs =
> +		(uint32_t *) realloc(bigref_list.refs, needed_capacity *
> +				     sizeof(*bigref_list.refs));
> +	if (refs == NULL) {
> +		panic("failed to reallocate %zu bytes: Cannot allocate "\
> +		      "memory.", needed_capacity * sizeof(*bigref_list.refs));
> +	}

Even if we agree to grow the bigref array dynamically, I see absolutely
no point in shrinking it: if we used N big refs once, we are likely to
use N big refs again in the future.

> +	bigref_list.refs = refs;
> +	bigref_list.capacity = needed_capacity;
> +}

next prev parent reply	other threads:[~2018-06-09 12:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-09 11:49 [tarantool-patches] [PATCH v2 0/2] Create " imeevma
2018-06-09 11:49 ` [tarantool-patches] [PATCH v2 1/2] box: create " imeevma
2018-06-09 12:36   ` Vladimir Davydov [this message]
2018-06-09 11:49 ` [tarantool-patches] [PATCH v2 2/2] tuple: introduce bigref hints imeevma
2018-06-09 12:39   ` Vladimir Davydov
2018-06-09 12:34 ` [tarantool-patches] [PATCH v2 0/2] Create bigrefs for tuples Vladimir Davydov
     [not found] <cover.1528388742.git.imeevma@gmail.com>
2018-06-07 16:28 ` [tarantool-patches] [PATCH v2 1/2] box: create " imeevma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180609123602.3wfqvq5abodoeyqg@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=imeevma@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [tarantool-patches] [PATCH v2 1/2] box: create bigrefs for tuples' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox