From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
	by turing.freelists.org (Avenir Technologies Mail Multiplex)
	with ESMTP id 0A158249C0 for ; Tue, 29 May 2018 12:27:17 -0400 (EDT)
Received: from turing.freelists.org ([127.0.0.1])
	by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id FG0-W7wcX0Ud for ; Tue, 29 May 2018 12:27:16 -0400 (EDT)
Received: from smtp42.i.mail.ru (smtp42.i.mail.ru [94.100.177.102])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by turing.freelists.org (Avenir Technologies Mail Multiplex)
	with ESMTPS id 4CCD624A8E for ; Tue, 29 May 2018 12:27:16 -0400 (EDT)
Subject: [tarantool-patches] Re: [PATCH box 1/1] box: create bigrefs for tuples
References: <7a17a3937f7729796eac617e5041722d9f0d07a4.1527595851.git.imeevma@tarantool.org>
From: Vladislav Shpilevoy
Message-ID: <7396f216-bec4-9b58-9c3d-002c5f371cca@tarantool.org>
Date: Tue, 29 May 2018 19:27:13 +0300
MIME-Version: 1.0
In-Reply-To: <7a17a3937f7729796eac617e5041722d9f0d07a4.1527595851.git.imeevma@tarantool.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: tarantool-patches-bounce@freelists.org
Errors-to: tarantool-patches-bounce@freelists.org
Reply-To: tarantool-patches@freelists.org
List-help:
List-unsubscribe:
List-software: Ecartis version 1.0.0
List-Id: tarantool-patches
List-subscribe:
List-owner:
List-post:
List-archive:
To: tarantool-patches@freelists.org, Mergen Imeev
Cc: Mergen Imeev

Hello. Thanks for the patch! See my 14 comments below.

On 29/05/2018 15:11, Mergen Imeev wrote:
> From: Mergen Imeev

1. Why do I see the SMTP 'From:' in the message body?

>
> Due to limitation of number of bigrefs being only 65535 it
> was possible to reach this limitation. This patch increases
> capacity of referance counter to 4 billions.

2. reference

>
> Closes #3224
> ---

3.
Please, specify branch and issue web links. And you have to prefix a
branch name with your gh login.

> src/box/tuple.c          |  18 +++++++
> src/box/tuple.h          | 135 +++++++++++++++++++++++++++++++++++++++++++----
> test/box/select.result   |   5 +-
> test/box/select.test.lua |   2 +-
> 4 files changed, 146 insertions(+), 14 deletions(-)
>
> diff --git a/src/box/tuple.c b/src/box/tuple.c
> index 014f374..f4e1829 100644
> --- a/src/box/tuple.c
> +++ b/src/box/tuple.c
> @@ -58,6 +58,12 @@ struct tuple *box_tuple_last;
>
>  struct tuple_format *tuple_format_runtime;
>
> +/**
> + * All bigrefs of tuples

4. Please, finish a sentence with a dot. And describe here shortly why
bigrefs are needed, what the limits are, and how it works.

> + * \sa tuple_ref
> + */
> +struct tuple_bigrefs bigrefs;
> +
>  static void
>  runtime_tuple_delete(struct tuple_format *format, struct tuple *tuple);
>
> @@ -211,6 +217,16 @@ tuple_init(field_name_hash_f hash)
>
>  	box_tuple_last = NULL;
>
> +	bigrefs.size = 0;
> +	bigrefs.capacity = TUPLE_BIGREF_MIN_CAPACITY;
> +	bigrefs.refs = (uint32_t *) malloc(bigrefs.capacity * sizeof(uint32_t));

5. Please, try not to allocate anything until it is really needed. In
most cases bigrefs will be unused.

> +	if (bigrefs.refs == NULL) {
> +		diag_set(OutOfMemory, bigrefs.capacity *
> +			 sizeof(uint32_t), "malloc", "bigrefs.refs");
> +		return -1;
> +	}
> +
> +
>  	if (coll_id_cache_init() != 0)
>  		return -1;
>
> @@ -265,6 +281,8 @@ tuple_free(void)
>  	tuple_format_free();
>
>  	coll_id_cache_destroy();
> +
> +	free(bigrefs.refs);
>  }
>
>  box_tuple_format_t *
> diff --git a/src/box/tuple.h b/src/box/tuple.h
> index e2384dd..c7dbe3b 100644
> --- a/src/box/tuple.h
> +++ b/src/box/tuple.h
> @@ -327,6 +327,19 @@ struct PACKED tuple
>  	 */
>  };
>
> +/**
> + * @brief Container for bigrefs.

6. Bigref is an internal structure, and it must be declared in tuple.c
only.
So you will have two versions of tuple_ref - a static inline tuple_ref
in the header and tuple_ref_slow declared in tuple.h, but implemented
in tuple.c. The static inline tuple_ref calls tuple_ref_slow when it
sees that the ref counter is already overflowed. As examples see
region_reserve and region_reserve_slow. Same about tuple_unref.

> + */
> +struct tuple_bigrefs
> +{
> +	/** reference counter */
> +	uint32_t *refs;
> +	/** index of last non-zero element */
> +	uint16_t size;
> +	/** capacity of container */
> +	uint16_t capacity;
> +};
> +
>  /** Size of the tuple including size of struct tuple. */
>  static inline size_t
>  tuple_size(const struct tuple *tuple)
> @@ -774,7 +787,41 @@ tuple_field_uuid(const struct tuple *tuple, int fieldno,
>  	return 0;
>  }
>
> -enum { TUPLE_REF_MAX = UINT16_MAX };
> +enum {
> +	TUPLE_BIGREFS_FACTOR = 2,
> +	TUPLE_BIGREF_MAX = UINT32_MAX,
> +	TUPLE_REF_MAX = UINT16_MAX >> 1,
> +	TUPLE_REF_FLAG = UINT16_MAX ^ TUPLE_REF_MAX,
> +	TUPLE_BIGREF_MIN_CAPACITY = 16
> +};
> +
> +extern struct tuple_bigrefs bigrefs;
> +
> +/**
> + * Return index for new bigref
> + * @retval Number less than or equal 0x7fff Success.
> + * @retval Number more than 0x7fff Too many refs error.
> + */
> +static inline uint16_t
> +get_bigref_index(void)

7. This function is actually a method of the tuple_bigrefs structure,
and it does not only get - it may extend the refs array, so let's
better call it tuple_bigref_retrieve().

> +{
> +	for(uint16_t i = 0; i < bigrefs.size; ++i)
> +		if(bigrefs.refs[i] == 0)
> +			return i;

8. This cycle is useless when you know all the refs are occupied. As
far as I understand, size is a count of occupied bigrefs. Size is
incremented on each new bigref and decremented on each removed one.
And here you should first check if size < capacity. If it is, you are
sure to find a free bigref in the cycle. Else it makes no sense to
search for a free ref.
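To make the fast-path/slow-path split from comment 6 concrete, here is
a minimal standalone sketch of the pattern. The names my_ref and
my_ref_slow and the call counter are illustrative only, not the
patch's API:

```c
#include <assert.h>
#include <stdint.h>

enum { REF_MAX = UINT16_MAX >> 1 };

/* Counts slow-path invocations, for illustration only. */
static uint32_t slow_path_calls = 0;

/*
 * Slow path. In the real patch this would live in the .c file and
 * move the counter into the bigref array; here it is a stub that
 * only records that the overflow case was taken.
 */
static int
my_ref_slow(uint16_t *refs)
{
	(void) refs;
	++slow_path_calls;
	return 0;
}

/*
 * Fast path: one compare and one increment, inlined into every
 * caller. Only the rare overflow case pays for a function call.
 */
static inline int
my_ref(uint16_t *refs)
{
	if (*refs >= REF_MAX)
		return my_ref_slow(refs);
	++*refs;
	return 0;
}
```

region_reserve and region_reserve_slow follow the same shape: the
inline wrapper handles the common case, the out-of-line function the
expensive one.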
tuple_bigref_new()
{
	if (size < capacity) {
		for (uint16_t i = 0; i < capacity; ++i) {
			if (refs[i] == 0) {
				++size;
				return i;
			}
		}
		unreachable();
	}
	/* Extend the array ... */
}

9. When a 'for' consists of multiple lines, please put its body in
braces.

> +	if(bigrefs.size == bigrefs.capacity - 1) {
> +		if(bigrefs.capacity > TUPLE_REF_MAX / TUPLE_BIGREFS_FACTOR)
> +			return TUPLE_REF_FLAG;

10. What is TUPLE_REF_FLAG in get_bigref_index? If it means an error,
let's better change the function signature:

/**
 * Get or allocate a new free bigref counter.
 * @param[out] new_ref New reference counter index.
 * @retval 0 Success.
 * @retval -1 Error.
 */
int
tuple_bigref_retrieve(uint16_t *new_ref);

> +		bigrefs.capacity *= TUPLE_BIGREFS_FACTOR;
> +		bigrefs.refs = (uint32_t *) realloc(bigrefs.refs,
> +			bigrefs.capacity * sizeof(uint32_t));

11. If realloc has returned NULL, the original refs array would leak.

> +		if (bigrefs.refs == NULL) {
> +			diag_set(OutOfMemory, bigrefs.capacity *
> +				 sizeof(uint32_t), "realloc", "bigrefs.refs");
> +			return TUPLE_REF_FLAG;
> +		}
> +	}
> +	return bigrefs.size++;
> +}
>
>  /**
>   * Increment tuple reference counter.
> @@ -785,15 +832,55 @@ enum { TUPLE_REF_MAX = UINT16_MAX };
>  static inline int
>  tuple_ref(struct tuple *tuple)
>  {
> -	if (tuple->refs + 1 > TUPLE_REF_MAX) {
> -		diag_set(ClientError, ER_TUPLE_REF_OVERFLOW);
> -		return -1;
> +	if(tuple->refs == TUPLE_REF_MAX) {
> +		uint16_t index = get_bigref_index();
> +		if(index > TUPLE_REF_MAX) {
> +			diag_set(ClientError, ER_TUPLE_REF_OVERFLOW);
> +			return -1;
> +		}
> +		tuple->refs = TUPLE_REF_FLAG | index;
> +		bigrefs.refs[index] = TUPLE_REF_MAX;
> +	}
> -	tuple->refs++;
> +	if((tuple->refs & TUPLE_REF_FLAG) > 0) {
> +		uint16_t index = tuple->refs ^ TUPLE_REF_FLAG;
> +		if (bigrefs.refs[index] > TUPLE_BIGREF_MAX - 1) {
> +			diag_set(ClientError, ER_TUPLE_REF_OVERFLOW);

12. I remember Kostja has asked to replace this error with panic().
But I do not like panic(). Let's ask in the product team chat what to
do.
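On comment 11: the standard fix is to store the realloc result in a
temporary and only overwrite the array pointer on success, so the old
block stays reachable (and freeable) on failure. A minimal sketch with
an illustrative name - grow_refs is not from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Grow a uint32_t array by 'factor' without leaking it when realloc
 * fails: the old pointer is kept until the new one is known good.
 * The caller is assumed to have checked that the new capacity does
 * not overflow uint16_t.
 */
static int
grow_refs(uint32_t **refs, uint16_t *capacity, uint16_t factor)
{
	uint16_t new_capacity = *capacity * factor;
	uint32_t *new_refs = realloc(*refs, new_capacity * sizeof(**refs));
	if (new_refs == NULL) {
		/* *refs is still valid here and can be freed later. */
		return -1;
	}
	*refs = new_refs;
	*capacity = new_capacity;
	return 0;
}
```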
In theory this error is unreachable now, but who knows for sure?

> +			return -1;
> +		}
> +		bigrefs.refs[index]++;
> +	} else tuple->refs++;

13. When the 'if' body is in {}, the 'else' body must be too.

>  	return 0;
>  }
>
> diff --git a/test/box/select.test.lua b/test/box/select.test.lua
> index 54c2ecc..5e3071d 100644
> --- a/test/box/select.test.lua
> +++ b/test/box/select.test.lua
> @@ -129,7 +129,7 @@ index = s:create_index('primary', { type = 'tree' })
>  a = s:insert{0}
>  lots_of_links = {}
>  ref_count = 0
> -while (true) do table.insert(lots_of_links, s:get{0}) ref_count = ref_count + 1 end
> +while (ref_count < 100000) do table.insert(lots_of_links, s:get{0}) ref_count = ref_count + 1 end

14. Please, write before the test a comment like

--
-- gh-####: issue description, test description.
--
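As a closing note on the flag arithmetic used in tuple_ref above: the
top bit of the 16-bit counter marks that the low 15 bits are an index
into the external 32-bit array rather than a plain count. A miniature
standalone model - the enum values mirror TUPLE_REF_MAX and
TUPLE_REF_FLAG from the patch, the helper names are illustrative:

```c
#include <assert.h>
#include <stdint.h>

enum {
	REF_MAX  = UINT16_MAX >> 1,      /* 0x7fff: max small counter. */
	REF_FLAG = UINT16_MAX ^ REF_MAX  /* 0x8000: "this is a bigref". */
};

/* True when the counter actually stores a bigref index. */
static int
is_bigref(uint16_t refs)
{
	return (refs & REF_FLAG) != 0;
}

/* Extract the bigref array index from a flagged counter. */
static uint16_t
bigref_index(uint16_t refs)
{
	/* XOR clears the flag bit; same as refs & ~REF_FLAG here. */
	return refs ^ REF_FLAG;
}
```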