From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp5.mail.ru (smtp5.mail.ru [94.100.179.24])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dev.tarantool.org (Postfix) with ESMTPS id 102BB445320
 for ; Wed, 15 Jul 2020 02:50:49 +0300 (MSK)
References: <1594221263-6228-1-git-send-email-alyapunov@tarantool.org>
 <1594221263-6228-15-git-send-email-alyapunov@tarantool.org>
From: Vladislav Shpilevoy
Date: Wed, 15 Jul 2020 01:50:47 +0200
MIME-Version: 1.0
In-Reply-To: <1594221263-6228-15-git-send-email-alyapunov@tarantool.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Tarantool-patches] [PATCH 14/16] tx: indexes
List-Id: Tarantool development patches
To: Aleksandr Lyapunov , tarantool-patches@dev.tarantool.org

Thanks for the patch! See 11 comments below.

> diff --git a/src/box/memtx_bitset.c b/src/box/memtx_bitset.c
> index 67eaf6f..f3ab74f 100644
> --- a/src/box/memtx_bitset.c
> +++ b/src/box/memtx_bitset.c
> @@ -198,19 +199,26 @@ bitset_index_iterator_next(struct iterator *iterator, struct tuple **ret)
> 	assert(iterator->free == bitset_index_iterator_free);
> 	struct bitset_index_iterator *it = bitset_index_iterator(iterator);
> 
> -	size_t value = tt_bitset_iterator_next(&it->bitset_it);
> -	if (value == SIZE_MAX) {
> -		*ret = NULL;
> -		return 0;
> -	}
> -
> +	do {
> +		size_t value = tt_bitset_iterator_next(&it->bitset_it);
> +		if (value == SIZE_MAX) {
> +			*ret = NULL;
> +			return 0;
> +		}
> #ifndef OLD_GOOD_BITSET
> -	struct memtx_bitset_index *index =
> -		(struct memtx_bitset_index *)iterator->index;
> -	*ret = memtx_bitset_index_value_to_tuple(index, value);
> +		struct memtx_bitset_index *index =
> +			(struct memtx_bitset_index *)iterator->index;
> +		struct tuple *tuple =
> +			memtx_bitset_index_value_to_tuple(index, value);
> #else /* #ifndef OLD_GOOD_BITSET */
> -	*ret = value_to_tuple(value);
> +		struct tuple *tuple =value_to_tuple(value);

1. Missing whitespace after =.

> #endif /* #ifndef OLD_GOOD_BITSET */
> +		uint32_t iid = iterator->index->def->iid;
> +		struct txn *txn = in_txn();
> +		bool is_rw = txn != NULL;
> +		*ret = txm_tuple_clarify(txn, tuple, iid, 0, is_rw);

2. Some of these values you don't need to load in the cycle. They don't
change:

* in_txn() can be called out of the cycle just once;
* is_rw can be calculated only once;
* iid does not change;
* struct memtx_bitset_index *index does not change.

The same applies to the rtree changes.

> +	} while (*ret == NULL);
> +
> 	return 0;
> }
> 
> diff --git a/src/box/memtx_hash.c b/src/box/memtx_hash.c

3. On the branch I see a 'txm_snapshot_cleanser' structure in this file.
But not in the email. Can't review it. Why is it called 'cleanser'
instead of 'cleaner'? What is it doing?

> index cdd531c..b3ae60c 100644
> --- a/src/box/memtx_hash.c
> +++ b/src/box/memtx_hash.c
> @@ -128,6 +129,31 @@ hash_iterator_gt(struct iterator *ptr, struct tuple **ret)
> 	return 0;
> }
> 
> +#define WRAP_ITERATOR_METHOD(name) \
> +static int \
> +name(struct iterator *iterator, struct tuple **ret) \
> +{ \
> +	struct txn *txn = in_txn(); \
> +	bool is_rw = txn != NULL; \
> +	uint32_t iid = iterator->index->def->iid; \
> +	bool first = true; \
> +	do { \
> +		int rc = first ? name##_base(iterator, ret) \
> +			       : hash_iterator_ge_base(iterator, ret); \

4. Seems like unnecessary branching. If you know you will specially
handle only the first iteration, then why not make it before the cycle?
And eliminate 'first' + the '?' branch.

Also use the prefix 'is_' for flag names. Or 'has_'/'does_'/etc. The
same for all the other new flags, including 'preserve_old_tuple'.

> +		if (rc != 0 || *ret == NULL) \
> +			return rc; \
> +		first = false; \
> +		*ret = txm_tuple_clarify(txn, *ret, iid, 0, is_rw); \
> +	} while (*ret == NULL); \
> +	return 0; \
> +} \

5. Please, use tabs for alignment. In other places too.

> +struct forgot_to_add_semicolon

6. What is this?

> +
> +WRAP_ITERATOR_METHOD(hash_iterator_ge);
> +WRAP_ITERATOR_METHOD(hash_iterator_gt);
> +
> +#undef WRAP_ITERATOR_METHOD
> +
> @@ -136,12 +162,25 @@ hash_iterator_eq_next(MAYBE_UNUSED struct iterator *it, struct tuple **ret)
> }
> 
> static int
> -hash_iterator_eq(struct iterator *it, struct tuple **ret)
> +hash_iterator_eq(struct iterator *ptr, struct tuple **ret)
> {
> -	it->next = hash_iterator_eq_next;
> -	return hash_iterator_ge(it, ret);
> +	ptr->next = hash_iterator_eq_next;
> +	assert(ptr->free == hash_iterator_free);
> +	struct hash_iterator *it = (struct hash_iterator *) ptr;
> +	struct memtx_hash_index *index = (struct memtx_hash_index *)ptr->index;
> +	struct tuple **res = light_index_iterator_get_and_next(&index->hash_table,
> +							       &it->iterator);

7. Why did you remove the hash_iterator_ge() call? You still can use it
here, with the new name hash_iterator_ge_base().

> +	if (res == NULL) {
> +		*ret = NULL;
> +		return 0;
> +	}
> +	struct txn *txn = in_txn();
> +	bool is_rw = txn != NULL;
> +	*ret = txm_tuple_clarify(txn, *res, ptr->index->def->iid, 0, is_rw);

8. Why isn't it a cycle?

9. Why can't 'txn != NULL' be done inside txm_tuple_clarify()? It takes
the txn pointer anyway, and you calculate 'is_rw' everywhere before the
call.

> +	return 0;
> }
> 
> +

10. Unnecessary new line.
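To illustrate comment 9: a minimal sketch of the signature change I have
in mind. Everything here is a hypothetical stand-in (the struct bodies,
'visible', 'txm_tuple_clarify_impl'), not the real Tarantool code; the
point is only where 'txn != NULL' gets computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real types, just to make the
 * sketch self-contained. */
struct txn { int dummy; };
struct tuple { int visible; };

/* Stand-in for the actual clarify logic, which already receives
 * 'is_rw' today. */
static struct tuple *
txm_tuple_clarify_impl(struct txn *txn, struct tuple *tuple,
		       uint32_t iid, uint32_t mk_index, bool is_rw)
{
	(void)txn; (void)iid; (void)mk_index; (void)is_rw;
	return tuple->visible ? tuple : NULL;
}

/* Suggested shape: derive 'is_rw' from the txn pointer inside the
 * function, so every call site stops computing 'txn != NULL'. */
static inline struct tuple *
txm_tuple_clarify(struct txn *txn, struct tuple *tuple,
		  uint32_t iid, uint32_t mk_index)
{
	bool is_rw = txn != NULL;
	return txm_tuple_clarify_impl(txn, tuple, iid, mk_index, is_rw);
}
```

With that, the call sites above would shrink to something like
`*ret = txm_tuple_clarify(txn, *res, iid, 0);`.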
> /* }}} */
> diff --git a/src/box/memtx_rtree.c b/src/box/memtx_rtree.c
> index 612fcb2..992a422 100644
> --- a/src/box/memtx_rtree.c
> +++ b/src/box/memtx_rtree.c
> @@ -304,6 +305,45 @@ tree_iterator_prev_equal(struct iterator *iterator, struct tuple **ret)
> 	return 0;
> }
> 
> +#define WRAP_ITERATOR_METHOD(name) \
> +static int \
> +name(struct iterator *iterator, struct tuple **ret) \
> +{ \
> +	struct memtx_tree *tree = \
> +		&((struct memtx_tree_index *)iterator->index)->tree; \
> +	struct tree_iterator *it = tree_iterator(iterator); \
> +	struct memtx_tree_iterator *ti = &it->tree_iterator; \
> +	uint32_t iid = iterator->index->def->iid; \
> +	bool is_multikey = iterator->index->def->key_def->is_multikey; \

11. All these dereferences are going to cost a lot, even when there are
no concurrent txns. Can they be done in a lazy mode? Only if the found
tuple is dirty. The same applies to all the other places.

> +	struct txn *txn = in_txn(); \
> +	bool is_rw = txn != NULL; \
> +	do { \
> +		int rc = name##_base(iterator, ret); \
> +		if (rc != 0 || *ret == NULL) \
> +			return rc; \
> +		uint32_t mk_index = 0; \
> +		if (is_multikey) { \
> +			struct memtx_tree_data *check = \
> +				memtx_tree_iterator_get_elem(tree, ti); \
> +			assert(check != NULL); \
> +			mk_index = check->hint; \
> +		} \
> +		*ret = txm_tuple_clarify(txn, *ret, iid, mk_index, is_rw); \
> +	} while (*ret == NULL); \
> +	tuple_unref(it->current.tuple); \
> +	it->current.tuple = *ret; \
> +	tuple_ref(it->current.tuple); \
> +	return 0; \
> +} \
> +struct forgot_to_add_semicolon
> +
> +WRAP_ITERATOR_METHOD(tree_iterator_next);
> +WRAP_ITERATOR_METHOD(tree_iterator_prev);
> +WRAP_ITERATOR_METHOD(tree_iterator_next_equal);
> +WRAP_ITERATOR_METHOD(tree_iterator_prev_equal);
> +
> +#undef WRAP_ITERATOR_METHOD
> +
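To make comment 11 concrete, here is a toy model of the lazy mode,
assuming the tx manager can cheaply tell whether a tuple is dirty. The
'is_dirty' flag, 'load_index_meta()', and the counter are hypothetical
stand-ins modeling the iterator->index->def->... chain of loads, not
real Tarantool API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: the tuple carries a dirty flag. */
struct tuple { bool is_dirty; };

/* Counts how often we actually pay for the dereferences. */
static int deref_count;

static uint32_t
load_index_meta(void)
{
	/* Models iterator->index->def->iid, key_def->is_multikey, etc. */
	deref_count++;
	return 0;
}

/* Lazy clarify: the common clean-tuple path touches nothing but the
 * flag; the expensive loads happen only on the dirty (slow) path. */
static struct tuple *
clarify_lazy(struct tuple *t)
{
	if (!t->is_dirty)
		return t; /* fast path: no extra dereferences at all */
	uint32_t iid = load_index_meta();
	(void)iid; /* the real code would call the clarify logic here */
	return t;
}
```

The counter makes the cost difference observable: clean tuples never
trigger the loads, so the no-concurrent-txn case stays as cheap as
before the patch.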