[Tarantool-patches] [PATCH 1/3] small: implement new size class evaluation

Sun Dec 20 21:31:37 MSK 2020

Thanks for the patch!

See 16 comments below.

> diff --git a/LICENSE b/LICENSE
> index d797c05..734ba6c 100644
> --- a/LICENSE
> +++ b/LICENSE
> @@ -1,4 +1,4 @@
> -Copyright 2010-2016 Tarantool AUTHORS: please see AUTHORS file.
> +Copyright 2010-2020 Tarantool AUTHORS: please see AUTHORS file.

1. Lets omit unnecessary changes. This clearly has nothing to do with
the functional part of the patch.

> diff --git a/small/small_class.h b/small/small_class.h
> new file mode 100644
> index 0000000..5d9ad5b
> --- /dev/null
> +++ b/small/small_class.h
> @@ -0,0 +1,220 @@
> +#ifndef TARANTOOL_SMALL_SMALL_CLASS_H_INCLUDED
> +#define TARANTOOL_SMALL_SMALL_CLASS_H_INCLUDED

2. Please, use `pragma once` in new files.

> +/*
> + * Copyright 2010-2020, Tarantool AUTHORS, please see AUTHORS file.
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above
> + *    copyright notice, this list of conditions and the
> + *    following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above
> + *    copyright notice, this list of conditions and the following
> + *    disclaimer in the documentation and/or other materials
> + *    provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
> + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
> + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
> + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + */
> +
> +#include <limits.h> /* CHAR_BIT */
> +
> +/* {{{ Explanation */
> +/*

3. Out of function comments should use /** opening.

> + * small_alloc uses a collection of mempools of different sizes.
> + * If small_alloc stores all mempools in an array then it have to determine
> + * an offset in that array where the most suitable mempool is.
> + * Let's name the offset as 'size class' and the size that the corresponding
> + * mempool allocates as 'class size'.
> + * Historically the class sizes grow incrementally up to some point and then
> + * (at some size class) grow exponentially with user-provided factor.
> + * Apart from incremental part the exponential part is not very obvious.
> + * Binary search and floating-point logarithm could be used for size class
> + * determination but both approaches seem to be too slow.
> + *
> + * This module is designed for faster size class determination.
> + * The idea is to use integral binary logarithm (bit search) and improve it
> + * in some way in order to increase precision - allow other logarithm bases
> + * along with 2.
> + * Binary integral logarithm is just a position of the most significant bit of
> + * a value. Let's look closer to binary representation of an allocation size
> + * and size class that is calculated as binary logarithm:
> + *       size      |  size class
> + *    00001????..  |      x
> + *    0001?????..  |    x + 1
> + *    001??????..  |    x + 2
> + * Let's take into account n lower bits just after the most significant
> + * in the value and divide size class into 2^n subclasses. For example if n = 2:
> + *       size      |  size class
> + *    0000100??..  |      x
> + *    0000101??..  |    x + 1
> + *    0000110??..  |    x + 2
> + *    0000111??..  |    x + 3
> + *    000100???..  |    x + 4  <- here the size doubles, in 4 = 2^n steps.
> + *    000101???..  |    x + 5
> + * That gives us some kind of approximation of a logarithm with a base equal
> + * to pow(2, 1 / pow(2, n)). That means that for given factor 'f' of exponent
> + * we can choose such a number of bits 'n' that gives us an approximation of
> + * an exponent that is close to 'f'.
> + *
> + * Of course if the most significant bit of a value is less than 'n' we can't
> + * use the formula above. But it's not a problem since we can (and even would
> + * like to!) use an incremental size class evaluation of those sizes.
> + *     size      |  size class
> + *    0000001    |      1  <- incremental growth.
> + *    0000010    |      2
> + *    0000011    |      3
> + *    0000100    |      4  <- here the exponential approximation starts.
> + *    0000101    |      5
> + *    0000110    |      6
> + *    0000111    |      7
> + *    000100?    |      8
> + *    000101?    |      9
> + *
> + * There's some implementation details. Size class is zero based, and the size
> + * must be rounded up to the closest class size. Even more, we want to round
> + * up size to some granularity, we doesn't want to have incremental pools of
> + * sizes 1, 2, 3.., we want them to be 8, 16, 24.... All that is achieved by
> + * subtracting size by one and omitting several lower bits of the size.
> + */
> +/* }}} */
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif /* defined(__cplusplus) */
> +
> +struct small_class {
> +	/** Every class size must be a multiple of this. */
> +	unsigned granularity;

4. But what about incremental classes? You said their size
growth is not exponential until certain point is reached.

> +	/** log2(granularity), ignore those number of the lowest bit of size. */
> +	unsigned rnd_bits;

5. What is 'rnd'?

> +	/**
> +	 * A number of bits (after the most significant bit) that are used in
> +	 * size class evaluation ('n' in the Explanation above).
> +	 */
> +	unsigned eff_bits;

6. What is 'eff'?

> +	/** 1u << eff_bits. */
> +	unsigned eff_size;
> +	/** eff_size - 1u. */
> +	unsigned eff_mask;
> +	/**
> +	 * By default the lowest possible allocation size (aka class size of
> +	 * class 0) is granularity. If a user wants different min_alloc, we
> +	 * simply shift sizes; min_alloc = granularity + size_shift.
> +	 */
> +	unsigned size_shift;
> +	/** Actually we need 'size_shift + 1', so store it. */
> +	unsigned size_shift_plus_1;

7. Why do you need +1? And why do you need to store both in this structure?

> +	/** Exponential factor that was requested upon small_class creation. */
> +	float requested_factor;

8. It is never used except for the tests, where you know the
requested factor anyway. Therefore it can be dropped.

> +	/**
> +	 * Exponential factor, approximation of which we managed to provide.
> +	 * It is calculated from requested_factor, it's guaranteed that
> +	 * it must be in range [requested_factor/k, requested_factor*k],
> +	 * where k = pow(requested_factor, 0.5).

9. I don't understand. So you can make the factor smaller than it was
requested? What is the point in configuring the factor then?

> +	 */
> +	float actual_factor;
> +};
> +
> +/**
> + * Create an instance of small_class evaluator. All args must meet the
> + * requirements, undefined behaviour otherwise (at least assert).
> + * @param sc - instance to create.
> + * @param granularity - any class size will be a multiple ot this value.

10. ot -> of.

> + *  Must be a power of 2 (and thus greater than zero).
> + * @param desired_factor - desired factor of growth of class size.
> + *  Must be in (1, 2] range. Actual factor can be different.
> + * @param min_alloc - the lowest class size, must be greater than zero.
> + *  The good choice is the same value as granularity.
> + */
> +void
> +small_class_create(struct small_class *sc, unsigned granularity,
> +		   float desired_factor, unsigned min_alloc);
> +
> +/**
> + * Find position of the most significant bit in a value.
> + * If the value is zero the behaviour is undefined.
> + */
> +static inline unsigned
> +small_class_fls(unsigned value)

11. What is 'fls'?

> +{
> +	/*
> +	 * Usually clz is implemented as bsr XOR 0x1f.
> +	 * If we add another XOR the compiler will squash 'em and leave just bsr.
> +	 */
> +	unsigned clz = __builtin_clz(value);
> +	return (sizeof(value) * CHAR_BIT - 1) ^ clz;
> +}
> +
> +static inline unsigned
> +class_by_size(struct small_class *sc, unsigned size)

12. Here I am confused. You have the class right here - sc.
Then what does this function return, if the class is already
known?

Also a general comment - all methods of a struct should have
struct name as a prefix. Here this function must be named
small_class_by_size(). Which still does not explain the
question above.

I assume it may be because you use 'class' both as the struct
name and as the numbers meaning offset in the array of pools.
In that case it makes sense to think how to resolve this
collision, because looks confusing. However, I don't have any
strong proposals here. I thought about naming if 'pool id', but
that would break encapsulation, because small_class has nothing
to do with pools.

I tried to review the math below, and can't spot anything
suspicious. Maybe because it is too complicated. I hope
Alexander L. looked at it carefully enough.

> +
> +#if defined(__cplusplus)
> +} /* extern "C" { */
> +#endif /* defined(__cplusplus) */
> +
> +#endif /* #ifndef TARANTOOL_SMALL_SMALL_CLASS_H_INCLUDED */
> \ No newline at end of file

13. Please, add the empty line like git requests. This is already
third time or so when you missed this warning. I ask you to do
something with your editor.

> diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
> index 8317c65..5ee0821 100644
> --- a/test/CMakeLists.txt
> +++ b/test/CMakeLists.txt
> @@ -31,6 +31,13 @@ set_source_files_properties(rb_rand.cc PROPERTIES
>  add_executable(mempool.test mempool.c)
>  target_link_libraries(mempool.test small)
>  
> +add_executable(small_class.test small_class.c unit.c)
> +target_link_libraries(small_class.test small)
> +
> +add_executable(small_class_branchless.test small_class.c unit.c)
> +target_link_libraries(small_class_branchless.test small)
> +target_compile_definitions(small_class_branchless.test PUBLIC SMALL_CLASS_BRANCHLESS)
> +
>  add_executable(small_alloc.test small_alloc.c)
>  target_link_libraries(small_alloc.test small)
>  
> @@ -64,6 +71,8 @@ add_test(region ${CMAKE_CURRENT_BUILD_DIR}/region.test)
>  add_test(ibuf ${CMAKE_CURRENT_BUILD_DIR}/ibuf.test)
>  add_test(obuf ${CMAKE_CURRENT_BUILD_DIR}/obuf.test)
>  add_test(mempool ${CMAKE_CURRENT_BUILD_DIR}/mempool.test)
> +add_test(small_class ${CMAKE_CURRENT_BUILD_DIR}/small_class.test)
> +add_test(small_class_branchless ${CMAKE_CURRENT_BUILD_DIR}/small_class_branchless.test)
>  add_test(small_alloc ${CMAKE_CURRENT_BUILD_DIR}/small_alloc.test)
>  add_test(lf_lifo ${CMAKE_CURRENT_BUILD_DIR}/lf_lifo.test)
>  add_test(slab_cache ${CMAKE_CURRENT_BUILD_DIR}/slab_cache.test)
> diff --git a/test/small_class.c b/test/small_class.c
> new file mode 100644
> index 0000000..0285819
> --- /dev/null
> +++ b/test/small_class.c
> @@ -0,0 +1,168 @@
> +#include "small/small_class.h"
> +#include "unit.h"
> +#include <math.h>
> +
> +static void
> +test_visual(void)
> +{
> +	header();
> +	plan(0);
> +
> +	struct small_class sc;
> +	small_class_create(&sc, 2, 1.2, 12);
> +	printf("desired factor %f, actual factor %f\n",
> +	       sc.requested_factor, sc.actual_factor);
> +
> +	printf("  sz   cls cls_sz real_factor\n");
> +	for (unsigned i = 0; i <= 100; i++) {
> +		unsigned cls = class_by_size(&sc, i);
> +		unsigned cls_sz = size_by_class(&sc, cls);
> +		unsigned cls_sz_next = size_by_class(&sc, cls + 1);
> +		double real_factor = 1. * cls_sz_next / cls_sz;
> +		printf("%3u   %3u   %3u    %lf\n", i, cls, cls_sz, real_factor);
> +	}
> +
> +	check_plan();
> +	footer();
> +}
> +
> +static void
> +check_expectation()
> +{
> +	header();
> +
> +	const unsigned test_sizes = 1024;
> +	const unsigned test_classes = 1024;
> +	/* we expect 4 effective bits with factor = 1.05 */
> +	float factor = 1.05;
> +	const unsigned eff_size = 16;

14. Why? How can I calculate and check this?

> +	unsigned test_class_size[test_classes + eff_size];
> +
> +	plan(4 * (1 + 2 * (test_sizes + 1)));

15. Please, try not to make these tests print thousands of lines.
Firstly, such result file can't be reviewed, because of its insane
size. Secondly, it will lead to extremely huge diff when anything
will be changed somewhere in the beginning.

> +
> +	for (int variant = 0; variant < 4; variant++) {
> +		unsigned granularity = (variant & 1) != 0 ? 1 : 4;
> +		unsigned min_alloc = granularity + ((variant & 2) != 0 ? 0 : 10);

16. Why & 1, & 2, 1, 4, 0, 10? I don't understand these magic
numbers.

> +
> +		{
> +			unsigned class_size = min_alloc - granularity;
> +			/* incremental growth */
> +			for (unsigned i = 0; i < eff_size; i++) {
> +				class_size += granularity;
> +				test_class_size[i] = class_size;
> +			}
> +			/* exponential growth */
> +			unsigned growth = granularity;
> +			for (unsigned i = eff_size; i < test_classes; i += eff_size) {
> +				for (unsigned j = 0; j < eff_size; j++) {
> +					class_size += growth;
> +					test_class_size[i + j] = class_size;
> +				}
> +				growth *= 2;
> +			}
> +		}