[PATCH v2 07/14] vinyl: add helpers to add/check statement with bloom

Konstantin Osipov kostja at tarantool.org
Wed Mar 13 14:59:34 MSK 2019


* Vladimir Davydov <vdavydov.dev at gmail.com> [19/03/13 11:58]:

> A Vinyl statement may be either a key or a tuple. We must use different
> functions for the two kinds when working with a bloom filter. Let's
> introduce helpers incorporating that logic.

While we are at it, tuple_bloom_builder is a cumbersome name. Why
not simply bloom_builder or vy_bloom_builder?

The name is especially confusing now that we have
tuple_bloom_builder_add(_tuple) and tuple_bloom_builder_add_key.

Besides, tuple_bloom_builder pretends to be generic, not specific
to the vinyl engine. I don't think anyone cares about that
genericity (I don't).

> +int
> +tuple_bloom_builder_add_key(struct tuple_bloom_builder *builder,
> +			    const char *key, uint32_t part_count,
> +			    struct key_def *key_def)
> +{
> +	(void)part_count;
> +	assert(part_count >= key_def->part_count);
> +	assert(builder->part_count == key_def->part_count);
> +
> +	uint32_t h = HASH_SEED;
> +	uint32_t carry = 0;
> +	uint32_t total_size = 0;
> +

Once again, while we are at it, I would appreciate an explanation
of our strategy for building bloom filters for partial keys.
No other LSM I'm aware of does it this way, so it would be
nice to see a write-up of how it works.


> +	for (uint32_t i = 0; i < key_def->part_count; i++) {
> +		total_size += tuple_hash_field(&h, &carry, &key,
> +					       key_def->parts[i].coll);
> +		uint32_t hash = PMurHash32_Result(h, carry, total_size);
> +		if (tuple_hash_array_add(&builder->parts[i], hash) != 0)
> +			return -1;
> +	}
> +	return 0;
> +}
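
For reference, here is how I read the loop above, as a minimal
standalone sketch rather than the actual implementation: the hash is
computed incrementally over the key parts, and the intermediate value
after part i is stored in a separate filter for prefixes of length
i + 1, so a partial key of n parts can later be checked against
filter n - 1. Everything named below (struct prefix_bloom, hash_part,
the FNV-1a hash, the fixed sizes, one hash bit per filter) is
hypothetical and only stands in for tuple_bloom_builder, PMurHash32
and tuple_hash_array_add from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
enum { MAX_PARTS = 4, FILTER_BITS = 1024 };

/* One tiny single-hash filter per key prefix length. */
struct prefix_bloom {
	uint32_t part_count;
	uint8_t bits[MAX_PARTS][FILTER_BITS / 8];
};

/* FNV-1a step, a stand-in for the incremental PMurHash32 in the patch. */
uint32_t
hash_part(uint32_t h, const char *part)
{
	/* Hash the terminating NUL too, so ("ab","c") differs from ("a","bc"). */
	size_t len = strlen(part) + 1;
	for (size_t i = 0; i < len; i++) {
		h ^= (uint8_t)part[i];
		h *= 16777619u;
	}
	return h;
}

void
prefix_bloom_create(struct prefix_bloom *b, uint32_t part_count)
{
	assert(part_count > 0 && part_count <= MAX_PARTS);
	memset(b, 0, sizeof(*b));
	b->part_count = part_count;
}

/* Add a full key: the running hash after part i goes to filter i. */
void
prefix_bloom_add(struct prefix_bloom *b, const char **key)
{
	uint32_t h = 2166136261u;
	for (uint32_t i = 0; i < b->part_count; i++) {
		h = hash_part(h, key[i]);
		uint32_t bit = h % FILTER_BITS;
		b->bits[i][bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* Check a key of n <= part_count parts against filter n - 1. */
bool
prefix_bloom_maybe_has(const struct prefix_bloom *b,
		       const char **key, uint32_t n)
{
	assert(n > 0 && n <= b->part_count);
	uint32_t h = 2166136261u;
	for (uint32_t i = 0; i < n; i++)
		h = hash_part(h, key[i]);
	uint32_t bit = h % FILTER_BITS;
	return (b->bits[n - 1][bit / 8] >> (bit % 8)) & 1;
}
```

With this layout, adding the key ("moscow", "42") makes both the
full key and the one-part prefix ("moscow") report "maybe present",
at the cost of one filter per key part. If that matches the intent
of the patch, a comment along these lines would be enough for me.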

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov


