[PATCH v2 07/14] vinyl: add helpers to add/check statement with bloom

Vladimir Davydov vdavydov.dev at gmail.com
Wed Mar 13 15:25:05 MSK 2019


On Wed, Mar 13, 2019 at 02:59:34PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev at gmail.com> [19/03/13 11:58]:
> 
> > A Vinyl statement may be either a key or a tuple. We must use different
> > functions for the two kinds when working with a bloom filter. Let's
> > introduce helpers incorporating that logic.
> 
> While we are at it, tuple_bloom_builder is a cumbersome name.
> Why not simply bloom_builder

So as not to consue with generic bloom defined in salad.

> or vy_bloom_builder?

Just in case we want to reuse it. After all, tuple_bloom isn't really
specific to vinyl.

> 
> Especially the name is confusing since now we have
> tuple_bloom_builder_add(_tuple) and tuple_bloom_builder_add_key.
> 
> Besides, tuple_bloom_builder pretends to be generic, not specific
> to vinyl engine. I don't think anyone cares (I don't).

Well, we can move it to vy_bloom, I guess, but I don't think it's really
necessary now.

> 
> > +int
> > +tuple_bloom_builder_add_key(struct tuple_bloom_builder *builder,
> > +			    const char *key, uint32_t part_count,
> > +			    struct key_def *key_def)
> > +{
> > +	(void)part_count;
> > +	assert(part_count >= key_def->part_count);
> > +	assert(builder->part_count == key_def->part_count);
> > +
> > +	uint32_t h = HASH_SEED;
> > +	uint32_t carry = 0;
> > +	uint32_t total_size = 0;
> > +
> 
> Once again, since we are at it I would appreciate an explanation
> about our strategy for building bloom filters for partial keys.
> No other LSM I'm aware of is doing it this way, so it would be
> nice to see a write down of how it works.

Okay, I'll add a comment to tuple_bloom explaining what's going on in
the builder.

> 
> 
> > +	for (uint32_t i = 0; i < key_def->part_count; i++) {
> > +		total_size += tuple_hash_field(&h, &carry, &key,
> > +					       key_def->parts[i].coll);
> > +		uint32_t hash = PMurHash32_Result(h, carry, total_size);
> > +		if (tuple_hash_array_add(&builder->parts[i], hash) != 0)
> > +			return -1;
> > +	}
> > +	return 0;
> > +}



More information about the Tarantool-patches mailing list