Tarantool development patches archive
* Re: [commits] [tarantool] 01/04: bloom: use malloc for bitmap allocations
From: Vladimir Davydov @ 2018-03-28 10:44 UTC
  To: Konstantin Osipov, commits, tarantool-patches

On Tue, Mar 27, 2018 at 11:39:16PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/21 16:36]:
> >     bloom: use malloc for bitmap allocations
> 
> It's not only pointless, it's harmful, since mmap() is ~100 times
> slower than malloc().
> 
> The patch is OK to push, after explaining to me why you had to add
> an extra memset().
> 
> In a follow up patch you could use slab_arena though.

I'm afraid not, because AFAICS slab_arena is single-threaded while bloom
filters are allocated in a worker thread and freed in tx.

> > @@ -42,41 +44,33 @@ bloom_create(struct bloom *bloom, uint32_t number_of_values,
> >  	     double false_positive_rate, struct quota *quota)
> >  {
> >  	/* Optimal hash_count and bit count calculation */
> > -	bloom->hash_count = (uint32_t)
> > -		(log(false_positive_rate) / log(0.5) + 0.99);
> > -	/* Number of bits */
> > -	uint64_t m = (uint64_t)
> > -		(number_of_values * bloom->hash_count / log(2) + 0.5);
> > -	/* mmap page size */
> > -	uint64_t page_size = sysconf(_SC_PAGE_SIZE);
> > -	/* Number of bits in one page */
> > -	uint64_t b = page_size * CHAR_BIT;
> > -	/* number of pages, round up */
> > -	uint64_t p = (uint32_t)((m + b - 1) / b);
> > -	/* bit array size in bytes */
> > -	size_t mmap_size = p * page_size;
> > -	bloom->table_size = p * page_size / sizeof(struct bloom_block);
> > -	if (quota_use(quota, mmap_size) < 0) {
> > -		bloom->table = NULL;
> > +	uint16_t hash_count = ceil(log(false_positive_rate) / log(0.5));
> > +	uint64_t bit_count = ceil(number_of_values * hash_count / log(2));
> > +	uint32_t block_bits = CHAR_BIT * sizeof(struct bloom_block);
> > +	uint32_t block_count = (bit_count + block_bits - 1) / block_bits;
> > +
> > +	size_t size = block_count * sizeof(struct bloom_block);
> > +	if (quota_use(quota, size) < 0)
> >  		return -1;
> > -	}
> > -	bloom->table = (struct bloom_block *)
> > -		mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> > -		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > -	if (bloom->table == MAP_FAILED) {
> > -		bloom->table = NULL;
> > -		quota_release(quota, mmap_size);
> > +
> > +	bloom->table = malloc(size);
> > +	if (bloom->table == NULL) {
> > +		quota_release(quota, size);
> >  		return -1;
> >  	}
> > +
> > +	bloom->table_size = block_count;
> > +	bloom->hash_count = hash_count;
> > +	memset(bloom->table, 0, size);
> 
> Why do you need this memset()?

Because mmap() returns a zeroed-out region of memory. With malloc()
I have to zero it out manually to conform to the old behavior.

OTOH I could use calloc() to avoid memset() in case malloc() actually
falls back on mmap(). I think I'll replace malloc() with calloc() here.
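
For illustration, the two zeroing variants discussed above can be put side by side. This is only a sketch: struct demo_block is a hypothetical stand-in for struct bloom_block (whose layout is not shown in this thread), the 64-byte size is an assumption, and the quota accounting from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bloom_block; the size is an assumption. */
struct demo_block {
	unsigned char bits[64];
};

/* Variant from the quoted hunk: malloc() plus an explicit memset(). */
struct demo_block *
demo_table_alloc_memset(uint32_t block_count)
{
	size_t size = (size_t)block_count * sizeof(struct demo_block);
	struct demo_block *table = malloc(size);
	if (table == NULL)
		return NULL;
	/* malloc() gives no guarantee about the contents, so clear them. */
	memset(table, 0, size);
	return table;
}

/* Variant proposed in the reply: calloc() returns zeroed memory. */
struct demo_block *
demo_table_alloc_calloc(uint32_t block_count)
{
	return calloc(block_count, sizeof(struct demo_block));
}

/* Return 1 if every byte of the table is zero, 0 otherwise. */
int
demo_table_is_zero(const struct demo_block *table, uint32_t block_count)
{
	assert(table != NULL);
	const unsigned char *p = (const unsigned char *)table;
	size_t size = (size_t)block_count * sizeof(struct demo_block);
	for (size_t i = 0; i < size; i++) {
		if (p[i] != 0)
			return 0;
	}
	return 1;
}
```

Both functions hand back a table for which demo_table_is_zero() returns 1. The difference is only in who does the zeroing: with calloc() the allocator is free to skip it when it knows the memory is already zero-filled.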

* Re: [commits] [tarantool] 01/04: bloom: use malloc for bitmap allocations
From: Konstantin Osipov @ 2018-03-28 17:01 UTC
  To: Vladimir Davydov; +Cc: commits, tarantool-patches

* Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/28 14:49]:
> On Tue, Mar 27, 2018 at 11:39:16PM +0300, Konstantin Osipov wrote:
> > * Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/21 16:36]:
> > >     bloom: use malloc for bitmap allocations
> > 
> > It's not only pointless, it's harmful, since mmap() is ~100 times
> > slower than malloc().
> > 
> > The patch is OK to push, after explaining to me why you had to add
> > an extra memset().
> > 
> > In a follow up patch you could use slab_arena though.

> I'm afraid not, because AFAICS slab_arena is single-threaded while bloom
> filters are allocated in a worker thread and freed in tx.

Slab arena is multi-threaded, slab cache is single-threaded. But
if the producer and the consumer are different threads, then OK,
it's not worth it.

> > > @@ -42,41 +44,33 @@ bloom_create(struct bloom *bloom, uint32_t number_of_values,
> > >  	     double false_positive_rate, struct quota *quota)
> > >  {
> > >  	/* Optimal hash_count and bit count calculation */
> > > -	bloom->hash_count = (uint32_t)
> > > -		(log(false_positive_rate) / log(0.5) + 0.99);
> > > -	/* Number of bits */
> > > -	uint64_t m = (uint64_t)
> > > -		(number_of_values * bloom->hash_count / log(2) + 0.5);
> > > -	/* mmap page size */
> > > -	uint64_t page_size = sysconf(_SC_PAGE_SIZE);
> > > -	/* Number of bits in one page */
> > > -	uint64_t b = page_size * CHAR_BIT;
> > > -	/* number of pages, round up */
> > > -	uint64_t p = (uint32_t)((m + b - 1) / b);
> > > -	/* bit array size in bytes */
> > > -	size_t mmap_size = p * page_size;
> > > -	bloom->table_size = p * page_size / sizeof(struct bloom_block);
> > > -	if (quota_use(quota, mmap_size) < 0) {
> > > -		bloom->table = NULL;
> > > +	uint16_t hash_count = ceil(log(false_positive_rate) / log(0.5));
> > > +	uint64_t bit_count = ceil(number_of_values * hash_count / log(2));
> > > +	uint32_t block_bits = CHAR_BIT * sizeof(struct bloom_block);
> > > +	uint32_t block_count = (bit_count + block_bits - 1) / block_bits;
> > > +
> > > +	size_t size = block_count * sizeof(struct bloom_block);
> > > +	if (quota_use(quota, size) < 0)
> > >  		return -1;
> > > -	}
> > > -	bloom->table = (struct bloom_block *)
> > > -		mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> > > -		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > > -	if (bloom->table == MAP_FAILED) {
> > > -		bloom->table = NULL;
> > > -		quota_release(quota, mmap_size);
> > > +
> > > +	bloom->table = malloc(size);
> > > +	if (bloom->table == NULL) {
> > > +		quota_release(quota, size);
> > >  		return -1;
> > >  	}
> > > +
> > > +	bloom->table_size = block_count;
> > > +	bloom->hash_count = hash_count;
> > > +	memset(bloom->table, 0, size);
> > 
> > Why do you need this memset()?
> 
> Because mmap() returns a zeroed-out region of memory. With malloc()
> I have to zero it out manually to conform to the old behavior.
> 
> OTOH I could use calloc() to avoid memset() in case malloc() actually
> falls back on mmap(). I think I'll replace malloc() with calloc() here.

So I assume it's really needed then.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
