* Re: [commits] [tarantool] 01/04: bloom: use malloc for bitmap allocations
       [not found]   ` <20180327203916.GA11829@atlas>
@ 2018-03-28 10:44     ` Vladimir Davydov
  2018-03-28 17:01       ` Konstantin Osipov
From: Vladimir Davydov @ 2018-03-28 10:44 UTC
To: Konstantin Osipov, commits, tarantool-patches

On Tue, Mar 27, 2018 at 11:39:16PM +0300, Konstantin Osipov wrote:
> * Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/21 16:36]:
> > bloom: use malloc for bitmap allocations
>
> It's not only pointless, it's harmful, since mmap() is ~100 times
> slower than malloc().
>
> The patch is OK to push, after explaining to me why had to add an
> extra memset().
>
> In a follow up patch you could use slab_arena though.

I'm afraid not, because AFAICS slab_arena is single-threaded while bloom
filters are allocated in a worker thread and freed in tx.

> > @@ -42,41 +44,33 @@ bloom_create(struct bloom *bloom, uint32_t number_of_values,
> >               double false_positive_rate, struct quota *quota)
> >  {
> >      /* Optimal hash_count and bit count calculation */
> > -    bloom->hash_count = (uint32_t)
> > -        (log(false_positive_rate) / log(0.5) + 0.99);
> > -    /* Number of bits */
> > -    uint64_t m = (uint64_t)
> > -        (number_of_values * bloom->hash_count / log(2) + 0.5);
> > -    /* mmap page size */
> > -    uint64_t page_size = sysconf(_SC_PAGE_SIZE);
> > -    /* Number of bits in one page */
> > -    uint64_t b = page_size * CHAR_BIT;
> > -    /* number of pages, round up */
> > -    uint64_t p = (uint32_t)((m + b - 1) / b);
> > -    /* bit array size in bytes */
> > -    size_t mmap_size = p * page_size;
> > -    bloom->table_size = p * page_size / sizeof(struct bloom_block);
> > -    if (quota_use(quota, mmap_size) < 0) {
> > -        bloom->table = NULL;
> > +    uint16_t hash_count = ceil(log(false_positive_rate) / log(0.5));
> > +    uint64_t bit_count = ceil(number_of_values * hash_count / log(2));
> > +    uint32_t block_bits = CHAR_BIT * sizeof(struct bloom_block);
> > +    uint32_t block_count = (bit_count + block_bits - 1) / block_bits;
> > +
> > +    size_t size = block_count * sizeof(struct bloom_block);
> > +    if (quota_use(quota, size) < 0)
> >          return -1;
> > -    }
> > -    bloom->table = (struct bloom_block *)
> > -        mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> > -             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > -    if (bloom->table == MAP_FAILED) {
> > -        bloom->table = NULL;
> > -        quota_release(quota, mmap_size);
> > +
> > +    bloom->table = malloc(size);
> > +    if (bloom->table == NULL) {
> > +        quota_release(quota, size);
> >          return -1;
> >      }
> > +
> > +    bloom->table_size = block_count;
> > +    bloom->hash_count = hash_count;
> > +    memset(bloom->table, 0, size);
>
> Why do you need this memset()?

Because mmap() returns a zeroed-out region of memory. With malloc()
I have to zero it out manually to conform to the old behavior.

OTOH I could use calloc() to avoid memset() in case malloc() actually
falls back on mmap(). I think I'll replace malloc() with calloc() here.
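
The memset() versus calloc() point in the message above can be illustrated
with a minimal sketch. It is not the tarantool code: struct demo_block and
the three helper functions are hypothetical names invented for this example,
and the real layout of struct bloom_block is not shown in the thread. The
sketch only shows that both allocation strategies hand back a cleared table,
which is the behavior the old anonymous mmap() provided.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct bloom_block; the real definition
 * is not part of this thread.
 */
struct demo_block {
	uint64_t bits[8];
};

/* Variant from the patch: malloc() followed by an explicit memset(). */
static struct demo_block *
table_alloc_memset(uint32_t block_count)
{
	size_t size = (size_t)block_count * sizeof(struct demo_block);
	struct demo_block *table = malloc(size);
	if (table == NULL)
		return NULL;
	memset(table, 0, size);
	return table;
}

/*
 * Proposed variant: calloc() is specified to return zero-initialized
 * memory, so no separate memset() is needed. It also checks the
 * count * size multiplication for overflow.
 */
static struct demo_block *
table_alloc_calloc(uint32_t block_count)
{
	return calloc(block_count, sizeof(struct demo_block));
}

/* Returns 1 if every word of every block is zero, 0 otherwise. */
static int
table_is_zeroed(const struct demo_block *table, uint32_t block_count)
{
	assert(table != NULL);
	size_t words = sizeof(table[0].bits) / sizeof(table[0].bits[0]);
	for (uint32_t i = 0; i < block_count; i++) {
		for (size_t j = 0; j < words; j++) {
			if (table[i].bits[j] != 0)
				return 0;
		}
	}
	return 1;
}
```

Both helpers return memory that must be released with free(). Whether
calloc() actually skips the clearing pass for freshly mapped pages depends
on the allocator, so the saving mentioned in the message should be treated
as a possible optimization rather than a guarantee.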

* Re: [commits] [tarantool] 01/04: bloom: use malloc for bitmap allocations
  2018-03-28 10:44     ` [commits] [tarantool] 01/04: bloom: use malloc for bitmap allocations Vladimir Davydov
@ 2018-03-28 17:01       ` Konstantin Osipov
From: Konstantin Osipov @ 2018-03-28 17:01 UTC
To: Vladimir Davydov; +Cc: commits, tarantool-patches

* Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/28 14:49]:
> On Tue, Mar 27, 2018 at 11:39:16PM +0300, Konstantin Osipov wrote:
> > * Vladimir Davydov <vdavydov.dev@gmail.com> [18/03/21 16:36]:
> > > bloom: use malloc for bitmap allocations
> >
> > It's not only pointless, it's harmful, since mmap() is ~100 times
> > slower than malloc().
> >
> > The patch is OK to push, after explaining to me why had to add an
> > extra memset().
> >
> > In a follow up patch you could use slab_arena though.
> I'm afraid not, because AFAICS slab_arena is single-threaded while bloom
> filters are allocated in a worker thread and freed in tx.

Slab arena is multi-threaded, slab cache is single threaded. But
if producer and consumer are different, then OK, it's not worth
it.

> > > @@ -42,41 +44,33 @@ bloom_create(struct bloom *bloom, uint32_t number_of_values,
> > >               double false_positive_rate, struct quota *quota)
> > >  {
> > >      /* Optimal hash_count and bit count calculation */
> > > -    bloom->hash_count = (uint32_t)
> > > -        (log(false_positive_rate) / log(0.5) + 0.99);
> > > -    /* Number of bits */
> > > -    uint64_t m = (uint64_t)
> > > -        (number_of_values * bloom->hash_count / log(2) + 0.5);
> > > -    /* mmap page size */
> > > -    uint64_t page_size = sysconf(_SC_PAGE_SIZE);
> > > -    /* Number of bits in one page */
> > > -    uint64_t b = page_size * CHAR_BIT;
> > > -    /* number of pages, round up */
> > > -    uint64_t p = (uint32_t)((m + b - 1) / b);
> > > -    /* bit array size in bytes */
> > > -    size_t mmap_size = p * page_size;
> > > -    bloom->table_size = p * page_size / sizeof(struct bloom_block);
> > > -    if (quota_use(quota, mmap_size) < 0) {
> > > -        bloom->table = NULL;
> > > +    uint16_t hash_count = ceil(log(false_positive_rate) / log(0.5));
> > > +    uint64_t bit_count = ceil(number_of_values * hash_count / log(2));
> > > +    uint32_t block_bits = CHAR_BIT * sizeof(struct bloom_block);
> > > +    uint32_t block_count = (bit_count + block_bits - 1) / block_bits;
> > > +
> > > +    size_t size = block_count * sizeof(struct bloom_block);
> > > +    if (quota_use(quota, size) < 0)
> > >          return -1;
> > > -    }
> > > -    bloom->table = (struct bloom_block *)
> > > -        mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
> > > -             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> > > -    if (bloom->table == MAP_FAILED) {
> > > -        bloom->table = NULL;
> > > -        quota_release(quota, mmap_size);
> > > +
> > > +    bloom->table = malloc(size);
> > > +    if (bloom->table == NULL) {
> > > +        quota_release(quota, size);
> > >          return -1;
> > >      }
> > > +
> > > +    bloom->table_size = block_count;
> > > +    bloom->hash_count = hash_count;
> > > +    memset(bloom->table, 0, size);
> >
> > Why do you need this memset()?
>
> Because mmap() returns a zeroed-out region of memory. With malloc()
> I have to zero it out manually to conform to the old behavior.
>
> OTOH I could use calloc() to avoid memset() in case malloc() actually
> falls back on mmap(). I think I'll replace malloc() with calloc() here.

So I assume it's really needed then.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov