From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Sat, 9 Jun 2018 15:36:02 +0300 From: Vladimir Davydov Subject: Re: [tarantool-patches] [PATCH v2 1/2] box: create bigrefs for tuples Message-ID: <20180609123602.3wfqvq5abodoeyqg@esperanza> References: <6601bc3bb229282c7489ad6391c93319c5ff0366.1528544705.git.imeevma@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6601bc3bb229282c7489ad6391c93319c5ff0366.1528544705.git.imeevma@gmail.com> To: imeevma@tarantool.org Cc: tarantool-patches@freelists.org List-ID: On Sat, Jun 09, 2018 at 02:49:06PM +0300, imeevma@tarantool.org wrote: > Due to limitation of reference counters for tuple being only > 65535 it was possible to reach this limitation. This patch > increases capacity of reference counters to 4 billions. > > Closes #3224 > --- > Branch: https://github.com/tarantool/tarantool/tree/imeevma/gh-3224-tuple-bigrefs > Issue: https://github.com/tarantool/tarantool/issues/3224 If there's a cover letter, links should be added only there, not to individual patches. > @@ -174,20 +174,22 @@ process_rw(struct request *request, struct space *space, struct tuple **result) > txn_rollback_stmt(); > return -1; > } > + if (result == NULL) > + return txn_commit_stmt(txn, request); > + *result = tuple; > + if (tuple == NULL) > + return txn_commit_stmt(txn, request); > /* > * Pin the tuple locally before the commit, > * otherwise it may go away during yield in > * when WAL is written in autocommit mode. > */ > - TupleRefNil ref(tuple); > - if (txn_commit_stmt(txn, request) != 0) > - return -1; > - if (result != NULL) { > - if (tuple != NULL && tuple_bless(tuple) == NULL) > - return -1; > - *result = tuple; > - } > - return 0; > + tuple_ref(tuple); > + int rc = txn_commit_stmt(txn, request); > + if (rc == 0) > + tuple_bless(tuple); > + tuple_unref(tuple); > + return rc; I don't like they you rework process_rw. IMO calling txn_commit_stmt conditionally obfuscates the function work flow, which should be as simple as begin_stmt execute_request commit_stmt > diff --git a/src/box/tuple.c b/src/box/tuple.c > index 014f374..68540f4 100644 > --- a/src/box/tuple.c > +++ b/src/box/tuple.c > @@ -48,6 +48,26 @@ enum { > OBJSIZE_MIN = 16, > }; > > +/** > + * Container for big reference counters. Contains array of big > + * reference counters, size of this array and number of non-zero > + * big reference counters. When reference counter of tuple becomes > + * more than 32767, field refs of this tuple becomes index of big > + * reference counter in big reference counter array and field > + * is_bigref is set true. The moment big reference becomes equal > + * 32767 it is set to 0, refs of the tuple becomes 32767 and > + * is_bigref becomes false. Big reference counter can be equal to > + * 0 or be more than 32767. > + */ > +static struct bigref_list { > + /** Array of big reference counters. */ > + uint32_t *refs; > + /** Number of non-zero elements in the array. */ > + uint16_t size; > + /** Capacity of the array. */ > + uint16_t capacity; > +} bigref_list; Technically, it isn't a list, but rather an array. Anyway, I think I'd call this structure simply 'bigref', and functions for allocating and freeing big references - 'bigref_alloc' and 'bigref_free'. > +/** > + * Return index for new big reference counter and allocate memory > + * if needed. > + * @retval index for new big reference counter. > + */ > +static inline uint16_t > +bigref_list_new_index() > +{ > + if (bigref_list.size < bigref_list.capacity) { > + uint16_t vacant_index = 0; > + while (bigref_list.refs[vacant_index] != 0) > + ++vacant_index; > + ++bigref_list.size; > + return vacant_index; > + } Why don't you use a free list for this? This would improve the worst case complexity of a bigref allocation from O(N) to O(1). > + /* Extend the array. */ > + uint16_t capacity = bigref_list.capacity; > + if (capacity == 0) > + capacity = BIGREF_MIN_CAPACITY; > + else if (capacity < BIGREF_MAX_CAPACITY) > + capacity = MIN(capacity * BIGREF_FACTOR, BIGREF_MAX_CAPACITY); > + else > + panic("Too many big references"); > + uint32_t *refs = (uint32_t *) realloc(bigref_list.refs, capacity * > + sizeof(*bigref_list.refs)); > + if (refs == NULL) { > + panic("failed to reallocate %zu bytes: Cannot allocate "\ > + "memory.", capacity * sizeof(*bigref_list.refs)); > + } The max size of a bigref array is 32767 ints, i.e. just 128 KB, and there's the only object of this type. Is there any point in growing it dynamically rather than allocating the max size statically on startup? > + bigref_list.refs = refs; > + memset(bigref_list.refs + bigref_list.capacity, 0, (capacity - > + bigref_list.capacity) * sizeof(*bigref_list.refs)); > + bigref_list.capacity = capacity; > + return bigref_list.size++; > +} > + > +void > +tuple_ref_slow(struct tuple *tuple) > +{ > + assert(tuple->is_bigref || tuple->refs == TUPLE_REF_MAX); > + if (! tuple->is_bigref) { > + tuple->ref_index = bigref_list_new_index(); > + tuple->is_bigref = true; > + bigref_list.refs[tuple->ref_index] = TUPLE_REF_MAX; > + } else if (bigref_list.refs[tuple->ref_index] == BIGREF_MAX) { > + panic("Tuple big reference counter overflow"); > + } > + bigref_list.refs[tuple->ref_index]++; > +} > + > +/** > + * Try to decrease allocated memory if it is possible. Free memory > + * when size == 0. > + */ > +static inline void > +bigref_list_delete_index(uint16_t index) > +{ > + bigref_list.refs[index] = 0; > + if (--bigref_list.size == 0) { > + bigref_list_reset(); > + return; > + } > + if (bigref_list.capacity == BIGREF_MIN_CAPACITY || > + bigref_list.size > bigref_list.capacity / BIGREF_FACTOR) > + return; > + > + uint16_t top_index = bigref_list.capacity - 1; > + while (bigref_list.refs[top_index] == 0) > + top_index--; > + > + uint16_t needed_capacity = top_index + 1; > + if (needed_capacity < BIGREF_MIN_CAPACITY) > + needed_capacity = BIGREF_MIN_CAPACITY; > + if (needed_capacity > bigref_list.capacity / BIGREF_FACTOR) > + return; > + /* Round up capacity to the next highest power of 2. */ > + assert(sizeof(needed_capacity) == sizeof(uint16_t)); > + needed_capacity--; > + needed_capacity |= needed_capacity >> 1; > + needed_capacity |= needed_capacity >> 2; > + needed_capacity |= needed_capacity >> 4; > + needed_capacity |= needed_capacity >> 8; > + assert(needed_capacity < UINT16_MAX); > + needed_capacity++; > + uint32_t *refs = > + (uint32_t *) realloc(bigref_list.refs, needed_capacity * > + sizeof(*bigref_list.refs)); > + if (refs == NULL) { > + panic("failed to reallocate %zu bytes: Cannot allocate "\ > + "memory.", needed_capacity * sizeof(*bigref_list.refs)); > + } Even if we agree to grow the bigref array dynamically, I see absolutely no point in shrinking it: if we used N big refs once, we are likely to use N big refs again in the future. > + bigref_list.refs = refs; > + bigref_list.capacity = needed_capacity; > +}