[tarantool-patches] Re: [commits] [tarantool] 02/05: collation: introduce collation fingerprint

Konstantin Osipov kostja at tarantool.org
Fri May 11 17:39:54 MSK 2018


* Vladislav Shpilevoy <v.shpilevoy at tarantool.org> [18/05/11 02:00]:
> This is an automated email from the git hooks/post-receive script.
> 
> Gerold103 pushed a commit to branch gh-3290-lua-icu
> in repository tarantool.
> 
> commit 2fb226b2b5c01a42565294abc59a9171d5f695fe
> Author: Vladislav Shpilevoy <v.shpilevoy at tarantool.org>
> AuthorDate: Tue May 8 21:48:18 2018 +0300
> 
>     collation: introduce collation fingerprint
>     
>     Collation fingerprint is a formatted string unique for a set
>     of collation properties. Equal collations with different names
>     have the same fingerprint.
>     
>     This new property is used to build collation fingerprint cache
>     to use in Tarantool internals, where collation name does not
>     matter.
>     
>     Insertion into the fingerprint cache can never conflict with or
>     replace an existing entry. This means that, for example, the utf8
>     module being created in this patchset can fill the collation cache
>     with its own collations without affecting users or other modules.
> ---
>  src/box/alter.cc     |   8 ++--
>  src/box/coll.c       |  21 ++++++++-
>  src/box/coll.h       |  19 +++++++++
>  src/box/coll_cache.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++----
>  src/box/coll_cache.h |   4 +-
>  src/box/coll_def.c   |  36 ++++++++++++++++
>  src/box/coll_def.h   |  19 +++++++++
>  7 files changed, 209 insertions(+), 16 deletions(-)
> 
> diff --git a/src/box/alter.cc b/src/box/alter.cc
> index de8ccd3..9ca759c 100644
> --- a/src/box/alter.cc
> +++ b/src/box/alter.cc
> @@ -2386,7 +2386,7 @@ coll_cache_rollback(struct trigger *trigger, void *event)
>  			return;
>  		}
>  		struct coll *replaced;
> -		if (coll_cache_replace(old_coll, &replaced) != 0) {
> +		if (coll_cache_id_replace(old_coll, &replaced) != 0) {
>  			panic("Out of memory on insertion into collation "\

The function name has become unclear now.

> -	size_t total_len = sizeof(struct coll) + def->name_len + 1;
> -	struct coll *coll = (struct coll *)calloc(1, total_len);
> +	int fingerprint_offset = sizeof(struct coll) + def->name_len + 1;
> +	int fingerprint_len = coll_def_fingerprint_len(def);

You don't need a separate member for the fingerprint length; a
NUL-terminated (ASCIIZ) string is fine.
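A minimal sketch of what this could look like, assuming the name and
the fingerprint are both kept in one trailing buffer as NUL-terminated
strings. The struct layout, the names coll_sketch and coll_sketch_new,
and the fingerprint format in the usage note are hypothetical and are
not taken from the patch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct coll. */
struct coll_sketch {
	/* Points into the trailing buffer; NUL-terminated, no length member. */
	const char *fingerprint;
	/* Collation name, NUL-terminated; the fingerprint is stored after it. */
	char name[];
};

/* Allocate the object, the name and the fingerprint in a single block. */
static struct coll_sketch *
coll_sketch_new(const char *name, const char *fingerprint)
{
	assert(name != NULL && fingerprint != NULL);
	size_t name_size = strlen(name) + 1;
	size_t fp_size = strlen(fingerprint) + 1;
	struct coll_sketch *coll = (struct coll_sketch *)
		calloc(1, sizeof(*coll) + name_size + fp_size);
	if (coll == NULL)
		return NULL;
	memcpy(coll->name, name, name_size);
	/* The fingerprint starts right after the name's terminating NUL. */
	char *fp = coll->name + name_size;
	memcpy(fp, fingerprint, fp_size);
	coll->fingerprint = fp;
	return coll;
}
```

With this layout a caller that needs the length, for example to hash
the fingerprint, can call strlen(coll->fingerprint) at that point, and
one free(coll) releases everything.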

> +uint32_t
> +coll_fingerprint_hash(const char *fingerprint, int len)
> +{
> +	uint32_t h = 13;
> +	uint32_t carry = 0;
> +	PMurHash32_Process(&h, &carry, fingerprint, len);
> +	return PMurHash32_Result(h, carry, len);
> +}

> +struct mh_coll_node_t {
> +	/**
> +	 * Collation with unique fingerprint in the collation
> +	 * cache.
> +	 */
> +	struct coll *coll;
> +	/**
> +	 * Reference counter. How many collations have the same
> +	 * fingerprint. This node is deleted from the cache only
> +	 * when there are no more collations with the same
> +	 * fingerprint.
> +	 */
> +	int refs;
> +};

Please rewrite the code without the double level of reference
counting.
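One possible reading of this request, as a sketch only: keep a single
counter on the collation object itself, let the fingerprint cache hand
out the existing object on a hit, and drop the cache entry when that
counter reaches zero. All names here (fp_coll, fp_coll_get,
fp_coll_put) are hypothetical, and a linked list stands in for the
hash table used by the patch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical collation object: the only reference counter lives here. */
struct fp_coll {
	int refs;
	/* Link in the cache list (stand-in for a hash table node). */
	struct fp_coll *next;
	/* NUL-terminated fingerprint. */
	char fingerprint[];
};

static struct fp_coll *fp_cache = NULL;

static struct fp_coll *
fp_cache_find(const char *fingerprint)
{
	for (struct fp_coll *c = fp_cache; c != NULL; c = c->next) {
		if (strcmp(c->fingerprint, fingerprint) == 0)
			return c;
	}
	return NULL;
}

/* Return the cached object or create it; the caller owns one reference. */
static struct fp_coll *
fp_coll_get(const char *fingerprint)
{
	assert(fingerprint != NULL);
	struct fp_coll *coll = fp_cache_find(fingerprint);
	if (coll != NULL) {
		coll->refs++;
		return coll;
	}
	size_t size = strlen(fingerprint) + 1;
	coll = (struct fp_coll *)calloc(1, sizeof(*coll) + size);
	if (coll == NULL)
		return NULL;
	memcpy(coll->fingerprint, fingerprint, size);
	coll->refs = 1;
	coll->next = fp_cache;
	fp_cache = coll;
	return coll;
}

/* Drop one reference; the last one unlinks the object and frees it. */
static void
fp_coll_put(struct fp_coll *coll)
{
	assert(coll != NULL && coll->refs > 0);
	if (--coll->refs > 0)
		return;
	struct fp_coll **link = &fp_cache;
	while (*link != coll)
		link = &(*link)->next;
	*link = coll->next;
	free(coll);
}
```

In this sketch the cache node carries no counter of its own, so there
is only one place where the lifetime of a collation is decided.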

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov



