From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 5 Oct 2018 01:54:29 +0300 From: Konstantin Osipov Subject: Re: [tarantool-patches] [PATCH v1 1/1] lib: implement JSON tree class for json library Message-ID: <20181004225429.GQ22855@chai> References: <51204844504cf430967af4414fb98b552c8f07f7.1538396782.git.kshcherbatov@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51204844504cf430967af4414fb98b552c8f07f7.1538396782.git.kshcherbatov@tarantool.org> To: tarantool-patches@freelists.org Cc: vdavydov.dev@gmail.com, Kirill Shcherbatov List-ID: * Kirill Shcherbatov [18/10/01 18:05]: > +/** Hash record representing json_tree_node. */ > +struct mh_json_tree_node { > + struct json_path_node key; A trivial question: does this data structure represent an entire json path or a fragment of a json path? > + uint32_t key_hash; > + struct json_tree_node *node; > +}; Same here. Are we hashing an entire json path or a fragment of the path? I of course could look up in the implementation. My point is that the comment could be much more informative with just a few words related to substance, and not formal repetition of what is already evident. This review remark is relevant about the rest of the comments. You did not think about the reader, all the comments are purely formal. > + > +/** > + * Compare json_tree_node hash bindings @a and @b. > + * @param a Hash record to compare. > + * @param b Hash record to compare. > + * @retval 0 On equal, not 0 otherwise. > + */ > +bool > +mh_json_tree_node_cmp(const struct mh_json_tree_node *a, > + const struct mh_json_tree_node *b) > +{ > + return (a->key.type != b->key.type) || > + ((a->key.type != JSON_PATH_STR || > + a->key.len != b->key.len || > + memcmp(a->key.str, b->key.str, > + a->key.len) != 0) && > + (a->key.type != JSON_PATH_NUM || > + a->key_hash != b->key_hash)); > +} > + > +/** Define tree record hashtable. */ > +#define MH_SOURCE 1 > +#define mh_name _json_tree_node > +#define mh_key_t struct mh_json_tree_node * > +#define mh_node_t struct mh_json_tree_node > +#define mh_arg_t void * > +#define mh_hash(a, arg) ((a)->key_hash) > +#define mh_hash_key(a, arg) ((a)->key_hash) > +#define mh_cmp(a, b, arg) (mh_json_tree_node_cmp((a), (b))) > +#define mh_cmp_key(a, b, arg) mh_cmp(a, b, arg) > +#include "salad/mhash.h" > + > +static const uint32_t json_path_node_hash_seed = 13U; > + > +uint32_t > +json_path_node_hash(struct json_path_node *key) > +{ > + uint32_t h = json_path_node_hash_seed; > + uint32_t carry = 0; > + uint8_t *data; > + uint32_t data_size; > + if (key->type == JSON_PATH_STR) { > + data = (uint8_t *)key->str; > + data_size = key->len; > + } else if (key->type == JSON_PATH_NUM) { > + data = (uint8_t *)&key->num; > + data_size = sizeof(key->num); > + } else { > + assert(0); > + } > + PMurHash32_Process(&h, &carry, data, data_size); > + return PMurHash32_Result(h, carry, data_size); Why did you choose murmur and not luajit hash? > + memset(node, 0, sizeof(struct json_tree_node)); > + node->children_ht = mh_json_tree_node_new(); > + if (node->children_ht == NULL) > + return -1; > + if (path_node != NULL) { > + memcpy(&node->key, path_node, sizeof(struct json_path_node)); > + node->key_hash = path_node_hash; > + } else { > + node->key.type = JSON_PATH_END; > + } > + rlist_create(&node->siblings); > + rlist_create(&node->children); > + return 0; There should be a comment describing what API this data structure provides and how it works. Why did you choose to create a nested hash in every node? What is the complexity of the json path lookup? There should be a fastpath for exact match which only evaluates a path hash once. > +struct json_tree_node * > +json_tree_lookup_path(struct json_tree_node *root, const char *path, > + uint32_t path_len) > +{ > + int rc; > + struct json_path_parser parser; > + struct json_path_node path_node; > + json_path_parser_create(&parser, path, path_len); > + while ((rc = json_path_next(&parser, &path_node)) == 0 && > + path_node.type != JSON_PATH_END && root != NULL) { > + uint32_t path_node_hash = json_path_node_hash(&path_node); > + struct json_tree_node *node = > + json_tree_lookup(root, &path_node, path_node_hash); > + assert(node == NULL || node->parent == root); > + root = node; > + } > + if (rc != 0 || path_node.type != JSON_PATH_END) > + return NULL; > + return root; > +} > +/** > + * Destroy JSON tree node. > + * @param node JSON tree node to destruct. > + * @retval 0 On success, -1 on error. > + */ > +/** > + * Lookup record with data @path_node having hash @path_node_hash > + * in JSON tree @node childs. > + * @param node JSON tree root node. > + * @param path_node JSON parser node with data. > + * @param path_node_hash Hash of @path_node. > + * @retval not NULL pointer to JSON tree node if any. > + */ I literally hate your comments. -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov