From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Tue, 16 Oct 2018 11:26:43 +0300 From: Vladimir Davydov Subject: Re: [PATCH v4 08/14] lib: implement JSON tree class for json library Message-ID: <20181016082643.nfhwyoi44n3voco4@esperanza> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: To: Kirill Shcherbatov Cc: tarantool-patches@freelists.org List-ID: On Thu, Oct 11, 2018 at 10:58:53AM +0300, Kirill Shcherbatov wrote: > +/** > + * JSON tree is a hierarchical data structure that organize JSON > + * nodes produced by parser. > + * 1. Numeric records are sorted in ascending order in sibling rlist. > + * 2. Structure key records point to source strings memory. > + * 3. Lookup by path complexity is O(path_len): > + * path = path_token1 + .. + path_tokenN > + * - Complexity of hash(path_tokenI) - O(strlen(path_tokenN)) > + * - Complexity of hash lookup - O(1) + strcmp on collision I doubt that an unaware reader would figure out how the tree organizes json path tokens after reading this comment... > + */ > +struct json_tree_node { > + /** Path component represented by this node. */ > + struct json_path_node key; > + /** Hash calculated for key field. */ > + uint32_t key_hash; > + /** > + * Children hashtable: > + * json_path_node -> json_tree_node. > + */ > + struct mh_json_tree_node_t *child_by_path_node; > + /** > + * Link in json_tree_node::children of the parent node. > + */ > + struct rlist sibling; > + /** > + * List of child nodes, linked by > + * json_tree_node::sibling. > + */ > + struct rlist children; > + /** Pointer to parent json_tree_node record. */ > + struct json_tree_node *parent; > +}; For the record. After discussion with Kostja we agreed that the json tree class needs to be reworked so that there's the only hash map per a tree that contains all descendants. To make it possible to look up a descendant of a non-root node, rolling hashes should be used: hash(child node) = hash(parent node) xor hash(path token) Also, child nodes should be stored in an array rather than in a list, because that would allow fast access of json array elements.