Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Cc: tarantool-patches@freelists.org
Subject: Re: [PATCH v4 08/14] lib: implement JSON tree class for json library
Date: Tue, 16 Oct 2018 11:26:43 +0300	[thread overview]
Message-ID: <20181016082643.nfhwyoi44n3voco4@esperanza> (raw)
In-Reply-To: <b71ad408b343e04f7def0bc9bc29ee03eecf21f4.1539244271.git.kshcherbatov@tarantool.org>

On Thu, Oct 11, 2018 at 10:58:53AM +0300, Kirill Shcherbatov wrote:
> +/**
> + * JSON tree is a hierarchical data structure that organize JSON
> + * nodes produced by parser.
> + * 1. Numeric records are sorted in ascending order in sibling rlist.
> + * 2. Structure key records point to source strings memory.
> + * 3. Lookup by path complexity is O(path_len):
> + * 	path = path_token1 + .. + path_tokenN
> + * 	- Complexity of hash(path_tokenI) - O(strlen(path_tokenN))
> + * 	- Complexity of hash lookup - O(1) + strcmp on collision

I doubt that an unaware reader would figure out how the tree organizes
json path tokens after reading this comment...

> + */
> +struct json_tree_node {
> +	/** Path component represented by this node. */
> +	struct json_path_node key;
> +	/** Hash calculated for key field. */
> +	uint32_t key_hash;
> +	/**
> +	 * Children hashtable:
> +	 * json_path_node -> json_tree_node.
> +	 */
> +	struct mh_json_tree_node_t *child_by_path_node;
> +	/**
> +	 * Link in json_tree_node::children of the parent node.
> +	 */
> +	struct rlist sibling;
> +	/**
> +	 * List of child nodes, linked by
> +	 * json_tree_node::sibling.
> +	 */
> +	struct rlist children;
> +	/** Pointer to parent json_tree_node record. */
> +	struct json_tree_node *parent;
> +};

For the record. After discussion with Kostja we agreed that the json
tree class needs to be reworked so that there's the only hash map per a
tree that contains all descendants. To make it possible to look up a
descendant of a non-root node, rolling hashes should be used:

  hash(child node) = hash(parent node) xor hash(path token)

Also, child nodes should be stored in an array rather than in a list,
because that would allow fast access of json array elements.

  reply	other threads:[~2018-10-16  8:26 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-11  7:58 [PATCH v4 00/14] box: indexes by JSON path Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 01/14] box: refactor key_def_find routine Kirill Shcherbatov
2018-10-15 17:27   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 10/14] box: introduce JSON indexes Kirill Shcherbatov
2018-10-16  9:33   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 11/14] box: introduce has_json_paths flag in templates Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 12/14] box: tune tuple_field_raw_by_path for indexed data Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 13/14] box: introduce offset slot cache in key_part Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 14/14] box: specify indexes in user-friendly form Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 02/14] box: introduce key_def_parts_are_sequential Kirill Shcherbatov
2018-10-15 17:29   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 03/14] box: introduce tuple_field_by_relative_path Kirill Shcherbatov
2018-10-15 17:46   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 04/14] box: introduce tuple_format_add_key_part Kirill Shcherbatov
2018-10-15 19:39   ` Vladimir Davydov
2018-10-11  7:58 ` [tarantool-patches] [PATCH v4 05/14] box: introduce tuple_format_sizeof routine Kirill Shcherbatov
2018-10-15 17:52   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 06/14] box: move tuple_field_go_to_{index,key} definition Kirill Shcherbatov
2018-10-16  8:15   ` Vladimir Davydov
2018-10-11  7:58 ` [PATCH v4 07/14] box: drop format const qualifier in *init_field_map Kirill Shcherbatov
2018-10-11  7:58 ` [PATCH v4 08/14] lib: implement JSON tree class for json library Kirill Shcherbatov
2018-10-16  8:26   ` Vladimir Davydov [this message]
2018-10-11  7:58 ` [PATCH v4 09/14] lib: introduce json_path_normalize routine Kirill Shcherbatov
2018-10-16  8:39   ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181016082643.nfhwyoi44n3voco4@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=kshcherbatov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [PATCH v4 08/14] lib: implement JSON tree class for json library' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox