Tarantool development patches archive
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Cc: tarantool-patches@freelists.org, Kostya Osipov <kostja@tarantool.org>
Subject: Re: [tarantool-patches] Re: [PATCH v5 2/9] lib: implement JSON tree class for json library
Date: Tue, 4 Dec 2018 20:54:12 +0300
Message-ID: <20181204175412.dayx2wplbxi5rrfz@esperanza>
In-Reply-To: <3c7bb503-561c-19b0-1197-f714b6f384d4@tarantool.org>

On Tue, Dec 04, 2018 at 06:47:27PM +0300, Kirill Shcherbatov wrote:
> >> +	uint32_t rolling_hash;
> > 
> > Let's call it simply 'hash', short and clear. The rolling nature of the
> > hash should be explained in the comment.
> Ok, done
> 
> > typo: indexe -> index
> > 
> > BTW, json array start indexing from 0, not 1 AFAIK. Starting indexing
> > from 1 looks weird to me.

You left this comment from my previous review unattended.

> > 
> >> +	 * and are allocated sequently for JSON_TOKEN_NUM child
> > 
> > typo: sequently -> sequentially
> Ok, done.

See below for my comments on the new version of the patch.

> From c4e0001ecfd0987fffa2ef5f747ef6f3c016dae7 Mon Sep 17 00:00:00 2001
> From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
> Date: Mon, 1 Oct 2018 15:10:19 +0300
> Subject: [PATCH] lib: implement JSON tree class for json library
> 
> New JSON tree class would store JSON paths for tuple fields
> for registered non-plain indexes. It is a hierarchical data
> structure that organize JSON nodes produced by parser.
> Class provides API to lookup node by path and iterate over the
> tree.
> JSON Indexes patch require such functionality to make lookup
> for tuple_fields by path, make initialization of field map and
> build vynyl_stmt msgpack for secondary index via JSON tree
> iteration.
> 
> Need for #1012

As I've already told you, this should be

Needed for #1012

> diff --git a/src/lib/json/json.c b/src/lib/json/json.c
> index eb80e4bb..58a842ef 100644
> --- a/src/lib/json/json.c
> +++ b/src/lib/json/json.c
> +static void
> +json_token_destroy(struct json_token *token)
> +{
> +	/* Token mustn't have JSON subtree. */
> +	#ifndef NDEBUG

#ifndef/endif shouldn't be indented.

> +	struct json_token *iter;
> +	uint32_t nodes = 0;
> +	json_tree_foreach_preorder(token, iter)
> +		nodes++;
> +	assert(nodes == 0);
> +	#endif /* NDEBUG */

I'd prefer to change this to something simpler, like

	assert(token->child_count == 0);

but now I realize that child_count isn't actually the number of
children, as I thought, but the max id of any child that ever existed.
This is confusing. We need to do something about it.

What about this?

	/**
	 * Allocation size of the children array.
	 */
	int children_capacity;
	/**
	 * Max occupied index in the children array.
	 */
	int max_child_idx;

and update max_child_idx in json_tree_del() as well.
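
A minimal sketch of that bookkeeping, to make the proposal concrete.
Everything here is hypothetical: struct node and the node_*() helpers
are simplified stand-ins, not the real struct json_token or the
json_tree_*() API, and using -1 for "no children" is my assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified node: not the real struct json_token. */
struct node {
	/** Sparse array of children, NULL marks a free slot. */
	struct node **children;
	/** Allocation size of the children array. */
	int children_capacity;
	/** Max occupied index in the children array, -1 if none. */
	int max_child_idx;
};

static void
node_create(struct node *n)
{
	n->children = NULL;
	n->children_capacity = 0;
	n->max_child_idx = -1;
}

/** Put child into slot idx, growing the array if needed. */
static int
node_add(struct node *parent, int idx, struct node *child)
{
	assert(idx >= 0);
	if (idx >= parent->children_capacity) {
		int new_capacity = parent->children_capacity == 0 ?
				   1 : parent->children_capacity;
		while (new_capacity <= idx)
			new_capacity *= 2;
		struct node **children = realloc(parent->children,
					new_capacity * sizeof(*children));
		if (children == NULL)
			return -1;
		/* Mark the newly allocated slots as free. */
		for (int i = parent->children_capacity; i < new_capacity; i++)
			children[i] = NULL;
		parent->children = children;
		parent->children_capacity = new_capacity;
	}
	assert(parent->children[idx] == NULL);
	parent->children[idx] = child;
	if (idx > parent->max_child_idx)
		parent->max_child_idx = idx;
	return 0;
}

/** Free slot idx and move max_child_idx back past trailing holes. */
static void
node_del(struct node *parent, int idx)
{
	assert(idx >= 0 && idx <= parent->max_child_idx);
	assert(parent->children[idx] != NULL);
	parent->children[idx] = NULL;
	while (parent->max_child_idx >= 0 &&
	       parent->children[parent->max_child_idx] == NULL)
		parent->max_child_idx--;
}

static void
node_destroy(struct node *n)
{
	/* A node must not have children when it is destroyed. */
	assert(n->max_child_idx == -1);
	free(n->children);
}
```

With max_child_idx kept up to date on deletion, the "no subtree" check
in the destructor becomes a single comparison instead of a traversal.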

> +
> +	free(token->children);
> +}
> +
> +void
> +json_tree_destroy(struct json_tree *tree)
> +{
> +	/* Tree must be empty. */
> +	#ifndef NDEBUG
> +	struct json_token *iter;
> +	uint32_t nodes = 0;
> +	json_tree_foreach_preorder(&tree->root, iter)
> +		nodes++;
> +	assert(nodes == 0);
> +	#endif /* NDEBUG */

This check is pointless as the same check is done by json_token_destroy
called right below.

> +
> +	json_token_destroy(&tree->root);
> +	mh_json_delete(tree->hash);
> +}
> +
> +struct json_token *
> +json_tree_lookup_slowpath(struct json_tree *tree, struct json_token *parent,
> +			  const struct json_token *token)
> +{
> +	assert(parent != NULL);

This particular assertion is pointless. You could as well add

	assert(tree != NULL);
	assert(token != NULL);

but why? Such assertions wouldn't enlighten the reader, and the program
would crash anyway when it tries to dereference NULL. An assertion should
either check some non-trivial condition, to stop the program from
running any further and making a bigger mess, or give the reader a hint
about what's going on here.
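
To illustrate the difference, here is a small sketch. The struct item
type and child_at() are made up for this example and are not part of
the patch. The assertion checks an invariant that a plain NULL
dereference would never catch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, used only for this illustration. */
struct item {
	struct item *parent;
	struct item **children;
	int child_count;
	/** Position of this item in parent->children. */
	int idx;
};

static struct item *
child_at(struct item *parent, int idx)
{
	/*
	 * No assert(parent != NULL) here: the dereference below
	 * would crash on NULL anyway.
	 */
	if (idx < 0 || idx >= parent->child_count)
		return NULL;
	struct item *child = parent->children[idx];
	/*
	 * Non-trivial invariant: a stored child must point back at
	 * the slot it occupies. A violation would not crash here,
	 * it would silently corrupt the tree later.
	 */
	assert(child == NULL ||
	       (child->parent == parent && child->idx == idx));
	return child;
}
```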

> +	if (likely(token->type == JSON_TOKEN_STR)) {
> +		struct json_token key, *key_ptr;
> +		key.type = token->type;
> +		key.str = token->str;
> +		key.len = token->len;
> +		key.parent = parent;
> +		key.hash = json_token_hash(&key);
> +		key_ptr = &key;
> +		mh_int_t id = mh_json_find(tree->hash, &key_ptr, NULL);

You pass token** to mh_json_find instead of token*. I hadn't noticed
that before, but it turns out that

> +#define mh_key_t struct json_token **

This looks weird. Why not

 #define mh_key_t struct json_token *

?

> +		if (id == mh_end(tree->hash))
> +			return NULL;
> +		struct json_token **entry = mh_json_node(tree->hash, id);
> +		assert(entry == NULL || (*entry)->parent == parent);
> +		return entry != NULL ? *entry : NULL;

AFAIU entry can't be NULL here.

> +	} else if (token->type == JSON_TOKEN_NUM) {
> +		uint32_t idx =  token->num - 1;
> +		return likely(idx < parent->child_count) ?
> +		       parent->children[idx] : NULL;
> +	}

What's the point of handling JSON_TOKEN_NUM here? Nobody is supposed to
call json_tree_lookup_slowpath() directly. Everyone should use
json_tree_lookup() instead.

Please change this to an assertion ensuring that token->type is STR and
add a comment to json_tree_lookup_slowpath() saying that it's an internal
function that shouldn't be used directly.
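
Roughly the split I have in mind, as a sketch under stated assumptions:
struct tok and the tok_lookup*() names are hypothetical, a linear scan
stands in for the mh_json hash table, and array indexes start from 0.
Numeric tokens are resolved inline, so the slow path only ever sees
string tokens and can assert that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified types: not the real json library structs. */
enum tok_type { TOK_NUM, TOK_STR };

struct tok {
	enum tok_type type;
	/** Array index, valid for TOK_NUM. */
	unsigned num;
	/** Key name, valid for TOK_STR. */
	const char *str;
	/** Children addressed by index, NULL marks a hole. */
	struct tok **children;
	unsigned child_count;
	/** Children addressed by name (stand-in for the hash table). */
	struct tok **named;
	unsigned named_count;
};

/**
 * Internal function, use tok_lookup() instead.
 */
static struct tok *
tok_lookup_slowpath(struct tok *parent, const struct tok *key)
{
	/* Numeric keys never get here, see tok_lookup(). */
	assert(key->type == TOK_STR);
	for (unsigned i = 0; i < parent->named_count; i++) {
		if (strcmp(parent->named[i]->str, key->str) == 0)
			return parent->named[i];
	}
	return NULL;
}

/**
 * Look up a child of parent by key. Numeric keys are resolved
 * inline, without a function call.
 */
static inline struct tok *
tok_lookup(struct tok *parent, const struct tok *key)
{
	if (key->type == TOK_NUM) {
		return key->num < parent->child_count ?
		       parent->children[key->num] : NULL;
	}
	return tok_lookup_slowpath(parent, key);
}
```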

> diff --git a/src/lib/json/json.h b/src/lib/json/json.h
> index ead44687..948fcdb7 100644
> --- a/src/lib/json/json.h
> +++ b/src/lib/json/json.h

> +/**
> + * Make child lookup in JSON tree by token at position specified
> + * with parent.
> + */
> +struct json_token *
> +json_tree_lookup_slowpath(struct json_tree *tree, struct json_token *parent,
> +			  const struct json_token *token);

The comment to this function could be as short as:

/**
 * Internal function, use json_tree_lookup instead.
 */

> +
> +/**
> + * Make child lookup in JSON tree by token at position specified

They don't usually say "make lookup". It's "do lookup" or, even better,
simply "look up a token in a tree". "Make" is more like "build" or
"construct".

> + * with parent without function call in the best-case. */

Comment style.

> +static inline struct json_token *
> +json_tree_lookup(struct json_tree *tree, struct json_token *parent,
> +		 const struct json_token *token)
> +{
> +	struct json_token *ret = NULL;
> +	if (token->type == JSON_TOKEN_NUM) {
> +		uint32_t idx =  token->num - 1;
> +		ret = likely(idx < parent->child_count_max) ?
> +		      parent->children[idx] : NULL;
> +	} else {
> +		ret = json_tree_lookup_slowpath(tree, parent, token);
> +	}
> +	return ret;
> +}

> +/**
> + * Make secure post-order traversal in JSON tree and return entry.
> + * This cycle doesn't visit root node.
> + */
> +#define json_tree_foreach_entry_safe(root, node, type, member, tmp)	     \
> +	for ((node) = json_tree_postorder_next_entry((root), NULL,	     \
> +						     type, member);	     \
> +	     &(node)->member != (root) &&				     \
> +	     ((tmp) =  json_tree_postorder_next_entry((root),		     \

Extra space.

> +	     					      &(node)->member,	     \
> +	     					      type, member));	     \

Mixed tabs and spaces. There are more things like that in this patch.
Please carefully self-review your patch next time to make sure it's
neatly formatted.

> +	     (node) = (tmp))
> +
>  #ifdef __cplusplus
>  }
>  #endif
