[tarantool-patches] [PATCH v5 2/9] lib: implement JSON tree class for json library
Vladimir Davydov
vdavydov.dev at gmail.com
Tue Dec 4 18:22:18 MSK 2018
On Thu, Nov 29, 2018 at 08:38:16PM +0300, Vladimir Davydov wrote:
> > +/** JSON tree object to manage tokens relations. */
> > +struct json_tree {
> > + /** JSON tree root node. */
> > + struct json_token root;
> > + /** Hashtable af all tree nodes. */
> > + struct mh_json_t *hash;
>
> How does it work? In a year nobody will remember, and to understand how
> this thing operates we will have to dive deep into the code...
/**
* Element of a JSON path. It can be either string or number.
* String idenfiers are in ["..."] and between dots. Numbers are
* indexes in [...].
*
* May be organized in a tree-like structure reflecting a JSON
* document structure, for more details see the comment to struct
* json_tree.
*/
struct json_token {
...
};
/**
* This structure is used for organizing JSON tokens produced
* by a lexer in a tree-like structure reflecting a JSON document
* structure.
*
* Each intermediate node of the tree corresponds to either
* a JSON map or an array, depending on the key type used by
* its children (JSON_TOKEN_STR or JSON_TOKEN_NUM, respectively).
* Leaf nodes may represent both complex JSON structures and
* final values - it is not mandated by the JSON tree design.
* The root of the tree doesn't have a key and is preallocated
* when the tree is created.
*
* The json_token structure is intrusive by design, i.e. to store
* arbitrary information in a JSON tree, one has to incorporate it
* into a user defined structure.
*
* Example:
*
* struct data {
* ...
* struct json_token token;
* };
*
* struct json_tree tree;
* json_tree_create(&tree);
* struct json_token *parent = &tree->root;
*
* // Add a path to the tree.
* struct data *data = data_new();
* struct json_lexer lexer;
* json_lexer_create(&lexer, path, path_len);
* json_lexer_next_token(&lexer, &data->token);
* while (data->token.type != JSON_TOKEN_END) {
* json_tree_add(&tree, parent, &data->token);
* parent = &data->token;
* data = data_new();
* json_lexer_next_token(&lexer, &data->token);
* }
* data_delete(data);
*
* // Look up a path in the tree.
* data = json_tree_lookup_path(&tree, &tree.root, path, path_len);
*/
struct json_tree {
/**
* Preallocated token corresponding to the JSON tree root.
* It doesn't have a key (set to JSON_TOKEN_END).
*/
struct json_token root;
/**
* Hash table that is used to quickly look up a token
* corresponding to a JSON map item given a key and
* a parent token. We store all tokens that have type
* JSON_TOKEN_STR in this hash table. Apparently, we
* don't need to store JSON_TOKEN_NUM tokens as we can
* quickly look them up in the children array anyway.
*
* The hash table uses pair <parent, key> as key, so
* even tokens that happen to have the same key will
* have different keys in the hash. To look up a tree
* node corresponding to a particular path, we split
* the path into tokens and look up the first token
* in the root node and each following token in the
* node returned at the previous step.
*
* We compute a hash value for a token by hashing its
* key using the hash value of its parent as seed. This
* is equivalent to computing hash for the path leading
* to the token. However, we don't need to recompute
* hash starting from the root at each step as we
* descend the tree looking for a specific path, and we
* can start descent from any node, not only from the root.
*
* As a consequence of this hashing technique, even
* though we don't need to store JSON_TOKEN_NUM tokens
* in the hash table, we still have to compute hash
* values for them.
*/
struct mh_json_t *hash;
};
More information about the Tarantool-patches
mailing list