From: Konstantin Osipov <kostja@tarantool.org>
To: tarantool-patches@freelists.org
Subject: [tarantool-patches] Re: [PATCH v1 1/1] lib: implement JSON tree class for json library
Date: Sat, 6 Oct 2018 15:37:14 +0300 [thread overview]
Message-ID: <20181006123714.GC1380@chai> (raw)
In-Reply-To: <20181005095012.jirfl46btgrphyct@esperanza>
* Vladimir Davydov <vdavydov.dev@gmail.com> [18/10/05 12:52]:
> There will be such a hash table in each root tuple_field object (it's
> not a part of the generic json tree infrastructure). It will be used to
> speed up data indexing. Why would we need to bother implementing lookup
> in a tree then? Because I assume we want to speed up lookups over all
> sub-paths, e.g. suppose we have a tuple
Why do we maintain a separate hash for tuple field names and json
paths within a tuple?
json paths are just a continuation of existing ways to access a
tuple field.
Looks like we need to rebuild existing implementation of field
names support, not add something on a side.
>
> t = {
> {
> customer = {
> name = {
> first = ...,
> last = ...,
> },
> ...
> }
> },
> ...
> }
>
> Then we can lookup 'customer.name.first' in Lua using any of the
> following ways:
>
> t['[1].customer.name.first']
> t[1]['customer.name.first']
> t[1].customer['name.first']
> t[1].customer.name.first
>
> etc
>
> I assume we want to use tuple field map so as to avoid msgpack decode
> whenever possible, not just for cases 1 or 2. Moreover, I guess
> t[1].customer.name.first will be more commonly used.
>
> We could of course store a hash table over all full paths in each
> intermediate node, but that would result in maintaining way too many
> hash entries, e.g. for a tuple field corresponding to 'customer', we
> would have to store all the following strings in the hash: 'name',
> 'name.first' and so on for other descendants. So I think that we only
> want to use the hash table for data indexing, i.e. when a field is
> looked up by full path like t[1]['customer.name.first']. For all other
> references we will split the path into nodes and descend down the tree
> to find the appropriate tuple field map slot.
I would try to create a single hash table for all possible paths,
including intermediate paths, and store it in the tuple format.
The hash table would store pointers to json tree nodes.
And I also think we need json tree nodes for top level fields -
they are just the same as intermediate fields.
I would also keep in mind that in the future we will automatically
flatten tuples according to a schema before storing in a space.
Flattening, from path access perspective, is simply a replacement
of one path with another (foo.bar.baz -> [1][1][3]).
> Note, we need the tree in any case, even if we decide not to use it for
> lookups at all - it's necessary for building a tuple field map without
> unpacking the same prefix multiple times, e.g. if we have two indexed
> paths
>
> a.b.c.d
> a.b.c.e
>
> Then we want to decode 'a.b.c' only once, not twice. The tree helps
> achieve that.
OK, this is clear.
--
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
next prev parent reply other threads:[~2018-10-06 12:37 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-01 12:27 Kirill Shcherbatov
2018-10-01 21:34 ` Vladimir Davydov
2018-10-03 7:15 ` [tarantool-patches] " Kirill Shcherbatov
2018-10-04 22:59 ` Konstantin Osipov
2018-10-05 9:50 ` Vladimir Davydov
2018-10-06 12:37 ` Konstantin Osipov [this message]
2018-10-08 10:07 ` Vladimir Davydov
2018-10-04 22:54 ` [tarantool-patches] " Konstantin Osipov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181006123714.GC1380@chai \
--to=kostja@tarantool.org \
--cc=tarantool-patches@freelists.org \
--subject='[tarantool-patches] Re: [PATCH v1 1/1] lib: implement JSON tree class for json library' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox