Tarantool development patches archive
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Kirill Shcherbatov <kshcherbatov@tarantool.org>
Cc: tarantool-patches@freelists.org, kostja@tarantool.org
Subject: Re: [PATCH v8 5/5] box: specify indexes in user-friendly form
Date: Wed, 23 Jan 2019 18:29:03 +0300	[thread overview]
Message-ID: <20190123152903.27pfxg4hyg4oyjex@esperanza> (raw)
In-Reply-To: <a9e76ac945ba8dde15c1fa1bd170baf61c3835f6.1547645795.git.kshcherbatov@tarantool.org>

On Wed, Jan 16, 2019 at 04:44:43PM +0300, Kirill Shcherbatov wrote:
> Implemented a more convenient interface for creating an index
> by JSON path. Instead of specifying fieldno and relative path
> it comes possible to pass full JSON path to data.

it is now possible

> 
> Closes #1012
> 
> @TarantoolBot document
> Title: Indexes by JSON path
> Sometimes field data could have complex document structure.
> When this structure is consistent across whole space,
> you are able to create an index by JSON path.
> 
> Example:
> s = box.schema.space.create('sample')
> format = {{'id', 'unsigned'}, {'data', 'map'}}
> s:format(format)
> -- explicit JSON index creation
> age_idx = s:create_index('age', {{2, 'number', path = "age"}})
> -- user-friendly syntax for JSON index creation
> parts = {{'data.FIO["fname"]', 'str'}, {'data.FIO["sname"]', 'str'},
>          {'data.age', 'number'}}
> info_idx = s:create_index('info', {parts = parts}})
> s:insert({1, {FIO={fname="James", sname="Bond"}, age=35}})
> ---
>  src/box/lua/schema.lua    | 60 ++++++++++++++++++++++++++++++++++-----
>  test/engine/json.result   | 42 +++++++++++++++++++++++++++
>  test/engine/json.test.lua | 12 ++++++++
>  3 files changed, 107 insertions(+), 7 deletions(-)
> 
> diff --git a/src/box/lua/schema.lua b/src/box/lua/schema.lua
> index 8a804f0ba..a55ef5b22 100644
> --- a/src/box/lua/schema.lua
> +++ b/src/box/lua/schema.lua
> @@ -575,6 +575,49 @@ local function update_index_parts_1_6_0(parts)
>      return result
>  end
>  

Please add comments to these functions.

> +local function format_field_index_by_name(format, name)
> +    for k,v in pairs(format) do
> +        if v.name == name then
> +            return k
> +        end
> +    end
> +    return nil
> +end
> +
> +local function format_field_resolve(format, name)
> +    local idx = nil
> +    local field_name = nil
> +    -- try resolve whole name

try to

> +    idx = format_field_index_by_name(format, name)
> +    if idx ~= nil then
> +        print("Case 1 "..idx)

What's this? Debugging prints? Looks like you didn't care to carefully
review your patches before sending them out... again.

> +        return idx, nil
> +    end
> +    -- try resolve [%d]
> +    field_name = string.match(name, "^%[(%d+)%]")
> +    idx = tonumber(field_name)
> +    if idx ~= nil then
> +        print("Case 2 "..idx..string.sub(name, string.len(field_name) + 3))
> +        return idx, string.sub(name, string.len(field_name) + 3)
> +    end
> +    -- try resolve ["%s"] or ['%s']
> +    field_name = string.match(name, "^%[\"(.*)\"%]") or
> +                 string.match(name, "^%[\'(.*)\'%]")

Here you use asterisk (*) while in [%d] case you use plus (+).
I understand there's actually no difference in this particular
case, but please be consistent.

> +    idx = format_field_index_by_name(format, field_name)
> +    if idx ~= nil then
> +        print("Case 3 "..idx..string.sub(name, string.len(field_name) + 5))
> +        return idx, string.sub(name, string.len(field_name) + 5)
> +    end
> +    -- try to resolve .*[ and .*.
> +    field_name = string.match(name, "^([^.%[]-)[.%[]")

And here you use minus (-) for some reason.
Wouldn't simple "^([^.[]+)" work?

> +    idx = format_field_index_by_name(format, field_name)
> +    if idx ~= nil then
> +        print("Case 4 "..idx..string.sub(name, string.len(field_name) + 1))
> +        return idx, string.sub(name, string.len(field_name) + 1)
> +    end
> +    return nil, nil
> +end
> +
>  local function update_index_parts(format, parts)
>      if type(parts) ~= "table" then
>          box.error(box.error.ILLEGAL_PARAMS,
> @@ -626,16 +669,19 @@ local function update_index_parts(format, parts)
>              box.error(box.error.ILLEGAL_PARAMS,
>                        "options.parts[" .. i .. "]: field (name or number) is expected")
>          elseif type(part.field) == 'string' then
> -            for k,v in pairs(format) do
> -                if v.name == part.field then
> -                    part.field = k
> -                    break
> -                end
> +            local idx, path = format_field_resolve(format, part.field)
> +            if idx == nil then
> +               box.error(box.error.ILLEGAL_PARAMS,

Bad formatting - should be 4 spaces after if, not 3.

> +                         "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")

The line is too long. Please split.

>              end
> -            if type(part.field) == 'string' then
> +            if part.path ~= nil and part.path ~= path then
>                  box.error(box.error.ILLEGAL_PARAMS,
> -                          "options.parts[" .. i .. "]: field was not found by name '" .. part.field .. "'")
> +                          "options.parts[" .. i .. "]: field path '"..
> +                          path.." doesn't math the path '" ..part.path.."'")

Sometimes you add spaces around concatenation operator (..), sometimes
you don't. I think I've already repeated it a dozen times, but let me
try again: please be consistent throughout the code.

Anyway, I don't think that there's any point to ensure that part.path
matches the path extracted from the field. The error message looks weird
too. Let's just ignore part.path in this case.

>              end
> +            parts_can_be_simplified = parts_can_be_simplified and path == nil
> +            part.field = idx
> +            part.path = path or part.path
>          elseif part.field == 0 then
>              box.error(box.error.ILLEGAL_PARAMS,
>                        "options.parts[" .. i .. "]: field (number) must be one-based")

Let's please unite all the code doing field resolution in
tuple_field_resolve, including the case when the field is
given as a number. And check that the field is 1-based in
all cases.

> diff --git a/test/engine/json.result b/test/engine/json.result
> index 9e5accb0b..32f787a3d 100644
> --- a/test/engine/json.result
> +++ b/test/engine/json.result
> @@ -70,6 +70,48 @@ s:create_index('test2', {parts = {{2, 'number'}, {3, 'number', path = 'FIO.fname
>  - error: Field [3]["FIO"]["fname"] has type 'string' in one index, but type 'number'
>      in another
>  ...
> +s:create_index('test3', {parts = {{2, 'number'}, {']sad.FIO["fname"]', 'str'}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field was not found by name '']sad.FIO["fname"]'''
> +...
> +s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str', path = "FIO[\"sname\"]"}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field path ''.FIO["fname"] doesn''t
> +    math the path ''FIO["sname"]'''
> +...
> +idx3 = s:create_index('test3', {parts = {{2, 'number'}, {'[3].FIO["fname"]', 'str'}}})
> +---
> +...
> +idx3 ~= nil
> +---
> +- true
> +...
> +idx3.parts[2].path == ".FIO[\"fname\"]"
> +---
> +- true
> +...
> +s:create_index('test4', {parts = {{2, 'number'}, {'invalid.FIO["fname"]', 'str'}}})
> +---
> +- error: 'Illegal parameters, options.parts[2]: field was not found by name ''invalid.FIO["fname"]'''
> +...
> +idx4 = s:create_index('test4', {parts = {{2, 'number'}, {'data.FIO["fname"]', 'str'}}})
> +---
> +...
> +idx4 ~= nil
> +---
> +- true
> +...
> +idx4.parts[2].path == ".FIO[\"fname\"]"
> +---
> +- true
> +...
> +-- Vinyl has optimizations that omit index checks, so errors could differ.

What's this about? I don't understand...

> +idx3:drop()
> +---
> +...
> +idx4:drop()
> +---
> +...
>  s:insert{7, 7, {town = 'London', FIO = 666}, 4, 5}
>  ---
>  - error: 'Tuple field [3]["FIO"] type does not match one required by operation: expected

I'm not planning to review the next version of this patch set for a week
or two so please take your time and perfect it finally unless you want
it to be delayed for another month.

  reply	other threads:[~2019-01-23 15:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-16 13:44 [PATCH v8 0/5] box: Indexes by JSON path Kirill Shcherbatov
2019-01-16 13:44 ` [PATCH v8 1/5] lib: updated msgpuck library version Kirill Shcherbatov
2019-01-23 12:53   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 2/5] box: refactor tuple_field_raw_by_path routine Kirill Shcherbatov
2019-01-23 13:15   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 3/5] box: introduce JSON Indexes Kirill Shcherbatov
2019-01-23 13:49   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 4/5] box: introduce offset_slot cache in key_part Kirill Shcherbatov
2019-01-23 14:23   ` Vladimir Davydov
2019-01-16 13:44 ` [PATCH v8 5/5] box: specify indexes in user-friendly form Kirill Shcherbatov
2019-01-23 15:29   ` Vladimir Davydov [this message]
2019-01-16 15:35 ` [tarantool-patches] [PATCH v8 6/6] box: introduce has_json_paths flag in templates Kirill Shcherbatov
2019-01-23 14:15   ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190123152903.27pfxg4hyg4oyjex@esperanza \
    --to=vdavydov.dev@gmail.com \
    --cc=kostja@tarantool.org \
    --cc=kshcherbatov@tarantool.org \
    --cc=tarantool-patches@freelists.org \
    --subject='Re: [PATCH v8 5/5] box: specify indexes in user-friendly form' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox