Tarantool development patches archive
From: Kirill Shcherbatov <kshcherbatov@tarantool.org>
To: tarantool-patches@freelists.org
Cc: vdavydov.dev@gmail.com, Kirill Shcherbatov <kshcherbatov@tarantool.org>
Subject: [PATCH v5 07/12] lib: introduce json_path_normalize routine
Date: Mon, 29 Oct 2018 09:56:39 +0300
Message-ID: <134d5b3da2e0d0102287cceb4d347c88ef2616b5.1540795996.git.kshcherbatov@tarantool.org>
In-Reply-To: <cover.1540795996.git.kshcherbatov@tarantool.org>

Introduce a new routine, json_path_normalize, that converts a
JSON path to its 'canonical' form:
  - all map keys are specified in the ["key"] operator form;
  - all array indexes are specified in the [i] operator form.
This notation is preferable because in the general case it can
be parsed unambiguously.
The JSON indexes patch needs this API to store all paths in
'canonical' form, so that it can perform path uniqueness checks
and optimize access through the JSON path hashtable.

Needed for #1012
---
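For readers without the tree at hand, the conversion described in the
commit message can be sketched as a standalone function. This is only
an illustration under stated assumptions, not the code from this patch:
the name sketch_path_normalize, its hand-written tokenizer and the
5 * path_len + 1 buffer bound are all hypothetical, and the accepted
grammar is a simplification (bare identifiers, [N], ["key"], ['key']).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical standalone sketch, NOT the Tarantool implementation.
 * Rewrites `path` into the ["key"] / [i] form.
 * Assumption: `out` holds at least 5 * path_len + 1 bytes, since a
 * one-character bare key grows to five characters in this sketch.
 * Returns 0 on success or a 1-based position of a syntax error.
 */
static int
sketch_path_normalize(const char *path, size_t path_len, char *out)
{
	size_t pos = 0;
	while (pos < path_len) {
		const char *tok;
		size_t tok_len;
		if (path[pos] == '[') {
			pos++;
			if (pos < path_len &&
			    isdigit((unsigned char)path[pos])) {
				/* Array index: copy the digits as-is. */
				tok = path + pos;
				while (pos < path_len &&
				       isdigit((unsigned char)path[pos]))
					pos++;
				tok_len = (size_t)(path + pos - tok);
				if (pos >= path_len || path[pos] != ']')
					return (int)pos + 1;
				pos++;
				out += sprintf(out, "[%.*s]",
					       (int)tok_len, tok);
				continue;
			}
			/* Quoted map key: ["key"] or ['key']. */
			if (pos >= path_len ||
			    (path[pos] != '"' && path[pos] != '\''))
				return (int)pos + 1;
			char quote = path[pos++];
			tok = path + pos;
			while (pos < path_len && path[pos] != quote)
				pos++;
			tok_len = (size_t)(path + pos - tok);
			if (tok_len == 0 || pos + 1 >= path_len ||
			    path[pos + 1] != ']')
				return (int)pos + 1;
			pos += 2;
		} else {
			/* Bare identifier, optionally preceded by '.'. */
			if (path[pos] == '.')
				pos++;
			tok = path + pos;
			if (pos >= path_len ||
			    (!isalpha((unsigned char)path[pos]) &&
			     path[pos] != '_'))
				return (int)pos + 1;
			while (pos < path_len &&
			       (isalnum((unsigned char)path[pos]) ||
				path[pos] == '_'))
				pos++;
			tok_len = (size_t)(path + pos - tok);
		}
		/* Every map key is emitted in the ["key"] form. */
		out += sprintf(out, "[\"%.*s\"]", (int)tok_len, tok);
	}
	*out = '\0';
	return 0;
}
```

With this sketch, FIO[3].fname, ["FIO"][3].fname and FIO[3]["fname"]
all map to the same string, so two paths can be compared for equality
with a plain strcmp() or hashed directly.
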
 src/lib/json/path.c        | 25 +++++++++++++++++++++++++
 src/lib/json/path.h        | 18 ++++++++++++++++++
 test/unit/json_path.c      | 41 ++++++++++++++++++++++++++++++++++++++++-
 test/unit/json_path.result | 14 +++++++++++++-
 4 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/src/lib/json/path.c b/src/lib/json/path.c
index 2e72930..0eb5d49 100644
--- a/src/lib/json/path.c
+++ b/src/lib/json/path.c
@@ -242,3 +242,28 @@ json_path_next(struct json_path_parser *parser, struct json_path_node *node)
 		return json_parse_identifier(parser, node);
 	}
 }
+
+int
+json_path_normalize(const char *path, uint32_t path_len, char *out)
+{
+	struct json_path_parser parser;
+	struct json_path_node node;
+	json_path_parser_create(&parser, path, path_len);
+	int rc;
+	while ((rc = json_path_next(&parser, &node)) == 0 &&
+		node.type != JSON_PATH_END) {
+		if (node.type == JSON_PATH_NUM) {
+			out += sprintf(out, "[%llu]",
+				      (unsigned long long)node.num);
+		} else if (node.type == JSON_PATH_STR) {
+			out += sprintf(out, "[\"%.*s\"]", node.len, node.str);
+		} else {
+			unreachable();
+		}
+	}
+	if (rc != 0)
+		return rc;
+	*out = '\0';
+	assert(node.type == JSON_PATH_END);
+	return 0;
+}
diff --git a/src/lib/json/path.h b/src/lib/json/path.h
index c3c381a..f6b2ee2 100644
--- a/src/lib/json/path.h
+++ b/src/lib/json/path.h
@@ -105,6 +105,24 @@ json_path_parser_create(struct json_path_parser *parser, const char *src,
 int
 json_path_next(struct json_path_parser *parser, struct json_path_node *node);
 
+/**
+ * Convert a path to its 'canonical' form:
+ *  - all map keys are specified in the ["key"] operator form;
+ *  - all array indexes are specified in the [i] operator form.
+ * This notation is preferable because in the general case it can
+ * be parsed unambiguously.
+ * @param path Source path string to be converted.
+ * @param path_len The length of @path.
+ * @param[out] out Memory to store the normalized string.
+ *                 The worst-case scenario requires a
+ *                 2.5 * path_len + 1 byte buffer.
+ * @retval 0 On success.
+ * @retval > 0 Position of a syntax error. The position is 1-based
+ *             and counted from the beginning of the source string.
+ */
+int
+json_path_normalize(const char *path, uint32_t path_len, char *out);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/test/unit/json_path.c b/test/unit/json_path.c
index 75ca11b..583101e 100644
--- a/test/unit/json_path.c
+++ b/test/unit/json_path.c
@@ -367,15 +367,54 @@ test_tree()
 	footer();
 }
 
+void
+test_normalize_path()
+{
+	header();
+	plan(8);
+
+	const char *path_normalized = "[\"FIO\"][3][\"fname\"]";
+	const char *path1 = "FIO[3].fname";
+	const char *path2 = "[\"FIO\"][3].fname";
+	const char *path3 = "FIO[3][\"fname\"]";
+	char buff[strlen(path_normalized) + 1];
+	int rc;
+
+	rc = json_path_normalize(path_normalized, strlen(path_normalized),
+				 buff);
+	is(rc, 0, "normalize '%s' path status", path_normalized);
+	is(strcmp(buff, path_normalized), 0, "normalize '%s' path compare",
+		  path_normalized);
+
+	rc = json_path_normalize(path1, strlen(path1), buff);
+	is(rc, 0, "normalize '%s' path status", path1);
+	is(strcmp(buff, path_normalized), 0, "normalize '%s' path compare",
+		  path1);
+
+	rc = json_path_normalize(path2, strlen(path2), buff);
+	is(rc, 0, "normalize '%s' path status", path2);
+	is(strcmp(buff, path_normalized), 0, "normalize '%s' path compare",
+		  path2);
+
+	rc = json_path_normalize(path3, strlen(path3), buff);
+	is(rc, 0, "normalize '%s' path status", path3);
+	is(strcmp(buff, path_normalized), 0, "normalize '%s' path compare",
+		  path3);
+
+	check_plan();
+	footer();
+}
+
 int
 main()
 {
 	header();
-	plan(3);
+	plan(4);
 
 	test_basic();
 	test_errors();
 	test_tree();
+	test_normalize_path();
 
 	int rc = check_plan();
 	footer();
diff --git a/test/unit/json_path.result b/test/unit/json_path.result
index 5b44fd2..1331f71 100644
--- a/test/unit/json_path.result
+++ b/test/unit/json_path.result
@@ -1,5 +1,5 @@
 	*** main ***
-1..3
+1..4
 	*** test_basic ***
     1..71
     ok 1 - parse <[0]>
@@ -139,4 +139,16 @@ ok 2 - subtests
     ok 36 - records iterated count 4 of 4
 ok 3 - subtests
 	*** test_tree: done ***
+	*** test_normalize_path ***
+    1..8
+    ok 1 - normalize '["FIO"][3]["fname"]' path status
+    ok 2 - normalize '["FIO"][3]["fname"]' path compare
+    ok 3 - normalize 'FIO[3].fname' path status
+    ok 4 - normalize 'FIO[3].fname' path compare
+    ok 5 - normalize '["FIO"][3].fname' path status
+    ok 6 - normalize '["FIO"][3].fname' path compare
+    ok 7 - normalize 'FIO[3]["fname"]' path status
+    ok 8 - normalize 'FIO[3]["fname"]' path compare
+ok 4 - subtests
+	*** test_normalize_path: done ***
 	*** main: done ***
-- 
2.7.4