From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Thu, 10 May 2018 21:41:07 +0300 From: Konstantin Osipov Subject: Re: [tarantool-patches] [PATCH v2 03/10] yaml: introduce yaml.decode_tag Message-ID: <20180510184107.GC30593@atlas> References: <0bce785538bba5d81f6c813732a7e309a941e175.1524228894.git.v.shpilevoy@tarantool.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0bce785538bba5d81f6c813732a7e309a941e175.1524228894.git.v.shpilevoy@tarantool.org> To: tarantool-patches@freelists.org Cc: vdavydov.dev@gmail.com List-ID: * Vladislav Shpilevoy [18/04/20 16:25]: > Yaml.decode_tag allows to decode a single tag of a YAML document. Why do you need a function to decode just the tag, not an entire document with tags? > > To distinguish them YAML tags will be used. A client console for > each message will try to find a tag. If a tag is absent, then the > message is a simple response on a request. response to > If a tag is !print!, then the document consists of a single > string, that must be printed. Such a document must be decoded to > get the printed string. So the calls sequence is yaml.decode_tag > + yaml.decode. The reason why a print message must be decoded > is that a print() result on a server side can be not > well-formatted YAML, and must be encoded into it to be correctly > send. For example, when I do on a server side something like > console.print('very bad YAML string') > > The result of a print() is not a YAML document, and to be sent it > must be encoded into YAML on a server side. > > If a tag is !push!, then the document is sent via > box.session.push, and must not be decoded. It can be just printed > or ignored or something. It is nice you explain this convention in a changeset comment, but I'd suggest to move the explanation to the relevant commit, i.e. the one which uses the api you're adding in this commit. > > Needed for #2677 > --- > test/app-tap/yaml.test.lua | 30 ++++++++++++++++++- > third_party/lua-yaml/lyaml.cc | 67 +++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 96 insertions(+), 1 deletion(-) > > diff --git a/test/app-tap/yaml.test.lua b/test/app-tap/yaml.test.lua > index b81402cc0..ef64509fe 100755 > --- a/test/app-tap/yaml.test.lua > +++ b/test/app-tap/yaml.test.lua > @@ -70,7 +70,10 @@ local function test_output(test, s) > end > > local function test_tagged(test, s) > - test:plan(7) > + test:plan(15) > + -- > + -- Test encoding tags. > + -- > local must_be_err = 'Usage: encode_tagged({prefix = , handle = }, object)' > local prefix = 'tag:tarantool.io/push,2018' > local ok, err = pcall(s.encode_tagged) > @@ -87,6 +90,31 @@ local function test_tagged(test, s) > test:is(ret, "%TAG !push! "..prefix.."\n--- 300\n...\n", "encode_tagged usage") > ret = s.encode_tagged({handle = '!print!', prefix = prefix}, {a = 100, b = 200}) > test:is(ret, "%TAG !print! tag:tarantool.io/push,2018\n---\na: 100\nb: 200\n...\n", 'encode_tagged usage') > + -- > + -- Test decoding tags. > + -- > + must_be_err = "Usage: decode_tag()" This is a bad test - any change to the error message will require changes to the test. If you're testing that there is a relevant usage message, you could search for substring. As a person reading your code and test for it, I am at a loss what kind of string the API expects. Is it a fragment of YAML document containing the tag? Is it a single YAML document? Is it a stream with multiple documents? I can see you're passing the return value from the encoder into decode_tag(). Why is the API a-symmetric? It should not be asymmetric in the first place, but if you decided to make it one please this deserves an explanation in the changeset comment. > + ok, err = pcall(s.decode_tag) I suggest you simply have encode_tagged and decode_tagged. Or even simpler, extend dump/load, with tag support. > + test:is(err, must_be_err, "decode_tag usage") > + ok, err = pcall(s.decode_tag, false) > + test:is(err, must_be_err, "decode_tag usage") > + local handle, prefix = s.decode_tag(ret) > + test:is(handle, "!print!", "handle is decoded ok") > + test:is(prefix, "tag:tarantool.io/push,2018", "prefix is decoded ok") > + local several_tags = > +[[%TAG !tag1! tag:tarantool.io/tag1,2018 > +%TAG !tag2! tag:tarantool.io/tag2,2018 Please add a test case for multiple documents. > +--- > +- 100 > +... > +]] > + ok, err = s.decode_tag(several_tags) > + test:is(ok, nil, "can not decode multiple tags") > + test:is(err, "can not decode multiple tags", "same") > + local no_tags = s.encode(100) > + handle, prefix = s.decode_tag(no_tags) > + test:is(handle, nil, "no tag - no handle") > + test:is(prefix, nil, "no tag - no prefix") > end > > tap.test("yaml", function(test) > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc > index 8329f84e9..d24715edd 100644 > --- a/third_party/lua-yaml/lyaml.cc > +++ b/third_party/lua-yaml/lyaml.cc > @@ -361,6 +361,72 @@ static int l_load(lua_State *L) { > return loader.document_count; > } > > +/** > + * Decode a global tag of document. Multiple tags can not be > + * decoded. In a case of multiple documents only first is > + * processed. Decode a document taking into account document tags. In case of success, pops the input from the stack and pushed the document and a table with options, containing tag prefix and tag handle. > + * @param YAML document in string. > + * @retval nil, err Error occured during decoding. In the second > + * value is error description. > + * @retval nil, nil A document does not contain tags. > + * @retval handle, prefix Handle and prefix of a tag. > + */ > +static int > +l_load_tag(struct lua_State *L) Function name in C does not match the Lua name. Please make sure the names match. I understand you sometimes may avoid extra work because you don't believe I am ever going to look at the patch, but this is not extra work, this is just sloppy code. > +{ > + if (lua_gettop(L) != 1 || !lua_isstring(L, 1)) > + return luaL_error(L, "Usage: decode_tag()"); > + size_t len; > + const char *str = lua_tolstring(L, 1, &len); > + struct lua_yaml_loader loader; > + memset(&loader, 0, sizeof(loader)); > + loader.L = L; > + loader.cfg = luaL_checkserializer(L); > + if (yaml_parser_initialize(&loader.parser) == 0) > + luaL_error(L, OOM_ERRMSG); > + yaml_tag_directive_t *start, *end; > + yaml_parser_set_input_string(&loader.parser, (const unsigned char *) str, > + len); Shouldn't you use yaml typedef here rather than const unsigned char *? > + /* Initial parser step. Detect the documents start position. */ > + if (do_parse(&loader) == 0) > + goto parse_error; > + if (loader.event.type != YAML_STREAM_START_EVENT) { > + lua_pushnil(L); > + lua_pushstring(L, "expected STREAM_START_EVENT"); > + return 2; > + } What is the current convention for dump/load API? Does it use nil, err or lua_error() for errors? Why did you decide to depart from the current convention? > + /* Parse a document start. */ > + if (do_parse(&loader) == 0) > + goto parse_error; > + if (loader.event.type == YAML_STREAM_END_EVENT) > + goto no_tags; > + assert(loader.event.type == YAML_DOCUMENT_START_EVENT); > + start = loader.event.data.document_start.tag_directives.start; > + end = loader.event.data.document_start.tag_directives.end; > + if (start == end) > + goto no_tags; > + if (end - start > 1) { > + lua_pushnil(L); > + lua_pushstring(L, "can not decode multiple tags"); > + return 2; > + } > + lua_pushstring(L, (const char *) start->handle); > + lua_pushstring(L, (const char *) start->prefix); > + delete_event(&loader); > + yaml_parser_delete(&loader.parser); > + return 2; Why not make the API symmetric in what it expects and returns? dump(object, options) -> stream load(stream) -> object, options or error > + > +parse_error: > + lua_pushnil(L); > + /* Make nil be before an error message. */ > + lua_insert(L, -2); > + return 2; > + > +no_tags: > + lua_pushnil(L); > + return 1; > +} > + > static int dump_node(struct lua_yaml_dumper *dumper); > > static yaml_char_t *get_yaml_anchor(struct lua_yaml_dumper *dumper) { > @@ -753,6 +819,7 @@ static const luaL_Reg yamllib[] = { > { "encode", l_dump }, > { "encode_tagged", l_dump_tagged }, > { "decode", l_load }, > + { "decode_tag", l_load_tag }, > { "new", l_new }, > { NULL, NULL} > }; -- Konstantin Osipov, Moscow, Russia, +7 903 626 22 32 http://tarantool.io - www.twitter.com/kostja_osipov