From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 00CB126B50 for ; Mon, 18 Feb 2019 13:55:25 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3pTdA2PQimhn for ; Mon, 18 Feb 2019 13:55:24 -0500 (EST) Received: from smtp57.i.mail.ru (smtp57.i.mail.ru [217.69.128.37]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 018A925F89 for ; Mon, 18 Feb 2019 13:55:23 -0500 (EST) Date: Mon, 18 Feb 2019 21:55:22 +0300 From: Alexander Turenko Subject: [tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml Message-ID: <20190218185521.ons5k5oss5vtkz5b@tkn_work_nb> References: <15b711cb55d34fff1f010fd6df1aa32248f65735.1548123025.git.alexander.turenko@tarantool.org> <3da6fc29-991d-e475-0777-49eacb764b94@tarantool.org> <20190205032947.oibtqsvinfn3v34e@tkn_work_nb> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: Vladislav Shpilevoy Cc: tarantool-patches@freelists.org I answered to the comments below and noted the fixes were made. Rebased the patchset on top of fresh 2.1. The new patch is at end of the email. WBR, Alexander Turenko. On Tue, Feb 05, 2019 at 10:36:40PM +0300, Vladislav Shpilevoy wrote: > Thanks for the fixes! > > > > > diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua > > > > index 4f915afd7..68f273234 100755 > > > > --- a/test/app-tap/console.test.lua > > > > +++ b/test/app-tap/console.test.lua > > > > @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter") > > > > -- > > > > -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly. > > > > +-- gh-3662: yaml.encode encodes booleans with multiline format. > > > > +-- gh-3583: yaml.encode encodes null incorrectly. > > > > -- > > > > -test:is(type(yaml.decode(yaml.encode('false'))), 'string') > > > > -test:is(type(yaml.decode(yaml.encode('true'))), 'string') > > > > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') > > > > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') > > > > > > 1. Are you sure that this is the correct behaviour? So now it is > > > not true that decode(encode(obj)) == obj? Looks utterly confusing. > > > Or why else you removed these tests? > > > > They are work as before, but now superseded with simpler ones. > > It is not obvious. I see a test > > yaml.encode(false) == "--- false\n...\n" > > I see another test: > > yaml.decode('false') == false > > But where is this? - > > yaml.decode("--- false\n...\n") == false The removed test cases covered one case that is not covered with the new cases: they decode a yaml document, not just a scalar. And it is confusing why I remove test cases in the patch. So it worth to just add them back. Added. > > And why "false" == "--- false\n...\n" according to the tests > in the patch? `---` and `...` is a yaml document start and end, see [1]. A sequence of documents forms a yaml stream like a sequence of stanzas forms a xml stream. We decode one document to its value. Multiple documents however forms a table. Anyway, it does not matter much in context of this issue. [1]: https://yaml.org/spec/1.2/spec.html#id2800401 > > > > > > > > +{ > > > > + if (len > 5) > > > > + return YAML_NO_MATCH; > > > > + if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0) > > > > + return YAML_FALSE; > > > > + if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0) > > > > > > 3. I think, it is redundant to firstly check length and then > > > call strcmp, which also calculates length (by checking zero > > > byte). We need either use only strcmp(), or use length comparison > > > + memcmp. The same below for the null parser. > > > > Okay, it was the premature optimization. > > > > The more important thing is that is was incorrect: "fals" gives > > YAML_FALSE. > > > > I had fixed it like so, is it okay? > > > > ``` > > if ((len == 5 && memcmp(str, "false", 5) == 0) || > > (len == 2 && memcmp(str, "no", 2) == 0)) { > > if (out != NULL) > > *out = false; > > return 0; > > } > > ... > > ``` > > Yes, this fix is ok, thanks. Ok. > > > > > > > > > > + return YAML_TRUE; > > > > + return YAML_NO_MATCH; > > > > +} > > > > + > > > > +/** > > > > + * Verify whether a string represents a null literal in YAML. > > > > + * > > > > + * Non-standard: don't match an empty string as null. > > > > + */ > > > > +static yaml_type > > > > +yaml_get_null(const char *str, const size_t len){ > > > > > > 4. ? > > > > > > static inline int > > > yaml_parse_null(const char *str, size_t len, bool *is_null); > > > > is_null is redundant here, because a return code contain all needed > > information. Agreed except that. Changed. > > My point was to standardize parser functions so as to always > return 0/-1 on success/error, and always return an actual value > via an out parameter. But now I see, that -1 here is not really an > error, but rather 'not being of a requested type'. So here we > do not have even a notion of 'error'. > > In such a case, please, consider the fixes below. Also, I removed > optionality of out parameter in yaml_parse_bool in order to get rid > of 'if'. I don't mind. Integrated these changes into the patch (with minor wording / formatting changes). > > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc > index 5afcace1b..a4e03a8ba 100644 > --- a/third_party/lua-yaml/lyaml.cc > +++ b/third_party/lua-yaml/lyaml.cc > @@ -96,46 +96,43 @@ struct lua_yaml_dumper { > /** > * Verify whether a string represents a boolean literal in YAML. > - * > * Non-standard: only subset of YAML 1.1 boolean literals are > * treated as boolean values. > - * > - * Return 0 at success, -1 at error. > + * @param str Literal. > + * @param len Length of @a str. > + * @param[out] out Result boolean value. > + * @retval Whether @a str represents a boolean value or not. > */ > -static inline int > -yaml_parse_bool(const char *str, size_t len, bool *out) > +static inline bool > +yaml_is_bool(const char *str, size_t len, bool *out) > { > if ((len == 5 && memcmp(str, "false", 5) == 0) || > (len == 2 && memcmp(str, "no", 2) == 0)) { > - if (out != NULL) > - *out = false; > - return 0; > + *out = false; > + return true; > } > if ((len == 4 && memcmp(str, "true", 4) == 0) || > (len == 3 && memcmp(str, "yes", 3) == 0)) { > - if (out != NULL) > - *out = true; > - return 0; > + *out = true; > + return true; > } > - return -1; > + return false; > } > /** > * Verify whether a string represents a null literal in YAML. > - * > * Non-standard: don't match an empty string, 'Null' and 'NULL' as > * null. > - * > - * Return 0 at success, -1 at error. > + * @retval Whether @a str represents NULL or not. > */ > -static inline int > -yaml_parse_null(const char *str, size_t len) > +static inline bool > +yaml_is_null(const char *str, size_t len) > { > if (len == 1 && str[0] == '~') > - return 0; > + return true; > if (len == 4 && memcmp(str, "null", 4) == 0) > - return 0; > - return -1; > + return true; > + return false; > } > static void generate_error_message(struct lua_yaml_loader *loader) { > @@ -254,9 +251,9 @@ static void load_scalar(struct lua_yaml_loader *loader) { > return; > } else if (!strcmp(tag, "bool")) { > bool value = false; > - int rc = yaml_parse_bool(str, length, &value); > + bool rc = yaml_is_bool(str, length, &value); > (void) rc; > - assert(rc == 0); > + assert(rc); I removed the assert() here, because yaml.decode('!!bool null') should not lead to an assertion fail. I didn't investigate why the behaviour is to produce `false` rather then raise an error, just keep the behaviour the same as before. However I don't add a test, because it is unclear whether the module behaves in this way with intention or by occasion. > lua_pushboolean(loader->L, value); > return; > } else if (!strcmp(tag, "binary")) { > @@ -275,10 +272,10 @@ static void load_scalar(struct lua_yaml_loader *loader) { > */ > lua_pushliteral(loader->L, ""); > return; > - } else if (yaml_parse_null(str, length) == 0) { > + } else if (yaml_is_null(str, length)) { > luaL_pushnull(loader->L); > return; > - } else if (yaml_parse_bool(str, length, &value) == 0) { > + } else if (yaml_is_bool(str, length, &value)) { > lua_pushboolean(loader->L, value); > return; > } > @@ -610,6 +607,7 @@ static int dump_node(struct lua_yaml_dumper *dumper) > yaml_event_t ev; > yaml_scalar_style_t style = YAML_PLAIN_SCALAR_STYLE; > int is_binary = 0; > + bool unused; > char buf[FPCONV_G_FMT_BUFSIZE]; > struct luaL_field field; > @@ -644,9 +642,8 @@ static int dump_node(struct lua_yaml_dumper *dumper) > return dump_table(dumper, &field); > case MP_STR: > str = lua_tolstring(dumper->L, -1, &len); > - if ((yaml_parse_null(str, len) == 0) > - || (yaml_parse_bool(str, len, NULL) == 0) > - || (lua_isnumber(dumper->L, -1))) { > + if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) || > + lua_isnumber(dumper->L, -1)) { > /* > * The string is convertible to a null, a boolean or > * a number, quote it to preserve its type. > ---- commit 8531b4af03016a69d82bff35504ef6ae3ffbdb82 Author: Alexander Turenko Date: Tue Jan 22 01:03:01 2019 +0300 lua-yaml: fix strings literals encoding in yaml yaml.encode() now wraps a string literal whose content is equal to a null or a boolean value representation in YAML into single quotes. Those literals are: 'false', 'no', 'true', 'yes', '~', 'null'. Reverted the a2d7643c commit to use single quotes instead of multiline encoding for 'true' and 'false' string literals. Follows up #3476 Closes #3662 Closes #3583 diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua index 4f915afd7..a03df5277 100755 --- a/test/app-tap/console.test.lua +++ b/test/app-tap/console.test.lua @@ -21,7 +21,7 @@ local EOL = "\n...\n" test = tap.test("console") -test:plan(59) +test:plan(70) -- Start console and connect to it local server = console.listen(CONSOLE_SOCKET) @@ -77,12 +77,28 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter") -- -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly. +-- gh-3662: yaml.encode encodes booleans with multiline format. +-- gh-3583: yaml.encode encodes null incorrectly. -- + test:is(type(yaml.decode(yaml.encode('false'))), 'string') test:is(type(yaml.decode(yaml.encode('true'))), 'string') test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') +test:is(yaml.encode(false), "--- false\n...\n") +test:is(yaml.encode(true), "--- true\n...\n") +test:is(yaml.encode('false'), "--- 'false'\n...\n") +test:is(yaml.encode('true'), "--- 'true'\n...\n") +test:is(yaml.encode(nil), "--- null\n...\n") + +test:is(yaml.decode('false'), false) +test:is(yaml.decode('no'), false) +test:is(yaml.decode('true'), true) +test:is(yaml.decode('yes'), true) +test:is(yaml.decode('~'), nil) +test:is(yaml.decode('null'), nil) + box.cfg{ listen=IPROTO_SOCKET; memtx_memory = 107374182, diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc index bd876ab29..46c98bde1 100644 --- a/third_party/lua-yaml/lyaml.cc +++ b/third_party/lua-yaml/lyaml.cc @@ -94,6 +94,55 @@ struct lua_yaml_dumper { luaL_Buffer yamlbuf; }; +/** + * Verify whether a string represents a boolean literal in YAML. + * + * Non-standard: only subset of YAML 1.1 boolean literals are + * treated as boolean values. + * + * @param str Literal to check. + * @param len Length of @a str. + * @param[out] out Result boolean value. + * + * @retval Whether @a str represents a boolean value. + */ +static inline bool +yaml_is_bool(const char *str, size_t len, bool *out) +{ + if ((len == 5 && memcmp(str, "false", 5) == 0) || + (len == 2 && memcmp(str, "no", 2) == 0)) { + *out = false; + return true; + } + if ((len == 4 && memcmp(str, "true", 4) == 0) || + (len == 3 && memcmp(str, "yes", 3) == 0)) { + *out = true; + return true; + } + return false; +} + +/** + * Verify whether a string represents a null literal in YAML. + * + * Non-standard: don't match an empty string, 'Null' and 'NULL' as + * null. + * + * @param str Literal to check. + * @param len Length of @a str. + * + * @retval Whether @a str represents a null value. + */ +static inline bool +yaml_is_null(const char *str, size_t len) +{ + if (len == 1 && str[0] == '~') + return true; + if (len == 4 && memcmp(str, "null", 4) == 0) + return true; + return false; +} + static void generate_error_message(struct lua_yaml_loader *loader) { char buf[256]; luaL_Buffer b; @@ -209,7 +258,9 @@ static void load_scalar(struct lua_yaml_loader *loader) { lua_pushnumber(loader->L, dval); return; } else if (!strcmp(tag, "bool")) { - lua_pushboolean(loader->L, !strcmp(str, "true") || !strcmp(str, "yes")); + bool value = false; + yaml_is_bool(str, length, &value); + lua_pushboolean(loader->L, value); return; } else if (!strcmp(tag, "binary")) { frombase64(loader->L, (const unsigned char *)str, length); @@ -218,20 +269,20 @@ static void load_scalar(struct lua_yaml_loader *loader) { } if (loader->event.data.scalar.style == YAML_PLAIN_SCALAR_STYLE) { - if (!strcmp(str, "~")) { - luaL_pushnull(loader->L); - return; - } else if (!strcmp(str, "true") || !strcmp(str, "yes")) { - lua_pushboolean(loader->L, 1); - return; - } else if (!strcmp(str, "false") || !strcmp(str, "no")) { - lua_pushboolean(loader->L, 0); + bool value; + if (!length) { + /* + * Non-standard: an empty value/document is null + * according to the standard, but we decode it as an + * empty string. + */ + lua_pushliteral(loader->L, ""); return; - } else if (!strcmp(str, "null")) { + } else if (yaml_is_null(str, length)) { luaL_pushnull(loader->L); return; - } else if (!length) { - lua_pushliteral(loader->L, ""); + } else if (yaml_is_bool(str, length, &value)) { + lua_pushboolean(loader->L, value); return; } @@ -547,7 +598,6 @@ static int yaml_is_flow_mode(struct lua_yaml_dumper *dumper) { (evp->type == YAML_MAPPING_START_EVENT && evp->data.mapping_start.style == YAML_FLOW_MAPPING_STYLE)) { return 1; - break; } } } @@ -564,6 +614,8 @@ static int dump_node(struct lua_yaml_dumper *dumper) int is_binary = 0; char buf[FPCONV_G_FMT_BUFSIZE]; struct luaL_field field; + bool unused; + (void) unused; int top = lua_gettop(dumper->L); luaL_checkfield(dumper->L, dumper->cfg, top, &field); @@ -596,8 +648,12 @@ static int dump_node(struct lua_yaml_dumper *dumper) return dump_table(dumper, &field); case MP_STR: str = lua_tolstring(dumper->L, -1, &len); - if (lua_isnumber(dumper->L, -1)) { - /* string is convertible to number, quote it to preserve type */ + if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) || + lua_isnumber(dumper->L, -1)) { + /* + * The string is convertible to a null, a boolean or + * a number, quote it to preserve its type. + */ style = YAML_SINGLE_QUOTED_SCALAR_STYLE; break; } @@ -605,12 +661,10 @@ static int dump_node(struct lua_yaml_dumper *dumper) if (utf8_check_printable(str, len)) { if (yaml_is_flow_mode(dumper)) { style = YAML_SINGLE_QUOTED_SCALAR_STYLE; - } else if (strstr(str, "\n\n") != NULL || strcmp(str, "true") == 0 || - strcmp(str, "false") == 0) { + } else if (strstr(str, "\n\n") != 0) { /* * Tarantool-specific: use literal style for string - * with empty lines and strings representing boolean - * types. + * with empty lines. * Useful for tutorial(). */ style = YAML_LITERAL_SCALAR_STYLE;