From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id B960B2849E for ; Fri, 22 Feb 2019 10:14:22 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FHSpseVpJXi3 for ; Fri, 22 Feb 2019 10:14:22 -0500 (EST) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 4E72128498 for ; Fri, 22 Feb 2019 10:14:22 -0500 (EST) Subject: [tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml References: <15b711cb55d34fff1f010fd6df1aa32248f65735.1548123025.git.alexander.turenko@tarantool.org> <3da6fc29-991d-e475-0777-49eacb764b94@tarantool.org> <20190205032947.oibtqsvinfn3v34e@tkn_work_nb> <20190218185521.ons5k5oss5vtkz5b@tkn_work_nb> From: Vladislav Shpilevoy Message-ID: Date: Fri, 22 Feb 2019 18:14:19 +0300 MIME-Version: 1.0 In-Reply-To: <20190218185521.ons5k5oss5vtkz5b@tkn_work_nb> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: Alexander Turenko , Kirill Yukhin Cc: tarantool-patches@freelists.org LGTM. On 18/02/2019 21:55, Alexander Turenko wrote: > I answered to the comments below and noted the fixes were made. Rebased > the patchset on top of fresh 2.1. > > The new patch is at end of the email. > > WBR, Alexander Turenko. > > On Tue, Feb 05, 2019 at 10:36:40PM +0300, Vladislav Shpilevoy wrote: >> Thanks for the fixes! >> >>>>> diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua >>>>> index 4f915afd7..68f273234 100755 >>>>> --- a/test/app-tap/console.test.lua >>>>> +++ b/test/app-tap/console.test.lua >>>>> @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter") >>>>> -- >>>>> -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly. >>>>> +-- gh-3662: yaml.encode encodes booleans with multiline format. >>>>> +-- gh-3583: yaml.encode encodes null incorrectly. >>>>> -- >>>>> -test:is(type(yaml.decode(yaml.encode('false'))), 'string') >>>>> -test:is(type(yaml.decode(yaml.encode('true'))), 'string') >>>>> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') >>>>> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') >>>> >>>> 1. Are you sure that this is the correct behaviour? So now it is >>>> not true that decode(encode(obj)) == obj? Looks utterly confusing. >>>> Or why else you removed these tests? >>> >>> They are work as before, but now superseded with simpler ones. >> >> It is not obvious. I see a test >> >> yaml.encode(false) == "--- false\n...\n" >> >> I see another test: >> >> yaml.decode('false') == false >> >> But where is this? - >> >> yaml.decode("--- false\n...\n") == false > > The removed test cases covered one case that is not covered with the new > cases: they decode a yaml document, not just a scalar. And it is > confusing why I remove test cases in the patch. So it worth to just add > them back. > > Added. > >> >> And why "false" == "--- false\n...\n" according to the tests >> in the patch? > > `---` and `...` is a yaml document start and end, see [1]. A sequence of > documents forms a yaml stream like a sequence of stanzas forms a xml > stream. We decode one document to its value. Multiple documents however > forms a table. Anyway, it does not matter much in context of this issue. > > [1]: https://yaml.org/spec/1.2/spec.html#id2800401 > >> >>>> >>>>> +{ >>>>> + if (len > 5) >>>>> + return YAML_NO_MATCH; >>>>> + if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0) >>>>> + return YAML_FALSE; >>>>> + if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0) >>>> >>>> 3. I think, it is redundant to firstly check length and then >>>> call strcmp, which also calculates length (by checking zero >>>> byte). We need either use only strcmp(), or use length comparison >>>> + memcmp. The same below for the null parser. >>> >>> Okay, it was the premature optimization. >>> >>> The more important thing is that is was incorrect: "fals" gives >>> YAML_FALSE. >>> >>> I had fixed it like so, is it okay? >>> >>> ``` >>> if ((len == 5 && memcmp(str, "false", 5) == 0) || >>> (len == 2 && memcmp(str, "no", 2) == 0)) { >>> if (out != NULL) >>> *out = false; >>> return 0; >>> } >>> ... >>> ``` >> >> Yes, this fix is ok, thanks. > > Ok. > >> >>> >>>> >>>>> + return YAML_TRUE; >>>>> + return YAML_NO_MATCH; >>>>> +} >>>>> + >>>>> +/** >>>>> + * Verify whether a string represents a null literal in YAML. >>>>> + * >>>>> + * Non-standard: don't match an empty string as null. >>>>> + */ >>>>> +static yaml_type >>>>> +yaml_get_null(const char *str, const size_t len){ >>>> >>>> 4. ? >>>> >>>> static inline int >>>> yaml_parse_null(const char *str, size_t len, bool *is_null); >>> >>> is_null is redundant here, because a return code contain all needed >>> information. Agreed except that. Changed. >> >> My point was to standardize parser functions so as to always >> return 0/-1 on success/error, and always return an actual value >> via an out parameter. But now I see, that -1 here is not really an >> error, but rather 'not being of a requested type'. So here we >> do not have even a notion of 'error'. >> >> In such a case, please, consider the fixes below. Also, I removed >> optionality of out parameter in yaml_parse_bool in order to get rid >> of 'if'. > > I don't mind. Integrated these changes into the patch (with minor > wording / formatting changes). > >> >> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc >> index 5afcace1b..a4e03a8ba 100644 >> --- a/third_party/lua-yaml/lyaml.cc >> +++ b/third_party/lua-yaml/lyaml.cc >> @@ -96,46 +96,43 @@ struct lua_yaml_dumper { >> /** >> * Verify whether a string represents a boolean literal in YAML. >> - * >> * Non-standard: only subset of YAML 1.1 boolean literals are >> * treated as boolean values. >> - * >> - * Return 0 at success, -1 at error. >> + * @param str Literal. >> + * @param len Length of @a str. >> + * @param[out] out Result boolean value. >> + * @retval Whether @a str represents a boolean value or not. >> */ >> -static inline int >> -yaml_parse_bool(const char *str, size_t len, bool *out) >> +static inline bool >> +yaml_is_bool(const char *str, size_t len, bool *out) >> { >> if ((len == 5 && memcmp(str, "false", 5) == 0) || >> (len == 2 && memcmp(str, "no", 2) == 0)) { >> - if (out != NULL) >> - *out = false; >> - return 0; >> + *out = false; >> + return true; >> } >> if ((len == 4 && memcmp(str, "true", 4) == 0) || >> (len == 3 && memcmp(str, "yes", 3) == 0)) { >> - if (out != NULL) >> - *out = true; >> - return 0; >> + *out = true; >> + return true; >> } >> - return -1; >> + return false; >> } >> /** >> * Verify whether a string represents a null literal in YAML. >> - * >> * Non-standard: don't match an empty string, 'Null' and 'NULL' as >> * null. >> - * >> - * Return 0 at success, -1 at error. >> + * @retval Whether @a str represents NULL or not. >> */ >> -static inline int >> -yaml_parse_null(const char *str, size_t len) >> +static inline bool >> +yaml_is_null(const char *str, size_t len) >> { >> if (len == 1 && str[0] == '~') >> - return 0; >> + return true; >> if (len == 4 && memcmp(str, "null", 4) == 0) >> - return 0; >> - return -1; >> + return true; >> + return false; >> } >> static void generate_error_message(struct lua_yaml_loader *loader) { >> @@ -254,9 +251,9 @@ static void load_scalar(struct lua_yaml_loader *loader) { >> return; >> } else if (!strcmp(tag, "bool")) { >> bool value = false; >> - int rc = yaml_parse_bool(str, length, &value); >> + bool rc = yaml_is_bool(str, length, &value); >> (void) rc; >> - assert(rc == 0); >> + assert(rc); > > I removed the assert() here, because yaml.decode('!!bool null') should > not lead to an assertion fail. I didn't investigate why the behaviour is > to produce `false` rather then raise an error, just keep the behaviour > the same as before. However I don't add a test, because it is unclear > whether the module behaves in this way with intention or by occasion. > >> lua_pushboolean(loader->L, value); >> return; >> } else if (!strcmp(tag, "binary")) { >> @@ -275,10 +272,10 @@ static void load_scalar(struct lua_yaml_loader *loader) { >> */ >> lua_pushliteral(loader->L, ""); >> return; >> - } else if (yaml_parse_null(str, length) == 0) { >> + } else if (yaml_is_null(str, length)) { >> luaL_pushnull(loader->L); >> return; >> - } else if (yaml_parse_bool(str, length, &value) == 0) { >> + } else if (yaml_is_bool(str, length, &value)) { >> lua_pushboolean(loader->L, value); >> return; >> } >> @@ -610,6 +607,7 @@ static int dump_node(struct lua_yaml_dumper *dumper) >> yaml_event_t ev; >> yaml_scalar_style_t style = YAML_PLAIN_SCALAR_STYLE; >> int is_binary = 0; >> + bool unused; >> char buf[FPCONV_G_FMT_BUFSIZE]; >> struct luaL_field field; >> @@ -644,9 +642,8 @@ static int dump_node(struct lua_yaml_dumper *dumper) >> return dump_table(dumper, &field); >> case MP_STR: >> str = lua_tolstring(dumper->L, -1, &len); >> - if ((yaml_parse_null(str, len) == 0) >> - || (yaml_parse_bool(str, len, NULL) == 0) >> - || (lua_isnumber(dumper->L, -1))) { >> + if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) || >> + lua_isnumber(dumper->L, -1)) { >> /* >> * The string is convertible to a null, a boolean or >> * a number, quote it to preserve its type. >> > > ---- > > commit 8531b4af03016a69d82bff35504ef6ae3ffbdb82 > Author: Alexander Turenko > Date: Tue Jan 22 01:03:01 2019 +0300 > > lua-yaml: fix strings literals encoding in yaml > > yaml.encode() now wraps a string literal whose content is equal to a > null or a boolean value representation in YAML into single quotes. Those > literals are: 'false', 'no', 'true', 'yes', '~', 'null'. > > Reverted the a2d7643c commit to use single quotes instead of multiline > encoding for 'true' and 'false' string literals. > > Follows up #3476 > Closes #3662 > Closes #3583 > > diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua > index 4f915afd7..a03df5277 100755 > --- a/test/app-tap/console.test.lua > +++ b/test/app-tap/console.test.lua > @@ -21,7 +21,7 @@ local EOL = "\n...\n" > > test = tap.test("console") > > -test:plan(59) > +test:plan(70) > > -- Start console and connect to it > local server = console.listen(CONSOLE_SOCKET) > @@ -77,12 +77,28 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter") > > -- > -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly. > +-- gh-3662: yaml.encode encodes booleans with multiline format. > +-- gh-3583: yaml.encode encodes null incorrectly. > -- > + > test:is(type(yaml.decode(yaml.encode('false'))), 'string') > test:is(type(yaml.decode(yaml.encode('true'))), 'string') > test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') > test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') > > +test:is(yaml.encode(false), "--- false\n...\n") > +test:is(yaml.encode(true), "--- true\n...\n") > +test:is(yaml.encode('false'), "--- 'false'\n...\n") > +test:is(yaml.encode('true'), "--- 'true'\n...\n") > +test:is(yaml.encode(nil), "--- null\n...\n") > + > +test:is(yaml.decode('false'), false) > +test:is(yaml.decode('no'), false) > +test:is(yaml.decode('true'), true) > +test:is(yaml.decode('yes'), true) > +test:is(yaml.decode('~'), nil) > +test:is(yaml.decode('null'), nil) > + > box.cfg{ > listen=IPROTO_SOCKET; > memtx_memory = 107374182, > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc > index bd876ab29..46c98bde1 100644 > --- a/third_party/lua-yaml/lyaml.cc > +++ b/third_party/lua-yaml/lyaml.cc > @@ -94,6 +94,55 @@ struct lua_yaml_dumper { > luaL_Buffer yamlbuf; > }; > > +/** > + * Verify whether a string represents a boolean literal in YAML. > + * > + * Non-standard: only subset of YAML 1.1 boolean literals are > + * treated as boolean values. > + * > + * @param str Literal to check. > + * @param len Length of @a str. > + * @param[out] out Result boolean value. > + * > + * @retval Whether @a str represents a boolean value. > + */ > +static inline bool > +yaml_is_bool(const char *str, size_t len, bool *out) > +{ > + if ((len == 5 && memcmp(str, "false", 5) == 0) || > + (len == 2 && memcmp(str, "no", 2) == 0)) { > + *out = false; > + return true; > + } > + if ((len == 4 && memcmp(str, "true", 4) == 0) || > + (len == 3 && memcmp(str, "yes", 3) == 0)) { > + *out = true; > + return true; > + } > + return false; > +} > + > +/** > + * Verify whether a string represents a null literal in YAML. > + * > + * Non-standard: don't match an empty string, 'Null' and 'NULL' as > + * null. > + * > + * @param str Literal to check. > + * @param len Length of @a str. > + * > + * @retval Whether @a str represents a null value. > + */ > +static inline bool > +yaml_is_null(const char *str, size_t len) > +{ > + if (len == 1 && str[0] == '~') > + return true; > + if (len == 4 && memcmp(str, "null", 4) == 0) > + return true; > + return false; > +} > + > static void generate_error_message(struct lua_yaml_loader *loader) { > char buf[256]; > luaL_Buffer b; > @@ -209,7 +258,9 @@ static void load_scalar(struct lua_yaml_loader *loader) { > lua_pushnumber(loader->L, dval); > return; > } else if (!strcmp(tag, "bool")) { > - lua_pushboolean(loader->L, !strcmp(str, "true") || !strcmp(str, "yes")); > + bool value = false; > + yaml_is_bool(str, length, &value); > + lua_pushboolean(loader->L, value); > return; > } else if (!strcmp(tag, "binary")) { > frombase64(loader->L, (const unsigned char *)str, length); > @@ -218,20 +269,20 @@ static void load_scalar(struct lua_yaml_loader *loader) { > } > > if (loader->event.data.scalar.style == YAML_PLAIN_SCALAR_STYLE) { > - if (!strcmp(str, "~")) { > - luaL_pushnull(loader->L); > - return; > - } else if (!strcmp(str, "true") || !strcmp(str, "yes")) { > - lua_pushboolean(loader->L, 1); > - return; > - } else if (!strcmp(str, "false") || !strcmp(str, "no")) { > - lua_pushboolean(loader->L, 0); > + bool value; > + if (!length) { > + /* > + * Non-standard: an empty value/document is null > + * according to the standard, but we decode it as an > + * empty string. > + */ > + lua_pushliteral(loader->L, ""); > return; > - } else if (!strcmp(str, "null")) { > + } else if (yaml_is_null(str, length)) { > luaL_pushnull(loader->L); > return; > - } else if (!length) { > - lua_pushliteral(loader->L, ""); > + } else if (yaml_is_bool(str, length, &value)) { > + lua_pushboolean(loader->L, value); > return; > } > > @@ -547,7 +598,6 @@ static int yaml_is_flow_mode(struct lua_yaml_dumper *dumper) { > (evp->type == YAML_MAPPING_START_EVENT && > evp->data.mapping_start.style == YAML_FLOW_MAPPING_STYLE)) { > return 1; > - break; > } > } > } > @@ -564,6 +614,8 @@ static int dump_node(struct lua_yaml_dumper *dumper) > int is_binary = 0; > char buf[FPCONV_G_FMT_BUFSIZE]; > struct luaL_field field; > + bool unused; > + (void) unused; > > int top = lua_gettop(dumper->L); > luaL_checkfield(dumper->L, dumper->cfg, top, &field); > @@ -596,8 +648,12 @@ static int dump_node(struct lua_yaml_dumper *dumper) > return dump_table(dumper, &field); > case MP_STR: > str = lua_tolstring(dumper->L, -1, &len); > - if (lua_isnumber(dumper->L, -1)) { > - /* string is convertible to number, quote it to preserve type */ > + if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) || > + lua_isnumber(dumper->L, -1)) { > + /* > + * The string is convertible to a null, a boolean or > + * a number, quote it to preserve its type. > + */ > style = YAML_SINGLE_QUOTED_SCALAR_STYLE; > break; > } > @@ -605,12 +661,10 @@ static int dump_node(struct lua_yaml_dumper *dumper) > if (utf8_check_printable(str, len)) { > if (yaml_is_flow_mode(dumper)) { > style = YAML_SINGLE_QUOTED_SCALAR_STYLE; > - } else if (strstr(str, "\n\n") != NULL || strcmp(str, "true") == 0 || > - strcmp(str, "false") == 0) { > + } else if (strstr(str, "\n\n") != 0) { > /* > * Tarantool-specific: use literal style for string > - * with empty lines and strings representing boolean > - * types. > + * with empty lines. > * Useful for tutorial(). > */ > style = YAML_LITERAL_SCALAR_STYLE; >