From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 55442247F1 for ; Thu, 24 Jan 2019 16:27:11 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing.freelists.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SVTUrkVuN-J8 for ; Thu, 24 Jan 2019 16:27:11 -0500 (EST) Received: from smtpng3.m.smailru.net (smtpng3.m.smailru.net [94.100.177.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTPS id 1608F246D9 for ; Thu, 24 Jan 2019 16:27:11 -0500 (EST) Subject: [tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml References: <15b711cb55d34fff1f010fd6df1aa32248f65735.1548123025.git.alexander.turenko@tarantool.org> From: Vladislav Shpilevoy Message-ID: <3da6fc29-991d-e475-0777-49eacb764b94@tarantool.org> Date: Fri, 25 Jan 2019 00:26:56 +0300 MIME-Version: 1.0 In-Reply-To: <15b711cb55d34fff1f010fd6df1aa32248f65735.1548123025.git.alexander.turenko@tarantool.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: tarantool-patches-bounce@freelists.org Errors-to: tarantool-patches-bounce@freelists.org Reply-To: tarantool-patches@freelists.org List-help: List-unsubscribe: List-software: Ecartis version 1.0.0 List-Id: tarantool-patches List-subscribe: List-owner: List-post: List-archive: To: tarantool-patches@freelists.org, Alexander Turenko Thanks for the patch! On 22/01/2019 05:12, Alexander Turenko wrote: > Encode changes: > > Wrap the following string literals into single quotes: 'false', 'no', > 'true', 'yes', '~', 'null', 'Null', 'NULL'. The content of these strings > is equals to null or boolean values representation in YAML. > > Reverted the a2d7643c commit to use single quotes instead of multiline > encoding for 'true' and 'false' string literals. > > '~', 'null', 'Null', 'NULL', 'no' and 'yes' were encoded w/o quotes > before, so '~', 'null', 'no', 'yes' were decoded with a wrong type then. > Multiline encoding was used for 'true' and 'false'. Unfortunately, I hardly understood what happened in this commit by reading this message, sorry. Maybe I am confused by imperative in the message body, or by too frequent repetition of this infinite sequence of strings: "'~', 'null', 'Null', 'NULL', 'no' and 'yes'", or by too mixed style of narration of what is the true and standard way of these values encoding - with quotes or without. Could you, please, make it simpler somehow? > > Decode changes: > > Treat 'Null', 'NULL' as null in addition to '~' and 'null'. > > Note: this commit leaves an empty document/value treatment as an empty > string, which is not standard (should be treated as null). This is fixed > in the following commit. > > Follows up #3476 > Closes #3662 > Closes #3583 > --- > test/app-tap/console.test.lua | 23 +++++++--- > test/box/misc.result | 2 +- > third_party/lua-yaml/lyaml.cc | 80 ++++++++++++++++++++++++++--------- > 3 files changed, 80 insertions(+), 25 deletions(-) > > diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua > index 4f915afd7..68f273234 100755 > --- a/test/app-tap/console.test.lua > +++ b/test/app-tap/console.test.lua > @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter") > > -- > -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly. > +-- gh-3662: yaml.encode encodes booleans with multiline format. > +-- gh-3583: yaml.encode encodes null incorrectly. > -- > -test:is(type(yaml.decode(yaml.encode('false'))), 'string') > -test:is(type(yaml.decode(yaml.encode('true'))), 'string') > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string') 1. Are you sure that this is the correct behaviour? So now it is not true that decode(encode(obj)) == obj? Looks utterly confusing. Or why else you removed these tests? > + > +test:is(yaml.encode(false), "--- false\n...\n") > +test:is(yaml.encode(true), "--- true\n...\n") > +test:is(yaml.encode('false'), "--- 'false'\n...\n") > +test:is(yaml.encode('true'), "--- 'true'\n...\n") > +test:is(yaml.encode(nil), "--- null\n...\n") > + > +test:is(yaml.decode('false'), false) > +test:is(yaml.decode('no'), false) > +test:is(yaml.decode('true'), true) > +test:is(yaml.decode('yes'), true) > +test:is(yaml.decode('~'), nil) > +test:is(yaml.decode('null'), nil) > +test:is(yaml.decode('Null'), nil) > +test:is(yaml.decode('NULL'), nil) > > box.cfg{ > listen=IPROTO_SOCKET; > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc > index 9b07992d8..73e5d2c31 100644 > --- a/third_party/lua-yaml/lyaml.cc > +++ b/third_party/lua-yaml/lyaml.cc > @@ -94,6 +94,46 @@ struct lua_yaml_dumper { > luaL_Buffer yamlbuf; > }; > > +enum yaml_type { > + YAML_NO_MATCH = 0, > + YAML_FALSE, > + YAML_TRUE, > + YAML_NULL > +}; > + > +/** > + * Verify whether a string represents a boolean literal in YAML. > + * > + * Non-standard: only subset of YAML 1.1 boolean literals are > + * treated as boolean values. > + */ > +static yaml_type > +yaml_get_bool(const char *str, const size_t len) 2. Usually in such cases we use int 0/-1 returned value and return the result via an out-parameter. So it should be static inline int yaml_parse_bool(const char *str, size_t len, bool *value); Yes, I've changed 'get' to 'parse' and inlined the function, if you do not mind. The same for the next function, parsing null. After that you can remove enum yaml_type. Which, btw, is not a type, strictly speaking, because it does not have YAML_INT, YAML_DOUBLE ... Also, I do not see any point in making 'len' be 'const'. Scarcely it is a case when you need a parameter, passed by value instead of pointer, be explicitly const. The compiler can detect it easily. > +{ > + if (len > 5) > + return YAML_NO_MATCH; > + if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0) > + return YAML_FALSE; > + if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0) 3. I think, it is redundant to firstly check length and then call strcmp, which also calculates length (by checking zero byte). We need either use only strcmp(), or use length comparison + memcmp. The same below for the null parser. > + return YAML_TRUE; > + return YAML_NO_MATCH; > +} > + > +/** > + * Verify whether a string represents a null literal in YAML. > + * > + * Non-standard: don't match an empty string as null. > + */ > +static yaml_type > +yaml_get_null(const char *str, const size_t len){ 4. ? static inline int yaml_parse_null(const char *str, size_t len, bool *is_null); > + if (len == 1 && str[0] == '~') > + return YAML_NULL; > + if (len == 4 && (strcmp(str, "null") == 0 || strcmp(str, "Null") == 0 || 5. Hmm. Length of "null" and "Null" is 3, not 4. As I understand, len does not count terminating zero. And I can prove it - load_scalar() in case if !strcmp(tag, "str") does lua_pushlstring(loader->L, str, length), so length == strlen(str). Even if I am mistaken about it, why two lines above you consider len of "~" as 1, not 2? > + strcmp(str, "NULL") == 0)) > + return YAML_NULL; > + return YAML_NO_MATCH; > +} > + > static void generate_error_message(struct lua_yaml_loader *loader) { > char buf[256]; > luaL_Buffer b; > @@ -597,8 +636,13 @@ static int dump_node(struct lua_yaml_dumper *dumper) > return dump_table(dumper, &field); > case MP_STR: > str = lua_tolstring(dumper->L, -1, &len); > - if (lua_isnumber(dumper->L, -1)) { > - /* string is convertible to number, quote it to preserve type */ > + if ((yaml_get_null(str, len) == YAML_NULL) > + || (yaml_get_bool(str, len) != YAML_NO_MATCH) > + || (lua_isnumber(dumper->L, -1))) { > + /* > + * The string is convertible to a null, a boolean or > + * a number, quote it to preserve its type. > + */ 6. I think, it is related to my first comment. So why did you delete those tests? > style = YAML_SINGLE_QUOTED_SCALAR_STYLE; > break; > }