[tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Jan 25 00:26:56 MSK 2019


Thanks for the patch!

On 22/01/2019 05:12, Alexander Turenko wrote:
> Encode changes:
> 
> Wrap the following string literals into single quotes: 'false', 'no',
> 'true', 'yes', '~', 'null', 'Null', 'NULL'. The content of these strings
> is equals to null or boolean values representation in YAML.
> 
> Reverted the a2d7643c commit to use single quotes instead of multiline
> encoding for 'true' and 'false' string literals.
> 
> '~', 'null', 'Null', 'NULL', 'no' and 'yes' were encoded w/o quotes
> before, so '~', 'null', 'no', 'yes' were decoded with a wrong type then.
> Multiline encoding was used for 'true' and 'false'.

Unfortunately, I hardly understood what happened in this commit
by reading this message, sorry. Maybe I am confused by imperative
in the message body, or by too frequent repetition of this infinite
sequence of strings: "'~', 'null', 'Null', 'NULL', 'no' and 'yes'",
or by too mixed style of narration of what is the true and standard
way of these values encoding - with quotes or without.

Could you, please, make it simpler somehow?

> 
> Decode changes:
> 
> Treat 'Null', 'NULL' as null in addition to '~' and 'null'.
> 
> Note: this commit leaves an empty document/value treatment as an empty
> string, which is not standard (should be treated as null). This is fixed
> in the following commit.
> 
> Follows up #3476
> Closes #3662
> Closes #3583
> ---
>   test/app-tap/console.test.lua | 23 +++++++---
>   test/box/misc.result          |  2 +-
>   third_party/lua-yaml/lyaml.cc | 80 ++++++++++++++++++++++++++---------
>   3 files changed, 80 insertions(+), 25 deletions(-)
> 
> diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua
> index 4f915afd7..68f273234 100755
> --- a/test/app-tap/console.test.lua
> +++ b/test/app-tap/console.test.lua
> @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter")
>   
>   --
>   -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly.
> +-- gh-3662: yaml.encode encodes booleans with multiline format.
> +-- gh-3583: yaml.encode encodes null incorrectly.
>   --
> -test:is(type(yaml.decode(yaml.encode('false'))), 'string')
> -test:is(type(yaml.decode(yaml.encode('true'))), 'string')
> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')

1. Are you sure that this is the correct behaviour? So now it is
not true that decode(encode(obj)) == obj? Looks utterly confusing.
Or why else you removed these tests?

> +
> +test:is(yaml.encode(false), "--- false\n...\n")
> +test:is(yaml.encode(true), "--- true\n...\n")
> +test:is(yaml.encode('false'), "--- 'false'\n...\n")
> +test:is(yaml.encode('true'), "--- 'true'\n...\n")
> +test:is(yaml.encode(nil), "--- null\n...\n")
> +
> +test:is(yaml.decode('false'), false)
> +test:is(yaml.decode('no'), false)
> +test:is(yaml.decode('true'), true)
> +test:is(yaml.decode('yes'), true)
> +test:is(yaml.decode('~'), nil)
> +test:is(yaml.decode('null'), nil)
> +test:is(yaml.decode('Null'), nil)
> +test:is(yaml.decode('NULL'), nil)
>   
>   box.cfg{
>       listen=IPROTO_SOCKET;
> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index 9b07992d8..73e5d2c31 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -94,6 +94,46 @@ struct lua_yaml_dumper {
>      luaL_Buffer yamlbuf;
>   };
>   
> +enum yaml_type {
> +   YAML_NO_MATCH = 0,
> +   YAML_FALSE,
> +   YAML_TRUE,
> +   YAML_NULL
> +};
> +
> +/**
> + * Verify whether a string represents a boolean literal in YAML.
> + *
> + * Non-standard: only subset of YAML 1.1 boolean literals are
> + * treated as boolean values.
> + */
> +static yaml_type
> +yaml_get_bool(const char *str, const size_t len)

2. Usually in such cases we use int 0/-1 returned value
and return the result via an out-parameter. So it should be

     static inline int
     yaml_parse_bool(const char *str, size_t len, bool *value);

Yes, I've changed 'get' to 'parse' and inlined the function,
if you do not mind.

The same for the next function, parsing null. After that you
can remove enum yaml_type.

Which, btw, is not a type, strictly speaking, because it
does not have YAML_INT, YAML_DOUBLE ...

Also, I do not see any point in making 'len' be 'const'.

Scarcely it is a case when you need a parameter, passed by
value instead of pointer, be explicitly const. The compiler
can detect it easily.

> +{
> +   if (len > 5)
> +      return YAML_NO_MATCH;
> +   if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0)
> +      return YAML_FALSE;
> +   if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0)

3. I think, it is redundant to firstly check length and then
call strcmp, which also calculates length (by checking zero
byte). We need either use only strcmp(), or use length comparison
+ memcmp. The same below for the null parser.

> +      return YAML_TRUE;
> +   return YAML_NO_MATCH;
> +}
> +
> +/**
> + * Verify whether a string represents a null literal in YAML.
> + *
> + * Non-standard: don't match an empty string as null.
> + */
> +static yaml_type
> +yaml_get_null(const char *str, const size_t len){

4. ?

     static inline int
     yaml_parse_null(const char *str, size_t len, bool *is_null);

> +   if (len == 1 && str[0] == '~')
> +      return YAML_NULL;
> +   if (len == 4 && (strcmp(str, "null") == 0 || strcmp(str, "Null") == 0 ||

5. Hmm. Length of "null" and "Null" is 3, not 4. As I understand,
len does not count terminating zero. And I can prove it - load_scalar()
in case if !strcmp(tag, "str") does lua_pushlstring(loader->L, str, length),
so length == strlen(str).

Even if I am mistaken about it, why two lines above you consider len
of "~" as 1, not 2?

> +       strcmp(str, "NULL") == 0))
> +      return YAML_NULL;
> +   return YAML_NO_MATCH;
> +}
> +
>   static void generate_error_message(struct lua_yaml_loader *loader) {
>      char buf[256];
>      luaL_Buffer b;
> @@ -597,8 +636,13 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>         return dump_table(dumper, &field);
>      case MP_STR:
>         str = lua_tolstring(dumper->L, -1, &len);
> -      if (lua_isnumber(dumper->L, -1)) {
> -         /* string is convertible to number, quote it to preserve type */
> +      if ((yaml_get_null(str, len) == YAML_NULL)
> +         || (yaml_get_bool(str, len) != YAML_NO_MATCH)
> +         || (lua_isnumber(dumper->L, -1))) {
> +         /*
> +          * The string is convertible to a null, a boolean or
> +          * a number, quote it to preserve its type.
> +          */

6. I think, it is related to my first comment. So why did you
delete those tests?

>            style = YAML_SINGLE_QUOTED_SCALAR_STYLE;
>            break;
>         }




More information about the Tarantool-patches mailing list