[tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml

Alexander Turenko alexander.turenko at tarantool.org
Tue Feb 5 06:29:48 MSK 2019


Hi!

I dropped support of 'Null' and 'NULL' as null values to stick with the
old approach (only 'null' and '~' are decoded as nulls). At least while
changes are not requested explicitly it is better to keep backward
compatibility even if some parts of behaviour is not standard.

Answered your comments below.

I'll resend the updated patchset as v3 soon.

WBR, Alexander Turenko.

On Fri, Jan 25, 2019 at 12:26:56AM +0300, Vladislav Shpilevoy wrote:
> Thanks for the patch!
> 
> On 22/01/2019 05:12, Alexander Turenko wrote:
> > Encode changes:
> > 
> > Wrap the following string literals into single quotes: 'false', 'no',
> > 'true', 'yes', '~', 'null', 'Null', 'NULL'. The content of these strings
> > is equals to null or boolean values representation in YAML.
> > 
> > Reverted the a2d7643c commit to use single quotes instead of multiline
> > encoding for 'true' and 'false' string literals.
> > 
> > '~', 'null', 'Null', 'NULL', 'no' and 'yes' were encoded w/o quotes
> > before, so '~', 'null', 'no', 'yes' were decoded with a wrong type then.
> > Multiline encoding was used for 'true' and 'false'.
> 
> Unfortunately, I hardly understood what happened in this commit
> by reading this message, sorry. Maybe I am confused by imperative
> in the message body, or by too frequent repetition of this infinite
> sequence of strings: "'~', 'null', 'Null', 'NULL', 'no' and 'yes'",
> or by too mixed style of narration of what is the true and standard
> way of these values encoding - with quotes or without.
> 
> Could you, please, make it simpler somehow?

You are right, it was horrible. I had changed wording and reflected the
patch changes (described above).

```
lua-yaml: fix strings literals encoding in yaml

yaml.encode() now wraps a string literal whose content is equal to a
null or a boolean value representation in YAML into single quotes. Those
literals are: 'false', 'no', 'true', 'yes', '~', 'null'.

Reverted the a2d7643c commit to use single quotes instead of multiline
encoding for 'true' and 'false' string literals.

Follows up #3476
Closes #3662
Closes #3583
```

> 
> > 
> > Decode changes:
> > 
> > Treat 'Null', 'NULL' as null in addition to '~' and 'null'.
> > 
> > Note: this commit leaves an empty document/value treatment as an empty
> > string, which is not standard (should be treated as null). This is fixed
> > in the following commit.
> > 
> > Follows up #3476
> > Closes #3662
> > Closes #3583
> > ---
> >   test/app-tap/console.test.lua | 23 +++++++---
> >   test/box/misc.result          |  2 +-
> >   third_party/lua-yaml/lyaml.cc | 80 ++++++++++++++++++++++++++---------
> >   3 files changed, 80 insertions(+), 25 deletions(-)
> > 
> > diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua
> > index 4f915afd7..68f273234 100755
> > --- a/test/app-tap/console.test.lua
> > +++ b/test/app-tap/console.test.lua
> > @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter")
> >   --
> >   -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly.
> > +-- gh-3662: yaml.encode encodes booleans with multiline format.
> > +-- gh-3583: yaml.encode encodes null incorrectly.
> >   --
> > -test:is(type(yaml.decode(yaml.encode('false'))), 'string')
> > -test:is(type(yaml.decode(yaml.encode('true'))), 'string')
> > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
> > -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
> 
> 1. Are you sure that this is the correct behaviour? So now it is
> not true that decode(encode(obj)) == obj? Looks utterly confusing.
> Or why else you removed these tests?

They are work as before, but now superseded with simpler ones.

> 
> > +
> > +test:is(yaml.encode(false), "--- false\n...\n")
> > +test:is(yaml.encode(true), "--- true\n...\n")
> > +test:is(yaml.encode('false'), "--- 'false'\n...\n")
> > +test:is(yaml.encode('true'), "--- 'true'\n...\n")
> > +test:is(yaml.encode(nil), "--- null\n...\n")
> > +
> > +test:is(yaml.decode('false'), false)
> > +test:is(yaml.decode('no'), false)
> > +test:is(yaml.decode('true'), true)
> > +test:is(yaml.decode('yes'), true)
> > +test:is(yaml.decode('~'), nil)
> > +test:is(yaml.decode('null'), nil)
> > +test:is(yaml.decode('Null'), nil)
> > +test:is(yaml.decode('NULL'), nil)
> >   box.cfg{
> >       listen=IPROTO_SOCKET;
> > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> > index 9b07992d8..73e5d2c31 100644
> > --- a/third_party/lua-yaml/lyaml.cc
> > +++ b/third_party/lua-yaml/lyaml.cc
> > @@ -94,6 +94,46 @@ struct lua_yaml_dumper {
> >      luaL_Buffer yamlbuf;
> >   };
> > +enum yaml_type {
> > +   YAML_NO_MATCH = 0,
> > +   YAML_FALSE,
> > +   YAML_TRUE,
> > +   YAML_NULL
> > +};
> > +
> > +/**
> > + * Verify whether a string represents a boolean literal in YAML.
> > + *
> > + * Non-standard: only subset of YAML 1.1 boolean literals are
> > + * treated as boolean values.
> > + */
> > +static yaml_type
> > +yaml_get_bool(const char *str, const size_t len)
> 
> 2. Usually in such cases we use int 0/-1 returned value
> and return the result via an out-parameter. So it should be
> 
>     static inline int
>     yaml_parse_bool(const char *str, size_t len, bool *value);
> 
> Yes, I've changed 'get' to 'parse' and inlined the function,
> if you do not mind.
> 
> The same for the next function, parsing null. After that you
> can remove enum yaml_type.
> 
> Which, btw, is not a type, strictly speaking, because it
> does not have YAML_INT, YAML_DOUBLE ...
> 
> Also, I do not see any point in making 'len' be 'const'.
> 
> Scarcely it is a case when you need a parameter, passed by
> value instead of pointer, be explicitly const. The compiler
> can detect it easily.

Agreed with all these comments and fixed them.

> 
> > +{
> > +   if (len > 5)
> > +      return YAML_NO_MATCH;
> > +   if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0)
> > +      return YAML_FALSE;
> > +   if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0)
> 
> 3. I think, it is redundant to firstly check length and then
> call strcmp, which also calculates length (by checking zero
> byte). We need either use only strcmp(), or use length comparison
> + memcmp. The same below for the null parser.

Okay, it was the premature optimization.

The more important thing is that is was incorrect: "fals" gives
YAML_FALSE.

I had fixed it like so, is it okay?

```
if ((len == 5 && memcmp(str, "false", 5) == 0) ||
    (len == 2 && memcmp(str, "no", 2) == 0)) {
   if (out != NULL)
      *out = false;
   return 0;
}
...
```

> 
> > +      return YAML_TRUE;
> > +   return YAML_NO_MATCH;
> > +}
> > +
> > +/**
> > + * Verify whether a string represents a null literal in YAML.
> > + *
> > + * Non-standard: don't match an empty string as null.
> > + */
> > +static yaml_type
> > +yaml_get_null(const char *str, const size_t len){
> 
> 4. ?
> 
>     static inline int
>     yaml_parse_null(const char *str, size_t len, bool *is_null);

is_null is redundant here, because a return code contain all needed
information. Agreed except that. Changed.

> 
> > +   if (len == 1 && str[0] == '~')
> > +      return YAML_NULL;
> > +   if (len == 4 && (strcmp(str, "null") == 0 || strcmp(str, "Null") == 0 ||
> 
> 5. Hmm. Length of "null" and "Null" is 3, not 4. As I understand,
> len does not count terminating zero. And I can prove it - load_scalar()
> in case if !strcmp(tag, "str") does lua_pushlstring(loader->L, str, length),
> so length == strlen(str).
> 
> Even if I am mistaken about it, why two lines above you consider len
> of "~" as 1, not 2?

Just for the record: you fixed youself in a separate email: null is 4
chars length.

> 
> > +       strcmp(str, "NULL") == 0))
> > +      return YAML_NULL;
> > +   return YAML_NO_MATCH;
> > +}
> > +
> >   static void generate_error_message(struct lua_yaml_loader *loader) {
> >      char buf[256];
> >      luaL_Buffer b;
> > @@ -597,8 +636,13 @@ static int dump_node(struct lua_yaml_dumper *dumper)
> >         return dump_table(dumper, &field);
> >      case MP_STR:
> >         str = lua_tolstring(dumper->L, -1, &len);
> > -      if (lua_isnumber(dumper->L, -1)) {
> > -         /* string is convertible to number, quote it to preserve type */
> > +      if ((yaml_get_null(str, len) == YAML_NULL)
> > +         || (yaml_get_bool(str, len) != YAML_NO_MATCH)
> > +         || (lua_isnumber(dumper->L, -1))) {
> > +         /*
> > +          * The string is convertible to a null, a boolean or
> > +          * a number, quote it to preserve its type.
> > +          */
> 
> 6. I think, it is related to my first comment. So why did you
> delete those tests?

No-no, see my answer for the first comment.

> 
> >            style = YAML_SINGLE_QUOTED_SCALAR_STYLE;
> >            break;
> >         }




More information about the Tarantool-patches mailing list