[tarantool-patches] Re: [PATCH v2 2/3] lua-yaml: fix boolean/null representation in yaml

Vladislav Shpilevoy v.shpilevoy at tarantool.org
Fri Feb 22 18:14:19 MSK 2019


LGTM.

On 18/02/2019 21:55, Alexander Turenko wrote:
> I answered to the comments below and noted the fixes were made. Rebased
> the patchset on top of fresh 2.1.
> 
> The new patch is at end of the email.
> 
> WBR, Alexander Turenko.
> 
> On Tue, Feb 05, 2019 at 10:36:40PM +0300, Vladislav Shpilevoy wrote:
>> Thanks for the fixes!
>>
>>>>> diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua
>>>>> index 4f915afd7..68f273234 100755
>>>>> --- a/test/app-tap/console.test.lua
>>>>> +++ b/test/app-tap/console.test.lua
>>>>> @@ -77,11 +77,24 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter")
>>>>>     --
>>>>>     -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly.
>>>>> +-- gh-3662: yaml.encode encodes booleans with multiline format.
>>>>> +-- gh-3583: yaml.encode encodes null incorrectly.
>>>>>     --
>>>>> -test:is(type(yaml.decode(yaml.encode('false'))), 'string')
>>>>> -test:is(type(yaml.decode(yaml.encode('true'))), 'string')
>>>>> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
>>>>> -test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
>>>>
>>>> 1. Are you sure that this is the correct behaviour? So now it is
>>>> not true that decode(encode(obj)) == obj? Looks utterly confusing.
>>>> Or why else you removed these tests?
>>>
>>> They are work as before, but now superseded with simpler ones.
>>
>> It is not obvious. I see a test
>>
>>      yaml.encode(false) == "--- false\n...\n"
>>
>> I see another test:
>>
>>      yaml.decode('false') == false
>>
>> But where is this? -
>>
>>      yaml.decode("--- false\n...\n") == false
> 
> The removed test cases covered one case that is not covered with the new
> cases: they decode a yaml document, not just a scalar. And it is
> confusing why I remove test cases in the patch. So it worth to just add
> them back.
> 
> Added.
> 
>>
>> And why "false" == "--- false\n...\n" according to the tests
>> in the patch?
> 
> `---` and `...` is a yaml document start and end, see [1]. A sequence of
> documents forms a yaml stream like a sequence of stanzas forms a xml
> stream. We decode one document to its value. Multiple documents however
> forms a table. Anyway, it does not matter much in context of this issue.
> 
> [1]: https://yaml.org/spec/1.2/spec.html#id2800401
> 
>>
>>>>
>>>>> +{
>>>>> +   if (len > 5)
>>>>> +      return YAML_NO_MATCH;
>>>>> +   if (strcmp(str, "false") == 0 || strcmp(str, "no") == 0)
>>>>> +      return YAML_FALSE;
>>>>> +   if (strcmp(str, "true") == 0 || strcmp(str, "yes") == 0)
>>>>
>>>> 3. I think, it is redundant to firstly check length and then
>>>> call strcmp, which also calculates length (by checking zero
>>>> byte). We need either use only strcmp(), or use length comparison
>>>> + memcmp. The same below for the null parser.
>>>
>>> Okay, it was the premature optimization.
>>>
>>> The more important thing is that is was incorrect: "fals" gives
>>> YAML_FALSE.
>>>
>>> I had fixed it like so, is it okay?
>>>
>>> ```
>>> if ((len == 5 && memcmp(str, "false", 5) == 0) ||
>>>       (len == 2 && memcmp(str, "no", 2) == 0)) {
>>>      if (out != NULL)
>>>         *out = false;
>>>      return 0;
>>> }
>>> ...
>>> ```
>>
>> Yes, this fix is ok, thanks.
> 
> Ok.
> 
>>
>>>
>>>>
>>>>> +      return YAML_TRUE;
>>>>> +   return YAML_NO_MATCH;
>>>>> +}
>>>>> +
>>>>> +/**
>>>>> + * Verify whether a string represents a null literal in YAML.
>>>>> + *
>>>>> + * Non-standard: don't match an empty string as null.
>>>>> + */
>>>>> +static yaml_type
>>>>> +yaml_get_null(const char *str, const size_t len){
>>>>
>>>> 4. ?
>>>>
>>>>       static inline int
>>>>       yaml_parse_null(const char *str, size_t len, bool *is_null);
>>>
>>> is_null is redundant here, because a return code contain all needed
>>> information. Agreed except that. Changed.
>>
>> My point was to standardize parser functions so as to always
>> return 0/-1 on success/error, and always return an actual value
>> via an out parameter. But now I see, that -1 here is not really an
>> error, but rather 'not being of a requested type'. So here we
>> do not have even a notion of 'error'.
>>
>> In such a case, please, consider the fixes below. Also, I removed
>> optionality of out parameter in yaml_parse_bool in order to get rid
>> of 'if'.
> 
> I don't mind. Integrated these changes into the patch (with minor
> wording / formatting changes).
> 
>>
>> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
>> index 5afcace1b..a4e03a8ba 100644
>> --- a/third_party/lua-yaml/lyaml.cc
>> +++ b/third_party/lua-yaml/lyaml.cc
>> @@ -96,46 +96,43 @@ struct lua_yaml_dumper {
>>   /**
>>    * Verify whether a string represents a boolean literal in YAML.
>> - *
>>    * Non-standard: only subset of YAML 1.1 boolean literals are
>>    * treated as boolean values.
>> - *
>> - * Return 0 at success, -1 at error.
>> + * @param str Literal.
>> + * @param len Length of @a str.
>> + * @param[out] out Result boolean value.
>> + * @retval Whether @a str represents a boolean value or not.
>>    */
>> -static inline int
>> -yaml_parse_bool(const char *str, size_t len, bool *out)
>> +static inline bool
>> +yaml_is_bool(const char *str, size_t len, bool *out)
>>   {
>>      if ((len == 5 && memcmp(str, "false", 5) == 0) ||
>>          (len == 2 && memcmp(str, "no", 2) == 0)) {
>> -      if (out != NULL)
>> -         *out = false;
>> -      return 0;
>> +      *out = false;
>> +      return true;
>>      }
>>      if ((len == 4 && memcmp(str, "true", 4) == 0) ||
>>          (len == 3 && memcmp(str, "yes", 3) == 0)) {
>> -      if (out != NULL)
>> -         *out = true;
>> -      return 0;
>> +      *out = true;
>> +      return true;
>>      }
>> -   return -1;
>> +   return false;
>>   }
>>   /**
>>    * Verify whether a string represents a null literal in YAML.
>> - *
>>    * Non-standard: don't match an empty string, 'Null' and 'NULL' as
>>    * null.
>> - *
>> - * Return 0 at success, -1 at error.
>> + * @retval Whether @a str represents NULL or not.
>>    */
>> -static inline int
>> -yaml_parse_null(const char *str, size_t len)
>> +static inline bool
>> +yaml_is_null(const char *str, size_t len)
>>   {
>>      if (len == 1 && str[0] == '~')
>> -      return 0;
>> +      return true;
>>      if (len == 4 && memcmp(str, "null", 4) == 0)
>> -      return 0;
>> -   return -1;
>> +      return true;
>> +   return false;
>>   }
>>   static void generate_error_message(struct lua_yaml_loader *loader) {
>> @@ -254,9 +251,9 @@ static void load_scalar(struct lua_yaml_loader *loader) {
>>            return;
>>         } else if (!strcmp(tag, "bool")) {
>>            bool value = false;
>> -         int rc = yaml_parse_bool(str, length, &value);
>> +         bool rc = yaml_is_bool(str, length, &value);
>>            (void) rc;
>> -         assert(rc == 0);
>> +         assert(rc);
> 
> I removed the assert() here, because yaml.decode('!!bool null') should
> not lead to an assertion fail. I didn't investigate why the behaviour is
> to produce `false` rather then raise an error, just keep the behaviour
> the same as before. However I don't add a test, because it is unclear
> whether the module behaves in this way with intention or by occasion.
> 
>>            lua_pushboolean(loader->L, value);
>>            return;
>>         } else if (!strcmp(tag, "binary")) {
>> @@ -275,10 +272,10 @@ static void load_scalar(struct lua_yaml_loader *loader) {
>>             */
>>            lua_pushliteral(loader->L, "");
>>            return;
>> -      } else if (yaml_parse_null(str, length) == 0) {
>> +      } else if (yaml_is_null(str, length)) {
>>            luaL_pushnull(loader->L);
>>            return;
>> -      } else if (yaml_parse_bool(str, length, &value) == 0) {
>> +      } else if (yaml_is_bool(str, length, &value)) {
>>            lua_pushboolean(loader->L, value);
>>            return;
>>         }
>> @@ -610,6 +607,7 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>>      yaml_event_t ev;
>>      yaml_scalar_style_t style = YAML_PLAIN_SCALAR_STYLE;
>>      int is_binary = 0;
>> +   bool unused;
>>      char buf[FPCONV_G_FMT_BUFSIZE];
>>      struct luaL_field field;
>> @@ -644,9 +642,8 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>>         return dump_table(dumper, &field);
>>      case MP_STR:
>>         str = lua_tolstring(dumper->L, -1, &len);
>> -      if ((yaml_parse_null(str, len) == 0)
>> -         || (yaml_parse_bool(str, len, NULL) == 0)
>> -         || (lua_isnumber(dumper->L, -1))) {
>> +      if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) ||
>> +          lua_isnumber(dumper->L, -1)) {
>>            /*
>>             * The string is convertible to a null, a boolean or
>>             * a number, quote it to preserve its type.
>>
> 
> ----
> 
> commit 8531b4af03016a69d82bff35504ef6ae3ffbdb82
> Author: Alexander Turenko <alexander.turenko at tarantool.org>
> Date:   Tue Jan 22 01:03:01 2019 +0300
> 
>      lua-yaml: fix strings literals encoding in yaml
>      
>      yaml.encode() now wraps a string literal whose content is equal to a
>      null or a boolean value representation in YAML into single quotes. Those
>      literals are: 'false', 'no', 'true', 'yes', '~', 'null'.
>      
>      Reverted the a2d7643c commit to use single quotes instead of multiline
>      encoding for 'true' and 'false' string literals.
>      
>      Follows up #3476
>      Closes #3662
>      Closes #3583
> 
> diff --git a/test/app-tap/console.test.lua b/test/app-tap/console.test.lua
> index 4f915afd7..a03df5277 100755
> --- a/test/app-tap/console.test.lua
> +++ b/test/app-tap/console.test.lua
> @@ -21,7 +21,7 @@ local EOL = "\n...\n"
>   
>   test = tap.test("console")
>   
> -test:plan(59)
> +test:plan(70)
>   
>   -- Start console and connect to it
>   local server = console.listen(CONSOLE_SOCKET)
> @@ -77,12 +77,28 @@ test:is(yaml.decode(client:read(EOL)), '', "clear delimiter")
>   
>   --
>   -- gh-3476: yaml.encode encodes 'false' and 'true' incorrectly.
> +-- gh-3662: yaml.encode encodes booleans with multiline format.
> +-- gh-3583: yaml.encode encodes null incorrectly.
>   --
> +
>   test:is(type(yaml.decode(yaml.encode('false'))), 'string')
>   test:is(type(yaml.decode(yaml.encode('true'))), 'string')
>   test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
>   test:is(type(yaml.decode(yaml.encode({a = 'false'})).a), 'string')
>   
> +test:is(yaml.encode(false), "--- false\n...\n")
> +test:is(yaml.encode(true), "--- true\n...\n")
> +test:is(yaml.encode('false'), "--- 'false'\n...\n")
> +test:is(yaml.encode('true'), "--- 'true'\n...\n")
> +test:is(yaml.encode(nil), "--- null\n...\n")
> +
> +test:is(yaml.decode('false'), false)
> +test:is(yaml.decode('no'), false)
> +test:is(yaml.decode('true'), true)
> +test:is(yaml.decode('yes'), true)
> +test:is(yaml.decode('~'), nil)
> +test:is(yaml.decode('null'), nil)
> +
>   box.cfg{
>       listen=IPROTO_SOCKET;
>       memtx_memory = 107374182,
> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index bd876ab29..46c98bde1 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -94,6 +94,55 @@ struct lua_yaml_dumper {
>      luaL_Buffer yamlbuf;
>   };
>   
> +/**
> + * Verify whether a string represents a boolean literal in YAML.
> + *
> + * Non-standard: only subset of YAML 1.1 boolean literals are
> + * treated as boolean values.
> + *
> + * @param str Literal to check.
> + * @param len Length of @a str.
> + * @param[out] out Result boolean value.
> + *
> + * @retval Whether @a str represents a boolean value.
> + */
> +static inline bool
> +yaml_is_bool(const char *str, size_t len, bool *out)
> +{
> +   if ((len == 5 && memcmp(str, "false", 5) == 0) ||
> +       (len == 2 && memcmp(str, "no", 2) == 0)) {
> +      *out = false;
> +      return true;
> +   }
> +   if ((len == 4 && memcmp(str, "true", 4) == 0) ||
> +       (len == 3 && memcmp(str, "yes", 3) == 0)) {
> +      *out = true;
> +      return true;
> +   }
> +   return false;
> +}
> +
> +/**
> + * Verify whether a string represents a null literal in YAML.
> + *
> + * Non-standard: don't match an empty string, 'Null' and 'NULL' as
> + * null.
> + *
> + * @param str Literal to check.
> + * @param len Length of @a str.
> + *
> + * @retval Whether @a str represents a null value.
> + */
> +static inline bool
> +yaml_is_null(const char *str, size_t len)
> +{
> +   if (len == 1 && str[0] == '~')
> +      return true;
> +   if (len == 4 && memcmp(str, "null", 4) == 0)
> +      return true;
> +   return false;
> +}
> +
>   static void generate_error_message(struct lua_yaml_loader *loader) {
>      char buf[256];
>      luaL_Buffer b;
> @@ -209,7 +258,9 @@ static void load_scalar(struct lua_yaml_loader *loader) {
>            lua_pushnumber(loader->L, dval);
>            return;
>         } else if (!strcmp(tag, "bool")) {
> -         lua_pushboolean(loader->L, !strcmp(str, "true") || !strcmp(str, "yes"));
> +         bool value = false;
> +         yaml_is_bool(str, length, &value);
> +         lua_pushboolean(loader->L, value);
>            return;
>         } else if (!strcmp(tag, "binary")) {
>            frombase64(loader->L, (const unsigned char *)str, length);
> @@ -218,20 +269,20 @@ static void load_scalar(struct lua_yaml_loader *loader) {
>      }
>   
>      if (loader->event.data.scalar.style == YAML_PLAIN_SCALAR_STYLE) {
> -      if (!strcmp(str, "~")) {
> -         luaL_pushnull(loader->L);
> -         return;
> -      } else if (!strcmp(str, "true") || !strcmp(str, "yes")) {
> -         lua_pushboolean(loader->L, 1);
> -         return;
> -      } else if (!strcmp(str, "false") || !strcmp(str, "no")) {
> -         lua_pushboolean(loader->L, 0);
> +      bool value;
> +      if (!length) {
> +         /*
> +          * Non-standard: an empty value/document is null
> +          * according to the standard, but we decode it as an
> +          * empty string.
> +          */
> +         lua_pushliteral(loader->L, "");
>            return;
> -      } else if (!strcmp(str, "null")) {
> +      } else if (yaml_is_null(str, length)) {
>            luaL_pushnull(loader->L);
>            return;
> -      } else if (!length) {
> -         lua_pushliteral(loader->L, "");
> +      } else if (yaml_is_bool(str, length, &value)) {
> +         lua_pushboolean(loader->L, value);
>            return;
>         }
>   
> @@ -547,7 +598,6 @@ static int yaml_is_flow_mode(struct lua_yaml_dumper *dumper) {
>                  (evp->type == YAML_MAPPING_START_EVENT &&
>                   evp->data.mapping_start.style == YAML_FLOW_MAPPING_STYLE)) {
>               return 1;
> -            break;
>            }
>         }
>      }
> @@ -564,6 +614,8 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>      int is_binary = 0;
>      char buf[FPCONV_G_FMT_BUFSIZE];
>      struct luaL_field field;
> +   bool unused;
> +   (void) unused;
>   
>      int top = lua_gettop(dumper->L);
>      luaL_checkfield(dumper->L, dumper->cfg, top, &field);
> @@ -596,8 +648,12 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>         return dump_table(dumper, &field);
>      case MP_STR:
>         str = lua_tolstring(dumper->L, -1, &len);
> -      if (lua_isnumber(dumper->L, -1)) {
> -         /* string is convertible to number, quote it to preserve type */
> +      if (yaml_is_null(str, len) || yaml_is_bool(str, len, &unused) ||
> +          lua_isnumber(dumper->L, -1)) {
> +         /*
> +          * The string is convertible to a null, a boolean or
> +          * a number, quote it to preserve its type.
> +          */
>            style = YAML_SINGLE_QUOTED_SCALAR_STYLE;
>            break;
>         }
> @@ -605,12 +661,10 @@ static int dump_node(struct lua_yaml_dumper *dumper)
>         if (utf8_check_printable(str, len)) {
>            if (yaml_is_flow_mode(dumper)) {
>               style = YAML_SINGLE_QUOTED_SCALAR_STYLE;
> -         } else if (strstr(str, "\n\n") != NULL || strcmp(str, "true") == 0 ||
> -		    strcmp(str, "false") == 0) {
> +         } else if (strstr(str, "\n\n") != 0) {
>               /*
>                * Tarantool-specific: use literal style for string
> -             * with empty lines and strings representing boolean
> -             * types.
> +             * with empty lines.
>                * Useful for tutorial().
>                */
>               style = YAML_LITERAL_SCALAR_STYLE;
> 




More information about the Tarantool-patches mailing list