[tarantool-patches] Re: [PATCH v2 1/3] lua-yaml: verify arguments count
    Vladislav Shpilevoy 
    v.shpilevoy at tarantool.org
       
    Sat Feb 16 00:28:29 MSK 2019
    
    
  
This commit LGTM. Next is pending of fixes.
On 11/02/2019 14:32, Alexander Turenko wrote:
> lua_is* really checks whether an acceptable index is a valid one, so
> there are two possible approaches, one of which we should stick I think:
> 
> * Verify lua_gettop() upper and lower bounds right at start of a
>    function.
> * Use lua_is* (including lua_isnone() and lua_isnoneornil()) and don't
>    verify arguments count explicitly.
> 
> I think we should use one of these ways within a module: this is more
> important then the patch size. The only difference for a user is that
> the latter approach does not check for extra arguments.
> 
> Now I implemented the latter approach as I see you want to minimize
> explicit checks. See the patch at end of the email.
> 
> It is possible to reduce the patch further, but loss consistency in what
> we check: lua_is* or lua_gettop(). I'll do if you insist, but don't
> think it is the right way to proceed.
> 
> NB: branch: kh/gh-3662-yaml-2.1
> 
> WBR, Alexander Turenko.
> 
> On Tue, Feb 05, 2019 at 10:36:41PM +0300, Vladislav Shpilevoy wrote:
>> Hi! Thanks for the fixes!
>>
>>>>> functions.
>>>>>
>>>>> Without these checks the functions could read garbage outside of a Lua
>>>>> stack when called w/o arguments.
>>>>
>>>> Honestly, I do not understand how is it possible. Please,
>>>> provide a test for both functions. See my 3 doubts below.
>>>
>>> lua_isstring(L, 1) checks a garbage w/o preliminary lua_gettop() check.
>>> yaml.encode() gives me "unsupported Lua type 'thread'" on the current
>>> tarantool 2.1.
>>
>> I looked at lua_isstring implementation, and I see, that it checks
>> top. If an index is above top, then the type is nil.
>>
>> 	static TValue *index2adr(lua_State *L, int idx)
>> 	{
>> 	  if (idx > 0) {
>> 	    TValue *o = L->base + (idx - 1);
>> 	    return o < L->top ? o : niltv(L);
>> 	...
>>
> 
> Ouch, lua_isstring() is called only in case of top == 2, so this is out
> of scope of the discussion. The real cause of this weird "unsupported
> Lua type 'thread'" error is lua_yaml_encode() code: it calls
> `lua_newthread(L)` and then `lua_pushvalue(L, 1);`. A 1st stack item
> should be a value we encode, but when there are no arguments for
> yaml.encode() the new lua thread is the 1st item.
> 
> Anyway, I don't think that "unsupported Lua type 'thread'" is the right
> error message for `yaml.encode()`. Are you agree?
> 
>>>
>>> Anyway, added bad API usage test cases. Also I changed this:
>>>
>>> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
>>> index 3a427263e..46374970f 100644
>>> --- a/third_party/lua-yaml/lyaml.cc
>>> +++ b/third_party/lua-yaml/lyaml.cc
>>> @@ -453,7 +453,7 @@ usage_error:
>>>          return luaL_error(L, OOM_ERRMSG);
>>>       yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
>>>       bool tag_only;
>>> -   if (lua_gettop(L) == 2) {
>>> +   if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
>>>          if (! lua_istable(L, 2))
>>>             goto usage_error;
>>>          lua_getfield(L, 2, "tag_only");
>>>
>>> We should not raise an usage error for yaml.decode(object, nil).
>>
>> Why? It is said, that the second value either does not exist, or
>> is a table. Nil is not a table. So why? If your logic was about
>> considering nil as a not existing value, then why don't we handle
>> cases like this: yaml.decode(object, nil, nil, nil, nil) ? The same
>> for l_dump() and encode.
> 
> There is the difference between `yaml.decode(object, nil)` and
> `yaml.decode(object, nil, nil, nil, nil)`. The former one is likely to
> appear due to passing though the 2nd argument, say:
> 
> ```
> local function load_cfg(raw, opts)
>      local object = yaml.decode(raw, opts)
>      ...some post-processing...
>      return object
> end
> ```
> 
> The latter is definitely wrong usage.
> 
> But now I removed checks for extra args, see above.
> 
>>
>>>>>     usage_error:
>>>>>           return luaL_error(L, "Usage: yaml.decode(document, "\
>>>>>                             "[{tag_only = boolean}])");
>>>>> @@ -416,7 +417,7 @@ usage_error:
>>>>>           return luaL_error(L, OOM_ERRMSG);
>>>>>        yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
>>>>>        bool tag_only;
>>>>> -   if (lua_gettop(L) > 1) {
>>>>> +   if (lua_gettop(L) == 2) {
>>>>
>>>> 2. This function never touches anything beyond second value on
>>>> the stack, so here lua_gettop(L) > 1 means the same as
>>>> lua_gettop(L) == 2 - the second argument exist. Third and next
>>>> values do not matter.
>>>
>>> I read this as 'those are equivalent' (correct me if I'm wrong). Ok. I'd
>>> prefer to leave it with ==. Also note the fix I pasted above.
>>
>> Why? Again. I do not see any reason behind this change except personal
>> preference.
> 
> It does not matter much, because I anyway need to add ` && !
> lua_isnil(L, 2)` or use `! lua_isnoneornil(L, 2)` here to make decode
> behaviour consistent with encode one (against 2nd argument). Yep, it is
> personal preference. Anyway, now it is `! lua_isnoneornil(L, 2)`.
> 
>> I reverted all the changes about l_load() function, and the
>> tests passed. So why do we need to make diff bigger?
> 
> yaml.decode('', nil, {}) don't pass before (don't raise an error).
> Other tests are passed, because of two reasons:
> 
> * no test on yaml.decode('', nil);
> * lua_isstring() checks stack size.
> 
> Re test: added for encode and decode.
> 
> Re lua_isstring(): okay, now I understood that it checks given index by
> the API:
> 
> http://pgl.yoyo.org/luai/i/lua_isstring ("acceptable index")
> https://www.lua.org/manual/5.3/manual.html#4.3 ("Valid and Acceptable Indices")
> https://www.lua.org/manual/5.1/manual.html#3.2 (the same for Lua 5.1)
> 
> So I changed the description of the commit to make it clear that the
> reason of the change is to make the code more consistent.
> 
>>
>>>
>>>>
>>>>>           if (! lua_istable(L, 2))
>>>>>              goto usage_error;
>>>>>           lua_getfield(L, 2, "tag_only");
>>>>> @@ -794,7 +795,7 @@ error:
>>>>>     static int l_dump(lua_State *L) {
>>>>>        struct luaL_serializer *serializer = luaL_checkserializer(L);
>>>>>        int top = lua_gettop(L);
>>>>> -   if (top > 2) {
>>>>> +   if (!(top == 1 || top == 2)) {
>>>>
>>>> 3. Here my reasoning is the same - the previous checking works
>>>> as well.
>>>
>>> It will not give an error in case of yaml.encode() and yaml.encode({},
>>> {}, {}).
>>
>> Decent. Here you are right.
>>
>>>
>>>>
>>>>>     usage_error:
>>>>>           return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
>>>>>                             "tag_handle = <string>})");
>>>>>
>>
>> My diff, which reverts some changes and makes this patch one-liner:
>>
>> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
>> index 354cafe86..854794dd1 100644
>> --- a/third_party/lua-yaml/lyaml.cc
>> +++ b/third_party/lua-yaml/lyaml.cc
>> @@ -400,8 +400,7 @@ static void load(struct lua_yaml_loader *loader) {
>>    */
>>   static int l_load(lua_State *L) {
>>      struct lua_yaml_loader loader;
>> -   int top = lua_gettop(L);
>> -   if (!(top == 1 || top == 2) || !lua_isstring(L, 1)) {
>> +   if (! lua_isstring(L, 1)) {
>>   usage_error:
>>         return luaL_error(L, "Usage: yaml.decode(document, "\
>>                           "[{tag_only = boolean}])");
>> @@ -417,7 +416,7 @@ usage_error:
>>         return luaL_error(L, OOM_ERRMSG);
>>      yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
>>      bool tag_only;
>> -   if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
>> +   if (lua_gettop(L) > 1) {
>>         if (! lua_istable(L, 2))
>>            goto usage_error;
>>         lua_getfield(L, 2, "tag_only");
> 
> ----
> 
> The new patch description and diff (w/o tests):
> 
> lua-yaml: verify args in a consistent manner
> 
> Use lua_is*() functions instead of explicit lua_gettop() checks in
> yaml.encode() and yaml.decode() functions.
> 
> Behaviour changes:
> 
> * yaml.decode(object, nil) ignores nil (it is consistent with encode
>    behaviour).
> * yaml.encode() gives an usage error instead of "unsupported Lua type
>    'thread'".
> * yaml.encode('', {}, {}) ignores 3rd argument (it is consistent with
>    decode behaviour).
> 
> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index c6d118a79..bd876ab29 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -416,7 +416,7 @@ usage_error:
>         return luaL_error(L, OOM_ERRMSG);
>      yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
>      bool tag_only;
> -   if (lua_gettop(L) > 1) {
> +   if (! lua_isnoneornil(L, 2)) {
>         if (! lua_istable(L, 2))
>            goto usage_error;
>         lua_getfield(L, 2, "tag_only");
> @@ -793,14 +793,13 @@ error:
>    */
>   static int l_dump(lua_State *L) {
>      struct luaL_serializer *serializer = luaL_checkserializer(L);
> -   int top = lua_gettop(L);
> -   if (top > 2) {
> +   if (lua_isnone(L, 1)) {
>   usage_error:
>         return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
>                           "tag_handle = <string>})");
>      }
>      const char *prefix = NULL, *handle = NULL;
> -   if (top == 2 && !lua_isnil(L, 2)) {
> +   if (! lua_isnoneornil(L, 2)) {
>         if (! lua_istable(L, 2))
>            goto usage_error;
>         lua_getfield(L, 2, "tag_prefix");
> 
    
    
More information about the Tarantool-patches
mailing list