[tarantool-patches] Re: [PATCH v2 1/3] lua-yaml: verify arguments count

Alexander Turenko alexander.turenko at tarantool.org
Mon Feb 11 16:32:18 MSK 2019


lua_is* really checks whether an acceptable index is a valid one, so
there are two possible approaches, one of which we should stick I think:

* Verify lua_gettop() upper and lower bounds right at start of a
  function.
* Use lua_is* (including lua_isnone() and lua_isnoneornil()) and don't
  verify arguments count explicitly.

I think we should use one of these ways within a module: this is more
important then the patch size. The only difference for a user is that
the latter approach does not check for extra arguments.

Now I implemented the latter approach as I see you want to minimize
explicit checks. See the patch at end of the email.

It is possible to reduce the patch further, but loss consistency in what
we check: lua_is* or lua_gettop(). I'll do if you insist, but don't
think it is the right way to proceed.

NB: branch: kh/gh-3662-yaml-2.1

WBR, Alexander Turenko.

On Tue, Feb 05, 2019 at 10:36:41PM +0300, Vladislav Shpilevoy wrote:
> Hi! Thanks for the fixes!
> 
> > > > functions.
> > > > 
> > > > Without these checks the functions could read garbage outside of a Lua
> > > > stack when called w/o arguments.
> > > 
> > > Honestly, I do not understand how is it possible. Please,
> > > provide a test for both functions. See my 3 doubts below.
> > 
> > lua_isstring(L, 1) checks a garbage w/o preliminary lua_gettop() check.
> > yaml.encode() gives me "unsupported Lua type 'thread'" on the current
> > tarantool 2.1.
> 
> I looked at lua_isstring implementation, and I see, that it checks
> top. If an index is above top, then the type is nil.
> 
> 	static TValue *index2adr(lua_State *L, int idx)
> 	{
> 	  if (idx > 0) {
> 	    TValue *o = L->base + (idx - 1);
> 	    return o < L->top ? o : niltv(L);
> 	...
> 

Ouch, lua_isstring() is called only in case of top == 2, so this is out
of scope of the discussion. The real cause of this weird "unsupported
Lua type 'thread'" error is lua_yaml_encode() code: it calls
`lua_newthread(L)` and then `lua_pushvalue(L, 1);`. A 1st stack item
should be a value we encode, but when there are no arguments for
yaml.encode() the new lua thread is the 1st item.

Anyway, I don't think that "unsupported Lua type 'thread'" is the right
error message for `yaml.encode()`. Are you agree?

> > 
> > Anyway, added bad API usage test cases. Also I changed this:
> > 
> > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> > index 3a427263e..46374970f 100644
> > --- a/third_party/lua-yaml/lyaml.cc
> > +++ b/third_party/lua-yaml/lyaml.cc
> > @@ -453,7 +453,7 @@ usage_error:
> >         return luaL_error(L, OOM_ERRMSG);
> >      yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
> >      bool tag_only;
> > -   if (lua_gettop(L) == 2) {
> > +   if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
> >         if (! lua_istable(L, 2))
> >            goto usage_error;
> >         lua_getfield(L, 2, "tag_only");
> > 
> > We should not raise an usage error for yaml.decode(object, nil).
> 
> Why? It is said, that the second value either does not exist, or
> is a table. Nil is not a table. So why? If your logic was about
> considering nil as a not existing value, then why don't we handle
> cases like this: yaml.decode(object, nil, nil, nil, nil) ? The same
> for l_dump() and encode.

There is the difference between `yaml.decode(object, nil)` and
`yaml.decode(object, nil, nil, nil, nil)`. The former one is likely to
appear due to passing though the 2nd argument, say:

```
local function load_cfg(raw, opts)
    local object = yaml.decode(raw, opts)
    ...some post-processing...
    return object
end
```

The latter is definitely wrong usage.

But now I removed checks for extra args, see above.

> 
> > > >    usage_error:
> > > >          return luaL_error(L, "Usage: yaml.decode(document, "\
> > > >                            "[{tag_only = boolean}])");
> > > > @@ -416,7 +417,7 @@ usage_error:
> > > >          return luaL_error(L, OOM_ERRMSG);
> > > >       yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
> > > >       bool tag_only;
> > > > -   if (lua_gettop(L) > 1) {
> > > > +   if (lua_gettop(L) == 2) {
> > > 
> > > 2. This function never touches anything beyond second value on
> > > the stack, so here lua_gettop(L) > 1 means the same as
> > > lua_gettop(L) == 2 - the second argument exist. Third and next
> > > values do not matter.
> > 
> > I read this as 'those are equivalent' (correct me if I'm wrong). Ok. I'd
> > prefer to leave it with ==. Also note the fix I pasted above.
> 
> Why? Again. I do not see any reason behind this change except personal
> preference.

It does not matter much, because I anyway need to add ` && !
lua_isnil(L, 2)` or use `! lua_isnoneornil(L, 2)` here to make decode
behaviour consistent with encode one (against 2nd argument). Yep, it is
personal preference. Anyway, now it is `! lua_isnoneornil(L, 2)`.

> I reverted all the changes about l_load() function, and the
> tests passed. So why do we need to make diff bigger?

yaml.decode('', nil, {}) don't pass before (don't raise an error).
Other tests are passed, because of two reasons:

* no test on yaml.decode('', nil);
* lua_isstring() checks stack size.

Re test: added for encode and decode.

Re lua_isstring(): okay, now I understood that it checks given index by
the API:

http://pgl.yoyo.org/luai/i/lua_isstring ("acceptable index")
https://www.lua.org/manual/5.3/manual.html#4.3 ("Valid and Acceptable Indices")
https://www.lua.org/manual/5.1/manual.html#3.2 (the same for Lua 5.1)

So I changed the description of the commit to make it clear that the
reason of the change is to make the code more consistent.

> 
> > 
> > > 
> > > >          if (! lua_istable(L, 2))
> > > >             goto usage_error;
> > > >          lua_getfield(L, 2, "tag_only");
> > > > @@ -794,7 +795,7 @@ error:
> > > >    static int l_dump(lua_State *L) {
> > > >       struct luaL_serializer *serializer = luaL_checkserializer(L);
> > > >       int top = lua_gettop(L);
> > > > -   if (top > 2) {
> > > > +   if (!(top == 1 || top == 2)) {
> > > 
> > > 3. Here my reasoning is the same - the previous checking works
> > > as well.
> > 
> > It will not give an error in case of yaml.encode() and yaml.encode({},
> > {}, {}).
> 
> Decent. Here you are right.
> 
> > 
> > > 
> > > >    usage_error:
> > > >          return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
> > > >                            "tag_handle = <string>})");
> > > > 
> 
> My diff, which reverts some changes and makes this patch one-liner:
> 
> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index 354cafe86..854794dd1 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -400,8 +400,7 @@ static void load(struct lua_yaml_loader *loader) {
>   */
>  static int l_load(lua_State *L) {
>     struct lua_yaml_loader loader;
> -   int top = lua_gettop(L);
> -   if (!(top == 1 || top == 2) || !lua_isstring(L, 1)) {
> +   if (! lua_isstring(L, 1)) {
>  usage_error:
>        return luaL_error(L, "Usage: yaml.decode(document, "\
>                          "[{tag_only = boolean}])");
> @@ -417,7 +416,7 @@ usage_error:
>        return luaL_error(L, OOM_ERRMSG);
>     yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
>     bool tag_only;
> -   if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
> +   if (lua_gettop(L) > 1) {
>        if (! lua_istable(L, 2))
>           goto usage_error;
>        lua_getfield(L, 2, "tag_only");

----

The new patch description and diff (w/o tests):

lua-yaml: verify args in a consistent manner

Use lua_is*() functions instead of explicit lua_gettop() checks in
yaml.encode() and yaml.decode() functions.

Behaviour changes:

* yaml.decode(object, nil) ignores nil (it is consistent with encode
  behaviour).
* yaml.encode() gives an usage error instead of "unsupported Lua type
  'thread'".
* yaml.encode('', {}, {}) ignores 3rd argument (it is consistent with
  decode behaviour).

diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
index c6d118a79..bd876ab29 100644
--- a/third_party/lua-yaml/lyaml.cc
+++ b/third_party/lua-yaml/lyaml.cc
@@ -416,7 +416,7 @@ usage_error:
       return luaL_error(L, OOM_ERRMSG);
    yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
    bool tag_only;
-   if (lua_gettop(L) > 1) {
+   if (! lua_isnoneornil(L, 2)) {
       if (! lua_istable(L, 2))
          goto usage_error;
       lua_getfield(L, 2, "tag_only");
@@ -793,14 +793,13 @@ error:
  */
 static int l_dump(lua_State *L) {
    struct luaL_serializer *serializer = luaL_checkserializer(L);
-   int top = lua_gettop(L);
-   if (top > 2) {
+   if (lua_isnone(L, 1)) {
 usage_error:
       return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
                         "tag_handle = <string>})");
    }
    const char *prefix = NULL, *handle = NULL;
-   if (top == 2 && !lua_isnil(L, 2)) {
+   if (! lua_isnoneornil(L, 2)) {
       if (! lua_istable(L, 2))
          goto usage_error;
       lua_getfield(L, 2, "tag_prefix");




More information about the Tarantool-patches mailing list