[tarantool-patches] Re: [PATCH v2 1/3] lua-yaml: verify arguments count
Alexander Turenko
alexander.turenko at tarantool.org
Mon Feb 11 16:32:18 MSK 2019
lua_is*() really does check whether an acceptable index is a valid one, so
there are two possible approaches, and I think we should stick to one of
them:
* Verify lua_gettop() upper and lower bounds right at the start of a
function.
* Use lua_is*() (including lua_isnone() and lua_isnoneornil()) and don't
verify the argument count explicitly.
I think we should use one of these ways consistently within a module: this
is more important than the patch size. The only difference for a user is
that the latter approach does not check for extra arguments.
I have now implemented the latter approach, as I see you want to minimize
explicit checks. See the patch at the end of the email.
It is possible to reduce the patch further, but at the cost of consistency
in what we check: lua_is*() or lua_gettop(). I'll do it if you insist, but
I don't think it is the right way to proceed.
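To make the difference between the two approaches concrete, here is a toy
sketch. It does not use the real Lua API: `toy_stack`, `toy_type_at()` and
both `decode_args_ok_*()` helpers are hypothetical names invented for this
illustration, and the string check is simplified (the real lua_isstring()
also accepts numbers). It only models the idea that an index above the top
reads as "none" rather than garbage.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for Lua value types; NOT the real Lua API. */
enum toy_type { TOY_NONE, TOY_NIL, TOY_STRING, TOY_TABLE };

/* Toy argument stack: args[0..top-1] hold the call arguments. */
struct toy_stack {
	int top;
	enum toy_type args[8];
};

/* Models an "acceptable index": above top yields TOY_NONE. */
static enum toy_type toy_type_at(const struct toy_stack *s, int idx)
{
	assert(idx > 0);
	return idx <= s->top ? s->args[idx - 1] : TOY_NONE;
}

/* Shared rule for the optional 2nd argument: absent, nil or a table. */
static bool opts_ok(enum toy_type opts)
{
	return opts == TOY_NONE || opts == TOY_NIL || opts == TOY_TABLE;
}

/* Approach 1: verify the argument count bounds up front. */
static bool decode_args_ok_by_count(const struct toy_stack *s)
{
	if (s->top < 1 || s->top > 2)
		return false;
	return toy_type_at(s, 1) == TOY_STRING && opts_ok(toy_type_at(s, 2));
}

/* Approach 2: rely only on per-index type checks. */
static bool decode_args_ok_by_type(const struct toy_stack *s)
{
	return toy_type_at(s, 1) == TOY_STRING && opts_ok(toy_type_at(s, 2));
}
```

In this model both helpers agree on every call except one with more than
two arguments, which only the count-based variant rejects.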
NB: branch: kh/gh-3662-yaml-2.1
WBR, Alexander Turenko.
On Tue, Feb 05, 2019 at 10:36:41PM +0300, Vladislav Shpilevoy wrote:
> Hi! Thanks for the fixes!
>
> > > > functions.
> > > >
> > > > Without these checks the functions could read garbage outside of a Lua
> > > > stack when called w/o arguments.
> > >
> > > Honestly, I do not understand how is it possible. Please,
> > > provide a test for both functions. See my 3 doubts below.
> >
> > lua_isstring(L, 1) checks a garbage w/o preliminary lua_gettop() check.
> > yaml.encode() gives me "unsupported Lua type 'thread'" on the current
> > tarantool 2.1.
>
> I looked at lua_isstring implementation, and I see, that it checks
> top. If an index is above top, then the type is nil.
>
> static TValue *index2adr(lua_State *L, int idx)
> {
>   if (idx > 0) {
>     TValue *o = L->base + (idx - 1);
>     return o < L->top ? o : niltv(L);
>   ...
>
Ouch, lua_isstring() is called only in case of top == 2, so this is out
of the scope of the discussion. The real cause of this weird "unsupported
Lua type 'thread'" error is the lua_yaml_encode() code: it calls
`lua_newthread(L)` and then `lua_pushvalue(L, 1);`. The 1st stack item
should be the value we encode, but when yaml.encode() is called without
arguments the new Lua thread becomes the 1st item.
Anyway, I don't think that "unsupported Lua type 'thread'" is the right
error message for `yaml.encode()`. Do you agree?
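The effect can be shown with a toy model of the stack. This is only a
sketch: `toy_stack`, `toy_push()`, `toy_type_at()` and `toy_encode_sees()`
are hypothetical names, not the real Lua API or the lyaml.cc code. They
just reproduce the order of operations: push a thread first, then read
slot 1 as the value to encode.

```c
#include <assert.h>

/* Toy stand-ins for Lua value types; NOT the real Lua API. */
enum toy_type { TOY_NONE, TOY_NIL, TOY_TABLE, TOY_THREAD };

/* Toy stack: slots[0..top-1] are occupied. */
struct toy_stack {
	int top;
	enum toy_type slots[8];
};

/* Push a value on top of the toy stack. */
static void toy_push(struct toy_stack *s, enum toy_type t)
{
	assert(s->top < 8);
	s->slots[s->top++] = t;
}

/* Read the type at a 1-based index; above top reads as TOY_NONE. */
static enum toy_type toy_type_at(const struct toy_stack *s, int idx)
{
	assert(idx > 0);
	return idx <= s->top ? s->slots[idx - 1] : TOY_NONE;
}

/*
 * Mimics the call order: the push stands in for lua_newthread(L),
 * the read of slot 1 stands in for lua_pushvalue(L, 1).
 */
static enum toy_type toy_encode_sees(struct toy_stack *s)
{
	toy_push(s, TOY_THREAD);
	return toy_type_at(s, 1);
}
```

With an empty stack the value read from slot 1 is the thread itself, while
with one argument on the stack it is that argument, which matches the
behaviour described above.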
> >
> > Anyway, added bad API usage test cases. Also I changed this:
> >
> > diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> > index 3a427263e..46374970f 100644
> > --- a/third_party/lua-yaml/lyaml.cc
> > +++ b/third_party/lua-yaml/lyaml.cc
> > @@ -453,7 +453,7 @@ usage_error:
> > return luaL_error(L, OOM_ERRMSG);
> > yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
> > bool tag_only;
> > - if (lua_gettop(L) == 2) {
> > + if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
> > if (! lua_istable(L, 2))
> > goto usage_error;
> > lua_getfield(L, 2, "tag_only");
> >
> > We should not raise an usage error for yaml.decode(object, nil).
>
> Why? It is said, that the second value either does not exist, or
> is a table. Nil is not a table. So why? If your logic was about
> considering nil as a not existing value, then why don't we handle
> cases like this: yaml.decode(object, nil, nil, nil, nil) ? The same
> for l_dump() and encode.
There is a difference between `yaml.decode(object, nil)` and
`yaml.decode(object, nil, nil, nil, nil)`. The former is likely to
appear when the 2nd argument is passed through, say:
```
local yaml = require('yaml')

local function load_cfg(raw, opts)
    -- opts is nil when the caller does not pass it
    local object = yaml.decode(raw, opts)
    -- ...some post-processing...
    return object
end
```
The latter is definitely wrong usage.
But I have now removed the checks for extra args, see above.
>
> > > > usage_error:
> > > > return luaL_error(L, "Usage: yaml.decode(document, "\
> > > > "[{tag_only = boolean}])");
> > > > @@ -416,7 +417,7 @@ usage_error:
> > > > return luaL_error(L, OOM_ERRMSG);
> > > > yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
> > > > bool tag_only;
> > > > - if (lua_gettop(L) > 1) {
> > > > + if (lua_gettop(L) == 2) {
> > >
> > > 2. This function never touches anything beyond second value on
> > > the stack, so here lua_gettop(L) > 1 means the same as
> > > lua_gettop(L) == 2 - the second argument exist. Third and next
> > > values do not matter.
> >
> > I read this as 'those are equivalent' (correct me if I'm wrong). Ok. I'd
> > prefer to leave it with ==. Also note the fix I pasted above.
>
> Why? Again. I do not see any reason behind this change except personal
> preference.
It does not matter much, because I need to add ` && ! lua_isnil(L, 2)`
or use `! lua_isnoneornil(L, 2)` here anyway to make the decode
behaviour consistent with the encode one (regarding the 2nd argument).
Yep, it is personal preference. Anyway, now it is
`! lua_isnoneornil(L, 2)`.
> I reverted all the changes about l_load() function, and the
> tests passed. So why do we need to make diff bigger?
The yaml.decode('', nil, {}) case did not pass before (it did not raise
an error). The other tests pass for two reasons:
* no test on yaml.decode('', nil);
* lua_isstring() checks stack size.
Re test: added for encode and decode.
Re lua_isstring(): okay, now I understand that, according to the API, it
checks the given index:
http://pgl.yoyo.org/luai/i/lua_isstring ("acceptable index")
https://www.lua.org/manual/5.3/manual.html#4.3 ("Valid and Acceptable Indices")
https://www.lua.org/manual/5.1/manual.html#3.2 (the same for Lua 5.1)
So I changed the description of the commit to make it clear that the
reason for the change is to make the code more consistent.
>
> >
> > >
> > > > if (! lua_istable(L, 2))
> > > > goto usage_error;
> > > > lua_getfield(L, 2, "tag_only");
> > > > @@ -794,7 +795,7 @@ error:
> > > > static int l_dump(lua_State *L) {
> > > > struct luaL_serializer *serializer = luaL_checkserializer(L);
> > > > int top = lua_gettop(L);
> > > > - if (top > 2) {
> > > > + if (!(top == 1 || top == 2)) {
> > >
> > > 3. Here my reasoning is the same - the previous checking works
> > > as well.
> >
> > It will not give an error in case of yaml.encode() and yaml.encode({},
> > {}, {}).
>
> Decent. Here you are right.
>
> >
> > >
> > > > usage_error:
> > > > return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
> > > > "tag_handle = <string>})");
> > > >
>
> My diff, which reverts some changes and makes this patch one-liner:
>
> diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
> index 354cafe86..854794dd1 100644
> --- a/third_party/lua-yaml/lyaml.cc
> +++ b/third_party/lua-yaml/lyaml.cc
> @@ -400,8 +400,7 @@ static void load(struct lua_yaml_loader *loader) {
> */
> static int l_load(lua_State *L) {
> struct lua_yaml_loader loader;
> - int top = lua_gettop(L);
> - if (!(top == 1 || top == 2) || !lua_isstring(L, 1)) {
> + if (! lua_isstring(L, 1)) {
> usage_error:
> return luaL_error(L, "Usage: yaml.decode(document, "\
> "[{tag_only = boolean}])");
> @@ -417,7 +416,7 @@ usage_error:
> return luaL_error(L, OOM_ERRMSG);
> yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
> bool tag_only;
> - if (lua_gettop(L) == 2 && ! lua_isnil(L, 2)) {
> + if (lua_gettop(L) > 1) {
> if (! lua_istable(L, 2))
> goto usage_error;
> lua_getfield(L, 2, "tag_only");
----
The new patch description and diff (w/o tests):
lua-yaml: verify args in a consistent manner
Use lua_is*() functions instead of explicit lua_gettop() checks in
yaml.encode() and yaml.decode() functions.
Behaviour changes:
* yaml.decode(object, nil) ignores nil (it is consistent with encode
behaviour).
* yaml.encode() gives a usage error instead of "unsupported Lua type
'thread'".
* yaml.encode('', {}, {}) ignores 3rd argument (it is consistent with
decode behaviour).
diff --git a/third_party/lua-yaml/lyaml.cc b/third_party/lua-yaml/lyaml.cc
index c6d118a79..bd876ab29 100644
--- a/third_party/lua-yaml/lyaml.cc
+++ b/third_party/lua-yaml/lyaml.cc
@@ -416,7 +416,7 @@ usage_error:
return luaL_error(L, OOM_ERRMSG);
yaml_parser_set_input_string(&loader.parser, (yaml_char_t *) document, len);
bool tag_only;
- if (lua_gettop(L) > 1) {
+ if (! lua_isnoneornil(L, 2)) {
if (! lua_istable(L, 2))
goto usage_error;
lua_getfield(L, 2, "tag_only");
@@ -793,14 +793,13 @@ error:
*/
static int l_dump(lua_State *L) {
struct luaL_serializer *serializer = luaL_checkserializer(L);
- int top = lua_gettop(L);
- if (top > 2) {
+ if (lua_isnone(L, 1)) {
usage_error:
return luaL_error(L, "Usage: encode(object, {tag_prefix = <string>, "\
"tag_handle = <string>})");
}
const char *prefix = NULL, *handle = NULL;
- if (top == 2 && !lua_isnil(L, 2)) {
+ if (! lua_isnoneornil(L, 2)) {
if (! lua_istable(L, 2))
goto usage_error;
lua_getfield(L, 2, "tag_prefix");